Answer the 9 questions in the attached worksheet.

Q1. What does a VCF file contain and how is the data formatted? How is it generated? Be sure to include any relevant fields that are standard to the format.

Q2. What are local optima in the context of phylogeny? What are some approaches to avoid them?

Q3. In Linux, what does the apt command do? When would you use it?

Q4. You have a spreadsheet with the following columns: ID, condition, and 100 columns of protein mass spec measurement values. Would you use PCA or LDA to analyze this data and why? How many dimensions of data are there? How many axes?

Q5. What are Artificial Neural Networks (ANN), Feed-Forward Neural Networks, Convolutional Neural Networks (CNN), and Generative Adversarial Networks (GAN)? What are the differences between them?

Q6. Describe how k-mers can be used to assemble a genome from shotgun sequence data.

Q7. What is the difference between the input, hidden, and output layers in an ANN?

Q8. Which Linux command would you use to create a new directory? Give an example.

Q9. Describe how a hash function works, using both descriptive text and a diagram with examples.