Molecular Systematics - Practice Exam Questions

These are exam questions from previous semesters. As the course evolves, the emphasis given to each subject changes, and the structure of each exam reflects the emphasis of the semester in which it was given.

Sample Midterm Exam I

  1. Identify two advantages and two disadvantages of molecular phylogenetic methods. (3 pts)
  2. Distinguish between two special cases of homology, "orthology" and "paralogy", particularly as they apply to duplicated genes. (3 pts)
  3. What is an algorithmic method? How do algorithmic methods differ from optimality methods? (3 pts)
  4. Why are random addition sequences used in conjunction with heuristic searches? (3 pts)
  5. What is an advantage of restriction-site map data over DNA sequence data for phylogenetic reconstruction? (3 pts)
  6. Describe the salient features of each of the following models of sequence evolution: Jukes-Cantor, Kimura 2-parameter, F84/HKY85, General Time Reversible. (10 pts)
  7. The three main types of tree searches we discussed were: Exhaustive, Branch and Bound, and Heuristic. Briefly describe each of these, and describe each method in terms of: 1) thoroughness; 2) probability of finding the 'best' tree; 3) speed. (15 pts)
  8. Nonparametric bootstrapping is a random resampling method widely used in phylogenetic analysis. Describe how a bootstrap analysis is performed, and indicate what information it provides. If bootstrap analysis finds strong (100%) support for a branch on a tree, what does that mean? (10 pts)
  9. We have discussed three major families of phylogenetic methods: parsimony, likelihood, and distance methods. Briefly describe each of these approaches, and contrast them against each other. Be sure to note important differences in their assumptions, and give a general indication of the steps involved in performing an analysis with each method. (15 pts)
  10. Listed below are several examples of molecular phylogenetic problems, each paired with an analytical method that might be chosen to analyze the data. For each problem: (a) indicate whether you feel the approach described is appropriate; (b) explain why you are of this opinion; and (c) where you feel the approach is not appropriate, suggest a better approach and explain why you would prefer it. (20 pts)
    1. The ribosomal database project now has more than 3000 small subunit ribosomal RNA sequences, with about 1200 characters in the alignment. Gary decides to analyze the dataset via maximum likelihood using a general time-reversible (GTR) model with gamma rate correction.
    2. Jamila has a dataset composed of 100 tufA sequences from both bacteria and plastids. She determines that some of the sequences are highly A-T rich (some with as much as 95% A-T at third codon-positions), while others are similarly G-C rich. She performs a neighbor-joining analysis using LogDet distances.
    3. While studying coxI DNA sequences from mitochondria of the angiosperm genus Plantago and its close relatives in the Scrophulariaceae, Yangrae notices that some species of Plantago seem to be undergoing very rapid sequence evolution. These taxa have sequences that are very different from any other sequence in the alignment (so divergent that they cannot be detected by Southern hybridization). He decides to analyze the dataset using unweighted parsimony.
    4. Ken is studying the angiosperm genus Clarkia, and has determined chloroplast restriction-site maps for about 30 species. He codes a dataset composed of presence/absence characters for each restriction site, and analyzes it with unweighted parsimony.

     

Sample Midterm Exam II

Answer each question in standard written English. In most cases a satisfactory answer will require only a few lines. The point values for each question are indicated.

1. Describe the distinction between an algorithmic method and an optimality method. (5 pts)

2. Isozyme (or allozyme) analysis involves the comparison of electrophoretic mobilities of known proteins. The proteins are typically separated by starch-gel electrophoresis, identified by biochemical assay, and scored by position on the gel. Identify the major advantages and disadvantages of this method. (5 pts)

3. We have discussed three families of optimality criteria: maximum parsimony, maximum likelihood, and distance. Briefly describe each of these optimality criteria as applied to DNA data. Be sure to identify the critical differences between these criteria. (15 pts)

3a. Parsimony

3b. Maximum Likelihood

3c. Distance (particularly Fitch-Margoliash and related methods)

4a. Diagram a methodical search tree for an exhaustive search of five taxa (label them A, B, C, D, E). (20 pts)

4b. Describe the branch-and-bound algorithm, and explain how it can speed a search while still ensuring that the shortest possible tree will be found. You may want to make reference to the search tree you diagrammed above.
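
For checking the size of the search tree in 4a, the following is a minimal sketch that counts the unrooted, fully bifurcating trees produced by adding one taxon at a time. The function name `num_unrooted_trees` is illustrative only and does not come from the course materials.

```python
def num_unrooted_trees(n_taxa: int) -> int:
    """Count distinct unrooted, fully bifurcating trees for n_taxa labeled taxa."""
    count = 1  # three or fewer taxa can be joined in only one way
    for k in range(4, n_taxa + 1):
        # A tree with k - 1 taxa has 2 * (k - 1) - 3 = 2k - 5 branches,
        # and taxon k can be attached to any one of them.
        count *= 2 * k - 5
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 10):
        print(n, "taxa:", num_unrooted_trees(n), "trees")
```

Each level of the search tree corresponds to one pass through the loop, which is why the number of trees grows so quickly as taxa are added and why a bound that prunes whole branches of the search tree can save so much work.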

5a. Using unweighted maximum parsimony, calculate the length of each of the trees shown, given the alignment below. Show your calculations clearly. Treat all character-state transformations as equally likely and reversible. (15 pts)

 Alpha    ATGGC GGGAA AAAGT
 Beta     ATGTC AAGAA ACTCA
 Gamma    ATGTC AAGAA ACTCA
 Delta    ATGGC GGGGC GAGAT
 Epsilon  ATGGC GGGGC GAGGT
 Zeta     ATGGC TGGGA ACGGA

 

5b. Which of the trees is favored (better) according to parsimony?

5c. How many of the characters in the matrix above are considered informative according to parsimony?
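
Hand calculations for question 5 can be checked with a short script. The sketch below is one possible approach, not the course's prescribed method: it counts parsimony-informative columns and scores a tree with Fitch's algorithm for unordered, equally weighted characters. The tree `HYPOTHETICAL_TREE` is an assumption made up for illustration; it is not one of the exam's trees, so substitute the topology you want to score. All function and variable names are hypothetical.

```python
from collections import Counter

# The alignment from question 5, with the spaces between blocks removed.
ALIGNMENT = {
    "Alpha":   "ATGGCGGGAAAAAGT",
    "Beta":    "ATGTCAAGAAACTCA",
    "Gamma":   "ATGTCAAGAAACTCA",
    "Delta":   "ATGGCGGGGCGAGAT",
    "Epsilon": "ATGGCGGGGCGAGGT",
    "Zeta":    "ATGGCTGGGAACGGA",
}

# Made-up topology as nested pairs. The root position is arbitrary because
# the length does not depend on it for unordered, reversible characters.
HYPOTHETICAL_TREE = ((("Beta", "Gamma"), "Zeta"), ("Alpha", ("Delta", "Epsilon")))


def is_informative(column):
    """A column is informative if at least two states each occur in two or more taxa."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def count_informative(alignment):
    """Count parsimony-informative columns in the alignment."""
    return sum(is_informative(col) for col in zip(*alignment.values()))


def fitch_length(tree, alignment):
    """Total number of state changes required on the tree (Fitch's algorithm)."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        def state_set(node):
            nonlocal total
            if isinstance(node, str):          # a tip: its observed state
                return {alignment[node][site]}
            left, right = (state_set(child) for child in node)
            common = left & right
            if common:                         # children agree: no change needed
                return common
            total += 1                         # children disagree: one change
            return left | right
        state_set(tree)
    return total


if __name__ == "__main__":
    print("informative characters:", count_informative(ALIGNMENT))
    print("length of hypothetical tree:", fitch_length(HYPOTHETICAL_TREE, ALIGNMENT))
```

Scoring each candidate tree with `fitch_length` and comparing the totals gives the parsimony ranking asked for in 5b.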

6. Explain how the likelihood of a tree is calculated in maximum likelihood analysis of DNA data. (10 pts)

7. The following diagram illustrates several models of DNA sequence evolution. Label the diagram, and use it to describe the important differences among these models of sequence evolution. Pay particular attention to the Jukes-Cantor (JC), Kimura two-parameter (K2P), Felsenstein/Hasegawa-Kishino-Yano (F84/HKY85), and General Time Reversible (GTR) models. (15 pts)

8. Describe bootstrap analysis: (15 pts)

8a. Explain how the analysis is performed.

8b. How should bootstrap values be interpreted, i.e., what do they mean? What useful information does bootstrap analysis provide?

8c. What are some major problems with bootstrap analysis?
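
The resampling step asked about in 8a can be illustrated with a minimal sketch. It covers only the construction of one pseudoreplicate alignment by sampling columns with replacement; building a tree from each replicate and tallying how often each branch appears are left out. The names `bootstrap_replicate` and `TOY_ALIGNMENT` and the toy data are hypothetical.

```python
import random

# Made-up toy alignment, used only to demonstrate the resampling step.
TOY_ALIGNMENT = {
    "taxon1": "ACGTAC",
    "taxon2": "ACGTTC",
    "taxon3": "ATGTTA",
}


def bootstrap_replicate(alignment, rng):
    """Build one pseudoreplicate by sampling columns with replacement.

    The replicate has the same taxa and the same number of columns as the
    original; some original columns appear more than once, others not at all.
    """
    n_sites = len(next(iter(alignment.values())))
    sampled = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in sampled)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for _ in range(3):
        print(bootstrap_replicate(TOY_ALIGNMENT, rng))
```

Whole columns are resampled together so that each taxon keeps its own state within every sampled character; only the set of characters changes from one replicate to the next.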