Multiple Sequence Alignment

Lab, Week of November 6

Practice using ClustalW from the command line to align the sequences. Running ClustalW from the command line is much like running Needle or Water, both of which you have used before. When ClustalW starts, it presents you with a menu; use this menu to load your sequence file.

The program will then present you with the same menu. Start by trying a full alignment.

After the analysis has completed, there should be two new files in your working directory. Look at the contents of these two files. One should be an alignment.

The other should be a dendrogram (tree) file. You will find that the tree in the tree file doesn't look much like a tree. However, if you look carefully at the file you will see identifiers and numbers within nested parentheses. Sequences that are within a common set of parentheses share a branch. The numbers are branch lengths. Try to draw the tree that is represented by the tree file.
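If you want to check your hand-drawn tree, the nested-parentheses notation described above can be unpacked with a short script. The following is only a minimal Python sketch, not part of ClustalW: the function names (parse_newick, render, leaf_names) and the example string with sequences seqA through seqD are made up for illustration, and it assumes simple identifiers without quotes or spaces. Replace the example string with the contents of your own tree file.

```python
def parse_newick(text):
    """Parse a parenthesized tree string into (name, length, children) tuples."""
    text = "".join(text.split())  # tree files are often wrapped over many lines
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else ""

    def read_until(stops):
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in stops:
            pos += 1
        return text[start:pos]

    def parse_node():
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while peek() == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
        name = read_until(",():;")  # empty for unnamed internal nodes
        length = None
        if peek() == ":":
            pos += 1  # consume ":"
            length = float(read_until(",();"))
        return (name, length, children)

    return parse_node()


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    name, length, children = node
    label = name or "(internal node)"
    if length is not None:
        label += f"  [branch length {length}]"
    lines = ["    " * depth + label]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


def leaf_names(node):
    """Collect the sequence identifiers at the tips of the tree."""
    name, _, children = node
    if not children:
        return [name]
    return [n for child in children for n in leaf_names(child)]


# Hypothetical example; seqA and seqB sit in a common set of parentheses,
# so they share a branch.
example = "((seqA:0.10,seqB:0.12):0.05,seqC:0.30,seqD:0.33);"
tree = parse_newick(example)
print("\n".join(render(tree)))
```

Each level of indentation in the printed output corresponds to one level of nested parentheses, so sequences that appear under the same internal node are the ones that share a branch.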

Based on the names of the proteins in the original text file, do you think this is a good phylogenetic tree?

Go to the EBI website and run ClustalW first to see whether you get the same tree. Then try T-Coffee. How do the alignments compare? How do the trees compare? Do you get any additional results?

If you have time, go to the MEME website and run the sequences through the MEME algorithm. What results do you get? Run the second MEME output file through the MAST algorithm using the GenPept database. How do the results differ from those you got with ClustalW and T-Coffee?