Goal: synthesize what we have covered so far into a single, coherent analysis. We have used GCG, Perl, and Unix to open the door to complex searches for information. Today we will build on those pieces and bring them together to produce more complete analyses.
Netfetch is an extremely useful tool for quickly downloading sequences
into GCG. However, as we have noticed, its output is flawed. To get
around this, I wrote a bash script called fixnetfetch; another option
is to modify your own script that translates from the NCBI flat file
format to FASTA.
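If you take the second route, the core of a flat file to FASTA translator is small. The following is only a minimal sketch in Python, not fixnetfetch and not the script from earlier sessions. It assumes a single record with the usual ACCESSION, DEFINITION, and ORIGIN lines and a closing "//"; the function name genbank_to_fasta and the width parameter are made up for illustration.

```python
def genbank_to_fasta(text, width=60):
    """Convert one GenBank-style flat-file record (a string) to FASTA text.

    Assumptions: a single record, sequence lines follow ORIGIN, and the
    record ends with "//". Only the first DEFINITION line is used.
    """
    accession = "unknown"
    definition = ""
    residues = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):
            # End-of-record marker.
            break
        if in_origin:
            # Sequence lines look like "       61 acgtac gtacgt ...";
            # drop the position numbers and spaces, keep the letters.
            residues.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("ACCESSION"):
            fields = line.split()
            if len(fields) > 1:
                accession = fields[1]
        elif line.startswith("DEFINITION"):
            definition = line[len("DEFINITION"):].strip()
        elif line.startswith("ORIGIN"):
            in_origin = True
    seq = "".join(residues)
    header = f">{accession} {definition}".rstrip()
    # Wrap the sequence to a fixed line width.
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header] + body) + "\n"
```

To use it, read the flat file into a string and print the result, for example print(genbank_to_fasta(open("L07390.gb").read())), where L07390.gb is a hypothetical file name for the record you saved.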
As you go through these exercises, keep a log of what you are doing using the tools on your workstation and answer all the relevant questions. When you have finished exploring these sequences, print out your log and turn it in.
Acquire the sequence for accession L07390.
BLAST the sequence against nr and find the closest matches from the
model mosquito, Anopheles gambiae, and from Caenorhabditis elegans.
Compare the BLAST alignment between the test sequence and Anopheles to
a Smith-Waterman alignment produced by GCG's bestfit. How do the
statistics reported by the two algorithms compare?
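When comparing the two outputs, it may help to see what a Smith-Waterman local alignment actually computes. The sketch below, in Python, fills the scoring matrix and traces back from the highest-scoring cell. It is an illustration only: the match, mismatch, and gap values are arbitrary assumptions, the gap penalty is linear, and none of this reflects the defaults or scoring matrices used by bestfit or BLAST.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local
    alignment of strings a and b, using a linear gap penalty.

    The scoring values are illustrative assumptions, not program defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; a cell never drops below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,     # gap in b
                score[i][j - 1] + gap,     # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because every cell of the matrix is filled, this approach is exhaustive for the chosen scoring scheme, whereas BLAST is a heuristic search. Keep that difference in mind when the two programs report different alignment boundaries or scores for the same pair of sequences.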
Now imagine the following scenario:
>unknown.seq
CCTTGATAAGTGCGTACGNCNAGGTTTTCCNATTCANANGTTNTAAAANG
ACGGCCAGTGAATNGTAATACGACTCACTATAGGGCGAATTGGGTACCGG
GCCCCCCCTCGAGGTCGACGGTATCGATAAGCTTGATatgAGTTCCAATC
TAAAAAATAATGAATATAAAGAAGGATGTTATCCTTTGTTCTTTTTTGAA
AATTTCTACGTAAAAGTCTCTATTAATACTGCTTATTATATACTTAAGAC
AGACAAAAGAAAAAAAGATAAACAAATAAAATCCAATTTATTAAAAAAAA
ACAATCAAATGATACCTTTCATTTTCTACTTATTAAACGATTAATTCGTA
AGATAAGGCAACAAAATTTTAACTGGAGTGAATCATCTAGATTGTATGAT
TTTTCTAATAAAATAGAACCTAATTATAAATATGAATATAATAGAATTAA
GTTATTTTATATTTTATTAATAGAGAATTTGATATTTTTAGTATTACGAT
TCTTATGGGAACAAAAACAAGAGAAAAAGAATGATTTTTCTCTTTTCATT
AAAAAATCTATTCAATTTGCTTTTCCTTTTTTAGAACATAAAATGAGTAA
TTCTGCTTCAATAATAGAAGGACAACTTTGTTTTTCTTATACAACTAGAA
AGCTTAATTTTTTGCTTTTTTTTCTTTACAAGAGAATCCGCGATACTGTC
TTTATAAATTTACTAAAAAAAATATTCAAGTTTAATAAATTACTTTTAAG
GAAAGAGAATTATTTCAATGTAAATTATTTCAATGTAATGTCTAAAATCA
GATTATTGGACTTATTAGCAAACTTATATGGAAATGAATTTGATTCTTTT
TTTGTTTACAATATTTTAAAAATACATAACTTAAATTGTCTTTTTTTGCC
ATATAAATCTATAGAAGATTATTCTTTACTACAAAAACACAATATTATTA
TTAATAGTAATAGTTATAAAAATCAAATAAATATATCTTCTTTTTCTTGG
TTAATTATCAATTTTATATATTCCATATACGGACACATTTTCTATATACG
CCGTGGCATTTCATTTCTAATAATCCTTAAACTAGGACGAGGTTTCTCTC
GATTTTGGAAATTTAATTGTGTCAAATTTATACAATTGAAATTAGAATCT
AATCGTTCTTTTTATTTAATACAGTCACGGTTTGTTTTACGTCAAAGTTC
GTTATTCTTAGGGTATAAAATTATAAATAGGTTTTGGCAAAAAAAACTAA
AAATTAAAGCATCTTCTTGGTCTTTTTTTGTTTTTTTAAAAGATCGAAAA
ATATCTTCAGAAATACCAATTGATAATCTTATTACTAATTTAACTGTAAT
TAATTTATGTAATAAAAAAGGTTATCCAATTCATAAAGCTTCGTGGTCTA
CATTTAGTGATCAACAAATTATAAAAATTTATAATAAAGTGTGGAATGAA
TTATTTTTGTATTATTGTGGATCTTCGAATCGTTCTATTTTAACTCAAAT
TCAGTATATTTTAGAATTTTCATGTATTAAAACTTTAGCTTTTAAACATA
AATCTAATATTAGATTGGCATGGGAGCAATACAGAAAAGATGTGTCATTA
TCCAACTTAGAAAGCGATATAGATTATTTTGGTAAAATCTCATATAATTT
TCCTTCTTTATTTCAAAAAAAAAACTTTTTTTGGCTTTTAGGAATTTCTA
GAATTGATCATCCAAATTCTTTTATTATTGAGTCATATTCAAGAATACAT
GAGGAAAGCCGCTTGCATtgaATCGAATTCCTGCAGCCCGGGGGATCCAC
TAGTTCTAGAGCGGCCGCCACCGCGGTGGAGCTCCAGCTTTTGTTCCCTT
TAGTGAGGGTTAATTTCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCC
TGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAA
CATAA
Created: Wed Sep 15 00:58:22 EDT 2004 by Charles F. Delwiche