The Seqlab menus:

  • Pairwise Comparison
    • Gap: Needleman Wunsch
    • Bestfit: Smith Waterman
    • framealign: Align a protein sequence against the codons in all reading frames of a nucleotide sequence, showing whether the optimal alignment contains a frameshift.
    • compare: Finds points of similarity which may be plotted with dotplot
    • dotplot: plots the output of compare into a pretty dotplot
    • gapshow: plots an alignment by graphing similarities and gaps
    • profilegap: create an alignment between a profile and 1+ sequences
  • Multiple Comparison
    • Pileup: Create MSA from progressive pairwise alignments
    • plotsimilarity: Plot running average of similarity among sequences of a MSA
    • pretty: Displays multiple sequence alignments and calculates a consensus sequence
    • motif: save conserved similarities among multiple sequences as a set of profiles which may be used to search a db
    • profilemake: Create a position specific scoring matrix from a MSA
    • profilegap: Make an optimal alignment between a profile and 1+ sequences
    • hmmeralign: Start with a hidden Markov model profile and create an optimal alignment
    • overlap: Compare 2 sets of DNA sequences in both orientations
    • nooverlap: Identify where a group of nucleotide sequences do not share any subsequence
    • olddistance: Make a table of pairwise similarities in a group of aligned sequences
  • Database Reference Searching
    • lookup: Identify sequences by name, accession number, and similar fields, and return the sequences
    • stringsearch: Identify sequences by searching for character strings in their documentation, e.g. type 'globin'
  • Database Sequence Searching
    • blast: Run the Basic Local Alignment Search Tool with a query sequence against the local sequence dataset.
    • netblast: Ditto, except over a (broken) network connection to ncbi.nlm.nih.gov; use blastcl3 instead.
    • netfetch: Download sequence(s) from ncbi. Be aware that the output from netfetch is not compatible with the rest of gcg and must be massaged.
    • psiblast: Perform a position specific iterated blast against the local ncbi db.
    • fasta: Perform a Pearson/Lipman search for local alignments and db searches.
    • ssearch: Smith/Waterman search against a db, the slowest option.
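    • tfasta: Pearson/Lipman search with a protein query against a nucleotide db translated in all 6 reading frames.
    • tfastx: Same, except handle frameshifts while searching the db.
    • fastx: Pearson/Lipman search with a nucleotide query, translated on both strands and allowing frameshifts, against a protein db.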
    • framesearch: Compares nucleotide/protein sequences, trying out all available frames.
    • hmmersearch: Uses a profile from hmmbuild in order to search a db.
    • motifsearch: Uses profile(s) from meme to perform similar search.
    • profilesearch: Uses profile from profilemake which is a MSA profile.
    • profilesegments: Using segments of similarity from profilesearch, perform an optimal alignment.
    • findpatterns: Find specific patterns, with or without ambiguities/mismatches.
    • motifs: Search a sequence for motifs from the PROSITE motif directory
    • wordsearch: find sequences with a large number of common words similar to query sequence
    • Segments: Given output from wordsearch, align and display the segments of similarity.
  • Editing and Publication
    • assemble: Makes sequences from pieces of sequences, apparently seqed is better?
    • pretty: Displays Multiple Sequence alignments and calculates a consensus sequence.
    • plasmidmap: Shows a circular plot of a plasmid.
  • Evolution
    • distances: Create a table of pairwise distances among aligned sequences.
    • growtree: Create a tree from a table of distances a la distances.
    • diverge: Estimate number of pairwise synonymous and non-synonymous mutations.
    • paupsearch: A gcg interface to paup
    • paupdisplay: Tree display/edit utility for paup in gcg
  • Fragment Assembly
    • gelstart: creates a fragment assembly project for gcg.
    • gelenter: Adds new sequences to an assembly project.
    • gelmerge: Create contigs from sequences in an assembly project.
    • gelassemble: A sequence editor for contig editing.
    • gelview: Display the structure of contigs.
    • geldisassemble: Break up contigs and start over.
  • Gene finding and pattern recognition
    • testcode: plot non randomness in the third position of all codons.
    • codonpreference: attempt to find ORFs in a single frame by looking at similarity of codon usage to a codon usage table.
    • frames: Shows open reading frames in the three forward and three reverse reading frames.
    • terminator: search for prokaryotic factor-independent RNA polymerase terminators.
    • motifs: search for motifs using the PROSITE dictionary of protein sites and patterns.
    • meme: find conserved motifs in unaligned sequences to be saved as a profile.
    • findpatterns: identifying short patterns like GAATTC, allows ambiguity or mismatches.
    • composition: Determine the composition of sequences, dinucleotide and trinucleotide included.
    • codonfrequency: Tabulate codon frequency, used by others...
    • fitconsensus: use a consensus table from consensus to find the best matches to that consensus in a sequence.
    • consensus: Using pre-aligned nucleotide information, calculate ATGC% for each position.
    • xnu: Replace tandem repeats with Xs, useful for BLAST.
    • seg: Current algorithm for replacing low-complexity regions with Xs for BLAST.
  • HMMER
    • Hmmeralign: Take a hidden Markov model profile and use it to align a group of sequences.
    • Hmmerbuild: Create a position specific profile useful for alignment, emission of sequences, or searching.
    • Hmmercalibrate: Given a profile, score it against a large number of random sequences, fit an extreme value distribution to the search scores, and save the fitted parameters in a new profile.
    • HmmerConvert: Convert an HMM to another format.
    • HmmerEmit: Generate sequences that match a given hmm profile.
    • HmmerFetch: Pull a given profile from a profile database.
    • HmmerIndex: Create an indexed set of HMM profiles which can be searched with hmmerfetch.
    • HmmerPfam: Compare sequences to profiles in a library like pfam to identify domains.
    • HmmerSearch: Use a profile as a query and search a db with it to find similar sequences.
  • Importing/Exporting
    • Reformat: Rewrite sequence files etc into the gcg format.
    • FromStaden: From staden's assembler into gcg.
    • FromEMBL: From the EMBL format into gcg.
    • FromGenbank: Reformat sequences in the genbank flat file format into gcg.
    • FromPIR: From the Protein identification resource format into gcg.
    • FromIG: From the Intelligenetics format to gcg.
    • FromFASTA: From the old style fasta format into gcg.
    • ToStaden: Export from gcg to staden.
    • ToPIR: Export to PIR.
    • ToIG: Export to IG.
    • ToFASTA: Export to FASTA.
  • Mapping
    • Map: Create a restriction enzyme map on both strands of a sequence.
    • MapPlot: Plot the output of Map.
    • MapSort: Take positions of restriction enzyme sites and sort the resulting fragments by size.
    • PlasmidMap: Draw a circular plot of a plasmid construct.
    • Fingerprint: Calculate the products of a T1 ribonuclease digestion.
    • PeptideMap: Create a peptide map of amino acid sequence.
    • PeptideSort: Sort peptide fragments from the digest of an amino acid sequence.
  • Primer Selection
    • Prime: Select oligonucleotide primers from template DNA sequence.
    • PrimePair: Evaluate individual primers to determine their compatibility for use as primer pairs.
    • MeltTemp: Compute the approximate melting temperature of an oligonucleotide sequence.
  • Protein Analysis
    • Motifs: Given a protein sequence, search the prosite directory of protein sites and patterns and show abstracts.
    • ProfileScan: Given a db of profiles, find domains within a single sequence.
    • HmmerPfam: Compare a sequence to profiles in pfam etc.
    • PeptideSort: Ibid.
    • Isoelectric: Plot charge vs. pH for any protein sequence.
    • PeptideMap: Ibid.
    • PepPlot: Plot secondary structure and hydrophobicity in parallel.
    • PeptideStructure: Make secondary structure predictions including alpha helices and beta sheets etc.
    • PeptidePlot: Plot the output from PeptideStructure.
    • Moment: Make a contour of the helically hydrophobic moments of a peptide sequence.
    • TransMem: Search for likely transmembrane sequences.
    • HelicalWheel: plot a peptide sequence as a helical wheel to help find amphiphilic regions.
    • HTHscan: search protein sequences for helix-turn-helix motifs.
    • SPScan: Search sequences for signal peptides.
    • CoilScan: Locate coiled-coil sequences.
    • Xnu: Ibid.
    • Seg: Ibid.
  • Nucleic acid Secondary Structure
    • mfold: Predict optimal and suboptimal secondary structures using Zuker's energy minimization method.
    • plotfold: Plot the optimal and suboptimal nucleic acid secondary structures predicted by mfold.
    • StemLoop: Search for stem structures (inverted repeats).
    • Dotplot: Ibid.
  • Translation
    • Translate: Convert nucleotide sequence to its translation.
    • backtranslate: Return from amino acid sequence to nucleotide.
    • Map: Map a sequence with restriction enzyme placements and protein translation.
    • Reverse: Create the reverse and/or complement of a sequence.
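
To make the Translate and Reverse entries concrete, here is a minimal Python sketch of both operations. It is not how gcg implements them: the function names are made up for illustration, only the standard genetic code is assumed, and any codon containing an ambiguity code is simply reported as X.

```python
# Illustrative sketch only; not the gcg implementation.
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, frame=0):
    """Translate DNA starting at offset `frame` (0, 1 or 2); '*' marks a stop."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons not in the table (e.g. containing N) are reported as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```

Calling translate on reverse_complement(seq) with frames 0, 1 and 2 gives the three reverse frames, which is roughly the information the frames program works from.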
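
The Gap entry under Pairwise Comparison names the Needleman Wunsch algorithm. As a rough illustration of the idea, the Python sketch below computes only the optimal global alignment score by dynamic programming. It assumes a simple linear gap penalty and made-up scoring values (match, mismatch and gap are all illustrative parameters), so it is a teaching sketch and will not reproduce Gap's scores.

```python
# Illustrative sketch only; scoring values are assumptions, not gcg defaults.
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against nothing costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

The Smith Waterman method behind Bestfit uses the same recurrence, except that every cell is floored at zero and the score is the largest cell anywhere in the table, which yields a local rather than a global alignment.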
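
The findpatterns entries mention short patterns like GAATTC with ambiguity. One way to picture this is to expand each IUPAC ambiguity code into a regular expression character class, as in the Python sketch below. The function name is hypothetical, mismatches are not handled, and only the forward strand is searched, so this covers just a small part of what findpatterns does.

```python
# Illustrative sketch only; not the findpatterns implementation.
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def find_pattern(pattern, seq):
    """Return 0-based start positions of every (possibly overlapping) match."""
    regex = "".join(IUPAC[ch] for ch in pattern.upper())
    # A zero-width lookahead lets overlapping occurrences all be reported.
    return [m.start() for m in re.finditer("(?=" + regex + ")", seq.upper())]
```

For example, find_pattern("GANTC", seq) reports every site where G, A, any base, T and C occur in a row. Searching reverse_complement-style on the other strand would need a second pass over the reversed, complemented sequence.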

Created: Wed Sep 15 00:58:22 EDT 2004 by Chuck Delwiche
Last modified: Mon Nov 8 15:49:44 EST 2004 by Ashton Trey Belew.