
The Seqlab menus:

  • Pairwise Comparison
    • Gap: Needleman Wunsch
    • Bestfit: Smith Waterman
    • framealign: Align a protein sequence against the codons in all reading frames of a nucleotide sequence, showing whether the optimal alignment contains a frameshift.
    • compare: Finds points of similarity which may be plotted with dotplot
    • dotplot: plots the output of compare into a pretty dotplot
    • gapshow: plots an alignment by graphing similarities and gaps
    • profilegap: create an alignment between a profile and 1+ sequences
  • Multiple Comparison
    • Pileup: Create MSA from progressive pairwise alignments
    • plotsimilarity: Plot running average of similarity among sequences of a MSA
    • pretty: Displays multiple sequence alignments and calculates a consensus sequence
    • motif: save conserved similarities among multiple sequences as a set of profiles which may be used to search a db
    • profilemake: Create a position specific scoring matrix from a MSA
    • profilegap: Make an optimal alignment between a profile and 1+ sequences
    • hmmeralign: Start with a hidden Markov model profile and create an optimal alignment
    • overlap: Compare 2 sets of dna sequences in both orientations
    • nooverlap: Identify where a group of nucleotide sequences do not share any subsequence
    • olddistance: Make a table of pairwise similarities in a group of aligned sequences
  • Database Reference Searching
    • lookup: Identify sequences by name, accession number, etc. and return the matching sequences
    • stringsearch: Identify sequences by searching for character strings in their documentation, e.g. type 'globin'
  • Database Sequence Searching
    • blast: Perform the basic local alignment search tool between a sequence and the local sequence dataset.
    • netblast: The same, but over a (broken) network connection to ncbi.nlm.nih.gov; use blastcl3 instead.
    • netfetch: Download sequence(s) from ncbi. Be aware that the output from netfetch is not compatible with the rest of gcg and must be massaged first.
    • psiblast: Perform a position specific iterated blast against the local ncbi db.
    • fasta: Perform a Pearson/Lipman search for local alignments between a query sequence and a db.
    • ssearch: Smith/Waterman search against a db, the slowest option.
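    • tfasta: Pearson/Lipman search of a protein query against a nucleotide db translated in all 6 reading frames.
    • tfastx: Same as tfasta, except handle frameshifts while searching the db.
    • fastx: The reverse case: a nucleotide query, translated on both strands with frameshifts allowed, searched against a protein db.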
    • framesearch: Compares nucleotide/protein sequences, trying out all available frames.
    • hmmersearch: Uses a profile from hmmbuild in order to search a db.
    • motifsearch: Uses profile(s) from meme to perform a similar search.
    • profilesearch: Uses a profile from profilemake, which is built from a MSA.
    • profilesegments: Using segments of similarity from profilesearch, perform an optimal alignment.
    • findpatterns: Find specific patterns, with or without ambiguities/mismatches.
    • motifs: Search a sequence for motifs from the PROSITE motif directory
    • wordsearch: Find sequences that share a large number of common words with the query sequence.
    • Segments: Given input from wordsearch, this provides an alignment and shows the segments of similarity.
  • Editing and Publication
    • assemble: Makes a sequence from pieces of other sequences (seqed may be the better tool for this).
    • pretty: Displays Multiple Sequence alignments and calculates a consensus sequence.
    • plasmidmap: Shows a circular plot of a plasmid.
  • Evolution
    • distances: Create a table of pairwise distances among aligned sequences.
    • growtree: Create a tree from a table of distances, such as the output of distances.
    • diverge: Estimate number of pairwise synonymous and non-synonymous mutations.
    • paupsearch: A gcg interface to paup
    • paupdisplay: Tree display/edit utility for paup in gcg
  • Fragment Assembly
    • gelstart: creates a fragment assembly project for gcg.
    • gelenter: Adds new sequences to an assembly project.
    • gelmerge: Create contigs from sequences in an assembly project.
    • gelassemble: A sequence editor for contig editing.
    • gelview: Display the structure of contigs.
    • geldisassemble: Break up contigs and start over.
  • Gene finding and pattern recognition
    • testcode: plot non randomness in the third position of all codons.
    • codonpreference: attempt to find orfs in a single frame by looking at similarity of codon usage to a codon usage table.
    • frames: Shows potential open reading frames in all six translation frames (three forward, three reverse).
    • terminator: search for prokaryotic factor-independent RNA polymerase terminators.
    • motifs: search for motifs using the PROSITE dictionary of protein sites and patterns.
    • meme: find conserved motifs in unaligned sequences to be saved as a profile.
    • findpatterns: Identify short patterns like GAATTC, allowing ambiguities or mismatches.
    • composition: Determine the composition of sequences, dinucleotide and trinucleotide included.
    • codonfrequency: Tabulate codon frequency, used by others...
    • fitconsensus: Use a consensus table from consensus to find the best-matching regions in a sequence.
    • consensus: Using pre-aligned nucleotide information, calculate ATGC% for each position.
    • xnu: Replace tandem repeats with Xs; useful before BLAST.
    • seg: Replace low-complexity regions with Xs; useful before BLAST.
    • Hmmeralign: Take a hidden Markov model profile and use it to align a group of sequences.
    • Hmmerbuild: Create a position specific profile useful for alignment, emission of sequences, or searching.
    • Hmmercalibrate: Given a profile, compare it to a large number of random sequences to create an extreme value distribution of search scores and use these to inform a new profile.
    • HmmerConvert: Convert an HMM to another format.
    • HmmerEmit: Generate sequences that match a given hmm profile.
    • HmmerFetch: Pull a given profile from a profile database.
    • HmmerIndex: Create an indexed set of HMM profiles which can be searched with hmmerfetch.
    • HmmerPfam: Compare sequences to profiles in a library like pfam to identify domains.
    • HmmerSearch: Use a profile as a query and search a db with it to find similar sequences.
  • Importing/Exporting
    • Reformat: Rewrite sequence files etc into the gcg format.
    • FromStaden: From staden's assembler into gcg.
    • FromEMBL: From the EMBL format into gcg.
    • FromGenbank: Reformat sequences in the genbank flat file format into gcg.
    • FromPIR: From the Protein identification resource format into gcg.
    • FromIG: From the Intelligenetics format to gcg.
    • FromFASTA: From the old style fasta format into gcg.
    • ToStaden: Export from gcg to staden.
    • ToPIR: Export to PIR.
    • ToIG: Export to IG.
    • ToFASTA: Export to FASTA.
  • Mapping
    • Map: Create a restriction enzyme map on both strands of a sequence.
    • MapPlot: Plot the output of Map.
    • MapSort: Take the positions of restriction enzyme sites and sort the resulting fragments by size.
    • PlasmidMap: Draw a circular plot of a plasmid construct.
    • Fingerprint: Calculate the products of a T1 ribonuclease digestion.
    • PeptideMap: Create a peptide map of amino acid sequence.
    • PeptideSort: Sort peptide fragments from the digest of an amino acid sequence.
  • Primer Selection
    • Prime: Select oligonucleotide primers from template DNA sequence.
    • PrimePair: Evaluate individual primers to determine their compatibility for use as primer pairs.
    • MeltTemp: Compute the approximate melting temperature of an oligonucleotide sequence.
  • Protein Analysis
    • Motifs: Given a protein sequence, search the prosite directory of protein sites and patterns and show abstracts.
    • ProfileScan: Given a db of profiles, find domains within a single sequence.
    • HmmerPfam: Compare a sequence to profiles in pfam etc.
    • PeptideSort: Ibid.
    • Isoelectric: Plot charge vs. pH for any protein sequence.
    • PeptideMap: Ibid.
    • PepPlot: Plot secondary structure and hydrophobicity in parallel.
    • PeptideStructure: Make secondary structure predictions including alpha helices and beta sheets etc.
    • PeptidePlot: Plot the output from PeptideStructure.
    • Moment: Make a contour plot of the helical hydrophobic moment of a peptide sequence.
    • TransMem: Search for likely transmembrane sequences.
    • HelicalWheel: plot a peptide sequence as a helical wheel to help find amphiphilic regions.
    • HTHscan: search protein sequences for helix-turn-helix motifs.
    • SPScan: Search sequences for signal peptides.
    • CoilScan: Locate coiled-coil sequences.
    • Xnu: Ibid.
    • Seg: Ibid.
  • Nucleic acid Secondary Structure
    • mfold: Predict optimal and sub-optimal secondary structures using Zuker's energy minimization method.
    • plotfold: Plot the potential nucleic acid secondary structures predicted by mfold using Zuker's algorithm.
    • StemLoop: Search for stem structures (inverted repeats).
    • Dotplot: Ibid.
  • Translation
    • Translate: Convert nucleotide sequence to its translation.
    • backtranslate: Translate an amino acid sequence back into a nucleotide sequence.
    • Map: Map a sequence with restriction enzyme placements and protein translation.
    • Reverse: Create the reverse and/or complement of a sequence.
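
The difference between Gap (Needleman Wunsch) and Bestfit (Smith Waterman) in the Pairwise Comparison menu can be illustrated with a minimal Python sketch. This is not gcg's implementation: the scoring values (match +1, mismatch -1, linear gap penalty -2) and the function names global_score and local_score are assumptions made up for illustration, and only the scores are computed, not the alignments themselves.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch style score: all of a is aligned against all of b.

    The scoring parameters are illustrative assumptions, not gcg defaults.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman style score: best-scoring pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 lets an alignment restart anywhere, which is
            # what makes the result local instead of global.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best
```

With these assumed scores, two sequences that share only a short core (for example GATTACA flanked by unrelated bases) get a high local score but a poor global score, which is roughly when Bestfit is the more useful choice over Gap.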
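
What Reverse, Translate and frames do (reverse complement, translation, and the six reading frames) can be sketched in a few lines of Python. This is an illustration under assumptions, not gcg code: it uses the standard genetic code, handles only A, C, G and T, writes X for any codon it does not recognise, and the function names are invented here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from the first base; a trailing partial codon is dropped."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unrecognised codon
        for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq):
    """Translate the three forward and three reverse reading frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```

For example, six_frames("ATGGCCTAA") returns a dictionary keyed "+1" to "+3" and "-1" to "-3", in which stop codons appear as *.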

Created: Wed Sep 15 00:58:22 EDT 2004 by Chuck Delwiche
Last modified: Mon Nov 8 15:49:44 EST 2004 by Ashton Trey Belew.