The computer program GMAP has been developed to assist biologists
working on synthetic gene design and on the modular redesign of
natural (wild-type) genes, with a view to easing the design of
useful ``cassettes'' for future manipulation of the genes. GMAP
uses the `e-generic' algorithm, which is based on set theory, to
search for restriction sites in DNA sequences. The main functions
of GMAP are to search for potential restriction sites in
protein-coded DNA sequences, and to search non-ambiguous DNA
sequences for restriction sites that can be introduced by one,
two or three mutations without altering the amino acid sequence
(i.e. silent mutations). In addition, an option is available for
mapping translationally non-silent R.E. sites that can be
generated by limited mismatching of bases.
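This README does not describe the e-generic algorithm itself, so the following Python sketch should not be read as GMAP's implementation. It only illustrates the general idea of set-based matching: each NC-IUB symbol is treated as a set of bases, and a recognition site can potentially occur wherever every sequence symbol shares at least one base with the corresponding site symbol. The function names are hypothetical.

```python
# Each NC-IUB nucleotide symbol stands for a set of possible bases.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def compatible(seq_symbol, site_symbol):
    """True if the two symbols share at least one base (non-empty intersection)."""
    return bool(IUPAC[seq_symbol] & IUPAC[site_symbol])


def find_potential_sites(sequence, site):
    """Return 0-based start positions where `site` could occur in `sequence`."""
    hits = []
    for start in range(len(sequence) - len(site) + 1):
        window = sequence[start:start + len(site)]
        if all(compatible(s, r) for s, r in zip(window, site)):
            hits.append(start)
    return hits
```

For example, the ambiguous sequence GARTTY (a reverse translation of Glu-Phe) is compatible with the EcoRI site GAATTC at position 0. Note that matching symbol by symbol ignores dependencies between positions within a codon, so it can over-report sites for amino acids such as Leu, Ser and Arg.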
Files contained in this package:

FILES       | DESCRIPTION
------------|----------------------------------------------------
GMAP.EXE    | EXECUTABLE VERSION OF GMAP PROGRAM
CODON.DOC   | DOCUMENT FILE FOR CODON USAGE
RNASE.AMN   | AMINO ACID SEQUENCE FILE OF RIBONUCLEASE A
SK.DNA      | DNA SEQUENCE OF STREPTOKINASE
REST.RES    | RESTRICTION ENZYMES AND THEIR RECOGNITION SEQUENCES
RESTC.COM   | COMMERCIALLY AVAILABLE RESTRICTION ENZYMES AND THEIR
            | RECOGNITION SEQUENCES
ECOLI.COD   | CODON USAGE TABLE CREATED FROM E. COLI USING MOST
            | FREQUENT CODONS
ECOLI1.COD  | CODON USAGE TABLE CREATED FROM E. COLI USING
            | PARTIALLY AMBIGUOUS CODONS
gmap.tar    | All files in GMAP package
gmap.tar    | uuencoded gmap.tar file

GMAP: An Introduction
GMAP is a multi-purpose computer program that aids in the de
novo design of synthetic genes as well as the cassette
mutagenesis of natural genes by predicting potential restriction
enzyme (R.E.) sites in the target DNA sequences. Specifically, it
carries out the following tasks:
i) mapping the potential restriction endonuclease (R.E.) sites
that can be introduced into a non-ambiguous DNA sequence, such as
that of a natural gene, with or without altering the amino acid
sequence, i.e. through non-silent or silent mutations;
ii) predicting the number and type of mutations required to
introduce unique R.E. sites into non-ambiguous DNA sequences
through a limited number (1, 2 or 3 bp per R.E. site) of
translationally silent/non-silent mutations;
iii) searching for all R.E. sites in the ambiguous DNA sequence
obtained by reverse translation of a given amino acid sequence;
iv) searching for R.E. sites in the DNA sequence obtained by
reverse translation of an amino acid sequence employing
user-defined codon usage.
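Task iii) can be illustrated with a short Python sketch. It is not GMAP's own code; it assumes the standard genetic code, a recognition site written with unambiguous bases only, and hypothetical function names. For each candidate offset it checks, codon by codon, whether some synonymous codon agrees with the site at the overlapping positions. Because codons are chosen independently, this check is exact for the stated assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# GMAP's own tables are not reproduced here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def site_fits(protein, site, start):
    """True if some choice of synonymous codons spells `site` at offset `start`."""
    end = start + len(site)
    if end > 3 * len(protein):
        return False
    for codon_index in range(start // 3, (end - 1) // 3 + 1):
        first = codon_index * 3
        # Positions of this codon that fall inside the site must match it.
        if not any(
            all(
                not (start <= first + k < end) or codon[k] == site[first + k - start]
                for k in range(3)
            )
            for codon in synonymous_codons(protein[codon_index])
        ):
            return False
    return True


def potential_sites(protein, site):
    """0-based nucleotide offsets where `site` can be encoded silently."""
    return [s for s in range(3 * len(protein) - len(site) + 1)
            if site_fits(protein, site, s)]
```

With these assumptions, `potential_sites("KEF", "GAATTC")` reports offset 3, because Glu-Phe can be encoded as GAA TTC.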
Finding translationally silent R.E. sites in DNA sequences
has become particularly important for biologists, especially
those investigating protein structure/function relationships.
The ability to predict potential R.E. sites that are resident in
an ambiguous DNA sequence, such as one obtained by reverse
translation of a protein amino acid sequence, allows one to
construct synthetic genes with appropriately placed sites for
cutting and joining DNA segments. Similarly, the ability to
introduce R.E. sites by limited mutagenesis, either in a
translationally silent manner into a non-ambiguous DNA sequence
(e.g. the open reading frames of natural genes) or in a
translationally non-silent manner elsewhere in genes (such as
promoters, splice junctions and other control elements that are
not normally expressed as proteins), permits the modular redesign
of genes for `cassette' mutagenesis. A pertinent example of the
latter type of application is enhancing the expression of whole
genes by cassette mutagenesis, where one wishes to cut just
outside a coding region in order to fuse it to a stronger
promoter.
The program GMAP is fully menu driven. The option `Input
amino acid sequence' allows the user to enter an amino acid
sequence (in single- or three-letter code) from the keyboard or
from a text file, and to create and update the amino acid
sequence file. Sequence data obtained from PIR or NBRF can also
be used directly to create the input amino acid sequence file.
The option `Input DNA sequence file' allows one to create and
update the DNA sequence file. Nucleotides can be entered using
only NC-IUB designated symbols (cf. Eur. J. Biochem. 1985,
150:1-5); other symbols are rejected. The data can be entered
from the keyboard or from a text (ASCII) file, so that sequence
data extracted from GenBank or EMBL can be used directly to
create a DNA sequence file. This option also allows one to
convert an amino acid sequence into a DNA sequence by using a
user-defined codon preference table. The option `Input
restriction enzyme sequence' allows the user to create and update
the restriction enzyme data file. The prototype restriction
endonuclease recognition sequences of type II enzymes are already
stored in the file REST.RES (cf. R.J. Roberts and D. Macelis;
Nucleic Acids Res. 1993, 21:3125). The option `Input Codon Usage
Table' allows one to create and update the codon preference
table. A file containing the codons preferred by E. coli is
included with the program (cf. Wada et al., Nucleic Acids Res.
1992, 20:2111).
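The rule that only NC-IUB symbols are accepted can be sketched in Python as follows. This is an illustration under assumptions, not GMAP's input routine: the function name is hypothetical, and stripping whitespace and upper-casing the input are choices made here for convenience.

```python
# The 15 NC-IUB nucleotide symbols (Eur. J. Biochem. 1985, 150:1-5).
NC_IUB_SYMBOLS = set("ACGTRYSWKMBDHVN")


def clean_sequence(text):
    """Strip whitespace, upper-case, and reject any non-NC-IUB symbol."""
    sequence = "".join(text.split()).upper()
    bad = sorted(set(sequence) - NC_IUB_SYMBOLS)
    if bad:
        raise ValueError("symbols not in the NC-IUB set: " + ", ".join(bad))
    return sequence
```

A sequence pasted from a text file with line breaks, such as "gaa ttc" followed by "NNry" on the next line, would be returned as GAATTCNNRY, while a sequence containing X would raise an error.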
The `Search R.E. sites in amino acid sequence' option allows
the user to:
i) search for all the R.E. sites in the fully ambiguous DNA
sequence obtained by reverse translation of an amino acid
sequence;
ii) search for the sites of a specific restriction enzyme in the
reverse-translated ambiguous DNA sequence;
iii) reverse translate a given amino acid sequence into a fully
or partially ambiguous DNA sequence, or into a completely
non-ambiguous DNA sequence, using a user-defined codon
preference;
iv) search for all R.E. sites in the partially ambiguous (or
non-ambiguous) DNA sequence obtained by reverse translation of an
amino acid sequence employing a user-defined codon preference
table;
v) search for the sites of a user-specified enzyme in the
partially ambiguous or completely non-ambiguous DNA sequence
obtained by reverse translation of an amino acid sequence with
user-defined codon usage.
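Reverse translation with a user-defined codon preference can be sketched in Python as below. The function name is hypothetical, and the small preference table is an illustrative placeholder, not the contents of ECOLI.COD or ECOLI1.COD. An entry may be a single codon (giving non-ambiguous DNA) or a partially ambiguous codon written with NC-IUB symbols.

```python
def reverse_translate(protein, preferred_codon):
    """Map each one-letter amino acid to its entry in the preference table."""
    missing = sorted(set(protein) - set(preferred_codon))
    if missing:
        raise KeyError("no codon defined for: " + ", ".join(missing))
    return "".join(preferred_codon[aa] for aa in protein)


# Illustrative placeholder tables (assumptions, not taken from GMAP's files).
EXAMPLE_UNAMBIGUOUS = {"M": "ATG", "E": "GAA", "F": "TTC", "W": "TGG"}
EXAMPLE_PARTIAL = {"M": "ATG", "E": "GAR", "F": "TTY", "W": "TGG"}
```

With the first table, the peptide MEFW becomes ATGGAATTCTGG; with the second it becomes ATGGARTTYTGG, which can then be searched for potential sites.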
The `Search R.E. sites in DNA sequences' option allows the
user to:
i) search for all the potential R.E. sites that can be introduced
into a DNA sequence by limited site-directed silent/non-silent
mutagenesis, together with the number of mutations required to
introduce each site;
ii) search for the potential sites of a specific restriction
enzyme that can be introduced into a DNA sequence by
site-directed silent/non-silent mutagenesis, together with the
number of mutations required to introduce each site;
iii) translate the DNA sequence into an amino acid sequence;
iv) search for the pre-existing sites of all R.E.s in the DNA
sequence; and
v) search for the existing sites of a specific R.E. in the DNA
sequence.
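The search for sites that can be introduced by limited mutagenesis can be sketched in Python as follows. This is an illustration, not GMAP's method: it assumes the standard genetic code, a site written with unambiguous bases, a coding sequence whose reading frame starts at the first base, and hypothetical function names. For each window it counts the mismatches against the site and, if the count is within the limit, reports whether substituting the site leaves the translation unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate in frame 0, ignoring any incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def introducible_sites(dna, site, max_mutations=3):
    """List (start, mutations, is_silent) for sites needing 1..max_mutations changes."""
    original = translate(dna)
    results = []
    for start in range(len(dna) - len(site) + 1):
        window = dna[start:start + len(site)]
        mutations = sum(a != b for a, b in zip(window, site))
        if 0 < mutations <= max_mutations:
            mutated = dna[:start] + site + dna[start + len(site):]
            results.append((start, mutations, translate(mutated) == original))
    return results
```

For the sequence ATGGAGTTTTGG (Met-Glu-Phe-Trp) and the EcoRI site GAATTC, the sketch reports a silent two-base change at offset 3 and non-silent three-base changes at offsets 2 and 5. Pre-existing sites (zero mismatches) are deliberately excluded here, since the program lists them under a separate option.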
The `Output DNA/Amino acid/R.E./Codon usage table' option
allows the amino acid sequence, DNA sequence, restriction enzyme
data and codon preference table to be displayed, printed or saved
to a file. Besides the main options and sub-options, other
options are available that allow the user to output the results
in the desired format.
LICENCE:
This program remains the copyright property of the Institute of
Microbial Technology, Chandigarh, India, an institution of the
CSIR, Govt. of India. This program may be freely used by anybody
subject to the following conditions:
1. Neither the authors nor the Institute of Microbial Technology,
Chandigarh, assume any responsibility for any losses or
damage that may be caused by the use or misuse of the
accompanying software.
2. Neither the authors nor the Institute of Microbial Technology,
Chandigarh, give any warranty with regard to the software
being able to function on any computer.
3. The accompanying software may not be copied or distributed
with any modifications, and this document file MUST be
included with all copies.
4. No fee may be charged for the copying and/or distribution of
the accompanying software.
5. Users must agree to accept any risk as a condition of the
free use of the accompanying software.
Suggestions and bug reports will be greatly appreciated. Please
send them to:
G P S Raghava, Scientist
Computer Center
Institute of Microbial Technology,
Sector 39A, Chandigarh 160 014,
India.
Email address: raghava@imtech.ernet.in
or
Girish Sahni, Scientist
Section of Molecular Biology
Institute of Microbial Technology,
Sector 39A, Chandigarh 160 014,
India.
Email address: girish@imtech.ernet.in