Glossary
A
nitrogenous base, one member of the base pair AT (adenine-thymine).
Alternative
form of a genetic locus; a single allele for each locus is inherited from each
parent (e.g., at a locus for eye color the allele might result in blue or brown
eyes).
Variation
in alleles among members of the same species.
Different
ways of combining a gene's exons to make variants of the complete protein.
Any
of a class of 20 molecules that are combined to form proteins in living things.
The sequence of amino acids in a protein and hence protein function are
determined by the genetic code.
An
increase in the number of copies of a specific DNA fragment; can be in vivo or
in vitro.
Adding
pertinent information such as gene coded for, amino acid sequence, or other
commentary to the database entry of raw sequence of DNA bases.
Nucleic
acid that has a sequence exactly opposite to an mRNA molecule made by the body;
binds to the mRNA molecule to prevent a protein from being made.
Individual
primary recombinant clones (hosted in phage, cosmid, YAC, or other vector) that
are placed in two-dimensional arrays in microtiter dishes. Each primary clone
can be identified by the identity of the plate and the clone location (row and
column) on that plate. Arrayed libraries of clones can be used for many
applications, including screening for a specific gene or genomic region of
interest.
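The plate, row, and column addressing described above can be sketched in a few lines. This is only an illustration: the 96-well layout (rows A-H, columns 1-12) and the function name `clone_address` are assumptions, not something the glossary specifies.

```python
import string

# Assumed layout: a 96-well microtiter plate with 8 rows (A-H) and 12 columns.
ROWS, COLS = 8, 12


def clone_address(index: int) -> tuple:
    """Map a zero-based clone index to (plate number, row letter, column number)."""
    if index < 0:
        raise ValueError("clone index must be non-negative")
    plate, well = divmod(index, ROWS * COLS)  # which plate, and which well on it
    row, col = divmod(well, COLS)             # wells are filled row by row
    return plate + 1, string.ascii_uppercase[row], col + 1


print(clone_address(0))   # (1, 'A', 1)
print(clone_address(96))  # (2, 'A', 1)
```

A real library would follow whatever plate format and numbering scheme the laboratory actually uses.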
Putting
sequenced fragments of DNA into their correct chromosomal positions.
A
technique that uses X-ray film to visualize radioactively labeled molecules or
fragments of molecules; used in analyzing length and number of DNA fragments
after they are separated by gel electrophoresis.
A
cross between an animal that is heterozygous for alleles obtained from two
parental strains and a second animal from one of those parental strains. Also
used to describe the breeding protocol of an outcross followed by a backcross.
Bacterial artificial chromosome (BAC)
A
vector used to clone DNA fragments (100- to 300-kb insert size; average, 150
kb) in Escherichia coli cells. Based on the naturally occurring F-factor
plasmid found in the bacterium E. coli.
One
of the molecules that form DNA and RNA molecules.
Two
nitrogenous bases (adenine and thymine or guanine and cytosine) held together
by weak bonds. Two strands of DNA are held together in the shape of a double
helix by the bonds between base pairs.
The
order of nucleotide bases in a DNA molecule; determines structure of proteins
encoded by that DNA.
A
method, sometimes automated, for determining the base sequence.
The
science of managing and analyzing biological data using advanced computing
techniques. Especially important in analyzing genomic research data.
A
set of biological techniques developed through basic research and now applied
to research and product development. In particular, biotechnology refers to the
use by industry of recombinant DNA, cell fusion, and new bioprocessing techniques.
A
computer program that identifies homologous (similar) genes in different
organisms, such as rice, Arabidopsis, or soybean.
A
gene located in a chromosome region suspected of being involved in a disease.
Gel-filled
silica capillaries used to separate fragments for DNA sequencing. The small
diameter of the capillaries permits the application of higher electric fields,
providing high-speed, high-throughput separations that are significantly faster
than traditional slab gels.
A
collection of DNA sequences that code for genes. The sequences are generated in
the laboratory from mRNA sequences.
A
unit of measure of recombination frequency. One centimorgan is equal to a 1%
chance that a marker at one genetic locus will be separated from a marker at a
second locus due to crossing over in a single generation. In rice, one
centimorgan is equivalent, on average, to 270 kb.
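As a rough worked example of the two figures above, the sketch below estimates a genetic distance from offspring counts and converts it to an approximate physical distance. The function names are hypothetical. The 270 kb per cM figure is the rice average quoted in the entry, and the simple percentage estimate is only reasonable for closely linked markers.

```python
KB_PER_CM_RICE = 270  # average for rice, as quoted in the entry above


def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Estimate genetic distance in cM as the percentage of recombinant offspring."""
    return 100 * recombinants / total_offspring


def cm_to_kb(cm: float, kb_per_cm: float = KB_PER_CM_RICE) -> float:
    """Convert a genetic distance to an approximate physical distance in kb."""
    return cm * kb_per_cm


distance = map_distance_cm(12, 400)
print(distance)            # 3.0
print(cm_to_kb(distance))  # 810.0
```

The conversion is an average only; the actual ratio of physical to genetic distance varies along a chromosome.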
Circular
DNA found in the photosynthesizing organelle (chloroplast) of plants instead of
the cell nucleus where most genetic material is located.
The
self-replicating genetic structure of cells containing the cellular DNA that
bears in its nucleotide sequence the linear array of genes. In prokaryotes,
chromosomal DNA is circular, and the entire genome is carried on one
chromosome. Eukaryotic genomes consist of a number of chromosomes whose DNA is
associated with different kinds of proteins.
An
exact copy made of biological material such as a DNA segment (e.g., a gene or
other region), a whole cell, or a complete organism.
Using
specialized DNA technology to produce multiple, exact copies of a single gene
or other segment of DNA to obtain enough material for further study. This
process is referred to as cloning DNA. The resulting cloned (copied)
collections of DNA molecules are called clone libraries. A second type of
cloning exploits the natural process of cell division to make many copies of an
entire cell. The genetic makeup of these cloned cells, called a cell line, is
identical to the original cell. A third type of cloning produces complete,
genetically identical animals such as the famous Scottish sheep, Dolly.
DNA
molecule originating from a virus, a plasmid, or the cell of a higher organism
into which another DNA fragment of appropriate size can be integrated without
loss of the vector's capacity for self-replication; vectors introduce foreign
DNA into host cells, where the DNA can be reproduced in large quantities.
Examples are plasmids, cosmids, and yeast artificial chromosomes; vectors are
often recombinant molecules containing DNA sequences from several sources.
Study
of genetics of a newly sequenced genome by comparisons with model organisms
such as rice and Arabidopsis.
DNA
that is synthesized in the laboratory from a messenger RNA template.
Nucleic
acid base sequence that can form a double-stranded structure with another DNA
fragment by following base-pairing rules (A pairs with T and C with G). The
complementary sequence to GTAC, for example, is CATG.
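The base-pairing rule can be written out as a small lookup. This is a minimal sketch with hypothetical function names; it reproduces the GTAC example and also shows the reverse complement, which is the partner strand read in its own 5' to 3' direction.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction."""
    return complement(seq)[::-1]


print(complement("GTAC"))          # CATG
print(reverse_complement("GTAC"))  # GTAC
```

GTAC happens to equal its own reverse complement, a property shared by many restriction-enzyme recognition sites.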
Trait
that has a genetic component that does not follow strict Mendelian inheritance.
May involve the interaction of two or more genes or gene-environment
interactions.
In
genetics, the expectation that genetic material and the information gained from
testing that material will not be available without the donor's consent.
A
base sequence in a DNA molecule (or an amino acid sequence in a protein) that
has remained essentially unchanged throughout evolution.
Group
of cloned (copied) pieces of DNA representing overlapping regions of a
particular chromosome.
A
map depicting the relative order of a linked library of overlapping clones
representing a complete chromosomal segment.
Artificially
constructed cloning vector containing the cos site of phage lambda. Cosmids can
be packaged in lambda phage particles for infection into E. coli; this
permits cloning of larger DNA fragments (up to 45 kb) than can be introduced
into bacterial hosts in plasmid vectors.
The
breaking during meiosis of one maternal and one paternal chromosome, the
exchange of corresponding sections of DNA, and the rejoining of the
chromosomes. This process can result in an exchange of alleles between
chromosomes.
The
study of the physical appearance of chromosomes.
A
type of chromosome map whereby genes are located on the basis of cytological
findings obtained with the aid of chromosome mutations.
A
genetic characteristic in which the genes are found outside the nucleus, in
chloroplasts or mitochondria. Results in offspring inheriting genetic material
from only one parent.
A
nitrogenous base, one member of the base pair GC (guanine and cytosine) in DNA.
A
collection of databases, data tables, and mechanisms to access the data on a
single subject.
A
loss of part of the DNA from a chromosome; can lead to a disease or abnormality.
A
description of a specific chromosome that uses defined mutations (specific
deleted areas in the genome) as 'biochemical signposts,' or markers, for
specific areas.
A
full set of genetic material consisting of paired chromosomes, one from each
parental set. Most animal cells except the gametes have a diploid set of
chromosomes.
Alteration
of DNA at a specific site and its reinsertion into an organism to study any
effects of the change.
Alleles
carrying particular DNA sequences associated with the presence of disease.
The
molecule that encodes genetic information. DNA is a double-stranded molecule
held together by weak bonds between base pairs of nucleotides. The four
nucleotides in DNA contain the bases adenine (A), guanine (G), cytosine (C),
and thymine (T). In nature, base pairs form only between A and T and between G
and C; thus the base sequence of each single strand can be deduced from that of
its partner.
A
facility that stores DNA extracted from various organisms in whole or cloned
form.
The
use of existing DNA as a template for the synthesis of new DNA strands. In
eukaryotes, replication occurs in the cell nucleus.
The
relative order of base pairs, whether in a DNA fragment, gene, chromosome, or
an entire genome.
A
discrete portion of a protein with its own function. The combination of domains
in a single protein determines its overall function.
The
first set of sequences generated by the genome sequencing programme. While
incomplete, it offers a virtual road map to an estimated 95% of all genes.
Draft sequence data are mostly in the form of fragments about 10,000 base pairs long whose
approximate chromosomal locations are known.
A
method of separating large molecules (such as DNA fragments or proteins) from a
mixture of similar molecules. An electric current is passed through a medium
containing the mixture, and each kind of molecule travels through the medium at
a different rate, depending on its electrical charge and size. Agarose and
acrylamide gels are the media commonly used for electrophoresis of proteins and
nucleic acids.
A
process using high-voltage current to make cell membranes permeable to allow
the introduction of new DNA; commonly used in recombinant DNA technology.
Common
bacterium that has been studied intensively by geneticists because of its small
genome size, normal lack of pathogenicity, and ease of growth in the
laboratory.
DNA
originating outside an organism that has been introduced into the organism.
The
protein-coding DNA sequence of a gene.
An
enzyme that cleaves nucleotides sequentially from free ends of a linear nucleic
acid substrate.
A
short strand of DNA that is a part of a cDNA molecule and can act as identifier
of a gene. Used in locating and mapping genes.
Each
generation of offspring in a breeding program, designated F1, F2, etc.
In
genetics, the identification of multiple specific alleles on an organism’s DNA
to produce a unique identifier for that sample.
High-quality,
low error, gap-free DNA sequence of the genome.
Fluorescence in situ hybridization (FISH)
A
physical mapping approach that uses fluorescein tags to detect hybridization of
probes with metaphase chromosomes and with the less-condensed somatic
interphase chromatin.
The
complete order of bases in a gene. This order determines which protein a gene
will produce.
The
study of genes, their resulting proteins, and the role played by the proteins
in the body's biochemical processes.
Mature
male or female reproductive cell (sperm or ovum) with a haploid set of
chromosomes.
Many
DNA sequences carry long stretches of repeated G and C which often indicate a
gene-rich region.
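One simple way to look for such regions is to measure the fraction of G and C bases in a stretch of sequence. The sketch below is illustrative only; the function name is hypothetical, and any threshold for calling a region "GC-rich" would be an assumption that the glossary does not give.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


print(gc_fraction("ATGC"))  # 0.5
print(gc_fraction("GCGC"))  # 1.0
```

In practice the fraction is usually computed over a sliding window along the chromosome rather than over the whole sequence at once.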
The
fundamental physical and functional unit of heredity. A gene is an ordered
sequence of nucleotides located in a particular position on a particular
chromosome that encodes a specific functional product (i.e., a protein or RNA
molecule).
Repeated
copying of a piece of DNA; a characteristic of tumor cells.
Development
of cDNA microarrays from a large number of genes. Used to monitor and measure
changes in gene expression for each gene represented on the chip.
The
process by which a gene's coded information is converted into the structures
present and operating in the cell. Expressed genes include those that are
transcribed into mRNA and then translated into protein and those that are
transcribed into RNA but not translated into protein (e.g., transfer and
ribosomal RNAs).
Group
of closely related genes that make similar products.
Determination
of the relative positions of genes on a DNA molecule (chromosome or plasmid)
and of the distance, in linkage units or physical units, between them.
All
the variations of genes in a species.
Predictions
of possible genes made by a computer program based on how well a stretch of DNA
sequence matches known gene sequences.
The
biochemical material, either RNA or protein, resulting from expression of a
gene. The amount of gene product is used to measure how active a gene is;
abnormal amounts can be correlated with disease-causing alleles.
The
sequence of nucleotides, coded in triplets (codons) along the mRNA, that
determines the sequence of amino acids in protein synthesis. A gene's DNA
sequence can be used to predict the mRNA sequence, and the genetic code can in
turn be used to predict the amino acid sequence.
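The two prediction steps described above (DNA to mRNA, then mRNA to amino acids) can be sketched as follows. The codon table here is a deliberately tiny subset of the standard genetic code, chosen only to keep the example short, and the function names are hypothetical. The input is assumed to be the coding strand, so transcription amounts to replacing T with U.

```python
# Small subset of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_dna: str) -> str:
    """Predict the mRNA sequence from the coding strand of a gene."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in the subset
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


mrna = transcribe("ATGTTTGGCTAA")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```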
Altering
the genetic material of cells or organisms to enable them to make new
substances or perform new functions.
A
gene or other identifiable portion of DNA whose inheritance can be followed.
Difference
in DNA sequence among individuals, groups, or populations (e.g., genes for red
flower versus white flower).
Susceptibility
to a genetic disease. May or may not result in actual development of the
disease.
Testing
a group of plants (especially parental lines) to identify individuals at high
risk of having or passing on a specific genetic disorder.
All
the genetic material in the chromosomes of a particular organism; its size is
generally given as its total number of base pairs.
Research
and technology-development effort aimed at mapping and sequencing the genomes of
certain model organisms.
A
collection of clones made from a set of randomly generated overlapping DNA
fragments that represent the entire genome of an organism.
The
study of genes and their function.
The
genetic constitution of an organism, as distinguished from its physical
appearance (its phenotype).
A
nitrogenous base, one member of the base pair GC (guanine and cytosine) in DNA.
A
single set of chromosomes (half the full set of genetic material) present in
the egg and sperm cells of animals and in the egg and pollen cells of plants.
A
way of denoting the collective genotype of a number of closely linked loci on a
chromosome.
Having
only one copy of a particular gene. For example, in humans, males are hemizygous
for genes found on the Y chromosome.
The
presence of different alleles at one or more loci on homologous chromosomes.
DNA
sequence that is very similar across several different types of organisms.
A
fast method of determining the order of bases in DNA.
A
member of a chromosome pair in diploid organisms or a gene that has the same
origin and functions in two or more species.
Chromosome
containing the same linear gene sequences as another, each derived from one
parent.
Swapping
of DNA fragments between paired chromosomes.
Similarity
in DNA or protein sequences between individuals of the same species or among different
species.
An
organism that has two identical alleles of a gene.
Use
of a DNA or RNA probe to detect the presence of the complementary DNA sequence
in cloned bacterial or cultured eukaryotic cells.
Studies
performed outside a living organism, such as in a laboratory.
Studies
carried out in living organisms.
During
meiosis each of the two copies of a gene is distributed to the germ cells
independently of the distribution of other genes.
DNA
sequence that interrupts the protein-coding sequence of a gene; an intron is
transcribed into RNA but is cut out of the message before it is translated into
protein.
An
enzyme performing the same function as another enzyme but having a different
set of amino acids. The two enzymes may function at different speeds.
Stretches
of DNA that do not code for genes; most of the genome consists of so-called
junk DNA, which may have regulatory and other functions. Also called non-coding
DNA.
A
photomicrograph of an individual's chromosomes arranged in a standard format
showing the number, size, and shape of each chromosome type; used in
low-resolution physical mapping to correlate gross chromosomal abnormalities
with the characteristics of specific diseases.
Unit
of length for DNA fragments equal to 1000 nucleotides.
Deactivation
of specific genes; used in laboratory organisms to study gene function.
An
unordered collection of clones (i.e., cloned DNA from a particular organism)
whose relationship to each other can be established by physical mapping.
The
proximity of two or more markers (e.g., genes, RFLP markers) on a chromosome;
the closer the markers, the lower the probability that they will be separated
during DNA repair or replication processes (binary fission in prokaryotes,
mitosis or meiosis in eukaryotes), and hence the greater the probability that
they will be inherited together.
Where
alleles occur together more often than can be accounted for by chance.
Indicates that the two alleles are physically close on the DNA strand.
A
map of the relative positions of genetic loci on a chromosome, determined on
the basis of how often the loci are inherited together. Distance is measured in
centimorgans (cM).
The
position on a chromosome of a gene or other chromosome marker; also, the DNA at
that position. The use of locus is sometimes restricted to mean expressed DNA
regions.
The
group of related organisms used in constructing a genetic map.
Unit
of length for DNA fragments equal to 1 million nucleotides and roughly equal to
1 cM.
A
process by which genetic traits are passed from parents to offspring. Named for
Gregor Mendel, who first studied and recognized the existence of genes and this
method of inheritance.
Messenger RNA (mRNA)
RNA
that serves as a template for protein synthesis.
Sets
of miniaturized chemical reaction areas that may also be used to test DNA
fragments, antibodies, or proteins.
The
genetic material found in mitochondria, the organelles that generate energy for
the cell. Not inherited in the same fashion as nuclear DNA.
The
process of nuclear division in cells that produces daughter cells that are
genetically identical to each other and to the parent cell.
The
use of statistical analysis, computer analysis, or model organisms to predict
outcomes of research.
The
study of the structure, function, and makeup of biologically important
molecules.
The
development of transgenics to produce proteins for pharmaceutical and
industrial use.
The
study of macromolecules important in biological inheritance.
A
laboratory approach that performs multiple sets of reactions in parallel
(simultaneously), greatly increasing speed and throughput.
Any
heritable change in DNA sequence.
A
gel-based laboratory procedure that locates mRNA sequences on a gel that are
complementary to a piece of DNA used as a probe.
A
large molecule composed of nucleotide subunits.
A
subunit of DNA or RNA consisting of a nitrogenous base (adenine, guanine,
thymine, or cytosine in DNA; adenine, guanine, uracil, or cytosine in RNA), a
phosphate molecule, and a sugar molecule (deoxyribose in DNA and ribose in
RNA). Thousands of nucleotides are linked to form a DNA or RNA molecule.
The
cellular organelle in eukaryotes that contains most of the genetic material.
A
phenotypic trait produced by two or more genes working together.
Oligonucleotide
A
molecule usually composed of 25 or fewer nucleotides; used as a DNA synthesis
primer.
The
sequence of DNA or RNA located between the start-code sequence (initiation
codon) and the stop-code sequence (termination codon).
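A naive search for such a region might look like the sketch below. It scans one strand in the forward direction only, uses ATG as the initiation codon and TAA, TAG, and TGA as termination codons, and returns the region with both codons included. The function name and these simplifications are assumptions made for illustration; real gene finders also check the other reading frames and the opposite strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(dna: str):
    """Return the first ATG...stop region read in frame, or None if there is none."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through codons in the reading frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)  # try the next candidate start codon
    return None


print(find_orf("CCATGAAATTTTAGGG"))  # ATGAAATTTTAG
```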
A
set of genes transcribed under the control of an operator gene.
P1-derived artificial chromosome (PAC)
One
type of vector used to clone DNA fragments (100- to 300-kb insert size;
average, 150 kb) in Escherichia coli cells. Based on the genome of
bacteriophage P1 (a virus).
A
virus for which the natural host is a bacterial cell.
A
trait that is not caused by inheritance of a gene but appears to be identical to a
genetic trait.
A
map of the locations of identifiable landmarks on DNA (e.g., restriction-enzyme
cutting sites, genes), regardless of inheritance. Distance is measured in base
pairs; the highest-resolution map is the complete nucleotide sequence of the
chromosomes.
Autonomously
replicating extra-chromosomal circular DNA molecules, distinct from the normal
bacterial genome and nonessential for cell survival under nonselective
conditions. Some plasmids are capable of integrating into the host genome. A
number of artificially constructed plasmids are used as cloning vectors.
Polymerase chain reaction (PCR)
A
method for amplifying a DNA base sequence using a heat-stable polymerase and
two 20-base primers, one complementary to the (+) strand at one end of the
sequence to be amplified and one complementary to the (-) strand at the other
end. Because the newly synthesized DNA strands can subsequently serve as
additional templates for the same primer sequences, successive rounds of primer
annealing, strand elongation, and dissociation produce rapid and highly
specific amplification of the desired sequence. PCR also can be used to detect
the existence of the defined sequence in a DNA sample.
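Because each new strand becomes a template in the next round, the number of copies grows roughly geometrically with the number of cycles. The sketch below assumes an idealized reaction in which every template is copied in every cycle; real reactions are less efficient and eventually plateau. The function name is hypothetical.

```python
def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies of the target after a number of cycles, assuming perfect doubling."""
    return initial_copies * 2 ** cycles


print(ideal_copies(1, 30))  # 1073741824
```

Under this assumption, a single starting molecule yields about a billion copies after 30 cycles.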
Enzyme
that catalyzes the synthesis of nucleic acids on preexisting nucleic acid
templates, assembling RNA from ribonucleotides or DNA from
deoxyribonucleotides.
Difference
in DNA sequence among individuals that may underlie differences in health.
Genetic variations occurring in more than 1% of a population would be
considered useful polymorphisms for genetic linkage analysis.
A
protein or part of a protein made of a chain of amino acids joined by peptide
bonds.
The
study of variation in genes among one or more groups of individuals.
A
technique used to identify genes, usually those that are associated with
diseases, based on their location on a chromosome.
Short
pre-existing polynucleotide chain to which new deoxyribonucleotides can be
added by DNA polymerase.
Single-stranded
DNA or RNA molecules of specific base sequence, labeled either radioactively or
immunologically, that are used to detect the complementary base sequence by
hybridization.
A
DNA site to which RNA polymerase will bind and initiate transcription.
Proteins
expressed by a cell or organ at a particular time and under specific
conditions.
The
study of the full set of proteins encoded by a genome.
A
sequence of DNA similar to a gene but nonfunctional; probably the remnant of a
once-functional gene that accumulated mutations.
A
nitrogen-containing, double-ring, basic compound that occurs in nucleic acids.
The purines in DNA and RNA are adenine and guanine.
A
nitrogen-containing, single-ring, basic compound that occurs in nucleic acids.
The pyrimidines in DNA are cytosine and thymine; in RNA, cytosine and uracil.
Clone
containing recombinant DNA molecules.
A
combination of DNA molecules of different origin that are joined using
recombinant DNA technologies.
Procedure
used to join together DNA segments in a cell-free system (an environment
outside a cell or organism). Under appropriate conditions, a recombinant DNA
molecule can enter a cell and replicate there, either autonomously or after it
has become integrated into a cellular chromosome.
The
process by which progeny derive a combination of genes different from that of
either parent. In higher organisms, this can occur by crossing over.
A
DNA base sequence that controls gene expression.
Sequences
of varying lengths that occur in multiple copies in the genome.
Degree
of molecular detail on a physical map of DNA, ranging from low to high.
Restriction
enzyme, endonuclease
A
protein that recognizes specific, short nucleotide sequences and cuts DNA at
those sites. Bacteria contain over 400 such enzymes that recognize and cut more
than 100 different DNA sequences.
Restriction fragment length polymorphism (RFLP)
Variation
between individuals in DNA fragment sizes cut by specific restriction enzymes;
polymorphic sequences that result in RFLPs are used as markers on both physical
maps and genetic linkage maps. RFLPs usually are caused by mutation at a
cutting site.
Restriction-enzyme cutting site
A
specific nucleotide sequence of DNA at which a particular restriction enzyme
cuts the DNA. Some sites occur frequently in DNA (e.g., every several hundred
base pairs); others much less frequently (rare-cutter; e.g., every 10,000 base
pairs).
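Locating cutting sites in a known sequence is a plain substring search, sketched below with hypothetical function names. The example uses GAATTC, the recognition sequence of the enzyme EcoRI. The spacing estimate assumes all four bases are equally frequent and independent, which real genomes only approximate, so it should be read as a rule of thumb for why shorter sites occur more often.

```python
def find_sites(dna: str, site: str) -> list:
    """Return the zero-based start positions of every occurrence of site in dna."""
    dna, site = dna.upper(), site.upper()
    positions = []
    i = dna.find(site)
    while i != -1:
        positions.append(i)
        i = dna.find(site, i + 1)  # continue just past the last hit
    return positions


def expected_spacing(site: str) -> int:
    """Average distance between sites, assuming equal and independent base frequencies."""
    return 4 ** len(site)


print(find_sites("AAGAATTCTTGAATTCA", "GAATTC"))  # [2, 10]
print(expected_spacing("GAATTC"))                 # 4096
```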
An
enzyme used by retroviruses to form a complementary DNA sequence (cDNA) from
their RNA. The resulting DNA is then inserted into the chromosome of the host
cell.
A
class of RNA found in the ribosomes of cells.
A
chemical found in the nucleus and cytoplasm of cells; it plays an important
role in protein synthesis and other chemical activities of the cell. The
structure of RNA is similar to that of DNA. There are several classes of RNA
molecules, including messenger RNA, transfer RNA, ribosomal RNA, and other
small RNAs, each serving a different purpose.
A
widely used method of determining the order of bases in DNA.
In
genomic mapping, a series of contigs that are in the right order but not
necessarily connected in one continuous stretch of sequence.
A
process whereby the order of multiple sequenced DNA fragments is determined.
Short
(200 to 500 base pairs) DNA sequence that has a single occurrence in the genome
and whose location and base sequence are known. Detectable by polymerase chain
reaction, STSs are useful for localizing and orienting the mapping and sequence
data reported from many different laboratories and serve as landmarks on the
developing physical map of the genome. Expressed sequence tags (ESTs) are STSs
derived from cDNAs.
Determination
of the order of nucleotides (base sequences) in a DNA or RNA molecule or the
order of amino acids in a protein.
The
instrumentation and procedures used to determine the order of nucleotides in
DNA.
Sequencing
method that involves randomly sequenced cloned pieces of the genome, with no
foreknowledge of where the piece originally came from. This can be contrasted
with "directed" strategies, in which pieces of DNA from known
chromosomal locations are sequenced. Because there are advantages to both strategies,
researchers use both random (or shotgun) and directed strategies in combination
to sequence the genome.
Single nucleotide polymorphism (SNP)
DNA
sequence variations that occur when a single nucleotide (A, T, C, or G) in the
genome sequence is altered.
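Given two sequences that are already aligned and of equal length, single-base differences can be listed directly. This is a minimal sketch with a hypothetical function name. It does not handle insertions or deletions, and a difference between two individuals only counts as a polymorphism in the population sense if it recurs across many individuals.

```python
def single_base_differences(reference: str, sample: str) -> list:
    """List (position, reference base, sample base) wherever two aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != sample_base
    ]


print(single_base_differences("ACGTAC", "ACATAC"))  # [(2, 'G', 'A')]
```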
Transfer
by absorption of DNA fragments separated in electrophoretic gels to membrane
filters for detection of specific base sequences by radio-labeled complementary
probes.
The
effort to determine the 3D structures of large numbers of proteins using both
experimental techniques and computer simulation.
Genes
occurring in the same order on chromosomes of different species.
Multiple
copies of the same base sequence on a chromosome; used as markers in physical
mapping.
A
nitrogenous base, one member of the base pair AT (adenine-thymine).
The
synthesis of an RNA copy from a sequence of DNA (a gene); the first step in
gene expression.
A
protein that binds to regulatory regions and helps control gene expression.
The
full complement of activated genes, mRNAs, or transcripts in a particular
tissue at a particular time.
A
class of RNA having structures with triplet nucleotide sequences that are
complementary to the triplet nucleotide coding sequences of mRNA. The role of
tRNAs in protein synthesis is to bond with amino acids and transfer them to the
ribosomes, where proteins are assembled according to the genetic code carried
by mRNA.
A
process by which the genetic material carried by an individual cell is altered
by incorporation of exogenous DNA into its genome.
An
experimentally produced organism in which DNA has been artificially introduced
and incorporated into the organism's germ line.
The
process in which the genetic code carried by mRNA directs the synthesis of
proteins from amino acids.
A
class of DNA sequences that can move from one chromosomal site to another.
A
nitrogenous base normally found in RNA but not DNA; uracil is capable of
forming a base pair with adenine.
A
technique used to identify and locate proteins based on their ability to bind
to specific antibodies.
The
form of an organism that occurs most frequently in nature.
Yeast artificial chromosome (YAC)
Constructed
from yeast DNA, it is a vector used to clone large DNA fragments.