Modeling of Biological Systems
A Workshop at the National Science Foundation
March 14 and 15, 1996
Peter Kollman, University of California, San Francisco, Chair
Sponsored by the National Science Foundation
Levin, Princeton University, Co-Chair
Alberto Apostolico, University of
Marjorie Asmussen, University of Georgia
Bruce L. Bush, Merck
Carlos Castillo-Chavez, Cornell University
Eisenberg, Rush Medical College
Bard Ermentrout, University of
Christopher Fields, Santa Fe Institute
Alan Hastings, University of California,
Michael Hines, Yale University
Barry Honig, Columbia
Lynn Jelinski, Cornell University
Nancy Kopell, Boston
Don Ludwig, University of British Columbia
University of Washington
George Oster, University of California,
Alan Perelson, Los Alamos National Labs
Charles Peskin, Courant
Institute of Mathematical Sciences
Greg Petsko, Brandeis University
Rinzel, National Institutes of Health
Robert Silver, Marine Biological
Sylvia Spengler, Lawrence Berkeley Labs
Florida State University
Carla Wofsy, University of New Mexico
(MCB 96-29868 to
the University of California, San Francisco)
TABLE OF CONTENTS
III. Molecular and Cellular Biology
IV. Organismal Biology
V. Ecology and Evolution
VI. Cross-Cutting Issues
VII. Educational Issues
The common theme of this report is the tremendous potential of mathematical
and computational approaches in leading to fundamental insights and important
practical benefits in research on biological systems. Mathematical and
computational approaches have long been appreciated in physics and in the last
twenty years have played an ever-increasing role in chemistry. In our opinion,
they are just coming into their own in biology.
The goals of these
mathematical and computational approaches are to elucidate mechanisms for
seemingly disparate phenomena. For example, how does the atomic level structure of
an enzyme lead to its function, enzyme catalysis? To understand this
structure/function relationship requires fundamental quantum mechanical and
molecular dynamical calculations, but successful simulations may lead to
understanding of disease and drug therapy. Knowing the three dimensional
structure of the motor protein kinesin may lead to understanding of muscle
action as well as other cellular motors. Simulations of the embryonic and fetal
heart at different stages of development are helping to elucidate the role of
fluid forces in shaping the developing heart. The structure and dynamics of
earth's ecosystems are critical elements in how they function and
mathematical/computational methods play a critical role in understanding their
In these examples and the many others in the body of this
report (sections III-V), mathematical/computational methods, based either on
fundamental physical laws (e.g. quantum mechanics), empirical data, or a
combination of both, are providing a key element in biological research. These
methods can provide hypotheses that let one go beyond the empirical data and can
be constantly tested for their range of validity.
Our report also
highlights (section VI) computational issues that are common across biology,
from the molecular to the ecosystem. Computers are getting more powerful at a
prodigious rate and, in parallel, the potential to apply computational methods to
ever more complex systems is also increasing. Thus, it is essential that the
next generation of biological scientists have a strong training in mathematics
and computation from kindergarten through graduate school. We discuss
educational issues in section VII of our report.
A purpose of this
report is to increase the awareness among biological scientists of the
ever-increasing utility of mathematical and computational approaches in biology.
Sometimes newly emerging areas and interdisciplinary areas are in danger of
falling between the cracks at funding agencies. Specifically, we hope that this
report will raise the level of awareness at the National Science Foundation and
other funding agencies on nurturing computational and mathematical research in
the biological sciences.
Characterization of biological
systems has reached an unparalleled level of detail. To organize this detail and
arrive at a better fundamental understanding of life processes, it is imperative
that powerful conceptual tools from mathematics and the physical sciences be
applied to the frontier problems in biology. Modeling of biological systems is
evolving into an important partner of experimental work. All facets of biology
(environmental, organismic, cellular and molecular) are becoming more
accessible to chemical, physical and mathematical approaches. This area of
opportunity was highlighted in a 1992 report, supported by the National Science
Foundation, entitled "Mathematics and Biology, the Interface, Challenges and
Opportunities" (MBICO).
A workshop was held at the National Science
Foundation (NSF) on March 14 and 15, 1996 that built on the findings of MBICO in
order to critically evaluate its findings and to suggest which areas were the
most promising as foci for further research. This workshop brought together 25
scientists, with expertise ranging from the molecular to the cellular to the
organism to the ecosystem level, all of whom have an interest in applications of
mathematical/computational approaches to biological systems. The goal of the
workshop was to identify important research areas where
theoretical/computational studies could be of most use in giving insight and in
aiding related experimental work. This is done below. Because of the small size
of our group, the limited time we had, and our not unlimited vision, one must
view the areas of research opportunities presented below as representative, not
exhaustive. We hope that our report can provide some guidance and a historical
marker as to the state of the art in Modeling of Biological Systems, ca. 1996.
Our report is divided into five sections. We follow the organization of
the NSF in dividing our description of research opportunities into three areas:
Molecular and Cellular Biology, Organismal Biology and Ecology and Evolution.
These three sections are followed by a section focussing on issues that cross
the boundaries between these areas and a final section on educational issues.
III. MOLECULAR AND CELLULAR BIOLOGY
A central organizing theme in
Molecular and Cellular Biology is the relationship between structure of
molecules and high level complexes of molecules and their function, both in
normal and aberrant biological contexts. The connection between structure and
function was most clearly illustrated in the paper that began Molecular Biology,
the elucidation of the structure of DNA by Watson and Crick.
This structure immediately illustrated how DNA can replicate and retain the original
information stored in it. Thus, the structure showed how this molecule
functions. But this example also shows the important role of mathematics,
chemistry and physics in elucidating structure/function relationships in
biology. Both the information contained in DNA duplexes and their higher order
structures have been usefully analyzed by mathematics, as the sections below on
the GENOME and MOLECULAR HISTOLOGY illustrate, and important questions have been
answered and many still remain unanswered in these areas.
Developments in physics and chemistry have played fundamental roles in enabling
structure determination of the essential molecules in biology - proteins,
nucleic acids, membranes and saccharides - and in that fashion, helping one to
understand their function. Some aspects of these efforts are described below in
the sections on PROTEIN STRUCTURE and NUCLEIC ACIDS. The use of the simulation
methodologies first developed in the physics and chemistry communities to
simulate molecules of biological interest is described in SIMULATIONS. Evolution
has occurred on a molecular as well as a macroscopic scale and some of the
molecules and their properties that have evolved are quite astonishing. The
section on BIOINSPIRED MATERIALS points out the possibilities inherent in making
use of some of the materials that have evolved in the process of molecular
evolution.
Although much progress has been made in understanding
structures of molecules of biological interest and using this to infer function,
a tremendous amount remains to be done. Some of the key questions include: What
is the structure of the DNA in the nucleus and how does this structure govern
DNA transcription? Given the DNA sequence, what determines the RNA and protein
structures that the DNA codes for? Given the protein structure, what is its
function? How did this function evolve and is it optimized? How can one use this
function to design pharmaceuticals that will have a real impact on disease
without upsetting the rest of the delicately balanced biological system? What
can we learn from other organisms, some that grow under extreme conditions of
temperature and pressure, about the nature and limits of living cells and the
molecules that make them up?
The above are just some of the key
questions, but it is clear from their nature that mathematical and
physical/chemical methods will be essential in answering these questions. These
methods provide the tools and language of molecular structure from the smallest
to the largest molecules and the fundamental laws to explain how molecules
interact and form their three dimensional shape. It is this three dimensional
shape which determines the molecular function. We have reached an incredibly
exciting time in the determination of protein structure, with over 200 different
types of globular protein structures known and an estimate of the order of 10^3
expected to exist in all of biology. Thus, we may soon have examples of every
type of globular protein structure, as well as insight into the nature of the
gene which determines it.
It is clear that the nature of biological
signaling pathways is very complex and involves many feedback loops and
fail-safe mechanisms. The tools of mathematics are essential to understanding these.
These signaling pathways are just one example where there is a connection
between the material presented in Molecular and Cellular and in Organismal
Biology. How do these molecular signals ultimately get transmitted into neural
signals and how can we understand possible defects at every level of these
pathways: are defects due to mutations in the proteins, subtle changes in
concentration of normal molecules or some external influence? These are exciting
and extremely important questions that involve understanding the connections
from the molecular to the cellular to the organismal level.
GENOME
In the six years since the MBICO report, genomic sequence
information has continued its exponential growth. Sequencing technology is being
applied directly to sequence diversity analysis and gene expression analysis via
high throughput, chip-based, automated assay systems. This influx has changed
both the questions that are asked, as well as the range of the interactions
For example, high throughput expression data are now both
tissue-specific and specific to stages of development. Over 300,000 human
expressed sequence tags are now available in public databases, representing at
least 40,000 human genes. Moreover within the next 5 years, as many as 50
complete genomes will be sequenced. Indeed the complete genomes of a number of
simple organisms have already been sequenced (see e.g. Fleischman, 1995), the
sequence of the yeast genome has recently been published (see e.g. Williams,
1996), and C. elegans is reported to be a year or two away. No one is sure
how best to exploit genomic data but it is clear that there will soon be an
explosion of biological information on an unprecedented scale.
It has become increasingly important to carry out comparisons of entire genomes rather
than just single genes, with a concomitant expansion in the time to compute.
Multiple comparisons remain even more problematic. A similar expansion of
queries from local regions of interest (say 50,000 bp) to long range patterns of
sequence or expression is now necessary, with syntenic regions on the order of
25 Mb considered a reasonable length for consideration.
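
The growth in compute time noted above can be made concrete with a minimal sketch of global pairwise alignment by dynamic programming (the Needleman-Wunsch recurrence). The scoring values (match, mismatch, gap) are illustrative assumptions and are not taken from this report. The point is only that the number of table cells, and hence the work, grows as the product of the two sequence lengths.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b.

    Uses the standard dynamic-programming recurrence, keeping only two
    rows of the table in memory. Scoring values are illustrative.
    """
    prev = [j * gap for j in range(len(b) + 1)]  # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]


def table_cells(len_a, len_b):
    """Number of recurrence evaluations needed for two sequences."""
    return len_a * len_b


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTCA"))
    # Two 50,000 bp regions versus two 25 Mb regions:
    print(table_cells(50_000, 50_000), table_cells(25_000_000, 25_000_000))
```

Under these assumptions, moving from 50,000 bp queries to 25 Mb regions multiplies the table size by a factor of 250,000, which is why exhaustive comparison at that scale calls for heuristics or different algorithms.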
Biochemical research is producing exponentially-growing data sets. In addition
to the examples cited above of DNA sequences (currently doubling about every 6
months) and gene expression data (chips supporting 1000s of assays per day),
combinatorial library screens (10,000s of compounds against 1000s of targets)
are producing vast quantities of systematic data on function. Technological
developments will increase these data acquisition rates by an order of magnitude
or more in the next few years.
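
As a rough illustration of what the quoted doubling time implies, the following sketch converts a doubling time into a growth factor and into the time needed for a given increase. It assumes simple, uninterrupted exponential growth, which is an idealization; the function names are hypothetical.

```python
import math


def growth_factor(doubling_time_years, elapsed_years):
    """Factor by which a quantity grows under steady exponential doubling."""
    return 2.0 ** (elapsed_years / doubling_time_years)


def years_to_grow(factor, doubling_time_years):
    """Time needed to grow by the given factor at a fixed doubling time."""
    return doubling_time_years * math.log2(factor)


if __name__ == "__main__":
    # Doubling every 6 months, sustained for 5 years.
    print(growth_factor(0.5, 5))
    # Time for one order of magnitude of growth at that rate.
    print(round(years_to_grow(10, 0.5), 2))
```

With a six-month doubling time, an order-of-magnitude increase takes well under two years, so storage and query systems sized for today's data are quickly outgrown.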
Significant work is required to develop
data management systems to make these data not just retrievable, but usable as
input to computations and amenable to complex, ad hoc queries across multiple
data types. Significant work is also required on techniques for integrating data
obtained for multiple observables, at different scales, with different
uncertainties (data fusion) and for formulating meaningful queries against such
heterogeneous data (data mining).
For example, it should be possible in
the future to ask what differences to expect in the kinetic efficiencies of a
signal-transduction pathway across multiple individuals, given the differences
in the sequences of the proteins involved in the pathway. Answering such queries
will require improvements in data models, heterogeneous database management
systems, multivariate correlation analysis, molecular structure prediction,
constrained-network modeling, and uncertainty management.
PROTEIN STRUCTURE AND FUNCTION
As the amount of genomic data
grows, three dimensional structure will provide an increasingly important means
for exploiting and organizing this information. Structure provides a unique yet
largely unexplored vehicle for deducing gene function from sequence data.
Structure also links genomic information to biological assays and serves as a
basis for rational development of bioactive compounds, including drugs and
Research opportunities in this area can be divided into four
distinct categories: experimental structure determination, structure prediction,
structure exploitation of globular proteins, and modeling of membrane proteins,
where the determination of high resolution structures is much more difficult.
Structure Determination
During the past decade, advances in
protein crystal growth, diffraction data collection and experimental phase
determination have led to an explosion of structural information (Ringe and
Petsko, 1996). Despite this rapid growth, the demand for new structural data
remains high. Areas where mathematical and computational approaches are still
needed to increase the throughput further include direct phase determination,
improved structure solution by molecular replacement, and automated electron
density map interpretation.
NMR: Solution NMR is now providing
high resolution protein, DNA and RNA structures that rival those from x-ray
crystallography. The limit to the size of molecules whose structures can be
solved by NMR is dictated by chemical complexity, solubility, redundancy and
molecular tumbling time. For proteins, the upper size limit is currently
about 200 kDa. Mathematical and modeling issues that pertain to NMR structures
include developing methods to account for the effects of molecular dynamics, to
specify the reliability of the structures, and to specify regions of molecular
disorder and/or under-determined constraints. Data reduction for NMR should be
automated. Current research along this line includes developing data management
tools for assigning resonances and keeping track of cross-peaks from
multi-dimensional experiments (Johnson and Blevins, 1994).
- Direct phasing: The phases of diffracted x-rays cannot be observed; they
must be deduced experimentally by indirect methods. Despite recent advances in
experimental phase determination by techniques such as MAD phasing (Leahy
et al, 1992), this step is often a bottleneck in structure
solution. Direct solution of the phase problem for macromolecular crystal
structures would revolutionize structural biology.
- Improved molecular replacement methods: As the database of solved
structures grows, molecular replacement methods, in which a homologous model
of the unknown protein structure is used to phase the measured diffraction
pattern (Arnold and Rossmann, 1992), will become increasingly important. This
method often fails, especially when sequence identity between the unknown
protein and the homologous structure is low. Better approaches to molecular
replacement are needed as are better methods for model building of homologous
- Automated electron density map interpretation: Once the measured
diffraction amplitudes have been assigned phases, an electron density map is
calculated by Fourier summation. This map must then be interpreted in terms of
an atomic model. Present computer graphics methods for fitting models to
electron density are tedious, labor intensive and not always accurate.
Computational approaches to automating a large part of this process are thus
needed. These can take several forms: recognition of overall protein folding
types from low to medium resolution maps (with impact on electron microscopy
and electron tomography as well); automated identification of secondary
structure; automated chain tracing, and alignment of the amino acid sequence
with the electron density.
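
The role of phases in the Fourier summation described above can be illustrated with a one-dimensional toy model. This is a minimal sketch, not a crystallographic program: the "density" is an invented eight-point profile, and the structure factors are a plain discrete Fourier transform. It shows that amplitudes with the correct phases recover the density, while the same amplitudes with all phases set to zero give a different map.

```python
import cmath
import math


def structure_factors(density):
    """Discrete Fourier transform of a 1-D density sampled on a grid."""
    n = len(density)
    return [sum(rho * cmath.exp(-2j * math.pi * h * x / n)
                for x, rho in enumerate(density))
            for h in range(n)]


def fourier_synthesis(amplitudes, phases):
    """Rebuild the density from amplitudes |F(h)| and phases (radians)."""
    n = len(amplitudes)
    return [sum(a * cmath.exp(1j * p) * cmath.exp(2j * math.pi * h * x / n)
                for h, (a, p) in enumerate(zip(amplitudes, phases))).real / n
            for x in range(n)]


# Toy "unit cell" with two atoms of different weight (assumed values).
density = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 2.0, 0.0]
F = structure_factors(density)
amplitudes = [abs(f) for f in F]          # what a diffraction experiment measures
phases = [cmath.phase(f) for f in F]      # what it does not measure

recovered = fourier_synthesis(amplitudes, phases)
zero_phase_map = fourier_synthesis(amplitudes, [0.0] * len(F))

if __name__ == "__main__":
    print([round(v, 3) for v in recovered])
    print([round(v, 3) for v in zero_phase_map])
```

In this toy case the zero-phase map places its largest peak at the origin rather than at either atom, which is the one-dimensional analogue of why measured amplitudes alone do not yield an interpretable map.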
Structure prediction
The most effective methods of structure
prediction currently available involve constructing models of proteins with
unknown structures based on templates derived from protein structures that have
been determined (see e.g. Bowie et al, 1991). There has been remarkable
progress in the development of these "fold recognition" methods in the past few
years and they offer new opportunities in structure prediction that simply did
not exist a few years ago (see e.g. the November 1995 issue of Proteins).
Fold recognition methods can be used to predict the
structures of proteins that have not yet been determined experimentally and to
find homology relationships between proteins that cannot be detected with
traditional sequence alignment methods. The challenges that now arise offer
research opportunities in a number of areas. These include the integration of
structural information in sequence alignment methods, the development of
improved scoring functions for the association of a given sequence with a given
structure (see e.g. Bryant and Lawrence, 1993), and the identification of
folding templates that focus on key structural elements to be matched to
sequence fragments (Orengo et al, 1995). These problems will all require
the development of new computational methods that allow the analysis and
integration of large quantities of structural and sequence data and new
simplified physical models that are designed to the requirements of this
Once an overall structural template has been derived,
there is a need for methods to predict three dimensional structure at the atomic
level. There has been significant progress in the past few years in the building
of side chain conformations onto backbone templates (see e.g. Lee and Subbiah,
1991) but faster and more accurate solutions to this problem would be extremely
useful. Assuming the conserved structural framework regions are known, there is
also a need for new methods which model the structures of loops onto fixed
structural framework regions (see e.g. Levitt, 1993), a problem which is of
unique importance for membrane proteins. These can benefit from fast
minimization and conformational search procedures and from improved physical
models which relate structure to free energy (see e.g. Smith and Honig, 1994).
Structure exploitation
The growing body of structural information
provides a new way of organizing biological data, with applications including
the prediction of function given a structure, the discovery of new principles of
protein-protein interactions, and the discovery of new evolutionary
relationships that were not evident from sequence alone. Structure determination
is usually done to address fundamental problems in cell biology, biochemistry or
pharmacology. The specific questions raised by a structure include: where on the
protein surface are the binding sites? What are the chemical groups that prefer
to bind to these sites? How do the protein and ligand structures change in
response to binding? What are the roles of protein and cofactor groups in
catalysis? How do protein dynamic properties influence protein function?
The construction of a new class of protein structure/function databases
offers a possible approach to these problems. For example, the characterization
of different protein binding sites in terms of physical and geometric properties
will be useful in predicting the function of new proteins whose structures have
been determined and, more generally, provides a new way of organizing and
interpreting biological data. This area offers research opportunities in
problems including the construction of new methods to represent three
dimensional objects and their incorporation into databases, the merging of these
databases with sequence and function databases, and the development of new
physical models to characterize functionally active regions in proteins.
Structure-based drug design requires locating all usable binding sites
followed by the design of small molecules that bind tightly and specifically to
them (Guida, 1994). Existing computational methods often fail because they do
not adequately account for solvent effects (see e.g. Eisenberg and McLachlan,
1986) or for the possibility of conformational adjustment (Kearsley et
al, 1994). Better procedures are urgently needed.
Studies of enzyme
catalysis ultimately require simulation of entire reaction pathways including
all bond-breaking and bond-making steps as well as the random motion of the
enzyme-substrate system. Existing methods of combining quantum mechanical and
molecular mechanical potential functions to carry out such simulations are still
rather inaccurate. This is particularly true for the interactions of metal ions
and clusters which are found in a high percentage of enzymes. Improved
mathematical and computational methods are needed in all of these areas, which
are the subject of much active research (Gao, 1996).
One new experimental
area that is certain to have major impact on the exploitation of structural
information is combinatorial chemistry (Gordon et al, 1994). New
techniques for high-speed parallel synthesis of novel organic compounds are
generating libraries of literally hundreds of thousands of molecules, many of
which bind to important biological targets. Methods must be developed for
organizing, correlating and interpreting the plethora of structure/activity data
produced by screening such libraries. The union of combinatorial chemistry and
structural biology offers the possibility of deducing the rules for molecular
recognition, which may ultimately allow us to build accurate models of
multiprotein complexes from the structures of their components. The merging of
small molecule and structural databases offers unique and important challenges
in this regard.
Study of membrane
proteins presents special challenges, but also promises to yield exciting and
important information. Greater understanding of membrane protein structure and
function will enhance dramatically our understanding of basic biochemical
processes such as signal transduction, and make possible significant advances in
biotechnology (e.g., receptor-based biosensors) and biomedical sciences (e.g.,
structure-aided drug design). Technical problems make it difficult or impossible
to determine high-resolution structures for most membrane proteins at present.
However, a great deal of experimental data is available for many membrane
proteins, and this information can often be used in concert with computational
tools to generate reasonable three-dimensional models (Findlay, 1996). The
models in turn are beneficial in formulation of hypotheses and design of future
experiments (Kontoyianni and Lybrand, 1993). A number of developmental issues
must be addressed to enhance modeling capabilities for study of proteins in
general, and membrane proteins in particular. For example, it is not well
understood at present how much "constraint" information is needed to permit
construction of a reasonable three-dimensional model structure, or even which
types of experimental information are most useful in model building exercises.
Additional methodological developments are also needed for improved
representation and treatment of lipid bilayers (e.g., efficient treatment of
long-range electrostatic interactions, modified Hamiltonians for representation
of anisotropic pressure tensors, etc.) and lipid-protein interactions. A number
of prokaryotic membrane proteins are now quite well characterized (e.g.,
bacterial chemotaxis receptors (Bourret et al, 1991) and porins (Kreusch
and Schulz, 1994)), and can serve as useful models for more complex membrane
proteins from higher organisms. These systems are ideal test cases for
evaluation of new procedures for membrane protein modeling.
Progress in the understanding of membrane protein structure and function has
been hindered by the lack of a large number of high-resolution structures.
Structures from x-ray crystallography are limited to those complexes that
crystallize, whereas those from high-resolution solution NMR are limited to
cases where the assemblies have sufficiently short correlation times to produce
narrow lines. Techniques from solid state NMR, including rotational resonance
(RR) and rotational echo double resonance (REDOR), and from EPR spectroscopy
(Steinhoff et al, 1994), offer special opportunities for obtaining highly
specific distance constraints for membrane proteins. A promising avenue of
research is to delineate the minimum amount of distance information needed to
specify a structure, and to predict in what order one could perform the smallest
number of specific NMR or EPR experiments to arrive at a structure.
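
A simple counting argument gives a feel for the question of minimum distance information. The sketch below compares the number of all pairwise distances among N points with the number of internal degrees of freedom of N points in three dimensions, 3N - 6 for N of at least 3. This is only a necessary lower bound under the assumption of a generic, non-collinear arrangement: it does not say which distances suffice, and distances alone cannot distinguish a structure from its mirror image. The function names are hypothetical.

```python
def all_pair_distances(n_points):
    """Number of distinct pairwise distances among n_points."""
    return n_points * (n_points - 1) // 2


def min_independent_distances(n_points):
    """Lower bound on the distances needed to fix n_points in 3-D.

    Counts internal degrees of freedom: 3 coordinates per point minus
    6 rigid-body motions. For fewer than 3 points every pair is needed.
    """
    if n_points < 3:
        return all_pair_distances(n_points)
    return 3 * n_points - 6


if __name__ == "__main__":
    for n in (4, 10, 100):
        print(n, min_independent_distances(n), all_pair_distances(n))
```

For 100 labeled sites the bound is 294 distances out of 4950 possible pairs, which suggests why choosing the order and identity of measurements matters far more than collecting every pair.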
The problem of RNA structure prediction and
DNA and RNA interactions with proteins is of central biological interest. There
is a need here for improved physical models to describe the interactions of
nucleic acids, which differ from most proteins in that they induce large local
electric fields. Recently, methods have been developed for treating highly
charged macromolecules which are surrounded by concentrated ion atmospheres (see
e.g. Misra et al, 1993; York et al, 1995). These and related
methods open up a variety of opportunities for simulating important biological
phenomena involving nucleic acids at atomic level resolution.
The explosive growth of information about RNA structure and function offers new
opportunities that were nonexistent a few years ago. Requirements in this area
range from computational and mathematical techniques to describe the interaction
of large fragments (see e.g. Easterwood et al, 1994), which are treated
as rigid structural units, to accurate atomic-level representations. Similarly,
methods must be developed to integrate experimental and phylogenetic data into
modeling studies (Jaeger et al, 1994).
SIMULATIONS
Simulations of molecules of biological interest use
computational representations that range from simple lattice models to full
quantum mechanical wave functions of nuclei and electrons. If one has access to
a macromolecular structure derived from NMR or X-ray crystallography, then one
can begin with a full atom representation and fruitfully examine "small changes"
in the system such as ligand binding or site specific mutation. Again, the goal
is to reproduce and predict structure, dynamics and thermodynamics. In fact,
simulations can provide the connecting link between structure (X-ray and NMR)
and function (experimental measurements of thermodynamic properties).
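
To make the idea of a dynamics simulation concrete, here is a minimal sketch of the velocity Verlet integration scheme, a standard time-stepping method in molecular dynamics. The system is deliberately a toy: a single particle on a harmonic spring with unit mass and unit force constant, both assumptions chosen so the exact answer is known. A real biomolecular simulation would replace the force function with an empirical force field over thousands of atoms.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations for one coordinate with velocity Verlet.

    Returns the list of (position, velocity) pairs, including the start.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Restoring force of a spring with force constant k (toy model)."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)
x_end, v_end = trajectory[-1]

if __name__ == "__main__":
    print(x_end, math.cos(10.0))             # numerical versus exact position
    print(total_energy(x_end, v_end))        # should stay close to 0.5
```

The time step must resolve the fastest motion in the system, which is the underlying reason that simulated time spans remain short compared with many biological processes.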
In the last 10 years, because of increased computer power, molecular dynamics
calculations have progressed from the short-time simulation of a macromolecule
without explicit solvent to full representations of solvent and counterions
carried out over a few nanoseconds (Berendsen, 1996). Developments in both
hardware and software for parallel computing have played a major role. However,
the longest time simulations that have been carried out are still 9 orders of
magnitude away from the typical time scale for experimental protein folding.
Simplified but realistic models, for example using a continuum treatment of the
solvent (Gilson et al, 1995), could increase the time scale by 1-2 orders
of magnitude. Continuum representations may be more readily incorporated into Monte
Carlo methods and thus allow large movements of the molecule during simulation
(Senderowitz et al, 1996). In some cases, the use of Langevin and
Brownian dynamics and multiple time step algorithms (Humphreys et al,
1994) may be warranted. The simulation of biological molecules at the molecular
level has generated much excitement and these approaches have become an
increasingly important partner with experimental studies of these complex
Electrostatic interactions are a crucial component in the
structure and function of biological macromolecules. In the last few years
electrostatic models based on numerical solutions to the Poisson-Boltzmann (PB)
equation have been used extensively as a basis for interpreting experimental
observations on proteins and nucleic acids (Honig and Nicholls, 1995) including
for example the prediction of the pKa's of ionizable groups (see e.g. Bashford
and Karplus, 1990). Electrostatic potential plays a special role in membrane
phenomena: the energies involved are large and the experimental effects of
potential changes are also large, often dominant. The extension of PB methods to
membranes and channels is an area of great interest.
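
For orientation, the simplest analytic limit of the Poisson-Boltzmann treatment is the linearized (Debye-Hückel) solution for a point charge in a 1:1 electrolyte, where the Coulomb potential is damped by a factor exp(-r / Debye length). The sketch below evaluates that limit only; it is not a numerical PB solver of the kind cited above. The relative permittivity of 78.5 for water and the temperature of 298.15 K are assumed default values.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, eps_r=78.5, temperature=298.15):
    """Debye screening length in nm for a 1:1 electrolyte."""
    conc = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    length_m = math.sqrt(EPS0 * eps_r * KB * temperature
                         / (2.0 * N_A * E_CHARGE ** 2 * conc))
    return length_m * 1e9


def coulomb_potential_mv(charge_e, r_nm, eps_r=78.5):
    """Unscreened potential (mV) at distance r_nm from a point charge."""
    r_m = r_nm * 1e-9
    return 1000.0 * charge_e * E_CHARGE / (4.0 * math.pi * EPS0 * eps_r * r_m)


def screened_potential_mv(charge_e, r_nm, ionic_strength_molar,
                          eps_r=78.5, temperature=298.15):
    """Debye-Hückel potential (mV): Coulomb term times exp(-r / Debye length)."""
    lam = debye_length_nm(ionic_strength_molar, eps_r, temperature)
    return coulomb_potential_mv(charge_e, r_nm, eps_r) * math.exp(-r_nm / lam)


if __name__ == "__main__":
    print(debye_length_nm(0.1))
    print(coulomb_potential_mv(1, 1.0), screened_potential_mv(1, 1.0, 0.1))
```

At an ionic strength of 0.1 M the screening length is roughly one nanometer, comparable to the dimensions of a small protein, which is why salt effects cannot be ignored in models of macromolecular electrostatics. The linearization is not reliable near highly charged molecules such as nucleic acids, where the full nonlinear equation is needed.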
BIO-INSPIRED MATERIALS
Bio-inspired materials represent a special
area of opportunity for developing new high-performance engineering materials
based on ideas inferred from Nature (Tirrell et al, 1994). For example,
the proteins derived from spider silk serve as the inspiration for high-strength
fibers (Simmons et al, 1996); the adhesives from barnacles suggest how to
produce glues that cure and function underwater; and the complex
protein-inorganic interactions in mollusk shells supply ideas for producing
ceramics that are less brittle than current ones. It is likely that ultimate
bio-inspired materials will be chimeric, that is, they will be produced as a
hybrid between biological and synthetic components. Consequently, these
materials represent a special class of the protein folding problem and of
polymer physics. In addition to the molecular level interactions, the ultimate
mechanical properties of such materials derive also from long-range
interactions, orientation and crystallite size. Models from polymer science and
from protein folding must be combined and adapted to predict how mechanical
properties such as modulus, strength and elasticity depend on these physical
parameters. Once such models are also able to explain the mechanical properties
of wild-type biomaterials, they can be used in a predictive sense to guide the
production of chimeric materials.
MOLECULAR HISTOLOGY
Understanding the spatial conformation of
biological macromolecules (DNA, RNA, protein) and functional changes in
conformation provides an ongoing challenge to mathematics. Analytical and
computational models based in geometry and topology continue to be very
successful in providing a theoretical and computational framework for the
analysis of enzyme mechanism and macromolecular conformation (Rybenkov, 1993;
Schlick and Olson, 1992; White, 1992; Sumners et al, 1995; Lander and
New experimental modalities, such as cryo-electron
microscopy (Stasiak et al, 1996) and optical tweezers (Smith et al,
1996), provide spatial and structural data of ever-increasing resolution. This
new spectrum of high-resolution data will require correspondingly
high-resolution mathematical models to aid in the design and interpretation of
experiments. Refinement of existing models will provide a starting point, but
new ideas and new combinations of old ideas are needed. One particularly
important need is the development of efficient descriptors of spatial
conformation of macromolecules; descriptors that will afford efficient database
entry and retrieval of information, while encoding biologically significant
The central organizing theme for
Cells and Cell Systems is how behavior and function at one level of organization
emerges from the structure and interactions of components at lower levels. In
the set of topics described in this section the lower level of organization is
subcellular or cellular. Though some of the subcellular components that play a
role in these models are molecular, the focus is not on the structure of those
molecules, but on the part that they play in cellular and multicellular
function. The section on CELL SIGNALING deals with the role of specific
molecules in regulation of processes such as cell division, cellular
communication, and gene expression. In the MECHANICS AND EMBRYOLOGY section, the
focus is on how mechanochemical processes at the molecular level can drive the
processes that lead to macroscopic changes in shapes of tissues and organs. The
problems discussed in BIOFLUID DYNAMICS again start at the level of individual
(bacterial) cells, with substructures (flagella) interacting at tiny scales with
the hydrodynamics to produce macroscopic behavior (swimming).
The sections on IMMUNOLOGY AND VIROLOGY and NEUROSCIENCES focus on scientific
problems that involve larger multicellular systems. Understanding the immune
system requires insights about how classes of molecules found on the cell
surface generate the complex signals which lead to a normal immune response;
this response, which includes a memory of previous interactions with antigens,
is a property of the entire immune system, not of individual cells. Similarly,
the nervous system can be studied at the level of individual cells, to
understand how the biophysical properties of cellular membranes contribute to
the responses of individual cells; but an understanding of the functioning of
the nervous system also requires a study of the behavior of large scale networks
CELL SIGNALING
Control of cellular processes, mediated by
interactions of signaling molecules and their cell surface receptors, is a
central and unifying theme in current experimental cell biology. Within the past
five years, techniques of molecular biology have revealed many of the kinases,
phosphatases and other molecules involved in signal transduction pathways, as
well as molecular sub-domains and sequence motifs that determine distinct
functions. New techniques for measuring phosphorylation, calcium fluxes, and
other early biochemical responses to receptor interactions are being applied to
study many cell signaling systems (e.g., chemotactic bacteria, neurons and
lymphocytes). Genetically engineered experimental systems consisting of
homogeneous cell lines, transfected with homogeneous populations of wild type
and mutant receptors and effector molecules, have facilitated acquisition of
much of the new information about the intracellular molecules that mediate
signal transduction. Improved measurement and experimental design make
mathematical modeling an increasingly feasible tool for testing ideas about the
interactions of these molecules.
Modeling has contributed to our
understanding of key cell surface interactions (e.g., ligand-induced receptor
aggregation, cell-cell interactions, and cell adhesion). Modeling has also
clarified the nature and effects of cellular responses (e.g., internalization
and secretion of proteins, cell division and differentiation, and cell
motility). Recent combinations of modeling and experiment have brought a deeper
understanding of the role of calcium in the regulation of cell division,
neuronal communication, regulation of muscle contraction, pollination, and other
cellular processes. (Silver, 1996) Representative descriptions of collaborative
work applying mathematics to problems in experimental cell biology are found in
Alt et al, 1996; Goldstein and Wofsy, 1994 and Lauffenburger and
Linderman, 1993. Other recent examples of the productive application of theory
to cell signaling and cell motility include Alon et al, 1995; Bray, 1995;
Jafri and Keizer, 1994; Naranja et al, 1994; Tranquillo and Alt, 1996 and
Tyson et al, 1996. Over the next few years, we can expect mathematical
modeling to play a central role in the design and interpretation of experiments
aimed at understanding in detail the biochemical reactions leading from receptor
interactions to changes in gene expression, cell division, and other functional
MECHANICS AND EMBRYOLOGY
Recent advances in instrumentation have
made it possible to measure motions and mechanical forces at the molecular scale
(Svoboda and Block, 1994). Concomitant with these new mechanical measurements
are crystallographic and x-ray diffraction techniques that have revealed the
atomic structure and molecular geometry of mechanochemical enzymes to angstrom
resolutions (Rayment and Holden, 1994). Together, these techniques have begun to
supply data that has revived interest in cellular mechanics, and reinvigorated
the view of enzymes as mechanochemical devices. It is now possible to make
realistic models of molecular mechanochemical processes that can be related
directly to experimentally observable, and controllable, parameters (Peskin and
Oster, 1995). These advances in experimental technology have initiated a
renaissance in theoretical efforts to readdress the central question: how do
protein machines work? More precisely, how is chemical energy transduced into
directed mechanical forces that drive so many cellular events?
Embryology has also moved beyond descriptive observation to encompass
genetic control of development and localization of protein effectors. The stress
and strain measurements that are now possible at the cellular scale promise to
unite the genetics, biochemistry and biomechanics of development (Oliver et
al, 1995). By characterizing the mechanical properties of embryonic cells
and tissues, mathematical models can be used to discriminate between various
possible mechanisms for driving morphogenesis (Davidson et al, 1995).
Examples encompass all phenomena that involve the coordinated movement
of macromolecules, cells or tissues. How do embryonic cells crawl and bacteria
swim (Dembo, 1989; Berg, 1995; Mogilner and Oster, 1996)? How are proteins
shuttled about the cell (Scholey, 1994)? What drives the grand progression of
cell division (Murray and Hunt, 1993)? What drives the shaping of tissues and
organs during embryonic development (Murray and Oster, 1984; Brodland, 1994) and
the reshaping of organs after injury (Tranquillo and Murray, 1993; Olsen et
BIOFLUID DYNAMICS
Because of the ongoing revolution in computer
technology, we can now solve fluid dynamics problems in the three spatial
dimensions and time (Ellington and Pedley, 1995). This opens up biological
opportunities on many different scales of size. On the organ scale, for example,
one can now perform fluid dynamics simulations of the embryonic and fetal heart
at different stages of development. Such models will help to elucidate the role
of fluid forces in shaping the developing heart. The swimming mechanics of
microorganisms are also accessible to computer simulation. A particularly
challenging problem in this field concerns the intense hydrodynamic interaction
among the different flagella of the same bacterium: When the flagella are
spinning so that their helical waves propagate away from the cell body, they
wrap around each other to form a kind of superflagellum that propels the
bacterium steadily along; when their motors are reversed and the flagella spin
the other way, the superflagellum unravels and the bacterium tumbles in place.
Because of the difficulty of measuring microscopic fluid flows, hydrodynamics
within cells is a much neglected aspect of cellular and intracellular
biomechanics. Indeed, computation provides our only window onto this important
aspect of cellular physiology. The incompressibility and viscosity of water have
the effect of coupling motions along different axes, and between objects quite
distant from one another; biomolecular processes are also modulated by the
necessity of moving water out of the way. A new feature in this realm of micro
and nano hydrodynamics is the importance of Brownian motion and the related
significance of osmotic mechanics (including sol-gel transformations) for
controlling fluid motions.
Progress in this field will depend on access
to large-scale scientific computing. It is important that the best technology be
made available to scientists on a scale sufficient to sustain this kind of
research. This will also necessitate supporting people with the expertise to
make effective use of these powerful machines. At universities, such people are
often in non-faculty, non-tenured research positions. We need support to sustain
their crucial role.
IMMUNOLOGY AND VIROLOGY
During the last two years mathematical
modeling has had a major impact on research in immunology and virology. Serious
collaborations between theorists and experimentalists provided breakthroughs by
viewing experiments in which AIDS patients were given potent anti-retroviral
drugs as perturbations of a dynamical system. Mathematical modeling combined
with analysis of data obtained during drug clinical trials established for the
first time that HIV is rapidly cleared from the body and that approximately 10
billion virus particles are produced daily (Ho et al, 1995). This work
had tremendous impact on the AIDS community and has, for the first time, given
them a quantitative picture of the disease process. The impact of this type of
analysis has extended beyond AIDS, and opportunities exist for developing
realistic and useful models of many viral diseases. Challenges remain in
studying drug therapy as a nonlinear control problem, and the issue of how
rapidly viruses mutate and become drug resistant under different therapeutic
regimes needs to be considered. Such issues also apply to the development of
antibiotic resistance in bacterial disease.
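The idea of treating drug therapy as a perturbation of a dynamical system can be illustrated with a minimal sketch. It assumes the simplest possible model, which is an assumption made here and not necessarily the model used in the cited study: once a fully effective drug blocks new virus production, plasma virus V decays as dV/dt = -cV, and the pretreatment steady state implies a production rate of c times V0. The function name, the parameter values, and the synthetic, noise-free data are all hypothetical.

```python
import math

def fit_clearance_rate(times, loads):
    """Fit ln(load) = ln(V0) - c * t by least squares; return (c, V0)."""
    n = len(times)
    logs = [math.log(v) for v in loads]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)

# Synthetic data generated from the assumed model (not patient data).
TRUE_C = 0.35      # hypothetical clearance rate, per day
TRUE_V0 = 1.0e5    # hypothetical pretreatment load, copies per mL
times = [0.0, 1.0, 2.0, 3.0, 5.0, 7.0]
loads = [TRUE_V0 * math.exp(-TRUE_C * t) for t in times]

c, v0 = fit_clearance_rate(times, loads)
half_life = math.log(2) / c           # days
production_per_ml = c * v0            # virions per mL per day at steady state

print(f"clearance rate c = {c:.3f} per day")
print(f"half-life = {half_life:.2f} days")
print(f"steady-state production = {production_per_ml:.3g} per mL per day")
```

With real trial data the fit would be noisy and the drug would not be perfectly effective, so a decay rate estimated this way is best read as a lower bound on the true clearance rate.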
Opportunities exist for
substantial advances in immunology by the use of modeling techniques. Molecular
modeling is providing insights into the structure and function of the cell
surface molecules crucial for the operation of the immune system:
immunoglobulin, the T cell receptor, and molecules coded for by the major
histocompatibility complex genes, as well as molecules being recognized by the
immune system. The biochemical sequelae of molecular recognition involve the
generation of complex biochemical and enzymatic signals, whose net effects are
changes in gene expression followed in many cases by cell proliferation, cell
differentiation and cell movement. How these changes are orchestrated to produce
an immune response remains to be elucidated. However, modeling can give us
insights into how cells interact by direct contact and via secreted molecules,
cytokines, to produce the coordinated behavior necessary to meet immune system
NEUROSCIENCES
The fundamental challenge in neuroscience is to
understand how behavior emerges from properties of neurons and networks of
neurons. Advances in experimental methodologies are providing detailed
information on ionic channels, their distribution over the dendritic and axonal
membranes of cells, their regulation by modulatory agents, and the kinetics of
synaptic interactions. The development of fast computing, sophisticated
simulation tools, and improved numerical algorithms has enabled the development
of detailed biophysically-based computational models that reproduce the complex
dynamic firing properties of neurons and networks. Such computations provide a
two-fold opportunity for advancing our knowledge: (1) they both explain and
drive new experiments, and (2) they provide the basis for new mathematical theories
that enable one to obtain reduced models that retain the quantitative essence of
the detailed models. These reduced models, which allow the bridging of multiple
spatial and temporal scales, are the building blocks for higher level models.
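As a concrete illustration of what a reduced model looks like, the sketch below integrates the FitzHugh-Nagumo equations, a familiar two-variable caricature of excitable membrane dynamics. The report does not name this model; it is used here only as an example, and the parameter values, the starting point, and the function name are assumptions chosen for illustration.

```python
def count_spikes(current, t_end=200.0, dt=0.01, a=0.7, b=0.8, eps=0.08):
    """Integrate FitzHugh-Nagumo with forward Euler; count upward crossings of v = 0.

    dv/dt = v - v**3 / 3 - w + current   (fast, voltage-like variable)
    dw/dt = eps * (v + a - b * w)        (slow recovery variable)
    """
    v, w = -1.2, -0.625   # close to the resting state when current = 0
    spikes = 0
    for _ in range(int(t_end / dt)):
        dv = v - v ** 3 / 3.0 - w + current
        dw = eps * (v + a - b * w)
        v_new = v + dt * dv
        w = w + dt * dw
        if v < 0.0 <= v_new:   # upward crossing marks one spike
            spikes += 1
        v = v_new
    return spikes

print("spikes without input current:", count_spikes(0.0))
print("spikes with input current 0.5:", count_spikes(0.5))
```

Two variables are enough to separate a resting regime from a repetitively firing one, which is the kind of qualitative behavior a reduced model is meant to preserve while discarding channel-level detail.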
Modeling tools and mathematical analysis allow us to address the central
question: What are the cellular bases for neural computations and tasks such as
sensory processing, motor behavior and cognition? (Koch and Segev, 1989; Bower,
1992) More specifically, how do intrinsic properties of neurons combine in
networks with synaptic properties, connectivity, and the cable properties of
dendrites to produce our interaction with the world? Neural modulators affect
both the intrinsic currents and the synaptic interactions between neurons.
(Harris-Warrick et al, 1992) The effects of these changes at the network
level are difficult to work out even for small networks. The largest challenge
in this area is to understand how systems with enormous numbers of degrees of
freedom and large numbers of different modulators combine to produce flexible
but stable behavior. The geometry and electrical cable properties of the
branching dendrites of neurons also affect network activity. (Stuart and
Sakmann, 1994) Mathematical analysis is needed to interpret the results of
massive computations, and to incorporate the insights into network models.
The dynamics of neural networks (Golomb et al, 1996; Kopell and
LeMasson, 1994) affect both cognitive and sensory-motor behavior. To understand
motor behavior, one must construct models that illuminate the role of feedback
between neural and mechanical subsystems. For sensory systems, one of the most
important problems is to understand how the brain controls the data that it
receives, including understanding more rigorously the quantitative
parameterization/description of natural stimuli. A current active area of
inquiry is the characterization of codes used in information processing in the
nervous system. (Softky and Koch, 1993; Shadlen and Newsome, 1995; Softky, 1995)
Among the issues raised by this question is how the complex dynamics of the
cortex can help shape responses to stimuli, including selecting pathways that
lead to different behavior.
Modeling has become an accepted and central
tool in neurobiology. The current scientific goals listed above create specific
challenges in modeling. Some of these concern the handling and interpretation of
the far greater volume of data that is now, or potentially, available, e.g.
through multiunit recording techniques. With very large and complex models
(Whittington et al, 1995), techniques for systematically choosing
parameters are important, as are methods for comparing models and understanding
their differences. Both computers and mathematical analysis will play major
roles in dealing with the technical problems; mathematical analysis remains the
fundamental tool for providing a deep understanding of how models differ in
V. ECOLOGY AND EVOLUTIONARY
Evolution is the central
organizing theme in biology (e.g. Roughgarden, 1979), and its manifestation in
the relationships among types of organisms spans levels of organization, and
reaches out from biology to earth and social sciences. Thus, the core problems
in ecology and evolution run the gamut from those that address fundamental
biological issues to those that address the role of science in human affairs.
Fundamental challenges facing ecologists and evolutionary biologists relate to
the threats of the loss of biological diversity, global change, and the search
for a sustainable future, as well as to the continued search for an
understanding of the biological world and how it came to assume its present
form. To what extent is the organization of the biological world the predictable
and unique playing out of the fundamental rules governing its evolution, and to
what extent has it been constrained by historical accident? How are the
interactions among species, ranging from the tight interdependence of host and
parasite to the more diffuse connections among plant species in a forest,
manifested in their coevolutionary patterns and life history evolution? What are
the evolutionary relationships among closely related species, in terms of their
shared phylogenetic histories? How do human influences, such as the use of
antibiotics and pesticides, exploitation of fisheries and land, and accelerated
patterns of global change, influence the evolutionary dynamics of species and
patterns of invasion? To what extent can an evolutionary perspective help us to
prepare for the future, in terms of understanding what species might be best
suited to new environments? The latter is important both in terms of natural
patterns of change, and deliberate manipulations through breeding and species
Among the central issues are those relating to
biodiversity (Tilman, 1994): how it is maintained, how it supports ecosystem
services, how it is likely to change, and what steps can preserve it. This leads to a
fundamental set of core issues, both in terms of their importance, and in terms
of their ripeness for success:
Conservation biology, and the preservation of biodiversity
What
factors maintain biodiversity? How can new approaches to phylogenetic analyses,
in clarifying the evolutionary relationships within and among species, help us
to understand how we should measure biodiversity? How are ecosystems organized
into functional groups, ecologically and evolutionarily, and how does that
organization translate into the maintenance of critical ecosystem processes,
such as productivity and biogeochemical cycles, as well as climate mediation,
sequestering of toxicants, and other issues of importance to human life on
Global change
What are the connections between the physical and
biological parts of the global biosphere, and the multiple scales of space, time
and organizational complexity on which critical processes are played out?
(Bolker et al, 1995) In particular, how are individual plants influenced
by changes in atmospheric patterns; and, more difficult, how do those effects on
individual plants feed back to influence regional and global patterns of climate
and biological diversity? How do effects on phytoplankton and zooplankton relate
to each other, and to the broader patterns that may be observed?
Emerging disease
How do patterns of population growth and resource use,
as well as the profligate use of antibiotics, contribute to the emergence and
reemergence of deadly new diseases, many of them antibiotic resistant? (Ewald,
1995) Are there approaches to management of the diversity of those diseases,
guided by both an evolutionary and an ecosystem perspective, that can reduce the
threat and provide new strategies for mitigation?
Resource management
The history of the management of our sources of
food and fiber is not one of unmitigated successes, and many of these crucial
resources are threatened to a level that they will be unable to support the
needs of humanity in the coming decades. The prospect of large-scale alterations
of the earth's physical and biological systems creates a potential conflict
between human needs, desires and capabilities. (Walters and Parma, 1996; Walters
and Maguire, 1996) This situation is further complicated by the limitations of
our understanding and ability to control complex biological systems. We must
develop methods for decision-making and management that are appropriate for an
uncertain future. (Hilborn et al, 1995)
In all of these issues,
there are a variety of cross-cutting themes, some biological, some
methodological or conceptual. From a biological point of view, the essential
point is that all that we see has been shaped by evolutionary processes; from an
ecological point of view, it is that organisms do not exist in isolation, but
have existed within the context of other species and an abiotic environment,
making essential an ecosystem perspective on issues ranging from the management
of diseases to the management of our global surroundings. Indeed, a central
challenge is to understand how the properties even of ecosystems, those loose
assemblages of species in particular habitats, can be understood in terms of the
diffuse coevolution of the components within very open systems.
From a modeling point of view, a fundamental issue remains how to deal with variation
within as well as variation among units, for example in the importance of
heterogeneity in evolutionary processes or infectious transfer. The interplay
among processes operating on very different scales also pervades these questions,
from evolution through global change. And finally, techniques for
simplification, and for relating behaviors at the level of individuals to
macroscopic descriptions, provide the tools for making the essential
Progress in all of these research areas will derive from
the application of a suite of approaches, ranging from explicit spatial and
stochastic simulations to more compact (Durrett and Levin, 1994) mathematical
descriptions that allow analysis and simplification. Recent advances in computer
technology have made it possible to include far more biological detail in
simulation approaches than ever before. This detail comes at a cost, however. The ability to
generate information does not equal understanding, and the mathematical
challenge is to develop techniques which can include the essential details
driving the complex models, while allowing an understanding of the features
driving the biological behavior at a deeper level that will allow
generalization. This will require both close attention to the underlying
biological details and fundamental mathematical progress in taking appropriate
limits and achieving manageable simplification of complex, spatially explicit,
Below, we focus on modeling opportunities in some of
the specific subfields in the general areas of ecology and evolution.
POPULATION GENETICS
While evolution is the great unifying principle
underlying all of biology, evolutionary genetics forms the foundation of
evolution. Challenging mathematical and computational applications in this
critical area range from the development of theoretical frameworks from which to
infer the operation of evolutionary mechanisms such as natural selection at the
molecular level through the organismal level, to understanding the genetic basis
of interactions among species.
One critical area, still in its infancy,
concerns the identification and genetic analysis (Coyne et al, 1991) of
genes that play key roles in species and environmental interactions. The mapping
of such quantitative trait loci consists of three interrelated inference
problems: detecting the effects of these loci, determining the number of major
loci affecting a trait, and locating them relative to genomic markers. A
complete solution thus involves problems of testing, model selection, and
estimation. Once ecological and genetic analysis of traits limiting adaptive
responses is complete, it will be possible to address crucial evolutionary
questions such as the relative importance of gene flow, genetic trade-offs, and
A second exciting area concerns life history
evolution, which often focuses upon the timing of life history events or the
allocation of organismal resources and time among conflicting demands such as
longevity and fecundity. Evolution of these traits can be studied from
quantitative genetic descriptions in which transient dynamics are explored
(Tuljapurkar and Wiener, 1995), while the selective environment is reduced to a
selection gradient. Alternatively, the nature of the environment's selective
effect on a trait can be explored through optimization approaches. There is a
pressing need for more complex formulations, such as models bridging the gap
between problems of allocation and timing, models explicitly incorporating how
genes act at different ages and over time (Charlesworth, 1994), models at
the interface between life history evolution and behavior (Charlesworth, 1994),
and models examining how life histories are influenced by temporal and spatial
variation in the environment (Tuljapurkar, 1994).
Beyond the species
level, the coevolutionary dynamics of the quantitative traits that are often
involved in species interactions pose many challenges and opportunities to
theoretical, computational, and mathematical biologists that cut across all
areas of ecology and evolution. For example, the study of the evolution of
virulence (Frank, 1993, 1994) in insect-parasitoid-host systems and fungi-virus
interactions in plants and the study of mechanisms of specialization and the
analysis of hybrid zones are part of the cutting-edge research being conducted
at the interface of biology and the mathematical sciences.
With the rapid accumulation of sequence data for entire genomes, we are now poised to
analyze the set of genes, their order and organization, codon usage, etc. across
taxa (Griffiths and Tavare, 1996) and how and perhaps why this has evolved over
time. (Thorne et al, 1992) This requires an increased ability to model
how information is represented and acted upon in biological systems (Griffiths
and Tavare, 1996) based on tools from such fields as discrete mathematics,
combinatorics, and formal languages. Novel, perhaps ad-hoc formulations are
needed to form the mathematical basis of genomic analyses because classical
quantitative formulations of notions such as information, similarity, and
classification - all inextricably related to biology - are inadequate.
Correspondingly, methods for organizing vast sequence data into data structures
and databases suited for the most efficient data storage and access are needed,
along with improved algorithms for sequence analysis and the identification of
homologies among sequences.
Population genetic surveys of the genetic
structure of natural populations are a critical tool from which to deduce the
evolutionary history of, and evolutionary forces at work in, natural
populations. Current population genetic theory and data analysis methods are
largely based upon single or a few genetic loci, each with two alternate forms
(alleles). Current data, however, typically includes the genetic makeup at a
large number of genetic markers which, with the advent of new molecular
techniques such as the polymerase chain reaction, are increasingly hypervariable
with a large number of alternate forms segregating at each. New theoretical
frameworks and statistical methods are needed to extract and utilize the full
evolutionary information contained in these complex data sets.
CONSERVATION BIOLOGY
Virtually all important questions in conservation
biology require making predictions, so theory and mathematical methods have
played and will continue to play a central role. Although many of the underlying
scientific issues have been defined during the past decade, many questions
remain to be resolved. What species would be lost in the wake of an invasion,
and what are the effects on ecosystem function? For example, what are the
consequences of the replacement of native fish species by introduced species?
Substantial progress is likely (and needed) in the near future in understanding
the dynamics of invading exotic species, determining more carefully the role
genetics plays in the dynamics of rare or endangered species, and in the
ecological dynamics of threatened species.
Theoretical studies have
focused on the population size or characteristics needed to allow species to
maintain the genetic diversity necessary to allow long term persistence. (Lande,
1993, 1994) These studies have shown that a sufficiently large effective population size is
required, but further work is needed to understand how effective population size
is related to actual population size and structure and life history
characteristics, that is, to what can actually be observed. These questions lead to interesting
mathematical challenges dealing with structured populations, and with
integrating ecological and genetical models.
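To make the gap between effective and actual population size concrete, the sketch below uses two standard textbook approximations. They are not taken from the report and rest on idealized assumptions (random mating, discrete generations). The function names and the example numbers are hypothetical.

```python
def ne_unequal_sex_ratio(n_males, n_females):
    """Effective size when breeding males and females differ in number.

    Textbook approximation: Ne = 4 * Nm * Nf / (Nm + Nf).
    """
    return 4.0 * n_males * n_females / (n_males + n_females)

def ne_fluctuating(sizes):
    """Effective size over several generations of varying census size.

    Textbook approximation: the harmonic mean of the per-generation sizes.
    """
    return len(sizes) / sum(1.0 / n for n in sizes)

# A census of 100 breeders with a skewed sex ratio behaves like a much
# smaller idealized population.
print(ne_unequal_sex_ratio(50, 50))
print(ne_unequal_sex_ratio(10, 90))

# A single bottleneck generation dominates the long-run effective size.
print(ne_fluctuating([100, 100, 10]))
```

Both formulas show why census counts alone can overstate a population's capacity to retain genetic diversity, which is the link to observable quantities that the paragraph above calls for.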
The impact of invading
exotic species on existing native ecological communities and species is perhaps
the most important conservation issue today (OTA, 1993). There has been almost
no development of theories predicting rates of spread of species within the
context of even simple communities, and the related mathematical problems of
coupled reaction diffusion equations are challenging as well. Although the basic
mathematical models of spatial spread can be traced at least as far back as
Fisher (1937), recent work has shown that the situation is far more complex, as
rates of spread can vary by at least an order of magnitude as model assumptions
are changed. (e.g. Lewis & Kareiva, 1993; Zadocks and Van Den Bosch,
1994) Further work should lead to robust quantitative predictions of
rates of spread.
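For orientation, the classical single-species model of spatial spread associated with Fisher (1937) can be written as a reaction-diffusion equation. The symbols below are notation chosen here, not taken from the report: u is population density scaled by carrying capacity, D is a diffusion coefficient, and r is the intrinsic growth rate.

```latex
\frac{\partial u}{\partial t} = D\,\frac{\partial^{2} u}{\partial x^{2}} + r\,u\,(1 - u),
\qquad
c^{*} = 2\sqrt{rD}
```

Here c* is the asymptotic speed of the invasion front for a population introduced locally. The formula holds only under the model's own assumptions (logistic growth, purely diffusive movement, a single species). Departures from those assumptions, such as a different growth law or a non-diffusive dispersal mechanism, change the predicted speed, which is the sensitivity to model assumptions noted above.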
MANAGEMENT OF NATURAL SYSTEMS
In recent years there has been an abrupt
shift in management philosophy. (Hilborn et al, 1995) The old goal of
managing individual species in order to reach and maintain optimal conditions
has been replaced by a new goal of maintaining ecosystem function and adapting
to new conditions or changes in the system. This shift reflects a more mature
attitude towards nature that recognizes the limitations of our knowledge and
capabilities, the importance of interactions between species and an appreciation
of the dangers of a command and control mode of operation.
The new approach to management makes it possible to apply elements of the scientific
method in a new and significant context: we may design experimental management
schemes to provide information that is required to improve the management
process and adapt to changes, even unforeseen changes. This new approach
challenges our mathematical and statistical skills. Successful adaptation
requires effective and timely organization of data through estimation of
parameters that affect system dynamics, including the dynamics of our learning.
That information then must be translated into an assessment of the likely
consequences of management strategies and actions.
The major challenges
facing the human species cannot be met by a reductionist or piecemeal approach.
Instead we must muster all of our ingenuity and resources to learn about the
behavior of intact natural systems under stress and perturbation, and adapt our
human institutions to a finite and vulnerable world.
GLOBAL CHANGE AND BIODIVERSITY
Climate change and associated changes in
greenhouse gases have made imperative the examination of the potential impacts
on natural systems, and associated feedbacks. Advances in computational
capabilities have made possible the construction of detailed individual-based
models that take account of the responses of individual trees to changes in
environmental conditions, and their mutual effects. Yet such models are
tremendously data-hungry, and have great potential for error propagation. To
make their predictions robust, and to allow those predictions to be interfaced
with the much broader scale predictions of climate models, and the masses of
broad scale information that are becoming available from remote sensing, we must
find ways to reduce dimensionality and simplify those overly detailed models.
Similar comments apply to models of other systems, such as the aggregation of
social organisms from cellular slime molds to marine and terrestrial
invertebrates and vertebrates. Methods such as moment closure and hydrodynamic
limits, borrowed from other disciplines, are proving remarkably promising,
especially when coupled with experimental approaches (Levin and Pacala, 1996).
This represents one of the most challenging and important issues in
ecosystem science. At the same time, masses of data are becoming available from
global observation systems, and critical experiments are providing understanding
of the linkages between ecosystem structure and function, and in particular the
role of biodiversity in maintaining system processes. The next 5-10 years hold
remarkable potential for integrated theoretical, empirical and computational
approaches to elucidate profound and important issues (Field, 1992; Bolker,
THE DYNAMICS OF INFECTIOUS DISEASES
The study of infectious disease dynamics is one of the oldest and most
successful areas of mathematical biology, with a history spanning a century,
and has seen powerful advances in recent years in mathematical theory and in
the application of that theory to management strategies (see, for
example, Anderson and May, 1991). Much of the literature has assumed homogeneous
mixing, so that every individual is equally likely to infect every other
individual; but such models are inadequate to describe the central qualitative
features of most diseases, especially those that are sexually transmitted, or
for which spatial or socioeconomic structure localizes interactions. The
classical work of Hethcote and Yorke (1984) on core-group dynamics highlighted
the importance of such effects, and formed the basis upon which much recent work
rests. Such work, involving spatial structure, frequency and density dependence,
and behavioral factors, has not only forced us to revise old paradigms, but has
reenergized the interplay among nonlinear dynamics, ecology and epidemiology.
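The contrast between homogeneous mixing and a structured population can be illustrated with a minimal sketch. It is not taken from any of the works cited here: it assumes a susceptible-infected-susceptible (SIS) model with proportionate mixing between two groups, and the function names, group sizes, contact rates, and the values of beta and gamma are all hypothetical choices made for illustration.

```python
def simulate_sis(fractions, contact_rates, beta=0.1, gamma=1.0,
                 initial_prevalence=0.01, dt=0.01, t_end=200.0):
    """Euler-integrate an SIS model with proportionate mixing.

    fractions[i]      share of the population in group i (sums to 1)
    contact_rates[i]  contacts per unit time for a member of group i
    beta              transmission probability per contact (assumed value)
    gamma             recovery rate (assumed value)
    Returns the final infected fraction within each group.
    """
    infected = [initial_prevalence] * len(fractions)
    total_contacts = sum(n * c for n, c in zip(fractions, contact_rates))
    for _ in range(int(round(t_end / dt))):
        # Probability that a randomly chosen contact is with an infected person.
        risk = sum(n * c * i for n, c, i in
                   zip(fractions, contact_rates, infected)) / total_contacts
        infected = [
            i + dt * (beta * c * (1.0 - i) * risk - gamma * i)
            for c, i in zip(contact_rates, infected)
        ]
    return infected


def overall_prevalence(fractions, infected):
    """Population-wide infected fraction, weighting each group by its size."""
    return sum(n * i for n, i in zip(fractions, infected))


# Hypothetical population: a 2% core group with 50 contacts per unit time
# and a 98% general population with 1 contact per unit time.
fractions = [0.02, 0.98]
contact_rates = [50.0, 1.0]
mean_rate = sum(n * c for n, c in zip(fractions, contact_rates))

structured = simulate_sis(fractions, contact_rates)
# Homogeneous mixing: everyone is given the population-average contact rate.
homogeneous = simulate_sis([1.0], [mean_rate])

print("structured prevalence: ", overall_prevalence(fractions, structured))
print("homogeneous prevalence:", overall_prevalence([1.0], homogeneous))
```

With these assumed values the homogeneous model predicts that the infection dies out, while the structured model settles at an endemic level sustained by the small high-activity group. Under proportionate mixing the threshold depends on the mean of the squared contact rate divided by the mean contact rate, not on the mean contact rate alone.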
VI. CROSS-CUTTING ISSUES
MATHEMATICAL AND COMPUTATIONAL ISSUES SPANNING ALL DOMAINS: RELATIONSHIP BETWEEN
SIMULATION AND MATHEMATICS
The revolution in computer technology
enables us to perform complex simulations only dreamed of a decade ago.
Effective use of this technology requires substantial use of mathematics
throughout all stages of the simulation process: the quantitative (or
qualitative) formulation of models, the design of appropriate data types and
algorithms, translation of models into efficient computer implementations,
estimation of parameter values, visualization of the output, and comparison of
simulation results with results of further experimentation. Mathematics is also
essential in the critical step of developing algorithms that compute important
properties of models without recourse to numerical simulation.
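As a small illustration of computing a model property without simulation, the sketch below uses a logistic population with proportional harvesting, dx/dt = r x (1 - x/K) - h x. This model, the function names, and the parameter values are assumptions chosen for illustration, not something specified in this report. The equilibria and their stability follow from the closed-form expressions, and a short Euler simulation is included only as a cross-check.

```python
def growth(x, r, K, h):
    """Net growth rate dx/dt for logistic growth with proportional harvest."""
    return r * x * (1.0 - x / K) - h * x


def equilibria_and_stability(r, K, h):
    """Return a sorted list of (equilibrium, is_stable) from the analysis.

    Equilibria are x = 0 and x = K * (1 - h / r) when that value is positive.
    An equilibrium is linearly stable when d(dx/dt)/dx is negative there.
    """
    results = []
    for x_star in {0.0, max(K * (1.0 - h / r), 0.0)}:
        slope = r - 2.0 * r * x_star / K - h  # derivative of growth at x_star
        results.append((x_star, slope < 0.0))
    return sorted(results)


def simulate(x0, r, K, h, dt=0.01, t_end=200.0):
    """Euler integration, used here only to cross-check the analysis."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * growth(x, r, K, h)
    return x


r, K, h = 0.5, 1000.0, 0.2  # hypothetical parameter values
predicted = equilibria_and_stability(r, K, h)
stable = [x for x, is_stable in predicted if is_stable]

print("predicted equilibria:", predicted)
print("simulated end state: ", simulate(10.0, r, K, h))
```

The analytical route also shows at a glance what a simulation would reveal only after many runs: in this assumed model the positive equilibrium exists and is stable only while the harvest rate h stays below the intrinsic growth rate r.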
Furthermore, mathematics can significantly enhance our understanding of
processes that are studied through simulation. For example, theories of
dynamical systems describe patterns that are widespread, so much so that they
have been called "universal." The elucidation of such recurring patterns is a
central part of mathematics. Mathematics provides a common language, a context
that gives meaning to simulation results and a firm foundation for the
algorithmic infrastructure of simulation. Such a foundation ensures that
simulation methods are generalizable and capable of generating predictions.
Moreover, theory can serve as a basis for reducing models without loss of
information, thereby improving the efficiency of large-scale simulations.
MAJOR CHALLENGING ISSUES THAT SPAN ALL AREAS OF MODELING SYSTEMS
A. Integrating data and developing models of complex systems across multiple
spatial and temporal scales.
- scale relations and coupling
- temporal complexity and coding
- parameter estimation and treatment of uncertainty
- statistical analysis and data mining
- simulation modeling and prediction.
- large and small nucleic acids
- membrane systems
- general macromolecular assemblies
- cellular, tissue, organismal systems
- ecological and evolutionary systems.
C. Image analysis and
- image interpretation and data fusion
- inverse problems
- 2, 3, and higher-dimensional visualization and virtual reality
D. Basic mathematical issues
- formalisms for spatial and temporal encoding
- complex geometry
- relationships between network architecture and dynamics
- combinatorial complexity
- theory for systems that combine stochastic and nonlinear effects, often in
partially distributed systems.
E. Data management
- data modeling and data structure design
- query algorithms, especially across heterogeneous data types
- data server communication, especially peer-to-peer replication
- distributed memory management and process management.
VII. EDUCATIONAL ISSUES
As noted above,
mathematical analysis and computer modeling have become indispensable tools in
biology in recent years. These techniques have had a major impact in areas
ranging from ecology and population biology to neurosciences to gene and protein
sequence analysis and three-dimensional molecular modeling. Mathematical and
modeling techniques make it possible to analyze and interpret enormous amounts
of data, yielding information and revealing patterns and relationships that
would otherwise remain hidden.
Given the essential role that
mathematical and modeling techniques play in so many diverse areas of biology,
there is a clear need for appropriate training opportunities in computational,
mathematical, and theoretical biology. Suitable and practical mechanisms to
encourage and nurture training in computational biology might include 1)
graduate training grant programs that involve faculty engaged in both
computational and experimental approaches, 2) postdoctoral fellowships to
encourage mathematicians and computational scientists to pursue research
training in biology, and to enable biologists to acquire computational and
modeling skills, and 3) summer workshops and short courses to help practicing
biologists, mathematicians, and computational scientists to begin to bridge the
gap between these rather diverse disciplines.
In addition to training of
computational biology specialists, there is a clear and dramatic need for
enhanced training in mathematics and computational methods for biological
science students or others who might enter the workforce in any scientific
discipline. A systematic approach, beginning at the K-12 level, that emphasizes
the importance of mathematics and modeling in biology activities (as outlined in
the National Science Standards) would help ensure that students are better
prepared to utilize mathematical approaches in undergraduate biology curricula,
and less likely to avoid mathematically rigorous courses in undergraduate
programs because of weak mathematics backgrounds or "math phobia". Improved
mathematics training at the earliest levels will also likely increase the number
of students interested in pursuing graduate study in interdisciplinary areas of
mathematical and computational biology. Greater emphasis on mathematics and
computational studies at the K-12 and undergraduate levels can also be coupled
effectively with programs to encourage women and underrepresented minorities to
pursue careers in science, especially in interdisciplinary areas that bridge the
biological, mathematical, and computational sciences.
Finally, it should
be recognized that computer simulations and mathematical modeling tools can be
effective teaching aids in the biological sciences. Topics like protein
structure-function relationships benefit greatly from interactive,
three-dimensional graphics demonstrations. Computer simulations and animations
based on mathematical models can be an extremely effective way to illustrate the
behavior and properties of complex systems, ranging from protein-ligand
interactions to migration behavior of large animal populations. Therefore,
inclusion of mathematical and computational course work as a logical and
sequential theme articulated in K-12 and undergraduate curricula will likely
have far-reaching benefits for biology education.
REFERENCES
Hammer, D. A., and Springer, T. A. (1995). Lifetime of the P-selectin --
Carbohydrate Bond and Its Response to Tensile Force in Hydrodynamic Flow. Nature
Alt, W., Deutsch, A., and Dunn, G., eds.,
(1996). Mechanisms of Cell and Tissue Motion. Birkhaeuser Verlag, Basel.
Anderson, R.M. and May, R.M. (1991) Infectious Diseases of
Humans. Oxford Univ. Press.
Arnold, B. and Rossmann, M.G.
(1986). Effect of Errors, Redundancy, and Solvent Content in the Molecular
Replacement Procedure for the Structure Determination of Biological
Macromolecules. Proc. Natl. Acad. Sci. USA 83, 5489-5493.
Asilomar (1995). Proteins 23, 295-460.
Bashford, D. and Karplus, M. (1990). pKa's of Ionizable
Groups in Proteins - Atomic Detail from a Continuum Electrostatic Model.
Biochemistry 29, 10219-10225.
Berendsen, H. J. C. (1996).
Bio-molecular Dynamics Comes of Age. Science 271, 954-955.
H. (1995). Torque Generation by the Flagellar Rotary Motor. Biophys. J. 68
(4 Suppl), 163S-166S.
Bolker, B. M., Pacala, S.W., Canham,
C., Bazzaz, F. and Levin, S.A. (1995) Species Diversity and Ecosystem Response
to Carbon Dioxide Fertilization: Conclusions from a Temperate Forest Model.
Global Change Biology 1, 373-381.
Bourret, R. B.,
Borkovich, K. A. and Simon, M. I. (1991). Signal Transduction Pathways
Involving Protein Phosphorylation in Prokaryotes. Ann. Rev. Biochem. 60,
Bower, J.M., Guest Editor (1992). Special Issue:
Modeling the Nervous System. Trends In Neuroscience 15, #11.
Bowie, J. U., Luthy, R. and Eisenberg, D. (1991). A Method
to Identify Protein Sequences That Fold into a Known Three-Dimensional
Structure. Science 253, 164-170.
Brodland, G. (1994).
Finite Element Methods for Developmental Biology. In International Review of
Cytology, 150, Academic Press, Inc. pp. 95-118.
(1995). Protein Molecules as Computational Elements in Living Cells.
Nature 376, 307-312.
Bryant, S. H. and Lawrence, C. E.
(1993). An Empirical Energy Function for Threading Protein Sequence Through the
Folding Motif. Proteins 16, 92-112.
(1994). Evolution in Age-Structured Populations. Cambridge University
Press, 2nd edition.
Coyne, J.A., Aulard, S. and Berry, A.
(1991). Lack of Underdominance in a Naturally Occurring Pericentric Inversion in
Drosophila melanogaster and Its Implications for Chromosome Evolution. Genetics
Coyne, J.A., Charlesworth, B. and Orr, H.A.
(1991). Haldane's Rule Revisited. Evolution 45, 1710-1714.
Davidson, L., Koehl, M., Keller, A. and Oster, G. (1995).
How Do Sea Urchins Gastrulate? Distinguishing Between Mechanism of Primary
Invagination Using Biomechanics. Development 121, 2005-2018.
Dembo, M. (1989). Mechanics and Control of the
Cytoskeleton in Amoeba proteus. Biophys. J. 55, 1053-1080.
Doering, C., Ermentrout, B. and Oster, G. (1995). Rotary
DNA Motors. Biophys. J. 69, 2256-2267.
Durrett, R. and
Levin, S.A. (1994) Stochastic Spatial Models: A User's Guide to Ecological
Applications. Phil. Trans. R. Soc. Lond. B 343, 329-350.
T. R., Major, F., Malhotra, A. and Harvey, S. C. (1994). Orientations of
Transfer RNA in the Ribosomal A and P Sites. Nucl. Acid. Res. 22, 3779-3789.
Ewald, P.W. (1995). The Evolution Of Virulence - A
Unifying Link Between Parasitology and Ecology. J. Parasitology 81, 659-669.
Eisenberg, D. and McLachlan, A. D. (1986). Solvation
Energy in Protein Folding and Binding. Nature 319, 199-203.
Ellington, C.P. and Pedley, T.J., eds. (1995). Biological
Fluid Dynamics. Company of Biologists Limited, Cambridge UK
Field, C.F., Chapin III, F. S., Matson, P. A. and Mooney, H. A. (1992)
Responses of Terrestrial Ecosystems to the Changing Atmosphere: A Resource-Based
Approach. Ann. Rev Ecol. Syst. 23, 201-235.
Findlay, J. B. C.
(1996). Membrane Protein Models. BIOS, Oxford
(1937). The Wave of Advance of Advantageous Genes. Ann. Eugen. (Lond.)
Fleischmann, R.D., Adams, M.D., White, O.,
Clayton, R.A. (1995) Whole-Genome Random Sequencing and Assembly of Haemophilus
influenzae Rd. Science 269, 496-512.
Frank, S.A. (1993).
Evolution of Host-Parasite Diversity. Evolution 47, 1721-1732.
Frank, S.A. (1994). Coevolutionary Genetics of Host and
Parasites with Quantitative Inheritance. Evolutionary Ecology 8, 74-94.
Gao, J. (1996). Hybrid Quantum and Molecular Mechanical
Simulations - An Alternative Avenue to Solvent Effects in Organic Chemistry.
Acc. Chem. Res. 29, 298-305.
Gilson, M. K., McCammon, J.
A. and Madura, J. D. (1995). Molecular Dynamics Simulation with a Continuum
Electrostatic Model of the Solvent. J. Comput. Chem. 16, 1081-1095.
Goldstein, B., and Wofsy, C., eds. (1994). Lectures on
Mathematics in the Life Sciences 24: Cell Biology. American Mathematical
Society, Providence, RI.
Golomb, D., Wang, X-J, and
Rinzel, J. (1996) Propagation of Spindle Waves in a Thalamic Slice Model. J.
Neurophys 75, 750-769.
Griffiths, R.C and Tavare, S.
(1996). Computational Methods for the Coalescent. IMA volume, P. Donnelly and S.
Tavare, eds. In press.
Guida, W. C. (1994). Software for
Structure-Based Drug Design. Curr. Opin. Struc. Biol. 4, 777-781.
Harris-Warrick, R., Marder, E., Selverston, A. and
Moulins, M., eds. (1992). Dynamic Biological Networks: The Stomatogastric
Nervous System, MIT Press
Hethcote, H.W. and Yorke, J.A.
(1984) Gonorrhea: Transmission Dynamics and Control. Lect. Notes in Biomath. 56,
Hilborn, R., Walters, C.J. and Ludwig, D. (1995)
Sustainable Exploitation of Renewable Resources. Annual Review Of Ecology And
Systematics 26, 45-67.
Ho, D. D., Neumann, A. U.,
Perelson, A. S., Chen, W., Leonard, J. M. and Markowitz, M. (1995). Rapid
Turnover of Plasma Virions and CD4 Lymphocytes in HIV-1 Infection. Nature 373,
Honig, B. and Nicholls, A. (1995). Classical
Electrostatics in Biology and Chemistry. Science 268, 1144-1149.
Humphreys, D. D., Friesner, R. A. and Berne, B. J. (1994).
A Multiple Time-Step Molecular Dynamics Algorithm for Macromolecules. J. Phys.
Chem. 98, 6884-6892.
Jaeger, L., Michel, F. , and Westhof,
E. (1994). Involvement of a GNRA Tetraloop in Long-Range Tertiary Interactions.
J. Mol. Biol. 236, 1271-1276.
Jafri, S. M., and Keizer, J.
(1994) Diffusion of Inositol 1,4,5-Trisphosphate But Not Ca2+ Is Necessary for a
Class of Inositol 1,4,5-Trisphosphate-Induced Ca2+ Waves. Proc. Natl. Acad. Sci.
Johnson, B. A. and Blevins, R. A. (1994).
NMRView: A Computer Program for the Visualization and Analysis of NMR Data. J.
Biomol. NMR 4, 603-614.
Kearsley, S.K., Underwood, D.J.,
Sheridan, R.P. and Miller, M.D. (1994). Flexibases - A Way to Enhance the Use of
Molecular Docking Methods. J. Comput.-Aided Mol. Design 8, 565-582.
Koch, C. and Segev, I., eds. (1989). Methods in Neuronal
Modeling: From Synapses to Networks, MIT Press, Cambridge MA. 2nd Edition, in
Kontoyianni, M. and Lybrand, T. P. (1993).
Three Dimensional Models for Integral Membrane Proteins: Possibilities and
Pitfalls. Perspect. Drug Disc. Design 1, 291-300.
and LeMasson, G. (1994) Rhythmogenesis, Amplitude Modulation, and
Multiplexing in a Cortical Architecture. Proc. Natl. Acad. Sci. USA 91,
Kreusch, A. and Schulz, G. E. (1994) Refined
Structure of the Porin from Rhodopseudomonas blastica. Comparison with the Porin
from Rhodobacter capsulatus. J. Mol. Biol. 243, 891-905.
(1993). Risks of Population Extinction from Demographic and
Environmental Stochasticity. Am. Nat. 142, 911-927.
(1994). Risks of Population Extinction from New Deleterious Mutations.
Evolution 48, 1460-1469.
Lander, E.S. and Waterman, M.S.
(1995). Calculating the Secrets of Life, National Academy Press, Washington,
Lauffenburger, D. A. and J. J. Linderman (1993).
Receptors: Models for Binding, Trafficking, and Signaling. Oxford University
Leahy, D.J., Hendrickson, W.A., Aukhil, I.
and Erickson, H.P. (1992). Structure of a Fibronectin Type III Domain from
Tenascin Phased by MAD Analysis of the Selenomethionyl Protein. Science 258,
Lee, C. and Subbiah, S. (1991). Prediction of
Protein Side-Chain Conformation by Packing Optimization. J. Mol. Biol. 217,
Levitt, M. (1992). Accurate Modeling of Protein
Conformation by Automatic Segment Matching. J. Mol. Biol. 226, 507-533.
Levin, S.A. and Pacala, S.W. (1996). Theories of
Simplification and Scaling of Spatially Distributed Processes. In press, 1997.
In Spatial Ecology: The Role of Space in Population Dynamics and Interspecific
Interactions. D. Tilman and P. Kareiva, eds, Princeton University Press,
Lewis, M.A., Kareiva, P. (1993). Allee
Dynamics and the Spread of Invading Organisms. Theoretical Population Biology
Lyubchenko, Y. Shlyakhtenko, L., Harrington,
R., Oden, P. and Lindsay, S. (1993). Atomic Force Microscopy of Long DNA:
Imaging in Air and Under Water. Proc. Natl. Acad. Sci. USA 90, 2137-2140.
Misra, V.K., Hecht, J.L., Sharp, K.A., Friedman, R.A. and
Honig, B. (1993). Salt Effects on Protein-DNA Interactions - The Lambda-CI
Repressor and EcoRI Endonuclease. J. Mol. Biol. 238, 264-280.
Mogilner, A. and Oster, G. (1996). Cell Motility Driven by
Actin Polymerization. Biophys. J., in press.
and Hunt, T. (1993). The Cell Cycle: An Introduction. New York, W.H.
Murray, J. and Oster, G. (1984). Cell Traction
Models for Generating Pattern and Form in Morphogenesis. J. Math. Biol 19,
Naranjo, D., Latorre, R., Cherbavaz, D., McGill,
P., and Schumaker, M. F. (1994). A Simple Model for Surface Charges on Ion
Channels. Biophysical J. 66, 59-70.
OTA (Office of Technology
Assessment), (Sept, 1993). Harmful Non-Indigenous Species in the United
States. OTA-F-565. US Govt. Printing Office, Washington D.C.
Oliver, T., Dembo, M. and Jacobson, K. (1995). Traction
Forces in Locomoting Cells. Cell Motil. Cytoskel. 31, 225-240.
Olsen, L., Sherratt, J. and Maini, P. (1995). A
Mechanochemical Model for Adult Dermal Wound Contraction and the Permanence of
the Contracted Tissue Displacement Profile. J. Theor. Biol. 177(2), 113-128.
Orengo, C. A., Swindell, M. B., Michie, A. D., Zvelebil,
M. J., Driscoll, P. C., Waterfield, M. D. and Thornton, J. M. (1995). Structural
Similarity between Pleckstrin Homology Domain and Verotoxin: The Problem of
Measuring and Evaluating Structural Similarity. Prot. Sci. 4, 1977-1983.
Peskin, C. and Oster, G. (1995). Coordinated Hydrolysis
Explains the Mechanical Behavior of Kinesin. Biophys. J. 68(4), 202s-210s.
Rayment, I. and H. Holden (1994). The Three-Dimensional
Structure of a Molecular Motor. TIBS 19, 129-134.
and Petsko, G.A. (1996). A User's Guide to Protein Crystallography. In
Protein Engineering and Design, P.R. Carey, ed. Academic Press, San Diego.
Roughgarden, J. (1979). Theory of Population Genetics and
Evolutionary Ecology: An Introduction. Macmillan, New York.
Rybenkov, V.V., Cozzarelli, N.R. and Vologodskii, A.V.
(1993). Probability of DNA Knotting and the Effective Diameter of the DNA Double
Helix, Proc. Nat. Acad. Sci. 90, 5307-5311.
and Olson, W.K. (1992). Supercoiled DNA Energetics and Dynamics by
Computer Simulation, J. Mol. Biol. 223, 1089-1119.
(1994). Kinesin-Based Organelle Transport. In Modern Cell Biology:
Microtubules. J. S. Hyams and C. W. Lloyd, eds. New York, Wiley-Liss. 13: pp.
Senderowitz, H., Guarnieri, F. and Still, W. C.
(1996), A Smart Monte Carlo Technique for Free Energy Simulations of
Multiconformational Molecules, Direct Calculation of the Conformational
Populations of Organic Molecules. J. Amer. Chem. Soc. 117, 8211-8219.
Shadlen, M. and Newsome, W. (1994), Noise, Neural Codes
and Cortical Organization. Curr. Opin. Neurobiol. 4, 569-579.
Silver, R.B. Calcium, BOBs, Microdomains and a Cellular
Decision: Control of Mitotic Cell Division in Sand Dollar Blastomeres, Cell (in
Simmons, A. H. , Michal, C. A. and Jelinski, L. W.
(1996). Molecular Orientation and Two-Component Crystalline Fraction of Spider
Dragline Silk, Science 271, 84-87.
Smith, K. C. and Honig,
B. (1994). Evaluation of the Conformational Free Energies of Loops in Proteins.
Proteins: Structure, Function, and Genetics 18, 119-132.
S.B., Cui, Y. and Bustamante, C. (1996). Overstretching B-DNA: The
Elastic Response of Individual Double-Stranded and Single-Stranded DNA
Molecules, Science 271, 795-799.
Softky, W.R. (1995).
Simple Codes Versus Efficient Codes. (Commentary) Curr. Opin. Neurobiol. 5,
Softky, W.R. and Koch, C. (1993). The Highly
Irregular Firing of Cortical Cells is Inconsistent with Temporal Integration of
Random EPSPs. J. Neuroscience 13, 334-350.
et al. (1996). Determination of DNA Helical Repeat and of the
Structure of Supercoiled DNA by Cryo-Electron Microscopy. In Mathematical
Approaches to Biomolecular Structure and Dynamics, IMA Proceedings 82, Springer
Verlag, New York, p. 117.
Steinhoff, H. J., Mollaaghabada,
R., Altenbach, C., Khorana, H. G. and Hubbell, W. L. (1994). Site-Directed Spin
Labeling Studies of Structure and Dynamics in Bacteriorhodopsin. Biophys. Chem.
Stuart, G.J. and Sakmann, B. (1994). Active
Propagation of Somatic Action Potentials into Neocortical Pyramidal Cell
Dendrites. Nature 367, 69-72.
Sumners, D.W., Ernst, C.,
Spengler, S.J. and Cozzarelli, N.R. (1995). Analysis of the Mechanism of DNA
Recombination Using Tangles, Quarterly Reviews of Biophysics 28, 253-313.
Svoboda, K. and S. Block (1994). Force and Velocity
Measured for Single Kinesin Molecules. Cell 77, 773-84.
J.S., Kishino, H. and Felsenstein, J. (1992). Inching Toward Reality: An
Improved Likelihood Model of Sequence Evolution. J. Mol. Evolution 34, 3-16.
Tilman, D. (1994) Competition and Biodiversity in
Spatially Structured Habitats. Ecology 75, 2-16.
Tirrell, J. G.,
Fournier, M. J., Mason, T. L. and Tirrell, D. A. (1994). Biomolecular
Materials. Chem. Eng. News, December 19, 40-51.
Tranquillo, R. T.,
and Alt, W. (1996). Stochastic Model of Receptor-Mediated Cytomechanics
and Dynamic Morphology of Leukocytes. J. Math. Biol. 34, 361-412.
Tranquillo, R. and J. D. Murray (1993). Mechanistic Model
of Wound Contraction. J. Surg. Res 55, 233-47.
and Wiener, P. (1994). Migration in Variable Environments: Exploring Life History
Evolution Using Structured Population Models. J. Theor. Biol. 166 75-90.
Tuljapurkar, S. (1994). Stochastic Demography and Life
Histories. In Frontiers in Mathematical Biology, S.A. Levin, ed.
Springer-Verlag, Berlin, pp. 254-262.
Tyson, J. J., Novak,
B., Odell, G. M., Chen, K., and Thron, C. D. (1996). Chemical Kinetic Theory:
Understanding Cell-Cycle Regulation. Trends in Biochemical Sciences 21, 89-96.
Walters, C. and Maguire, J.J. (1996). Lessons For Stock
Assessment from the Northern Cod Collapse. Reviews In Fish Biology And Fisheries
Walters, C. and Parma, R.M. (1996). Fixed
Exploitation Rate Strategies for Coping with Effects of Climate Change. Canadian
Journal Of Fisheries And Aquatic Sciences 53, 148-158.
N. (1996). Yeast Genome Sequence Ferments New Research, Science 272,
White, J.H., (1992). Geometry and Topology. In
Proceedings of Symposia in Applied Mathematics 45, American Mathematical
Society, Providence, R.I., 17.
Whittington, M.A., Traub,
R.D., and Jefferys, J.G.R. (1995). Synchronized Oscillations in Interneuron
Networks Driven by Metabotropic Glutamate Receptor Activation. Nature 373,
Wofsy, C., Kent, U. K., Mao, S-Y., Metzger, H.,
and Goldstein, B. (1995). Kinetics of Tyrosine Phosphorylation When IgE Dimers
Bind to Fce Receptors on Rat Basophilic Leukemia Cells. J. Biol. Chem. 270,
York, D.M., Yang, W.T., Lee, H., Darden, T.
and Pedersen, L.G. (1995). Toward the Accurate Modeling of DNA - The Importance
of Long-Range Electrostatics. J. Amer. Chem. Soc. 117, 5001-5002.
Zadoks, J.C. and Van Den Bosch, F. (1994). On The Spread
Of Plant Disease - A Theory On Foci. Annual Review Of Phytopathology 32,