powerful general mathematical programs noted above, there exist several sets of
more specifically targeted software with the capacity to generate, portray and
quantify the behavior of nonlinear continuous and discrete abstract and real
dynamical systems. These often also include algorithmic modules that are useful in
tailoring new models and measures (see for examples, Parker and Chua, 1989;
Baker and Gollub, 1991; Nusse and Yorke, 1991; Sprott and Rowlands, 1991;
Sprott, 1993; Korsch and Jodl, 1994; Enns and McGuire, 1997). Learning from and
using this software, along with only a little programming in the high level languages
and computer algebra programs listed above, permit the non-mathematician
neuroscientist, willing to read in the literature such as that described below, to do
independent, cutting edge research in applied dynamical systems.
Described below will be the computational discoveries in abstract systems
and real neuroscientific data that have led to multiple contexts of quantitative
description. These include those that are: (1) Geometric and conserve metric
distances; (2) Topological and conserve relative positions but not distances; (3)
Single or multiple global quantitative descriptors such as scaling numbers or scaling
187
number spectra; (4) Non-Gaussian distributions with heavy tails and correlations
reflected in their Hurst, Fano, Allan and Levy exponents; (5) Statistical dynamical
descriptions of trajectories of the system in their embedding space such as
Lyapounov exponents, Hausdorff-Mandelbrot dimensions, Sinai-Ruelle-Bowen
measures, and Adler-Weiss-Ornstein topological and metric entropies.
Characteristics which discriminate between experimental versus control
conditions in parametric computational and real physiological and pharmacological
experiments serve to generate and test ideas and imagery arising out of behavior
observed in both biological and abstract dynamical realms. New experiments can
be suggested by the implicative structure of dynamical systems theory as well as
neurobiological findings and intuitions. As examples, the sudden “switch” of manicdepressive
bipolarity syndromes may be a “bifurcation” in nonlinear dynamical
systems; the “noise” of the statistical physicist may be the “arousal” of the brain
stem-thalamic biogenic amine and reticular formation neurophysiologist; aspects of
“thought disorder” in the pathophysiology of schizophrenic patients may be an
entropic sequencing idiosyncrasy in the “symbolic dynamics” of a particular brain
system attractor; neuronal “bursting” may be the “intermittency” of a neurodynamical
system; a multiplicity of “discrete ion channel conductances” may be a single “global
scaling hierarchy” of conductances times. The number of published examples of this
fusion of ideas and methodology in the biological-relevant literature is already in the
several hundreds and Medline counts indicate is growing exponentially.
Representative samples of these are described below.
In addition to the technological advances in computational hardware and
software, the major scientific surprise making this new era possible is the discovery
of universalities, the finite set of behaviors characteristic of most, if not all nonlinear
systems, across most if not all of the specific equations or neural systems being
explored. This makes the emergence of semi-quantitative equivalence relations
between model and data not only possible but likely, even though we don’t now and
perhaps never will know enough to either write or solve completely the specific and
detailed equations for the biological system of interest. We neuroscientists need not
be apologetic for using these ideas and tools qualitatively and empirically. In fact,
188
unanticipated results of analog and digital computer experiments were responsible
for most if not all of the discoveries underlying the current era’s revolution in applied
nonlinear mathematics
.
Modern Applied Dynamical Systems Emerged from Accidental
Computational Discoveries
A medical student named Herr, in his thesis research with the “radio
engineer”, Van der Pol (1926), was simulating cardiac electrophysiology with an
analog device which permitted real time, exploration of a full range of parameter
values long before there were fast enough digital processors to do so. Studying the
behavior of equations of a periodically, pace maker, driven, nonlinear triode
oscillator, Herr found orbital points that appeared to belong to two different periods
simultaneously thus violating the uniqueness of solutions of differential equation
theory. The Van der Pol relaxation oscillator equations, with their slow buildup and
sudden discharge of membrane potential are good models for the slow-fast
processes of repolarization and depolarization of Hodgkin-Huxley type equations
(Rinzel, 1985). Periodically driven, nonlinear differential equations of the Van der
Pol type are generally applicable to the multiplicity of dynamical regimes of neuronal
dynamics (Carpenter, 1979; Aihara et al, 1984; Chay and Rinzel, 1985) and, with
periodic and aperiodic driving and noise, can be made relevant to particular
mammalian neuronal subsystems in the context of clinically relevant global
electrophysiological phenomena such as Magoun’s (1954) brain stem evoked EEG
and behavioral arousal (Nicolis, 1986; Selz and Mandell, 1992; Mandell and Selz,
1993).
In the early 1940’s, using the pre-publication results of similar analog
computer studies (Levinson, 1949), the Cambridge mathematicians, Mary
Cartwright and Joe Littlewood (1945; McMurran, S., Tattersal, J.,1999) used
geometric methods to prove that the highly nonlinear, periodically driven Van der
Pol equations, depending upon one or two changing parameters, generated fixed
point (“homeostatic”), periodic (“cyclic”), subharmonic (“period doubling”),
quasiperiodic (“multiply periodic”), intermittent (“bursting”) and “deterministically
189
random” patterns. We now know such phenomena to be universal characteristics of
bifurcation scenarios in nonlinear dynamical systems where bifurcation means
discontinuous changes in patterns of behavior (dependent variables) resulting from
smooth changes in parameters (independent variables). Alerted to their presence in
computer experiments with biologically relevant nonlinear differential equations,
these phenomena have since been found in time series from patch clamped
membrane channels, single neurons, neuronal networks, neuroendocrine systems,
brain waves and patterns of behavior in animals and man (see below). Cartwright-
Littlewood found that the inner and outer edges of the domains of attraction (all the
initial values that eventually wind up in the attractor—the limit set of all bounded
solutions) of two different sets of subharmonic periods for the same parameter
settings were interlaced at many scales in what is today called a fractal basin
boundary. It was in this way that the specific values of the end state are understood
to be indeterminate since the starting values in the fractal basin boundary are
impossible to isolate and specify with adequate experimental precision.
Similar biologically-relevant analog computer discoveries about the Van der
Pol and comparable periodically forced, dissipative (energy utilizing) Duffing
equations (Zeeman, 1976) were made in the early 1960’s by electrical engineer,
Yoshi Ueda (1992), but his thesis director, Chihiro Hayashi of Japan’s Kyoto
University, was sufficiently disturbed by this evidence for the existence of bounded
solutions (attractors) that were neither fixed points (equilibria) nor periodic orbits
(cycles), the only ones known at the time and therefore “strange,” that he refused to
let Ueda publish his findings until he did so as an independent investigator in the
1970’s.
In the early 1960’s, Edward Lorenz (1963), a meteorologist and student of
the Harvard mathematician and dynamical systems pioneer, George Birkoff (1922),
was computing the output of a very reduced subset of Saltzman’s differential
equations for predicting the weather (1962). Lorenz found that numerically
integrated trajectories manifested unpredictable times and directions of motion
between the two spiral orbits of what has come to be known as the Lorenz attractor.
Very small differences in starting values led to widely diverse final values, and, just
190
as importantly, far apart initial values could be found close together in the limit set.
This behavior was called “sensitivity to initial conditions” by David Ruelle (1978;
Ruelle and Takens, 1971). It is noteworthy, however, that over a range of values of
the parameters, the overall pattern of the orbits of the Lorenz attractor results in
characteristic geometric pictures as well as invariant statistical descriptors.
Qualitative and quantitative global similarities were gained while specific solutions
were lost in these “strange attractors” of nonlinear systems. Analog computer
simulation of a simpler set of equations inspired by nonlinear chemical reaction
kinetics led to the discovery by Rössler (1976) of another early and generic strange
attractor combining sensitivity to initial conditions and characteristic geometries and
measures.
It was the Russian mathematicians, A.N. Kolmogorov (1957), Sinai (1959)
and V.I. Arnold (Arnold and Avez, 1968), the French mathematicians, Rene Thom
(1972) and David Ruelle (1978) and the U.C. Berkeley mathematicians, Steve
Smale (1967) and his student, Rufus Bowen (1975), and their associates who
gathered together these and other related computational discoveries and embedded
them in a qualitative theory of nonlinear differential equations, using a variety of
formalisms, including point set and differential topology, geometry, analysis and
ergodic (having an invariant statistical description) measure theory that formally
established the fundamentals for research in nonlinear dynamical systems. Here a
dynamical system refers generally and simply to the components and nonlinear
processes (transformations) that move points (values) in discrete (“map”) or
continuous (“differential equation”) time around in an appropriately defined space.
The phrase, “nonlinear transformation” in this context does not imply easily solvable
curved functions, such as those representing the sigmoid kinetic or threshold
functions of enzymes and neuronal networks or those that smoothly log transform
the amplitudes of auditory or other sensory modalities in man, but rather allude to
expressions containing products, powers and functions of the computational and/or
3
experimental variables , such as xx , ( x i
or si n( x)
.
x i
1 2
)
191
As noted above, the cross-disciplinary cohesiveness of such a vaguely
defined field occurred as the result of the unanticipated discovery of a relatively
small set of nonlinear phenomena, universalities, that implicated many fields of
mathematics, from differential geometry to number theory, and were found in a
broad range of physical and biological realizations, from turbulent plasmas and
chemical and enzymatic reactions to neuroendocrine hormone release patterns. It is
perhaps counter-intuitive but, whereas linear systems can generate an infinite
number of solutions locating points anywhere the person writing the equations
wants them to go, nonlinear systems are generally restricted to a finite set of global
dynamics and these emerge on their own from the intrinsic dynamics of the system.
Trying to make these systems follow orders, not unlike finding the most clinically
effective dosage range of a psychopharmacological agent, require the empiricism of
trial and error experiments.
A second class of computational accidents involving nonlinear systems
resulted in unanticipated coherence rather than unpredictable disorder. Using one
of the early “high speed” digital computers at Los Alamos, MANIC I, Enrico Fermi
with Pasta and Ulam (1955) attempted to obtain a many-body statistical
thermodynamic equilibrium analogous to heat generated noise by coupling 64
particles together with nonlinear springs. They found only a few low period modes
that oscillated indefinitely. Instead of equidistribution of the energy into 128 degrees
of freedom (64 locations × 64 velocities in 128 dimensional phase space), they
found it gathered up into only few coherent modes. Although the relevance to
biological science of nonlinear multifrequency coherence is a bit off from our focus,
it is worthwhile noting that a recent (Karhunen-Loeve) decomposition of the alpha
band of the resting alert human EEG revealed only three dominant temporal–spatial
modes: anterior-posterior, rotational and standing (Friedrich et al, 1991) and “few
frequency coherence” is a frontier of inquiry in brain wave research.
A heterogeneous collection of coupled nonlinear elements in the form of
widely distributed, multi-location, multifrequency systems such as cross-cortical,
brain stem-thalamic-cortical and interconnected spinal motor neurons can generate
192
coherent activity. This temporal and phase coherence plays an important role in
current theory of sensory-associative-motor integrative function, how distributed
attributions come together in the brain representation of a single object or process,
in the context of the so-called “binding problem” (Singer, 1993; Bressler, 1995;
Nicolelis, 1995; Schiff et al, 1997). Diffusely distributed neurochemical variables
have been invoked. For example, the role of metabotropic glutamate receptors in
driving the synchronization of interneuronal networks has been suggested as a
mechanistic model (Whittington et al, 1995). The objects of relevance to the
discovery of Fermi-Pasta-Ulam are studied as the nonlinear physics of
nondissapative wave processes and are called solitons (Zabusky and Kruskal,
1965). They have been invoked to model nerve conduction and information
transport in brain (Scott, 1990).
A third counter-intuitive set of accidental computational findings is in an area
of research called symbolic dynamics which involves the universal parameterdependent
coding language and capacity of nonlinear systems. In the early 1960’s,
a group around Stan Ulam at Los Alamos (Cooper, 1987) used one of the early
“high powered” computers, MANIAC II, to iterate (letting the output of the action of a
discrete time function serve as its input the next time around) simple equations they
called “maps.” These reduced dimensional objects shaped like tents, sine functions
and parabolas can be extracted from and represent the behavior of higher
dimensional, nonlinear differential equations (see Devaney, 1989; Schuster, 1989 or
Moon, 1992 for intuitive descriptions). They varied a parameter, such as the height
of the tent or parabola, to systematically change the period and/or phase (order) of
the symbol sequence (Metropolis et al, 1973). Normalizing the range of values of
the output to [0,1] and transforming the series of values into a binary code, L ≤ 0.5
and R > 0.5, they found an invariant, one parameter dependent, progression of
ordered periods, R, RLR, RLRR…RLLRL…, in all such single maximum maps. This
“U (universal)sequence” has also been found as singly or multiply present in a
variety of real systems, including complicated chemical reactions (Simoyi et al,
1982; Coffman et al, 1986). This means one can “dial” the parameter to generate
“words” of sufficient computational complexity to serve as a language. These
193
computer experimental findings had already been anticipated in a remarkable
mathematical proof by Sharkovskii (1964).
The dynamical richness of these simple, single maximum, one dimensional
maps was computationally explored in the context of ecological and epidemiological
issues in the classical studies of Robert May (1976). It has been possible to relate
the individually characteristic L, R sequence behavior of human subjects on a
computer task to a unique parameter of a tent map generating those sequences
which predicted age and discriminated subclinical obsessive compulsive from
borderline syndromes (Selz and Mandell, 1993). The dynamical entropy of
unstructured L,R behavior also discriminated a population of schizophrenic patients
from normals (Paulus et al, 1996). More generally, parameter dependent dynamical
coding, built into the universal behavior of its constitutive equations, is a mechanism
with which a nonlinear dynamical system, such as nerve membrane equations as
above, or in the aggregate, the middle layer of a completely connected neural
network, can encode, Morse code-like, messages (Paulus et al, 1989).
Bifurcations in Biologically Relevant Dynamical Systems
Bifurcations, “splitting into (two) branches,” are observed over a smooth
change in control parameter(s) (independent variables), as a discontinuous and
qualitative change in the dynamical (time-dependent) pattern of the observable
(Guckenheimer and Holmes, 1993; Wiggens, 1990; see Strogatz, 1994, for a
particularly intuitive description). Qualitative here means how the dynamics of the
trajectory appear as a geometric-topological (relative shaped not necessarily sized)
pattern in phase space. In such a space, the orbital points are located along the x-
axis by their value, x at time t, and along the y-axis by their time rate of change at
that t, dx . To visualize a representative phase portrait in the plane, start by
dt
imagining the pattern made by mass hanging on a linear spring at rest as
dx
represented by a point centered at x = 0, y = = 0. When perturbed from rest, the
dt
194
phase portrait of the motion of this “harmonic oscillator,” is composed of a
(continuous) series of points representing its location, graphed along x , its rate of
motion graphed along , y =
dx dx
. x and
dt dt
co-localize the circular orbit as it speeds
up and slows down while it bobs up and down. The transition from a fixed point (the
mass at rest) to a circle (the bobbing mass), a bifurcation in phase space, results in
the loss of topological equivalence. That is, the phase space geometries before and
after the bifurcation cannot be smoothly distorted into each other. Continuity and
connectedness of the space is lost. For topological equivalence, stretching, bending
and warping are allowed but not tearing apart and/or gluing together. Following the
bifurcation of a fixed point into a circle, even limitless shrinking of the ring leaves a
hole. The appearance or disappearance of an equilibrium fixed point (called a
“saddle-node” bifurcation), splitting into two (“period doubling” bifurcation), its
exploding into a circle (“Hopf” bifurcation to a limit cycle), a circle splitting into two or
more incommensurate cycles (“secondary Hopf” bifurcation) and these multiperiodic
(“quasiperiodic”) dynamics breaking down into a recursive spirals (“homoclinic
bifurcation to chaos”) are among the common bifurcations in nonlinear dynamical
systems, and all of them have been observed in many neurobiological settings.
In the forced-dissipative (energetically driven and energy consuming)
dynamical systems relevant to the neurosciences---this characteristic contrasts with
the dissipation free momentum of the classical mechanics of astrophysical bodies---
there are four “most generic” bifurcation scenarios as a parameter changes that
may, but need not, lead to chaos (see below for definition) (for early and physically
oriented treatments see Eckmann, 1981; Ott, 1981; Berge′ et al, 1984, Kaneko,
1983). These scenarios are: (1) Fixed point or cycle splittings into twice-as-long
period lengths 1→2→4→8→16…called the subharmonic or “period doubling route”;
(2) The transformation of fixed points to one and then more periodic orbits, multiple
independent (nonharmonic, incommensurate) frequency oscillations, their mode
lockings and then breakdown called the “quasiperiodic route”; (3) Fixed point or
cyclic equilibria metamorphosed into irregular bursting patterns called the
“intermittency route”; and (4) In the context of quasiperiodic dynamics, adjacent
195
nonharmonic frequency encoding parameter spaces fusing, resulting in new periods
that are the sums of their adjacent ones: period 2 + period 3 = period 5, in what is
called the “period adding route”. Technically precise classification of bifurcations
involve much more careful definitions and well studied technical constraints
involving such issues as the symmetries and dimensionality of the system of
observables, how many control parameters (“codimensions”) are required to
reasonably realize the bifurcation and the particular way the fixed points of the
system become unstable, all of which are directly explorable when the equations
are known or can be hypothetically inferred from the qualitative behavior of real
data.
We note a few examples from the wide variety of bifurcating systems that
can be found in the biomedical literature of interest for the biological sciences. With
substrate input rate as the bifurcation parameter, the phosphofructokinase regulated
glycolytic cycle in yeast extract was found to change among steady state, periodic
and period doubling (subharmonic) regimes (Boiteux et al, 1975). Transitions
between steady state, oscillatory and chaotic patterns have been reported in variety
of physiological measures in man including respiratory rhythms and circulating
blood cell concentrations over time (Mackey and Glass, 1977; Glass and Mackey,
1988 ) and models of dopamine cell dynamics (King et al, 1984). Flow rate
parameter sensitive periodic, bursting and chaotic behavior has been found in a
peroxidase-oxidase system (Olsen and Degn, 1977). A brain enzyme, substantia
nigral dopaminergic tyrosine hydroxylase, manifested different saturation and
fluctuation patterns, including bursting and periodicity, in experiments in which low
(physiological) levels of tetrahydrobiopterin cofactor were the bifurcation parameters
and adrenergic drugs were used as modulators (Mandell and Russo, 1981).
All four of the generic bifurcation routes to chaos, period doubling, changing
multifrequency (quasiperiodicity), period adding and bursting (called “intermittency”)
were observed in self-sustained oscillations induced in the neural membranes of
space clamped, giant squid axons that were immersed in a 550mM NaCl, and
electrically stimulated over changing amplitudes and frequencies (Aihara et al,
1986; Takahashi et al, 1990). With external stimulus current level as the control
196
parameter, the R15 cell of the abdominal ganglion of the Aplysia demonstrates
transitions between bursting and periodic modes as well as period doubling, a
signatory period 3 and the Lyapounov characteristic exponent evidence (see below)
for the discontinuous onset of chaos (Canavier et al, 1990). Manipulating feed back
delay, the human pupillary light reflex will bifurcate into regular oscillations (Milton
and Longtin, 1990). A transition between a regime of irregular discharging to
oscillatory bursting behavior was induced in basal forebrain cholinergic neurons by
neurotensin (Alonso et al, 1994). Sympathetic nerve discharge in decerebrate,
ventilated cats demonstrated transitions between periodic, multiple periodic
(quasiperiodic with changing ratios to the ventilation frequency) and subharmonic
behavior in response to inferior vena cava occlusion, vagotomy, aortic constriction
and spinalization (Porta et al, 1996). Period adding bifurcations were induced by
changing calcium concentrations or the addition of a potassium channel blocker in
the “pacemaker” formed when (rat) sciatic nerve is chronically injured (Ren et al,
1997). Changing levels of the L-type calcium channel antagonist, verapramil, alter
the pattern of vasomotion of rabbit ear arteries among sets of multiple independent
periods, “quasi-periodicity,” mode locking and chaos (De Brower, 1998). At critical
intensity and frequency, flicker visual stimulation of the salamander generates a
pharmacologically modifiable period doubling bifurcation in their ganglion cells (one
spike for every two flickers) which is also seen subjectively and in occipital lobe
evoked potentials at critical frequencies in bright, full-field flickered humans (Crevier
and Meister, 1998).
Qualitative and Quantitative Universality in Nonlinear Dynamical Systems
“Universality” (see above) entered the parlance of physics in the context of
the statistical mechanics of phase transitions near their critical points (Stanley,
1971; Stauffer, 1985; Yeomans, 1993) and has come to refer to the finite set of
transitions and quantities common to nonlinear systems arising in their
neighborhoods. A common physical example is the triple point of water-ice-steam
on the temperature-pressure phase plane where a small change in temperature or
197
pressure leads to a global qualitative change in physical state. Analogously, the loss
of topological equivalence occurs at the fixed point that, for examples, splits into two
or explodes into a cyclic orbit in phase space. The same critical point behaviors and
quantities occur in a wide variety of specific processes and their equations, and they
are independent of the way the trajectories first arrived in the fixed point
neighborhood. Once the system enters the regime of critical behavior, the predictive
significance of its dynamical history is lost. This may also be the case for emergent
psychiatric disorder (Mandell et al, 1985; Mandell and Selz, 1992; Ehlers, 1995;
Paulus et al, 1996; Huber et al, 1999).
There are diagnostic patterns of behavior when a nonlinear system is in a
neighborhood of a potential bifurcation. They include sudden and/or large jumps
resulting from a small change in experimental conditions, the appearance of big
baseline fluctuations (anomalously large variance), the lengthening of the time
required to relax following evoked or spontaneous perturbation (‘”critical slowing”),
the same global change in state occurring at different values of the parameter when
increasing versus decreasing a parameter’s value (“hysteresis”), the existence of
some range of values of the observable that cannot be attained by manipulation of
the parameter (“inaccessibility”) and the availability of two or more distinct states in
the same parameter neighborhood (“modality”) (Thom, 1972; Arnold, 1984; Gilmore,
1981). It is perhaps relevant to polydrug psychopharmacology and clinical
management that the higher the co-dimension (the greater number of effective
parameters being manipulated), the greater the accessibility and control of selected
state stability becomes with respect to difficult to obtain behaviors. Examples of the
potential advantages of simultaneous manipulation of multiple influences have been
developed for affect disorder and anorexia nervosa (Callahan and Sashin, 1987).
As evidence for the independence of critical behavior from specific history,
the qualitatively universal bifurcations along the four canonical routes to chaos
manifest dimensionless ratios of parameter and phase space geometries between
bifurcations. These ratios are quantitatively universal. The formalisms that rescale
the distances from fixed points in parameter and observable spaces result in the
same picture across scale, a dilatational symmetry (also called self similarity or
198
affinity). They are called renormalization group equations, and, with respect to
prediction, they replace any or all of the original specific predictive equations for the
particular system under study (Cvitanovic′, 1989). Whereas the U sequence and
critical point behaviors are manifestations of qualitative universality, these scaling
numbers are manifestations of quantitative universality. We discuss them here
because their omnipresence in computationally realized differential equations as
well as physical and chemical experiments along with their quantitative specificity
(with values in all systems as “constant” as π) constitute a most persuasive
argument for the substantiality of modern dynamical systems approaches to brain
and other biological research.
The physical and physiological requirements for manifestations of these
universal bifurcation scenarios can appear to be remarkably minimal. In physics, for
example a full panoply can be observed in a “dripping faucet” (Shaw,1984).
Similarly, a small piece of extirpated and perfused myenteric or femoral artery will
demonstrate these transitions in vasomotion spontaneously and almost
independent of flow rate (Stergiopulos et al, 1998).
Feigenbaum discovered that in dynamical systems manifesting a series of
period doubling bifurcations, the ratio of the parameter value at which the next
period doubling bifurcation occurred relative to the last one ≈
1
… and the ratio
4.
6692
of the magnitude of the spawning point to the one spawned ≈ 2.5. (Feigenbaum,
1979). By “rescaling” distances along a parameter value (see below) using what is
called a “universal renormalization operator” the geometric situation around each
bifurcation point (though of different absolute “size) remains relatively the same. In
intermittent systems, burst length varies as the inverse square root of the distance
of the value of the parameter from that value that elicited the fixed point (Manneville
and Pomeau, 1980). The universal characteristic of the third common parametric
route to chaos, quasiperiodicity, is that the ratio of independent frequencies found
most resistant to mode locking and breakdown into chaos is, ω i
= 1.618… the
ω
i+1
number to which the ratio of adjacent Fibonocci numbers converge (1, 2, 3, 5, 8, 13,
199
21…) (Shenker and Kadanoff, 1982). Similar quantitative scaling properties were
also discovered in the parametric period adding route (Kaneko, 1983).
All of these scaling numbers have been found in experiments and in
remarkable agreement with theory. Examples have been discovered in electronic
circuits, hydrodynamic and mercury flows, acoustic systems, laser dynamics and
oscillating chemical reactions (see Cvitanovic′, 1989, for representative list of
references). Whereas qualitative evidence for all of these bifurcation scenarios have
been found in brain relevant experiments, there is yet to be a bifurcating
experimental biological system with adequate precision across a sufficient range of
magnitudes such that quantitative universality could be demonstrated across a
sufficient range of values to be convincing. We remind ourselves that in order to
establish a Fiegenbaum number, each period doubling bifurcation of the several
required necessitates about a five-fold improvement in the experimenter’s ability to
specify the control parameter.
Using Invariant Measures of Dynamical Neurobiological Systems
Before the modern era of dissipatively forced (energy utilizing) dynamical
systems research, the known attactors of an experiment’s initial values resulted
from their convergence onto either a fixed point or a limit cycle. An attractor can be
regarded as a set which remains in bounded space and to which all orbits in this
neighborhood converge (Milnor, 1985). Since by the rules of differential equations,
orbits are required to be both smooth (graphable without lifting the pencil) and
unique (different trajectories don’t intersect since the point of intersection would no
longer be unique), the foundational Poincare-Bendixon theorem says that any such
orbit confined to a two dimensional phase space that doesn’t converge to a fixed
point must, no matter how long it irregularly wanders, must, eventually intersect with
itself and then go around the same route again in a (perhaps very long) cycle. In
most neuroscience research as well, we have generally regarded our data as
manifesting either tolerable (or intolerable) fluctuations around mean values (fixed
200
points) or more or less regular cycles. We analyze our “fixed point” data using
quantities such as the mean and variance of distributional statistics and the cycle
data using the amplitude, frequency, cycle length and phase of trigonometric
functions. In central tendency-oriented research, rare, very high amplitude events
have usually been considered aberrations and tossed, and imperfect periodic
behavior is treated by “cosiner analysis” as regular cycles contaminated by
measurement or system noise. Whereas technically, chaotic dynamics must live in
dimension greater than two (for orbits to be more than a fixed point or limit cycle,
able to snake around without necessarily intersecting ), the Lorenz attractor has
dimension just a little over two, our difficulties with establishing the “true”
physiological dimension of real biological observables (see below) makes such a
consideration more theoretical than practical.
The orbits of a forced-dissipative dynamical system in a parameter regime
engendering chaos, converge onto an attractor which is neither a fixed point nor a
limit cycle, thus the origin of the name “strange attractor” (Ruelle and Takens,
1971). It was James Yorke that first named these dynamics “chaos” (Li and Yorke,
1975). The necessarily statistical properties of the chaotic orbits on strange
attractors follow from the generic characteristics of their motions (see Shaw, 1981
for a still conceptually current, non-mathematical treatment). These kinds of
statistics are studied in a research context called the “ergodic theory of dynamical
systems” (Ruelle, 1979; Eckmann and Ruelle, 1985). Ergodic is a word used to
characterize a system with (or without) a particular condition placed on its statistical
measures: the existence of an invariant measure which is undecomposabile into
two invariant measures and, equivalently (though not obviously) one in which the
time average equals its average in the geometric space into which it is embedded.
One may arrive at the same ergodic measure from studying a single very long orbit
or from summing across many individual but shorter orbits. This ergodic
equivalence is made possible due to the definitional existence of at least one
invariant statistical measure and the dynamics of the system which ideally include a
uniformly, sequence disordering process called “mixing” (see below).
201
Of course, most real biological dynamics are not uniformly mixing and so are
non-ergodic, but we shall see that the ways they fail to be ergodic (and thus remain
in the conceptual context of ergodic measures) are descriptively useful (Mandell
and Selz, 1997a). The emergence of many statistical approaches to characterizing
these motions have been accompanied by the expected controversies about which
is best or correct (see below) and have been applied to the problem of diagnosis
and clinical discrimination in a variety of neuroscience settings. In ideal abstract
chaotic dynamical systems called Axiom A (Russians called them “C systems”),
where most mathematical theorems are proven (Smale, 1967), all these measures,
if properly computed, are equivalent. In real life, as in the related case of ergodicity,
they are not, and since no single one is complete, the more (incomplete) measures
we use in our studies along with interest in the way that they differ, supplies more
useful information about the system. Though researching and elucidating the most
reliable and valid ways of computing these measures are a valuable goal, the
current debates focused on the superiority of a single particular measure,
constructed in a particular way in relationship to issues of insoluble absolutes like
“randomness” versus “deterministic chaos may not be particularly valuable for
uncovering new characteristics and potential mechanisms underlying a specific set
of real neurobiological observables.
Emphasizing diversity and relevance to the clinical biological sciences, we
note that quantifying patterns in ergodic (non-ergodic) measures have aided: the
discrimination between normal and abnormal opticokinetic nystagmus in neurology
patients (Aeson et al, 1997); localizing a two year old subcortical stroke in an EEG
of a patient with no other signs or neurological findings (Molnar et al, 1997); the
diagnosis of early (not late) multiple sclerosis, as a nonspecific long tract disorder,
in patients with mild optical neuritis using cardiac rate dynamics (Ganz and
Faustman, 1996); seizure prediction from minutes to hours before the event in
which subthreshold, pre-phase transition spatial diffusion and oscillations in
characteristic changes in these measures can be found (Martinerie et al, 1998;
Elger and Lehnertz, 1998; Pign et al, 1997; Iasemidis et al, 1990); using these
measures on the EEG to differentially predict hereditary predisposition to alcoholism
202
(Ehlers et al, 1995); indicating the presence or absence of septic encephalopathy
(Straver et al, 1998); using time series from jejunal manometry to discriminate
objectifiable somatic from psychological conversion related irritable bowel syndrome
(Wackerbauer et al, 1998); analyzing time-dependent patterns in plasma hormone
levels to discriminate between the presence or absence of a functioning tumor
(Hartman et al, 1994, Mandell and Selz, 1997a); automated differentiation of ataxic
from normal speech (Accardo and Menulo, 1998); and discrimination of
temporomandibular joint dysfunction from normal patterns of chewing motions
(Morinushi et al, 1998).
Styles of Orbital Motions in Chaotic Dynamical Systems
In chaotic dynamics, in various specific ways, an initial hypothetical handful
of points lined up along the trajectory and acted on over time by the nonlinear
differential equation (“operator”), get out of order in an unpredictable way. Here the
hypothetical handful can come from a statistical aggregate of initial conditions or
from a single recursive orbit studied over long times. As noted above, ergodic
theorists call this getting out of order “mixing” and how and to what degree this
happens consumes many mathematical theorems but for purposes of brain
research, it can be best described using a variety of statistical measures. For
example, visualizing the Lorenz attractor (see above) as a butterfly in phase space,
the points get out of order because as they spiral out (“stretching”) to the edge of
one wing and return (“folding”) to the unstable fixed point on the butterfly’s body
whence they either jump to some place on the other wing to spiral to its edge or
return to the same wing to spiral out again. Which one of these is chosen is
exquisitely sensitive to very small changes in where the trajectory started and very
small fluctuations in where it returned to the unstable fixed point on the butterfly’s
body. In fact, specification of these locations is beyond the precision of any real,
thermodynamically vulnerable system.
Chaotic trajectories on the Rössler attractor (see above) wind out (“stretch”)
to the edge along the inside of a conch shell in phase space and then are mapped
203
back (“fold”) into the spiral unpredictably somewhere in a mixing mechanism that
has been called “displaced reinjection.” In the slow-fast oscillations of the forced van
der Pol in the chaotic regime, points in the slow phase (“repolarization”) jitter around
and step on each other’s heels, getting out of order while waiting on the ledge
before jumping (“depolarization”) to the next slow phase (“repolarization”) at some
unpredictable time, thus generating a variably irregular series of interspike intervals.
Stretching and folding are also responsible for getting points get out of order
in the single maximum map of the unit interval (studied for universal qualitative and
quantitative properties by May and Feigenbaum and others as described above).
With increases in parameter values, the parabolic hill function onto which the unit
line has been stretched gets steeper, more stretched. Mapping points on the hill
back onto the straight line of the unit interval results in what amounts to the line
folding back on itself. This stretching and folding eventually fills the line with points,
but their sequence, from end to end, gets shuffled like a deck of cards.
As described more generally above, points that start as neighbors may get
separated (“divergence along the attractor”) and those that start at a distance from
each other may be thrown together (“compression back onto the attractor”). These
expanding and folding motions that characterize the chaotic behavior on strange
attractors have been likened to the actions of a taffy puller (Rössler, 1976). It is in
this way that nearby points can separate without leaving the attractor. It is also the
case that once indistinguishably close but then separated points may be
compressed together again generating new, temporary (unstable) cycles of all
possible period lengths. These unstable fixed points may be the most important
feature of chaotic systems from the standpoint of new ideas about brain
mechanisms (Pei and Moss, 1996; So et al, 1997). This aggregation of unstable
loops can occur from points fluctuating away and back to the attractor as well as
during the crowding of points at the turns after their stretching out on more linear
parts of the flow. Under the mixing flow of a chaotic dynamics, it is also true that a
single point eventually explores the entire attractor, no attractor location is
inaccessible to it.
204
Although counter-intuitive when expressed in words, the trajectories that one
sees in the graphics of chaotic attractors result from the actions along the “unstable”
directions of the stretching distortion; the actions in the otherwise invisible stable
directions “iron down” the points onto this unstable manifold (n dimensional abstract
surface).
As might be expected from this set of characteristic motions, the diagnostic
triad of chaotic dynamical systems are: (1) Sensitivity to initial conditions—tiny
distances between starting points are magnified and large distances between
starting points are reduced under the stretching and folding actions of the system;
(2) The presence of a theoretically infinite but countable number of unstable
periodic orbits of theoretically all period lengths—points in phase space can be
viewed an attractive-repellers, visited and left by the orbits recursively as the
dynamics proceed; and (3) Indecomposability—the attractor is not separable into
isolated regions and no points escape (see Devaney, 1989, for one of the clearest
definitions). Of particular relevance to information encoding and transport by brain
mechanisms, it is important to visualize that new information in the form of unstable
periodic orbits is being created as well as destroyed by the dynamics. The
logarithmic rate of formation of these new orbits is computed as the system’s
topological entropy (see below).
Assuming the real neurobiological system under study is behaving in these
ways (and often much has to be done to help justify such a claim), the observables
take the form of an irregular and/or episodic time series of amplitudes, as in
repeated sample, neuroendocrine studies of plasma hormone levels (Veldhuis and
Johnson, 1992) or a sequence of times between events as in neuronal interspike
intervals (Katz, 1966; Perkel et al, 1967). These time or time-sequence series are
generally studied from three relatively distinct yet complementary quantitative
perspectives: (1) As stochastic (“random”) processes with various amounts of
sequential dependency (autocorrelations) and scale (sample length) dependencies;
(2) As “deterministic” smooth or discrete, vectorial geometries in phase space
following reconstruction and/or embedding of the series as phase portraits or return
maps; (3) As information generating and transporting, topological (about relative
205
nearness and sequential order not absolute distances), symbolic dynamical
processes which as either (1) or (2) can be analyzed with respect to its various
entropies, algorithmic complexities and word content and syntax. A variety of
techniques aimed at deciding between the relevance of one or another of these
underlying assumptions (such as series and Fourier phase shuffling to destroy
statistical autocorrelations and vectorial continuities but leaving the probability
density distributions intact ) may at times help emphasize one or another of these
orientations in the analyses (see Ott et al, 1994 for a collection of articles on this
topic).
Nonconvergent Distributions and Power Law Scaling in Biologically Relevant
Time Series
The statistical distribution with which most of us are familiar is the Gaussian
which can be generated by summing and averaging a series of independent
random events. The average behavior head/tails probabilities observed by one
person flipping a fair coin for a very long time or by many people flipping similar
coins for shorter times converges upon the invariant measure of 0.5. The variance,
“second moment” in the distribution of a population of coin flipping sequences will
be finite and computable. In a graph of this distribution, the tails will converge to the
x axis in a Gaussian exponential manner. The longer or the more numerous the
“sample” series of observations, the closer they will approximate the “ergodic”
invariant measures representing the true “central moments” of the behavior of this
“population” of fair flipping coins. Since the coins are not changing their relevant
characteristics over the time of observation, we say that the series is not time
dependent but instead is “stationary.” Computation of correlations over increasing
lags to determine how much and for how many flips the sequences continue to
resemble themselves yield an exponential decay with a single characteristic
correlation length. This reflects the existence of a finite variance from which its
amplitude is derived and serves as the single characteristic temporal scale of the
random process.
206
Before describing the relatively new set of measures of biological processes
designed to find and quantitate what are assumed to be relatively sample size
insensitive, distributionally nonconvergent and multiply correlated processes that
are without a single time or space scale, we should remind ourselves that there is
already much more apparent “order” in a generically random situation than our
intuitions would lead us to believe. For example, if we keep cumulative scores in a
competition between heads and tails and determine the distribution of trials between
those in which the number of heads and tails are even, we will get periods between
zero crossings of many lengths with very short ones and very (very) long ones
being most statistically prominent. The distribution of these wavelengths is shaped
like a symmetrically fat-tailed, bowl (Feller, 1968). As another illustration, expected
runs of heads or tails in this Gaussian random task are longer and more frequent
than we might suspect. It has been proven that the expected run length grows with
n coin flips (as an order of magnitude estimate) like the logarithm (for a fair coin,
base 1/p = 1/0.5 = 2) of n. For example, in 512 ( e.g. 2 9 ) tosses, we cannot report a
run of 9 heads as a evidence for a biased coin or the sign of some deterministic
coin tossing mechanism (Erdos and Renyi, 1970). If we had a 0.6 head biased coin,
then the observation of a run of 13 heads couldn’t dissuade us from a random
mechanism!
Unlike our random coin task, the variances of many, perhaps most, time
series of biologically-relevant events, do not tend to converge onto a limiting value
as sample size, n, grows, but rather continue to increase (or decrease) with n in a
scale invariant manner. Instead of “regressing to the mean” with increasing sample
length or number, the likelihood of a larger deviation than previously observed
increases with n. Analyses of inter-event intervals reveals a multiplicity of
characteristic times. One interpretation of these finding might be that this represents
evidence for the inherent “nonstationarity” of biological mechanisms as reflected in,
for examples, the frequency of saccades concomitant with ceaselessly shifting foci
of visual attention (Steriade and McCarley, 1990), or our inability to not think of
“white bear” when so instructed (Wegner, 1994). Hermann Haken, the father of
laser-inspired “synergetics,” has said that biological mechanisms are not in a steady
207
state for very long, spontaneously and irregularly jumping from one unstable
dynamical state to another (1997). This suggests that meaningful tension between
experimental sample lengths long enough to minimize statistical error and short
enough to be stationary may be, for the biological sciences, more apparent than
relevant.
The studies reviewed below exploit measures arising from the view that the
noisy statistics of nonstationarity in biological processes are not a sign of
measurement error, but rather evidence consonant with the statistical physics of
nonequilibrium states and phase transitions (Stanley, 1971; Stauffer, 1985;
Yeomans, 1993). Very high amplitude fluctuations and multiple, up to infinite,
correlation lengths are characteristic of the normal, on-going biological dynamical
behaviors, which are apparently without characteristic amplitude and time scales.
From this point of view, if most or all information is widely distributed in the brain
(e.g., serial order of visual tasks involving motor cortical neurons, Carpenter et al,
1999) ) then the “binding problem” (see above) could also be solved by multiple, up
to infinite spatial and temporal correlation lengths in place of the current theories of
monofrequency resonances (Singer, 1993). Hierarchical neurodynamical
mechanisms communicating across many mechanistic temporal and spatial scales,
brain information transport analogous to the energy cascade of hydrodynamic
turbulent velocities (Tennekes and Lumley, 1972), would be likely in the parametric
vicinity of incipient bifurcations and phase transitions.
Three closely related techniques for quantifying the systematic changes in
average fluctuation amplitudes with n (scale, sample length) involve a “power law,”
linear slope relationship between the logarithm of an index of variability and the
logarithm of sample segment sizes. These easy, yet powerful methods were
brought to experimentalists’ attention by Benoit Mandelbrot (Montroll and Badger,
1974; Mandelbrot, 1983; Fedor, 1988; Bassingthwaighte et al, 1994; Liebovitch,
1998). To estimate the exponent in Hurst rescaled range analysis, we compute the
standard deviation and the range of the deviation of the running sum from the mean
on sequential subsamples of increasing size. The Hurst power law exponent is the
slope of the straight line formed by graphing the logarithm of the subsample length
208
along the x axis and the logarithm of the ratio of the range to the standard deviation
on the y axis. An independent random system has a Hurst of 0.5. If a sequential
increase or decrease in an amplitude or inter-event time tends to be followed by a
change in the same direction, the Hurst > 0.5. If an increase in the measure tends
to be followed by a decrease, then Hurst < 0.5. Computation of the Fano factor
(power law exponent) exploits the same general strategy using the variance/mean
in place of the range/variance and counting the number of events (such as single
neuron discharges or heartbeats) in time windows of increasing length, generating a
similar log-log graph. There is a relatively long history of the use of spike-number
variance-to mean ratio in studies of response variability in visual cortical neurons
(see Teich et al, 1996 for a review). The Allen factor (power law exponent) tends to
reduce the influence of local trends by a computation of the variance of the
difference between the number of events in two successive time windows divided
by twice the mean number of events in the window.
Each system’s invariant logarithmic slope across sample segment sizes
takes the place of its missing finite variance in characterizing experimental data in
which the distributional tails do not converge (or do so very slowly) to the x axis.
Recent approaches to these measures in the context of stochastic analysis of DNA
sequences, but also applied to normal and pathological cardiac inter-beat intervals
and gait interval sequences, have dealt with the influence of non-stationarity due to
apparent trends in the data on α-equivalent indices by local mean-normalization of
the fluctuations at each window size (Peng et al, 1993; Peng et al, 1995; Hausdorff
et al, 1995).
The rate of decay of the densities in the tails of the probability distribution as
they approach extreme values along the x axis, called the Levy exponent when
represented in Fourier space (technically, as a “characteristic function” of the
probability distribution) (Shlesinger, 1988; Shlesinger et al, 1995), can also be
computed directly on the distribution by fitting the tails with a two parameter curve
quantifying their “fatness” and rates of decay (Mantegna, 1991). We can speak of a
Gaussian tail as having an exponential decay rate representable by α = 2 implying
209
finite variance. A tail with a nonconvergent decay rate of 1 < α < 2 indicates nonfinite
variance in the data such that the usual “normal curve” derived, standard
deviation dependent tests of statistical significance are without meaning. α < 1
indicates the data is without a consequential mean and will require the use of
interquartile measures to locate the center of the distribution (Adler et al, 1998).
Recalling that the Hurst, Fano and Allan indices are invariant across sample
segment size, we remind ourselves that, as is the case in the finite mean and
variance, α = 2, Gaussian, any of the other “α tails” also retain their value (“shape”)
across all partitions that might be used to sort and sum the observable. This
property is called convolutional, α, stability. In passing it should be noted that the
last outpost of convergence of a probability density distribution with α = 2 is called
“log-normal,” in which the tails along the x axis are “pulled in” by the variable being
plotted as its logarithm.
A Hurst exponent of > 0.5 in the data is associated with a Levy exponent of <
2.0, and both would be indicative of a process in which the characteristic style of
change, rather than decay with some finite correlation length, would persist across
all time. Using a bursting neuron as a generic example, a short interspike interval
would, on the average, be followed by another short one and a long one by another
long one, and this behavior, unlike our fair coin flipping sequence of observables,
would not become uncorrelated with itself even over infinite time. Another way to
represent this infinite, innumerably lengthed, correlation property is via its implicate
frequency (inverse wavelength) content by computing its best fit assortment (along
with their densities) of a range of short to long sine waves forming the Fourier
transformation of the correlation function. The condition of correlated fluctuations
across many measured temporal scales yields yet another power law slope when
graphed as the logarithm of its range of frequencies, f, plotted along the x axis,
versus their corresponding amplitudes squared, powers, plotted along the y axis.
Naming this spectral power law exponent β, the system’s characteristic scaling law
is usually expressed as 1 (Fedor, 1988; Hughes, 1995; Shlesinger, 1996;
β
f
210
Liebovitch, 1998).
We see that the Hurst exponent, Fano and Allen factors, Levy exponent and
power spectral scaling exponent are kindred statistical descriptors. They are most
usefully applicable to systems with distributions that fail to be Gaussian or
asymmetrically Poisson, the latter from random data sequences with only positive x
values, thus backed up toward zero by a minimum inter-event interval or amplitude.
These time series are sequentially dependent, not conventionally stationary, without
finite central moments and with self-correlations that don’t demonstrate Gaussian
exponential decay with sample length or time. The following are some examples of
the use of these measures in studies of biological dynamics. .
Examples of Biological Data with Divergent Distributions and Power Law
Scaling
A paradigm challenging group of experiments involved models and measures
of the distribution of characteristic open and closed times of membrane ion
conductance channels. The usual approach to this problem assumed the existence
of a small set of distinguishable channel types that were reflected in discrete
conductance events with a small set of characteristic open and closed times. The
distributions of each of could be fitted with its own, Markov process derived,
exponential. With technical advances and improved temporal resolution, more
characteristic times and their associated α = 2 exponentials were reported with as
many as three not being unusual. Liebovitch (and Sullivan,1987; 1989) used
analogue to digital transformation of current recordings from the unselective corneal
epithelial channels and voltage dependent potassium channels in cultured mouse
hippocampal cells at temporal resolutions ranging from 170 to 5000 Hz and found
similarly shaped, α < 2, nonconvergent distributions across temporal scales. This
led these investigators to suggest that, related to the >16 recorded magnitudes of
characteristic times, from picoseconds to months, in autonomous protein motion
(Careri et al, 1975; Gurd and Rothgeb, 1979), that there was an “α stable” hierarchy
211
of lifetimes of states, observable at almost any temporal resolution that methods
would allow.
Early and representative studies comparing the fit of the data with
hierarchical scaling functions versus a sum of a small number of Markovian
exponentials included studies of a calcium activated potassium channel in human
fibroblasts (Stockbridge and French, 1989) which yielded evidence to support both
models, as did studies of membrane conductances in corneal epithelial cells by
another group (Korn and Horn, 1988). In a systematic comparison of scaling and
Markov exponential modes of the gating kinetics of GABA activated chloride
channels, acetylcholine activated end plate potentials, calcium activated potassium
channels and fast chloride channels (McManus et al, 1988), it was found that the
latter fit the data best in most experiments. Similar results were reported in studies
of the glutamate and delayed rectifier potassium channel with respect to
distributions of open and closed times (Sansom et al, 1989). Space does not permit
a systematic account of the continuing debate and conflicting studies about these
representations and the implicit biophysics of discrete, finite versus continuous,
hierarchical channel event heterogeneity. It is interesting that recent experiments
making use of Hurst rescaled range analyses of time series of whole cell membrane
voltage fluctuations (without the assumptions and current renormalizing procedures
associated with patch clamping) have yielded additional evidence for multiply
correlated, Hurst > 0.5, α < 2 power law behavior of what some might regard more
generally as a protein relaxation time mediated hierarchical array of ion
conductance behaviors (Liebovitch and Todorov, 1996).
Following the discovery of (very) subsaturating (“far from equilibrium”) rat
brain levels of the common cofactor for tyrosine and tryptophan hydroxylases,
tetrahydrobiopterin (Bullard et al, 1978), studies of amino acid substrate saturation
functions and time courses determined at these low, physiological co-reactant
concentrations manifested patterns of hierarchical multiplicity and discontinuities
suggestive of bifurcations and time-dependent fluctuations with fractional
(hierarchical) time scaling exponents that were sensitive to psychotropic drugs
212
(Mandell and Russo, 1981; Knapp and Mandell, 1983; Russo and Mandell, 1984a;
Russo and Mandell, 1986). Similar bifurcating and power law kinetics were found in
receptor-ligand binding systems (Mandell, 1984) which were confirmed by more
recent studies of diffusion-limited binding kinetics with receptors immobilized on a
biosensor surface (Sadana, 1998). Hierarchical kinetics have also been reported in
time courses of drug and metabolite levels (Koch and Zajcek, 1991), tissue tracer
washout studies (Beard and Bassingthwaighte, 1998), carrier mediated transport
processes (Ogihara et al, 1998), general pharmacokinetic functions (Macheras et al,
1996) and biochemical networks (Yates, 1992). It is likely that bifurcating and
hierarchical, power law kinetic functions will be studied more commonly in the
chemical literature in general (Shlesinger and Zaslavsky, 1996; Berlin et al, 1996)
as well as applied to a variety of protein-mediated biological functions (Dewey,
1997).
The first demonstration of and stochastic model for nonconvergent
distributions of interspike intervals of a single neuron was by Gerstein and
Mandelbrot (1964). Though rich with possibilities, it has been only very recently that
additional work from this point of view has been published. This is likely due to the
fact that most neuroscience oriented statistical packages, with rare exceptions, are
without techniques for computing descriptive parameters for these divergent
probability density distributions. This has not been the case for economic time
series, download STABLE from http:///www.cas.american.edu/~jpnolan. Recently,
applications of the Fano and Allan factor as well as power spectral scaling
exponents to observed and shuffled series of spike counts and interspike intervals
in the auditory and visual systems (including spatial and/or time resolved single unit
recordings in retinal ganglion, lateral geniculate and lateral superior olivary cells as
well a auditory nerve fibers) demonstrate the characteristic behavior of
nonconvergent, hierarchical stochastic systems (Teich, 1989; Teich et al, 1990;
Lowen and Teich, 1992; Kumar and Johnson, 1993; Kelly et al, 1996; Teich et al,
1997). These statistical techniques are well suited to the characterization of the
irregularly intermittent bursting patterns generic for activity in single neurons as well
213
as in nonlinear equations representing them and other brain processes (Mandell,
1983).
An early study of power spectral scaling in the EEG reported alpha band
fluctuations that extended a 1 , β ≈ 1 pattern to 0.02 Hz (Musha, 1981), as did
β
f
other applications of the log-log power spectrum to the EEG in man (Hu and Hu,
1988; Prichard, 1992). This power law scaling led naturally to the suggestion that
the range of frequencies available in the electromagnetic signal from the calivarial
surface extends far beyond those currently appreciated and may be available for
study using relatively noise free recording techniques such as the
magnetoelectroencephalogram (Mandell and Selz, 1991). A not surprising range of
intrinsic correlation lengths reflected in Hurst > 0.5 and/or Levy exponents < 2 have
been reported in lamb fetal breathing patterns (Szeto et al, 1992). The exponent
has been shown to be sensitive to maternal alcohol intake in humans (Akay and
Mulder, 1998), rat neonatal motoric activity (Selz et al, 1995), and nuchal atonia
duration sequences (associated with putative intra-uterine REM sleep) in fetal
sheep (Anderson et al, 1998).
Sequential amplitudes in 1 Hz stimulated soleus spinal cord H-reflex
demonstrated a 1 , β ≈ 0.83 in control subjects and, reflecting the decrement in
β
f
correlations, by 0.31 in patients with losses in supraspinal input from spinal cord
injury (Nozaki et al, 1996). Whereas the sequences of fixation times in eye
movements of normal control subjects reading difficult material demonstrated an
exponentially decaying distribution, those of schizophrenic patients demonstrated a
power law tail, consistent with more sequential correlations (Yokoyama et al, 1996).
This finding may be related to the appearance of velocity arrests, runs of sticky fixed
points, in a spatially oscillating target task, called “smooth pursuit eye movement
dysfunction” in schizophrenic patients which has been modeled as a parametric
disorder in a periodically driven nonlinear dynamical system (Huberman, 1987). The
“short time fractal dimension” has been used to discriminate acoustic signal
transformations from the speech of normal subjects and ataxic patients (Accardo
214
and Mumolo, 1998). Spontaneous changes in the apparent syllabic sound made by
regularly presented, word-like auditory stimuli emerge irregularly, the duration of
perceived sameness demonstrating a power law distribution of “dwell” times (Tuller
et al, 1998). The same kind of power law distribution of characteristic “brain times”
can be found in studies of gait cycle durations in normal walking (Hausdorff et al,
1996) with a decrease in this locally detrended, α-like index compared with controls
(0.91±0.05) in patients with the basal ganglia disorders of Parkinson’s (0.82±0.06)
and Huntington’s (0.60±0.04) Diseases (Hausdorff et al, 1998). Hurst > 0.5 has
been speculated to more accurately quantitate the fundamental time structure of
cells that was previously called circahoralian (ultradian) intracellular rhythms
(Brodski, 1998).
Reconstructions of Time Series as Orbital Geometries
Rene Thom (1972), extending the ideas of Poincaré and D’Arcy Thompson
(1942), argued that experimentally useful, intuitive connections between the
qualities of biological processes and the quantities of an explicit (equations known)
or implicit (equations unknown) dynamical system could be best achieved through
the use of graphic representations of their geometric and topological forms. Notably
successful examples can be found in the work of Thom, Arnold (1984) and Zeeman
(1977), who were inspired by “caustics” (the shapes made on surfaces by the
coincidence of reflected or refracted light rays) and Whitney’s representation of
parametric manifolds (surfaces) by the shadows they would make on a plane when
back lit (Whitney, 1955). This led to a small number of qualitatively predictive,
number-of-independent-parameters dependent shapes, such as “folds” “cusps” and
“wavefronts.” Experimentally crossing the values of these independent variable
forms at their singular boundaries successfully predicted discontinuities in the
otherwise smooth alterations in the dependent variable; i.e. bifurcations
(“catastrophes”) in the behavior of the observable. This approach was best suited to
the study of systems with many independent variables and one dependent variable
that could be mapped on the axis of the latter to represent a continuum of
215
operationally defined “energy states.” Smooth changes along the path of the
nonlinear parameter manifold generated discontinuous changes in energy levels
indicating states of the observable. Crossing a wrinkle in an “independent variable”
(some call it “order parameter” to indicate its emergence rather than availability for
predictable manipulation) such as the nonlinear parameter surface of the
countervailing influences of survival fear and financial cost, may lead to a bifurcation
in behavior from peace (“low energy”) to war (“high energy”) (Zeeman, 1977).
In a similar geometric spirit but dealing with nonequilibrium systems in
motion, the conditions such that one could “smoothly” embed a trajectory like a
continuously recorded EEG record, a complicatedly coiled snake into a three or
higher dimensional box without loss of its essential dynamical or statistically
measureable properties, was settled by Whitney in what is now referred to as the
“embedding theorem” (Whitney, 1936). Starting with a tangled knot of overlapping
vectorial orbits with apparent “non-invertable points” (given a point, one cannot
chose among or between the more than one point that it apparently came from), it
can always be unwrapped into a non-crossing trajectory satisfying uniqueness when
reconstructed in a box of a little more than twice the parameter-determined
dimension of the original space of observables.
A common technique for the spatial reconstruction of the output of a
dynamical system is called a “time delay embedding.” This approach, first
suggested by Ruelle (1987, pg. 28) replaced the value, x, versus the time
derivative, dx , phase portrait plot described for a continuously perturbed bob on a
dt
spring above. A sequence of observables over time, in, for example, three
dimensional “phase space” (Packard et al, 1980; Takens, 1981; Sauer et al, 1991),
is depicted by a curve representing the system’s trajectory at times t 1 , t 2 , t 3 , by
sliding one-by-one down the series and plotting each p 1 , p 2 , p 3 , location with
respect to each other along the x, y and z axes respectively. The choice of time
interval between the points, the delay, can be delicate and usually some standard
fraction of the decay time of the sequence’s autocorrelation length, “the decay time
of mutual information” is chosen. There are many technical considerations,
216
including those involving the choice of the embedding space vis a vis the “true”
dimension of the attractor. This becomes an issue when, for example, the attractor
shrinks over time to some subspace of the initial embedding (Liebert et al, 1991 and
references therein).
If we imagine the process of time series reconstruction to inscribe an
attractor’s untidy ball-of-string of recurrent trajectories in three dimensions, we can
then, by making the z-dimension a constant value, cut the ball with a two
dimensional plane, a “Poincaré surface of section.” This could yield a roundish
cloud of discrete points on the x,y plane and t n-1 – t n would be the time between two
piercings of this surface. It has been proven that almost any cut, as long as it is
made transverse to the direction of the orbital trajectories, is equally valid and useful
for further analyses (Oseledec, 1968). If the original embedding and subsequence
section was in high enough dimension to allow invertability, we might have enough
(trial and error) knowledge to be able to write a discrete equation, a “return map,” f,
f
that would move one point to the next on the plane as (x,y) t ←→ ⎯ (x,y) .
What can sometimes be case with real systems (Coffman et al, 1986), is that
reducing the geometric reconstruction still one dimension further, accepting noninvertability,
ironing down the points in the plane onto the x axis line (normalized to
[0,1]), and plotting the values at x t against x t+1 (“mapping the unit interval to itself”),
can generate points in the general shape of a parabola with dynamics representable
by the same family of one parameter, single maximum discrete equations that
generated May's sequence of bifurcations, Feigenbaum’s scaling and Metropolis,
Stein and Stein’s (and Sharkovskii’s) U sequence (see discussions of qualitative
and quantitative universalities above).
Although sometimes a significant change in brain system physiology, such as
penicillin-induced epileptic neuron spiking activity is revealed simply by a change in
the graphic appearance of suitably embedded time series data (Zimmerman and
Rapp, 1991), more often statistical measures made on the geometric dynamics of
the points on the attractor are required.
n−1
t n
217
Orbital Divergence Characterizes Expansive Dynamics on Biological
Attractors
In the dynamical world of equilibria (fixed points in phase space) and
periodic cycles (fixed points of a return map), a common concern involves their
stability. What happens if an adventitious jiggle moves the orbit a little distance
away from the fixed point? Would the wind wiggled suspension bridge start to flap
with increasing amplitude or would it damp back down quickly to its stable state. A
“Lyapounov functional,” L, is constructed which can be visualized like a smooth
potential bowl around the fixed point such that any L stable solution that starts at its
bottom tends to stay there or is asymptotically L stable if the solution converges to
the fixed point at the bowl’s bottom as t → ∞ . If the point is not L stable, it is L
unstable.
The modern study of nonlinear systems have produced another kind of
stability issue with a similar appellation yielding other direction specific indices, the
Lyapounov characteristic exponents, λ (Oseledec, 1968; Eckmann and
Ruelle,1985; Ruelle, 1990; Ott et al, 1994). In this context, the instability is not one
of perturbative escape from a fixed point, but of the average rate with which the
(theoretically infinitesimal) distances among a handful of points representing a set of
initial conditions (each a precision limited, hypothetical repetition of the same
experiment), are being stretched apart by the expansive action of a strange attractor
system. In three dimensions, one can envision a ball of initial conditions being
elongated along the unstable direction and ironed down from both sides along the
stable direction over time, transforming the ball into an ellipsoid and then into a
(recurrent) curve. In simplest terms and thinking about a one dimensional scalar
time series, the Lyapounov exponent reflects the multiplicative average (logarithmic
addition) of the sequence of slopes of the series of straight lines connecting the
points. An average slope of > 45 ° is expansive such that a linear distance on the x-
axis is increased when mapped onto the y axis. A slope of < 45 ° is a contraction
mapping reducing the linear distance of the x-axis when mapped to the y axis.
218
Rössler’s generic chaotic system (see above) moving recurrently in a three
dimensional box can be orthogonally decomposed into three directional motions in a
moving frame, each with a signatory sign of λ . The unstable direction of expansive
stretching is characterized by some number > 0, λ( + ), the stable direction of
contractive folding, some number, < 0, λ( − ), and the neutrally stable direction of
recurrence, λ( 0 ). For The “Lyapounov spectrum” of the Rössler attractor is
[ λ( + ), λ( −), λ( 0 )] (Shaw, 1981). An n-dimensional dynamical systems has n onedimensional
Lyapounov exponents, and it is sometimes the case in relatively noise
free, finite semi-stationary data lengths of the neurosciences, that a λ > 0 can be
shown to exist for a second one, in a dynamical situation called “hyperchaos” by
(Rossler, 1979). For example, two and sometimes three λ( + ) have been reported in
the flows on the EEG attractor of normal alert subjects (Gallez and Babloyantz,
1991). The presence of measurement noise, the finiteness of neurophysiological
sample lengths as well as the relatively small expansive actions in some directions
in the chaotic attractors of brain dynamics lead to the finding that most often, only
one “leading Lyapounov exponent,” λ( + ), is reliably computable (Sano and
Sawada, 1885; Wolf et al, 1985; Eckmann et al, 1986).
A counter-intuitive fact about the stability of a dynamical system when a
decrease in the value of λ( + ) is observed such that λ( + ) → λ( 0 ), is that this more
neutral stability augers a global bifurcation (Guckenheimer and Holmes, 1983). A
small perturbation does not change the global dynamics of an already expanding
and contracting (called “hyperbolic”) dynamical system, it will maintain the style of
its motions. However, when λ( + ) → λ( 0 ), a velocity changing perturbation evokes a
bifurcation to a new dynamic in what is called “loss of hyperbolic stability.” The best
examples come from the observations of this kind of change in the EEG predicting
the onset of epileptic seizures in patients with focal or temporal lobe epilepsy
(Iasemidis et al, 1988,1990; Iasemidis and Sackellares, 1996 ).
219
The number and variety of algorithmic strategies for computing Lyapounov
exponents that are applicable to real data divide naturally into those that compute
directly the average rate of separation of neighboring points from the “fiduciary”
orbit, as observed on the reconstructed attractor, from which only the largest λ can
be obtained (Wolf et al, 1985), and a variety of techniques based on assumed
model maps of the unknown flow along which the sequential products of the local
derivatives are computed. The logarithms of the straight line slopes of the sequence
of directionally decomposed local tangent vectors multiplied, yield as many
Lyapounov exponents as directions (Sano and Sawaka, 1985; Eckmann et al, 1986;
Geist et al, 1990). The techniques of regularization by which these model processes
approximate the unknown flow include those with least squares, linear fit
assumptions (Eckmann et al, 1986; Sato et al, 1987; Buzug et al, 1990), more
detailed fits involving polynomial expressions in higher powers (Briggs, 1990; Brown
et al, 1991; Bryant et al, 1991) and techniques such as “singular value
decomposition” which decomposes the flow into orthogonal components before
computing the logarithmic rate of divergence of nearby points on each of them
(Stoop and Parisi, 1991). A clever check on the Lyapounov number obtained is to
study the flow backwards so that, for example, some rate of separation of points in
the forward direction would approximate the rate of convergence in the time
reversed data (Parlitz, 1992).
Among the sources of spurious Lyapounov exponents are sample lengths
that are too short and/or too measurement-noisy to compute a statistically stable
average, embedding dimensions that are too high or low and attractors (many of
physiological relevance) that have geometric features such as sharp corners or tight
folds as in the Rössler (where points gather) or delicate boundary points such as
those on the body on the Lorenz butterfly (see above) where very small distances
determine whether the orbit makes big jumps to the right or left wing leading to
uncharacteristically large separations. This “nonuniformity” in the rates of expansion
and contraction in the dynamics over the attractor, a source of error in computations
of statistical indices of the average behavior, becomes a useful tool in
characterizing individual differences in sets of neurobiological data ranging from
220
brain enzyme kinetics (Mandell, 1984) and single neuron firing patterns (Selz and
Mandell, 1991) to human psychomotor and cognitive behavior (Selz, 1992; Selz and
Mandell, 1993).
The Leading λ( + ) of Some Biologically Relevant Time Series
An early application of a simplified form of leading Lyapounov exponent to
brain data involved the computation of the one dimensional averaged slope of in
vitro studies of psychopharmacological drug and peptide effects on time series of
catecholamine and indoleamine biosynthetic enzyme activities studied at
physiological, far-from-equilibrium reactant concentrations (Russo and Mandell,
1984b; Knapp and Mandell, 1984). A contemporaneous study also suggested the
influence of differences in initial conditions for pharmacokinetic equilibrium times in
drug binding kinetics by proteins (Bayne and Hwang, 1985).
The most extensive applications to the clincial neurosciences of the
Lyapounov measure of the exponential divergence of orbital points has involved
reconstructed brain wave attractors from the intracranial or scalp recordings of the
EEG (Duke and Pritchard, 1991; Dvorak and Holden, 1991; Jansen and Brandt,
1993). Space prevents us from surveying more than a small representative set of
the studies (Jansen, 1996). It should be noted, however, that this is an area in
which “state of the art” research has grown quite complicated and somewhat
controversial with respect to technical issues. The choices of the digitizing
frequency of the smooth record, the dimension of the embedding space and time
delays continue to be debated in the context of numerical computations of λ and
dimension measures (Mayer-Kress, 1986; Ott et al, 1994).
Controls for the implicitly required statistical discrimination between
“randomness” and “deterministic chaos” consist of sequence and (Fourier) phase
randomization generating “surrogate data” which conserve the probability
distributions and destroy the correlation properties and attractor geometries (Sauer
et al, 1991; Ott et al, 1994). Since neither bring with them any connections with
221
known or explorable brain mechanisms, one might argue that at this early stage of
the work it would be more desirable to simply report the quantitative findings,
leaving unanswerable questions about ultimate causality for later discussion (see
below).
The first EEG λ( + ) was reported in a patient with epilepsy (Babloyantz and
Destexhe, 1986) which was confirmed by others (Iasemedia et al, 1988; Frank et
all, 1990). An important study of simultaneous time series from 16 subdural
electrodes placed in the right temporal cortex of a patient with a right medial
temporal lobe epileptogenic focus demonstrated that a decrease in a single lead’s
λ( + ) reliably anteceded and localized the first signs of the incipient seizure. The
rest of the leads followed with similarly decreased positivity in their leading
Lyapounov exponents associated with spatially coherent patterns of behavior. In
addition, the averaged value of the leading Lyapounov exponents in the 16 leads
increased post-ictally over the averaged values of λ( + ) in the pre-ictal state
(Iasemidis et al, 1988,1990). These findings, including seizure anticipation for 25
minutes, were confirmed using intracranial recordings in 16 patients with temporal
lobe epilepsy (Elger and Lehnertz, 1998). The para-ictal decrease and post-ictal
increase in λ( + ) found in patients with focal temporal lobe seizures was confirmed
more generally in left and right pre-frontal-to-mastoid EEG recordings made before,
during and after electroconvulsive shock treatment of psychiatric patients (Krystal
and Weiner, 1991). Pre-ictal changes were also found six minutes before seizure
onset from scalp EEG recordings in 17/19 patients with chronic focal epilepsy
(Martinerie et al, 1998). The most exciting potential application of this approach is
its use, in real time, for the prediction and prophylactic treatment of incipient
seizures, minutes to hours before the event, in place of or augmenting long term
drug management (Iasemidis and Sackellares, 1996).
There is a growing literature about leading Lyapounov exponent(s) in the
reconstructed attractor of the EEG associated with a variety of normal and
pathological human behavioral states. For examples, two and sometimes three
222
λ( + ) were reported in awake relaxed subjects and were lost in deep sleep (Stage
IV) and coma (advanced Jakob-Creutzfeld disease), suggesting that level of
consciousness correlated positively with amount of orbital divergence (Gallez and
Bablioyantz, 1991). A “pathologically low” leading λ( + ) was also found to be
characteristic of the EEG of patients with Alzheimer’s syndromes (Jeong et al,
1998). Technically defined sleep stages (I, II, III, IV, REM) were found to correlate
well with the values of the leading λ( + ) of the EEG in normal subjects (Fell et al,
1993; 1996; Pradhan and Sadasivan, 1996). EEG recordings during problem
solving sometimes, but not always, demonstrated a relationship between values of
λ( + ) and the kind or amount of load of the task (Micheloyannis et al, 1998;
Popivanovov et al, 1998; Meyer-Lindenberg et al, 1998). Both emotionally positive
and negative videos increased the value of the leading λ( + )(Aftanos et al, 1997) as
did computer generated music with sounds that exploited a “pleasing” hierarchical,
1/f but not an “unpleasant” 1/f 2 frequency spectrum (see previous section about
power law scaling) (Jeong et al, 1998). The EEG theta rhythm of “day dreaming”
manifested a lower λ( + ) than the “relaxed alert awake” alpha rhythm (Roschke et al,
1997). Relationships between the Lyapounov spectra demonstrated both regional
independence and task-related dependence in the magnetoencephalography record
in man (Kowalik and Elbert, 1995).
These and other studies suggest that divergence rate of orbits on a
geometrically reconstructed attractor is a subtle measure, which can be quantified
as a continuous variable and which has been found to be useful in a variety of
neuroscience-related, experimental contexts. The range includes the
characterization of the discharge pattern of a single somatic or renal sympathetic
nerve fiber (Gong et al, 1998;Zhang and Johns, 1998); quantifying the results of
perturbing autonomic nervous system activity, for examples, exercise, atropine and
propranolol decrease λ( + ) in the cardiac interbeat interval attractor (Hagerman et
al, 1996) and interference with the function of the baroreflex or clonidine alters the
λ( + ) in the blood pressure attractor in man and animals (Wagner et al, 1996;
223
Mestivier et al, 1998); and predicting defects in visual learning functions from
decreases in the λ( + ) of the cardiac interbeat interval attractor in patients with
multiple sclerosis (Ganz and Faustman, 1996).
We recall that on theoretical grounds (Guckenheimer and Holmes, 1983), a
decrease in the positivity of λ( + ) → λ( 0)
in a delay coordinate, geometric
reconstruction of a time series of observables may auger an incipient global
bifurcation in the system’s dynamics. As reviewed above, this has turned out to be
the case in several studies of the EEG and electrocorticogram in epileptic patients.
Futher research will be required to see if this idea has substance more generally for
predicting “catastrophic” changes in other brain-related systems.
Power Law Scaling of Orbital Geometries in Time Series Reconstructions
Benoit Mandelbrot’s book in its first incarnation was derived from his lectures
at College de France in 1973 and 1974 and was called Les Objets Fractals: Forme,
Hasard et Dimension (Mandelbrot, 1975). This essay was translated into English as
Fractals, Form Chance and Dimension (Mandelbrot, 1977). Later expanded and
reworked editions displayed another title, The Fractal Geometry of Nature
(Mandelbrot, 1982) but the deep conceptual, sometimes poetic fusion and confusion
generated by the apparent identity among the objects of his first title remains.
“Fractal,” along with “chaos” and “strange attractor” are among the most widely
familiar new words in modern dynamical systems research. Fractal is the most
difficult to rigorously define and is commonly misunderstood due to the evocative
yet dream-like cognitive condensations provoked by the first title and its reflections
in Mandelbrot’s prose. A common conceptual confusion is exemplified by the
assumed relation between “fractal time event distributions” of the cardiac interbeat
interval and the “fractal like” anatomy of the purkinje network of the cardiac
conduction system. Data from both contexts are often shown juxtaposed in the
same illustration as though their relationships were obvious (Goldberger et al, 1990;
Goldberger, 1996; Liebovitch and Todorov, 1996). “Fractal times” and “fractal
224
geometries” are not related to each other essentially, either in the mathematical or
physiological domain, but are often made vaguely equivalent on the basis of their
lexical similarity.
An experimentally meaningful relationship between fractal statistics (hazard),
dynamical fractals (dimension) and fractal geometries (form), has to be proven on a
case by case basis and not assumed from their common designation. Among the
informal attempts to do this have been those that involve the branching pattern of
nerves and the associated reductions in their diameter-dependent characteristic
conduction velocities yielding a multiplicity of “arrival times.” There is, however, a
more central idea common to these concatenated meanings of fractal: the
statistical, dynamical and geometric expressions of “scaling,” a word which is not
mentioned in Mandelbrot’s book titles. The cluster of theories, theorems and
methods associated with the idea of scaling (and renormalization) have led to Nobel
Prizes for Flory (1971), Wilson (1975) and de Gennes (1979) and the (equivalent
mathematical) Field’s Medal for McMullen (1994). There is speculation that the last
two awards were supported by the inspiration and interest given their research by
Mandelbrot’s intuitions and books.
Scaling laws take the place of (unknown causal) physical laws by indicating
the proportion by which observables of a system can be changed in relationship to
each other such that some statement about them, “this varies with that,” still holds.
In a cross species comparison, as the average weight of a mammalian body, called
lb, increases, the skeletal weight, called w, increases at an exponentially greater
rate: w goes like lb 1.08 where lb 1.0 would indicate that they grew across species at
the same rate. Plotting log (lb) on the x axis and log (w) on the y axis in a log-log
plot results in a straignt line with a slope that indicates the power law scaling
relationship between body weight and skeletal weight across mammals. The slope
of the scaling exponent of 1.08 is a little over 45 ° = 1. In contrast, the metabolic rate,
r, goes like (lb) 0.75 , r ≈ (lb) 0.75 . Larger animals (relative to their weight) have lower
basal metabolic rates (Schmidt-Nielsen, 1984). We don’t completely know the chain
of intervening mechanisms that relate these variables to each other but we do know
225
invariant scaling laws that describe their relationships within some limits on the
range of values.
In describing the functional size, radius of gyration, R g , of a polymer such as
a polypeptide, composed of N monomers, assume each of the amino acids to be
the same and that they are in a “good” hydrophobic solvent that didn’t stick the
polymer together in a fold. Flory (1971) found a scaling law for certain broad classes
of polymers and solvents, R g ≈ αN ν , where the exponent, ν = 3/5, was universal, N
indicated the number of monomers in the chain and the value of “pre-factor” α
depended upon the particular monomer and solvent chosen. Log R g plotted against
log N has a “power law” slope of 0.60. For an equally static but less physical
example, there is the well known Zipf law of “vocabulary balance”(Zipf, 1949). First
reported for the 260,450 words of James Joyce’s Ulysses, the slope of the log of the
rank of the words found (ordered from most to least along x) plotted against the log
of their frequency (along y) results in a power law that is (generally) true for other
collections of words and in other languages.
An accessible example of a dynamical scaling law arises in a two
dimensional lattice model of a forest which is to be set on fire with probability p
independent random single tree ignitions. At some critical p, p c , the fire sweeps
through the entire forest (“percolates”) and the correlation length of the connected
clusters grows as |p-p c | -γ with a universal scaling exponent, γ = 4/3, for all Monte
Carlo, two dimensional percolation problems (Stauffer, 1985; Grimmett, 1989).
Mandelbrot’s scheme for the power laws that compose his fractal geometry
of dynamical objects is a measure made on the pattern of occupancy in the
embedding space by the reconstructed orbits of an attractor. It is, generally,
mass ≈ length D 0
in which D (the subscript that of the “capacity dimension”) is not the
0
whole number of Euclidian dimensions, d, of the space in which the orbits are
embedded. After Hausdorff’s “convergence of external and internal measures”
(Hurewicz and Wallman, 1948), the (capacity) fractal dimension D is also defined
as being larger than its topological dimension and smaller than its Euclidian
embedding dimension. Graphing a time series on a plane one can think of its
0
226
topological dimension as that of a line equal to one. If each time step had the
largest up or down amplitude as possible, its fractal dimension would approach (but
not reach) that of the embedding plane, Euclidean d = 2.
The D 0 of the one dimensional Richardson technique (Mandelbrot, 1967) can
be computed by covering the one dimensional surface of a time series with a
number, #, of line segments of several orders of magnitude range of lengths, l
.Graphing log(l) along the x-axis and log #(l) along the y-axis yields a negative linear
slope, -s. As defined, 1- s = D 0 noting that (-(-s)→+s) such that 1 < D 0 = 1+s < 2.
Strain differences and peptide and psychotropic drug-induced changes in D 0
computed in this way were found in time series of fluctuations in rat brainstem
tyrosine and tryptophan hydroxylase activities under far-from-equilibrium coreactant
concentrations (Mandell and Russo, 1981; Knapp et al, 1981; Knapp and
Mandell, 1983; 1984). Systematic influences of stimulant drug dose on D 0 were
found as well in these systems (Mandell et al, 1982). This simple measure, made
directly on the “roughness” of the graph of a one dimensional time series rather than
on its orbital reconstruction, has been used to discriminate the pattern of
fluctuations in daily mood scales in normal subjects and mood disordered patients
(Woyshville et al, 1999). These findings confirmed dimensional scaling exponents
on higher dimensional embeddings of similar time series in mood disordered
patients (Gottschalk et al, 1995; Pezard et al, 1996). Due to the ease and rapidity of
its computation, techniques involving D 0 on one dimensional time series are
currently in development as possible real time epilepsy predictors when analyzing
the output of a large number of EEG leads simultaneously.
If M(ε) is the minimum number of d-dimensional cubes of side ε required to
cover the d-dimensionally embedded attractor, plotting a logarithmic range of rulers
of length ε (as ε→0) along the x axis and a logarithmic range of number of cubes,
M(ε), each of corresponding ε-edge size, along the y axis, results in a negative
(more smaller M(ε) ‘s and fewer bigger M(ε) ‘s) power law slope D 0 . Here the
numbered covering cubes, M(ε), are those in which the probability of containing at
least one point (its “probability density measure,” often called µ) is not zero. We
227
note that changing the ratios of the numbers of cubes that are dense in point
probability to those that are sparse would not influence the value of D 0 . This helps
differentiate D 0 from other dimensions and, as noted above, D 0 as a maximal
estimate of the fractal dimension, is called the capacity dimension and by
convention the scaling law is written
M
( ε)
D
≈ ε
− 0
. More specifically, D0 is calculated
by repeatedly dividing the d-dimensionally embedded phase space into equal d-
dimensional hypercubes and plotting the log of the fraction of the hypercubes
containing data points versus the log of the (normalized) linear dimension (“length
scale”) of the hypercubes. The slope fitted to the most linear part of the slope
(usually the middle 50%) indicates the capacity dimension. D 0 is computed for
increasing embedding (and cube) dimension, d, until it achieves an asymptotic
plateau, it “saturates”. This is but one of a range of geometric scaling exponents,
“dimensions,” that are currently being computed (Farmer et al, 1983; Grassberger
and Procaccia, 1983; Meyer-Kress, 1986; Theiler, J. (1990); Gershenfeld, 1992; Ott
et al,, 1994).
Although still subject to debate, convention has it that the sample length
required to determine this most primitive of dimension computations goes like
D
10 0
(e.g. a dimension of 2.45 requires a sample length of at least 282 points).
Assuming robust findings using D0 as indicated by non-parametric tests of
significance in test-retest, before and after, drug treatment designs, this arbitrary
criteria sounds more like ritual than meaningful help for the clinical neuroscientist
with (say) 100 spinal fluid hormone and metabolite samples painfully and
laboriously collected from a patient’s indwelling catheter over 48 hours. In the
context of real data (and not numerical studies of differential equations), we are
dealing with empirical findings that must find their meaning (or lack of) in the context
of questions about issues in the neurosciences, not in abstract questions such as
those about the number of dimensions that an unknown differential equation would
require to represent the data (Broomhead and King, 1986). In a similar arbitrary
spirit, a system manifesting a D 0 > 5 is considered not discriminable from a random
process; e.g. the difference between D 0 = 5 versus D 0 = 7 (though perhaps
statistically significant) is thought to be without meaning. Since in neurobiological
228
research, “random” (if it doesn’t mean measurement error) indicates unknown
degrees of freedom, this D 0 > 5 rule is also without relevance for brain research.
D 1 is called the “information dimension” and is computed by counting the
number of ε-cubes, M(ε), it takes to cover the points constituting some fixed fraction
of all of the points of the set of orbital points on the attractor and can be regarded as
the “core dimension” (without the outliers) of the set. The counterintuitive finding is
that D 1 is nearly constant across a range of fixed fractions that are less than the
whole measure (Farmer et al, 1983). The invariance of D 1 can even be taken to the
extreme by computing the D
M
lim ln ( ε )
ε ln( ε)
1
=
→o
around (typical, not all) single points. In
this context, D 1 is called the “pointwise dimension” or “singularity exponent” and, as
might be anticipated, its value is usually less than that of D 0 .
The scaling exponent that is both sensitive to point densities and easiest to
compute from real data is the “correlation dimension,” D 2. Here, analogous to the
relationship between the amplitudes of the variance and the correlation function in
conventional statistics, the measure squared is of interest for the computation of D 2 ,
e.g.
M ( ε )
2
I( 2, ε ) = ∑[ µ ( Ci
)] (see below for this use of measure µ on sum Σ of cubes Ci).
i=
1
The selection of D 2 as the fractal measure dominates the studies that invoke scaling
exponents to quantify the distributions of points on the attractor as reconstructed
from time series in the neurosciences (Grassberger and Procaccia, 1983; Mayer-
Kress, 1986; Ott et al, 1994). Several sets of programs are available for its
computation (for example, Sprott and Rowlands, 1991). Generally, a correlation
sum (“integral”, R(ε) ) is computed from a starting point by counting all subsequent
point pairs with distances between them less than ε as ε→0 and plotting
D
2
lim ln( R( ε)
= . D 2 is computed for increasing embedding (and therefore
ε → 0 ln( ε)
hypercube) dimension, d, until D 2 achieves an asymptotic plateau, it “saturates”
(Ding et al, 1993).
It is generally the case that D 0 ≥ D 1 ≥ D 2 (Farmer et al, 1983).
229
In his statistical explorations of experimental results in hydrodynamic
turbulence, Mandelbrot (1974) called attention to the need for a multiplicity of
characteristic scaling exponents, a range of values for each exponent and their
sensitivity to orbital point density distributions (the latter called the Sinai-Ruelle-
Bowen or natural measure (Eckmann and Ruelle, 1985)). These needs grew out of
the intrinsic heterogeneity in the time dynamics and the nonuniform point
distributions in phase space of orbitally divergent, real physical systems. Even with
relatively uniform orbital point distributions, it is intuitively obvious that as ε → 0, the
smaller ε- cubes are over-represented and larger ε- cubes are under-represented in
the M(ε) computation (Farmer et al, 1983). For a concrete example, the fraction of
the total number of cubes containing say 75% of the points would obviously
decrease as the ε-lengths studied gets smaller. Normalizing the D i measures with
respect to point densities would correct for this systematic distortion. In addition, the
non-systematic influence of real system heterogeneity and non-uniformity in both
time and reconstruction space distributions makes the need for relating the D i
measures to the natural measure even more pressing.
The derivation of many separate scaling exponents, as well as global
generalized exponents and the incorporation of point densities in their computation,