powerful general mathematical programs noted above, there exist several sets of more specifically targeted software with the capacity to generate, portray and quantify the behavior of nonlinear continuous and discrete abstract and real dynamical systems. These often also include algorithmic modules that are useful in tailoring new models and measures (see for examples, Parker and Chua, 1989; Baker and Gollub, 1991; Nusse and Yorke, 1991; Sprott and Rowlands, 1991; Sprott, 1993; Korsch and Jodl, 1994; Enns and McGuire, 1997). Learning from and using this software, along with only a little programming in the high level languages and computer algebra programs listed above, permit the non-mathematician neuroscientist, willing to read in the literature such as that described below, to do independent, cutting edge research in applied dynamical systems. Described below will be the computational discoveries in abstract systems and real neuroscientific data that have led to multiple contexts of quantitative description. These include those that are: (1) Geometric and conserve metric distances; (2) Topological and conserve relative positions but not distances; (3) Single or multiple global quantitative descriptors such as scaling numbers or scaling 187 number spectra; (4) Non-Gaussian distributions with heavy tails and correlations reflected in their Hurst, Fano, Allan and Levy exponents; (5) Statistical dynamical descriptions of trajectories of the system in their embedding space such as Lyapounov exponents, Hausdorff-Mandelbrot dimensions, Sinai-Ruelle-Bowen measures, and Adler-Weiss-Ornstein topological and metric entropies. Characteristics which discriminate between experimental versus control conditions in parametric computational and real physiological and pharmacological experiments serve to generate and test ideas and imagery arising out of behavior observed in both biological and abstract dynamical realms. New experiments can be suggested by the implicative structure of dynamical systems theory as well as neurobiological findings and intuitions. As examples, the sudden “switch” of manicdepressive bipolarity syndromes may be a “bifurcation” in nonlinear dynamical systems; the “noise” of the statistical physicist may be the “arousal” of the brain stem-thalamic biogenic amine and reticular formation neurophysiologist; aspects of “thought disorder” in the pathophysiology of schizophrenic patients may be an entropic sequencing idiosyncrasy in the “symbolic dynamics” of a particular brain system attractor; neuronal “bursting” may be the “intermittency” of a neurodynamical system; a multiplicity of “discrete ion channel conductances” may be a single “global scaling hierarchy” of conductances times. The number of published examples of this fusion of ideas and methodology in the biological-relevant literature is already in the several hundreds and Medline counts indicate is growing exponentially. Representative samples of these are described below. In addition to the technological advances in computational hardware and software, the major scientific surprise making this new era possible is the discovery of universalities, the finite set of behaviors characteristic of most, if not all nonlinear systems, across most if not all of the specific equations or neural systems being explored. This makes the emergence of semi-quantitative equivalence relations between model and data not only possible but likely, even though we don’t now and perhaps never will know enough to either write or solve completely the specific and detailed equations for the biological system of interest. We neuroscientists need not be apologetic for using these ideas and tools qualitatively and empirically. In fact, 188 unanticipated results of analog and digital computer experiments were responsible for most if not all of the discoveries underlying the current era’s revolution in applied nonlinear mathematics . Modern Applied Dynamical Systems Emerged from Accidental Computational Discoveries A medical student named Herr, in his thesis research with the “radio engineer”, Van der Pol (1926), was simulating cardiac electrophysiology with an analog device which permitted real time, exploration of a full range of parameter values long before there were fast enough digital processors to do so. Studying the behavior of equations of a periodically, pace maker, driven, nonlinear triode oscillator, Herr found orbital points that appeared to belong to two different periods simultaneously thus violating the uniqueness of solutions of differential equation theory. The Van der Pol relaxation oscillator equations, with their slow buildup and sudden discharge of membrane potential are good models for the slow-fast processes of repolarization and depolarization of Hodgkin-Huxley type equations (Rinzel, 1985). Periodically driven, nonlinear differential equations of the Van der Pol type are generally applicable to the multiplicity of dynamical regimes of neuronal dynamics (Carpenter, 1979; Aihara et al, 1984; Chay and Rinzel, 1985) and, with periodic and aperiodic driving and noise, can be made relevant to particular mammalian neuronal subsystems in the context of clinically relevant global electrophysiological phenomena such as Magoun’s (1954) brain stem evoked EEG and behavioral arousal (Nicolis, 1986; Selz and Mandell, 1992; Mandell and Selz, 1993). In the early 1940’s, using the pre-publication results of similar analog computer studies (Levinson, 1949), the Cambridge mathematicians, Mary Cartwright and Joe Littlewood (1945; McMurran, S., Tattersal, J.,1999) used geometric methods to prove that the highly nonlinear, periodically driven Van der Pol equations, depending upon one or two changing parameters, generated fixed point (“homeostatic”), periodic (“cyclic”), subharmonic (“period doubling”), quasiperiodic (“multiply periodic”), intermittent (“bursting”) and “deterministically 189 random” patterns. We now know such phenomena to be universal characteristics of bifurcation scenarios in nonlinear dynamical systems where bifurcation means discontinuous changes in patterns of behavior (dependent variables) resulting from smooth changes in parameters (independent variables). Alerted to their presence in computer experiments with biologically relevant nonlinear differential equations, these phenomena have since been found in time series from patch clamped membrane channels, single neurons, neuronal networks, neuroendocrine systems, brain waves and patterns of behavior in animals and man (see below). Cartwright- Littlewood found that the inner and outer edges of the domains of attraction (all the initial values that eventually wind up in the attractor—the limit set of all bounded solutions) of two different sets of subharmonic periods for the same parameter settings were interlaced at many scales in what is today called a fractal basin boundary. It was in this way that the specific values of the end state are understood to be indeterminate since the starting values in the fractal basin boundary are impossible to isolate and specify with adequate experimental precision. Similar biologically-relevant analog computer discoveries about the Van der Pol and comparable periodically forced, dissipative (energy utilizing) Duffing equations (Zeeman, 1976) were made in the early 1960’s by electrical engineer, Yoshi Ueda (1992), but his thesis director, Chihiro Hayashi of Japan’s Kyoto University, was sufficiently disturbed by this evidence for the existence of bounded solutions (attractors) that were neither fixed points (equilibria) nor periodic orbits (cycles), the only ones known at the time and therefore “strange,” that he refused to let Ueda publish his findings until he did so as an independent investigator in the 1970’s. In the early 1960’s, Edward Lorenz (1963), a meteorologist and student of the Harvard mathematician and dynamical systems pioneer, George Birkoff (1922), was computing the output of a very reduced subset of Saltzman’s differential equations for predicting the weather (1962). Lorenz found that numerically integrated trajectories manifested unpredictable times and directions of motion between the two spiral orbits of what has come to be known as the Lorenz attractor. Very small differences in starting values led to widely diverse final values, and, just 190 as importantly, far apart initial values could be found close together in the limit set. This behavior was called “sensitivity to initial conditions” by David Ruelle (1978; Ruelle and Takens, 1971). It is noteworthy, however, that over a range of values of the parameters, the overall pattern of the orbits of the Lorenz attractor results in characteristic geometric pictures as well as invariant statistical descriptors. Qualitative and quantitative global similarities were gained while specific solutions were lost in these “strange attractors” of nonlinear systems. Analog computer simulation of a simpler set of equations inspired by nonlinear chemical reaction kinetics led to the discovery by Rössler (1976) of another early and generic strange attractor combining sensitivity to initial conditions and characteristic geometries and measures. It was the Russian mathematicians, A.N. Kolmogorov (1957), Sinai (1959) and V.I. Arnold (Arnold and Avez, 1968), the French mathematicians, Rene Thom (1972) and David Ruelle (1978) and the U.C. Berkeley mathematicians, Steve Smale (1967) and his student, Rufus Bowen (1975), and their associates who gathered together these and other related computational discoveries and embedded them in a qualitative theory of nonlinear differential equations, using a variety of formalisms, including point set and differential topology, geometry, analysis and ergodic (having an invariant statistical description) measure theory that formally established the fundamentals for research in nonlinear dynamical systems. Here a dynamical system refers generally and simply to the components and nonlinear processes (transformations) that move points (values) in discrete (“map”) or continuous (“differential equation”) time around in an appropriately defined space. The phrase, “nonlinear transformation” in this context does not imply easily solvable curved functions, such as those representing the sigmoid kinetic or threshold functions of enzymes and neuronal networks or those that smoothly log transform the amplitudes of auditory or other sensory modalities in man, but rather allude to expressions containing products, powers and functions of the computational and/or 3 experimental variables , such as xx , ( x i or si n( x) . x i 1 2 ) 191 As noted above, the cross-disciplinary cohesiveness of such a vaguely defined field occurred as the result of the unanticipated discovery of a relatively small set of nonlinear phenomena, universalities, that implicated many fields of mathematics, from differential geometry to number theory, and were found in a broad range of physical and biological realizations, from turbulent plasmas and chemical and enzymatic reactions to neuroendocrine hormone release patterns. It is perhaps counter-intuitive but, whereas linear systems can generate an infinite number of solutions locating points anywhere the person writing the equations wants them to go, nonlinear systems are generally restricted to a finite set of global dynamics and these emerge on their own from the intrinsic dynamics of the system. Trying to make these systems follow orders, not unlike finding the most clinically effective dosage range of a psychopharmacological agent, require the empiricism of trial and error experiments. A second class of computational accidents involving nonlinear systems resulted in unanticipated coherence rather than unpredictable disorder. Using one of the early “high speed” digital computers at Los Alamos, MANIC I, Enrico Fermi with Pasta and Ulam (1955) attempted to obtain a many-body statistical thermodynamic equilibrium analogous to heat generated noise by coupling 64 particles together with nonlinear springs. They found only a few low period modes that oscillated indefinitely. Instead of equidistribution of the energy into 128 degrees of freedom (64 locations × 64 velocities in 128 dimensional phase space), they found it gathered up into only few coherent modes. Although the relevance to biological science of nonlinear multifrequency coherence is a bit off from our focus, it is worthwhile noting that a recent (Karhunen-Loeve) decomposition of the alpha band of the resting alert human EEG revealed only three dominant temporal–spatial modes: anterior-posterior, rotational and standing (Friedrich et al, 1991) and “few frequency coherence” is a frontier of inquiry in brain wave research. A heterogeneous collection of coupled nonlinear elements in the form of widely distributed, multi-location, multifrequency systems such as cross-cortical, brain stem-thalamic-cortical and interconnected spinal motor neurons can generate 192 coherent activity. This temporal and phase coherence plays an important role in current theory of sensory-associative-motor integrative function, how distributed attributions come together in the brain representation of a single object or process, in the context of the so-called “binding problem” (Singer, 1993; Bressler, 1995; Nicolelis, 1995; Schiff et al, 1997). Diffusely distributed neurochemical variables have been invoked. For example, the role of metabotropic glutamate receptors in driving the synchronization of interneuronal networks has been suggested as a mechanistic model (Whittington et al, 1995). The objects of relevance to the discovery of Fermi-Pasta-Ulam are studied as the nonlinear physics of nondissapative wave processes and are called solitons (Zabusky and Kruskal, 1965). They have been invoked to model nerve conduction and information transport in brain (Scott, 1990). A third counter-intuitive set of accidental computational findings is in an area of research called symbolic dynamics which involves the universal parameterdependent coding language and capacity of nonlinear systems. In the early 1960’s, a group around Stan Ulam at Los Alamos (Cooper, 1987) used one of the early “high powered” computers, MANIAC II, to iterate (letting the output of the action of a discrete time function serve as its input the next time around) simple equations they called “maps.” These reduced dimensional objects shaped like tents, sine functions and parabolas can be extracted from and represent the behavior of higher dimensional, nonlinear differential equations (see Devaney, 1989; Schuster, 1989 or Moon, 1992 for intuitive descriptions). They varied a parameter, such as the height of the tent or parabola, to systematically change the period and/or phase (order) of the symbol sequence (Metropolis et al, 1973). Normalizing the range of values of the output to [0,1] and transforming the series of values into a binary code, L ≤ 0.5 and R > 0.5, they found an invariant, one parameter dependent, progression of ordered periods, R, RLR, RLRR…RLLRL…, in all such single maximum maps. This “U (universal)sequence” has also been found as singly or multiply present in a variety of real systems, including complicated chemical reactions (Simoyi et al, 1982; Coffman et al, 1986). This means one can “dial” the parameter to generate “words” of sufficient computational complexity to serve as a language. These 193 computer experimental findings had already been anticipated in a remarkable mathematical proof by Sharkovskii (1964). The dynamical richness of these simple, single maximum, one dimensional maps was computationally explored in the context of ecological and epidemiological issues in the classical studies of Robert May (1976). It has been possible to relate the individually characteristic L, R sequence behavior of human subjects on a computer task to a unique parameter of a tent map generating those sequences which predicted age and discriminated subclinical obsessive compulsive from borderline syndromes (Selz and Mandell, 1993). The dynamical entropy of unstructured L,R behavior also discriminated a population of schizophrenic patients from normals (Paulus et al, 1996). More generally, parameter dependent dynamical coding, built into the universal behavior of its constitutive equations, is a mechanism with which a nonlinear dynamical system, such as nerve membrane equations as above, or in the aggregate, the middle layer of a completely connected neural network, can encode, Morse code-like, messages (Paulus et al, 1989). Bifurcations in Biologically Relevant Dynamical Systems Bifurcations, “splitting into (two) branches,” are observed over a smooth change in control parameter(s) (independent variables), as a discontinuous and qualitative change in the dynamical (time-dependent) pattern of the observable (Guckenheimer and Holmes, 1993; Wiggens, 1990; see Strogatz, 1994, for a particularly intuitive description). Qualitative here means how the dynamics of the trajectory appear as a geometric-topological (relative shaped not necessarily sized) pattern in phase space. In such a space, the orbital points are located along the x- axis by their value, x at time t, and along the y-axis by their time rate of change at that t, dx . To visualize a representative phase portrait in the plane, start by dt imagining the pattern made by mass hanging on a linear spring at rest as dx represented by a point centered at x = 0, y = = 0. When perturbed from rest, the dt 194 phase portrait of the motion of this “harmonic oscillator,” is composed of a (continuous) series of points representing its location, graphed along x , its rate of motion graphed along , y = dx dx . x and dt dt co-localize the circular orbit as it speeds up and slows down while it bobs up and down. The transition from a fixed point (the mass at rest) to a circle (the bobbing mass), a bifurcation in phase space, results in the loss of topological equivalence. That is, the phase space geometries before and after the bifurcation cannot be smoothly distorted into each other. Continuity and connectedness of the space is lost. For topological equivalence, stretching, bending and warping are allowed but not tearing apart and/or gluing together. Following the bifurcation of a fixed point into a circle, even limitless shrinking of the ring leaves a hole. The appearance or disappearance of an equilibrium fixed point (called a “saddle-node” bifurcation), splitting into two (“period doubling” bifurcation), its exploding into a circle (“Hopf” bifurcation to a limit cycle), a circle splitting into two or more incommensurate cycles (“secondary Hopf” bifurcation) and these multiperiodic (“quasiperiodic”) dynamics breaking down into a recursive spirals (“homoclinic bifurcation to chaos”) are among the common bifurcations in nonlinear dynamical systems, and all of them have been observed in many neurobiological settings. In the forced-dissipative (energetically driven and energy consuming) dynamical systems relevant to the neurosciences---this characteristic contrasts with the dissipation free momentum of the classical mechanics of astrophysical bodies--- there are four “most generic” bifurcation scenarios as a parameter changes that may, but need not, lead to chaos (see below for definition) (for early and physically oriented treatments see Eckmann, 1981; Ott, 1981; Berge′ et al, 1984, Kaneko, 1983). These scenarios are: (1) Fixed point or cycle splittings into twice-as-long period lengths 1→2→4→8→16…called the subharmonic or “period doubling route”; (2) The transformation of fixed points to one and then more periodic orbits, multiple independent (nonharmonic, incommensurate) frequency oscillations, their mode lockings and then breakdown called the “quasiperiodic route”; (3) Fixed point or cyclic equilibria metamorphosed into irregular bursting patterns called the “intermittency route”; and (4) In the context of quasiperiodic dynamics, adjacent 195 nonharmonic frequency encoding parameter spaces fusing, resulting in new periods that are the sums of their adjacent ones: period 2 + period 3 = period 5, in what is called the “period adding route”. Technically precise classification of bifurcations involve much more careful definitions and well studied technical constraints involving such issues as the symmetries and dimensionality of the system of observables, how many control parameters (“codimensions”) are required to reasonably realize the bifurcation and the particular way the fixed points of the system become unstable, all of which are directly explorable when the equations are known or can be hypothetically inferred from the qualitative behavior of real data. We note a few examples from the wide variety of bifurcating systems that can be found in the biomedical literature of interest for the biological sciences. With substrate input rate as the bifurcation parameter, the phosphofructokinase regulated glycolytic cycle in yeast extract was found to change among steady state, periodic and period doubling (subharmonic) regimes (Boiteux et al, 1975). Transitions between steady state, oscillatory and chaotic patterns have been reported in variety of physiological measures in man including respiratory rhythms and circulating blood cell concentrations over time (Mackey and Glass, 1977; Glass and Mackey, 1988 ) and models of dopamine cell dynamics (King et al, 1984). Flow rate parameter sensitive periodic, bursting and chaotic behavior has been found in a peroxidase-oxidase system (Olsen and Degn, 1977). A brain enzyme, substantia nigral dopaminergic tyrosine hydroxylase, manifested different saturation and fluctuation patterns, including bursting and periodicity, in experiments in which low (physiological) levels of tetrahydrobiopterin cofactor were the bifurcation parameters and adrenergic drugs were used as modulators (Mandell and Russo, 1981). All four of the generic bifurcation routes to chaos, period doubling, changing multifrequency (quasiperiodicity), period adding and bursting (called “intermittency”) were observed in self-sustained oscillations induced in the neural membranes of space clamped, giant squid axons that were immersed in a 550mM NaCl, and electrically stimulated over changing amplitudes and frequencies (Aihara et al, 1986; Takahashi et al, 1990). With external stimulus current level as the control 196 parameter, the R15 cell of the abdominal ganglion of the Aplysia demonstrates transitions between bursting and periodic modes as well as period doubling, a signatory period 3 and the Lyapounov characteristic exponent evidence (see below) for the discontinuous onset of chaos (Canavier et al, 1990). Manipulating feed back delay, the human pupillary light reflex will bifurcate into regular oscillations (Milton and Longtin, 1990). A transition between a regime of irregular discharging to oscillatory bursting behavior was induced in basal forebrain cholinergic neurons by neurotensin (Alonso et al, 1994). Sympathetic nerve discharge in decerebrate, ventilated cats demonstrated transitions between periodic, multiple periodic (quasiperiodic with changing ratios to the ventilation frequency) and subharmonic behavior in response to inferior vena cava occlusion, vagotomy, aortic constriction and spinalization (Porta et al, 1996). Period adding bifurcations were induced by changing calcium concentrations or the addition of a potassium channel blocker in the “pacemaker” formed when (rat) sciatic nerve is chronically injured (Ren et al, 1997). Changing levels of the L-type calcium channel antagonist, verapramil, alter the pattern of vasomotion of rabbit ear arteries among sets of multiple independent periods, “quasi-periodicity,” mode locking and chaos (De Brower, 1998). At critical intensity and frequency, flicker visual stimulation of the salamander generates a pharmacologically modifiable period doubling bifurcation in their ganglion cells (one spike for every two flickers) which is also seen subjectively and in occipital lobe evoked potentials at critical frequencies in bright, full-field flickered humans (Crevier and Meister, 1998). Qualitative and Quantitative Universality in Nonlinear Dynamical Systems “Universality” (see above) entered the parlance of physics in the context of the statistical mechanics of phase transitions near their critical points (Stanley, 1971; Stauffer, 1985; Yeomans, 1993) and has come to refer to the finite set of transitions and quantities common to nonlinear systems arising in their neighborhoods. A common physical example is the triple point of water-ice-steam on the temperature-pressure phase plane where a small change in temperature or 197 pressure leads to a global qualitative change in physical state. Analogously, the loss of topological equivalence occurs at the fixed point that, for examples, splits into two or explodes into a cyclic orbit in phase space. The same critical point behaviors and quantities occur in a wide variety of specific processes and their equations, and they are independent of the way the trajectories first arrived in the fixed point neighborhood. Once the system enters the regime of critical behavior, the predictive significance of its dynamical history is lost. This may also be the case for emergent psychiatric disorder (Mandell et al, 1985; Mandell and Selz, 1992; Ehlers, 1995; Paulus et al, 1996; Huber et al, 1999). There are diagnostic patterns of behavior when a nonlinear system is in a neighborhood of a potential bifurcation. They include sudden and/or large jumps resulting from a small change in experimental conditions, the appearance of big baseline fluctuations (anomalously large variance), the lengthening of the time required to relax following evoked or spontaneous perturbation (‘”critical slowing”), the same global change in state occurring at different values of the parameter when increasing versus decreasing a parameter’s value (“hysteresis”), the existence of some range of values of the observable that cannot be attained by manipulation of the parameter (“inaccessibility”) and the availability of two or more distinct states in the same parameter neighborhood (“modality”) (Thom, 1972; Arnold, 1984; Gilmore, 1981). It is perhaps relevant to polydrug psychopharmacology and clinical management that the higher the co-dimension (the greater number of effective parameters being manipulated), the greater the accessibility and control of selected state stability becomes with respect to difficult to obtain behaviors. Examples of the potential advantages of simultaneous manipulation of multiple influences have been developed for affect disorder and anorexia nervosa (Callahan and Sashin, 1987). As evidence for the independence of critical behavior from specific history, the qualitatively universal bifurcations along the four canonical routes to chaos manifest dimensionless ratios of parameter and phase space geometries between bifurcations. These ratios are quantitatively universal. The formalisms that rescale the distances from fixed points in parameter and observable spaces result in the same picture across scale, a dilatational symmetry (also called self similarity or 198 affinity). They are called renormalization group equations, and, with respect to prediction, they replace any or all of the original specific predictive equations for the particular system under study (Cvitanovic′, 1989). Whereas the U sequence and critical point behaviors are manifestations of qualitative universality, these scaling numbers are manifestations of quantitative universality. We discuss them here because their omnipresence in computationally realized differential equations as well as physical and chemical experiments along with their quantitative specificity (with values in all systems as “constant” as π) constitute a most persuasive argument for the substantiality of modern dynamical systems approaches to brain and other biological research. The physical and physiological requirements for manifestations of these universal bifurcation scenarios can appear to be remarkably minimal. In physics, for example a full panoply can be observed in a “dripping faucet” (Shaw,1984). Similarly, a small piece of extirpated and perfused myenteric or femoral artery will demonstrate these transitions in vasomotion spontaneously and almost independent of flow rate (Stergiopulos et al, 1998). Feigenbaum discovered that in dynamical systems manifesting a series of period doubling bifurcations, the ratio of the parameter value at which the next period doubling bifurcation occurred relative to the last one ≈ 1 … and the ratio 4. 6692 of the magnitude of the spawning point to the one spawned ≈ 2.5. (Feigenbaum, 1979). By “rescaling” distances along a parameter value (see below) using what is called a “universal renormalization operator” the geometric situation around each bifurcation point (though of different absolute “size) remains relatively the same. In intermittent systems, burst length varies as the inverse square root of the distance of the value of the parameter from that value that elicited the fixed point (Manneville and Pomeau, 1980). The universal characteristic of the third common parametric route to chaos, quasiperiodicity, is that the ratio of independent frequencies found most resistant to mode locking and breakdown into chaos is, ω i = 1.618… the ω i+1 number to which the ratio of adjacent Fibonocci numbers converge (1, 2, 3, 5, 8, 13, 199 21…) (Shenker and Kadanoff, 1982). Similar quantitative scaling properties were also discovered in the parametric period adding route (Kaneko, 1983). All of these scaling numbers have been found in experiments and in remarkable agreement with theory. Examples have been discovered in electronic circuits, hydrodynamic and mercury flows, acoustic systems, laser dynamics and oscillating chemical reactions (see Cvitanovic′, 1989, for representative list of references). Whereas qualitative evidence for all of these bifurcation scenarios have been found in brain relevant experiments, there is yet to be a bifurcating experimental biological system with adequate precision across a sufficient range of magnitudes such that quantitative universality could be demonstrated across a sufficient range of values to be convincing. We remind ourselves that in order to establish a Fiegenbaum number, each period doubling bifurcation of the several required necessitates about a five-fold improvement in the experimenter’s ability to specify the control parameter. Using Invariant Measures of Dynamical Neurobiological Systems Before the modern era of dissipatively forced (energy utilizing) dynamical systems research, the known attactors of an experiment’s initial values resulted from their convergence onto either a fixed point or a limit cycle. An attractor can be regarded as a set which remains in bounded space and to which all orbits in this neighborhood converge (Milnor, 1985). Since by the rules of differential equations, orbits are required to be both smooth (graphable without lifting the pencil) and unique (different trajectories don’t intersect since the point of intersection would no longer be unique), the foundational Poincare-Bendixon theorem says that any such orbit confined to a two dimensional phase space that doesn’t converge to a fixed point must, no matter how long it irregularly wanders, must, eventually intersect with itself and then go around the same route again in a (perhaps very long) cycle. In most neuroscience research as well, we have generally regarded our data as manifesting either tolerable (or intolerable) fluctuations around mean values (fixed 200 points) or more or less regular cycles. We analyze our “fixed point” data using quantities such as the mean and variance of distributional statistics and the cycle data using the amplitude, frequency, cycle length and phase of trigonometric functions. In central tendency-oriented research, rare, very high amplitude events have usually been considered aberrations and tossed, and imperfect periodic behavior is treated by “cosiner analysis” as regular cycles contaminated by measurement or system noise. Whereas technically, chaotic dynamics must live in dimension greater than two (for orbits to be more than a fixed point or limit cycle, able to snake around without necessarily intersecting ), the Lorenz attractor has dimension just a little over two, our difficulties with establishing the “true” physiological dimension of real biological observables (see below) makes such a consideration more theoretical than practical. The orbits of a forced-dissipative dynamical system in a parameter regime engendering chaos, converge onto an attractor which is neither a fixed point nor a limit cycle, thus the origin of the name “strange attractor” (Ruelle and Takens, 1971). It was James Yorke that first named these dynamics “chaos” (Li and Yorke, 1975). The necessarily statistical properties of the chaotic orbits on strange attractors follow from the generic characteristics of their motions (see Shaw, 1981 for a still conceptually current, non-mathematical treatment). These kinds of statistics are studied in a research context called the “ergodic theory of dynamical systems” (Ruelle, 1979; Eckmann and Ruelle, 1985). Ergodic is a word used to characterize a system with (or without) a particular condition placed on its statistical measures: the existence of an invariant measure which is undecomposabile into two invariant measures and, equivalently (though not obviously) one in which the time average equals its average in the geometric space into which it is embedded. One may arrive at the same ergodic measure from studying a single very long orbit or from summing across many individual but shorter orbits. This ergodic equivalence is made possible due to the definitional existence of at least one invariant statistical measure and the dynamics of the system which ideally include a uniformly, sequence disordering process called “mixing” (see below). 201 Of course, most real biological dynamics are not uniformly mixing and so are non-ergodic, but we shall see that the ways they fail to be ergodic (and thus remain in the conceptual context of ergodic measures) are descriptively useful (Mandell and Selz, 1997a). The emergence of many statistical approaches to characterizing these motions have been accompanied by the expected controversies about which is best or correct (see below) and have been applied to the problem of diagnosis and clinical discrimination in a variety of neuroscience settings. In ideal abstract chaotic dynamical systems called Axiom A (Russians called them “C systems”), where most mathematical theorems are proven (Smale, 1967), all these measures, if properly computed, are equivalent. In real life, as in the related case of ergodicity, they are not, and since no single one is complete, the more (incomplete) measures we use in our studies along with interest in the way that they differ, supplies more useful information about the system. Though researching and elucidating the most reliable and valid ways of computing these measures are a valuable goal, the current debates focused on the superiority of a single particular measure, constructed in a particular way in relationship to issues of insoluble absolutes like “randomness” versus “deterministic chaos may not be particularly valuable for uncovering new characteristics and potential mechanisms underlying a specific set of real neurobiological observables. Emphasizing diversity and relevance to the clinical biological sciences, we note that quantifying patterns in ergodic (non-ergodic) measures have aided: the discrimination between normal and abnormal opticokinetic nystagmus in neurology patients (Aeson et al, 1997); localizing a two year old subcortical stroke in an EEG of a patient with no other signs or neurological findings (Molnar et al, 1997); the diagnosis of early (not late) multiple sclerosis, as a nonspecific long tract disorder, in patients with mild optical neuritis using cardiac rate dynamics (Ganz and Faustman, 1996); seizure prediction from minutes to hours before the event in which subthreshold, pre-phase transition spatial diffusion and oscillations in characteristic changes in these measures can be found (Martinerie et al, 1998; Elger and Lehnertz, 1998; Pign et al, 1997; Iasemidis et al, 1990); using these measures on the EEG to differentially predict hereditary predisposition to alcoholism 202 (Ehlers et al, 1995); indicating the presence or absence of septic encephalopathy (Straver et al, 1998); using time series from jejunal manometry to discriminate objectifiable somatic from psychological conversion related irritable bowel syndrome (Wackerbauer et al, 1998); analyzing time-dependent patterns in plasma hormone levels to discriminate between the presence or absence of a functioning tumor (Hartman et al, 1994, Mandell and Selz, 1997a); automated differentiation of ataxic from normal speech (Accardo and Menulo, 1998); and discrimination of temporomandibular joint dysfunction from normal patterns of chewing motions (Morinushi et al, 1998). Styles of Orbital Motions in Chaotic Dynamical Systems In chaotic dynamics, in various specific ways, an initial hypothetical handful of points lined up along the trajectory and acted on over time by the nonlinear differential equation (“operator”), get out of order in an unpredictable way. Here the hypothetical handful can come from a statistical aggregate of initial conditions or from a single recursive orbit studied over long times. As noted above, ergodic theorists call this getting out of order “mixing” and how and to what degree this happens consumes many mathematical theorems but for purposes of brain research, it can be best described using a variety of statistical measures. For example, visualizing the Lorenz attractor (see above) as a butterfly in phase space, the points get out of order because as they spiral out (“stretching”) to the edge of one wing and return (“folding”) to the unstable fixed point on the butterfly’s body whence they either jump to some place on the other wing to spiral to its edge or return to the same wing to spiral out again. Which one of these is chosen is exquisitely sensitive to very small changes in where the trajectory started and very small fluctuations in where it returned to the unstable fixed point on the butterfly’s body. In fact, specification of these locations is beyond the precision of any real, thermodynamically vulnerable system. Chaotic trajectories on the Rössler attractor (see above) wind out (“stretch”) to the edge along the inside of a conch shell in phase space and then are mapped 203 back (“fold”) into the spiral unpredictably somewhere in a mixing mechanism that has been called “displaced reinjection.” In the slow-fast oscillations of the forced van der Pol in the chaotic regime, points in the slow phase (“repolarization”) jitter around and step on each other’s heels, getting out of order while waiting on the ledge before jumping (“depolarization”) to the next slow phase (“repolarization”) at some unpredictable time, thus generating a variably irregular series of interspike intervals. Stretching and folding are also responsible for getting points get out of order in the single maximum map of the unit interval (studied for universal qualitative and quantitative properties by May and Feigenbaum and others as described above). With increases in parameter values, the parabolic hill function onto which the unit line has been stretched gets steeper, more stretched. Mapping points on the hill back onto the straight line of the unit interval results in what amounts to the line folding back on itself. This stretching and folding eventually fills the line with points, but their sequence, from end to end, gets shuffled like a deck of cards. As described more generally above, points that start as neighbors may get separated (“divergence along the attractor”) and those that start at a distance from each other may be thrown together (“compression back onto the attractor”). These expanding and folding motions that characterize the chaotic behavior on strange attractors have been likened to the actions of a taffy puller (Rössler, 1976). It is in this way that nearby points can separate without leaving the attractor. It is also the case that once indistinguishably close but then separated points may be compressed together again generating new, temporary (unstable) cycles of all possible period lengths. These unstable fixed points may be the most important feature of chaotic systems from the standpoint of new ideas about brain mechanisms (Pei and Moss, 1996; So et al, 1997). This aggregation of unstable loops can occur from points fluctuating away and back to the attractor as well as during the crowding of points at the turns after their stretching out on more linear parts of the flow. Under the mixing flow of a chaotic dynamics, it is also true that a single point eventually explores the entire attractor, no attractor location is inaccessible to it. 204 Although counter-intuitive when expressed in words, the trajectories that one sees in the graphics of chaotic attractors result from the actions along the “unstable” directions of the stretching distortion; the actions in the otherwise invisible stable directions “iron down” the points onto this unstable manifold (n dimensional abstract surface). As might be expected from this set of characteristic motions, the diagnostic triad of chaotic dynamical systems are: (1) Sensitivity to initial conditions—tiny distances between starting points are magnified and large distances between starting points are reduced under the stretching and folding actions of the system; (2) The presence of a theoretically infinite but countable number of unstable periodic orbits of theoretically all period lengths—points in phase space can be viewed an attractive-repellers, visited and left by the orbits recursively as the dynamics proceed; and (3) Indecomposability—the attractor is not separable into isolated regions and no points escape (see Devaney, 1989, for one of the clearest definitions). Of particular relevance to information encoding and transport by brain mechanisms, it is important to visualize that new information in the form of unstable periodic orbits is being created as well as destroyed by the dynamics. The logarithmic rate of formation of these new orbits is computed as the system’s topological entropy (see below). Assuming the real neurobiological system under study is behaving in these ways (and often much has to be done to help justify such a claim), the observables take the form of an irregular and/or episodic time series of amplitudes, as in repeated sample, neuroendocrine studies of plasma hormone levels (Veldhuis and Johnson, 1992) or a sequence of times between events as in neuronal interspike intervals (Katz, 1966; Perkel et al, 1967). These time or time-sequence series are generally studied from three relatively distinct yet complementary quantitative perspectives: (1) As stochastic (“random”) processes with various amounts of sequential dependency (autocorrelations) and scale (sample length) dependencies; (2) As “deterministic” smooth or discrete, vectorial geometries in phase space following reconstruction and/or embedding of the series as phase portraits or return maps; (3) As information generating and transporting, topological (about relative 205 nearness and sequential order not absolute distances), symbolic dynamical processes which as either (1) or (2) can be analyzed with respect to its various entropies, algorithmic complexities and word content and syntax. A variety of techniques aimed at deciding between the relevance of one or another of these underlying assumptions (such as series and Fourier phase shuffling to destroy statistical autocorrelations and vectorial continuities but leaving the probability density distributions intact ) may at times help emphasize one or another of these orientations in the analyses (see Ott et al, 1994 for a collection of articles on this topic). Nonconvergent Distributions and Power Law Scaling in Biologically Relevant Time Series The statistical distribution with which most of us are familiar is the Gaussian which can be generated by summing and averaging a series of independent random events. The average behavior head/tails probabilities observed by one person flipping a fair coin for a very long time or by many people flipping similar coins for shorter times converges upon the invariant measure of 0.5. The variance, “second moment” in the distribution of a population of coin flipping sequences will be finite and computable. In a graph of this distribution, the tails will converge to the x axis in a Gaussian exponential manner. The longer or the more numerous the “sample” series of observations, the closer they will approximate the “ergodic” invariant measures representing the true “central moments” of the behavior of this “population” of fair flipping coins. Since the coins are not changing their relevant characteristics over the time of observation, we say that the series is not time dependent but instead is “stationary.” Computation of correlations over increasing lags to determine how much and for how many flips the sequences continue to resemble themselves yield an exponential decay with a single characteristic correlation length. This reflects the existence of a finite variance from which its amplitude is derived and serves as the single characteristic temporal scale of the random process. 206 Before describing the relatively new set of measures of biological processes designed to find and quantitate what are assumed to be relatively sample size insensitive, distributionally nonconvergent and multiply correlated processes that are without a single time or space scale, we should remind ourselves that there is already much more apparent “order” in a generically random situation than our intuitions would lead us to believe. For example, if we keep cumulative scores in a competition between heads and tails and determine the distribution of trials between those in which the number of heads and tails are even, we will get periods between zero crossings of many lengths with very short ones and very (very) long ones being most statistically prominent. The distribution of these wavelengths is shaped like a symmetrically fat-tailed, bowl (Feller, 1968). As another illustration, expected runs of heads or tails in this Gaussian random task are longer and more frequent than we might suspect. It has been proven that the expected run length grows with n coin flips (as an order of magnitude estimate) like the logarithm (for a fair coin, base 1/p = 1/0.5 = 2) of n. For example, in 512 ( e.g. 2 9 ) tosses, we cannot report a run of 9 heads as a evidence for a biased coin or the sign of some deterministic coin tossing mechanism (Erdos and Renyi, 1970). If we had a 0.6 head biased coin, then the observation of a run of 13 heads couldn’t dissuade us from a random mechanism! Unlike our random coin task, the variances of many, perhaps most, time series of biologically-relevant events, do not tend to converge onto a limiting value as sample size, n, grows, but rather continue to increase (or decrease) with n in a scale invariant manner. Instead of “regressing to the mean” with increasing sample length or number, the likelihood of a larger deviation than previously observed increases with n. Analyses of inter-event intervals reveals a multiplicity of characteristic times. One interpretation of these finding might be that this represents evidence for the inherent “nonstationarity” of biological mechanisms as reflected in, for examples, the frequency of saccades concomitant with ceaselessly shifting foci of visual attention (Steriade and McCarley, 1990), or our inability to not think of “white bear” when so instructed (Wegner, 1994). Hermann Haken, the father of laser-inspired “synergetics,” has said that biological mechanisms are not in a steady 207 state for very long, spontaneously and irregularly jumping from one unstable dynamical state to another (1997). This suggests that meaningful tension between experimental sample lengths long enough to minimize statistical error and short enough to be stationary may be, for the biological sciences, more apparent than relevant. The studies reviewed below exploit measures arising from the view that the noisy statistics of nonstationarity in biological processes are not a sign of measurement error, but rather evidence consonant with the statistical physics of nonequilibrium states and phase transitions (Stanley, 1971; Stauffer, 1985; Yeomans, 1993). Very high amplitude fluctuations and multiple, up to infinite, correlation lengths are characteristic of the normal, on-going biological dynamical behaviors, which are apparently without characteristic amplitude and time scales. From this point of view, if most or all information is widely distributed in the brain (e.g., serial order of visual tasks involving motor cortical neurons, Carpenter et al, 1999) ) then the “binding problem” (see above) could also be solved by multiple, up to infinite spatial and temporal correlation lengths in place of the current theories of monofrequency resonances (Singer, 1993). Hierarchical neurodynamical mechanisms communicating across many mechanistic temporal and spatial scales, brain information transport analogous to the energy cascade of hydrodynamic turbulent velocities (Tennekes and Lumley, 1972), would be likely in the parametric vicinity of incipient bifurcations and phase transitions. Three closely related techniques for quantifying the systematic changes in average fluctuation amplitudes with n (scale, sample length) involve a “power law,” linear slope relationship between the logarithm of an index of variability and the logarithm of sample segment sizes. These easy, yet powerful methods were brought to experimentalists’ attention by Benoit Mandelbrot (Montroll and Badger, 1974; Mandelbrot, 1983; Fedor, 1988; Bassingthwaighte et al, 1994; Liebovitch, 1998). To estimate the exponent in Hurst rescaled range analysis, we compute the standard deviation and the range of the deviation of the running sum from the mean on sequential subsamples of increasing size. The Hurst power law exponent is the slope of the straight line formed by graphing the logarithm of the subsample length 208 along the x axis and the logarithm of the ratio of the range to the standard deviation on the y axis. An independent random system has a Hurst of 0.5. If a sequential increase or decrease in an amplitude or inter-event time tends to be followed by a change in the same direction, the Hurst > 0.5. If an increase in the measure tends to be followed by a decrease, then Hurst < 0.5. Computation of the Fano factor (power law exponent) exploits the same general strategy using the variance/mean in place of the range/variance and counting the number of events (such as single neuron discharges or heartbeats) in time windows of increasing length, generating a similar log-log graph. There is a relatively long history of the use of spike-number variance-to mean ratio in studies of response variability in visual cortical neurons (see Teich et al, 1996 for a review). The Allen factor (power law exponent) tends to reduce the influence of local trends by a computation of the variance of the difference between the number of events in two successive time windows divided by twice the mean number of events in the window. Each system’s invariant logarithmic slope across sample segment sizes takes the place of its missing finite variance in characterizing experimental data in which the distributional tails do not converge (or do so very slowly) to the x axis. Recent approaches to these measures in the context of stochastic analysis of DNA sequences, but also applied to normal and pathological cardiac inter-beat intervals and gait interval sequences, have dealt with the influence of non-stationarity due to apparent trends in the data on α-equivalent indices by local mean-normalization of the fluctuations at each window size (Peng et al, 1993; Peng et al, 1995; Hausdorff et al, 1995). The rate of decay of the densities in the tails of the probability distribution as they approach extreme values along the x axis, called the Levy exponent when represented in Fourier space (technically, as a “characteristic function” of the probability distribution) (Shlesinger, 1988; Shlesinger et al, 1995), can also be computed directly on the distribution by fitting the tails with a two parameter curve quantifying their “fatness” and rates of decay (Mantegna, 1991). We can speak of a Gaussian tail as having an exponential decay rate representable by α = 2 implying 209 finite variance. A tail with a nonconvergent decay rate of 1 < α < 2 indicates nonfinite variance in the data such that the usual “normal curve” derived, standard deviation dependent tests of statistical significance are without meaning. α < 1 indicates the data is without a consequential mean and will require the use of interquartile measures to locate the center of the distribution (Adler et al, 1998). Recalling that the Hurst, Fano and Allan indices are invariant across sample segment size, we remind ourselves that, as is the case in the finite mean and variance, α = 2, Gaussian, any of the other “α tails” also retain their value (“shape”) across all partitions that might be used to sort and sum the observable. This property is called convolutional, α, stability. In passing it should be noted that the last outpost of convergence of a probability density distribution with α = 2 is called “log-normal,” in which the tails along the x axis are “pulled in” by the variable being plotted as its logarithm. A Hurst exponent of > 0.5 in the data is associated with a Levy exponent of < 2.0, and both would be indicative of a process in which the characteristic style of change, rather than decay with some finite correlation length, would persist across all time. Using a bursting neuron as a generic example, a short interspike interval would, on the average, be followed by another short one and a long one by another long one, and this behavior, unlike our fair coin flipping sequence of observables, would not become uncorrelated with itself even over infinite time. Another way to represent this infinite, innumerably lengthed, correlation property is via its implicate frequency (inverse wavelength) content by computing its best fit assortment (along with their densities) of a range of short to long sine waves forming the Fourier transformation of the correlation function. The condition of correlated fluctuations across many measured temporal scales yields yet another power law slope when graphed as the logarithm of its range of frequencies, f, plotted along the x axis, versus their corresponding amplitudes squared, powers, plotted along the y axis. Naming this spectral power law exponent β, the system’s characteristic scaling law is usually expressed as 1 (Fedor, 1988; Hughes, 1995; Shlesinger, 1996; β f 210 Liebovitch, 1998). We see that the Hurst exponent, Fano and Allen factors, Levy exponent and power spectral scaling exponent are kindred statistical descriptors. They are most usefully applicable to systems with distributions that fail to be Gaussian or asymmetrically Poisson, the latter from random data sequences with only positive x values, thus backed up toward zero by a minimum inter-event interval or amplitude. These time series are sequentially dependent, not conventionally stationary, without finite central moments and with self-correlations that don’t demonstrate Gaussian exponential decay with sample length or time. The following are some examples of the use of these measures in studies of biological dynamics. . Examples of Biological Data with Divergent Distributions and Power Law Scaling A paradigm challenging group of experiments involved models and measures of the distribution of characteristic open and closed times of membrane ion conductance channels. The usual approach to this problem assumed the existence of a small set of distinguishable channel types that were reflected in discrete conductance events with a small set of characteristic open and closed times. The distributions of each of could be fitted with its own, Markov process derived, exponential. With technical advances and improved temporal resolution, more characteristic times and their associated α = 2 exponentials were reported with as many as three not being unusual. Liebovitch (and Sullivan,1987; 1989) used analogue to digital transformation of current recordings from the unselective corneal epithelial channels and voltage dependent potassium channels in cultured mouse hippocampal cells at temporal resolutions ranging from 170 to 5000 Hz and found similarly shaped, α < 2, nonconvergent distributions across temporal scales. This led these investigators to suggest that, related to the >16 recorded magnitudes of characteristic times, from picoseconds to months, in autonomous protein motion (Careri et al, 1975; Gurd and Rothgeb, 1979), that there was an “α stable” hierarchy 211 of lifetimes of states, observable at almost any temporal resolution that methods would allow. Early and representative studies comparing the fit of the data with hierarchical scaling functions versus a sum of a small number of Markovian exponentials included studies of a calcium activated potassium channel in human fibroblasts (Stockbridge and French, 1989) which yielded evidence to support both models, as did studies of membrane conductances in corneal epithelial cells by another group (Korn and Horn, 1988). In a systematic comparison of scaling and Markov exponential modes of the gating kinetics of GABA activated chloride channels, acetylcholine activated end plate potentials, calcium activated potassium channels and fast chloride channels (McManus et al, 1988), it was found that the latter fit the data best in most experiments. Similar results were reported in studies of the glutamate and delayed rectifier potassium channel with respect to distributions of open and closed times (Sansom et al, 1989). Space does not permit a systematic account of the continuing debate and conflicting studies about these representations and the implicit biophysics of discrete, finite versus continuous, hierarchical channel event heterogeneity. It is interesting that recent experiments making use of Hurst rescaled range analyses of time series of whole cell membrane voltage fluctuations (without the assumptions and current renormalizing procedures associated with patch clamping) have yielded additional evidence for multiply correlated, Hurst > 0.5, α < 2 power law behavior of what some might regard more generally as a protein relaxation time mediated hierarchical array of ion conductance behaviors (Liebovitch and Todorov, 1996). Following the discovery of (very) subsaturating (“far from equilibrium”) rat brain levels of the common cofactor for tyrosine and tryptophan hydroxylases, tetrahydrobiopterin (Bullard et al, 1978), studies of amino acid substrate saturation functions and time courses determined at these low, physiological co-reactant concentrations manifested patterns of hierarchical multiplicity and discontinuities suggestive of bifurcations and time-dependent fluctuations with fractional (hierarchical) time scaling exponents that were sensitive to psychotropic drugs 212 (Mandell and Russo, 1981; Knapp and Mandell, 1983; Russo and Mandell, 1984a; Russo and Mandell, 1986). Similar bifurcating and power law kinetics were found in receptor-ligand binding systems (Mandell, 1984) which were confirmed by more recent studies of diffusion-limited binding kinetics with receptors immobilized on a biosensor surface (Sadana, 1998). Hierarchical kinetics have also been reported in time courses of drug and metabolite levels (Koch and Zajcek, 1991), tissue tracer washout studies (Beard and Bassingthwaighte, 1998), carrier mediated transport processes (Ogihara et al, 1998), general pharmacokinetic functions (Macheras et al, 1996) and biochemical networks (Yates, 1992). It is likely that bifurcating and hierarchical, power law kinetic functions will be studied more commonly in the chemical literature in general (Shlesinger and Zaslavsky, 1996; Berlin et al, 1996) as well as applied to a variety of protein-mediated biological functions (Dewey, 1997). The first demonstration of and stochastic model for nonconvergent distributions of interspike intervals of a single neuron was by Gerstein and Mandelbrot (1964). Though rich with possibilities, it has been only very recently that additional work from this point of view has been published. This is likely due to the fact that most neuroscience oriented statistical packages, with rare exceptions, are without techniques for computing descriptive parameters for these divergent probability density distributions. This has not been the case for economic time series, download STABLE from http:///www.cas.american.edu/~jpnolan. Recently, applications of the Fano and Allan factor as well as power spectral scaling exponents to observed and shuffled series of spike counts and interspike intervals in the auditory and visual systems (including spatial and/or time resolved single unit recordings in retinal ganglion, lateral geniculate and lateral superior olivary cells as well a auditory nerve fibers) demonstrate the characteristic behavior of nonconvergent, hierarchical stochastic systems (Teich, 1989; Teich et al, 1990; Lowen and Teich, 1992; Kumar and Johnson, 1993; Kelly et al, 1996; Teich et al, 1997). These statistical techniques are well suited to the characterization of the irregularly intermittent bursting patterns generic for activity in single neurons as well 213 as in nonlinear equations representing them and other brain processes (Mandell, 1983). An early study of power spectral scaling in the EEG reported alpha band fluctuations that extended a 1 , β ≈ 1 pattern to 0.02 Hz (Musha, 1981), as did β f other applications of the log-log power spectrum to the EEG in man (Hu and Hu, 1988; Prichard, 1992). This power law scaling led naturally to the suggestion that the range of frequencies available in the electromagnetic signal from the calivarial surface extends far beyond those currently appreciated and may be available for study using relatively noise free recording techniques such as the magnetoelectroencephalogram (Mandell and Selz, 1991). A not surprising range of intrinsic correlation lengths reflected in Hurst > 0.5 and/or Levy exponents < 2 have been reported in lamb fetal breathing patterns (Szeto et al, 1992). The exponent has been shown to be sensitive to maternal alcohol intake in humans (Akay and Mulder, 1998), rat neonatal motoric activity (Selz et al, 1995), and nuchal atonia duration sequences (associated with putative intra-uterine REM sleep) in fetal sheep (Anderson et al, 1998). Sequential amplitudes in 1 Hz stimulated soleus spinal cord H-reflex demonstrated a 1 , β ≈ 0.83 in control subjects and, reflecting the decrement in β f correlations, by 0.31 in patients with losses in supraspinal input from spinal cord injury (Nozaki et al, 1996). Whereas the sequences of fixation times in eye movements of normal control subjects reading difficult material demonstrated an exponentially decaying distribution, those of schizophrenic patients demonstrated a power law tail, consistent with more sequential correlations (Yokoyama et al, 1996). This finding may be related to the appearance of velocity arrests, runs of sticky fixed points, in a spatially oscillating target task, called “smooth pursuit eye movement dysfunction” in schizophrenic patients which has been modeled as a parametric disorder in a periodically driven nonlinear dynamical system (Huberman, 1987). The “short time fractal dimension” has been used to discriminate acoustic signal transformations from the speech of normal subjects and ataxic patients (Accardo 214 and Mumolo, 1998). Spontaneous changes in the apparent syllabic sound made by regularly presented, word-like auditory stimuli emerge irregularly, the duration of perceived sameness demonstrating a power law distribution of “dwell” times (Tuller et al, 1998). The same kind of power law distribution of characteristic “brain times” can be found in studies of gait cycle durations in normal walking (Hausdorff et al, 1996) with a decrease in this locally detrended, α-like index compared with controls (0.91±0.05) in patients with the basal ganglia disorders of Parkinson’s (0.82±0.06) and Huntington’s (0.60±0.04) Diseases (Hausdorff et al, 1998). Hurst > 0.5 has been speculated to more accurately quantitate the fundamental time structure of cells that was previously called circahoralian (ultradian) intracellular rhythms (Brodski, 1998). Reconstructions of Time Series as Orbital Geometries Rene Thom (1972), extending the ideas of Poincaré and D’Arcy Thompson (1942), argued that experimentally useful, intuitive connections between the qualities of biological processes and the quantities of an explicit (equations known) or implicit (equations unknown) dynamical system could be best achieved through the use of graphic representations of their geometric and topological forms. Notably successful examples can be found in the work of Thom, Arnold (1984) and Zeeman (1977), who were inspired by “caustics” (the shapes made on surfaces by the coincidence of reflected or refracted light rays) and Whitney’s representation of parametric manifolds (surfaces) by the shadows they would make on a plane when back lit (Whitney, 1955). This led to a small number of qualitatively predictive, number-of-independent-parameters dependent shapes, such as “folds” “cusps” and “wavefronts.” Experimentally crossing the values of these independent variable forms at their singular boundaries successfully predicted discontinuities in the otherwise smooth alterations in the dependent variable; i.e. bifurcations (“catastrophes”) in the behavior of the observable. This approach was best suited to the study of systems with many independent variables and one dependent variable that could be mapped on the axis of the latter to represent a continuum of 215 operationally defined “energy states.” Smooth changes along the path of the nonlinear parameter manifold generated discontinuous changes in energy levels indicating states of the observable. Crossing a wrinkle in an “independent variable” (some call it “order parameter” to indicate its emergence rather than availability for predictable manipulation) such as the nonlinear parameter surface of the countervailing influences of survival fear and financial cost, may lead to a bifurcation in behavior from peace (“low energy”) to war (“high energy”) (Zeeman, 1977). In a similar geometric spirit but dealing with nonequilibrium systems in motion, the conditions such that one could “smoothly” embed a trajectory like a continuously recorded EEG record, a complicatedly coiled snake into a three or higher dimensional box without loss of its essential dynamical or statistically measureable properties, was settled by Whitney in what is now referred to as the “embedding theorem” (Whitney, 1936). Starting with a tangled knot of overlapping vectorial orbits with apparent “non-invertable points” (given a point, one cannot chose among or between the more than one point that it apparently came from), it can always be unwrapped into a non-crossing trajectory satisfying uniqueness when reconstructed in a box of a little more than twice the parameter-determined dimension of the original space of observables. A common technique for the spatial reconstruction of the output of a dynamical system is called a “time delay embedding.” This approach, first suggested by Ruelle (1987, pg. 28) replaced the value, x, versus the time derivative, dx , phase portrait plot described for a continuously perturbed bob on a dt spring above. A sequence of observables over time, in, for example, three dimensional “phase space” (Packard et al, 1980; Takens, 1981; Sauer et al, 1991), is depicted by a curve representing the system’s trajectory at times t 1 , t 2 , t 3 , by sliding one-by-one down the series and plotting each p 1 , p 2 , p 3 , location with respect to each other along the x, y and z axes respectively. The choice of time interval between the points, the delay, can be delicate and usually some standard fraction of the decay time of the sequence’s autocorrelation length, “the decay time of mutual information” is chosen. There are many technical considerations, 216 including those involving the choice of the embedding space vis a vis the “true” dimension of the attractor. This becomes an issue when, for example, the attractor shrinks over time to some subspace of the initial embedding (Liebert et al, 1991 and references therein). If we imagine the process of time series reconstruction to inscribe an attractor’s untidy ball-of-string of recurrent trajectories in three dimensions, we can then, by making the z-dimension a constant value, cut the ball with a two dimensional plane, a “Poincaré surface of section.” This could yield a roundish cloud of discrete points on the x,y plane and t n-1 – t n would be the time between two piercings of this surface. It has been proven that almost any cut, as long as it is made transverse to the direction of the orbital trajectories, is equally valid and useful for further analyses (Oseledec, 1968). If the original embedding and subsequence section was in high enough dimension to allow invertability, we might have enough (trial and error) knowledge to be able to write a discrete equation, a “return map,” f, f that would move one point to the next on the plane as (x,y) t ←→ ⎯ (x,y) . What can sometimes be case with real systems (Coffman et al, 1986), is that reducing the geometric reconstruction still one dimension further, accepting noninvertability, ironing down the points in the plane onto the x axis line (normalized to [0,1]), and plotting the values at x t against x t+1 (“mapping the unit interval to itself”), can generate points in the general shape of a parabola with dynamics representable by the same family of one parameter, single maximum discrete equations that generated May's sequence of bifurcations, Feigenbaum’s scaling and Metropolis, Stein and Stein’s (and Sharkovskii’s) U sequence (see discussions of qualitative and quantitative universalities above). Although sometimes a significant change in brain system physiology, such as penicillin-induced epileptic neuron spiking activity is revealed simply by a change in the graphic appearance of suitably embedded time series data (Zimmerman and Rapp, 1991), more often statistical measures made on the geometric dynamics of the points on the attractor are required. n−1 t n 217 Orbital Divergence Characterizes Expansive Dynamics on Biological Attractors In the dynamical world of equilibria (fixed points in phase space) and periodic cycles (fixed points of a return map), a common concern involves their stability. What happens if an adventitious jiggle moves the orbit a little distance away from the fixed point? Would the wind wiggled suspension bridge start to flap with increasing amplitude or would it damp back down quickly to its stable state. A “Lyapounov functional,” L, is constructed which can be visualized like a smooth potential bowl around the fixed point such that any L stable solution that starts at its bottom tends to stay there or is asymptotically L stable if the solution converges to the fixed point at the bowl’s bottom as t → ∞ . If the point is not L stable, it is L unstable. The modern study of nonlinear systems have produced another kind of stability issue with a similar appellation yielding other direction specific indices, the Lyapounov characteristic exponents, λ (Oseledec, 1968; Eckmann and Ruelle,1985; Ruelle, 1990; Ott et al, 1994). In this context, the instability is not one of perturbative escape from a fixed point, but of the average rate with which the (theoretically infinitesimal) distances among a handful of points representing a set of initial conditions (each a precision limited, hypothetical repetition of the same experiment), are being stretched apart by the expansive action of a strange attractor system. In three dimensions, one can envision a ball of initial conditions being elongated along the unstable direction and ironed down from both sides along the stable direction over time, transforming the ball into an ellipsoid and then into a (recurrent) curve. In simplest terms and thinking about a one dimensional scalar time series, the Lyapounov exponent reflects the multiplicative average (logarithmic addition) of the sequence of slopes of the series of straight lines connecting the points. An average slope of > 45 ° is expansive such that a linear distance on the x- axis is increased when mapped onto the y axis. A slope of < 45 ° is a contraction mapping reducing the linear distance of the x-axis when mapped to the y axis. 218 Rössler’s generic chaotic system (see above) moving recurrently in a three dimensional box can be orthogonally decomposed into three directional motions in a moving frame, each with a signatory sign of λ . The unstable direction of expansive stretching is characterized by some number > 0, λ( + ), the stable direction of contractive folding, some number, < 0, λ( − ), and the neutrally stable direction of recurrence, λ( 0 ). For The “Lyapounov spectrum” of the Rössler attractor is [ λ( + ), λ( −), λ( 0 )] (Shaw, 1981). An n-dimensional dynamical systems has n onedimensional Lyapounov exponents, and it is sometimes the case in relatively noise free, finite semi-stationary data lengths of the neurosciences, that a λ > 0 can be shown to exist for a second one, in a dynamical situation called “hyperchaos” by (Rossler, 1979). For example, two and sometimes three λ( + ) have been reported in the flows on the EEG attractor of normal alert subjects (Gallez and Babloyantz, 1991). The presence of measurement noise, the finiteness of neurophysiological sample lengths as well as the relatively small expansive actions in some directions in the chaotic attractors of brain dynamics lead to the finding that most often, only one “leading Lyapounov exponent,” λ( + ), is reliably computable (Sano and Sawada, 1885; Wolf et al, 1985; Eckmann et al, 1986). A counter-intuitive fact about the stability of a dynamical system when a decrease in the value of λ( + ) is observed such that λ( + ) → λ( 0 ), is that this more neutral stability augers a global bifurcation (Guckenheimer and Holmes, 1983). A small perturbation does not change the global dynamics of an already expanding and contracting (called “hyperbolic”) dynamical system, it will maintain the style of its motions. However, when λ( + ) → λ( 0 ), a velocity changing perturbation evokes a bifurcation to a new dynamic in what is called “loss of hyperbolic stability.” The best examples come from the observations of this kind of change in the EEG predicting the onset of epileptic seizures in patients with focal or temporal lobe epilepsy (Iasemidis et al, 1988,1990; Iasemidis and Sackellares, 1996 ). 219 The number and variety of algorithmic strategies for computing Lyapounov exponents that are applicable to real data divide naturally into those that compute directly the average rate of separation of neighboring points from the “fiduciary” orbit, as observed on the reconstructed attractor, from which only the largest λ can be obtained (Wolf et al, 1985), and a variety of techniques based on assumed model maps of the unknown flow along which the sequential products of the local derivatives are computed. The logarithms of the straight line slopes of the sequence of directionally decomposed local tangent vectors multiplied, yield as many Lyapounov exponents as directions (Sano and Sawaka, 1985; Eckmann et al, 1986; Geist et al, 1990). The techniques of regularization by which these model processes approximate the unknown flow include those with least squares, linear fit assumptions (Eckmann et al, 1986; Sato et al, 1987; Buzug et al, 1990), more detailed fits involving polynomial expressions in higher powers (Briggs, 1990; Brown et al, 1991; Bryant et al, 1991) and techniques such as “singular value decomposition” which decomposes the flow into orthogonal components before computing the logarithmic rate of divergence of nearby points on each of them (Stoop and Parisi, 1991). A clever check on the Lyapounov number obtained is to study the flow backwards so that, for example, some rate of separation of points in the forward direction would approximate the rate of convergence in the time reversed data (Parlitz, 1992). Among the sources of spurious Lyapounov exponents are sample lengths that are too short and/or too measurement-noisy to compute a statistically stable average, embedding dimensions that are too high or low and attractors (many of physiological relevance) that have geometric features such as sharp corners or tight folds as in the Rössler (where points gather) or delicate boundary points such as those on the body on the Lorenz butterfly (see above) where very small distances determine whether the orbit makes big jumps to the right or left wing leading to uncharacteristically large separations. This “nonuniformity” in the rates of expansion and contraction in the dynamics over the attractor, a source of error in computations of statistical indices of the average behavior, becomes a useful tool in characterizing individual differences in sets of neurobiological data ranging from 220 brain enzyme kinetics (Mandell, 1984) and single neuron firing patterns (Selz and Mandell, 1991) to human psychomotor and cognitive behavior (Selz, 1992; Selz and Mandell, 1993). The Leading λ( + ) of Some Biologically Relevant Time Series An early application of a simplified form of leading Lyapounov exponent to brain data involved the computation of the one dimensional averaged slope of in vitro studies of psychopharmacological drug and peptide effects on time series of catecholamine and indoleamine biosynthetic enzyme activities studied at physiological, far-from-equilibrium reactant concentrations (Russo and Mandell, 1984b; Knapp and Mandell, 1984). A contemporaneous study also suggested the influence of differences in initial conditions for pharmacokinetic equilibrium times in drug binding kinetics by proteins (Bayne and Hwang, 1985). The most extensive applications to the clincial neurosciences of the Lyapounov measure of the exponential divergence of orbital points has involved reconstructed brain wave attractors from the intracranial or scalp recordings of the EEG (Duke and Pritchard, 1991; Dvorak and Holden, 1991; Jansen and Brandt, 1993). Space prevents us from surveying more than a small representative set of the studies (Jansen, 1996). It should be noted, however, that this is an area in which “state of the art” research has grown quite complicated and somewhat controversial with respect to technical issues. The choices of the digitizing frequency of the smooth record, the dimension of the embedding space and time delays continue to be debated in the context of numerical computations of λ and dimension measures (Mayer-Kress, 1986; Ott et al, 1994). Controls for the implicitly required statistical discrimination between “randomness” and “deterministic chaos” consist of sequence and (Fourier) phase randomization generating “surrogate data” which conserve the probability distributions and destroy the correlation properties and attractor geometries (Sauer et al, 1991; Ott et al, 1994). Since neither bring with them any connections with 221 known or explorable brain mechanisms, one might argue that at this early stage of the work it would be more desirable to simply report the quantitative findings, leaving unanswerable questions about ultimate causality for later discussion (see below). The first EEG λ( + ) was reported in a patient with epilepsy (Babloyantz and Destexhe, 1986) which was confirmed by others (Iasemedia et al, 1988; Frank et all, 1990). An important study of simultaneous time series from 16 subdural electrodes placed in the right temporal cortex of a patient with a right medial temporal lobe epileptogenic focus demonstrated that a decrease in a single lead’s λ( + ) reliably anteceded and localized the first signs of the incipient seizure. The rest of the leads followed with similarly decreased positivity in their leading Lyapounov exponents associated with spatially coherent patterns of behavior. In addition, the averaged value of the leading Lyapounov exponents in the 16 leads increased post-ictally over the averaged values of λ( + ) in the pre-ictal state (Iasemidis et al, 1988,1990). These findings, including seizure anticipation for 25 minutes, were confirmed using intracranial recordings in 16 patients with temporal lobe epilepsy (Elger and Lehnertz, 1998). The para-ictal decrease and post-ictal increase in λ( + ) found in patients with focal temporal lobe seizures was confirmed more generally in left and right pre-frontal-to-mastoid EEG recordings made before, during and after electroconvulsive shock treatment of psychiatric patients (Krystal and Weiner, 1991). Pre-ictal changes were also found six minutes before seizure onset from scalp EEG recordings in 17/19 patients with chronic focal epilepsy (Martinerie et al, 1998). The most exciting potential application of this approach is its use, in real time, for the prediction and prophylactic treatment of incipient seizures, minutes to hours before the event, in place of or augmenting long term drug management (Iasemidis and Sackellares, 1996). There is a growing literature about leading Lyapounov exponent(s) in the reconstructed attractor of the EEG associated with a variety of normal and pathological human behavioral states. For examples, two and sometimes three 222 λ( + ) were reported in awake relaxed subjects and were lost in deep sleep (Stage IV) and coma (advanced Jakob-Creutzfeld disease), suggesting that level of consciousness correlated positively with amount of orbital divergence (Gallez and Bablioyantz, 1991). A “pathologically low” leading λ( + ) was also found to be characteristic of the EEG of patients with Alzheimer’s syndromes (Jeong et al, 1998). Technically defined sleep stages (I, II, III, IV, REM) were found to correlate well with the values of the leading λ( + ) of the EEG in normal subjects (Fell et al, 1993; 1996; Pradhan and Sadasivan, 1996). EEG recordings during problem solving sometimes, but not always, demonstrated a relationship between values of λ( + ) and the kind or amount of load of the task (Micheloyannis et al, 1998; Popivanovov et al, 1998; Meyer-Lindenberg et al, 1998). Both emotionally positive and negative videos increased the value of the leading λ( + )(Aftanos et al, 1997) as did computer generated music with sounds that exploited a “pleasing” hierarchical, 1/f but not an “unpleasant” 1/f 2 frequency spectrum (see previous section about power law scaling) (Jeong et al, 1998). The EEG theta rhythm of “day dreaming” manifested a lower λ( + ) than the “relaxed alert awake” alpha rhythm (Roschke et al, 1997). Relationships between the Lyapounov spectra demonstrated both regional independence and task-related dependence in the magnetoencephalography record in man (Kowalik and Elbert, 1995). These and other studies suggest that divergence rate of orbits on a geometrically reconstructed attractor is a subtle measure, which can be quantified as a continuous variable and which has been found to be useful in a variety of neuroscience-related, experimental contexts. The range includes the characterization of the discharge pattern of a single somatic or renal sympathetic nerve fiber (Gong et al, 1998;Zhang and Johns, 1998); quantifying the results of perturbing autonomic nervous system activity, for examples, exercise, atropine and propranolol decrease λ( + ) in the cardiac interbeat interval attractor (Hagerman et al, 1996) and interference with the function of the baroreflex or clonidine alters the λ( + ) in the blood pressure attractor in man and animals (Wagner et al, 1996; 223 Mestivier et al, 1998); and predicting defects in visual learning functions from decreases in the λ( + ) of the cardiac interbeat interval attractor in patients with multiple sclerosis (Ganz and Faustman, 1996). We recall that on theoretical grounds (Guckenheimer and Holmes, 1983), a decrease in the positivity of λ( + ) → λ( 0) in a delay coordinate, geometric reconstruction of a time series of observables may auger an incipient global bifurcation in the system’s dynamics. As reviewed above, this has turned out to be the case in several studies of the EEG and electrocorticogram in epileptic patients. Futher research will be required to see if this idea has substance more generally for predicting “catastrophic” changes in other brain-related systems. Power Law Scaling of Orbital Geometries in Time Series Reconstructions Benoit Mandelbrot’s book in its first incarnation was derived from his lectures at College de France in 1973 and 1974 and was called Les Objets Fractals: Forme, Hasard et Dimension (Mandelbrot, 1975). This essay was translated into English as Fractals, Form Chance and Dimension (Mandelbrot, 1977). Later expanded and reworked editions displayed another title, The Fractal Geometry of Nature (Mandelbrot, 1982) but the deep conceptual, sometimes poetic fusion and confusion generated by the apparent identity among the objects of his first title remains. “Fractal,” along with “chaos” and “strange attractor” are among the most widely familiar new words in modern dynamical systems research. Fractal is the most difficult to rigorously define and is commonly misunderstood due to the evocative yet dream-like cognitive condensations provoked by the first title and its reflections in Mandelbrot’s prose. A common conceptual confusion is exemplified by the assumed relation between “fractal time event distributions” of the cardiac interbeat interval and the “fractal like” anatomy of the purkinje network of the cardiac conduction system. Data from both contexts are often shown juxtaposed in the same illustration as though their relationships were obvious (Goldberger et al, 1990; Goldberger, 1996; Liebovitch and Todorov, 1996). “Fractal times” and “fractal 224 geometries” are not related to each other essentially, either in the mathematical or physiological domain, but are often made vaguely equivalent on the basis of their lexical similarity. An experimentally meaningful relationship between fractal statistics (hazard), dynamical fractals (dimension) and fractal geometries (form), has to be proven on a case by case basis and not assumed from their common designation. Among the informal attempts to do this have been those that involve the branching pattern of nerves and the associated reductions in their diameter-dependent characteristic conduction velocities yielding a multiplicity of “arrival times.” There is, however, a more central idea common to these concatenated meanings of fractal: the statistical, dynamical and geometric expressions of “scaling,” a word which is not mentioned in Mandelbrot’s book titles. The cluster of theories, theorems and methods associated with the idea of scaling (and renormalization) have led to Nobel Prizes for Flory (1971), Wilson (1975) and de Gennes (1979) and the (equivalent mathematical) Field’s Medal for McMullen (1994). There is speculation that the last two awards were supported by the inspiration and interest given their research by Mandelbrot’s intuitions and books. Scaling laws take the place of (unknown causal) physical laws by indicating the proportion by which observables of a system can be changed in relationship to each other such that some statement about them, “this varies with that,” still holds. In a cross species comparison, as the average weight of a mammalian body, called lb, increases, the skeletal weight, called w, increases at an exponentially greater rate: w goes like lb 1.08 where lb 1.0 would indicate that they grew across species at the same rate. Plotting log (lb) on the x axis and log (w) on the y axis in a log-log plot results in a straignt line with a slope that indicates the power law scaling relationship between body weight and skeletal weight across mammals. The slope of the scaling exponent of 1.08 is a little over 45 ° = 1. In contrast, the metabolic rate, r, goes like (lb) 0.75 , r ≈ (lb) 0.75 . Larger animals (relative to their weight) have lower basal metabolic rates (Schmidt-Nielsen, 1984). We don’t completely know the chain of intervening mechanisms that relate these variables to each other but we do know 225 invariant scaling laws that describe their relationships within some limits on the range of values. In describing the functional size, radius of gyration, R g , of a polymer such as a polypeptide, composed of N monomers, assume each of the amino acids to be the same and that they are in a “good” hydrophobic solvent that didn’t stick the polymer together in a fold. Flory (1971) found a scaling law for certain broad classes of polymers and solvents, R g ≈ αN ν , where the exponent, ν = 3/5, was universal, N indicated the number of monomers in the chain and the value of “pre-factor” α depended upon the particular monomer and solvent chosen. Log R g plotted against log N has a “power law” slope of 0.60. For an equally static but less physical example, there is the well known Zipf law of “vocabulary balance”(Zipf, 1949). First reported for the 260,450 words of James Joyce’s Ulysses, the slope of the log of the rank of the words found (ordered from most to least along x) plotted against the log of their frequency (along y) results in a power law that is (generally) true for other collections of words and in other languages. An accessible example of a dynamical scaling law arises in a two dimensional lattice model of a forest which is to be set on fire with probability p independent random single tree ignitions. At some critical p, p c , the fire sweeps through the entire forest (“percolates”) and the correlation length of the connected clusters grows as |p-p c | -γ with a universal scaling exponent, γ = 4/3, for all Monte Carlo, two dimensional percolation problems (Stauffer, 1985; Grimmett, 1989). Mandelbrot’s scheme for the power laws that compose his fractal geometry of dynamical objects is a measure made on the pattern of occupancy in the embedding space by the reconstructed orbits of an attractor. It is, generally, mass ≈ length D 0 in which D (the subscript that of the “capacity dimension”) is not the 0 whole number of Euclidian dimensions, d, of the space in which the orbits are embedded. After Hausdorff’s “convergence of external and internal measures” (Hurewicz and Wallman, 1948), the (capacity) fractal dimension D is also defined as being larger than its topological dimension and smaller than its Euclidian embedding dimension. Graphing a time series on a plane one can think of its 0 226 topological dimension as that of a line equal to one. If each time step had the largest up or down amplitude as possible, its fractal dimension would approach (but not reach) that of the embedding plane, Euclidean d = 2. The D 0 of the one dimensional Richardson technique (Mandelbrot, 1967) can be computed by covering the one dimensional surface of a time series with a number, #, of line segments of several orders of magnitude range of lengths, l .Graphing log(l) along the x-axis and log #(l) along the y-axis yields a negative linear slope, -s. As defined, 1- s = D 0 noting that (-(-s)→+s) such that 1 < D 0 = 1+s < 2. Strain differences and peptide and psychotropic drug-induced changes in D 0 computed in this way were found in time series of fluctuations in rat brainstem tyrosine and tryptophan hydroxylase activities under far-from-equilibrium coreactant concentrations (Mandell and Russo, 1981; Knapp et al, 1981; Knapp and Mandell, 1983; 1984). Systematic influences of stimulant drug dose on D 0 were found as well in these systems (Mandell et al, 1982). This simple measure, made directly on the “roughness” of the graph of a one dimensional time series rather than on its orbital reconstruction, has been used to discriminate the pattern of fluctuations in daily mood scales in normal subjects and mood disordered patients (Woyshville et al, 1999). These findings confirmed dimensional scaling exponents on higher dimensional embeddings of similar time series in mood disordered patients (Gottschalk et al, 1995; Pezard et al, 1996). Due to the ease and rapidity of its computation, techniques involving D 0 on one dimensional time series are currently in development as possible real time epilepsy predictors when analyzing the output of a large number of EEG leads simultaneously. If M(ε) is the minimum number of d-dimensional cubes of side ε required to cover the d-dimensionally embedded attractor, plotting a logarithmic range of rulers of length ε (as ε→0) along the x axis and a logarithmic range of number of cubes, M(ε), each of corresponding ε-edge size, along the y axis, results in a negative (more smaller M(ε) ‘s and fewer bigger M(ε) ‘s) power law slope D 0 . Here the numbered covering cubes, M(ε), are those in which the probability of containing at least one point (its “probability density measure,” often called µ) is not zero. We 227 note that changing the ratios of the numbers of cubes that are dense in point probability to those that are sparse would not influence the value of D 0 . This helps differentiate D 0 from other dimensions and, as noted above, D 0 as a maximal estimate of the fractal dimension, is called the capacity dimension and by convention the scaling law is written M ( ε) D ≈ ε − 0 . More specifically, D0 is calculated by repeatedly dividing the d-dimensionally embedded phase space into equal d- dimensional hypercubes and plotting the log of the fraction of the hypercubes containing data points versus the log of the (normalized) linear dimension (“length scale”) of the hypercubes. The slope fitted to the most linear part of the slope (usually the middle 50%) indicates the capacity dimension. D 0 is computed for increasing embedding (and cube) dimension, d, until it achieves an asymptotic plateau, it “saturates”. This is but one of a range of geometric scaling exponents, “dimensions,” that are currently being computed (Farmer et al, 1983; Grassberger and Procaccia, 1983; Meyer-Kress, 1986; Theiler, J. (1990); Gershenfeld, 1992; Ott et al,, 1994). Although still subject to debate, convention has it that the sample length required to determine this most primitive of dimension computations goes like D 10 0 (e.g. a dimension of 2.45 requires a sample length of at least 282 points). Assuming robust findings using D0 as indicated by non-parametric tests of significance in test-retest, before and after, drug treatment designs, this arbitrary criteria sounds more like ritual than meaningful help for the clinical neuroscientist with (say) 100 spinal fluid hormone and metabolite samples painfully and laboriously collected from a patient’s indwelling catheter over 48 hours. In the context of real data (and not numerical studies of differential equations), we are dealing with empirical findings that must find their meaning (or lack of) in the context of questions about issues in the neurosciences, not in abstract questions such as those about the number of dimensions that an unknown differential equation would require to represent the data (Broomhead and King, 1986). In a similar arbitrary spirit, a system manifesting a D 0 > 5 is considered not discriminable from a random process; e.g. the difference between D 0 = 5 versus D 0 = 7 (though perhaps statistically significant) is thought to be without meaning. Since in neurobiological 228 research, “random” (if it doesn’t mean measurement error) indicates unknown degrees of freedom, this D 0 > 5 rule is also without relevance for brain research. D 1 is called the “information dimension” and is computed by counting the number of ε-cubes, M(ε), it takes to cover the points constituting some fixed fraction of all of the points of the set of orbital points on the attractor and can be regarded as the “core dimension” (without the outliers) of the set. The counterintuitive finding is that D 1 is nearly constant across a range of fixed fractions that are less than the whole measure (Farmer et al, 1983). The invariance of D 1 can even be taken to the extreme by computing the D M lim ln ( ε ) ε ln( ε) 1 = →o around (typical, not all) single points. In this context, D 1 is called the “pointwise dimension” or “singularity exponent” and, as might be anticipated, its value is usually less than that of D 0 . The scaling exponent that is both sensitive to point densities and easiest to compute from real data is the “correlation dimension,” D 2. Here, analogous to the relationship between the amplitudes of the variance and the correlation function in conventional statistics, the measure squared is of interest for the computation of D 2 , e.g. M ( ε ) 2 I( 2, ε ) = ∑[ µ ( Ci )] (see below for this use of measure µ on sum Σ of cubes Ci). i= 1 The selection of D 2 as the fractal measure dominates the studies that invoke scaling exponents to quantify the distributions of points on the attractor as reconstructed from time series in the neurosciences (Grassberger and Procaccia, 1983; Mayer- Kress, 1986; Ott et al, 1994). Several sets of programs are available for its computation (for example, Sprott and Rowlands, 1991). Generally, a correlation sum (“integral”, R(ε) ) is computed from a starting point by counting all subsequent point pairs with distances between them less than ε as ε→0 and plotting D 2 lim ln( R( ε) = . D 2 is computed for increasing embedding (and therefore ε → 0 ln( ε) hypercube) dimension, d, until D 2 achieves an asymptotic plateau, it “saturates” (Ding et al, 1993). It is generally the case that D 0 ≥ D 1 ≥ D 2 (Farmer et al, 1983). 229 In his statistical explorations of experimental results in hydrodynamic turbulence, Mandelbrot (1974) called attention to the need for a multiplicity of characteristic scaling exponents, a range of values for each exponent and their sensitivity to orbital point density distributions (the latter called the Sinai-Ruelle- Bowen or natural measure (Eckmann and Ruelle, 1985)). These needs grew out of the intrinsic heterogeneity in the time dynamics and the nonuniform point distributions in phase space of orbitally divergent, real physical systems. Even with relatively uniform orbital point distributions, it is intuitively obvious that as ε → 0, the smaller ε- cubes are over-represented and larger ε- cubes are under-represented in the M(ε) computation (Farmer et al, 1983). For a concrete example, the fraction of the total number of cubes containing say 75% of the points would obviously decrease as the ε-lengths studied gets smaller. Normalizing the D i measures with respect to point densities would correct for this systematic distortion. In addition, the non-systematic influence of real system heterogeneity and non-uniformity in both time and reconstruction space distributions makes the need for relating the D i measures to the natural measure even more pressing. The derivation of many separate scaling exponents, as well as global generalized exponents and the incorporation of point densities in their computation,