
Conceived and designed the experiments: FG GBM. Performed the experiments: EMA FG. Analyzed the data: EMA YSP. Contributed reagents/materials/analysis tools: EMA YSP FG GBM. Wrote the paper: EMA GBM.

The authors have declared that no competing interests exist.

Because of the parallels found with human language production and acquisition, birdsong is an ideal animal model to study general mechanisms underlying complex, learned motor behavior. The rich and diverse vocalizations of songbirds emerge as a result of the interaction between a pattern generator in the brain and a highly nontrivial nonlinear periphery. Much of the complexity of this vocal behavior has been understood by studying the physics of the avian vocal organ, particularly the syrinx. A mathematical model describing the complex periphery as a nonlinear dynamical system leads to the conclusion that nontrivial behavior emerges even when the organ is commanded by simple motor instructions: smooth paths in a low dimensional parameter space. An analysis of the model provides insight into which parameters are responsible for generating a rich variety of diverse vocalizations, and what the physiological meaning of these parameters is. By recording the physiological motor instructions elicited by a spontaneously singing muted bird and computing the model on a Digital Signal Processor in real-time, we produce realistic synthetic vocalizations that replace the bird's own auditory feedback. In this way, we build a bio-prosthetic avian vocal organ driven by a freely behaving bird via its physiologically coded motor commands. Since it is based on a low-dimensional nonlinear mathematical model of the peripheral effector, the emulation of the motor behavior requires light computation, in such a way that our bio-prosthetic device can be implemented on a portable platform.

Brain Machine Interfaces (BMIs) decode motor instructions from neuro-physiological recordings and feed them to bio-mimetic effectors. Many applications achieve high accuracy on a limited number of tasks by applying statistical methods to these data to extract features corresponding to certain motor instructions. We built a bio-prosthetic avian vocal organ. The device is based on a low-dimensional mathematical model that accounts for the dynamics of the bird's vocal organ and robustly relates smooth paths in a physiologically meaningful parameter space to complex sequences of vocalizations. The two physiological motor gestures (sub-syringeal pressure and ventral syringeal muscular activity) are reconstructed from the bird's song, and the model is implemented on a portable Digital Signal Processor to produce synthetic birdsong when driven by a freely behaving bird via the sub-syringeal pressure gesture. This exemplifies the plausibility of a type of synthetic interfacing between the brain and a complex behavior. In this type of device, understanding the bio-mechanics of the periphery is key to identifying a low-dimensional physiological signal coding the motor instructions, which enables real-time implementation at a low computational cost.

The complex motor behavior originating the rich vocalizations of adult oscine birds results from the interaction between a central pattern generator (the brain) and a nonlinear biomechanical periphery (the bird's vocal organ)

In an effort to understand what gives rise to complexity in this behavior, part of the birdsong community has focused on the capacity of the periphery to produce vocalizations with a diverse set of nontrivial acoustic features.

Through a combination of experimental observations and theoretical analysis, low-dimensional mathematical models have been proposed that account for the physical mechanisms of sound production in the avian vocal organ

Part of the appeal of having this model is the prospect of applying it to the construction of a bio-prosthetic device. In this scenario computation is relatively inexpensive because of the low dimension of the mathematical model. In addition, the physical description of the peripheral effectors led to the identification of a set of smoothly varying parameters that determine the behavior.

The usual strategy of BCIs and BMIs (Brain Computer Interfaces and Brain Machine Interfaces) is to decode motor commands from recordings of physiological activity in the brain and use this activity to control bio-mimetic devices

Our current understanding of the biophysics of the avian vocal organ, particularly our capacity to identify the dynamical mechanisms by which complex behavior occurs when the peripheral systems are driven by low dimensional, smooth instructions, allows us to propose an example of a different kind of bio-prosthetic solution. The model predicts a diversity of qualitatively different solutions to the system for continuous paths in a parameter space. Not only is this parameter space suggested by the model, but it is also physiologically pertinent.

We present a device that is driven by a freely behaving Zebra finch to produce realistic, synthetic vocalizations in real-time. The device is based on the real time integration of the mathematical model of the vocal organ on a Digital Signal Processor (DSP). It is controlled by the bird's subsyringeal air sac pressure gesture, which is transduced, digitized and fed to the DSP to provide the model with the appropriate path in parameter space.

The work is organized as follows. In the

All experiments were conducted in accordance with the Institutional Animal Care and Use Committee of the University of Utah.

One of the most studied species of songbirds is the Zebra finch. Its song presents a set of diverse acoustic features, which can be accounted for by the dynamics displayed by the mathematical model of its vocal organ

The song of an adult Zebra finch is composed of the repetition of a highly stereotyped sequence of syllables, preceded by a variable number of introductory notes. A typical sequence or

The oscine vocal system has two independent sound sources in the syrinx, where airflow is modulated to produce sound

Sounds are produced in the syringeal valves, and then filtered through the vocal tract. In the syrinx, the labia oscillate modulating the airflow. They support two coordinated modes of oscillation: an upward propagating superficial wave, and an oscillation around their mass centers (C). In the vocal tract we highlight trachea, glottis, OEC and beak. Pressure

The bipartite syrinx of the oscines is a pair of valves, located at the junction of the bronchi and the trachea (see

The upper vocal tract consists of the air-filled passages that link the syrinx to the environment. It determines much of the distribution of energy of the sound across its harmonic frequency components, defining a perceptual property as important as the timbre

A model to account for the mechanism of sound production in the bird's syrinx that presents a good compromise between level of description and computational complexity was presented in

The kinematic description of this mechanism is carried in terms of the displacement from equilibrium of the midpoint position of each labium

In the driving term,

In this model, acoustic features of the solutions are determined by physiologically meaningful parameters. Assuming that the coefficients in the restitution term of system (1) are proportional to the tension of the ventral syringeal muscles (vS), this model is capable of producing realistic synthetic birdsong. The rationale behind this assumption is that the contraction of these muscles controls the stiffness of the labia by stretching them.

The oscillating labia in the syrinx modulate the airflow producing sound, which is modified as it goes through the vocal tract. Relevant acoustic features, such as the spectral content of vocalizations, are determined by the geometry of the vocal tract

In order to produce realistic synthetic vocalizations, we introduce a model of the vocal tract as a dynamical system, which includes a tube approximating the trachea and a Helmholtz resonator to represent the OEC. The sound produced in the syrinx enters this system, and we are able to compute the sound radiated to the atmosphere.
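The tracheal part of this tract model can be illustrated with a minimal delay-line sketch. The equations of the tract are not reproduced in this text, so the sketch assumes the standard picture of a closed-open tube: the wave injected at the syringeal end travels to the open end, where a fraction `reflection` is reflected with inverted sign and the rest is transmitted. The function name, the integer sample delay and the numerical values are illustrative assumptions, not the authors' implementation.

```python
def trachea_delay_line(source, delay_samples, reflection):
    """Propagate a source pressure through a closed-open tube (assumed model).

    source: list of floats, pressure injected at the syringeal end.
    delay_samples: one-way travel time along the tube, in samples (>= 1).
    reflection: reflection coefficient at the open end, 0 <= r < 1.
    Returns (p_in, p_out): pressure at the tube input and transmitted pressure.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    n = len(source)
    round_trip = 2 * delay_samples
    p_in = [0.0] * n
    p_out = [0.0] * n
    for i in range(n):
        # The wave reflected at the open end comes back inverted after a round trip.
        echo = -reflection * p_in[i - round_trip] if i >= round_trip else 0.0
        p_in[i] = source[i] + echo
        # The non-reflected fraction leaves the tube one travel time later.
        if i >= delay_samples:
            p_out[i] = (1.0 - reflection) * p_in[i - delay_samples]
    return p_in, p_out


# Impulse response with hypothetical values: echoes alternate in sign and decay.
impulse = [1.0] + [0.0] * 11
p_in, p_out = trachea_delay_line(impulse, delay_samples=2, reflection=0.5)
```

Under these assumptions the input pressure peaks when the round-trip delay equals half a period, which is the quarter-wave resonance expected of a tube closed at one end and open at the other.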

The trachea, by its effect and its physiology, is approximated by a tube that is closed at the syringeal end and open at the glottis. The pressure at its input

The transmitted part of the pressure fluctuation

In acoustics, it is common to write an analog electronic computational model to describe a system of filters. The acoustic pressure is represented by an electric potential and the volume flow by the electric current

The mathematical model of the vocal organ represented by equations (1, 2, 3 and 4) is a physical description of the complex peripheral effector, rather than an attempt to obtain a statistical filter or a set of causal rules between a coded motor command and a behavior. By studying the nonlinear dynamics of this model, especially that of its oscillatory solutions, we find nontrivial relationships between paths in parameter space and the acoustic characteristics of the sounds it synthesizes.

In this model, the non-interacting vocal tract acts as a passive filter; it alters the sounds produced in the syrinx only in the relative weight of their harmonic components. This is why the dynamical analysis searching for qualitatively different oscillatory solutions is done on the equations ruling the dynamics of the syrinx. A thorough description of the set of solutions and bifurcations of the system (1,2) was carried out in

Within the rich dynamical scenario displayed by the system, two distinct mechanisms giving rise to oscillatory solutions stand out: a Hopf bifurcation and a Saddle Node on an Invariant Cycle (SNIC) bifurcation. They are sketched in

When parameter

In addition to providing an orientation in the task of seeking the relevant control parameters, the dynamical analysis of the model makes way for the real-time implementation of the model by reducing its computational cost while retaining the relevant dynamics

The mathematical model for the vocal organ, the reduction of the system ruling the dynamics of the sound source, and the identification of the pertinent parameters accounting for its motor control together make way for the construction of a bio-prosthetic device. The parameters determining the acoustic properties of vocalizations in the normal form (5) are physiologically meaningful, and the set of differential equations is simple enough to compute on a portable platform such as a Digital Signal Processor. By fitting the parameters and integrating the system in real-time, synthetic song can be produced in a device controlled by the motor instructions elicited by a freely behaving bird.
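To illustrate how light the computation is, the following sketch integrates a two-dimensional polynomial vector field with a fourth-order Runge-Kutta scheme, driven by sample-by-sample parameter paths. System (5) is not reproduced in this text, so the vector field below is an assumed stand-in of the kind quoted in the related literature; its exact form, the names `alpha`, `beta`, `gamma`, and all numerical values are illustrative assumptions and not the fitted parameters.

```python
def vector_field(x, y, alpha, beta, gamma):
    """Assumed polynomial vector field (hypothetical stand-in for system (5))."""
    dx = y
    dy = (-alpha * gamma**2 - beta * gamma**2 * x - gamma**2 * x**3
          - gamma * x**2 * y + gamma**2 * x**2 - gamma * x * y)
    return dx, dy


def rk4_step(x, y, dt, alpha, beta, gamma):
    """Advance the state by one fourth-order Runge-Kutta step of size dt."""
    k1x, k1y = vector_field(x, y, alpha, beta, gamma)
    k2x, k2y = vector_field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, alpha, beta, gamma)
    k3x, k3y = vector_field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, alpha, beta, gamma)
    k4x, k4y = vector_field(x + dt * k3x, y + dt * k3y, alpha, beta, gamma)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def synthesize(alpha_path, beta_path, gamma, dt):
    """Integrate the model along a path in (alpha, beta), one step per entry."""
    x, y = 0.0, 0.0
    output = []
    for alpha, beta in zip(alpha_path, beta_path):
        x, y = rk4_step(x, y, dt, alpha, beta, gamma)
        output.append(x)
    return output


# Illustrative constant path (not fitted values): 20 ms at an assumed step size.
steps = 8820
song = synthesize([0.5] * steps, [0.5] * steps, gamma=24000.0, dt=1.0 / 441000.0)
```

Each output sample costs four evaluations of a low-order polynomial, which is the reason a fixed-step scheme of this kind is plausible on a small processor. For the constant parameter values chosen here the assumed field has a single unstable equilibrium, so the trajectory settles onto a sustained oscillation.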

In many bio-prosthetic solutions, the physiological motor gestures used to drive the device are degraded with respect to those recorded in intact subjects.

The device reconstructs the intended motor gesture from this degraded pressure gesture to trigger integration of the model. When the pressure patterns corresponding to the syllables comprising a motif are identified, the mathematical model of the vocal organ is computed with the appropriate paths in parameter space to produce the corresponding synthetic output.

The electronic syrinx is capable of reproducing synthetic birdsong in real-time when driven by the air sac pressure of a freely behaving, muted bird. The pressure gesture is first recorded, together with the bird's song, in order to fit the parameters of the model. The bird is then muted via a bypass of airflow away from the syrinx, and its pressure gesture is digitized and fed to the Digital Signal Processor (DSP), where the model is implemented. An algorithm that reconstructs the pressure gesture of the intact bird is also implemented; it switches the integration of the model on when the gesture corresponding to the first syllable of a motif is detected, and off when the interruption of the motif is detected. The procedure is described below.

The bird is cannulated. A cannula (

These data are then used to fit the physiologically meaningful parameters of the vocal organ model in its normal form (system (5)): the time-varying (

The characteristics of sounds produced by integrating the normal form of the model are determined by paths in the

Two important features that we seek to match are the spectral richness and the fundamental frequency of the vocalizations. One way of quantifying the spectral richness is to compute the Spectral Content Index (

Parameter

The optimal value of the scaling factor

Setting also the parameters of the tract so that the resonance of the OEC lies close to

By these means and upon smoothing to interpolate the values of the parameters within each segment, a table of values of the reconstructed motor gestures is obtained for every segment of the bird's motif. When the model is driven by them, synthetic song comparable to the one that was recorded is produced. An example of the fitted parameter series is illustrated in
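The spectral richness measure used as a fitting target can be sketched as follows. The definition of the Spectral Content Index is cut off in this text, so the sketch assumes it is the amplitude-weighted mean frequency of the spectrum divided by the fundamental frequency; under that assumption a pure tone gives a value of 1 and harmonically richer sounds give larger values. The function names and the test signal are hypothetical, and the fundamental frequency is taken as a known input.

```python
import cmath
import math


def magnitude_spectrum(signal, sample_rate):
    """Plain DFT magnitudes for the positive, non-zero frequency bins."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(signal))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_content_index(signal, sample_rate, fundamental_hz):
    """Assumed definition: spectral centroid divided by the fundamental."""
    freqs, mags = magnitude_spectrum(signal, sample_rate)
    total = sum(mags)
    if total == 0.0:
        raise ValueError("signal has no spectral energy")
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    return centroid / fundamental_hz


# Hypothetical segment: 256 samples at 25.6 kHz, fundamental at 1 kHz.
fs, n, f0 = 25600.0, 256, 1000.0
pure = [math.sin(2.0 * math.pi * f0 * i / fs) for i in range(n)]
rich = [p + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * i / fs)
        for i, p in enumerate(pure)]
sci_pure = spectral_content_index(pure, fs, f0)
sci_rich = spectral_content_index(rich, fs, f0)
```

A direct DFT is used only to keep the sketch dependency-free; for segments of realistic length an FFT would be the natural choice.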

The thoracic air sac pressure is recorded together with the song; the pressure and corresponding sound of a bout are shown in (A). With these records, the temporal series

In

The bird is then muted via the insertion of an open cannula (

The pressure pattern of the muted bird is then recorded as the bird attempts to produce song by producing the gesture corresponding to calls, introductory notes, and motifs.

The recordings of the intact and degraded pressure gestures (

The length

This ensures that a correlation threshold can be set that leaves high probability of detection of the bout with low probability of false triggering (false inference of the onset of a motif from a call or any other note). Assuming that the cross-correlation values in the calls and introductory notes and the cross-correlation values in the beginning of a motif follow normal distributions with distinct mean and variance, these probabilities can be estimated. In the example shown here, the pressure gesture of a muted bird was recorded as it attempted to elicit
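Under the normality assumption stated above, the two probabilities follow directly from the Gaussian cumulative distribution function. The sketch below shows the calculation; the function names and every numerical value are hypothetical and are not the values measured in the experiment.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def detection_rates(threshold, onset_mean, onset_std, other_mean, other_std):
    """Probability of exceeding the threshold for motif onsets and for other notes.

    Returns (p_detect, p_false): the chance that a true motif onset triggers,
    and the chance that a call or introductory note triggers by mistake.
    """
    p_detect = 1.0 - normal_cdf(threshold, onset_mean, onset_std)
    p_false = 1.0 - normal_cdf(threshold, other_mean, other_std)
    return p_detect, p_false


# Hypothetical cross-correlation statistics, for illustration only.
p_detect, p_false = detection_rates(
    threshold=0.80,
    onset_mean=0.90, onset_std=0.03,   # motif onsets (assumed)
    other_mean=0.60, other_std=0.08,   # calls and introductory notes (assumed)
)
```

Scanning `threshold` over a range and inspecting both rates is one simple way to choose an operating point once the two distributions have been estimated from calibration recordings.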

These quantities were selected as a compromise between the duration of the procedure and the rate of success. The number of motifs used for calibration corresponds to the data available after one recording session of

In

We also tune an algorithm that detects the interruption of the song, comparing properties of the muted pressure gesture with thresholds on absolute value and variation of the intact gesture during the subsequent motif. The song bout of a Zebra finch is highly stereotyped. Once a motif starts, a fixed sequence of syllables is followed, and a bout consists of the repetition of this motif, with the occasional interleaving of an introductory note between one and the next.

The method used to trigger the integration of the model introduces a delay. A segment of the first syllable is used to detect the onset of a motif in the muted bird's pressure gesture. The length of this segment determines the time it takes the synthesis to begin after the bird has attempted to produce the syllable. The synthesis, which begins after the detection, produces the song that corresponds immediately after that segment. In this way, the delay does not introduce a shift in time in the feedback. Instead, the part of the motif used for detection is skipped.
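The triggering scheme described above can be sketched as a sliding correlation between the incoming pressure samples and a stored template of the first-syllable segment. This is a minimal illustration that assumes a Pearson correlation over a window of the template's length; the names, the template shape and the threshold are hypothetical, and the actual detector on the device may differ.

```python
import math
from collections import deque


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def detect_onset(stream, template, threshold):
    """Return the index of the sample at which the template is first matched."""
    window = deque(maxlen=len(template))
    for i, sample in enumerate(stream):
        window.append(sample)
        if len(window) == len(template) and pearson(window, template) >= threshold:
            return i
    return None


# Hypothetical template: a half-sine pressure pulse of 20 samples.
template = [math.sin(math.pi * k / 20) for k in range(20)]
# Degraded gesture: silence, then a scaled-down copy of the template, then silence.
stream = [0.0] * 30 + [0.4 * t for t in template] + [0.0] * 30
onset = detect_onset(stream, template, threshold=0.99)

# Synthesis resumes in the parameter table right after the detection segment,
# so the part of the motif used for detection is skipped rather than delayed.
resume_index = len(template)
```

Because the correlation is insensitive to amplitude scaling, a uniformly attenuated gesture still matches the template. Recomputing the full correlation at every sample keeps the sketch short; a real-time implementation would more likely maintain running sums.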

The model with the constructed parameter paths (

The pressure gesture of the bird is recorded simultaneously with its song. Then, the parameters driving the normal form to produce synthetic song are reconstructed, and the bird is muted. The bird is then connected to the electronic syrinx via its thoracic air sac pressure, which is digitized and fed to the DSP. In the DSP, an algorithm detects the onset of the first syllable of the motif in the degraded gesture of the muted bird. Upon detection, the model is integrated in real-time while the attempt of the bird to continue with the motif is inferred from the motor gesture. The computed pressure fluctuations at the output of the beak are converted to an analog signal and played through a speaker located

The sound registered in the box via the microphone, the direct analog output of the DSP system and the altered pressure gesture are digitized and recorded at

The device succeeds in synthesizing song online when driven by the pressure gesture of a muted bird. From the altered motor gesture, the algorithm infers the segment of a motif intended by the bird and computes the model to produce the vocalizations. An example is illustrated in

Intact pressure gesture and sonogram, with different colors and opacity of shading indicating the different syllables, and an arrow indicating the segment of the first syllable used for detection (upper panels). When the muted bird drives the syrinx, we see in the sonogram that synthetic sound is produced after the first syllable is detected and until recognition of the interruption of the motif (lower panels).

The upper panels of the figure display the recorded subsyringeal pressure and sonogram of a segment of a bout with its preceding introductory call. A song bout of this bird (B06) is composed of a number of introductory notes (O) and the repetition of a simple motif containing two syllables (A and B), indicated by different colors and opacity of shading in

The bird is then muted and placed in the setup to drive the electronic vocal organ with its pressure gesture. In the lower panel of

This example illustrates how this device works, and shows that it is successful in synthesizing the song motif as the bird drives it. We evaluate its success by counting the times the motif was properly detected and synthesized, and how many times a false trigger occurred. During a session of

Despite the variability of the altered pressure gesture in the subsequent days (

We show here that realistic vocal behavior is synthesized in real-time by our device, as it is controlled by the spontaneous behavior of a muted bird through a physiological signal (its air sac pressure) that is degraded with respect to the one recorded in the intact bird. The computing platform is a low cost, portable processor, and the initial rate of success is high. This is an encouraging example of the plausibility of a kind of interface between the central motor pattern generator and the synthetic, bio-mimetic behavior. DSP technology is being implemented in a variety of biologically inspired problems, and together with Field Programmable Gate Array (FPGA) technology is likely to become a standard solution for a variety of bio-mimetic applications.

Brain computer and brain machine interfaces (BCI and BMI) typically read physiological data and attempt to decode motor instructions that drive peripheral devices in order to produce synthetic behavior

We have built a device that emulates complex motor behavior when driven by a subject through its actual (yet degraded) physiological motor gestures. It successfully reproduces the result of the stereotyped motor gesture that leads to the behavior,

The relative computational and technological simplicity of the device relies on the current level of understanding of the peripheral biomechanical effector

Furthermore, exploration of the model leads to the finding that much of the diversity and complexity of the behavior can be explained in terms of the dynamical features of this nonlinear system

In addition, knowledge of nonlinear dynamics allows us to find the simplest system with equivalent oscillatory behavior. The reduction of the low dimensional mathematical model for the syrinx to its normal form reduces the computational requirements and makes way for the implementation on a real-time computing solution, such as a DSP.

Realistic vocal behavior is synthesized online, controlled by the motor gesture of a freely behaving muted bird, which is a physiological signal that is degraded with respect to the one recorded in the intact bird. This was achieved by computing in real time a mathematical model describing the mechanisms of sound production at the interface between the motor pattern generator and the behavior: the highly nonlinear vocal organ. The computing platform is a low cost, portable processor. This successful avian vocal prosthesis is an encouraging example of the plausibility of a kind of interface between the central motor pattern generator and the synthetic, bio-mimetic behavior. An advance towards models in which certain complex features of the motor behavior are understood in terms of the underlying nonlinear mechanisms of the peripheral effectors has the potential to enhance solutions of brain-bio-mimetic effector interfaces in many ways.

We thank María de los Ángeles Suarez for her excellent technical assistance. EMA also thanks I. Vissani for useful comments.