
Conceived and designed the experiments: JRW TL. Performed the experiments: JRW TL. Analyzed the data: JRW. Contributed reagents/materials/analysis tools: JRW. Wrote the paper: JRW TL WEB JDB.

The authors have declared that no competing interests exist.

The field of neural prosthetics aims to develop prosthetic limbs with a brain-computer interface (BCI) through which neural activity is decoded into movements. A natural extension of current research is the incorporation of neural activity from multiple modalities to more accurately estimate the user's intent. The challenge remains how to appropriately combine this information in real-time for a neural prosthetic device.

Here we propose a framework based on

Our results reveal that a fusion-based approach has the potential to improve prediction accuracy over individual decoders of varying quality, and we hope that this work will encourage multimodal neural prosthetics experiments in the future.

Each year ∼150,000 people in the United States undergo an arm or leg amputation

The problem of translating neural activity into direct movements is known as

Each modality involves specific hardware (e.g. electrodes), and analysis of these signals requires algorithms carefully designed to predict the user's intent given the characteristics of the signal (e.g. signal-to-noise ratio, noise distributions, dependencies). Neural decoding algorithms generate a

Decoding of individual neural modalities is a consistently improving field with many robust methodologies. However, due to the limitations of current recording technologies, more advanced prosthetic limbs will require multiple neural signals with varying information content in order to achieve full functionality. A major computational challenge is to analyze all signals simultaneously to provide the best estimate of the user's desired movement.

Here we present a framework for combining information from multiple modalities to more accurately decode user intent for a prosthetic device. There are two solution paradigms for this problem:

Though data fusion allows for all information to be assessed at once by a single algorithm, current hardware architectures for neural prostheses are parallelized with multiple recording platforms and processors, inherently advocating parallelized decoding prior to a final state prediction. As most decoding algorithms are optimized for specific modalities, we employ techniques for

In this report, we examine two algorithms for decision fusion of continuous variables: the Kalman filter and artificial neural networks (ANNs). We implemented three of the most successful individual neural decoding algorithms with simulated cortical neural spike data to test the capabilities of each fusion method. Through these simulations, we reveal the advantages and limitations of these approaches. Our methodology provides a flexible framework for fusing state estimates from decoding algorithms with different properties and hopefully will encourage multimodal experiments for improved control of sophisticated neural prosthetic devices.

We first formulate decision fusion in terms of Bayesian statistical inference. For our purposes, measurements are predictions from the individual decoders, and the system state is the 2-dimensional velocity vector of the prosthetic endpoint. Given the history of all measurements up to timestep

To simplify the model, we assume

Artificial neural networks have also been used as a method for fusing decisions from supervised classifiers and data from multiple sensors. An ANN is a mathematical model composed of simulated neuron

We implemented feed-forward ANNs with either one or two hidden layers. At each timestep, the state estimates of each individual decoder are provided to the input units, while the output layer produces a fused estimate of the x and y velocities. The activation functions for all hidden units are tan-sigmoid (hyperbolic tangent), and the output layer uses linear functions. To train each ANN, we used the scaled conjugate gradient method to learn the neuron weights, with the mean squared error as the criterion function. We additionally optimized the number of hidden units by searching all combinations of 1 to 12 hidden units in the first layer and 0 to 11 hidden units in the second layer. Thus, 144 ANNs (12 × 12) were examined to find an optimal selection of hidden units within each layer.
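As a rough illustration of the search space described above, the sketch below enumerates the 144 candidate topologies and runs a forward pass through one of them. It is a minimal sketch, not the implementation used in this study: the function names, the weight initialization, and the use of NumPy are assumptions, and training (scaled conjugate gradient in the text) is deliberately omitted.

```python
from itertools import product

import numpy as np


def candidate_topologies(max_first=12, max_second=11):
    """All (first-layer, second-layer) sizes; a second size of 0 means one hidden layer."""
    return list(product(range(1, max_first + 1), range(0, max_second + 1)))


def init_network(n_in, hidden, n_out, rng):
    """Random weights for a feed-forward net (initialization scale is an assumption)."""
    sizes = [n_in] + [h for h in hidden if h > 0] + [n_out]
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """tanh (tan-sigmoid) hidden units followed by a linear output layer."""
    a = np.asarray(X, dtype=float)
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)
    W, b = params[-1]
    return a @ W + b
```

With three decoders each reporting x and y velocities, the input has six units and the output has two, so `init_network(6, (6, 0), 2, rng)` corresponds to a single hidden layer of six units.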

Similar to Moran and Schwartz

The Kalman filter framework as a single neural decoder was very similar to that of the fusion implementation. The individual Kalman filter modeled the relationship between neural spikes and the state of the device as a linear Gaussian process. The dimensionality of this observation model was larger than the observation model used for the fusion Kalman filter.
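The linear Gaussian structure shared by the individual and fusion Kalman filters can be sketched as follows. This is a sketch under stated assumptions rather than the implementation used here: the model matrices are fitted by ordinary least squares from paired state and observation data, and the names `fit_linear_gaussian` and `kalman_filter` are hypothetical. For the individual decoder the observations would be spike counts; for fusion they would be the stacked velocity predictions of the individual decoders.

```python
import numpy as np


def fit_linear_gaussian(X, Z):
    """Fit x_t = A x_{t-1} + w and z_t = H x_t + q by least squares.

    X: (T, n_state) true states, Z: (T, n_obs) observations.
    """
    X0, X1 = X[:-1], X[1:]
    A = np.linalg.lstsq(X0, X1, rcond=None)[0].T   # state transition
    H = np.linalg.lstsq(X, Z, rcond=None)[0].T     # observation model
    W = np.cov((X1 - X0 @ A.T).T)                  # process noise covariance
    Q = np.cov((Z - X @ H.T).T)                    # observation noise covariance
    return A, H, W, Q


def kalman_filter(Z, A, H, W, Q, x0, P0):
    """Standard predict/update recursion; returns the filtered state at each step."""
    x, P = x0.copy(), P0.copy()
    I = np.eye(len(x0))
    out = np.empty((len(Z), len(x0)))
    for t, z in enumerate(Z):
        # Predict from the previous estimate
        x = A @ x
        P = A @ P @ A.T + W
        # Update with the current observation
        S = H @ P @ H.T + Q
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (I - K @ H) @ P
        out[t] = x
    return out
```

Because the fitted observation noise covariance `Q` is larger for less reliable inputs, the Kalman gain automatically down-weights poorer decoders in the fused estimate.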

We employed a model similar to the population vector algorithm (PVA) described in Moran and Schwartz.
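For readers unfamiliar with population vector decoding, the following is a minimal sketch of the general idea, assuming each neuron is cosine-tuned to velocity with a baseline rate and a modulation depth. It illustrates the textbook form of the algorithm, not the exact model used in this study, and the function names are hypothetical.

```python
import numpy as np


def fit_cosine_tuning(rates, velocity):
    """Per-neuron least-squares fit of r = b0 + bx*vx + by*vy.

    rates: (T, n_neurons), velocity: (T, 2).
    Returns baseline, modulation depth, and unit preferred directions.
    """
    X = np.column_stack([np.ones(len(velocity)), velocity])
    B, *_ = np.linalg.lstsq(X, rates, rcond=None)   # (3, n_neurons)
    b0 = B[0]
    mod = np.linalg.norm(B[1:], axis=0)
    pd = (B[1:] / mod).T                            # (n_neurons, 2)
    return b0, mod, pd


def pva_decode(rates, b0, mod, pd):
    """Sum preferred directions weighted by normalized rate deviations."""
    w = (rates - b0) / mod
    # The 2/N factor recovers velocity when preferred directions are uniform
    return (2.0 / pd.shape[0]) * w @ pd
```

The 2/N scaling is exact only when preferred directions are evenly distributed around the circle; with randomized directions the estimate is biased toward over-represented directions, which is one reason decoders of this type can be less accurate than regression-based ones.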

The linear filters constructed for decoding used sliding windows of length four timepoints to form a response matrix of neuron firing rates. To train each filter, we performed a multiple regression of the x and y velocities over a response matrix spanning the entire training set:
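The regression just described can be sketched as below. This is an illustrative sketch only: the constant (bias) column, the ordering of lags within a row, and the function names are assumptions not stated in the text.

```python
import numpy as np


def build_response_matrix(rates, n_lags=4):
    """Stack the last n_lags firing-rate vectors into one row per timestep.

    rates: (T, n_neurons). Returns (T - n_lags + 1, 1 + n_lags * n_neurons),
    with a leading column of ones (an assumed bias term).
    """
    T = rates.shape[0]
    rows = [rates[t - n_lags + 1:t + 1].ravel() for t in range(n_lags - 1, T)]
    R = np.asarray(rows)
    return np.hstack([np.ones((len(R), 1)), R])


def train_linear_filter(rates, velocity, n_lags=4):
    """Multiple regression of x and y velocity on the response matrix."""
    R = build_response_matrix(rates, n_lags)
    F, *_ = np.linalg.lstsq(R, velocity[n_lags - 1:], rcond=None)
    return F


def decode_linear_filter(rates, F, n_lags=4):
    """Apply a trained filter; the first n_lags - 1 timesteps have no prediction."""
    return build_response_matrix(rates, n_lags) @ F
```

With 50 neurons and a window of four timepoints, each row of the response matrix would hold 200 firing-rate entries plus the assumed bias term.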

Evaluation trials were designed to compare the accuracy of individual decoder predictions to “fused” results obtained from the Kalman filter and ANNs. Below we describe the three major components of each experiment: (i) individual decoder training, (ii) fusion decoder training, and (iii) final testing. See

Flowchart describing fusion of Kalman filter (KF), PVA, and the optimal linear decodes using the Kalman filter and ANNs. Experimental trials contained three major phases: (i) individual decoder training, (ii) fusion decoder training, and (iii) final testing. In each experiment, individual decoders were first trained using the same simulated spike count data. Next, fusion decoders were trained on the individual decoders' outputs (predicted velocity components in

Each single decoder (PVA, Kalman filter, and optimal linear decoder) was trained on an identical dataset composed of 50 simulated neuron spike observations with a corresponding endpoint path. Trials associated with high-quality and poor-quality decoders used training datasets with 3,000 and 1,500 time-steps, respectively.
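A training set of this shape could be generated along the following lines. This is a hedged sketch, assuming the dataset covers 50 neurons that are cosine-tuned to velocity with randomized preferred directions and Poisson-distributed counts; the baseline rate, gain, bin width, and the function name `simulate_spike_counts` are all hypothetical choices for illustration.

```python
import numpy as np


def simulate_spike_counts(velocity, n_neurons=50, base_rate=10.0,
                          gain=8.0, dt=0.1, seed=0):
    """Poisson spike counts from cosine-tuned neurons (illustrative parameters).

    velocity: (T, 2) endpoint velocities.
    Returns counts (T, n_neurons) and unit preferred directions (n_neurons, 2).
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n_neurons)    # randomized preferred directions
    pd = np.column_stack([np.cos(theta), np.sin(theta)])
    rates = base_rate + gain * np.asarray(velocity) @ pd.T
    rates = np.clip(rates, 0.0, None)                   # firing rates cannot be negative
    counts = rng.poisson(rates * dt)                    # Poisson noise per time bin
    return counts, pd
```

Calling this with a 3,000-step or 1,500-step velocity array yields datasets matching the sizes quoted for the high-quality and poor-quality trials, respectively.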

When training the decision fusion algorithms, a set of predictions for each individual decoder is required. One could simply let the single decoders make predictions based on the initial training dataset, but this could lead to overfitting and poor performance on new data. To avoid this, a second dataset for

After training the fusion and individual decoders, a set of trajectories and corresponding spike signals were generated for testing. Each trajectory represented 3,000 timesteps. For each trial, cortical spike counts were input to individual decoders, which output predictions for

We generated random trajectories in 2-dimensional position space according to the following model:

We present the fusion problem in the context of estimating the endpoint velocity of a prosthetic arm using several different decoding algorithms of varying accuracy. Decoding studies often focus on endpoint trajectories, leaving the controls of the limb to determine optimal joint positions and velocities by inverse kinematics.

To investigate these fusion methods, we simulated neural spike data and implemented the following algorithms for spike decoding: standard Kalman filter

Testing the fusion algorithms first required training each individual decoder. Each trained algorithm was then used to decode a fusion training dataset and a separate fusion validation dataset for training the artificial neural network. The use of a validation dataset prevents overtraining of the ANN. The outputs of the trained algorithms (in our case x and y velocities) served as inputs to train the fusion algorithms (

We measure the accuracy of the decoded trajectories in terms of the root mean squared (RMS) error.
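One common way to compute such an error for 2-dimensional velocities is sketched below. The exact definition used in this study (for example, per component versus per vector) is not recoverable from the text, so the vector form here is an assumption, as is the function name.

```python
import numpy as np


def rms_error(v_true, v_pred):
    """Root mean squared Euclidean error between two (T, 2) velocity arrays."""
    v_true = np.asarray(v_true, dtype=float)
    v_pred = np.asarray(v_pred, dtype=float)
    sq_err = np.sum((v_true - v_pred) ** 2, axis=1)   # squared error per timestep
    return float(np.sqrt(np.mean(sq_err)))
```

The same function can be applied to each individual decoder and to each fused output so that all methods are scored on an identical test trajectory.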

RMS error of the fusion ANNs across 1st- and 2nd-layer sizes. Note the first column in each matrix corresponds to all single hidden-layer networks. Interestingly, many single hidden-layer networks outperform more complex networks, indicating the dynamic accuracies of different neural network topologies.

Trial | 1 | 2 | 3 | 4 |
1st hidden layer | 8 | 11 | 12 | 6 |
2nd hidden layer | 11 | 7 | 10 | 10 |
RMS error | 0.085±0.002 | 0.083±0.002 | 0.097±0.003 | 0.103±0.004 |

The final decoded trajectories are presented in

Trial | 1 | 2 | 3 | 4 |

0.073±0.001 | 0.069±0.001 | 0.126±0.004 | 0.090±0.004 | |

0.093±0.002 | 0.090±0.002 | 0.102±0.003 | 0.107±0.005 | |

0.174±0.003 | 0.172±0.003 | 0.179±0.003 | 0.203±0.011 | |

0.119±0.004 | ||||

0.0850±0.002 | 0.083±0.002 | 0.103±0.004 |

The accuracy of neural decoders depends not only on the sophistication of the decoding algorithms but also on the physical recording locations and the nature of the signals. A few millimeters of discrepancy in electrode placement can dramatically impact decoding accuracy

To address this scenario, we subsequently tested the ability of our fusion algorithms to handle poor-quality decoding. We generated a simulated neural training set lacking sufficient complexity and size and retrained the individual decoders on it, which resulted in unacceptable decoding accuracy. We ran four decoding trials, comparing the fusion outputs to the single decoders.

These two sets correspond to trials 2 and 3 in

Trial | 1 | 2 | 3 | 4 |

0.557±0.009 | 0.828±0.017 | 0.549±0.009 | 0.898±0.017 | |

0.371±0.006 | 0.550±0.012 | 0.365±0.006 | 0.523±0.010 | |

0.248±0.005 | 0.318±0.013 | 0.238±0.004 | 0.285±0.005 | |

0.108±0.002 | 0.198±0.003 | |||

0.116±0.005 | 0.091±0.002 |

To determine if the improvement of the fusion algorithms was statistically significant, we generated 468 additional randomized trajectories (selected from a large space of smooth realistic movements, see Methods) and corresponding simulated neural spike datasets. For each trial, we employed only a single ANN topology, because searching a space of topologies is not feasible for real-time decoding. The selected ANN used a single hidden layer with six hidden units, the same as the number of input nodes. The fusion Kalman filter resulted in significantly lower RMS error

The improvement of fusion algorithms over the combined individual decoders was statistically significant (

We have described a framework for fusing decisions in the context of multimodal prosthetic devices. Investigating the Kalman filter and ANNs, we have shown that each fusion method is capable of producing accurate fusion decodes and can adapt to decodes of varying quality over time.

While our expertise is targeted towards neural decoding for prosthetic limb movement, this approach may be generalized to the larger field of brain-machine interfaces (BMIs) to help improve communication for patients suffering from severe paralysis, locked-in syndrome, and other neurological injuries. Recent BMI studies have demonstrated success in providing some level of communication for subjects

The computational expense of a fusion step in a neural prosthetic device is of notable importance. Each of the methods examined in this study is capable of running in real-time on a single processor, which is likely to be the hardware implementation of such a framework. Furthermore, the computational cost of individual modality decoders is increasing considerably, with many suggesting parallel processing implementations

Progress in neural recording technologies may eventually lead to opportunities for data fusion, where a single decoder is used on all modalities simultaneously. Our choice to employ decision fusion in this study was in large part due to the current capabilities of neural prostheses and those in development, making our findings timely.

Our results must be qualified because of the artificial nature of our cortical spike data. Though our analysis is based on simulated neural activity, we sought to capture the fundamental features of spike data including: a realistic number of monitored neurons, randomized preferred directions, and firing rates exhibiting Poisson noise. Our simulated neurons are indeed close to ideal, but we have shown the significant improvement decision fusion can provide when fusing predictions from decoders of variable accuracy – a result independent of the simulated data itself. Currently, no continuous real-time multimodal neural data recordings are available, but several are in production, and the community has shown an evident interest in this direction

An ideal neural prosthesis will be fully autonomous, capable of independently retraining and adapting to different human conditions and mechanical failure. Electrode loss is arguably the most important limiting factor for neural prostheses proliferation

Neural prosthetics is a swiftly evolving field with ambitious goals. Restoring the functionality of a limb for an individual will require innovative technology and robust computational methods to rapidly and accurately assess user intent.

We are grateful to Matthew Para, Francesco Tenore, Vikram Aggarwal (Johns Hopkins U.), and Cevat Ustun (Caltech) for helpful conversations regarding this project. We also thank Dan Mendat (Rutgers), David Huberdeau (Johns Hopkins U.), John Kegelman (Johns Hopkins U.) and Justin Bartley for their technical contributions in the Revolutionizing Prosthetics 2009 team at the Johns Hopkins University Applied Physics Lab.