The authors have declared that no competing interests exist.

Cells are crowded and spatially heterogeneous, complicating the transport of organelles, proteins and other substrates. One aspect of this complex physical environment, the mobility of passively transported substrates, can be quantitatively characterized by the diffusion coefficient: a descriptor of how rapidly substrates will diffuse in the cell, dependent on their size and effective local viscosity. The spatial dependence of diffusivity is challenging to quantitatively characterize, because temporally and spatially finite observations offer limited information about a spatially varying stochastic process. We present a Bayesian framework that estimates diffusion coefficients from single particle trajectories, and predicts our ability to distinguish differences in diffusion coefficient estimates, conditional on how much they differ and the amount of data collected. This framework is packaged into a public software repository, including a tutorial Jupyter notebook demonstrating implementation of our method for diffusivity estimation, analysis of sources of uncertainty estimation, and visualization of all results. This estimation and uncertainty analysis allows our framework to be used as a guide in experimental design of diffusivity assays.

Diffusion is essential for the intra-cellular transport of many organelles, proteins and substrates. In the crowded and heterogeneous physical environment of the cell, diffusivity is a local, spatially dependent characteristic of the space, dependent on factors such as the size of the particle, and the local viscosity and spatial crowding. These spatial heterogeneities must be addressed when using diffusion coefficients as readouts of intra-cellular transport and the physical environment. This intra-cellular diffusion coefficient is often experimentally estimated through two approaches: single particle tracking (SPT) [

In single particle tracking experiments, a live cell is imaged in successive frames, and individual punctate objects are tracked to construct a trajectory of time-dependent positions (

In SPT, a live cell is imaged over a series of time points. Individual punctate objects are localized at each time-step, and these positions are traced from frame to frame to produce individual time-lapse trajectories.

For objects undergoing homogeneous isotropic diffusion, the MSD of puncta is a linear function of lag time (

In FCS, a laser illuminates a region of a sample containing fluorescently tagged particles [

Like FCS, SPT can be used to probe local diffusivities and is robust to anomalous diffusion models [

While powerful analyses from SPT have indicated the complexity of transport in live cells, the spatial variation of the diffusion coefficient remains poorly characterized. This can be attributed to challenges in disentangling effects of biological heterogeneity and limited sampling of a stochastic process [

Other packages with information theoretic frameworks for trajectory analysis have been released; for example, the Single-Molecule Analysis by Unsupervised Gibbs sampling (“SMAUG”) software package [

We generated sample trajectories with known diffusion coefficients by simulating Brownian motion of particles in a $d$-dimensional space. At each time-point and along each spatial dimension, a step size was drawn from a zero-mean Gaussian distribution with variance $\sigma^2$ defined by the diffusion coefficient: $\sigma^2 = \langle |\Delta x|^2 \rangle = 2D\tau$, where $\tau$ is the time step between frames.
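As a minimal sketch of this simulation step (not the repository's own code), the snippet below draws Gaussian steps with per-dimension variance $2D\tau$ and accumulates them into positions. The function name `simulate_brownian`, its parameters, and the default time step are illustrative assumptions.

```python
import numpy as np

def simulate_brownian(n_steps, D, tau=1.0, n_dim=2, rng=None):
    """Return an (n_steps + 1, n_dim) array of positions starting at the origin."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.sqrt(2.0 * D * tau)  # per-dimension step standard deviation
    steps = rng.normal(0.0, sigma, size=(n_steps, n_dim))
    # Prepend the origin, then accumulate the steps into positions.
    return np.vstack([np.zeros((1, n_dim)), np.cumsum(steps, axis=0)])

# Hypothetical example: 100 steps in 2D with D = 0.01 (units assumed to be um^2/s).
trajectory = simulate_brownian(n_steps=100, D=0.01, rng=np.random.default_rng(0))
```

Passing a seeded generator makes a simulated trajectory reproducible, which is convenient when comparing the same path with and without localization error.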

A 2D diffusive trajectory with no localization error is drawn for T time-steps. At each time-step, a cloud of Gaussian uncertainty is drawn; the shape and shading of this cloud demonstrate how likely it is for the position to be measured at any of the surrounding points rather than at the true position. A sample alternative trajectory is drawn (purple) showing the path we might observe the particle to take, due to the localization error in measuring the true position as a function of time.

To mimic the static localization error inherent in microscopy-generated trajectories in our simulated trajectories, we added Gaussian error to the locations of simulated particles at each time point [
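A minimal sketch of this step, together with the extraction of frame-to-frame displacements, is given below. It assumes pandas is used for storage; the column names, parameter values, and units are hypothetical choices for illustration rather than the package's actual schema.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical values: D in um^2/s, tau in s, loc_std in um.
D, tau, n_steps, loc_std = 0.01, 1.0, 100, 0.05

# True 2D Brownian positions: per-dimension step variance is 2 * D * tau.
true_xy = np.cumsum(rng.normal(0.0, np.sqrt(2 * D * tau), size=(n_steps + 1, 2)), axis=0)
# Static localization error: independent Gaussian noise at every time point.
observed_xy = true_xy + rng.normal(0.0, loc_std, size=true_xy.shape)

traj = pd.DataFrame({
    "t": np.arange(n_steps + 1) * tau,
    "x_true": true_xy[:, 0], "y_true": true_xy[:, 1],
    "x_obs": observed_xy[:, 0], "y_obs": observed_xy[:, 1],
})
# Frame-to-frame displacements; the first row has no predecessor and is dropped.
steps = traj[["x_true", "y_true", "x_obs", "y_obs"]].diff().dropna()
```

Because the error at consecutive frames is independent, static error of standard deviation $s$ raises the per-dimension variance of the observed displacements from $2D\tau$ to $2D\tau + 2s^2$, which is why unmodelled localization error biases diffusivity estimates upward.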

The locations of the simulated particle at each time-point (with and without error included) are stored in a DataFrame, and these trajectories are digested into frame-to-frame displacements; realistically these step sizes were

To estimate the diffusivity underlying a single trajectory (and our uncertainty in this estimation), we employ Bayesian inference [

The parameters

The posterior distribution peaks near the true diffusion coefficient and has a width corresponding to the confidence interval of our estimate, which is largely determined by the trajectory length and magnitude of localization error.

To characterize our uncertainty on whether trajectories come from regions with different diffusivities, we require a way to quantitatively discriminate between pairs of posterior distributions. To achieve this, we use the Kullback-Leibler (KL) divergence. The KL divergence acts as a single-value estimation of how well we can analytically distinguish whether the step sizes from a trajectory came from the diffusivity predicted by one posterior or the other. The KL divergence of two inverse-gamma distributions

A repository for our source code is publicly available at the Allen Cell Modeling GitHub page

When the position of a diffusing object is recorded as a trajectory of discrete steps in time, the sizes of those steps can be mathematically represented as stochastic draws from a distribution characterized by the diffusion coefficient. Our method for estimating the diffusion coefficient relies on breaking individual trajectories into frame-to-frame steps, and applying a Bayesian statistical framework to predict the diffusivity underlying each set of stochastically derived step sizes. From a single trajectory, this framework provides not only an estimation of the diffusivity, but also a representation of our uncertainty. While our framework could be adapted to analyze more complex dynamic models, our current implementation introduces a workflow for analyzing isotropic homogeneous diffusion; therefore, trajectories with unknown diffusivity will result in a step-size distribution which is normally distributed, with zero mean and unknown variance

Bayesian inference is built on the use of prior and posterior distributions [

In this section, we will step through the process of applying Bayesian analysis to our particular case. First, we introduce the governing principle of this approach, Bayes’ theorem [

Bayes’ theorem tells us that the posterior distribution for an unknown variable

How does this apply to the diffusion process we have been exploring? In our problem, we have taken single particle trajectories and split them into frame-to-frame step sizes. We can say, then, that our Bayesian “observed variable ^{2}, and our likelihood function is the normal distribution of step sizes, i.e.

The prior is our initial guess of the probability distribution of values for our unknown variable, $\sigma^2$. To determine the prior distribution for our case, $p(\sigma^2)$, we consider the mathematical dependence of the normally distributed step sizes on the variance $\sigma^2$:

We see that this dependence looks a bit like a gamma distribution, except that our variable of interest is found in the denominator. This class of function is intuitively called an inverse-gamma function. We take $\sigma^2$ values to follow an inverse-gamma distribution, and therefore this is the form of our prior: $p(\sigma^2) = \text{Inv-Gamma}(\sigma^2)$.

We have now seen how to place the observed and unknown Bayesian variables in the context of our problem, and explored the normal and inverse-gamma distributions which can be used as our likelihood and prior distributions, respectively. With these pieces in hand, we can now find the class of function for our posterior distribution, which is proportional to the product of our prior and likelihood distributions. The product of $p(\sigma^2)$ and $p(\Delta x \mid \sigma^2)$ also has an inverse-gamma dependence on $\sigma^2$. We note that our posterior distribution is a function of the same class as the prior; we will come back to this after a brief note.
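The conjugacy can be made explicit with a short derivation. The notation is an assumption made for illustration: $x_1, \dots, x_n$ denote the observed step sizes, and $\alpha$ and $\beta$ denote the shape and scale of the inverse-gamma prior.

```latex
\begin{align}
p(x_1,\dots,x_n \mid \sigma^2) &\propto (\sigma^2)^{-n/2}
    \exp\!\left(-\frac{\sum_{i} x_i^2}{2\sigma^2}\right), \\
p(\sigma^2) &\propto (\sigma^2)^{-\alpha-1}
    \exp\!\left(-\frac{\beta}{\sigma^2}\right), \\
p(\sigma^2 \mid x_1,\dots,x_n) &\propto (\sigma^2)^{-(\alpha+n/2)-1}
    \exp\!\left(-\frac{\beta+\tfrac{1}{2}\sum_{i} x_i^2}{\sigma^2}\right).
\end{align}
```

The last line is again an inverse-gamma density, now with shape $\alpha + n/2$ and scale $\beta + \tfrac{1}{2}\sum_i x_i^2$, so updating on data only requires the number of steps and their sum of squares.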

In this section we have built up a framework for performing Bayesian analysis to estimate a distribution of variances, but we promised an estimation of the diffusion coefficient. Now let us recall that the variance of the diffusive step size distribution is directly proportional to the diffusion coefficient ($\sigma^2 = 2D\tau$).
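The conversion from a posterior over the variance to a posterior over the diffusion coefficient can be sketched as follows. This is an illustrative implementation under stated assumptions, not the package's code: the function name, the weak prior parameters `alpha0` and `beta0`, and the pooling of all spatial dimensions into one set of draws are hypothetical choices. It relies on the fact that if $\sigma^2$ follows an inverse-gamma distribution with scale $\beta$, then $D = \sigma^2 / (2\tau)$ follows an inverse-gamma distribution with the same shape and scale $\beta / (2\tau)$.

```python
import numpy as np
from scipy import stats

def diffusivity_posterior(steps, tau=1.0, alpha0=1e-3, beta0=1e-3):
    """Shape and scale of the inverse-gamma posterior over D.

    alpha0 and beta0 are the shape and scale of an assumed weak prior on sigma^2.
    """
    steps = np.asarray(steps, dtype=float).ravel()  # pool all dimensions
    alpha = alpha0 + steps.size / 2.0
    beta_sigma2 = beta0 + 0.5 * np.sum(steps ** 2)
    # D = sigma^2 / (2 tau), so the scale parameter is divided by 2 tau.
    return alpha, beta_sigma2 / (2.0 * tau)

rng = np.random.default_rng(0)
D_true, tau = 0.01, 1.0  # hypothetical values
steps = rng.normal(0.0, np.sqrt(2 * D_true * tau), size=(100, 2))

alpha, beta_D = diffusivity_posterior(steps, tau)
posterior = stats.invgamma(alpha, scale=beta_D)
D_mode = beta_D / (alpha + 1.0)       # mode of an inverse-gamma distribution
low, high = posterior.interval(0.95)  # central 95% credible interval
```

The mode gives a point estimate of the diffusivity, while the interval width summarizes how much a trajectory of this length can constrain it.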

In general, when the prior and posterior for Bayesian analysis take the same mathematical form, the prior is referred to as a “conjugate prior.” The matching of the conjugate prior and posterior function types dramatically simplifies the statistical method, presenting one advantage of this prior. A second advantage of our prior is that the inverse-gamma distribution acts as a conservative initial “guess,” under which any order of magnitude of diffusivity is equally likely before the introduction of any data. In the Bayesian method of statistical inference, the choice of prior can bias our results; for instance, if we expect the diffusivity to be around 1 $\mu\mathrm{m}^2/\mathrm{s}$

The estimation of diffusivity from a single trajectory is limited by the finite trajectory length and the accuracy in localizing the object at each time point. As a result, careful consideration of how each of these factors will impact the estimation uncertainty is necessary when constructing an experimental design. To address this, we have constructed a framework for generating look-up tables that predict the percent error of the posterior diffusivity estimate, conditional on a set of trajectory lengths and localization errors.

Many methods for estimating diffusivity from a single trajectory rely on the analysis of the frame-to-frame step-size distribution extracted from that trajectory. However, during a microscopy experiment, there will always be an inherent limitation to the degree of accuracy that an object can be localized in each frame. This arises from both static and dynamic sources of localization error; static localization error occurs due to the inherent limit to spatial resolution of imaging experiments, while dynamic localization error comes from the non-instantaneous nature of capturing an image resulting in object movement during image acquisition [

As a result of limitations in spatial resolution, when the object is tracked and trajectories generated, an inherent limitation in localization accuracy is encoded in the trajectory, and therefore skews the step-size values being used to infer the diffusion coefficient. To demonstrate the impact of localization error on SPT, we provide an example simulated trajectory with varying amounts of localization error applied (

A 2D diffusive trajectory with no localization error is drawn for T time-steps. That same trajectory is then redrawn in increasingly light colors, for increasing levels of localization error. This error is parameterized in the form of the standard deviation of a Gaussian blur, in microns. This example allows us to visualize the impact that a range of localization errors would have on the same trajectory.

Left: Sample simulated 2D trajectories composed of 100 steps with diffusion coefficients $D_1 = 0.01\ \mu\mathrm{m}^2/\mathrm{s}$ and $D_2 = 0.02\ \mu\mathrm{m}^2/\mathrm{s}$

Diffusive trajectories are composed of successive steps, whose sizes are stochastic draws from a distribution set by the diffusivity. When only short trajectories are available, we have only a limited set of draws from this distribution—as a result, the variance of this distribution is difficult to accurately predict, and the posterior distribution of diffusivity probabilities will be less accurate and precise. While it would be ideal to simply collect longer trajectories, this is often experimentally impossible; therefore, we aim to give experimentalists an analysis framework to estimate how accurately they can predict diffusivity given their own limitations in tracking.

Because our trajectories are simulated, we benefit from the knowledge of the true diffusivity and degree of localization error, and can therefore precisely quantify the relation between the error in our Bayesian estimation of diffusivity and the level of localization error. This provides a look-up table for experimentalists to predict the accuracy in diffusivity estimation that can be achieved with their own particular microscopy experiment, shown in

Of course, due to the stochastic nature of diffusive processes, even with all the same simulation parameters, the posterior error will vary from one simulation to the next. In order to capture the mean effect of each parameter on posterior error, the reported results are averaged over $10^4$ replicates of the same simulation parameterization.
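The look-up table construction described above can be sketched as a nested loop over trajectory lengths and localization errors. The code below is a simplified illustration, not the repository's implementation: the function name, the weak prior parameters, the parameter grid, and the use of 200 replicates (rather than $10^4$, to keep the run time short) are all assumptions.

```python
import numpy as np

def mean_percent_error(n_steps, loc_std, D=0.01, tau=1.0, n_dim=2,
                       n_rep=200, alpha0=1e-3, beta0=1e-3, seed=0):
    """Mean absolute percent error of the posterior-mode estimate of D."""
    rng = np.random.default_rng(seed)
    errors = np.empty(n_rep)
    for r in range(n_rep):
        # Simulate true positions, then add static Gaussian localization error.
        true_pos = np.cumsum(
            rng.normal(0.0, np.sqrt(2 * D * tau), (n_steps + 1, n_dim)), axis=0)
        observed = true_pos + rng.normal(0.0, loc_std, true_pos.shape)
        steps = np.diff(observed, axis=0).ravel()
        # Conjugate inverse-gamma update, rescaled from sigma^2 to D.
        alpha = alpha0 + steps.size / 2.0
        beta_D = (beta0 + 0.5 * np.sum(steps ** 2)) / (2.0 * tau)
        D_mode = beta_D / (alpha + 1.0)
        errors[r] = 100.0 * abs(D_mode - D) / D
    return errors.mean()

lengths = [10, 100, 1000]      # trajectory lengths in steps (hypothetical grid)
loc_errors = [0.0, 0.05, 0.1]  # localization error in microns (hypothetical grid)
table = np.array([[mean_percent_error(n, s) for n in lengths] for s in loc_errors])
```

Each row of `table` corresponds to one localization error and each column to one trajectory length, matching the layout of a heatmap look-up table.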

The percent error for a given posterior is measured as the percent error between the true diffusion coefficient used to generate the trajectory, and the mode of the posterior distribution (or the diffusion coefficient which gives the maximum value of the probability density function). This heatmap reports the mean percent error magnitude for 10^{4} posteriors generated under each set of trajectory length and localization error conditions, with diffusion coefficients of (A) 0.01 ^{2}/^{2}/^{2}/

For example, in a study of the ^{2}/^{2}/

In addition, it should be noted that the number of spatial dimensions of the assay (i.e. whether trajectories are measured in two or three spatial dimensions) as well as the mean-squared displacement (related to the diffusion coefficient) can impact the relationship between localization error and Bayesian estimation error. For a more in-depth discussion and simulation of this, please see the tutorial Jupyter notebook in our project GitHub repository.

With the above percent error analysis derived for simulated trajectories with known diffusivities, a picture arises of how our estimates of the diffusivity differ from the true values. As a result, when this technique is applied to experimentally-derived trajectories whose underlying diffusivities are unknown, we may want to ask ‘how likely is it that two trajectories resulting in different diffusivity estimates were actually derived from regions with the same diffusivity?’ The biological motivation and analog for this technical question is ‘how heterogeneous is the physical cellular environment?’

This will depend on the amount of overlap between the two diffusivity posterior distributions, which is determined by: (1) how different the underlying diffusion coefficients are (how far apart the theoretical maxima of posteriors are) and (2) how uncertain we are in our estimations (how wide the posterior distributions are). One way to measure the difference between two distributions is to use the Kullback-Leibler divergence (KL divergence). A KL divergence of zero indicates that two distributions are identical; one interpretation of this metric is that its inverse tells you the number of times you can draw samples from one distribution in place of the other before there is significant information loss.
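For two inverse-gamma posteriors the KL divergence has a closed form, which can be sketched as below. The function name and the example shape and scale values are hypothetical. The expression used is the standard result for gamma distributions, which carries over because the KL divergence is unchanged by the invertible transformation $x \to 1/x$ relating gamma and inverse-gamma variables. A numerical integration is included as a cross-check.

```python
import numpy as np
from scipy import integrate, special, stats

def kl_inverse_gamma(a_p, b_p, a_q, b_q):
    """KL(p || q) for p = InvGamma(a_p, scale=b_p) and q = InvGamma(a_q, scale=b_q)."""
    return ((a_p - a_q) * special.digamma(a_p)
            - special.gammaln(a_p) + special.gammaln(a_q)
            + a_q * (np.log(b_p) - np.log(b_q))
            + a_p * (b_q - b_p) / b_p)

# Hypothetical posteriors over D, given as (shape, scale) pairs.
p = (101.0, 1.0)
q = (101.0, 1.2)
kl_closed = kl_inverse_gamma(*p, *q)

# Cross-check: integrate p(x) * log(p(x) / q(x)) over the bulk of p.
P = stats.invgamma(p[0], scale=p[1])
Q = stats.invgamma(q[0], scale=q[1])
kl_numeric, _ = integrate.quad(
    lambda x: P.pdf(x) * (P.logpdf(x) - Q.logpdf(x)), P.ppf(1e-9), P.isf(1e-9))
```

Note that the KL divergence is not symmetric, so `kl_inverse_gamma(*p, *q)` and `kl_inverse_gamma(*q, *p)` generally differ; which direction to report is a design choice.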

In order to communicate the distinguishability of pairs of posteriors conditional on their trajectory parameters, we have created a heatmap look-up table of the KL divergence of posterior pairs, dependent upon the ratio of their underlying diffusion coefficients (i.e. $D_2/D_1$), and the trajectory length. An example of this look-up table heatmap is provided in

Heatmap displaying the average KL divergence of diffusivity posteriors. For each entry in the heatmap, two trajectories of the same length (x-axis) are produced, with differing underlying diffusivities with the ratio $D_2/D_1$ (y-axis). A posterior is estimated for each, and their KL divergence is calculated as a measure of the distinguishability of the underlying diffusivities. As this process is stochastic, it is repeated $10^4$ times, with the average being the value reported in the heatmap.

Given a single trajectory, let us compare what we could learn of the underlying diffusivity through MSD analysis and our Bayesian framework. In MSD analysis, the trajectory would be split into step sizes associated with every possible lag time (that is, the mean of the squared displacement for all step sizes between frames

This confidence interval offers an added benefit over MSD analysis. Through posterior visualization and the KL divergence analysis described in the previous section, this Bayesian estimation framework provides us with a straightforward visual and quantitative way to diagnose how likely it is that diffusivity estimates from two trajectories are actually describing regions with different physical properties. In the case of MSD, comparison of single-trajectory diffusivity estimates is done by plotting

In the introduction of this paper, we discussed the importance of analysis techniques that acknowledge the heterogeneity of cellular environments. The single-trajectory dependence of this tool offers a framework to build on for characterizing variations in the diffusivities felt by trajectories recorded in different cellular regions. By mapping the diffusivity estimates from each trajectory (value most probable from posterior distribution) to the spatial region where the tracked substrate was localized, the user can build up a spatial mapping of the diffusivity. While frameworks exist for spatial mapping of the physical properties of cells, such as nanorheology of injected particles [

As we have discussed, the presence of localization error and the finite nature of trajectories will contribute to the uncertainty in any analysis of single particle trajectories. Here, we discuss several other important limitations to be considered when using this software package.

This framework is currently only implemented for the analysis of pure diffusion; however, anomalous diffusion (particularly sub-diffusion) is commonly reported in the analysis of biological trajectories. Users could adapt the package to analyze trajectories undergoing anomalous diffusion by editing our Bayesian estimation code. We have described how our conjugate prior and posterior model have been selected specifically to analyze a normal distribution of step sizes with zero mean; because the step size distribution is dependent upon the diffusion model, the class of function used for the prior and posterior will also be dependent upon the diffusion model. To modify this framework for other diffusion models, users would therefore select new prior and posterior distributions, and would require a new equation for calculating the KL divergence for a pair of distributions belonging to this mathematical function class (i.e. a replacement for

Realistic intra-cellular transport is additionally complicated by the presence of active transport and flow. Furthermore, the effects of confinement and the physical properties of the cytoplasm (e.g. elasticity) can further complicate intra-cellular dynamics. As these factors are not considered in the current implementation of our framework, they will contribute to the error in the analysis of experimentally derived trajectories.

Many research studies have demonstrated intracellular transport to be sub-diffusive (i.e.

The percent error for a given posterior is measured as the percent error between the true diffusion coefficient used to generate the trajectory, and the mode of the posterior distribution (or the diffusion coefficient which gives the maximum value of the probability density function). Trajectories for this figure are simulated using fractional Brownian motion with Hurst coefficient ^{4} posteriors generated under each set of trajectory length and localization error conditions, with diffusion coefficients of (A) 0.01 ^{2}/^{2}/^{2}/
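One common way to simulate fractional Brownian motion is to draw correlated increments (fractional Gaussian noise) through a Cholesky factorization of their covariance matrix. The sketch below illustrates that approach and is not necessarily how the figure was generated: the function names, the Hurst value of 0.4, and the choice to scale each single-step increment to variance $2D\tau$ are assumptions.

```python
import numpy as np

def fgn_covariance(n, hurst):
    """Covariance matrix of n unit-variance fractional Gaussian noise increments."""
    k = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    return 0.5 * ((k + 1) ** (2 * hurst)
                  - 2 * k ** (2 * hurst)
                  + np.abs(k - 1) ** (2 * hurst))

def simulate_fbm(n_steps, hurst, D=0.01, tau=1.0, n_dim=2, rng=None):
    """Positions of fractional Brownian motion via Cholesky factorization.

    Each single-step increment has variance 2 * D * tau (a scaling assumption);
    hurst = 0.5 recovers ordinary Brownian motion.
    """
    rng = np.random.default_rng() if rng is None else rng
    chol = np.linalg.cholesky(fgn_covariance(n_steps, hurst))
    # Correlate independent standard normal draws along the time axis.
    increments = np.sqrt(2 * D * tau) * chol @ rng.standard_normal((n_steps, n_dim))
    return np.vstack([np.zeros((1, n_dim)), np.cumsum(increments, axis=0)])

# Hypothetical sub-diffusive example (Hurst coefficient below 0.5).
traj = simulate_fbm(n_steps=200, hurst=0.4, rng=np.random.default_rng(0))
```

For a Hurst coefficient below 0.5, consecutive increments are negatively correlated, which violates the independent-step assumption of the pure-diffusion likelihood and is one source of the estimation error reported for sub-diffusive trajectories.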

Heterogeneity of diffusive dynamics may majorly impact the transport of essential cellular substrates but remains largely uncharacterized. To shed light on the feasibility of resolving spatial from stochastic drivers of diffusive heterogeneity in trajectory data, we developed a framework for predicting our ability to detect differences in diffusivity under different experimental regimes. Our framework is intended to inform the design of experiments characterizing the spatial dependence of diffusivity on sub-cellular location.

We would like to thank Steph Weber, for her helpful comments on this manuscript, and Molly Maleckar, Gabriel Mitchell and Jamie Sherman for their helpful conversations. We thank Jackson Brown for his CookieCutter template and guidance in repository initialization, and Thao Do for her scientific illustration. Finally, we thank Paul G. Allen, founder of the Allen Institute for Cell Science, for his vision, encouragement and support.

PONE-D-19-23195

A Bayesian framework for the detection of diffusive heterogeneity

PLOS ONE

Dear Dr Cass,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Nov 17 2019 11:59PM. When you are ready to submit your revision, log on to

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see:

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Juan Carlos del Alamo

Academic Editor

PLOS ONE

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This study describes a Bayesian inference algorithm to estimate local values of the diffusion coefficient inside live cells from single trajectories of intracellular particles. This type of algorithm can be useful to researchers interested in quantifying diffusivity of heterogeneous or time-varying environments, including but not limited to the cytoplasm of live cells. The manuscript describes related existing efforts in the literature, and makes a convincing point that the present algorithm, and in particular its associated freely accessible implementation, is sufficiently different from those existing efforts. In particular, while the parametric nature of the present study is a limitation with respect to existing, non-parametric, efforts, its simplicity may be advantageous to those without significant expertise in statistical mechanics.

The manuscript analyzes the error in the estimated D based on the localization error of the particle. As expected, the error decreases with the length of the recorded trajectory. However, Figure 5 shows this error seems to be unacceptably large for some combinations of values, and it is unclear whether a “typical experiment” (this is a loose term whose meaning is expanded below) would yield acceptable results. The authors argue that the purpose of Figure 5 is for each researcher to assess the error for their own experiments. This is valuable but Figure 5 is plotted in a way that makes this assessment difficult:

1) Only two values of D are covered. It would seem to make more sense to plot Figure 5 normalizing the localization error with (D*tau)^(1/2). This could capture the D-dependence of the estimation error, and only one panel might be necessary to cover all D values.

2) Second, a line plot or contour plot format would be preferable to read errors in the plot.

3) It would be informative to represent a “typical experiment” or experiments in the localization error and trajectory length coordinates of Figure 5. The authors can use experiments from the literature and / or their own data from previous studies.

As the authors point out, a limitation of the study is that it focuses on an idealized model of intracellular diffusion. The authors argue that the method could be adjusted to account for complicated phenomena, such as subdiffusion, but the intended audience of this algorithm may not find this straightforward. This issue is compounded with the fact that this reviewer finds the purely diffusive case to be particularly amenable to the analytical calculation of the posterior distribution. Other cases may be harder… It would be informative to illustrate how the algorithm would be modified in the subdiffusive or e.g., persistent-random walk case by presenting the posterior distribution for those processes.

Finally, there may be cases in which modifying the algorithm to account for non-purely diffusive, isotropic behavior is not feasible or where the actual behavior that needs to be accounted for is unknown a priori. It would be informative to know the error in estimated D in those cases. Again, subdiffusive, persistent-random or anisotropic random walk cases come to mind.

Minor comments:

1) Is the alpha in equation 2 related to the alpha in equation 1?

**********

6. PLOS authors have the option to publish the peer review history of their article (

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Reviewer #1: No

This response is better viewed in the Response to Reviewers file, where our responses are interwoven with the reviewer's comments and clearly indicated. I have copied this text and placed the reviewer's original notes in brackets, to give our responses appropriate context.

[This study describes a Bayesian inference algorithm to estimate local values of the diffusion coefficient inside live cells from single trajectories of intracellular particles. This type of algorithm can be useful to researchers interested in quantifying diffusivity of heterogeneous or time-varying environments, including but not limited to the cytoplasm of live cells. The manuscript describes related existing efforts in the literature, and makes a convincing point that the present algorithm, and in particular its associated freely accessible implementation, is sufficiently different from those existing efforts. In particular, while the parametric nature of the present study is a limitation with respect to existing, non-parametric, efforts, its simplicity may be advantageous to those without significant expertise in statistical mechanics.]

We’d like to thank the reviewer for their thoughtful consideration of our manuscript and helpful comments. Indeed, we hope that this tool may be of particular use to researchers without deeply computational or statistical backgrounds, offering a modifiable tool with more flexibility than the MSD, but less complexity than existing software that can have a higher barrier to entry.

[The manuscript analyzes the error in the estimated D based on the localization error of the particle. As expected, the error decreases with the length of the recorded trajectory. However, Figure 5 shows this error seems to be unacceptably large for some combinations of values, and it is unclear whether a “typical experiment” (this is a loose term whose meaning is expanded below) would yield acceptable results. The authors argue that the purpose of Figure 5 is for each researcher to assess the error for their own experiments. This is valuable but Figure 5 is plotted in a way that makes this assessment difficult:

1) Only two values of D are covered. It would seem to make more sense to plot Figure 5 normalizing the localization error with (D*tau)^(1/2). This could capture the D-dependence of the estimation error, and only one panel might be necessary to cover all D values.

2) Second, a line plot or contour plot format would be preferable to read errors in the plot.]

I agree with the reviewer that this kind of non-dimensionalization would offer a more robust representation of the model results. There are many relevant parameters whose relative values impact the estimation error, and careful choice of how to represent this data is certainly important. However, since we are aiming for this tool to be engaging for a more experimental audience, we wanted to maintain the use of more tangible parameters, which maintain their straightforward physical interpretability, and chose this format to prioritize familiarity and approachability over density of information reporting.

For any given experiment you may have a single localization error and a limited range of trajectory lengths, so we understand why the parameterization of this figure may not be the most versatile in its use for any one experiment. However, we hope that this parameterization of the lookup table might instead be applicable to a wider audience. As this figure is viewed by a wide audience of researchers with trajectories of varying lengths and localization errors constraining their experiments, we hope they can get a ballpark answer for the question of whether this tool will be useful for them. While the diffusion coefficient value is of course relevant as well, we hope that representing the effects on three different orders of magnitude of diffusivities can give the reader a taste of what might be possible (a third order of magnitude was added in our revised manuscript).

That said, the tool is accompanied by a tutorial Jupyter notebook with examples of how each figure is generated. This was a conscious choice to give those without extensive computational backgrounds a more approachable interface to the code, so that they can tweak analysis parameters themselves, see how the results change as they switch out parameter values, and generate adaptations of our provided figures tailored to their own experimental parameters and constraints. We hope that this design can help make up for the limitations in what can be presented in the manuscript figures.

[3) It would be informative to represent a “typical experiment” or experiments in the localization error and trajectory length coordinates of Figure 5. The authors can use experiments from the literature and / or their own data from previous studies.]

Thank you for this feedback - we have incorporated this suggestion.

[As the authors point out, a limitation of the study is that it focuses on an idealized model of intracellular diffusion. The authors argue that the method could be adjusted to account for complicated phenomena, such as subdiffusion, but the intended audience of this algorithm may not find this straightforward. This issue is compounded with the fact that this reviewer finds the purely diffusive case to be particularly amenable to the analytical calculation of the posterior distribution. Other cases may be harder... It would be informative to illustrate how the algorithm would be modified in the subdiffusive or e.g., persistent-random walk case by presenting the posterior distribution for those processes.

Finally, there may be cases in which modifying the algorithm to account for non-purely diffusive, isotropic behavior is not feasible or where the actual behavior that needs to be accounted for is unknown a priori. It would be informative to know the error in estimated D in those cases. Again, subdiffusive, persistent-random or anisotropic random walk cases come to mind.]

We agree that demonstrating example adaptations of the prior and posterior for more complex intracellular motion would be an exciting enhancement of what this manuscript could offer; however, these advancements are beyond the scope of what we hope to explicitly provide within this manuscript. We hope that the detailed demonstration of how this tool is built, together with the acknowledgement that prior and posterior distributions may be adapted for more complex needs, is sufficient to demonstrate the value of this tool.

However, we agree with the reviewer’s important note that sub-diffusive motion deserves to be addressed in greater detail. To address this concern, we have taken the reviewer’s suggestion of applying our analysis tool (with the existing prior and posterior designed for pure diffusion) to biologically relevant sub-diffusive trajectories and reported the resulting estimation error. For this task we have used trajectories simulated using fractional Brownian motion, as previous work has shown this to be a prevalent mode of intracellular transport. The Hurst coefficient (H) used to define this process is H = alpha/2, where alpha is the parameter giving the MSD’s scaling with time lag (tau) as in Eq 1 of our manuscript. Thus, we have set the Hurst coefficient in the simulated trajectories to H = 0.375, corresponding to the reported sub-diffusive time scaling alpha = 0.75.
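To make the relation H = alpha/2 concrete, the following minimal sketch samples one-dimensional fractional Brownian motion by Cholesky factorization of its covariance and recovers alpha from the ensemble MSD. It is an illustration under stated assumptions rather than the simulation code used for the manuscript: the function names, path length, number of paths and unit MSD prefactor are all hypothetical choices.

```python
import numpy as np


def simulate_fbm(n_steps, hurst, dt=1.0, n_paths=1, rng=None):
    """Sample 1D fractional Brownian motion paths by Cholesky factorization.

    Returns an array of shape (n_paths, n_steps) holding positions at
    times dt, 2*dt, ..., n_steps*dt (each path starts from 0 at time 0).
    """
    rng = np.random.default_rng() if rng is None else rng
    t = dt * np.arange(1, n_steps + 1)
    s, u = np.meshgrid(t, t, indexing="ij")
    # fBm covariance: 0.5 * (s^2H + u^2H - |s - u|^2H)
    cov = 0.5 * (s ** (2 * hurst) + u ** (2 * hurst) - np.abs(s - u) ** (2 * hurst))
    chol = np.linalg.cholesky(cov)
    return (chol @ rng.standard_normal((n_steps, n_paths))).T


def msd_exponent(paths, dt=1.0):
    """Fit alpha from the ensemble mean squared displacement from the origin."""
    t = dt * np.arange(1, paths.shape[1] + 1)
    msd = np.mean(paths ** 2, axis=0)
    slope, _ = np.polyfit(np.log(t), np.log(msd), 1)
    return slope


alpha = 0.75  # sub-diffusive scaling quoted above, so H = 0.375
paths = simulate_fbm(n_steps=64, hurst=alpha / 2, n_paths=2000,
                     rng=np.random.default_rng(0))
print(f"fitted alpha = {msd_exponent(paths):.2f} (target {alpha})")
```

The Cholesky approach is exact but scales cubically with trajectory length, so it is only practical for short paths such as those in this sketch.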

We feel this was an important addition to the manuscript in demonstrating the applicability of the tool and are grateful for the reviewer’s suggestion to include this.

[Minor comments:

1) Is the alpha in equation 2 related to the alpha in equation 1?]

Thanks for pointing this out! No, the two are not related. We’ve changed the inverse gamma parameters to (a, b) rather than (alpha, beta) to disambiguate.


PONE-D-19-23195R1

A Bayesian framework for the detection of diffusive heterogeneity

PLOS ONE

Dear Dr Cass,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

We would appreciate receiving your revised manuscript by Apr 26 2020 11:59PM. When you are ready to submit your revision, log on to

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see:

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Juan Carlos del Alamo

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: N/A

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I appreciate the authors' efforts to address my concerns and am for the most part satisfied with their revisions. I have a couple of remaining comments.

First, the list of references is rather short and there are places at which the authors discuss standard statistical inference theory without providing appropriate references to the literature. Perhaps a graduate level textbook would be enough. I believe this would be important considering the targeted audience.

Second, I still believe the authors overestimate the generality / flexibility of their approach. I understand the computational framework they present could be extended to other scenarios more representative of intracellular fluctuations than a Gaussian process. However, it is not clear that these extensions would be trivial. In fact, they even recognize this point themselves (circa line 370). I appreciate the authors including a section where they benchmark their tool for fractional Brownian motion. I would suggest to temper the statements about generality of the framework. Also, please use the same color axis and color bars in figures 5 and 7 to facilitate direct comparison (as in by caxis of Matlab or clim of python).

Reviewer #2: The authors manuscript with the accompanying, well-documented python repository is a valuable tool for researchers without significant expertise in Bayesian statistics. It would be more helpful to gather better intuition for KL divergence criterion with more details on how to interpret Fig 6. For e.g, an approximate threshold value of threshold KL below which the posteriors have a given probability to represent the same true diffusion constant (and therefore, not representative of the heterogeneous environment). Authors explain the intuition of KL values, but a rule of thumb would be more beneficial to design experiments.

The authors also provide an accessible way of estimating baseline errors in inference using heatmaps in Fig 5. A very important source of sensitivity to parameter inference lies in prior distribution parameters and a brief guide of choosing parameters (a,b) to not introduce bias in analysis (uninformative prior) would be recommended.

Minor typo: In pg 7/18, the likelihood function is written as p(theta | x) instead of p(x | theta).

**********

7. PLOS authors have the option to publish the peer review history of their article (

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,

>>>Reviewer #1: I appreciate the authors' efforts to address my concerns and am for the most part satisfied with their revisions. I have a couple of remaining comments.

Thanks for taking the time to consider our manuscript again and for your helpful feedback.

>>>First, the list of references is rather short and there are places at which the authors discuss standard statistical inference theory without providing appropriate references to the literature. Perhaps a graduate level textbook would be enough. I believe this would be important considering the targeted audience.

We appreciate the suggestion and have added references to a graduate Bayesian statistics text (Gelman et al.’s “Bayesian Data Analysis”) where appropriate.

>>>Second, I still believe the authors overestimate the generality / flexibility of their approach. I understand the computational framework they present could be extended to other scenarios more representative of intracellular fluctuations than a Gaussian process. However, it is not clear that these extensions would be trivial. In fact, they even recognize this point themselves (circa line 370). I appreciate the authors including a section where they benchmark their tool for fractional Brownian motion. I would suggest to temper the statements about generality of the framework.

We have removed the “flexible” descriptor in line 354 and added another statement acknowledging the challenge of framing more complex priors/posteriors in lines 372-374 of the “Framework limitations” section.

>>>Also, please use the same color axis and color bars in figures 5 and 7 to facilitate direct comparison (as in by caxis of Matlab or clim of python).

Thanks for catching this; we’ve updated the rendering of these figures so that each pair of comparable plots (i.e., 5A/7A, 5B/7B, etc.) has identical color axes / color bars.
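One way to get matching color axes in matplotlib is to pass a single shared normalization to every panel. The snippet below is a minimal sketch with placeholder random data, not the plotting code used for the manuscript figures; the array names, panel titles and color map are hypothetical.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
errors_a = rng.uniform(0.0, 0.5, size=(8, 8))  # placeholder data, not manuscript results
errors_b = rng.uniform(0.0, 2.0, size=(8, 8))

# One Normalize instance spanning both arrays gives both panels the same color axis.
norm = Normalize(vmin=min(errors_a.min(), errors_b.min()),
                 vmax=max(errors_a.max(), errors_b.max()))

fig, axes = plt.subplots(1, 2, figsize=(8, 3.5))
im_a = axes[0].imshow(errors_a, norm=norm, cmap="viridis", origin="lower")
im_b = axes[1].imshow(errors_b, norm=norm, cmap="viridis", origin="lower")
axes[0].set_title("Panel A (placeholder)")
axes[1].set_title("Panel B (placeholder)")
# A single color bar then describes both panels.
fig.colorbar(im_b, ax=axes, label="relative error (placeholder)")
```

Sharing the normalization object, rather than setting limits on each image separately, keeps the panels in sync if the limits are changed later.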

>>>Reviewer #2: The authors manuscript with the accompanying, well-documented python repository is a valuable tool for researchers without significant expertise in Bayesian statistics.

We appreciate you taking the time to review our manuscript and accompanying repository.

>>>It would be more helpful to gather better intuition for KL divergence criterion with more details on how to interpret Fig 6. For e.g, an approximate threshold value of threshold KL below which the posteriors have a given probability to represent the same true diffusion constant (and therefore, not representative of the heterogeneous environment). Authors explain the intuition of KL values, but a rule of thumb would be more beneficial to design experiments.

Thanks for this suggestion; we agree that including a benchmark value would increase the usability of this reference table and have included a value and associated brief discussion in lines 332-342.
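To give a feel for how KL values behave, the following minimal sketch evaluates the closed-form KL divergence between two inverse gamma distributions. It is an illustration rather than the repository's implementation: the function names, the direction of the divergence and the example posterior parameters are assumptions, and the benchmark value discussed in the manuscript is not reproduced here.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))


def kl_inverse_gamma(a1, b1, a2, b2):
    """KL(p || q) for p = InvGamma(a1, b1) and q = InvGamma(a2, b2), b = scale.

    Equal to the KL divergence between Gamma(a1, rate=b1) and
    Gamma(a2, rate=b2), because KL is invariant under the map x -> 1/x.
    """
    return ((a1 - a2) * digamma(a1)
            - math.lgamma(a1) + math.lgamma(a2)
            + a2 * (math.log(b1) - math.log(b2))
            + a1 * (b2 - b1) / b1)


# Hypothetical posteriors, e.g. from two trajectories of 100 displacements each.
print(kl_inverse_gamma(51.0, 5.0, 51.0, 5.0))   # identical posteriors
print(kl_inverse_gamma(51.0, 5.0, 51.0, 10.0))  # posterior scales differ twofold
```

Because KL divergence is not symmetric, swapping the two posteriors generally gives a different value, so any threshold should be quoted together with the direction used.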

>>>The authors also provide an accessible way of estimating baseline errors in inference using heatmaps in Fig 5. A very important source of sensitivity to parameter inference lies in prior distribution parameters and a brief guide of choosing parameters (a,b) to not introduce bias in analysis (uninformative prior) would be recommended.

We agree that a discussion of this prior bias and parameter choice strengthens the manuscript and the tool’s usability; we’ve included a brief discussion of this in lines 191-208.
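The influence of (a, b) can be explored with the standard conjugate update for an inverse gamma prior. The following is a minimal sketch under simplifying assumptions (one-dimensional displacements, pure diffusion, no localization error), not the manuscript's full model; the function names and all numerical values are hypothetical. For n displacements with time lag tau, the prior InvGamma(a, b) updates to InvGamma(a + n/2, b + sum(dx^2)/(4*tau)).

```python
import numpy as np


def posterior_params(displacements, tau, a, b):
    """Conjugate update of an InvGamma(a, b) prior on D from 1D displacements."""
    dx = np.asarray(displacements, dtype=float)
    return a + dx.size / 2.0, b + np.sum(dx ** 2) / (4.0 * tau)


def posterior_mean(a_post, b_post):
    """Mean of InvGamma(a_post, b_post); defined only for a_post > 1."""
    return b_post / (a_post - 1.0)


rng = np.random.default_rng(1)
d_true, tau = 0.1, 0.05  # hypothetical values (um^2/s, s)
dx = rng.normal(0.0, np.sqrt(2 * d_true * tau), size=5000)

# Compare a weak prior with two more informative ones as data accumulate.
for a, b in [(0.001, 0.001), (2.0, 1.0), (10.0, 10.0)]:
    for n in (10, 100, 5000):
        a_n, b_n = posterior_params(dx[:n], tau, a, b)
        print(f"prior (a={a}, b={b}), n={n}: "
              f"posterior mean D = {posterior_mean(a_n, b_n):.3f}")
```

In this simplified setting the prior contributes a and b as additive constants, so its influence shrinks as the number of displacements grows, while short trajectories remain sensitive to the choice.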

>>>Minor typo: In pg 7/18, the likelihood function is written as p(theta | x) instead of p(x | theta).

Thanks for catching this typo; we’ve fixed it in the updated manuscript.


A Bayesian framework for the detection of diffusive heterogeneity

PONE-D-19-23195R2

Dear Dr. Cass,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact

With kind regards,

Juan Carlos del Alamo

Academic Editor

PLOS ONE


PONE-D-19-23195R2

A Bayesian framework for the detection of diffusive heterogeneity

Dear Dr. Cass:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact

For any other questions or concerns, please email

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Juan Carlos del Alamo

Academic Editor

PLOS ONE