The author has declared that no competing interests exist.

A two-year-long experimental dataset in which authors of [

The hypothesis of a mind-matter interaction, that is, the possibility that human intention may have an impact on matter at a distance, is regarded by most physicists as highly controversial. It is nonetheless related to von Neumann’s interpretation [

Along those lines, the experiment first proposed by Ibison and Jeffers in [

Ibison and Jeffers reported contradictory and inconclusive results from their pioneering experiments [

In this paper, we independently re-analyze the dataset presented in [

In the interest of reproducible research, the ∼80 GB of raw data are publicly available on the Open Science Framework platform at the address

The outline of the paper is as follows. We briefly recall the experiment’s protocol in Section 2.1, and define the difference in fringe visibility Δ

The apparatus consists of a laser, a double-slit, and a camera recording the interference pattern, and is located in IONS’ laboratory in Petaluma, California. Details are in [

In 2013, the feedback was inversely proportional to a 3-second sliding average of the fringe visibility: the higher the line rose, or the higher the pitch of the tone, the lower the fringe visibility, i.e., the closer the system was to “particle-like” behaviour.

In 2014, due to a coding error, the feedback was inverted: the feedback now increased when the fringe visibility

As controls, a Linux machine connects to the server via the Internet at regular intervals. The server does not know whether it is dealing with a human or a machine: it computes and sends feedback, and records interference data just as it would for a human participant.
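For concreteness, the feedback mapping described above can be sketched as follows. This is an illustrative reconstruction and not the server’s actual code: the function name, the normalization to [0, 1], and the `inverted` flag (modelling the 2014 coding error) are our own assumptions; the 12-sample window follows from the camera’s 4 Hz rate and the 3-second span.

```python
import numpy as np

def feedback_level(visibility, rate_hz=4.0, span_s=3.0, inverted=False):
    """Map a fringe-visibility time series to a feedback level in [0, 1].

    A sliding mean over `span_s` seconds is computed. With the default
    (2013-like) behaviour, the feedback rises as visibility falls;
    `inverted=True` mimics the 2014 coding error, where the feedback
    rose with the visibility instead.
    """
    window = max(1, int(rate_hz * span_s))       # 12 samples at 4 Hz
    kernel = np.ones(window) / window
    sliding = np.convolve(visibility, kernel, mode="same")
    # Normalize the sliding average to [0, 1] for display purposes.
    lo, hi = sliding.min(), sliding.max()
    norm = (sliding - lo) / (hi - lo) if hi > lo else np.zeros_like(sliding)
    return norm if inverted else 1.0 - norm
```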

Each session always starts and finishes with a relaxation epoch. A total of 10 concentration and 11 relaxation epochs are recorded per session, which makes the whole session last about 10 minutes and 30 seconds. Some sessions end before all epochs are completed, due to Internet connection issues, or to participants’ impatience. One possible bias could come from participants’ self-selection: it could be argued that participants with poor results quit the experiment earlier than participants performing well. To avoid this bias, we need to take as many sessions as possible into account. On the other hand, very short sessions do not enable a precise estimation of any measurable difference between the two types of epochs. We decide to keep only sessions containing more than

Given

The camera records at 4 Hz a line of 3000 pixels, an example of which is shown in the figure below, along with the maximum (noted env_{M}) and minimum (noted env_{m}) envelopes of the interference pattern, computed with cubic spline interpolation between local extrema. Local extrema are automatically detected after a Savitzky-Golay filter of order 2 on a 29-pixel moving window that smooths the interference pattern, in order to remove the pixel jitter that appears on some camera frames. We have also tried other smoothing options: Savitzky-Golay filters of the same order with 39- and 49-pixel window lengths, as well as simple moving-average filters with 20- and 30-pixel window lengths, with no significant change in the overall results.

Example of a camera shot of the interference pattern, along with its two spline interpolated envelopes.
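The envelope-extraction pipeline just described (Savitzky-Golay smoothing, local-extrema detection, cubic spline interpolation) can be sketched with SciPy; the function name and return convention are ours, not the authors’ code.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema, savgol_filter

def envelopes(frame, window=29, polyorder=2):
    """Upper/lower envelopes of one 3000-pixel camera frame.

    A Savitzky-Golay filter (order 2, 29-pixel window) removes pixel
    jitter, local extrema are detected on the smoothed frame, and cubic
    splines interpolated between them give env_M (through the maxima)
    and env_m (through the minima), evaluated at every pixel.
    """
    x = np.arange(len(frame))
    smooth = savgol_filter(frame, window_length=window, polyorder=polyorder)
    i_max = argrelextrema(smooth, np.greater)[0]
    i_min = argrelextrema(smooth, np.less)[0]
    env_M = CubicSpline(i_max, smooth[i_max])(x)
    env_m = CubicSpline(i_min, smooth[i_min])(x)
    return env_M, env_m
```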

For a better signal to noise ratio, we consider the 19 middle fringes of the pattern.

Zoom around the 19 middle fringes of the interference pattern, along with its two interpolated envelopes. The fringe visibility as defined in

For each camera frame, we extract one scalar. The choice of this scalar is not straightforward and we will explore different choices throughout the paper. Following the analyses published in [

The red square signal represents the concentration/relaxation epochs.

For each session, we extract a single scalar value: the difference between the median of the fringe visibility during concentration epochs and the median of the fringe visibility during relaxation epochs. The medians are preferred as they are more robust to outliers than the average. Formally, given the fringe visibility time series v of a session, define v^{c} (resp. v^{r}) as the restriction of v to the concentration (resp. relaxation) epochs; the session’s scalar is Δ = median(v^{c}) − median(v^{r}).
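A minimal sketch of this per-session reduction follows; encoding the epochs as one label per sample ('c' for concentration, 'r' for relaxation) is our own convention, not the raw files’ format.

```python
import numpy as np

def session_delta(visibility, epoch_labels):
    """Per-session scalar: median visibility during concentration epochs
    minus median visibility during relaxation epochs.

    `epoch_labels` has the same length as `visibility` and holds 'c' for
    concentration samples and 'r' for relaxation samples (a hypothetical
    encoding chosen for this illustration).
    """
    v = np.asarray(visibility, dtype=float)
    labels = np.asarray(epoch_labels)
    v_c = v[labels == "c"]   # restriction to concentration epochs
    v_r = v[labels == "r"]   # restriction to relaxation epochs
    return float(np.median(v_c) - np.median(v_r))
```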

If the mind-matter interaction hypothesis is false, one would expect the values of Δ to fluctuate around zero. We test this null hypothesis, noted H_{0}, by performing a trimmed mean percentile bootstrap test (following Section 4.4.4 of [, with B = 10^{4} bootstrap samples in our experiments). The statistical procedure is the following:

Generate a bootstrap sample by drawing, uniformly at random and with replacement, as many Δ values as there are sessions.

Trim the bootstrap sample: denoting by n_{q} the integer closest to qN, where q is the trimming intensity and N the number of sessions, remove the n_{q} lowest and n_{q} highest values from the bootstrap sample.

Compute the sample mean of the trimmed bootstrap sample.

Repeating steps 1 to 3 B times yields B trimmed bootstrap means.

Consider p*, the proportion of the B trimmed bootstrap means that are negative; the p-value of the two-sided test is p = 2 min(p*, 1 − p*), and H_{0} is rejected with significance level α whenever p ≤ α.

The procedure outputs:

- the p-value p;

- a normalized shift of the mean.

Note that this normalized shift is only computed for illustration purposes (in order to observe in which direction potential shifts of the mean appear): it is not itself used for statistical inference.
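The trimmed mean percentile bootstrap procedure above may be sketched as follows. This is an illustrative implementation, not the paper’s code: the `seed` argument, the vectorized trimming, and the strict “negative” counting in the percentile p-value are our choices.

```python
import numpy as np

def percentile_bootstrap_test(deltas, q=0.1, B=10_000, seed=0):
    """Two-sided trimmed-mean percentile bootstrap test of H0: mean = 0.

    `deltas` holds one value of the statistic per session, `q` is the
    trimming intensity, and B = 10^4 bootstrap samples as in the text.
    Returns the p-value p = 2 * min(p*, 1 - p*).
    """
    rng = np.random.default_rng(seed)
    deltas = np.asarray(deltas, dtype=float)
    n = len(deltas)
    boot = rng.choice(deltas, size=(B, n), replace=True)   # step 1
    boot.sort(axis=1)
    n_q = int(round(q * n))                                # step 2: trim
    trimmed = boot[:, n_q:n - n_q]
    means = trimmed.mean(axis=1)                           # step 3
    p_star = np.mean(means < 0)                            # step 4
    return 2 * min(p_star, 1 - p_star)
```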

A time lag

The null hypothesis we are testing is therefore H_{0}: the mean of Δ is zero for every time lag l. Each lag yields a p-value p_{l}. We then apply the Holm-Bonferroni method [ to correct for multiple comparisons when testing H_{0}. To this end, write p_{(1)} ≤ p_{(2)} ≤ … ≤ p_{(m)} the values of {p_{l}} sorted in ascending order. The overall p-value is then min(1, m p_{(1)}), the smallest Holm-adjusted p-value.
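The Holm-Bonferroni step-down correction over the m tested time lags can be sketched as follows; returning the adjusted p-values alongside the rejection mask is a convenience of this sketch.

```python
import numpy as np

def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down correction.

    Sorts the p-values p_(1) <= ... <= p_(m), compares p_(k) to
    alpha / (m - k + 1), and stops at the first failure. Returns a
    boolean rejection mask (in the original order) and the
    Holm-adjusted p-values.
    """
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    adjusted = np.empty(m)
    reject = np.zeros(m, dtype=bool)
    running_max = 0.0
    still_rejecting = True
    for k, idx in enumerate(order):       # k is 0-based here
        adj = (m - k) * p[idx]            # (m - k + 1) * p_(k), 1-based
        running_max = max(running_max, min(adj, 1.0))
        adjusted[idx] = running_max       # enforce monotone adjustment
        if still_rejecting and p[idx] <= alpha / (m - k):
            reject[idx] = True
        else:
            still_rejecting = False
    return reject, adjusted
```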

The normalized shift and p_{l} are plotted versus the time lag for each type of session. The overall p-value of H_{0} for the human ’13 sessions (resp. control ’13, human ’14, control ’14) does not allow us to reject H_{0}. We also observe a shift towards negative values for the 2013 human sessions, even though in a much less significant manner than in [

Normalized shift and

We now propose to make a very different choice in the analysis of this data than the one originally proposed. The authors in [

Another fundamental difference between our analysis and the one proposed in [

Note that, for the sake of completeness, we will later show (in Fig 12 with the discussion in Section 3) the results obtained by aggregating both years’ data after sign inversion and/or supposing prior knowledge of the time lag. For now, however, we keep both years’ data separate, and test against several time lags.

In the next four sections (Section 2.5 to Section 2.8), we look at the robustness of the results regarding all the seemingly arbitrary decisions we made at every step of this pre-analysis, namely: the fringe number to consider (we chose fringe 9), the trimming intensity

Fringe number 9 is an arbitrary choice and it is necessary to look at other fringes. ^{−1}). The big surprise comes from the 2013 control sessions that show a significant (

Normalized shift and

To look at all fringes at once,

for the human and control sessions of each year as a function of the fringe number. Results are shown for

To go further, and in order to prevent us from choosing the fringe number(s) that serve one hypothesis or the other, we propose two strategies that both take into account information from all fringes.

We propose to investigate a new null hypothesis encompassing all fringes: ^{−1}, 5 × 10^{−1}, 1).

Normalized shifts of each of the 494 individual tests versus the time lag and the fringe number for all four different types of sessions. Results are shown for

The variability observed in

Given this new definition of fringe visibility, we test the null hypothesis:

Normalized shifts of each of the 260 individual tests versus the time lag and

We first observed that results are not robust with respect to the choice of fringe number one studies. To avoid choosing a fringe number, we i/ performed a test whose null hypothesis encompasses all fringe numbers, ii/ performed a test on the average of the fringe visibility over central fringes. Both analyses show the following:

the 2013 human sessions shift towards negative Δ

the 2014 human sessions shift towards positive Δ

the 2013 control sessions shift towards positive Δ

the 2014 control sessions do not show a clear and consistent shift;

all these shifts are however deemed insignificant (^{−2}) after correcting for multiple testing.

We now investigate whether these results are robust to i/ the trimming intensity

Corrected for multiple comparisons

We recall that

Corrected for multiple comparisons

Until now we have been using the normalized difference between the interpolated envelopes as the definition of the fringe visibility. Another possibility is to define the visibility of fringe n directly from its local maximum M_{n} and its preceding local minimum m_{n}:

v_{n} = (M_{n} − m_{n}) / (M_{n} + m_{n}).

(top) Normalized shifts and p_{l} versus the time lag for fringe number 9 (with

For a fringe number n, there is no reason to define its visibility by comparing its local maximum M_{n} to its preceding local minimum m_{n} rather than its succeeding local minimum m_{n+1}. If one defines

One concludes that the results as summarized at the end of Section 2.5 are robust with respect to the fringe visibility estimation method.
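The two extremum-based visibility estimates discussed here (preceding versus succeeding minimum), together with their symmetrized average, can be sketched as follows; the array layout, with one extra trailing minimum so that every maximum has a neighbour on each side, is an assumption of this illustration.

```python
import numpy as np

def fringe_visibility(maxima, minima):
    """Per-fringe visibility estimates from local extrema values.

    `maxima[n]` is the n-th local maximum M_n; `minima[n]` is the local
    minimum m_n preceding it, so `minima` carries one extra trailing
    entry m_{n+1} for the last fringe (hypothetical layout).
    Returns the preceding-minimum, succeeding-minimum, and averaged
    visibility estimates.
    """
    M = np.asarray(maxima, dtype=float)
    m_prev = np.asarray(minima, dtype=float)[:-1]
    m_next = np.asarray(minima, dtype=float)[1:]
    v_prev = (M - m_prev) / (M + m_prev)   # compare to preceding minimum
    v_next = (M - m_next) / (M + m_next)   # compare to succeeding minimum
    return v_prev, v_next, 0.5 * (v_prev + v_next)
```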

The preliminary analysis proposed in Section 2.4 is subject to four seemingly arbitrary choices: the fringe number, the minimal length of a session, the trimming intensity

the 2013 human sessions shift towards negative Δ

the 2014 human sessions shift towards positive Δ

the 2013 control sessions shift towards positive Δ

the 2014 control sessions do not show a clear and consistent shift;

all these shifts are deemed insignificant (^{−2}) after correcting for multiple testing.

We show that these results are robust with respect to the intensity

^{−8}), with the direction of the deviation conforming to the observers’ intentions.” Such a small

In this paper, we corrected point i/ and argued that points ii/ and iii/ were not solid choices from our statistical re-analysis point of view, preferring a more conservative standpoint: analyzing both years separately and testing 26 different time lags before correcting for multiple comparisons, both choices necessarily inducing a lower statistical power. For completeness, we show in Fig 12 the results obtained under the less conservative scenarios; the corresponding p-values are of the order of 10^{−3}: they cannot be interpreted as strong evidence of mind-matter interaction, but may motivate further replication attempts. These additional results seem to point out that the erroneous statistical test used in [ led to a considerable over-estimation of the significance (∼10^{−8} instead of the ∼10^{−3} that we find here), which further led the authors to erroneous conclusions.

(top) Scenario 1: a time lag of 9 seconds is chosen from the start, and both years are analyzed separately. (middle) Scenario 2: 26 different time lags are tested and then corrected for multiple comparisons, and the data from both years are combined after sign inversion for 2014. (bottom) Scenario 3: a time lag of 9 seconds is chosen from the start, and the data from both years are combined after sign inversion for 2014.

Before we conclude, let us make an important point: we have performed many statistical tests, and to prevent

The thorough analysis pursued in this paper contradicts the results previously published in [