The authors have declared that no competing interests exist.

Current address: Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America

For women with access to healthcare and early detection, breast cancer deaths are caused primarily by metastasis rather than growth of the primary tumor. Metastasis has been difficult to study because it happens deep in the body, occurs over years, and involves a small fraction of cells from the primary tumor. Furthermore, within-tumor heterogeneity relevant to metastasis can also lead to therapy failures and is obscured by studies of bulk tissue. Here we exploit heterogeneity to identify molecular mechanisms of metastasis. We use “organoids”, groups of hundreds of tumor cells taken from a patient and grown in the lab, to probe tumor heterogeneity, with potentially thousands of organoids generated from a single tumor. We show that organoids have the character of biological replicates: within-tumor and between-tumor variation are of similar magnitude. We develop new methods based on population genetics and variance components models to build between-tumor and within-tumor statistical tests, using organoids analogously to large sibships and vastly amplifying the test power. We show great efficiency for tests based on the organoids with the most extreme phenotypes and potential cost savings from pooled tests of the extreme tails, with organoids generated from hundreds of tumors having power predicted to be similar to bulk tests of hundreds of thousands of tumors. We apply these methods to an association test for molecular correlates of invasion, using a novel quantitative invasion phenotype calculated as the spectral power of the organoid boundary. These new approaches combine to show a strong association between invasion and protein expression of Keratin 14, a known biomarker for poor prognosis, with ^{−45} for within-tumor tests of individual organoids and ^{−6} for pooled tests of extreme tails. Future studies using these methods could lead to discoveries of new classes of cancer targets and development of corresponding therapeutics. 
All data and methods are available under an open source license at

For women with access to healthcare and early detection, breast cancer deaths are caused primarily by metastasis rather than growth of the primary tumor. Metastasis has been difficult to study because it happens deep in the body, occurs over years, and involves a small fraction of cells from the primary tumor. Furthermore, individual cells within a tumor can behave very differently, leading to failures of therapies. Here we exploit heterogeneity to develop new methods to identify molecular mechanisms of metastasis. We use “organoids”, groups of hundreds of tumor cells taken from a patient and grown in the lab. Thousands of organoids can be generated from a single tumor sample to probe different regions and amplify the amount of information provided. Organoids provide information about metastasis because they vary in their ability to invade the growth medium. We introduce a new phenotype for invasion obtained by converting the boundary of an organoid into a frequency spectrum, then summing the power across all frequencies. We analyze this metastasis-related phenotype by adapting methods from population genetics that compare the most extreme siblings in a family. We analogously compare the most invasive vs. least invasive organoids from each tumor. Power calculations suggest that studies of 50–100 individuals, with 100–1000 organoids generated from each, could reveal DNA mutations and aberrant gene expression associated with invasion. We validate this approach by demonstrating a statistically significant association between invasion and protein expression of Keratin 14, a known biomarker for poor prognosis. Future studies using these methods could lead to discoveries of new classes of cancer targets and development of corresponding therapeutics.

Metastasis, rather than growth of the primary tumor, is the major cause of breast cancer mortality in developed nations [

Despite its importance in driving mortality, metastasis remains difficult to target clinically. Most therapies aim to suppress proliferation or eliminate proliferating cells rather than slow or halt specific stages of the metastatic process, such as invasion, dissemination, and seeding of secondary micro-metastases. These processes remain challenging to study because they occur deep in the body, can take years or decades to develop, and may arise in part from genetic and genomic heterogeneity of tumor cells [

Organotypic cell culture has provided powerful new systems for studying metastasis [

Heterogeneity exists both between tumors and within tumors. Measurements of the tumor bulk can obscure within-tumor heterogeneity, losing the ability to characterize smaller cell populations responsible for metastatic phenotypes or therapy resistance. In our context, however, heterogeneity can be exploited by probing variation in phenotype, genotype, and gene expression across different organoids generated from a single tumor. We propose that between-tumor and within-tumor variation are analogous to between-family and within-family variation in population genetics, with organoids generated from a single tumor corresponding to siblings within a large family.

We use this insight to quantitatively analyze organoid behavior with methods from population genetics. Within-tumor methods are particularly powerful because thousands of organoids can be generated from a single tumor, their shared genetic and environmental background can be subtracted as a common baseline, and pooled tests of the most extreme organoids are very efficient. As with genome-wide association studies, genetic and genomic associations identified by organoid population genetics could then be validated experimentally by directed perturbations. This combination of population-based organoid studies of invasion phenotypes and genetic/genomic perturbations could lead to new classes of targets for metastatic breast cancer and other invasive cancers.

Statistical genetics methods benefit from having quantitative rather than qualitative phenotypes. Invasion and other metastasis-related phenotypes have generally been qualitative, however, based on visual sorting into categories often denoted ‘+’, ‘++’, ‘+++’. Reproducible methods for automated scoring, similar to the quantitative growth rates used for proliferation phenotypes, have additional value in permitting more systematic, genome-scale studies that complement detailed phenotyping of individual genetic or chemical perturbations.

Fractal dimension calculations can provide features for image classification [

Here we define a quantitative phenotype for the invasive behavior of organoids generated from human breast tumor tissue. The mathematical technique is motivated by methods developed for analysis of biological shape using spectral transforms of a boundary [

We provide a formal description of the mathematical procedure, including normalization for organoid size, smoothing of possible pixelation artifacts, and recognizing high-curvature invasive boundaries. Organoids generated from distinct tumor samples are analyzed using these methods. A variance components model, the standard framework for quantitative traits in statistical genetics [

We conduct such an analysis for Keratin 14, using immunofluorescence to characterize protein expression within each organoid for correlation tests with invasion. We present results for tests based on all organoids, on the most and least invasive organoids from each tumor, and on pools of the extreme tails. These tests are all significant. We conclude with implications for population-based studies of cancer metastasis.

The use of human tumor specimens was approved by JHM-IRB X as study number NA_00077976, “Molecular Regulation of Breast Cancer Invasion”. The IRB determined that the use of de-identified tumor specimens is not human subjects research for which IRB review is required.

A total of 60 human breast tumor specimens were obtained as surgical samples from the Cooperative Human Tissue Network in accordance with a Johns Hopkins School of Medicine IRB acknowledged exempt study design. Basic demographic information (age, ethnicity, sex), entry date, and type of sample (primary tumor, recurrence) were available for every specimen. A total of 823 organoids were generated from 52 of the specimens. Most statistical analyses were limited to 47 specimens with 5 or more organoids, corresponding to 811 organoids. While all samples were consented for organoid generation and analysis, consent for continued access to medical records was not consistently requested; consequently, this study was not designed for analysis of outcomes. For more detail about the clinical cohort, see

Breast tumor specimens were processed individually to generate organoids, with approximately 300–500 cells per organoid, according to published protocols [

Organoids in DIC images were manually traced using I_{v}, _{v}} for

Boundary coordinates were analyzed using custom software, I_{−1}, _{−1}) ≡ (_{V−1}, _{V−1}). Next, _{j}, _{j}) for integer
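The interpolation step can be illustrated with a minimal sketch that resamples a manually traced closed boundary to a fixed number of points equally spaced in arc length. Linear interpolation between traced vertices and the function name `resample_closed_boundary` are assumptions made for illustration; the interpolation scheme of the original software is not recoverable from the text here.

```python
import math

def resample_closed_boundary(points, n_out=256):
    """Return n_out points equally spaced in arc length along a closed polygon.

    ASSUMPTION: linear interpolation between consecutive traced vertices.
    """
    closed = list(points) + [points[0]]  # repeat the first vertex to close the curve
    seg_lengths = [math.dist(closed[i], closed[i + 1]) for i in range(len(points))]
    step = sum(seg_lengths) / n_out      # arc length between consecutive output points
    resampled = []
    seg = 0           # index of the segment that contains the current target
    seg_start = 0.0   # arc length at the start of that segment
    for j in range(n_out):
        target = j * step
        # Advance until the current segment contains the target arc length.
        while seg < len(seg_lengths) - 1 and seg_start + seg_lengths[seg] <= target:
            seg_start += seg_lengths[seg]
            seg += 1
        length = seg_lengths[seg]
        frac = (target - seg_start) / length if length > 0 else 0.0
        (x0, y0), (x1, y1) = closed[seg], closed[seg + 1]
        resampled.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return resampled
```

Resampling to a common number of points removes the dependence on how densely each organoid was traced, so that spectra from different organoids share the same frequency grid.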

Areas for organoids were calculated using the shoelace formula applied to the interpolation points. An effective diameter was calculated as
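As a minimal sketch of the area calculation, the shoelace formula can be applied directly to the ordered boundary points. The effective diameter below is an assumption: it is taken to be the diameter of the circle with the same area, which may differ from the definition used in the original analysis.

```python
import math

def shoelace_area(points):
    """Area enclosed by a closed polygon given as ordered (x, y) vertices."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1] - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0  # abs() makes the result independent of orientation

def effective_diameter(area):
    """ASSUMPTION: diameter of the circle whose area equals the organoid area."""
    return 2.0 * math.sqrt(area / math.pi)
```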

We then used Python _{j}, _{j}} to obtain _{k}, _{k}) and (_{−k}, _{−k}) are complex conjugates, only the

The spectral power _{k} at frequency _{k} is invariant to rotation in the imaging plane. The term _{0}, representing the boundary centroid, is discarded to make the power invariant to translation in the imaging plane. Thus the total spectral power

We then normalized the spectral power by the power of the first mode, _{1}, to make the sum scale-invariant, introduced a smoothing transformation to correct for pixelation, and introduced a derivative transformation that represents curvature (see ^{2}(^{2}(^{2}(2^{2} (see
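The core of the spectral phenotype can be sketched as follows, under stated assumptions. The x and y boundary coordinates are transformed separately, the power at each frequency is taken as the sum of the squared magnitudes of the two transforms, the zero-frequency (centroid) term is dropped, and the remaining modes are normalized by the first mode. The smoothing and curvature transformations are omitted because their exact form cannot be recovered from the text here. The choice to sum only modes above the first and the function names are assumptions made for illustration.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform with 1/n normalization (O(n^2), adequate for n ~ 256)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(values)) / n
        for k in range(n)
    ]

def mode_powers(points):
    """Power |X_k|^2 + |Y_k|^2 for k = 1 .. n//2; the k = 0 (centroid) term is dropped."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    fx, fy = dft(xs), dft(ys)
    return [abs(fx[k]) ** 2 + abs(fy[k]) ** 2 for k in range(1, len(points) // 2 + 1)]

def normalized_excess_power(points):
    """Sum of mode powers above the first mode, each divided by the first-mode power.

    ASSUMPTION: smoothing and curvature weighting are not applied in this sketch.
    """
    powers = mode_powers(points)
    return sum(p / powers[0] for p in powers[1:])
```

Under these assumptions a circle has essentially zero excess power, while a lobed boundary has positive excess power that is unchanged by rotating, translating, or rescaling the boundary.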

The full data set _{ti}}, with index _{t}} denoting an individual organoid generated from tumor _{t} denoting the number of organoids generated from tumor

The invasiveness of an organoid is modeled as a random variable from a particular probability distribution. Model selection corresponds to identifying the form of a probability distribution that could have generated the observed data; it also usually involves estimating the values of parameters required by the distribution. We consider three possible models describing the variation of organoid invasiveness within and between tumors. The null model, _{0}, assumes an equal mean, _{0}, and variance, _{1}, incorporates an independent mean, _{t}, for each tumor, but retains a shared within-group variance _{2}, incorporates both a tumor-dependent mean, _{t}, and a tumor-dependent variance,

Bayesian model selection is used to select the most likely model. The posterior probability of a model _{M},
_{θ} is the number of observations used to obtain
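The comparison of the three models can be sketched with a Bayesian information criterion (BIC) approximation to the posterior model probabilities under equal priors. This is an assumption made for illustration: the exact expression for the posterior used in the analysis is not recoverable from the text here, and the function names are hypothetical. Each tumor is assumed to contribute at least two distinct values, so that the per-tumor variance in the third model is positive.

```python
import math

def max_loglik_normal(n, var):
    """Normal log-likelihood of n observations evaluated at the ML mean(s) and variance."""
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def model_posteriors(groups):
    """Approximate posterior probabilities of three models (equal priors, BIC).

    Model 0: one mean, one variance. Model 1: per-tumor mean, shared variance.
    Model 2: per-tumor mean and per-tumor variance.
    ASSUMPTION: BIC stands in for the marginal likelihood.
    """
    values = [v for g in groups for v in g]
    n_total, n_groups = len(values), len(groups)
    grand_mean = sum(values) / n_total
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_by_group = []
    for g in groups:
        m = sum(g) / len(g)
        ss_by_group.append(sum((v - m) ** 2 for v in g))
    loglik = [
        max_loglik_normal(n_total, ss_total / n_total),
        max_loglik_normal(n_total, sum(ss_by_group) / n_total),
        sum(max_loglik_normal(len(g), ss / len(g)) for g, ss in zip(groups, ss_by_group)),
    ]
    n_params = [2, n_groups + 1, 2 * n_groups]
    bic = [-2 * ll + k * math.log(n_total) for ll, k in zip(loglik, n_params)]
    best = min(bic)  # subtract the best BIC before exponentiating to avoid underflow
    weights = [math.exp(-0.5 * (b - best)) for b in bic]
    total = sum(weights)
    return [w / total for w in weights]
```

The penalty term grows with the number of fitted parameters, so the model with per-tumor variances is preferred only when the data show clear differences in spread between tumors.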

An important additional consideration for model selection is the scale of the invasiveness data. Many biological processes involve multiplicative noise, generating a log-normal distribution (the exponential of a normally distributed random variable) rather than a normal distribution arising from additive noise. Applying a logarithmic transform, as is usually done when comparing gene or protein expression levels, usually works well to recover normally-distributed data. This is valuable because statistical models and hypothesis tests often assume normally distributed data, and in many cases further assume that variance can be modeled as a single parameter independent of group.

To assess the arithmetic scale and the logarithmic scale via model selection, the data _{ti}}, with _{ti} ≡ _{ti} for the arithmetic scale and _{ti} ≡ log_{10} _{ti} for the logarithmic scale. The probabilities of the observed data for the three models are

Model selection based on normal distributions can be sensitive to noise in the data, particularly when group sizes are small. Bootstrap replicates were used to increase the robustness of estimates from small populations [

An additional procedure was used to guard against the difficulty of estimating the within-tumor variance for tumors with small numbers of measured organoids. We performed model selection again, but this time restricted the tumors to those having at least two organoids. A new series of 10,000 bootstrap replicates was generated to obtain converged estimates for this sample. The same procedure was then performed requiring at least 3 organoids, at least 4 organoids, and so on up to at least 10 organoids.

Variance components models describe the distributions of random variables in a structured population where subgroups have shifted means but share a common variance. These models also provide a framework for hypothesis testing and provide unbiased estimates of variances. In standard usage, populations refer to individuals, and the structure arises from families within the population. Here, the population refers to individual organoids, and the structure arises because subsets of organoids correspond to a single tumor. After model selection as described above, the observation of organoid _{ti}. The variance components model considers nested hypotheses for _{ti}, expressed in terms of hypotheses _{0} and _{1} equivalent to models _{0} and _{1} above:

The ANOVA test statistic, _{1}, is
_{1} = _{2} = … = _{T}, _{1} is a random variable following an

The phenotypes _{ti} permit discovery of associations with biological or experimental factors, including gene expression levels, genetic variants, or culture conditions. These biological factors are denoted _{ti} for factor _{t} is incorporated as a random effect and the associated factor as a fixed effect, leading to a mixed effect model:
_{B} for the between-tumor test and _{W} for the within-tumor tests.

The relationship between the type I and II error rates, the fraction of variance explained _{I} is the normal quantile corresponding to the desired false-positive rate, and _{II} is the normal quantile corresponding to the desired false-negative rate (see _{II} and hence the power are usually different.

While measuring the invasiveness of each organoid is feasible, performing genomic analysis of each organoid could be prohibitively expensive. An alternative approach is to pool the most invasive and least invasive organoids, and then to perform genomic analysis of the pooled upper and lower tails. We have shown that power is optimized by selecting the upper and lower 27%, with efficiency equivalent to individual measurements of a population 80% as large; the selection threshold and efficiency are robust to sibship size, effect size, and allele frequency in the context of genetic studies [

Assuming pooled tests with _{P} representing pooling efficiency, the relationship between power and variance explained is

We use these power relationships to calculate the critical effect size required to achieve specified Type I and Type II error given an experimental design:
_{B} and _{W} should be identical, the true correlations _{B} and _{W} may be very different. Between-tumor variation in both the invasiveness {_{ti}} and the features {_{ti}} often weakens the between-tumor correlation, yielding _{B} < _{W}. Furthermore, with 100–1000 organoids generated per tumor,

Linear models were used to perform between-tumor and within-tumor tests for correlation of invasion with protein expression of K14. For these statistical tests, we used tumors with at least 5 organoids, restricting analysis to 47 tumors and 811 organoids. Defining _{ti} as before as the log_{10}-transformed spectral power for organoid _{ti} in turn as the rank-transformed total K14 and mean K14 of each organoid. Tumor means _{ti} and _{ti} were calculated. The between-tumor test used a linear model with an intercept, while the within-tumor test used a linear model without an intercept. We performed similar tests for rank-transformed organoid area.
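The separation into between-tumor and within-tumor tests can be sketched as follows. Tumor means are regressed on each other with an intercept, while deviations of each organoid from its tumor mean are regressed through the origin, since the deviations have zero mean within every tumor by construction. The function name and the returned quantities are assumptions made for illustration; in the analysis described above the K14 values would be rank-transformed before this step.

```python
import math

def between_within_slopes(y_groups, x_groups):
    """y_groups[t][i], x_groups[t][i]: phenotype and factor for organoid i of tumor t.

    Returns (between-tumor slope, within-tumor slope, within-tumor correlation).
    """
    y_means = [sum(g) / len(g) for g in y_groups]
    x_means = [sum(g) / len(g) for g in x_groups]

    # Between-tumor test: linear model of tumor means, with an intercept.
    xb = sum(x_means) / len(x_means)
    yb = sum(y_means) / len(y_means)
    slope_between = (
        sum((x - xb) * (y - yb) for x, y in zip(x_means, y_means))
        / sum((x - xb) ** 2 for x in x_means)
    )

    # Within-tumor test: deviations from the tumor mean, no intercept.
    dx = [x - m for g, m in zip(x_groups, x_means) for x in g]
    dy = [y - m for g, m in zip(y_groups, y_means) for y in g]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return slope_between, sxy / sxx, sxy / math.sqrt(sxx * syy)
```

Because the tumor mean is subtracted from every organoid, the shared genetic and environmental background of a tumor does not contribute to the within-tumor slope, and the two slopes can differ in sign as well as magnitude.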

For analysis restricted to extreme tails, the organoids with greatest spectral power (upper tail) and least spectral power (lower tail) corresponding to fraction _{ti} for organoids within the upper and lower tails. These were entered into a paired _{ti} rather than _{ti} would give identical results.
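A pooled test of the extreme tails can be sketched as a paired comparison across tumors. For each tumor, organoids are ranked by invasiveness, the factor is averaged over the upper and lower tails, and the per-tumor differences are combined into a paired t statistic. The default tail fraction of 0.27, the use of `math.floor` to set the tail size, and the function name are assumptions made for illustration; the p-value would then be read from a t distribution with the returned degrees of freedom.

```python
import math

def pooled_tail_paired_t(y_groups, x_groups, tail_fraction=0.27):
    """Paired t statistic comparing the mean factor x in the upper vs. lower tails of y.

    Assumes every tumor has at least 2 organoids and tail_fraction <= 0.5,
    so that the two tails never overlap. Requires at least 2 tumors.
    """
    diffs = []
    for ys, xs in zip(y_groups, x_groups):
        n_tail = max(1, math.floor(tail_fraction * len(ys)))
        order = sorted(range(len(ys)), key=lambda i: ys[i])  # least to most invasive
        lower, upper = order[:n_tail], order[-n_tail:]
        mean_upper = sum(xs[i] for i in upper) / n_tail
        mean_lower = sum(xs[i] for i in lower) / n_tail
        diffs.append(mean_upper - mean_lower)
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_d / math.sqrt(var_d / n)
    return t_stat, n - 1  # statistic and degrees of freedom
```

Averaging within each tail mimics a physical pool: only two measurements per tumor are needed, regardless of how many organoids were phenotyped.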

Tissue samples from breast cancer tumors were acquired from the Cooperative Human Tissue Network. Organoids were generated from 52 specimens, each from a different individual. Organoids were cultured in a three-dimensional collagen I matrix that, in previous work, has been shown to promote invasion [

The method used to define a quantitative phenotype for invasion is outlined for an organoid that is highly invasive (panels A, B, C), moderately invasive (panels D,E,F), and weakly invasive (panels G,H,I). These three organoids were selected from the 43 organoids generated from tumor 10, illustrating heterogeneity within a single tumor sample. Differential interference contrast (DIC) microscopy was used for image acquisition, with a scale of approximately 0.5

The number of boundary points from manual segmentation was variable (_{ti} for organoid

(A) Distribution of the number of manual segmentation boundary points per organoid. (B) Distribution of organoid effective diameter, defined as

This process is illustrated for 3 of the 43 organoids generated from tumor sample 10 (

The individual spectral components provide invasive behavior fingerprints. Lower-order components generally reflect overall lengthening of an organoid along one or more axes, and higher-order components arise from more convoluted boundaries. While spectral components are not considered individually here, they could be used to cluster similar invasive patterns that may correspond to distinct invasion mechanisms.

Organoids for all 52 tumor samples were characterized using this spectral method, with a resulting ordering that generally agrees with visual impressions of invasiveness (

Organoid boundaries are shown for 823 organoids generated from 52 breast tumors and imaged after six days of growth in 3D culture. Each column corresponds to organoids from a single tumor, denoted by an identifier underneath the column. Organoid boundaries were converted to a quantitative spectral power phenotype, represented by a false color map from blue (non-invasive) to red (highly invasive). For each tumor, organoids are stacked from less invasive to more invasive as characterized by the spectral power. Tumors are then arranged from left to right based on the median organoid invasiveness. Differences in numbers of organoids per tumor are from constraints on experimental capacity rather than biological differences between tumors. Heterogeneity is observed on both the horizontal axis (between-tumor variation) and the vertical axis (within-tumor variation).

These data indicate that tumors differ systematically in their ability to generate invasive organoids. Similarly, organoids generated from different cells within a tumor show different abilities to invade. Finally, the spectral power method is robust even for highly invasive organoids whose boundaries are partially truncated by the field of view, for example the top boundaries of the two most invasive organoids from Tumor 49 (^{th} from the left). The spectral power nevertheless properly characterizes these organoids as invasive.

Bayesian model selection was used to guide the choice of scale, arithmetic versus logarithmic for the invasion spectral power, and the choice of statistical model describing within-tumor and between-tumor heterogeneity. Invasion heterogeneity has qualitatively different appearance on an arithmetic versus logarithmic scale (

Each boxplot represents the distribution of invasion scores for organoids generated from a single tumor, with tumors ordered from left to right by median organoid invasiveness. The boxplot for each tumor indicates the median value (red bar), lower and upper quartile values (box extent), and outliers as individual points. (A) Distributions generated using invasion on an arithmetic scale are asymmetric, with the median closer to the first quartile and a larger upper tail. The interquartile range increases substantially with the median invasiveness. (B) Distributions generated using invasion on a logarithmic scale are more symmetric, with the median approximately halfway between the first and third quartile. The interquartile range increases less with the median.

We used Bayesian statistics to provide a quantitative analysis of suitability of the arithmetic versus logarithmic scale for statistical modeling. For each scale, we considered three generative models for invasiveness within and between tumors, giving each model an equal prior of 1/3 (Eqs

For invasion on an arithmetic scale, Model 2 is selected, with less than 1 × 10^{−50} probability assigned to Model 0 or Model 1. The assignment of all probability to Model 2 is a quantitative reflection of the qualitative observations of distribution skew and increasing variance with increasing median invasiveness (

In contrast, for the logarithmic scale, Model 1 and Model 2 both have appreciable probability, indicating the suitability of statistical models that assume equal variance for organoid invasiveness within each tumor (

(A) Bootstrap replicates were used to increase robustness to the limited number of tumors and organoids per tumor. Bootstraps were conducted for all 52 tumors and then, with increasing stringency, for tumors generating at least 2 through 10 organoids, with 30 tumors meeting the final requirement (solid line). The average number of organoids per tumor increased from 15.8 to 22.9 for these replicates (dashed line). (B) Three generative models were considered for between-tumor and within-tumor variation in logarithmic-scale organoid invasiveness: Model 0 assumes a single mean and variance shared by all tumors; Model 1 assumes a shared variance, but assigns each tumor its own mean; Model 2 assigns each tumor its own mean and variance. Converged estimates were obtained from 10,000 bootstrap replicates for thresholds of 1 organoid per tumor up to 10 organoids per tumor. For these thresholds, the posterior probability is 55–65% for Model 1 (green bars), with the remaining probability assigned to Model 2 (blue bars). Model 0 (red bars) had vanishing probability, not visible on this scale.

We further increased the robustness of estimates for the logarithmic scale by using 10,000 bootstrap replicates. These calculations suggest a posterior probability of 55-65% for Model 1, the remaining probability assigned to Model 2, and vanishing probability for Model 0 (

We conclude from this analysis that the log-transformed spectral power is compatible with a normal mixture model, in which each tumor has an individual mean and all tumors share a single variance. This model is the standard statistical framework for analyzing quantitative traits in population genetics and is described by conventional parametric statistics. The spectral measure on its original arithmetic scale is less suitable because of skew and heteroscedasticity. The log-transform in this context is similar to log-transforms used for gene expression and other quantitative characters that are better described by log-normal distributions than by normal distributions, possibly reflecting multiplicative rather than additive noise.

The previous results indicate that the standard framework for analyzing genetic and phenotypic variation in a structured population, a variance components model, is appropriate for analysis of variation of invasion for organoids generated from tumors. Each tumor is analogous to a family in a population-based study, and each organoid is analogous to a sibling within the family. For organoid _{ti} is modeled as a tumor mean _{t} plus a deviation _{ti}, with

This structure is essentially identical to the structure of an ANOVA model testing the hypothesis that all _{t} values are identical; under the null hypothesis, the test statistic follows an _{T−1, N−K} distribution for _{51,771}, we find ANOVA test statistic ^{−39}. The strong significance of this hypothesis test is the frequentist equivalent of the Bayesian model selection assigning vanishing probability to the null model.
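The one-way ANOVA statistic for the hypothesis that all tumor means are equal can be computed directly from the grouped data. The following is a minimal sketch in which the function name is an assumption; the p-value would be obtained from an F distribution with the returned degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over groups of values."""
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    grand_mean = sum(v for g in groups for v in g) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)
    df_between = n_groups - 1
    df_within = n_total - n_groups
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With 52 tumors and 823 organoids, the degrees of freedom are 52 − 1 = 51 and 823 − 52 = 771, matching the values quoted for the test.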

The ANOVA model provides an estimate of the variance components from the between-tumor heterogeneity,

Component | Value | Fraction of total |
---|---|---|
Total variance | 0.2344 | 1.000 |
Between-tumor variance | 0.0652 | 0.278 |
Within-tumor variance | 0.1692 | 0.722 |

Variance components were calculated on a log_{10} scale from 823 organoids generated from 52 tumors.

In population genetics, heterogeneity within families often provides a powerful substrate for identifying genetic or genomic factors that drive phenotypes. Within-tumor heterogeneity is problematic in the clinical setting, as region-specific and cell-specific differences in resistance to chemotherapeutics make it more challenging to eliminate the entire tumor with a given drug regimen [

We envision statistical tests that model the dependence of the invasiveness, denoted _{t} of tumor _{t} is
_{0} is the population-level mean invasiveness and _{0} is the population-level biological factor. The null hypothesis ^{2} random variable with 1 degree of freedom. Under the alternative, ^{2} follows a non-central ^{2} distribution with non-centrality parameter (^{2}/(1 − ^{2}) for

A compact equation connects the test statistic ^{2}, the effect size ^{2}, the two-tailed type I error _{I} with quantile _{I} defined by Φ(_{I}) = 1 − _{I}/2 for cumulative normal distribution Φ, and quantile _{II} defined through the type II error _{II} as Φ(_{II}) = _{II} (Eqs _{I} = 0.05/20,000, and _{I} = 4.708. For a typical 80% requested power, _{II} = 0.2, and _{II} = −0.842. Under these assumptions, the population required to detect effect size ^{2} is ^{2})/^{2}.

In a simple model for a phenotype that depends on ^{2} for an individual gene would be 1/^{2} in the range 0.05 to 0.1, with mutant alleles in 10–20 different genes leading to the same syndrome through phenocopy. Variants identified from genome-wide association studies (GWAS) have ^{2} ∼ 0.01, or even smaller in large meta-analyses. Thus, factors that explain even 1% of the variation in tumor invasion could have high biological relevance in identifying pathways and potential targets. Corresponding population sizes required are 280 tumors required to detect a Mendelian-like association with ^{2} = 0.1 and ∼3000 tumors required to detect a GWAS-like association with ^{2} = 0.01.
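The quoted population sizes can be checked with a short calculation. The sketch below assumes the standard normal-approximation relationship N = (z_I − z_II)² (1 − R²) / R², where R² denotes the fraction of variance explained, z_I is the quantile for a two-tailed type I error of 0.05/20,000, and z_II is the quantile for a type II error of 0.2. The symbols and function names are assumptions introduced for illustration, chosen to be consistent with the quantiles 4.708 and −0.842 stated in the text.

```python
from statistics import NormalDist

def normal_quantiles(n_tests=20000, fwer=0.05, power=0.8):
    """Quantiles z_I (two-tailed type I error) and z_II (type II error)."""
    alpha_1 = fwer / n_tests                    # per-test two-tailed type I error
    z_1 = NormalDist().inv_cdf(1.0 - alpha_1 / 2.0)
    z_2 = NormalDist().inv_cdf(1.0 - power)     # type II error = 1 - power
    return z_1, z_2

def required_sample_size(r2, n_tests=20000, fwer=0.05, power=0.8):
    """ASSUMED relationship: N = (z_I - z_II)^2 * (1 - R^2) / R^2."""
    z_1, z_2 = normal_quantiles(n_tests, fwer, power)
    return (z_1 - z_2) ** 2 * (1.0 - r2) / r2
```

Under these assumptions, `required_sample_size(0.1)` is close to the 280 tumors quoted for a Mendelian-like effect, and `required_sample_size(0.01)` is close to the approximately 3000 tumors quoted for a GWAS-like effect.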

These population sizes can be vastly reduced, however, by exploiting the within-tumor heterogeneity, ignored in the above analysis. Each observation of tumor invasiveness _{ti} is separated into the population mean _{0}, the tumor-based mean _{ti} for organoid _{ti} are separated into the population mean _{0}, the tumor mean _{ti}. In the framework of a variance components model, these lead to between-tumor and within-tumor tests that are statistically independent:
_{B} = 0, with two-tailed alternative _{B} ≠ 0; similarly, the null for the within-tumor test is _{W} = 0, with alternative _{W} ≠ 0. Equations similar to ^{2} = 0.01, the number of observations required remains 3000. These may be obtained from hundreds of organoids generated per tumor from only tens of tumors, rather than the thousands of tumors required for a between-tumor test.

These power relationships assume that individual measurement of invasiveness

In a pooled RNA-Seq study, the most invasive organoids would be pooled to generate a single RNA-Seq library, and similarly the least invasive organoids would be pooled to generate a second library. The number of RNA-Seq libraries for a within-tumor test would then be reduced from the number of organoids to twice the number of tumors, a 100–1000× reduction in effort. A general conclusion for an additive model is that pooling the upper 27% and the lower 27% optimizes the power and has 80% efficiency, defined as having equivalent power to an individual-level test conducted on 80% of the original population [

A pooling design slightly modifies the power relationships, primarily by increasing the number of organoids required by 10–20% relative to a design in which each organoid is individually characterized genomically (Eqs ^{2} values required to detect an effect at 80% power, assuming 20,000 gene-based tests with a corresponding two-tailed ^{−6}. For between-tumor tests (

The critical effect size defined as variance explained, ^{2}, is shown for (A) between-tumor tests and (B) within-tumor tests. Calculations assume 20,000 two-tailed gene-based tests with genome-wide significance level 2.5 × 10^{−6} and 80% power. For between-tumor tests, the ratio of within-tumor to between-tumor variance is set to the observed value of 2.6. For within-tumor tests, pooling is assumed to reduce efficiency to 80%. Color bars indicate contour levels; between-tumor tests are limited to much larger effects and use only the upper region of the scale.

While

The contour lines for between-tumor tests are vertical (

Motivated by the predicted power of a population-based test, we used this approach to test the association of invasion with protein expression of Keratin 14 (K14). We chose protein characterization rather than RNA-Seq because these samples were not consented for genomics, and we chose K14 because laboratory analyses provide strong evidence linking K14 with tumor invasion in mouse and human [

We therefore quantified K14 by immunofluorescence using epifluorescence microscopy, with pixel values mapped from 0 (no expression) to 1 (saturation), for the same series of organoids imaged by DIC for invasion. Boundaries identified from DIC images were superimposed on the K14 images, and pixel intensities within each organoid boundary were gathered (

The DIC images (panels A,C,E) were paired with K14 epifluorescence images obtained at identical resolution (panels B,D,F). Dots indicate boundaries from the DIC images interpolated to 256 equally spaced points and superimposed on both the DIC and K14 images.

(A) Histogram of total Keratin 14 (K14) expression per organoid, calculated as the sum of the K14 intensity on a [0, 1] scale for pixels within the organoid divided by the total number of image pixels. (B) Histogram of mean K14 expression, calculated as the sum of the K14 intensity divided by the area of the organoid in pixels. The organoid size is less than the image size, and therefore the mean K14 is greater than the total K14. Both the total and mean were rank-normalized to generate uniform distributions for robust statistical analysis.
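The two K14 summaries and the rank normalization can be sketched as follows. Total K14 divides the summed intensity inside the organoid by the number of pixels in the whole image, while mean K14 divides by the number of pixels inside the organoid. The rank normalization maps values to the unit interval using (rank − 0.5)/n with average ranks for ties; this particular convention and the function names are assumptions made for illustration.

```python
def k14_total_and_mean(intensities_inside, n_image_pixels):
    """intensities_inside: K14 pixel values on [0, 1] for pixels within the organoid boundary."""
    summed = sum(intensities_inside)
    total_k14 = summed / n_image_pixels          # normalized by image size
    mean_k14 = summed / len(intensities_inside)  # normalized by organoid area in pixels
    return total_k14, mean_k14

def rank_normalize(values):
    """Map values to (0, 1) by rank; ties share their average rank.

    ASSUMPTION: normalized value = (rank - 0.5) / n with 1-based ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1                       # extend the block of tied values
        average_rank = (i + j) / 2 + 1   # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [(r - 0.5) / n for r in ranks]
```

Because the organoid occupies fewer pixels than the full image, the mean is never smaller than the total for the same organoid, and total K14 increases with organoid size even at constant staining intensity.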

Analyses were conducted according to the strategy outlined above: between-tumor tests for the tumor means estimated from the individual organoids, similar to a standard analysis of the tumor bulk, and within-tumor tests for invasiveness and K14 values corrected for the tumor mean. The within-tumor tests were conducted using three separate methods to explore the power of pooling. First, we performed a standard regression test that used each organoid as a single observation. Next, we restricted attention to the organoids in the tails of the distribution, and again performed a standard regression test using this subset of organoids. The tail fraction

Between-tumor tests of total K14 versus invasion show a positive correlation, but are not significant, even at the single-test level (^{−45}, ^{−10} even for tail fractions as low as 5% (^{−6}, the typical threshold for a 0.05 family-wise error rate (FWER) when correcting for tests of 20,000 human genes or proteins.

(A) Between-tumor tests of tumor means (points) do not yield significance for a linear model (dashed line). (B) The within-tumor test shows a highly significant association (^{−45}) for a linear model (dashed line) between invasiveness and total Keratin 14 protein expression for individual organoids corrected for their tumor-specific baselines. Organoids in the extreme tails are shown for symmetric tails of 10% through 50%, with organoids in the 10% tail also belonging to larger tails and so on. The dashed regression line uses all the observations (50% tails). (C) Tests performed using organoids restricted to extreme tails are also highly significant (solid line). Pooled tests of mean values for organoids in the upper vs. lower tail, performed as a paired-sample ^{−6} when correcting for 20,000 genes or proteins tested, compatible for use with RNA-Seq (dashed line).

Thus, the pooling approach described here should have power to detect a similarly sized effect from RNA-Seq data generated from pooled highly invasive versus non-invasive organoids. The statistical significance is only weakly sensitive to pooling fraction, with similar results for pooling fractions from 20% to 50%. These results are in accord with theory developed for pooled analysis in the context of genome-wide association studies [^{2} = 0.05, than for the within-tumor test, ^{2} = 0.22.
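The within-tumor test and its restriction to extreme tails can be illustrated with a short sketch on simulated data. Several choices here are assumptions and not taken from the study: the regression direction (K14 on invasiveness), the through-origin fit on tumor-centered values, the normal approximation for the p-value, the simulated effect size, and all function names.

```python
import math
import random
from collections import defaultdict


def center_within_tumor(tumor_ids, values):
    """Subtract each tumor's mean so only within-tumor deviations remain."""
    groups = defaultdict(list)
    for t, v in zip(tumor_ids, values):
        groups[t].append(v)
    means = {t: sum(vs) / len(vs) for t, vs in groups.items()}
    return [v - means[t] for t, v in zip(tumor_ids, values)]


def tail_indices(tumor_ids, phenotype, fraction):
    """Indices of organoids in the lower or upper tail of each tumor.

    `fraction` is the size of each tail, so 0.5 keeps every organoid.
    """
    by_tumor = defaultdict(list)
    for i, t in enumerate(tumor_ids):
        by_tumor[t].append(i)
    keep = set()
    for idx in by_tumor.values():
        idx.sort(key=lambda i: phenotype[i])
        k = max(1, round(fraction * len(idx)))
        keep.update(idx[:k])
        keep.update(idx[-k:])
    return sorted(keep)


def slope_test(x, y, n_tumors):
    """Regress centered y on centered x through the origin.

    Returns (slope, z, p); p uses a large-sample normal approximation.
    """
    sxx = sum(a * a for a in x)
    slope = sum(a * b for a, b in zip(x, y)) / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    dof = len(x) - n_tumors - 1  # one per tumor mean, one for the slope
    z = slope / math.sqrt(rss / dof / sxx)
    return slope, z, math.erfc(abs(z) / math.sqrt(2))


# Simulated data for illustration only: 20 tumors x 50 organoids,
# with a tumor-specific baseline and a within-tumor effect of 0.5.
rng = random.Random(0)
tumor_ids, invasion, k14 = [], [], []
for tumor in range(20):
    baseline = rng.gauss(0, 1)
    for _ in range(50):
        inv = rng.gauss(0, 1)
        tumor_ids.append(tumor)
        invasion.append(baseline + inv)
        k14.append(baseline + 0.5 * inv + rng.gauss(0, 1))

x = center_within_tumor(tumor_ids, invasion)
y = center_within_tumor(tumor_ids, k14)
all_result = slope_test(x, y, n_tumors=20)

tails = tail_indices(tumor_ids, invasion, fraction=0.10)
tail_result = slope_test([x[i] for i in tails], [y[i] for i in tails], n_tumors=20)
```

Selecting tails on the phenotype and regressing the molecular feature on it keeps the slope estimate stable under selection in this simulation. A real analysis would likely use an exact reference distribution for the p-value instead of the normal approximation used here.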

We performed a similar series of tests for mean K14, correcting for possible confounding with tumor size (^{−13},

(A) Between-tumor tests of tumor means (points) do not yield significance for a linear model (dashed line). (B) The within-tumor test shows a highly significant association (1.0 × 10^{−13}) for a linear model (dashed line) between invasiveness and mean Keratin 14 protein expression for individual organoids corrected for their tumor-specific baselines. Organoids in the extreme tails are shown for symmetric tails of 10% through 50%, with organoids in the 10% tail also belonging to larger tails and so on. The dashed regression line uses all the observations (50% tails). (C) Tests performed using organoids restricted to extreme tails are also highly significant (solid line). Pooled tests of mean values for organoids in the upper vs. lower tail, performed as a paired-sample

Given the stronger association with total K14 than mean K14, we next investigated associations with organoid area, rank-transformed to a uniform distribution to permit robust analysis. The between-tumor test was significant at a single-test level (^{−52}), and extreme tail and pooled tests were also significant for genome-wide or proteome-wide tests (^{−10} for many pooling fractions).

(A) Between-tumor tests between organoid size and invasiveness remain significant. (^{−52}) for a linear model (dashed blue line) between invasiveness and rank-transformed organoid area. Organoids are colored according to extreme tail membership. (C) Tests performed using organoids restricted to extreme tails are also highly significant (solid line), as are pooled tests for association of area with invasiveness.

Population-based studies have been highly effective in revealing the genetic architecture of complex disease through genome-wide association studies (GWAS). Similar studies of somatic aberrations in cancer, whether genetic mutations or epigenetic or gene expression drivers, have not had the cohort sizes to permit similarly powered studies. Most tumor studies enroll fewer than 1000 individuals, whereas many GWAS have populations over 100,000. Our insight is that tumor heterogeneity, probed by organoids, permits hundreds to thousands of independent measurements from a single tumor, with tumors and organoids analogous to families and sibships in a population genetics study. Furthermore, similar to efficient genetic study designs using the most extreme or discordant siblings, we can increase efficiency by restricting analysis to organoids in the extreme tails of a phenotype distribution, or even to single measurements of pooled tails. We have developed this approach successfully by validating the association of Keratin 14 with a quantitative phenotype for organoid invasion.

To permit quantitative analysis, we have also developed a new spectral-based phenotype for assessing the invasiveness of organoids generated from human breast tumors. This phenotype, when measured on a logarithmic scale, is ideal for quantitative trait analysis. Bayesian model selection indicates that a mixed effects model describes the data well: while each tumor has its own mean invasiveness, tumors share a common variance describing within-tumor heterogeneity. A variance components model finds that the within-tumor variance is approximately 2.6× larger than the between-tumor variance. The implication of this finding is that measurements of bulk tumor capture only a small fraction of the information inherent in heterogeneous tumor tissue. The ability to probe heterogeneity is the motivating factor for single-cell DNA and RNA sequencing. Here, we demonstrate that organoids are also able to probe this heterogeneity.
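A variance decomposition of this kind can be illustrated with the classical one-way random-effects (method-of-moments) estimator. This is a swapped-in textbook technique shown under the assumption of a shared within-tumor variance; it is not the Bayesian model selection or the variance components fit reported here, and the function name is hypothetical.

```python
import statistics
from collections import defaultdict


def variance_components(tumor_ids, values):
    """One-way random-effects ANOVA estimates of (within, between) variance.

    Assumes every tumor shares a common within-tumor variance.
    """
    groups = defaultdict(list)
    for t, v in zip(tumor_ids, values):
        groups[t].append(v)
    k = len(groups)
    n_total = len(values)
    grand_mean = statistics.fmean(values)
    group_means = {t: statistics.fmean(vs) for t, vs in groups.items()}

    ss_within = sum((v - group_means[t]) ** 2
                    for t, vs in groups.items() for v in vs)
    ss_between = sum(len(vs) * (group_means[t] - grand_mean) ** 2
                     for t, vs in groups.items())
    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)

    # Effective group size, which handles unequal organoid counts per tumor.
    n0 = (n_total - sum(len(vs) ** 2 for vs in groups.values()) / n_total) / (k - 1)

    var_within = ms_within
    var_between = max(0.0, (ms_between - ms_within) / n0)  # truncate at zero
    return var_within, var_between
```

The ratio `var_within / var_between` from such a fit is the quantity the paragraph above compares, applied to log-scale invasiveness.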

The organoid phenotype analyzed here is a surrogate for an initial step of metastasis, which in addition to invasion by individual cells or collectives also includes dissemination, re-seeding, and outgrowth. Organoids can provide invasion-related quantitative phenotypes as a step towards more comprehensive analysis of the genetic and genomic determinants of metastasis. In patients, tumor invasion can be assessed from two-dimensional sections by various methods as part of clinical prognosis [

Molecular characterization of invasive versus non-invasive organoids could identify biological factors that are drivers and effectors of metastasis. These could provide new hypotheses for therapeutic targets and for predictive biomarkers. Again using the proven success of GWAS as a model, we analyze the power of between-tumor and within-tumor tests, analogous to between-family and within-family tests in population genetics. We find that between-tumor tests have limited power; even after 200 tumors have been analyzed, power is limited to detecting effects similar to Mendelian genes in hereditary disorders. Within-tumor tests have excellent power, however, potentially equivalent to well-powered GWAS that can identify genes and variants that contribute as little as 1% to 0.1% to population-level variation. These results provide a possible explanation for the challenges in converting bulk tumor genomics data to therapeutically useful knowledge: only the very largest effects have been detected because much of the information inherent in individual cells and sub-regions has been lost.

In addition to gene expression markers, direct observation of protein levels can be informative. Keratin 14 was quantified here using immunofluorescence. In previous work, we have used genetic engineering of fusion proteins for live imaging of fluorescent tags [

To validate the power of within-tumor tests, extreme tails analysis, and pooled tests, we demonstrated highly significant associations between increased expression and increased invasiveness, ^{−45} for total Keratin 14 and ^{−13} for Keratin 14 normalized to the imaged cross-sectional area. Tests restricted to organoids in the extreme tails retained high power and strong significance. Thus, generating hundreds of organoids per tumor and restricting analysis to the most invasive 5 to 10, selected either visually or by segmentation followed by automated analysis of the tumor boundary, could lead to new discoveries. Even greater experimental savings come with a pooled design, for example collecting the most extreme organoids from each tumor to generate a single RNA-Seq library for each tail. Pooled tests would have power to detect association for molecular features with effect sizes similar to total Keratin 14, even after correcting for multiple testing of 20,000 genes.
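The pooled design can be sketched as a paired comparison across tumors, with one upper-tail and one lower-tail measurement per tumor. The sketch below assumes a paired-sample t statistic and approximates a pooled measurement by the mean feature value of the organoids in each tail; the function names and the per-tail `fraction` parameter are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def pooled_tail_differences(tumor_ids, phenotype, feature, fraction):
    """Per tumor: mean feature in the upper phenotype tail minus the lower tail.

    Each tail mean stands in for a single pooled measurement, such as one
    library built from the pooled organoids of that tail.
    """
    by_tumor = defaultdict(list)
    for t, p, f in zip(tumor_ids, phenotype, feature):
        by_tumor[t].append((p, f))
    diffs = []
    for pairs in by_tumor.values():
        pairs.sort(key=lambda pf: pf[0])  # order organoids by phenotype
        k = max(1, round(fraction * len(pairs)))
        low = statistics.fmean(f for _, f in pairs[:k])
        high = statistics.fmean(f for _, f in pairs[-k:])
        diffs.append(high - low)
    return diffs


def paired_t(diffs):
    """Paired-sample t statistic and degrees of freedom for the differences."""
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

For example, with three tumors of four organoids each and a 25% tail, the test reduces to three paired differences:

```python
ids = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
inv = [1, 2, 3, 4] * 3
k14 = [0, 1, 2, 3, 1, 1, 1, 3, 0, 2, 2, 4]
diffs = pooled_tail_differences(ids, inv, k14, fraction=0.25)
t_stat, dof = paired_t(diffs)
```

Converting the statistic to a p-value requires a Student t distribution with `dof` degrees of freedom, which the standard library does not provide.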

We conclude that organoid-based studies enrolling on the order of 100 participants with breast cancer and generating 100 to 1000 organoids per tumor will be able to discover clinically relevant driver and effector genes underlying phenotypes relevant to breast cancer. This population genetics framework is directly applicable to analyzing molecular determinants in other cancer types and to future studies designed to correlate organoid phenotypes with clinical outcomes. Our approach could also be generalized to later stages of metastasis through development and validation of additional quantitative traits that capture biological variation in the capacity of cancer cells to, for example, disseminate or seed distant organs.


We thank Prof. Jerry Prince for advice on parametric transformations for quantitative analysis of shape, Prof. Paul Newton and Prof. Paul Macklin for suggestions relating to low-pass filters, Prof. Donald Geman and Prof. Rene Vidal for advice on segmentation, and Dr. Andre Kuchavary for discussions of spectral methods for related problems.