Statistical functions
chisq(c1, c2, c3, c4) | Calculates p-value (Chi-square approximation) and odds ratio for a 2x2 table.
fisher_exact_test(c1, c2, c3, c4) | Calculates the p-value, odds ratio, and 95% confidence interval with Fisher’s exact test for a 2x2 table.
ctt(c1, c2, c3, c4, min_cell_count) | Calculates p-value and odds ratio for a 2x2 table.
dbeta(x, a, b) | Returns the probability density at x of a beta distribution with parameters a (alpha) and b (beta).
dpois(x, lamb[, log_p]) | Compute the (log) probability density at x of a Poisson distribution with rate parameter lamb.
hardy_weinberg_p(n_hom_ref, n_het, n_hom_var) | Compute Hardy-Weinberg Equilibrium p-value and heterozygosity ratio.
binom_test(x, n, p, alternative) | Performs a binomial test on p given x successes in n trials.
pchisqtail(x, df) | Returns the probability under the right tail starting at x for a chi-squared distribution with df degrees of freedom.
pnorm(x) | The cumulative probability function of a standard normal distribution.
ppois(x, lamb[, lower_tail, log_p]) | The cumulative probability function of a Poisson distribution.
qchisqtail(p, df) | Inverts pchisqtail().
qnorm(p) | Inverts pnorm().
qpois(p, lamb[, lower_tail, log_p]) | Inverts ppois().
-
hail.expr.functions.chisq(c1, c2, c3, c4) → hail.expr.expressions.typed_expressions.StructExpression
Calculates p-value (Chi-square approximation) and odds ratio for a 2x2 table.
Examples
>>> hl.chisq(10, 10, 10, 10).value
Struct(odds_ratio=1.0, p_value=1.0)
>>> hl.chisq(30, 30, 50, 10).value
Struct(odds_ratio=0.2, p_value=0.000107511176729)
Parameters:
- c1 (int or Expression of type tint32) – Value for cell 1.
- c2 (int or Expression of type tint32) – Value for cell 2.
- c3 (int or Expression of type tint32) – Value for cell 3.
- c4 (int or Expression of type tint32) – Value for cell 4.
Returns: StructExpression – A tstruct expression with two fields, p_value (tfloat64) and odds_ratio (tfloat64).
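The two example outputs above are consistent with a Pearson chi-squared statistic on one degree of freedom, without continuity correction, and a sample odds ratio of c1*c4 / (c2*c3). The following is a minimal pure-Python sketch under that assumption; `chisq_2x2` is a hypothetical helper, not part of Hail, and it does not guard against zero row or column totals.

```python
import math

def chisq_2x2(c1, c2, c3, c4):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table."""
    n = c1 + c2 + c3 + c4
    # Test statistic with one degree of freedom.
    stat = n * (c1 * c4 - c2 * c3) ** 2 / (
        (c1 + c2) * (c3 + c4) * (c1 + c3) * (c2 + c4)
    )
    # Right-tail probability of a chi-squared(1) variable.
    p_value = math.erfc(math.sqrt(stat / 2))
    # Sample odds ratio (assumed row-major cell order c1, c2 / c3, c4).
    odds_ratio = (c1 * c4) / (c2 * c3)
    return odds_ratio, p_value
```

For the table (30, 30, 50, 10) the statistic is exactly 15, which yields the p-value shown in the second example.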
-
hail.expr.functions.fisher_exact_test(c1, c2, c3, c4) → hail.expr.expressions.typed_expressions.StructExpression
Calculates the p-value, odds ratio, and 95% confidence interval with Fisher’s exact test for a 2x2 table.
Examples
>>> hl.fisher_exact_test(10, 10, 10, 10).value
Struct(p_value=1.0000000000000002, odds_ratio=1.0, ci_95_lower=0.24385796914260355, ci_95_upper=4.100747675033819)
>>> hl.fisher_exact_test(30, 30, 50, 10).value
Struct(p_value=0.00019049994432397886, odds_ratio=0.20287462096407916, ci_95_lower=0.07687933053900567, ci_95_upper=0.4987032678214519)
Notes
This method is identical to the version implemented in R with default parameters (two-sided, alpha = 0.05, null hypothesis that the odds ratio equals 1).
Parameters:
- c1 (int or Expression of type tint32) – Value for cell 1.
- c2 (int or Expression of type tint32) – Value for cell 2.
- c3 (int or Expression of type tint32) – Value for cell 3.
- c4 (int or Expression of type tint32) – Value for cell 4.
Returns: StructExpression – A tstruct expression with four fields, p_value (tfloat64), odds_ratio (tfloat64), ci_95_lower (tfloat64), and ci_95_upper (tfloat64).
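To make the two-sided p-value concrete, here is a hedged pure-Python sketch of the usual definition: with the row and column totals fixed, sum the hypergeometric probabilities of all tables that are no more likely than the observed one. `fisher_exact_p` is a hypothetical name; the sketch covers only the p-value and does not reproduce the odds ratio estimate or the confidence interval.

```python
from math import comb

def fisher_exact_p(c1, c2, c3, c4):
    """Two-sided Fisher's exact p-value for the table [[c1, c2], [c3, c4]]."""
    row1, row2 = c1 + c2, c3 + c4
    col1 = c1 + c3

    # Unnormalised hypergeometric weight of a table whose top-left cell is k,
    # with all row and column totals held fixed.
    def weight(k):
        return comb(row1, k) * comb(row2, col1 - k)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(c1)
    # Sum the weights of every table no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)
```

The weights are exact integers, so ties with the observed table are detected without a floating-point tolerance.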
-
hail.expr.functions.ctt(c1, c2, c3, c4, min_cell_count) → hail.expr.expressions.typed_expressions.StructExpression
Calculates p-value and odds ratio for a 2x2 table.
Examples
>>> hl.ctt(10, 10, 10, 10, min_cell_count=15).value
Struct(odds_ratio=1.0, p_value=1.0)
>>> hl.ctt(30, 30, 50, 10, min_cell_count=15).value
Struct(odds_ratio=0.202874620964, p_value=0.000190499944324)
Notes
If any cell is lower than min_cell_count, Fisher’s exact test is used. Otherwise, the faster chi-squared approximation is used.
Parameters:
- c1 (int or Expression of type tint32) – Value for cell 1.
- c2 (int or Expression of type tint32) – Value for cell 2.
- c3 (int or Expression of type tint32) – Value for cell 3.
- c4 (int or Expression of type tint32) – Value for cell 4.
- min_cell_count (int or Expression of type tint32) – Minimum cell count for chi-squared approximation.
Returns: StructExpression – A tstruct expression with two fields, p_value (tfloat64) and odds_ratio (tfloat64).
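The selection rule in the Notes can be sketched as a small function. `ctt_method` is a hypothetical helper that only names the test the rule would pick; it does not run either test.

```python
def ctt_method(c1, c2, c3, c4, min_cell_count):
    """Name the test that the rule in the Notes selects for this table."""
    # Any cell below the threshold switches to the exact test.
    if min(c1, c2, c3, c4) < min_cell_count:
        return "fisher_exact_test"
    return "chisq"
```

With min_cell_count=15 the table (30, 30, 50, 10) has a cell below the threshold, which agrees with the second example reporting the same p-value and odds ratio as fisher_exact_test.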
-
hail.expr.functions.dbeta(x, a, b) → hail.expr.expressions.typed_expressions.Float64Expression
Returns the probability density at x of a beta distribution with parameters a (alpha) and b (beta).
Examples
>>> hl.dbeta(.2, 5, 20).value
4.900377563180943
Parameters:
- x (float or Expression of type tfloat64) – Point in [0,1] at which to sample. If a < 1 then x must be positive. If b < 1 then x must be less than 1.
- a (float or Expression of type tfloat64) – The alpha parameter in the beta distribution. The result is undefined for non-positive a.
- b (float or Expression of type tfloat64) – The beta parameter in the beta distribution. The result is undefined for non-positive b.
Returns: Float64Expression
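As a cross-check of the example value, the beta density x^(a-1) (1-x)^(b-1) / B(a, b) can be evaluated with the standard library. This is a sketch restricted to 0 < x < 1 and positive a and b; `beta_density` is a hypothetical name, not a Hail function.

```python
import math

def beta_density(x, a, b):
    """Density of Beta(a, b) at x, for 0 < x < 1 and a, b > 0."""
    # log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    # Work in log space to avoid overflow for large parameters.
    log_density = (a - 1) * math.log(x) + (b - 1) * math.log1p(-x) - log_beta
    return math.exp(log_density)
```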
-
hail.expr.functions.dpois(x, lamb, log_p=False) → hail.expr.expressions.typed_expressions.Float64Expression
Compute the (log) probability density at x of a Poisson distribution with rate parameter lamb.
Examples
>>> hl.dpois(5, 3).value
0.10081881344492458
Parameters:
- x (float or Expression of type tfloat64) – Non-negative number at which to compute the probability density.
- lamb (float or Expression of type tfloat64) – Poisson rate parameter. Must be non-negative.
- log_p (bool or BooleanExpression) – If true, the natural logarithm of the probability density is returned.
Returns: Expression of type tfloat64 – The (log) probability density.
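For integer x, the quantity computed is exp(-lamb) * lamb^x / x!. A short standard-library sketch, assuming a strictly positive rate and a non-negative integer x (`poisson_pmf` is a hypothetical name):

```python
import math

def poisson_pmf(x, lamb, log_p=False):
    """Poisson probability of the non-negative integer x with rate lamb > 0."""
    # log of exp(-lamb) * lamb**x / x!, using lgamma(x + 1) = log(x!)
    log_prob = x * math.log(lamb) - lamb - math.lgamma(x + 1)
    return log_prob if log_p else math.exp(log_prob)
```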
-
hail.expr.functions.hardy_weinberg_p(n_hom_ref, n_het, n_hom_var) → hail.expr.expressions.typed_expressions.StructExpression
Compute Hardy-Weinberg Equilibrium p-value and heterozygosity ratio.
Examples
>>> hl.hardy_weinberg_p(20, 50, 26).value
Struct(r_expected_het_freq=0.500654450262, p_hwe=0.762089599352)
>>> hl.hardy_weinberg_p(37, 200, 85).value
Struct(r_expected_het_freq=0.489649643074, p_hwe=1.13372103832e-06)
Notes
For more information, see the Wikipedia page on the Hardy-Weinberg principle.
Parameters:
- n_hom_ref (int or Expression of type tint32) – Homozygous reference count.
- n_het (int or Expression of type tint32) – Heterozygote count.
- n_hom_var (int or Expression of type tint32) – Homozygous alternate count.
Returns: StructExpression – A struct expression with two fields, r_expected_het_freq (tfloat64) and p_hwe (tfloat64).
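The r_expected_het_freq values in the examples can be reproduced by the probability that two alleles drawn without replacement from the observed alleles differ. That formula is an inference from the example outputs, not something the text states, and the sketch below does not attempt the p-value. `expected_het_freq` is a hypothetical helper.

```python
def expected_het_freq(n_hom_ref, n_het, n_hom_var):
    """Expected heterozygote frequency given the observed allele counts."""
    n = n_hom_ref + n_het + n_hom_var   # number of samples
    n_ref = 2 * n_hom_ref + n_het       # reference allele count
    n_alt = 2 * n_hom_var + n_het       # alternate allele count
    # Probability that two alleles drawn without replacement from the
    # 2n observed alleles differ: 2 * n_ref * n_alt / (2n * (2n - 1)).
    return n_ref * n_alt / (n * (2 * n - 1))
```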
-
hail.expr.functions.binom_test(x, n, p, alternative: str) → hail.expr.expressions.typed_expressions.Float64Expression
Performs a binomial test on p given x successes in n trials.
Examples
>>> hl.binom_test(5, 10, 0.5, 'less').value
0.6230468749999999
With alternative less, the p-value is the probability of at most x successes, i.e. the cumulative probability at x of the distribution Binom(n, p). With greater, the p-value is the probability of at least x successes. With two.sided, the p-value is the total probability of all outcomes with probability at most that of x.
Returns the p-value from the exact binomial test of the null hypothesis that success has probability p, given x successes in n trials.
Parameters:
- x (int or Expression of type tint32) – Number of successes.
- n (int or Expression of type tint32) – Number of trials.
- p (float or Expression of type tfloat64) – Probability of success, between 0 and 1.
- alternative (str) – One of “two.sided”, “greater”, or “less”.
Returns: Expression of type tfloat64 – p-value.
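The three alternatives described above can be written out directly from the Binom(n, p) probabilities. This is a minimal sketch with a hypothetical name, `binom_test_p`; the small relative slack used for ties in the two-sided case is an assumption of the sketch.

```python
from math import comb

def binom_test_p(x, n, p, alternative):
    """Exact binomial test p-value for x successes in n trials."""
    # Probability of each possible number of successes 0..n.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    if alternative == "less":
        return sum(pmf[: x + 1])        # at most x successes
    if alternative == "greater":
        return sum(pmf[x:])             # at least x successes
    if alternative == "two.sided":
        # Relative slack so that ties survive floating-point rounding
        # (an assumption of this sketch).
        cutoff = pmf[x] * (1 + 1e-7)
        return sum(q for q in pmf if q <= cutoff)
    raise ValueError("alternative must be 'two.sided', 'greater' or 'less'")
```

For x = 5, n = 10, p = 0.5 and alternative 'less', the sum is 638/1024, which matches the example up to rounding.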
-
hail.expr.functions.pchisqtail(x, df) → hail.expr.expressions.typed_expressions.Float64Expression
Returns the probability under the right tail starting at x for a chi-squared distribution with df degrees of freedom.
Examples
>>> hl.pchisqtail(5, 1).value
0.025347318677468304
Parameters:
- x (float or Expression of type tfloat64)
- df (float or Expression of type tfloat64) – Degrees of freedom.
Returns: Expression of type tfloat64
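For one or two degrees of freedom the right-tail probability has a closed form, which is enough to check the example. The sketch below covers only those two cases; general df needs the regularized incomplete gamma function, which the Python standard library does not provide. `chisq_tail` is a hypothetical name.

```python
import math

def chisq_tail(x, df):
    """Right-tail probability of a chi-squared variable (df 1 or 2 only)."""
    if df == 1:
        # A chi-squared(1) variable is Z**2, so P(Z**2 > x) = erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        # A chi-squared(2) variable is exponential with mean 2.
        return math.exp(-x / 2)
    raise NotImplementedError("general df needs the incomplete gamma function")
```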
-
hail.expr.functions.pnorm(x) → hail.expr.expressions.typed_expressions.Float64Expression
The cumulative probability function of a standard normal distribution.
Examples
>>> hl.pnorm(0).value
0.5
>>> hl.pnorm(1).value
0.8413447460685429
>>> hl.pnorm(2).value
0.9772498680518208
Notes
Returns the left-tail probability p = Prob(\(Z\) < x) with \(Z\) a standard normal random variable.
Parameters: x (float or Expression of type tfloat64)
Returns: Expression of type tfloat64
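The left-tail probability in the Notes can be expressed through the complementary error function, Prob(Z < x) = erfc(-x / sqrt(2)) / 2. A one-function sketch with a hypothetical name, `std_normal_cdf`:

```python
import math

def std_normal_cdf(x):
    """Left-tail probability Prob(Z < x) for a standard normal Z."""
    # erfc keeps precision in the far left tail, where 1 + erf(...) would not.
    return 0.5 * math.erfc(-x / math.sqrt(2))
```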
-
hail.expr.functions.ppois(x, lamb, lower_tail=True, log_p=False) → hail.expr.expressions.typed_expressions.Float64Expression
The cumulative probability function of a Poisson distribution.
Examples
>>> hl.ppois(2, 1).value
0.9196986029286058
Notes
If lower_tail is true, returns Prob(\(X \leq\) x) where \(X\) is a Poisson random variable with rate parameter lamb. If lower_tail is false, returns Prob(\(X\) > x).
Parameters:
- x (float or Expression of type tfloat64)
- lamb (float or Expression of type tfloat64) – Rate parameter of Poisson distribution.
- lower_tail (bool or BooleanExpression) – If True, compute the probability of an outcome at or below x, otherwise greater than x.
- log_p (bool or BooleanExpression) – Return the natural logarithm of the probability.
Returns: Expression of type tfloat64
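The behaviour of lower_tail and log_p described above can be sketched by summing Poisson probabilities term by term. `poisson_cdf` is a hypothetical helper; computing the upper tail as one minus the lower tail is a simplification that loses precision far into the tail.

```python
import math

def poisson_cdf(x, lamb, lower_tail=True, log_p=False):
    """Prob(X <= x), or Prob(X > x), for X ~ Poisson(lamb)."""
    if x < 0:
        total = 0.0
    else:
        term = math.exp(-lamb)       # Prob(X = 0)
        total = term
        for k in range(1, math.floor(x) + 1):
            term *= lamb / k         # Prob(X = k) from Prob(X = k - 1)
            total += term
    prob = total if lower_tail else 1.0 - total
    return math.log(prob) if log_p else prob
```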
-
hail.expr.functions.qchisqtail(p, df) → hail.expr.expressions.typed_expressions.Float64Expression
Inverts pchisqtail().
Examples
>>> hl.qchisqtail(0.01, 1).value
6.634896601021213
Notes
Returns right-quantile x for which p = Prob(\(Z^2\) > x) with \(Z^2\) a chi-squared random variable with degrees of freedom specified by df. p must satisfy 0 < p <= 1.
Parameters:
- p (float or Expression of type tfloat64) – Probability.
- df (float or Expression of type tfloat64) – Degrees of freedom.
Returns: Expression of type tfloat64
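For df = 1 the right-quantile follows from the standard normal quantile, since Prob(Z^2 > x) = 2 Prob(Z > sqrt(x)). The sketch below handles only that special case, using `statistics.NormalDist` from the standard library; `chisq1_tail_quantile` is a hypothetical name.

```python
from statistics import NormalDist

def chisq1_tail_quantile(p):
    """x such that Prob(Z**2 > x) = p for a standard normal Z, 0 < p <= 1."""
    # Prob(Z**2 > x) = 2 * Prob(Z > sqrt(x)), so sqrt(x) is the
    # (1 - p / 2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1 - p / 2) ** 2
```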
-
hail.expr.functions.qnorm(p) → hail.expr.expressions.typed_expressions.Float64Expression
Inverts pnorm().
Examples
>>> hl.qnorm(0.90).value
1.2815515655446008
Notes
Returns left-quantile x for which p = Prob(\(Z\) < x) with \(Z\) a standard normal random variable. p must satisfy 0 < p < 1.
Parameters: p (float or Expression of type tfloat64) – Probability.
Returns: Expression of type tfloat64
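The inverse relationship with the normal CDF can be illustrated with the standard library's `statistics.NormalDist`, which is used here as a stand-in for the Hail functions rather than as their implementation.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

x = std_normal.inv_cdf(0.90)   # left-quantile for p = 0.90
p_back = std_normal.cdf(x)     # applying the CDF recovers p
```

Here x agrees with the example value to within floating-point rounding, and p_back returns to 0.90.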
-
hail.expr.functions.qpois(p, lamb, lower_tail=True, log_p=False) → hail.expr.expressions.typed_expressions.Float64Expression
Inverts ppois().
Examples
>>> hl.qpois(0.99, 1).value
4
Notes
Returns the smallest integer \(x\) such that Prob(\(X \leq x\)) \(\geq\) p where \(X\) is a Poisson random variable with rate parameter lamb.
Parameters:
- p (float or Expression of type tfloat64)
- lamb (float or Expression of type tfloat64) – Rate parameter of Poisson distribution.
- lower_tail (bool or BooleanExpression) – Corresponds to the lower_tail parameter of the inverse, ppois().
- log_p (bool or BooleanExpression) – Exponentiate p before testing.
Returns: Expression of type tfloat64
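The definition in the Notes, the smallest integer x with Prob(X <= x) >= p, translates into a simple search over the cumulative sum. This sketch covers only the default lower_tail=True, log_p=False case and assumes 0 <= p < 1; `poisson_quantile` is a hypothetical name.

```python
import math

def poisson_quantile(p, lamb):
    """Smallest integer x with Prob(X <= x) >= p for X ~ Poisson(lamb)."""
    x = 0
    term = math.exp(-lamb)   # Prob(X = 0)
    cdf = term
    while cdf < p:
        x += 1
        term *= lamb / x     # Prob(X = x) from Prob(X = x - 1)
        cdf += term
    return x
```

For p = 0.99 and lamb = 1 the cumulative probability first reaches 0.99 at x = 4, in line with the example.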