Functions

These functions are exposed at the top level of the module, e.g. hl.case.

Core language functions

literal(x[, dtype]) Captures and broadcasts a Python variable or object as an expression.
cond(condition, consequent, alternate, …) Expression for an if/else statement; tests a condition and returns one of two options based on the result.
switch(expr) Build a conditional tree on the value of an expression.
case(missing_false) Chain multiple if-else statements with a CaseBuilder.
bind(f, *exprs) Bind a temporary variable and use it in a function.
null(t) Creates an expression representing a missing value of a specified type.
is_missing(expression) Returns True if the argument is missing.
is_defined(expression) Returns True if the argument is not missing.
coalesce(*args) Returns the first non-missing value of args.
or_else(a, b) If a is missing, return b.
or_missing(predicate, value) Returns value if predicate is True, otherwise returns missing.
range(start, stop[, step]) Returns an array of integers from start to stop by step.
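
The missing-value helpers listed above (coalesce, or_else, or_missing) can be modeled in plain Python. The following is a minimal sketch of the semantics described in this list, not the library's implementation. It assumes that None stands in for a missing value, and it operates on ordinary Python values rather than expressions.

```python
def coalesce(*args):
    """Return the first argument that is not missing (None), else None."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def or_else(a, b):
    """If a is missing, return b; otherwise return a."""
    return b if a is None else a


def or_missing(predicate, value):
    """Return value if predicate is True, otherwise missing (None)."""
    return value if predicate else None


if __name__ == "__main__":
    print(coalesce(None, 3, 5))
    print(or_else(None, 7))
    print(or_missing(False, 1))
```

Note that coalesce tests for None explicitly, so falsy but defined values such as 0 or an empty string are still returned.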

Constructors

bool(x) Convert to a Boolean expression.
float(x) Convert to a 64-bit floating point expression.
float32(x) Convert to a 32-bit floating point expression.
float64(x) Convert to a 64-bit floating point expression.
int(x) Convert to a 32-bit integer expression.
int32(x) Convert to a 32-bit integer expression.
int64(x) Convert to a 64-bit integer expression.
interval(start, end[, includes_start, …]) Construct an interval expression.
str(x) Returns the string representation of x.
struct(**kwargs) Construct a struct expression.
tuple(iterable) Construct a tuple expression.

Collection constructors

array(collection) Construct an array expression.
empty_array(t) Returns an empty array of elements of a type t.
set(collection) Construct a set expression.
empty_set(t) Returns an empty set of elements of a type t.
dict(collection) Creates a dictionary.

Collection functions

map(f, collection) Transform each element of a collection.
flatmap(f, collection) Map each element of the collection to a new collection, and flatten the results.
zip(*arrays, fill_missing) Zip together arrays into a single array.
zip_with_index(a) Returns an array of (index, element) tuples.
flatten(collection) Flatten a nested collection by concatenating sub-collections.
any(f, collection) Returns True if f returns True for any element.
all(f, collection) Returns True if f returns True for every element.
filter(f, collection) Returns a new collection containing elements where f returns True.
sorted(collection[, key, …]) Returns a sorted array.
find(f, collection) Returns the first element where f returns True.
group_by(f, collection) Group collection elements into a dict according to a lambda function.
len(x) Returns the size of a collection or string.
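
Several of the collection functions above have close analogues on ordinary Python lists. The sketch below illustrates what flatmap, find, group_by and zip_with_index compute, following the one-line descriptions in this list. It uses plain Python values, and returning None from find when nothing matches is an assumption of the sketch.

```python
from collections import defaultdict
from itertools import chain


def flatmap(f, collection):
    """Map each element to a new collection, then flatten the results."""
    return list(chain.from_iterable(f(x) for x in collection))


def find(f, collection):
    """Return the first element where f returns True (None if no match)."""
    return next((x for x in collection if f(x)), None)


def group_by(f, collection):
    """Group elements into a dict keyed by the value of f."""
    groups = defaultdict(list)
    for x in collection:
        groups[f(x)].append(x)
    return dict(groups)


def zip_with_index(a):
    """Return a list of (index, element) tuples."""
    return list(enumerate(a))


if __name__ == "__main__":
    print(flatmap(lambda x: [x, x * 10], [1, 2]))
    print(find(lambda x: x > 2, [1, 2, 3, 4]))
    print(group_by(len, ["a", "bb", "c"]))
    print(zip_with_index(["x", "y"]))
```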

Numeric functions

abs(x) Take the absolute value of a numeric value or array.
exp(x) Computes e raised to the power x.
is_nan(x) Returns True if the argument is nan (not a number).
log(x[, base]) Take the logarithm of x with base base.
log10(x) Take the logarithm of x with base 10.
sign(x) Returns the sign of a numeric value or array.
sqrt(x) Returns the square root of x.
int(x) Convert to a 32-bit integer expression.
int32(x) Convert to a 32-bit integer expression.
int64(x) Convert to a 64-bit integer expression.
float(x) Convert to a 64-bit floating point expression.
float32(x) Convert to a 32-bit floating point expression.
float64(x) Convert to a 64-bit floating point expression.
floor(x) The largest integral value that is less than or equal to x.
ceil(x) The smallest integral value that is greater than or equal to x.

Numeric collection functions

min(*exprs) Returns the minimum of a collection or of given numeric expressions.
max(*exprs) Returns the maximum element of a collection or of given numeric expressions.
mean(collection) Returns the mean of all values in the collection.
median(collection) Returns the median value in the collection.
product(collection) Returns the product of values in the collection.
sum(collection) Returns the sum of values in the collection.
argmin(array, unique) Return the index of the minimum value in the array.
argmax(array, unique) Return the index of the maximum value in the array.

String functions

json(x) Convert an expression to a JSON string expression.
hamming(s1, s2) Returns the Hamming distance between the two strings.
delimit(collection[, delimiter]) Joins elements of collection into a single string delimited by delimiter.
entropy(s) Returns the Shannon entropy of the character distribution defined by the string.
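
The two string metrics above are easy to state in plain Python. This is a sketch of the underlying definitions, not the library's code. It assumes that the Hamming distance is defined only for strings of equal length and that the entropy is measured in bits (logarithm base 2).

```python
import math
from collections import Counter


def hamming(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("strings must have equal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


def entropy(s):
    """Shannon entropy (in bits) of the character distribution of s."""
    n = len(s)
    counts = Counter(s)
    # Each character contributes p * log2(1 / p), where p is its frequency.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    print(hamming("ATATA", "ATGCA"))
    print(entropy("ac"))
```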

Statistical functions

chisq(c1, c2, c3, c4) Performs chi-squared test of independence on a 2x2 contingency table.
fisher_exact_test(c1, c2, c3, c4) Calculates the p-value, odds ratio, and 95% confidence interval using Fisher’s exact test for a 2x2 table.
ctt(c1, c2, c3, c4, min_cell_count) Performs chi-squared or Fisher’s exact test of independence on a 2x2 contingency table.
dbeta(x, a, b) Returns the probability density at x of a beta distribution with parameters a (alpha) and b (beta).
dpois(x, lamb[, log_p]) Compute the (log) probability density at x of a Poisson distribution with rate parameter lamb.
hardy_weinberg_p(n_hom_ref, n_het, n_hom_var) Performs test of Hardy-Weinberg equilibrium.
pchisqtail(x, df) Returns the probability under the right-tail starting at x for a chi-squared distribution with df degrees of freedom.
pnorm(x) The cumulative probability function of a standard normal distribution.
ppois(x, lamb[, lower_tail, log_p]) The cumulative probability function of a Poisson distribution.
qchisqtail(p, df) Inverts pchisqtail().
qnorm(p) Inverts pnorm().
qpois(p, lamb[, lower_tail, log_p]) Inverts ppois().
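
The "inverts" relationships in the list above can be checked numerically with the Python standard library. The sketch below models pnorm and qnorm with statistics.NormalDist. The helper pchisqtail_df1 is a hypothetical name introduced here and covers only one degree of freedom, using the identity P(X > x) = erfc(sqrt(x / 2)). None of this is the library's implementation.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def pnorm(x):
    """Cumulative probability of a standard normal distribution at x."""
    return _STANDARD_NORMAL.cdf(x)


def qnorm(p):
    """Inverse of pnorm: the x such that pnorm(x) == p."""
    return _STANDARD_NORMAL.inv_cdf(p)


def pchisqtail_df1(x):
    """Right-tail probability of a chi-squared distribution with df = 1."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    print(pnorm(0.0))
    print(qnorm(pnorm(1.3)))
    print(pchisqtail_df1(3.841458820694124))
```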

Randomness

rand_bool(p) Returns True with probability p (RNG).
rand_norm([mean, sd]) Samples from a normal distribution with mean mean and standard deviation sd (RNG).
rand_pois(lamb) Samples from a Poisson distribution with rate parameter lamb (RNG).
rand_unif(min, max) Returns a random floating-point number uniformly drawn from the interval [min, max].

Genetics functions

locus(contig, pos, reference_genome, …) Construct a locus expression from a chromosome and position.
locus_from_global_position(global_pos, …) Constructs a locus expression from a global position and a reference genome.
locus_interval(contig, start, end[, …]) Construct a locus interval expression.
parse_locus(s, reference_genome, …) Construct a locus expression by parsing a string or string expression.
parse_variant(s, reference_genome, …) Construct a struct with a locus and alleles by parsing a string.
parse_locus_interval(s, reference_genome, …) Construct a locus interval expression by parsing a string or string expression.
call(*alleles[, phased]) Construct a call expression.
unphased_diploid_gt_index_call(gt_index) Construct an unphased, diploid call from a genotype index.
parse_call(s) Construct a call expression by parsing a string or string expression.
downcode(c, i) Create a new call by setting all alleles other than i to ref.
triangle(n) Returns the triangle number of n.
is_snp(ref, alt) Returns True if the alleles constitute a single nucleotide polymorphism.
is_mnp(ref, alt) Returns True if the alleles constitute a multiple nucleotide polymorphism.
is_transition(ref, alt) Returns True if the alleles constitute a transition.
is_transversion(ref, alt) Returns True if the alleles constitute a transversion.
is_insertion(ref, alt) Returns True if the alleles constitute an insertion.
is_deletion(ref, alt) Returns True if the alleles constitute a deletion.
is_indel(ref, alt) Returns True if the alleles constitute an insertion or deletion.
is_star(ref, alt) Returns True if the alleles constitute an upstream deletion.
is_complex(ref, alt) Returns True if the alleles constitute a complex polymorphism.
is_valid_contig(contig[, reference_genome]) Returns True if contig is a valid contig name in reference_genome.
is_valid_locus(contig, position[, …]) Returns True if contig and position form a valid site in reference_genome.
allele_type(ref, alt) Returns the type of the polymorphism as a string.
pl_dosage(pl) Return expected genotype dosage from array of Phred-scaled genotype likelihoods with uniform prior.
gp_dosage(gp) Return expected genotype dosage from array of genotype probabilities.
get_sequence(contig, position[, before, …]) Return the reference sequence at a given locus.
mendel_error_code(locus, is_female, father, …) Compute a Mendelian violation code for genotypes.
liftover(x, dest_reference_genome[, min_match]) Lift over coordinates to a different reference genome.
min_rep(locus, alleles) Computes the minimal representation of a (locus, alleles) polymorphism.
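
A few of the genetics helpers above reduce to small arithmetic or lookup rules. The sketch below shows the triangle number, a genotype-index decoding, and a transition/transversion check in plain Python. It assumes the VCF ordering of unphased diploid genotypes, where alleles (j, k) with j <= k have index k(k+1)/2 + j, and it handles only single-base alleles. The function gt_index_to_alleles is a hypothetical name for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def triangle(n):
    """Return the n-th triangle number, n * (n + 1) / 2."""
    return n * (n + 1) // 2


def gt_index_to_alleles(gt_index):
    """Decode a genotype index into an unphased allele pair (j, k), j <= k."""
    # k is the largest integer whose triangle number does not exceed gt_index.
    k = (math.isqrt(8 * gt_index + 1) - 1) // 2
    j = gt_index - triangle(k)
    return (j, k)


def is_snp(ref, alt):
    """True if ref and alt are distinct single bases (simplified check)."""
    return len(ref) == 1 and len(alt) == 1 and ref != alt


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine SNPs."""
    pair = {ref, alt}
    return is_snp(ref, alt) and (pair <= PURINES or pair <= PYRIMIDINES)


def is_transversion(ref, alt):
    """True for SNPs over A, C, G, T that are not transitions."""
    bases = PURINES | PYRIMIDINES
    return is_snp(ref, alt) and {ref, alt} <= bases and not is_transition(ref, alt)


if __name__ == "__main__":
    print(triangle(3))
    print(gt_index_to_alleles(4))
    print(is_transition("A", "G"), is_transversion("A", "T"))
```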