Expressions¶
Expression |
Base class for Hail expressions. |
ArrayExpression |
Expression of type tarray . |
ArrayNumericExpression |
Expression of type tarray with a numeric type. |
BooleanExpression |
Expression of type tbool . |
CallExpression |
Expression of type tcall . |
CollectionExpression |
Expression of type tarray or tset . |
DictExpression |
Expression of type tdict . |
IntervalExpression |
Expression of type tinterval . |
LocusExpression |
Expression of type tlocus . |
NumericExpression |
Expression of numeric type. |
Int32Expression |
Expression of type tint32 . |
Int64Expression |
Expression of type tint64 . |
Float32Expression |
Expression of type tfloat32 . |
Float64Expression |
Expression of type tfloat64 . |
SetExpression |
Expression of type tset . |
StringExpression |
Expression of type tstr . |
StructExpression |
Expression of type tstruct . |
-
class
hail.expr.expressions.
Expression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Base class for Hail expressions.
-
__eq__
(other)[source]¶ Returns
True
if the two expressions are equal.Examples
>>> x = hl.literal(5) >>> y = hl.literal(5) >>> z = hl.literal(1)
>>> (x == y).value True
>>> (x == z).value False
Notes
This method will fail with an error if the two expressions are not of comparable types.
Parameters: other ( Expression
) – Expression for equality comparison.Returns: BooleanExpression
–True
if the two expressions are equal.
-
__ne__
(other)[source]¶ Returns
True
if the two expressions are not equal.Examples
>>> x = hl.literal(5) >>> y = hl.literal(5) >>> z = hl.literal(1)
>>> (x != y).value False
>>> (x != z).value True
Notes
This method will fail with an error if the two expressions are not of comparable types.
Parameters: other ( Expression
) – Expression for inequality comparison.Returns: BooleanExpression
–True
if the two expressions are not equal.
-
collect
()[source]¶ Collect all records of an expression into a local list.
Examples
Collect all the values from C1:
>>> table1.C1.collect() [2, 2, 10, 11]
Warning
Extremely experimental.
Warning
The list of records may be very large.
Returns: list
-
show
(n=10, width=90, truncate=None, types=True)[source]¶ Print the first few rows of the table to the console.
Examples
>>> table1.SEX.show() +-------+-----+ | ID | SEX | +-------+-----+ | int32 | str | +-------+-----+ | 1 | M | | 2 | M | | 3 | F | | 4 | F | +-------+-----+
>>> hl.literal(123).show() +--------+ | <expr> | +--------+ | int32 | +--------+ | 123 | +--------+
Warning
Extremely experimental.
Parameters:
- n (int) – Maximum number of rows to show.
- width (int) – Horizontal width at which to break columns.
- truncate (int, optional) – Truncate each field to the given number of characters. If None, truncate fields to the given width.
- types (bool) – Print an extra header line with the type of each field.
-
take
(n)[source]¶ Collect the first n records of an expression.
Examples
Take the first three rows:
>>> table1.X.take(3) [5, 6, 7]
Warning
Extremely experimental.
Parameters: n (int) – Number of records to take. Returns: list
-
value
¶ Evaluate this expression.
Notes
This expression must have no indices, but can refer to the globals of a hail.Table or hail.MatrixTable.
Returns: The value of this expression.
-
-
class
hail.expr.expressions.
ArrayExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.CollectionExpression
Expression of type
tarray
.>>> names = hl.literal(['Alice', 'Bob', 'Charlie'])
See also
-
__getitem__
(item)[source]¶ Index into or slice the array.
Examples
Index with a single integer:
>>> names[1].value 'Bob'
>>> names[-1].value 'Charlie'
Slicing is also supported:
>>> names[1:].value ['Bob', 'Charlie']
Parameters: item (slice or Expression
of typetint32
) – Index or slice.Returns: Expression
– Element or array slice.
-
append
(item)[source]¶ Append an element to the array and return the result.
Examples
>>> names.append('Dan').value ['Alice', 'Bob', 'Charlie', 'Dan']
Note
This method does not mutate the caller, but instead returns a new array by copying the caller and adding item.
Parameters: item ( Expression
) – Element to append, same type as the array element type.Returns: ArrayExpression
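The non-mutating behaviour described in the note can be pictured with plain Python lists. This is only an analogy and not Hail code: `append_copy` is a hypothetical helper that returns a new list and leaves its input untouched, much as `append` returns a new array expression.

```python
def append_copy(values, item):
    """Return a new list with item added; the input list is not modified."""
    return values + [item]

names_list = ['Alice', 'Bob', 'Charlie']
extended = append_copy(names_list, 'Dan')

print(extended)    # ['Alice', 'Bob', 'Charlie', 'Dan']
print(names_list)  # ['Alice', 'Bob', 'Charlie']
```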
-
contains
(item)[source]¶ Returns a boolean indicating whether item is found in the array.
Examples
>>> names.contains('Charlie').value True
>>> names.contains('Helen').value False
Parameters: item ( Expression
) – Item for inclusion test.Warning
This method takes time proportional to the length of the array. If a pipeline uses this method on the same array several times, it may be more efficient to convert the array to a set first (
set()
).Returns: BooleanExpression
–True
if the element is found in the array,False
otherwise.
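The warning above has a direct analogue in plain Python, sketched here outside of Hail. Membership tests on a list scan the elements one by one, while a set built once can answer repeated lookups without rescanning. The names `names_list`, `names_set`, and `queries` are illustrative only.

```python
names_list = ['Alice', 'Bob', 'Charlie']
queries = ['Charlie', 'Helen', 'Bob']

# Each `in` test on a list is a linear scan over the elements.
hits_from_list = [q in names_list for q in queries]

# Build the set once, then reuse it for every lookup.
names_set = set(names_list)
hits_from_set = [q in names_set for q in queries]

print(hits_from_list)  # [True, False, True]
print(hits_from_set)   # [True, False, True]
```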
-
extend
(a)[source]¶ Concatenate two arrays and return the result.
Examples
>>> names.extend(['Dan', 'Edith']).value ['Alice', 'Bob', 'Charlie', 'Dan', 'Edith']
Parameters: a ( ArrayExpression
) – Array to concatenate, same type as the callee.Returns: ArrayExpression
-
-
class
hail.expr.expressions.
ArrayNumericExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.ArrayExpression
Expression of type tarray with a numeric type.
Numeric arrays support arithmetic both with scalar values and other arrays. Arithmetic between two numeric arrays requires that the length of each array is identical, and will apply the operation positionally (a1 * a2 will multiply the first element of a1 by the first element of a2, the second element of a1 by the second element of a2, and so on). Arithmetic with a scalar will apply the operation to each element of the array.
>>> a1 = hl.literal([0, 1, 2, 3, 4, 5])
>>> a2 = hl.literal([1, -1, 1, -1, 1, -1])
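The positional rule can be sketched in plain Python, independent of Hail. The helper `positional` is hypothetical; it models only the semantics described above (element-wise for two equal-length arrays, broadcast for a scalar). Raising `ValueError` on a length mismatch is an assumption of this sketch, not a statement about how Hail reports the error.

```python
import operator

def positional(op, left, right):
    """Apply op element-wise; a scalar on the right is applied to every element."""
    if isinstance(right, list):
        if len(left) != len(right):
            raise ValueError("arrays must have the same length")
        return [op(l, r) for l, r in zip(left, right)]
    return [op(l, right) for l in left]

a1_values = [0, 1, 2, 3, 4, 5]
a2_values = [1, -1, 1, -1, 1, -1]

print(positional(operator.mul, a1_values, a2_values))  # [0, -1, 2, -3, 4, -5]
print(positional(operator.add, a1_values, 5))          # [5, 6, 7, 8, 9, 10]
```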
-
__add__
(other)[source]¶ Positionally add an array or a scalar.
Examples
>>> (a1 + 5).value [5, 6, 7, 8, 9, 10]
>>> (a1 + a2).value [1, 0, 3, 2, 5, 4]
Parameters: other ( NumericExpression
orArrayNumericExpression
) – Value or array to add.Returns: ArrayNumericExpression
– Array of positional sums.
-
__floordiv__
(other)[source]¶ Positionally divide by an array or a scalar using floor division.
Examples
>>> (a1 // 2).value [0, 0, 1, 1, 2, 2]
Parameters: other ( NumericExpression
orArrayNumericExpression
)Returns: ArrayNumericExpression
-
__mod__
(other)[source]¶ Positionally compute the left modulo the right.
Examples
>>> (a1 % 2).value [0, 1, 0, 1, 0, 1]
Parameters: other ( NumericExpression
orArrayNumericExpression
)Returns: ArrayNumericExpression
-
__mul__
(other)[source]¶ Positionally multiply by an array or a scalar.
Examples
>>> (a2 * 5).value [5, -5, 5, -5, 5, -5]
>>> (a1 * a2).value [0, -1, 2, -3, 4, -5]
Parameters: other ( NumericExpression
orArrayNumericExpression
) – Value or array to multiply by.Returns: ArrayNumericExpression
– Array of positional products.
-
__neg__
()[source]¶ Negate elements of the array.
Examples
>>> (-a1).value [0, -1, -2, -3, -4, -5]
Returns: ArrayNumericExpression
– Array expression of the same type.
-
__pow__
(other)[source]¶ Positionally raise to the power of an array or a scalar.
Examples
>>> (a1 ** 2).value [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
>>> (a1 ** a2).value [0.0, 1.0, 2.0, 0.3333333333333333, 4.0, 0.2]
Parameters: other ( NumericExpression
orArrayNumericExpression
)Returns: ArrayNumericExpression
-
__sub__
(other)[source]¶ Positionally subtract an array or a scalar.
Examples
>>> (a2 - 1).value [0, -2, 0, -2, 0, -2]
>>> (a1 - a2).value [-1, 2, 1, 4, 3, 6]
Parameters: other ( NumericExpression
orArrayNumericExpression
) – Value or array to subtract.Returns: ArrayNumericExpression
– Array of positional differences.
-
-
class
hail.expr.expressions.
BooleanExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.NumericExpression
Expression of type
tbool
.>>> t = hl.literal(True) >>> f = hl.literal(False) >>> na = hl.null(hl.tbool)
>>> t.value True
>>> f.value False
>>> na.value None
-
__and__
(other)[source]¶ Return
True
if the left and right arguments areTrue
.Examples
>>> (t & f).value False
>>> (t & na).value None
>>> (f & na).value False
The & and | operators have higher priority than comparison operators like ==, <, or >. Parentheses are often necessary:
>>> x = hl.literal(5)
>>> ((x < 10) & (x > 2)).value True
Parameters: other ( BooleanExpression
) – Right-side operand.Returns: BooleanExpression
–True
if both left and right areTrue
.
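The precedence rule comes from Python itself, so it can be seen with ordinary integers. The snippet below is a plain-Python illustration and does not use Hail expressions: without parentheses, `&` is evaluated before the comparisons, which changes the meaning of the expression.

```python
x = 5

# Without parentheses, `&` binds first: this is the chained comparison
# x < (10 & x) > 2, and 10 & 5 == 0, so the result is False.
unparenthesized = x < 10 & x > 2

# With parentheses, each comparison is evaluated before `&`.
parenthesized = (x < 10) & (x > 2)

print(unparenthesized)  # False
print(parenthesized)    # True
```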
-
__invert__
()[source]¶ Return the boolean negation.
Examples
>>> (~t).value False
>>> (~f).value True
>>> (~na).value None
Returns: BooleanExpression
– Boolean negation.
-
__or__
(other)[source]¶ Return
True
if at least one of the left and right arguments isTrue
.Examples
>>> (t | f).value True
>>> (t | na).value True
>>> (f | na).value None
The & and | operators have higher priority than comparison operators like ==, <, or >. Parentheses are often necessary:
>>> x = hl.literal(5)
>>> ((x < 10) | (x > 20)).value True
Parameters: other ( BooleanExpression
) – Right-side operand.Returns: BooleanExpression
–True
if either left or right isTrue
.
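The examples for &, ~, and | above behave like three-valued logic, with missing values shown as None. The following plain-Python sketch reproduces those example results; `and3`, `or3`, and `not3` are hypothetical helpers, and the sketch models only the documented outcomes rather than Hail's implementation.

```python
def and3(left, right):
    """Three-valued AND: False wins over missing (None)."""
    if left is False or right is False:
        return False
    if left is None or right is None:
        return None
    return True

def or3(left, right):
    """Three-valued OR: True wins over missing (None)."""
    if left is True or right is True:
        return True
    if left is None or right is None:
        return None
    return False

def not3(value):
    """Three-valued NOT: missing stays missing."""
    return None if value is None else not value

print(and3(True, None), and3(False, None))  # None False
print(or3(True, None), or3(False, None))    # True None
print(not3(None))                           # None
```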
-
-
class
hail.expr.expressions.
CallExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tcall
.>>> call = hl.call(0, 1, phased=False)
-
__getitem__
(item)[source]¶ Get the i-th allele.
Examples
Index with a single integer:
>>> call[0].value 0
>>> call[1].value 1
Parameters: item (int or Expression
of typetint32
) – Allele index.Returns: Expression
of typetint32
-
is_diploid
()[source]¶ True if the call has ploidy equal to 2.
Examples
>>> call.is_diploid().value True
Returns: BooleanExpression
-
is_haploid
()[source]¶ True if the call has ploidy equal to 1.
Examples
>>> call.is_haploid().value False
Returns: BooleanExpression
-
is_het
()[source]¶ Evaluate whether the call includes two different alleles.
Examples
>>> call.is_het().value True
Returns: BooleanExpression
–True
if the two alleles are different,False
if they are the same.
-
is_het_nonref
()[source]¶ Evaluate whether the call includes two different alleles, neither of which is reference.
Examples
>>> call.is_het_nonref().value False
Returns: BooleanExpression
–True
if the call includes two different alternate alleles,False
otherwise.
-
is_het_ref
()[source]¶ Evaluate whether the call includes two different alleles, one of which is reference.
Examples
>>> call.is_het_ref().value True
Returns: BooleanExpression
–True
if the call includes one reference and one alternate allele,False
otherwise.
-
is_hom_ref
()[source]¶ Evaluate whether the call includes two reference alleles.
Examples
>>> call.is_hom_ref().value False
Returns: BooleanExpression
–True
if the call includes two reference alleles,False
otherwise.
-
is_hom_var
()[source]¶ Evaluate whether the call includes two identical alternate alleles.
Examples
>>> call.is_hom_var().value False
Returns: BooleanExpression
–True
if the call includes two identical alternate alleles,False
otherwise.
-
is_non_ref
()[source]¶ Evaluate whether the call includes one or more non-reference alleles.
Examples
>>> call.is_non_ref().value True
Returns: BooleanExpression
–True
if at least one allele is non-reference,False
otherwise.
-
n_alt_alleles
()[source]¶ Returns the number of non-reference alleles.
Examples
>>> call.n_alt_alleles().value 1
Returns: Expression
of typetint32
– The number of non-reference alleles.
-
one_hot_alleles
(alleles)[source]¶ Returns an array containing the summed one-hot encoding of the alleles.
Examples
>>> call.one_hot_alleles(['A', 'T']).value [1, 1]
This one-hot representation is the positional sum of the one-hot encoding for each called allele. For a biallelic variant, the one-hot encoding for a reference allele is [1, 0] and the one-hot encoding for an alternate allele is [0, 1]. Diploid calls would produce the following arrays: [2, 0] for homozygous reference, [1, 1] for heterozygous, and [0, 2] for homozygous alternate.
Parameters: alleles (ArrayStringExpression) – Variant alleles.
Returns: ArrayInt32Expression
– An array of summed one-hot encodings of allele indices.
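A summed one-hot encoding amounts to counting how often each allele index occurs in the call. The plain-Python sketch below illustrates that idea; `summed_one_hot` is a hypothetical helper that takes allele indices and the number of alleles directly, rather than a Hail call and an allele array.

```python
def summed_one_hot(allele_indices, n_alleles):
    """Count how many times each allele index appears in a call."""
    counts = [0] * n_alleles
    for index in allele_indices:
        counts[index] += 1
    return counts

print(summed_one_hot([0, 0], 2))  # [2, 0] homozygous reference
print(summed_one_hot([0, 1], 2))  # [1, 1] heterozygous
print(summed_one_hot([1, 1], 2))  # [0, 2] homozygous alternate
```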
-
phased
¶ True if the call is phased.
Examples
>>> call.phased.value False
Returns: BooleanExpression
-
ploidy
¶ Return the number of alleles of this call.
Examples
>>> call.ploidy.value 2
Returns: Expression
of typetint32
-
unphased_diploid_gt_index
()[source]¶ Return the genotype index for unphased, diploid calls.
Examples
>>> call.unphased_diploid_gt_index().value 1
Returns: Expression
of typetint32
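One common numbering of unphased diploid genotypes is the ordering used by the VCF specification, in which a genotype with allele indices j <= k has index k(k+1)/2 + j. The sketch below assumes this convention applies here; the example above (call 0/1 giving 1) is consistent with it but does not by itself confirm it. `gt_index` is a hypothetical helper.

```python
def gt_index(allele_a, allele_b):
    """Triangular genotype index for an unphased pair of allele indices."""
    j, k = sorted((allele_a, allele_b))
    return k * (k + 1) // 2 + j

print(gt_index(0, 0))  # 0
print(gt_index(0, 1))  # 1
print(gt_index(1, 1))  # 2
print(gt_index(0, 2))  # 3
```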
-
-
class
hail.expr.expressions.
CollectionExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tarray
or tset.
>>> a = hl.literal([1, 2, 3, 4, 5])
>>> s3 = hl.literal({'Alice', 'Bob', 'Charlie'})
-
all
(f)[source]¶ Returns
True
if f returnsTrue
for every element.Examples
>>> a.all(lambda x: x < 10).value True
Notes
This method returns
True
if the collection is empty.Parameters: f (function ( (arg) -> BooleanExpression
)) – Function to evaluate for each element of the collection. Must return aBooleanExpression
.Returns: BooleanExpression
. –True
if f returnsTrue
for every element,False
otherwise.
-
any
(f)[source]¶ Returns
True
if f returnsTrue
for any element.Examples
>>> a.any(lambda x: x % 2 == 0).value True
>>> s3.any(lambda x: x[0] == 'D').value False
Notes
This method always returns
False
for empty collections.Parameters: f (function ( (arg) -> BooleanExpression
)) – Function to evaluate for each element of the collection. Must return aBooleanExpression
.Returns: BooleanExpression
. –True
if f returnsTrue
for any element,False
otherwise.
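The empty-collection behaviour noted for all and any matches Python's built-in `all()` and `any()`, which makes for a quick plain-Python illustration. Nothing here calls Hail; the variable names are illustrative.

```python
empty = []
values = [1, 2, 3, 4, 5]

# all() over no elements is vacuously True; any() over no elements is False.
all_on_empty = all(x < 10 for x in empty)
any_on_empty = any(x % 2 == 0 for x in empty)
print(all_on_empty, any_on_empty)  # True False

all_on_values = all(x < 10 for x in values)
any_on_values = any(x % 2 == 0 for x in values)
print(all_on_values, any_on_values)  # True True
```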
-
filter
(f)[source]¶ Returns a new collection containing elements where f returns
True
.Examples
>>> a.filter(lambda x: x % 2 == 0).value [2, 4]
>>> s3.filter(lambda x: ~(x[-1] == 'e')).value {'Bob'}
Notes
Returns a same-type expression; evaluated on a SetExpression, returns a SetExpression. Evaluated on an ArrayExpression, returns an ArrayExpression.
Parameters: f (function ( (arg) -> BooleanExpression
)) – Function to evaluate for each element of the collection. Must return aBooleanExpression
.Returns: CollectionExpression
– Expression of the same type as the callee.
-
find
(f)[source]¶ Returns the first element where f returns
True
.Examples
>>> a.find(lambda x: x ** 2 > 20).value 5
>>> s3.find(lambda x: x[0] == 'D').value None
Notes
If f returns
False
for every element, then the result is missing.Parameters: f (function ( (arg) -> BooleanExpression
)) – Function to evaluate for each element of the collection. Must return aBooleanExpression
.Returns: Expression
– Expression whose type is the element type of the collection.
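A plain-Python counterpart of find is "first match or None", sketched below with the hypothetical helper `find_first`. Lists are used for both inputs because Python sets have no defined iteration order; the sketch only mirrors the documented results, including None standing in for a missing value.

```python
def find_first(values, predicate):
    """Return the first element satisfying predicate, or None if there is none."""
    return next((v for v in values if predicate(v)), None)

print(find_first([1, 2, 3, 4, 5], lambda x: x ** 2 > 20))              # 5
print(find_first(['Alice', 'Bob', 'Charlie'], lambda x: x[0] == 'D'))  # None
```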
-
flatmap
(f)[source]¶ Map each element of the collection to a new collection, and flatten the results.
Examples
>>> a.flatmap(lambda x: hl.range(0, x)).value [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]
>>> s3.flatmap(lambda x: hl.set(hl.range(0, x.length()).map(lambda i: x[i]))).value {'A', 'B', 'C', 'a', 'b', 'c', 'e', 'h', 'i', 'l', 'o', 'r'}
Parameters: f (function ( (arg) -> CollectionExpression
)) – Function from the element type of the collection to the type of the collection. For instance, flatmap on a set<str> should take a str and return a set.
Returns: CollectionExpression
-
group_by
(f)[source]¶ Group elements into a dict according to a lambda function.
Examples
>>> a.group_by(lambda x: x % 2 == 0).value {False: [1, 3, 5], True: [2, 4]}
>>> s3.group_by(lambda x: x.length()).value {3: {'Bob'}, 5: {'Alice'}, 7: {'Charlie'}}
Parameters: f (function ( (arg) -> Expression
)) – Function to evaluate for each element of the collection to produce a key for the resulting dictionary.Returns: DictExpression
. – Dictionary keyed by results of f.
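Grouping by a key function can be sketched in plain Python as follows. `group_by` here is a hypothetical stand-alone function operating on a list, not the Hail method, and it always collects groups into lists.

```python
from collections import defaultdict

def group_by(values, key_func):
    """Group elements into a dict keyed by key_func(element)."""
    groups = defaultdict(list)
    for v in values:
        groups[key_func(v)].append(v)
    return dict(groups)

print(group_by([1, 2, 3, 4, 5], lambda x: x % 2 == 0))
# {False: [1, 3, 5], True: [2, 4]}
```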
-
length
()[source]¶ Returns the size of a collection.
Examples
>>> a.length().value 5
>>> s3.length().value 3
Returns: Expression
of typetint32
– The number of elements in the collection.
-
map
(f)[source]¶ Transform each element of a collection.
Examples
>>> a.map(lambda x: x ** 3).value [1.0, 8.0, 27.0, 64.0, 125.0]
>>> s3.map(lambda x: x.length()).value {3, 5, 7}
Parameters: f (function ( (arg) -> Expression
)) – Function to transform each element of the collection.Returns: CollectionExpression
. – Collection where each element has been transformed according to f.
-
size
()[source]¶ Returns the size of a collection.
Examples
>>> a.size().value 5
>>> s3.size().value 3
Returns: Expression
of typetint32
– The number of elements in the collection.
-
-
class
hail.expr.expressions.
DictExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tdict
.>>> d = hl.literal({'Alice': 43, 'Bob': 33, 'Charles': 44})
-
__getitem__
(item)[source]¶ Get the value associated with key item.
Examples
>>> d['Alice'].value 43
Notes
Raises an error if item is not a key of the dictionary. Use
DictExpression.get()
to return missing instead of an error.Parameters: item ( Expression
) – Key expression.Returns: Expression
– Value associated with key item.
-
contains
(item)[source]¶ Returns whether a given key is present in the dictionary.
Examples
>>> d.contains('Alice').value True
>>> d.contains('Anne').value False
Parameters: item ( Expression
) – Key to test for inclusion.Returns: BooleanExpression
–True
if item is a key of the dictionary,False
otherwise.
-
get
(item, default=None)[source]¶ Returns the value associated with key item, or a default value if that key is not present.
Examples
>>> d.get('Alice').value 43
>>> d.get('Anne').value None
>>> d.get('Anne', 0).value 0
Parameters:
- item (Expression) – Key.
- default (Expression) – Default value. Must be same type as dictionary values.
Returns: Expression – The value associated with item, or default.
-
key_set
()[source]¶ Returns the set of keys in the dictionary.
Examples
>>> d.key_set().value {'Alice', 'Bob', 'Charles'}
Returns: SetExpression
– Set of all keys.
-
keys
()[source]¶ Returns an array with all keys in the dictionary.
Examples
>>> d.keys().value ['Bob', 'Charles', 'Alice']
Returns: ArrayExpression
– Array of all keys.
-
map_values
(f)[source]¶ Transform values of the dictionary according to a function.
Examples
>>> d.map_values(lambda x: x * 10).value {'Alice': 430, 'Bob': 330, 'Charles': 440}
Parameters: f (function ( (arg) -> Expression
)) – Function to apply to each value.Returns: DictExpression
– Dictionary with transformed values.
-
size
()[source]¶ Returns the size of the dictionary.
Examples
>>> d.size().value 3
Returns: Expression
of typetint32
– Size of the dictionary.
-
values
()[source]¶ Returns an array with all values in the dictionary.
Examples
>>> d.values().value [33, 44, 43]
Returns: ArrayExpression
– All values in the dictionary.
-
-
class
hail.expr.expressions.
IntervalExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tinterval
.>>> interval = hl.interval(3, 11) >>> locus_interval = hl.parse_locus_interval("1:53242-90543")
-
contains
(value)[source]¶ Tests whether a value is contained in the interval.
Examples
>>> interval.contains(3).value True
>>> interval.contains(11).value False
Parameters: value – Object with type matching the interval point type. Returns: BooleanExpression
–True
if value is contained in the interval,False
otherwise.
-
end
¶ Returns the end point.
Examples
>>> interval.end.value 11
Returns: Expression
-
includes_end
¶ True if the interval includes the end point.
Examples
>>> interval.includes_end.value False
Returns: BooleanExpression
-
includes_start
¶ True if the interval includes the start point.
Examples
>>> interval.includes_start.value True
Returns: BooleanExpression
-
overlaps
(interval)[source]¶ True if the supplied interval contains any value in common with this one.
Examples
>>> interval.overlaps(hl.interval(5, 9)).value True
>>> interval.overlaps(hl.interval(11, 20)).value False
Parameters: interval ( Expression
with typetinterval
) – Interval object with the same point type.Returns: BooleanExpression
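The contains and overlaps examples can be modelled with a small plain-Python class. `SimpleInterval` is hypothetical; its defaults (start included, end excluded) are an assumption chosen to match the includes_start and includes_end values shown for `interval` on this page, and the sketch is not Hail's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimpleInterval:
    start: int
    end: int
    includes_start: bool = True
    includes_end: bool = False

    def contains(self, value):
        after_start = value > self.start or (self.includes_start and value == self.start)
        before_end = value < self.end or (self.includes_end and value == self.end)
        return after_start and before_end

    def _ends_before(self, other):
        """True if this interval lies entirely before other."""
        if self.end != other.start:
            return self.end < other.start
        # Touching endpoints share a value only if both sides include it.
        return not (self.includes_end and other.includes_start)

    def overlaps(self, other):
        return not self._ends_before(other) and not other._ends_before(self)

iv = SimpleInterval(3, 11)
print(iv.contains(3), iv.contains(11))      # True False
print(iv.overlaps(SimpleInterval(5, 9)))    # True
print(iv.overlaps(SimpleInterval(11, 20)))  # False
```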
-
start
¶ Returns the start point.
Examples
>>> interval.start.value 3
Returns: Expression
-
-
class
hail.expr.expressions.
LocusExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tlocus
.>>> locus = hl.locus('1', 1034245)
-
contig
¶ Returns the chromosome.
Examples
>>> locus.contig.value '1'
Returns: StringExpression
– The chromosome for this locus.
-
global_position
()[source]¶ Returns a zero-indexed absolute position along the reference genome.
The global position is computed as position - 1 plus the sum of the lengths of all the contigs that precede this locus’s contig in the reference genome’s ordering of contigs.
See also locus_from_global_position().
Examples
A locus with position 1 along chromosome 1 will have a global position of 0 along the reference genome GRCh37.
>>> hl.locus('1', 1).global_position().value 0
A locus with position 1 along chromosome 2 will have a global position of (1-1) + 249250621, where 249250621 is the length of chromosome 1 on GRCh37.
>>> hl.locus('2', 1).global_position().value 249250621
A different reference genome than the default results in a different global position.
>>> hl.locus('chr2', 1, 'GRCh38').global_position().value 248956422
Returns: Expression
of typetint64
– Global base position of locus along the reference genome.
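The arithmetic behind the global position can be written out in plain Python. This sketch uses a hypothetical helper, `global_position`, that takes the contig ordering and lengths as explicit arguments; only the GRCh37 chromosome 1 length quoted above is supplied, since lengths of later contigs are not needed for these two loci.

```python
def global_position(contig, position, contig_order, contig_lengths):
    """Zero-indexed offset: (position - 1) plus lengths of all preceding contigs."""
    preceding = contig_order[:contig_order.index(contig)]
    return (position - 1) + sum(contig_lengths[c] for c in preceding)

# Only chromosome 1's GRCh37 length (quoted in the text above) is needed here.
order = ['1', '2']
lengths = {'1': 249250621}

print(global_position('1', 1, order, lengths))  # 0
print(global_position('2', 1, order, lengths))  # 249250621
```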
-
in_autosome
()[source]¶ Returns
True
if the locus is on an autosome.Notes
All contigs are considered autosomal except those designated as X, Y, or MT by
ReferenceGenome
.Examples
>>> locus.in_autosome().value True
Returns: BooleanExpression
-
in_autosome_or_par
()[source]¶ Returns
True
if the locus is on an autosome or a pseudoautosomal region of chromosome X or Y.Examples
>>> locus.in_autosome_or_par().value True
Returns: BooleanExpression
-
in_mito
()[source]¶ Returns
True
if the locus is on mitochondrial DNA.Examples
>>> locus.in_mito().value False
Returns: BooleanExpression
-
in_x_nonpar
()[source]¶ Returns
True
if the locus is in a non-pseudoautosomal region of chromosome X.Examples
>>> locus.in_x_nonpar().value False
Returns: BooleanExpression
-
in_x_par
()[source]¶ Returns
True
if the locus is in a pseudoautosomal region of chromosome X.Examples
>>> locus.in_x_par().value False
Returns: BooleanExpression
-
in_y_nonpar
()[source]¶ Returns
True
if the locus is in a non-pseudoautosomal region of chromosome Y.Examples
>>> locus.in_y_nonpar().value False
Note
Many variant callers only generate variants on chromosome X for the pseudoautosomal region. In this case, all loci mapped to chromosome Y are non-pseudoautosomal.
Returns: BooleanExpression
-
in_y_par
()[source]¶ Returns
True
if the locus is in a pseudoautosomal region of chromosome Y.Examples
>>> locus.in_y_par().value False
Note
Many variant callers only generate variants on chromosome X for the pseudoautosomal region. In this case, all loci mapped to chromosome Y are non-pseudoautosomal.
Returns: BooleanExpression
-
position
¶ Returns the position along the chromosome.
Examples
>>> locus.position.value 1034245
Returns: Expression
of typetint32
– This locus’s position along its chromosome.
-
sequence_context
(before=0, after=0)[source]¶ Return the reference genome sequence at the locus.
Examples
Get the reference allele at a locus:
>>> locus.sequence_context().value "G"
Get the reference sequence at a locus including the previous 5 bases:
>>> locus.sequence_context(before=5).value "ACTCGG"
Notes
This function requires that this locus’ reference genome has an attached reference sequence. Use
ReferenceGenome.add_sequence()
to load and attach a reference sequence to a reference genome.
Parameters:
- before (Expression of type tint32, optional) – Number of bases to include before the locus. Truncates at contig boundary.
- after (Expression of type tint32, optional) – Number of bases to include after the locus. Truncates at contig boundary.
Returns: StringExpression
-
-
class
hail.expr.expressions.
NumericExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of numeric type.
>>> x = hl.literal(3)
>>> y = hl.literal(4.5)
-
__add__
(other)[source]¶ Add two numbers.
Examples
>>> (x + 2).value 5
>>> (x + y).value 7.5
Parameters: other ( NumericExpression
) – Number to add.Returns: NumericExpression
– Sum of the two numbers.
-
__floordiv__
(other)[source]¶ Divide two numbers with floor division.
Examples
>>> (x // 2).value 1
>>> (y // 2).value 2.0
Parameters: other ( NumericExpression
) – Dividend.Returns: NumericExpression
– The floor of the left number divided by the right.
-
__ge__
(other)[source]¶ Greater-than-or-equals comparison.
Examples
>>> (y >= 4).value True
Parameters: other ( NumericExpression
) – Right side for comparison.Returns: BooleanExpression
–True
if the left side is greater than or equal to the right side.
-
__gt__
(other)[source]¶ Greater-than comparison.
Examples
>>> (y > 4).value True
Parameters: other ( NumericExpression
) – Right side for comparison.Returns: BooleanExpression
–True
if the left side is greater than the right side.
-
__le__
(other)[source]¶ Less-than-or-equals comparison.
Examples
>>> (x <= 3).value True
Parameters: other ( NumericExpression
) – Right side for comparison.Returns: BooleanExpression
–True
if the left side is smaller than or equal to the right side.
-
__lt__
(other)[source]¶ Less-than comparison.
Examples
>>> (x < 5).value True
Parameters: other ( NumericExpression
) – Right side for comparison.Returns: BooleanExpression
–True
if the left side is smaller than the right side.
-
__mod__
(other)[source]¶ Compute the left modulo the right number.
Examples
>>> (32 % x).value 2
>>> (7 % y).value 2.5
Parameters: other ( NumericExpression
) – Divisor. Returns: NumericExpression
– Remainder after dividing the left by the right.
-
__mul__
(other)[source]¶ Multiply two numbers.
Examples
>>> (x * 2).value 6
>>> (x * y).value 13.5
Parameters: other ( NumericExpression
) – Number to multiply.Returns: NumericExpression
– Product of the two numbers.
-
__neg__
()[source]¶ Negate the number (multiply by -1).
Examples
>>> (-x).value -3
Returns: NumericExpression
– Negated number.
-
__pow__
(power, modulo=None)[source]¶ Raise the left to the right power.
Examples
>>> (x ** 2).value 9.0
>>> (x ** -2).value 0.1111111111111111
>>> (y ** 1.5).value 9.545941546018392
Parameters:
- power (NumericExpression)
- modulo – Unsupported argument.
Returns: Expression of type tfloat64 – Result of raising left to the right power.
-
__sub__
(other)[source]¶ Subtract the right number from the left.
Examples
>>> (x - 2).value 1
>>> (x - y).value -1.5
Parameters: other ( NumericExpression
) – Number to subtract.Returns: NumericExpression
– Difference of the two numbers.
-
-
class
hail.expr.expressions.
Int32Expression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.NumericExpression
Expression of type
tint32
.
-
class
hail.expr.expressions.
Int64Expression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.NumericExpression
Expression of type
tint64
.
-
class
hail.expr.expressions.
Float32Expression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.NumericExpression
Expression of type
tfloat32
.
-
class
hail.expr.expressions.
Float64Expression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.NumericExpression
Expression of type
tfloat64
.
-
class
hail.expr.expressions.
SetExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.typed_expressions.CollectionExpression
Expression of type
tset
.>>> s1 = hl.literal({1, 2, 3}) >>> s2 = hl.literal({1, 3, 5})
See also
-
add
(item)[source]¶ Returns a new set including item.
Examples
>>> s1.add(10).value {1, 2, 3, 10}
Parameters: item ( Expression
) – Value to add.Returns: SetExpression
– Set with item added.
-
contains
(item)[source]¶ Returns
True
if item is in the set.Examples
>>> s1.contains(1).value True
>>> s1.contains(10).value False
Parameters: item ( Expression
) – Value for inclusion test.Returns: BooleanExpression
–True
if item is in the set.
-
difference
(s)[source]¶ Return the set of elements in the set that are not present in set s.
Examples
>>> s1.difference(s2).value {2}
>>> s2.difference(s1).value {5}
Parameters: s ( SetExpression
) – Set expression of the same type.Returns: SetExpression
– Set of elements not in s.
-
intersection
(s)[source]¶ Return the intersection of the set and set s.
Examples
>>> s1.intersection(s2).value {1, 3}
Parameters: s ( SetExpression
) – Set expression of the same type.Returns: SetExpression
– Set of elements present in both sets.
-
is_subset
(s)[source]¶ Returns
True
if every element is contained in set s.Examples
>>> s1.is_subset(s2).value False
>>> s1.remove(2).is_subset(s2).value True
Parameters: s ( SetExpression
) – Set expression of the same type.Returns: BooleanExpression
–True
if every element is contained in set s.
-
remove
(item)[source]¶ Returns a new set excluding item.
Examples
>>> s1.remove(1).value {2, 3}
Parameters: item ( Expression
) – Value to remove.Returns: SetExpression
– Set with item removed.
-
union
(s)[source]¶ Return the union of the set and set s.
Examples
>>> s1.union(s2).value {1, 2, 3, 5}
Parameters: s ( SetExpression
) – Set expression of the same type.Returns: SetExpression
– Set of elements present in either set.
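Each of the set methods above returns a new set and leaves s1 and s2 unchanged. As a rough analogy only, the same results can be reproduced with Python's built-in frozenset operators. The snippet below is plain Python, not Hail API calls, and the variable names are illustrative.

```python
# Plain-Python analogy for the SetExpression examples above.
# These are built-in frozenset operations, NOT Hail calls.
s1 = frozenset({1, 2, 3})
s2 = frozenset({1, 3, 5})

added = s1 | {10}            # analogous to s1.add(10)
removed = s1 - {1}           # analogous to s1.remove(1)
diff = s1 - s2               # analogous to s1.difference(s2)
common = s1 & s2             # analogous to s1.intersection(s2)
either = s1 | s2             # analogous to s1.union(s2)
subset = (s1 - {2}) <= s2    # analogous to s1.remove(2).is_subset(s2)

print(sorted(added), sorted(removed), sorted(diff))
print(sorted(common), sorted(either), subset)
```

Because frozenset is immutable, every operator here produces a fresh object, which mirrors the "returns a new set" wording in the method descriptions.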
-
-
class
hail.expr.expressions.
StringExpression
(ast: hail.expr.expr_ast.AST, type: hail.expr.types.HailType, indices: hail.expr.expressions.indices.Indices = Indices(axes=set(), source=None), aggregations: hail.utils.linkedlist.LinkedList = List())[source]¶ Bases:
hail.expr.expressions.base_expression.Expression
Expression of type
tstr
.>>> s = hl.literal('The quick brown fox')
-
__add__
(other)[source]¶ Concatenate strings.
Examples
>>> (s + ' jumped over the lazy dog').value 'The quick brown fox jumped over the lazy dog'
Parameters: other ( StringExpression
) – String to concatenate.Returns: StringExpression
– Concatenated string.
-
__getitem__
(item)[source]¶ Slice or index into the string.
Examples
>>> s[:15].value 'The quick brown'
>>> s[0].value 'T'
Parameters: item (slice or Expression
of type tint32
) – Slice or character index.Returns: StringExpression
– Substring or character at index item.
-
contains
(substr)[source]¶ Returns whether substr is contained in the string.
Examples
>>> s.contains('fox').value True
>>> s.contains('dog').value False
Note
This method is case-sensitive.
Parameters: substr ( StringExpression
)Returns: BooleanExpression
-
endswith
(substr)[source]¶ Returns whether substr is a suffix of the string.
Examples
>>> s.endswith('fox').value True
Note
This method is case-sensitive.
Parameters: substr ( StringExpression
)Returns: BooleanExpression
-
first_match_in
(regex)[source]¶ Returns an array containing the capture groups of the first match of regex in the given character sequence.
Examples
>>> s.first_match_in("The quick (\w+) fox").value ["brown"]
>>> s.first_match_in("The (\w+) (\w+) (\w+)").value ["quick", "brown", "fox"]
>>> s.first_match_in("(\w+) (\w+)").value None
Parameters: regex ( StringExpression
)Returns: ArrayExpression
with element type tstr
-
length
()[source]¶ Returns the length of the string.
Examples
>>> s.length().value 19
Returns: Expression
of type tint32
– Length of the string.
-
lower
()[source]¶ Returns a copy of the string, but with upper case letters converted to lower case.
Examples
>>> s.lower().value 'the quick brown fox'
Returns: StringExpression
-
matches
(regex)[source]¶ Returns
True
if the string contains any match for the given regex.Examples
>>> string = hl.literal('NA12878')
The regex parameter does not need to match the entire string:
>>> string.matches('12').value True
Regex motifs can be used to match sequences of characters:
>>> string.matches(r'NA\d+').value True
Notes
The regex argument is a regular expression, and uses Java regex syntax.
Parameters: regex ( str
) – Pattern to match.Returns: BooleanExpression
–True
if the string contains any match for the regex, otherwise False
.
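The "contains any match" behaviour described above differs from requiring the whole string to match. A minimal plain-Python sketch of that distinction, using the standard re module rather than Hail, is shown here. Python and Java regex syntax overlap for these simple patterns but are not identical in general, so treat this as an analogy.

```python
import re

name = 'NA12878'

# re.search succeeds if the pattern matches anywhere in the string,
# which mirrors the "contains any match" behaviour described above.
partial = re.search(r'12', name) is not None
motif = re.search(r'NA\d+', name) is not None

# re.fullmatch requires the entire string to match, so '12' alone fails.
whole = re.fullmatch(r'12', name) is not None

print(partial, motif, whole)
```

The raw-string prefix keeps the backslash in `\d` from being interpreted by Python before the regex engine sees it.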
-
replace
(pattern1, pattern2)[source]¶ Replace substrings matching pattern1 with pattern2 using regex.
Examples
>>> s.replace(' ', '_').value 'The_quick_brown_fox'
Notes
The regular expressions used should follow Java regex syntax.
Parameters: - pattern1 (str or
StringExpression
) - pattern2 (str or
StringExpression
)
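Because pattern1 is treated as a regex, characters such as `.` or `+` have special meaning unless escaped. The sketch below illustrates the idea with Python's re.sub. It is an analogy in plain Python, not a Hail call, and Java's replacement-string syntax differs from Python's in some details.

```python
import re

s = 'The quick brown fox'

# A pattern with no metacharacters behaves like plain substitution.
underscored = re.sub(' ', '_', s)

# Metacharacters are interpreted: r'\s+' matches any run of whitespace,
# so two spaces or a tab each collapse into a single underscore.
collapsed = re.sub(r'\s+', '_', 'The  quick\tbrown fox')

# To match a metacharacter literally, escape it first.
dotted = re.sub(re.escape('.'), '-', 'a.b.c')

print(underscored, collapsed, dotted)
```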
-
split
(delim, n=None)[source]¶ Returns an array of strings generated by splitting the string at delim.
Examples
>>> s.split('\s+').value ['The', 'quick', 'brown', 'fox']
>>> s.split('\s+', 2).value ['The', 'quick brown fox']
Notes
The delimiter is a regular expression that uses Java regex syntax. To split on special characters, escape them with a double backslash (
\\
).Parameters: - delim (str or
StringExpression
) – Delimiter regex. - n (
Expression
of type tint32
, optional) – Maximum number of splits. In the example above, n=2 yields an array of two strings.
Returns: ArrayExpression
– Array of split strings.
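In the examples above, passing 2 as n produces two pieces. A plain-Python sketch of that behaviour follows. The helper `split_with_limit` is hypothetical and not part of Hail. It assumes n caps the number of resulting pieces, which is what the example output suggests, and it uses Python's re module rather than Java regex.

```python
import re


def split_with_limit(text, delim, n=None):
    """Hypothetical helper: split text on a regex, returning at most n pieces."""
    if n is None:
        return re.split(delim, text)
    if n < 1:
        raise ValueError('n must be at least 1 in this sketch')
    if n == 1:
        # maxsplit=0 means "no limit" in Python, so handle one piece directly.
        return [text]
    # Python's maxsplit counts cuts, so n pieces need n - 1 cuts.
    return re.split(delim, text, maxsplit=n - 1)


s = 'The quick brown fox'
all_words = split_with_limit(s, r'\s+')
two_pieces = split_with_limit(s, r'\s+', 2)

# A special character such as '.' must be escaped to be used as a delimiter.
by_dot = split_with_limit('a.b.c', r'\.')

print(all_words, two_pieces, by_dot)
```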
-
startswith
(substr)[source]¶ Returns whether substr is a prefix of the string.
Examples
>>> s.startswith('The').value True
>>> s.startswith('the').value False
Note
This method is case-sensitive.
Parameters: substr ( StringExpression
)Returns: BooleanExpression
-
strip
()[source]¶ Returns a copy of the string with whitespace removed from the start and end.
Examples
>>> s2 = hl.str(' once upon a time\n') >>> s2.strip().value 'once upon a time'
Returns: StringExpression
-
upper
()[source]¶ Returns a copy of the string, but with lower case letters converted to upper case.
Examples
>>> s.upper().value 'THE QUICK BROWN FOX'
Returns: StringExpression
-
-
class
hail.expr.expressions.
StructExpression
(ast, type, indices=Indices(axes=set(), source=None), aggregations=List())[source]¶ Bases:
typing.Mapping
,hail.expr.expressions.base_expression.Expression
Expression of type
tstruct
.>>> struct = hl.struct(a=5, b='Foo')
Struct fields are accessible as attributes and keys. It is therefore possible to access field a of the struct above with dot syntax:
>>> struct.a.value 5
However, it is recommended to use square brackets to select fields:
>>> struct['a'].value 5
The latter syntax is safer, because fields that share their name with an existing attribute of
StructExpression
(keys, values, annotate, drop, etc.) will only be accessible using the StructExpression.__getitem__()
syntax. This is also the only way to access fields that are not valid Python identifiers, like fields with spaces or symbols.-
__getitem__
(item)[source]¶ Access a field of the struct by name or index.
Examples
>>> struct['a'].value 5
>>> struct[1].value 'Foo'
Parameters: item ( str or int
) – Field name or index.Returns: Expression
– Struct field.
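The recommendation to prefer square brackets follows from how Python resolves attributes: a method defined on the class wins over a field of the same name. The toy class below is a hypothetical stand-in written for illustration. It is not Hail's StructExpression and makes no claim about how Hail implements field access.

```python
class TinyStruct:
    """Hypothetical stand-in for a struct; not Hail's StructExpression."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real
        # methods such as keys() always take priority over fields.
        fields = self.__dict__.get('_fields', {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)

    def __getitem__(self, name):
        # Bracket access looks only at the fields, never at methods.
        return self._fields[name]

    def keys(self):
        return list(self._fields)


t = TinyStruct(a=5, keys='shadowed', **{'my field': 1})

print(t.a, t['a'])        # both reach field a
print(callable(t.keys))   # dot syntax finds the method, not the field
print(t['keys'])          # brackets reach the field named keys
print(t['my field'])      # not a valid identifier, so brackets only
```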
-
annotate
(**named_exprs)[source]¶ Add new fields or recompute existing fields.
Examples
>>> struct.annotate(a=10, c=2*2*2).value Struct(a=10, b='Foo', c=8)
Notes
If an expression in named_exprs shares a name with a field of the struct, then that field will be replaced but keep its position in the struct. New fields will be appended to the end of the struct.
Parameters: named_exprs (keyword args of Expression
) – Fields to add.Returns: StructExpression
– Struct with new or updated fields.
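The ordering rule in the notes above (replaced fields keep their position, new fields are appended) happens to match how a Python dict behaves on update. The function below is a hypothetical dict-based sketch of that rule, not Hail code.

```python
def annotate(struct, **named_exprs):
    """Hypothetical dict-based sketch of the ordering rule; not Hail code."""
    result = dict(struct)        # copy, preserving the original field order
    result.update(named_exprs)   # existing keys keep their slot, new keys append
    return result


struct = {'a': 5, 'b': 'Foo'}

# c is listed first, yet a stays in its original position in the result.
updated = annotate(struct, c=2 * 2 * 2, a=10)

print(list(updated.items()))
```

The input dict is copied first, so the original struct is left untouched, as with the expression-based method.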
-
drop
(*fields)[source]¶ Drop fields from the struct.
Examples
>>> struct.drop('b').value Struct(a=5)
Parameters: fields (varargs of str
) – Fields to drop.Returns: StructExpression
– Struct without certain fields.
-
select
(*fields, **named_exprs)[source]¶ Select existing fields and compute new ones.
Examples
>>> struct.select('a', c=['bar', 'baz']).value Struct(a=5, c=['bar', 'baz'])
Notes
The fields argument is a list of field names to keep. These fields will appear in the resulting struct in the order they appear in fields.
The named_exprs arguments are new field expressions.
Parameters: - fields (varargs of
str
) – Field names to keep. - named_exprs (keyword args of
Expression
) – New field expressions.
Returns: StructExpression
– Struct containing specified existing fields and computed fields.
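As the notes for select state, kept fields appear in the order given in fields. The dict-based function below is a hypothetical sketch of that rule in plain Python, not Hail code. Placing the new fields after the kept ones is an assumption that is consistent with the example output above.

```python
def select(struct, *fields, **named_exprs):
    """Hypothetical dict-based sketch of select; not Hail code."""
    # Kept fields follow the order of the fields argument,
    # not their order in the original struct.
    result = {name: struct[name] for name in fields}
    # Assumption: computed fields are placed after the kept ones.
    result.update(named_exprs)
    return result


struct = {'a': 5, 'b': 'Foo'}

reordered = select(struct, 'b', 'a')
mixed = select(struct, 'a', c=['bar', 'baz'])

print(list(reordered), mixed)
```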
-