String functions

format(f, *args) – Returns a formatted string using a specified format string and arguments.
json(x) – Convert an expression to a JSON string expression.
hamming(s1, s2) – Returns the Hamming distance between the two strings.
delimit(collection[, delimiter]) – Joins elements of collection into a single string delimited by delimiter.
entropy(s) – Returns the Shannon entropy of the character distribution defined by the string.
hail.expr.functions.format(f, *args)

Returns a formatted string using a specified format string and arguments.

Examples

>>> hl.eval(hl.format('%.3e', 0.09345332))
'9.345e-02'
>>> hl.eval(hl.format('%.4f', hl.null(hl.tfloat64)))
'null'
>>> hl.eval(hl.format('%s %s %s', 'hello', hl.tuple([3, hl.locus('1', 2453)]), True))
'hello [3,1:2453] true'

Notes

See the Java documentation for java.util.Formatter for valid format specifiers and arguments.

Missing values are printed as 'null' except when using the format flags ‘b’ and ‘B’ (printed as 'false' instead).

Parameters:
f – Format string.
args – Arguments to format.
Returns:StringExpression

hail.expr.functions.json(x) → hail.expr.expressions.typed_expressions.StringExpression

Convert an expression to a JSON string expression.

Examples

>>> hl.eval(hl.json([1,2,3,4,5]))
'[1,2,3,4,5]'
>>> hl.eval(hl.json(hl.struct(a='Hello', b=0.12345, c=[1,2], d={'hi', 'bye'})))
'{"a":"Hello","b":0.12345,"c":[1,2],"d":["bye","hi"]}'
Parameters:x – Expression to convert.
Returns:StringExpression – String expression with JSON representation of x.
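
Because the result is an ordinary JSON string, it can be parsed on the Python side once evaluated. The sketch below uses only the standard-library json module and does not call Hail; the two literals are hand-written strings with the same fields as the examples above, and the variable names are illustrative.

```python
import json

# Strings of the kind hl.eval(hl.json(...)) returns in the examples above.
array_json = '[1,2,3,4,5]'
struct_json = '{"a":"Hello","b":0.12345,"c":[1,2],"d":["bye","hi"]}'

values = json.loads(array_json)   # JSON array  -> Python list
record = json.loads(struct_json)  # JSON object -> Python dict

print(values)            # [1, 2, 3, 4, 5]
print(record['a'])       # Hello
print(set(record['d']))  # the set field is serialized as a JSON array
```

Key order carries no meaning once the object is parsed, and a set field comes back as a list, so convert it explicitly if set semantics are needed.
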
hail.expr.functions.hamming(s1, s2) → hail.expr.expressions.typed_expressions.Int32Expression

Returns the Hamming distance between the two strings.

Examples

>>> hl.eval(hl.hamming('ATATA', 'ATGCA'))
2
>>> hl.eval(hl.hamming('abcdefg', 'zzcdefz'))
3

Notes

This method will fail if the two strings have different lengths.
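
As a plain-Python illustration of the quantity being computed (not Hail's implementation), the Hamming distance counts the positions at which two equal-length strings differ. The helper name hamming_sketch and the choice of ValueError are assumptions made for this sketch.

```python
def hamming_sketch(s1: str, s2: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        # Mirrors the documented failure on unequal lengths; the exception
        # type is an assumption, not necessarily what Hail raises.
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


print(hamming_sketch('ATATA', 'ATGCA'))      # 2
print(hamming_sketch('abcdefg', 'zzcdefz'))  # 3
```
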

Parameters:
s1 – First string.
s2 – Second string.
Returns:Expression of type tint32

hail.expr.functions.delimit(collection, delimiter=', ') → hail.expr.expressions.typed_expressions.StringExpression

Joins elements of collection into a single string delimited by delimiter.

Examples

>>> a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
>>> hl.eval(hl.delimit(a))
'Bob, Charlie, Alice, Bob, Bob'

Notes

If the element type of collection is not tstr, then the str() function will be called on each element before joining with the delimiter.
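
The behavior described in this note can be approximated in plain Python, as a rough analogue rather than Hail's implementation. The helper name delimit_sketch is hypothetical, and Python's built-in str() stands in for Hail's str(), so the text produced for non-string elements may differ from what Hail emits.

```python
def delimit_sketch(collection, delimiter=', '):
    """Join elements into one string, stringifying non-string elements first."""
    # Strings pass through unchanged; everything else goes through str().
    parts = (x if isinstance(x, str) else str(x) for x in collection)
    return delimiter.join(parts)


print(delimit_sketch(['Bob', 'Charlie', 'Alice']))  # Bob, Charlie, Alice
print(delimit_sketch([1, 2, 3], delimiter='|'))     # 1|2|3
```
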

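Parameters:
collection – Collection of elements to join.
delimiter – Delimiter placed between elements; defaults to ', '.
Returns:StringExpression – Joined string expression.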

hail.expr.functions.entropy(s) → hail.expr.expressions.typed_expressions.Float64Expression

Returns the Shannon entropy of the character distribution defined by the string.

Examples

>>> hl.eval(hl.entropy('ac'))
1.0
>>> hl.eval(hl.entropy('accctg'))
1.79248

Notes

For a string of length \(n\) with \(k\) unique characters \(\{ c_1, \dots, c_k \}\), let \(p_i\) be the probability that a randomly chosen character is \(c_i\), i.e. the number of instances of \(c_i\) divided by \(n\). Then the base-2 Shannon entropy is given by

\[H = -\sum_{i=1}^k p_i \log_2(p_i).\]
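
A plain-Python sketch of this computation, which reproduces the two example values above, is shown here. It is not Hail's implementation: entropy_sketch is a hypothetical helper, and returning 0.0 for the empty string is an assumption of the sketch. Each term is written as \(p_i \log_2(1/p_i)\), which equals \(-p_i \log_2(p_i)\), so the sum is non-negative.

```python
import math
from collections import Counter


def entropy_sketch(s: str) -> float:
    """Base-2 Shannon entropy of the character distribution of s."""
    n = len(s)
    counts = Counter(s)  # occurrences of each distinct character
    # p_i = c / n, and p_i * log2(1 / p_i) == -p_i * log2(p_i).
    # For an empty string there are no terms and fsum returns 0.0.
    return math.fsum((c / n) * math.log2(n / c) for c in counts.values())


print(entropy_sketch('ac'))                # 1.0
print(round(entropy_sketch('accctg'), 5))  # 1.79248
```
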
Parameters:s (StringExpression)
Returns:Expression of type tfloat64