String functions
format(f, *args) | Returns a formatted string using a specified format string and arguments.
json(x) | Convert an expression to a JSON string expression.
hamming(s1, s2) | Returns the Hamming distance between the two strings.
delimit(collection[, delimiter]) | Joins elements of collection into a single string delimited by delimiter.
entropy(s) | Returns the Shannon entropy of the character distribution defined by the string.
hail.expr.functions.format(f, *args)
Returns a formatted string using a specified format string and arguments.
Examples
>>> hl.eval(hl.format('%.3e', 0.09345332))
'9.345e-02'
>>> hl.eval(hl.format('%.4f', hl.null(hl.tfloat64)))
'null'
>>> hl.eval(hl.format('%s %s %s', 'hello', hl.tuple([3, hl.locus('1', 2453)]), True))
'hello [3,1:2453] true'
Notes
See the Java documentation for valid format specifiers and arguments.
Missing values are printed as 'null' except when using the format flags 'b' and 'B' (printed as 'false' instead).
Parameters:
- f (StringExpression) – Java format string.
- args (variable-length arguments of Expression) – Arguments to format.
Returns: StringExpression
hail.expr.functions.json(x) → hail.expr.expressions.typed_expressions.StringExpression
Convert an expression to a JSON string expression.
Examples
>>> hl.eval(hl.json([1,2,3,4,5]))
'[1,2,3,4,5]'
>>> hl.eval(hl.json(hl.struct(a='Hello', b=0.12345, c=[1,2], d={'hi', 'bye'})))
'{"a":"Hello","c":[1,2],"b":0.12345,"d":["bye","hi"]}'
Parameters: x – Expression to convert.
Returns: StringExpression – String expression with JSON representation of x.
hail.expr.functions.hamming(s1, s2) → hail.expr.expressions.typed_expressions.Int32Expression
Returns the Hamming distance between the two strings.
Examples
>>> hl.eval(hl.hamming('ATATA', 'ATGCA'))
2
>>> hl.eval(hl.hamming('abcdefg', 'zzcdefz'))
3
Notes
This method will fail if the two strings have different lengths.
Parameters:
- s1 (StringExpression) – First string.
- s2 (StringExpression) – Second string.
Returns: Expression of type tint32
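The definition can be illustrated outside Hail with a minimal pure-Python sketch. The function name `hamming_distance` is hypothetical, and raising `ValueError` on a length mismatch is an assumption chosen for illustration; the way Hail itself reports the failure is not specified here.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        # Assumption: signal the mismatch with ValueError. Hail's own
        # failure mode for unequal lengths may look different.
        raise ValueError("strings must have the same length")
    # Each comparison yields True (1) or False (0); summing counts mismatches.
    return sum(a != b for a, b in zip(s1, s2))


print(hamming_distance('ATATA', 'ATGCA'))      # 2
print(hamming_distance('abcdefg', 'zzcdefz'))  # 3
```

The two calls reproduce the results of the Hail examples above on plain Python strings.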
hail.expr.functions.delimit(collection, delimiter=', ') → hail.expr.expressions.typed_expressions.StringExpression
Joins elements of collection into a single string delimited by delimiter.
Examples
>>> a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
>>> hl.eval(hl.delimit(a))
'Bob,Charlie,Alice,Bob,Bob'
Notes
If the element type of collection is not tstr, then the str() function will be called on each element before joining with the delimiter.
Parameters:
- collection (ArrayExpression or SetExpression) – Collection.
- delimiter (str or StringExpression) – Field delimiter.
Returns: StringExpression – Joined string expression.
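The convert-then-join behaviour described in the notes can be sketched in plain Python. This is only an analogy: `join_elements` is a hypothetical helper, the delimiter is passed explicitly rather than relying on a default, and Python's `str()` may render some values differently from Hail's string conversion.

```python
def join_elements(collection, delimiter: str) -> str:
    """Convert every element to a string, then join with the delimiter."""
    # str() leaves existing strings unchanged and converts everything else,
    # mirroring the note about non-string element types.
    return delimiter.join(str(x) for x in collection)


names = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
print(join_elements(names, ','))      # Bob,Charlie,Alice,Bob,Bob
print(join_elements([1, 2, 3], '|'))  # 1|2|3
```

The second call shows integer elements being converted to strings before they are joined.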
hail.expr.functions.entropy(s) → hail.expr.expressions.typed_expressions.Float64Expression
Returns the Shannon entropy of the character distribution defined by the string.
Examples
>>> hl.eval(hl.entropy('ac'))
1.0
>>> hl.eval(hl.entropy('accctg'))
1.79248
Notes
For a string of length \(n\) with \(k\) unique characters \(\{ c_1, \dots, c_k \}\), let \(p_i\) be the probability that a randomly chosen character is \(c_i\), i.e. the number of instances of \(c_i\) divided by \(n\). Then the base-2 Shannon entropy is given by
\[H = -\sum_{i=1}^k p_i \log_2(p_i).\]
Parameters: s (StringExpression)
Returns: Expression of type tfloat64
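As a check on the definition, the following pure-Python sketch computes the same quantity from character counts. The name `shannon_entropy` is hypothetical, and returning 0.0 for the empty string is an assumption made here; it is not taken from Hail's behaviour.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Base-2 Shannon entropy of the character distribution of s."""
    n = len(s)
    if n == 0:
        return 0.0  # assumption: define the entropy of an empty string as 0
    counts = Counter(s)
    # p_i = c / n. Using log2(n / c) = -log2(p_i) folds the leading
    # minus sign of the entropy formula into the logarithm.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


print(shannon_entropy('ac'))                # 1.0
print(round(shannon_entropy('accctg'), 5))  # 1.79248
```

For 'accctg' the probabilities are 1/6, 1/2, 1/6 and 1/6, so the sum is 0.5 + 0.5 · log2(6) ≈ 1.79248, which agrees with the Hail example.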