String functions
json(x) | Convert an expression to a JSON string expression.
hamming(s1, s2) | Returns the Hamming distance between the two strings.
delimit(collection[, delimiter]) | Joins elements of collection into a single string delimited by delimiter.
entropy(s) | Returns the Shannon entropy of the character distribution defined by the string.
hail.expr.functions.json(x) → hail.expr.expressions.typed_expressions.StringExpression
Convert an expression to a JSON string expression.
Examples
>>> hl.json([1,2,3,4,5]).value
'[1,2,3,4,5]'
>>> hl.json(hl.struct(a='Hello', b=0.12345, c=[1,2], d={'hi', 'bye'})).value
'{"a":"Hello","c":[1,2],"b":0.12345,"d":["bye","hi"]}'
Parameters: x – Expression to convert.
Returns: StringExpression – String expression with JSON representation of x.
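For comparison, the compact output format in the examples above can be approximated in plain Python with the standard json module. This is only an illustrative analogue, not how Hail implements the function; the helper name to_compact_json is hypothetical, and encoding sets as sorted arrays is an assumption based on the "d" field in the example.

```python
import json

def to_compact_json(value) -> str:
    """Serialize a Python value to JSON with no whitespace between tokens."""
    def encode_set(obj):
        # Assumption of this sketch: sets become sorted JSON arrays.
        if isinstance(obj, (set, frozenset)):
            return sorted(obj)
        raise TypeError(f"cannot serialize {type(obj).__name__}")

    # separators=(',', ':') removes the spaces json.dumps inserts by default.
    return json.dumps(value, separators=(',', ':'), default=encode_set)

print(to_compact_json([1, 2, 3, 4, 5]))                     # [1,2,3,4,5]
print(to_compact_json({'a': 'Hello', 'd': {'hi', 'bye'}}))  # {"a":"Hello","d":["bye","hi"]}
```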
hail.expr.functions.hamming(s1, s2) → hail.expr.expressions.typed_expressions.Int32Expression
Returns the Hamming distance between the two strings.
Examples
>>> hl.hamming('ATATA', 'ATGCA').value
2
>>> hl.hamming('abcdefg', 'zzcdefz').value
3
Notes
This method will fail if the two strings have different lengths.
Parameters:
- s1 (StringExpression) – First string.
- s2 (StringExpression) – Second string.
Returns: Expression of type tint32
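The Hamming distance counts the positions at which two equal-length strings differ. A minimal pure-Python sketch of that definition follows; the name hamming_distance is hypothetical, and raising ValueError on unequal lengths is this sketch's choice, since the notes above say only that the Hail function fails in that case.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        # Assumption of this sketch: signal the length mismatch with ValueError.
        raise ValueError("strings must have the same length")
    # Each mismatched pair contributes True (1) to the sum.
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

print(hamming_distance('ATATA', 'ATGCA'))      # 2
print(hamming_distance('abcdefg', 'zzcdefz'))  # 3
```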
hail.expr.functions.delimit(collection, delimiter=', ') → hail.expr.expressions.typed_expressions.StringExpression
Joins elements of collection into a single string delimited by delimiter.
Examples
>>> a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
>>> hl.delimit(a).value
'Bob,Charlie,Alice,Bob,Bob'
Notes
If the element type of collection is not tstr, then the str() function will be called on each element before joining with the delimiter.
Parameters:
- collection (ArrayExpression or SetExpression) – Collection.
- delimiter (str or StringExpression) – Field delimiter.
Returns: StringExpression – Joined string expression.
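The behavior described in the notes, converting non-string elements with str() before joining, can be sketched in plain Python as follows. The helper delimit_sketch is hypothetical and takes the delimiter explicitly rather than assuming a default.

```python
from typing import Iterable

def delimit_sketch(collection: Iterable, delimiter: str) -> str:
    """Join the elements of a collection, converting each to str first."""
    # str() is a no-op for elements that are already strings.
    return delimiter.join(str(element) for element in collection)

a = ['Bob', 'Charlie', 'Alice', 'Bob', 'Bob']
print(delimit_sketch(a, ','))          # Bob,Charlie,Alice,Bob,Bob
print(delimit_sketch([1, 2, 3], '|'))  # 1|2|3
```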
hail.expr.functions.entropy(s) → hail.expr.expressions.typed_expressions.Float64Expression
Returns the Shannon entropy of the character distribution defined by the string.
Examples
>>> hl.entropy('ac').value
1.0
>>> hl.entropy('accctg').value
1.79248
Notes
For a string of length \(n\) with \(k\) unique characters \(\{ c_1, \dots, c_k \}\), let \(p_i\) be the probability that a randomly chosen character is \(c_i\), i.e. the number of instances of \(c_i\) divided by \(n\). Then the base-2 Shannon entropy is given by
\[H = -\sum_{i=1}^k p_i \log_2(p_i).\]
Parameters: s (StringExpression)
Returns: Expression of type tfloat64
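The entropy definition above can be checked against the two examples with a short pure-Python sketch. The name shannon_entropy is hypothetical, and returning 0.0 for an empty string is an assumption of this sketch rather than documented Hail behavior.

```python
from collections import Counter
from math import log2

def shannon_entropy(s: str) -> float:
    """Base-2 Shannon entropy of the character distribution of s."""
    n = len(s)
    if n == 0:
        # Assumption of this sketch: an empty string has zero entropy.
        return 0.0
    counts = Counter(s)  # number of instances of each unique character
    # p_i = count / n; the leading minus sign makes the result non-negative.
    return -sum((c / n) * log2(c / n) for c in counts.values())

print(shannon_entropy('ac'))                # 1.0
print(round(shannon_entropy('accctg'), 5))  # 1.79248
```

For 'accctg', the probabilities are 1/2 for 'c' and 1/6 for each of 'a', 't', and 'g', so \(H = \tfrac{1}{2}\log_2 2 + 3 \cdot \tfrac{1}{6}\log_2 6 \approx 1.79248\).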