Missingness
All values in Hail can be missing.
Expressions deal with missingness in a natural way. For example:
- a missing value plus another value is always missing.
- a conditional with a missing predicate is missing.
- when aggregating a sum of values, the missing values are ignored.
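The three rules above can be modeled in plain Python, using None to stand in for a missing value. This is only a sketch of the semantics, not Hail code: the helper names add, cond, and agg_sum are hypothetical and are not part of the Hail API.

```python
from typing import Iterable, Optional


def add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Hypothetical helper: the sum is missing (None) if either operand is missing."""
    if a is None or b is None:
        return None
    return a + b


def cond(predicate: Optional[bool], then, otherwise):
    """Hypothetical helper: a conditional with a missing predicate is missing."""
    if predicate is None:
        return None
    return then if predicate else otherwise


def agg_sum(values: Iterable[Optional[int]]) -> int:
    """Hypothetical helper: an aggregated sum skips missing values."""
    return sum(v for v in values if v is not None)


print(add(None, 5))           # missing operand, so the result is missing
print(cond(None, 1, 2))       # missing predicate, so the result is missing
print(agg_sum([1, None, 2]))  # the missing element is ignored
```

Note the asymmetry: element-wise operations propagate missingness, while the aggregation drops missing inputs instead of returning a missing total.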
Hail has a collection of primitive operations for dealing with missingness.
To start, let’s create expressions representing missing and non-missing values.
In [1]:
import hail as hl
hl.init()
Running on Apache Spark version 2.2.0
SparkUI available at http://172.31.25.74:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version devel-897938986fe5
NOTE: This is a beta version. Interfaces may change
during the beta period. We recommend pulling
the latest changes weekly.
In [2]:
na = hl.null(hl.tint32)
x = hl.literal(5)
To evaluate an expression, ask for its value. Let’s look at a few
expressions involving missingness.
In [3]:
print(na.value)
None
In [4]:
print(x.value)
5
In [5]:
hl.is_defined(na).value
Out[5]:
False
In [6]:
hl.is_defined(x).value
Out[6]:
True
In [7]:
hl.is_missing(na).value
Out[7]:
True
In [8]:
hl.or_else(na, x).value
Out[8]:
5
In [9]:
hl.or_else(x, na).value
Out[9]:
5
In [10]:
hl.or_missing(True, x).value
Out[10]:
5
In [11]:
print(hl.or_missing(False, x).value)
None
The above is equivalent to:
In [12]:
print(hl.case().when(False, x).or_missing().value)
None
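As a rough mental model, the primitives used in the cells above behave like the following plain-Python functions, with None standing in for a missing value. These are illustrative stand-ins written for this sketch, not Hail's implementation. Returning a missing result from or_missing when the predicate itself is missing is an assumption, chosen to be consistent with the rule that a conditional with a missing predicate is missing.

```python
def is_defined(value):
    """True when the value is present."""
    return value is not None


def is_missing(value):
    """True when the value is missing."""
    return value is None


def or_else(value, default):
    """Return value if it is defined, otherwise fall back to default."""
    return value if value is not None else default


def or_missing(predicate, value):
    """Return value when predicate is True, otherwise a missing value.

    Assumption: a missing predicate also yields a missing result.
    """
    if predicate is None:
        return None
    return value if predicate else None


na = None  # stands in for hl.null(hl.tint32)
x = 5      # stands in for hl.literal(5)

print(is_defined(na), is_defined(x))  # mirrors cells 5 and 6
print(or_else(na, x), or_else(x, na))  # mirrors cells 8 and 9
print(or_missing(True, x), or_missing(False, x))  # mirrors cells 10 and 11
```

In this model or_else acts as a default-value operator, while or_missing goes the other way and turns a defined value into a missing one when the predicate does not hold.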
Missingness propagates up
In Python, None + 5 raises a TypeError. In Hail, operating on a missing
value doesn’t produce an error; it produces a missing result instead.
In [13]:
(x + 5).value
Out[13]:
10
In [14]:
print((na + 5).value)
None
In [15]:
a = hl.array([1, 1, 2, 3, 5, 8, 13, 21])
In [16]:
a[x].value
Out[16]:
8
In [17]:
print(a[na].value)
None
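The same propagation rule can be imitated in plain Python by wrapping an operation so that it returns None whenever any argument is None. The decorator propagate_missing and the helpers add and index are hypothetical names introduced for this sketch. It illustrates the behavior only and says nothing about how Hail implements it.

```python
import functools


def propagate_missing(fn):
    """Wrap fn so that any missing (None) argument makes the result missing."""
    @functools.wraps(fn)
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return fn(*args)
    return wrapper


@propagate_missing
def add(a, b):
    return a + b


@propagate_missing
def index(array, i):
    return array[i]


a = [1, 1, 2, 3, 5, 8, 13, 21]
x = 5
na = None  # stands in for a missing int32

print(add(x, 5))     # both operands defined
print(add(na, 5))    # missing operand propagates
print(index(a, x))   # defined index
print(index(a, na))  # missing index propagates
```

Because the check happens before the wrapped operation runs, the missing input never reaches the addition or the list lookup, so no TypeError is raised.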