For Software Developers
Hail is an open-source project. We welcome contributions to the repository. If you’re interested in contributing to Hail, you will need to build your own Hail JAR and set up the testing environment.
Requirements
You’ll need:
- Java 8 JDK
- Spark 2.2.0
- Hail will work with other bug fix versions of Spark 2.2.x, but it will not work with Spark 1.x.x, 2.0.x, or 2.1.x.
- Anaconda for Python 3
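The Spark constraint above can be checked before starting a build. The following is a minimal sketch, not part of Hail: the function name `is_supported_spark` is hypothetical, and it assumes that every version outside the 2.2.x line is unsupported (the list above only rules out 1.x.x, 2.0.x, and 2.1.x explicitly).

```python
def is_supported_spark(version: str) -> bool:
    """Return True if `version` belongs to the Spark 2.2.x line.

    Hypothetical helper, not part of Hail. Assumption: anything outside
    2.2.x is treated as unsupported.
    """
    parts = version.strip().split(".")
    if len(parts) < 2:
        return False
    try:
        major, minor = int(parts[0]), int(parts[1])
    except ValueError:
        # Non-numeric components cannot be a release version.
        return False
    return (major, minor) == (2, 2)


if __name__ == "__main__":
    for candidate in ["2.2.0", "2.2.1", "2.1.0", "1.6.3"]:
        print(candidate, is_supported_spark(candidate))
```

A check like this could run at the top of a build script so that an incompatible Spark installation is reported before Gradle starts.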
Building a Hail JAR
The only additional tool necessary to build Hail from source is a C++ compiler. On a Debian-based OS like Ubuntu, a C++ compiler can be installed with apt-get:
sudo apt-get install g++
On Mac OS X, a C++ compiler is provided by Apple's Xcode command line tools:
xcode-select --install
The Hail source code is hosted on GitHub:
git clone https://github.com/hail-is/hail.git
cd hail/hail
A Hail JAR can be built using Gradle. Note that every Hail JAR is specific to one version of Spark:
./gradlew -Dspark.version=2.2.0 shadowJar
Finally, some environment variables must be set so that Hail can find Spark, Spark can find Hail, and Python can find Hail. Add these lines to your .bashrc or equivalent, setting SPARK_HOME to the root directory of a Spark installation and HAIL_HOME to the hail subdirectory of the Hail repository:
export SPARK_HOME=/path/to/spark
export HAIL_HOME=/path/to/hail/hail
export PYTHONPATH="$PYTHONPATH:$HAIL_HOME/python:$SPARK_HOME/python:`echo $SPARK_HOME/python/lib/py4j*-src.zip`"
export SPARK_CLASSPATH=$HAIL_HOME/build/libs/hail-all-spark.jar
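A misconfigured variable usually surfaces later as an import error, so it can help to check the setup directly. The sketch below mirrors the export lines above in Python. All function names (`hail_pythonpath_entries`, `hail_jar_path`, `check_environment`) are hypothetical and not part of Hail; the paths are taken from the exports, and nothing else about the layout is assumed.

```python
import glob
import os


def hail_pythonpath_entries(spark_home: str, hail_home: str) -> list:
    """Build the PYTHONPATH entries added by the export line above.

    Hypothetical helper, not part of Hail.
    """
    entries = [
        os.path.join(hail_home, "python"),
        os.path.join(spark_home, "python"),
    ]
    # Mirrors the shell glob $SPARK_HOME/python/lib/py4j*-src.zip.
    pattern = os.path.join(spark_home, "python", "lib", "py4j*-src.zip")
    entries.extend(sorted(glob.glob(pattern)))
    return entries


def hail_jar_path(hail_home: str) -> str:
    """Return the path that SPARK_CLASSPATH points to in the exports above."""
    return os.path.join(hail_home, "build", "libs", "hail-all-spark.jar")


def check_environment(env=os.environ) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name in ("SPARK_HOME", "HAIL_HOME"):
        if not env.get(name):
            problems.append(name + " is not set")
    if problems:
        return problems
    if not os.path.isfile(hail_jar_path(env["HAIL_HOME"])):
        problems.append("Hail JAR not found; run the shadowJar build first")
    if len(hail_pythonpath_entries(env["SPARK_HOME"], env["HAIL_HOME"])) < 3:
        problems.append("no py4j source zip found under SPARK_HOME/python/lib")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script in a fresh shell after editing .bashrc prints one warning per problem and nothing when the variables and the JAR are in place.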
Now you can import Hail from a Python interpreter:
$ python
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:14:23)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> hl.init() # doctest: +SKIP
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 2.2.0
SparkUI available at http://10.1.6.36:4041
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version devel-9f866ba
NOTE: This is a beta version. Interfaces may change
during the beta period. We also recommend pulling
the latest changes weekly.
>>>
Building the Docs
Hail uses conda environments to manage the doc build process’s python dependencies. First, create a conda environment for hail:
conda env create -n haildoc -f ./python/hail/dev-environment.yml
Activate the environment:
source activate haildoc
Now the shell prompt should include the name of the environment, in this case “haildoc”. Within the environment, run the makeDocs Gradle task:
./gradlew makeDocs
The generated docs are located at ./build/www/hail/index.html.
When you are finished developing Hail, deactivate the environment:
source deactivate haildoc
The dev-environment.yml file may change without warning; therefore, after pulling new changes from a remote repository, we always recommend updating the conda environment:
conda env update -n haildoc -f ./python/hail/dev-environment.yml
Running the tests
Several Hail tests have additional dependencies:
- R 3.3.4 with CRAN packages jsonlite, SKAT, and logistf, as well as pcrelate from the GENESIS Bioconductor package. These can be installed within R using:
install.packages(c("jsonlite", "SKAT", "logistf"))
source("https://bioconductor.org/biocLite.R")
biocLite("GENESIS")
biocLite("SNPRelate")
biocLite("GWASTools")
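To execute all Hail tests, run the following with SPARK_VERSION set to your Spark version (for example, 2.2.0) and SPARK_HOME set as described above: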
./gradlew -Dspark.version=${SPARK_VERSION} -Dspark.home=${SPARK_HOME} test
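The Gradle invocations in this guide share one shape: optional -Dspark.version and -Dspark.home properties followed by a task name. The sketch below assembles such a command as an argument list. The function `gradle_command` is hypothetical and not part of Hail; it only knows the two properties and the task names that appear in this guide.

```python
import shlex


def gradle_command(task, spark_version=None, spark_home=None):
    """Assemble a ./gradlew invocation as an argument list.

    Hypothetical helper, not part of Hail. Only the -Dspark.version and
    -Dspark.home properties shown in this guide are handled.
    """
    cmd = ["./gradlew"]
    if spark_version is not None:
        cmd.append("-Dspark.version=" + spark_version)
    if spark_home is not None:
        cmd.append("-Dspark.home=" + spark_home)
    cmd.append(task)
    return cmd


if __name__ == "__main__":
    # Print shell-quoted command lines for the build and test tasks.
    for cmd in (
        gradle_command("shadowJar", spark_version="2.2.0"),
        gradle_command("test", "2.2.0", "/path/to/spark"),
    ):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run(cmd, cwd=..., check=True)` from the directory that contains gradlew, which avoids shell quoting issues when SPARK_HOME contains spaces.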
Contributing
Chat with the dev team on our Zulip chatroom if you have an idea for a contribution. We can help you determine if your project is a good candidate for merging.
Keep in mind the following principles when submitting a pull request:
- A PR should focus on a single feature. Multiple features should be split into multiple PRs.
- Before submitting your PR, you should rebase onto the latest master.
- PRs must pass all tests before being merged. See the Running the tests section above for how to run them locally.
- PRs require a review before being merged. We will assign someone from our dev team to review your PR.
- Code in PRs should be formatted according to the style in code_style.xml. This file can be loaded into IntelliJ to automatically format your code.
- When you make a PR, include a short message that describes the purpose of the PR and any necessary context for the changes you are making.