# Configuring TQL
This guide will show you how to configure and customize TQL to fit your needs.
## The TQL Configuration File
TQL is configured by a [YAML](https://yaml.org/) file. To see the location of the
configuration file, run the command `tql status`:
```
(tql-venv) jshmoe$ tql status
------------------------------------------------------------------
Version: 20.1.16
Version Timestamp: 2021-07-21 17:34:23
Version Age: 6 days, 17 hours, 56 minutes, 6 seconds
Filesystem Root: /home/jshmoe/.tql/files
Working Directory: /home/jshmoe/.tql
Configuration File: /home/jshmoe/.tql/tql_conf.yml
Api Gateway: http://localhost:9000
Service Status: Icarus: OFFLINE, Daedalus: OFFLINE
Service Uptime:
------------------------------------------------------------------
```
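If you need the configuration path in a script, the `Key: value` lines of the status output can be split apart. The following is a minimal sketch, assuming the output keeps the format shown above; `parse_status` and `SAMPLE_STATUS` are hypothetical names used only for illustration, not part of TQL.

```python
# Sample text copied from the `tql status` output shown above.
SAMPLE_STATUS = """\
------------------------------------------------------------------
Version: 20.1.16
Filesystem Root: /home/jshmoe/.tql/files
Working Directory: /home/jshmoe/.tql
Configuration File: /home/jshmoe/.tql/tql_conf.yml
Api Gateway: http://localhost:9000
Service Status: Icarus: OFFLINE, Daedalus: OFFLINE
------------------------------------------------------------------
"""


def parse_status(text):
    """Turn the 'Key: value' lines of status output into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        # Separator lines and the shell prompt contain no colon; skip them.
        if ":" not in line:
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


status = parse_status(SAMPLE_STATUS)
print(status["Configuration File"])
```

In real use, the text could come from `subprocess.run(["tql", "status"], capture_output=True, text=True).stdout` instead of the sample string.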
You can also see the current configuration by running the command `tql conf`:
```
(tql-venv) jshmoe$ tql conf
conf_path: /home/jshmoe/.tql/tql_conf.yml
database:
  nanml.standalone.db.h2-disk.directory: /home/jshmoe/.tql/db
  nanml.standalone.db.instance_type: h2-disk
filesystem:
  root: /home/jshmoe/.tql/files
icarus:
...
```
NOTE: The TQL configuration file is generated only on the first startup of
the TQL backend. If you have just installed TQL but have not yet used it, first
run the command `tql start` to generate a configuration file to edit.
## Modifying the Configuration
Edit the configuration file (in the example above, `/home/jshmoe/.tql/tql_conf.yml`)
using your editor of choice. Afterwards, restart TQL using the command `tql restart`
to apply your configuration changes.
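Before editing, it can be useful to keep a copy of the working configuration so a bad edit is easy to undo. The following is a minimal sketch using only the Python standard library; `backup_config` is a hypothetical helper, not a TQL command, and the `.bak` suffix is an arbitrary choice.

```python
import shutil
from pathlib import Path


def backup_config(conf_path):
    """Copy the config file to '<name>.bak' beside it and return the backup path."""
    source = Path(conf_path).expanduser()
    backup = source.with_name(source.name + ".bak")
    # copy2 copies the contents and also preserves file timestamps.
    shutil.copy2(source, backup)
    return backup


# Example call, using the path reported by `tql status` on your machine:
# backup_config("~/.tql/tql_conf.yml")
```

If TQL fails to come back up after `tql restart`, copying the `.bak` file over the edited file restores the previous settings.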
## Configuration Sections
The configuration file is broken up into four sections: `filesystem`, `pyspark`,
`icarus`, and `database`.
### Filesystem
The backend reads and writes files as part of its normal operation. Configure where
it stores them by adding the following section to your configuration file:
```
filesystem:
  root: /path/to/filesystem/root
```
By default, the backend reads and writes to the local filesystem, in the directory `~/.tql/files/`.
However, if you deploy the backend to a cluster environment, such as an AWS EMR Cluster or GCP
Dataproc, all processing nodes must have access to the filesystem. In that case, you should
configure the backend to write to cloud storage, such as S3 or GCS.
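The URI scheme of `root` indicates which kind of storage is in use, and therefore which credentials block the configuration needs. The following is a minimal sketch of that mapping, assuming only the `s3://` and `gs://` schemes used in this guide; `storage_backend` is a hypothetical helper and does not reflect how TQL itself resolves the root.

```python
from urllib.parse import urlparse


def storage_backend(root):
    """Classify a filesystem root as 'local', 's3', or 'gcs' (hypothetical helper)."""
    scheme = urlparse(root).scheme
    if scheme == "s3":
        return "s3"
    if scheme == "gs":
        return "gcs"
    if scheme in ("", "file"):
        # Plain paths such as ~/.tql/files/ have no scheme.
        return "local"
    raise ValueError(f"unrecognized filesystem root: {root!r}")


for root in ("~/.tql/files/", "s3://your-bucket/sub-folder/", "gs://your-bucket/sub-folder/"):
    print(root, "->", storage_backend(root))
```

Note that this check treats any path without a scheme as local, so a Windows drive path such as `C:\...` would be rejected rather than classified.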
#### Amazon S3 Filesystem
To read/write to Amazon S3, use the following configuration:
```
filesystem:
  root: s3://your-bucket/sub-folder/
  s3:
    s3_access_key: YOUR_ACCESS_KEY
    s3_secret_key: YOUR_SECRET_KEY
    s3_region: YOUR_REGION
```
#### Google GCS Filesystem
To read/write to GCS, use the following configuration:
```
filesystem:
  root: gs://your-bucket/sub-folder/
  gcs:
    type: "service_account"
    project_id: YOUR_PROJECT_ID
    private_key_id: YOUR_PRIVATE_KEY_ID
    private_key: "YOUR_PRIVATE_KEY"
    client_email: YOUR_PROJECT_ID@appspot.gserviceaccount.com
    client_id: YOUR_CLIENT_ID
    auth_uri: "https://accounts.google.com/o/oauth2/auth"
    token_uri: "https://oauth2.googleapis.com/token"
    auth_provider_x509_cert_url: "https://www.googleapis.com/oauth2/v1/certs"
    client_x509_cert_url: "https://www.googleapis.com..."
```
### PySpark
TQL installs with a basic default configuration of PySpark. By default, TQL uses a local,
embedded instance of Spark, denoted by `local[*]`, with `1024m` of memory each for the
driver and the executor. You can customize the Spark instance using the following configuration,
changing the values from their defaults shown here:
```
pyspark:
  conf:
    spark.master: local[*]
    spark.driver.memory: 1024m
    spark.executor.memory: 1024m
    ...
```
You may add any [Spark application properties](https://spark.apache.org/docs/latest/configuration.html#application-properties)
you wish under `pyspark -> conf`. In particular, you may wish to connect to a different type of Spark
master, such as `yarn`. This configuration change is required in order to use TQL on an
[Amazon EMR](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark.html) or
[Google Dataproc](https://cloud.google.com/dataproc) Cluster.
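To illustrate how overrides sit on top of the defaults, the sketch below merges a few properties and prints the resulting `pyspark` section. It is an illustration only: `pyspark_section` is a hypothetical helper, the `4g` executor memory is an arbitrary example value, and the simple line formatting assumes plain string values that need no YAML quoting.

```python
# Defaults as listed in this guide.
DEFAULTS = {
    "spark.master": "local[*]",
    "spark.driver.memory": "1024m",
    "spark.executor.memory": "1024m",
}


def pyspark_section(overrides):
    """Render a `pyspark -> conf` section with overrides applied to the defaults."""
    conf = {**DEFAULTS, **overrides}
    lines = ["pyspark:", "  conf:"]
    # Each Spark property becomes one key under `conf`.
    lines += [f"    {key}: {value}" for key, value in conf.items()]
    return "\n".join(lines)


# Example: point TQL at a YARN master and give executors more memory.
print(pyspark_section({"spark.master": "yarn", "spark.executor.memory": "4g"}))
```

The printed text can be pasted into the configuration file in place of the existing `pyspark` section.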
### Icarus
TQL's REST API server, named **Icarus**, acts as a gateway between the Python user interface
and the Java query execution engine. By default, Icarus listens on port `9000` (application)
and port `9001` (admin), with `512m` of memory. You can customize the web service's memory
and ports using the following configuration, changing the values from their defaults shown here:
```
icarus:
  configuration:
    server:
      adminConnectors:
      - port: 9001
        type: http
      applicationConnectors:
      - port: 9000
        type: http
  memory: 512m
```
Under the hood, Icarus uses the Dropwizard web framework, and you can further customize
its configuration in this section of the config file. Read more about Dropwizard
configuration [here](https://www.dropwizard.io/en/latest/manual/configuration.html).
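When choosing ports, it helps to know whether another process is already listening on them. The following is a minimal sketch using only the Python standard library; `port_in_use` is a hypothetical helper, and it checks only the local machine on `127.0.0.1`.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Check the default Icarus application and admin ports.
for port in (9000, 9001):
    print(port, "in use" if port_in_use(port) else "free")
```

If TQL is stopped and either port is reported as in use, picking different values for `applicationConnectors` and `adminConnectors` avoids the conflict.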
### Database
TQL uses a SQL database to store its metadata about Projects, Timelines, Queries,
and Resultsets. By default, TQL uses a file-based [H2 Database](http://www.h2database.com/html/main.html)
with the following configuration:
```
database:
  nanml.standalone.db.instance_type: h2-disk
  nanml.standalone.db.h2-disk.directory: ~/.tql/db
```
However, you may wish to use a persistent MySQL datastore instead. To do so,
use the following configuration:
```
database:
  nanml.db.instance_type: mysql
  nanml.mysql.db.host: 172.25.0.99
```