Training ML Models

In this guide, we use TQL to train a simple model that predicts the probability of winning an advertising auction, given the ad size. We then publish that model back into TQL and use it in a query that displays model predictions next to the ad size.

Inspect the bid TimeSeries

Use a simple SELECT bid.* statement to inspect the fields and the values of the bid time series:

from zeenk.tql import *

select('bid.*').from_events('lethe4').limit(10)

------------------------------------------------------------------
           Version: 20.1.17-SNAPSHOT
 Version Timestamp: 2021-08-27 20:56:17
       Version Age: 3 days, 13 hours, 38 minutes, 14 seconds
   Filesystem Root: /Users/zkozick/.tql/files
 Working Directory: /Users/zkozick/.tql
Configuration File: /Users/zkozick/.tql/tql-conf.yml
       Api Gateway: http://localhost:9000
    Service Status: Icarus: ONLINE, Daedalus: ONLINE
    Service Uptime: 18 hours, 6 minutes, 13 seconds
------------------------------------------------------------------

Query results:

partition "_default"

	bid_ad_size	bid_bid	bid_event_time	bid_ghosted	bid_request_id	bid_user_id	bid_won
0	None	None	None	None	None	None	None
1	big	1.0	2021-01-02 00:02:36	false	r13260	u252	true
2	big	1.0	2021-01-02 00:16:41	false	r13376	u252	true
3	small	1.0	2021-01-02 00:30:55	false	r13509	u252	true
4	None	None	None	None	None	None	None
5	small	1.0	2021-01-02 01:04:01	false	r13737	u252	true
6	big	1.0	2021-01-02 01:37:26	false	r13985	u252	true
7	None	None	None	None	None	None	None
8	big	1.0	2021-01-02 02:18:30	false	r14310	u252	true
9	big	1.0	2021-01-02 03:01:48	false	r14607	u252	true

query produced 10 rows x 7 columns in 0.51 seconds

Create a Training ResultSet

Let’s create a ResultSet from the fields of the bid event. In this TQL query, we cast the columns to machine learning types using the functions label(), weight(), and categorical(). There is also a numerical() function, but we do not use it in this model because the bid event object has no fields that could be used as numerical features.

An important aspect of creating machine learning ResultSets is the partition_by() statement. partition_by() takes a TQL expression that emits a string key, drawn from a small set of values, for each row. The output dataframe is split into partitions according to the value of that key. If the expression depends only on an attribute of the event, the train/test split is reproducible across runs. The query below hashes bid.user_id so that roughly 20% of users land in the "test" partition and the rest in "train":

result_set = select(
    label('if(bid.won, 1, 0)'),
    weight(1.0),
    categorical('lower(bid.ad_size)', 'ad_size')
)\
.from_events('lethe4')\
.where('type == "bid"')\
.partition_by('IF(HASH(bid.user_id) % 10 < 2, "test", "train")')\
.options(expand_numerical_features=True, fill_na=True)\
.submit()

result_set

Query results:

partition "test"

	_label	_weight	ad_size
0	1.0	1.0	big
1	1.0	1.0	big
2	1.0	1.0	small
3	1.0	1.0	small
4	1.0	1.0	big
...	...	...	...
33554	1.0	1.0	small
33555	1.0	1.0	small
33556	1.0	1.0	big
33557	1.0	1.0	small
33558	1.0	1.0	small

33559 rows × 3 columns

partition "train"

	_label	_weight	ad_size
0	1.0	1.0	small
1	1.0	1.0	small
2	1.0	1.0	big
3	1.0	1.0	small
4	1.0	1.0	small
...	...	...	...
138413	1.0	1.0	small
138414	1.0	1.0	small
138415	1.0	1.0	big
138416	1.0	1.0	small
138417	1.0	1.0	small

138418 rows × 3 columns

query produced 171977 rows x 3 columns in 8.48 seconds
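
The hash-based split used in partition_by() above can be mimicked outside TQL to see why it is reproducible. The following is a minimal sketch in plain Python, not TQL's implementation: partition_for is a hypothetical helper, and MD5 is an assumed stand-in for TQL's HASH() function, chosen only because it returns the same value on every run.

```python
import hashlib

def partition_for(user_id, test_buckets=2, total_buckets=10):
    """Assign a user to "test" or "train" from a stable hash of the id."""
    # MD5 is an assumption standing in for TQL's HASH(); any stable hash works.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets
    return "test" if bucket < test_buckets else "train"

# Hypothetical user ids, shaped like the ones in the bid events.
user_ids = [f"u{i}" for i in range(1000)]
assignments = {u: partition_for(u) for u in user_ids}
test_share = sum(1 for p in assignments.values() if p == "test") / len(user_ids)
print(f"share of users in test: {test_share:.3f}")
```

Because the key depends only on the user id, every event of a given user lands in the same partition, and rerunning the query does not reshuffle the split.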

Estimate a Model

We now pass the ResultSet directly into a TQL Estimator to produce a trained model. TQL provides a convenience wrapper for H2O; drop-in support for other training packages can be made available upon request.

When training, a TuningConf and/or BootstrapConf may be provided. TuningConf enables hyperparameter tuning during the training session. BootstrapConf trains an ensemble of models with perturbed weights, which enables the estimation of standard errors via a Bayesian bootstrap technique.
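
To give a feel for the Bayesian bootstrap idea, here is a minimal sketch in plain Python. It is not the code behind BootstrapConf: the "model" is just a weighted average of the labels, the labels are made-up data, and bayesian_bootstrap_means is a hypothetical helper. Each replicate reweights the rows with Dirichlet(1, ..., 1) weights and re-estimates; the spread of the estimates serves as the standard error.

```python
import random
import statistics

def bayesian_bootstrap_means(labels, replicates, seed=0):
    """Re-estimate the label mean under randomly perturbed row weights."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        # Exp(1) draws divided by their sum form a Dirichlet(1, ..., 1) sample.
        weights = [rng.expovariate(1.0) for _ in labels]
        total = sum(weights)
        means.append(sum(w * y for w, y in zip(weights, labels)) / total)
    return means

# Made-up labels: 1400 wins and 600 losses (win rate 0.7).
labels = [1] * 1400 + [0] * 600
means = bayesian_bootstrap_means(labels, replicates=200)
std_error = statistics.stdev(means)
print(f"estimate={statistics.mean(means):.4f} std_error={std_error:.4f}")
```

With only a handful of replicates the standard deviation across them is itself noisy, so a larger number of bootstraps gives steadier standard errors.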

The estimator emits a PublishedModel instance, which we will then inspect.

from zeenk.tql.modeling import H2OEstimator, TuningConf, BootstrapConf

estimator = H2OEstimator('linear', result_set, 'glm')
estimator.get_tag()
model = estimator.train(
    tuning_conf=TuningConf(init_pts=10, iterations=10),
    bootstrap_conf=BootstrapConf(bootstraps=10)
)

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_275"; OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_275-b01); OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.275-b01, mixed mode)
  Starting server from /Users/zkozick/dev/.virtualenvs/tqldev/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3
  JVM stdout: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.out
  JVM stderr: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
|   iter    |  target   |   alpha   |
-------------------------------------
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  1        | -0.000645 | -9.237    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  2        | -0.000629 | -2.201    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  3        | -0.000639 | -5.616    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  4        | -0.000629 | -2.765    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  5        | -0.000629 | -0.2201   |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  6        | -0.000629 | -4.615    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  7        | -0.000629 | -4.989    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  8        | -0.000645 | -9.279    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  9        | -0.000644 | -7.316    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  10       | -0.000629 | -5.001    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  11       | -0.000629 | -1.149    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  12       | -0.000629 | -3.67     |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  13       | -0.000629 | -1.658    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  14       | -0.000629 | -0.6461   |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  15       | -0.000629 | -4.11     |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  16       | -0.000629 | -3.209    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  17       | -0.000629 | -0.000322 |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  18       | -0.000629 | -2.479    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  19       | -0.000629 | -1.405    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  20       | -0.000629 | -4.358    |
=====================================
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%

H2O_cluster_uptime:	02 secs
H2O_cluster_timezone:	America/Denver
H2O_data_parsing_timezone:	UTC
H2O_cluster_version:	3.32.1.6
H2O_cluster_version_age:	11 days
H2O_cluster_name:	H2O_from_python_zkozick_mqyqyc
H2O_cluster_total_nodes:	1
H2O_cluster_free_memory:	3.556 Gb
H2O_cluster_total_cores:	8
H2O_cluster_allowed_cores:	8
H2O_cluster_status:	accepting new members, healthy
H2O_connection_url:	http://127.0.0.1:54321
H2O_connection_proxy:	{"http": null, "https": null}
H2O_internal_security:	False
H2O_API_Extensions:	Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version:	3.7.3 final

Warning! Using fewer than 20 bootstraps can lead standard error estimates to vary by +/- 50%.
/Users/zkozick/dev/nanigans/noumena/python/zeenk-tql/zeenk/tql/modeling/h2o/utilities.py:381: RuntimeWarning: invalid value encountered in true_divide
  incr_share = incr_pred / pred
AUCc_realized calculation failed. Setting to 0.5.

Summarize the Model Training Session

model.summarize().coefficients()

GLM Coefficients
	Coefficient	Standard Error	t-Stat
Intercept	0.712494	0.000628	1134.43
ad_size.big	0.000000	0.000730	0.00
ad_size.small	0.000000	0.000719	0.00

model.summarize().model_metrics()

Model Metrics
	Value (Train+Test)	Standard Error (Train+Test)	t-stat (Train+Test)	Lower 95 (Train+Test)	Upper 95 (Train+Test)	Value (Train)	Standard Error (Train)	t-stat (Train)	Lower 95 (Train)	Upper 95 (Train)	Value (Test)	Standard Error (Test)	t-stat (Test)	Lower 95 (Test)	Upper 95 (Test)
r2	-0.000024	0.000013	-1.82	-0.000051	0.000002	0.000000	0.000008	0.00	-0.000015	0.000015	-0.000630	0.000231	-2.73	-0.001083	-0.000177
r2_incr	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
r2_het	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
auc	0.500000	0.000927	539.10	0.498182	0.501818	0.500000	0.000820	610.04	0.498394	0.501606	0.500000	0.001569	318.71	0.496925	0.503075
aucc	0.500000	0.000927	539.10	0.498182	0.501818	0.500000	0.000820	610.04	0.498394	0.501606	0.500000	0.001569	318.71	0.496925	0.503075
auc_incr	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
aucc_incr	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
aucc_realized	0.500000	0.000000	0.00	0.500000	0.500000	0.500000	0.000000	0.00	0.500000	0.500000	0.500000	0.000000	0.00	0.500000	0.500000
auc_het	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
aucc_het	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
label_avg	0.710252	0.000604	1175.83	0.709068	0.711436	0.712494	0.000628	1134.40	0.711263	0.713725	0.701004	0.001918	365.47	0.697245	0.704764
pred_avg	0.712494	0.000628	1134.31	0.711263	0.713725	0.712494	0.000628	1134.40	0.711263	0.713725	0.712494	0.000628	1133.94	0.711263	0.713726
avg_error	-0.002242	0.000407	-5.51	-0.003040	-0.001444	0.000000	0.000000	0.86	-0.000000	0.000000	-0.011490	0.002071	-5.55	-0.015549	-0.007431
realized_avg	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan
incr_rate_avg	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
incr_rate_var	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
label_sum	122147.000000	316.913695	385.43	121525.860572	122768.139428	98622.000000	290.989622	338.92	98051.670821	99192.329179	23525.000000	124.511887	188.94	23280.961186	23769.038814
pred_sum	122532.587481	311.074534	393.90	121922.892599	123142.282364	98622.000000	290.989622	338.92	98051.670821	99192.329179	23910.587481	127.164609	188.03	23661.349429	24159.825534
conv_rate	122532.587481	311.074534	393.90	121922.892599	123142.282364	98622.000000	290.989622	338.92	98051.670821	99192.329179	23910.587481	127.164609	188.03	23661.349429	24159.825534
incr_rate	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
realized	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan
realized_raw	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan	0.000000	nan	nan	nan	nan
het_rate	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
het_rate_abs	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
het_rate_sq	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
samples	171977.000000	0.000000	0.00	171977.000000	171977.000000	138418.000000	0.000000	0.00	138418.000000	138418.000000	33559.000000	0.000000	0.00	33559.000000	33559.000000
weight	171977.000000	433.745190	396.49	171126.875049	172827.124951	138418.000000	403.602094	342.96	137626.954432	139209.045568	33559.000000	179.957884	186.48	33206.289029	33911.710971
positives	122147.000000	0.000000	0.00	122147.000000	122147.000000	98622.000000	0.000000	0.00	98622.000000	98622.000000	23525.000000	0.000000	0.00	23525.000000	23525.000000
negatives	49830.000000	0.000000	0.00	49830.000000	49830.000000	39796.000000	0.000000	0.00	39796.000000	39796.000000	10034.000000	0.000000	0.00	10034.000000	10034.000000
rss	35392.719304	102.363402	345.76	35192.090722	35593.347886	28354.412808	94.152122	301.16	28169.878039	28538.947576	7038.306496	52.383240	134.36	6935.637232	7140.975760
mse	0.205799	0.000254	809.74	0.205301	0.206297	0.204846	0.000267	768.02	0.204324	0.205369	0.209729	0.000816	256.98	0.208130	0.211329
mse_base	0.205799	0.000254	809.74	0.205301	0.206297	0.204846	0.000267	768.02	0.204324	0.205369	0.209729	0.000816	256.98	0.208130	0.211329
mse_hom	0.205799	0.000254	809.74	0.205301	0.206297	0.204846	0.000267	768.02	0.204324	0.205369	0.209729	0.000816	256.98	0.208130	0.211329
mse_incr	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
mse_het	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000	0.000000	0.000000	0.00	0.000000	0.000000
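
Several rows of the metrics table can be checked by hand, because both ad_size coefficients are zero and the model therefore predicts the same value for every row. The sketch below recomputes label_avg, mse, r2, and avg_error from the positives and negatives counts in the table. The formulas are the standard textbook definitions, assumed (not confirmed) to match how TQL computes them; constant_model_metrics is a hypothetical helper.

```python
# Counts copied from the "positives" and "negatives" rows of the metrics table.
train_pos, train_neg = 98622, 39796
test_pos, test_neg = 23525, 10034

def constant_model_metrics(pos, neg, prediction):
    """Metrics of a model that predicts the same value for every row."""
    n = pos + neg
    label_avg = pos / n
    # Squared error is (1 - prediction)^2 on wins and prediction^2 on losses.
    mse = (pos * (1 - prediction) ** 2 + neg * prediction ** 2) / n
    variance = label_avg * (1 - label_avg)  # variance of a 0/1 label
    return {
        "label_avg": label_avg,
        "mse": mse,
        "r2": 1 - mse / variance,
        "avg_error": label_avg - prediction,
    }

# An intercept-only fit predicts the win rate of the training partition.
prediction = train_pos / (train_pos + train_neg)
train = constant_model_metrics(train_pos, train_neg, prediction)
test = constant_model_metrics(test_pos, test_neg, prediction)
for name, metrics in (("train", train), ("test", test)):
    print(name, {key: round(value, 6) for key, value in metrics.items()})
```

The test partition's slightly negative r2 arises because the training win rate (about 0.7125) differs from the test win rate (about 0.7010), so the constant prediction does marginally worse on test than the test partition's own mean would.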

Publish the Model into TQL

The final step is to upload the trained model back into TQL so that it can be used to make predictions. To do that, we call model.publish() with the name we want to use to reference the model in queries:

model.publish('my_winrate_model')

The model.publish() function returns the unique ID of the model upon successful publish.

Make Model Predictions with TQL

Now that the model has been published into TQL, it can be used with the special TQL expression function PREDICT("model_name") to make predictions against timeline events:

select(
  col('bid.ad_size', 'ad_size'), 
  col('PREDICT("my_winrate_model")', 'predicting_winrate')
)\
.from_events('lethe4')\
.where('type == "bid"')\
.limit(3)

Query results:

partition "_default"

	ad_size	predicting_winrate
0	big	0.7124940397925126
1	big	0.7124940397925126
2	small	0.7124940397925126

query produced 3 rows x 2 columns in 0.43 seconds

Because the model was trained on features extracted from events of type bid, it only makes sense to make predictions on events of type bid, which is why the query filters on type == "bid".
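
The identical prediction on every row is consistent with the coefficient table shown earlier: both ad_size coefficients are 0, so only the intercept contributes. The following is a minimal sketch of that arithmetic, assuming the 'linear' GLM applies an identity link (prediction = intercept + matching coefficient). predict_winrate is a hypothetical helper, not part of TQL, and the intercept is the rounded value from the table.

```python
# Values copied from the GLM Coefficients table (rounded as displayed there).
INTERCEPT = 0.712494
COEFFICIENTS = {"ad_size.big": 0.0, "ad_size.small": 0.0}

def predict_winrate(ad_size):
    """Intercept plus the coefficient of the matching one-hot feature."""
    # Assumption: identity link, and unseen categories contribute nothing.
    key = f"ad_size.{ad_size.lower()}"
    return INTERCEPT + COEFFICIENTS.get(key, 0.0)

predictions = {size: predict_winrate(size) for size in ("big", "small")}
print(predictions)
```

In other words, this particular model has learned nothing from ad size and simply returns the training win rate for every bid.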