Training ML Models

In this guide, we use TQL to train a simple model that predicts the probability of winning an advertising auction, given the ad size. We then publish that model back into TQL and use it in a query that displays model predictions next to the ad size.

Inspect the bid TimeSeries

Use a simple SELECT bid.* statement to inspect the fields and the values of the bid time series:

from zeenk.tql import *

select('bid.*').from_events('lethe4').limit(10)
------------------------------------------------------------------
           Version: 20.1.17-SNAPSHOT
 Version Timestamp: 2021-08-27 20:56:17
       Version Age: 3 days, 13 hours, 38 minutes, 14 seconds
   Filesystem Root: /Users/zkozick/.tql/files
 Working Directory: /Users/zkozick/.tql
Configuration File: /Users/zkozick/.tql/tql-conf.yml
       Api Gateway: http://localhost:9000
    Service Status: Icarus: ONLINE, Daedalus: ONLINE
    Service Uptime: 18 hours, 6 minutes, 13 seconds
------------------------------------------------------------------

Query results:

partition "_default"
bid_ad_size bid_bid bid_event_time bid_ghosted bid_request_id bid_user_id bid_won
0 None None None None None None None
1 big 1.0 2021-01-02 00:02:36 false r13260 u252 true
2 big 1.0 2021-01-02 00:16:41 false r13376 u252 true
3 small 1.0 2021-01-02 00:30:55 false r13509 u252 true
4 None None None None None None None
5 small 1.0 2021-01-02 01:04:01 false r13737 u252 true
6 big 1.0 2021-01-02 01:37:26 false r13985 u252 true
7 None None None None None None None
8 big 1.0 2021-01-02 02:18:30 false r14310 u252 true
9 big 1.0 2021-01-02 03:01:48 false r14607 u252 true
query produced 10 rows x 7 columns in 0.51 seconds

Create a Training ResultSet

Let’s create a ResultSet from the fields of the bid event. In this TQL query, we cast the columns to machine learning types using the functions label(), weight(), and categorical(). There is also a numerical() function, but we do not use it in this model because the bid event object has no fields that could be used as numerical features.
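To make the role of categorical() more concrete, here is a minimal sketch in plain Python of one-hot expansion, a common way to feed a categorical column to a linear model. It is an illustration only, not TQL's implementation: the one_hot helper and the sample rows are hypothetical, and the '<column>.<level>' naming is an assumption inferred from the ad_size.big and ad_size.small entries in the coefficient table later in this guide.

```python
def one_hot(rows, column):
    """Expand a categorical column into 0/1 indicator columns named '<column>.<level>'."""
    # Collect the distinct levels once so every row gets the same set of columns.
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        encoded.append(
            {f"{column}.{level}": int(row[column] == level) for level in levels}
        )
    return encoded


# Hypothetical rows shaped like the bid events shown earlier.
rows = [{"ad_size": "big"}, {"ad_size": "small"}, {"ad_size": "big"}]
encoded = one_hot(rows, "ad_size")
print(encoded[0])  # {'ad_size.big': 1, 'ad_size.small': 0}
```

Each row ends up with exactly one indicator set to 1, so a linear model can learn a separate coefficient per ad size.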

An important aspect of creating machine learning ResultSets is the partition_by() statement. partition_by() takes a TQL expression that emits one of a fixed set of string keys for each row, and the output dataframe is split into partitions according to the key for that row. If the expression depends only on an attribute of the event, as in the query below, the train/test split is reproducible across runs.

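result_set = select(
    # Target: 1 if the bid won the auction, otherwise 0
    label('if(bid.won, 1, 0)'),
    # Every row carries the same weight
    weight(1.0),
    # Single categorical feature: lower-cased ad size, output as column ad_size
    categorical('lower(bid.ad_size)', 'ad_size')
)\
.from_events('lethe4')\
.where('type == "bid"')\
.partition_by('IF(HASH(bid.user_id) % 10 < 2, "test", "train")')\
.options(expand_numerical_features=True, fill_na=True)\
.submit()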

result_set

Query results:

partition "test"
_label _weight ad_size
0 1.0 1.0 big
1 1.0 1.0 big
2 1.0 1.0 small
3 1.0 1.0 small
4 1.0 1.0 big
... ... ... ...
33554 1.0 1.0 small
33555 1.0 1.0 small
33556 1.0 1.0 big
33557 1.0 1.0 small
33558 1.0 1.0 small

33559 rows × 3 columns

partition "train"
_label _weight ad_size
0 1.0 1.0 small
1 1.0 1.0 small
2 1.0 1.0 big
3 1.0 1.0 small
4 1.0 1.0 small
... ... ... ...
138413 1.0 1.0 small
138414 1.0 1.0 small
138415 1.0 1.0 big
138416 1.0 1.0 small
138417 1.0 1.0 small

138418 rows × 3 columns

query produced 171977 rows x 3 columns in 8.48 seconds
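
The partitioning expression used above can be mimicked outside TQL. The following is a minimal sketch in plain Python, assuming an MD5 digest as a stand-in for TQL's HASH() function (whose actual algorithm is not described in this guide). The partition_for helper and the generated user IDs are hypothetical.

```python
import hashlib


def partition_for(user_id, test_buckets=2, total_buckets=10):
    """Return 'test' or 'train' from a stable hash of user_id."""
    # MD5 is used only because it is deterministic across processes and
    # machines; it is an assumption, not the hash TQL uses.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets
    return "test" if bucket < test_buckets else "train"


# Made-up user IDs, for illustration only.
user_ids = [f"u{i}" for i in range(10000)]
assignments = {uid: partition_for(uid) for uid in user_ids}
test_share = sum(1 for p in assignments.values() if p == "test") / len(user_ids)
print(f"test share: {test_share:.3f}")
```

Two properties matter here. The key depends only on the user ID, so every event for a given user lands in the same partition on every run. And a stable digest is used instead of Python's built-in hash(), which is salted per process for strings and would therefore change the split between runs. In the query output above, the test partition holds 33559 of 171977 rows, about 19.5%, close to the 2-in-10 share the expression targets.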

Estimate a Model

We now pass the ResultSet directly into a TQL Estimator to produce a trained model. TQL provides a convenience wrapper for H2O, and drop-in support for other training packages can be made available upon request.

When training, a TuningConf and/or a BootstrapConf may be provided. TuningConf enables hyperparameter tuning during the training session. BootstrapConf trains an ensemble of models with perturbed weights, which enables the estimation of standard errors via a Bayesian bootstrap technique.
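
As a rough illustration of the Bayesian bootstrap idea, the sketch below estimates the standard error of a simple statistic (a weighted mean) by re-weighting the same rows many times. It is a plain-Python sketch under stated assumptions, not what BootstrapConf does internally: the guide does not specify how TQL draws its weights, the bayesian_bootstrap_se helper is hypothetical, the data is synthetic, and a weighted mean stands in for refitting a full model on each replicate.

```python
import random
import statistics


def bayesian_bootstrap_se(values, bootstraps=200, seed=0):
    """Estimate the standard error of the mean via the Bayesian bootstrap."""
    rng = random.Random(seed)
    replicate_means = []
    for _ in range(bootstraps):
        # Independent Exp(1) draws, once normalised by their sum, are
        # Dirichlet(1, ..., 1) weights over the rows.
        weights = [rng.expovariate(1.0) for _ in values]
        total = sum(weights)
        replicate_means.append(
            sum(w * v for w, v in zip(weights, values)) / total
        )
    # The spread of the statistic across replicates is the standard error.
    return statistics.stdev(replicate_means)


# Synthetic 0/1 labels with a win rate near 0.7 (made-up data).
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.7 else 0 for _ in range(2000)]

se = bayesian_bootstrap_se(labels)
analytic = (statistics.pvariance(labels) / len(labels)) ** 0.5
print(f"bootstrap SE = {se:.4f}, analytic SE = {analytic:.4f}")
```

The number of replicates matters: the training log in this guide warns that using fewer than 20 bootstraps can lead standard error estimates to vary by +/- 50%.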

The estimator emits a PublishedModel instance, which we will then inspect.

from zeenk.tql.modeling import H2OEstimator, TuningConf, BootstrapConf

estimator = H2OEstimator('linear', result_set, 'glm')
estimator.get_tag()
model = estimator.train(
    tuning_conf=TuningConf(init_pts=10, iterations=10),
    bootstrap_conf=BootstrapConf(bootstraps=10)
)
Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_275"; OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_275-b01); OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.275-b01, mixed mode)
  Starting server from /Users/zkozick/dev/.virtualenvs/tqldev/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3
  JVM stdout: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.out
  JVM stderr: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
|   iter    |  target   |   alpha   |
-------------------------------------
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  1        | -0.000645 | -9.237    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  2        | -0.000629 | -2.201    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  3        | -0.000639 | -5.616    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  4        | -0.000629 | -2.765    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  5        | -0.000629 | -0.2201   |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  6        | -0.000629 | -4.615    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  7        | -0.000629 | -4.989    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  8        | -0.000645 | -9.279    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  9        | -0.000644 | -7.316    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  10       | -0.000629 | -5.001    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  11       | -0.000629 | -1.149    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  12       | -0.000629 | -3.67     |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  13       | -0.000629 | -1.658    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  14       | -0.000629 | -0.6461   |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  15       | -0.000629 | -4.11     |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  16       | -0.000629 | -3.209    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  17       | -0.000629 | -0.000322 |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  18       | -0.000629 | -2.479    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  19       | -0.000629 | -1.405    |
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
|  20       | -0.000629 | -4.358    |
=====================================
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
H2O_cluster_uptime: 02 secs
H2O_cluster_timezone: America/Denver
H2O_data_parsing_timezone: UTC
H2O_cluster_version: 3.32.1.6
H2O_cluster_version_age: 11 days
H2O_cluster_name: H2O_from_python_zkozick_mqyqyc
H2O_cluster_total_nodes: 1
H2O_cluster_free_memory: 3.556 Gb
H2O_cluster_total_cores: 8
H2O_cluster_allowed_cores: 8
H2O_cluster_status: accepting new members, healthy
H2O_connection_url: http://127.0.0.1:54321
H2O_connection_proxy: {"http": null, "https": null}
H2O_internal_security: False
H2O_API_Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version: 3.7.3 final
Warning! Using fewer than 20 bootstraps can lead standard error estimates to vary by +/- 50%.
/Users/zkozick/dev/nanigans/noumena/python/zeenk-tql/zeenk/tql/modeling/h2o/utilities.py:381: RuntimeWarning: invalid value encountered in true_divide
  incr_share = incr_pred / pred
AUCc_realized calculation failed. Setting to 0.5.

Summarize the Model Training Session

model.summarize().coefficients()
GLM Coefficients
Coefficient Standard Error t-Stat
Intercept 0.712494 0.000628 1134.43
ad_size.big 0.000000 0.000730 0.00
ad_size.small 0.000000 0.000719 0.00
model.summarize().model_metrics()
Model Metrics
Value (Train+Test) Standard Error (Train+Test) t-stat (Train+Test) Lower 95 (Train+Test) Upper 95 (Train+Test) Value (Train) Standard Error (Train) t-stat (Train) Lower 95 (Train) Upper 95 (Train) Value (Test) Standard Error (Test) t-stat (Test) Lower 95 (Test) Upper 95 (Test)
r2 -0.000024 0.000013 -1.82 -0.000051 0.000002 0.000000 0.000008 0.00 -0.000015 0.000015 -0.000630 0.000231 -2.73 -0.001083 -0.000177
r2_incr 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
r2_het 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
auc 0.500000 0.000927 539.10 0.498182 0.501818 0.500000 0.000820 610.04 0.498394 0.501606 0.500000 0.001569 318.71 0.496925 0.503075
aucc 0.500000 0.000927 539.10 0.498182 0.501818 0.500000 0.000820 610.04 0.498394 0.501606 0.500000 0.001569 318.71 0.496925 0.503075
auc_incr 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
aucc_incr 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
aucc_realized 0.500000 0.000000 0.00 0.500000 0.500000 0.500000 0.000000 0.00 0.500000 0.500000 0.500000 0.000000 0.00 0.500000 0.500000
auc_het 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
aucc_het 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
label_avg 0.710252 0.000604 1175.83 0.709068 0.711436 0.712494 0.000628 1134.40 0.711263 0.713725 0.701004 0.001918 365.47 0.697245 0.704764
pred_avg 0.712494 0.000628 1134.31 0.711263 0.713725 0.712494 0.000628 1134.40 0.711263 0.713725 0.712494 0.000628 1133.94 0.711263 0.713726
avg_error -0.002242 0.000407 -5.51 -0.003040 -0.001444 0.000000 0.000000 0.86 -0.000000 0.000000 -0.011490 0.002071 -5.55 -0.015549 -0.007431
realized_avg 0.000000 nan nan nan nan 0.000000 nan nan nan nan 0.000000 nan nan nan nan
incr_rate_avg 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
incr_rate_var 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
label_sum 122147.000000 316.913695 385.43 121525.860572 122768.139428 98622.000000 290.989622 338.92 98051.670821 99192.329179 23525.000000 124.511887 188.94 23280.961186 23769.038814
pred_sum 122532.587481 311.074534 393.90 121922.892599 123142.282364 98622.000000 290.989622 338.92 98051.670821 99192.329179 23910.587481 127.164609 188.03 23661.349429 24159.825534
conv_rate 122532.587481 311.074534 393.90 121922.892599 123142.282364 98622.000000 290.989622 338.92 98051.670821 99192.329179 23910.587481 127.164609 188.03 23661.349429 24159.825534
incr_rate 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
realized 0.000000 nan nan nan nan 0.000000 nan nan nan nan 0.000000 nan nan nan nan
realized_raw 0.000000 nan nan nan nan 0.000000 nan nan nan nan 0.000000 nan nan nan nan
het_rate 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
het_rate_abs 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
het_rate_sq 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
samples 171977.000000 0.000000 0.00 171977.000000 171977.000000 138418.000000 0.000000 0.00 138418.000000 138418.000000 33559.000000 0.000000 0.00 33559.000000 33559.000000
weight 171977.000000 433.745190 396.49 171126.875049 172827.124951 138418.000000 403.602094 342.96 137626.954432 139209.045568 33559.000000 179.957884 186.48 33206.289029 33911.710971
positives 122147.000000 0.000000 0.00 122147.000000 122147.000000 98622.000000 0.000000 0.00 98622.000000 98622.000000 23525.000000 0.000000 0.00 23525.000000 23525.000000
negatives 49830.000000 0.000000 0.00 49830.000000 49830.000000 39796.000000 0.000000 0.00 39796.000000 39796.000000 10034.000000 0.000000 0.00 10034.000000 10034.000000
rss 35392.719304 102.363402 345.76 35192.090722 35593.347886 28354.412808 94.152122 301.16 28169.878039 28538.947576 7038.306496 52.383240 134.36 6935.637232 7140.975760
mse 0.205799 0.000254 809.74 0.205301 0.206297 0.204846 0.000267 768.02 0.204324 0.205369 0.209729 0.000816 256.98 0.208130 0.211329
mse_base 0.205799 0.000254 809.74 0.205301 0.206297 0.204846 0.000267 768.02 0.204324 0.205369 0.209729 0.000816 256.98 0.208130 0.211329
mse_hom 0.205799 0.000254 809.74 0.205301 0.206297 0.204846 0.000267 768.02 0.204324 0.205369 0.209729 0.000816 256.98 0.208130 0.211329
mse_incr 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
mse_het 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000 0.000000 0.000000 0.00 0.000000 0.000000
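
Several rows of the metrics table can be checked by hand, which helps when reading it. Both ad_size coefficients are zero, so the model predicts the same value for every row. The sketch below assumes that constant is the training win rate and that avg_error is the average label minus the average prediction; both assumptions are inferred from the numbers in the table rather than from TQL documentation. The constant_mse helper is hypothetical.

```python
# Counts copied from the "positives" and "negatives" rows of the metrics table.
train_pos, train_neg = 98622, 39796
test_pos, test_neg = 23525, 10034

# Assumption: with zero feature coefficients, every prediction equals the
# training win rate.
pred = train_pos / (train_pos + train_neg)


def constant_mse(pos, neg, prediction):
    """Mean squared error of one constant prediction against 0/1 labels."""
    return (pos * (1 - prediction) ** 2 + neg * prediction ** 2) / (pos + neg)


mse_train = constant_mse(train_pos, train_neg, pred)
mse_test = constant_mse(test_pos, test_neg, pred)
# Assumption: avg_error is label_avg minus pred_avg.
avg_error_test = test_pos / (test_pos + test_neg) - pred

print(f"pred_avg         = {pred:.6f}")
print(f"mse (Train)      = {mse_train:.6f}")
print(f"mse (Test)       = {mse_test:.6f}")
print(f"avg_error (Test) = {avg_error_test:.6f}")
```

These values agree with the pred_avg, mse, and avg_error entries above to the six decimals shown, which is consistent with a model that learned only the overall win rate and nothing from ad_size. The auc of 0.5 points the same way.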

Publish the Model into TQL

The final step is to upload the trained model back into TQL so it may be used to make predictions. To do that we use model.publish(), providing a name we would like to use to reference the model in queries:

model.publish('my_winrate_model')
381

The model.publish() function returns the unique ID of the model upon successful publish.

Make Model Predictions with TQL

Now that the model has been published into TQL, it can be used with the TQL expression function PREDICT("model_name"), where the argument is the name given at publish time, to make predictions against timeline events:

select(
  col('bid.ad_size', 'ad_size'), 
  col('PREDICT("my_winrate_model")', 'predicting_winrate')
)\
.from_events('lethe4')\
.where('type == "bid"')\
.limit(3)

Query results:

partition "_default"
ad_size predicting_winrate
0 big 0.7124940397925126
1 big 0.7124940397925126
2 small 0.7124940397925126
query produced 3 rows x 2 columns in 0.43 seconds

Because the model was trained on features extracted from events of type bid, it only makes sense to make predictions on events of type bid, which is why the query above filters with where('type == "bid"').
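
The identical prediction for big and small ads follows from the coefficient table shown earlier: both ad_size coefficients are zero, so only the intercept contributes. The sketch below scores a bid by hand, assuming an identity link (consistent with the intercept matching the average prediction in the metrics table) and one-hot indicators for ad_size. The predict_winrate helper is hypothetical and is not how PREDICT() is implemented.

```python
# Coefficients copied from the GLM coefficient table (rounded to six decimals).
coefficients = {"Intercept": 0.712494, "ad_size.big": 0.0, "ad_size.small": 0.0}


def predict_winrate(ad_size, coefs=coefficients):
    """Score one bid with an identity-link linear model over one-hot ad_size."""
    score = coefs["Intercept"]
    # Only the indicator for the observed level is 1, so only its coefficient
    # is added; a level missing from the table contributes nothing.
    score += coefs.get(f"ad_size.{ad_size.lower()}", 0.0)
    return score


for size in ("big", "small"):
    print(size, predict_winrate(size))
```

Both calls return 0.712494, which is the query's 0.7124940397925126 at the table's precision. A model with a useful ad_size signal would show non-zero coefficients and different predictions per ad size.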