Training ML Models
In this guide, we use TQL to train a simple model that predicts the probability of winning an advertising auction, given the ad size. We then publish that model back into TQL and use it in a query that displays model predictions next to the ad size.
Inspect The bid TimeSeries
Use a simple SELECT bid.* statement to inspect the fields and values of the bid time series:
from zeenk.tql import *
select('bid.*').from_events('lethe4').limit(10)
------------------------------------------------------------------
Version: 20.1.17-SNAPSHOT
Version Timestamp: 2021-08-27 20:56:17
Version Age: 3 days, 13 hours, 38 minutes, 14 seconds
Filesystem Root: /Users/zkozick/.tql/files
Working Directory: /Users/zkozick/.tql
Configuration File: /Users/zkozick/.tql/tql-conf.yml
Api Gateway: http://localhost:9000
Service Status: Icarus: ONLINE, Daedalus: ONLINE
Service Uptime: 18 hours, 6 minutes, 13 seconds
------------------------------------------------------------------
Query results:
partition "_default"
| bid_ad_size | bid_bid | bid_event_time | bid_ghosted | bid_request_id | bid_user_id | bid_won |
---|---|---|---|---|---|---|---|
0 | None | None | None | None | None | None | None |
1 | big | 1.0 | 2021-01-02 00:02:36 | false | r13260 | u252 | true |
2 | big | 1.0 | 2021-01-02 00:16:41 | false | r13376 | u252 | true |
3 | small | 1.0 | 2021-01-02 00:30:55 | false | r13509 | u252 | true |
4 | None | None | None | None | None | None | None |
5 | small | 1.0 | 2021-01-02 01:04:01 | false | r13737 | u252 | true |
6 | big | 1.0 | 2021-01-02 01:37:26 | false | r13985 | u252 | true |
7 | None | None | None | None | None | None | None |
8 | big | 1.0 | 2021-01-02 02:18:30 | false | r14310 | u252 | true |
9 | big | 1.0 | 2021-01-02 03:01:48 | false | r14607 | u252 | true |
Create a Training ResultSet
Let’s create a ResultSet using the fields of the bid event. For this TQL query, we cast the columns to machine learning types using the functions label(), weight(), and categorical(). Note that there is also numerical(); however, we do not use it in this model because the bid event object has no fields that could be used as numerical features.
An important aspect of creating machine learning ResultSets is the partition_by() statement. partition_by() takes a TQL expression that emits, for each row, a string from a small set of partition names. The output dataframe is split into partitions based on the value of that partition key expression for each row. If the expression is based on an attribute of the event, this produces reproducible train/test splits across multiple runs, as follows.
result_set = select(
    label('if(bid.won, 1, 0)'),
    weight(1.0),
    categorical('lower(bid.ad_size)', 'ad_size')
)\
    .from_events('lethe4')\
    .where('type == "bid"')\
    .partition_by('IF(HASH(bid.user_id) % 10 < 2, "test", "train")')\
    .options(expand_numerical_features=True, fill_na=True)\
    .submit()
result_set
Query results:
partition "test"
| _label | _weight | ad_size |
---|---|---|---|
0 | 1.0 | 1.0 | big |
1 | 1.0 | 1.0 | big |
2 | 1.0 | 1.0 | small |
3 | 1.0 | 1.0 | small |
4 | 1.0 | 1.0 | big |
... | ... | ... | ... |
33554 | 1.0 | 1.0 | small |
33555 | 1.0 | 1.0 | small |
33556 | 1.0 | 1.0 | big |
33557 | 1.0 | 1.0 | small |
33558 | 1.0 | 1.0 | small |
33559 rows × 3 columns
partition "train"
| _label | _weight | ad_size |
---|---|---|---|
0 | 1.0 | 1.0 | small |
1 | 1.0 | 1.0 | small |
2 | 1.0 | 1.0 | big |
3 | 1.0 | 1.0 | small |
4 | 1.0 | 1.0 | small |
... | ... | ... | ... |
138413 | 1.0 | 1.0 | small |
138414 | 1.0 | 1.0 | small |
138415 | 1.0 | 1.0 | big |
138416 | 1.0 | 1.0 | small |
138417 | 1.0 | 1.0 | small |
138418 rows × 3 columns
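To illustrate why hashing an event attribute gives a reproducible split, the following is a minimal Python sketch of the same idea outside TQL. It is an analogy only: the helper name `partition_for` and the use of MD5 are assumptions made for this example, and TQL's HASH() may use a different hash function, so the bucket assigned to a given user will not necessarily match the query above.

```python
import hashlib


def partition_for(user_id: str, test_buckets: int = 2, total_buckets: int = 10) -> str:
    """Assign a user to "test" or "train" from a stable hash of the user id.

    Hypothetical helper that mirrors
    IF(HASH(bid.user_id) % 10 < 2, "test", "train").
    MD5 is used (an assumption) because Python's built-in hash() for
    strings is salted per process and would not be reproducible.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets
    return "test" if bucket < test_buckets else "train"


if __name__ == "__main__":
    # Synthetic user ids, for illustration only.
    user_ids = [f"u{i}" for i in range(1000)]
    assignments = {uid: partition_for(uid) for uid in user_ids}
    test_share = sum(1 for p in assignments.values() if p == "test") / len(user_ids)
    print(f"share of users in test: {test_share:.2f}")
```

Because the partition key depends only on the user id, every event for a given user lands in the same partition on every run, and roughly two buckets out of ten go to the test partition.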
Estimating a Model
We now pass the ResultSet directly into a TQL Estimator to produce a trained model. TQL provides a convenience wrapper for H2O, and drop-in support for other training packages can be made available upon request.
When training, a TuningConf and/or a BootstrapConf may be provided. TuningConf enables hyperparameter tuning during the training session. BootstrapConf trains an ensemble of models with perturbed weights, enabling the estimation of standard errors via a Bayesian Bootstrap technique.
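The idea behind a perturbed-weight ensemble can be sketched without TQL or H2O. The example below is a simplified illustration, not TQL's implementation: instead of retraining a model for each replicate, it re-estimates a plain win rate under random observation weights (normalized exponential draws, which is equivalent to a flat Dirichlet) and reports the spread across replicates as the standard error. The function name `bayesian_bootstrap_se` and the synthetic labels are assumptions made for the example.

```python
import random
import statistics


def bayesian_bootstrap_se(labels, n_bootstraps=200, seed=0):
    """Return (mean estimate, standard error) of the weighted label average.

    Each replicate draws one Exponential(1) weight per observation and
    computes the weighted mean; normalizing by the weight total makes the
    weights a draw from a flat Dirichlet distribution.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_bootstraps):
        weights = [rng.expovariate(1.0) for _ in labels]
        total = sum(weights)
        estimates.append(sum(w * y for w, y in zip(weights, labels)) / total)
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    # Synthetic data: 1000 auctions with a 71.2% win rate.
    labels = [1] * 712 + [0] * 288
    mean, se = bayesian_bootstrap_se(labels)
    print(f"win rate estimate: {mean:.4f}, standard error: {se:.4f}")
```

The standard error is itself estimated from a finite number of replicates, so it becomes noisy when only a handful of bootstraps are used.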
The estimator emits a PublishedModel instance, which we will then inspect.
from zeenk.tql.modeling import H2OEstimator, TuningConf, BootstrapConf
estimator = H2OEstimator('linear', result_set, 'glm')
estimator.get_tag()
model = estimator.train(
    tuning_conf=TuningConf(init_pts=10, iterations=10),
    bootstrap_conf=BootstrapConf(bootstraps=10)
)
Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
Java Version: openjdk version "1.8.0_275"; OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_275-b01); OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.275-b01, mixed mode)
Starting server from /Users/zkozick/dev/.virtualenvs/tqldev/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
Ice root: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3
JVM stdout: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.out
JVM stderr: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.err
Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
| iter | target | alpha |
-------------------------------------
| 1 | -0.000645 | -9.237 |
| 2 | -0.000629 | -2.201 |
| 3 | -0.000639 | -5.616 |
| 4 | -0.000629 | -2.765 |
| 5 | -0.000629 | -0.2201 |
| 6 | -0.000629 | -4.615 |
| 7 | -0.000629 | -4.989 |
| 8 | -0.000645 | -9.279 |
| 9 | -0.000644 | -7.316 |
| 10 | -0.000629 | -5.001 |
| 11 | -0.000629 | -1.149 |
| 12 | -0.000629 | -3.67 |
| 13 | -0.000629 | -1.658 |
| 14 | -0.000629 | -0.6461 |
| 15 | -0.000629 | -4.11 |
| 16 | -0.000629 | -3.209 |
| 17 | -0.000629 | -0.000322 |
| 18 | -0.000629 | -2.479 |
| 19 | -0.000629 | -1.405 |
| 20 | -0.000629 | -4.358 |
=====================================
glm Model Build progress: |███████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
glm prediction progress: |████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%
H2O_cluster_uptime: | 02 secs |
H2O_cluster_timezone: | America/Denver |
H2O_data_parsing_timezone: | UTC |
H2O_cluster_version: | 3.32.1.6 |
H2O_cluster_version_age: | 11 days |
H2O_cluster_name: | H2O_from_python_zkozick_mqyqyc |
H2O_cluster_total_nodes: | 1 |
H2O_cluster_free_memory: | 3.556 Gb |
H2O_cluster_total_cores: | 8 |
H2O_cluster_allowed_cores: | 8 |
H2O_cluster_status: | accepting new members, healthy |
H2O_connection_url: | http://127.0.0.1:54321 |
H2O_connection_proxy: | {"http": null, "https": null} |
H2O_internal_security: | False |
H2O_API_Extensions: | Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 |
Python_version: | 3.7.3 final |
Warning! Using fewer than 20 bootstraps can lead standard error estimates to vary by +/- 50%.
/Users/zkozick/dev/nanigans/noumena/python/zeenk-tql/zeenk/tql/modeling/h2o/utilities.py:381: RuntimeWarning: invalid value encountered in true_divide
incr_share = incr_pred / pred
AUCc_realized calculation failed. Setting to 0.5.
Summarize the Model Training Session
model.summarize().coefficients()
| Coefficient | Standard Error | t-Stat |
---|---|---|---|
Intercept | 0.712494 | 0.000628 | 1134.43 |
ad_size.big | 0.000000 | 0.000730 | 0.00 |
ad_size.small | 0.000000 | 0.000719 | 0.00 |
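To see what these coefficients imply, here is a small sketch that scores an ad size by hand. It assumes the 'linear' model uses an identity link with one indicator per ad_size level, which is consistent with the intercept matching the training label average but is not confirmed by this guide; `predict_winrate` is a hypothetical helper, not part of TQL.

```python
# Coefficients copied from the summary table above.
COEFFICIENTS = {
    "Intercept": 0.712494,
    "ad_size.big": 0.0,
    "ad_size.small": 0.0,
}


def predict_winrate(ad_size: str) -> float:
    """Score one bid: intercept plus the coefficient of its ad_size level.

    Assumes an identity link and one-hot encoded categorical features.
    Unknown levels contribute nothing beyond the intercept.
    """
    key = f"ad_size.{ad_size.lower()}"
    return COEFFICIENTS["Intercept"] + COEFFICIENTS.get(key, 0.0)


if __name__ == "__main__":
    for size in ("big", "small"):
        print(size, predict_winrate(size))
```

Since both ad_size coefficients are zero, every bid receives the intercept as its prediction: with this data the model has learned only the average win rate.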
model.summarize().model_metrics()
| Value (Train+Test) | Standard Error (Train+Test) | t-stat (Train+Test) | Lower 95 (Train+Test) | Upper 95 (Train+Test) | Value (Train) | Standard Error (Train) | t-stat (Train) | Lower 95 (Train) | Upper 95 (Train) | Value (Test) | Standard Error (Test) | t-stat (Test) | Lower 95 (Test) | Upper 95 (Test) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
r2 | -0.000024 | 0.000013 | -1.82 | -0.000051 | 0.000002 | 0.000000 | 0.000008 | 0.00 | -0.000015 | 0.000015 | -0.000630 | 0.000231 | -2.73 | -0.001083 | -0.000177 |
r2_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
r2_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
auc | 0.500000 | 0.000927 | 539.10 | 0.498182 | 0.501818 | 0.500000 | 0.000820 | 610.04 | 0.498394 | 0.501606 | 0.500000 | 0.001569 | 318.71 | 0.496925 | 0.503075 |
aucc | 0.500000 | 0.000927 | 539.10 | 0.498182 | 0.501818 | 0.500000 | 0.000820 | 610.04 | 0.498394 | 0.501606 | 0.500000 | 0.001569 | 318.71 | 0.496925 | 0.503075 |
auc_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
aucc_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
aucc_realized | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 |
auc_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
aucc_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
label_avg | 0.710252 | 0.000604 | 1175.83 | 0.709068 | 0.711436 | 0.712494 | 0.000628 | 1134.40 | 0.711263 | 0.713725 | 0.701004 | 0.001918 | 365.47 | 0.697245 | 0.704764 |
pred_avg | 0.712494 | 0.000628 | 1134.31 | 0.711263 | 0.713725 | 0.712494 | 0.000628 | 1134.40 | 0.711263 | 0.713725 | 0.712494 | 0.000628 | 1133.94 | 0.711263 | 0.713726 |
avg_error | -0.002242 | 0.000407 | -5.51 | -0.003040 | -0.001444 | 0.000000 | 0.000000 | 0.86 | -0.000000 | 0.000000 | -0.011490 | 0.002071 | -5.55 | -0.015549 | -0.007431 |
realized_avg | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
incr_rate_avg | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
incr_rate_var | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
label_sum | 122147.000000 | 316.913695 | 385.43 | 121525.860572 | 122768.139428 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23525.000000 | 124.511887 | 188.94 | 23280.961186 | 23769.038814 |
pred_sum | 122532.587481 | 311.074534 | 393.90 | 121922.892599 | 123142.282364 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23910.587481 | 127.164609 | 188.03 | 23661.349429 | 24159.825534 |
conv_rate | 122532.587481 | 311.074534 | 393.90 | 121922.892599 | 123142.282364 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23910.587481 | 127.164609 | 188.03 | 23661.349429 | 24159.825534 |
incr_rate | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
realized | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
realized_raw | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
het_rate | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
het_rate_abs | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
het_rate_sq | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
samples | 171977.000000 | 0.000000 | 0.00 | 171977.000000 | 171977.000000 | 138418.000000 | 0.000000 | 0.00 | 138418.000000 | 138418.000000 | 33559.000000 | 0.000000 | 0.00 | 33559.000000 | 33559.000000 |
weight | 171977.000000 | 433.745190 | 396.49 | 171126.875049 | 172827.124951 | 138418.000000 | 403.602094 | 342.96 | 137626.954432 | 139209.045568 | 33559.000000 | 179.957884 | 186.48 | 33206.289029 | 33911.710971 |
positives | 122147.000000 | 0.000000 | 0.00 | 122147.000000 | 122147.000000 | 98622.000000 | 0.000000 | 0.00 | 98622.000000 | 98622.000000 | 23525.000000 | 0.000000 | 0.00 | 23525.000000 | 23525.000000 |
negatives | 49830.000000 | 0.000000 | 0.00 | 49830.000000 | 49830.000000 | 39796.000000 | 0.000000 | 0.00 | 39796.000000 | 39796.000000 | 10034.000000 | 0.000000 | 0.00 | 10034.000000 | 10034.000000 |
rss | 35392.719304 | 102.363402 | 345.76 | 35192.090722 | 35593.347886 | 28354.412808 | 94.152122 | 301.16 | 28169.878039 | 28538.947576 | 7038.306496 | 52.383240 | 134.36 | 6935.637232 | 7140.975760 |
mse | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
mse_base | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
mse_hom | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
mse_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
mse_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
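Several rows of this table can be reproduced by hand from the test-partition counts, which is a useful sanity check. The sketch below assumes the model predicts the constant 0.712494 for every row (as the coefficient table suggests) and uses the textbook definitions rss = Σ(y − ŷ)², mse = rss / n, and r2 = 1 − rss / tss. Whether TQL computes these metrics in exactly this way is an assumption, and `constant_model_metrics` is a hypothetical helper.

```python
# Test-partition counts and the constant prediction, copied from the tables above.
POSITIVES = 23525
NEGATIVES = 10034
PREDICTION = 0.712494


def constant_model_metrics(positives: int, negatives: int, prediction: float) -> dict:
    """Metrics for a model that predicts the same value for every row."""
    n = positives + negatives
    label_avg = positives / n
    # Residual sum of squares: positives have label 1, negatives have label 0.
    rss = positives * (1 - prediction) ** 2 + negatives * prediction ** 2
    # Total sum of squares around the partition's own label average.
    tss = positives * (1 - label_avg) ** 2 + negatives * label_avg ** 2
    return {
        "samples": n,
        "label_avg": label_avg,
        "rss": rss,
        "mse": rss / n,
        "r2": 1 - rss / tss,
    }


if __name__ == "__main__":
    for name, value in constant_model_metrics(POSITIVES, NEGATIVES, PREDICTION).items():
        print(f"{name}: {value:.6f}")
```

Under these assumptions the slightly negative test r2 has a simple explanation: the constant was fit to the training average (0.712494), which differs a little from the test average (0.701004), so it does marginally worse on the test partition than that partition's own mean would.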
Publish the Model into TQL
The final step is to upload the trained model back into TQL so that it can be used to make predictions. To do that, we call model.publish(), providing the name we would like to use to reference the model in queries:
model.publish('my_winrate_model')
381
The model.publish() function returns the unique ID of the model upon successful publish.
Make Model Predictions with TQL
Now that the model has been published into TQL, it can be used with the special TQL expression function PREDICT(), which takes the name the model was published under, to make predictions against timeline events:
select(
    col('bid.ad_size', 'ad_size'),
    col('PREDICT("my_winrate_model")', 'predicting_winrate')
)\
    .from_events('lethe4')\
    .where('type == "bid"')\
    .limit(3)
Query results:
partition "_default"
| ad_size | predicting_winrate |
---|---|---|
0 | big | 0.7124940397925126 |
1 | big | 0.7124940397925126 |
2 | small | 0.7124940397925126 |
In this case, since the model was trained using features extracted from events of type bid, it only makes sense to make predictions on events of type bid.