# Training ML Models
In this guide, we use TQL to train a simple model that predicts the probability of winning an
advertising auction, given the ad size. We then publish that model back into TQL and
use it in a query that displays model predictions next to the ad size.
## Inspect the bid TimeSeries
Use a simple `SELECT bid.*` statement to inspect the fields and the values of the bid time series:
```python
from zeenk.tql import *
select('bid.*').from_events('lethe4').limit(10)
```
------------------------------------------------------------------
Version: 20.1.17-SNAPSHOT
Version Timestamp: 2021-08-27 20:56:17
Version Age: 3 days, 13 hours, 38 minutes, 14 seconds
Filesystem Root: /Users/zkozick/.tql/files
Working Directory: /Users/zkozick/.tql
Configuration File: /Users/zkozick/.tql/tql-conf.yml
Api Gateway: http://localhost:9000
Service Status: Icarus: ONLINE, Daedalus: ONLINE
Service Uptime: 18 hours, 6 minutes, 13 seconds
------------------------------------------------------------------
Query results:

partition "_default"

| | bid_ad_size | bid_bid | bid_event_time | bid_ghosted | bid_request_id | bid_user_id | bid_won |
|---|---|---|---|---|---|---|---|
| 0 | None | None | None | None | None | None | None |
| 1 | big | 1.0 | 2021-01-02 00:02:36 | false | r13260 | u252 | true |
| 2 | big | 1.0 | 2021-01-02 00:16:41 | false | r13376 | u252 | true |
| 3 | small | 1.0 | 2021-01-02 00:30:55 | false | r13509 | u252 | true |
| 4 | None | None | None | None | None | None | None |
| 5 | small | 1.0 | 2021-01-02 01:04:01 | false | r13737 | u252 | true |
| 6 | big | 1.0 | 2021-01-02 01:37:26 | false | r13985 | u252 | true |
| 7 | None | None | None | None | None | None | None |
| 8 | big | 1.0 | 2021-01-02 02:18:30 | false | r14310 | u252 | true |
| 9 | big | 1.0 | 2021-01-02 03:01:48 | false | r14607 | u252 | true |

query produced 10 rows x 7 columns in 0.51 seconds
## Create a Training ResultSet
Let's create a ResultSet from the fields of the `bid` event. In this TQL query, we cast the columns to machine
learning types using the functions `label()`, `weight()`, and `categorical()`. There is also a `numerical()`
function, but we do not use it in this model because the `bid` event object has no fields that could serve
as numerical features.
An important part of creating machine learning ResultSets is the `partition_by()` statement. `partition_by()`
takes a TQL expression that emits a string partition key for each row, and the output dataframe is split into
partitions based on the value of that key. If the expression depends only on an attribute of the event, the
train/test split is reproducible across runs, as in the following query:
```python
result_set = select(
    label('if(bid.won, 1, 0)'),
    weight(1.0),
    categorical('lower(bid.ad_size)', 'ad_size')
)\
    .from_events('lethe4')\
    .where('type == "bid"')\
    .partition_by('IF(HASH(bid.user_id) % 10 < 2, "test", "train")')\
    .options(expand_numerical_features=True, fill_na=True)\
    .submit()
result_set
```
Query results:

partition "test"

| | _label | _weight | ad_size |
|---|---|---|---|
| 0 | 1.0 | 1.0 | big |
| 1 | 1.0 | 1.0 | big |
| 2 | 1.0 | 1.0 | small |
| 3 | 1.0 | 1.0 | small |
| 4 | 1.0 | 1.0 | big |
| ... | ... | ... | ... |
| 33554 | 1.0 | 1.0 | small |
| 33555 | 1.0 | 1.0 | small |
| 33556 | 1.0 | 1.0 | big |
| 33557 | 1.0 | 1.0 | small |
| 33558 | 1.0 | 1.0 | small |

33559 rows × 3 columns

partition "train"

| | _label | _weight | ad_size |
|---|---|---|---|
| 0 | 1.0 | 1.0 | small |
| 1 | 1.0 | 1.0 | small |
| 2 | 1.0 | 1.0 | big |
| 3 | 1.0 | 1.0 | small |
| 4 | 1.0 | 1.0 | small |
| ... | ... | ... | ... |
| 138413 | 1.0 | 1.0 | small |
| 138414 | 1.0 | 1.0 | small |
| 138415 | 1.0 | 1.0 | big |
| 138416 | 1.0 | 1.0 | small |
| 138417 | 1.0 | 1.0 | small |

138418 rows × 3 columns

query produced 171977 rows x 3 columns in 8.48 seconds
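
The idea behind the partition key, bucketing each user by a stable hash so that the same user always lands in the same partition, can be sketched in plain Python. This is an illustration only: `assign_partition` is a hypothetical helper, and SHA-256 stands in for TQL's `HASH()` function, whose actual algorithm is not described here, so the buckets will not match the ones TQL produces.

```python
import hashlib


def assign_partition(user_id: str, test_buckets: int = 2, total_buckets: int = 10) -> str:
    """Map a user id to "test" or "train" using a stable hash (not TQL's HASH)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets
    return "test" if bucket < test_buckets else "train"


# Synthetic user ids, used only to show that roughly 2 in 10 users land in "test".
user_ids = [f"u{i}" for i in range(1000)]
partitions = {uid: assign_partition(uid) for uid in user_ids}
test_share = sum(1 for p in partitions.values() if p == "test") / len(partitions)
print(f"test share: {test_share:.3f}")
```

Python's built-in `hash()` is avoided in this sketch because string hashes are salted per process by default, which would make the split differ between runs. In the query output above, the test partition holds 33559 of 171977 rows, or about 19.5%, which is in line with the 2-in-10 bucket rule.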
## Estimating a Model
We now pass the ResultSet directly into a TQL Estimator to produce a trained model. TQL
provides a convenience wrapper for H2O, but drop-in support for other training packages can be made
available [upon request](mailto:zkozick@nanigans.com?subject=Request%20For%20Additional%20TQL%20Training%20Engine%20Support).
When training, a `TuningConf` and/or a `BootstrapConf` may be provided. `TuningConf` enables hyperparameter
tuning during the training session. `BootstrapConf` trains an ensemble of models with perturbed
weights, which makes it possible to estimate standard errors via a
[Bayesian Bootstrap](https://gdmarmerola.github.io/the-bayesian-bootstrap/) technique.
The estimator emits a PublishedModel instance, which we will then inspect.
```python
from zeenk.tql.modeling import H2OEstimator, TuningConf, BootstrapConf
estimator = H2OEstimator('linear', result_set, 'glm')
estimator.get_tag()
model = estimator.train(
    tuning_conf=TuningConf(init_pts=10, iterations=10),
    bootstrap_conf=BootstrapConf(bootstraps=10)
)
```
Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
Java Version: openjdk version "1.8.0_275"; OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_275-b01); OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.275-b01, mixed mode)
Starting server from /Users/zkozick/dev/.virtualenvs/tqldev/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
Ice root: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3
JVM stdout: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.out
JVM stderr: /var/folders/2_/tx9l8xzd4fxg22hdd2wgd7d40000gn/T/tmpt2nmth_3/h2o_zkozick_started_from_python.err
Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.

| iter | target | alpha |
|---|---|---|
| 1 | -0.000645 | -9.237 |
| 2 | -0.000629 | -2.201 |
| 3 | -0.000639 | -5.616 |
| 4 | -0.000629 | -2.765 |
| 5 | -0.000629 | -0.2201 |
| 6 | -0.000629 | -4.615 |
| 7 | -0.000629 | -4.989 |
| 8 | -0.000645 | -9.279 |
| 9 | -0.000644 | -7.316 |
| 10 | -0.000629 | -5.001 |
| 11 | -0.000629 | -1.149 |
| 12 | -0.000629 | -3.67 |
| 13 | -0.000629 | -1.658 |
| 14 | -0.000629 | -0.6461 |
| 15 | -0.000629 | -4.11 |
| 16 | -0.000629 | -3.209 |
| 17 | -0.000629 | -0.000322 |
| 18 | -0.000629 | -2.479 |
| 19 | -0.000629 | -1.405 |
| 20 | -0.000629 | -4.358 |

| | |
|---|---|
| H2O_cluster_uptime: | 02 secs |
| H2O_cluster_timezone: | America/Denver |
| H2O_data_parsing_timezone: | UTC |
| H2O_cluster_version: | 3.32.1.6 |
| H2O_cluster_version_age: | 11 days |
| H2O_cluster_name: | H2O_from_python_zkozick_mqyqyc |
| H2O_cluster_total_nodes: | 1 |
| H2O_cluster_free_memory: | 3.556 Gb |
| H2O_cluster_total_cores: | 8 |
| H2O_cluster_allowed_cores: | 8 |
| H2O_cluster_status: | accepting new members, healthy |
| H2O_connection_url: | http://127.0.0.1:54321 |
| H2O_connection_proxy: | {"http": null, "https": null} |
| H2O_internal_security: | False |
| H2O_API_Extensions: | Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 |
| Python_version: | 3.7.3 final |

Warning! Using fewer than 20 bootstraps can lead standard error estimates to vary by +/- 50%.
/Users/zkozick/dev/nanigans/noumena/python/zeenk-tql/zeenk/tql/modeling/h2o/utilities.py:381: RuntimeWarning: invalid value encountered in true_divide
incr_share = incr_pred / pred
AUCc_realized calculation failed. Setting to 0.5.
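
To give a feel for how perturbed weights lead to standard errors, here is a minimal sketch of a Bayesian bootstrap applied to a simple statistic, the win rate of a synthetic 0/1 label. It is not TQL's implementation: `bayesian_bootstrap_se` and the synthetic labels are assumptions made for illustration, and where this sketch recomputes a weighted mean, `BootstrapConf` retrains a model for each set of weights.

```python
import random
import statistics


def bayesian_bootstrap_se(labels, bootstraps=200, seed=0):
    """Estimate the standard error of the mean label with a Bayesian bootstrap."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(bootstraps):
        # Exp(1) draws, once normalized, form a Dirichlet(1, ..., 1) weight vector.
        weights = [rng.expovariate(1.0) for _ in labels]
        total = sum(weights)
        estimates.append(sum(w * y for w, y in zip(weights, labels)) / total)
    # The spread of the re-weighted estimates approximates the standard error.
    return statistics.stdev(estimates)


# Synthetic labels with a win rate near 0.7 (illustrative data, not the lethe4 events).
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.7 else 0 for _ in range(2000)]

se = bayesian_bootstrap_se(labels)
win_rate = sum(labels) / len(labels)
analytic_se = (win_rate * (1 - win_rate) / len(labels)) ** 0.5
print(f"bootstrap SE: {se:.5f}, analytic SE: {analytic_se:.5f}")
```

The estimate gets noisier as the number of replicates shrinks, which is what the warning in the training log about using fewer than 20 bootstraps refers to.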
## Summarize the Model Training Session
```python
model.summarize().coefficients()
```
GLM Coefficients

| | Coefficient | Standard Error | t-Stat |
|---|---|---|---|
| Intercept | 0.712494 | 0.000628 | 1134.43 |
| ad_size.big | 0.000000 | 0.000730 | 0.00 |
| ad_size.small | 0.000000 | 0.000719 | 0.00 |

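
The coefficient table can be checked with a little arithmetic. The sketch below assumes the usual definition of a t-statistic (coefficient divided by its standard error) and an identity link, so that a prediction is the intercept plus the coefficient of the row's `ad_size` level. Both are assumptions about how the summary was produced, and `predict` is a hypothetical helper, not part of the TQL API.

```python
# Values copied from the GLM Coefficients table (rounded as displayed).
intercept = 0.712494
intercept_se = 0.000628
coefficients = {"ad_size.big": 0.0, "ad_size.small": 0.0}

# Assumption: t-stat = coefficient / standard error.
t_stat = intercept / intercept_se


def predict(ad_size: str) -> float:
    """Assumed identity-link prediction: intercept plus the level's coefficient."""
    return intercept + coefficients[f"ad_size.{ad_size}"]


print(f"t-stat from rounded values: {t_stat:.2f}")
print(f"big: {predict('big'):.6f}, small: {predict('small'):.6f}")
```

The ratio of the rounded values comes out near 1134.5, close to the reported 1134.43; the small gap is consistent with the table displaying rounded numbers. Because both `ad_size` coefficients are zero, the sketch returns the same value for either ad size.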
```python
model.summarize().model_metrics()
```
Model Metrics

| | Value (Train+Test) | Standard Error (Train+Test) | t-stat (Train+Test) | Lower 95 (Train+Test) | Upper 95 (Train+Test) | Value (Train) | Standard Error (Train) | t-stat (Train) | Lower 95 (Train) | Upper 95 (Train) | Value (Test) | Standard Error (Test) | t-stat (Test) | Lower 95 (Test) | Upper 95 (Test) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| r2 | -0.000024 | 0.000013 | -1.82 | -0.000051 | 0.000002 | 0.000000 | 0.000008 | 0.00 | -0.000015 | 0.000015 | -0.000630 | 0.000231 | -2.73 | -0.001083 | -0.000177 |
| r2_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| r2_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| auc | 0.500000 | 0.000927 | 539.10 | 0.498182 | 0.501818 | 0.500000 | 0.000820 | 610.04 | 0.498394 | 0.501606 | 0.500000 | 0.001569 | 318.71 | 0.496925 | 0.503075 |
| aucc | 0.500000 | 0.000927 | 539.10 | 0.498182 | 0.501818 | 0.500000 | 0.000820 | 610.04 | 0.498394 | 0.501606 | 0.500000 | 0.001569 | 318.71 | 0.496925 | 0.503075 |
| auc_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| aucc_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| aucc_realized | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 | 0.500000 | 0.000000 | 0.00 | 0.500000 | 0.500000 |
| auc_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| aucc_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| label_avg | 0.710252 | 0.000604 | 1175.83 | 0.709068 | 0.711436 | 0.712494 | 0.000628 | 1134.40 | 0.711263 | 0.713725 | 0.701004 | 0.001918 | 365.47 | 0.697245 | 0.704764 |
| pred_avg | 0.712494 | 0.000628 | 1134.31 | 0.711263 | 0.713725 | 0.712494 | 0.000628 | 1134.40 | 0.711263 | 0.713725 | 0.712494 | 0.000628 | 1133.94 | 0.711263 | 0.713726 |
| avg_error | -0.002242 | 0.000407 | -5.51 | -0.003040 | -0.001444 | 0.000000 | 0.000000 | 0.86 | -0.000000 | 0.000000 | -0.011490 | 0.002071 | -5.55 | -0.015549 | -0.007431 |
| realized_avg | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
| incr_rate_avg | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| incr_rate_var | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| label_sum | 122147.000000 | 316.913695 | 385.43 | 121525.860572 | 122768.139428 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23525.000000 | 124.511887 | 188.94 | 23280.961186 | 23769.038814 |
| pred_sum | 122532.587481 | 311.074534 | 393.90 | 121922.892599 | 123142.282364 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23910.587481 | 127.164609 | 188.03 | 23661.349429 | 24159.825534 |
| conv_rate | 122532.587481 | 311.074534 | 393.90 | 121922.892599 | 123142.282364 | 98622.000000 | 290.989622 | 338.92 | 98051.670821 | 99192.329179 | 23910.587481 | 127.164609 | 188.03 | 23661.349429 | 24159.825534 |
| incr_rate | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| realized | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
| realized_raw | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan | 0.000000 | nan | nan | nan | nan |
| het_rate | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| het_rate_abs | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| het_rate_sq | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| samples | 171977.000000 | 0.000000 | 0.00 | 171977.000000 | 171977.000000 | 138418.000000 | 0.000000 | 0.00 | 138418.000000 | 138418.000000 | 33559.000000 | 0.000000 | 0.00 | 33559.000000 | 33559.000000 |
| weight | 171977.000000 | 433.745190 | 396.49 | 171126.875049 | 172827.124951 | 138418.000000 | 403.602094 | 342.96 | 137626.954432 | 139209.045568 | 33559.000000 | 179.957884 | 186.48 | 33206.289029 | 33911.710971 |
| positives | 122147.000000 | 0.000000 | 0.00 | 122147.000000 | 122147.000000 | 98622.000000 | 0.000000 | 0.00 | 98622.000000 | 98622.000000 | 23525.000000 | 0.000000 | 0.00 | 23525.000000 | 23525.000000 |
| negatives | 49830.000000 | 0.000000 | 0.00 | 49830.000000 | 49830.000000 | 39796.000000 | 0.000000 | 0.00 | 39796.000000 | 39796.000000 | 10034.000000 | 0.000000 | 0.00 | 10034.000000 | 10034.000000 |
| rss | 35392.719304 | 102.363402 | 345.76 | 35192.090722 | 35593.347886 | 28354.412808 | 94.152122 | 301.16 | 28169.878039 | 28538.947576 | 7038.306496 | 52.383240 | 134.36 | 6935.637232 | 7140.975760 |
| mse | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
| mse_base | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
| mse_hom | 0.205799 | 0.000254 | 809.74 | 0.205301 | 0.206297 | 0.204846 | 0.000267 | 768.02 | 0.204324 | 0.205369 | 0.209729 | 0.000816 | 256.98 | 0.208130 | 0.211329 |
| mse_incr | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |
| mse_het | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.00 | 0.000000 | 0.000000 |

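
Several rows of the metrics table can be reproduced from the `positives` and `samples` counts alone, which is a useful sanity check when reading it. The sketch below assumes the model predicts one constant value for every row (the training win rate) and that `avg_error` is `label_avg - pred_avg`; both assumptions are inferred from the reported numbers rather than from TQL documentation, and `constant_model_metrics` is a hypothetical helper.

```python
# Counts copied from the "positives" and "samples" rows of the metrics table.
counts = {
    "train": {"positives": 98622, "samples": 138418},
    "test": {"positives": 23525, "samples": 33559},
}

# Assumption: the model predicts the training win rate for every row.
pred = counts["train"]["positives"] / counts["train"]["samples"]


def constant_model_metrics(positives: int, samples: int, pred: float) -> dict:
    """Metrics of a constant prediction against 0/1 labels."""
    label_avg = positives / samples
    # Positives contribute (1 - pred)^2 each, negatives contribute pred^2 each.
    mse = label_avg * (1 - pred) ** 2 + (1 - label_avg) * pred ** 2
    return {"label_avg": label_avg, "avg_error": label_avg - pred, "mse": mse}


metrics = {
    name: constant_model_metrics(c["positives"], c["samples"], pred)
    for name, c in counts.items()
}
for name, values in metrics.items():
    print(name, {key: round(value, 6) for key, value in values.items()})
```

Under these assumptions the computed `label_avg`, `avg_error`, and `mse` agree with the Train and Test columns to the displayed precision, which also explains why `mse` and `mse_base` are identical: the fitted model is no better than predicting the average label.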
## Publish the Model into TQL
The final step is to upload the trained model back into TQL so it can be used to make predictions.
To do that, we call `model.publish()` with the name we want to use to reference the model
in queries:
```python
model.publish('my_winrate_model')
```
381
The `model.publish()` function returns the unique ID of the model upon successful publish.
## Make Model Predictions with TQL
Now that the model has been published into TQL, it can be used with the special TQL
expression function `PREDICT("model_name")` to make predictions against timeline events:
```python
select(
    col('bid.ad_size', 'ad_size'),
    col('PREDICT("my_winrate_model")', 'predicting_winrate')
)\
    .from_events('lethe4')\
    .where('type == "bid"')\
    .limit(3)
```
Query results:

partition "_default"

| | ad_size | predicting_winrate |
|---|---|---|
| 0 | big | 0.7124940397925126 |
| 1 | big | 0.7124940397925126 |
| 2 | small | 0.7124940397925126 |

query produced 3 rows x 2 columns in 0.43 seconds

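Because the model was trained using features extracted from events of type `bid`, it
only makes sense to make predictions on events of type `bid`. Note that the prediction is the same
for both ad sizes here: the fitted `ad_size` coefficients are zero, so every row receives the
intercept value of about 0.7125.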