**Model shape**

| | gpt3-small | gpt3-175b | 636-GPT3_XL_filterexperiment_0 |
| --- | --- | --- | --- |
| git_commit | | | |
| n_head | 12 | 96 | 32 |
| n_vocab | 50257 | 50257 | 50257 |
| n_layer | 12 | 96 | 24 |
| n_embd | 768 | 12288 | 2048 |
| n_ctx | 2048 | 2048 | 2048 |
| approx_model_params | 123.53 M | 174.56 G | 1.31 G |
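The `approx_model_params` row is consistent with the standard decoder-only transformer estimate of 12 · n_layer · n_embd² weights in the attention/MLP blocks plus n_vocab · n_embd for the token embedding. A minimal sketch reproducing the three values above (the formula is an inference from the numbers, not something stated in the table):

```python
def approx_model_params(n_layer: int, n_embd: int, n_vocab: int) -> int:
    # 12 * n_embd^2 per layer: 4 * n_embd^2 for the Q/K/V/output
    # projections plus 8 * n_embd^2 for the two 4x-wide MLP matrices;
    # the embedding adds n_vocab * n_embd (assumed tied with the
    # output head, so counted once).
    return 12 * n_layer * n_embd**2 + n_vocab * n_embd

print(f"{approx_model_params(12, 768, 50257) / 1e6:.2f} M")    # 123.53 M (gpt3-small)
print(f"{approx_model_params(96, 12288, 50257) / 1e9:.2f} G")  # 174.56 G (gpt3-175b)
print(f"{approx_model_params(24, 2048, 50257) / 1e9:.2f} G")   # 1.31 G  (636-GPT3_XL)
```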
**Training size**

| | gpt3-small | gpt3-175b | 636-GPT3_XL_filterexperiment_0 |
| --- | --- | --- | --- |
| train_batch_size | 250 | 1600 | 256 |
| train_steps | 585938 | 91553 | 25000 |
| total_train_tokens | 300.00 G | 300.00 G | 13.11 G |
| total_approx_ops | 2.22e+20 | 3.14e+23 | 1.03e+20 |
| total_pflops_days | 2.57 | 3636.75 | 1.19 |
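The derived rows follow from the raw ones: total_train_tokens = train_batch_size · train_steps · n_ctx, total_approx_ops uses the usual C ≈ 6·N·D training-compute estimate, and total_pflops_days converts that to PFLOP/s-days. A hedged sketch (6·N·D is the standard approximation and matches the table values; the tracker's exact formula isn't shown here):

```python
SECS_PER_DAY = 86400

def train_budget(batch_size: int, steps: int, n_ctx: int, params: float):
    tokens = batch_size * steps * n_ctx        # total_train_tokens
    ops = 6 * params * tokens                  # total_approx_ops, C ~ 6*N*D
    pflops_days = ops / (1e15 * SECS_PER_DAY)  # total_pflops_days
    return tokens, ops, pflops_days

# gpt3-175b column: ~300.00 G tokens, ~3.14e+23 ops, ~3636.75 PFLOP/s-days
tokens, ops, pfd = train_budget(1600, 91553, 2048, 174.56e9)
print(f"{tokens / 1e9:.2f} G  {ops:.2e}  {pfd:.2f}")
```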
**TPU**

| | gpt3-small | gpt3-175b | 636-GPT3_XL_filterexperiment_0 |
| --- | --- | --- | --- |
| tpu_name | | | chell_4 |
| n_cores | 2048 | 2048 | 256 |
| total_flops | 107.52 P | 107.52 P | 13.44 P |
| theo_train_days | 0.02 | 33.82 | 0.09 |
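total_flops looks like n_cores multiplied by a per-core peak of 52.5 TFLOP/s, which matches TPU v3 (a v3-8 is quoted at 420 TFLOP/s across 8 cores); theo_train_days is then the compute budget divided by that peak, i.e. training time at perfect utilization. A sketch under those assumptions:

```python
TPU_V3_CORE_PFLOPS = 0.0525  # assumed peak: 52.5 TFLOP/s per TPU v3 core

def tpu_estimate(n_cores: int, total_pflops_days: float):
    total_pflops = n_cores * TPU_V3_CORE_PFLOPS         # total_flops row (PFLOP/s)
    theo_train_days = total_pflops_days / total_pflops  # at 100% utilization
    return total_pflops, theo_train_days

print(tpu_estimate(2048, 3636.75))  # (107.52, 33.82...)  gpt3-175b
print(tpu_estimate(256, 1.19))      # (13.44, 0.088...)   636-GPT3_XL
```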
**Training progress**

| | gpt3-small | gpt3-175b | 636-GPT3_XL_filterexperiment_0 |
| --- | --- | --- | --- |
| tb_url | | | vm.eleuther.ai:8005 |
| sacred_id | | | 636 |
| status | | | RUNNING |
| start_time | | | 2021-05-03 20:14:56 UTC |
| n_updates | 0 | 0 | 0 |
| last_update_time | | | |
| wall_time_secs | | | |
| latest_batch* | | | |
| latest_loss* | | | |
| fraction_done* | | | |
| train_tokens_elapsed* | | | |
| approx_ops_elapsed* | | | |
| pflops_days_elapsed* | | | |
| secs_per_batch | | | |
| tokens_per_sec | | | |
| theo_eff | | | |
| wall_remaining_secs | | | |
| est_finish_time | | | |
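Most progress cells are blank because only run 636 is live and no update has been logged yet (n_updates is 0). The starred rows read like quantities derived from the latest logged batch; a sketch of the plausible relationships, assuming the tracker derives them this way (the blank cells above stay blank, this only shows how the rows would relate):

```python
def progress_rows(latest_batch: int, wall_time_secs: float, batch_size: int,
                  n_ctx: int, train_steps: int, params: float,
                  peak_pflops: float) -> dict:
    tokens_elapsed = latest_batch * batch_size * n_ctx
    ops_elapsed = 6 * params * tokens_elapsed  # same 6*N*D rule as above
    secs_per_batch = wall_time_secs / latest_batch
    return {
        "fraction_done": latest_batch / train_steps,
        "train_tokens_elapsed": tokens_elapsed,
        "approx_ops_elapsed": ops_elapsed,
        "pflops_days_elapsed": ops_elapsed / (1e15 * 86400),
        "secs_per_batch": secs_per_batch,
        "tokens_per_sec": tokens_elapsed / wall_time_secs,
        # theo_eff: achieved ops as a fraction of the pod's peak
        # over the same wall time
        "theo_eff": ops_elapsed / (peak_pflops * 1e15 * wall_time_secs),
        "wall_remaining_secs": (train_steps - latest_batch) * secs_per_batch,
    }
```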