                        gpt3-small  gpt3-175b   636-GPT3_XL_filterexperiment_0
Model shape
git_commit
n_head                  12          96          32
n_vocab                 50257       50257       50257
n_layer                 12          96          24
n_embd                  768         12288       2048
n_ctx                   2048        2048        2048
approx_model_params     123.53 M    174.56 G    1.31 G
Training size
train_batch_size        250         1600        256
train_steps             585938      91553       25000
total_train_tokens      300.00 G    300.00 G    13.11 G
total_approx_ops        2.22e+20    3.14e+23    1.03e+20
total_pflops_days       2.57        3636.75     1.19
TPU
tpu_name                                        chell_4
n_cores                 2048        2048        256
total_flops             107.52 P    107.52 P    13.44 P
theo_train_days         0.02        33.82       0.09
Training progress
tb_url                                          vm.eleuther.ai:8005
sacred_id                                       636
status                                          RUNNING
start_time                                      2021-05-03 20:14:56 UTC
n_updates               0           0           0
last_update_time
wall_time_secs
latest_batch*
latest_loss*
fraction_done*
train_tokens_elapsed*
approx_ops_elapsed*
pflops_days_elapsed*
secs_per_batch
tokens_per_sec
theo_eff
wall_remaining_secs
est_finish_time
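
The derived rows above (approx_model_params, total_train_tokens, total_approx_ops, total_pflops_days, total_flops, theo_train_days) can be reproduced from the shape and batch rows. The sketch below is a reconstruction, not the script that produced the table: the formulas (12 * n_layer * n_embd^2 + n_vocab * n_embd for parameters, 6 * params * tokens for operations) and the 52.5 TFLOPS per-core figure (107.52 P / 2048 cores) are assumptions inferred from the numbers, and the names RunConfig and FLOPS_PER_CORE are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
PETA = 1e15
# Assumption: per-core peak inferred from the table (107.52 P / 2048 cores).
FLOPS_PER_CORE = 52.5e12


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the input rows of one table column."""
    n_layer: int
    n_embd: int
    n_vocab: int
    n_ctx: int
    train_batch_size: int
    train_steps: int
    n_cores: int


def approx_model_params(c: RunConfig) -> int:
    # Assumed formula: transformer blocks plus the token embedding matrix.
    return 12 * c.n_layer * c.n_embd ** 2 + c.n_vocab * c.n_embd


def total_train_tokens(c: RunConfig) -> int:
    return c.train_batch_size * c.train_steps * c.n_ctx


def total_approx_ops(c: RunConfig) -> int:
    # Assumed rule of thumb: about 6 operations per parameter per token.
    return 6 * approx_model_params(c) * total_train_tokens(c)


def total_pflops_days(c: RunConfig) -> float:
    return total_approx_ops(c) / (PETA * SECONDS_PER_DAY)


def total_flops(c: RunConfig) -> float:
    return c.n_cores * FLOPS_PER_CORE


def theo_train_days(c: RunConfig) -> float:
    # Training time at 100% of the assumed peak throughput.
    return total_approx_ops(c) / total_flops(c) / SECONDS_PER_DAY


CONFIGS = {
    "gpt3-small": RunConfig(12, 768, 50257, 2048, 250, 585938, 2048),
    "gpt3-175b": RunConfig(96, 12288, 50257, 2048, 1600, 91553, 2048),
    "636-GPT3_XL_filterexperiment_0": RunConfig(24, 2048, 50257, 2048, 256, 25000, 256),
}

for name, cfg in CONFIGS.items():
    print(
        f"{name}: params={approx_model_params(cfg):.4g} "
        f"tokens={total_train_tokens(cfg):.4g} "
        f"ops={total_approx_ops(cfg):.3g} "
        f"pflops_days={total_pflops_days(cfg):.2f} "
        f"theo_train_days={theo_train_days(cfg):.2f}"
    )
```

Under these assumptions the computed values agree with the table to within rounding for all three columns, which suggests, but does not confirm, that the table was generated this way.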