[
  {
    "title": "MMLU (Massive Multitask Language Understanding)",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances for which the predicted output exactly matches a correct reference.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances for which the predicted output exactly matches a correct reference.\n- Perturbation Robustness: Computes the worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances for which the predicted output exactly matches a correct reference.\n- Perturbation Fairness: Computes the worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model, with performance contention removed by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2593684210526315,
          "description": "min=0.19, mean=0.259, max=0.35, sum=3.891 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13067986008352367,
          "description": "min=0.074, mean=0.131, max=0.172, sum=1.96 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22085380116959066,
          "description": "min=0.15, mean=0.221, max=0.31, sum=3.313 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23635087719298245,
          "description": "min=0.17, mean=0.236, max=0.33, sum=3.545 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4567342927631581,
          "description": "min=0.419, mean=0.457, max=0.511, sum=6.851 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2411345029239766,
          "description": "min=0.2, mean=0.241, max=0.298, sum=3.617 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12277396117394333,
          "description": "min=0.051, mean=0.123, max=0.181, sum=1.842 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20011695906432747,
          "description": "min=0.16, mean=0.2, max=0.272, sum=3.002 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2039415204678363,
          "description": "min=0.16, mean=0.204, max=0.23, sum=3.059 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3765351217105263,
          "description": "min=0.348, mean=0.377, max=0.422, sum=5.648 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2697894736842105,
          "description": "min=0.2, mean=0.27, max=0.35, sum=4.047 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11389257817699022,
          "description": "min=0.063, mean=0.114, max=0.154, sum=1.708 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22511111111111112,
          "description": "min=0.15, mean=0.225, max=0.27, sum=3.377 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23159064327485382,
          "description": "min=0.158, mean=0.232, max=0.29, sum=3.474 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41104061293859656,
          "description": "min=0.381, mean=0.411, max=0.466, sum=6.166 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4451461988304094,
          "description": "min=0.23, mean=0.445, max=0.8, sum=6.677 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13930239849591303,
          "description": "min=0.067, mean=0.139, max=0.205, sum=2.09 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.39245614035087717,
          "description": "min=0.2, mean=0.392, max=0.73, sum=5.887 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4094619883040936,
          "description": "min=0.19, mean=0.409, max=0.77, sum=6.142 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4804912280701755,
          "description": "min=0.23, mean=0.48, max=0.83, sum=7.207 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13723997934779486,
          "description": "min=0.056, mean=0.137, max=0.248, sum=2.059 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41671345029239765,
          "description": "min=0.17, mean=0.417, max=0.75, sum=6.251 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.44997660818713453,
          "description": "min=0.21, mean=0.45, max=0.78, sum=6.75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4753216374269006,
          "description": "min=0.24, mean=0.475, max=0.81, sum=7.13 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13373539597087636,
          "description": "min=0.076, mean=0.134, max=0.172, sum=2.006 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41120467836257313,
          "description": "min=0.22, mean=0.411, max=0.68, sum=6.168 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43321637426900583,
          "description": "min=0.23, mean=0.433, max=0.73, sum=6.498 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.3385263157894737,
          "description": "min=0.211, mean=0.339, max=0.5, sum=5.078 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1406708954092635,
          "description": "min=0.06, mean=0.141, max=0.219, sum=2.11 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2625146198830409,
          "description": "min=0.17, mean=0.263, max=0.42, sum=3.938 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2968421052631579,
          "description": "min=0.167, mean=0.297, max=0.45, sum=4.453 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.73985964912276,
          "description": "min=308.59, mean=396.74, max=552.719, sum=5951.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.26969590643274854,
          "description": "min=0.193, mean=0.27, max=0.32, sum=4.045 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.110752611571227,
          "description": "min=0.087, mean=0.111, max=0.157, sum=1.661 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1826549707602339,
          "description": "min=0.1, mean=0.183, max=0.27, sum=2.74 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1845730994152047,
          "description": "min=0.09, mean=0.185, max=0.27, sum=2.769 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 471.0754736842105,
          "description": "min=360.75, mean=471.075, max=618.447, sum=7066.132 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.3207134502923977,
          "description": "min=0.23, mean=0.321, max=0.49, sum=4.811 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1348564339845485,
          "description": "min=0.075, mean=0.135, max=0.225, sum=2.023 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23008187134502922,
          "description": "min=0.1, mean=0.23, max=0.37, sum=3.451 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23658479532163745,
          "description": "min=0.14, mean=0.237, max=0.35, sum=3.549 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 471.0754736842105,
          "description": "min=360.75, mean=471.075, max=618.447, sum=7066.132 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.380140350877193,
          "description": "min=0.22, mean=0.38, max=0.61, sum=5.702 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15396738685964684,
          "description": "min=0.122, mean=0.154, max=0.217, sum=2.31 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2547368421052632,
          "description": "min=0.08, mean=0.255, max=0.51, sum=3.821 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2636608187134503,
          "description": "min=0.11, mean=0.264, max=0.51, sum=3.955 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 471.0754736842105,
          "description": "min=360.75, mean=471.075, max=618.447, sum=7066.132 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4813099415204679,
          "description": "min=0.25, mean=0.481, max=0.78, sum=7.22 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.063, mean=0.144, max=0.262, sum=2.165 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43421052631578944,
          "description": "min=0.17, mean=0.434, max=0.76, sum=6.513 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4467836257309941,
          "description": "min=0.211, mean=0.447, max=0.74, sum=6.702 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5775741999040572,
          "description": "min=0.556, mean=0.578, max=0.605, sum=8.664 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2987017543859649,
          "description": "min=0.19, mean=0.299, max=0.42, sum=4.481 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13690038983912287,
          "description": "min=0.115, mean=0.137, max=0.173, sum=2.054 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25025730994152046,
          "description": "min=0.167, mean=0.25, max=0.38, sum=3.754 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27360233918128657,
          "description": "min=0.175, mean=0.274, max=0.38, sum=4.104 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23288457024982262,
          "description": "min=0.135, mean=0.233, max=0.418, sum=3.493 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 436.9895789473684,
          "description": "min=333.02, mean=436.99, max=574.658, sum=6554.844 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4065029239766082,
          "description": "min=0.25, mean=0.407, max=0.67, sum=6.098 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16765379656947835,
          "description": "min=0.074, mean=0.168, max=0.3, sum=2.515 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.37832748538011696,
          "description": "min=0.25, mean=0.378, max=0.62, sum=5.675 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3820701754385965,
          "description": "min=0.25, mean=0.382, max=0.63, sum=5.731 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1453571324242486,
          "description": "min=0.141, mean=0.145, max=0.149, sum=2.18 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 492.0102807017544,
          "description": "min=386.05, mean=492.01, max=639.561, sum=7380.154 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.35304093567251466,
          "description": "min=0.228, mean=0.353, max=0.56, sum=5.296 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14945785718149934,
          "description": "min=0.089, mean=0.149, max=0.246, sum=2.242 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28992982456140354,
          "description": "min=0.158, mean=0.29, max=0.51, sum=4.349 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31526315789473686,
          "description": "min=0.158, mean=0.315, max=0.53, sum=4.729 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4885340888157895,
          "description": "min=0.47, mean=0.489, max=0.506, sum=7.328 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.32356725146198834,
          "description": "min=0.19, mean=0.324, max=0.4, sum=4.854 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11188578153206447,
          "description": "min=0.075, mean=0.112, max=0.151, sum=1.678 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25327485380116954,
          "description": "min=0.15, mean=0.253, max=0.35, sum=3.799 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2809590643274854,
          "description": "min=0.14, mean=0.281, max=0.38, sum=4.214 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3167793253495066,
          "description": "min=0.292, mean=0.317, max=0.349, sum=4.752 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.27880701754385967,
          "description": "min=0.18, mean=0.279, max=0.36, sum=4.182 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11350786269483934,
          "description": "min=0.067, mean=0.114, max=0.164, sum=1.703 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18368421052631578,
          "description": "min=0.09, mean=0.184, max=0.24, sum=2.755 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23653801169590644,
          "description": "min=0.15, mean=0.237, max=0.29, sum=3.548 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2806724427425987,
          "description": "min=0.265, mean=0.281, max=0.301, sum=4.21 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2642105263157895,
          "description": "min=0.18, mean=0.264, max=0.42, sum=3.963 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13602108170852936,
          "description": "min=0.049, mean=0.136, max=0.202, sum=2.04 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22644444444444442,
          "description": "min=0.13, mean=0.226, max=0.42, sum=3.397 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22225730994152046,
          "description": "min=0.1, mean=0.222, max=0.4, sum=3.334 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.284456830180921,
          "description": "min=0.265, mean=0.284, max=0.312, sum=4.267 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.382046783625731,
          "description": "min=0.21, mean=0.382, max=0.67, sum=5.731 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14305203655556303,
          "description": "min=0.104, mean=0.143, max=0.197, sum=2.146 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29933333333333334,
          "description": "min=0.12, mean=0.299, max=0.6, sum=4.49 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31652631578947366,
          "description": "min=0.13, mean=0.317, max=0.57, sum=4.748 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.25371929824561407,
          "description": "min=0.18, mean=0.254, max=0.32, sum=3.806 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11272299343238619,
          "description": "min=0.055, mean=0.113, max=0.167, sum=1.691 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20667836257309943,
          "description": "min=0.15, mean=0.207, max=0.25, sum=3.1 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21994152046783624,
          "description": "min=0.14, mean=0.22, max=0.3, sum=3.299 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4063157894736842,
          "description": "min=0.26, mean=0.406, max=0.63, sum=6.095 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1551609000421963,
          "description": "min=0.103, mean=0.155, max=0.243, sum=2.327 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33394152046783626,
          "description": "min=0.2, mean=0.334, max=0.54, sum=5.009 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36630409356725147,
          "description": "min=0.2, mean=0.366, max=0.55, sum=5.495 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4523859649122807,
          "description": "min=0.23, mean=0.452, max=0.79, sum=6.786 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18282231471159943,
          "description": "min=0.099, mean=0.183, max=0.338, sum=2.742 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38711111111111113,
          "description": "min=0.15, mean=0.387, max=0.73, sum=5.807 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4071111111111111,
          "description": "min=0.19, mean=0.407, max=0.73, sum=6.107 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 481.2602105263158,
          "description": "min=372.75, mean=481.26, max=628.421, sum=7218.903 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2485497076023392,
          "description": "min=0.14, mean=0.249, max=0.3, sum=3.728 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11546362297486105,
          "description": "min=0.062, mean=0.115, max=0.149, sum=1.732 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2174502923976608,
          "description": "min=0.11, mean=0.217, max=0.28, sum=3.262 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21961403508771932,
          "description": "min=0.13, mean=0.22, max=0.27, sum=3.294 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.06997480863135229,
          "description": "min=0.066, mean=0.07, max=0.072, sum=1.05 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2764093567251462,
          "description": "min=0.21, mean=0.276, max=0.351, sum=4.146 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12205035764205192,
          "description": "min=0.094, mean=0.122, max=0.145, sum=1.831 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1888421052631579,
          "description": "min=0.149, mean=0.189, max=0.24, sum=2.833 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21518128654970764,
          "description": "min=0.175, mean=0.215, max=0.26, sum=3.228 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1330090104470642,
          "description": "min=0.093, mean=0.133, max=0.275, sum=1.995 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=7019.035 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.236140350877193,
          "description": "min=0.16, mean=0.236, max=0.281, sum=1.181 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1364262799156796,
          "description": "min=0.064, mean=0.136, max=0.2, sum=0.682 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20063157894736844,
          "description": "min=0.12, mean=0.201, max=0.263, sum=1.003 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20687719298245613,
          "description": "min=0.14, mean=0.207, max=0.254, sum=1.034 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2736491228070176,
          "description": "min=0.2, mean=0.274, max=0.3, sum=1.368 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11132961223278444,
          "description": "min=0.092, mean=0.111, max=0.166, sum=0.557 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22035087719298244,
          "description": "min=0.17, mean=0.22, max=0.28, sum=1.102 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2121052631578947,
          "description": "min=0.16, mean=0.212, max=0.29, sum=1.061 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.290280701754386,
          "description": "min=0.211, mean=0.29, max=0.4, sum=4.354 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1514046561108303,
          "description": "min=0.1, mean=0.151, max=0.242, sum=2.271 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25776608187134503,
          "description": "min=0.19, mean=0.258, max=0.38, sum=3.866 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23500584795321638,
          "description": "min=0.167, mean=0.235, max=0.33, sum=3.525 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21847905223539232,
          "description": "min=0.173, mean=0.218, max=0.232, sum=3.277 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 4.326397660818714,
          "description": "min=2.482, mean=4.326, max=5, sum=64.896 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 420.5617309941521,
          "description": "min=382.49, mean=420.562, max=467.75, sum=6308.426 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2912046783625731,
          "description": "min=0.2, mean=0.291, max=0.39, sum=4.368 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13362255376880447,
          "description": "min=0.084, mean=0.134, max=0.202, sum=2.004 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2719415204678362,
          "description": "min=0.2, mean=0.272, max=0.37, sum=4.079 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2734502923976609,
          "description": "min=0.19, mean=0.273, max=0.36, sum=4.102 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18164482078684702,
          "description": "min=0.178, mean=0.182, max=0.184, sum=2.725 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 4.316222222222222,
          "description": "min=2.465, mean=4.316, max=5, sum=64.743 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 423.39457309941525,
          "description": "min=385.228, mean=423.395, max=467.79, sum=6350.919 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.31836257309941524,
          "description": "min=0.21, mean=0.318, max=0.48, sum=4.775 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14714449343481936,
          "description": "min=0.115, mean=0.147, max=0.194, sum=2.207 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2698479532163743,
          "description": "min=0.13, mean=0.27, max=0.45, sum=4.048 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28651461988304094,
          "description": "min=0.167, mean=0.287, max=0.43, sum=4.298 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1195572826114746,
          "description": "min=0.11, mean=0.12, max=0.138, sum=1.793 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2760350877192982,
          "description": "min=0.2, mean=0.276, max=0.37, sum=4.141 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13542563946906333,
          "description": "min=0.101, mean=0.135, max=0.172, sum=2.031 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21610526315789472,
          "description": "min=0.13, mean=0.216, max=0.32, sum=3.242 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22935672514619884,
          "description": "min=0.18, mean=0.229, max=0.33, sum=3.44 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.05452067670741475,
          "description": "min=0.041, mean=0.055, max=0.081, sum=0.818 (15)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.3206315789473684,
          "description": "min=0.23, mean=0.321, max=0.45, sum=1.603 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.063, mean=0.111, max=0.138, sum=0.557 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2676140350877193,
          "description": "min=0.18, mean=0.268, max=0.36, sum=1.338 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28410526315789475,
          "description": "min=0.19, mean=0.284, max=0.42, sum=1.421 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.422140350877193,
          "description": "min=0.2, mean=0.422, max=0.76, sum=2.111 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.127, mean=0.15, max=0.18, sum=0.748 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3696140350877193,
          "description": "min=0.14, mean=0.37, max=0.68, sum=1.848 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3853684210526316,
          "description": "min=0.18, mean=0.385, max=0.71, sum=1.927 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.531438596491228,
          "description": "min=0.33, mean=0.531, max=0.83, sum=2.657 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.051, mean=0.093, max=0.139, sum=0.464 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4609122807017544,
          "description": "min=0.22, mean=0.461, max=0.82, sum=2.305 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.49617543859649127,
          "description": "min=0.28, mean=0.496, max=0.81, sum=2.481 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5837192982456141,
          "description": "min=0.34, mean=0.584, max=0.89, sum=2.919 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5036842105263158,
          "description": "min=0.27, mean=0.504, max=0.81, sum=2.518 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5514385964912281,
          "description": "min=0.34, mean=0.551, max=0.84, sum=2.757 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.43066666666666664,
          "description": "min=0.28, mean=0.431, max=0.64, sum=2.153 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.37312280701754386,
          "description": "min=0.22, mean=0.373, max=0.57, sum=1.866 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.392140350877193,
          "description": "min=0.26, mean=0.392, max=0.59, sum=1.961 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5066666666666666,
          "description": "min=0.28, mean=0.507, max=0.84, sum=2.533 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.44438596491228066,
          "description": "min=0.22, mean=0.444, max=0.76, sum=2.222 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.46614035087719297,
          "description": "min=0.26, mean=0.466, max=0.79, sum=2.331 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5817192982456141,
          "description": "min=0.29, mean=0.582, max=0.92, sum=2.909 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5451929824561403,
          "description": "min=0.22, mean=0.545, max=0.9, sum=2.726 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5571929824561404,
          "description": "min=0.26, mean=0.557, max=0.91, sum=2.786 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.38463157894736844,
          "description": "min=0.263, mean=0.385, max=0.6, sum=1.923 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23428857555005617,
          "description": "min=0.151, mean=0.234, max=0.32, sum=1.171 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32410526315789473,
          "description": "min=0.18, mean=0.324, max=0.52, sum=1.621 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34585964912280703,
          "description": "min=0.219, mean=0.346, max=0.53, sum=1.729 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.43361403508771934,
          "description": "min=0.228, mean=0.434, max=0.7, sum=2.168 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17593793416924502,
          "description": "min=0.121, mean=0.176, max=0.315, sum=0.88 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3710877192982456,
          "description": "min=0.175, mean=0.371, max=0.65, sum=1.855 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38484210526315793,
          "description": "min=0.184, mean=0.385, max=0.68, sum=1.924 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4616491228070176,
          "description": "min=0.298, mean=0.462, max=0.72, sum=2.308 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19445587267296924,
          "description": "min=0.156, mean=0.194, max=0.246, sum=0.972 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4133684210526316,
          "description": "min=0.237, mean=0.413, max=0.69, sum=2.067 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4236140350877193,
          "description": "min=0.228, mean=0.424, max=0.7, sum=2.118 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 522.5470877192982,
          "description": "min=397.65, mean=522.547, max=684.675, sum=2612.735 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5722456140350877,
          "description": "min=0.28, mean=0.572, max=0.84, sum=2.861 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5332280701754385,
          "description": "min=0.24, mean=0.533, max=0.82, sum=2.666 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.541719298245614,
          "description": "min=0.27, mean=0.542, max=0.83, sum=2.709 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.46898245614035083,
          "description": "min=0.24, mean=0.469, max=0.78, sum=7.035 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12722994020701678,
          "description": "min=0.073, mean=0.127, max=0.202, sum=1.908 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40336842105263154,
          "description": "min=0.15, mean=0.403, max=0.75, sum=6.051 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41770760233918125,
          "description": "min=0.17, mean=0.418, max=0.75, sum=6.266 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.24178947368421053,
          "description": "min=0.2, mean=0.242, max=0.35, sum=3.627 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13220035950695058,
          "description": "min=0.103, mean=0.132, max=0.175, sum=1.983 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1694970760233918,
          "description": "min=0.09, mean=0.169, max=0.24, sum=2.542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2124327485380117,
          "description": "min=0.17, mean=0.212, max=0.31, sum=3.186 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4224093567251462,
          "description": "min=0.26, mean=0.422, max=0.7, sum=6.336 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13175836488041992,
          "description": "min=0.093, mean=0.132, max=0.18, sum=1.976 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3401169590643275,
          "description": "min=0.17, mean=0.34, max=0.6, sum=5.102 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3803040935672514,
          "description": "min=0.24, mean=0.38, max=0.61, sum=5.705 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21209971402138156,
          "description": "min=0.203, mean=0.212, max=0.221, sum=3.181 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.24280701754385967,
          "description": "min=0.19, mean=0.243, max=0.29, sum=3.642 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1380385889615569,
          "description": "min=0.069, mean=0.138, max=0.238, sum=2.071 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1902923976608187,
          "description": "min=0.1, mean=0.19, max=0.263, sum=2.854 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21771929824561406,
          "description": "min=0.15, mean=0.218, max=0.281, sum=3.266 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09245237979714913,
          "description": "min=0.091, mean=0.092, max=0.095, sum=1.387 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.23451461988304095,
          "description": "min=0.17, mean=0.235, max=0.35, sum=3.518 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13954639548632583,
          "description": "min=0.095, mean=0.14, max=0.179, sum=2.093 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.165906432748538,
          "description": "min=0.09, mean=0.166, max=0.24, sum=2.489 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20567251461988303,
          "description": "min=0.14, mean=0.206, max=0.28, sum=3.085 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11896953947368419,
          "description": "min=0.118, mean=0.119, max=0.12, sum=1.785 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.24276023391812865,
          "description": "min=0.132, mean=0.243, max=0.32, sum=3.641 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1282115692539908,
          "description": "min=0.049, mean=0.128, max=0.186, sum=1.923 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20357894736842103,
          "description": "min=0.105, mean=0.204, max=0.28, sum=3.054 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2103157894736842,
          "description": "min=0.053, mean=0.21, max=0.31, sum=3.155 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1402282775493421,
          "description": "min=0.14, mean=0.14, max=0.141, sum=2.103 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.568830409356725,
          "description": "min=0.28, mean=0.569, max=0.86, sum=8.532 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31740378740673564,
          "description": "min=0.127, mean=0.317, max=0.54, sum=4.761 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5167953216374268,
          "description": "min=0.19, mean=0.517, max=0.84, sum=7.752 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5369590643274853,
          "description": "min=0.24, mean=0.537, max=0.83, sum=8.054 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5676491228070175,
          "description": "min=0.26, mean=0.568, max=0.86, sum=8.515 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17629729974248792,
          "description": "min=0.064, mean=0.176, max=0.264, sum=2.644 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5245380116959065,
          "description": "min=0.23, mean=0.525, max=0.83, sum=7.868 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5309473684210526,
          "description": "min=0.24, mean=0.531, max=0.82, sum=7.964 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19643028419682018,
          "description": "min=0.175, mean=0.196, max=0.215, sum=2.946 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.23721637426900585,
          "description": "min=0.21, mean=0.237, max=0.298, sum=3.558 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4624557415628211,
          "description": "min=0.298, mean=0.462, max=0.534, sum=6.937 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22019883040935673,
          "description": "min=0.16, mean=0.22, max=0.272, sum=3.303 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23079532163742691,
          "description": "min=0.2, mean=0.231, max=0.281, sum=3.462 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13321992694627194,
          "description": "min=0.129, mean=0.133, max=0.14, sum=1.998 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.22873684210526318,
          "description": "min=0.11, mean=0.229, max=0.325, sum=3.431 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31056724427484883,
          "description": "min=0.16, mean=0.311, max=0.472, sum=4.659 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18602339181286548,
          "description": "min=0.1, mean=0.186, max=0.228, sum=2.79 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20512280701754387,
          "description": "min=0.09, mean=0.205, max=0.272, sum=3.077 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13263352809758774,
          "description": "min=0.131, mean=0.133, max=0.135, sum=1.99 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.23770760233918128,
          "description": "min=0.14, mean=0.238, max=0.31, sum=3.566 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5062965949265723,
          "description": "min=0.357, mean=0.506, max=0.666, sum=7.594 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17768421052631578,
          "description": "min=0.08, mean=0.178, max=0.28, sum=2.665 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.201766081871345,
          "description": "min=0.11, mean=0.202, max=0.28, sum=3.026 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08760755934758772,
          "description": "min=0.086, mean=0.088, max=0.089, sum=1.314 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5897543859649124,
          "description": "min=0.3, mean=0.59, max=0.85, sum=2.949 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5254736842105263,
          "description": "min=0.23, mean=0.525, max=0.79, sum=2.627 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5299649122807017,
          "description": "min=0.26, mean=0.53, max=0.8, sum=2.65 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 460.71996491228066,
          "description": "min=366.44, mean=460.72, max=607.43, sum=2303.6 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.012,
          "description": "min=1, mean=1.012, max=1.06, sum=5.06 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.3909122807017544,
          "description": "min=0.2, mean=0.391, max=0.73, sum=1.955 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2623859649122807,
          "description": "min=0.1, mean=0.262, max=0.49, sum=1.312 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31312280701754386,
          "description": "min=0.12, mean=0.313, max=0.66, sum=1.566 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 460.71996491228066,
          "description": "min=366.44, mean=460.72, max=607.43, sum=2303.6 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.3714035087719298,
          "description": "min=1.19, mean=1.371, max=1.61, sum=6.857 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.26287719298245615,
          "description": "min=0.24, mean=0.263, max=0.3, sum=1.314 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11506526711032969,
          "description": "min=0.082, mean=0.115, max=0.149, sum=0.575 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2168421052631579,
          "description": "min=0.184, mean=0.217, max=0.29, sum=1.084 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23210526315789473,
          "description": "min=0.2, mean=0.232, max=0.29, sum=1.161 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2573684210526316,
          "description": "min=0.22, mean=0.257, max=0.29, sum=1.287 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1238999810101579,
          "description": "min=0.09, mean=0.124, max=0.157, sum=0.619 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21785964912280703,
          "description": "min=0.18, mean=0.218, max=0.23, sum=1.089 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22210526315789475,
          "description": "min=0.18, mean=0.222, max=0.27, sum=1.111 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.30161403508771933,
          "description": "min=0.228, mean=0.302, max=0.38, sum=1.508 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09791468112621773,
          "description": "min=0.08, mean=0.098, max=0.13, sum=0.49 (5)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.2501052631578947,
          "description": "min=0.2, mean=0.25, max=0.33, sum=1.251 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.275859649122807,
          "description": "min=0.219, mean=0.276, max=0.34, sum=1.379 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.3631228070175439,
          "description": "min=0.246, mean=0.363, max=0.52, sum=1.816 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14292977551638825,
          "description": "min=0.092, mean=0.143, max=0.182, sum=0.715 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2910877192982456,
          "description": "min=0.175, mean=0.291, max=0.46, sum=1.455 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30533333333333335,
          "description": "min=0.167, mean=0.305, max=0.48, sum=1.527 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.43666666666666665,
          "description": "min=0.25, mean=0.437, max=0.68, sum=2.183 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.38087719298245615,
          "description": "min=0.25, mean=0.381, max=0.6, sum=1.904 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40989473684210526,
          "description": "min=0.24, mean=0.41, max=0.64, sum=2.049 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.44442105263157894,
          "description": "min=0.3, mean=0.444, max=0.64, sum=2.222 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.3826315789473684,
          "description": "min=0.22, mean=0.383, max=0.59, sum=1.913 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40038596491228073,
          "description": "min=0.24, mean=0.4, max=0.61, sum=2.002 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 467.935649122807,
          "description": "min=358.76, mean=467.936, max=612.798, sum=2339.678 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2863859649122807,
          "description": "min=0.17, mean=0.286, max=0.39, sum=1.432 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.23610526315789473,
          "description": "min=0.13, mean=0.236, max=0.37, sum=1.181 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26063157894736844,
          "description": "min=0.15, mean=0.261, max=0.33, sum=1.303 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 500.12014035087725,
          "description": "min=389.6, mean=500.12, max=664.281, sum=2500.601 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.27487719298245616,
          "description": "min=0.21, mean=0.275, max=0.34, sum=1.374 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.24961403508771932,
          "description": "min=0.2, mean=0.25, max=0.32, sum=1.248 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2613684210526316,
          "description": "min=0.2, mean=0.261, max=0.32, sum=1.307 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 500.12014035087725,
          "description": "min=389.6, mean=500.12, max=664.281, sum=2500.601 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.5089122807017544,
          "description": "min=0.32, mean=0.509, max=0.79, sum=2.545 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.4566315789473684,
          "description": "min=0.26, mean=0.457, max=0.76, sum=2.283 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4803859649122807,
          "description": "min=0.272, mean=0.48, max=0.78, sum=2.402 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 500.12014035087725,
          "description": "min=389.6, mean=500.12, max=664.281, sum=2500.601 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.49663157894736837,
          "description": "min=0.263, mean=0.497, max=0.82, sum=2.483 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.44561403508771924,
          "description": "min=0.228, mean=0.446, max=0.78, sum=2.228 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4658596491228071,
          "description": "min=0.219, mean=0.466, max=0.8, sum=2.329 (5)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=514 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=25 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 500.12014035087725,
          "description": "min=389.6, mean=500.12, max=664.281, sum=2500.601 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=5 (5)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.34397660818713455,
          "description": "min=0.23, mean=0.344, max=0.47, sum=5.16 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12760096192658882,
          "description": "min=0.075, mean=0.128, max=0.196, sum=1.914 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3203859649122807,
          "description": "min=0.17, mean=0.32, max=0.44, sum=4.806 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3148771929824561,
          "description": "min=0.22, mean=0.315, max=0.43, sum=4.723 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33523606010994367,
          "description": "min=0.194, mean=0.335, max=0.546, sum=5.029 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 460.63743859649117,
          "description": "min=354.52, mean=460.637, max=611.877, sum=6909.562 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.4027251461988304,
          "description": "min=0.23, mean=0.403, max=0.7, sum=6.041 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.34819883040935673,
          "description": "min=0.14, mean=0.348, max=0.65, sum=5.223 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3714502923976608,
          "description": "min=0.19, mean=0.371, max=0.66, sum=5.572 (15)",
          "style": {},
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.6090994152046784,
          "description": "min=0.35, mean=0.609, max=0.88, sum=9.136 (15)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5662339181286549,
          "description": "min=0.29, mean=0.566, max=0.86, sum=8.494 (15)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.5881637426900584,
          "description": "min=0.34, mean=0.588, max=0.86, sum=8.822 (15)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "5 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 472.2740350877193,
          "description": "min=371.38, mean=472.274, max=624.07, sum=7084.111 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "markdown": false
        },
        {
          "value": 0.2433684210526316,
          "description": "min=0.2, mean=0.243, max=0.28, sum=3.651 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7076962372990694,
          "description": "min=0.619, mean=0.708, max=0.769, sum=10.615 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2433684210526316,
          "description": "min=0.2, mean=0.243, max=0.28, sum=3.651 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2433684210526316,
          "description": "min=0.2, mean=0.243, max=0.28, sum=3.651 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14296402070471761,
          "description": "min=0.09, mean=0.143, max=0.217, sum=2.144 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 102.8,
          "description": "min=100, mean=102.8, max=114, sum=1542 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=75 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 453.38266666666664,
          "description": "min=354.96, mean=453.383, max=580.833, sum=6800.74 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=15 (15)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=45 (15)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu.json"
      }
    ],
    "name": "mmlu"
  },
  {
    "title": "subject: abstract_algebra",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.\n- Perturbation Robustness: Computes worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.\n- Perturbation Fairness: Computes worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model minus performance contention by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.19, mean=0.203, max=0.21, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13004230998308916,
          "description": "min=0.11, mean=0.13, max=0.141, sum=0.39 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16666666666666666,
          "description": "min=0.15, mean=0.167, max=0.18, sum=0.5 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.18, mean=0.19, max=0.2, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4267298437500006,
          "description": "min=0.427, mean=0.427, max=0.427, sum=1.28 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.24, mean=0.253, max=0.27, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14285350170617525,
          "description": "min=0.123, mean=0.143, max=0.159, sum=0.429 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666665,
          "description": "min=0.16, mean=0.187, max=0.22, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.16, mean=0.203, max=0.23, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3535309375,
          "description": "min=0.354, mean=0.354, max=0.354, sum=1.061 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.23, mean=0.26, max=0.28, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09374911844567375,
          "description": "min=0.063, mean=0.094, max=0.121, sum=0.281 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22666666666666666,
          "description": "min=0.18, mean=0.227, max=0.25, sum=0.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.22, mean=0.24, max=0.26, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38593109374999984,
          "description": "min=0.386, mean=0.386, max=0.386, sum=1.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande-v2-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.23, mean=0.25, max=0.28, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12877835812064034,
          "description": "min=0.12, mean=0.129, max=0.144, sum=0.386 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21000000000000005,
          "description": "min=0.2, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.19, mean=0.213, max=0.24, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24666666666666667,
          "description": "min=0.23, mean=0.247, max=0.26, sum=0.74 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16440959642045752,
          "description": "min=0.127, mean=0.164, max=0.199, sum=0.493 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.17, mean=0.183, max=0.2, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.22, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.24, mean=0.267, max=0.31, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14143399633830836,
          "description": "min=0.127, mean=0.141, max=0.166, sum=0.424 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.22, mean=0.237, max=0.27, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.23, mean=0.253, max=0.29, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2966666666666667,
          "description": "min=0.27, mean=0.297, max=0.32, sum=0.89 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12419956041531525,
          "description": "min=0.112, mean=0.124, max=0.148, sum=0.373 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.2, mean=0.217, max=0.23, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.25, mean=0.27, max=0.3, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 324.26,
          "description": "min=324.26, mean=324.26, max=324.26, sum=972.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-base%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.22, mean=0.25, max=0.29, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11093126295330225,
          "description": "min=0.087, mean=0.111, max=0.157, sum=0.333 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14666666666666664,
          "description": "min=0.1, mean=0.147, max=0.22, sum=0.44 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16333333333333333,
          "description": "min=0.09, mean=0.163, max=0.27, sum=0.49 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 360.75,
          "description": "min=360.75, mean=360.75, max=360.75, sum=1082.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-extended%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.23, mean=0.237, max=0.24, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13016478124428085,
          "description": "min=0.1, mean=0.13, max=0.163, sum=0.39 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12333333333333334,
          "description": "min=0.1, mean=0.123, max=0.17, sum=0.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18000000000000002,
          "description": "min=0.17, mean=0.18, max=0.19, sum=0.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 360.75,
          "description": "min=360.75, mean=360.75, max=360.75, sum=1082.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-supreme%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.22, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1342492309607748,
          "description": "min=0.129, mean=0.134, max=0.138, sum=0.403 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10666666666666667,
          "description": "min=0.08, mean=0.107, max=0.12, sum=0.32 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12333333333333334,
          "description": "min=0.11, mean=0.123, max=0.14, sum=0.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 360.75,
          "description": "min=360.75, mean=360.75, max=360.75, sum=1082.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Danthropic_stanford-online-all-v4-s3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.25, mean=0.267, max=0.28, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.123, mean=0.159, max=0.225, sum=0.476 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.17, mean=0.19, max=0.21, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.23, mean=0.237, max=0.25, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5819928515624998,
          "description": "min=0.582, mean=0.582, max=0.582, sum=1.746 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_bloom%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.23, mean=0.243, max=0.26, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14794283505574565,
          "description": "min=0.137, mean=0.148, max=0.166, sum=0.444 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.2, mean=0.207, max=0.22, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22999999999999998,
          "description": "min=0.22, mean=0.23, max=0.24, sum=0.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1558831545710563,
          "description": "min=0.149, mean=0.156, max=0.16, sum=0.468 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 333.02,
          "description": "min=333.02, mean=333.02, max=333.02, sum=999.06 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t0pp%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.25, mean=0.257, max=0.26, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29613897711626275,
          "description": "min=0.293, mean=0.296, max=0.3, sum=0.888 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.25, mean=0.257, max=0.26, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.25, mean=0.257, max=0.26, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14693959375222523,
          "description": "min=0.147, mean=0.147, max=0.147, sum=0.441 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 393.52,
          "description": "min=393.52, mean=393.52, max=393.52, sum=1180.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20220609%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2833333333333334,
          "description": "min=0.28, mean=0.283, max=0.29, sum=0.85 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1240734248554128,
          "description": "min=0.089, mean=0.124, max=0.173, sum=0.372 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.22, mean=0.243, max=0.28, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.26, mean=0.267, max=0.27, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4704379882812501,
          "description": "min=0.47, mean=0.47, max=0.47, sum=1.411 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_large-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2933333333333334,
          "description": "min=0.27, mean=0.293, max=0.31, sum=0.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10421486035424381,
          "description": "min=0.093, mean=0.104, max=0.122, sum=0.313 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.17, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.24, mean=0.263, max=0.29, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2924594726562501,
          "description": "min=0.292, mean=0.292, max=0.292, sum=0.877 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.18, mean=0.24, max=0.29, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13694644404615863,
          "description": "min=0.088, mean=0.137, max=0.164, sum=0.411 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14333333333333334,
          "description": "min=0.09, mean=0.143, max=0.21, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.15, mean=0.22, max=0.28, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26462695312499995,
          "description": "min=0.265, mean=0.265, max=0.265, sum=0.794 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_small-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.24, mean=0.25, max=0.26, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19558083416988273,
          "description": "min=0.192, mean=0.196, max=0.202, sum=0.587 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.23, mean=0.237, max=0.25, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.24, mean=0.25, max=0.26, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2646899414062499,
          "description": "min=0.265, mean=0.265, max=0.265, sum=0.794 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.21, mean=0.233, max=0.25, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1431885636537816,
          "description": "min=0.131, mean=0.143, max=0.157, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.12, mean=0.16, max=0.18, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18000000000000002,
          "description": "min=0.13, mean=0.18, max=0.23, sum=0.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.19, mean=0.257, max=0.32, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13602277572504493,
          "description": "min=0.115, mean=0.136, max=0.167, sum=0.408 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.18, mean=0.21, max=0.24, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.19, mean=0.24, max=0.3, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-medium-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3466666666666667,
          "description": "min=0.34, mean=0.347, max=0.36, sum=1.04 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15392852452736658,
          "description": "min=0.13, mean=0.154, max=0.201, sum=0.462 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27666666666666667,
          "description": "min=0.24, mean=0.277, max=0.32, sum=0.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3233333333333333,
          "description": "min=0.31, mean=0.323, max=0.33, sum=0.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-xlarge-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.23, mean=0.253, max=0.3, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14600807436466365,
          "description": "min=0.128, mean=0.146, max=0.161, sum=0.438 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.15, mean=0.183, max=0.23, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.19, mean=0.22, max=0.27, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 372.75,
          "description": "min=372.75, mean=372.75, max=372.75, sum=1118.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-j-6b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.22, mean=0.25, max=0.29, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1144252023430285,
          "description": "min=0.095, mean=0.114, max=0.141, sum=0.343 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.21, mean=0.237, max=0.26, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.06982824911673863,
          "description": "min=0.07, mean=0.07, max=0.07, sum=0.209 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-neox-20b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.24, mean=0.267, max=0.29, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12943498646287685,
          "description": "min=0.118, mean=0.129, max=0.137, sum=0.388 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19999999999999998,
          "description": "min=0.17, mean=0.2, max=0.24, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24666666666666667,
          "description": "min=0.23, mean=0.247, max=0.26, sum=0.74 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09395700715837019,
          "description": "min=0.093, mean=0.094, max=0.095, sum=0.282 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=1076.28 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-6.9b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.06401448641361196,
          "description": "min=0.064, mean=0.064, max=0.064, sum=0.064 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-12b-v0%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16564084927593548,
          "description": "min=0.166, mean=0.166, max=0.166, sum=0.166 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t5-11b%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.25, mean=0.257, max=0.26, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16948243084782047,
          "description": "min=0.144, mean=0.169, max=0.203, sum=0.508 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.24, mean=0.25, max=0.26, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.24, mean=0.25, max=0.26, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2313456954227554,
          "description": "min=0.23, mean=0.231, max=0.232, sum=0.694 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 393.52,
          "description": "min=393.52, mean=393.52, max=393.52, sum=1180.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_ul2%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%2Cglobal_prefix%3Dnlg%22%5D",
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.24, mean=0.273, max=0.3, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12302690493121325,
          "description": "min=0.098, mean=0.123, max=0.154, sum=0.369 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.21, mean=0.253, max=0.28, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.22, mean=0.263, max=0.29, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18282412769397108,
          "description": "min=0.182, mean=0.183, max=0.183, sum=0.548 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.52,
          "description": "min=397.52, mean=397.52, max=397.52, sum=1192.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-175b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21999999999999997,
          "description": "min=0.21, mean=0.22, max=0.24, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.160741217873278,
          "description": "min=0.127, mean=0.161, max=0.194, sum=0.482 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19666666666666668,
          "description": "min=0.18, mean=0.197, max=0.21, sum=0.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.19, mean=0.21, max=0.24, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11172814465704417,
          "description": "min=0.111, mean=0.112, max=0.113, sum=0.335 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-66b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.2, mean=0.21, max=0.22, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1456340952502241,
          "description": "min=0.127, mean=0.146, max=0.171, sum=0.437 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.13, mean=0.17, max=0.2, sum=0.51 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.18, mean=0.203, max=0.22, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.048790259535113976,
          "description": "min=0.041, mean=0.049, max=0.064, sum=0.146 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.138, mean=0.138, max=0.138, sum=0.138 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.21, mean=0.21, max=0.21, sum=0.21 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19,
          "description": "min=0.19, mean=0.19, max=0.19, sum=0.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.136, mean=0.136, max=0.136, sum=0.136 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.33,
          "description": "min=0.33, mean=0.33, max=0.33, sum=0.33 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.118, mean=0.118, max=0.118, sum=0.118 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-65b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-70b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dstanford_alpaca-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2375565920391104,
          "description": "min=0.238, mean=0.238, max=0.238, sum=0.238 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-7b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18516542854902118,
          "description": "min=0.185, mean=0.185, max=0.185, sum=0.185 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.21, mean=0.21, max=0.21, sum=0.21 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-13b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2188828890644999,
          "description": "min=0.219, mean=0.219, max=0.219, sum=0.219 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.65,
          "description": "min=397.65, mean=397.65, max=397.65, sum=397.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmistralai_mistral-7b-v0.1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_530B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.24, mean=0.257, max=0.27, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13222976886680096,
          "description": "min=0.087, mean=0.132, max=0.158, sum=0.397 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16666666666666666,
          "description": "min=0.15, mean=0.167, max=0.19, sum=0.5 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666668,
          "description": "min=0.17, mean=0.187, max=0.21, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_7B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.23, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15850967106374994,
          "description": "min=0.138, mean=0.159, max=0.174, sum=0.476 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12333333333333334,
          "description": "min=0.09, mean=0.123, max=0.15, sum=0.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17666666666666667,
          "description": "min=0.17, mean=0.177, max=0.19, sum=0.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_davinci%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.26, mean=0.263, max=0.27, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12068802193672283,
          "description": "min=0.106, mean=0.121, max=0.14, sum=0.362 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666668,
          "description": "min=0.17, mean=0.187, max=0.2, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.24, mean=0.257, max=0.27, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21564082031249995,
          "description": "min=0.216, mean=0.216, max=0.216, sum=0.647 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_curie%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.19, mean=0.21, max=0.24, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15793562166845154,
          "description": "min=0.126, mean=0.158, max=0.188, sum=0.474 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15666666666666665,
          "description": "min=0.1, mean=0.157, max=0.22, sum=0.47 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.16, mean=0.183, max=0.23, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09060562499999998,
          "description": "min=0.091, mean=0.091, max=0.091, sum=0.272 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_babbage%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.21, mean=0.25, max=0.29, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16001863882611692,
          "description": "min=0.143, mean=0.16, max=0.179, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19999999999999998,
          "description": "min=0.17, mean=0.2, max=0.22, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.2, mean=0.237, max=0.27, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11897886718749999,
          "description": "min=0.119, mean=0.119, max=0.119, sum=0.357 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_ada%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.21, mean=0.257, max=0.31, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10565837371158471,
          "description": "min=0.049, mean=0.106, max=0.138, sum=0.317 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.19, mean=0.22, max=0.28, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24666666666666667,
          "description": "min=0.19, mean=0.247, max=0.31, sum=0.74 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.140352109375,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.421 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-003%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2866666666666667,
          "description": "min=0.28, mean=0.287, max=0.3, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5281465738827426,
          "description": "min=0.512, mean=0.528, max=0.54, sum=1.584 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.19, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.24, mean=0.253, max=0.27, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-002%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2933333333333334,
          "description": "min=0.26, mean=0.293, max=0.32, sum=0.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23701685350779825,
          "description": "min=0.222, mean=0.237, max=0.264, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.23, mean=0.233, max=0.24, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.24, mean=0.257, max=0.27, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20632390625000005,
          "description": "min=0.206, mean=0.206, max=0.206, sum=0.619 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-curie-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.21, mean=0.22, max=0.23, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.401900566266635,
          "description": "min=0.298, mean=0.402, max=0.475, sum=1.206 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19666666666666666,
          "description": "min=0.16, mean=0.197, max=0.22, sum=0.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.22, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.129154921875,
          "description": "min=0.129, mean=0.129, max=0.129, sum=0.387 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-babbage-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.15666666666666665,
          "description": "min=0.11, mean=0.157, max=0.18, sum=0.47 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.45613969203952826,
          "description": "min=0.444, mean=0.456, max=0.472, sum=1.368 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14333333333333334,
          "description": "min=0.1, mean=0.143, max=0.17, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15,
          "description": "min=0.09, mean=0.15, max=0.18, sum=0.45 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13070421875,
          "description": "min=0.131, mean=0.131, max=0.131, sum=0.392 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-ada-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.22, mean=0.273, max=0.31, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4604782316693649,
          "description": "min=0.377, mean=0.46, max=0.604, sum=1.381 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.18, mean=0.203, max=0.25, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.21, mean=0.253, max=0.28, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.086154921875,
          "description": "min=0.086, mean=0.086, max=0.086, sum=0.258 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0301%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 366.44,
          "description": "min=366.44, mean=366.44, max=366.44, sum=366.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0613%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.1,
          "description": "min=0.1, mean=0.1, max=0.1, sum=0.1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12,
          "description": "min=0.12, mean=0.12, max=0.12, sum=0.12 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 366.44,
          "description": "min=366.44, mean=366.44, max=366.44, sum=366.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.37,
          "description": "min=1.37, mean=1.37, max=1.37, sum=1.37 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14934370057280555,
          "description": "min=0.149, mean=0.149, max=0.149, sum=0.149 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15713480900071625,
          "description": "min=0.157, mean=0.157, max=0.157, sum=0.157 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08061130048831104,
          "description": "min=0.081, mean=0.081, max=0.081, sum=0.081 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14175099389171522,
          "description": "min=0.142, mean=0.142, max=0.142, sum=0.142 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-instruct-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 358.76,
          "description": "min=358.76, mean=358.76, max=358.76, sum=358.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.92,
          "description": "min=396.92, mean=396.92, max=396.92, sum=396.92 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.92,
          "description": "min=396.92, mean=396.92, max=396.92, sum=396.92 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.92,
          "description": "min=396.92, mean=396.92, max=396.92, sum=396.92 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 396.92,
          "description": "min=396.92, mean=396.92, max=396.92, sum=396.92 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_glm%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.23, mean=0.25, max=0.28, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12554701004226645,
          "description": "min=0.109, mean=0.126, max=0.143, sum=0.377 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.17, mean=0.22, max=0.26, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.22, mean=0.233, max=0.24, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22840096232791743,
          "description": "min=0.223, mean=0.228, max=0.238, sum=0.685 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 354.52,
          "description": "min=354.52, mean=354.52, max=354.52, sum=1063.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-instruct-30%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.23, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.16333333333333333,
          "description": "min=0.14, mean=0.163, max=0.21, sum=0.49 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.22, mean=0.233, max=0.24, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-x%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.37333333333333335,
          "description": "min=0.35, mean=0.373, max=0.4, sum=1.12 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.3033333333333333,
          "description": "min=0.29, mean=0.303, max=0.32, sum=0.91 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.3466666666666667,
          "description": "min=0.34, mean=0.347, max=0.35, sum=1.04 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.37999999999994,
          "description": "min=371.38, mean=371.38, max=371.38, sum=1114.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20abstract_algebra&runSpecs=%5B%22mmlu%3Asubject%3Dabstract_algebra%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_yalm%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7355750087018674,
          "description": "min=0.714, mean=0.736, max=0.753, sum=2.207 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09078174985945225,
          "description": "min=0.09, mean=0.091, max=0.091, sum=0.272 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 354.96,
          "description": "min=354.96, mean=354.96, max=354.96, sum=1064.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu_subject:abstract_algebra.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu_subject:abstract_algebra.json"
      }
    ],
    "name": "mmlu_subject:abstract_algebra"
  },
  {
    "title": "subject: college_chemistry",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output exactly matches a correct reference.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output exactly matches a correct reference.\n- Perturbation Robustness: Computes worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output exactly matches a correct reference.\n- Perturbation Fairness: Computes worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model minus performance contention by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.19, mean=0.207, max=0.22, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16397869768042903,
          "description": "min=0.154, mean=0.164, max=0.172, sum=0.492 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.15, mean=0.17, max=0.18, sum=0.51 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18000000000000002,
          "description": "min=0.17, mean=0.18, max=0.19, sum=0.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5019898242187504,
          "description": "min=0.502, mean=0.502, max=0.502, sum=1.506 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.2, mean=0.22, max=0.25, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12025063606290846,
          "description": "min=0.097, mean=0.12, max=0.134, sum=0.361 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17666666666666667,
          "description": "min=0.17, mean=0.177, max=0.19, sum=0.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.17, mean=0.19, max=0.22, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40662460937500006,
          "description": "min=0.407, mean=0.407, max=0.407, sum=1.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.2, mean=0.217, max=0.23, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14271415954802005,
          "description": "min=0.125, mean=0.143, max=0.154, sum=0.428 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17333333333333334,
          "description": "min=0.15, mean=0.173, max=0.21, sum=0.52 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.18, mean=0.203, max=0.22, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43769576171874985,
          "description": "min=0.438, mean=0.438, max=0.438, sum=1.313 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande-v2-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.26, mean=0.273, max=0.29, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13718202713050248,
          "description": "min=0.108, mean=0.137, max=0.184, sum=0.412 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22999999999999998,
          "description": "min=0.21, mean=0.23, max=0.25, sum=0.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.23, mean=0.24, max=0.25, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.31, mean=0.34, max=0.39, sum=1.02 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11967979689419712,
          "description": "min=0.107, mean=0.12, max=0.134, sum=0.359 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27666666666666667,
          "description": "min=0.25, mean=0.277, max=0.31, sum=0.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29000000000000004,
          "description": "min=0.26, mean=0.29, max=0.32, sum=0.87 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3633333333333333,
          "description": "min=0.36, mean=0.363, max=0.37, sum=1.09 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09383270950828138,
          "description": "min=0.076, mean=0.094, max=0.122, sum=0.281 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30333333333333334,
          "description": "min=0.28, mean=0.303, max=0.33, sum=0.91 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.29, mean=0.3, max=0.31, sum=0.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2933333333333334,
          "description": "min=0.26, mean=0.293, max=0.33, sum=0.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0941304355373469,
          "description": "min=0.06, mean=0.094, max=0.117, sum=0.282 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.17, mean=0.183, max=0.19, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.22, mean=0.26, max=0.3, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 477.01,
          "description": "min=477.01, mean=477.01, max=477.01, sum=1431.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-base%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.2, mean=0.263, max=0.3, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11528874957033723,
          "description": "min=0.106, mean=0.115, max=0.125, sum=0.346 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1366666666666667,
          "description": "min=0.13, mean=0.137, max=0.14, sum=0.41 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14333333333333334,
          "description": "min=0.11, mean=0.143, max=0.17, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 547.85,
          "description": "min=547.85, mean=547.85, max=547.85, sum=1643.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-extended%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.23, mean=0.25, max=0.26, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10422634829630807,
          "description": "min=0.075, mean=0.104, max=0.131, sum=0.313 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.19, mean=0.19, max=0.19, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666665,
          "description": "min=0.17, mean=0.187, max=0.21, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 547.85,
          "description": "min=547.85, mean=547.85, max=547.85, sum=1643.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-supreme%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3233333333333333,
          "description": "min=0.3, mean=0.323, max=0.36, sum=0.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13943706608019577,
          "description": "min=0.122, mean=0.139, max=0.173, sum=0.418 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666668,
          "description": "min=0.18, mean=0.187, max=0.19, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.17, mean=0.19, max=0.23, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 547.85,
          "description": "min=547.85, mean=547.85, max=547.85, sum=1643.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Danthropic_stanford-online-all-v4-s3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.41333333333333333,
          "description": "min=0.39, mean=0.413, max=0.44, sum=1.24 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.109,
          "description": "min=0.071, mean=0.109, max=0.138, sum=0.326 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3466666666666667,
          "description": "min=0.33, mean=0.347, max=0.36, sum=1.04 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3666666666666667,
          "description": "min=0.32, mean=0.367, max=0.41, sum=1.1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.556292529296875,
          "description": "min=0.556, mean=0.556, max=0.556, sum=1.669 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_bloom%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.19, mean=0.213, max=0.23, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14171600390009456,
          "description": "min=0.124, mean=0.142, max=0.153, sum=0.425 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18000000000000002,
          "description": "min=0.17, mean=0.18, max=0.19, sum=0.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19333333333333336,
          "description": "min=0.18, mean=0.193, max=0.21, sum=0.58 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14160722457445585,
          "description": "min=0.135, mean=0.142, max=0.148, sum=0.425 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 492.01,
          "description": "min=492.01, mean=492.01, max=492.01, sum=1476.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t0pp%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.29, mean=0.34, max=0.39, sum=1.02 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17152406097998318,
          "description": "min=0.139, mean=0.172, max=0.215, sum=0.515 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.25, mean=0.28, max=0.3, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30000000000000004,
          "description": "min=0.25, mean=0.3, max=0.34, sum=0.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14304039351145428,
          "description": "min=0.141, mean=0.143, max=0.147, sum=0.429 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 594.8,
          "description": "min=594.8, mean=594.8, max=594.8, sum=1784.4 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20220609%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.25, mean=0.273, max=0.3, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10790054818662764,
          "description": "min=0.09, mean=0.108, max=0.12, sum=0.324 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.18, mean=0.207, max=0.23, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.21, mean=0.243, max=0.27, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48594037109374993,
          "description": "min=0.486, mean=0.486, max=0.486, sum=1.458 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_large-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2866666666666667,
          "description": "min=0.19, mean=0.287, max=0.35, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13011982202591446,
          "description": "min=0.118, mean=0.13, max=0.151, sum=0.39 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22333333333333336,
          "description": "min=0.15, mean=0.223, max=0.27, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.14, mean=0.237, max=0.32, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3336808496093749,
          "description": "min=0.334, mean=0.334, max=0.334, sum=1.001 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.23, mean=0.263, max=0.3, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11177652159840872,
          "description": "min=0.098, mean=0.112, max=0.138, sum=0.335 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.13, mean=0.16, max=0.18, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333337,
          "description": "min=0.17, mean=0.203, max=0.27, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.288873740234375,
          "description": "min=0.289, mean=0.289, max=0.289, sum=0.867 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_small-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32666666666666666,
          "description": "min=0.18, mean=0.327, max=0.42, sum=0.98 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1335681964497074,
          "description": "min=0.115, mean=0.134, max=0.143, sum=0.401 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30333333333333334,
          "description": "min=0.13, mean=0.303, max=0.42, sum=0.91 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2833333333333333,
          "description": "min=0.1, mean=0.283, max=0.4, sum=0.85 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30212056640624996,
          "description": "min=0.302, mean=0.302, max=0.302, sum=0.906 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.21, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13604810726431601,
          "description": "min=0.112, mean=0.136, max=0.166, sum=0.408 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18666666666666668,
          "description": "min=0.18, mean=0.187, max=0.19, sum=0.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17666666666666667,
          "description": "min=0.15, mean=0.177, max=0.19, sum=0.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.24, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11212034501788289,
          "description": "min=0.092, mean=0.112, max=0.134, sum=0.336 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.21, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2233333333333333,
          "description": "min=0.21, mean=0.223, max=0.24, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-medium-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.26, mean=0.28, max=0.29, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17129650211750888,
          "description": "min=0.16, mean=0.171, max=0.181, sum=0.514 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.2, mean=0.24, max=0.27, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-xlarge-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=1.02 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16352932889326802,
          "description": "min=0.145, mean=0.164, max=0.193, sum=0.491 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.26, mean=0.273, max=0.29, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2833333333333334,
          "description": "min=0.26, mean=0.283, max=0.3, sum=0.85 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 563.33,
          "description": "min=563.33, mean=563.33, max=563.33, sum=1689.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-j-6b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.16333333333333336,
          "description": "min=0.14, mean=0.163, max=0.19, sum=0.49 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14289129285984933,
          "description": "min=0.139, mean=0.143, max=0.149, sum=0.429 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13,
          "description": "min=0.11, mean=0.13, max=0.14, sum=0.39 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13333333333333333,
          "description": "min=0.13, mean=0.133, max=0.14, sum=0.4 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07056914776563644,
          "description": "min=0.07, mean=0.071, max=0.071, sum=0.212 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-neox-20b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.21, mean=0.237, max=0.26, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13519661836335625,
          "description": "min=0.116, mean=0.135, max=0.145, sum=0.406 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.17, mean=0.183, max=0.19, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.19, mean=0.217, max=0.23, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09380936773050398,
          "description": "min=0.093, mean=0.094, max=0.095, sum=0.281 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=1607.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-6.9b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.16, mean=0.16, max=0.16, sum=0.16 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19951882837054843,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12,
          "description": "min=0.12, mean=0.12, max=0.12, sum=0.12 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.14 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-12b-v0%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10145685405763509,
          "description": "min=0.101, mean=0.101, max=0.101, sum=0.101 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.16, mean=0.16, max=0.16, sum=0.16 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t5-11b%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.32666666666666666,
          "description": "min=0.31, mean=0.327, max=0.35, sum=0.98 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12453723424839762,
          "description": "min=0.101, mean=0.125, max=0.167, sum=0.374 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.25, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.26, mean=0.267, max=0.27, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21684571991364165,
          "description": "min=0.21, mean=0.217, max=0.221, sum=0.651 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.56,
          "description": "min=3.19, mean=3.56, max=3.75, sum=10.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 463.2,
          "description": "min=457.06, mean=463.2, max=467.75, sum=1389.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_ul2%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%2Cglobal_prefix%3Dnlg%22%5D",
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17140502711168673,
          "description": "min=0.136, mean=0.171, max=0.202, sum=0.514 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.2, mean=0.207, max=0.21, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.19, mean=0.2, max=0.21, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17870940214395525,
          "description": "min=0.178, mean=0.179, max=0.179, sum=0.536 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.53,
          "description": "min=3.15, mean=3.53, max=3.74, sum=10.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 464.18666666666667,
          "description": "min=457.02, mean=464.187, max=467.79, sum=1392.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-175b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2866666666666667,
          "description": "min=0.26, mean=0.287, max=0.3, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11685322103486995,
          "description": "min=0.115, mean=0.117, max=0.118, sum=0.351 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.13, mean=0.213, max=0.26, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.24, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1205012358937944,
          "description": "min=0.12, mean=0.121, max=0.121, sum=0.362 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-66b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34333333333333327,
          "description": "min=0.31, mean=0.343, max=0.37, sum=1.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15115391119805247,
          "description": "min=0.119, mean=0.151, max=0.172, sum=0.453 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.21, mean=0.273, max=0.32, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2866666666666667,
          "description": "min=0.24, mean=0.287, max=0.33, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0633280340120906,
          "description": "min=0.043, mean=0.063, max=0.081, sum=0.19 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.063, mean=0.063, max=0.063, sum=0.063 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.14 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.078, mean=0.078, max=0.078, sum=0.078 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31,
          "description": "min=0.31, mean=0.31, max=0.31, sum=0.31 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38,
          "description": "min=0.38, mean=0.38, max=0.38, sum=0.38 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-65b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.46,
          "description": "min=0.46, mean=0.46, max=0.46, sum=0.46 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.43, mean=0.43, max=0.43, sum=0.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.4,
          "description": "min=0.4, mean=0.4, max=0.4, sum=0.4 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-70b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5,
          "description": "min=0.5, mean=0.5, max=0.5, sum=0.5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.47,
          "description": "min=0.47, mean=0.47, max=0.47, sum=0.47 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48,
          "description": "min=0.48, mean=0.48, max=0.48, sum=0.48 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dstanford_alpaca-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2857420460290844,
          "description": "min=0.286, mean=0.286, max=0.286, sum=0.286 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-7b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12336722699936771,
          "description": "min=0.123, mean=0.123, max=0.123, sum=0.123 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-13b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19283490442798235,
          "description": "min=0.193, mean=0.193, max=0.193, sum=0.193 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31,
          "description": "min=0.31, mean=0.31, max=0.31, sum=0.31 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 622.43,
          "description": "min=622.43, mean=622.43, max=622.43, sum=622.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmistralai_mistral-7b-v0.1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.48,
          "description": "min=0.48, mean=0.48, max=0.48, sum=0.48 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.43, mean=0.43, max=0.43, sum=0.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.45,
          "description": "min=0.45, mean=0.45, max=0.45, sum=0.45 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_530B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.33,
          "description": "min=0.31, mean=0.33, max=0.36, sum=0.99 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13715509444071508,
          "description": "min=0.1, mean=0.137, max=0.202, sum=0.411 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.22, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.23, mean=0.253, max=0.27, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_7B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.22, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12719607038727151,
          "description": "min=0.103, mean=0.127, max=0.164, sum=0.382 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.15, mean=0.183, max=0.21, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.21, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_davinci%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27666666666666667,
          "description": "min=0.26, mean=0.277, max=0.3, sum=0.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11954811144676253,
          "description": "min=0.093, mean=0.12, max=0.156, sum=0.359 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.17, mean=0.21, max=0.24, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.24, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20424852539062496,
          "description": "min=0.204, mean=0.204, max=0.204, sum=0.613 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_curie%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22333333333333336,
          "description": "min=0.22, mean=0.223, max=0.23, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16015341188793958,
          "description": "min=0.126, mean=0.16, max=0.224, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.16, mean=0.17, max=0.18, sum=0.51 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.15, mean=0.17, max=0.2, sum=0.51 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09323458984375003,
          "description": "min=0.093, mean=0.093, max=0.093, sum=0.28 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_babbage%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.17, mean=0.183, max=0.19, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1725547228812219,
          "description": "min=0.169, mean=0.173, max=0.177, sum=0.518 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09999999999999999,
          "description": "min=0.09, mean=0.1, max=0.11, sum=0.3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1466666666666667,
          "description": "min=0.14, mean=0.147, max=0.16, sum=0.44 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11803564453124997,
          "description": "min=0.118, mean=0.118, max=0.118, sum=0.354 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_ada%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.18000000000000002,
          "description": "min=0.17, mean=0.18, max=0.19, sum=0.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17685471275547607,
          "description": "min=0.165, mean=0.177, max=0.186, sum=0.531 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15333333333333335,
          "description": "min=0.14, mean=0.153, max=0.18, sum=0.46 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1366666666666667,
          "description": "min=0.13, mean=0.137, max=0.14, sum=0.41 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13958224609375,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.419 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-003%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.4566666666666666,
          "description": "min=0.44, mean=0.457, max=0.49, sum=1.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3552143909979209,
          "description": "min=0.342, mean=0.355, max=0.364, sum=1.066 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4000000000000001,
          "description": "min=0.38, mean=0.4, max=0.43, sum=1.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.42, mean=0.43, max=0.44, sum=1.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-002%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.47333333333333333,
          "description": "min=0.47, mean=0.473, max=0.48, sum=1.42 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17400204885499146,
          "description": "min=0.166, mean=0.174, max=0.182, sum=0.522 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.44333333333333336,
          "description": "min=0.44, mean=0.443, max=0.45, sum=1.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.45333333333333337,
          "description": "min=0.45, mean=0.453, max=0.46, sum=1.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17933924804687504,
          "description": "min=0.179, mean=0.179, max=0.179, sum=0.538 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-curie-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.22, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48006619021696517,
          "description": "min=0.459, mean=0.48, max=0.499, sum=1.44 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.22, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.2, mean=0.207, max=0.21, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13672244140624998,
          "description": "min=0.137, mean=0.137, max=0.137, sum=0.41 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-babbage-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.21, mean=0.237, max=0.26, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29807272660548884,
          "description": "min=0.297, mean=0.298, max=0.299, sum=0.894 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19333333333333336,
          "description": "min=0.18, mean=0.193, max=0.22, sum=0.58 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.18, mean=0.21, max=0.25, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13384642578125,
          "description": "min=0.134, mean=0.134, max=0.134, sum=0.402 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-ada-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.19, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.49416435616549226,
          "description": "min=0.465, mean=0.494, max=0.517, sum=1.482 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13,
          "description": "min=0.09, mean=0.13, max=0.15, sum=0.39 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15333333333333332,
          "description": "min=0.13, mean=0.153, max=0.18, sum=0.46 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08810412109374999,
          "description": "min=0.088, mean=0.088, max=0.088, sum=0.264 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0301%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.52,
          "description": "min=0.52, mean=0.52, max=0.52, sum=0.52 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.44,
          "description": "min=0.44, mean=0.44, max=0.44, sum=0.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.43, mean=0.43, max=0.43, sum=0.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 542.4,
          "description": "min=542.4, mean=542.4, max=542.4, sum=542.4 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0613%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.21, mean=0.21, max=0.21, sum=0.21 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 542.4,
          "description": "min=542.4, mean=542.4, max=542.4, sum=542.4 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.38,
          "description": "min=1.38, mean=1.38, max=1.38, sum=1.38 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0815254535285509,
          "description": "min=0.082, mean=0.082, max=0.082, sum=0.082 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14668044456475351,
          "description": "min=0.147, mean=0.147, max=0.147, sum=0.147 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11619046034234375,
          "description": "min=0.116, mean=0.116, max=0.116, sum=0.116 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18152577186689586,
          "description": "min=0.182, mean=0.182, max=0.182, sum=0.182 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-instruct-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 535.85,
          "description": "min=535.85, mean=535.85, max=535.85, sum=535.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.17, mean=0.17, max=0.17, sum=0.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.13,
          "description": "min=0.13, mean=0.13, max=0.13, sum=0.13 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15,
          "description": "min=0.15, mean=0.15, max=0.15, sum=0.15 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 600.61,
          "description": "min=600.61, mean=600.61, max=600.61, sum=600.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.21, mean=0.21, max=0.21, sum=0.21 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 600.61,
          "description": "min=600.61, mean=600.61, max=600.61, sum=600.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.44,
          "description": "min=0.44, mean=0.44, max=0.44, sum=0.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.39,
          "description": "min=0.39, mean=0.39, max=0.39, sum=0.39 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 600.61,
          "description": "min=600.61, mean=600.61, max=600.61, sum=600.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.44,
          "description": "min=0.44, mean=0.44, max=0.44, sum=0.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.37,
          "description": "min=0.37, mean=0.37, max=0.37, sum=0.37 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 600.61,
          "description": "min=600.61, mean=600.61, max=600.61, sum=600.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_glm%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.32333333333333336,
          "description": "min=0.28, mean=0.323, max=0.35, sum=0.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10778793189086984,
          "description": "min=0.075, mean=0.108, max=0.132, sum=0.323 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3033333333333334,
          "description": "min=0.28, mean=0.303, max=0.32, sum=0.91 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30000000000000004,
          "description": "min=0.27, mean=0.3, max=0.33, sum=0.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1974133338034153,
          "description": "min=0.194, mean=0.197, max=0.204, sum=0.592 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 548.66,
          "description": "min=548.66, mean=548.66, max=548.66, sum=1645.98 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-instruct-30%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.23, mean=0.233, max=0.24, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.17333333333333334,
          "description": "min=0.16, mean=0.173, max=0.19, sum=0.52 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19999999999999998,
          "description": "min=0.19, mean=0.2, max=0.21, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-x%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5133333333333333,
          "description": "min=0.5, mean=0.513, max=0.52, sum=1.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.49,
          "description": "min=0.48, mean=0.49, max=0.5, sum=1.47 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.5033333333333333,
          "description": "min=0.5, mean=0.503, max=0.51, sum=1.51 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 545.4,
          "description": "min=545.4, mean=545.4, max=545.4, sum=1636.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20college_chemistry&runSpecs=%5B%22mmlu%3Asubject%3Dcollege_chemistry%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_yalm%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7618656374525542,
          "description": "min=0.757, mean=0.762, max=0.769, sum=2.286 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09517944221695264,
          "description": "min=0.095, mean=0.095, max=0.095, sum=0.286 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 518.39,
          "description": "min=518.39, mean=518.39, max=518.39, sum=1555.17 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu_subject:college_chemistry.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu_subject:college_chemistry.json"
      }
    ],
    "name": "mmlu_subject:college_chemistry"
  },
  {
    "title": "subject: computer_security",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances where the predicted output matches a correct reference exactly.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances where the predicted output matches a correct reference exactly.\n- Perturbation Robustness: Computes worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances where the predicted output matches a correct reference exactly.\n- Perturbation Fairness: Computes worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model, with performance contention removed by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3133333333333333,
          "description": "min=0.29, mean=0.313, max=0.33, sum=0.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10068610837147839,
          "description": "min=0.074, mean=0.101, max=0.121, sum=0.302 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.25, mean=0.27, max=0.29, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.29, mean=0.3, max=0.31, sum=0.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41852757812500024,
          "description": "min=0.419, mean=0.419, max=0.419, sum=1.256 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24666666666666667,
          "description": "min=0.24, mean=0.247, max=0.25, sum=0.74 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11468094167379717,
          "description": "min=0.097, mean=0.115, max=0.134, sum=0.344 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333337,
          "description": "min=0.19, mean=0.203, max=0.22, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.19, mean=0.207, max=0.22, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34789953124999995,
          "description": "min=0.348, mean=0.348, max=0.348, sum=1.044 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2933333333333334,
          "description": "min=0.28, mean=0.293, max=0.31, sum=0.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09458287944901649,
          "description": "min=0.082, mean=0.095, max=0.119, sum=0.284 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.24, mean=0.253, max=0.26, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.25, mean=0.26, max=0.28, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38066695312499993,
          "description": "min=0.381, mean=0.381, max=0.381, sum=1.142 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande-v2-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6233333333333334,
          "description": "min=0.62, mean=0.623, max=0.63, sum=1.87 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09799470312472387,
          "description": "min=0.067, mean=0.098, max=0.13, sum=0.294 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5733333333333334,
          "description": "min=0.56, mean=0.573, max=0.6, sum=1.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5966666666666667,
          "description": "min=0.59, mean=0.597, max=0.61, sum=1.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6833333333333335,
          "description": "min=0.68, mean=0.683, max=0.69, sum=2.05 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10928205922930512,
          "description": "min=0.097, mean=0.109, max=0.128, sum=0.328 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.62, mean=0.64, max=0.67, sum=1.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.68,
          "description": "min=0.67, mean=0.68, max=0.69, sum=2.04 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6466666666666666,
          "description": "min=0.64, mean=0.647, max=0.66, sum=1.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14923720813570954,
          "description": "min=0.133, mean=0.149, max=0.172, sum=0.448 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5966666666666667,
          "description": "min=0.59, mean=0.597, max=0.6, sum=1.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6333333333333333,
          "description": "min=0.63, mean=0.633, max=0.64, sum=1.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.39666666666666667,
          "description": "min=0.38, mean=0.397, max=0.42, sum=1.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12937040411709103,
          "description": "min=0.102, mean=0.129, max=0.155, sum=0.388 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30666666666666664,
          "description": "min=0.29, mean=0.307, max=0.34, sum=0.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.32, mean=0.34, max=0.37, sum=1.02 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 308.59,
          "description": "min=308.59, mean=308.59, max=308.59, sum=925.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-base%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3133333333333333,
          "description": "min=0.3, mean=0.313, max=0.32, sum=0.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11573741468580782,
          "description": "min=0.094, mean=0.116, max=0.13, sum=0.347 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22666666666666666,
          "description": "min=0.21, mean=0.227, max=0.24, sum=0.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.23, mean=0.24, max=0.25, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.60999999999996,
          "description": "min=387.61, mean=387.61, max=387.61, sum=1162.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-extended%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.41333333333333333,
          "description": "min=0.41, mean=0.413, max=0.42, sum=1.24 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1579880745347255,
          "description": "min=0.147, mean=0.158, max=0.172, sum=0.474 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33666666666666667,
          "description": "min=0.3, mean=0.337, max=0.37, sum=1.01 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.33, mean=0.34, max=0.35, sum=1.02 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.60999999999996,
          "description": "min=387.61, mean=387.61, max=387.61, sum=1162.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-supreme%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.47,
          "description": "min=0.44, mean=0.47, max=0.5, sum=1.41 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17056656607046103,
          "description": "min=0.124, mean=0.171, max=0.212, sum=0.512 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38000000000000006,
          "description": "min=0.36, mean=0.38, max=0.39, sum=1.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.39999999999999997,
          "description": "min=0.39, mean=0.4, max=0.41, sum=1.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching run, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.60999999999996,
          "description": "min=387.61, mean=387.61, max=387.61, sum=1162.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Danthropic_stanford-online-all-v4-s3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6699999999999999,
          "description": "min=0.66, mean=0.67, max=0.69, sum=2.01 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.063, mean=0.09, max=0.111, sum=0.271 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.63, mean=0.64, max=0.65, sum=1.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6566666666666667,
          "description": "min=0.64, mean=0.657, max=0.68, sum=1.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5859159667968753,
          "description": "min=0.586, mean=0.586, max=0.586, sum=1.758 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_bloom%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.38999999999999996,
          "description": "min=0.37, mean=0.39, max=0.41, sum=1.17 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11985007640801015,
          "description": "min=0.115, mean=0.12, max=0.128, sum=0.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3233333333333333,
          "description": "min=0.3, mean=0.323, max=0.34, sum=0.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36999999999999994,
          "description": "min=0.35, mean=0.37, max=0.38, sum=1.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34413451552391056,
          "description": "min=0.342, mean=0.344, max=0.347, sum=1.032 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 365.54,
          "description": "min=365.54, mean=365.54, max=365.54, sum=1096.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t0pp%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.45,
          "description": "min=0.44, mean=0.45, max=0.46, sum=1.35 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1320418789604815,
          "description": "min=0.121, mean=0.132, max=0.142, sum=0.396 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.42, mean=0.43, max=0.45, sum=1.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43333333333333335,
          "description": "min=0.43, mean=0.433, max=0.44, sum=1.3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1438595759868621,
          "description": "min=0.144, mean=0.144, max=0.144, sum=0.432 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 386.05,
          "description": "min=386.05, mean=386.05, max=386.05, sum=1158.15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20220609%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.41,
          "description": "min=0.36, mean=0.41, max=0.46, sum=1.23 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17335689657865375,
          "description": "min=0.139, mean=0.173, max=0.196, sum=0.52 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3499999999999999,
          "description": "min=0.32, mean=0.35, max=0.37, sum=1.05 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.35999999999999993,
          "description": "min=0.32, mean=0.36, max=0.41, sum=1.08 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.47919388671874974,
          "description": "min=0.479, mean=0.479, max=0.479, sum=1.438 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_large-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.37000000000000005,
          "description": "min=0.32, mean=0.37, max=0.4, sum=1.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12337197434530917,
          "description": "min=0.097, mean=0.123, max=0.147, sum=0.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30666666666666664,
          "description": "min=0.29, mean=0.307, max=0.32, sum=0.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.3, mean=0.333, max=0.36, sum=1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29847886718750005,
          "description": "min=0.298, mean=0.298, max=0.298, sum=0.895 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32666666666666666,
          "description": "min=0.29, mean=0.327, max=0.36, sum=0.98 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11181490600851485,
          "description": "min=0.106, mean=0.112, max=0.121, sum=0.335 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.17, mean=0.203, max=0.24, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.24, mean=0.267, max=0.29, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26943001953124995,
          "description": "min=0.269, mean=0.269, max=0.269, sum=0.808 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_small-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.2, mean=0.22, max=0.25, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13349777009089045,
          "description": "min=0.107, mean=0.133, max=0.158, sum=0.4 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.13, mean=0.14, max=0.16, sum=0.42 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19333333333333333,
          "description": "min=0.16, mean=0.193, max=0.23, sum=0.58 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26847544921875,
          "description": "min=0.268, mean=0.268, max=0.268, sum=0.805 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5133333333333333,
          "description": "min=0.5, mean=0.513, max=0.52, sum=1.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11761177274835892,
          "description": "min=0.104, mean=0.118, max=0.136, sum=0.353 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.42, mean=0.43, max=0.45, sum=1.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.45,
          "description": "min=0.43, mean=0.45, max=0.46, sum=1.35 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.26, mean=0.27, max=0.29, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11362330752179094,
          "description": "min=0.107, mean=0.114, max=0.121, sum=0.341 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.24, mean=0.243, max=0.25, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.25, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-medium-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.49666666666666665,
          "description": "min=0.46, mean=0.497, max=0.52, sum=1.49 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13228107680823697,
          "description": "min=0.126, mean=0.132, max=0.142, sum=0.397 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4266666666666667,
          "description": "min=0.42, mean=0.427, max=0.43, sum=1.28 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.44333333333333336,
          "description": "min=0.43, mean=0.443, max=0.46, sum=1.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-xlarge-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6266666666666666,
          "description": "min=0.61, mean=0.627, max=0.64, sum=1.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15980931893047942,
          "description": "min=0.13, mean=0.16, max=0.177, sum=0.479 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5433333333333333,
          "description": "min=0.51, mean=0.543, max=0.56, sum=1.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5833333333333334,
          "description": "min=0.57, mean=0.583, max=0.59, sum=1.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 397.73,
          "description": "min=397.73, mean=397.73, max=397.73, sum=1193.19 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-j-6b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29666666666666663,
          "description": "min=0.29, mean=0.297, max=0.3, sum=0.89 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11265441143729527,
          "description": "min=0.097, mean=0.113, max=0.125, sum=0.338 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.25, mean=0.267, max=0.28, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.26, mean=0.267, max=0.27, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07118436763683955,
          "description": "min=0.07, mean=0.071, max=0.072, sum=0.214 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-neox-20b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.23, mean=0.253, max=0.28, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1239017534321159,
          "description": "min=0.11, mean=0.124, max=0.143, sum=0.372 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.17, mean=0.183, max=0.2, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.19, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2745299462477366,
          "description": "min=0.274, mean=0.275, max=0.275, sum=0.824 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=1164.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-6.9b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.26, mean=0.26, max=0.26, sum=0.26 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11319467098641812,
          "description": "min=0.113, mean=0.113, max=0.113, sum=0.113 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.25, mean=0.25, max=0.25, sum=0.25 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-12b-v0%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09205080958852419,
          "description": "min=0.092, mean=0.092, max=0.092, sum=0.092 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17,
          "description": "min=0.17, mean=0.17, max=0.17, sum=0.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19,
          "description": "min=0.19, mean=0.19, max=0.19, sum=0.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t5-11b%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.23, mean=0.27, max=0.3, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12570201728145283,
          "description": "min=0.117, mean=0.126, max=0.142, sum=0.377 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.19, mean=0.233, max=0.26, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19666666666666668,
          "description": "min=0.19, mean=0.197, max=0.21, sum=0.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2312925594051678,
          "description": "min=0.23, mean=0.231, max=0.232, sum=0.694 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 4.943333333333334,
          "description": "min=4.94, mean=4.943, max=4.95, sum=14.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 383.17333333333335,
          "description": "min=382.49, mean=383.173, max=383.66, sum=1149.52 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_ul2%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%2Cglobal_prefix%3Dnlg%22%5D",
          "markdown": false
        },
        {
          "value": 0.35000000000000003,
          "description": "min=0.34, mean=0.35, max=0.37, sum=1.05 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10889946225151632,
          "description": "min=0.09, mean=0.109, max=0.12, sum=0.327 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3233333333333333,
          "description": "min=0.31, mean=0.323, max=0.34, sum=0.97 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33666666666666667,
          "description": "min=0.33, mean=0.337, max=0.35, sum=1.01 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17963503815730417,
          "description": "min=0.179, mean=0.18, max=0.18, sum=0.539 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 4.94,
          "description": "min=4.93, mean=4.94, max=4.95, sum=14.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.0,
          "description": "min=386.49, mean=387, max=387.66, sum=1161 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-175b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3933333333333333,
          "description": "min=0.35, mean=0.393, max=0.42, sum=1.18 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1433199763875271,
          "description": "min=0.132, mean=0.143, max=0.161, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3133333333333333,
          "description": "min=0.29, mean=0.313, max=0.33, sum=0.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.35333333333333333,
          "description": "min=0.32, mean=0.353, max=0.39, sum=1.06 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11762017247222724,
          "description": "min=0.117, mean=0.118, max=0.119, sum=0.353 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-66b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.26, mean=0.27, max=0.29, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11247692521605653,
          "description": "min=0.101, mean=0.112, max=0.132, sum=0.337 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19999999999999998,
          "description": "min=0.18, mean=0.2, max=0.23, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.19, mean=0.22, max=0.24, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.044782473420103386,
          "description": "min=0.043, mean=0.045, max=0.047, sum=0.134 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.45,
          "description": "min=0.45, mean=0.45, max=0.45, sum=0.45 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.13, mean=0.13, max=0.13, sum=0.13 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36,
          "description": "min=0.36, mean=0.36, max=0.36, sum=0.36 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.147, mean=0.147, max=0.147, sum=0.147 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.58,
          "description": "min=0.58, mean=0.58, max=0.58, sum=0.58 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.57,
          "description": "min=0.57, mean=0.57, max=0.57, sum=0.57 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7,
          "description": "min=0.7, mean=0.7, max=0.7, sum=0.7 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.078, mean=0.078, max=0.078, sum=0.078 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.66,
          "description": "min=0.66, mean=0.66, max=0.66, sum=0.66 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-65b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.79,
          "description": "min=0.79, mean=0.79, max=0.79, sum=0.79 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.73,
          "description": "min=0.73, mean=0.73, max=0.73, sum=0.73 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.77,
          "description": "min=0.77, mean=0.77, max=0.77, sum=0.77 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6,
          "description": "min=0.6, mean=0.6, max=0.6, sum=0.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.57,
          "description": "min=0.57, mean=0.57, max=0.57, sum=0.57 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.56,
          "description": "min=0.56, mean=0.56, max=0.56, sum=0.56 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.68,
          "description": "min=0.68, mean=0.68, max=0.68, sum=0.68 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.65,
          "description": "min=0.65, mean=0.65, max=0.65, sum=0.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-70b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.76,
          "description": "min=0.76, mean=0.76, max=0.76, sum=0.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.75,
          "description": "min=0.75, mean=0.75, max=0.75, sum=0.75 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.75,
          "description": "min=0.75, mean=0.75, max=0.75, sum=0.75 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dstanford_alpaca-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.51,
          "description": "min=0.51, mean=0.51, max=0.51, sum=0.51 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17647420561225885,
          "description": "min=0.176, mean=0.176, max=0.176, sum=0.176 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.44,
          "description": "min=0.44, mean=0.44, max=0.44, sum=0.44 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48,
          "description": "min=0.48, mean=0.48, max=0.48, sum=0.48 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-7b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.64, mean=0.64, max=0.64, sum=0.64 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13485917615293308,
          "description": "min=0.135, mean=0.135, max=0.135, sum=0.135 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.57,
          "description": "min=0.57, mean=0.57, max=0.57, sum=0.57 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.58,
          "description": "min=0.58, mean=0.58, max=0.58, sum=0.58 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-13b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.65,
          "description": "min=0.65, mean=0.65, max=0.65, sum=0.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15566428719073158,
          "description": "min=0.156, mean=0.156, max=0.156, sum=0.156 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.62,
          "description": "min=0.62, mean=0.62, max=0.62, sum=0.62 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 428.17,
          "description": "min=428.17, mean=428.17, max=428.17, sum=428.17 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmistralai_mistral-7b-v0.1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.77,
          "description": "min=0.77, mean=0.77, max=0.77, sum=0.77 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.72,
          "description": "min=0.72, mean=0.72, max=0.72, sum=0.72 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.72,
          "description": "min=0.72, mean=0.72, max=0.72, sum=0.72 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_530B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6833333333333332,
          "description": "min=0.68, mean=0.683, max=0.69, sum=2.05 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13057923708901864,
          "description": "min=0.127, mean=0.131, max=0.134, sum=0.392 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6266666666666666,
          "description": "min=0.61, mean=0.627, max=0.65, sum=1.88 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.66,
          "description": "min=0.65, mean=0.66, max=0.67, sum=1.98 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_7B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.23, mean=0.243, max=0.25, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14200580883533165,
          "description": "min=0.116, mean=0.142, max=0.175, sum=0.426 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.18, mean=0.19, max=0.2, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22333333333333336,
          "description": "min=0.22, mean=0.223, max=0.23, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_davinci%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5599999999999999,
          "description": "min=0.54, mean=0.56, max=0.59, sum=1.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12558924735759688,
          "description": "min=0.121, mean=0.126, max=0.134, sum=0.377 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48,
          "description": "min=0.44, mean=0.48, max=0.52, sum=1.44 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5033333333333333,
          "description": "min=0.48, mean=0.503, max=0.52, sum=1.51 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21643040039062503,
          "description": "min=0.216, mean=0.216, max=0.216, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_curie%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2866666666666667,
          "description": "min=0.28, mean=0.287, max=0.29, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09515461681451905,
          "description": "min=0.081, mean=0.095, max=0.109, sum=0.285 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.18, mean=0.19, max=0.2, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.26, mean=0.263, max=0.27, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09104115234375003,
          "description": "min=0.091, mean=0.091, max=0.091, sum=0.273 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_babbage%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22666666666666668,
          "description": "min=0.2, mean=0.227, max=0.25, sum=0.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1271115976957019,
          "description": "min=0.105, mean=0.127, max=0.14, sum=0.381 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.16, mean=0.16, max=0.16, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19666666666666666,
          "description": "min=0.16, mean=0.197, max=0.24, sum=0.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11921455078124998,
          "description": "min=0.119, mean=0.119, max=0.119, sum=0.358 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_ada%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27666666666666667,
          "description": "min=0.26, mean=0.277, max=0.3, sum=0.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12566442658583743,
          "description": "min=0.113, mean=0.126, max=0.138, sum=0.377 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22999999999999998,
          "description": "min=0.21, mean=0.23, max=0.26, sum=0.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.25, mean=0.267, max=0.3, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14047474609375,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.421 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-003%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7633333333333333,
          "description": "min=0.74, mean=0.763, max=0.78, sum=2.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18982027860412187,
          "description": "min=0.166, mean=0.19, max=0.224, sum=0.569 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7266666666666666,
          "description": "min=0.72, mean=0.727, max=0.73, sum=2.18 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7533333333333333,
          "description": "min=0.74, mean=0.753, max=0.77, sum=2.26 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-002%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7633333333333333,
          "description": "min=0.76, mean=0.763, max=0.77, sum=2.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14209236867058628,
          "description": "min=0.134, mean=0.142, max=0.15, sum=0.426 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7200000000000001,
          "description": "min=0.71, mean=0.72, max=0.73, sum=2.16 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.73, mean=0.74, max=0.75, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20729065429687496,
          "description": "min=0.207, mean=0.207, max=0.207, sum=0.622 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-curie-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.24, mean=0.243, max=0.25, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4796885057233169,
          "description": "min=0.453, mean=0.48, max=0.513, sum=1.439 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23666666666666666,
          "description": "min=0.23, mean=0.237, max=0.24, sum=0.71 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12965806640625002,
          "description": "min=0.13, mean=0.13, max=0.13, sum=0.389 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-babbage-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22333333333333336,
          "description": "min=0.22, mean=0.223, max=0.23, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32967879774276465,
          "description": "min=0.307, mean=0.33, max=0.361, sum=0.989 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.18, mean=0.183, max=0.19, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2066666666666667,
          "description": "min=0.2, mean=0.207, max=0.22, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13105798828125007,
          "description": "min=0.131, mean=0.131, max=0.131, sum=0.393 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-ada-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.29, mean=0.3, max=0.31, sum=0.9 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40898990230589655,
          "description": "min=0.357, mean=0.409, max=0.444, sum=1.227 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25333333333333335,
          "description": "min=0.22, mean=0.253, max=0.28, sum=0.76 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26333333333333336,
          "description": "min=0.24, mean=0.263, max=0.28, sum=0.79 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08652787109375,
          "description": "min=0.087, mean=0.087, max=0.087, sum=0.26 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0301%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.77,
          "description": "min=0.77, mean=0.77, max=0.77, sum=0.77 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.72,
          "description": "min=0.72, mean=0.72, max=0.72, sum=0.72 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.73,
          "description": "min=0.73, mean=0.73, max=0.73, sum=0.73 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.54,
          "description": "min=371.54, mean=371.54, max=371.54, sum=371.54 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.06,
          "description": "min=1.06, mean=1.06, max=1.06, sum=1.06 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0613%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.35,
          "description": "min=0.35, mean=0.35, max=0.35, sum=0.35 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 371.54,
          "description": "min=371.54, mean=371.54, max=371.54, sum=371.54 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.61,
          "description": "min=1.61, mean=1.61, max=1.61, sum=1.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11657173647960638,
          "description": "min=0.117, mean=0.117, max=0.117, sum=0.117 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19,
          "description": "min=0.19, mean=0.19, max=0.19, sum=0.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12829402234456555,
          "description": "min=0.128, mean=0.128, max=0.128, sum=0.128 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18,
          "description": "min=0.18, mean=0.18, max=0.18, sum=0.18 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.38,
          "description": "min=0.38, mean=0.38, max=0.38, sum=0.38 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07996552614477029,
          "description": "min=0.08, mean=0.08, max=0.08, sum=0.08 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.33,
          "description": "min=0.33, mean=0.33, max=0.33, sum=0.33 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5,
          "description": "min=0.5, mean=0.5, max=0.5, sum=0.5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1286177423175291,
          "description": "min=0.129, mean=0.129, max=0.129, sum=0.129 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.55,
          "description": "min=0.55, mean=0.55, max=0.55, sum=0.55 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.62,
          "description": "min=0.62, mean=0.62, max=0.62, sum=0.62 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-instruct-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.62,
          "description": "min=0.62, mean=0.62, max=0.62, sum=0.62 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.59,
          "description": "min=0.59, mean=0.59, max=0.59, sum=0.59 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.61,
          "description": "min=0.61, mean=0.61, max=0.61, sum=0.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 388.19,
          "description": "min=388.19, mean=388.19, max=388.19, sum=388.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.24 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3,
          "description": "min=0.3, mean=0.3, max=0.3, sum=0.3 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 389.6,
          "description": "min=389.6, mean=389.6, max=389.6, sum=389.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.31,
          "description": "min=0.31, mean=0.31, max=0.31, sum=0.31 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.29, mean=0.29, max=0.29, sum=0.29 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 389.6,
          "description": "min=389.6, mean=389.6, max=389.6, sum=389.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.67,
          "description": "min=0.67, mean=0.67, max=0.67, sum=0.67 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.61,
          "description": "min=0.61, mean=0.61, max=0.61, sum=0.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.63,
          "description": "min=0.63, mean=0.63, max=0.63, sum=0.63 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 389.6,
          "description": "min=389.6, mean=389.6, max=389.6, sum=389.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.61,
          "description": "min=0.61, mean=0.61, max=0.61, sum=0.61 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.58,
          "description": "min=0.58, mean=0.58, max=0.58, sum=0.58 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.59,
          "description": "min=0.59, mean=0.59, max=0.59, sum=0.59 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 389.6,
          "description": "min=389.6, mean=389.6, max=389.6, sum=389.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_glm%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.4466666666666666,
          "description": "min=0.43, mean=0.447, max=0.47, sum=1.34 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17222430331798386,
          "description": "min=0.156, mean=0.172, max=0.196, sum=0.517 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4166666666666667,
          "description": "min=0.39, mean=0.417, max=0.44, sum=1.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4166666666666667,
          "description": "min=0.39, mean=0.417, max=0.43, sum=1.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5462319111824034,
          "description": "min=0.546, mean=0.546, max=0.546, sum=1.639 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 364.87000000000006,
          "description": "min=364.87, mean=364.87, max=364.87, sum=1094.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-instruct-30%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.58,
          "description": "min=0.57, mean=0.58, max=0.59, sum=1.74 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.5499999999999999,
          "description": "min=0.5, mean=0.55, max=0.58, sum=1.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5499999999999999,
          "description": "min=0.53, mean=0.55, max=0.58, sum=1.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-x%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.75,
          "description": "min=0.74, mean=0.75, max=0.76, sum=2.25 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.7166666666666667,
          "description": "min=0.71, mean=0.717, max=0.73, sum=2.15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.73, mean=0.74, max=0.75, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 387.3999999999999,
          "description": "min=387.4, mean=387.4, max=387.4, sum=1162.2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20computer_security&runSpecs=%5B%22mmlu%3Asubject%3Dcomputer_security%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_yalm%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6789330429620812,
          "description": "min=0.671, mean=0.679, max=0.689, sum=2.037 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21511975089708946,
          "description": "min=0.212, mean=0.215, max=0.217, sum=0.645 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 380.23,
          "description": "min=380.23, mean=380.23, max=380.23, sum=1140.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu_subject:computer_security.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu_subject:computer_security.json"
      }
    ],
    "name": "mmlu_subject:computer_security"
  },
  {
    "title": "subject: econometrics",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.\n- Perturbation Robustness: Computes worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances that the predicted output matches a correct reference exactly.\n- Perturbation Fairness: Computes worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model minus performance contention by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.228, mean=0.237, max=0.246, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1321028736393509,
          "description": "min=0.096, mean=0.132, max=0.154, sum=0.396 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20760233918128654,
          "description": "min=0.184, mean=0.208, max=0.237, sum=0.623 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20175438596491227,
          "description": "min=0.193, mean=0.202, max=0.211, sum=0.605 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5113379677220393,
          "description": "min=0.511, mean=0.511, max=0.511, sum=1.534 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26900584795321636,
          "description": "min=0.246, mean=0.269, max=0.298, sum=0.807 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08829487921655803,
          "description": "min=0.051, mean=0.088, max=0.11, sum=0.265 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.202, mean=0.234, max=0.272, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21637426900584797,
          "description": "min=0.211, mean=0.216, max=0.219, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.42221803042763134,
          "description": "min=0.422, mean=0.422, max=0.422, sum=1.267 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2456140350877193,
          "description": "min=0.202, mean=0.246, max=0.272, sum=0.737 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11752315004048706,
          "description": "min=0.098, mean=0.118, max=0.153, sum=0.353 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22222222222222224,
          "description": "min=0.184, mean=0.222, max=0.254, sum=0.667 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1812865497076023,
          "description": "min=0.158, mean=0.181, max=0.211, sum=0.544 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.46603300609923276,
          "description": "min=0.466, mean=0.466, max=0.466, sum=1.398 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande-v2-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2923976608187135,
          "description": "min=0.281, mean=0.292, max=0.298, sum=0.877 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17616028007051562,
          "description": "min=0.137, mean=0.176, max=0.205, sum=0.528 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24561403508771928,
          "description": "min=0.237, mean=0.246, max=0.254, sum=0.737 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2573099415204678,
          "description": "min=0.237, mean=0.257, max=0.281, sum=0.772 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3157894736842105,
          "description": "min=0.281, mean=0.316, max=0.333, sum=0.947 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21750303578118038,
          "description": "min=0.199, mean=0.218, max=0.248, sum=0.653 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.260233918128655,
          "description": "min=0.246, mean=0.26, max=0.281, sum=0.781 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28654970760233917,
          "description": "min=0.272, mean=0.287, max=0.298, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.30994152046783624,
          "description": "min=0.298, mean=0.31, max=0.325, sum=0.93 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15071618544190105,
          "description": "min=0.118, mean=0.151, max=0.172, sum=0.452 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24269005847953218,
          "description": "min=0.228, mean=0.243, max=0.254, sum=0.728 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26608187134502925,
          "description": "min=0.254, mean=0.266, max=0.289, sum=0.798 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.211, mean=0.219, max=0.237, sum=0.658 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.171032785519748,
          "description": "min=0.159, mean=0.171, max=0.186, sum=0.513 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.195906432748538,
          "description": "min=0.175, mean=0.196, max=0.211, sum=0.588 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18421052631578946,
          "description": "min=0.167, mean=0.184, max=0.202, sum=0.553 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 552.719298245614,
          "description": "min=552.719, mean=552.719, max=552.719, sum=1658.158 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-base%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22514619883040932,
          "description": "min=0.193, mean=0.225, max=0.263, sum=0.675 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10738010580413908,
          "description": "min=0.088, mean=0.107, max=0.14, sum=0.322 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14327485380116958,
          "description": "min=0.114, mean=0.143, max=0.167, sum=0.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14619883040935672,
          "description": "min=0.132, mean=0.146, max=0.158, sum=0.439 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 618.4473684210526,
          "description": "min=618.447, mean=618.447, max=618.447, sum=1855.342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-extended%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.260233918128655,
          "description": "min=0.246, mean=0.26, max=0.289, sum=0.781 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10247263834982502,
          "description": "min=0.094, mean=0.102, max=0.107, sum=0.307 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16374269005847952,
          "description": "min=0.158, mean=0.164, max=0.167, sum=0.491 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16959064327485382,
          "description": "min=0.14, mean=0.17, max=0.184, sum=0.509 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 618.4473684210526,
          "description": "min=618.447, mean=618.447, max=618.447, sum=1855.342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-supreme%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.272, mean=0.281, max=0.289, sum=0.842 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15175951114600025,
          "description": "min=0.135, mean=0.152, max=0.172, sum=0.455 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14035087719298245,
          "description": "min=0.123, mean=0.14, max=0.167, sum=0.421 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15497076023391812,
          "description": "min=0.149, mean=0.155, max=0.167, sum=0.465 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 618.4473684210526,
          "description": "min=618.447, mean=618.447, max=618.447, sum=1855.342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Danthropic_stanford-online-all-v4-s3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28654970760233917,
          "description": "min=0.272, mean=0.287, max=0.298, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.227, mean=0.248, max=0.262, sum=0.744 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25438596491228066,
          "description": "min=0.237, mean=0.254, max=0.263, sum=0.763 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.211, mean=0.234, max=0.272, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5584037143640352,
          "description": "min=0.558, mean=0.558, max=0.558, sum=1.675 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_bloom%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.219, mean=0.237, max=0.254, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1292266676800032,
          "description": "min=0.122, mean=0.129, max=0.14, sum=0.388 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1812865497076023,
          "description": "min=0.167, mean=0.181, max=0.202, sum=0.544 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20467836257309943,
          "description": "min=0.175, mean=0.205, max=0.219, sum=0.614 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15106598981264427,
          "description": "min=0.147, mean=0.151, max=0.154, sum=0.453 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 574.6578947368421,
          "description": "min=574.658, mean=574.658, max=574.658, sum=1723.974 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t0pp%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.33918128654970764,
          "description": "min=0.316, mean=0.339, max=0.351, sum=1.018 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1353608072197195,
          "description": "min=0.134, mean=0.135, max=0.136, sum=0.406 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3216374269005848,
          "description": "min=0.307, mean=0.322, max=0.333, sum=0.965 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.30701754385964913,
          "description": "min=0.289, mean=0.307, max=0.316, sum=0.921 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1482845332887438,
          "description": "min=0.148, mean=0.148, max=0.149, sum=0.445 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 639.561403508772,
          "description": "min=639.561, mean=639.561, max=639.561, sum=1918.684 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20220609%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24853801169590647,
          "description": "min=0.228, mean=0.249, max=0.289, sum=0.746 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13258416786314742,
          "description": "min=0.113, mean=0.133, max=0.153, sum=0.398 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1929824561403509,
          "description": "min=0.158, mean=0.193, max=0.228, sum=0.579 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1929824561403509,
          "description": "min=0.158, mean=0.193, max=0.246, sum=0.579 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5061541940789475,
          "description": "min=0.506, mean=0.506, max=0.506, sum=1.518 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_large-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3011695906432749,
          "description": "min=0.289, mean=0.301, max=0.307, sum=0.904 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08886613017750872,
          "description": "min=0.075, mean=0.089, max=0.11, sum=0.267 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21637426900584797,
          "description": "min=0.211, mean=0.216, max=0.228, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25146198830409355,
          "description": "min=0.237, mean=0.251, max=0.272, sum=0.754 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3491272615131578,
          "description": "min=0.349, mean=0.349, max=0.349, sum=1.047 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.246, mean=0.281, max=0.325, sum=0.842 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07609691490025716,
          "description": "min=0.067, mean=0.076, max=0.092, sum=0.228 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.20175438596491227,
          "description": "min=0.184, mean=0.202, max=0.211, sum=0.605 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24269005847953215,
          "description": "min=0.219, mean=0.243, max=0.263, sum=0.728 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3007604851973685,
          "description": "min=0.301, mean=0.301, max=0.301, sum=0.902 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_small-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.237, mean=0.254, max=0.272, sum=0.763 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13686678634036722,
          "description": "min=0.104, mean=0.137, max=0.157, sum=0.411 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2222222222222222,
          "description": "min=0.219, mean=0.222, max=0.228, sum=0.667 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18128654970760236,
          "description": "min=0.167, mean=0.181, max=0.193, sum=0.544 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3124181743421051,
          "description": "min=0.312, mean=0.312, max=0.312, sum=0.937 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.260233918128655,
          "description": "min=0.237, mean=0.26, max=0.281, sum=0.781 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17114261641749454,
          "description": "min=0.149, mean=0.171, max=0.197, sum=0.513 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16666666666666666,
          "description": "min=0.149, mean=0.167, max=0.193, sum=0.5 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.202, mean=0.219, max=0.237, sum=0.658 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.254, mean=0.272, max=0.289, sum=0.816 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07965340954285109,
          "description": "min=0.055, mean=0.08, max=0.1, sum=0.239 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19005847953216373,
          "description": "min=0.158, mean=0.19, max=0.219, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21637426900584797,
          "description": "min=0.193, mean=0.216, max=0.246, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-medium-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29824561403508776,
          "description": "min=0.272, mean=0.298, max=0.316, sum=0.895 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2040532275445278,
          "description": "min=0.17, mean=0.204, max=0.243, sum=0.612 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21637426900584797,
          "description": "min=0.211, mean=0.216, max=0.228, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27485380116959063,
          "description": "min=0.263, mean=0.275, max=0.289, sum=0.825 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-xlarge-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.263, mean=0.272, max=0.281, sum=0.816 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3189864914304454,
          "description": "min=0.299, mean=0.319, max=0.338, sum=0.957 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2222222222222222,
          "description": "min=0.219, mean=0.222, max=0.228, sum=0.667 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2222222222222222,
          "description": "min=0.219, mean=0.222, max=0.228, sum=0.667 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 628.421052631579,
          "description": "min=628.421, mean=628.421, max=628.421, sum=1885.263 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-j-6b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26608187134502925,
          "description": "min=0.246, mean=0.266, max=0.281, sum=0.798 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12789479203741905,
          "description": "min=0.118, mean=0.128, max=0.141, sum=0.384 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.211, mean=0.234, max=0.254, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22807017543859645,
          "description": "min=0.202, mean=0.228, max=0.246, sum=0.684 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07162863255767098,
          "description": "min=0.072, mean=0.072, max=0.072, sum=0.215 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-neox-20b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.31871345029239767,
          "description": "min=0.289, mean=0.319, max=0.351, sum=0.956 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10135874379712294,
          "description": "min=0.094, mean=0.101, max=0.107, sum=0.304 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18421052631578946,
          "description": "min=0.149, mean=0.184, max=0.211, sum=0.553 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19590643274853803,
          "description": "min=0.175, mean=0.196, max=0.219, sum=0.588 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10852476367708767,
          "description": "min=0.108, mean=0.109, max=0.109, sum=0.326 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=1838.395 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-6.9b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.281, mean=0.281, max=0.281, sum=0.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11006952038762687,
          "description": "min=0.11, mean=0.11, max=0.11, sum=0.11 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.254, mean=0.254, max=0.254, sum=0.254 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-12b-v0%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2982456140350877,
          "description": "min=0.298, mean=0.298, max=0.298, sum=0.298 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09322817809379413,
          "description": "min=0.093, mean=0.093, max=0.093, sum=0.093 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20175438596491227,
          "description": "min=0.202, mean=0.202, max=0.202, sum=0.202 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t5-11b%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.22807017543859645,
          "description": "min=0.211, mean=0.228, max=0.246, sum=0.684 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22313199966794617,
          "description": "min=0.206, mean=0.223, max=0.242, sum=0.669 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19883040935672514,
          "description": "min=0.193, mean=0.199, max=0.202, sum=0.596 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1783625730994152,
          "description": "min=0.167, mean=0.178, max=0.184, sum=0.535 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.191676010107204,
          "description": "min=0.173, mean=0.192, max=0.204, sum=0.575 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.128654970760234,
          "description": "min=2.482, mean=3.129, max=3.877, sum=9.386 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 416.7953216374269,
          "description": "min=384.561, mean=416.795, max=440.596, sum=1250.386 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_ul2%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%2Cglobal_prefix%3Dnlg%22%5D",
          "markdown": false
        },
        {
          "value": 0.24269005847953218,
          "description": "min=0.228, mean=0.243, max=0.254, sum=0.728 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14468884626832365,
          "description": "min=0.11, mean=0.145, max=0.176, sum=0.434 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21637426900584797,
          "description": "min=0.211, mean=0.216, max=0.228, sum=0.649 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.228, mean=0.234, max=0.246, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18330761359797562,
          "description": "min=0.183, mean=0.183, max=0.184, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.1111111111111107,
          "description": "min=2.465, mean=3.111, max=3.86, sum=9.333 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 418.14619883040933,
          "description": "min=385.228, mean=418.146, max=443.316, sum=1254.439 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-175b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22514619883040934,
          "description": "min=0.211, mean=0.225, max=0.254, sum=0.675 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15315730817654102,
          "description": "min=0.14, mean=0.153, max=0.166, sum=0.459 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.195906432748538,
          "description": "min=0.184, mean=0.196, max=0.202, sum=0.588 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.195906432748538,
          "description": "min=0.167, mean=0.196, max=0.211, sum=0.588 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13722378366988072,
          "description": "min=0.136, mean=0.137, max=0.138, sum=0.412 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-66b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.219, mean=0.237, max=0.263, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12744989153280617,
          "description": "min=0.11, mean=0.127, max=0.153, sum=0.382 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.202, mean=0.211, max=0.228, sum=0.632 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2134502923976608,
          "description": "min=0.202, mean=0.213, max=0.228, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.04806882907685481,
          "description": "min=0.048, mean=0.048, max=0.048, sum=0.144 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.091, mean=0.091, max=0.091, sum=0.091 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.281, mean=0.281, max=0.281, sum=0.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.127, mean=0.127, max=0.127, sum=0.127 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.237 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.37719298245614036,
          "description": "min=0.377, mean=0.377, max=0.377, sum=0.377 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.139, mean=0.139, max=0.139, sum=0.139 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32456140350877194,
          "description": "min=0.325, mean=0.325, max=0.325, sum=0.325 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3508771929824561,
          "description": "min=0.351, mean=0.351, max=0.351, sum=0.351 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-65b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.43859649122807015,
          "description": "min=0.439, mean=0.439, max=0.439, sum=0.439 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.3684210526315789,
          "description": "min=0.368, mean=0.368, max=0.368, sum=0.368 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.37719298245614036,
          "description": "min=0.377, mean=0.377, max=0.377, sum=0.377 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.333, mean=0.333, max=0.333, sum=0.333 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.24561403508771928,
          "description": "min=0.246, mean=0.246, max=0.246, sum=0.246 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.281, mean=0.281, max=0.281, sum=0.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.333, mean=0.333, max=0.333, sum=0.333 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.272, mean=0.272, max=0.272, sum=0.272 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.281, mean=0.281, max=0.281, sum=0.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-70b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.43859649122807015,
          "description": "min=0.439, mean=0.439, max=0.439, sum=0.439 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.38596491228070173,
          "description": "min=0.386, mean=0.386, max=0.386, sum=0.386 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38596491228070173,
          "description": "min=0.386, mean=0.386, max=0.386, sum=0.386 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dstanford_alpaca-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3202240754475298,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.219, mean=0.219, max=0.219, sum=0.219 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-7b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31482277269197106,
          "description": "min=0.315, mean=0.315, max=0.315, sum=0.315 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17543859649122806,
          "description": "min=0.175, mean=0.175, max=0.175, sum=0.175 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18421052631578946,
          "description": "min=0.184, mean=0.184, max=0.184, sum=0.184 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-13b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2982456140350877,
          "description": "min=0.298, mean=0.298, max=0.298, sum=0.298 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2464401289263631,
          "description": "min=0.246, mean=0.246, max=0.246, sum=0.246 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.237 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 684.6754385964912,
          "description": "min=684.675, mean=684.675, max=684.675, sum=684.675 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmistralai_mistral-7b-v0.1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.49122807017543857,
          "description": "min=0.491, mean=0.491, max=0.491, sum=0.491 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.45614035087719296,
          "description": "min=0.456, mean=0.456, max=0.456, sum=0.456 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43859649122807015,
          "description": "min=0.439, mean=0.439, max=0.439, sum=0.439 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_530B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2982456140350877,
          "description": "min=0.281, mean=0.298, max=0.325, sum=0.895 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15142812761289012,
          "description": "min=0.109, mean=0.151, max=0.176, sum=0.454 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24853801169590642,
          "description": "min=0.237, mean=0.249, max=0.272, sum=0.746 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_7B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24561403508771928,
          "description": "min=0.219, mean=0.246, max=0.281, sum=0.737 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12059358975031685,
          "description": "min=0.104, mean=0.121, max=0.133, sum=0.362 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1608187134502924,
          "description": "min=0.114, mean=0.161, max=0.211, sum=0.482 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19883040935672514,
          "description": "min=0.175, mean=0.199, max=0.237, sum=0.596 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_davinci%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3187134502923976,
          "description": "min=0.281, mean=0.319, max=0.342, sum=0.956 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1425816031861401,
          "description": "min=0.118, mean=0.143, max=0.156, sum=0.428 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.219, mean=0.234, max=0.263, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27485380116959063,
          "description": "min=0.263, mean=0.275, max=0.289, sum=0.825 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20311163651315797,
          "description": "min=0.203, mean=0.203, max=0.203, sum=0.609 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_curie%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2807017543859649,
          "description": "min=0.281, mean=0.281, max=0.281, sum=0.842 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08982735352107618,
          "description": "min=0.069, mean=0.09, max=0.123, sum=0.269 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25146198830409355,
          "description": "min=0.237, mean=0.251, max=0.263, sum=0.754 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.263, mean=0.272, max=0.281, sum=0.816 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09484553179824562,
          "description": "min=0.095, mean=0.095, max=0.095, sum=0.285 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_babbage%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.19590643274853803,
          "description": "min=0.175, mean=0.196, max=0.219, sum=0.588 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12814947016152292,
          "description": "min=0.106, mean=0.128, max=0.145, sum=0.384 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14619883040935672,
          "description": "min=0.114, mean=0.146, max=0.202, sum=0.439 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1783625730994152,
          "description": "min=0.158, mean=0.178, max=0.202, sum=0.535 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11831332236842103,
          "description": "min=0.118, mean=0.118, max=0.118, sum=0.355 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_ada%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.1871345029239766,
          "description": "min=0.132, mean=0.187, max=0.237, sum=0.561 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14058919474724593,
          "description": "min=0.117, mean=0.141, max=0.172, sum=0.422 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15789473684210523,
          "description": "min=0.105, mean=0.158, max=0.202, sum=0.474 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13157894736842105,
          "description": "min=0.053, mean=0.132, max=0.211, sum=0.395 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13965666118421047,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.419 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-003%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.49415204678362573,
          "description": "min=0.482, mean=0.494, max=0.509, sum=1.482 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3667604250164859,
          "description": "min=0.358, mean=0.367, max=0.376, sum=1.1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4239766081871345,
          "description": "min=0.412, mean=0.424, max=0.43, sum=1.272 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4181286549707603,
          "description": "min=0.386, mean=0.418, max=0.456, sum=1.254 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-002%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.46491228070175444,
          "description": "min=0.439, mean=0.465, max=0.491, sum=1.395 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2398189045835751,
          "description": "min=0.226, mean=0.24, max=0.248, sum=0.719 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.40935672514619875,
          "description": "min=0.404, mean=0.409, max=0.421, sum=1.228 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3947368421052631,
          "description": "min=0.377, mean=0.395, max=0.412, sum=1.184 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1746738623903509,
          "description": "min=0.175, mean=0.175, max=0.175, sum=0.524 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-curie-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26608187134502925,
          "description": "min=0.219, mean=0.266, max=0.298, sum=0.798 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4317000731608973,
          "description": "min=0.369, mean=0.432, max=0.484, sum=1.295 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2309941520467836,
          "description": "min=0.184, mean=0.231, max=0.272, sum=0.693 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2573099415204678,
          "description": "min=0.219, mean=0.257, max=0.281, sum=0.772 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1397210800438597,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.419 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-babbage-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.30701754385964913,
          "description": "min=0.289, mean=0.307, max=0.325, sum=0.921 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18170122591390483,
          "description": "min=0.16, mean=0.182, max=0.193, sum=0.545 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2134502923976608,
          "description": "min=0.193, mean=0.213, max=0.228, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2456140350877193,
          "description": "min=0.228, mean=0.246, max=0.272, sum=0.737 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13540775767543858,
          "description": "min=0.135, mean=0.135, max=0.135, sum=0.406 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-ada-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2485380116959064,
          "description": "min=0.202, mean=0.249, max=0.281, sum=0.746 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5095238986815372,
          "description": "min=0.504, mean=0.51, max=0.517, sum=1.529 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20175438596491227,
          "description": "min=0.14, mean=0.202, max=0.246, sum=0.605 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19883040935672514,
          "description": "min=0.149, mean=0.199, max=0.228, sum=0.596 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08940775767543861,
          "description": "min=0.089, mean=0.089, max=0.089, sum=0.268 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0301%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5087719298245614,
          "description": "min=0.509, mean=0.509, max=0.509, sum=0.509 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.4473684210526316,
          "description": "min=0.447, mean=0.447, max=0.447, sum=0.447 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4298245614035088,
          "description": "min=0.43, mean=0.43, max=0.43, sum=0.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 607.4298245614035,
          "description": "min=607.43, mean=607.43, max=607.43, sum=607.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0613%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32456140350877194,
          "description": "min=0.325, mean=0.325, max=0.325, sum=0.325 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.272, mean=0.272, max=0.272, sum=0.272 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24561403508771928,
          "description": "min=0.246, mean=0.246, max=0.246, sum=0.246 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 607.4298245614035,
          "description": "min=607.43, mean=607.43, max=607.43, sum=607.43 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.3070175438596492,
          "description": "min=1.307, mean=1.307, max=1.307, sum=1.307 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.254, mean=0.254, max=0.254, sum=0.254 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11462738209239573,
          "description": "min=0.115, mean=0.115, max=0.115, sum=0.115 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18421052631578946,
          "description": "min=0.184, mean=0.184, max=0.184, sum=0.184 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.237 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09777642387383734,
          "description": "min=0.098, mean=0.098, max=0.098, sum=0.098 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.219, mean=0.219, max=0.219, sum=0.219 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12981641720099685,
          "description": "min=0.13, mean=0.13, max=0.13, sum=0.13 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.219, mean=0.219, max=0.219, sum=0.219 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24561403508771928,
          "description": "min=0.246, mean=0.246, max=0.246, sum=0.246 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1710565485780246,
          "description": "min=0.171, mean=0.171, max=0.171, sum=0.171 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17543859649122806,
          "description": "min=0.175, mean=0.175, max=0.175, sum=0.175 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16666666666666666,
          "description": "min=0.167, mean=0.167, max=0.167, sum=0.167 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.333, mean=0.333, max=0.333, sum=0.333 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.254, mean=0.254, max=0.254, sum=0.254 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2894736842105263,
          "description": "min=0.289, mean=0.289, max=0.289, sum=0.289 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-instruct-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34210526315789475,
          "description": "min=0.342, mean=0.342, max=0.342, sum=0.342 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.272, mean=0.272, max=0.272, sum=0.272 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 612.7982456140351,
          "description": "min=612.798, mean=612.798, max=612.798, sum=612.798 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.272, mean=0.272, max=0.272, sum=0.272 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.21052631578947367,
          "description": "min=0.211, mean=0.211, max=0.211, sum=0.211 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 664.280701754386,
          "description": "min=664.281, mean=664.281, max=664.281, sum=664.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.254, mean=0.254, max=0.254, sum=0.254 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.237 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 664.280701754386,
          "description": "min=664.281, mean=664.281, max=664.281, sum=664.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32456140350877194,
          "description": "min=0.325, mean=0.325, max=0.325, sum=0.325 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2719298245614035,
          "description": "min=0.272, mean=0.272, max=0.272, sum=0.272 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 664.280701754386,
          "description": "min=664.281, mean=664.281, max=664.281, sum=664.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2631578947368421,
          "description": "min=0.263, mean=0.263, max=0.263, sum=0.263 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.22807017543859648,
          "description": "min=0.228, mean=0.228, max=0.228, sum=0.228 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21929824561403508,
          "description": "min=0.219, mean=0.219, max=0.219, sum=0.219 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=114 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 664.280701754386,
          "description": "min=664.281, mean=664.281, max=664.281, sum=664.281 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_glm%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.28654970760233917,
          "description": "min=0.254, mean=0.287, max=0.307, sum=0.86 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12078698511976739,
          "description": "min=0.094, mean=0.121, max=0.153, sum=0.362 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27192982456140347,
          "description": "min=0.254, mean=0.272, max=0.298, sum=0.816 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2543859649122807,
          "description": "min=0.246, mean=0.254, max=0.263, sum=0.763 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19555017816131573,
          "description": "min=0.196, mean=0.196, max=0.196, sum=0.587 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 611.8771929824561,
          "description": "min=611.877, mean=611.877, max=611.877, sum=1835.632 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-instruct-30%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28362573099415206,
          "description": "min=0.272, mean=0.284, max=0.298, sum=0.851 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.2309941520467836,
          "description": "min=0.211, mean=0.231, max=0.246, sum=0.693 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23391812865497075,
          "description": "min=0.228, mean=0.234, max=0.237, sum=0.702 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-x%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5321637426900585,
          "description": "min=0.526, mean=0.532, max=0.535, sum=1.596 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.4678362573099415,
          "description": "min=0.465, mean=0.468, max=0.474, sum=1.404 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.49415204678362573,
          "description": "min=0.482, mean=0.494, max=0.509, sum=1.482 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 624.0701754385965,
          "description": "min=624.07, mean=624.07, max=624.07, sum=1872.211 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20econometrics&runSpecs=%5B%22mmlu%3Asubject%3Deconometrics%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_yalm%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7272546717191638,
          "description": "min=0.715, mean=0.727, max=0.738, sum=2.182 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23684210526315788,
          "description": "min=0.237, mean=0.237, max=0.237, sum=0.711 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10539968980596078,
          "description": "min=0.105, mean=0.105, max=0.106, sum=0.316 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 114.0,
          "description": "min=114, mean=114, max=114, sum=342 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 580.8333333333334,
          "description": "min=580.833, mean=580.833, max=580.833, sum=1742.5 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu_subject:econometrics.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu_subject:econometrics.json"
      }
    ],
    "name": "mmlu_subject:econometrics"
  },
  {
    "title": "subject: us_foreign_policy",
    "header": [
      {
        "value": "Model/adapter",
        "markdown": false,
        "metadata": {}
      },
      {
        "value": "EM",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output matches a correct reference exactly.",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU"
        }
      },
      {
        "value": "ECE (10-bin)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n10-bin expected calibration error: The average difference between the model's confidence and accuracy, averaged across 10 bins where each bin contains an equal number of points (only computed for classification tasks). Warning - not reliable for small datasets (e.g., with < 300 examples) because each bin will have very few examples.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "ECE (10-bin)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "EM (Robustness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output matches a correct reference exactly.\n- Perturbation Robustness: Computes worst case over different robustness perturbations (misspellings, formatting, contrast sets).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Robustness"
        }
      },
      {
        "value": "EM (Fairness)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nExact match: Fraction of instances in which the predicted output matches a correct reference exactly.\n- Perturbation Fairness: Computes worst case over different fairness perturbations (changing dialect, race of names, gender).",
        "markdown": false,
        "lower_is_better": false,
        "metadata": {
          "metric": "EM",
          "run_group": "MMLU",
          "perturbation": "Fairness"
        }
      },
      {
        "value": "Denoised inference time (s)",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\nDenoised inference runtime (s): Average time to process a request to the model, with performance contention removed by using profiled runtimes from multiple trials of SyntheticEfficiencyScenario.",
        "markdown": false,
        "lower_is_better": true,
        "metadata": {
          "metric": "Denoised inference time (s)",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# eval",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# eval: Number of evaluation instances.",
        "markdown": false,
        "metadata": {
          "metric": "# eval",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# train",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# train: Number of training instances (e.g., in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "# train",
          "run_group": "MMLU"
        }
      },
      {
        "value": "truncated",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\ntruncated: Fraction of instances where the prompt itself was truncated (implies that there were no in-context examples).",
        "markdown": false,
        "metadata": {
          "metric": "truncated",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# prompt tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# prompt tokens: Number of tokens in the prompt.",
        "markdown": false,
        "metadata": {
          "metric": "# prompt tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# output tokens",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# output tokens: Actual number of output tokens.",
        "markdown": false,
        "metadata": {
          "metric": "# output tokens",
          "run_group": "MMLU"
        }
      },
      {
        "value": "# trials",
        "description": "The Massive Multitask Language Understanding (MMLU) benchmark for knowledge-intensive question answering across 57 domains [(Hendrycks et al., 2021)](https://openreview.net/forum?id=d7KBjmI3GmQ).\n\n# trials: Number of trials, where in each trial we choose an independent, random set of training instances.",
        "markdown": false,
        "metadata": {
          "metric": "# trials",
          "run_group": "MMLU"
        }
      }
    ],
    "rows": [
      [
        {
          "value": "J1-Jumbo v1 (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.33666666666666667,
          "description": "min=0.32, mean=0.337, max=0.35, sum=1.01 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12658931074327093,
          "description": "min=0.107, mean=0.127, max=0.139, sum=0.38 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.27, mean=0.29, max=0.31, sum=0.87 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31,
          "description": "min=0.29, mean=0.31, max=0.33, sum=0.93 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4250862500000004,
          "description": "min=0.425, mean=0.425, max=0.425, sum=1.275 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Large v1 (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21666666666666667,
          "description": "min=0.21, mean=0.217, max=0.23, sum=0.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14778984721027782,
          "description": "min=0.121, mean=0.148, max=0.181, sum=0.443 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.19, mean=0.2, max=0.22, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333337,
          "description": "min=0.2, mean=0.203, max=0.21, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3524025000000001,
          "description": "min=0.352, mean=0.352, max=0.352, sum=1.057 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v1 (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.32, mean=0.333, max=0.35, sum=1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12089358340175375,
          "description": "min=0.107, mean=0.121, max=0.149, sum=0.363 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.24, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2733333333333334,
          "description": "min=0.24, mean=0.273, max=0.29, sum=0.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38487625000000003,
          "description": "min=0.385, mean=0.385, max=0.385, sum=1.155 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "J1-Grande v2 beta (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j1-grande-v2-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7866666666666667,
          "description": "min=0.76, mean=0.787, max=0.8, sum=2.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15639662403318275,
          "description": "min=0.125, mean=0.156, max=0.182, sum=0.469 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7033333333333333,
          "description": "min=0.68, mean=0.703, max=0.73, sum=2.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.71, mean=0.74, max=0.77, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Jumbo (178B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-jumbo%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.8166666666666668,
          "description": "min=0.8, mean=0.817, max=0.83, sum=2.45 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07532540841383416,
          "description": "min=0.056, mean=0.075, max=0.093, sum=0.226 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.7233333333333333,
          "description": "min=0.71, mean=0.723, max=0.75, sum=2.17 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7766666666666667,
          "description": "min=0.77, mean=0.777, max=0.78, sum=2.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Grande (17B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-grande%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.79,
          "description": "min=0.77, mean=0.79, max=0.81, sum=2.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13345688043018147,
          "description": "min=0.113, mean=0.133, max=0.147, sum=0.4 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6766666666666667,
          "description": "min=0.67, mean=0.677, max=0.68, sum=2.03 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7133333333333333,
          "description": "min=0.7, mean=0.713, max=0.73, sum=2.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Jurassic-2 Large (7.5B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dai21_j2-large%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.48666666666666664,
          "description": "min=0.47, mean=0.487, max=0.5, sum=1.46 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18462129145681627,
          "description": "min=0.162, mean=0.185, max=0.219, sum=0.554 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41,
          "description": "min=0.4, mean=0.41, max=0.42, sum=1.23 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.41, mean=0.43, max=0.45, sum=1.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 321.12,
          "description": "min=321.12, mean=321.12, max=321.12, sum=963.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Base (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-base%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29666666666666663,
          "description": "min=0.29, mean=0.297, max=0.31, sum=0.89 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10442552484254863,
          "description": "min=0.087, mean=0.104, max=0.121, sum=0.313 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.26,
          "description": "min=0.25, mean=0.26, max=0.27, sum=0.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22999999999999998,
          "description": "min=0.21, mean=0.23, max=0.25, sum=0.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 440.72,
          "description": "min=440.72, mean=440.72, max=440.72, sum=1322.16 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Extended (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-extended%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.44333333333333336,
          "description": "min=0.4, mean=0.443, max=0.49, sum=1.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17943032749760301,
          "description": "min=0.143, mean=0.179, max=0.225, sum=0.538 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33666666666666667,
          "description": "min=0.32, mean=0.337, max=0.35, sum=1.01 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3066666666666667,
          "description": "min=0.28, mean=0.307, max=0.33, sum=0.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 440.72,
          "description": "min=440.72, mean=440.72, max=440.72, sum=1322.16 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Luminous Supreme (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3DAlephAlpha_luminous-supreme%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.5766666666666667,
          "description": "min=0.55, mean=0.577, max=0.61, sum=1.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17382456004080235,
          "description": "min=0.125, mean=0.174, max=0.217, sum=0.521 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.45999999999999996,
          "description": "min=0.41, mean=0.46, max=0.51, sum=1.38 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.44999999999999996,
          "description": "min=0.41, mean=0.45, max=0.51, sum=1.35 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 440.72,
          "description": "min=440.72, mean=440.72, max=440.72, sum=1322.16 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Anthropic-LM v4-s3 (52B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Danthropic_stanford-online-all-v4-s3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.77,
          "description": "min=0.76, mean=0.77, max=0.78, sum=2.31 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.111, mean=0.116, max=0.125, sum=0.349 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.72, mean=0.74, max=0.76, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.74, mean=0.74, max=0.74, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6052659375000001,
          "description": "min=0.605, mean=0.605, max=0.605, sum=1.816 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "BLOOM (176B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_bloom%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.41,
          "description": "min=0.4, mean=0.41, max=0.42, sum=1.23 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14576636615176067,
          "description": "min=0.116, mean=0.146, max=0.173, sum=0.437 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36000000000000004,
          "description": "min=0.33, mean=0.36, max=0.38, sum=1.08 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36999999999999994,
          "description": "min=0.36, mean=0.37, max=0.38, sum=1.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.37173196676704606,
          "description": "min=0.349, mean=0.372, max=0.418, sum=1.115 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 419.72,
          "description": "min=419.72, mean=419.72, max=419.72, sum=1259.16 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T0pp (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t0pp%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.6466666666666666,
          "description": "min=0.63, mean=0.647, max=0.67, sum=1.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10320325857094487,
          "description": "min=0.074, mean=0.103, max=0.13, sum=0.31 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6033333333333334,
          "description": "min=0.59, mean=0.603, max=0.62, sum=1.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6133333333333333,
          "description": "min=0.59, mean=0.613, max=0.63, sum=1.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1446615655819575,
          "description": "min=0.145, mean=0.145, max=0.145, sum=0.434 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 446.12000000000006,
          "description": "min=446.12, mean=446.12, max=446.12, sum=1338.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20220609 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20220609%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.55,
          "description": "min=0.54, mean=0.55, max=0.56, sum=1.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20937424842365512,
          "description": "min=0.173, mean=0.209, max=0.246, sum=0.628 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.4566666666666666,
          "description": "min=0.41, mean=0.457, max=0.51, sum=1.37 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5133333333333333,
          "description": "min=0.48, mean=0.513, max=0.53, sum=1.54 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5009440039062502,
          "description": "min=0.501, mean=0.501, max=0.501, sum=1.503 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere large v20220720 (13.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_large-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3666666666666667,
          "description": "min=0.34, mean=0.367, max=0.39, sum=1.1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11285612075734615,
          "description": "min=0.085, mean=0.113, max=0.135, sum=0.339 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31,
          "description": "min=0.27, mean=0.31, max=0.35, sum=0.93 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.27, mean=0.32, max=0.38, sum=0.96 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.31015017578125,
          "description": "min=0.31, mean=0.31, max=0.31, sum=0.93 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20220720 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.2833333333333334,
          "description": "min=0.26, mean=0.283, max=0.3, sum=0.85 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13090452692085733,
          "description": "min=0.111, mean=0.131, max=0.147, sum=0.393 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.18, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.22, mean=0.25, max=0.27, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27967101562499996,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.839 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere small v20220720 (410M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_small-20220720%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.26, mean=0.27, max=0.28, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08059182149179894,
          "description": "min=0.049, mean=0.081, max=0.108, sum=0.242 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22999999999999998,
          "description": "min=0.22, mean=0.23, max=0.24, sum=0.69 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20333333333333334,
          "description": "min=0.17, mean=0.203, max=0.22, sum=0.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27458001953125005,
          "description": "min=0.275, mean=0.275, max=0.275, sum=0.824 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere xlarge v20221108 (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_xlarge-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6533333333333333,
          "description": "min=0.64, mean=0.653, max=0.67, sum=1.96 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14726912269386397,
          "description": "min=0.132, mean=0.147, max=0.157, sum=0.442 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5533333333333333,
          "description": "min=0.53, mean=0.553, max=0.6, sum=1.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5566666666666668,
          "description": "min=0.55, mean=0.557, max=0.57, sum=1.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere medium v20221108 (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_medium-20221108%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.18, mean=0.21, max=0.23, sum=0.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12219512935436112,
          "description": "min=0.083, mean=0.122, max=0.16, sum=0.367 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.17666666666666667,
          "description": "min=0.15, mean=0.177, max=0.19, sum=0.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16,
          "description": "min=0.14, mean=0.16, max=0.18, sum=0.48 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (6.1B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-medium-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.61,
          "description": "min=0.59, mean=0.61, max=0.63, sum=1.83 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11424516921334132,
          "description": "min=0.103, mean=0.114, max=0.124, sum=0.343 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5366666666666667,
          "description": "min=0.53, mean=0.537, max=0.54, sum=1.61 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.55,
          "description": "min=0.55, mean=0.55, max=0.55, sum=1.65 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Cohere Command beta (52.4B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dcohere_command-xlarge-beta%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.77,
          "description": "min=0.75, mean=0.77, max=0.79, sum=2.31 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12577835993914066,
          "description": "min=0.099, mean=0.126, max=0.156, sum=0.377 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7133333333333334,
          "description": "min=0.68, mean=0.713, max=0.73, sum=2.14 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7266666666666666,
          "description": "min=0.72, mean=0.727, max=0.73, sum=2.18 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.07,
          "description": "min=444.07, mean=444.07, max=444.07, sum=1332.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-J (6B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-j-6b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.26666666666666666,
          "description": "min=0.25, mean=0.267, max=0.29, sum=0.8 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.07945241619671319,
          "description": "min=0.062, mean=0.079, max=0.105, sum=0.238 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.24333333333333332,
          "description": "min=0.22, mean=0.243, max=0.26, sum=0.73 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.22, mean=0.233, max=0.25, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.06666364607987585,
          "description": "min=0.066, mean=0.067, max=0.067, sum=0.2 (3)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GPT-NeoX (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_gpt-neox-20b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3066666666666667,
          "description": "min=0.28, mean=0.307, max=0.35, sum=0.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12035968615478772,
          "description": "min=0.111, mean=0.12, max=0.131, sum=0.361 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19333333333333336,
          "description": "min=0.18, mean=0.193, max=0.21, sum=0.58 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20666666666666667,
          "description": "min=0.18, mean=0.207, max=0.24, sum=0.62 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09422396742162258,
          "description": "min=0.093, mean=0.094, max=0.095, sum=0.283 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=1332.24 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (6.9B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-6.9b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19533389342019253,
          "description": "min=0.195, mean=0.195, max=0.195, sum=0.195 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.14 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.14, mean=0.14, max=0.14, sum=0.14 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Pythia (12B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Deleutherai_pythia-12b-v0%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.10427137014803325,
          "description": "min=0.104, mean=0.104, max=0.104, sum=0.104 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21,
          "description": "min=0.21, mean=0.21, max=0.21, sum=0.21 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "T5 (11B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_t5-11b%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.36999999999999994,
          "description": "min=0.35, mean=0.37, max=0.4, sum=1.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11416959850853418,
          "description": "min=0.1, mean=0.114, max=0.127, sum=0.343 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3466666666666667,
          "description": "min=0.32, mean=0.347, max=0.38, sum=1.04 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2833333333333334,
          "description": "min=0.25, mean=0.283, max=0.33, sum=0.85 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22123527632819293,
          "description": "min=0.22, mean=0.221, max=0.223, sum=0.664 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 446.12000000000006,
          "description": "min=446.12, mean=446.12, max=446.12, sum=1338.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "UL2 (20B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_ul2%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%2Cglobal_prefix%3Dnlg%22%5D",
          "markdown": false
        },
        {
          "value": 0.37666666666666665,
          "description": "min=0.37, mean=0.377, max=0.39, sum=1.13 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1200925282812824,
          "description": "min=0.084, mean=0.12, max=0.15, sum=0.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36000000000000004,
          "description": "min=0.35, mean=0.36, max=0.37, sum=1.08 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.3333333333333333,
          "description": "min=0.31, mean=0.333, max=0.36, sum=1 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18374792234102888,
          "description": "min=0.184, mean=0.184, max=0.184, sum=0.551 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 450.12000000000006,
          "description": "min=450.12, mean=450.12, max=450.12, sum=1350.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-175b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.4666666666666666,
          "description": "min=0.45, mean=0.467, max=0.48, sum=1.4 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.16165074370188087,
          "description": "min=0.151, mean=0.162, max=0.176, sum=0.485 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.43,
          "description": "min=0.42, mean=0.43, max=0.45, sum=1.29 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.41333333333333333,
          "description": "min=0.39, mean=0.413, max=0.43, sum=1.24 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11071307636442645,
          "description": "min=0.11, mean=0.111, max=0.111, sum=0.332 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "OPT (66B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_opt-66b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.29, mean=0.32, max=0.34, sum=0.96 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14041337414817748,
          "description": "min=0.124, mean=0.14, max=0.166, sum=0.421 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22666666666666666,
          "description": "min=0.21, mean=0.227, max=0.26, sum=0.68 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2233333333333333,
          "description": "min=0.21, mean=0.223, max=0.25, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.06763378749291098,
          "description": "min=0.067, mean=0.068, max=0.069, sum=0.203 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.42,
          "description": "min=0.42, mean=0.42, max=0.42, sum=0.42 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.135, mean=0.135, max=0.135, sum=0.135 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36,
          "description": "min=0.36, mean=0.36, max=0.36, sum=0.36 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38,
          "description": "min=0.38, mean=0.38, max=0.38, sum=0.38 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.76,
          "description": "min=0.76, mean=0.76, max=0.76, sum=0.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.159, mean=0.159, max=0.159, sum=0.159 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.68,
          "description": "min=0.68, mean=0.68, max=0.68, sum=0.68 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.71,
          "description": "min=0.71, mean=0.71, max=0.71, sum=0.71 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.83,
          "description": "min=0.83, mean=0.83, max=0.83, sum=0.83 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "min=0.051, mean=0.051, max=0.051, sum=0.051 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.82,
          "description": "min=0.82, mean=0.82, max=0.82, sum=0.82 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.81,
          "description": "min=0.81, mean=0.81, max=0.81, sum=0.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "LLaMA (65B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-65b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.89,
          "description": "min=0.89, mean=0.89, max=0.89, sum=0.89 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.81,
          "description": "min=0.81, mean=0.81, max=0.81, sum=0.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.84,
          "description": "min=0.84, mean=0.84, max=0.84, sum=0.84 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.64, mean=0.64, max=0.64, sum=0.64 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.56,
          "description": "min=0.56, mean=0.56, max=0.56, sum=0.56 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.59,
          "description": "min=0.59, mean=0.59, max=0.59, sum=0.59 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-13b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.84,
          "description": "min=0.84, mean=0.84, max=0.84, sum=0.84 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.76,
          "description": "min=0.76, mean=0.76, max=0.76, sum=0.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.79,
          "description": "min=0.79, mean=0.79, max=0.79, sum=0.79 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Llama 2 (70B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmeta_llama-2-70b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.92,
          "description": "min=0.92, mean=0.92, max=0.92, sum=0.92 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.9,
          "description": "min=0.9, mean=0.9, max=0.9, sum=0.9 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "value": 0.91,
          "description": "min=0.91, mean=0.91, max=0.91, sum=0.91 (1)",
          "style": {
            "font-weight": "bold"
          },
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Alpaca (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dstanford_alpaca-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6,
          "description": "min=0.6, mean=0.6, max=0.6, sum=0.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1514459586222974,
          "description": "min=0.151, mean=0.151, max=0.151, sum=0.151 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.52,
          "description": "min=0.52, mean=0.52, max=0.52, sum=0.52 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.53,
          "description": "min=0.53, mean=0.53, max=0.53, sum=0.53 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-7b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7,
          "description": "min=0.7, mean=0.7, max=0.7, sum=0.7 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12147506645293213,
          "description": "min=0.121, mean=0.121, max=0.121, sum=0.121 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.65,
          "description": "min=0.65, mean=0.65, max=0.65, sum=0.65 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.68,
          "description": "min=0.68, mean=0.68, max=0.68, sum=0.68 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Vicuna v1.3 (13B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dlmsys_vicuna-13b-v1.3%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.72,
          "description": "min=0.72, mean=0.72, max=0.72, sum=0.72 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15845715375526928,
          "description": "min=0.158, mean=0.158, max=0.158, sum=0.158 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.69,
          "description": "min=0.69, mean=0.69, max=0.69, sum=0.69 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7,
          "description": "min=0.7, mean=0.7, max=0.7, sum=0.7 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 479.81,
          "description": "min=479.81, mean=479.81, max=479.81, sum=479.81 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Mistral v0.1 (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmistralai_mistral-7b-v0.1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.84,
          "description": "min=0.84, mean=0.84, max=0.84, sum=0.84 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.82,
          "description": "min=0.82, mean=0.82, max=0.82, sum=0.82 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.83,
          "description": "min=0.83, mean=0.83, max=0.83, sum=0.83 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (530B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_530B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.7766666666666667,
          "description": "min=0.77, mean=0.777, max=0.78, sum=2.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08475747302565896,
          "description": "min=0.073, mean=0.085, max=0.098, sum=0.254 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7366666666666667,
          "description": "min=0.72, mean=0.737, max=0.75, sum=2.21 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.7399999999999999,
          "description": "min=0.73, mean=0.74, max=0.75, sum=2.22 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "TNLG v2 (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmicrosoft_TNLGv2_7B%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.29,
          "description": "min=0.25, mean=0.29, max=0.35, sum=0.87 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11269665749808301,
          "description": "min=0.108, mean=0.113, max=0.119, sum=0.338 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19000000000000003,
          "description": "min=0.16, mean=0.19, max=0.24, sum=0.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25,
          "description": "min=0.2, mean=0.25, max=0.31, sum=0.75 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "davinci (175B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_davinci%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6933333333333334,
          "description": "min=0.68, mean=0.693, max=0.7, sum=2.08 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.15038484047487732,
          "description": "min=0.127, mean=0.15, max=0.18, sum=0.451 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.59,
          "description": "min=0.58, mean=0.59, max=0.6, sum=1.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6066666666666666,
          "description": "min=0.6, mean=0.607, max=0.61, sum=1.82 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22106718749999998,
          "description": "min=0.221, mean=0.221, max=0.221, sum=0.663 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "curie (6.7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_curie%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18712194091579812,
          "description": "min=0.159, mean=0.187, max=0.238, sum=0.561 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.18333333333333335,
          "description": "min=0.16, mean=0.183, max=0.2, sum=0.55 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20000000000000004,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.6 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09253499999999999,
          "description": "min=0.093, mean=0.093, max=0.093, sum=0.278 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "babbage (1.3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_babbage%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.31666666666666665,
          "description": "min=0.29, mean=0.317, max=0.35, sum=0.95 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1098975478670655,
          "description": "min=0.095, mean=0.11, max=0.124, sum=0.33 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22333333333333336,
          "description": "min=0.2, mean=0.223, max=0.24, sum=0.67 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.26, mean=0.27, max=0.28, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.12030531249999994,
          "description": "min=0.12, mean=0.12, max=0.12, sum=0.361 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "ada (350M)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_ada%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.3133333333333333,
          "description": "min=0.31, mean=0.313, max=0.32, sum=0.94 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09229113846980998,
          "description": "min=0.087, mean=0.092, max=0.101, sum=0.277 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.25666666666666665,
          "description": "min=0.25, mean=0.257, max=0.27, sum=0.77 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.25, mean=0.27, max=0.28, sum=0.81 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14107562500000007,
          "description": "min=0.141, mean=0.141, max=0.141, sum=0.423 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-003",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-003%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.8433333333333333,
          "description": "min=0.83, mean=0.843, max=0.86, sum=2.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.1470772685324068,
          "description": "min=0.127, mean=0.147, max=0.163, sum=0.441 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.8233333333333333,
          "description": "min=0.81, mean=0.823, max=0.84, sum=2.47 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.83,
          "description": "min=0.83, mean=0.83, max=0.83, sum=2.49 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-davinci-002",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-davinci-002%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.8433333333333333,
          "description": "min=0.83, mean=0.843, max=0.86, sum=2.53 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08855632309548854,
          "description": "min=0.064, mean=0.089, max=0.106, sum=0.266 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.8166666666666668,
          "description": "min=0.81, mean=0.817, max=0.83, sum=2.45 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.81,
          "description": "min=0.8, mean=0.81, max=0.82, sum=2.43 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21452374999999999,
          "description": "min=0.215, mean=0.215, max=0.215, sum=0.644 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-curie-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-curie-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.24,
          "description": "min=0.24, mean=0.24, max=0.24, sum=0.72 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.518923372446291,
          "description": "min=0.508, mean=0.519, max=0.534, sum=1.557 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.21, mean=0.22, max=0.23, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2333333333333333,
          "description": "min=0.23, mean=0.233, max=0.24, sum=0.7 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.130843125,
          "description": "min=0.131, mean=0.131, max=0.131, sum=0.393 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-babbage-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-babbage-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.21, mean=0.22, max=0.23, sum=0.66 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28724377907255744,
          "description": "min=0.263, mean=0.287, max=0.328, sum=0.862 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.19666666666666666,
          "description": "min=0.18, mean=0.197, max=0.21, sum=0.59 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.21333333333333335,
          "description": "min=0.2, mean=0.213, max=0.22, sum=0.64 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.13215125,
          "description": "min=0.132, mean=0.132, max=0.132, sum=0.396 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "text-ada-001",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_text-ada-001%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.15666666666666665,
          "description": "min=0.14, mean=0.157, max=0.18, sum=0.47 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6583265858105705,
          "description": "min=0.653, mean=0.658, max=0.666, sum=1.975 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09999999999999999,
          "description": "min=0.08, mean=0.1, max=0.12, sum=0.3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.14,
          "description": "min=0.11, mean=0.14, max=0.17, sum=0.42 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08784312499999998,
          "description": "min=0.088, mean=0.088, max=0.088, sum=0.264 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0301",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0301%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.85,
          "description": "min=0.85, mean=0.85, max=0.85, sum=0.85 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.79,
          "description": "min=0.79, mean=0.79, max=0.79, sum=0.79 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.8,
          "description": "min=0.8, mean=0.8, max=0.8, sum=0.8 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 415.79,
          "description": "min=415.79, mean=415.79, max=415.79, sum=415.79 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "gpt-3.5-turbo-0613",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dopenai_gpt-3.5-turbo-0613%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.73,
          "description": "min=0.73, mean=0.73, max=0.73, sum=0.73 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.49,
          "description": "min=0.49, mean=0.49, max=0.49, sum=0.49 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.66,
          "description": "min=0.66, mean=0.66, max=0.66, sum=0.66 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 415.79,
          "description": "min=415.79, mean=415.79, max=415.79, sum=415.79 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.19,
          "description": "min=1.19, mean=1.19, max=1.19, sum=1.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11325806287828982,
          "description": "min=0.113, mean=0.113, max=0.113, sum=0.113 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.22,
          "description": "min=0.22, mean=0.22, max=0.22, sum=0.22 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct-v1 (3B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-3b-v1%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08961420526691671,
          "description": "min=0.09, mean=0.09, max=0.09, sum=0.09 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.27,
          "description": "min=0.27, mean=0.27, max=0.27, sum=0.27 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Base (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-base-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.28 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.08298970145466679,
          "description": "min=0.083, mean=0.083, max=0.083, sum=0.083 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.2,
          "description": "min=0.2, mean=0.2, max=0.2, sum=0.2 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.23,
          "description": "min=0.23, mean=0.23, max=0.23, sum=0.23 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "RedPajama-INCITE-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_redpajama-incite-instruct-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.52,
          "description": "min=0.52, mean=0.52, max=0.52, sum=0.52 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.09169782092777656,
          "description": "min=0.092, mean=0.092, max=0.092, sum=0.092 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.46,
          "description": "min=0.46, mean=0.46, max=0.46, sum=0.46 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.48,
          "description": "min=0.48, mean=0.48, max=0.48, sum=0.48 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.68,
          "description": "min=0.68, mean=0.68, max=0.68, sum=0.68 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.6,
          "description": "min=0.6, mean=0.6, max=0.6, sum=0.6 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.64, mean=0.64, max=0.64, sum=0.64 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "MPT-Instruct (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dmosaicml_mpt-instruct-30b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.64, mean=0.64, max=0.64, sum=0.64 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.56,
          "description": "min=0.56, mean=0.56, max=0.56, sum=0.56 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.59,
          "description": "min=0.59, mean=0.59, max=0.59, sum=0.59 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 444.08,
          "description": "min=444.08, mean=444.08, max=444.08, sum=444.08 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.39,
          "description": "min=0.39, mean=0.39, max=0.39, sum=0.39 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.37,
          "description": "min=0.37, mean=0.37, max=0.37, sum=0.37 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.33,
          "description": "min=0.33, mean=0.33, max=0.33, sum=0.33 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 449.19,
          "description": "min=449.19, mean=449.19, max=449.19, sum=449.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (7B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-7b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.34,
          "description": "min=0.34, mean=0.34, max=0.34, sum=0.34 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.32,
          "description": "min=0.32, mean=0.32, max=0.32, sum=0.32 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 449.19,
          "description": "min=449.19, mean=449.19, max=449.19, sum=449.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.79,
          "description": "min=0.79, mean=0.79, max=0.79, sum=0.79 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.76,
          "description": "min=0.76, mean=0.76, max=0.76, sum=0.76 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.78,
          "description": "min=0.78, mean=0.78, max=0.78, sum=0.78 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 449.19,
          "description": "min=449.19, mean=449.19, max=449.19, sum=449.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Falcon-Instruct (40B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtiiuae_falcon-40b-instruct%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.82,
          "description": "min=0.82, mean=0.82, max=0.82, sum=0.82 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.78,
          "description": "min=0.78, mean=0.78, max=0.78, sum=0.78 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.8,
          "description": "min=0.8, mean=0.8, max=0.8, sum=0.8 (1)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=100 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=5 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 449.19,
          "description": "min=449.19, mean=449.19, max=449.19, sum=449.19 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=1 (1)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "GLM (130B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_glm%2Cdata_augmentation%3Dcanonical%2Cstop%3Dhash%22%5D",
          "markdown": false
        },
        {
          "value": 0.41333333333333333,
          "description": "min=0.39, mean=0.413, max=0.43, sum=1.24 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.11165857926205654,
          "description": "min=0.104, mean=0.112, max=0.125, sum=0.335 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.38999999999999996,
          "description": "min=0.37, mean=0.39, max=0.41, sum=1.17 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.36999999999999994,
          "description": "min=0.35, mean=0.37, max=0.39, sum=1.11 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.5085839150746664,
          "description": "min=0.508, mean=0.509, max=0.509, sum=1.526 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 423.26,
          "description": "min=423.26, mean=423.26, max=423.26, sum=1269.78 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "InstructPalmyra (30B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-instruct-30%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.6666666666666666,
          "description": "min=0.62, mean=0.667, max=0.7, sum=2 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.6233333333333334,
          "description": "min=0.58, mean=0.623, max=0.65, sum=1.87 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.64,
          "description": "min=0.6, mean=0.64, max=0.66, sum=1.92 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "Palmyra X (43B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dwriter_palmyra-x%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.8766666666666666,
          "description": "min=0.87, mean=0.877, max=0.88, sum=2.63 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 0.8533333333333334,
          "description": "min=0.84, mean=0.853, max=0.86, sum=2.56 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.8566666666666666,
          "description": "min=0.85, mean=0.857, max=0.86, sum=2.57 (3)",
          "style": {},
          "markdown": false
        },
        {
          "description": "1 matching runs, but no matching metrics",
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 433.12000000000006,
          "description": "min=433.12, mean=433.12, max=433.12, sum=1299.36 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ],
      [
        {
          "value": "YaLM (100B)",
          "description": "",
          "href": "?group=mmlu&subgroup=subject%3A%20us_foreign_policy&runSpecs=%5B%22mmlu%3Asubject%3Dus_foreign_policy%2Cmethod%3Dmultiple_choice_joint%2Cmodel%3Dtogether_yalm%2Cdata_augmentation%3Dcanonical%22%5D",
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.6348528256596796,
          "description": "min=0.619, mean=0.635, max=0.645, sum=1.905 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.28,
          "description": "min=0.28, mean=0.28, max=0.28, sum=0.84 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.20833947074413298,
          "description": "min=0.208, mean=0.208, max=0.208, sum=0.625 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 100.0,
          "description": "min=100, mean=100, max=100, sum=300 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 5.0,
          "description": "min=5, mean=5, max=5, sum=15 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 0.0,
          "description": "min=0, mean=0, max=0, sum=0 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 432.5,
          "description": "min=432.5, mean=432.5, max=432.5, sum=1297.5 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 1.0,
          "description": "min=1, mean=1, max=1, sum=3 (3)",
          "style": {},
          "markdown": false
        },
        {
          "value": 3.0,
          "description": "min=3, mean=3, max=3, sum=9 (3)",
          "style": {},
          "markdown": false
        }
      ]
    ],
    "links": [
      {
        "text": "LaTeX",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/latex/mmlu_mmlu_subject:us_foreign_policy.tex"
      },
      {
        "text": "JSON",
        "href": "/nlp/scr4/nlp/crfm/yifanmai/helm-release/benchmark_output/releases/v0.4.0/groups/json/mmlu_mmlu_subject:us_foreign_policy.json"
      }
    ],
    "name": "mmlu_subject:us_foreign_policy"
  }
]