Percy Liang · Apr 26, 2024 · 5:05 AM UTC

@percyliang

HELM Lite v1.2.0 is out!
Datasets: NarrativeQA, NaturalQA, OpenbookQA, MMLU, MATH, GSM8K, LegalBench, MedQA, WMT14
Results (we still need to add Claude 3, which requires more prompt finagling): crfm.stanford.edu/helm/lite/…
