GSM8K takes its name from “Grade School Math 8K.” In the 2021 paper “Training Verifiers to Solve Math Word Problems,” OpenAI researchers introduced it as “a dataset of 8.5K high quality linguistically diverse grade school math word problems.” The problems use only basic arithmetic (addition, subtraction, multiplication, and division), but each requires a sequence of 2 to 8 reasoning steps, which made them a useful early test of multi-step reasoning.
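
To make the multi-step structure concrete, here is a minimal sketch in Python. The word problem and the function name `solve_bakery_problem` are invented for illustration and are not taken from the dataset. The sketch only shows the general shape of such a problem: each step is a single elementary operation, and each step depends on the result of the one before it.

```python
def solve_bakery_problem() -> int:
    """Hypothetical grade-school word problem (not from GSM8K):

    A baker makes 12 trays of 8 muffins, sells 70 muffins, and splits
    the rest equally between 2 shelves. How many muffins go on each shelf?
    """
    baked = 12 * 8              # step 1: total muffins baked
    remaining = baked - 70      # step 2: muffins left after sales
    per_shelf = remaining // 2  # step 3: split evenly between 2 shelves
    return per_shelf


print(solve_bakery_problem())  # prints 13
```

No single step is hard, but an error in any one of them carries through to the final answer, which is why problems of this kind test the reasoning chain rather than arithmetic skill alone.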