05:51
Doubao Large Model Team Open-Sources Benchmark Test Set SuperGPQA
On March 4, Golden Ten Data reported that, according to the official WeChat account of the Doubao large model team, the team has recently open-sourced SuperGPQA, a knowledge-reasoning benchmark notable for its comprehensive coverage of fields. The dataset reportedly builds an evaluation system spanning 285 graduate-level disciplines and contains 26,529 specialized questions. Beyond mainstream disciplines, it also includes long-tail disciplines such as light industry, agriculture, and service science, giving it broad disciplinary coverage and filling a gap in the evaluation of long-tail knowledge.








