The latest GDPval leaderboard suggests frontier AI models are approaching industry-expert performance. But what does “parity” actually mean in practice, and do the results overstate real-world impact?

Introduction

GDPval is an evaluation benchmark for assessing AI model capabilities on real-world, economically valuable tasks. It covers the majority of U.S.…



