I think you are somewhat overestimating current Minerva capabilities. MATH dataset is not so hard; the person who got 40% "did not like mathematics" and had 1 hour per 20 questions.
Generally, I agree with more or less everything in the essay, just wanted to nitpick on this particular thing.
I think you are somewhat overestimating current Minerva capabilities. MATH dataset is not so hard; the person who got 40% "did not like mathematics" and had 1 hour per 20 questions.
Generally, I agree with more or less everything in the essay, just wanted to nitpick on this particular thing.