AI Benchmarks Series — Metaculus Questions on Evaluations of AI Models Against Technical Benchmarks

christian

Effective Altruism Forum

Mar 27, 2024 · 1 min read

AI safety, Forecasting, AI benchmarks, AI forecasting, Announcements and updates, Metaculus

This is a linkpost for https://www.metaculus.com/project/3054/

How capable will top AI models be in 2025?

In AI Benchmarks, forecast LLM agents' autonomous replication and adaptation (ARA) abilities and model performance on benchmarks like GPQA and GAIA. The series is a collaboration with the AI Safety Student Team at Harvard (AISST).

Start here.

AISST questions are inspired by work by @elifland.
