
cb

83 karma · Joined

Bio

I work on AI grantmaking at Open Philanthropy.

All posts are in a personal capacity and do not reflect the views of my employer, unless otherwise stated.

Comments (4)

cb

I wrote about mistakes I made as a uni group organiser here, inspired by this list!

cb

(even larger disclaimer than usual: i don't have much experience applying to EA orgs, i'm also not trying to give career advice and wouldn't recommend taking career advice from me, ymmv)

Thanks for posting! I'm broadly sympathetic to this line of reasoning. One thing I wanted to note is that hiring processes seem pretty noisy, and lots of people seem pretty bad at estimating how good they are at things, so in practice there might not be that much difference between trying to get yourself hired and trying to get the best candidate hired. I think a reasonable heuristic is: "try to do well at all the interviews/work tests, as you would for a normal job, but don't rule yourself out in advance, and be very honest and transparent if you're asked specific questions".

Hi Søren, 

Thanks for commenting. Some quick responses:

> The safety frameworks presented by the frontier labs are "safety-washing", more appropriately considered roadmaps towards an unsurvivable future

I don’t see the labs as the main audience for evaluation results, and I don’t think voluntary safety frameworks should be how deployment and safeguard decisions are made in the long term, so I don’t think the quality of lab safety frameworks is that relevant to this RFP.

> I'd like sources for your claim, please. 

Sure, see, e.g., the sources linked for this claim in our RFP: "What Are the Real Questions in AI?" and "What the AI debate is really about".

I’m surprised you think the disagreements are “performative”: in my experience, many sceptics of GCRs from AI do sincerely hold their beliefs.

> No decision-relevant conclusions can be drawn from evaluations in the style of Cybench and Re-Bench.

I think Cybench and RE-Bench are useful, if imperfect, proxies for frontier model capabilities at cyberoffense and ML engineering, respectively, and those capabilities are central to threats from cyberattacks and AI R&D. My claim isn’t that running these evals will tell you exactly what to do: it’s that these evaluations are being used as inputs into responsible scaling policies (RSPs) and governance proposals more broadly, and that they provide some evidence on the likelihood of GCRs from AI, but will need to become harder and more robust before they can be relied upon.