This is an anonymous account.
Hi JWS, Just wanted to let you know that we've posted our introduction to the series. We hope it adds some clarity to the points you've raised here for others.
Hi Bruce, thanks for this thoughtful comment. We think Conjecture needs to address key concerns before we would recommend working there, although we could imagine Conjecture being the best option for a small fraction of people who (a) are excited by their current CoEm approach, (b) can operate independently in an environment with limited mentorship, and (c) are confident they can withstand internal pressure (if there is a push to work on capabilities). As a result of these (and other) comments in this comment thread, we will be updating our recommendation to work at Conjecture.
That being said, we expect it to be rare that an individual would have an offer from Conjecture but not have access to other opportunities that are better than independent research. In practice many organizations end up competing for the same, relatively small pool of the very top candidates. Our guess is that most individuals who could receive an offer from Conjecture could pursue one of the paths outlined above in our replies to Marius such as being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company (if not from more promising places like the ones we discuss in the original post). We think these positions can absorb a fairly large amount of talent, although we note that most AI/ML fields are fairly competitive.
(personal, emotional reflection)
On a personal note, the past few days have been pretty tough for me. I noticed I took the negative feedback pretty hard.
I hope we have demonstrated that we are acting in good faith, willing to update and engage rigorously with feedback and criticism, but some of the comments made me feel like people thought we were trying to be deceptive or mislead people. It's pretty difficult to take that in when it's so far from our intentions.
We try not to let the fact that our posts are anonymous mean we can say things that aren't as rigorous, but sometimes it feels like people don't realize that we are people too. I think comments might be phrased differently if we weren't anonymous.
I think it's especially hard when this post has taken many weekends to complete, and we've invested several hours this week in engaging with comments, which is a tough trade-off against other projects.
Brief reflections on the Conjecture post and its reception
(Written by the non-technical primary author)
We didn't do some super basic things which feel obvious in retrospect e.g. explain why we are writing this series. But context is important when people are primed to respond negatively to a post.
Changes we plan to make:
We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.
1) We agree it's worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We're not aware of any equally significant advances from Connor or other key staff members at Conjecture; we'd be interested to hear if you have examples of their pre-Conjecture output you find impressive.
We're not particularly impressed by Conjecture's process, although it's possible we'd change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn't feel like the crux for us: if Conjecture copied ARC's process entirely, we'd still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.
In terms of the explicit comparison with ARC, we would like to note that ARC Theory's team size is an order of magnitude smaller than Conjecture's. Based on ARC's recent hiring post, our understanding is the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 mn, then we would indeed be disappointed if there were not more concrete wins.
2) Thanks for the concrete examples, this really helps tease apart our disagreement.
We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly ELK is one of the outputs people in the TAIS community are most excited by so this is a high bar.
The stuff on SVDs and sparse coding [...] was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.
This sounds similar to our internal evaluation. We're a bit confused about why "3 people in two weeks" is the relevant reference class. We'd argue the costs of Conjecture's "misses" need to be accounted for, not just their "hits". Redwood's team size and budget are comparable to those of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture's other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood's other output)?
Thanks for sharing the data point that this influenced independent researchers. That's useful to know, and updates us positively. Are you excited by those independent researchers' new directions? Is there any output from those researchers you'd suggest we review?
3) We remain confident in our sources regarding Conjecture's discussion with VCs, although it's certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It's reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.
4) We think your point is reasonable. We plan to reflect this recommendation and will reply here when we have an update.
5) This certainly depends on what "general industry" refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We'd be curious to hear your case for Conjecture as skill building; without that it's hard to identify where our main disagreement lies.
While we're taking a short break from writing criticisms, I (the non-technical author) was wondering if people would find it valuable for us to share (brief) thoughts on what we've learnt so far from writing these first two critiques, such as how to get feedback, balance considerations, anonymity concerns, and things we wish were different in the ecosystem to make it easier for people to provide criticisms.
We're always open to providing thoughts / feedback / inputs if you are trying to write a critique. I'd like to try and encourage more good-faith critiques that enable productive discourse.