Yeah, I think this is an excellent point that you have made more clearly than I did: we are measuring engagement as a proxy for effectiveness. It might be a decent proxy for something like 'probability of future effectiveness' when considering young students in particular - if an intervention meaningfully increases the likelihood that some well-meaning undergrads make EA friends and read books and come to events, then I have at least moderate confidence that it also increases impact because some of those people will go on to make more impactful choices through their greater engagement with EA ideas. But I don't think it's a good proxy for the amount of impact being made by 'people who basically run their whole lives around EA ideas already.' It's hard to imagine how these people could increase their ENGAGEMENT with EA (they've read all the books, they RUN the events, they're friends with most people in the community, etc etc) but there are many ways they could increase their IMPACT, which may well be facilitated/prompted by EAGx but not captured by the data.
Out of curiosity, would you say that since switching careers, your engagement as measured by these kinds of metrics (books read, events attended, number of EA friends, frequency of forum activity, etc) has gone up, gone down, or stayed the same?
I like these ideas but have something to add re your 'keen beans', or rather, their opposite - at what point is someone insufficiently engaged with EA to bother considering them when assessing the effectiveness of interventions? If someone signs up to an EA mailing list and then lets all the emails go to their junk folder without ever reading them or considering the ideas again, is that person actually part of the target group for the intervention? They are part of our statistics (as in, they count towards the 95% of 'people on the EA mailing list' who did not respond to the survey). Is that a good thing or a bad thing?
Thanks for the response, I really like hearing about other people's reasoning re: study design! I agree that randomly excluding highly qualified people would be too costly, and I think your idea of building a control group from accepted-cancelled EAGx attendees across multiple conferences is a great one. I guess my only issue with it is that these people are likely still experiencing FOMO (they wanted to go but couldn't). If we are considering a counterfactual scenario where the resources currently used to organise EAGx conferences are spent on something else, there's no conference to miss out on, so that whole layer of 'damn, I wish I could have gone to that' disappears.
I'm not familiar enough with survey design to comment on the risk of adding more questions reducing the response rate. If you think it would be a big issue, that's good enough for me - and also I imagine it would further skew the survey respondents towards more-engaged rather than less-engaged people. I do think that for the purpose of this survey, it would make more sense to prompt the EAGx attendees to answer whether they had followed up on any connections / ideas / opportunities from EAGx in the last 6 months. I'm not sure how to word that so that the same survey/questions could be used for both groups though.
This is a really good point actually! I have never attended either an EAG conference or an EAGx on another continent, so I don't really have a frame of reference for how they generally compare. In Australia, EAGx is THE annual conference, and most of us put decently high priority on showing up if we can.
(Disclosure: I was an attendee at EAGx Australia in 2022 and 2023. I believe I am one of the data points in the Treatment group described.)
Thanks again for running this well-designed survey, which I know has taken a great deal of effort. The results do surprise me a little, and I notice that part of my motivation for writing this response is 'I feel like the conferences are really valuable so I want to add alternate explanations that would support that belief.' That said, I feel like some of my interpretations of this data might be of interest or add value to the conversation, so here goes.
The main thing that stands out to me in my interpretation of this data is that I think most EAs probably have an 'EA ceiling'. By that I mean, there's some maximum amount of engagement that each person is capable of, dependent on their circumstances. I think there may actually be two distinct cohorts of people who are representative of ceiling effects in the data.
The first cohort ('personal ceiling') are people who are doing everything they can, given their other goals and circumstances. I can't increase my donations if I'm really struggling financially (and I think it's important to acknowledge that Australia is in a major housing and cost-of-living crisis right now, which certainly affects my capacity to donate). I can't attend more events if I'm a single parent, or doing shift work on a rigid schedule, or living in a regional town with no active EA community. These people are at a personal ceiling.
The second cohort ('logical ceiling') are people who basically already run their entire lives around EA principles (and I met several at EAGx). They've taken the 10% pledge, they work at EA orgs, they are vegan, they attend every EA event they reasonably can, they volunteer, they are active online, etc. It's hard to imagine how people this committed could meaningfully increase their engagement with EA.
Given that attending EAGx requires a significant personal commitment of time and resources, it seems fairly obvious to me that conference attendees would be self-selected for BOTH 'has free time and resources to attend the conference' AND 'higher EA engagement in comparison to people on the mailing list who didn't attend the conference'. I think this is confirmed by the data: conference attendees had more EA friends and higher event attendance both before and after the conference. We should also consider that the survey response rate for non-conference-attendees was low, and the people who completed the survey are probably more engaged with EA than the average person on the mailing list. I think it would be really interesting to try to determine what percentage of respondents in each group are at either a personal or logical ceiling, and whether these 'ceiling participants' differ from other EAs in terms of the stability of their commitment and level of engagement over time. To resort to metaphor, it takes a lot more energy to keep a pot boiling than simmering, and it seems at least plausible to me that a large part of the value of EAGx is helping a relatively small group of extremely engaged people maintain their motivation, focus and commitment, and build new collaborations.
As an experimentalist (I'm a molecular biologist), I see the 'obvious' hypothesis test as the one proposed in the OP: randomise would-be EAGx attendees into treatment and control groups, and then only let half of them attend. However, I think that using people at the borderline of being accepted or rejected as the basis for such a randomisation study would risk skewing the data. Specifically, I think it's likely that everyone at a logical ceiling and most people at a personal ceiling would be an 'automatic accept' for EAGx and at no risk of being considered 'borderline admits'. Therefore, the experimentally optimal way to run this would be to finalise the list of acceptances with 30 more acceptances than there are conference places, and then exclude 30 people totally at random. Unfortunately, there's significant downside risk to such an approach. It's likely that conference organisers, volunteers and speakers would be among those excluded, which would be disruptive to the conference and would likely reduce the value that other participants would get. I think it's also important to consider that missing out on attending EAG or EAGx is a massive bummer; people have written before about feelings of unimportance or inadequacy as a result of conference rejection pushing them away from further participation in EA, and we should take this into account if considering running experiments that would involve arbitrarily declining qualified applications. (Edited to add: the experience of people who applied but weren't selected for an EAGx, either because of a study or because they didn't make the cut, is likely very different from the experience of EAs if no EAGx was held. FOMO/resentment for having personally missed out when others went is not similar to 'oh, I hope there will be a conference next year' or [crickets].)
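For concreteness, here's a minimal sketch of the 'over-accept, then exclude at random' step described above. Everything in it is an assumption for illustration: the function name, the applicant labels, and especially the optional `exempt` argument, which is a hypothetical way to protect organisers and speakers. Note that using `exempt` means the exclusion is no longer 'totally at random', so any comparison would only hold for the non-exempt applicants.

```python
import random


def randomly_exclude(accepted, n_excluded=30, exempt=(), seed=None):
    """Split an over-full acceptance list into attendees and a control group.

    accepted   -- list of accepted applicants (hypothetical IDs or names)
    n_excluded -- how many people to exclude at random (30 in the comment above)
    exempt     -- hypothetical: people who can never be excluded (e.g. speakers);
                  using this departs from pure randomisation
    seed       -- optional seed so the draw can be reproduced and audited
    """
    exempt = set(exempt)
    eligible = [person for person in accepted if person not in exempt]
    if n_excluded > len(eligible):
        raise ValueError("Not enough non-exempt applicants to exclude.")

    rng = random.Random(seed)
    control = set(rng.sample(eligible, n_excluded))

    # Preserve the original ordering of the acceptance list in both outputs.
    attendees = [person for person in accepted if person not in control]
    excluded = [person for person in accepted if person in control]
    return attendees, excluded


if __name__ == "__main__":
    # Illustrative data only: 130 acceptances for 100 conference places.
    applicants = [f"applicant_{i}" for i in range(130)]
    attendees, control_group = randomly_exclude(applicants, n_excluded=30, seed=0)
    print(len(attendees), len(control_group))
```

Fixing the seed in advance (and ideally publishing it) would make the draw auditable, which might matter given how bad it feels to be declined.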
I do also think that the metrics used in this study but not in 'typical' EAGx impact surveys are missing a lot of dimensions via which EAs have impact, especially those which are most relevant to people at a logical ceiling who are already working or volunteering within EA orgs for multiple hours a week. Metrics like 'did you read books and forum posts, did you go to meetups, did you make friends' are great for measuring engagement with EA ideas and community, but not great for measuring outputs like 'Alex and Tsai had some great chats and have formed a technical collaboration' or 'Kate talked to Jess about her research and is now doing a PhD in her lab' or 'Kai inspired Josh to get professional mental health treatment and he's now able to spend another 10 hours a week on effective work'. On this basis, I completely agree with the original post that we need to combine BOTH self-reports of effectiveness based on subjective measures like meaningful connections or feeling motivated, AND objective measures of behaviour change. I wonder whether it would be possible to incorporate more metrics that would 'split the difference' in a way, while still relying on self-reports of past behaviour (which is important for all the reasons discussed in OP). For instance, could we ask at a 6-month follow-up, 'how many people that you met at EAGx have you interacted with in the past month?' Or, we could ask people to nominate specific actions they intended to take immediately after EAGx (with a control group of non-attendees) and then follow up 6 months later to ask them which of those actions they have actually taken. This design would be scalable to people with different levels of both engagement and ceiling-ness: a busy professional might commit to reading The Scout Mindset and going to at least one meetup, while a student working to build an EA career might commit to applying for EAG, following up with 2 new connections and writing a forum post.
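To make the 'nominate actions, then follow up' idea a bit more concrete, here's a rough sketch of how the responses could be scored. This is just one possible scoring under my own assumptions: the `Respondent` structure, the group labels and the 'fraction of nominated actions completed' measure are all hypothetical, not anything from the OP's survey. Scoring each person against their own list is what lets the measure scale across different ceilings.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Respondent:
    group: str            # hypothetical labels: "attendee" or "control"
    intended: set         # actions nominated right after the conference date
    completed: set        # actions self-reported as done at the 6-month follow-up


def follow_through(r: Respondent) -> Optional[float]:
    """Fraction of a person's own nominated actions they report completing."""
    if not r.intended:
        return None  # nothing nominated, so no rate can be computed
    return len(r.intended & r.completed) / len(r.intended)


def mean_follow_through(respondents, group: str) -> Optional[float]:
    """Average follow-through rate within one group, skipping empty lists."""
    rates = [follow_through(r) for r in respondents if r.group == group]
    rates = [rate for rate in rates if rate is not None]
    return mean(rates) if rates else None


if __name__ == "__main__":
    # Made-up example responses, purely for illustration.
    responses = [
        Respondent("attendee",
                   {"read The Scout Mindset", "attend one meetup"},
                   {"attend one meetup"}),
        Respondent("attendee",
                   {"apply for EAG", "follow up with 2 connections", "write a forum post"},
                   {"apply for EAG", "follow up with 2 connections", "write a forum post"}),
        Respondent("control",
                   {"attend one meetup", "write a forum post"},
                   set()),
    ]
    print(mean_follow_through(responses, "attendee"))
    print(mean_follow_through(responses, "control"))
```

A real analysis would need far more than a difference in means (the self-selection issues above still apply), but the per-person rate is the piece I'd want the survey wording to support.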
This is longer than I intended it to be, and I hope it doesn't come across as critical - I think this is very important work, and that we should always be open to considering that beloved interventions are less effective than we would like for them to be. I hope this is a useful addition to the discussion, at any rate. And thanks James for all the work you put into EAGx Australia!
Thanks so much for this useful reply :) There's something I want to write about which I believe carries a risk of 'some undefined amount of suffering that is worth consideration'. I think I have been wasting time trying to decide if it is an s-risk by the usual definitions rather than just writing about what's happening and then speculating from there. You make a good point that something not quite bad enough to be an s-risk is still pretty bad!
I do think 'catastrophic suffering risk' is an odd one, because it's really not intuitive that a 'catastrophic suffering risk' is less bad than a 'suffering risk'. I guess I just find it weird that something as bad as a genuine s-risk has such a pedestrian name, compared to 'existential risk', which I think is an intuitive and evocative name that gets across the level of bad-ness pretty well.
One quick question - when you say an s-risk creates a future with negative value, does that make it worse than an x-risk? As in, the imagined future is SO awful that the extinction of humanity would be preferable?
I like the idea of carbon offsets for flights etc, but I think most carbon offset schemes are probably garbage. A year ago I made a personal pledge that whenever I was prompted to pay extra to carbon offset something, I would decline, but then immediately donate the same amount or more to effective environmental funds (in my case, Effective Altruism Australia Environment). It's easy to remember and easy to do. Perhaps this simple pledge will be similarly sticky for other people :)