
I'm posting this in preparation for Draft Amnesty Week (February 24 – March 2), but please also use this thread for posts you don't plan to write for Draft Amnesty. The last time I posted this question, there were some great responses.

If you have multiple ideas, I'd recommend putting them in different answers, so that people can respond to them separately.

It would be great to see:

  • Both spur-of-the-moment vague ideas and further-along, considered ideas. If you're in the latter camp, you could even share a Google Doc for feedback on an outline.
  • Commenters signalling with Reactions and upvotes the content that they'd like to see written.
  • Commenters responding with helpful resources or suggestions.
  • Commenters proposing Dialogues with authors who suggest similar ideas, or with whom they have an interesting disagreement (Draft Amnesty Week might be a great time for scrappy, unedited dialogues).

Draft Amnesty Week

If the responses here encourage you to develop one of your ideas, Draft Amnesty Week (February 24 – March 2) might be a great time to post it. Posts tagged "Draft Amnesty Week" don't have to be thoroughly thought through or even fully drafted. Bullet points and missing sections are allowed. You can have a lower bar for posting.


17 Answers

A history of ITRI, Taiwan's national electronics R&D institute. It was established in 1973, when Taiwan's income was lower than Pakistan's is today. Yet it was single-handedly responsible for the rise of Taiwan's electronics industry, spinning out UMC, MediaTek, and most notably TSMC. To give you a sense of how insane this is, imagine that Bangladesh announced today that it was going to start doing frontier AI R&D, and by 2045 it was the leader in AI. ITRI is arguably the most successful development initiative in history, yet I've never seen it brought up in either the metascience/progress community or the global development community.

I'm considering writing about "RCTs in NGOs: When (and when not) to implement them"

The post would explore:

  • Why many new NGOs feel pressured to conduct RCTs primarily due to funder / EA community requirements.
  • The hidden costs and limitations of RCTs: high expenses, the standard 80% statistical power (which leaves a 20% chance of missing a true effect of the assumed size), and wide confidence intervals
  • Why RCTs might not be the best tool for early-stage organizations focused on iterative learning
  • How academic incentives in RCT design/implementation don't always align with NGO needs
  • Alternative evidence-gathering approaches that might be more appropriate for different organizational stages
  • Suggestions for both funders and NGOs on how to think about evidence generation

This comes from my conversations with several NGO founders. I believe the EA community could benefit from a more nuanced discussion about evidence hierarchies and when different types of evaluation make sense.

This sounds like it could be interesting, though I'd also consider whether some of the points are fundamentally about RCTs per se. E.g., "80% statistical power meaning 20% chance of missing real effects" - nothing inherently says an RCT should only be powered at 80%, or that the approach should even be one of null hypothesis significance testing.

Fernando Irarrázaval 🔸
Good point. Good to clarify that the 80% power standard comes from academic norms, not an inherent RCT requirement. NGOs should choose their statistical thresholds based on their specific needs, budget, and risk tolerance.
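As a rough illustration of that tradeoff, here is a minimal sketch using statsmodels; the effect size (Cohen's d = 0.2), alpha, and two-arm t-test setup are hypothetical defaults, not numbers from any particular NGO study:

```python
# Minimal sketch of the power / sample-size tradeoff discussed above.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

for power in (0.80, 0.90, 0.95):
    # Solve for the per-arm sample size needed at this power level.
    n = analysis.solve_power(effect_size=0.2, alpha=0.05, power=power)
    print(f"power={power:.2f} -> ~{n:.0f} subjects per arm "
          f"({1 - power:.0%} chance of missing a true effect of this size)")
```

Pushing power above the academic default gets expensive fast, which is exactly the budget/risk-tolerance tradeoff each NGO has to make for itself.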

I would love to see this. Not a take I've seen before (that I remember). 

I would welcome a blog post about RCTs, and if you decide to write one, I hope you consider the perspective below.

As far as I can tell, ~0% of nonprofits are interested in rigorously studying their programs in any way, RCTs or otherwise, and I can't help but suspect that this is largely because, when we do run RCTs, we mostly find that these cherished programs have ~no effect. It's not at all surprising to me that most charities that conduct RCTs feel pressured to do so by donors; but on the other hand basically all charity activities ultimately flow from don... (read more)

Fernando Irarrázaval 🔸
This is a great point. There's an important distinction, though, between evaluating new programs led by early-stage NGOs (like those coming from Charity Entrepreneurship) versus established programs directing millions in funding. I think RCTs make sense for the latter group. There's also a difference between typical NGOs and EA-founded ones. In my experience, EA founders actively want to rigorously evaluate their programs; they don't want to work on ineffective interventions.

Would also love this. I think a useful contrast would be A/B testing in big tech firms. My amateur understanding is that big tech firms can and should run hundreds of "RCTs" because:

  • No need to acquire subjects.
  • Minimal disruption to business since you only need to siphon off a minuscule portion of your user base.
  • Tech experiments can finish in days while field experiments need at least a few weeks and sometimes years.
  • If we assume treatments are heavy-tailed, then a big tech firm running hundreds of A/B tests is more likely to learn of a weird trick that grows the business than an NGO that may only get one shot (see the sketch below).
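Here is a toy simulation of that last point; the distribution and its parameters are purely illustrative assumptions:

```python
# Toy simulation: if treatment effects are heavy-tailed, running many cheap
# experiments is disproportionately likely to surface an outlier win.
import numpy as np

rng = np.random.default_rng(0)

def mean_best_effect(n_experiments: int, n_trials: int = 10_000) -> float:
    # Effects drawn from a heavy-tailed Pareto II (Lomax) distribution:
    # most draws are small, but the right tail is fat.
    effects = rng.pareto(a=2.0, size=(n_trials, n_experiments))
    return effects.max(axis=1).mean()  # average best discovery per trial

print("1 experiment (one-shot NGO):", round(mean_best_effect(1), 2))
print("200 experiments (tech firm):", round(mean_best_effect(200), 2))
```

The best of 200 draws from a fat-tailed distribution is far better than a typical single draw, which is the whole case for cheap, high-volume experimentation.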
Fernando Irarrázaval 🔸
Yes, exactly. The marginal cost of an A/B test in tech is incredibly low, while for NGOs an RCT represents a significant portion of their budget and operational capacity. This difference in costs explains why tech can use A/B tests for iterative learning, trying hundreds of small variations, while NGOs need to be much more selective about what to test. And despite A/B testing being nearly free, most decisions at big tech firms aren't driven by experimental evidence.

How people who write on the EA Forum and on LessWrong can have non-obvious significant positive impact by influencing organizations (like mine) - both through culture and the merit of their reasoning.

Personally I'd be so keen on seeing that - it's part of the pitch that I make to authors. 

I have a hastily written draft from a while back called "Cause neutrality doesn't mean all EA causes matter equally". It's a corrective to people sometimes using "cause neutrality" as a justification for not doing cause prioritisation, or for treating current EA cause areas as canonical or as equally deserving of funding and effort. I didn't finish it because I ran out of steam and was concerned I might be making up a guy to get mad at.
I'll consider posting it for Draft Amnesty, especially if anyone is interested in seeing this take written up.

Very much in favor of posts clarifying that cause neutrality doesn't require value neutrality or deference to others' values.

How to interpret the EA Survey and Open Phil EA/LT Survey.

I think these surveys are complementary and each has different strengths and weaknesses relevant for different purposes.[1] However, what the strengths and weaknesses are, and how to interpret the surveys in light of them, is not immediately obvious. And I know that in at least some cases, decision-makers have had straightforwardly mistaken factual beliefs about the surveys, which misled them about how to interpret them. This is a problem if people mistakenly rely on the results of only one of the surveys, or assign the wrong weights to each survey, when answering different questions.

A post about this would outline the key strengths and weaknesses of the different surveys for different purposes, touching on questions such as:

  • How much our confidence should change when we have a small sample size from a small population (see the sketch after this list).
  • How concerned we should be about biases in the samples for each survey and what population we should be targeting.
  • How much the different questions in each survey allow us to check and verify the answers within each survey.
  • How much the results of each survey can be verified and cross-referenced with each other (e.g. by identifying specific highly engaged LTists within the EAS).
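On that first bullet, one standard tool is the finite population correction: when you sample a large share of a small population, the usual standard error overstates your uncertainty. A minimal sketch, with hypothetical p, n, and N values rather than the actual survey numbers:

```python
# Finite population correction (FPC) for the standard error of a proportion.
import math

def se_proportion(p: float, n: int, N: int | None = None) -> float:
    se = math.sqrt(p * (1 - p) / n)      # usual SE for a sample proportion
    if N is not None:                    # apply the FPC for a population of N
        se *= math.sqrt((N - n) / (N - 1))
    return se

p, n = 0.5, 200
print("SE, effectively infinite population:", round(se_proportion(p, n), 4))
print("SE, population of only 500:", round(se_proportion(p, n, N=500), 4))
```

Sampling 200 people from a population of only 500 buys noticeably more precision than the naive formula suggests, which matters when judging a "small" sample from a small community.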

 

  1. ^

    Reassuringly, they also seem to generate very similar results, when we directly compare them, adjusting for differences in composition, i.e. only looking at highly engaged longtermists within the EA Survey.

Nice. I'd find this super interesting!

"EA for hippies" - I managed to explain effective altruism to a group of rich hippies that were in the process of starting up a foundation, getting them on-board with donating some of the revenue to global health charities. 

The post would detail how I explained EA to people who are far from the standard target audience.

I would very much like to see something like this. Being able to communicate EA ideas to people who are roughly aligned in terms of many altruistic values is useful.

I'm thinking of writing a longer, more nuanced collaborative piece discussing global vs local EA community building, which I touched on in a previous post.

Some things you might want to do if you are making a weighted factor model

Weighted factor models are commonly used within EA (e.g. by Charity Entrepreneurship/AIM and 80,000 Hours). Even the formalised Scale, Solvability, Neglectedness framework can, itself, be considered a form of weighted factor model.

However, despite their wide use, weighted factor models often neglect important methodological techniques that could test and improve their robustness, and this omission may threaten their validity and usefulness. RP's Surveys and Data Analysis team previously consulted for a project that was using a WFM, and used these techniques to help them understand things that were confusing them about the behaviour of their model, but we've never had time to write up a detailed post about these methods. Such a post would discuss topics such as:

  • Problems with ordinal measures
  • When (not) to rank scores
  • When and how (not) to normalise (illustrated in the sketch after this list)
  • How to make interpretable rating scales
  • Identifying the factors that drive your outcomes
  • Quantifying and interpreting disagreement / uncertainty
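To give a flavour of the normalisation point, here is a minimal sketch showing that the choice of normalisation alone can reorder options; the options, factor scores, and weights are hypothetical and deliberately constructed so the two normalisations disagree:

```python
# The same raw scores and weights produce different rankings under
# min-max scaling versus z-scoring, because the two rescale columns
# by different amounts (range vs standard deviation).
import numpy as np

options = ["Option A", "Option B", "Option C"]
raw = np.array([[10.0, 0.0],    # rows: options; columns: two factors
                [0.0, 10.0],
                [5.0, 9.5]])
weights = np.array([0.48, 0.52])

def ranking(scores: np.ndarray) -> list[str]:
    return [options[i] for i in np.argsort(-scores)]

minmax = (raw - raw.min(axis=0)) / (raw.max(axis=0) - raw.min(axis=0))
zscore = (raw - raw.mean(axis=0)) / raw.std(axis=0)

print("min-max ranking:", ranking(minmax @ weights))  # C, B, A
print("z-score ranking:", ranking(zscore @ weights))  # C, A, B
```

If a modelling choice this mundane can flip second and third place, it's worth testing your WFM's conclusions under several defensible specifications.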

At some point I'd love to post something on 'How to think about impact as a journalist'. I've accumulated a few notes and sources on the subject, and it's a question I often come back to, since it concerns me directly. But it's a big one and I haven't yet decided how to tackle it :)

Might be a nice candidate for a bullet-point outline draft amnesty post (like this one)? There's no rule that you can't republish it as a full post later on, and perhaps you could get some feedback/ideas from the comments on a draft amnesty post...

I'm going to post about a great paper I read about the National Woman's Party and 20th-century feminism that I think has relevance to the EA community :)

I’d like to write: 

A post about making difficult career decisions, with examples of how I made my own decisions, some tools I used to make them, and how they worked out. I have it roughly written but would definitely need feedback from you, Toby, before I post :))

A post about mental health: why I'm focusing on it this year, why I think more people in EA should focus on it, what exactly I'm doing, what's working, etc. I haven't written it yet, but a lot of people are asking about it, so I do think there is potential value.

Sounds great, and always happy to give feedback :)

My previous attempt at predicting what I was going to write got 1/4, which ain't great.

This is partly the planning fallacy, partly real life being a lot busier than expected (with Forum writing one of the first things to drop), and partly an increasing feeling of gloom and disillusionment with EA, which means I don't have the same motivation to write or contribute to the Forum as I did previously.

For the things I am still thinking of writing, I'll add separate comments below this post, so votes and comments can be attributed to each idea individually.

I do want to write something along the lines of "Alignment is a Political Philosophy Problem"

My takes on AI, and the problem of x-risk, have been in flux over the last 1.5 years, but they do seem to be more and more focused on the idea of power and politics, as opposed to finding a mythical 'correct' utility function for a hypothesised superintelligence. Making TAI/AGI/ASI go well therefore falls in the reference class of 'principal agent problem'/'public choice theory'/'social contract theory' rather than 'timeless decision theory/coherent extrapolated volitio... (read more)

I don't think anyone wants or needs another "Why I'm leaving EA" post but I suppose if people really wanted to hear it I could write it up. I'm not sure I have anything new or super insightful to share on the topic.

I have some initial data on the popularity and public/elite perception of EA that I wanted to write into a full post, something along the lines of "What is EA's reputation, 2.5 years after FTX?". I might combine my old idea of a Forum data analytics update into this one to save time.

My initial data/investigation into this question ended up being a lot more negative than other surveys of EA. The main takeaways are:

  • Declining use of the Forum, both in total and amongst influential EAs
  • EA has a very poor reputation in the public intellectual sphere, especially on Twi
... (read more)

I would write about how there's a collective action problem around reading EA Forum posts. People want to read interesting, informative, and impactful posts, and karma is a signifier of this. So people will often not read a post, especially on a topic they are not familiar with, unless it has already reached some karma threshold. Given how time-sensitive front-page visibility is without karma accumulation, and how unlikely relatively low-karma posts are to be read once off the front page, good posts can be entirely ignored. On the other hand, some early traction can result in OK posts getting very high karma, because a higher volume of people have been motivated to check the post out.
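A toy model of this feedback loop, with all parameters made up: even posts of identical quality end up with very different karma when the chance of being read grows with the karma already accumulated.

```python
# Identical-quality posts diverge in final karma under a simple
# karma-drives-visibility-drives-reads dynamic.
import random

random.seed(0)

def final_karma(quality: float = 0.7, hours: int = 48) -> int:
    karma = 0
    for _ in range(hours):
        reads = 1 + karma // 5   # visibility (and hence reads) grow with karma
        karma += sum(random.random() < quality * 0.2 for _ in range(reads))
    return karma

results = sorted(final_karma() for _ in range(1_000))
print("10th / 50th / 90th percentile karma for identical posts:",
      results[100], results[500], results[900])
```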

 

I think this could be partially addressed by having volunteers, or even paid readers, commit to reading posts within a certain time frame and upvoting (or not, or downvoting) as appropriate. It might be a better use of funds than myriad cosmetic changes.

Below is a post I wrote that I think might be such a case: good (or at least worthy of discussion), but one where people probably wanted to free-ride on others' early evaluation. It discusses how jobs in which the performance metrics actually used are orthogonal to many of the ways good can be done may be opportunities for significant impact.

 

https://forum.effectivealtruism.org/posts/78pevHteaRxekaRGk/orthogonal-impact-finding-high-leverage-good-in-unlikely

Reputation Hardening

This is prompted largely by the fall in EA credibility in recent years, and by my dissatisfaction with GiveWell's lack of independent verification of the charities it recommends.

Here is a lightly edited, AI-generated slop version:

Reputation Hardening: Should GiveWell Verify Charity Data Independently?

"Reputation hardening" involves creating more resilient reputations.

Recent events have shown how reputation damage to one EA entity can affect the entire movement's credibility and therefore funding and influence. While GiveWell's evaluation process is thorough, it largely relies on charity-provided data. I propose they consider implementing independent verification methods.

Applying to GiveWell/GHD

These measures could help detect potential issues early and strengthen confidence in effectiveness estimates.

This is a preliminary idea to start discussion. What other verification methods or implementation challenges should we consider?

My ideas for Draft Amnesty Week are posted as replies to this message so they can be voted on separately:

Cosmological Fine-Tuning Considered:

The title’s kind of self-explanatory – over time I’ve noticed the cosmological fine-tuning argument for the existence of god become something like the most favored argument, and learning more about it over time has also led me to consider it more formidable than I used to think.

I’m ultimately not convinced, but I do consider it an update, and it makes for a good excuse for me to talk more about my views on things like anthropic arguments, outcome pumps, the metaphysics of multiverses, and interesting philosophical consi... (read more)

Topic from last round:

Okay, so, this is kind of a catch-all. Of the possible post ideas I commented last year, I never posted or wrote “Against National Special Obligation”, “The Case for Pluralist Evaluation”, or “Existentialist Currents in Pawn Hearts”. So, this is just the comment for “one of those”.

Observations on Alcoholism Appendix G:

This would be another addition to my Sequence on Alcoholism – I’ve been thinking in particular of writing a post listing out ideas about coping strategies/things to visualize to help with sobriety. I mention several in earlier appendices in the sequence – things like leaning into your laziness or naming and yelling at your addiction – but I don’t have a neat collection of advice like this, which seems like one of the more useful things I could put together on this subject.

Mid-Realist Ethics:

I occasionally bring up my meta-ethical views in blog posts, but I keep saying I’ll write a more dedicated post on the topic and never really do. A high level summary includes stuff like: “ethics” as I mean it has a ton of features that “real” stuff has, but it lacks the crucial bit which is actually being a real thing. The ways around this tend to fall into one of two major traps – either making a specific unlikely empirical prediction about the view, or labeling a specific procedure “ethics” in a way that has no satisfying difference f... (read more)

Moral problems for environmental restoration:

A post idea I’ve been playing with recently is converting part of my practicum write-up into a blog post about the ethics of environmental restoration projects. My practicum was with the “Billion Oyster Project”, which seeks to use oyster repopulation for geoengineering/ecosystem restoration. I spent a big chunk of my write-up worrying about the environmental ethics of this, and I’ve been thinking that worrying could be turned into a decent blog post.

I’ll discuss welfare biology briefly, but lots of it will s... (read more)

A BOTEC of base rates of moderate-to-severe narcissistic traits (i.e., clinical-level but not necessarily diagnosed) in founders, and their estimated costs to the ecosystem. My initial research suggests unusually high concentrations in AI safety relative to other cause areas and the general population.
