Quick takes

For the record, I see the new field of "economics of transformative AI" as overrated.

Economics has some useful frames, but it also tilts people towards being too "normie" on the impacts of AI, and it doesn't have a very good track record on advanced AI so far.

I'd much rather see multidisciplinary programs/conferences/research projects, including economics as just one of the perspectives represented, than economics of transformative AI qua economics of transformative AI.

(I'd be more enthusiastic about building economics of transformative AI as a field if we w... (read more)


(Could you elaborate on ‘economics doesn’t have a very good track record on advanced AI so far’? I haven’t heard this before)

MichaelDickens
I think "economics of transformative AI" only matters in the narrow slice of worlds (maybe 20% of my probability?) where AI is powerful enough to transform the economy, but not powerful enough to kill everyone or to create a post-scarcity utopia. So I think you're right.
Chris Leong
It has some relevance to strategy as well, such as in terms of how fast we develop the tech and how broadly distributed we expect it to be; however, there's a limit to how much additional clarity we can expect to gain over a short time period.

On AI alarmists:

A fair-sized stream seems vast to one who until then
Has never seen a greater; so with trees, with men.
In every field each man regards as vast in size
The greatest objects that have come before his eyes 
(Lucretius)

I sometimes say, in a provocative/hyperbolic sense, that the concept of "neglectedness" has been a disaster for EA. I do think the concept is significantly over-used (ironically, it's not neglected!), and people should just look directly at the importance and tractability of a cause at current margins.

Maybe neglectedness is useful as a heuristic for scanning thousands of potential cause areas. But ultimately, it's just a heuristic for tractability: how many resources are going towards something is evidence about whether additional resources are likely to be i... (read more)
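To spell out the bookkeeping behind this claim (a hedged aside: the following is the standard ITN-style factorisation popularised by 80,000 Hours, not something from the original quick take), marginal impact decomposes as

$$
\frac{\text{good done}}{\text{extra resources}}
\;=\;
\underbrace{\frac{\text{good done}}{\%\text{ of problem solved}}}_{\text{importance}}
\times
\underbrace{\frac{\%\text{ of problem solved}}{\%\text{ increase in resources}}}_{\text{tractability}}
\times
\underbrace{\frac{\%\text{ increase in resources}}{\text{extra resources}}}_{\text{neglectedness}}
$$

where the neglectedness factor is proportional to $1/(\text{resources already invested})$. That is one way of cashing out the claim that neglectedness is "just a heuristic": it only matters through how much an extra unit of resources moves the problem at the current margin.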

Karthik Tadepalli
But neglectedness as a heuristic is very good precisely for narrowing down what you think the good opportunity is. Every neglected field is a subset of a non-neglected field. So pointing out that great grants have come in some subset of a non-neglected field doesn't tell us anything. To be specific, it's really important that EA identifies the area within that non-neglected field where resources aren't flowing, to minimize funging risk.

Imagine that AI safety polling had not been neglected and that in fact there were tons of think tanks who planned to do AI safety polling and tons of funders who wanted to make that happen. Then even though it would be important and tractable, EA funding would not be counterfactually impactful, because those hypothetical factors would lead to AI safety polling happening with or without us. So ignoring neglectedness would lead to us having low impact.
tlevin
I think the opposite might be true: when you apply it to broad areas, you're likely to mistake low neglectedness for a signal of low tractability, and you should just look at "are there good opportunities at current margins." When you start looking at individual solutions, it starts being quite relevant whether they have already been tried. (This point was already made here.)

That's interesting, but it seems to be addressing a somewhat separate claim from mine.

My claim was that broad heuristics are more often necessary and appropriate when engaged in abstract evaluation of broad cause areas, where you can't directly assess how promising concrete opportunities/interventions are, and less so when you can directly assess concrete interventions.

If I understand your claims correctly, they are that:

  • Neglectedness is more likely to be misleading when applied to broad cause areas
  • When considering individual solutions, it's useful to consi
... (read more)

For the tax nerds, cool event next week from the OECD:
Tax Inspectors Without Borders: A decade of niche assistance to developing countries
12 March 2025 | 13:45 - 14:45 CET

https://www.tiwb.org/resources/events/oecd-tax-and-development-days-2025-tiwb-a-decade-of-niche-assistance-to-developing-countries.htm 

There have been numerous scandals within the EA community about how working for top AGI labs might be harmful. So, when are we going to have this conversation: contributing in any way to the current US admin getting (especially exclusive) access to AGI might be (very) harmful?

[cross-posted from X and LessWrong]

If you've liked my writing in the past, I wanted to share that I've started a Substack: https://peterwildeford.substack.com/

Ever wanted a top forecaster to help you navigate the news? Want to know the latest in AI? I'm doing all that in my Substack -- forecast-driven analysis about AI, national security, innovation, and emerging technology!

Something that I personally would find super valuable is to see you work through a forecasting problem "live" (in text). Take an AI question that you would like to forecast, and then describe how you actually go about making that forecast. The information you seek out, how you analyze it, and especially how you make it quantitative. That would

  1. make the forecast process more transparent for someone who wanted to apply skepticism to your bottom line
  2. help me "compare notes", ie work through the same forecasting question that you pose, come to a conclusion, a
... (read more)

I wish more work on digital minds focused on answering the following questions, rather than merely investigating how plausible it is that digital minds similar to current-day AIs could be sentient:

  1. What do good sets of scenarios for post-AGI governance need to look like to create good futures / avoid terrible ones (or whatever normative focus we want), assuming digital minds are the dominant moral patients going into the future?
     1a) How does this differ depending on what sorts of things can be digital minds, e.g. whether sentient AIs are likely to

... (read more)
Bradford Saad
I'd also like to see more work on digital minds macrostrategy questions such as 1-3. To that end, I'll take this opportunity to mention that the Future Impact Group is accepting applications for projects on digital minds (among other topics) through EoD on March 8 for its part-time fellowship program. I'm set to be a project lead for the upcoming cohort and would welcome applications from people who'd want to work with me on a digital minds macrostrategy project. (I suggest some possible projects here but am open to others.)  I think the other project leads listed for AI sentience are all great and would highly recommend applying to work with any of them on a digital minds project (though I'm unsure if any of them are open to macrostrategy projects).
Ryan Greenblatt
I think work of the sort you're discussing isn't typically called digital minds work. I would just describe this as "trying to ensure better futures (from a scope-sensitive longtermist perspective) other than via avoiding AI takeover, human power grabs, or extinction (from some other source)". This just incidentally ends up being about digital entities/beings/value because that's where the vast majority of the value probably lives.

The way you phrase (1) seems to imply that you think large fractions of expected moral value (in the long run) will be in the minds of laborers (AIs we created to be useful) rather than things intentionally created to provide value/disvalue. I'm skeptical.

You're sort of right on the first point, and I've definitely counted that work in my views on the area. I generally prefer to refer to it as 'making sure the future goes well for non-humans' - but I've had that misinterpreted as just focused on animals. I

I think for me the fact that the minds will be non-human, and probably digital, matters a lot. Firstly, I think arguments for longtermism probably don't work if the future is mostly just humans. Secondly, the fact that these beings are digital minds, and maybe digital minds very different to us, means a lot... (read more)

Anyone else get a pig butchering scam attempt lately via DM on the forum?

I just got the following message 

> Happy day to you, I am [X] i saw your profile today and i like it very much,which makes me to write to you to let you know that i am interested in you,therefore i will like you to write me back so that i will tell you further about myself and send you also my picture for you to know me physically. 

[EMAIL]

I reported the user on their profile and opened a support request but just FYI


 

We've got 'em. Apologies to anyone else who got this message. 

Toby Tremlett🔹
Thanks for sharing, Seth. Would you mind DMing me their name? I'll ban the account, and mods will look into this.

(x-posted from LW)

Single examples almost never provide overwhelming evidence. They can provide strong evidence, but not overwhelming.

Imagine someone arguing the following:
 

1. You make a superficially compelling argument for invading Iraq

2. A similar argument, if you squint, can be used to support invading Vietnam

3. It was wrong to invade Vietnam

4. Therefore, your argument can be ignored, and it provides ~0 evidence for the invasion of Iraq.

In my opinion, 1-4 is not reasonable. I think it's just not a good line of reasoning. Regardless of whether you'... (read more)

1-4 is only unreasonable because you've written a strawman version of 4. Here is a version that makes total sense:

1. You make a superficially compelling argument for invading Iraq

2. A similar argument, if you squint, can be used to support invading Vietnam

3. This argument for invading Vietnam was wrong because it made mistakes X, Y, and Z

4. Your argument for invading Iraq also makes mistakes X, Y and Z

5. Therefore, your argument is also wrong. 

Steps 1-3 are not strictly necessary here, but they add supporting evidence to the claims. 

As far as I c... (read more)

So long and thanks for all the fish. 

I am deactivating my account.[1] My unfortunate best guess is that at this point there is little point and at least a bit of harm caused by me commenting more on the EA Forum. I am sad to leave behind so much that I have helped build and create, and even sadder to see my own actions indirectly contribute to much harm.

I think many people on the forum are great, and at many points in time this forum was one of the best places for thinking and talking and learning about many of the world's most important top... (read more)

Ben_West🔸
It feels appropriate that this post has a lot of hearts and simultaneously disagree reacts. We will miss you, even (perhaps especially) those of us who often disagreed with you.  I would love to reflect with you on the other side of the singularity. If we make it through alive, I think there's a decent chance that it will be in part thanks to your work.

After nearly 7 years, I intend to soon step down as Executive Director of CEEALAR, founded by me as the EA Hotel in 2018. I will remain a Trustee, but take more of a back seat role. This is in order to focus more of my efforts on slowing down/pausing/stopping AGI/ASI, which for some time now I've thought of as being the most important, neglected and urgent cause.

We are hiring for my replacement. Please apply if you think you'd be good in the role! Or send on to others you'd like to see in the role. I'm hoping that we find someone who is highly passionate ab... (read more)

I haven't visited CEEALAR and I don't know how impactful it has been, but one thing I've always admired about you via your work on this project is your grit and agency. When you thought it was a good idea back in 2018, you went ahead and bought the place. When you needed funding, you asked and wrote a lot about what was needed. You clearly care a lot about this project, and that really shows. I hope your successor will too.

I'm reminded of Lizka's Invisible Impact post. It's easy to spot flaws in projects that actually materialise but hard/impossible to crit... (read more)

Instead of "Goodharting", I like the potential names "Positive Alignment" and "Negative Alignment."

"Positive Alignment" means that the motivated party changes their actions in ways the incentive creator likes. "Negative Alignment" means the opposite.

Whenever there are incentives offered to certain people/agents, there are likely to be cases of both Positive Alignment and Negative Alignment. The net effect will likely be either positive or negative. 

"Goodharting" is fairly vague and typically just refers to just the "Negative Alignment" portion.&n... (read more)

I think the term "goodharting" is great. All you have to do is look up Goodhart's law to understand what is being talked about: the AI is optimising for the metric you evaluated it on, rather than the thing you actually want it to do.

Your suggestions would rob this term of its specific technical meaning, which makes things much vaguer and harder to talk about.
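To make the gloss above ("optimising for the metric you evaluated it on, rather than the thing you actually want") concrete, here is a minimal, self-contained toy sketch. It is my own illustration, not from either comment, and the distributions are arbitrary choices:

```python
# Toy illustration of Goodhart's law: an optimiser picks whichever option
# scores highest on a proxy metric, and under enough optimisation pressure
# the winner's score comes mostly from "gaming" the metric rather than from
# the thing we actually care about.

import random

random.seed(0)

def pick_best_by_proxy(n_options: int) -> tuple[float, float]:
    """Sample n_options candidates; return (true value, proxy score) of the
    candidate with the highest proxy score."""
    candidates = []
    for _ in range(n_options):
        true_value = random.gauss(0.0, 1.0)   # what we actually want
        gaming = random.expovariate(1.0)      # exploitable slack in the metric
        candidates.append((true_value, true_value + gaming))
    return max(candidates, key=lambda c: c[1])

for n in (10, 1_000, 100_000):
    true_value, proxy = pick_best_by_proxy(n)
    print(f"options={n:>7}  proxy={proxy:6.2f}  true value={true_value:6.2f}")

# In expectation, the winner's proxy score keeps climbing as optimisation
# pressure grows, while its true value stops improving: past a point,
# selecting harder on the metric mostly selects for the gap between the
# metric and the goal.
```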

I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI. (This assumes strong AI progress in the next 5-20 years)

  • AI auditors could track everything (starting with some key things) done for an experiment, then flag if there was significant evidence of deception / stats gaming / etc. For example, maybe a scientist has an AI screen-recording their screen whenever it's on, but able to preserve necessary privacy and throw out the irrelevant data.
  • AI auditors could review any experi
... (read more)
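As a very rough sketch of what the flagging half of such an auditor might look like, here is some illustrative Python. Everything in it (the `ExperimentLog` fields, the `ask_llm` helper, the specific checks) is a hypothetical placeholder of mine rather than a description of any existing tool:

```python
# Hedged sketch of the "AI auditor" idea above: a service that sees an
# experiment's event log and flags possible signs of stats gaming. The data
# model, the checks, and the ask_llm helper are hypothetical placeholders,
# not an existing tool or API.

from dataclasses import dataclass, field

@dataclass
class ExperimentLog:
    preregistered_outcome: str
    reported_outcome: str
    tests_run: list[str] = field(default_factory=list)
    tests_reported: list[str] = field(default_factory=list)
    analyst_notes: str = ""

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to whatever LLM backs the auditor.
    Returns a canned answer so the sketch runs end to end."""
    return "NO (placeholder response; plug in a real model call here)"

def audit(log: ExperimentLog) -> list[str]:
    flags = []
    # Cheap, transparent, deterministic checks first.
    if log.reported_outcome != log.preregistered_outcome:
        flags.append("primary outcome differs from preregistration")
    unreported = set(log.tests_run) - set(log.tests_reported)
    if len(unreported) > len(log.tests_reported):
        flags.append(f"{len(unreported)} analyses were run but not reported")
    # Fuzzier judgement calls get delegated to the model.
    verdict = ask_llm(
        "Do these analyst notes suggest outcome switching or selective "
        f"reporting? Answer YES/NO with a reason.\n\n{log.analyst_notes}"
    )
    if verdict.strip().upper().startswith("YES"):
        flags.append("model flagged the analyst notes: " + verdict)
    return flags

example = ExperimentLog(
    preregistered_outcome="6-month mortality",
    reported_outcome="hospital readmission",
    tests_run=["t-test A", "t-test B", "t-test C"],
    tests_reported=["t-test B"],
)
print(audit(example))
```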
Parker_Whitfill
Agreed with this. I'm very optimistic about AI solving a lot of incentive problems in science. I don't know if the end case (full audits) as you mention will happen, but I am very confident we will move in a better direction than where we are now.    I'm working on some software now that will help a bit in this direction! 
titotal
I don't mind you using LLMs for elucidating discussion, although I don't think asking it to rate arguments is very valuable. The additional detail of having subfield-specific auditors that are opt-in does lessen my objections significantly. Of course, the issue of what counts as a subfield is kinda thorny. It would make most sense for, as Claude suggests, journals to have an "auditor verified" badge, but then maybe you're giving too much power over content to the journals, which usually stick to accept/reject decisions (and even that can get quite political).

Coming back to your original statement, ultimately I just don't buy that any of this can lead to "incredibly low rates of fraud/bias". If someone wants to commit fraud or introduce bias, they will just game the tools, or submit to journals with weak/nonexistent auditors. Perhaps the black-box nature of AI might even make it easier to hide this kind of thing.

Next: there are large areas of science where a tool telling you the best techniques to use will never be particularly useful. On the one hand there is research like mine, where it's so frontier that the "best practices" to put into such an auditor don't exist yet. On the other, you have statistics stuff that is so well known that there already exist software tools that implement the best practices: you just have to load up a well-documented R package. What does an AI auditor add to this?

If I was tasked with reducing bias and fraud, I would mainly push for data transparency requirements in journal publications, and for beefing up the incentives for careful peer review, which is currently unpaid and unrewarding labour. Perhaps AI tools could be useful in parts of that process, but I don't see it as anywhere near as important as those other two things.

This context is useful, thanks.

Looking back, I think this part of my first comment was poorly worded:
> I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI.

I meant 
> I imagine that scientists will [soon have the ability to] [be unusually transparent and provide incredibly low rates of fraud/bias], using AI.

So it's not that this will lead to low rates of fraud/bias, but that AI will help enable that for scientists willing to go along with it - but at the same time... (read more)

If antinatal advocacy were effective, wouldn't it make sense to pursue it on animal welfare grounds? Aren't most new humans extremely net negative?

I have a 3YO so hold fire!

  • Most new humans will likely consume hundreds (thousands?) of factory farmed animals over their lifetime, creating a substantial negative impact that might outweigh the positive contributions of that human life
  • Probably of far less consequence, the environmental footprint of each new human also indirectly harms wild animals through habitat destruction, pollution, and climate change (TBH I am being very speculative on this point).
David Mathers🔸
Some people are going to say that destroying nature is a positive impact of new humans, because they think wild animals have net negative lives. 

ooft, good point. 

As AI improves, there's a window for people to get involved and make changes regarding AI alignment and policy.

The window arguably starts small, then widens as it becomes clearer what to do.

But at some point, as it gets too close to TAI, I expect that the window narrows. The key decisions get made by a smaller and smaller group of people, and these people have less ability to get help from others, given the quickening pace of things.

For example, at T minus 1 month, there might ultimately be a group of 10 people with key decision-making authority on the most power... (read more)

Peter
Hmm maybe it could still be good to try things in case timelines are a bit longer or an unexpected opportunity arises? For example, what if you thought it was 2 years but actually 3-5?

I wasn't trying to make the argument that it would definitely be clear when this window closes. I'm very unsure of this. I also expect that different people have different beliefs, and that it makes sense for them to then take corresponding actions. 

Mini EA Forum Update

We've updated the user menu in the site header! 🎉 I'm really excited, since I think it looks way better and is much easier to use.

We've pulled out all the "New ___" items to a submenu, except for "New question" which you can still do from the "New post" page (it's still a tab there, as is linkpost). And you can see your quick takes via your profile page. See more discussion in the relevant PR.

Let us know what you think! 😊

Bonus: we've also added Bluesky to the list of profile links, feel free to add yours!

I love having the profile button at the top; I currently find not having that a bit disorientating

If we could have LLM agents that could inspect other software applications (including LLM agents) and make strong claims about them, that could open up a bunch of neat possibilities.

  • There could be assurances that apps won't share/store information.
  • There could be assurances that apps won't be controlled by any actor.
  • There could be assurances that apps can't be changed in certain ways (eventually).

I assume that all of this should provide most of the benefits people ascribe to blockchains, but without the costs of being on a blockchain.

Some neat opt... (read more)
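Purely for illustration, here is roughly what one of those machine-readable assurances could look like as a data structure. Every name and field below is a placeholder I've made up (the post doesn't propose a format), and the "signature" is a toy stand-in for a real public-key scheme:

```python
# Hedged sketch of what a machine-readable "assurance" from an inspecting
# LLM agent could look like. All names here (Attestation, the claim strings,
# the toy signing scheme) are placeholders made up for illustration; the
# post doesn't specify a format.

import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class Attestation:
    subject_app: str          # identifier for the audited application
    subject_code_hash: str    # hash of the exact code that was inspected
    claim: str                # the assurance being made about the app
    evidence_summary: str     # the inspecting agent's reasoning, in brief
    auditor_id: str
    auditor_signature: str    # toy stand-in; a real scheme would use public-key signatures

def sign(payload: dict, auditor_secret: str) -> str:
    """Toy 'signature': hash of the payload plus a shared secret."""
    blob = json.dumps(payload, sort_keys=True) + auditor_secret
    return hashlib.sha256(blob.encode()).hexdigest()

payload = {
    "subject_app": "example-notes-app",
    "subject_code_hash": hashlib.sha256(b"<app source tree>").hexdigest(),
    "claim": "does not transmit note contents to third parties",
    "evidence_summary": "static inspection found no network calls outside the sync API",
    "auditor_id": "auditor-agent-v0",
}
attestation = Attestation(**payload, auditor_signature=sign(payload, "demo-secret"))
print(json.dumps(asdict(attestation), indent=2))
```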

Reading the Emergent Misalignment paper and comments on the associated Twitter thread has helped me clarify the distinction[1] between what companies call "aligned" vs "jailbroken" models. 

"Aligned" in the sense that AI companies like DeepMind, Anthropic and OpenAI mean it = aligned to the purposes of the AI company that made the model. Or as Eliezer puts it, "corporate alignment." For example, a user may want the model to help edit racist text or the press release of an asteroid impact startup but this may go against the desired morals and/or co... (read more)

titotal
Thank you for laying that out, that is elucidatory. And behind all this I guess is the belief that if we don't succeed in "technical alignment", the default is that the AI will be "aligned" to an alien goal, the pursuit of which will involve humanity's disempowerment or destruction? If this was the belief, I could see why you would find technical alignment superior.

I, personally, don't buy that this will be the default: I think the default will be some shitty approximation of the goals of the corporation that made it, localised mostly to the scenarios it was trained in. From the point of view of someone like me, technical alignment actually sounds dangerous to pursue: it would allow someone to imbue an AI with world domination plans and potentially actually succeed.
Jonas Hallgren
FWIW, I find that if you analyze places where we've successfully aligned things in the past (social systems or biology etc.), you find that the 1st and 2nd types of alignment really don't break down in that way.

After doing Agent Foundations for a while, I'm just really against the alignment frame, and I'm personally hoping that more research in this direction will happen so that we get more evidence that other types of solutions are needed (e.g. alignment of complex systems such as has happened in biology and social systems in the past).

That sounds like [Cooperative AI](https://www.cooperativeai.com/post/new-report-multi-agent-risks-from-advanced-ai) 


In addition to wreaking havoc with USAID, the rule of law, whatever little had been started in Washington about AI safety, etc., the US government has, as you all know, decided to go after trans people. I'm neither trans nor an American, but I think it's really not nice of them to do that, and I'd like to do something about it, if I can.

To some extent, of course, it's the inner deontologist within me speaking here: trans people are relatively few, arguably in less immediate danger than African children dying of AIDS, and the main reason why I feel an urge ... (read more)

Thinking about the idea of an "Evaluation Consent Policy" for charitable projects. 

For example, for a certain charitable project I produce, I'd explicitly consent to allow anyone online, including friends and enemies, to candidly review it to their heart's content. They're free to use methods like LLMs to do this.

Such a policy can give limited consent. For example:

  • You can't break laws when doing this evaluation
  • You can't lie/cheat/steal to get information for this evaluation
  • Consent is only provided for under 3 years
  • Consent is only provided starting in
... (read more)
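As a minimal sketch (mine, assuming such a policy would be published as a short structured document alongside the project), the consented scope above might be written down like this:

```python
# Illustrative only: one way the "Evaluation Consent Policy" sketched above
# could be written down as a machine-readable record. Field names and values
# are placeholders, not the author's wording.

evaluation_consent_policy = {
    "project": "<project name>",
    "consent": {
        "who_may_evaluate": "anyone, including critics",
        "permitted_methods": ["manual review", "LLM-assisted review"],
        "conditions": [
            "no laws may be broken in conducting the evaluation",
            "no lying/cheating/stealing to obtain information",
        ],
        "valid_for_years": 3,          # "consent is only provided for under 3 years"
        "valid_from": "<start date>",  # the start date is truncated in the original post
    },
}

# A record like this could be published alongside the project so evaluators
# know in advance what they do and don't have permission to do.
```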