
Mikolaj Kniejski

172 karma

Comments
14

All the EA-committed dollars in the world are a tiny drop in the ocean of the world's problems and it takes really incredible talent to leverage those dollars in a way that would be more effective than adding to them.
 

 

This seems false to me. I agree that earning to give should be highly rewarded and so on, but I don't think that, for example, launching an effective giving organization requires an incredible amount of talent. There have been many launched recently, either by CE or local groups (I was part of the team that launched one in Denmark). Recently, EAIF said that they are not funding-constrained, and there are a lot of projects being funded on Manifund. It looks more like funders are looking for new projects to fund. So either most of the funders are wrong in their assessment and should just grant to existing opportunities, or there is still room for new projects.

If anything, my experience was that the bar for direct work is much lower than I expected, and part of the reason I had thought otherwise was comments like this.

The short version of the argument is that excessive praise for 'direct work' has caused a lot of people who fail to secure direct work to feel un-valued and bounce off EA.
 

 

Interesting! Is there any data that supports this? 

Re 2: I agree that this is a lot of work, but it's small relative to how much money goes into grants. Some of the predictions are also quite straightforward to resolve.

Well, glad to hear that they are using it. 

I believe that an alternative could be funding a general direction, e.g., funding everything in AIS, but I don't think that these approaches are exclusive.

Meta: I'm requesting feedback and gauging interest. I'm not a grantmaker.

You can use prediction markets to improve grantmaking. The assumption is that having accurate predictions about project outcomes benefits the grantmaking process.

Here’s how I imagine the protocol could work:

  1. Someone proposes an idea for a project.
  2. They apply for a grant and make specific, measurable predictions about the outcomes they aim to achieve.

Examples of grant proposals and predictions (taken from here):

  • Project: Funding a well-executed podcast featuring innovative thinking from a range of cause areas in effective altruism.
    • Prediction: The podcast will reach 10,000 unique listeners in its first 12 months and score an average rating of 4.5/5 across major platforms.
  • Project: Funding a very promising biology PhD student to attend a one-month program run by a prestigious US think tank.
    • Prediction: The student will publish two policy-relevant research briefs within 12 months of attending the program.
  • Project: A 12-month stipend and budget for an EA to develop programs increasing the positive impact of biomedical engineers and scientists.
    • Prediction: Three biomedical researchers involved in the program will identify or implement career changes aimed at improving global health outcomes.
  • Project: Stipends for 4 full-time-equivalent (FTE) employees and operational expenses for an independent research organization conducting EA cause prioritization research.
    • Prediction: Two new donors with a combined giving potential of $5M+ will use this organization’s recommendations to allocate funds.

A prediction market is created based on these proposed outcomes, conditional on the project receiving funding. Some of the potential grant money is staked as a subsidy to give people an incentive to trade.
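
To make the staking step concrete, here is a minimal sketch of one way it could work. It assumes a logarithmic market scoring rule (LMSR) market maker, which is my choice for illustration and not part of the proposal above. The class name `LMSRMarket` and the `subsidy` parameter are hypothetical. The sketch covers a single yes/no prediction and does not model the conditional part (refunding all trades if the project is not funded).

```python
import math


class LMSRMarket:
    """Binary (YES/NO) market run by an LMSR market maker.

    Illustrative assumption: the staked grant money is used as the
    market maker's subsidy, i.e. the most it can lose to traders.
    """

    def __init__(self, subsidy: float):
        # For a binary LMSR market the worst-case loss is b * ln(2),
        # so pick the liquidity parameter b to match the subsidy.
        self.b = subsidy / math.log(2)
        self.q = [0.0, 0.0]  # outstanding shares: [YES, NO]

    def _cost(self, q):
        # C(q) = b * ln(sum_i exp(q_i / b)), computed stably.
        m = max(x / self.b for x in q)
        return self.b * (m + math.log(sum(math.exp(x / self.b - m) for x in q)))

    def price_yes(self) -> float:
        # Current market probability that the prediction resolves YES.
        return 1.0 / (1.0 + math.exp((self.q[1] - self.q[0]) / self.b))

    def buy(self, outcome: int, shares: float) -> float:
        # Amount a trader pays for `shares` of outcome 0 (YES) or 1 (NO).
        # Each share pays out 1 if that outcome occurs.
        before = self._cost(self.q)
        self.q[outcome] += shares
        return self._cost(self.q) - before


# Example: stake 1,000 of the potential grant on one prediction.
market = LMSRMarket(subsidy=1000.0)
cost = market.buy(0, 500.0)  # a trader buys 500 YES shares
print(f"paid {cost:.2f}, P(YES) is now {market.price_yes():.3f}")
```

The larger the subsidy, the less a single trade moves the price, so a bigger stake means informed traders can profit more before the price catches up with their information.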

Obvious criticisms are:

  • Markets can be gamed, so the potential grantee shouldn't be allowed to bet.
  • Exploratory projects and research can't make predictions like this.
  • A lot of people need to participate in the market.

Thanks! I saw that post. It's an excellent approach. I'm planning to do something similar, but less time-consuming and more limited in scope. The range of theories of change pursued in AIS is limited and can be broken down into:

  • Evals
  • Field-building
  • Governance
  • Research

Evals can be measured by the quality and number of evals and their relevance to x-risks. It seems pretty straightforward to differentiate a bad eval org from a good one: a good one engages with major labs, has a lot of evals, and has a clear relation to existential risks.

Field-building—having a lot of participants who do awesome things after the project.

Research: I'd argue that the number of citations is also a good proxy for the impact of a paper. It's definitely easy to measure, and it tracks how much engagement a paper received, especially when no separate work was done to bring the paper to the attention of key decision makers.

I'm not sure how to think about governance.

Take this with a grain of salt. 


EDIT: Also, I think that engaging the broader ML community with AI safety is extremely valuable, and citations tell us whether an organization is good at that. Another thing worth reviewing is the transparency of organizations, how they estimate their own impact, and so on. This space is really unexplored, which seems crazy to me. The amount of money that goes into AI safety is gigantic, and it would be worth exploring what happens with it.

I’m working on a project to estimate the cost-effectiveness of AIS orgs, something like Animal Charity Evaluators does. This involves gathering data on metrics such as:

  • People impacted (e.g., scholars trained).
  • Research output (papers, citations).
  • Funding received and allocated.

While some organizations (e.g., MATS, AISC) share impact analyses, there’s no broad comparison. AI safety orgs operate on diverse theories of change, which makes standardized evaluation tricky, but I think rough estimates could help with prioritization.
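
As a rough sketch of what such an estimate could look like, the snippet below computes cost per scholar trained and cost per citation from the metrics listed above. The organization names and all numbers are made up for illustration, and `OrgRecord` and `cost_per_unit` are hypothetical helpers, not an existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrgRecord:
    name: str
    funding_usd: float      # funding received over the period
    scholars_trained: int   # people impacted
    citations: int          # research output proxy


def cost_per_unit(funding_usd: float, units: int) -> Optional[float]:
    """Dollars spent per unit of output, or None if there is no output."""
    if units <= 0:
        return None
    return funding_usd / units


# Hypothetical data, for illustration only.
orgs = [
    OrgRecord("Org A", funding_usd=1_000_000, scholars_trained=50, citations=200),
    OrgRecord("Org B", funding_usd=400_000, scholars_trained=0, citations=100),
]

for org in orgs:
    per_scholar = cost_per_unit(org.funding_usd, org.scholars_trained)
    per_citation = cost_per_unit(org.funding_usd, org.citations)
    print(org.name, "cost per scholar:", per_scholar, "cost per citation:", per_citation)
```

Returning `None` instead of zero or infinity keeps an org that pursues a different theory of change (here, a research org that trains nobody) from looking arbitrarily bad on a metric that does not apply to it.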

I’m looking for:

  1. Previous work
  2. Collaborators
  3. Feedback on the idea

If you have ideas for useful metrics or feedback on the approach, let me know!

I've always been impressed with Rethink Priorities' work, but this post is underwhelming.

As I understand it, the post argues that we can't treat LLMs as coherent persons. The author seems to think this idea is vaguely connected to the claim that LLMs are not experiencing pain when they say they do. I guess the reasoning goes something like this: If LLMs are not coherent personas, then we shouldn't interpret statements like "I feel pain" as genuine indicators that they actually feel pain, because such statements are more akin to role-playing than honest representations of their internal states.

I think this makes sense but the way it's argued for is not great.

1. The user is not interacting with a single dedicated system.

The argument here seems to be: If the user is not interacting with a single dedicated system, then the system shouldn't be treated as a coherent person.

This is clearly incorrect. Imagine we had the ability to simulate a brain. You could run the same brain simulation across multiple systems. A more hypothetical scenario: you take a group of frozen, identical humans, connect them to a realistic VR simulation, and ensure their experiences are perfectly synchronized. From the user’s perspective, interacting with this setup would feel indistinguishable from interacting with a single coherent person. Furthermore, if the system is subjected to suffering, the suffering would multiply with each instance the experience is replayed. This shows that coherence doesn't necessarily depend on being a "single" system.

2. An LLM model doesn't clearly distinguish the text it generates from the text the user inputs.

Firstly, this claim isn't accurate. If you provide an LLM with the transcript of a conversation, it can often identify which parts are its responses and which parts are user inputs. This is an empirically testable claim. Moreover, statements about how LLMs process text don't necessarily negate the possibility of them being coherent personas. For instance, it’s conceivable that an LLM could function exactly as described and still be a coherent persona. 

There is an interesting connection between those techniques and "Trapped priors" and the whole take on human cognition as Bayesian reasoning, with biases as strong priors. Why would those techniques work? (Assuming they work.)

I guess some, like "Try to speak truth", can make you consider a wide range of connected notions. E.g., you say something like "Climate change is fake" and you start to consider "what would make it true?" Or you just feel (because of your prior) that this is true and ignore any further considerations (in that case the technique doesn't work).

 

Do you have any arguments for why this would be more important than working on evals of deceptive AI or evals of cybersecurity capabilities? Asking in general; I'm trying to figure out how one should think about prioritizing things like that.
