Crossposting notes
This funding memo was written by Nuño Sempere and Rai Sur, the cofounders of Sentinel; I (Saul) am crossposting to the Forum with their permission. I've made some minor changes in formatting to match the Forum, but all of the content is the same.
Note that I crossposted the memo on November 5, 2024, so by the time you're reading this, Sentinel's funding situation might've changed. See here for an archived version of their funding memo at the time of my crossposting, and here for an evergreen version.
Summary
Leadership
Nuño Sempere – Cofounder & Head of Foresight
Founder of Samotsvety, a forecasting team about which one of the widest-read rationalist bloggers, Scott Alexander, says:
‘Enter Samotsvety Forecasts. This is a team of some of the best superforecasters in the world. They won the CSET-Foretell forecasting competition by an absolutely obscene margin, “around twice as good as the next-best team in terms of the relative Brier score”. If the point of forecasting tournaments is to figure out who you can trust, the science has spoken, and the answer is “these guys”.’
More praise for Samotsvety here.
Rai Sur – Cofounder & Head of Emergency Response
Previously:
Strategy
Sentinel safeguards against cascading black swans and institutional failure during pivotal moments of history.
1. Many current initiatives to reduce existential risk are based on models of the world with long feedback loops between them and reality. These long feedback loops mean that the risk of model error could be significant, and that most of our opportunity to mitigate large-scale and existential risk may come around crisis time, when we can update our worldview with highly relevant and proximate details about the crisis.
2. Even well-resourced state actors like the US government have displayed incompetence in some areas during crises, such that there is room for private citizens to make a meaningful impact, permissionlessly. An inspiring example of this is VaccinateCA, who were able to save thousands of lives by disseminating vaccine-availability information more effectively than the US government.
Sentinel is composed of a foresight team to address (1) and an emergency response team to address (2).
The Foresight team’s goal is to monitor current events, update forecasts of these events leading to large-scale and existential catastrophes, alert the emergency response team when they deem the risk high enough, and support the emergency response team with ongoing forecasts if necessary.
The goal of the Emergency Response team is to ensure that advance warning from the Foresight team translates into effective action as fast as possible. It is a reserve team with diverse skill sets that, when activated, searches for the highest impact actions. We expect that in many cases that will look like some combination of raising the awareness of live players; deploying websites, APIs, cybersecurity countermeasures; gathering data in-person and online; implementing emergency protocols made by Sentinel or other organizations such as ALLFED; and any other tasks that are specific to the emergency at hand.
Advisors
- Founding Partner at Mythos Ventures
- Previously Co-lead, AGI Deployment Strategy Research at DeepMind
- Senior Biosecurity consultant
- Former VP Data Science at BlueDot.global, INFER Pro All-Star Forecaster
Progress-to-Date
- $10K (✓ via Manifund, March 2024)
- Proof of Concept; initial recruitment of ~five reserve team members, initial forecasting team.
- $85K (✓ via private donor and Survival and Flourishing Fund, July 2024)
- Foresight Team (4 people) and Emergency Response Team (10 people) further fleshed out. The forecasting team has been producing a steady weekly cadence of reports, and has LLM tools to analyze around 2M pieces of news per week.
- Vishal Maini encourages us to think bigger and address the many bottlenecks in setting up a more ambitious version of Sentinel.
Funding Ask
Since we are already operational, grants of less than the full Base amount (such as from smaller, individual donors) are welcome as they still increase our runway. These are annual budgets, and intended for funders who might fund us fully. If you are an individual, donations in the $10k range are still very useful as they’d allow us to expand faster now.
Funding Scenario | Base | Mid | High |
---|---|---|---|
Cofounder Salaries | 130K | 150K | 180K |
Foresight | 50K | 110K | 150K |
Emergency Response | 50K | 80K | 115K |
Ops | 36K | 54K | 100K |
Marketing | 0 | 25K | 35K |
Engineering | 0 | 0 | 85K |
Slack/Taxes/Overhead | 1.3x | 1.3x | 1.3x |
Total | 345.8K | 544.7K | 851.5K |
Base Scenario
- Foresight
  - 5 team members
  - LLM-enabled global news analysis
  - More news source integrations
  - Official APIs instead of brittle web-scraping
  - Published weekly reports detailing the greatest current risks
- Emergency Response
  - Recruit 20 reserve members to cover GCR areas
    - Bio, medical, machine learning, policy, international relations, meteorology/disaster response, nuclear/chemistry/weapons expertise, etc.
  - 3 real-life intervention reports on outcomes, learnings (at least small-scale if no larger-scale emergencies occur)
  - High-stakes decision-making trainings
Mid Scenario - everything in Base plus:
- Foresight
  - 1 additional team member
  - Risk reporting upgrades such as graphics and podcasts
- Emergency Response
  - 10 additional reserve members
  - 1 engineering deliverable such as a semantic-searchable social-network graph of live players and their motivations
High Scenario - everything in Mid plus:
- Foresight
  - 1 additional team member
  - Using engineering and forecasting resources to do novel OSINT synthesis on global catastrophic risk precursors
- Emergency Response
  - 10 additional reserve members
Manifund Profile
Black Swans, Brittle Institutions, and Situational Awareness
The confluence of the following makes for a clear opportunity to invest more heavily in shorter-term situational awareness and fast, effective response for catastrophic/existential events.
- There are serious risks to humanity’s continued survival in the near future. – Large-scale catastrophe and existential risks in the coming years and decades are concerns to many sane people.
- Black Swans cause outlier impacts. – Events with outlier impacts are often caused by relatively infrequent and difficult-to-hypothesize events. This makes them difficult to plan for and mitigate in advance.
- Existing institutions, including state actors, can be slow and ineffective during times of crisis. – Large institutions and cultures consistently fail to adapt their processes and cultures to new social and technological environments.
- Private citizens can win. – We have already seen states be outcompeted by small groups of private citizens during times of crisis.
- Counterfactual response-time reductions measured in days or hours can save many lives in expectation.
- Institutions involved in planning for specific scenarios are, in general, limited by the Overton window of non-crisis times. This Overton window shifts substantially, but too slowly, during times of crisis.
- The people working on mitigating large-scale risks today may see their efforts wasted. – Most efforts are focused on reducing risks on which progress can be conceptualized and made legible today, such as through mechanistic interpretability. These initiatives are based on models of the world with long feedback loops, and, as such, may have significant model error.
- Some people are extremely good at predicting the future on time-frames measured in months. – Judgemental forecasting by Sentinel’s expert forecasters has a very good track record for short-term forecasts.
Addressing the Problem with Forecasting and Emergency Response
To identify and act at the hinge of history, Sentinel is composed of a foresight team and an emergency response team.
The Foresight team’s goal is to monitor current events, update forecasts of these events leading to large-scale and existential catastrophes, alert the emergency response team when they deem the risk high enough, and support the emergency response team with ongoing forecasts if necessary.
The goal of the Emergency Response team is to ensure that advance warning from the Foresight team translates into effective action as fast as possible.
From Gavin Leech and Jan Kulveit’s Case for emergency response teams:
[Moments at the hinge of history] might be best influenced by sustained efforts long before… But it seems plausible that in many cases, the best realistic chance to influence them is “while they are happening”, via a concentrated effort at that moment. Some reasons for this: there are decreased problems with cluelessness; an increase in the resources spent on the problem; actual decisions being made by powerful actors, and so on.
The team has a broad base of specialties so it can fit to the shape of the emergency at hand, which could be hard to predict in advance. Depending on the emergency, some examples of actions the team could take are: raising the awareness of live players; deploying websites, APIs, cybersecurity countermeasures; gathering data in-person and online; implementing emergency protocols made by Sentinel or other organizations such as ALLFED; and any other tasks that are specific to the emergency at hand. We summarize cases of effective private efforts below.
Founding Team
Nuño Sempere – Cofounder & Head of Foresight
Previously:
Rai Sur – Cofounder & Head of Emergency Response
Previously:
The Foresight Team
The track record of our forecasters is understood to be world-class. The team members as of now are 4 top forecasters/superforecasters who are members of the Samotsvety forecasting team. Scott Alexander, one of the widest-read rationalist bloggers, says of Samotsvety:
Enter Samotsvety Forecasts. This is a team of some of the best superforecasters in the world. They won the CSET-Foretell forecasting competition by an absolutely obscene margin, “around twice as good as the next-best team in terms of the relative Brier score”. If the point of forecasting tournaments is to figure out who you can trust, the science has spoken, and the answer is “these guys”.
More praise for Samotsvety here.
Additionally, the Foresight team has been publishing, and will continue to publish, weekly minutes of forecasting meetings.
The weekly minutes are a public good, valuable both as:
- a news source for those who just want to know what, if anything, they should be concerned about
- reports of the reasoning of the forecasters for those who want to interact with their mental models
So far, our minutes have over 600 subscribers. We correctly anticipated the WHO's decision to declare monkeypox a Public Health Emergency of International Concern (PHEIC) and gave thorough details on the outbreak dynamics.
For further reading on judgemental forecasting in the context of early-warning forecasting-centers, here is a long-form piece by Linch Zhang going through the idea.
Foresight Team as of August 13, 2024
Nuño Sempere | See Above |
Lisa M. | Superforecaster; Samotsvety Team Member; Professional INFER forecaster |
Vidur Kapur | Superforecaster; Samotsvety Team Member; holds degrees in Medicine and Public Health and has worked as a biosecurity researcher |
| Policy Researcher at ControlAI, a non-profit that works to reduce risks to humanity from AI; Leader of aitreaty.org |
The Emergency Response Team
The goal of the Emergency Response team is to ensure that advance warning from the Foresight team translates into effective action as fast as possible.
This could take two prongs, which aren’t mutually exclusive:
- Where other organizations exist with specialized resources and/or expertise dealing with the catastrophe at hand, the team’s primary goal would likely be to activate them more quickly than they otherwise would have been activated. This would involve crafting messages and sending them along public and private channels such that the live players within the relevant organizations are motivated to act.
- More importantly, where no such organization exists or is motivated, the Emergency Response team will take on as many of the necessary roles as possible until it is no longer the most motivated, capable party.
While the reservists of the Emergency Response team are usually dormant and intentionally generalist, there are specific supporting initiatives to work on during periods of low risk. These include:
- Practicing emergency response on sub-catastrophic real-world events, evaluating the outcomes, and integrating learnings
- Creating trainings in general high-stakes-decision-making skills developed by existing and past organizations
- Cataloging and monitoring for live players and their motivations
- Building social capital and mapping out social graphs such that we can easily find the fastest route to live players
- Building infrastructure to quickly solve common subproblems with little notice:
- Spinning up informative and authoritative media such as websites and videos
- Accepting and routing donations wherever they may need to go
- Recruiting more reservists for expertise coverage
One key piece of early skepticism we’ve encountered is the suggestion that Sentinel should focus only on foresight and leave response to other actors. We understand that the associated response arm is unproven, and we want to test it because we have the intuition that it could be very valuable:
- By coupling an emergency response team to a foresight team, the emergency response team can act faster. When potential loss is extremely high, the counterfactual value of acting even a day sooner is worth a lot.
- We are bullish on the ability of talented independents, as opposed to:
  - emergency response professionals, particularly around novel threats. Emergency response professionals and practices can be a source of inspiration, but they may also tend to be oriented towards more traditional risks (volcanoes, earthquakes), or be too skeptical of speculative risks.
  - states, because of their slowness, bureaucratization, and reduced capacity for action over time.
- The emergency response team benefits from continuous curation. By tight coupling with the foresight team, the emergency response team can seek and get commitments from reservists with skills related to the risks that are growing in the eyes of the foresight team in real time.
Inspirations
Some past examples we take inspiration from, in which motivated groups had impact without tight coordination with states, are VaccinateCA, World Central Kitchen, and the Lübeck vaccine. The VaccinateCA case study is below; the others are in the Appendix.
VaccinateCA
VaccinateCA was a group of capable generalists, primarily from the tech industry, who rapidly outperformed government efforts during the COVID-19 vaccine rollout. Within days, the volunteer-led initiative created a more comprehensive and accurate system for tracking vaccine availability than official channels. They built a nationwide infrastructure that provided real-time vaccine location information to millions of Americans. While government agencies struggled with bureaucratic constraints, data sharing agreements, and outdated systems, VaccinateCA's approach allowed them to quickly iterate, form partnerships with major tech companies like Google, and adapt to changing circumstances. Their success in gathering and disseminating crucial vaccine information filled a significant gap left by official efforts.
They used simple methods, such as direct phone calls to pharmacies and web scraping of county health department social media posts, to gather data that government systems couldn't easily access. They rapidly scaled their operations from California to nationwide, managed data integration challenges, and even assisted in resolving logistical issues they uncovered during their work. Despite operating on a fraction of the budget of official efforts like California's "My Turn" system, VaccinateCA managed to provide more comprehensive and timely information.
How do these public health and humanitarian aid examples generalize to Sentinel?
The relevant aspect of the world we want to draw attention to is how individuals with a bias for action can outperform institutions even when those institutions are aware that the crisis is their primary focus. Some things do get more streamlined during crisis time, but critical limiting factors may remain or get magnified. For example, the scarcity of vaccines magnified the importance of virtue signaling, and the vaccines became instrumentalized for politics during the rollout.
Additionally, where public debate exists on the pandemic that could shift policy, it mostly isn’t about these dynamics and how to prevent them. This leads us to believe that the institutions haven’t updated in light of these failures and will exhibit similar failure modes.
In our view, there is insufficient evidence that an institution that shows clear deficiencies during smaller crises will necessarily function better when the stakes are higher.
Speculative Scenario
A limitation of this exercise is that any scenario we could come up with is, by definition, relatively easy to conceptualize. In this section we sketch a speculative but plausible scenario where Sentinel’s approach would empower private individuals to mitigate large-scale risk, addressing skepticism about what role private individuals would play during emergencies. Mitigating these conceptualizable scenarios is still important since they can increase systemic volatility that could lead to the more difficult-to-conceptualize events.
The first time effective AI persuasion is deployed at scale could introduce a lot of uncertainty in the course of weeks or months. We are already seeing hints of interest in this capability, for example, the United Arab Emirates having a large army of LLM bots hyping it as a destination for AI startups. This could scale up into more dangerous strains such as:
- Causing unnecessary large-scale evacuations of key infrastructure or cities to distract a population from some other operation
  - As a historical point: in Manhattan during March and April 2020, rumors of a state-sanctioned quarantine of the island spread without large-scale persuasion. More persuasive messaging could have amplified this to the point where many more people acted on it.
- AI (girl/boy)friends which promote some ideological message in order to affect elections or coordinate protests
- Attempts to weaken the defensive effectiveness of NATO by reducing the trust in the US as a defense source, inciting offensive action by NATO that is in fact not endorsed by the US, amplifying the dissent of countries like Hungary, or making people believe that previous attacks (that didn’t exist) have not been answered.
Once signs of persuasion capacities improving quickly or being deployed in suspicious ways concern the foresight team, they would alert the emergency response team. The emergency response team could take many actions based on the shape of the problem at hand. Some examples include:
- AI experts on the team determine if it’s practically possible to create a discriminator.
  - If so, fund engineers to train and/or deploy discriminators. Use in-house software expertise to deploy websites and API endpoints for discrimination-as-a-service.
- If the foresight team discerns small-scale tests:
  - Infer the actors behind the tests with:
    - Network analysis to determine the geographic location of the servers running the models
    - Comparison with output of popular models to determine model provenance.
  - Infer ultimate targets by generalizing specific aspects of the test (language, domain of expertise, communication channels).
- Once the large-scale persuasion is underway:
  - Spin up operations to reach out to people to collect data on the arguments used. While presumed to be initially effective in this hypothetical, the persuasion may be brittle in the face of certain information or adversarial prompting.
Emergency Response Team as of August 13, 2024
Nuño Sempere | See Above |
Rai Sur | See Above |
Dusan Nesic | COO, PIBBSS |
Gavin Leech | Founder – Arb Research; Schmidt Futures International Strategy Fellow; Emergent Ventures Grant Recipient |
Alex Demarsh | Biosecurity Consultant. Previously: VP Data Science, bluedot.global. He’s currently acting more as an advisor and mentor than as a reservist |
Vivian Belenky | Technical Biosecurity Intervention Researcher. Promising and undervalued |
Raymond Douglas | Director, Schelling Residency; multifaceted genius. |
Nathan Young | Product Manager – (Goodheart Labs, Funded by Vitalik Buterin). Previously: Coronavirus tech handbook |
Mathias Bonde | Previously: Cofounder and Director – Center for Effective Aid Policy; now independent agent. |
Sapphire | Former crypto entrepreneur and animal rights advocate, known for being able to execute well in fast-paced novel environments |
Impact Estimate
Below is our estimate of the cost-adjusted impact of Sentinel on existential risk. It is a lower bound on the value of the project as a whole since it doesn't attempt to model the information value derived from our experimentation that can benefit other projects. We invite you to play with the assumptions in the interactive version here.
Progress-to-Date
Here are the results of Sentinel so far at various levels of funding and our ask for the next stage.
- $10K (✓ via Manifund, March 2024)
- Proof of Concept; initial recruitment of ~five reserve team members, initial forecasting team.
- $85K (✓ via private donor and SFF, July 2024)
- Forecasting Team (4 people) and Emergency Response Team (10 people) further fleshed out. The forecasting team has been producing a steady weekly cadence of reports, and has LLM tools to analyze around 2M pieces of news per week.
- Vishal Maini encourages us to think bigger and address the many bottlenecks in setting up a more ambitious version of Sentinel.
- Being spent on incorporating Sentinel as a US nonprofit, initial salaries for the foresight team, and
- ~$350K (in the process of being raised)
- Nuño and Rai to go full time, increase the size and scope of the foresight and reserve teams, buy more ops capacity, and have an emergency fund to deploy in case of an emergency
Funding Ask
Since we are already operational, grants of less than the full Base amount (such as from smaller, individual donors) are welcome as they still increase our runway. These are annual budgets, and intended for funders who might fund us fully. If you are an individual, donations in the $10k range are still very useful as they’d allow us to expand faster now.
Funding Scenario | Base | Mid | High |
---|---|---|---|
Cofounder Salaries | 130K | 150K | 180K |
Foresight | 50K | 110K | 150K |
Emergency Response | 50K | 80K | 115K |
Ops | 36K | 54K | 100K |
Marketing | 0 | 25K | 35K |
Engineering | 0 | 0 | 85K |
Slack/Taxes/Overhead | 1.3x | 1.3x | 1.3x |
Total | 345.8K | 544.7K | 851.5K |
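As a quick check of the arithmetic in the table, here is a minimal sketch that assumes each Total is the sum of the line items multiplied by the 1.3x Slack/Taxes/Overhead factor. That assumption is ours; the memo does not state the formula. The names `LINE_ITEMS`, `STATED_TOTALS`, and `scenario_total` are hypothetical and exist only for illustration. Under this assumption the Base and Mid totals reproduce exactly, while the High line items as listed come to 864.5K rather than the stated 851.5K, so a High line item or the High total may reflect a figure not shown in the table.

```python
# Line items in thousands of USD, copied from the Funding Scenario table.
LINE_ITEMS = {
    "Cofounder Salaries": {"Base": 130, "Mid": 150, "High": 180},
    "Foresight": {"Base": 50, "Mid": 110, "High": 150},
    "Emergency Response": {"Base": 50, "Mid": 80, "High": 115},
    "Ops": {"Base": 36, "Mid": 54, "High": 100},
    "Marketing": {"Base": 0, "Mid": 25, "High": 35},
    "Engineering": {"Base": 0, "Mid": 0, "High": 85},
}

# The "Slack/Taxes/Overhead" row, assumed to multiply the subtotal.
OVERHEAD_MULTIPLIER = 1.3

# Totals as stated in the table, in thousands of USD.
STATED_TOTALS = {"Base": 345.8, "Mid": 544.7, "High": 851.5}


def scenario_total(scenario: str) -> float:
    """Sum the line items for one scenario and apply the overhead multiplier."""
    subtotal = sum(row[scenario] for row in LINE_ITEMS.values())
    return round(subtotal * OVERHEAD_MULTIPLIER, 1)


if __name__ == "__main__":
    for scenario, stated in STATED_TOTALS.items():
        computed = scenario_total(scenario)
        gap = round(computed - stated, 1)
        print(f"{scenario}: computed {computed}K, stated {stated}K, gap {gap}K")
```

The same function can be used to see how a partial grant or a changed line item would move each scenario's total.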
Base Scenario
- Foresight
  - 5 team members
  - LLM-enabled global news analysis
  - More news source integrations
  - Official APIs instead of brittle web-scraping
  - Published weekly reports detailing the greatest current risks
- Emergency Response
  - Recruit 20 reserve members to cover GCR areas
    - Bio, medical, machine learning, policy, international relations, meteorology/disaster response, nuclear/chemistry/weapons expertise, etc.
  - 3 real-life intervention reports on outcomes, learnings (at least small-scale if no larger-scale emergencies occur)
  - High-stakes decision-making trainings
Mid Scenario - everything in Base plus:
- Foresight
  - 1 additional team member
  - Risk reporting upgrades such as graphics and podcasts
- Emergency Response
  - 10 additional reserve members
  - 1 engineering deliverable such as a semantic-searchable social-network graph of live players and their motivations
High Scenario - everything in Mid plus:
- Foresight
  - 1 additional team member
  - Using engineering and forecasting resources to do novel OSINT synthesis on global catastrophic risk precursors
- Emergency Response
  - 10 additional reserve members
Future Direction
If successful in these endeavors, we see the project evolving from “Samotsvety augmented with reservists” to “intelligence agency focused on existential risk”. State-affiliated intelligence agencies, while aware of existential risk, overfit to nationalistic goals such as winning AI races. A state-unaffiliated intelligence agency focused on existential risk could represent broader interests.
Appendix
Inspiration Case Studies
World Central Kitchen
World Central Kitchen has repeatedly demonstrated its ability to respond quickly and effectively to various crises around the world, often outpacing and outperforming government efforts. From their marketing:
In 2017, Hurricane Harvey hit Houston, and José, with several chefs from his team, decided it was time to act. The group got on the ground and began helping to prepare meals. José continued learning, observing how food relief was handled following a crisis—he immediately saw gaps and ways that it could be done better. Then, just a month later, Hurricane María hit Puerto Rico—the storm brought catastrophic devastation, and millions of Americans were in need, immediately. Boarding the first commercial flight to San Juan, José started in one kitchen, cooking sancocho at a friend’s restaurant in the Santurce neighborhood. Building fast, chefs, food trucks, volunteers joined the team and #ChefsForPuertoRico was born. WCK would go on to serve nearly 4 million fresh meals in the aftermath of María.
The organization's success lies in its adaptability and willingness to work outside traditional bureaucratic structures. WCK focuses on mobilizing quickly, collaborating with local partners, and providing tailored solutions. This is in contrast to the often slower and more rigid responses of government agencies. WCK's approach of empowering local communities has also allowed it to respond more nimbly to crises.
Lübeck Vaccine
Winfried Stöcker, a German doctor, after a career developing immunization technologies, continues working in his own, smaller lab, Labor Stöcker.
Come COVID, he realizes that he has the relevant expertise, and in March 2020 develops a vaccine in his personal lab. He tests this vaccine on himself, and later on family and employees.
After some experiments, he becomes sure of the vaccine’s efficacy and harmlessness, as it uses the same mechanisms as, e.g., Hepatitis vaccines. He decides that, in times of urgency, double blind trials would take too long, and organizes a vaccine drive at his airport. But the Paul Ehrlich Institute and the Landesamt für soziale Dienste sue him, and police interrupt the vaccine drive.
After my resounding success with the first five immunizations (within my own family), I applied for approval for a corresponding study in September 2020 (Wednesday, 2 September 2020, 18:52) with the head of the Paul Ehrlich Institute, Klaus Cichutek. I tried to make it clear to him that within six months I could have immunized all of Germany safely and efficiently against Covid-19. And yet, instead of eliciting enthusiasm from this important senior civil servant, I aroused his displeasure, apparently because he felt overlooked. Or perhaps there were other interests involved. And so, he had me prosecuted (the proceedings were closed)…
Overall, sources defending Stöcker’s point of view are more numerous and accessible than sources presenting the opposite perspective. Still, here is a blog post outlining the lack of information about quality assurance processes, and other problems with the potential vaccine.
Ultimately, it could be the case that the Lübeck vaccine was inferior, and that legal censure was indeed justified. However, our impression is that, on the balance of probabilities, that’s not the case. Particularly in the early days of the pandemic, scaling a cheap method seems like it would have been much better than subjecting the population to the also uncertain effects of more infections. And as time goes on, having had different vaccines also seems like it would have been more robust.
Not sure if important and I am sure you have thought of this: I find the sentinel minutes some of the best news sources out there. I would be happy as an individual to pay a small monthly fee to get these.
Also, I would really look forward to reports on future interactions with non-EA/non-X-risk orgs that are in "crisis mode" to see how Sentinel's offering is received by the customers.
[epistemic status: i've spent about 5-20 hours thinking by myself and talking with rai about my thoughts below. however, i spent fairly little time actually writing this, so the literal text below might not map to my views as well as other comments of mine.]
IMO, Sentinel is one of the most impactful uses of marginal forecasting money.
some specific things i like about the team & the org thus far:
i have the highest crux-uncertainty and -elasticity around the following, in (extremely rough) order of impact on my thought process: