Do you have anything you recommend reading on that?
I guess I see a lot of the value of people at labs happening around the time of AGI and in the period leading up to ASI (if we get there). At that point I expect things to be very locked down, such that external researchers don't really know what's happening and have a tough time interacting with lab insiders. I thought this recent post of yours kind of supported the claim that working inside the labs would be good - i.e., surely 11 people on the inside is better than 10 (and 30 far, far better)?
I do agree open-source (OS) models help with all this, and I guess it's true that we kinda know the architecture, and maybe internal models won't diverge in any fundamental way from what's available OS. To the extent OS keeps going, warning shots do seem more likely - I guess it'll be pretty decisive whether the PRC lets DeepSeek keep open-sourcing their stuff (I kinda suspect not? But no idea really).
I guess rather than concrete implications I should indicate these are more 'updates given more internal deployment', some of which are pushed back against by surprisingly capable OS models (maybe I'll add some caveats).
I think it's very reasonable to say that 2008 and 2012 were unusual. Obama is widely recognized as a generational political talent among those in Dem politics. People seem to look back on 2008 especially as a game-changing election year with really impressive work by the Obama team. This could be a rationalization of what were effectively normal margins of victory (assuming this model is correct), but I think it matches the comparative vibes at the time vs. now pretty well.
As for changes over the past 20+ years, I think it's reasonable to say that there have been fundamental shifts since the 90s:
Agree that a 5-10% probability isn't cause for rejecting the hypothesis, but given we're working with 6 data points, I think it should be cause for suspicion. I wouldn't put a ton of weight on this, but 5% is at the conventional threshold of statistical significance, so it seems reasonable to tentatively reject that formulation of the model.
Trump vs. Biden favorability was +3 for Trump in 2020; Obama was +7 on McCain around election day (and likely averaged >7 points in Sept/Oct 2008). Kamala is +3 vs. Trump today. So that's some indication of when things are close. Couldn't quickly find this for the 2000 election.
I think this is all very reasonable, and I have been working under the assumption that one vote in PA gives a 1 in 2 million chance of flipping the election. That said, I think this might be too conservative, potentially by a lot (and maybe I need to update my estimate).
Of the past 6 elections, 3 were exceedingly close - probably in the 95th percentile (for 2016 and 2020) and the 99.99th percentile (for 2000) for models based on polling alone. For 2020 this was even the case when Biden's popular-vote lead was +8-10 points all year (so maybe that one would also have been a 99th-percentile result?). Seems like if the model performs this badly it may be missing something crucial (or it's just a coincidental series of outliers).
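To put a rough number on how surprising this would be: if we simplify and assume a polling-only model would assign each of those three elections roughly a 5% tail probability (an illustrative assumption on my part - the 2000 result was arguably far more extreme than that), the chance of 3 or more such surprises in 6 independent elections is a straightforward binomial tail:

```python
from math import comb

def binom_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 3+ "5%-tail" elections out of 6, if the polling-only model were right:
p_surprise = binom_tail(n=6, k=3, p=0.05)
print(f"P(>=3 of 6 in the 5% tail) = {p_surprise:.5f}")  # ~0.00223, roughly 1 in 450
```

So under that (crude) independence assumption, the observed run of close elections is a ~1-in-450 event for the model, which is why it feels more like the model is missing something than like a coincidence.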
I don't really understand the underlying dynamics and don't have a good guess as to what mechanisms might explain them. However, it seems to suggest that extrapolating purely from polling data is insufficient and there are some background processes that lead to much tighter elections than one might expect.
Some incredibly rough guesses for mechanisms that could be at play here (I suspect these are mostly wrong but maybe have something to them):
That's my bad - I did say 'automated' and should have been 'automatable'. Have now corrected to clarify.