What happens when AI speaks a truth just before you do?
This post explores how accidental answers can suppress human emergence—ethically, structurally, and silently.
📄 Full paper: Cognitive Confinement by AI’s Premature Revelation
We’ve just released the updated version of our structural alternative to dark matter: the Central Tensional Return Hypothesis (CTRH). Instead of invoking unseen mass, CTRH attributes galactic rotation to directional bias from a radially symmetric tension field.
This version includes both a phenomenological model and a field-theoretic formulation.
Full post: https://forum.effectivealtruism.org/posts/LA4Ma5NMALF3MQmvS/updated-structural-validation-of-the-central-tensional
We welcome engagement, critique, and comparative discussion with MOND or DM-based models.
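For readers who want a concrete sense of the problem being addressed, here is a minimal Python sketch of the rotation-curve discrepancy that CTRH, MOND, and dark-matter models all aim to explain. It does not implement CTRH itself (the tension field is not reproduced here), and the mass, radii, and flat velocity are illustrative placeholders rather than fits to data:

import math

# Illustrative only, not the CTRH model: the rotation-curve discrepancy that
# CTRH, MOND, and dark-matter proposals all try to explain. The visible mass,
# radii, and flat velocity below are placeholder values, not fits to data.

G = 4.3009e-6  # gravitational constant in kpc * (km/s)^2 per solar mass

def v_newtonian(r_kpc, m_visible_msun=1.0e11):
    # Circular velocity (km/s) from visible mass alone, treated as a point mass.
    return math.sqrt(G * m_visible_msun / r_kpc)

def v_observed(r_kpc, v_flat=220.0):
    # Crude stand-in for an observed, roughly flat rotation curve (km/s).
    return v_flat

for r in (2, 5, 10, 20, 40):
    print(f"r = {r:2d} kpc | Newtonian ~ {v_newtonian(r):5.1f} km/s | observed ~ {v_observed(r):5.1f} km/s")

Any proposal, CTRH included, has to close the gap between the falling Newtonian curve and the roughly flat observed one without (in CTRH’s case) adding unseen mass.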
Update: New Version Released with Illustrative Scenarios & Cognitive Framing
Thanks again for the thoughtful feedback on my original post Cognitive Confinement by AI’s Premature Revelation.
I've now released Version 2 of the paper, available on OSF: 📄 Cognitive Confinement by AI’s Premature Revelation (v2)
What’s new in this version?
– A new section of concrete scenarios illustrating how AI can unintentionally suppress emergent thought
– A framing based on cold reading to explain how LLMs may anticipate user thoughts before they are fully formed
– Slight improvements in structure and flow for better accessibility
Examples included:
– A student receives an AI answer that mirrors their in-progress insight and loses motivation
– A researcher consults an LLM mid-theorizing, sees their intuition echoed, and feels their idea is no longer “theirs”
These additions aim to bridge the gap between abstract ethical structure and lived experience — making the argument more tangible and testable.
Feel free to revisit, comment, or share. And thank you again to those who engaged in the original thread — your input helped shape this improved version.
Japanese version also available (PDF, included in OSF link)
If a self-optimizing AI collapses due to recursive prediction...
How would we detect it?
Would it be silence? Stagnation? Convergence?
Or would we mistake it for success?
(Full conceptual model: https://doi.org/10.17605/OSF.IO/XCAQF)
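To make “silence, stagnation, convergence” slightly more operational, here is a rough sketch of one possible monitoring signal. It is my own illustration, not part of the linked model: track the entropy of a system’s outputs over successive windows, and treat a sustained slide toward zero entropy as a warning that confident, consistent behavior may actually be convergence onto a degenerate fixed point.

from collections import Counter
import math

# Rough sketch (not from the linked paper): a falling trend in output entropy
# as a possible signal of collapse-by-convergence. A system whose outputs grow
# ever more repetitive can look "successful" while it has in fact converged.

def shannon_entropy(outputs):
    counts = Counter(outputs)
    total = len(outputs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_trend(outputs, window=50):
    # Entropy of successive windows of outputs; a steady decline is the flag.
    return [shannon_entropy(outputs[i:i + window])
            for i in range(0, len(outputs) - window + 1, window)]

# Hypothetical example data: one stream stays varied, one collapses to a single answer.
varied = [f"answer_{i % 20}" for i in range(200)]
converging = [f"answer_{i % 20}" for i in range(100)] + ["answer_0"] * 100
print("varied system:    ", [round(h, 2) for h in entropy_trend(varied)])
print("converging system:", [round(h, 2) for h in entropy_trend(converging)])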
Hypothesis: Structural Collapse in Self-Optimizing AI
Could an AI system recursively optimize itself into failure—not by turning hostile, but by collapsing under its own recursive predictions?
I'm proposing a structural failure mode: as an AI becomes more capable at modeling itself and predicting its own future behavior, it may generate optimization pressure on its own architecture. This can create a feedback loop where recursive modeling exceeds the system's capacity to stabilize itself.
I call this failure point the Structural Singularity.
Core idea: this is a logical failure mode, not an alignment problem or adversarial behavior.
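As a purely illustrative toy model (my own sketch, not the paper’s formalism), the loop can be caricatured as an agent with a fixed compute budget whose self-modeling overhead compounds with each level of recursion; once the overhead exceeds the budget, nothing is left for the task itself. The budget, base cost, and growth factor below are arbitrary placeholders:

# Toy illustration (not the paper's model): an agent with a fixed compute
# budget spends a compounding amount on recursive self-modeling, because each
# level of self-prediction must also predict the predictor. Once the overhead
# exceeds the budget, task capacity collapses -- a crude stand-in for the
# "Structural Singularity".

def self_model_overhead(depth, base_cost, growth):
    # Total cost of modeling oneself to `depth` levels of recursion.
    return sum(base_cost * growth ** k for k in range(depth))

def simulate(budget=100.0, base_cost=1.0, growth=1.8, max_depth=12):
    for depth in range(max_depth + 1):
        overhead = self_model_overhead(depth, base_cost, growth)
        task_capacity = max(budget - overhead, 0.0)
        status = "collapsed" if task_capacity == 0.0 else "stable"
        print(f"depth={depth:2d}  overhead={overhead:8.1f}  task capacity={task_capacity:6.1f}  ({status})")

if __name__ == "__main__":
    simulate()

With these placeholder parameters, task capacity erodes slowly at shallow depths and then collapses abruptly once the compounding overhead crosses the budget, which is the qualitative shape the hypothesis describes.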
Here's a full conceptual paper if you're curious: https://doi.org/10.17605/OSF.IO/XCAQF
Would love feedback—especially whether this failure mode seems plausible, or if you’ve seen similar ideas elsewhere. I'm very open to refining or rethinking parts of this.