Autonomy vs Alignment

The design space of agent systems: safety constraints, capability boundaries, escalation paths. When does autonomy become liability? When does alignment become brittleness?

This thinking map synthesises findings from NotebookLM deep research across 65 sources on agent architecture, alignment theory, and safety engineering. It explores the central tension of agent systems: the more autonomy you grant, the harder alignment becomes, and the more catastrophic the failure when that alignment breaks.

Source notebook: NotebookLM — “Autonomy vs Alignment in Agent Systems” — 65 sources loaded via deep research.

“The question isn't how autonomous a system can be. It's how autonomous it needs to be — and what you're willing to pay for every extra degree of freedom.”

The Autonomy Spectrum

Documented Failure Modes

Alignment Theory for Multi-Step Agents

Safety Architecture

The Autonomy Tax