diff --git a/paper/archeflow.tex b/paper/archeflow.tex
index aa5f1b6..e0067f2 100644
--- a/paper/archeflow.tex
+++ b/paper/archeflow.tex
@@ -733,6 +733,70 @@ toward specific cognitive orientations---but the shadow mechanism prevents
 them from drifting too far, maintaining a productive operating range
 analogous to what \citeauthor{lu2026assistant} achieve through activation capping.
 
+\subsection{Wiggum Breaks as Human-in-the-Loop Boundaries}
+
+A central question in autonomous agent systems is: \emph{when should the
+system stop acting and ask a human?} Most frameworks treat this as an
+implementation detail---a timeout, a retry limit, an exception handler.
+ArcheFlow treats it as a first-class architectural concept through the
+\emph{Wiggum Break}.
+
+The Wiggum Break defines the \textbf{formal boundary between autonomous and
+human-supervised operation}. It is not a failure mode; it is the system's
+\emph{designed} response to situations where autonomous resolution is
+provably unproductive:
+
+\begin{itemize}
+  \item \textbf{Oscillation} (finding present $\to$ absent $\to$ present)
+    indicates a genuine tension in the review criteria that no amount of
+    cycling will resolve---only human judgment about which criterion takes
+    priority.
+
+  \item \textbf{Divergence} (convergence score $< 0.5$ for two consecutive
+    cycles) indicates that the implementation is getting worse with each
+    iteration---the agents lack the context or capability to solve the
+    problem, and continuing wastes resources.
+
+  \item \textbf{Repeated shadow detection} (same dysfunction three times)
+    indicates that the corrective action framework has exhausted its
+    options---the task structure is incompatible with the assigned archetype,
+    and a human must re-scope.
+\end{itemize}
+
+This framing inverts the typical HITL paradigm. Rather than asking
+``how much autonomy should the system have?'' and pre-defining approval
+gates, ArcheFlow asks ``under what conditions is autonomy
+\emph{provably unproductive}?'' and derives the HITL boundary from
+convergence theory. The system runs autonomously by default and escalates
+only when it can demonstrate---through quantitative metrics, not
+heuristics---that continued autonomous operation will not improve the
+outcome.
+
+This approach has three advantages over pre-defined approval gates:
+
+\begin{enumerate}
+  \item \textbf{Adaptive autonomy}: Simple tasks never trigger a Wiggum
+    Break; complex tasks trigger one quickly. The HITL boundary adapts to
+    task difficulty without manual configuration.
+
+  \item \textbf{Auditable escalation}: Every Wiggum Break emits a
+    \texttt{wiggum.break} event with the trigger condition, run state, and
+    unresolved findings. The human receives not just a request for help,
+    but a structured summary of \emph{why} autonomous resolution failed
+    and what specifically needs their judgment.
+
+  \item \textbf{Minimal interruption}: Pre-defined gates (``approve every
+    PR'', ``review every design'') interrupt the human on tasks the system
+    could have handled autonomously. Convergence-derived breaks interrupt
+    only when the system has evidence that it cannot proceed productively.
+\end{enumerate}
+
+The Wiggum Break thus operationalizes a principle from resilience
+engineering: the system should be \emph{autonomy-seeking} (preferring to
+resolve issues itself) but \emph{escalation-ready} (able to produce a
+useful handoff when self-resolution fails). The quality of the handoff---not
+just the fact of escalation---is what makes HITL effective.
+
 \subsection{Limitations}
 
 \begin{enumerate}