docs: add arXiv paper on ArcheFlow architecture

LaTeX paper describing the archetypal role system, PDCA quality cycles,
shadow detection framework, attention filters, convergence detection,
and effectiveness scoring. References Lu et al. 2026 (Assistant Axis)
for persona stability grounding.
2026-04-08 04:54:14 +02:00
parent 55dde5f07a
commit 24ea632207
4 changed files with 920 additions and 0 deletions

paper/Makefile

@@ -0,0 +1,18 @@
# Build the ArcheFlow paper
# Usage: make (build PDF)
# make clean (remove build artifacts)
MAIN = archeflow
.PHONY: all clean
all: $(MAIN).pdf
$(MAIN).pdf: $(MAIN).tex references.bib
pdflatex $(MAIN)
bibtex $(MAIN)
pdflatex $(MAIN)
pdflatex $(MAIN)
clean:
	rm -f $(MAIN).aux $(MAIN).bbl $(MAIN).blg $(MAIN).log $(MAIN).out \
		$(MAIN).pdf $(MAIN).toc $(MAIN).lof $(MAIN).lot $(MAIN).nav $(MAIN).snm $(MAIN).vrb

paper/archeflow.tex

@@ -0,0 +1,805 @@
\documentclass[11pt,a4paper]{article}
% ---- Packages ----
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{amsmath,amssymb}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{hyperref}
\usepackage{xcolor}
\usepackage{listings}
\usepackage{subcaption}
\usepackage{tikz}
\usetikzlibrary{shapes,arrows.meta,positioning,fit,calc}
\usepackage[numbers]{natbib}
\usepackage{geometry}
\geometry{margin=1in}
% ---- Listings style ----
\lstset{
basicstyle=\ttfamily\small,
breaklines=true,
frame=single,
framesep=3pt,
columns=flexible,
keepspaces=true,
showstringspaces=false,
commentstyle=\color{gray},
keywordstyle=\color{blue!70!black},
}
% ---- Title ----
\title{%
ArcheFlow: Multi-Agent Orchestration with\\
Archetypal Roles and PDCA Quality Cycles%
}
\author{
Christian Nennemann\\
Independent Researcher\\
\texttt{chris@nennemann.de}\\
\url{https://github.com/XORwell/archeflow}
}
\date{April 2026}
\begin{document}
\maketitle
% ============================================================
\begin{abstract}
We present \textsc{ArcheFlow}, an open-source orchestration framework for
multi-agent software engineering that assigns \emph{archetypal roles}---derived
from Jungian analytical psychology---to LLM agents and coordinates them through
\emph{Plan--Do--Check--Act} (PDCA) quality cycles. Each of seven archetypes
(Explorer, Creator, Maker, Guardian, Skeptic, Trickster, Sage) carries a defined
cognitive virtue and a quantitatively detected \emph{shadow}---a failure mode
triggered when the virtue becomes excessive. The framework implements a
three-layer corrective action system (archetype shadows, system shadows, policy
boundaries) that detects and mitigates agent dysfunction during autonomous
operation. We describe ArcheFlow's architecture as a zero-dependency plugin for
Claude Code, detail its attention filtering, feedback routing, convergence
detection, and effectiveness scoring mechanisms, and discuss connections to
recent work on persona stability in language models
\citep{lu2026assistant}. ArcheFlow demonstrates that structured persona
assignment with shadow detection can maintain productive agent behavior across
extended autonomous sessions spanning multiple projects and quality domains
(code, prose, research). The system is publicly available under the MIT license.
\end{abstract}
% ============================================================
\section{Introduction}
\label{sec:introduction}
The rise of agentic coding assistants---tools that autonomously write, test,
review, and commit code---has created a new class of software engineering
challenges. While individual LLM agents can produce competent code, the quality
of autonomous output degrades under conditions that are well-known from human
software teams: reviewers who rubber-stamp, architects who over-engineer,
implementers who ignore specifications, and testers who optimize for coverage
metrics rather than real defects.
These failure modes are not merely analogies. \citet{lu2026assistant}
demonstrate that language models occupy a measurable \emph{persona space} and
can drift from their trained Assistant identity during extended conversations,
particularly under emotional or philosophical pressure. Their ``Assistant
Axis''---a dominant directional component in activation space---predicts when
models will exhibit uncharacteristic behavior. If a single model drifts, a
multi-agent system where each agent maintains a distinct persona faces
compounded persona management challenges.
ArcheFlow addresses this problem by drawing on two established frameworks:
\begin{enumerate}
\item \textbf{Jungian archetypal psychology} \citep{jung1968archetypes}, which
provides a taxonomy of cognitive orientations---each with a productive
\emph{virtue} and a destructive \emph{shadow}---that map naturally onto
software engineering roles.
\item \textbf{PDCA quality cycles} \citep{deming1986out}, which provide a
convergence mechanism for iterative refinement with measurable exit criteria.
\end{enumerate}
The contribution of this paper is threefold:
\begin{itemize}
\item We present a \emph{shadow detection framework} that quantitatively
identifies agent dysfunction---not through sentiment analysis or output
classification, but through structural metrics (output length, finding ratios,
scope violations) specific to each archetype's failure mode (Section~\ref{sec:shadows}).
\item We describe \emph{attention filters} and \emph{feedback routing} mechanisms
that constrain what each agent sees and where its output flows, preventing the
information overload and echo chamber effects that plague na\"ive multi-agent
systems (Section~\ref{sec:attention}).
\item We demonstrate that PDCA convergence detection---including oscillation
analysis and divergence scoring---provides principled stopping criteria for
iterative review cycles (Section~\ref{sec:convergence}).
\end{itemize}
ArcheFlow is implemented as a zero-dependency plugin (Bash + Markdown) for
Claude Code\footnote{\url{https://claude.ai/claude-code}}, Anthropic's CLI
coding assistant. It has been used in production across a portfolio of 10--30
repositories spanning code, creative writing, and academic research.
% ============================================================
\section{Related Work}
\label{sec:related}
\subsection{Multi-Agent Software Engineering}
Multi-agent systems for software engineering have proliferated since 2024.
\citet{hong2024metagpt} propose MetaGPT, which assigns human-like roles
(product manager, architect, engineer) to LLM agents and enforces structured
communication through Standardized Operating Procedures (SOPs). ChatDev
\citep{qian2024chatdev} simulates a virtual software company with role-playing
agents communicating through natural language chat. SWE-Agent
\citep{yang2024sweagent} focuses on single-agent benchmark performance on
GitHub issues, demonstrating that tool-augmented agents can resolve real-world
bugs.
These systems share a common limitation: roles are defined by \emph{job
descriptions} rather than \emph{cognitive orientations}. A ``product manager''
agent may behave identically to a ``tech lead'' agent when both receive the same
context, because the role boundary is semantic rather than structural. ArcheFlow
addresses this through attention filters (Section~\ref{sec:attention}) that
physically restrict what each agent perceives, ensuring that role differences
manifest in behavior rather than merely in prompts.
\subsection{Persona Stability in Language Models}
\citet{lu2026assistant} identify the ``Assistant Axis'' in LLM activation
space---a linear direction capturing the degree to which a model operates in its
default helpful mode versus an alternative persona. Their key findings are
directly relevant to multi-agent orchestration:
\begin{enumerate}
\item \textbf{Persona space is low-dimensional}: only 4--19 principal
components explain 70\% of persona variance across 275 character archetypes.
\item \textbf{Drift is predictable}: user message embeddings predict response
position along the Assistant Axis ($R^2 = 0.53$--$0.77$).
\item \textbf{Drift correlates with harm}: models are more liable to produce
harmful outputs when drifted from the Assistant identity ($r = 0.39$--$0.52$).
\end{enumerate}
ArcheFlow's shadow detection (Section~\ref{sec:shadows}) can be understood as an
\emph{application-level} analog to activation capping: where \citet{lu2026assistant}
constrain neural activations to maintain persona stability, ArcheFlow constrains
\emph{behavioral outputs} through quantitative triggers and corrective prompts.
Both approaches recognize that productive personas require active stabilization,
not merely initial assignment.
\subsection{Quality Cycles in Software Engineering}
The Plan--Do--Check--Act (PDCA) cycle, formalized by \citet{deming1986out} and
rooted in Shewhart's statistical process control \citep{shewhart1939statistical},
is the dominant quality improvement framework in manufacturing and has been
applied to software engineering through agile retrospectives and continuous
improvement. To our knowledge, ArcheFlow is the first system to apply PDCA
cycles to multi-agent LLM orchestration with formal convergence detection and
oscillation analysis.
\subsection{Jungian Archetypes in Computing}
While Jungian archetypes have been applied in user experience design
\citep{hartson2012ux}, brand strategy, and game design, their application to
AI agent systems is novel. The closest related work is in computational
creativity, where archetypal narratives have been used to structure story
generation \citep{winston2011strong}. ArcheFlow extends this to software
engineering by mapping archetypal virtues and shadows to measurable engineering
outcomes.
% ============================================================
\section{Architecture}
\label{sec:architecture}
ArcheFlow is a plugin for Claude Code that operates entirely through prompt
engineering, shell scripts, and file-based communication. It has zero runtime
dependencies beyond Bash and a compatible LLM backend.
\begin{figure}[t]
\centering
\begin{tikzpicture}[
node distance=1.2cm and 2cm,
phase/.style={draw, rounded corners, minimum width=2.5cm, minimum height=0.8cm, font=\small\bfseries},
agent/.style={draw, rounded corners, minimum width=2cm, minimum height=0.6cm, font=\small, fill=blue!5},
arrow/.style={-{Stealth[length=3mm]}, thick},
label/.style={font=\scriptsize, text=gray},
]
% PDCA Cycle
\node[phase, fill=yellow!20] (plan) {Plan};
\node[phase, fill=green!20, right=of plan] (do) {Do};
\node[phase, fill=orange!20, right=of do] (check) {Check};
\node[phase, fill=red!15, right=of check] (act) {Act};
% Plan agents
\node[agent, below left=0.8cm and 0.3cm of plan] (explorer) {Explorer};
\node[agent, below right=0.8cm and 0.3cm of plan] (creator) {Creator};
% Do agent
\node[agent, below=0.8cm of do] (maker) {Maker};
% Check agents
\node[agent, below left=0.8cm and -0.2cm of check] (guardian) {Guardian};
\node[agent, below=0.8cm of check] (skeptic) {Skeptic};
\node[agent, below right=0.8cm and -0.2cm of check] (sage) {Sage};
% Arrows
\draw[arrow] (plan) -- (do);
\draw[arrow] (do) -- (check);
\draw[arrow] (check) -- (act);
\draw[arrow, dashed] (act.south) -- ++(0,-0.5) -| node[label, below, pos=0.25] {cycle back} (plan.south);
% Agent connections
\draw[-] (plan.south) -- (explorer.north);
\draw[-] (plan.south) -- (creator.north);
\draw[-] (do.south) -- (maker.north);
\draw[-] (check.south) -- (guardian.north);
\draw[-] (check.south) -- (skeptic.north);
\draw[-] (check.south) -- (sage.north);
\end{tikzpicture}
\caption{ArcheFlow PDCA cycle with archetypal agent assignments. The dashed arrow represents cycle-back when reviewers find issues. A Trickster agent (not shown) joins the Check phase in \texttt{thorough} workflows.}
\label{fig:pdca}
\end{figure}
\subsection{Components}
The system comprises four component types:
\begin{description}
\item[Agent personas] (\texttt{agents/*.md}): Behavioral protocols for each
archetype, defining the agent's cognitive lens, output format, and quality
criteria. Each persona is a Markdown file loaded as a system prompt.
\item[Skills] (\texttt{skills/*/SKILL.md}): Operational instructions that
Claude Code follows to orchestrate the PDCA cycle. The core \texttt{run} skill
(466 lines) is self-contained---it encodes the complete orchestration protocol
including workflow selection, agent spawning, attention filtering, convergence
checking, and exit decisions.
\item[Library scripts] (\texttt{lib/*.sh}): Ten Bash scripts handling
infrastructure concerns: JSONL event logging, git operations (per-phase
commits, branch management, rollback), cross-run memory, progress tracking,
effectiveness scoring, and run replay.
\item[Hooks] (\texttt{hooks/}): Session-start hook that auto-activates
ArcheFlow and injects the domain detection logic.
\end{description}
\subsection{Execution Modes}
ArcheFlow provides three execution modes optimized for different use cases:
\begin{description}
\item[Sprint] (\texttt{/af-sprint}): Queue-driven parallel dispatch. Reads a
priority-ordered task queue, spawns 3--5 agents across different projects
simultaneously, collects results, commits, and starts the next batch. Designed
for throughput over ceremony.
\item[Review] (\texttt{/af-review}): Guardian-led post-implementation review
on existing diffs, branches, or commit ranges. No planning or implementation
orchestration---pure quality analysis.
\item[Run] (\texttt{/af-run}): Full PDCA orchestration for complex tasks
requiring structured exploration, design, implementation, and multi-perspective
review.
\end{description}
\subsection{Domain Adaptation}
ArcheFlow adapts its terminology and quality criteria based on domain detection:
\texttt{code} (diffs, tests, security), \texttt{writing} (voice consistency,
dialect authenticity, narrative structure), and \texttt{research} (source quality,
argument coherence, citation accuracy). Domain is auto-detected from project
contents or specified in configuration.
% ============================================================
\section{The Seven Archetypes}
\label{sec:archetypes}
Each archetype embodies a cognitive orientation with a defined virtue (productive
mode) and shadow (destructive mode). Table~\ref{tab:archetypes} summarizes the
complete taxonomy.
\begin{table}[t]
\centering
\caption{The seven ArcheFlow archetypes with their PDCA phase assignments,
cognitive virtues, and shadow failure modes.}
\label{tab:archetypes}
\begin{tabular}{@{}lllll@{}}
\toprule
\textbf{Archetype} & \textbf{Phase} & \textbf{Virtue} & \textbf{Shadow} & \textbf{Model Tier} \\
\midrule
Explorer & Plan & Contextual Clarity & Rabbit Hole & Haiku \\
Creator & Plan & Decisive Framing & Over-Architect & Sonnet \\
Maker & Do & Execution Discipline & Rogue & Sonnet \\
Guardian & Check & Threat Intuition & Paranoid & Sonnet \\
Skeptic & Check & Assumption Surfacing & Paralytic & Haiku \\
Trickster & Check & Adversarial Creativity & False Alarm & Haiku \\
Sage & Check & Maintainability Judgment & Bureaucrat & Haiku \\
\bottomrule
\end{tabular}
\end{table}
The archetype--shadow pairing is not metaphorical; it is the core mechanism
for maintaining agent quality. The virtue describes \emph{what} the archetype
contributes; the shadow describes what happens when that contribution becomes
excessive. An Explorer who never stops researching (Rabbit Hole) delays the
entire pipeline. A Guardian who rejects everything (Paranoid) prevents any
code from shipping.
\subsection{Cost-Aware Model Assignment}
Not all archetypes require the same model capability. Analytical tasks
(exploration, assumption checking, code quality review) can be performed by
cheaper models (Haiku), while creative tasks (architecture design,
implementation, security analysis) benefit from more capable models (Sonnet).
This tiered assignment reduces per-run costs by 40--60\% compared to using the
most capable model for all agents, with no observed quality degradation in
analytical roles.
% ============================================================
\section{Shadow Detection and Corrective Action}
\label{sec:shadows}
\subsection{Archetype Shadows}
Shadow detection is \emph{quantitative, not sentiment-based}. Each archetype has
a specific trigger condition derived from structural properties of its output:
\begin{table}[h]
\centering
\caption{Shadow detection triggers. Each trigger is evaluated automatically
after the agent completes.}
\label{tab:shadows}
\begin{tabular}{@{}lll@{}}
\toprule
\textbf{Archetype} & \textbf{Shadow} & \textbf{Trigger} \\
\midrule
Explorer & Rabbit Hole & Output $> 2000$ words without Recommendation section \\
Creator & Over-Architect & $> 2$ new abstractions for a single feature \\
Maker & Rogue & No tests in changeset, or files outside proposal scope \\
Guardian & Paranoid & CRITICAL:WARNING ratio $> 2{:}1$, or zero approvals \\
Skeptic & Paralytic & $> 7$ challenges with $< 50\%$ having alternatives \\
Trickster & False Alarm & Findings in untouched code, or $> 10$ total findings \\
Sage & Bureaucrat & Review length $> 2\times$ code change length \\
\bottomrule
\end{tabular}
\end{table}
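The triggers above are simple structural predicates over agent output. A minimal sketch of three of them in Python (field names and signatures are illustrative, not ArcheFlow's actual artifact schema):

```python
def explorer_rabbit_hole(words: int, has_recommendation: bool) -> bool:
    # Rabbit Hole: > 2000 words without a Recommendation section.
    return words > 2000 and not has_recommendation

def guardian_paranoid(critical: int, warning: int, approvals: int) -> bool:
    # Paranoid: CRITICAL:WARNING ratio > 2:1, or zero approvals.
    return critical > 2 * warning or approvals == 0

def skeptic_paralytic(challenges: list[bool]) -> bool:
    # Paralytic: > 7 challenges with < 50% offering an alternative.
    # Each entry is True if that challenge includes an alternative.
    with_alt = sum(challenges)
    return len(challenges) > 7 and with_alt < 0.5 * len(challenges)
```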
The escalation protocol follows a three-strike pattern:
\begin{enumerate}
\item \textbf{First detection}: Inject a correction prompt that names the
shadow and redirects the agent toward its virtue.
\item \textbf{Second detection} (same shadow, same run): Replace the agent
with a fresh instance.
\item \textbf{Third detection}: Escalate to the user for manual intervention.
\end{enumerate}
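The three-strike pattern can be sketched as a small dispatch function (illustrative, not the plugin's actual control flow):

```python
def escalation_action(detections: int) -> str:
    # Three-strike ladder for repeated detections of the same shadow
    # within one run.
    if detections <= 1:
        return "inject-correction"   # name the shadow, redirect to the virtue
    if detections == 2:
        return "replace-agent"       # fresh instance of the same archetype
    return "escalate-to-user"        # manual intervention
```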
\subsection{System Shadows}
Beyond individual archetype dysfunction, ArcheFlow monitors for
\emph{system-level} failure modes:
\begin{description}
\item[Echo Chamber]: Multiple reviewers produce identical findings, suggesting
they are confirming each other rather than applying independent judgment.
Detected when $> 60\%$ of findings across reviewers share the same
file-and-category tuple.
\item[Tunnel Vision]: All findings cluster in a single file or module while
the changeset spans multiple. Detected when $> 80\%$ of findings target
$< 20\%$ of changed files.
\item[Scope Creep]: Maker modifies files not mentioned in the Creator's
proposal. Detected by comparing \texttt{do-maker-files.txt} against the
proposal's file list.
\end{description}
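The first two system-shadow detectors reduce to counting over pooled findings; a sketch under the thresholds above (data shapes are illustrative):

```python
from collections import Counter

def echo_chamber(findings: list[tuple[str, str]]) -> bool:
    # Echo Chamber: > 60% of pooled findings share one (file, category) tuple.
    if not findings:
        return False
    _, top = Counter(findings).most_common(1)[0]
    return top / len(findings) > 0.6

def tunnel_vision(finding_files: list[str], changed_files: set[str]) -> bool:
    # Tunnel Vision: > 80% of findings target < 20% of changed files.
    if not finding_files or not changed_files:
        return False
    covered, used = 0, 0
    for _, n in Counter(finding_files).most_common():
        covered += n
        used += 1
        if covered > 0.8 * len(finding_files):
            break
    return used < 0.2 * len(changed_files)
```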
\subsection{Policy Boundaries}
The third layer enforces operational limits:
\begin{itemize}
\item \textbf{Budget enforcement}: per-run token limits with per-agent
tracking.
\item \textbf{Cycle limits}: maximum PDCA iterations (1/2/3 for
fast/standard/thorough).
\item \textbf{Checkpoint frequency}: mandatory progress saves to prevent
lost work on interruption.
\end{itemize}
\subsection{Connection to the Assistant Axis}
The shadow detection framework addresses the same fundamental problem identified
by \citet{lu2026assistant}: models drift from productive personas during
extended operation. Where their work identifies drift in activation space and
proposes activation capping as a mitigation, ArcheFlow operates at the
\emph{behavioral} level---detecting drift through output structure rather than
internal representations, and correcting through prompt injection rather than
activation manipulation.
This application-level approach has a practical advantage: it requires no access
to model internals and works with any LLM backend, including API-only models
where activation-level interventions are impossible. The tradeoff is that
behavioral detection is necessarily coarser than activation-level measurement
and can only detect drift after it manifests in output, not before.
% ============================================================
\section{Attention Filters and Information Flow}
\label{sec:attention}
A key design principle is that each agent receives \emph{only the information
relevant to its role}. This is implemented through \emph{attention filters}---rules
governing which artifacts from prior phases are injected into each agent's
context.
\begin{table}[h]
\centering
\caption{Attention filter matrix. Each agent receives only the artifacts marked
with \checkmark.}
\label{tab:attention}
\begin{tabular}{@{}lccccc@{}}
\toprule
\textbf{Agent} & \textbf{Task} & \textbf{Explorer} & \textbf{Creator} & \textbf{Diff} & \textbf{Reviews} \\
\midrule
Explorer & \checkmark & & & & \\
Creator & \checkmark & \checkmark & & & \\
Maker & \checkmark & & \checkmark & & \\
Guardian & & & (risks) & \checkmark & \\
Skeptic & & & \checkmark & & \\
Sage & & & \checkmark & \checkmark & \\
Trickster & & & & \checkmark & \\
\bottomrule
\end{tabular}
\end{table}
The rationale for attention filtering is twofold:
\begin{enumerate}
\item \textbf{Independence}: Reviewers who see each other's findings tend to
converge on a shared narrative rather than applying independent judgment. By
isolating reviewer inputs, ArcheFlow ensures that each reviewer contributes a
genuinely distinct perspective.
\item \textbf{Focus}: An agent given everything tends to address everything,
producing diluted analysis. The Trickster, for example, receives \emph{only}
the diff---no design rationale, no risk analysis---forcing it to evaluate the
code purely on its own terms.
\end{enumerate}
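The matrix above can be encoded as plain data, making the filter a one-line dictionary lookup (artifact keys are illustrative, not ArcheFlow's actual file names):

```python
# Which prior-phase artifacts each agent is allowed to see.
ATTENTION = {
    "explorer":  {"task"},
    "creator":   {"task", "explorer-report"},
    "maker":     {"task", "creator-proposal"},
    "guardian":  {"creator-risks", "diff"},   # risks section only, plus diff
    "skeptic":   {"creator-proposal"},
    "sage":      {"creator-proposal", "diff"},
    "trickster": {"diff"},                    # diff only: no rationale, no risks
}

def context_for(agent: str, artifacts: dict[str, str]) -> dict[str, str]:
    # Inject only the artifacts this agent's filter permits.
    return {k: v for k, v in artifacts.items() if k in ATTENTION[agent]}
```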
In PDCA cycle 2+, the feedback from the Act phase is routed selectively:
Creator-routed issues go to the Creator, Maker-routed issues go to the Maker.
Neither sees the other's feedback, preventing defensive responses to criticism
that was directed elsewhere.
% ============================================================
\section{Feedback Routing}
\label{sec:routing}
When the Check phase identifies issues, the Act phase must decide where to route
each finding for the next cycle. ArcheFlow uses a deterministic routing table
based on the source archetype and finding category:
\begin{table}[h]
\centering
\caption{Feedback routing table. Findings are routed to the agent best equipped
to address them, preventing cross-contamination.}
\label{tab:routing}
\begin{tabular}{@{}llll@{}}
\toprule
\textbf{Source} & \textbf{Category} & \textbf{Routes To} & \textbf{Rationale} \\
\midrule
Guardian & security, breaking-change & Creator & Design must change \\
Guardian & reliability, dependency & Creator & Architectural decision \\
Skeptic & design, scalability & Creator & Assumptions need revision \\
Sage & quality, consistency & Maker & Implementation refinement \\
Sage & testing & Maker & Test gap, not design flaw \\
Trickster & reliability (design flaw) & Creator & Needs redesign \\
Trickster & reliability (test gap) & Maker & Needs more tests \\
\bottomrule
\end{tabular}
\end{table}
The disambiguation principle: if fixing the issue requires changing the
\emph{approach}, route to Creator. If it requires changing the \emph{code within
the existing approach}, route to Maker. Findings that persist across two
consecutive cycles are escalated to the user rather than cycled indefinitely.
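Because the routing table is deterministic data, it can be sketched as a dictionary with an explicit escalation fallback (category keys are illustrative, condensed from the table above):

```python
ROUTES = {
    ("guardian",  "security"):        "creator",
    ("guardian",  "breaking-change"): "creator",
    ("guardian",  "reliability"):     "creator",
    ("guardian",  "dependency"):      "creator",
    ("skeptic",   "design"):          "creator",
    ("skeptic",   "scalability"):     "creator",
    ("sage",      "quality"):         "maker",
    ("sage",      "consistency"):     "maker",
    ("sage",      "testing"):         "maker",
    ("trickster", "design-flaw"):     "creator",
    ("trickster", "test-gap"):        "maker",
}

def route(source: str, category: str) -> str:
    # Unknown (source, category) pairs escalate rather than guess.
    return ROUTES.get((source, category), "user")
```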
% ============================================================
\section{Convergence Detection}
\label{sec:convergence}
\subsection{Convergence Score}
In PDCA cycle 2+, ArcheFlow compares current findings against the previous cycle
and classifies each as \textsc{New}, \textsc{Resolved}, \textsc{Persistent}, or
\textsc{Regressed}. The convergence score is:
\begin{equation}
C = \frac{|\textsc{Resolved}|}{|\textsc{Resolved}| + |\textsc{New}| + |\textsc{Regressed}|}
\label{eq:convergence}
\end{equation}
\begin{table}[h]
\centering
\caption{Convergence score interpretation and corresponding actions.}
\label{tab:convergence}
\begin{tabular}{@{}lll@{}}
\toprule
\textbf{Score Range} & \textbf{Status} & \textbf{Action} \\
\midrule
$C > 0.8$ & Converging & Continue if cycles remain \\
$0.5 \leq C \leq 0.8$ & Stalling & Continue with caution \\
$C < 0.5$ & Diverging & Stop if 2 consecutive diverging cycles \\
$C = 0$ & Stuck & Stop immediately \\
\bottomrule
\end{tabular}
\end{table}
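The score and its interpretation thresholds can be sketched directly (the empty-denominator case, treated here as converged, is our assumption and is not specified in the text):

```python
def convergence(resolved: int, new: int, regressed: int) -> float:
    # Convergence score: resolved / (resolved + new + regressed).
    denom = resolved + new + regressed
    return resolved / denom if denom else 1.0  # assumption: no deltas = converged

def convergence_status(c: float) -> str:
    # Thresholds from the interpretation table.
    if c > 0.8:
        return "converging"
    if c >= 0.5:
        return "stalling"
    return "diverging" if c > 0 else "stuck"
```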
\subsection{Oscillation Detection}
A finding is \emph{oscillating} if it was present in cycle $n-2$, absent in
cycle $n-1$, and present again in cycle $n$. Two or more oscillating findings
trigger an immediate stop with escalation to the user, as oscillation indicates
a fundamental tension in the review criteria that automated cycles cannot
resolve.
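Oscillation reduces to set algebra over per-cycle finding identifiers; a sketch:

```python
def oscillating(history: list[set[str]]) -> set[str]:
    # history: per-cycle sets of finding IDs, oldest first.
    # Oscillation: present in cycle n-2, absent in n-1, present again in n.
    if len(history) < 3:
        return set()
    n2, n1, n0 = history[-3:]
    return (n2 - n1) & n0

def stop_for_oscillation(history: list[set[str]]) -> bool:
    # Two or more oscillating findings trigger an immediate stop.
    return len(oscillating(history)) >= 2
```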
\subsection{Adaptive Workflow Escalation}
Convergence detection interacts with workflow selection through Rule A1: if a
\texttt{fast} workflow is running and Guardian finds $\geq 2$ CRITICAL findings, the next
cycle escalates to \texttt{standard} (adding Skeptic and Sage reviewers). Once
escalated, the workflow remains escalated for the duration of the run.
Conversely, Rule A2 provides a \emph{fast-path}: if Guardian finds zero CRITICAL
and zero WARNING findings, remaining reviewers are skipped entirely, and the
system proceeds directly to Act. This optimization reduces the cost of runs
where the Maker's implementation is clean.
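Rules A1 and A2 can be sketched as two pure functions (names are illustrative):

```python
def next_workflow(current: str, guardian_criticals: int) -> str:
    # Rule A1: a fast workflow escalates to standard on >= 2 CRITICAL findings.
    # Once escalated, the workflow stays escalated for the rest of the run.
    if current == "fast" and guardian_criticals >= 2:
        return "standard"
    return current

def fast_path(guardian_criticals: int, guardian_warnings: int) -> bool:
    # Rule A2: a clean Guardian report skips the remaining reviewers.
    return guardian_criticals == 0 and guardian_warnings == 0
```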
% ============================================================
\section{Evidence Validation}
\label{sec:evidence}
Reviewer findings are subject to evidence validation before they influence
routing decisions. A CRITICAL or WARNING finding is downgraded to INFO if:
\begin{itemize}
\item It uses \emph{banned hedging phrases} without supporting evidence:
``might be'', ``could potentially'', ``appears to'', ``seems like'', ``may not''.
\item It contains \emph{no evidence}: no command output, code citation, line
reference, or reproduction steps.
\end{itemize}
This mechanism addresses a well-known failure mode of LLM reviewers: generating
plausible-sounding but unsupported concerns. By requiring evidence for
high-severity findings, ArcheFlow forces reviewers to ground their analysis in
the actual changeset rather than speculation.
Downgrades are tracked in the event log but do \emph{not} modify the original
artifact files, preserving the complete reviewer output for post-run analysis.
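A sketch of the downgrade check (the evidence pattern here is a toy stand-in for the real checks for command output, code citations, line references, and reproduction steps):

```python
import re

HEDGES = ("might be", "could potentially", "appears to", "seems like", "may not")
EVIDENCE = re.compile(r"line \d+|```|\$ ", re.IGNORECASE)  # toy pattern

def validated_severity(severity: str, text: str) -> str:
    if severity not in ("CRITICAL", "WARNING"):
        return severity
    hedged = any(h in text.lower() for h in HEDGES)
    evidenced = bool(EVIDENCE.search(text))
    # Both downgrade conditions require missing evidence; an unsupported
    # hedge is the special case the paper calls out explicitly.
    if (hedged and not evidenced) or not evidenced:
        return "INFO"
    return severity
```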
% ============================================================
\section{Effectiveness Scoring}
\label{sec:effectiveness}
After each completed run, ArcheFlow scores review archetypes across five
dimensions:
\begin{table}[h]
\centering
\caption{Effectiveness scoring dimensions and their weights.}
\label{tab:effectiveness}
\begin{tabular}{@{}lp{7cm}r@{}}
\toprule
\textbf{Dimension} & \textbf{Description} & \textbf{Weight} \\
\midrule
Signal-to-noise & Ratio of useful findings to total findings & 0.30 \\
Fix rate & Fraction of findings that led to applied fixes & 0.25 \\
Cost efficiency & Useful findings per dollar of model inference cost & 0.20 \\
Accuracy & Fraction not contradicted by other reviewers & 0.15 \\
Cycle impact & Whether findings contributed to cycle exit decision & 0.10 \\
\bottomrule
\end{tabular}
\end{table}
Scores accumulate in a cross-run memory file
(\texttt{.archeflow/memory/effectiveness.jsonl}). After 10+ completed runs,
the system recommends model tier changes (e.g., promoting a Haiku-tier reviewer
to Sonnet if its signal-to-noise is consistently high) and, in extreme cases,
archetype removal for persistently low-scoring reviewers.
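The composite score is a weighted sum over the five dimensions; a sketch assuming each dimension is normalized to $[0,1]$:

```python
WEIGHTS = {
    "signal_to_noise": 0.30,
    "fix_rate":        0.25,
    "cost_efficiency": 0.20,
    "accuracy":        0.15,
    "cycle_impact":    0.10,
}

def effectiveness(scores: dict[str, float]) -> float:
    # Weighted sum; weights total 1.0, so a perfect reviewer scores 1.0.
    return sum(w * scores[d] for d, w in WEIGHTS.items())
```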
% ============================================================
\section{Cross-Run Memory}
\label{sec:memory}
ArcheFlow maintains a lesson-learning system that persists across runs. When
recurring findings are detected---the same category of issue appearing in
multiple runs---the system stores a lesson and injects it into future agents
as additional context.
Lessons decay over time: each lesson has a relevance counter that increments on
reuse and decrements on irrelevance. Lessons that fall below a threshold are
archived rather than injected, preventing the accumulation of stale guidance.
The memory system also performs regression detection: if a previously resolved
issue reappears, it is flagged as a regression with higher priority than a
fresh finding.
% ============================================================
\section{Implementation}
\label{sec:implementation}
ArcheFlow is implemented in approximately 6,700 lines across three layers:
\begin{itemize}
\item \textbf{Skills} (19 Markdown files, $\sim$2,500 lines): Operational
instructions for Claude Code, written as imperative protocols. The core
\texttt{run} skill encodes the complete PDCA orchestration in 466 lines.
\item \textbf{Agent personas} (7 Markdown files, $\sim$700 lines): Behavioral
protocols defining each archetype's cognitive lens, output format, and
self-review checklist.
\item \textbf{Library scripts} (10 Bash scripts, $\sim$3,500 lines): Event
logging, git operations, memory management, progress tracking, effectiveness
scoring, and run replay.
\end{itemize}
The system uses no database, no API server, and no runtime dependencies beyond
Bash 4+ and a Claude Code installation. All state is stored in JSONL event logs
and Markdown artifact files. This zero-dependency architecture was a deliberate
design choice: orchestration infrastructure that itself requires complex setup
and maintenance undermines the autonomy it is supposed to enable.
\subsection{Git Integration}
ArcheFlow creates per-phase commits, enabling fine-grained rollback. The Maker
operates in a git worktree---an isolated working copy---so its changes do not
affect the main branch until explicitly merged. If post-merge tests fail, the
system auto-reverts the merge and cycles back with ``integration test failure''
feedback.
\subsection{Run Replay}
All orchestration decisions are logged as \texttt{decision.point} events,
enabling post-hoc analysis. The replay system provides:
\begin{itemize}
\item \textbf{Timeline view}: chronological sequence of all decisions with
confidence scores.
\item \textbf{Weighted what-if}: re-evaluation of the ship/block outcome
using different reviewer weights, answering questions like ``would the outcome
have changed if we weighted Guardian 2x and Sage 0.5x?''
\item \textbf{Cross-run comparison}: side-by-side analysis of decision
patterns across runs.
\end{itemize}
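The weighted what-if reduces to re-summing reviewer votes under new weights; a sketch with illustrative vote values and tie rule:

```python
def ship_decision(votes: dict[str, str], weights: dict[str, float]) -> str:
    # Re-evaluate the ship/block outcome under alternative reviewer weights.
    ship = sum(w for r, w in weights.items() if votes.get(r) == "ship")
    block = sum(w for r, w in weights.items() if votes.get(r) == "block")
    return "ship" if ship > block else "block"  # ties block, conservatively
```

With Guardian weighted $2\times$ and Sage $0.5\times$, the same votes can flip the outcome, which is exactly the question the replay answers.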
% ============================================================
\section{Multi-Domain Application}
\label{sec:domains}
ArcheFlow's archetype system extends beyond code. The framework has been
deployed across three domains:
\subsection{Software Engineering}
The primary domain. Archetypes map to standard engineering roles: Explorer
performs codebase research, Creator designs architecture, Maker writes code,
and the Check-phase archetypes review for security (Guardian), design flaws
(Skeptic), edge cases (Trickster), and overall quality (Sage).
\subsection{Creative Writing}
In writing mode, the same archetype structure applies with adapted quality
criteria. Custom archetypes (story-explorer, story-sage) replace or augment
the defaults. The framework integrates with Colette, a voice profiling system
that maintains consistent authorial voice across chapters. Quality gates check
for voice consistency, dialect authenticity, and narrative structure rather
than test coverage and security.
\subsection{Academic Research}
In research mode, quality criteria shift to source quality, argument coherence,
citation accuracy, and methodological rigor. The Guardian reviews for logical
fallacies and unsupported claims rather than security vulnerabilities.
% ============================================================
\section{Discussion}
\label{sec:discussion}
\subsection{Archetypes vs. Role Descriptions}
The key distinction between ArcheFlow's approach and prior multi-agent systems
is the \emph{shadow} mechanism. A role description tells an agent what to do;
an archetype tells an agent what to do \emph{and what doing too much of it
looks like}. This bidirectional specification creates a bounded operating
range for each agent, preventing the unbounded optimization that leads to
dysfunction.
The connection to \citet{lu2026assistant}'s persona axis is instructive.
They show that model personas exist on a continuum, with the Assistant identity
at one extreme and theatrical/mystical identities at the other. ArcheFlow's
archetypes deliberately position agents \emph{away} from the default Assistant
toward specific cognitive orientations---but the shadow mechanism prevents them
from drifting too far, maintaining a productive operating range analogous to
what \citeauthor{lu2026assistant} achieve through activation capping.
\subsection{Limitations}
\begin{enumerate}
\item \textbf{No activation-level control}: ArcheFlow operates purely at the
prompt level. It cannot detect persona drift before it manifests in output,
unlike activation-level approaches \citep{lu2026assistant}.
\item \textbf{Single LLM backend}: The current implementation targets Claude
Code. While the architectural principles are model-agnostic, the skill and
hook system is specific to Claude Code's plugin API.
\item \textbf{Evaluation methodology}: We have not conducted controlled
experiments comparing ArcheFlow's output quality against baselines (single-agent,
role-based multi-agent without shadows, PDCA without archetypes). The system
has been evaluated through production use across real projects, which
demonstrates practical utility but not causal attribution.
\item \textbf{Shadow trigger thresholds}: The quantitative thresholds
(e.g., 2000 words for Rabbit Hole, ratio $> 2{:}1$ for Paranoid) were
determined empirically through iterative use and may not generalize across
all codebases and domains.
\end{enumerate}
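For concreteness, the two thresholds cited above can be written as predicates.
The function names and the choice of operands for the Paranoid ratio are our
assumptions; only the numeric thresholds come from the system.
\begin{lstlisting}[language=Python]
# The two empirical thresholds from the limitations above, expressed as
# predicates; names and the exact ratio operands are hypothetical.
def rabbit_hole(research_words, limit=2000):
    """Explorer shadow: research output exceeds its word budget."""
    return research_words > limit

def paranoid(security_comments, real_findings):
    """Guardian shadow: flagged issues outnumber real findings > 2:1."""
    return security_comments > 2 * real_findings
\end{lstlisting}
Because both are simple inequalities over observable output statistics, they
are cheap to evaluate but, as noted, may need per-codebase recalibration.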
\subsection{Future Work}
\begin{enumerate}
\item \textbf{Activation-level integration}: Combining behavioral shadow
detection with the Assistant Axis measurement from \citet{lu2026assistant}
could provide earlier and more reliable drift detection, particularly for
open-weight models where activations are accessible.
\item \textbf{Controlled evaluation}: A systematic comparison across standard
benchmarks (SWE-bench, HumanEval) would establish whether the archetype +
PDCA approach provides measurable quality improvements over simpler
orchestration strategies.
\item \textbf{Archetype discovery}: Rather than hand-designing archetypes,
the persona space analysis from \citet{lu2026assistant} could be used to
identify \emph{natural} cognitive orientations that models adopt, potentially
revealing useful archetypes that human intuition would not suggest.
\item \textbf{Cross-model persona stability}: Investigating whether shadow
triggers calibrated for one model family transfer to others, or whether
per-model calibration is necessary.
\end{enumerate}
% ============================================================
\section{Conclusion}
\label{sec:conclusion}
ArcheFlow demonstrates that multi-agent LLM orchestration benefits from
structured persona management---not just telling agents \emph{what to do},
but actively monitoring and correcting \emph{how they do it}. The combination
of Jungian archetypes (providing a principled taxonomy of cognitive virtues and
their failure modes) with PDCA quality cycles (providing convergence guarantees
and principled stopping criteria) produces an orchestration framework that
maintains productive agent behavior across extended autonomous sessions.
The shadow detection mechanism---quantitative triggers for archetype-specific
dysfunction---addresses the same persona stability challenge identified by
\citet{lu2026assistant} at the application level, requiring no access to model
internals and working with any LLM backend. While coarser than activation-level
approaches, behavioral shadow detection is practical, interpretable, and
immediately deployable.
ArcheFlow is open-source under the MIT license and available at
\url{https://github.com/XORwell/archeflow}.
% ============================================================
\section*{Acknowledgments}
The author thanks the Claude Code team at Anthropic for building the plugin
infrastructure that made ArcheFlow possible, and the authors of
\citet{lu2026assistant} for the Assistant Axis framework that informed the
theoretical grounding of shadow detection.
% ============================================================
\bibliographystyle{plainnat}
\bibliography{references}
\end{document}

89
paper/references.bib Normal file

@@ -0,0 +1,89 @@
@article{lu2026assistant,
title={The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models},
author={Lu, Christina and Gallagher, Jack and Michala, Jonathan and Fish, Kyle and Lindsey, Jack},
journal={arXiv preprint arXiv:2601.10387},
year={2026},
url={https://arxiv.org/abs/2601.10387}
}
@book{jung1968archetypes,
title={The Archetypes and the Collective Unconscious},
author={Jung, Carl Gustav},
year={1968},
publisher={Princeton University Press},
edition={2nd},
series={Collected Works of C.G. Jung},
volume={9}
}
@book{deming1986out,
title={Out of the Crisis},
author={Deming, W. Edwards},
year={1986},
publisher={MIT Press},
address={Cambridge, MA}
}
@book{shewhart1939statistical,
title={Statistical Method from the Viewpoint of Quality Control},
author={Shewhart, Walter Andrew},
year={1939},
publisher={Graduate School of the Department of Agriculture},
address={Washington, DC}
}
@article{hong2024metagpt,
title={MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework},
author={Hong, Sirui and Zhuge, Mingchen and Chen, Jonathan and Zheng, Xiawu and Cheng, Yuheng and Zhang, Ceyao and Wang, Jinlin and Wang, Zili and Yau, Steven Ka Shing and Lin, Zijuan and Zhou, Liyang and Ran, Chenyu and Xiao, Lingfeng and Wu, Chenglin and Schmidhuber, J{\"u}rgen},
journal={arXiv preprint arXiv:2308.00352},
year={2023},
url={https://arxiv.org/abs/2308.00352}
}
@article{qian2024chatdev,
title={ChatDev: Communicative Agents for Software Development},
author={Qian, Chen and Liu, Wei and Liu, Hongzhang and Chen, Nuo and Dang, Yufan and Li, Jiahao and Yang, Cheng and Chen, Weize and Su, Yusheng and Cong, Xin and Xu, Juyuan and Li, Dahai and Liu, Zhiyuan and Sun, Maosong},
journal={arXiv preprint arXiv:2307.07924},
year={2023},
url={https://arxiv.org/abs/2307.07924}
}
@article{yang2024sweagent,
title={SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering},
author={Yang, John and Jimenez, Carlos E and Wettig, Alexander and Lieret, Kilian and Narasimhan, Karthik and Press, Ofir},
journal={arXiv preprint arXiv:2405.15793},
year={2024},
url={https://arxiv.org/abs/2405.15793}
}
@article{chen2025persona,
title={Persona Vectors: Monitoring and Controlling Character Traits in Language Models},
author={Chen, Runjin and others},
journal={arXiv preprint arXiv:2507.21509},
year={2025},
url={https://arxiv.org/abs/2507.21509}
}
@article{bai2022constitutional,
title={Constitutional AI: Harmlessness from AI Feedback},
author={Bai, Yuntao and Kadavath, Saurav and Kundu, Sandipan and Askell, Amanda and Kernion, Jackson and Jones, Andy and Chen, Anna and Goldie, Anna and Mirhoseini, Azalia and McKinnon, Cameron and others},
journal={arXiv preprint arXiv:2212.08073},
year={2022},
url={https://arxiv.org/abs/2212.08073}
}
@book{hartson2012ux,
title={The UX Book: Process and Guidelines for Ensuring a Quality User Experience},
author={Hartson, Rex and Pyla, Pardha S.},
year={2012},
publisher={Morgan Kaufmann},
address={Burlington, MA}
}
@inproceedings{winston2011strong,
title={The Strong Story Hypothesis and the Directed Perception Hypothesis},
author={Winston, Patrick Henry},
booktitle={AAAI Fall Symposium: Advances in Cognitive Systems},
year={2011},
pages={345--352}
}