AI agent approval prompts need scoped authority, risk lanes, audit logs, expiry, and revocation so humans approve concrete actions, not fluent requests.
Human oversight is central to safe and responsible AI, but current approaches risk either collapsing agentic AI into mere automation, stripping it of its agentic character, or reducing human agency to a rubber stamp. This paper proposes a design framework that treats agency as layered: AI operative agency in task execution, and human evaluative agency in verification, steering, and substitution. Instead of demanding low-level explanations of, and controls over, how a complex AI model works internally (i.e. internal reasoning faithfulness), we focus on high-level explanations tied to external criteria and human expert understanding (external reasoning faithfulness). This approach retains the AI's operative agency while strengthening humans' evaluative agency. We also exploit the solve-verify asymmetry by designing AI outputs so that humans can efficiently check and contest them without having to re-solve the task. This paper makes three contributions. First, it develops a layered agency framework that distinguishes operative and evaluative agency and specifies where human accountability attaches in AI-enabled decision systems. Second, it reframes the explainability requirement by arguing that external reasoning faithfulness (alignment with externally articulated criteria and human expertise) is sufficient, and often preferable to internal mechanistic transparency, for enabling meaningful oversight. Third, it provides a structured catalogue of oversight mechanisms (e.g., structured rationales, reasoning traces, confidence signals, policy attribution, circuit breakers, appeal bundles) and four end-to-end design patterns that translate these principles into implementable system architectures. We also outline evaluation criteria for AI agency, human agency, and joint system agency. The framework provides AI ethicists, engineers, safety teams, users, and organisational leaders with a concrete way to design meaningful and effective oversight that preserves human accountability and ag
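One way to picture an approval that covers a concrete action rather than a fluent request is a small ledger of scoped grants. The sketch below is illustrative only and is not taken from the paper: the names `ApprovalGrant`, `ApprovalLedger`, `RiskLane`, and the per-lane lifetimes in `LANE_TTL` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum


class RiskLane(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical lifetimes: riskier lanes get shorter-lived approvals.
LANE_TTL = {
    RiskLane.LOW: timedelta(hours=1),
    RiskLane.HIGH: timedelta(minutes=5),
}


@dataclass
class ApprovalGrant:
    action: str           # the single action the human approved
    resource: str         # the single resource it applies to
    lane: RiskLane
    expires_at: datetime
    revoked: bool = False


@dataclass
class ApprovalLedger:
    grants: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def approve(self, action, resource, lane, now):
        """Record a human approval scoped to one action on one resource."""
        grant = ApprovalGrant(action, resource, lane, now + LANE_TTL[lane])
        self.grants.append(grant)
        self.audit_log.append(("approve", action, resource, now))
        return grant

    def revoke(self, grant, now):
        """Withdraw an approval before it expires."""
        grant.revoked = True
        self.audit_log.append(("revoke", grant.action, grant.resource, now))

    def is_authorized(self, action, resource, now):
        """Allow only an exact, unexpired, unrevoked match; log every check."""
        allowed = any(
            g.action == action
            and g.resource == resource
            and not g.revoked
            and now < g.expires_at
            for g in self.grants
        )
        outcome = "check_allowed" if allowed else "check_denied"
        self.audit_log.append((outcome, action, resource, now))
        return allowed
```

Under these assumptions, an approval for one action on one resource does not carry over to a different resource, lapses when its lane's lifetime ends, and can be withdrawn at any time, while every approval, revocation, and check leaves an audit entry.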