Researchers from Google DeepMind, Microsoft Research, Columbia University, t54 Labs, and Virtual Protocol have proposed the Agentic Risk Standard (ARS), a framework that applies financial risk management principles to AI agent transactions. Their paper, "Measuring Trust: Financial Risk Management for Trustworthy AI Agents," describes a payment-layer protocol that uses escrow, underwriting, and collateralization to protect users from financial loss when autonomous AI systems execute tasks involving payments or assets.
The new open-source standard introduces escrow, underwriting and collateral mechanisms to protect users while AI agents manage payments and assets, applying the same financial safeguards used in construction, insurance and capital markets.
AI agents are evolving from chatbots into autonomous systems that write code, file taxes, manage customer service and handle financial transactions. Recent industry developments have underscored the growing need for a clear financial risk standard for AI agents. Incidents in which autonomous systems, such as OpenClaw agents, took unwanted financial actions or issued tokens without appropriate safeguards show how these systems can directly handle value without adequate oversight. At the same time, challenges faced by security teams at Meta and concerns around identity-related infrastructure, including integrations with protocols such as World ID, point to the additional complexity that arises when financial activity intersects with digital identity.
As these systems take on tasks with real economic consequences, users face a fundamental problem: current AI safety research focuses on improving model behavior but cannot eliminate the possibility of failure. Large language models are stochastic by nature, so no amount of training can reliably reduce the probability of failure to zero. In a 2025 autonomous crypto trading competition, for example, most AI agents lost money: one model lost 63% of its capital, while others fell by 30-56%.
When an AI trading agent executes an order incorrectly or a coding assistant introduces a critical error, the resulting damage can far exceed the cost of the service. The researchers describe this as a “guarantee gap”: the disconnect between the probabilistic reliability that AI safety techniques provide and the enforceable guarantees users need before delegating high-stakes tasks. Without a way to cap potential losses, users rationally limit AI delegation to low-risk tasks, which restricts broader adoption of agent-based services.
How does the Agentic Risk Standard work?
Rather than trying to perfect AI models, the Agentic Risk Standard takes a complementary approach inspired by how traditional industries have managed uncertainty for centuries. Financial markets use clearinghouses and margin requirements. Doctors carry malpractice insurance. Construction companies issue performance bonds. The solution is not to eliminate risk, but to price it and distribute it through financial mechanisms that protect affected parties when things go wrong.
The Agentic Risk Standard applies this logic to AI agents through two modes. For standard service tasks, such as creating reports, writing code or preparing documents, payment is held in escrow and released only after the work is verified. For tasks where agents must handle user funds before the results are known, such as trading, currency conversion or financial API calls, the Agentic Risk Standard adds a layer of underwriting: the party assuming the risk evaluates the task, prices the risk, can require the agent provider to post collateral, and commits to reimbursing the user under specified failure conditions.
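To make the underwriting mode concrete, here is a minimal Python sketch of how a risk-bearing party might price a task and settle a failure. Everything in it is an illustrative assumption rather than the paper's specification: the names `Quote`, `quote_task` and `settle`, the pricing rule (expected loss plus a loading factor), and the default `loading` and `collateral_ratio` values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    premium: float     # fee charged for coverage (hypothetical pricing rule)
    collateral: float  # amount the agent provider must post up front
    coverage: float    # maximum reimbursement on a covered failure


def quote_task(exposure: float, failure_prob: float,
               loading: float = 0.2, collateral_ratio: float = 0.5) -> Quote:
    """Price a task: expected loss plus a loading margin (assumed rule)."""
    if not 0.0 <= failure_prob <= 1.0:
        raise ValueError("failure_prob must be in [0, 1]")
    expected_loss = exposure * failure_prob
    return Quote(premium=expected_loss * (1 + loading),
                 collateral=exposure * collateral_ratio,
                 coverage=exposure)


def settle(quote: Quote, loss: float) -> dict:
    """Split a realized loss: collateral pays first, then the underwriter."""
    reimbursed = min(loss, quote.coverage)
    from_collateral = min(reimbursed, quote.collateral)
    return {
        "user_reimbursed": reimbursed,
        "from_collateral": from_collateral,
        "from_underwriter": reimbursed - from_collateral,
        "user_uncovered": loss - reimbursed,
    }


if __name__ == "__main__":
    q = quote_task(exposure=1000.0, failure_prob=0.05)
    print(q)
    print(settle(q, loss=800.0))
```

In this sketch the provider's collateral absorbs losses before the underwriter does, which is one simple way to make failure costly on the agent side; a real deployment could order or split these payments differently.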
The entire transaction lifecycle is formalized as a deterministic state machine with explicit fund control rules. This means that regardless of how an AI agent behaves internally, the financial outcome for the user is governed by auditable, enforceable payment logic.
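A deterministic state machine of this kind can be sketched in a few lines of Python. The states, event names and the `EscrowJob` class below are hypothetical and cover only a simplified escrow flow; the article does not list the standard's actual states or transitions.

```python
from enum import Enum, auto


class State(Enum):
    CREATED = auto()
    FUNDED = auto()     # user's payment is locked in escrow
    DELIVERED = auto()  # agent has submitted its work
    RELEASED = auto()   # terminal: payment goes to the agent provider
    REFUNDED = auto()   # terminal: payment returns to the user


# Allowed (state, event) -> next state. Anything not listed is rejected.
TRANSITIONS = {
    (State.CREATED, "fund"): State.FUNDED,
    (State.FUNDED, "deliver"): State.DELIVERED,
    (State.FUNDED, "timeout"): State.REFUNDED,
    (State.DELIVERED, "verify_ok"): State.RELEASED,
    (State.DELIVERED, "verify_fail"): State.REFUNDED,
}


class EscrowJob:
    def __init__(self, price: int):
        self.price = price
        self.state = State.CREATED
        self.escrow = 0
        self.paid_to_provider = 0
        self.refunded_to_user = 0

    def apply(self, event: str) -> State:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]
        # Fund movements depend only on the transition, never on the agent.
        if event == "fund":
            self.escrow = self.price
        elif self.state is State.RELEASED:
            self.paid_to_provider, self.escrow = self.escrow, 0
        elif self.state is State.REFUNDED:
            self.refunded_to_user, self.escrow = self.escrow, 0
        return self.state


if __name__ == "__main__":
    job = EscrowJob(price=100)
    for event in ("fund", "deliver", "verify_fail"):
        job.apply(event)
    print(job.state.name, job.refunded_to_user)  # REFUNDED 100
```

Because every fund movement is tied to an entry in the transition table, the same sequence of events always yields the same payout, and any event outside the table is rejected instead of being interpreted.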
The paper includes a simulation study that models users, AI agent providers, and underwriters interacting through the Agentic Risk Standard protocol across 5,000 episodes. Across all parameter configurations tested, the mechanism consistently reduced user losses relative to an ecosystem without underwriting, with loss reductions ranging from 24% to 61% depending on pricing and risk estimation settings. In addition, because fraud or abuse now carries its own cost on the agent side, the collateral mechanism independently deterred 15-20% of risky transactions from being carried out in the first place.
The results also reveal structured trade-offs: tighter underwriting increases user protection and insurer solvency, but also introduces frictions that can reduce market participation, mirroring the tensions found in traditional insurance and financial markets.
The paper was co-authored by researchers at five institutions: Wenyue Hua (Microsoft Research; the study began during an appointment at UC Santa Barbara), Tianyi Peng (Columbia University), Chi Wang (Google DeepMind), Ian Kaufman and Chandler Fang (t54 Labs), and Bryan Lim (Virtual ACP). The research represents the authors’ individual scientific contributions and does not necessarily represent the positions of their respective employers.
“Most trustworthy AI research aims to reduce the probability of failure. This work is crucial, but probability is not a guarantee. ARS takes a complementary approach: instead of trying to make the model perfect, we formalize financially what happens when it is not perfect. The result is a consensus protocol where user protection is deterministic, not probabilistic,” Hua said.
“The industry is increasingly building autonomous AI agents but not addressing what happens when they fail with someone’s money. This is the problem t54 Labs was founded to solve, and the proposed Agentic Risk Standard represents our thinking along with leading researchers across industry and academia. We are publishing this openly because the broader ecosystem needs to recognize that financial risk management for AI agents is essential, not optional,” Fang said.






