The signals of the last two weeks have been unusually clear. Agent governance is no longer being described as a policy document, a responsible-AI checklist, or a dashboard that sits beside the system. The stronger signals all point in the same direction: governance has to move into the execution path.
That shift matters because agents do not merely answer questions. They call tools, inspect files, touch calendars, retrieve records, send messages, mutate systems, hand work to other agents, and increasingly act across several providers in one run. Once a system can do that, the old governance pattern breaks. You cannot govern an actioning system only by reviewing its output after the action has already happened.
This is the line Pea has been built around. Capability is not enough. The question is whether the system can account for its own actions: who asked, what boundary the request crossed, what authority was derived, what evidence was used, what policy posture applied, which tool actually ran, what was withheld, and what can be replayed afterwards.
Agent governance is becoming less about what the model says, and more about what the runtime is allowed to commit.
The policy layer is learning to ask runtime questions
Singapore's IMDA updated its Model AI Governance Framework for Agentic AI on May 20, adding real-world case studies and best practices after feedback from more than 60 organisations. The update is interesting because it keeps the human-accountability frame, but the examples are increasingly operational: autonomy levels, tiered actions, approval checkpoints, external-tool limits, monitoring, and safe rollout stages.
That is a practical turn. The policy question is no longer just "should this agent be safe?" It becomes "which actions can this agent take without approval, which actions can it propose but not execute, which actions are off-limits, and how is that enforced at the moment the action is attempted?"
In other words, the framework conversation is starting to sound like runtime design. Responsible deployment needs more than statements of intent. It needs boundaries that can be expressed in software, evaluated before execution, recorded during execution, and reconstructed after execution.
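To make the tiering concrete, here is a minimal sketch of a policy evaluated at the moment an action is attempted. It is not taken from the IMDA framework; the tier names, the `POLICY` table, the action names, and the `decide` function are all assumptions chosen for illustration. The one design choice worth noting is that an unlisted action is treated as off-limits, so the default is deny.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"      # may run without approval
    PROPOSE_ONLY = "propose_only"  # may be drafted, needs approval to run
    FORBIDDEN = "forbidden"        # never runs


# Hypothetical policy table; the action names are illustrative only.
POLICY = {
    "calendar.read": Tier.AUTONOMOUS,
    "mail.send": Tier.PROPOSE_ONLY,
    "records.delete": Tier.FORBIDDEN,
}


def decide(action: str, approved: bool = False) -> str:
    """Return 'execute', 'hold_for_approval', or 'deny' for an attempted action."""
    # Unknown actions fall through to FORBIDDEN, so the default is deny.
    tier = POLICY.get(action, Tier.FORBIDDEN)
    if tier is Tier.AUTONOMOUS:
        return "execute"
    if tier is Tier.PROPOSE_ONLY:
        return "execute" if approved else "hold_for_approval"
    return "deny"
```

In this sketch an approval can unlock a propose-only action, but nothing unlocks a forbidden one: `decide("records.delete", approved=True)` still returns `"deny"`.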
Enterprises are naming the sprawl problem
Glean's May 12 announcement of an Enterprise Agent Development Lifecycle is another sign of the same pressure. Its language is enterprise-platform language: agents moving from fragmented experiments into governed production systems, with launch, measurement, monitoring, and improvement treated as lifecycle stages rather than one-off build tasks.
That is the enterprise version of a simple truth: if every team builds agents differently, every team invents its own authority model. One team treats a Slack event as a trigger. Another treats it as an instruction. One team lets a mail agent send directly. Another uses drafts. One team logs tool calls. Another only logs final answers. The organisation does not get an agent platform. It gets agent sprawl with a nice UI.
Sprawl is not only an operational mess. It is an accountability failure. The moment agents become embedded in business workflows, the organisation needs to know which agent acted, on whose behalf, under which scope, with which approval, and against which evidence. If those answers live inside separate frameworks, prompt chains, provider logs, and improvised middleware, the audit trail becomes archaeology.
Agents are being compared to unmanaged endpoints
The security press is using sharper language too. A May 13 TechRadar piece framed AI agents as the next unmanaged endpoint problem, comparing the current moment to earlier waves of shadow IT, mobile-device sprawl, and cloud sprawl. The analogy is useful, but agents raise the stakes because an endpoint mostly carries access. An agent can carry access, interpretation, planning, and execution at the same time.
The piece also points to the wider non-human identity problem. Machine identities already outnumber human identities by a large margin in many enterprises. Agents can multiply that problem because each connected tool, API, data source, and workflow can create another place where authority is assumed rather than explicitly derived.
Identity remains necessary, but identity alone is not governance. A valid credential proves that something can call an API. It does not prove that this action is semantically allowed, that the user actually requested it, that the approval is fresh, that the evidence is sufficient, or that the mutation belongs inside the current task boundary.
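The gap between a valid credential and an allowed action can be sketched in a few lines. Everything here is hypothetical: the `Attempt` fields, the five-minute freshness window, and the `denial_reasons` helper are assumptions for illustration, and evidence sufficiency is left out because it depends on the task. The point is only that the credential check is one test among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    credential_valid: bool     # the identity check passed
    action: str                # what the agent wants to do
    requested_by_user: bool    # traceable to an actual user request
    approval_age_s: float      # seconds since the approval was granted
    task_scope: frozenset      # actions inside the current task boundary


MAX_APPROVAL_AGE_S = 300.0  # assumed freshness window, purely illustrative


def denial_reasons(a: Attempt) -> list:
    """Collect every reason to refuse; an empty list means the attempt may proceed."""
    reasons = []
    if not a.credential_valid:
        reasons.append("invalid credential")
    if not a.requested_by_user:
        reasons.append("no user request")
    if a.approval_age_s > MAX_APPROVAL_AGE_S:
        reasons.append("stale approval")
    if a.action not in a.task_scope:
        reasons.append("outside task boundary")
    return reasons
```

An attempt with a perfectly valid credential is still refused if, say, it tries `mail.send` inside a task scoped only to `mail.draft`.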
The research language is converging on hard runtime constraints
The academic signals point in the same direction. The May 8 SARC paper argues that agentic systems act through tools, sub-agents, and external services while many controls remain attached to prompts, dashboards, or post-hoc documentation. Its proposed answer is runtime governance architecture: pre-action gates, action-time monitoring, post-action audit, and escalation routing.
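The four stages named above can be pictured as a thin wrapper around a tool call. This is a rough sketch of the general shape, not the SARC paper's design; `governed_call`, its parameters, and the use of plain lists for the audit log and escalation queue are assumptions made to keep the example self-contained.

```python
from typing import Callable, List, Optional, Set


def governed_call(
    action: str,
    tool: Callable[[], str],
    allowed: Set[str],
    audit: List[dict],
    escalations: List[str],
) -> Optional[str]:
    """Run a tool only if the action is allowed, recording what happened."""
    # Pre-action gate: refuse before any side effect can occur.
    if action not in allowed:
        audit.append({"action": action, "outcome": "blocked"})
        escalations.append(action)  # escalation routing: hand off for review
        return None
    # Action-time monitoring: observe the call itself, including failures.
    try:
        result: Optional[str] = tool()
        outcome = "ok"
    except Exception as exc:
        result = None
        outcome = f"error: {exc}"
        escalations.append(action)
    # Post-action audit: record what actually ran.
    audit.append({"action": action, "outcome": outcome})
    return result
```

A blocked action never reaches `tool()`, which is the difference between a gate and a log.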
A separate May 13 paper on proof-derived authorization makes a related claim from the infrastructure side. Standing identity is too weak for autonomous agents because an agent can produce a syntactically valid command that is still semantically unsafe. The paper's answer is to derive execution authority from structured, verifiable artifacts and preserve the authorization lifecycle in an evidence chain.
The names differ, but the pressure is the same. Governance is not a sticker on the agent. It is a constraint system around action. The runtime must know what authority exists, where it came from, when it expires, what it permits, and how to prove that a committed action stayed inside it.
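One way to picture authority that has a source, a scope, and an expiry, together with a record that can later show an action stayed inside it, is a grant object plus a hash-chained log. This is an illustrative technique chosen here, not the mechanism either paper proposes; `Grant`, `append_entry`, and `verify` are hypothetical names. A chain like this is only tamper-evident if its latest hash is also kept somewhere the writer cannot alter.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    source: str          # where the authority came from
    permits: frozenset   # actions it covers
    expires_at: float    # epoch seconds after which it is void

    def covers(self, action: str, now: float) -> bool:
        return action in self.permits and now < self.expires_at


def append_entry(chain: list, record: dict) -> None:
    """Append a record whose hash commits to the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    chain.append({"prev": prev, "record": record, "hash": digest})


def verify(chain: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        body = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Rewriting an earlier record after the fact changes its recomputed hash, so `verify` returns `False` for the altered chain.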
The governance boundary has to sit before the side effect, not after the apology.
What this means for Pea
Pea's posture is intentionally conservative here. External services are capability substrates, not authority planes. Web pages, mailboxes, calendars, drives, chats, APIs, papers, feeds, and crawler outputs can all provide evidence or bounded capabilities. They do not get to mint authority for the runtime.
That is why the architecture keeps returning to the same seams: functional-requirement ingress, semantic task projection, planning and capability lookup, Decision authority, governed dispatch, memory boundaries, provider contracts, artifact intake, redaction posture, approval gates, and replayable evidence. None of those pieces is decorative. Each one exists because agents become unsafe when capability outruns custody.
The current API work makes this visible. Google, Microsoft Graph, Zoho Mail, messaging surfaces, source-development tools, and cloud operations are useful only if they remain downstream of Pea's authority chain. Raw provider operation ids are rejected. Provider events are evidence or triggers, not instructions. Live calls fail closed until transport readiness is proven. Mutations require explicit invoke wording and approval posture. Provider payloads cannot smuggle permission into the system that consumes them.
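The rules in that paragraph read naturally as an ordered series of refusals. The sketch below is not Pea's code; the `CAPABILITIES` registry, the `Request` fields, and the `dispatch` function are invented for illustration, and the capability names are placeholders. It shows only the shape: every check must pass, and anything unrecognised is rejected rather than forwarded.

```python
from dataclasses import dataclass

# Hypothetical capability registry: name -> whether it mutates external state.
CAPABILITIES = {"mail.read_inbox": False, "mail.send_message": True}


@dataclass
class Request:
    capability: str
    origin: str                   # "user" or "provider_event"
    explicit_invoke: bool = False
    approved: bool = False


def dispatch(req: Request, transport_ready: bool) -> str:
    """Return 'dispatch' only if every check passes; otherwise a refusal reason."""
    # Only registered capability names are accepted; raw provider ids are not.
    if req.capability not in CAPABILITIES:
        return "reject: unknown or raw provider operation"
    # A provider event may inform a decision but cannot issue one.
    if req.origin != "user":
        return "reject: provider events are evidence, not instructions"
    # Fail closed: no live call until transport readiness is proven.
    if not transport_ready:
        return "reject: transport readiness not proven"
    # Mutations need both explicit invoke wording and an approval.
    if CAPABILITIES[req.capability] and not (req.explicit_invoke and req.approved):
        return "reject: mutation needs explicit invoke and approval"
    return "dispatch"
```

Note that `transport_ready` has no default value, so a caller cannot reach a live call by simply forgetting to check it.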
The crawler work follows the same logic. Research retrieval should not be a hidden browser scrape with no memory of its own conduct. PeaCrawler gives corpus intake a governed identity, respects robots.txt and crawl-delay, bounds depth and rate, avoids access controls, and writes manifests that record provenance, blocked paths, errors, and crawl limits. The runtime can read the web, but it should leave a record of how it arrived.
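As a rough illustration of a crawl that records its own conduct, the sketch below uses Python's standard `urllib.robotparser` to honour robots.txt and read the crawl-delay, applies a depth bound, and writes each decision to a manifest. It is not PeaCrawler's implementation; the `ExampleCrawler` agent string, the `MAX_DEPTH` value, the manifest layout, and the `plan_fetch` helper are assumptions, and the robots.txt content is inlined so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

AGENT = "ExampleCrawler"  # hypothetical user-agent string
MAX_DEPTH = 2             # assumed depth bound

# Inline robots.txt so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

manifest = {
    "agent": AGENT,
    "limits": {"max_depth": MAX_DEPTH, "crawl_delay_s": robots.crawl_delay(AGENT)},
    "fetched": [],
    "blocked": [],
}


def plan_fetch(url: str, depth: int, manifest: dict) -> bool:
    """Decide whether to fetch a URL and record the decision in the manifest."""
    if depth > MAX_DEPTH:
        manifest["blocked"].append({"url": url, "reason": "depth limit"})
        return False
    if not robots.can_fetch(AGENT, url):
        manifest["blocked"].append({"url": url, "reason": "robots.txt"})
        return False
    manifest["fetched"].append(url)
    return True
```

The manifest keeps the blocked paths as well as the fetched ones, so a later reader can see what the crawl declined to touch and why.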
The control plane is not separate from the product
The tempting product shortcut is to build a highly capable agent first and add governance once customers ask for procurement evidence. That path looks fast until the first serious deployment review. Then every missing boundary becomes a retrofit: memory has to be separated, approvals have to be invented, tool calls have to be typed, provider logs have to be normalized, credentials have to be contained, and old actions have to be explained after the fact.
Pea is taking the slower route because the slower route is the product. The control surface is not a compliance afterthought. It is how the runtime earns permission to act. If an agent can send a message, move a file, update a record, invoke an API, crawl a site, or retrieve private context, the runtime must be able to say why that was allowed before it happened and show what happened afterwards.
That is where the market appears to be heading. Frameworks are asking for bounded autonomy. Enterprise platforms are naming lifecycle governance. Security writers are warning about agent sprawl. Research papers are formalizing runtime constraints and proof-derived authority. The vocabulary is still settling, but the direction is not vague.
Governed agents will not be defined by how confidently they speak. They will be defined by whether their actions are bounded, attributable, inspectable, reversible where possible, and denied when authority is absent.
Source notes: this article draws on recent public signals from IMDA's May 20 framework update; Glean's May 12 Agent Development Lifecycle (ADLC) announcement; TechRadar's May 13 unmanaged-endpoint analysis; the SARC paper, submitted May 8; and the Verifiable Agentic Infrastructure paper, submitted May 13.