Agentic Intelligence Risk Management: Treating AI Agents as Third-Party Vendors

RiskNodes is now open source and repositioned as an Agentic Intelligence Risk Management platform — applying twenty years of vendor-assessment discipline to the challenge of governing AI-generated code.

Intent Drift: Is your helpful AI writing the code you expect?

AI coding assistants are spectacularly productive. They generate working code in seconds. The difficulty is that “working code” and “correct code” are not the same thing.

When an AI agent writes a function, it optimises for code that runs — code that passes syntax checks and produces output. What it does not optimise for is whether that output fulfils the original business requirement. Over time, across hundreds of commits, this gap widens. We call it intent drift: the gradual divergence between what the business needs and what the code actually does.

In low-stakes applications, intent drift is a nuisance. In regulated industries — banking, healthcare, defence — it is a liability.

An old problem in new clothes

The challenge of governing an autonomous agent that acts on your behalf is not new. Organisations have managed it for decades under a different name: third-party risk management.

When a bank outsources a function to a vendor, it does not simply trust that vendor to deliver correctly. It issues structured questionnaires. It demands evidence. It scores responses against weighted criteria. It maintains audit trails. It gates approvals through defined workflows. If the vendor’s performance drifts from the agreed standard, the process catches it.

AI coding agents are, in every meaningful sense, digital third-party vendors. They receive instructions, they produce deliverables, and they operate with a degree of autonomy that makes oversight essential. The rigour that regulated industries apply to human vendors should apply equally to their AI counterparts.

From TPRM to AIRM

RiskNodes began life in 2003 as an RFP management system for a banking research consultancy. Over two decades it has been used by major banks, airlines, and rating agencies to run structured vendor assessments. The questionnaire engine — hierarchical scoring, weighted criteria, multi-party workflows — was built for exactly the kind of disciplined, evidence-based evaluation that AI governance now demands.

We have repositioned the platform around this insight. RiskNodes is now an Agentic Intelligence Risk Management (AIRM) system: the same structured assessment infrastructure, applied to AI-generated output rather than — or as well as — human vendors.

The workflow is straightforward. When an AI agent proposes a code change, RiskNodes treats it as a submission from a vendor. The system iterates through a questionnaire defined by the project owner, presenting each question to a local LLM alongside the relevant source context — a git diff, a module, a specification. The LLM returns a structured answer: a verdict, its reasoning, and specific evidence down to the line number. RiskNodes validates the response against a Pydantic schema, records it, and moves to the next question. When the questionnaire is complete, the workflow determines the next step — automatic approval, human review, or escalation.
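The validation step above can be sketched in a few lines. RiskNodes validates responses against a Pydantic schema; the stdlib approximation below (field names and verdict values are illustrative, not the actual schema) shows the pattern: every answer carries a verdict, reasoning, and line-level evidence, and anything malformed is rejected before it enters the record.

```python
from dataclasses import dataclass

VERDICTS = {"pass", "fail", "inconclusive"}

@dataclass
class AssessmentAnswer:
    # Illustrative field names; the real schema is defined in Pydantic.
    question_id: str
    verdict: str
    reasoning: str
    evidence: list  # e.g. "src/payments.py:118"

    def __post_init__(self):
        # Reject malformed LLM output before it is recorded.
        if self.verdict not in VERDICTS:
            raise ValueError(f"invalid verdict: {self.verdict!r}")
        if self.verdict == "fail" and not self.evidence:
            raise ValueError("a failing verdict must cite evidence")

# A well-formed response is accepted and recorded...
answer = AssessmentAnswer(
    question_id="Q-07",
    verdict="fail",
    reasoning="The diff removes the audit-log call required by the design.",
    evidence=["src/payments.py:118"],
)

# ...while a malformed one is rejected and the question retried.
try:
    AssessmentAnswer(question_id="Q-08", verdict="maybe",
                     reasoning="", evidence=[])
    rejected = False
except ValueError:
    rejected = True
```

The design choice matters: because the answer is structured data rather than free text, it can be stored, scored, and queried like any other assessment record.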

The questionnaire is not a chat transcript dressed in formality. It is a structured audit. Each question isolates a specific concern. Each answer is validated, stored, and scored. The principle throughout: the machine writes the expression; the human verifies intent.

Structured checks at every level

Intent drift can enter at any point. The code might diverge from the design. But the design itself might diverge from what the business actually needs. RiskNodes addresses both, because the questionnaire — what gets asked, what evidence is required, what constitutes a passing answer — is entirely defined by the project owner.

Business intent. Before development begins, RiskNodes can present a questionnaire to the business owner that tests whether the requirements capture what the organisation actually needs. Plain questions, recorded answers, a signed-off statement of intent.

Design compliance. When code changes are proposed, a second questionnaire checks whether the implementation follows the agreed design. The LLM examines the source context, answers each question with a verdict and evidence, and RiskNodes records the result.

Continuous assurance. Because each assessment is stored, scored, and traceable, the organisation builds a cumulative record. If something fails in production, you can trace backwards through the chain: was the requirement correct? Did the code match the design? Where did the drift enter? The failure is attributable and the record is complete.
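As a sketch of what such a cumulative record makes possible, consider a minimal assessment table in SQLite (the table and column names here are hypothetical, not the RiskNodes schema): tracing backwards from a commit to its failed checks is a single query.

```python
import sqlite3

# Hypothetical minimal schema for a cumulative assessment record.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assessment (
    id          INTEGER PRIMARY KEY,
    commit_sha  TEXT NOT NULL,
    question_id TEXT NOT NULL,
    verdict     TEXT NOT NULL
                CHECK (verdict IN ('pass', 'fail', 'inconclusive')),
    reasoning   TEXT NOT NULL,
    evidence    TEXT,
    recorded_at TEXT DEFAULT CURRENT_TIMESTAMP
);
""")
conn.execute(
    "INSERT INTO assessment (commit_sha, question_id, verdict, reasoning, evidence) "
    "VALUES (?, ?, ?, ?, ?)",
    ("a1b2c3d", "Q-07", "fail",
     "The diff removes the audit-log call required by the design.",
     "src/payments.py:118"),
)

# Trace backwards: which questions failed for a given commit?
failures = conn.execute(
    "SELECT question_id, evidence FROM assessment "
    "WHERE commit_sha = ? AND verdict = 'fail'",
    ("a1b2c3d",),
).fetchall()
```

Because every row links a commit to a question, a verdict, and its evidence, the audit chain the paragraph describes is a matter of querying, not reconstruction.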

The business case

AI-assisted development delivers genuine productivity gains. Organisations that refuse to adopt it will fall behind. But productivity without oversight is a liability — particularly in regulated industries where auditability is not optional.

RiskNodes provides the oversight layer. It does not slow development down; it ensures that each change is checked against the questions that matter to the business, and that the results are recorded in a format that satisfies auditors, regulators, and clients. The alternative — trusting that AI-generated code is correct because it runs — is a risk that no serious organisation should accept.

Sovereignty by design

RiskNodes has been rebuilt around a principle we call sovereign-first: all processing happens within the client’s physical perimeter.

The technical stack reflects this. The application runs on Starlette, a lightweight ASGI-native Python web framework. State is managed in SQLite, which requires no separate database server and produces a single file that can be backed up, moved, or inspected trivially. Background tasks run through Starlette's built-in background-task support, eliminating the need for external job queues. LLM inference runs locally via Ollama on modest hardware: a machine with a current-generation GPU can comfortably run models in the 14B to 24B parameter range.
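A local inference call under this constraint needs nothing but the standard library. The sketch below follows Ollama's /api/generate HTTP API; the model name, prompt framing, and helper names are assumptions for illustration, not the RiskNodes implementation.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(question: str, context: str,
                  model: str = "qwen2.5:14b") -> dict:
    """Assemble one questionnaire item as an Ollama /api/generate payload.

    The model name and prompt framing are illustrative assumptions.
    """
    return {
        "model": model,
        "prompt": f"{question}\n\nSource context:\n{context}",
        "format": "json",   # request structured output for schema validation
        "stream": False,
    }

def ask(question: str, context: str) -> str:
    """Pose one question to a locally running Ollama instance."""
    data = json.dumps(build_request(question, context)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires Ollama running locally
        return json.loads(resp.read())["response"]

payload = build_request(
    "Does this change follow the agreed design?",
    "diff --git a/src/payments.py b/src/payments.py ...",
)
```

The only network hop is to localhost, which is the whole point of the sovereign-first design.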

There is no telemetry. No data leaves the deployment. No cloud dependency. The entire system — application, database, and inference engine — runs on a single machine if required. For government, defence, and financial-services clients who cannot send code or assessment data to external servers, this is not a feature but a requirement.

Open source under the EUPL

RiskNodes is now open source, published under the European Union Public Licence. The code is hosted on Codeberg.

The EUPL was chosen deliberately. It is a copyleft licence recognised by the European Commission, compatible with the GPL family, and designed for cross-border use within the EU legal framework. For European consultancies and their regulated clients, it provides a transparent, legally familiar foundation.

The commercial model follows a pattern established by projects such as SQLite: the engine is open; the high-value content — specialised questionnaire templates, audit packs, BDD specification libraries — is proprietary. Consultancies bring their domain expertise; RiskNodes provides the infrastructure to apply it systematically.

What this means in practice

We are using RiskNodes to review our own pull requests — running questionnaire-backed assessments against every code change before it merges. The system flags deviations from our BDD specifications, identifies missing state transitions in our Mermaid diagrams, and records the full reasoning chain for later review.

It is, admittedly, an unusual form of dogfooding: using a vendor-assessment platform to assess the AI agents that help build it. But the structural analogy holds. The questions we ask of an AI coding agent — does this change introduce risk? does it follow the agreed design? is the evidence sufficient? — are the same questions a diligent assessor asks of any third-party vendor.

The difference is that the assessment is now automated, reproducible, and attached to every commit.
