Public Agents

What is a Public Agent?

Accountable extensions of government capacity

A Public Agent is an AI-enabled assistant — software-based, AI-assisted human, or a hybrid of both — that interacts with citizens or officials within defined rights-based boundaries, and activates DPI Workflows to complete tasks.

⚠ A Public Agent is not autonomous. It cannot access data without consent, cannot modify records directly, and cannot bypass human oversight. It is a constrained interface that activates governed workflows — not a decision-maker.

What a Public Agent can do

Interpret citizen intent

Use LLM/SLM to understand what a citizen is asking for, in any language, via any channel

Invoke DPI Workflows

Activate pre-approved workflows to deliver services — identity verification, eligibility, benefits, registration

Call AI Blocks

Use foundational blocks (translation, OCR) and sector-specific blocks (eligibility, fraud detection)

Escalate to humans

Route low-confidence cases, exceptions, and complaints to a human caseworker with a case ID

Notify and follow up

Send status updates via preferred channel (SMS, WhatsApp, voice)

What a Public Agent cannot do

Access data without consent

Every data pull requires a consent token — verified by the workflow, not assumed

Modify authoritative records directly

Record updates require a governance step in the workflow — the agent cannot write directly to registries

Operate outside approved workflows

Agents can only activate workflows that have been pre-approved and audited

Make final decisions autonomously

All decisions above a confidence threshold remain with human authorities

Multi-Channel Design

Design for the lowest-bandwidth channel first

A Public Agent must work for every citizen — including those with limited connectivity or literacy. Channel design is a governance principle, not a technical afterthought.

💬

WhatsApp / Chat

Primary channel for urban and semi-urban users. Supports text, voice notes, and document submission.

🎙️

Voice / IVR

Critical for low-literacy users. speech_to_text() + translate() blocks enable voice in local dialects.

📱

USSD

Fallback for feature phones and low-connectivity zones. No data connection required. Async queue.

💻

Web / Mobile App

For users with reliable connectivity. Richer interface, document uploads, case tracking.

📨

SMS

Notifications and confirmations. Lowest common denominator for all mobile users.

🏢

In-Person Kiosk

Assisted mode for citizens without devices. Agent operates through a public servant as intermediary.

Language principle: Local language AI Blocks (translation, transliteration, speech-to-text in dialects) are not optional features — they are prerequisites for inclusion. A Public Agent that only works in English or French is structurally exclusionary.

Agent Specification

Generic Public Agent Spec (YAML)

A machine-readable spec for a Public Agent, including its model configuration, channels, workflow bindings, and governance constraints.

public_agent:
  id: "social_benefit_agent_v1"
  description: "Guides citizens through social protection benefit applications"
  channels: ["whatsapp", "ussd", "voice", "web"]
  languages: ["sw", "en", "am", "om"]  # Swahili, English, Amharic, Oromo

  ai_model:
    # Plug-and-play: swap the adapter to change providers
    adapter: "ollama"          # openai | anthropic | google | ollama | vllm | bedrock | custom
    runtime: "sovereign_cloud" # sovereign_cloud | on_prem | external_api
    endpoint: "https://ai.gov.example/v1"
    model: "llama3-8b-instruct-sw"   # SLM fine-tuned on Swahili civic content
    auth_ref: "vault://ai-gateway/token"
    timeout_ms: 30000

  scope:
    allowed_workflows: ["benefit_disbursement_v1", "eligibility_check_v1"]
    allowed_ai_blocks: ["translate", "speech_to_text", "eligibility_verify_v1"]
    data_access: "consent_required"
    escalation_target: "caseworker_queue"

  constraints:
    can_modify_records: false
    can_initiate_payments: false   # Only via workflow disburse step
    can_access_data_without_consent: false
    human_escalation: always_available
    session_timeout: "30m"

  governance:
    owner_agency: "Ministry of Social Welfare"
    legal_basis: "Social Protection Act § 8.1"
    audit_logging: true
    pii_handling: "tokenise_in_logs"
    human_recourse:
      enabled: true
      channel: "any"
      response_sla: "48h"

Model Selection

LLMs vs SLMs: choosing the right model for government

Public Agents can run on any language model — from large frontier APIs to small on-device models. The choice is not just technical; it has direct implications for sovereignty, cost, latency, and inclusion. Neither type is universally better. Each has a role.

LLM

Large Language Models

GPT-5 / o3 / o4-mini · Claude Opus 4 / Sonnet 4 · Gemini 2.5 Pro / 3 Pro · Llama 4

✅

Broad knowledge — handles open-ended queries, complex reasoning, and edge cases with less fine-tuning

✅

Strong multilingual base — reasonable performance across many languages without specialised training

✅

Faster to prototype — good default behaviour out of the box for pilots and demos

⚠️

Data sovereignty risk — cloud-only APIs send citizen data to third-party servers; requires careful PII handling

⚠️

High cost at scale — per-token pricing becomes significant across millions of citizen interactions

⚠️

Latency and connectivity — dependent on reliable internet; unsuitable as primary model for low-connectivity contexts

⚠️

Unpredictable outputs — larger context windows and generality increase risk of off-scope responses in constrained workflows

Best for: Exploratory pilots, non-PII tasks (translation, summarisation), and use cases where interpretive flexibility is more important than strict constraint.

SLM — Recommended

Small Language Models

Phi-4 Mini (3.8B) · Gemma 3 · Qwen2.5 / Qwen3 · Llama 3.2 3B · SmolLM3

✅

On-premise deployment — runs entirely within government infrastructure; citizen data never leaves the country

✅

Low cost at scale — fixed infrastructure cost; no per-token billing across millions of interactions

✅

Low latency — local inference means fast responses even in constrained network environments

✅

Task-focused fine-tuning — small models specialised on a narrow task (eligibility checking, document extraction) outperform large generalist models on that task

✅

Predictable behaviour — narrower scope means fewer off-topic or hallucinated responses within governed workflow steps

✅

Local language fine-tuning — SLMs can be fine-tuned on local dialect corpora that large model providers do not cover

⚠️

More upfront investment — requires GPU infrastructure, fine-tuning pipelines, and technical capacity to evaluate and maintain models

Best for: Production deployments, PII-sensitive workflows, low-connectivity contexts, local dialect support, and any service operating at national scale.

⚡

Why task-focused SLMs are the right fit for government at scale

Government AI does not need to be brilliant — it needs to be reliable, bounded, and governable. A Public Agent executing a benefits eligibility workflow does not need to discuss philosophy or write poetry. It needs to understand a citizen's request, extract structured fields, call the right workflow step, and escalate when uncertain. A 3–8 billion parameter model fine-tuned on that specific task, running on sovereign infrastructure, is more appropriate than a frontier LLM accessed via a third-party API — and it will be cheaper, faster, and more controllable at the 10 million interactions that matter.

Sovereignty by default

Data stays within national infrastructure. No dependency on foreign API availability, pricing changes, or terms of service. The government owns the model and its outputs.

Replaceability

Smaller models are easier to swap. The DPI Workflow spec declares the model adapter — changing from Llama to Mistral or a locally developed model requires no workflow rebuild.

Local language inclusion

SLMs can be fine-tuned on Swahili dialects, Amharic, Bahasa, or Kinyarwanda corpora that large providers do not prioritise. Inclusion is a governance imperative, not a feature request.

Auditability

A model you control can be inspected, re-evaluated, and red-teamed by your own team. Black-box third-party models make it harder to satisfy accountability requirements to citizens and legislatures.

Cost at scale

At 5 million citizen interactions per month, per-token API costs become significant budget line items. Fixed infrastructure costs for on-prem SLMs are predictable and decrease per-interaction as usage grows.

Offline and edge capability

SLMs in the 1–3B range can run on commodity hardware, enabling AI-assisted services in district offices without reliable internet — extending the reach of Public Agents to the last mile.

The principle of replaceability applies to models too. Every Public Agent spec declares its model adapter (adapter: "ollama", "openai", "vllm"). Start with what you can deploy today — even a cloud LLM for a pilot — and migrate to a sovereign SLM as capacity grows. The workflow does not change. Only the adapter does.