Beyond the Chatbot: Orchestrating "Thinking" State with A2A and Gemini Interactions

The era of the "Stateless Wrapper" is ending, and the current generation of agent frameworks isn't ready for what comes next.

Tools like LangChain or CrewAI have focused heavily on orchestrating prompts—wrapping the "Goldfish Model" in layers of abstraction to simulate memory and planning. But they largely rely on the client (or a fragile Python script) to be the "brain" that holds the state.

This architecture collapses in the face of Reasoning Models (like Gemini 3). When an AI takes 10 minutes to "think," plan, and execute deep research, you cannot rely on an open HTTP connection or a transient Python process to hold the line. The state needs to move from the client to the infrastructure.

We need a shift from Stateless Transactions to Managed State.

This post explores a new architectural pattern: combining the server-side session management of the Gemini Interactions API with the distributed orchestration of the A2A (Agent-to-Agent) Protocol. Together, they allow us to treat an AI "thought process" as a durable, addressable, and portable resource.

Note: This post is based on the reference implementation in the a2a-experiments repository. All examples below can be run locally using the project's CLI.

The Evolution: From "Chat" to "Interactions"

Traditional GenAI SDKs are designed for transactions. You send context; you get text.
The Gemini Interactions API represents a fundamental shift. It treats the conversation as a Resource (Interaction), not a payload.

  • Server-Side State: The history lives in the cloud. You pass a previous_interaction_id, not a 100k-token transcript.
  • Thinking as a Process: With background=true, the API acknowledges that "reasoning" is a temporal activity. It decouples the request from the result.
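
To make the two bullets concrete, here is a minimal sketch of what a follow-up request might look like. The field names (`input`, `previous_interaction_id`, `background`) are assumptions for illustration based on the description above, not the official API schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// InteractionRequest is a hypothetical request shape illustrating the two
// ideas above: referencing prior state by ID, and running in the background.
// Field names are illustrative, not the official Gemini API schema.
type InteractionRequest struct {
	Input                 string `json:"input"`
	PreviousInteractionID string `json:"previous_interaction_id,omitempty"`
	Background            bool   `json:"background"`
}

// followUp builds a request that continues an existing interaction: the
// payload carries only the new prompt plus a pointer to server-side history,
// not a 100k-token transcript.
func followUp(prevID, prompt string) InteractionRequest {
	return InteractionRequest{
		Input:                 prompt,
		PreviousInteractionID: prevID,
		Background:            true,
	}
}

func main() {
	req := followUp("intr_abc123", "Now compare Go and Rust error handling")
	body, _ := json.Marshal(req)
	fmt.Println(string(body))
}
```

The point is the payload size: the follow-up is a few hundred bytes regardless of how long the conversation has run.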

But while the Interactions API solves the storage of state, it doesn't solve the ownership and observability of that state across a distributed system. An InteractionID is just a string. It has no owner, no "working" status visible to the outside world, and no standard way to be shared.

A2A: The Operating System for State

This is where A2A comes in. If the Interactions API is the "Process" (the CPU and RAM), A2A is the "Window Manager."

A2A wraps the raw Interaction in a Task. This simple wrapper adds the semantic layer that raw APIs lack:

  1. Lifecycle: A2A defines working, completed, and failed states that are broadcast to any observer.
  2. Persistence: The A2A TaskID is a permanent handle. Even if the Interactions API retention policy expires, the A2A Task remains as the record of what happened.
  3. Discovery: A2A allows clients to negotiate how to talk to this stateful entity before sending the first byte.

Pattern 1: The Resilient Session

The most immediate benefit is resilience. In a standard setup, if a mobile client disconnects during a 5-minute "Deep Research" step, the response is lost.

By wrapping the Interaction in an A2A Task, we decouple the Observer from the Worker.

Figure: Two interaction models compared. In the standard "Goldfish" model, a network drop loses the context and the request times out; in the A2A resilient model, the server keeps processing through the drop, and the client reconnects later to retrieve the result.

This isn't just theory. In the a2a-experiments CLI, you can see this "detach/reattach" flow in action with the ai_researcher skill.

Example: The Asynchronous Workflow

  1. Initiation: You kick off a research task. The server starts the Interaction and immediately returns a Task ID.
    ./bin/client invoke "Research the history of the Go language" --skill ai_researcher
    # Output: Task Started (ID: 0194...) - Status: Working

  2. Disconnect: You can safely close your laptop or kill the CLI process. The server keeps polling the Gemini Interaction in the background.

  3. Resumption: Hours (really, minutes) later, you "resume" the specific task to get the result.

    ./bin/client resume <TASK_ID>
    # Output: [Status Update] Research Complete. 
    # [Artifact] deep_research_report.md

The Interactions API provided the "Thinking"; A2A provided the "Patience."
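
The detach/reattach flow can be sketched with an in-memory stand-in. This is not the repository's actual implementation; `store`, `startTask`, and `resume` are hypothetical names, and the `poll` callback stands in for checking the background Interaction's status:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// store maps A2A task IDs to results. It lives in the server process, which
// is why a client disconnect doesn't lose the work.
type store struct {
	mu      sync.Mutex
	results map[string]string
}

func (s *store) put(id, result string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.results[id] = result
}

// resume blocks until the task has a result — a stand-in for the CLI's
// `resume <TASK_ID>` call.
func (s *store) resume(id string) string {
	for {
		s.mu.Lock()
		r, ok := s.results[id]
		s.mu.Unlock()
		if ok {
			return r
		}
		time.Sleep(10 * time.Millisecond)
	}
}

// startTask kicks off the work in a goroutine and returns the task ID
// immediately, mirroring step 1 of the asynchronous workflow above.
func startTask(s *store, id string, poll func() string) string {
	go func() {
		s.put(id, poll()) // the server, not the client, waits for this
	}()
	return id
}

func main() {
	s := &store{results: map[string]string{}}
	id := startTask(s, "task_123", func() string {
		time.Sleep(50 * time.Millisecond) // pretend this is deep research
		return "deep_research_report.md"
	})
	// The client could disconnect here; the goroutine keeps running.
	fmt.Println(s.resume(id))
}
```

The design choice to notice: `startTask` returns before any work is done, and `resume` takes only the ID. Nothing else needs to survive on the client side.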

Figure: How the A2A server maintains task state across a client disconnect. The ten-step flow shows a client starting a mission, the server managing the Gemini session in the background through a network drop, and the client reconnecting with the Task ID to retrieve the finished artifact.

Pattern 2: The Reference Pattern (Zero-Copy Handoff)

The most thought-provoking pattern emerges when we look at inter-agent collaboration.

In a stateless world, if Agent A (Researcher) wants to pass its work to Agent B (Writer), it must serialize the entire conversation history and send it over the wire. This is slow, expensive, and fragile.

With A2A + Interactions, we enable Zero-Copy Handoffs.

Figure: The A2A protocol layer. Agent A (Researcher) creates an interaction session in the cloud state API, emits a task artifact (ID=123), and hands Agent B (Writer) a zero-copy reference; Agent B resumes the session from the artifact ID.

This is implemented in a2a-experiments using the --ref flag. This flag tells the new agent, "Don't just listen to me; look at that task for context."

Example: Cross-Skill Chaining

  1. Step 1 (Research): You generate a dense, machine-readable report using the Researcher agent.
    ./bin/client invoke "Research A2A protocol standards" --skill ai_researcher
    # Returns Task ID: task_123
  2. Step 2 (Summarize): You ask a different skill to summarize it, passing task_123 as a reference.
    ./bin/client invoke "Summarize this for a slide deck" --skill summarize --ref task_123

What happened here?

  • The client did not download and re-upload the report.
  • The summarize skill received the ReferenceTaskID.
  • The server looked up task_123, found the InteractionID (or the Report Artifact), and loaded it directly into the context of the new skill.

Agent B has effectively "downloaded" Agent A's brain state without transferring a single byte of context. They are sharing the same "Mind" (the Interactions Session) but applying different "Skills" (Prompts/Tools).
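
The server-side resolution of `--ref` can be sketched in a few lines. The names here (`registry`, `resolveRef`) and the request fields are hypothetical, but the logic is the one described above: dereference the task ID, find the interaction, and continue from it:

```go
package main

import "fmt"

// registry maps A2A task IDs to the Interaction they wrap.
type registry map[string]string // taskID -> interactionID

// resolveRef is the heart of the zero-copy handoff: instead of shipping
// Agent A's transcript, the server dereferences the task ID to the shared
// server-side interaction and builds a request that continues from it.
func resolveRef(reg registry, refTaskID, prompt string) (map[string]string, error) {
	interactionID, ok := reg[refTaskID]
	if !ok {
		return nil, fmt.Errorf("unknown reference task %q", refTaskID)
	}
	// The new skill's request carries only a pointer — zero bytes of context.
	return map[string]string{
		"input":                   prompt,
		"previous_interaction_id": interactionID,
	}, nil
}

func main() {
	reg := registry{"task_123": "intr_abc"}
	req, err := resolveRef(reg, "task_123", "Summarize this for a slide deck")
	if err != nil {
		panic(err)
	}
	fmt.Println(req["previous_interaction_id"])
}
```

Whether the server loads the `InteractionID` or a stored report artifact from the referenced task is an implementation detail; either way, the client never touches the payload.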

A Note on Architecture: Intra- vs. Inter-Agent

In this implementation, both the Researcher and Summarizer skills live within the same server process, sharing a local memory store. This makes the "handoff" instantaneous.

However, this Reference Pattern is designed to scale horizontally. In a production system, Agent A and Agent B could be entirely separate services. As long as they share a common "Task Registry" (like a Redis backend or a centralized A2A Control Plane), the pattern holds: the TaskID is the universal pointer that allows any authorized agent to "mount" the context of another's work.
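
One way to express that pluggability is an interface: the in-memory map used here and a Redis-backed store would both satisfy the same contract. This is a sketch, not the repository's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// TaskRegistry is the contract any backend must satisfy for the reference
// pattern to scale horizontally. A Redis- or control-plane-backed
// implementation would plug in behind the same two methods.
type TaskRegistry interface {
	Bind(taskID, interactionID string) error
	Lookup(taskID string) (string, bool)
}

// memRegistry is the single-process implementation described above.
type memRegistry struct {
	mu sync.RWMutex
	m  map[string]string
}

func newMemRegistry() *memRegistry {
	return &memRegistry{m: map[string]string{}}
}

func (r *memRegistry) Bind(taskID, interactionID string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.m[taskID] = interactionID
	return nil
}

func (r *memRegistry) Lookup(taskID string) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	id, ok := r.m[taskID]
	return id, ok
}

func main() {
	var reg TaskRegistry = newMemRegistry()
	_ = reg.Bind("task_123", "intr_abc")
	id, _ := reg.Lookup("task_123")
	fmt.Println(id)
}
```

Swapping the backend changes where the pointer table lives, not the pattern: any agent holding a TaskID and the registry address can mount the context.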

Why This Matters

We are moving toward a world of Agentic Mesh—systems where specialized agents collaborate to solve complex problems.

  • Interactions API gives us the "Brain" that can hold a thought for days.
  • A2A gives us the "Protocol" to pass that thought around safely.

This combination allows us to build systems that are not just "chatbots that remember," but distributed applications that think.

Areas for Exploration

This architecture opens several doors we intend to explore further:

  • The State Bridge Implementation: A technical deep dive into mapping A2A Tasks to Interaction IDs in Go.
  • Impedance Matching Protocols: How to bridge gRPC and JSON-RPC for high-performance agents.
  • Agentic UX: Designing user interfaces that can visualize and manage these long-running, "thinking" states.

Try It Yourself

Explore the patterns discussed here in the reference implementation:

Figure: "The Deep Research Loop" — the architectural workflow between an A2A client, an A2A server, and the Gemini Interactions API across two phases, Discovery and Execution: discovery requests, creation of background interactions, continuous SSE heartbeats for progress monitoring, and delivery of the final research report artifact.