# 6. Agent API Protocol Specification

## Overview
This document describes the structured JSON protocol for communicating with AI agents. The protocol defines messages, requests, and responses with support for:

- Streaming content
- Tool/function calling
- Multi-modal content (text, images, data)
- Status tracking through the full lifecycle
- Error handling
## Protocol Structure

### 1. Core Enums

Roles:

```python
class Role:
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"
```
Message Types:

```python
class MessageType:
    MESSAGE = "message"
    FUNCTION_CALL = "function_call"
    FUNCTION_CALL_OUTPUT = "function_call_output"
    PLUGIN_CALL = "plugin_call"
    PLUGIN_CALL_OUTPUT = "plugin_call_output"
    COMPONENT_CALL = "component_call"
    COMPONENT_CALL_OUTPUT = "component_call_output"
    MCP_LIST_TOOLS = "mcp_list_tools"
    MCP_APPROVAL_REQUEST = "mcp_approval_request"
    MCP_TOOL_CALL = "mcp_call"
    MCP_APPROVAL_RESPONSE = "mcp_approval_response"
    HEARTBEAT = "heartbeat"
    ERROR = "error"
```
Run Statuses:

```python
class RunStatus:
    Created = "created"
    InProgress = "in_progress"
    Completed = "completed"
    Canceled = "canceled"
    Failed = "failed"
    Rejected = "rejected"
    Unknown = "unknown"
```
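Of these statuses, four are final. A small helper like the following (not part of the protocol itself; the set membership is an assumption drawn from the status table later in this document) can tell a client when to stop polling or close a stream:

```python
# Hypothetical helper: terminal statuses admit no further transitions.
TERMINAL_STATUSES = {"completed", "canceled", "failed", "rejected"}

def is_terminal(status: str) -> bool:
    """Return True if the run has reached a final state."""
    return status in TERMINAL_STATUSES
```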
### 2. Tool Definitions

Function Parameters:

```python
class FunctionParameters(BaseModel):
    type: str  # Must be "object"
    properties: Dict[str, Any]
    required: Optional[List[str]]
```
Function Tool:

```python
class FunctionTool(BaseModel):
    name: str
    description: str
    parameters: Union[Dict[str, Any], FunctionParameters]
```

Tool:

```python
class Tool(BaseModel):
    type: Optional[str] = None  # Currently only "function"
    function: Optional[FunctionTool] = None
```
Function Call:

```python
class FunctionCall(BaseModel):
    """A tool call requested by the assistant."""

    call_id: Optional[str] = None
    """The ID of the tool call."""
    name: Optional[str] = None
    """The name of the function to call."""
    arguments: Optional[str] = None
    """The arguments to call the function with, as generated by the model
    in JSON format.

    Note that the model does not always generate valid JSON, and may
    hallucinate parameters not defined by your function schema. Validate
    the arguments in your code before calling your function.
    """
```
Function Call Output:

```python
class FunctionCallOutput(BaseModel):
    """The result of executing a tool call."""

    call_id: str
    """The ID of the tool call."""
    output: str
    """The result of the function."""
```
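Since the model may emit malformed JSON or invent parameters, the `arguments` string should be validated before the function is invoked. A minimal sketch of such a check (illustrative only, not part of the protocol):

```python
import json

def parse_function_arguments(arguments: str, required: list) -> dict:
    """Parse the model-generated `arguments` string and verify that all
    required parameters are present. Raises ValueError on bad input."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    missing = [name for name in required if name not in parsed]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return parsed
```

The parsed dict can then be passed to the target function, and any `ValueError` reported back to the model as a `FunctionCallOutput`.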
### 3. Content Models

Base Content Model:

```python
class Content(Event):
    type: str
    """The type of the content part."""
    object: str = "content"
    """The identity of the content part."""
    index: Optional[int] = None
    """The index of this part in the message's content list."""
    delta: Optional[bool] = False
    """Whether this content is a delta (partial) segment."""
    msg_id: Optional[str] = None
    """Unique ID of the parent message."""
```
Specialized Content Types:

```python
class ImageContent(Content):
    type: str = ContentType.IMAGE
    """The type of the content part."""
    image_url: Optional[str] = None
    """The image URL details."""


class TextContent(Content):
    type: str = ContentType.TEXT
    """The type of the content part."""
    text: Optional[str] = None
    """The text content."""


class DataContent(Content):
    type: str = ContentType.DATA
    """The type of the content part."""
    data: Optional[Dict] = None
    """The data content."""
```
### 4. Message Model

```python
class Message(Event):
    id: str = Field(default_factory=lambda: "msg_" + str(uuid4()))
    """Unique message ID."""
    object: str = "message"
    """The identity of the object."""
    type: str = "message"
    """The type of the message."""
    status: str = RunStatus.Created
    """The status of the message; one of the RunStatus values
    (e.g. created, in_progress, completed)."""
    role: Optional[str] = None
    """The role of the message's author; one of `user`, `system`,
    or `assistant`."""
    content: Optional[
        List[Union[TextContent, ImageContent, DataContent]]
    ] = None
    """The contents of the message."""
    code: Optional[str] = None
    """The error code, if the message carries an error."""
    message: Optional[str] = None
    """The error description, if the message carries an error."""
```
Key Methods:

- `add_delta_content()`: appends partial (delta) content to the existing message
- `content_completed()`: marks a content segment as complete
- `add_content()`: adds a fully formed content segment
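The merge behavior of `add_delta_content()` can be sketched as follows. This is an illustrative standalone function over plain dicts, not the actual method implementation; the `index` and `text` field names follow the content models above:

```python
# Sketch: merge a streamed text delta into a message's content list,
# keyed by the segment's `index`. The real method may differ.
def add_delta_content(message_content: list, delta: dict) -> None:
    """Append delta text to the content slot addressed by delta['index']."""
    index = delta.get("index", 0)
    # Grow the content list until the target slot exists.
    while len(message_content) <= index:
        message_content.append({"type": "text", "text": ""})
    message_content[index]["text"] += delta.get("text", "")

content = []
add_delta_content(content, {"index": 0, "delta": True, "text": "Hello"})
add_delta_content(content, {"index": 0, "delta": True, "text": ", world"})
```

After both calls, slot 0 holds the accumulated text `"Hello, world"`.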
### 5. Request Models

Base Request:

```python
class BaseRequest(BaseModel):
    input: List[Message]
    stream: bool = True
```
Agent Request:

```python
class AgentRequest(BaseRequest):
    model: Optional[str] = None
    top_p: Optional[float] = None
    temperature: Optional[float] = None
    frequency_penalty: Optional[float] = None
    presence_penalty: Optional[float] = None
    max_tokens: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    n: Optional[int] = Field(default=1, ge=1, le=5)
    seed: Optional[int] = None
    tools: Optional[List[Union[Tool, Dict]]] = None
    session_id: Optional[str] = None
    response_id: Optional[str] = None
```
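Put together, a complete request body might look like this. The model name, session ID, and the `get_weather` tool are illustrative placeholders, not values defined by the protocol:

```python
import json

# A minimal AgentRequest payload using the field names defined above.
request = {
    "model": "gpt-4",            # placeholder model name
    "stream": True,
    "temperature": 0.7,
    "session_id": "session_001", # placeholder session ID
    "input": [
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "text", "text": "What's the weather in Paris?"}],
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

payload = json.dumps(request)
```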
### 6. Response Models

Base Response:

```python
class BaseResponse(Event):
    sequence_number: Optional[str] = None
    id: str = Field(default_factory=lambda: "response_" + str(uuid4()))
    object: str = "response"
    created_at: int = Field(
        default_factory=lambda: int(datetime.now().timestamp())
    )
    completed_at: Optional[int] = None
    error: Optional[Error] = None
    output: Optional[List[Message]] = None
    usage: Optional[Dict] = None
```

Note that `created_at` uses a `default_factory` so the timestamp is captured per instance rather than once at class definition time.
Agent Response:

```python
class AgentResponse(BaseResponse):
    session_id: Optional[str] = None
```

### 7. Error Model

```python
class Error(BaseModel):
    code: str
    message: str
```
## Protocol Flow

### Request/Response Lifecycle

1. The client sends an `AgentRequest` containing:
   - Input messages
   - Generation parameters
   - Tool definitions
   - Session context
2. The server responds with a stream of `AgentResponse` objects containing:
   - Status updates (`created` → `in_progress` → `completed`)
   - Output messages with content segments
   - Final usage metrics
### Content Streaming

When `stream=true` in the request:

- Text content is sent incrementally as `delta=true` segments
- Each segment has an `index` pointing to the target content slot
- The final segment marks completion with `status=completed`

Example Streaming Sequence:
```json
{"status":"created","id":"response_...","object":"response"}
{"status":"created","id":"msg_...","object":"message","role":"assistant","type":"message"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"Hello","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":", ","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"world","object":"content"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"Hello, world!","object":"content"}
{"status":"completed","id":"msg_...","object":"message", ...}
{"status":"completed","id":"response_...","object":"response", ...}
```
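A client consuming this stream can buffer delta segments per `(msg_id, index)` pair until the final non-delta segment arrives. The sketch below is one possible consumer over plain event dicts, not a reference implementation; it assumes the field names shown in the sequence above:

```python
# Client-side sketch: accumulate streamed text content per message slot.
class StreamAccumulator:
    def __init__(self):
        self.buffers = {}   # (msg_id, index) -> partial text so far
        self.finished = {}  # (msg_id, index) -> final text

    def feed(self, event: dict) -> None:
        # Only text content segments are handled here.
        if event.get("object") != "content" or event.get("type") != "text":
            return
        key = (event.get("msg_id", ""), event.get("index", 0))
        if event.get("delta"):
            self.buffers[key] = self.buffers.get(key, "") + event.get("text", "")
        elif event.get("status") == "completed":
            # The final segment carries the full text; prefer it over the buffer.
            self.finished[key] = event.get("text", self.buffers.get(key, ""))

acc = StreamAccumulator()
for ev in [
    {"status": "in_progress", "type": "text", "index": 0, "delta": True,
     "text": "Hello", "object": "content", "msg_id": "msg_1"},
    {"status": "in_progress", "type": "text", "index": 0, "delta": True,
     "text": ", world!", "object": "content", "msg_id": "msg_1"},
    {"status": "completed", "type": "text", "index": 0, "delta": False,
     "text": "Hello, world!", "object": "content", "msg_id": "msg_1"},
]:
    acc.feed(ev)
```

Keying on both `msg_id` and `index` keeps concurrent messages and multi-segment content from interleaving into one buffer.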
### Status Transitions

| State | Description |
|---|---|
| `created` | Initial state when the object is created |
| `in_progress` | Operation is being processed |
| `completed` | Operation finished successfully |
| `failed` | Operation terminated with errors |
| `rejected` | Operation was rejected by the system |
| `canceled` | Operation was canceled by the user |
## Best Practices

Stream Handling:

- Buffer delta segments until `status=completed` is received
- Use `msg_id` to correlate content with the parent message
- Respect `index` for multi-segment messages
Error Handling:

- Check for the `error` field in responses
- Monitor for `failed` status transitions
- Implement retry logic for recoverable errors
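A retry loop for recoverable errors might look like the following. Here `send_request` stands in for whatever transport the client uses, and the error codes in `RETRYABLE_CODES` are assumptions for illustration; the protocol does not define a code list:

```python
import time

RETRYABLE_CODES = {"rate_limited", "server_error"}  # assumed, not spec-defined

def call_with_retry(send_request, request, max_attempts=3, base_delay=0.5):
    """Retry `send_request` on recoverable errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        response = send_request(request)
        error = response.get("error")
        if error is None:
            return response
        if error["code"] not in RETRYABLE_CODES or attempt == max_attempts:
            raise RuntimeError(
                f"request failed: {error['code']}: {error['message']}"
            )
        time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```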
State Management:

- Use `session_id` for conversation continuity
- Track `created_at`/`completed_at` for latency monitoring
- Use `sequence_number` for ordering (if implemented)
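Since `created_at` and `completed_at` are both Unix timestamps in seconds (per `BaseResponse`), end-to-end latency is a simple difference. A small sketch:

```python
from typing import Optional

def response_latency_seconds(response: dict) -> Optional[float]:
    """Latency of a finished response, or None while it is still in flight."""
    completed = response.get("completed_at")
    if completed is None:
        return None
    return completed - response["created_at"]
```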
## Example Use Case

User Query:

```json
{
    "input": [{
        "role": "user",
        "content": [{"type": "text", "text": "Describe this image"}],
        "type": "message"
    }],
    "stream": true,
    "model": "gpt-4-vision"
}
```
Agent Response Stream:

```json
{"id":"response_123","object":"response","status":"created"}
{"id":"msg_abc","object":"message","role":"assistant","type":"message","status":"created"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"This","object":"content","msg_id":"msg_abc"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":" image shows...","object":"content","msg_id":"msg_abc"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"This image shows...","object":"content","msg_id":"msg_abc"}
{"id":"msg_abc","status":"completed","object":"message"}
{"id":"response_123","status":"completed","object":"response"}
```