6. Agent API Protocol Specification

Overview

This document describes the structured JSON protocol for communicating with AI agents. The protocol defines messages, requests, and responses with support for:

  • Streaming content

  • Tool/function calling

  • Multi-modal content (text, images, data)

  • Status tracking through the full lifecycle

  • Error handling

Protocol Structure

1. Core Enums

Roles:

class Role:
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"

Message Types:

class MessageType:
    MESSAGE = "message"
    FUNCTION_CALL = "function_call"
    FUNCTION_CALL_OUTPUT = "function_call_output"
    PLUGIN_CALL = "plugin_call"
    PLUGIN_CALL_OUTPUT = "plugin_call_output"
    COMPONENT_CALL = "component_call"
    COMPONENT_CALL_OUTPUT = "component_call_output"
    MCP_LIST_TOOLS = "mcp_list_tools"
    MCP_APPROVAL_REQUEST = "mcp_approval_request"
    MCP_TOOL_CALL = "mcp_call"
    MCP_APPROVAL_RESPONSE = "mcp_approval_response"
    HEARTBEAT = "heartbeat"
    ERROR = "error"

Run Statuses:

class RunStatus:
    Created = "created"
    InProgress = "in_progress"
    Completed = "completed"
    Canceled = "canceled"
    Failed = "failed"
    Rejected = "rejected"
    Unknown = "unknown"
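Four of these statuses are terminal. A small helper can gate polling loops; this is a sketch, and the constant name is ours, not part of the protocol:

```python
# Terminal run statuses, per the RunStatus values above.
# The name TERMINAL_STATUSES is illustrative, not part of the protocol.
TERMINAL_STATUSES = {"completed", "canceled", "failed", "rejected"}

def is_terminal(status: str) -> bool:
    """Return True once a run can no longer change state."""
    return status in TERMINAL_STATUSES
```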

2. Tool Definitions

Function Parameters:

class FunctionParameters(BaseModel):
    type: str  # Must be "object"
    properties: Dict[str, Any]
    required: Optional[List[str]]

Function Tool:

class FunctionTool(BaseModel):
    name: str
    description: str
    parameters: Union[Dict[str, Any], FunctionParameters]

Tool:

class Tool(BaseModel):
    type: Optional[str] = None  # Currently only "function"
    function: Optional[FunctionTool] = None
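As an illustration, here is a hypothetical get_weather tool expressed in the plain-dict form that Tool and FunctionTool accept; the tool name and its fields are invented for the example:

```python
# Hypothetical tool definition in the plain-dict form allowed by the
# Tool / FunctionTool models above. The get_weather name is invented.
weather_tool = {
    "type": "function",  # currently the only supported tool type
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",  # FunctionParameters.type must be "object"
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}
```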

Function Call:

class FunctionCall(BaseModel):
    """
    Model class for a function call made by the assistant.
    """

    call_id: Optional[str] = None
    """The ID of the tool call."""

    name: Optional[str] = None
    """The name of the function to call."""

    arguments: Optional[str] = None
    """The arguments to call the function with, as generated by the model in
    JSON format.

    Note that the model does not always generate valid JSON, and may
    hallucinate parameters not defined by your function schema. Validate
    the arguments in your code before calling your function.
    """

Function Call Output:

class FunctionCallOutput(BaseModel):
    """
    Model class for the output of a function call.
    """

    call_id: str
    """The ID of the tool call."""

    output: str
    """The result of the function."""

3. Content Models

Base Content Model:

class Content(Event):
    type: str
    """The type of the content part."""

    object: str = "content"
    """The identity of the content part."""

    index: Optional[int] = None
    """The index of this part in the message's content list."""

    delta: Optional[bool] = False
    """Whether this content part is a delta."""

    msg_id: Optional[str] = None
    """The unique ID of the parent message."""

Specialized Content Types:

class ImageContent(Content):
    type: str = ContentType.IMAGE
    """The type of the content part."""

    image_url: Optional[str] = None
    """The image URL details."""


class TextContent(Content):
    type: str = ContentType.TEXT
    """The type of the content part."""

    text: Optional[str] = None
    """The text content."""


class DataContent(Content):
    type: str = ContentType.DATA
    """The type of the content part."""

    data: Optional[Dict] = None
    """The data content."""

4. Message Model

class Message(Event):
    id: str = Field(default_factory=lambda: "msg_" + str(uuid4()))
    """message unique id"""

    object: str = "message"
    """message identity"""

    type: str = "message"
    """The type of the message."""

    status: str = RunStatus.Created
    """The status of the message: created, in_progress, completed, failed,
    canceled, or rejected."""

    role: Optional[str] = None
    """The role of the message's author: `user`, `system`, or
    `assistant`."""

    content: Optional[
        List[Union[TextContent, ImageContent, DataContent]]
    ] = None
    """The contents of the message."""

    code: Optional[str] = None
    """The error code of the message."""

    message: Optional[str] = None
    """The error message of the message."""

Key Methods:

  • add_delta_content(): Appends partial content to the existing message

  • content_completed(): Marks content segment as complete

  • add_content(): Adds a fully formed content segment
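The delta-accumulation behavior these methods imply can be sketched on plain dicts. This is a simplified model: we assume add_delta_content concatenates text at the given content index, which is not guaranteed by the spec:

```python
# Simplified sketch of the delta accumulation these methods imply.
# We assume add_delta_content concatenates text at the given index.
def add_delta_content(message: dict, index: int, text: str) -> None:
    parts = message.setdefault("content", [])
    while len(parts) <= index:  # grow the content list to the target slot
        parts.append({"type": "text", "index": len(parts), "text": ""})
    parts[index]["text"] += text

msg = {"id": "msg_1", "content": []}
add_delta_content(msg, 0, "Hello")
add_delta_content(msg, 0, ", world")
```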

5. Request Models

Base Request:

class BaseRequest(BaseModel):
    input: List[Message]
    stream: bool = True

Agent Request:

class AgentRequest(BaseRequest):
    model: Optional[str] = None
    top_p: Optional[float] = None
    temperature: Optional[float] = None
    frequency_penalty: Optional[float] = None
    presence_penalty: Optional[float] = None
    max_tokens: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    n: Optional[int] = Field(default=1, ge=1, le=5)
    seed: Optional[int] = None
    tools: Optional[List[Union[Tool, Dict]]] = None
    session_id: Optional[str] = None
    response_id: Optional[str] = None

6. Response Models

Base Response:

class BaseResponse(Event):
    sequence_number: Optional[str] = None
    id: str = Field(default_factory=lambda: "response_" + str(uuid4()))
    object: str = "response"
    created_at: int = Field(default_factory=lambda: int(datetime.now().timestamp()))
    completed_at: Optional[int] = None
    error: Optional[Error] = None
    output: Optional[List[Message]] = None
    usage: Optional[Dict] = None
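The created_at/completed_at pair allows simple latency measurement, assuming both are Unix epoch seconds as above; the helper name is ours:

```python
def latency_seconds(response: dict):
    """Elapsed seconds for a finished response; None while in flight.

    Assumes created_at/completed_at are Unix epoch seconds.
    """
    if response.get("completed_at") is None:
        return None
    return response["completed_at"] - response["created_at"]
```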

Agent Response:

class AgentResponse(BaseResponse):
    session_id: Optional[str] = None

7. Error Model

class Error(BaseModel):
    code: str
    message: str

Protocol Flow

Request/Response Lifecycle

  1. Client sends AgentRequest with:

    • Input messages

    • Generation parameters

    • Tools definition

    • Session context

  2. Server responds with a stream of AgentResponse objects containing:

    • Status updates (created → in_progress → completed)

    • Output messages with content segments

    • Final usage metrics

Content Streaming

When stream=true is set in the request:

  • Text content is sent incrementally as delta=true segments

  • Each segment has an index pointing to the target content slot

  • Final segment marks completion with status=completed

Example Streaming Sequence:

{"status":"created","id":"response_...","object":"response"}
{"status":"created","id":"msg_...","object":"message","type":"assistant"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"Hello","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":", ","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"world","object":"content"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"Hello, world!","object":"content"}
{"status":"completed","id":"msg_...","object":"message", ...}
{"status":"completed","id":"response_...","object":"response", ...}

Status Transitions

State         Description
-----------   ----------------------------------------
created       Initial state when the object is created
in_progress   Operation is being processed
completed     Operation finished successfully
failed        Operation terminated with errors
rejected      Operation was rejected by the system
canceled      Operation was canceled by the user

Best Practices

  1. Stream Handling:

    • Buffer delta segments until status=completed is received

    • Use msg_id to correlate content with the parent message

    • Respect index for multi-segment messages

  2. Error Handling:

    • Check for error field in responses

    • Monitor for failed status transitions

    • Implement retry logic for recoverable errors

  3. State Management:

    • Use session_id for conversation continuity

    • Track created_at/completed_at for latency monitoring

    • Use sequence_number for ordering (if implemented)
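The retry guidance above might be sketched as follows. The backoff parameters and the notion of which error codes are retryable are assumptions; the protocol does not define an error-code taxonomy:

```python
import time

# Which error codes are retryable is an assumption of this sketch.
RETRYABLE_CODES = {"rate_limited", "server_error", "timeout"}

def call_with_retries(send, max_attempts=3, base_delay=0.5):
    """Call send() until success or a non-retryable/final failure.

    send() is assumed to return a response dict whose optional "error"
    field is shaped like the Error model above.
    """
    for attempt in range(max_attempts):
        response = send()
        error = response.get("error")
        if error is None:
            return response
        if error.get("code") not in RETRYABLE_CODES:
            return response  # not recoverable; surface to the caller
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return response
```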

Example Use Case

User Query:

{
  "input": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe this image"},
      {"type": "image", "image_url": "https://example.com/photo.jpg"}
    ],
    "type": "message"
  }],
  "stream": true,
  "model": "gpt-4-vision"
}

Agent Response Stream:

{"id":"response_123","object":"response","status":"created"}
{"id":"msg_abc","object":"message","type":"assistant","status":"created"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"This","object":"content","msg_id":"msg_abc"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":" image shows...","object":"content","msg_id":"msg_abc"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"This image shows...","object":"content","msg_id":"msg_abc"}
{"id":"msg_abc","status":"completed","object":"message"}
{"id":"response_123","status":"completed","object":"response"}