8. Agent API Protocol Specification
Overview
This document describes the structured JSON protocol for communicating with AI agents. The protocol defines messages, requests, and responses with support for:
Streaming content
Tool/function calling
Multi-modal content (text, images, data)
Status tracking through the full lifecycle
Error handling
Protocol Structure
1. Core Enums
Roles:
class Role:
    ASSISTANT = "assistant"
    USER = "user"
    SYSTEM = "system"
    TOOL = "tool"  # New: Tool role
Message Types:
class MessageType:
    MESSAGE = "message"
    FUNCTION_CALL = "function_call"
    FUNCTION_CALL_OUTPUT = "function_call_output"
    PLUGIN_CALL = "plugin_call"
    PLUGIN_CALL_OUTPUT = "plugin_call_output"
    COMPONENT_CALL = "component_call"
    COMPONENT_CALL_OUTPUT = "component_call_output"
    MCP_LIST_TOOLS = "mcp_list_tools"
    MCP_APPROVAL_REQUEST = "mcp_approval_request"
    MCP_TOOL_CALL = "mcp_call"
    MCP_APPROVAL_RESPONSE = "mcp_approval_response"
    REASONING = "reasoning"
    HEARTBEAT = "heartbeat"
    ERROR = "error"
Run Statuses:
class RunStatus:
    Created = "created"
    InProgress = "in_progress"
    Completed = "completed"
    Canceled = "canceled"
    Failed = "failed"
    Rejected = "rejected"
    Unknown = "unknown"
    Queued = "queued"
    Incomplete = "incomplete"
2. Tool Definitions
Function Parameters:
class FunctionParameters(BaseModel):
    type: str  # Must be "object"
    properties: Dict[str, Any]
    required: Optional[List[str]]
Function Tool:
class FunctionTool(BaseModel):
    name: str
    description: str
    parameters: Union[Dict[str, Any], FunctionParameters]
Tool:
class Tool(BaseModel):
    type: Optional[str] = None  # Currently only "function"
    function: Optional[FunctionTool] = None
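To make the nesting of these models concrete, the sketch below builds a tool definition as a plain dictionary and checks a model-generated arguments string against its schema before the function would be invoked. It uses only the standard library rather than the models themselves; the helper names make_function_tool and check_arguments, the get_weather tool, and its city parameter are all hypothetical.

```python
import json


def make_function_tool(name, description, properties, required=None):
    """Build a payload shaped like the Tool / FunctionTool models."""
    return {
        "type": "function",  # the only tool type this section mentions
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",  # FunctionParameters.type must be "object"
                "properties": properties,
                "required": list(required or []),
            },
        },
    }


def check_arguments(arguments, tool):
    """Parse an arguments JSON string and list problems against the schema.

    Returns (args, problems); args is None when the string is unusable.
    """
    try:
        args = json.loads(arguments or "{}")
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return None, ["arguments must decode to a JSON object"]
    params = tool["function"]["parameters"]
    problems = [
        f"missing required parameter: {key}"
        for key in params.get("required") or []
        if key not in args
    ]
    problems += [
        f"unexpected parameter: {key}"
        for key in args
        if key not in params["properties"]
    ]
    return args, problems


# Hypothetical tool used only for illustration.
weather_tool = make_function_tool(
    name="get_weather",
    description="Look up the current weather for a city.",
    properties={"city": {"type": "string"}},
    required=["city"],
)
args, problems = check_arguments('{"city": "Beijing", "units": "metric"}', weather_tool)
print(problems)  # ['unexpected parameter: units']
```

This only checks key presence; a fuller check would also validate value types against each property's schema.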
Function Call:
class FunctionCall(BaseModel):
    """
    Model class for assistant prompt message tool call function.
    """
    call_id: Optional[str] = None
    """The ID of the tool call."""
    name: Optional[str] = None
    """The name of the function to call."""
    arguments: Optional[str] = None
    """The arguments to call the function with, as generated by the model in
    JSON format.
    Note that the model does not always generate valid JSON, and may
    hallucinate parameters not defined by your function schema. Validate
    the arguments in your code before calling your function.
    """
Function Call Output:
class FunctionCallOutput(BaseModel):
    """
    Model class for the output of a tool call function.
    """
    call_id: str
    """The ID of the tool call."""
    output: str
    """The result of the function."""
3. Content Models
Base Content Model:
class Content(Event):
    type: str
    """The type of the content part."""
    object: str = "content"
    """The identity of the content part."""
    index: Optional[int] = None
    """The content index in the message's content list."""
    delta: Optional[bool] = False
    """Whether this content is a delta."""
    msg_id: Optional[str] = None
    """The unique ID of the parent message."""
Specialized Content Types:
class ImageContent(Content):
    type: str = ContentType.IMAGE
    """The type of the content part."""
    image_url: Optional[str] = None
    """The image URL details."""
class TextContent(Content):
    type: str = ContentType.TEXT
    """The type of the content part."""
    text: Optional[str] = None
    """The text content."""
class DataContent(Content):
    type: str = ContentType.DATA
    """The type of the content part."""
    data: Optional[Dict] = None
    """The data content."""
class AudioContent(Content):
    type: str = ContentType.AUDIO
    """The type of the content part."""
    data: Optional[str] = None
    """The audio data details."""
    format: Optional[str] = None
    """The format of the audio data."""
class FileContent(Content):
    type: str = ContentType.FILE
    """The type of the content part."""
    file_url: Optional[str] = None
    """The file URL details."""
    file_id: Optional[str] = None
    """The file ID details."""
    filename: Optional[str] = None
    """The file name details."""
    file_data: Optional[str] = None
    """The file data details."""
class RefusalContent(Content):
    type: str = ContentType.REFUSAL
    """The type of the content part."""
    refusal: Optional[str] = None
    """The refusal content."""
4. Message Model
class Message(Event):
    id: str = Field(default_factory=lambda: "msg_" + str(uuid4()))
    """The unique ID of the message."""
    object: str = "message"
    """The identity of the message."""
    type: str = "message"
    """The type of the message."""
    status: str = RunStatus.Created
    """The status of the message: in_progress, completed, or incomplete."""
    role: Optional[str] = None
    """The role of the message's author: one of `user`, `system`, or
    `assistant`."""
    content: Optional[
        List[Union[TextContent, ImageContent, DataContent]]
    ] = None
    """The contents of the message."""
    code: Optional[str] = None
    """The error code of the message."""
    message: Optional[str] = None
    """The error message of the message."""
Key Methods:
add_delta_content(): Appends partial content to the existing message
content_completed(): Marks a content segment as complete
add_content(): Adds a fully formed content segment
5. Request Models
Base Request:
class BaseRequest(BaseModel):
    input: List[Message]
    stream: bool = True
Agent Request:
class AgentRequest(BaseRequest):
    model: Optional[str] = None
    top_p: Optional[float] = None
    temperature: Optional[float] = None
    frequency_penalty: Optional[float] = None
    presence_penalty: Optional[float] = None
    max_tokens: Optional[int] = None
    stop: Optional[Union[str, List[str]]] = None
    n: Optional[int] = Field(default=1, ge=1, le=5)
    seed: Optional[int] = None
    tools: Optional[List[Union[Tool, Dict]]] = None
    session_id: Optional[str] = None
    response_id: Optional[str] = None
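As a rough illustration of the request shape, the following sketch assembles an AgentRequest-style payload as a plain dictionary. It mirrors the 1 to 5 bound on n from the model above and omits optional fields that are unset. The helper name build_agent_request is hypothetical, and the payload is built with the standard library only, not with the models themselves.

```python
import json


def build_agent_request(text, *, model=None, n=1, stream=True,
                        session_id=None, tools=None):
    """Assemble a request dict with a single user text message."""
    # Mirrors Field(default=1, ge=1, le=5) on AgentRequest.n.
    if not 1 <= n <= 5:
        raise ValueError("n must be between 1 and 5")
    request = {
        "input": [{
            "type": "message",
            "role": "user",
            "content": [{"type": "text", "text": text}],
        }],
        "stream": stream,
        "n": n,
    }
    # Only include optional fields the caller actually set.
    optional = {"model": model, "session_id": session_id, "tools": tools}
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


request = build_agent_request("Hello", session_id="session_123")
print(json.dumps(request, indent=2))
```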
6. Response Models
Base Response:
class BaseResponse(Event):
    sequence_number: Optional[str] = None
    id: str = Field(default_factory=lambda: "response_" + str(uuid4()))
    object: str = "response"
    # default_factory takes the timestamp per instance rather than once at import
    created_at: int = Field(
        default_factory=lambda: int(datetime.now().timestamp())
    )
    completed_at: Optional[int] = None
    error: Optional[Error] = None
    output: Optional[List[Message]] = None
    usage: Optional[Dict] = None
Agent Response:
class AgentResponse(BaseResponse):
    session_id: Optional[str] = None
7. Error Model
class Error(BaseModel):
    code: str
    message: str
Protocol Flow
Request/Response Lifecycle
Client sends an AgentRequest with:
    Input messages
    Generation parameters
    Tool definitions
    Session context
Server responds with a stream of AgentResponse objects containing:
    Status updates (created → in_progress → completed)
    Output messages with content segments
    Final usage metrics
Content Streaming
When stream=True in the request:
    Text content is sent incrementally as delta=true segments
    Each segment has an index pointing to the target content slot
    The final segment marks completion with status=completed
Example Streaming Sequence:
{"status":"created","id":"response_...","object":"response"}
{"status":"created","id":"msg_...","object":"message","type":"message","role":"assistant"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"Hello","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":", ","object":"content"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"world","object":"content"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"Hello, world","object":"content"}
{"status":"completed","id":"msg_...","object":"message", ...}
{"status":"completed","id":"response_...","object":"response", ...}
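On the client side, a stream like this can be folded back into text by buffering deltas per content index. The sketch below is one way to do it, assuming each event arrives as one JSON object per line and that the completed segment carries the full text, as in the example. The function name assemble_text is hypothetical.

```python
import json


def assemble_text(lines):
    """Rebuild text content from JSON event lines; returns {index: text}."""
    buffers = {}   # index -> list of delta chunks seen so far
    finished = {}  # index -> final text
    for line in lines:
        event = json.loads(line)
        # Skip response and message events, and non-text content.
        if event.get("object") != "content" or event.get("type") != "text":
            continue
        index = event.get("index") or 0
        if event.get("delta"):
            buffers.setdefault(index, []).append(event.get("text") or "")
        elif event.get("status") == "completed":
            # Prefer the full text on the completed segment; fall back to
            # the joined deltas if it is missing.
            finished[index] = event.get("text") or "".join(buffers.get(index, []))
    # Slots that never received a completed segment keep their buffered text.
    for index, chunks in buffers.items():
        finished.setdefault(index, "".join(chunks))
    return finished


stream = [
    '{"status":"created","id":"response_1","object":"response"}',
    '{"status":"in_progress","type":"text","index":0,"delta":true,"text":"Hello","object":"content"}',
    '{"status":"in_progress","type":"text","index":0,"delta":true,"text":", world","object":"content"}',
    '{"status":"completed","type":"text","index":0,"delta":false,"text":"Hello, world","object":"content"}',
]
print(assemble_text(stream))  # {0: 'Hello, world'}
```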
Status Transitions
| State | Description |
|---|---|
| created | Initial state when object is created |
| in_progress | Operation is being processed |
| completed | Operation finished successfully |
| failed | Operation terminated with errors |
| rejected | Operation was rejected by the system |
| canceled | Operation was canceled by the user |
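A client often needs to know whether a status means the run is finished. The sketch below groups the states from the table into active and terminal sets and finds the first terminal status reported on a response object. Treating completed, failed, rejected, and canceled as terminal is an assumption drawn from their descriptions; the remaining RunStatus values (queued, incomplete, unknown) are left unclassified because the table does not describe them. The function names are hypothetical.

```python
# Assumed grouping, based on the descriptions in the status table.
TERMINAL = {"completed", "failed", "rejected", "canceled"}
ACTIVE = {"created", "in_progress"}


def classify_status(status):
    """Return "terminal", "active", or "unclassified" for a status string."""
    if status in TERMINAL:
        return "terminal"
    if status in ACTIVE:
        return "active"
    return "unclassified"


def final_response_status(events):
    """Return the first terminal status seen on a response event, else None."""
    for event in events:
        if (event.get("object") == "response"
                and classify_status(event.get("status")) == "terminal"):
            return event["status"]
    return None


events = [
    {"object": "response", "status": "created"},
    {"object": "message", "status": "completed"},
    {"object": "response", "status": "completed"},
]
print(final_response_status(events))  # completed
```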
Best Practices
Stream Handling:
    Buffer delta segments until status=completed is received
    Use msg_id to correlate content with the parent message
    Respect index for multi-segment messages
Error Handling:
    Check for the error field in responses
    Monitor for failed status transitions
    Implement retry logic for recoverable errors
State Management:
    Use session_id for conversation continuity
    Track created_at/completed_at for latency monitoring
    Use sequence_number for ordering (if implemented)
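The error-handling points above could be combined roughly as follows: inspect every event for an error field or a failed status, and retry only when the error code looks recoverable. This is a sketch under stated assumptions. The protocol does not define which codes are recoverable, so the codes in RETRYABLE_CODES are hypothetical, as are the names AgentRunError, check_event, and run_with_retry; send stands in for whatever callable issues the request and returns an iterable of event dicts.

```python
import time

# Hypothetical codes; substitute whatever your server actually reports.
RETRYABLE_CODES = {"rate_limited", "timeout"}


class AgentRunError(Exception):
    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_event(event):
    """Raise AgentRunError if the event carries an error or a failed status."""
    error = event.get("error")
    if error:
        raise AgentRunError(error.get("code", "unknown"), error.get("message", ""))
    if event.get("status") == "failed":
        # Message events report errors through their own code/message fields.
        raise AgentRunError(event.get("code") or "unknown", event.get("message") or "")


def run_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Collect events from send(), retrying recoverable failures with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            events = []
            for event in send():
                check_event(event)
                events.append(event)
            return events
        except AgentRunError as exc:
            if exc.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing sleep as a parameter keeps the helper easy to test without real delays.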
Example Use Case
User Query:
{
"input": [{
"role": "user",
"content": [{"type": "text", "text": "Describe this image"}],
"type": "message"
}],
"stream": true,
"model": "gpt-4-vision"
}
Agent Response Stream:
{"id":"response_123","object":"response","status":"created"}
{"id":"msg_abc","object":"message","type":"message","role":"assistant","status":"created"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":"This","object":"content","msg_id":"msg_abc"}
{"status":"in_progress","type":"text","index":0,"delta":true,"text":" image shows...","object":"content","msg_id":"msg_abc"}
{"status":"completed","type":"text","index":0,"delta":false,"text":"This image shows...","object":"content","msg_id":"msg_abc"}
{"id":"msg_abc","status":"completed","object":"message"}
{"id":"response_123","status":"completed","object":"response"}
Agent API Protocol Builder
The Agent API protocol provides a layered builder pattern for generating streaming response data that conforms to the protocol. With the agent_api_builder module, developers can construct complex streaming response sequences.
1. Builder Architecture
The Agent API builder uses a three-layer design:
ResponseBuilder: manages the entire response flow
MessageBuilder: builds and manages individual message objects
ContentBuilder: builds and manages individual content objects
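To show how three such layers can cooperate, here is a minimal standalone sketch of the same idea in plain Python. It is not the agent_api_builder implementation: the classes ResponseSketch, MessageSketch, and ContentSketch and the function stream_text are hypothetical stand-ins, and the event dictionaries only approximate the protocol's fields.

```python
from uuid import uuid4


class ContentSketch:
    """Innermost layer: accumulates text and emits content events."""

    def __init__(self, msg_id, index):
        self.msg_id, self.index, self.chunks = msg_id, index, []

    def _event(self, status, delta, text):
        return {"object": "content", "type": "text", "status": status,
                "index": self.index, "delta": delta, "text": text,
                "msg_id": self.msg_id}

    def add_text_delta(self, text):
        self.chunks.append(text)
        return self._event("in_progress", True, text)

    def complete(self):
        return self._event("completed", False, "".join(self.chunks))


class MessageSketch:
    """Middle layer: owns the message ID and creates content builders."""

    def __init__(self, role):
        self.id = "msg_" + str(uuid4())
        self.role = role

    def event(self, status):
        return {"object": "message", "type": "message", "id": self.id,
                "role": self.role, "status": status}

    def create_content(self, index=0):
        return ContentSketch(self.id, index)


class ResponseSketch:
    """Outer layer: owns the response ID and creates message builders."""

    def __init__(self, session_id=None):
        self.id = "response_" + str(uuid4())
        self.session_id = session_id

    def event(self, status):
        return {"object": "response", "id": self.id,
                "session_id": self.session_id, "status": status}

    def create_message(self, role="assistant"):
        return MessageSketch(role)


def stream_text(tokens, session_id=None):
    """Yield a full event sequence for one text message."""
    response = ResponseSketch(session_id)
    yield response.event("created")
    yield response.event("in_progress")
    message = response.create_message()
    yield message.event("created")
    content = message.create_content(index=0)
    for token in tokens:
        yield content.add_text_delta(token)
    yield content.complete()
    yield message.event("completed")
    yield response.event("completed")


for event in stream_text(["Hello", " ", "World"], session_id="session_123"):
    print(event["object"], event["status"])
```

Each layer knows only about the layer directly below it, which is what lets the outer builder drive the whole sequence.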
2. Core Classes
ResponseBuilder (Response Builder)
from agentscope_runtime.engine.helpers.agent_api_builder import ResponseBuilder
# Create response builder
response_builder = ResponseBuilder(session_id="session_123")
# Set response status
response_builder.created()  # Created status
response_builder.in_progress()  # In progress status
response_builder.completed()  # Completed status
# Create message builder
message_builder = response_builder.create_message_builder(
    role="assistant",
    message_type="message"
)
MessageBuilder (Message Builder)
# Create content builder
content_builder = message_builder.create_content_builder(
    content_type="text",
    index=0
)
# Add content to message (content is an already-built content object)
message_builder.add_content(content)
# Complete message building
message_builder.complete()
ContentBuilder (Content Builder)
# Add text delta
content_builder.add_text_delta("Hello")
content_builder.add_text_delta(" World")
# Set complete text content
content_builder.set_text("Hello World")
# Set image content
content_builder.set_image_url("https://example.com/image.jpg")
# Set data content
content_builder.set_data({"key": "value"})
# Complete content building
content_builder.complete()
3. Complete Usage Example
The following example uses the Agent API builder to generate a complete streaming response sequence:
from agentscope_runtime.engine.helpers.agent_api_builder import ResponseBuilder
def generate_streaming_response(text_tokens):
    """Generate streaming response sequence"""
    # Create response builder
    response_builder = ResponseBuilder(session_id="session_123")
    # Generate the complete streaming response sequence from the caller's tokens
    for event in response_builder.generate_streaming_response(
        text_tokens=text_tokens,
        role="assistant"
    ):
        yield event
# Usage example
for event in generate_streaming_response(["Hello", " ", "World", "!"]):
    print(event)
4. Streaming Response Sequence
The generate_streaming_response method produces a standard streaming response sequence, in this order:
Response Creation (response.created)
Response Start (response.in_progress)
Message Creation (message.created)
Content Streaming Output (content.delta events)
Content Completion (content.completed)
Message Completion (message.completed)
Response Completion (response.completed)
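The ordering above can be checked mechanically, for example in a test. The sketch below assumes each event is a dict whose object and status fields encode the step (so content.delta corresponds to a content event with status in_progress), and that the stream holds a single message with a single content part. The names EXPECTED_ORDER and follows_sequence are hypothetical.

```python
# (object, status) pairs for the seven steps, under the mapping assumed above.
EXPECTED_ORDER = [
    ("response", "created"),
    ("response", "in_progress"),
    ("message", "created"),
    ("content", "in_progress"),  # one or more delta events
    ("content", "completed"),
    ("message", "completed"),
    ("response", "completed"),
]


def follows_sequence(events):
    """Return True if events match EXPECTED_ORDER, with repeats collapsed."""
    steps = []
    for event in events:
        step = (event.get("object"), event.get("status"))
        # Consecutive identical steps (for example several deltas) count once.
        if not steps or steps[-1] != step:
            steps.append(step)
    return steps == EXPECTED_ORDER


good = [
    {"object": "response", "status": "created"},
    {"object": "response", "status": "in_progress"},
    {"object": "message", "status": "created"},
    {"object": "content", "status": "in_progress", "delta": True, "text": "Hi"},
    {"object": "content", "status": "in_progress", "delta": True, "text": "!"},
    {"object": "content", "status": "completed", "delta": False, "text": "Hi!"},
    {"object": "message", "status": "completed"},
    {"object": "response", "status": "completed"},
]
print(follows_sequence(good))  # True
```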
5. Supported Content Types
ContentBuilder supports multiple content types:
TextContent: Text content, supports incremental output
ImageContent: Image content, supports URL and base64 formats
DataContent: Data content, supports arbitrary JSON data
AudioContent: Audio content, supports multiple audio formats
FileContent: File content, supports file URLs and file data
RefusalContent: Refusal content, used to indicate refusal to execute
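A consumer typically branches on the type field to handle each of these content types. The sketch below summarizes one content part, given as a plain dict with the field names from the content models. The type strings "text", "image", and "data" appear in this document's examples; "audio", "file", and "refusal" are assumed values for the corresponding ContentType members. The function name describe_content is hypothetical.

```python
def describe_content(part):
    """Return a short, human-readable summary of one content part."""
    kind = part.get("type")
    if kind == "text":
        return part.get("text") or ""
    if kind == "image":
        return f"[image: {part.get('image_url')}]"
    if kind == "data":
        return f"[data: {sorted((part.get('data') or {}).keys())}]"
    if kind == "audio":  # assumed type string
        return f"[audio: {part.get('format')}]"
    if kind == "file":  # assumed type string
        name = part.get("filename") or part.get("file_id") or part.get("file_url")
        return f"[file: {name}]"
    if kind == "refusal":  # assumed type string
        return f"[refused: {part.get('refusal')}]"
    return f"[unknown content type: {kind}]"


parts = [
    {"type": "text", "text": "This is an image:"},
    {"type": "image", "image_url": "https://example.com/image.jpg"},
    {"type": "data", "data": {"key": "value"}},
]
for part in parts:
    print(describe_content(part))
```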
6. Best Practices
State Management: Call the status methods in the correct order (created → in_progress → completed)
Content Indexing: Set index values correctly for multi-content messages
Incremental Output: Use the delta methods (add_text_delta, add_data_delta) to implement streaming output
Error Handling: Handle exceptions raised during the building process
Resource Cleanup: Call the complete method promptly to finish building
7. Advanced Usage
Multi-Content Message Building
# Create message containing text and image
message_builder = response_builder.create_message_builder()
# Add text content
text_builder = message_builder.create_content_builder("text", index=0)
text_builder.set_text("This is an image:")
text_builder.complete()
# Add image content
image_builder = message_builder.create_content_builder("image", index=1)
image_builder.set_image_url("https://example.com/image.jpg")
image_builder.complete()
# Complete message
message_builder.complete()
Data Content Building
# Create message containing structured data
data_builder = message_builder.create_content_builder("data", index=0)
# Set data content
data_builder.set_data({
    "type": "function_call",
    "name": "get_weather",
    "arguments": '{"city": "Beijing"}'
})
# Add data deltas
data_builder.add_data_delta({"status": "processing"})
data_builder.add_data_delta({"result": "sunny"})
data_builder.complete()
With the Agent API builder, developers can construct complex streaming responses that conform to the protocol, which supports a more responsive user experience and finer control over the response.