Agent Execution
Execute agents via Server-Sent Events (SSE) streaming and manage conversation sessions.
Authentication
All execution endpoints support hybrid authentication:
| Method | Header |
|---|---|
| JWT | Authorization: Bearer <jwt> + x-tenant-id |
| API Key | X-Api-Key: YOUR_API_KEY |
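The two header sets in the table can be produced by one small helper. This is a minimal sketch: `build_auth_headers` is a hypothetical name, and preferring the API key when both credentials are supplied is an assumption, not documented behavior.

```python
from typing import Dict, Optional


def build_auth_headers(
    api_key: Optional[str] = None,
    jwt: Optional[str] = None,
    tenant_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return request headers for one of the two documented auth methods."""
    if api_key:
        # API key auth needs a single header.
        return {"X-Api-Key": api_key}
    if jwt and tenant_id:
        # JWT auth needs the bearer token plus the tenant header.
        return {"Authorization": f"Bearer {jwt}", "x-tenant-id": tenant_id}
    raise ValueError("Provide api_key, or both jwt and tenant_id")
```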
POST /api/v2/agents/:agent_id/stream
Execute an agent with real-time SSE streaming.
Path Parameters:
| Param | Type | Description |
|---|---|---|
| agent_id | UUID | Agent ID |
Request Body:
{
"message": "List active contracts",
"session_id": "uuid (optional)",
"attachments": [
{
"url": "https://example.com/image.png",
"type": "image"
}
],
"options": {
"temperature": 0.7,
"max_tokens": 4096
},
"metadata": {
"trace_id": "string",
"correlation_id": "string"
}
}
| Field | Type | Required | Description |
|---|---|---|---|
| message | string | Yes | User message |
| session_id | UUID | No | Session ID to maintain conversation context |
| attachments | array | No | Multimodal files (see Media) |
| options | object | No | Execution configuration |
| metadata | object | No | Tracing metadata |
Attachments
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | File URL or base64 data URI |
| type | string | No | image, audio, or document (auto-detected if omitted) |
| filename | string | No | Filename (helps type detection) |
Supported types:
- image: jpg, jpeg, png, gif, webp, bmp
- audio: mp3, wav, ogg, flac, m4a, webm, mp4
- document: pdf, xlsx, xls, docx, csv, txt
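A client can set `type` explicitly instead of relying on auto-detection. The sketch below maps the extensions listed above to the three types; `guess_attachment_type` and `build_attachment` are hypothetical helpers, and the server's own detection rules may differ.

```python
import os
from typing import Dict, Optional

# Extension lists copied from the supported types above.
SUPPORTED_EXTENSIONS = {
    "image": {"jpg", "jpeg", "png", "gif", "webp", "bmp"},
    "audio": {"mp3", "wav", "ogg", "flac", "m4a", "webm", "mp4"},
    "document": {"pdf", "xlsx", "xls", "docx", "csv", "txt"},
}


def guess_attachment_type(filename: str) -> Optional[str]:
    """Return image, audio, or document, or None for unknown extensions."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    for kind, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return kind
    return None


def build_attachment(url: str, filename: Optional[str] = None) -> Dict[str, str]:
    """Build one attachments[] entry, adding type only when it can be guessed."""
    attachment = {"url": url}
    if filename:
        attachment["filename"] = filename
        kind = guess_attachment_type(filename)
        if kind:
            attachment["type"] = kind
    return attachment
```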
SSE Response
See Streaming for the complete event format.
Errors:
| Code | Status | Description |
|---|---|---|
| VALIDATION_ERROR | 400 | Missing message field |
| AGENT_NOT_FOUND | 403 | Agent not available |
| AGENT_ARCHIVED | — | Archived agent; execution blocked |
| STREAM_ERROR | 500 | Error during streaming |
GET /api/v2/agents/:agent_id/sessions
List conversation sessions for an agent.
Query Parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| limit | integer | 20 | Max sessions returned |
| offset | integer | 0 | Pagination offset |
| include_expired | boolean | false | Include expired sessions |
Response:
{
"success": true,
"data": [
{
"id": "ee59c2b3-1466-4367-a365-7d62a9d93a2d",
"created_at": "2026-01-14T10:30:00.000Z",
"updated_at": "2026-01-14T11:45:00.000Z",
"message_count": 8,
"last_message_preview": "Found 5 active contracts for the client...",
"last_message_role": "assistant"
}
],
"count": 2,
"agent_id": "uuid"
}
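Paging through all sessions with `limit` and `offset` can be sketched as a generator. `fetch_page` is a hypothetical callable that performs the GET request with the given `limit` and `offset` and returns the `data` array. Because this section does not say whether `count` is the page size or the overall total, the sketch stops on the first short page instead of reading `count`.

```python
from typing import Callable, Dict, Iterator, List


def iter_sessions(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: int = 20,
) -> Iterator[Dict]:
    """Yield every session by walking limit/offset pages until a short page."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            # A page shorter than limit means there is nothing left to fetch.
            break
        offset += limit
```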
Session Management
Sessions are managed through the /api/v1/sessions endpoints.
Session Expiration (TTL by Plan)
| Plan | Session TTL | Redis TTL |
|---|---|---|
| free | 24 hours | 24h |
| basic | 7 days | 24h (cap) |
| pro | 7 days | 24h (cap) |
| enterprise | 30 days | 24h (cap) |
- Each user interaction renews expires_at based on the current plan
- Expired sessions are deleted by a cleanup job (every 6 hours)
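The renewal rule can be illustrated with a short calculation. This is a sketch under the assumption that renewal simply sets `expires_at` to the interaction time plus the plan's session TTL; `renewed_expires_at` is a hypothetical name, and the separate 24h Redis cap is not modeled.

```python
from datetime import datetime, timedelta, timezone

# Session TTL per plan, taken from the table above.
SESSION_TTL = {
    "free": timedelta(hours=24),
    "basic": timedelta(days=7),
    "pro": timedelta(days=7),
    "enterprise": timedelta(days=30),
}


def renewed_expires_at(plan: str, now: datetime) -> datetime:
    """Return the new expires_at after an interaction at time `now`."""
    return now + SESSION_TTL[plan]


if __name__ == "__main__":
    interaction = datetime(2026, 1, 14, 10, 30, tzinfo=timezone.utc)
    print(renewed_expires_at("pro", interaction).isoformat())
```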
Key Endpoints
| Endpoint | Method | Description |
|---|---|---|
| GET /api/v1/sessions | GET | List sessions with filters |
| GET /api/v1/sessions/active | GET | Currently executing sessions |
| GET /api/v1/sessions/stats | GET | Aggregated metrics |
| GET /api/v1/sessions/:sessionId | GET | Session details with conversation history and system prompt |
| GET /api/v1/sessions/:sessionId/tool-calls | GET | Granular tool call logs (input, output, duration) |
| GET /api/v1/sessions/:executionId/live | GET | Real-time SSE stream |
| POST /api/v1/sessions/:sessionId/archive | POST | Archive session |
| POST /api/v1/sessions/:sessionId/clear | POST | Clear conversation history |
| POST /api/v1/sessions/:sessionId/compact | POST | Compact session (replace old messages with LLM summary) |
| POST /api/v1/sessions/batch/archive | POST | Archive multiple sessions (max 50) |
| POST /api/v1/sessions/batch/compact | POST | Batch compact eligible sessions |
Session Compaction
Compaction replaces old messages with an LLM-generated summary, preserving recent messages intact.
| Tier | Threshold | Preserved Messages | Model |
|---|---|---|---|
| default | 20 msgs | 6 | gpt-4.1-nano |
| basic | 20 msgs | 6 | gpt-4.1-nano |
| premium | 35 msgs | 10 | gpt-4.1-nano |
| enterprise | 70 msgs | 20 | gpt-4.1-mini |
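The threshold and preserved-message columns can be read as a split of the history into a part to summarize and a part to keep. The sketch below assumes the threshold is inclusive (compaction applies at 20 messages, not 21) and leaves out the LLM summary call itself; `split_for_compaction` is a hypothetical name.

```python
from typing import Dict, List, Tuple

# Threshold and preserved-message counts from the table above.
COMPACTION_TIERS = {
    "default": {"threshold": 20, "preserved": 6},
    "basic": {"threshold": 20, "preserved": 6},
    "premium": {"threshold": 35, "preserved": 10},
    "enterprise": {"threshold": 70, "preserved": 20},
}


def split_for_compaction(
    messages: List[Dict], tier: str = "default"
) -> Tuple[List[Dict], List[Dict]]:
    """Return (to_summarize, to_keep); to_summarize is empty below the threshold."""
    cfg = COMPACTION_TIERS[tier]
    if len(messages) < cfg["threshold"]:
        return [], list(messages)
    keep = cfg["preserved"]
    # Older messages go to the summary; the most recent `keep` stay intact.
    return list(messages[:-keep]), list(messages[-keep:])
```

With 25 messages on the default tier, the first 19 would be summarized and the last 6 kept as they are.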
Derived Sessions (Webhooks)
Sessions created via webhooks with session_strategy.mode: 'derive' use deterministic UUIDs (hash of user + company + agent identifiers). The same user always generates the same session_id.
When an archived derived session receives a new message:
- The same deterministic session_id is generated
- No active session is found
- A new session is created with clean history
- The session is reactivated via UPSERT
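To illustrate why the same user always maps to the same session, here is a sketch of a deterministic, name-based UUID. The actual hash function, namespace, and field order are not documented in this section; `uuid.uuid5` with `uuid.NAMESPACE_URL` and a colon-joined key are stand-ins chosen for illustration only, so IDs produced here will not match the service's.

```python
import uuid


def derive_session_id(user_id: str, company_id: str, agent_id: str) -> str:
    """Derive a stable UUID from the three identifiers (illustrative scheme)."""
    # Assumption: the key layout and namespace are hypothetical.
    key = f"{user_id}:{company_id}:{agent_id}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


if __name__ == "__main__":
    first = derive_session_id("user-1", "company-1", "agent-1")
    second = derive_session_id("user-1", "company-1", "agent-1")
    print(first == second)  # identical inputs always give the same ID
```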
GET /api/v1/sessions/:sessionId/tool-calls
Get detailed tool call logs for a session. Each entry includes the tool's input arguments, output result, execution duration, and success/failure status.
Query Parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| iteration | integer | — | Filter by agent loop iteration |
| tool_name | string | — | Filter by tool name |
Response:
{
"success": true,
"data": [
{
"id": "uuid",
"tool_name": "search_contracts",
"tool_call_id": "call_abc123",
"tool_input": { "status": "active", "limit": 10 },
"tool_output": { "rows": [...], "count": 5 },
"output_preview": "5 rows returned",
"success": true,
"duration_ms": 450,
"error_message": null,
"iteration": 1,
"call_index": 0,
"execution_id": "uuid",
"created_at": "2026-02-19T14:30:01Z"
}
],
"count": 1,
"session_id": "a1b2c3d4-..."
}
Tool call logs are recorded starting from a previous release. Sessions from before this update will return an empty array.
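The `data` array lends itself to simple per-tool aggregation. This sketch uses only the `tool_name`, `success`, and `duration_ms` fields shown in the response above; `summarize_tool_calls` is a hypothetical helper, and treating a missing `duration_ms` as 0 is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


def summarize_tool_calls(entries: List[Dict]) -> Dict[str, Dict[str, int]]:
    """Count calls, failures, and total duration per tool name."""
    summary = defaultdict(lambda: {"calls": 0, "failures": 0, "total_ms": 0})
    for entry in entries:
        stats = summary[entry["tool_name"]]
        stats["calls"] += 1
        # A null or missing duration is counted as 0 ms.
        stats["total_ms"] += entry.get("duration_ms") or 0
        if not entry.get("success"):
            stats["failures"] += 1
    return dict(summary)
```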
Conversation History Normalization
Starting from a previous release, the agent loop persists the complete tool call sequence internally (including role: 'tool' messages and assistant messages with tool_calls). However, all session endpoints normalize the conversation_history before returning:
- Messages with role: 'tool' are filtered out (internal to the agent loop)
- Assistant messages that only contain tool_calls (no text content) are filtered out
- Technical fields (tool_calls, tool_call_id, name) are stripped from remaining messages
- The message_count in session lists reflects only renderable messages (user + assistant with content)
This ensures API consumers always receive the familiar {role: 'user'|'assistant', content: string} format.
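The normalization rules can be expressed as a small filter. This is a sketch of the described behavior, not the server's implementation; `normalize_history` is a hypothetical name, and the raw message shape follows the common chat-completions convention.

```python
from typing import Dict, List

# Fields that are removed from every message that survives filtering.
TECHNICAL_FIELDS = ("tool_calls", "tool_call_id", "name")


def normalize_history(messages: List[Dict]) -> List[Dict]:
    """Drop tool traffic and strip technical fields from the remaining messages."""
    normalized = []
    for msg in messages:
        if msg.get("role") == "tool":
            continue  # tool results are internal to the agent loop
        if (
            msg.get("role") == "assistant"
            and msg.get("tool_calls")
            and not msg.get("content")
        ):
            continue  # assistant turn that only requested tool calls
        normalized.append(
            {k: v for k, v in msg.items() if k not in TECHNICAL_FIELDS}
        )
    return normalized
```

The length of the returned list corresponds to the `message_count` described above.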
Template Variables in System Prompt
The agent's system prompt supports dynamic template variables that are resolved at execution time. This allows personas to reference current date/time without manual updates.
| Variable | Description | Example |
|---|---|---|
| {{current_date}} | Today's date (pt-BR format) | 02/03/2026 |
| {{current_time}} | Current time (pt-BR format) | 14:30:00 |
| {{current_weekday}} | Weekday name (pt-BR) | segunda-feira |
| {{agent_name}} | The agent's name | Sales Assistant |
Variables are interpolated after the system prompt is assembled and before it is sent to the LLM. Unrecognized variables (e.g., {{custom_field}}) are preserved as-is.
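The interpolation behavior, including the pass-through of unrecognized variables, can be sketched with a regular expression. `interpolate` is a hypothetical function; the timezone used by the service is not stated here, so the caller supplies `now`.

```python
import re
from datetime import datetime

# Monday-first weekday names, matching datetime.weekday() indexing.
WEEKDAYS_PT_BR = [
    "segunda-feira", "terça-feira", "quarta-feira", "quinta-feira",
    "sexta-feira", "sábado", "domingo",
]
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def interpolate(prompt: str, agent_name: str, now: datetime) -> str:
    """Replace known {{variables}}; leave unrecognized ones untouched."""
    values = {
        "current_date": now.strftime("%d/%m/%Y"),
        "current_time": now.strftime("%H:%M:%S"),
        "current_weekday": WEEKDAYS_PT_BR[now.weekday()],
        "agent_name": agent_name,
    }
    # Unknown names fall back to the original placeholder text.
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), prompt)
```

For 2 March 2026 at 14:30:00 this yields the values in the Example column, and `{{custom_field}}` stays as written.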
System Prompt Snapshot
The GET /api/v1/sessions/:sessionId response includes a top-level system_prompt field containing the exact system prompt used in the first turn of the session. This is captured once and never overwritten, even if the agent's persona changes later.
{
"data": {
"session": { "..." },
"conversation_history": [ "..." ],
"metrics": { "..." },
"logs": [ "..." ],
"system_prompt": "# Agent Name\n\n## Identidade\n..."
}
}
| Scenario | system_prompt value |
|---|---|
| Session created after this feature was introduced | Full system prompt string |
| Session created in older versions | null |
| Session with no interactions yet | null |
GET /api/v2/agents/:agent_id/tools
Get the complete JSON Schema for each tool available to the agent.
Response:
{
"success": true,
"data": [
{
"name": "zigma_list_contracts",
"description": "List active client contracts",
"parameters": {
"type": "object",
"properties": {
"customer_id": {
"type": "string",
"description": "Client ID"
}
},
"required": ["customer_id"]
}
}
],
"count": 39,
"agent_id": "uuid"
}
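One use for the returned schemas is a quick client-side check of tool arguments. The sketch below checks only the `required` list and does not perform full JSON Schema validation; `missing_required` is a hypothetical helper.

```python
from typing import Dict, List


def missing_required(tool_schema: Dict, arguments: Dict) -> List[str]:
    """Return the required parameter names that are absent from arguments."""
    required = tool_schema.get("parameters", {}).get("required", [])
    return [name for name in required if name not in arguments]


if __name__ == "__main__":
    # Tool entry copied from the response example above.
    tool = {
        "name": "zigma_list_contracts",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    }
    print(missing_required(tool, {}))
```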
Examples
Execute Agent (Stream)
The same request in curl, Node.js, and Python:
curl -N "https://llm.zihin.ai/api/v2/agents/AGENT_ID/stream" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"message": "List active contracts"}'
const response = await fetch(
  'https://llm.zihin.ai/api/v2/agents/AGENT_ID/stream',
  {
    method: 'POST',
    headers: {
      'X-Api-Key': 'YOUR_API_KEY',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ message: 'List active contracts' }),
  }
);
// Read the SSE stream chunk by chunk
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact when they span two chunks
  process.stdout.write(decoder.decode(value, { stream: true }));
}
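import requests
response = requests.post(
    "https://llm.zihin.ai/api/v2/agents/AGENT_ID/stream",
    headers={
        "X-Api-Key": "YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={"message": "List active contracts"},
    stream=True,
)
# SSE streams are UTF-8; setting the encoding enables incremental decoding
response.encoding = "utf-8"
# Read SSE stream; decode_unicode handles characters split across chunks
for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
    print(chunk, end="")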
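The examples above print the raw stream. To work with individual events, the lines can be grouped following the generic Server-Sent Events wire format (`event:` and `data:` fields, a blank line ending each event). This is a minimal sketch: it ignores the `id:` and `retry:` fields, and the event names this API emits are defined in Streaming, not assumed here.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group SSE lines into {"event", "data"} dicts, one per dispatched event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

With `requests`, one plausible way to feed it is `parse_sse(response.iter_lines(decode_unicode=True))` after setting `response.encoding = "utf-8"`.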
Execute with Attachment
curl -N -X POST "https://llm.zihin.ai/api/v2/agents/AGENT_ID/stream" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"message": "What do you see in this image?",
"attachments": [
{ "url": "https://example.com/photo.jpg", "type": "image" }
]
}'
Continue Conversation
curl -N "https://llm.zihin.ai/api/v2/agents/AGENT_ID/stream" \
-H "X-Api-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"message": "Show more details", "session_id": "SESSION_ID"}'