Event Data Models

Note

Technical specification for HoneyHive event data structures

This document defines the exact data models and formats used for events in the HoneyHive SDK.

Events are the core observability primitives in HoneyHive, representing discrete operations or interactions in your LLM application.

Core Event Model

class Event

The primary event data structure used throughout HoneyHive.

event_id: str

Unique identifier for the event.

Format: UUID v4 string
Example: "01234567-89ab-4cde-8f01-23456789abcd"
Required: Auto-generated by SDK

session_id: str

Session identifier that groups related events.

Format: UUID v4 string
Example: "fedcba98-7654-4321-8fed-cba987654321"
Required: Auto-generated by tracer

parent_id: str | None

Parent event ID for nested operations.

Format: UUID v4 string
Example: "89abcdef-0123-4567-89ab-cdef01234567"
Required: No (None for root events)
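
The three identifier fields above share the UUID v4 string format, so one check can cover all of them. The following is a minimal sketch using Python's standard uuid module; new_event_id and is_uuid4 are hypothetical helper names for illustration, not part of the HoneyHive SDK.

```python
import uuid


def new_event_id() -> str:
    """Generate an identifier in the documented UUID v4 string format."""
    return str(uuid.uuid4())


def is_uuid4(value: str) -> bool:
    """Return True only if value parses as a UUID whose version field is 4."""
    try:
        return uuid.UUID(value).version == 4
    except (ValueError, AttributeError, TypeError):
        # Malformed strings and non-string inputs are treated as invalid.
        return False


event_id = new_event_id()
print(is_uuid4(event_id))      # True
print(is_uuid4("not-a-uuid"))  # False
```

A check like this is most useful when events arrive from code outside the SDK, where IDs are not auto-generated.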

event_type: str

Categorizes the type of operation.

Valid Values:
- "model" - LLM model calls and interactions
- "tool" - Tool/function calls and external API interactions
- "chain" - Chain/workflow operations and multi-step processes

Example: "model"
Required: Yes

event_name: str

Human-readable name for the specific operation.

Format: Descriptive string, typically kebab-case
Example: "openai-chat-completion"
Required: Yes

start_time: datetime

ISO 8601 timestamp when the event started.

Format: YYYY-MM-DDTHH:MM:SS.ffffffZ (UTC, microsecond precision)
Example: "2024-01-15T10:30:45.123456Z"
Required: Auto-generated by SDK

end_time: datetime | None

ISO 8601 timestamp when the event completed.

Format: YYYY-MM-DDTHH:MM:SS.ffffffZ (UTC, microsecond precision)
Example: "2024-01-15T10:30:47.654321Z"
Required: Auto-generated by SDK

duration_ms: float | None

Event duration in milliseconds.

Calculation: end_time - start_time in milliseconds
Example: 2530.865 (for the start_time and end_time examples above)
Required: Auto-calculated by SDK
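
The calculation can be reproduced with the standard library. This is a minimal sketch, not the SDK's implementation; parse_timestamp and duration_ms are hypothetical helper names, and the inputs are the start_time and end_time example values from this page.

```python
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp with a trailing "Z"."""
    # Older Python versions of fromisoformat() reject "Z", so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_ms(start_time: str, end_time: str) -> float:
    """Return end_time - start_time in milliseconds."""
    delta = parse_timestamp(end_time) - parse_timestamp(start_time)
    return delta / timedelta(milliseconds=1)


print(duration_ms("2024-01-15T10:30:45.123456Z", "2024-01-15T10:30:47.654321Z"))
# 2530.865
```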

status: str

Event completion status.

Values:
- "success" - Completed successfully
- "error" - Failed with error
- "cancelled" - Cancelled before completion
- "timeout" - Timed out

Example: "success"
Required: Auto-determined by SDK

inputs: Dict[str, Any] | None

Input data for the operation.

Structure: Key-value pairs of input parameters
Example:

{
  "messages": [
    {"role": "user", "content": "Hello, world!"}
  ],
  "model": "gpt-3.5-turbo",
  "temperature": 0.7,
  "max_tokens": 150
}

Required: No (but recommended)

outputs: Dict[str, Any] | None

Output data from the operation.

Structure: Key-value pairs of output data
Example:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Required: No (but recommended)

metadata: Dict[str, Any] | None

Additional context and metadata.

Structure: Key-value pairs of contextual information
Example:

{
  "user_id": "user_12345",
  "environment": "production",
  "model_version": "gpt-3.5-turbo-0613",
  "request_id": "req_abc123",
  "tags": ["chat", "customer-support"]
}

Required: No

metrics: Dict[str, int | float] | None

Numerical metrics associated with the event.

Structure: Key-value pairs of numeric measurements
Example:

{
  "latency_ms": 1250.5,
  "token_count": 21,
  "cost_usd": 0.0001,
  "cache_hit_rate": 0.85,
  "confidence_score": 0.92
}

Required: No

error: Dict[str, Any] | None

Error information if the event failed.

Structure:

{
  "type": "OpenAIError",
  "message": "Rate limit exceeded",
  "code": "rate_limit_exceeded",
  "traceback": "Traceback (most recent call last)...",
  "context": {
    "retry_after": 60,
    "request_id": "req_123"
  }
}

Required: No (only for failed events)
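
One way to populate this structure from a caught Python exception is sketched below. The helper name error_to_dict is hypothetical, and the code and context values are supplied by the caller because the source does not say how the SDK derives them.

```python
import traceback
from typing import Any, Dict, Optional


def error_to_dict(
    exc: BaseException,
    code: Optional[str] = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Convert an exception into the error structure shown above."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "code": code,
        # Full formatted traceback, including the exception line itself.
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "context": context or {},
    }


try:
    raise RuntimeError("Rate limit exceeded")
except RuntimeError as exc:
    error = error_to_dict(exc, code="rate_limit_exceeded", context={"retry_after": 60})

print(error["type"], "-", error["message"])  # RuntimeError - Rate limit exceeded
```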

project: str

Project identifier for organization.

Format: String identifier
Example: "customer-chat-bot"
Required: Yes (set by tracer)

source: str

Source system or component identifier.

Format: String identifier
Example: "chat-service"
Required: Yes (set by tracer)

user_properties: Dict[str, Any] | None

User-defined custom properties.

Structure: Flexible key-value pairs
Example:

{
  "experiment_id": "exp_001",
  "feature_flags": ["new_ui", "beta_model"],
  "user_tier": "premium",
  "custom_field": "custom_value"
}

Required: No

LLM Event Model

class LLMEvent

Specialized event model for LLM operations. Extends the base Event model.

Inherits: All fields from Event

LLM-Specific Fields:

model: str

LLM model identifier.

Examples: "gpt-3.5-turbo", "claude-3-sonnet-20240229", "llama-2-70b"
Required: Yes for LLM events

provider: str

LLM provider/service.

Values: "openai", "anthropic", "google", "azure", "local"
Required: Yes for LLM events

prompt_template: str | None

Template used to generate the prompt.

Example: "Answer the following question: {question}"
Required: No

prompt_variables: Dict[str, Any] | None

Variables used in prompt template.

Example: {"question": "What is the capital of France?"}
Required: No
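
The two fields are meant to be read together: substituting prompt_variables into prompt_template yields the prompt sent to the model. A small sketch, assuming the template uses Python str.format-style placeholders (the source does not specify the templating syntax):

```python
prompt_template = "Answer the following question: {question}"
prompt_variables = {"question": "What is the capital of France?"}

# Assumes str.format-style placeholders; a KeyError signals a missing variable.
prompt = prompt_template.format(**prompt_variables)
print(prompt)
# Answer the following question: What is the capital of France?
```

Recording the template and variables separately makes it possible to group events by template while still reconstructing each rendered prompt.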

response_format: str | None

Expected response format.

Values: "text", "json", "function_call"
Required: No

tools: List[Dict[str, Any]] | None

Available tools/functions for the LLM.

Structure: OpenAI function calling format
Required: No

tool_calls: List[Dict[str, Any]] | None

Tool calls made by the LLM.

Structure: OpenAI tool call format
Required: No

Example LLM Event:

{
  "event_id": "evt_01234567",
  "session_id": "session_abcdef",
  "event_type": "model",
  "event_name": "openai-chat-completion",
  "start_time": "2024-01-15T10:30:45.123Z",
  "end_time": "2024-01-15T10:30:47.654Z",
  "duration_ms": 2531.0,
  "status": "success",
  "model": "gpt-3.5-turbo",
  "provider": "openai",
  "inputs": {
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 50
  },
  "outputs": {
    "choices": [
      {
        "message": {
          "role": "assistant",
          "content": "The capital of France is Paris."
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 12,
      "completion_tokens": 8,
      "total_tokens": 20
    }
  },
  "metrics": {
    "latency_ms": 2531.0,
    "tokens_per_second": 3.16,
    "cost_usd": 0.00004
  }
}
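
The metrics in this example are consistent with values derived from other fields: latency_ms equals duration_ms, and tokens_per_second matches completion_tokens divided by the duration in seconds. The sketch below shows that derivation. It is an assumption about how such metrics could be computed, not a description of the SDK, and derive_metrics is a hypothetical helper.

```python
from typing import Any, Dict


def derive_metrics(event: Dict[str, Any]) -> Dict[str, float]:
    """Derive latency and throughput metrics from an LLM event dict."""
    duration_ms = event["duration_ms"]
    completion_tokens = event["outputs"]["usage"]["completion_tokens"]
    return {
        "latency_ms": duration_ms,
        # Completion tokens generated per second of wall-clock time.
        "tokens_per_second": round(completion_tokens / (duration_ms / 1000), 2),
    }


event = {
    "duration_ms": 2531.0,
    "outputs": {
        "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}
    },
}
print(derive_metrics(event))
# {'latency_ms': 2531.0, 'tokens_per_second': 3.16}
```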

Tool Event Model

class ToolEvent

Event model for tool/function calls.

Inherits: All fields from Event

Tool-Specific Fields:

function_name: str

Name of the function/tool called.

Example: "get_weather"
Required: Yes for tool events

function_description: str | None

Description of the function’s purpose.

Example: "Get current weather for a location"
Required: No

parameters: Dict[str, Any] | None

Function parameters schema.

Structure: JSON Schema format
Required: No

return_value: Any | None

Function return value.

Structure: Any valid JSON value
Required: No

Example Tool Event:

{
  "event_id": "evt_tool_001",
  "session_id": "session_abcdef",
  "event_type": "tool",
  "event_name": "weather-api-call",
  "function_name": "get_weather",
  "inputs": {
    "location": "Paris, France",
    "units": "celsius"
  },
  "outputs": {
    "temperature": 22,
    "conditions": "sunny",
    "humidity": 65
  },
  "metrics": {
    "api_latency_ms": 150.5
  }
}
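
To illustrate how the tool-specific fields relate to an actual function call, the sketch below wraps a call and assembles a dict shaped like the example above. record_tool_event is a hypothetical helper, not an SDK function, and get_weather is a stand-in that returns fixed data. Fields the SDK or tracer generates (event_id, session_id, timestamps) are deliberately left out.

```python
import time
from typing import Any, Callable, Dict


def record_tool_event(
    func: Callable[..., Any], event_name: str, **kwargs: Any
) -> Dict[str, Any]:
    """Call func(**kwargs) and return a dict shaped like a tool event."""
    started = time.perf_counter()
    result = func(**kwargs)
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {
        "event_type": "tool",
        "event_name": event_name,
        "function_name": func.__name__,
        "inputs": kwargs,
        "outputs": result,
        "metrics": {"api_latency_ms": elapsed_ms},
    }


def get_weather(location: str, units: str) -> Dict[str, Any]:
    # Stand-in for a real API call; returns fixed data for illustration.
    return {"temperature": 22, "conditions": "sunny", "humidity": 65}


event = record_tool_event(
    get_weather, "weather-api-call", location="Paris, France", units="celsius"
)
print(event["function_name"], event["outputs"]["conditions"])  # get_weather sunny
```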

Evaluation Event Model

class EvaluationEvent

Event model for evaluation operations.

Inherits: All fields from Event

Evaluation-Specific Fields:

evaluator_name: str

Name of the evaluator used.

Example: "factual_accuracy"
Required: Yes for evaluation events

evaluator_version: str | None

Version of the evaluator.

Example: "v1.2.0"
Required: No

target_event_id: str

ID of the event being evaluated.

Format: UUID v4 string
Required: Yes for evaluation events

score: float | int | bool | None

Evaluation score.

Examples: 0.85, True, 7
Required: No

explanation: str | None

Human-readable explanation of the score.

Example: "Response is factually accurate and well-supported"
Required: No

criteria: Dict[str, Any] | None

Evaluation criteria used.

Structure: Evaluator-specific criteria
Required: No

Example Evaluation Event:

{
  "event_id": "evt_eval_001",
  "session_id": "session_abcdef",
  "event_type": "tool",
  "event_name": "factual-accuracy-check",
  "evaluator_name": "factual_accuracy",
  "target_event_id": "evt_01234567",
  "score": 0.92,
  "explanation": "Response contains accurate information with proper citations",
  "metrics": {
    "confidence": 0.95,
    "processing_time_ms": 1200
  }
}

Event Serialization

JSON Format:

Events are serialized to JSON for storage and transmission:

import json
from datetime import datetime, timezone

# Event serialization
event = {
    "event_id": "evt_123",
    "event_type": "model",
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
    "start_time": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    # ... other fields
}

json_data = json.dumps(event, ensure_ascii=False, indent=2)

Field Validation:

All events undergo validation before transmission:

from datetime import datetime, timezone
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field


def _to_utc_z(value: datetime) -> str:
    """Render a datetime as ISO 8601 UTC with a trailing "Z"."""
    if value.tzinfo is None:
        # Naive datetimes are assumed to already be in UTC
        value = value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


class EventModel(BaseModel):
    event_id: str = Field(..., description="Unique event identifier")
    event_type: str = Field(..., description="Type of event")
    event_name: str = Field(..., description="Human-readable event name")
    start_time: datetime = Field(..., description="Event start time")
    end_time: Optional[datetime] = Field(None, description="Event end time")
    inputs: Optional[Dict[str, Any]] = Field(None, description="Input data")
    outputs: Optional[Dict[str, Any]] = Field(None, description="Output data")
    metadata: Optional[Dict[str, Any]] = Field(None, description="Metadata")

    class Config:
        # Pydantic v1-style config (deprecated in v2); ensures datetime serialization
        json_encoders = {datetime: _to_utc_z}
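
For contexts where Pydantic is not available, the same required-field rules can be approximated with the standard library. This is a dependency-free sketch under stated assumptions: validate_event is a hypothetical helper, the required fields mirror those marked required in the model above, and the allowed event_type values come from the Core Event Model section.

```python
from typing import Any, Dict, List

VALID_EVENT_TYPES = {"model", "tool", "chain"}
REQUIRED_FIELDS = ("event_id", "event_type", "event_name", "start_time")


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the event passed."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event
    ]
    event_type = event.get("event_type")
    if event_type is not None and event_type not in VALID_EVENT_TYPES:
        problems.append(f"invalid event_type: {event_type!r}")
    return problems


print(validate_event({"event_id": "evt_123", "event_type": "llm"}))
# ['missing field: event_name', 'missing field: start_time', "invalid event_type: 'llm'"]
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with an event at once.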

Event Batching:

Events can be batched for efficient transmission:

{
  "batch_id": "batch_001",
  "project": "my-project",
  "events": [
    {
      "event_id": "evt_001",
      "event_type": "model",
      // ... event data
    },
    {
      "event_id": "evt_002",
      "event_type": "tool",
      // ... event data
    }
  ],
  "metadata": {
    "batch_size": 2,
    "created_at": "2024-01-15T10:30:45.123Z"
  }
}
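
A batch envelope like the one above can be produced by chunking a list of events. The sketch below is illustrative only: make_batches is a hypothetical helper, the default batch_size of 100 is an arbitrary choice, and generating batch_id as a UUID is an assumption, since the source does not specify the ID scheme.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Iterator, List


def make_batches(
    events: List[Dict[str, Any]], project: str, batch_size: int = 100
) -> Iterator[Dict[str, Any]]:
    """Yield batch envelopes containing at most batch_size events each."""
    for start in range(0, len(events), batch_size):
        chunk = events[start : start + batch_size]
        yield {
            "batch_id": str(uuid.uuid4()),  # assumed ID scheme
            "project": project,
            "events": chunk,
            "metadata": {
                "batch_size": len(chunk),
                "created_at": datetime.now(timezone.utc)
                .isoformat(timespec="milliseconds")
                .replace("+00:00", "Z"),
            },
        }


events = [{"event_id": f"evt_{i:03d}", "event_type": "model"} for i in range(1, 6)]
batches = list(make_batches(events, project="my-project", batch_size=2))
print([b["metadata"]["batch_size"] for b in batches])  # [2, 2, 1]
```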

Common Patterns

Nested Events:

Events can form hierarchies using parent_id:

{
  "event_id": "evt_parent",
  "event_type": "chain",
  "event_name": "rag-pipeline",
  "parent_id": null
}

{
  "event_id": "evt_child_1",
  "event_type": "tool",
  "event_name": "vector-search",
  "parent_id": "evt_parent"
}

{
  "event_id": "evt_child_2",
  "event_type": "model",
  "event_name": "answer-generation",
  "parent_id": "evt_parent"
}
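
Given a flat list of events, the hierarchy can be rebuilt by grouping on parent_id. The following sketch uses the three events above (trimmed to the relevant fields); render_tree is a hypothetical helper for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

events = [
    {"event_id": "evt_parent", "event_name": "rag-pipeline", "parent_id": None},
    {"event_id": "evt_child_1", "event_name": "vector-search", "parent_id": "evt_parent"},
    {"event_id": "evt_child_2", "event_name": "answer-generation", "parent_id": "evt_parent"},
]

# Group events under their parent; root events are grouped under None.
children: Dict[Optional[str], List[Dict[str, Any]]] = defaultdict(list)
for event in events:
    children[event["parent_id"]].append(event)


def render_tree(parent_id: Optional[str] = None, depth: int = 0) -> List[str]:
    """Return indented event names in depth-first order."""
    lines: List[str] = []
    for event in children.get(parent_id, []):
        lines.append("  " * depth + event["event_name"])
        lines.extend(render_tree(event["event_id"], depth + 1))
    return lines


print("\n".join(render_tree()))
# rag-pipeline
#   vector-search
#   answer-generation
```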

Event Correlation:

Events can reference each other:

{
  "event_id": "evt_llm",
  "event_type": "model",
  "outputs": {"response": "Paris is the capital."}
}

{
  "event_id": "evt_eval",
  "event_type": "tool",
  "target_event_id": "evt_llm",
  "score": 0.95
}
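
Resolving these references is a lookup by event_id. A minimal sketch, using the two events above, that collects evaluation scores per target event:

```python
from typing import Any, Dict, List

events: List[Dict[str, Any]] = [
    {
        "event_id": "evt_llm",
        "event_type": "model",
        "outputs": {"response": "Paris is the capital."},
    },
    {
        "event_id": "evt_eval",
        "event_type": "tool",
        "target_event_id": "evt_llm",
        "score": 0.95,
    },
]

# Index events by ID so each evaluation can be resolved to its target.
by_id = {event["event_id"]: event for event in events}

scores_by_target: Dict[str, List[float]] = {}
for event in events:
    target_id = event.get("target_event_id")
    if target_id in by_id:
        scores_by_target.setdefault(target_id, []).append(event["score"])

print(scores_by_target)  # {'evt_llm': [0.95]}
```

Evaluations whose target is not in the list are skipped, which keeps the sketch safe for partial exports.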

Custom Event Types:

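Model domain-specific operations by pairing a standard event_type with a descriptive event_name and custom inputs, outputs, and metadata: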

# Custom event for document processing
custom_event = {
    "event_type": "chain",
    "event_name": "pdf-extraction",
    "inputs": {
        "document_url": "https://example.com/doc.pdf",
        "extract_tables": True
    },
    "outputs": {
        "text_content": "...",
        "tables": [...],
        "page_count": 10
    },
    "metadata": {
        "processing_engine": "pdfplumber",
        "file_size_mb": 2.5
    }
}

Best Practices

Event Design Guidelines:

Descriptive Names: Use clear, descriptive event_name values
Consistent Types: Standardize event_type values across your application
Rich Context: Include relevant metadata for debugging and analysis
Structured Data: Keep inputs and outputs well-structured
Error Details: Capture comprehensive error information when events fail
Metrics: Include relevant performance and business metrics
Privacy: Avoid capturing sensitive data in event fields

Performance Considerations:

Field Size: Keep individual fields reasonably sized (< 1MB recommended)
Batch Events: Use batching for high-volume scenarios
Async Logging: Log events asynchronously to avoid blocking operations
Selective Capture: Only capture necessary data to minimize overhead
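
The field-size guideline can be checked before an event is sent. The sketch below measures each top-level field by its UTF-8 JSON encoding. oversized_fields is a hypothetical helper, and reading "1MB" as 1,000,000 bytes is an assumption.

```python
import json
from typing import Any, Dict

MAX_FIELD_BYTES = 1_000_000  # interpreting "1MB" as 10**6 bytes (assumption)


def oversized_fields(
    event: Dict[str, Any], limit: int = MAX_FIELD_BYTES
) -> Dict[str, int]:
    """Map each top-level field whose JSON encoding exceeds limit to its byte size."""
    sizes = {
        name: len(json.dumps(value, ensure_ascii=False).encode("utf-8"))
        for name, value in event.items()
    }
    return {name: size for name, size in sizes.items() if size > limit}


event = {"event_name": "pdf-extraction", "outputs": {"text_content": "x" * 2_000_000}}
print(list(oversized_fields(event)))  # ['outputs']
```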
