Event Data Models

Note: Technical specification for HoneyHive event data structures.

This document defines the exact data models and formats used for events in the HoneyHive SDK.

Events are the core observability primitives in HoneyHive, representing discrete operations or interactions in your LLM application.

Core Event Model

class Event

The primary event data structure used throughout HoneyHive.

event_id: str

Unique identifier for the event.

Format: UUID v4 string
Example: "01234567-89ab-cdef-0123-456789abcdef"
Required: Auto-generated by SDK

session_id: str

Session identifier that groups related events.

Format: UUID v4 string
Example: "01234567-89ab-cdef-0123-456789abcdef"
Required: Auto-generated by tracer

parent_id: str | None

Parent event ID for nested operations.

Format: UUID v4 string
Example: "01234567-89ab-cdef-0123-456789abcdef"
Required: No (None for root events)

event_type: str

Categorizes the type of operation.

Valid Values:
- "model" - LLM model calls and interactions
- "tool" - Tool/function calls and external API interactions
- "chain" - Chain/workflow operations and multi-step processes

Example: "model"
Required: Yes

event_name: str

Human-readable name for the specific operation.

Format: Descriptive string, typically kebab-case
Example: "openai-chat-completion"
Required: Yes

start_time: datetime

ISO 8601 timestamp when the event started.

Format: YYYY-MM-DDTHH:MM:SS.ffffffZ (UTC, microsecond precision)
Example: "2024-01-15T10:30:45.123456Z"
Required: Auto-generated by SDK

end_time: datetime | None

ISO 8601 timestamp when the event completed.

Format: YYYY-MM-DDTHH:MM:SS.ffffffZ (UTC, microsecond precision)
Example: "2024-01-15T10:30:47.654321Z"
Required: Auto-generated by SDK

duration_ms: float | None

Event duration in milliseconds.

Calculation: end_time - start_time, in milliseconds
Example: 2530.865 (for the start_time and end_time examples above)
Required: Auto-calculated by SDK
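The calculation can be sketched with the standard library. This is an illustration only, not the SDK's implementation; the helper names `parse_timestamp` and `duration_ms` are hypothetical, and the timestamps are taken from the example LLM event later in this document.

```python
from datetime import datetime, timezone

def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2024-01-15T10:30:45.123Z."""
    # %f accepts one to six fractional digits; the trailing "Z" is matched literally.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

def duration_ms(start_time: str, end_time: str) -> float:
    """Return end_time - start_time in milliseconds."""
    delta = parse_timestamp(end_time) - parse_timestamp(start_time)
    return delta.total_seconds() * 1000.0

elapsed = duration_ms("2024-01-15T10:30:45.123Z", "2024-01-15T10:30:47.654Z")
print(round(elapsed, 3))  # 2531.0
```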

status: str

Event completion status.

Values:
- "success" - Completed successfully
- "error" - Failed with error
- "cancelled" - Cancelled before completion
- "timeout" - Timed out

Example: "success"
Required: Auto-determined by SDK

inputs: Dict[str, Any] | None

Input data for the operation.

Structure: Key-value pairs of input parameters
Example:

{
  "messages": [
    {"role": "user", "content": "Hello, world!"}
  ],
  "model": "gpt-3.5-turbo",
  "temperature": 0.7,
  "max_tokens": 150
}

Required: No (but recommended)

outputs: Dict[str, Any] | None

Output data from the operation.

Structure: Key-value pairs of output data
Example:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}

Required: No (but recommended)

metadata: Dict[str, Any] | None

Additional context and metadata.

Structure: Key-value pairs of contextual information
Example:

{
  "user_id": "user_12345",
  "environment": "production",
  "model_version": "gpt-3.5-turbo-0613",
  "request_id": "req_abc123",
  "tags": ["chat", "customer-support"]
}

Required: No

metrics: Dict[str, int | float] | None

Numerical metrics associated with the event.

Structure: Key-value pairs of numeric measurements
Example:

{
  "latency_ms": 1250.5,
  "token_count": 21,
  "cost_usd": 0.0001,
  "cache_hit_rate": 0.85,
  "confidence_score": 0.92
}

Required: No

error: Dict[str, Any] | None

Error information if the event failed.

Structure:

{
  "type": "OpenAIError",
  "message": "Rate limit exceeded",
  "code": "rate_limit_exceeded",
  "traceback": "Traceback (most recent call last)...",
  "context": {
    "retry_after": 60,
    "request_id": "req_123"
  }
}

Required: No (only for failed events)
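One way to produce a dict of this shape from a caught exception is sketched below. The helper `build_error` and its parameters are hypothetical; how the SDK populates `code` and `context` is not specified here, so both are passed in by the caller.

```python
import traceback
from typing import Any, Dict, Optional

def build_error(
    exc: BaseException,
    code: Optional[str] = None,
    context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Convert an exception into a dict shaped like the `error` field."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "code": code,
        # Full formatted traceback, including the exception line itself.
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "context": context or {},
    }

try:
    raise TimeoutError("Rate limit exceeded")
except TimeoutError as exc:
    error = build_error(exc, code="rate_limit_exceeded", context={"retry_after": 60})
```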

project: str

Project identifier for organization.

Format: String identifier
Example: "customer-chat-bot"
Required: Yes (set by tracer)

source: str

Source system or component identifier.

Format: String identifier
Example: "chat-service"
Required: Yes (set by tracer)

user_properties: Dict[str, Any] | None

User-defined custom properties.

Structure: Flexible key-value pairs
Example:

{
  "experiment_id": "exp_001",
  "feature_flags": ["new_ui", "beta_model"],
  "user_tier": "premium",
  "custom_field": "custom_value"
}

Required: No
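To make the auto-generated fields concrete, the sketch below builds a plain dict with the fields described above. It is not the SDK's constructor: `new_event` and `utc_now_iso` are hypothetical helpers, and the defaults chosen for `status` and the optional fields are assumptions.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Optional

VALID_EVENT_TYPES = {"model", "tool", "chain"}

def utc_now_iso() -> str:
    """Current UTC time as YYYY-MM-DDTHH:MM:SS.ffffffZ."""
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="microseconds").replace("+00:00", "Z")

def new_event(
    event_type: str,
    event_name: str,
    project: str,
    source: str,
    session_id: Optional[str] = None,
    parent_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an event dict, filling in the auto-generated fields."""
    if event_type not in VALID_EVENT_TYPES:
        raise ValueError(f"event_type must be one of {sorted(VALID_EVENT_TYPES)}")
    return {
        "event_id": str(uuid.uuid4()),
        "session_id": session_id or str(uuid.uuid4()),
        "parent_id": parent_id,  # None marks a root event
        "event_type": event_type,
        "event_name": event_name,
        "start_time": utc_now_iso(),
        "end_time": None,  # set when the operation completes
        "duration_ms": None,
        "status": "success",  # assumed default for this sketch
        "project": project,
        "source": source,
    }

event = new_event("model", "openai-chat-completion", "customer-chat-bot", "chat-service")
```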

LLM Event Model

class LLMEvent

Specialized event model for LLM operations. Extends the base Event model.

Inherits: All fields from Event

LLM-Specific Fields:

model: str

LLM model identifier.

Examples: "gpt-3.5-turbo", "claude-3-sonnet-20240229", "llama-2-70b"
Required: Yes for LLM events

provider: str

LLM provider/service.

Values: "openai", "anthropic", "google", "azure", "local"
Required: Yes for LLM events

prompt_template: str | None

Template used to generate the prompt.

Example: "Answer the following question: {question}"
Required: No

prompt_variables: Dict[str, Any] | None

Variables used in prompt template.

Example: {"question": "What is the capital of France?"}
Required: No
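The two fields combine to produce the final prompt. The sketch below assumes the template uses Python `str.format` placeholders, which matches the `{question}` syntax in the example; the SDK's actual templating mechanism is not specified here.

```python
prompt_template = "Answer the following question: {question}"
prompt_variables = {"question": "What is the capital of France?"}

# Substitute each {name} placeholder with the matching variable.
prompt = prompt_template.format(**prompt_variables)
print(prompt)  # Answer the following question: What is the capital of France?
```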

response_format: str | None

Expected response format.

Values: "text", "json", "function_call"
Required: No

tools: List[Dict[str, Any]] | None

Available tools/functions for the LLM.

Structure: OpenAI function calling format
Required: No

tool_calls: List[Dict[str, Any]] | None

Tool calls made by the LLM.

Structure: OpenAI tool call format
Required: No

Example LLM Event:

{
  "event_id": "evt_01234567",
  "session_id": "session_abcdef",
  "event_type": "model",
  "event_name": "openai-chat-completion",
  "start_time": "2024-01-15T10:30:45.123Z",
  "end_time": "2024-01-15T10:30:47.654Z",
  "duration_ms": 2531.0,
  "status": "success",
  "model": "gpt-3.5-turbo",
  "provider": "openai",
  "inputs": {
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 50
  },
  "outputs": {
    "choices": [
      {
        "message": {
          "role": "assistant",
          "content": "The capital of France is Paris."
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {
      "prompt_tokens": 12,
      "completion_tokens": 8,
      "total_tokens": 20
    }
  },
  "metrics": {
    "latency_ms": 2531.0,
    "tokens_per_second": 3.16,
    "cost_usd": 0.00004
  }
}
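The `tokens_per_second` value in this example is consistent with dividing the completion token count by the event duration. The sketch below reproduces it under that assumption; the function name is hypothetical and the SDK may compute the metric differently.

```python
def tokens_per_second(completion_tokens: int, duration_ms: float) -> float:
    """Completion tokens generated per second of wall-clock time."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return completion_tokens / (duration_ms / 1000.0)

# Values from the example event: 8 completion tokens over 2531.0 ms.
rate = round(tokens_per_second(8, 2531.0), 2)
print(rate)  # 3.16
```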

Tool Event Model

class ToolEvent

Event model for tool/function calls.

Inherits: All fields from Event

Tool-Specific Fields:

function_name: str

Name of the function/tool called.

Example: "get_weather"
Required: Yes for tool events

function_description: str | None

Description of the function’s purpose.

Example: "Get current weather for a location"
Required: No

parameters: Dict[str, Any] | None

Function parameters schema.

Structure: JSON Schema format
Required: No

return_value: Any | None

Function return value.

Structure: Any valid JSON value
Required: No

Example Tool Event:

{
  "event_id": "evt_tool_001",
  "session_id": "session_abcdef",
  "event_type": "tool",
  "event_name": "weather-api-call",
  "function_name": "get_weather",
  "inputs": {
    "location": "Paris, France",
    "units": "celsius"
  },
  "outputs": {
    "temperature": 22,
    "conditions": "sunny",
    "humidity": 65
  },
  "metrics": {
    "api_latency_ms": 150.5
  }
}
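A tool event like the one above could be assembled by wrapping the function call, as in the following sketch. `record_tool_call` is a hypothetical helper, not an SDK function, and `get_weather` is a stand-in that returns fixed data instead of calling a real API.

```python
import time
import uuid
from typing import Any, Callable, Dict

def record_tool_call(
    func: Callable[..., Any], event_name: str, session_id: str, **kwargs: Any
) -> Dict[str, Any]:
    """Call func(**kwargs) and describe the call as a tool event dict."""
    started = time.perf_counter()
    status, outputs, error = "success", None, None
    try:
        outputs = func(**kwargs)
    except Exception as exc:  # record the failure instead of propagating it
        status = "error"
        error = {"type": type(exc).__name__, "message": str(exc)}
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    return {
        "event_id": str(uuid.uuid4()),
        "session_id": session_id,
        "event_type": "tool",
        "event_name": event_name,
        "function_name": func.__name__,
        "inputs": kwargs,
        "outputs": outputs,
        "status": status,
        "error": error,
        "duration_ms": elapsed_ms,
    }

def get_weather(location: str, units: str = "celsius") -> Dict[str, Any]:
    # Stand-in for a real weather API call.
    return {"temperature": 22, "conditions": "sunny", "humidity": 65}

tool_event = record_tool_call(
    get_weather, "weather-api-call", "session_abcdef",
    location="Paris, France", units="celsius",
)
```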

Evaluation Event Model

class EvaluationEvent

Event model for evaluation operations.

Inherits: All fields from Event

Evaluation-Specific Fields:

evaluator_name: str

Name of the evaluator used.

Example: "factual_accuracy"
Required: Yes for evaluation events

evaluator_version: str | None

Version of the evaluator.

Example: "v1.2.0"
Required: No

target_event_id: str

ID of the event being evaluated.

Format: UUID v4 string
Required: Yes for evaluation events

score: float | int | bool | None

Evaluation score.

Examples: 0.85, True, 7
Required: No

explanation: str | None

Human-readable explanation of the score.

Example: "Response is factually accurate and well-supported"
Required: No

criteria: Dict[str, Any] | None

Evaluation criteria used.

Structure: Evaluator-specific criteria
Required: No

Example Evaluation Event:

{
  "event_id": "evt_eval_001",
  "session_id": "session_abcdef",
  "event_type": "tool",
  "event_name": "factual-accuracy-check",
  "evaluator_name": "factual_accuracy",
  "target_event_id": "evt_01234567",
  "score": 0.92,
  "explanation": "Response contains accurate information with proper citations",
  "metrics": {
    "confidence": 0.95,
    "processing_time_ms": 1200
  }
}

Event Serialization

JSON Format:

Events are serialized to JSON for storage and transmission:

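import json
from datetime import datetime, timezone

# Event serialization
event = {
    "event_id": "evt_123",
    "event_type": "model",
    # Timezone-aware UTC timestamp rendered with a "Z" suffix
    # (datetime.utcnow() is deprecated as of Python 3.12)
    "start_time": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    # ... other fields
}

# ensure_ascii=False keeps non-ASCII characters readable in the output
json_data = json.dumps(event, ensure_ascii=False, indent=2)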
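When an event holds `datetime` objects rather than preformatted strings, `json.dumps` needs a `default` hook to encode them. The sketch below shows one such hook; `to_json_value` is a hypothetical name, and treating every datetime as convertible to UTC is an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Any

def to_json_value(value: Any) -> str:
    """Fallback encoder for objects json.dumps cannot serialize itself."""
    if isinstance(value, datetime):
        # Normalize to UTC, then swap the "+00:00" offset for "Z".
        return value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
    raise TypeError(f"Cannot serialize {type(value).__name__}")

raw_event = {
    "event_id": "evt_123",
    "event_type": "model",
    "start_time": datetime(2024, 1, 15, 10, 30, 45, 123456, tzinfo=timezone.utc),
}

payload = json.dumps(raw_event, default=to_json_value, ensure_ascii=False)
restored = json.loads(payload)
print(restored["start_time"])  # 2024-01-15T10:30:45.123456Z
```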

Field Validation:

All events undergo validation before transmission:

from pydantic import BaseModel, Field
from typing import Optional, Dict, Any
from datetime import datetime

class EventModel(BaseModel):
    event_id: str = Field(..., description="Unique event identifier")
    event_type: str = Field(..., description="Type of event")
    event_name: str = Field(..., description="Human-readable event name")
    start_time: datetime = Field(..., description="Event start time")
    end_time: Optional[datetime] = Field(None, description="Event end time")
    inputs: Optional[Dict[str, Any]] = Field(None, description="Input data")
    outputs: Optional[Dict[str, Any]] = Field(None, description="Output data")
    metadata: Optional[Dict[str, Any]] = Field(None, description="Metadata")

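    class Config:
        # Pydantic v1-style encoder config; emits ISO 8601 with a "Z" suffix.
        # Assumes naive datetimes that already hold UTC: a timezone-aware
        # value would serialize as "...+00:00Z".
        json_encoders = {
            datetime: lambda v: v.isoformat() + "Z"
        }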
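The same kind of check can be sketched without third-party dependencies. The required fields below mirror those declared without defaults in EventModel, and the allowed values come from the event_type and status definitions earlier in this document; `validate_event` itself is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("event_id", "event_type", "event_name", "start_time")
VALID_EVENT_TYPES = {"model", "tool", "chain"}
VALID_STATUSES = {"success", "error", "cancelled", "timeout"}

def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the event passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not event.get(field):
            problems.append(f"missing required field: {field}")
    event_type = event.get("event_type")
    if event_type is not None and event_type not in VALID_EVENT_TYPES:
        problems.append(f"invalid event_type: {event_type!r}")
    status = event.get("status")
    if status is not None and status not in VALID_STATUSES:
        problems.append(f"invalid status: {status!r}")
    return problems

print(validate_event({"event_type": "bogus"}))
```

Collecting every problem before reporting, rather than raising on the first one, makes it easier to fix a malformed event in a single pass.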

Event Batching:

Events can be batched for efficient transmission:

{
  "batch_id": "batch_001",
  "project": "my-project",
  "events": [
    {
      "event_id": "evt_001",
      "event_type": "model",
      // ... event data
    },
    {
      "event_id": "evt_002",
      "event_type": "tool",
      // ... event data
    }
  ],
  "metadata": {
    "batch_size": 2,
    "created_at": "2024-01-15T10:30:45.123Z"
  }
}
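A batch envelope of this shape could be produced by chunking a list of events, as sketched below. `make_batches` is a hypothetical helper, the default `batch_size` of 100 is arbitrary, and using a UUID for `batch_id` is an assumption.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Iterator, List

def make_batches(
    events: List[Dict[str, Any]], project: str, batch_size: int = 100
) -> Iterator[Dict[str, Any]]:
    """Yield batch envelopes containing at most batch_size events each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(events), batch_size):
        chunk = events[start:start + batch_size]
        created_at = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
        yield {
            "batch_id": str(uuid.uuid4()),
            "project": project,
            "events": chunk,
            "metadata": {"batch_size": len(chunk), "created_at": created_at},
        }

sample_events = [{"event_id": f"evt_{i:03d}", "event_type": "model"} for i in range(5)]
batches = list(make_batches(sample_events, "my-project", batch_size=2))
print([b["metadata"]["batch_size"] for b in batches])  # [2, 2, 1]
```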

Common Patterns

Nested Events:

Events can form hierarchies using parent_id:

{
  "event_id": "evt_parent",
  "event_type": "chain",
  "event_name": "rag-pipeline",
  "parent_id": null
}

{
  "event_id": "evt_child_1",
  "event_type": "tool",
  "event_name": "vector-search",
  "parent_id": "evt_parent"
}

{
  "event_id": "evt_child_2",
  "event_type": "model",
  "event_name": "answer-generation",
  "parent_id": "evt_parent"
}
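Given a flat list of such events, the hierarchy can be rebuilt by grouping on parent_id. The following sketch uses the three events above; `index_children` and `render_tree` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

nested_events = [
    {"event_id": "evt_parent", "event_name": "rag-pipeline", "parent_id": None},
    {"event_id": "evt_child_1", "event_name": "vector-search", "parent_id": "evt_parent"},
    {"event_id": "evt_child_2", "event_name": "answer-generation", "parent_id": "evt_parent"},
]

def index_children(events: List[Dict[str, Any]]) -> Dict[Optional[str], List[str]]:
    """Map each parent_id (None for roots) to its child event IDs."""
    children = defaultdict(list)
    for event in events:
        children[event.get("parent_id")].append(event["event_id"])
    return dict(children)

def render_tree(children, names, parent=None, depth=0) -> List[str]:
    """Return indented event names, depth-first from the given parent."""
    lines = []
    for event_id in children.get(parent, []):
        lines.append("  " * depth + names[event_id])
        lines.extend(render_tree(children, names, event_id, depth + 1))
    return lines

names = {e["event_id"]: e["event_name"] for e in nested_events}
tree_lines = render_tree(index_children(nested_events), names)
print("\n".join(tree_lines))
```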

Event Correlation:

Events can reference each other:

{
  "event_id": "evt_llm",
  "event_type": "model",
  "outputs": {"response": "Paris is the capital."}
}

{
  "event_id": "evt_eval",
  "event_type": "tool",
  "target_event_id": "evt_llm",
  "score": 0.95
}
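Resolving such a reference is a lookup on target_event_id. The sketch below joins the two events above; `evaluations_for` is a hypothetical helper.

```python
from typing import Any, Dict, List

correlated_events = [
    {"event_id": "evt_llm", "event_type": "model",
     "outputs": {"response": "Paris is the capital."}},
    {"event_id": "evt_eval", "event_type": "tool",
     "target_event_id": "evt_llm", "score": 0.95},
]

def evaluations_for(event_id: str, events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return every event whose target_event_id points at event_id."""
    return [e for e in events if e.get("target_event_id") == event_id]

scores = [e["score"] for e in evaluations_for("evt_llm", correlated_events)]
print(scores)  # [0.95]
```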

Custom Event Types:

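Model domain-specific operations by pairing one of the standard event_type values with a descriptive event_name and custom inputs, outputs, and metadata: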

# Custom event for document processing
custom_event = {
    "event_type": "chain",
    "event_name": "pdf-extraction",
    "inputs": {
        "document_url": "https://example.com/doc.pdf",
        "extract_tables": True
    },
    "outputs": {
        "text_content": "...",
        "tables": [...],
        "page_count": 10
    },
    "metadata": {
        "processing_engine": "pdfplumber",
        "file_size_mb": 2.5
    }
}

Best Practices

Event Design Guidelines:

  1. Descriptive Names: Use clear, descriptive event_name values

  2. Consistent Types: Standardize event_type values across your application

  3. Rich Context: Include relevant metadata for debugging and analysis

  4. Structured Data: Keep inputs and outputs well-structured

  5. Error Details: Capture comprehensive error information when events fail

  6. Metrics: Include relevant performance and business metrics

  7. Privacy: Avoid capturing sensitive data in event fields
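For the privacy guideline, one approach is to redact known sensitive keys before an event is recorded. The sketch below is illustrative: the key list and the `redact` helper are hypothetical, and a real deployment would need its own list of sensitive fields.

```python
from typing import Any, FrozenSet

# Hypothetical list of keys to mask; adjust for your application.
SENSITIVE_KEYS = frozenset({"api_key", "password", "email", "authorization"})

def redact(value: Any, sensitive_keys: FrozenSet[str] = SENSITIVE_KEYS) -> Any:
    """Return a copy of value with sensitive dict entries masked, recursively."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]"
            if isinstance(key, str) and key.lower() in sensitive_keys
            else redact(item, sensitive_keys)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive_keys) for item in value]
    return value

inputs = {
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "api_key": "sk-example",
    "user": {"email": "a@example.com", "tier": "premium"},
}
safe_inputs = redact(inputs)
```

Key-based redaction does not catch sensitive values embedded in free text such as message content, so it complements rather than replaces selective capture.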

Performance Considerations:

  1. Field Size: Keep individual fields reasonably sized (< 1MB recommended)

  2. Batch Events: Use batching for high-volume scenarios

  3. Async Logging: Log events asynchronously to avoid blocking operations

  4. Selective Capture: Only capture necessary data to minimize overhead
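Asynchronous logging can be sketched with a queue drained by a background thread, so the caller only pays for an enqueue. `BackgroundLogger` and its `send` callback are hypothetical and do not describe how the SDK transmits events.

```python
import queue
import threading
from typing import Any, Callable, Dict

class BackgroundLogger:
    """Hand events to a worker thread so log() never blocks on I/O."""

    def __init__(self, send: Callable[[Dict[str, Any]], None]) -> None:
        self._send = send
        self._queue: "queue.Queue" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def log(self, event: Dict[str, Any]) -> None:
        self._queue.put(event)  # returns immediately

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            if event is None:  # sentinel queued by close()
                break
            self._send(event)

    def close(self) -> None:
        """Flush the remaining events and stop the worker."""
        self._queue.put(None)
        self._worker.join()

sent = []
logger = BackgroundLogger(sent.append)  # sent.append stands in for a network call
logger.log({"event_id": "evt_001"})
logger.log({"event_id": "evt_002"})
logger.close()
```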
