Session Enrichment
==================

**Problem:** You need to add metadata, metrics, and context to entire sessions (collections of related spans) for tracking user workflows, experiments, or multi-step operations.

**Solution:** Use ``enrich_session()`` to add session-level metadata that persists across all spans in a session and is stored in the HoneyHive backend.

This guide covers session enrichment patterns. For span-level enrichment, see :doc:`span-enrichment`.

Understanding Session Enrichment
--------------------------------

Session enrichment differs from span enrichment:

**Span Enrichment** (``enrich_span()``):

- Adds metadata to a **single span** (one operation)
- Stored in OpenTelemetry span attributes
- Local to the trace

**Session Enrichment** (``enrich_session()``):

- Adds metadata to an **entire session** (collection of spans)
- **Persisted to HoneyHive backend** via API
- Available for analysis across all spans in the session
- Supports complex nested data structures

Use Cases
---------

Session enrichment is ideal for:

- **User Workflows**: Track user journeys across multiple LLM calls
- **Experiments**: Add experiment parameters and results
- **A/B Testing**: Tag sessions with test variants
- **Business Context**: Add customer IDs, subscription tiers, feature flags
- **Performance Metrics**: Session-level latency, success rates, cost tracking

API Reference
-------------

Function Signature
~~~~~~~~~~~~~~~~~~

.. py:function:: enrich_session(session_id=None, *, metadata=None, inputs=None, outputs=None, config=None, feedback=None, metrics=None, user_properties=None, **kwargs)

   Add metadata and metrics to a session with backend persistence.

   .. note::

      **All parameters are optional**: You can call ``enrich_session()`` without any parameters. The function will work correctly as long as a valid session_id is available (either explicitly provided or detected from the active context).
      This is useful for ensuring a session exists or "touching" it even when you don't have enrichment data to add.

   **Parameters:**

   :param session_id: Explicit session ID to enrich. If not provided, uses the active session from context.
   :type session_id: Optional[str]
   :param metadata: Business context data (user IDs, features, session info).
   :type metadata: Optional[Dict[str, Any]]
   :param inputs: Input data for the session (e.g., initial query, configuration).
   :type inputs: Optional[Dict[str, Any]]
   :param outputs: Output data from the session (e.g., final response, results).
   :type outputs: Optional[Dict[str, Any]]
   :param config: Configuration parameters for the session (model settings, hyperparameters).
   :type config: Optional[Dict[str, Any]]
   :param feedback: User or system feedback for the session (ratings, quality scores).
   :type feedback: Optional[Dict[str, Any]]
   :param metrics: Numeric measurements for the session (latency, cost, token counts).
   :type metrics: Optional[Dict[str, Any]]
   :param user_properties: User-specific properties (user_id, plan, etc.). Stored as a separate field in the backend, not merged into metadata.
   :type user_properties: Optional[Dict[str, Any]]
   :param kwargs: Additional keyword arguments (passed through for extensibility).
   :type kwargs: Any

   **Returns:**

   :rtype: None
   :returns: None (updates session in backend)

   **Raises:**

   - No exceptions are raised; failures are logged and handled gracefully.

   **Key Differences from enrich_span:**

   1. **Backend Persistence**: ``enrich_session()`` makes API calls to persist data, while ``enrich_span()`` only sets local span attributes
   2. **Session Scope**: Affects the entire session, not just the current span
   3. **Complex Data**: Supports nested dictionaries and lists
   4. **Explicit Session ID**: Can target any session by ID, not just the active one

Basic Usage
-----------

Enrich Active Session
~~~~~~~~~~~~~~~~~~~~~

The simplest usage enriches the currently active session:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session
   import openai

   # Initialize tracer (creates a session automatically)
   tracer = HoneyHiveTracer.init(
       project="my-app",
       session_name="user-123-chat"
   )

   # Enrich the active session
   enrich_session(
       metadata={
           "user_id": "user_123",
           "subscription_tier": "premium",
           "feature": "chat_assistant"
       }
   )

   # All subsequent traces in this session will be associated with this metadata
   client = openai.OpenAI()
   response = client.chat.completions.create(
       model="gpt-3.5-turbo",
       messages=[{"role": "user", "content": "Hello!"}]
   )

.. note::

   **Optional Parameters**: As described in the API reference, every parameter to ``enrich_session()`` is optional. Calling it with no arguments "touches" the session, provided a valid ``session_id`` is available (either explicitly provided or detected from the active context).

Enrich Specific Session
~~~~~~~~~~~~~~~~~~~~~~~

Target a specific session by providing its ID:

.. code-block:: python

   from honeyhive import enrich_session

   # Enrich a specific session (not necessarily the active one)
   enrich_session(
       session_id="sess_abc123xyz",
       metadata={
           "experiment": "variant_b",
           "completed": True
       },
       metrics={
           "total_tokens": 1500,
           "total_cost": 0.045,
           "duration_seconds": 12.5
       }
   )

Backwards Compatible Signatures
-------------------------------

The ``enrich_session()`` function maintains full backwards compatibility with previous versions:

Legacy Signature (Still Supported)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # Old style: positional session_id
   enrich_session(
       "sess_abc123",  # session_id as first positional arg
       metadata={"user_id": "user_456"}
   )

   # Old style: user_properties parameter
   enrich_session(
       session_id="sess_abc123",
       user_properties={
           "tier": "premium",
           "region": "us-east"
       }
   )

   # Result: user_properties stored as a separate field in the backend
   # Backend receives:
   # {
   #     "user_properties": {
   #         "tier": "premium",
   #         "region": "us-east"
   #     }
   # }

Modern Signature (Recommended)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   # New style: keyword-only arguments
   enrich_session(
       session_id="sess_abc123",  # Optional, defaults to active session
       metadata={
           "user_id": "user_456",
           "tier": "premium",
           "region": "us-east"
       },
       metrics={
           "total_cost": 0.045
       }
   )

Common Patterns
---------------

Pattern 1: User Workflow Tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Track user journeys across multiple interactions:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session
   from datetime import datetime
   import openai

   def handle_user_workflow(user_id: str, workflow_name: str):
       """Handle a multi-step user workflow."""

       # Initialize session for this workflow
       tracer = HoneyHiveTracer.init(
           project="customer-support",
           session_name=f"{workflow_name}-{user_id}"
       )

       # Enrich with user context
       enrich_session(
           metadata={
               "user_id": user_id,
               "workflow": workflow_name,
               "started_at": datetime.now().isoformat()
           }
       )

       # Step 1: Initial query
       client = openai.OpenAI()
       response1 = client.chat.completions.create(
           model="gpt-3.5-turbo",
           messages=[{"role": "user", "content": "How do I reset my password?"}]
       )

       # Update session with progress
       enrich_session(
           metadata={
               "step": "initial_query_complete"
           }
       )

       # Step 2: Follow-up
       response2 = client.chat.completions.create(
           model="gpt-3.5-turbo",
           messages=[
               {"role": "user", "content": "How do I reset my password?"},
               {"role": "assistant", "content": response1.choices[0].message.content},
               {"role": "user", "content": "I didn't receive the email"}
           ]
       )

       # Final session enrichment: the textual outcome goes in metadata,
       # keeping metrics numeric as described in the API reference
       enrich_session(
           metadata={
               "step": "workflow_complete",
               "completed_at": datetime.now().isoformat(),
               "resolution": "success"
           },
           metrics={
               "total_interactions": 2
           }
       )

       return response2.choices[0].message.content

Pattern 2: Experiment Tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add experiment parameters and results to sessions:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session
   import openai
   import random
   import time

   def calculate_cost(usage, model: str) -> float:
       """Estimate cost from token usage (simplified, illustrative rates as in Pattern 4)."""
       if "gpt-4" in model:
           return usage.prompt_tokens * 0.00003 + usage.completion_tokens * 0.00006
       return usage.prompt_tokens * 0.000001 + usage.completion_tokens * 0.000002

   def run_ab_test_experiment(query: str, user_id: str):
       """Run A/B test with different model configurations."""

       # Determine variant and the configuration it implies
       variant = "variant_a" if random.random() < 0.5 else "variant_b"
       model = "gpt-4" if variant == "variant_a" else "gpt-3.5-turbo"
       temperature = 0.7 if variant == "variant_a" else 0.9

       # Initialize session
       tracer = HoneyHiveTracer.init(
           project="ab-testing",
           session_name=f"experiment-{user_id}"
       )

       # Enrich with experiment metadata
       enrich_session(
           metadata={
               "experiment": "prompt_optimization_v2",
               "variant": variant,
               "user_id": user_id
           },
           config={
               "model": model,
               "temperature": temperature
           }
       )

       # Run the experiment
       start_time = time.time()
       client = openai.OpenAI()
       response = client.chat.completions.create(
           model=model,
           messages=[{"role": "user", "content": query}],
           temperature=temperature
       )
       duration = time.time() - start_time

       # Enrich with results
       enrich_session(
           metrics={
               "response_time": duration,
               "token_count": response.usage.total_tokens,
               "cost": calculate_cost(response.usage, model)
           },
           outputs={
               "response": response.choices[0].message.content
           }
       )

       return response.choices[0].message.content

Pattern 3: Session Feedback Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add user feedback to sessions after completion:

.. code-block:: python

   from honeyhive import enrich_session
   from datetime import datetime

   def collect_session_feedback(session_id: str, rating: int, comments: str):
       """Add user feedback to a completed session."""

       # Enrich the session with feedback (can be called after session ends)
       enrich_session(
           session_id=session_id,
           feedback={
               "user_rating": rating,
               "user_comments": comments,
               "feedback_timestamp": datetime.now().isoformat(),
               "helpful": rating >= 4
           },
           metadata={
               "feedback_collected": True
           }
       )

Pattern 4: Cost and Performance Tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Track session-level costs and performance metrics:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session
   import openai

   class SessionCostTracker:
       """Track costs across a session."""

       def __init__(self, project: str, session_name: str):
           self.tracer = HoneyHiveTracer.init(
               project=project,
               session_name=session_name
           )
           # Create the client once and reuse it for every call
           self.client = openai.OpenAI()
           self.total_tokens = 0
           self.total_cost = 0.0
           self.call_count = 0

       def make_llm_call(self, messages: list, model: str = "gpt-3.5-turbo"):
           """Make an LLM call and track costs."""
           response = self.client.chat.completions.create(
               model=model,
               messages=messages
           )

           # Update tracking
           self.call_count += 1
           self.total_tokens += response.usage.total_tokens
           self.total_cost += self.calculate_cost(response.usage, model)

           # Enrich session with updated metrics
           enrich_session(
               metrics={
                   "total_tokens": self.total_tokens,
                   "total_cost": self.total_cost,
                   "call_count": self.call_count,
                   "avg_tokens_per_call": self.total_tokens / self.call_count
               }
           )

           return response.choices[0].message.content

       def calculate_cost(self, usage, model):
           """Calculate cost based on token usage and model."""
           # Simplified cost calculation
           if "gpt-4" in model:
               return (usage.prompt_tokens * 0.00003 +
                       usage.completion_tokens * 0.00006)
           else:
               return (usage.prompt_tokens * 0.000001 +
                       usage.completion_tokens * 0.000002)

   # Usage
   tracker = SessionCostTracker("my-app", "cost-tracking-session")
   tracker.make_llm_call([{"role": "user", "content": "Hello!"}])
   tracker.make_llm_call([{"role": "user", "content": "Tell me more"}])

Pattern 5: Multi-Instance Session Enrichment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enrich sessions across multiple tracer instances:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session

   # Create multiple tracers for different workflows
   prod_tracer = HoneyHiveTracer.init(
       project="production",
       session_name="prod-session-1",
       source="production"
   )

   test_tracer = HoneyHiveTracer.init(
       project="testing",
       session_name="test-session-1",
       source="testing"
   )

   # Enrich production session
   enrich_session(
       metadata={
           "environment": "production",
           "user_id": "user_123"
       },
       tracer_instance=prod_tracer  # Specify which tracer's session to enrich
   )

   # Enrich test session
   enrich_session(
       metadata={
           "environment": "testing",
           "test_case": "scenario_1"
       },
       tracer_instance=test_tracer
   )

Advanced Usage
--------------

Session Lifecycle Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enrich sessions at different lifecycle stages:

.. code-block:: python

   from honeyhive import HoneyHiveTracer, enrich_session
   from datetime import datetime
   import openai

   def managed_session_workflow(user_id: str, task: str):
       """Demonstrate session enrichment across lifecycle."""

       # Initialize session
       tracer = HoneyHiveTracer.init(
           project="managed-workflows",
           session_name=f"{task}-{user_id}"
       )

       # Start: Add initial metadata
       enrich_session(
           metadata={
               "user_id": user_id,
               "task": task,
               "status": "started",
               "started_at": datetime.now().isoformat()
           }
       )

       try:
           # In Progress: Update status
           enrich_session(
               metadata={
                   "status": "in_progress"
               }
           )

           # Do work
           client = openai.OpenAI()
           response = client.chat.completions.create(
               model="gpt-3.5-turbo",
               messages=[{"role": "user", "content": f"Help me with: {task}"}]
           )

           # Success: Add final metadata
           enrich_session(
               metadata={
                   "status": "completed",
                   "completed_at": datetime.now().isoformat()
               },
               outputs={
                   "result": response.choices[0].message.content
               },
               metrics={
                   "success": True
               }
           )

           return response.choices[0].message.content

       except Exception as e:
           # Error: Add error metadata
           enrich_session(
               metadata={
                   "status": "failed",
                   "failed_at": datetime.now().isoformat(),
                   "error_type": type(e).__name__,
                   "error_message": str(e)
               },
               metrics={
                   "success": False
               }
           )
           raise

Complex Data Structures
~~~~~~~~~~~~~~~~~~~~~~~

``enrich_session()`` supports nested dictionaries and lists:

.. code-block:: python

   from honeyhive import enrich_session

   # Complex nested structures
   enrich_session(
       metadata={
           "user": {
               "id": "user_123",
               "profile": {
                   "tier": "premium",
                   "features": ["chat", "analytics", "export"],
                   "settings": {
                       "notifications": True,
                       "language": "en"
                   }
               }
           }
       },
       config={
           "model_pipeline": [
               {"step": 1, "model": "gpt-4", "temperature": 0.7},
               {"step": 2, "model": "gpt-3.5-turbo", "temperature": 0.5}
           ],
           "fallback_strategy": {
               "enabled": True,
               "models": ["gpt-4", "gpt-3.5-turbo", "claude-2"]
           }
       }
   )

Best Practices
--------------

**DO:**

- Enrich sessions at key lifecycle points (start, progress, completion)
- Use consistent naming conventions for metadata keys
- Add business-relevant context (user IDs, feature flags, experiments)
- Include performance metrics (cost, latency, token counts)
- Collect and add user feedback to completed sessions

**DON'T:**

- Include sensitive data (passwords, API keys, PII)
- Add extremely large payloads (>100KB per enrichment)
- Call ``enrich_session()`` excessively (it makes API calls)
- Use inconsistent key names across sessions
- Assume an enrichment call succeeded: failures are logged rather than raised, so check the logs when data is missing

Troubleshooting
---------------

**Session enrichment not appearing:**

- Verify tracer is initialized and session is active
- Check API key has proper permissions
- Ensure ``session_id`` is valid (if explicitly provided)
- Check network connectivity and API endpoint

**Performance impact:**

- ``enrich_session()`` makes API calls (expect ~50-200ms per call)
- Batch enrichment calls when possible (send all data at once)
- Don't call inside tight loops
- Consider async enrichment for high-throughput applications

**Backwards compatibility issues:**

- The function accepts both old and new signatures
- ``user_properties`` is stored as a separate field (not merged into metadata)
- ``session_id`` can be passed as a positional or keyword argument
- All enrichment data is gracefully merged

Comparison with enrich_span
---------------------------

.. list-table::
   :header-rows: 1
   :widths: 30 35 35

   * - Feature
     - ``enrich_span()``
     - ``enrich_session()``
   * - Scope
     - Single span
     - Entire session
   * - Storage
     - OpenTelemetry attributes
     - HoneyHive backend API
   * - Persistence
     - Local to trace
     - Backend persisted
   * - API Calls
     - No
     - Yes
   * - Complex Data
     - Limited (OTel constraints)
     - Full support
   * - Performance
     - Instant
     - ~50-200ms per call
   * - Use Case
     - Operation-level context
     - Workflow-level context

Next Steps
----------

- :doc:`span-enrichment` - Learn about span-level enrichment
- :doc:`custom-spans` - Create custom spans for complex workflows
- :doc:`advanced-patterns` - Advanced session and tracing patterns
- :doc:`/how-to/llm-application-patterns` - Application architecture patterns

**Key Takeaway:** Use ``enrich_session()`` to add workflow-level context that persists across all spans in a session and is stored in the HoneyHive backend for comprehensive analysis. ✨
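
**Sketch: batching enrichment calls.** The troubleshooting advice to batch enrichment data and avoid calling ``enrich_session()`` inside tight loops could be implemented with a small local buffer. The ``EnrichmentBuffer`` class below is a hypothetical helper, not part of the HoneyHive SDK. It stages values in memory and hands them to a sender in one call. The sender here is a stand-in that records calls so the sketch runs without the SDK; in an application you would presumably pass ``enrich_session`` instead.

.. code-block:: python

   from typing import Any, Callable, Dict


   class EnrichmentBuffer:
       """Stage session fields locally and send them in a single call.

       Hypothetical helper, not part of the HoneyHive SDK. ``send`` is any
       callable that accepts keyword arguments such as ``metadata=...``.
       """

       def __init__(self, send: Callable[..., None]) -> None:
           self._send = send
           self._pending: Dict[str, Dict[str, Any]] = {}

       def add(self, field: str, **values: Any) -> None:
           """Stage values under a field such as "metadata" or "metrics".

           Later values for the same key overwrite earlier ones locally.
           """
           self._pending.setdefault(field, {}).update(values)

       def flush(self) -> bool:
           """Send everything staged so far; return False if nothing was staged."""
           if not self._pending:
               return False
           payload, self._pending = self._pending, {}
           self._send(**payload)
           return True


   # Stand-in sender that records each call instead of contacting a backend.
   calls = []
   buffer = EnrichmentBuffer(lambda **fields: calls.append(fields))

   # Update running totals inside the loop without making any API call.
   total = 0
   for count, tokens in enumerate((120, 340, 95), start=1):
       total += tokens
       buffer.add("metrics", total_tokens=total, call_count=count)

   buffer.add("metadata", status="completed")
   buffer.flush()  # one call carrying both metrics and metadata

Passing the sender in as a callable keeps the buffer testable and leaves the decision of when to flush (end of request, end of workflow) to the caller.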