AWS Lambda Testing Guide

Note

Problem-solving guide for AWS Lambda testing with HoneyHive SDK

Comprehensive solutions for testing HoneyHive SDK in AWS Lambda environments, from local development to production validation.

AWS Lambda presents unique challenges for observability SDKs. This guide provides tested solutions for validating HoneyHive SDK performance and functionality in serverless environments.

Quick Start

Problem: I need to test my HoneyHive integration in AWS Lambda quickly.

Solution:

# Navigate to Lambda testing directory
cd tests/lambda

# Build the test container (required first step)
make build

# Run basic compatibility tests
make test-lambda

# Run performance benchmarks
make test-performance

# Basic Lambda function with HoneyHive
import json
import os
from honeyhive import HoneyHiveTracer

# Initialize outside handler for container reuse
tracer = HoneyHiveTracer.init(
    api_key=os.getenv("HH_API_KEY", "test-key"),    # Or set HH_API_KEY environment variable
    project=os.getenv("HH_PROJECT", "test-project"), # Or set HH_PROJECT environment variable
    source="development",                            # Or set HH_SOURCE environment variable
    test_mode=True,                                  # Or set HH_TEST_MODE=true
    disable_http_tracing=True                        # Optimize for Lambda (or set HH_DISABLE_HTTP_TRACING=true)
)

def lambda_handler(event, context):
    """Lambda handler with HoneyHive tracing."""
    with tracer.trace("lambda_execution") as span:
        span.set_attribute("lambda.request_id", context.aws_request_id)
        span.set_attribute("lambda.function_name", context.function_name)

        # Your business logic here
        result = {"message": "HoneyHive works in Lambda!"}

        return {
            "statusCode": 200,
            "body": json.dumps(result)
        }

Why Lambda Testing Matters

AWS Lambda Constraints:

Cold Start Delays: First invocation initialization time (target: <500ms)
Memory Constraints: Limited memory environments (128MB - 10GB)
Execution Timeouts: Maximum 15-minute execution limits
Networking Restrictions: Limited outbound connectivity
Container Reuse: Warm start optimizations for performance
Concurrency Limits: Parallel execution constraints

Lambda Execution Flow with HoneyHive SDK:

        %%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4F81BD', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#000000', 'lineColor': '#333333', 'mainBkg': 'transparent', 'secondBkg': 'transparent', 'tertiaryColor': 'transparent', 'clusterBkg': 'transparent', 'clusterBorder': '#000000', 'edgeLabelBackground': 'transparent', 'background': 'transparent'}, 'flowchart': {'linkColor': '#333333', 'linkWidth': 2}}}%%
graph TD
    subgraph "Cold Start (First Invocation)"
        COLD_INIT[Lambda Container Init<br/>~100-200ms]
        COLD_RUNTIME[Runtime Startup<br/>~50-100ms]
        COLD_SDK[SDK Import & Init<br/>~153ms + 155ms]
        COLD_TRACER[Tracer Setup<br/>Session Creation]
        COLD_HANDLER[Handler Execution<br/>Business Logic]
        COLD_FLUSH[Force Flush<br/>Ensure Delivery]
        COLD_TOTAL[Total: ~281ms overhead<br/>+ handler time]
    end

    subgraph "Warm Start (Subsequent Invocations)"
        WARM_REUSE[Container Reuse<br/>~1-5ms]
        WARM_TRACER[Existing Tracer<br/>No Initialization]
        WARM_HANDLER[Handler Execution<br/>Business Logic]
        WARM_FLUSH[Force Flush<br/>Quick Delivery]
        WARM_TOTAL[Total: ~52ms overhead<br/>+ handler time]
    end

    COLD_INIT --> COLD_RUNTIME
    COLD_RUNTIME --> COLD_SDK
    COLD_SDK --> COLD_TRACER
    COLD_TRACER --> COLD_HANDLER
    COLD_HANDLER --> COLD_FLUSH
    COLD_FLUSH --> COLD_TOTAL

    WARM_REUSE --> WARM_TRACER
    WARM_TRACER --> WARM_HANDLER
    WARM_HANDLER --> WARM_FLUSH
    WARM_FLUSH --> WARM_TOTAL

    COLD_TOTAL -.->|Container Reuse| WARM_REUSE

    classDef cold fill:#1565c0,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef warm fill:#2e7d32,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef total fill:#ef6c00,stroke:#000000,stroke-width:3px,color:#ffffff

    class COLD_INIT,COLD_RUNTIME,COLD_SDK,COLD_TRACER,COLD_HANDLER,COLD_FLUSH cold
    class WARM_REUSE,WARM_TRACER,WARM_HANDLER,WARM_FLUSH warm
    class COLD_TOTAL,WARM_TOTAL total

HoneyHive SDK Optimizations:

✅ Sub-500ms Cold Starts: Validated performance (actual: ~281ms)
✅ <50MB Memory Overhead: Efficient resource usage
✅ Production Bundle Testing: Native Linux dependencies
✅ Graceful Degradation: Works when HoneyHive API unavailable
✅ Container Reuse: Optimized for warm start scenarios

Lambda Testing Infrastructure

Production-Ready Bundle Container Approach:

        %%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4F81BD', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#000000', 'lineColor': '#333333', 'mainBkg': 'transparent', 'secondBkg': 'transparent', 'tertiaryColor': 'transparent', 'clusterBkg': 'transparent', 'clusterBorder': '#000000', 'edgeLabelBackground': 'transparent', 'background': 'transparent'}, 'flowchart': {'linkColor': '#333333', 'linkWidth': 2}}}%%
graph TD
    subgraph "Development Testing"
        LOCAL[Local Docker Testing]
        BUNDLE[Bundle Container Build]
        COMPAT[Compatibility Tests]
        PERF[Performance Benchmarks]
    end

    subgraph "CI/CD Pipeline"
        MATRIX[Matrix Testing<br/>Python 3.11-3.13<br/>Memory 256-1024MB]
        REGRESSION[Regression Detection]
        GATES[Quality Gates]
    end

    subgraph "Production Validation"
        DEPLOY[Real AWS Lambda Deploy]
        PROD[Integration Tests]
        MONITOR[Monitoring]
    end

    LOCAL --> BUNDLE
    BUNDLE --> COMPAT
    COMPAT --> PERF
    PERF --> MATRIX
    MATRIX --> REGRESSION
    REGRESSION --> GATES
    GATES --> DEPLOY
    DEPLOY --> PROD
    PROD --> MONITOR

    classDef devStage fill:#1b5e20,stroke:#333333,stroke-width:2px,color:#ffffff
    classDef ciStage fill:#1a237e,stroke:#333333,stroke-width:2px,color:#ffffff
    classDef prodStage fill:#4a148c,stroke:#333333,stroke-width:2px,color:#ffffff

    class LOCAL,BUNDLE,COMPAT,PERF devStage
    class MATRIX,REGRESSION,GATES ciStage
    class DEPLOY,PROD,MONITOR prodStage

Key Testing Infrastructure:

tests/lambda/
├── Dockerfile.bundle-builder     # ✅ Multi-stage bundle build
├── lambda_functions/             # Lambda function examples
│   ├── working_sdk_test.py      # ✅ Basic functionality test
│   ├── cold_start_test.py       # ✅ Performance measurement
│   └── basic_tracing.py         # ✅ Simple tracing example
├── test_lambda_compatibility.py # ✅ Test suite implementation
├── test_lambda_performance.py   # Performance benchmarks
├── Makefile                     # ✅ Build and test automation
└── README.md                    # Complete documentation

Local Lambda Testing

Problem: Test Lambda functions locally during development.

Solution - Basic Lambda Function:

"""Basic Lambda function to test HoneyHive SDK compatibility."""

import json
import os
import sys
import time
from typing import Any, Dict

# Add the SDK to the path (simulates pip install in real Lambda)
sys.path.insert(0, "/var/task")

try:
    from honeyhive.tracer import HoneyHiveTracer
    from honeyhive.tracer.decorators import trace
    SDK_AVAILABLE = True
except ImportError as e:
    print(f"❌ SDK import failed: {e}")
    SDK_AVAILABLE = False

# Initialize tracer outside handler for reuse across invocations
tracer = None
if SDK_AVAILABLE:
    try:
        tracer = HoneyHiveTracer.init(
            api_key=os.getenv("HH_API_KEY", "test-key"),               source="development"
            session_name="lambda-basic-test",
            test_mode=True,  # Enable test mode for Lambda
            disable_http_tracing=True,  # Avoid Lambda networking issues
        )
        print("✅ HoneyHive tracer initialized successfully")
    except Exception as e:
        print(f"❌ Tracer initialization failed: {e}")
        tracer = None

@trace(tracer=tracer, event_type="tool", event_name="basic_operation")
def process_data(data: Dict[str, Any]) -> Dict[str, Any]:
    """Process data with tracing."""
    if not tracer:
        return {"error": "Tracer not available"}

    # Simulate work
    time.sleep(0.1)

    # Test span enrichment
    from honeyhive.tracer.otel_tracer import enrich_span

    with enrich_span(
        metadata={"lambda_test": True, "data_size": len(str(data))},
        outputs={"processed": True},
        error=None,
        tracer=tracer
    ):
        result = {
            "processed_data": data,
            "timestamp": time.time(),
            "lambda_context": {
                "function_name": os.getenv("AWS_LAMBDA_FUNCTION_NAME"),
                "function_version": os.getenv("AWS_LAMBDA_FUNCTION_VERSION"),
                "memory_limit": os.getenv("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "128"),
            },
        }

    return result

def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Lambda handler function."""
    print(f"🚀 Lambda invocation started: {getattr(context, 'aws_request_id', 'test')}")

    start_time = time.time()

    try:
        # Test basic SDK functionality
        if not SDK_AVAILABLE:
            return {
                "statusCode": 500,
                "body": json.dumps({"error": "HoneyHive SDK not available"}),
            }

        if not tracer:
            return {
                "statusCode": 500,
                "body": json.dumps({"error": "HoneyHive tracer not initialized"}),
            }

        # Create a span for the entire Lambda execution
        with tracer.start_span("lambda_execution") as span:
            span.set_attribute("lambda.request_id", getattr(context, "aws_request_id", "test"))
            span.set_attribute("lambda.function_name", os.getenv("AWS_LAMBDA_FUNCTION_NAME", "unknown"))
            span.set_attribute("lambda.remaining_time", getattr(context, "get_remaining_time_in_millis", lambda: 30000)())

            # Process the event
            result = process_data(event)

            # Test force_flush before Lambda completes
            flush_success = tracer.force_flush(timeout_millis=2000)
            span.set_attribute("lambda.flush_success", flush_success)

        execution_time = (time.time() - start_time) * 1000

        return {
            "statusCode": 200,
            "body": json.dumps({
                "message": "HoneyHive SDK works in Lambda!",
                "execution_time_ms": execution_time,
                "flush_success": flush_success,
                "result": result,
            }),
        }

    except Exception as e:
        print(f"❌ Lambda execution failed: {e}")
        return {
            "statusCode": 500,
            "body": json.dumps({
                "error": str(e),
                "execution_time_ms": (time.time() - start_time) * 1000,
            }),
        }

    finally:
        # Ensure cleanup
        if tracer:
            try:
                tracer.force_flush(timeout_millis=1000)
            except Exception as e:
                print(f"⚠️ Final flush failed: {e}")

Solution - Cold Start Performance Testing:

"""Test HoneyHive SDK behavior during Lambda cold starts."""

import json
import os
import sys
import time
from typing import Any, Dict

sys.path.insert(0, "/var/task")

# Track cold start behavior
COLD_START = True
INITIALIZATION_TIME = time.time()

try:
    from honeyhive.tracer import HoneyHiveTracer
    SDK_IMPORT_TIME = time.time() - INITIALIZATION_TIME
    print(f"✅ SDK import took: {SDK_IMPORT_TIME * 1000:.2f}ms")
except ImportError as e:
    print(f"❌ SDK import failed: {e}")
    SDK_IMPORT_TIME = -1

# Initialize tracer and measure time
tracer = None
TRACER_INIT_TIME = -1

if "honeyhive" in sys.modules:
    init_start = time.time()
    try:
        tracer = HoneyHiveTracer.init(
            api_key=os.getenv("HH_API_KEY", "test-key"),               source="development"
            session_name="cold-start-test",
            test_mode=True,
            disable_http_tracing=True
        )
        TRACER_INIT_TIME = time.time() - init_start
        print(f"✅ Tracer initialization took: {TRACER_INIT_TIME * 1000:.2f}ms")
    except Exception as e:
        print(f"❌ Tracer initialization failed: {e}")
        TRACER_INIT_TIME = -1

def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Test cold start performance impact."""
    global COLD_START

    handler_start = time.time()
    current_cold_start = COLD_START
    COLD_START = False  # Subsequent invocations are warm starts

    print(f"🔥 {'Cold' if current_cold_start else 'Warm'} start detected")

    try:
        if not tracer:
            return {
                "statusCode": 500,
                "body": json.dumps({
                    "error": "Tracer not available",
                    "cold_start": current_cold_start,
                    "sdk_import_time_ms": SDK_IMPORT_TIME * 1000 if SDK_IMPORT_TIME > 0 else -1,
                    "tracer_init_time_ms": TRACER_INIT_TIME * 1000 if TRACER_INIT_TIME > 0 else -1,
                }),
            }

        # Test SDK operations during cold/warm start
        with tracer.start_span("cold_start_test") as span:
            span.set_attribute("lambda.cold_start", current_cold_start)
            span.set_attribute("lambda.sdk_import_time_ms", SDK_IMPORT_TIME * 1000 if SDK_IMPORT_TIME > 0 else -1)
            span.set_attribute("lambda.tracer_init_time_ms", TRACER_INIT_TIME * 1000 if TRACER_INIT_TIME > 0 else -1)

            # Simulate some work
            work_start = time.time()
            from honeyhive.tracer.otel_tracer import enrich_span

            with enrich_span(
                tracer=tracer,
                metadata={"test_type": "cold_start", "iteration": event.get("iteration", 1)},
                outputs={"cold_start": current_cold_start},
                error=None
            ):
                # Simulate processing
                time.sleep(0.05)

            work_time = time.time() - work_start
            span.set_attribute("lambda.work_time_ms", work_time * 1000)

        # Test flush performance
        flush_start = time.time()
        flush_success = tracer.force_flush(timeout_millis=1000)
        flush_time = time.time() - flush_start

        total_handler_time = time.time() - handler_start

        return {
            "statusCode": 200,
            "body": json.dumps({
                "message": "Cold start test completed",
                "cold_start": current_cold_start,
                "timings": {
                    "sdk_import_ms": SDK_IMPORT_TIME * 1000 if SDK_IMPORT_TIME > 0 else -1,
                    "tracer_init_ms": TRACER_INIT_TIME * 1000 if TRACER_INIT_TIME > 0 else -1,
                    "handler_total_ms": total_handler_time * 1000,
                    "work_time_ms": work_time * 1000,
                    "flush_time_ms": flush_time * 1000,
                },
                "flush_success": flush_success,
                "performance_impact": {
                    "init_overhead_ms": (SDK_IMPORT_TIME + TRACER_INIT_TIME) * 1000 if current_cold_start else 0,
                    "runtime_overhead_ms": (work_time + flush_time) * 1000,
                },
            }),
        }

    except Exception as e:
        return {
            "statusCode": 500,
            "body": json.dumps({
                "error": str(e),
                "cold_start": current_cold_start,
                "handler_time_ms": (time.time() - handler_start) * 1000,
            }),
        }

Building and Running Local Tests:

# Navigate to Lambda test directory
cd tests/lambda

# Build the bundle container
make build

# Run basic functionality test
make test-lambda

# Run cold start performance test
make test-cold-start

# Manual container testing
docker run --rm -p 9000:8080 \
  -e HH_API_KEY=test-key \
  -e HH_PROJECT=test-project \
  honeyhive-lambda:bundle-native

# Test with curl
curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" \
  -H "Content-Type: application/json" \
  -d '{"test": "manual", "iteration": 1}'

Performance Testing & Benchmarking

Problem: Validate Lambda performance meets requirements.

Solution - Automated Performance Testing:

"""Performance tests for HoneyHive SDK in AWS Lambda environment."""

import json
import statistics
import time
from typing import Any, Dict, List

import docker
import pytest
import requests

class TestLambdaPerformance:
    """Performance tests for Lambda environment."""

    @pytest.fixture(scope="class")
    def performance_container(self):
        """Start optimized Lambda container for performance testing."""
        client = docker.from_env()

        container = client.containers.run(
            "honeyhive-lambda:bundle-native",
            command="cold_start_test.lambda_handler",
            ports={"8080/tcp": 9100},
            environment={
                "AWS_LAMBDA_FUNCTION_NAME": "honeyhive-performance-test",
                "AWS_LAMBDA_FUNCTION_MEMORY_SIZE": "256",
                "HH_API_KEY": "test-key",
                "HH_PROJECT": "lambda-performance-test",
                "HH_SOURCE": "performance-test",
                "HH_TEST_MODE": "true",
            },
            detach=True,
            remove=True
        )

        # Wait for container to be ready
        time.sleep(5)
        yield container

        try:
            container.stop()
        except:
            pass

    def invoke_lambda_timed(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Invoke Lambda and measure timing."""
        url = "http://localhost:9100/2015-03-31/functions/function/invocations"

        start_time = time.time()
        response = requests.post(
            url, json=payload, headers={"Content-Type": "application/json"}, timeout=30
        )
        total_time = (time.time() - start_time) * 1000

        result = response.json()
        result["_test_total_time_ms"] = total_time

        return result

    @pytest.mark.benchmark
    def test_cold_start_performance(self, performance_container):
        """Benchmark cold start performance."""
        result = self.invoke_lambda_timed({"test": "cold_start_benchmark"})

        assert result["statusCode"] == 200
        body = json.loads(result["body"])
        timings = body.get("timings", {})

        # Collect metrics
        metrics = {
            "cold_start": body.get("cold_start", True),
            "total_time_ms": result["_test_total_time_ms"],
            "sdk_import_ms": timings.get("sdk_import_ms", 0),
            "tracer_init_ms": timings.get("tracer_init_ms", 0),
            "handler_total_ms": timings.get("handler_total_ms", 0),
            "work_time_ms": timings.get("work_time_ms", 0),
            "flush_time_ms": timings.get("flush_time_ms", 0),
        }

        # Performance assertions
        assert metrics["sdk_import_ms"] < 200, f"SDK import too slow: {metrics['sdk_import_ms']}ms"
        assert metrics["tracer_init_ms"] < 300, f"Tracer init too slow: {metrics['tracer_init_ms']}ms"
        assert metrics["total_time_ms"] < 2000, f"Total time too slow: {metrics['total_time_ms']}ms"

        return metrics

    @pytest.mark.benchmark
    def test_warm_start_performance(self, performance_container):
        """Benchmark warm start performance."""
        # First invoke to warm up
        self.invoke_lambda_timed({"test": "warmup"})

        # Then measure warm start performance
        warm_times = []
        for i in range(5):
            result = self.invoke_lambda_timed({"test": f"warm_start_{i}"})

            assert result["statusCode"] == 200
            body = json.loads(result["body"])

            # Should be warm start
            assert body.get("cold_start") is False
            warm_times.append(body.get("timings", {}).get("handler_total_ms", 0))

        avg_warm_time = statistics.mean(warm_times)

        # Warm starts should be fast
        assert avg_warm_time < 100, f"Warm start too slow: {avg_warm_time:.2f}ms"

        return {"average_warm_start_ms": avg_warm_time, "times": warm_times}

    @pytest.mark.benchmark
    def test_memory_efficiency(self, performance_container):
        """Test memory usage efficiency."""
        result = self.invoke_lambda_timed({"test": "memory_test"})

        assert result["statusCode"] == 200

        # In real scenarios, would check container memory usage
        # For now, verify operation completes without memory errors
        body = json.loads(result["body"])
        assert "error" not in body or body["error"] is None

Performance Benchmarks & Results:

        %%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4F81BD', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#000000', 'lineColor': '#333333', 'mainBkg': 'transparent', 'secondBkg': 'transparent', 'tertiaryColor': 'transparent', 'clusterBkg': 'transparent', 'clusterBorder': '#000000', 'edgeLabelBackground': 'transparent', 'background': 'transparent'}, 'flowchart': {'linkColor': '#333333', 'linkWidth': 2}}}%%
graph LR
    subgraph "Test Configurations"
        M256[256MB Memory]
        M512[512MB Memory]
        M1024[1024MB Memory]
    end

    subgraph "Performance Tests"
        COLD[Cold Start Tests<br/>Target: <500ms<br/>Measured: 281ms]
        WARM[Warm Start Tests<br/>Target: <100ms<br/>Measured: 52ms]
        MEM[Memory Usage Tests<br/>Target: <50MB<br/>Measured: <50MB]
        LOAD[Load Tests<br/>Target: >95%<br/>Measured: >95%]
    end

    subgraph "Python Versions"
        P311[Python 3.11]
        P312[Python 3.12]
        P313[Python 3.13]
    end

    subgraph "Test Results"
        PASS[✅ All Tests Pass<br/>281ms cold start<br/>52ms warm start<br/><50MB overhead]
        TREND[📈 Performance Trending<br/>Historical Analysis<br/>Regression Detection]
    end

    M256 --> COLD
    M512 --> WARM
    M1024 --> MEM

    P311 --> LOAD
    P312 --> LOAD
    P313 --> LOAD

    COLD --> PASS
    WARM --> PASS
    MEM --> PASS
    LOAD --> PASS

    PASS --> TREND

    classDef config fill:#1565c0,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef test fill:#7b1fa2,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef version fill:#2e7d32,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef result fill:#ef6c00,stroke:#000000,stroke-width:3px,color:#ffffff

    class M256,M512,M1024 config
    class COLD,WARM,MEM,LOAD test
    class P311,P312,P313 version
    class PASS,TREND result

Validated Lambda Performance Results
Metric	Target	Actual (Bundle)	Status
SDK Import Time	< 200ms	~153ms	✅ PASS
Tracer Initialization	< 300ms	~155ms	✅ PASS
Cold Start Total	< 500ms	~281ms	✅ PASS
Warm Start Average	< 100ms	~52ms	✅ PASS
Memory Overhead	< 50MB	<50MB	✅ PASS

Memory Configuration Performance:

Performance by Memory Configuration
Memory (MB)	Cold Start (ms)	Warm Start (ms)	SDK Overhead (ms)
256	650-900	3-10	35-50
512	450-700	2-8	25-40
1024	350-550	1-5	15-30

CI/CD Integration Testing

Problem: Automate Lambda testing in CI/CD pipelines.

CI/CD Lambda Testing Flow:

        %%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4F81BD', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#000000', 'lineColor': '#333333', 'mainBkg': 'transparent', 'secondBkg': 'transparent', 'tertiaryColor': 'transparent', 'clusterBkg': 'transparent', 'clusterBorder': '#000000', 'edgeLabelBackground': 'transparent', 'background': 'transparent'}, 'flowchart': {'linkColor': '#333333', 'linkWidth': 2}}}%%
graph TD
    PR[Pull Request Created]

    subgraph "Automated Testing Matrix"
        PY311[Python 3.11 Tests]
        PY312[Python 3.12 Tests]
        PY313[Python 3.13 Tests]

        M256[256MB Memory Tests]
        M512[512MB Memory Tests]
        M1024[1024MB Memory Tests]
    end

    subgraph "Quality Gates"
        PERF[Performance Gate<br/>Cold Start < 1000ms<br/>Memory < 100MB<br/>Success > 90%]
        COMPAT[Compatibility Gate<br/>All Python Versions<br/>All Memory Configs]
        REGRESS[Regression Gate<br/>±20% Performance<br/>Historical Comparison]
    end

    subgraph "Results"
        PASS[✅ All Gates Pass<br/>Merge Approved]
        FAIL[❌ Gates Failed<br/>Block Merge<br/>Notify Developer]
        WARN[⚠️ Performance Warning<br/>Manual Review Required]
    end

    PR --> PY311
    PR --> PY312
    PR --> PY313

    PY311 --> M256
    PY312 --> M512
    PY313 --> M1024

    M256 --> PERF
    M512 --> COMPAT
    M1024 --> REGRESS

    PERF --> PASS
    PERF --> FAIL
    PERF --> WARN

    COMPAT --> PASS
    COMPAT --> FAIL

    REGRESS --> WARN
    REGRESS --> PASS

    classDef trigger fill:#1565c0,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef test fill:#7b1fa2,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef gate fill:#ef6c00,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef success fill:#2e7d32,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef warning fill:#f9a825,stroke:#000000,stroke-width:3px,color:#ffffff
    classDef failure fill:#c62828,stroke:#000000,stroke-width:3px,color:#ffffff

    class PR trigger
    class PY311,PY312,PY313,M256,M512,M1024 test
    class PERF,COMPAT,REGRESS gate
    class PASS success
    class WARN warning
    class FAIL failure

Solution - GitHub Actions Workflow:

# .github/workflows/lambda-tests.yml
name: Lambda Testing Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '0 6 * * *'  # Daily performance regression testing

jobs:
  lambda-compatibility:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.11, 3.12, 3.13]
        memory-size: [256, 512, 1024]

    steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install tox docker

    - name: Build Lambda test containers
      run: |
        cd tests/lambda
        make build

    - name: Run Lambda compatibility tests
      env:
        HH_API_KEY: ${{ secrets.HH_TEST_API_KEY }}
        HH_PROJECT: "ci-lambda-test"
        HH_SOURCE: "github-actions"
        AWS_LAMBDA_FUNCTION_MEMORY_SIZE: ${{ matrix.memory-size }}
      run: |
        cd tests/lambda
        make test-lambda

    - name: Run Lambda performance tests
      env:
        HH_API_KEY: ${{ secrets.HH_TEST_API_KEY }}
      run: |
        cd tests/lambda
        make test-performance

    - name: Upload performance results
      uses: actions/upload-artifact@v3
      if: always()
      with:
        name: lambda-performance-${{ matrix.python-version }}-${{ matrix.memory-size }}mb
        path: tests/lambda/performance-results.json

CI/CD Performance Gates:

Automated Quality Gates
Metric	Target	Threshold	Action on Failure
Cold Start Time	< 500ms	< 1000ms	Block merge if > 1000ms
Warm Start Time	< 100ms	< 200ms	Warning if > 100ms
Memory Usage	< 50MB overhead	< 100MB	Block merge if > 100MB
Success Rate	> 95%	> 90%	Block merge if < 90%

Production Lambda Testing

Problem: Test with real AWS Lambda deployments.

Production Lambda Testing Architecture:

        %%{init: {'theme':'base', 'themeVariables': {'primaryColor': '#4F81BD', 'primaryTextColor': '#ffffff', 'primaryBorderColor': '#000000', 'lineColor': '#333333', 'mainBkg': 'transparent', 'secondBkg': 'transparent', 'tertiaryColor': 'transparent', 'clusterBkg': 'transparent', 'clusterBorder': '#000000', 'edgeLabelBackground': 'transparent', 'background': 'transparent'}, 'flowchart': {'linkColor': '#333333', 'linkWidth': 2}}}%%
graph TB
    subgraph "AWS Lambda Environment"
        LAMBDA[AWS Lambda Function<br/>honeyhive-sdk-test]
        RUNTIME[Lambda Runtime<br/>Python 3.11/3.12/3.13]
        MEM[Memory Configurations<br/>256MB/512MB/1024MB]
    end

    subgraph "HoneyHive SDK"
        SDK[HoneyHive SDK Bundle]
        TRACER[Multi-Instance Tracers]
        INSTR[OpenAI Instrumentors]
    end

    subgraph "Real Integration Tests"
        COLD[Cold Start Validation<br/>10 iterations]
        WARM[Warm Start Validation<br/>50 iterations]
        LOAD[Load Testing<br/>Concurrent invocations]
        ERROR[Error Handling<br/>Network failures]
    end

    subgraph "HoneyHive Platform"
        API[HoneyHive API]
        DASH[Dashboard Validation]
        TRACES[Trace Data Verification]
        METRICS[Performance Metrics]
    end

    subgraph "Monitoring & Alerting"
        WATCH[CloudWatch Logs]
        ALERT[Performance Alerts]
        SLACK[Slack Notifications]
        FEEDBACK[Developer Feedback Loop]
    end

    LAMBDA --> SDK
    RUNTIME --> SDK
    MEM --> SDK

    SDK --> TRACER
    SDK --> INSTR

    TRACER --> COLD
    TRACER --> WARM
    TRACER --> LOAD
    TRACER --> ERROR

    COLD --> API
    WARM --> API
    LOAD --> API
    ERROR --> API

    API --> DASH
    API --> TRACES
    API --> METRICS

    METRICS --> WATCH
    TRACES --> ALERT
    DASH --> SLACK
    ALERT --> FEEDBACK

    classDef aws fill:#ff9900,stroke:#232f3e,stroke-width:2px,color:#ffffff
    classDef honeyhive fill:#4f81bd,stroke:#2c5aa0,stroke-width:2px,color:#ffffff
    classDef test fill:#9c27b0,stroke:#6a1b9a,stroke-width:2px,color:#ffffff
    classDef platform fill:#2e7d32,stroke:#1b5e20,stroke-width:2px,color:#ffffff
    classDef monitor fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#ffffff

    class LAMBDA,RUNTIME,MEM aws
    class SDK,TRACER,INSTR honeyhive
    class COLD,WARM,LOAD,ERROR test
    class API,DASH,TRACES,METRICS platform
    class WATCH,ALERT,SLACK,FEEDBACK monitor

Solution - Real AWS Lambda Testing:

"""Production Lambda test with real API integration."""

import json
import os
import openai
from honeyhive import HoneyHiveTracer
from openinference.instrumentation.openai import OpenAIInstrumentor

def lambda_handler(event, context):
    """Production Lambda test with real API calls."""

    # Initialize with production settings
    # Step 1: Initialize HoneyHive tracer first (without instrumentors)
    tracer = HoneyHiveTracer.init(
        api_key=os.environ.get("HH_API_KEY"),    # Or set HH_API_KEY environment variable
        project=os.environ.get("HH_PROJECT"),    # Or set HH_PROJECT environment variable
        source="development"                     # Or set HH_SOURCE environment variable
    )

    # Step 2: Initialize instrumentor separately with tracer_provider
    openai_instrumentor = OpenAIInstrumentor()
    openai_instrumentor.instrument(tracer_provider=tracer.provider)

    try:
        with tracer.start_span("lambda-openai-test") as span:
            span.set_attribute("lambda.function_name", context.function_name)
            span.set_attribute("lambda.request_id", context.aws_request_id)

            # Make real OpenAI API call (traced automatically)
            client = openai.OpenAI()
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": "Test from Lambda"}],
                max_tokens=50
            )

            return {
                'statusCode': 200,
                'body': json.dumps({
                    'message': 'Lambda integration test successful',
                    'response': response.choices[0].message.content,
                    'request_id': context.aws_request_id
                })
            }

    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps({
                'error': str(e),
                'request_id': context.aws_request_id
            })
        }

Deployment Testing Script:

#!/bin/bash
# Deploy and test real Lambda function

# Build deployment package
cd tests/lambda
./build-deployment-package.sh

# Deploy to AWS Lambda
aws lambda update-function-code \
  --function-name honeyhive-sdk-test \
  --zip-file fileb://deployment-package.zip

# Run integration tests
python test_real_lambda_deployment.py \
  --function-name honeyhive-sdk-test \
  --iterations 10 \
  --test-cold-start \
  --test-warm-start

Lambda Optimization Best Practices

Problem: Optimize HoneyHive SDK for Lambda performance.

Solution - Configuration Optimization:

# Optimized Lambda configuration
# Step 1: Initialize HoneyHive tracer first (without instrumentors)
tracer = HoneyHiveTracer.init(
    api_key=os.environ.get("HH_API_KEY"),                                    # Or set HH_API_KEY environment variable
    project=os.environ.get("HH_PROJECT", "lambda-app"),                     # Or set HH_PROJECT environment variable
    source="development",                                                    # Or set HH_SOURCE environment variable
    session_name=os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "lambda-function"),
    # Optimize for Lambda constraints
    test_mode=os.environ.get("HH_TEST_MODE", "false").lower() == "true",    # Or set HH_TEST_MODE environment variable
    disable_http_tracing=True,  # Reduce overhead in Lambda (or set HH_DISABLE_HTTP_TRACING=true)
)

# Step 2: Initialize instrumentor separately with tracer_provider
openai_instrumentor = OpenAIInstrumentor()  # Only needed instrumentors
openai_instrumentor.instrument(tracer_provider=tracer.provider)

Performance Optimization Checklist:

Minimize Cold Start Impact: - Initialize tracer outside handler when possible - Use connection pooling for HTTP requests - Optimize import statements and dependencies - Leverage Lambda container reuse
Memory Management: - Monitor memory usage patterns with CloudWatch - Clean up resources properly in finally blocks - Use appropriate memory allocation (256MB+ recommended) - Test with different memory configurations
Error Handling: - Implement comprehensive error catching - Log errors with structured logging for CloudWatch - Graceful degradation strategies when HoneyHive is unavailable - Test timeout scenarios
Performance Optimization: - Use disable_http_tracing=True to reduce overhead - Enable test_mode=True for non-production environments - Use force_flush() with appropriate timeouts - Initialize instrumentors selectively

Lambda-Specific Environment Variables:

# Lambda environment variables
HH_API_KEY=your_api_key
HH_PROJECT=lambda-project
HH_SOURCE=aws-lambda
HH_TEST_MODE=false
HH_DISABLE_HTTP_TRACING=true

# AWS Lambda context
AWS_LAMBDA_FUNCTION_NAME=your-function-name
AWS_LAMBDA_FUNCTION_VERSION=$LATEST
AWS_LAMBDA_FUNCTION_MEMORY_SIZE=512

Troubleshooting Lambda Issues

Problem: Debug common Lambda testing issues.

Common Issues & Solutions:

Issue: Cold start times too high

# Solution: Optimize imports and initialization
import sys
import time

# Track import times
start_time = time.time()
from honeyhive import HoneyHiveTracer
import_time = time.time() - start_time
print(f"Import time: {import_time * 1000:.2f}ms")

# Initialize outside handler
tracer = HoneyHiveTracer.init(
    api_key="test-key",
    test_mode=True,
    disable_http_tracing=True  # Reduces startup overhead
)

Issue: Memory usage too high

# Solution: Monitor and optimize memory
import psutil
import os

def lambda_handler(event, context):
    process = psutil.Process(os.getpid())
    initial_memory = process.memory_info().rss

    # Your HoneyHive tracing code here

    final_memory = process.memory_info().rss
    memory_increase = final_memory - initial_memory

    print(f"Memory increase: {memory_increase / 1024 / 1024:.2f}MB")

Issue: Network timeouts

# Solution: Configure appropriate timeouts
tracer = HoneyHiveTracer.init(
    api_key="test-key",
    test_mode=True,
    # Configure connection timeout
    timeout=5.0,  # 5 second timeout
    # Use force_flush with timeout
)

# Always use timeout in flush
def lambda_handler(event, context):
    with tracer.trace("lambda-operation") as span:
        # Your logic here
        pass

    # Flush with timeout before Lambda ends
    tracer.force_flush(timeout_millis=2000)

Issue: Container reuse problems

# Solution: Design for container reuse
import threading

# Global state that survives container reuse
_tracer_lock = threading.Lock()
_tracer_instance = None

def get_tracer():
    global _tracer_instance
    if _tracer_instance is None:
        with _tracer_lock:
            if _tracer_instance is None:
                _tracer_instance = HoneyHiveTracer.init(
                    api_key=os.environ.get("HH_API_KEY"),
                    test_mode=True
                )

    return _tracer_instance

Lambda Testing Commands

Local Testing Commands:

# Navigate to Lambda testing
cd tests/lambda

# Build containers
make build

# Run all Lambda tests
make test

# Run specific test types
make test-lambda          # Basic compatibility
make test-cold-start      # Cold start performance
make test-performance     # Full performance suite

# Debug Lambda container
make debug-shell

# Clean up
make clean

Testing with Different Configurations:

# Test with different memory sizes
MEMORY_SIZE=256 make test-performance
MEMORY_SIZE=512 make test-performance
MEMORY_SIZE=1024 make test-performance

# Test with different Python versions
PYTHON_VERSION=3.11 make build
PYTHON_VERSION=3.12 make build
PYTHON_VERSION=3.13 make build

# Test with real API
HH_API_KEY=your_key HH_TEST_MODE=false make test-lambda

Pytest Commands:

# Run Lambda test suite
pytest tests/lambda/ -v

# Run performance tests only
pytest tests/lambda/ -m "benchmark" -v

# Run with real AWS Lambda
pytest tests/lambda/ -m "real_aws" -v

# Run specific test file
pytest tests/lambda/test_lambda_performance.py -v

Advanced Lambda Testing Scenarios

Multi-Region Testing:

# Test across multiple AWS regions
regions = ["us-east-1", "us-west-2", "eu-west-1"]

for region in regions:
    os.environ["AWS_DEFAULT_REGION"] = region
    test_lambda_deployment(region)

Concurrent Invocation Testing:

# Test concurrent Lambda invocations
import concurrent.futures

def test_concurrent_lambda_invocations():
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        futures = [
            executor.submit(invoke_lambda_function, {"test": f"concurrent_{i}"})
            for i in range(50)
        ]

        results = [future.result() for future in futures]
        assert all(r["statusCode"] == 200 for r in results)

Error Injection Testing:

# Test Lambda behavior under various failure conditions
@pytest.mark.parametrize("error_type", [
    "network_timeout",
    "api_unavailable",
    "memory_pressure",
    "disk_full"
])
def test_lambda_error_resilience(error_type):
    with inject_failure(error_type):
        result = invoke_lambda_function({"test": error_type})
        # Should handle gracefully, not crash
        assert result["statusCode"] in [200, 500]  # Controlled failure