
Structured Logging

Comprehensive guide to the framework's structured logging system with PII protection and log aggregator integration.

Overview

The framework's structured logging system provides:

  • JSON-structured logs for log aggregators (Elasticsearch, Datadog, Splunk)
  • Automatic sensitive data redaction for PII protection
  • Docker-optimized logging for container environments
  • Readonly, immutable log records following framework principles
  • Channel-based routing for organized log management
  • Processor pipeline for log enrichment

Core Components

LogRecord (Readonly & Immutable)

All log records are readonly and immutable value objects:

use App\Framework\Logging\LogRecord;
use App\Framework\Logging\LogLevel;
use App\Framework\Logging\LogChannel;
use App\Framework\Logging\ValueObjects\LogContext;

$record = new LogRecord(
    level: LogLevel::INFO,
    message: 'User login successful',
    channel: LogChannel::SECURITY,
    context: LogContext::structured([
        'user_id' => $userId,
        'ip_address' => $ipAddress
    ])
);

// Immutable transformation (Copy-on-Write pattern)
$enrichedRecord = $record->withContext(
    $record->context->with('login_method', 'oauth')
);
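
The copy-on-write behaviour can be illustrated with a small stand-alone class. This is only a sketch: `SketchRecord` and `withContextValue()` are hypothetical names, not the framework's actual `LogRecord` API, and the example assumes PHP 8.2 or later for `readonly` classes.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an immutable log record (not the framework class).
final readonly class SketchRecord
{
    /** @param array<string, mixed> $context */
    public function __construct(
        public string $message,
        public array $context = [],
    ) {
    }

    /** Returns a modified copy; the original instance is never changed. */
    public function withContextValue(string $key, mixed $value): self
    {
        return new self($this->message, [...$this->context, $key => $value]);
    }
}

$record   = new SketchRecord('User login successful', ['user_id' => 42]);
$enriched = $record->withContextValue('login_method', 'oauth');
```

Because every "modification" returns a new object, a processor can enrich a record without affecting handlers that still hold the original.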

JSON Formatter with Log Aggregator Support

The JsonFormatter produces structured JSON optimized for log aggregators:

use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Config\Environment;

$formatter = new JsonFormatter(
    prettyPrint: false,           // Compact JSON for production
    includeExtras: true,          // Include processor-added data
    flattenContext: true,         // Flatten for easier querying
    env: $environment,
    serviceName: 'api-gateway',
    redactSensitiveData: true     // Auto-redact PII
);

// Output format:
{
    "timestamp": "2025-01-22T15:30:45+00:00",
    "@timestamp": "2025-01-22T15:30:45+00:00",  // Elasticsearch convention
    "level": "INFO",
    "level_value": 200,
    "severity": 6,                               // RFC 5424 (0-7)
    "channel": "security",
    "message": "User login successful",
    "environment": "production",                 // Auto-detected
    "host": "web-server-01",                     // Server hostname
    "service": "api-gateway",                    // Service name
    "context": {
        "user_id": "user123",
        "ip_address": "[REDACTED]"              // Auto-redacted in production
    }
}

Standard Fields for Log Aggregators:

  • @timestamp: Elasticsearch-compatible timestamp field
  • severity: RFC 5424 severity level (0-7) for standardized filtering
  • environment: Deployment environment (production, staging, development)
  • host: Server hostname for distributed system correlation
  • service: Service/application name for multi-service architectures
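
The `severity` field follows the RFC 5424 numbering, where 0 is the most severe and 7 the least; this is why INFO appears as 6 in the sample output. A minimal sketch of such a mapping follows. The constant and function names are illustrative assumptions, not part of the framework.

```php
<?php
declare(strict_types=1);

/** RFC 5424 severity codes keyed by upper-case level name (0 = most severe). */
const RFC5424_SEVERITY = [
    'EMERGENCY' => 0,
    'ALERT'     => 1,
    'CRITICAL'  => 2,
    'ERROR'     => 3,
    'WARNING'   => 4,
    'NOTICE'    => 5,
    'INFO'      => 6,
    'DEBUG'     => 7,
];

/** Hypothetical helper: translate a level name into its RFC 5424 severity. */
function rfc5424Severity(string $level): int
{
    $key = strtoupper($level);
    if (!array_key_exists($key, RFC5424_SEVERITY)) {
        throw new InvalidArgumentException("Unknown log level: {$level}");
    }

    return RFC5424_SEVERITY[$key];
}
```

In an aggregator query, a filter such as `severity <= 3` then selects errors and everything more severe, independent of how each service spells its level names.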

Sensitive Data Redaction

Overview

Automatic PII (Personally Identifiable Information) protection with three redaction modes:

use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

// Factory methods for different environments
$redactor = SensitiveDataRedactor::production();   // FULL redaction
$redactor = SensitiveDataRedactor::development();  // PARTIAL redaction
$redactor = SensitiveDataRedactor::testing();      // HASH redaction

// Custom configuration
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: true,
    mask: '[REDACTED]'
);

Redaction Modes

FULL Mode (Production Default)

Complete masking of sensitive data:

$redactor = new SensitiveDataRedactor(RedactionMode::FULL);
$data = ['username' => 'john', 'password' => 'secret123'];
$redacted = $redactor->redact($data);

// Result: ['username' => 'john', 'password' => '[REDACTED]']

PARTIAL Mode (Development Default)

Partial masking showing first/last characters:

$redactor = new SensitiveDataRedactor(RedactionMode::PARTIAL);
$data = ['password' => 'super-secret-password'];
$redacted = $redactor->redact($data);

// Result: ['password' => 'su****************rd']
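
One way to implement this kind of masking is sketched below. The function name, the default of two visible characters per side, and the rule for short values are all assumptions made for illustration; the framework's actual logic may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical partial mask: keep $visible characters at each end and
 * replace everything in between with asterisks. Works on bytes, which is
 * adequate for ASCII secrets.
 */
function partialMask(string $value, int $visible = 2): string
{
    $length = strlen($value);

    // Too short to reveal anything safely: mask every character.
    if ($visible < 1 || $length <= $visible * 2) {
        return str_repeat('*', $length);
    }

    return substr($value, 0, $visible)
        . str_repeat('*', $length - $visible * 2)
        . substr($value, -$visible);
}
```

Masking short values completely avoids the case where the "partial" mask would reveal most or all of a four-character PIN.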

HASH Mode (Testing Default)

Deterministic hash-based masking:

$redactor = new SensitiveDataRedactor(RedactionMode::HASH);
$data = ['api_key' => 'sk_live_1234567890'];
$redacted = $redactor->redact($data);

// Result: ['api_key' => '[HASH:a1b2c3d4e5f6]']
// Same input always produces same hash
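
A deterministic mask of this shape could be produced as sketched below. The choice of SHA-256, the 12-character truncation, and the optional salt are assumptions; the document does not say which algorithm the redactor uses.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical hash mask: the same input (and salt) always yields the same
 * token, so values can be correlated across log lines without being exposed.
 */
function hashMask(string $value, string $salt = ''): string
{
    $digest = hash('sha256', $salt . $value);

    return '[HASH:' . substr($digest, 0, 12) . ']';
}
```

An unsalted hash of a low-entropy value (a short PIN, for example) can be reversed by brute force, so a per-environment salt is worth considering if hashed logs leave the test environment.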

Key-Based Redaction

Automatically redacts fields with sensitive names:

Sensitive Keys:

  • Passwords: password, passwd, pwd, secret
  • API Keys: api_key, apikey, api_secret, apisecret
  • Tokens: token, access_token, refresh_token, bearer, auth
  • Encryption: private_key, encryption_key
  • Session: session_id, cookie, csrf, csrf_token
  • Financial: credit_card, card_number, cvv, cvc
  • Personal: ssn, social_security, tax_id, passport

$data = [
    'username' => 'john',
    'password' => 'secret123',
    'api_key' => 'sk_live_abc123',
    'user_id' => 42
];

$redacted = $redactor->redact($data);

// Result:
[
    'username' => 'john',           // Not sensitive
    'password' => '[REDACTED]',     // Key-based redaction
    'api_key' => '[REDACTED]',      // Key-based redaction
    'user_id' => 42                 // Not sensitive
]

Content-Based Redaction

Pattern-based detection for sensitive content:

Detected Patterns:

  • Credit Cards: 4532-1234-5678-9010 → [CREDIT_CARD]
  • SSN: 123-45-6789 → [SSN]
  • Bearer Tokens: Bearer eyJhbGci... → Bearer [REDACTED]
  • Emails (optional): john.doe@example.com → j******e@example.com
  • IP Addresses (optional): 192.168.1.100 → [IP_ADDRESS]

$message = 'Payment with card 4532-1234-5678-9010';
$redacted = $redactor->redactString($message);

// Result: 'Payment with card [CREDIT_CARD]'
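
Pattern-based detection of this kind is typically built on regular expressions. The sketch below covers only two of the patterns, and the expressions themselves are illustrative assumptions rather than the ones the framework ships.

```php
<?php
declare(strict_types=1);

/** Hypothetical content scanner for card numbers and US SSNs. */
function redactPatterns(string $text): string
{
    $patterns = [
        // 16 digits in groups of four, optionally separated by space or hyphen
        '/\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b/' => '[CREDIT_CARD]',
        // US SSN layout: 3-2-4 digits separated by hyphens
        '/\b\d{3}-\d{2}-\d{4}\b/' => '[SSN]',
    ];

    $result = preg_replace(array_keys($patterns), array_values($patterns), $text);
    if ($result === null) {
        // Fail closed: never fall back to the unredacted input.
        throw new RuntimeException('Pattern matching failed');
    }

    return $result;
}
```

Simple digit patterns like these can also match order numbers or other identifiers with the same shape; a stricter variant might add a Luhn checksum test before treating a match as a card number.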

Nested Array Redaction

Recursive redaction for complex data structures:

$data = [
    'user' => [
        'name' => 'John Doe',
        'password' => 'secret',
        'preferences' => [
            'theme' => 'dark',
            'api_key' => 'key123'
        ]
    ]
];

$redacted = $redactor->redact($data);

// Result:
[
    'user' => [
        'name' => 'John Doe',           // Not redacted
        'password' => '[REDACTED]',     // Redacted
        'preferences' => [
            'theme' => 'dark',          // Not redacted
            'api_key' => '[REDACTED]'   // Redacted (nested)
        ]
    ]
]
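
The recursive, key-based behaviour shown above can be sketched in a few lines. The key list here is a shortened sample, and exact, case-insensitive matching on key names is an assumption; the real redactor may match substrings or use a longer list.

```php
<?php
declare(strict_types=1);

/** Shortened sample of sensitive key names (assumption for illustration). */
const SENSITIVE_KEYS = ['password', 'secret', 'api_key', 'token', 'credit_card', 'ssn'];

/** Hypothetical recursive redactor: masks values whose key name is sensitive. */
function redactArray(array $data, string $mask = '[REDACTED]'): array
{
    $result = [];
    foreach ($data as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), SENSITIVE_KEYS, true)) {
            $result[$key] = $mask;                       // mask regardless of value type
        } elseif (is_array($value)) {
            $result[$key] = redactArray($value, $mask);  // descend into nested arrays
        } else {
            $result[$key] = $value;                      // leave everything else untouched
        }
    }

    return $result;
}
```

The function builds a new array instead of modifying its argument, which matches the immutable style used elsewhere in the logging system.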

Docker JSON Handler

Optimized JSON logging for Docker containers with stdout/stderr streaming:

use App\Framework\Logging\Handlers\DockerJsonHandler;
use App\Framework\Logging\Security\SensitiveDataRedactor;

// Production: Compact JSON with full redaction
$handler = new DockerJsonHandler(
    env: $environment,
    minLevel: LogLevel::INFO,
    redactSensitiveData: true,  // Auto-redact in production
    prettyPrint: false          // Compact for log aggregators
);

// Development: Pretty-printed JSON without redaction
$handler = new DockerJsonHandler(
    env: $environment,
    serviceName: 'api-service',
    minLevel: LogLevel::DEBUG,
    prettyPrint: true,          // Pretty-print for readability
    redactSensitiveData: false  // No redaction for debugging
);
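
Docker captures whatever a container writes to stdout and stderr, so a handler of this kind only has to emit one JSON document per line. The sketch below shows the idea with plain PHP. The function names and the rule that ERROR and above go to stderr are assumptions, not the documented behaviour of `DockerJsonHandler`.

```php
<?php
declare(strict_types=1);

/** Hypothetical formatter: one compact JSON document terminated by a newline. */
function formatJsonLine(string $level, string $message, array $context = []): string
{
    $payload = [
        'timestamp' => date(DATE_ATOM),
        'level'     => strtoupper($level),
        'message'   => $message,
        'context'   => (object) $context,   // always an object, even when empty
    ];

    return json_encode($payload, JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR) . "\n";
}

/** Assumed routing rule: errors and above go to stderr, the rest to stdout. */
function streamFor(string $level): string
{
    $errorLevels = ['ERROR', 'CRITICAL', 'ALERT', 'EMERGENCY'];

    return in_array(strtoupper($level), $errorLevels, true) ? 'php://stderr' : 'php://stdout';
}

$line = formatJsonLine('info', 'User login successful', ['user_id' => 42]);
file_put_contents(streamFor('info'), $line);
```

Keeping each record on a single line matters because Docker's log drivers treat every newline as a record boundary; pretty-printed JSON is therefore split into many fragments by most aggregators.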

Docker Log Integration:

# View logs with Docker
docker logs <container> --tail 50
docker logs <container> --follow
docker logs <container> --since 10m

# Format with jq for readability
docker logs <container> 2>&1 | jq .
docker logs <container> 2>&1 | jq 'select(.level == "ERROR")'
docker logs <container> 2>&1 | jq -r '[.timestamp, .level, .message] | @tsv'

Automatic Integration

The framework automatically configures structured logging in LoggerInitializer:

// Production Docker environment
if ($inDocker && $config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        minLevel: LogLevel::INFO,
        redactSensitiveData: true  // AUTOMATIC PII PROTECTION
    );
}

// Development Docker environment
if ($inDocker && !$config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        serviceName: $config->app->name ?? 'app',
        minLevel: LogLevel::DEBUG,
        prettyPrint: true,         // Better readability
        redactSensitiveData: false // No redaction for debugging
    );
}

Usage Examples

Basic Logging with Auto-Redaction

use App\Framework\Logging\Logger;
use App\Framework\Logging\LogLevel;

// Logger automatically uses configured handlers
$this->logger->log(
    LogLevel::INFO,
    'User authentication',
    [
        'user_id' => $userId,
        'password' => 'secret123',    // Auto-redacted
        'api_key' => 'sk_live_xyz',   // Auto-redacted
        'email' => 'user@example.com'  // Auto-redacted (if enabled)
    ]
);

// JSON output (production):
{
    "message": "User authentication",
    "context": {
        "user_id": "user123",
        "password": "[REDACTED]",
        "api_key": "[REDACTED]",
        "email": "[REDACTED]"
    }
}

Channel-Based Logging

use App\Framework\Logging\LogChannel;

// Security-specific logging
$this->logger->channel(LogChannel::SECURITY)->warning(
    'Failed login attempt',
    [
        'username' => $username,
        'ip_address' => $ipAddress,  // Auto-redacted in production
        'attempt_count' => $attempts
    ]
);

// Database-specific logging
$this->logger->channel(LogChannel::DATABASE)->debug(
    'Query executed',
    [
        'query' => $sql,
        'bindings' => $bindings,     // Sensitive bindings redacted
        'duration_ms' => $duration
    ]
);

Custom Redaction Configuration

use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

// Custom redactor for specific service
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: false  // Keep IPs for debugging
);

$formatter = new JsonFormatter(
    prettyPrint: false,
    includeExtras: true,
    flattenContext: true,
    env: $environment,
    serviceName: 'payment-service',
    redactSensitiveData: true,
    redactor: $redactor  // Custom redactor
);

Best Practices

1. Environment-Specific Configuration

// Production: Security-first
- Full redaction (RedactionMode::FULL)
- Compact JSON for log aggregators
- INFO level minimum
- Email and IP redaction enabled

// Development: Debugging-first
- Partial redaction (RedactionMode::PARTIAL)
- Pretty-printed JSON for readability
- DEBUG level minimum
- Email and IP redaction disabled

// Testing: Determinism-first
- Hash redaction (RedactionMode::HASH)
- Consistent output for test assertions
- All levels enabled
- Selective redaction

2. Structured Context

Always use structured context instead of embedding data in messages:

// ❌ Bad: Embedded data in message
$this->logger->info("User {$userId} logged in from {$ipAddress}");

// ✅ Good: Structured context
$this->logger->info('User login successful', [
    'user_id' => $userId,
    'ip_address' => $ipAddress,
    'login_method' => 'oauth'
]);

3. Channel Organization

Use channels for logical log separation:

LogChannel::SECURITY     Authentication, authorization, security events
LogChannel::DATABASE     Database queries, migrations, connection issues
LogChannel::CACHE        Cache hits/misses, cache operations
LogChannel::HTTP         HTTP requests/responses, API calls
LogChannel::QUEUE        Background jobs, queue processing
LogChannel::APPLICATION  General application events (default)

4. Sensitive Data Awareness

Be explicit about what data is logged:

// ❌ Avoid: Logging entire request
$this->logger->debug('Request received', ['request' => $request]);

// ✅ Better: Log specific safe fields
$this->logger->debug('Request received', [
    'method' => $request->method->value,
    'path' => $request->path,
    'user_id' => $request->user?->id
    // Any sensitive keys added here later are still redacted by key name
]);

5. Log Aggregator Optimization

Optimize for log aggregator querying:

// Use consistent field names across services
[
    'user_id' => $userId,           // Not 'userId' or 'id'
    'request_id' => $requestId,     // Not 'reqId' or 'rid'
    'duration_ms' => $durationMs,   // Not 'time' or 'elapsed'
    'status_code' => $statusCode    // Not 'status' or 'code'
]

// Include searchable metadata
[
    'transaction_type' => 'payment',
    'payment_method' => 'credit_card',
    'currency' => 'USD',
    'success' => true
]
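
Where older code still uses inconsistent names, a small normalization step before logging can enforce the convention. The alias table and function below are a hypothetical sketch, not a framework feature; only pure renames are included, since converting units (for example seconds to `duration_ms`) needs more than a key change.

```php
<?php
declare(strict_types=1);

/** Assumed alias table: legacy key name => canonical key name. */
const FIELD_ALIASES = [
    'userId' => 'user_id',
    'reqId'  => 'request_id',
    'rid'    => 'request_id',
    'status' => 'status_code',
];

/** Hypothetical normalizer: renames known aliases to their canonical keys. */
function normalizeFields(array $context, array $aliases = FIELD_ALIASES): array
{
    $normalized = [];
    foreach ($context as $key => $value) {
        $canonical = $aliases[$key] ?? $key;
        // A value logged under the canonical name wins over one logged under an alias.
        if (!array_key_exists($canonical, $normalized) || $canonical === $key) {
            $normalized[$canonical] = $value;
        }
    }

    return $normalized;
}
```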

Performance Considerations

  • Redaction Overhead: ~1-2ms per log record with complex context
  • JSON Serialization: Minimal overhead with JsonSerializer
  • Pattern Matching: Credit card/SSN regex executed only on string content
  • Memory Usage: Readonly records prevent accidental mutations, low overhead

Security Guarantees

  • PII Protection: Automatic redaction of passwords, tokens, credit cards, SSN
  • Production-Safe: Full redaction by default in production environments
  • No Plaintext Secrets: Sensitive keys always masked in logs
  • Configurable Sensitivity: Adjust redaction level per environment
  • Audit-Ready: Deterministic hashing for correlation without exposing data

Testing

use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

it('redacts sensitive data in logs', function () {
    $redactor = new SensitiveDataRedactor(RedactionMode::FULL);

    $data = [
        'user' => 'john',
        'password' => 'secret123',
        'api_key' => 'sk_live_abc'
    ];

    $redacted = $redactor->redact($data);

    expect($redacted['user'])->toBe('john');
    expect($redacted['password'])->toBe('[REDACTED]');
    expect($redacted['api_key'])->toBe('[REDACTED]');
});

it('redacts credit card numbers in content', function () {
    $redactor = new SensitiveDataRedactor();
    $message = 'Payment with card 4532-1234-5678-9010';

    $redacted = $redactor->redactString($message);

    expect($redacted)->toContain('[CREDIT_CARD]');
    expect($redacted)->not->toContain('4532');
});

Troubleshooting

Sensitive data still visible in logs

Check:

  1. Verify redactSensitiveData: true in production handlers
  2. Confirm environment detection ($config->app->isProduction())
  3. Check field names match sensitive key patterns
  4. Verify redactor is properly injected into formatter

Logs missing expected fields

Check:

  1. Ensure includeExtras: true in JsonFormatter
  2. Verify processors are registered in ProcessorManager
  3. Check flattenContext: true for structured context
  4. Confirm channel is properly set on log records

Performance degradation

Check:

  1. Reduce redaction scope (disable email/IP if not needed)
  2. Use PARTIAL mode instead of HASH for less overhead
  3. Minimize context size (only log essential data)
  4. Consider async logging with QueuedLogHandler

Migration Guide

Upgrading to Structured Logging

Step 1: Update log calls to use structured context

// Before
$logger->info("User {$userId} did something");

// After
$logger->info('User action completed', ['user_id' => $userId]);

Step 2: Enable Docker JSON handler in production

// LoggerInitializer already configured
// No changes needed if using framework defaults

Step 3: Verify redaction in production logs

# Check Docker logs don't contain plaintext secrets
docker logs <container> 2>&1 | grep -i "password\|api_key\|token"
# Should only show [REDACTED] or masked values
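
Because the grep above also matches lines where the key is present but correctly masked, a stricter check can decode each JSON line and inspect the values. The following is a sketch under stated assumptions: the function name is hypothetical, and the mask recognizer only knows the `[REDACTED]`, `[HASH:…]`, and asterisk formats shown in this guide.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical leak finder: returns dotted paths of sensitive keys whose
 * scalar value does not look masked.
 */
function findLeaks(array $data, array $sensitiveKeys, string $path = ''): array
{
    $leaks = [];
    foreach ($data as $key => $value) {
        $here = $path === '' ? (string) $key : $path . '.' . $key;
        if (is_array($value)) {
            // Descend into nested context.
            $leaks = [...$leaks, ...findLeaks($value, $sensitiveKeys, $here)];
        } elseif (
            in_array(strtolower((string) $key), $sensitiveKeys, true)
            && is_scalar($value)
            && preg_match('/^\[(REDACTED|HASH:[0-9a-f]+)\]$|\*{3,}/', (string) $value) !== 1
        ) {
            $leaks[] = $here;
        }
    }

    return $leaks;
}

$line  = '{"message":"x","context":{"user_id":1,"password":"[REDACTED]","api_key":"sk_live_abc"}}';
$leaks = findLeaks(json_decode($line, true, 512, JSON_THROW_ON_ERROR), ['password', 'api_key']);
```

Here `$leaks` contains only `context.api_key`, since the password value is already masked. The check inspects key names only, so secrets embedded in free-text messages still need the pattern-based redaction described earlier.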

Summary

The framework's structured logging system provides:

  • JSON-structured output for modern log aggregators
  • Automatic PII redaction for security compliance
  • Docker-optimized logging for container environments
  • Readonly, immutable records following framework principles
  • Environment-aware configuration (prod/dev/test)
  • Standard fields (@timestamp, severity, environment, host, service)
  • Pattern-based detection for credit cards, SSN, tokens
  • Nested data redaction for complex structures
  • Performance-optimized with minimal overhead
  • Test-friendly with deterministic hashing mode