# Structured Logging

Comprehensive guide to the framework's structured logging system with PII protection and log aggregator integration.

## Overview

The framework's structured logging system provides:

- **JSON-structured logs** for log aggregators (Elasticsearch, Datadog, Splunk)
- **Automatic sensitive data redaction** for PII protection
- **Docker-optimized logging** for container environments
- **Readonly, immutable log records** following framework principles
- **Channel-based routing** for organized log management
- **Processor pipeline** for log enrichment

## Core Components

### LogRecord (Readonly & Immutable)

All log records are readonly, immutable value objects:

```php
use App\Framework\Logging\LogRecord;
use App\Framework\Logging\LogLevel;
use App\Framework\Logging\LogChannel;
use App\Framework\Logging\ValueObjects\LogContext;

$record = new LogRecord(
    level: LogLevel::INFO,
    message: 'User login successful',
    channel: LogChannel::SECURITY,
    context: LogContext::structured([
        'user_id' => $userId,
        'ip_address' => $ipAddress
    ])
);

// Immutable transformation (copy-on-write pattern):
// withContext() returns a new record and leaves $record unchanged
$enrichedRecord = $record->withContext(
    $record->context->with('login_method', 'oauth')
);
```

### JSON Formatter with Log Aggregator Support

The `JsonFormatter` produces structured JSON optimized for log aggregators:

```php
use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Config\Environment;

$formatter = new JsonFormatter(
    prettyPrint: false,         // Compact JSON for production
    includeExtras: true,        // Include processor-added data
    flattenContext: true,       // Flatten for easier querying
    env: $environment,
    serviceName: 'api-gateway',
    redactSensitiveData: true   // Auto-redact PII
);

// Output format:
{
    "timestamp": "2025-01-22T15:30:45+00:00",
    "@timestamp": "2025-01-22T15:30:45+00:00",  // Elasticsearch convention
    "level": "INFO",
    "level_value": 200,
    "severity": 6,                               // RFC 5424 (0-7)
    "channel": "security",
    "message": "User login successful",
    "environment": "production",                 // Auto-detected
    "host": "web-server-01",                     // Server hostname
    "service": "api-gateway",                    // Service name
    "context": {
        "user_id": "user123",
        "ip_address": "[REDACTED]"               // Auto-redacted in production
    }
}
```

**Standard Fields for Log Aggregators**:

- `@timestamp`: Elasticsearch-compatible timestamp field
- `severity`: RFC 5424 severity level (0-7) for standardized filtering
- `environment`: Deployment environment (production, staging, development)
- `host`: Server hostname for distributed system correlation
- `service`: Service/application name for multi-service architectures

## Sensitive Data Redaction

### Overview

Automatic PII (Personally Identifiable Information) protection with three redaction modes:

```php
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

// Factory methods for different environments
$redactor = SensitiveDataRedactor::production();   // FULL redaction
$redactor = SensitiveDataRedactor::development();  // PARTIAL redaction
$redactor = SensitiveDataRedactor::testing();      // HASH redaction

// Custom configuration
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: true,
    mask: '[REDACTED]'
);
```

### Redaction Modes

#### FULL Mode (Production Default)

Complete masking of sensitive data:

```php
$redactor = new SensitiveDataRedactor(RedactionMode::FULL);

$data = ['username' => 'john', 'password' => 'secret123'];
$redacted = $redactor->redact($data);
// Result: ['username' => 'john', 'password' => '[REDACTED]']
```

#### PARTIAL Mode (Development Default)

Partial masking that keeps the first and last characters visible:

```php
$redactor = new SensitiveDataRedactor(RedactionMode::PARTIAL);

$data = ['password' => 'super-secret-password'];
$redacted = $redactor->redact($data);
// Result: ['password' => 'su****************rd']
```

#### HASH Mode (Testing Default)

Deterministic hash-based masking:

```php
$redactor = new SensitiveDataRedactor(RedactionMode::HASH);

$data = ['api_key' => 'sk_live_1234567890'];
$redacted = $redactor->redact($data);
// Result: ['api_key' => '[HASH:a1b2c3d4e5f6]']
// Same input always produces same hash
```

### Key-Based Redaction

Fields whose names match a sensitive key are redacted automatically, regardless of their value.

**Sensitive Keys**:

- Passwords: `password`, `passwd`, `pwd`, `secret`
- API Keys: `api_key`, `apikey`, `api_secret`, `apisecret`
- Tokens: `token`, `access_token`, `refresh_token`, `bearer`, `auth`
- Encryption: `private_key`, `encryption_key`
- Session: `session_id`, `cookie`, `csrf`, `csrf_token`
- Financial: `credit_card`, `card_number`, `cvv`, `cvc`
- Personal: `ssn`, `social_security`, `tax_id`, `passport`

```php
$data = [
    'username' => 'john',
    'password' => 'secret123',
    'api_key' => 'sk_live_abc123',
    'user_id' => 42
];

$redacted = $redactor->redact($data);
// Result:
[
    'username' => 'john',          // Not sensitive
    'password' => '[REDACTED]',    // Key-based redaction
    'api_key' => '[REDACTED]',     // Key-based redaction
    'user_id' => 42                // Not sensitive
]
```

### Content-Based Redaction

Pattern-based detection catches sensitive content inside string values, even when the key name is not sensitive.

**Detected Patterns**:

- **Credit Cards**: `4532-1234-5678-9010` → `[CREDIT_CARD]`
- **SSN**: `123-45-6789` → `[SSN]`
- **Bearer Tokens**: `Bearer eyJhbGci...` → `Bearer [REDACTED]`
- **Emails** (optional): `john.doe@example.com` → `j******e@example.com`
- **IP Addresses** (optional): `192.168.1.100` → `[IP_ADDRESS]`

```php
$message = 'Payment with card 4532-1234-5678-9010';
$redacted = $redactor->redactString($message);
// Result: 'Payment with card [CREDIT_CARD]'
```

### Nested Array Redaction

Redaction is applied recursively to nested data structures:

```php
$data = [
    'user' => [
        'name' => 'John Doe',
        'password' => 'secret',
        'preferences' => [
            'theme' => 'dark',
            'api_key' => 'key123'
        ]
    ]
];

$redacted = $redactor->redact($data);
// Result:
[
    'user' => [
        'name' => 'John Doe',              // Not redacted
        'password' => '[REDACTED]',        // Redacted
        'preferences' => [
            'theme' => 'dark',             // Not redacted
            'api_key' => '[REDACTED]'      // Redacted (nested)
        ]
    ]
]
```

## Docker JSON Handler

Optimized JSON logging for Docker containers with stdout/stderr streaming:

```php
use App\Framework\Logging\Handlers\DockerJsonHandler;
use App\Framework\Logging\LogLevel;
use App\Framework\Logging\Security\SensitiveDataRedactor;

// Production: Compact JSON with full redaction
$handler = new DockerJsonHandler(
    env: $environment,
    minLevel: LogLevel::INFO,
    redactSensitiveData: true,  // Auto-redact in production
    prettyPrint: false          // Compact for log aggregators
);

// Development: Pretty-printed JSON without redaction
$handler = new DockerJsonHandler(
    env: $environment,
    serviceName: 'api-service',
    minLevel: LogLevel::DEBUG,
    prettyPrint: true,          // Pretty-print for readability
    redactSensitiveData: false  // No redaction for debugging
);
```

**Docker Log Integration**:

```bash
# View logs with Docker (replace <container> with the container name or ID)
docker logs --tail 50 <container>
docker logs --follow <container>
docker logs --since 10m <container>

# Format with jq for readability
docker logs <container> 2>&1 | jq .
docker logs <container> 2>&1 | jq 'select(.level == "ERROR")'
docker logs <container> 2>&1 | jq -r '[.timestamp, .level, .message] | @tsv'
```

## Automatic Integration

The framework automatically configures structured logging in `LoggerInitializer`:

```php
// Production Docker environment
if ($inDocker && $config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        minLevel: LogLevel::INFO,
        redactSensitiveData: true  // AUTOMATIC PII PROTECTION
    );
}

// Development Docker environment
if ($inDocker && !$config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        serviceName: $config->app->name ?? 'app',
        minLevel: LogLevel::DEBUG,
        prettyPrint: true,          // Better readability
        redactSensitiveData: false  // No redaction for debugging
    );
}
```

## Usage Examples

### Basic Logging with Auto-Redaction

```php
use App\Framework\Logging\Logger;
use App\Framework\Logging\LogLevel;

// Logger automatically uses configured handlers
$this->logger->log(
    LogLevel::INFO,
    'User authentication',
    [
        'user_id' => $userId,
        'password' => 'secret123',     // Auto-redacted
        'api_key' => 'sk_live_xyz',    // Auto-redacted
        'email' => 'user@example.com'  // Auto-redacted (if enabled)
    ]
);

// JSON output (production):
{
    "message": "User authentication",
    "context": {
        "user_id": "user123",
        "password": "[REDACTED]",
        "api_key": "[REDACTED]",
        "email": "[REDACTED]"
    }
}
```

### Channel-Based Logging

```php
use App\Framework\Logging\LogChannel;

// Security-specific logging
$this->logger->channel(LogChannel::SECURITY)->warning(
    'Failed login attempt',
    [
        'username' => $username,
        'ip_address' => $ipAddress,  // Auto-redacted in production
        'attempt_count' => $attempts
    ]
);

// Database-specific logging
$this->logger->channel(LogChannel::DATABASE)->debug(
    'Query executed',
    [
        'query' => $sql,
        'bindings' => $bindings,  // Sensitive bindings redacted
        'duration_ms' => $duration
    ]
);
```

### Custom Redaction Configuration

```php
use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

// Custom redactor for specific service
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: false  // Keep IPs for debugging
);

$formatter = new JsonFormatter(
    prettyPrint: false,
    includeExtras: true,
    flattenContext: true,
    env: $environment,
    serviceName: 'payment-service',
    redactSensitiveData: true,
    redactor: $redactor  // Custom redactor
);
```

## Best Practices

### 1. Environment-Specific Configuration

```php
// Production: Security-first
// - Full redaction (RedactionMode::FULL)
// - Compact JSON for log aggregators
// - INFO level minimum
// - Email and IP redaction enabled

// Development: Debugging-first
// - Partial redaction (RedactionMode::PARTIAL)
// - Pretty-printed JSON for readability
// - DEBUG level minimum
// - Email and IP redaction disabled

// Testing: Determinism-first
// - Hash redaction (RedactionMode::HASH)
// - Consistent output for test assertions
// - All levels enabled
// - Selective redaction
```

### 2. Structured Context

Always use structured context instead of embedding data in messages. Values interpolated into the message are hard to query in an aggregator, and key-based redaction cannot see them:

```php
// ❌ Bad: Embedded data in message
$this->logger->info("User {$userId} logged in from {$ipAddress}");

// ✅ Good: Structured context
$this->logger->info('User login successful', [
    'user_id' => $userId,
    'ip_address' => $ipAddress,
    'login_method' => 'oauth'
]);
```

### 3. Channel Organization

Use channels for logical log separation:

```php
LogChannel::SECURITY     → Authentication, authorization, security events
LogChannel::DATABASE     → Database queries, migrations, connection issues
LogChannel::CACHE        → Cache hits/misses, cache operations
LogChannel::HTTP         → HTTP requests/responses, API calls
LogChannel::QUEUE        → Background jobs, queue processing
LogChannel::APPLICATION  → General application events (default)
```

### 4. Sensitive Data Awareness

Be explicit about what data is logged:

```php
// ❌ Avoid: Logging entire request
$this->logger->debug('Request received', ['request' => $request]);

// ✅ Better: Log specific safe fields
$this->logger->debug('Request received', [
    'method' => $request->method->value,
    'path' => $request->path,
    'user_id' => $request->user?->id
    // No password is logged here; key-based redaction remains a safety net
    // if a field such as 'password' is added later
]);
```

### 5. Log Aggregator Optimization

Optimize for log aggregator querying:

```php
// Use consistent field names across services
[
    'user_id' => $userId,          // Not 'userId' or 'id'
    'request_id' => $requestId,    // Not 'reqId' or 'rid'
    'duration_ms' => $durationMs,  // Not 'time' or 'elapsed'
    'status_code' => $statusCode   // Not 'status' or 'code'
]

// Include searchable metadata
[
    'transaction_type' => 'payment',
    'payment_method' => 'credit_card',
    'currency' => 'USD',
    'success' => true
]
```

## Performance Considerations

- **Redaction Overhead**: ~1-2ms per log record with complex context
- **JSON Serialization**: Minimal overhead with `JsonSerializer`
- **Pattern Matching**: Credit card/SSN regexes are executed only on string content
- **Memory Usage**: Readonly records prevent accidental mutations and add little overhead

## Security Guarantees

- ✅ **PII Protection**: Automatic redaction of passwords, tokens, credit cards, and SSNs
- ✅ **Production-Safe**: Full redaction by default in production environments
- ✅ **No Plaintext Secrets**: Sensitive keys always masked in logs
- ✅ **Configurable Sensitivity**: Adjust redaction level per environment
- ✅ **Audit-Ready**: Deterministic hashing for correlation without exposing data

## Testing

```php
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;

it('redacts sensitive data in logs', function () {
    $redactor = new SensitiveDataRedactor(RedactionMode::FULL);

    $data = [
        'user' => 'john',
        'password' => 'secret123',
        'api_key' => 'sk_live_abc'
    ];

    $redacted = $redactor->redact($data);

    expect($redacted['user'])->toBe('john');
    expect($redacted['password'])->toBe('[REDACTED]');
    expect($redacted['api_key'])->toBe('[REDACTED]');
});

it('redacts credit card numbers in content', function () {
    $redactor = new SensitiveDataRedactor();

    $message = 'Payment with card 4532-1234-5678-9010';
    $redacted = $redactor->redactString($message);

    expect($redacted)->toContain('[CREDIT_CARD]');
    // The original digits must not survive redaction
    expect($redacted)->not->toContain('4532');
});
```

## Troubleshooting

### Sensitive data still visible in logs

**Check**:

1. Verify `redactSensitiveData: true` in production handlers
2. Confirm environment detection (`$config->app->isProduction()`)
3. Check field names match sensitive key patterns
4. Verify redactor is properly injected into formatter

### Logs missing expected fields

**Check**:

1. Ensure `includeExtras: true` in JsonFormatter
2. Verify processors are registered in ProcessorManager
3. Check `flattenContext: true` for structured context
4. Confirm channel is properly set on log records

### Performance degradation

**Check**:

1. Reduce redaction scope (disable email/IP if not needed)
2. Use PARTIAL mode instead of HASH for less overhead
3. Minimize context size (only log essential data)
4. Consider async logging with QueuedLogHandler

## Migration Guide

### Upgrading to Structured Logging

**Step 1**: Update log calls to use structured context

```php
// Before
$logger->info("User {$userId} did something");

// After
$logger->info('User action completed', ['user_id' => $userId]);
```

**Step 2**: Enable Docker JSON handler in production

```php
// LoggerInitializer already configured
// No changes needed if using framework defaults
```

**Step 3**: Verify redaction in production logs

```bash
# Check Docker logs don't contain plaintext secrets
# (replace <container> with the container name or ID)
docker logs <container> 2>&1 | grep -i "password\|api_key\|token"
# Should only show [REDACTED] or masked values
```

## Summary

The framework's structured logging system provides:

- ✅ **JSON-structured output** for modern log aggregators
- ✅ **Automatic PII redaction** for security compliance
- ✅ **Docker-optimized logging** for container environments
- ✅ **Readonly, immutable records** following framework principles
- ✅ **Environment-aware configuration** (prod/dev/test)
- ✅ **Standard fields** (@timestamp, severity, environment, host, service)
- ✅ **Pattern-based detection** for credit cards, SSN, tokens
- ✅ **Nested data redaction** for complex structures
- ✅ **Performance-optimized** with minimal overhead
- ✅ **Test-friendly** with deterministic hashing mode
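
## Appendix: Redaction Sketch

To make the redaction rules described in this guide concrete, here is a minimal, framework-independent sketch of how key-based, content-based, and nested redaction could fit together. It is not the framework's `SensitiveDataRedactor`: the function names `sketchRedact` and `sketchRedactString`, the shortened key list, and the regular expressions are all assumptions chosen for illustration, and the real implementation may differ.

```php
<?php

// Hypothetical, shortened key list; the framework's actual list is longer.
const SKETCH_SENSITIVE_KEYS = ['password', 'passwd', 'pwd', 'secret', 'api_key', 'token', 'cvv', 'ssn'];
const SKETCH_MASK = '[REDACTED]';

// Content-based redaction: replace sensitive-looking substrings in a string.
// The patterns are illustrative assumptions, not the framework's own.
function sketchRedactString(string $value): string
{
    // Sixteen digits in four groups, optionally separated by '-' or ' '.
    $value = preg_replace('/\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/', '[CREDIT_CARD]', $value) ?? $value;
    // US Social Security number in the form 123-45-6789.
    $value = preg_replace('/\b\d{3}-\d{2}-\d{4}\b/', '[SSN]', $value) ?? $value;
    // Bearer token: keep the scheme, mask the credential.
    return preg_replace('/\bBearer\s+[A-Za-z0-9._~+\/=-]+/', 'Bearer [REDACTED]', $value) ?? $value;
}

// Key-based and nested redaction: walk the array recursively.
function sketchRedact(array $data): array
{
    $result = [];
    foreach ($data as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), SKETCH_SENSITIVE_KEYS, true)) {
            // A sensitive key masks the whole value, whatever its type.
            $result[$key] = SKETCH_MASK;
        } elseif (is_array($value)) {
            // Recurse into nested structures.
            $result[$key] = sketchRedact($value);
        } elseif (is_string($value)) {
            // A non-sensitive key still gets content-based checks.
            $result[$key] = sketchRedactString($value);
        } else {
            // Integers, booleans, null, etc. pass through unchanged.
            $result[$key] = $value;
        }
    }
    return $result;
}

$redacted = sketchRedact([
    'user' => [
        'name' => 'John Doe',
        'password' => 'secret',
        'preferences' => ['theme' => 'dark', 'api_key' => 'key123'],
    ],
    'note' => 'Payment with card 4532-1234-5678-9010',
    'attempts' => 3,
]);

echo json_encode($redacted, JSON_PRETTY_PRINT), PHP_EOL;
```

In this sketch the key check runs before the content check, so a value stored under a sensitive key is masked entirely without being scanned, while free-text fields such as `note` are scanned for card numbers, SSNs, and bearer tokens. The sketch returns a new array and never mutates its input, in keeping with the guide's preference for immutable data.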