michaelschiemer/docs/claude/structured-logging.md
Michael Schiemer fc3d7e6357 feat(Production): Complete production deployment infrastructure
2025-10-25 19:18:37 +02:00


# Structured Logging
Comprehensive guide to the framework's structured logging system with PII protection and log aggregator integration.
## Overview
The framework's structured logging system provides:
- **JSON-structured logs** for log aggregators (Elasticsearch, Datadog, Splunk)
- **Automatic sensitive data redaction** for PII protection
- **Docker-optimized logging** for container environments
- **Readonly, immutable log records** following framework principles
- **Channel-based routing** for organized log management
- **Processor pipeline** for log enrichment
## Core Components
### LogRecord (Readonly & Immutable)
All log records are readonly and immutable value objects:
```php
use App\Framework\Logging\LogRecord;
use App\Framework\Logging\LogLevel;
use App\Framework\Logging\LogChannel;
use App\Framework\Logging\ValueObjects\LogContext;
$record = new LogRecord(
    level: LogLevel::INFO,
    message: 'User login successful',
    channel: LogChannel::SECURITY,
    context: LogContext::structured([
        'user_id' => $userId,
        'ip_address' => $ipAddress
    ])
);
// Immutable transformation (Copy-on-Write pattern)
$enrichedRecord = $record->withContext(
    $record->context->with('login_method', 'oauth')
);
```
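The copy-on-write behaviour can be illustrated with a minimal sketch. The classes `SketchContext` and `SketchRecord` below are hypothetical stand-ins (requiring PHP 8.2 for `readonly` classes), not the framework's actual `LogContext` and `LogRecord`, whose internals this guide does not show:

```php
// Hypothetical stand-ins; the framework's LogContext and LogRecord may differ.
final readonly class SketchContext
{
    /** @param array<string, mixed> $data */
    public function __construct(public array $data = []) {}

    // Returns a new instance; the original is never modified.
    public function with(string $key, mixed $value): self
    {
        $data = $this->data;
        $data[$key] = $value;

        return new self($data);
    }
}

final readonly class SketchRecord
{
    public function __construct(
        public string $message,
        public SketchContext $context,
    ) {}

    // Copy-on-write: build a new record around the new context.
    public function withContext(SketchContext $context): self
    {
        return new self($this->message, $context);
    }
}

$record   = new SketchRecord('User login successful', new SketchContext(['user_id' => 42]));
$enriched = $record->withContext($record->context->with('login_method', 'oauth'));
// $record still holds only 'user_id'; $enriched holds both keys.
```

Because every property is readonly, a processor cannot alter a record that another handler is still using; it can only hand a new record down the pipeline.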
### JSON Formatter with Log Aggregator Support
The `JsonFormatter` produces structured JSON optimized for log aggregators:
```php
use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Config\Environment;
$formatter = new JsonFormatter(
    prettyPrint: false,        // Compact JSON for production
    includeExtras: true,       // Include processor-added data
    flattenContext: true,      // Flatten for easier querying
    env: $environment,
    serviceName: 'api-gateway',
    redactSensitiveData: true  // Auto-redact PII
);
// Output format:
{
    "timestamp": "2025-01-22T15:30:45+00:00",
    "@timestamp": "2025-01-22T15:30:45+00:00", // Elasticsearch convention
    "level": "INFO",
    "level_value": 200,
    "severity": 6,                              // RFC 5424 (0-7)
    "channel": "security",
    "message": "User login successful",
    "environment": "production",                // Auto-detected
    "host": "web-server-01",                    // Server hostname
    "service": "api-gateway",                   // Service name
    "context": {
        "user_id": "user123",
        "ip_address": "[REDACTED]"              // Auto-redacted in production
    }
}
```
**Standard Fields for Log Aggregators**:
- `@timestamp`: Elasticsearch-compatible timestamp field
- `severity`: RFC 5424 severity level (0-7) for standardized filtering
- `environment`: Deployment environment (production, staging, development)
- `host`: Server hostname for distributed system correlation
- `service`: Service/application name for multi-service architectures
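To make the `severity` field concrete, here is a small sketch of the RFC 5424 numbering, where 0 is the most severe and 7 the least. The function name `rfc5424Severity` is hypothetical and the level names are assumed to follow the usual PSR-3 set; how the framework performs this mapping internally is not shown in this guide:

```php
// Hypothetical helper: maps PSR-3 style level names to RFC 5424 severities.
function rfc5424Severity(string $level): int
{
    return match (strtoupper($level)) {
        'EMERGENCY' => 0,
        'ALERT'     => 1,
        'CRITICAL'  => 2,
        'ERROR'     => 3,
        'WARNING'   => 4,
        'NOTICE'    => 5,
        'INFO'      => 6,
        'DEBUG'     => 7,
        default     => throw new InvalidArgumentException("Unknown level: {$level}"),
    };
}

// Keep only levels at ERROR severity or worse (lower number = more severe).
$levels = ['DEBUG', 'INFO', 'ERROR', 'CRITICAL'];
$urgent = array_values(array_filter(
    $levels,
    fn (string $level): bool => rfc5424Severity($level) <= 3
));
// $urgent === ['ERROR', 'CRITICAL']
```

A numeric severity lets an aggregator filter with a single range query (for example `severity <= 3`) instead of enumerating level names.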
## Sensitive Data Redaction
### Overview
Automatic PII (Personally Identifiable Information) protection with three redaction modes:
```php
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;
// Factory methods for different environments
$redactor = SensitiveDataRedactor::production(); // FULL redaction
$redactor = SensitiveDataRedactor::development(); // PARTIAL redaction
$redactor = SensitiveDataRedactor::testing(); // HASH redaction
// Custom configuration
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: true,
    mask: '[REDACTED]'
);
```
### Redaction Modes
#### FULL Mode (Production Default)
Complete masking of sensitive data:
```php
$redactor = new SensitiveDataRedactor(RedactionMode::FULL);
$data = ['username' => 'john', 'password' => 'secret123'];
$redacted = $redactor->redact($data);
// Result: ['username' => 'john', 'password' => '[REDACTED]']
```
#### PARTIAL Mode (Development Default)
Partial masking showing first/last characters:
```php
$redactor = new SensitiveDataRedactor(RedactionMode::PARTIAL);
$data = ['password' => 'super-secret-password'];
$redacted = $redactor->redact($data);
// Result: ['password' => 'su****************rd']
```
#### HASH Mode (Testing Default)
Deterministic hash-based masking:
```php
$redactor = new SensitiveDataRedactor(RedactionMode::HASH);
$data = ['api_key' => 'sk_live_1234567890'];
$redacted = $redactor->redact($data);
// Result: ['api_key' => '[HASH:a1b2c3d4e5f6]']
// Same input always produces same hash
```
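The three modes can be compared side by side in a rough sketch. The helper `maskValue`, the string mode names, the choice of two visible characters on each side, and the 12-character SHA-256 prefix are all assumptions made for illustration; the framework's actual masking algorithm may differ in each of these details:

```php
// Hypothetical helper; mode names mirror RedactionMode, the algorithms are assumptions.
function maskValue(string $value, string $mode, string $mask = '[REDACTED]'): string
{
    switch ($mode) {
        case 'partial':
            // Too short to reveal anything safely: fall back to the full mask.
            if (strlen($value) <= 4) {
                return $mask;
            }
            // Keep the first and last two bytes, mask everything in between.
            return substr($value, 0, 2)
                . str_repeat('*', strlen($value) - 4)
                . substr($value, -2);
        case 'hash':
            // Same input, same digest prefix, so values can be correlated across records.
            return '[HASH:' . substr(hash('sha256', $value), 0, 12) . ']';
        default:
            // Full mode: nothing of the original value survives.
            return $mask;
    }
}

$full    = maskValue('secret123', 'full');    // '[REDACTED]'
$partial = maskValue('abcdefgh', 'partial');  // 'ab****gh'
$hashed  = maskValue('sk_live_1234567890', 'hash');
```

One design caveat for any hash-based mode: an unsalted digest of a short or guessable secret can be brute-forced, so it suits correlation in test environments better than protection of production data.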
### Key-Based Redaction
Automatically redacts fields with sensitive names:
**Sensitive Keys**:
- Passwords: `password`, `passwd`, `pwd`, `secret`
- API Keys: `api_key`, `apikey`, `api_secret`, `apisecret`
- Tokens: `token`, `access_token`, `refresh_token`, `bearer`, `auth`
- Encryption: `private_key`, `encryption_key`
- Session: `session_id`, `cookie`, `csrf`, `csrf_token`
- Financial: `credit_card`, `card_number`, `cvv`, `cvc`
- Personal: `ssn`, `social_security`, `tax_id`, `passport`
```php
$data = [
    'username' => 'john',
    'password' => 'secret123',
    'api_key' => 'sk_live_abc123',
    'user_id' => 42
];
$redacted = $redactor->redact($data);
// Result:
[
    'username' => 'john',        // Not sensitive
    'password' => '[REDACTED]',  // Key-based redaction
    'api_key' => '[REDACTED]',   // Key-based redaction
    'user_id' => 42              // Not sensitive
]
```
### Content-Based Redaction
Pattern-based detection for sensitive content:
**Detected Patterns**:
- **Credit Cards**: `4532-1234-5678-9010` → `[CREDIT_CARD]`
- **SSN**: `123-45-6789` → `[SSN]`
- **Bearer Tokens**: `Bearer eyJhbGci...` → `Bearer [REDACTED]`
- **Emails** (optional): `john.doe@example.com` → `j******e@example.com`
- **IP Addresses** (optional): `192.168.1.100` → `[IP_ADDRESS]`
```php
$message = 'Payment with card 4532-1234-5678-9010';
$redacted = $redactor->redactString($message);
// Result: 'Payment with card [CREDIT_CARD]'
```
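Pattern-based detection of this kind is typically a sequence of regular-expression replacements. The sketch below is an assumption about how that could look: the function `redactContent` and its three expressions are hypothetical, cover only the credit card, SSN, and bearer token cases, and are not the framework's actual patterns:

```php
// Hypothetical patterns; the framework's actual expressions are not shown in this guide.
function redactContent(string $text): string
{
    $patterns = [
        // 16 digits in groups of four, optionally separated by dashes or spaces
        '/\b(?:\d{4}[- ]?){3}\d{4}\b/'       => '[CREDIT_CARD]',
        // US Social Security number in NNN-NN-NNNN form
        '/\b\d{3}-\d{2}-\d{4}\b/'            => '[SSN]',
        // Bearer token: keep the scheme, drop the credential
        '/\bBearer\s+[A-Za-z0-9._~+\/-]+=*/' => 'Bearer [REDACTED]',
    ];

    // Patterns are applied in order; fall back to the input if the regex engine fails.
    return preg_replace(array_keys($patterns), array_values($patterns), $text) ?? $text;
}

$card   = redactContent('Payment with card 4532-1234-5678-9010');
$ssn    = redactContent('SSN on file: 123-45-6789');
$bearer = redactContent('Authorization: Bearer eyJhbGci.payload.sig');
```

The credit card pattern runs before the SSN pattern so that a card number is replaced as a whole before any shorter digit pattern can match part of it.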
### Nested Array Redaction
Recursive redaction for complex data structures:
```php
$data = [
    'user' => [
        'name' => 'John Doe',
        'password' => 'secret',
        'preferences' => [
            'theme' => 'dark',
            'api_key' => 'key123'
        ]
    ]
];
$redacted = $redactor->redact($data);
// Result:
[
    'user' => [
        'name' => 'John Doe',             // Not redacted
        'password' => '[REDACTED]',       // Redacted
        'preferences' => [
            'theme' => 'dark',            // Not redacted
            'api_key' => '[REDACTED]'     // Redacted (nested)
        ]
    ]
]
```
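Key-based redaction over nested arrays comes down to a recursive walk. The following is a minimal sketch under stated assumptions: `redactByKey` and the shortened key list `SKETCH_SENSITIVE_KEYS` are hypothetical, matching is by exact lowercase key name, and only full masking is shown:

```php
// Hypothetical key list and helper, shortened for illustration.
const SKETCH_SENSITIVE_KEYS = ['password', 'api_key', 'token', 'secret'];

function redactByKey(array $data, string $mask = '[REDACTED]'): array
{
    $result = [];
    foreach ($data as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), SKETCH_SENSITIVE_KEYS, true)) {
            $result[$key] = $mask;                       // mask the whole value, whatever its type
        } elseif (is_array($value)) {
            $result[$key] = redactByKey($value, $mask);  // descend into nested arrays
        } else {
            $result[$key] = $value;                      // leave everything else untouched
        }
    }

    return $result;
}

$redacted = redactByKey([
    'user' => [
        'name' => 'John Doe',
        'password' => 'secret',
        'preferences' => ['theme' => 'dark', 'api_key' => 'key123'],
    ],
]);
```

The sensitive-key check comes before the array check so that an array stored under a sensitive key is masked as a whole rather than traversed.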
## Docker JSON Handler
Optimized JSON logging for Docker containers with stdout/stderr streaming:
```php
use App\Framework\Logging\Handlers\DockerJsonHandler;
use App\Framework\Logging\Security\SensitiveDataRedactor;
// Production: Compact JSON with full redaction
$handler = new DockerJsonHandler(
    env: $environment,
    minLevel: LogLevel::INFO,
    redactSensitiveData: true,  // Auto-redact in production
    prettyPrint: false          // Compact for log aggregators
);
// Development: Pretty-printed JSON without redaction
$handler = new DockerJsonHandler(
    env: $environment,
    serviceName: 'api-service',
    minLevel: LogLevel::DEBUG,
    prettyPrint: true,          // Pretty-print for readability
    redactSensitiveData: false  // No redaction for debugging
);
```
**Docker Log Integration**:
```bash
# View logs with Docker
docker logs <container> --tail 50
docker logs <container> --follow
docker logs <container> --since 10m
# Format with jq for readability
docker logs <container> 2>&1 | jq .
docker logs <container> 2>&1 | jq 'select(.level == "ERROR")'
docker logs <container> 2>&1 | jq -r '[.timestamp, .level, .message] | @tsv'
```
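The `jq` commands above work because each record is written as one JSON object per line. A rough sketch of that idea follows; `formatForDocker` is a hypothetical helper, and routing `ERROR` and above to stderr is an assumption, since this guide does not state how `DockerJsonHandler` splits records between the two streams:

```php
// Hypothetical helper: returns the target stream and the encoded line without writing,
// so the routing rule is easy to test. The ERROR threshold is an assumption.
function formatForDocker(string $level, string $message, array $context = []): array
{
    $toStderr = in_array($level, ['ERROR', 'CRITICAL', 'ALERT', 'EMERGENCY'], true);

    // Compact encoding keeps the record on a single line, which line-based collectors expect.
    $line = json_encode(
        ['level' => $level, 'message' => $message, 'context' => $context],
        JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
    );

    return [$toStderr ? 'php://stderr' : 'php://stdout', $line . "\n"];
}

[$stream, $line] = formatForDocker('ERROR', 'Payment failed', ['order_id' => 17]);
file_put_contents($stream, $line);
```

Pretty-printed JSON spans several lines per record, which is why it suits local reading but not line-oriented tools such as `docker logs ... | jq`.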
## Automatic Integration
The framework automatically configures structured logging in `LoggerInitializer`:
```php
// Production Docker environment
if ($inDocker && $config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        minLevel: LogLevel::INFO,
        redactSensitiveData: true  // AUTOMATIC PII PROTECTION
    );
}
// Development Docker environment
if ($inDocker && !$config->app->isProduction()) {
    $handlers[] = new DockerJsonHandler(
        env: $env,
        serviceName: $config->app->name ?? 'app',
        minLevel: LogLevel::DEBUG,
        prettyPrint: true,          // Better readability
        redactSensitiveData: false  // No redaction for debugging
    );
}
```
## Usage Examples
### Basic Logging with Auto-Redaction
```php
use App\Framework\Logging\Logger;
use App\Framework\Logging\LogLevel;
// Logger automatically uses configured handlers
$this->logger->log(
    LogLevel::INFO,
    'User authentication',
    [
        'user_id' => $userId,
        'password' => 'secret123',     // Auto-redacted
        'api_key' => 'sk_live_xyz',    // Auto-redacted
        'email' => 'user@example.com'  // Auto-redacted (if enabled)
    ]
);
// JSON output (production):
{
    "message": "User authentication",
    "context": {
        "user_id": "user123",
        "password": "[REDACTED]",
        "api_key": "[REDACTED]",
        "email": "[REDACTED]"
    }
}
```
### Channel-Based Logging
```php
use App\Framework\Logging\LogChannel;
// Security-specific logging
$this->logger->channel(LogChannel::SECURITY)->warning(
    'Failed login attempt',
    [
        'username' => $username,
        'ip_address' => $ipAddress,  // Auto-redacted in production
        'attempt_count' => $attempts
    ]
);
// Database-specific logging
$this->logger->channel(LogChannel::DATABASE)->debug(
    'Query executed',
    [
        'query' => $sql,
        'bindings' => $bindings,     // Sensitive bindings redacted
        'duration_ms' => $duration
    ]
);
```
### Custom Redaction Configuration
```php
use App\Framework\Logging\Formatter\JsonFormatter;
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;
// Custom redactor for specific service
$redactor = new SensitiveDataRedactor(
    mode: RedactionMode::PARTIAL,
    redactEmails: true,
    redactIps: false  // Keep IPs for debugging
);
$formatter = new JsonFormatter(
    prettyPrint: false,
    includeExtras: true,
    flattenContext: true,
    env: $environment,
    serviceName: 'payment-service',
    redactSensitiveData: true,
    redactor: $redactor  // Custom redactor
);
```
## Best Practices
### 1. Environment-Specific Configuration
**Production (security-first)**:
- Full redaction (`RedactionMode::FULL`)
- Compact JSON for log aggregators
- `INFO` level minimum
- Email and IP redaction enabled

**Development (debugging-first)**:
- Partial redaction (`RedactionMode::PARTIAL`)
- Pretty-printed JSON for readability
- `DEBUG` level minimum
- Email and IP redaction disabled

**Testing (determinism-first)**:
- Hash redaction (`RedactionMode::HASH`)
- Consistent output for test assertions
- All levels enabled
- Selective redaction
### 2. Structured Context
Always use structured context instead of embedding data in messages:
```php
// ❌ Bad: Embedded data in message
$this->logger->info("User {$userId} logged in from {$ipAddress}");
// ✅ Good: Structured context
$this->logger->info('User login successful', [
    'user_id' => $userId,
    'ip_address' => $ipAddress,
    'login_method' => 'oauth'
]);
```
### 3. Channel Organization
Use channels for logical log separation:
- `LogChannel::SECURITY`: Authentication, authorization, security events
- `LogChannel::DATABASE`: Database queries, migrations, connection issues
- `LogChannel::CACHE`: Cache hits/misses, cache operations
- `LogChannel::HTTP`: HTTP requests/responses, API calls
- `LogChannel::QUEUE`: Background jobs, queue processing
- `LogChannel::APPLICATION`: General application events (default)
### 4. Sensitive Data Awareness
Be explicit about what data is logged:
```php
// ❌ Avoid: Logging entire request
$this->logger->debug('Request received', ['request' => $request]);
// ✅ Better: Log specific safe fields
$this->logger->debug('Request received', [
    'method' => $request->method->value,
    'path' => $request->path,
    'user_id' => $request->user?->id
    // No password or token fields are passed here; key-based redaction
    // is a safety net, not a reason to log them
]);
```
### 5. Log Aggregator Optimization
Optimize for log aggregator querying:
```php
// Use consistent field names across services
[
    'user_id' => $userId,          // Not 'userId' or 'id'
    'request_id' => $requestId,    // Not 'reqId' or 'rid'
    'duration_ms' => $durationMs,  // Not 'time' or 'elapsed'
    'status_code' => $statusCode   // Not 'status' or 'code'
]
// Include searchable metadata
[
    'transaction_type' => 'payment',
    'payment_method' => 'credit_card',
    'currency' => 'USD',
    'success' => true
]
```
## Performance Considerations
- **Redaction Overhead**: ~1-2ms per log record with complex context
- **JSON Serialization**: Minimal overhead with `JsonSerializer`
- **Pattern Matching**: Credit card/SSN regex executed only on string content
- **Memory Usage**: Readonly records prevent accidental mutations, low overhead
## Security Guarantees
- **PII Protection**: Automatic redaction of passwords, tokens, credit cards, and SSNs
- **Production-Safe**: Full redaction by default in production environments
- **No Plaintext Secrets**: Sensitive keys are masked whenever redaction is enabled
- **Configurable Sensitivity**: Adjust redaction level per environment
- **Audit-Ready**: Deterministic hashing for correlation without exposing data
## Testing
```php
use App\Framework\Logging\Security\SensitiveDataRedactor;
use App\Framework\Logging\Security\RedactionMode;
it('redacts sensitive data in logs', function () {
    $redactor = new SensitiveDataRedactor(RedactionMode::FULL);
    $data = [
        'user' => 'john',
        'password' => 'secret123',
        'api_key' => 'sk_live_abc'
    ];
    $redacted = $redactor->redact($data);
    expect($redacted['user'])->toBe('john');
    expect($redacted['password'])->toBe('[REDACTED]');
    expect($redacted['api_key'])->toBe('[REDACTED]');
});
it('redacts credit card numbers in content', function () {
    $redactor = new SensitiveDataRedactor();
    $message = 'Payment with card 4532-1234-5678-9010';
    $redacted = $redactor->redactString($message);
    expect($redacted)->toContain('[CREDIT_CARD]');
    // The original digits must not survive anywhere in the output
    expect($redacted)->not->toContain('4532');
});
```
## Troubleshooting
### Sensitive data still visible in logs
**Check**:
1. Verify `redactSensitiveData: true` in production handlers
2. Confirm environment detection (`$config->app->isProduction()`)
3. Check field names match sensitive key patterns
4. Verify redactor is properly injected into formatter
### Logs missing expected fields
**Check**:
1. Ensure `includeExtras: true` in JsonFormatter
2. Verify processors are registered in ProcessorManager
3. Check `flattenContext: true` for structured context
4. Confirm channel is properly set on log records
### Performance degradation
**Check**:
1. Reduce redaction scope (disable email/IP if not needed)
2. Use PARTIAL mode instead of HASH for less overhead
3. Minimize context size (only log essential data)
4. Consider async logging with QueuedLogHandler
## Migration Guide
### Upgrading to Structured Logging
**Step 1**: Update log calls to use structured context
```php
// Before
$logger->info("User {$userId} did something");
// After
$logger->info('User action completed', ['user_id' => $userId]);
```
**Step 2**: Enable Docker JSON handler in production
```php
// LoggerInitializer already configured
// No changes needed if using framework defaults
```
**Step 3**: Verify redaction in production logs
```bash
# Check Docker logs don't contain plaintext secrets
docker logs <container> 2>&1 | grep -iE "password|api_key|token"
# Should only show [REDACTED] or masked values
```
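A `grep` match still has to be read by eye, because redacted lines contain the key names too. As a complementary check, the sketch below flags only sensitive keys whose values are not masked. The function `findUnmasked`, the key list passed to it, and the recognised mask formats (`[REDACTED]` and `[HASH:...]`) are assumptions for illustration; partially masked values such as `su****rd` would be reported and need manual review:

```php
// Hypothetical checker: returns dotted paths of sensitive keys whose values are not masked.
function findUnmasked(array $data, array $keys, string $path = ''): array
{
    $hits = [];
    foreach ($data as $key => $value) {
        $current = $path === '' ? (string) $key : "{$path}.{$key}";
        if (is_array($value)) {
            // Collect findings from nested structures
            $hits = [...$hits, ...findUnmasked($value, $keys, $current)];
        } elseif (
            in_array(strtolower((string) $key), $keys, true)
            && !(is_string($value) && preg_match('/^\[(REDACTED|HASH:[0-9a-f]+)\]$/', $value))
        ) {
            $hits[] = $current;
        }
    }

    return $hits;
}

// One log line as it might appear in `docker logs` output
$line = '{"message":"login","context":{"user_id":7,"password":"[REDACTED]","token":"abc"}}';
$hits = findUnmasked(
    json_decode($line, true, 512, JSON_THROW_ON_ERROR),
    ['password', 'token', 'api_key']
);
// $hits === ['context.token']
```

Run against a sample of production log lines, an empty result for every line is the outcome to expect when redaction is working.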
## Summary
The framework's structured logging system provides:
- **JSON-structured output** for modern log aggregators
- **Automatic PII redaction** for security compliance
- **Docker-optimized logging** for container environments
- **Readonly, immutable records** following framework principles
- **Environment-aware configuration** (prod/dev/test)
- **Standard fields** (`@timestamp`, `severity`, `environment`, `host`, `service`)
- **Pattern-based detection** for credit cards, SSNs, and tokens
- **Nested data redaction** for complex structures
- **Performance-optimized** with minimal overhead
- **Test-friendly** with deterministic hashing mode