# ML Model Management System
Comprehensive system for ML model management, versioning, A/B testing, performance monitoring, and auto-tuning.
## Overview
The ML Model Management System covers the full lifecycle of machine learning models in production:
- **Model Registry**: Centralized version management for all ML models
- **A/B Testing**: Traffic splitting and statistical comparison between model versions
- **Performance Monitoring**: Real-time accuracy tracking and drift detection
- **Auto-Tuning**: Automatic threshold and hyperparameter optimization
## Architecture
```
┌─────────────────────┐
│   Model Registry    │ ← Central model management
└─────────────────────┘
        ├─── ModelMetadata (Value Object)
        ├─── CacheModelRegistry (Production)
        └─── InMemoryModelRegistry (Testing)

┌─────────────────────┐
│     A/B Testing     │ ← Traffic splitting & comparisons
└─────────────────────┘
        ├─── ABTestConfig (Value Object)
        ├─── ABTestResult (Value Object)
        └─── ABTestingService

┌─────────────────────┐
│ Performance Monitor │ ← Real-time tracking
└─────────────────────┘
        ├─── ModelPerformanceMonitor
        ├─── PerformanceStorage
        └─── AlertingService

┌─────────────────────┐
│     Auto-Tuning     │ ← Automatic optimization
└─────────────────────┘
        └─── AutoTuningEngine
```
## Components
### 1. Model Registry
Central store for ML model metadata with versioning.
**Core Value Objects**:
#### ModelMetadata
```php
final readonly class ModelMetadata
{
    public function __construct(
        public string $modelName,
        public ModelType $modelType,
        public Version $version,              // Framework Core Version
        public array $configuration = [],
        public array $performanceMetrics = [],
        public Timestamp $createdAt = new Timestamp(),
        public ?Timestamp $deployedAt = null,
        public ?string $environment = null,
        public array $metadata = []
    ) {}
}
```
**Factory Methods**:
```php
// For N+1 detection
$metadata = ModelMetadata::forN1Detector(
    version: Version::fromString('1.0.0'),
    configuration: ['threshold' => 0.7]
);

// For WAF behavioral analysis
$metadata = ModelMetadata::forWafBehavioral(
    version: Version::fromString('2.0.0'),
    configuration: ['window_size' => 100]
);

// For queue job anomaly detection
$metadata = ModelMetadata::forQueueAnomaly(
    version: Version::fromString('1.2.0'),
    configuration: ['min_cluster_size' => 3]
);
```
#### ModelType
```php
enum ModelType: string
{
    case SUPERVISED = 'supervised';
    case UNSUPERVISED = 'unsupervised';
    case SEMI_SUPERVISED = 'semi_supervised';
    case REINFORCEMENT = 'reinforcement';
}
```
**Registry Interface**:
```php
interface ModelRegistry
{
    // CRUD Operations
    public function register(ModelMetadata $metadata): void;
    public function get(string $modelName, Version $version): ?ModelMetadata;
    public function getLatest(string $modelName): ?ModelMetadata;
    public function update(ModelMetadata $metadata): void;
    public function delete(string $modelName, Version $version): bool;

    // Querying
    public function getAll(string $modelName): array;
    public function getByType(ModelType $type): array;
    public function getByEnvironment(string $environment): array;
    public function getProductionModels(): array;

    // Utilities
    public function exists(string $modelName, Version $version): bool;
    public function getAllModelNames(): array;
    public function getVersionCount(string $modelName): int;
}
```
**Usage**:
```php
use App\Framework\MachineLearning\ModelManagement\ModelRegistry;
use App\Framework\MachineLearning\ModelManagement\ValueObjects\ModelMetadata;
use App\Framework\Core\ValueObjects\Version;

// 1. Register a model
$metadata = ModelMetadata::forN1Detector(
    version: Version::fromString('1.0.0'),
    configuration: ['threshold' => 0.7, 'window_size' => 100]
);
$metadata = $metadata->withPerformanceMetrics([
    'accuracy' => 0.92,
    'precision' => 0.89,
    'recall' => 0.88,
    'f1_score' => 0.885,
]);
$registry->register($metadata);

// 2. Fetch a model
$model = $registry->get('n1-detector', Version::fromString('1.0.0'));

// 3. List all versions
$versions = $registry->getAll('n1-detector');

// 4. Production models
$productionModels = $registry->getProductionModels();
```
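To make the registry contract more concrete, here is a minimal in-memory sketch. It is not the framework's `InMemoryModelRegistry`: `SketchRegistry` is a hypothetical class, versions are plain strings, metadata is a plain array, and rejecting duplicate registrations is an assumption. It mainly shows why `getLatest()` needs a real version comparison rather than string sorting.

```php
<?php
// Hypothetical, simplified stand-in: versions are plain strings and
// metadata is a plain array instead of the framework's value objects.
final class SketchRegistry
{
    /** @var array<string, array<string, array>> modelName => version => metadata */
    private array $models = [];

    public function register(string $modelName, string $version, array $metadata): void
    {
        // Assumption: registering the same version twice is an error.
        if (isset($this->models[$modelName][$version])) {
            throw new InvalidArgumentException("{$modelName}@{$version} is already registered");
        }
        $this->models[$modelName][$version] = $metadata;
    }

    public function get(string $modelName, string $version): ?array
    {
        return $this->models[$modelName][$version] ?? null;
    }

    public function getLatest(string $modelName): ?array
    {
        $versions = array_keys($this->models[$modelName] ?? []);
        if ($versions === []) {
            return null;
        }
        // version_compare orders '1.10.0' after '1.2.0', unlike string sorting.
        usort($versions, 'version_compare');

        return $this->models[$modelName][end($versions)];
    }

    public function getVersionCount(string $modelName): int
    {
        return count($this->models[$modelName] ?? []);
    }
}
```

With versions `1.2.0` and `1.10.0` registered, `getLatest()` returns the metadata of `1.10.0`, which a plain alphabetical sort would get wrong.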
### 2. A/B Testing Service
Traffic splitting and statistical comparison between model versions.
**Value Objects**:
#### ABTestConfig
```php
final readonly class ABTestConfig
{
    public function __construct(
        public string $modelName,
        public Version $versionA,               // Control (Baseline)
        public Version $versionB,               // Treatment (New Version)
        public float $trafficSplitA = 0.5,      // 0.0-1.0
        public string $primaryMetric = 'accuracy',
        public float $minimumImprovement = 0.05,
        public float $significanceLevel = 0.05
    ) {}
}
```
**Factory Methods**:
```php
// Standard 50/50 split
$config = ABTestConfig::create(
    modelName: 'n1-detector',
    versionA: Version::fromString('1.0.0'),
    versionB: Version::fromString('1.1.0'),
    trafficSplit: 0.5
);

// Gradual rollout (10% to new version)
$config = ABTestConfig::forGradualRollout(
    modelName: 'n1-detector',
    currentVersion: Version::fromString('1.0.0'),
    newVersion: Version::fromString('1.1.0')
);

// Champion/Challenger (80/20 split)
$config = ABTestConfig::forChallenger(
    modelName: 'n1-detector',
    champion: Version::fromString('1.0.0'),
    challenger: Version::fromString('1.1.0')
);
```
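How a traffic split turns into a routing decision can be sketched as follows. This is an assumption about one reasonable approach, not the behaviour of `ABTestingService::selectVersion()`: the function `selectVersionSketch` and its `$requestId` parameter are hypothetical. Hashing a stable ID keeps each caller on the same version for the duration of the test.

```php
<?php
/**
 * Picks 'A' or 'B' for a request. The same $requestId always lands in the
 * same bucket, so a caller does not flip between versions mid-test.
 * Hypothetical helper; not part of the framework API.
 */
function selectVersionSketch(string $requestId, float $trafficSplitA): string
{
    if ($trafficSplitA < 0.0 || $trafficSplitA > 1.0) {
        throw new InvalidArgumentException('trafficSplitA must be within 0.0-1.0');
    }

    // Map the ID onto 10,000 buckets, i.e. a value in [0.0, 0.9999].
    // (crc32 is non-negative on 64-bit PHP builds.)
    $bucket = (crc32($requestId) % 10000) / 10000;

    return $bucket < $trafficSplitA ? 'A' : 'B';
}
```

A split of `1.0` sends every request to version A and `0.0` sends every request to version B; values in between divide the hash buckets proportionally.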
#### ABTestResult
```php
final readonly class ABTestResult
{
    public function __construct(
        public ABTestConfig $config,
        public ModelMetadata $metadataA,
        public ModelMetadata $metadataB,
        public array $metricsDifference,
        public string $winner,                  // 'A', 'B', or 'tie'
        public bool $isStatisticallySignificant,
        public string $recommendation
    ) {}
}
```
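The `isStatisticallySignificant` flag implies some statistical test behind the comparison. The document does not say which one is used, so the following is only a sketch under the assumption of a two-proportion z-test on accuracy counts; `compareAccuracySketch` is a hypothetical name. The default critical value 1.96 corresponds to a two-sided significance level of 0.05, the default `significanceLevel` above.

```php
<?php
/**
 * Two-proportion z-test on accuracy counts (assumed technique, see lead-in).
 * Returns the accuracy difference (B minus A), the z statistic, and whether
 * |z| exceeds the two-sided critical value.
 */
function compareAccuracySketch(
    int $correctA,
    int $totalA,
    int $correctB,
    int $totalB,
    float $criticalZ = 1.96
): array {
    if ($totalA <= 0 || $totalB <= 0) {
        throw new InvalidArgumentException('Both versions need at least one prediction');
    }

    $accuracyA = $correctA / $totalA;
    $accuracyB = $correctB / $totalB;

    // Pooled proportion under the null hypothesis "both versions are equal".
    $pooled = ($correctA + $correctB) / ($totalA + $totalB);
    $standardError = sqrt($pooled * (1 - $pooled) * (1 / $totalA + 1 / $totalB));
    $z = $standardError > 0.0 ? ($accuracyB - $accuracyA) / $standardError : 0.0;

    return [
        'difference' => $accuracyB - $accuracyA,
        'z' => $z,
        'significant' => abs($z) > $criticalZ,
    ];
}
```

The same three-point accuracy gap (92% vs. 95%) is significant with 1,000 predictions per version but not with 100 per version, which is why sample size matters before declaring a winner.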
**Service Methods**:
```php
// 1. Traffic Routing
$selectedVersion = $abTesting->selectVersion($config);

// 2. Model Comparison
$result = $abTesting->compareModels($config, $metadataA, $metadataB);

// 3. Automated Test
$result = $abTesting->runTest($config);

// 4. Gradual Rollout Plan
$plan = $abTesting->generateRolloutPlan(steps: 5);
// Returns: [1 => 0.2, 2 => 0.4, 3 => 0.6, 4 => 0.8, 5 => 1.0]

// 5. Required Sample Size
$sampleSize = $abTesting->calculateRequiredSampleSize(
    confidenceLevel: 0.95,
    marginOfError: 0.05
);
```
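The rollout plan and sample-size calls above can be approximated with a few lines of arithmetic. This is a sketch under assumptions: the rollout plan is taken to be evenly spaced (which matches the documented return value for five steps), and the sample size uses the standard formula for estimating a proportion, n = z² · p(1 − p) / e² with the worst case p = 0.5. Whether the service uses exactly this formula is not stated; both function names are hypothetical.

```php
<?php
/** Evenly spaced traffic shares for the new version, keyed by step number. */
function rolloutPlanSketch(int $steps): array
{
    if ($steps < 1) {
        throw new InvalidArgumentException('steps must be at least 1');
    }

    $plan = [];
    for ($step = 1; $step <= $steps; $step++) {
        // fdiv always returns a float, even when the division is exact.
        $plan[$step] = fdiv($step, $steps);
    }

    return $plan;
}

/** n = z^2 * p(1-p) / e^2 with the worst case p = 0.5, rounded up. */
function requiredSampleSizeSketch(float $confidenceLevel, float $marginOfError): int
{
    // Two-sided z scores for common confidence levels (string keys, because
    // PHP does not support float array keys).
    $zScores = ['0.90' => 1.645, '0.95' => 1.96, '0.99' => 2.576];
    $key = number_format($confidenceLevel, 2);

    if (!isset($zScores[$key])) {
        throw new InvalidArgumentException("Unsupported confidence level: {$key}");
    }
    if ($marginOfError <= 0.0) {
        throw new InvalidArgumentException('marginOfError must be positive');
    }

    $z = $zScores[$key];

    return (int) ceil(($z ** 2) * 0.25 / ($marginOfError ** 2));
}
```

Under these assumptions a 95% confidence level needs 385 samples for a 5% margin of error and 1,068 samples for a 3% margin, so tightening the margin roughly triples the required traffic.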
**Usage**:
```php
use App\Framework\MachineLearning\ModelManagement\ABTestingService;
use App\Framework\MachineLearning\ModelManagement\ValueObjects\ABTestConfig;

// 1. Configure the test
$config = ABTestConfig::create(
    modelName: 'n1-detector',
    versionA: Version::fromString('1.0.0'),
    versionB: Version::fromString('1.1.0'),
    trafficSplit: 0.5
);

// 2. Run the test
$result = $abTesting->runTest($config);

// 3. Analyze the results
if ($result->shouldDeployVersionB()) {
    echo "Deploy Version B - {$result->recommendation}";
    echo "Improvement: {$result->getPrimaryMetricImprovementPercent()}%";

    // Gradual Rollout
    $plan = $abTesting->generateRolloutPlan(5);
    foreach ($plan as $step => $traffic) {
        // Deploy step-by-step
    }
}

// 4. Metrics summary
$summary = $result->getMetricsSummary();
print_r($summary);
```
### 3. Performance Monitor
Real-time accuracy tracking, drift detection, and alerting.
**Core Features**:
- Real-time prediction tracking
- Performance degradation detection
- Concept drift detection
- Confusion matrix calculation
- Performance trend analysis
- Multi-version comparison
**Service Methods**:
```php
// 1. Track Prediction
$performanceMonitor->trackPrediction(
    modelName: 'n1-detector',
    version: Version::fromString('1.0.0'),
    prediction: true,
    actual: true,
    confidence: 0.95,
    features: ['query_count' => 5, 'pattern' => 'SELECT']
);

// 2. Current Metrics
$metrics = $performanceMonitor->getCurrentMetrics(
    'n1-detector',
    Version::fromString('1.0.0'),
    timeWindow: Duration::fromHours(24)
);
// Returns: accuracy, precision, recall, f1_score, confusion_matrix, etc.

// 3. Performance Degradation Check
$degradationInfo = $performanceMonitor->getPerformanceDegradationInfo(
    'n1-detector',
    Version::fromString('1.0.0'),
    thresholdPercent: 0.05 // 5% degradation threshold
);
if ($degradationInfo['has_degraded']) {
    // Alert and take action
}

// 4. Concept Drift Detection
$hasDrift = $performanceMonitor->detectConceptDrift(
    'n1-detector',
    Version::fromString('1.0.0'),
    timeWindow: Duration::fromHours(24)
);

// 5. Performance Trend
$trend = $performanceMonitor->getPerformanceTrend(
    'n1-detector',
    Version::fromString('1.0.0'),
    timeWindow: Duration::fromDays(7),
    interval: Duration::fromHours(1)
);

// 6. Version Comparison
$comparison = $performanceMonitor->compareVersions(
    'n1-detector',
    [Version::fromString('1.0.0'), Version::fromString('1.1.0')],
    timeWindow: Duration::fromHours(24)
);
```
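The metrics returned by `getCurrentMetrics()` all derive from a confusion matrix over the tracked predictions. The following sketch shows that derivation for binary predictions; `computeMetricsSketch` and the record shape are hypothetical, and the real monitor may store and aggregate records differently.

```php
<?php
/**
 * Derives accuracy, precision, recall and F1 from tracked binary predictions.
 * Hypothetical helper; the record shape is an assumption.
 *
 * @param list<array{prediction: bool, actual: bool}> $records
 */
function computeMetricsSketch(array $records): array
{
    $matrix = ['tp' => 0, 'fp' => 0, 'tn' => 0, 'fn' => 0];
    foreach ($records as $record) {
        if ($record['prediction']) {
            $matrix[$record['actual'] ? 'tp' : 'fp']++;
        } else {
            $matrix[$record['actual'] ? 'fn' : 'tn']++;
        }
    }

    // Guard against empty denominators (e.g. no positive predictions yet).
    $ratio = static fn (int $num, int $den): float => $den > 0 ? $num / $den : 0.0;

    $precision = $ratio($matrix['tp'], $matrix['tp'] + $matrix['fp']);
    $recall = $ratio($matrix['tp'], $matrix['tp'] + $matrix['fn']);
    $f1 = ($precision + $recall) > 0.0
        ? 2 * $precision * $recall / ($precision + $recall)
        : 0.0;

    return [
        'accuracy' => $ratio($matrix['tp'] + $matrix['tn'], array_sum($matrix)),
        'precision' => $precision,
        'recall' => $recall,
        'f1_score' => $f1,
        'confusion_matrix' => $matrix,
    ];
}
```

Only records with known ground truth can contribute here, which is why tracking `actual` whenever it becomes available matters for every downstream check.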
**Usage**:
```php
use App\Framework\MachineLearning\ModelManagement\ModelPerformanceMonitor;
use App\Framework\Core\ValueObjects\Version;
use App\Framework\Core\ValueObjects\Duration;

// 1. Track predictions (after every prediction)
$performanceMonitor->trackPrediction(
    modelName: 'n1-detector',
    version: $currentVersion,
    prediction: $modelPrediction,
    actual: $groundTruth, // If known
    confidence: $confidenceScore
);

// 2. Monitor performance (scheduler, every 5 minutes)
$metrics = $performanceMonitor->getCurrentMetrics(
    'n1-detector',
    $currentVersion,
    Duration::fromHours(1)
);
if ($metrics['accuracy'] < 0.85) {
    // Alert: Accuracy drop detected
}

// 3. Degradation check (scheduler, hourly)
$degradation = $performanceMonitor->getPerformanceDegradationInfo(
    'n1-detector',
    $currentVersion
);
if ($degradation['has_degraded']) {
    // Automatic alert sent via AlertingService
    // Consider rollback or retraining
}

// 4. Drift detection (scheduler, daily)
if ($performanceMonitor->detectConceptDrift('n1-detector', $currentVersion)) {
    // Schedule model retraining
}
```
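The document does not describe how `detectConceptDrift()` decides. As a rough illustration only, the sketch below uses a simple heuristic: split the time-ordered outcomes of a window into an older and a newer half and flag drift when accuracy falls by at least a chosen amount. The function name, the 0.10 default, and the heuristic itself are assumptions; production drift detectors typically use proper statistical tests.

```php
<?php
/**
 * Naive drift heuristic: compare accuracy of the older half of a window
 * with the newer half. Hypothetical helper, not the framework's algorithm.
 *
 * @param list<bool> $outcomes Time-ordered; true = prediction matched ground truth
 */
function detectDriftSketch(array $outcomes, float $maxAccuracyDrop = 0.10): bool
{
    $half = intdiv(count($outcomes), 2);
    if ($half === 0) {
        return false; // Not enough data to compare two halves
    }

    $accuracy = static fn (array $slice): float => count(array_filter($slice)) / count($slice);

    $reference = $accuracy(array_slice($outcomes, 0, $half));
    $recent = $accuracy(array_slice($outcomes, -$half));

    return ($reference - $recent) >= $maxAccuracyDrop;
}
```

A window whose first ten predictions are all correct and whose last ten contain four misses is flagged, while a uniformly accurate window is not.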
### 4. Auto-Tuning Engine
Automatic threshold and hyperparameter optimization.
**Core Features**:
- Threshold optimization (Grid Search)
- Hyperparameter tuning
- Precision-recall trade-off optimization
- Adaptive threshold adjustment
- Performance-cost trade-off
**Service Methods**:
```php
// 1. Threshold Optimization
$result = $autoTuning->optimizeThreshold(
    modelName: 'n1-detector',
    version: Version::fromString('1.0.0'),
    metricToOptimize: 'f1_score',
    thresholdRange: [0.5, 0.9],
    step: 0.05,
    timeWindow: Duration::fromHours(24)
);
// Returns: optimal_threshold, improvement_percent, recommendation

// 2. Hyperparameter Optimization
$result = $autoTuning->optimizeHyperparameters(
    modelName: 'n1-detector',
    version: Version::fromString('1.0.0'),
    parameterRanges: [
        'window_size' => [50, 150, 25],
        'min_cluster_size' => [2, 5, 1],
    ],
    metricToOptimize: 'f1_score'
);

// 3. Precision-Recall Trade-off
$result = $autoTuning->optimizePrecisionRecallTradeoff(
    modelName: 'n1-detector',
    version: Version::fromString('1.0.0'),
    targetPrecision: 0.95, // 95% precision target
    thresholdRange: [0.5, 0.99]
);

// 4. Adaptive Adjustment
$result = $autoTuning->adaptiveThresholdAdjustment(
    'n1-detector',
    Version::fromString('1.0.0')
);
// Automatically adjusts based on FP/FN rates
```
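Threshold optimization by grid search amounts to re-scoring recorded predictions at each candidate threshold and keeping the best one. The sketch below does this for F1; `optimizeThresholdSketch` and the sample shape are hypothetical, and it assumes a prediction counts as positive when its confidence is at or above the threshold.

```php
<?php
/**
 * Grid search over decision thresholds, maximizing F1.
 * Hypothetical helper; the sample shape is an assumption.
 *
 * @param list<array{confidence: float, actual: bool}> $samples
 */
function optimizeThresholdSketch(array $samples, float $min, float $max, float $step): array
{
    if ($step <= 0.0 || $max < $min) {
        throw new InvalidArgumentException('Invalid threshold range or step');
    }

    $best = ['optimal_threshold' => $min, 'f1_score' => -1.0];
    $stepCount = (int) round(($max - $min) / $step);

    for ($i = 0; $i <= $stepCount; $i++) {
        // Index-based stepping avoids accumulating floating-point error.
        $threshold = round($min + $i * $step, 4);

        $tp = $fp = $fn = 0;
        foreach ($samples as $sample) {
            $predicted = $sample['confidence'] >= $threshold;
            if ($predicted && $sample['actual']) {
                $tp++;
            } elseif ($predicted) {
                $fp++;
            } elseif ($sample['actual']) {
                $fn++;
            }
        }

        // F1 = 2TP / (2TP + FP + FN)
        $denominator = 2 * $tp + $fp + $fn;
        $f1 = $denominator > 0 ? (float) (2 * $tp / $denominator) : 0.0;

        if ($f1 > $best['f1_score']) {
            $best = ['optimal_threshold' => $threshold, 'f1_score' => $f1];
        }
    }

    return $best;
}
```

Because the comparison is strict, ties go to the lowest threshold that reaches the best score; a different tie-break (for example, preferring higher precision) would be an equally valid design choice.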
**Usage**:
```php
use App\Framework\MachineLearning\ModelManagement\AutoTuningEngine;

// 1. Optimize the threshold (weekly via scheduler)
$optimization = $autoTuning->optimizeThreshold(
    modelName: 'n1-detector',
    version: $currentVersion,
    metricToOptimize: 'f1_score',
    thresholdRange: [0.5, 0.9],
    step: 0.05
);
if ($optimization['improvement_percent'] > 5.0) {
    // Apply optimized threshold
    $updatedConfig = array_merge(
        $currentConfig,
        ['threshold' => $optimization['optimal_threshold']]
    );
    $updatedMetadata = $metadata->withConfiguration($updatedConfig);
    $registry->update($updatedMetadata);
}

// 2. Adaptive adjustment (daily via scheduler)
$adaptive = $autoTuning->adaptiveThresholdAdjustment(
    'n1-detector',
    $currentVersion
);
if ($adaptive['recommended_threshold'] !== $adaptive['current_threshold']) {
    // Apply adaptive adjustment
}

// 3. Precision-recall optimization (on-demand)
$tradeoff = $autoTuning->optimizePrecisionRecallTradeoff(
    'n1-detector',
    $currentVersion,
    targetPrecision: 0.95
);
// Apply if precision target met with acceptable recall
if ($tradeoff['achieved_precision'] >= 0.95
    && $tradeoff['achieved_recall'] >= 0.80) {
    // Apply optimized threshold
}
```
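An adjustment "based on FP/FN rates" could work along the following lines. This is a hypothetical rule, not the engine's documented logic: when false positives clearly outweigh false negatives the threshold is nudged up, and in the opposite case it is nudged down. The function name, the step size, and the tolerance are all assumptions; only the result keys `current_threshold` and `recommended_threshold` come from the usage example above.

```php
<?php
/**
 * Nudges the threshold towards balancing false positives and false negatives.
 * Hypothetical rule; rates are taken as a share of all predictions.
 */
function adaptiveThresholdSketch(
    float $currentThreshold,
    int $falsePositives,
    int $falseNegatives,
    int $totalPredictions,
    float $step = 0.05,
    float $tolerance = 0.02
): array {
    if ($totalPredictions <= 0) {
        throw new InvalidArgumentException('totalPredictions must be positive');
    }

    $fpRate = $falsePositives / $totalPredictions;
    $fnRate = $falseNegatives / $totalPredictions;

    $recommended = $currentThreshold;
    if ($fpRate - $fnRate > $tolerance) {
        $recommended += $step; // Too many false alarms: be stricter
    } elseif ($fnRate - $fpRate > $tolerance) {
        $recommended -= $step; // Too many misses: be more lenient
    }

    // Keep the result inside the valid range and free of float noise.
    $recommended = round(max(0.0, min(1.0, $recommended)), 4);

    return [
        'current_threshold' => $currentThreshold,
        'recommended_threshold' => $recommended,
    ];
}
```

The tolerance band keeps the threshold stable when both error types are roughly balanced, which avoids oscillating between two values on consecutive daily runs.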
## DI Container Integration
```php
namespace App\Framework\MachineLearning\ModelManagement;

// Automatically registered via Initializer attribute
#[Initializer]
final readonly class MLModelManagementInitializer
{
    // Assumption: the container is constructor-injected. The original snippet
    // uses $this->container without showing where it comes from.
    public function __construct(
        private Container $container
    ) {}

    public function initialize(): void
    {
        // ModelRegistry (Singleton)
        $this->container->singleton(
            ModelRegistry::class,
            fn(Container $c) => new CacheModelRegistry($c->get(Cache::class))
        );

        // ABTestingService
        $this->container->bind(ABTestingService::class, ...);

        // ModelPerformanceMonitor
        $this->container->bind(ModelPerformanceMonitor::class, ...);

        // AutoTuningEngine
        $this->container->bind(AutoTuningEngine::class, ...);
    }
}
```
## Complete Workflow Example
```php
// ============================================================================
// Step 1: Register Model Versions
// ============================================================================
$v1 = ModelMetadata::forN1Detector(
    version: Version::fromString('1.0.0'),
    configuration: ['threshold' => 0.7]
)->withPerformanceMetrics([
    'accuracy' => 0.92,
    'f1_score' => 0.885,
]);
$registry->register($v1);

$v2 = ModelMetadata::forN1Detector(
    version: Version::fromString('1.1.0'),
    configuration: ['threshold' => 0.75]
)->withPerformanceMetrics([
    'accuracy' => 0.95,
    'f1_score' => 0.92,
]);
$registry->register($v2);

// ============================================================================
// Step 2: A/B Test New Version
// ============================================================================
$config = ABTestConfig::create(
    modelName: 'n1-detector',
    versionA: Version::fromString('1.0.0'),
    versionB: Version::fromString('1.1.0'),
    trafficSplit: 0.5
);
$abResult = $abTesting->runTest($config);

$deployed = null; // Stays null unless version B is fully rolled out

if ($abResult->shouldDeployVersionB()) {
    // ========================================================================
    // Step 3: Gradual Rollout
    // ========================================================================
    $plan = $abTesting->generateRolloutPlan(steps: 5);
    $rolledBack = false;

    foreach ($plan as $step => $trafficToB) {
        // Update traffic split
        $stepConfig = new ABTestConfig(
            modelName: 'n1-detector',
            versionA: Version::fromString('1.0.0'),
            versionB: Version::fromString('1.1.0'),
            trafficSplitA: 1.0 - $trafficToB
        );

        // Wait and monitor (e.g., 1 hour per step)
        sleep(3600);

        // Check metrics
        $metrics = $performanceMonitor->getCurrentMetrics(
            'n1-detector',
            Version::fromString('1.1.0'),
            Duration::fromHours(1)
        );
        if ($metrics['accuracy'] < 0.90) {
            // Rollback: stop the rollout and skip the full deployment
            $rolledBack = true;
            break;
        }
    }

    // ========================================================================
    // Step 4: Full Deployment (only if no step triggered a rollback)
    // ========================================================================
    if (!$rolledBack) {
        $deployed = $v2->withDeployment(
            environment: 'production',
            deployedAt: Timestamp::now()
        );
        $registry->update($deployed);
    }
}

// ============================================================================
// Step 5: Continuous Monitoring
// ============================================================================
// Scheduler Job: Every 5 minutes
$metrics = $performanceMonitor->getCurrentMetrics(
    'n1-detector',
    Version::fromString('1.1.0'),
    Duration::fromHours(1)
);

// Scheduler Job: Every hour
$degradation = $performanceMonitor->getPerformanceDegradationInfo(
    'n1-detector',
    Version::fromString('1.1.0')
);
if ($degradation['has_degraded']) {
    // Automatic alert sent
    // Consider rollback or retraining
}

// ============================================================================
// Step 6: Auto-Tuning
// ============================================================================
// Scheduler Job: Weekly
$optimization = $autoTuning->optimizeThreshold(
    'n1-detector',
    Version::fromString('1.1.0'),
    'f1_score',
    [0.5, 0.9],
    0.05
);
// $deployed is only set when the rollout in Step 4 completed
if ($deployed !== null && $optimization['improvement_percent'] > 5.0) {
    $updatedConfig = ['threshold' => $optimization['optimal_threshold']];
    $updated = $deployed->withConfiguration($updatedConfig);
    $registry->update($updated);
}
```
## Scheduler Integration
```php
use App\Framework\Scheduler\Services\SchedulerService;
use App\Framework\Scheduler\Schedules\IntervalSchedule;
// Assumption: CronSchedule lives in the same namespace as IntervalSchedule
use App\Framework\Scheduler\Schedules\CronSchedule;
use App\Framework\Core\ValueObjects\Duration;

// Performance Monitoring (Every 5 minutes)
$scheduler->schedule(
    'ml-performance-monitoring',
    IntervalSchedule::every(Duration::fromMinutes(5)),
    function() use ($performanceMonitor, $registry) {
        $productionModels = $registry->getProductionModels();
        foreach ($productionModels as $model) {
            $metrics = $performanceMonitor->getCurrentMetrics(
                $model->modelName,
                $model->version,
                Duration::fromHours(1)
            );
            // Log metrics for dashboard
        }
    }
);

// Degradation Check (Every hour)
$scheduler->schedule(
    'ml-degradation-check',
    IntervalSchedule::every(Duration::fromHours(1)),
    function() use ($performanceMonitor, $registry) {
        $productionModels = $registry->getProductionModels();
        foreach ($productionModels as $model) {
            $degradation = $performanceMonitor->getPerformanceDegradationInfo(
                $model->modelName,
                $model->version
            );
            if ($degradation['has_degraded']) {
                // Automatic alert sent via AlertingService
            }
        }
    }
);

// Auto-Tuning (Weekly)
$scheduler->schedule(
    'ml-auto-tuning',
    CronSchedule::fromExpression('0 2 * * 0'), // Sunday 2 AM
    function() use ($autoTuning, $registry) {
        $productionModels = $registry->getProductionModels();
        foreach ($productionModels as $model) {
            $optimization = $autoTuning->optimizeThreshold(
                $model->modelName,
                $model->version,
                'f1_score',
                [0.5, 0.9],
                0.05
            );
            if ($optimization['improvement_percent'] > 3.0) {
                // Merge so that other configuration keys are preserved
                $updated = $model->withConfiguration(array_merge(
                    $model->configuration,
                    ['threshold' => $optimization['optimal_threshold']]
                ));
                $registry->update($updated);
            }
        }
    }
);

// Drift Detection (Daily)
$scheduler->schedule(
    'ml-drift-detection',
    CronSchedule::fromExpression('0 3 * * *'), // Daily 3 AM
    function() use ($performanceMonitor, $registry) {
        $productionModels = $registry->getProductionModels();
        foreach ($productionModels as $model) {
            $hasDrift = $performanceMonitor->detectConceptDrift(
                $model->modelName,
                $model->version
            );
            if ($hasDrift) {
                // Schedule retraining
            }
        }
    }
);
```
## Best Practices
### 1. Model Versioning
- **Semantic Versioning**: Use MAJOR.MINOR.PATCH for all models
- **Breaking Changes**: Increment the major version for breaking changes
- **Feature Additions**: Increment the minor version for new features
- **Bug Fixes**: Increment the patch version for bug fixes and threshold adjustments
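
The versioning rules above can be expressed as a small helper. This is an illustrative sketch only: `bumpVersionSketch` and the change-type labels `'breaking'`, `'feature'`, and `'threshold'` are hypothetical, and it works on plain version strings rather than the framework's `Version` value object.

```php
<?php
/**
 * Returns the next semantic version for a given kind of model change.
 * Hypothetical helper; change-type labels are assumptions.
 */
function bumpVersionSketch(string $version, string $change): string
{
    if (!preg_match('/^(\d+)\.(\d+)\.(\d+)$/', $version, $parts)) {
        throw new InvalidArgumentException("Not a MAJOR.MINOR.PATCH version: {$version}");
    }
    [$major, $minor, $patch] = [(int) $parts[1], (int) $parts[2], (int) $parts[3]];

    return match ($change) {
        'breaking' => ($major + 1) . '.0.0',
        'feature' => $major . '.' . ($minor + 1) . '.0',
        'threshold' => $major . '.' . $minor . '.' . ($patch + 1),
        default => throw new InvalidArgumentException("Unknown change type: {$change}"),
    };
}
```

A threshold adjustment on `1.0.0` yields `1.0.1`, a new feature on `1.0.1` yields `1.1.0`, and a breaking change on `1.1.0` yields `2.0.0`.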
### 2. A/B Testing
- **Sample Size**: Collect at least 100 predictions per version as a lower bound; compute the size actually needed for significance with `calculateRequiredSampleSize()`
- **Gradual Rollout**: Increase traffic step by step (10% → 25% → 50% → 75% → 100%)
- **Monitoring**: Monitor continuously during the rollout
- **Rollback Plan**: Roll back automatically on performance degradation
### 3. Performance Monitoring
- **Real-Time Tracking**: Track every prediction together with its ground truth (when available)
- **Alert Thresholds**: 5% degradation = warning, 10% = critical
- **Drift Detection**: Check for concept drift daily
- **Retention**: Keep performance data for 30 days
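
The alert thresholds above can be sketched as a small classification step. The document does not say whether the percentages are relative or absolute, so this sketch assumes a drop relative to the baseline accuracy. `classifyDegradationSketch` and the keys `degradation_percent` and `level` are hypothetical; only `has_degraded` appears in the examples elsewhere in this document.

```php
<?php
/**
 * Maps an accuracy drop to an alert level: >= 5% warning, >= 10% critical.
 * Assumption: the drop is measured relative to the baseline accuracy.
 */
function classifyDegradationSketch(float $baselineAccuracy, float $currentAccuracy): array
{
    if ($baselineAccuracy <= 0.0) {
        throw new InvalidArgumentException('baselineAccuracy must be positive');
    }

    $dropPercent = ($baselineAccuracy - $currentAccuracy) / $baselineAccuracy * 100;

    $level = match (true) {
        $dropPercent >= 10.0 => 'critical',
        $dropPercent >= 5.0 => 'warning',
        default => 'ok',
    };

    return [
        'has_degraded' => $level !== 'ok',
        'degradation_percent' => round($dropPercent, 2),
        'level' => $level,
    ];
}
```

With a baseline accuracy of 0.92, a current accuracy of 0.90 stays at `ok`, 0.86 raises a `warning`, and 0.80 is `critical`.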
### 4. Auto-Tuning
- **Grid Search**: Run weekly for threshold optimization
- **Adaptive Adjustment**: Run daily, based on FP/FN rates
- **Minimum Improvement**: Require at least 3% improvement before changing a threshold
- **Validation**: Always test on a separate validation set
### 5. Production Deployment
- **Baseline**: Always record baseline metrics before deployment
- **Canary Deployment**: Roll out gradually, starting at 10%
- **Monitoring Window**: Monitor for at least 24 hours after full deployment
- **Rollback**: Roll back automatically on an accuracy drop of more than 5%
## Performance Characteristics
- **ModelRegistry**: O(1) Lookup, 7-day Cache TTL
- **A/B Testing**: <50ms per traffic routing decision
- **Performance Monitor**: ~1ms per prediction tracking
- **Auto-Tuning**: ~10s for threshold optimization (grid search 0.05 steps)
- **Storage**: 30-day retention for performance data
## Troubleshooting
### Problem: High False Positive Rate
**Solution**:
```php
$adaptive = $autoTuning->adaptiveThresholdAdjustment(
    'n1-detector',
    $currentVersion
);
// Automatically suggests threshold increase
```
### Problem: Performance Degradation
**Solution**:
```php
// 1. Check degradation info
$degradation = $performanceMonitor->getPerformanceDegradationInfo(...);
// 2. Rollback to previous version
$previousVersion = Version::fromString('1.0.0');
$previous = $registry->get('n1-detector', $previousVersion);
$redeployed = $previous->withDeployment('production', Timestamp::now());
$registry->update($redeployed);
// 3. Schedule retraining
```
### Problem: A/B Test Inconclusive
**Solution**:
```php
// Increase sample size
$requiredSize = $abTesting->calculateRequiredSampleSize(
    confidenceLevel: 0.95,
    marginOfError: 0.03 // Reduce margin of error
);
// Continue test until sample size reached
```
## Integration with Existing ML Systems
### N+1 Detection
```php
$metadata = ModelMetadata::forN1Detector(
    version: Version::fromString('1.0.0'),
    configuration: $n1Detector->getConfiguration()
);
$registry->register($metadata);

// Track predictions
foreach ($detections as $detection) {
    $performanceMonitor->trackPrediction(
        modelName: 'n1-detector',
        version: $metadata->version,
        prediction: $detection->isN1Pattern,
        actual: $groundTruth, // If available
        confidence: $detection->confidence
    );
}
```
### WAF Behavioral Analysis
```php
$metadata = ModelMetadata::forWafBehavioral(
    version: Version::fromString('2.0.0'),
    configuration: $wafBehavioral->getConfiguration()
);
$registry->register($metadata);
```
### Queue Job Anomaly Detection
```php
$metadata = ModelMetadata::forQueueAnomaly(
    version: Version::fromString('1.2.0'),
    configuration: $queueAnomaly->getConfiguration()
);
$registry->register($metadata);
```
## Summary
The ML Model Management System provides:
- **Centralized Model Registry** with version management
- **A/B Testing** with statistical significance checks
- **Real-Time Performance Monitoring** with drift detection
- **Automatic Threshold Optimization** with grid search
- **Production-Ready** with cache-based persistence
- **Framework-Compliant** with value objects and readonly classes
- **Fully Integrated** with the scheduler and queue system
- **Scalable** with 30-day performance data retention

The system is fully integrated into the custom PHP framework and follows all framework patterns.