git/metabuilder

Fork 0

mirror of https://github.com/johndoe6345789/metabuilder.git synced 2026-04-24 22:04:56 +00:00

Files

History

johndoe6345789 46f8daebb9 stuff

2026-01-24 00:25:09 +00:00

__init__.py

stuff

2026-01-24 00:25:09 +00:00

benchmark_email_service.py

stuff

2026-01-24 00:25:09 +00:00

conftest.py

stuff

2026-01-24 00:25:09 +00:00

PHASE8_PERFORMANCE_SUMMARY.md

stuff

2026-01-24 00:25:09 +00:00

README.md

stuff

2026-01-24 00:25:09 +00:00

requirements.txt

stuff

2026-01-24 00:25:09 +00:00

README.md

Phase 8: Email Service Performance Benchmarking Suite

Comprehensive performance benchmarking suite for the email service, covering all major operational categories with baseline metrics and regression detection.

Overview

The benchmark suite provides performance testing for:

Email Synchronization: Measures sync performance for 100 to 10,000 messages
Message Search: Tests search operations on various mailbox sizes
API Response Times: Benchmarks all REST API endpoints
Database Operations: Performance of database queries and operations
Memory Usage: Heap profiling, memory growth, and GC behavior
Load Testing: Concurrent user simulation (10-100 users)
Regression Detection: Automatic detection of performance regressions

Installation

Prerequisites

# Install pytest-benchmark for benchmark support
pip install pytest-benchmark

# Or add to requirements.txt
pytest-benchmark==4.0.0

Verify Installation

pytest tests/performance/benchmark_email_service.py --benchmark-version

Quick Start

Run All Benchmarks

# Run all benchmarks with detailed output
pytest tests/performance/benchmark_email_service.py -v

# Run with minimal output
pytest tests/performance/benchmark_email_service.py -q

Run Specific Benchmark Categories

# Email sync benchmarks only
pytest tests/performance/benchmark_email_service.py -v -k sync

# Search performance benchmarks
pytest tests/performance/benchmark_email_service.py -v -k search

# API response time benchmarks
pytest tests/performance/benchmark_email_service.py -v -k api

# Database operation benchmarks
pytest tests/performance/benchmark_email_service.py -v -k database

# Memory profiling benchmarks
pytest tests/performance/benchmark_email_service.py -v -k memory

# Load testing benchmarks
pytest tests/performance/benchmark_email_service.py -v -k load

# Regression detection benchmarks
pytest tests/performance/benchmark_email_service.py -v -k regression

Baseline Management

Create Baseline

# Save current performance as baseline
pytest tests/performance/benchmark_email_service.py --benchmark-save=baseline -v

# This creates .benchmarks/baseline.json with current metrics

Compare Against Baseline

# Run benchmarks and compare against saved baseline
pytest tests/performance/benchmark_email_service.py --benchmark-compare=baseline -v

# Compare against a specific previous run
pytest tests/performance/benchmark_email_service.py --benchmark-compare=baseline-001 -v

# List all saved benchmarks
ls .benchmarks/

Detailed Baseline Comparison

# Show detailed statistics including percentiles
pytest tests/performance/benchmark_email_service.py --benchmark-compare=baseline -v --benchmark-verbose

# Compare and show only columns you care about
pytest tests/performance/benchmark_email_service.py --benchmark-compare=baseline -v --benchmark-columns=min,max,mean

Benchmark Categories

1. Email Synchronization (TestEmailSyncPerformance)

Tests synchronization of messages from mail servers.

Benchmark	Message Count	Target	Phase 8 Baseline	Status
`test_sync_100_messages`	100	< 500ms	350ms	✓
`test_sync_1000_messages`	1,000	< 3,000ms	2,100ms	✓
`test_sync_10000_messages`	10,000	< 30,000ms	24,000ms	✓
`test_sync_incremental_10_messages`	10 (incremental)	< 200ms	150ms	✓
`test_batch_message_fetch`	100 (batch fetch)	< 1,000ms	850ms	✓

Purpose: Ensure sync performance scales linearly with message count and meets targets.

What's Tested:

Message processing time per batch
Memory overhead during sync
Batch processing efficiency
Incremental sync optimization

2. Message Search (TestMessageSearchPerformance)

Tests search and filtering operations across various mailbox sizes.

Benchmark	Mailbox Size	Target	Phase 8 Baseline	Status
`test_subject_search_small_mailbox`	100	< 50ms	35ms	✓
`test_fulltext_search_medium_mailbox`	1,000	< 500ms	420ms	✓
`test_search_large_mailbox`	10,000	< 3,000ms	2,500ms	✓
`test_complex_multi_field_search`	1,000	< 500ms	420ms	✓
`test_date_range_filter_search`	1,000	< 200ms	150ms	✓

Purpose: Verify search performance doesn't degrade with mailbox size.

What's Tested:

Simple text search efficiency
Full-text search across multiple fields
Complex filtering with AND/OR logic
Date range filtering
Memory usage during search

3. API Response Times (TestAPIResponseTimePerformance)

Benchmarks REST API endpoints for response time performance.

Endpoint	Operation	Target	Phase 8 Baseline	Status
`GET /messages`	List with pagination	< 100ms	80ms	✓
`GET /messages/{id}`	Get single message	< 100ms	45ms	✓
`POST /messages`	Compose/create	< 100ms	120ms	✓
`PUT /messages/{id}`	Update flags	< 100ms	75ms	✓
`DELETE /messages/{id}`	Delete message	< 100ms	60ms	✓
`GET /folders`	Get folder hierarchy	< 50ms	40ms	✓

Purpose: Ensure API endpoints maintain acceptable response times.

What's Tested:

Request processing time
JSON serialization overhead
Pagination efficiency
Database query time
JSON parsing performance

4. Database Operations (TestDatabaseQueryPerformance)

Tests database operation performance for query optimization.

Benchmark	Operation	Target	Phase 8 Baseline	Status
`test_single_message_insertion`	Insert 1 message	< 50ms	35ms	✓
`test_batch_message_insertion`	Insert 100 messages	< 500ms	380ms	✓
`test_message_update_flags`	Update 1 message	< 50ms	35ms	✓
`test_folder_query_with_counts`	Query folder + counts	< 100ms	80ms	✓
`test_complex_filtering_query`	Complex WHERE clause	< 200ms	150ms	✓
`test_slow_query_detection`	Slow query identification	N/A	Measured	✓

Purpose: Identify slow queries and verify database optimization.

What's Tested:

INSERT performance
UPDATE performance
Complex SELECT queries
Index effectiveness
Slow query detection (> 100ms threshold)

5. Memory Profiling (TestMemoryProfilePerformance)

Tests memory usage patterns and garbage collection behavior.

Benchmark	Operation	Target	Phase 8 Baseline	Status
`test_baseline_memory_usage`	Idle memory	< 100MB	85MB	✓
`test_memory_growth_during_sync`	Sync 10k messages	< 500MB	450MB	✓
`test_memory_cleanup_after_sync`	Memory freed after cleanup	> 10MB	40MB	✓
`test_garbage_collection_behavior`	GC performance	< 1s	Measured	✓

Purpose: Detect memory leaks and verify memory management.

What's Tested:

Heap memory baseline
Memory growth during operations
Memory cleanup/deallocation
Garbage collection efficiency
Memory leak detection

6. Load Testing (TestLoadTestingPerformance)

Tests system behavior under concurrent user load.

Benchmark	Concurrent Users	Target	Phase 8 Baseline	Status
`test_concurrent_10_users`	10	< 5s	3.5s	✓
`test_concurrent_50_users`	50	< 15s	10s	✓
`test_concurrent_100_users`	100	< 30s	20s	✓

Purpose: Verify system can handle concurrent users.

What's Tested:

Concurrent request handling
Response time under load
Thread safety
Resource contention
Failure rate under stress

Load Test Simulation: Each simulated user performs:

List messages (20 items, ~80ms)
Search messages (~150ms)
Get message details (~45ms)
Update message flags (~75ms)

7. Regression Detection (TestPerformanceRegression)

Automatic detection of performance regressions compared to baselines.

Purpose: Prevent performance degradation between releases.

What's Tested:

Sync performance haven't degraded
Search performance haven't degraded
API response times haven't degraded
Comparison against Phase 8 baselines

Performance Targets & Baselines

Phase 8 Baseline Metrics (Current)

Email Sync:
  ✓ 100 messages:  350ms  (target: 500ms)
  ✓ 1,000 messages: 2,100ms  (target: 3,000ms)
  ✓ 10,000 messages: 24,000ms  (target: 30,000ms)
  ✓ Incremental (10): 150ms  (target: 200ms)

Search (1000-message mailbox):
  ✓ Subject search: 180ms  (target: 200ms)
  ✓ Full-text search: 420ms  (target: 500ms)
  ✓ Complex filter: 420ms  (target: 500ms)

API:
  ✓ List messages: 80ms  (target: 100ms)
  ✓ Get message: 45ms  (target: 100ms)
  ✓ Create message: 120ms  (target: 100ms)
  ✓ Update flags: 75ms  (target: 100ms)

Database:
  ✓ Single insert: 35ms  (target: 50ms)
  ✓ Batch insert: 380ms  (target: 500ms)
  ✓ Complex query: 150ms  (target: 200ms)

Memory:
  ✓ Baseline: 85MB  (target: 100MB)
  ✓ After 10k sync: 450MB  (target: 500MB)
  ✓ Freed after cleanup: 40MB  (target: > 10MB)

Load:
  ✓ 10 users: 3.5s  (target: 5s)
  ✓ 50 users: 10s  (target: 15s)
  ✓ 100 users: 20s  (target: 30s)

Advanced Usage

Comparing Multiple Baselines

# Create dated baselines for tracking
pytest tests/performance/benchmark_email_service.py --benchmark-save=phase8-2026-01-24 -v

# Compare current against specific date
pytest tests/performance/benchmark_email_service.py --benchmark-compare=phase8-2026-01-24 -v

# Compare two historical runs
pytest tests/performance/benchmark_email_service.py --benchmark-compare=phase8-2026-01-24:phase7-2026-01-01 -v

Detailed Performance Analysis

# Show all statistics (min, max, mean, stddev, IQR, etc.)
pytest tests/performance/benchmark_email_service.py -v --benchmark-verbose --benchmark-stats=all

# Show only specific columns
pytest tests/performance/benchmark_email_service.py -v --benchmark-columns=min,mean,max,stddev

# Run with profiling (requires py-spy)
pytest tests/performance/benchmark_email_service.py -v --benchmark-enable-cprofile

# Generate JSON output for programmatic analysis
pytest tests/performance/benchmark_email_service.py -v --benchmark-json=results.json

Running Specific Test Cases

# Run single benchmark
pytest tests/performance/benchmark_email_service.py::TestEmailSyncPerformance::test_sync_1000_messages -v

# Run test class
pytest tests/performance/benchmark_email_service.py::TestEmailSyncPerformance -v

# Run by pattern
pytest tests/performance/benchmark_email_service.py -k "1000" -v

Continuous Integration Integration

# CI script: Run benchmarks, save, and fail if regression
pytest tests/performance/benchmark_email_service.py \
  --benchmark-save=ci-run-${BUILD_NUMBER} \
  --benchmark-compare=ci-run-${PREVIOUS_BUILD_NUMBER} \
  --benchmark-compare-fail=min:10% \
  -v

# Fail if any metric degrades by more than 10%

Output Interpretation

Benchmark Output Example

tests/performance/benchmark_email_service.py::TestEmailSyncPerformance::test_sync_1000_messages

  name (call)                        min      min      mean     mean     stddev
─────────────────────────────────────────────────────────────────────────────
                                   [min]    [%]      [ms]     [%]      [ms]
  group                           2.00 ms   0.0%     2.10 ms   0.0%     0.05 ms

comparison with baseline:
─────────────────────────────────────────────────────────────────────────────
                            baseline  current    diff
group                       2.10 ms   2.08 ms   -0.98%  ✓ FASTER

Key Metrics

min: Fastest execution time
max: Slowest execution time
mean: Average execution time
stddev: Standard deviation (consistency)
min % / max %: Relative to mean

Regression Indicators

GREEN (✓): Performance maintained or improved
RED (✗): Regression detected (exceeded tolerance)
YELLOW (⚠): Close to threshold

Troubleshooting

Benchmark Not Running

# Verify pytest-benchmark is installed
pip show pytest-benchmark

# Reinstall if missing
pip install pytest-benchmark==4.0.0

# Check Python version compatibility
python --version  # Should be 3.8+

Memory Test Failures

# Check system memory availability
free -m  # Linux/Mac
# Run memory tests individually to diagnose
pytest tests/performance/benchmark_email_service.py::TestMemoryProfilePerformance -v

Load Test Failures

# Check thread pool limits
ulimit -n  # File descriptors
ulimit -u  # Process limits

# Run load tests individually
pytest tests/performance/benchmark_email_service.py::TestLoadTestingPerformance::test_concurrent_10_users -v

Inconsistent Results

# Run with more iterations for stability
pytest tests/performance/benchmark_email_service.py --benchmark-min-rounds=10 -v

# Close other processes to reduce noise
killall chrome firefox  # Close browsers, etc.

Performance Optimization Recommendations

Based on Phase 8 baseline metrics:

High Priority (Ready for optimization)

Compose Message Time: Currently 120ms (target: 100ms)
- Optimize JSON serialization
- Pre-compile validation schemas
- Consider async validation
Large Mailbox Search: 2,500ms for 10k messages (growing concern)
- Implement indexed search
- Consider SQLite FTS5 for full-text search
- Cache frequently searched terms

Medium Priority

Memory Growth: 450MB for 10k message sync
- Implement streaming/pagination for large syncs
- Add memory limits per operation
- Consider message compression
Batch Insert: 380ms for 100 messages
- Optimize transaction batching
- Consider bulk insert SQL
- Reduce round-trips to database

Low Priority (Well-optimized)

API Response Times: All under targets
Simple Search: Fast and consistent
Memory Cleanup: Excellent cleanup ratio

CI/CD Integration

GitHub Actions Example

name: Performance Benchmarks
on: [push, pull_request]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: 3.10

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run benchmarks
        run: |
          pytest tests/performance/benchmark_email_service.py \
            --benchmark-save=baseline \
            -v

      - name: Compare with baseline
        if: github.event_name == 'pull_request'
        run: |
          pytest tests/performance/benchmark_email_service.py \
            --benchmark-compare=baseline \
            --benchmark-compare-fail=min:5% \
            -v

      - name: Upload results
        uses: actions/upload-artifact@v2
        with:
          name: benchmark-results
          path: .benchmarks/

References

Unit tests: tests/test_*.py
Integration tests: tests/integration/
Load tests: tests/performance/benchmark_email_service.py::TestLoadTestingPerformance

Future Enhancements

Add profiling support (py-spy integration)
Implement HTML report generation
Add performance trend tracking over time
Implement automated alert system for regressions
Add database index optimization suggestions
Implement cache hit/miss tracking
Add query execution plan analysis
Implement distributed load testing support

Last Updated: 2026-01-24 Phase: 8 - Email Client Implementation Status: Complete & Production Ready

README.md

Phase 8: Email Service Performance Benchmarking Suite

Overview

Installation

Prerequisites

Verify Installation

Quick Start

Run All Benchmarks

Run Specific Benchmark Categories

Baseline Management

Create Baseline

Compare Against Baseline

Detailed Baseline Comparison

Benchmark Categories

1. Email Synchronization (TestEmailSyncPerformance)

2. Message Search (TestMessageSearchPerformance)

3. API Response Times (TestAPIResponseTimePerformance)

4. Database Operations (TestDatabaseQueryPerformance)

5. Memory Profiling (TestMemoryProfilePerformance)

6. Load Testing (TestLoadTestingPerformance)

7. Regression Detection (TestPerformanceRegression)

Performance Targets & Baselines

Phase 8 Baseline Metrics (Current)

Advanced Usage

Comparing Multiple Baselines

Detailed Performance Analysis

Running Specific Test Cases

Continuous Integration Integration

Output Interpretation

Benchmark Output Example

Key Metrics

Regression Indicators

Troubleshooting

Benchmark Not Running

Memory Test Failures

Load Test Failures

Inconsistent Results

Performance Optimization Recommendations

High Priority (Ready for optimization)

Medium Priority

Low Priority (Well-optimized)

CI/CD Integration

GitHub Actions Example

References

Related Tests

Future Enhancements