SousChef

souschef
docs
user-guide

performance.md•10.3 KiB

# Performance Benchmarks SousChef includes comprehensive performance benchmarks to ensure fast and efficient cookbook conversions. This guide explains the benchmarking framework, current performance characteristics, and optimisation strategies. ## Benchmark Suite SousChef uses pytest-benchmark to measure performance across key operations. Benchmarks run automatically in CI/CD and can be executed locally for performance validation. ### Running Benchmarks ```bash # Run all benchmarks with detailed output poetry run pytest tests/ -v --benchmark-only # Run benchmarks with comparison poetry run pytest tests/ --benchmark-only --benchmark-compare # Save benchmark results poetry run pytest tests/ --benchmark-only --benchmark-save=baseline # Compare against saved baseline poetry run pytest tests/ --benchmark-only --benchmark-compare=baseline # Generate histogram visualizations poetry run pytest tests/ --benchmark-only --benchmark-histogram ``` ## Current Performance Metrics All benchmarks measured on: Linux, Python 3.13, single-threaded execution. ### Parsing Operations | Operation | Mean Time | Min Time | Max Time | Throughput | |-----------|-----------|----------|----------|------------| | Recipe parsing | 174.5 µs | 126.9 µs | 302.8 µs | 5,732 ops/sec | | Attribute parsing | 150.8 µs | 107.7 µs | 384.2 µs | 6,633 ops/sec | | Template parsing | 149.9 µs | 107.2 µs | 719.8 µs | 6,672 ops/sec | | Metadata parsing | 182.3 µs | 122.2 µs | 2,025 ms | 5,486 ops/sec | | Custom resource | 144.3 µs | 96.0 µs | 565.7 µs | 7,442 ops/sec | **Key Insights:** - Consistent sub-200µs performance for standard Chef artifacts - Over 5,000 operations per second for all parsing tasks - Metadata parsing has occasional outliers (max 2ms) due to file I/O ### Conversion Operations | Operation | Mean Time | Min Time | Max Time | Throughput | |-----------|-----------|----------|----------|------------| | Basic conversion | 906 ns | 791 ns | 42.5 µs | 1,103,644 ops/sec | | Resource conversion | 3.2 µs | 1.9 µs | 1,118 ms | 312,166 ops/sec | | InSpec conversion | 170.5 µs | 119.0 µs | 4,663 ms | 5,864 ops/sec | | Playbook generation | 1.4 ms | 721.1 µs | 15.8 ms | 717 ops/sec | **Key Insights:** - Nanosecond-level performance for basic conversions - Resource conversion highly parallelizable - Playbook generation is the most expensive operation (milliseconds range) ### Structure Analysis | Operation | Mean Time | Min Time | Max Time | Throughput | |-----------|-----------|----------|----------|------------| | Cookbook structure | 1.5 ms | 926.0 µs | 20.7 ms | 672 ops/sec | | Large cookbook | 1.4 ms | 1.2 ms | 2.9 ms | 708 ops/sec | | InSpec profiles | 632.6 µs | 443.0 µs | 3,256 ms | 1,581 ops/sec | **Key Insights:** - Consistent low-millisecond performance for structure analysis - Large cookbooks (100+ resources) maintain similar performance - File I/O dominates structure analysis time ## Performance Characteristics ### Scalability SousChef scales linearly with cookbook size: ```python # Small cookbook (10 resources): ~1.4ms # Medium cookbook (50 resources): ~1.5ms # Large cookbook (100 resources): ~1.4ms # Very large cookbook (500 resources): ~7-10ms (estimated) ``` **Why it scales well:** - Parsing is single-pass with minimal backtracking - No recursive tree transformations - Lazy evaluation of optional fields - Efficient path normalization caching ### Memory Usage Typical memory footprint: - **Small cookbook** (10 recipes): ~5-10 MB - **Medium cookbook** (50 recipes): ~15-30 MB - **Large cookbook** (100+ recipes): ~50-100 MB Memory is linear with: - Number of resources parsed - Template file sizes - Attribute complexity **Memory optimization tips:** - Process cookbooks in batches for large migrations - Use streaming for template conversion - Clear cached results between cookbook conversions ### CPU Utilization SousChef is primarily CPU-bound during: 1. **Ruby parsing** (30-40% of time) 2. **YAML generation** (20-30% of time) 3. **Validation** (15-25% of time) 4. **AI API calls** (varies, network-bound) I/O-bound operations: - Reading Chef files from disk - Writing Ansible playbooks - Network requests to AI providers ## Optimization Strategies ### For Large Cookbooks ```python # Process cookbooks in parallel from concurrent.futures import ProcessPoolExecutor from souschef.assessment import assess_cookbook cookbooks = ["cookbook1", "cookbook2", "cookbook3"] with ProcessPoolExecutor(max_workers=4) as executor: results = list(executor.map(assess_cookbook, cookbooks)) ``` **Benefits:** - 3-4x speedup on multi-core systems - Reduced wall-clock time for batch conversions - Each cookbook isolated in separate process ### For Incremental Conversions ```python # Only convert changed recipes from souschef.converters.playbook import convert_recipe_to_playbook changed_recipes = get_changed_files() # Your VCS diff logic for recipe in changed_recipes: playbook = convert_recipe_to_playbook(recipe) save_playbook(playbook) ``` **Benefits:** - Only process what changed - Ideal for CI/CD pipelines - Maintains conversion consistency ### For Template-Heavy Cookbooks ```python # Pre-compile templates in batch from souschef.parsers.template import parse_template templates = find_all_templates(cookbook_path) parsed_templates = { t: parse_template(t) for t in templates } # Reuse parsed templates during conversion for recipe in recipes: playbook = convert_with_templates(recipe, parsed_templates) ``` **Benefits:** - Avoid re-parsing templates - Better cache locality - 30-50% faster for template-heavy cookbooks ### For AI-Powered Conversions ```python # Batch AI requests to reduce latency from souschef.converters.playbook import generate_playbook_with_ai # Collect all conversion requests conversion_requests = prepare_batch(recipes) # Send in batches of 10 for batch in chunks(conversion_requests, size=10): results = asyncio.run(batch_convert_with_ai(batch)) ``` **Benefits:** - Amortize network overhead - Respect API rate limits - Better error handling ## Performance Tuning ### Environment Variables ```bash # Disable AI-assisted conversion for speed export SOUSCHEF_DISABLE_AI=1 # Increase parser timeout for complex files export SOUSCHEF_PARSER_TIMEOUT=60 # Enable performance profiling export SOUSCHEF_PROFILE=1 ``` ### Configuration Options ```python # In your conversion script config = { "max_workers": 4, # Parallel workers "cache_templates": True, # Cache parsed templates "validate_output": False, # Skip validation for speed "ai_batch_size": 10, # AI request batch size } convert_cookbook(path, **config) ``` ## Profiling and Diagnostics ### Built-in Profiler ```bash # Profile a conversion poetry run souschef convert --profile cookbook/ output/ # View profiling results # Results saved to souschef_profile.prof poetry run python -m pstats souschef_profile.prof ``` ### Custom Profiling ```python import cProfile import pstats from souschef.assessment import assess_cookbook # Profile cookbook assessment profiler = cProfile.Profile() profiler.enable() result = assess_cookbook("path/to/cookbook") profiler.disable() stats = pstats.Stats(profiler) stats.sort_stats('cumulative') stats.print_stats(20) # Top 20 functions ``` ### Memory Profiling ```bash # Install memory profiler poetry add --dev memory-profiler # Profile memory usage poetry run python -m memory_profiler souschef/cli.py convert cookbook/ ``` ## Performance Regression Testing Benchmarks run automatically in CI/CD to detect performance regressions: ```yaml # .github/workflows/ci.yml - name: Run benchmarks run: | poetry run pytest --benchmark-only \ --benchmark-compare=main \ --benchmark-compare-fail=mean:10% ``` **Regression thresholds:** - Fail if mean time increases by >10% - Warn if max time increases by >25% - Alert if throughput drops by >15% ## Troubleshooting Slow Conversions ### Symptom: Parsing taking >1 second per file **Possible causes:** - Very complex Chef Ruby code - Large template files (>100KB) - Deep attribute nesting (>10 levels) **Solutions:** - Simplify Chef code before conversion - Split large templates - Flatten attribute structures ### Symptom: High memory usage (>500MB) **Possible causes:** - Processing too many cookbooks simultaneously - Not clearing caches between conversions - Very large attribute files **Solutions:** ```python import gc for cookbook in large_cookbook_list: result = convert_cookbook(cookbook) process_result(result) # Manual garbage collection gc.collect() ``` ### Symptom: Slow AI-powered conversions **Possible causes:** - Network latency to AI provider - API rate limits - Large prompts (>4000 tokens) **Solutions:** - Use regional AI endpoints - Implement request caching - Batch API calls - Reduce prompt size ## Best Practices 1. **Measure first**: Profile before optimizing 2. **Batch operations**: Process cookbooks in logical groups 3. **Cache aggressively**: Reuse parsed templates and attributes 4. **Parallel processing**: Use multiple workers for independent cookbooks 5. **Monitor memory**: Clear caches for long-running processes 6. **Set timeouts**: Prevent hanging on complex files 7. **Use benchmarks**: Compare before/after optimization ## Future Performance Improvements Planned optimizations: - **Incremental parsing**: Only re-parse changed sections - **Native parsing**: Cython extensions for hot paths - **Streaming output**: Write playbooks as they're generated - **Distributed processing**: Celery task queue for large migrations - **Smart caching**: Persistent cache across runs - **GPU acceleration**: For AI-powered conversions ## Contributing Benchmarks When adding new features, include benchmarks: ```python def test_benchmark_my_new_feature(benchmark): """Benchmark the new feature.""" result = benchmark(my_new_feature, input_data) # Assert performance requirements assert benchmark.stats['mean'] < 0.001 # <1ms mean assert benchmark.stats['max'] < 0.010 # <10ms max ``` See [tests/integration/test_integration.py](../../tests/integration/test_integration.py) for examples. ## Resources - **pytest-benchmark docs**: https://pytest-benchmark.readthedocs.io/ - **Python profiling**: https://docs.python.org/3/library/profile.html - **Performance best practices**: https://wiki.python.org/moin/PythonSpeed --- For questions about performance, open an issue: https://github.com/kpeacocke/souschef/issues

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kpeacocke/souschef'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

performance.md•10.3 KiB