A high-performance Apache module for real-time text replacement in web content using the Aho-Corasick string matching algorithm. This module enables efficient multi-pattern search and replace operations with zero-copy optimizations and variable expansion support.
- High Performance: Aho-Corasick algorithm with O(n+m+z) complexity
- Variable Expansion: Support for
${VAR}and%{VAR}environment variables - Zero-Copy Operations: Optimized in-place replacements when possible
- Smart Caching: Precompiled automatons for maximum performance
- Performance Monitoring: Detailed timing logs for optimization
- Content-Type Filtering: Safe processing of text content only
- Dynamic Configuration: Real-time rule updates without server restart
- Apache 2.4+
- APR development libraries
- CMake 3.10+
- GCC or compatible C compiler
# Install dependencies (Ubuntu/Debian)
sudo apt-get install apache2-dev libapr1-dev cmake build-essential
# Clone and build
git clone <repository>
cd mod_replace
mkdir build && cd build
cmake ..
make
# Install module
sudo cp mod_replace.so /usr/lib/apache2/modules/
sudo a2enmod replace
sudo systemctl restart apache2Add to your Apache configuration or .htaccess:
# Enable text replacement
ReplaceEnable On
# Define replacement rules
ReplaceRule "old_text" "new_text"
ReplaceRule "search_pattern" "replacement_value"
# Apply to HTML files
<FilesMatch "\.(html|htm)$">
SetOutputFilter REPLACE
</FilesMatch>Syntax: ReplaceEnable On|Off
Default: Off
Context: server config, virtual host, directory, .htaccess
Enables or disables text replacement processing.
ReplaceEnable OnSyntax: ReplaceRule <search> <replacement>
Context: server config, virtual host, directory, .htaccess
Defines a search and replacement pattern. Multiple rules can be defined.
ReplaceRule "{{PLACEHOLDER}}" "Actual Content"
ReplaceRule "old_string" "new_string"The module supports environment variable expansion in replacement values with optimized per-request evaluation:
# Using ${VAR} syntax
ReplaceRule "{{SERVER}}" "${SERVER_NAME}"
# Using %{VAR} syntax
ReplaceRule "{{USER}}" "%{REMOTE_USER}"
# Environment variables
ReplaceRule "{{VERSION}}" "${APP_VERSION}"
# CSP nonce (unique per request)
ReplaceRule "___CSP_NONCE___" "%{UNIQUE_STRING}"Performance Note (v1.2.0+): Variables are expanded dynamically via callbacks without recompiling the automaton. The pattern matching automaton is compiled once at startup, and variables are resolved only when matches are found. This results in 382x-512x faster performance compared to the previous approach that recreated the automaton per request.
# Safe configuration with content type filtering
<FilesMatch "\.(html|htm)$">
# Force correct content type
ForceType text/html
Header set Content-Type "text/html; charset=utf-8"
# Enable replacement
ReplaceEnable On
SetOutputFilter REPLACE
</FilesMatch>
# Exclude binary files
<FilesMatch "\.(jpg|png|pdf|zip)$">
RemoveOutputFilter REPLACE
</FilesMatch># Enable debug logging for performance monitoring
LogLevel replace:debug
# Directory-based configuration for better performance
<Directory "/var/www/html">
ReplaceEnable On
ReplaceRule "{{SITE_NAME}}" "My Website"
<FilesMatch "\.(html|htm|php)$">
SetOutputFilter REPLACE
</FilesMatch>
</Directory>The module uses an optimized Aho-Corasick implementation with qsort optimization for exceptional performance:
- 4-7x faster than mod_substitute for high pattern volumes
- Up to 21x faster on large files (500KB+) with qsort optimization
- Fast Path (Precompiled): 100-600μs per request (100 patterns, 10-100KB)
- Memory Efficient: Shared automaton across requests
- Scalable: O(n+m+z) complexity vs O(n×m×k) sequential approach
- Throughput: Up to 131 MB/s on typical web content
| File Size | mod_substitute | mod_replace | Speedup |
|---|---|---|---|
| 10 KB | 405 μs | 106 μs | 3.82x |
| 50 KB | 1795 μs | 362 μs | 4.96x |
| 100 KB | 4205 μs | 604 μs | 6.96x |
| 500 KB | 18712 μs | 3534 μs | 5.29x |
See detailed benchmarks: benchmark/PERFORMANCE_COMPARISON.md
Version 1.2.0 includes callback-based variable expansion that eliminates per-request automaton compilation:
- 382x-512x speedup when using variables like
%{UNIQUE_STRING} - 99.8% CPU reduction for variable-based replacements
- Automaton compiled ONCE at startup vs recreated per request
- Perfect for CSP nonce injection and dynamic content
Run the benchmark:
cd benchmark
make
./performance_benchmark --v1.2 patterns_vars_heavy.txt 1000 test_content_10kb.htmlResults: 4x faster with 27 patterns (11 with variables), 75% CPU reduction
Details: benchmark/OPTIMIZATION_IMPLEMENTATION.md
Version 1.1.0 includes a critical optimization replacing O(n²) bubble sort with O(n log n) qsort:
- 21.7x speedup on 500KB files
- 95% reduction in match sorting time
- Production-ready for files of any size
Details: benchmark/QSORT_OPTIMIZATION_EN.md
Enable debug logging to monitor performance:
LogLevel replace:debugLog output includes timing information:
mod_replace: Fast path completed - ac_time=15 μs, total_time=45 μs, input_len=1024, output_len=1056, patterns=3
mod_replace: Slow path completed - create_time=120 μs, compile_time=80 μs, replace_time=35 μs, total_time=235 μs
# Replace template variables in HTML
ReplaceEnable On
ReplaceRule "{{TITLE}}" "Welcome to My Site"
ReplaceRule "{{YEAR}}" "2024"
ReplaceRule "{{VERSION}}" "${APP_VERSION}"# Inject dynamic content
ReplaceRule "{{USER_NAME}}" "%{REMOTE_USER}"
ReplaceRule "{{SERVER_INFO}}" "${SERVER_SOFTWARE}"
ReplaceRule "{{REQUEST_TIME}}" "%{REQUEST_TIME}"# Rewrite asset URLs
ReplaceRule "/assets/" "/cdn/assets/v${ASSET_VERSION}/"
ReplaceRule "{{CDN_URL}}" "${CDN_BASE_URL}"# Configure API endpoints
ReplaceRule "{{API_BASE}}" "${API_ENDPOINT}"
ReplaceRule "{{API_KEY}}" "%{API_SECRET}"# Standard build
mkdir build && cd build
cmake ..
make
# Debug build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make
# Run tests
make test
# Memory leak testing
make valgrind-testmod_replace/
├── src/
│ ├── mod_replace.c # Main module implementation
│ └── aho_corasick.c # Aho-Corasick algorithm
├── inc/
│ └── aho_corasick.h # Algorithm header
├── test/
│ ├── test_aho_corasick.c # Algorithm tests
│ └── test_standalone.c # Standalone tests
└── CMakeLists.txt # Build configuration
The module includes comprehensive test suites:
# Run Aho-Corasick tests
./test_aho_corasick
# Run memory leak detection
make valgrind-aho-corasick
# Check build
make test# Check if module is enabled
apache2ctl -M | grep replace_module
# Enable module
sudo a2enmod replace
sudo systemctl restart apache2# Check configuration syntax
apache2ctl configtest
# Verify output filter is set
grep -r "SetOutputFilter.*REPLACE" /etc/apache2/
# Check debug logs
sudo tail -f /var/log/apache2/error.log | grep mod_replace# Enable performance logging
LogLevel replace:debug
# Monitor timing
sudo tail -f /var/log/apache2/error.log | grep "total_time"# Ensure UTF-8 encoding
Header set Content-Type "text/html; charset=utf-8"
# Force content type
ForceType text/html# Enable comprehensive logging
LogLevel replace:debug
LogLevel headers:debug
# Log file analysis
ErrorLog ${APACHE_LOG_DIR}/replace-debug.log
# Custom log format
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_with_time
CustomLog ${APACHE_LOG_DIR}/replace-access.log combined_with_time# Always filter content types
<FilesMatch "\.(html|htm)$">
SetOutputFilter REPLACE
</FilesMatch>
# Exclude sensitive files
<FilesMatch "\.(conf|log|key)$">
RemoveOutputFilter REPLACE
Require all denied
</FilesMatch>
# Validate environment variables
<If "%{ENV:UNSAFE_VAR} != ''">
ReplaceRule "{{SAFE_VAR}}" "default_value"
</If>- Never expose sensitive environment variables in replacement rules
- Validate all environment variable values
- Use appropriate access controls for configuration files
- Monitor logs for unexpected patterns
This project is licensed under the Apache License 2.0. See LICENSE file for details.
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
- Follow Apache module conventions
- Include comprehensive tests
- Update documentation
- Use memory-safe practices
- Profile performance impact
- Issues: Report bugs and feature requests via GitHub issues
- Documentation: See inline code documentation and examples
- Performance: Use debug logging for optimization guidance
- 🚀 Variable Optimization: Callback-based replacement for dynamic variable expansion
- 382x-512x speedup when using variables like
%{UNIQUE_STRING}or${REMOTE_USER} - 99.8% CPU reduction for variable-based replacements
- Automaton compiled ONCE at startup, even with variables
- No more per-request automaton recreation for variables
- Tested with 1000 simulated requests - same automaton reused for all
- See benchmark/OPTIMIZATION_IMPLEMENTATION.md for implementation details
- Critical Performance Optimization: Replaced O(n²) bubble sort with O(n log n) qsort
- 21.7x speedup on large files (500KB+)
- 95% reduction in match sorting time
- Now 4-7x faster than mod_substitute across all file sizes
- Production-ready for files of any size
- Added comprehensive benchmark suite
- See benchmark/QSORT_OPTIMIZATION_EN.md for details
- Initial release with Aho-Corasick implementation
- Variable expansion support
- Performance monitoring
- Content-type filtering
- Comprehensive test suite