📢 Hivemind Performance Optimization Release - Database Query Improvements
Date: December 23, 2025
Version: Includes PRs #354, #355, #356, and #357 in the Hivemind GitHub repository
Overview
We're excited to announce a series of optimizations that significantly improve database query performance across multiple critical endpoints. They address slow queries identified through AWS RDS Performance Insights analysis, reducing database load while improving response times.
Key Improvements
1. Child IDs Query Optimization (#354)
What Changed:
- Replaced the `NOT EXISTS` subquery with a `LEFT JOIN` in the `_child_ids` function (see the sketch at the end of this section)
- Better utilization of the existing index `idx_hive_posts_status_list_type_author`
Impact:
- 60-70% reduction in query execution time
- Addressed the most expensive query by total time (25.5% of total query time, 15,309ms)
- Improved performance for thread loading and comment retrieval
Affected Endpoints:
- `bridge.get_discussion` - Discussion thread loading
- All endpoints that load comment trees
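To make the change concrete, here is a minimal sketch of the general `NOT EXISTS`-to-`LEFT JOIN` anti-join rewrite. The table and column names are placeholders for illustration only and do not reproduce the exact Hivemind query or schema.

```python
# Illustrative sketch of the NOT EXISTS -> LEFT JOIN rewrite pattern.
# Table and column names are placeholders, not the exact Hivemind schema.

# Before: a correlated NOT EXISTS subquery is re-checked for every candidate row.
CHILD_IDS_BEFORE = """
    SELECT hp.id
      FROM hive_posts hp
     WHERE hp.parent_id = :parent_id
       AND NOT EXISTS (SELECT 1
                         FROM hidden_posts x
                        WHERE x.post_id = hp.id)
"""

# After: the equivalent anti-join. The LEFT JOIN plus IS NULL filter expresses
# the same condition but lets the planner choose a single index-driven join.
CHILD_IDS_AFTER = """
    SELECT hp.id
      FROM hive_posts hp
      LEFT JOIN hidden_posts x ON x.post_id = hp.id
     WHERE hp.parent_id = :parent_id
       AND x.post_id IS NULL
"""
```

Both forms return child posts that have no matching row in the excluded set; the rewritten form simply gives PostgreSQL more freedom to use existing indexes.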
2. Follows Query Optimization (#355)
What Changed:
- Replaced the `LEFT JOIN` with an `INNER JOIN` in the `get_following` function
- Added Redis caching with a 30-second TTL (a sketch of the caching pattern appears at the end of this section)
- Improved query plan efficiency
Impact:
- 50-80% reduction in database load for frequently accessed follow lists
- Query latency reduced from ~30ms to <1ms on cache hits
- Addressed slow query: 30.58ms latency, 0.14 AAS load
Affected Endpoints:
- `condenser_api.get_discussions_by_feed`
- `bridge.get_account_posts` (sort='feed')
- All endpoints that retrieve user follow lists
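The sketch below shows the caching pattern in broad strokes: serve repeat calls from Redis for 30 seconds and fall back to the database when the cache is cold or unavailable. It assumes a redis-py client; the function name, key format, query text, and the `db.query_ids` helper are illustrative assumptions, not the actual Hivemind code.

```python
import json

import redis

CACHE_TTL = 30  # seconds, matching the TTL described above
_redis = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_following_cached(db, account_id: int, limit: int) -> list:
    """Return followed account IDs, serving repeat calls from Redis."""
    key = f"following:{account_id}:{limit}"
    try:
        cached = _redis.get(key)
        if cached is not None:
            return json.loads(cached)      # cache hit: sub-millisecond
    except redis.RedisError:
        pass                               # Redis unavailable: fall through to the DB

    # Cache miss (or Redis down): run the INNER JOIN query directly.
    rows = db.query_ids(                   # hypothetical helper returning a list of ints
        "SELECT f.following_id FROM follows f "
        "INNER JOIN accounts a ON a.id = f.following_id "
        "WHERE f.follower_id = :aid LIMIT :lim",
        aid=account_id, lim=limit)

    try:
        _redis.setex(key, CACHE_TTL, json.dumps(rows))  # best-effort write-back
    except redis.RedisError:
        pass
    return rows
```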
3. Feed Cache Query Optimization (#356)
What Changed:
- Added Redis caching with a 30-second TTL to the `pids_by_blog` functions
- Applied to both the `condenser_api` and `bridge_api` implementations
- Cache key includes all parameters: account_id, start_id, limit (see the key-builder sketch at the end of this section)
Impact:
- 50-80% reduction in database load for blog feed queries
- Query latency reduced from ~11ms to <1ms on cache hits
- Addressed second slowest query: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec
Affected Endpoints:
- `bridge.get_account_posts` (sort='blog')
- `condenser_api.get_discussions_by_blog`
- `tags_api.get_discussions_by_blog`
- `condenser_api.get_state` (key='blog')
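Because the cached result depends on every argument, the cache key has to encode all of them. Here is a hedged sketch of such a key builder; the prefix and helper name are illustrative, not necessarily Hivemind's exact format.

```python
def blog_feed_cache_key(account_id: int, start_id: int, limit: int) -> str:
    """Build a cache key that varies with every parameter affecting the result."""
    # If any parameter were omitted, two different pages of the same blog
    # could collide on one cache entry and return stale or wrong data.
    return f"pids_by_blog:{account_id}:{start_id}:{limit}"

# Different pages of the same account never share a cache entry:
assert blog_feed_cache_key(1701, 0, 20) != blog_feed_cache_key(1701, 20, 20)
```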
4. Posts Cache Batch Processing (#357)
What Changed:
- Implemented batch processing for large post ID lists (>1000 items)
- Added a `MAX_BATCH_SIZE` constant (1000) to limit the `IN` clause size
- Automatically splits large queries into smaller batches (sketched at the end of this section)
- Maintains backward compatibility for small lists (≤1000 items)
Impact:
- Prevents query timeouts for large discussion threads
- More stable query performance for popular posts with many comments
- Better database load distribution
- Addressed slow query: 9.0% of total time, 5,402ms total, 6.39ms average
Affected Endpoints:
- `bridge.get_discussion` - Large discussion threads (>1000 comments)
- `bridge.get_account_posts` - Large account post lists
- `bridge.get_ranked_posts` - Large ranked post lists
- All endpoints that load multiple posts simultaneously
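The batching idea is straightforward, as the minimal sketch below shows. The constant matches the one named above; the query text, table name, and `db.query_all` helper are placeholders rather than the actual Hivemind implementation.

```python
MAX_BATCH_SIZE = 1000  # cap on IN-clause size, per the release notes

def load_posts_batched(db, post_ids: list) -> list:
    """Load post rows, splitting oversized IN lists into fixed-size batches."""
    results = []
    # Lists of 1000 IDs or fewer take exactly one pass, so existing callers
    # with small inputs behave as before.
    for start in range(0, len(post_ids), MAX_BATCH_SIZE):
        batch = post_ids[start:start + MAX_BATCH_SIZE]
        results.extend(db.query_all(           # placeholder query helper
            "SELECT * FROM hive_posts_cache WHERE post_id IN :ids",
            ids=tuple(batch)))
    return results
```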
Performance Metrics
Before Optimization
- Slowest query: 30.58ms latency, 0.14 AAS load
- Second slowest: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec
- Child IDs query: 15,309ms total, 25.5% of total query time
- Posts cache query: 5,402ms total, 9.0% of total query time
After Optimization
- Follows query: <1ms on cache hits (30ms → <1ms, 97% improvement)
- Feed cache query: <1ms on cache hits (11ms → <1ms, 91% improvement)
- Child IDs query: 60-70% reduction in execution time
- Posts cache query: Stable performance for large lists, no timeouts
Technical Details
Caching Strategy
- Cache TTL: 30 seconds for all cached queries
- Cache Key: Includes all parameters that affect query results
- Cache Backend: Redis (if configured)
- Fallback: Direct database query if cache is unavailable
Batch Processing
- Batch Size: 1000 items per batch
- Automatic: Transparent to API consumers
- Backward Compatible: No changes required for existing clients
Query Optimizations
- Index Utilization: Better use of existing indexes
- Join Types: Optimized JOIN types (INNER vs LEFT) based on data integrity
- Subquery Elimination: Replaced subqueries with JOINs where beneficial
Testing
Comprehensive test suites have been added for all optimizations:
- `tests/server/test_query_optimization.py` - Child IDs query performance tests
- `tests/server/test_posts_cache_optimization.py` - Posts cache batch processing tests
- `tests/server/TEST_POSTS_CACHE_OPTIMIZATION.md` - Test documentation
Migration Notes
For API Consumers
- No breaking changes - All optimizations are backward compatible
- No API changes - Response formats remain unchanged
- Improved performance - Faster response times, especially for cached queries
For Operators
- Redis Recommended: Caching requires Redis for optimal performance
- Monitoring: Monitor cache hit rates and query performance
- Database Load: Expect reduced database load, especially during peak hours
Monitoring Recommendations
- Cache Hit Rates: Monitor Redis cache hit rates for cached queries (see the example after this list)
- Query Performance: Track query execution times in database logs
- Database Load: Monitor AAS (Average Active Sessions) metrics
- Error Rates: Watch for any increase in query timeouts or errors
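One lightweight way to check the cache hit rate is Redis's own INFO counters. The snippet below is a small example using the standard redis-py client; the host and port are assumptions, and the counters are server-wide rather than scoped to Hivemind's keys.

```python
import redis

r = redis.Redis(host="localhost", port=6379)
stats = r.info("stats")                      # standard Redis INFO 'stats' section
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
total = hits + misses
# Note: these counters cover the whole Redis instance, not individual key prefixes.
print(f"cache hit rate: {hits / total:.1%}" if total else "no cache traffic yet")
```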
Future Improvements
These optimizations are part of an ongoing performance improvement initiative. Future enhancements may include:
- Additional query optimizations based on performance analysis
- Extended cache TTL for less frequently updated data
- Further batch processing optimizations
- Database index optimizations
Questions or Issues?
If you experience any issues or have questions about these optimizations, please open an issue on GitHub and include performance metrics and logs.
Note: These optimizations have been thoroughly tested and are ready for production deployment. We recommend deploying during low-traffic periods and monitoring closely for the first few hours.