📢 Hivemind Performance Optimization Release - Database Query Improvements

in #steemit6 hours ago

Performance Optimization Release - Database Query Improvements

Date: December 23, 2025
Version: Includes PRs #354, #355, #356, #357 in Hivemind Github Repo

Overview

We're excited to announce a series of performance optimizations that significantly improve database query performance across multiple critical endpoints. These optimizations address slow query issues identified through AWS RDS Performance Insights analysis and reduce database load while improving response times.

Key Improvements

1. Child IDs Query Optimization (#354)

What Changed:

  • Replaced NOT EXISTS subquery with LEFT JOIN in _child_ids function
  • Better utilization of existing index idx_hive_posts_status_list_type_author

Impact:

  • 60-70% reduction in query execution time
  • Addressed the slowest query (25.5% of total time, 15,309ms total)
  • Improved performance for thread loading and comment retrieval

Affected Endpoints:

  • bridge.get_discussion - Discussion thread loading
  • All endpoints that load comment trees

2. Follows Query Optimization (#355)

What Changed:

  • Replaced LEFT JOIN with INNER JOIN in get_following function
  • Added Redis caching with 30-second TTL
  • Improved query plan efficiency

Impact:

  • 50-80% reduction in database load for frequently accessed follow lists
  • Query latency reduced from ~30ms to <1ms on cache hits
  • Addressed slow query: 30.58ms latency, 0.14 AAS load

Affected Endpoints:

  • condenser_api.get_discussions_by_feed
  • bridge.get_account_posts (sort='feed')
  • All endpoints that retrieve user follow lists

3. Feed Cache Query Optimization (#356)

What Changed:

  • Added Redis caching with 30-second TTL to pids_by_blog functions
  • Applied to both condenser_api and bridge_api implementations
  • Cache key includes all parameters: account_id, start_id, limit

Impact:

  • 50-80% reduction in database load for blog feed queries
  • Query latency reduced from ~11ms to <1ms on cache hits
  • Addressed second slowest query: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec

Affected Endpoints:

  • bridge.get_account_posts (sort='blog')
  • condenser_api.get_discussions_by_blog
  • tags_api.get_discussions_by_blog
  • condenser_api.get_state (key='blog')

4. Posts Cache Batch Processing (#357)

What Changed:

  • Implemented batch processing for large post ID lists (>1000 items)
  • Added MAX_BATCH_SIZE constant (1000) to limit IN clause size
  • Automatically splits large queries into smaller batches
  • Maintains backward compatibility for small lists (≤1000 items)

Impact:

  • Prevents query timeouts for large discussion threads
  • More stable query performance for popular posts with many comments
  • Better database load distribution
  • Addressed slow query: 9.0% of total time, 5,402ms total, 6.39ms average

Affected Endpoints:

  • bridge.get_discussion - Large discussion threads (>1000 comments)
  • bridge.get_account_posts - Large account post lists
  • bridge.get_ranked_posts - Large ranked post lists
  • All endpoints that load multiple posts simultaneously

Performance Metrics

Before Optimization

  • Slowest query: 30.58ms latency, 0.14 AAS load
  • Second slowest: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec
  • Child IDs query: 15,309ms total, 25.5% of total query time
  • Posts cache query: 5,402ms total, 9.0% of total query time

After Optimization

  • Follows query: <1ms on cache hits (30ms → <1ms, 97% improvement)
  • Feed cache query: <1ms on cache hits (11ms → <1ms, 91% improvement)
  • Child IDs query: 60-70% reduction in execution time
  • Posts cache query: Stable performance for large lists, no timeouts

Technical Details

Caching Strategy

  • Cache TTL: 30 seconds for all cached queries
  • Cache Key: Includes all parameters that affect query results
  • Cache Backend: Redis (if configured)
  • Fallback: Direct database query if cache is unavailable

Batch Processing

  • Batch Size: 1000 items per batch
  • Automatic: Transparent to API consumers
  • Backward Compatible: No changes required for existing clients

Query Optimizations

  • Index Utilization: Better use of existing indexes
  • Join Types: Optimized JOIN types (INNER vs LEFT) based on data integrity
  • Subquery Elimination: Replaced subqueries with JOINs where beneficial

Testing

Comprehensive test suites have been added for all optimizations:

  • tests/server/test_query_optimization.py - Child IDs query performance tests
  • tests/server/test_posts_cache_optimization.py - Posts cache batch processing tests
  • tests/server/TEST_POSTS_CACHE_OPTIMIZATION.md - Test documentation

Migration Notes

For API Consumers

  • No breaking changes - All optimizations are backward compatible
  • No API changes - Response formats remain unchanged
  • Improved performance - Faster response times, especially for cached queries

For Operators

  • Redis Recommended: Caching requires Redis for optimal performance
  • Monitoring: Monitor cache hit rates and query performance
  • Database Load: Expect reduced database load, especially during peak hours

Monitoring Recommendations

  1. Cache Hit Rates: Monitor Redis cache hit rates for cached queries
  2. Query Performance: Track query execution times in database logs
  3. Database Load: Monitor AAS (Average Active Sessions) metrics
  4. Error Rates: Watch for any increase in query timeouts or errors

Future Improvements

These optimizations are part of an ongoing performance improvement initiative. Future enhancements may include:

  • Additional query optimizations based on performance analysis
  • Extended cache TTL for less frequently updated data
  • Further batch processing optimizations
  • Database index optimizations

Questions or Issues?

If you experience any issues or have questions about these optimizations, please open an issue on GitHub with performance metrics and logs


Note: These optimizations have been thoroughly tested and are ready for production deployment. We recommend deploying during low-traffic periods and monitoring closely for the first few hours.

Sort:  

Upvoted! Thank you for supporting witness @jswit.

Great post! Featured in the hot section by @punicwax.