📢 Hivemind Performance Optimization Release - Database Query Improvements

ety001 (73)in #steemit • 2 months ago

Performance Optimization Release - Database Query Improvements

Date: December 23, 2025
Version: Includes PRs #354, #355, #356, #357 in Hivemind Github Repo

Overview

We're excited to announce a series of performance optimizations that significantly improve database query performance across multiple critical endpoints. These optimizations address slow query issues identified through AWS RDS Performance Insights analysis and reduce database load while improving response times.

Key Improvements

1. Child IDs Query Optimization (#354)

What Changed:

Replaced NOT EXISTS subquery with LEFT JOIN in _child_ids function
Better utilization of existing index idx_hive_posts_status_list_type_author

Impact:

60-70% reduction in query execution time
Addressed the slowest query (25.5% of total time, 15,309ms total)
Improved performance for thread loading and comment retrieval

Affected Endpoints:

bridge.get_discussion - Discussion thread loading
All endpoints that load comment trees

2. Follows Query Optimization (#355)

What Changed:

Replaced LEFT JOIN with INNER JOIN in get_following function
Added Redis caching with 30-second TTL
Improved query plan efficiency

Impact:

50-80% reduction in database load for frequently accessed follow lists
Query latency reduced from ~30ms to <1ms on cache hits
Addressed slow query: 30.58ms latency, 0.14 AAS load

Affected Endpoints:

condenser_api.get_discussions_by_feed
bridge.get_account_posts (sort='feed')
All endpoints that retrieve user follow lists

3. Feed Cache Query Optimization (#356)

What Changed:

Added Redis caching with 30-second TTL to pids_by_blog functions
Applied to both condenser_api and bridge_api implementations
Cache key includes all parameters: account_id, start_id, limit

Impact:

50-80% reduction in database load for blog feed queries
Query latency reduced from ~11ms to <1ms on cache hits
Addressed second slowest query: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec

Affected Endpoints:

bridge.get_account_posts (sort='blog')
condenser_api.get_discussions_by_blog
tags_api.get_discussions_by_blog
condenser_api.get_state (key='blog')

4. Posts Cache Batch Processing (#357)

What Changed:

Implemented batch processing for large post ID lists (>1000 items)
Added MAX_BATCH_SIZE constant (1000) to limit IN clause size
Automatically splits large queries into smaller batches
Maintains backward compatibility for small lists (≤1000 items)

Impact:

Prevents query timeouts for large discussion threads
More stable query performance for popular posts with many comments
Better database load distribution
Addressed slow query: 9.0% of total time, 5,402ms total, 6.39ms average

Affected Endpoints:

bridge.get_discussion - Large discussion threads (>1000 comments)
bridge.get_account_posts - Large account post lists
bridge.get_ranked_posts - Large ranked post lists
All endpoints that load multiple posts simultaneously

Performance Metrics

Before Optimization

Slowest query: 30.58ms latency, 0.14 AAS load
Second slowest: 11.41ms latency, 0.10 AAS load, 8.59 calls/sec
Child IDs query: 15,309ms total, 25.5% of total query time
Posts cache query: 5,402ms total, 9.0% of total query time

After Optimization

Follows query: <1ms on cache hits (30ms → <1ms, 97% improvement)
Feed cache query: <1ms on cache hits (11ms → <1ms, 91% improvement)
Child IDs query: 60-70% reduction in execution time
Posts cache query: Stable performance for large lists, no timeouts

Technical Details

Caching Strategy

Cache TTL: 30 seconds for all cached queries
Cache Key: Includes all parameters that affect query results
Cache Backend: Redis (if configured)
Fallback: Direct database query if cache is unavailable

Batch Processing

Batch Size: 1000 items per batch
Automatic: Transparent to API consumers
Backward Compatible: No changes required for existing clients

Query Optimizations

Index Utilization: Better use of existing indexes
Join Types: Optimized JOIN types (INNER vs LEFT) based on data integrity
Subquery Elimination: Replaced subqueries with JOINs where beneficial

Testing

Comprehensive test suites have been added for all optimizations:

tests/server/test_query_optimization.py - Child IDs query performance tests
tests/server/test_posts_cache_optimization.py - Posts cache batch processing tests
tests/server/TEST_POSTS_CACHE_OPTIMIZATION.md - Test documentation

Migration Notes

For API Consumers

No breaking changes - All optimizations are backward compatible
No API changes - Response formats remain unchanged
Improved performance - Faster response times, especially for cached queries

For Operators

Redis Recommended: Caching requires Redis for optimal performance
Monitoring: Monitor cache hit rates and query performance
Database Load: Expect reduced database load, especially during peak hours

Monitoring Recommendations

Cache Hit Rates: Monitor Redis cache hit rates for cached queries
Query Performance: Track query execution times in database logs
Database Load: Monitor AAS (Average Active Sessions) metrics
Error Rates: Watch for any increase in query timeouts or errors

Future Improvements

These optimizations are part of an ongoing performance improvement initiative. Future enhancements may include:

Additional query optimizations based on performance analysis
Extended cache TTL for less frequently updated data
Further batch processing optimizations
Database index optimizations

Questions or Issues?

If you experience any issues or have questions about these optimizations, please open an issue on GitHub with performance metrics and logs

Note: These optimizations have been thoroughly tested and are ready for production deployment. We recommend deploying during low-traffic periods and monitoring closely for the first few hours.

#steem #witness #dev

2 months ago in #steemit by ety001 (73)

$11.49

Sort:

Trending

[-]

jswit (74) 2 months ago

Upvoted! Thank you for supporting witness @jswit.

To turn off auto-reply, write a reply to this comment with "@jswit reply-off"
Delegate SP to jsup & receive daily upvote
Search and find Steemit posts

$0.00

[-]

punicwax (-14)(1) 2 months ago

Great post! Featured in the hot section by @punicwax.

$0.00

STEEM 0.05

TRX 0.28

JST 0.045

BTC 64645.58

ETH 1858.41

USDT 1.00

SBD 0.38

📢 Hivemind Performance Optimization Release - Database Query Improvements

Performance Optimization Release - Database Query Improvements

Overview

Key Improvements

1. Child IDs Query Optimization (#354)

2. Follows Query Optimization (#355)

3. Feed Cache Query Optimization (#356)

4. Posts Cache Batch Processing (#357)

Performance Metrics

Before Optimization

After Optimization

Technical Details

Caching Strategy

Batch Processing

Query Optimizations

Testing

Migration Notes

For API Consumers

For Operators

Monitoring Recommendations

Future Improvements

Questions or Issues?

Coin Marketplace