Stealth Development Blog 005: 01/11/2019
Multithreaded Techniques for Registry Synchronization | Using Leveldb for Registry Persistence
This week I spent most of my time completing the two central threads for qPoS. I first polished up the block production thread that I mostly finished last week, then I wrote the registry synchronization thread, which is now completely finished. My last accomplishment this week was to decide how to approach registry persistence, which is how the registry would be stored when the client shuts down.
-------
Registry Synchronization Requires Multithreading Techniques
Execution Latency Presents Synchronization Issues
I’ll start by explaining some changes to the synchronization thread from what I described last week. Last week I said I would have only one loop that could initiate changes to the registry, thinking that this limitation could protect the registry from inconsistencies. However, I quickly realized that latencies in code execution could also cause problems, and, although these would be infrequent, their consequences could be quite severe.
For example, I wondered what would happen if the registry was queried to validate a block, the block validated, and then, before the registry was updated based on the new block (which could only happen after the block is finalized), the registry advanced to the next slot as if the block had been missed. Since registry state changes should be considered irreversible from an efficiency standpoint, the registry state would be inconsistent with the new block, leading to undefined and undesirable results. This scenario, which might be difficult to understand without a diagram, underscores that the registry should not only be locked for state changes (write operations), but also for queries (read operations).
Simplest Solution to Execution Latency is a Mutex
Since registry queries are necessary throughout the qPoS core functionality, I coded a way to lock the registry for all state changes and queries. Once I wrote this code, I realized I must have reinvented a very simple multithreading technique. So rather than use my untested implementation, I researched multithreading and realized that what I needed (and what I had reinvented) is called a “mutex”. I then decided to use a high quality multithreading library rather than my simplistic implementation.
Simplification of the Registry Sync Loop
With this true multithreading model, the registry synchronization loop becomes simple, updating every few milliseconds. It completely forgoes checking for new blocks from the network. The registry synchronization loop is now so simple, it can be expressed as the following pseudocode
start → update registry based on time → sleep 5 ms → go to start
As a result of embracing the necessity for concurrent access to the registry, new qPoS blocks from the network are processed asynchronously in the existing message handler thread, just like PoS blocks are presently processed. Although using multithreaded techniques allow the registry synchronization logic to be quite simple, these multithreading techniques require careful implementation to (1) prevent inconsistent changes of the registry state and (2) ensure that managing the multithreading locks is as efficient as possible.
-------
Updating the Block Index for qPoS
Part of block production is assigning the block reward, and since the block reward is based on the money supply, I had to update a data structure called the block index (which tracks the money supply) so that I could complete implementation of the block production loop. A block index holds header information about a given block, as well as lots of derived information (like the money supply) that speeds up validation of later blocks. QPoS requires a few changes to blocks, which of course impact the structure of block indices. First, qPoS obviates coinbase transactions, which themselves hold some key block information, most notably the block height. Because block height is important for block validation, qPoS makes block height an explicit field of the block data structure. Second, qPoS adds the staker ID to the block data structure. Block height and staker ID are part of the information required to uniquely identify an unsigned block, so these fields must also be contents of the block header. This requirement means not only that these fields need to be part of the block index, but that the block index needs to be restructured to accommodate the new header fields.
-------
Using Leveldb for Registry Persistence
Registry Persistence should Be Synchronous with State Changes
Since I have completed the block production and registry synchronization threads and have applied multithreading techniques to protect the registry, I decided to tackle registry persistence, which entails (1) ensuring the registry is synchronized when loading block data from disk and (2) ensuring that the registry state information will survive client restarts, even when the client crashes.
It is important to realize that the state of the registry depends on the cumulative history of the blockchain, so reconstructing the registry means replaying the block chain. To minimize or even prevent time consuming replays, registry persistence should be updated every block, and it should be robust to crashes.
To satisfy all these requirements, I decided I will use Leveldb, a filesystem-based key-value database, instead of a RAM-only data structure. To see why a database is the best option, I present a few hypothetical alternatives. Obviously, a RAM-only alternative that never gets written to persistent storage will not work at all because the state of the registry will be lost when the client stops, requiring a full blockchain replay. One solution is to write the registry state to persistent storage when the client gracefully quits, but this solution is dangerous because, at the very least, clients will need to replay the blockchain after a crash. A somewhat better alternative is to periodically store a snapshot of the registry state, but in this case, one still needs an efficient persistence solution to store the snapshots. One small downside of snapshots is that one must write the entire registry at one shot, rather than have incremental updates. Another small downside of snapshots is that any blocks since the newest viable snapshot would need to be replayed. It is clear by examining this set of options, that as persistence becomes more synchronous with real-time registry state changes, the potential problems related to persistent storage become less severe.
In light of this observation, I decided I might as well go whole-hog and make the persistence solution a high performance database like leveldb. This choice means that updates to this database would be fully synchronous with registry state changes. This system is very robust because, if leveldb crashes during a write operation, it can recover everything up to that write operation, and even give assurances of data integrity.
Leveldb Facilitates Assurances of State Consistency
One consideration of this approach is that the all state changes resulting from the acceptance of a given block should be taken in total or not taken at all. For example, imagine that, along with several other changes, a block resulted in a staker owner earning 0.3 XST and its delegate earning 0.2 XST. In this case, it would be very problematic if the client crashed after adding 0.3 XST to the owner balance but before adding 0.2 XST to the delegate balance. These state changes should be executed in such a way that the client reverses this incomplete set of state changes. Databases like leveldb have facilities to address these types of issues because they arise for practically all database systems.
Leveldb Can Easily Handle the Required Throughput and Speed
Compared to Leveldb capabilities, updates to the registry are (1) small and (2) only executed every few seconds (which is a long time for a computer). The largest data structure for atomic storage operations will be the bit field that keeps track of recently missed blocks. This bit field is 512 bytes (0.5 kb). Disk I/O for the registry is expected to average about 1 kb/s, with about 4-5 operations per second. Leveldb is highly optimized and efficient, robust, and already used in XST, Bitcoin, and numerous other cryptocurrencies for transactional storage. Even with 60 stakers, which is expected to be the upper end of active stakers, the entire registry would only require about 0.15 MB of storage, meaning Leveldb can easily handle synchronous persistence of the staker registry.
Using Leveldb means that the registry is technically going to be a filesystem database, and not RAM-only as I have described it in the whitepaper and elsewhere. However, because Leveldb can fully cache read operations in RAM, Leveldb can be about as fast as if the registry were a RAM-only data. That is, the Leveldb cache can easily be much larger than the registry itself, so that only write operations are expected to require filesystem I/O, and Leveldb even optimizes these write operations to reduce file system resource usage.
-------
Standardizing 64 Bit Integers
Finally, one other notable improvement this week is that I standardized the 64 bit integer types throughout the entire code base. Although transparent to the end-user, these changes should result in better cross-platform reliability and more readable, streamlined, and robust code.
–––––
Hondo
To listen to the audio version of this article click on the play image.
Brought to you by @tts. If you find it useful please consider upvoting this reply.