First published on VOICE in January 2021.
While we can’t yet dive inside the super-secret Block.one GitHub repositories to learn more about the long-awaited financial product and other secret developments, we can view the ongoing public improvement of eosio and envisage future capabilities.
For those who have yet to read part #1 in this series, BlockVault (or Nodeos Clustering) is a new feature of eosio 2.1RC which enables two or more nodes to form a single logical block producing cluster. BlockVault aims to provide a high degree of resilience and reliability for public node operators while delivering transaction finalisation times of less than half a second for Enterprise private chains.
Following the official release of eosio 2.1RC, the Cryptowriter team reached out to Block.one with questions raised by the community. Bart Wyatt, VP of Blockchain Engineering, agreed to answer our questions and provided valuable insight into BlockVault and its potential use.
A diagram was sent to Wyatt for reference and to help clarify the architecture and deployment options.
Please note that draft versions of the diagram omitted a critical part of the eosio software, which Wyatt pointed out:
“There is a central component best called the ‘Chain Controller’ that takes the I/O from the HTTP and transaction plugins, talks to the RAM/RocksDB state and also talks to the BlockVault plugin. It is the central ‘brain’ of the application (not BlockVault plugin as this diagram may suggest).”
The problem was rectified before publication.
The release documents explain the intent of BlockVault. Still, more information helps in understanding the options available to businesses that are considering building and deploying applications on eosio.
We put the following technical questions to Wyatt.
Cryptowriter: The eosio 2.1RC article mentions “the primary node (of BlockVault)” and how a correctly configured system can keep producing with “minimal service disruption.” Is a single node writing the chain? Is the failover to an alternate producer automatic, or would you need a load balancer (LB) and health checks to switch it out? The readme file notes both redundant and high-availability modes.
Wyatt: For this release, BlockVault does not offer a direct HA solution. Nodeos has runtime interfaces for pausing and unpausing block production on a single instance, and BlockVault is designed so that competing instances in a cluster do not break protocol rules.
The easiest stable deployment would be as you describe: some external appliance doing health checks and “coordinating” which nodeos instance is the “primary”. It is worth noting that even if this coordinator is flawed or fails itself, BlockVault is designed to keep the data pristine to avoid logical BP mistakes.
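The external coordinator Wyatt describes could be sketched roughly as follows. This is a minimal illustration, not Block.one's implementation. It assumes each nodeos instance exposes the chain API and the producer API plugin over HTTP (the /v1/chain/get_info, /v1/producer/pause and /v1/producer/resume paths); the function names and any node URLs passed in are hypothetical.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def http_call(base_url: str, path: str, timeout: float = 2.0):
    """POST an empty JSON body to a nodeos HTTP endpoint and decode the reply."""
    req = urllib.request.Request(
        base_url + path,
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode() or "null")


def nodeos_is_healthy(base_url: str) -> bool:
    """Assumed health rule: the node answers get_info. Real checks may be stricter."""
    try:
        http_call(base_url, "/v1/chain/get_info")
        return True
    except (OSError, ValueError):
        return False


def nodeos_pause(base_url: str) -> None:
    http_call(base_url, "/v1/producer/pause")


def nodeos_resume(base_url: str) -> None:
    http_call(base_url, "/v1/producer/resume")


def choose_primary(nodes: List[str],
                   is_healthy: Callable[[str], bool],
                   current: Optional[str]) -> Optional[str]:
    """Keep the current primary while it is healthy, else take the first healthy node."""
    if current in nodes and is_healthy(current):
        return current
    for node in nodes:
        if is_healthy(node):
            return node
    return None


def reconcile(nodes: List[str],
              is_healthy: Callable[[str], bool],
              pause: Callable[[str], None],
              resume: Callable[[str], None],
              current: Optional[str]) -> Optional[str]:
    """One coordinator pass: pause every non-primary node, then resume the primary."""
    primary = choose_primary(nodes, is_healthy, current)
    for node in nodes:
        if node != primary:
            try:
                pause(node)
            except OSError:
                pass  # an unreachable node cannot be paused; move on
    if primary is not None:
        resume(primary)
    return primary
```

A scheduler would call reconcile(nodes, nodeos_is_healthy, nodeos_pause, nodeos_resume, current) every second or so, feeding the returned primary back in as current. Pausing before resuming is a design choice to reduce the window in which two instances are both told to produce; per Wyatt's answer, BlockVault is designed to keep the data pristine even if the coordinator gets this wrong.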
Cryptowriter: The release notes state “Two or more nodes may be deployed as a single logical producer,” whereas the article says “Three or more nodes may be deployed as a single logical producer.” Which is correct? What is the maximum number of nodes in a cluster?
Wyatt: In a highly automated environment, it is entirely possible to run an HA block producer with a single nodeos node at any time. Immediately upon detection of a problem, that node could be stopped, and a new one created, knowing only how to connect to BlockVault.
Most deployments will have at least two nodeos nodes active so that when the one told to produce a block fails the standby is already synced and connected to BlockVault. If the failure mode is easily detected (such as a process termination), this can be done fast enough to maintain a continuous stream of blocks from the logical producer.
Cryptowriter: Is it fair to assume all nodes in a BlockVault cluster can service read requests as per standard eosio functionality?
Wyatt: Yes, they are full-featured nodes, much like any other nodeos instance. They will likely have an ingress of blocks/transactions from the P2P network and likewise, they can service API requests and offer transactions to their internal-peers or to the P2P networks when sharing the responsibility of maintaining a chain with external BPs.
Cryptowriter: The article mentions “immediate finality with tools to mitigate the risk of a single point of failure.” What are these tools? Is this a new requirement for BPs and Enterprise?
Wyatt: BlockVault is this primary tool. There is no additional requirement for BPs. In scenarios where there is no threat of Byzantine behaviour and “Crash-Fault-Tolerance” can be used, a single logical BP network is now a viable solution. Such a network would achieve finality between 0.5 and 1.0 seconds. Prior to BlockVault, this was an extraordinarily risky and fragile deployment scenario in production. With BlockVault it is much safer.
Cryptowriter: Based on my reading of the docs, am I correct in stating BlockVault does not store chain state data? Data remains in the preferred backing store? Chainbase or RocksDB?
Wyatt: BlockVault does not store chain state data; this is true. It does store enough of the block data to reconstruct a replica of your BP. This separation of responsibilities allows BlockVault to utilise more industry-standard components because nodeos is still responsible for all aspects that make blockchain special (and would overwhelm classic databases).
Cryptowriter: Half-second finality for Enterprise is big news. Many Enterprises distribute their workloads across data centres or regions. Are there still plans to upgrade consensus to deliver faster times over the network, working in combination with BlockVault?
Wyatt: Faster finality will always be a topic on the front of our minds; there are certainly enterprise applications where it will always be valuable to shave time off global consensus. Nothing to announce today but always something we are working on.
Wyatt: This was certainly part of the early discussions around BlockVault. We chose PostgreSQL because of its ubiquity and the presence of a managed cluster offering on all major public cloud vendors.
Following the Q&A, community members raised further questions or noted specific issues with the technology. We reached out to Block.one for additional comment, but many people take a break to recharge at this time of year, so we were unable to reconnect with Wyatt. We’ll follow up in early 2021 and report back any answers.
What does it mean for EOS?
Block Producers have begun testing eosio 2.1RC to check stability, reliability, search for bugs and look for other problems. Several have been discovered and subsequently fixed by Block.one.
We’re running in-house tests and have begun upgrading our testnet nodes to the latest release candidate. Earlier today, Block.one dropped 2.1RC2 with some notable fixes, so we’ll push the update over the next few days.
As for timing, the upgrade to 2.1 is an individual block producer decision; however, based on history, it’s likely to take months before all nodes are running the new software.
Many of the benefits of BlockVault (and eosio 2.1) will be invisible to network users. The network will become more reliable, resilient and robust with little input from the EOS community.
Thanks to Bart Wyatt and Ross Dold for their contributions to the EOSIO Developments Series. Stay tuned to Cryptowriter for future updates.