Blockchain Bloating

As a blockchain grows in adoption, the number of transactions grows with it, which means a huge amount of data is transmitted and immutably stored on the chain. Blocks fill up as more transactions are added to them, and the chain as a whole keeps getting longer. This is referred to as ‘bloating’.

On researching this issue further I came across some great insights on Cointelegraph and DevTeam.Space. I will share a summary of the thinking here.

Bloating can have different consequences for different networks, but typically it results in lengthy synchronization times within the network. For crypto transactions it may also push transaction fees up.

Bloat raises problems such as how to permanently store an ever-growing number of blocks on-chain (potentially trillions or more), and how to transmit or download the chain within a reasonable timeframe.
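To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The block size, block interval, chain size and download speed are illustrative assumptions of mine, not figures for any particular network.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def yearly_growth_bytes(block_size_bytes: int, block_interval_s: int) -> int:
    """Bytes added to the chain per year if every block is full."""
    blocks_per_year = SECONDS_PER_YEAR // block_interval_s
    return blocks_per_year * block_size_bytes


def download_hours(chain_size_bytes: int, bytes_per_second: int) -> float:
    """Hours needed just to download the chain (verification time is ignored)."""
    return chain_size_bytes / bytes_per_second / 3600


# Assumed values: 1 MB blocks, one block every 10 minutes.
growth = yearly_growth_bytes(1_000_000, 600)
print(f"Growth per year: {growth / 1e9:.2f} GB")

# Assumed values: a 500 GB chain downloaded at 5 MB/s.
hours = download_hours(500_000_000_000, 5_000_000)
print(f"Download time: {hours:.1f} hours")
```

Even with these modest numbers a new node spends more than a day downloading before it can take part, and that is before it has verified a single block.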

As use of blockchain grows, these challenges will be considerable and cannot be ignored; blockchain networks may even become unstable. Logically, we might think the solution is simple… just increase the block size.

The basic thinking is that the bigger the block size, the more transactions can fit into a block, and therefore the higher the number of transactions per second. Although this is true, a bigger block also takes more bandwidth to relay, more disk space to store and more computing power to verify.
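The trade-off can be put into numbers with a small sketch. The average transaction size (250 bytes) and block interval (600 seconds) are assumptions chosen purely for illustration.

```python
def transactions_per_second(block_size_bytes: int,
                            avg_tx_bytes: int,
                            block_interval_s: int) -> float:
    """Upper bound on throughput when every block is completely full."""
    tx_per_block = block_size_bytes // avg_tx_bytes
    return tx_per_block / block_interval_s


# Throughput scales linearly with block size, but so does the amount of
# data every node has to download, store and verify for each block.
for size_mb in (1, 2, 8, 32):
    size_bytes = size_mb * 1_000_000
    tps = transactions_per_second(size_bytes, avg_tx_bytes=250, block_interval_s=600)
    print(f"{size_mb:>2} MB block: {tps:6.1f} tx/s, {size_bytes:,} bytes per block per node")
```

Doubling the block size doubles the throughput, but it also doubles the work for every single node, which is where the problems described next come from.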

If block sizes were increased indefinitely, specialized, high-powered computer equipment would be needed to supply the processing power required to act as a node. The increased cost of this equipment would mean the pool of nodes would necessarily become smaller and more centralized, increasing the risk of a 51% attack.

Increasing the block size might also require a hard fork, which risks splitting the network community: if not everyone upgrades to the new blockchain, two separate chains will exist. Either way, a bigger block only buys time while usage keeps growing, so increasing the block size is only a short-term solution. One solution I want to discuss in this blog is ‘sharding’.

Sharding

Sharding allows you to download the blockchain much faster, reduces the cycle time of operations and lowers the disk space needed.

So how does sharding work?

With sharding, the idea is to move from a linear execution model, in which every node has to compute every operation, to a parallel execution model, in which nodes are assigned to process only certain computations. This will allow for multiple, parallel transaction processing at the same time.
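The difference between the two models can be sketched in a few lines of Python. This is a toy illustration only: the transactions are simple (account, amount) deposits that I assume have already been sorted into per-shard queues, cross-shard transactions are ignored, and a thread pool stands in for what would really be separate groups of nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def process_shard(transactions):
    """Apply one shard's transactions in order and return its balances."""
    balances = defaultdict(int)
    for account, amount in transactions:
        balances[account] += amount
    return dict(balances)


def process_all(transactions_by_shard):
    """Process every shard's queue concurrently; returns {shard_id: balances}."""
    shard_ids = list(transactions_by_shard)
    queues = [transactions_by_shard[shard_id] for shard_id in shard_ids]
    with ThreadPoolExecutor(max_workers=max(1, len(shard_ids))) as pool:
        results = list(pool.map(process_shard, queues))
    return dict(zip(shard_ids, results))


# Hypothetical per-shard queues: no worker ever sees another shard's data.
queues = {
    0: [("alice", 5), ("alice", -2)],
    1: [("bob", 10)],
    2: [("carol", 1), ("dave", 4)],
}
print(process_all(queues))
```

In the linear model a single loop would walk through all five transactions; here each worker only touches its own queue, and the queues can be worked on at the same time.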

The blockchain will be divided into separate shards (subdomains, or “buckets”). Nodes will only have to run the part of the ledger that they are assigned in order to execute processes and validate transactions, instead of maintaining the whole ledger all of the time. The image below from Cointelegraph simply demonstrates how sharding works…

When implementing sharding in a blockchain, architects identify nodes and segregate parts of the blockchain database to create shards. Nodes in a particular shard only maintain that part of the database, and not the entire blockchain.
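As a minimal sketch of that idea, the example below assigns each account to a shard by hashing its name, and gives each node exactly one shard to maintain. The number of shards, the hash-based assignment rule and the Node class are my own assumptions for illustration; real protocols use their own assignment schemes.

```python
import hashlib

NUM_SHARDS = 4  # assumed number of shards


def shard_for(account: str) -> int:
    """Map an account to a shard by hashing its name (illustrative rule)."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class Node:
    """A hypothetical node that maintains the state of one shard only."""

    def __init__(self, name: str, shard_id: int):
        self.name = name
        self.shard_id = shard_id
        self.state = {}

    def apply(self, account: str, amount: int) -> bool:
        if shard_for(account) != self.shard_id:
            return False  # belongs to another shard, so this node skips it
        self.state[account] = self.state.get(account, 0) + amount
        return True


# Eight nodes spread evenly over four shards (two nodes per shard).
nodes = [Node(f"node-{i}", i % NUM_SHARDS) for i in range(8)]
for account, amount in [("alice", 5), ("bob", 10), ("carol", 1)]:
    for node in nodes:
        node.apply(account, amount)

for node in nodes:
    print(node.name, "shard", node.shard_id, node.state)
```

Each account ends up stored on only the two nodes responsible for its shard, rather than on all eight.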

Since each node no longer loads the entire blockchain, the transaction speed improves.

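The transaction validation process no longer involves all nodes. Proof of Work (PoW) is a poor fit for sharding, because splitting the network's hash power across shards makes each individual shard much cheaper to attack, so sharded designs typically rely on Proof of Stake (PoS) instead. Each shard has its own set of transaction validators.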
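One way to picture per-shard validator sets is a seeded shuffle that deals validators out to shards like cards. This is a simplified sketch: the function name and seed are hypothetical, stake weighting is left out, and Python's random module is not cryptographically secure, so a real protocol would need a randomness source that validators cannot easily manipulate.

```python
import random


def assign_validators(validators, num_shards, seed):
    """Deal validators into per-shard committees using a seeded shuffle."""
    shuffled = sorted(validators)  # canonical order, so every node agrees
    random.Random(seed).shuffle(shuffled)
    committees = {shard_id: [] for shard_id in range(num_shards)}
    for index, validator in enumerate(shuffled):
        committees[index % num_shards].append(validator)
    return committees


validators = [f"validator-{i}" for i in range(10)]
for shard_id, committee in assign_validators(validators, 3, seed=2024).items():
    print(shard_id, committee)
```

Because the shuffle depends only on the validator list and the seed, every node that knows both arrives at the same committees, and changing the seed reshuffles them.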

Interestingly, many operations in a blockchain use only a relatively small part of the database rather than all of it, so where this is the case sharding at the protocol level could be a good solution.

Sharding is not the only solution to blockchain bloat, but the other solutions could be for another article.