Blockchain was originally conceived as a system for conducting financial transactions and ensuring their immutability. To gain much broader
acceptance outside of the financial community, the blockchain architecture needs to be able to support large documents, photos, video, and other
complex data types that may have significant data storage requirements.
At the heart of the blockchain is the formation of a block, the acceptance of that block by the decentralized
participants, and the inclusion of the newly formed block on the chain. This process is inherently slow, and incorporating large data
objects will degrade performance significantly.
Blockchain by definition is an immutable system. It was created with the goal of permanently fixing important actions or entities and
preventing them from being changed, forged, or deleted. This immutability rules out updating or deleting
data that by its very nature will change over time. Artifacts that need to support "changeability" over time are best stored "off-chain".
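One common way to combine the two worlds, sketched here under assumptions that are not part of the original text, is to keep the changeable document off-chain and append only a fingerprint of each version to the chain. The names `chain`, `off_chain`, `save_document`, and `verify_document` are hypothetical stand-ins: a Python list plays the append-only chain and a dictionary plays the off-chain store.

```python
import hashlib

# Hypothetical stand-ins, not a real blockchain API:
chain = []       # append-only log: entries are only ever added
off_chain = {}   # mutable storage: documents can be replaced freely


def save_document(doc_id: str, content: bytes) -> str:
    """Store the content off-chain and append its SHA-256 digest to the chain."""
    digest = hashlib.sha256(content).hexdigest()
    off_chain[doc_id] = content
    chain.append({"doc_id": doc_id, "sha256": digest})
    return digest


def verify_document(doc_id: str) -> bool:
    """Compare the off-chain copy with the most recent digest recorded on the chain."""
    latest = next(e for e in reversed(chain) if e["doc_id"] == doc_id)
    return hashlib.sha256(off_chain[doc_id]).hexdigest() == latest["sha256"]


save_document("contract", b"version 1")
save_document("contract", b"version 2")   # the document changes; the chain only grows
assert verify_document("contract")

off_chain["contract"] = b"tampered"       # an edit that bypasses the chain is detectable
assert not verify_document("contract")
```

The chain stays small because it holds one fixed-size digest per version, while the large and changeable data lives elsewhere.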
What are the viable options?
Although decentralized storage displays some of the same characteristics as a blockchain, it also requires us to rethink
how data is stored “on the chain.” As blockchains become flooded with transactions, they have had to seek out creative
solutions to the problem of scalability. Storing large amounts of data on the blockchain
is simply not practical. So what are the options?
1. Storing everything on the blockchain itself
2. A peer-to-peer file system, such as IPFS
3. Decentralized cloud file storage, such as Storj, Sia, Ethereum Swarm, etc.
4. Distributed databases, such as Apache Cassandra, RethinkDB, etc.
5. BigChainDB
6. Ties DB
1. Store the data on the blockchain itself
Storing everything on the blockchain would be the simplest solution, but this approach has at least two significant drawbacks.
First, blockchains have a relatively low throughput compared to other high-volume transaction
systems, and increasing their payload size simply adds to the existing performance problem. Second,
if all applications kept their data on the blockchain, the chain, which every full node must store, would grow very rapidly and would
eventually exceed the capacity of commonly available hard drives.
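A rough back-of-the-envelope calculation makes the storage drawback concrete. The block sizes and the 10-minute block interval below are illustrative assumptions, not measurements of any particular network, and `chain_growth_bytes` is a hypothetical helper.

```python
def chain_growth_bytes(block_size_bytes: int, block_interval_s: float, days: float) -> float:
    """Bytes added to the chain over `days`, assuming every block is full."""
    blocks = days * 86_400 / block_interval_s
    return blocks * block_size_bytes


GB = 10**9

# Assumption: 1 MB blocks, one block every 10 minutes
baseline = chain_growth_bytes(1_000_000, 600, 365)

# Assumption: same interval, but blocks carry 100 MB of documents and media
heavy = chain_growth_bytes(100_000_000, 600, 365)

print(f"{baseline / GB:.1f} GB per year")   # 52.6 GB per year
print(f"{heavy / GB:.1f} GB per year")      # 5256.0 GB per year
```

Under these assumptions the chain grows in direct proportion to the payload size, and every full node has to store all of it.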
2. Use a peer-to-peer file system, such as the InterPlanetary File System (IPFS)
IPFS allows users to share files located on client computers and unifies them in a global file system.
This technology borrows ideas from the BitTorrent protocol and uses a distributed hash table. Files located on a local machine
are content-addressable: a file's address is derived from a hash of its content, so it is impossible to forge content for a given address. Popular files can be downloaded very
quickly thanks to BitTorrent-style block exchange between peers. There are some drawbacks, however. Users who want to share files
need to remain online at all times, files cannot be deleted or modified once uploaded,
and files cannot be searched by content.
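The content-addressing idea can be illustrated with a toy store in which a file's address is simply the SHA-256 hash of its bytes. This is a minimal sketch, not the IPFS API: real IPFS identifiers use a different encoding and large files are split into chunks, and `ContentStore` is a hypothetical class.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address of a blob is the hash of its bytes."""

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Anyone who fetches the data can re-hash it and compare with the address
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr_v1 = store.add(b"hello world")
addr_v2 = store.add(b"hello world!")

assert addr_v1 != addr_v2                      # a modified file gets a new address
assert store.add(b"hello world") == addr_v1    # identical content maps to the same address
assert store.get(addr_v1) == b"hello world"
```

The sketch also shows why a file cannot be modified in place: changing a single byte yields a different address, so the "old" address always refers to the old content. Lookup is possible only by address, which is why there is no search by content.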
3. Decentralized cloud file storage
Decentralized cloud file storage removes some of the limitations of IPFS. From the user’s point of
view, this storage behaves just like any other cloud storage, such as Dropbox. The difference is that the content
is hosted on the computers of users who offer their hard drive space for rent, rather than in data
centers. Plenty of such projects are currently available, including Sia, Storj, and Ethereum Swarm.
Users are no longer required to stay online to share their files; they simply upload them into the cloud, where
storage is highly reliable, fast, and has enormous capacity. The downside is that only static files are served,
no content search is supported, and since the storage is built on rented hardware, it is not free.
4. Distributed Databases
When structured data must be stored and advanced query capabilities are needed, distributed
NoSQL databases provide a viable solution, as traditional transactional SQL databases are difficult to distribute fully.
MongoDB, Apache Cassandra, and RethinkDB are all examples of such databases: they are fast, scalable,
fault tolerant, and support a rich query language. The downside of these databases is that
they are not Byzantine-proof, which means that all the nodes of the cluster must fully trust each
other. Any malicious node could potentially destroy the whole database.
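The trust problem can be shown with a deliberately naive model. `TrustingCluster` is a hypothetical class, not the behavior of any specific database: it replicates whatever any member asks for without checking who is asking.

```python
class Node:
    def __init__(self, name: str):
        self.name = name
        self.data = {}


class TrustingCluster:
    """Hypothetical cluster in which every member fully trusts every other member."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def write(self, from_node: str, key: str, value) -> None:
        # The sender is never authenticated or authorized; the write is simply replicated
        for node in self.nodes:
            node.data[key] = value

    def delete_all(self, from_node: str) -> None:
        # Destructive commands are replicated just as readily as ordinary writes
        for node in self.nodes:
            node.data.clear()

    def read(self, key: str):
        return self.nodes[0].data.get(key)


cluster = TrustingCluster(["a", "b", "c"])
cluster.write("a", "balance:alice", 100)
assert cluster.read("balance:alice") == 100

cluster.delete_all("c")   # a single malicious member wipes every replica
assert cluster.read("balance:alice") is None
```

Replication protects against nodes that crash, but in this model it offers no protection against a node that misbehaves on purpose.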
5. BigChainDB
BigChainDB claims to solve the data storage and transaction speed problems associated with blockchain.
BigChainDB is built on top of RethinkDB, a NoSQL database mentioned previously. BigChainDB stores
all the blocks and transactions in the RethinkDB cluster, and every node has full write access to that cluster. The single biggest issue
with BigChainDB is that it is not Byzantine-proof, so any malicious node can destroy the
RethinkDB cluster. This makes BigChainDB a viable solution only when using a private blockchain.
6. Ties DB
TiesDB inherits the majority of its features from the underlying NoSQL database architecture and
adds Byzantine fault tolerance. These features make it suitable for use as a public database and enable
feature-rich applications based on Ethereum and other blockchains that support smart contracts. The
database is writable by any user: each record that is written is associated with the user's
public key, and all requests are signed. Once a record is written, it can be modified only by its owner.
Everyone can read all records, because the database is public. Additional permissions can be managed via a
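The write rule described in this section (any user may create a record, only its owner may modify it, everyone may read) can be sketched as follows. This is not the TiesDB API: `PublicRecordStore` is a hypothetical class, and signature verification is omitted. The `signer` argument is assumed to be a public key whose signature on the request has already been verified elsewhere.

```python
class PublicRecordStore:
    """Hypothetical public store enforcing an owner-only modification rule."""

    def __init__(self):
        self._records = {}   # key -> (owner_public_key, value)

    def write(self, signer: str, key: str, value: str) -> None:
        # `signer` is assumed to be already authenticated by a signature check
        existing = self._records.get(key)
        if existing is not None and existing[0] != signer:
            raise PermissionError("only the owner may modify this record")
        self._records[key] = (signer, value)

    def read(self, key: str) -> str:
        # Reads are open to everyone because the database is public
        return self._records[key][1]


db = PublicRecordStore()
db.write("alice-pubkey", "profile", "v1")
db.write("alice-pubkey", "profile", "v2")       # the owner may update the record

try:
    db.write("bob-pubkey", "profile", "hijacked")
except PermissionError:
    pass                                        # a different key is rejected

assert db.read("profile") == "v2"
```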
Here is a comparison chart for Ties DB, IPFS and BigChain DB