Unless you’ve been living under a rock, you have undoubtedly heard the term blockchain being batted around. Most likely you’ve heard of it in relation to cryptocurrencies like Bitcoin, but blockchain is quickly moving into many other areas. Much like the Big Data craze, though, my personal experience has been that the more you hear someone utter the term blockchain, the less that person actually knows about it. It is the Dunning-Kruger effect in action.
In this lesson, I am going to introduce you to the concept of blockchain. I have boiled it down to its simplest concepts, and I will be speaking very broadly about the subject. In future lessons I will dive deeper into the more technical aspects of blockchain, providing much more in the way of specifics.
For those already familiar with blockchain, I am aware that I am glossing over some rather important concepts here, but my goal in this lesson is to provide a simple, easily understood tutorial. The goal of my website is to provide an accessible education into many of the complex concepts surrounding analytics and data science to everyone, regardless of past experience or education. I promise a deeper dive in the future, but for now, let us start simply.
What is a blockchain?
At the most basic level, a blockchain is a collection of data kept in a list.
What makes these lists so interesting is that:
- They are connected using cryptography
- They are distributed amongst multiple computers, providing a redundant method of protection
To understand how they work, let us start by looking at a block.
Starting from the top:
Block #: This is just the number of the block in the chain. The first block is 1, the second is 2, and so on.
Nonce: This stands for “number used only once”. I am going to cover the Nonce in depth in the lesson on Mining, but for right now, just be aware that it is a number and every block needs one.
Data: This is where the data is stored. In Bitcoin, this is often filled with transaction information.
Previous Hash: The hash of the block before this one
Hash: I will cover hashes in depth in a future lesson too. For now, just know that this is the cryptographic part of blockchain. A hash is a code that identifies the block. An easy way to conceptualize it is to think of it like a VIN on a car. The VIN (or vehicle identification number) can be used to look up the make, model, color, and many other characteristics of a car. In the same way, the hash is calculated from everything found in the block, so it identifies that exact block and its contents (I know I am way oversimplifying here, but hey, we have to start somewhere, and I promise a future lesson on hashes).
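To make the fields above a little more concrete, here is a minimal Python sketch of a single block. The `Block` class, its field names, the `|` separator, and the choice of SHA-256 are all my own assumptions for illustration, not the layout any real blockchain uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    number: int          # position in the chain, starting at 1
    nonce: int           # just a placeholder number in this sketch
    data: str            # e.g. transaction information
    previous_hash: str   # hash of the block before this one

    def compute_hash(self) -> str:
        # Join the four fields into one string and hash it with SHA-256
        # (an assumed scheme, chosen only to keep the example short).
        payload = f"{self.number}|{self.nonce}|{self.data}|{self.previous_hash}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The first block has no predecessor, so this sketch uses a string of zeros.
block1 = Block(number=1, nonce=0, data="Alice pays Bob 5", previous_hash="0" * 64)
print(block1.compute_hash())  # a 64-character hexadecimal string
```

The same four field values always produce the same hash, which is what lets the hash act like a fingerprint for the block.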
Now let’s add a second block to our chain.
Notice the previous hash in block 2 is the same as the hash in block 1. This, you will soon see, is part of what makes blockchain so secure.
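Here is a rough Python sketch of that link between two blocks. The helper names `block_hash` and `make_block`, the dictionary layout, and the use of SHA-256 are assumptions I made for the example.

```python
import hashlib


def block_hash(number: int, nonce: int, data: str, previous_hash: str) -> str:
    # Hash the four fields together (assumed scheme for this sketch).
    payload = f"{number}|{nonce}|{data}|{previous_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(number: int, nonce: int, data: str, previous_hash: str) -> dict:
    # A block is the four fields plus the hash calculated from them.
    return {
        "number": number,
        "nonce": nonce,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(number, nonce, data, previous_hash),
    }


# The first block has nothing before it, so we use a string of zeros.
block1 = make_block(1, 0, "Alice pays Bob 5", "0" * 64)
# The second block stores the hash of the first as its Previous Hash.
block2 = make_block(2, 0, "Bob pays Carol 2", block1["hash"])

print(block2["previous_hash"] == block1["hash"])  # True
```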
In the picture below, you can see that if someone tried to go into Block 1 and make a change, the Hash for Block 1 would change. Any change to the first four fields in a block will cause the Hash to change, since the block is no longer the same. The Hash is like a VIN or a fingerprint: it can only represent a single individual block. Once you change any aspect of a block, it is no longer the same individual block.
So as you can see, if the Hash in the first block changes, it will no longer match the Previous Hash in the second block. When this happens the blockchain is broken. So if a hacker tried to alter a transaction in block 1, the chain would break.
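The break can be sketched in Python as well. This is only an illustration under my own assumptions: the helpers `block_hash`, `build_chain`, and `first_broken_link` are made up for this example, and SHA-256 stands in for whatever hashing a real blockchain uses.

```python
import hashlib


def block_hash(block: dict) -> str:
    """Hash the first four fields of a block with SHA-256."""
    payload = "|".join(
        str(block[key]) for key in ("number", "nonce", "data", "previous_hash")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(data_items: list) -> list:
    """Build a chain where each block stores the hash of the one before it."""
    chain = []
    previous_hash = "0" * 64  # assumed placeholder for the first block
    for number, data in enumerate(data_items, start=1):
        block = {"number": number, "nonce": 0, "data": data,
                 "previous_hash": previous_hash}
        block["hash"] = block_hash(block)
        chain.append(block)
        previous_hash = block["hash"]
    return chain


def first_broken_link(chain: list):
    """Return the number of the first block whose Previous Hash no longer
    matches the recomputed hash of the block before it, or None if intact."""
    for earlier, later in zip(chain, chain[1:]):
        if later["previous_hash"] != block_hash(earlier):
            return later["number"]
    return None


chain = build_chain(["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"])
print(first_broken_link(chain))  # None

# A hacker alters the transaction in block 1.
chain[0]["data"] = "Alice pays Bob 500"
print(first_broken_link(chain))  # 2
```

The check recalculates each block’s hash from its fields rather than trusting the stored value, so the altered block 1 no longer matches the Previous Hash stored in block 2.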
Okay, so what is to prevent the hacker from just changing the second block as well? Aside from hashing issues that we will discuss in future lessons, the other deterrent is found in peer-to-peer sharing of blockchains.
In most real-world applications of blockchain, the chain will not reside on only one computer. Instead, it will be replicated across multiple computers.
Now suppose a hacker tries to change the third block on one computer.
The third block’s hash will change, breaking its connection to the fourth block.
Even more importantly, the chain will no longer look like the one on the second computer. In real life, the chain will be spread across thousands of computers. So to determine which blockchain is correct, the computers compare their copies and accept the version that the majority of them hold.
As seen below, 3 of the 4 computers show 4 blue squares, while only one shows a yellow square. So based on the vote of the majority, the final result will be 4 blue squares.
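The majority vote from the picture can be sketched in a few lines of Python. The function name `majority_chain` and the idea of representing each block as a color string are my own assumptions to mirror the blue and yellow squares. This is a plain head count to illustrate the idea, not how any particular blockchain network reaches agreement.

```python
from collections import Counter


def majority_chain(copies: list) -> list:
    """Return the version of the chain held by more than half of the computers."""
    # Lists cannot be counted directly, so each copy is turned into a tuple.
    counts = Counter(tuple(copy) for copy in copies)
    chain, votes = counts.most_common(1)[0]
    if votes * 2 <= len(copies):
        raise ValueError("no version is held by a majority")
    return list(chain)


honest = ["blue", "blue", "blue", "blue"]
tampered = ["blue", "blue", "yellow", "blue"]  # third block was changed

# Four computers: three hold the honest chain, one holds the altered chain.
copies = [honest, honest, tampered, honest]
print(majority_chain(copies))  # ['blue', 'blue', 'blue', 'blue']
```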
So there you have it, blockchain in its simplest form. In future lessons I’ll dive deeper into the different concepts to show how it works from the inside out.