In my earlier tutorial, I demonstrated how to use the Python library hashlib to create a SHA-256 hash. Now, using Python, I am going to demonstrate the principle of blockchain mining. Again using Bitcoin as my model, I will be trying to find a nonce value that results in a hash value below a predetermined target.
We will start by simply passing successive integers through our SHA-256 hash function until we find a hash with 4 leading zeros.
I used a while loop, passing the variable “y” through my hashing function each time the loop runs. I then inspect the first four characters ([:4]) of my hash value. If the first four characters equal 0000, I exit the loop by setting the found variable to 1.
(*Note: a hash value is a string, hence the need for quotes around ‘0000’.)
In my run, it took 88445 iterations to find an acceptable hash value.
Now, using the basic example of a blockchain I gave in an earlier lesson, let’s simulate mining a block.
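A minimal sketch of the loop just described might look like this. The function name find_nonce and the starting value of 0 are assumptions made for illustration, so the iteration count you get depends on where you start counting and on how the integer is converted to a string before hashing.

```python
import hashlib

def find_nonce(prefix="0000", start=0):
    """Count upward from `start` until sha256(str(y)) begins with `prefix`."""
    y = start
    found = 0
    while found == 0:
        # The integer must become a string, then bytes, before it can be hashed.
        digest = hashlib.sha256(str(y).encode()).hexdigest()
        if digest[:len(prefix)] == prefix:
            found = 1      # exit the loop, keeping the winning value of y
        else:
            y += 1
    return y, digest

y, digest = find_nonce()
print("value:", y)
print("hash: ", digest)
```

The comparison is against the string '0000' because hexdigest() returns text, not a number.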
You’ll see, I am now combining the block number, nonce, data, and previous hash of my simulated block and passing the result through my hashing function. Just like in Bitcoin, the only value I change per iteration is the nonce. I keep passing my block through the hashing function until I find the nonce that gives me a hash below the target.
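Here is a rough sketch of that idea. The function name mine_block, the field values, and the simple string concatenation are all assumptions for this example. Real Bitcoin hashes a binary block header twice with SHA-256 and compares the result to a numeric target, so treat this as a simplified model only.

```python
import hashlib

def mine_block(block_number, data, previous_hash, prefix="0000"):
    """Try nonces 0, 1, 2, ... until the block's hash starts with `prefix`."""
    nonce = 0
    while True:
        # Only the nonce changes between iterations; the other fields stay fixed.
        block_text = str(block_number) + str(nonce) + data + previous_hash
        block_hash = hashlib.sha256(block_text.encode()).hexdigest()
        if block_hash.startswith(prefix):
            return nonce, block_hash
        nonce += 1

# Hypothetical block contents, made up for this example.
nonce, block_hash = mine_block(
    block_number=1,
    data="Ben pays Sam 5 coins",
    previous_hash="0" * 64,
)
print("nonce:", nonce)
print("hash: ", block_hash)
```

Requiring a fixed number of leading zeros is a convenient stand-in for "hash below the target": every extra required zero shrinks the set of acceptable hashes.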
Now, let’s lower the target by requiring 6 leading zeros. This should result in a longer runtime to get your hash.
To measure the run time difference, let’s add some timestamps to our code.
So, I am taking a timestamp twice. d1 will be our start time, d2 will be our end time, and I am subtracting d1 from d2 to get our elapsed time. In the example below, my elapsed time was 5 seconds.
Now, let’s bump the target down to 7 leading zeros. This brings my elapsed computing time to 20 minutes. That is a considerable commitment of resources. You can see why they call it “proof of work” now.
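The timing approach can be sketched like this. I am assuming datetime.now() for the timestamps, and timed_search is a made-up helper name. The sketch runs with 4 leading zeros so it finishes quickly. Swap in "000000" to see the longer run, keeping in mind that elapsed times vary a lot from machine to machine.

```python
import hashlib
from datetime import datetime

def timed_search(prefix):
    """Return the first integer whose hash starts with `prefix`, plus elapsed time."""
    d1 = datetime.now()                      # start time
    nonce = 0
    while not hashlib.sha256(str(nonce).encode()).hexdigest().startswith(prefix):
        nonce += 1
    d2 = datetime.now()                      # end time
    return nonce, d2 - d1                    # the difference is a timedelta

nonce, elapsed = timed_search("0000")        # try "000000" for a much longer run
print("nonce:  ", nonce)
print("elapsed:", elapsed)
```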
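To see why each extra zero hurts so much, here is a small back-of-the-envelope calculation. It assumes the hash output behaves like a uniformly random hex string, so each hex digit is zero about 1 time in 16. The helper name expected_attempts is just for illustration.

```python
def expected_attempts(leading_zeros):
    # Each hex digit has 16 possible values, so the chance that the first
    # `leading_zeros` digits are all zero is 1 in 16 ** leading_zeros.
    return 16 ** leading_zeros

for zeros in (4, 6, 7):
    print(zeros, "leading zeros -> about", expected_attempts(zeros), "attempts on average")
# 4 leading zeros -> about 65536 attempts on average
# 6 leading zeros -> about 16777216 attempts on average
# 7 leading zeros -> about 268435456 attempts on average
```

Any single run can be luckier or unluckier than the average, which is why the jump in my own timings is not an exact factor of 16.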
If you are at all like me, reading about a concept is one thing, but actually practicing it is what helps me understand it. If you have been reading my blockchain tutorial, or if you came from an outside tutorial, then you have undoubtedly read enough about cryptographic hashes.
For this example, I am using the Anaconda Python 3 distribution.
Like most things in Python, creating a hash is as simple as importing a library someone has already created for us. In this case, that library is: hashlib
So our first step is to import hashlib
Now let us take a moment to learn the syntax required to create a cryptographic hash with hashlib. In this example, I am using the SHA-256 hashing algorithm. I am using this because it is the same algorithm used by Bitcoin.
Here is the syntax used
To understand the syntax, we are calling the hashlib function sha256(): hashlib.sha256()
Inside the parentheses, we enter the string we want to hash. It must be a string here, because the next step calls a string method on it.
Still inside the parentheses, we use the method .encode() to convert the string to bytes. The sha256() function only accepts bytes, not text, so this step is required.
Finally, I added the method .hexdigest() to have the algorithm return our hash in hexadecimal format. This format will help in understanding future lessons on blockchain mining.
So in the example below, you can see that I assigned the variable x the string ‘doggy’. I then passed x to our hash function. The output can be seen below.
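A sketch of that example, broken into the individual steps and then written as a one-liner. The intermediate variable names (encoded, hash_object, hex_hash, one_liner) are mine, added only to make each step visible.

```python
import hashlib

x = 'doggy'

# Step by step:
encoded = x.encode()                    # str -> bytes (UTF-8 by default)
hash_object = hashlib.sha256(encoded)   # build the SHA-256 hash object
hex_hash = hash_object.hexdigest()      # read the result as hexadecimal text

# The same thing in one line:
one_liner = hashlib.sha256(x.encode()).hexdigest()

print(hex_hash)
print(hex_hash == one_liner)
```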
Now a hash can hold much more than just a simple word. Below, I have passed the Gettysburg Address to the hashing function.
(**Note the ''' ''' triple quotes. Those are used in Python if your string takes up more than one line.**)
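As a short illustration of the triple-quote syntax, the sketch below hashes only the opening sentence of the Gettysburg Address rather than the full text, so the hash it prints is for this excerpt only. The variable names are my own.

```python
import hashlib

# Triple quotes let one string span several lines.
# The line breaks inside the quotes are part of the string and affect the hash.
speech = '''Four score and seven years ago our fathers brought forth
on this continent, a new nation, conceived in Liberty, and dedicated
to the proposition that all men are created equal.'''

speech_hash = hashlib.sha256(speech.encode()).hexdigest()
print(len(speech), "characters in,", len(speech_hash), "hex digits out")
print(speech_hash)
```

However long the input is, the output is always the same length.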
Now I try passing a number. You will notice I get an error.
To avoid the error, I turn the integer 8 into a string with the str() function.
Below I concatenate a string and an integer.
Last, I want to show the avalanche effect of the hash function.
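Here is a sketch covering the number example, the str() fix, and the concatenation. The exact error you see depends on how the number is passed in. In this version, calling .encode() on an integer raises an AttributeError, which I catch so the script keeps running. The variable names are assumptions for illustration.

```python
import hashlib

number = 8

# An int has no .encode() method, so this attempt fails.
try:
    hashlib.sha256(number.encode()).hexdigest()
    error_message = None
except AttributeError as err:
    error_message = str(err)
print("Error:", error_message)

# Fix: convert the integer to a string first.
number_hash = hashlib.sha256(str(number).encode()).hexdigest()
print(number_hash)

# Concatenating a string and an integer works the same way.
combined = 'doggy' + str(number)
combined_hash = hashlib.sha256(combined.encode()).hexdigest()
print(combined, combined_hash)
```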
By simply changing the first letter from an uppercase T to a lowercase t, the hash changes completely. This is a requirement for cryptographic hash functions. If the hash did not change dramatically from a small change to the string, it would be much easier to reverse engineer the hash. This is known as the avalanche effect.
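You can see the avalanche effect for yourself with a sketch like this. The sample sentence and the helper name sha256_hex are placeholders I made up. Any text that differs only in the case of its first letter will do.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 hash of `text` as a hex string."""
    return hashlib.sha256(text.encode()).hexdigest()

upper = sha256_hex("The quick brown fox")
lower = sha256_hex("the quick brown fox")

# Count how many of the 64 hex positions differ between the two hashes.
differing = sum(1 for a, b in zip(upper, lower) if a != b)

print(upper)
print(lower)
print(differing, "of", len(upper), "hex digits differ")
```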
To fully understand blockchain, it helps to have a good understanding of what is known as a cryptographic hash. It is this hash that is at the very core of how blockchain works.
If you read my introduction to blockchain lesson, you would see that the Hash is part of every block. It serves as a unique identifier for the block. That is the general idea for all Hashes, whether cryptographic or not.
It would actually be more proper for me to refer to it as a Hash Function. It is a function where you pass in text, a document, a picture, anything digital, and the function returns a fixed-size identifying number that is, for practical purposes, unique.
Hash functions were used even before cryptography was an issue. You may have heard the term hash table. This is where programmers store a table keyed by hashes that point to text or documents, so a program can find an item by its hash instead of searching through the large documents themselves.
The Hash tables worked kind of like this:
You would pass some text to the Hash Function and it would transform it to a hash.
Ben -> 0123
Data Science -> 4871
Analytics is a great field of study -> 2580
These hashes were then placed in a table and when the program needed to access the information, it used the Hash to look it up – kind of like an index in the back of a book.
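To make the lookup idea concrete, here is a toy hash table. The hash function (sum of character codes modulo the table size) and the class name ToyHashTable are deliberately simple inventions for this sketch. It is not cryptographic, and it will not reproduce the sample numbers above.

```python
def toy_hash(text, table_size=10):
    """A simple, non-cryptographic hash: sum of character codes mod table size."""
    return sum(ord(ch) for ch in text) % table_size

class ToyHashTable:
    def __init__(self, size=10):
        self.size = size
        self.buckets = [[] for _ in range(size)]   # one list per slot

    def put(self, key, value):
        bucket = self.buckets[toy_hash(key, self.size)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:                # overwrite an existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))                # different keys may share a slot

    def get(self, key):
        # Jump straight to the right slot instead of scanning every entry.
        for existing_key, value in self.buckets[toy_hash(key, self.size)]:
            if existing_key == key:
                return value
        raise KeyError(key)

table = ToyHashTable()
table.put("Ben", "author")
table.put("Data Science", "topic")
print(toy_hash("Ben"), table.get("Ben"))
print(toy_hash("Data Science"), table.get("Data Science"))
```

Because two different keys can land in the same slot, each slot holds a small list. A cryptographic hash is designed to make such collisions practically impossible to find.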
Now the Hash Functions used by Blockchain are cryptographic, so they are a little different than just a simple hash table. In order for a hash to be cryptographic, it needs to follow some basic rules.
It has to be one-way. If you pass the text “Analytics” to a cryptographic hashing function, it will return a hash (let’s say 1234). It will return the same hash every time you pass the text “Analytics” to it (again: 1234). So the hash function knows what hash to create for that text. But we want to ensure that if someone has the hash 1234, they cannot reverse it to obtain the text (they can’t reverse engineer it).
Think of it like a fingerprint. If I have a person, I can always obtain a fingerprint. However, if all I have is a fingerprint, I cannot produce a person from it. A fingerprint on its own won’t let me derive information like the eye color or hair color of the individual who left the print behind.
The hash function needs to be fast. You will understand better when we get to mining, but blockchain miners are computing millions of hashes a second. With a slow hash function, the entire concept would fail.
The hash needs to ensure that similar items do not receive similar hashes. The example below shows what we do not want.
With the hashes 1234 and 1235 being so close, as well as the text being so close, it would be possible to reverse engineer a hash. For example, if you knew that Analytics4all was 1236, you might be able to backtrack through the hashes until you hit Analytics at 1234.
Instead we want something like this:
You see that similar texts do not get similar hashes. Let’s look at it through numbers.
Now, different blockchain applications use different cryptographic hash functions. For example, Ethereum uses Keccak-256. Bitcoin, however, uses SHA-256, so that is what I will focus on.
SHA-256 was designed by the NSA and, as of this writing, it has not been cracked. The algorithm is published openly, so anyone can make use of it.
A SHA-256 hash is 64 hexadecimal digits long (64 digits * 4 bits per digit = 256 bits).
Below, I pass the text Analytics to the hash generator and I get a 64-digit hash returned.
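You can check that arithmetic in Python. This small sketch uses hashlib, and the variable names are mine.

```python
import hashlib

h = hashlib.sha256("Analytics".encode())

hex_digits = len(h.hexdigest())   # number of hexadecimal characters
bits = hex_digits * 4             # each hex digit encodes 4 bits

# digest_size is the hash length in bytes, so multiply by 8 for bits.
print(hex_digits, bits, h.digest_size * 8)   # prints: 64 256 256
```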
Now when I just add the number 4 to the end of my text, the hash completely changes. The two hashes do not look anything alike. Again, this is designed to help reduce the chance of someone reverse engineering the hash.
Just to drive the point home, here I removed the s from the end of the text. Note the new hash is again completely different.
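If you would rather try this locally than on a website, a sketch like the following hashes the three texts with Python's hashlib. I am assuming the text is encoded as UTF-8 before hashing, which is Python's default for .encode(). The dictionary name hashes is my own.

```python
import hashlib

texts = ("Analytics", "Analytics4", "Analytic")

# Map each text to its SHA-256 hash.
hashes = {text: hashlib.sha256(text.encode()).hexdigest() for text in texts}

for text, digest in hashes.items():
    print(f"{text:<12}{digest}")

# Hashing the same text again always gives the same result.
print(hashes["Analytics"] == hashlib.sha256("Analytics".encode()).hexdigest())
```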
Go to the website. Try out some hashing for yourself.
In the next lesson I will cover the concept of an immutable ledger. This will take us one step closer to understanding Blockchain.