Bitcoin Basics: Blockchain, hashing & mining….oh my!
Perhaps you are new to bitcoin, or perhaps you have been down the rabbit hole for a while but never really understood exactly what goes on behind the scenes, on the blockchain. I hope to unveil the mystery behind blockchain, hashing, bitcoin mining and distributed ledgers all in one simple article. Strap yourself in, we have got a lot to get through.
The bitcoin network uses the SHA256 algorithm to conduct hashing functions. Confused already? Don’t worry, this can be very complex, and way above my pay-grade, but I’ll explain the basics as best I can.
Think of hashing as creating a digital fingerprint of a set of data. Data goes into the algorithm (the input), the algorithm works its magic and spits out a series of hexadecimal characters, the digits 0 to 9 and the letters a to f (the output). What is really cool about hashing is that it doesn’t matter how much data you run through the algorithm, the output is always the same length: 64 characters (256 bits) for the SHA256 algorithm.
Figure 1 shows the SHA256 hash function (we will refer to this as hashing from here) of the name “Daz”. “Daz” is the input and 3445748836aa04e53fd5de81c41c4a370f0cf52004659abf87920abc0da1bbaf is the output. Simple.
Now I’ll change the capital “D” to a little “d” (Figure 2) to demonstrate that one small change to the input results in a completely new output. Input = daz, output = ae5e9de1ed5510933a86705cb253b3cbd0b0891e70217c7a64603869aeaac093. As you can see from our new output, when compared to the original, the characters are completely different. There is no discernible pattern between the two.
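If you would like to try this yourself, here is a minimal Python sketch using the standard library's `hashlib`. It assumes the text is encoded as UTF-8 before hashing; the tool behind the figures may treat its input differently, so take the digests it prints as illustrative rather than a reproduction of Figures 1 and 2. The helper name `sha256_hex` is my own.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA256 digest of `text` as 64 hexadecimal characters."""
    # Assumption: the input is encoded as UTF-8 before hashing.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


upper = sha256_hex("Daz")
lower = sha256_hex("daz")

# Count the positions at which the two digests happen to share a character.
matching = sum(a == b for a, b in zip(upper, lower))

print(upper)
print(lower)
print(f"{matching} of {len(upper)} positions match")
```

A handful of positions will usually match by pure chance (each position has a 1 in 16 chance of agreeing), but there is no pattern to exploit.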
As mentioned, the resultant hash is 64 characters long. This is always the case, no matter how much data we place into the input. In Figure 3, I paste an entire Wikipedia article on Fender Stratocasters into the input, and the output remains 64 characters long. This would remain true if you were to put the entire contents of the internet through the algorithm: 64 characters every time.
A feature of the SHA256 hash function is that if we were to re-hash the exact same set of data, we would get the same hash of that data each and every time. This feature allows us to form the basis of the next piece of our puzzle. Blocks.
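Both properties, fixed output length and repeatability, are easy to check. The sketch below is a rough illustration only: the 2 MB of repeated text is a stand-in I made up in place of the Wikipedia article, and `sha256_hex` is a helper name of my own.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"Daz"
long_input = b"Fender Stratocaster " * 100_000  # 2,000,000 bytes of stand-in text

for label, payload in [("short", short_input), ("long", long_input)]:
    digest = sha256_hex(payload)
    print(label, len(payload), "bytes ->", len(digest), "hex characters")

# Hashing the same bytes twice gives the same digest.
first = sha256_hex(long_input)
second = sha256_hex(long_input)
print("repeatable:", first == second)
```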
Blocks and Mining
We can use the SHA256 hash function to start building blocks of data. For each block we take the input (the set of data), add some distinguishing features such as a block number, and run the whole lot through the hash function.
In figure 4, we can see we have some data fields such as block #, a nonce (we will talk about this soon), and at the moment our data field is blank. The Hash of these inclusive fields of data is: 0000f727854b50bb95c054b39c1fe5c92e5ebcfa4bcb5dc279f56aa96a365e5a
You will notice a distinguishing feature of this hash: the 4 leading characters are all 0’s. This is no accident.
Now let us add some data to this block. I will add the phrase “Hello World”. Once I add this data, you will notice that the background has turned red (Figure 5); the program is not happy with our block now. Recall from Figure 4 that the hash of that set of data started with 4 leading 0’s? The program I am running requires that the hash must always start with 4 leading 0’s (I will explain why later).
When I changed the data by adding “Hello World” the SHA256 algorithm provided a new hash, and as we can see, my new hash does not start with 4 leading 0’s.
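The check the program is making can be sketched in a few lines. This is a sketch under assumptions: the article does not say how the demo combines the block number, nonce and data before hashing, so here I simply join them as text in that order, which means the digests printed need not match Figures 4 and 5. The names `block_hash` and `is_valid` are mine.

```python
import hashlib

DIFFICULTY_PREFIX = "0000"  # the rule from the article: four leading zeros


def block_hash(number: int, nonce: int, data: str) -> str:
    """Hash the block's fields together into one 64-character digest."""
    # Assumption: the fields are joined as plain text in this order.
    payload = f"{number}{nonce}{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_valid(digest: str) -> bool:
    """A block is 'happy' only if its hash starts with the required zeros."""
    return digest.startswith(DIFFICULTY_PREFIX)


before = block_hash(1, 72608, "")
after = block_hash(1, 72608, "Hello World")

print(before, is_valid(before))
print(after, is_valid(after))
```

The point to take away is that the nonce stayed the same while the data changed, so the old nonce no longer has any special relationship to the new hash.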
Because the output of the hash function is effectively random, there is almost certainly some value that, once included with my other data, would produce a hash beginning with 0000. But what could that value be?
This is where the nonce comes into play. If we break down my data-set within the block, we have the block number (1), the data (Hello World) and this field called the nonce (currently 72608). What if there was a value I could put in the nonce field that, once added to the rest of the data, would result in a hash that satisfied my program’s requirement that the hash start with the leading characters 0000?
I can try and arbitrarily guess what that data may be. I can change the nonce to 1? Or 2? or 2456395697?…….. I might be here a while.
Or, we can use the power of the computer. We can use the computer’s ability to quickly process information to start guessing what value we could include in the “Nonce” field that, once run through the hash function, will result in 4 leading 0’s.
If I hit “Mine” in my program, my PC will start guessing values for the nonce. There is no known shortcut for computing this value; all we can do is guess and check whether the guess is correct.
Once a solution is found, the program checks the result and the block turns green again. It’s happy. This process can take some time depending on how difficult it is to find a nonce that works. This, in simple terms, is known as mining. We are “digging” through combinations of data to find a solution to a mathematical problem. We are trying to find the value of the nonce that, when coupled with our data, produces a hash meeting a predetermined requirement (in our case, 4 leading 0’s). Figure 6 shows what this looks like for our example; this took around 3 seconds to complete on my desktop computer. The nonce was the number 24894.
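The guess-and-check loop behind a “Mine” button can be sketched as follows. As before, the way the fields are joined is my assumption rather than the demo's documented behaviour, so the nonce this finds need not be the one shown in Figure 6. Starting at 0 and counting upward is just one simple search order.

```python
import hashlib
from itertools import count


def block_hash(number: int, nonce: int, data: str) -> str:
    # Assumption: the fields are joined as plain text in this order.
    payload = f"{number}{nonce}{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine(number: int, data: str, prefix: str = "0000") -> tuple:
    """Try nonces 0, 1, 2, ... until the block hash starts with `prefix`."""
    for nonce in count():
        digest = block_hash(number, nonce, data)
        if digest.startswith(prefix):
            return nonce, digest


nonce, digest = mine(1, "Hello World")
print(f"nonce={nonce} hash={digest}")
```

Each hexadecimal character has 16 possible values, so a search for four leading zeros needs about 16 × 16 × 16 × 16 = 65,536 guesses on average, and every extra zero demanded multiplies the expected work by 16. Checking someone else's answer, on the other hand, takes a single hash.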
Congratulations! We have mined our first block.
A blockchain is…….wait for it…….a chain of blocks. It’s a chain of blocks just like the first block we mined above, but with some differences.
A set of data is entered into block 1, and we mine that block to discover the nonce that satisfies the difficulty requirement (how many leading 0’s the program requires). This produces a hash and our block is complete. We then create a new block. It has a new data field in which we include the output hash of the preceding block as an input into the new block.
We can see in Figure 7, that we have a chain of 2 blocks. The first block is a copy of our above examples, but we have a new field “Prev” containing new data. Block 1 is our genesis block, it contains arbitrary information in the prev field. The block is mined and our hash of that data is obtained.
In block 2, we include the output hash from block 1 as input data to our new block. We place that data in the “Prev” field. We add our new data we want included in that second block and we mine that block. Both of our blocks are happy
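Here is a rough sketch of two linked blocks. The `Block` class, its field order and the all-zero placeholder used for the genesis block's “Prev” field are illustrative choices of mine, not the demo's actual implementation.

```python
import hashlib
from dataclasses import dataclass
from itertools import count

PREFIX = "0000"  # difficulty requirement used throughout the article


@dataclass
class Block:
    number: int
    data: str
    prev: str      # hash of the preceding block
    nonce: int = 0

    def hash(self) -> str:
        # Assumption: the fields are joined as plain text in this order.
        payload = f"{self.number}{self.nonce}{self.data}{self.prev}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def mine(self) -> None:
        """Search for a nonce that makes this block's hash start with PREFIX."""
        for nonce in count():
            self.nonce = nonce
            if self.hash().startswith(PREFIX):
                return


# The genesis block has no parent, so its prev field is an arbitrary placeholder.
genesis = Block(number=1, data="Hello World", prev="0" * 64)
genesis.mine()

# Block 2 takes block 1's output hash as one of its inputs.
second = Block(number=2, data="Second block", prev=genesis.hash())
second.mine()

print(genesis.hash())
print(second.prev == genesis.hash(), second.hash())
```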
Now let’s go back and change something in block 1. I will add a full stop “.” to the data. We can see now in Figure 8 that the entire blockchain is not happy. Both blocks are red. I have changed the data and broken the chain. Block 1’s hash no longer starts with 0000. Adding the full stop “.” to the data has resulted in a new hash that does not start with 0000 and thus does not meet the program’s requirements. But the output of block 1 is also an input to block 2, thus invalidating the hash of block 2 also.
Re-mining block 1 has resolved the problem for block 1, but because block 1’s hash has changed, the data in block 2 has changed again too, so block 2 is still not happy (Figure 9).
We must now also re-mine block 2 to ensure it complies with the difficulty requirement. Figure 10 shows that we again have 2 x happy blocks once we have redone the work for the 2 blocks. “Work” is a term to describe the computational power we exerted through the process of mining.
Note the need to mine the blocks in order. Re-mining block 2 before we satisfied the work needed for block 1 would result in us having to re-mine block 2 yet again.
We now have the foundation for our blockchain. A 3rd then 4th block can be added, mining each block with the new input data, with the output hash of block 2 leading to the input of block 3, the output of block 3 leading to the input of block 4, and so on and so forth.
If our blockchain is now 4 blocks long and we again try to change data in block 1, it will invalidate the entire chain. We would then need to re-mine block 1, then block 2, then block 3…..you see where I’m going with this?
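To make the knock-on effect concrete, the sketch below builds a 4-block chain, tampers with block 1, finds where the chain first breaks and then redoes the work in order. The dictionary layout and the function names (`build_chain`, `first_invalid`, `repair`) are hypothetical, chosen only for this illustration.

```python
import hashlib
from itertools import count

PREFIX = "0000"
GENESIS_PREV = "0" * 64  # arbitrary placeholder for the first block's prev field


def block_hash(block: dict) -> str:
    # Assumption: the fields are joined as plain text in this order.
    payload = f"{block['number']}{block['nonce']}{block['data']}{block['prev']}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine(block: dict) -> None:
    for nonce in count():
        block["nonce"] = nonce
        if block_hash(block).startswith(PREFIX):
            return


def build_chain(entries: list) -> list:
    chain, prev = [], GENESIS_PREV
    for number, data in enumerate(entries, start=1):
        block = {"number": number, "nonce": 0, "data": data, "prev": prev}
        mine(block)
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_invalid(chain: list):
    """Return the index of the first broken block, or None if the chain is sound."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        if block["prev"] != prev or not block_hash(block).startswith(PREFIX):
            return index
        prev = block_hash(block)
    return None


def repair(chain: list, start: int) -> None:
    """Re-mine from `start` onward, in order, relinking each block to its parent."""
    prev = GENESIS_PREV if start == 0 else block_hash(chain[start - 1])
    for block in chain[start:]:
        block["prev"] = prev
        mine(block)
        prev = block_hash(block)


chain = build_chain(["first", "second", "third", "fourth"])
chain[0]["data"] += "."            # tamper with block 1
broken_at = first_invalid(chain)
print("first broken block index:", broken_at)

repair(chain, broken_at)           # redo the work, earliest block first
print("after repair:", first_invalid(chain))
```

Notice that `repair` has to walk forward from the earliest broken block; fixing a later block first would be wasted effort, because its “Prev” value changes again as soon as its parent is re-mined.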
We now have a good understanding of a blockchain and the pieces that need to go together to create the chain. Now let us expand upon this thinking and look at a distributed blockchain.
Distributed Blockchain and Consensus
One of the beautiful things about the bitcoin blockchain is the distributed ledger. We will get to the “ledger” side of things shortly, for now let us simply look at a distributed blockchain to understand the concept.
Let’s say I had a blockchain, that contained a series of blocks, each block containing some data. This data has been hashed and chained together as we have seen above to form a blockchain. I keep a copy of this blockchain on my computer. What if there was a need to compare notes with someone else? What if the integrity of this data was really important to me and I wanted to ensure it hadn’t been tampered with? What if I wanted to be sure that the data I had, was a true and correct version of that series of data? It would be handy if there was an exact copy of this blockchain for me to compare my version against…. right?
This is where the beauty of the distributed blockchain comes into play. What if I gave someone else a copy of my blockchain so that I could go and compare notes at any time? What if I wrote a program that would do that for me automatically? It could continuously compare my set of data to my friend’s and flag any discrepancies.
To give a scaled down example. Say I had a blockchain, only 3 blocks long containing some really important data I was working on. I could secure this data in a distributed blockchain by keeping a copy at home (Figure 11 — Peer A) and giving someone else a copy of it, or simply keep a copy on my computer at the office (Figure 11 — Peer B).
Suppose I left my computer on and unlocked, and some nefarious actor (they’re everywhere) came along and changed the data in block 2. They could re-mine every block from block 2 onward so that my copy of the blockchain looks OK in terms of the chain and the hashes for each block. However, when I come back to my distributed blockchain program, it flags that my version of truth differs from the one on my office computer. A quick comparison shows that the hashes have indeed changed from block 2 onward (Figure 12). Both chains look OK, they are both green, but my computer program flagged the discrepancy between the hashes and alerted me.
But which version is correct? It is impossible to tell unless I knew which specific version the attacker had changed.
It would stand to reason then that it would be handy if I had yet another version to compare to. I’ll keep a copy at home (Figure 13a — Peer A), a copy at the office (Figure 13b — Peer B) and another copy at my mother-in-law’s house (Figure 13c — Peer C).
Now, the nefarious actor strikes again, they change the data on my home PC version again and re-mine all the blocks at home. But now I have a consensus mechanism by which to compare versions. I have a voting system. I have 3 copies of the truth. Peer A (Figure 14a) is telling me one thing and the other 2 (Figure 14b) are in consensus on a different version of truth.
I can put a level of trust into this consensus by assuming that the nefarious actor would not have been able to:
- Know the physical location of each copy of my blockchain
- Break in, change the data and re-mine 2 of the 3 copies of this blockchain.
While the above scenario is possible, it is not very probable.
I can thus disregard my home version, restore it from the copies at my office and my mother-in-law’s house, and be safe in the knowledge that I have recovered THE version of truth. Thank god for mothers-in-law and blockchain technology.
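The voting idea can be sketched very simply. Because every block includes the hash of the one before it, the hash of the last block stands in for the whole history, so each peer only needs to report that one value. The shortened hashes below are made-up placeholders, and `majority_tip` is a name of my own.

```python
from collections import Counter


def majority_tip(tips: dict):
    """Return (hash, agreeing peers) for the final-block hash most peers report.

    Returns None when no single hash is held by more than half of the peers.
    """
    counts = Counter(tips.values())
    tip, votes = counts.most_common(1)[0]
    if votes * 2 <= len(tips):
        return None
    return tip, sorted(peer for peer, h in tips.items() if h == tip)


# Hypothetical final-block hashes: Peer A has been tampered with and re-mined.
three_peers = {
    "Peer A": "0000aa11",
    "Peer B": "0000e4b9",
    "Peer C": "0000e4b9",
}
two_peers = {"Peer A": "0000aa11", "Peer B": "0000e4b9"}

print(majority_tip(three_peers))   # B and C outvote A
print(majority_tip(two_peers))     # one against one: no way to pick a winner
```

With only two copies the function returns `None`, which is exactly the stalemate described earlier; the third copy is what breaks the tie.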
Now imagine I had tens of thousands of computers (nodes), randomly distributed throughout the world, running this blockchain (that data is pretty special to me after all). That bad actor would have to track down more than 50% of these nodes and change each copy in order to cast doubt as to which version of the chain was the truth. This is highly improbable, if not impossible. In simplified form, this is how the bitcoin blockchain works: a network of randomly distributed nodes, run by everyday people on ordinary computers or small, cheap dedicated hardware devices, each storing a version of truth and keeping each other honest. (Strictly speaking, bitcoin nodes do not take a head count; they follow the valid chain with the most accumulated mining work behind it, so an attacker would need to out-mine the rest of the network rather than break into machines.) This is what makes bitcoin decentralised and trust-less. There is no one party that controls the blockchain; it is run by the community of participants.
But Daz, what about bitcoin itself? What about the coins? What about the Ledger?
Bitcoins and the Ledger
From the knowledge we built about blockchain so far, understanding the ledger is as simple as being able to format the data-set we have been playing with. So far we have been playing with a text field. And as important as the information is that “Daz is pretty rad”, it turns out, no one else cares.
But what would be useful is if we utilised the data field to start recording something useful, like transactions. Let’s imagine we now have a blockchain distributed among a lot of nodes. In reality there are thousands of nodes running the same blockchain, but we will just look at Peers A & B to get the idea. In our example, we have replaced the free-text data field with a series of transaction fields. Figure 15 shows blocks 4 & 5 from Peer A and Peer B, and we can see that there is now a series of transactional data within each block.
In block 4 we can see that an amount of $62.19 was sent from someone named Rick to someone named Isla. This was one transaction among 5 total transactions included in that block. The block of data is hashed and mined exactly as we have seen previously.
Our nefarious actor strikes again. His name is Sam, and he received $97.13 from Rick. Sam knows a bit of coding and decides he wants to steal some money from Rick, so he alters his copy of the chain, changing the transaction amount to $97,000.13. He re-mines his version (Figure 16), but it is useless. The vast majority of nodes on the network see that Sam’s version of truth is out of consensus with their own, and it is rejected by the network. Nice try, Sam.
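Swapping free text for transactions changes nothing about the hashing; the transactions are simply the data. The sketch below is illustrative only: the JSON serialisation, the field names and the choice to put both transfers in one block are assumptions of mine, not the demo's format.

```python
import hashlib
import json


def block_hash(number: int, nonce: int, txs: list, prev: str) -> str:
    # Assumption: transactions are serialised as compact, key-sorted JSON
    # so that every node hashes exactly the same bytes.
    body = json.dumps(txs, sort_keys=True, separators=(",", ":"))
    payload = f"{number}{nonce}{body}{prev}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Amounts are held in whole cents to avoid floating-point rounding.
honest = [
    {"from": "Rick", "to": "Isla", "cents": 6219},
    {"from": "Rick", "to": "Sam", "cents": 9713},
]

tampered = [dict(tx) for tx in honest]   # Sam's private copy
tampered[1]["cents"] = 9_700_013         # $97.13 becomes $97,000.13

prev = "0" * 64  # placeholder for the previous block's hash
honest_hash = block_hash(4, 0, honest, prev)
tampered_hash = block_hash(4, 0, tampered, prev)

print(honest_hash)
print(tampered_hash)
print("same block?", honest_hash == tampered_hash)
```

Sam can re-mine his copy until it looks valid on his own machine, but its hashes will never match the ones everybody else is holding.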
We can see a whole heap of transactions between parties, but how do we know that Rick had $97.13 to spend in the first place?
Coinbase — The Genesis Block
If you have been among the bitcoin community for a while, you have undoubtedly heard of the famous genesis block. Satoshi Nakamoto (Bitcoin’s pseudonymous creator) mined the first block, which contained a block reward of 50 bitcoin. Block rewards form the coinbase for bitcoin; in other words, before coins can be spent, they must first be brought into circulation. The Bitcoin genesis block was actually counted as block “0” and the first 50 bitcoin were actually non-spendable, but from block 1 onward they formed the coinbase.
Coins are introduced with each and every block on a tightly controlled release schedule. This release schedule is 1 block roughly every 10 minutes. Every 210,000 blocks, the block reward is halved. As of this writing in July 2021, the block reward is 6.25 BTC every ~10 minutes. There are currently around 18.7 million bitcoin in circulation, with the last of the 21 million bitcoin estimated to be mined in the year 2140.
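The release schedule is simple enough to work out in a few lines. The sketch below counts in satoshis (1 bitcoin = 100,000,000 satoshis) and halves the reward by integer division, discarding any fraction of a satoshi; the function names are my own, and the block height used in the example is just an arbitrary height from the 6.25 BTC era.

```python
HALVING_INTERVAL = 210_000            # blocks between halvings
SATOSHIS_PER_BTC = 100_000_000
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC


def block_subsidy(height: int) -> int:
    """Newly issued satoshis in the block at `height` (transaction fees excluded)."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_SUBSIDY >> halvings   # each halving drops one binary digit


def total_supply() -> int:
    """Add up the reward for every block until the reward rounds down to zero."""
    total, era = 0, 0
    while (subsidy := INITIAL_SUBSIDY >> era) > 0:
        total += subsidy * HALVING_INTERVAL
        era += 1
    return total


print(block_subsidy(0) / SATOSHIS_PER_BTC)         # the very first reward
print(block_subsidy(690_000) / SATOSHIS_PER_BTC)   # after three halvings
print(total_supply() / SATOSHIS_PER_BTC)           # lands just under 21 million
```

Because fractions of a satoshi are thrown away at each halving, the schedule tops out slightly below 21 million rather than exactly on it.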
Block 1 established the initial coinbase, and the mining of further blocks expands on the coinbase through the block rewards. Looking at our example, I start with block number 1, which forms my coinbase. If we look at our blockchain example (Figure 17) we can see that I rewarded myself $100 in our first block. Partly because I am a good bloke and partly because I happen to be the one that mined the first block. When we move to block 2, I start to spend my coins. The program will always check that my transaction outputs (the coins I spend) don’t exceed my balance in the coinbase. I include the transactional data in block 2 and I mine that block too, so I am rewarded with more coins as the block reward for block 2. From here, the blockchain keeps building, with a copy of the ledger distributed among the nodes, and these nodes enforce the rules determined by the program. Block 3 includes more transactions as other users start transacting. Lucas is also mining; he solves the difficulty puzzle for block 3 and is rewarded the block reward for that block. And on and on the chain grows.
As before, if data is changed, even by one simple character, whether it be in the hash outputs themselves or the transactional data, and someone’s chain is out of consensus with the majority, it is simply rejected.
In terms of balance, if I try and spend more than my balance, the network will reject the transaction. If I try and change the history, the network will reject the history by comparing the hashes of each block. The nodes are the source of truth and keep everyone honest.
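The balance rule can be sketched as a running tally. This follows the simplified account-style ledger used in this article's example; bitcoin itself keeps track of unspent transaction outputs rather than running balances. The function name, the amounts and the particular transfers are hypothetical, and I have assumed a block's own reward cannot be spent inside that same block.

```python
from collections import defaultdict


def apply_block(balances: dict, miner: str, reward: int, txs: list) -> list:
    """Apply each transfer the sender can afford, then credit the block reward.

    Returns the transfers that were rejected for lack of funds.
    """
    rejected = []
    for sender, receiver, amount in txs:
        if amount <= 0 or balances[sender] < amount:
            rejected.append((sender, receiver, amount))
            continue
        balances[sender] -= amount
        balances[receiver] += amount
    balances[miner] += reward   # coinbase entry: new coins with no sender
    return rejected


balances = defaultdict(int)

# Block 1: Daz mines it and receives the first reward; there is nothing to spend yet.
apply_block(balances, "Daz", 100, [])

# Block 2: Daz spends some coins, tries to overspend, and mines the block himself.
rejected = apply_block(balances, "Daz", 100, [
    ("Daz", "Lucas", 60),
    ("Daz", "Rick", 500),   # more than Daz holds, so it is refused
])

print(dict(balances))
print("rejected:", rejected)
```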
The miners throw computational power at solving the cryptographic puzzles for each block, and they are rewarded the block reward for their efforts. Mining is a necessary function of the bitcoin blockchain; without mining there is no value. Miners must expend energy to solve the blocks, and the amount of time it takes to mine a block is controlled through a feature of the protocol called the difficulty adjustment. For further information on bitcoin mining and the difficulty adjustment, and why they are needed, please visit this previous article. The difficulty adjustment is one of the more brilliant aspects of the bitcoin protocol, and it is explained nicely in that article.
What we have covered here is a simplified version of how the bitcoin network operates: distributed ledgers, hashing, blockchains, nodes and miners working in harmony to provide a completely trust-less, censorship-resistant, open-source, open-ledger, open monetary network. Nobody can change the ledger, nobody can stop transactions, nobody can reverse transactions and nobody can double spend their coins. It is nothing short of brilliant and is completely revolutionising global finance.
The full history of transactions is available for anyone to interrogate and verify. Obviously, unlike our examples here today, the bitcoin blockchain does not display the identities of the people transacting, only their addresses. Addresses will be the subject of the next article, as it is too large a topic to cover today. Please consider following me here on Medium to receive notification of new articles.
In this article I used the amazingly interactive platform built by Anders Brownworth, who taught blockchain at MIT. Visit here and have a play; you will get a much deeper understanding of the concepts we covered in this article if you play around with it all. Anders also has a very good video tutorial covering this content to help cement the learning.
If you are reading this, you are already ahead of the pack in terms of bitcoin adoption. Institutional adoption is coming, and coming fast, but we still have time to front-run the big boys. With my articles I am trying to help the Average Joe educate themselves on this technology so that we can benefit from it. We will benefit the most, the guys in the trenches, battling for wages. Start dollar-cost-averaging into bitcoin; buy a bit every day or every week. Treat bitcoin as your savings account and ride the wave up. It is the hardest money that has ever existed and it grows exponentially each day.
Happy Stacking, thanks for reading