Comparing lists and detecting errors
Whenever our friends at The Blockchain Bar suspect a mistake, they have to compare every entry on the last page of their list with every other guest’s copy. That’s very cumbersome.
Bob is annoyed: “It takes too long to compare every item on the page. There must be a better way.”
Alice has an idea: “Let’s calculate ‘fingerprints’ of our data. Everyone adds up the number of letters in each name on their list. If the sums are identical, the data is probably identical too. So we only have to compare line by line if the sums don’t match.”
Blockchains do something very similar: They calculate fingerprints of transaction blocks. These fingerprints are called hash values. This makes it easy to compare blocks and identify mistakes or unwanted changes.
But what to do if there are different fingerprints? How to agree on one version? Find out in the next episode…
Or maybe you first want to read more about hashing below.
‘Hash’ calculations create data ‘fingerprints’
In The Blockchain Bar, every time the list of new beer orders reaches the end of the page, guests want to make sure that every copy of the list contains the same orders. Only if the copies match do they hand out the drinks ordered on that last page.
Of course, to make sure that all pages contain the same orders (transactions), they could compare every single item on every copy of the list. But this would be very time-consuming, and guests would have to wait hours for their drinks.
It’s not feasible to compare all lists all the time. They have to find an easier detection mechanism for false pages.
Hence, they define the following error detection mechanism in The Blockruption’s Blockchain Bar Protocol:
At the end of every page of the list, they go through the page line by line and count the letters in the name of the person who ordered the drink. Then they add the resulting numbers together and write the total at the end of the page.
For example, the page lists the following orders: Alice, Bob, Carol, Bob, Dave. This results in calculating 5 + 3 + 5 + 3 + 4 = 20.
If they all come to the same result of the calculation, they likely have the same orders on their lists, so they don’t have to compare their pages line by line. But if they don’t have the same number, they instantly know that something is wrong.
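The letter-counting check can be sketched in a few lines of Python. This is only an illustration of the bar’s rule, not anything a real blockchain does; the function name letter_sum and the three example pages are made up for this sketch.

```python
def letter_sum(names):
    """Add up the number of letters in each name on a page."""
    return sum(len(name) for name in names)


# Hypothetical copies of the same page held by different guests.
alice_page = ["Alice", "Bob", "Carol", "Bob", "Dave"]
bob_page = ["Alice", "Bob", "Carol", "Bob", "Dave"]
faulty_page = ["Alice", "Bob", "Carol", "Bob"]  # one order is missing

print(letter_sum(alice_page))                             # 20
print(letter_sum(alice_page) == letter_sum(bob_page))     # True: probably identical
print(letter_sum(alice_page) == letter_sum(faulty_page))  # False: something is wrong
```

Comparing one number per page replaces comparing every line, which is the whole point of a fingerprint.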
But how do we make sure that the entire writing pad contains all the right pages? The guests agree to copy the result from the previous page onto each new page. At the end, they add it to the number of letters in all the orders on the current page. This method connects all pages of the writing pad through the increasing sum of letters.
In the example above, the next page would start with 20 copied from the result above and include this number in the sum of all letters at the end.
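Carrying the result over from page to page might look like the following Python sketch. The second page and the name page_total are invented for illustration; only the first page and its total of 20 come from the example above.

```python
def page_total(previous_total, names):
    """Total written at the end of a page: the carried-over result
    from the previous page plus the letters on the current page."""
    return previous_total + sum(len(name) for name in names)


page_1 = ["Alice", "Bob", "Carol", "Bob", "Dave"]
page_2 = ["Carol", "Alice", "Bob"]  # hypothetical orders on the next page

total_1 = page_total(0, page_1)        # the first page has nothing to carry over
total_2 = page_total(total_1, page_2)  # starts from the 20 copied from page 1
print(total_1, total_2)                # 20 33

# Changing an order on page 1 also changes the total on page 2.
tampered_page_1 = ["Alice", "Bob", "Carol", "Bob", "Eve"]
tampered_total_2 = page_total(page_total(0, tampered_page_1), page_2)
print(tampered_total_2)                # 32
```

Because every total depends on the one before it, checking the number on the latest page indirectly checks all earlier pages as well.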
Of course, just counting the number of letters on each page is a little too simple. If we exchange ‘Dave’ for ‘Carl’ in the above example, we would still end up with 20, as both names have the same number of letters. We could come up with more sophisticated versions of the calculation. For example, we could multiply the number of letters in each name by the position of the name’s first letter in the alphabet before adding it to the sum: 1 * 5 + 2 * 3 + 3 * 5 + 2 * 3 + 4 * 4 = 48. With ‘Carl’ instead of ‘Dave’, the last summand becomes 3 * 4 instead of 4 * 4, so the total changes to 44. The guests decide to go with this more complicated but also safer version of the calculation.
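The more sophisticated calculation can be written down as a short Python sketch. The function name weighted_sum is made up here, and the code assumes that every name starts with one of the letters A to Z.

```python
import string


def weighted_sum(names):
    """Multiply each name's length by the alphabet position of its
    first letter (A=1, B=2, ...) and add the results together."""
    total = 0
    for name in names:
        # Assumption: the first character is a plain letter A-Z.
        position = string.ascii_uppercase.index(name[0].upper()) + 1
        total += position * len(name)
    return total


with_dave = ["Alice", "Bob", "Carol", "Bob", "Dave"]
with_carl = ["Alice", "Bob", "Carol", "Bob", "Carl"]

print(weighted_sum(with_dave))  # 48
print(weighted_sum(with_carl))  # 44
```

Even this version is far from safe. It is still a plain sum, so swapping two orders on the page, or replacing ‘Dave’ with another four-letter name starting with D, leaves the result unchanged.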
Calculations like this are complex for humans, especially after three or four beers. But computers don’t drink beer. And doing calculations is the only thing they are really good at. Hence, real blockchain systems use complicated calculations to make sure that every ‘page’ in the ‘writing pad’ is correct. And they use complex calculations to connect the pages within the writing pad.
In blockchain systems the pages are called blocks. And the blocks get ‘chained’ together with complex calculations. The entire writing pad is called the ‘blockchain’.
In Bitcoin the ‘chaining calculations’ are called ‘hashing’. The Bitcoin miners make a special calculation across the transactions in a block. The result depends on both the content and the order of the transactions, and it comes out as a simple number. Everyone else on the network can check that number when it is added to the end of a block.
Hashes in Bitcoin look something like this:
(Dear Nerds, yes, the following examples are heavily simplified.)
From the data ‘Bob’, Bitcoin’s hash calculation (algorithm) produces the following fingerprint:
The data ‘Alice’ turns into:
And ‘Blockruption’s Blockchain Bar’ results in:
All three generated hash values have the same length. We could put an arbitrarily long text into the algorithm and it would still produce only 64 characters as output. (Yes, nerds, these are 64 hexadecimal digits, which encode 256 bits.)
The hash algorithm used by Bitcoin has the nerdy name ‘SHA-256’. The hash values it produces are, for all practical purposes, unique to the data put into it. In theory, two different inputs could produce the same output, but this is so unlikely that it is practically impossible. So if two hashes are the same, we can safely assume that their original inputs were the same, too.
You can try it out yourself with many freely available hash generators on the Web. For example, you can use this one: https://passwordsgenerator.net/sha256-hash-generator/
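If you prefer a few lines of code to a website, Python’s standard hashlib module can calculate SHA-256 values as well. The sketch below hashes plain text only, which is far simpler than what Bitcoin actually does with a block, and the helper name fingerprint is made up for this example. The exact characters and the text encoding (UTF-8 here) matter: a straight apostrophe and a curly one give completely different hashes.

```python
import hashlib


def fingerprint(text):
    """Return the SHA-256 hash of a text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


for data in ["Bob", "Alice", "Blockruption's Blockchain Bar"]:
    print(data, "->", fingerprint(data))

# The output length stays the same, no matter how long the input is.
long_text = "beer " * 10000
print(len(fingerprint("Bob")), len(fingerprint(long_text)))

# A tiny change in the input produces a different fingerprint.
print(fingerprint("Dave") == fingerprint("Carl"))
```

Unlike the letter-counting trick from the bar, exchanging ‘Dave’ for ‘Carl’ no longer goes unnoticed here.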
This is the most complicated of all episodes in this series. If you found it a bit too hard to have fun with, please don’t leave. Try the next episodes instead; they will be easier to understand.