Tuesday 25 November 2014

Let's get definitions out of the way

Not everyone may be familiar with SHAs, let alone SHA-2. So the first question to answer is, what does SHA mean?

SHA stands for Secure Hash Algorithm. It is used to generate a fixed-size hash from an input of any length. A hash is simply the output value of a hash function. For example, with SHA-256 the string "hello123" has a hash value of 27cc6994fc1c01ce6659c6bddca9b69c4c6a9418065e612c69d110b3f7b11f8a. (The hash value is called the digest, and the input is called the pre-image.) What is worth noting is that SHAs are NOT encryption algorithms; there is no cipher key for 'decryption' because the process is not supposed to be reversible. Using the example, there should ideally be no way of processing the hash value to get back "hello123". A sign of a good hash function is that there is no better way to find the original data (pre-image) than by brute force.
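To try this yourself, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex is mine, and encoding the string as UTF-8 is an assumption: a hash function works on bytes, so a different encoding would give a different digest.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as 64 hexadecimal characters."""
    # Hash functions operate on bytes, so the string is encoded first.
    # UTF-8 is an assumption; another encoding may give a different digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two inputs that differ by one character give unrelated-looking digests.
    print(sha256_hex("hello123"))
    print(sha256_hex("hello124"))
```

Whatever the length of the input, the digest is always 256 bits, which is 64 hex characters.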

If you can't get the original data back, what's the use?
In one word: verification.
A simple use-case of SHAs is error checking. When a site hosts a file, it can also provide a checksum generated with an SHA-2 algorithm so the user downloading the file can ensure its integrity.
A better-known and more complicated use of SHAs is in website certificate verification.
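The file-download use-case can be sketched in a few lines of Python. This is only an illustration: the function names file_sha256 and matches_checksum are hypothetical, and I am assuming the site publishes the SHA-256 checksum as a hex string.

```python
import hashlib
import hmac


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, published_hex: str) -> bool:
    """Compare the file's digest with the checksum published by the site."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_sha256(path), expected)
```

If matches_checksum returns False, the file was corrupted or altered somewhere between the site and your disk.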

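Some people say that hash functions like SHA-2 need to try and be one-to-one (i.e. exactly one unique output for each different input) while others say this is irrelevant or impossible. Strictly speaking it is impossible: the input can be any length while the digest has a fixed length, so there are more possible inputs than outputs and some inputs must share a digest. When two or more different inputs have the same digest, it is called a collision. What a good hash function aims for is that collisions are infeasible to find in practice (not my area to talk about).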
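To see why collisions have to exist once the output is a fixed size, here is a toy sketch. It deliberately weakens SHA-256 by keeping only the first 16 bits of the digest; that truncation (the hypothetical tiny_hash below) is purely my assumption for illustration and is not how SHA-2 is used in practice.

```python
import hashlib


def tiny_hash(data: bytes) -> bytes:
    """A deliberately weak 16-bit hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """Return two different messages that share the same tiny_hash value."""
    seen = {}  # maps a 16-bit digest to the first message that produced it
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        short = tiny_hash(message)
        if short in seen:
            return seen[short], message
        seen[short] = message
        counter += 1


if __name__ == "__main__":
    first, second = find_collision()
    print(first, second, tiny_hash(first).hex())
```

With only 65,536 possible outputs the loop cannot run for more than 65,537 messages before two of them collide. The full 256-bit digest has the same property in principle, only with a search space far too large to brute force.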

What is SHA-2 really?

It is a family of algorithms designed by the NSA and published by NIST. There are six of them: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. They are all essentially the same algorithm with different word sizes, initial values, and output lengths. SHA-224 and SHA-256 use 32-bit words, while SHA-384, SHA-512, SHA-512/224, and SHA-512/256 use 64-bit words.
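The sizes can be checked from Python's hashlib, which reports each algorithm's digest size and block size in bytes. In this sketch the word size is derived by assuming a block is 16 words, as it is in the SHA-2 design. The function name sha2_parameters is hypothetical, and I only list the four variants hashlib always provides, since SHA-512/224 and SHA-512/256 depend on the underlying OpenSSL build.

```python
import hashlib


def sha2_parameters():
    """Return (name, digest bits, word bits) for four SHA-2 variants."""
    rows = []
    for name in ("sha224", "sha256", "sha384", "sha512"):
        hasher = hashlib.new(name)
        digest_bits = hasher.digest_size * 8
        # Each SHA-2 message block holds 16 words, so word size = block / 16.
        word_bits = hasher.block_size * 8 // 16
        rows.append((name, digest_bits, word_bits))
    return rows


if __name__ == "__main__":
    for name, digest_bits, word_bits in sha2_parameters():
        print(f"{name}: {digest_bits}-bit digest, {word_bits}-bit words")
```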

To show the inner workings of the algorithm I'll explain SHA-256.
