Sunday 30 November 2014

Lots of bitwise operations

After looking over the algorithm again and again, the bitwise operations become less daunting... They're just part of the scrambling process. Quick and clever too.

More interesting is how the rounds go by and let the same registers get the same operations performed on them, yet be completely different each time by looking up values of various variables from previous rounds and incorporating them into the mix.

Breaking the main loop into three high level steps seems to be the best approach rather than try to explain everything at once. There's just too much going on to do that.

step 0: initialize the first intermediate hash and pad the message, also break it up into 32-bit chunks.
In the for loop:
step 1: initialize the temporary registers to the old intermediate hash sub-values

step 2: the heart of the algorithm. Repeat 64 times: apply a bunch of bitwise operations to the registers and incorporate previous calculated values from other variables.

step 3: make the new intermediate hash sub-values from the registers' values found in step 2.

do it all N times, where N is the number of 32-bit blocks the message is.

concatenate the final hash sub-values to get the actual digest.

Friday 28 November 2014

The birthday problem?

Well, since i haven't explained it yet. Collision attack are done based on a statistical problem known as the birthday problem. The problem being: What are the chances of two persons having the same birthday in class of certain size. In a class of say 30 people, one would expect 30/365, which is around 8%. However in reality the chances are around 70% and 99.9% for 70 people. This is due to the fact that this applied to any day of the year, not just a specific day. This is same concept is used in collision attack to try to find two identical hash. Due to the nature of this, it is possible to cut the complexity of attack down to 2^(n/2) where n is 256 for 32 bits and 512 for 64 bits.

Wednesday 26 November 2014

The method of a brute

As said before, hashing function are pretty much designed for one way conversion, so reversing it is basically impossible to do. Lets start off by saying that SHA-2 is used mostly in two sources, password and digital signature or check-sums.

In the first case there are two major obstacles in your path, first is getting access to the hash and the salt. These would require you being able to access the server in some way. Next obstacles is to try to create a value that will collide with said hash. This part is where it gets very difficult. This attack is basically a pre-image attack and the time complexity for this is basically the amount of characters that is used to encrypt it. In the case of SHA-2 with digest size of 256 it has a complexity of 2^255 and with digest of 512 for 64 bits it has complexity of 2^511. While it is possible to do it, the amount of time it would take simply be not practical for any purpose.

The second application of this encrypt and validate stored data. The attempt here is to create two similar documents with the same hash. One will be malicious while the other will be non-malicious. The trick is to send the non-malicious document to other party, get their approval(such as signature or agreement to download something), then send the malicious copy to them. Since they have the same hash, one will not know the difference until they open the file or until it has been alerted.


Tuesday 25 November 2014

Let's get definitions out of the way

Not everyone may be familiar with SHAs, let alone SHA-2. So the first question to answer is, what does SHA mean?

SHA stands for Secure Hash Algorithm. It is used to generate a unique hash for each input. A hash is simply the resulting value of a function. For example, with SHA-256 the string "hello123" has a hash value of 27cc6994fc1c01ce6659c6bddca9b69c4c6a9418065e612c69d110b3f7b11f8a. (The hash value is called the digest, and the input is called the pre-image). What is worth noting is that SHAs are NOT encryption algorithms themselves; there is no cipher key for 'decryption' because the process is not supposed to be reversible. Using the example, there should ideally be no way of processing the hash value to get a result of "hello123". A sign of a good hash function is that there is no better way to find the original data (pre-image) than by brute force.

If you can't get the original data back, what's the use?
In one word: verification.
A simple use-case of SHAs is error checking. When a site hosts a file, it can also provide a checksum generated with an SHA-2 algorithm so the user downloading the file can ensure its integrity.
A more heard of and complicated use of SHAs is in website certificate verification.

Some people say that hash functions like SHA-2 need to try and be one-to-one (i.e. exactly one unique output for each different input) while others say this is irrelevant or impossible. When at least 2 different inputs have the same digest, it is called a collision (not my area to talk about).

What is SHA-2 really?

It is a family of algorithms developed by the NSA and NIST. There are six of them: SHA-224, SHA-256, SHA-512, SHA-384, SHA-512/224, and SHA-512/256. They are all essentially the same algorithm with different data sizes used in computation. SHA-256 and SHA-512 use 64-bit words, and SHA-224 and SHA-384 use 32-bit words.

To show the inner workings of the algorithm I'll explain SHA-256.

Monday 24 November 2014

The impenetrable defence

I have to say, hashing algorithmic are definitely one of the more secure way to encrypt something. The only number of way to really attack it falls into two major categories. Pre-image which basically tries to brute force the key from the hash(provided you can get access to the key to start with) and collision to try to create duplicate keys. Thought the usefulness of latter is questionable.