How does the SHA family of hash functions work?

 

Hashes are a mathematical function that takes an arbitrary length input and returns a fixed length output like the Fowler-Noll-Vo hash. Cryptographic hash functions are designed with the intention of ensuring that there is as close to one input per output as possible, and that they are incredibly difficult to reverse. There are also hash functions that take a key as well as an input that are, astonishingly, called keyed hash functions. Discourse is still ongoing as to where the name came from. The SHA family of hash functions are cryptographic hashes, and includes both functions that are considered secure today and functions that have been deprecated due to being deemed insecure.

Let’s take a look at how SHA-256 works internally. It has a digest length of 256 and is not keyed. SHA-256 takes blocks of 512 bits, and performs 64 rounds per block. There are three main steps to this process: Padding, Block Decomposition, and Hash Computation

We use a constant 64 binary words:

0x428a2f98 0x71374491 0xb5c0fbcf 0xe9b5dba5 0x3956c25b 0x59f111f1 0x923f82a4 0xab1c5ed5 0xd807aa98 0x12835b01 0x243185be 0x550c7dc3 0x72be5d74 0x80deb1fe 0x9bdc06a7 0xc19bf174 0xe49b69c1 0xefbe4786 0x0fc19dc6 0x240ca1cc 0x2de92c6f 0x4a7484aa 0x5cb0a9dc 0x76f988da 0x983e5152 0xa831c66d 0xb00327c8 0xbf597fc7 0xc6e00bf3 0xd5a79147 0x06ca6351 0x14292967 0x27b70a85 0x2e1b2138 0x4d2c6dfc 0x53380d13 0x650a7354 0x766a0abb 0x81c2c92e 0x92722c85 0xa2bfe8a1 0xa81a664b 0xc24b8b70 0xc76c51a3 0xd192e819 0xd6990624 0xf40e3585 0x106aa070 0x19a4c116 0x1e376c08 0x2748774c 0x34b0bcb5 0x391c0cb3 0x4ed8aa4a 0x5b9cca4f 0x682e6ff3 0x748f82ee 0x78a5636f 0x84c87814 0x8cc70208 0x90befffa 0xa4506ceb 0xbef9a3f7 0xc67178f2

We will use these later. First, we need to pad the information. No matter the length of the input, even if it is a perfect multiple of 512 bits, we pad the input prior to computing our hash. There are three steps to the padding:

  1. Append a 1 bit
  2. Append a number, k, of 0 bits. k is going to be the smaller positive integer where the (inital length of the message) + 1 + k = 448 mod 512
  3. Append the length < 2^64 of the inital message with 64 bits to the end of the message

To decompose the block, we seperate the first 16 blocks