Month: September 2024

Offensive Security

What are Cryptographic Hashes?

Frying PotatoesThis post is the first of three posts that I have planned in a little mini-series. I want to post about What are Cryptographic Hashes?, then Identifying Hashes, and finally Cracking Hashes.

So, what is a Cryptographic Hash? At its core, a cryptographic hash function is a mathematical algorithm that takes an input and returns a fixed-size string of characters, which is typically a hexadecimal number. The output is called a hash or digest. Cryptographic hash functions are designed to be secure and resistant to tampering, which makes them highly useful in a variety of security-related contexts. The key features of cryptographic hashes are:

  1. Deterministic: The same input will always produce the same hash.
  2. Fixed Length: Regardless of the size of the input, the hash produced will always be of a fixed size (e.g., SHA-256 always produces a 256-bit hash).
  3. Efficient: Computing the hash for any input is relatively fast.
  4. Pre-image Resistance: It should be computationally infeasible to reverse the process and determine the original input from the hash.
  5. Collision Resistance: Two different inputs should not produce the same hash.
  6. Avalanche Effect: Even a small change to the input should drastically change the output hash.

That Avalanche Effect part is actually really important and easier to understand with a demonstration. If a hashing of similar words resulted in similar hashes, you could actually figure out the original message very easily as you got closer and closer. Instead, abcd, abcdd, abcdD, and abcde actually hash to wildly different SHA256 Hashes, for example. Pretty smart, I think.

$ echo "abcd" | sha256sum
fc4b5fd6816f75a7c81fc8eaa9499d6a299bd803397166e8c4cf9280b801d62c

$ echo "abcdd" | sha256sum
2fc0920b01e12a239cac49f999c641dafaa841da88428c08632ec1fa9860920f

$ echo "abcdD" | sha256sum
f71367ae3183ad102b186748aba1a2132c73168b79a3e2612199f2d74ae9d9b0

$ echo "abcde" | sha256sum
0283da60063abfb3a87f1aed845d17fe2d9ba8c780b478dc4ae048f5ee97a6d5

Where do we use Cryptographic Hashes? Well, I’m writing about this under the banner of Offensive Security, so some logical guesses could be made. One thing they are used for is for storing passwords. Instead of storing the passwords directly in a database or even storing an encrypted version that can be decrypted, it is a good practice to store a Cryptographic Hash. Then when someone tries to log in, you hash the input and then compare the hashes. That way, the passwords are harder to determine from the hash alone. Another use is for data integrity. If a hash is taken of a file or a message and that hash is published, you can confirm that you have the exact same file or message by calculating the hash yourself and comparing.

What are the dangers here? Why are there so many kinds of hashes? Well – like a lot of things – this is actually a hard problem. If the hash algorithm isn’t very good, it could have collisions. That means two or more messages hash to the same value, so if apple and orange both hash to ABCDEF and ABCDEF is stored in the database as the password hash, then providing either a password of apple or orange at login would let you in. That’s not great. Also, if it is possible for multiple inputs to have the same output, then a file with a hash of AAABBBCCC is published, but someone also makes a version of a file full of viruses that also hashes out to AAABBBCCC, then we’ve lost trust in the process.

Here are just four examples of hash algorithms that had been trusted, but aren’t anymore.

MD4 (Message Digest Algorithm 4)
Discovered Collisions: Yes (1995)
Details: MD4 is an earlier version of MD5 and was found to be insecure in the mid-1990s. Collisions were discovered in MD4 as early as 1995, and since then, the hash function has been considered broken and insecure. MD4 was eventually replaced by more secure algorithms like SHA and MD5, though MD5 also suffered from its own weaknesses.
Examples: MD4 is largely obsolete but might still be found in legacy systems.

MD5 (Message Digest Algorithm 5)
Discovered Collisions: Yes (2004)
Details: MD5 is one of the most famous cryptographic hash functions with widely known collisions. It produces a 128-bit hash value. In 2004, researchers demonstrated a practical method to find collisions in MD5. Since then, the hash function has been considered broken and unsuitable for security purposes like digital signatures and certificates.
Examples: In 2008, an attack was demonstrated that allowed attackers to create a rogue Certificate Authority (CA) certificate, which undermined the trust model of SSL certificates.

SHA-0 (Secure Hash Algorithm 0)
Discovered Collisions: Yes (2004)
Details: SHA-0 is the predecessor of SHA-1 and was withdrawn from use due to its vulnerability to collisions. Researchers found theoretical weaknesses, and in 2004, they were able to demonstrate actual collisions in the function. As a result, SHA-0 was deprecated early in its life.
Examples: SHA-0 is no longer used in any modern cryptographic system due to these vulnerabilities.

SHA-1 (Secure Hash Algorithm 1)
Discovered Collisions: Yes (2017)
Details: SHA-1, which produces a 160-bit hash value, was long considered secure but began to show theoretical weaknesses in the early 2000s. In 2017, Google and the CWI Institute in Amsterdam successfully generated a practical collision using the “SHAttered” attack. This discovery showed that SHA-1 was vulnerable to collision attacks and should no longer be used in critical security applications.
Examples: SHA-1 has been deprecated for use in SSL certificates, and major browsers no longer accept SHA-1-signed certificates.

Understanding these foundational concepts is critical as we move forward in our offensive security series. In our next post, we will dive into how to identify a hash you find in the wild—whether it’s from a password dump, a malware sample, or part of an intercepted communication.