A hash function takes an input of any size and produces a fixed-size output called a hash, digest or checksum. The same input always produces the same output. A change to even a single character of the input produces a completely different output. This combination of determinism and sensitivity to change makes hashes useful for verifying data integrity, storing passwords securely, and creating identifiers for content.
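As a minimal sketch of these properties, the snippet below uses Python's standard hashlib module with SHA-256. The helper name sha256_hex and the sample strings are illustrative assumptions, not part of any particular tool.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("hello world")
second = sha256_hex("hello world")   # same input again
changed = sha256_hex("hello worle")  # last character changed

# Determinism: identical input gives an identical digest.
print(first == second)

# Fixed size: 64 hex characters regardless of input length.
print(len(first), len(sha256_hex("x" * 10_000)))

# Sensitivity: count how many of the 64 hex positions differ.
differing = sum(a != b for a, b in zip(first, changed))
print(differing)
```

With a well-behaved hash, the two digests should differ in most positions even though the inputs differ by one character.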
Cryptographic hash functions are one-way operations. Given only the hash output, there is no practical way to recover the original input other than guessing candidate inputs and hashing each one. This irreversibility is fundamental to most security applications of hashing. It means you can verify that someone knows a password without storing the password itself, and you can confirm that a file has not been modified without storing the original file for comparison.
MD5 and why it should not be used for security
MD5 produces a 128-bit hash represented as 32 hexadecimal characters. It was widely used through the 1990s and early 2000s for checksums, password storage and digital signatures. It is fast to compute and the output format is compact. These properties made it popular when it was introduced.
MD5 is now considered cryptographically broken. Researchers have demonstrated practical collision attacks, meaning it is possible to construct two different inputs that produce the same MD5 hash. This breaks any application that relies on an MD5 hash being unique to a specific input, such as digital signatures. Password storage has a separate problem: MD5 is so fast that unsalted MD5 password databases are vulnerable to precomputed rainbow table attacks, and even salted ones fall quickly to GPU-accelerated brute force. MD5 should not be used for any new security-related application.
Where MD5 remains useful is for non-security checksums where collision resistance is not required. Verifying that a large file transferred without corruption, checking whether a cached file has changed, or generating a quick fingerprint for deduplication are all appropriate uses because the adversarial threat model does not apply.
SHA-1 and its deprecation
SHA-1 produces a 160-bit hash represented as 40 hexadecimal characters. It was the successor to MD5 and addressed some of its weaknesses. For many years SHA-1 was the standard for SSL/TLS certificates and code signing, and Git adopted it for object identifiers.
SHA-1 was deprecated for security-critical applications after theoretical attacks were demonstrated and eventually a practical collision was computed in 2017. Major browsers stopped accepting SHA-1 certificates. Certificate authorities stopped issuing them. For security purposes, SHA-1 is in the same category as MD5: broken and unsuitable for new use.
Git uses SHA-1 for its object identifiers but in a context where the threat model is different from most cryptographic uses. The content-addressed nature of Git means a collision would require both inputs to produce valid Git objects, which is a harder constraint than a general collision. Git has been migrating toward SHA-256 as an option for new repositories.
SHA-256 and the SHA-2 family
SHA-256 produces a 256-bit hash and is part of the SHA-2 family, which also includes SHA-224, SHA-384 and SHA-512. SHA-256 is the current recommended general-purpose hash function for most applications. No practical attacks against SHA-2 have been demonstrated, and it is the standard for TLS certificates, code signing, cryptocurrency applications and most modern security protocols.
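To make the output sizes mentioned so far concrete, here is a small sketch that asks Python's hashlib for the digest size of each algorithm. The ALGORITHMS tuple and the sizes dictionary are names chosen for this example, and it assumes a Python build where MD5 and SHA-1 are enabled, which may not hold on FIPS-restricted systems.

```python
import hashlib

# Algorithms discussed in this article, by their hashlib names.
ALGORITHMS = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")

# Map each algorithm name to its digest size in bits.
sizes = {}
for name in ALGORITHMS:
    h = hashlib.new(name, b"example")
    sizes[name] = h.digest_size * 8
    # Each byte of the digest is written as two hexadecimal characters.
    print(f"{name:7} {sizes[name]:4} bits  {len(h.hexdigest()):3} hex chars")
```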
SHA-256 is typically somewhat slower to compute in software than MD5 or SHA-1, but it is still a fast hash, and that speed is a liability for password hashing. Password hashing needs to be deliberately slow to make brute-force attacks expensive, and modern GPUs can compute billions of SHA-256 hashes per second. For password storage, SHA-256 alone is therefore not sufficient. Use a function designed specifically for passwords, such as bcrypt, scrypt or Argon2, which combines a per-password salt with a tunable, deliberately expensive computation.
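One way to sketch salted, deliberately slow password hashing is with hashlib.scrypt from the Python standard library, which needs a Python build linked against OpenSSL with scrypt support. The function names hash_password and verify_password and the cost parameters below are illustrative assumptions, not recommendations for any specific system.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumed values); tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; the password itself is not kept."""
    salt = os.urandom(16)  # unique random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))
print(verify_password("wrong guess", salt, key))
```

Because the salt is random, hashing the same password twice yields different stored values, which is what defeats precomputed tables.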
Practical uses for hashing
File integrity verification is one of the most common practical uses. Software downloads often include a hash of the file alongside the download link. After downloading, computing the hash of the downloaded file and comparing it to the published hash confirms the file was not modified in transit or storage. This detects accidental corruption. It also detects deliberate tampering, provided the published hash itself comes from a source the attacker cannot alter, such as an HTTPS page or a signed release note.
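The check described above might look roughly like this in Python. The function names file_sha256 and matches_published_hash are hypothetical, and SHA-256 is assumed as the published algorithm.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the computed digest with the published one, ignoring case and whitespace."""
    return file_sha256(path) == published_hex.strip().lower()
```

A typical call would pass the path of the downloaded file and the hex string copied from the download page.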
Content-based deduplication uses hashes to identify identical files without comparing them byte by byte. Two files with the same hash are almost certainly identical. Scanning a large collection of files for duplicates by comparing their hashes is much faster than comparing file contents directly.
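A simple version of hash-based duplicate detection can be sketched as follows. The function name find_duplicates is an assumption made for this example, and the sketch reads each file fully into memory, which a real tool would likely replace with chunked reads.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> list[list[Path]]:
    """Group files under root by SHA-256 digest and return groups of two or more."""
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each file is hashed once, so the work grows with the number of files rather than with the number of file pairs.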
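ETags in HTTP caching are hashes or hash-like identifiers of resource content. When a resource changes, its ETag changes. Browsers and proxies send the ETag they have stored in an If-None-Match request header, and if it still matches, the server replies with 304 Not Modified instead of the full resource. This lets a cache confirm its content is still valid without downloading it again.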
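The server side of that exchange could be sketched as below. This is a simplified illustration: make_etag and respond are hypothetical names, the ETag is assumed to be a SHA-256 digest of the body, and real servers also handle weak validators, comma-separated lists and the `*` value in the client's If-None-Match header.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Build a strong ETag: the hex digest of the body wrapped in double quotes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, bytes]:
    """Return (status, payload): 304 with an empty payload if the client's copy is current."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match.strip() == etag:
        return 304, b""
    return 200, body


page = b"<h1>Hello</h1>"
status, payload = respond(page, None)                 # first visit: full body
cached_status, _ = respond(page, make_etag(page))     # revalidation: unchanged
stale_status, _ = respond(b"<h1>Hi</h1>", make_etag(page))  # content changed
print(status, cached_status, stale_status)
```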
- Open the Hash Generator below.
- Paste the text or data you want to hash.
- Select the hash algorithm: MD5, SHA-1, SHA-256 or others.
- Copy the resulting hash for your use.
Generate MD5, SHA-1, SHA-256 and other hash values instantly.
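For scripted use, the same steps can be approximated with Python's hashlib. This is only a sketch under stated assumptions: generate_hash and SUPPORTED are hypothetical names, it is not how the Hash Generator itself is implemented, and it assumes text is encoded as UTF-8 before hashing.

```python
import hashlib

# Algorithms offered by this sketch (an assumed subset for illustration).
SUPPORTED = ("md5", "sha1", "sha256", "sha512")


def generate_hash(text: str, algorithm: str = "sha256") -> str:
    """Hash UTF-8 text with the chosen algorithm and return the hex digest."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


for name in SUPPORTED:
    print(name, generate_hash("paste your text here", name))
```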