What is MD5?

June 21, 2020 Security and Compliance, MOVEit

MD5 is still being used today as a hash function even though it has been exploited for years. In this article, we discuss what MD5 is, its history, and how it is used today.

Does your content management system rely on the MD5 hashing scheme for securing and storing passwords? It's time to check!        

ZDNet reports more than 25 percent of the major CMS systems use the old and outdated MD5 hashing scheme as the default for securing and storing user passwords. Unless users change the default settings by modifying the CMS source code, any websites running on the CMS are placing user passwords at risk if a hacker breaches the site database.

Initially created in 1991 by cryptographer and MIT professor Ronald Rivest, MD5 is technically known as the Message-Digest Algorithm. As a hash function, MD5 maps data of any size to a fixed-size bit string called the hash value. Hash functions vary in complexity and difficulty and are used for cryptocurrency, password security, and message security.

Following in the footsteps of MD2 and MD4, MD5 produces a 128-bit hash value. Its main purpose is to verify that a file has been unaltered. Instead of confirming that two sets of data are identical by comparing the raw data, MD5 does this by producing a checksum on both sets and then comparing the checksums to verify that they're the same.
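The checksum comparison described above can be sketched in a few lines of Python using the standard library's hashlib module. The function names here (md5_checksum, file_md5, same_content) are illustrative assumptions, not part of any product or API mentioned in this article.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string (128 bits)."""
    return hashlib.md5(data).hexdigest()


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(a: bytes, b: bytes) -> bool:
    """Compare two byte strings by checksum rather than by raw content."""
    return md5_checksum(a) == md5_checksum(b)
```

Matching checksums only indicate that the data was not changed by accident; they say nothing about deliberate tampering.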

 

Historical Breaches That Exploited MD5

The weaknesses of MD5 have been exploited in the field. One of the more infamous occurrences took place several years ago with Flame malware, which impacted many of the world's largest companies. According to eWeek, a known weakness in the MD5 hash function gave the group of threat actors behind the Flame malware the ability to forge a valid certificate for Microsoft's Windows Update service.

To verify the potential damage, Venafi, a certificate-management firm, scanned 450 companies in the Global 2000 at that time. They found that every single one had MD5 certificates associated with their networks. In total, more than 17 percent of the certificates used to sign servers, code, and VPN access still used the MD5 algorithm.

Two of the biggest data breaches of all time also involved MD5. In 2013, a data breach allegedly originating from social website Badoo was found in circulation. The breach contained 112 million unique email addresses and personal information like names, birthdates, and passwords stored as MD5 hashes. And in 2016, Youku, a Chinese video service, exposed 92 million unique user accounts and MD5 password hashes.

More recently, InfoSecurity Magazine reported last year that data belonging to 817,000 RuneScape subscribers to bot provider EpicBot was uploaded to the same hacking forums from a previous breach at the firm. Compromised details included usernames, email addresses, IP addresses, and passwords stored as either salted MD5 or bcrypt hashes.

Valid Uses for MD5 Remain

Although it was designed as a cryptographic hash function, MD5 suffers from extensive vulnerabilities, which is why you want to stay away from it when it comes to protecting your CMS, web framework, and other systems that use passwords for granting access. A secure cryptographic hash function must make it computationally infeasible to find two distinct messages that hash to the same value. MD5 fails this requirement: such collisions can potentially be found in seconds.

Despite breaches like those described above, MD5 can still be used for standard file verifications and as a checksum to verify data integrity, but only against unintentional corruption. It also remains suitable for other non-cryptographic purposes, such as determining the partition for a particular key in a partitioned database.
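As a minimal sketch of the partitioning use case mentioned above, the following Python function maps a key to a partition number. The function name partition_for_key and the choice to use the first 8 bytes of the digest are assumptions made for illustration; real partitioned databases each define their own scheme.

```python
import hashlib


def partition_for_key(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in the range [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the 16-byte digest as an unsigned
    # integer (an arbitrary choice for this sketch), then reduce modulo
    # the number of partitions.
    return int.from_bytes(digest[:8], "big") % num_partitions
```

The only property needed here is that the same key always lands on the same partition and that keys spread out evenly. No security depends on the hash, which is why MD5's collision weakness does not matter for this purpose.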

Staying Away Still a Good Idea

Over the years, as MD5 gained widespread use but proved to be vulnerable, the MD6 hashing algorithm emerged. But MD6 went relatively unused and faded into obscurity, perhaps due to the doubts people had about MD5. Freely available alternatives include highly complex systems like SHA-2 and SHA-3 as well as bcrypt, scrypt, Argon2, CABHA, Whirlpool, and RIPEMD-160.
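For password storage specifically, one of the alternatives named above, scrypt, ships with Python's standard hashlib module (it requires a Python build linked against OpenSSL 1.1 or later). The sketch below is illustrative only: the function names and the cost parameters (n=2**14, r=8, p=1) are assumptions chosen for the example, not recommendations from the sources cited in this article.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed cost parameters for this sketch; tune them for your own hardware.
SCRYPT_N = 2**14  # CPU/memory cost
SCRYPT_R = 8      # block size
SCRYPT_P = 1      # parallelization


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for a password using salted scrypt."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Unlike a bare MD5 hash, this approach is deliberately slow and memory-hungry, and the per-password salt means identical passwords do not produce identical stored values.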

As you ponder the likelihood of a hashing attack on one of your systems, it's important to note that even with MD5, the odds are heavily in your favor. A hash attack can only occur when two separate inputs generate the same hash output. Because hash functions accept inputs of any length but produce outputs of a fixed length, collisions must exist, but it is rare for one to occur by chance. The longer the hash value, the lower the possibility of a hash attack.
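To put rough numbers on how rare a chance collision is, the standard birthday approximation can be used: for k inputs and an n-bit hash, the probability of at least one collision is about 1 - exp(-k(k-1) / 2^(n+1)). The sketch below assumes hash outputs are uniformly random, so it describes accidental collisions only, not deliberate attacks of the kind that break MD5. The function name collision_probability is hypothetical.

```python
import math


def collision_probability(num_inputs: int, hash_bits: int) -> float:
    """Approximate chance that any two of num_inputs random inputs collide.

    Uses the birthday approximation 1 - exp(-pairs / 2**hash_bits),
    which assumes uniformly distributed hash outputs.
    """
    pairs = num_inputs * (num_inputs - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2.0**hash_bits)
```

For a 128-bit hash such as MD5, even a billion random inputs give a chance collision probability far below one in a quintillion, while a 32-bit hash reaches roughly even odds at about 77,000 inputs. This is the sense in which a longer hash value lowers the odds of a collision.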

But as engineers at the Carnegie Mellon University Software Engineering Institute warn, software developers, certification authorities and website owners should all avoid using the MD5 algorithm in any capacity. As previous research has demonstrated, "it should be considered cryptographically broken and unsuitable for further use."

It's also clear that cybercriminals will continue to quickly adopt attacks against any systems they come across that use MD5. The continued use of the broken cryptographic hash algorithm exposes your company to a risk that's not worth taking.

 

Greg Mooney

Greg is a technologist and data geek with over 10 years in tech. He has worked in a variety of industries as an IT manager and software tester. Greg is an avid writer on everything IT related, from cyber security to troubleshooting.
