Key derivation function

Last updated October 21, 2024

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function (which typically uses a cryptographic hash function or block cipher).^[1]^[2]^[3] KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.^[4]

History

The first^{[ citation needed ]} deliberately slow (key stretching) password-based key derivation function was called "crypt" (or "crypt(3)" after its man page), and was invented by Robert Morris in 1978. It would encrypt a constant (zero), using the first 8 characters of the user's password as the key, by performing 25 iterations of a modified DES encryption algorithm (in which a 12-bit number read from the real-time computer clock is used to perturb the calculations). The resulting 64-bit number is encoded as 11 printable characters and then stored in the Unix password file.^[5] While it was a great advance at the time, increases in processor speeds since the PDP-11 era have made brute-force attacks against crypt feasible, and advances in storage have rendered the 12-bit salt inadequate. The crypt function's design also limits the user password to 8 characters, which limits the keyspace and makes strong passphrases impossible.^{[ citation needed ]}

Although high throughput is a desirable property in general-purpose hash functions, the opposite is true in password security applications in which defending against brute-force cracking is a primary concern. The growing use of massively-parallel hardware such as GPUs, FPGAs, and even ASICs for brute-force cracking has made the selection of a suitable algorithms even more critical because the good algorithm should not only enforce a certain amount of computational cost not only on CPUs, but also resist the cost/performance advantages of modern massively-parallel platforms for such tasks. Various algorithms have been designed specifically for this purpose, including bcrypt, scrypt and, more recently, Lyra2 and Argon2 (the latter being the winner of the Password Hashing Competition). The large-scale Ashley Madison data breach in which roughly 36 million passwords hashes were stolen by attackers illustrated the importance of algorithm selection in securing passwords. Although bcrypt was employed to protect the hashes (making large scale brute-force cracking expensive and time-consuming), a significant portion of the accounts in the compromised data also contained a password hash based on the fast general-purpose MD5 algorithm, which made it possible for over 11 million of the passwords to be cracked in a matter of weeks.^[6]

In June 2017, The U.S. National Institute of Standards and Technology (NIST) issued a new revision of their digital authentication guidelines, NIST SP 800-63B-3,^[7]^: 5.1.1.2 stating that: "Verifiers SHALL store memorized secrets [i.e. passwords] in a form that is resistant to offline attacks. Memorized secrets SHALL be salted and hashed using a suitable one-way key derivation function. Key derivation functions take a password, a salt, and a cost factor as inputs then generate a password hash. Their purpose is to make each password guessing trial by an attacker who has obtained a password hash file expensive and therefore the cost of a guessing attack high or prohibitive."

Modern password-based key derivation functions, such as PBKDF2,^[2] are based on a recognized cryptographic hash, such as SHA-2, use more salt (at least 64 bits and chosen randomly) and a high iteration count. NIST recommends a minimum iteration count of 10,000.^[7]^: 5.1.1.2 "For especially critical keys, or for very powerful systems or systems where user-perceived performance is not critical, an iteration count of 10,000,000 may be appropriate.” ^[8]^: 5.2

Key derivation

The original use for a KDF is key derivation, the generation of keys from secret passwords or passphrases. Variations on this theme include:

In conjunction with non-secret parameters to derive one or more keys from a common secret value (which is sometimes also referred to as "key diversification"). Such use may prevent an attacker who obtains a derived key from learning useful information about either the input secret value or any of the other derived keys. A KDF may also be used to ensure that derived keys have other desirable properties, such as avoiding "weak keys" in some specific encryption systems.
As components of multiparty key-agreement protocols. Examples of such key derivation functions include KDF1, defined in IEEE Std 1363-2000, and similar functions in ANSI X9.42.
To derive keys from secret passwords or passphrases (a password-based KDF).
To derive keys of different length from the ones provided. KDFs designed for this purpose include HKDF and SSKDF. These take an 'info' bit string as an additional optional 'info' parameter, which may be crucial to bind the derived key material to application- and context-specific information.^[9]
Key stretching and key strengthening.

Key stretching and key strengthening

Key derivation functions are also used in applications to derive keys from secret passwords or passphrases, which typically do not have the desired properties to be used directly as cryptographic keys. In such applications, it is generally recommended that the key derivation function be made deliberately slow so as to frustrate brute-force attack or dictionary attack on the password or passphrase input value.

Such use may be expressed as $DK = KDF(key, salt, iterations)$ , where $DK$ is the derived key, $KDF$ is the key derivation function, $key$ is the original key or password, $salt$ is a random number which acts as cryptographic salt, and $iterations$ refers to the number of iterations of a sub-function. The derived key is used instead of the original key or password as the key to the system. The values of the salt and the number of iterations (if it is not fixed) are stored with the hashed password or sent as cleartext (unencrypted) with an encrypted message.^[10]

The difficulty of a brute force attack is increased with the number of iterations. A practical limit on the iteration count is the unwillingness of users to tolerate a perceptible delay in logging into a computer or seeing a decrypted message. The use of salt prevents the attackers from precomputing a dictionary of derived keys.^[10]

An alternative approach, called key strengthening, extends the key with a random salt, but then (unlike in key stretching) securely deletes the salt.^[11] This forces both the attacker and legitimate users to perform a brute-force search for the salt value.^[12] Although the paper that introduced key stretching^[13] referred to this earlier technique and intentionally chose a different name, the term "key strengthening" is now often (arguably incorrectly) used to refer to key stretching.

Password hashing

Despite their original use for key derivation, KDFs are possibly better known for their use in password hashing (password verification by hash comparison), as used by the passwd file or shadow password file. Password hash functions should be relatively expensive to calculate in case of brute-force attacks, and the key stretching of KDFs happen to provide this characteristic.^{[ citation needed ]} The non-secret parameters are called "salt" in this context.

In 2013 a Password Hashing Competition was announced to choose a new, standard algorithm for password hashing. On 20 July 2015 the competition ended and Argon2 was announced as the final winner. Four other algorithms received special recognition: Catena, Lyra2, Makwa and yescrypt.^[14]

As of May 2023, OWASP recommends the following KDFs for password hashing, listed in order of priority:^[15]

Argon2id
scrypt if Argon2id is unavailable
bcrypt for legacy systems
PBKDF2 if FIPS-140 compliance is required

Related Research Articles

A key in cryptography is a piece of information, usually a string of numbers or letters that are stored in a file, which, when processed through a cryptographic algorithm, can encode or decode cryptographic data. Based on the used method, the key can be different sizes and varieties, but in all cases, the strength of the encryption relies on the security of the key being maintained. A key's security strength is dependent on its algorithm, the size of the key, the generation of the key, and the process of key exchange.

In cryptography, a brute-force attack consists of an attacker submitting many passwords or passphrases with the hope of eventually guessing correctly. The attacker systematically checks all possible passwords and passphrases until the correct one is found. Alternatively, the attacker can attempt to guess the key which is typically created from the password using a key derivation function. This is known as an exhaustive key search. This approach doesn't depend on intellectual tactics; rather, it relies on making several attempts.

In cryptanalysis and computer security, a dictionary attack is an attack using a restricted subset of a keyspace to defeat a cipher or authentication mechanism by trying to determine its decryption key or passphrase, sometimes trying thousands or millions of likely possibilities often obtained from lists of past security breaches.

A passphrase is a sequence of words or other text used to control access to a computer system, program or data. It is similar to a password in usage, but a passphrase is generally longer for added security. Passphrases are often used to control both access to, and the operation of, cryptographic programs and systems, especially those that derive an encryption key from a passphrase. The origin of the term is by analogy with password. The modern concept of passphrases is believed to have been invented by Sigmund N. Porter in 1982.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also referred to as a cryptographic random number generator (CRNG).

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

In cryptanalysis and computer security, password cracking is the process of guessing passwords protecting a computer system. A common approach is to repeatedly try guesses for the password and to check them against an available cryptographic hash of the password. Another type of approach is password spraying, which is often automated and occurs slowly over time in order to remain undetected, using a list of common passwords.

In cryptography, a salt is random data fed as an additional input to a one-way function that hashes data, a password or passphrase. Salting helps defend against attacks that use precomputed tables, by vastly growing the size of table needed for a successful attack. It also helps protect passwords that occur multiple times in a database, as a new salt is used for each password instance. Additionally, salting does not place any burden on users.

A random password generator is a software program or hardware device that takes input from a random or pseudo-random number generator and automatically generates a password. Random passwords can be generated manually, using simple sources of randomness such as dice or coins, or they can be generated using a computer.

In cryptography, PBKDF1 and PBKDF2 are key derivation functions with a sliding computational cost, used to reduce vulnerability to brute-force attacks.

A rainbow table is a precomputed table for caching the outputs of a cryptographic hash function, usually for cracking password hashes. Passwords are typically stored not in plain text form, but as hash values. If such a database of hashed passwords falls into the hands of attackers, they can use a precomputed rainbow table to recover the plaintext passwords. A common defense against this attack is to compute the hashes using a key derivation function that adds a "salt" to each password before hashing it, with different passwords receiving different salts, which are stored in plain text along with the hash.

In cryptography, key stretching techniques are used to make a possibly weak key, typically a password or passphrase, more secure against a brute-force attack by increasing the resources it takes to test each possible key. Passwords or passphrases created by humans are often short or predictable enough to allow password cracking, and key stretching is intended to make such attacks more difficult by complicating a basic step of trying a single password candidate. Key stretching also improves security in some real-world applications where the key length has been constrained, by mimicking a longer key length from the perspective of a brute-force attacker.

In cryptography, a pre-shared key (PSK) is a shared secret which was previously shared between the two parties using some secure channel before it needs to be used.

bcrypt is a password-hashing function designed by Niels Provos and David Mazières, based on the Blowfish cipher and presented at USENIX in 1999. Besides incorporating a salt to protect against rainbow table attacks, bcrypt is an adaptive function: over time, the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computation power.

The following outline is provided as an overview of and topical guide to cryptography:

HKDF is a simple key derivation function (KDF) based on the HMAC message authentication code. It was initially proposed by its authors as a building block in various protocols and applications, as well as to discourage the proliferation of multiple KDF mechanisms. The main approach HKDF follows is the "extract-then-expand" paradigm, where the KDF logically consists of two modules: the first stage takes the input keying material and "extracts" from it a fixed-length pseudorandom key, and then the second stage "expands" this key into several additional pseudorandom keys.

In cryptography, scrypt is a password-based key derivation function created by Colin Percival in March 2009, originally for the Tarsnap online backup service. The algorithm was specifically designed to make it costly to perform large-scale custom hardware attacks by requiring large amounts of memory. In 2016, the scrypt algorithm was published by IETF as RFC 7914. A simplified version of scrypt is used as a proof-of-work scheme by a number of cryptocurrencies, first implemented by an anonymous programmer called ArtForz in Tenebrix and followed by Fairbrix and Litecoin soon after.

Microsoft Office password protection is a security feature that allows Microsoft Office documents to be protected with a user-provided password.

crypt is a POSIX C library function. It is typically used to compute the hash of user account passwords. The function outputs a text string which also encodes the salt, and identifies the hash algorithm used. This output string forms a password record, which is usually stored in a text file.

In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate in some other medium, such as a Hardware Security Module. Note that the National Institute of Standards and Technology refers to this value as a secret key rather than a pepper. A pepper is similar in concept to a salt or an encryption key. It is like a salt in that it is a randomized value that is added to a password hash, and it is similar to an encryption key in that it should be kept secret.

References

↑ Bezzi, Michele; et al. (2011). "Data privacy". In Camenisch, Jan; et al. (eds.). Privacy and Identity Management for Life. Springer. pp. 185–186. ISBN 9783642203176.
1 2 B. Kaliski; A. Rusch (January 2017). K. Moriarty (ed.). PKCS #5: Password-Based Cryptography Specification Version 2.1. Internet Engineering Task Force (IETF). doi: 10.17487/RFC8018 . ISSN 2070-1721. RFC 8018.Informational. Obsoletes RFC 2898. Updated by RFC 9579.
↑ Chen, Lily (October 2009). "NIST SP 800-108: Recommendation for Key Derivation Using Pseudorandom Functions". NIST.
↑ Zdziarski, Jonathan (2012). Hacking and Securing IOS Applications: Stealing Data, Hijacking Software, and How to Prevent It. O'Reilly Media. pp. 252–253. ISBN 9781449318741.
↑ Morris, Robert; Thompson, Ken (3 April 1978). "Password Security: A Case History". Bell Laboratories. Archived from the original on 22 March 2003. Retrieved 9 May 2011.
↑ Goodin, Dan (10 September 2015). "Once seen as bulletproof, 11 million+ Ashley Madison passwords already cracked". Ars Technica . Retrieved 10 September 2015.
1 2 Grassi Paul A. (June 2017). SP 800-63B-3 – Digital Identity Guidelines, Authentication and Lifecycle Management. NIST. doi:10.6028/NIST.SP.800-63b.
↑ Meltem Sönmez Turan; Elaine Barker; William Burr; Lily Chen (December 2010). SP 800-132 – Recommendation for Password-Based Key Derivation, Part 1: Storage Applications (PDF). NIST. doi:10.6028/NIST.SP.800-132. S2CID 56801929.
↑ Krawczyk, Hugo; Eronen, Pasi (May 2010). "The 'info' Input to HKDF". datatracker.ietf.org. RFC 5869 (2010)
1 2 "Salted Password Hashing – Doing it Right". CrackStation.net. Retrieved 29 January 2015.
↑ Abadi, Martın, T. Mark A. Lomas, and Roger Needham. "Strengthening passwords." Digital System Research Center, Tech. Rep 33 (1997): 1997.
↑ U. Manber, "A Simple Scheme to Make Passwords Based on One-Way Functions Much Harder to Crack," Computers & Security, v.15, n.2, 1996, pp.171–176.
↑ Secure Applications of Low-Entropy Keys, J. Kelsey, B. Schneier, C. Hall, and D. Wagner (1997)
↑ "Password Hashing Competition"
↑ "Password Storage Cheat Sheet". OWASP Cheat Sheet Series. OWASP. Retrieved 17 May 2023.