Open access peer-reviewed chapter - ONLINE FIRST

Revocable Crypto-Biometric Key Regeneration Using Face Biometrics, Fuzzy Commitment, and a Cohort Bit Selection Process

Written By

Mohamed Amine Hmani, Dijana Petrovska-Delacretaz and Bernadette Dorizzi

Reviewed: 23 October 2023 Published: 03 April 2024

DOI: 10.5772/intechopen.1003710


From the Edited Volume

Biometrics and Cryptography [Working Title]

Dr. Sudhakar Radhakrishnan


Abstract

Security is a major concern nowadays, and cryptography and biometrics are two of its main pillars. Using biometrics to obtain cryptographic keys offers distinct advantages over traditional methods. Classical systems rely on passwords or tokens assigned by administrators, which can be stolen or shared, making them insufficient for identity verification. In contrast, biometric-based keys provide a better solution for proving a user’s identity. This chapter proposes an approach to regenerate crypto-biometric keys with high entropy, ensuring high security, using facial biometrics. The keys are regenerated with a fuzzy commitment scheme using BCH error-correcting codes and have a high entropy of 528 bits. To achieve this, we use an intra-inter variance strategy to select bits from our deep facial binary embeddings. The system is evaluated on the MOBIO dataset and gives 0% FAR and less than 1% FRR. The proposed crypto-biometric keys are resistant to quantum computing algorithms, provide non-repudiation, are revocable, and are convenient to use thanks to their low false rejection rates.

Keywords

  • face biometrics
  • binarization
  • crypto-biometrics
  • key regeneration
  • key binding
  • fuzzy commitment
  • biometric cryptosystems
  • revocable crypto-biometric keys

1. Introduction

Cryptography has assumed an increasingly vital role in our society, driven by the digitization of all aspects of life and the escalating need to safeguard data and transactions across various domains such as telecommunications, healthcare, financial dealings, and even the protection of personal privacy, including cryptocurrencies. Cryptography principally relies on cryptographic keys to either encrypt data or create digital signatures, ensuring both data authenticity and integrity.

Nonetheless, cryptography faces certain challenges. One of its looming threats emanates from the advent of quantum computers. Quantum computer prototypes are becoming progressively more efficient and advanced, with the field of quantum computing progressing rapidly [1]. Projections suggest that within the next two decades, quantum computers could render current cryptographic methodologies obsolete. In this context, recommendations for post-quantum cryptographic algorithms, encompassing symmetric encryption, symmetric authentication, public-key encryption, and public-key signatures founded on long-term security, have been outlined in Ref. [2].

Beyond the quantum threat, cryptographic keys pose their own set of inherent drawbacks. The sheer length of these keys precludes memorization, necessitating their storage, which introduces the additional risk of unauthorized key duplication or, worse yet, key loss resulting in the forfeiture of encrypted data. Traditional cryptographic keys also fail to guarantee non-repudiation, that is, the assurance that only the rightful owner uses the key, thereby enabling digital identity sharing and potential liability denial.

Biometrics emerges as a potential solution to this predicament. By employing biometrics, the issue of non-repudiation can be resolved. However, this approach gives rise to a new problem: the irrevocability of biometric keys. It is impossible to arbitrarily alter an individual’s biometric data, so the same biometric template is perpetually generated for a given user.

Hence, our efforts focus on devising methods to create crypto-biometric keys derived from users’ facial biometric data. We chose to focus on facial biometrics, given their widespread accessibility, as nearly every device nowadays is equipped with a camera. Employing a crypto-biometric key guarantees non-repudiation and eliminates the need for users to memorize the key, as it is reconstructed at the time of system usage.

The goal of this work is to obtain post-quantum crypto-biometric symmetric keys. This goal has multiple requirements:

  1. resistance to quantum computing algorithms;

  2. non-repudiation;

  3. cancellability (revocability);

  4. convenience, that is, a low false rejection rate (FRR) at the required security level.

The rest of this chapter is structured as follows. In Section 2, we present an overview of related works in crypto-biometric key regeneration. In Section 3, we introduce the key regeneration scheme used in the proposed system. In Section 4, we provide the performance of the system. Before concluding in Section 6, we present the security analysis of the system in Section 5.


2. Related works

The concept of crypto-biometric keys appeared toward the end of the twentieth century. The first works suffered from low entropy and non-negligible reconstruction error. First, they demanded highly consistent biometric samples from the same user, which proved challenging due to the environmental and physiological variations inherent to biometric data. Second, only one key was associated with a user’s biometric sample, rendering the sample useless if the key was compromised, especially when periodic key updates were necessary.

To overcome these challenges, subsequent methods were introduced that did not require the storage of biometric templates or keys in databases, using techniques such as key generation and key regeneration. Key generation is the process of deriving a key directly from biometrics. The keys can vary from one session to another, resulting in high FRR. As for key regeneration (also known as key binding [3]), the idea is that a randomly generated key is combined with the biometric data using cryptographic techniques, and the key is later retrieved from the combined data at the time of verification. Sometimes, key generation and regeneration are used interchangeably, as the standards of crypto-biometric systems are not fully established.

In this section, we present related works in crypto-biometric key regeneration, focusing mainly on schemes using fuzzy commitment.

In 2007, Chen et al. introduced a method for regenerating cryptographic keys from facial biometrics in Ref. [4], utilizing entropy-based feature extraction and Reed–Solomon error-correcting codes. This technique proved capable of reliably producing keys suitable for 128-bit AES encryption when evaluated using 3D face data.

In 2010, Kanade et al. [5] proposed an iris biometric system with a key regeneration scheme based on fuzzy commitment use. Initially, a key is randomly generated and encoded into a pseudocode using error-correcting codes (ECCs). Subsequently, a cancellable transformation is applied to the user’s reference biometric data, and this transformed data is XORed with the pseudocode to produce a locked code.

In 2018, Panchal et al. [6] introduced an innovative method for creating a threshold-free cryptographic biometric key, referred to as the crypto-biometric key. They achieved this by combining all bits from three different sets of straight-line attributes obtained from three types of minutiae. Additionally, they employed Reed–Solomon encoding to generate a code word, which served the purpose of encryption and facilitated the retrieval of the user’s crypto-biometric key. Notably, the authors asserted that their approach eliminated the need to store either the user’s biometric template or the generated key. However, Ref. [7] states that the three feature sets may be the same set and that the approach risks disclosing the fingerprint features. An attacker can use the user’s ciphertext to separate the code word from the final ciphertext and then obtain the features from the code word via Reed–Solomon decoding. Finally, the attacker may successfully reconstruct the user’s crypto-biometric key from the line attributes if the parameters for substitution or expansion can be obtained.

In 2018, Anees et al. [8] developed a unified framework for regenerating 256-bit cryptographic keys based on facial features. Their framework comprises three modules: learning the facial features, quantizing the facial features, and generating the key. In their application, the features of a subject are extracted under controlled illumination with minimal variations in pose and expression, both at enrolment and at key regeneration.

In 2019, Panchal et al. [9] introduced a biometric code word generation technique based on the convolution coding principle using fingerprints. This method is capable of producing a distinct code word, and the hash value derived from this code word, in combination with specific private parameters, can serve as a cryptographic 1274-bit key.

In 2021, Wang et al. [7] proposed a new crypto-biometric key regeneration approach based on the generated interval scheme with a two-layer error-correcting technique for fingerprints. The key size in the proposed approach is adjustable between 120 and 168 bits long for the extracted minutia numbers with low computational cost.

Chang et al. [10] introduced a multi-biometric fusion system named BIOFUSE. They leverage the fuzzy commitment and fuzzy vault schemes to regenerate crypto-biometric keys of length 256 bits, combining face, iris, and fingerprint features.

Key regeneration can also function to verify the identity of the user; if the key is reconstructed correctly, the user is verified. Mai et al. [11] proposed a randomized convolutional neural network (CNN) to regenerate protected face templates given a facial image and a user-specific key. Their system was focused on biometric verification and provides 0.19% FRR at 0.1% FAR on the FRGC database.

In 2022, Elrefaei et al. [12] proposed a fuzzy commitment scheme using gait-based biometrics to secure the feature templates. They leverage Bose–Chaudhuri–Hocquenghem (BCH) codes to encode the key in the enrolment phase and decode it in the verification phase. They obtain 0% FAR and FRR on the CMU MoBo and CASIA-A databases with key lengths of 50 bits.

Recently, in Ref. [13], Lin and Chen proposed an error-correction-based iris recognition scheme that provides secure template storage and a high level of accuracy for personal authentication with a security of 110 bits, meaning that an intruder requires 2^110 attempts to access the system.

The previously mentioned works either focus on biometric modalities other than the face or have low crypto-biometric key sizes. Furthermore, in the works focusing on face biometrics, the testing databases had ideal conditions (illumination, pose), which reduces the convenience of the system and forces the user to comply with the system requirements. Moreover, the fuzzy commitment scheme has been criticized for being vulnerable to correlation attacks and providing insufficient template security [14, 15]. Keller et al. [15], in particular, focused on the security of fuzzy commitment schemes based on biometric templates produced by deep learning.

In this chapter, we present a key regeneration approach that aims to regenerate long keys from facial data. Our focus is on obtaining keys longer than 400 bits from facial biometrics that work in non-controlled environments. We employ a modified fuzzy commitment scheme that prevents template correlation and reconstruction. The following section presents the key regeneration scheme used in our system.


3. Crypto-biometric key regeneration with fuzzy commitment

In this section, we present our proposed key regeneration scheme. The system comprises two steps: (i) extracting a binary template from face biometrics and (ii) regenerating the crypto-biometric key using a fuzzy commitment scheme that combines the face binary template, protected with a shuffling scheme, and error-correcting codes. The first step is described in our paper [16] and summarized in Section 3.2; the challenge there is to obtain long binary representations with high entropy and high biometric performance. These long binary representations are the basis for the second step, where the challenge is to regenerate the crypto-biometric keys without degrading the entropy and with a low FRR.

The system proposed in this section is based on the fuzzy commitment scheme presented in Figures 1 and 2. The fuzzy commitment scheme was first introduced in 1999 by Juels and Wattenberg [17]. A random key is encoded using error-correcting codes (ECCs) and is then XORed with the biometric data. The XORed data is cryptographically secure because neither the key nor the biometric data can be obtained from it without providing one of the two. The random key is retrieved at the time of key regeneration by providing fresh biometric data. This system requires ordered biometric data in binary form. In this scheme, the variability in the biometric data from one acquisition to another is treated as noise. This noise causes errors in the data, which are corrected using ECC.

Figure 1.

Enrolment phase of the fuzzy commitment scheme used in the key regeneration.

Figure 2.

Regeneration phase of the fuzzy commitment scheme.
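To make the commit/retrieve data flow just described concrete, here is a minimal Python sketch in which a simple repetition code stands in for the BCH codes used later in the chapter; all names and sizes are illustrative, not the authors’ implementation:

```python
import hashlib
import secrets

R = 5  # repetition factor: each key bit is repeated R times (toy ECC;
       # the chapter uses BCH codes instead)

def ecc_encode(bits):
    return [b for b in bits for _ in range(R)]

def ecc_decode(bits):
    # Majority vote over each group of R bits corrects up to 2 flips per group.
    return [int(sum(bits[i:i + R]) > R // 2) for i in range(0, len(bits), R)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

# Commit (enrolment): lock a random key K under a binary biometric template B.
K = [secrets.randbelow(2) for _ in range(16)]
B = [secrets.randbelow(2) for _ in range(16 * R)]
locked = xor(ecc_encode(K), B)                      # stored helper data
HK = hashlib.sha256(bytes(K)).hexdigest()           # stored key hash

# Retrieve (regeneration): a fresh template B' with a few flipped bits.
B2 = B.copy()
for i in (3, 40, 71):                               # session noise within capacity
    B2[i] ^= 1
K2 = ecc_decode(xor(locked, B2))                    # ECC absorbs the noise
assert hashlib.sha256(bytes(K2)).hexdigest() == HK  # key regenerated exactly
```

The XOR leaves the code word corrupted exactly where the fresh template differs from the enrolled one, which is why the variability can be treated as channel noise and removed by the ECC.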

The system comprises two phases. The first, shown in Figure 1, is the enrolment phase, where the user generates a symmetric key and links it to his/her identity. The second, shown in Figure 2, is the verification phase, where the user regenerates the key from his/her biometric data, the stored helper data, and a secret second factor that ensures the revocability of the scheme.

The revocability of the fuzzy commitment scheme is ensured using the shuffling scheme described in Section 3.4.

In the enrolment phase, the user provides an image I containing his/her face and a secret second factor S. The second factor can be a shuffling key stored on a secure token or a password from which the shuffling key is derived. The image I is processed using the binarization method described in Section 3.2 to provide a binary embedding B.

The binary embedding is then shuffled using the shuffling key S provided by the user to achieve the revocability requirement. The shuffled binary embedding, denoted SB, is used to protect the encryption key K generated by the system at the beginning of the process. The encryption key K is encoded using the error-correcting code described in the following section, creating an encoded encryption key EK. The encryption key K is also hashed to produce a hash string HK that will be used to verify the successful regeneration of K in the regeneration phase.

The shuffled binary embedding SB and the encoded key EK are XORed to produce the protected encryption key PK. The security of the scheme is mainly provided by SB; in fact, the entropy of the system is the minimum of the entropies of SB and K.

In the enrolment phase, the system stores the hash of the encryption key HK and the protected encryption key PK. These two pieces of data do not need to be protected and can be stored in a non-encrypted database. The second factor S, on the other hand, must be protected: should the shuffling key be compromised, the user’s enrolment must be revoked and a new enrolment generated using a different shuffling key.

In the verification phase, the user provides a new facial image sample I′. This image is processed following the same method as in the enrolment phase to provide a binary embedding B′. The user also provides the same second factor used in the enrolment phase, in the form of either a password or a shuffling key S. The binary embedding B′ is shuffled using this second factor, resulting in a shuffled binary embedding SB′. SB′ will differ slightly from SB because of the variability in the facial image sample provided at the beginning of the verification step. SB′ is then XORed with the protected encryption key PK recovered from storage; the result of the XOR operation is EK′. EK′ is then decoded using the same error-correcting code used in the enrolment phase, and the decoded encryption key is denoted K′. To check whether the regeneration is successful, we compare the hash of the decoded key HK′ with the hash HK stored in the enrolment phase. If HK′ equals HK, the regeneration is successful and the system provides the user with the key K′, which is identical to K. If HK′ and HK differ, the regeneration fails, and the user is asked to provide a new facial image.

The description of the fuzzy commitment scheme provided above covers only the generic data flow in the system. In our system, the binary embedding extractor (DNN) produces a fixed output of 4096 bits. To adapt the length of the binary embedding to the security requirements of the system, we added a bit selection process. Indeed, depending on the parameters of the error-correcting code and the length of the encryption key K, the length of the encoded encryption key EK differs. Therefore, the length of SB varies, as it hides the encoded encryption key EK, and the lengths of the binary embedding B and the shuffling key S must vary accordingly. The details of the shuffling scheme, the error-correcting code used, and the bit selection process are provided in the following sections.

3.1 Error-correcting code

The difference between EK and EK′ is due to the variability in the biometric samples. We treat this difference as channel noise and use an ECC to correct it. The error-correcting code we use in the fuzzy commitment scheme is the Bose, Chaudhuri, and Hocquenghem (BCH) code. We chose the BCH code because it is a robust code capable of correcting random errors. BCH codes are a class of cyclic error-correcting codes based on algebraic concepts; they are widely used in communication systems and storage devices to detect and correct errors.

The basic idea behind BCH codes is to add redundant bits to the data being transmitted or stored in such a way that the receiver or reader can use these bits to detect and correct errors. These redundant bits are called check bits. The number of check bits added to the data is determined by the length of the BCH code and the desired error correction capability.

The BCH code is constructed using a generator polynomial, whose coefficients are used to generate the check bits. The generator polynomial is chosen such that it has a certain number of roots, that is, values that make the polynomial equal to zero. These roots are used to generate the check bits, and the number of roots determines the error correction capability of the BCH code. The data block itself is also treated as a polynomial whose coefficients are the data bits; this message polynomial, combined with the generator polynomial, produces the check bits that are added to the data to form the coded message.

To encode data using a BCH code, the data is first divided into blocks of a certain size, and the check bits are generated for each block using the generating polynomial. The check bits are then appended to the data block to form the encoded data block. When the encoded data block is transmitted or stored, errors may occur due to noise or other factors.

To detect and correct errors, the receiver or reader uses the generator polynomial to recompute the check bits for the received or retrieved data block. If the recomputed check bits match the stored ones, the data is assumed to be error-free. If they do not match, errors have occurred, and the generator polynomial is used to locate the errors and correct them.

One of the advantages of BCH codes is their robustness against errors. These codes can correct a certain number of errors based on their designed parameters and are therefore able to tolerate a certain level of noise or interference during the transmission or storage of data. This makes them particularly useful for applications where the transmission or storage of data may be subject to noise or interference.

In summary, BCH codes are a type of error-correcting code that are used to detect and correct errors in digital data. They are based on algebraic concepts and use a generating polynomial to generate check bits that are appended to the data being transmitted or stored. The receiver or reader uses the generating polynomial to detect and correct errors in the received or retrieved data. BCH codes have a high error correction capability and low overhead, making them efficient for use in communication systems and storage devices.

The BCH code takes a block of size k and encodes it into a code word of length n. The code has a correction capacity of t, meaning that at most t errors can be corrected in each code word. The parameters n, k, and t are related by Eq. (1).

For any positive integers m ≥ 3 and t < 2^(m−1), there exists a binary BCH code with the following parameters:

$$\begin{aligned} \text{code word length:}\quad & n = 2^m - 1 \\ \text{number of parity-check bits:}\quad & n - k \le m \times t \\ \text{minimum distance:}\quad & d_{\min} \ge 2t + 1 \end{aligned} \tag{1}$$

with the minimum distance being the minimum number of positions in which any two distinct code words differ.

The encryption key of length L is divided into N blocks of k bits. Each block is encoded into a new block of n bits using the BCH error-correcting code, so the total length of the encoded key is N × n. The scheme can correct up to T = N × t errors, where t is the correction capacity of the code.
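The parameter relations of Eq. (1) and the block structure can be checked with a few lines of Python (a sketch; `check_bch` and `block_structure` are illustrative helper names, with parameter values taken from Table 1 later in this chapter):

```python
# Sanity-check a BCH configuration against Eq. (1) and derive the
# block-level structure of the scheme.

def check_bch(n, k, t, m):
    assert n == 2**m - 1        # code word length
    assert n - k <= m * t       # number of parity-check bits
    # d_min >= 2t + 1 holds by construction of the BCH code

def block_structure(key_len, n, k, t):
    N = key_len // k            # number of blocks (key padded if needed)
    return N, N * n, N * t      # blocks, encoded length, T = N * t

check_bch(127, 22, 23, m=7)                # 127 = 2^7 - 1, 105 <= 7 * 23 = 161
print(block_structure(512, 63, 32, 11))    # -> (16, 1008, 176)
```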

3.2 Face binary template extraction

Crypto-biometric schemes such as fuzzy commitment require binary sources [18]. In Ref. [16], we use a data-driven template binarization method based on deep neural networks (DNNs), which does not degrade the performance of the baseline system. Furthermore, we seek long binary representations with high entropy for use in crypto-biometric key regeneration schemes. The binarization method is detailed in Ref. [16]. Using a pre-trained convolutional neural network (CNN) and training the model on a cleaned version [19] of the MS-Celeb-1M database, we obtain binary representations of length 4096 bits with 3300 bits of entropy. The extracted representations have high entropy and are long enough to be used in crypto-biometric systems such as fuzzy commitment. Furthermore, the proposed approach is data-driven and constitutes a locality-preserving hashing that can be leveraged for data clustering and similarity searches. Further implementation details are provided in Ref. [16].

3.3 Bit selection process

According to the length of the encryption key, the block size used, and the code word size of the ECC, the encoded key EK will have varying sizes. For example, if the length of the encryption key K is 512 bits, we divide it into 16 blocks of 32 bits. Each block is encoded into a code word of 63 bits, so the encoded encryption key comprises 16 code words with a total length of 1008 bits. If the length of the encryption key is 516 bits (a 512-bit key padded with 4 zeros), the key can be divided into 86 blocks of 6 bits. Each block is encoded into a 31-bit code word, giving a final output of 2666 bits.
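The two worked examples can be verified numerically (a small sketch; the printed lengths match Table 1 in Section 4):

```python
# Encoded-key lengths for the two example configurations above.
for key_len, n, k in [(512, 63, 32), (516, 31, 6)]:
    N = key_len // k
    print(f"{key_len}-bit key -> {N} blocks of {k} bits -> {N * n} encoded bits")
# 512-bit key -> 16 blocks of 32 bits -> 1008 encoded bits
# 516-bit key -> 86 blocks of 6 bits -> 2666 encoded bits
```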

However, the binary embedding extractor provides binary representations of a fixed length of 4096 bits. As the histogram in Figure 3 shows, not all bits of the binary representation have high entropy. We therefore select the bits with the highest entropy from the binary representations, while preserving their recognition performance.

Figure 3.

Entropy per bit of the 4096-bit binary representations.

The bit selection is done by first reordering the bits of the binary representation according to the inter-class variance, the intra-class variance, or both. Then, we take the first N bits needed in the fuzzy commitment scheme. The inter-class and intra-class variances are computed on a cohort database that has no overlap with the validation database.

We selected the FRGC database [20] as the cohort database for two reasons. First, it was not acquired in the wild like LFW [21] or MS-Celeb-1M [22], so there is a low risk of mislabeling or of overlap with other databases. Second, it provides high-quality images captured in controlled conditions, which are useful for computing the inter-class variance without introducing ambiguities due to the acquisition conditions. Figure 4 shows examples of the images used to compute the inter-class variance: the images are frontal facing, with good lighting and a uniform background. We also computed the intra-class variance using the FRGC dataset, but on the uncontrolled partition, as shown in Figure 5.

Figure 4.

Example of the images used to compute the inter-class variance. The images are taken from the controlled partition of the FRGC database [20].

Figure 5.

Example of the images used to compute the intra-class variance. The images are taken from the uncontrolled partition of the FRGC database [20].

Figure 6 shows the process of reordering the bits using the inter-class variance computed on FRGC. We take a binary representation of one controlled sample from each subject in the database. Then, we compute the variance of each component (column), treating the bit distribution as a Bernoulli distribution whose probability is estimated empirically. We then sort the variances in descending order and store the indices of the new order for future use.

Figure 6.

Bit reordering of the binary representations created by the DNN according to the inter-class variance.

In the key regeneration system, each time a binary representation is extracted, we use the stored indices to select the number of bits suitable for the parameters of the system. Using these indices, we can select the N bits needed for the scheme; the selected bits are those with the highest entropy in the binary face representations. This holds under the assumption that there is no session noise in the acquired data, which is why we computed the inter-class variance using only samples from the controlled partition.

As for the intra-class variance, we used the uncontrolled partition of the FRGC database. In this case, we take all the samples pertaining to each user and compute the variance for each user independently over all his/her samples. As a result, we obtain a variance vector for each user, as illustrated in Figure 7. These variance vectors are then averaged to produce a mean variance vector, in which the bits with the highest variance are those that carry the session noise contained in the input samples. In this case, the variance vector is sorted in ascending order and, as with the inter-class variance, we store the indices of the new order.

Figure 7.

Bit reordering of the binary representations created by the DNN according to the intra-class variance.

As the goal of this type of bit selection is to remove the noisy bits, the need for the FRGC database is further emphasized, since we are confident that its data does not contain mislabeling.

Selecting the bits using the inter-class variance improves the security of the system at the cost of user convenience. By taking the bits with the highest inter-class variance, we reduce the false acceptance rate, which increases the false rejection rate. Conversely, if we focus only on the bits with the lowest average intra-class variance, the user will regenerate his/her key more easily, but the risk of false acceptance increases. We therefore combine both approaches, selecting the bits using both the intra-class and inter-class variance.

Figure 8 illustrates the combined bit selection process. We subtract the intra-class mean variance from the inter-class variance vector. A bit with high inter-class variance and low intra-class variance obtains a high weight in the resulting vector; a bit with low inter-class variance and high intra-class variance obtains a low weight. Finally, this new vector is sorted in descending order, and we take the first N bits needed in the fuzzy commitment scheme (a sketch of this selection is given after Figure 8).

Figure 8.

Bit reordering of the binary representations created by the DNN using the inter-class variance and intra-class variance.
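A possible numpy sketch of the combined selection, under stated assumptions: `controlled` holds one binary embedding per cohort subject (rows), `uncontrolled_per_user` is a list of per-user sample matrices, and all names are illustrative rather than taken from the authors’ code:

```python
import numpy as np

def selection_indices(controlled, uncontrolled_per_user, n_bits):
    """Rank the 4096 bit positions by inter-class minus mean intra-class
    variance and return the indices of the n_bits best-ranked positions.

    controlled: (n_subjects, 4096) binary matrix, one sample per subject.
    uncontrolled_per_user: list of (n_samples_i, 4096) binary matrices.
    """
    # Empirical Bernoulli variance p(1 - p) across subjects:
    # high values mark discriminative bits.
    p = controlled.mean(axis=0)
    inter_var = p * (1.0 - p)

    # Per-user Bernoulli variance, averaged over users:
    # high values mark session noise.
    intra_var = np.mean(
        [u.mean(axis=0) * (1.0 - u.mean(axis=0)) for u in uncontrolled_per_user],
        axis=0)

    order = np.argsort(inter_var - intra_var)[::-1]  # descending weight
    return order[:n_bits]                            # stored for reuse

# At enrolment and regeneration time, the stored indices reduce a fresh
# 4096-bit embedding to the length required by the scheme:
# reduced = embedding[indices]
```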

To validate the selection process, we used the accuracy on the LFW database as the benchmark. The tests on LFW are divided into 3000 client-client tests and 3000 client-imposter tests, so the accuracy metric gives a sensible measure of the usability of the system: a system with high accuracy achieves both a low false acceptance rate and a low false rejection rate.

Figure 9 shows the performance of the inter-class, intra-class, and combined inter-class + intra-class bit selection strategies. The figure shows the accuracy on LFW as a function of the length of the binary representation. All curves are drawn from the same initial binary representations; only the selection strategy changes. As the smallest encoded message (encryption key) is longer than 1000 bits, the curves start at a length of 1000 bits, with one data point every 10 bits.

Figure 9.

Impact of the bit selection strategy on the accuracy of the binary representations on LFW [23]. The accuracy is represented as a function of the length of the representation.

The encoded keys in our experiments are, on average, between 2000 and 3000 bits long. We therefore chose the third strategy, which uses both the inter-class and intra-class variance to select the bits, as it performs best in this range, as shown in Figure 9.

3.4 Biometric template protection scheme: shuffling

To protect the template data, we adopt the shuffling scheme introduced by Kanade et al. in Ref. [24]. This scheme employs a binary shuffling key, which, due to its length, is either securely stored on a dedicated token or derived through a password-based mechanism. The binary embedding representing the template is partitioned into blocks, yielding two segments: the first comprises the blocks aligned with positions where the shuffling key has a bit value of “1”, while the second comprises all remaining blocks. These two segments are then concatenated to form the shuffled binary embedding. The original and shuffled templates are in one-to-one correspondence; however, each block of the original vector occupies a different position within the shuffled embedding. As a result, when identical shuffling keys are used to shuffle two binary embeddings, the absolute block positions are altered consistently across both representations, leaving the Hamming distance unaffected. Using distinct shuffling keys, on the other hand, results in a randomized transformation of the initial binary representations, leading to an increased Hamming distance.

In the context of our application, namely crypto-biometric key regeneration, we have opted for a block size of “1” rather than the “7” employed in Ref. [24]. This choice provides two primary advantages. First, it yields a longer shuffling key, rendering it more resistant to brute-force attacks. Second, it expands the permutation space, providing a higher number of potential templates.

The shuffled binary embedding is the outcome of combining the biometric binary embedding with the shuffling key. Therefore, in the event of a security compromise, it can be revoked, and a new enrolment can be generated from the same biometrics using a different shuffling key. The management of shuffling keys admits two approaches: generating and storing the keys within a secure element, or deriving them from a password using a mechanism such as a password-based key derivation function (PBKDF). A PBKDF is, in essence, a mechanism for transforming a password into a symmetric key suitable for cryptographic operations, such as the advanced encryption standard (AES).
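A minimal sketch of both building blocks, assuming numpy arrays of 0/1 values; the iteration count, salt handling, and function names are illustrative choices, not taken from Ref. [24]:

```python
import hashlib
import numpy as np

def derive_shuffling_key(password: bytes, salt: bytes, n_bits: int) -> np.ndarray:
    # PBKDF2-HMAC-SHA256; the iteration count is an illustrative choice.
    raw = hashlib.pbkdf2_hmac("sha256", password, salt, 200_000,
                              dklen=(n_bits + 7) // 8)
    return np.unpackbits(np.frombuffer(raw, dtype=np.uint8))[:n_bits]

def shuffle(embedding: np.ndarray, skey: np.ndarray) -> np.ndarray:
    # Block-size-1 shuffling: bits at positions where the key is 1 come
    # first, the remaining bits follow.
    return np.concatenate([embedding[skey == 1], embedding[skey == 0]])

# Identical keys move bits identically, preserving the Hamming distance:
skey = derive_shuffling_key(b"user password", b"per-user salt", 3000)
emb1 = np.random.randint(0, 2, 3000, dtype=np.uint8)
emb2 = emb1.copy()
emb2[np.random.choice(3000, 40, replace=False)] ^= 1   # 40 noisy bits
assert (shuffle(emb1, skey) != shuffle(emb2, skey)).sum() == 40
```

A different key reorders the two embeddings inconsistently, which is what randomizes the transformation and increases the Hamming distance between them.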


4. Results of the proposed key regeneration scheme on the MOBIO database

The proposed key regeneration scheme illustrated in Figures 1 and 2 can be summarized as follows:

  • Enrolment phase

    1. Input: The user presents a good-quality image of their face I and a second factor S.

    2. The system generates an encryption key K.

    3. The system computes EK = ENCODE_BCH(K) and HK = secure_hash_function(K).

    4. The system extracts a binary representation from the facial image [16].

    5. The binary representation length is reduced to the desired length using the inter-intra variance bit selection process to obtain the binary embedding B.

    6. The binary embedding B is shuffled using the secret factor S to produce the shuffled binary embedding SB.

    7. The system computes the protected encryption key PK = EK ⊕ SB.

    8. Storage: PK and HK are stored in the system database, and all other data is discarded.

  • Verification phase (regeneration)

    1. Input: The user provides a new image of their face I′ and the shuffling key S.

    2. The system extracts a binary representation from the facial image.

    3. The binary representation length is reduced to the desired length using the inter-intra variance bit selection process to obtain the binary embedding B′.

    4. The binary embedding B′ is shuffled using the secret factor S to produce the shuffled binary embedding SB′.

    5. The system computes the encoded encryption key EK′ = PK ⊕ SB′.

    6. The system decodes the encoded key to obtain K′ = DECODE_BCH(EK′).

    7. The system computes HK′ = secure_hash_function(K′).

    8. Output: The system checks whether HK′ equals the stored HK; if they are equal, the key is successfully regenerated; if not, the process repeats from step (1). (An end-to-end sketch of both phases is given after this list.)
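For completeness, the following sketch strings both phases together, restating the toy repetition-code and block-size-1 shuffling helpers from the earlier sketches so the block is self-contained; a real deployment would substitute the DNN binarization, bit selection, and BCH codes described above:

```python
import hashlib
import secrets

R = 5  # repetition factor of the stand-in ECC (the chapter uses BCH codes)

def ecc_encode(bits):
    return [b for b in bits for _ in range(R)]

def ecc_decode(bits):
    # Majority vote over each group of R bits.
    return [int(sum(bits[i:i + R]) > R // 2) for i in range(0, len(bits), R)]

def shuffle(bits, skey):
    # Kanade-style shuffling with block size 1.
    return [b for b, s in zip(bits, skey) if s] + \
           [b for b, s in zip(bits, skey) if not s]

def enrol(embedding, skey, key_bits=16):
    K = [secrets.randbelow(2) for _ in range(key_bits)]  # random key K
    SB = shuffle(embedding, skey)                        # shuffled embedding
    PK = [e ^ s for e, s in zip(ecc_encode(K), SB)]      # PK = EK xor SB
    HK = hashlib.sha256(bytes(K)).hexdigest()            # key signature
    return PK, HK   # only PK and HK are stored; all else is discarded

def regenerate(embedding, skey, PK, HK):
    SB2 = shuffle(embedding, skey)                       # SB' (fresh sample)
    K2 = ecc_decode([p ^ s for p, s in zip(PK, SB2)])    # K' = DECODE(EK')
    return K2 if hashlib.sha256(bytes(K2)).hexdigest() == HK else None

emb = [secrets.randbelow(2) for _ in range(16 * R)]      # stand-in embedding
skey = [secrets.randbelow(2) for _ in range(16 * R)]     # second factor S
PK, HK = enrol(emb, skey)
print(regenerate(emb, skey, PK, HK) is not None)         # True: genuine user
print(regenerate(emb, skey[::-1], PK, HK) is None)       # almost surely True:
                                                         # wrong second factor
```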

In this section, we report the performance of the key regeneration scheme on the MOBIO database. We chose the MOBIO database because it presents a real use-case where the user tries to log in to a system. Furthermore, the MOBIO database does not overlap with the databases used for training the DNN (MS-Celeb-1M) or for the bit selection process (FRGC and LFW). The experimental protocol of the MOBIO database was originally developed for identity verification [25], not for key regeneration. For key regeneration, the system output is a binary decision: the key regeneration either succeeds or fails. Identity verification, by contrast, is based on a comparison against a threshold, allowing the computation of metrics such as accuracy, EER, and DET curves. We therefore introduced some changes to the protocol. We combined the male and female partitions of the MOBIO database to obtain more samples. Each subject has at least 120 videos, and we selected 3 frames from each video (start, middle, and end). Frames where no face was detected, or where only a partial face was visible, were discarded. After this pruning, we obtain a total of 53k biometric samples from 152 users.

Table 1 reports the regeneration performance of the fuzzy commitment scheme in terms of FAR and FRR on the MOBIO database. We carried out 9 million client-client tests and 10 million client-imposter tests. For the client-client tests, all the biometric samples are cross-matched. For the client-imposter tests, 21 samples are randomly selected from each user and cross-matched. The goal of the experiments was to regenerate keys longer than 400 bits so as to be resistant to quantum computing; we therefore focused on encryption keys with lengths between 420 and 528 bits.

| Key length | Encoded key length | BCH code | FRR (%) | FAR (%) |
|---|---|---|---|---|
| 528 | 3084 | (127, 22, 23) | 0.8 | 0 |
| 516 | 2666 | (31, 6, 7) | 0.7 | 0 |
| 512 | 1008 | (63, 32, 11) | 0.3 | 0 |
| 510 | 3213 | (63, 10, 13) | 0.3 | 0 |
| 430 | 2047 | (2047, 430, 214) | 1.6 | 0 |
| 430 | 4095 | (4095, 495, 430) | 1.3 | 0 |
| 420 | 3556 | (127, 15, 27) | 0.3 | 0 |

Table 1.

Performance of the key regeneration scheme. BCH codes are presented in (n, k, t) format where n is the length of the encoded block, k is the length of the message block, and t is the number of bits that can be corrected in the encoded block.

The FAR and FRR are computed on the MOBIO database using 9 million client-client tests and 10 million client-imposter tests.

In all experiments, the FAR is 0%, as the number of erroneous bits exceeds the correction capacity of the code. As shown in Figure 10, the minimum imposter distance is 0.196 for representations of 3000 bits, meaning that 588 bits would have to be corrected for an imposter to be accepted as the genuine user. However, as Table 1 shows, we can correct at most 552 bits.

Figure 10.

Normalized Hamming distance distributions for genuine and imposter comparisons on the MOBIO Eval male partition. The templates used in the comparisons are binary templates of length 3000 bits created using the bit selection process described in Section 3.3.

In Table 2, we compare the performance of our key regeneration system to other systems using fuzzy commitment and face biometrics. As shown in the table, our system provides longer keys (see Notes) with low FAR and FRR. Compared to Anees et al. [8], we obtain better recognition performance. Chang et al. [10] and Mai et al. [11], on the other hand, were not optimized for key regeneration using face biometrics: Chang et al.’s system focused on the fusion of multiple biometric modalities and thus did not optimize its face binary representations, while Mai et al.’s goal in Ref. [11] was to use key regeneration for user verification, so they optimized their system for biometric recognition performance rather than key length. In conclusion, our face-based crypto-biometric system provides longer keys than the state of the art.

| System | Database | FAR (%) | FRR (%) | Key length (bits) |
|---|---|---|---|---|
| Anees et al. [8] | ORL [26] | 0.06 | 10.02 | 256 |
| Chang et al. [10] | XM2VTSDB [27] | 1 | 25 | 256 |
| Mai et al. [11] | FRGC [20] | 0.1 | 0.19 | 56 |
| Ours | MOBIO | 0 | 0.8 | 528 |

Table 2.

Comparisons with key regeneration schemes based on fuzzy commitment using face biometrics.

Studying the biometric performance of the system alone is not enough, especially when we are working with encryption keys. As such, in the following section, we present the security analysis of the system under different attack scenarios.


5. Security analysis

Because the application is intended for use in cryptographic systems, we need to examine the security of the scheme.

Multiple security concerns have been raised around the use of fuzzy commitment in crypto-biometric systems, due to the risk of information leakage and correlation attacks [14, 15]. Furthermore, in Ref. [15], the authors questioned the unlinkability, irreversibility, and revocability of fuzzy commitment schemes. The fuzzy commitment scheme that we propose does not suffer from these problems, as we use a second factor to protect the biometric template. We have shown in Ref. [16] that the shuffling scheme is fully unlinkable and irreversible, provided that the user’s secret second factor is not compromised. Revocability is also provided by means of the shuffling key: we can re-enroll the user with a different shuffling key when the enrolment is compromised.

In this section, we evaluate the security of the proposed key regeneration system based on fuzzy commitment against different attack scenarios:

  • Stolen second factor (shuffling key or password),

  • Stolen biometrics,

  • Compromised storage (protected encryption keys PK and hashes HK),

  • Brute-force attacks.

The security analysis was carried out on the MOBIO database using the same protected templates generated in the previous section.

5.1 Stolen second factor

In this scenario, we study the impact of the theft of the second factor on the security of the system. The attacker will use the second factor, in this case, the shuffling key, to regenerate the crypto-biometric key. We assume that the attacker also has access to the hash of the encryption key and the protected encryption key of the target user, but does not have access to the target biometric data.

The attacker therefore tries to access the system using a facial dataset. We simulated this type of attack on the MOBIO database by comparing each user against the rest of the database. Table 3 reports the FAR obtained for the different system configurations.

| Encryption key length | BCH code | FAR (%) |
|---|---|---|
| 528 | (127, 22, 23) | 0 |
| 516 | (31, 6, 7) | 0 |
| 512 | (63, 32, 11) | 0 |
| 510 | (63, 10, 13) | 0 |
| 430 | (2047, 430, 214) | 0 |
| 430 | (4095, 495, 430) | 0 |
| 420 | (127, 15, 27) | 0 |

Table 3.

FAR on the MOBIO database in the scenario of stolen second factor.

BCH codes are presented in (n, k, t) format where n is the length of the encoded block, k is the length of the message block, and t is the number of bits that can be corrected in the encoded block.

In our protocol, we removed the low-quality enrolment faces, and since the minimum client-imposter distance (0.19) exceeds the error-correction capacity of the ECCs used, the FAR is 0% in all the experiments.

5.2 Stolen biometrics

In this scenario, we study the impact of the theft of the user’s biometric data, in this case the face, on the security of the system. Biometric data is easily accessible, especially through social networks. This type of attack is also known as spoofing, where the attacker presents previously captured data of the user to access the system. The best mitigation is an anti-spoofing measure such as liveness detection, which ensures that the user is present in front of the sensor rather than a picture or a video. In our system, the use of the second factor helps mitigate spoofing attacks.

We simulate this scenario by following the MOBIO protocol while using wrong shuffling keys for the client-client tests. In all experiments, the attacker is rejected with 100% accuracy.

5.3 Compromised storage (PK and HK)

In this scenario, the attacker has access to the system and to the storage of the application, where he/she recovers the protected encryption keys PK and the encryption key hashes HK (signature).

This allows for two types of attack. The first is to guess an encryption key of the same length and test it against the stored hash. In this case, the complexity of the attack, and the possibility of recovering the original encryption key K, depend entirely on the security of the hash function used to generate the signature.

The second type of attack is to use the protected encryption keys to recover the original encryption key. In this case, the attacker XORs the protected encryption key with binary strings generated using the metadata of the system. We suppose that attackers know the statistical distribution of the binary face representations and how the shuffling keys are generated. The shuffling keys can be either randomly generated and stored on a special medium (a smart card) or derived from a user-provided password using a PBKDF. Using this information, the attacker can reduce the complexity of the attack by knowing the average number of activated bits in the shuffled binary embeddings SB used to protect the encoded encryption keys EK. In this attack, the goal of the adversary is mainly to generate a “pseudo” shuffled binary embedding to reverse the XOR operation applied to the encoded encryption key EK. The complexity of the attack is equivalent to the diversity of the cancellable system computed using the parameters of the fuzzy commitment scheme. The maximum number of shuffled binary embeddings SB is given by the number of possible permutations. Moreover, because the decision-making is based on a threshold comparison, we should not count templates falling in the same neighborhood. We estimate the maximum number of templates using the Hamming sphere-packing bound.

$$\text{Number of } SB = \frac{\text{number of permutations}}{\text{volume of Hamming spheres}} = \frac{L! \,/\, (L_0!\, L_1!)}{\sum_{k=0}^{t} \binom{L}{k}} \tag{2}$$

where:

L: the length of the encoded encryption key/shuffled binary representations.

L_0: the average number of zeros in the shuffled binary representations.

L_1: the average number of ones in the shuffled binary representations.

t: the maximum number of bits that can be corrected by the ECC used in the fuzzy commitment scheme, where 2t + 1 is the minimum distance of the ECC.

For example, when we use encryption keys of length 528 bits with a BCH(127, 22, 23) code, which corrects 23 bits per 127-bit block of the encoded message, the encryption key is divided into 24 blocks of 22 bits, each encoded into a 127-bit block. Thus, the maximum number of bits that can be corrected is 23 × 24 = 552. However, this does not mean that 552 random errors can be corrected anywhere in the encoded message: only 23 random bits can be corrected within each 127-bit block. As such, the number of possible SB in this case is higher than the lower bound given by Eq. (2). We obtain more than 2^992 possible SB for this configuration, as shown in Eq. (3).

$$\text{Number of } SB = \frac{3084! \,/\, (1542!\; 1542!)}{\sum_{k=0}^{23 \times 24} \binom{3084}{k}} \ge 2^{992} \tag{3}$$
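Eq. (3) can be checked numerically; the following sketch evaluates the bound exactly using Python’s arbitrary-precision integers:

```python
from math import comb, factorial, log2

# Numeric check of Eq. (3): lower bound on the number of possible shuffled
# binary embeddings for the 528-bit key / BCH(127, 22, 23) configuration.
L, L0, L1 = 3084, 1542, 1542     # encoded length, average zero/one counts
T = 23 * 24                      # 552 correctable bits in total

permutations = factorial(L) // (factorial(L0) * factorial(L1))
sphere_volume = sum(comb(L, k) for k in range(T + 1))
print(log2(permutations) - log2(sphere_volume))   # roughly 992
```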

Table 4 shows the minimum number of possible SB for each configuration of the system. We study the number of SB because it is equivalent to the entropy of the shuffled binary representations used to protect the encryption keys. For the BCH(63, 32, 11) case, the complexity of the brute-force attack is reduced from 2^512 to 2^333, which is still not brute-forceable with current technology but is lower than the security requirement established in Section 1. For the other configurations in Table 4, it is computationally infeasible to attack the system using the information recovered from the protected encryption key; it is easier to brute-force the encryption key directly through its signature.

| Encryption key length | BCH code | Number of SB (log2) |
|---|---|---|
| 528 | (127, 22, 23) | 992 |
| 516 | (31, 6, 7) | 610 |
| 512 | (63, 32, 11) | 333 |
| 510 | (63, 10, 13) | 852 |
| 430 | (2047, 430, 214) | 1056 |
| 430 | (4095, 495, 430) | 1916 |
| 420 | (127, 15, 27) | 901 |

Table 4.

Number of possible SB for each system configuration.

The number of SB is provided in log2 format. BCH codes are presented in (n, k, t) format where n is the length of the encoded block, k is the length of the message block, and t is the number of bits that can be corrected in the encoded block.

5.4 Brute-force attacks

In this section, we study the brute-force attacks that can be done on the system. The attacker can try to either directly brute-force the encryption key K or brute-force the system by presenting random faces and random second factors to recover K.

The first attack is computationally infeasible for all the system configurations presented in Table 1: the encryption keys are longer than 400 bits and hence not brute-forceable even with quantum algorithms such as Grover’s algorithm [28].
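As a rough order-of-magnitude check (a standard property of Grover’s algorithm, not a result from this chapter): Grover search reduces an exhaustive search over an n-bit key space from $2^n$ to roughly $\sqrt{2^n} = 2^{n/2}$ evaluations, so even for our longest keys

$$2^{528} \;\xrightarrow{\ \text{Grover}\ }\; 2^{264} \text{ evaluations},$$

which remains far beyond any foreseeable computational budget and is consistent with the doubled symmetric key lengths recommended in Ref. [2].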

As for brute-forcing the system through its inputs (face and shuffling key), this attack is more complex than brute-forcing the encryption key, as the shuffling key is longer than the original encryption key K. Furthermore, even in the best case for the attacker, where they possess biometric samples of the target, the attack reverts to the stolen biometrics scenario, which is not brute-forceable.


6. Conclusion and perspectives

The challenge of this work is to obtain high-entropy keys that guarantee a high level of security. For traditional cryptographic keys, the entropy can be controlled during their creation; for crypto-biometric keys, by contrast, the entropy of the key is limited by the entropy contained in the biometric reference.

In this chapter, we propose a key regeneration scheme based on fuzzy commitment, deep binary embeddings, and a bit selection process based on cohort inter-intra variance. The regenerated keys are post-quantum, guarantee non-repudiation, and are revocable and convenient to use.

As we focus on symmetric key regeneration, the keys need double the entropy of currently used keys to offer the same degree of security [2]. Crypto-biometric keys are limited by the usable information contained in the biometric sample from which they are generated; this is why we use deep learning to create binary representations that are longer and more discriminative. State-of-the-art key regeneration systems using face biometrics suffer from high FRR and low entropy compared to other biometric modalities [7]. In our case, we succeeded in regenerating symmetric encryption keys longer than 400 bits with low FAR and low FRR using face biometrics. For example, for crypto-biometric keys of 512 bits, we obtain 0% FAR and 0.3% FRR using the BCH(63, 32, 11) code.

As for the revocability of the keys: biometrics are unique to each user and cannot be changed except under special circumstances (plastic surgery, disease, and so on). If the regeneration scheme were not revocable, the user would be restricted to a single key across multiple applications and, were the key compromised, would be unable to create a new one. To ensure that the regeneration scheme is revocable, we used the shuffling scheme proposed in Ref. [24].

The non-repudiation requirement is satisfied by the intrinsic properties of biometric samples (under the assumption that an anti-spoofing system is used). However, we must ensure that the scheme used in the key regeneration has a low false acceptance rate (FAR). In our case, we obtain 0% FAR for all configurations of the system on the MOBIO database.

Moreover, the proposed key regeneration scheme allows for user convenience, meaning that, at the required security level, the user is not rejected multiple times before gaining access to the system. The convenience of the system is shown through the FRR metric (less than 1% FRR).

Furthermore, our system does not suffer from the security concerns raised in Ref. [15] about the unlinkability, irreversibility, and revocability of fuzzy commitment schemes, as we use a second factor to protect the biometric template. We have shown in Ref. [16] that the shuffling scheme is fully unlinkable and irreversible, provided that the user’s secret second factor is not compromised. Revocability is also provided by means of the shuffling key: we can re-enroll the user using a different shuffling key.

Finally, the system can be further improved by using newer face recognition systems with higher accuracy as the basis for the binarization auto-encoder. This should allow for more stable and longer representations and, as such, longer crypto-biometric keys.


Abbreviations

FRR: false rejection rate
FAR: false acceptance rate
PBKDF: password-based key derivation function
ECC: error-correcting code
DBR: direct binary representation
BRGC: binary reflected gray code
LSSC: linearly separable sub-code
BCH: Bose, Chaudhuri, and Hocquenghem
DNN: deep neural network
CNN: convolutional neural network

References

  1. Pooja S, SK. Quantum computing review: A decade of research. IEEE Transactions on Engineering Management. 2023:1-15. DOI: 10.1109/TEM.2023.3284689
  2. Augot D, Batina L, Bernstein DJ, Bos J, Buchmann J, Castryck W, et al. Initial recommendations of long-term secure post-quantum systems. In: Post-Quantum Cryptography for Long-Term Security. Eindhoven, Netherlands: PQCRYPTO; 2015. Available from: pqcrypto.eu.org/docs/initial-recommendations.pdf
  3. Jain AK, Ross A, Uludag U. Biometric template security: Challenges and solutions. In: 2005 13th European Signal Processing Conference, Antalya, Turkey. New York, USA: IEEE; 2005. pp. 1-4
  4. Chen B, Chandran V. Biometric based cryptographic key generation from faces. In: 9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications (DICTA 2007), Glenelg, SA, Australia. New York, USA: IEEE; 2007. pp. 394-401. DOI: 10.1109/DICTA.2007.4426824
  5. Kanade S, Petrovska-Delacrétaz D, Dorizzi B. Generating and sharing biometrics based session keys for secure cryptographic applications. In: 2010 Fourth IEEE International Conference on Biometrics: Theory, Applications and Systems (BTAS), Washington, DC, USA. New York, USA: IEEE; 2010. pp. 1-7. DOI: 10.1109/BTAS.2010.5634545
  6. Panchal G, Samanta D. A novel approach to fingerprint biometric-based cryptographic key generation and its applications to storage security. Computers and Electrical Engineering. 2018;69:461-478. DOI: 10.1016/j.compeleceng.2018.01.028
  7. Wang P, You L, Hu G, Hu L, Jian Z, Xing C. Biometric key generation based on generated intervals and two-layer error correcting technique. Pattern Recognition. 2021;111:107733. DOI: 10.1016/j.patcog.2020.107733
  8. Anees A, Chen YPP. Discriminative binary feature learning and quantization in biometric key generation. Pattern Recognition. 2018;77:289-305
  9. Panchal G, Samanta D, Barman S. Biometric-based cryptography for digital content protection without any key storage. Multimedia Tools and Applications. 2019;78:26979-27000
  10. Chang D, Garg S, Ghosh M, Hasan M. BIOFUSE: A framework for multi-biometric fusion on biocryptosystem level. Information Sciences. 2021;546:481-511. Available from: https://www.sciencedirect.com/science/article/pii/S0020025520308318
  11. Mai G, Cao K, Lan X, Yuen PC. SecureFace: Face template protection. IEEE Transactions on Information Forensics and Security. 2021;16:262-277
  12. Elrefaei LA, Al-Mohammadi AM. Machine vision gait-based biometric cryptosystem using a fuzzy commitment scheme. Journal of King Saud University - Computer and Information Sciences. 2022;34(2):204-217. Available from: https://www.sciencedirect.com/science/article/pii/S1319157819300916
  13. Lin KC, Chen YM. A high-security-level iris cryptosystem based on fuzzy commitment and soft reliability extraction. IEEE Transactions on Dependable and Secure Computing. 2023:1-15. DOI: 10.1109/TDSC.2023.3289916
  14. Tams B. Decodability attack against the fuzzy commitment scheme with public feature transforms. arXiv; 2014. DOI: 10.48550/arXiv.1406.1154
  15. Keller D, Osadchy M, Dunkelman O. Fuzzy commitments offer insufficient protection to biometric templates produced by deep learning. CoRR. 2020;abs/2012.13293. Available from: https://arxiv.org/abs/2012.13293
  16. Hmani MA, Petrovska-Delacrétaz D, Dorizzi B. Locality preserving binary face representations using auto-encoders. IET Biometrics. 2022;11(5):445-458. DOI: 10.1049/bme2.12096
  17. Juels A, Wattenberg M. A fuzzy commitment scheme. In: Proceedings of the 6th ACM Conference on Computer and Communications Security. New York, USA: Association for Computing Machinery (ACM); 1999. pp. 28-36
  18. Lim M, Teoh ABJ, Kim J. Biometric feature-type transformation: Making templates compatible for secret protection. IEEE Signal Processing Magazine. 2015;32(5):77-87
  19. Hmani MA, Mtibaa A, Petrovska-Delacrétaz D. Joining forces of voice and facial biometrics: A case study in the scope of NIST SRE19. In: Voice Biometrics: Technology, Trust and Security. Chapter 9. London, England: IET Digital Library; 2021. pp. 187-217. DOI: 10.1049/PBSE012E_ch9
  20. Phillips PJ, Flynn PJ, Scruggs T, Bowyer KW, Chang J, Hoffman K, et al. Overview of the face recognition grand challenge. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). Vol. 1. IEEE; 2005. pp. 947-954
  21. Learned-Miller E, Huang GB, Roy Chowdhury A, Li H, Hua G. Labeled faces in the wild: A survey. In: Advances in Face Detection and Facial Image Analysis. New York, USA: Springer; 2016. pp. 189-248
  22. Guo Y, Zhang L, Hu Y, He X, Gao J. MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In: European Conference on Computer Vision. New York, USA: Springer; 2016. pp. 87-102
  23. Learned-Miller E, Huang GB, Roy Chowdhury A, Li H, Hua G. LFW: Results. Amherst, MA, USA: Computer Vision Research Laboratory, University of Massachusetts; 2019. Available from: http://vis-www.cs.umass.edu/lfw/results.html
  24. Kanade SG. Enhancing information security and privacy by combining biometrics with cryptography [thesis]. Lyon, France: Institut National des Télécommunications; 2010. Available from: https://tel.archives-ouvertes.fr/tel-01057728
  25. Bourlai T. Face recognition in challenging environments: An experimental and reproducible research survey. In: Face Recognition across the Imaging Spectrum. New York, USA: Springer; 2016. pp. 269-270
  26. Samaria FS, Harter AC. Parameterisation of a stochastic model for human face identification. In: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision. New York, USA: IEEE; 1994. pp. 138-142
  27. Messer K, Matas J, Kittler J, Luettin J, Maitre G, et al. XM2VTSDB: The extended M2VTS database. In: Second International Conference on Audio and Video-Based Biometric Person Authentication. Vol. 964. 1999. pp. 965-966
  28. Grover LK. A fast quantum mechanical algorithm for database search. In: Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing. New York, USA: Association for Computing Machinery (ACM); 1996. pp. 212-219

Notes

  1. Crypto-biometric systems using other biometric modalities (iris, fingerprint, or fusion) provide longer regenerated keys [9]. To our knowledge, our system provides the longest regenerated keys among schemes based on fuzzy commitment and face biometrics.
