An AFIS Candidate List Centric Fingerprint Likelihood Ratio Model Based on Morphometric and Spatial Analyses (MSA)


Introduction
The use of fingerprints for identification purposes boasts worldwide adoption for a large variety of applications, from governance-centric applications such as border control to personalised uses such as electronic device authentication. In addition to being an inexpensive and widely used biometric for authentication systems, fingerprints are also recognised as an invaluable biometric for forensic identification purposes such as law enforcement and disaster victim identification. Since the very first forensic applications, fingerprints have been one of the most commonly used forms of forensic evidence worldwide.
Applications of fingerprint identification are founded on the intrinsic characteristics of the friction ridge arrangement present at the fingertips, which can be generally classified at different levels or resolutions of detail (Figure 1). Generally speaking, fingerprint patterns can be described as numerous curved lines alternating as ridges and valleys that are largely regular in terms of orientation and flow, with relatively few exceptional key locations (singularities). A closer examination reveals a more detail-rich feature set allowing for greater discriminatory analysis. In addition, analysis of local textural detail, such as ridge shape, orientation, and frequency, has been used successfully in fingerprint matching algorithms as a primary feature [1] [2] or in conjunction with other landmark-based features [3].
Both biometric and forensic fingerprint identification applications rely on the premise that such fingerprint characteristics are highly discriminatory and immutable amongst the general population. However, the collectability of such fingerprint characteristics from biometric scanners, ink rolled impressions, and especially latent marks, is susceptible to adverse factors such as partiality of contact, variation in detail location and appearance due to skin elasticity (specifically for level 2 and 3 features) and applied force, environmental noise such as moisture, dirt, and slippage, and skin conditions such as dryness, scarring, warts, creases, and general ageing. Such influences generally act as a hindrance for identification, reducing both the quality and confidence of assessing matching features between impressions (Figure 2).
Figure caption fragment: ... ridge shape. These fingerprints were sourced from the FVC2002 [47], NIST4 [46], and NIST24 [48] databases.
In this chapter, we will first discuss the current state of forensic fingerprint identification and how models play an important role for the future, followed by a brief introduction to, and review of, relevant statistical models. Next, we will introduce a Likelihood Ratio (LR) model based on Support Vector Machines (SVMs) trained with features discovered via the morphometric and other spatial analyses of matching minutiae for both genuine and close imposter (or match and close non-match) populations typically recovered from Automated Fingerprint Identification System (AFIS) candidate lists. Lastly, experimentation performed on a set of over 60,000 publicly available fingerprint images (mostly sourced from NIST and FVC databases) and a distortion set of 6,000 images will be presented, illustrating that the proposed LR model reliably guides towards the right proposition in the identification assessment for both genuine and high ranking imposter populations, based on the discovered distortion characteristic differences of each population.
Figure caption fragment: ... [48], FVC2004 [51], and our own databases.

Forensic fingerprint identification
Historically, the forensic identification of fingerprints has had near unanimous acceptance as a gold standard of forensic evidence, where the scientific foundations of such testimonies were rarely challenged in court proceedings. In addition, fingerprint experts have generally been regarded as expert witnesses with adequate training, scientific knowledge, and relevant experience, who follow a methodical process for identification, ultimately giving credibility to their expert witness testimonies.
Fingerprint experts largely follow a friction ridge identification process called ACE-V (Analysis, Comparison, Evaluation, and Verification) [5] to compare an unknown fingermark with known fingerprint exemplars. The ACE-V acronym also details the ordering of the identification process (Figure 3). In the Analysis stage, all visible ridge characteristics (level 1, 2, and 3) are noted and assessed for reliability, while taking into account variations caused by pressure, distortion, contact medium, and development techniques used in the laboratory. The Comparison stage involves comparing features between the latent mark and either the top n fingerprint exemplars returned from an AFIS search, or specific pre-selected exemplars. If a positive identification is declared, all corresponding features are charted, along with any differences considered to be caused by environmental influence. The Evaluation stage consists of an expert making an inferential decision based on the Comparison stage observations. The possible outcomes [6] are:
• exclusion: a discrepancy of features is discovered that precludes the possibility of a common source,
• identification: a significant correspondence of features is discovered that is considered sufficient in itself to conclude a common source, and
• inconclusive: not enough evidence is found for either an exclusion or an identification.
The Verification stage consists of a peer review of the prior stages. Any discrepancies in evaluations are handled by a conflict resolution procedure.
Figure 3. Flowchart of modern ACE-V process used in conjunction with AFIS. The iterative comparison of each exemplar fingerprint in the AFIS candidate list is performed until identification occurs or no more exemplars are left. The red flow lines indicate the process for the verification stage analysis. The purple flow line from the agreement of features test shows the ACE process that skips the evaluation stage.
Identification evaluation conclusions [7] made by fingerprint experts have been historically influenced by Edmond Locard's tripartite rule [8]. The tripartite rule is defined as follows:
• Positive identifications are possible when there are more than 12 minutiae within sharp quality fingermarks.
• If 8 to 12 minutiae are involved, then the case is borderline. Certainty of identity will depend on additional information such as fingermark quality, rarity of pattern, presence of the core, delta(s), and pores, and ridge shape characteristics, along with agreement by at least 2 experts.
• If a limited number of minutiae are present, the fingermarks cannot provide certainty for an identification, but only a presumption of strength proportional to the number of minutiae.
Holistically, the tripartite rule can be viewed as a probabilistic framework, where the successful applications of the first and second rules are analogous to a statement with 100% certainty that the mark and the print share the same source, whereas the third rule covers the probability range between 0% and 100%. While some jurisdictions only apply the first rule to set a numerical standard within the ACE-V framework, other jurisdictions (such as Australia, UK, and USA [9]) adopt a holistic approach, where no strict numerical standard or feature combination is prescribed. Nevertheless, current fingerprint expert testimony is largely restricted to conclusions that convey a statement of certainty, ignoring the third rule's probabilistic outcome.

Daubert and criticisms
Recently, a number of criticisms have been voiced concerning the scientific validity of forensic fingerprint identification [10] [11] [12] [13] [14] [15]. Questions with regards to the scientific validity of forensic fingerprint identification began shortly after the Daubert case [17]. In the 1993 case of Daubert v. Merrell Dow Pharmaceuticals [18], the US Supreme Court outlined criteria concerning the admissibility of scientific expert testimony. The criteria given for a valid scientific method were as follows:
• must be based on testable and falsifiable theories/techniques,
• must be subjected to peer-review and publication,
• must have known or predictable error rates,
• must have standards and controls concerning its applications, and
• must be generally accepted by a relevant scientific community.
The objections which followed [13] [14] [15] from a number of academic and legal commentators were:
• the contextual bias of experts for decisions made within the ACE-V framework used in fingerprint identification,
• the unfounded and unfalsifiable theoretical foundations of fingerprint feature discriminability, and
• the 'unscientific' absolute conclusions of identification in testimonies (i.e., either match, non-match, or inconclusive).
There have been a number of studies [16] over the last 5 years concerning contextual bias and the associated error rates of ACE-V evaluations in practice. The experiments reported by [19] led to the conclusion that experts appear more susceptible to biased assessments of 'inconclusive' and 'exclusion', while false positive rates are reasonably low within simulations of the ACE-V framework. Results in [20] and [21] also suggest that not all stages of ACE-V are equally vulnerable to contextual bias, with primary effects occurring in the analysis stage; proposals on how to mediate such variability can be found in [22]. While contextual bias is solely concerned with the influence of the expert, the remaining criticisms can be summarised as the non-existence of a scientifically sound probabilistic framework for fingerprint evidential assessment that has the consensus approval of the forensic science community.
The theoretical foundations of fingerprint identification primarily rest on rudimentary observational science, where a high discriminability of feature characteristics exists. However, there is a lack of consensus regarding quantifiable error rates for a given pair of 'corresponding' feature configurations [23]. Some critics have invoked a more traditional interpretation for discriminability [24] [25], claiming that an assumption of 'uniqueness' is used. Such an assumption clearly violates the falsifiability requirement of Daubert. However, it has been argued that modern day experts do not necessarily associate discriminability with uniqueness [26]. Nevertheless, a consensus framework for calculating accurate error rates for corresponding fingerprint features needs to be established.

Role of statistical models
While a probabilistic framework for fingerprint comparisons has not been historically popular and was even previously banned by professional bodies [8], it has received a more favourable treatment within the forensic community in recent times. For example, the IAI have recently rescinded their ban on reporting possible, probable, or likely conclusions [27] and support the future use of valid statistical models (provided that they are accepted as valid by the scientific community) to aid the practitioner in identification assessments. It has also been suggested in [28] that a probabilistic framework is based on strong scientific principles, unlike the traditional numerical standards.
Statistical models for fingerprint identification provide a probabilistic framework for forensic fingerprint evaluations that accounts for the inherent uncertainties of fingerprint evidence. Moreover, the use of such statistical models as an identification framework helps answer the criticisms of scientific reliability and error rate knowledge raised by some commentators. For instance, statistical models can be used to describe the discriminatory power of a given fingerprint feature configuration, which in turn can be used to predict and estimate error rates associated with the identification of specific fingerprint features found in any given latent mark.
Statistical models could potentially act as a tool for fingerprint practitioners with evaluations made within the ACE-V framework, specifically when the confidence in identification or exclusion is not overtly clear. However, such applications require statistical models to be accurate and robust to real world scenarios.

Likelihood Ratio models
A likelihood ratio (LR) is a simple yet powerful statistic when applied to a variety of forensic science applications, including inference of identity of source for evidence such as DNA [29], ear-prints [30], glass fragments [31], and fingerprints [32] [33] [34] [35]. An LR is defined as the ratio of two likelihoods of a specific event occurring, each of which follows a different prior hypothesis, and thus, empirical distribution. In the forensic identification context, an event, E, may represent the recovered evidence in question, while the prior hypotheses considered for calculating the two likelihoods of E occurring are:
• $H_0$: E comes from a specific known source, P, and
• $H_A$: E has an alternative origin to P.
Noting any additional relevant prior information collected from the crime scene as $I_{cs}$, the LR can be expressed as
$$LR = \frac{P(E \mid H_0, I_{cs})}{P(E \mid H_A, I_{cs})} \tag{1}$$
where $P(E \mid H_0, I_{cs})$ is the likelihood of the observations on the mark given that the mark was produced by the same finger as the print P, while $P(E \mid H_A, I_{cs})$ is the likelihood of the observations on the mark given that the mark was not produced by the same finger as P. The LR value can be interpreted as follows:
• LR < 1: the evidence has more support for hypothesis $H_A$,
• LR = 1: the evidence has equal support from both hypotheses, and
• LR > 1: the evidence has more support for hypothesis $H_0$.
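The interpretation rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the likelihood values are hypothetical, and in practice the two likelihoods would come from a fitted statistical model rather than being supplied by hand.

```python
def likelihood_ratio(p_e_given_h0: float, p_e_given_ha: float) -> float:
    """Ratio of the likelihoods of the evidence E under H0 and under HA."""
    if p_e_given_h0 < 0 or p_e_given_ha <= 0:
        raise ValueError("likelihoods must be non-negative, with a positive denominator")
    return p_e_given_h0 / p_e_given_ha


def supported_hypothesis(lr: float) -> str:
    """Map an LR value to the hypothesis it supports."""
    if lr > 1:
        return "H0"
    if lr < 1:
        return "HA"
    return "neither"


# Hypothetical likelihood values, for illustration only.
lr = likelihood_ratio(0.5, 0.25)
print(lr, supported_hypothesis(lr))
```

The magnitude of the LR, and not only its side of 1, carries the strength of support, which is why the sketch returns the ratio itself alongside the label.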
The general LR form of equation (1) can be restated specifically for fingerprint identification evaluations. Given an unknown query impression, y (e.g., an unknown latent mark), with $m'$ marked features (denoted as $y^{(m')}$), and a known impression, x (e.g., a known AFIS candidate or latent mark), with m marked features (denoted as $x^{(m)}$), the LR is defined as
$$LR = \frac{P(y^{(m')} \mid x^{(m)}, H_0, I_{cs})}{P(y^{(m')} \mid x^{(m)}, H_A, I_{cs})} \tag{2}$$
where the value $P(y^{(m')} \mid x^{(m)}, H_0, I_{cs})$ represents the probability that impressions x and y agree given that the marks were produced by the same finger, while $P(y^{(m')} \mid x^{(m)}, H_A, I_{cs})$ is the probability that x and y agree given that the marks were not produced by the same finger, using the closest q corresponding features between $x^{(m)}$ and $y^{(m')}$ with $q \le \min(m, m')$. Thus, the hypotheses used to calculate the LR numerator and denominator probabilities are defined as:
• $H_0$: x and y were produced by the same finger, and
• $H_A$: x and y were produced by different fingers.
The additional crime scene information, $I_{cs}$, may include detail of surrounding fingermarks, surface characteristics of the contacted medium, or a latent mark quality/confidence assessment. In order to measure the within-finger and between-finger variability of landmark based feature configurations required to derive values for $P(y^{(m')} \mid x^{(m)}, H_0, I_{cs})$ and $P(y^{(m')} \mid x^{(m)}, H_A, I_{cs})$, models use either statistical distributions of dissimilarity metrics (used as a proxy for direct assessment) derived from the analysis of spatial properties [33] [34] [35], or similarity score distributions produced by the AFIS [36] [37] [38].

AFIS score based LR models
AFIS score based LR models use estimates of the genuine and imposter similarity score distributions from fingerprint matching algorithm(s) within AFIS, in order to derive an LR measure. In a practical application, a given mark and exemplar may have an AFIS similarity score of s, from which the conditional probability of the score can be calculated (Figure 4) to give an LR of
$$LR = \frac{P(s \mid H_0)}{P(s \mid H_A)} \tag{3}$$

Parametric Based Models
In order to estimate the score distributions used in equation (3), the authors of [36] proposed using the Weibull $W(\lambda, \beta)$ and Log-Normal $\ln N(\mu, \sigma^2)$ distributions with scale/shape parameters tuned to estimate the genuine and imposter AFIS score distributions, respectively. Given query and template fingermarks with an AFIS similarity score, s, the LR is
$$LR = \frac{f_W(s \mid \lambda, \beta)}{f_{\ln N}(s \mid \mu, \sigma^2)} \tag{4}$$
using the proposed probability density functions of the estimated AFIS genuine and imposter score distributions.
Figure 4. Typical AFIS imposter and genuine score distributions. The LR can be directly calculated for a given similarity score using the densities from these distributions.
An updated variant can be found in [37], where imposter and genuine score distributions are modelled per minutiae configuration. This allows the rarity of the configuration to be accounted for.
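As a rough illustration of a parametric score based LR of this kind, the following Python sketch evaluates a Weibull density (genuine scores) against a log-normal density (imposter scores) using only the standard library. The parameter values and the score range are invented for illustration and are not the tuned values of [36]; the function names are likewise hypothetical.

```python
import math


def weibull_pdf(s, lam, beta):
    """Weibull density with scale lam and shape beta (treated as 0 for s <= 0)."""
    if s <= 0:
        return 0.0
    return (beta / lam) * (s / lam) ** (beta - 1) * math.exp(-((s / lam) ** beta))


def lognormal_pdf(s, mu, sigma):
    """Log-normal density; sigma is the standard deviation of ln(s)."""
    if s <= 0:
        return 0.0
    z = (math.log(s) - mu) / sigma
    return math.exp(-0.5 * z * z) / (s * sigma * math.sqrt(2 * math.pi))


def score_lr(s, lam, beta, mu, sigma):
    """LR for score s: genuine (Weibull) density over imposter (log-normal) density."""
    denominator = lognormal_pdf(s, mu, sigma)
    if denominator == 0.0:
        raise ValueError("imposter density is zero at this score")
    return weibull_pdf(s, lam, beta) / denominator


# Illustrative (assumed) parameters: genuine scores centred near 270,
# imposter scores centred near 50.
GENUINE = dict(lam=300.0, beta=3.0)
IMPOSTER = dict(mu=math.log(50.0), sigma=0.5)

for s in (40.0, 250.0):
    print(s, score_lr(s, **GENUINE, **IMPOSTER))
```

With these assumed parameters a low score yields an LR below 1 and a high score an LR above 1. Modelling the two distributions per minutiae configuration, as in [37], would amount to selecting a different parameter set for each configuration.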

Non-Match Probability Based Model
The authors of [38] proposed a model based on AFIS score distributions, using LR and Non-Match Probability (NMP) calculations. The NMP can be written mathematically as
$$NMP = P(H_A \mid x, y, I_{cs}) = 1 - P(H_0 \mid x, y, I_{cs}) \tag{5}$$
which is simply the complement of the probability that the null hypothesis (i.e., x and y come from the same known source) is true, given prior conditions x, y, and $I_{cs}$ (i.e., background information).
Three main methods for modelling the AFIS score distributions were tested, being (i) histogram based, (ii) Gaussian kernel density based, and (iii) parametric density based estimation using the proposed distributions found in [36]. Given an AFIS score, s, the NMP and LR were calculated by setting $P(H_A) = P(H_0)$, while estimating both $P(s \mid H_A)$ and $P(s \mid H_0)$ either by normalised bin (method (i)) or probability density (methods (ii) and (iii)) values for the respective distributions. Experimentation revealed that the parametric method was biased. In addition, the authors suggest that the kernel density method is preferable, as it does not suffer from bias and can be used to extrapolate NMP scores where no match has been observed, unlike the histogram based representation.
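The kernel density variant can be sketched as follows, assuming Bayes' rule is applied to the two score densities with the stated equal priors. The training scores, the fixed bandwidth, and the function names are all hypothetical; a real implementation would select the bandwidth from data.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on the samples."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(s):
        return sum(math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for x in samples) / norm

    return density


def nmp_and_lr(s, genuine_density, imposter_density, prior_h0=0.5):
    """Non-match probability P(HA | s) and LR for an AFIS score s."""
    like_h0 = genuine_density(s)
    like_ha = imposter_density(s)
    evidence = like_h0 * prior_h0 + like_ha * (1 - prior_h0)
    if evidence == 0.0:
        raise ValueError("both densities are zero at this score")
    nmp = like_ha * (1 - prior_h0) / evidence
    lr = math.inf if like_ha == 0.0 else like_h0 / like_ha
    return nmp, lr


# Toy score samples (assumed), standing in for large genuine/imposter score sets.
genuine = gaussian_kde([80.0, 90.0, 100.0], bandwidth=10.0)
imposter = gaussian_kde([10.0, 20.0, 30.0], bandwidth=10.0)
print(nmp_and_lr(90.0, genuine, imposter))
```

With equal priors the two quantities are tied together by NMP = 1 / (1 + LR), so either can be reported from the same pair of density estimates.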

Practicality of AFIS based LR Models
AFIS score based LR models provide a framework that is both practically based and simple to implement in conjunction with the AFIS architecture. However, model performance is dependent on the matching algorithm of the AFIS. In fact, the LR models presented will usually reflect the exact information contained in a candidate list of an AFIS query. A more complex construction, for instance, multiple AFIS matching algorithms combined with a mixture-of-experts statistical model, would be preferable, as it would avoid LR values that are strictly algorithm dependent.
The scores produced from matching algorithms in AFIS detail pairwise similarity between two impressions (i.e., mark and exemplar). However, the methods used in [36] [38], which generalise the distributions for all minutiae configurations, do not allow evidential aspects such as the rarity of a given configuration to be considered. A sounder approach would be to base LR calculations on methods that do not focus primarily on pairwise similarities, but consider statistical characteristics of features within a given population. For instance, the LR for a rare minutiae configuration should be weighted to reflect its significance. This is achieved in the method described in [37] by estimating score distributions for each minutiae configuration.

Feature Vector based LR models
Feature Vector (FV) based LR models are based on FVs constructed from landmark (i.e., minutiae) feature analyses. A dissimilarity metric is defined that is based on the resulting FV. The distributions of such vector dissimilarity metrics are then analysed for both genuine and imposter comparisons, from which an LR is derived.

Delaunay Triangulation FV Model
The first FV based LR model proposed in the literature can be found in [33]. FVs are derived from Delaunay triangulation (Figure 5, left) for different regions of the fingerprint. Each FV was constructed as follows:
$$x = [GP_x, R_x, Nt_x, A_{1x}, L_{1x-2x}, A_{2x}, L_{2x-3x}, A_{3x}, L_{3x-1x}] \tag{6}$$
where $GP_x$ is the pattern of the mark, $R_x$ is the region of the fingerprint, $Nt_x$ is the number of minutiae that are ridge endings in the triangle (with $Nt_x \in \{0, 1, 2, 3\}$), $A_{ix}$ is the angle of the i th minutia, and $L_{ix-((i \bmod 3)+1)x}$ is the length in pixels between the i th and the ((i mod 3) + 1) th minutiae, for a given query fingerprint. Likewise, these structures are created for candidate fingerprint(s):
$$y = [GP_y, R_y, Nt_y, A_{1y}, L_{1y-2y}, A_{2y}, L_{2y-3y}, A_{3y}, L_{3y-1y}] \tag{7}$$
Figure 5. Delaunay triangulation (left) and radial triangulation (right) differences for a configuration of 7 minutiae. The blue point for the radial triangulation illustration represents the centroid (i.e., arithmetic mean of minutiae x-y coordinates).
The FVs can be decomposed into continuous and discrete components, representing the measurement based and count/categorical features, respectively. Thus, the likelihood ratio is rewritten as:
$$LR = LR_{c|d} \cdot LR_d \tag{8}$$
where $LR_d$ is formed as a prior likelihood ratio with discrete FVs $x_d = [GP_x, R_x, Nt_x]$ and $y_d = [GP_y, R_y, Nt_y]$, and $LR_{c|d}$ is the likelihood ratio of the continuous FVs $x_c$ and $y_c$, which contain the remaining features in x and y, respectively, conditioned on the discrete FVs. The discrete likelihood numerator takes the value of 1, while the denominator was calculated using frequencies for general patterns multiplied by region and minutia-type combination probabilities observed from large datasets.
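A dissimilarity metric, $d(x_c, y_c)$, was created for comparing the continuous FVs, defined as:
$$d(x_c, y_c) = \Delta^2 A_1 + \Delta^2 L_{1-2} + \Delta^2 A_2 + \Delta^2 L_{2-3} + \Delta^2 A_3 + \Delta^2 L_{3-1} \tag{9}$$
with $\Delta^2$ as the squared difference of corresponding variables from $x_c$ and $y_c$. This was used to calculate the continuous likelihood ratio, with:
$$LR_{c|d} = \frac{P(d(x_c, y_c) \mid x_d, y_d, H_0, I_{cs})}{P(d(x_c, y_c) \mid x_d, y_d, H_A, I_{cs})}$$
where both densities were estimated using a kernel smoothing method. All LR numerator and denominator likelihood calculations were derived from these distribution estimates.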
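A minimal Python sketch of the pieces described above: a sum of squared differences over the continuous triangle features, a discrete prior LR with numerator 1, and their product. The feature values, the pattern frequency, and the region/minutia-type probability are invented for illustration, and angles are treated as plain numbers (no circular wrap-around), which is an assumption of this sketch rather than something stated for the model in [33].

```python
def dissimilarity(x_c, y_c):
    """Sum of squared differences between corresponding continuous features."""
    if len(x_c) != len(y_c):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x_c, y_c))


def discrete_lr(pattern_frequency, region_type_probability):
    """Prior LR: numerator 1, denominator = pattern frequency times
    region and minutia-type combination probability."""
    return 1.0 / (pattern_frequency * region_type_probability)


def combined_lr(lr_continuous, lr_discrete):
    """Overall LR as the product of the continuous and discrete components."""
    return lr_continuous * lr_discrete


# Hypothetical continuous FVs laid out as [A1, L12, A2, L23, A3, L31].
x_c = [0.50, 40.0, 1.20, 55.0, 2.10, 38.0]
y_c = [0.55, 41.0, 1.15, 54.0, 2.00, 39.5]
print(dissimilarity(x_c, y_c))
print(combined_lr(2.0, discrete_lr(0.25, 0.1)))
```

In the full model the continuous LR passed to `combined_lr` would be the ratio of two kernel-smoothed densities evaluated at the dissimilarity value, not a hand-picked number as here.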
Two experiments were configured in order to evaluate within-finger (i.e., genuine) and between-finger (i.e., imposter) LRs. Ideally, LRs for within-finger comparisons should be larger than all between-finger ratios. The within-finger experiment used 216 fingerprints from 4 different fingers under various distortion levels. The between-finger datasets included the same 818 fingerprints used in the minutia-type probability calculations. Delaunay triangulation had to be manually adjusted in some cases due to different triangulation results occurring under high distortion levels. Error rates for LRs greater than 1 for false comparisons (i.e., between-finger) and LRs less than 1 for true comparisons (i.e., within-finger) for index fingers, middle fingers, and thumbs are given in Table 1. These error rates indicate the power that 3 minutiae (in each triangle) have in creating an LR value dichotomy between within-finger and between-finger comparisons.

Radial Triangulation FV Model: I
Although the triangular structures of [33] performed reasonably well in producing higher LRs for within-finger comparisons against between-finger comparisons, there are issues with the proposed FV structure's robustness towards distortion. In addition, LRs could potentially have increased dichotomy between imposter and genuine comparisons by including more minutiae in the FV structures, rather than restricting each FV to only have three minutiae.
The authors of [34] defined radial triangulation FVs based on n minutiae, $x = [GP_x, x^{(n)}]$, with:
$$x^{(n)} = [\{T_{x,1}, RA_{x,1}, R_{x,1}, L_{x,1,2}, S_{x,1}\}, \ldots, \{T_{x,n}, RA_{x,n}, R_{x,n}, L_{x,n,1}, S_{x,n}\}]$$
(and similarly for y and $y^{(n)}$), where GP denotes the general pattern, $T_k$ is the minutia type, $RA_k$ is the direction of minutia k relative to the image, $R_k$ is the radius from the k th minutia to the centroid (Figure 5, right), $L_{k,k+1}$ is the length of the polygon side from minutia k to minutia (k mod n) + 1, and $S_k$ is the area of the triangle defined by minutia k, minutia (k mod n) + 1, and the centroid.
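The construction of such a radial feature vector can be sketched in Python as below. The `Minutia` type, the function name, and the choice to order minutiae clockwise by polar angle around the centroid in a y-up coordinate frame are assumptions made for this sketch; the general pattern component and any normalisation are left out.

```python
import math
from typing import List, NamedTuple, Tuple


class Minutia(NamedTuple):
    x: float
    y: float
    direction: float  # radians, relative to the image
    kind: str         # e.g. "ending" or "bifurcation"


def radial_fv(minutiae: List[Minutia]) -> List[Tuple[str, float, float, float, float]]:
    """Per-minutia tuples (type, direction, radius, side length, triangle area)."""
    n = len(minutiae)
    if n < 3:
        raise ValueError("at least three minutiae are needed")
    # Centroid: arithmetic mean of the minutiae coordinates.
    cx = sum(m.x for m in minutiae) / n
    cy = sum(m.y for m in minutiae) / n
    # Clockwise order in a y-up frame = descending polar angle (assumption).
    ordered = sorted(minutiae, key=lambda m: -math.atan2(m.y - cy, m.x - cx))
    fv = []
    for k, m in enumerate(ordered):
        nxt = ordered[(k + 1) % n]  # next minutia, wrapping around
        radius = math.hypot(m.x - cx, m.y - cy)
        side = math.hypot(nxt.x - m.x, nxt.y - m.y)
        # Area of triangle (minutia k, next minutia, centroid) via the cross product.
        area = 0.5 * abs((m.x - cx) * (nxt.y - cy) - (nxt.x - cx) * (m.y - cy))
        fv.append((m.kind, m.direction, radius, side, area))
    return fv


# Four hypothetical minutiae on the corners of a 2 x 2 square.
square = [Minutia(0, 0, 0.0, "a"), Minutia(2, 0, 0.0, "b"),
          Minutia(2, 2, 0.0, "c"), Minutia(0, 2, 0.0, "d")]
for entry in radial_fv(square):
    print(entry)
```

Because every triangle shares the centroid, the triangle areas sum to the polygon area for a convex configuration such as this one, which gives a quick sanity check on the construction.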
The LR was then calculated as
$$LR = LR_g \cdot LR_{n|g}$$
The component $LR_g$ is formed as a prior likelihood ratio with $P(GP_x, GP_y \mid H_0, I_{cs}) = 1$ and $P(GP_x, GP_y \mid H_A, I_{cs})$ equal to the FBI pattern frequency data. Noting that the centroid FVs can be arranged in n different ways (accounting for clockwise rotation):
$$y_j^{(n)} = [\{T_{y,k}, RA_{y,k}, R_{y,k}, L_{y,k,k+1}, S_{y,k}\} : k = j, j+1, \ldots, n, 1, \ldots, j-1]$$
for j = 1, 2, . . . , n, $LR_{n|g}$ was defined as
$$LR_{n|g} = \frac{P(d(x^{(n)}, y^{(n)}) \mid GP_x, GP_y, H_0, I_{cs})}{P(d(x^{(n)}, y^{(n)}) \mid GP_x, GP_y, H_A, I_{cs})}$$
where the dissimilarity metric is
$$d(x^{(n)}, y^{(n)}) = \min_{i=1,\ldots,n} d(x^{(n)}, y_i^{(n)}) \tag{14}$$
Each $d(x^{(n)}, y_i^{(n)})$ is the Euclidean distance between the respective FVs, which are normalised to take a similar range of values. The two conditional probability density functions $P(d(x^{(n)}, y^{(n)}) \mid GP_x, GP_y, H_0, I_{cs})$ and $P(d(x^{(n)}, y^{(n)}) \mid GP_x, GP_y, H_A, I_{cs})$ were estimated using mixture models of normal distributions with a mixture of three and four distributions, respectively, using the EM algorithm to estimate distributions for each finger and number of minutiae used.
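The rotation handling can be sketched as follows: the dissimilarity is taken as the smallest Euclidean distance between one feature vector and every cyclic rotation of the other. The sketch assumes the per-minutia features are already numeric and normalised; the function names and toy values are hypothetical.

```python
import math


def rotations(fv):
    """All n cyclic rotations of a list of per-minutia feature tuples."""
    return [fv[j:] + fv[:j] for j in range(len(fv))]


def euclidean(x, y):
    """Euclidean distance between two equally shaped feature vectors."""
    return math.sqrt(sum((a - b) ** 2
                         for mx, my in zip(x, y)
                         for a, b in zip(mx, my)))


def dissimilarity(x, y):
    """Minimum Euclidean distance over all rotations of y."""
    if len(x) != len(y):
        raise ValueError("configurations must contain the same number of minutiae")
    return min(euclidean(x, rotated) for rotated in rotations(y))


# Toy one-feature-per-minutia vectors: y is x started from a different minutia.
x = [(1.0,), (2.0,), (3.0,)]
y = [(3.0,), (1.0,), (2.0,)]
print(dissimilarity(x, y))
```

Taking the minimum over rotations removes the dependence on which minutia happens to be listed first, at the cost of n distance evaluations per comparison.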
This method modelled within and between finger variability more accurately in comparison to the earlier related work in [33], due to the flexibility of the centroid structures containing more than three minutiae. For example, the addition of one extra minutia halved the LR error rate for some fingerprint patterns. In addition, the prior likelihood is more flexible in real life applications as it is not dependent on identifying the specific fingerprint region (which is more robust for real life fingermark-to-exemplar comparisons).

Radial Triangulation FV Model: II
The authors of [35] proposed an FV based LR model using radial triangulation structures. In addition, they tuned the model using distortion and examination influence models. The radial triangulation FVs used were based on the structures defined in [34], where five features are stored per minutia, giving
$$y_i^{(n)} = [\{\delta_j, \sigma_j, \theta_j, \alpha_j, \tau_j\} : j = i, i+1, \ldots, n, 1, \ldots, i-1]$$
for a configuration $y^{(n)}$ starting from the i th minutia, for i = 1, 2, . . . , n, where $\delta_j$ is the distance between the j th minutia and the centroid point, $\sigma_j$ is the distance between the j th minutia and the next contiguous minutia (in a clockwise direction), $\theta_j$ is the angle between the direction of a minutia and the line from the centroid point, $\alpha_j$ is the area of the triangle constituted by the j th minutia, the next contiguous minutia and the centre of the polygon, and $\tau_j$ is the type of the j th minutia (ridge ending, bifurcation, unknown).
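The distance between configurations $x^{(n)}$ and $y^{(n)}$, each representing n minutiae, is
$$d(x^{(n)}, y^{(n)}) = \min_{i=1,\ldots,n} d_c(x^{(n)}, y_i^{(n)})$$
where
$$d_c^2(x^{(n)}, y_i^{(n)}) = \sum_{j=1}^{n} \Delta_j^2$$
with
$$\Delta_j^2 = q_\delta \left(x^{(n)}(\delta_j) - y_i^{(n)}(\delta_j)\right)^2 + q_\sigma \left(x^{(n)}(\sigma_j) - y_i^{(n)}(\sigma_j)\right)^2 + q_\theta\, d_\theta\!\left(x^{(n)}(\theta_j), y_i^{(n)}(\theta_j)\right)^2 + q_\alpha \left(x^{(n)}(\alpha_j) - y_i^{(n)}(\alpha_j)\right)^2 + q_\tau\, d_T\!\left(x^{(n)}(\tau_j), y_i^{(n)}(\tau_j)\right)^2$$
where $x^{(n)}(\delta_j)$ (and $y_i^{(n)}(\delta_j)$) is the normalised value for $\delta$ for the j th minutia, and likewise for all other normalised vector components $\sigma$, $\theta$, $\alpha$, and $\tau$, while $d_\theta$ is the angular difference and $d_T$ is the defined minutiae type difference metric. The multipliers (i.e., $q_\delta$, $q_\sigma$, $q_\theta$, $q_\alpha$, and $q_\tau$) are tuned via a heuristic based procedure.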
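A weighted configuration distance of this kind might be sketched in Python as below. Several points are assumptions of the sketch rather than details confirmed for [35]: per-minutia squared differences are weighted by the multipliers, summed, square-rooted, and minimised over the starting minutia; the type difference is a simple 0/1 indicator; and the multiplier values are placeholders, not tuned values.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def type_diff(t1, t2):
    """Hypothetical type metric: 0 for equal types, 1 otherwise."""
    return 0.0 if t1 == t2 else 1.0


def weighted_sq(mx, my, q):
    """Weighted squared difference for one pair of minutiae.
    Each minutia is (delta, sigma, theta, alpha, tau)."""
    return (q["delta"] * (mx[0] - my[0]) ** 2
            + q["sigma"] * (mx[1] - my[1]) ** 2
            + q["theta"] * angular_diff(mx[2], my[2]) ** 2
            + q["alpha"] * (mx[3] - my[3]) ** 2
            + q["tau"] * type_diff(mx[4], my[4]) ** 2)


def config_distance(x, y, q):
    """Minimum weighted distance over all starting minutiae of y."""
    if len(x) != len(y):
        raise ValueError("configurations must contain the same number of minutiae")
    best = math.inf
    for i in range(len(y)):
        rotated = y[i:] + y[:i]
        total = sum(weighted_sq(a, b, q) for a, b in zip(x, rotated))
        best = min(best, math.sqrt(total))
    return best


# Placeholder multipliers and toy (already normalised) configurations.
Q = {"delta": 1.0, "sigma": 1.0, "theta": 1.0, "alpha": 1.0, "tau": 4.0}
x = [(1.0, 1.0, 0.0, 1.0, "e"), (2.0, 2.0, 0.0, 2.0, "b"), (3.0, 3.0, 0.0, 3.0, "e")]
y = [(2.0, 2.0, 0.0, 2.0, "b"), (3.0, 3.0, 0.0, 3.0, "e"), (1.0, 1.0, 0.0, 1.0, "e")]
print(config_distance(x, y, Q))
```

Raising the `tau` multiplier makes type disagreements more costly, which is the kind of trade-off the tuning procedure has to settle given that minutiae types can change under distortion.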
The proposed LR calculation makes use of:
• distortion model: based on the Thin Plate Spline (TPS) bending energy matrices representing the non-affine differences of minutiae spatial detail trained from a dataset focused on finger variability, and
• examiner influence model: created to represent the variability of examiners when labelling minutiae in fingerprint images.
Let y (k) be the configuration of a fingermark, x i,M }), from which the LR is given as where ψ is defined as which is a mixture of Exponential and Beta functions with tuned parameters λ 1 and λ 2 , while d 0 is the smallest value into which distances were binned, and T (k) is the 95th percentile of simulated scores from the examiner influence model applied on y (k) . Experimental results from a large validation dataset showed that the proposed LR model can generally distinguish within and between finger comparisons with high accuracy, while an increased dichotomy arose from increasing the configuration size.

Practicality of FV based LR Models
Generally speaking, to implement robust FV based statistical models for forensic applications, the following must be considered:
• Any quantitative measures used should be based on the data driven discovery of statistical relationships of features. Thus, a rich dataset for both within and between finger data is essential.
• Effects of skin distortion must be considered in models. Latent marks can be highly distorted from skin elasticity and applied pressure. For instance, differences in both minutiae location (relative to other features) and type (also known as type transfer) can occur when different distortion exists.
• Features used in models must be robust to noisy environmental factors, whilst maintaining a high level of discriminatory power. For instance, level 1 features such as classification may not be available due to partiality. In addition, level 2 sub-features such as ridge count between minutiae and minutiae type, and level 3 features such as pores, may not be available in a latent mark due to the material properties of the contacted medium or other environmental noise that regularly exists in latent mark occurrences.
• The model should be robust towards reasonable variations in feature markings from practitioners in the analysis phase of ACE-V. For instance, minutiae locations can vary slightly depending on where a particular practitioner marks a given minutia.
The LR models proposed in [33] and [34] use dissimilarity measures of FVs (equations (9) and (14)) which are potentially erroneous as minutiae types can change, particularly in distorted impressions. While the method in [35] has clearly improved the dissimilarity function by introducing tuned multipliers, squared differences in angle, area, and distance based measures are ultimately not probabilistically based. A joint probabilistic based metric for each FV component using distributions for both imposter and genuine populations would be more consistent with the overall LR framework.
With regards to skin distortion, the radial triangulation FV structures of [34] [35] are robust, unlike the Delaunay triangulation structure of [33]. Furthermore, the model proposed in [35] models realistic skin distortion encountered on flat surfaces by measuring the bending energy matrix for a specialised distortion set. However, this only accounts for the non-affine variation. Affine transformations such as shear and uniform compression/dilation are not accounted for. Such information can be particularly significant for comparisons of small minutiae configurations encountered in latent marks. For instance, a direct downward application of force may have prominent shear and scale variations (in addition to non-affine differences) for minutiae configurations, in comparison to the corresponding configurations of another impression from the same finger having no notable downward force applied.

Proposed method: Morphometric and Spatial Analyses (MSA) based Likelihood Ratio model
In this section, we present a newly formulated FV based LR model that focuses on the important sub-population of close non-matches (i.e., highly similar imposters), with intended practicality for fingermark-to-exemplar identification scenarios where only sparse minutiae triplet information may be available for comparisons. First we discuss relevant background material concerning morphometric and spatial measures to be used in the FVs of the proposed model. The proposed model is presented, which is based on a novel machine learning framework, followed by a proposed LR calculation that focuses on the candidate list population of an AFIS match query (i.e., containing close non-match exemplars and/or a matching exemplar). Finally, an experimental framework centred around the simulation of fingermark-to-exemplar close non-match discovery is introduced, followed by experimental results.

Morphometric and spatial metrics
The foundations of the morphometric and spatial analyses used in the proposed FV based LR model are presented. This includes a non-parametric multidimensional goodness-of-fit statistic, along with several other morphometrical measures that describe and contrast shape characteristics between two given configurations. In addition, a method for finding close non-match minutiae configurations is presented.

Multidimensional Kolmogorov-Smirnov Statistic for Landmarks
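A general multidimensional Kolmogorov-Smirnov (KS) statistic for two empirical distributions has been proposed in [39] with properties of high efficiency, high statistical power, and distributional freeness. Like the classic one dimensional KS test, the multidimensional variant looks for the largest absolute difference between the two empirical cumulative distribution functions as a measure of fit. Without loss of generality, let two sets with $m$ and $n$ points in $\mathbb{R}^3$ be denoted as $X = \{(x_1, y_1, z_1), \ldots, (x_m, y_m, z_m)\}$ and $Y = \{(x'_1, y'_1, z'_1), \ldots, (x'_n, y'_n, z'_n)\}$, respectively. For each point $(x_i, y_i, z_i) \in X$ we can divide the space into eight regions $q_{i,1}, \ldots, q_{i,8}$, one for each combination of the conditions $x < x_i$ or $x \geq x_i$, $y < y_i$ or $y \geq y_i$, and $z < z_i$ or $z \geq z_i$, and similarly into eight regions $q'_{j,1}, \ldots, q'_{j,8}$ for each point $(x'_j, y'_j, z'_j) \in Y$. Further defining

$$D_m = \max_{i = 1, \ldots, m;\; s = 1, \ldots, 8} \Big|\, |X \cap q_{i,s}| - |Y \cap q_{i,s}| \,\Big| \qquad (20)$$

which is the maximum pairwise difference of point tallies for X and Y within each of the eight defined regions centred and evaluated at each point in X, and likewise,

$$D_n = \max_{j = 1, \ldots, n;\; s = 1, \ldots, 8} \Big|\, |X \cap q'_{j,s}| - |Y \cap q'_{j,s}| \,\Big| \qquad (21)$$

which is the maximum pairwise difference of point tallies for the eight defined regions centred and evaluated at each point in Y, the three dimensional KS statistic is

$$Z_{m,n,3D} = \sqrt{\frac{n\,m}{n + m}} \cdot \frac{D_m + D_n}{2}. \qquad (22)$$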
The three dimensional KS statistic can be specialised to the minutiae triplet space, where the spatial and directional detail of each minutia is represented as a three dimensional point, $(x, y, \theta)$. Given $m = n$ matching minutiae correspondences from two configurations X and Y, alignment is performed prior to calculating the statistic in order to ensure that minutiae correspondences are close together both spatially and directionally. However, direction has a circular nature that must be handled differently from the spatial detail. Instead of raw angular values, we use the orientation difference $z(\theta, \theta_0) \in [-\frac{\pi}{2}, \frac{\pi}{2}]$ between a minutia direction $\theta$ and a reference direction $\theta_0$. Each minutia, $(x, y, \theta)$, is then transformed to $(x, y, z(\theta, \theta_0))$ if the centred minutia used to create the eight regions has a direction of $\theta_0$, while region borders are defined in the third dimension by $z \geq 0$ and $z < 0$.
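The tally-based statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, region membership uses the `>=` / `<` boundary rule on every axis, and the final combination (square-root scaling of the averaged maxima) is assumed to follow the usual form of the statistic in [39]. Points are taken as plain `(x, y, z)` tuples, so any orientation-difference transform of the third coordinate is assumed to have been applied beforehand.

```python
import math

def octant_tally(points, centre):
    """Count the points falling in each of the eight regions around `centre`."""
    tally = [0] * 8
    cx, cy, cz = centre
    for x, y, z in points:
        # One bit per axis: 1 if the coordinate is >= the centre, else 0.
        tally[4 * (x >= cx) + 2 * (y >= cy) + (z >= cz)] += 1
    return tally

def max_tally_difference(centres, X, Y):
    """Largest |#X - #Y| over the eight regions around every centre point."""
    return max(
        abs(a - b)
        for c in centres
        for a, b in zip(octant_tally(X, c), octant_tally(Y, c))
    )

def ks_3d(X, Y):
    """Three dimensional two-sample KS statistic from point tallies (assumed form)."""
    m, n = len(X), len(Y)
    d_m = max_tally_difference(X, X, Y)  # regions centred on points of X
    d_n = max_tally_difference(Y, X, Y)  # regions centred on points of Y
    return math.sqrt(n * m / (n + m)) * (d_m + d_n) / 2.0

if __name__ == "__main__":
    X = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.1), (2.0, 0.5, -0.2)]
    Y = [(0.1, 0.0, 0.0), (1.0, 1.1, -0.1), (2.1, 0.4, -0.2)]
    print(ks_3d(X, Y))
```

Identical configurations give a statistic of zero, and the value grows as corresponding points drift into different regions.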

Thin Plate Spline and Derived Measures
The Thin Plate Spline (TPS) [40] is based on the algebraic expression of physical bending energy of an infinitely thin metal plate on point constraints after finding the optimal affine transformations for the accurate modelling of surfaces that undergo natural warping (i.e., where a diffeomorphism exists). Two sets of landmarks from each surface are paired in order to provide an interpolation map on R 2 → R 2 . TPS decomposes the interpolation into an affine transform that can be considered as the transformation that expresses the global geometric dependence of the point sets, and a non-affine transform that fine tunes the interpolation of the point sets. The inclusion of the affine transform component allows TPS to be invariant under both rotation and scale.
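Given $n$ control points $\{p_1 = (x_1, y_1), p_2 = (x_2, y_2), \ldots, p_n = (x_n, y_n)\}$ from an input image in $\mathbb{R}^2$ and control points $\{p'_1 = (x'_1, y'_1), p'_2 = (x'_2, y'_2), \ldots, p'_n = (x'_n, y'_n)\}$ from a target image in $\mathbb{R}^2$, the following matrices are defined in TPS:

$$K = \begin{pmatrix} 0 & u(r_{12}) & \cdots & u(r_{1n}) \\ u(r_{21}) & 0 & \cdots & u(r_{2n}) \\ \vdots & \vdots & \ddots & \vdots \\ u(r_{n1}) & u(r_{n2}) & \cdots & 0 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \end{pmatrix}, \quad V = \begin{pmatrix} x'_1 & x'_2 & \cdots & x'_n \\ y'_1 & y'_2 & \cdots & y'_n \end{pmatrix},$$

$$Y = \begin{pmatrix} V & 0_{2 \times 3} \end{pmatrix}^T, \quad L = \begin{pmatrix} K & P^T \\ P & 0_{3 \times 3} \end{pmatrix},$$

where $u(r) = r^2 \log r^2$ with $r$ as the Euclidean distance, $r_{ij} = \|p_i - p_j\|$, and where K, P, V, Y, L have dimensions $n \times n$, $3 \times n$, $2 \times n$, $(n + 3) \times 2$, and $(n + 3) \times (n + 3)$, respectively. The vector $W = (w_1, w_2, \ldots, w_n)$ and the coefficients $a_1, a_x, a_y$ can be calculated by the equation

$$\begin{pmatrix} W & a_1 & a_x & a_y \end{pmatrix}^T = L^{-1} Y.$$

The elements of $L^{-1} Y$ are used to define the TPS interpolation function $f(x, y) = (f_x(x, y), f_y(x, y))$, with the coordinates compiled from the first column of $L^{-1} Y$ giving

$$f_x(x, y) = a_{1,x} + a_{x,x}\,x + a_{y,x}\,y + \sum_{i=1}^{n} w_{i,x}\, u(\|p_i - (x, y)\|)$$

where $(a_{1,x}, a_{x,x}, a_{y,x})^T$ is the affine transform component for x, and likewise for the second column, where

$$f_y(x, y) = a_{1,y} + a_{x,y}\,x + a_{y,y}\,y + \sum_{i=1}^{n} w_{i,y}\, u(\|p_i - (x, y)\|)$$

with $(a_{1,y}, a_{x,y}, a_{y,y})^T$ as the affine component for y. Each point (or minutia location in our application) can now be updated as

$$(x_{new}, y_{new}) = (f_x(x, y), f_y(x, y)).$$

It can be shown that the function $f(x, y)$ is the interpolation that minimises

$$I_f = \iint_{\mathbb{R}^2} \left( \left(\frac{\partial^2 f}{\partial x^2}\right)^2 + 2\left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 + \left(\frac{\partial^2 f}{\partial y^2}\right)^2 \right) dx\,dy \;\propto\; W K W^T = V (L_n^{-1} K L_n^{-1}) V^T$$

where $I_f$ is the bending energy measure and $L_n^{-1}$ is the upper-left $n \times n$ sub-matrix of $L^{-1}$. Affine transform based metrics relating to shear, rotation, and scale (i.e., compression and dilation) can be calculated directly from the Singular Value Decomposition (SVD) of the affine matrix

$$U S V^T = \mathrm{SVD}\begin{pmatrix} a_{x,x} & a_{x,y} \\ a_{y,x} & a_{y,y} \end{pmatrix}.$$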
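A minimal NumPy sketch of the TPS fit may help make the linear system concrete. It is an illustration under stated assumptions rather than the authors' code: the function names (`fit_tps`, `tps_map`, `affine_singular_values`) are hypothetical, the matrix P is stored as n × 3 (the transpose of the 3 × n layout in the text), and the bending energy is reported as the trace of the weighted kernel form summed over the x and y maps, which is only proportional to $I_f$. The control points are assumed distinct and not all collinear, otherwise the system is singular.

```python
import numpy as np

def tps_u(r2):
    """Radial basis u(r) = r^2 log r^2, given squared distances r2 (u(0) = 0)."""
    out = np.zeros_like(r2, dtype=float)
    nz = r2 > 0
    out[nz] = r2[nz] * np.log(r2[nz])
    return out

def sq_dists(a, b):
    """Pairwise squared Euclidean distances between the rows of a and b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

def fit_tps(src, dst):
    """Solve L [W; A] = Y for the non-affine weights W and affine part A."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    K = tps_u(sq_dists(src, src))
    P = np.hstack([np.ones((n, 1)), src])      # n x 3: rows (1, x_i, y_i)
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    Y = np.vstack([dst, np.zeros((3, 2))])     # (n + 3) x 2
    coeffs = np.linalg.solve(L, Y)
    W, A = coeffs[:n], coeffs[n:]              # W: n x 2, A: 3 x 2
    bending = float(np.trace(W.T @ K @ W))     # proportional to I_f
    return W, A, bending

def tps_map(points, src, W, A):
    """Apply the fitted interpolation f(x, y) to arbitrary points."""
    points = np.asarray(points, dtype=float)
    src = np.asarray(src, dtype=float)
    return A[0] + points @ A[1:] + tps_u(sq_dists(points, src)) @ W

def affine_singular_values(A):
    """Singular values of the 2 x 2 affine matrix (principal scale factors)."""
    return np.linalg.svd(A[1:], compute_uv=False)
```

For a purely affine displacement of the control points the weights W vanish and the bending energy is zero, so any non-zero energy reflects non-affine warping only.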

Shape Size and Difference Measures
Shape size measures are useful metrics for comparing general shape characteristics. Given a matrix X of dimensions $k \times m$, representing a set of $k$ $m$-dimensional points, the centroid size [41] is defined as

$$S(X) = \sqrt{\sum_{i=1}^{k} \|(X)_i - \bar{X}\|^2} \qquad (35)$$

where $(X)_i$ is the $i$th row of X and $\bar{X}$ is the arithmetic mean of the points in X (i.e., centroid point). Given a second landmark configuration Y also with $k$ $m$-dimensional points, we define the shape size difference as

$$d_S = |S(X) - S(Y)|. \qquad (36)$$

Another useful shape metric is derived from the partial Procrustes method [41], which finds the optimal superimposition of one set of landmarks, X, onto another, Y, using translation and rotation affine operators:

$$\min_{\Gamma, \gamma} \|Y - X\Gamma - 1_k \gamma^T\|^2 \qquad (37)$$

where $1_k$ is a $(k \times 1)$ vector of ones, $\Gamma$ is an $m \times m$ rotation matrix and $\gamma$ is the $(m \times 1)$ translation offset vector. Using centred landmarks, $X_c = CX$ and $Y_c = CY$ where $C = I_k - \frac{1}{k} 1_k 1_k^T$, the ordinary partial Procrustes sum of squares is

$$OSS_p(X_c, Y_c) = \|X_c\|^2 + \|Y_c\|^2 - 2\,\|X_c\|\,\|Y_c\| \cos \rho(X_c, Y_c) \qquad (38)$$

with $\rho(X_c, Y_c)$ as the Procrustes distance defined as

$$\rho(X_c, Y_c) = \arccos\left(\sum_{i=1}^{m} \lambda_i\right)$$

where $\lambda_1, \ldots, \lambda_m$ are the square roots of the eigenvalues of $Z_X^T Z_Y Z_Y^T Z_X$ with $Z_X = HX / \|HX\|$ and $Z_Y = HY / \|HY\|$ for the Helmert sub-matrix, H, with dimension $(k - 1) \times k$.
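The shape measures above are straightforward to compute numerically. The following NumPy sketch is illustrative only: the function names are hypothetical, and the optimal rotation is obtained with the standard SVD solution of the orthogonal Procrustes problem (with a determinant correction so that reflections are excluded), which is one common way to realise the minimisation rather than necessarily the route taken by the authors.

```python
import numpy as np

def centroid_size(X):
    """Square root of the summed squared distances of landmarks to their centroid."""
    X = np.asarray(X, dtype=float)
    return float(np.sqrt(((X - X.mean(axis=0)) ** 2).sum()))

def size_difference(X, Y):
    """Absolute difference of the two centroid sizes."""
    return abs(centroid_size(X) - centroid_size(Y))

def partial_procrustes(X, Y):
    """Rotate and translate X onto Y (no scaling); return rotation and residual.

    Returns the m x m rotation matrix R and the ordinary partial Procrustes
    sum of squares ||Yc - Xc R||^2 computed on the centred landmarks.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)          # centring removes the translation
    Yc = Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))  # force a proper rotation
    R = U @ D @ Vt
    oss = float(((Yc - Xc @ R) ** 2).sum())
    return R, oss
```

Because scale is left untouched, comparing a configuration with a uniformly enlarged copy of itself yields a non-zero residual, which is the behaviour wanted when all impressions share the same image resolution.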

Close Non-Match Discovery and Alignment
In order to reproduce the process of an examiner querying a minutiae configuration marked on a fingermark with an AFIS, a method for finding close configurations was developed. To find close non-matches for a particular minutiae configuration, we employed a simple search algorithm based solely on minutiae triplet features, in order to maintain robustness towards such fingermark-to-exemplar match scenarios. The minutiae triplet features are extracted in a fully automated manner using the NIST mindtct tool [49] without particular attention to spurious results, besides minimum quality requirements as rated by the mindtct algorithm.
Algorithm 1 findCloseTripletConfigs: Find all close triplet configurations to X
Require: A minutiae triplet set X and a dataset of exemplars D.

Once feature extraction is complete, the close match search algorithm (Algorithm 1) finds all equally sized close minutiae configurations in a given dataset of exemplars to a specified minutiae set configuration (i.e., potentially marked from a latent) in an iterative manner by assessing all possible minutiae triplet pairs via a crude affine transform based alignment on configuration structures. Recorded close minutiae configurations are then re-aligned using the partial Procrustes method using the discovered minutiae pairings. Unlike the full Procrustes method, the partial Procrustes method does not alter the scale of either landmark set. For the application of fingerprint alignment, ignoring scale provides a more accurate comparison of landmarks since all minutiae structures are already normalised by the resolution and dimensions of the digital image. The TPS registration is then applied for a non-affine transformation. If the bending energy is higher than a defined threshold, we ignore the potential match due to the likely unnatural distortion encountered. Finally, a candidate list with all close minutiae configurations is produced for analysis.

Proposed model
We now propose an LR model based on what is found in [4], developed specifically to aid AFIS candidate list assessments, using the intrinsic differences of morphometric and spatial analyses (which we label as MSA) between match and close non-match comparisons, learnt from a two-class probabilistic machine learning framework.

Feature Vector Definition
Given two matching configurations X and Y (discovered from the procedure described in Section 3.1.4), a FV based on the previously discussed morphometric and spatial analyses is defined from the following measures: $Z_{m,n,3D}$, the three dimensional KS statistic of equation (22); the TPS bending energy and affine transform based measures of equations (29) and (32)(33)(34), resulting from registering X onto Y via TPS; $S(X)$ and $d_S$, the shape size and difference metrics of equations (35)(36); $OSS_p(X_c, Y_c)$, the ordinary partial Procrustes sum of squares of equation (38); and $d_{mc}$, the difference of the number of interior minutiae within the convex hulls of X and Y. The $d_{mc}$ measure is an optional component of the FV, dependent on the clarity of a fingermark's detail within the given minutiae configuration. For the experiments presented later in this chapter, we exclude this measure.
The compulsory measures used in the proposed feature vector rely solely on features that are robust to the adverse environmental conditions of latent marks, all of which are based on minutiae triplet detail. The FV structures are categorised by genuine/imposter (or match/close non-match) classes, number of minutiae in the matching configurations, and configuration area (categorised as small, medium, and large).

Machine Learning of Feature Vectors
Using the categories prescribed for the defined FVs, a probabilistic machine learning framework is applied for finding the probabilities for match and close non-match classes.
The probabilistic framework employed [42] is based on Support Vector Machines (SVMs) with unthresholded output, defined as

$$f(x) = h(x) + b$$

with

$$h(x) = \sum_i y_i \alpha_i k(x_i, x)$$

where $k(\cdot, \cdot)$ is the kernel function, and the target output $y_i \in \{-1, 1\}$ represents the two classes (i.e., 'close non-match' and 'match', respectively). We use the radial basis function

$$k(x_i, x) = \exp(-\gamma \|x_i - x\|^2) \qquad (43)$$

due to the observed non-linear relationships of the proposed FV. Training the SVM minimises the error function

$$C \sum_i (1 - y_i f(x_i))_+ + \frac{1}{2} \|h\|_F \qquad (44)$$

where C is the soft margin parameter (i.e., regularisation term which provides a way to control overfitting) and F is the Reproducing Kernel Hilbert Space (RKHS) induced by the kernel k. Thus, the norm of h is penalised in addition to the approximate training misclassification rate. By transforming the target values with $t_i = (y_i + 1)/2$, the posterior probabilities $P(y_i = 1 \mid f(x_i))$ and $P(y_i = -1 \mid f(x_i))$, which represent the probabilities that $x_i$ is of classes 'match' and 'close non-match', respectively, can now be estimated by fitting a sigmoid function after the SVM output with

$$P(x_i \text{ is a match} \mid f(x_i)) = P(y_i = 1 \mid f(x_i)) = \frac{1}{1 + \exp(A f(x_i) + B)} \qquad (47)$$

and

$$P(x_i \text{ is a close non-match} \mid f(x_i)) = P(y_i = -1 \mid f(x_i)) = 1 - \frac{1}{1 + \exp(A f(x_i) + B)}. \qquad (48)$$

The parameters A and B are found by minimising the negative log-likelihood of the training data:

$$\min_{A, B} \; -\sum_i \big[ t_i \log p_i + (1 - t_i) \log(1 - p_i) \big], \qquad p_i = \frac{1}{1 + \exp(A f(x_i) + B)},$$

using any optimisation algorithm, such as the Levenberg-Marquardt algorithm [43].
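The sigmoid fitting step can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the function names are hypothetical, the targets are assumed to be $t_i = (y_i + 1)/2$, and a damped Newton iteration with step halving is swapped in for the Levenberg-Marquardt algorithm mentioned in the text. The SVM scores are taken as given inputs. Perfectly separable scores have no finite optimum, so the example uses overlapping classes.

```python
import math

def p_match(f, A, B):
    """P(y = 1 | f) = 1 / (1 + exp(A f + B)), evaluated in a stable way."""
    s = A * f + B
    if s >= 0:
        e = math.exp(-s)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(s))

def nll(scores, targets, A, B):
    """Negative log-likelihood of the sigmoid model on the training scores."""
    total = 0.0
    for f, t in zip(scores, targets):
        s = A * f + B
        log1pexp = max(s, 0.0) + math.log1p(math.exp(-abs(s)))
        total += log1pexp - (1.0 - t) * s
    return total

def fit_sigmoid(scores, labels, iters=50):
    """Fit A and B by Newton steps with backtracking (labels in {-1, +1})."""
    targets = [(y + 1) / 2 for y in labels]
    A, B = 0.0, 0.0
    for _ in range(iters):
        gA = gB = hAA = hAB = hBB = 0.0
        for f, t in zip(scores, targets):
            p = p_match(f, A, B)
            gA += (t - p) * f          # gradient of the NLL w.r.t. A
            gB += (t - p)              # gradient of the NLL w.r.t. B
            w = p * (1.0 - p)
            hAA += w * f * f
            hAB += w * f
            hBB += w
        hAA += 1e-9                    # tiny ridge keeps the Hessian invertible
        hBB += 1e-9
        det = hAA * hBB - hAB * hAB
        dA = (hBB * gA - hAB * gB) / det
        dB = (hAA * gB - hAB * gA) / det
        step, current = 1.0, nll(scores, targets, A, B)
        while step > 1e-8 and nll(scores, targets, A - step * dA, B - step * dB) > current:
            step /= 2.0                # halve the step until the NLL does not increase
        A, B = A - step * dA, B - step * dB
    return A, B

if __name__ == "__main__":
    scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    labels = [-1, -1, 1, -1, 1, 1]
    A, B = fit_sigmoid(scores, labels)
    print(A, B, p_match(2.0, A, B))
```

A negative fitted A means that larger SVM outputs map to larger match probabilities, as expected when the 'match' class is labelled +1.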

Likelihood Ratio Calculation
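The probability distributions of equations (47)(48) are posterior probabilities. Nevertheless, for simplicity of the initial application, we assume a uniform distribution $P(f(x_i)) = z$ for some constant, $z$, with $P(x_i \text{ is a match}) = a$ and $P(x_i \text{ is a close non-match}) = 1 - a$, where $a$ reflects the proportion of close minutiae configuration comparisons that are ground truth matches. Thus, by Bayes' theorem, the LR is equivalent to the posterior ratio (PR) up to the prior factor $(1 - a)/a$:

$$LR = \frac{P(f(x_i) \mid x_i \text{ is a match})}{P(f(x_i) \mid x_i \text{ is a close non-match})} = \frac{1 - a}{a} \cdot \frac{P(x_i \text{ is a match} \mid f(x_i))}{P(x_i \text{ is a close non-match} \mid f(x_i))} = \frac{1 - a}{a} \cdot PR. \qquad (49)$$

For future consideration, the probabilities $P(x_i \text{ is a match})$ and $P(x_i \text{ is a close non-match})$ can be adaptively based on Cumulative Match Characteristic (CMC) curve [44] statistics of a given AFIS system or any other relevant background information.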
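As a worked illustration of the relationship between the posterior ratio and the LR, the short Python sketch below converts a posterior match probability into an LR by dividing out the prior odds a / (1 - a), which follows from Bayes' theorem. The function name and argument names are hypothetical; with balanced classes (a = 0.5) the LR equals the posterior ratio.

```python
import math

def likelihood_ratio(p_match, a=0.5):
    """LR from the posterior P(match | f) and the prior match proportion a.

    LR = [(1 - a) / a] * [p_match / (1 - p_match)], i.e. posterior odds
    divided by prior odds.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    if not 0.0 <= p_match <= 1.0:
        raise ValueError("p_match must be a probability")
    if p_match == 1.0:
        return math.inf
    posterior_ratio = p_match / (1.0 - p_match)
    return (1.0 - a) / a * posterior_ratio

if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8, 0.99):
        lr = likelihood_ratio(p)
        print(p, lr, math.log2(lr))
```

The same posterior probability carries more evidential weight when matches are rare in the candidate list (small a) than when they are common.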
As already noted, the LR formulas are based on different distributions specified per FV categories of minutiae count and the area of the given configuration. This allows the LR models to capture any spatial and morphometric relational differences between such defined categories. Unlike previous LR methods that are based on the distributions of a dissimilarity metric, the proposed method is based on class predictions based on a number of measures, some of which do not implicitly or explicitly rate or score a configuration's dissimilarity (e.g. centroid size, S(X i )). Instead, statistical relationships of the FV measures and classes are learnt by SVMs in a supervised manner, only for class predictions.
In its current proposed form, the LR of equation (49) is not an evidential weight for the entire population, but rather, an evidential weight specifically for a given candidate list.

Experimental Databases
Without access to large scale AFISs, few fingermark-to-exemplar datasets exist in the public domain (NIST27 is the only known dataset, with only 258 sets). Thus, to study the within-finger characteristics, a distortion set was built.
We follow a methodology similar to that of [35], where live scanned fingerprints have eleven directions of force applied: eight linear directions, two torsional directions, and a central application of force. Using a readily available live scan device (Suprema Inc. Realscan-D: 500ppi with rolls, single and dual finger flats), our acquisition protocol is as follows:
• sixteen different linear directions of force,
• four torsion directions of force,
• central direction of force,
• all directions described above have at least three levels of force applied,
• at least five rolled acquisitions are collected,
• finally, numerous impressions with emphasis on partiality and high distortion are obtained by recording fifteen frames per second, while each finger manoeuvres about the scan area in a freestyle manner for a minimum of sixty seconds.
This gave a minimum total of 968 impressions per finger. A total of 6,000 impressions from six different fingers (from five individuals) were obtained for our within-finger dataset, most of which are partial impressions from the freestyle methodology. For the between-finger comparisons, we use the within-finger set in addition to the public databases of NIST 14 [45]

SVM Training Procedure
A simple training/evaluation methodology was used in the experiments. After finding all FVs for similar configurations, a random selection of 50% of the FVs was used to train each respective SVM by the previously defined categories (i.e., minutiae configuration count and area). The remaining 50% of FVs were used to evaluate the LR model accuracy. The process was then repeated by swapping the training and test sets (i.e., two-fold cross-validation). Due to the large size of the within-finger database, a substantially larger number of within-finger candidates is returned. To balance the two classes, we randomly sampled the within-finger candidates to be of equal number to the between-finger counterparts (i.e., a = 0.5 in equation (49)). All individual features within each FV were scaled to have a range of [0, 1], using pre-defined maximum and minimum values specific to each feature component.
A naive approach was used to find the parameters for the SVMs. The radial basis kernel parameter, γ, and the soft margin parameter, C, of equations (43) and (44), respectively, were selected using a grid based search, using the cross-validation framework to measure the test accuracy for each parameter combination, (γ, C). The parameter combination with the highest test accuracy was selected for each constructed SVM.

Experimental Results
Experiments were conducted for minutiae configurations of sizes 6, 7, and 8 (Figure 6) from the within-finger dataset, using configurations marked manually by an iterative circular growth around a first minutia until the desired configuration sizes were met. For these configuration sizes, a total of 12144, 4500, and 1492 candidates were used, respectively, from both the within (50%) and between (50%) finger datasets. The focus on these configuration settings was due to three reasons: firstly, the high computational overhead involved in the candidate list retrieval for the prescribed datasets; secondly, configurations of such sizes perform poorly in modern day AFIS systems [50]; and finally, such configuration sizes are traditionally contentious in terms of Locard's tripartite rule, where a probabilistic approach is prescribed to be used.
The area sizes used for categorising the minutiae configurations were calculated by adding up the individual areas of triangular regions created using Delaunay triangulation. Small, medium, and large configuration area categories were defined as 0 < A < 4.2mm 2 , 4.2mm 2 ≤ A < 6.25mm 2 , and A ≥ 6.25mm 2 , respectively.
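The area categorisation can be sketched as follows. Since the triangles of a Delaunay triangulation tile the convex hull of the point set, the summed triangle areas equal the convex hull area, so this dependency-free sketch computes the hull area directly instead of building the triangulation. The function names are hypothetical, and the conversion assumes minutiae coordinates in pixels at 500 ppi (the resolution of the live scan device), which may not hold for other data.

```python
MM_PER_PIXEL = 25.4 / 500.0  # assumption: coordinates are pixels at 500 ppi

def convex_hull(points):
    """Convex hull (counter-clockwise) via Andrew's monotone chain."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    area = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

def configuration_area_mm2(minutiae_px):
    """Area of a minutiae configuration in mm^2 (equals the summed Delaunay areas)."""
    return polygon_area(convex_hull(minutiae_px)) * MM_PER_PIXEL ** 2

def area_category(area_mm2):
    """Small / medium / large bins using the thresholds given in the text."""
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    if area_mm2 < 4.2:
        return "small"
    if area_mm2 < 6.25:
        return "medium"
    return "large"
```

Interior minutiae do not change the result, which matches the behaviour of summing Delaunay triangle areas.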
The results clearly indicate a stronger dichotomy of match and close non-match populations when the number of minutiae was increased. In addition, the dichotomy was marginally stronger for larger configuration areas with six minutiae. Overall, the majority of FVs of class 'match' yield significantly large LR values.

Figure: Tippett plots for minutiae configurations of 6 (top row), 7 (middle row), and 8 (bottom row) minutiae with small, medium, and large area categories (left to right, respectively), calculated from the P(x_i is a match | f(x_i)) and P(x_i is a close non-match | f(x_i)) distributions. The x-axes represent the logarithm (base 2) of the LR values in equation (49) for match (blue line) and close non-match (red line) populations, while the y-axes represent the proportion of such values being greater than x. The green vertical dotted line at x = 0 marks LR = 1 (i.e., x = log_2 1 = 0).
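The curves in a Tippett plot are simple to compute from a list of LR values: for each abscissa x, the ordinate is the proportion of log2(LR) values greater than x. The sketch below is a minimal illustration with hypothetical function names and made-up LR values; it is not tied to the experimental data reported here.

```python
import math

def tippett_curve(lr_values, xs):
    """Proportion of log2(LR) values strictly greater than each x in xs."""
    logs = [math.log2(v) for v in lr_values]
    return [sum(1 for l in logs if l > x) / len(logs) for x in xs]

def misleading_rates(match_lrs, non_match_lrs):
    """Rates of misleading evidence at the LR = 1 marker (x = 0).

    Returns (proportion of matches with LR < 1,
             proportion of close non-matches with LR > 1).
    """
    low_match = sum(1 for v in match_lrs if v < 1.0) / len(match_lrs)
    high_non_match = sum(1 for v in non_match_lrs if v > 1.0) / len(non_match_lrs)
    return low_match, high_non_match

if __name__ == "__main__":
    xs = [-2, 0, 2, 4]
    match_lrs = [0.5, 2.0, 8.0, 64.0]       # illustrative values only
    non_match_lrs = [0.01, 0.25, 0.5, 2.0]  # illustrative values only
    print(tippett_curve(match_lrs, xs))
    print(tippett_curve(non_match_lrs, xs))
    print(misleading_rates(match_lrs, non_match_lrs))
```

A well-separated model shows the match curve staying high to the right of x = 0 while the close non-match curve drops quickly to the left of it.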

Summary
A new FV based LR model using morphometric and spatial analysis (MSA) with SVMs, while focusing on candidate list results of AFIS, has been proposed. This is the first LR model known to the authors that uses machine learning as a core component to learn spatial feature relationships of close non-match and match populations. For robust application to fingermark-to-exemplar comparisons, only minutiae triplet information was used to train the SVMs. Experimental results illustrate the effectiveness of the proposed method in distinguishing match and close non-match configurations.
The proposed model is a preliminary proposal and is not focused on evidential value for judicial purposes. However, minor modifications can potentially allow the model to also be used for evidential assessments. For future research, we hope to evaluate the model with commercial AFIS environments containing a large set of exemplars.