Abstract
Manifold learning theory has seen a surge of interest for modeling large and extensive datasets in medical imaging, since it captures the essence of the data in a way that fundamentally outperforms linear methodologies, which are essentially designed to describe flat structures. This problem is particularly relevant for medical imaging data, where linear techniques are frequently unsuitable for capturing variations in anatomical structures. In many cases, there is enough structure in the data (CT, MRI, ultrasound) that a lower-dimensional object, such as a manifold, can describe the degrees of freedom. Still, complex multivariate distributions tend to exhibit highly variable structural topologies that are impossible to capture with a single manifold learning algorithm. This chapter presents recent techniques developed in manifold theory for medical imaging analysis, enabling statistical organ shape modeling, image segmentation and registration based on the concept of navigating manifolds, classification, and disease prediction models based on discriminant manifolds. We present the theoretical basis of these works, with illustrative results on their applications to various organs and pathologies, including neurodegenerative diseases and spinal deformities.
Keywords
- manifold learning
- medical imaging
- discriminant manifolds
- piecewise geodesic regression
- spine deformities
- neurodegenerative diseases
- shape modeling
1. Introduction
Learning on large medical imaging datasets is an emerging discipline driven by the availability of vast amounts of raw data in many of today’s biomedical studies. However, challenges such as unbalanced data distributions, complex multivariate data and the highly variable structural topologies exhibited by real-world samples make it much more difficult to efficiently learn the associated representation. An important goal of scientific data analysis in medicine, particularly in neurosciences or oncology, is to understand the behavior of biological processes or physiological/morphological alterations. This introduces the need to synthesize large amounts of multivariate data in a robust manner and raises the fundamental question of data reduction: how to discover meaningful representations from unstructured high-dimensional medical images.
Several approaches have attempted to understand how dimension reduction and regression establish relationships in subspaces, and ultimately to determine statistics on manifolds that optimally describe the relationships between the samples [1]. However, most of these approaches assume that shapes and images can be represented using smooth manifolds, an assumption that is frequently inadequate for medical imaging data, which are often perturbed by nuisance articulations, clutter or varying contrast.
High-dimensional classification methods have shown promise for measuring subtle and spatially complex imaging patterns that have diagnostic value [2, 3]. However, defining statistics on a manifold is not straightforward, since simple statistics cannot be directly applied to general manifolds [4]. While Euclidean estimators have been used for vector spaces, none have been adapted for multimodal data lying in different spaces. Still, there has been interest in the characterization of data in a Riemannian space [5, 6]. Unfortunately, manifold-valued metrics based on centrality theory or the geometric median [7] often lack robustness to outliers.
A related topic is that of dimensionally reduced growth trajectories of various anatomical sites, which have been investigated, for example, in the neurodevelopment of newborns using geodesic shape regression to compute diffeomorphisms from image time series of a population [8]. These regression models were also used to estimate the spatiotemporal evolution of the cerebral cortex [9]. The concept of parallel transport curves in the tangent space of low-dimensional manifolds proposed by Schiratti et al. [10] was used to analyze shape morphology [11] and adapted for radiotherapy response [12]. Regression models were proposed for both cortical and subcortical structures, including the 4D varifold-based learning framework with local topography-based shape morphing proposed by Rekik et al. [13].
This chapter presents several manifold learning methodologies designed to address challenges encountered in medical imaging. In Section 2, we present an articulated shape inference model from nonlinear embeddings, introduced in [14], which expresses the global and local shape variations of the spine and of the vertebrae composing it. In Section 3, we present a probabilistic model based on discriminant manifolds to classify the neurodegenerative stage of Alzheimer’s disease. Finally, Section 4 describes a piecewise-geodesic transport curve in the tangent space of low-dimensional manifolds, designed for the prediction of correction in spinal surgeries and introducing a time-warping function that controls the rate of shape evolution. We conclude this chapter in Section 5.
2. Shape inference through navigating manifolds
Statistical models of shape variability have been successful in addressing fundamental vision tasks such as segmentation and registration in medical imaging. However, the high dimensionality and complex nonlinear underlying structure of anatomical data make the commonly used linear statistics inapplicable. Manifold learning approaches map high-dimensional observation data, which are presumed to lie on a nonlinear manifold, onto a single global coordinate system of lower dimensionality.
Inferring a model from the underlying manifold is not a novel concept, but it is far from trivial. In this section, we model both the global statistics of the articulated model and the local shape variations of vertebrae based on local measures in manifold space. We describe a spine inference/segmentation method for CT and MR images, where the model representation is optimized through a Markov Random Field (MRF) graph that balances the prior distribution with image data.
2.1. Data representation
Our spine model
2.2. Manifold embedding
For nonlinear embeddings, we rely on the absolute vector representation

Figure 1.
Representation of intervertebral transformations in manifold space.
The main limitation of embedding algorithms is the assumption of Euclidean metrics in the ambient space to evaluate similarity between sample points. Thus, a metric in the space of articulated structures is defined so that it accommodates anatomical spine variability and respects the intrinsic geometry of the Riemannian manifold, allowing us to discern between articulated shape deformations in a topologically invariant framework. For each point, the
While for the translation, the
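The exact expressions of the metric are not reproduced here. As an illustration only, a common choice for comparing rigid transformations is the geodesic angle between rotations combined with a Euclidean distance between translations. The sketch below assumes this choice; the function names and the `weight` parameter are hypothetical and are not taken from the method described above.

```python
import numpy as np

def rotation_geodesic_distance(R1, R2):
    """Angle (in radians) of the relative rotation R1^T R2 on SO(3)."""
    R = R1.T @ R2
    cos_theta = (np.trace(R) - 1.0) / 2.0
    # Clip to guard against round-off slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def articulation_distance(R1, t1, R2, t2, weight=1.0):
    """Hypothetical combined metric: rotation angle plus weighted
    Euclidean distance between translation vectors."""
    d_rot = rotation_geodesic_distance(R1, R2)
    d_trans = float(np.linalg.norm(np.asarray(t1) - np.asarray(t2)))
    return d_rot + weight * d_trans

def rot_z(angle):
    """Rotation about the z-axis, used here only to build test inputs."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

For two rotations about the same axis, the geodesic distance reduces to the difference of the rotation angles, which makes the function easy to sanity-check.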
Afterwards, the manifold reconstruction weights are estimated by assuming the local geometry of the patches can be described by linear coefficients that permit the reconstruction of every model point from its neighbors. In order to determine the value of the weights, the reconstruction errors are measured using the following objective function:
Thus,
The algorithm maps each high-dimensional
with
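The reconstruction-weight and embedding steps described above follow the general pattern of locally linear embedding [15]. The following is a minimal sketch of that pattern, assuming plain Euclidean distances for neighbor selection (the method above uses a dedicated metric for articulated structures) and a small regularization term `reg`, which is an assumption added for numerical stability.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Weights that best reconstruct each row of X from its k nearest
    neighbors, constrained to sum to one (Euclidean neighbors assumed)."""
    n = X.shape[0]
    W = np.zeros((n, n))
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    for i in range(n):
        # Skip index 0 of the sorted list, which is the point itself.
        nbrs = np.argsort(dists[i])[1:k + 1]
        Z = X[nbrs] - X[i]            # neighbors centered on the point
        C = Z @ Z.T                   # local Gram matrix
        C += (reg * np.trace(C) + 1e-12) * np.eye(k)
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()      # enforce the sum-to-one constraint
    return W

def lle_embedding(W, d):
    """Low-dimensional coordinates preserving the reconstruction weights:
    bottom eigenvectors of (I - W)^T (I - W), skipping the constant one."""
    n = W.shape[0]
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    return vecs[:, 1:d + 1]
```

The embedding coordinates returned this way are orthonormal columns; rescaling conventions vary between implementations.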
To obtain the articulation vector for a new embedded point in the ambient space (image domain), one has to determine the representation in high-dimensional space based on its intrinsic coordinates. We first assume an explicit mapping
which captures the overall trend of the data in
By assuming
which integrates the distance metric
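One standard way to map a new embedded point back to the high-dimensional space is kernel regression in the sense of Nadaraya [16]. The sketch below is an illustration under stated assumptions: it uses a Gaussian kernel on Euclidean distances in the embedding, whereas the mapping described above integrates the manifold distance metric. The `bandwidth` parameter and function name are hypothetical.

```python
import numpy as np

def nadaraya_watson(y_query, Y_train, X_train, bandwidth=1.0):
    """Estimate the high-dimensional vector for an embedded point y_query
    as a kernel-weighted average of training vectors X_train, weighted by
    proximity of their embedded coordinates Y_train (Gaussian kernel)."""
    d2 = np.sum((Y_train - y_query) ** 2, axis=1)
    # Subtracting the minimum keeps at least one weight equal to 1,
    # avoiding a zero denominator for very small bandwidths.
    k = np.exp(-0.5 * (d2 - d2.min()) / bandwidth ** 2)
    k /= k.sum()
    return k @ X_train
```

With a very small bandwidth the estimate collapses to the nearest training sample, while larger bandwidths average over wider neighborhoods of the manifold.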
2.3. Optimization on manifold
Once an appropriate modeling of spine shape variations is determined with a manifold, a successful inference between the image and manifold must be accomplished. We describe here how a new model is generated. We search the optimal embedded manifold point
The global alignment of the model with the target image primarily drives the deformation of the model. The purpose is to estimate the set of articulations describing the global spine model by determining its optimal representation
The inverse transform allows to obtain
where
The prior constraints for the rigid alignment are pairwise potentials between neighboring models
This term represents the smoothness term of the global cost function to ensure that the deformation
One can integrate the global data and prior terms along with local shape terms parameterized as the higher-order cliques, by combining (9), (11):
Optimizing the resulting MRF (12) in the continuous domain is not a straightforward problem. The convexity of the solution domain is not guaranteed, and gradient-descent approaches are sensitive to nonlinearity and prone to local minima. We seek to assign the optimal labels
We solve the minimization of the higher-order cliques in (13) by transforming them into quadratic functions [18]. We then apply the FastPD method [19], which solves the problem by exploiting duality theory in linear programming.
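To make the structure of a discrete MRF energy concrete, the sketch below minimizes a sum of unary (data) and pairwise (prior) costs over labels on a simple chain of nodes. It uses exact dynamic programming, not FastPD, and it does not handle higher-order cliques; it is only a toy stand-in, and all names are hypothetical.

```python
import numpy as np

def chain_mrf_map(unary, pairwise):
    """Exact minimum-energy labeling of a chain MRF.

    unary:    (n_nodes, n_labels) data cost of each label at each node.
    pairwise: (n_labels, n_labels) cost between labels of consecutive nodes.
    Returns the label sequence and its total energy.
    """
    n, _ = unary.shape
    cost = unary[0].astype(float).copy()
    back = np.zeros(unary.shape, dtype=int)
    for i in range(1, n):
        # total[a, b]: best energy so far with label a at node i-1
        # followed by label b at node i (before adding unary[i]).
        total = cost[:, None] + pairwise
        back[i] = np.argmin(total, axis=0)
        cost = total.min(axis=0) + unary[i]
    labels = np.zeros(n, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):     # backtrack the optimal path
        labels[i - 1] = back[i, labels[i]]
    return labels, float(cost.min())
```

Increasing the pairwise cost relative to the unary cost favors smoother labelings, which mirrors the role of the smoothness term in the global cost function.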
2.4. Results

Figure 2.
Low-dimensional manifold embedding of the spine dataset comprising 711 models exhibiting various types of deformities. The sub-domain was used to estimate both the global shape pose costs and individual shape instances based on local neighborhoods.
Adaptation of the articulated model was done on two different data sets. The first consisted of volumetric CT scans (
3. Probabilistic modeling of discriminant nonlinear manifolds in the identification of Alzheimer’s disease
Neurodegenerative pathologies, such as Alzheimer’s disease (AD), are linked with morphological and metabolic alterations which can be assessed from medical imaging and biological data. Recent advances in machine learning have helped to improve classification and prognosis rates, but lack a probabilistic framework to measure uncertainty in the data. In this section, we present a method to identify patients with progressive mild cognitive impairment (MCI) and predict their conversion to AD from MRI and positron emission tomography (PET) images. We describe a discriminative probabilistic manifold embedding in which locally linear mappings transform data points in the low-dimensional space to corresponding points in the high-dimensional space. A discriminant adjacency matrix is constructed to maximize the separation between different clinical groups, including MCI converters and nonconverters, while minimizing the distance between latent variables belonging to the same class.
3.1. Probabilistic model for discriminant manifolds
Manifold learning algorithms are based on the premise that data are often of artificially high dimension and can be embedded in a lower-dimensional space. However, the presence of outliers and multiclass information can affect the discrimination and/or generalization ability of the manifold. We propose to learn the optimal separation between four classes: (1) normal controls, (2) nonconverter MCI patients, (3) converter MCI patients and (4) AD patients, by using a discriminant graph-embedding. Here,
In order to effectively discover the low-dimensional embedding, it is necessary to maintain the local structure of the data in the new embedding. The graph
Using the theoretical framework from [20], we can determine a distribution of linear maps associated with the low-dimensional representation to describe the data likelihood for a specific model:
This joint distribution can be separated into three prior terms: the linear maps, latent variables and the likelihood of the high dimensional points
We now define the discriminant similarity graphs establishing neighborhood relationships, as well as each of the three prior terms included in the joint distribution.
with
with
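The idea of a pair of discriminant similarity graphs can be sketched as follows: a within-class graph links each sample to its nearest neighbors sharing the same label, and a between-class graph links it to its nearest neighbors from other classes. This is a minimal illustration assuming binary edge weights and Euclidean distances; the neighborhood sizes `k_within` and `k_between` are hypothetical parameters, and the graphs used in the method above may be weighted differently.

```python
import numpy as np

def discriminant_graphs(X, labels, k_within, k_between):
    """Build symmetric within-class and between-class k-NN adjacency
    matrices from feature rows X and integer class labels."""
    labels = np.asarray(labels)
    n = X.shape[0]
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    W_within = np.zeros((n, n))
    W_between = np.zeros((n, n))
    idx = np.arange(n)
    for i in range(n):
        same = np.flatnonzero((labels == labels[i]) & (idx != i))
        other = np.flatnonzero(labels != labels[i])
        # Nearest same-class neighbors: pulled together in the embedding.
        for j in same[np.argsort(dists[i, same])[:k_within]]:
            W_within[i, j] = W_within[j, i] = 1.0
        # Nearest other-class neighbors: pushed apart in the embedding.
        for j in other[np.argsort(dists[i, other])[:k_between]]:
            W_between[i, j] = W_between[j, i] = 1.0
    return W_within, W_between
```

An embedding that keeps within-class edges short while stretching between-class edges is what yields the separation between clinical groups described above.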
The prior added to the linear maps defines how the tangent planes described in low and high dimensional spaces are similar based on the Frobenius norm. This prior ensures smooth manifolds:
Finally, approximation errors from the linear mapping
with
3.2. Variational inference
The objective is to infer the low-dimensional coordinates and linear mapping function for the described model, as well as the intrinsic parameters of the model
By assuming the posterior
The discriminant latent variable model can then be used to perform the mapping of new image feature vectors to the manifold. The variational EM algorithm described in the previous section can be used to transform a set of new input points
3.3. Experiments
We used the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database with 1.5 or 3.0 T structural MR images (adni.loni.usc.edu) and FDG-PET images. For this study, 187 subjects with both MRI and PET images during a 24 month period were used to train the probabilistic manifold model, including 46 AD patients, 94 MCI patients, and 47 normal controls. During the follow-up period, 43 MCI subjects converted to AD and 56 remained stable. All groups are matched approximately by age (mean of
A 9-fold cross-validation was performed to assess the performance of the method. The optimal manifold dimensionality was set at

Figure 3.
Selected FSL segmented brain regions for feature selection on (left) MRI and (right) PET images.

Figure 4.
ROC curves comparing the SVM, LLE and LL-LVM with the proposed method for cMCI/nMCI prediction using MRI, PET and multimodality data.
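For readers who want to reproduce this kind of comparison on their own data, the area under a ROC curve can be computed directly from classifier scores. The sketch below uses the rank-based (Mann-Whitney) formulation and is independent of any particular classifier; it is an illustration, not the evaluation code used for the results above.

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve: probability that a randomly chosen
    positive sample scores higher than a randomly chosen negative one,
    counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))
```

In a cross-validation setting, one would typically compute this value on the held-out fold of each split and report the average.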
4. Spatiotemporal manifold prediction model for surgery prediction
In this final section, we present a statistical framework for predicting the surgical outcomes following spine surgery of adolescents with idiopathic scoliosis. A discriminant manifold is first constructed to maximize the separation between responsive and nonresponsive groups of patients. The model then uses subject-specific correction trajectories based on articulated transformations in order to map spine correction profiles to a group-average piecewise-geodesic path. Spine correction trajectories are described in a piecewise-geodesic fashion to account for varying times at follow-up exams, regressing the curve via a quadratic optimization process. To predict the evolution of correction, a baseline reconstruction is projected onto the manifold, from which a spatiotemporal regression model is built from parallel transport curves inferred from neighboring exemplars (Figure 5).

Figure 5.
Proposed prediction framework for spine surgery outcomes. In the training phase, a dataset of spine models is embedded in a spatiotemporal manifold
4.1. Discriminant embedding of spine models
We propose to embed a collection of (1) nonresponsive (NR) and (2) responsive (R) patients to surgery in a manifold that offers maximal separation between the classes, by using a discriminant graph-embedding. Here,
Because the discriminant manifold structure in
4.2. Piecewise-geodesic spatiotemporal manifold
Once sample points
However, because finding the continuous curve is a variational problem in an infinite-dimensional space, the implementation relies on a discretization process derived from the procedure in [22], such that:
This minimization process simplifies the problem to a quadratic optimization, solved with LU decomposition. The piecewise nature is represented by the term
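The reduction of discretized curve regression to a quadratic problem can be illustrated in a flat (Euclidean) setting, which is an assumption made here for simplicity; the method above operates on the manifold. The sketch fits a discrete curve to samples attached to some of its nodes, penalizing first and second finite differences, and solves the normal equations with `numpy.linalg.solve`, which relies on an LU factorization. The weights `lam` and `mu` and the function name are hypothetical.

```python
import numpy as np

def fit_discrete_curve(node_idx, X, n_nodes, lam=0.0, mu=1.0):
    """Fit a discrete curve of n_nodes points in R^d to samples X, where
    sample i is attached to node node_idx[i].

    Minimizes: sum_i |g[node_idx[i]] - X[i]|^2
               + lam * sum_k |g[k+1] - g[k]|^2            (velocity)
               + mu  * sum_k |g[k+1] - 2 g[k] + g[k-1]|^2 (acceleration)
    """
    node_idx = np.asarray(node_idx)
    X = np.asarray(X, dtype=float)
    S = np.zeros((len(node_idx), n_nodes))   # sample-to-node selection
    S[np.arange(len(node_idx)), node_idx] = 1.0
    D1 = np.diff(np.eye(n_nodes), n=1, axis=0)  # first differences
    D2 = np.diff(np.eye(n_nodes), n=2, axis=0)  # second differences
    A = S.T @ S + lam * D1.T @ D1 + mu * D2.T @ D2
    b = S.T @ X
    # Quadratic objective -> linear system, solved via LU factorization.
    return np.linalg.solve(A, b)
```

With only the acceleration penalty active, samples lying on a straight line are interpolated exactly, since a straight line has zero second differences.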
4.3. Prediction of spine correction
Finally, to predict the evolution of spine correction from an unseen preoperative spine model, we use the geodesic curve
Based on Riemannian theory, an exponential mapping function at
Hence, given the manifold at time
Therefore by repeating this mapping for manifold points seen as samples of individual progression trajectories along
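The exponential map and parallel transport used above have closed forms on simple manifolds. As an illustration, the sketch below implements them on the unit sphere, which is an assumed toy manifold and not the spine manifold of this section. A tangent vector at one point is carried along the geodesic to another point, which is the basic operation behind transporting a trajectory to a new subject.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map: follow the geodesic from p in tangent direction v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return p.copy()
    return np.cos(norm) * p + np.sin(norm) * v / norm

def sphere_log(p, q):
    """Log map: tangent vector at p pointing to q, with geodesic length."""
    cos_t = np.clip(p @ q, -1.0, 1.0)
    u = q - cos_t * p                 # component of q orthogonal to p
    nu = np.linalg.norm(u)
    if nu < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_t) * u / nu

def parallel_transport(p, q, w):
    """Transport tangent vector w at p to q along the minimizing geodesic."""
    v = sphere_log(p, q)
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return w.copy()
    u = v / theta
    a = w @ u
    # The component along the geodesic direction rotates in span{p, u};
    # the component orthogonal to it is left unchanged.
    return w + a * ((np.cos(theta) - 1.0) * u - np.sin(theta) * p)
```

Parallel transport preserves the length of the tangent vector and keeps it tangent to the sphere at the destination point, two properties that are easy to verify numerically.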
A time warp function allowing
For spine correction evolution, displacement vectors
which yields a predicted postoperative model
4.4. Experiments
The discriminant manifold was trained from a database of
Method | FE visit: 3D RMS (mm) | FE visit: Dice (%) | FE visit: Cobb (°) | 1-year visit: 3D RMS (mm) | 1-year visit: Dice (%) | 1-year visit: Cobb (°) | 2-year visit: 3D RMS (mm) | 2-year visit: Dice (%) | 2-year visit: Cobb (°)
---|---|---|---|---|---|---|---|---|---
Biomec. sim | 3.3 | 85 | 2.8 | 3.6 | 84 | 3.2 | 4.1 | 82 | 3.6
LL-LVM [20] | 3.6 | 83 | 3.8 | 4.7 | 79 | 5.5 | 6.6 | 71 | 7.0
Deep AE [24] | 4.1 | 80 | 5.1 | 5.0 | 77 | 5.8 | 6.3 | 72 | 6.6
Proposed | 2.4 | 92 | 1.8 | 2.9 | 90 | 2.0 | 3.2 | 87 | 2.1
Table 1.
3D RMS errors (mm), Dice (%) and Cobb angles (°). Predictions are evaluated at the FE, 1-year and 2-year visits.
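Two of the reported measures have simple standard definitions, sketched below as an illustration. The code assumes that the RMS error is taken over Euclidean distances between corresponding 3D landmarks and that Dice is computed on binary masks; the exact landmark sets and masks used in the experiments are not specified here, and the function names are hypothetical.

```python
import numpy as np

def rms_error_3d(pred, ref):
    """Root-mean-square Euclidean distance between corresponding
    3D landmarks, given as (n_points, 3) arrays in the same units."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def dice_percent(mask_a, mask_b):
    """Dice overlap between two binary masks, expressed in percent."""
    a = np.asarray(mask_a).astype(bool)
    b = np.asarray(mask_b).astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 100.0                  # two empty masks overlap trivially
    return float(200.0 * np.logical_and(a, b).sum() / denom)
```

Lower RMS values and higher Dice values both indicate closer agreement between the predicted and the actual postoperative models.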
5. Discussion
Algorithms capable of extracting clinically relevant and meaningful descriptions from medical imaging datasets have become of widespread interest to theoreticians as well as practitioners in the medical field. In recent years, this interest has accelerated contributions from fields as varied as machine learning, geometry, statistics and genomics, which propose new insights for the analysis of imaging and biological datasets. Towards this end, manifold learning has demonstrated tremendous potential to learn the underlying representation of high-dimensional, complex imaging datasets.
We presented frameworks describing longitudinal, multimodal image features from neuroimaging data using a Bayesian model for discriminant nonlinear manifolds to predict the conversion of progressive MCI to Alzheimer’s disease. This probabilistic method introduces class-dependent latent variables, based on the concept that local structure is transformed from the manifold to the high-dimensional domain. This variational learning method can ultimately assess uncertainty within the manifold domain, which can lead to a better understanding of the relationships between converters and nonconverters among patients with MCI.
Finally, a prediction method for the outcomes of spine surgery using geodesic parallel transport curves generated from probabilistic manifold models was presented. The mathematical models make it possible to describe patterns in a nonlinear and discriminant Riemannian framework by first distinguishing nonprogressive and progressive cases, followed by a prediction of structural evolution. The proposed model provides a way to analyze longitudinal samples from a geodesic curve in manifold space, thus simplifying the mixed effects when studying group-average trajectories.
References
1. Yang Y, Dunson DB, et al. Bayesian manifold regression. The Annals of Statistics. 2016;44(2):876-905
2. Davatzikos C, Resnick SM, Wu X, Parmpi P, Clark CM. Individual patient diagnosis of AD and FTD via high-dimensional pattern classification of MRI. NeuroImage. 2008;41(4):1220-1227
3. Li S, Shi F, Pu F, Li X, Jiang T, Xie S, Wang Y. Hippocampal shape analysis of Alzheimer disease based on machine learning methods. American Journal of Neuroradiology. 2007;28(7):1339-1345
4. Beg MF, Miller MI, Trouvé A, Younes L. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International Journal of Computer Vision. 2005;61(2):139-157
5. Fletcher PT, Lu C, Joshi S. Statistics of shape via principal geodesic analysis on Lie groups. In: Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on. Vol. 1. IEEE; 2003. pp. I-I
6. Pennec X. Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements. Journal of Mathematical Imaging and Vision. 2006;25(1):127
7. Fletcher PT, Venkatasubramanian S, Joshi S. The geometric median on Riemannian manifolds with application to robust atlas estimation. NeuroImage. 2009;45(1):S143-S152
8. Singh N, Hinkle J, Joshi S, Fletcher PT. A hierarchical geodesic model for diffeomorphic longitudinal shape analysis. In: International Conference on Information Processing in Medical Imaging. Springer; 2013. pp. 560-571
9. Fishbaugh J, Prastawa M, Gerig G, Durrleman S. Geodesic regression of image and shape data for improved modeling of 4D trajectories. In: 2014 International Symposium on Biomedical Imaging. IEEE; 2014. pp. 385-388
10. Schiratti JB, Allassonniere S, Colliot O, Durrleman S. Learning spatiotemporal trajectories from manifold-valued longitudinal data. In: Advances in Neural Information Processing Systems. 2015:2404-2412
11. Kadoury S, Mandel W, Roy-Beaudry, Nault ML, Parent S. 3-D morphology prediction of progressive spinal deformities from probabilistic modeling of discriminant manifolds. IEEE Transactions on Medical Imaging. 2017;36(5):1194-1204
12. Chevallier J, Oudard S, Allassonnière S. Learning spatiotemporal piecewise-geodesic trajectories from longitudinal manifold-valued data. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA; 2017
13. Rekik I, Li G, Lin W, Shen D. Predicting infant cortical surface development using a 4D varifold-based learning framework and local topography-based shape morphing. Medical Image Analysis. 2016;28:1-12
14. Kadoury S, Labelle H, Paragios N. Spine segmentation in medical images using manifold embeddings and higher-order MRFs. IEEE Transactions on Medical Imaging. 2013;32:1227-1238
15. Roweis ST, Saul LK. Nonlinear dimensionality reduction by locally linear embedding. Science. 2000;290:2323-2326
16. Nadaraya EA. On estimating regression. Theory of Probability and its Applications. 1964;10:186-190
17. Davis B, Fletcher P, Bullitt E, Joshi S. Population shape regression from random design data. In: Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007. IEEE. 2007;1:1-8
18. Rother C, Kohli P, Feng W, Jia J. Minimizing sparse higher order energy functions of discrete variables. In: Conference on Computer Vision and Pattern Recognition; 2009. pp. 1382-1389
19. Komodakis N, Tziritas G, Paragios N. Performance vs computational efficiency for optimizing single and dynamic MRFs: Setting the state of the art with primal dual strategies. Computer Vision and Image Understanding. 2008;112(1):14-29
20. Park M, Jitkrittum W, Qamar A, Szabó Z, Buesing L, Sahani M. Bayesian manifold learning: The locally linear latent variable model (LL-LVM). In: Advances in Neural Information Processing Systems. 2015:154-162
21. Patenaude B, Smith SM, Kennedy DN, Jenkinson M. A Bayesian model of shape and appearance for subcortical brain segmentation. NeuroImage. 2011;56(3):907-922
22. Boumal N, Absil PA. A discrete regression method on manifolds and its application to data on SO(n). IFAC Proceedings Volumes. 2011;44(1):2284-2289
23. Humbert L, de Guise J, Aubert B, Godbout B, Skalli W. 3D reconstruction of the spine from biplanar X-rays using parametric models based on transversal and longitudinal inferences. Medical Engineering & Physics. 2009;31(6):681-687
24. Thong W, Parent S, Wu J, Aubin CE, Labelle H, Kadoury S. Three-dimensional morphology study of surgical adolescent idiopathic scoliosis patient from encoded geometric models. European Spine Journal. 2016;25(10):3104-3113
25. Cobetto N, Parent S, Aubin CE. 3D correction over 2 years with anterior vertebral body growth modulation: A finite element analysis of screw positioning, cable tensioning and postop functional activities. Clinical Biomechanics. 2018;51:26-33