This is an extraordinary time in cell biology with evolving data pushing a reconsideration of the stability of cell systems and the regulatory mechanisms underlying cell phenotypes, especially the functional cell phenotypes. In this chapter, we will explore new insights into stem cell and extracellular vesicle biology with a focus on the role of extracellular vesicles in normal stem cell physiology as well as in various disease states. Extracellular vesicles (EVs) are being recognized as influential mediators of cellular function and potential experimental therapeutic strategies for a number of disorders outlined in this review. An evolving paradigm indicates a dynamic flux of EV populations within these disease states. We conclude our discussion of EV by extending our knowledge of robust EV biology toward disease detection and prognostication. Characterizing the biophysical and functional changes of vesicles amid disease progression or regression enables investigators to merge this information flux with existing deep learning computational and statistical techniques—allowing knowledge to be abstracted from large data sets profiling the biology of EVs within various disease states. Understanding how EV population shifts represent disease regression or progression creates paramount potential for EVs as salient and clinically relevant diagnostic and prognosticating tools.
- stem cell continuum
- extracellular vesicles
- deep learning
In many traditional stem/progenitor models for different tissues the generally accepted models have posited a primitive stem cell giving rise to more differentiated progenitors and finally terminally differentiated end cells, which may or may not retain the capacity for cell division. Perhaps the most intensively studied stem cell system has been that of the hematopoietic stem cell [1, 2, 3, 4, 5, 6]. In general, current dogma has it that the long-term repopulating hematopoietic stem cell is a dormant non-cycling cell characterized by a surface phenotype that is negative for conventional differentiation markers (B220, Gr-1, Mac-1, Lyt-2, L3T4 and Ter119) and positive for c-kit, Sca-1 and CD150. It is felt that this cell can be purified by FACS and that in response to various differentiating stimuli it progressively differentiates into different lineage restricted populations, which in general are actively cycling. A large number of studies have extensively characterized its molecular regulation and biologic characteristics [6, 7, 8, 9, 10, 11]. We began studying both purified stem cells and unseparated whole marrow stimulated to progress through cell cycle with cytokine exposure and demonstrated that there were cycle related and reversible changes over time in long and short term engraftment, progenitor levels, differentiation into megakaryocytes and granulocytes, homing to marrow, capacity to alter phenotype toward lung cells in response to pulmonary derived extracellular vesicles (EVs), overall gene expression, capacity to take up vesicles and circadian characteristics [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. Passegue and colleagues  studied lineage negative Sca-1+ c-kit+ and thy+/− stem cells further separated into G0, G1 and S/G2/M fractions as to long-term engraftment into lethally irradiated mice. They found that all engraftment capacity was in the G0 population. This indicated that our observations might be in vitro artifacts. 
However, we noted that no one had adequately studied unseparated marrow as to the cycle status of long-term repopulating stem cells. We essentially reproduced the Passegue data studying purified stem cells, but when we studied unseparated whole marrow we found that over 50% of the long-term engraftable cells were in the S/G2/M fractions. To address the potential problem of cellular cross contamination in the FACS experiments, we utilized a thymidine suicide technique in which cycling cells are selectively killed by a 30-minute exposure to high specific activity tritiated thymidine. These studies showed that between 70 and 100% of the long-term engrafting stem cells were in S phase at the time of the incubation. Further work using in vivo BrdU indicated that the purified stem cells considered dormant (Lin-c-kit+Sca-1+) rapidly progressed through the cell cycle, such that up to 85% of them showed the BrdU label by 48 hours of in vivo BrdU exposure. These data showed that a large number of HSC in the mouse are actively proliferating and thus continually changing phenotype. When lineage positive and lineage negative marrow cells were assessed for engraftment and cycle status, it was found that a large number of marrow stem cells were in both fractions and were cycling. In further work, we have shown that different lineage positive cell populations are rich in cycling stem cells but, intriguingly, that when double sorted the lineage positive cells no longer showed HSC characteristics, while a separate population negative for the particular marker was enriched in cycling HSC. These data have led to our current hypothesis that hematopoietic stem cells exist on a cycle related continuum and that these cells, while maintaining critical stem cell markers, show cycle related fluctuation in differentiation markers.
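The thymidine suicide readout described above can be illustrated with a minimal sketch. It assumes that the fraction of engrafting cells in S phase is estimated as the proportional loss of engraftment after tritiated thymidine exposure relative to an untreated control, and that engraftment scales roughly linearly with the number of surviving stem cells. The function name and the numbers are hypothetical and are not data from the studies cited here.

```python
def s_phase_fraction(engraftment_control, engraftment_after_thymidine):
    """Estimate the fraction of engrafting cells killed in S phase.

    Assumes engraftment is roughly proportional to the number of
    surviving long-term repopulating cells (a simplifying assumption).
    """
    if engraftment_control <= 0:
        raise ValueError("control engraftment must be positive")
    killed = 1.0 - engraftment_after_thymidine / engraftment_control
    # Clamp to [0, 1] so assay noise cannot produce an impossible fraction
    return min(1.0, max(0.0, killed))


# Hypothetical donor chimerism values (percent), for illustration only
example = s_phase_fraction(engraftment_control=40.0,
                           engraftment_after_thymidine=8.0)
print(f"estimated fraction of engrafting cells in S phase: {example:.0%}")
```

In practice the comparison would be made across replicate recipients, so the single ratio shown here stands in for a group-level estimate.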
We feel that we may be defining the calculus of hematopoietic stem cells with time and cycle related phenotype changes being the derivatives and with the final overall picture the integral.
Whether this model of small incremental cellular changes over time applies to other stem cell systems will clearly be the object of much future work.
The stability of cell types and systems is also open to question. Early in the study of hematopoietic stem cells Till et al. showed that the first stem cell assay, the colony forming unit spleen (CFU-s), showed marked heterogeneity but that the assay was generally reliable. They compared the CFU-s system to radioactive isotopes: the individual decay events were totally heterogeneous, but the overall half-lives were reproducible and quite exact. These data indicate the potential importance of evaluating the total population of stem cells alongside the purified variety. The variable and shifting phenotypes of the stem cell with cell cycle transit have to be considered in the context of extracellular vesicle modulation of cell phenotype.
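The isotope analogy can be made concrete with a short simulation. The sketch below is purely illustrative and is not a model of the CFU-s assay: it assumes that each atom decays at an exponentially distributed random time, and the half-life and population size are arbitrary values chosen for the example. Individual decay times vary widely, yet the half-life estimated from each large population is nearly the same.

```python
import math
import random
import statistics


def simulate_decay_times(n_atoms, half_life, rng):
    """Draw one random decay time per atom (exponential distribution)."""
    rate = math.log(2) / half_life
    return [rng.expovariate(rate) for _ in range(n_atoms)]


def estimate_half_life(decay_times):
    """The median decay time is the time by which half the atoms have decayed."""
    return statistics.median(decay_times)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_HALF_LIFE = 10.0    # arbitrary time units (illustrative value)

# Three independent "experiments", each on a large population
estimates = [
    estimate_half_life(simulate_decay_times(20_000, TRUE_HALF_LIFE, rng))
    for _ in range(3)
]

# Heterogeneity of individual events within a single population
one_population = simulate_decay_times(20_000, TRUE_HALF_LIFE, rng)
individual_mean = statistics.mean(one_population)
individual_sd = statistics.pstdev(one_population)

print("population half-life estimates:", [round(e, 2) for e in estimates])
print("individual decay times: mean", round(individual_mean, 2),
      "standard deviation", round(individual_sd, 2))
```

The standard deviation of the individual decay times is comparable to their mean, while the population-level estimates agree closely with one another, which is the sense in which a heterogeneous system can still give a reliable assay.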
1.1 Extracellular vesicles
Tiny lipid membrane enclosed particles are released from essentially all cell types in the mammalian body [26, 27]. These entities were first found to come from red blood cells and platelets and were felt to be essentially cellular waste products [28, 29]. Subsequent work has characterized and subdivided these entities by size, density and morphology. Eventually two basic types of vesicles were defined by differential ultracentrifugation: exosomes and microvesicles. Exosomes, derived from multivesicular bodies, were from 30 to 100 nm in diameter, while microvesicles, derived from membrane blebbing, were from 100 to 1000 nm in diameter. Other vesicular entities were also defined, including apoptotic bodies. In general, there was much overlap between exosomes and microvesicles, and a meeting of investigators decided it might be best to simply term these extracellular vesicles (EVs) and then define their source and the conditions under which they were isolated. Vesicles were eventually isolated from virtually all bodily fluids and cells [26, 27]. Recent focus has been on the capacity of EVs to restore injured tissue and treat disease. Initial work showed that vesicles could transfer protein and RNA while modifying the phenotype of cells and reversing disease in animal models.
Ratajczak et al. showed that embryonic stem cell derived microvesicles could reprogram hematopoietic progenitors by horizontal transfer of mRNA and protein delivery . This was followed by work by Aliotta et al. [31, 32, 33] and Valadi et al.  showing RNA transfer and phenotype change in different experimental models. Further work has indicated that cellular phenotype change may be mediated by transfer of transcriptional activators possibly miRNA . Vesicles from different sources are different but do contain some features of their originating cells. Vesicles contain protein, mRNA, miRNA, lipids and variably DNA, thus they represent complex bio machines with a tremendous range of potential phenotype altering messages.
1.2 Hematopoietic stem/progenitor cells
Vesicles have been found to have a variety of effects on both normal and diseased or injured tissues. These effects may be negative or positive depending upon the specifics of the experimental models under consideration. In many instances there appears to be a yin/yang nature to vesicle effects. Studies with murine marrow cells have illustrated the complexity of vesicle marrow cell interactions [26, 27]. Early studies showed that ES derived vesicles could improve proliferative status of lin-Sca-1+ marrow stem cells  and work in our laboratory has shown that lung derived EVs could induce expression of surfactant A, B, C and D, Clara cell protein and aquaporin in normal murine marrow cells [31, 32, 33]. Studies indicated that for a genomic change to occur the vesicle had to enter the target marrow cells  and that initially both mRNA and a transcriptional regulator were transferred to target cells but that the transferred mRNA was degraded and long-term expression of surfactants B and C (those tested in these studies) derived from the target cells and represented a stable epigenetic event . The functional effects of vesicles on marrow mRNA expression depended upon the cell cycle status of the target marrow cells and the condition of the originator lung cells, in this case either irradiated or not . The results showed that Lin-Sca-1 murine marrow cells showed peak pulmonary epithelial cell-specific mRNA expression in cell cycle phase G0/G1 when the vesicles were derived from irradiated lung tissue while the peak was in late G1/early S phase when the vesicles were derived from nonirradiated lung. Vesicles were present in all types of differentiated marrow cells. Vesicles demonstrated a capacity to reverse radiation damage to marrow and gastrointestinal tissues of mice with the most impressive effect being on long-term engrafting stem cells . The vesicles were shown to increase proliferation, decrease apoptosis and reverse double-strand DNA breaks.
2. EVs and implications on selected disease states
In the following section, we discuss the role of EVs in various disease states, as well as their role in disease detection, progression, and treatment.
2.1 Pulmonary hypertension
The yin/yang vesicle effect is clearly illustrated by studies on murine models of pulmonary hypertension.
There are two major models of murine pulmonary hypertension: the monocrotaline treated mouse and the Sugen/hypoxia treated mouse. These may represent different forms of pulmonary hypertension, but results with different vesicle populations have been similar in both models. Work with the monocrotaline murine model has shown that vesicles in the serum or from the lungs of mice with monocrotaline induced pulmonary hypertension (PH) will induce pulmonary hypertension when injected into normal mice. Further work has indicated that marrow from these mice with PH will induce PH in normal irradiated mice. It appears that vesicles from damaged lung tissue, probably damaged endothelium, travel to marrow and induce an endothelial to hematopoietic transition (EHT) with production of “toxic” endothelial progenitors, which travel back to the lung, differentiate into pro-inflammatory macrophages and induce vascular remodeling resulting in pulmonary hypertension. Marrow derived mesenchymal stem cell (MSC) derived extracellular vesicles were shown to either prevent or reverse pulmonary hypertension in both rodent models; this is the first potential therapy. As the endothelial progenitors are quite radiosensitive, low dose irradiation was tested as a potential therapy for pulmonary hypertension. One hundred cGy whole body irradiation both prevents and reverses pulmonary hypertension in these models. This is the second potential therapy. The EHT is regulated to a large extent by the transcription factor Runx-1. A Runx-1 inhibitor, Ro5-3335, has been investigated in leukemia. Here we have shown that Ro5-3335 blocks the EHT and reverses pulmonary hypertension in the rodent models. Thus, three potential therapies have evolved from extracellular vesicle research.
2.2 Vesicles in renal disorders
Dr. Giovanni Camussi and colleagues have carried out a series of groundbreaking studies on MSC-vesicle effects in murine models of kidney injury. They demonstrated that MSC-vesicles could stimulate proliferation and diminish apoptosis of injured kidney cells [42, 43]. Human mRNAs were transferred and translated into proteins in renal epithelial tubular cells of kidney injured mice. They studied cisplatin treated mice with acute kidney damage. Here they found a dose related correction of injury and felt the therapeutic action was related to the antiapoptotic effect of the MSC-vesicles. They also investigated an ischemia-reperfusion model of kidney injury and showed that the injury could be prevented by a single infusion of MSC-EVs. These workers also demonstrated that the active vesicle population was the smaller exosomes as opposed to microvesicles.
2.3 Lung cancer
As one of the most common and increasingly deadly cancers in the world, lung cancer is a natural target for early screening, and EVs may hold clinical value in diagnosing and treating this neoplastic disease [46, 51, 52, 54]. Exosomal nucleic acids (such as microRNAs) released from neoplastic lung cancer cells play a vital role in the cancer’s ability to evade the immune response. These cancer-derived exosomes have been shown to have a critical impact on disease progression via their ability to modulate gene expression post-transcriptionally [47, 50]. Lung cancer derived exosomes are laden with, and shuttle, a vast array of immune suppressive cargo that stymies the function of immune cells. Interestingly, the proteins and nucleic acids carried in these tumor-derived exosomes are similar to those of the parent cell from which they were derived, allowing for an effective mode of non-contact-dependent cellular manipulation, which has wide reaching implications for cancer immune evasion and metastasis.
Vesicles can also have direct actions on target cells. Tumor-associated antigens are also loaded into, and perhaps found bound to the surface of many of these nanolipid carriers, which can then go on to directly modulate immune mediators’ cellular function [47, 71, 73, 74, 86].
EVs shed from lung cancer also have various implications for the tumor microenvironment and phenotype, a phenomenon observed across numerous cancer types including leukemia. Vascular endothelial growth factor (VEGF) has often been studied as a potential drug target to quench the growth of localized and distant lung cancer. Certain monoclonal antibodies that target VEGF are used to inhibit the formation of new vasculature often initiated by growing cancer cells, which in essence starves a growing tumor of oxygen and other vital nutrients. Work by Azmi has shown how a selective population of EVs allows sensitive lung cancer to escape these treatments. Tumor cells threatened by an increasingly hypoxic microenvironment secrete a very select population of adapted EVs which can directly stimulate the formation of new blood vessels and can transfer entire organelles, including mitochondria, allowing for more efficient biochemical use and energy production within an oxygen depleted microenvironment. Other works have confirmed this, showing that STAT3-regulated exosomal miR-21 enhances the level of VEGF.
Hypoxia and other cellular stressors can also drive numerous cellular adaptations in lung and other cancers. Hypoxia, acidosis, and initiation of an immune response, such as with endotoxin, prompt tumor cells to secrete more oncogenic EVs. These cancer-derived exosomes have direct roles in mediating metastasis, and may be implicated in the early cellular dysregulation that establishes a pre-metastatic niche for future metastatic cells.
There is direct evidence for the involvement of exosomes from highly metastatic cancer cells in educating stromal cells and altering the cancer microenvironment. In addition, much of the stromal microenvironment that is exposed to cancer undergoes epithelial-to-mesenchymal transition (EMT), allowing for the genesis of a more aggressive phenotype via an EV-mediated process. Rahman et al. found that exosomes derived from patients with lung cancer induced vimentin expression, and subsequent EMT, in normal lung epithelial cells.
Lung cancer derived exosomes promote cancer survival via a myriad of other mechanisms, including fibroblast growth to enhance desmoplastic stromal response which has been shown to enhance tumor growth and block drug delivery in lung, breast and pancreatic cancer models. In addition, tumor cell derived EVs can sequester and carry bioactive Fas ligand (FasL) which has a role in inducing immune cell death, thus dampening the T cell immune response and progressing metastasis in lung cancer .
Prior work has explored the effect of EVs from lung and bone marrow sources, and demonstrated that once at the effector cell, EVs impart cellular effects by several purported mechanisms including: (i) direct binding and activation of cell surface receptors by proteins and lipid ligands, or (ii) fusion and uptake (phagocytosis/endocytosis) of vesicle contents into the recipient cells. Effector molecules (e.g., mRNA), non-coding regulatory RNAs (e.g., microRNAs or miRNAs), proteins, and transcription factors can all be delivered, each having short- and long-term implications on effector cell phenotype and function [58, 59]. As discussed, various other studies have also highlighted the ability of EVs to directly transfer relatively larger molecules such as cellular receptors, major histocompatibility complex (MHC) molecules, antigens, as well as entire organelles, some containing fully intact mitochondria, lysosomes, Golgi and intermediate filaments .
2.4 Breast cancer
EV studies relating to both breast and prostate cancer highlight many of the salient principles observed in lung cancer studies, and also exhibit the roles that EVs play in evolving chemoresistance. As we will see across a variety of cancer models and disease states, EV function carries great plurality, exhibiting multiple and often contradictory effects depending on cellular origin and physiological state.
In breast cancer, while healthy mammary epithelial cells within the breast stroma secrete EVs that prevent the release of breast cancer derived EVs, the EVs shed by the diseased cells promote the opposite, imparting an immense impact on chemoresistance. Cancer derived EVs are known to shuttle pro-oncogenic proteins and nucleic acids from diseased cells to surrounding healthy stroma and connective tissue. Zhou et al. reported that breast cancer secreted exosomes are enriched in particular RNA species, such as miR-105, which destroys the vascular endothelial barrier, allowing cancer to enter the circulation and spread. Studies employing fluorescently labeled miRNA-loaded EVs showed that tamoxifen resistant breast cancer cells in vitro can carry multiple miRNA profiles. EVs packed with fluorescently tagged miR-221/222 can also shuttle their cargo to sensitive cells of the same type, thereby transferring resistance RNAs which effectively reduced gene expression of p27 and estrogen receptor-α (ERα) in target cells. Loss of p27 has been linked to drug resistance, as it allows a cell that is arrested in its cell cycle to reenter active cycling. However, as discussed, healthy stromal cells counteract the effects of oncogenic vesicles. This competition between “good” and “bad” vesicles is a fine balance; a yin/yang that loses equilibrium as cancer overwhelms healthy stroma. When stromal cells are outcompeted and significantly influenced by oncogenic EV signaling, the now altered stroma in turn activates STAT1 and NOTCH3 signaling in breast cancer cells, promoting the cancer initiating cell populations responsible for drug resistance and nascent tumor formation. This is a common theme in EV-mediated cancer progression, which we will see recurs across numerous solid and hematological cancers.
2.5 Prostate cancer
Human bone marrow mesenchymal stem cell (MSC) derived EVs are involved in the modulation of cell signaling, cellular differentiation, and proliferation, and this is seen across multiple disease paradigms. These regenerative EVs have been shown to reverse the malignant phenotype in prostate and colorectal cancer, to recover function in a murine model of acute kidney injury (AKI), and to mitigate radiation damage to marrow. In models of prostate cancer, the reversal of taxane resistance and of the tumorigenic phenotype in a human prostate carcinoma cell line (as well as in human explants) can be accomplished by treatment with healthy MSC-derived EVs. Other populations of “therapeutic” EVs (outside of the bone marrow) have also been isolated and applied: EVs isolated from normal prostate cells acquired via patient biopsy reverse the resistance of malignant prostate cells to various drugs. In contrast, we have shown that EVs derived from cancerous cells can drive cancer progression and enhance resistance to certain chemotherapies, which again highlights the specificity, the plurality, and the yin/yang of EV functionality. Panagopoulos et al. confirmed much of this work, showing that vesicles from both in vivo prostate cancer cells and explant cultured prostate cancer cells can induce cellular changes that produce a neoplastic phenotype in normal prostate cell lines. These results were also reproduced using vesicles from patients with other malignancies, namely prostate and lung.
2.6 Neural-derived EVs in traumatic brain injury
Our group has developed a unique biomarker system focusing on EVs isolated from the saliva of patients who have experienced mild traumatic brain injury (mTBI) . Rather than conventional human serum isolation, this has been a novel protocol allowing for the easily accessible collection of saliva laden with EVs that have freely trafficked from injured brain parenchyma into the saliva—allowing for a representative sample which captures the shift of various EV populations and cargo following brain trauma. EVs are membrane bound, and thus are not subject to the same degradation that conventional serum biomarkers face, making them ideal biomarker candidates. Salivary EVs in particular can be isolated based on tissue specificity and have well established roles in the detection of numerous other disease states, including oral squamous cell carcinoma . Bolstering their utility as a unique biomarker, upon analysis and characterization of patients that had sustained mTBI it became apparent that EVs isolated from saliva had numerous neural markers on them, confirming their origin from brain parenchyma . Following analysis of the expression of Alzheimer disease (AD) genes in patients who had suffered mTBI vs. healthy controls, multiple important AD specific genes were significantly upregulated in patients that had suffered mTBI when compared to healthy controls; allowing for the identification of mTBI-specific genetic profiles derived from neural derived EVs. The potential characterization of early mTBI biomarker genes, including (but not limited to) CTSD, CDC2 and casein kinase (CSNK1A1) is being explored . Longitudinal analysis of these patients coupled with further analysis of the identified surrogate markers allows for possible prognostication of mTBI patients in regard to severity of post-TBI concussion symptoms, chronicity of symptomology and potential recovery. This is all made possible by the ubiquitous and specific nature of EVs.
2.7 Hematologic malignancies
EVs secreted by blood borne hematologic cancers have modulating effects on a variety of cancer hallmarks. EVs have a direct effect on phenotypic and genotypic changes, highlighting the central role of EVs in the progression and reversal of hematologic malignancies.
2.7.1 Impact on various leukemias
Pathways involved in angiogenesis have been shown to modulate cancer progression and chemotherapeutic evasion in multiple models. Vesicles shed by chronic lymphocytic and chronic myelogenous leukemia (CLL, CML) transmit cargo containing a myriad of cancer-inducing factors, such as components of the rapamycin/p70S6K/hypoxia-inducible factor-1α axis. Similar to lung cancer-derived EVs, these CML-derived vesicles have been shown to bolster the survival of CML B-cells via the establishment and proliferation of vascular endothelial growth factor within the forming leukemic bone marrow stromal cells [67, 69]. In multiple myeloma (MM) models, bone marrow stromal cell-derived exosomes mediate cellular communication by transferring mRNAs, miRNAs, and proteins important in proliferation, survival, and chemoresistance. In vivo mouse experiments showed that exosomes from the immortalized myelogenous leukemia cell line K562 can induce new angiogenesis, as well as neogenic changes in human umbilical endothelial cells.
Other hematologic malignancies show similar cancer induction potential. CML-derived EVs given to rat models can induce CML-like characteristics via the transfer of their oncogenic cargo . Bone marrow stromal cells respond to this influence by producing interleukin (IL)-8 (mRNA and protein), a potent pro-angiogenic factor that modulates both in vitro and in vivo the leukemia cell malignant phenotype . In our own established acute myeloid leukemia (AML) model, we explored the potential of human bone marrow MSC-derived EVs as a direct adjunct therapy for AML. Our studies indicated that the killing potential of cytarabine, at even relatively low doses, is potentiated by the addition of healthy MSC-derived EVs. We believe EVs can also alter a cancer cell’s sensitivity to chemotherapy via EV guided horizontal information transfer. This has implications directly on the cell itself but likely also impacts the surrounding stroma in order to further promote oncogenic growth and drug resistance of leukemia cells . In models of MM when marrow MSC-derived exosomes are cultured with cancerous MM cells, there is a significant increase in multiple anti-apoptotic pathways which promoted MM cell viability. These exosomes, derived from stromal cells within a microenvironment amidst developing active cancer, were also able to induce drug resistance to the proteasome inhibitor bortezomib via activation of several survival relevant pathways, including c-Jun N-terminal kinase, p38, p53, and Akt .
2.7.2 Impact on chemo-resistance
Cancer derived EVs play a central role in facilitating the escape of cancer cells from cell death. Proteins such as BCL-2, MCL-1, BCL-X, and BAX, as well as other cell death-related proteins, were shown to be more concentrated in the EVs of apoptosis-resistant primary AML blasts than in EVs from more sensitive AML cells. Via confocal-microscopy-based colocalization studies, the direct transfer of EVs from resistant to sensitive cells has been observed. Leukemia derived EVs harbor multiple bioactive lipids, proteins and miRNAs important in chemoresistance. Ibrutinib is a drug used clinically to combat leukemia. Analysis of plasma samples collected from CLL patients showed exosomes bearing a unique microRNA profile, including the miR-29 family, miR-150, miR-155, and miR-223, that differed from the exosome profile seen when disease is suppressed with ibrutinib treatment. This may indicate the pathophysiology by which cancerous EVs impart resistance, and it creates potential for biomarker identification. EVs packed with miR-221/222, from tamoxifen resistant MCF-7 breast cancer cells, can shuttle their cargo to sensitive cells of the same type, thereby transferring resistance.
2.7.3 Impact on the cancer microenvironment
Healthy bone marrow stroma likely functions to maintain and protect the marrow from nascent cancer. At first detection of threat, the bone marrow microenvironment and its resident cells, such as MSCs, combat early cancer, and much of this is likely EV mediated. Our established leukemic cell model supports this hypothesis, indicating that MSC-EVs impart a robust anti-proliferative and pro-apoptotic effect on leukemic cells in vitro. We also have preliminary data using EVs toward clinically relevant endpoints, and have shown that they serve as a synergistic adjunct to conventional AML therapies, such as cytarabine.
As discussed, the bone marrow stroma can be recapitulated by active cancer via multiple EV-dependent mechanisms. Leukemic models have shown that the net effect of EV modulation translates to a phenotypic change of the bone marrow stromal cells toward a more inflammatory signature that resembles the phenotype of cancer-associated fibroblasts (CAFs). CAFs show enhanced proliferation, migration, and secretion of inflammatory cytokines, all contributing to a tumor-supportive niche. As a result, stromal cells exposed to leukemic EVs are not killed but “reprogrammed” to be pro-oncogenic and support tumor growth. As discussed, EV populations change depending on disease state. In the case of CLL, as leukemic cancer cells progress, varying EV populations establish control within the microenvironment. CLL-derived EVs rapidly deliver their biologic cargo to the surrounding stromal cells, promoting CAF phenotypes with enhanced proliferative and migratory properties. CLL models have shown that CAF-derived factors may also have an immunogenic effect on T cells and myeloid cells, altering their phenotypes into immunosuppressive and tumor-promoting Th2-like and M2-like cells, respectively. These modifications lead to defective T-cell and myeloid cell immune responses and an inflammatory milieu characteristic of CLL promotion.
Leukemic EVs impart genotypic and phenotypic effects on all components of the leukemic microenvironment. The bony endosteal compartment of the bone marrow niche, composed of osteocytes, osteoblasts and osteoclasts, is reprogrammed by AML-derived EVs toward inflammatory myelofibrotic cells. These cells support leukemic growth and support BM fibrosis, a well-established risk factor for leukemia. Metastasis is crucial to cancer survival. Leukemia derived EVs have also been shown to disturb the architecture of multiple tight junction proteins in cells of the basement membrane, allowing cancer to detach, mobilize, and metastasize beyond in situ disease. Leukemic EVs can also bolster angiogenesis. In vitro studies, first reported by Umezu et al., clearly showed leukemic cell to endothelial cell communication via exosomal miRNAs by fluorophore signaling localization, allowing for the creation of new blood vessels to feed cancer growing in newly seeded microenvironmental niches. Microenvironment stromal cells have been shown to directly take up cargo from EVs fluorescently labeled with GFP. As we have seen in lung and breast cancer cells, Boelens et al. showed that this cross talk is reciprocal and that when stromal cells are influenced by oncogenic EV signaling, the stromal cells themselves in turn activate STAT1 and NOTCH3 signaling in developing cancer cells. This cell signaling in turn leads to cell populations responsible for drug resistance and nascent tumor formation [64, 78]. Schepers et al. have shown that AML cells (likely via an EV directed mechanism) cause numerous chromosomal anomalies and genetic mutations within the surrounding stroma, thereby shifting the biology of the stem cell continuum away from normal hematopoiesis and transforming bone marrow stem cells into immature progenitors that will subsequently develop into leukemic blasts or altered cancer stem cells capable of supporting a pro-leukemic environment.
The CAF phenotype promoted by tumor-derived EVs itself has secondary effects on endothelial cells, increasing angiogenesis. The sum and synergy of all of these EV-directed microenvironment modulations mean that the leukemia-modified stroma favors leukemic blast proliferation while stymieing normal hematopoiesis [69, 76].
3. Machine learning
In this section, we discuss machine learning (ML) as an emerging scientific field of sophisticated algorithms that aid in the understanding of how nonlinear interactions between molecular features contribute to disease etiology. Here, we give relevant background on how machine learning is used in biology, provide a formal and probabilistic specification of the hierarchical architectures implemented by common ML methods (Bayesian deep neural networks), and demonstrate their power via real data applications. Indeed, a myriad of well-established algorithms can be surveyed in detail, but our main goal is to develop a more conceptual pipeline on how to use machine learning techniques on individualized biological problems.
In the context of our own research interests, we have found vesicle biology to be amenable to ML because of (i) the ability to observe millions of vesicles during a single study and (ii) the nonlinear nature of downstream vesicle effects. As we will show, large sample sizes and the presence of variable interactions are often leveraged by ML algorithms to provide high predictive accuracies. We hypothesize that these performance gains will lead to a more complete picture of how vesicle behavior impacts the overall cellular environment.
3.1 Background and significance
Machine learning is often described as a subarea of artificial intelligence that seeks to recognize subtle patterns found within data. It has been noted that the field has roots dating back to early work done by Arthur Samuel in 1959. However, despite this long history, only the technological advances of the past two and a half decades have considerably revived interest in ML. With increases in both data collection and computational power, the applications of machine learning algorithms have become vast and integral parts of our everyday lives (e.g., facial recognition and spam detection).
One explanation for the utility of ML approaches is that they are able to model complex structures in data and leverage the detailed information to accurately predict or classify unobserved outcomes. Unique to these algorithms is their ability to adaptively update themselves (learning) through repeated exposure to new observations (a process formally known as “training”) [81, 82]. Intuitively, an algorithm should achieve a higher predictive accuracy after training on larger data sets: the more possibilities an algorithm is exposed to, the better said algorithm becomes at correctly identifying similar complex patterns in heterogeneous populations [82, 85]. This represents a common tenet about ML theory: the more data the better. However, just having data is not always enough. A second tenet is related to the strength of signal between the observed data and the scientific question of interest. The greater the signal-to-noise ratio, the more amenable the task is to ML methodology. In practice, there exists a general relationship between tenets 1 and 2: the more data one has, the less robust the signal-to-noise ratio must be to achieve an acceptable prediction/classification; conversely, a high signal-to-noise ratio will compensate for less data. Note that this is obviously not a strict relationship, as many have demonstrated ML algorithms to perform well on noisy data sets with few observations.
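The trade-off between the two tenets can be illustrated with a small simulation. The sketch below is not drawn from the studies cited here: it assumes a toy one-dimensional problem with two Gaussian classes and unit noise, and it uses a simple nearest-class-mean rule as a stand-in for an ML algorithm. The function names `simulated_accuracy` and `mean_accuracy`, the sample sizes, and the signal values are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def simulated_accuracy(n_per_class, signal, rng, n_test=400):
    """Train a nearest-class-mean rule on simulated data; return test accuracy.

    Each class is Gaussian with unit noise and the class means sit at +signal
    and -signal, so `signal` plays the role of the signal-to-noise ratio.
    """
    # "Training": estimate each class mean from n_per_class noisy observations.
    mean_pos = mean(rng.gauss(+signal, 1.0) for _ in range(n_per_class))
    mean_neg = mean(rng.gauss(-signal, 1.0) for _ in range(n_per_class))

    # "Testing": classify fresh observations by the closer estimated mean.
    correct = 0
    for _ in range(n_test):
        is_pos = rng.random() < 0.5
        x = rng.gauss(signal if is_pos else -signal, 1.0)
        predicted_pos = abs(x - mean_pos) < abs(x - mean_neg)
        correct += predicted_pos == is_pos
    return correct / n_test


def mean_accuracy(n_per_class, signal, repeats=200, seed=1):
    """Average the test accuracy over many repeated simulated studies."""
    rng = random.Random(seed)
    return mean(simulated_accuracy(n_per_class, signal, rng) for _ in range(repeats))


if __name__ == "__main__":
    # Illustrative grid: weak versus strong signal, few versus many samples.
    for signal in (0.25, 2.0):
        for n_per_class in (2, 200):
            acc = mean_accuracy(n_per_class, signal)
            print(f"signal={signal:<4} n_per_class={n_per_class:<3} accuracy={acc:.3f}")
```

Under these assumptions, average accuracy at a weak signal should improve as the training set grows, while a strong signal should yield high accuracy even with very few training observations. Real data sets rarely behave this cleanly, which is consistent with the caveat that the relationship is not strict.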
With its growing popularity in the biological literature, the formal connection between machine learning and more traditional statistical sciences cannot be overlooked. Indeed, many current approaches in ML are motivated by prediction; however, there are opportunities to pair these tools with fundamental probabilistic concepts to improve power for inference-based tasks as well. This is particularly relevant for biological problems where it is also important to understand the processes that are contributing to better predictions. To this end, recent works have used (interpretable) ML algorithms for live risk stratification in cancer patients, novel biomarker identification in liquid biopsies, hypoxemia prevention during surgery, point-of-care diagnosis of lymphoma, as well as many other uses in genetics and genomics [82, 83].
3.2 Probabilistic formulation
With an increasing literature on both statistical and machine learning methods, it can be difficult to decide which algorithm to use. Figure 1 provides a general approach for determining the proper choice. In this section, however, we will focus on detailing an increasingly popular machine learning method known as a neural network (NN). Although NNs excel at classification tasks (see Figure 1), many recent works have focused on applying neural networks to a wider range of applications [83, 91, 92].
For simplicity, we will consider an arbitrary data analysis problem. Let $\mathbf{y}$ be an $n$-dimensional response/outcome vector for $n$ individuals. Assume that for each individual, we measure $p$ features and tabulate their collection via an $n \times p$ design matrix $\mathbf{X}$. Statistically, these features are variables that we believe will help accurately predict the outcome. In the case of our research on vesicle biology, features may be biophysical (e.g., vesicle diameter and volume), genomic (e.g., sequence data), proteomic, or lipidomic measurements. Following previous work, we may specify a (Bayesian) NN by assuming some hierarchical architecture to “learn” the predicted response for each observation in the data:
$$\hat{\mathbf{y}} = \sigma(\mathbf{f}), \tag{1}$$
$$\mathbf{f} = \mathbf{H}(\boldsymbol{\theta})\mathbf{w} + \mathbf{b}, \tag{2}$$
$$\mathbf{w} \sim \pi. \tag{3}$$
These equations reformulate a general NN as a probabilistic hierarchical statistical model. In Eq. (1), $\hat{\mathbf{y}}$ is an $n$-dimensional vector of predicted values, $\mathbf{f}$ is an $n$-dimensional vector of continuous unbounded values that need to be estimated, and $\sigma(\cdot)$ is a link function that relates $\mathbf{f}$ to the mean of the (assumed) distribution of $\mathbf{y}$. Note that the link function can be flexibly changed depending on the goals of the research. For example, in the case of regression problems with continuous outcomes, the link function is set to the identity, while for classification-based applications with binary data, we may use a sigmoid function, which transforms the systematic part of the model to be between 0 and 1. If one is faced with a multiclass problem, then $\sigma(\cdot)$ can be redefined as a softmax function.
In Eq. (2), we use $\mathbf{H}(\boldsymbol{\theta})$ to denote an $n \times k$ matrix of activations from the penultimate layer (which are fixed given a set of inputs $\mathbf{X}$ and point estimates $\boldsymbol{\theta}$ from previous layers), $\mathbf{w}$ is a $k$-dimensional vector of weights at the output layer that is assumed to follow some prior distribution $\pi$ (see Eq. (3)), and $\mathbf{b}$ is an $n$-dimensional vector of biases that is produced during the training phase.
Under this formulation, notice that we may divide arbitrary neural networks into three components (see the middle panel in Figure 2): (i) an input layer of the features in the design matrix (red nodes), (ii) a set of hidden layers where parameters are deterministically computed from a series of activations and point estimates (blue nodes), and (iii) a penultimate layer where the weights are treated as random variables (green nodes). This structure is also highly generalizable: hidden layers can take on any form, provided that the additional structure can be represented via some linear combination of activations, weights, and biases.
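As a concrete illustration of the hierarchy in Eqs. (1) through (3), the following sketch computes predictions for a tiny network in plain Python. It is a minimal sketch under stated assumptions rather than the implementation used in this chapter: the symbol names (`H` for the activation matrix, `w` for the output weights, `b` for the biases), the single tanh hidden layer, and the standard normal prior on the weights are illustrative choices, and every function name is hypothetical.

```python
import math
import random


def sigmoid(value):
    """Link for binary outcomes: maps an unbounded value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def softmax(values):
    """Link for multiclass outcomes: maps a score vector to probabilities."""
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def hidden_activations(X, theta):
    """Deterministic hidden layer: tanh(X W1 + b1) with fixed point estimates."""
    W1, b1 = theta
    return [
        [math.tanh(sum(x_j * W1[j][m] for j, x_j in enumerate(row)) + b1[m])
         for m in range(len(b1))]
        for row in X
    ]


def predict(H, w, b, link):
    """Eqs. (1)-(2): f = H w + b, then y_hat = link(f), one value per individual."""
    f = [sum(h * w_m for h, w_m in zip(row, w)) + b_i for row, b_i in zip(H, b)]
    return [link(f_i) for f_i in f]


def prior_predictive_mean(H, b, link, n_draws=500, prior_sd=1.0, seed=0):
    """Eq. (3): draw w from an assumed N(0, prior_sd^2) prior and average y_hat."""
    rng = random.Random(seed)
    totals = [0.0] * len(H)
    for _ in range(n_draws):
        w = [rng.gauss(0.0, prior_sd) for _ in range(len(H[0]))]
        for i, y_hat in enumerate(predict(H, w, b, link)):
            totals[i] += y_hat
    return [t / n_draws for t in totals]


if __name__ == "__main__":
    # Three individuals, two features, three hidden units (all values made up).
    X = [[0.1, 0.2], [1.0, -0.5], [0.3, 0.3]]
    theta = ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.6]], [0.0, 0.1, -0.1])
    H = hidden_activations(X, theta)
    b = [0.05] * len(X)
    print(predict(H, [0.2, -0.4, 0.7], b, sigmoid))   # binary classification
    print(predict(H, [0.2, -0.4, 0.7], b, lambda v: v))  # regression (identity)
    print(prior_predictive_mean(H, b, sigmoid))
```

Swapping the `link` argument is all that is needed to move between regression and binary classification, which mirrors the flexibility of the link function described above. In a fitted Bayesian model the weights would be drawn from a posterior rather than the prior; the prior draw here only shows where the randomness enters.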
3.3 Real data applications
We now demonstrate how machine learning and, more specifically, neural networks can be adopted to positively impact data analysis. Our group looks to characterize the vesicle phenotype of patients at various stages of treatment in various leukemias, such as AML. Here, we utilize a common NN architecture known as a Multilayer Perceptron, where we first train the algorithm on patients with known disease statuses (i.e., whether the $i$th patient has cancer) and then test its ability to accurately classify a set of undiagnosed individuals. We define accuracy here as simply the percentage of correctly classified samples in a testing dataset. For each validation run, a Receiver Operating Characteristic (ROC) curve is drawn and the area under the curve (AUC) is calculated. The AUC is a standard performance metric for classification problems in statistics and may be interpreted as an assessment of how effective an algorithm is at discriminating between two classes (i.e., a healthy versus a disease phenotype). Higher AUC values (on a scale from 0 to 100%) indicate better model performance. An overall summary of our workflow may be found in Figure 2, where we illustrate how different biological features are quantified and fed through a NN to make predictions.
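The two performance measures described above can be computed in a few lines. The sketch below assumes binary labels coded as 0 (healthy) and 1 (disease) and a classifier that returns one score per sample. It uses the rank-based definition of the AUC (the fraction of disease-healthy pairs in which the disease sample receives the higher score, with ties counted as one half), which equals the area under the ROC curve. The function names and the 0.5 threshold are illustrative assumptions.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly after thresholding the scores."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of class scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample from each class")
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(f"accuracy = {accuracy(labels, scores):.2f}")  # 0.75
    print(f"AUC      = {auc(labels, scores):.2f}")       # 0.75
```

Unlike accuracy, the AUC does not depend on a particular decision threshold, which is one reason it is commonly reported alongside accuracy for two-class problems.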
We first trained the algorithm on data collected from a NanoSight Tracking instrument, the NS5000. This allowed us to collect a wide selection of vesicle features including size, area, volume, diffusion coefficients, and total vesicles secreted. These data were collected from two cell populations: (i) a primary hMSC cell line and (ii) a Kasumi AML cell line. We did this in order to first assess whether there is a discernible difference between vesicles derived from “normal” hMSC and vesicles from the cancerous Kasumi cell line. Within the training set, we were able to classify vesicles with relatively high accuracy: the mean AUC (plus or minus standard deviation) after 10-fold cross validation was 90.16 ± 9.26%. This translated into a high accuracy in the testing population with a mean AUC (after 10-fold cross validation) of 95.97 ± 5.38%. We next tested the algorithm on real patient samples, achieving perfect accuracy in classifying healthy tissue. We believe this high accuracy arises because the primary hMSC cell line accurately represents the vesicular phenotype of normal, healthy bone marrow. There is still some work to be done in accurately classifying malignant samples. We believe that the heterogeneity of the leukemic vesicle phenotype cannot trivially be captured through cell line data [94, 95].
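To make the "mean AUC plus or minus standard deviation after 10-fold cross validation" summary concrete, the sketch below runs a stratified 10-fold procedure on simulated data. It is illustrative only and does not reproduce the analysis reported here: the two vesicle features are synthetic draws from shifted Gaussians, a nearest-centroid rule stands in for the Multilayer Perceptron, and all function names and parameter values are hypothetical.

```python
import random
from statistics import mean, stdev


def auc(labels, scores):
    """Rank-based AUC: share of (disease, healthy) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with both classes spread evenly."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def centroid_scores(train_X, train_y, test_X):
    """Stand-in classifier: higher score means closer to the disease centroid."""
    def centroid(cls):
        rows = [x for x, y in zip(train_X, train_y) if y == cls]
        return [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return [sq_dist(x, c0) - sq_dist(x, c1) for x in test_X]


def cross_validated_auc(X, y, k=10, seed=0):
    """Return the mean and standard deviation of the held-out AUC over k folds."""
    rng = random.Random(seed)
    fold_aucs = []
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        scores = centroid_scores([X[i] for i in train_idx],
                                 [y[i] for i in train_idx],
                                 [X[i] for i in test_idx])
        fold_aucs.append(auc([y[i] for i in test_idx], scores))
    return mean(fold_aucs), stdev(fold_aucs)


def simulate_vesicle_features(n_per_class, shift=2.0, seed=0):
    """Synthetic, standardized two-feature data; class 1 is shifted by `shift`."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label, 1.0), rng.gauss(shift * label, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = simulate_vesicle_features(100)
    avg, sd = cross_validated_auc(X, y, k=10)
    print(f"10-fold AUC: {100 * avg:.2f} ± {100 * sd:.2f}%")
```

Stratifying the folds keeps both classes present in every held-out set, so the AUC is defined for each fold. The spread across folds gives a rough sense of how much the estimate depends on which samples happen to be held out.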
To address this heterogeneity problem, we then elected to train and test our machine learning algorithm solely on patient samples, in hopes of increasing the predictive performance. We fed the model 35 samples from patients with various hematologic conditions. We trained and tested the model on these 35 samples and were able to achieve a mean training accuracy of 93.76 ± 4.77% and an out-of-sample AUC of 97.33 ± 3.46%. The high testing performance suggests that the algorithm is capable of accurate classification and serves as a general proof-of-concept of the potential utility of machine learning in this space. Here, this technology has the power to identify complex, heterogeneous patterns that distinguish the normal healthy vesicle phenotypes from leukemic vesicle phenotypes.
EVs are a ubiquitous and dynamic population of cell-specific information. Functionally, they act as a class of membrane-bound cellular communication particles that contain bioactive molecules. By exerting their effects through RNA, proteins, lipids and variably DNA, EVs implement various downstream phenotypic and genotypic effects across multiple disease states. By enabling contact-free cell-to-cell communication, EVs can modulate normal physiological homeostasis. Moreover, research has shown that certain subpopulations of EVs are responsible for initiating and maintaining certain pathological states. This is the “Yin and Yang” of vesicle biology, which posits that EV populations harbor a functional endpoint specific to cell type and disease state. There are far-reaching implications in utilizing EVs toward clinical endpoints focused on disease identification, progression, modulation, and ultimately cures.
In order for EVs to be effectively utilized in the identification, prognostication, modulation, and curing of disease, more work needs to be done with regard to understanding the nonlinear effects of EVs on target cells. This creates a need for novel and sophisticated statistical modeling techniques. The steadily increasing size of “omic” data sets, along with significant improvements in computational power and machinery, has caused resurging interest in machine learning. Consequently, this revival has led to algorithmic improvements in both predictive accuracy and precision. With the large amount of information that EVs provide, there is a unique opportunity to gain new knowledge from applying ML techniques to current problems focused on better understanding complex EV biology.
Our lab has already begun utilizing ML algorithms in the characterization of diseased subpopulations of EVs. Currently, most active areas of research in vesicle biology focus on characterizing EVs via isolation methods. Alternatively, we propose to analyze entire populations of EVs jointly, for we believe a holistic view better captures the true nature and variability of a patient’s disease process. Thus, our work is novel in this respect: we use predictive algorithms to identify subtle patterns within a given vesicle population. Here, we analyze how particular subpopulations interact and detail how these interactions influence the underlying disease process. Furthermore, we hypothesize that by monitoring a patient’s entire population of EVs throughout the course of treatment, we can better predict the efficacy of the treatment. Preliminary results have been positive with respect to categorizing diseased and healthy EV populations. Work now must be done to further characterize these subpopulations within the context of specific diseases, such as AML and other blood neoplasms. EV biology presents another avenue of utility for the field of machine learning. Concatenating large sets of EV information within interpretable ML algorithmic frameworks can lead us closer to the use of EVs as a predictive and useful clinical marker. Overall, the future is bright for both the fields of EV biology and ML.