Sexual assault (SA) is a crime of violence against a person’s body that results in physical trauma, mental anguish, and suffering for victims, and it generates government expenses for criminal investigation, medical care, and psychological attention. During the crime scene investigation, the identification and recovery of biological evidence (BE) are of the utmost importance, since they are sometimes the only way to prove sexual contact and the perpetrator’s identity. The examiner, with the help of specific technologies and techniques, must be able to find evidence that could otherwise go unnoticed. Forensic laboratories identify biological evidence with systematized protocols and use molecular methods to generate DNA profiles based on DNA amplification and sequencing. Before the arrival of new-generation sequencers, the application of other markers (single nucleotide polymorphisms (SNPs), insertion-deletions of nucleotides (INDELs), or microhaplotypes (MHs)) was laborious, expensive, and not very informative for forensic purposes; now they are useful in this field. Next-generation sequencing (NGS) has brought a new series of applications, such as epigenetic, microbiota, messenger RNA, and microRNA analysis, as well as inferences about the ancestry and phenotype of individuals. In the end, the results obtained from such analyses and stored in databases are very useful for the identification of sexual aggressors.
- sexual assault
- biological evidence
- DNA analysis
- next generation sequencing
- human identification
Sexual assault is considered a serious offense all around the world due to the impact it has on the victims, their relatives, and society in general. The investigation of sex crimes requires a group of multidisciplinary forensic professionals focused on the identification, recovery, packing, and analysis of evidence.
This document describes general recommendations and the decision-making process necessary for the recovery and analysis of evidence collected in sexual assault cases, especially evidence of a biological nature coming from victims, perpetrators, and crime scenes. The use of appropriate tools for identifying biological evidence (BE) is a key element in the success of the investigation, since it allows forensic investigators to make decisions and use presumptive or confirmatory methods to recover evidence and forward it to specialized laboratories. Additionally, microscopic techniques and fluorescence in situ hybridization allow accurate selection and isolation of components at the cellular level, thus increasing the chances of obtaining a genetic profile that identifies the perpetrator.
Obtaining genetic profiles out of BE in sexual assault cases requires the use of DNA extraction techniques designed for the separation of cells (sperm cells from the aggressor and epithelial cells from the victim) which contribute to the acquiring of differentiated genetic profiles of the contributors.
The use of short tandem repeats (STRs) in forensic investigation has been, for many years, a key element in human identification. Other techniques such as mitochondrial DNA (mtDNA) and single nucleotide polymorphism (SNP) analysis and its variants broaden the possibility of obtaining a profile by providing additional information when other methods fall short. DNA methylation analysis, microRNAs, and genome sequencing of microorganisms provide scientific information for criminal investigation.
The development of next-generation sequencing has extended the analysis to establishing the geographical origin of individuals and estimating marker frequencies in different population groups around the world. Likewise, genetic markers of phenotypic expression provide information on externally visible characteristics (height, baldness, and eye, skin, and hair color), which assists criminal investigation.
The implementation and use of databases that register the information acquired from sexual assault investigations are necessary to facilitate the comparison of results; hence, the parameters for entering or deleting perpetrator profiles and other kinds of genetic profiles in such databases must be established in every country’s legislation so that they can be used in criminal investigation processes.
This chapter focuses on the location, sampling, molecular analysis and management of biological evidence in sexual crimes, as well as on specific aspects and a panorama of their molecular analysis.
2. Crime scene investigation and recovery of biological evidence
2.1 General inspection
Before the investigators begin examining the crime scene, they should gather as much information as possible about the setting to prevent the loss or destruction of valuable and/or fragile evidence such as shoeprints and trace evidence. The main areas of inspection are the floor, rugs, bathroom, bedding, and trash receptacles, where items such as condoms could have been discarded by the aggressor during cleaning; the inspection should be extended to the neighborhood if necessary [1, 2].
In the search for signs of sexual contact, the investigator can identify some evidence with the naked eye; however, it should be emphasized that evidence of contact is frequently not visible. Such BE requires forensic light sources for detection, which exploit its natural characteristics, such as light absorption (blood) or fluorescence emission (semen, saliva, and urine). This method is a simple, presumptive, and nondestructive test [3, 4, 5].
In cases where evidence is not detected with the use of forensic light, it is necessary to use other techniques such as Bluestar® to detect washed blood stains, low light or magnifying glasses to observe fibers, and the use of vacuum machines that retain material in filters which could be analyzed in a criminal laboratory [2, 4, 5].
2.2 Recovery of evidence at the crime scene
In an SA investigation, it is necessary to identify any possible source of BE left on the victim or at the crime site (e.g., condoms, body fluids on objects or textiles, bottles, cigarette butts, and hair). Transportable evidence is packed and sent to the laboratory. When BE is on non-transportable objects, swabbing with a dry or lightly moistened swab, passed gently over and rotated on the same spot, is sufficient for recovery. Wet evidence should be dried to avoid damage to the BE from the growth of microorganisms that degrade DNA.
The success of DNA typing is related to the amount of target material recovered from an evidentiary item. Absorption and adsorption are two features related to the capability to collect BE and to later release the cells/DNA during the extraction process, respectively. Synthetic swabs release more cells/DNA during the extraction process and have yielded up to 2.5 times more alleles than cotton swabs, because portions of DNA remain entrapped in the cotton fibers.
Swabs of different designs, shapes, and sizes for evidence recovery are commercially available (X-Swab™ Diomics Corporation and Copan 4N6FLOQSwab™), all of them with highly absorptive properties. The double swabbing method is recommended for the recovery of touched (trace) evidence, since it increases the possibility of obtaining DNA profiles; however, the use of cotton swabs is not recommended for trace evidence [7, 8, 9, 10]. Figure 1 shows the workflow of evidence recovery from the crime scene.
2.3 Recovery of evidence on victim and perpetrator
When an SA is reported, authorities order a medical interview and examination for evidence recovery. During the interview, the expert needs to document the type of sexual aggression (penile-vaginal rape, oral copulation, sodomy, penetration with foreign objects, or digital penetration), personal hygiene, and the time elapsed since the incident; this information is crucial because it indicates the type of sampling to be performed. Additionally, the examiner will look for elements associated with the aggression (e.g., bites and body fluids), which will be collected from anatomical regions that show signs of injury or attack [8, 11, 12, 13] (Figure 1).
One source of evidence in SA investigation is the suspect or perpetrator. It is known that evidence can be transferred from the suspect to the victim and vice versa. Therefore, depending on the type of contact involved in an SA, the suspect’s body may actually be a better source of probative evidence. The biological evidence deposited on the victim and perpetrator deteriorates rapidly; therefore, it needs to be collected as soon as possible. Figure 2 shows the evidence recovery guidelines.
Sperm cells are more resistant to biological degradation than somatic cells; this is supported by the knowledge that the protein component of the sperm nucleus (protamine) protects against damage caused by nucleases, delaying the degradation process.
3. Identification of biological evidence in the laboratory
The evidence/garments collected (from the victim, corpse, aggressor, and crime scene) are inspected in the laboratory in order to perform a search for blood, semen, hair, saliva, sweat, tissues, fibers, and other elements. One of the first interventions is the macroscopic analysis that consists of evaluating evidence through meticulous and sequential observation, evaluating and establishing strategies to find biological spots. When BE is not visible to the naked eye, it is then necessary to use technological help: the forensic light sources with specific wavelengths for its detection [3, 4, 5] (Figure 1).
In daily forensic practice, latent stains of some biological fluids, such as semen, saliva, urine, and sweat, require illumination at specific wavelengths so that they can be detected through their fluorescence emission or light absorption. Fibers and hairs can be observed without instruments, but a lack of contrast with the background makes them difficult to see; in such cases, magnifying glasses or lights that generate shadows help to locate them.
Once the BE is identified, and depending on the surface or support that holds the fluid, it is collected with swabs moistened with sterile water, or a portion of the support is cut out, in order to perform a presumptive or confirmatory analysis. Trace evidence should be kept on its original support (textile) and analyzed in a way that ensures sufficient evidence is left for subsequent tests.
Presumptive chromatic (color) reaction tests are useful for orienting the identification of the nature of a stain and for selecting the confirmatory test, which determines human origin through immunological assays. Because some tests are destructive, it is important to consider the amount of BE available and to apply the measures necessary to preserve it for subsequent studies.
Some forensic laboratories examine semen stains under an optical microscope in order to identify sperm cells. This procedure is controversial because a portion of the sample is separated from its original support, which makes other analyses more difficult, even though that portion should still be regarded as minimal evidence from which a genetic profile can be obtained. On the other hand, some laboratories apply fluorescent techniques to cytological preparations and examine them by fluorescence microscopy, which increases the sensitivity of spermatozoa detection and confirms the presence of these cells in the analyzed fluids [16, 17]. Figure 3 describes the advantages and disadvantages of presumptive and confirmatory forensic tests.
4. Cell isolation from biological evidence
Biological cell mixtures represent one of the major challenges in forensic genetics. In principle, when several individuals contribute different biological fluids to a mixture, their individual genetic profiles can be obtained by separating the distinct cell types [18, 19]. Standard DNA extraction methods, such as preferential lysis, have been developed to separate sperm cells (male fraction) from epithelial cells (female fraction); however, these methods cannot separate the sperm of individual donors when multiple males have contributed.
Modern tools have recently been used to reach that goal. Laser microdissection (LMD) is a technology that has been around for more than 40 years; it combines the magnification power of a microscope with the precise cutting that laser technology allows. Only in the last decade has LMD been used for forensic purposes, mainly in SA for isolating sperm cells from vaginal swabs [18, 21, 22, 23].
4.1 Laser microdissection (LMD)
The use of LMD in the forensic field was first described in 2003 as a way of recovering sperm cells from slide smears of SA cases. LMD allows the selection of individual cells based on morphologic analysis (e.g., sperm and epithelial cells) or on labeling with specific fluorescent dyes. The microscopic search for sperm in cases where there is a limited number of cells can be laborious and prolonged; however, the technology includes an automatic search module introduced by the manufacturers [20, 24].
To date, two variants of this technique are described: laser capture microdissection (harvesting cells by melting a thermoplastic membrane) and laser cutting microdissection (harvesting cells by catapulting). Both work by identifying the cells and using the laser to make clean cuts in the supporting layer around them; because the cells are not physically manipulated, the risk of foreign contamination is eliminated [19, 22, 23]. Analyzing cells in a mixture with an azoospermic or oligospermic contributor is more difficult, because in the absence of sperm cells the male and female cells are indistinguishable; therefore, the use of specific fluorescent dyes is required.
4.2 Fluorescence in situ hybridization and laser microdissection
Sperm cells cannot always be distinguished in the bright field of the LMD microscope, for several reasons: they may have lost their tails, they may be few in number, or the donor may be azoospermic. However, semen also contains non-sperm cells, such as leukocytes and epithelial cells from the ejaculatory duct and urethra [18, 25].
The fluorescence in situ hybridization (FISH) method distinguishes male cells from female cells in cellular mixtures. The cellular DNA is hybridized with fluorophore-labeled DNA probes for the X and Y chromosomes and then observed by fluorescence microscopy, which enables individual cells to be identified [18, 25, 26]. LMD in combination with FISH can greatly improve the identification and subsequent separation of male non-sperm cells from female epithelial cells.
This technique (FISH with LMD) has been shown to be capable of producing autosomal STR profiles from samples that previously would have proved difficult or impossible to separate; additionally, it has applications in numerous other sample types where the ratio of female cells to male cells is large, including cases involving penetration without ejaculation, digital penetration, or oral sex [18, 27].
On the other hand, other separation methods were developed that separate sperm cells from epithelial cells on the basis of differences in size and shape; these produced mixed genotypes. Other new methods have also been proposed for cell separation, such as low-volume polymerase chain reaction (LV-PCR) for single sperm isolation and detection, aspiration capillaries, microfluidic devices, the mDip technique, and fluorescence-activated cell sorting with flow cytometry, which is based on immunolabeling and is applicable only to fresh vaginal lavages, not to vaginal smears or archived material.
5. DNA analysis
5.1 DNA extraction methods
There are many extraction methods available, and they vary in their ability to extract the DNA in an efficient way; some of the factors to consider are the kind of sample to be analyzed, the time it takes to process, the operator intervention, the risk of contamination, and the difficulty or ease of use. This is the basis for successful forensic DNA profiling [6, 29].
The chosen method must not only ensure that the DNA is efficiently extracted from each sample but also remove possible inhibitors that may interfere with downstream processes such as amplification.
5.1.1 Techniques for DNA extraction
One of the most common techniques used in DNA extraction is Chelex, which is a chelating resin that uses ion exchange to bind transition metal ions protecting the DNA from degradation. The advantage of the Chelex® method is that it is quick, it does not require multiple tube transfers, and it does not use toxic organic solvents; the main disadvantage is that it is unable to remove inhibitors that interfere with the amplification process [6, 30, 31, 32].
When processing samples with inhibitors, it is advisable to use the organic extraction method, which requires lysis of cells carried out in a salt solution containing detergents and proteases to denature proteins and release the DNA from the cell. This cocktail can be separated by using a mixture of phenol-chloroform-isoamyl alcohol, which leaves the DNA in the aqueous phase. The extracted DNA can be concentrated from the aqueous phase by ethanol precipitation or with a centrifugal filter unit, which allows for additional purification and concentration of the DNA in the samples [6, 29, 31].
The advantage of the organic extraction method is that it can obtain genetic material from difficult samples (degraded and/or low amount of DNA) and can successfully remove the presence of inhibitors for the PCR. While this method remains one of the most reliable and efficient, it is also very time-consuming, uses hazardous chemicals, and, because of the greater hands-on effort and multiple tube transfers involved, introduces increased risks for contamination and sample mishandling [6, 31].
5.1.2 Differential lysis in DNA mixtures
The genetic analysis of evidence collected in sexual crimes commonly yields genetic profiles of two or more contributors; in this kind of mixture, the genetic contributions of the individuals are generally unbalanced. In some circumstances, the mixture contains only a minimal amount of one contributor, usually the perpetrator in cases of SA. The profile of this donor is likely to go undetected because of sensitivity limits or because the reaction is saturated by the more abundant component. In most cases, the minor contributor in a DNA mixture cannot be detected when the ratio exceeds 1:20.
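To give a sense of scale for such unbalanced mixtures, the following minimal sketch estimates how much minor-contributor DNA is present in a PCR input at different mixture ratios. The 1 ng total input and the figure of about 6.6 pg of DNA per diploid human cell are assumptions chosen for illustration, and the function names are hypothetical; actual detection limits depend on the kit and on laboratory validation.

```python
# Assumption: roughly 6.6 pg of DNA per diploid human cell
# (a haploid sperm cell carries about half of that).
PG_PER_DIPLOID_CELL = 6.6


def minor_contributor_pg(total_input_ng: float, minor: float, major: float) -> float:
    """Picograms of minor-contributor DNA in a PCR input for a minor:major ratio."""
    if total_input_ng < 0 or minor <= 0 or major <= 0:
        raise ValueError("input must be non-negative and ratio parts positive")
    fraction = minor / (minor + major)
    return total_input_ng * 1000.0 * fraction  # ng -> pg


def genome_equivalents(pg: float, pg_per_cell: float = PG_PER_DIPLOID_CELL) -> float:
    """Approximate number of diploid genome equivalents in a given mass of DNA."""
    return pg / pg_per_cell


if __name__ == "__main__":
    # Hypothetical 1 ng total input at several minor:major ratios.
    for major in (1, 5, 20, 50):
        pg = minor_contributor_pg(1.0, 1, major)
        print(f"1:{major:<3} minor DNA = {pg:6.1f} pg, "
              f"about {genome_equivalents(pg):.1f} diploid genome equivalents")
```

Under these assumptions, a 1:20 mixture leaves only a few tens of picograms of the minor contributor in the reaction, which is the range where stochastic effects become important.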
The analysis of evidence in cases of SA is a great challenge for forensic DNA analysts, because it requires the separation of DNA from epithelial cells (the victim) and sperm cells (the perpetrator). Differential extraction was first described in 1985 by Gill and coworkers as a modification of the organic phenol-chloroform extraction, and it is called differential lysis because the non-sperm cells are selectively lysed with detergent and proteases, while the sperm cells are not lysed due to the heavily disulfide cross-linked proteins in the sperm head that resist protease treatment [6, 29, 33, 34].
In DNA forensic labs, the differential lysis method has long been the standard for separating spermatozoa from epithelial cells. Although this technique can theoretically provide two fractions, as pointed out earlier (one comprising the offender’s DNA and the other containing the victim’s DNA), the separation is not always complete, resulting in mixed genotypes [29, 33].
5.1.3 Other DNA extraction methods
There are other methods to separate sperm and epithelial cells from sexual assault samples. The Differex™ System method involves a proteinase K-selective digestion of epithelial cells, followed by differential centrifugation and phase separation. The use of this method in DNA laboratories indicates it offers efficiency equal to the two-step method for extracting sperm DNA from mixed stains [6, 35, 36].
5.2 Molecular methods for human identification
The first use of DNA testing in a forensic setting came in 1986, in the case of two girls who were sexually assaulted and brutally murdered in 1983 and 1986 in Leicestershire, England. In this case an innocent man was first accused, and 1 year later the person responsible was found and prosecuted.
In the last 30 years, DNA molecular analysis has become an important tool in forensic investigations. Currently, DNA profiling is based on polymerase chain reaction (PCR) analysis of autosomal STRs and of Y- and X-chromosome markers. PCR is a process that replicates a specific region of the genome over and over again to yield many copies of that region [29, 38].
Before the PCR, the DNA has to be quantified. This is essential in order to ensure its correct amplification; its primary purpose is to determine the amount of DNA template, resulting from the isolation. There are many methods with different accuracy, but knowing the DNA concentration present in the samples allows the forensic scientist to establish the ideal amount of DNA required for its amplification in order to make it possible to obtain a genetic profile that falls within the quality parameters set by the laboratory [29, 39].
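The role of quantification in setting up the amplification can be illustrated with a short calculation. The sketch below converts a measured concentration into the volume of extract needed to deliver a target amount of template. The function name, the 1 ng target, and the 10 µL maximum template volume are hypothetical values for illustration only; each laboratory uses the values validated for its own kit.

```python
def pcr_template_setup(conc_ng_per_ul: float, target_ng: float, max_template_ul: float):
    """Return (extract_ul, diluent_ul, delivered_ng) for one amplification reaction.

    If the extract is too dilute to reach the target within the available
    volume, the whole template volume is filled with extract and the
    reaction receives less DNA than the target.
    """
    if conc_ng_per_ul <= 0 or target_ng <= 0 or max_template_ul <= 0:
        raise ValueError("all arguments must be positive")
    needed_ul = target_ng / conc_ng_per_ul
    if needed_ul > max_template_ul:
        return max_template_ul, 0.0, conc_ng_per_ul * max_template_ul
    return needed_ul, max_template_ul - needed_ul, target_ng


if __name__ == "__main__":
    # Hypothetical extracts quantified at three different concentrations.
    for conc in (2.0, 0.5, 0.05):
        extract, diluent, delivered = pcr_template_setup(conc, target_ng=1.0, max_template_ul=10.0)
        print(f"{conc:5.2f} ng/uL -> {extract:5.2f} uL extract + {diluent:5.2f} uL diluent "
              f"({delivered:.2f} ng delivered)")
```

The last case shows why low-yield extracts are problematic: even the full template volume cannot supply the target amount, so the profile is more exposed to stochastic effects.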
Because the genetic contributions of the individuals in mixtures from sexual crimes are generally unbalanced, the identification process is further impaired by stochastic effects that are known to affect PCR, such as preferential amplification [29, 40].
5.2.1 Short tandem repeat (STR) analysis
Short tandem repeats (STRs), also called microsatellites or simple sequence repeats (SSRs), contain a core sequence a few nucleotides long that is tandemly repeated, and their use in forensic science opened a new path in human identification [29, 40, 41].
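The idea of a tandemly repeated core can be made concrete with a small sketch that counts the longest uninterrupted run of a repeat motif in a sequence. The sequences, flanking bases, and the GATA motif below are toy examples, not real locus data, and the function name is hypothetical; routine STR typing infers repeat number from fragment length rather than from a string search like this one.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Number of motif copies in the longest uninterrupted tandem run."""
    if not motif:
        raise ValueError("motif must not be empty")
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


if __name__ == "__main__":
    # Two toy alleles of one hypothetical locus: same flanks, different repeat counts.
    allele_1 = "ACGT" + "GATA" * 8 + "TTCA"
    allele_2 = "ACGT" + "GATA" * 10 + "TTCA"
    genotype = sorted(longest_repeat_run(a, "GATA") for a in (allele_1, allele_2))
    print("Genotype (repeat counts):", genotype)
```

Two individuals who differ in the number of repeats at a locus carry alleles of different lengths, which is what makes the marker informative.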
It is well known that STRs have a high degree of discrimination because they are hypervariable markers, which makes them useful for linking the perpetrator to the crime scene or to the victim. Artifacts are a common challenge in forensic cases; biological ones (stutter products, incomplete adenylation, etc.) and instrumental ones (voltage spikes, dye blobs, etc.) must often be sorted through in order to generate a complete and accurate STR profile [29, 41, 42].
Biological evidence showing fragmented DNA is commonly found in SA cases and can be recovered more effectively when the PCR products are smaller. By moving the PCR primers closer to the STR region, the product sizes can be reduced while retaining the same information [43, 44, 45]. In practice, the success rates in recovering information from compromised DNA samples improve with mini STR systems compared with conventional STR kits.
Sex-chromosomal STRs indicate the biological lineage of a person and therefore have a low power of exclusion between relatives. Y-STR markers can play a role when mixed profiles of opposite sexes are involved, in cases where differential extraction is not possible, in an azoospermic male, or in aged sexual stains [46, 47]. The X-STR markers have a wide range of forensic applications and can be used for establishing the relationship between distant relatives, such as aunt, niece, and cousins [48, 49].
Furthermore, theoretical and first empirical evidence has shown that a set of 13 rapidly mutating Y-STRs (RM Y-STRs) achieves male relative differentiation about an order of magnitude higher than conventionally used Y-STRs. This near-complete male individualization will be of great benefit to forensic applications (e.g., by reducing the inclusion of innocent individuals in sexual assault investigations due to adventitious haplotype matches) [50, 51].
5.2.2 Single nucleotide polymorphisms (SNPs) analysis
A single nucleotide polymorphism (SNP) is a single-base sequence variation between individuals at a particular position. SNPs occur at millions of sites in the human genome, which means they can differentiate individuals from one another. SNPs allow information to be recovered from degraded DNA samples and show no stochastic phenomena, sample processing and data analysis can be more automated because size-based separation is not needed, and, with a careful selection of markers, SNPs can predict ethnic origin and certain physical traits [6, 52].
One of the biggest challenges of using SNPs in forensic DNA typing is the difficulty of simultaneously amplifying enough SNPs in robust PCR multiplexes from small amounts of DNA. Because a single biallelic SNP yields less information than a multi-allelic STR marker, a larger number of SNPs must be analyzed to obtain a reasonable power of discrimination and define a unique profile. High-density SNP arrays now allow hundreds of thousands or even millions of SNPs to be analyzed in parallel.
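The need for many SNPs can be illustrated with a rough calculation. The sketch below computes the probability that two unrelated people share a genotype at one biallelic SNP and then estimates how many such SNPs are needed to reach a target random match probability. It assumes Hardy-Weinberg proportions, independent loci, and identical allele frequencies at every SNP; the target values are arbitrary illustrations, not recommended thresholds.

```python
import math


def snp_match_probability(p: float) -> float:
    """Probability that two unrelated individuals share a genotype at a biallelic SNP.

    p is the frequency of one allele; genotype frequencies follow
    Hardy-Weinberg proportions (p^2, 2pq, q^2).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return sum(g * g for g in (p * p, 2 * p * q, q * q))


def snps_needed(target_match_probability: float, p: float = 0.5) -> int:
    """Number of independent SNPs (all with allele frequency p) needed to
    push the combined match probability down to the target."""
    per_locus = snp_match_probability(p)
    return math.ceil(math.log(target_match_probability) / math.log(per_locus))


if __name__ == "__main__":
    print("Per-SNP match probability at p = 0.5:", snp_match_probability(0.5))
    for target in (1e-6, 1e-15):  # arbitrary illustrative targets
        print(f"target {target:g}: {snps_needed(target)} SNPs at p = 0.5, "
              f"{snps_needed(target, p=0.2)} SNPs at p = 0.2")
```

The most informative biallelic SNP (allele frequency 0.5) still leaves a per-locus match probability of 0.375, which is why SNP panels need dozens of markers, and more when allele frequencies are skewed.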
The basic principles of the SNP array are the convergence of DNA hybridization, fluorescence microscopy, and solid-surface DNA capture. The three mandatory components of SNP arrays are an array containing immobilized allele-specific oligonucleotide (ASO) probes; fragmented nucleic acid sequences of the target, labeled with fluorescent dyes; and a detection system that records and interprets the hybridization signal. However, these arrays typically require hundreds of nanograms of DNA, which are usually not available from forensic casework samples arising from minute biological stains, and for this reason they are more often used in ancestry studies [6, 29, 53, 54].
Another form of biallelic (or di-allelic) polymorphism is the insertion-deletion of nucleotides (INDEL), in which a DNA segment is either present or absent. Most INDELs exhibit allele length differences of only a few nucleotides. PCR amplicons have been designed to be shorter than 160 bp, with which a complete profile could be obtained down to approximately 300 pg of DNA template [29, 55]. However, not all INDELs are highly informative in all populations, and the exact number of INDELs in the human genome remains unknown.
Both SNPs and INDELs can now be typed using multiplexes based on fragment length analysis on instruments available in all routine forensic laboratories, thus making it possible to extend the range of markers beyond the currently used STRs. In recent years, haplotype systems based on multiple SNPs have been tested as markers for forensic use because their discriminating power approaches that of STRs, which makes them a powerful alternative. Microhaplotypes (MHs) contain two or more SNPs within a span of less than 200 nucleotides (creating a multi-allelic locus), with extremely low recombination rates and a discriminating power similar to that of STRs; they are useful in cases with fragmented DNA and in the analysis of mixed samples [57, 58, 59].
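How closely spaced SNPs combine into a multi-allelic locus can be shown with a toy example. The sketch below reads the bases at two SNP positions from each sequencing read and counts the resulting haplotype strings. It assumes that the reads are already aligned to the same start position and that each read spans the whole locus; the reads, the offsets, and the function name are invented for illustration.

```python
from collections import Counter
from typing import Iterable, Sequence


def microhaplotype_alleles(reads: Iterable[str], snp_offsets: Sequence[int]) -> Counter:
    """Count the haplotypes formed by the bases at snp_offsets in each read.

    Reads too short to cover every SNP position are skipped, because the
    phase between the SNPs cannot be observed in them.
    """
    counts = Counter()
    last = max(snp_offsets)
    for read in reads:
        if len(read) > last:
            counts["".join(read[i] for i in snp_offsets)] += 1
    return counts


if __name__ == "__main__":
    # Toy 12-base reads with SNPs at offsets 2 and 9 (hypothetical locus).
    reads = ["ACGTTGCAAGTC", "ACATTGCAATTC", "ACGTTGCAATTC", "ACGTTGCAAGTC", "ACGTT"]
    for haplotype, n in microhaplotype_alleles(reads, (2, 9)).most_common():
        print(haplotype, n)
```

Two biallelic SNPs can form up to four haplotype alleles, and the toy data already show three, which is the sense in which a microhaplotype behaves as a multi-allelic locus. Observing more than two alleles in one sample is also a simple indicator of a mixture.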
5.2.3 Mitochondrial DNA analysis
Mitochondrial DNA (mtDNA) analysis is commonly performed using the Sanger sequencing chemistry [60, 61, 62, 65]. This DNA sequencing is performed in both the forward and reverse directions so that the complementary strands can be compared to one another for quality control purposes. The focus of most forensic DNA studies to date has involved two hypervariable regions within the control region commonly referred to as HVI (HV1) and HVII (HV2). Occasionally a third portion of the control region, known as HV3, is examined to provide more information regarding a tested sample [29, 63, 64].
Human mitochondrial DNA is considered to be inherited strictly from the mother and is commonly used in maternal lineage analysis. The middle piece of sperm cells contains mtDNA, and this DNA is more resistant than autosomal DNA because the double membrane of the mitochondrion and the circular structure of its small genome (without open ends) protect it against degradation processes [35, 66].
The forensic applications of mtDNA include the analysis of samples that are degraded or contain low amounts of DNA (e.g., stains, hairs, bones); it was used for the identification of Tsar Nicholas II and his brother Georgij Romanov. In a recent approach, it has been demonstrated that mtDNA can be used to identify sperm cells from the vaginal tract isolated by a micromanipulation technique [68, 69]. Besides the physical separation, sequence-specific primers (SSP) for the man were used to ensure that the woman’s mtDNA would not be co-amplified. The primer design was based on the mtDNA haplotype differences between contributors determined after mtDNA analysis of buccal swabs. This procedure allows the characterization of the male mitotype from a single sperm cell present in a vaginal swab.
5.2.4 Next-generation sequencing for forensics
There are several next-generation sequencing (NGS) platforms that use different sequencing technologies. All of them sequence millions of small DNA fragments in parallel and use bioinformatic analyses to piece these fragments together by mapping the individual reads to the human reference genome, which delivers accurate data and insight into unexpected DNA variations [70, 71, 72].
The method is based on a DNA polymerase that catalyzes the incorporation of fluorescently labeled deoxyribonucleotide triphosphates into a DNA template strand during sequential cycles of DNA synthesis. During each cycle, at the point of incorporation, the nucleotides are identified by fluorophore excitation. The critical difference is that, instead of sequencing a single DNA fragment, NGS extends this process across millions of fragments in a massively parallel fashion [73, 74].
NGS analysis reveals differences in the order of nucleotides between alleles of the same size. It also allows multiple polymorphisms to be analyzed simultaneously in a single workflow (autosomal STRs, Y-STRs, X-STRs, identity SNPs, phenotypic SNPs, and biogeographical ancestry SNPs). NGS reveals substantial sequence variation in addition to repeat length, thereby increasing the discriminatory power of STRs compared to conventional fragment analysis; it also allows the analysis of large panels of SNPs when severely degraded DNA is involved.
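The gain from sequencing same-size alleles can be shown with a minimal comparison. In the sketch below, two toy repeat regions have the same length, so a length-based call gives both the same allele designation, while the sequences themselves differ by one base inside the repeat. The sequences and function names are invented for illustration and assume a tetranucleotide repeat.

```python
def length_based_call(repeat_region: str, motif_length: int = 4) -> str:
    """Allele designation from length alone, as in fragment analysis:
    whole repeats, plus '.n' for n leftover bases (e.g. '9.3')."""
    whole, partial = divmod(len(repeat_region), motif_length)
    return f"{whole}.{partial}" if partial else str(whole)


def compare_alleles(a: str, b: str) -> dict:
    """Compare two repeat regions by length-based call and by full sequence."""
    return {
        "same_length_call": length_based_call(a) == length_based_call(b),
        "same_sequence": a == b,
    }


if __name__ == "__main__":
    allele_a = "TCTA" * 10                      # uninterrupted repeat
    allele_b = "TCTA" * 5 + "TCTG" + "TCTA" * 4  # same length, one variant repeat unit
    print("Length-based calls:", length_based_call(allele_a), length_based_call(allele_b))
    print(compare_alleles(allele_a, allele_b))
```

Fragment analysis would report both toy alleles as "10", whereas sequencing separates them, which is the source of the additional discriminatory power.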
On the other hand, the information obtained from multiple analyses in NGS is not needed in all forensic cases and can take up large portions of the sequencing capacity which will eventually result in fewer samples per sequencing run and a higher cost of the investigation. In our experience, a reliable quality control platform for the sizing and quantification of the libraries is necessary.
NGS can also be used for the detection and identification of microorganisms found in biological evidence (on victims or perpetrators) and the sexually transmitted infections (STIs) with the aim of tracing the source of microbes besides estimating the postmortem interval (PMI) related to changes in microbial community profiles or “microbial clocks” [75, 76]. NGS has the advantage of high throughput and multiplexing capability and accuracy, which makes it suitable for rapid whole-genome typing of polymorphisms detected by analyzing every base of the genome, thus giving forensic data higher resolution and greater accuracy.
Edaphic, necrobiomic microorganisms at the cadaver-soil interface construct multi-species communities that change when the host body dies and begins to decompose. Characterization of these dynamic changes has been made possible by metagenomic technologies [71, 72, 75]. It is expected that a high-quality forensic microbial database will soon become a reality and aid in the fast and accurate identification of criminals and biological terrorists.
Nonhuman species identification is also an important component of forensic practice; the species range from domestic animals (common in urban areas) to insects that were present at the crime scene [29, 77]. Entomological evidence is used to estimate the PMI, essentially on the basis of morphological recognition of the insect and an estimation of its life cycle stage; however, molecular genotyping methods can also provide important support for forensic entomological investigations when the identification of species or of human genetic material in the insects’ digestive tract is required [71, 72, 75, 76].
Epigenetic approaches based on NGS technology include whole-genome bisulfite sequencing and methylated DNA sequencing. Interestingly, extremely low amounts of starting DNA (100 pg) have been successfully analyzed through genome-wide amplification of a bisulfite-modified DNA template, followed by quantitative methylation detection using pyrosequencing. Another encouraging study performed bisulfite genomic DNA sequencing with micro-volume blood spot samples. These approaches can also be used to predict tissue type and associations with diseases and to determine the sex and age of a DNA donor [71, 72, 75].
Furthermore, distinguishing monozygotic twins has been a limitation in forensic genetics, since they exhibit identical STR profiles. The high read depth that NGS can reach for a single sequence makes it possible to detect variations in DNA methylation and mitochondrial SNPs, giving us a way to distinguish them [78, 79].
MicroRNAs (miRNAs) have only recently been introduced to forensic science; they are a class of endogenous small RNA molecules 18–24 nucleotides in length that play an essential regulatory role in many cellular processes. Their small size, resistance to degradation, and tissue-specific or highly tissue-divergent expression make them suitable for forensic body fluid identification, making it possible to conclusively link a DNA profile to a particular body fluid, as well as for species identification, detection of different disease states, and PMI estimation.
NGS technology will broaden the range of forensic applications that contribute to the resolution of criminal cases. The standardization of procedures among laboratories will lead to acceptance in court, as well as to a better understanding of the uses and limitations of these methods (see Figure 3).
5.3 Ancestry and phenotypic expression
Humans are 99.9% identical in their DNA. The difference between any two human genomes is small. Yet, in analyzing these small differences, we can begin to understand what makes us unique. The variation between human genomes is not randomly distributed across the globe. Humans are more likely to have descendants with people who live nearby; the closer two individuals or populations are geographically, the more genetically similar they tend to be. If we were to gather DNA from across the globe, we could connect certain genetic signatures to geographic regions. Population-specific alleles have been found in both STR and SNP markers. The genetic patterns of human population variation arose from a series of sequential migrations and bottleneck events [80, 81].
5.3.1 Analysis of genetic markers of ancestry
SNPs are more likely than STRs to become “fixed” in a population because of their lower mutation rate. SNPs change on the order of once every hundred million (10^8) generations, while STR mutation rates are approximately one in a thousand. Ancestry informative markers (AIMs) possess alleles with large frequency differences between populations that can help distinguish them. A small proportion of SNP variants have emerged as particularly informative for ancestry, which is inferred by comparing a sample’s genetic diversity with the patterns of variation in contemporary populations. When selecting suitable ancestry informative markers, the degree of divergence between populations and the number of populations that a test seeks to differentiate both have a bearing on the selection process [80, 82].
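The idea that AIMs are markers with large allele frequency differences between populations can be sketched as a simple ranking. This is an illustrative sketch only: the SNP names, population labels, and frequencies are hypothetical, and the maximum pairwise frequency difference is used here as one simple informativeness measure among several that are used in practice.

```python
def allele_frequency_delta(freqs_by_population):
    """Largest difference in allele frequency between any two populations."""
    values = list(freqs_by_population.values())
    return max(values) - min(values)


# Hypothetical frequencies of one allele per SNP in three reference populations.
candidate_snps = {
    "snp_A": {"pop1": 0.95, "pop2": 0.10, "pop3": 0.50},
    "snp_B": {"pop1": 0.40, "pop2": 0.45, "pop3": 0.42},
    "snp_C": {"pop1": 0.20, "pop2": 0.80, "pop3": 0.75},
}

# Rank candidates from most to least divergent between populations.
ranked = sorted(
    candidate_snps,
    key=lambda snp: allele_frequency_delta(candidate_snps[snp]),
    reverse=True,
)

print(ranked)
```

A marker such as the hypothetical `snp_B`, with nearly equal frequencies everywhere, would contribute little to ancestry inference however many populations are compared.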
Ancestry inference offers many other applications, including aiding cold case reviews with additional data on linked profiles; achieving more complete identifications of missing persons or disaster victims; assessing atypical combinations of physical characteristics in individuals with admixed parentage; and enhancing genetic studies where forensic sensitivity is necessary, e.g., testing medical archive material or archaeological DNA.
AIMs, however, are not 100% accurate for predicting the ancestral background of samples; for example, individuals with mixed ancestral backgrounds may not possess the expected phenotypic characteristics. Thus, results from genetic tests attempting to predict ethnic origin or ancestry should always be interpreted with caution and only in the context of other reliable evidence. In countries like the United States, where movement of the population is more fluid, greater levels of admixture are expected, and thus genetic testing results are less likely to correlate strongly with geographic location. The possibility of admixed ancestry therefore calls for caution in the use of any statistic with any panel of AIMs. Admixed ancestry cannot be estimated accurately unless the ancestral populations are represented among the reference populations.
AIMs have limitations: the optimal SNPs may differ between groups of samples, and some panels are based on very large numbers of SNPs, which limits the ability of others to test different populations. AIM analysis in forensic genetic investigations of crime scenes can be performed on very small amounts of DNA, less than 1 ng. The strategy for interpreting the results of AIM investigations can be explorative: the likelihoods of the AIM profile in various populations may be calculated, and the population with the highest likelihood may be considered the population of origin. When two populations are identified a priori, the likelihood ratio of the two populations is calculated.
The fact that the likelihood for one population is greater than that for another does not prove that either of the two populations is relevant to the AIM profile, because even though the populations may be mutually exclusive, they are not exhaustive, in the sense that they do not cover all possible human populations [84, 85].
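The likelihood and likelihood ratio calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes follow Hardy-Weinberg proportions, SNPs are treated as independent, all allele frequencies lie strictly between 0 and 1, and the SNP names, population labels, and frequencies are hypothetical.

```python
import math


def genotype_probability(allele_count, p):
    """Probability of carrying 0, 1, or 2 copies of an allele with
    population frequency p, assuming Hardy-Weinberg proportions."""
    q = 1.0 - p
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}[allele_count]


def profile_log_likelihood(profile, allele_freqs):
    """Log-likelihood of an AIM profile in one population, treating the
    SNPs as independent (an assumption; linked markers would violate it)."""
    return sum(
        math.log(genotype_probability(count, allele_freqs[snp]))
        for snp, count in profile.items()
    )


# Hypothetical evidence profile: copies of the tracked allele at each SNP.
profile = {"snp_A": 2, "snp_B": 1, "snp_C": 0}

# Hypothetical reference allele frequencies in two populations.
reference = {
    "pop1": {"snp_A": 0.9, "snp_B": 0.5, "snp_C": 0.2},
    "pop2": {"snp_A": 0.2, "snp_B": 0.5, "snp_C": 0.8},
}

log_lik = {pop: profile_log_likelihood(profile, f) for pop, f in reference.items()}
most_likely = max(log_lik, key=log_lik.get)
likelihood_ratio = math.exp(log_lik["pop1"] - log_lik["pop2"])

print(most_likely, round(likelihood_ratio, 1))
```

A large ratio only says that the first population explains the profile better than the second; it says nothing about populations that are absent from the reference set.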
Because of continuous migrations, AIM alleles are shared across all human groups; it is not the absolute presence or absence of an allele but rather its frequency in the population that is usually analyzed when inferring ancestry. The recombination of autosomal markers can provide additional information about the admixed nature of an individual. Y-chromosome markers and mitochondrial DNA (mtDNA) sequence variation have benefits and limitations for ancestry inference that relate to their paternal and maternal inheritance, respectively. INDELs may also be valuable AIMs, but the number of markers and their informative value are lower than those of SNPs.
5.3.2 Analysis of genetic markers of phenotypic expression
Forensic phenotyping can provide useful intelligence regarding the ancestry and externally visible characteristics (EVCs) of the donor of an evidentiary sample. Currently, the inference of EVCs is based on SNPs. This may substitute for or support eyewitness testimony when descriptions are unavailable or uncertain, in cases in which DNA from the perpetrator is available but no suspect has been identified [80, 86].
Predicting EVC phenotypes from DNA genotypes has the final aim of focusing police investigations on finding completely unknown persons, for whom there are no database matches or only low quality/quantity DNA is available, and then requesting standard forensic STR profiling only for the reduced number of suspects who match the predicted EVCs, with the aim of DNA individualization for courtroom use [86, 87].
The ability to predict the physical appearance of an individual directly from crime scene material can in principle help police investigations by narrowing down a large number of potential suspects in cases involving unknown perpetrators: cases where STR profiling could not provide a hit in the DNA profile database or a match with a suspect singled out by the authorities, or where an STR profile simply could not be generated because of the low quality and/or quantity of the DNA available.
In the case of an unidentified body found in an advanced state of decomposition with no visible physical characteristics, EVCs are expected to provide leads for human identification. However, work is still being done to identify predictive DNA markers for several other EVCs such as skin color, hair color, body height, male baldness, and hair morphology [84, 87, 88, 89].
Numerous global studies describe correlations between the geographical distribution of populations and variations in the allele frequencies linked to several human phenotypes, including skin, hair, and iris pigmentation, biological metabolism, biological modification variants, disease susceptibility, and morphology, because these variations are expected to display great population diversity. Investigators and juries may have trouble understanding the probabilities attached to ancestry or phenotype predictions based on DNA results. Telling a detective that the donor of a biological sample at a crime scene has an 80% chance of having blue eyes is new territory when he or she typically regards a DNA result as irrefutable evidence. If ancestry prediction and forensic phenotyping are pursued, then the expectations of those using the information will need to be managed [89, 90]. Figure 4 shows next-generation sequencing applications and their usefulness in human identification.
6. Genetic DNA database
Forensic genetics has become a key tool in many criminal and civil proceedings because of its ability to confirm or eliminate a suspect. In the criminal field, it makes it possible to analyze criminal strategies and identify perpetrators, improving judicial and police management [5, 6, 12, 29, 91]. DNA databases aim to resolve criminal cases by allowing the automated comparison of DNA profiles from crime scenes, from suspects or convicts, and sometimes from victims. The usefulness of this type of database is indisputable in all the countries in which it exists [6, 14, 29, 92].
Currently, the genetic database CODIS (Combined DNA Index System), developed by the US FBI, stores DNA profiles from crime scenes and convicted offenders and allows them to be exchanged and compared electronically. CODIS can be searched to determine whether a DNA profile obtained from biological evidence in a crime matches the DNA of a known offender or DNA from evidence in another crime.
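The kind of automated comparison that a DNA database performs can be sketched in a few lines. This is a simplified illustration and not the matching algorithm of CODIS or any real system: the record identifiers and genotypes are hypothetical, only exact single-source genotype matches are considered, and the minimum number of shared loci is an arbitrary assumption.

```python
def same_genotype(alleles_a, alleles_b):
    """Two genotypes match if they contain the same alleles in any order."""
    return sorted(alleles_a) == sorted(alleles_b)


def is_candidate_match(evidence, reference, min_shared_loci=3):
    """Exact match at every locus typed in both profiles; profiles sharing
    too few loci are not compared (the threshold is an assumed value)."""
    shared = set(evidence) & set(reference)
    if len(shared) < min_shared_loci:
        return False
    return all(same_genotype(evidence[locus], reference[locus]) for locus in shared)


def search_database(evidence, database):
    """Return the identifiers of all stored profiles that match the evidence."""
    return sorted(
        record_id
        for record_id, stored in database.items()
        if is_candidate_match(evidence, stored)
    )


# Hypothetical STR genotypes (repeat numbers) at four commonly used loci.
evidence_profile = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (21, 24), "TH01": (6, 9.3)}

database = {
    "offender_001": {"D3S1358": (15, 16), "vWA": (16, 18), "FGA": (21, 24), "TH01": (6, 9.3)},
    "offender_002": {"D3S1358": (17, 15), "vWA": (18, 16), "FGA": (24, 21), "TH01": (9.3, 6)},
    "scene_0457": {"D3S1358": (15, 17), "vWA": (16, 18)},
}

hits = search_database(evidence_profile, database)
print(hits)
```

Real searches must also handle partial profiles, mixtures, and statistical weighting of a hit, none of which is covered by this sketch.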
Legislation varies from country to country on certain points that affect these issues. Another important point is to determine which laboratories can generate the DNA profiles that are included in the database. It is likely that in the near future, developed countries will establish collaboration agreements for the exchange of genetic data, which could be a fundamental tool in the fight against some crimes. It is important that public agencies know the scope of these databases and establish collaboration agreements for the exchange and collation of information for criminal investigation purposes.
Sexual assault is a complex crime that involves medical and psychological attention for the victim and generates a high financial cost owing to the forensic investigation. During the investigation, the identification, collection, and packing of biological fluids at the crime scene and the analysis of evidence in laboratories are fundamental, since errors during this stage would affect the rest of the investigation. The use of intervention protocols at the crime scene decreases the possibility of losing data that could clarify the crime, and the protocol must also be complemented with interviews of witnesses and/or victims in order to decide whether to broaden the area of the evidence search. The standardization and quality control of procedures guarantee that all personnel manage a crime scene in the same way.
For the correct and successful investigation of sexual crimes, it is necessary to recover evidence in three principal areas: crime scene, victims, and perpetrator. Evidence recovery must be completed during the first hours after the crime; this is crucial for the success of the investigation, although it does not always happen for some investigation units [8, 11, 13, 14].
The analysis of evidence in the laboratory continues with the macroscopic examination of biological stains. The methods used by crime laboratories are presumptive screening tests, some of which are followed by confirmatory tests that conclusively identify the presence of a body fluid. A disadvantage of most of these current methods is that they are designed to detect a specific body fluid (Figure 3); the investigator needs to decide which test to perform based on the fluid that is most likely present. It is necessary to develop a universal confirmatory test that can be applied to an unknown stain and that is able to identify any of the body fluids. However, in 2016 the Scientific Working Group on DNA Analysis Methods (SWGDAM) recommended a testing approach targeted at sexual assault evidence: direct to DNA. The serology tests employed by laboratories are less sensitive than modern DNA typing kits; therefore, DNA typing only the swabs that screen positive in the serology test risks missing eligible profiles [42, 71, 73].
Microscopic identification of sperm cells continues to be used in some forensic laboratories, but its usefulness remains controversial: when the evidence is minimal, the technique can consume that evidence, and the lack of contrast makes sperm cell identification difficult. Fluorescent contrast techniques (FISH and immunolabeling) and LMD solve the problem of microscopic identification by making it possible to separate cell mixtures from more than one contributor and to produce autosomal genetic profiles free from DNA contamination [18, 25, 26, 27].
DNA extraction methods are increasingly effective in the recovery of trace evidence but are still ineffective in the analysis of mixtures (separation of contributors), which is a common scenario in sexual assault. The technique used to isolate sperm cells from epithelial cells is differential extraction, but since it is not always possible to separate the two cell types, it is necessary to implement other techniques.
Autosomal STR analysis using the PCR technique is widely used for human identification; however, DNA mixtures are frequent in sex crimes, and the scope of this analysis is limited in such cases. The application of next-generation sequencing (NGS) to mixed DNA helps to solve the problem, since sequencing reveals the base composition of the repeat units that make up each allele. Thus, even if two or three contributors to a mixture have alleles of the same length, NGS can tell them apart when the underlying sequences differ. NGS is also relevant for compromised and degraded samples in sexual crimes [72, 76].
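The advantage of sequence-based over length-based STR typing in mixtures can be illustrated with a toy example. This is a minimal sketch: the locus, repeat motifs, and read counts are hypothetical, and real pipelines also have to deal with flanking regions, stutter, and sequencing errors.

```python
from collections import Counter

MOTIF_LENGTH = 4  # tetranucleotide repeat, assumed for this hypothetical locus

# Two alleles of identical length (10 repeats) that differ by one base,
# plus a third allele that is one repeat longer.
allele_10_plain = "TCTA" * 10
allele_10_variant = "TCTA" * 4 + "TCTG" + "TCTA" * 5
allele_11 = "TCTA" * 11

# Hypothetical reads from a mixed sample at this locus.
reads = [allele_10_plain] * 48 + [allele_10_variant] * 52 + [allele_11] * 30

# Length-based view: alleles are distinguished only by repeat number, as in
# fragment sizing.
alleles_by_length = Counter(len(read) // MOTIF_LENGTH for read in reads)

# Sequence-based view: alleles are distinguished by their full sequence.
alleles_by_sequence = Counter(reads)

print(len(alleles_by_length), len(alleles_by_sequence))
```

In this example the length-based view reports two alleles while the sequence-based view reports three, which is the additional resolution that helps when separating contributors.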
NGS has opened new possibilities in human identification, since it is no longer limited to one type of marker at a time. It allows a large number of individuals to be analyzed with a significant depth of sequencing; an analyst can sequence multiple STRs together with identity, ancestry, and phenotype informative SNPs. However, it is necessary to establish parameters for the admissibility of evidence based on new technologies; using phenotypic information as a search pattern for a suspect, and tracking the suspect with ancestry information, is debatable from an ethical and moral point of view. Much work remains to be done for this area to develop.
In conclusion, solid foundations for sexual assault investigations include the scrutiny, selection, and discrimination of evidence, supported by the knowledge of the forensic investigator. It is the investigators who hold a crucial role in fulfilling the purpose of forensic science, which is to contribute to upholding justice amid threats to humanity’s most fundamental rights, to life and freedom.
The authors would like to express their gratitude to all the forensic scientists of the Criminalistics and Forensic Services Institute for providing technical references and valuable information.
Conflict of interest
We declare that we have no conflict of interest.