Open access peer-reviewed chapter - ONLINE FIRST

Of DNA and Demography

Written By

Emily Klancher Merchant

Submitted: 14 December 2022 Reviewed: 12 February 2023 Published: 14 March 2023

DOI: 10.5772/intechopen.1001293

Population and Development in the 21st Century (IntechOpen)

From the Edited Volume

Population and Development in the 21st Century - Between the Anthropocene and Anthropocentrism [Working Title]

Prof. Parfait M Eloundou-Enyegue

Abstract

Over the past 40 years, the focus of demography has expanded beyond the causes and consequences of population growth (and how to stem it) into the causes and consequences of socioeconomic inequality and health disparities, giving rise to new data sources: large-scale longitudinal cohort studies. More recently, these studies have begun to collect a variety of biomarkers, including DNA and epigenetic measures. This chapter explains the three ways in which demographers have used genomic and epigenetic data (epigenetic dependent variables with socioeconomic independent variables, genomic control variables with biomedical dependent variables, and genomic independent variables with socioeconomic dependent variables) and the key findings from each type of research. It describes the shift from candidate gene studies to genome-wide association studies and explores ongoing challenges with using genome-wide association studies and the polygenic scores they produce in demographic research.

Keywords

  • sociogenomics
  • genome-wide association study
  • polygenic score
  • demography
  • behavior genetics
  • epigenetics

1. Introduction

Demography has changed dramatically over the past 40 years. Beginning in the late 1970s, the field’s emphasis expanded beyond population problems into social demography [1]. Whereas the former focused on the causes and consequences of rapid population growth (primarily in the Global South) and ways to slow growth by reducing fertility, the latter focuses on the causes and consequences of socioeconomic inequality (primarily in the Global North) and ways to promote population health [2, 3]. This new focus has spurred the development of large-scale longitudinal cohort studies, including the Wisconsin Longitudinal Study (WLS), the Panel Study of Income Dynamics, the Health and Retirement Study (HRS), the National Longitudinal Study of Adolescent to Adult Health (Add Health), and the Future of Families and Child Wellbeing Study (formerly the Fragile Families and Child Wellbeing Study) in the United States; a series of nationally representative birth cohort studies in the UK; the English Longitudinal Study of Ageing; and the Dunedin Multidisciplinary Health and Development Study in New Zealand [4–10]. Longitudinal demographic studies began to collect biomarkers from participants around the turn of the twenty-first century, giving rise to biodemography [11]. Today, they also make genomic data available to users, facilitating the development of the new field of sociogenomics [12].

This chapter explains how demographers and other social scientists use genomic data and what they have learned from doing so. However, the use of DNA in demography and other social sciences has also come in for critique, as will be described in greater detail in Sections 3.3 and 4. The final section of the chapter explores some of the challenges demographers still face in using genomic data and explains why sociogenomics has not lived up to the expectations set by early adopters, but also expresses optimism for new approaches to integrating genomic and epigenetic data into demography.

2. How demographers use genomic data

Demographers typically use genomic data in one (or more) of three ways. First, they use epigenetic measures as dependent variables, examining how social factors contribute to changes or differences in the epigenome. This research uses epigenetics to explore somatic responses to social experiences. Second, demographers use a measure of DNA (the genome itself) as a control variable in an analysis with a biomedical outcome, holding genomic variation constant to better identify the effects of social variables. This research treats the genome as a somatic moderator of social determinants of health and treats social factors as moderators of somatic determinants of health. Third, demographers use a measure of DNA as an independent variable in an analysis with a socioeconomic dependent variable, estimating the effects of genetic variation on various forms of socioeconomic inequality. This research seeks somatic causes of social outcomes. Since the epigenome and the genome are ontologically distinct, I will discuss the first approach (hereafter Type 1 Sociogenomics) separately from the other two (hereafter Type 2 and Type 3 Sociogenomics).

2.1 Epigenetic dependent variables (somatic responses to social experience)

Epigenetic measures are biomarkers that are near the genome but not part of an individual’s genetic sequence. Unlike the DNA itself, epigenetic measures change over the course of a person’s life and are influenced by the physical and social environment. The two primary epigenetic markers demographers have considered so far are methylation and telomere length. In methylation, a methyl group binds to a segment of DNA, turning the relevant genes off so they do not get expressed [13]. Methylation can, therefore, explain differential bodily functioning among people with the same genetic sequences. Telomeres are the protective caps at the ends of chromosomes. They get shorter when DNA replicates, so telomere length is a measure of aging at the cellular level [14].

Type 1 Sociogenomics seeks correlations between socioeconomic independent variables and epigenetic dependent variables. It, therefore, examines potential cellular pathways through which social experiences and social inequality “get under the skin” to cause somatic changes. This approach originates in medical sociology, fundamental tenets of which are that the social world influences our health and that socioeconomic inequalities cause health disparities [15]. Demographers working in Type 1 Sociogenomics have primarily utilized the Future of Families Study (for example, see [16]).

2.2 Genomic independent variables (somatic moderators of social determinants of health, social moderators of somatic determinants of health, and somatic causes of social outcomes)

The genome itself comprises approximately three billion pairs of nucleotides, segmented into 23 pairs of chromosomes. Some of these nucleotides form protein-coding sequences, known as genes. Humans have approximately 20,000 genes, but most of our DNA is not part of any gene. Twenty years after the completion of the Human Genome Project, scientists still do not know how (or if) the majority of our DNA functions. Type 2 and Type 3 Sociogenomics seek genomic causes for biomedical and socioeconomic outcomes to answer the age-old “nature vs. nurture vs. structure” question: to what degree are a variety of biomedical and social outcomes determined by social variables as opposed to our genetic makeup? Over the past 20 years, demographers have moved from a candidate gene approach to a genome-wide approach to discovering associations between DNA and a variety of diseases, behaviors, and social processes.

2.2.1 Candidate genes

When demographers began sampling the DNA of survey respondents in the first decade of the twenty-first century, they looked for known variants of specific genes, typically genes that relate to neurotransmitters. Efforts to link variants of these genes to particular social outcomes were known as “candidate gene” studies. Major foci of the candidate gene approach in demography and other social sciences included the MAOA gene, which codes for monoamine oxidase A; 5-HTTLPR, which is the promoter region of the serotonin transporter gene; DRD2, which codes for the dopamine receptor D2; and DRD4, which codes for the dopamine receptor D4. Information about these genes and four others was incorporated into Wave III of Add Health for the sibling subsample, spurring an outpouring of candidate gene research across the social sciences [17].

Early results seemed promising, especially outside of demography. Political scientists found that MAOA appeared to predict credit card debt, and MAOA and 5-HTTLPR both appeared to predict voter turnout [18, 19]. DRD2 appeared to predict partisanship, and DRD4 appeared to predict political ideology [20, 21]. In the first decade of the twenty-first century, scholars in a variety of fields identified correlations between these genes and an astounding range of behaviors, from sugar consumption to susceptibility to victimization [22]. Yet demographers were more circumspect. They pointed out that most of these associations failed to replicate, and their carefully designed studies on outcomes that should have followed logically from the genes’ known functions turned up negative or inconclusive results (for example, [23, 24]).

By 2012, demographers and other social scientists had concluded that most of the findings from candidate gene studies were false positives [25]. Candidate gene research had tested small samples for associations between a small number of genes and tens of thousands of phenotypes, without correcting for multiple hypothesis testing [22]. Meanwhile, research in medical and animal genetics had begun to suggest that most traits—physiological as well as behavioral—were massively polygenic, influenced by tens if not hundreds of thousands of nucleotides across the genome (not just in genes) [26]. The effect of any individual gene or nucleotide would, therefore, be minuscule, requiring enormous sample sizes for identification. By this time, it was becoming cheaper and easier to genotype individuals at thousands of points along the genome (not just specific genes), and demographic studies that had already collected the DNA of participants—including HRS, Add Health, and WLS—were able to reanalyze stored samples.
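
The multiple-testing problem described above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function names are my own, and the numbers of tests and the 0.05 significance level are assumptions chosen for the example rather than figures from the studies cited.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of 'significant' results if every null hypothesis is true."""
    return num_tests * alpha


def familywise_error_rate(num_tests, alpha):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / num_tests
```

Under these assumptions, 1,000 uncorrected tests at the 0.05 level would be expected to yield about 50 spurious associations even if no true association existed. Applying the Bonferroni adjustment to one million tests gives a per-test threshold of 5e-8, which matches the conventional genome-wide significance threshold.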

2.2.2 Genome-wide association studies and polygenic scores

In the wake of the candidate gene debacle, demographers and other social scientists followed the lead of medical and psychiatric genetics and embarked upon genome-wide association studies (GWAS). GWAS rely not on variants of specific genes, but on single-nucleotide polymorphisms (SNPs), which represent individual nucleotides at loci across the genome where humans are known to differ from one another. At each locus, each person receives one nucleotide from each parent. At most loci, everyone’s genome looks exactly the same, and we each receive the same nucleotide from each of our parents. But at about four or five million loci (~0.1% of the genome), humans differ from one another in substantial proportions. At those loci, an individual may be homozygous, meaning that they receive the same nucleotide from each parent, or heterozygous, meaning that they receive different nucleotides from each parent. For example, at a given locus, it might be that some people have two copies of adenine (AA), but others have two copies of thymine (TT) and others yet have one of each (AT or TA). The nucleotide that is more common in the population is known as the “major allele” and the one that is less common is known as the “minor allele.” Each of these loci is known as a SNP [27]. SNP arrays or “SNP chips” genotype individuals at hundreds of thousands or even millions of SNPs across the genome. Stored DNA can be regenotyped as SNP chips get larger and, therefore, become capable of identifying more SNPs.
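
As a rough illustration of how genotypes at a single SNP are turned into numbers, the Python sketch below identifies the minor allele in a small sample and counts each person's copies of it. The genotype strings and function names are invented for the example, and the sketch assumes a biallelic SNP (exactly two alleles observed); real genotyping pipelines handle missing calls, strand orientation, and reference panels that this toy version ignores.

```python
from collections import Counter


def minor_allele(genotypes):
    """Return the less common of the two alleles observed at one biallelic SNP."""
    counts = Counter("".join(genotypes))
    if len(counts) != 2:
        raise ValueError("expected exactly two distinct alleles at a biallelic SNP")
    # most_common() sorts from most to least frequent, so the last entry is the minor allele.
    return counts.most_common()[-1][0]


def minor_allele_counts(genotypes):
    """Code each genotype as 0, 1, or 2 copies of the minor allele."""
    minor = minor_allele(genotypes)
    return [genotype.count(minor) for genotype in genotypes]


# Invented sample of six people at one locus.
sample = ["AA", "AT", "TA", "AA", "TT", "AA"]
coded = minor_allele_counts(sample)
```

In this invented sample, adenine appears eight times and thymine four times, so thymine is the minor allele; heterozygotes (AT or TA) are coded 1 regardless of the order in which the alleles are written.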

A GWAS uses a series of regression models (one for each measured SNP) to identify correlations between an outcome of interest and each SNP available in the dataset (measured as 0, 1, or 2 depending on the number of minor alleles) [28]. The outcome of interest may be a medical diagnosis (such as schizophrenia or diabetes), a physiological trait (such as height or body mass index [BMI]), a behavior (such as smoking), or a socioeconomic outcome (such as educational attainment). GWAS summary statistics provide a formula for calculating an individual’s polygenic score (PGS, also known as a “polygenic risk score” or “polygenic index”) for the outcome in question. The PGS multiplies the regression coefficients for each SNP by the individual’s value for each SNP (0, 1, or 2) and sums across all SNPs. It is measured in standard deviations from the mean (0).
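
The two steps just described (one regression per SNP, then a weighted sum of minor-allele counts) can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the procedure used by any particular consortium: the data are invented, the function names are my own, and the per-SNP regressions omit the covariates that real GWAS typically adjust for. The final step standardizes the scores within the target sample, which is what allows a PGS to be read in standard deviations from a mean of 0.

```python
from statistics import mean, pstdev


def ols_slope(x, y):
    """Slope from a one-predictor least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def gwas_weights(dosages, outcome):
    """One regression per SNP; dosages[i][j] is person i's minor-allele count at SNP j."""
    num_snps = len(dosages[0])
    return [ols_slope([row[j] for row in dosages], outcome) for j in range(num_snps)]


def polygenic_scores(dosages, weights):
    """Sum of (weight x minor-allele count) across SNPs, one score per person."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 within the sample."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Invented toy data: four people, two SNPs, one quantitative outcome.
discovery_dosages = [[0, 2], [1, 1], [2, 0], [1, 2]]
discovery_outcome = [1.0, 2.0, 3.0, 2.5]
weights = gwas_weights(discovery_dosages, discovery_outcome)

# Scores are computed for people who were NOT part of the discovery sample.
target_dosages = [[0, 0], [1, 1], [2, 2]]
target_scores = standardize(polygenic_scores(target_dosages, weights))
```

Keeping the discovery and target samples separate in the sketch mirrors the requirement that scores not be calculated for people who contributed to the GWAS itself.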

GWAS require enormous samples and PGS cannot be calculated for individuals included in the GWAS. This means that a single demographic study, such as HRS or Add Health, cannot run its own GWAS to calculate PGS for its participants. Instead, these studies and others participate in the Social Science Genetic Association Consortium (SSGAC) and other consortia, which organize GWAS across demographic and many other data sources—such as the UK Biobank and 23andMe—to produce summary statistics that each study can use to calculate PGS for its participants. Several longitudinal demographic studies currently include a vast array of PGS for disease states (e.g., coronary artery disease, myocardial infarction, diabetes, Alzheimer’s disease, schizophrenia), biomarkers (e.g., cholesterol and triglycerides), physical characteristics (e.g., height, body mass index, waist circumference, waist-to-hip ratio, and ages at menarche and menopause), mental characteristics (e.g., cognitive function, intelligence, worry, and positive affect), behaviors (e.g., smoking, drinking, children born, age at first birth, and religious attendance), and socioeconomic outcomes (e.g., educational attainment) for each participant. Each PGS is a single variable that demographers and other social scientists can include in standard quantitative models [29].

3. What demographers have learned from genomic data

In general, demographic research with genomic data has validated rather than challenged what demographers previously understood with regard to population health and the causes and consequences of socioeconomic inequality. Even when genetic variation is taken into account, social inequality is self-perpetuating and causes health disparities. This section reviews some of the major findings of each type of sociogenomics.

3.1 Epigenetic dependent variables and socioeconomic independent variables (Type 1 Sociogenomics: somatic responses to social experience)

Demographic research using epigenetic dependent variables and socioeconomic independent variables has demonstrated that a variety of adverse social circumstances correlate with adverse epigenetic outcomes. For example, mothers and boys living in disadvantaged environments (in the Future of Families sample) have shorter telomeres (indicating more rapid cellular aging) than mothers and boys living in nondisadvantaged environments [30, 31]. Boys in the Future of Families sample who have lost their fathers (through divorce, incarceration, or death) have shorter telomeres than boys who have not [32]. Children who have experienced family violence and disruption have shorter telomeres than children who have not [33]. Children who experience depression have different methylation patterns than those who do not, indicating differential functioning of the same genes [34]. Adults in the 1958 British Birth Cohort Study who had low socioeconomic status as children have different methylation patterns than those who had high socioeconomic status as children [35]. More recent research has indicated other epigenetic mechanisms linking low childhood socioeconomic status to poor adult health [36]. To further explore these mechanisms, Add Health has begun to add blood-based transcriptional profiles for a nationally representative subsample of the original cohort [37]. Demographers have long known that adverse social circumstances produce adverse health consequences. These studies begin to suggest some of the cellular pathways that may mediate between social causes and somatic consequences.

3.2 Genomic control variables and biomedical dependent variables (Type 2 Sociogenomics: somatic moderators of social determinants of health and social moderators of somatic determinants of health)

Type 2 Sociogenomics, like Type 1 Sociogenomics, originates in medical sociology and aims to identify social determinants of health and disease. In this type of research, outcomes are typically disease states or adverse biomarkers or other metrics, and analysts typically use PGS for the outcome in question (or a related outcome) to control for individual genomic propensities for that outcome.

Research in Type 2 Sociogenomics has largely found that previously identified social determinants of health remain salient when PGS for disease states or biomarkers are included in quantitative models. For example, stressful life events still predict depressive symptoms in older adults (in the HRS sample), even when individual PGS for depression are taken into account [38]. Similarly, the high school environment continues to influence later-life cognitive function (in the WLS sample), even when controlling for the PGS for cognitive function [39]. Controlling for the PGS for BMI does not change the inverse correlation between educational attainment and BMI in the Add Health sample [40]. In the same study, depression and self-rated health remained correlated with educational attainment when their PGS were taken into account, but the magnitude of the correlation was attenuated, indicating that some (but not all) of the correlation was driven by genetic factors.
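
The logic of adding a PGS as a control variable can be sketched with a two-predictor least-squares regression. The Python example below is a minimal illustration under stated assumptions: the variable names and all of the numbers are invented, the outcome is constructed so that the true coefficients are known (1.0 for the social variable and 0.5 for the PGS), and the closed-form solution applies only to the case of exactly two predictors.

```python
from statistics import mean


def _centered(values):
    m = mean(values)
    return [v - m for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slope_unadjusted(social, outcome):
    """Coefficient on the social variable when the PGS is left out of the model."""
    s, y = _centered(social), _centered(outcome)
    return _dot(s, y) / _dot(s, s)


def slopes_adjusted(social, pgs, outcome):
    """Coefficients on (social, pgs) from a two-predictor least-squares fit."""
    s, g, y = _centered(social), _centered(pgs), _centered(outcome)
    ss, gg, sg = _dot(s, s), _dot(g, g), _dot(s, g)
    det = ss * gg - sg * sg
    b_social = (gg * _dot(s, y) - sg * _dot(g, y)) / det
    b_pgs = (ss * _dot(g, y) - sg * _dot(s, y)) / det
    return b_social, b_pgs


# Invented data in which the social variable and the PGS are positively correlated.
social = [0, 1, 2, 3, 4, 5]
pgs = [0, 1, 0, 2, 1, 3]
outcome = [1.0 * s + 0.5 * g for s, g in zip(social, pgs)]

b_unadjusted = slope_unadjusted(social, outcome)
b_social, b_pgs = slopes_adjusted(social, pgs, outcome)
```

In this constructed case the coefficient on the social variable shrinks once the PGS is included but does not disappear, which is the pattern of attenuation without elimination described above for depression and self-rated health.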

Research in this area has also turned up interesting interactions between the genome and the social world. For example, it has long been known that higher socioeconomic status correlates with lower BMI, but recent research in Type 2 Sociogenomics suggests that higher socioeconomic status actually reduces the genetic influence on BMI [41], as does higher educational attainment [42]. Conversely, perceiving one’s neighborhood as disorderly exacerbates the genetic risk of type 2 diabetes for older adults in the HRS sample [43]. For younger adults (in the Add Health sample), having a higher PGS for type 2 diabetes increases disease risk only when living in high-crime neighborhoods because living in a high-crime neighborhood increases the risk of obesity [44].
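
One simple way to see what such an interaction means is to estimate the association between a PGS and an outcome separately within each social environment. The Python sketch below does this for two invented groups; the group labels, scores, and BMI values are hypothetical and chosen only so that the slopes differ. Published analyses typically fit a single model with an interaction term rather than separate regressions, so this should be read as an intuition aid, not a replication of any cited study.

```python
from statistics import mean


def slope(x, y):
    """Slope from a one-predictor least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def pgs_slope_by_group(records):
    """records holds (group_label, pgs, outcome) tuples; returns one slope per group."""
    grouped = {}
    for label, pgs, outcome in records:
        xs, ys = grouped.setdefault(label, ([], []))
        xs.append(pgs)
        ys.append(outcome)
    return {label: slope(xs, ys) for label, (xs, ys) in grouped.items()}


# Invented records: (socioeconomic group, standardized PGS for BMI, observed BMI).
records = [
    ("lower_ses", -1.0, 27.0), ("lower_ses", 0.0, 28.0), ("lower_ses", 1.0, 29.0),
    ("higher_ses", -1.0, 25.6), ("higher_ses", 0.0, 26.0), ("higher_ses", 1.0, 26.4),
]
slopes = pgs_slope_by_group(records)
```

A smaller slope in the higher-status group corresponds to the claim that the social environment dampens the association between the PGS and the outcome.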

Type 1 and Type 2 Sociogenomics have made valuable contributions to medical sociology [15]. Further research in these areas, particularly in Type 1 Sociogenomics using new epigenetic markers, promises to better elucidate how specific social determinants of health and disease operate and, in particular, how they influence the workings of the human genome. However, several caveats attach to any research using PGS (Type 2 Sociogenomics). These will be discussed further in Section 4.

3.3 Genomic independent variables and socioeconomic dependent variables (Type 3 Sociogenomics: somatic causes of social outcomes)

Type 3 Sociogenomics also uses PGS as independent variables, but the dependent variables it seeks to explain are socioeconomic outcomes. It, therefore, uses PGS for markers of socioeconomic status. The overwhelming focus of research in this area has been the PGS for educational attainment, and that will be the focus of this discussion as well.

Although Type 3 Sociogenomics looks similar to Type 2 Sociogenomics in the sense that both use PGS as independent variables, Type 3 Sociogenomics originates in a different corner of the social sciences: behavior genetics, the subfield of psychology concerned with establishing a genetic basis for human and animal behavior [45]. Throughout the second half of the twentieth century, nonmolecular research on twins and adoptees formed the core of human behavior genetics [46]. These studies focused on estimating the “heritability” of behaviors and social outcomes: the proportion of the variance in an outcome that arises from genetic variation as opposed to nongenetic (environmental) variation [47]. However, when it became possible to look for molecular causes of particular behaviors and outcomes, the candidate gene approach failed, as described above. For a moment, it appeared that the field would not survive in the molecular age [29, 46].

In 2013, however, the SSGAC breathed new life into behavior genetics with its first GWAS of educational attainment [48]. With this GWAS, behavior genetics expanded into sociogenomics, bringing on board economists and demographers, particularly those associated with long-running studies that had genotyped their participants: HRS, Add Health, and WLS. Although the SSGAC has led GWAS for a few other behavioral outcomes, its primary focus has been educational attainment, for three reasons. First, educational attainment initially appeared to be a reasonable proxy for intelligence, which had long been the focus of behavior genetics and its predecessor, differential psychology [49]. Second, educational attainment (unlike more direct measures of intelligence) was widely available across the disparate data sources available for inclusion in the GWAS (recall from above that GWAS require enormous samples). Third, demographers and other sociologists had long ago identified education as the primary vehicle for social mobility in the United States [50].

The first GWAS of educational attainment (EA1) had a discovery sample of 101,069 people. In validation samples, the PGS it produced accounted for about 2% of the variance in educational attainment [48]. The second GWAS of educational attainment (EA2) had a discovery sample of 293,723 people. Its PGS accounted for about 3.2% of the variance in educational attainment in replication samples [51]. With the 2017 release of data from the UK Biobank [52], the SSGAC was able to coordinate a GWAS of educational attainment with a discovery sample of 1,131,881 people (EA3) [53]. In replication, the PGS accounted for 12.7% of variance in educational attainment in Add Health and 10.6% in HRS.

Social scientists debated whether an R² of 12.7% was large or small, but behavior geneticists were triumphant. These GWAS appeared to demonstrate that specific genetic variants played a decisive role in producing differences in educational attainment [54]. Some even argued that an individual’s PGS for educational attainment provided a more accurate representation of their intelligence than an IQ test and proposed that it be used to distribute educational opportunities and occupational placements [49]. The start-up Genomic Prediction, founded by physicist Stephen Hsu, biochemist Nathan Treff, and bioinformatician Laurent Tellier, offered couples undergoing in vitro fertilization the opportunity to screen embryos for low predicted educational attainment [55]. The first baby born of this technology entered the world in 2020 [56].

GWAS of educational attainment and the claims made on their behalf by behavior geneticists drew severe critique from sociologists, historians, and other scholars, who warned that this research threatened to substitute genetic determinism for social science and to give new legitimacy to eugenics [57–59]. Geneticists began to express these concerns as well in response to a 2019 GWAS of same-sex sexual activity led by the Broad Institute [60]. Their fears were validated by the nearly immediate release of an app called “How Gay Are You?” which allowed users to upload their raw genetic data and calculate their PGS for same-sex sexual activity [61]. Today, the company Traitwell offers the same service for the PGS of educational attainment [62].

Demographers were initially optimistic about the GWAS of educational attainment. Research using genetically enhanced demographic surveys (primarily Add Health, HRS, WLS, and the Dunedin Study) found that the PGS for educational attainment predicts a number of markers of socioeconomic success: social-class mobility [63], adult occupational status [64], geographic mobility [65], labor earnings [66], wealth at retirement [67], and of course educational attainment itself [68].

Demographers and other sociologists, however, were more interested in using PGS for educational attainment and other socioeconomic outcomes to control for unobserved (genetic) heterogeneity, thereby producing a clearer picture of how social variables contribute to socioeconomic outcomes [12]. This research has demonstrated that, even when the PGS for educational attainment is taken into account, childhood socioeconomic status remains an important predictor of educational outcomes and adult socioeconomic status. In the HRS sample, respondents whose PGS was in the lowest quartile but whose fathers’ income was in the highest quartile graduated from high school at a higher rate than respondents whose PGS was in the highest quartile but whose fathers’ income was in the lowest quartile. The same was true of college graduation rates [66]. Data from HRS also showed that correlations between a child’s educational attainment and that of their parents are primarily driven by social inheritance rather than genetic inheritance [69]. Data from the UK Biobank showed that children with higher socioeconomic status had greater returns to schooling, net of the PGS for educational attainment [70].

In 2022, researchers associated with SSGAC published a fourth GWAS of educational attainment (EA4) [71]. This study used a discovery sample of 3,037,499 individuals, more than two-thirds of whom came from 23andMe. The resulting PGS accounts for 15.8% of the variance in educational attainment in the Add Health sample and 12.0% of the variance in educational attainment in the HRS sample (sample-size-weighted mean of 13.3%) [71].
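
The sample-size-weighted mean reported here can be checked with simple arithmetic. Because the sizes of the two validation samples are not given in this chapter, the Python sketch below works backward from the three reported percentages to the share of the combined sample that Add Health would need to contribute. The function names are my own, and since the reported mean is rounded to one decimal place, the implied share is only approximate.

```python
def weighted_mean(values, weights):
    """Weighted average of values, with weights that need not sum to 1."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def implied_weight(mean_value, value_a, value_b):
    """Share of the combined sample in cohort A implied by a two-cohort weighted mean."""
    return (mean_value - value_b) / (value_a - value_b)


# Percentages of variance explained, as reported for EA4.
R2_ADD_HEALTH = 15.8
R2_HRS = 12.0
R2_WEIGHTED = 13.3

share_add_health = implied_weight(R2_WEIGHTED, R2_ADD_HEALTH, R2_HRS)
check = weighted_mean([R2_ADD_HEALTH, R2_HRS], [share_add_health, 1 - share_add_health])
```

The implied share is roughly one-third, meaning that the weighted mean sits closer to the HRS figure because HRS contributes about twice as many validation participants as Add Health under this back-calculation.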

By that time, however, these larger R² figures had come to seem less impressive. A raft of research had demonstrated that an individual’s PGS for educational attainment is correlated with socioeconomic advantage at family and neighborhood levels [68–72], and that a substantial proportion of the predictive power of the PGS for educational attainment is due to its correlation with a person’s living environment rather than to biochemical effects in a person’s body (direct genetic effects) [73]. Several studies found that the PGS for educational attainment lost upward of half of its predictive power in samples of siblings as opposed to unrelated individuals [63–74]. Research comparing adopted children to nonadopted children found that the PGS for educational attainment was substantially less predictive of actual educational attainment for adopted children [75] and that parents’ PGS for educational attainment predicted the educational attainment of both biological and adopted children (though less well for adopted children than for biological children) [76]. Researchers typically describe the effect of the parents’ PGS on the child’s education as “genetic nurture” or an “indirect genetic effect,” but there is no evidence that the parents’ DNA has a direct effect on the environment they provide for their children. Data from the Brisbane Adolescent Twin Study suggest that this effect is entirely accounted for by parents’ socioeconomic status [77].
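
The reason sibling comparisons shrink the apparent effect of a PGS can be shown with a constructed example. In the Python sketch below, each invented family has a shared environment that happens to be better in families with higher average scores, and each sibling's outcome equals a fixed direct effect of their own score plus that shared environment. All numbers, the assumed direct effect of 0.5, and the function names are hypothetical. The sketch compares regression slopes rather than variance explained, so it conveys the logic of the within-family design without reproducing any published estimate.

```python
from statistics import mean

DIRECT_EFFECT = 0.5  # assumed effect of a sibling's own PGS on the outcome

# Invented families: (shared family environment, PGS of sibling 1, PGS of sibling 2).
families = [(-1.0, -1.5, -0.5), (0.0, -0.2, 0.6), (1.0, 0.9, 1.3)]


def outcome(pgs, family_environment):
    return DIRECT_EFFECT * pgs + family_environment


def pooled_slope(x, y):
    """Least-squares slope treating all siblings as unrelated individuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def within_family_slope(dx, dy):
    """Slope through the origin for sibling differences; shared environment cancels."""
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


all_pgs, all_y, diff_pgs, diff_y = [], [], [], []
for env, pgs_1, pgs_2 in families:
    y_1, y_2 = outcome(pgs_1, env), outcome(pgs_2, env)
    all_pgs += [pgs_1, pgs_2]
    all_y += [y_1, y_2]
    diff_pgs.append(pgs_2 - pgs_1)
    diff_y.append(y_2 - y_1)

between = pooled_slope(all_pgs, all_y)
within = within_family_slope(diff_pgs, diff_y)
```

In this toy setup the pooled slope is more than twice the within-family slope, because the pooled estimate absorbs the correlation between a family's scores and its environment while the sibling comparison recovers only the assumed direct effect.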

In the most recent GWAS of educational attainment (EA4), within-family analysis demonstrated that only 30.9% of the R² could be attributed to direct genetic effects [71]. That is, an individual’s own SNPs account for only 4.1% (30.9% of the sample-size-weighted mean R² of 13.3%) of the variance in educational attainment in HRS and Add Health. Moreover, the findings of within-family studies cannot be generalized to between-family differences [78]. Just in the past two years, social and medical geneticists have warned publicly against using PGS (for any outcomes, but especially for educational attainment) to select embryos, as they do not have the effect that companies like Genomic Prediction claim [79]. Genetic differences do seem to have some bearing on educational attainment (and therefore other socioeconomic outcomes), but their effect is small, and we still do not know which genes matter or how they matter.
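
The 4.1% figure follows directly from the two numbers reported for EA4, as the short Python calculation below shows. The variable names are my own; the remainder is labeled simply as the part of the explained variance not attributed to direct genetic effects, without any claim about what produces it.

```python
R2_WEIGHTED_MEAN = 0.133  # reported sample-size-weighted mean share of variance explained
DIRECT_SHARE = 0.309      # reported share of that figure attributed to direct genetic effects

direct_r2 = R2_WEIGHTED_MEAN * DIRECT_SHARE
remainder_r2 = R2_WEIGHTED_MEAN - direct_r2

direct_percent = round(direct_r2 * 100, 1)
remainder_percent = round(remainder_r2 * 100, 1)
```

Put differently, of the roughly 13 percentage points of variance that the score explains, only about 4 are attributed to a person's own SNPs, leaving about 96% of the total variance in educational attainment unaccounted for by direct genetic effects as measured here.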

4. The drawbacks of GWAS and PGS

Only six years ago, demographers were optimistic that GWAS could produce PGS that would allow them to control for unobserved genetic heterogeneity, increasing the explanatory power of their social models [12]. As the previous section indicates, however, GWAS and PGS for educational attainment have not lived up to the hype that accompanied their introduction. Educational attainment is the most studied social outcome in Type 3 Sociogenomics, but the difficulty of identifying direct genetic effects, even with discovery samples that include millions of people, suggests that GWAS for other socioeconomic outcomes will face similar challenges. This section describes two additional problems with GWAS and PGS that apply to biomedical outcomes (those used in Type 2 Sociogenomics) as well as socioeconomic outcomes (those used in Type 3 Sociogenomics): their lack of portability and their mismeasurement of genetic risk.

4.1 The nonportability of PGS

Across biomedical and social genomics, GWAS discovery samples are typically limited to white people with exclusively European genetic ancestry [80]. The rationale for this limitation derives from a racist thought experiment known as “the chopsticks problem”: if a GWAS of facility with chopsticks were performed on students from a US university, it would inevitably identify SNPs that are more common among Asian Americans, regardless of whether such SNPs have any bearing on manual dexterity [81]. The issue is what geneticists term “population stratification”: DNA and culture both vary geographically. Ideally, in a GWAS discovery sample, the only systematic genetic differences between cases and controls would be in SNPs that cause (or are in linkage [correlated] with SNPs that cause) the outcome in question. However, since the purpose of the GWAS is to identify the SNPs that cause (or are in linkage with SNPs that cause) the outcome in question, there is no way to apply this limitation in advance. Instead, researchers typically use race as a proxy for genetic similarity. They do so in the language of “genetic ancestry” [54] and identify it with genetic markers (either ancestry informative markers or principal components analysis) [82]. However, GWAS typically define ancestry as the continent from which one’s recent ancestors originated, thereby conflating ancestry with the socially constructed racial categories used in the United States, even though human genetic variation occurs continuously across continents rather than categorically between them [83]. Boundaries between the so-called ancestry groups are arbitrary and artificial [59].
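
The statistical mechanism behind population stratification can be demonstrated with a constructed example in which a SNP has no effect on an outcome at all. In the Python sketch below, two invented groups differ both in how common an allele is and in their average outcome (for reasons unrelated to the allele), and within each group the outcome is unrelated to genotype. The group labels, genotype counts, and outcome values are all hypothetical. Subtracting group means is used here as a simple stand-in for adjustment; it is not the ancestry-marker or principal-components approach mentioned above.

```python
from statistics import mean


def slope(x, y):
    """Slope from a one-predictor least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def demean_within_groups(values, groups):
    """Subtract each group's own mean so that only within-group variation remains."""
    group_means = {
        g: mean([v for v, label in zip(values, groups) if label == g])
        for g in set(groups)
    }
    return [v - group_means[label] for v, label in zip(values, groups)]


# Invented data: the allele is more common in group_1, which also has a higher
# average outcome for non-genetic reasons. Within each group there is no association.
groups = ["group_1"] * 4 + ["group_2"] * 4
dosage = [1, 2, 1, 2, 0, 1, 0, 1]
skill = [7.0, 7.0, 9.0, 9.0, 2.0, 2.0, 4.0, 4.0]

naive_slope = slope(dosage, skill)
adjusted_slope = slope(
    demean_within_groups(dosage, groups),
    demean_within_groups(skill, groups),
)
```

The naive analysis reports a sizable association that vanishes once group differences are removed, which is why an unadjusted GWAS in a mixed sample can flag SNPs that merely track group membership.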

Owing to the large sample sizes required by GWAS, white/European ancestry subjects have been most readily available, and all of the GWAS of educational attainment have been limited to white/European ancestry samples (as identified by ancestry informative markers or principal components analysis). Three problems remain, however. First, these large white/European ancestry samples are not representative of any actual population. GWAS discovery samples rely heavily on two sources of data. One is the UK Biobank, which is volunteer-based and has consistently experienced low participation rates, and is therefore not representative of the UK population [84]. The other is 23andMe, a direct-to-consumer genetic testing company that collects research samples from customers who have paid for some type of genetic testing [85]. Its customer base comes primarily from the United States, but is far from representative of the US population. These samples become even less representative when combined with each other and with additional samples. Second, limiting GWAS discovery samples to white/European ancestry populations does not fully solve the problem of population stratification because stratification occurs geographically at scales much finer than the continent [86] and socially within geographic locations [87]. Third, the PGS that are created from these GWAS have limited portability, both outside of and within white/European ancestry populations.

In biomedical as well as social genetics, PGS constructed from GWAS that include only white people are much less predictive (if predictive at all) for nonwhite individuals [88]. The PGS constructed from the most recent GWAS of educational attainment (EA4) accounts for only 1.3% of the variance in educational attainment among African Americans in HRS and 2.3% of the variance in educational attainment among African Americans in Add Health, as compared to 12.0% (HRS) and 15.8% (Add Health) for white Americans [71]. The use of biomedical or socioeconomic PGS for clinical or policy purposes, therefore, threatens to exacerbate existing disparities [88]. Several efforts are currently underway to make GWAS for biomedical outcomes more diverse, but these have encountered scientific challenges along with those related to recruiting participants [89].
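
Figures such as these are shares of variance explained. As a rough illustration of how such a figure is obtained, the Python sketch below computes the squared correlation between a simulated score and a simulated outcome in two hypothetical samples. The slopes (0.40 and 0.12) are arbitrary assumptions chosen to produce a large gap and are not estimates from HRS or Add Health. Published values are typically incremental R² after adjusting for covariates, which this sketch omits.

```python
import random


def r_squared(xs, ys):
    """Share of variance in ys linearly accounted for by xs (squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


def simulate_sample(n, slope, rng):
    """Simulate a standardized score and an outcome that depends on it via `slope`."""
    scores = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcomes = [slope * s + rng.gauss(0.0, 1.0) for s in scores]
    return scores, outcomes


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical samples: the score tracks the outcome closely in one
# sample and only weakly in the other. Both slopes are assumptions.
r2_similar = r_squared(*simulate_sample(20000, 0.40, rng))
r2_different = r_squared(*simulate_sample(20000, 0.12, rng))

print("sample resembling the discovery sample:", round(r2_similar, 3))
print("sample unlike the discovery sample:    ", round(r2_different, 3))
```

The sketch shows only that the same score can account for very different shares of variance in different samples. It says nothing about why the relationship differs, which is the question the surrounding discussion takes up.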

The lack of portability across racial categories is typically explained in biological terms: perhaps there are different SNPs that account for the outcomes in question in non-European-ancestry populations, or perhaps the SNPs identified in the GWAS are in linkage (correlated) with different SNPs in non-European-ancestry populations than they are in European-ancestry populations. However, it is equally possible that genetic variation has different consequences in different social environments [15]. For example, data from HRS, Add Health, and WLS indicate that the PGS for educational attainment became more predictive for white American women across the twentieth century, as gender discrimination abated [90]. Currently, the nonportability of the PGS for educational attainment beyond white/European subjects means that responsible research using this PGS must be limited to white/European samples (though irresponsible research does occasionally get published; see, for example, [91]).

Demographers and geneticists have recently found that PGS have limited portability even among white individuals with exclusively European ancestry, indicating substantial genetic diversity within this racially defined group. For example, PGS constructed from GWAS using individuals with ancestry in Northwest Europe do a much worse job of predicting outcomes in individuals with ancestry in other parts of Europe [92]. Research using data from WLS has demonstrated that the PGS for educational attainment does not predict outcomes for Jewish individuals as well as it does for non-Jewish individuals [93]. Research using data from the UK Biobank has shown that PGS also fail to port across other social categories, including age, sex, and socioeconomic status [94]. While researchers can account for the lack of portability outside of white/European ancestry samples by limiting their studies to white individuals with exclusively European ancestry, they cannot account for the lack of portability within white/European ancestry samples because the factors limiting portability are rarely known in advance.

4.2 PGS and the mismeasurement of genetic risk

The final problem with the use of GWAS and PGS in demography is that they do not effectively do what demographers need them to do: account for genetic heterogeneity and thereby facilitate the production of better models and more accurate estimates of social parameters [12, 54]. That is, the PGS produced by GWAS of educational attainment and other social outcomes do not accurately capture individual genomic propensity for those outcomes. As this chapter has shown, existing methods generate PGS that are seriously confounded by social and other environmental factors, and whose portability is severely limited in ways that are both known (beyond white/European ancestry samples) and unknown (within white/European ancestry samples). Additionally, GWAS measure only one type of genetic risk: the additive effects of SNPs. They exclude rare variants, structural variants, insertions, and deletions. They also exclude the X and Y chromosomes and mitochondrial DNA [78].
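
What a PGS can and cannot contain follows from how it is built. In its simplest form, a score is a weighted sum of allele counts, with the weights taken from GWAS effect estimates. The Python sketch below assumes that simplest form. The SNP identifiers and weights are hypothetical, a SNP missing from a person's record is treated as a count of zero for simplicity, and real pipelines typically also adjust the weights for linkage between SNPs, which is omitted here.

```python
def polygenic_score(allele_counts, gwas_weights):
    """Additive score: sum over weighted SNPs of (weight * allele count).

    allele_counts maps a variant identifier to 0, 1, or 2 copies.
    gwas_weights maps a SNP identifier to its estimated per-allele effect.
    Variants without a weight are ignored entirely.
    """
    score = 0.0
    for snp, weight in gwas_weights.items():
        # Simplifying assumption: a SNP not genotyped counts as 0 copies.
        score += weight * allele_counts.get(snp, 0)
    return score


# Hypothetical weights for two SNPs (illustrative values only).
gwas_weights = {"snp_a": 0.5, "snp_b": -0.25}

# Hypothetical person who also carries a variant the GWAS never measured.
person = {"snp_a": 2, "snp_b": 1, "rare_variant_x": 1}

score = polygenic_score(person, gwas_weights)
print(score)
```

Anything without a weight, such as the hypothetical rare variant here, contributes nothing to the score, and the purely additive form leaves no room for interactions among variants or between variants and environments.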

PGS thus exclude genetic information that may be relevant and include social information that is relevant to the outcome in question but not caused by DNA [95]. As a result, they mismeasure the genetic risk of the outcomes they are supposed to index, even in individuals who resemble the discovery sample. Medical geneticists have already moved away from PGS as indicators of disease risk. Demography, however, seems to be moving in the opposite direction, with long-running studies increasingly genotyping participants and adding PGS to their public data offerings. Proponents have claimed that any social research that does not include a PGS is akin to bank robbery: a waste of public money due to omitted variable bias [54]. But it is becoming increasingly clear that PGS do not solve the problem of unobserved genetic heterogeneity, and that including them in social models may produce worse rather than better parameter estimates. PGS therefore threaten to naturalize social inequality and thereby preserve the status quo. While this is primarily a problem for Type 3 Sociogenomics, any cautions about using PGS also apply to Type 2 Sociogenomics.

5. Conclusions

The use of genomic and epigenetic data in demography is quite new—dating only from the first decade of the present century—and rapidly changing. So far, it has validated nongenetic research in social demography, demonstrating that educational attainment is strongly conditioned by childhood socioeconomic status and that socioeconomic inequality in both childhood and adulthood produces health disparities. With the development of new epigenetic measures in particular, it may be possible to get a more detailed picture of exactly how the social world influences health, which may point to new ways to advance population health and health equity.

Yet, while the use of genomic and epigenetic data has provided new evidence for the ways in which social variables produce social and biomedical outcomes, it has provided scant new evidence for the ways in which DNA contributes to these outcomes. Candidate gene studies proved a boondoggle, demonstrating that genetic influences on socioeconomic and biomedical outcomes are weaker than previously expected and more widely distributed across the genome. Individual genes rarely have identifiable effects, either medically or socially. GWAS identify SNPs that are correlated with particular outcomes, but they do not identify causal pathways. They give credence to the long-standing claim of behavior genetics that DNA influences all human outcomes, but do not provide any specific information about how.

Six years ago, demographers were enthusiastic about the prospect that GWAS would produce PGS that could simply be dropped into quantitative models to control for genetic heterogeneity and better clarify social processes. Increasingly, however, it has become clear that PGS are not adequate to this task, though they may have other uses. Research by demographers has been key to identifying the limitations of PGS, particularly the PGS for educational attainment, and to demonstrating the impossibility of separating genetic and environmental causes of socioeconomic and biomedical outcomes. This research has clearly shown that genes matter differently in different social contexts. GWAS may eventually be consigned to the same dustbin as candidate gene studies, or they may prove useful beyond the construction of PGS.

Several new lines of research are beginning to appear on the horizon of sociogenomics. One is further inquiry into the genetics of education and other social outcomes, moving beyond SNPs and into rare variants and structural variants. A second is the study of endophenotypes: biomarkers or subclinical phenomena that show stronger correlations with DNA than the phenotypes demographers have studied to this point (socioeconomic outcomes and disease states). Finally, RNA and new epigenetic phenomena might provide additional insights into the social determinants of health, particularly as they relate to aging and to the effects of childhood experiences on later life health. Undoubtedly, demographers will continue to use genomic and epigenetic data to advance the project of social demography in creative and productive ways.

Acknowledgments

This research was funded in part by a Hellman Fellowship from the University of California, Davis, and partially completed while the author was a member of the School of Social Science at the Institute for Advanced Study in Princeton, NJ. It was inspired and shaped by conversations with many demographers, sociologists, and sociogenomicists. Particular thanks are due to Dalton Conley, Sam Trejo, and the members of the Biosociology Lab at Princeton University.

References

  1. 1. Preston SH. The contours of demography: Estimates and projections. Demography. 1993;30:593-606. DOI: 10.2307/2061808
  2. 2. Merchant EK. A digital history of Anglophone demography and global population control, 1915-1984. Population and Development Review. 2017;43:83-117. DOI: 10.1111/padr.12044
  3. 3. Merchant EK, Alexander CS. U.S. demography in transition. Historical Methods: A Journal of Quantitative and Interdisciplinary History. 2022;55:168-188. DOI: 10.1080/01615440.2022.2098216
  4. 4. Herd P, Carr D, Roan C. Cohort profile: Wisconsin Longitudinal Study (WLS). International Journal of Epidemiology. 2014;43:34-41. DOI: 10.1093/ije/dys194
  5. 5. Johnson DS, McGonagle KA, Freedman VA, Sastry N. Fifty years of the Panel Study of Income Dynamics: Past, present, and future. The ANNALS of the American Academy of Political and Social Science. 2018;680:9-28. DOI: 10.1177/0002716218809363
  6. 6. Sonnega A, Faul JD, Ofstedal MB, Langa KM, Phillips JWR, Weir DR. Cohort profile: The Health and Retirement Study (HRS). International Journal of Epidemiology. 2014;43:576-585. DOI: 10.1093/ije/dyu067
  7. 7. Harris KM, Halpern CT, Whitsel EA, Hussey JM, Killeya-Jones LA, Tabor J, et al. Cohort profile: The National Longitudinal Study of Adolescent to Adult Health (Add Health). International Journal of Epidemiology. 2019;48:1415-1415k. DOI: 10.1093/ije/dyz115
  8. 8. Reichman NE, Teitler JO, Garfinkel I, McLanahan SS. Fragile Families: Sample and design. Children and Youth Services Review. 2001;23:303-326. DOI: 10.1016/S0190-7409(01)00141-4
  9. 9. Steptoe A, Breeze E, Banks J, Nazroo J. Cohort profile: the English Longitudinal Study of Ageing. International Journal of Epidemiology. 2013;42:1640-1648. DOI: 10.1093/ije/dys168
  10. 10. Poulton R, Moffitt TE, Silva PA. The Dunedin Multidisciplinary Health and Development Study: Overview of the first 40 years, with an eye to the future. Social Psychiatry and Psychiatric Epidemiology. 2015;50:679-693. DOI: 10.1007/s00127-015-1048-8
  11. 11. National Research Council. Conducting Biosocial Surveys: Collecting, Storing, Accessing, and Protecting Biospecimens. Washington, DC: National Academies Press; 2010
  12. 12. Conley D, Fletcher J. The Genome Factor: What the Social Genomics Revolution Reveals about Ourselves, Our History, and the Future. Princeton: Princeton University Press; 2017. p. 296
  13. 13. Singal R, Ginder GD. DNA methylation. Blood. 1999;93:4059-4070. DOI: 10.1182/blood.V93.12.4059
  14. 14. Blasco MA. Telomere length, stem cells and aging. Nature Chemical Biology. 2007;3:640-649. DOI: 10.1038/nchembio.2007.38
  15. 15. Boardman JD, Fletcher JM. Evaluating the continued integration of genetics into medical sociology. Journal of Health and Social Behavior. 2022;62:404-418. DOI: 10.1177/00221465211032581
  16. 16. Notterman DA, Mitchell C. Epigenetics and understanding the impact of social determinants of health. Pediatric Clinics. 2015;62:P1227-P1240. DOI: 10.1016/j.pcl.2015.05.012
  17. 17. Harris KM, Halpern CT, Smolen A, Haberstick BC. The National Longitudinal Study of Adolescent Health (Add Health) twin data. Twin Research and Human Genetics. 2012;9:988-997. DOI: 10.1375/twin.9.6.988
  18. 18. De Neve JE, Fowler JH. Credit card borrowing and the monoamine oxidase A (MAOA) gene. Journal of Economic Behavior and Organization. 2014;107:428-439. DOI: 10.1016/j.jebo.2014.03.002
  19. 19. Fowler JH, Dawes CT. Two genes predict voter turnout. The Journal of Politics. 2008;70:579-594. DOI: 10.1017/S0022381608080638
  20. 20. Dawes CT, Fowler JH. Partisanship, voting, and the dopamine D2 receptor gene. The Journal of Politics. 2009;71:1157-1171. DOI: 10.1017/S002238160909094X
  21. 21. Settle JE, Dawes CT, Christakis NA, Fowler JH. Friendships moderate an association between a dopamine gene variant and political ideology. The Journal of Politics. 2010;72:1189-1198. DOI: 10.1017/S0022381610000617
  22. 22. Charney E, English W. Candidate genes and political behavior. The American Political Science Review. 2012;106:1-34. DOI: 10.1017/S0003055411000554
  23. 23. Haberstick BC, Lessem JM, Hewitt JK, Smolen A, Hopfer CJ, Halpern CT, et al. MAOA genotype, childhood maltreatment, and their interaction in the etiology of adult antisocial behaviors. Biological Psychiatry. 2014;75:25-30. DOI: 10.1016/j.biopsych.2013.03.028
  24. 24. Haberstick BC, Boardman JD, Wagner B, Smolen A, Hewitt JK, Killeya-Jones LA, et al. Depression, stressful life events, and the impact of variation in the serotonin transporter: Findings from the National Longitudinal Study of Adolescent to Adult Health (Add Health). PLoS One. 2016;11:e0148373. DOI: 10.1371/journal.pone.0148373
  25. 25. Chabris CF, Hebert BM, Benjamin DJ, Beauchamp J, Cesarini D, van der Loos M, et al. Most reported genetic associations with general intelligence are probably false positives. Psychological Science. 2012;23:1314-1323. DOI: 10.1177/0956797611435528
  26. 26. Chabris CF, Lee JJ, Benjamin DJ, Beauchamp JP, Glaeser EL, Borst G, et al. Why it is hard to find genes associated with social science traits: Theoretical and empirical considerations. American Journal of Public Health. 2013;103:S152-S166. DOI: 10.2105/AJPH.2013.301327
  27. 27. Brookes AJ. The essence of SNPs. Gene. 1999;234:177-186. DOI: 10.1016/S0378-1119(99)00219-X
  28. 28. Bush WS, Moore JH. Genome-wide association studies. PLoS Computational Biology. 2012;8:e1002822. DOI: 10.1371/journal.pcbi.1002822
  29. 29. Harden KP. “Reports of my death were greatly exaggerated”: Behavior genetics in the postgenomic era. Annual Review of Psychology. 2021;72:37-60. DOI: 10.1146/annurev-psych-052220-103822
  30. 30. Massey DS, Wagner B, Donnelly L, McLanahan S, Brooks-Gunn J, Garfinkel I, et al. Neighborhood disadvantage and telomere length: Results from the Fragile Families Study. The Russell Sage Foundation Journal of the Social Sciences. 2018;4:28-42. DOI: 10.7758/RSF.2018.4.4.02
  31. 31. Mitchell C, Hobcraft J, McLanahan SS, Siegel SR, Berg A, Brooks-Gunn J, et al. Social disadvantage, genetic sensitivity, and children’s telomere length. Proceedings of the National Academy of Sciences. 2014;111:5944-5949. DOI: 10.1073/pnas.1404293111
  32. 32. Mitchell C, McLanahan S, Schneper L, Garfinkel I, Brooks-Gunn J, Notterman D. Father loss and child telomere length. Pediatrics. 2017;140:e20163245. DOI: 10.1542/peds.2016-3245
  33. 33. Drury SS, Mabile E, Brett ZH, Jones E, Shirtcliff EA, Theall KP. The association of telomere length with family violence and disruption. Pediatrics. 2014;134:e128-e137. DOI: 10.1542/peds.2013-3415
  34. 34. De Vito R, Grabski IN, Aguiar D, Schneper LM, Verma A, Fernandez JC, Mitchell C, Bell J, McLanahan S, Notterman DA, Engelhardt BE. Differentially methylated regions and methylation QTLs for teen depression and early puberty in the Fragile Families Child Wellbeing Study. bioRxiv. 2021. DOI: 10.1101/2021.05.20.444959
  35. 35. Borghol N, Suderman M, McArdle W, Racine A, Hallett M, Pembrey M, et al. Associations with early-life socio-economic position in adult DNA methylation. International Journal of Epidemiology. 2012;41:62-74. DOI: 10.1093/ije/dyr147
  36. 36. Carmeli C, Kutalik Z, Mishra PP, Porcu E, Delpierre C, Delaneau O, et al. Gene regulation contributes to explain the impact of early life socioeconomic disadvantage on adult inflammatory levels in two cohort studies. Scientific Reports. 2021;11:3100. DOI: 10.1038/s41598-021-82714-2
  37. 37. Cole SW, Shanahan MJ, Gaydosh L, Harris KM. Population-based RNA profiling in Add Health finds social disparities in inflammatory and antiviral gene regulation emerge by young adulthood. Proceedings of the National Academy of Sciences. 2020;117:4601-4608. DOI: 10.1073/pnas.1821367117
  38. 38. Musliner KL, Seifuddin F, Judy JA, Pirooznia M, Goes FS, Zandi PP. Polygenic risk, stressful life events and depressive symptoms in older adults: A polygenic score analysis. Psychological Medicine. 2014;45:1709-1720. DOI: 10.1017/S0033291714002839
  39. 39. Moorman SM, Greenfield EA, Garcia S. School context in adolescence and cognitive functioning 50 years later. Journal of Health and Social Behavior. 2020;60:493-508. DOI: 10.1177/0022146519887354
  40. 40. Boardman JD, Domingue BW, Daw J. What can genes tell us about the relationship between education and health? Social Science & Medicine. 2015;127:171-180. DOI: 10.1016/j.socscimed.2014.08.001
  41. 41. Liu H, Guo G. Lifetime socioeconomic status, historical context, and genetic inheritance in shaping body mass in middle and late adulthood. American Sociological Review. 2015;80:705-737. DOI: 10.1177/0003122415590627
  42. 42. Barcellos SH, Carvalho LS, Turley P. Education can reduce health differences related to genetic risk of obesity. Proceedings of the National Academy of Sciences. 2018;115:E9765-E9772. DOI: 10.1073/pnas.1802909115
  43. 43. Robinette JW, Boardman JD, Crimmins EM. Differential vulnerability to neighbourhood disorder: A gene-x-environment interaction study. Journal of Epidemiology and Community Health. 2019;73:388-392. DOI: 10.1136/jech-2018-211373
  44. 44. Guo F, Harris KM, Boardman JD, Robinette JW. Does crime trigger genetic risk for type 2 diabetes in young adults? A G x E interaction study using national data. Social Science & Medicine. 2022;313:115396. DOI: 10.1016/j.socscimed.2022.115396
  45. 45. Knopik V, Neiderhiser J, DeFries J, Plomin R. Behavioral Genetics. New York: Macmillan; 2017
  46. 46. Panofsky A. Misbehaving Science: Controversy and the Development of Behavior Genetics. Chicago: University of Chicago Press; 2014. p. 320
  47. 47. Keller EF. The Mirage of a Space between Nature and Nurture. Durham: Duke University Press; 2010. p. 120
  48. 48. Rietveld CA et al. GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science. 2013;340:1467-1471. DOI: 10.1126/science.1235488
  49. 49. Plomin R. Blueprint: How DNA Makes Us Who We Are. Cambridge: MIT Press; 2018. p. 280
  50. 50. Blau P, Duncan OD. The American Occupational Structure. New York: John Wiley and Sons; 1967. p. 520
  51. 51. Okbay A et al. Genome-wide association study identifies 74 loci associated with educational attainment. Nature. 2016;533:539-542. DOI: 10.1038/nature17671
  52. 52. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203-209. DOI: 10.1038/s41586-018-0579-z
  53. 53. Lee JJ et al. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics. 2018;50:1112-1121. DOI: 10.1038/s41588-018-0147-3
  54. 54. Harden KP. The Genetic Lottery: Why DNA Matters for Social Equality. Princeton: Princeton University Press; 2021. p. 312
  55. 55. Adler S. G: Unnatural selection. RadioLab. July 25, 2019. Available from: https://www.wnycstudios.org/podcasts/radiolab/articles/g-unnatural-selection [Accessed: December 12, 2022]
  56. 56. Shanks P. The first polygenic risk score baby. Biopolitical Times. September 30 2021. Available from: https://www.geneticsandsociety.org/biopolitical-times/first-polygenic-risk-score-baby [Accessed: December 12, 2022]
  57. 57. Bliss C. Social by Nature: The Promise and Peril of Sociogenomics. Stanford: Stanford University Press; 2018. p. 304
  58. 58. Henn B, Merchant EK, O’Connor A, Rulli T. Why DNA is no key to social equality: On Kathryn Paige Harden’s The Genetic Lottery. LA Review of Books. September 21, 2021. Available from: https://lareviewofbooks.org/article/why-dna-is-no-key-to-social-equality-on-kathryn-paige-hardens-the-genetic-lottery/ [Accessed: December 12, 2022]
  59. 59. Coop G, Przeworski M. Lottery, luck, or legacy. A review of The Genetic Lottery: Why DNA Matters for Social Equality. Evolution. 2022;76:846-853. DOI: 10.1111/evo.14449
  60. 60. Ganna AA et al. Large-scale GWAS reveals insights into the genetic architecture of same-sex sexual behavior. Science. 2019;365:eaat7693. DOI: 10.1126/science.aat7693
  61. 61. Maxmen A. Controversial “gay gene” app provokes fears of a genetic Wild West. Nature. 2019;574:609-610. DOI: 10.1038/d41586-019-03282-0
  62. 62. Explore your educational possibilities. Available from: https://www.traitwell.com/app/EA. [Accessed: December 12, 2022]
  63. 63. Belsky DW, Domingue BW, Wedow R, Arseneault L, Boardman JD, Caspi A, et al. Genetic analysis of social-class mobility in five longitudinal studies. Proceedings of the National Academy of Sciences. 2018;115:E7275-E7284. DOI: 10.1073/pnas.1801238115
  64. 64. Belsky DW, Moffitt TE, Corcoran DL, Domingue B, Harrington H, Hogan S, et al. The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychological Science. 2016;27:957-972. DOI: 10.1177/0956797616643070
  65. 65. Abdellaoui A, Hugh-Jones D, Yengo L, Kemper KE, Nivard MG, Veul L, et al. Genetic correlates of social stratification in Great Britain. Nature Human Behaviour. 2019;3:1332-1342. DOI: 10.1038/s41562-019-0757-5
  66. 66. Papageorge NW, Thom K. Genes, education, and labor market outcomes: Evidence from the Health and Retirement Study. Journal of the European Economic Association. 2020;18:1351-1399. DOI: 10.1093/jeea/jvz072
  67. 67. Barth D, Papageorge NW, Thom K. Genetic endowments and wealth inequality. Journal of Political Economy. 2020;128:1474-1522. DOI: 10.1086/705415
  68. 68. Domingue BW, Belsky DW, Conley D, Harris KM, Boardman JD. Polygenic influence on educational attainment: New evidence from the National Longitudinal Study of Adolescent to Adult Health. AERA Open. 2015;1:1-13. DOI: 10.1177/2332858415599972
  69. 69. Conley D, Domingue BW, Cesarini D, Dawes C, Rietveld CA, Boardman JD. Is the effect of parental education on offspring biased or moderated by genotype? Sociological Science. 2015;2:82-105. DOI: 10.15195/v2.a6
  70. 70. Barcellos SH, Carvalho L, Turley P. The effect of education on the relationship between genetics, early-life disadvantages, and later-life SES. NBER Working Paper. 2021:28750. DOI: 10.3386/w28750
  71. 71. Okbay A et al. Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nature Genetics. 2022;54:437-449. DOI: 10.1038/s41588-022-01016-z
  72. 72. Trejo S, Belsky DW, Boardman JD, Freese J, Harris KM, Herd P, et al. Schools as moderators of genetic associations with life course attainments: Evidence from the WLS and Add Health. Sociological Science. 2018;5:513-540. DOI: 10.15195/v5.a22
  73. 73. Young AI, Benonisdottir S, Przeworski M, Kong A. Deconstructing the sources of genotype-phenotype associations in humans. Science. 2019;365:1396-1400. DOI: 10.1126/science.aax3710
  74. 74. Howe LJ et al. Within-sibship genome-wide association analyses decrease bias in estimates of direct genetic effects. Nature Genetics. 2022;54:581-592. DOI: 10.1038/s41588-022-01062-7
  75. 75. Cheesman R, Hunjan A, Coleman J, Ahmadzadeh Y, Plomin R, McAdams T, et al. Comparison of adopted and nonadopted individuals reveals gene-environment interplay for education in the UK Biobank. Psychological Science. 2020;31:582-591. DOI: 10.1177/0956797620904450
  76. 76. Domingue BW, Fletcher J. Separating measured genetic and environmental effects: Evidence linking parental genotype and adopted child outcomes. Behavior Genetics. 2020;50:301-309. DOI: 10.1007/s10519-020-10000-4
  77. 77. Bates TC, Maher BS, Colodro-Conde L, Medland SE, McAloney K, Wright MJ, et al. Social competence in parents increases children’s educational attainment: Replicable genetically-mediated effects of parenting revealed by non-transmitted DNA. Twin Research and Human Genetics. 2019;22:1-3. DOI: 10.1017/thg.2018.75
  78. 78. Burt C. Challenging the utility of polygenic scores for social science: Environmental confounding, downward causation, and unknown biology. Behavioral and Brain Sciences. Forthcoming. 2022. DOI: 10.1017/S0140525X22001145
  79. 79. Turley P, Meyer MN, Wang N, Cesarini D, Hammonds E, Martin AR, et al. Problems with using polygenic scores to select embryos. The New England Journal of Medicine. 2021;385:78-86. DOI: 10.1056/NEJMsr2105065
  80. 80. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature. 2016;538:161-164. DOI: 10.1038/538161a
  81. 81. Hamer D, Sirota L. Beware the chopsticks gene. Molecular Psychiatry. 2000;5:11-13. DOI: 10.1038/sj.mp.4000662
  82. 82. Tian C, Gregersen PK, Seldin MF. Accounting for ancestry: Population substructure and genome-wide association studies. Human Molecular Genetics. 2008;17:R143-R150. DOI: 10.1093/hmg/ddn268
  83. 83. Fujimura JH, Bolnick DA, Rajagopalan R, Kaufman JS, Lewontin RC, Duster T, et al. Clines without classes: How to make sense of human variation. Sociological Theory. 2014;32:208-227. DOI: 10.1177/0735275114551611
  84. 84. Keyes KM, Westrich D. UK Biobank, big data, and the consequences of non-representativeness. The Lancet. 2019;393:1297. DOI: 10.1016/S0140-6736(18)33067-8
  85. 85. Harris A, Wyatt S, Kelly SE. The gift of spit (and the obligation to return it): How consumers of online genetic testing services participate in research. Information, Communication & Society. 2013;16:236-257. DOI: 10.1080/1369118X.2012.701656
  86. 86. Abdellaoui A, Dolan CV, Verweij KJH, Nivard MG. Gene-environment correlations across geographic regions affect genome-wide association studies. Nature Genetics. 2022;54:1345-1354. DOI: 10.1038/s41588-022-01158-0
  87. 87. Merchant EK. The social stratification of population as a source of downward causation. Behavioral and Brain Sciences. 2023
  88. 88. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nature Genetics. 2019;51:584-591. DOI: 10.1038/s41588-019-0379-x
  89. 89. Ruan Y et al. Improving polygenic prediction in ancestrally diverse populations. Nature Genetics. 2022;54:573-580. DOI: 10.1038/s41588-022-01054-7
  90. 90. Herd P, Freese J, Sicinski K, Domingue BW, Harris KM, Wei C, et al. Genes, gender inequality, and educational attainment. American Sociological Review. 2019;84:1069-1098. DOI: 10.1177/0003122419886550
  91. 91. Piffer D. Evidence for recent polygenic selection on educational attainment and intelligence inferred from GWAS hits: A replication of previous findings using recent data. Psych. 2019;1:55-75. DOI: 10.3390/psych1010005
  92. 92. Privé F, Aschard H, Carmi S, Folkersen L, Hoggart C, O’Reilly PF, et al. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. American Journal of Human Genetics. 2022;109:12-23. DOI: 10.1016/j.ajhg.2021.11.008
  93. 93. Freese J, Domingue B, Sicinski K, Trejo S, Herd P. Problems with a causal interpretation of polygenic score differences between Jewish and non-Jewish respondents in the Wisconsin Longitudinal Study. SocArXiv. 2019. DOI: 10.31235/osf.io/eh9tq
  94. 94. Mostafavi H, Harpak A, Agarwal I, Conley D, Pritchard JK, Przeworski M. Variable prediction accuracy of polygenic scores within an ancestry group. eLife. 2020;9:e48376. DOI: 10.7554/eLife.48376
  95. 95. Janssens ACJW. Validity of polygenic risk scores: Are we measuring what we think we are? Human Molecular Genetics. 2019;28:R143-R150. DOI: 10.1093/hmg/ddz205
