Etiological Research of Childhood Acute Leukemia with Cluster and Clustering Analysis

A disease cluster is an aggregation of cases of a particular disease that occurs within a group of people, a geographic area, or a period of time, and whose size is greater than researchers would expect given the natural history of the disease and chance fluctuations. So far, the study of clusters has led to the identification of health problems that have spatial and temporal dimensions.


The investigation of clusters of diseases
According to Elliot and Best, "The study of the geographic patterns of a specific disease is part of the classic triad in descriptive epidemiology characterized for the time, the person, and the place" [1]. These studies are therefore considered part of Spatial analysis in Epidemiology, because they are interpreted in a temporal and spatial context; they should not be confused with ecological studies, as commonly happens.
Space-time clustering studies describe populations in historical and geographical contexts, not individuals or the particularities of a population, such as risk factors. The results of these studies must be interpreted in terms of a period of time and a geographic area. Space-time clustering studies are classified as follows:
1. Spatial cluster. An excess of cases or events observed in a limited geographic area.
2. Temporal cluster. An excess of cases or events observed in a limited period of time. The results of this study must be interpreted chronologically. For example, a group of unrelated individuals whose dates of birth approximately coincide would form a temporal cluster; however, these individuals do not necessarily live in the same geographic area and could live far apart. The individuals in a cluster may coincide in the time of birth, the time of diagnosis, or the period in which they moved to a new city.
3. Space-time cluster. An excess of cases or events in both space and time. In other words, a space-time cluster is observed when cases are geographically close and occur in the same period of time.
The most common designs are spatial cluster studies and space-time clustering studies. Importantly, these studies should not be confused with statistical cluster analysis methods. Both techniques are similar in that they search for groups with shared characteristics. However, their objectives differ: space-time clustering studies search for groups of people in both dimensions, whereas statistical cluster analysis explores the strength of the relationship between words, ideas or interrelated concepts. The distinction between the two methods is blurred further because both are commonly found under the same term [2].
The term spatial dimension refers to the spatial cluster, which investigates associations between individuals distributed in a geographic space. A spatial cluster implies the presence of one or more environmental factors in the etiology of the disease. One example worth mentioning comes from veterinary medicine: Poljak et al published a study of influenza in pigs using the Cuzick and Edwards method. They searched for several clusters but found significant results for only two strains, influenza H3N2 Sw/Col/77 and H3N2 Sw/Tex/98, in an area near a documented region of isolation of avian influenza. From an epidemiological perspective, the source of the spread of these types of influenza in pig herds was an environmental factor. Evidence suggests that the proximity between both types of farms favored the formation of a cluster of swine influenza [3].
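The logic of a nearest-neighbour test of the Cuzick and Edwards type can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the coordinates and case labels are hypothetical, the function names are ours, distance ties are broken by input order, and significance is approximated by randomly permuting the case labels rather than by the analytical approximation of the original method. The statistic counts, for every case, how many of its k nearest neighbours are also cases.

```python
import math
import random


def knn_indices(points, k):
    """For each point, the indices of its k nearest other points (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(others[:k])
    return neighbours


def cuzick_edwards_t(is_case, neighbours):
    """Sum, over all cases, of the number of their nearest neighbours that are cases."""
    return sum(
        sum(is_case[j] for j in neighbours[i])
        for i in range(len(is_case))
        if is_case[i]
    )


def permutation_p_value(points, is_case, k=1, n_perm=999, seed=0):
    """Observed statistic and a one-sided Monte Carlo p-value from label shuffling."""
    rng = random.Random(seed)
    neighbours = knn_indices(points, k)
    observed = cuzick_edwards_t(is_case, neighbours)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if cuzick_edwards_t(labels, neighbours) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: four cases close together, six scattered controls.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (6, 1), (2, 7), (8, 8), (9, 3), (4, 9)]
is_case = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
t_obs, p_value = permutation_p_value(points, is_case, k=1)
print(t_obs, p_value)
```

Because controls are drawn from the same population as the cases, the comparison adjusts for the uneven distribution of the population at risk, which is why this kind of test is suited to case-control data.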
There are specific tests for each scenario. The considerations for determining whether a cluster actually exists depend on the underlying populations. These studies describe the spatial or temporal behavior of a population and support inferences about an area or a period. The findings are then extrapolated to the population under study. One advantage of these results is that they can be explained visually when displayed on a map or a time curve.
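The dependence on the underlying population can be made concrete with a minimal calculation. The sketch below assumes that, in the absence of clustering, the number of cases in an area follows a Poisson distribution whose mean is the person-time at risk multiplied by a reference rate; it then asks how probable it is to observe at least as many cases as were actually counted. All figures and function names are hypothetical and serve only as illustration.

```python
import math


def expected_cases(person_years, reference_rate):
    """Expected number of cases if the area followed the reference rate."""
    return person_years * reference_rate


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cumulative = term
    for k in range(1, observed):
        term *= expected / k  # P(X = k) obtained from P(X = k - 1)
        cumulative += term
    return max(0.0, 1.0 - cumulative)


# Hypothetical figures, chosen only for illustration.
person_years = 50_000 * 5      # 50,000 children followed for 5 years
reference_rate = 4 / 100_000   # assumed cases per child-year
expected = expected_cases(person_years, reference_rate)
p_value = poisson_upper_tail(18, expected)
print(f"expected = {expected:.1f}, P(X >= 18) = {p_value:.4f}")
```

The same count of 18 cases would be unremarkable in an area with several times the population, which is why an excess can only be judged against the expected number and never from the raw count alone.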
Moreover, cluster and ecological studies differ from the classic major epidemiological designs, such as cohort, case-control and cross-sectional studies. However, these designs can be combined to study a particular population. For example, information from a cross-sectional study does not require major changes in order to be used in a cluster study. The main objective of a cluster study is the description of the population, and the principal objectives of a cross-sectional study are sometimes similar.
On the other hand, information collected longitudinally, for instance from a cohort study, can be used for a temporal or space-time cluster study. For example, Gaudart et al used a cohort study to identify clusters, which allowed them to define an epidemiological surveillance tool [4]. The objective of this research was to identify areas of high risk for malaria using a dynamic cohort followed from 1996 to 2001. The researchers identified clusters in the cohort with Kulldorff's technique. Their results identified six clusters of high risk of infection by P. falciparum, and they concluded that detecting clusters is advantageous for generating maps of high risk for malaria. A cohort is characterized by collecting data over a period of time. Nonetheless, the passing of time in collecting and analyzing data is not exclusive to the cohort study: there are cluster analysis techniques able to analyze the effect of time on the formation of clusters, so in cluster studies this condition can also occur with temporal data. Data from a case-control study can also be used to detect clusters. Alternatively, case-control studies are generally used to verify the findings of a cluster study. The techniques that combine case-control studies with cluster studies are currently improving through, for example, new scanning techniques [5]; they can also be used to evaluate the mobility of individuals [6]. At the end of this chapter, we present a study of clusters with information from a case-control study.
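The core of a spatial scan of the kind attributed to Kulldorff can be sketched as follows. This is a simplified illustration, not the implementation used in the malaria study: it assumes a Poisson model, uses hypothetical area data and function names, builds candidate zones by adding the nearest areas around each centroid up to half of the total population, and returns only the zone with the highest log-likelihood ratio. The Monte Carlo step that a full analysis would use to assess significance is omitted.

```python
import math


def poisson_llr(cases, expected, total_cases):
    """Log-likelihood ratio of one zone under the Poisson scan model."""
    if cases <= expected:
        return 0.0  # only zones with an excess of cases are of interest
    inside = cases * math.log(cases / expected)
    outside = 0.0
    if cases < total_cases:
        remaining = total_cases - cases
        outside = remaining * math.log(remaining / (total_cases - expected))
    return inside + outside


def most_likely_cluster(areas, max_pop_fraction=0.5):
    """areas: list of (x, y, population, cases). Returns (best LLR, area indices)."""
    total_pop = sum(a[2] for a in areas)
    total_cases = sum(a[3] for a in areas)
    best = (0.0, [])
    for centre in areas:
        # Grow a circular zone around this centroid, nearest areas first.
        order = sorted(
            range(len(areas)),
            key=lambda j: math.dist(centre[:2], areas[j][:2]),
        )
        pop = cases = 0
        zone = []
        for j in order:
            pop += areas[j][2]
            if pop > max_pop_fraction * total_pop:
                break
            cases += areas[j][3]
            zone.append(j)
            expected = total_cases * pop / total_pop
            llr = poisson_llr(cases, expected, total_cases)
            if llr > best[0]:
                best = (llr, sorted(zone))
    return best


# Hypothetical areas: two neighbouring areas with many cases, three without.
areas = [(0, 0, 1000, 9), (1, 0, 1000, 8),
         (5, 5, 1000, 1), (6, 5, 1000, 1), (10, 0, 1000, 1)]
llr, zone = most_likely_cluster(areas)
print(zone, round(llr, 3))
```

In this hypothetical example the two adjacent high-incidence areas together give a higher ratio than either one alone, which shows how the scan lets the data determine both the location and the size of the most likely cluster.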
Historically, cluster studies were used to find out what caused a heterogeneous distribution of cases; in other words, to better understand why cases were concentrated in a particular space or time. Cluster detection studies are conducted with this question in mind. However, on several occasions researchers have taken the initiative and, without waiting for evidence that cases of a disease tend to aggregate, have investigated whether this phenomenon is part of the natural history of the disease.
This difference is related to the distinction between clusters and clustering. Clustering means the overall propensity of cases to form groups. Clusters, in turn, are excessive groupings of cases, usually in a small and well-defined area and generally with few cases [7]. These two approaches give rise to another classification of such studies: post hoc studies, which are based on the observation of past events, and a priori studies, in which clusters are found as the result of a specific statistical exercise. The a priori investigation of a propensity for clustering is relatively new and can be useful in the interpretation of some post hoc cluster observations.
Post hoc cluster studies take place because of public concern about a possible cluster. The problem is that the cases have been identified through personal knowledge, which may introduce an inherent bias. Indeed, the cases may not even be of the same disease. A priori surveillance schemes systematically monitor a region for geographical or temporal excesses. There are specialized methods for looking for overall clustering (space-time, spatial or temporal) or for finding specific clusters. Cluster studies can also be classified in another way, as expressed in the following chart.
The particularity of cluster analysis is that it shows the heterogeneous spatial distribution of cases, or differences in the occurrence of cases over time. It is generally accepted that the explanation for this heterogeneity lies in the unequal distribution of the causal factors of disease in time and space. The underlying causal factors may vary by location, such as city, country, neighborhood, or rural area, and putative exposures may have changed through time.

Types of clusters
The value of cluster analyses is that they show the hypothetical consequences of any possible factor on the spatial or temporal distribution of disease in the population. A spatial or temporal difference in the incidence of a disease could suggest the presence of an environmental factor; when a cluster is detected, this suggests that an environmental factor may be involved in the development of the health problem. At the level of individuals, genetic factors are important in determining which people get sick. However, when we need to explain disease at the population level, environmental factors and lifestyle have a higher relative weight [8]. Given this conclusion, the question is: what are these factors?

Definition and concepts
Among the various techniques and methods of epidemiology, there is an area called Spatial epidemiology (or Geographical epidemiology). Spatial epidemiology complements the studies considered traditional in epidemiology (cross-sectional, cohort and case-control studies). It has two objectives: 1) to identify the possible risk factors that contribute to the spatial variation of disease, and 2) to highlight unusual groups that may reveal more than is already known through other research channels. In these studies, clusters are defined as groups of persons [1,9].
Furthermore, the study of patterns in the geographical distribution of disease depends on the geographic and temporal scale [1]. It depends on the geographic scale because, for example, in a big city one kilometer can be sufficient to determine the presence of a cluster, whereas if the territory under study is only a section of a city, the assessment may be limited to a few tens of meters. It depends on the temporal scale because, if the health problem results from a very clear and definite exposure, clusters of cases may be observable after a few months. One example is the effect of radiation from Chernobyl on the increased prevalence of Down's syndrome in the children of Belarus [10]. This cluster could be related to radiation exposure and was confined to a single month. Another example concerns studies that monitor acute outbreaks lasting a few days, where a cluster can be detected in a much smaller window of time. For instance, one study conducted in Hong Kong illustrates the spatial and temporal dynamics of human influenza A (H1N1). The authors detected space-time heterogeneity in the incidence of disease and gave a remarkable chronological description of its spread across the territory of Hong Kong. Space-time clusters of people with the disease were detected from the third week until week 22, and the transformation of the clusters was described week by week. Although the researchers evaluated space-time clustering rather than temporal clusters, their results demonstrate how clusters can be detected within periods of a few weeks, or even a few days [11].
A study of clusters (or clustering) may or may not support an etiological hypothesis. The premises of the hypothesis must match the design principles of the study. For example, McNally et al assumed that, according to the hypothesis put forward by Smith [12], a high incidence of acute leukemia in children is linked with an infectious exposure that occurred in utero. Under this premise, a space-time cluster of children with leukemia should have appeared when it was searched for according to place and date of birth, because these agree with the assumptions of Smith's hypothesis. Such a result could have been interpreted as indirect support for the idea that acute leukemia in children is in fact linked with an infectious exposure in utero, prior to disease development. They did not find this result, so the study ended without supporting Smith's hypothesis [13]. This illustrates how the design principles of a cluster (or clustering) study should coincide with the assumptions of a previously stated hypothesis.
The foundation of these studies is derived from other sciences. According to Lawson, Spatial epidemiology concerns the analysis of the spatial-geographical incidence of disease. Spatial epidemiology keeps a close link with Spatial analysis, which forms an entire branch and school of thought within geographical science. Lawson noted that Spatial epidemiology is a field, or discipline, concerned with the use and interpretation of maps of the location of disease cases. He also pointed out that all matters related to the production of maps and the statistical analysis of mapped data deserve study in Epidemiology, and that many epidemiological concepts play an important role in their analysis.
The importance of maps in epidemiological work is clear. However, Spatial epidemiology does not restrict its activity to such cartographic work. There is a detailed set of tasks for Spatial/Geographical analysis in Epidemiology: cluster (or clustering) studies, models of exposure to sources of risk, disease mapping, field surveys of information, ecological analysis, and models of infectious diseases, among other studies [9]. Analyses can be made within related areas, as comparisons, or as analyses of surfaces and areas, of lines, and of points. Each category is further subdivided into more types of studies [14]. Clustering studies look for general patterns in a region, whereas cluster studies have a focused representation. The study of spatial and space-time clusters (or clustering) corresponds to the study of points and areas, and to focused cluster and general clustering studies. Both types of studies have a geographical interpretation. Focused cluster studies seek to detect a very specific cluster, e.g., cases distinctly clustered around a defined point in a territory. General clustering and focused cluster analyses have their own statistical tests. There are also two concepts whose interpretation may be confused: cluster and clustering. A cluster is a group of children that arises in a small and well-defined area and usually has few cases. Clustering means the general propensity of cases to form groups [13].
Tango, in his book Statistical Methods for Disease Clustering [15], classified cluster studies according to the geographical approach to the problem. If the intention is to recognize the occurrence of a conglomerate over a territory and/or a given time, the test is general. If, instead, there is a predetermined point, such as a given event in a defined location, the test is focused. The example of radiation from Chernobyl clarifies this last point: the event is the nuclear disaster, the place is the nuclear plant, and the conglomerate is sought around that point, after that event. The cluster detected in that example was a focused cluster.
A focused study of clusters is a specific study of clusters. In these studies, the location of any cluster on the map, if one exists, is of primary importance. There is a source of exposure at a fixed place in the territory, and the incidence of the disease is known. Any cluster detected should be centered on the sources of exposure under suspicion. Often this relationship is interpreted as a causal interaction, for example between exposure to a nuclear plant and the spatial distribution of cases around it. Indirectly, the results of these studies are interpreted as supporting evidence that an exposure located in one place can generate disease in the exposed population. Tango pointed out a difference: when the location of possible clusters is expected a priori, it is a focused specific study; when the possible location is unknown, it is a non-focused specific study [15].
Lawson, for his part, gives these two approaches other names: general studies of clusters and specific studies [9]. He also defines both concepts in more detail, pointing out the usefulness of the studies from the point of view of a map. According to Lawson, a general study of clusters is the assessment of a map of a complete territory in order to find out whether there are clusters in that place. If there is no cluster, as the null hypothesis proposes, the map should show no difference in the distribution of the disease. The alternative hypothesis should then provide specific mechanisms for understanding the grouping that the map shows; it concerns a preconceived notion of how these clusters arise. Such studies may also be called non-specific, since they are not required to identify where the clusters are located but only intend to identify whether there is a pattern of grouping into clusters.
It is extremely important in Spatial epidemiology that epidemiological considerations, proper to the practice of Epidemiology, are taken into account in all the techniques and methods explained. Elliot and Best stated that differences in the distribution of disease incidence could provide important clues for research on the etiology of disease. Later studies could then be carried out with methods designed to analyze the population at the individual level (e.g., cohort or case-control) [1].
Spatial epidemiological analysis has three peculiarities inherent to the source of its data, which explain the logic of clusters. First, epidemiological analysis has statistics, and in particular spatial statistics, at the core of its method. This is because the data are geo-referenced (located on a map) and may be interlinked as a result of their location. The data can be at the personal level, always associated with a spatial location. Secondly, in Epidemiology spatial data are generally discrete, i.e., they are not quantified on a continuous scale. The occurrence of these phenomena is in part a consequence of previous events, and it also depends on random processes that act independently on individuals; that is, they are stochastic processes. Furthermore, these data (for example, the location of a child with leukemia) behave according to processes associated with discrete probability distributions; put another way, they are processes whose outcomes take clearly separated values. Finally, all the information used in Spatial epidemiology is linked to conventional studies of Epidemiology, which leads to the derivation of models and methods related to spatial analysis [9]. In these studies, the null hypothesis describes the "normal" variation in cases of illness or health problems. It is compared against an alternative hypothesis, which explains the difference in question.
There should be no confusion: this type of study shares many features with other epidemiological studies. For example, with regard to the size of the sample examined, cluster studies also yield less uncertainty when inferences are made from larger samples. A cluster study can also be stratified, with the cases of a disease divided into groups suitable for the research purposes. For example, cluster detection can be done separately for boys and girls, or by age groups that reflect the different susceptibility of some individuals.
The results of a cluster study can also be enhanced by using more variables. These variables need not be included in the cluster detection analysis itself. They can indicate the environmental conditions of the population. For example, one can record the socioeconomic status of the cases studied, the population density of the locality where they live, or the industrial activity there, to name a few. The study outcomes can then be explained in light of these variables, also under the logic of stratified analysis if required. On the other hand, cluster studies are sensitive to data quality. If the database underlying the analysis is not of good quality, the final conclusions should take these shortcomings into account, or the analysis should not be started at all. Other deficiencies that affect the validity of the results include underestimation resulting from misdiagnosis or from diagnoses made by different methods. Also, excessive division of the study population into groups or strata can fragment the sample into sets too small to be analyzed with sufficient statistical power. Similarly, these studies are not free from the danger of bias, whether by chance or through systematic error.

Limitations
From the epidemiological point of view, two important mistakes can occur in these studies: the fallacy of aggregation and the ecological fallacy. These concepts are defined as follows:
1. Fallacy of aggregation: the misapplication of a causal explanation to the individual level when the relation was observed at the group level. It is considered a kind of ecological fallacy [16].
2. Ecological fallacy: it has two meanings. The first is very similar to the aforementioned fallacy of aggregation and is sometimes taken as synonymous with it. The second, more detailed, defines it as an error of inference due to the failure to distinguish between different levels of organization. "A correlation between variables that are based on characteristics of an group, not necessarily reproduced between variables based on characteristics of individuals; an association that is given at a level could be gone in the other, or even be reversed" [16].
In other words, unless they are verified by other research designs, the conclusions should be limited to population-level inferences. For instance, when a cluster study is used in conjunction with a case-control study, the hypotheses it generates may have much more immediate applications, such as the identification of possible risk factors [17]. Otherwise, generalizing the results of a cluster study to individuals is not warranted.
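A small numerical illustration may help to show how an association seen at the group level can be reversed among individuals. The data below are entirely hypothetical and were constructed for this purpose: across the three areas, mean exposure and mean outcome rise together, yet within every area the individuals with higher exposure have the lower outcome.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (exposure, outcome) pairs for three individuals in each area.
areas = {
    "A": [(0, 2), (1, 1), (2, 0)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(8, 10), (9, 9), (10, 8)],
}

# Group level: one point per area, using the area means.
mean_exposure = [sum(x for x, _ in people) / len(people) for people in areas.values()]
mean_outcome = [sum(y for _, y in people) / len(people) for people in areas.values()]
group_r = pearson_r(mean_exposure, mean_outcome)

# Individual level: correlation among the individuals of each area.
within_r = {
    name: pearson_r([x for x, _ in people], [y for _, y in people])
    for name, people in areas.items()
}
print(group_r, within_r)
```

An analyst who saw only the area means would conclude that exposure and outcome increase together, which is precisely the inference about individuals that the data do not support.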
A space-time cluster study has the advantage that associations between cases that are close in both time and space can help explain the coincidence of leukemia cases in a given place and at a given time. This feature is also characteristic of infectious events. By contrast, when only spatial clusters are considered, their possible causes can potentially only be environmental, and an infectious cause is harder to consider. The explanations behind their formation must be sought strictly in spatial terms, which are less specific.
Another limitation is the lack of consistent information between the outcomes of similar studies. Sometimes it is necessary to work with this drawback, in addition to the influence of bias. This bias can be reported and noted, provided that its effect is somewhat predictable. For example, results could be biased by the lack of data for workers who arrived in the study area before 1983 and by the lack of information from other data sources [18].
Another limitation lies in the geographical and historical principles that the researcher must apply when making interpretations. Analyzing retrospective data carries uncertainty. For example, mortality data for children with acute leukemia appear to be prone to bias and may not be appropriate for cluster studies, because the interval between the date of illness onset and the date of death varies among cases, and because people are likely to move between onset and death [19].
If one is not careful, the unit of analysis can change from the population to the territory. In itself this is not bad, but the two kinds of evidence should not be combined. For example, Bellec et al performed a spatial analysis instead of a population analysis, and their results are properly interpreted with a spatial logic focused on the areas under study [20]. Only afterwards was their meaning for the population explained. Related challenges to be addressed soon include studying areas that are too small, studying areas that are too large, and using biomarkers to verify risk factors. An example of the limitations of geographical and historical boundaries is the definition of significant results in both the spatial and temporal dimensions. The danger is that the chosen limits can lead away from the expected results, so that several other cases appear to be linked. The Knox test shows this disadvantage [21].
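The way the Knox test depends on limits chosen by the analyst can be seen in a short sketch. The statistic is the number of pairs of cases that are close in both space and time, where "close" is defined by a critical distance and a critical time interval that must be fixed beforehand. The sketch below uses hypothetical data and function names, and assesses significance by randomly reassigning the times among the locations, which is one common Monte Carlo approach and an assumption of this illustration.

```python
import itertools
import math
import random


def knox_statistic(locations, times, space_limit, time_limit):
    """Number of case pairs that are close in both space and time."""
    count = 0
    for i, j in itertools.combinations(range(len(times)), 2):
        close_in_space = math.dist(locations[i], locations[j]) <= space_limit
        close_in_time = abs(times[i] - times[j]) <= time_limit
        if close_in_space and close_in_time:
            count += 1
    return count


def knox_permutation_test(locations, times, space_limit, time_limit,
                          n_perm=999, seed=0):
    """Observed Knox statistic and a one-sided Monte Carlo p-value."""
    rng = random.Random(seed)
    observed = knox_statistic(locations, times, space_limit, time_limit)
    shuffled = list(times)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between place and time
        if knox_statistic(locations, shuffled, space_limit, time_limit) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical cases: three close in space and time, three isolated.
locations = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (20, 0), (0, 20)]
times = [1, 2, 3, 50, 100, 150]  # e.g. day of diagnosis
observed, p_value = knox_permutation_test(locations, times,
                                          space_limit=1.0, time_limit=5)
print(observed, p_value)
```

Rerunning the example with a larger `space_limit` or `time_limit` changes which pairs count as close, and with them the result, which is the disadvantage described above.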
Another test, known as the Moran test, illustrates the effect of spatial limits and how they can bias correlation tests. Rogerson's method indicates something similar. These problems could be due to the difficulty of finding differences in a geographic area that is too small or too large. This is a disadvantage in settings such as a city, where analysis with very large spatial limits could lead to finding enormous variability; on the other hand, for diseases that are rare in the general population, like childhood leukemia, finding a cluster in a very large area will be difficult. For this reason too, it is preferable to work with the existing cases and to choose a convenient method or a combination of methods.
Finally, not all studies succeed in detecting complete clusters. This is due in part to the use of the wrong method or to an inadequate sample size. This has been reported by Glass et al [19] and Bellec et al [20]. The work of Birch et al makes a similar warning [21].
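The role of spatial limits in the Moran test can be illustrated with the usual formula for Moran's I, in which the definition of which areas count as neighbours enters through a weights matrix. The sketch below is a minimal illustration with hypothetical values for four areas arranged in a line, where only adjacent areas are treated as neighbours; both the data and the neighbour definition are assumptions of the example.

```python
def morans_i(values, weights):
    """Moran's I for area values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    sum_of_squares = sum(d * d for d in deviations)
    return (n / total_weight) * cross_products / sum_of_squares


# Four areas in a line; each area neighbours only the adjacent ones.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

grouped = morans_i([1, 1, 5, 5], chain_weights)      # similar values side by side
alternating = morans_i([1, 5, 1, 5], chain_weights)  # dissimilar values side by side
print(grouped, alternating)
```

Positive values indicate that neighbouring areas tend to have similar incidence and negative values that they tend to differ. Changing the weights matrix, for example by treating areas two steps apart as neighbours, changes the value obtained from the same data.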

Studies of clustering and clusters of children with leukemia
An old idea has always troubled the minds of those who investigate the causes of childhood leukemia. Research on the etiology of childhood acute leukemia has a history of about 100 years, without finding a factor that fully explains the origin of the disease. For example, this research has identified two risk factors for the development of leukemia: intense ionizing radiation and certain congenital genetic syndromes, which account for only 10% of cases [22]. Amin pointed out one more factor: exposure to chemotherapeutic agents. However, these factors explain only a small percentage of all cases of leukemia.
He presented several studies that examined the complexity of other risk factors: "early-life exposures to infectious agents, parental, fetal, or childhood exposures to environmental toxins, parental occupational exposures to radiation or chemicals; parental medical conditions during pregnancy or before conception; maternal diet during pregnancy, early postnatal feeding patterns and diet, and maternal reproductive [...]" [23]. Finally, he added that environmental factors may play a role in cancer development in children, and that too many cases concentrated in one geographical area, a cluster, could be evidence of that. Such sets of studies have supported the etiological investigation of childhood leukemia, especially acute lymphoblastic leukemia. In them, attention has been paid to various causes, such as ionizing radiation, contaminated water, the petrochemical industry, and exposure to agrochemicals [24]. The least controversial idea is that clusters are evidence that environmental factors are generally involved in the development of cancer, including childhood leukemia.
Ward, for example, proposed an infectious theory of childhood acute leukemia as early as 1917 [25]. It is a very old hypothesis that, to this day, has been neither proven nor disproved. The study of clustering and clusters is often used to support the idea that infections are behind the onset of childhood leukemia. A cluster of children with leukemia has been interpreted as evidence that infections during the children's lifetime are implicated in the development of the disease [7]. Tango, in 2010, said "in the search for evidence of whether a disease such as leukemia, is indeed an infectious disease and, therefore, a viral etiology, the focus will be on whether the cases are grouped into clusters" [15]. Specifically, concerning the analysis of space-time clustering, McNally and Eden clarified that if infections were implicated in the etiology of childhood leukemia, then the geographic distribution of these children may show evidence of clustering in space and, under certain conditions, may also show space-time clustering [26].
In fact, the presence of a cluster of children with leukemia has been interpreted in a broader and more diverse sense. A cluster of children with leukemia would suggest that one or more factors, not just infection, are involved in the origin of the disease. As already mentioned, genetic predisposition may also play a role. It is very important to reiterate that the results of such studies should be interpreted with special care. Cluster studies by themselves do not reveal causal agents in the sense of identifying the risk factors involved in a health problem. Studies of clustering and clusters search for relationships that are implicit in a theory of the etiology of a disease, but they are not used to make formal determinations of the risk factors implied. Hence, the inferences drawn from the results of these studies are not without controversy.
In a literature review of the last fifteen years (1997-2012), more than 20 publications were found, of which 18 focused on acute lymphoblastic leukemia (ALL). Two questions have repeatedly been posed for cluster studies to answer: the first is whether infections play an important role in the development of childhood ALL, and the second is whether environmental risk factors that are not well specified influence disease onset. However, some researchers have considered infection a type of environmental exposure, and therefore the two terms may have been used interchangeably in some research.
The objectives that want to answer research questions of cluster studies are several, for example, the objectives can be descriptive, or may be the result of pose a hypothesis "a priori". Of the 18 studies conducted in children with ALL, eight had a descriptive pur-Clinical Epidemiology of Acute Lymphoblastic Leukemia -From the Molecules to the Clinic pose, proposed a methodological improvement, or only sought to detect at least one cluster in the territory studied [12,[26][27][28][29][30][31]. Also within the objectives of the studies of clusters there are some that are rare but very interesting, such studies have been conducted on exploring the presence of risk factors that could potentially be common between different types of childhood cancer [33]; or studies that are even more specific to look for the presence of a cluster of children, within a particular subtype of leukemia (pre-B ALL) [21]. Moreover cluster studies conducted so far, to consider since the start of the investigation. By contrast, there are publications about where it was proposed as a risk factor that could potentially lead to the development of leukemia, which allowed those jobs also propose a working hypothesis or also called "a priori" with the sole purpose of find scientific evidence [7,17,24,[34][35][36][37][38]. As mentioned above, one of the risk factors considered most important, proposed and used in different studies, is related to the role of infections in the development of leukemia, resulting in the so-called infectious hypothesis [17,[34][35][36]39]. Another risk factor that has been considered and that is relevant to this issue is the so-called environmental risk factor (unspecified) [7,37]. For example, in a study conducted by Petridou et al, in the year 1997 was referred on that environmental factors may impact the development of leukemia, but in this study did not specify these environmental factors nor their relationship in a given moment with infections [38].
According to the risk factor that their findings point to, the cluster studies conducted so far can be divided into three groups: 1) studies that point to infections [17,21,24,29,30,33,34,36,37]; 2) studies that propose an unspecified environmental factor [27,38]; and 3) studies that consider both infections and an unspecified environmental factor [7,13,28].
Likewise, cluster studies have generated new knowledge about how to conduct such studies [21,31], and have even prompted subsequent original studies designed to confirm previous findings [32,35].
A longstanding debate concerning the definition, existence, frequency and interpretation of clusters of childhood leukemia (CL) remains unresolved [40]. In 2009, zur Hausen et al. stated that it is very difficult to accept that mere clusters of cases of leukemia, Hodgkin lymphoma, or even multiple myeloma indicate that these tumors originate in systemic infections. It has been proposed that viral infections should appear within a geographical area in a random spatial pattern. The same author argued that simple infectious processes do not by themselves lead to cancer onset, since further virus-induced modifications of the genome are needed at the cellular level. However, he also considered the possibility that clusters may be the expression of a mutagenic factor in a family or, probably, in a small community or region exposed to that factor [41].
The debate is not over. Controversy over the usefulness of cluster studies for understanding leukemia in children is constant. "Most clusters do not have evidence for obvious, prolonged and biologically plausible exposures. The etiology is obscure or unknown. Very rarely they can lead to a better understanding of causation, usually in situations with well-documented and heavy local contamination" [42]. There are intermediate positions, which neither affirm nor rule out the possibility of a cluster. Law, in 2008, stated that "[...] it is difficult to see how these clusters provide evidence for infectious disease being involved in the etiology of childhood leukemia" [43]. But he clarified that the positive results of a cluster study are used, in fact, as a proxy for a possible etiology. In addition, he relied on evidence generated by other studies, such as the peak in incidence between 2 and 5 years of age, the increase in incidence over time, and seasonal variations in the incidence of leukemia. These findings have been identified as potentially indicative of a role for infections [43].
Perhaps the results of space-time clustering studies leave less room for doubt when they are interpreted, possibly because they are tightly constrained by the two dimensions that these techniques examine. It must be remembered that this type of study seeks to distinguish clustering patterns of cases in time and in space simultaneously. In 2006, McNally, Alexander and Bithell began a study in the hope of detecting space-time clustering of children with cancer in the United Kingdom. They predicted that if an antecedent infection was involved in cancer development, clusters of children with cancer should appear in the territory of the island, provided that the infection (either viral or bacterial) was not ubiquitous or endemic; only then could there be differences in the spatial or temporal dimensions. The authors clarified, however, that this would only occur when the delay between exposure and cancer diagnosis was short, or at least relatively constant. In the first scenario, they wanted to test whether some childhood cancers may be a rare response to infection, a possibility that the timing of disease onset could mask [37].
Furthermore, McNally et al suggested that infections may act in the etiology of certain types of childhood cancer. They also relied on the fact that several studies conducted elsewhere in the world had found space-time clustering of childhood acute leukemia, especially acute lymphoblastic leukemia, and had reached similar conclusions. They then began to change their language, from infections to environmental exposures. Three years later, in 2009, they presented a retrospective reflection on the study mentioned in the preceding paragraph and described it in the following words: "These findings provided support for the involvement of environmental agents in etiological processes occurring close to diagnosis" [13]. In effect, they confirmed a shift toward extending suspicion to environmental factors as a whole, rather than infection alone.
That study found space-time clustering of children with various types of cancer, including leukemia. The children had been registered in a database from 1969 to 1993. Clusters were searched for in two ways. The first looked for clusters of children who coincided in both place of residence at birth and date of birth. The second looked for clusters according to place of birth and time of diagnosis, regardless of date of birth. Each result has a different interpretation. By residence and date of birth, there were clusters of cases of Hodgkin lymphoma and central nervous system tumors. By residence at birth and date of diagnosis, there were space-time clusters of leukemia cases (specifically among children aged 1 to 4 years) and also of Wilms tumor and Hodgkin lymphoma.
The interpretation for each type of cancer has its nuances, but the general meaning of space-time clusters defined by the above variables is as follows. When clusters of cases coincided in both time and place of birth, the interpretation given is that this coincidence supports the possible involvement of infections in the etiology of the disease studied. In particular, the space-time clusters of children with Hodgkin lymphoma suggest that a relevant etiological exposure occurred among children at similar ages after birth, or in utero. Moreover, since the dates are the time of birth and diagnosis, this would indicate that there was a heterogeneous latent period between the time of exposure and the time of diagnosis. Clusters of children with central nervous system tumors were interpreted with caution. According to the authors, this finding only strengthens the possibility that infections are implicated in the development of these tumors.
On the other hand, when the space-time clusters matched place of birth and date of diagnosis rather than date of birth, the conclusion was different. For example, clusters of cases of non-Hodgkin lymphoma (NHL) suggest that the exposure leading to the onset of this disease occurred at similar stages before diagnosis. Childhood leukemia received a special mention. According to McNally et al, in a previous study [37] the authors had found clusters of children with leukemia by place and date of diagnosis, whereas in the study cited here they found clusters by place of birth and date of diagnosis. In neither study were clusters of children with leukemia found by place and date of birth. Again, the results supported the infectious hypothesis of childhood leukemia. The result comes with a restriction: although the children with leukemia were between 1 and 4 years of age, in the first study the finding held only among children with ALL, the most common type, whereas in the second study it was found among children with acute leukemia overall.
As for leukemia clusters in children specifically, McNally and colleagues reported clusters when these were sought by time and place of birth [21]. Mulder et al found clusters around environmental factors: the cases were grouped when analyzed by exposure to petroleum products and pesticides, and the authors even found a link with having swum, in previous years, in a pond contaminated by a petrochemical spill [44]. Petridou's team found a correspondence between the ages of cases and their place of residence: in urban areas there were clusters of children aged 0 to 4 years, whereas in rural areas clustering appeared at older ages, 5 years and over [45]. In England, clusters were found both by date and place of birth and by date and place of diagnosis of the same group [17]. Gustafsson and Carstensen, in contrast, found no clustering by place and date of diagnosis, but did find it by date and place of birth [29]. Gilman reached similar results, finding clusters by date and place of diagnosis and by date and place of birth, and was in addition able to rule out clustering for solid tumors [46]. Finally, Alexander sought and found clusters using variables defined from another perspective: susceptible cases and infected cases [47].
There are also studies of clustering and clusters with negative outcomes. In France, Bellec found no clusters in a study based on data from the national registry, even though many different cluster detection methods were used: Potthoff-Whittinghill, Moran, Knox and Kulldorff [20]. Dockerty and colleagues [48], and Alexander et al [49], found nothing when the analysis relied only on raw geographic, demographic and diagnostic data. There are certainly limitations to this type of study; in particular, it has been warned since 1970 that accumulating cases over a long period could lead to the detection of artifactual clusters [50].
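Of the methods named above, the Knox test is the simplest to illustrate. The following is a minimal sketch, not the implementation used in any of the cited studies: it counts pairs of cases that are close in both space and time, and assesses significance by randomly permuting the dates. The function names, the thresholds `d_max` and `t_max`, and the sample data are all hypothetical; the date may be a date of birth or a date of diagnosis, depending on the question asked.

```python
import math
import random


def knox_statistic(xy, t, d_max, t_max):
    """Count pairs of cases that are close in BOTH space and time."""
    count = 0
    for i in range(len(xy)):
        for j in range(i + 1, len(xy)):
            close_space = math.dist(xy[i], xy[j]) <= d_max
            close_time = abs(t[i] - t[j]) <= t_max
            count += close_space and close_time
    return count


def knox_p_value(xy, t, d_max, t_max, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffle the dates among the fixed locations."""
    rng = random.Random(seed)
    observed = knox_statistic(xy, t, d_max, t_max)
    times = list(t)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(times)
        if knox_statistic(xy, times, d_max, t_max) >= observed:
            exceed += 1
    # The observed data set counts as one of the n_perm + 1 arrangements.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical residences (km) and dates (days); not real data.
    homes = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 5), (20, 3)]
    dates = [0, 30, 400, 420, 900, 1500]
    print(knox_p_value(homes, dates, d_max=2, t_max=60))
```

The choice of the two thresholds is left to the analyst, and the result can be sensitive to it, which is one reason such studies often report several threshold combinations.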
The infectious etiology of acute leukemia has also been examined using other study designs. Wartenberg et al, in 2004, studied the infectious origin of leukemia by testing a hypothesis developed for this purpose (the Kinlen hypothesis). The study was ecological. In this type of study, inferences apply only at the aggregate level, and categorical causal statements cannot be made about the population being studied [51]. Another ecological study, carried out by Knox [52], revealed an association between the prevalence of childhood cancers (including leukemia) and the geographic distribution of air pollutants. A third study, also ecological [53], compared cases of different cancers (at different ages) in a city in Wales. The hypothesis was that the prevalence of cancers would decrease with increasing distance from a source of pollution (the petrochemical industry). There was an inverse relationship between distance and the incidence of some cancers, but not the incidence or mortality of leukemia.
The relationship between environmental factors and the development of cancer, including leukemia, has been extensively studied, and the outcomes, far from discouraging the search, prompt further investigation.
The studies by Lehtinen et al [54], Bogdanovic et al [55], Roman et al [56] and Gilham et al [57] looked for associations between viral agents and the development of leukemia. These studies used a case-control design. Lehtinen examined in utero infection with Epstein-Barr virus and human herpes virus and found no significant results. Bogdanovic analyzed the relationship between Epstein-Barr virus and its reactivation in the mother, showing a possible association with childhood acute lymphoblastic leukemia. Roman found positive relationships between the incidence of diseases caused by these viruses and leukemia. Gilham hypothesized that greater exposure to infection, as when the child attends day care, was protective against childhood leukemia; the results led to the conclusion that reduced exposure to infections during the first months of the child's life increases the risk of developing acute lymphoblastic leukemia.
The study of clusters suggests possible etiologies. Its value is that the putative risk factors can then be assessed in more detail. For example, cluster analysis is often combined with a case-control study. This gives more relevant results, because the relationships between risk factors and disease are measured [44,48,58-61]. In addition, different cluster detection techniques can be applied to the same data and their results compared [20,62,63].
When spatial distributions are unusual, it is possible to speculate on the etiological implications [49]. A study of clusters can evaluate multicausal factors. Bellec's findings supported the hypothesis that a community's geographic isolation and low density, possibly combined with the mixing of different populations, may play an important role in the etiology of leukemia. It would not have been possible to consider geographic isolation as a risk factor using any other technique. However, the same author suggests that a new research question should be whether this phenomenon is specific to one age group or diagnosis, and proposes that future statistical models could allow further investigation and better understanding of these findings, especially with respect to the role of population density and population mixing [20].

Analysis of data from Mexico City
Acute leukemia among children in Mexico City has been studied for over a decade. From these studies, we know that its incidence is among the highest in the world. In 2011, Pérez Saldívar et al reported an incidence of 57.6 cases/million children [64]. This high incidence, coupled with the large population of the city (more than eight million in 2010 in the territory of the Federal District, and more than 20 million people in the metropolitan area of Mexico City in the same year), translates into more than 200 children with leukemia each year. These children and adolescents are treated in nine tertiary hospitals, the highest rank of specialist medical care in the Mexican health system. It has been estimated that these hospitals serve approximately 97.5% of the cases of childhood acute leukemia in Mexico City [65].
For over ten years it has been suspected that the spatial distribution of children with leukemia in Mexico City is heterogeneous. In 2000, a descriptive longitudinal study conducted by researchers at the Instituto Mexicano del Seguro Social [66] found morbidity standardized rates (MSR) that suggest a spatial concentration of cases of childhood leukemia. For acute lymphoblastic leukemia (ALL), the MSR were highest in the south of Mexico City; for acute myeloblastic leukemia (AML), the MSR were highest in the west. In addition, further investigations revealed some striking coincidences: some patients with acute leukemia (AL) were immediate neighbors, suggesting that an environmental factor could promote the development of the disease [67]. It was therefore decided to carry out a study of spatial clustering in Mexico City, to test the hypothesis that children with leukemia are grouped into clusters that reflect the aforementioned heterogeneous spatial distribution.
Information was extracted on individuals aged 0-14 years, diagnosed with ALL between 2006 and 2007, who were resident in Mexico City (Federal District). A total of 224 incident cases were identified. We also included 224 children (controls) without leukemia, other cancers, genetic malformations or asthma. The controls were matched by sex, age and health institution of origin. We located, with an accuracy of 0.1 km, the home addresses where the children resided at diagnosis of leukemia or, for those without the disease, at the time of the interview. The cases were recruited between 2006 and 2007, whereas the controls required a much longer period, from 1998 to 2011. Because of this discrepancy, we did not attempt to detect clusters in the temporal dimension. We used Kulldorff's scan statistic, based on a Bernoulli model, to identify individual clusters. The complete study region was scanned with a two-dimensional circular window. The window size was varied so that at most it included 50% of the entire geographical area. The variable circular window is centered on the geo-reference of each case [68]. The method has been used previously in an analysis of leukaemia in Sweden [69]. Statistical significance (P<0.05) was evaluated using one-sided tests and 99999 simulations. Only one large statistically significant cluster was identified (see Fig. 2) (O=98, E=74.66, O/E=1.313, p=0.01325). There were no statistically significant secondary clusters. This finding indicates that locally varying environmental factors may be implicated in the origin of ALL in Mexico City. However, the possibility that chance played a role cannot be excluded.
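The scanning procedure described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the SaTScan code used in the study: the function names are hypothetical, the window is capped at a fraction of the children scanned (`max_frac`) rather than of the geographical area, ties in distance are ignored, and the data are synthetic. It uses the standard Bernoulli log-likelihood ratio for a window and a Monte Carlo rank for the p-value.

```python
import math
import random


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return x * math.log(y) if x > 0 else 0.0


def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for a window holding c cases among n children,
    when the whole study has C cases among N children (cases + controls)."""
    if n == 0 or n == N:
        return 0.0
    p_in, p_out = c / n, (C - c) / (N - n)
    if p_in <= p_out:  # one-sided: only windows with an excess of cases count
        return 0.0
    p0 = C / N
    ll_alt = (_xlogy(c, p_in) + _xlogy(n - c, 1 - p_in)
              + _xlogy(C - c, p_out) + _xlogy(N - n - (C - c), 1 - p_out))
    ll_null = _xlogy(C, p0) + _xlogy(N - C, 1 - p0)
    return ll_alt - ll_null


def max_llr(points, is_case, max_frac=0.5):
    """Grow a circle around each case and return the largest LLR found."""
    N, C = len(points), sum(is_case)
    limit = int(max_frac * N)
    best = 0.0
    for (cx, cy), flag in zip(points, is_case):
        if not flag:
            continue  # windows are centred on cases only
        order = sorted(range(N), key=lambda i: math.hypot(
            points[i][0] - cx, points[i][1] - cy))
        c = 0
        for n, i in enumerate(order[:limit], start=1):
            c += is_case[i]
            best = max(best, bernoulli_llr(c, n, C, N))
    return best


def scan_p_value(points, is_case, n_sim=99, seed=0):
    """Monte Carlo p-value: shuffle case/control labels over the locations."""
    rng = random.Random(seed)
    observed = max_llr(points, is_case)
    labels = list(is_case)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(labels)
        if max_llr(points, labels) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    pts = [(rng.random(), rng.random()) for _ in range(40)]
    # Synthetic labels: children in the eastern half are more often cases.
    labels = [rng.random() < (0.8 if x > 0.5 else 0.2) for x, _ in pts]
    print(scan_p_value(pts, labels))
    print(round(98 / 74.66, 3))  # observed/expected ratio reported in the text
```

With 99999 simulations, as in the study, the smallest p-value this ranking scheme can return is 1/100000, so the reported p = 0.01325 is well within its resolution.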
Although we cannot exclude the effect of chance on these results, nor forget that the data collection periods for cases and controls differ, these results are remarkable. For comparison, in a study conducted in Ohio [62], also carried out with SaTScan, the most significant result was found for the group of children aged 10-14 years, with p=0.33, and only three cases formed part of the cluster. When all the data were analyzed together, including all types of leukemia and all ages, the most likely cluster had 43 cases, with p=0.81. As we can see, none of these clusters is statistically significant. The author of the paper commented that these results were not entirely surprising, considering that the study area (the state of Ohio) is very large and diverse. He argued that in a large area it is doubtful that a particular risk factor will have a consistent and sustained effect across space, so a cluster is unlikely to be seen in a large geographical area. In another study, Wheeler reached the same conclusion with another argument: the results are consistent with the worldwide literature, because it is difficult to find statistically significant clusters. Wheeler used several cluster tests (K-function, Cuzick and Edwards', kernel intensity function) plus SaTScan, without finding significant associations with any of them.
Apparently, cluster analysis is most effective for small-area spatial analysis [31], on the scale of a city, and not in an area as large as a U.S. state or an entire country. Goujon-Bellec et al concluded that very few cluster detection techniques have enough power to scan large areas, such as a large country like France [70]. It has also been observed that when the unit of geographic analysis is very large, such as an aggregate of municipalities (about 30 × 30 km) or cantons, the sensitivity of a technique for detecting clusters is diminished; it is as though the proximity between the cases is "diluted" among these geographical units, and the cluster simply "does not appear". Studies in Germany [7] and France [20] seem to confirm this. Therefore, the difference in surface area between Ohio, in the United States, and Mexico City (116,096 km² vs. 1,485 km², respectively) could be one reason why we detected a cluster in the city. However, a cluster with this number of children (98 cases) and a value of p = 0.01325 is not commonly reported. In addition, the data collection period for the cases was two years, against eight in the Ohio study.
When the geographic location of the children who make up the cluster is mapped, it can be seen that this spatial cluster lies to the east of Mexico City (Federal District), slightly to the northeast. Indeed, about half of all the children with leukemia investigated are part of this cluster. The cluster clearly excludes the west and southwest of Mexico City. The area it covers includes the territory of several boroughs of the Federal District. The hypothetical explanations for this cluster are the following.
Industrial establishments located in a child's environment have been considered risk factors for developing cancer. Sans et al, in 1995, expected the prevalence of cancers, including childhood acute leukemia, to decrease as people lived farther from a petrochemical plant in Britain [53]. According to data from the National Institute of Statistics, Geography and Informatics (INEGI by its Spanish acronym), the two leading boroughs by number of economic units of industrial activity are Azcapotzalco and Miguel Hidalgo. In fact, these two areas have formed the most industrialized landscape of Mexico City for over half a century. They housed an old oil refinery, an auto plant, dozens of railways and many other industries. Paradoxically, these boroughs are not part of the cluster found in the east of Mexico City. At first glance, this suggests that the spatial cluster detected in this study is not related to large industrial facilities. On closer inspection, however, the phenomenon probably does bear a relationship to industry, but a more nuanced one. In third place by number of industrial establishments is the Iztapalapa borough, which lies within the area of the spatial cluster. When Azcapotzalco and Miguel Hidalgo are compared with Iztapalapa, the average number of personnel employed in industrial establishments is very different. Miguel Hidalgo and Azcapotzalco have 32.01 and 31.29 workers per economic unit, respectively, whereas Iztapalapa has, on average, 11.19 workers per industrial unit. This suggests that the facilities most prevalent in the detected spatial cluster include small establishments such as family workshops. If so, work in these workshops may be an important parental exposure.
Another consideration is air pollution, a risk factor that has already been studied. Knox suggested an association between the prevalence of cancer and the geographical distribution of air pollutants [71]. According to the National Institute of Ecology (INE by its Spanish acronym), air pollution in Mexico City causes 4,000 premature deaths per year and 2.5 million lost work days [72]. EMBARQ states that, with about 18 million people and 6 million cars, Mexico City's metropolitan area is one of the largest and busiest urban areas in the world. Around 600 new cars come into service each day, and just over 300,000 cars were sold in 2007. Most alarming is that, according to the same source, fewer than 4% of vehicles (trucks and buses) generated about 70% of air pollution in 2002. The other 30 percent is attributed to factories, small cars and motorcycles [73]. Air pollution is thus mainly attributed to heavy vehicles. The smog of Mexico City is concentrated in the southern part of the city, whereas the children in the cluster are located mainly to the east. With the evidence found, it is difficult to assert that the spatial cluster is due mainly to the atmospheric concentration of pollution in the city. If it were, we would have expected a spatial cluster in the south-southwest of Mexico City rather than in the east.
In an earlier study, which used information from the same database as this study, we measured the relationship between socioeconomic status and the development of childhood acute leukemia. The study, conducted by Pérez-Saldívar et al, considered the problem according to three indicators [64]. The first is indirect and only represents the existing agricultural activity in each delegation of Mexico [74]. The second used the information developed by the United Nations to measure human development, the Human Development Index by municipality, published for Mexico in 2005 [75]. The third was the average number of people per household [76]. The results only showed a relationship between the incidence of ALL and the number of people per household, in pre-B ALL. No significant relationship was found for either of the other two indicators. In the present study, the detected cluster is located in an area of relatively low economic level, with several boroughs affected by poverty and a higher number of people per household. However, with the information available, we cannot establish a relationship between socioeconomic status and the cluster reported in this study. Further studies are needed to investigate this point.
In summary, the investigation of spatial clusters in geographic areas of relatively small size, such as a city with a high population density and a high incidence of childhood acute leukemia, favors the detection of clusters.
Unfortunately, we do not have a longer register of children with leukemia, so the interpretation of the results is inconclusive. Even so, the results of this study support the involvement of environmental factors in the development of childhood acute lymphoblastic leukemia.
these studies, we know well that its incidence is among the highest in the world. In 2011 we reported an incidence of 57.6 cases/million children [64]. This high incidence, coupled with the large population of the city-more than eight million in 2010 in the territory of the Federal District, and more than 20 million people in the metropolitan area of Mexico City, in the same year-are expressed in the more than 200 children with childhood leukemia incidents each year. These children and adolescents are treated in nine tertiary hospitals, the highest rank of specialist medical care in the Mexican health system. It has been estimated that these hospitals serve approximately 97.5% of the cases of childhood acute leukemia Mexico City [65].
For over ten years it was suspected that the spatial distribution of children with leukemia in Mexico City was heterogeneous. In 2000, in a descriptive longitudinal study, conducted by researchers at the Mexican Social Security Institute [66], were found morbidity standarized rates (MSR), that suggest a spatial concentration of cases of childhood leukemia. For acute lymphoblastic leukemia (ALL), the MSR were highest at south of Mexico City; for acute myeloblastic leukemia (AML), the MSR were highest in the west. In addition, from further investigations, some matches attracted the attention: it was found that among patients with acute leukemia (AL), some of them were immediate neighbors, suggesting that behind the development of the disease; there is an environmental factor that could promote it [67]. It was, therefore, decided to make a study of spatial clustering in Mexico City, to confirm the hypothesis that children with leukemia are grouped into clusters that reflect the aforementioned heterogeneous spatial distribution.
Information was extracted on individuals aged 0-14 years, diagnosed with ALL between 2006 and 2007, who were resident in Mexico City. A total of 224 incident cases were identified. We also included 224 children without leukemia, nor other cancers, genetic malformation or asthma. Were matched by sex, age and health institution of origin. We located the addresses of the homes, with an accuracy of 0.1 km, from where children residing at diagnosis of leukemia, or the time of the interview, in those without the disease. The cases were collected between 2006 and 2007, and for the controls needed a much larger period, from 1998 to 2011. Due to this discrepancy, we discard the detection of clusters according to the temporal dimension. We used Kullorff´s scan statistic based on a Bernoulli model to identify individual clusters. The complete study region was scanned by construction of a two-dimensional circular window. The window was varied so at most it included 50% of the entire geographical area.
The variable circular window is centered on the geo-reference of each case [68]. The method has been used previously in an analysis of leukaemia in Sweden [69]. Statistical significance (P<0.05) was evaluated using one-sided tests and 99999 simulations. Only one large statistically significant cluster was identified (see Fig. 2) (O = 98, E = 74.66, O/E = 1.313, P = 0.01325). There were no statistically significant secondary clusters. This finding indicates that locally varying environmental factors may be implicated in the origin of ALL in Mexico City. However, the possibility that variable levels of ascertainment may play a role cannot be excluded.
Although we cannot exclude the effect of chance on these results, or forget the fact of that the periods of data collection between cases and controls are different, these results are remarkable.
For comparison, in a study conducted in Ohio [62], also carried out with SaTScan, the most significant data were found for the group of children aged 10-14 years, with a value p = 0.33, and only three cases formed part of cluster. When all the data were analyzed together, including all types of leukemia and all ages, the most likely cluster had 43 cases, with p = 0.81. As we can see, none of the clusters is statistically significant. The author of the paper commented that these results were not entirely surprising, considering that the study area is very large and diverse (the state of Ohio). He argued that in a large area, it is doubtful that a particular risk factor going to have a consistent and sustained effect through space. It's unlikely to see a cluster in a large geographical area. And again, Wheeler repeated the same sentence with another argument: the results are consistent with the literature worldwide, because it is difficult to find statistically significant clusters. Wheeler used several tests of clusters (Kfunction, Cuzick & Edward's, kernel intensity function) plus SaTScan without finding significant associations with any of them.
Apparently the cluster analysis is most effective for small-area spatial analysis [31], on the scale of a city, and not in an area as large as a U.S. state. Goujon-Bellec et al, concluded that very few cluster detection techniques have enough power to scan large areas, as a large country like France [70]. It has also been observed that when the unit of geographic analysis is very large, such as a county or a municipality, or an aggregate of these, the sensitivity of a technique for detecting clusters is diminished; it is as though the proximity between the cases are "diluted" among these geographical units, and the cluster simply "does not appear". Studies of Germany [7] and France [20] seem to confirm it. Therefore, differences in surface between the territories of Ohio, in the United States, and Mexico City, in the Mexican capital (116.096 km² vs. 1485 km², respectively), could be one reason why we detected a cluster in the city. However, many children (98 cases), and a value of p = 0.01325, statistically significant, is not commonly reported. In addition, the data collection period of the cases was two years, against eight of Ohio study.
When the geographic location of the children who make up the cluster is mapped, it can be seen that this spatial cluster is located to the east of Mexico City (Federal District), slightly to the northeast. Indeed, about half of all children with leukemia investigated are part of this cluster. The conglomerate is clearly excluded from west and southwest of Mexico City. The extent of the surface cluster includes the territories of some of the boroughs of the Mexican Federal District. The hypothetical explanations for this cluster are the following.
Industrial establishments located in the environment of a child have been considered as risk factors for developing cancer. Sans et al, in 1995, expected the prevalence of cancers, including childhood acute leukemia, decrease as people were found farther from the petrochemical industry plant, as in Britain [53]. According to data from the National Institute of Statistics, Geography and Informatics (INEGI by its Spanish acronym), the two main boroughs by number of economic units of industrial activity are Azcapotzalco and Miguel Hidalgo. In fact, these two areas form the most industrialized landscape of Mexico City, for over half a century. There was an old oil refinery, an auto plant, dozens of railways and many other industries. Contradictorily, these delegations are not part of the cluster found on the east of Mexico City. At first glance, this suggests that the spatial cluster detected in this study is not related to large industrial facilities. However, to be more careful, we can see that the phenomenon probably does have a relationship but more nuanced. In third place for the number of industrial establishments, appears Iztapalapa borough, which is included within the area of spatial cluster. When comparing those boroughs, Azcapotzalco and Miguel Hidalgo, against the latter, Iztapalapa, we note that the average personnel employed in industrial establishments is very different. While the staff working at the Miguel Hidalgo and Azcapotzalco boroughs is 32.01 and 31.29 workers per economic unit, respectively. Comparatively in the Iztapalapa borough would, on average, 11.19 workers per industrial unit. This suggests that facilities most prevalent in the spatial cluster detected include more small establishments such as family workshops. If so, the work on these workshops can be an important parental exposure.
Another consideration is air pollution, a risk factor that has been studied previously. Knox suggested an association between the prevalence of cancer and the geographical distribution of air pollutants [71]. According to the National Institute of Ecology (INE by its Spanish acronym), air pollution in Mexico City causes 4,000 premature deaths and 2.5 million lost work days per year [72]. EMBARQ states that, with about 18 million people and 6 million cars, Mexico City is one of the largest and busiest cities in the world. Around 600 new cars come into service each day, and just over 300,000 cars were sold in 2007. Most alarming, according to the same source, fewer than 4% of vehicles, namely trucks and buses, generated about 70% of air pollution in 2002. The other 30 percent is attributed to factories, small cars and motorcycles [73]. Air pollution is thus mainly attributed to heavy vehicles. The smog of Mexico City is concentrated in the southern part of the city, whereas the children in the cluster are located mainly to the east. With the evidence found, it is difficult to maintain that the spatial cluster is due mainly to the atmospheric concentration of pollution in the city. If it were, we would have expected a cluster in the south-southwest of Mexico City rather than in the east.
In an earlier study, which used the same database as the present study, we measured the relationship between socioeconomic status and the development of childhood acute leukemia. That study, conducted by Perez-Saldívar et al., approached the problem through three indicators [64]. The first is indirect and represents only the existing agricultural activity in each borough of Mexico City [74]. The second used the measure of human development devised by the United Nations, the Human Development Index by municipality, published for Mexico in 2005 [75]. The third was the average number of people per household [76]. The results showed a relationship only between the incidence of ALL and the number of people per household, in Pre-B ALL. No significant relationship was found for either of the other two indicators. In the present study, the detected cluster is located in an area of relatively low economic level, which includes several of the boroughs of Mexico City affected by poverty. However, with the information available, we cannot establish a relationship between socioeconomic status and the cluster reported here. Further studies are needed to investigate this point.