Spatial and Temporal Patterns of Primary Syphilis and Secondary Syphilis in Shenzhen, China

Syphilis remains a global problem, with an estimated 12 million people infected each year, despite the existence of effective prevention measures, such as condoms, and effective and relatively inexpensive treatment options (World Health Organization, Department of Reproductive Health and Research, 2007). China is currently witnessing a major resurgence of syphilis, from the elimination of the disease in the 1960s to 8.7 cases per 100,000 people in 2005 (Chen et al., 2007). The total incidence of syphilis in 2009 was 24.66 cases per 100,000 people, with primary and secondary syphilis accounting for 11.74 cases per 100,000 people (National Center for STD control, CDC, China, 2009). Shenzhen, a city located in southern coastal China and adjacent to Hong Kong, also carries a heavy burden of syphilis. A preliminary estimate suggested that there were more than 80,000 syphilis cases in Shenzhen (Hong et al., 2009a). A total of 22,861 syphilis cases were reported in Shenzhen from 2004 to 2008, ranking first among the 21 cities in Guangdong Province. The total incidence was 73.07 cases per 100,000 people in 2008, which was 3.47 times higher than that in China and 2.19 times higher than that in Guangdong Province. Syphilis has become a very serious public health problem in Shenzhen. Population-based studies have shown that many human factors are associated with genotype clustering in syphilis, including marital status, education level, age structure, human immunodeficiency virus (HIV) infection, homelessness, drug abuse and the local pornography (Newell et al., 1993). The local health bureau has paid much attention to syphilis prevention and control, and it set up a long-term program titled Syphilis Prevention of Mother-to-Child Transmission, which has proved highly cost-effective and has largely reduced the incidence of congenital syphilis (Cheng et al., 2007; Hong et al., 2010; Cai et al., 2007).
However, the total incidence of syphilis, as well as the prevalence of syphilis among some high-risk populations (e.g., men who have sex with men), has been rising (Feng et al., 2008; Hong et al., 2009b). Tens of health centers in Shenzhen have offered free voluntary counseling and testing for syphilis as well as HIV, and health education and promotion programs have been conducted, but the effects seem limited. This raises two questions: have prevention and intervention efforts been directed to the right places, and are there any clusters of syphilis? If such clusters could be identified and more intervention encouraged in those regions, the incidence of syphilis might be largely reduced. Exploring the genotype clusters of syphilis and describing their spatial and temporal patterns would therefore provide essential information for designing syphilis intervention programs. It used to be very difficult to decide in which districts and at what time clusters would occur. Simply comparing the incidence of each district and/or each time period is likely to cause selection bias, because the spatial and temporal boundaries of the clusters are set at the very beginning. Moreover, any geographical region always contains high-rate areas that occur by chance alone. Finding significant clusters in this way requires accounting for many random variables; otherwise the statistical results would be far from the truth (Kulldorff et al., 1998).
To solve the problem, Kulldorff and Nagarwalla advocated a generalized mathematical model to describe spatial distribution (Wallenstein, 1980). The model applies the likelihood ratio test, adjusts for the underlying spatial inhomogeneity of the background population, and uses scan windows of varying size to analyze the data. Afterwards, Kulldorff proposed spatial scan statistics, space-time permutation scan statistics and prospective space-time permutation scan statistics, all of which are based on Monte Carlo hypothesis testing. Kulldorff and his colleagues also created SaTScan (available online for free at: http://www.satscan.org/) to compute these statistics. The method has been widely used to detect and evaluate disease clusters for a variety of diseases, including cancer (Jackson et al., 2009; Meliker et al., 2009), shigella infection (Stelling et al., 2009), malaria (Haqua et al., 2009; Coleman et al., 2009), low birth weight and infant mortality (Grady & Enander, 2009), amyotrophic lateral sclerosis (Scott et al., 2009) and diabetes (Green et al., 2003). Prospective space-time clustering statistics can also be used for early warning. The method may detect an outbreak at any place and of any size as early as possible, which helps public health officers to conduct related investigations and control programs with the goal of preventing disease transmission (Kulldorff et al., 2005). Moreover, the prospective space-time permutation scan statistic can detect abnormalities in the process of disease transmission. Thus, it has been widely used in early warning studies of acute infectious diseases and biological terrorism (Kulldorff et al., 2005). Because patients in the primary and secondary stages of syphilis are more likely to transmit the infection, timely treatment and intervention among these cases would be more effective in syphilis prevention. Thus, this study aimed to identify the clusters of primary and secondary syphilis in Shenzhen in recent years via spatial and space-time statistics and to detect early warning signals of syphilis outbreaks by prospective space-time permutation scan statistics.

Data sources
Data on syphilis cases were downloaded from the China Information System for Disease Control and Prevention. Patients' gender, date of birth, date of diagnosis, current address and other information (e.g., education level, marital status, occupation) were recorded in the database. Geographic Information System (GIS) data came from Shenzhen 1:10,000 digital maps. Population data came from the Shenzhen Municipal Bureau of Statistics.

Data coding
Shenzhen is divided into 55 neighbourhoods (the smallest administrative unit in Shenzhen). In this study, current address was recoded into these 55 street communities, which were matched with Shenzhen's GIS data. Date of diagnosis for each case was recoded by year.
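The recoding step above amounts to collapsing individual case records into counts per neighbourhood and per year. A minimal sketch of that aggregation is given below; the record layout, the function name `count_by_unit_and_year` and the example cases are hypothetical, since the structure of the surveillance export is not described in this chapter.

```python
from collections import Counter
from datetime import date

# Hypothetical case records: (neighbourhood, date of diagnosis).
# The actual field names of the surveillance database are not given in the text.
cases = [
    ("Longhua", date(2005, 3, 14)),
    ("Longhua", date(2005, 11, 2)),
    ("Shekou", date(2006, 7, 21)),
    ("Cuizhu", date(2006, 1, 9)),
    ("Longhua", date(2006, 5, 30)),
]


def count_by_unit_and_year(records):
    """Return a Counter keyed by (neighbourhood, year of diagnosis)."""
    return Counter((unit, diagnosed.year) for unit, diagnosed in records)


counts = count_by_unit_and_year(cases)
for (unit, year), n in sorted(counts.items()):
    print(unit, year, n)
```

Counts of this shape, together with the population and coordinates of each neighbourhood, are the kind of input a scan statistic needs.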

Software and spatio-temporal clustering analysis
The spatio-temporal clustering analysis was carried out with SaTScan software. The purely spatial scan statistic imposes a circular window on the map. The window is centered in turn on each of several possible grid points positioned throughout the study region. For each grid point, the radius of the window varies continuously in size from zero to some upper limit defined by the user. In this way, the method creates many distinct geographical circles with different sets of neighboring data locations within them. Each circle is a possible candidate cluster (Kulldorff Martin. SaTScan™ User Guide for version 9.0. 2010.).
The space-time scan statistic is defined by a cylindrical window with a circular geographic base and with height corresponding to time. The base is defined exactly as for the purely spatial scan statistic, while the height reflects the time period of potential clusters. The cylindrical window is then moved in space and time. It creates a number of overlapping cylinders of different size and shape, jointly covering the entire study region, where each cylinder reflects a possible cluster (Kulldorff Martin. SaTScan™ User Guide for version 9.0. 2010.).
For each scanning window, the analysis first obtains the expected number of cases based on the observed number of cases, then calculates the log likelihood ratio (LLR) from the observed and expected numbers of cases inside and outside the window. The LLR is used to evaluate the cluster status. The likelihood function is maximized over all window locations and sizes, and the window with the maximum likelihood constitutes the most likely cluster. The likelihood ratio for this window constitutes the maximum likelihood ratio test statistic. Its distribution under the null hypothesis is obtained by repeating the same analytic exercise on a large number of random replications of the data set generated under the null hypothesis. The P value is obtained through Monte Carlo hypothesis testing, by comparing the rank of the maximum likelihood from the real data set with the maximum likelihoods from the random data sets. If this rank is R, then P = R/(1 + number of simulations). In this study, the number of simulations was set to 999; thus, the smallest attainable P value is 1/1000 = 0.001 and the P value is accurate to three decimal places.
For purely spatial and space-time analyses, the method also identifies secondary clusters besides the most likely cluster, and orders them according to their likelihood ratio test statistic. There will always be a secondary cluster that is almost identical to the most likely cluster and that has almost as high a likelihood value, since expanding or reducing the cluster size only marginally will not change the likelihood very much. Most clusters of this type provide little additional information, but their existence indicates the possibility of pinpointing the general location of a cluster. Secondary clusters that do not overlap with the most likely cluster are of great interest. In this study, we used the default settings, which means that there was no geographical overlap between secondary clusters and the most likely cluster.
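The procedure described above (circular windows, a log likelihood ratio per window, and a Monte Carlo rank-based P value) can be illustrated with a small sketch. It is not the SaTScan implementation. It assumes the Poisson probability model, which this chapter does not name explicitly; it grows each circle one nearest neighbour at a time instead of varying the radius continuously; and all function names and the toy data (six locations of equal population on a line) are hypothetical.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases.

    Assumed Poisson model, scanning for high rates only: windows with no
    excess of cases get an LLR of zero.
    """
    if c <= e:
        return 0.0
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return c * math.log(c / e) + outside


def candidate_windows(coords, pops, max_pop_frac=0.5):
    """Yield circles as index tuples, centred on each location in turn."""
    total_pop = sum(pops)
    for xi, yi in coords:
        order = sorted(range(len(coords)),
                       key=lambda j: math.hypot(coords[j][0] - xi,
                                                coords[j][1] - yi))
        members, pop = [], 0
        for j in order:
            pop += pops[j]
            if pop > max_pop_frac * total_pop:  # upper limit on window size
                break
            members.append(j)
            yield tuple(members)


def max_llr(cases, coords, pops):
    """Return (maximum LLR, window) over all candidate circles."""
    total, total_pop = sum(cases), sum(pops)
    best = (0.0, ())
    for window in candidate_windows(coords, pops):
        c = sum(cases[j] for j in window)
        e = total * sum(pops[j] for j in window) / total_pop
        llr = poisson_llr(c, e, total)
        if llr > best[0]:
            best = (llr, window)
    return best


def monte_carlo_p(cases, coords, pops, n_sim=999, seed=0):
    """Rank-based P value: P = R / (1 + number of simulations)."""
    rng = random.Random(seed)
    total, n = sum(cases), len(cases)
    observed, window = max_llr(cases, coords, pops)
    rank = 1  # the real data set counts as one of the ranked values
    for _ in range(n_sim):
        sim = [0] * n
        # Null hypothesis: each case falls in a location with probability
        # proportional to that location's population.
        for j in rng.choices(range(n), weights=pops, k=total):
            sim[j] += 1
        if max_llr(sim, coords, pops)[0] >= observed:
            rank += 1
    return observed, window, rank / (n_sim + 1)


# Hypothetical toy data: location 0 has a clear excess of cases.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
pops = [1000] * 6
cases = [30, 5, 4, 6, 5, 5]
llr, window, p = monte_carlo_p(cases, coords, pops)
print(window, round(llr, 2), p)
```

With 999 replications the smallest P value this sketch can return is 1/1000, which matches the three-decimal accuracy mentioned above.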

Results
A total of 9126 primary and secondary syphilis cases were reported from 2005 to 2008, with 5173 males and 3953 females (Table 1). The distribution of clusters changed over the years (see Figure 1). The neighbourhood with cluster number '1' was the most likely cluster. Other neighbourhoods, which were not shown in Tables 2-5 or were shown in white in Figure 1, were not clusters.
In this study, many minor statistically significant clusters became significant clusters in the following year or subsequent years, while some significant clusters conversely became nonsignificant. The change of minor significant clusters also reflected the spatial and temporal trends (see Table 6).

Note (Table 6): 1 = statistically significant cluster in a given year; 0 = minor significant cluster in a given year; -1 = no cluster in a given year.

Results of space-time permutation scan statistics
This study set the maximum spatial cluster size at 10 kilometers and the maximum temporal duration at two years. Using space-time permutation scan statistics, ten significant clusters were detected in Shenzhen from 2005 to 2008, which was similar to the results of the spatial scan statistics (see Table 7 and Figure 3).
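As background to these results, the space-time permutation model needs only case counts and no population data: the expected count for each zone and time period is derived from the zone and period totals. The sketch below illustrates that calculation under this standard formulation; the count matrix and the function names are hypothetical and are not taken from the study data.

```python
import math


def expected_counts(cases):
    """Expected cases per zone and period under space-time independence.

    cases[z][d] is the observed count in zone z during period d; the
    expectation is (zone total * period total) / grand total.
    """
    total = sum(sum(row) for row in cases)
    zone_totals = [sum(row) for row in cases]
    period_totals = [sum(col) for col in zip(*cases)]
    return [[zt * pt / total for pt in period_totals] for zt in zone_totals]


def cylinder_llr(cases, zones, periods):
    """Log likelihood ratio for a cylinder (set of zones x set of periods)."""
    mu = expected_counts(cases)
    total = sum(sum(row) for row in cases)
    c = sum(cases[z][d] for z in zones for d in periods)
    e = sum(mu[z][d] for z in zones for d in periods)
    if c <= e:
        return 0.0  # no excess of cases in this cylinder
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return c * math.log(c / e) + outside


# Hypothetical counts: 3 zones (rows) by 4 yearly periods (columns).
counts = [
    [2, 2, 2, 2],
    [3, 3, 3, 3],
    [1, 1, 9, 1],
]
mu = expected_counts(counts)
print(mu[2][2], cylinder_llr(counts, zones=[2], periods=[2]))
```

In this toy matrix, zone 2 in the third period has 9 observed cases against 5.25 expected, so the cylinder covering that single cell has a positive log likelihood ratio, while a cell with no excess scores zero.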

Early warning signal for syphilis outbreak
We explored the adaptability of prospective space-time scan statistics by analyzing the syphilis cases from November to December 2005. The time unit was set to 1 day, the maximum temporal window to 7 days (for early warning, the time scale cannot be too long) and the maximum scanning radius to 10 kilometers. Because patients visit medical institutions on different days of the week (for example, many people prefer to visit on Saturday and Sunday because they work from Monday to Friday), this study adjusted for space by day-of-week interaction. Using prospective space-time scan statistics, five early warning signals were found. The most likely clusters were located in Shekou and Zhaoshang, with a recurrence interval of one year and 135 days (

Discussion
Identification of primary and secondary syphilis clusters can contribute to the selection of syphilis control strategies. Two different methods were applied in this study to explore the spatial and temporal patterns of primary and secondary syphilis in Shenzhen from 2005 to 2008. By purely spatial analysis, small changes in the clusters' locations were found from 2005 to 2008. The clusters were mainly concentrated in the neighbourhoods of Longhua, Xixiang, Xin'an, Nanshan, Nantou, Shekou, Futian, Fuhua, Cuizhu, Dongmen and Huangbei. Many of these neighbourhoods are adjacent to each other, and most of them have large numbers of migrant workers as well as entertainment venues. Many surveys have shown that migrant workers are more likely to have unprotected sex because of low risk perception, lack of money and psychological problems, resulting in a higher risk of syphilis infection. Moreover, the entertainment venues in these neighbourhoods cater to migrant workers and encourage their sexual behavior. Thus, the clusters of syphilis detected in this study may be associated with clusters of high-risk groups. More attention should be paid to the cluster areas. To accelerate progress at these target sites, syphilis control should be moved up the local government and public health agenda, with increased resource allocation and enhanced intervention and prevention work.
The purely spatial analysis also shows small-scale changes in syphilis clusters during the four study years. This may indicate changes in syphilis incidence in the districts and reflect changes in risk factors. How the specific risk factors changed is not discussed here; it will be the topic of our further study.
It is difficult to explain the minor statistically significant clusters. In this study, however, many minor statistically significant clusters became significant clusters in the following year or subsequent years, while some significant clusters conversely became minor significant clusters. This transformation suggests that the minor statistically significant clusters may be in a sub-cluster aggregation state, which also deserves attention, especially for those clusters with smaller P values (but P > 0.05) (Recuenco et al., 2007).
Because surveillance data are updated year by year, using purely spatial scan statistics to analyze the gradually increasing surveillance data will affect the reliability of the results. Besides, it is very difficult to choose a suitable time span in comprehensive data analysis. If the time span is too short, weaker clusters may not be detected, while if the time span is too long, recent clusters may be detected ineffectively. Thus, this study adopted space-time permutation scan statistics to analyze the accumulated data from 2005 to 2008; the results showed some similarities with those of the purely spatial analysis, but differences still existed.
The incidence of syphilis is closely associated with local sociodemographic factors. By retrospective time and/or space scan analysis, we found the regional and temporal clusters of syphilis, which will direct further investigation. Finding the factors that contribute to the clusters and taking intervention measures will help to reduce syphilis. The prospective space-time permutation scan statistic can detect suspicious clusters earlier, which is more meaningful for the prevention of primary and secondary syphilis from a public health perspective. It may prompt public health officers to conduct investigations, e.g., to find out whether there are any clusters of high-risk behavior (gathering parties among men who have sex with men, group drug use, etc.), and to adopt timely and suitable activities. In addition, it will encourage government financial input, which will greatly support the intervention program. In this study, five cluster alarm signals were detected by the prospective space-time permutation scan statistic. This demonstrated the adaptability of the method for exploring outbreaks of primary and secondary syphilis. However, the number of primary and secondary syphilis cases in an individual district was very small in this study, which might cause high variability in rates and introduce errors into the calculations. The method would be more efficient if more cases were involved, e.g., by including all the common sexually transmitted diseases. In fact, only syphilis and gonorrhea were covered by the daily real-time surveillance system in Shenzhen; other sexually transmitted diseases were counted weekly. Thus, a real-time surveillance system for the common sexually transmitted diseases is needed if efficient prediction is to be made. Besides, as sexual transmission has become the main route of HIV infection and syphilis infection promotes HIV infection, early detection and treatment of primary and secondary syphilis will be much more important. Developing an early warning system for sexually transmitted diseases based on the real-time surveillance system will benefit control and prevention work.
The syphilis real-time surveillance system was established in Shenzhen in 2004; thus, the study period was relatively short, covering only four years. The next step is to use space-time statistics regularly in practical work, with the goal of studying the long-term trends of the spatial and temporal clusters of syphilis in Shenzhen.
The epidemic of syphilis could be attributed to various reasons, including population migration, total population size, level of social and economic development, distribution of commercial sex entertainment venues, size of the high-risk population, occurrence of unsafe behavior, health resource allocation and service provision. Identifying the factors associated with the spatial and temporal patterns of primary and secondary syphilis cases is essential, as it can give a clear picture of the reasons behind the clusters. However, this study did not collect the related information. This limitation will be addressed in future studies.

Conclusion
Space-time statistics is an effective method to describe clusters. This study applied space-time statistics to explore the spatial and temporal patterns of primary and secondary syphilis at the neighbourhood level in Shenzhen. The results showed that the clusters of primary and secondary syphilis cases tended to be distributed in regions with a great many migrant workers and entertainment venues, such as the Longhua, Cuizhu, Huafu, Huangbei and Nanshan street communities, which provides important information for syphilis prevention programs as well as for resource allocation. The application of prospective space-time permutation scan statistics in this study demonstrated its adaptability to primary and secondary syphilis, and it could detect some risk factors. Moreover, if other sexually transmitted diseases, e.g., gonorrhea, were added to the prospective space-time permutation scan statistics, the test would be more effective and the method would produce early warning signals more easily. Finally, it would provide useful information for STD and HIV control and prevention. Its efficiency in visualizing public health problems such as the spread of syphilis also allows policy-makers to target resources more efficiently. SaTScan software is now a powerful tool for analyzing and exploring disease incidence, especially for some infectious diseases. Further studies investigating the factors related to the syphilis epidemic in more detail could provide more information to inform risk assessments and control strategies.

Fig. 1. Clusters of primary and secondary syphilis in Shenzhen by spatial scan statistics from 2005 to 2008.

Fig. 2.

Primary and secondary syphilis cases reported from 2005 to 2008 in Shenzhen.

Results of retrospective spatial scan statistics
In 2005, there were 12 statistically significant clusters (P≤0.05) and two minor significant clusters (P>0.05), while the numbers were eight and seven in 2006, ten and five in 2007, and nine and four in 2008 (Tables 2-5).

Table 3. The significant and minor significant clusters in Shenzhen in 2006 by retrospective spatial scan statistics.
Columns: cluster number, neighbourhood, actual incidence, theoretical incidence, RR, P value.

Table 4. The significant and minor significant clusters in Shenzhen in 2007 by retrospective spatial scan statistics.

Table 5. The significant and minor significant clusters in Shenzhen in 2008 by retrospective spatial scan statistics.

Table 6. Changes of nonsignificant syphilis clusters from 2005 to 2008.

Table 7. Location of statistically significant (P≤0.05) syphilis clusters in Shenzhen, 2005-2008, by space-time permutation statistics.

Fig. 3. Statistically significant clusters of syphilis in Shenzhen by space-time permutation scan statistics from 2005 to 2008.

Table 8. Early warning signals for primary and secondary syphilis in Shenzhen by prospective space-time permutation scan statistics*