From Anopheles to Spatial Surveillance: A Roadmap Through a Multidisciplinary Challenge


Introduction
Anopheles mosquito species are diverse and are vectors of many pathogens. A review of the genus Anopheles [1], recently updated [2], listed more than 520 species, some of which include subspecies and cryptic species. Each species has ecological requirements and behaviours that can influence its status as a vector of specific pathogens. Pathogen transmission dynamics vary greatly from one region to another, as documented for the biodiversity of malaria in the world [3]. Acknowledging these variations at local scale within a country through detailed mapping can lead to better-targeted measures and improved monitoring. Interactions between vectors, pathogens and humans in a given area can be better understood within a spatial framework, leading to what we call here spatial surveillance.
Part of the spatial variation is explained by differences in pathogen species or by successful control in some areas. Nevertheless, Anopheles species play a major part in the occurrence, seasonality and spatial variation of Anopheles-borne diseases. The environment in a given region may or may not allow a given species to breed, thrive and live long enough to be an efficient vector. Because of these variations, a species might be an efficient vector in one settlement and only a secondary vector in another. The need to clarify Anopheles distribution is recognised as a crucial step towards malaria eradication [4]. A recent effort has been made to provide detailed maps of malaria and its vectors [5][6][7][8], including a description of ecological requirements. While these distribution maps are essential for an overview, some issues [9] (described further) linked to the data and modelling impede their operational use.

Anopheles and sampling strategies
Many useful attributes can be collected on Anopheles, such as the list of species and their vector status in a given area, resistance to insecticides, behaviour influencing human-vector contact, and avoidance of control efforts (early biting, outdoor resting). Research and monitoring programmes might be based on existing entomological data, whose particularities should be dealt with at the modelling step. However, the most direct way is to design the collection protocol in relation to the objective, i.e. mapping the Anopheles species. In this case, the quality of the dataset can be high if some rules are followed. Monitoring data are typically collected in a network of sampling locations according to a variety of standardized procedures [11] and used for mapping species distributions [12]. However, records are generally collected in only a restricted number of locations, often loosely distributed across the region of interest, which is a limitation for documenting species distribution.
Species distribution modelling techniques [13,14] help to achieve mapping based on monitoring data [15], as detailed further in the roadmap. When coupling Anopheles monitoring and mapping efforts, defining an optimal sampling strategy becomes of the highest interest. Indeed, well-designed monitoring projects have the potential to produce appropriate data not only to estimate changes in species attributes [16] but also to document the distribution in space and time [12]. An appropriate sampling design should address key issues: what constitutes a sampling location? How many are needed? Where do they have to be located? How often should they be surveyed? When monitoring data are used to generate species distribution models, designing the sampling strategy is a challenge because these issues have to be addressed relative to both the monitoring and the modelling objectives.

Optimizing sample size
Sampling locations may be sites, squares, transects or any spatial unit from which measurements are made in the field to document attributes (e.g. presence, population density, infected/infective mosquitoes, reproductive status, insecticide resistance) that describe the Anopheles species. A sample is a set of sampling locations where attributes of the species are measured to estimate its characteristics over the entire study area. Hence, a sample must be representative of the whole study area, and more than one sampling location is needed to account for the variation in the measurements made in the field. For instance, the population density or even the presence of a species depends on environmental conditions, and this has to be taken into account to estimate the mean population density or the infection rate of the species in the study area or to document its spatial distribution with sufficient accuracy. Precision (typically measured by the standard error) reflects how similar the measurements made at the different sampling locations are to each other, thereby providing a measure of sampling uncertainty. When sample measurements are similar to each other, the sample mean is likely to be estimated with an acceptable level of precision from a few sampling locations. In contrast, when the between-location variation in the measurements is high, a larger number of sampling locations is needed [11]. Achieving a sufficient level of precision is of critical importance: the higher the precision of the estimates, the better the chances of detecting temporal changes using statistical hypothesis testing procedures. Sample size is also known to affect the performance of species distribution models [17][18][19]: predictions based on few records are likely to be less accurate than predictions based on larger sample sizes [18]. A sufficient number of sampling locations is needed for the statistical models to capture the response of the species to the environmental conditions.
A balance has therefore to be achieved between ensuring statistical robustness (i.e. increasing the sample size) and reducing sampling effort (i.e. decreasing the sample size), because sampling is time- and/or budget-consuming.
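As an illustration of how precision relates to the number of sampling locations, the following minimal Python sketch computes the standard error of the mean from a set of field measurements. The function name and the trap counts are hypothetical and serve only as an example; they are not taken from the studies cited above.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(measurements):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(measurements) / sqrt(len(measurements))


# Hypothetical mosquito counts per trap-night at eight sampling locations.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"mean = {mean(counts):.2f}, standard error = {standard_error(counts):.2f}")
```

For a fixed between-location variation, the standard error shrinks with the square root of the number of locations, which is why halving it requires roughly four times as many sampling locations.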
For monitoring purposes, a power analysis may be performed to evaluate the number of sampling locations required to detect a given level of change over time in the attributes of the species with a predetermined level of statistical certainty. First, users have to decide on (1) the minimum level of change to be detected in the analysis (for instance, a 10% change between time t and t+1) and (2) the acceptable chances of making type-1 (i.e. concluding that change is taking place when it is not) and type-2 (i.e. concluding that no change is taking place when it is) errors in hypothesis testing procedures [15]. Such decisions are often based on the precautionary principle, and the relative importance of type-1 and type-2 errors also depends on the objective. Then, the analysis integrates information on the precision of the estimates to calculate the optimal sample size needed to detect the desired level of change. A pilot survey is, however, required to obtain an initial approximation of the precision of the estimates linked to the variation in the field measurements. For modelling applications, performance increases with sample size, and the impact of sample size on performance may strongly depend on the modelling technique used [20]. A series of studies has also recently shown that performance may be sensitive to particularly small sample sizes and may reach an asymptote beyond a sufficiently large sample size [18,19]. To examine how large the sample size should be to obtain sufficiently well-performing models, two options are available: (1) using readily available datasets in the study area [12] or (2) creating virtual species in real landscapes [19,21]. With such data, it becomes possible to manipulate the number of sampling locations to represent a range of sample sizes and to examine the impact of restricted sample size on modelling performance.
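The power analysis for monitoring described above can be sketched as follows. This is a minimal illustration that assumes the attribute is a mean compared between two independent surveys with a two-sided test and a normal approximation; it is not necessarily the procedure used in [15]. The function name and the pilot-survey values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def locations_needed(sd, min_change, alpha=0.05, power=0.80):
    """Sampling locations per survey needed to detect `min_change`.

    Normal-approximation formula for comparing two independent means:
    n = 2 * ((z_alpha + z_power) * sd / min_change) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # sets the type-1 error rate
    z_power = NormalDist().inv_cdf(power)          # power = 1 - type-2 error rate
    n = 2 * ((z_alpha + z_power) * sd / min_change) ** 2
    return ceil(n)


# Hypothetical pilot survey: mean density of 20 mosquitoes per trap-night,
# standard deviation of 8 between sampling locations.
pilot_mean, pilot_sd = 20.0, 8.0
change = 0.10 * pilot_mean  # detect a 10% change between t and t+1
print(locations_needed(pilot_sd, change))
```

The sketch makes the trade-off explicit: the required number of locations grows with the square of the between-location standard deviation and falls with the square of the change to be detected, so a noisy attribute or a small target change quickly becomes expensive to monitor.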

Optimizing sampling strategy in space
An appropriate sampling design also involves positioning the sampling locations so that the full range of environmental conditions across the study area is covered, ensuring the representativeness of the sample. Several approaches are available to position the sampling locations [21], each with advantages and disadvantages (details in [11]). Only some are presented here:
Expert-based sampling - Sampling units are located based on a priori knowledge of the study area and the status of the species. This subjective strategy is to be avoided because the sample is most likely not representative of the study area and may thus not be used for statistical inference.
Random sampling - Random selection of sampling locations from a list is an easy-to-use procedure that is recommended when the aim of the sampling is to provide a picture of the situation across the study area. However, the precision of the estimates may be much lower than with stratified sampling (see below), especially in heterogeneous environments.
Systematic (or regular) sampling - A regular distribution of the sampling locations may be appealing because the whole study area is covered with the same sampling effort. However, the sample may provide a biased picture when the fixed distance between sampling locations coincides with a particular structure in the spatial arrangement of the environmental conditions.
Stratified sampling - The study area is first divided into strata assumed to influence differently the attributes of the species measured in the field. A random sampling procedure is then applied to select a number of sampling locations within each stratum in proportion to its relative geographical extent. The main advantage of stratification is that the precision of the estimates based on the sample may be considerably improved compared to simple random sampling. Stratification requires a preliminary survey to be conducted in order to minimize the within-stratum variation in the measurements. In practice, however, stratification is often applied according to environmental layers representing the heterogeneity of the environmental conditions that are assumed to influence the attributes of the species.
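Stratified sampling with an allocation proportional to the extent of each stratum can be sketched as follows. This is a minimal Python illustration under stated assumptions: the strata names, their areas and the candidate location identifiers are hypothetical, and rounding is handled by a largest-remainder rule, which is one possible choice among others.

```python
import random


def proportional_allocation(strata_area, n_total):
    """Share n_total locations among strata in proportion to their area."""
    total_area = sum(strata_area.values())
    quotas = {s: n_total * a / total_area for s, a in strata_area.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    # Give the locations lost to rounding to the largest fractional parts.
    leftover = n_total - sum(alloc.values())
    by_fraction = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_fraction[:leftover]:
        alloc[s] += 1
    return alloc


def stratified_sample(candidates, strata_area, n_total, seed=0):
    """Randomly draw the allocated number of locations within each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    alloc = proportional_allocation(strata_area, n_total)
    return {s: rng.sample(candidates[s], alloc[s]) for s in alloc}


# Hypothetical strata (areas in km2) and candidate sampling locations.
strata_area = {"forest": 500.0, "cropland": 300.0, "wetland": 200.0}
candidates = {s: [f"{s}-{i}" for i in range(50)] for s in strata_area}
sample = stratified_sample(candidates, strata_area, n_total=20)
print({s: len(ids) for s, ids in sample.items()})
```

In practice the strata would come from environmental layers such as those described in the following sections, and the candidate list from a grid or a list of accessible sites.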

Optimizing sampling strategy in time
Although presence-only techniques can deal with the issue of false absences in species distribution modelling studies [14,22], failure to consider the detectability of a species (i.e. the probability of detecting it when present at a site) when designing a monitoring programme might lead to misleading conclusions [23,24]. In order to account for detection probabilities and to provide an unbiased estimate of Anopheles species occupancy or infection rate, it becomes necessary to carry out repeated surveys, at least in some sampling locations, within a single season of data collection. If the emphasis of the programme is on estimating changes in the species occupancy or infection rate over time, the surveys must also be repeated from one season to the next. Site occupancy modelling is a statistical framework specifically designed to jointly estimate detectability and occupancy of the species as well as changes in those parameters over time [24]. Designing effective sampling schemes to estimate Anopheles species dynamics in space and time requires decisions to be made about how to allocate sampling effort among spatial and temporal replicates. Power analyses may be implemented to optimize the sampling design in space and time, i.e. to achieve a compromise between the number of sampling locations and the number of repeated surveys within sampling locations in relation to (1) the acceptable level of imprecision associated with the estimates of species occupancy, (2) the occupancy and detectability of the species, and (3) the available manpower and possible sampling effort.
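The role of repeated surveys can be illustrated with a short Python sketch. It assumes that surveys of an occupied site are independent and share the same per-survey detection probability, which is a simplification; full site occupancy models [24] estimate occupancy and detectability jointly by maximum likelihood, which is not attempted here. The function names and numerical values are hypothetical.

```python
from math import ceil, log


def cumulative_detection(p, k):
    """Probability of detecting the species at least once in k surveys
    of an occupied site, given a per-survey detection probability p."""
    return 1 - (1 - p) ** k


def surveys_needed(p, target=0.95):
    """Smallest number of repeated surveys with cumulative detection >= target."""
    return ceil(log(1 - target) / log(1 - p))


def corrected_occupancy(naive_occupancy, p, k):
    """Rough correction of the observed proportion of occupied sites."""
    return min(1.0, naive_occupancy / cumulative_detection(p, k))


# Hypothetical species detected in 30% of surveys when present.
p = 0.3
print(surveys_needed(p))               # repeated surveys per location
print(cumulative_detection(p, 3))      # detection after three surveys
print(corrected_occupancy(0.4, p, 3))  # naive 40% occupancy, corrected
```

With a low detection probability, a single survey per location strongly underestimates occupancy, which is why the allocation of effort between more locations and more visits per location has to be decided jointly.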

Environmental factors
Once environmental factors of interest are identified, their importance according to the type of climate (e.g. semi-arid or humid), the type of species and the altitude must be further discussed. Any place where surface water is available for breeding and emergence might support Anopheles occurrence. Vector status requires, in addition, (1) the presence of human or animal hosts and their disease parasites and (2) suitable temperature and humidity, which in turn affect (3) vector dynamics and parasite development. A review [10] of the current state of the art in remote sensing applications for malaria underlines that temperature, humidity, surface water, climate seasonality, vegetation type and growth stage influence vector abundance irrespective of their association with rainfall. The vegetation around breeding sites may also determine the abundance associated with a breeding site by providing resting sites, sugar-feeding supplies for adult mosquitoes and protection from climatic conditions [25]. Furthermore, vegetation type or land use may influence mosquito abundance by affecting the presence of animal or human hosts and thus the availability of blood meals [10]. Factors are of two kinds [9]: (1) slow-changing abiotic factors such as long-term climatic variables, soils and topography, and (2) fast-changing biotic factors such as vegetation, presence of predators, hosts, interactions with other Anopheles, seasonal temperature and rainfall, water bodies, etc.
Remote sensing products provide environmental characteristics over large areas, even where accessibility is limited, and can provide more recent information on an area than commonly available maps. The quality of the information provided is, however, dependent on the quality and suitability of the original remote sensing data. The processing required to mosaic images in order to cover a large area, to apply various types of image correction, to screen clouds and to interpret images is not straightforward for non-specialists. Derived products, such as land cover maps or composited time series of simple vegetation indices, are therefore often better adapted to the needs of users. However, users must understand the process behind the final product to a certain extent, in order to be aware of the assumptions and simplifications made in the processing. Furthermore, different methods are typically available to reach a given goal, and the choice of method can strongly influence the quality of the results.

Long term abiotic variables
Slow-changing abiotic factors might be used to delineate a species distribution area or the maximum potential extent for a species. Those factors include topography, soil types, long-term climate and ecoregions (Table 1). Available sources are few but cover the world. Consistent topography is available from the USGS GTOPO 30 suite [26], including variables such as digital elevation model, flow accumulation, slope or aspect, or from the NASA Shuttle Radar Topographic Mission (SRTM) dataset reprocessed by the CGIAR [27]. The digital soil map of the world compiled by the FAO [28] is still a reference. Long-term climatic datasets of monthly temperature and rainfall are available from Worldclim [29], which also provides bioclimatic variables. A second dataset, CRU CL2.0 [30], also provides monthly temperature and rainfall, as well as the number of rainy days per month, monthly rainfall variation and relative humidity. The datasets are based on meteorological station data from 1950 to 1990 or 2000. The quality of the data is high in some areas and lower in others, depending on the density of meteorological stations, which can be quite low, particularly in Africa. The ecoregions [31] are a useful dataset to delineate sample stratification at regional level. These datasets are mostly not derived from remote sensing (RS) images but are grids developed from point data.

Monitoring air temperature
Air temperature (Ta) is commonly obtained from measurements at weather stations, whose availability depends on the regional infrastructure. Data are collected as point samples whose distribution is rarely designed to capture the range of climate variability within a region, especially in developing countries. The data are also not readily available for real-time applications and need to be interpolated to obtain information everywhere in a given region. Satellite images, on the other hand, can provide land surface temperature (Ts), which differs from air temperature and corresponds to the temperature of the top of the features present on the land surface (e.g. snow, ice, the grass of a lawn, the roof of a building, the leaves of a forest canopy). Specific methods (split-window techniques) can derive daily Ts at 1 km resolution [32,33] from two types of sensors, namely the Advanced Very High Resolution Radiometer (AVHRR) and the Moderate Resolution Imaging Spectroradiometer (MODIS) [33,34] (see the description table of MODIS Ts: https://lpdaac.usgs.gov/products/modis_products_table). In contrast, the derivation of air temperature (Ta) is far from straightforward. Recent research showed that minimum Ts retrieved from MODIS night images provides estimates of minimum Ta in different ecosystems in Africa [35]. Information on maximum Ta is also needed to study heat waves, and maximum Ta can influence the transmission of vector-borne diseases in regions where temperature is a limiting factor. During daytime, the retrieval of maximum Ta from Ts is more complex because of factors that influence the difference (Ts-Ta), i.e. solar radiation, soil moisture and surface brightness. Methods based on the Temperature Vegetation index, the Normalized Difference Vegetation Index and the Solar Zenith Angle to correct (Ts-Ta) are not sufficiently accurate to retrieve maximum Ta in different ecosystems [35].
Therefore, a new approach has recently been proposed [36] to estimate maximum Ta based on night AQUA-MODIS Ts data in combination with Worldclim [29], which provides long-term monthly averages of maximum and minimum air temperature. These inputs make it possible to characterize the diurnal cycle (amplitude and phase) and to determine maximum Ta by extrapolating minimum Ta in time according to the determined diurnal cycle. The method is used to produce maximum Ta maps at 1 km resolution every 8 days over Africa, available in real time from the International Research Institute for Climate and Society (IRI). Unfortunately, Ta represents the outdoor temperature only, and no proxies are available to monitor indoor temperature or other stable microenvironments, which can explain transmission in Finland when the outdoor temperature is -20 °C and which are important for highland malaria in Africa.

Monitoring rainfall
In some regions, the spatial distribution of weather stations is limited and the dissemination of rainfall data is variable, which limits their use for real-time applications. Satellite-based data can partly compensate and help to monitor rainfall; unfortunately, no satellite yet exists that can reliably identify rainfall and accurately estimate the rainfall rate in all circumstances. Some sensors can make indirect estimates of rainfall by measuring parameters such as the thickness of clouds or the temperature of the cloud tops. Advantages and drawbacks of existing methods are summarized in [37]. Various satellite rainfall products exist at continental or global scales. The most relevant are:
• The Tropical Rainfall Measuring Mission (TRMM) products [38] provide better spatial (25 km) and temporal (3 hours) estimation of rainfall in Africa [39] than most products but are available only between 35° North and 35° South.
• Products from the CPC MORPHing technique (CMORPH) [40] cover the world at 8 km resolution every 30 min. This technique uses precipitation estimates derived exclusively from low-orbit satellite microwave observations, whose features are propagated in space using motion information obtained entirely from geostationary satellite infrared (IR) data. The estimation method developed for these products is extremely flexible, such that precipitation estimates from any microwave satellite source can be incorporated.
• African Rainfall Estimation (RFE) products cover Africa. The current version (RFE2) uses microwave estimates in addition to the use of cloud top temperature and station rainfall data to provide daily rainfall estimation at 10 km resolution. Comparison between CMORPH and RFE over complex terrain in Africa [39] and over the Desert Locust recession regions [41] shows that no single product stands out as having the best or the worst overall performance [42,43].
• The TAMSAT African Rainfall Climatology And Time-series data version 2 (TARCAT2) product [44] covers Africa at 4 km resolution and is derived from the MeteoSat thermal infra-red (TIR) satellite imagery. It consists of rainfall information every 10 days.
• The Multi-sensor Precipitation Estimate (GRIB MPE) [45] derives the instantaneous rain rate from the infrared (IR) data of the geostationary EUMETSAT satellites over Europe and Africa by continuous recalibration of the algorithm with rain-rate data from polar-orbiting microwave sensors. The algorithm is only suitable for convective weather: frontal precipitation, especially at warm fronts, is very often wrongly located and overestimated. Two quality indicators distributed together with the MPE product indicate where the product should be used and where it may be problematic. Temporal resolution is high (15 min) and the product is available in real time.

Remote sensing indicators of vegetation status
Monitoring the status of green biomass from space is made possible by the particular spectral properties of green vegetation. To drive the endothermic reaction of photosynthesis, plant pigments absorb electromagnetic radiation over different parts of the visible spectrum (400-700 nm), known as photosynthetically active radiation (PAR). Additionally, much of the near-infrared light (740-1100 nm) is scattered by green plant tissues to avoid overheating, and this scattering results in strong spectral reflectance at these wavelengths. These unusual spectral properties, which are directly linked to photosynthesis, stomatal resistance and evapotranspiration, facilitate the retrieval of information on plant canopies from the electromagnetic signal measured by satellite remote sensing instruments [46]. Satellites dedicated to vegetation monitoring have been equipped with sensors capable of measuring reflected electromagnetic radiation in various wavebands, with a particular emphasis on the red (Red) and near-infrared (NIR), to assess the green biomass in a canopy.
A common and simple way to summarize the information content within these bands is the use of spectral vegetation indices, which are algebraic combinations of the spectral bands designed to be as sensitive as possible to the desired factor (green biomass) and as insensitive as possible to perturbing factors affecting spectral reflectance (such as atmospheric and illumination conditions, soil properties and the viewing geometry of the imaging instrument). Indices based on red and near-infrared reflectance have been shown to be a measure of chlorophyll abundance and energy absorption [47]. Variations of such indices across one year can help to identify vegetation types, and the quantification of water content can help to identify areas within a similar vegetation class that retain more humidity and might thus be more favourable to mosquito breeding or survival in the dry season. Dozens of vegetation indices assess the state of the vegetation qualitatively and quantitatively on the basis of reflectance values:
• The Normalized Difference Vegetation Index (NDVI = (NIR - Red) / (NIR + Red)) [48] is the most popular of such vegetation indices. NDVI is easily available because it is based only on the Red and NIR bands, which are present in most satellite sensors dedicated to land surface observation. The GIMMS (Global Inventory Modelling and Mapping Studies) NDVI dataset based on NOAA-AVHRR offers the longest coherent dataset, from July 1981 to December 2011, which can be useful for long-term studies [49]. However, the spatial resolution of 8 km limits some applications. SPOT-VEGETATION has provided a regular product since 1998 with a better spatial resolution (1 km) and geolocation. Similarly, the MODIS sensor provides NDVI at 250 m resolution. NDVI can also be calculated from images with a higher spatial resolution, such as those from the Landsat or SPOT series. The NDVI is used extensively but has several disadvantages, such as its sensitivity to atmospheric aerosols and to soil background (particularly in sparsely vegetated areas) [50]. Additionally, NDVI tends to saturate in forested areas and is therefore not responsive to variations over the full range of canopy vegetation content [51].
• The Enhanced Vegetation Index (EVI) remains sensitive to variations in dense forests where NDVI saturates [52]. EVI calculated from MODIS imagery is provided, alongside NDVI, as a standard, freely available product. A disadvantage of EVI is that it requires an additional blue band, which is not available in NOAA-AVHRR, thereby preventing the exploitation of the long-term dataset. To remedy this, a simplified 2-band EVI has also been proposed [53].
• The Normalized Difference Water Index (NDWI = (NIR - SWIR) / (NIR + SWIR)) [54], where SWIR is the short-wave infrared, is sensitive to vegetation water content and to the spongy mesophyll structure in vegetation canopies. Regarding vegetation water content, [55] summarized the limitations of using the NDVI: a decrease in chlorophyll content does not imply a decrease in vegetation water content, and vice versa. NDWI might also help to target vegetation retaining humidity in the dry season. Few studies have attempted to retrieve vegetation water content directly using operational satellite data such as those provided by SPOT-VEGETATION [55], MODIS [56] and Landsat [51]. A disadvantage of NDWI is that several instruments are not equipped with detectors in the SWIR domain, and when they are, the SWIR bands often have a lower spatial resolution than the other bands.
• The Hue index is a qualitative index proposed recently by [57] for monitoring the Locust habitat. It simultaneously exploits three wavelengths (SWIR, NIR and red) and has two main advantages: (i) avoiding confusion between bare soils and vegetation, and (ii) allowing the identification of green vegetation independently of the observation conditions, i.e. atmosphere and acquisition geometry, and of its intrinsic variations, i.e. the phenological stage. Its potential for monitoring crops, forests and other applications still needs to be assessed.
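The two normalized-difference indices defined in the list above, NDVI and NDWI, share the same algebraic form and can be computed pixel by pixel from band reflectances. The following minimal Python sketch uses plain lists and hypothetical reflectance values; operational processing would work on full image arrays with cloud screening and atmospheric correction, which are not shown here.

```python
def normalized_difference(band_a, band_b):
    """(a - b) / (a + b) per pixel; None where the denominator is zero."""
    return [
        (a - b) / (a + b) if (a + b) != 0 else None
        for a, b in zip(band_a, band_b)
    ]


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """NDWI = (NIR - SWIR) / (NIR + SWIR)."""
    return normalized_difference(nir, swir)


# Hypothetical surface reflectances for three pixels:
# dense vegetation, sparse vegetation, and a no-data pixel.
red = [0.05, 0.20, 0.0]
nir = [0.45, 0.30, 0.0]
swir = [0.15, 0.35, 0.0]

print(ndvi(nir, red))
print(ndwi(nir, swir))
```

Both indices are bounded between -1 and 1 for non-negative reflectances, which makes them easy to compare across dates and sensors, although the perturbing factors listed above still apply.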
Despite their widespread use, vegetation indices applied over large geographic extents have limits for describing canopy status in a fine and robust way, since both the desired information and the perturbing factors vary spatially, temporally and spectrally. Another type of information on canopy status that can be retrieved from remote sensing data is biophysical variables. The most common are the fraction of Absorbed Photosynthetically Active Radiation (fAPAR) and the Leaf Area Index (LAI), the latter defined as half the total developed area of green leaves per unit of horizontal ground area [58]. Unlike vegetation indices, which are a convenient way to summarize spectral information related to vegetation behaviour, biophysical variables such as fAPAR and LAI have a real physiological meaning. These variables govern the process of photosynthesis and the exchange of energy, water and carbon between the canopy and the atmosphere. To retrieve LAI and fAPAR from satellite remote sensing observations, the radiative transfer of photons within the canopy and through the atmosphere must be modelled. A thorough description of the physical problem, alongside caveats on its application to satellite remote sensing of vegetation, is presented in [59]. Dorigo et al. [60] provide a review of the various methods that use such radiative transfer models to relate satellite observations to LAI and fAPAR. Until recently, the two main datasets of global fAPAR and LAI were the MODIS and CYCLOPES products, whose different methodologies are described in [61] and [62]. These datasets have been inter-compared and evaluated against ground measurements over different land cover types [63][64][65]. A combined product, GEOV1, has recently been made available in the framework of the Geoland2 project, with a view to providing it as an operational land product service of the Global Monitoring for Environment and Security (GMES) programme [66].
This product is currently based on SPOT-VEGETATION, but a compatible long-term data record from 1981 to 2000 has also been constructed from NOAA-AVHRR data (with a spatial resolution of 0.05°) [67], and it is expected to be produced in the future from the operational Sentinel3-OLCI mission. Such biophysical products are increasingly used, but seldom in epidemiological studies.

Land cover
Detailed information from land cover maps is generally available from national geographical institutes, but this information is often out of date because of the long process involved in developing such a dataset for a whole country. Moreover, the diverse origins and scales of these datasets impede proper comparison between sites when more than one country is considered. One could thus consider producing national or regional land cover maps using high-resolution satellite data. This exercise includes the pre-processing and interpretation of the images and validation through field surveys. For instance, Landsat images were used in the framework of the Food and Agriculture Organization of the United Nations (FAO) Africover program [68] to map land cover types at 30 m resolution for 11 countries in Africa. Such land cover maps present a great level of detail but may suffer from some inconsistencies because of heterogeneity in acquisition dates, images and interpretation from one scene to another. Moreover, this approach hardly takes into account the seasonal variation and phenological behaviour of different vegetation types. These datasets are also limited in their spatial coverage and cannot be regularly updated with the methodology commonly used (i.e. visual interpretation). Medium to coarse resolution imagery (250 m to 1 km) can address some of these major issues: the information is acquired consistently over the whole area, and frequent images (every 1 or 3 days) of the same area can be combined to eliminate cloud contamination and angular effects and to characterize the vegetation phenology.
These time series can be used to produce global maps such as (i) the Global Land Cover 2000 (GLC2000) map, based on SPOT-VEGETATION data (1 km) and produced by an international partnership of research groups coordinated by the European Commission's Joint Research Centre (JRC) [69], (ii) the 500 m MODIS global land cover map derived from collection 5 Nadir BRDF-Adjusted Reflectance (NBAR) and Land Surface Temperature (LST) products [70], and (iii) the GlobCover map [71] at 300 m, derived from a Medium Resolution Imaging Spectrometer (MERIS) time series for the year 2005. Such time series were also used to produce land cover and vegetation maps at national and regional scales (see, for example, [72]). These national and regional products have the advantage that the data preprocessing and the methodology are adapted to local constraints and application needs, but they are limited in their spatial coverage. The possibility of regularly updating global land cover information was demonstrated recently with the second run of the GlobCover processing system [73], thus offering the potential to use such a product in a monitoring programme. The delineation of the vector habitat underlines the essential role of these land cover datasets, which make the necessary link between the technical remote sensing world and application requirements. Land cover datasets are among the essential variables for the Group on Earth Observations (GEO). A major and continuous effort is to be invested in the development and improvement of such datasets. The quality of these datasets can only really be tested if they are used in applications. Close interaction with final users remains the guarantee of the relevance of Earth observation products.

Monitoring water bodies
In order to identify the presence of water, it is also possible to use satellite-derived products that detect water bodies instead of approximating water availability using rainfall estimates. In the last 10 years, only a few operational methods, applied to datasets with a spatial resolution equal to or higher than 1 km, were proposed to monitor surface water at continental or global scale. Among these, the two most recent offer dynamic detection in near real-time through an operational monitoring system:
• The Small Water Bodies (SWB) product based on SPOT-VEGETATION [74], available via the DevCoCast project website, makes use of 10-day NDVI, the NDWI and syntheses of the SWIR band data. It is based on a contextual algorithm [75] exploiting the local contrast of the water surface with respect to the surrounding area. The product performs well in subhumid and semi-arid regions, but limitations have been observed over dense vegetation areas. The 1 km spatial resolution is an intrinsic limitation. Nevertheless, the combination of eight years of small water body monitoring data demonstrated the value of multi-annual approaches to capture water bodies that do not replenish every year in relation with seasonal rainfall patterns [74][97].
• The HSV WATER product [76], based on a Hue Saturation Value (HSV) transformation of SPOT-VEGETATION and MODIS time series, allows consistent detection at continental scale. This pixel-based approach uses the SWIR, NIR and red bands and transforms the RGB colour space into HSV, which decouples chromaticity and luminance. It has the advantage of providing a robust and reliable image-independent discrimination between water and other land cover types. An automatic processing chain based on SPOT-VEGETATION was designed to provide a dekadal water surface product at the continental scale. The product can be ordered freely through the geoland2 web portal at http://www.geoland2.eu/core-mapping-services/biopar.html.
The analysis of eight years of small water body data demonstrated the capacity of such methods to capture inter-annual water bodies variability and the relation with seasonal rainfall patterns [98]. Nevertheless, the 1 km spatial resolution of products derived from SPOT-VEGETATION is still a strong intrinsic limitation. The operational production of a MODIS based product at 250 m using the second method is in progress and should be available soon.
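The colour-space idea behind the HSV approach can be illustrated with a minimal sketch. It assumes, for illustration only, that the SWIR, NIR and red reflectances (scaled 0 to 1) are mapped to the R, G and B channels respectively; the function names, the hue range and the brightness threshold are hypothetical and are not taken from [76].

```python
import colorsys

def hsv_from_bands(swir, nir, red):
    """Treat (SWIR, NIR, red) reflectances in [0, 1] as an (R, G, B) triplet
    and convert it to (hue, saturation, value). The band-to-channel mapping
    is an assumption made for this sketch."""
    return colorsys.rgb_to_hsv(swir, nir, red)

def looks_like_water(swir, nir, red, hue_range=(0.5, 0.85), max_value=0.15):
    """Hypothetical decision rule: water is dark (low value) and, because it
    absorbs SWIR and NIR more than red, falls in the blue part of the hue
    circle under the mapping above. Thresholds are illustrative only."""
    hue, _saturation, value = hsv_from_bands(swir, nir, red)
    return hue_range[0] <= hue <= hue_range[1] and value <= max_value

# Illustrative reflectances (not measured values)
water_pixel = (0.01, 0.03, 0.06)
vegetation_pixel = (0.15, 0.40, 0.05)
print(looks_like_water(*water_pixel), looks_like_water(*vegetation_pixel))
```

Because hue is decoupled from brightness, the same hue interval can in principle be applied from one image to the next, which is the property the text describes as image-independent discrimination.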

Caveats on remote sensing data
Various issues have to be highlighted when looking from the application angle:
• The spatial resolution of the environmental factors needed for a study is often not the same (table 2), and transformation to a common resolution might increase geolocation imprecision when pixel limits do not correspond. The pixel size selected for the analysis then depends on the available datasets and is not necessarily the best cell size to describe the phenomena under study. Datasets not available at high resolution thus limit the spatial detail of the results.
• Some useful datasets do not cover the whole world or are not available for the appropriate date.
• Very detailed datasets, such as rainfall products with data every 3 hours (TRMM), require a long summarizing process before non-specialists can get information per week or month.
• At high spatial resolution, geolocation accuracy can be jeopardized by the viewing angle, particularly in rugged terrain. Image distortion needs to be corrected using topographic information that is not always available at high resolution.
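The summarizing step mentioned for 3-hourly rainfall products can be sketched as follows. This is a minimal illustration that assumes the values for one pixel have already been converted to rainfall amounts in mm per 3-hour step; the function name and the handling of incomplete periods are choices made here, not part of any TRMM tool.

```python
def aggregate_rainfall(three_hourly_mm, steps_per_period=8 * 7):
    """Sum consecutive 3-hourly rainfall amounts (mm) into period totals.

    Eight 3-hour steps make one day, so the default of 56 steps yields
    weekly totals. A trailing incomplete period is dropped."""
    n_full = len(three_hourly_mm) // steps_per_period
    return [
        sum(three_hourly_mm[i * steps_per_period:(i + 1) * steps_per_period])
        for i in range(n_full)
    ]

# Two weeks of a constant 0.5 mm per 3-hour step (illustrative values)
series = [0.5] * (8 * 14)
weekly_totals = aggregate_rainfall(series)
daily_totals = aggregate_rainfall(series, steps_per_period=8)
```

Applied to every pixel of a gridded product, the same reduction gives the weekly or monthly rasters that non-specialists typically need.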

Spatial resolution:
Remote sensing is typically characterized by a trade-off between the different types of resolution: spatial, temporal, spectral and, to a certain extent, radiometric and angular. High spatial resolution is desired to characterise the land in a detailed way. However, cloud occurrence limits its availability. The basis for a land cover map might be a patchwork of images from different seasons or even years, which creates artifacts of land cover differences at the limits between the images. As discussed further, coarse spatial resolution imagery, with its frequent revisit and through the use of compositing, can partially remedy the problem, but it remains a problem with high resolution imagery, for which images are costly and revisits are infrequent. Having regular observations at fine spatial resolution typically limits the geographic extent that can be monitored. Even over a limited coverage, satellites providing such services are typically commercial ones for which the cost is currently high and for which different geographic sites compete for observation capacity. Such images are thus often used in studies of limited spatial extent, from which the results are difficult to extrapolate to the country level needed for spatial surveillance. It is however just a matter of time before high spatial resolution (5-20 m) becomes available for the entire globe: the European Space Agency is currently preparing its Sentinel-2 constellation (with an expected launch of its first satellite in 2014), which aims at operationally providing multispectral imagery at spatial resolutions of 10 to 60 m for different bands, with a 5-day revisit period. However, the challenge of collecting, processing and delivering this data may still limit its practical use for years.

Clouds and compositing:
The quality of the spatial and temporal spectral consistency of coarse resolution optical time series may be limited by processing steps of cloud-screening and compositing. The efficiency of the cloud-screening, i.e. its ability to remove clouds while keeping a maximum of useful information, depends on three factors: (i) the methodology used to identify cloud-free pixels, (ii) the type of clouds (thick clouds are easier to overcome than  (Figure 2) that minimizes the effect of undetected clouds since these would typically have a lower NDVI value. However, the composited reflectance bands may exhibit substantial radiometric variations, since composite radiances are generally recorded under varying atmospheric and geometric conditions. This may cause serious spatial inconsistencies in the composites and in the subsequent processing.

Figure 2. Maximum NDVI standard compositing
A more advanced approach consists of normalizing the bidirectional reflectance by fitting a bidirectional reflectance distribution function (BRDF) model to the available cloud-free observations [78], which considerably improves the result. However, operational implementation requires a large number of cloud-free observations, the BRDF retrieval is highly sensitive to residual clouds [79] (Figure 3), and the algorithm is complex and requires ancillary data. A more flexible and "user-friendly" compositing approach was recently proposed [80], in which cloud-free reflectance values are averaged after a quality control. It has the advantage of reducing both the anisotropy effects and the possible perturbations remaining after atmospheric correction and cloud removal. Despite the benefits of compositing, for some applications it may be more interesting to avoid it altogether. Indeed, to follow vegetation changes at a finer time scale it may be better to exploit all available observations within a period (typically 10 days or more) instead of combining them. In agriculture monitoring, considerable changes in biomass or phenology can occur within a week, and exploiting all available observations should thus be preferred. Such an approach has been used to provide crop-specific biophysical variable time series at regional scale by fitting a simplified model of the canopy dynamics to daily data [81], and might be of use to identify processes occurring in potential Anopheles habitats such as rice paddies.
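The two simpler compositing strategies discussed above, maximum NDVI selection and averaging of quality-controlled observations, can be contrasted in a short sketch for a single pixel. The function names, the (red, NIR) tuple layout and the boolean cloud flags are assumptions made for illustration; operational chains such as the one in [80] involve additional quality checks that are not reproduced here.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and NIR reflectance."""
    return (nir - red) / (nir + red)

def max_ndvi_composite(observations):
    """Keep the (red, nir) observation with the highest NDVI in the period.
    Undetected clouds tend to lower NDVI, so they are rarely selected."""
    return max(observations, key=lambda obs: ndvi(obs[0], obs[1]))

def mean_composite(observations, cloud_free):
    """Average the reflectances of the observations flagged as cloud free.
    Returns None when no valid observation is available in the period."""
    kept = [obs for obs, ok in zip(observations, cloud_free) if ok]
    if not kept:
        return None
    n = len(kept)
    return (sum(o[0] for o in kept) / n, sum(o[1] for o in kept) / n)

# Three observations of one pixel in a 10-day period; the second is cloudy
period = [(0.05, 0.40), (0.30, 0.32), (0.06, 0.42)]
flags = [True, False, True]
selected = max_ndvi_composite(period)
averaged = mean_composite(period, flags)
```

The maximum NDVI rule returns one observation acquired under one particular geometry, whereas the mean smooths over the viewing conditions of all retained observations, which is why it reduces the anisotropy effects mentioned in the text.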
What is in a pixel?
Coarse spatial resolution satellite imagery has several advantages. First, frequent observations enable timely detection of environmental changes that may indicate potential changes in the presence of Anopheles. Second, the higher frequency of available observations makes it possible to better address the lack of data due to cloud contamination and anisotropy through compositing or temporal smoothing. Third, their (relatively) long archives provide a picture of the past with which current conditions can be compared. In the near future, coarse datasets may also serve as a benchmark to which other products are calibrated, since their signal could be more stable thanks to their higher revisit frequency. Finally, coarse spatial resolution data are also often the only data available, and there is thus a tendency to use them at the limit of their spatial resolution by looking at individual pixels. A common misconception is that the observational footprint is the geometric projection of a rectangular pixel onto the Earth's surface [82]. The footprint rather depends on some properties of the instrument, summarized under the concept of spatial response [83], which results in an observation footprint generally larger than the pixel delivered to the user (Figure 4).
This problem is compounded for sensors such as AVHRR, MODIS and VIIRS (the successor of MODIS), which scan the Earth with large angles, leading to an expansion of the observation footprint along the scanline (while the grid in which the data is provided keeps the same size). Furthermore, the pre-processing step of gridding, i.e. assigning an observation to a predefined grid system, introduces a "pixel-shift" [84], which means that the centre of the pixel does not correspond to the centre of the observation. Such gridding artefacts have serious consequences on the quality of the MODIS signal, and more specifically on composites and band-to-band registration across various spatial resolutions [85]. Recent work [86] has further demonstrated the impact of gridding artefacts and the scan angle on the spatial purity of an observation, i.e. on the percentage of the target land cover within an observation footprint that effectively contributes to the signal encoded in the pixel.

Mosquito Land cover:
Land cover provides the most understandable information to non-specialists in terms of vegetation and habitat, but the classes are not always adapted to the user's needs. Instead of choosing between vegetation indices, which represent continuous values, and land cover with more or less 20 classes, it might be useful to give access to intermediary products of land cover classification. Indeed, land cover processing chains such as that of GlobCover include a correction process, cloud screening and image compositing to improve the overall quality of the data [71]. Vegetation indices and reflectance bands linked to vegetation status are then used to group similar adjacent pixels and assign them the same class label through a clustering method that creates a chosen number of classes. In the next step, each class is compared to the classes of an existing reference dataset or other existing data. According to a set of decision rules, the classes are interpreted and grouped into definitive classes. This last step raises several issues. The transformation of a continuous dataset and the separation of the continuous landscape into a set of discrete classes are bound to cause a loss of information and inaccuracy, particularly at the borders of the classes. For example, the transition from a forest to a meadow might not always present a clear-cut border. Other land cover initiatives work with continuous fields to avoid this issue [88]. Moreover, at the end of the process, up to 30% of the pixels are integrated into mosaic classes, which are used when it is impossible to attribute the group of pixels to a single class: the pixel itself is a mixture, for example of forest and meadow, and thus provides a signal that corresponds neither to forest nor to meadow. Using mosaic classes in models and analyses can create confusion, particularly if several mosaic classes are grouped together.
In this context, access to intermediate products, such as classes based on clusters of similar pixels produced by remote sensing specialists, might allow those products to be integrated into ecological models. Integrating ecological information relevant to Anopheles into the building of the final land cover would make it possible to define a product better suited to the purpose. Integrating several sensors to build a land cover map might also improve the results. Indeed, GlobCover is based on MERIS satellite images, which do not contain the Short Wavelength Infrared (SWIR) band useful for discriminating forest vegetation. A combination with SPOT-VEGETATION could result in better discrimination power at a similar resolution.

Model development
Sampling strategies, detailed field studies and casual observations can provide data which constitute the baseline information for model development. While remote sensing products are still too coarse in resolution, or perhaps not adapted, to define microhabitats, they can nevertheless provide proxies for environmental factors influencing general habitat and might be used in two ways. (1) Environmental values can be extracted at the sampling sites or in a buffer around the sites and then related to Anopheles data in descriptive models. Buffer size is often a compromise between some meaningful ecological feature such as flying range and the spatial resolution of the environmental factors [89]. (2) For questions regarding habitat, spatial variation in vector capacity and spatial surveillance, spatial models are needed. In these models, environmental factors are related to the species records collected at the sampling locations, and this relation is then used to predict the distribution of the species beyond the sampling locations [90][91][92][93].
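The buffer extraction described in (1) can be sketched on a small gridded dataset. The example assumes a raster stored as a list of rows, a sampling site already converted to row and column indices, and a circular buffer expressed in number of cells; the function name and the use of None for missing data are illustrative choices.

```python
def buffer_mean(raster, row, col, radius_cells):
    """Mean of the raster cells whose centres lie within radius_cells of the
    cell (row, col). Cells outside the raster and cells holding None
    (no data) are ignored; returns None if no valid cell is found."""
    n_rows, n_cols = len(raster), len(raster[0])
    values = []
    for r in range(max(0, row - radius_cells), min(n_rows, row + radius_cells + 1)):
        for c in range(max(0, col - radius_cells), min(n_cols, col + radius_cells + 1)):
            inside = (r - row) ** 2 + (c - col) ** 2 <= radius_cells ** 2
            if inside and raster[r][c] is not None:
                values.append(raster[r][c])
    return sum(values) / len(values) if values else None

# Illustrative 3 x 3 raster of an environmental factor (arbitrary units)
factor = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
at_site = buffer_mean(factor, 1, 1, 0)      # value of the cell itself
around_site = buffer_mean(factor, 1, 1, 1)  # cell plus its four neighbours
```

With 1 km pixels, a radius of one or two cells is roughly the order of a mosquito flying range, which illustrates why buffer size ends up being a compromise between ecology and data resolution.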
When working with existing data, sampling protocol cannot be influenced a posteriori but an adapted methodology can be used to take into account potential peculiarity of each dataset. Field data may be obtained as a by-product of existing operational projects. However, depending on the finality which determined sampling design, the data might not be used straightforwardly for spatial surveillance. The dataset might include non standardized data collected during different years, according to a variety of sampling strategies but might be the only data available covering many countries. Existing datasets can consist of a collection of literature records covering wide regions. However, the collection sites are seldom well georeferenced, large areas are not covered by the studies which might use different collection techniques at different seasons. With such datasets lack of records might be linked to inefficient sampling method, wrong timing for the survey or absence of survey and according to the source of data, abundance and absence need to be treated with caution. Even certified presence might not reflect current situation if recorded years ago. These issues may partly be addressed by methods similar to the previously mentioned subsampling procedures to reduce the potential biases in readily available datasets or using adapted modelling techniques.

Species distribution modelling
Early development in the field of remote sensing and vector-borne disease risk mapping used the following methodological steps: collecting human cases (or mosquito presence/absence), collecting relevant environmental gridded data (pixels), extracting data at sampling sites to build a logistic regression model explaining occurrence according to the environmental conditions, then mapping the probabilities by calculating the model output for each grid cell of the original environmental maps [94]. Numerous methods have now been used to model vector-borne diseases spatially [95], and suggestions to address frequent drawbacks include (1) using several models and selecting the best suited for prediction and (2) making a summary model from the best-fitting models. At the same time, innovative methods are constantly being improved in spatial ecology. Quantifying the link between species and their environment is a central research area in quantitative ecology. When absence data are available and reliable, numerous methods now exist: logistic regression, ordinary multiple regression and its generalized form (GLM), ordination and classification methods, distance metrics such as Mahalanobis distances, neural networks, boosted regression trees, random forests and even more sophisticated support vector machines are some examples among the plethora of recently developed methods [14]. Multi-species community modelling methods have also been developed. One advantage of this kind of technique is that it becomes possible to build species assemblage models that take into account the relationships between the different species in the community, and so their relative location in the "environmental hyperspace", instead of modelling single species distributions independently from each other [96].
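The early workflow described at the start of this section (fit a logistic regression at the sampling sites, then apply it to every grid cell) can be sketched in a few lines. This is a minimal illustration with a plain gradient-descent fit and a single standardized predictor; the data values, function names and learning settings are invented for the example and do not come from [94].

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit an intercept and one coefficient per predictor by batch gradient
    descent on the logistic log-loss. X is a list of predictor tuples and
    y a list of 0/1 absence/presence records."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_map(w, grid):
    """Apply the fitted model to a 2-D grid whose cells hold predictor tuples."""
    return [[sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], cell)))
             for cell in row] for row in grid]

# Hypothetical standardized predictor (e.g. minimum temperature) at 12 sites
sites = [(-1.2,), (-0.6,), (-0.3,), (0.2,), (-0.4,), (0.8,),
         (0.1,), (0.7,), (0.9,), (1.3,), (0.5,), (-0.5,)]
presence = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
weights = fit_logistic(sites, presence)
probability_map = predict_map(weights, [[(-1.0,), (0.0,)], [(0.5,), (1.0,)]])
```

The same two-step structure (fit at the sites, predict on the grid) carries over to the more recent methods listed above; only the model in the middle changes.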
However, mapping elusive species such as mosquitoes is often a challenge mainly because of the impossible collection of reliable absence data such as described earlier. Discriminant approaches such as logistic regression analysis developed for specific diseases are thus not suited anymore because they compare environmental conditions in sites where the species is present and absent (not recorded). When only occurrence data are available, some niche-based modelling approaches offer adapted solution as they can use presence-only record information to build the statistical models. The concept of ecological niche has been defined [97] as follows: considering the n variables corresponding to all of the ecological factors relevant for the species, an n-dimensional hyper-volume can be defined in the environmental hyperspace between the limiting values permitting a species to survive and reproduce. This volume is called the fundamental niche of the species. This niche can be related to the two-dimensional geographical area of distribution considering that any point of the niche may represent a combination of environmental values that corresponds to some locations in the geographical space. Mechanistic approaches to ecological niche modelling [90] use direct measurements or physical modelling of response of individuals to parameters and infer from them individuals fitness values of different combinations of physical variables. On the contrary, correlative approaches to ecological niche models such as developed for species distribution models intend in a first step to define niches using the environmental variables at sampling point of occurrence, then assess for each spatial location in a study area probability to belong to the niche. 
Many large-scale species modelling techniques inspired by the principle of environmental envelopes were developed including BIOCLIM [98] based on a very simple classification tree, DOMAIN [99] based on a measure of multivariate distance, ENFA [100] based on the same principle of distance measure in an environmental hyperspace. Elith et al. [14] provide a good overview of most currently used methods including the Maxent method [22] based on presence data which seems to perform particularly well.
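The environmental envelope principle that inspired these techniques can be illustrated with its simplest form: a rectilinear envelope defined by the range of each variable over the presence records. This sketch is a generic illustration written for this purpose; it is not the BIOCLIM, DOMAIN or ENFA algorithm, and the variable names and values are hypothetical.

```python
def fit_envelope(presence_records):
    """Return one (minimum, maximum) pair per environmental variable,
    computed over the presence records only (no absence data needed)."""
    n_vars = len(presence_records[0])
    return [
        (min(rec[j] for rec in presence_records),
         max(rec[j] for rec in presence_records))
        for j in range(n_vars)
    ]

def inside_envelope(envelope, cell_values):
    """True when every variable of the cell falls within the envelope."""
    return all(lo <= v <= hi for (lo, hi), v in zip(envelope, cell_values))

# Hypothetical presence records: (mean temperature in deg C, annual rainfall in mm)
records = [(22.0, 800), (25.0, 1200), (27.5, 1500)]
envelope = fit_envelope(records)
suitable = inside_envelope(envelope, (24.0, 1000))
unsuitable = inside_envelope(envelope, (18.0, 1000))
```

Distance-based methods replace the yes/no test with a continuous measure of how far a cell lies from the presence records in the environmental hyperspace.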
In this context, classical presence-only modelling can also be integrated [9] into a hierarchical framework [101]. The first step is to model entomological data using environmental data relevant for the same time period. Indeed, mapping Anopheles information from literature records dating back several decades should be based on long-term environmental factors such as climatic factors, and not on factors such as land cover or NDVI, which change fast in some regions. The first potential distribution, mapped from long-term, slowly changing information and literature records, is then refined using a mask of fast-changing, updatable information such as land cover or current meteorological predictions. This makes it possible to produce a risk map or distribution map relevant for a specific date, namely the date of the environmental factors used to refine the map. The resulting map is thus ecologically meaningful and relevant for a precise date. Other recent improvements in the field of presence-only models include the selection of pseudo-absences with a spatial bias similar to the potential bias of the presence data [102], the selection of the environmental factors entering the model based on ecological requirements, and adapted methods for species with a low number of occurrences [103].
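The refinement step of this hierarchical framework amounts to overlaying a date-specific mask on the long-term potential distribution. A minimal sketch, assuming both layers are co-registered grids of the same size, is given below; the land cover class labels and the choice of setting masked cells to zero are hypothetical.

```python
def refine_with_mask(potential, land_cover, suitable_classes):
    """Keep the long-term suitability value where the current land cover
    class is considered suitable for the species, and set it to 0 elsewhere.
    Both grids must have the same shape."""
    return [
        [p if lc in suitable_classes else 0.0 for p, lc in zip(p_row, lc_row)]
        for p_row, lc_row in zip(potential, land_cover)
    ]

# Long-term potential distribution from climate and literature records
potential = [[0.8, 0.6],
             [0.3, 0.9]]
# Land cover at the date of interest (hypothetical class labels)
land_cover = [["forest", "urban"],
              ["rice", "forest"]]
dated_map = refine_with_mask(potential, land_cover, {"forest", "rice"})
```

Re-running only this second step whenever the land cover or meteorological layer is updated gives a map valid for the new date without refitting the underlying model.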
Some issues still need to be tackled, however. Ecological models should be based on source populations, that is, sustainable populations in suitable habitat. In contrast, sink populations live in habitat not suitable for population persistence and persist only thanks to immigration from nearby source populations. Typical museum records include both sink and source populations [104]. Moreover, current vector-borne disease distribution may be limited by a number of factors, both environmental and socio-economic. For example, during the past 100 years, the malaria risk zone has shrunk from around a half to a quarter of the Earth's land surface. However, malaria remains prevalent in 106 countries of the tropical and semitropical world, with 35 countries in central Africa bearing the highest burden of cases and deaths [105,106]. The latitudinal limits apparent today are in effect 'control frontiers' reflecting the interplay of control interventions combined with changes in environmental management and socioeconomic developments that reduce community vulnerability to the disease [107]. Altitudinal limits to malaria transmission have been the subject of much discussion regarding the shift of malaria risk into highland regions, such as in East Africa. While documented climate change [108] might have had a small impact, the major factors for extension to new areas seem to be changes in land use and landscape leading to changes in local ecology for humans and vectors [109].

Time or space prediction -Evolution in time -Forecast
While delineation of potential habitat for a species is a first step in risk mapping for Anopheles-borne diseases, forecasting seasonal events and variation in (micro-)habitat suitability and mosquito population is essential. Remote sensing and Geographical Information Systems (GIS) have contributed to the development of environmental systems to support vector control and of more sophisticated early warning systems. Those systems usually target situations of epidemic malaria, which occurs in regions where malaria is not present continuously but is associated with climatic events such as a particularly wet season in near-desert areas [110] or a hot season in African highlands [111]. Epidemic situations are predicted in order to increase preparedness in public health [112]. These first experiences are reviewed in [10]. Several trends are observed in current research, but a major effort is targeted towards the prediction of the malaria epidemic season based on climatic/meteorological variables, particularly in the context of climate change and the availability of new meteorological data sources [35]. The disease risk is forecast using seasonal climate prediction, in particular rainfall and sea-surface temperature [110], and the influence of climate change is analysed [113]. Following the development of the European ENSEMBLE System for seasonal to inter-annual prediction [114], ambitious research now proposes to integrate the seasonal climate forecasts from climate models into malaria early warning systems [115]. Regional specificity still needs to be integrated into such models, for example the fact that low rainfall may trigger epidemics in the highlands [116].

Anopheles vector capacity
When trying to assess the risk of disease occurrence, vector presence is not the only requirement: the capacity and eagerness to transmit the disease are essential. This capacity is well summarized in the vectorial capacity (VC) concept [117], derived from the basic reproduction rate of MacDonald [118]. Vectorial capacity is a series of biological features that determine the ability of mosquitoes to transmit Plasmodium. It is defined as the daily rate at which future inoculations could arise from a currently infected case [119], and it is generally used as a convenient way to express malaria transmission risk. Interestingly, a spatial version of the VC formula, called VCAP, has been developed, allowing assessment of vectorial capacity for each pixel in a given area [120]. To make this possible, VCAP is a version of VC driven only by minimum Ta and rainfall. Rainfall and temperature are used as inputs to the model because they have an impact on vectorial capacity. Temperature has an effect on both the vector and the parasite. For the vector, it affects the juvenile development rates, the length of the gonotrophic cycle and the survivorship of larvae and adults, with an optimal temperature and upper and lower lethal boundaries. For the parasite, it affects the extrinsic incubation period [121]. Plasmodium falciparum (the dominant parasite in Africa) requires a warmer minimum temperature than Plasmodium vivax. This can account for the geographic limits of malaria transmission for this species in Africa [122]. At 26 ºC the extrinsic incubation period of this species is about 9-10 days, whereas at 20-22 ºC it may take as long as 15-20 days. In highlands, where cold temperatures preclude vector and/or parasite development during part or all of the year, increased prevalence rates may be associated with higher than average minimum temperatures [123], which might be brought about by periods of low rainfall [116].
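To make the role of temperature concrete, the sketch below uses the classical formulation of vectorial capacity usually attributed to Garrett-Jones, C = m a² pⁿ / (−ln p), where m is the number of mosquitoes per human, a the daily human-biting rate, p the daily survival probability and n the extrinsic incubation period in days. This is the textbook expression and is not necessarily the exact form implemented in VCAP [120]; the parameter values are illustrative only.

```python
import math

def vectorial_capacity(m, a, p, n):
    """Classical vectorial capacity: m * a**2 * p**n / (-ln p).

    m: mosquitoes per human, a: bites on humans per mosquito per day,
    p: daily survival probability (0 < p < 1),
    n: extrinsic incubation period of the parasite in days."""
    return m * a ** 2 * p ** n / (-math.log(p))

# Illustrative parameter values (assumptions, not field estimates)
density, biting_rate, survival = 10.0, 0.3, 0.9

# Shorter incubation at about 26 deg C versus longer at about 20-22 deg C
vc_warm = vectorial_capacity(density, biting_rate, survival, n=10)
vc_cool = vectorial_capacity(density, biting_rate, survival, n=18)
print(vc_warm > vc_cool)
```

Because the incubation period appears as an exponent of the survival probability, a few extra days of parasite development at cooler temperatures reduce the capacity markedly, which is consistent with the sensitivity of highland transmission to minimum temperature described above.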
It is possible to use minimum Ta derived from MODIS for monitoring the risk of malaria transmission in highland regions, including Eritrea and Ethiopia, where a high proportion of the population lives at risk of epidemic malaria. Currently, the USGS EROS Center uses this temperature, derived from MODIS night Ts on an 8-day basis, jointly with rainfall data derived from the Tropical Rainfall Measuring Mission (TRMM) downscaled to 1 km spatial resolution, to produce a 1 km VCAP map every 8 days specifically for the epidemic regions of sub-Saharan Africa [118]. Eq. 1 integrates the two raster images, MODIS night-time Ts and rainfall (TRMM). The analysis of VCAP in relation to rainfall, temperature, and malaria incidence data in Eritrea and Madagascar shows that the VCAP correctly tracks the risk of malaria both in regions where rainfall is the limiting factor and in regions where temperature is the limiting factor [118]. However, in the Burundi highlands, low rainfall triggered higher temperatures and increased the risk of epidemics [116]. Lower rainfall might thus be the trigger, particularly because houses provide a microenvironment with a stable temperature 5 °C higher than the outside temperature, which reduces the influence of temperature on epidemic risk. The VCAP could also be further detailed by carrying out the analysis per vector species.

Transferring spatial information to health professionals
Roberts et al. [124] demonstrated many potential uses of remotely sensed data in managing and targeting vector and disease control measures. Simply mapping the attributes of existing Anopheles species can already bring information. Recently, a map of all existing records for the Anopheles dirus complex was proposed [125] to document ecological settings, but also to demonstrate that detailed mapping could bring much more information and could lead to more sophisticated models [9] built from those datasets, such as those developed [6] for Anopheles vectors. In this context, major efforts were made in the past to provide mapping expertise to malaria control staff through customized GIS applications and to help them map their entomological and disease case records. Simply overlaying this information with existing environmental information can lead to new working hypotheses, better defined by people with experience in the field. The current availability of easy-to-use packages such as Google Earth and Google Maps offers new opportunities, particularly in areas covered by detailed imagery. Studies carried out by scientists devoted to research provide outputs in scientific publications, in pdf format, or might target small study areas not representative of the whole country. While this type of output is useful for advances in science, it is often of little use to the health worker in the field. Two types of approaches are better adapted to the field and are complementary. One is to provide ready-to-use products to integrate into operating systems, updated regularly to feed into early warning systems, or informative enough to provide the necessary clues for control and forecast. Those include vector capacity maps. The other approach is to bring as much expertise as possible into the hands of the health worker.
However, to be fully operational the development of new products and early warning systems presented above must be integrated into a decision/action framework. There is currently a good deal of policy congruence through international, regional and local levels to support this effort (e.g. the Global Framework for Climate Services whose aims are to develop more effective services to meet the increasing demand coming from climate sensitive sectors including health). The remaining challenge is to get the knowledge into practice and sustaining it where it is needed. It is crucial that appropriate policies are developed and implemented to improve health system performance [126]. This may be helped by enhancing the workforces' ability to detect and treat diseases, monitor and predict spatio-temporal patterns and implement intervention and control strategies in a timely and cost-effective manner through the use of tools and analysis informed by climate data.
In order to get research outcomes into policy and practice, it is important to understand the context in which policies are adopted and supported in a practical manner. Below is an example of how policies developed at the district and national level connect to the larger political agenda of international policy makers. At the global scale, improved early warning, prevention and control of epidemics is one of the key technical elements of the current Global Strategy for Malaria Control [127] of the RBM Partnership referenced earlier in this section. In Africa, Heads of State declared their support for the Roll Back Malaria initiative in April 2000 with the Abuja Targets [128]. In these targets, national malaria control services are expected to detect sixty per cent of malaria epidemics within two weeks of onset, and respond to sixty per cent of epidemics within two weeks of their detection. With the support of the WHO Regional Office for Africa, the WHO Inter-Country Programme Teams engage in the development of recommendations, guidelines and technical support to improve prevention and control of epidemics, including across borders, within their various sub-regions (e.g. the Regional Economic Communities (RECs) ECOWAS, IGAD and SADC), including collaborative activities with the African Development Bank. As a consequence of these policy developments, epidemic-prone nations have enhanced their capabilities for delimiting epidemic/endemic-prone areas, established epidemic malaria surveillance systems, and strengthened their epidemic response capacities with the help of the Global Fund to Fight AIDS, Tuberculosis and Malaria (GFATM) and other donor support.
In many national malaria control policy documents, countries now recognize that to achieve the Roll Back Malaria (RBM) and Millennium Development Goals (MDG) targets they need better information on where epidemics are most likely to occur, and some indication of when they are likely to happen. As a consequence, they have begun to explore the use of climate information in the development of integrated early warning systems. Thus, there is increasing congruence in policy initiatives from multilateral, bilateral, national and non-governmental agencies in relation to epidemic disease control and a growing demand for climate information and robust early warning systems to support these efforts. This is reflected in the newly emerging Global Framework for Climate Services. This policy congruence extends to the current discussions on adaptation to climate change. Strengthened health systems are also seen as vital to improving the management of climate-sensitive disease in the context of climate change. The IPCC identified building public health infrastructure as: The most important, cost effective and urgently needed adaptation strategy. Other measures endorsed by the IPCC include public health training programs, more effective surveillance and emergency response systems, and sustainable prevention and control programs. These measures are familiar to the public health community and are needed regardless of climate change and constitute what is the basis of a no regrets adaptation strategy [129,130].

Further research
In terms of data, interactions between Anopheles species should be investigated, whether the species are sympatric in the same habitat or even the same breeding site, or whether one dominant species deters another. An adapted methodology based on asymmetrical similarity coefficients, indirect clustering and the search for indicator species [131] has been proposed [132] to identify species associations. Such associations can help assess the risk that an elusive species is present when another, frequently associated species has been found. Caveats and potential improvements regarding environmental factors have already been discussed. Remote sensing already offers a wide range of useful products, but improvements could target easier delivery of these products, as proposed by the IRI data library (http://iridl.ldeo.columbia.edu/), in a standardised format and resolution, and the availability of all useful derived products worldwide.
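The species-association idea above can be illustrated with a minimal sketch. It assumes presence records stored as sets of site identifiers and uses the Jaccard coefficient, a standard example of an asymmetrical binary coefficient (one that ignores sites where both species are absent). The coefficient actually used in [132] may differ, and the site identifiers and function names below are hypothetical.

```python
def jaccard(sites_a, sites_b):
    """Jaccard similarity between two species' presence sets.

    Joint absences are ignored, which is what makes the
    coefficient 'asymmetrical' in the ecological sense.
    """
    either = len(sites_a | sites_b)
    if either == 0:
        return 0.0
    return len(sites_a & sites_b) / either


def conditional_presence(target_sites, indicator_sites):
    """Share of the indicator species' sites where the target also occurs.

    A rough, data-driven estimate of P(target present | indicator present).
    """
    if not indicator_sites:
        return 0.0
    return len(target_sites & indicator_sites) / len(indicator_sites)


# Hypothetical survey: site identifiers where each species was collected.
elusive = {"s02", "s03", "s07"}
common = {"s01", "s02", "s03", "s04"}

print(jaccard(elusive, common))               # overall association
print(conditional_presence(elusive, common))  # risk of elusive species given common one
```

A high conditional presence suggests that finding the common species at a new site is a reason to search more carefully for the elusive one. It does not by itself establish an ecological interaction.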
In terms of modelling, various issues have also already been discussed, such as the need to better integrate ecological issues such as sink and source populations [104]. Regarding the outputs, a quality assessment could be attached to the resulting maps. Bayesian inference can be used [133] to quantify the uncertainty in the predictions. Rather than mapping the prevalence, what is mapped is the probability, given the data, that a particular location exceeds the predetermined high-risk prevalence threshold at which a change in control strategy or in the delivery of the drug is required. A level of uncertainty attached to each location helps the decision maker choose which areas are at risk and which are not.
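As a rough illustration of such an exceedance probability, the sketch below treats each location independently with a Beta-Binomial model and estimates the posterior probability that prevalence exceeds a threshold by Monte Carlo sampling. This is a deliberate simplification: the approach in [133] is spatial, whereas this sketch ignores spatial correlation. The prior, the threshold, the survey counts and the function name are all assumptions made for illustration.

```python
import random


def exceedance_probability(positives, tested, threshold,
                           prior_a=1.0, prior_b=1.0,
                           draws=20000, seed=0):
    """Estimate P(prevalence > threshold | data) at one location.

    With a Beta(prior_a, prior_b) prior and binomial survey data,
    the posterior is Beta(prior_a + positives, prior_b + negatives).
    The probability is approximated by the share of posterior draws
    above the threshold.
    """
    rng = random.Random(seed)
    a = prior_a + positives
    b = prior_b + (tested - positives)
    hits = sum(1 for _ in range(draws) if rng.betavariate(a, b) > threshold)
    return hits / draws


# Hypothetical surveys: (location, positives, number tested).
surveys = [("village A", 30, 50), ("village B", 10, 50), ("village C", 1, 50)]
THRESHOLD = 0.20  # assumed high-risk prevalence threshold

for name, positives, tested in surveys:
    p = exceedance_probability(positives, tested, THRESHOLD)
    print(f"{name}: P(prevalence > {THRESHOLD:.0%}) = {p:.2f}")
```

Locations with a probability near 1 or near 0 can be classified with some confidence. Intermediate values flag places where the data are too sparse or too close to the threshold, and where further sampling may be more useful than an immediate change of strategy.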
The data entered in models and the choices made by modellers need to be documented in detail, particularly when the results might trigger public health decisions [134]. Indeed, the final results depend not only on the input data but also on the pre-processing of those data, the selection of useful variables and the selection of a best model among several candidates: a whole process of model building that leads to one final result dependent on the modeller's choices. More details on the dates of the satellite images used to derive RS products, or even a spatial description of their quality, could also improve the final results and their interpretation. Providing maps of the datasets entered in the model could help spot good spatial consistency or mismatches between adjacent raw images.
While predicting disease occurrence is generally the objective of forecasting, targeting the vector instead of the disease cases might provide several advantages. Indeed, some diseases might be present in a high number of asymptomatic carriers (e.g. lymphatic filariasis), or might not be accurately reported because the disease is not notifiable or because misdiagnosis is frequent, such as the confusion between malaria and Borrelia duttoni in parts of Senegal and Togo [135]. Targeting the vector can help identify areas where asymptomatic cases might occur, address several diseases at once, and predict epidemics or the seasonal occurrence of diseases in advance based on fluctuations in mosquito populations.
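One simple way to explore whether mosquito fluctuations give advance warning is to look for the time lag at which vector abundance correlates best with later case counts. The sketch below is only an assumption about how this could be done, not a method taken from the cited studies. The two series are invented, and a real analysis would need to account for seasonality, reporting delays and autocorrelation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(vector_counts, case_counts, max_lag):
    """Return (lag, r): the lag at which cases correlate best with earlier vector counts."""
    best = (0, pearson(vector_counts, case_counts))
    for lag in range(1, max_lag + 1):
        # Pair vector abundance at time t with cases at time t + lag.
        r = pearson(vector_counts[:-lag], case_counts[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Invented bi-weekly series: cases follow mosquito abundance two periods later.
mosquitoes = [1, 5, 9, 4, 1, 1, 6, 10, 5, 1, 1, 2]
cases = [1, 1] + mosquitoes[:-2]

lag, r = best_lag(mosquitoes, cases, max_lag=4)
print(f"best lag: {lag} periods (r = {r:.2f})")
```

If a stable lag of this kind were found in field data, it would indicate how far in advance entomological monitoring could signal a rise in cases.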

Conclusions
In conclusion, providing relevant information to support spatial disease surveillance is not straightforward and is better described as a multidisciplinary challenge. To improve the current situation, increased sharing of existing data and greater transparency and documentation in model building could help target weak points, such as places with little information or parts of the modelling process that could be improved. Documenting the quality of the entomological and environmental datasets, the relevant dates of each parameter (such as the original satellite images included in land cover maps) and potential issues such as the sampling of source-sink populations could help identify new questions. Meanwhile, the information is still needed to support essential activities such as malaria control, as well as scientific research. A better interaction between research and operational work also seems necessary. Research products and results can only be useful if validated in the field, and the best research questions are defined by people working in the field. Constant interaction can improve the quality of research products and, ultimately, surveillance. Reinforcing research capabilities in the region and in the malaria centres is of utmost importance. Indeed, in-country malaria workers have extensive field experience. They are in a better position to analyze the situation, identify their needs and find the answers. This would help bring the data and the expertise where they are most needed: in the malaria centres.

AVHRR Advanced Very High Resolution Radiometer
BRDF bi-directional reflectance distribution function