Abstract
Background
The risk of Plasmodium falciparum infection is variable over space and time and this variability is related to environmental variability. Environmental factors affect the biological cycle of both vector and parasite. Despite this strong relationship, environmental effects have rarely been included in malaria transmission models.
Remote sensing data on the environment were incorporated into a temporal transmission model to forecast the evolution of malaria epidemiology in a locality of the Sudanese savannah area.
Methods
A dynamic cohort was constituted in June 1996 and followed up until June 2001 in the locality of Bancoumana, Mali. The 15-day composite vegetation index (NDVI), derived from a satellite imagery series (NOAA) from July 1981 to December 2006, was used as remote sensing data.
The statistical relationship between NDVI and incidence of P. falciparum infection was assessed by ARIMA analysis. ROC analysis provided an NDVI value for the prediction of an increase in incidence of parasitaemia.
Malaria transmission was modelled using an SIRS-type model, adapted to Bancoumana's data. Environmental factors influenced vector mortality and aggressiveness, as well as length of the gonotrophic cycle. NDVI observations from 1981 to 2001 were used for the simulation of the extrinsic variable of a hidden Markov chain model. Observations from 2002 to 2006 served as external validation.
Results
The seasonal pattern of P. falciparum incidence was significantly explained by NDVI, with a delay of 15 days (p = 0.001). An NDVI threshold of 0.361 (p = 0.007) provided a Diagnostic Odds Ratio (DOR) of 2.64 (CI95% [1.26;5.52]).
The deterministic transmission model, with stochastic environmental factor, predicted an endemo-epidemic pattern of malaria infection. The incidences of parasitaemia were adequately modelled, using the observed NDVI as well as the NDVI simulations. The transmission pattern was modelled and observed values were adequately predicted. The error parameters showed the smallest values for a monthly model of environmental changes.
Conclusion
Remote-sensed data were coupled with field study data in order to drive a malaria transmission model. Several studies have shown that the NDVI presents significant correlations with climate variables, such as precipitation, particularly in Sudanese savannah environments. A nonlinear model combining environmental variables, predisposition factors and transmission pattern can be used for community level risk evaluation.
Background
Malaria kills between 1.1 and 2.7 million people per year, including almost one million children under the age of five years in sub-Saharan Africa [1,2]. The methods of control recommended by the WHO are based not only on chemical and physicochemical control and prophylaxis but also on environmental measures (e.g. draining of backwaters), targeted means of prevention and early detection of epidemics. The risk of Plasmodium falciparum infection is variable over space and time [3,4], and this variability is related to environmental and climatic changes [5]. The specific management of an environment favouring the proliferation of vectors (Anopheles) can significantly decrease transmission [6]. The choice of interventions and their relative importance are determined by the knowledge of environmental heterogeneity [3,4,6].
Climatic and environmental factors affect Anopheles production, survival, speed of reproduction and parasitic life cycle [7-17]. This relationship explains the distribution of P. falciparum. Rainfall and temperature play a major role, directly on Anopheles behaviour or indirectly on breeding sites. Vegetation is also an environmental factor depending on climatic evolutions, which influences the behaviour of the vector directly and indirectly [18]. In regions with alternate dry and rainy seasons, the transmission of malaria is seasonal, epidemic or endemo-epidemic. The principal parameters influenced by rainfall and temperature are aggressiveness (depending on Anopheles density and on the length of their gonotrophic cycle), contagiousness and Anopheles mortality. The variation is highly structured across geographic and temporal subpopulations. The high diversity during the rainy season, when transmission rate peaks, contrasts with the low diversity during the dry season, when both mosquito population size and malaria transmission rate are low.
Following the first descriptions of the parasite and its life cycle, mathematical models were designed by Ronald Ross (1909). These models not only brought a better understanding of the transmission, but also improved the first vectorial control strategies [19-21]. The differential equations of Ross were modified by MacDonald. Other authors introduced additional concepts such as multiple infection, immunity and co-infection [for example, see [22-24]]. In these historical models, parameters of transmission were constant, even though vectorial behaviour changes over time [9,23]. Despite the strong relationship between malaria risk and environmental factors [8,9,11], environmental effects have rarely been included in malaria transmission models, probably because of technical difficulties in obtaining environmental data from the field. Satellite imagery has been used to investigate covariates related to disease transmission, particularly NDVI (Normalized Difference Vegetation Index) [25-31]. Indeed, satellites from the NOAA series (National Oceanic and Atmospheric Administration) provide a vegetation survey at the climatic scale. These NOAA data have shown their usefulness in the monitoring of vegetation [32-39]. Furthermore, NOAA data are freely available, and provide good information on environmental field characteristics. The relationship between NDVI data and malaria incidence has been demonstrated, and thus, NDVI can be used as a proxy of climatic and environmental factors [18,28,29,40].
Incorporating remotely sensed information on the environment into a transmission model can improve the knowledge of the epidemiological pattern of malaria. A micro-epidemiological analysis, pivotal for testing control measures or individual risk factors and for forecasting the epidemiological pattern of malaria, has to integrate environmental factors.
The aim of this study was to provide a temporal model of malaria transmission, based on classical models and adapted to field data (Sudanese savannah area), with environmental dependency introduced by NDVI simulations.
Methods
Parasitological data
Data were obtained by a field study in the locality of Bancoumana, located in the Sudanese savannah zone of the Upper Niger valley (district of Kati) about 60 km south-west of Bamako, the capital of Mali (Figure 1). This locality covers an area of 2.5 km^{2} and has a population of 8,000 inhabitants. A dynamic cohort was constituted in June 1996 and followed up until June 2001. The study included 173 of the 340 households, selected at random from each of the four geographic blocks of the village, using stratified sampling. In each household, all children aged 0 to 12 years were followed up, constituting the dynamic cohort (for more information, see [5]). The surveys [22] were carried out at the rate of about one survey every two months during the rainy season and one every three months during the dry season. The intervals between surveys were defined on the basis of the previous knowledge of the seasonal transmission [41,42]. For each survey, a blood sample was taken and parasitaemia assessed. A trained team of biologists carried out microscopy to search for P. falciparum and its gametocytes in Giemsa-stained thick blood films. Biological diagnosis was subjected to quality control. Infection was defined as the presence of the parasite in the thick blood film. The time series of incidences of P. falciparum parasitaemia and gametocytaemia were analysed in the present work in order to provide a dynamical model of malaria transmission.
Figure 1. Maps of Mali showing NDVI means. The coloured scale shows the NDVI means estimated over the four trimesters, 1982–2006.
Community permission and individual informed consent were obtained according to the stepwise process described by Diallo et al. [43].
Remote sensing
For remotely sensed data, the 15-day composite NDVI provided by the GIMMS group (Global Inventory Monitoring and Modelling Studies) at NASA/GSFC (National Aeronautics and Space Administration/Goddard Space Flight Center) was used (Figure 1). NDVI was derived from channels 1 and 2 of the NOAA AVHRR (Advanced Very High Resolution Radiometer) satellite series 7, 9, 11, 14, 16 and 17. NDVI data were acquired over 25 years from July 1981 to December 2006. Data extracted for the pixel covering Bancoumana were used in the transmission model.
NDVI was calculated as the normalized difference of corrected reflectance of the NIR (near infrared, ranging from 0.725–1.10 μm) and visible (ranging from 0.58–0.68 μm) channels using AVHRR GAC (Global Area Coverage, 4 km resolution) data. The 15-day composites were generated by selecting the maximum value of NDVI, in order to minimize contamination by clouds. Spatial resolution was resampled to 8 km × 8 km pixels. The NDVI GIMMS data set was improved using the navigation procedure provided by El Saleous et al. [44] and the calibration of visible and NIR channels [45]. The solar zenith angle values from the AVHRR sensor were also corrected [46]. Effects of stratospheric aerosols due to the volcanic eruptions of El Chichon (1982) and Mount Pinatubo (1991), during April 82-December 84 and June 91-December 93, have been corrected using the method developed by Vermote et al. [47]. No correction has been applied for atmospheric effects due to water vapour, Rayleigh scattering or stratospheric ozone.
An additional quality control was applied to the NDVI data set to filter unrealistic values (i.e. values larger than 1 or smaller than -1). NDVI values retrieved from spline interpolation or average seasonal profile have been considered as missing data. For each fortnight, data were calculated using the maximum NDVI value of 15-day composites, in order to provide time series of vegetation characteristics in the locality of Bancoumana.
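The processing chain described above (normalized difference, range check, maximum-value compositing) can be illustrated with a minimal Python sketch. The function names and the reflectance values are hypothetical and are not taken from the GIMMS processing code; the sketch only shows the arithmetic under the assumption that NDVI = (NIR - VIS) / (NIR + VIS).

```python
from typing import List, Optional


def ndvi(nir: float, vis: float) -> float:
    """Normalized difference of NIR and visible reflectances."""
    return (nir - vis) / (nir + vis)


def quality_filter(value: float) -> Optional[float]:
    """Treat values outside [-1, 1] as missing (None)."""
    return value if -1.0 <= value <= 1.0 else None


def max_value_composite(values: List[Optional[float]]) -> Optional[float]:
    """Keep the largest valid NDVI of a compositing window.

    Clouds lower the apparent NDVI, so the maximum is the least
    contaminated observation of the window.
    """
    valid = [v for v in values if v is not None]
    return max(valid) if valid else None


# Hypothetical (NIR, visible) reflectances for one pixel, not study data.
reflectances = [(0.30, 0.10), (0.22, 0.18), (0.35, 0.08)]
daily_ndvi = [quality_filter(ndvi(nir, vis)) for nir, vis in reflectances]
composite = max_value_composite(daily_ndvi)
```

A window with no valid observation yields `None`, which mirrors the choice of treating unreliable values as missing rather than interpolating them.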
Statistical analyses
The statistical relationship between NDVI and incidence of P. falciparum infection was assessed by classical ARIMA time series analysis [48,49] after logarithmic transformation of the incidences. These established statistical models have been used to model time series, by breakdown into trend, cyclic and random components, and also to identify significant predictors [50]. Observed NDVI was introduced in the ARIMA analysis as a covariate and tested, and temporal delays were also analysed.
ROC (Receiver Operating Characteristic) analysis was used to determine an NDVI threshold predicting an increase in the parasitaemia incidence. The quality of this threshold was assessed by AUC test (Area Under the ROC Curve) and by the DOR (Diagnostic Odds Ratio) [see for example [51,52]]. Statistical analysis was performed using SPSS 15.0^{®} (SPSS Inc., Chicago, Ill., USA). A significance level of α = 0.05 was used for hypothesis tests.
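As an illustration of how a threshold is scored with the DOR, the following Python sketch classifies fortnights by comparing NDVI with a threshold and computes the odds ratio from the resulting 2 × 2 table. The data are invented for the example and the function names are hypothetical; the original analysis was run in SPSS, not with this code.

```python
from typing import Sequence, Tuple


def confusion_counts(ndvi_values: Sequence[float],
                     increased: Sequence[bool],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) when 'NDVI > threshold' predicts an increase."""
    tp = fp = fn = tn = 0
    for value, inc in zip(ndvi_values, increased):
        predicted = value > threshold
        if predicted and inc:
            tp += 1
        elif predicted:
            fp += 1
        elif inc:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """DOR = (TP * TN) / (FP * FN); undefined when FP or FN is zero."""
    return (tp * tn) / (fp * fn)


# Invented fortnightly data (not study data).
ndvi_values = [0.20, 0.30, 0.38, 0.45, 0.55, 0.33, 0.40, 0.25]
increased = [False, False, True, True, True, True, False, False]

counts = confusion_counts(ndvi_values, increased, threshold=0.361)
dor = diagnostic_odds_ratio(*counts)
```

Scanning candidate thresholds and keeping the one with the best DOR is one simple way to select a cut-off along the ROC curve; empty cells in the table would need a continuity correction, which this sketch omits.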
Malaria model
Malaria transmission was modelled using a deterministic approach. An SIRS-type model [19,23] was adapted to Bancoumana's data. The model was built on the MacDonald equations, specifying states for infected-not-contagious and contagious children (as in Bailey's model [20]) and adding a resistant state (as in Dutertre's model [23]). The first state S was defined as the proportion of susceptible children. The second state I represented the proportion of infected but not contagious children, i.e. children without gametocytaemia. The third state G represented the proportion of contagious children, i.e. children with gametocytaemia. Indeed, transmission needs two parasitic cycles, an asexual cycle in humans and a sexual cycle in Anopheles; the latter is made possible by gametocyte production in humans. The last state R represented the proportion of children "resistant" to infection, i.e. children were considered as resistant while the curative treatment remained effective (Figure 2). The transition from state S to state I depended on vectorial and climatic factors (i(t)), and children were considered to be without effective immunity. Demographic factors such as human natality and mortality have been neglected. Infected but not contagious children (state I) could become contagious with a parameter η_{1} (production of gametocytes) or resistant with a parameter γ (curative treatment). Contagious children (state G) could lose their contagiousness with a parameter η_{2}, or could become resistant (with the parameter γ). The parameter δ represented the inverse of the duration of the treatment effectiveness.
Figure 2. Malaria transmission model adapted to Bancoumana's data. S: susceptible state, I: infected not contagious state, G: contagious state, R: resistant state. A_{s}: susceptible state (Anopheles), A_{i}: contagious state (Anopheles).
The vectorial part of the cycle was modelled with a two-state model: the state of susceptible Anopheles (A_{s}) and the state of contagious Anopheles (A_{i}). The transition took place when susceptible Anopheles had a blood meal on contagious children (G), with a parameter i_{m}(t) depending on vectorial and climatic factors. Vectorial parameters were density (μ), length of the gonotrophic cycle (ν), contagiousness (β), aggressiveness (α), and mortality (ξ). Human contagiousness (ζ) has been added to the model. Model equations have been written as follows:
where VI(t) was the vegetation index (NDVI) and represented environmental factor modelling. Environmental factors influenced vector mortality ξ, length ν of the gonotrophic cycle, vectorial aggressiveness α, with a time lag θ.
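The human part of the compartmental structure described above can be sketched in Python as follows. This is a simplified illustration and not the authors' equations: it assumes that children leaving state G through η_{2} return to state I, it replaces the vector compartments by a hypothetical NDVI-driven force of infection, and all numerical values (rates, threshold, NDVI series, initial state) are invented for the example.

```python
from typing import List, Sequence, Tuple

State = Tuple[float, float, float, float]  # proportions (S, I, G, R)


def force_of_infection(vi: float, tau: float = 0.30,
                       base: float = 0.0, slope: float = 1.0) -> float:
    """Hypothetical NDVI dependence: constant below tau, rising above it."""
    return base if vi <= tau else base + slope * (vi - tau)


def euler_step(state: State, foi: float, eta1: float, eta2: float,
               gamma: float, delta: float, dt: float = 1.0) -> State:
    """One explicit Euler step of the S-I-G-R flows (time unit: one fortnight)."""
    s, i, g, r = state
    ds = delta * r - foi * s                      # loss of protection, infection
    di = foi * s + eta2 * g - (eta1 + gamma) * i  # assumed return G -> I via eta2
    dg = eta1 * i - (eta2 + gamma) * g            # gametocyte production
    dr = gamma * (i + g) - delta * r              # curative treatment
    return (s + dt * ds, i + dt * di, g + dt * dg, r + dt * dr)


def simulate(ndvi_series: Sequence[float], initial: State,
             lag: int = 1, **rates: float) -> List[State]:
    """Drive the model with NDVI delayed by `lag` time steps."""
    trajectory = [initial]
    for t in range(len(ndvi_series)):
        vi = ndvi_series[max(t - lag, 0)]
        trajectory.append(euler_step(trajectory[-1], force_of_infection(vi), **rates))
    return trajectory


# Invented rates and NDVI values, for illustration only.
RATES = dict(eta1=0.1, eta2=0.2, gamma=0.3, delta=0.2)
NDVI = [0.25, 0.28, 0.35, 0.45, 0.55, 0.60, 0.50, 0.40, 0.30, 0.25]
trajectory = simulate(NDVI, (0.25, 0.65, 0.05, 0.05), **RATES)
```

Because every outflow of one compartment is an inflow of another, the four proportions keep summing to 1, which is a convenient sanity check when natality and mortality are neglected.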
Parameter estimates were derived from a review of published works [53]. The parameter values were bounded within the range of published estimates and chosen to minimize the quality indexes (RMSE and MAPE, see later). Furthermore, these values were validated by senior entomologists and parasitologists. Initial conditions were estimated from observed data (June 1996) (Table 1). Anopheles mortality and NDVI values were related by a functional form modelling a slow decrease of mortality when NDVI increased. This relationship also provided a high constant mortality rate for the lowest values of NDVI (during the dry season), below a constant threshold (τ). The addition of 1 to the denominator avoided null values. χ represented the indicator function:
Table 1. Parameter estimations and initial conditions.
The transmission rates (i(t) and i_{m}(t)) were also related to NDVI values by a functional form modelling the increase of transmission when NDVI increased and a low constant transmission rate during the dry season.
The basic reproductive number z_{0} has been calculated from these equations:
Note that the basic reproductive number was null for low values of NDVI. No transmission could occur if climatic factors do not favour the normal behaviour of Anopheles.
Environmental model
Environmental factors were considered as an extrinsic variable of the Bancoumana model. Thus, these factors have been independently modelled. Among the environmental factors related to malaria, NDVI observations from 1981 to 2001 were analysed. The extrinsic NDVI variable VI(t) was simulated using a hidden Markov chain model. Observations from 2002 to 2006 served as external validation.
Hidden Markov models (HMM) were introduced by Baum and Petrie at the end of the 1960s [54,55]. This family of stochastic models has since been developed both theoretically (for example [56-58]) and in terms of applications, particularly in hydrology and climatology [59-61]. These methods make the assumption that the observed data are generated by an underlying finite mixture of distributions, itself organized in a Markov chain (Figure 3). Used for sequence analysis, they provide a classification model of sequence parts. Indeed, the hidden variable can be interpreted as a class of the observed variables.
Figure 3. Hidden Markov model. The blue squares represent the time sequence of hidden classes (states). The green circles represent the time sequence of observed data.
The hidden Markov model {(S_{k}, O_{k})} is constituted by a finite set of states S_{k}, k ∈ {1, ..., K}, each associated with a probability distribution. Discrete time transitions between these hidden states are governed by transition probabilities, and the resulting time sequence of states (S_{t}, t > 0) is a homogeneous Markov chain of order 1:
p(S_{t+1} = j/S_{t} = i, S_{t-1}, ..., S_{1}) = p(S_{t+1} = j/S_{t} = i)
At time t, for a given state S_{t} = k, an observation O_{t} = o is emitted following the probability distribution associated with this state, the emission probability p(O_{t} = o/S_{t} = k). Then, the sequence of observations (O_{t}, t > 0) is a sequence of random variables that are conditionally independent, given the sequence of hidden states.
Such a model is defined by:
• p(S_{1} = k)_{k ∈ {1, ..., K}}, initial probabilities (at time t = 1)
• p(S_{t+1} = j/S_{t} = i)_{(i, j) ∈ {1, ..., K}^{2}}, ∀ t, elements of the matrix of transition probabilities
• p(O_{t} = o/S_{t} = k)_{k ∈ {1, ..., K}}, emission probabilities
Following this approach, an HMM of NDVI was designed, where the hidden states represented the monthly evolution of climate and environment. An emission probability represented the probability that an NDVI value occurred at a time t, given the environment of a given month. A transition probability represented the probability of an environmental change. The EM algorithm was used to estimate the emission and transition probabilities, which were then used to simulate NDVI. The choice of the hidden states (one state representing each month, 2 months or one season) was guided by the quality indexes (RMSE and MAPE, see below).
Quality assessment and implementation
Quality of the predictions was assessed using the root mean squared error (RMSE) and the mean absolute percentage error (MAPE) defined as follows:
where h was the time lag of prediction, X̂_{t} the prediction at time t, and X_{t} the observed value at time t.
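The two indexes can be computed as in the short Python sketch below. It assumes the standard definitions, with h taken as the number of predicted time steps and MAPE expressed as a fraction rather than multiplied by 100; these conventions are assumptions of the example, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error over the h predicted time steps."""
    h = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / h)


def mape(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error (as a fraction); observed values must be non-zero."""
    h = len(observed)
    return sum(abs((p - o) / o) for o, p in zip(observed, predicted)) / h


# Invented values, for illustration only.
observed = [0.5, 0.5]
predicted = [0.6, 0.4]
example_rmse = rmse(observed, predicted)
example_mape = mape(observed, predicted)
```

RMSE is expressed in the units of the series, whereas MAPE is scale-free, which is why the two indexes can differ by orders of magnitude for the same model.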
The complete model was implemented using Matlab 7.0.4^{®} (Mathworks, Inc., Natick, Massachusetts, USA).
Results
Time series analysis
The seasonal pattern of P. falciparum incidence was significantly explained by NDVI (Figure 4), with a delay of 15 days (p = 0.001). The value of the adjusted R^{2} (R_{adj}^{2} = 89%) was relatively high, and the quality indexes were relatively low (RMSE = 0.04, MAPE = 5.61). Thus, the statistical model, using NDVI as covariate, showed a satisfactory goodness-of-fit. The known decrease in infection from year to year was significant (p = 0.001), but remained weak (0.109 after logarithmic transformation, standard deviation SD = 0.031).
Figure 4. Incidence of P. falciparum and NDVI. X-axis: time (fortnight); Y-axis: NDVI (left) or Incidence (right). The time-series model (modelling P. falciparum incidence by NDVI and a constant decrease in incidence) is presented in blue (bold). The bounds of the 95% confidence interval are indicated as dotted lines. The observed incidences are presented in red (bold) and NDVI values in green.
NDVI threshold
The NDVI values observed around Bancoumana were less than 0.34 during the dry season and the highest values (>0.52) were observed during the rainy season. The ROC analysis provided an NDVI threshold of 0.361. Beyond this threshold, the odds ratio of an increase in the parasitaemia incidence was significant, estimated at DOR = 2.64 (CI95% [1.26;5.52]).
The area under the ROC curve was 0.65 (CI95% [0.54;0.74]), significantly different from 0.5 (p = 0.007) (Figure 5).
Figure 5. ROC curve of NDVI for the prediction of an increase in parasitaemia incidence (blue). The green line represents the first bisector. The black cross represents the NDVI threshold with the best DOR (NDVI = 0.361; sensitivity of 67%; specificity of 56%).
NDVI simulations
The probabilities of changes in environmental characteristics (the transition probabilities from one month to another) were null if these 2 months were not contiguous. Indeed, environmental characteristics cannot change suddenly. Persistence of environment was also possible: environmental characteristics could persist from one month to the next, with a probability of 50%. An environmental change from one month to the next was also possible, with a probability of 50%. These transition probabilities were constant, whatever the month. These estimates reflected the seasonal nature of the phenomenon: persistence of environmental characteristics between 2 contiguous months or progressive changes.
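The transition structure described above (stay in the current month class or move to the next one, each with probability 0.5) can be sketched in Python as a cyclic 12-state hidden Markov chain. Only this transition structure comes from the text; the emission distributions below are invented placeholders, not the estimates shown in Figure 6, and the function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

N_MONTHS = 12
Emission = Tuple[List[float], List[float]]  # (NDVI bin centres, probabilities)


def transition_matrix(stay: float = 0.5) -> List[List[float]]:
    """Each month class either persists or moves to the following month."""
    matrix = [[0.0] * N_MONTHS for _ in range(N_MONTHS)]
    for month in range(N_MONTHS):
        matrix[month][month] = stay
        matrix[month][(month + 1) % N_MONTHS] = 1.0 - stay
    return matrix


def hypothetical_emissions() -> List[Emission]:
    """Made-up seasonal NDVI bins: low early in the year, high around September."""
    emissions = []
    for month in range(N_MONTHS):
        centre = 0.40 + 0.20 * math.cos(2 * math.pi * (month - 8) / 12)
        values = [round(centre - 0.05, 3), round(centre, 3), round(centre + 0.05, 3)]
        emissions.append((values, [0.25, 0.50, 0.25]))
    return emissions


def simulate(n_steps: int, emissions: List[Emission],
             rng: random.Random, start: int = 0) -> Tuple[List[int], List[float]]:
    """Draw a hidden state path and the NDVI values emitted along it."""
    matrix = transition_matrix()
    months = list(range(N_MONTHS))
    state = start
    states, observations = [], []
    for _ in range(n_steps):
        states.append(state)
        values, weights = emissions[state]
        observations.append(rng.choices(values, weights=weights, k=1)[0])
        state = rng.choices(months, weights=matrix[state], k=1)[0]
    return states, observations


states, simulated_ndvi = simulate(48, hypothetical_emissions(), random.Random(0))
```

With these probabilities the expected time spent in each hidden class is two time steps, so a fortnightly series advances by about one month class per month on average.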
Probabilities of observing specific values of NDVI, i.e. the emission probabilities estimated for each month (Figure 6), were not high and also reflected the seasonal changes in NDVI regimes. Indeed, the probabilities that high NDVI values occur in January, February or March were nil, and small NDVI values could occur with non-zero probabilities; for example, there was a 35.0% chance of observing an NDVI between 0.2 and 0.25 during March (cumulative probability). Conversely, the probabilities that high NDVI values occur were high during August and September; for example, there was about a 47.6% chance of observing an NDVI between 0.6 and 0.65 during September (cumulative probability).
Figure 6. Emission probabilities. The coloured scale shows the probability that a given NDVI (x-axis, ×1000) occurs for a given month (y-axis).
The choice of hidden classes reflecting the monthly scale of seasonal changes was guided by MAPE and RMSE (Table 2) between predictions and validation set values (2002–2006 NDVI).
Table 2. Choice of hidden classes. Mean absolute percentage error (MAPE) and root mean squared error (RMSE) by external validation (2002–2006 NDVI).
The external validation showed the smallest values of MAPE (0.178) and RMSE (59.63) for a monthly scale of seasonal changes. The model adequately predicted seasonal variations (Figure 7) and could then be used as the environmental factor for malaria modelling.
Figure 7. NDVI simulation. X-axis: number of fortnights; Y-axis: NDVI. The red line represents the prediction made by the hidden Markov model. The blue line represents the observed NDVI from 2002 to 2006 (external set). The coloured scale shows the three seasons, rainy (green), cool and dry (orange) and warm and dry (yellow).
Malaria model
The deterministic transmission model, with stochastic environmental factor, predicted an endemo-epidemic pattern of malaria infection. Indeed, incidences of parasitaemia fluctuated around 70 per 100 inhabitants per 15 days. The model provided a seasonal pattern of incidences, with low values for the dry seasons (about 65%) and high values for the rainy seasons (75%). These oscillations of predicted incidences were similar to observed values (Figures 8 and 9). Quality indexes showed the smallest values (MAPE = 0.07, RMSE = 0.01 for parasitaemia) for a monthly model of environmental changes (Table 3). Incidences of gametocytaemia fluctuated around 5 per 100 inhabitants per 15 days. Oscillations of predicted gametocytaemia incidences were less pronounced than those of observed incidences, but the indexes also showed low values (MAPE = 0.01, RMSE = 0.001 for gametocytaemia).
Table 3. Quality assessment of malaria transmission model for different hidden classes.
Figure 8. Observed parasitaemia and gametocytaemia incidences versus predicted values. X-axis: time (10 days). Y-axis: incidences. The red and orange lines represent the observed incidences of P. falciparum parasitaemia (PF) and gametocytaemia (GF), respectively, after removing the trend of the time series. The blue lines (dark and light blue) represent the predicted values of parasitaemia (IO) and gametocytaemia (GO) incidences, respectively, using observed NDVI. The green lines (dark and light green) represent the predicted values of parasitaemia (I) and gametocytaemia (G) incidences, respectively, using the HMM model of NDVI.
Figure 9. Prediction of malaria incidence. X-axis: time (15 days). Y-axis: incidences. The solid and dotted lines represent the predicted values using the HMM model and observed NDVI, respectively. I: predicted incidence of P. falciparum parasitaemia using HMM model. IO: predicted incidence of P. falciparum parasitaemia using observed NDVI. G: predicted incidence of P. falciparum gametocytaemia using HMM model. GO: predicted incidence of P. falciparum gametocytaemia using observed NDVI. S: predicted incidence of susceptible children using HMM model. SO: predicted incidence of susceptible children using observed NDVI. R: predicted incidence of resistant children using HMM model. RO: predicted incidence of resistant children using observed NDVI.
Incidences of parasitaemia and gametocytaemia were adequately modelled, using the observed NDVI as well as the HMM model of NDVI (Figures 8 and 9). Indeed, seasonal variations and mean value of incidences were similar using both NDVI data.
Discussion
In this study, a malaria transmission model was designed, using NDVI as a proxy of environmental factors, especially humidity conditions. The NDVI links the detected physical characteristics of plants with their functional status and allows their temporal evolution to be monitored. It helps to extract a strong signal related to vegetation and provides good contrast with other objects on the Earth's surface [62]. Several studies have shown that the NDVI presents significant correlations with climate variables such as precipitation and land surface temperatures [63-65], particularly in Sudanese savannah environments. Thus, NDVI can be used when climatic data as well as hydrological or environmental field characteristics are not easily available. The relationship between NDVI and malaria epidemiology is well known and is mostly due to the climatic dependency of vector behaviour [17,25-27,30,66,67]. Indeed, it has been suggested that the number of breeding sites and NDVI values increase with the soil moisture state, the latter being multifactorial [26,27]. Furthermore, the 15-day lag between NDVI and malaria incidence has also been reported in other studies [18,27,68]. The NDVI threshold deduced from this study is consistent with other publications where an NDVI between 0.35 and 0.4 is associated with an increased incidence [25,28].
Note that a clear relationship between NDVI and malaria has been shown in Sahelian or Sudanese savannah environments (such as Bancoumana's region), but not in other regions [41], characterized by an absence of seasonality or persistent moisture (for example rice fields, flood regions).
The use of observed NDVI allows adequate predictions of parasitaemia incidences. However, NDVI data are not always available. It is then necessary to use an adequate predictive model. The HMM model brings explanatory structures, such as seasonal classes represented by hidden classes. The stochasticity of the phenomenon is also modelled by HMM. In such an epidemiological model, stochastic events can lead to crossing a threshold and to an epidemiological amplification. Because of this stochastic nature of modelling (Figure 9), a temporary gap has been found between the model using observed NDVI data and the model using NDVI data simulated by HMM. Other models (including sinusoidal models [18,53]) do not take into account this stochastic nature of natural phenomena. Other stochastic models can be used, for example non-parametric regressions [53], but these models rarely allow an explanatory approach as well.
Based on historical models, the model designed here reflects the non-linearity of the epidemiological phenomenon (in contrast to other approaches [18,68]). This model respects the chronological order of appearance of gametocytes, which has not been the case with other historical models [19,20,23], but is a key point for malaria transmission. The proposed basic reproductive number has the same form as that of MacDonald. The values of RMSE and MAPE are relatively low, both for parasitaemia and for gametocytaemia.
As the field study included only children, the relative immunity was considered as inefficient here. In addition, since infected children were treated, they were considered as "resistant" for the duration of the effectiveness of treatment. The collection method did not change over the study period. Cases of malaria were confirmed biologically, and biological diagnosis was subjected to continuous quality control [5]. The observed decreasing trend (and even the trend estimated with the ARIMA statistical analysis) in the incidence of P. falciparum is not taken into account by the deterministic model. This trend has already been observed in other field studies on the same site [41,42]. It is unlikely that this trend was due to the natural evolution of malaria in this region. The NDVI values observed during that period exclude climate change. There have been no further developments in the village, neither in the number of people nor in known risk factors (breeding site control for example). Most probably, this decreasing trend in the incidence of P. falciparum was linked to the presence of the medical team in the village.
Conclusion
In this study, remote-sensed data were coupled with field study data in order to drive a malaria transmission model. In a micro-epidemiology context, NDVI provided useful variables, improving malaria transmission modelling. A nonlinear model combining environmental variables, predisposition factors and transmission evolution can be used for community level risk evaluation. Accumulating data [47,66] point to the need to integrate several control measures to enhance efficiency. Thus, control programmes, such as vector control, impregnated net use or early detection and treatment, should be tailored to environmental conditions.
List of abbreviations
ARIMA: Autoregressive Integrated Moving Average; AUC: Area Under the Curve; AVHRR: Advanced Very High Resolution Radiometer; DOR: Diagnostic Odds Ratio; GAC: Global Area Coverage; GIMMS: Global Inventory Monitoring and Modelling Studies; GPS: Global Positioning System; GSFC: Goddard Space Flight Center; HMM: Hidden Markov chain Model; MAPE: Mean Absolute Percentage Error; NASA: National Aeronautics and Space Administration; NDVI: Normalized Difference Vegetation Index; NIR: Near Infra Red; NOAA: National Oceanic and Atmospheric Administration; RMSE: Root Mean Squared Error; ROC: Receiver Operating Characteristic; WHO: World Health Organization.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
JG performed the statistical analysis and the mathematical model, drafted the manuscript and participated in the interpretation of data. OT performed the GPS/GIS data collection, the data computing and the validation in the field site of Bancoumana. He participated in the clinical and biological data collection. ND performed the NDVI extraction, participated in the interpretation of results and drafted the manuscript. AD participated in the clinical and biological data collection in the field site of Bancoumana. He participated in the GPS/GIS data collection, the data computing and the validation. SR participated in the GPS/GIS data collection and validation. LS participated in the mathematical model and correction of the manuscript. JD supervised the statistical analysis and the mathematical modelling. He participated in the result interpretation and corrected the manuscript. OKD, the PI of the Mali-Tulane TMRC, led the team who conceived and designed the studies, and supervised the field work. He participated in the community consent protocol, in data collection, data monitoring, QA/QC of the data, data analysis and correction of the manuscript. All authors read and approved the final manuscript.
Acknowledgements
The field study was funded by NIAID/NIH under the Mali-Tulane TMRC grant N0 AI 95002P50. We acknowledge the following co-workers for their efforts and contribution to the overall Mali-Tulane works at Bancoumana: Belco Poudiougou, Hamidou Coulibaly, Issaka Sagara, Mouctar Diallo, Sory Diawara, Amed Ouattara, Mahamadou Diakité, Yeya T Touré, Donald J Krogstad, Eric S Johnson, John Gerone, Ousmane Koita, Seydou Doumbia, Samba Diop, Moussa Konaré, Claire Brown, Mangara Bagayogo, Sekou F Traoré, Moussa Fané and all the MRTC/DEAP Parasitology and Entomology Teams.
This work was also supported by the ACCIES group http://www.cnrm.meteo.fr/accies/ funded by the GICC programme of the French Ministry of Ecology and we acknowledge Philippe Sabatier and Dominique Bicout of the ACCIES group.
We acknowledge the members of the ESPACE unit (US140), Remote Sensing Center/Maison de la Teledetection, Montpellier, France.
We also thank the population of Bancoumana for their full collaboration.