One Health. 2021 Dec;13:100358
Background: Mapping the spatial distribution of the dengue vector Aedes (Ae.) aegypti and accurately predicting its abundance are crucial for designing effective vector control strategies and early warning tools for dengue epidemic prevention. Socio-ecological and landscape factors influence Ae. aegypti abundance. Therefore, we aimed to map the spatial distribution of female adult Ae. aegypti and predict its abundance in northeastern Thailand from household socioeconomic factors, knowledge, attitude and practices (KAP) regarding climate change and dengue, and/or landscape factors, using a machine learning (ML)-based system.
Method: A total of 1066 female adult Ae. aegypti were collected from four villages in northeastern Thailand during January-December 2019. Information on household socioeconomics, KAP regarding climate change and dengue, and satellite-based landscape data were also acquired. Geographic information systems (GIS) were used to map the household-based spatial distribution of female adult Ae. aegypti abundance (high/low). Five popular supervised learning models, logistic regression (LR), support vector machine (SVM), k-nearest neighbor (kNN), artificial neural network (ANN), and random forest (RF), were used to predict female adult Ae. aegypti abundance (high/low). The predictive accuracy of each modeling technique was calculated and evaluated. Important variables for predicting female adult Ae. aegypti abundance were also identified using the best-fitted model.
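The evaluation step above can be illustrated with a minimal sketch of the three reported metrics (AUC, CA, and F1) for a binary high/low outcome. The labels and scores below are made-up toy values, not study data; the 0.5 decision threshold and the choice of the binary F1 for the "high" class are assumptions, since the abstract does not state how F1 was averaged or which software computed it.

```python
def classification_accuracy(y_true, y_pred):
    """Fraction of households whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive ("high") class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, y_score, positive=1):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = high abundance, 0 = low abundance.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]       # model-predicted probability of "high"
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

auc_value = auc(y_true, y_score)
ca_value = classification_accuracy(y_true, y_pred)
f1_value = f1_score(y_true, y_pred)
print(f"AUC={auc_value:.2f} CA={ca_value:.2f} F1={f1_value:.2f}")
```

AUC is computed from the predicted scores and is therefore threshold-free, whereas CA and F1 depend on the threshold used to turn scores into high/low classes.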
Results: Urban areas had a higher abundance of female adult Ae. aegypti than rural areas. Overall, study respondents in both urban and rural areas had inadequate KAP regarding climate change and dengue. The average landscape factors per household in urban areas were rice crop (47.4%), natural tree cover (17.8%), built-up area (13.2%), permanent wetlands (21.2%), and rubber plantation (0%), and the corresponding figures for rural areas were 12.1, 2.0, 38.7, 40.1 and 0.1%, respectively. Among all assessed models, RF showed the best prediction performance (socioeconomics: area under curve, AUC = 0.93, classification accuracy, CA = 0.86, F1 score = 0.85; KAP: AUC = 0.95, CA = 0.92, F1 = 0.90; landscape: AUC = 0.96, CA = 0.89, F1 = 0.87) for female adult Ae. aegypti abundance. Combining all factors further improved the predictive accuracy of the RF model (socioeconomics + KAP + landscape: AUC = 0.99, CA = 0.96 and F1 = 0.95). Dengue prevention practices were shown to be the most important predictor in the RF model for female adult Ae. aegypti abundance in northeastern Thailand.
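The abstract does not say how variable importance was derived from the RF model. One common, model-agnostic option is permutation importance: shuffle one predictor at a time and measure how much accuracy drops. The sketch below is an assumption-laden illustration of that idea only; the `predict` function, the feature names, and the data are hypothetical stand-ins, not the authors' model or data.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which the model's prediction equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: column 0 stands in for a "prevention practice" score
# that fully determines the label; column 1 is an unrelated variable.
feature_names = ["prevention_practice", "unrelated_variable"]
X = [[i % 2, (i * 7) % 5] for i in range(20)]
y = [row[0] for row in X]


def predict(row):
    """Stand-in for a fitted classifier; it only looks at column 0."""
    return row[0]


importances = permutation_importance(predict, X, y)
for name, value in zip(feature_names, importances):
    print(f"{name}: {value:.3f}")
```

A predictor the model never uses gets an importance of exactly zero, while shuffling a predictor the model relies on lowers accuracy, which is the sense in which one variable can be called the most important predictor.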
Conclusion: Among the models assessed, the RF model was the most suitable for predicting Ae. aegypti abundance in northeastern Thailand. Our study shows that GIS and machine learning systems have significant potential for understanding the spatial distribution of dengue vectors and predicting their abundance. The study findings might help optimize vector control and mosquito suppression strategies, as well as the prediction and control of epidemic arboviral diseases (dengue, chikungunya, and Zika). Such strategies can be incorporated into transdisciplinary One Health approaches that consider human-vector and agro-environmental interrelationships.
Keywords: ANN, Artificial neural network; AUC, Area under curve; Aedes aegypti; CA, Classification accuracy; DENV, Dengue virus; Dengue; Early warning; GIS, Geographic information systems; HCI, Household crowding index; KAP, Knowledge, attitude, and practice; LR, Logistic regression; ML, Machine learning; PCI, Premise condition index; Prediction; RF, Random forest; SES, Socioeconomic status; SVM, Support vector machine; Supervised learning; kNN, k-nearest neighbor