In: Bulletin of the World Health Organization: the international journal of public health = Bulletin de l'Organisation Mondiale de la Santé, Volume 100, Issue 9, pp. 562-569
Traditional sample designs for household surveys are contingent upon the availability of a representative primary sampling frame. This is defined using enumeration units and population counts retrieved from decennial national censuses that can become rapidly inaccurate in highly dynamic demographic settings. To tackle the need for representative sampling frames, we propose an original grid-based sample design framework introducing essential concepts of spatial sampling in household surveys. In this framework, the sampling frame is defined based on gridded population estimates and formalized as a bi-dimensional random field, characterized by spatial trends, spatial autocorrelation, and stratification. The sampling design reflects the characteristics of the random field by combining contextual stratification and proportional to population size sampling. A nonparametric estimator is applied to evaluate the sampling design and inform sample size estimation. We demonstrate an application of the proposed framework through a case study developed in two provinces located in the western part of the Democratic Republic of the Congo. We define a sampling frame consisting of settled cells with associated population estimates. We then perform a contextual stratification by applying a principal component analysis (PCA) and k-means clustering to a set of gridded geospatial covariates, and sample settled cells proportionally to population size. Lastly, we evaluate the sampling design by contrasting the empirical cumulative distribution function for the entire population of interest and its weighted counterpart across different sample sizes and identify an adequate sample size using the Kolmogorov-Smirnov distance between the two functions. The results of the case study underscore the strengths and limitations of the proposed grid-based sample design framework and foster further research into the application of spatial sampling concepts in household surveys.
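The evaluation step described above (sampling settled cells proportionally to population size, then comparing the full-frame empirical cumulative distribution function with its weighted sample counterpart via the Kolmogorov-Smirnov distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame is synthetic, the cell-level variable `y` is hypothetical, sampling is done with replacement, the weights are taken as inverse selection probabilities, and the PCA and k-means stratification is omitted.

```python
import random
from bisect import bisect_right


def pps_sample(pop_sizes, n, rng):
    """Draw n cell indices with replacement, with probability proportional to population."""
    return rng.choices(range(len(pop_sizes)), weights=pop_sizes, k=n)


def weighted_ecdf(values, weights):
    """Return the sorted values and the cumulative share of weight at each value."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    xs, cum, running = [], [], 0.0
    for value, weight in pairs:
        running += weight
        xs.append(value)
        cum.append(running / total)
    return xs, cum


def ecdf_at(xs, cum, x):
    """Evaluate a right-continuous step ECDF at x."""
    i = bisect_right(xs, x)
    return cum[i - 1] if i > 0 else 0.0


def ks_distance(ecdf_a, ecdf_b):
    """Largest absolute gap between two step ECDFs, checked at every jump point."""
    points = set(ecdf_a[0]) | set(ecdf_b[0])
    return max(abs(ecdf_at(*ecdf_a, x) - ecdf_at(*ecdf_b, x)) for x in points)


def ks_by_sample_size(sample_sizes, n_cells=2000, seed=0):
    """Return {sample size: KS distance} for a synthetic (assumed) sampling frame."""
    rng = random.Random(seed)
    # Synthetic frame: population per settled cell and a hypothetical cell-level variable.
    pop = [rng.randint(50, 500) for _ in range(n_cells)]
    y = [p * 0.01 + rng.gauss(0, 1) for p in pop]
    full = weighted_ecdf(y, [1.0] * n_cells)  # every cell counts equally in the frame
    result = {}
    for n in sample_sizes:
        idx = pps_sample(pop, n, rng)
        # Assumption: weight each draw by the inverse of its selection probability (1/pop).
        sample = weighted_ecdf([y[i] for i in idx], [1.0 / pop[i] for i in idx])
        result[n] = ks_distance(full, sample)
    return result
```

Calling `ks_by_sample_size([50, 200, 1000, 5000])` returns one distance per sample size; an adequate size could then be read off as the smallest one whose distance falls below a tolerance chosen by the analyst.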
Population counts, the denominator of many statistical indicators, are crucial to a country's public policy. National statistical institutes are responsible for collecting them, most often through a census. What happens when part of the territory is inaccessible to enumerators? Today, spatial data such as those extracted from satellite imagery provide complete, high-resolution geographic information which, combined with a partial population count, offers an unprecedented opportunity to estimate the population of the missing areas. Their spatial precision also makes possible high-resolution gridded population estimates, an innovative data format at the intersection of geography and demography. Using the case of Burkina Faso, this article analyses how dividing the country into 100 m by 100 m grid cells makes it possible, first, to develop a model that estimates, through a hierarchical Bayesian approach, the population of areas affected by security problems that could not be enumerated during the most recent census in 2019. Second, this gridding makes it possible to disaggregate the resulting totals with a statistical learning model, yielding population estimates of unmatched spatial precision.
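The disaggregation step can be illustrated with a small sketch that splits an area total across grid cells in proportion to per-cell weights. The weights here are hypothetical placeholders: in the article they would come from the statistical learning model, whose details are not given in this abstract, so the code shows only the proportional allocation, not the model itself.

```python
def disaggregate(total, cell_weights):
    """Split an area total across grid cells in proportion to non-negative weights."""
    weight_sum = sum(cell_weights)
    if weight_sum <= 0:
        raise ValueError("at least one cell weight must be positive")
    return [total * w / weight_sum for w in cell_weights]


# Hypothetical example: 1200 people spread over four cells, one of them unsettled.
cells = disaggregate(1200.0, [0.0, 1.0, 3.0, 2.0])
```

By construction the cell values sum back to the area total, so the gridded estimates stay consistent with the estimated totals they were derived from.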
Urban settlements and urbanised populations continue to grow rapidly and much of this transition is occurring in less developed countries. Remote sensing techniques are now often applied to monitor urbanisation and changes in settlement patterns. In particular, increasing availability of very high resolution imagery (<1 m spatial resolution) and computing power is enabling complete sets of settlement data in the form of building footprints to be extracted from imagery. These settlement data provide information on the changes occurring in cities, particularly in countries which may lack other data on urbanisation. While spatially detailed, extracted building footprints typically lack other information that identifies building types or can be used to differentiate intra-urban land uses or neighbourhood types. This work demonstrates an approach to classifying settlement types through multi-scale spatial patterns of urban morphology visible in building footprint data extracted from very high resolution imagery. The work uses a Gaussian mixture modelling approach to select and hierarchically merge components into clusters. The results are maps classifying settlement types on a high spatial resolution (100 m) grid. The approach is applied in Kaduna, Nigeria; Kinshasa, Democratic Republic of the Congo; and Maputo, Mozambique, and demonstrates the potential of computational methods to take advantage of large spatial datasets and extract meaningful information to support monitoring of urban areas. The model-based approach produces a hierarchy of potential clustering solutions, and we suggest that this can be used in partnership with local knowledge of the context when creating settlement typologies.
Population estimates are critical for government services, development projects, and public health campaigns. Such data are typically obtained through a national population and housing census. However, population estimates can quickly become inaccurate in localized areas, particularly where migration or displacement has occurred. Some conflict-affected and resource-poor countries have not conducted a census in over 10 years. We developed a hierarchical Bayesian model to estimate population numbers in small areas based on enumeration data from sample areas and nationwide information about administrative boundaries, building locations, settlement types, and other factors related to population density. We demonstrated this model by estimating population sizes in every 100 m grid cell in Nigeria with national coverage. These gridded population estimates and areal population totals derived from them are accompanied by estimates of uncertainty based on Bayesian posterior probabilities. The model had an overall error rate of 67 people per hectare (mean of absolute residuals) or 43% (using scaled residuals) for predictions in out-of-sample survey areas (approximately 3 ha each), with increased precision expected for aggregated population totals in larger areas. This statistical approach represents a significant step toward estimating populations at high resolution with national coverage in the absence of a complete and recent census, while also providing reliable estimates of uncertainty to support informed decision making.
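The two error summaries quoted in this abstract (a mean of absolute residuals and a percentage based on scaled residuals) can be computed along the following lines. This is a sketch under an explicit assumption: the abstract does not define the scaling, so residuals are divided here by the predicted value purely for illustration, and the input numbers are made up.

```python
from statistics import mean


def residual_summary(observed, predicted):
    """Return (mean absolute residual, mean absolute scaled residual)."""
    residuals = [p - o for o, p in zip(observed, predicted)]
    mean_abs = mean(abs(r) for r in residuals)
    # Assumption: each residual is scaled by its predicted value; the source
    # does not state which denominator it uses.
    mean_abs_scaled = mean(abs(r) / p for r, p in zip(residuals, predicted))
    return mean_abs, mean_abs_scaled


# Hypothetical out-of-sample densities (people per hectare) for two survey areas.
mae, scaled = residual_summary(observed=[100, 40], predicted=[80, 50])
```

The first value is in the units of the data (people per hectare here), while the second is unitless and can be reported as a percentage.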
Utilising satellite images for planning and development is becoming a common practice as computational power and machine learning capabilities expand. In this paper, we explore the use of building footprint data derived from satellite images to classify the residential status of urban buildings in low- and middle-income countries. A recently developed ensemble machine learning building classification model is applied for the first time to the Democratic Republic of the Congo and to Nigeria. The model is informed by building footprint and label data of greater completeness and attribute consistency than have previously been available for these countries. A GIS workflow is described that semiautomates the preparation of data for input to the model. The workflow is designed to be particularly useful to those who apply the model to additional countries and use input data from diverse sources. Results show that the ensemble model correctly classifies between 85% and 93% of structures as residential or nonresidential across both countries. The classification outputs are likely to be valuable in the modelling of human population distributions, as well as in a range of related applications such as urban planning, resource allocation, and service delivery.
Background: surveillance is a core component of an effective system to support malaria elimination. Poor surveillance data will prevent countries from monitoring progress towards elimination and targeting interventions to the last remaining at-risk places. An evaluation of the performance of surveillance systems in 16 countries was conducted to identify key gaps which could be addressed to build effective systems for malaria elimination. Methods: a standardized surveillance system landscaping was conducted between 2015 and 2017 in collaboration with governmental malaria programmes. Malaria surveillance guidelines from the World Health Organization and other technical bodies were used to identify the characteristics of an optimal surveillance system, against which systems of study countries were compared. Data collection was conducted through review of existing material and datasets, and interviews with key stakeholders, and the outcomes were summarized descriptively. Additionally, the cumulative fraction of incident infections reported through surveillance systems was estimated using surveillance data, government records, survey data, and other scientific sources. Results: the landscaping identified common gaps across countries related to the lack of surveillance coverage in remote communities or in the private sector, the lack of adequate health information architecture to capture high quality case-based data, poor integration of data from other sources such as intervention information, poor visualization of generated information, and its lack of availability for making programmatic decisions. The median percentage of symptomatic cases captured by the surveillance systems in the 16 countries was estimated to be 37%, mostly driven by the lack of treatment-seeking in the public health sector (64%) or, in countries with large private sectors, the lack of integration of this sector within the surveillance system. 
Conclusions: the landscaping analysis undertaken provides a clear framework through which to identify multiple gaps in current malaria surveillance systems. While perfect systems are not required to eliminate malaria, closing the gaps identified will allow countries to deploy resources more efficiently, track progress, and accelerate towards malaria elimination. Since the landscaping undertaken here, several countries have addressed some of the identified gaps by improving coverage of surveillance, integrating case data with other information, and strengthening visualization and use of data.
Pandemics such as COVID-19 and their induced lockdowns/travel restrictions have a significant impact on people's lives, especially for lower-income groups who lack savings and rely heavily on mobility to fulfill their daily needs. Taking the COVID-19 pandemic as an example, this study analysed the risk of returning to poverty for low-income households in Hubei Province in China as a result of the COVID-19 lockdown. Employing a dataset including information on 78,931 government-identified poor households, three scenarios were analysed in an attempt to identify who is at high risk of returning to poverty, where they are located, and how the various risk factors influence their potential return to poverty. The results showed that the percentage of households at high risk of returning to poverty (falling below the poverty line) increased from 5.6% to 22% due to a 3-month lockdown. This vulnerable group tended to have a single source of income, shorter working hours, and more family members. Towns at high risk (more than 2% of households returning to poverty) doubled (from 27.3% to 46.9%) and were mainly located near railway stations; an average decrease of 10–50 km in the distance to the nearest railway station increased the risk from 1.8% to 9%. These findings, which were supported by the representativeness of the sample and a variety of robustness tests, provide new information for policymakers tasked with protecting vulnerable groups at high risk of returning to poverty and alleviating the significant socio-economic consequences of future pandemics.
Travel and physical distancing interventions have been implemented across the world to mitigate the COVID-19 pandemic, but studies are needed to quantify the effectiveness of these measures across regions and time. Timely population mobility data were obtained to measure travel and contact reductions in 135 countries or territories. During the 10 weeks from March 22 to May 30, 2020, domestic travel in the study regions fell to a median of 59% (interquartile range [IQR] 43% - 73%) of the normal levels seen before the outbreak, and international travel fell to 26% (IQR 12% - 35%). If these travel and physical distancing interventions had not been deployed across the world, the cumulative number of cases might have shown a 97-fold (IQR 79 - 116) increase as of May 31, 2020. However, effectiveness differed by the duration and intensity of interventions and relaxation scenarios, with variations in case severity seen across populations, regions, and seasons.
Competing interest statement: The authors have declared no competing interest.
Funding statement: This study was supported by grants from the Bill & Melinda Gates Foundation (OPP1134076) and the European Union Horizon 2020 (MOOD 874850). N.R. is supported by funding from the Bill & Melinda Gates Foundation (OPP1170969). O.P. is supported by the National Science Foundation (1816075). A.J.T. is supported by funding from the Bill & Melinda Gates Foundation (OPP1106427, OPP1032350, OPP1134076, OPP1094793), the Clinton Health Access Initiative, the UK Department for International Development (DFID) and the Wellcome Trust (106866/Z/15/Z, 204613/Z/16/Z).
Ethical clearance for collecting and using secondary population mobility data was granted by the institutional review board of the University of Southampton (No. 48002). All data were supplied and analyzed in an anonymous format, without access to personal identifying information. Code for the model simulations is available at the following GitHub repository: https://github.com/wpgp/BEARmod. The data on COVID-19 cases and interventions reported by country are available from the data sources listed in Supplementary Materials. The parameters and population data for running simulations and estimating the severity are listed in Supplementary Data S1 to S2. The population movement data obtained from Baidu are available at: https://qianxi.baidu.com/. The Google COVID-19 Aggregated Mobility Research Dataset used for this study is available with permission of Google, LLC.