key: cord-0309095-lrj31hlq authors: Chandra, H.; Guha, S.; Desai, M.; Pyne, S. title: Small Area Estimation of Food Insecurity in the Eastern Indo-Gangetic Plain date: 2021-06-06 journal: nan DOI: 10.1101/2021.06.03.21258287 sha: 7d6181273b882554fbe3fe966684987e6c9ecb19 doc_id: 309095 cord_uid: lrj31hlq Achieving food security for all citizens is an important policy issue in India. While the existing data based on socio-economic surveys provide accurate estimates of food insecurity indicators at state and national level, due to small sample sizes, the surveys cannot be used directly to produce reliable estimates at the district or lower administrative levels. The availability of reliable and representative disaggregated measures of food insecurity is necessary for effective policy planning and monitoring, as food insecurity is often distributed unevenly within relatively small areas. This article explores a small area estimation (SAE) approach to derive reliable and representative estimates of food insecurity prevalence (FIP), gap (FIG), and severity (FIS) among people in different districts of the rural areas of the Eastern Indo-Gangetic Plain (EIGP) region by linking the latest round of available data from the Household Consumer Expenditure Survey collected by the National Sample Survey Office of India as well as the latest available Indian Population Census data. District-specific food insecurity indicators such as FIP, FIG, and FIS were estimated based on a recommended threshold of per capita caloric intake of 2400 kilocalories per day, as defined by the Ministry of Health and Family Welfare, Government of India. Spatial maps showing district-level inequality in the distribution of the indicators of food insecurity among the population in the EIGP region are also produced. 
Our disaggregated estimates can provide district-specific insights into food insecurity to policy analysts and decision-makers, and could thereby prove useful and relevant to the U.N. Sustainable Development Goal Indicator 2.1.2.

[...] population 12 and estimates of the past data show that changes in monsoon characteristics led to a decrease in rice yields in India of 1.7% during 1960-2002. 9 For every 1-degree Celsius increase in temperature, a loss of 3.7%-14.5% in India's wheat yields was estimated. For rice, such estimates from multiple methods predict an even larger temperature impact, with an average reduction of 6.6 ± 3.8% per degree Celsius. 13

To study the complex interplay among socioeconomic conditions, agro-ecology, and climate change, as well as their combined effects on the food security of a population, few regions are as crucially important as India's "breadbasket", the Indo-Gangetic plain. The region comprises a 2.5 million km² fertile plain that encompasses the northern regions of the Indian subcontinent. In particular, the Eastern Indo-Gangetic Plains (EIGP) region includes the states of Uttar Pradesh (UP), Bihar, and West Bengal (WB) in India, as well as parts of Nepal and Bangladesh. In India, the EIGP region covers 39.27 million hectares and is home to 395.19 million people, i.e., 32.64% of India's total population (2011 census). It is among the most densely populated (700-1200 persons/km²) regions in the world, and has high socioeconomic vulnerability. 14 EIGP is characterized by fertile soils with ample monsoon rainfall, a continuous supply of surface and groundwater, and a largely favorable climate that supports a predominantly rice-wheat cropping pattern. 15 While UP and Bihar contribute 32% and 5.76% of the country's total wheat production respectively, WB, Bihar, and UP contribute respectively 13.26%, 7%, and 11.75% to its total rice production.
16 However, the food security of EIGP is potentially vulnerable to adverse effects of both anthropogenic and environmental factors. In 2012, the percentage of the population living below the national poverty line in UP, Bihar, and WB was 29.43, 33.74, and 19.98 respectively. 17 The rates of unemployment and seasonal out-migration are relatively high among its rural populations, while land holdings are generally small. Environmental concerns of EIGP include rising temperatures, high inter-annual variability of precipitation, and frequent occurrence of adverse climatic events such as droughts and increasing cyclonic activity. 18 Annual buildups of atmospheric pollutants in intensively farmed areas may have resulted in relative yield changes of -15% or greater in this region between 2006 and 2010. 19 The Ganga basin has severe groundwater contamination with arsenic, which enters the food chain, especially through cultivation of rice. 20 In fact, rice production has been observed to contribute disproportionately to resource use, greenhouse gases, and climate sensitivity relative to its share of monsoon cereal calorie production in India. 21 Studies have noted the importance of addressing the impacts of climate change through solutions such as diet and crop diversification, improved farming technology, and attention to issues of governance. 3, 21, 22

In the presence of diverse agro-ecological and biophysical conditions, availability of precise and timely disaggregate level statistics is essential for developing focused and target-oriented policies to ensure food security in EIGP. In developing countries, however, the scarcity of reliable quantitative data represents a major challenge to policy-makers and researchers. 23 Such data, even when they exist, are often reported only at regional or state levels, and may be poorly correlated with local surveys.
24, 25 In India, Household Consumer Expenditure Survey (HCES) data collected by the National Sample Survey Office (NSSO), Ministry of Statistics and Program Implementation, Government of India (GoI), are used to generate estimates of food insecurity parameters at the state and national level for the rural and urban sectors separately. However, the state and national level estimates available from this survey may not reveal the existing local disparities. In particular, estimates of food insecurity indicators are not available at the level of lower administrative units (e.g., districts).

Food security presents a complex systemic challenge to policy planners, researchers, and government and public agencies. To understand it closely, one could gain key insights from statistical summaries for smaller domains of interest (or "small areas") that are obtained by cross-classifying demographic and geographic variables, such as small geographic areas (e.g., districts), small demographic groups (by gender, social group, etc.), or both. However, in the existing large-scale survey data, the sample sizes of such small areas may be very small or even zero. The small area estimation (SAE) methodology provides a viable and efficient solution to this problem of small sample sizes. 26 While SAE was recently used to obtain precise food insecurity indicator estimates for districts of Bangladesh, 27 no district-specific estimates exist for the much larger, more populous and diverse EIGP region in India.

In this study, we used SAE methodology to compute precise and representative district level estimates of the food insecurity prevalence, gap and severity among the people in rural areas of the EIGP region covering the Indian states of UP, Bihar, and WB. Section 2 describes the data used for SAE as well as the study variables and model specifications. We also briefly describe the SAE methodology used in this study.
The empirical results and maps are used to describe district-level inequalities in the distribution of food insecurity in Section 3. Finally, we end with concluding remarks in Section 4.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version posted June 6, 2021. https://doi.org/10.1101/2021.06.03.21258287 doi: medRxiv preprint. All rights reserved. No reuse allowed without permission.

[...] Population Census as potential covariates.

Let $E_{ij}$ denote the variable of interest for unit $j$ ($j = 1, \ldots, N_i$) in area $i$ (for example, per capita calorie intake of person $j$ in district $i$). The quantity of interest in area (or district) $i$ is the food insecurity indicator $F_{\alpha i}$, defined as

$$F_{\alpha i} = N_i^{-1} \sum_{j=1}^{N_i} \left( \frac{t - E_{ij}}{t} \right)^{\alpha} I(E_{ij} < t),$$

where $t$ is the food insecurity threshold, $I(\cdot)$ is the indicator function, and $\alpha = 0, 1, 2$ gives the prevalence (FIP), gap (FIG), and severity (FIS) respectively.

Based on the level of auxiliary information available, the models used in SAE are categorized as area level or unit level. Area-level modelling is typically used when unit-level data are unavailable or, as is often the case, where model covariates or auxiliary variables are only available in aggregate form. Here, we assume that the auxiliary variables are accessible at the aggregate level, so this article focuses on area level small area modeling. In this context, the Fay-Herriot model 30 is a widely used area level model in SAE that assumes area-specific survey estimates are available, and that these follow an area level linear mixed model with area random effects.
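The area-level model introduced above can be illustrated with a short numerical sketch. This is a minimal example under stated assumptions, not the authors' implementation: the function name `fay_herriot_eblup` is hypothetical, the sampling variances `psi` are treated as known, and the random-effect variance is estimated with a simple Prasad-Rao-type moment estimator, chosen here only for brevity.

```python
import numpy as np


def fay_herriot_eblup(theta_hat, psi, X):
    """Illustrative Fay-Herriot EBLUP (hypothetical helper, not the paper's code).

    theta_hat : direct survey estimates, shape (m,)
    psi       : sampling variances of the direct estimates, assumed known, shape (m,)
    X         : area-level covariates including an intercept column, shape (m, k)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    psi = np.asarray(psi, dtype=float)
    X = np.asarray(X, dtype=float)
    m, k = X.shape

    # Moment estimator of the random-effect variance from OLS residuals
    # (an assumption of this sketch; REML or ML could be used instead).
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_ols = xtx_inv @ X.T @ theta_hat
    resid = theta_hat - X @ beta_ols
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    sigma_u2 = max(0.0, (np.sum(resid**2) - np.sum(psi * (1.0 - leverage))) / (m - k))

    # Weighted least squares estimate of the regression coefficients.
    w = 1.0 / (sigma_u2 + psi)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * theta_hat))

    # Shrinkage factor: close to 1 when the direct estimate is precise.
    gamma = sigma_u2 / (sigma_u2 + psi)
    eblup = gamma * theta_hat + (1.0 - gamma) * (X @ beta)
    return eblup, gamma, beta, sigma_u2
```

Each predicted value is a weighted average of the direct estimate and the regression-synthetic estimate, so districts with large sampling variance are pulled more strongly toward the regression fit.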
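The SAE methods based on linear mixed models for continuous data can produce inefficient and sometimes invalid estimates when the variable of interest is binary. If the variable of interest is binary in nature and the target of inference is a small area proportion, a generalized linear mixed model with a logit link is more appropriate.

Let $\hat{\theta}_i$ denote the direct survey estimate of the parameter $\theta_i$ for area $i$ ($i = 1, \ldots, m$), with sampling variance $\psi_i$, and let $x_i$ denote the k-vector of area level auxiliary variables. With this, the simple district (or area) specific two stage model suggested by Fay and Herriot (1979) 30 is described as

$$\hat{\theta}_i = \theta_i + e_i, \qquad \theta_i = x_i^{\top} \beta + u_i.$$

Alternatively, we can express this model as

$$\hat{\theta}_i = x_i^{\top} \beta + u_i + e_i, \qquad i = 1, \ldots, m. \tag{1}$$

Here $\beta$ is a k-vector of unknown fixed effect parameters, $u_i$ is the area specific random effect with mean zero and variance $\sigma_u^2$, and $e_i$ is the sampling error with mean zero and known variance $\psi_i$. The two errors are assumed to be independent. Then under (1), the empirical best linear unbiased predictor (EBLUP) estimate of $\theta_i$ is

$$\hat{\theta}_i^{EBLUP} = \hat{\gamma}_i \hat{\theta}_i + (1 - \hat{\gamma}_i) x_i^{\top} \hat{\beta}. \tag{2}$$

Here, $\hat{\gamma}_i = \hat{\sigma}_u^2 / (\psi_i + \hat{\sigma}_u^2)$ defines the shrinkage effect for area $i$. The mean squared error (MSE) estimation of the EBLUP follows Rao and Molina (2015). 26

The direct estimates of proportions (e.g., FIP) and the area level auxiliary variables $x_i$ can also be modelled by the Fay-Herriot model (1). For proportions, however, we use the area level logistic linear mixed model

$$y_i \mid u_i \sim \mathrm{Bin}(n_i, p_i), \qquad \mathrm{logit}(p_i) = x_i^{\top} \beta + u_i. \tag{3}$$

Here $n_i$ and $y_i$ are the sample size and the sample count in area $i$, $p_i$ is the area proportion, and $u_i$ is the area specific random effect with variance $\sigma_u^2$. We use an iterative procedure that combines the Penalized Quasi-Likelihood estimation of $\beta$ and $u_i$ with restricted maximum likelihood estimation of $\sigma_u^2$. The empirical plug-in predictor (EPP) of $p_i$ is then

$$\hat{p}_i^{EPP} = \frac{\exp(x_i^{\top} \hat{\beta} + \hat{u}_i)}{1 + \exp(x_i^{\top} \hat{\beta} + \hat{u}_i)}. \tag{4}$$

The MSE estimation of the EPP is adopted from Chandra et al. (2011). 29 The model (3) is based on unweighted sample counts, and hence it assumes that sampling within districts is non-informative given the values of the auxiliary variables and the random district specific effects. To incorporate the sampling design, the survey weighted probability estimate for a district is modelled as a binomial proportion, with an "effective sample size" that equates the resulting binomial variance to the actual sampling variance of the survey weighted direct estimate for the district.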
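The effective sample size idea described above amounts to solving p(1 - p)/n = v for n, where p is the survey weighted direct estimate and v is its sampling variance. The sketch below shows that arithmetic; the function name `effective_sample` and the example numbers are illustrative assumptions, not values from the survey.

```python
def effective_sample(p_direct, var_direct):
    """Return (effective sample size, effective sample count) for one district.

    The effective size n_eff is chosen so that the binomial variance
    p(1 - p) / n_eff equals the design-based variance of the direct estimate.
    """
    if not 0.0 < p_direct < 1.0:
        raise ValueError("p_direct must lie strictly between 0 and 1")
    if var_direct <= 0.0:
        raise ValueError("var_direct must be positive")
    n_eff = p_direct * (1.0 - p_direct) / var_direct
    y_eff = n_eff * p_direct  # effective count replaces the observed count
    return n_eff, y_eff


# Illustrative values only: a direct estimate of 0.40 with sampling
# variance 0.0048 corresponds to an effective sample size of about 50.
n_eff, y_eff = effective_sample(0.40, 0.0048)
```

When the design effect exceeds one, the effective sample size is smaller than the actual sample size, which is how the binomial model is made to reflect the sampling design.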
In particular, in model (3) the "actual sample size" and the "actual sample count" have been replaced with the "effective sample size" and the "effective sample count" respectively to incorporate the sampling information. 33

For the 95% confidence interval (CI) diagnostic, we compare the width of the interval for the direct estimates with that for the model-based estimates. 34 For more precise estimates, we expect the confidence interval to be narrower. In addition, we consider the coverage diagnostic to assess the validity of the confidence intervals generated by the model-based SAE methods. The 95% CIs for the direct estimates should contain the "truth" approximately 95% of the time. This should also hold for the CIs surrounding the model-based estimates. We adjust both sets of intervals so that their chance of overlapping is 95%, and count how often they actually do overlap. Assuming that the estimated coverage of the direct CIs is correct, comparing the counts to the binomial distribution provides a non-parametric significance test of the bias of the model-based estimates.

This section describes the district-wise estimates of food insecurity indicators (FIP, FIG and FIS) generated by our SAE methods. In particular, the EPP (4) [...] fitted to data. The latter diagnostics are used to provide an indication of the validity and reliability of the small area estimates. In small area models (1) and (3), the random area specific effects are assumed to have a normal distribution with mean zero and fixed variance $\sigma_u^2$.
If the model assumptions are satisfied, then the area or district level residuals are expected to be randomly distributed around zero. Histograms and q-q plots are also used to inspect the normality assumption. Figure 1 [...]

[Table 3 about here]

We computed the coefficient of variation (CV) to compare the extent to which the model-based estimates of FIP, FIG and FIS improve in precision compared to the corresponding direct estimates. In addition, we also [...] Figure 4 shows boxplots of these ratios. The distribution of CV in Figure 3 indicates that in most of the districts, the CVs of the model-based estimates are considerably smaller than those of the direct survey estimates. This demonstrates that the model-based estimates are relatively more precise than the direct estimates. Further, the improvement in CV is higher for the districts with smaller sample sizes than for those with larger sample sizes. The boxplots in Figure 4 also reflect that the CVs of the model-based estimates are smaller than those of the direct estimates, implying that the model-based estimates are less variable, and hence relatively more precise than the direct estimates. Overall, the CV diagnostic measures communicate that the [...] estimates of (FIP, FIG and FIS) for 127 districts are reported in Table 4.
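The CV comparison described above can be sketched as follows. This is a minimal illustration assuming the usual percentage CV (the square root of the estimated variance or MSE divided by the estimate); the helper names, and the choice to report the ratio as direct CV over model-based CV, are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np


def cv_percent(estimate, mse):
    """Percentage coefficient of variation: 100 * sqrt(mse) / estimate."""
    estimate = np.asarray(estimate, dtype=float)
    mse = np.asarray(mse, dtype=float)
    return 100.0 * np.sqrt(mse) / estimate


def cv_summary(direct, var_direct, model, mse_model):
    """Compare district-level CVs of direct and model-based estimates."""
    cv_direct = cv_percent(direct, var_direct)
    cv_model = cv_percent(model, mse_model)
    return {
        "cv_direct": cv_direct,
        "cv_model": cv_model,
        # Ratio above 1 means the model-based estimate is more precise.
        "ratio": cv_direct / cv_model,
        # Share of districts where the model-based CV is smaller.
        "share_improved": float(np.mean(cv_model < cv_direct)),
    }
```

Plotting `ratio` against district sample size would show whether the gain in precision is concentrated in districts with few sampled households, as the text reports.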
As expected, the averages of the model-based estimates of food insecurity indicators are almost identical to those of the direct estimates but with lower variation (i.e., smaller values of standard deviation). For example, the standard deviations of FIP estimates generated by the direct and the SAE methods are 0.154 and 0.117 respectively. This indicates that the model-based estimates of food insecurity indicators are more precise and representative than the direct estimates. To test the aggregation property, Table 5 reports the region and state level estimates of the food insecurity indicators generated by the direct and SAE methods. Comparing these with the corresponding direct estimates, we note that the model-based estimates are very close to the direct survey estimates at the region level as well as in each of the three states.

[...] Generally, food insecurity rates, intensity and severity are concentrated mainly in the northern and eastern parts of WB.

The U.S. Department of Agriculture describes food insecurity as a situation of "limited or uncertain availability of nutritionally adequate and safe foods or limited or uncertain ability to acquire acceptable foods in socially acceptable ways".
36 This definition draws our attention to the fact that food insecurity is more than just hunger, as is recognized by the 2030 sustainable development goals (SDGs) of the United Nations, and, in particular, SDG2, which is related to ending hunger, improving food security and nutrition, and promoting sustainable agriculture. 37 More recent data revealed that among Indian children of age 5 years and younger, 34.7% were stunted, 17.3% wasted, and 33.4% underweight; and among children of age 5-9 years, 21.9% were stunted, and 23% moderately-to-severely thin for their age. 40 To summarize, one-in-three Indian children was stunted and one-in-five wasted. 40 More than half the women of reproductive age (15-49 years) in India are anemic. 7 Studies on long-term effects of early-life and prenatal hunger among Indian populations are ongoing. 41

In India, "food insecurity" is defined as an average calorie intake of less than 2400 kcal per capita per day, based on the direct calorie intake (DCI) method. 42 The NSSO conducts nationwide HCE surveys at regular intervals as part of its "rounds", with the duration of each round normally being a year. The surveys are conducted through interviews of a representative sample of households selected randomly through a suitable sampling design and covering almost the entire geographical area of the country. The sampling design used in the 2011-12 HCES is stratified multi-stage random sampling with districts as strata, villages as first stage units and households as second stage units.
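The 2400 kcal threshold and the three indicators (FIP, FIG and FIS) can be illustrated with a small sketch. It assumes the indicators take the usual Foster-Greer-Thorbecke-type form, with exponent 0 for prevalence, 1 for gap and 2 for severity; the function name, the optional survey weights and the toy calorie values are hypothetical and not taken from the HCES data.

```python
import numpy as np


def food_insecurity_indicators(calories, threshold=2400.0, weights=None):
    """Return (FIP, FIG, FIS) for one district from per capita calorie intakes.

    calories  : per capita daily calorie intake of sampled persons
    threshold : food insecurity threshold in kcal per capita per day
    weights   : optional survey weights (equal weights if omitted)
    """
    c = np.asarray(calories, dtype=float)
    w = np.ones_like(c) if weights is None else np.asarray(weights, dtype=float)

    insecure = c < threshold  # strictly below the threshold
    # Relative shortfall from the threshold, zero for food-secure persons.
    shortfall = np.where(insecure, (threshold - c) / threshold, 0.0)

    total = np.sum(w)
    fip = np.sum(w * insecure) / total       # prevalence (exponent 0)
    fig = np.sum(w * shortfall) / total      # gap (exponent 1)
    fis = np.sum(w * shortfall**2) / total   # severity (exponent 2)
    return fip, fig, fis


# Toy example: two of four persons fall below 2400 kcal.
fip, fig, fis = food_insecurity_indicators([1800, 2400, 3000, 1200])
# fip = 0.5, fig = 0.1875, fis = 0.078125
```

The gap and severity measures weight each food-insecure person by how far below the threshold they fall, so two districts with the same prevalence can differ markedly in FIG and FIS.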
The NSSO surveys provide reliable state and national level estimates, but they cannot be used directly to produce reliable estimates at the district level due to small sample sizes. This article focuses on disaggregate level estimation and analysis of food insecurity indicators, viz., food insecurity prevalence (FIP), food insecurity gap (FIG) and food insecurity severity (FIS), using small area estimation methods.

With the policy and structural changes of the Green Revolution (GR), countries such as India were among the first to demonstrate the gains from high-yield varieties of cereals. While GR has substantially increased the country's food supply over the past half century, its other, more mixed outcomes include homogenization of cereal production, unsustainable resource use, especially where agroecological conditions are not well-suited, and greater vulnerability to climate variations. 22 The inter-regional differences in gains from GR reveal, especially among the EIGP districts, marked disparities in incidence of poverty, resource endowments, technology use and livelihoods. 43

To implement its agenda of sustainable development, India currently lacks the critically essential disaggregate level measures and maps of localized food insecurity. Towards this, the present study used the SAE approach to generate reliable and representative estimates and spatial maps of food insecurity prevalence, gap and severity among the populations in different districts of UP, Bihar and WB in the EIGP region, using the latest round of available data from the 2011-12 HCES and the 2011 Population Census. The results were evaluated through several diagnostic measures and revealed that the small area estimation methods offer significant gains in efficiency for generating district level estimates. The spatial maps thus produced provide key insights into the unequal distribution of food insecurity incidence, gap and severity among the different districts of these states.
Our SAE approach could not only serve as a template of rigorous disaggregation for the emerging next round of the National Family Health Survey (NFHS-5) in India but also provide the much-needed precise measures for conducting systematic comparative analysis of district-wise gains or losses in food security over the past decade. EIGP has long been studied for its relatively low productivity, poor infrastructure, limited capacity for private investment, and climate sensitivity. 44 Our district-level estimates and spatial maps can be very effective tools in informed policy-making, not only to address the challenge of food insecurity but also to understand the related factors that might be specific to the districts under consideration. Governmental agencies as well as various international organizations stand to benefit from our disaggregated data in formulating effective action plans to achieve the relevant SDGs. The study also underscores the potential for data-driven budget allocation and targeted welfare interventions through identification and prioritization of the districts with high food insecurity rates, intensity and severity. The COVID-19 pandemic has revealed the importance of building structural capacity with community-specific resiliency, which might have added significance in terms of regional sensitivity to climate change. For instance, formulation of policy for supporting micro, small and medium enterprises (MSMEs) in vulnerable districts could mitigate local unemployment, poverty and food insecurity. The SAE approach allows us to take a much-needed step in that strategic direction.
Figure 6. District-wise maps showing the spatial distribution of food insecurity prevalence (left), gap (center) and severity (right) generated by the SAE method for the states of Uttar Pradesh (top), Bihar (middle), and West Bengal (bottom) in the EIGP region.
Food and Agriculture Organization of the United Nations; and food insecurity
India's right to food act: A novel approach to food security
Achieving food and environmental security: New approaches to close the gap
SOFI 2017 - The State of Food Security and Nutrition in the World
WFP Year in Review in 2015. World Food Programme
Global Hunger Index (GHI) - peer-reviewed annual publication designed to comprehensively measure and track hunger at the global, regional, and country levels
The COVID-19 pandemic and food insecurity: A viewpoint on India
Climate change, the monsoon, and rice yield in India
Climate trends and global crop production since 1980. Science
Climate change impacts on crop productivity in Africa and South Asia
India's rainfed agroecosystem: Constraints and strategies
Temperature increase reduces global yields of major crops in four independent estimates
A vulnerability index for the management of and response to the COVID-19 epidemic in India: an ecological study
Is rainfall gradient a factor of livelihood diversification? Empirical evidence from around climatic hotspots in Indo-Gangetic Plains
Ganges Strategic Basin Assessment. A Discussion of Regional Opportunities and Risks
Recent climate and air pollution impacts on Indian agriculture
Groundwater arsenic contamination in the Ganga river basin: A future health danger
Assessing the sustainability of post-Green Revolution cereals in India
Green revolution: Impacts, limits, and the path ahead
Localised estimates and spatial mapping of poverty incidence in the state of Bihar in India - An application of small area estimation techniques
Comparison of food consumption in Indian adults between national and sub-national dietary data sources
Trends in nutritional status and nutrient intakes and correlates of overweight/obesity among rural adult women (≥18-60 years) in India: National Nutrition Monitoring Bureau (NNMB) national surveys
Small Area Estimation: Second Edition. Wiley
Disaggregate level estimates and spatial mapping of food insecurity in Bangladesh by linking survey and census data
CensusInfo India 2011 - Dashboards, Data Query, Houselisting and Housing data, Population and Education
Disaggregate-level estimates of indebtedness in the state of Uttar Pradesh in India: An application of small-area estimation technique
Estimates of Income for Small Places: An Application of James-Stein Procedures to Census Data
Small area estimation of proportions in business surveys