key: cord-0823981-dnnixa7p
authors: Jenneson, Victoria; Clarke, Graham P.; Greenwood, Darren C.; Shute, Becky; Tempest, Bethan; Rains, Tim; Morris, Michelle A.
title: Exploring the Geographic Variation in Fruit and Vegetable Purchasing Behaviour Using Supermarket Transaction Data
date: 2021-12-30
journal: Nutrients
DOI: 10.3390/nu14010177
sha: c5f48a7184cffa3b197e44429f5ce4b6114484b0
doc_id: 823981
cord_uid: dnnixa7p
The existence of dietary inequalities is well-known. Dietary behaviours are impacted by the food environment and are thus likely to follow a spatial pattern. Using 12 months of transaction records for around 50,000 ‘primary’ supermarket loyalty card holders, this study explores fruit and vegetable purchasing at the neighbourhood level across the city of Leeds, England. Determinants of small-area-level fruit and vegetable purchasing were identified using multiple linear regression. Results show that fruit and vegetable purchasing is spatially clustered. Areas purchasing fewer fruit and vegetable portions typically had younger residents, were less affluent, and spent less per month with the retailer.
Poor dietary quality contributes to rising rates of obesity and associated comorbidities in the UK [1, 2]. Many years of policies to encourage individual behaviour change have done little to reverse obesity rates [3]. Moreover, the influence of the food environment on obesity and poor diets [4, 5] has attracted policy attention [6]. Measures such as changes to food promotions [7, 8] and the soft drinks industry levy [9] in the UK have focused on altering the food environment to 'nudge' people towards healthier choices. The food industry has also taken voluntary action to make healthier diets more achievable, such as committing to selling more portions of vegetables as part of the Peas Please campaign [10, 11]. Studies of dietary behaviours are important for monitoring population dietary trends and responses to interventions such as policy changes.
Population dietary assessment typically employs national survey data, such as the UK's National Diet and Nutrition Survey (NDNS) [12]. Surveys employ self-report methods, such as food diaries and food frequency questionnaires, and offer detailed information on diet and nutrition as well as participant characteristics. This makes them useful for understanding the socio-demographic determinants of diet [13] [14] [15] [16]. However, the time and cost burdens for participants to complete surveys, and for researchers to code their outputs, limit their sample sizes. Relatively low sample sizes mean that the spatial resolution of national surveys is often poor and rarely offers detail below the regional level; regions in England have an average population greater than five million [17]. This limits their utility for investigating spatial dietary inequalities, which often occur at the neighbourhood level. These surveys enable us to monitor and understand consumption of fruits and vegetables which in turn can be used as a proxy for a healthy diet due to their role in prevention
1. Examine the small-area spatial distribution of fruit and vegetable purchases and predictors of this purchase behaviour.
2. Explore associations at a neighbourhood level between mean daily fruit and vegetable portions purchased and area socioeconomic characteristics, customer demographics, and access to supermarkets.
3. Develop a statistical model that identifies drivers of fruit and vegetable purchasing at a neighbourhood level.
The study sample included 50,917 customers who held a loyalty card for a major UK supermarket, registered to an address in the city of Leeds, England. Eligible customers made at least ten transactions during 2016, which included a minimum of seven out of 16 food categories, developed from categories captured by the Living Costs and Food Survey (LCFS) [37] (Table 1).
The inclusion criteria are described in more detail elsewhere [36], but briefly they aim to capture 'primary' shoppers who do the majority of their food shopping with the study retailer. The median shopping frequency of our sample is 53 occasions annually (interquartile range 33-82) [36]. Thus, we exclude customers with infrequent purchases from a limited range of food categories, on the basis that their purchases are unlikely to represent their overall diet. Exploratory data analysis identified some customers with extremely high loyalty card expenditure which we considered unlikely to represent typical household purchasing. We defined an upper bound of annual expenditure, based on household expenditure on food and non-alcoholic beverages from the 2016 edition of the Family Food Survey (FFS) [37]. A threshold of 1.5 times the inter-quartile range beyond the upper quartile (a common criterion for identifying large outliers in box plots), derived from the FFS report, was used to exclude customers at the upper end of the expenditure distribution. For symmetry, the same proportion of customers (1.95%) at the bottom end of the annual expenditure distribution was removed. Customers must be aged 18 or over to obtain a loyalty card with the retailer. For this reason, we excluded customers with a recorded age of 17 years or below, as these were assumed to be data errors. Anonymised customer characteristics (age, gender, and output area of residence) were derived from the retailer's loyalty card sign-up questionnaire. We assume that the loyalty card holder is the main person responsible for shopping in the household. The study region is determined by customers whose loyalty card is registered to an output area inside the Leeds boundary.
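The selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the record keys (`transactions`, `categories`, `age`, `annual_spend`) and both function names are hypothetical, and the quartiles are computed here from a list of survey expenditure values, whereas the paper takes them from the published FFS report.

```python
from statistics import quantiles


def upper_fence(values):
    """Upper quartile plus 1.5 times the inter-quartile range (box-plot rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def select_primary_shoppers(customers, fence, min_transactions=10,
                            min_categories=7, min_age=18):
    """Apply the eligibility rules, then trim both tails of annual spend.

    `customers` is a list of dicts with hypothetical keys
    'transactions', 'categories', 'age' and 'annual_spend'.
    `fence` is the upper bound on annual expenditure.
    """
    eligible = [
        c for c in customers
        if c["transactions"] >= min_transactions
        and c["categories"] >= min_categories
        and c["age"] >= min_age
    ]
    # Count and drop customers above the upper expenditure bound.
    n_high = sum(1 for c in eligible if c["annual_spend"] > fence)
    kept = sorted(
        (c for c in eligible if c["annual_spend"] <= fence),
        key=lambda c: c["annual_spend"],
    )
    # For symmetry, drop the same number of customers from the bottom end.
    return kept[n_high:]
```

Removing the same number of customers from each tail is equivalent to removing the same proportion of the eligible sample, which is how the 1.95% figure in the text is applied.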
Leeds is a diverse city with cosmopolitan (ethnically diverse) and deprived areas in the south and west of the city, affluent suburbs in the north and east, and a large student population in the inner western suburbs. Figure 1a shows the spatial distribution of the 2015 Index of Multiple Deprivation (IMD) decile at the Lower Super Output Area (LSOA) level, a neighbourhood census geography representing 400-1200 households. The IMD is a rank of deprivation for more than 32,000 LSOAs in England [38]. These are split into deciles, where 1 represents the most deprived 10% of areas in England. Figure 1b shows the 2011 UK Output Area Classification (OAC) for Output Areas (OAs) in Leeds; the OA is a small-area census geography containing around 125 households. The OAC is an open-source census-derived national hierarchical geodemographic classification system [39, 40]. Customer area of residence is known at the Output Area (OA) level and is used to describe the characteristics of areas in the study in the absence of detailed individual-level demographic data. This study uses the Supergroup level of the OAC hierarchy, which assigns areas to one of eight Supergroups, according to the affluence, ethnic composition, rurality, age demographics and other characteristics of the people residing there. Due to small customer numbers at the OA level, areas were aggregated to the Lower Super Output Area (LSOA) (400-1200 households) [41] for analysis. LSOAs with low customer numbers ( 50,000) is very large compared with many presented in the literature and all socioeconomic and geodemographic groups are represented in relatively large numbers (the lowest being 731 customers in the Ethnicity Central Output Area Classification Supergroup). Supermarket data, even from a single retailer, may therefore contain higher numbers of the hard-to-reach groups, giving greater power across all socioeconomic segments of the population.
That said, we cannot be sure that customers in our sample are typical of their neighbourhood characteristics. Customers in Leeds purchased on average 3.4 portions of fruits and vegetables per household per day, which equates to just 1.5 daily fruit and vegetable portions per person, considering the size of the average Leeds household (2.3 people) [44] . Our purchase estimate is well below the five-a-day recommendation and lower than daily intakes estimated by the NDNS (4.2 portions per person) [52] and the Health Survey for England (HSE) (3.8 portions per person) [53] . Survey estimates are known for over-reporting of fruit and vegetables due to social desirability biases, which are not a problem for objective automated purchase records. The degree to which household-level purchases from the retailer represent individual consumption is unknown. Previous validation studies highlight that agreement between purchases and consumption is likely to vary by loyalty status and household composition [54, 55] , with higher agreement observed for single-person households [54] . However, accepted adjustment factors remain lacking. Future work could incorporate known dietary variation by gender and life-stage by accounting for household composition (number and age of household members) to more accurately estimate individual-level intake from household purchase records. As this information cannot typically be obtained from retailer loyalty card records, this may involve using survey data, area-level estimates, or the development of methodologies to model household composition, for example microsimulation using census statistics [56, 57] . As we do not account for household waste or inedible proportions, our portions estimate may be inflated by as much as 28% for fresh vegetables and salad, and 6% for fresh fruit, according to national household waste estimates [58] . 
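The arithmetic behind these figures can be checked with a short sketch. It assumes, purely for illustration, that the 28% and 6% waste figures are applied as the share of purchased portions that is not eaten; the source does not state how fruit and vegetables were weighted, so the two results below are bounds rather than a reproduction of the authors' calculation. The function names are hypothetical.

```python
def per_person_portions(household_portions, household_size):
    """Convert household-level daily portions to a per-person figure."""
    return household_portions / household_size


def waste_adjusted(portions, waste_fraction):
    """Remove an assumed wasted or inedible share from a portions estimate."""
    return portions * (1.0 - waste_fraction)


# Figures quoted in the text: 3.4 portions per household, 2.3 people per household.
per_person = per_person_portions(3.4, 2.3)

# Bounds: every portion treated as fresh vegetables/salad (28%) or as fresh fruit (6%).
all_veg = waste_adjusted(per_person, 0.28)
all_fruit = waste_adjusted(per_person, 0.06)

print(round(per_person, 1), round(all_veg, 1), round(all_fruit, 1))
```

Under this reading the per-person estimate is about 1.5 portions, and the waste-adjusted value lies between roughly 1.1 and 1.4 portions per person per day depending on the fruit-to-vegetable mix.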
While robust methods for adjusting transaction records for waste are needed, crude application of national estimates would reduce our portions estimate to roughly 1.1 portions purchased per person per day. Furthermore, as our estimate is from a single retailer only, and does not include fruit and vegetables purchased or obtained elsewhere (e.g., from other retailers, home-grown, or consumed in restaurants) or in composite dishes purchased from the retailer, it is likely to under-represent total household fruit and vegetable purchases. Fruit and vegetable purchases were found to vary spatially, with clusters of high fruit and vegetable purchasing in the affluent rural and suburban areas to the north and east of the city, while clusters of low fruit and vegetable purchasing were observed in the more deprived neighbourhoods in and around the city centre. The observed association between fruit and vegetable purchasing and area deprivation concurs with research into the geography of dietary patterns based on survey data, which found a higher prevalence of the vegetable-rich 'health conscious' and 'high diversity vegetarian' dietary patterns in suburban areas with lower deprivation [31, 59]. Using transaction records, fruit and vegetable purchases were important determinants of the observed 'Fruity' and 'Meat Alternative' dietary patterns, which were more prevalent among customers in the most affluent deciles [36]. Yet, it is possible that the observed deprivation pattern may be confounded by differences in household composition, for example the mix of adults and children. Despite the apparent presence of an overall deprivation gradient in fruit and vegetable choice behaviours, exploration of LSOAs classed as outliers and with high residual values identified neighbourhoods which appear to be exceptions to the rule. These areas suggest that education and ethnicity moderate the effect of deprivation.
In spite of relative deprivation and a low overall spend, outlier areas occupied by students and minority ethnic families spent a higher-than-average proportion of their total expenditure on fruits and vegetables, which translated to more portions purchased than predicted. This could be indicative of a preference for scratch-cooking or meal assembly (e.g., the addition of peppers to a fajita meal kit) among these groups. Similarly, deprivation did not translate to low fruit and vegetable purchases for some rural communities. A higher than average spend observed in these outlier areas could be attributed to transactions capturing a larger proportion of total purchases, due to less retail competition. Despite spending a lower proportion of their total expenditure on fruits and vegetables, this did not translate to fewer portions, which may indicate thriftiness and a preference for cheaper fruit and vegetable varieties, which enable them to get more portions for their money. Outlier LSOAs with lower than predicted fruit and vegetable purchases were occupied by families right across the deprivation spectrum. While these areas had a higher than average spend with the retailer, they prioritised spend on fruits and vegetables to a lesser degree. This may be indicative of busy family lives and a preference for convenience meals, a tendency to source fruits and vegetables elsewhere e.g., greengrocers or home-growing, or a preference for more expensive varieties. Outlier LSOAs also had a lower proportion of female customers overall, especially among more deprived areas. A sensitivity analysis repeating the model after exclusion of outlier LSOAs led to the proportion of females becoming a significant negative predictor of fruit and vegetable purchases (Supplementary Table S2 ). This is surprising given that females purchase more fruit and vegetables than males on average at the customer-level. 
While the reason is unclear, it could be that females are more likely to be the primary shopper for busy families which rely on convenience meals. At the neighbourhood level, a higher proportion of over 65s was associated with higher fruit and vegetable portions purchased. The relationship with age may be a true reflection of differences in fruit and vegetable intake and agrees with other studies which found higher fruit and vegetable consumption among older adults [19, 60, 61] . Yet, at the household level it is perhaps counter-intuitive that older adults should purchase more portions of fruit and vegetables, given that they are more likely to live alone or with just one other as children have left home. It is possible therefore that the relationship may also reflect differences in purchasing and food preparation practices. For example, younger adults often lack cooking skills, are likely to be under greater time-pressures due to work and childcare responsibilities, and may therefore prefer to choose convenience meals rather than cooking from scratch [62, 63] . While estimates by the retailer indicate that ready meals contribute only a small fraction of all vegetables purchased (unpublished data), our inability to accurately quantify the fruit and vegetable content of composite foods is likely to under-estimate fruit and vegetable purchases particularly among low-income working families and young people. Younger adults also consume more takeaway and restaurant meals [13] , which may provide additional uncaptured fruit and vegetable portions. Some research suggests that greater access to supermarkets is associated with higher fruit and vegetable intake [27, 28] . Despite this, distance to nearest store and most used store were not found to be significantly associated with fruit and vegetable purchases in either model in this study. 
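The two models referred to here (the full model and the sensitivity re-fit without outlier LSOAs) follow a general pattern that can be sketched as below. This is a minimal standard-library illustration, not the authors' code: the source does not state how outliers were defined, so the residual cut-off `z_cut` is a hypothetical choice, and all function names are invented for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(rows, y, beta):
    """Observed minus fitted values for each area."""
    return [yi - (beta[0] + sum(b * v for b, v in zip(beta[1:], r)))
            for r, yi in zip(rows, y)]


def refit_without_outliers(rows, y, z_cut=2.0):
    """Fit, flag areas with large residuals (hypothetical rule), then re-fit."""
    beta_full = fit_ols(rows, y)
    res = residuals(rows, y, beta_full)
    sd = (sum(e * e for e in res) / (len(res) - len(beta_full))) ** 0.5
    flagged = [i for i, e in enumerate(res) if abs(e) > z_cut * sd]
    keep = [i for i in range(len(y)) if i not in flagged]
    beta_trim = fit_ols([rows[i] for i in keep], [y[i] for i in keep])
    return beta_full, beta_trim, flagged
```

Comparing `beta_full` with `beta_trim` shows how far a handful of unusual areas can move the coefficients, which is the purpose of the sensitivity analysis described in the text.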
Indeed, rural and suburban areas to the north of the city demonstrated both the greatest average distances to nearest store and the highest fruit and vegetable purchases. It is possible that the relationship between proximity and fruit and vegetable purchases may vary spatially, moderated by unmeasured structural factors such as car ownership, access to public transport, store format (superstore or convenience store), the availability of other food outlets in the neighbourhood, and the degree to which a particular retailer meets a customer's social, cultural and economic needs [27] . While all store formats offer some fruits and vegetables, there will be differences in the range offered. Aggarwal et al. [60] found that only one third of participants shopped at their nearest store, and those who shopped at low-cost stores were more likely to travel beyond their nearest store. In another study by Liese et al. [64] , access to store was associated with frequency of shopping trips, but not with fruit and vegetable intake, suggesting that access may be more closely associated with purchase pattern (e.g., top up shopping compared with a large weekly shop) than purchased amounts. While shopping frequency was not found to be significantly associated with fruit and vegetable purchases in the present study, we observed a narrowing of confidence intervals around our estimates after removal of outlier LSOAs, increasing the significance of findings (supported by a smaller p-value). Outlier areas were on average further from their most used store than the sample as a whole. The validity of distance as a measure of access should also be considered as it disregards the store offering and product prices. The average distance to the most-used store was high in this study (>10 km), with a number of customers frequenting stores outside of the Leeds study region. While these are likely to be edge cases led by store network accessibility, this behaviour warrants further exploration. 
The high distance to the most-used store may be explained, for example, by customers shopping on their commute to work outside of the area, spending time at two addresses (for example students who return home outside of term time), or customers who have migrated outside the area without updating the address associated with their loyalty card. The literature indicates good agreement between supermarket purchase data and self-reported dietary measures [55, 65, 66] . Among loyal customers, even a single retailer can make a significant contribution to total household food purchases [33, 55, 67] . While we do not know how much of a customer's total purchases are represented by the retailer, we have tried to select a relatively loyal customer sample, as indicated by their membership in the loyalty card scheme and frequent and broad-ranging purchase history. Customers in the sample visit the store on average five times per month. Controlling for total monthly spend on food and non-alcoholic beverages with the retailer goes some way to account for loyalty, assuming that higher spend with the retailer represents a higher proportion of the available food purse. However, higher total monthly spend may also be indicative of a larger household size or affluence, denoting a preference for more expensive premium food stuffs rather than volume of food purchased. Degree of loyalty could better be controlled for using estimates of basket share or the Recency, Frequency, and Monetary value (RFM) index for example. Alternatively, as proposed by Rains and Longley [48] , purchase 'completeness' at the category level could be estimated by comparing retail expenditure with estimates in national survey data. While we observed spatial clustering of the outcome variable, the only predictor variables which showed spatial clustering were IMD and OAC, which are inherently spatial. 
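Spatial clustering of a variable across areas is commonly summarised with global Moran's I. The passage above does not name the statistic that was used, so the sketch below is an assumption offered for illustration, with a hypothetical binary adjacency matrix standing in for the LSOA neighbourhood structure.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix.

    weights[i][j] > 0 when areas i and j are neighbours; the diagonal is zero.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / w_sum) * (cross / denom)


# Four areas in a row; each area neighbours the one next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)     # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)   # neighbours always differ
```

Positive values indicate that similar values tend to be neighbours, negative values indicate that neighbours tend to differ, and values near the expectation under no spatial pattern (slightly below zero) indicate no clustering.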
As the deprivation index and geodemographic segmentation go some way to capturing the nature of the food environment and the characteristics of people who live in an area, we considered the effect of uncaptured spatial factors on the model coefficients to be minimal. Despite this, we found LSOAs with high positive residual values to be clustered in the south of the city and those with high negative residual values to be clustered in the west. Similarly, Clary et al. [27] found nonstationarity in the interaction between food environmental exposures and fruit and vegetable intake using GWR across four London boroughs. While there are likely to be limits to the validity of GWR at such granular geographic scales as that applied in this study, it is possible that our global model may have missed spatial variation in the local food environments and the way in which people respond to their environment. Incomplete spatial representation of dietary behaviours due to missing information about transactions from other retailers further limits the applicability of GWR approaches. Nevertheless, exploration of outlier areas from the regression model revealed some interesting insights which became more apparent when applying more granular levels of the hierarchical Output Area Classification (Group and Sub-group, rather than Supergroup as used in the model).
Dietary research has long shown socioeconomic inequalities. While low overall fruit and vegetable purchase levels warrant efforts to increase purchasing across the board, geographically untargeted strategies require huge investment and are likely to widen inequalities. To ensure those who purchase the least fruits and vegetables are not left behind, it is important to understand where best to focus interventions. Exploring neighbourhood-level fruit and vegetable purchases offers retailers insights for store-level stocking and marketing decisions.
Interventions to increase fruit and vegetable purchases should target stores in areas with low purchase levels, especially those serving younger more deprived urban communities. These areas tend to be served by smaller stores where limited ranges make groceries comparatively more expensive. With small stores set to be exempt from new location-based in-store promotional restrictions in the UK [7] , strategies to level the playing field are increasingly important. Strategies focusing on convenience, affordability and appeal are most likely to be successful among these groups [68] . Outliers in the study reveal that the influence of deprivation may be moderated by education and ethnicity, while busy family lives could be an important barrier to purchasing fruit and vegetables. Outlier areas should be explored in more detail in subsequent studies to understand the local factors which cause them to buck the deprivation trend. This evidence would inform the current social prescribing debate by revealing local influencers of healthy diets. Further work should also explore whether diet-related inequalities are contributing to the spatial inequalities which can be observed in a wide range of health outcomes. Exploration of population diet using electronically captured secondary purchase data is in its relative infancy and, as such, we acknowledge several limitations which set out a foundation for future research. Future directions include estimation of and controlling for household characteristics to extrapolate individual-level estimates; controlling for the inedible proportion of fruit and vegetables and food waste; estimating the fruit and vegetable content of composite dishes; exploring purchases of fruits and vegetables separately, breaking these down further by type; and exploring the effect of seasonality on purchasing behaviours. 
The validity of applying geographically weighted regression to neighbourhood-level geographies, and the ability of existing survey data to complement supermarket purchase records for the development of small area estimation models, should also be considered. In conclusion, supermarket loyalty card transactions allow us to investigate small area patterns in food purchase behaviours and reveal that areas purchasing fewer fruit and vegetable portions typically had younger residents, were less affluent, were closer to the supermarket but shopped less frequently, and had a lower total monthly spend with the retailer. In addition, we were able to unpack outliers such as those populated by students which had higher than expected fruit and vegetable purchases despite relative deprivation, illustrating that more nuanced relationships exist than those reported in earlier research.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/nu14010177/s1, Table S1: Outlier LSOAs (n = 25) by IMD decile; Table S2
The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the University of Leeds Ethics committee reference: AREA 18-050.
Informed Consent Statement: Informed consent was not required for this secondary data analysis, and was not possible to obtain as all data were anonymized.
Data Availability Statement: Due to the commercial nature of the data used in this research, it is not possible for data to be published alongside the manuscript.
Health Survey for England 2018 Overweight and Obesity in Adults and Children; Health and Social Care Information Centre Is obesity policy in england fit for purpose? Analysis of government strategies and policies Obesity Systems Map Halting the obesity epidemic: A public health policy approach General practitioners and patients models of obesity: Whose problem is it? Consultation on restricting promotions of products high in fat, sugar and salt. 
In Department of Health and Social Care Tackling obesity: Empowering adults and children to live healthier lives. In Department of Health and Social Care If Your Drink is Liable for the Soft Drink Industry Levy: Gov.UK Veg Pledges UK2021. Available online Food Foundation National Diet and Nutrition Survey UK Frequency and sociodemographic correlates of eating meals out and take-away meals at home: Cross-sectional analysis of the UK national diet and nutrition survey, waves 1-4 (2008-12) Sugar intake, soft drink consumption and body weight among British children: Further analysis of National Diet and Nutrition Survey data with adjustment for under-reporting and physical activity Socio-economic dietary inequalities in UK adults: An updated picture of key food groups and nutrients from national surveillance data Time trends in adherence to UK dietary recommendations and associated sociodemographic inequalities, 1986-2012: A repeated cross-sectional analysis Regional and Social Differences in Coronary Heart Disease; British Heart Foundation Health Promotion Research Group World Cancer Research Fund International. 
Diet, Nutrition, Physical Activity and Cancer: A Global Perspective; World Cancer Research Fund International NDNS: Time Trend and Income Analyses for Years 1 to 9 Food Foundation Illustrating health inequalities in Glasgow The geography of fast food outlets: A review More than just food: Food insecurity and resilient place making through community self-organising Obesity and the Environment: Regulating the Growth of Fast Food Outlets Fast food and obesity: A spatial analysis in a large united kingdom population of children aged 13-15 The local food environment and fruit and vegetable intake: A geographically weighted regression approach in the ORiEL study Local food environment and fruit and vegetable consumption: An ecological study Neighbourhood socioeconomic disadvantage and fruit and vegetable consumption: A seven countries comparison Geographic disparities in Healthy Eating Index scores (HEI-2005 and 2010) by residential property values: Findings from Seattle Obesity Study (SOS) Geography of diet in the UK women's cohort study: A cross-sectional analysis Comparing supermarket loyalty card data with traditional diet survey data for understanding how protein is purchased and consumed in older adults for the UK Food and nutrient availability in New Zealand: An analysis of supermarket sales data. Public Health Nutr Tesco Grocery 1.0, a large-scale dataset of grocery purchases in London Supermarket sales data: A tool for measuring regional differences in dietary habits Dietary patterns derived from UK supermarket transaction data with nutrient and socioeconomic profiles Living costs and food survey. 
In User Guidance and Technical Information for the Living Costs and Food Survey English Indices of Deprivation 2015: UK Government Creating the National Classification of Census Output Areas: Data, Methods and Results; School of Geography Working Paper 05 Creating the 2011 area classification for output areas Geography: An Overview of the Various Geographies Used in the Production of Statistics Collected Via the UK Census Why 5 A Day? NHS Leeds Observatory. Household Size and Rooms in Leeds: Gov.UK. 2021. Available online Census: Population Estimates by Single Year of Age and Sex for Local Authorities in the United Kingdom: Unrounded Estimates of the Usually Resident Population by Age And sex, Along with Household Estimates on Census Day Full Results for Leeds Spreadsheet; IoD-2019-LSOA-Ward-Alt.xlsx: Esri UK The provenance of loyalty card data for urban and retail analytics A systematic review of supermarket automated electronic sales data for population dietary surveillance Reaching the hard-to-reach: A systematic review of strategies for improving health and medical research with socially disadvantaged groups The elusiveness of representativeness in general population surveys for alcohol National Diet and Nutrition Survey London: GOV.UK; Public Health England Health Survey for England 2017 Adult Health Related Behaviours Do we eat what we buy? Relative validity of grocery purchase data as an indicator of food consumption in the LoCard study Use of household supermarket sales data to estimate nutrient intakes: A comparison with repeat 24-hour dietary recalls Creating a Synthetic Spatial Microdataset for Zone Design Experiments using 2011 Census and Linked Administrative Data Using census data in microsimulation modelling Household Food Waste: Restated Data for What is the cost of a healthy diet? 
Using diet data from the UK women's cohort study Access to supermarkets and fruit and vegetable consumption /16) Confidence to cook vegetables and the buying habits of Australian households Sociodemographic characteristics and frequency of consuming homecooked meals and meals from out-of-home sources: Cross-sectional analysis of a population-based cohort study Environmental influences on fruit and vegetable intake: Results from a path analytic model The use of supermarket till receipts to determine the fat and energy intake in a UK population To what extent do food purchases reflect shoppers' diet quality and nutrient intake? The impact of food-related values on food purchase behavior and the mediating role of attitudes: A Swiss study Food Foundation. Peas Please. In Peas Please. Reviewing the Evidence: What Can Retailers Do to Increase Sales of Fruit and Veg; Food Foundation
Acknowledgments: Thank you to the retailer for providing data in kind for this research; to Stephen Clark for technical expertise in GIS and R; to members of the lead author's (VJ's) Research Support Group panel for comments during the early stages of the work; and to the Data Analytics team at the Leeds Institute for Data Analytics for support with data import to the secure environment.
T.R., B.T. and B.S. are employees at the grocery retailer. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.