key: cord-0720559-3xa6ru0q
authors: Harris, R.; Brunsdon, C.
title: Measuring the exposure of Black, Asian and other ethnic groups to Covid-infected neighbourhoods in English towns and cities
date: 2021-03-08
journal: nan
DOI: 10.1101/2021.03.04.21252893
sha: 896f61ba02f4dae27ff8b95675501ac20cc19cdb
doc_id: 720559
cord_uid: 3xa6ru0q

Drawing on the work of The Doreen Lawrence Review, a report on the disproportionate impact of Covid-19 on Black, Asian and minority ethnic communities in the UK, this paper develops an index of exposure, measuring which ethnic groups have been most exposed to Covid-19 infected residential neighbourhoods during the first and second waves of the pandemic in England. The index is based on a Bayesian Poisson model with a random intercept in the linear predictor, allowing for extra-Poisson variation at neighbourhood and town/city scales. This permits within-city differences to be decoupled from broader regional trends in the disease. The research finds that members of ethnic minority groups tend to be living in areas with higher infection rates but also that the risk of exposure is distributed unevenly across these groups. Initially, in the first wave, the disease disproportionately affected Black residents. As the pandemic has progressed, especially the Pakistani but also the Bangladeshi and Indian groups have had the highest exposure. This higher exposure of the Pakistani group is not straightforwardly a function of neighbourhood deprivation because it is present across a range of average house prices. However, we find evidence to support the view, expressed in The Doreen Lawrence Review, that it is linked to occupational and environmental exposure, particularly residential density.

[…] deprived areas of England throughout the period March to July, 2020, many of which are occupied by BAME groups but not only BAME groups.
In England, the main predictors of Covid-19 vulnerability have been identified as the proportions of the population (a) living in care homes, (b) admitted to hospital in the past five years for a long-term health condition, (c) from an ethnic minority background, and (d) living in overcrowded housing (Daras et al., 2020). The co-linearity of these variables with each other, as well as with age and occupation, could explain why income deprivation was not found to be statistically significant in this study (that is, the effects of income deprivation were measured through the other variables). Occupations with increased risk of exposure include frontline medical staff, the emergency services, public transit workers, teachers and those working in the hospitality industry. Many of these occupations (for example, pharmacists, dental and medical practitioners, and bus drivers) have a disproportionate percentage of their workforce from BAME backgrounds (ONS, 2020d). In the United States, the Centers for Disease Control and Prevention observe that "[s]ome of the many inequities in social determinants of health that put racial and ethnic minority groups at increased risk of getting sick and dying from Covid-19" include: discrimination; healthcare access and utilisation; occupation; educational, income and wealth gaps; and housing (CDC, 2020). These concerns echo those found in The Doreen Lawrence Review. A systematic review of 50 studies, 42 from the United States of America and 8 from the United Kingdom, confirms that individuals from Black and Asian ethnicities had a higher risk of Covid-19 infection compared to White individuals (Sze et al., 2020). However, a report by the UK Government Equalities Office and Race Disparities Review (2021) stresses that "that population. These tests are done at regional test sites, mobile testing units, satellite test centres and via home tests."
Not everyone who has the disease is tested or necessarily displays symptoms, so the numbers are underestimates of the true prevalence of the disease within the population. This will be especially true prior to the beginning of July 2020 when the figure is, in effect, a count of the number hospitalised by Covid-19. The undercount is compounded throughout the study period by the suppression of the exact number in any MSOA that had 0 to 2 cases that week. For this study, these are treated as zero values but they could be one or two.

A further problem is that test results are necessarily a function of testing: of where people are being tested and their ability to access a test. The availability of tests has changed over time (https://coronavirus.data.gov.uk/details/testing/). Although the increase in testing is a response to the second wave of the disease, it also reflects increased capacity to test, which, in turn, allows for more positive diagnoses. That testing capacity has been constrained for some of the period of this study, with the British Medical Journal publishing a briefing in September entitled What's going wrong with testing in the UK? (Wise, 2020). It is possible that the relative resilience of London to growing numbers of cases during the initial emergence of the second wave in England, from the end of September and early October 2020, was in part caused by insufficient testing. However, the nature of the regional economy and the type of jobs that are more easily adapted to home working may also be a factor (Harris & Cheshire, 2020). Not everyone who has been in close contact with an infected person has necessarily been tested. Ongoing problems with the track-and-trace system (Triggle, Schraer & Kemp, 2020) meant that about half of those who had been in contact with an infected person had not been contacted, so could harbour the disease with neither symptoms nor knowledge.

There have also been changes to the data, first to harmonise them and to remove duplicates and, secondly, to assign people more accurately to their current residential address. This raised the numbers around universities as students were located at their term-time address, which may differ from the one on their NHS record because the latter can be a parental address if they are registered with a GP 'back home'. Nevertheless, the analysis assumes that tests follow symptoms, as well as the identification of known clusters of the disease, and that, therefore, the data track the disease's spread amongst the population sufficiently well to be broadly representative of who has been infected and where they live. The data are neither a random sample nor a complete census of all those who have been infected each week. However, they are the official source of infection data in the UK, used to inform public health policy and to guide preventative measures such as national and local lockdowns. We do not dispute the possibility of bias but note that the most likely systematic problem is an undercount of those who could be infected but can least afford either to isolate from out-of-home employment for the required quarantine period or to travel to access a test. If such a bias exists, then it is likely to downplay the infection rate amongst BAME groups because of the intersections of ethnicity and social disadvantage, which means differences in exposure between BAME groups and the White British will be underestimated in the analysis.

This preprint (which was not certified by peer review) is made available under a CC-BY-NC-ND 4.0 International license. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version posted March 8, 2021. https://doi.org/10.1101/2021.03.04.21252893
The study is confined to English major towns and cities, as defined in the ONS file of December 2015 (see footnote 4). This is based on the built-up areas geography developed following the 2011 Census (ONS, 2013). Rural and semi-urban areas are not considered, partly because of their greater population sparsity that, all things being equal, reduces person-to-person contact and therefore the transmission of the disease. Much of the South West region of England, for example, has had consistently lower rates of infection throughout the pandemic and is mainly rural. However, the main reason is demographic, informed by the ethnic geography of the country. Whereas many English towns and cities are ethnically diverse, more rural areas are typically not, containing far fewer of the BAME than White population. According to the 2011 Census, 61 per cent of the White British population resides outside the major towns and cities. The next largest percentage is equal for the White Other and Mixed ethnicity groups, of whom 30 per cent do not reside in these towns and cities. For the Black Caribbean and Black African groups, it is only 11 per cent. Nationally, lower Covid-19 infection and death rates amongst the White British population reflect rural-urban patterns of living. To compare 'white' rural hamlets with multicultural urban settlements is problematic; they are very different types of places. Consequently, we prefer the direct comparison, asking whether differential rates of exposure remain evident between different ethnic groups within urban settlements.

Footnote 4: https://geoportal.statistics.gov.uk/datasets/major-towns-and-cities-december-2015-names-and-codes-inengland-and-wales
In total, 109 English towns and cities are included in the study. The smallest is Walsall, with a mid-2019 estimated population of 65,928. The largest is London, with a population of 8,924,265. Because London is so much larger than the other settlements (its population is over 35 times greater than the average of 250,479 and approaching eight times greater than that of the second largest settlement, Birmingham, with 1,153,804), it is split into its local authorities for the analysis: the 32 London Boroughs, with the City of London merged with Westminster. This gives a total of 140 urban 'places'.

Method of estimating the relative exposure of ethnic groups to neighbourhoods of Covid-19

For the analysis, an index of exposure is formed to measure how much members of the various ethnic groups are exposed, by residence, to neighbourhoods of Covid-19 infection for each week of the study period. This index is based on the residuals from a Bayesian Poisson model, which are used to identify neighbourhoods that have more or less than the expected number of Covid-19 cases, relative to: (a) […]

The index is based on that used in studies of segregation where,

$$I_k = \sum_i \frac{n_{i(k)}}{n_{+(k)}} \, p_i \qquad (3)$$

For such studies, $I_k$ is the index value for ethnic group, $k$, of whom there are $n_{i(k)}$ in neighbourhood, $i$, and $n_{+(k)}$ for all neighbourhoods within the study. The remaining value, $p_i$, may be interpreted probabilistically. For a segregation index, it is the probability of selecting a member of a second ethnic group from the same neighbourhood that the member of ethnic group $k$ is living in (see footnote 5). The modification we make is that $p_i$ becomes the probability of selecting, from a Normal distribution, a value lower than the MSOA's deviance from its expected infection rate for the week. This means that if most of an ethnic group is living in MSOAs where the infection rate is higher than expected then the index will tend towards one, getting closer to one the greater the deviance from expectation.
If most are living in neighbourhoods where the infection rate is lower than expected, it will tend towards zero. If the group is spread amongst neighbourhoods where those with higher infection rates balance out those with lower, then the index value will be 0.5.

The value of $p_i$ is extracted from $u_i$ and $v_j$ in Equation 2, the random intercepts for the neighbourhood (MSOA), $i$, and for the town or city (place), $j$, in which it is located. These are treated as quantiles from a Standard Normal distribution and used to generate the probabilities that feed into the index of exposure. With reference to Equation 3, $p_i = P(z < u_i + v_j)$ or $p_i = P(z < u_i)$, where $z$ represents quantiles from a Standard Normal distribution. Critically, the two variants of $p_i$ mean that the index can be calculated with or without place effects, the latter controlling for the differences between towns and cities and therefore the broader scale geography of the disease. This is discussed further in the results section.

Footnote 5: For a segregation index, $p_i = n_{i(l)} / n_{i+}$, where $n_{i+}$ is the total population of $i$ and $l$ denotes the second group: see, inter alia, the appendix of Harris & Johnston (2020), or Kaplan (2018).

The model is fitted 49 times, once for each week of the pandemic included in the study. It follows that the index of exposure is calculated weekly too. Adding in the subscript, $t$, to denote time gives,

$$\log(\lambda_{it}) = \beta_{0t} + \beta_{1t} x_{1i} + \beta_{2t} x_{2i} + \beta_{3t} x_{3i} + \log(n_i) + u_{it} + v_{jt} \qquad (4)$$

where $\lambda_{it}$ is the expected number of cases in neighbourhood $i$ in week $t$, $x_{1i}$ to $x_{3i}$ are the control variables and $n_i$ is the neighbourhood population. This shows that the modelled rate varies weekly, as do the random intercepts, but not the control variables because the population size, demographic profile and number of care home beds are assumed to be constant for the study period. The regression coefficients could also be assumed to be […] is when concern about the lack of protection afforded to care home residents and their staff was of considerable media interest.
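As an illustration of how the index of exposure described in this section might be computed, the following Python sketch weights the Standard Normal probabilities by each neighbourhood's share of a group's population. It is a minimal sketch under stated assumptions and not the authors' code (the paper's models were fitted with brms in R): the function name `exposure_index`, its argument names and the toy numbers are all hypothetical, and the neighbourhood and place effects are taken as given rather than estimated.

```python
from statistics import NormalDist


def exposure_index(group_counts, msoa_effects, place_effects=None):
    """Population-weighted mean of P(z < deviance) across neighbourhoods.

    group_counts[i]  -- residents of one ethnic group in neighbourhood i
    msoa_effects[i]  -- neighbourhood-level effect for i (assumed given)
    place_effects[i] -- effect of the town/city containing i, or None
                        to obtain the local variant without place effects
    """
    total = sum(group_counts)
    if total == 0:
        raise ValueError("group has no residents in the study area")
    if place_effects is None:
        place_effects = [0.0] * len(group_counts)

    std_normal = NormalDist()  # mean 0, standard deviation 1
    index = 0.0
    for n, u, v in zip(group_counts, msoa_effects, place_effects):
        p = std_normal.cdf(u + v)   # probability of a lower Normal value
        index += (n / total) * p    # weight by the group's share living in i
    return index


# Toy data (invented for illustration, not results from the paper)
counts = [600, 300, 100]
u = [1.0, 0.0, -1.0]
v = [0.5, 0.5, -0.5]

full = exposure_index(counts, u, v)   # with place effects
local = exposure_index(counts, u)     # without place effects
print(round(full, 3), round(local, 3))
```

A group split evenly between neighbourhoods with equal and opposite deviations returns 0.5, which matches the interpretation given in the text.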
Greater percentages of those aged 18-21 generally have a negative effect on the infection rates but not in the first week of October, in particular, when they had the strongest effect on increasing infection rates. This is when students returned from their homes to their term-time addresses after a delay in the start of the university term. The effect of populations aged 22-35 on infections is greatest after the first lockdown had ended and through the summer when pubs and restaurants had partially reopened and more people had returned to their workplaces.

The models are fitted as Bayesian models using the brms package for R (Bürkner, 2017, 2018), which provides an interface to Stan (Stan Development Team, 2019). Results take the form of draws from a Bayesian posterior distribution for the parameters of interest that are then summarised by their mean value.

Figure 1. The regression coefficients for the control variables for each week of the study. The English national lockdown periods are shaded in grey.

The index values, initially calculated with the inclusion of both the MSOA and place level effects, $u_i + v_j$, are shown in the upper part of Figure 2. Because the model is fitted separately for each week, the values are relative: they show which of the ethnic groups are living more or less in Covid-infected neighbourhoods that week, given the underlying infection rate. To improve the legibility of the chart, some groups are omitted; specifically, the Arab, Black Other, […]
[…] of the Census population than some of those included but are removed either because they are less 'distinct' as a group or because they are not typically associated with BAME groups.

Figure 2 reveals that it was especially the Black groups that were most exposed to higher Covid-infected residential neighbourhoods earlier in the pandemic. However, this declines over much of the period of the study, rising again between the second and third lockdowns with the spread of the mutated virus in London and the South East. The implication is that when Covid-19 is spreading across the capital, it is the Black population that is most impacted. However, the group that is most consistently 'over-exposed' is the Pakistani group, with the highest index value for 38 of the 49 weeks (78%). The Indian and Bangladeshi groups often have higher index values too but usually not to the same extent.

Similar trends are found if the index is recalculated by modelling the numbers of Covid-19 deaths instead of infections. The results are in the lower part of Figure 2. The two sets of indices are not fully comparable, partly because the mortality data are monthly instead of weekly but more especially because the mortality index is modelled with the same care homes variable as for infections but also with the percentages of the adult population aged 25 to 29, 30 to 34, 35 to 39, and so forth to those aged 80 years or greater. The reason for including the additional age variables for mortality but not infections is that age does not cause infection but it is the greatest risk factor for death having been infected, especially for older populations. The additional co-variates act to age-standardise the mortality data. Despite the model differences, it is again the Pakistani group that most frequently has the highest index value, for 8 of the 11 months (73%).
Smoothing has been applied and place names have been truncated in some cases.

Nevertheless, even within places where the overall infection rate is high, it is possible that some groups are living in neighbourhoods where the rate is even higher. Or, in places where the overall rate is low, some groups may still face much higher exposure. It is therefore instructive to modify the index so that the broader scale differences between the towns and cities are removed, leaving the exposure to be measured relative to the local context. If it emerges that one or more groups are still more exposed to Covid-19 than others then it cannot be attributed only to the broad, regional geography of the disease. It means that the differences between the groups also emerge within individual towns and cities, not just between them.

The way to make the modification is to recalculate the index of exposure without the place effects from the underlying model. In terms of Equations 2 and 3, above, this means using only the MSOA-level effect, $u_i$, and not $u_i + v_j$, giving an index of local exposure. The results are shown in the upper part of Figure 4, drawn to the same scale as Figure 2 on the y-axis. The differences between the ethnic groups have reduced, unsurprisingly, as we have removed the geographical variation between the towns and cities (but not within them). Whilst the remaining differences are not always large, it is the persistence of the difference between the Pakistani and other groups that remains of primary interest. In 47 of the 49 weeks (96%), the Pakistani group had the greatest local exposure, of which 43 weeks are consecutive.
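Summaries of the kind reported here (the number of weeks in which a group has the highest index value, and how many of those weeks are consecutive) amount to a simple tally over the weekly index values. The sketch below shows one way that tally could be made. The function name `weeks_on_top`, the data layout (one dictionary of group values per week) and the toy values are assumptions made for illustration; they are not the paper's data or code.

```python
def weeks_on_top(weekly_index, group):
    """Count the weeks in which `group` has the highest index value.

    weekly_index -- list with one dict per week, mapping group name to
                    that week's index value (hypothetical layout)
    Returns (number of weeks on top, longest consecutive run on top).
    Ties go to whichever group appears first in the week's dict.
    """
    count = longest = run = 0
    for week in weekly_index:
        top = max(week, key=week.get)
        if top == group:
            count += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0  # the consecutive run is broken
    return count, longest


# Invented values for two groups over four weeks
toy_weeks = [
    {"A": 0.60, "B": 0.50},
    {"A": 0.40, "B": 0.50},
    {"A": 0.70, "B": 0.50},
    {"A": 0.70, "B": 0.60},
]
print(weeks_on_top(toy_weeks, "A"))
```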
Footnote 6: If the figure is redrawn with some smoothing and a confidence interval for each group, then that for the Pakistani group typically overlaps with few or no others each week, suggesting it is improbable that the persistence of the Pakistani group as the most exposed is of no statistical or substantive significance.

We believe that this finding should not be understated because, even under a restricted scenario that focuses only on urban areas and removes the broader-scale patterns in the disease, the Pakistani group is still found to be more exposed to the higher Covid-infected neighbourhoods than are other groups. It is the Chinese and the White British who are often the least exposed groups at the local level.

However, the differences between the ethnic groups all but disappear when the index is recalculated for the age-standardised mortality rates. In fact, the Pakistani group does still have the highest value for 8 of the 11 months but the differences are trivial and illegible on the chart. The implication of this is that urban inequalities in mortality are more strongly expressed at the regional level and between major towns and cities (Griffith et al., 2021), whereas inequalities in infection are evident within towns and cities.

5.
The preceding analysis shows that some BAME groups have faced greater exposure to Covid-19 but not uniformly so and not always at the same time. Amongst them, it is the Pakistani and, to a lesser extent, the Bangladeshi and Indian groups that have had the greatest exposure to Covid-19 infected neighbourhoods within English major towns and cities. Their increased exposure partly relates to the overlapping geographies of the pandemic and of where BAME groups live in the country. However, it is striking that for nearly every week of the pandemic the Pakistani group emerges as the one with the highest amount of local, neighbourhood exposure even after controlling for the variations between the towns and cities and therefore the broader scale geography of the disease.

Furthermore, the finding is not confined to what might be regarded as the most economically disadvantaged neighbourhoods because it is broadly consistent for areas of lowest and of higher average house price. To evidence this, Figure 5 shows the results from recalculating the index of local exposure with each of the ethnic groups split into ten sub-groups according to the decile of the trimmed mean house price of the properties sold in their MSOA between January 1, 2017 and February 27, 2020. The deciles are calculated on a town and city basis, in this case treating London as a whole. This means that those living in the highest decile are in MSOAs with the most expensive property, on average, for their town or city, not necessarily nationwide. This is to prevent the top decile being dominated by London.
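Calculating the deciles separately for each town or city, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `within_place_deciles`, the record layout and the toy prices are hypothetical, ties are broken by sort order, and the trimmed mean house price of each MSOA is taken as already computed.

```python
from collections import defaultdict


def within_place_deciles(records):
    """Assign each MSOA a house-price decile (1 to 10) within its own place.

    records -- iterable of (msoa_id, place, mean_price) tuples, where
               mean_price is assumed to be the MSOA's trimmed mean price
    Returns a dict mapping msoa_id to its decile for that town or city.
    """
    by_place = defaultdict(list)
    for msoa_id, place, price in records:
        by_place[place].append((price, msoa_id))

    deciles = {}
    for items in by_place.values():
        items.sort()  # cheapest first within the place
        n = len(items)
        for rank, (_, msoa_id) in enumerate(items):
            # rank 0..n-1 is mapped onto deciles 1..10
            deciles[msoa_id] = rank * 10 // n + 1
    return deciles


# Invented prices: every MSOA in place "L" is ten times dearer than in "A"
toy = [(f"A{i}", "A", 100_000 + 10_000 * i) for i in range(10)]
toy += [(f"L{i}", "L", 1_000_000 + 100_000 * i) for i in range(10)]
toy_deciles = within_place_deciles(toy)
print(toy_deciles["A0"], toy_deciles["A9"], toy_deciles["L0"], toy_deciles["L9"])
```

Because the ranking is made within each place, the cheapest MSOA in the dearer place still falls in the first decile, which is the property the text relies on to stop one place dominating the top decile.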
Table 1. The estimated residential density (people per dwelling) for the MSOA of the average member of each ethnic group, by the decile of the average house price of the MSOA.

On the face of it, there is not a strong relationship between residential density and how far each neighbourhood deviates, locally, from the expected log infection rate. A simple linear regression suggests a Pearson correlation of r = 0.08, with the MSOA-level effect, $u_{it}$, as the dependent variable, the density values, mean-centred for each individual town and city, as the predictor variable, and with the separate weeks handled as fixed effects. This is a low effect size under Cohen's (1988) criteria. However, if the model is weighted according to the share of the total Pakistani population that lives in each neighbourhood then the correlation rises to 0.26, which is nearing a medium effect. This effect is greater for the Pakistani group than for any others. The corresponding correlation when weighting […]
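The effect of weighting a correlation by a group's population share can be illustrated with a weighted Pearson coefficient. The sketch below is a simplification and an assumption on our part: it computes a weighted correlation directly and omits the fixed effects for weeks and the mean-centring by town or city that the regression described above includes. The function name `weighted_pearson` and the toy values are hypothetical.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with non-negative weights w."""
    sw = sum(w)
    if sw <= 0:
        raise ValueError("weights must sum to a positive value")
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(var_x * var_y)


# Invented values: density (x), local deviation (y), population share (w)
density = [1.0, 2.0, 3.0]
deviation = [1.0, 3.0, 2.0]
equal = weighted_pearson(density, deviation, [1, 1, 1])
shifted = weighted_pearson(density, deviation, [1, 1, 0])
print(equal, shifted)
```

With equal weights the result is the ordinary Pearson coefficient; placing the weight on the neighbourhoods where a group is concentrated changes the coefficient, which is the mechanism behind the rise reported in the text.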
Bringing this together, the Poisson model of infection rates (Equation 4) is refitted to recalculate the local index of infection, this time conditional on additional co-variates for the log of the trimmed mean house price, mean-centred per town or city, residential density, the percentages of the neighbourhood populations employed as key workers, and the percentages of the adult population in various age groups from 18 to 80 and above. As previously, the coefficients for these variables are largely incidental to the analysis, although it may be noted that it is the percentage aged 70 to 79, residential density and average house price that have the greatest effect on the infection rates, also being significant, most often, at a 95 per cent confidence interval. Residential density is positively correlated with infections; house price and the percentage of those in their seventies are negatively correlated.

Of interest is what remains in the index of local exposure values, despite the additional co-variates, calculated from the remaining variations in the log infection rates within major towns and cities. Figure 6 reveals that the difference between the Pakistani and other groups has been reduced. However, the Pakistani group still has the highest value for 35 of the 49 weeks (previously it was 47), with the Indian group being highest for 9, the Bangladeshi group for 7, and the Black Caribbeans for 2.

This paper has developed an index of exposure using an underlying Bayesian Poisson model of the Covid-19 infection rates for neighbourhoods in English towns and cities.
The index measures how exposed various ethnic groups, including BAME groups, are to Covid-infected residential neighbourhoods and can be modified to allow for the broad-scale geography of the disease, focusing, instead, on the localised highs and lows and who is exposed to them.

The results show that members of BAME groups tend to be living in neighbourhoods where the infection rates are higher, reflecting their greater risk of being infected by and dying from the disease. The higher index values are true of Black groups in the initial weeks of the pandemic when the rates of infection are highest in London but become more characteristic of Pakistani and, to a lesser extent, Bangladeshi and Indian groups as the prevalence of the disease shifts to other towns and cities that include Oldham, Bradford, Blackburn, Manchester, Rochdale, Northampton and Leicester.

On February 18, 2021, the Department of Health and Social Care published interim findings from a study, showing that "large household size, living in a deprived neighbourhood, and areas with higher numbers of Asian ethnicity individuals were associated with increased prevalence" (Department of Health and Social Care, 2021). This accords with our own findings, with the caveat that we observe the higher exposure values for the Pakistani group not solely to be associated with the least wealthy neighbourhoods but remaining evident when the data and the index are stratified into deciles based on property prices. This may be because the neighbourhoods concerned, the Middle Level Super Output Areas, are internally heterogeneous regarding their housing stock, especially within major towns and cities.
In any case, we agree with The Doreen Lawrence Review in citing environmental exposure as a concern. Ethnic inequalities in housing mean that members of the Pakistani group are living in higher residential density areas and, with the exception of the Bangladeshi group, are more likely to be living in overcrowded and/or intergenerational households. Shankley and Finney (2020) observe that ethnic inequalities in housing "stem from the particular settlement experiences of postwar migrants to the UK in terms of the locations and housing areas afforded to them […] consolidated by dramatic changes to the UK's housing landscape over recent decades [that have] exacerbated housing disadvantage for minorities" (p.149). They note the practices of discrimination and racism that exist in housing, the reduction in the social housing stock available and problems of affordability and financial risk: the overarching issue of excessive housing (including rental) costs, relative to income, in many English towns and cities that adds to financial pooling and sharing, and to overcrowding.

Housing inequalities are not the only explanation for why particular ethnic groups live, disproportionately, in Covid-infected neighbourhoods. The Doreen Lawrence Review also identifies occupational exposure, which we find evidence for too. Both are linked to other sub-national, socioeconomic inequalities including the imbalanced nature of regional economies and employment structures in the UK (ONS, 2020d).
Writing in The Guardian, Dorling (2020) picks up these themes, arguing that, in poorer, more often northern, parts more people have jobs that cannot be done from home and more use public transport. Frequently, childcare is provided by the extended family who live nearby; wages and benefits are usually too low to allow other childcare options. There is less early retirement and more pensioners need to work too. Further, overcrowding in homes in cities is more common and anyone out of work exacerbates that. Although there is an element of caricature there, the broad impression helps inform Dorling's conclusion that the key to understanding the geography of Covid-19, "is the underlying social and economic geography of England. To understand the changing medical geography of this pandemic, you must first understand how the country lives and works." We agree.

REACT-1 study published (dated February 18, 2021).

https://www.theguardian.com/commentisfree/2020/sep/14/working-from-home-covid-19-londonuk-capital-white-collar-work

brms: An R Package for Bayesian Multilevel Models Using Stan