key: cord-0605468-hwt0kv2d authors: Deason, Alis J.; Erkal, Denis; Belokurov, Vasily; Fattahi, Azadeh; G'omez, Facundo A.; Grand, Robert J. J.; Pakmor, Rudiger; Xue, Xiang-Xiang; Liu, Chao; Yang, Chengqun; Zhang, Lan; Zhao, Gang title: The mass of the Milky Way out to 100 kpc using halo stars date: 2020-10-26 journal: nan DOI: nan sha: 431da0eebc7919c4809dd6fdb80a4b5c6dc1a681 doc_id: 605468 cord_uid: hwt0kv2d We use a distribution function analysis to estimate the mass of the Milky Way out to 100 kpc using a large sample of halo stars. These stars are compiled from the literature, and the majority (~95%) have 6D phase-space information. We pay particular attention to systematic effects, such as the dynamical influence of the Large Magellanic Cloud (LMC), and the effect of unrelaxed substructure. The LMC biases the (pre-LMC infall) halo mass estimates towards higher values, while realistic stellar halos from cosmological simulations tend to underestimate the true halo mass. After applying our method to the Milky Way data we find a mass within 100 kpc of M(<100 kpc) = 6.31 +/- 0.32 (stat.) +/- 1.26 (sys.) x 10^11 M_Sun. For this estimate, we have approximately corrected for the reflex motion induced by the LMC using the Erkal et al. model, which assumes a rigid potential for the LMC and MW. Furthermore, stars that likely belong to the Sagittarius stream are removed, and we include a 5% systematic bias, and a 20% systematic uncertainty based on our tests with cosmological simulations. Assuming the mass-concentration relation for Navarro-Frenk-White haloes, our mass estimate favours a total (pre-LMC infall) Milky Way mass of M_200c = 1.05 +/- 0.25 x 10^12 M_Sun, or (post-LMC infall) mass of M_200c = 1.20 +/- 0.25 x 10^12 M_Sun when a 1.5 x 10^11 M_Sun mass of a rigid LMC is included. The total mass of the Milky Way has been an historically difficult parameter to pin down (see recent reviews by Bland-Hawthorn & Gerhard 2016; Wang et al. 2020) . Despite decades of measurements, there remains an undercurrent of elusiveness surrounding "the mass of the Milky Way". However, the continued eagerness to provide an accurate measure is perhaps unsurprising -the mass of a halo is arguably its most important characteristic. For example, almost every property of a galaxy is dependent on its halo mass, and thus this key property is essential to place our "benchmark" Milky Way galaxy in context within the general galaxy population. In addition, the host halo mass is inherently linked to its subhalo population, so most of the apparent small scale discrepancies with the ΛCDM model (e.g. Moore et al. 1999; Boylan-Kolchin et al. 2011 ) are strongly dependent on the Milky Way mass (see e.g. Wang et al. 2012) . Moreover, tests of alternative dark matter candidates critically depend on the ★ alis.j.deason@durham.ac.uk total mass of the Milky Way, particularly for astrophysical tests (e.g. Kennedy et al. 2014; Lovell et al. 2014) . The uncertainty has stemmed from two major shortcomings: (1) a lack of luminous tracers with full 6D phase-space information out to the viral radius of the Galaxy, and (2) an underestimated, or unquantified, systematic uncertainty in the mass estimate. However, there has been significant progress since the first astrometric data release from the Gaia satellite (Gaia Collaboration et al. 2018 ). This game-changing mission for Milky Way science provided the much needed tangential velocity components for significant numbers of halo stars, globular clusters and satellite galaxies. Indeed, there are encouraging signs that we are converging to a total mass of just over 1×10 12 M (e.g. Callingham et al. 2019; Deason et al. 2019a; Eadie & Jurić 2019; Grand et al. 2019; Vasiliev 2019; Watkins et al. 2019; Cautun et al. 2020; Li et al. 2020) . However, mass estimates at very large distances (i.e. beyond 50 kpc), are dominated by measures using the kinematics of satellite galaxies, which probe out to the virial radius of the Galaxy (e.g. Patel et al. 2018; Callingham et al. 2019; Li et al. 2020) . It is well-known that the dwarf satellites of the Milky Way have a peculiar planar alignment (see e.g. Metz et al. 2007) , and, without independent measures at these large distances, there remains uncertainty over whether or not the satellites are biased kinematic tracers of the halo. Arguably the most promising tracers at large radii are the halo stars. They are significantly more numerous than the satellite galaxies and globular clusters, and are predicted to reach out to the virial radius of the Galaxy (Deason et al. 2020 ). There currently exist thousands of halo stars with 6D phase-space measurements, thanks to the exquisite Gaia astrometry and wide-field spectroscopic surveys such as the Sloan Digital Sky Survey (SDSS) and the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) survey. Moreover, with future Gaia data releases and the next generation of wide-field spectroscopic surveys from facilities such as the Dark Energy Spectroscopic Instrument (DESI, DESI Collaboration et al. 2016) , the WHT Enhanced Area Velocity Explorer (WEAVE, Dalton et al. 2014) , and the 4-metre Multi-Object Spectroscopic Telescope (4MOST, de Jong et al. 2019 ), there will be hundreds of thousands of halo stars with 6D measurements. The magnitude limit of Gaia and the complementary spectroscopic surveys will likely limit the samples of halo stars to within ∼ 100 kpc, but this is still an appreciable fraction of the virial radius (∼ 0.5 200c ), and will probe relatively unchartered territory beyond 50 kpc. As we enter a regime of more precise mass measures, and significantly reduced statistical uncertainties, it is vital to be mindful of any systematic influences in our mass estimates. Although many mass-modelling techniques assume dynamical equilibrium, it is well-documented that "realistic" stellar haloes can be a mash-up of several coherent streams and substructures (e.g. Bullock & Johnston 2005; Cooper et al. 2010; Monachesi et al. 2019 ). Thus, comparisons with cosmologically motivated models of stellar haloes are crucial (e.g. Yencho et al. 2006; Wang et al. 2015) . However, while cosmological simulations can provide much needed context, the unique assembly history of the Milky Way is most relevant for Galactic mass measurements. For example, the influence of the Sagittarius (Sgr) stream, which contributes a significant fraction to the total stellar halo mass (∼ 10 − 15%, e.g. Deason et al. 2019b ), needs to be considered. Furthermore, and perhaps more importantly, it has recently been recognised that the recent infall of the massive Large Magellanic Cloud (LMC) can imprint significant velocity gradients in the Milky Way halo (e.g. Gómez et al. 2015; Erkal et al. 2019; Garavito-Camargo et al. 2019; Cunningham et al. 2020; Petersen & Peñarrubia 2020) . Indeed, Erkal et al. (2020) showed that these velocity gradients can bias equilibrium based mass modelling, and is thus an effect we can no longer ignore. In this work, we compile a sample of distant ( > 50 kpc) halo stars from the literature with 6D phase-space measurements, and use a distribution function analysis to measure the total mass within 100 kpc. We pay particular attention to systematic influences, such as the Sgr stream and the LMC, and, where possible, correct for these perturbative effects. In Section 2 we describe our sample of halo stars with Galactocentric distances between 50 and 100 kpc. We describe our distribution function analysis in Section 3, and discuss systematic influences, such as unrelaxed substructure and the LMC, in Section 4. Our main results are given in Section 5, and we discuss and conclude in Section 6. We have compiled a sample of = 830 halo stars with Galactocentric distances between 50 and 100 kpc (see also Erkal et al. in prep) . These stars have measured radial velocities and distances, and the majority (∼ 95%) have proper motion measurements from Gaia data release 2 (GDR2, Gaia Collaboration et al. 2018 ). Many of these stars derive from large spectroscopic surveys, such as the Sloan Digital Sky Survey (SDSS) and the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) survey. In particular, we use the SDSS blue horizontal branch (BHB) and K giant samples (Xue et al. 2011 (Xue et al. , 2014 and the LAMOST K Giant sample ). These are complemented by the following samples in the literature: RR Lyrae (RRL) stars selected from the Palomar Transient Factory with Keck-DEIMOS spectroscopy (Cohen et al. 2017) , and BHB and blue straggler stars selected from SDSS or Hyper Suprime-Cam with VLT-FORS2 spectroscopy (Deason et al. 2012b; Belokurov et al. 2019) . After cross matching our sample with GDR2, we find that a number of the sample have, given the distance range we are probing, higher tangential velocities than expected. These are mainly misclassified dwarfs in the K giant samples. Thus, we impose cuts on the total velocity and parallax such that the total velocity is less than 500 km s −1 within 1− uncertainty, and / < 3 (see also Erkal et al. in prep) . Our final sample of = 665 bonafide halo stars in the distance range 50-100 kpc comprises of = 437 K giant stars, = 121 BHB stars, = 86 RRL stars and = 21 BS stars. As our sample consists of a range of stellar populations, and is compiled from a variety of sources, the typical uncertainties in radial velocity, distance and proper motion can vary considerably. Typically, the K giants have 5 km s −1 radial velocity errors, 10% distance errors, and ∼ 0.2 mas yr −1 proper motion errors. On the other hand, the BHB and RRL stars typically have 10 − 20 km s −1 radial velocity errors, 5% distance errors, and ∼ 0.7 − 0.8 mas yr −1 proper motion errors. In Section 3 we describe how these uncertainties are taken into account in our analysis. In this work, heliocentric velocities are converted into Galactocentric ones assuming a distance to the Galactic of 0 = 8.122 kpc (Gravity Collaboration et al. 2018) , and a circular speed at the position of the Sun of 235 km s −1 . Finally, we use the solar peculiar motions derived by Schönrich et al. (2010) : ( , , ) = (11.1, 12.24, 7.25) km s −1 . The presence of un-relaxed substructure is in an important consideration in any equilibrium-based mass measurement (see Section 4.2). Thus, we identify stars in our sample that likely belong to the Sgr stream, and excise these from our sample. Stars that belong to the Sgr stream are identified from their position on the sky, distance and radial velocity. The Sgr coordinate system defined by Belokurov et al. (2014) is used to isolate stars close to the Sgr plane (| | < 20 • ). The predicted distances and radial velocities of Sgr stars along the stream are taken from the results of Hernitschek et al. (2017) , Belokurov et al. (2014) and Vasiliev et al. (2020) . We consider stars that lie within 3 − of these tracks (and with | | < 20 • ) to be Sgr members. This selection is shown explicitly in Figure 2 of Erkal et al. (in prep) . Our procedure identifies = 180 Sgr stars (out of = 665) in the sample between 50 < /kpc < 100, leaving a total of = 485 non-Sgr halo stars. We model the halo stars using spherical, power-law distribution functions (see Evans et al. 1997 ). These simple, and flexible distri-bution functions have been used in several previous works 1 to model halo populations (e.g. Deason et al. 2011 Deason et al. , 2012a Eadie & Harris 2016; Eadie et al. 2017 ). Our motivation for using this approach is two-fold: (i) the spherical, power-law approximation for underlying gravitational potential is a reasonable approximation for NFW-like haloes in this distance range (see e.g. Watkins et al. 2010) , and (ii) over a limited radial range we can approximate the stellar halo density profile as a single power-law. We use a power-law profile for the potential, Φ = Φ 0 ( /50kpc) − , where is constant. To test the above assertion (i) we fit a power-law profile to potentials described by NFW dark matter haloes and a baryonic component appropriate for the Milky Way (Bovy & Rix 2013) . For haloes that follow the well-known mass-concentration (with 0.11 dex scatter, see Dutton & Macciò 2014) relation we find that the overall potential between 50-100 kpc is well described as a power-law with exponent = 0.5 ± 0.06. Indeed, we use this expected distribution as a prior in our analysis (see also Eadie & Jurić 2019) . Finally, we remark that in the distance range we are probing the adiabatic contraction of the dark matter halo is relatively minor (see Cautun et al. 2020 ), so we do not consider this effect in this work. The stellar halo density is also defined as a spherical powerlaw, with ∝ − . Between 50-100 kpc, most recent studies favour a power-law slope of = 4 (e.g. Xue et al. 2015; Cohen et al. 2017; Deason et al. 2018a; Fukushima et al. 2019) , and this is the fiducial value we adopt in this work. However, we discuss how changes in this assumption effect our results in Section 5. The velocity distribution of our model is described in terms of the binding energy = Φ( ) − 1 2 ( 2 + 2 + 2 los ) and the total angular momentum = √︃ 2 + 2 + 2 as where, Here, is the velocity anisotropy of the stellar velocity distribution, which we assume is a constant. We aim to constrain the overall potential and stellar halo velocity anisotropy in the radial range 50 < /kpc < 100. Thus, Φ 0 , and are the parameters we wish to measure from our model. The most dominant source of uncertainty in the kinematics of the halo stars is the proper motions (and hence the and velocity components), and in some cases there are no proper motion measurements available. Thus, we reduce the 3D velocity distribution of our model to the line-of-sight velocity distribution (LOSVD) using the following equation: where ( ) is the error function, which we assume is a Gaussian with ( ) = 4.74047 ( ). Note, for simplicity, we assume that this Gaussian in , space has no covariance. For cases where there are no measured proper motions ( ) = ( ) = 1. A maximum-likelihood method is used to derive the unknown parameters, Φ 0 , and : Here, los is the LOSVD (see Eqn 3) and is the total number of stars in our sample. We use a brute-force grid-based approach, with ∈ [0, 1], ∈ [0.1, 1] and Φ 0 ∈ [1, 20] × 10 4 km 2 s −2 . The circular velocity and total mass can be inferred from and Φ 0 : Note that the above procedure assumes that the distances and line-of-sight velocities are known, with no appreciable uncertainty. In practice, we compute the likelihood values = 100 times, and each time the distances and line of sight velocities are scattered according to their (Gaussian) error distributions. These likelihoods are then added together to produce the overall likelihood distribution. Finally, we derive posterior distributions for each parameter by adopting a Gaussian prior on with mean 0.5 and dispersion 0.06, which is appropriate for NFW haloes (see also Eadie & Jurić 2019) . Our model described in Section 3 assumes spherical symmetry, power-law distributions for both the underlying potential and tracer density profile, and a tracer population that is in equilibrium in the gravitational potential. These assumptions, particularly spherical symmetry and equilibrium, are commonplace in many massmodelling techniques. Here, we discuss two systematic effects that can break these assumptions, and hence systematically affect our inferred mass profile. In recent years, we have realised that the most massive Milky Way satellite, the LMC, has a considerable influence on our Galaxy. This massive satellite (∼ 10 11 M ) has recently fallen into the Milky Way, and is on a highly eccentric orbit (Besla et al. 2007; Kallivayalil et al. 2013 ). The consequences are many-fold, including disturbing the Galactic disc (Laporte et al. 2018) , shifting the barycentre of the Milky Way (Gómez et al. 2015) , and inducing a large scale gravitational wake as the satellite sinks to the centre of the Galaxy due to dynamical friction (Garavito-Camargo et al. 2019). These last two considerations are particularly important for halo stars, which, as a result, are imprinted with noticeable velocity gradients . Recently, using the same sample of halo stars as this work, we have detected these predicted velocity gradients in the Galaxy (Erkal et al. in prep) . Erkal et al. (2020) show that the influence of the LMC biases mass estimates of the Milky Way, which assume dynamical equilibrium. In particular, this effect becomes more important at larger distances. We apply our method to the model described in Erkal et al. (2020) assuming a (rigid) 1.5 × 10 11 M LMC, which we show in Erkal et al. in prep is the most likely LMC mass. This mass is also in close agreement with the LMC mass inferred from perturbations to the Orphan and Sagittarius streams Vasiliev et al. 2020) . The model derives from a suite of simulations of the Milky Way stellar halo in the presence of the LMC, which have been used in previous works (e.g. Belokurov et al. 2019; Erkal et al. 2020) . Note, however, that the deformation of the Milky Way Power-law distribution functions are used to estimate the mass profiles, as described in Section 3. The solid black line shows the resulting profile when no correction to the stellar velocities is applied, and the solid blue line is the result when a correction is applied. The gray shaded and blue line-filled regions indicate the 1 − confidence regions. The "true" mass profiles are shown with the dashed red line. Note that this "truth" is the MW mass profile before the LMC has been accreted. When a correction is applied our method is able to reproduce the true MW profile within the 1 − uncertainty; the mass is typically overestimated when no correction is applied. and LMC potential in response to each other, and any resonances in the Milky Way's dark matter halo, are not included in the models. That said, the simulations do account for the deformation of the stellar halo, and the predictions for stellar halo kinematics using the models in Erkal et al. (2020) appear to reproduce the salient features in the predictions of the more realistic, fully deforming models in Garavito-Camargo et al. (2019) . We select = 485 halo stars from the Erkal et al. model (the same number as our observed sample when Sgr is excluded) between 50-100 kpc, which are chosen to follow the Galactic ( , ) distribution of the observed sample. The slope of the model stellar halo in this radial range is = 3.4, and we assume that this slope is known in our analysis. We apply our likelihood method directly to Eqn 1 rather than the LOSVD, and assume all velocity components are known. As we are interested in systematic effects, we do not consider the statistical uncertainties from measurement errors. The black lines in the bottom panels of Fig. 1 show the resulting circular velocity and mass profiles from this exercise. The "true" profiles for the model are shown with the dashed red lines. The mass is typically overestimated by ∼ 6 − 7%, and the true profile lies outside of the 1 − confidence region (shown with the gray shaded region). Note that Erkal et al. (2020) find a larger bias in this radial range (∼ 15%). This difference is because the ( , ) distribution of our observed sample is not random on the sky; in particular, a significant fraction of our observed sample (∼ 40%) lie in the octant on the sky which, according to the Erkal et al. model, is least affected by the presence of the LMC (the octant defined by ( , , ) = (−, +, −), see Fig. 5 in Erkal et al. 2020) . Although the bias in this case is fairly small, we investigate applying a small correction to minimize this bias. The top panels of Fig. 1 show the average los and velocity components in the Erkal et al. (2020) model as a function of height above/below the Galactic plane ( ). Erkal et al. (2020) show that the bulk motion of the distant stellar halo is mostly upwards, in the component. Hence, is largely unaffected and the los and components are shifted. The fiducial model in Erkal et al. (2020) has an 0.8×10 12 M dark matter halo. We also show the predicted offset for a 1.5× more massive halo with the red dashed lines. The los offset is unchanged, and the offset is slightly reduced. We apply a dependent correction to the los and velocity components using the trends shown in the top panel of Fig. 1 , i.e. los = los − off los and b = b − off b . Note that we find our results are largely unchanged if we apply an offset appropriate for the more massive halo. The resulting circular velocity mass profiles when this correction is applied are shown with the blue lines in the bottom panel of the figure. Now, the mass is only overestimated by ∼ 3% and the true profile lies within the 1 − confidence region. In Section 5 we apply the same correction to the observational data to approximately account for the reflex motion induced by the LMC. Here, especially when observational errors are taken into account, the correction is relatively small, however, such a procedure will become increasingly more important as we gain larger velocity samples of halo stars over wider regions of the sky. We stress that this correction is only appropriate for the Erkal et al. model of the MW-LMC system, which assumes rigid dark matter potentials. The model dependence of this effect deserves further scrutiny, and, for example, the influence of the LMC on a live MW-LMC system (e.g Garavito-Camargo et al. 2019) needs to be explored in future work. Finally, we note that this procedure assumes that the "true" Milky Way mass profile does not include the mass of the LMC itself. At 1.5 × 10 11 M it is clear that this is a significant contribution to the total Milky Way mass. However, the rapid orbit, and likely nonuniform distribution of LMC mass throughout the Galaxy, means that any equilibrium modelling can only hope to correct for the perturbation of the LMC, and recover the mass profile before the LMC was accreted. This means that when comparing to the total Milky Way mass in cosmological simulations, which includes all the mass within a spherical aperture like the virial radius, the mass of the LMC must be added to the derived equilibrium mass. One only needs to visualise the iconic "field of streams" image of the Galactic stellar halo (Belokurov et al. 2006 ) to see that the stellar halo does not consist of a smooth distribution of stars. Indeed, both observations and simulations have shown that the majority of the (distant) stellar halo is a superposition of the stripped material from destroyed dwarf galaxies; this debris can be lumpy, non-spherical and can have significant velocity gradients. Previous work has investigated how equilibrium modeling holds up under more realistic stellar haloes from simulations (Yencho et al. 2006; Eadie et al. 2018; Wang et al. 2015; Han et al. 2016; Sanderson et al. 2017 ; Right panel: est / true against the average time the progenitor dwarf galaxies that contributed to the halo within 50-100 kpc merged with the host halo. The dispersion of estimated masses is lower at earlier merger times (higher merge ), as these deposited stars are more phase-mixed. The black boxes indicate haloes with shell-type structures in the radial range 50-100 kpc. In these cases, the mass tends to be underestimated. Wang et al. 2018; Grand et al. 2019) . Some cases can be wildly inaccurate, but the general consensus is that there is perhaps a systematic floor of at least ∼ 20% accuracy when using halo stars to estimate the total mass. In the past, such a floor has been deemed negligible compared to observational uncertainties. However, we are now in a regime when the data is no longer dominated by statistical uncertainties and/or missing data, and thus this systematic needs to be considered. Here, we apply our modeling procedure to the halo stars in the Auriga simulations ). These are a suite of high resolution, cosmological hydrodynamic simulations of Milky Way-mass galaxies. We only consider the = 28 haloes between 200 = 1 − 2 × 10 12 M that are not currently undergoing a major merger. As in Fattahi et al. (2019) , we only consider accreted halo stars, which are defined as those stars that formed in a subhalo other than the main progenitor galaxy. This sample of accreted stars has been used in previous work studying the stellar haloes of the Auriga galaxies (e.g. Monachesi et al. 2016; Deason et al. 2019a; Monachesi et al. 2019; Fattahi et al. 2020) . We select = 485 halo stars in the radial range 50-100 kpc, and choose these stars to have the same Galactic ( , ) distribution as the observational data. For each halo, we create four different samples with different solar positions, ( , , ) = (± 0 , 0, 0), (0, ± 0 , 0). The variation between different mass estimates when the solar position is varied is shown in Fig. 2 . As in the previous subsection, we only consider systematic effects and do not include observational uncertainties in the analysis. Thus, we assume all of the halo stars have full phasespace information, with no appreciable errors, and we assume that the slope of the tracer density profile ( ) in this radial range is known. In Fig. 2 we show the resulting mass estimates (within 100 kpc) when our method is applied to the Auriga simulations. The left panel shows the distribution of estimated to true mass, and the middle panel shows the estimated to true mass against the true mass. Each halo is shown with different coloured symbols, so the four samples drawn from the same halo (with just the solar position varying) can be identified. The halo-to-halo scatter is significantly higher than the variation from different solar position. The masses are typically underestimated by ∼ 10%, with a scatter of ∼ 25%. This bias towards underestimating the true mass has been seen in previous work (e.g. Yencho et al. 2006; Wang et al. 2018; Grand et al. 2019) , and the scatter in mass measurements is also in good agreement with previous results. The halo-to-halo scatter is much larger than the uncertainties from the likelihood procedure, and thus represents a true systematic effect. To explore this further we calculate the typical time the progenitor dwarf galaxies that contribute to the stars between 50-100 kpc merge with the host galaxy. This is calculated by taking the median accretion time of all the star particles between 50-100 kpc for each Auriga halo. Here, merge = 0 is present day, so more recent accretion events have lower merge values. The halo-to-halo scatter changes significantly with merge , and the bias is also less pronounced for early time accretion. For example, for debris deposited (on average) over 5 Gyr ago ( merge > 5 Gyr), the median of est / true is 0.95 with a dispersion of 0.20 (see dashed blue line in left hand panel). In contrast, for haloes dominated by material accreted very recently (in the past 2 Gyr, merge < 2 Gyr), the median is 0.82 with dispersion 0.33 (see dot-dash red line in left hand panel). The former case, especially when Sgr stars are excluded from the sample, is likely the more appropriate for the Milky Way halo (e.g. Ruchti et al. 2015; Deason et al. 2017; Lancaster et al. 2019; Fragkoudi et al. 2020) . This exercise shows that the accretion history of the Galaxy is an important consideration for mass-modelling. In particular, if, as is currently believed, the Milky Way has a fairly quiescent recent accretion history, then the mass can be measured to 20% accuracy with some confidence. However, there still exists a slight (5%) bias in the mass measurement, which tends to underestimate the halo mass. After visual inspection of the phase-space diagrams of the Auriga haloes (see Fig. 3 ), we suggest that this bias is due to the presence of shell-like structures in this radial regime (e.g. Quinn 1984) . These shells can form when a massive dwarf galaxy collides with the host, and the stellar debris follows very radial orbits. The enhancement of stars at apocentre thus builds up shell structures. These structures can persist for very long times in the halo (e.g. Johnston et al. 2008) , and are more common at large distances. We are typically probing these structures at apocentre, and their pericentres are found at much lower distances. This leads to an underestimate in the halo mass as we are not fully sampling the phase-space distribution. We visually identify that approximately half of the Auriga haloes have shells in the radial regime 50-100 kpc. In the righthand panel of Fig. 2 these cases are indicated with the black boxes. For haloes with merge > 2 Gyr, the median est / true is 0.8 for haloes with shells, and 1.0 without shells. Four examples of phase-space diagrams are shown in Fig. 3 . Here, we show two cases with prominent shells (top panels), and two cases without (middle panels). In the bottom two panels of Fig. 3 we show the observational Milky Way data in the los vs. plane. In the bottom left panel, each star is shown as a filled circle with its associated velocity error. In the right panel, we show a 2D histogram with pixels of size 1kpc × 10kms −1 . Here, we incorporate the observational errors in distance and velocity into the weightings of each pixel to take into account these uncertainties. There is some evidence that there are shell-like features in the data at ∼ 65 − 70 kpc and ∼ 80 − 90 kpc. However, this is a tentative result because the observational sample does not randomly sample the halo density profile; in fact, the sample is likely biased towards smaller distances, so we must be careful of the interpretation of these features. Nonetheless, there does appear to be a cold features, which warrant further scrutiny with future, more expansive, datasets. Note that the most likely origin of these shells is the highly radial Gaia-Enceladus-Sausage merger (e.g. Belokurov et al. 2018; Deason et al. 2018b; Helmi et al. 2018) . Indeed, Fattahi et al. (2019) showed that the Auriga-5 and Auriga-22 haloes, shown in the top two panels of Fig. 3 , have experienced analogous Gaia-Enceladus-Sausage merger events. Interestingly, the bias apparent in the cosmological simulations acts in the opposite sense to the impact of the LMC. Thus, it is crucial that models of the impact of the LMC on a realistic, lumpy halo are explored. For now, in addition to correcting the bias caused by the LMC and excising stars belonging to the Sgr stream, we include a systematic bias of 5% in our analysis and a systematic uncertainty of 20% to account for the predictions from the simulations. Here, we assume that the influence of shells could be affecting our massestimate, and we adopt the typical bias and uncertainty from the Auriga haloes with relatively quiescent accretion histories. Note, here, we are assuming that the Auriga haloes are a representative sample of MW-mass haloes, and capture the main affects of substructure that are relevant for the MW. In future work, it would be beneficial to compare with much larger samples of high resolution haloes, and test independent simulation suites. We now apply the procedure outlined in Section 3 to the Milky Way data. Our sample of comprises of = 665 halo stars ( = 485 when Sgr stars are excluded), in the radial range 50-100 kpc. All of the stars have measured radial velocities and distances, and the majority (95%) have proper motion measurements from GDR2. As discussed in Section 3 we apply a maximum likelihood analysis to the LOSVD (Eqn. 3), which is marginalised over the tangential velocity error distributions. The process is repeated = 100 times and we scatter the distance and radial velocity measures by their associated uncertainties. The resulting likelihood contours and posteriors for the potential parameters (Φ 0 , ) and velocity anisotropy ( ) are shown in Fig. 4 . Here, we exclude Sgr stars, and apply an offset to the los and velocity components to account for the effect of the LMC (see Section 4.1). We also show the 1D posterior distributions for the cases when: (i) Sgr stars are excluded, but no velocity offset is applied (dashed black), (ii) Sgr stars are included, and a velocity offset is applied (solid purple), and (iii) Sgr stars are included, and no velocity offset is applied (dashed purple). Our results are summarised in Table 1 . We find a radially biased velocity anisotropy, with = 0.42 ± 0.06, in good agreement with Bird et al. (2019 Bird et al. ( , 2020 , and also in agreement with studies finding that the highly radial Gaia-Enceladus-Sausage debris dominates the central regions of the Milky Way (e.g Deason et al. 2018b; Lancaster et al. 2019) . By combining the (highly degenerate) Φ 0 and parameters, we find the mass within 100 kpc: (< 100kpc) = 6.01 ± 0.32 × 10 11 M . We note that our prior on does not significantly affect the total mass measurement within 100 kpc, but is an important influence on the shape of the mass profile that we derive. To further illustrate the influence of excluding Sgr and/or applying a velocity offset we show the resulting circular velocity and mass profiles in Fig. 5 . The solid black lines and gray shaded regions show the fiducial results (with Sgr exc. and a velocity offset applied) and 1 − uncertainty. The dashed lines show the results when an offset is not applied, and the purple lines are when Sgr stars are not excised. Here, we can see that inclusion of Sgr stars biases the mass estimates low, whereas the velocity gradients induced by the LMC bias the mass estimates high. We note, however, that both systematics are relatively small, and within the 1 − uncertainties. In Section 3 we discussed how the tracer density slope, , is kept fixed in our analysis. Previous work has shown that the tracer density slope is an important parameter in dynamical mass estimates (e.g. Dehnen et al. 2006; Deason et al. 2012a ). Our assumption of = 4.0 between 50 − 100 kpc is motivated by recent measurements in the literature (e.g. Xue et al. 2015; Cohen et al. 2017; Deason et al. 2018a; Fukushima et al. 2019) . However, the stellar halo power-law slope in this radial range is still debated, and the uncertainty is not well quantified. To this end, we provide an approximate fitting formula to adjust our mass measurement based on the input parameter: where this equation is valid for 3.0 < < 5.0. We emphasize that our fiducial mass estimates in this work use an = 4.0 slope. However, the approximate relation given above can, for example, be used to compare with other mass measures that assume a different tracer density profile. In future work, with more expansive halo samples beyond 50 kpc, this parameter will likely be pinned down with much greater confidence. Finally, in Figure 6 we show our derived circular velocity profile between 50-100 kpc in context with other measurements and models in the literature. Here, we have included an additional 20% systematic uncertainty and a 5% bias correction, following our tests on the Auriga cosmological simulations: (< 100kpc) = 6.31 ± 0.32(stat.) ± 1.26(sys.) × 10 11 M . Assuming the massconcentration relation for NFW haloes with 0.11 dex scatter (Dutton & Macciò 2014) , we find that our mass estimate within 100 kpc favours a total Milky Way mass of 200c = 1.05 ± 0.25 × 10 12 M . Note here we assume the baryonic mass given by Bovy & Rix (2013) . As mentioned earlier, the total mass calculated from equilibrium based modeling refers to the Milky Way mass before the LMC was accreted. If we include the LMC mass, the total mass (today) is 200c = 1.20 ± 0.25 × 10 12 M . Our circular velocity profile shown in Figure 6 is in excellent agreement with the Cautun et al. (2020) profile. This model uses Milky Way rotation curve data derived from Gaia DR2 to fit an adiabatically contracted dark matter halo. Interestingly, Cautun et al. (2020) find a relatively "average" halo concentration for the Milky Way, of 200c = 9.4 +1.9 −2.6 and a total (pre-LMC infall) mass of 200c = 1.08 +0.20 −0.14 × 10 12 M , in excellent agreement with our total mass estimate. Moreover, our total Milky Way mass also agrees with recent measurement using the more distant satellite galaxies as tracers, (e.g Callingham et al. 2019; Li et al. 2020) . It is reassuring that our results at intermediate radii are in such good agreement with independent measures from both the inner and outer halo. In this work we have applied a distribution function method to a large sample of distant ( > 50 kpc) halo stars to estimate the mass of the Milky Way out to 100 kpc. We pay particular attention to the systematic effect of unrelaxed substructure, and the dynamical influence of the LMC. Our main conclusions are summarised as follows: • We use a rigid Milky Way-LMC model to constrain the systematic reflex motion effect of the massive LMC on our halo mass estimate. As shown by Erkal et al. (2020) , the (pre-infall LMC) halo masses are over-predicted due to the velocity gradients caused by the recently infalling LMC. However, we find that a simple velocity offset correction in los and can minimize the overestimate caused by the reflex motion induced by the LMC, and, assuming a rigid LMC mass of 1.5 × 10 11 M , we can recover the true mass within 1 − . • By applying our method to a sample of Milky Way-mass haloes from the Auriga simulation we find that the halo masses are typically underestimated by 10%. However, this bias is reduced to ∼ 5% if we only consider haloes with relatively quiescent recent accretion histories. The residual bias is due to the presence of long-lived shell-like structures in the outer halo. The halo-to-halo scatter is ∼ 20% for the quiescent haloes, and represents the dominant source of error in the mass estimate of the Milky Way. • We apply our distribution function method to = 485 halo stars when high probability Sgr stars are excluded. The overall sample has a radial velocity anisotropy, = 0.4, in good agreement with recent measures in this radial range (Bird et al. 2019 (Bird et al. , 2020 . Our estimated mass within 100 kpc is (< 100kpc) = 6.31 ± 0.32(stat.) ± 1.26(sys.) × 10 11 M . A systematic bias correction (+5%), and additional uncertainty (20%), are included based on our results from the Auriga simulations. The mass estimates are slightly higher when we do not include a velocity offset to correct for the reflex motion induced by the LMC, or slightly lower when Sgr stars are included in our analysis. • Our mass estimate within 100 kpc is in good agreement with recent, independent measures in the same radial range (e.g. Eadie & Jurić 2019; Erkal et al. 2019; Vasiliev et al. 2020 ). If we assume the predicted mass-concentration relation for NFW haloes, our measurement favours a total (pre-LMC infall) Milky Way mass of 200c = 1.05 ± 0.25 × 10 12 M , or (post-LMC infall) mass 200c = 1.20 ± 0.25 × 10 12 M when a rigid 1.5 × 10 11 M LMC is included. The sample of halo stars we have used in this work is just the tip of the iceberg in terms of the datasets coming our way in the next few years. We are already in a regime where the systematic uncertainties dominate over the statistical uncertainties: this is uncharted territory for mass measurements of the Milky Way, which have, historically, been hampered by incomplete phase-space information. We show in this work that the next phase in mass modelling is to take into account the unique assembly history of the Milky Way in any analysis. In particular, understanding the influence of the LMC, and the dynamics of both relaxed and unrelaxed substructure are crucial. With larger halo samples and more accurate phase-space measurements, these effects will become more and more important. However, there is hope that with a more detailed mapping of the Milky Way's recent accretion history, and more realistic stellar haloes models that include the influence of the live MW-LMC system, we can provide both accurate and precise Milky Way mass measures. Durham University and STFC operations grant ST/R000832/1. DiRAC is part of the National e-Infrastructure. The data presented in the figures are available upon request from the corresponding author. Proc. SPIE. p AD thanks Gurtina Besla for valuable comments on the manuscript, and Marius Cautun for providing circular velocity data for Fig. 6 . Our gratitude is extended to all of the essential workers that support our livelihood, especially during the Coronavirus pandemic. AD thanks the staff at the Durham University Day Nursery who play a key role in enabling research like this to happen.AD