title: Quantifying the Immediate Effects of the COVID-19 Pandemic on Scientists
authors: Kyle R. Myers, Wei Yang Tham, Yian Yin, Nina Cohodes, Jerry G. Thursby, Marie C. Thursby, Peter E. Schiffer, Joseph T. Walsh, Karim R. Lakhani, Dashun Wang
date: 2020-05-22

Abstract: The COVID-19 pandemic has undoubtedly disrupted the scientific enterprise, but we lack empirical evidence on the nature and magnitude of these disruptions. Here we report the results of a survey of approximately 4,500 Principal Investigators (PIs) at U.S.- and Europe-based research institutions. Distributed in mid-April 2020, the survey solicited information about how scientists' work changed from the onset of the pandemic, how their research output might be affected in the near future, and a wide range of individual characteristics. Scientists report a sharp decline in time spent on research on average, but there is substantial heterogeneity, with a significant share reporting no change or even increases. Some of this heterogeneity is due to field-specific differences, with laboratory-based fields being the most negatively affected, and some is due to gender, with female scientists reporting larger declines. However, among the individual characteristics examined, the largest disruptions are connected to a usually unobserved dimension: childcare. Reporting a young dependent is associated with declines similar in magnitude to those reported by the laboratory-based fields and can account for a significant fraction of the gender differences. Amidst scarce evidence about the role of parenting in scientists' work, these results highlight the fundamental and heterogeneous ways this pandemic is affecting the scientific workforce, and may have broad relevance for shaping responses to the pandemic's effects on science and beyond.
By mid-April 2020, the cumulative number of deaths due to COVID-19 had reached approximately 115,000, with nearly 1,800 deaths per day in the U.S. and 3,000 deaths per day in Europe 1. Throughout the U.S. and Europe, schools and workplaces were typically required to close, and restrictions on gatherings of more than 10 people were in place in most countries 2. For scientists, not only did this drastically change their daily lives, it severely limited the possibility of using traditional workspaces, as most institutions had suspended "non-essential" activities on campus 3-10. To collect timely data on how the pandemic affected scientists' work, we disseminated a survey to U.S.- and Europe-based scientists across a wide range of institutions, career stages, and demographic backgrounds. We identified the corresponding authors of all journal articles indexed by the Web of Science in the past decade, and then randomly sampled 400,000 U.S.- and Europe-based email addresses (see SI S1 for more). We distributed the survey on Monday, April 13th, 2020, about one month after the World Health Organization declared COVID-19 a pandemic. Within one week, the survey received full responses from 4,535 individuals who self-identified as faculty or PIs at academic or non-profit research institutions. Respondents were located in all 50 states in the U.S. (63.7% of the sample, Figure S1A) and 35 countries in Europe (36.3% of the sample, Figure S1B), and were affiliated with the full spectrum of research fields listed in the survey. For more on the response rate, sampling method, and a comparison to a national survey of doctorate-level researchers, see SI S3. Motivated by prior research on scientific productivity 11-15, the survey solicited information about scientists' working hours, how this time is allocated across different tasks, and how these time allocations have changed since the onset of the pandemic.
We asked scientists to estimate changes to their research output (the quantity and impact of their publications) in coming years relative to prior years. We also solicited a wide range of characteristics, including field of study, career stage (e.g., tenure status), demographics (e.g., age, gender, number and age of dependents in the household), and other features (e.g., institution closure and whether the respondent was exempt from any closures). Details on the survey instrument are included in SI S2, and Table S1 reports summary statistics for all the respondents used in the analyses. To understand the immediate impacts of the pandemic, we compare the reported level and allocation of work hours pre-pandemic and at the time of the survey. Figures 1A and 1B illustrate two primary findings. First, there is a sharp decline in total work hours, with the average dropping from 61.4 hours per week pre-pandemic to 54.4 at the time of the survey (diff.=-6.9, s.e.=0.20). In particular, 5.01% of scientists reported that they worked 42 hours or less before the pandemic, but this share increased nearly six-fold to 29.7% by the time of the survey (diff.=24.7, s.e.=0.67). Second, there is large heterogeneity in changes across respondents. Although 55.0% reported a decline in total work hours, 27.3% reported no change, and 17.7% reported an increase in time devoted to work. This significant fraction of scientists reporting no change or increases in their work hours is notable given that 91.0% of respondents reported that their institution was closed to non-essential personnel. To decompose these changes, we compare scientists' reported time allocations across four broad categories of work: research (e.g., planning experiments, collecting or analyzing data, writing), fundraising (e.g., writing grant proposals), teaching, and all other tasks (e.g., administrative, editorial, or clinical duties).
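The pre/post comparison above amounts to a paired mean difference with a standard error. As a minimal sketch (in Python, with toy numbers rather than the survey data), the reported diff. and s.e. can be computed as:

```python
import math
from statistics import mean, stdev

def mean_change(pre, post):
    """Mean within-respondent change in weekly hours and its standard error."""
    diffs = [b - a for a, b in zip(pre, post)]
    m = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))  # s.e. of the mean difference
    return m, se

# toy weekly hours for five respondents, before and during the pandemic
pre = [60, 55, 70, 50, 65]
post = [50, 55, 60, 45, 58]
m, se = mean_change(pre, post)
```

Because the same respondent reports both numbers, the within-person difference is the natural unit of analysis, which is how the -6.9 hour average and its standard error are framed in the text.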
We find that among the four categories, research activities have seen the largest negative changes. Whereas total work hours decreased by 11.3% on average, research hours declined by 24.4% (teaching, fundraising, and "all other tasks" decreased by 1.9%, 9.3%, and 0.7%, respectively). Comparing the share of time allocated across the tasks (Figure 1C-F), we find that research is the only category that sees an overall decline in the share of time committed (median changes: -16.2% for research, 0% for fundraising, +2.7% for teaching, and +2.0% for all other tasks). Overall, these results indicate that scientists' research time has been disrupted the most, and that the declines in time spent on the other three categories are mainly due to the decline in total work hours. Furthermore, correlations suggest that research may be a substitute for each of the three other tasks (see SI S5.1 and Figure S4). Still, despite the large negative changes in research time, substantial heterogeneity remains, as 9.4% reported no change and 21.2% reported spending more time on research. This sizable heterogeneity raises the question of which factors are most responsible for the observed differences among scientists. To unpack the varied effects of the pandemic, we first examine across-field differences. Figure 2A depicts the average change in reported research time across the 20 different fields we surveyed. Fields that tend to rely on physical laboratories and time-sensitive experiments, such as biochemistry, biological sciences, chemistry, and chemical engineering, report the largest declines in research time, in the range of 30-40% below pre-pandemic levels. Conversely, fields that are less equipment-intensive, such as mathematics, statistics, computer science, and economics, report the smallest average declines in research time. The difference between fields can be as large as four-fold, again highlighting the heterogeneity in how certain scientists are being affected.
These field-level differences may be due to the nature of work specific to each field, but may also be due to differences in the characteristics of the individuals who work in each field. To untangle these factors, we use a Lasso regression approach to select amongst (1) a vector of field indicator variables, and (2) a vector of flexible transformations of demographic controls and pre-pandemic features (e.g., research funding level, time allocations before the pandemic). The Lasso is a data-driven approach to feature selection that minimizes overfitting by selecting only variables with significant explanatory power 16, 17. We then regress the reported change in research time on the Lasso-selected variables in a post-Lasso regression, allowing us to estimate conditional associations for each variable selected (see SI S4). Comparing Figures 2A and 2B, we find that the contrast between the "laboratory" or "bench science" fields and the more computational or theoretical fields remains significant in the post-Lasso regression, indicating that differences inherent to these fields are likely important mediators of how the pandemic is affecting scientists. Although we cannot reject a null hypothesis of no change, there is also suggestive evidence of an increase in research time for the health sciences, possibly due to work related to COVID-19. Importantly, we also find that most of the variation across fields is diminished once we condition on the individual-level features selected by the Lasso, which suggests that a large amount of heterogeneity is due to these individual-level differences. Indeed, the standard deviation of the twenty field-level averages of reported changes in research time is 7.4%.
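The two-step procedure described here (select variables with the Lasso, then refit OLS on the selected set) can be sketched as follows. This is an illustrative coordinate-descent implementation in Python on synthetic data, not the authors' Stata 16 adaptive-Lasso pipeline, and the penalty `lam` is an arbitrary fixed choice rather than a cross-validated one:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Coordinate-descent Lasso for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: leave variable j out of the current fit
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            z = X[:, j] @ X[:, j] / n
            # soft-thresholding update sets weak coefficients exactly to zero
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z
    return beta

def post_lasso_ols(X, y, lam):
    """Refit OLS on the Lasso-selected columns (the post-Lasso step)."""
    keep = np.flatnonzero(lasso_cd(X, y, lam) != 0.0)
    coef, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
    return keep, coef

# synthetic example: only the first two of five covariates matter
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1]
keep, coef = post_lasso_ols(X, y, lam=0.05)
```

The point of the OLS refit is that the Lasso's shrinkage biases coefficients toward zero; refitting on the selected support removes that bias from the conditional associations that are ultimately reported.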
By contrast, the standard deviation of the individual-level residuals from these field-level averages (that is, how much each individual's response differs from the average in their field) is 50.5%, indicating that there is substantial variation across individuals even within the same field. To illustrate the raw individual-level variation, we measure the average change in reported research time across demographic and other group features (Figure 2C). Given the persistent gender gap in science 18-28, we include interactions with the female indicator to explore potential gender-specific differences. We find that there are indeed widespread changes across the range of individual-level features we examined. Yet, when we use the Lasso and regression to control for the field differences documented in Figure 2A, we find marked changes in the relevance of certain individual-level features. Figure 2D plots the post-Lasso regression coefficients associated with the demographic and career-stage characteristics and reveals four main results. First, career stage appears to be a poor predictor of the impacts of the pandemic, as conditional changes in research time for older versus younger and tenured versus untenured faculty are statistically indistinguishable. Second, scientists who report being subject to a facility closure also report only minor unconditional differences in their research time (Figure 2C), and this feature is not selected by the Lasso as a relevant predictor of changes in research time. Third, there is a clear gender difference. Holding field and all other observable features fixed, female scientists report a 4.2% larger decline in research time (s.e.=1.5). Fourth, child dependent care is associated with the largest effect. Reporting a dependent under 5 years old is associated with a 15.8% (s.e.=2.1) larger decline in research time, a substantially larger effect than that of any other individual-level feature.
Reporting a dependent 6 to 11 years old is also associated with a negative impact, ceteris paribus, but that decline is smaller than the decline associated with dependents under 5 years old. This is consistent with shifts in the demands of childcare as children age. Having multiple dependents is associated with an additional 4.5% decline (s.e.=1.6) in research time. Overall, these results are consistent with preliminary reports of differential declines in female scientists' productivity during the pandemic 29, 30. Our findings further indicate that some of the gender discrepancy can be attributed to female scientists being more likely to have young children as dependents (21.2% of female scientists in our sample report having dependents under the age of 5, compared to 17.7% of male and other scientists; s.e. of diff.=3.6). For further results related to the other three task categories, see SI S5.2. To estimate the potential downstream impact of the pandemic, we also asked respondents to forecast how their research publication output in 2020 and 2021 (in terms of the quantity and impact of their publications) will compare to their output in 2018 and 2019. We randomly assigned respondents to make a forecast for one of six possible scenarios, in which they were to take as given that the pandemic would last 1, 2, 3, 4, 6, or 8 months from the time of the survey. For more on how we use this introduced random variation and adjust scientists' forecasts to account for underlying trends in publication output, see SI S4.2. Figure 3A plots the distribution of the estimated changes in publication quantity and impact due to the pandemic. We find that, on average, quantity is projected to decline by 13.0% (s.d.=37.7).
For comparison, prior estimates show that in the biomedical sciences, receiving a grant of approximately one million dollars from the National Institutes of Health raises a PI's short-run publication output by 7-12% 31, 32, suggesting that a projected decline of 13% is not negligible. Moreover, the decline in output is not limited to quantity, as impact is projected to decline by 7.9% on average (s.d.=31.0). To understand which scientists are most likely to forecast larger declines in their output due to the pandemic, we repeat the Lasso-based regression approach using these forecasts as dependent variables. These analyses uncover two notable findings (Figure 3B). First, all of the features selected as relevant are related to caring for dependents. As in the case of research time, reporting a dependent under 5 years old is associated with the largest declines. Second, gender differences in these forecasts appear attributable to differential changes associated with dependents. Reporting a 6- to 11-year-old dependent is associated with a 6.6% (s.e.=1.9) and 5.4% (s.e.=1.8) lower forecast of publication quantity and impact, respectively, but only for female scientists (see SI S5.3 for the field-level results). We find that most of the same groups currently reporting the largest disruptions to research time also report the worst outlook for future publications. The correlations between the reported change in research time and forecasted publication output are 0.337 for quantity (p-value < 0.001) and 0.214 for impact (p-value < 0.001). While understanding the relationships between time input and research output is beyond the scope of this study, we repeat the analysis including the changes in reported time allocations to test whether they moderate the effects we observe.
We find that, while the post-Lasso regression coefficients associated with the selected demographic features generally become smaller, a statistically significant relationship remains in most cases even when conditioning on the (Lasso-selected) change in research time. This suggests that the forecasted declines associated with reporting young dependents are not simply explained by the direct change in time spent on research (Figure S7). We further investigate how these publication forecasts may depend on the expected duration of the COVID-19 pandemic by plotting the (randomized) expectation shown to the survey respondent against the estimated net effect of the pandemic (Figure 3C). A linear fit indicates that, for every month that the pandemic continues past April 2020, scientists expect a 0.63% decrease in publication quantity (s.e.=0.23) and a 0.48% decrease in impact (s.e.=0.19) due to the pandemic. These marginal effects may appear small relative to the others documented in this paper, but it is important to note that they are on a similar scale to economic forecasts for the U.S. and Europe, which (as of May 2020) project economic declines in the range of 0.4-0.6% per month (5-7% for 2020) 33. Still, these results could also reflect uncertainties or errors inherent to these forecasts, or strong personal beliefs about the timeline of the pandemic that are not easily swayed by the survey's suggestion. Our results shed light on several important considerations for research institutions as they consider reopening plans and develop policies to address the pandemic's disruptions. The findings regarding the impact of childcare reveal a specific way in which the pandemic is affecting the scientific workforce. Indeed, "shelter-at-home" is not the same as "work-from-home" when dependents are also at home and need care.
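The per-month marginal effect above comes from a linear fit of the forecasted change on the randomly assigned pandemic duration. A minimal sketch of that slope calculation, using toy values rather than the survey estimates:

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (here, the per-month marginal effect)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# the six randomly assigned durations (months) and hypothetical
# forecasted percentage changes in publication quantity
durations = [1, 2, 3, 4, 6, 8]
forecasts = [-0.63 * m for m in durations]  # exactly linear toy data
slope = ols_slope(durations, forecasts)
```

Because the durations were randomly assigned, this slope can be read as the causal effect of the suggested pandemic length on the forecast, which is what makes the design informative.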
Because childcare is often difficult to observe and rarely considered in science policies (aside from parental leave immediately following birth or adoption), addressing this issue may be uncharted but important new territory for science policy and decision makers. Furthermore, it suggests that unless adequate childcare services are available, researchers with young children may continue to be affected regardless of institutions' reopening plans. And since the need to care for dependents is by no means unique to the scientific workforce, these results may also be relevant for other labor categories. More broadly, many institutions have announced policy responses such as tenure-clock extensions for junior faculty. Of the 34 U.S. university policies we identified that provided some form of tenure extension due to the pandemic, 30 appeared to guarantee the extension for all faculty (see SI S5.5 for more). Institutions may favor such uniform policies for several reasons, such as avoiding legal challenges. But given the heterogeneous effects of COVID-19 we identify, it raises the question of whether these uniform policies, while welcome, may have unintended consequences and could exacerbate pre-existing inequalities 34. While this paper focuses on quantifying the immediate impacts of the pandemic, circumstances will continue to evolve, and there will likely be other notable impacts on the research enterprise. The heterogeneities we observe in our data may not converge, but may instead diverge further. For example, when research institutions begin the process of reopening, there may be different priorities for "bench sciences" versus work that involves human subjects or requires travel to field sites. And research requiring international travel could be particularly delayed. All of this could lead to new productivity differences across certain groups of scientists.
Furthermore, individuals with potential vulnerabilities to COVID-19 may prolong their social distancing beyond official guidelines. In particular, senior researchers may have incentives to continue avoiding in-person interactions 35, which historically facilitate mentoring and hands-on training of junior researchers. The possibility of a resurgence of infections 36 suggests that institutions may anticipate a reinstatement of preventative measures such as social distancing. This possibility could direct focus toward research projects that can be more easily stopped and restarted. Funders seeking to support high-impact programs may have similar considerations, favoring proposals that appear more resilient to uncertain future scenarios. Lastly, although we have focused on two of the denser geographic regions of scientific output in this study, the pandemic is having a substantial impact on research worldwide. In the coming years, researchers may be less willing or able to pursue positions outside of their home nation, which may deepen or alter global differences in scientific capacity. Future work expanding our understanding of how the pandemic is affecting researchers across different countries, at different institutions, and at different points in their lives and careers could provide valuable insights to more effectively protect and nurture the scientific enterprise. The strong heterogeneities we observe, and the likely development of new impacts in the coming months and years, both argue for a targeted and nuanced approach as the worldwide research enterprise rebuilds.

30. Kitchener, C. Women academics seem to be submitting fewer papers during coronavirus. 'Never seen anything like it,' says one editor.
https://www.thelily.com/women-academics-seem-to-be-submitting-fewer-papers-during-coronavirus-never-seen-anything-like-it-says-one-editor/ (2020).

The study protocol was approved by the Institutional Review Boards (IRBs) of Harvard University and Northwestern University. Informed consent was obtained from all participants.

Figure S6 reports the results from a similar exercise focusing on field-level differences. We find that the same three fields associated with the largest declines in research time (biochemistry, biology, and chemistry) also forecast the largest pandemic-induced declines in their publication output quantity, ceteris paribus. C. Average estimated changes in publication outputs per the randomized duration of the pandemic respondents were asked to assume for their forecasts (either 1, 2, 3, 4, 6, or 8 months from the time of the survey, mid-April 2020).
To compile a large, plausibly random list of active scientists, we leverage the Web of Science (WoS) publication database. The WoS database is useful for two reasons: (1) it is one of the most authoritative citation corpora available 1 and has been widely used in recent science of science studies 2-4; (2) among large-scale publication datasets, WoS is, to our knowledge, the only one with systematic coverage of corresponding-author email addresses. We are primarily interested in active scientists residing in the U.S. and Europe. We start from 21 million WoS papers published in the last decade (2010-2019). In an attempt to focus on scientists likely to still be active and in a more stable research position, we link the data to journal impact factor information (WoS Journal Citation Reports) and exclude papers published in journals in the bottom 25% of the impact factor distribution for their WoS-designated category. We use the journal impact factor calculated for the year of publication; for papers published in 2019, we use the latest version available (2018). We then extract all author email addresses associated with these papers. We consider each email address in this list a potential participant if: (1) it is associated with at least two papers in the ten-year period, and (2) the most recent country of residence, defined by the first affiliation on the most recent paper, is in the U.S. or Europe.
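The two eligibility filters can be sketched as follows; the toy records and the abbreviated country set are hypothetical stand-ins for the WoS data, which covers all U.S. and European countries:

```python
from collections import defaultdict

# abbreviated, illustrative region list (the study uses all U.S. and European countries)
ELIGIBLE = {"USA", "France", "Germany", "Italy", "Spain", "UK"}

def eligible_emails(papers):
    """Keep emails with >= 2 papers in 2010-2019 whose most recent
    paper's first affiliation is in the U.S. or Europe."""
    by_email = defaultdict(list)
    for email, year, country in papers:
        if 2010 <= year <= 2019:  # restrict to the ten-year window
            by_email[email].append((year, country))
    keep = []
    for email, recs in by_email.items():
        if len(recs) >= 2:  # at least two papers in the window
            recs.sort()  # most recent paper last
            if recs[-1][1] in ELIGIBLE:
                keep.append(email)
    return sorted(keep)

# toy records: (email, publication year, country of first affiliation)
papers = [
    ("a@uni.edu", 2012, "USA"), ("a@uni.edu", 2018, "USA"),
    ("b@lab.fr", 2015, "France"),                      # only one paper
    ("c@inst.cn", 2011, "China"), ("c@inst.cn", 2019, "China"),
]
```

Here `a@uni.edu` passes both filters, `b@lab.fr` fails the two-paper requirement, and `c@inst.cn` fails the region requirement.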
We have approximately 2.5 million unique email addresses after filtering, with about 521,000 in the U.S. and 938,000 in Europe. We then randomly shuffled the two lists separately and sampled roughly 280,000 email addresses from the U.S. and 200,000 from Europe. We oversampled the U.S. as part of a broader outreach strategy underlying this and other research projects. We recruited participants by sending them email invitations. We build on field classifications used in national surveys such as the U.S. Survey of Doctorate Recipients (SDR) to categorize fields in our survey, aggregating to ensure sufficient sample sizes within each field. The notable additions we make to the fields used in these other surveys are: Business Management, Education, Communication, and Clinical Sciences. These fields reflect major schools at most universities and/or did not immediately map to the default fields used in the SDR (e.g., the "Health Sciences" field in the SDR does not include medical specialties). Of a total of 480,000 emails sent, approximately 64,000 bounced directly, due either to incorrect spellings in the WoS data or to the termination of the email account. In hopes of soliciting a larger sample, we also undertook snowball sampling by encouraging respondents to share the survey with their colleagues. Overall, 9,968 individuals entered the survey and 8,447 continued past the consent stage. Of those who did not, 412 were not an active scientist, post-doc, or graduate student and thus not within our population of interest, 81 did not consent, and 1,028 did not make any consent choice. When a respondent continued past the consent stage, we asked them to report the type of role they were in.
Of the 8,447 consenting responses, 5,728 were from faculty or principal investigators (PIs), 1,023 from post-doctoral researchers, 701 from graduate students in a doctoral program, and 52 from retired scientists. Of the remainder, 551 reported some other type of position and 392 did not report their position. This yields an estimated response rate of approximately 1.6%. This low response rate may reflect the disruptive nature of the pandemic, but it also raises concerns about the generalizability of our results. After the initial distribution, we received feedback that many individuals had found the email in their "junk" folder, which raised the concern that our distribution was being automatically flagged as spam. Spot-checking five individuals whom we ex-post identified as having been randomly selected into our sample, and with whom we had professional relationships, we found that in four of the five cases the recruitment email had been flagged as spam. We know of no systematic way of estimating the true spam-flagging rate (nor of avoiding these spam filters when using email distributions at this scale) without using high-end, commercial-grade products. Additionally, as with any opt-in survey, there may be correlations between which scientists opt in and the experiences they want to report. For example, scientists who felt strongly about sharing their situation, whether they experienced large positive or negative changes, may have been more likely to respond, which would increase the heterogeneity of the sample. Furthermore, there may also be non-negligible gender differences that arise not from actual differences in outcomes but from differences in reporting known to occur across genders 5-9. For our analyses, we focus entirely on responses from the sample of faculty/PIs.
From the full sample of PIs, we retain respondents who reported working for a "University or college", "Non-profit research organization", "Government or public agency", or "Other", excluding 87 responses from individuals who reported working for a "For-profit firm". We also restrict the sample to respondents whose IP address originated from the United States or Europe (dropping 1,049 responses from elsewhere). We then drop observations that have missing data for any of the variables used in our analyses: 26 responses do not report their time allocations, 74 do not report their age, 10 do not report the type of institution they work at, and 114 do not report their field of study. Altogether, this amounts to dropping 187 observations. Given the relatively small subset of our sample dropped due to missing data, we do not impute missing variables, as this would introduce unnecessary noise 10. The summary statistics for the final sample used in the analyses are reported in Figure S1, and the geographic distribution of respondents is shown in Figure S2. To estimate the generalizability of our respondent sample, we use the public microdata from the Survey of Doctorate Recipients (SDR) as the best sample estimates of the population of principal investigators in the U.S. The SDR is conducted by the National Center for Science and Engineering Statistics within the National Science Foundation, sampling from individuals who have earned a science, engineering, or health doctorate degree from a U.S. academic institution and are less than 76 years of age. The survey is conducted every two years, and we use the latest data available (the 2017 cycle). For this comparison, we focus only on university faculty in both our survey and the SDR. We also constrain our sample to include only fields of study with a clear mapping to the SDR categories.
The SDR covers only researchers with Ph.D.-type degrees, so it does not capture researchers with other degrees who are still actively engaged in research (e.g., researchers with only M.D.s). This means we exclude "architecture and design," "business management," "medicine," "education," "humanities," and "law and legal studies." Figure S2 compares respondents between our sample and the SDR sample. Figure S3a illustrates differences in demographic and career-stage features, including raw differences as well as those adjusted by field. We find only a small difference in age and no difference in partner status. Our survey oversamples female scientists, those with children, and untenured faculty. These differences persist after conditioning on the scientist's reported field. That we ultimately find female scientists and those with young dependents to report the largest disruptions suggests that these individuals may have been more likely to respond to the survey in order to report their circumstances. The geographic distributions are relatively similar, with slight oversampling of the West and undersampling of the South. Lastly, we find a significant but small oversampling of U.S. citizens. We also compare the distribution of research fields (Figure S3b). Overall, the distributions are relatively similar. We oversample most significantly on "atmospheric, earth, and ocean sciences" and "other social sciences," while we undersample most significantly on the biological sciences, "mathematics and statistics," and "electrical and mechanical engineering." There does not seem to be a clear pattern in these field-level differences, as we undersample fields that ultimately report being across the spectrum of disruptions (e.g., mathematics and statistics reports some of the smallest disruptions, while the biological sciences are amongst the most disrupted).
The unconditional changes reported by each group of scientists are informative of how the pandemic affected researchers overall. But they do not allow us to infer whether groups reporting larger or smaller disruptions are doing so for reasons inherent to that group (i.e., the nature of work in certain fields, or the demands of home life unique to certain individuals) or because the individuals who select into that group tend to also be disrupted for unrelated reasons. This motivates a multivariate regression analysis to explore whether the changes associated with a group of individuals change after conditioning on other observables. However, selecting which of an available set of covariates (or transformations thereof) to include in a regression is notoriously challenging. The Lasso method provides a data-driven approach to this selection problem by excluding covariates that do not improve the fit of the model 11, 12. When using the Lasso, our general approach is to include a vector of indicator variables for the fields or demographic/career groups of interest, along with an additional set of controls. When focusing on differences across fields, we include the demographic/career variables in the control set, and vice versa. The control variables common to all Lasso-based analyses are: pre-pandemic levels of time allocations and totals, pre-pandemic shares of time allocations, the pre-pandemic funding estimate, and indicators for the type of institution (academic, non-profit, government, or other) and the location (state if in the U.S., country if in Europe).
To make minimal assumptions about the functional form of the control variables, we expand the control set with the following transformations: for all continuous variables we include the inverse hyperbolic sine (which approximates a logarithmic transformation while accommodating zeros) as well as square and cube transformations, and we interact all indicator variables with the linear versions of the continuous variables. We perform the Lasso using the lasso linear command in Stata 16 software, using the defaults for constructing initial guesses, tuning parameters, the number of folds (ten), and stopping criteria. We use the two-step cross-validated "adaptive" Lasso, in which an initial instance of the algorithm makes a first selection of variables and a second instance is then run using only the variables selected in the first. The variables selected after this second run are used in a standard post-Lasso OLS regression with heteroskedasticity-robust standard errors.

We are interested in the effect of the COVID-19 pandemic on research output. As an initial estimate of this effect, we asked respondents to forecast how their research output in 2020 and 2021 will compare to their output in 2018 and 2019. This framing was chosen for its simplicity, but it does not provide a direct estimate of the pandemic's effect. To obtain that directly, we could have asked respondents to compare their expected output in 2020 and 2021 to what they would have expected their output to be in those years had the pandemic not occurred. Clearly, this is more complicated. Because we chose the simpler framing, we must account for some underlying factors before arriving at figures closer to what scientists think the effect of the pandemic will be (or our estimates thereof).
These raw year-to-year forecasted changes in publication output will be influenced by four major factors: (1) changes due to the pandemic to date; (2) anticipated future changes due to the pandemic; (3) the respondent's expectations about how long the pandemic will last; and (4) regular trends in the evolution of publication output across individuals and fields (e.g., if female scientists have been steadily increasing the number of publications they produce each year, then in the absence of the pandemic we might expect this trend to continue into the near future). Again, we are primarily interested in (1) and (2). To address (3), we randomly assigned respondents to make forecasts under one of six possible scenarios, in which they were to take the duration of the pandemic as given: either 1, 2, 3, 4, 6, or 8 months from the time of the survey. In some analyses, we condition on this variable to control for variation due to perceptions about the length of the pandemic. In others, we explore the effect of these different scenarios directly to infer how scientists perceive disruptions may evolve as the pandemic does or does not persist. With respect to (4), the issue of differential trends across individuals and fields, we first note that the time scale we are concerned with (approximately 2 years) is small enough that we expect the majority of individuals not to change in terms of their observables, because all of the time-dependent observables used in the analyses are based on 5-year groupings. Still, to address this issue more quantitatively, we use historical data and another Lasso-based regression model to project scientists' publication output in 2020 and 2021, using their observable features from the survey and publication data since 2010. Our assumption is that these projections approximate what scientists would have forecasted in the absence of the pandemic; they provide a crude counterfactual.
Given the short timeframes involved and the rich observable data we possess, we hypothesize that the room for significant biases or deviations is small relative to the across-individual variation. Due to data quality limitations, we are only able to connect 56% of respondents to their publication records, but a comparison of observables indicates no meaningful differences between scientists who can be connected to their publication records and those who cannot (see Figure S4). Since we observe the variables used in these projections for all respondents, we can project trends for all scientists in our sample. While the measurement of publication quantity is straightforward, the measurement of quality, or "impact" as it was asked in the survey, is not. Following a long line of science of science research 13, we use citation counts as the best available proxy for quality, and we follow the state of the art in adjusting and counting these citations in a manner that does not conflate across-field differences 14.

The Lasso-based projection proceeds as follows. First, we demean the publication measures at the year level. We do this because we do not want to attribute aggregate year-to-year variation across the entire sample to actual changes in net output; these fluctuations can plausibly be linked to changes in Web of Science (WoS) coverage over time, and we are much more concerned with differential trends amongst different fields and/or different individuals. Next, we use the Lasso to select which of the observables best predict publication counts and citations. The major difference between this Lasso-based approach and the others used in this paper is that, here, we interact all observables with flexible time trends (i.e., squared, cubic, and inverse hyperbolic sine transformations of the year variable) to allow for differential trends across groups.
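Two of the measurement steps above can be sketched compactly. Assuming the "Euclidean" citation measure is the L2 norm of an author's paper-level citation counts (our reading of the counting rule cited as reference 14, and of the Table S1 description), and with year-level demeaning as just described:

```python
import numpy as np

def euclidean_index(citations):
    """Euclidean citation index: the L2 norm of paper-level citation
    counts, so one highly cited paper counts for more than the same
    total spread thinly across many papers."""
    c = np.asarray(citations, dtype=float)
    return float(np.sqrt(np.sum(c ** 2)))

def demean_by_year(x, years):
    """Subtract year-level means so aggregate fluctuations (e.g. changes
    in WoS coverage over time) are not attributed to changes in net
    output; only differential, within-year variation survives."""
    out = np.asarray(x, dtype=float).copy()
    for yr in np.unique(years):
        mask = years == yr
        out[mask] -= out[mask].mean()
    return out

# A concentrated record scores higher than a diffuse one with equal totals:
euclidean_index([10, 0, 0])   # -> 10.0
euclidean_index([5, 5])       # -> ~7.07, despite the same 10 total citations

# After demeaning, every year's mean is (numerically) zero:
years = np.repeat(np.arange(2010, 2015), 4)
demeaned = demean_by_year(np.arange(20.0), years)
```

Field demeaning of the Euclidean index (as in Table S1) would apply the same subtraction logic with field identifiers in place of years.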
Finally, we project these expected output measures as a function of the selected covariates and their corresponding coefficients from a post-Lasso OLS regression. Importantly, we project out of sample just two years, so that we have estimates of the counterfactual trends for 2020 and 2021. With these estimates of respondents' counterfactual forecasts in hand, we then subtract them from scientists' actual reported forecasts to arrive at our estimate of scientists' forecast of the "net effect" of the pandemic. Figure S3 plots the distributions of the unadjusted forecasts and these net effects for both the quantity and impact measures. The adjustment does not substantially change the distribution, but for the aforementioned reasons we are more confident in these estimates as "pandemic effects."

Figure S5 plots the reported changes in research time (y-axes) against the reported changes in time allocated to the other three task categories (x-axes). The figures are binned scatterplots, and linear fits of the data suggest that research may be a substitute for the other categories: a 10% increase in time spent on fundraising, teaching, or all other tasks is associated with a decline in research time of 1.4% (s.e. = 0.14), 4.6% (s.e. = 0.16), and 3.2% (s.e. = 0.13), respectively. We lack exogenous variation that clearly shifts the time allocated to one (or a subset) of the tasks, so we cannot identify the extent to which these correlations reflect actual substitution patterns rather than unobserved factors. Still, the magnitudes and precision of these relationships suggest that further investigation is warranted to better understand how scientists allocate their time.

Figures S6a and S6b replicate Figures 2b and 2d from the main text, respectively, using each of the other three task categories as the dependent variable. For the analysis focused on fields (Fig. S6a), no clear patterns emerge with respect to changes in time spent fundraising or teaching.
Reported changes in teaching time may be explained by a combination of factors. First, during the pandemic the demand for teaching is likely relatively stable (e.g., most academic institutions have moved classes online, but there are few reports of classes being suspended); second, disruptions due to the transition to online teaching may have occurred earlier and hence may not be captured by our survey. There is evidence that scientists in the clinical sciences and biochemistry are spending an increasing amount of time on the "all other tasks" category, which could plausibly be due to a redirection of effort directly towards pandemic-related (non-research) work. For the analysis focused on demographic groups (Fig. S6b), we find that scientists reporting a dependent under 5 years old also tend to report larger declines across all task categories. This result is consistent with the unsurprising hypothesis that these dependents require care that leads scientists to decrease their total work hours. The absence of any apparent substitution away from research towards the other categories for individuals with young dependents suggests the association is driven by factors inherent to having a dependent at home, and not by these individuals selecting alternative work structures that have them performing less research and more of the other tasks.

Figure S7 recreates Figure 3b from the main text, but using the field-level Lasso approach. Forecasted changes in output are almost entirely confined to publication quantity (as opposed to impact), with the same fields, biology and chemistry, that reported the largest declines in research time also forecasting the largest declines in publication output, here in the range of 4-10% relative to what would have been expected otherwise. Notably, some fields expect to publish more because of the pandemic, again highlighting the heterogeneous experiences scientists are having due to the pandemic.
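The projection and netting-out steps behind these forecasted output changes can be sketched for a single scientist. This is a stylized sketch, not the paper's code: a simple linear trend stands in for the post-Lasso selected covariates, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# In-sample history: one scientist's publication counts, 2010-19
t = np.arange(10, dtype=float)                 # years since 2010
pubs = 2.0 + 0.15 * t + rng.normal(0, 0.05, 10)

# Post-Lasso OLS stand-in: fit the trend terms selected for this
# scientist (here just intercept + linear trend, for illustration)
X = np.column_stack([np.ones_like(t), t])
beta, *_ = np.linalg.lstsq(X, pubs, rcond=None)

# Project just two years out of sample: 2020 (t=10) and 2021 (t=11)
counterfactual = beta[0] + beta[1] * np.array([10.0, 11.0])

# Net pandemic effect = reported forecast minus projected counterfactual
reported = np.array([1.8, 1.7])                # illustrative survey forecasts
net_effect = reported - counterfactual         # negative -> implied decline
```

Because this scientist was trending upward, the raw reported drop understates the implied pandemic effect; subtracting the counterfactual makes the net effect more negative than the raw change.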
Figure S8 recreates Figure 3b from the main text, but includes the reported changes in time allocated to each of the four task categories (in addition to the pre-pandemic reported time allocations, as before). Again, we find a similar set of dependent-related variables to be most predictive of forecasted publication changes, even though the reported change in research time is also selected as relevant by the Lasso. For comparison, the forecasted disruption associated with a dependent under 5 years old (a 7.04% decline in expected publication count) is approximately the same magnitude as the implied effect associated with a 26% decrease in research time.

Using internet searches, we attempted to identify university-level tenure clock extension policies put in place as a result of the COVID-19 pandemic. While not a comprehensive list, we identified policies for 34 universities, encompassing public and private, small and large institutions. Of the 34 universities, 17 automatically applied a tenure clock extension to all faculty, with individuals able to opt out 15-31; 13 require applications that are automatically approved 32-44. Four universities have not established unilateral policies 45-48; instead, they have either created a separate application process or added COVID-19-related impact to the list of reasons a faculty member may apply for an extension.

Table S1. Summary Statistics. Summary statistics for the main survey sample. "Mean, with pubs." and "Mean, miss. pubs." report the averages for the sub-samples that can and cannot be connected to their publication records in WoS, respectively. The "t-stat" column reports the t-statistic from a test of mean differences between these two sub-samples. The two WoS-based variables are "Pub. Quantity (Number) since 2010" (the sum of the author's number of publications in the WoS record) and "Pub. Impact (Eucl.
Citations) since 2010" (the field-demeaned Euclidean sum of citations to the author's publications in the WoS record 14).

Figure S1. Geographic distribution. Plots respondent locations in the U.S. (a) and Europe (b), aggregated to preserve anonymity.

Figure S2. Comparison to U.S. university-based SDR respondents. Summary statistics for demographic variables and fields common to both our survey and the U.S. Survey of Doctorate Recipients (SDR). All comparisons are based on U.S.-located faculty or PIs at universities or colleges who report affiliation with a field of study present in both surveys (note: all fields present in the SDR are present in our survey, but not vice versa). a. Describes the sample averages for both samples and the mean differences in both the raw data ("Diff.") and after adjusting for the different composition of fields in each sample ("Diff., field adjusted"). b. Plots the share of respondents in each sample affiliating with each of the fields common to both surveys. (*** p<0.01; ** p<0.05; * p<0.1)

Figure S3. Publication changes, raw and inferred pandemic effects. Plots the distribution of changes to publication output. Blue lines indicate publication quantity; red lines indicate impact. Solid lines indicate the raw responses from the survey (which asked only about changes in publication output from 2018-19 to 2020-21), and dashed lines indicate our estimates of the implied effect of the COVID-19 pandemic based on the removal of group-specific trends in publication output. See Section 2 of the Methodology for more.

Figure S8. Recreates Figure 3b from the main text, also including the scientists' reported changes in time committed to each of the four task categories. The error bars indicate 95% confidence intervals, and only variables selected in the corresponding Lasso selection exercises are included in the post-Lasso regression.
The coefficient corresponding to the "10% Change Time, Research" variable indicates the percent change in the scientist's forecasted quantity or impact associated with a 10% increase in the reported change in research time. For example, we estimate that scientists who reported a 10% larger decline in their research time forecast that the pandemic will cause them to produce 2.67% fewer publications in 2020-2021.

References

COVID-19) deaths. Our World in Data
Policy responses to the coronavirus pandemic
Science-ing from home
How research funders are tackling coronavirus disruption
Safeguard research in the time of COVID-19
Coronavirus outbreak changes how scientists communicate
The pandemic and the female academic
Early-career scientists at critical career junctures brace for impact of COVID-19
How early-career scientists are coping with COVID-19 challenges and fears
How COVID-19 could ruin weather forecasts and climate records
Productivity differences among scientists: evidence for accumulative advantage
Research productivity over the life cycle: evidence for academic scientists
The economics of science
Faculty time allocation: a study of change over twenty years
Incentives and creativity: evidence from the academic life sciences
Regression shrinkage and selection via the Lasso
Regression shrinkage and selection via the Lasso: a retrospective
Web of Science as a data source for research on scientific and scholarly activity
Increasing dominance of teams in production of knowledge
Quantifying long-term scientific impact
Atypical combinations and scientific impact
Highly confident but wrong: gender differences and similarities in confidence judgments
Boys will be boys: gender, overconfidence, and common stock investment
Trouble in the tails? What we know about earnings nonresponse 30 years after Lillard, Smith, and Welch
Measurement error in survey data
Response error in earnings: an analysis of the Survey of Income and Program Participation matched with administrative data
Flexible imputation of missing data
Regression shrinkage and selection via the Lasso
Regression shrinkage and selection via the Lasso: a retrospective
How to count citations if you must
Extension of the probationary period for tenure-track faculty due to COVID-19 disruptions
Harvard offers many tenure-track faculty one-year appointment extensions due to COVID-19
Extending the reappointment/promotion/tenure review timeline
COVID-19 and tenure review
Response to Covid-19 disruption: extension of the tenure clock. The University of Alabama in Huntsville
Memo on tenure-track probationary period extensions due to Covid-19. University of Virginia Office of the Executive Vice President and Provost
Extension of tenure clock in response to COVID-19
Rule waivers: tenure clock extensions, leaves of absence, conversions, dual roles
Extension of the tenure-clock guidelines for contract extension and renewal. Iowa State University Office of the Senior Vice President and Provost
Tenure clock extension due to COVID-19 disruption
Faculty promotion and tenure
Tenure-track faculty: extension of tenure clock due to COVID-19. The Ohio State University Office of Academic Affairs
Promotion/tenure clock extensions due to COVID-19 - faculty
One-Year opt-in tenure clock extension
COVID-19 guidance for faculty: extensions of tenure clock
Probationary period extensions for tenure-track faculty. The University of Texas at Austin Office of the Executive Vice President and Provost
Tenure rollback policy for COVID-19

Acknowledgments

We thank Alexandra Kesick for invaluable help. This work is supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0354, National Science Foundation SBE 1829344, and the Alfred P. Sloan Foundation G-2019-12485 and G-2020-13873.