key: cord-0663354-9wrm20p3 authors: Zuzul, Tiona; Pahnke, Emily Cox; Larson, Jonathan; Bourke, Patrick; Caurvina, Nicholas; Shah, Neha Parikh; Amini, Fereshteh; Park, Youngser; Vogelstein, Joshua; Weston, Jeffrey; White, Christopher; Priebe, Carey E. title: Dynamic Silos: Increased Modularity in Intra-organizational Communication Networks during the Covid-19 Pandemic date: 2021-04-01 journal: nan DOI: nan sha: 2f01fb9c149966fee8197ee14b3fc977cbf6725d doc_id: 663354 cord_uid: 9wrm20p3

Workplace communications around the world were drastically altered by Covid-19 and the resulting work-from-home orders and rise of remote work. We analyze aggregated, anonymized metadata from over 360 billion emails within over 4,000 organizations worldwide to examine changes in network community structures over 24 months. We find that, in 2020, organizations around the world became more siloed than in 2019, evidenced by increased modularity. This shift was concurrent with decreased stability, indicating that organizational silos had less stable membership. We provide initial insights into the meaning and implications of these network changes, which we term dynamic silos, for new models of work.

How has Covid-19 altered intra-organizational communication networks? Studying communication networks (who communicates with whom) is critical in understanding how work gets done (Blau 1963, Jacobs and Watts 2021, Kleinbaum et al. 2013). Covid-19 created exceptional circumstances, and a natural experiment, that disrupted work in many ways never observed on such a scale. The public health crisis, related work-from-home orders, and the subsequent rise of [...]

Our findings provide a baseline set of results, replicated across organizations and geographies, that can inform future work on why Covid-19 resulted in dynamic silos and what the implications are for individuals, organizations, and the future of work.
Recent studies indicate a post-Covid shift in communication patterns in specific organizations (Yang et al. 2021) or over the short term (DeFilippis et al. 2020, Teodorovicz et al. 2021); we extend those studies by comparing patterns within organizations around the world and over two years. Although it is not our intention to specify mechanisms, our findings suggest that intra-organizational communication networks were affected by shifts in the medium of communication (from in-person to email). The observed changes suggest that serendipitous, in-person interactions with those outside one's community are not replaced by email; instead, employees might reduce email with people outside their well-defined work groups once they no longer interact in person. But they may also intensify communication within their organizational silos. These dynamics persist even in the face of increased membership instability within silos.

The remainder of this paper is organized as follows. We anchor our study in research on organizational networks. We then describe the data and our methods of analysis. We present results at three levels: across organizations around the world, within individual countries, and within a single large organization. Next, we describe a generative model we created to facilitate future research. Finally, we highlight the implications of our results in order to inspire future research on the impact of Covid-19 on organizational performance and innovation.

Scholars have long recognized that formal organizational charts may not represent actual flows of communication at work. Blau (1963) highlighted the importance of informal "water cooler" conversations versus formal organizational structures in understanding how employees work. Shepard (1960) and Strauss (1955) explored how formal organizational structures could even be subverted by informal structures driving output and efficiency.
Recent research has examined how employees build different informal networks within the same formal organizational network. For example, women tend to have more ties than men (Kleinbaum et al. 2013), gain fewer advantages from those ties (Ibarra 1992), and are less frequently introduced to new, potentially valuable ties (Abraham 2020). Recently, scholars have highlighted email as a way to capture informal networks within organizations (Jacobs and Watts 2021, Kleinbaum et al. 2013). By examining patterns of email, it is possible to create social networks that aggregate patterns of individual interaction to the organizational level. These aggregate patterns can provide insights into how information flows within an organization and how these flows vary between organizations. For example, Jacobs and Watts's (2021) exploratory study found that more geographically dispersed organizations have more centralized email networks, indicating that longer paths are required to access information.

Constructing social networks based on email data can also help identify higher-level community structures within organizations. Community structures reveal the structure of ties within a network, how frequently employees communicate with each other, and how often they engage employees outside of their own silos. For example, one well-known type of community structure is a small world (Watts and Strogatz 1998), in which the network, or a portion of it, is characterized by dense clustering of ties and short path lengths between nodes, which can facilitate rapid and efficient information flow (Gulati et al. 2012a). While recent studies have begun to explore how organizational communication metrics (e.g., number of emails and time spent in meetings; DeFilippis et al. 2020, Teodorovicz et al. 2021) have changed due to Covid-19, we examine shifts in community structures for several reasons.
First, examining community structures does not require imposing a formal organizational structure; community structures can be induced. Inducing rather than imposing structure allows for an understanding of how communication flows in practice between individuals, teams, and functions. Second, analyzing patterns within distinct communities reveals differences between an organization's different communities. Finally, building on Clement et al.'s (2018a) finding that "in a network, members of different communities have access to different knowledge and routines" (p. 266), we expect shifts in community structure to produce macro-level changes in information and knowledge flow (see, for example, Gulati et al. 2012a) with potentially long-lasting impact on organizational performance and innovation.

One measure of a network's community structure is modularity, which captures the degree to which it is divided into communities, that is, the extent of siloing. Highly modular networks are one form of small world (Watts and Strogatz 1998). Those with low modularity have less well-defined communities and greater overall connectivity. Organizational scholars have identified consistently high modularity (0.60-0.75) in the interorganizational collaboration networks of firms in the television (Clement et al. 2018b), microelectronics (Tatarynowicz et al. 2016), computing (Sytch and Tatarynowicz 2014, Gulati et al. 2012b), and biotechnology/pharmaceutical (Tatarynowicz et al. 2016) industries. To our knowledge, research has not measured modularity in intra-organizational communication networks nor explored temporal shifts in modularity. We attribute this gap to data limitations, as a comparative and dynamic analysis of network structure requires longitudinal data within and across many organizations. We leverage precisely this kind of data to understand whether and how communication networks and community structures shifted following Covid-19.
We examine both the volume of emails sent and received and the network features that can be observed from their exchange. We calculate community structures by attempting to maximize modularity, as this provides insight into a network-level descriptive statistic independent of email volume. Using these community structures, we also calculate the Adjusted Rand Index (ARI) to measure the stability of intra-community membership.

We collected anonymized email data from approximately 100,000 organizations over 29 months, from July 2018 through November 2020, amounting to approximately 450 billion email receipts. For each organization $i$ and each month $t$, we constructed an undirected weighted edge $(u, v)$ with the weight $w_{i,t,(u,v)}$ being the total number of messages observed between accounts $u$ and $v$. To filter out company-wide messages, we follow prior research (Kleinbaum et al. 2013) by not considering emails with more than four recipients. We eliminate self-loops by ignoring edges from $u$ to $u$. This edge definition induced an undirected weighted graph from which we extracted the largest connected component $G_{i,t} = (V_{i,t}, E_{i,t})$, where $V_{i,t}$ is the collection of accounts and edge $(u, v) \in E_{i,t}$ indicates that accounts $u$ and $v$ had at least one message between them. We limited our sample to organizations $i$ with $|V_{i,t}| > 2000$ for all $t$, yielding 4,361 organizations (ranging from 2,000 to 500,000 nodes), 126,469 organization-month networks, and approximately 362 billion email receipts. We also analyzed the data between January 2019 and December 2020 (a) within countries with enough organizations to allow for anonymization and (b) from a single organization, Microsoft. The time period allows for month-over-month comparisons that account for seasonal variation in email patterns and exploit Covid-19 as a natural experiment. Analyzing the data was computationally intensive and required large-scale distributed compute infrastructure; it took more than 55,000 compute-hours for clustered machines to process the data.
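The network construction described above can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the input format (a list of sender and recipient-list pairs for one organization-month), the function names, and the choice to count one message per sender-recipient pair are all assumptions made for the example.

```python
from collections import defaultdict

MAX_RECIPIENTS = 4  # emails with more recipients are treated as broadcasts


def build_weighted_edges(messages):
    """Aggregate (sender, recipients) records into undirected edge weights."""
    weights = defaultdict(int)
    for sender, recipients in messages:
        if len(recipients) > MAX_RECIPIENTS:
            continue  # skip company-wide style messages
        for recipient in recipients:
            if recipient == sender:
                continue  # ignore self-loops
            edge = tuple(sorted((sender, recipient)))  # undirected key
            weights[edge] += 1
    return dict(weights)


def largest_connected_component(weights):
    """Keep only the edges that lie in the largest connected component."""
    neighbors = defaultdict(set)
    for u, v in weights:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, best = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            for nxt in neighbors[node]:
                if nxt not in component:
                    component.add(nxt)
                    stack.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return {edge: w for edge, w in weights.items() if edge[0] in best}


# Hypothetical one-month message log for a single organization.
messages = [
    ("a", ["b"]),
    ("b", ["a", "c"]),
    ("c", ["c"]),                      # self-addressed, ignored
    ("d", ["a", "b", "c", "e", "f"]),  # five recipients, ignored
    ("x", ["y"]),                      # separate, smaller component
]
edges = largest_connected_component(build_weighted_edges(messages))
print(edges)
```

In this toy log only the component containing accounts a, b, and c survives, with the a-b edge carrying weight 2. At the scale reported in the text a distributed implementation would be needed; the sketch only fixes the logic of the filtering steps.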
We defined modularity (Newman and Girvan 2004, Newman 2006, Bickel and Chen 2009) as

$$Q(G, \tau) = \frac{1}{2m} \sum_{u,v} \left( A_{uv} - \frac{k_u k_v}{2m} \right) \mathbb{1}\{\tau_u = \tau_v\},$$

where $A$ is the weighted adjacency matrix of $G$, $k_u = \sum_v A_{uv}$ is the weighted degree of vertex $u$, $m = \frac{1}{2} \sum_{u,v} A_{uv}$ is the total edge weight, and $\tau \in \{1, \ldots, K\}^n$ encodes a network partition assigning $n$ vertices to $K$ communities. We used the Leiden algorithm (Traag et al. 2019) to find a network partition that approximately maximized the modularity function.

We also consider ARI (Hubert and Arabie 1985). Given a network on the same set of $n$ nodes at two different times, $G_t$ and $G_{t'}$, and letting $P_t$ and $P_{t'}$ be partitions of the two networks into communities, the Rand index is defined as $\mathrm{RI}(G_t, G_{t'}) = (a + b) / \binom{n}{2}$, where $a$ is the number of pairs of nodes that are in the same subset in both $P_t$ and $P_{t'}$ and $b$ is the number of pairs of nodes that are in different subsets in both partitions. ARI adjusts this measure for chance, so that ARI ≈ 0 indicates that which nodes cluster together is essentially chance across the two networks, while ARI ≈ 1 indicates that individual nodes' community memberships are stable across the two networks. We calculate ARI using the maximum modularity partitions.

To better illustrate the dynamic interplay between modularity and ARI, Figure 1 presents a simplified case: the network of a single organization, with $K = 2$ blocks (communities), $n = 20$ vertices (nodes), and 10 vertices per block, observed at times $G_1$ and $G_2$. In stochastic blockmodels (SBMs) (Holland et al. 1983) with both the number and size of the blocks held constant, an increase in $Q$ together with a small value of ARI implies (a) more siloed groups and (b) significant churn in group membership. Figure 1 compares $G_1$ and $G_2$. The only difference in the two SBMs is in the block connectivity matrices $B_1$ and $B_2$, which are of the form $[b_{11}, b_{12}; b_{21}, b_{22}]$. We assume the within-block connectivities $b_{11} = b_{22} = 0.50$ for both $B_1$ and $B_2$, but the between-block connectivity is $b_{12} = b_{21} = 0.15$ for $B_1$, decreasing to 0.05 for $B_2$.
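As a hedged numerical illustration of this two-block setup, the sketch below samples SBMs with the stated connectivities and computes ARI for a swap of two vertices in each direction. Two simplifications are assumptions of the example, not the authors' procedure: modularity is evaluated on the planted block labels rather than on a Leiden maximum-modularity partition, and the seed and number of runs are arbitrary. All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """ARI (Hubert and Arabie 1985) for two equal-length label lists."""
    pairs = comb(len(labels_a), 2)
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = same_a * same_b / pairs   # chance-level agreement
    max_index = (same_a + same_b) / 2
    # Undefined (division by zero) when both partitions are trivial.
    return (same_both - expected) / (max_index - expected)


def sample_sbm(labels, p_within, p_between, rng):
    """Sample an undirected, unweighted SBM as a list of edges."""
    edges = []
    for u, v in combinations(range(len(labels)), 2):
        p = p_within if labels[u] == labels[v] else p_between
        if rng.random() < p:
            edges.append((u, v))
    return edges


def modularity(edges, labels):
    """Newman-Girvan modularity of an unweighted graph for a fixed partition."""
    m = len(edges)
    degree_sum, within = Counter(), Counter()
    for u, v in edges:
        degree_sum[labels[u]] += 1
        degree_sum[labels[v]] += 1
        if labels[u] == labels[v]:
            within[labels[u]] += 1
    return sum(within[c] / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items())


blocks_1 = [0] * 10 + [1] * 10
# Two vertices leave each block and join the other one.
blocks_2 = [1, 1] + [0] * 8 + [0, 0] + [1] * 8
ari = adjusted_rand_index(blocks_1, blocks_2)

rng = random.Random(0)  # fixed seed, arbitrary choice


def mean_planted_q(p_between, runs=2000):
    """Average planted-partition modularity over repeated SBM samples."""
    qs = [modularity(sample_sbm(blocks_1, 0.50, p_between, rng), blocks_1)
          for _ in range(runs)]
    return sum(qs) / runs


q_1 = mean_planted_q(0.15)  # between-block connectivity of B_1
q_2 = mean_planted_q(0.05)  # between-block connectivity of B_2
print(f"ARI = {ari:.3f}, mean Q(G1) = {q_1:.3f}, mean Q(G2) = {q_2:.3f}")
```

For this swap the ARI works out to 2920/9000, or about 0.324. A back-of-envelope expectation for the planted partition is 45/60 - 0.5 = 0.25 when the between-block connectivity is 0.15 and 45/50 - 0.5 = 0.40 when it is 0.05, so modularity rises as between-block connectivity falls. A maximum-modularity search can score somewhat higher than the planted labels on any given sample, because it optimizes over partitions.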
In this case, the modularity $Q(G_2)$ is larger than $Q(G_1)$: $Q(G_1) = 0.266 \pm 0.035$ vs. $Q(G_2) = 0.400 \pm 0.035$. If we also assume that the block memberships are altered such that two of the ten members of block one from $G_1$ switch to block two in $G_2$, replaced by two from $G_1$'s block two moving to $G_2$'s block one, then the block membership stability measure is $\mathrm{ARI}(G_1, G_2) = 0.324$, as illustrated in Figure 1. The shaded communities are the Leiden-derived maximum modularity communities. The vertex colors denote block membership in $G_1$, so we can see that two vertices change communities. The modularity increase indicates fracturing of the internal community structure and the corresponding decrease in stability indicates churn in community membership.

We examined changes in monthly modularity and ARI in 2019 and 2020. Figures 3 and 4 provide a cross-sectional view of modularity across organizations and geographies and contextualize a shift in modularity within Microsoft. We find that modularity within organizations is relatively high: 50 percent of all organizations fall within the 0.64-0.77 range. Figure 3 provides a histogram summarizing the modularity $Q(G_{i,t})$ for all 126,469 organization-month networks, with interquartile ranges. Figure 3 also includes two histograms illustrating geographic differences: the modularity of organizations in Canada, the lowest-modularity country in our sample (mean modularity ranging over time from 0.66 to 0.69), and in Germany, the highest-modularity country in our sample (mean modularity ranging over time from 0.75 to 0.77). Figure [...]

To further isolate the mechanisms driving overall shifts in modularity, we examined modularity by country. We examined the extent to which changes in modularity occurred contemporaneously with emergency orders that resulted in widespread shifts to working from home (issued at different times in different countries).
While cultural drivers of modularity are outside the scope of our study, this analysis suggests geographic drivers of variation in modularity. For example, Figure 8 presents modularity aggregated by geography for Canada (132 organizations) and Germany (84 organizations). Although the modularity for Germany is consistently higher than that for Canada, in both cases we see an increase following the imposition of country-wide emergency orders. Figure 9 presents modularity as a function of time for 10 other countries/regions. Following the imposition of emergency orders (indicated with a red line) in the spring of 2020, modularity relative to 2019 increased across all regions. This held regardless of whether modularity in the early months of 2020 was higher (e.g., Germany, Japan) or lower (e.g., Canada, India) than in the early months of 2019. This analysis also suggests the persistence of dynamic silos: although modularity in 2020 continued to track the same seasonal trends as in 2019 (see, e.g., France and the United States), baseline levels remained higher even following the removal of emergency orders (indicated with a blue line).

We also examine how community structures changed within a single organization, Microsoft, following the imposition of a company-wide work-from-home order on March 4, 2020. We find that [...]

Finally, we compared formal organization structure with changes in modularity for seven suborganizations within Microsoft. These subunits had different initial modularity scores, likely driven by their distinct strategies, approaches to work, and degree of coupling with the broader organization. We found that our results held in all but one suborganization. That suborganization, which dropped in modularity significantly between February and March 2020, includes members of Microsoft's strategy and operations organization, who were tasked with manning the organization's control center through the crisis.
Members nimbly adapted their networks by paring down connections within their own working group and keeping and forming connections across groups, likely those most acutely relevant, to regain stability. This indicates that shifts in email communication, while broadly similar between organizations, likely differed between communities within the [...]

Access, privacy, and legal considerations often prohibit obtaining raw communication data for analysis. To facilitate future research, we have created a generative model designed for our intra-organizational [...]

To generate this model, we used root-level Leiden community structures to create an a posteriori stochastic block model that retained the population statistics for both vertices and edges from the real network being fit. To make Barabási-Albert fit well in the context of an SBM, we modified the algorithm: within each block of the SBM, we considered a specific budget of vertices and edges, obtained from the observed network being fit. We configured the Barabási-Albert algorithm to create a number of edges for each vertex equal to the intra-block average degree centrality. Then, using either Erdős-Rényi (Figure 11, Panel (a)) or Barabási-Albert (Figure 11, Panel (b)) to create intra-block connections, we observed major differences between the resulting networks and the real network being fit (Figure 11, Panel (d)). The inter-block connections are made at a rate determined by the real network, but the connections are made between random vertices across pairs of communities. We observed that the power-law distribution of the degree centrality is much closer to the real network's distribution when using the Barabási-Albert generator and that the network paths generated using Barabási-Albert are longer than those from Erdős-Rényi. These longer paths produce less regularity in the structures and also allow for bleed-over between communities, as highly eccentric nodes connected to multiple communities will be pulled between those communities.
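The root-level construction described above (Barabási-Albert wiring inside each block, random endpoints for inter-block edges) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' generator: block sizes, the per-block attachment parameter, and the inter-block edge counts are passed in as hypothetical inputs rather than estimated from an observed network, and the function names are invented for the example.

```python
import random


def barabasi_albert_edges(nodes, m, rng):
    """Preferential-attachment edges among `nodes`; each new node adds m edges."""
    if len(nodes) <= m:
        return set()
    edges = set()
    targets = list(nodes[:m])  # the first m nodes act as seeds
    repeated = []              # each node appears once per incident edge
    for source in nodes[m:]:
        for target in targets:
            edges.add((target, source))
        repeated.extend(targets)
        repeated.extend([source] * m)
        chosen = set()
        while len(chosen) < m:  # draw m distinct targets, degree-proportional
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


def ba_sbm(block_sizes, block_m, inter_edge_counts, seed=0):
    """Block-wise Barabási-Albert graph plus random inter-block edges.

    inter_edge_counts maps a block pair (i, j) with i < j to the number of
    edges to place between them; each count must not exceed the number of
    available vertex pairs, or the sampling loop will not terminate.
    """
    rng = random.Random(seed)
    blocks, start = [], 0
    for size in block_sizes:
        blocks.append(list(range(start, start + size)))
        start += size
    edges = set()
    for nodes, m in zip(blocks, block_m):
        edges |= barabasi_albert_edges(nodes, m, rng)
    for (i, j), count in inter_edge_counts.items():
        added = set()
        while len(added) < count:  # endpoints chosen uniformly at random
            added.add((rng.choice(blocks[i]), rng.choice(blocks[j])))
        edges |= added
    return blocks, edges


# Hypothetical budgets: two blocks of 30 and 20 vertices, 5 bridging edges.
blocks, edges = ba_sbm([30, 20], [3, 2], {(0, 1): 5}, seed=1)
print(len(edges))
```

With these inputs each block contributes (size - m) * m edges, so the graph has 27 * 3 + 18 * 2 + 5 = 122 edges. Swapping `barabasi_albert_edges` for independent coin flips per vertex pair would give the Erdős-Rényi variant that the text compares against.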
Using these observations, we extended our model to use hierarchical Leiden communities obtained by running Leiden recursively on the real network until we attained leaf communities with no more than $n_{max}$ vertices. (We use $n_{max} = 250$.) Using these leaf communities, we applied the Barabási-Albert algorithm again for the leaf intra-block connections, then proceeded with inter-block connections between leaf clusters. This localized the connections between communities to small groups of nodes, dramatically fracturing the network structure (Figure 11, Panel (c)) and corresponding to the structure observed in the real network being fit; as in the real network, the generative model produced many new and small communities. Applying Leiden to data generated from our generative model, we found these groups of communities captured in the same partition when maximizing root-level modularity, indicating that the more complex and realistic structure generated by BA-HSBM has modularity characteristics similar to those of the real network being fit. This generative model significantly reduces the computational complexity of analyzing the data in the future.

The modularity of intra-organizational email communication networks increased from 2019 to 2020, while ARI decreased across organizations. These results were replicated both around the world and in one organization (Microsoft). Our analysis shows that, as employees shifted to remote work due to Covid-19, organizational networks around the world became more siloed and that the membership of communities within these networks became less stable. These dynamics persisted over time, even as emergency orders driving many work-from-home restrictions were lifted. The widespread shifts in these measures, that is, dynamic silos, imply that changes in the medium of communication (from in-person to email) may affect who communicates with whom and how information flows at work in a lasting way.
Our findings have significant implications for both practice and theory. Leaders of organizations embracing remote work need to understand the impact of such a change on employee communication. Those executives who made formal organizational changes in response to Covid-19 might consider whether and how those changes will need to be adapted to support long-term remote work. Our analysis provides insight into how informal networks changed. The increased siloing we observe need not be feared; indeed, our findings suggest baseline differences in modularity scores exist between firms and even countries. This suggests that there is no single optimal level of modularity and that the appropriate level of siloing in an organization likely depends on its strategy and structural features, including the degree of coupling between its subunits. Ultimately, by understanding that shifts in the medium of communication can affect with whom, not just how, employees communicate, executives can begin to attend to the dynamics of informal as well as formal networks.

Theoretically, our study provides a baseline set of findings with implications for research on organizational performance and innovation. Future research can explore both the benefits and the trade-offs associated with dynamic silos. First, increased modularity might improve productivity and efficiency. Collaborating with people with similar domain knowledge (Simon 1996) or complementary role experience (Valentine and Edmondson 2015) increases trust (Coleman 1988), cooperation (Reagans and Zuckerman 2001, Wang et al. 2021), and efficiency (Reagans and McEvily 2003). Increased siloing might allow employees to focus on communicating with those with whom they already share interpretive schemas (Gulati 1995), allowing for rapid sharing of information and tacit knowledge (Granovetter 1985).
Research has suggested that these benefits can occur even with membership instability or churn, provided that members have clear roles within their groups (Valentine and Edmondson 2015). Thus, depending on work practices and features of collaborative groups, increased siloing may improve efficiency (Choudhury et al. 2021) and productivity (Dahlander and McFarland 2013). Future research can examine whether dynamic siloing is associated with changes in firm or sub-unit performance and whether or how these relationships are moderated by different work practices.

Second, dynamic siloing may reduce innovation in some organizations. Innovation often arises from novel combinations of distantly held knowledge (Schumpeter and Opie 1934, Kogut and Zander 1992, Hargadon and Sutton 1997, Burt 2004). Interdisciplinary or cross-department collaborations provide access to new ties and information that can provoke innovative ideas (Soda et al. 2021, Rawlings et al. 2015). Increased siloing could reduce such access (Uzzi 1997, Gulati et al. 2012b, Tortoriello et al. 2012). Future research should examine the impact of shifts in modularity on innovation rates (measured through patents, publications, new products, and so on).

Finally, increased modularity in large organizations might be associated with a specific kind of innovation, namely competence-destroying technologies (Abernathy and Clark 1985, Tushman and Anderson 1986), which render existing organizational capabilities obsolete. Such innovations are typically the work of new firms (Tripsas 1997, Zuzul and Tripsas 2020) or small teams (Wu et al. 2019). In large or incumbent organizations, competence-destroying, architectural, or disruptive innovation is best developed by groups that are loosely coupled with the rest of the organization.
As increased modularity might foster the required cultural separation and autonomy, dynamic siloing might promote innovation in large established organizations (Henderson and Clark 1990, Christensen 1997, Benner and Tushman 2003).

Our analysis highlights the need to examine the drivers and implications of geographic differences in baseline modularity scores and in the magnitude of post-Covid-19 modularity shifts. If these shifts are associated with changes in organizational performance and innovation, they may have implications for national competitiveness and resilience and therefore merit continued focus as organizational communities evolve after the pandemic. Finally, recent studies have shown that, as a result of Covid-related work-from-home orders, employees transferred their informal interactions to new forms of digital communication, including instant messages (Yang et al. 2021). While we focus on email data, future studies can examine the network structures revealed by multi-modal changes in communications via video conferencing, social media, or chat data. We hope our research will stimulate studies connecting modularity and related measures of communication networks to organizational outcomes.

An anonymized version of the data on Microsoft Corporation that support this study will be retained indefinitely for scientific and academic purposes. The data are available from the authors upon reasonable request and with the permission of Microsoft Corporation. The code used to produce the results shown on Microsoft, the code used to create the generative models, and the fitted generative models for all 126,469 organization-month networks are available upon reasonable request and with the permission of Microsoft Corporation.
Innovation: Mapping the winds of creative destruction
Gender-role incongruity and audience-based gender bias: An examination of networking among entrepreneurs
Emergence of scaling in random networks
Exploitation, Exploration, and Process Management: The Productivity Dilemma Revisited
A nonparametric view of network models and Newman-Girvan and other modularities
The dynamics of bureaucracy: Study of interpersonal relations in two government agencies
Structural Holes and Good Ideas
Work-from-anywhere: The productivity effects of geographic flexibility
The innovator's dilemma: when new technologies cause great firms to fail. The management of innovation and change series
Brokerage as a public good: The externalities of network hubs for different formal roles in creative organizations
Social Capital in the Creation of Human Capital
Ties that last: Tie formation and persistence in research collaborations over time
Collaborating during coronavirus: The impact of covid-19 on the nature of work
Economic Action and Social Structure: The Problem of Embeddedness
Bootstrapping exchangeable random graphs
Does Familiarity Breed Trust? The Implications of Repeated Ties for Contractual Choice in Alliances
The rise and fall of small worlds: Exploring the dynamics of social structure
Technology Brokering and Innovation in a Product Development Firm
Architectural Innovation: The Reconfiguration of Existing Product Technologies and the Failure of Established Firms
Stochastic blockmodels: First steps
Comparing partitions
A large-scale comparative study of informal social networks in firms
Life in the trading zone: Structuring coordination across boundaries in postbureaucratic organizations
Discretion Within Constraint: Homophily and Structure in a Formal Organization
Knowledge of the Firm, Combinative Capabilities, and the Replication of Technology
Modularity and community structure in networks
Finding and evaluating community structure in networks
Streams of Thought: Knowledge Flows and Intellectual Cohesion in a Multidisciplinary Era
Network Structure and Knowledge Transfer: The Effects of Cohesion and Range
The Social Capital of Corporate R&D Teams
The theory of economic development: an inquiry into profits, capital, credit, interest, and the business cycle
Machine as mind. Machines and Thought
Networks, Creativity, and Time: Staying Creative through Brokerage and Network Rejuvenation
Group Dynamics and Intergroup Relations
Exploring the Locus of Invention: The Dynamics of Network Communities and Firms' Invention Productivity
Environmental Demands and the Emergence of Social Structure: Technological Dynamism and Interorganizational Network Forms
Working from home during covid-19: Evidence from time-use studies
Bridging the Knowledge Gap: The Influence of Strong Ties, Network Cohesion, and Network Range on the Transfer of Knowledge Between Organizational Units
From Louvain to Leiden: guaranteeing well-connected communities
Unraveling the process of creative destruction: Complementary assets and incumbent survival in the typesetter industry
Technological Discontinuities and Organizational Environments
Social Structure and Competition in Interfirm Networks: The Paradox of Embeddedness
Team Scaffolds: How Mesolevel Structures Enable Role-Based Coordination in Temporary Groups
The Past Is Prologue? Venture-Capital Syndicates' Collaborative Experience and Start-Up Exits
Collective dynamics of small-world networks
Large teams develop and small teams disrupt science and technology
The effects of remote work on collaboration among information workers
Start-up Inertia versus Flexibility: The Role of Founder Identity in a Nascent Industry