key: cord-0629844-uvfclxeo authors: Manrique, Pedro D.; Oud, Sara El; Johnson, Neil F. title: Online Group Dynamics Reveal New Gel Science date: 2021-08-04 journal: nan DOI: nan sha: 7191c0a7ca855de0a66174787ebf4542ad6722c4 doc_id: 629844 cord_uid: uvfclxeo
A better understanding of how support evolves online for undesirable behaviors such as extremism and hate could help mitigate future harms. Here we show how the highly irregular growth curves of groups supporting two high-profile extremist movements can be accurately described if we generalize existing gelation models to account for the facts that the number of potential recruits is time-dependent and humans are heterogeneous. This leads to a novel generalized Burgers equation that describes these groups' temporal evolution, and predicts a critical influx rate for potential recruits beyond which such groups will not form. Our findings offer a new approach to managing undesirable groups online -- and more broadly, managing the sudden appearance and growth of large macroscopic aggregates in a complex system -- by manipulating their onset and engineering their growth curves.
Theories of aggregation have had a successful history in physics, chemistry and beyond [13-29]. Understandably, most models and analyses in physical and chemical systems have considered constant-size populations of N identical objects in a constant-volume space. The resulting aggregation can lead to the sudden appearance of a macroscopically large aggregate, i.e. a gel. Network science has reinforced the importance of this gel formation problem in the context of the growth of a giant connected component (GCC) in a network [19, 27]. Indeed, gel and GCC equations can be identical when the system is treated in an averaged way (i.e. mean-field) [19, 27].
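The sudden appearance of a gel (GCC) in the classic setting described above, with a fixed population of N identical objects, can be illustrated with a short simulation. The following is a minimal sketch and not the generalized model of this paper: it assumes that every randomly chosen pair joins with probability 1, and the function name `largest_cluster_fraction` is hypothetical. It tracks clusters with a union-find structure and reports the fraction of the population held by the largest cluster.

```python
import random


def largest_cluster_fraction(n, n_links, seed=0):
    """Join n_links randomly chosen pairs among n identical objects.

    Returns the size of the largest cluster divided by n.
    Assumption: every chosen pair joins (no heterogeneity, constant n).
    """
    rng = random.Random(seed)
    parent = list(range(n))   # union-find parent pointers
    size = [1] * n            # cluster size, valid at each root

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    largest = 1
    for _ in range(n_links):
        a = find(rng.randrange(n))
        b = find(rng.randrange(n))
        if a == b:
            continue          # already in the same cluster
        if size[a] < size[b]:
            a, b = b, a       # attach the smaller cluster to the larger
        parent[b] = a
        size[a] += size[b]
        largest = max(largest, size[a])
    return largest / n


if __name__ == "__main__":
    n = 20000
    for links_per_object in (0.25, 0.5, 0.75, 1.0):
        frac = largest_cluster_fraction(n, int(links_per_object * n))
        print(f"{links_per_object:.2f} links per object: "
              f"largest cluster fraction = {frac:.3f}")
```

In this baseline the largest cluster stays a negligible fraction of the population until roughly one link per two objects has been added, after which it grows to a macroscopic fraction. This is the standard random-graph result for the emergence of a giant component.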
Though unrelated in topic, society is currently experiencing another example of something large appearing suddenly 'out of nothing': the rapid recent rise of extremism and hate on social media. The question of why these undesirable behaviors arise has inspired remarkably in-depth studies across the social sciences [1-12, 29] (see Supplementary Information (SI) for a fuller list of references). However, social media companies and governments are still struggling with the operational question of how such undesirable behavior manages to appear suddenly on their platforms and grow so quickly, and what can be done to prevent it. Here we introduce a generalized aggregation model (Fig. 1) whose solutions can explain how undesirable behavior such as extremism grows online (Fig. 2) and which offers a new approach to mitigation and prevention (Figs. 3 and 4). By incorporating the complexities of the online world, namely that the online user population is time-dependent (N(t)) and that individuals are non-identical and operate in an expansive online space (Fig. 1(a)), the model also happens to represent a novel generalization of existing models in physics and chemistry. Hence our findings also make the broader contribution of advancing understanding of non-equilibrium phenomena in open systems [37-41]. We also note that even for the online world, the interpretation of the model is not limited exclusively to undesirable behaviors such as extremism, and hence it could be used to interpret other irregular growth curves such as those reported recently for online Covid-19 (mis)information [30].
Fig. 1: Each individual i has its own internal variable (shade of gray) x_i = (x_{1;i}, x_{2;i}, ..., x_{T;i}). (b) Each gel (or equivalently GCC using a network picture) is an observable online group in Fig. 2. Each of the T dimensions mimics a trait or stance on an issue, around which an online group (gel, GCC) can form. Since individuals can simultaneously belong to any number of online groups, each axis can in principle form a group from all N(t) individuals. Groups that combine dimensions could form, but this just shifts the numerical value of the population-averaged aggregation probability F.
arXiv:2108.01940v1 [physics.soc-ph] 4 Aug 2021
What makes this otherwise general model (Fig. 1) particularly well-suited to describe online extremism and hate is that it captures the empirical finding [31-36] that individuals interested in such unacceptable or otherwise sensitive topics tend to utilize the inbuilt group-creation features provided by social media platforms such as Facebook and its Europe-based clone VKontakte (but not Twitter). In short, it describes how supporters 'gel' together into a group along one of the axes in Fig. 1(b), with the benefits that the group provides a fairly shielded online environment and has a flavor of extremism or hate that appeals to them. Such groups are referred to by different names depending on the platform (e.g. Facebook Page, VKontakte Club), and we stress that they have nothing to do with communities inferred from a network analysis algorithm. Furthermore, the model produces group growth curves (Fig. 2) that are far more irregular than those of existing models [37-41] but which are close to the empirical curves for two high-profile extremist movements. These movements are the Boogaloos in the U.S., who were reportedly involved in the January 2021 Capitol riot and have members drawn from highly diverse ideologies [1, 7]; and ISIS (Islamic State). Our data collection methodology is the same as Refs.
[34-36] and avoids requiring any information about individuals: we first search manually for an initial seed of groups, and then we track which other groups they connect to or mention, in order to iterate toward a closed list (see SI for details).
The model (Fig. 1) starts by assigning each individual i a list of T traits or stances around a given issue, x_i = (x_{1;i}, x_{2;i}, ..., x_{T;i}), drawn from a general heterogeneity distribution P{x_i}. Centola et al. [42] discuss how such a simple approximation is nonetheless consistent with in-depth studies in sociology. Neither the nature nor the number T of these traits or stances affects the form of the equations that we develop; they could adapt over time but are kept constant here. For notational simplicity, we consider T = 1 in what follows. Starting from a population N(0) of isolated individuals, at each timestep we connect two randomly chosen individuals i and j (and hence the aggregates they may belong to in Fig. 1(a)) with a probability per unit time that depends on the pair's similarity |x_i − x_j|. For the equations that follow, we do not need to specify a precise function of similarity at the level of individual pairs since this just controls the numerical value of the population-average pairing probability, F. As illustration, the SI calculates F values for a uniform distribution P{x_i}: pairing favoring similarity (homophily) yields F = 2/3 while dissimilarity (heterophily) yields F = 1/3. This aggregation process can lead to large-scale connectivity transitions over time in any of the T dimensions, producing a group (gel, or GCC) comprising a non-negligible fraction of the population or network. Newcomers are injected as unattached individuals (Fig. 1(a)).
Fig. 2: See SI for full statistics. Each curve is a separate solution of Eq. 2, and represents a separate group (gel, or GCC) in Fig. 1(b). The histogram compiling the inferred F value from each of the groups is shown in the inset. (a) Domestic U.S. extremist groups that identify as Boogaloos, on Facebook. The low F values for the groups (F ∼ 1/3) are consistent with recent social science case studies [1] that uncovered high diversity among Boogaloo groups. (b) International U.S. extremist groups that support ISIS, on the less moderated platform VKontakte. The higher F values (F ∼ 2/3) as compared to (a) are consistent with higher similarity within ISIS supporter groups. Other inset shows N(t) over time. N(t) is not accessible for Facebook in (a).
At mean-field level, the number n_s of aggregates of size s follows coupled, nonlinear dynamical equations for s = 1 and s ≥ 2 (Eq. 1). Empirical evidence supporting the product kernel form in Eq. 1 comes from studies of human communication and collaboration networks [18], while its distance independence reflects the global reach of online interactions. As usual, the time t in such averaged equations is not supposed to correspond to a timestep in a full many-particle or network simulation, or to real calendar time. Using the generating function E(y, t) ≡ Σ_{s≥1} s n_s e^{ys} [19, 37] in Eq. 1 (see SI) leads to a general equation (Eq. 2) that determines the time-dependent evolution of an online group. Setting E(0, t) = N(t) − G(t) and solving Eq. 2 yields the size of each group over time, G(t), as well as its onset time t_c beyond which G(t) becomes non-zero and large. Equation 2 has the form of a generalized Burgers equation with a forcing term. We have checked that our numerical, and where possible analytical, solutions of Eqs. 1 and 2 are consistent with full many-particle simulations (i.e. a generalized version of the Gillespie algorithm [26, 28], see SI) and with full network simulations (see SI) that include the full microscopic pairing dynamics and avoid any population-averaging. This helps confirm the suitability of the averaging process for the group (gel) formation and its equivalence to a network picture (GCC).
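The quoted values F = 2/3 and F = 1/3 for uniformly distributed traits can be checked numerically. The sketch below is an illustration under stated assumptions, since the SI's functional form is not reproduced in this text: it assumes T = 1, traits drawn uniformly from [0, 1], and linear pairing probabilities 1 − |x_i − x_j| (homophily) and |x_i − x_j| (heterophily). All function names are hypothetical. Because the mean of |x_i − x_j| for two independent uniform variables is exactly 1/3, these assumed rules give F = 2/3 and F = 1/3 respectively.

```python
import random


def homophily(distance):
    # Assumed rule: similar pairs (small distance) are more likely to join.
    return 1.0 - distance


def heterophily(distance):
    # Assumed rule: dissimilar pairs (large distance) are more likely to join.
    return distance


def mean_pairing_probability(pair_prob, n_samples=200_000, seed=1):
    """Monte Carlo estimate of F, the population-average pairing probability.

    Traits x_i, x_j are drawn independently from Uniform(0, 1) (assumption),
    and pair_prob maps the trait distance |x_i - x_j| to a joining probability.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += pair_prob(abs(rng.random() - rng.random()))
    return total / n_samples


if __name__ == "__main__":
    print("homophily   F ~", round(mean_pairing_probability(homophily), 3))
    print("heterophily F ~", round(mean_pairing_probability(heterophily), 3))
```

Other similarity functions or trait distributions would change only the numerical value of F, which is the sole way pair-level detail enters the mean-field description.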
Equation 2 leads to the conclusion that, in contrast to the usual situation of physical systems with a fixed number N of identical particles in a fixed volume [19], the formation of a group (gel, GCC) may be suppressed entirely, and at the very least can exhibit a wide range of highly irregular growth profiles as shown in Fig. 2. To demonstrate this explicitly, we adopt a constant dN(t)/dt (i.e. Ṅ(t) = Ṅ_const), which is a reasonable approximation for ISIS based on the empirical data in the Fig. 2(b) inset, and also for the Boogaloos given the reported steady influx of people during the 2020 pandemic and election buildup (individuals could not be counted on Facebook). This leads (see SI) to an exact result for the onset time t_c of a group (gel, GCC) (Eq. 3), written in terms of α, where α² = 1 − 8F/Ṅ_const. Equation 3 implies that not only is the group formation time t_c delayed with smaller F and/or larger rate Ṅ_const, it can diverge (see Fig. 3(a) for β = 0). As the system gets flooded with more heterogeneous individuals, the ability of a group (gel, GCC) to emerge slows down, and the group can be prevented from ever forming. This abrupt transition between regimes of eventual group formation and no group formation is given by Ṅ_const|_c = 8F (vertical line in Fig. 3(a)). This result offers a counterintuitive but novel approach to mitigation: delay or even prevent the onset of groups (gels, GCCs) by flooding the online space with heterogeneous individuals so that Ṅ_const > 8F. While such a proposition ultimately requires testing online, it provides a rigorously derived quantitative starting point for policy discussions. For groups that do emerge, their growth can be delayed or flattened by decreasing F (Figs. 3(a) and 4(a)), which can be achieved by encouraging or engineering more diversity among newcomers. This impact of decreasing F can be seen directly from the approximate analytic solutions for G(t) shown in Fig. 4(a) and derived in the SI, which are in close quantitative agreement with the simulations. In these solutions, y_0 = ω − W(ω e^ω), z 0 = N (0) ω W (ωe ω ), W(·) denotes the Lambert function, and a further definition is given in Eq. 5.
Independent empirical evidence supporting our finding of a delay in t_c with decreasing F comes from recent laboratory-controlled experiments which find that human groups formed by random aggregation (F = 1 for uniformly distributed heterogeneity) were quicker to attain a high level of innovation performance than groups formed by like or unlike individuals (F < 1) [43].
Figure 2 compares the numerical solutions of Eq. 2 for Ṅ(t) = Ṅ_const to the empirical data for the extremist groups. Each group corresponds to a gel (or equivalently GCC) forming along a separate trait (or issue) axis in Fig. 1(b). The growth of these extremism support groups is generally far quicker and more irregular than that of groups focused on benign topics. Hence although our model is not limited to extremism, it again seems particularly appropriate to such undesirable behavior.
Equation 2 can be used as a starting point for quantifying the impact of other policies and interventions, in addition to our suggested one of pushing Ṅ(t) beyond 8F for the constant-rate case. For example, suppose the social media platform tries to dynamically control the newcomer flow Ṅ(t). Figure 3 shows the impact for Ṅ(t) = q t^β (Ṅ(t) = q ≡ Ṅ_const for β = 0). As β increases with β > 0, the onset t_c gets further delayed and the transition to a no-group formation regime occurs at smaller q. By contrast, β < 0 appears to remove this transition. Figure 3(b) shows the corresponding group growth curves G(t). G(t) for β > 0 rises initially more slowly than for β ≤ 0, but eventually overtakes it. As β becomes increasingly negative, G(t) rises quicker but saturates faster, eventually reaching the constant-N (i.e. Ṅ(t) = 0) limit that provides a crude initial approximation for some growth curves [37, 41].
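The dependence of the onset time on the influx Ṅ(t) = q t^β can be explored numerically. The sketch below rests on an assumption that should be read carefully: Eqs. 1 and 2 are not reproduced in this text, so it assumes a standard product-kernel form with pairwise rate 2F/N(t)² and injection of single individuals at rate Ṅ(t). For that assumed form the second moment M2 = Σ s² n_s obeys dM2/dt = Ṅ(t) + 2F M2²/N(t)² before onset, and the onset time is where M2 diverges. This assumed form reproduces the threshold Ṅ_const = 8F stated above, but the closed-form expression in `onset_time_constant_rate` is this sketch's own derivation for constant influx and is not claimed to be Eq. 3 verbatim. All function names and parameter values are hypothetical.

```python
import math


def onset_time_constant_rate(F, rate, n0):
    """Blow-up time of the ASSUMED moment equation for N(t) = n0 + rate * t.

    Requires rate > 0 and n0 > 0. Returns None when rate >= 8F, where the
    assumed equation has no finite-time divergence (no group forms).
    """
    if rate >= 8.0 * F:
        return None
    a = math.sqrt(8.0 * F / rate - 1.0)           # |alpha|, alpha^2 = 1 - 8F/rate
    phi = math.atan((4.0 * F / rate - 1.0) / a)   # phase fixed by M2(0) = N(0)
    sigma_c = (2.0 / a) * (math.pi / 2.0 - phi)   # ln(N(t_c) / n0)
    return (n0 / rate) * math.expm1(sigma_c)


def onset_time_numeric(F, q, beta, n0, dt=0.01, t_max=2000.0, cap=1e6):
    """RK4 integration of u = M2/N for influx q * t**beta (beta >= 0 only).

    Uses du/dt = [Ndot * (1 - u) + 2 F u^2] / N, which follows from the
    assumed moment equation. Returns the first time u exceeds cap (a proxy
    for divergence), or None if that does not happen before t_max.
    """
    if beta < 0:
        raise ValueError("this sketch only handles beta >= 0")

    def population(t):
        return n0 + q * t ** (beta + 1.0) / (beta + 1.0)

    def dudt(t, u):
        n = population(t)
        return (q * t ** beta * (1.0 - u) + 2.0 * F * u * u) / n

    t, u = 0.0, 1.0   # start from isolated individuals: M2(0) = N(0)
    while t < t_max:
        k1 = dudt(t, u)
        k2 = dudt(t + dt / 2.0, u + dt * k1 / 2.0)
        k3 = dudt(t + dt / 2.0, u + dt * k2 / 2.0)
        k4 = dudt(t + dt, u + dt * k3)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        if not (u < cap):   # also catches inf/nan produced near the blow-up
            return t
    return None


if __name__ == "__main__":
    F, n0 = 2.0 / 3.0, 100.0
    for rate in (1.0, 2.0, 4.0, 6.0):   # 8F is about 5.33
        print(rate, onset_time_constant_rate(F, rate, n0),
              onset_time_numeric(F, rate, 0.0, n0, t_max=500.0))
```

For β = 0 the two functions can be compared directly, and both report no onset once the influx exceeds 8F. Calling `onset_time_numeric` with β > 0 gives a way to probe, within this assumed form, how a growing influx shifts the onset.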
The Ṅ(t) = 0 (constant N) limit is also reached by setting q → 0. Another possible policy would be for platforms to restrict the range of online community spaces that newcomers have access to. We mimic this by fixing the volume in Fig. 1(a) and hence setting the denominators in Eq. 1 to N(0) for all t. For β = 0, and hence Ṅ(t) = Ṅ_const, the resulting expression for the onset time (see SI) has no transition to a no-group formation regime, i.e. a group always emerges eventually. This onset time is shorter than in the previously discussed cases, including constant N(t). More generally, for β ≠ 0, t_c satisfies a transcendental equation involving J_n, the Bessel function of the first kind. Figure 4(b) illustrates the rich nonlinear dependence that can then arise for t_c as a function of β in the limit N(0) → 0. This and other results for this 'constant volume' limit are given in the SI.
Our model and analyses are not a priori limited to extremism or human behavior. Hence our findings serve more broadly to indicate novel science for aggregation in heterogeneous, time-varying populations or networks across the sciences. In particular, our analysis has led to a general equation that describes the onset and time evolution of macroscopically large groupings (i.e. a macroscopically large correlation or coherence) in a general population which has a time-dependent size and whose component objects are non-identical and operate in an expansive space. This is important since the sudden appearance of such a macroscopically large grouping may be beneficial or harmful to a system (e.g. it may generate extreme events). Our analysis then shows how to manipulate its onset, engineer its growth curve, and even prevent it from forming.
PDM acknowledges support from the Los Alamos National Laboratory Director's Fellowship. NFJ acknowledges support of the US Air Force Office for Scientific Research through grants FA9550-20-1-0382 and FA9550-20-1-0383.
We are grateful to Nicholas Johnson Restrepo, Rhys Leahy, Yonatan Lupu and Nicolas Velasquez for help with the empirical data.
This is Our House: A Preliminary Assessment of the Capitol Hill Siege Participants. Program on Extremism
Online influence, offline violence: Linguistic responses to the 'Unite the Right' rally
Terrorist Use of the Internet by the Numbers
Lone-actor terrorist use of the Internet and behavioral correlates
A contagion of institutional distrust: Viral Disinformation of the COVID Vaccine and the Road to Reconciliation
Far-Right Extremists Move From 'Stop the Steal' to Stop the Vaccine
Congress Just Got an Earful About the Threat of the Boogaloo Movement. Vice News
Measuring social response to different journalistic techniques on Facebook
The Base Rate Study: Developing Base Rates for Risk Factors and Indicators for Engagement in Violent Extremism
The Extreme Gone Mainstream
The terrorist's dilemma: Managing violent covert organizations
The nature of the beast: Organizational structures and the lethality of terrorist attacks
The Dynamics of Multidimensional Secession: Fixed Points and Ideological Condensation
Collective action and the collaborative brain
Heterogeneous Preference and Local Nonlinearity in Consensus Decision Making
Homophily Based on Few Attributes Can Impede Structural Balance
Emergent Field-Driven Robot Swarm States
Quantifying social group evolution
A Kinetic View of Statistical Physics
Theory of molecular size distribution and gel formation in branched-chain polymers
Collisions and gravitational reaccumulation: Forming asteroid families and satellites
Principles of polymer chemistry
A deterministic model of competitive cluster growth: glassy dynamics, metastability and pattern formation
Coagulation equations with gelation
Gelation in coagulating systems
Exact Stochastic Simulation of Coupled Chemical Reactions
A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Reactions
Intergroup aggression in chimpanzees and war in nomadic hunter-gatherers
The COVID-19 social media infodemic
Thanks for your interest in our Facebook group, but it's only for dads: social roles of stay-at-home dads
Mothers' Perceptions of the Internet and Social Media as Sources of Parenting and Health Information: Qualitative Study
Differences Between Mothers and Fathers of Young Children in Their Use of the Internet to Support Healthy Family Lifestyle Behaviors: Cross-Sectional Study
New online ecology of adversarial aggregates: ISIS and beyond
Multiscale dynamical network mechanisms underlying aging from birth to death
Hidden resilience and adaptive dynamics of the global online hate ecology
Hidden order in online extremism and its disruption by nudging collective chemistry
Homophily, cultural drift, and the co-evolution of cultural groups
Capturing the Production of Innovative Ideas: An Online Social Network Experiment and 'Idea Geography' Visualization