key: cord-0221075-d63zwmjw authors: Park, Jaehyuk; State, Bogdan; Bhole, Monica; Bailey, Michael C.; Ahn, Yong-Yeol title: People, Places, and Ties: Landscape of social places and their social network structures date: 2021-01-12 journal: nan DOI: nan sha: e24adfa67d3fb8f8be3d68e8321690af8e27c621 doc_id: 221075 cord_uid: d63zwmjw
Due to their essential role as places for socialization, "third places" (social places where people casually visit and communicate with friends and neighbors) have been studied by a wide range of fields including network science, sociology, geography, urban planning, and regional studies. However, the lack of a large-scale census on third places has kept researchers from systematic investigations. Here we provide a systematic nationwide investigation of third places and their social networks, by using Facebook pages. Our analysis reveals a large degree of geographic heterogeneity in the distribution of the types of third places, which is highly correlated with baseline demographics and county characteristics. Certain types of pages like "Places of Worship" demonstrate a large degree of clustering, suggesting community preference or potential complementarities to concentration. We also found that the social networks of different types of social place differ in important ways: the social networks of 'Restaurants' and 'Indoor Recreation' pages are more likely to be tight-knit communities of pre-existing friendships, whereas 'Places of Worship' and 'Community Amenities' page categories are more likely to bridge new friendship ties. We believe that this study can serve as an important milestone for future studies on the systematic comparative study of social spaces and their social relationships.
We address this challenge by investigating the landscape of social places and their friendship networks geographically and demographically across the United States.
We use nationwide, deidentified, and aggregated data from Facebook Pages to measure the distribution of various third places, which allows us to present a systematic perspective of social spaces in the US. Furthermore, we use the social networks of Facebook followers of third place establishments to examine the multifaceted interaction between third places and social lives.
Representativeness of Facebook pages
Facebook friendships provide a meaningful representation of people's social ties. In the United States, the Facebook usage rate is not only high (69% of adults), but also relatively constant across income groups, education groups, and racial groups among online US adults 25. Furthermore, previous studies of online social networks have found a significant association between self-reported friendships and Facebook friendships [26] [27] [28]. However, while the representativeness of Facebook friendships for real social relationships has been validated by previous studies and surveys, the representativeness of Facebook Pages for offline third places across the US has not been thoroughly examined. To evaluate the representativeness of Facebook Pages, we compare the number of Facebook social place pages to the number of third place establishments observed in the County Business Patterns (CBP) dataset (See Methods). Note that the CBP dataset covers only a fraction of social spaces and thus falls short for our systematic investigation.
Geographic and Demographic Landscape of Social Places
Facebook Pages provide a means for individuals to connect over shared interests in hobbies, causes, businesses, or celebrities, to mention but a few of the categories that form the subjects of Facebook pages. Facebook Pages also include physical locations, which are of local interest. These pages on local places are arguably well aligned with the concept of "third places." Using an existing list of third place categories created by Jeffres et al.
29, we systematically identify (See Methods) the pages of social places which connect people locally and may function as "third places." The maps of social place distribution show significant geographic heterogeneity between third place categories. 'Bars and Pubs' and 'Clubs and Societies' are more common in
Community characteristics and demographics play a large role in which establishments are created and persist. To better understand the relationship between demographics and the prevalence of third places, we look at four characteristics and how they relate to third place prevalence: urbanization, income, education, and foreign-born population. The results, shown in Figure 3, reveal that the prevalence of the four largest categories (Retail, Beauty, Restaurants, and Places of Worship) indeed varies with regional and demographic characteristics. Counties in the middle of the RUCC urbanization scale have the highest prevalence of retail stores, beauty shops, and restaurants (Figure 3
The prevalence of Places of Worship shows a stark decrease with increasing income levels. The counties in the lowest income decile tend to have about 2.5 times more places of worship per capita than their counterparts in the highest decile. By contrast, beauty places are generally more numerous in rich counties than in poor counties. Other categories, such as Community Amenities, Bars and Pubs, Retail, and Restaurants, show an inverted-U shape, where middle-class counties have more of those social places than either their richer or poorer counterparts.
we call "follower friendship network". This approach creates a user-to-user graph for each page, which allows us to analyze the structure of social relationships embedded in each social place. We first characterize the social network topologies around all third place categories by measuring various network features.
We sampled 2,500 pages at random from each of the twelve third place categories, restricted to pages with between 50 and 50,000 followers in order to exclude extreme pages. Then, we compute multiple topological statistics of the social network of each sampled page (See Methods for more information on the extraction process). In particular, following a previous study that characterized the social networks of US colleges 30, we compute the following 18 network measurements with respect to the user-to-user friendship graph connecting followers of each page: density, number of edges, number of nodes, average degree, average clustering coefficient 31, average degree assortativity 32, degree variance, average path length within the largest connected component, algebraic connectivity 33, modularity of the modularity-maximizing partition 34, and number of k-Cores 35 and k-Braces 36 for k ∈ {2, 4, 8, 16}. These measures, which examine various aspects of network structure, allow us to capture the fragmentation and diversity of social networks, which we expect to differ by type of social place. We first extract the 'most representative' network for each third place category and visualize them (See Figure 4). We find that social networks for Parks and Monuments and Outdoor Recreation are sparser and less centralized, while those for Clubs and Societies, Bars and Pubs, and Performing Arts venues are highly connected, having multiple cores. We then quantify the structural difference between third place categories by measuring network dissimilarity between each pair of third place categories. Here, we measure similarity between two categories as the difficulty of classification. If a classifier cannot separate two groups of samples easily, we consider that the graphs that produced them are similar. This approach of measuring similarity with a prediction task has been used in recent studies 30, 37, 38. For each pair of third place categories, we train a random forest classifier.
We train the classifier on the sampled social networks of the two categories, using the 18 aforementioned topological characteristics as the features (See Methods for detailed information). Across all possible pairs of the twelve categories, we obtain the cross-validated area under the curve (AUC) of the model for each pair of third place categories, as a measure of the distance between the categories. In other words, if the AUC is close to 0.5, the two categories cannot be easily distinguished and can be considered similar. On the other hand, if the AUC is close to 1.0, the two categories are easy to distinguish and can be considered different. The ROC-AUC distance matrix, as shown in Figure 5
Figure caption (fragment): The statistics are the top ten features by importance, which are presented in Supplementary Information.
Avg. Path Length, while not so regarding clustering measurements, such as Avg. Clustering, Degree Assortativity, and Density. At the same time, the third places in the 5-category cluster are distinguished from the others (Beauty, Large Sports Venues, and Retail) in terms of Variation of Degree, which is probably related to their core-periphery structure. It is noteworthy that the similarity of social networks between third place categories is not strongly correlated with the types of activity or behavior in the places. For instance, social activities and behaviors in Restaurants are more similar to those in Bars and Pubs than to those in Community Amenities. However, in terms of the topology of the friendship network, Bars and Pubs is closer to Community Amenities than to Restaurants, which is aligned with previous studies about the social role of pubs in rural areas 12, 17, 18. Hence, our results imply that it is essential to take cultural and environmental factors into account, beyond the type of activity, when studying third places.
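To make the AUC-as-distance reading used in this section concrete, the following is a minimal sketch of the AUC as a rank statistic: the probability that a randomly chosen sample from one category receives a higher classifier score than a randomly chosen sample from the other. The function name `auc` and the toy scores are illustrative assumptions, not part of the study's pipeline.

```python
def auc(scores_neg, scores_pos):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.

    Hypothetical helper for illustration only; it is equivalent to the
    area under the ROC curve for a binary classifier's scores.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Two categories whose scores never overlap are trivially separable.
print(auc([0.1, 0.2, 0.3], [0.7, 0.8, 0.9]))            # 1.0

# Two categories with identical score distributions are indistinguishable.
print(auc([0.2, 0.4, 0.6, 0.8], [0.2, 0.4, 0.6, 0.8]))  # 0.5
```

Under this reading, a pair of categories with AUC near 0.5 sits close together in the distance matrix, and a pair with AUC near 1.0 sits far apart.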
In this study, we present a systematic measurement of the prevalence of various third places across the United States, by leveraging the Facebook Pages dataset. Our results reveal differences in the geographical distribution of social places (e.g. places of worship in the South, and bars in the Midwest), as well as in their distribution with respect to demographic features, including the levels of urbanization, income, education, and foreign-born population. Our results also reveal that different kinds of third places draw upon (or facilitate) heterogeneous social network structures among the "followers" of their pages. For instance, Places of Worship tend to be associated with social networks that are highly clustered and feature short path lengths. Parks and Monuments, by contrast, have low clustering and longer path lengths, indicating more sparsely organized networks. Certain third place categories (e.g. Indoor Recreation and Clubs and Societies) have more similar social networks than others, and social networks systematically differ across categories. We expect that our study marks an important milestone towards the understanding of our social infrastructure and its role in our society by exploring a unique dataset that covers all major types of social spaces and spans a whole country. Since our findings are decidedly descriptive, more work is required to ascertain the extent to which certain kinds of third places also help create and maintain social ties, as well as the extent to which third places benefit from existing social networks. A dynamic view is thus called for in future research, examining the interplay between social ties and third places. There are several limitations of this study. First, even with the strong correlation between the prevalence of third place categories in Facebook Pages and county-level CBP statistics, we cannot completely rule out the potential existence of systematic biases in our dataset.
For instance, the delay between the actual opening (closing) of establishments and the creation (deletion) of their Facebook Pages may affect our findings. Second, given the existence of a digital divide, our observations may have been affected by the preferred social places of more online-friendly generations. For example, our ranking of the total number of Facebook pages (See Supplementary Information) is more consistent with the rankings presented in a previous survey based on three college town areas in Massachusetts 39 than with the social place ranking in another study based on a national telephone survey 29. Even if our study could pinpoint the existence of social places, it does not necessarily capture their size or usage. The number of places of worship may not reflect the number of people attending services, since the size of congregations can vary a great deal. Leveraging other datasets may address this issue in the future. During the last couple of decades, we have experienced a dramatic change in our social lives due to the emergence of a new type of social space: social media. As we become more familiar with online social places such as Facebook or Twitter, this new social space becomes more closely related to, and embedded in, the existing offline social places. As our favorite social places create their own websites and Facebook Pages to interact with customers, online and offline social places become more interdependent. Hence, studying online and offline interactions can help us to understand the similarities and differences between these different spaces. Furthermore, examining the changes in the landscape of social spaces (particularly with respect to the COVID-19 pandemic and the increasing interconnectedness of offline and online social lives), as well as studying the associations between the abundance of social places and the characteristics and dynamics of local communities, will be fruitful directions for future studies.
The County Business Patterns (CBP) dataset is created by surveying firms around the US and categorizing them into place categories by self-assessment, in order to estimate the number of establishments in each category. We manually matched a subset of social place categories with their corresponding North American Industry Classification System (NAICS) codes. For each matched category, we compare the estimated number of establishments in the CBP dataset to the number of Facebook pages in the category. The matching table between our social place categories and NAICS codes is presented in Supplementary Information. Since Facebook pages can fall into a number of categories, we create a data-driven taxonomy of Facebook Pages that represent social places. In doing so we made use of a dataset of over 6 million local pages with between 50 and 50,000 US followers, that is, people who clicked "like" or "follow" on the page to follow its posts and updates. In Facebook Pages, page administrators choose up to 3 page categories for their pages; for instance, a page may be identified as "AMERICAN RESTAURANT", "RESTAURANT", and "FOOD" at the same time. Using the set of page categories for each page, we trained a word2vec model 40, 41. For every broad category indicated by a previous work 29, we chose a Facebook Pages category that could reasonably represent the broader category. For instance, "AMERICAN RESTAURANT" was chosen for the "restaurants" category. The top 300 terms, ranked by their distance in the embedding space, were then examined for each category, with the research team filtering out categories that were judged not to fit the notion of third place. For instance, categories such as "COMPETITION" were excluded for not indicating a place, whereas "MEDICAL HEALTH" was excluded due to the ambiguous nature of the category, and so on. We also removed "SCHOOLS" from our consideration, as they represent workplaces ("second places" rather than "third places") for students.
Inspection of the data further led us to combine "Restaurants" and "Cafes" into a single category, as we did with Community Centers, Senior Centers, and Libraries. Finally, we opted for a different categorization of recreation venues
Altogether these "social place pages" account for 37.3% of all US local Facebook pages with between 50 and 50,000 fans. Based on this table, we then filtered and matched the local places in Facebook Pages belonging to each of the following twelve social place categories: places of worship, restaurants, bars, community amenities, performing arts, parks and monuments, indoor recreation, non-urban outdoor recreation, large sports venues, clubs and societies, retail, and beauty. We use the number of pages in a category per one thousand residents, based on the 2018 county-level population estimates by the US Census, which allows us to control for the population of US counties. Since each Facebook page has up to three categories, we adjust the weight of a page for counting by dividing one by the number of categories that the page has. For instance, if one page has two categories, 0.5 pages are counted for each of the two categories.
Demographic records of counties
As a proxy of the urbanization level of a county, we use the 2013 Rural-Urban Continuum Codes (RUCC) created by the Office of Management and Budget (OMB) of the United States. RUCC is a classification scheme that distinguishes metropolitan counties by the population size of their metro area, and nonmetropolitan counties by degree of urbanization and adjacency to a metro area. Under this scheme, US counties are coded from 1 (Counties in metro areas of 1 million population or more) to 9 (Completely rural or less than 2,500 urban population, not adjacent to a metro area), based on the population and adjacency to a metro area. For regional income, we utilized median household income records for each county from the 2018 American Community Survey (ACS).
Also, for education and foreign-born population, we use the number of people aged 25 or older who have higher than a college degree in each county and the proportion of foreign-born residents in each county, respectively, from the 2006-2010 ACS. To visualize demographic patterns in third place distributions, we compute the number of third places per 1000 residents for each county. We then compute the median for each RUCC code and for each decile in terms of the income, education, and foreign-born population proxies.
Extracting social graphs of Facebook Pages
For each of the 2,500 randomly sampled Facebook Pages in each of the twelve third place categories, we extract the friendship network of the users who are fans of the page. Then, we map the page to its corresponding third place category, and calculate the topological features of interest associated with the category. Finally, after mapping the topological features to each category, all network information is discarded. Since this study is only interested in the topological structure of each third place category, the entire process applies appropriate aggregation at each step, so that the results carry no personally identifiable information.
Measuring similarity between social networks
We trained 66 binary random forest classifiers, one for each possible pair of different third place categories. The classifiers were trained under 10-fold cross-validation, using the scikit-learn Python package 42, to distinguish between page follower networks coming from either one of the paired categories. Feature importance, averaged across all classifier runs, reveals average clustering, mean degree, and degree assortativity as the most discriminative features. The importance of all 18 features is presented in Supplementary Information.
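The pairwise classification procedure described above could be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's actual code: the feature matrices are synthetic stand-ins for the 18 topological statistics, the names `pairwise_auc` and `features_by_category` are hypothetical, the number of trees and the shuffled stratified folds are arbitrary choices, and feature importance is taken from a refit on all samples of each pair (the text does not specify how importances were extracted).

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def pairwise_auc(features_by_category, n_splits=10, seed=0):
    """Return ({(cat_a, cat_b): mean CV AUC}, importances averaged over pairs)."""
    aucs, importances = {}, []
    for cat_a, cat_b in combinations(sorted(features_by_category), 2):
        xa, xb = features_by_category[cat_a], features_by_category[cat_b]
        X = np.vstack([xa, xb])
        y = np.r_[np.zeros(len(xa)), np.ones(len(xb))]  # 0 = cat_a, 1 = cat_b
        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        # Cross-validated ROC-AUC: near 0.5 = similar, near 1.0 = different.
        aucs[(cat_a, cat_b)] = cross_val_score(
            clf, X, y, cv=cv, scoring="roc_auc"
        ).mean()
        # Assumption: importances come from one refit on the full pair.
        importances.append(clf.fit(X, y).feature_importances_)
    return aucs, np.mean(importances, axis=0)


# Synthetic example: 3 hypothetical categories, 60 pages each, 18 features.
rng = np.random.default_rng(1)
toy = {
    "A": rng.normal(0.0, 1.0, size=(60, 18)),
    "B": rng.normal(0.0, 1.0, size=(60, 18)),  # same distribution as A
    "C": rng.normal(1.5, 1.0, size=(60, 18)),  # shifted away from A and B
}
aucs, mean_importance = pairwise_auc(toy)
for pair, value in aucs.items():
    print(pair, round(value, 3))
```

With twelve categories, the same loop visits 12 choose 2 = 66 pairs, matching the number of classifiers reported above.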
This feature-importance ordering suggests that the classifier is picking up on non-trivial structural differences in the make-up of page follower networks, rather than simply focusing on the number of followers (10/18 in terms of average importance) or density (4/18).
1. The great good place: Cafes, coffee shops, bookstores, bars, hair salons, and other hangouts at the heart of a community
2. Fighting a McDonald's in Queens for the right to sit. And sit. And sit
3. The death and life of great American cities (Random House)
4. The social life of small urban spaces (Conservation Foundation)
5. The production of space
6. Bowling alone: The collapse and revival of American community
7. A space for place in sociology
8. Friendship and mobility: User movement in location-based social networks
9. The role of space in the formation of social ties
10. Segregated interactions in urban and online space
11. The amenity mix of urban neighborhoods
12. "Third places" and social interaction in deprived neighbourhoods in Great Britain
13. Change in the social life of urban public spaces: The rise of mobile phones and women, and the decline of aloneness over 30 years
14. Housing layout, social interaction, and the place of contact in Abu-Nuseir, Jordan
15. Testing the claims of new urbanism: Local access, pedestrian travel, and neighboring behaviors
16. Unanticipated gains: Origins of network inequality in everyday life
17. Industrial and provident societies and village pubs: Exploring community cohesion in rural Britain
18. How third places foster and shape community cohesion, economic development and social capital: The case of pubs in rural Ireland
19. Neighborhood social processes, physical conditions, and disaster-related mortality: The case of the 1995 Chicago heat wave
20. Social capital: Dealing with community emergencies. Homeland Security Affairs
21. Church-based social capital, networks and geographical scale: Katrina evacuation, relocation, and recovery in a New Orleans Vietnamese American community
22. Heat wave: A social autopsy of disaster in Chicago
23. sustainability and neighbourhood regeneration
24. Homes, cities and neighbourhoods: Planning and the residential landscapes of modern Britain
25. Social media update 2014. Pew Research Center
26. Predicting tie strength with social media
27. Inferring tie strength from online directed behavior
28. Social networking sites and our lives. Pew Internet and American Life Project
29. The impact of third places on community quality of life
30. The structure of US college networks on Facebook
31. Collective dynamics of 'small-world' networks
32. Mixing patterns in networks
33. Algebraic connectivity of graphs
34. Finding community structure in very large networks
35. Cambridge studies in advanced mathematics
36. Structural diversity in social contagion
37. Measuring group differences in high-dimensional choices: Method and application to congressional speech
38. Coming apart? Cultural distances in the United States over time
39. Third places and the social life of streets
40. Distributed representations of words and phrases and their compositionality
41. Software framework for topic modelling with large corpora
42. Scikit-learn: Machine learning in Python