key: cord-103108-vmze2mdx authors: Vanheer, Lotte; Schiavo, Andrea Alex; Van Haele, Matthias; Haesen, Tine; Janiszewski, Adrian; Chappell, Joel; Roskams, Tania; Cnop, Miriam; Pasque, Vincent title: Revealing the Key Regulators of Cell Identity in the Human Adult Pancreas date: 2020-09-25 journal: bioRxiv DOI: 10.1101/2020.09.23.310094 sha: doc_id: 103108 cord_uid: vmze2mdx Cellular identity during development is under the control of transcription factors that form gene regulatory networks. However, the transcription factors and gene regulatory networks underlying cellular identity in the human adult pancreas remain largely unexplored. Here, we integrate multiple single-cell RNA sequencing datasets of the human adult pancreas, totaling 7393 cells, and comprehensively reconstruct gene regulatory networks. We show that a network of 142 transcription factors forms distinct regulatory modules that characterize pancreatic cell types. We present evidence that our approach identifies key regulators of cell identity in the human adult pancreas. We predict that HEYL and JUND are active in acinar and alpha cells, respectively, and show that these proteins are present in the human adult pancreas as well as in human induced pluripotent stem cell-derived pancreatic cells. The comprehensive gene regulatory network atlas can be explored interactively online. We anticipate our analysis to be the starting point for a more sophisticated dissection of how transcription factors regulate cell identity in the human adult pancreas. Furthermore, given that transcription factors are major regulators of embryo development and are often perturbed in diseases, a comprehensive understanding of how transcription factors work will be relevant in development and disease biology. HIGHLIGHTS - Reconstruction of gene regulatory networks for human adult pancreatic cell types - An interactive resource to explore and visualize gene expression and regulatory states - Predicting putative transcription factors driving pancreatic cell identity - HEYL and JUND as candidate regulators of acinar and alpha cell identity, respectively A fundamental question in biology is how a single genome gives rise to the great diversity of cell types that make up organs and tissues. A key goal is to map all cell types of developing and mature organs such as the pancreas, an essential organ at the basis of multiple human disorders including diabetes and cancer (Kahn, Cooper and Del Prato, 2014; Kamisawa et al., 2016; Han et al., 2020) . Single-cell RNA sequencing (scRNA-seq) provides a powerful tool to resolve cellular heterogeneity, identify cell types and capture highresolution snapshots of gene expression in individual cells (Grün et al., 2016) . With the advent of singlecell transcriptomics, great progress has been made toward the creation of a reference cell atlas of the pancreas (Baron et al., 2016; Segerstolpe et al., 2016; Wang et al., 2016; Xin et al., 2016; Enge et al., 2017; Lawlor et al., 2017; Augsornworawat and Millman, 2020; Han et al., 2020) . Work from several groups provided cellular atlases of the pancreas during mouse development (Stanescu et al., 2017; Byrnes et al., 2018; Scavuzzo et al., 2018) , in adult mice (Muraro et al., 2016) and in human fetal and adult pancreas (Liu et al., 2014; Baron et al., 2016; Muraro et al., 2016; Segerstolpe et al., 2016; Wang et al., 2016; Enge et al., 2017; Han et al., 2020) . Efforts have also been made to map cellular identity during pancreas development starting from human pluripotent stem cells Hrvatin et al., 2014; Zhu et al., 2016; Han et al., 2020; Hogrebe et al., 2020; Peterson et al., 2020) . Taken together, these studies provide an opportunity to better understand the maintenance and establishment of cellular identity among different pancreatic cell types. Work over the past decades indicated that cellular identity is established by combinations of transcription factors (TFs) that recognize and interact with cis-regulatory elements in the genome. These transcription factors, together with chromatin modifiers, give rise to gene expression programs. A small number of core TFs are thought to be sufficient for the establishment and maintenance of gene expression programs that define cellular identity during and after development (Ohno, 1979) . Studies conducted in both mouse and human have successfully identified TFs that are pivotal for the acquisition and maintenance of pancreatic cell fates (Dassaye, Naidoo and Cerf, 2016) . These include PDX1 (Zhou et al., 2008; Shih et al., 2015) , MAFA (Nishimura, Takahashi and Yasuda, 2015) , NGN3 (Gradwohl et al., 2000) , NKX2.2 (Sussel et al., 1998) , PAX4 (Sosa-Pineda et al., 1997) , NKX6.1, NEUROD1 (Mastracci et al., 2013) , ARX (Collombat et al., 2003) , MAFB (Artner et al., 2006) , RFX6 (Smith et al., 2010) , GATA4 (Ketola et al., 2004) , FOXA2 and SOX9 (Shroff et al., 2014; Shih et al., 2015) . Conditional deletion of TFs such as FOXA2 and PDX1 in adult beta cells results in the loss of cellular identity and function (Sund et al., 2001; Gao et al., 2014) . Genetic evidence for the role of these TFs in establishing human pancreatic cell identity is provided by the identification of TF loss-of-function mutations that cause pancreatic agenesis (Stoffers et al., 1997; Sellick et al., 2004; Allen et al., 2012; Shaw-Smith et al., 2014; De Franco et al., 2019) or neonatal or young-onset diabetes (Senée et al., 2006; Solomon et al., 2009; Rubio-Cabezas et al., 2010 Smith et al., 2010; Bonnefond et al., 2013; Flanagan et al., 2014) . In addition, TF overexpression can reprogram somatic cells to adopt alternative identities (Takahashi and Yamanaka, 2006; Zhou et al., 2008; Vierbuchen et al., 2010; Lima et al., 2016) . For example, the induced expression of Ngn3, Pdx1, and Mafa was shown to reprogram mouse alpha cells into beta-like cells in vivo (Zhou et al., 2008) . However, how key TFs underlie the maintenance of cellular identity in the human pancreas remains incompletely understood. Over the past decade, multiple approaches to reconstruct gene regulatory networks (GRNs) from bulk and single-cell omics data have been developed (Ghazanfar et al., 2016; Lim et al., 2016; Matsumoto et al., 2017; Fiers et al., 2018) . In particular, it is now possible to combine single-cell transcriptomic data with cisregulatory information to infer GRNs (Janky et al., 2014; Aibar et al., 2017; Van de Sande et al., 2020) . Because TFs recognize DNA motifs in the genome, one can measure if inferred target genes are expressed within single cells, and therefore quantify the activity of TFs. Such approaches have revealed the regulatory programs in distinct systems including the Drosophila brain (Davie et al , 2018) , cancer (Wouters et al., 2020) , during early mouse embryonic development (Peng et al , 2019) , in a mouse cell atlas (Suo et al., 2018 ) and a human cell atlas (Han et al., 2020) . Analysis of GRNs in the human adult pancreas has identified distinct endocrine and exocrine regulatory states with multiple stable cell states for alpha, beta and ductal cells (Kumar and Vinod, 2019) . Type 2 diabetes and body mass index (BMI) were shown not to impact GRN activity of alpha and beta cells (Kumar and Vinod, 2019) . Previous data shows that type 2 diabetic (Faerch et al., 2015; Dennis et al., 2019; Dybala and Hara, 2019; Zaharia et al., 2019) and non-diabetic human islet preparations vary greatly depending on age (Enge et al., 2017; Westacott et al., 2017) and BMI (Henquin, 2018) warranting the exploration of GRNs in larger cohorts. Hence, it remains unclear whether previous GRN findings can be extrapolated to a broader, highly heterogeneous non-diabetic and type 2 diabetes patient population. The development of integration methods provides an opportunity to analyze multiple scRNA-seq studies from multiple laboratories and patients (Butler et al., 2018; Luecken et al., 2020) . Additional knowledge on how GRNs maintain cellular identity in the human adult pancreas may further the understanding of disease states as well as improve efforts to convert patient cells into functional, mature beta cells for diabetes treatment. Here, we build an integrated human pancreas gene regulatory atlas. In this resource, we use single-cell transcriptomes of the human adult pancreas, taking advantage of integration strategies and computational tools to reconstruct GRNs. Our analysis identifies the GRN landscape and candidate regulators that are critical for cellular identity in the human adult pancreas. Integrating multiple human adult pancreas scRNA-seq datasets can further improve the power of scRNAseq analyses to create a human adult pancreas cell atlas. We set out to analyse and integrate five publicly available datasets covering a total of 35 non-diabetic, 15 type 2 diabetic and 1 type 1 diabetic individuals using Seurat v3.0 CCA integration tools ( Figure 1A , detailed donor information can be found in Table S1 ) (Segerstolpe et al., 2016; Wang et al., 2016; Xin et al., 2016; Enge et al., 2017; Lawlor et al., 2017; Hafemeister and Satija, 2019; Stuart et al., 2019) . After filtering out low quality transcriptomes and data integration, uniform manifold approximation and projection for dimension reduction (UMAP) visualisation revealed that 7393 cells localize into distinct clusters ( Figure 1B) . Cells from each original dataset localize together suggesting that the location of cells on the UMAP is not driven by the dataset of origin ( Figure 1C ). We next sought to identify pancreatic cell types ( Figure 1D ). Clustering analyses based on the expression of well-established cell type specific markers led to the identification of eight cell types in the human adult pancreas: beta, alpha, gamma, delta, acinar, ductal, stellate and endothelial cells ( Figure 1B , Table S2 ). UMAP visualization allowed for the segregation of endocrine, exocrine and other lineages ( Figure S1A ). Beta cells grouped together, away from other clusters and were marked by INS expression (Figure 1E ). Other distinct clusters corresponded to alpha, gamma and delta cells based on global transcriptional similarity and elevated expression of GCG, PPY and SST, respectively, and other markers ( Figure 1E , Table S2 ). Using a similar approach, we reliably detected other, previously described, major pancreatic cell types (acinar, ductal, endothelial and stellate, Figure 1B ). All cell types were detected in both non-diabetic and type 2 diabetic pancreases ( Figure 1F ). An additional four rare cell populations, that cannot be robustly identified through clustering analyses, were identified manually by assessing the expression of GHRL (epsilon cells), TPS1AB (schwann cells), CD86 (mast cells) and SOX10 (Major histocompatibility complex (MHC) class 2 cells) (Wierup et al., 2002; Segerstolpe et al., 2016) (Figure 1G ). These rare cell types often cluster with other common cell types. Importantly, our annotation recapitulated previous annotations to a large extent (Figure S1B-C). In summary, we reconstructed an integrated single-cell atlas of the human adult pancreas, and annotated 12 pancreatic cell types. Next, we set out to comprehensively reconstruct GRNs for all pancreatic cell types from single-cell transcriptomic data, applying single-cell regulatory network inference and clustering (pySCENIC) (Aibar et al., 2017; Van de Sande et al., 2020) . PySCENIC links cis-regulatory sequence information together with single-cell transcriptomes in three sequential steps by 1) co-expression analysis, 2) target gene motif and ChIP-seq track enrichment analysis, and 3) regulon activity evaluation (Figure 2A) . Each regulon consists of a TF with its predicted target genes (co-expressed genes with an enriched TF motif), altogether forming a regulon. pySCENIC identified 142 regulons that characterize the GRNs of the human adult pancreas 5 ( Figure 2B /C, Table S3 ). Multiple regulons identified here as active in the pancreas correspond to TF binding motifs enriched in accessible chromatin in the pancreas (assessed by ATAC-seq in FACS-purified pancreatic cells, (Arda et al., 2018) ), supporting the validity of the approach ( Figure S2A ). UMAP visualization based on the activity of 142 regulons in non-diabetic and type 2 diabetic pancreata revealed groups of cells that differ from one another based on their regulatory activity ( Figure 2B -C). In particular, there are distinct regulatory states for exocrine and endocrine pancreatic lineages, stellate and endothelial cells (Figure 2B /C). Endocrine cell types clustered together, indicating shared regulatory states, while exocrine cell types formed two distinct clusters. Stellate and endothelial cells differed most from other cell types in their regulatory states. These results are consistent with previous analyses (Baron, Veres, Samuel L. Wolock, et al., 2016; Lawlor et al., 2017; Kumar and Vinod, 2019) and are also in line with our findings above based on gene expression analysis ( Figure 1D ). As expected, regulons active in endocrine cell types include RFX6, PAX6 and NEUROD1 ( Figure 2D -G). These TFs have reported roles in endocrine cell fate commitment and maintenance of cell identity throughout adult life (Smith et al., 2010; Hart et al., 2013; Mastracci et al., 2013) . Using iRegulon for visualization, many of the NEUROD1 target genes identified here have been previously linked to beta cell survival and function, such as SNAP25, TSPAN2, (Gierl et al., 2006; Haumaitre et al., 2008; Hart et al., 2013; Churchill et al., 2017) . Clustering all cells based on the activity of all regulons identifies regulatory modules ( Figure 2D , red squares). In the exocrine pancreas, one regulatory module, containing NR5A2, was shared between acinar and ductal cells, although with a tendency for increased regulon activity in ductal cells ( Figure 2D /H). Other exocrine regulons included ONECUT1, REST and HNF1B with reported roles in exocrine development (Nissim et al., 2016; Kropp, Zhu and Gannon, 2019) and the adult exocrine pancreas (Quilichini et al., 2019; Bray et al., 2020 ) ( Figure 2D ). In summary, this analysis confirms the expected separation of exocrine and endocrine cells with distinct gene regulatory programs, and identifies known and novel candidate regulators of pancreas cell states. Several regulatory modules are shared between different cell types within the endocrine and exocrine pancreas. Additionally, each cell type is defined by cell-type specific regulatory modules ( Figure 1D ). In the endocrine pancreas, alpha and beta cells shared endocrine regulons (MAFB, MEIS2), whereas we observed distinct activities for ARX and IRX2 regulons in alpha cells and RXRG and PDX1 in beta cells ( Figure 2D /H), expanding previous findings (Kumar and Vinod, 2019) . Using iRegulon for visualization, PDX1 target genes include SLC6A17, PDIA6 and ABHD3, which have been reported to control insulin release (Thomas, Brown and Brown, 2014; Eletto et al., 2016; Rorsman and Ashcroft, 2018 ) ( Figure S2C ). Interestingly, gamma and delta cells overlapped with alpha and beta cells, respectively, suggesting a shared regulatory state ( Figure 2C/D) . This includes shared regulon activity for ARX in gamma and alpha and PDX1 in beta and delta cells ( Figure 2D /H), consistent with their reported expression in published scRNA-seq studies (Baron et al., 2016; Segerstolpe et al., 2016; Lawlor et al., 2017) . GATA4 and RBPJL, known acinar-specific TFs, were highly active in acinar cells (Masui et al., 2010; Carrasco et al., 2012) ( Figure 2H) . Similarly, ductal cells were characterised by highly active SOX9 and POU2F3 regulons, in line with previous literature (Shroff et al., 2014; Yamashita et al., 2017) (Figure 2H ). In sum, this analysis confirms that alpha, beta, acinar and ductal cells are defined by the activity of distinct TF combinations that form gene regulatory modules. In conclusion, the network approach recovers many of the expected regulators of pancreatic cellular identity allowing for the comprehensive characterisation of the gene regulatory state of all major human adult pancreatic cell types. A comprehensive network analysis provides an opportunity to predict and identify critical regulators of cell identity. To identify regulons with highly cell type-specific activities within the human adult non-diabetic pancreas, we calculated regulon specificity scores (RSS) (a complete list of RSSs can be found in Table S4 ) (Suo et al., 2018) . The RSS utilises Jensen-Shannon divergence to measure the similarity between the probability distribution of the regulon's enrichment score and cell type annotation wherein outliers receive a higher RSS and are therefore considered cell type-specific (Suo et al., 2018) . It can therefore be used to rank the activity of TFs within specific cell types. Among the top regulons identified in alpha cells, we recover well known regulators of alpha and endocrine cell fate such as ARX, IRX2, PAX6, MAFB, NEUROD1 and RFX6 ( Figure 3A /B) (Collombat et al., 2003; Artner et al., 2006; Delporte et al., 2008; Smith et al., 2010; Dorrell et al., 2011; Mastracci et al., 2013) . In addition, we identified JUND, EGR4, SREBF1 and STAT4, which have not yet been implicated in alpha cell identity. EGR1 (but not EGR4) has been shown to transcriptionally regulate glucagon expression (Leung- Theung-Long et al., 2005) as well as the PDX1 promoter in beta cells (Eto, Kaur and Thomas, 2007) . STAT4 and JUND have been described in pancreatic tissue in general and beta cells, respectively, but not in alpha cells (Yu and Kim, 2012; Good et al., 2019) (Figure 3B /C). These TFs respond to the JNK and EGFR signalling pathways and may have important physiological functions. Both JUND and the JUND/JNK signaling pathway have been implicated in pancreatic cancer (Shin et al., 2009; Recio-Boiles et al., 2016) . Immunocytochemistry of the human adult pancreas confirmed the presence of nuclear JUND in islets ( Figure 3Di ). We also detected nuclear JUND protein in a subset of human induced pluripotent stem cells (iPSCs) subjected to beta cell differentiation (Figure 3E/F) . Surprisingly, we also detected JUND protein in ductal cells, despite lower JUND regulon activity in this cell type (Figures 3C, 3Dii) . Thus, protein 7 expression does not always predict regulatory activity. Nevertheless, these results show that JUND is present and active in a subset of pancreatic cell types in the human adult pancreas and human pluripotent stem cell derived islet cells. Altogether, this analysis predicts TFs active in human alpha cells, recovering known as well as new candidate TFs. Among the top regulons identified in beta cells, we retrieved well-known as well as new candidate regulators of beta and endocrine cell identity. Known TFs include RXRG, PDX1, NEUROD1, PAX6 and RFX6 (Zhou et al., 2008; Miyazaki et al., 2010; Smith et al., 2010; Mastracci and Sussel, 2012; Hart et al., 2013) (Figure 3G /H). In addition, we found that ZNF705D, ASCL2 and HOXD13 were highly ranked regulons ( Figure 3H /I). HOXD13 and BHLHE41 have been shown to be present in the exocrine pancreas (Cantile et al., 2009; Sato et al., 2012) . Interestingly, ASCL2 has been reported to interact with β-catenin of the Wnt pathway, the latter has an established role in endocrine fate specification during in vitro differentiation (Schuijers et al., 2015; Sharon et al., 2019; Vethe et al., 2019) . Many putative target genes of ASCL2 including PDX1, INS, ABCC8, FOXA1, KCNK16, FXYD2 are directly related to glucose sensing and beta cell identity, in line with the beta cell-specific regulatory activity of ASCL2 ( Figure S3A ) (Gao et al., 2010; Arystarkhova et al., 2013; Vierra et al., 2015; Park, Lee and Park, 2016) . FXYD2γa, a regulatory subunit of the Na + -K + -ATPase, is a transcript exclusively expressed in human beta cells (Flamez et al., 2010) . Immunohistochemistry of human adult pancreas sections showed that ASCL2 is expressed in INS + beta and islet cells ( Figure 3J) . Surprisingly, ASCL2 was mainly localized to the cytoplasm ( Figure 3J) , which is unexpected for TFs which tend to localize to the nucleus (Baranek, Sock and Wegner, 2005) . Cytoplasmic localization of ASCL2 has been reported in the context of colon and breast cancer (Zhu et al., 2012; Xu et al., 2017) . These results implicate additional TFs including ASCL2 in the regulation of beta cell identity. They also illustrate the value of network analyses to increase our understanding of the biology of the human pancreas. In summary, GRN analysis and regulon ranking allowed us to pinpoint both known and novel candidate regulators of pancreatic endocrine cell identity, providing a resource for further investigation of their roles in cellular identity and function. Similarly, the comprehensive network analysis provides an opportunity to predict and identify critical regulators of exocrine cell identity. We also identify known and new TFs in acinar cells. Among the top acinar-specific regulons, we recovered well known regulators of acinar and exocrine cell identity such as PTF1A, RBPJL, GATA4 and NR5A2 (Ketola et al., 2004; Masui et al., 2010; Nissim et al., 2016; Sakikubo et al., 2018) (Figure 4A/B) . These findings are in line with a recent study that used single-nucleus RNA-seq on pancreatic acinar tissue (Tosti 8 et al., 2019) . Furthermore, we identified MECOM, HEYL and TGIF1 as highly ranked regulons ( Figure 4B /C). Interestingly, aberrant Mecom expression has been linked to the induction of gastric genes in acinar cells, which disrupts acinar cell identity and increases susceptibility to malignancy (Hoang et al., 2016) . The loss of TGIF1 has been linked to pancreatic ductal adenocarcinoma progression making further exploration of these regulons interesting in the context of cancer biology (Weng et al., 2019) . Ectopic expression of Tgif2 (but not Tgif1) reprograms mouse liver cells towards a pancreas progenitor state (Cerdá-Esteban et al., 2017) . HEYL is a reported Notch signalling target gene in NGN3 + exocrine cells (Gomez et al, 2015) . We confirmed nuclear expression of HEYL in human acinar and islet cells (Figure 4Di /ii, detailed donor information can be found in Table S1 ) by immunohistochemistry, in agreement with elevated HEYL regulon activity in acinar cells ( Figure 3C ). further functional studies in the healthy pancreatic context (Matsumoto et al., 2004; Coleman et al., 2013) . In summary, GRN analysis and regulon ranking allowed us to pinpoint both known and novel candidate regulators of pancreatic exocrine cell identity. Specifically, we identified HEYL as a candidate TF that might be important for acinar cell identity, warranting further investigation. To enable users to easily navigate the human pancreatic cell network atlas, we provide a loom file that allows for the visualisation and exploration of the data using the web-based portal SCope (Davie et al., 2018) (.loom file and tutorial available at http://scope.aertslab.org/#/PancreasAtlas/*/welcome and https://github.com/pasquelab/scPancreasAtlas). Features such as cell type annotation as defined in this paper, gene expression and regulon activity can be explored on the regulon and gene expression based UMAP. This resource enables users to select and visualize up to three genes or regulons simultaneously and select subsets of cells for downstream analyses. For example, the expression of COVID-19 related genes can be interactively explored (Yang et al., 2020) . Target genes of a specific regulon can be downloaded to facilitate further exploration, for example in iRegulon or gene ontology analysis (Janky et al., 2014) . A list of predicted target genes of all 142 regulons can also be found in Table S5 . Furthermore, a list of target genes can be manually defined to compute the activity of a custom regulon. This resource can be used to further study cell identity and gene regulation in the context of the pancreas, diabetes and cancer. In this resource, we take advantage of integration strategies and new computational tools to reconstruct an integrated cell and GRN atlas of the human adult pancreas from single-cell transcriptome data. This approach provides a comprehensive analysis of the gene regulatory logic underlying cellular identity in the human adult pancreas in a broad range of individuals, limiting the influence of inter-donor variability. We recovered known regulators of pancreatic cell identity and uncovered novel candidate regulators of cell identity that can be further investigated for their roles in cellular identity and function. By validating regulon analyses and creating an easily accessible interactive online resource which allows for the exploration of the gene regulatory state of 7393 cells from 51 individuals, this approach extends beyond previous gene regulatory studies in the human adult pancreas (Augsornworawat and Millman, 2020) . The present analysis identified regulators of pancreatic development, function and survival that are known to be critical in humans because loss-of-gene function causes pancreatic agenesis or young onset diabetes. For example, PTF1A (Sellick et al., 2004; Weedon et al., 2014) and GATA4 (Shaw-Smith et al., 2014) , whose loss of function are linked to pancreatic agenesis and neonatal diabetes, were among the top acinarspecific regulons (Figure 3 ). In addition, monogenic diabetes related genes PDX1 (Nicolino et al., 2010 ), NEUROD1 (Rubio-Cabezas et al., 2010 , PAX6 (Solomon et al., 2009) , RFX6 (Smith et al., 2010; Patel et al., 2017) and GLIS3 (Senée et al., 2006) were among the top beta cell-specific regulons (Figure 3 and Table S4 ). stress signalling (Table S4) . CREB3 and CREB3L2 are non-canonical ER stress transducers that are induced in human islets and clonal beta cells upon exposure to the saturated fatty acid palmitate (Cnop et al., 2014) . Interestingly, SREBF1 and -2 undergo similar ER exit and proteolytic processing in the Golgi as these ER stress transducers, but they do so in response to changes in ER cholesterol content; both also have high regulon activity in alpha and beta cells. XBP1 is abundantly expressed in the exocrine and endocrine pancreas (Cnop et al., 2017) , but the XBP1 regulon has its highest specificity in beta cells. ATF3 and ATF4 are TFs that are activated upon eIF2α phosphorylation, an ER stress response pathway to which no less than 5 monogenic forms of diabetes belong (Eizirik, Pasquali and Cnop, 2020) . Our data underscores the importance of these TFs for endocrine pancreatic cell identity. Given that we predict novel regulators of cell identity in the human pancreas, it will be interesting to also expand this analysis to pancreas embryonic development. Our work may also be beneficial in guiding improvements of and better understanding the in vitro derivation of pancreatic cell types. For example, the emergence of SST-positive cells together with beta-like cells at the end of in vitro differentiation could be explained by the overlap in regulatory states between beta and delta cells (Baron et al., 2016) . GRN analyses are particularly interesting for in vitro derived beta cells since a better understanding of the regulatory logic underlying control of beta cell fate may improve or facilitate future applications in regenerative medicine (Pagliuca et al., 2014; Rezania et al., 2014; Nostro et al., 2015; Russ et al., 2015; Baeyens et al., 2018) . Alternatively, many observed GRNs such as ASCL2, MECOM, PPARD, GATA6 and CDX2 are linked to pancreatic cancer making the additional exploration of GRNs interesting in the context of cancer biology (Matsumoto et al., 2004; Zhu et al., 2012; Coleman et al., 2013; Hoang et al., 2016; Xu et al., 2017; Weng et al., 2019; Brunton et al., 2020) . Finally, recent reports have stratified type 2 diabetes patients based on age at diagnosis, BMI, HbA1c and insulin secretion and sensitivity, and identified subtypes with different genetic predisposition, treatment response, disease progression and complication rates (Ahlqvist et al., 2018) . Hence, it would be interesting to assess differences in gene regulatory state and gene expression profiles of alpha and beta cells between different type 2 diabetic subgroups. It is important to note that pySCENIC is a stochastic algorithm that does not produce precisely the same regulons for repeated applications, limiting reproducibility when comparing different datasets (Huynh-Thu et al., 2010; Van de Sande et al., 2020) . To mitigate this uncertainty, we ran the full pySCENIC pipeline five times and only kept consistent regulons with the highest regulon activity. The performance of pySCENIC, and other GRN inference methods, suffers due to the large amount of drop-out events in scRNA-seq data warranting caution when interpreting results (Chen and Mar, 2018) . This could explain the absence of wellestablished pancreas TFs such as MAFA (Olbrot et al., 2002) , MNX1 (Flanagan et al., 2014) , NEUROG3 (Krentz et al., 2017) , FOXA2 (Lee et al., 2002) and NKX2-2 (Mastracci et al., 2011) in this analysis. Nevertheless, in support of the validity of our findings, ATAC-seq, literature and immunohistochemistry of human pancreas sections corroborate several pySCENIC predictions. Chen and colleagues underline the importance of using large sample sizes to derive the most accurate network inference possible (Chen and Mar, 2018) , highlighting the importance of dataset integration to increase the number of cells analysed. In the future, it will be interesting to extend our analyses to include many more cells and patients. In spite of current caveats, GRN analysis has enabled the capture of biological relevant information (Butte et al., 2000) . One additional limitation of this study is the assumption that all TFs bind their binding motifs in the promoters of expressed genes. However, TF binding can be restricted to a subset of TF motifs in the genome due to influence of chromatin processes including the presence of nucleosomes as well as DNA methylation. Therefore, additional approaches such as single cell multi-omics that capture additional layers of genome regulation will be helpful to increase our understanding of gene regulation in the context of the human pancreas. Taken together, our GRN atlas, containing 51 individuals, provides a valuable resource for future studies in the human pancreas development, donor variability, homeostasis and disease including type 2 diabetes 11 and pancreatic cancer. Finally, our results provide new insights into the activity of TFs and gene regulation in the human adult pancreas from a gene regulatory perspective. questions about data analysis should be directed to the Lead Contact, Vincent Pasque (vincent.pasque@kuleuven.be). The reviewer tokens for this GEO repository is cfsjciumfferjkd. Motif discovery of bulk ATAC-seq data Paired-end raw reads for bulk ATAC-seq (see Key Resource Table) were downloaded from SRA using SRA toolkit (v2.9.4). Reads were aligned and further analyzed using the ENCODE ATAC-seq pipeline with default parameters using the ENCODE human reference genome GRCh38.15 (Lee, 2016) . Bed files containing the global open chromatin landscape of Adult Alpha (Alpha_1; EA4 and Alpha_2; EA28), Beta (Beta_1; EA5 and Beta_2; EA29), Acinar (Acinar_1; EA7 and Acinar_2; EA27) and ductal (EA11) cells or cell type specific differentially accessible regions were used as input for motif discovery by HOMER (v4.10.4) using the 'findMotifsGenome.pl' with options using hg38 with size given (Heinz et al., 2010) . The TFs whose motifs identified by HOMER correspond with TFs identified by pySCENIC are visualized in Figure S2A . Analysis of publicly available scRNA-seq data Raw reads for five publicly available scRNA-seq datasets (see Key Resource Table) were downloaded from SRA using SRA toolkit (v2.9.4). Afterwards, reads were aligned to the human reference genome GRCh38.95 using STAR (v2.5.3a) with default parameters followed by the conversion to the coordinate sorted BAM format. Next, the featureCounts command from the "Rsubread" (v1.5.2) package in R (v3.6.1) was used to assign mapped reads to genomic features. Low quality transcriptomes with a mitochondrial contamination greater than 5% and less than 200 expressed genes per cell were excluded from subsequent analyses. The resulting raw count matrix was batch corrected using the FindIntegrationAnchors and IntegrateData functions from the "Seurat" package (v3.1.1) after which subsequent analyses were carried out in the R package "Seurat" (v3.1.4). Gene expression was used to cluster all 7393 cells with UMAP, using Seurat's function RunUMAP. Clusters for cell type annotation were defined using Seurat's shared nearest neighbour algorithm FindClusters function after which differential expression analysis was performed using Wilcoxon's rank sum test with a minimum cutoff of 0.25 average log fold change and min.pct of 0.25. pySCENIC GRNs were inferred using pySCENIC (python implementation of SCENIC, v0.9.15) in Python version 3.6.9 (Aibar et al., 2017) . Integrated read counts were used as input to run GENIE3 (Huynh-Thu et al., 2010) which is part of arboreto (v0.1.5). GRNs were subsequently inferred using pySCENIC with the hg38_refseq-r80 motif database and default settings. To control for the stochasticity, which is inherent to pySCENIC, a consensus GRN was generated by merging results from five repeat pySCENIC runs. If regulons were identified in multiple pySCENIC runs, only the regulon with the highest AUC value was retained. Regulon activity represented by AUCell values was used to cluster all 7393 cells with UMAP, using Seurat's RunUMAP function. All 142 regulons within non-diabetic cell types were visualized using the 'clustermap' function of the Python package "seaborn" (v0.9.0). The z-score for each regulon across all cells was calculated using the z-score parameter of the seaborn 'clustermap' function. Extended analysis of the target genes of specific regulons was conducted in Cytoscape (v3.7.1) using the iRegulon application (v1.3). The list of target genes of a specific regulon was downloaded from the loom file through the SCope platform (https://github.com/pasquelab/scPancreasAtlas) (Davie et al., 2018) . To quantify the cell-type specificity of a regulon, we utilized an entropy-based strategy as described previously (Suo et al., 2018) using the AUCell matrix as input in MATLAB R2019b. The top 10 most specific regulons were subsequently visualized using the R package ggplot2 (v3.1.1). The complete regulon ranking list is available in Table S2 . Control iPSC line HEL115.6 (Cosentino et al., 2018) was differentiated into beta cells using a previously published 7-step protocol (Cosentino et al., 2018) . At the end of the stage 4, cells were seeded into 24-well Aggrewell 400 microwell plates (Stem Cell Technologies) at a density of 0.9 ·10 6 cells per well after which differentiation was carried out as described previously (Cosentino et al., 2018) . Stage 7 differentiated beta cells were washed twice with PBS containing 0.5 mM EDTA and incubated in Accumax (SIGMA #A7089) for 8 min at 37 °C after which 50% volume of KnockOut Serum Replacement (ThermoFisher #10828028) was added to stop the reaction. After centrifugation at 400 g for 5 min at room temperature, cells were resuspended in 1 mL HAM's F-10 medium, supplemented as indicated before (Demine et al., 2020) . 75,000 cells in 500 μL medium were seeded per square of a Nunc Lab-Tek II ICC chamber (ThermoFisher). Immunohistochemistry analyses were carried out largely as described previously (Demine et al., 2020) , Immunohistochemistry analyses were carried out largely as described previously (Ceulemans et al., 2017) , using primary antibodies against the following proteins: ASCL2 (Merck, MAB4418, clone 7E2, 1/5000), JUND (Atlas Antibodies, HPA063029, 1/50), HEYL (Atlas Antibodies, HPA076960, 1/200) and INS (Agilent, IR002, 1/100). Pictures were taken using a Leica DMLB (Leica Microsystems). The integrated single-cell RNA-seq data and pySCENIC results can be explored interactively in SCope (Davie et al., 2018) . Loompy (v2.0.17) (Linnarsson Lab., 2015) was used to create the loom files which were uploaded to SCope. The embedding of the regulon and integrated gene expression based UMAP clustering, as seen in this article, were added to the loom file. This table is related to Figure 1 , 2, 3 and 4. This table is related to Figure 1 . This table is related to Figure 2 , 3 and 4. List of regulon specificity scores for all 142 regulons of non-diabetic alpha, beta, acinar and ductal cells. This table is related to Figure 3 and 4. List of putative target genes for each regulon. This table is related to Figure 2 , 3 and 4. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables', The Lancet Diabetes and Endocrinology SCENIC: single-cell regulatory network inference and clustering', Nature Methods Reduced Expression of PLCXD3 Associates With Disruption of Glucose Sensing and Insulin Signaling in Pancreatic β-Cells', Frontiers in Endocrinology GATA6 haploinsufficiency causes pancreatic agenesis in humans A Chromatin Basis for Cell Lineage and Disease Risk in the Human Pancreas An activator of the glucagon gene expressed in developing islet α-and β-cells Hyperplasia of pancreatic beta cells and improved glucose tolerance in mice deficient in the FXYD2 subunit of Na,K-ATPase Single-cell RNA sequencing for engineering and studying human islets', Current Opinion in Biomedical Engineering Re)generating Human Beta Cells: Status, Pitfalls, and Perspectives The POU protein Oct-6 is a nucleocytoplasmic shuttling protein A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter-and Intra-cell Population Structure.', Cell systems Transcription factor gene MNX1 is a novel cause of permanent neonatal diabetes in a consanguineous family Loss of RE-1 silencing transcription factor accelerates exocrine damage from pancreatic injury', Cell Death and Disease HNF4A and GATA6 Loss Reveals Therapeutically Actionable Subtypes in Pancreatic Cancer Integrating single-cell transcriptomic data across different conditions, technologies, and species', Nature Biotechnology Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks Lineage dynamics of murine pancreatic development at single-cell resolution', Nature Communications HOX D13 expression across 79 tumor tissue types GATA4 and GATA6 control mouse pancreas organogenesis Stepwise reprogramming of liver cells to a pancreas progenitor state by the transcriptional regulator Tgif2', Nature Communications RNA-sequencing-based comparative analysis of human hepatic progenitor cells and their niche from alcoholic steatohepatitis livers', Cell Death and Disease Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data Genetic evidence that Nkx2.2 acts primarily downstream of Neurog3 in pancreatic endocrine lineage development', eLife. eLife Sciences Publications Ltd RNA sequencing identifies dysregulation of the human pancreatic islet transcriptome by the saturated fatty acid palmitate', Diabetes Endoplasmic reticulum stress and eIF2α phosphorylation: The Achilles heel of pancreatic β cells Role of Peroxisome Proliferator-Activated Receptor β/δ and B-Cell Lymphoma Regulation of Genes Involved in Metastasis and Migration in Pancreatic Cancer Cells Opposing actions of Arx and Pax4 in endocrine pancreas development Pancreatic -cell tRNA hypomethylation and fragmentation link TRMT10A deficiency with diabetes SNAP-25b-deficiency increases insulin secretion and changes spatiotemporal profile of Ca 2+ oscillations in β cell networks', Scientific Reports A Single-Cell Transcriptome Atlas of the Aging Drosophila Brain Expression of zebrafish pax6b in pancreas is regulated by two enhancers containing highly conserved cis-elements bound by PDX1, PBX and PREP factors Pro-inflammatory cytokines induce cell death, inflammatory responses, and endoplasmic reticulum stress in human iPSC-derived beta cells Disease progression and treatment response in data-driven subgroups of type 2 diabetes compared with models based on simple clinical features: an analysis using clinical trial data', The Lancet Diabetes and Endocrinology Transcriptomes of the major human pancreatic cell types Heterogeneity of the human pancreatic islet', Diabetes Pancreatic β-cells in type 1 and type 2 diabetes mellitus: different pathways to failure PDIA6 regulates insulin secretion by selectively inhibiting the RIDD activity of IRE1 Single-Cell Analysis of Human Pancreas Reveals Transcriptional Signatures of Aging and Somatic Mutation Patterns Regulation of pancreas duodenum homeobox-1 expression by early growth response-1' Heterogeneity of Pre-diabetes and Type 2 Diabetes: Implications for Prediction, Prevention and Treatment Responsiveness Mapping gene regulatory networks from single-cell omics data', Briefings in Functional Genomics A genomic-based approach identifies FXYD domain containing ion transport regulator 2 (FXYD2)γa as a pancreatic beta cell-specific biomarker Analysis of transcription factors key for mouse pancreatic development establishes NKX2-2 and MNX1 mutations as causes of neonatal diabetes in man A Specific CNOT1 Mutation Results in a Novel Syndrome of Pancreatic Agenesis and Holoprosencephaly through Impaired Pancreatic and Neurological Development Foxa1 and Foxa2 maintain the metabolic and secretory features of the mature β-cell Pdx1 maintains β cell identity and function by repressing an α cell program Integrated single cell data analysis reveals cell specific networks and novel coactivation markers The Zinc-finger factor Insm1 (IA-1) is essential for the development of pancreatic β cells and intestinal endocrine cells Neurogenin 3 Expressing Cells in the Human Exocrine Pancreas Have the Capacity for Endocrine Cell Fate JUND regulates pancreatic β cell survival during metabolic stress', Molecular Metabolism neurogenin3 is required for the development of the four endocrine cell lineages of the pancreas De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression', bioRxiv. Cold Spring Harbor Laboratory Construction of a human cell landscape at single-cell level', Nature The Developmental Regulator Pax6 Is Essential for Maintenance of Islet Cell Function in the Adult Mouse Pancreas Histone deacetylase inhibitors modify pancreatic cell fate determination and amplify endocrine progenitors Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities Influence of organ donor attributes and preparation characteristics on the dynamics of insulin secretion in isolated human islets Transcriptional Maintenance of Pancreatic Acinar Identity, Differentiation, and Homeostasis by PTF1A', Molecular and Cellular Biology Targeting the cytoskeleton to direct pancreatic differentiation of human pluripotent stem cells', Nature Biotechnology Inferring Regulatory Networks from Expression Data Using Tree-Based Methods', PLoS ONE Tetraspanin-2 promotes glucotoxic apoptosis by regulating the JNK/β-catenin signaling pathway in human pancreatic β cells iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections Neuron-enriched RNA-binding proteins regulate pancreatic beta cell function and survival Pathophysiology and treatment of type 2 diabetes: Perspectives on the past, present, and future', The Lancet Pancreatic cancer', The Lancet Transcription factor GATA-6 is expressed in the endocrine and GATA-4 in the exocrine pancreas Research resource: The Pdx1 cistrome of pancreatic islets Phosphorylation of NEUROG3 Links Endocrine Differentiation to the Cell Cycle in Pancreatic Progenitors Regulation of the Pancreatic Exocrine Differentiation Program and Morphogenesis by Onecut 1/Hnf6', CMGH Single-cell transcriptomic analysis of pancreatic islets in health and type 2 diabetes Single-cell transcriptomes identify human islet cell signatures and reveal cell-typespecific expression changes in type 2 diabetes Foxa2 controls Pdx1 gene expression in pancreatic β-cells in vivo kundajelab/atac_dnase_pipelines: ATAC-seq and DNase-seq processing pipeline FOXA2 Is Required for Enhancer Priming during Pancreatic Differentiation Essential interaction of Egr-1 at an islet-specific response element for basal and gastrin-dependent glucagon gene transactivation in pancreatic α-cells BTR: training asynchronous Boolean models using single-cell expression data Generation of functional beta-like cells from human exocrine pancreas A METTL3-METTL14 complex mediates mammalian nuclear RNA Benchmarking atlas-level data integration in single-cell genomics', bioRxiv. Cold Spring Harbor Laboratory Nkx2.2 and Arx genetically interact to regulate pancreatic endocrine cell development and endocrine hormone expression Regulation of Neurod1 Contributes to the Lineage Potential of Neurogenin3+ Endocrine Precursor Cells in the Pancreas The endocrine pancreas: insights into development, differentiation, and diabetes Replacement of Rbpj With Rbpjl in the PTF1 Complex Controls the Final Maturation of Pancreatic Acinar Cells SCODE: An efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation Cdx2 expression in pancreatic tumors: Relationship with prognosis of invasive ductal carcinomas Nuclear hormone Retinoid X Receptor (RXR) negatively regulates the glucosestimulated insulin secretion of pancreatic β-cells Neurexin-1α contributes to insulin-containing secretory granule docking A Single-Cell Transcriptome Atlas of the Human Pancreas A novel hypomorphic PDX1 mutation responsible for permanent neonatal diabetes with subclinical exocrine deficiency', Diabetes MafA is critical for maintenance of the mature beta cell phenotype in mice Iterative use of nuclear receptor Nr5a2 regulates multiple stages of liver and pancreas development', Developmental Biology Efficient Generation of NKX6-1+ Pancreatic Progenitors from Multiple Human Pluripotent Stem Cell Lines The Number of Genes in the Mammalian Genome and the Need for Master Regulatory Genes Identification of β-cell-specific insulin gene transcription factor RIPE3b1 as mammalian MafA Generation of functional human pancreatic β cells in vitro A novel mutation of ABCC8 gene in a patient with diazoxideunresponsive congenital hyperinsulinism Chromatin stretch enhancer states drive cell-specific gene regulation and harbor human disease risk variants Heterozygous RFX6 protein truncating variants are associated with MODY with reduced penetrance', Nature Communications Molecular architecture of lineage allocation and tissue organization in early mouse embryo', Nature A method for the generation of human stem cell-derived alpha cells', Nature Communications Pancreatic Ductal Deletion of Hnf1b Disrupts Exocrine Homeostasis, Leads to Pancreatitis, and Facilitates Tumorigenesis', CMGH JNK pathway inhibition selectively primes pancreatic cancer stem cells to TRAIL-induced apoptosis without affecting the physiology of normal tissue resident stem cells Reversal of diabetes with insulin-producing cells derived in vitro from human pluripotent stem cells', Nature Biotechnology Pancreatic β-cell electrical activity and insulin secretion: Of mice and men Homozygous mutations in NEUROD1 are responsible for a novel syndrome of permanent neonatal diabetes and neurological abnormalities', Diabetes Permanent neonatal diabetes and enteric anendocrinosis associated with biallelic mutations in NEUROG3 Controlled induction of human pancreatic progenitors produces functional betalike cells in vitro Ptf1a inactivation in adult pancreatic acinar cells causes apoptosis through activation of the endoplasmic reticulum stress pathway', Scientific Reports A scalable SCENIC workflow for single-cell gene regulatory network analysis', Nature Protocols The basic helix-loop-helix transcription factor DEC2 inhibits TGF-β-induced tumor progression in human pancreatic cancer BxPC-3 cells Endocrine lineage biases arise in temporally distinct endocrine progenitors during pancreatic morphogenesis', Nature Communications Ascl2 acts as an R-spondin/wnt-responsive switch to control stemness in intestinal crypts Single-Cell Transcriptome Profiling of Human Pancreatic Islets in Health and Type 2 Diabetes Mutations in PTF1A cause pancreatic and cerebellar agenesis Mutations in GLIS3 are responsible for a rare syndrome with neonatal diabetes mellitus and congenital hypothyroidism Wnt Signaling Separates the Progenitor and Endocrine Compartments during Pancreas Development GATA4 mutations are a cause of neonatal and childhood-onset diabetes', Diabetes A Gene Regulatory Network Cooperatively Controlled by Pdx1 and Sox9 Governs Lineage Allocation of Foregut Progenitor Cells Activator protein-1 has an essential role in pancreatic cancer cells and is regulated 28 by a novel Akt-mediated mechanism', Molecular Cancer Research SOX9: A useful marker for pancreatic ductal lineage of pancreatic neoplasms Differentiated human stem cells resemble fetal, not adult, β cells Rfx6 directs islet formation and insulin production in mice and humans Compound heterozygosity for mutations in PAX6 in a patient with complex brain anomaly, neonatal diabetes mellitus, and microophthalmia The Pax4 gene is essential for differentiation of insulin-producing β cells in the mammalian pancreas Single cell transcriptomic profiling of mouse pancreatic progenitors Pancreatic agenesis attributable to a single nucleotide deletion in the human IPF1 gene coding sequence Comprehensive Integration of Single-Cell Data Tissue-specific deletion of Foxa2 in pancreatic β cells results in hyperinsulinemic hypoglycemia', Genes and Development Revealing the Critical Regulators of Cell Identity in the Mouse Cell Atlas Mice lacking the homeodomain transcription factor Nkx2.2 have diabetes due to arrested differentiation of pancreatic β cells Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors In vivo metabolite profiling as a means to identify uncharacterized lipase function: Recent success stories within the alpha beta hydrolase domain (ABHD) enzyme family', Biochimica et Biophysica Acta -Molecular and Cell Biology of Lipids Single nucleus RNA sequencing maps acinar cell states in a human pancreas cell atlas', bioRxiv. Cold Spring Harbor Laboratory The Effect of Wnt Pathway Modulators on Human iPSC-Derived Pancreatic Beta Cell Maturation', Frontiers in Endocrinology Direct conversion of fibroblasts to functional neurons by defined factors', Nature Type 2 diabetes-associated K+ channel TALK-1 modulates β-cell electrical excitability, second-phase insulin secretion, and glucose homeostasis', Diabetes Single-Cell Transcriptomics of the Human Endocrine Pancreas Recessive mutations in a distal PTF1A enhancer cause isolated pancreatic agenesis.', Nature genetics Loss of the transcriptional repressor TGIF1 results in enhanced Kras-driven development of pancreatic cancer', Molecular Cancer Age-dependent decline in the coordinated [Ca2+] and insulin secretory dynamics in human pancreatic islets', Diabetes The ghrelin cell: A novel developmentally regulated islet cell in the human pancreas Robust gene expression programs underlie recurrent cell states and phenotype switching in melanoma RNA Sequencing of Single Human Islet Cells Reveals Type 2 Diabetes Genes Elevated ASCL2 expression in breast cancer is associated with the poor prognosis of patients Skn-1a/Pou2f3 functions as a master regulator to generate Trpm5-expressing chemosensory cells in mice A Human Pluripotent Stem Cell-based Platform to Study SARS-CoV-2 Tropism and Model Virus Infection in Human Cells and Organoids Role of janus kinase/signal transducers and activators of transcription in the pathogenesis of pancreatitis and pancreatic cancer Risk of diabetes-associated diseases in subgroups of patients with recent-onset diabetes: a 5-year follow-up study', The Lancet Diabetes and Endocrinology In vivo reprogramming of adult pancreatic exocrine cells to β-cells' Ascl2 knockdown results in tumor growth arrest by mirna-302b-related inhibition of colon cancer progenitor cells Genome Editing of Lineage Determinants in Human Pluripotent Stem Cells Reveals Mechanisms of Pancreatic Development and Diabetes We thank Stein Aerts, Kristofer Davie and the Stein Aerts lab for discussions and creating the permanent SCope link, Shengbao Suo for sharing the MATLAB script for calculating the regulon specificity score, the