key: cord-0907310-d0h7q9oo
authors: de Jong, Auke W.; Francisco, Elaine C.; de Almeida, João Nóbrega; Brandão, Igor B.; Pereira, Felicidade M.; Dias, Pedro H. Presta; de Miranda Costa, Magda M.; de Souza Jordão, Regiane T.; Vu, Duong; Colombo, Arnaldo L.; Hagen, Ferry
title: Nanopore Genome Sequencing and Variant Analysis of the Susceptible Candida auris Strain L1537/2020, Salvador, Brazil
date: 2021-10-20
journal: Mycopathologia
DOI: 10.1007/s11046-021-00593-7
sha: 80ae9694c3ca13a84a3b5eee7e70eed1a7683e34
doc_id: 907310
cord_uid: d0h7q9oo

Candida auris has been reported worldwide, but it was not until December 2020 that the first strain from a COVID-19 patient in Brazil was isolated. Here, we describe the genome sequence of this susceptible C. auris strain and perform a variant analysis of its genetic relatedness with strains from other geographic localities.

in [4]) is part of the South Asia clade (clade I). Remarkably, the strain was susceptible to all antifungal classes [4]. Whole genome sequencing (WGS) was performed on strain L1537/2020 to enable in-depth analysis of its genetic relatedness with publicly available clade I strains. Additionally, WGS of this strain provides genome data of an antifungal-susceptible C. auris strain [4, 5].

The strain was cultured in 10 ml peptone glucose broth (LP0040; Oxoid, Basingstoke, UK) and incubated (125 rpm) at 25°C for 3 days. High-quality genomic DNA was extracted as described before [6], except that the final step with chloroform/isoamyl alcohol was replaced by column-based purification using the Fungi/Yeast Genomic DNA kit (catalog no. 27300; Norgen Biotek, Thorold, ON, Canada). DNA was eluted in 50 µl molecular grade 1× IDTE buffer (pH 8.0; IDT, Coralville, IA, USA).
Sequencing was performed on the Oxford Nanopore MinION platform (Oxford Nanopore Technologies, Oxfordshire, UK) using the ligation sequencing kit (SQK-LSK109) and native barcoding kit (EXP- [...]

Fig. 1, panel b: IQ-TREE-generated maximum-likelihood phylogenetic analysis using 420,164 SNPs, rooted with strain AR1097 (C. auris clade V). Numbers above branches represent branch lengths, interpreted as the number of nucleotide substitutions per nucleotide site. The heatmap z-scores and the IQ-TREE branch lengths show that clade I strains were very similar to each other; however, strain L1537/2020 had a longer branch length and a high number of SNPs (1893-2089) compared to other strains in clade I.

[...] genome assembly was performed using Flye v2.8.2-b1689 (https://github.com/fenderglass/Flye; [7]) with the parameters --nano-raw <fastq> --out-dir <directory> --genome-size 12.5m. The assembly resulted in 15 fragments with a total length of 12,687,478 bp (N50 of 2,134,410 bp; largest fragment 4,519,230 bp) and a mean coverage of 300X. However, when the 7 smaller contigs (range 490-4970 bp) were omitted, the mean coverage was 387X. The 8 large contigs (range 27,834-4,519,230 bp; total length 12,675,277 bp) included the circular mitochondrial genome (27,834 bp; 1000X coverage); the remaining nuclear contigs had a coverage of ~300X.

Variant calling was performed using a subset of 47 published genomes representing the five C. auris clades, including a benchmark set for clade I (Fig. 1; [3, 8-12]). Sequencing reads were aligned to reference genome B8441 (NCBI accession SRS1558430) using minimap2 for nanopore data and bwa v0.7.17-r1188 for Illumina data [13, 14]. SAM files were sorted and indexed using samtools v1.9 [15]. For nanopore data, longshot was run on the BAM file to call variants relative to the reference genome [16]. Indels and SNPs with a quality < 20 were removed using vcftools v0.1.15 [17].
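The quality filter applied to the nanopore variant calls can be illustrated with a short sketch. This is not the authors' vcftools command, which is not given in the text; it is a plain-Python stand-in that drops every VCF record whose QUAL field is below 20. The function name, the toy records, and the decision to discard records with a missing QUAL value are assumptions made for illustration only.

```python
def filter_vcf_by_qual(vcf_text, min_qual=20.0):
    """Keep header lines and records with QUAL >= min_qual.

    Assumption: records with a missing QUAL (".") are dropped.
    """
    kept = []
    for line in vcf_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            kept.append(line)  # meta-information and column header lines
            continue
        fields = line.split("\t")
        qual = fields[5]  # QUAL is the sixth tab-separated VCF column
        if qual == ".":
            continue  # missing quality, dropped (illustrative choice)
        if float(qual) >= min_qual:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical toy records; positions and alleles are invented.
TOY_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t15.0\t.\t.",
    "chr1\t250\t.\tC\tT\t20.0\t.\t.",
    "chr1\t400\t.\tG\tA\t99.5\t.\t.",
])

if __name__ == "__main__":
    print(filter_vcf_by_qual(TOY_VCF))
```

In this toy input only the record at position 100 falls below the threshold, so it is the only one removed.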
For Illumina data, picard (http://broadinstitute.github.io/picard/) was used to mark duplicates. Variants were then identified with gatk HaplotypeCaller, and SNPs were selected with gatk HaplotypeCaller and filtered with the settings "QD < 2.0 || MQ < 40.0 || FS > 60.0 || SOR > 3.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" [18]. SNP files were merged using vcfmerge to produce a FASTA alignment file for all samples with the R package SNPRelate [19]. Nucleotide differences within the FASTA alignment file were counted by pairwise SNP-number comparison (Fig. 1, panel A). IQ-TREE v1.6.1 was used to build a maximum-likelihood tree from the alignment file, which was visualized with iTOL (https://itol.embl.de/) (Fig. 1, panel B; [20]).

In total, 420,164 SNPs were counted. Strains in clade I had ~1200-1900 SNPs, while much higher SNP counts were observed for clade II (~45 K), clade III (~65 K), and clades IV and V (~170 K). Pairwise SNP-number comparison showed that all clade I strains were very similar, with only 230-1011 nucleotide differences, except for strain L1537/2020, which had 1893-2089 SNPs compared to the other clade I strains (Fig. 1). Although strain L1537/2020 belongs to clade I, it is distantly related to all other representatives of that clade and, in contrast to them, it is susceptible to all common antifungals (Fig. 1; [4]).

Several mutations are reported to play a role in the antifungal resistance of C. auris, viz. CIS2 (A27T), ERG3 (W182*, L207I), ERG11 (Y132F, K143R), FKS1 (S639P), MEC3 (A272V), PEA2 (D367V), TAC1B (FS191S, F214S, R495G, S611P), and UPC2 (M365) [3, 9, 21-23]. None of these mutations are present in the genome of L1537/2020.

Nearly all publicly available C. auris genome data was generated by short-read Illumina sequencing [3, 9, 11, 12].
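The pairwise SNP-number comparison described above amounts to counting, for every pair of strains, the alignment columns at which their nucleotides differ. The sketch below shows that idea on a toy alignment. It is an illustration and not the authors' script: the function name and strain labels are hypothetical, and skipping columns that contain N or a gap is an assumption, since the text does not say how such positions were handled.

```python
from itertools import combinations


def pairwise_snp_counts(alignment, ignore="N-"):
    """Count differing columns for every pair of aligned sequences.

    alignment: dict mapping a strain name to its aligned sequence.
    ignore: characters treated as missing data; a column is skipped for a
            pair if either sequence has one of them (illustrative choice).
    """
    if len({len(seq) for seq in alignment.values()}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    counts = {}
    for a, b in combinations(sorted(alignment), 2):
        seq_a = alignment[a].upper()
        seq_b = alignment[b].upper()
        counts[(a, b)] = sum(
            1
            for x, y in zip(seq_a, seq_b)
            if x != y and x not in ignore and y not in ignore
        )
    return counts


# Hypothetical toy alignment of SNP sites (not data from the study).
TOY_ALIGNMENT = {
    "strain_A": "ACGTACGT",
    "strain_B": "ACGTACGA",
    "strain_C": "TCGAACNA",
}

if __name__ == "__main__":
    for pair, n in pairwise_snp_counts(TOY_ALIGNMENT).items():
        print(pair, n)
```

A matrix of such counts is the kind of input from which a heatmap like the one in Fig. 1, panel A can be drawn.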
Nonetheless, the relatively distant relationship of the nanopore-based genome of strain L1537/2020 to the Illumina-sequenced clade I strains cannot be explained by the differences in sequencing technologies. The nanopore flowcell, chemistry and basecalling software used here approach an accuracy of > 98%. This means that a SNP precision of > 99.9% can be achieved at 50X genome coverage [24]. With the 300X coverage for L1537/2020, it is unlikely that an erroneous mutation was introduced in the assembly. Hence, further studies are needed to investigate the epidemiological and biological impact of the phenotypic and genotypic differences in L1537/2020 versus its multi-drug resistant siblings within clade I.

Astellas, Biotoscana, United Medical, Gilead, MSD and Pfizer. The other authors report no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

[1] Candida auris: what have we learned about its mechanisms of pathogenicity? Front Microbiol
[2] Attack, defend and persist: how the fungal pathogen C. auris was able to emerge globally in healthcare environments
[3] Tracing the evolutionary history and global expansion of C. auris using population genomic analyses
[4] Emergence of C. auris in Brazil in a COVID-19 intensive care unit
[5] MycopathologiaGENOMES: the new 'home' for the publication of fungal genomes
[6] The high-quality complete genome sequence of the opportunistic fungal pathogen Candida vulturna CBS
[7] Assembly of long, error-prone reads using repeat graphs
[8] Environmental isolation of C. auris from the coastal wetlands of Andaman Islands
[9] Simultaneous emergence of multidrug-resistant C. auris on 3 continents confirmed by whole-genome sequencing and epidemiological analyses
[10] Genomic insights into multidrug-resistance, mating and virulence in C. auris and related emerging species
[11] Genomic epidemiology of C. auris in a general hospital in Shenyang, China: a three-year surveillance study
[12] Candida auris whole-genome sequence benchmark dataset for phylogenomic pipelines
[13] Minimap2: pairwise alignment for nucleotide sequences
[14] Fast and accurate short read alignment with Burrows-Wheeler transform
[15] 1000 genome project data processing subgroup. The sequence alignment/map format and SAMtools
[16] Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing
[17] 1000 genomes project analysis group. The variant call format and VCFtools. Bioinformatics
[18] Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (1st Edition). O'Reilly Media
[19] A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics
[20] IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
[21] Genome-wide analysis of experimentally evolved C. auris reveals multiple novel mechanisms of multidrug resistance. mBio
[22] Novel ERG11 and TAC1b mutations associated with azole resistance in C. auris. Antimicrob Agents Chemother
[23] Mutations in TAC1B: a novel genetic determinant of clinical fluconazole resistance in C. auris. mBio
[24] Applications: Nanopore sequencing accuracy

The study was partially supported by Grant 2017/02203-7, São Paulo Research Foundation (FAPESP).