About the Author(s)


Anthony M. Smith
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Department of Medical Microbiology, Faculty of Health Sciences, University of Pretoria, Pretoria, South Africa

Phuti Sekwadi
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Hlengiwe M. Ngomane
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Bolele Disenyeng
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Linda K. Erasmus
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Juno Thomas
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Dineo Bogoshi
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Shannon L. Smouse
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Nomsa P. Tau
Centre for Enteric Diseases, National Institute for Communicable Diseases, Johannesburg, South Africa

Citation


Smith AM, Sekwadi P, Ngomane HM, et al. Whole-genome sequencing for surveillance of Salmonella at a public health institution in South Africa. Afr J Lab Med. 2025;14(1), a2900. https://doi.org/10.4102/ajlm.v14i1.2900

Original Research

Whole-genome sequencing for surveillance of Salmonella at a public health institution in South Africa

Anthony M. Smith, Phuti Sekwadi, Hlengiwe M. Ngomane, Bolele Disenyeng, Linda K. Erasmus, Juno Thomas, Dineo Bogoshi, Shannon L. Smouse, Nomsa P. Tau

Received: 12 June 2025; Accepted: 11 Oct. 2025; Published: 09 Dec. 2025

Copyright: © 2025. The Authors. Licensee: AOSIS.
This work is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).

Abstract

Background: Whole-genome sequencing (WGS) is transforming communicable disease surveillance globally. The National Institute for Communicable Diseases, South Africa, participates in national laboratory-based surveillance for human isolates of Salmonella.

Objective: This study aimed to investigate human Salmonella isolates from South Africa, 2020–2023, using WGS analysis.

Methods: WGS was performed using Illumina NextSeq Technology. Data were analysed using multiple bioinformatics tools, including those available at the Center for Genomic Epidemiology, Pathogenwatch and EnteroBase. Data analysis allowed for identification and characterisation of isolates. Core-genome multilocus sequence typing was used to investigate the phylogeny of isolates.

Results: Of the 8006 isolates of Salmonella that were analysed using WGS, 130 distinctive serovars and subspecies were identified. Salmonella enterica serovar Enteritidis (Salmonella Enteritidis) (4271/8006; 53.3%) and Salmonella Typhimurium (1430/8006; 17.9%) were the most prevalent serovars, accounting for 71.2% of all isolates. This was followed by Salmonella Typhi (482/8006; 6.0%). Sixteen per cent (1288/8006) of isolates showed the presence of antimicrobial resistance (AMR) determinants associated with ≥ 2 classes of antimicrobials. Salmonella Isangi (167/8006; 2.1%) showed the highest prevalence of AMR, with most isolates (159/167; 95.2%) showing AMR determinants associated with ≥ 7 classes of antimicrobials. Core-genome multilocus sequence typing was used to confirm several suspected clusters and outbreaks and identified additional cryptic or unreported clusters and outbreaks. Investigation of clusters and outbreaks mostly involved Salmonella Enteritidis and Salmonella Typhi.

Conclusion: The implementation of WGS has enabled genomic surveillance of Salmonella, which allows for enhanced characterisation and AMR determination of isolates and identification of clusters and outbreaks, which informs targeted public health investigation and response.

What this study adds: This study describes the population structure of Salmonella isolated from humans in South Africa and contributes substantially to the available Salmonella WGS data from Africa.

Keywords: Salmonella; whole-genome sequencing; genomics; surveillance; outbreak; cluster; South Africa; Africa; public health.

Introduction

Salmonella remains a major cause of human disease worldwide, particularly in developing countries, where it is a leading cause of morbidity and mortality.1,2 In Africa, Salmonella disease is largely associated with non-invasive gastrointestinal infections; however, a sizeable proportion of the disease is caused by invasive infections, which are associated with both typhoidal and non-typhoidal Salmonella.3,4 Surveillance and laboratory characterisation of Salmonella are important to monitor prevalence and trends in Salmonella infections. Recent trends in public health microbiology have shown an evolution towards whole-genome sequencing (WGS) as the methodology of choice for laboratory investigation of infectious diseases. Globally, many public health institutions have transitioned to WGS as their primary methodology for characterisation of bacterial pathogens, including Salmonella.5,6,7 The World Health Organization has endorsed genomics and WGS approaches to investigate various communicable diseases.8,9

The National Institute for Communicable Diseases (NICD) (https://www.nicd.ac.za/) is a national public health institute for South Africa, providing disease surveillance, specialised diagnostic services, outbreak response, public health research, and capacity building to support the government’s response to communicable disease threats. The Centre for Enteric Diseases (CED), NICD, performs surveillance on pathogens associated with diarrhoea and enteric fever, and is involved with investigation and response to enteric disease outbreaks. The CED also provides specialised reference laboratory testing for enteric bacteria and viruses. In addition, the CED participates in national laboratory-based surveillance for human isolates of Salmonella,10 whereby isolates of Salmonella are received from more than 200 public and private clinical microbiology laboratories throughout South Africa. Suspected or laboratory-confirmed cases of enteric fever and clinical laboratory identifications of Salmonella isolates are ‘notifiable medical conditions’ in South Africa and it is thus mandatory for these to be reported to the Department of Health.11 The CED performs routine WGS on all Salmonella isolates received. Whole-genome sequencing data are analysed to confirm the identification of isolates with respect to genus, species and serovar, and to further characterise the isolates with respect to multilocus sequence typing (MLST) and presence of antimicrobial resistance (AMR) determinants. Furthermore, WGS data are submitted to the public EnteroBase platform (https://enterobase.warwick.ac.uk/species/index/senterica),12 where data are further interrogated using core-genome MLST (cgMLST) to search for clusters of genetically related cases (predictive of possible outbreaks) and to complement epidemiological investigation of outbreaks.

Before implementation of WGS, presumptive Salmonella isolates were identified using traditional phenotypic microbiological methods, which included VITEK identification and serotyping performed according to the White-Kauffmann-Le Minor Scheme.13,14 This was followed by molecular subtyping of selected isolates only (mostly those associated with outbreak investigations), using methods such as pulsed-field gel electrophoresis analysis and multiple-locus variable-number tandem-repeats analysis.13,14,15,16

In 2020, CED implemented WGS for routine surveillance of clinical isolates of Salmonella. We now present the results of this WGS implementation and summarise key findings following analysis of isolates from 2020 to 2023. We report on the number of isolates sequenced, predominant serovars and subspecies identified, significant strains identified, notable AMR profiles identified, clusters identified, and outbreaks investigated.

Methods

Ethical considerations

Ethical approval to perform surveillance activities and laboratory analysis on clinical isolates of Salmonella was obtained from the Human Research Ethics Committee of the University of the Witwatersrand, Johannesburg, South Africa (protocol reference numbers: M160667, M1809107, M210752, M230985). Databases where patient data are stored are password protected and the patient identifiers were removed from genomic data shared at public repositories.

Surveillance for clinical isolates of Salmonella in South Africa

This project started on 01 January 2020 and ended on 31 December 2023. The NICD is a national public health institute for South Africa, providing disease surveillance, specialised diagnostic services, outbreak response, public health research, and capacity building to support the government’s response to communicable disease threats. The CED plays a part in national laboratory-based surveillance for human isolates of Salmonella. Isolates were received from more than 200 public and private clinical microbiology laboratories throughout the country. After Salmonella identification at laboratories, isolates were usually received at the CED within 1–4 weeks. Following receipt at the CED, isolates were immediately processed for WGS analysis (as described below), a process which is usually completed within 2–3 weeks.

Metadata and epidemiological investigation

Salmonella isolates were received with basic metadata, including details of the patient, place of residence and specimen collection date, all obtained from laboratory request forms. In some situations, such as cases of enteric fever, cases associated with outbreak investigations and cases from some enhanced surveillance sites, patients were followed up to obtain more detailed information and case investigation forms were completed. Clinical laboratory identifications of Salmonella isolates are ‘notifiable medical conditions’ in South Africa, so it is mandatory for these to be reported to the Department of Health.

Receipt of bacterial cultures and phenotypic characterisation

The CED received and processed isolates using methodology as previously described.17,18 Briefly, following receipt of presumptive isolates on Dorset-Egg transport media (Diagnostic Media Products, National Health Laboratory Service, Johannesburg, South Africa), isolates were sub-cultured onto 5% Blood Agar (Diagnostic Media Products) to check for viability and purity, after which they were processed to extract genomic DNA for WGS analysis. If there was suspicion that a culture was not Salmonella, that culture was further investigated using standard phenotypic microbiological identification and serotyping methodologies, including VITEK-2 identification (bioMérieux, Marcy-l’Étoile, France) and serotyping according to the White-Kauffmann-Le Minor Scheme. When required, antimicrobial (ampicillin, ciprofloxacin, ceftriaxone, azithromycin) susceptibility testing was performed using the Etest methodology (bioMérieux). Susceptibility data were interpreted as per the Clinical and Laboratory Standards Institute guidelines.19

Genomic DNA extraction and whole-genome sequencing of bacteria

Genomic DNA was extracted from bacteria using either the Qiagen QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) or the Invitrogen PureLink Microbiome DNA Purification Kit (Invitrogen, Waltham, Massachusetts, United States). Whole-genome sequencing was performed by the NICD Sequencing Core Facility (SCF). From 2020 to 2023, WGS was performed using various models of Illumina equipment (Illumina, San Diego, California, United States), including Illumina MiSeq, Illumina NextSeq 550 and Illumina NextSeq 1000. DNA libraries were prepared using various Illumina kits, including the Nextera XT DNA Library Preparation Kit, the Nextera DNA Flex Library Preparation Kit and the Illumina DNA Prep Kit. Sequencing was performed as paired-end runs at ~80 times coverage.

Analysis of whole-genome sequencing data

The CED performed analysis of WGS data using methodology as previously described.17,18 Briefly, Illumina data were processed and investigated with the JEKESA bioinformatics pipeline (https://github.com/stanikae/jekesa), which includes several analysis tools. Default options were used for all tools, unless otherwise mentioned. Quality control and filtering of reads were performed with FastQC version 0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and TrimGalore version 0.6.2 set at a minimum Phred quality score of 30 and minimum read length of 50 bp. Identification of species and detection of the closest reference were accomplished using BactInspector version 0.1.3 (https://gitlab.com/antunderwood/bactinspector). Checking for contamination was accomplished with ConFindr version 0.7.4 (https://github.com/OLC-Bioinformatics/ConFindr) and Kraken2 version 2.0.8-beta (https://github.com/DerrickWood/kraken2/releases). De novo assembly was accomplished using SKESA version 2.3.0 (https://github.com/ncbi/SKESA), followed by optimisation of the assemblies with Shovill version 1.1.0 (https://github.com/tseemann/shovill), with depth set to 100 and minimum contig length set to 200. Assessment of assembly metrics was performed with QUAST version 5.0.2 using PubMLST typing schemes (https://pubmlst.org/). Identification of AMR determinants was accomplished with ResFinder version 4.1 (https://www.genomicepidemiology.org/services/) and NCBI AMRfinder version 3.11.26.20 Prediction of Salmonella serovars was accomplished with SeqSero2 version 1.1.0 (https://denglab.info/SeqSero2) and SISTR version 1.1.2 (https://github.com/phac-nml/sistr_cmd).
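The read-trimming and assembly settings quoted above can be written out as command lines. The following is a minimal sketch only: it assumes that Shovill is invoked with SKESA as its assembler (the text does not state how the two are combined), and it does not reproduce how the JEKESA pipeline actually calls these tools. The file names, output directories and helper function names are hypothetical.

```python
import shlex
from typing import List


def trim_galore_command(r1: str, r2: str, out_dir: str) -> List[str]:
    """Paired-end trimming at Phred >= 30, discarding reads shorter than 50 bp."""
    return [
        "trim_galore", "--paired",
        "--quality", "30",      # minimum Phred quality score quoted in the text
        "--length", "50",       # minimum read length (bp) quoted in the text
        "--output_dir", out_dir,
        r1, r2,
    ]


def shovill_command(r1: str, r2: str, out_dir: str) -> List[str]:
    """Assembly with depth 100 and minimum contig length 200, as quoted in the text."""
    return [
        "shovill",
        "--assembler", "skesa",  # assumption: SKESA run through Shovill
        "--depth", "100",
        "--minlen", "200",
        "--R1", r1, "--R2", r2,
        "--outdir", out_dir,
    ]


# Hypothetical input files; the commands are only printed, not executed.
trim_cmd = trim_galore_command("isolate_R1.fastq.gz", "isolate_R2.fastq.gz", "trimmed")
asm_cmd = shovill_command("isolate_R1.fastq.gz", "isolate_R2.fastq.gz", "assembly")
print(shlex.join(trim_cmd))
print(shlex.join(asm_cmd))
```

Building the commands as argument lists keeps the quoted thresholds in one place and avoids shell-quoting problems if they are later passed to `subprocess.run`.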

Investigation of WGS data was further accomplished at EnteroBase (http://enterobase.warwick.ac.uk/species/index/senterica). Raw sequencing data were submitted to EnteroBase, where the data were quality checked, assembled and analysed via multiple tools, to provide information concerning Salmonella serovar, AMR determinants, MLST and cgMLST. The phylogeny of isolates was explored with the EnteroBase cgMLST tool incorporating the ‘cgMLST V2 + HierCC V1’ scheme, which performs an analysis on 3002 core genes of Salmonella. The phylogeny and genetic relatedness of isolates were visualised with a GrapeTree-generated minimum spanning tree using the ‘MSTree V2’ algorithm.21 Clusters were detected as follows. Once a GrapeTree was produced, the settings/operators of the tool were set to ‘collapse branches’ at a value of ‘5’, which ensured that isolates showing ≤ 5 allelic differences were collapsed together into a ‘cluster’. Our cluster definition was ≥ 3 isolates showing ≤ 5 allelic differences, as found by the above actions, following cgMLST analysis and creation of a GrapeTree. For all Salmonella serovars (except S. enterica serovar Enteritidis [Salmonella Enteritidis]), we defined a cluster of isolates at a ≤ 5-allele difference threshold, to denote high genetic relatedness among isolates and identify cases likely associated with a common cause (epidemiological link). Salmonella Enteritidis is a highly clonal serovar, so to refine cluster identification for this serovar and obtain the most epidemiologically informative clusters, we lowered the cluster definition threshold to a 0-allele difference. Clusters were assigned (associated with) EnteroBase cgMLST hierarchical cluster level 5 identifying numbers (hierarchical cluster level 5 is where isolates are clustered at five allele differences).
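The cluster definition above (≥ 3 isolates at ≤ 5 allelic differences, or 0 for Salmonella Enteritidis) can be illustrated with a small sketch. It assumes that collapsing branches behaves like single-linkage grouping on pairwise allele differences, and that loci with a missing call are skipped. Both are simplifying assumptions and not a description of the GrapeTree or EnteroBase implementations. The isolate names and eight-locus profiles are invented toy data (the real scheme uses 3002 loci).

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence

Profile = Sequence[Optional[int]]  # one allele number per locus; None = missing call


def allele_distance(a: Profile, b: Profile) -> int:
    """Number of loci where both isolates have a call and the calls differ."""
    return sum(1 for x, y in zip(a, b) if x is not None and y is not None and x != y)


def find_clusters(profiles: Dict[str, Profile],
                  max_diff: int = 5, min_size: int = 3) -> List[List[str]]:
    """Single-linkage groups at <= max_diff alleles, keeping groups of >= min_size."""
    parent = {name: name for name in profiles}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


# Toy profiles (hypothetical).
toy_profiles = {
    "iso1": [1, 1, 1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 1, 1, 1, 2, 2],
    "iso4": [2, 2, 2, 2, 2, 2, 2, 2],
    "iso5": [3, 3, 3, 3, 3, 3, 3, 3],
    "iso6": [3, 3, 3, 3, 3, 3, 3, 3],
    "iso7": [3, 3, 3, 3, 3, 3, 3, 3],
}

print(find_clusters(toy_profiles, max_diff=5))  # general threshold
print(find_clusters(toy_profiles, max_diff=0))  # stricter Enteritidis threshold
```

With the toy data, the 5-allele threshold yields two clusters (iso1 to iso3 and iso5 to iso7), whereas the 0-allele threshold retains only the three identical profiles, which shows how the stricter threshold narrows clusters in a highly clonal serovar.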

Data availability

Sequencing data were uploaded to the public EnteroBase platform (http://enterobase.warwick.ac.uk/species/index/senterica) and are freely available to access at this platform. Data are also available at the European Nucleotide Archive under the project accession numbers PRJEB39002, PRJEB39546, and PRJEB39988.

Results

Turnaround time to whole-genome sequencing results and costing of whole-genome sequencing

Following genomic DNA extraction from bacteria, we typically batch samples and submit them weekly (usually on a Friday) to our SCF. The turnaround time to completion of WGS at our SCF is ~10 working days. For urgent sequencing, such as for outbreak investigations, sequencing can be fast-tracked, resulting in decreased turnaround times (3–5 working days). In general, for routine sequencing, the turnaround time from receipt of a culture at the CED laboratory to completion of analysis of WGS data is ~15 working days.

The costs to perform Illumina WGS have steadily decreased year on year. In January 2020, our cost to perform Illumina WGS (paired-end sequencing, at ~80 times coverage) on a single Salmonella isolate was ~R2510 (South African Rand [ZAR]), compared with ~R1210 in December 2023 (conversion rate of 1 ZAR to 0.055 United States dollars on 10 March 2025). WGS has thus become more affordable over time.
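The cost figures above can be checked with a few lines of arithmetic. This sketch reads the quoted conversion rate as 0.055 United States dollars per rand, which is an interpretation of the text; the variable and function names are illustrative.

```python
COST_JAN_2020_ZAR = 2510   # approximate per-isolate cost, January 2020
COST_DEC_2023_ZAR = 1210   # approximate per-isolate cost, December 2023
USD_PER_ZAR = 0.055        # assumed reading of the quoted 10 March 2025 rate


def percent_decrease(old: float, new: float) -> float:
    """Relative decrease from old to new, as a percentage of old."""
    return (old - new) / old * 100


decrease = round(percent_decrease(COST_JAN_2020_ZAR, COST_DEC_2023_ZAR), 1)
usd_2020 = round(COST_JAN_2020_ZAR * USD_PER_ZAR, 2)
usd_2023 = round(COST_DEC_2023_ZAR * USD_PER_ZAR, 2)

print(f"Decrease: {decrease}%")                 # Decrease: 51.8%
print(f"January 2020: ~US${usd_2020:.2f}")      # ~US$138.05
print(f"December 2023: ~US${usd_2023:.2f}")     # ~US$66.55
```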

Sharing of whole-genome sequencing data and public health action

All WGS data were uploaded and shared at the EnteroBase Salmonella database (http://enterobase.warwick.ac.uk/species/index/senterica). Data shared at EnteroBase are immediately made publicly available to benefit the global public health and research community. EnteroBase also auto-uploads data to the Sequence Read Archive, following which project and sample accession numbers are assigned to isolate data. As of 10 March 2025, EnteroBase ranked South Africa first among African countries with respect to the number of Salmonella genome submissions, and seventh among all countries globally.

As required, Salmonella WGS data were presented and discussed at the NICD weekly ‘Communicable Diseases Meetings’. These meetings include representatives from the NICD Outbreak Response Unit, all NICD Centres, and epidemiologists from all provinces across South Africa. Matters discussed include interesting findings related to routine disease surveillance activities, trends in disease notifications and reporting, reports of disease clusters, outbreak investigations, and new or emerging cases of diseases. The CED reports on any significant findings related to analysis of Salmonella WGS data, including clusters identified and outbreak investigations. As required, representatives of the Outbreak Response Unit communicate with the Department of Health on all relevant matters. The reporting chain for WGS data for public health action is therefore: NICD Centre > NICD Outbreak Response Unit > Department of Health > public notification (as required), and further sharing of information.

Top (most common) eight Salmonella serovars and subspecies in South Africa

From 2020 to 2023, 8006 isolates of Salmonella were analysed using WGS. One hundred and thirty distinctive Salmonella serovars or subspecies were identified (Figure 1). Salmonella Enteritidis (4271/8006; 53.3%) and Salmonella Typhimurium (1430/8006; 17.9%) were the most prevalent serovars, accounting for 71.2% of all isolates. The following serovars or subspecies completed our top (most common) eight: Salmonella Typhi (482/8006; 6.0%), S. enterica subspecies salamae (279/8006; 3.5%), Salmonella Isangi (167/8006; 2.1%), Salmonella Dublin (114/8006; 1.4%), Salmonella Muenchen (108/8006; 1.3%), and Salmonella Infantis (98/8006; 1.2%). For Salmonella Typhi, 414/482 (85.9%) were of the H58 haplotype (genotype 4.3.1) strain (Figure 2). For Salmonella Typhimurium, 269/1430 (18.8%) isolates were of the ST313 variant, while 109/1430 (7.6%) isolates were of the monophasic variant (1,4,[5],12:i:-).
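As a quick consistency check, the reported percentages can be recomputed from the isolate counts given above. The sketch below uses only the counts quoted in the text; the dictionary and function names are illustrative.

```python
TOTAL_ISOLATES = 8006

# Counts for the top eight serovars or subspecies, as reported in the text.
top_eight = {
    "Enteritidis": 4271,
    "Typhimurium": 1430,
    "Typhi": 482,
    "subspecies salamae": 279,
    "Isangi": 167,
    "Dublin": 114,
    "Muenchen": 108,
    "Infantis": 98,
}


def percentage(count: int, total: int = TOTAL_ISOLATES) -> float:
    """Share of the total, rounded to one decimal place as in the text."""
    return round(count / total * 100, 1)


shares = {name: percentage(count) for name, count in top_eight.items()}
combined_top_two = percentage(top_eight["Enteritidis"] + top_eight["Typhimurium"])

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"Enteritidis + Typhimurium: {combined_top_two}%")  # 71.2%
```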

FIGURE 1: Minimum spanning tree created using cgMLST data for Salmonella isolates (N = 8006), South Africa, 2020–2023. The spherical nodes represent isolates. The larger the node, the more isolates it represents. The figure legend lists Salmonella serovars and subspecies identified, in order from highest to lowest number of isolates.

FIGURE 2: Minimum spanning tree created using cgMLST data for Salmonella Typhi isolates (N = 482), South Africa, 2020–2023. The spherical nodes represent isolates. The larger the node, the more isolates it represents. The number of segments within a node is representative of the number of isolates. The values between adjoining nodes specify the number of allele differences between connecting nodes (isolates). The figure legend lists the year of isolation. The cluster of H58 haplotype (genotype 4.3.1) strains is indicated.

Notable clusters and outbreak investigations

Table 1 provides a summary of notable Salmonella clusters and outbreaks investigated in South Africa from 2020 to 2023. These investigations included the following serovars: Salmonella Enteritidis, Salmonella Typhi, Salmonella Typhimurium, Salmonella Isangi, Salmonella Panama, Salmonella Vejle, Salmonella Newport, and Salmonella Muenchen. Most investigations were associated with Salmonella Enteritidis and Salmonella Typhi. Figure 3 shows some of the Salmonella Enteritidis clusters investigated against the background of other circulating isolates, and Figure 4 shows the same for Salmonella Typhi.

FIGURE 3: Snapshot from a minimum spanning tree created using cgMLST data for Salmonella Enteritidis isolates, South Africa, 2020–2023. The spherical nodes represent isolates. The larger the node, the more isolates it represents. The number of segments within a node is representative of the number of isolates. The values between adjoining nodes specify the number of allele differences between connecting nodes (isolates). The figure legend points to some notable clusters investigated, details of which are described in Table 1.

FIGURE 4: Minimum spanning tree created using cgMLST data for Salmonella Typhi isolates (N = 68) sourced from the Western Cape province of South Africa, 2020–2021. The spherical nodes represent isolates. Isolates showing ≤ 5 allelic differences are collapsed together into a single node. The larger the node, the more isolates it represents. The number of segments within a node is representative of the number of isolates. The values between adjoining nodes specify the number of allele differences between connecting nodes (isolates). The figure legend lists the year of isolation. Some notable clusters investigated are indicated, details of which are described in Table 1.

TABLE 1: Notable Salmonella clusters and outbreaks investigated in South Africa, 2020–2023.

Antimicrobial resistance determinants

For AMR determinants, data were reported as per analysis at the EnteroBase Salmonella database where the NCBI AMRfinder version 3.11.26 tool20 is used to report on the following AMR classes: aminoglycoside, penicillin, extended-spectrum beta-lactamase (ESBL), carbapenemase, colistin, fosfomycin, macrolide, phenicol, quinolone, sulfonamide, tetracycline, and trimethoprim. Sixteen per cent (1288/8006) of isolates showed the presence of AMR determinants associated with ≥ 2 classes of antimicrobials. Among our top (most common) eight serovars or subspecies, Salmonella Enteritidis showed the lowest prevalence of AMR, while Salmonella Isangi showed the highest prevalence of AMR (Table 2). Most Salmonella Isangi (159/167; 95.2%) showed AMR determinants associated with ≥ 7 classes of antimicrobials, including ESBL genes (blaOXA-1, blaOXA-10, blaCTX-M-15, blaTEM-63, blaDHA). For Salmonella Typhimurium ST313 (n = 269), only 37/269 (13.8%) were associated with AMR determinants, while most (232/269; 86.2%) were pan-susceptible (Figure 5).

FIGURE 5: Minimum spanning tree created using cgMLST data for Salmonella Typhimurium ST313 isolates (N = 269), South Africa, 2020–2023. The circular nodes represent isolates. The larger the node, the more isolates it represents. The number of segments within a node is representative of the number of isolates. The values between adjoining nodes specify the number of allele differences between connecting nodes (isolates). The legend points to isolates with or without antimicrobial resistance determinants.

TABLE 2: Antimicrobial resistance determinants associated with the top (most common) eight Salmonella serovars and subspecies in South Africa, 2020–2023.

For the period of this research (2020 to 2023), we identified 297 cases of ESBL-positive Salmonella, which showed a variety of ESBL genes including: blaCTX-M variants, blaCMY variants, blaDHA, blaOXA-1, blaOXA-10, blaTEM-63, and blaSHV-2. Most ESBL-positive Salmonella (162/297; 54.5%) were associated with Salmonella Isangi. For carbapenemase-positive Salmonella, we identified 22 cases, which included: eight isolates of Salmonella Isangi (housing either the blaNDM-1 gene or blaOXA-181 gene), four isolates of Salmonella Enteritidis (housing either the blaOXA-48 gene or blaOXA-181 gene), four isolates of Salmonella Typhimurium (housing the blaOXA-48 gene), two isolates of Salmonella Montevideo (housing the blaOXA-48 gene), one isolate of Salmonella Gallinarum (housing the blaOXA-48 gene), one isolate of Salmonella Virchow (housing the blaOXA-48 gene), one isolate of Salmonella Muenster (housing the blaOXA-48 gene), and one isolate of S. enterica subspecies salamae (housing the blaOXA-181 gene).

During this same period of research, we also identified five cases of extensively drug-resistant Salmonella Typhi, of which two cases had a confirmed travel history to Pakistan. Three isolates had the resistome blaTEM-1, blaCTX-M-15, catA1, sul1, sul2, dfrA7, qnrS1, gyrA S83F, while two isolates had the resistome blaTEM-1, blaCTX-M-15, catA1, sul1, dfrA7, qnrS1, gyrA S83F.
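Because AMR is summarised in this section by the number of antimicrobial classes rather than the number of genes, the counting step can be sketched as follows. The gene-to-class mapping below is an illustrative assignment for the determinants named in the two resistomes above, using the class labels listed earlier in this section; it is not the AMRfinder or EnteroBase lookup table, and the function name is hypothetical.

```python
from typing import Iterable, Set

# Illustrative mapping only (assumption), limited to determinants named in the text.
DETERMINANT_TO_CLASS = {
    "blaTEM-1": "penicillin",
    "blaCTX-M-15": "ESBL",
    "catA1": "phenicol",
    "sul1": "sulfonamide",
    "sul2": "sulfonamide",
    "dfrA7": "trimethoprim",
    "qnrS1": "quinolone",
    "gyrA S83F": "quinolone",
}


def amr_classes(determinants: Iterable[str]) -> Set[str]:
    """Distinct antimicrobial classes covered by a list of determinants."""
    return {DETERMINANT_TO_CLASS[d] for d in determinants if d in DETERMINANT_TO_CLASS}


resistome_a = ["blaTEM-1", "blaCTX-M-15", "catA1", "sul1", "sul2",
               "dfrA7", "qnrS1", "gyrA S83F"]
resistome_b = ["blaTEM-1", "blaCTX-M-15", "catA1", "sul1",
               "dfrA7", "qnrS1", "gyrA S83F"]

classes_a = amr_classes(resistome_a)
classes_b = amr_classes(resistome_b)

print(len(resistome_a), "determinants ->", len(classes_a), "classes")
print(len(resistome_b), "determinants ->", len(classes_b), "classes")
print("Determinants in >= 2 classes:", len(classes_a) >= 2)
```

Under this mapping, both resistomes cover the same six classes even though one lacks sul2, because sul1 and sul2 fall in the same class.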

Discussion

The CED, NICD, is a member of the regional PulseNet Africa laboratory network (https://www.pulsenetafrica.org/), which forms part of the PulseNet International network (http://www.pulsenetinternational.org/), a global molecular subtyping network for foodborne disease surveillance. The CED has always followed standardised molecular subtyping methodologies as suggested by PulseNet International, for which the suggested primary methodology was historically pulsed-field gel electrophoresis analysis. The CED has published extensively on the use of these older (traditional) molecular subtyping methodologies for routine surveillance activities and for investigation of outbreaks involving enteric bacterial pathogens, which have included the use of pulsed-field gel electrophoresis analysis,15,22 multiple-locus variable-number tandem-repeats analysis,14,16 and MLST (using Sanger sequencing of polymerase chain reaction-amplified genes).13,14

In late 2015, CED took the first step towards the use of WGS for analysis of enteric pathogens. This coincided with the establishment of the NICD SCF, equipped with Illumina MiSeq next-generation sequencing equipment. Our first WGS activities investigated a cluster of Listeria monocytogenes cases reported from the Western Cape province, South Africa, in 2015. This analysis was timely, as the steering committee of the PulseNet International network was in discussions to start implementing WGS; the network's vision for WGS implementation was later published in 2017.23 In 2020, CED terminated the use of all older (traditional) molecular subtyping methodologies (pulsed-field gel electrophoresis and multiple-locus variable-number tandem-repeats analysis), and implemented WGS analysis for routine surveillance and analysis of all clinical isolates of enteric bacterial pathogens, including Salmonella. This was needed to align with trends in public health microbiology showing the evolution towards WGS as the primary methodology for laboratory investigation of infectious disease. Globally, many public health institutions and reference laboratories have transitioned to WGS as their primary methodology for characterisation of bacterial pathogens.5,6,7

We experienced very few challenges with our implementation of WGS. This is probably because we implemented well-established and well-validated methodologies, and we were supported by a well-established and well-equipped SCF with dedicated core staff, including bioinformatics support. Whole-genome sequencing starts with good-quality DNA extraction from bacteria. We used good-quality DNA extraction kits, which eliminated almost all problems in downstream sequencing steps, including library preparation. The quality of our sequence data outputs was mostly excellent. On rare occasions, we encountered assembled data that failed minimum quality thresholds. The quality metrics for our assembled data require an N50 value > 20 kb and fewer than 300 contigs. For assembled data that fail quality checks, the sample is subjected to a repeat round of sequence analysis, which usually corrects the quality issue. We rarely encountered contamination problems in our analysis of sequence data. If contamination was encountered, a repeat DNA extraction from a new pure culture followed by a new sequence analysis solved the problem. On rare occasions, species identification tools unambiguously identified sequence data as a non-Salmonella species, and in these situations, presumptive laboratory identifications were updated to reflect the WGS identification.
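The two assembly thresholds mentioned above (N50 > 20 kb and fewer than 300 contigs) can be expressed as a simple check. This is a minimal sketch using the standard definition of N50; the function names and example contig lengths are hypothetical, and it is not the QUAST or EnteroBase implementation.

```python
from typing import List


def n50(contig_lengths: List[int]) -> int:
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def passes_assembly_qc(contig_lengths: List[int],
                       min_n50: int = 20_000, max_contigs: int = 300) -> bool:
    """Apply the thresholds quoted in the text: N50 > 20 kb and < 300 contigs."""
    return n50(contig_lengths) > min_n50 and len(contig_lengths) < max_contigs


# Hypothetical assemblies for illustration.
good_assembly = [100_000, 80_000, 50_000, 20_000, 10_000]
fragmented_assembly = [5_000] * 400

print(n50(good_assembly), passes_assembly_qc(good_assembly))              # 80000 True
print(n50(fragmented_assembly), passes_assembly_qc(fragmented_assembly))  # 5000 False
```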

Importantly, the implementation of any new laboratory testing methodology must be accompanied by validation data and other checks associated with good laboratory practice. As such, our WGS analysis is accredited as per the international ISO 15189 standard, and we are regularly audited by the South African National Accreditation System, the official laboratory accreditation body of South Africa. Validation of our WGS analysis is continuously affirmed by annual participation in two external WGS Quality Assessment Schemes, one managed by the National Institute for Public Health and the Environment, the Netherlands, in collaboration with the European Centre for Disease Prevention and Control,24 and the other managed by the National Food Institute, Technical University of Denmark.25

In the last 5 years, next-generation sequencing technology has advanced rapidly, resulting in shorter turnaround times to WGS results and higher accuracy of sequencing data. In parallel, costs of next-generation sequencing and WGS have decreased dramatically, making the technology more affordable and cost-effective for use in public health laboratories. A year-on-year decrease in WGS costs has also been noted by CED, NICD, where the cost to perform Illumina WGS on a single Salmonella isolate decreased by 51.8% between January 2020 and December 2023. In South Africa, the costs associated with Illumina sequencing are generally lower than in many other African countries (anecdotal evidence). To some extent, this affordability in South Africa can probably be attributed to the presence of an official Illumina product distributor in the country, namely Separations (https://separations.co.za/). This not only ensures the affordability of reagents and consumables but also ensures rapid order and delivery of said reagents and consumables, timely maintenance and repair of Illumina equipment, and overall good customer support.

Of the 8006 Salmonella isolates analysed using WGS, 130 distinctive Salmonella serovars and subspecies were identified (Figure 1). Salmonella Enteritidis and Salmonella Typhimurium (5701/8006; 71.2%) were the most prevalent, which aligns with global trends.5,26,27 The following serovars or subspecies completed our top (most common) eight: Salmonella Typhi, S. enterica subspecies salamae, Salmonella Isangi, Salmonella Dublin, Salmonella Muenchen, and Salmonella Infantis. Salmonella Enteritidis was associated with the majority of our cluster and outbreak investigations (Table 1), and mostly showed a low prevalence of AMR (Table 2). Among Salmonella Typhimurium, the ST313 variant was commonly encountered (269/1430; 18.8%). The ST313 variants are known to be highly associated with Salmonella bloodstream infections in Africa.28 Our reported ST313 variants were mostly (232/269; 86.2%) pan-susceptible (Figure 5), which was an interesting finding, considering that the literature reports ST313 variants as typically multidrug-resistant.28 For Salmonella Typhimurium, 109/1430 (7.6%) of our isolates were of the monophasic variant, which has emerged globally as an important pandemic variant with increasing AMR,27,29 and is increasingly associated with foodborne disease outbreaks.30 For Salmonella Typhi, most (414/482; 85.9%) of our isolates were of the H58 haplotype (genotype 4.3.1) strain (Figure 2). The H58 haplotype is a globally dominant variant of Salmonella Typhi and is commonly associated with AMR.31 We were able to identify the H58 haplotype using the EnteroBase cgMLST hierarchical cluster assignment tool, where HC50:202 is known to be indicative of the H58 haplotype. Among our Salmonella Typhi H58 haplotype strains, most (411/414; 99.3%) showed AMR determinants associated with ≥ 4 classes of antimicrobials, commonly including the following resistome: blaTEM-1B, catA1, sul1, sul2, dfrA7.
Salmonella Typhi was often associated with our cluster and outbreak investigations (Table 1). A notable investigation involved Salmonella Typhi cases from 2020 to 2022, associated with an outbreak among illegal gold miners, likely resulting from the consumption of contaminated groundwater while working in a gold mine shaft (Table 1).32 Salmonella enterica subspecies salamae (279/8006; 3.5%) was our fourth most prevalent Salmonella serovar or subspecies. The S. enterica subspecies salamae isolates were genetically diverse, with no evidence to suggest any clonal spread (data not shown). The prevalence of S. enterica subspecies salamae was a surprising finding, considering that this subspecies is mostly reported from environmental and animal (mostly cold-blooded animals such as reptiles) sources, and is generally considered less pathogenic for humans.33 The prevalence of this subspecies in South Africa has not previously been reported. Historically, before the use of WGS analysis, a large proportion of Salmonella isolates were reported as ‘Salmonella species’, because the traditional serotyping methodologies (using antisera) were sometimes inconclusive in making a call on subspecies or serovar. With WGS analysis tools, Salmonella characterisation is now more complete and more accurate, and S. enterica subspecies salamae can be identified more reliably. These isolates were mostly cultured from stool specimens of patients. Unfortunately, no further information was available about these cases, as no follow-up investigations were conducted. For the same reason, we are not able to speculate on any possible environmental or zoonotic source for this subspecies.
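
As a minimal sketch of the H58 screening rule described above (HC50:202 being indicative of the H58 haplotype), and of counting the antimicrobial classes covered by a resistome, the following Python example may be helpful. The record layout, the field names (`hc50`, `amr_genes`), the isolate identifiers and the gene-to-class lookup are illustrative assumptions; they are not the EnteroBase export format or the pipeline used in this study. Only the HC50:202 rule and the five listed determinants come from the text.

```python
# Hypothetical lookup covering only the five determinants named in the text.
GENE_TO_CLASS = {
    "blaTEM-1B": "beta-lactam",
    "catA1": "phenicol",
    "sul1": "sulfonamide",
    "sul2": "sulfonamide",
    "dfrA7": "trimethoprim",
}

H58_HC50_CLUSTER = 202  # HC50:202 is indicative of the H58 haplotype (from the text)


def is_h58(record):
    """Return True when the isolate's HC50 cluster matches the H58 cluster."""
    return record.get("hc50") == H58_HC50_CLUSTER


def amr_classes(record):
    """Return the set of antimicrobial classes covered by known determinants."""
    genes = record.get("amr_genes", [])
    return {GENE_TO_CLASS[g] for g in genes if g in GENE_TO_CLASS}


# Made-up example records for illustration only (not study data).
isolates = [
    {"id": "iso-A", "hc50": 202,
     "amr_genes": ["blaTEM-1B", "catA1", "sul1", "sul2", "dfrA7"]},
    {"id": "iso-B", "hc50": 202, "amr_genes": []},
    {"id": "iso-C", "hc50": 17, "amr_genes": ["sul2"]},
]

h58 = [r for r in isolates if is_h58(r)]
h58_four_plus = [r for r in h58 if len(amr_classes(r)) >= 4]
print(len(h58), len(h58_four_plus))
```

Note that sul1 and sul2 both map to the sulfonamide class, so the five-gene resistome listed above corresponds to four antimicrobial classes under this assumed lookup.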

Salmonella Isangi (167/8006; 2.1%) was our fifth most prevalent Salmonella serovar or subspecies. Salmonella Isangi represents an emerging pathogen in South Africa. However, globally, Salmonella Isangi is an uncommon serovar. Very few (n = 409) Salmonella Isangi isolates have been reported in the EnteroBase database (as of 27 March 2025), with nearly half of these (203/409; 49.6%) reported from South Africa. Among all the serovars or subspecies in South Africa, Salmonella Isangi showed the highest prevalence of AMR (Table 2). Most Salmonella Isangi (159/167; 95.2%) showed AMR determinants associated with ≥ 7 classes of antimicrobials, including ESBL genes (blaOXA-1, blaOXA-10, blaCTX-M-15, blaTEM-63, blaDHA). Globally (including South Africa), Salmonella Isangi isolates are typically multidrug-resistant, and are often associated with hospital outbreaks.34,35,36,37 There is a pressing need for studies to identify the reservoir and transmission pathway for Salmonella Isangi, as this serotype readily acquires and retains extensive drug resistance and, once introduced into the hospital environment, appears to thrive and cause lengthy hospital outbreaks.35 Salmonella Dublin (114/8006; 1.4%) was our sixth most prevalent Salmonella serovar or subspecies. Globally, Salmonella Dublin is a relatively uncommon cause of human infections. Salmonella Dublin is host-adapted to cattle, and is therefore most prevalent in cattle and in cheese made from raw cow’s milk. Countries that produce large volumes of cheese (such as France) often show an increased prevalence of Salmonella Dublin.38,39 Salmonella Muenchen (108/8006; 1.3%) was our seventh most prevalent Salmonella serovar or subspecies.
This prevalence aligns with a reported global prevalence of 1.2%, with Salmonella Muenchen listed as the 13th most prevalent Salmonella serovar globally.27 Salmonella Muenchen is a relatively uncommon cause of human infections globally, and there are very few documented reports of outbreaks associated with it. Salmonella Infantis (98/8006; 1.2%) was our eighth most prevalent Salmonella serovar or subspecies. Salmonella Infantis is currently perhaps the fastest-rising serovar in the global Salmonella population, having gained prevalence over recent years.40 Salmonella Infantis has become the fourth most prevalent Salmonella serovar causing human infections among European Union member countries.41 Salmonella Infantis is among the most frequently isolated Salmonella serovars in poultry in Europe and the United States.41 Globally, Salmonella Infantis is currently listed as the third most prevalent Salmonella serovar, with a reported global prevalence of 6.6%.27 Interestingly, the population structure of South African Salmonella Infantis has been shown to differ substantially from that of Salmonella Infantis isolated elsewhere.42
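
The prevalence figures quoted above are simple proportions of the 8006 isolates analysed. As an illustrative sketch only (the variable and function names are assumptions, and this is not the authors' analysis code), the percentages for the fourth- to eighth-ranked serovars or subspecies can be reproduced from the reported counts as follows:

```python
TOTAL_ISOLATES = 8006  # isolates analysed using WGS (from the text)

# Counts for the fourth- to eighth-ranked serovars or subspecies (from the text).
counts = {
    "S. enterica subsp. salamae": 279,
    "Salmonella Isangi": 167,
    "Salmonella Dublin": 114,
    "Salmonella Muenchen": 108,
    "Salmonella Infantis": 98,
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


prevalence = {name: percent(n, TOTAL_ISOLATES) for name, n in counts.items()}
for name, pct in prevalence.items():
    print(f"{name}: {pct}%")

# Share of EnteroBase Salmonella Isangi records reported from South Africa
# (203 of 409 records as of 27 March 2025, from the text).
isangi_sa_share = percent(203, 409)
print(f"Isangi records from South Africa: {isangi_sa_share}%")
```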

Conclusion

The implementation of WGS for routine surveillance of clinical isolates of Salmonella in South Africa has led to a significant increase in the Salmonella genomic data now available from the African continent. South Africa is currently ranked first among African countries with respect to the number of Salmonella genome submissions, and seventh globally. Large WGS data sets, methodically generated over long time periods, provide essential information for: detailed and enhanced characterisation of bacterial strains (pathogens), molecular epidemiological investigations, early detection of clusters of disease, outbreak investigations, detection of new and emerging strains, detection of new or unusual AMR profiles, tracking the spread of strains, development of treatment strategies and vaccines, and monitoring the effect of treatment interventions and vaccine rollout. Whole-genome sequencing data not only provide value ‘in the now’ but are ‘the gift that keeps on giving’, as the data can be investigated further and repeatedly by multiple parties, whether for research purposes or public health activities, to assist with the investigation and containment of future public health threats.

Acknowledgements

The authors would like to thank all participants of the NICD GERMS-SA Laboratory Surveillance Network for submission of clinical isolates of Salmonella species to the NICD.

Competing interests

The authors declare that they have no financial or personal relationships that may have inappropriately influenced them in writing this article.

Authors’ contributions

A.M.S. contributed towards project conceptualisation, funding acquisition, project administration, project supervision, data analysis, data curation, writing the original draft of the article, and is the corresponding author for the project. P.S., L.K.E., and J.T. contributed towards data analysis, review, and editing of the article. H.M.N., B.D., D.B., S.L.S., and N.P.T. contributed towards laboratory analysis, data analysis, data curation, review, and editing of the article.

Funding information

This work was supported by the SEQAFRICA project which is funded by the Department of Health and Social Care’s Fleming Fund using United Kingdom aid. The views expressed in this publication are those of the authors and not necessarily those of the United Kingdom Department of Health and Social Care or its Management Agent, Mott MacDonald.

Data availability

The data that support the findings of this study are uploaded to the public EnteroBase platform (http://enterobase.warwick.ac.uk/species/index/senterica) and are freely available at this platform. In addition, sequencing data are deposited in the European Nucleotide Archive under the project accession numbers PRJEB39002, PRJEB39546 and PRJEB39988.
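
For readers who wish to locate the deposited reads, the following Python sketch builds file-report URLs for the three project accessions listed above. It assumes the public ENA Portal API `filereport` endpoint and the `read_run` field names shown; these are not specified in this article, so they should be checked against the current ENA documentation before use. The function name `filereport_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed ENA Portal API endpoint (not stated in the article).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"

# Project accession numbers from the data availability statement.
PROJECTS = ["PRJEB39002", "PRJEB39546", "PRJEB39988"]


def filereport_url(accession,
                   fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build a tab-separated read-run report URL for one project accession."""
    query = {
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    }
    # Keep commas unescaped so the field list stays readable in the URL.
    return f"{ENA_FILEREPORT}?{urlencode(query, safe=',')}"


urls = [filereport_url(project) for project in PROJECTS]
for url in urls:
    print(url)
```

Each URL can then be opened in a browser or fetched with any HTTP client; the sketch deliberately makes no network request itself.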

Disclaimer

The views and opinions expressed in this article are those of the authors and are the product of professional research. The article does not necessarily reflect the official policy or position of any affiliated institution, funder, agency, or that of the publisher. The authors are responsible for this article’s results, findings, and content.

References

  1. Smith SI, Seriki A, Ajayi A. Typhoidal and non-typhoidal Salmonella infections in Africa. Eur J Clin Microbiol Infect Dis. 2016;35(12):1913–1922. https://doi.org/10.1007/s10096-016-2760-3
  2. Martin LB, Tack B, Marchello CS, et al. Vaccine value profile for invasive non-typhoidal Salmonella disease. Vaccine. 2024;42(19 Suppl 1):S101–S124. https://doi.org/10.1016/j.vaccine.2024.04.045
  3. Crump JA, Nyirenda TS, Kalonji LM, et al. Nontyphoidal Salmonella invasive disease: Challenges and solutions. Open Forum Infect Dis. 2023;10(Suppl 1):S32–S37. https://doi.org/10.1093/ofid/ofad020
  4. Crump JA, Sjolund-Karlsson M, Gordon MA, Parry CM. Epidemiology, clinical presentation, laboratory diagnosis, antimicrobial resistance, and antimicrobial management of invasive Salmonella infections. Clin Microbiol Rev. 2015;28(4):901–937. https://doi.org/10.1128/CMR.00002-15
  5. Chattaway MA, Dallman TJ, Larkin L, et al. The transformation of reference microbiology methods and surveillance for Salmonella with the use of whole genome sequencing in England and Wales. Front Public Health. 2019;7:317. https://doi.org/10.3389/fpubh.2019.00317
  6. Morton V, Kandar R, Kearney A, Hamel M, Nadon C. Transition to whole genome sequencing surveillance: The impact on national outbreak detection and response for Listeria monocytogenes, Salmonella, Shiga toxin-producing Escherichia coli, and Shigella clusters in Canada, 2015–2021. Foodborne Pathog Dis. 2024;21(11):689–697. https://doi.org/10.1089/fpd.2024.0041
  7. Leeper MM, Tolar BM, Griswold T, et al. Evaluation of whole and core genome multilocus sequence typing allele schemes for Salmonella enterica outbreak detection in a national surveillance network, PulseNet USA. Front Microbiol. 2023;14:1254777. https://doi.org/10.3389/fmicb.2023.1254777
  8. World Health Organization. Whole genome sequencing for foodborne disease surveillance: Landscape paper [homepage on the Internet]. 2018 [cited 2024 Dec 11]. Available from: https://www.who.int/publications/i/item/789241513869
  9. World Health Organization. Global genomic surveillance strategy for pathogens with pandemic and epidemic potential, 2022–2032 [homepage on the Internet]. 2022 [cited 2024 Dec 11]. Available from: https://www.who.int/publications/i/item/9789240046979
  10. Public Health Bulletin South Africa. Unlocking insights: Key findings from GERMS-SA annual surveillance review 2022 [homepage on the Internet]. 2022 [cited 2024 Oct 28]. Available from: https://www.phbsa.ac.za/key-findings-from-germs-surveillance-review-2022/
  11. National Institute of Communicable Diseases. Overview [homepage on the Internet]. 2023 [cited 2024 Dec 11]. Available from: https://www.nicd.ac.za/nmc-overview/overview/
  12. Dyer NP, Päuker B, Baxter L, et al. EnteroBase in 2025: Exploring the genomic epidemiology of bacterial pathogens. Nucleic Acids Res. 2025;53(D1):D757–D762. https://doi.org/10.1093/nar/gkae902
  13. Smith AM, Mthanti MA, Haumann C, et al. Nosocomial outbreak of Salmonella enterica serovar Typhimurium primarily affecting a pediatric ward in South Africa in 2012. J Clin Microbiol. 2014;52(2):627–631. https://doi.org/10.1128/JCM.02422-13
  14. Smith AM, Smouse SL, Tau NP, et al. Laboratory-acquired infections of Salmonella enterica serotype Typhi in South Africa: Phenotypic and genotypic analysis of isolates. BMC Infect Dis. 2017;17(1):656. https://doi.org/10.1186/s12879-017-2757-2
  15. Smith AM, Keddy KH, Ismail H, et al. International collaboration tracks typhoid fever cases over two continents from South Africa to Australia. J Med Microbiol. 2011;60(9):1405–1407. https://doi.org/10.1099/jmm.0.030700-0
  16. Tau NP, Smith AM, Wain JR, et al. Development and evaluation of a multiple-locus variable-number tandem-repeats analysis assay for subtyping Salmonella Typhi strains from sub-Saharan Africa. J Med Microbiol. 2017;66(7):937–945. https://doi.org/10.1099/jmm.0.000526
  17. Smith AM, Erasmus LK, Tau NP, et al. Enteric fever cluster identification in South Africa using genomic surveillance of Salmonella enterica serovar Typhi. Microb Genom. 2023;9(6):mgen001044. https://doi.org/10.1099/mgen.0.001044
  18. Smith AM, Tau NP, Smouse SL, et al. Outbreak of Listeria monocytogenes in South Africa, 2017–2018: Laboratory activities and experiences associated with whole-genome sequencing analysis of isolates. Foodborne Pathog Dis. 2019;16(7):524–530. https://doi.org/10.1089/fpd.2018.2586
  19. Clinical and Laboratory Standards Institute (CLSI). 2018. Performance standards for antimicrobial susceptibility testing; twenty-eighth informational supplement. CLSI document M100-S28. Wayne, PA: Clinical and Laboratory Standards Institute.
  20. Feldgarden M, Brover V, Haft DH, et al. Validating the AMRFinder tool and resistance gene database by using antimicrobial resistance genotype-phenotype correlations in a collection of isolates. Antimicrob Agents Chemother. 2019;63(11):e00483-19. https://doi.org/10.1128/AAC.00483-19
  21. Zhou Z, Alikhan NF, Sergeant MJ, et al. GrapeTree: Visualization of core genomic relationships among 100,000 bacterial pathogens. Genome Res. 2018;28(9):1395–1404. https://doi.org/10.1101/gr.232397.117
  22. Tau NP, Meidany P, Smith AM, Sooka A, Keddy KH. Escherichia coli O104 associated with human diarrhea, South Africa, 2004–2011. Emerg Infect Dis. 2012;18(8):1314–1317. https://doi.org/10.3201/eid1808.111616
  23. Nadon C, Van Walle I, Gerner-Smidt P, et al. PulseNet International: Vision for the implementation of whole genome sequencing (WGS) for global food-borne disease surveillance. Euro Surveill. 2017;22(23):30544. https://doi.org/10.2807/1560-7917.ES.2017.22.23.30544
  24. European Centre for Disease Prevention and Control. Thirteenth external quality assessment for Salmonella typing [homepage on the Internet]. Stockholm: ECDC; 2024 [cited 2024 Oct 31]. Available from: https://www.ecdc.europa.eu/en/publications-data/thirteenth-external-quality-assessment-salmonella-typing
  25. DTU Genomic. Genomic proficiency test 2023 [homepage on the Internet]. 2023 [cited 2024 Oct 31]. Available from: https://www.globalsurveillance.eu/projects/genomic-proficiency-test-2023
  26. Alikhan NF, Zhou Z, Sergeant MJ, Achtman M. A genomic overview of the population structure of Salmonella. PLoS Genet. 2018;14(4):e1007261. https://doi.org/10.1371/journal.pgen.1007261
  27. Deng X, Li S, Xu T, et al. Salmonella serotypes in the genomic era: Simplified Salmonella serotype interpretation from DNA sequence data. Appl Environ Microbiol. 2025;91(3):e0260024.
  28. Van Puyvelde S, De Block T, Sridhar S, et al. A genomic appraisal of invasive Salmonella Typhimurium and associated antibiotic resistance in sub-Saharan Africa. Nat Commun. 2023;14(1):6392. https://doi.org/10.1038/s41467-023-41152-6
  29. Sun H, Wan Y, Du P, Bai L. The epidemiology of monophasic Salmonella Typhimurium. Foodborne Pathog Dis. 2020;17(2):87–97. https://doi.org/10.1089/fpd.2019.2676
  30. Larkin L, Pardos de la Gandara M, Hoban A, et al. Investigation of an international outbreak of multidrug-resistant monophasic Salmonella Typhimurium associated with chocolate products, EU/EEA and United Kingdom, February to April 2022. Euro Surveill. 2022;27(15):2200314. https://doi.org/10.2807/1560-7917.ES.2022.27.15.2200314
  31. Carey ME, Thi Nguyen TN, Tran DHN, et al. The origins of haplotype 58 (H58) Salmonella enterica serovar Typhi. Commun Biol. 2024;7(1):775. https://doi.org/10.1038/s42003-024-06451-8
  32. Sekwadi P, Smith AM, Maruma W, et al. A prolonged outbreak of enteric fever associated with illegal miners in the City of Matlosana, South Africa, November 2020–September 2022. Open Forum Infect Dis. 2024;11(3):ofae118. https://doi.org/10.1093/ofid/ofae118
  33. Lamas A, Miranda JM, Regal P, Vázquez B, Franco CM, Cepeda A. A comprehensive review of non-enterica subspecies of Salmonella enterica. Microbiol Res. 2018;206:60–73. https://doi.org/10.1016/j.micres.2017.09.010
  34. Dos Santos AMP, Panzenhagen P, et al. Genomic characterization of Salmonella Isangi: A global perspective of a rare serovar. Antibiotics. 2023;12(8):1309. https://doi.org/10.3390/antibiotics12081309
  35. National Institute for Communicable Diseases. Communicable diseases communique, August 2022 [homepage on the Internet]. 2022 [cited 2025 Feb 14]. Available from: https://www.nicd.ac.za/wp-content/uploads/2022/08/310822-NICD-Monthly-Communique-Aug-NW5.pdf
  36. Suleyman G, Tibbetts R, Perri MB, et al. Nosocomial outbreak of a novel extended-spectrum β-lactamase Salmonella enterica serotype Isangi among surgical patients. Infect Control Hosp Epidemiol. 2016;37(8):954–961. https://doi.org/10.1017/ice.2016.85
  37. Wadula J, Von GA, Kilner D, et al. Nosocomial outbreak of extended-spectrum beta-lactamase-producing Salmonella Isangi in pediatric wards. Pediatr Infect Dis J. 2006;25(9):843–844. https://doi.org/10.1097/01.inf.0000233543.78070.a2
  38. De Sousa Violante M, Podeur G, Michel V, et al. A retrospective and regional approach assessing the genomic diversity of Salmonella Dublin. NAR Genom Bioinform. 2022;4(3):lqac047. https://doi.org/10.1093/nargab/lqac047
  39. Velasquez-Munoz A, Castro-Vargas R, Cullens-Nobis FM, Mani R, Abuelo A. Review: Salmonella Dublin in dairy cattle. Front Vet Sci. 2023;10:1331767. https://doi.org/10.3389/fvets.2023.1331767
  40. Mattock J, Chattaway MA, Hartman H, et al. A one health perspective on Salmonella enterica serovar Infantis, an emerging human multidrug-resistant pathogen. Emerg Infect Dis. 2024;30(4):701–710. https://doi.org/10.3201/eid3004.231031
  41. Alba P, Leekitcharoenphon P, Carfora V, et al. Molecular epidemiology of Salmonella Infantis in Europe: Insights into the success of the bacterial host and its parasitic pESI-like megaplasmid. Microb Genom. 2020;6(5):e000365. https://doi.org/10.1099/mgen.0.000365
  42. Mattock J, Smith AM, Keddy KH, et al. Genetic characterization of Salmonella Infantis from South Africa, 2004–2016. Access Microbiol. 2022;4(7):acmi000371. https://doi.org/10.1099/acmi.0.000371
  43. Brümmer B, Smith AM, Modise M, et al. Whole genome sequencing assisted outbreak investigation of Salmonella enteritidis, at a hospital in South Africa, September 2022. Access Microbiol. 2024;6(11):000835.v3. https://doi.org/10.1099/acmi.0.000835.v3