key: cord-0018711-2wvcshz8 authors: Yu, Yang; Yao, Changliang; Guo, De-an title: Insight into chemical basis of traditional Chinese medicine based on the state-of-the-art techniques of liquid chromatography−mass spectrometry date: 2021-02-26 journal: Acta Pharm Sin B DOI: 10.1016/j.apsb.2021.02.017 sha: 6463f8334ef5db526376c10f2f3a0d13f96b82a6 doc_id: 18711 cord_uid: 2wvcshz8 Traditional Chinese medicine (TCM) has been an indispensable source of drugs for curing various human diseases. However, the inherent chemical diversity and complexity of TCM restricted the safety and efficacy of its usage. Over the past few decades, the combination of liquid chromatography with mass spectrometry has contributed greatly to the TCM qualitative analysis. And novel approaches have been continuously introduced to improve the analytical performance, including both the data acquisition methods to generate a large and informative dataset, and the data post-processing tools to extract the structure-related MS information. Furthermore, the fast-developing computer techniques and big data analytics have markedly enriched the data processing tools, bringing benefits of high efficiency and accuracy. To provide an up-to-date review of the latest techniques on the TCM qualitative analysis, multiple data-independent acquisition methods and data-dependent acquisition methods (precursor ion list, dynamic exclusion, mass tag, precursor ion scan, neutral loss scan, and multiple reaction monitoring) and post-processing techniques (mass defect filtering, diagnostic ion filtering, neutral loss filtering, mass spectral trees similarity filter, molecular networking, statistical analysis, database matching, etc.) were summarized and categorized. Applications of each technique and integrated analytical strategies were highlighted, discussion and future perspectives were proposed as well. 
As the quintessence and treasure of China, traditional Chinese medicine (TCM) has been used to prevent and treat human diseases for millennia, and its therapeutic effects have been demonstrated through long clinical practice [1]. For this reason, TCM continues to inspire scientists around the world and serves as an ideal library for the development of drugs and dietary supplements [2]. However, unlike mono-component Western medicines, the constituents of TCM are complex and largely undisclosed, which leaves the mechanisms of action vague [3] and hampers the worldwide adoption of TCM, especially in the Western world. Therefore, it is of vital importance to conduct more in-depth research for a better understanding of the chemical complexity of TCM, especially of the constituents closely related to pharmacological activities and toxicities. However, on the one hand, a large number of minor and trace components exhibiting predominant biological activities cannot be detected with an ordinary approach; on the other, the compounds in TCM usually belong to one or multiple types of secondary metabolites, and metabolites of the same type are structurally similar and hard to differentiate because of their common biosynthetic pathways. Accordingly, the number and structural variety of metabolites constitute a great analytical challenge in terms of detection and identification. Combining the separation capacity of liquid chromatography (LC), especially ultra-high performance liquid chromatography (UPLC), with the remarkable selectivity and sensitivity of mass spectrometry (MS), the hyphenated analytical platform (LC−MS) shows significant advantages in analyzing complex chemical systems and has become one of the most powerful tools for the analysis of TCM [4].
In an endeavour to explore the structural variety of metabolites, the first step is to acquire as much LC−MS information as possible, and the second step is to convert the LC−MS data into chemical information. In these two steps, apart from the stationary phase development of LC [5] and hardware enhancement of MS instrumentation [6], the fast-developing MS data acquisition and data post-processing techniques also play a pivotal role [7,8]. Data acquisition methods in LC−MS can be roughly classified into data-independent acquisition (DIA) and data-dependent acquisition (DDA) methods. Typical DIA methods provide product ion information in a non-selective manner, without prior knowledge of the ions of interest. Featuring full coverage of MS1 and MS2 spectra in one injection, DIA methods largely avoid the loss of MS information and offer satisfactory reproducibility and quantitative performance [9]. However, it is a daunting task to precisely assign the product ions generated by DIA methods to their precursor ions, especially when co-eluting compounds exist, because the spectra are complex and the link between MS1 and MS2 is lost; more complicated deconvolution and data processing algorithms are therefore required to interpret the acquired MS data [10]. In a DDA method, a survey scan (usually a full scan, FS) is first launched, and if certain criteria are met, a further MS/MS (or MS^n) scan is automatically triggered [7]. The diverse criteria set up in DDA methods provide customized solutions that take advantage of the intrinsic chemical characteristics and fragmentation behaviors of the target compounds. In contrast to DIA data, structural information can be readily elucidated from DDA data, especially with the help of data post-processing techniques.
Even though many studies and applications of LC−MS techniques in TCM qualitative analysis have been reported, relevant reviews are still scarce, with only several categorizing different compound classes and citing representative examples of analyzing these compounds in TCM and TCM prescriptions [4,11–14]. Therefore, a summary of universal and applicable techniques in TCM analysis is urgently needed. This review primarily focuses on the applications of advanced MS data acquisition and data post-processing techniques (Fig. 1), as well as integrated analytical strategies, in the qualitative analysis of TCM. The mechanisms of these techniques are introduced first, followed by representative applications. It is anticipated that this review will provide a comprehensive reference for future research on metabolite identification in complex TCM systems.

Typical DIA modes include the AIF (all ion fragmentation) mode of the Orbitrap system (Thermo Fisher Scientific, CA, USA) [15], the MS^E (elevated energy MS) and SONAR modes of the Q-TOF system (Waters, MA, USA) [16,17], and the MS^ALL and SWATH™ (sequential window acquisition of all theoretical mass spectra) modes of the Triple TOF system (AB SCIEX, CA, USA) [18,19]. Among them, the AIF acquisition mode admits all precursor ions of a defined range into the HCD (high-energy C-trap dissociation) cell for fragmentation [15]. MS^ALL is similar to MS^E, which involves two scan functions: one at low collision energy (CE), collecting intact molecular ion information over a wide mass range, and the other at higher CE (fixed or ramped), collecting product ion data useful for structural elucidation [16]. However, even though the aforementioned DIA modes avoid missing MS/MS information of minor or trace components in TCM, the complex MS/MS spectra resulting from co-eluting ions still restrict the widespread use of DIA techniques in TCM qualitative analysis.
Fortunately, manufacturers have been improving the performance of DIA modes through instrument enhancement. For example, the newly developed SWATH™ acquisition mode fragments all ions across a given mass range in sequential narrow selection windows, thus retaining the benefits of both selectivity and scan speed [20]. The SONAR acquisition mode operates in a similar way: the resolving quadrupole slides over a selected mass range during each MS scan, with the collision cell alternating between high and low CE from scan to scan, and the software aligns and associates the precursor and product ions in the different data channels. In addition, the ion mobility (IM) technique helps to obtain relatively clean, high-quality MS/MS spectra through enhanced separation of co-eluting ions based on their sizes, shapes and charge states [21]. Different DIA-based strategies have been developed for TCM analysis. For example, MS^ALL was applied to identify triterpene saponins in Acanthopanax senticosus leaves by simultaneously collecting all the precursor ions ([M+H]+, [M+NH4]+ and [M+Na]+) and product ions with ramped CE on an AB SCIEX QTOF system, with SWATH mode serving as a complement for co-eluting ions [22]. To characterize lanostane-type triterpene acids in Poria cocos, a UPLC−IM−MS^E method was used to obtain multi-dimensional information including retention time, drift time and mass-to-charge ratio (m/z), where UPLC provided preliminary separation and IM enabled identification of co-eluting compounds by aligning precursor ions and the corresponding product ions with the same drift time [23].

Traditionally, in a DDA method, precursor ions are selected for fragmentation according to their signal abundances in MS1 scans. However, data acquisition takes on a stochastic nature as TCM mixtures become more complex. In addition, fragmentation data are available only for the top-n precursor ions.
As a result, conventional DDA often encounters obstacles in comprehensively detecting minor or trace components in TCM and in systematically characterizing metabolites of the same type. Since the multiple types of constituents in a given TCM are closely related to one or more common biosynthetic pathways, they can be structurally classified into several chemical families that share the same carbon skeletons or substructures and vary in their substituent groups. Therefore, the MS fragmentation patterns of chemically similar compounds usually display the same neutral losses, product ions, or other characteristics. Accordingly, various DDA techniques that target one or multiple types of metabolites and cover trace metabolites have been developed.

In a precursor ion list (PIL) experiment, only (or preferentially) the target ions contained in the list are selected to trigger multistage fragmentation. PIL-based acquisition has thus become a commonly used DDA technique with high selectivity, sensitivity and efficiency in the discovery and characterization of target compounds. Unlike the conventional DDA method, which automatically fragments the top-n most abundant ions, PIL-based acquisition acquires MS/MS or MS^n data regardless of peak abundance, overcoming the drawback of neglecting minor components in TCM extracts. The key to PIL-based acquisition is to establish a sample-specific PIL based on molecular features, namely the m/z value alone or in combination with retention time [24]. One of the most widespread approaches is to collect previous phytochemical reports for further structural prediction. By removing or adding different substituent groups on certain core substructures in accordance with the biosynthetic pathways, the molecular formulas of these derivatives are predicted, and the corresponding molecular weights and theoretical parent ions can then be calculated.
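The prediction step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of any cited study: the flavone core (C15H10O2), the choice of hydroxyl and methoxyl substituents, the range of 2 to 7 groups, and the helper names `mono_mass` and `build_pil` are all assumptions made for the example. Each substituent is encoded as the net change in elemental composition when it replaces a hydrogen atom.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Monoisotopic masses (Da) of the elements used here, and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def build_pil(core, substituents, min_groups, max_groups):
    """Enumerate substituent combinations on a core and return
    (label, theoretical [M+H]+ m/z) pairs."""
    core_mass = mono_mass(core)
    pil = []
    for total in range(min_groups, max_groups + 1):
        for combo in combinations_with_replacement(sorted(substituents), total):
            counts = Counter(combo)
            mass = core_mass + sum(
                mono_mass(substituents[name]) * k for name, k in counts.items()
            )
            label = "+".join(f"{k}{name}" for name, k in sorted(counts.items()))
            pil.append((label, round(mass + PROTON, 4)))
    return pil


# Hypothetical inputs: an unsubstituted flavone core and two substituents,
# each written as the net formula change when replacing one H atom.
FLAVONE = {"C": 15, "H": 10, "O": 2}
GROUPS = {"OH": {"O": 1}, "OMe": {"C": 1, "H": 2, "O": 1}}

pil = build_pil(FLAVONE, GROUPS, 2, 7)
for label, mz in pil[:5]:
    print(f"{label:12s} {mz:.4f}")
```

In practice the resulting m/z values would be exported in whatever list format the instrument software expects, and other adducts (for example [M+Na]+ or [M−H]−) could be generated by swapping the proton term.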
A PIL constructed in this way encompasses both known and predicted compounds, thus greatly increasing the coverage and improving the selectivity of detecting expected compounds in TCM. Another approach is to use various data mining techniques to explore the potential metabolites, such as mass defect filtering (MDF) [25–27], neutral loss filtering (NLF) [28,29] and molecular networking (MN) [30]. Their principles and applications in the qualitative analysis of TCM are discussed in depth in Section 3. For example, compound-specific PILs were generated by summarizing the reported natural compounds and applied to profile the components of Carthamus tinctorius L. (safflower) [31], Ligustri Lucidi Fructus [32], etc. In addition, based on compound prediction, aglycones substituted by 2–7 hydroxyl and methoxyl groups were included in a PIL, achieving targeted screening of 135 polymethoxylated flavonoids in a sample of Citrus reticulata Blanco on a hybrid linear ion trap/Orbitrap (LTQ-Orbitrap) [33]. Another PIL, constructed by arranging different substituent groups on tanshinol, was used to characterize polymeric phenolic acids in Salvia miltiorrhiza [34]. One successful application of generating a PIL with data post-processing techniques relied on an FS-based untargeted metabolomics approach [35]. This method shows great advantages in analyzing complex systems, as no prior knowledge of the sample is necessary. Moreover, separating a long PIL into several individual ion lists based on m/z value or time point can markedly reduce data loss and the duty cycle of the instrumentation [34]. There is evidence that DDA triggered by a time-staggered PIL is more selective than conventional DDA in activating MS/MS acquisition of co-eluting ions [35].
In summary, PIL-MS^n, especially in the time-triggered mode, excels at acquiring the fragmentation information of target ions, thereby enhancing the selective detection of target components in TCM.

In contrast to the PIL−MS/MS technique, which preferentially triggers MS/MS fragmentation of the predefined ions in a list, exclusion list (EL)-based acquisition methods improve the selectivity and sensitivity of characterization by entering an EL of masses that are ignored for MS^n analysis. They are advantageous in removing background interfering ions and non-target ions, and show potential for exposing more novel compounds [36]. Apart from abundant interfering ions, there will always be target components that completely overlap when complex TCM systems are analyzed, however well the chromatographic separation conditions are optimized. Under these circumstances, only the top-n most intense ions are selected for fragmentation, and the MS/MS information of less abundant co-eluting ions escapes detection. Dynamic exclusion (DE) solves this problem by temporarily placing a mass that exceeds the defined threshold into an EL for a selected period of time after its MS/MS spectrum has been acquired, so that less abundant ions are analyzed instead of the abundant ions being scanned repeatedly [37]. For this reason, DE is frequently enabled to improve detection coverage in TCM qualitative analysis, especially for less abundant co-eluting ions [25,26,28,38]. The advantages of DE were fully illustrated by the screening of polymethoxylated flavonoids from Citrus reticulata Blanco [33]. Compared with FS, FS-PIL and FS-DE on the LTQ-Orbitrap, FS-PIL-DE acquisition was found to be the most powerful in triggering MS/MS fragmentation. Its superiority in acquiring MS^n information was further confirmed by another study screening indole alkaloids from Uncaria sinensis [26]. Appropriate DE parameters are also essential for MS data acquisition.
Because these parameters are closely related to the chromatographic peak width, optimal chromatographic conditions should be investigated to minimize the differences in peak width at different retention times [34]. In addition, a short repetition time and a long exclusion time can minimize the loss of MS information in a complex system but may miss the fragmentation of neighboring isomers. Therefore, these two parameters always need to be balanced for better data quality.

In mass tag-dependent acquisition, parent ions with certain mass differences are recognized as mass partners and selected to trigger MS^n fragmentation. The mass partners can be produced by introducing isotopically labeled compounds, such as chrysin-d0 and chrysin-d5 with a mass difference of 5 Da. The mass tag technique can largely improve signal-to-noise ratios and data quality in the detection of tagged compounds, including reactive metabolites [39] and modified peptides [40]. It can be performed on an LTQ-Orbitrap and has been applied to selectively detect phosphopeptides and N-glycopeptides with in-source collision-induced dissociation (ISCID) energy switched on. After the m/z values of the charged ions in the MS1 spectra were determined and converted into masses, mass partners that differed by the mass of the defined mass tag and had intensities above the defined threshold were selected to trigger MS2 acquisition with ISCID off [40]. Recently, mass tag applications have expanded into TCM qualitative analysis, especially for targeted screening of components that readily undergo in-source fragmentation. For example, malonylginsenosides from three Panax species were selectively screened by defining a mass tag of 43.9898 Da with an ISCID energy of 40 V, corresponding to the elimination of CO2 [41].
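The mass partner selection described above can be illustrated offline with a short Python sketch. It is a simplified stand-in for the instrument's real-time logic, assuming singly charged masses already extracted from one MS1 spectrum; the function name `find_mass_partners`, the tolerance, the intensity threshold and the peak list are hypothetical values chosen for the example.

```python
from bisect import bisect_left, bisect_right


def find_mass_partners(peaks, tag, tol=0.005, min_intensity=1e4):
    """Return (lighter, heavier) mass pairs that differ by `tag` within
    `tol` Da, considering only peaks at or above `min_intensity`."""
    kept = sorted((m, i) for m, i in peaks if i >= min_intensity)
    masses = [m for m, _ in kept]
    pairs = []
    for m in masses:
        lo = bisect_left(masses, m + tag - tol)
        hi = bisect_right(masses, m + tag + tol)
        pairs.extend((m, masses[j]) for j in range(lo, hi))
    return pairs


# Hypothetical MS1 peak list: (mass in Da, intensity).
peaks = [
    (800.4922, 5e5),
    (844.4820, 2e5),   # 800.4922 + 43.9898 -> partner of the first peak
    (900.0000, 1e6),
    (943.9898, 5e3),   # right mass difference, but below the threshold
]

CO2_TAG = 43.9898  # mass tag used for malonyl conjugates in the text
partners = find_mass_partners(peaks, CO2_TAG)
print(partners)
```

Sorting once and using binary search keeps the pairing close to linear in the number of peaks, which matters when the same check is repeated for every survey scan.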
In addition, by applying an ISCID of 50 V and a mass tag of 46.0055 Da (corresponding to formic acid), neutral ginsenosides from three Panax species were also successfully classified and characterized [42]. In summary, mass tag-oriented acquisition has unique advantages in selectively screening target components in TCM with high coverage and fewer false positives.

Precursor ion scan (PIS) can be performed on a triple quadrupole (QqQ) mass spectrometer. Ions within a defined mass range are scanned in the first quadrupole (Q1) and fragmented in the collision cell (q2), and one specific product ion is then selected in the third quadrupole (Q3) [43]. This acquisition mode allows selective screening of compounds with identical product ions, and is popular in metabolite identification [44] and lipidomics studies [45]. PIS also shows great potential in the characterization of various TCM components. Its advantage lies in requiring little prior knowledge of the exact structures of the compounds. With the PIS technique, target compounds that are expected to produce the same product ions, such as esters, saponins and flavonoids, can be screened out. Examples include benzoyl-substituted diterpenoid alkaloids with the diagnostic benzoyl ion at m/z 105 [46]; caffeoylquinic acid derivatives with the diagnostic product ion at m/z 191, corresponding to the quinic acid anion [47]; and oplopane-type sesquiterpenoids with ions at m/z 215 and 217, together with bisabolane-type sesquiterpenoids with ions at m/z 229 and 231 [48]. In addition, glycosyl-conjugated compounds can be easily detected using characteristic ions corresponding to their core structures and sugar moieties. As an example, triterpene saponins in Glycyrrhiza yunnanensis with saccharide chains of glucuronic acid (GluA) and rhamnose (Rham) were detected using ions at m/z 351 [2GluA−H]− and 497 [2GluA+Rham−H]− [49]. Polyoxypregnane and its glycosides were identified from Marsdenia tenacissima (Roxb.)
Wight et Arn., using characteristic ions at m/z 329 and m/z 273, corresponding to the aglycone core and the sugar moiety at the C-3 position, respectively [50]. Characteristic ions at m/z 153 (+) and 151 (−) were used to identify glycosyl-free flavonoids, and the related glycosyl-conjugated flavonoids were then searched in PIS mode using the ions of the identified glycosyl-free flavonoids [51]. The PIS-information dependent acquisition (IDA)-enhanced product ion (EPI) mode implemented on a hybrid triple quadrupole-linear ion trap (QTRAP) mass spectrometer is more effective than QqQ experiments in the rapid detection and characterization of certain types of analytes, owing to an additional MS/MS fragmentation. It was applied to rapidly identify phenolic constituents in Danhong injection using six diagnostic ions, including m/z 197 (−) generated from danshensu-related compounds, and m/z 137 ([M−H]−) and 108 ([M−H−CHO]−) generated from protocatechualdehyde-related compounds [52]. Wider applications of PIS-IDA-EPI in TCM qualitative analysis include the identification of naphthalenyl glycosides and amino naphthoquinones in Juglans cathayensis [53], coumarins in Glehniae Radix [54], and phenylethanoid glycosides, iridoids, and lignans in Cistanche deserticola and C. tubulosa [55].

In the neutral loss scan (NLS) mode of QqQ and QTRAP instruments, ions within a defined mass range are scanned in Q1 and fragmented in q2, while the Q3 scan range is shifted to lower mass by Δm, which corresponds to a specific neutral loss (NL) from every potential precursor ion [43], e.g., 162 Da corresponding to glucose. NLS is widely used for specific screening of modified metabolites that undergo neutral eliminations, and is also highly selective and sensitive for profiling and characterizing specific conjugated components in TCM, as well as for discovering new compounds.
Because O-glycosides readily undergo the NL of sugars, NLS was used to profile glycosides with different aglycones in tobacco leaves [56]. The NL-IDA-EPI scan mode was employed to screen glycosyl flavonoids in Astragali Radix, with an NL of 162 or 179 Da for flavonoid aglycones that tended to form protonated or ammoniated glycosides, respectively [57]. Triterpene saponins in Glycyrrhiza yunnanensis were profiled using adduct-targeted NL-EPI by monitoring NLs of 17 Da (NH3) and 46 Da (HCOOH) generated from their ammonium and formic acid adducts, respectively [58]. Sulfated flavonoids in Flaveria were detected by an NL of 80 Da (SO3) in negative mode [59]. Iridoids, lignans, and phenylethanoid glycosides in Cistanche deserticola and C. tubulosa were screened with NLs of 162 Da (glucose residue or a C9H6O3 group), 146 Da (coumaroyl group) and 176 Da (feruloyl group) [55]. In addition, NLS has also been enabled on the high-resolution LTQ-Orbitrap, where the exact mass of the neutral loss allows the targets to be screened more accurately. For instance, malonylginsenosides were screened from nine Ginseng extracts with the NL set at 43.9898 Da, corresponding to the loss of a CO2 unit from the malonyl group [41]. Steroid substructures of dicarboxylic acid-conjugated bufotoxins in Venenum bufonis were obtained by NL-based MS3 acquisition in positive ion mode using the neutral masses of seven dicarboxylic acids [38].

Multiple reaction monitoring (MRM) is a typical scan mode of QqQ and QTRAP instruments. With an appropriate CE, selected parent ions passing through Q1 fragment in q2 and generate product ions that pass through Q3. Under optimal MS conditions, MRM-based acquisition, which simultaneously monitors a specific parent ion and its most intense product ion, can drastically improve the selectivity and sensitivity of detecting target components in TCM.
The MRM scan mode was adopted to screen out 77 flavonoid glycosides in the flower of Carthamus tinctorius L. [60], identify 80 triterpenes in Alismatis Rhizoma [61], etc. MRM-IDA-EPI scanning on a QTRAP mass spectrometer was also used to identify 27 major components (18 diterpenoids, 6 phenolic acids, and 3 flavonoids) in Isodon serra [62], characterize 421 flavonoids in Astragali Radix [57], verify the structures of all tentative curcuminoids in turmeric [63], and screen ginsenosides in Ginseng, American Ginseng and their processed materials [64]. However, increasing the number of analytes monitored in a single run lowers the detection sensitivity, because either the dwell time of each transition is reduced so that insufficient signal is accumulated, or the cycle time of each MS scan increases so that fewer data points are collected across each peak [65]. Scheduled MRM (sMRM, or dynamic MRM) makes it possible to increase the number of MRM transitions without compromising data quality by monitoring each transition only within a defined retention time window [66]. The superiority of sMRM over conventional MRM (cMRM) was demonstrated on a mixture made up of eight commonly used TCMs, in which more analytes were detected with the sMRM method than with the cMRM method [67].

Multiple ion monitoring (MIM) can be considered a special type of MRM in which minimal collision energy is applied in q2, so that the product ion monitored in Q3 is identical to the parent ion in Q1. In general, suitable ion transitions in an MRM method are designed from prior knowledge of the fragmentation patterns of the parent ion, which makes it very challenging to screen unknown compounds.
The MIM method can overcome these limitations and be used to detect potential compounds regardless of their fragmentation patterns; it thus plays an important role in targeted monitoring of compounds of interest in TCM and can serve as a complementary approach to MRM [68]. For example, to identify naphthoquinones in Juglans cathayensis, an MIM-EPI method was built on 36 different molecular weights of all reported naphthoquinones from the family Juglandaceae; stepwise MIM-EPI with a wide specified mass range (150.0–700.0 Da) was conducted to discover untargeted naphthoquinones, and segmented stepwise MIM-EPI with a narrow specified mass range was used to identify certain types of naphthoquinones [53]. Another strategy integrating predefined MRM, stepwise MIM, and EPI scans was proposed to universally screen the hydrophilic substances in Shenfu injection; a total of 157 hydrophilic compounds were detected, 154 of which were identified as amino acids, nucleosides, organic acids, carbohydrates, etc. [69].

Conventionally, the structures of compounds were elucidated manually from their MS fragmentation patterns, which requires laborious spectral assignment and produces somewhat experience-dependent results. Like the DDA techniques, newly developed data post-processing techniques take full advantage of the common neutral losses, product ions, or other characteristics produced by MS/MS fragmentation. In addition, MS characteristics in FS, such as the mass defect, can also serve as a clue for compound recognition. In contrast to the DDA techniques, different data post-processing techniques can be applied to the same MS data for different purposes without reacquiring the data. For example, the MS data of Carthamus tinctorius L.
acquired by LTQ-Orbitrap was explored for both the quinochalcone C-glycosides [70] and the flavonoid O-glycosides [71] with versatile data mining strategies, showing the advantage of data post-processing techniques in reusing MS data. In short, data post-processing techniques can greatly simplify the manual interpretation of MS data, providing more convenience, more evidence and higher accuracy for TCM qualitative analysis.

The mass defect is calculated as the difference between the exact mass and the nominal integer mass of the constituent elements; in this way, each molecule can be described by a unique exact mass and mass defect based on its elemental composition [72]. Therefore, structural analogues sharing a similar core substructure with various substituent groups have very similar mass defects, and the limited differences arise only from the different substituent groups. Based on this fact, the mass defect filtering (MDF) technique was developed and is used as a specific filtering criterion to selectively pick out certain compounds or compound classes in a complex mixture, for example, a target class of endogenous substances, or a parent drug and its metabolites [73,74], in a complex biological sample within an allowable mass defect range. In addition, MDF is a powerful tool to expedite the screening of different components in TCM and to generate cleaner profiles by providing more distinct and specific information [75]. Classic MDF algorithms, including fixed MDF algorithms with a rectangular coverage area and linear gradient MDF algorithms with a parallelogram coverage area [76], are conveniently available in commercial software such as Peakview [77], Metabolynx XS [78], Metworks [79] and MetID [80]. Both the mass range and the mass defect range can be defined for automatic exclusion of irrelevant ions from complex matrices.
For example, a mass defect filter (mass range: 367.5 ± 65.5 Da; mass defect: 186 ± 40 mDa) was established using Metworks, which not only encompassed all free indole alkaloids in Uncaria rhynchophylla, but also distinguished indole alkaloids from triterpenic acids (mass range: 453–649 Da; mass defect: 333–413 mDa) and other components [79]. In addition, a stepwise MDF approach, which divides the mass defect range or mass range into multiple mass defect windows or mass windows, was shown to improve the signal-to-noise ratio of the target peaks, and polymethoxylated flavonoids were successfully screened from the leaves of Citrus reticulata Blanco [81]. A strategy combining a single MDF window (mass range: 267–418 Da; mass defect: 0.04–0.12 Da) for methoxylated flavonoids with multiple MDF windows (mass range/mass defect: 337–398 Da/0.08–0.12 Da; 483–604 Da/0.10–0.18 Da; 629–810 Da/0.13–0.23 Da) for three classes of chlorogenic acids was also successfully constructed and applied to Folium Artemisiae Argyi [82]. However, the distribution space of target compounds established by a classic MDF strategy usually covers quite a large area, which increases the possibility of false positives; several in-house polygonal MDF techniques have therefore been developed to narrow the distribution space and focus more on the target ions. A modified MDF strategy, termed the "five-point screening" MDF strategy, was proposed to rapidly screen saponins in Panax notoginseng [83]. Every selected notoginsenoside ion was distributed in the region delimited by five points (integer mass as x and decimal mass as y), and the "IF" function of Microsoft Excel was used to pick out every potential ion distributed in this region in the .xls documents. Compared with classic MDF, this modified approach excluded more of the putative space and thus increased the screening efficiency.
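Both the rectangular window and a polygonal region of the kind described above reduce to simple geometric tests on the (mass, mass defect) plane, as the following Python sketch suggests. It is not the implementation of Metworks or of the cited Excel-based strategies. The mass defect is taken here as the decimal part of the m/z value in mDa, which is an assumption that suits positive defects only; the rectangular window reuses the indole alkaloid values quoted in the text, while the pentagon vertices and the ion list are hypothetical.

```python
import math


def mass_defect_mda(mz):
    """Decimal part of m/z expressed in mDa (assumed defect definition)."""
    return (mz - math.floor(mz)) * 1000.0


def rectangular_mdf(mz_values, mass_center, mass_half_width,
                    defect_center, defect_half_width):
    """Classic fixed MDF: keep ions inside a rectangular window."""
    return [
        mz for mz in mz_values
        if abs(mz - mass_center) <= mass_half_width
        and abs(mass_defect_mda(mz) - defect_center) <= defect_half_width
    ]


def in_polygon(x, y, vertices):
    """Ray-casting point-in-polygon test for a list of (x, y) vertices."""
    inside = False
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygonal_mdf(mz_values, vertices):
    """Polygonal MDF: integer mass as x, decimal mass (mDa) as y."""
    return [
        mz for mz in mz_values
        if in_polygon(math.floor(mz), mass_defect_mda(mz), vertices)
    ]


ions = [385.2122, 385.0500, 471.3470, 310.2900]  # hypothetical m/z values

# Rectangular window from the text: 367.5 +/- 65.5 Da, 186 +/- 40 mDa.
rect_hits = rectangular_mdf(ions, 367.5, 65.5, 186.0, 40.0)

# Hypothetical five-point region (integer mass in Da, defect in mDa).
PENTAGON = [(300, 100), (450, 100), (450, 250), (380, 300), (300, 250)]
poly_hits = polygonal_mdf(ions, PENTAGON)

print(rect_hits, poly_hits)
```

The polygon test generalizes directly to the octagonal or other in-house regions: only the vertex list changes.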
Another polygonal MDF algorithm, defined by a relatively compact octagonal region, was used to screen target alkaloids in Uncaria sinensis and to classify them into three novelty levels: known, unknown-but-predicted, and unexpected [26]. The region defined by eight vertices was generated from a two-dimensional distribution plot of the mass range (Da) and the mass defect range (mDa) of the known and unknown-but-predicted molecules, and multiple "IF" equations representing the octagonal region were edited in Microsoft Excel to rapidly screen the alkaloid components. This modified MDF algorithm greatly enhanced the accuracy of screening target components and, more importantly, enabled the discovery of novel compounds. Despite the great efforts made in exploring different MDF algorithms, including classic rectangular MDF and modified MDF algorithms, to reduce false positives and expand coverage, tedious data processing is still required to generate sample-specific target ions. To solve this problem, a raster-MDF screening strategy was implemented through a stepwise PIL with a parent mass width of ±55 mDa, and used to screen and characterize indole alkaloids from Uncariae Ramulus Cum Uncis of multiple botanical origins [84]. This strategy not only comprehensively extended the coverage of potential indole alkaloids, but also excluded interference from other components of the genus Uncaria.

Background interferences are common in TCM analysis. They can originate from any step of sample preparation, such as plasticizers from pipette tips and centrifuge tubes [85], or simply from chemical noise arising from mobile phases or buffers in LC−MS analytical systems [86]. The background subtraction (BS) technique was therefore introduced to suppress background noise and obtain relatively clean spectra.
It can be implemented manually [87]; directly in commercial software (Waters Masslynx, Thermo Scientific Metworks [26,88], etc.) by selecting a region containing background noise and subtracting it from the original chromatogram; by preparing blank samples in the same manner as the test samples to detect and remove background interferences; or with customized programs [89]. BS is useful for eliminating background signals and for exposing chemical constituents in TCM, especially those of low abundance. To identify unknown compounds in Er-xian decoction, signals corresponding to background interferences were checked manually and excluded from 533 candidate compounds, and finally 240 non-background compounds were extracted [87]. To comprehensively detect and characterize the chemical constituents in Arnebiae Radix, the total ion chromatogram (TIC) was first processed with BS and then filtered with MDF, resulting in the characterization of 96 compounds, including 30 of low abundance [88].

Isotope pattern filtering (IPF) was specially developed for compounds containing elements with distinct isotopic distribution patterns in mass spectra, such as chlorine, bromine, iodine, etc. With the aid of IPF, compounds containing certain elements can be easily screened out. For instance, an ion pair with an isotopic mass distance of 2 Da and nearly equal isotopic abundance most likely corresponds to a compound containing one bromine atom. Apart from these natural stable isotopes, isotope-labeled compounds, such as 14C-, 15N- and 18O-labeled compounds, are also widely used to trace and detect related metabolites [90].
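The bromine example above translates into a very small filter, sketched below in Python. It is a minimal illustration rather than the algorithm of any commercial package: the spacing of 1.99795 Da (the 81Br minus 79Br mass difference), the tolerance, the accepted intensity ratio window and the peak list are assumptions chosen for the example, and all intensities are assumed to be positive.

```python
def find_monobromo(peaks, delta=1.99795, tol=0.005, ratio_range=(0.8, 1.2)):
    """Return (M, M+2) m/z pairs whose spacing matches `delta` within
    `tol` Da and whose intensity ratio I(M+2)/I(M) lies in `ratio_range`."""
    low, high = ratio_range
    hits = []
    for m1, i1 in peaks:
        for m2, i2 in peaks:
            if abs((m2 - m1) - delta) <= tol and low <= i2 / i1 <= high:
                hits.append((m1, m2))
    return hits


# Hypothetical centroided spectrum: (m/z, intensity).
spectrum = [
    (300.00000, 1000.0),
    (301.00340, 150.0),    # 13C isotope peak, wrong spacing
    (301.99795, 970.0),    # M+2 with nearly equal abundance
    (400.00000, 500.0),
    (401.99800, 50.0),     # right spacing, but ratio far from 1
]

bromo_pairs = find_monobromo(spectrum)
print(bromo_pairs)
```

Other elements can be targeted by changing `delta` and `ratio_range`, for example a ratio near 0.32 for one chlorine atom; the quadratic loop is acceptable for single spectra but would need indexing for whole data files.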
Compared with traditional IPF implemented in commercially available software (e.g., MetWorks), newly developed algorithms such as accurate-mass-based spectral-averaging IPF enhance specificity by inspecting not only the M+2 isotopic ion and the M molecular ion, but also the M+1 isotopic ion 91 . In recent years, sulfur-containing compounds of very low abundance in TCM, including Pueraria lobata (Willd.) Ohwi (Gegen) and Pueraria thomsonii (Fenge), were detected and characterized by a fine IPF technique 92 . Despite the low abundance of the 34S signal and the interference from 13C2 + 18O, this IPF strategy screened S-containing compounds by defining the accurate mass shift and relative abundance between M+2S (representing the M+2 isotopic ion of 12 …

Similar to PIS, diagnostic ion filtering (DIF) provides a criterion for rapid classification of target compounds that produce identical product ions, by extracting the relevant ions in the MS/MS or MSn channels. Whereas PIS can only be performed on low-resolution instruments such as triple quadrupoles, DIF can be applied to high-resolution data acquired on QTOF, LTQ-Orbitrap, and Q-Orbitrap instruments, which gives it high selectivity. In addition to the diagnostic ions, the high-resolution data and abundant fragmentation information drastically increase identification confidence. DIF has been applied to characterize different types of chemical components in TCM, including saccharides and glycosides 93 , quinones 94 , phenylpropanoids 82,95 , flavonoids 96–98 , triterpenes 61 , triterpenoid saponins 22,95,99,100 , alkaloids 101, 102 , other compounds such as physalins 103 , as well as multiple compound classes in the same sample 104, 105 . To obtain abundant diagnostic ions, different dissociation methods at different energies have been explored on the LTQ-Orbitrap.
For example, to selectively identify flavonoid O-glycosides from Carthamus tinctorius, two fragmentation modes, CID and HCD, were combined to obtain complementary fragmentation information, including high-mass product ions produced by loss of sugars, low-mass product ions originating from the aglycone moieties, and intensity ratios of radical aglycone ion species useful for precise elucidation of aglycones and glycosylation patterns 71 . DIF combined with enhanced separation strategies such as multidimensional LC and IM can greatly expand the peak capacity and enhance the sensitivity of characterizing structurally diverse minor components in TCM. An LTQ-Orbitrap approach using DIF of m/z 119.05 (C8H7O−) was developed to characterize 163 quinochalcone C-glycosides in Carthamus tinctorius L. by combining it with offline two-dimensional LC separation 70 . Another multidimensional analytical strategy combining LC−IM−MSE and LC−IM−MS/MS was established to screen polycyclic polyprenylated acylphloroglucinols using the diagnostic ions m/z 165.0182 and/or 177.0182, leading to the successful characterization of 140 targets in Garcinia oblongifolia 106 . In addition, DIF has significantly facilitated the rapid classification and structural elucidation of the more complicated multi-component mixtures in TCM prescriptions, including ginsenosides and lignans in Shengmai injection 107 , phenolic acids, saponins, diarylheptanoids and gingerol-related compounds in Guge Fengtong tablet 108 , flavonoids, ginkgolides, phenolic acids, tanshinones, and saponins in Yindan Xinnaotong soft capsule 109 , etc. In summary, DIF is an option for filtering and classifying target ions, as well as for removing interfering ions from raw LC−MS data.

Neutral loss filtering (NLF) works by extracting the predicted specific neutral loss between the daughter ions and the parent ions.
NLF is similar to NLS, but has the advantage that several filters can be applied to the same dataset to identify multiple targets, which avoids multiple injections. NLF is promising for untargeted characterization of endogenous modified metabolites with specific neutral loss fragments, such as those arising from acetylation, sulfation, glucuronidation, etc. 110 , and for exclusive profiling of phase II metabolites such as GSH conjugates in drug metabolism studies 111 . In the analysis of TCM, NLF also finds its place in the characterization of glycosidic compounds such as flavonoid glycosides 71,109 , alkaloid glycosides 79 and triterpenoid saponins 95 , acylated compounds such as malonyl ginsenosides 99 and diester-diterpenoid alkaloids 112 , as well as other compounds substituted with certain moieties such as methoxylated flavonoids 82 , etc. For example, the diester-diterpenoid alkaloids in Aconitum carmichaelii Debx. were determined by the diagnostic neutral losses of 60, 28, 32, 18 and 122 Da, corresponding to AcOH, CO, MeOH, H2O and BzOH moieties, respectively 112 . In addition, NLF of the recognized sugar and acyl groups (GluA, Glc, Rha, Ace) was conducted for targeted characterization of flavonoid O-glycosides in Carthamus tinctorius, and some unknown ones (132.04, 86.00 and 71.98 Da for Xyl, Mal, and Oxa, respectively) were also detected and characterized 71 . A software tool named Neutral Loss MS Finder was developed to extract in-source NLs 110 , and was successfully applied to screen the target malonyl-ginsenosides from three Panax species 29 . Because both neutral CO2 and C3H2O3 groups are removed in-source, two NL filters (43.9898 and 86.0004 Da) were set to reduce false positives and enhance characterization accuracy, ultimately producing peak lists containing 156, 130 and 126 target malonyl-ginsenosides in the three Panax species, respectively.
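Both fragment-based filters discussed above, DIF and NLF, reduce to tolerance checks on each MS/MS spectrum: DIF asks whether a given product ion is present, and NLF asks whether the difference between precursor and product ion matches a predicted loss. The Python sketch below illustrates this under stated assumptions. The `Spectrum` class, the 0.005 Da tolerance, the `require_all` switch, and the example spectra are hypothetical; only the filter values (m/z 119.05, and 43.9898 and 86.0004 Da) are taken from the text. Note that the sketch compares precursor and product ions, whereas the in-source losses mentioned above are observed between ions in the full-scan spectrum.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Spectrum:
    precursor_mz: float
    fragment_mz: List[float]


def has_diagnostic_ion(spec: Spectrum, ion_mz: float, tol: float = 0.005) -> bool:
    """True if any product ion lies within tol (Da) of the diagnostic ion."""
    return any(abs(f - ion_mz) <= tol for f in spec.fragment_mz)


def has_neutral_loss(spec: Spectrum, loss: float, tol: float = 0.005) -> bool:
    """True if precursor minus some product ion matches the neutral loss within tol."""
    return any(abs((spec.precursor_mz - f) - loss) <= tol for f in spec.fragment_mz)


def diagnostic_ion_filter(spectra: Sequence[Spectrum], ions: Sequence[float],
                          tol: float = 0.005) -> List[Spectrum]:
    """Keep spectra that contain at least one of the diagnostic ions."""
    return [s for s in spectra if any(has_diagnostic_ion(s, i, tol) for i in ions)]


def neutral_loss_filter(spectra: Sequence[Spectrum], losses: Sequence[float],
                        tol: float = 0.005, require_all: bool = True) -> List[Spectrum]:
    """Keep spectra showing all (or any, if require_all is False) of the losses."""
    combine = all if require_all else any
    return [s for s in spectra if combine(has_neutral_loss(s, l, tol) for l in losses)]


# Hypothetical spectra for illustration only.
s1 = Spectrum(1031.5430, [987.5532, 945.5426, 119.0500])
s2 = Spectrum(800.0000, [756.0102, 300.0000])
s3 = Spectrum(500.0000, [119.0497, 250.0000])
spectra = [s1, s2, s3]

both_losses = neutral_loss_filter(spectra, [43.9898, 86.0004])
either_loss = neutral_loss_filter(spectra, [43.9898, 86.0004], require_all=False)
with_ion = diagnostic_ion_filter(spectra, [119.05])
```

Requiring both losses keeps only the first spectrum, whereas accepting either loss also keeps the second; this mirrors the use of two NL filters to reduce false positives. Because each filter is a pure function over the same list of spectra, any number of filters can be applied to a single acquisition, which is the practical advantage of NLF over NLS noted above.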
Mass spectral trees similarity filter (MTSF) is a data post-processing technique for rapid classification of unknown compounds by comparing the similarity of mass spectral trees. Commercially available software such as Mass Frontier is usually used to construct the mass spectral trees, with HRMS data as the stem of the tree, MS2 data as the bough, and multi-stage MSn data as the further branches. After MSn data of template compounds are imported into the software to build the MTSF library, similarity scores between the compounds detected in the sample and those in the library can be calculated, and unknown compounds can be classified and characterized by matching them to the templates to obtain candidate structures. Because the mass spectral trees of structural analogs usually give high similarity scores, MTSF is widely applied to the rapid identification of metabolites in complicated biological matrices, where metabolites structurally resemble their corresponding prototypes 113 . It also shows great potential to accelerate and simplify the discovery and identification of chemical components in TCM that share substructures with the template compounds, including flavones, phenylpropanoids, and sphingolipids in Saussurea involucrata 114 , chlorogenic acids in Duhaldea nervosa 115 and Lonicerae Japonicae Flos 116 , etc. In addition, for TCM prescriptions containing diverse types of compounds, MTSF can efficiently fish out potential components belonging to different compound classes; it was used to rapidly identify both template compounds and related compounds in Xiao-Xu-Ming decoction 117 , and prenylflavonoid glycosides, alkaloids and phenylpropanoids in Er-xian decoction 87 . In short, by taking full advantage of MSn information, MTSF is a simple and efficient technique for fishing out unknown components in complex TCM systems, especially structural analogs of the template compounds.
Molecular networking (MN) is a visualized computational strategy that compares MS/MS spectra and calculates the degree of similarity between structural analogs. The major strength of this technique is that it requires no prior knowledge of the chemical structures and allows simultaneous exploration of up to millions (probably billions) of MS2 spectra 118 . Having evolved from early MATLAB scripts run locally for similarity computation, Global Natural Products Social Molecular Networking (GNPS) (http://gnps.ucsd.edu), an open-access online platform, allows many more analysts with no professional bioinformatics background to take advantage of this advanced technique 119 . Data visualization can be performed either directly online, or offline with Cytoscape 120 and other visualization tools. Similar MS/MS spectra are grouped into clusters, in which each node represents a unique MS/MS spectrum and each edge indicates the cosine similarity between the two connected spectra. To help data interpretation, node size and color can be tuned according to precursor ion intensity and sample origin. Moreover, GNPS can automatically search the detected MS/MS spectra against public libraries, allowing rapid identification of molecules that are similar to database records. In addition to known compounds, MN also allows exploration and identification of potential unknown analogues or compound families based on the relatedness of their MS/MS spectra to those of known compounds. Compared with MTSF, MN is suitable for more LC−MS platforms and has attracted wider attention. Since the introduction of MN for metabolic profiling of live microbial colonies in 2012 121 , it has been successfully applied in multiple areas 122 , such as drug discovery and natural products dereplication 123 , covering microbes 124–127 , marine organisms 128–130 , fungi 131 , and plants 132–136 . In recent years, MN has also been used in TCM qualitative analysis.
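The network construction described above, with nodes for spectra and edges weighted by cosine similarity, can be sketched in a few lines of Python. This is a simplified illustration and not the GNPS implementation: it uses a plain cosine score with greedy one-to-one peak matching and no precursor-mass shift, and the m/z tolerance, the score threshold of 0.7, and the function names are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def cosine_similarity(spec_a: Sequence[Peak], spec_b: Sequence[Peak],
                      tol: float = 0.01) -> float:
    """Cosine score between two spectra, matching peaks within tol (Da)."""
    # Collect every candidate peak pair, then greedily accept the strongest first.
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                candidates.append((int_a * int_b, i, j))
    candidates.sort(reverse=True)

    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:  # each peak is matched at most once
            dot += product
            used_a.add(i)
            used_b.add(j)

    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(spectra: Dict[str, Sequence[Peak]], threshold: float = 0.7,
                tol: float = 0.01) -> List[Tuple[str, str, float]]:
    """Return (node, node, score) for every spectrum pair scoring at or above threshold."""
    names = list(spectra)
    edges = []
    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            score = cosine_similarity(spectra[a], spectra[b], tol)
            if score >= threshold:
                edges.append((a, b, round(score, 3)))
    return edges


# Hypothetical spectra: A and B share both fragments, C shares none.
spectra = {
    "A": [(100.0, 3.0), (200.0, 4.0)],
    "B": [(100.0, 3.0), (200.0, 4.0)],
    "C": [(150.0, 5.0)],
}
edges = build_edges(spectra)
print(edges)
```

The resulting edge list could be exported to a network viewer such as Cytoscape. The all-against-all comparison here scales quadratically with the number of spectra, so it is only suitable for small illustrative datasets.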
As one example in TCM analysis, offline two-dimensional separation integrated with MN was developed for dereplication and sorted characterization of 229 bufadienolides in Venenum Bufonis, including two new subclasses 137 . Rapid recognition of 537 components in Lonicerae Japonicae Flos, including a myriad of potential novel structures, was also achieved with the help of MN 36 . In addition, MN was applied to identify and explore potential chemical markers of Aconiti Lateralis Radix Praeparata before and after processing, based on both the compound clusters and the node areas 138 .

Several data post-processing techniques, such as DIF, NLF, MDF, MTSF and MN, have been used to facilitate data interpretation and enhance the identification accuracy of target compounds in complex samples. Nevertheless, faced with complex and large-scale mass spectral data, rapid recognition and extraction of potential characteristic ions remains a challenge. Therefore, to save the time and manpower needed to dig out compounds of interest, advanced statistical analysis (SA) techniques have been developed based on the intrinsic relationships between compound structures and MS data. Statistical methods classify targeted data with similar characteristics into the same groups, so that compounds with different substructures in TCM are efficiently discriminated. Combined with diagnostic tools such as loading plots and variable importance in projection (VIP), potential characteristic ions can be rapidly recognized. SA was explored for rapid screening of the minor compounds in Danhong injection 139 . Firstly, a total of 7157 product ions were extracted from the MS/MS spectra of 22 reference standards, and 4 filtering ions specific for compounds in Dan-shen, Hong-hua, and both herbs were selected from 195 common ions. Combined with another 6 diagnostic ions, 117 compounds were finally identified.
SA was also used for discovery and global profiling of novel compounds from Curcuma longa 140 . After targeted detection of 846 terpecurcumins by 12 NL/PIS, principal component analysis (PCA) was used to discriminate different structures and recognize potential novel compounds by clustering similar NL/PIS patterns and differentiating distinct ones. In another example, profiling of phenylethanoid glycosides from Magnolia officinalis 141 , recognition models based on partial least squares-discriminant analysis (PLS-DA) were established to discriminate both stereoisomers and positional isomers using MS2 data of the reference standards, and the product ions contributing most to isomer recognition were picked out according to the ranking of their VIP values. Finally, PCA was conducted to discriminate isomers in the samples and verify the selected discriminant ions.

Manual analysis of the massive and informative data generated by LC−MS is laborious, requires a tremendous amount of time and effort, and often gives non-reproducible results. Database matching (DM) can significantly improve efficiency through computer-aided extensive annotation of compounds in complex matrices, and is widely applied to identify a variety of compounds. DM is accomplished by comparing the exact mass and fragments of the detected compounds with the records in a database. These databases generally include chemical databases searched by accurate mass or molecular formula, and MS/MS spectral databases such as MassBank 142 , METLIN 143 , mzCloud 144 , HMDB, etc. Since the applications of open-access online databases in compound identification have already been described in several published reviews 145–148 , this review mainly focuses on the in-house databases used to identify TCM constituents.
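The matching step described above, comparing exact mass and fragments against database records, can be illustrated with a small in-house database held in memory. The following Python sketch makes several assumptions: the 5 ppm precursor tolerance, the 0.005 Da fragment tolerance, the ranking by number of matched fragments, and the database entries (`compound_A` and so on) are all hypothetical and do not correspond to any real library or vendor software.

```python
from typing import Dict, List, Sequence, Tuple


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_database(observed_mz: float, fragments: Sequence[float],
                   database: Sequence[Dict],
                   ppm_tol: float = 5.0,      # assumed precursor tolerance
                   frag_tol: float = 0.005    # assumed fragment tolerance, Da
                   ) -> List[Tuple[str, int]]:
    """Return (name, matched fragment count) for entries within ppm_tol, best first."""
    hits = []
    for entry in database:
        if abs(ppm_error(observed_mz, entry["mz"])) > ppm_tol:
            continue
        matched = sum(
            1 for ref in entry["fragments"]
            if any(abs(ref - f) <= frag_tol for f in fragments)
        )
        hits.append((entry["name"], matched))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical in-house database; names and masses are placeholders.
database = [
    {"name": "compound_A", "mz": 401.1000, "fragments": [239.05, 221.04]},
    {"name": "compound_B", "mz": 401.1010, "fragments": [150.02]},
    {"name": "compound_C", "mz": 455.2000, "fragments": [239.05]},
]

hits = match_database(401.1005, [239.0502, 221.0398, 100.0], database)
print(hits)
```

Here two entries fall within the precursor tolerance, and the fragment evidence ranks `compound_A` first. This shows why exact mass alone is rarely sufficient for near-isobaric or isomeric candidates, and why fragment information is compared as well.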
Intelligent software platforms from different manufacturers, including UNIFI 23,149–156 , Progenesis QI 157 , MetWorks 158 , Compound Discoverer 159 , SURIUS 160 , MassHunter 161 and PCDL Manager 162 , play an important part in reducing the labor of mining structural information from large-scale datasets by integrating various automated data processing steps. They usually provide access to various online databases and also allow creation of a customized database, facilitating automatic profiling and identification of chemical components in TCM and in more complex TCM formulas 155 . For example, to rapidly elucidate the structures of lanostane analogs and isomers in Poria cocos, LC−IM−MSE data were post-processed with the assistance of UNIFI software, a targeted compound database and a key MS database, resulting in the identification of 121 lanostanes 23 . To expand the screening coverage of novel compounds in the leaves of P. notoginseng, a predicted-metabolite screening approach was implemented in UNIFI using four modified groups and one or two steps of modification, which increased the theoretical coverage 14-fold; 945 ginsenosides were finally discovered, including 662 potential novel ginsenosides 154 . Another strategy was developed to systematically profile lipids in three congeneric Panax species by searching full MS and fragment information against HMDB and LIPID MAPS, and collision cross section (CCS) information against the Metabolic Profiling CCS Library (incorporated in Progenesis QI) and LipidCCS Predictor software 157 . In addition, several self-developed automatic data-processing tools can extend the chemical coverage of traditional databases by creating in-house databases, enhancing the efficiency and reproducibility of compound characterization in TCM.
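The predicted-metabolite screening idea mentioned above, in which known core structures are expanded by a small set of modifications applied in one or two steps, is essentially a combinatorial enumeration. The Python sketch below is an illustration under stated assumptions and is not the UNIFI implementation. The four modification groups chosen here (with their standard monoisotopic mass shifts), the core name and mass, and the function name are hypothetical; the cited study does not specify its groups in this passage.

```python
from itertools import combinations_with_replacement
from typing import Dict

# Assumed modification set: monoisotopic mass shifts in Da.
MODS: Dict[str, float] = {
    "malonyl": 86.0004,    # C3H2O3
    "acetyl": 42.0106,     # C2H2O
    "hexosyl": 162.0528,   # C6H10O5
    "pentosyl": 132.0423,  # C5H8O4
}


def expand_database(core_masses: Dict[str, float],
                    mods: Dict[str, float] = MODS,
                    max_steps: int = 2) -> Dict[str, float]:
    """Predict masses of cores carrying 1..max_steps modifications (order ignored)."""
    predicted = {}
    for name, mass in core_masses.items():
        for steps in range(1, max_steps + 1):
            for combo in combinations_with_replacement(sorted(mods), steps):
                label = name + "+" + "+".join(combo)
                predicted[label] = round(mass + sum(mods[m] for m in combo), 4)
    return predicted


# Hypothetical core with a placeholder mass.
predicted = expand_database({"core": 800.0})
print(len(predicted))
```

With four groups and one or two steps, each core yields 4 + 10 = 14 predicted entries when the order of modification is ignored. Under that assumption, the count is consistent with the 14-fold increase in theoretical coverage reported for the UNIFI approach, although the original enumeration rules may differ.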
Among such self-developed tools, FlavonQ was developed to characterize flavone and flavonol glycosides by calculating and combining the common substituent groups of the core structures. The results obtained with FlavonQ were consistent with those determined conventionally by an experienced chemist, while the data interpretation process in flavonoid research was greatly facilitated 163 . Another custom-built software tool, PlantMAT (Plant Metabolite Annotation Toolbox), was developed to identify saponins and glycosylated flavonoids by running through various glycosyl and acyl groups in every possible combination 164 .

In most cases, a single analytical method is insufficient for comprehensive characterization of the complex constituents of TCM, since each technique has its own strengths and limitations. Development of integrated analytical strategies is therefore necessary for rapid, accurate and systematic chemical analysis of TCM. Integrated strategies can be built by combining different MS instruments, or by combining various data acquisition and/or data post-processing techniques, laying a solid foundation for interpreting the chemical diversity of TCM.

High-resolution mass spectrometers (such as QTOF) with accurate mass measurement of ions are advantageous for untargeted qualitative analysis of TCM. However, extraction of the corresponding precursor and product ions from the total ion chromatograms (TICs) can be time-consuming and labor-intensive. In contrast, QTRAP-MS shows its advantages in sensitive detection of target compounds, especially co-eluted minor components in TCM, by using multiple targeted scanning modes such as MRM, and an EPI scan can also be triggered to acquire fragmentation of the preferred precursor ions. Therefore, QTRAP can act as a complementary tool to QTOF despite its low resolution.
The combined MS/MS functions of QTOF and QTRAP have been applied to systematic characterization and targeted identification of triterpenes in Alismatis Rhizoma and the corresponding processed products 61 , curcuminoids in turmeric 63 , naphthoquinones in Juglans cathayensis Dode 53 , as well as minor components in TCM formulas such as Baoyuan decoction 155 .

The integrated use of different DDA techniques is also powerful for identifying unknown trace compounds in TCM. QTRAP, for example, possesses the scan capacities of both QqQ and linear ion trap (LIT), involving MIM, MRM, NLS, PIS, enhanced scans such as EPI and enhanced mass spectrum (EMS), as well as combined functions such as PIS-IDA-EPI. Through the combined use of MIM-IDA-EPI and PIS-IDA-EPI modes, 41 coumarins were identified from Radix Glehniae 54 . In addition, integration of different triggered IDA-EPI scans, including those triggered by EMS, NLS, MRM, and PIS scans, led to the detection of 513 components from Cistanche deserticola and C. tubulosa 55 . In the chemical profiling of Baoyuan decoction, stepped MIM-EPI and predefined MRM-EPI data acquisition methods were established to pick out potential saponins based on the MS behaviors of authentic compounds, while MIM-EPI and PIS-EPI scanning modes were adapted to comprehensively recognize flavonoids based on various diagnostic product ions, so that co-eluted chromatographic peaks were easily distinguished and extracted 155 .

Compared with individual filtering techniques, integration of different data post-processing techniques also exhibits higher efficiency in classifying multiple types of compounds, excluding interfering ions, and detecting minor or trace components in TCM.
This is illustrated by a strategy that integrates different filtering techniques, including MDF and nitrogen rule filtering (NRF) based on MS information, and DIF and NLF based on MS/MS information, to identify two types of components in Artemisiae Argyi Folium, namely chlorogenic acids and methoxylated flavonoids 82 . The results showed that no single technique could totally eliminate the interfering ions, whereas the integration of MS- and MS/MS-based filtering techniques exerted a complementary effect, removing more interfering ions and exposing the target compounds. In another example, for rapid identification of the chemical constituents of Yindan Xinnaotong soft capsule, MDF and DIF were applied to preliminarily classify the detected compounds into different chemical families, while NLF was used to characterize flavonoid glycosides and triterpene saponins substituted with certain sugar units 109 .

Intelligent DDA techniques allow targeted acquisition of MS/MS or MSn data, improving the sensitivity and selectivity of detecting compounds of interest, while various data post-processing techniques can further simplify data interpretation. Therefore, combined approaches that integrate targeted DDA with various data post-processing techniques have great advantages in the detection and identification of target components in TCM. A strategy based on LTQ-Orbitrap following offline two-dimensional LC separation was established to systematically analyze five botanical origins of Uncariae Ramulus Cum Uncis, and led to the ultimate characterization of 1227 indole alkaloids 84 . In the data acquisition process, a theoretical step-wise PIL (mass range: 310–950 Da; step size: 2 Da) with an optimal parent mass width (±55 mDa) was defined to selectively trigger fragmentation of potential alkaloids; this is the step-wise PIL-based raster-MDF scan method.
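The step-wise PIL described above can be pictured as a regular grid of narrow precursor windows. The Python sketch below generates such a list from the parameters given in the text (310 to 950 Da, 2 Da steps, ±55 mDa width). How the fractional mass of each list entry was chosen in the cited study is not stated here, so the linear `defect_fn` used to place each window is purely an assumption, as are the function names.

```python
from typing import Callable, List, Tuple

Window = Tuple[float, float, float]  # (lower bound, center, upper bound) in Da


def stepwise_pil(start: int = 310, stop: int = 950, step: int = 2,
                 half_width: float = 0.055,
                 defect_fn: Callable[[int], float] = lambda m: 0.0005 * m
                 ) -> List[Window]:
    """Build a step-wise precursor ion list.

    defect_fn maps a nominal mass to an expected mass defect in Da; the linear
    default is a hypothetical placeholder, not a value from the cited study.
    """
    entries = []
    for nominal in range(start, stop + 1, step):
        center = nominal + defect_fn(nominal)
        entries.append((round(center - half_width, 4),
                        round(center, 4),
                        round(center + half_width, 4)))
    return entries


def triggers(mz: float, pil: List[Window]) -> bool:
    """True if a precursor m/z falls inside any PIL window (would trigger MS/MS)."""
    return any(low <= mz <= high for low, _, high in pil)


pil = stepwise_pil()
print(len(pil), pil[0], pil[-1])
```

With these parameters the list contains 321 windows. Under the assumed defect function, a precursor at m/z 500.25 would trigger fragmentation, while one at m/z 500.40 would not, which illustrates how a narrow parent mass width acts as a mass defect filter at the acquisition stage.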
In the same study, several data post-processing techniques were also combined to facilitate structural elucidation, including elemental composition analysis to remove false positives, DIF for subtype classification, and DM and fragment annotation for structure characterization. In another example, 537 components were characterized in Lonicerae Japonicae Flos, including an enormous number of new structures 36 . To reduce redundant scans and increase selectivity, an EL was first generated by a two-step polygonal MDF and then used to trigger data-dependent MS/MS scans. To rapidly annotate the known compounds and elucidate the unreported ones, both DM and MN were adopted for data interpretation.

All the above-mentioned applications of LC−MS in TCM qualitative analysis are summarized in Table 1 . The selection criteria for each data acquisition and data post-processing technique are listed in Table 2 , including their advantages, limitations, and scope of application, providing a valuable reference for future research on the characterization and identification of chemical constituents in TCM. In addition, a decision tree in Fig. 2 explains how to select appropriate data acquisition and post-processing methods based on the known chemical information, which may help readers adopt the above methods in practice.

Despite the continuous development of advanced data acquisition and post-processing techniques for TCM qualitative analysis, substantial challenges remain before the ultimate goal of comprehensive chemical profiling and annotation can be achieved. Firstly, it is virtually impossible to detect all components simultaneously in one experiment, owing to the selectivity and priority of the analytical system.
Therefore, arrays of separation techniques that differ from conventional reversed-phase LC have been developed and combined in either on-line or off-line mode 165 , including hydrophilic interaction liquid chromatography 36,152,154 , supercritical fluid chromatography 137 , etc. In addition, IM 23,106,157 has been adopted to improve MS performance. Secondly, to make full use of typical DIA methods, which feature minimal loss of MS information and good reproducibility, more sophisticated deconvolution and data processing techniques should be developed to interpret DIA data. Thirdly, manual annotation and interpretation of MS data remains the dominant approach, and it is labor-intensive and somewhat subjective. Databases of MS spectra of TCM ingredients combined with intelligent software may be an option. Although database searching is expected to play a pivotal role in matching known compounds, comprehensive TCM databases of known compounds are still in serious shortage, let alone databases covering potentially new structures. Therefore, de novo annotation software, especially software that takes biosynthetic pathways into account, may help increase identification credibility. Fourthly, ambiguous differentiation and identification of isomers hinders further use of the acquired data. Common isomer identification strategies involve two aspects. One is associated with liquid chromatographic separation, by comparing the difference of … 172 values and optimal CE (OCE) values, which correspond to the dissociation of chemical bonds, were proved to be helpful in isomeric differentiation 173, 174 . In addition, some cutting-edge instrumentation such as LC–infrared ion spectroscopy MS has been introduced and applied to distinguish closely related isomers 175–177 .

In summary, based on the recent developments and bottlenecks of LC−MS in TCM qualitative analysis, the following suggestions are tentatively proposed for future research.
Firstly, the bioactive components of TCM are usually secondary metabolites that share similar substructures owing to common biosynthetic pathways. To achieve targeted characterization of different classes of compounds, more efficient and specific data acquisition and post-processing techniques based on their chemical characteristics, MS fragmentation patterns and other properties are urgently needed. Next, by drawing on big data analytics, rapid developments in computer science, and mathematical statistics methods, exploration of high-throughput, high-coverage and automatic data analysis strategies becomes possible. Computer- and software-aided automatic data processing workflows will dramatically improve the efficiency of MS data interpretation for complex TCM systems. Last but not least, integrated analytical strategies that combine the advantages of various MS instruments, data acquisition techniques and data post-processing techniques are helpful for comprehensive and systematic analysis of the chemical components of TCM.

TCM is gaining more and more global attention because of its effectiveness in preventing and treating human diseases. LC−MS is one of the most powerful tools for TCM qualitative analysis. In this review, the basic principles and applications of a variety of advanced LC−MS-based data acquisition and post-processing techniques were summarized and illustrated for rapid and efficient characterization and identification of various chemical constituents in TCM. The data acquisition techniques include PIL, DE and mass tag based on MS information, and PIS, NLS and MRM based on MS/MS information; the data post-processing techniques include MDF, BS, IPF, DIF, NLF, MTSF, MN, SA and DM. Each is indispensable and has its own unique strengths, so integrated analytical strategies and their applications in TCM qualitative analysis were also briefly introduced.
Finally, the obstacles, possible solutions and future trends of TCM analysis were discussed, laying foundations for future research on the chemical basis of TCM.

Analysis on herbal medicines utilized for treatment of COVID-19
Natural products in drug discovery
Molecular networks for the study of TCM pharmacology
Recent developments in qualitative and quantitative analysis of phytochemical constituents and their metabolites using liquid chromatography−mass spectrometry
Recent development in liquid chromatography stationary phases for separation of Traditional Chinese Medicine components
Recent developments in liquid chromatography−mass spectrometry and related techniques
Data acquisition and data mining techniques for metabolite identification using LC coupled to high-resolution MS
A classification of liquid chromatography mass spectrometry techniques for evaluation of chemical composition and quality control of traditional medicines
DIA mass spectrometry
Development of data-independent acquisition workflows for metabolomic analysis on a quadrupole-orbitrap platform
Phytochemical analysis of traditional Chinese medicine using liquid chromatography coupled with mass spectrometry
Applications of HPLC/MS in the analysis of traditional Chinese medicines
Recent advances on HPLC/MS in medicinal plant analysis
Recent advances on HPLC/MS in medicinal plant analysis - an update covering
A survey of orbitrap all ion fragmentation analysis assessed by an R MetaboList package to study small-molecule metabolites
MSE with mass defect filtering for in vitro and in vivo metabolite identification
Lipid profiling of complex biological mixtures by liquid chromatography/mass spectrometry using a novel scanning quadrupole data-independent acquisition strategy
Comparison of information-dependent acquisition, SWATH, and MSAll techniques in metabolite identification study employing ultrahigh-performance liquid chromatography−quadrupole time-of-flight mass spectrometry
Data-independent acquisition for the quantification and identification of metabolites in plasma
A four-step filtering strategy based on ultra-high-performance liquid chromatography coupled to quadrupole-time-of-flight tandem mass spectrometry for comprehensive profiling the major chemical constituents of Akebiae Fructus
Ultrasonic/microwave assisted extraction and diagnostic ion filtering strategy by liquid chromatography-quadrupole time-of-flight mass spectrometry for rapid characterization of flavonoids in Spatholobus suberectus
A targeted strategy to analyze untargeted mass spectral data: rapid chemical profiling of Scutellaria baicalensis using ultra-high performance liquid chromatography coupled with hybrid quadrupole orbitrap mass spectrometry and key ion filtering
Structure-oriented UHPLC−LTQ Orbitrap-based approach as a dereplication strategy for the identification of isoflavonoids from Amphimas pterocarpoides crude extract
Diagnostic ion filtering to characterize ginseng saponins by rapid liquid chromatography with time-of-flight mass spectrometry
Integrated evaluation of malonyl ginsenosides, amino acids and polysaccharides in fresh and processed ginseng
Systematic identification and quantification of tetracyclic monoterpenoid oxindole alkaloids in Uncaria rhynchophylla and their fragmentations in Q-TOF-MS spectra
Discovery and characterisation of lycorine-type alkaloids in Lycoris spp. (Amaryllidaceae) using UHPLC−QTOF−MS
The rapid discovery and identification of physalins in the calyx of Physalis alkekengi L.var.franchetii (Mast.) Makino using ultra-high performance liquid chromatography-quadrupole time of flight tandem mass spectrometry together with a novel three-step data mining strategy
Characterization of the multiple chemical components of Glechomae Herba using ultra high performance liquid chromatography coupled to quadrupole-time-of-flight tandem mass spectrometry with diagnostic ion filtering strategy
Global metabolite profiling and diagnostic ion filtering strategy by LC−QTOF MS for rapid identification of raw and processed pieces of Rheum palmatum L
Diagnostic filtering to screen polycyclic polyprenylated acylphloroglucinols from Garcinia oblongifolia by ultrahigh performance liquid chromatography coupled with ion mobility quadrupole time-of-flight mass spectrometry
Diagnostic fragment-ion-based extension strategy for rapid screening and identification of serial components of homologous families contained in traditional Chinese medicine prescription using high-resolution LC−ESI−IT−TOF/MS: Shengmai injection as an example
Diagnostic ion filtering strategy for chemical characterization of Guge Fengtong Tablet with high-performance liquid chromatography coupled with electrospray ionization quadrupole time-of-flight tandem mass spectrometry
Comprehensive chemical profiling of Yindan Xinnaotong soft capsule and its neuroprotective activity evaluation in vitro
Nontargeted modification-specific metabolomics study based on liquid chromatography−high-resolution mass spectrometry
Post-acquisition analysis of untargeted accurate mass quadrupole time-of-flight MSE data for multiple collision-induced neutral losses and fragment ions of glutathione conjugates
Neutral fragment filtering for rapid identification of new diester-diterpenoid alkaloids in roots of Aconitum carmichaeli by ultra-high-pressure liquid chromatography coupled with linear ion trap-orbitrap mass spectrometry
A new strategy for the discovery of epimedium metabolites using high-performance liquid chromatography with high resolution mass spectrometry
Identification of the chemical components of Saussurea involucrata by high-resolution mass spectrometry and the mass spectral trees similarity filter technique
Rapid characterization of chlorogenic acids in Duhaldea nervosa based on ultra-high-performance liquid chromatography-linear trap quadropole-Orbitrap-mass spectrometry and mass spectral trees similarity filter technique
A strategy for comprehensive identification of sequential constituents using ultrahigh-performance liquid chromatography coupled with linear ion trap-Orbitrap mass spectrometer, application study on chlorogenic acids in Flos Lonicerae Japonicae
Rapid discovery and identification of 68 compounds in the active fraction from Xiao-Xu-Ming decoction (XXMD) by HPLC–HRMS and MTSF technique
Mass spectral similarity for untargeted metabolomics data analysis of complex mixtures
Sharing and community curation of mass spectrometry data with global natural products social molecular networking
Cytoscape: a software environment for integrated models of biomolecular interaction networks
Mass spectral molecular networking of living microbial colonies
Molecular networking as a drug discovery, drug metabolism, and precision medicine strategy
Integration of molecular networking and in-silico MS/MS fragmentation for natural products dereplication
Molecular networking as a dereplication strategy
Prioritizing natural product diversity in a collection of 146 bacterial strains based on growth and extraction protocols
Using molecular networking for microbial secondary metabolite bioprospecting
The tripod for bacterial natural product discovery: genome mining, silent pathway induction, and mass spectrometry-based molecular networking
Molecular networking prospection and characterization of terpenoids and C15-acetogenins in Brazilian seaweed extracts
Molecular networking-based analysis of cytotoxic saponins from sea cucumber Holothuria atra
MS/MS-based molecular networking approach for the detection of aplysiatoxin-related compounds in environmental marine cyanobacteria
Integrating molecular networking and 1H NMR to target the isolation of chrysogeamides from a library of marine-derived Penicillium fungi
Searching for original natural products by molecular networking: detection, isolation and total synthesis of chloroaustralasines
Integration of biochemometrics and molecular networking to identify antimicrobials in Angelica keiskei
Application of metabolomics and molecular networking in investigating the chemical profile and antitrypanosomal activity of British bluebells (Hyacinthoides non-scripta)
Dereplication of flavonoid glycoconjugates from Adenocalymma imperatoris-maximilianii by untargeted tandem mass spectrometry-based molecular networking
Acetylcholinesterase-inhibitory activity of Iranian plants: combined HPLC/bioassay-guided fractionation, molecular networking and docking strategies for the dereplication of active compounds
A high-efficiency strategy integrating offline two-dimensional separation and data post-processing with dereplication: characterization of bufadienolides in Venenum Bufonis as a case study
A comprehensive quality evaluation of Fuzi and its processed product through integration of UPLC−QTOF/MS combined MS/MS-based mass spectral molecular networking with multivariate statistical analysis and HPLC−MS/MS
Rapid quantitation and identification of the chemical constituents in Danhong Injection by liquid chromatography coupled with orbitrap mass spectrometry
Global profiling and novel structure discovery using multiple neutral loss/precursor ion scanning combined with substructure recognition and statistical analysis (MNPSS): characterization of terpene-conjugated curcuminoids in Curcuma longa as a case study
Profiling and isomer recognition of phenylethanoid glycosides from Magnolia officinalis based on diagnostic/holistic fragment ions analysis coupled with chemometrics
MassBank: a public repository for sharing mass spectral
data for life sciences METLIN: a metabolite mass spectral database Development of a sensitive untargeted liquid chromatography-high resolution mass spectrometry screening devoted to hair analysis through a shared MS2 spectra database: a step toward early detection of new psychoactive substances In-silico studies in Chinese herbal medicines' research: evaluation of in-silico methodologies and phytochemical data sources, and a review of research to date Software tools and approaches for compound identification of LCÀMS/MS data in metabolomics Differentiating signals to make biological senseda guide through databases for MS-based non-targeted metabolomics Open-access metabolomics databases for natural product research: present capabilities and future potential Rapid characterization of Ziziphi Spinosae Semen by UPLC/Qtof MS with novel informatics platform and its application in evaluation of two seeds from Ziziphus species Identification of chemical ingredients of peanut stems and leaves extracts using UPLCÀQTOFÀMS coupled with novel informatics UNIFI platform A multidimensional analytical approach based on time-decoupled online comprehensive two-dimensional liquid chromatography coupled with ion mobility quadrupole time-of-flight mass spectrometry for the analysis of ginsenosides from white and red ginsengs A green protocol for efficient discovery of novel natural compounds: characterization of new ginsenosides from the stems and leaves of Panax ginseng as a case study Nontargeted metabolomic analysis and "commercial-homophyletic" comparison-induced biomarkers verification for the systematic chemical differentiation of five different parts of Panax ginseng Global profiling combined with predicted metabolites screening for discovery of natural compounds: characterization of ginsenosides in the leaves of Panax notoginseng as a case study An integrated strategy for global qualitative and quantitative profiling of traditional Chinese medicine formulas: Baoyuan Decoction 
as a case Structural characterization and discrimination of the Paris polyphylla var. yunnanensis and Paris vietnamensis based on metabolite profiling analysis Systematic profiling and comparison of the lipidomes from Panax ginseng, P. quinquefolius, and P. notoginseng by ultrahigh performance supercritical fluid chromatography/high-resolution mass spectrometry and ion mobility-derived collision cross section measurement LTQ-Orbitrap-based strategy for traditional Chinese medicine targeted class discovery, identification and herbomics research: a case study on phenylethanoid glycosides in three different species of Herba Cistanches Identifying potential anti-COVID-19 pharmacological components of traditional Chinese medicine Lianhuaqingwen capsule based on human exposure and ACE2 biochromatography screening Rapid identification and isolation of neuraminidase inhibitors from mockstrawberry (Duchesnea indica Andr.) based on ligand fishing combined with HR-ESI-Q-TOF-MS Identification of oxygenated fatty acid as a side chain of lipo-alkaloids in Aconitum carmichaelii by UHPLCeQ-TOF-MS and a database Comprehensive identification and structural characterization of target components from Gelsemium elegans by high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry based on accurate mass databases combined with MS/MS spectra FlavonQ: an automated data processing tool for profiling flavone and flavonol glycosides with ultra-highperformance liquid chromatography-diode array detection-high resolution accurate mass-mass spectrometry PlantMAT: a metabolomics tool for predicting the specialized metabolic potential of a system and for large-scale metabolite identifications Application of two-dimensional liquid chromatography in the separation of traditional Chinese medicine Critical evaluation of a simple retention time predictor based on LogKow as a complementary tool in the identification of emerging contaminants in water Prediction of 
liquid chromatographic retention for differentiation of structural isomers A strategy to improve the identification reliability of the chemical constituents by high-resolution mass spectrometry-based isomer structure prediction combined with a quantitative structure retention relationship analysis: phthalide compounds in Chuanxiong as a test case Kernel-based, partial least squares quantitative structure-retention relationship model for UPLC retention time prediction: a useful tool for metabolite identification Artificial neural network modelling of pharmaceutical residue retention times in wastewater extracts using gradient liquid chromatography-high resolution mass spectrometry data A comparison of three liquid chromatography (LC) retention time prediction models Chiral differentiation of the noscapine and hydrastine stereoisomers by electrospray ionization tandem mass spectrometry Integrated work-flow for quantitative metabolome profiling of plants, Peucedani Radix as a case Retention time and optimal collision energy advance structural annotation relied on LCÀMS/MS: an application in metabolite identification of an antidementia agent namely echinacoside Infrared ion spectroscopy in a modified quadrupole ion trap mass spectrometer at the FELIX free electron laser laboratory Mass-spectrometry-based identification of synthetic drug isomers using infrared ion spectroscopy Infrared ion spectroscopy: new opportunities for small-molecule identification in mass spectrometryda tutorial perspective Yang Yu wrote the paper. Changliang Yao edited the paper. De-an Guo conceived the review topic and edited the paper. The authors declare no conflicts of interest.