key: cord-0938836-c4zyorha
authors: Patel, Seema
title: Pathogenicity-associated protein domains: The fiercely-conserved evolutionary signatures
date: 2017-06-30
journal: Gene Reports
DOI: 10.1016/j.genrep.2017.04.004
sha: 15a29eca657f8adea9fab32ac036fb1fd8d4bddf
doc_id: 938836
cord_uid: c4zyorha

Abstract

Proteins have highly conserved domains that determine their functionality. Out of the thousands of domains discovered so far across all living forms, some of the predominant clinically-relevant domains include IENR1, HNHc, HELICc, Pro-kuma_activ, Tryp_SPc, Lactamase_B, PbH1, ChtBD3, CBM49, acidPPc, G3P_acyltransf, RPOL8c, KbaA, HAMP, HisKA, Hr1, Dak2, APC2, Citrate_ly_lig, DALR, VKc, YARHG, WR1, PWI, ZnF_BED, TUDOR, MHC_II_beta, Integrin_B_tail, Excalibur, DISIN, Cadherin, ACTIN, PROF, Robl_LC7, MIT, Kelch, GAS2, B41, Cyclin_C, Connexin_CCC, OmpH, Bac_rhodopsin, AAA, Knot1, NH, Galanin, IB, Elicitin, ACTH, Cache_2, CHASE, AgrB, PRP, IGR, and Antimicrobial21. These domains are distributed in nucleases/helicases, proteases, esterases, lipases, glycosylase, GTPases, phosphatases, methyltransferases, acyltransferase, acetyltransferase, polymerase, kinase, ligase, synthetase, oxidoreductase, protease inhibitors, nucleic acid binding proteins, adhesion and immunity-related proteins, cytoskeletal component-manipulating proteins, lipid biosynthesis and metabolism proteins, membrane-associated proteins, hormone-like and signaling proteins, etc. These domains are ubiquitous stretches or folds of the proteins in pathogens and allergens. Pathogenesis alleviation efforts can benefit enormously if the characteristics of these domains are known. Hence, this review catalogs and discusses the role of such pivotal domains, suggesting hypotheses for better understanding of pathogenesis at molecular level.

Proteins are prone to stress-driven modifications in their primary sequences (Marchler-Bauer et al., 2014).
Due to the accumulated heterogeneity, homology analysis tools like the Basic Local Alignment Search Tool (BLAST) assign low identity scores even to closely-related proteins (Pearson, 2013). Also, in vitro conditions prevent the elaboration of some pathogenically-critical proteins, which misguides interpretation. Such experimentally-absent proteins are termed 'hypothetical proteins' and are dismissed as unimportant from lists of potential drug target candidates. To overcome these lacunae, an innovative way of analysis is of paramount importance. Proteins have strictly conserved motifs, which can reveal their phylogenetic links, diversification paths and functions (Marchler-Bauer et al., 2014). These conserved sites or folds are domains (Marchler-Bauer et al., 2014). Protein databases vary in the number and notation of the domains. The SMART (Simple Modular Architecture Research Tool) database catalogs more than a thousand protein domains (Ponting et al., 1999). These domains belong to different categories of protein. In silico analysis of several viral, bacterial and allergen (of animal and plant origin) proteins has identified frequently-occurring domains. A majority of these pathogenesis-mediating domains are shared between pathogens and allergens. It is important to understand these crucial domains that facilitate the establishment of pathogenesis, as they are potential druggable targets. This review discusses some of the protein domains that manipulate human components and lead to morbidity and lethality. The frequently-appearing domains can be clustered into different categories for ease of understanding, though the boundaries are not crisp and often overlap. Many proteins are known to be modular, containing domains that belong to more than one of the six enzyme classes, namely hydrolases, transferases, lyases, oxidoreductases, isomerases and ligases (Cai and Chou, 2005).
Lactamase_B is a domain in metal-dependent hydrolases, which include proteins like β-lactamases, thiolesterases, the glyoxalase II family, glutathione hydrolase, and competence proteins (Bradford, 2001). Most Lactamase_B-containing proteins bind two zinc ions as cofactors and resist β-lactam antibiotics. A majority of the proteins containing this domain in bacteria are hypothetical proteins (van Tonder et al., 2014). The lack of experimental evidence for Lactamase_B-containing proteins might be caused by the lack of the right conditions in vitro. The DDHD domain (named after its four conserved residues) has conserved Asp and His residues, modification of which leads to loss of phospholipase and membrane trafficking activity (Inoue et al., 2012). The DDHD domain-containing phospholipase A1 family of proteins is required for organelle biogenesis and brain functioning (Inoue et al., 2012; Yamashita et al., 2010). Mutation in this motif has been associated with hereditary spastic paraplegia, a neural disease of slowly progressive weakness (Gonzalez et al., 2013). The acidPPc (acid phosphatase) domain is present in phosphatidate phosphatase, a critical enzyme that acts on phosphate monoesters, liberating diacylglycerol and inorganic phosphate (Carman and Han, 2006). This domain has been detected in HCV and virulent strains of dengue virus. Perturbation of this enzyme in humans has been associated with diseases like prostate cancer and osteoporosis, among other pathologies (Araujo and Vihko, 2013). The UDG (uracil DNA glycosylase) domain occurs in proteins of the uracil DNA glycosylase superfamily. This enzyme removes any uracil (generated by deamination of cytosine) from DNA, averting mutations and aberrations in information pathways (Lucas-Lledó et al., 2011). A study on Vaccinia virus reveals that uracil DNA glycosylase is crucial for viral DNA replication (De Silva and Moss, 2003).
Also, in silico analysis discovered the UDG domain in hepatitis C virus (HCV). PbH1 (parallel beta-helix repeats) are motifs present in many carbohydrate-lysing enzymes such as pectate lyases and rhamnogalacturonases (Heffron et al., 1998). These domains are present in SHCBP1 (centralspindlin complex, made of motor protein MKLP1 and GTPase-activating protein MgcRacGAP), involved in cytokinesis initiation (Asano et al., 2014; Pavicic-Kaltenbrunner et al., 2007). They are also present in polyductin proteins, a defect in which causes autosomal recessive polycystic kidney disease (PKD) (Onuchic et al., 2002). These repeats are abundant in the exopolygalacturonase allergens of Platanus acerifolia (London planetree) and in pectate lyase 1 of Juniperus ashei (Ashe juniper). These leucine-rich repeats (LRR) are present in highly N-glycosylated proteins and are involved in carbohydrate moiety recognition and/or modification (Heffron et al., 1998). The pkhd1 (polycystic kidney and hepatic disease 1) gene product polyductin, associated with kidney disease (Igarashi, 2002; Menezes and Onuchic, 2006) and congenital hepatic fibrosis (Gunay-Aygun et al., 2010), contains these repeats. ChtBD3 (chitin-binding domain type 3) has been associated with host pathogenesis (Tran et al., 2011). ChtBD3 is present in some Ebola virus strains (such as some isolates of the Mayinga-76 outbreak and isolate A0A0F7IMH5 from the Liberia-14 outbreak) as well as in dengue virus serotype 3 strains (Messina et al., 2014). A number of pathogenic bacteria, including Vibrio cholerae, elaborate the enzyme chitin oligosaccharide deacetylase, which contains a ChtBD3 domain. The immense role of this domain in virulence is well-substantiated. CBM49 is a carbohydrate binding module (CBM), found at the C terminus of cellulases (Guillén et al., 2010; Shoseyov et al., 2006). The binding of CBM domains to complex glycans has been linked to pathogenesis.
Some dengue virus serotype 2 isolates such as P14337 and Q9WDA6 contain a CBM49, whereas isolate Q9WDA6 contains a CBM25 (a starch binding domain found in bacterial amylases). RAB is a domain in the Rab subfamily of small guanosine triphosphatases (GTPases) (Diekmann et al., 2011). These proteins have a wide and tissue-specific distribution and take part in vesicle trafficking across membranes to their destined targets. These GTPases interact with numerous other components like sorting adaptors, tethering factors, kinases and phosphatases for proper vesicular transport, a defect in which can lead to immunodeficiencies, inflammation, neural pathologies and cancers (Stenmark, 2009). RUN is an N-terminal domain present in proteins that crosstalk with Ras-like GTPases (especially Rap and Rab family members), and thus plays a role in signaling pathways (Callebaut et al., 2001; Terawaki et al., 2015). The proteins harboring this domain regulate cytoskeletal organization, autophagy, endocytosis, and endosomal maturation, functions that clearly indicate a role in pathogenesis. Further, this domain is often associated with the DUF4206 domain (Callebaut et al., 2001; Patel and Côté, 2013). DUFs (domains of unknown function), as their name suggests, are heavily-modified domains with poor annotation (Goodacre et al., 2014). Tubulin is a domain in tubulin proteins belonging to the GTPase family, playing a role in polymer formation (Prigozhina et al., 2001). Tubulin proteins harbor immense heterogeneity at their C-terminal end (Redeker et al., 1992). Bacteria have a tubulin homolog, known as FtsZ (filamentous temperature-sensitive protein Z), that plays a role in cell division. FtsZ is recruited to the membrane by the actin-related protein FtsA, and together the two proteins form the Z ring, initiating bacterial cytokinesis (Loose and Mitchison, 2014).
EFh (EF-hand) are Ca2+-binding α-helical domains of Miro GTPases, the Ca2+ sensors that maintain mitochondrial homeostasis (Suzuki et al., 2014). Trematode tegument proteins have this domain, which has been shown to have Ig (immunoglobulin)-binding properties. ARF (ADP-ribosylation factor) domains are present in GTPases (like Ras) and their homologues. This domain is involved in post-Golgi vesicular transport (Boman et al., 2002). The tyrosine kinase Pyk2 regulates Arf1 gene activity through the protein ASAP1 (Arf GTPase-activating protein) (Inoue et al., 2008; Kruljac-Letunic et al., 2003). PreSET is the N-terminal part of the cysteine-rich, Zn2+-binding SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domains in histone lysine methyltransferases (HMTase) (Binda et al., 2010; Dillon et al., 2005). The PreSET domain has been detected in plant pollen allergens (such as Lig v and Bet v). The G3P_acyltransf domain is present in glycerol-3-phosphate acyltransferase, a rate-limiting enzyme for triacylglycerol biosynthesis (Wendel et al., 2009). This enzyme is required for the immune response of the host, as observed in Coxsackievirus infection of mice (Karlsson et al., 2009). Some aggressive viral pathogens like dengue serotype 2 isolates lack it, though other serotypes harbor it. CAT (chloramphenicol acetyltransferase) is a trimeric domain that exists in chloramphenicol acetyltransferase, a bacterial enzyme that can metabolize the antibiotic chloramphenicol, leading to drug resistance (Biswas et al., 2012; Yao et al., 1999). The CTD (C-terminal domain) of RNA polymerase II plays a role in pre-mRNA processing, including splicing, via phosphorylation (at a Tyr residue) (Millhouse and Manley, 2005). On the other hand, transcription termination can occur by dephosphorylation of the Tyr residue at the CTD (Schreieck et al., 2014). The CTD crosstalks with the transcript elongation factor complex SPT4/SPT5 to regulate transcription (Dürr et al., 2014).
RPOL8c is a subunit of RNA polymerase I, II and III, with a role in transcription (Cramer et al., 2001) and microRNA gene regulation (Wang et al., 2010). KbaA is a key domain in the protein for KinB-signaling pathway activation in sporulation. KinA kinase regulates sporulation initiation in Bacillus subtilis by controlling phosphate supply to the phosphorelay system (Dartois et al., 1996). Going by the literature, this domain in dengue virus might be involved in signaling pathways as well. Though most of these viruses contain this domain, it is missing in one isolate, P27909. The TPK_B1_binding domain of thiamine pyrophosphokinase binds to vitamin B1 as the enzyme transfers a pyrophosphate group from ATP to vitamin B1 to form the coenzyme thiamin pyrophosphate (Baker et al., 2001). This coenzyme is required for the functionality of cytosolic transketolase and of the mitochondrial enzymes for oxidative decarboxylation of pyruvate, α-ketoglutarate or branched chain amino acids (Mayr et al., 2011). TyrKc is the catalytic domain of the Tyr-specific kinase subfamily, a group of cell surface receptors. This domain often co-occurs with FN3 (fibronectin type-III), IG (immunoglobulin) and Igc2 (immunoglobulin C-2 type) domains (Bernsel and Von Heijne, 2005). The UBA (ubiquitin associated) domain is present in the C terminus of proteins like p62, BMSC-UbP, HHR23A, Rad23 and SNF1-like kinases, and plays a role in inter- and intramolecular communication (Chang et al., 2006; Raasi et al., 2004). These proteins bind to ubiquitin, which mediates proteasome complex degradation and optimal protein level retention in cells (Su and Lau, 2009). HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) are approximately 55 aa-long domains present in the proteins coded by transducing genes (Kishii et al., 2007).
This domain, containing helices and coiled-coil regions, often undergoes conformational changes, relaying signals for chemotaxis, pathogenesis, and biofilm formation (Airola et al., 2013, 2010; Hulko et al., 2006; Matamouros et al., 2015). The HWE_HK domain is present in HWE-type histidine kinases, known to mediate environmental signaling (Galperin, 2005; Karniol and Vierstra, 2004; Lavín et al., 2007). HisKA is a crucial sensor kinase in pathogenic bacteria, including the plant pathogen Pseudomonas syringae (Willett and Kirby, 2012). It has been detected in the pathogenic Ebola virus as well. The Hr1 (homology region 1) domain is the N-terminal part of Rho effector, or Serine/threonine C-related kinase proteins (PKN/PRK), which occur in multiple isozyme forms. PKN1 (Protein Kinase N1) isoforms abound in neural cells, playing a role in cytoskeletal organization and neuronal differentiation. Neuropathologies like amyotrophic lateral sclerosis (ALS) and Alzheimer's disease arise due to malfunction of PKN1. The Hr1 domain interacts with the small GTPases Rho and Rac, regulating actin dynamics (Flynn et al., 1998; Thauerer et al., 2014; Watson et al., 2016). The Dak2 (di-Mg2+ ATP binding) domain is found in the dihydroxyacetone kinase family, which helps bacteria to incorporate host fatty acids into their membrane phospholipids via phosphotransferase activity (Kinch et al., 2005; Parsons et al., 2014). YARHG is a 70 amino acid-long extracellular domain, which gets its name from the corresponding conserved motif in the protein sequence. This domain is detected in peptidases and kinase proteins, and is predicted to bind the bacterial cell wall or adjacent components such as outer membrane lipid or lipopolysaccharide (Coggill et al., 2013; Coggill and Bateman, 2012). APC2 (cullin homology protein), a component of the anaphase promoting complex or cyclosome, is part of a ubiquitin ligase that regulates phase transition of mitosis (Puliyappadamba et al., 2011; Zhang et al., 2010; Zhou et al., 2011).
Some dengue virus isolates (such as P14337 and P29990) contain this domain. Citrate_ly_lig is the C-terminal domain in the cytosolic enzyme citrate lyase ligase, which catalyzes citrate fermentation into acetyl-CoA and oxaloacetate, coupled with ATP hydrolysis. However, apart from its role in lipid biosynthesis, this domain has been discovered to bind DNA as well (Meyer et al., 1997) and to play a role in tumor growth, following acetylation of lysine residues (Lin et al., 2013; Zaidi et al., 2012). DALR is an anticodon binding domain of tRNA synthetases (arginyl/cysteinyl), made of α helices. In humans, the DALR-containing protein DALRD3 interacts with the proteins WDR6 (WD Repeat Domain 6) and C3orf60 (chromosome 3 open reading frame 60), involved in autophagy and protein assembly, respectively (Grinchuk et al., 2010; Schyth et al., 2015). The DALR_1 domain detected in pollen might have a role in manipulating gene expression. The DALR_2 domain is found in cysteinyl-tRNA synthetases that link the amino acid to its cognate transfer RNA (Tveit et al., 2014). VKc is the catalytic subunit of vitamin K epoxide reductase. This enzyme processes blood coagulation factors to vitamin K (Oldenburg et al., 2006). WR1 is a domain in Worm-specific repeat type 1 proteins. This cysteine-rich domain is detected in the nematode Caenorhabditis elegans (Marchler-Bauer et al., 2014); however, many pathogenic viruses possess this domain or its homologues. This domain often co-occurs with KU (BPTI/Kunitz family of serine protease inhibitors) domains. PWI (proline-tryptophan-isoleucine) domains are present in components of the pre-mRNA processing machinery, the spliceosome, and are known to bind RNA as well as DNA (Szymczyna et al., 2003). PWI-like domains are present in the N terminus of helicases (e.g. Brr2) (Absmeier et al., 2015).
Zinc fingers are motifs known to bind DNA, which can be of many types such as BED (named after the Drosophila proteins BEAF and DREF), UBR1 (Ubiquitin Protein Ligase E3 Component N-Recognin 1), UBP (ubiquitin-binding domain), U1, LIM (named after the LIN-11, ISL-1 and MEC-3 proteins in Caenorhabditis elegans), TTF (transcription termination factor), DBF, CHCC, CDGSH (Cys-Asp-Gly-Ser-His), ZZ, PMZ (plant mutator transposase), and C4 (Gupta et al., 2012). ZnF_BED is a zinc finger domain in chromatin-boundary-element-binding proteins and transposases, required for terminal inverted repeat (TIR) and sub-terminal repeat binding, facilitating their autonomous transposition (Smith et al., 2012). The ZnF_A20 domain in the N terminus of the ZNF216 protein is an inhibitor of cell death-like zinc finger. This domain, crosstalking with the IKKgamma, RIP, and TRAF6 proteins, is involved in ubiquitin-mediated IL-1-induced NF-kappaB activation, apoptosis and proteasomal degradation (Huang et al., 2004; Searle et al., 2012). ZnF_NFX (nuclear transcription factor, X-box binding-like 1) is a zinc finger domain in several proteins, including the blast resistance protein Pi54 in rice (Gupta et al., 2012). Homologues of all these zinc finger motifs have been detected in pathogenic viruses like HCV, HIV, and dengue. ZM (ZASP (Z band alternatively spliced PDZ-containing protein)-like motif) is a pattern of about 26 aa in the α-actinin-binding protein ZASP and its homologues (Lin et al., 2014). The ZM domain plays a role in cytoskeletal protein-protein interactions and provides structural integrity to sarcomeres (Klaavuniemi et al., 2004). As a number of proteins involved in ion channel interactions, cytoplasmic and nuclear signaling, enzymatic reactions and cytoskeletal organization bind to the Z-line, mutation in ZASP leads to muscular diseases (Martinelli et al., 2014). Mutations and aberrant isoforms of ZASP can lead to myofibrillar myopathy, cardiomyopathy, etc.
In humans, ZASP binds to the mechanosensing protein Ankrd2 (Ankyrin Repeat Domain 2) and the tumor suppressor protein p53. The TUDOR domain, a 60 aa-long motif, is present in RNA-binding proteins and is involved in RNA metabolism and interactions. Several copies have been detected in arthropods like Drosophila (23 instances), and their epigenetic role in the modification of chromatin and gene expression has come forth (Altschul et al., 1997). Binding of this domain to methyl-arginine/lysine residues, ligands, microRNPs, small RNAs and PIWI (named after P-element Induced WImpy testis in Drosophila) proteins has been reported. The literature reveals their presence in fungi, protozoa, plants and metazoans (Ying and Chen, 2012), but in silico analyses are revealing their presence in viruses as well. TUDOR domain-containing proteins 1, 4 and 5 are antigens expressed on testis cells and are hallmarks of cancer (Yoon et al., 2011). The ability to manipulate human cell membranes is suggested by domains like MHC_II_beta and Integrin_B_tail. The MHC_II_beta (Class II histocompatibility antigen beta) domain is part of the MHC II glycoproteins expressed on antigen-presenting cells (APC) like macrophages, dendritic cells and B lymphocytes. These components are critical as they display fragmented antigens for recognition by helper T cells and the subsequent immune response (Vyas et al., 2008). The Integrin_B_tail (Integrin beta subunit cytoplasmic) domain is involved in cell adhesion (Bodeau et al., 2001). The Flo11 domain, made mostly of β sheets, occurs at the N terminus of the Flo11 protein (a flocculin family adhesion protein) found in yeast (Saccharomyces cerevisiae). This protein mediates hyphal formation and invasive growth and plays a role in inter-cellular communication (Goossens and Willaert, 2012; Kraushaar et al., 2015).
Excalibur (extracellular calcium-binding region) domains occur in bacterial surface proteins and show similarity with the Ca2+-binding loop of calmodulin-like EFh domains (Rigden et al., 2003). SVWC (Single domain von Willebrand factor type C) is a group of adhesin proteins. These cysteine-rich proteins play a role in immunity and disease. Bone morphogenetic protein (BMP) is regulated by proteins with a VWC domain such as chordin, CHL2 (chordin-like 2), and CV2 (crossveinless 2) (Fujisawa et al., 2009; Zhang et al., 2007). TNFR (tumor necrosis factor receptor/nerve growth factor receptor) are repeat-rich extracellular domains, with a role in growth factor and cytokine binding. TNF-α (tumor necrosis factor-alpha) is a cytokine mediating diverse inflammatory conditions. The pathological mechanism involves binding of TNF-α to TNFR (Deng, 2007). TNFR-1 acts as a death receptor on ligand-mediated activation, leading to apoptosis (Park et al., 2014). The TSP1 (thrombospondin) domain is characterized as regulating cell interactions in vertebrates. Thrombospondins are glycoproteins with calcium-dependent anti-angiogenic properties (Iruela-Arispe et al., 2004; Lawler and Lawler, 2012). The SCPU (Spore Coat Protein U) domain is found in a bacterial protein family that includes spore coat proteins, adhesive pili proteins and biofilm-forming proteins (Chin et al., 2015). The Myxococcus xanthus mcu gene cluster (a CU (chaperone/usher) gene cluster) plays a role in spore coat formation (Cao et al., 2015). SCP (sperm coating protein) is a member of the large family SCP/Tpx-1/Ag5/PR-1/Sc7, known to contain extracellular domains. This domain, spanning 120-170 aa and capable of acquiring an α-β-α sandwich conformation, has been identified in the nematode secretome, insect allergens, and semen. During pathogenesis, the expression of genes coding for this protein is upregulated, playing a role in immune exacerbation and chronic conditions (Chalmers et al., 2008).
DISIN (disintegrin) domains inhibit ligand-receptor association. The disintegrin proteins and metalloproteases are together termed ADAMs (a disintegrin and metalloprotease), which mediate cellular adhesion and recognition of sequences (Huang et al., 2003). An ADAM with thrombospondin type 1 repeats-13 (ADAMTS13) inhibits platelet aggregation and arterial thrombosis by cleavage of VWF (Xiao et al., 2011). The DISIN domain of the canary grass (Phalaris canariensis) pollen allergen Pha a 1 is likely to induce pathogenesis by interfering with the adhesion of integrins. The Amb_V domain is found in the Amb V pollen allergen of ragweed (Ambrosia sp.). A C-terminal helix is the key T cell epitope, leading to immune reactions, though free sulfhydryl groups play a role too (Canis et al., 2012). The presence of a similar domain in HCV indicates strong conservation of this domain. C4 is the C-terminal domain in type 4 procollagens, distributed in skin. This domain, with tandem repeats, renders the triple-helix collagen protein kinked and sheet-like. Mutation in this protein leads to autoimmune diseases like Goodpasture's syndrome (kidney and lung inflammation) and Alport syndrome (kidney disease) (Abreu-Velez and Howard, 2012). Cadherin proteins mediate calcium-dependent cell-cell adhesion and CNS (central nervous system) synapse control. The Cadherin_pro domain occurs at the N terminus of cadherins. This prodomain lacks cadherin-cadherin interaction ability, but cleavage of its prosequence in the endoplasmic reticulum (ER) and Golgi apparatus can activate the adhesive nature of the cadherin, conferring the ability to control synapses (Koch et al., 2004; Latefi et al., 2009; Reinés et al., 2012). The CCP (complement control protein) domain, containing SUSHI repeats (60 aa long and cysteine-rich), was identified in pathogenic viruses, in which it played a role in complement activation. The literature reports these CCP domains in arthropods like mosquitoes (including Aedes sp.)
and fruit flies (Drosophila sp.), acting as a human complement analog and eradicating bacteria. CHAD (conserved histidine alpha-helical domain) is an α-helical domain with conserved His residues, which chelates metals. It interacts with the CYTH domain present in adenylyl cyclase and the mammalian thiamine triphosphatases. Cell adhesion necessitates binding of integrins to their ligands, which can be influenced by multiple domains. B_lectin is a domain present in mannose-specific proteins. Apart from mannose, it recognizes N-acetylglucosamine, which can activate the classical complement pathway (Muto et al., 2001). The ACTIN domain is characteristic of the ACTIN subfamily in the ACTIN/mreB/sugarkinase/Hsp70 superfamily, whose members are clustered together by their common ATPase domain. Cortactin is an actin (F-actin and Arp2/3 complex)-binding protein, regulating cytoskeleton dynamics and cortical actin assembly (Shvetsov et al., 2009). The PROF domain binds to actin monomers, membrane polyphosphoinositides and poly-L-proline (Michaelsen-Preusse et al., 2016). Robl_LC7 (Roadblock/LC7 family) domains regulate dynein, a motor protein, mediating several other adaptive functions. Mgl is a type of Robl_LC7, the gene for which co-occurs with genes encoding small GTPases (such as the Ras superfamily involved in transduction pathways) (Miertzschke et al., 2011; Wuichet and Søgaard-Andersen, 2015). Also, Robl_LC7 domains group with the PROF domain under the profilin-like clan. MIT, involved in microtubule manipulation, is present in virulent strains (Zaire and Sudan) of Ebola, while missing in the avirulent strain (Reston). Kelch is a conserved domain with β-propeller topology. This repeat-rich domain is widely present across organisms, from viruses (Wilton et al., 2008) and plants to humans, and it mediates protein-protein interactions.
The kelch-like (KLHL) gene family is spread across multiple chromosomes in humans, and several of the encoded proteins bind to the E3 ligase cullin 3, playing a role in ubiquitination, signaling (such as NF-κB pathway inhibition), gene expression and actin binding, and are involved in several diseases (Dhanoa et al., 2013). GAS2 (Growth-arrest-specific protein 2) domains manipulate actin microfilaments, bind to microtubules and lead to cell division arrest. The Bet v protein essentially contains a GAS2 domain. B41, a plasma membrane-binding domain, appears to be critical for pathogenesis, which points to a role for this domain in attachment to the host membrane. A conserved neuronal protein, GRP1-associated scaffolding protein (GASP), has a B41 domain (as part of a FERM domain), implicated in binding to the membrane as well as to cytoskeletal elements like actin (MacNeil and Pohajdak, 2009). DPBB (double-psi beta-barrel) domains are N-terminal motifs present in lipoproteins like expansins (such as Phl p) (Kerff et al., 2008). TLC, the acronym of TRAM, LAG1 and CLN8 homology, is a domain in membrane proteins, and it has been linked to ceramide synthesis, lipid regulation and neural processes (Winter and Ponting, 2002). The dengue virus polyprotein has this domain, which seems plausible given that the virus manipulates the human nervous system. The AAI (alpha-amylase inhibitor) domain is a tetra-helix fold, which forms a part of LTPs (lipid transfer proteins) (Zottich et al., 2011). Polyketide synthases are a large group of multifunctional enzymes responsible for the elaboration of myriad secondary metabolites, the polyketides, including antibiotics (Anand et al., 2010; Ansari et al., 2004). The diverse array of polyketides is formed by molecular assembly, characterized by the successive addition of chain extension units. This group of enzymes occurs in bacteria, fungi and plants.
Apart from the acyl carrier protein (ACP), acyltransferase (AT), and ketosynthase domains, a variety of β-carbon processing domains (such as ER and KR) occur in these long, modular proteins (Cane, 2010), some of which are discussed here. PKS_ER are enoylreductase domains in polyketide synthase enzymes (Gu et al., 2009). PKS_KR is a ketoreductase that reduces a keto group to a hydroxy group. Studies have also found an epimerase activity of PKS_KR, which lies in the conserved Tyr or Ser, flanked by either Tyr or the triad of Leu, Asp, Asp residues (Bonnett et al., 2013; Xie et al., 2016). PKS_TE are domains in thioesterases, catalyzing the non-ribosomal synthesis of cyclic peptide antibiotics (Heathcote et al., 2001). PLP (proteolipid protein) is a transmembrane myelin protein, or lipophilin, playing a role in the stabilization of myelin sheaths and axonal survival. Mutant forms of this protein cause neuropathies like Pelizaeus-Merzbacher disease and spastic paraplegia type 2 (Arvanitis et al., 2002; Garbern et al., 1997; Miller et al., 2009). COLIPASE is a domain in a small pancreatic protein with five conserved disulfide bonds, playing a role in lipid metabolism (Berton et al., 2007). In humans, colipase-dependent pancreatic triglyceride lipase digests fat into fatty acids and monoacylglycerols (Johnson et al., 2013). COLIPASE is part of the flavivirus polyprotein propeptide (amino acids 119-204). UreE_C is the C-terminal domain of UreE, an accessory protein of urease, the enzyme that hydrolyses urea into ammonia and carbamic acid. A study reports that the Klebsiella aerogenes urease catalytic site binds nickel ions and interacts with accessory proteins, including UreE, for activation (Merloni et al., 2014; Song et al., 2001). Skp1 (S-phase kinase-associated protein 1) is a component of the kinase complex and it binds to F-box containing proteins like Cdc4, Skp2, and cyclin F.
These adapter proteins act as transcription elongation factors and carry out proteasomal degradation of target proteins (in the form of the Skp1-Cul1-F-box protein (SCF) complex, the E3 ubiquitin ligase) by ubiquitylation (Chandra Dantu et al., 2016; Yumimoto et al., 2013). The FoP_duplication domain, named after the C-terminal duplication domain of Friend of PRMT1, is a target of arginine methyltransferases in humans (van Dijk et al., 2010). Fop is associated with chromatin and is an activator of estrogen receptor target genes (van Dijk et al., 2010). Most isolates of dengue virus have this domain in their polyprotein. H4 is the domain of the histone H4 protein, which is likely to be involved in manipulating human interferon-beta (IFN-β) genes, as there is literature on hyperacetylation of H3 and H4 inactivating interferon gene expression (Parekh and Maniatis, 1999). A yeast study has reported the role of H4 in nucleosome assembly during replication (Shibahara et al., 2000). This domain is conserved in pathogenic viruses like dengue. HALZ is a homeobox-associated leucine zipper domain present in transcription factors. The homeodomain binds to DNA while the leucine zipper carries out protein-protein interactions. With its prolific growth regulatory role, this domain has been widely studied in plants (Elhiti and Stasolla, 2009). The HTH_MARR domain is a helix-turn-helix motif occurring in MarR-family transcriptional regulators, thus contributing to multiple antibiotic resistance. Also, this DNA binding domain is frequently found in hypothetical proteins, as in Staphylococcus aureus (Mohan and Venugopal, 2012), so it is likely to be present in other bacterial hypothetical proteins as well. Cyclin_C are domains in the Cyclin family of proteins, critical for cell cycle progression. These proteins and cyclin-dependent kinase (Cdk) enzymes work in sync to induce phosphorylation of RNA polymerase II.
The C-type cyclin, along with Cdk8 (recruited by the multi-protein complex Mediator), also responds to stress. It regulates transcription and modulates gene expression, and it recruits the mitochondrial fission machinery, which leads to mitochondrial fragmentation and programmed cell death (Strich and Cooper, 2014). Connexin_CCC is a cysteine-rich domain in the gap junction channel protein connexin (Riquelme et al., 2013). This protein has many subtypes. The C-terminal region of connexin43 regulates assembly, gating, and binding to regulatory proteins by undergoing phosphorylation by kinases (Shin et al., 2001). The BAG domain, found in heat shock protein regulators, acts as a co-chaperone of Hsp70 chaperones, supporting proper protein folding as well as the quality control and degradation pathways (Bracher and Verghese, 2015). The role of this domain in regulating the heat shock protein quality check pathways can be correlated with the pathogenesis of the isolates harboring it. BTP (found in bromodomain transcription factors and in proteins containing the PHD domain, a small protein domain) are domains of histone-like transcription factors (chromatin-associated proteins, histone acetyltransferases) (Koutelou et al., 2010). This domain recognizes lysine in histones and acetylates them, following which chromatin configuration and gene expression change, leading to viral replication, cancer, and inflammation (Sanchez and Zhou, 2009). BTAD (bacterial transcriptional activator domain) is present in Actinobacteria (Huang et al., 2015). BRLZ (basic region leucine zipper, also known as bZIP) is a domain in DNA-binding transcription regulators. This domain performs myriad critical gene regulation tasks by virtue of a certain degree of flexibility in its configuration (Miller, 2009). Some well-characterized members of this group of proteins include CREB (cAMP response element binding protein) (Thiel et al., 2005) and MafK (MAF bZIP transcription factor K) (Töröcsik et al., 2002).
The WHy domain occurs in water stress and hypersensitive response proteins, playing a role in adaptation to stress, including cold temperature and desiccation (Ciccarelli and Bork, 2005; Jaspard and Hunault, 2014). This domain is found in LEA (late embryogenesis abundant) proteins, which have been detected in bacteria, archaea, plants, and nematodes and are typically induced by exposure to stress conditions (Anderson et al., 2015). LANC_like domains are present in the membrane-associated lanthionine synthetase C-like protein (LanC) (Chen and Ellis, 2008). The proteins LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are produced profusely in the brain and testes, for their immune defense role (Landlinger et al., 2006; Mayer et al., 2001). Lanthionines (macrocyclic thioethers) are present in lantibiotics, a type of antimicrobial peptide elaborated by some bacterial strains. The OmpH (outer membrane protein H) domain has been found to be critical for pathogenesis. Pasteurella multocida, a Gram-negative bacterium of which OmpH is a major surface antigen, causes fatal diseases in animals (porcine atrophic rhinitis, bovine diseases), birds (fowl cholera) and sometimes in humans (Lee et al., 2007; Okay et al., 2012). Other enteric pathogenic bacteria, like Salmonella typhimurium, Escherichia coli, and Yersinia enterocolitica, also have ompH genes, which can be borne on the chromosome or on plasmids. OmpH shares homology with the α helix of HLA-B27 (a human leukocyte antigen subtype), which has been suspected to play a role in inflammatory arthritis (Singh and Karrar, 2014). The presence of this critical domain in the pollen of grasses suggests a similar pathogenic mechanism. Some other pathogenicity-causing motifs were readily identified in several pollens, such as OmpH in the KBG41 (Kentucky bluegrass) allergen.
The Bac_rhodopsin (bacteriorhodopsin-like protein for sensory function) domain is present in G protein-coupled receptors, a type of photoreceptor (Palfi et al., 2010). The Cg6151-P domain is part of a conserved membrane protein about 190-200 aa long, which has received little functional characterization (Yao et al., 2009). The 7TM_GPCR_Srsx domain occurs in the serpentine-type seven-transmembrane G-protein-coupled receptor class chemoreceptor Srsx (a member of the Srg superfamily) (Nagarathnam et al., 2012). This domain has been detected in pathogenic viruses, including dengue virus. AAA domains occur in ATPase metalloproteases, membrane-tethered enzymes with a wide array of functions, including degradation of misfolded proteins, membrane quality control, membrane fusion, and DNA replication (Krzywda et al., 2002). The AAA domain is located in the middle of the protein, and it often co-occurs with a C-terminal Zn2+-dependent protease (Scharfenberg et al., 2015). The viral non-structural protein NS4A contains this domain as well. Knot1 is a domain in knottins, a broad array of proteins including plant lectins, antimicrobial peptides (e.g. the plant cyclotide kalata B1), plant proteinase/amylase inhibitors, plant γ-thionins and arthropod defensins. These proteins possess a multitude of functions, including inhibitory, cytotoxic, antiviral or hormone-like activity (Gracy et al., 2008). The cysteine-rich domain derives its name from its knot-like topology. Three interconnected disulfide bridges in the knot give the protein high stability against temperature, pH and chemicals (Herzig and King, 2015). This protein has shown potential to block insect voltage-gated calcium channels (Herzig and King, 2015). NH (neurohypophysial hormone) is a domain in proteins of the vasopressin/oxytocin gene family.
The neurophysin protein with this domain serves as a carrier for the peptide hormone oxytocin, whose signaling is regulated by the phosphatidylinositol-calcium second messenger system (de Bree and Burbach, 1998; Elphick and Rowe, 2009; Van Kesteren et al., 1995). Galanin is a 29 aa-long neuropeptide that controls growth hormone, insulin, somatostatin, adrenal secretion, smooth muscle activity, etc. Through its endocrine regulation, it intervenes in pain, inflammation, memory, learning, mood swings, feeding, and sexual activities (Kask et al., 1996). The role of this peptide in neural diseases, angiogenesis, cancer, obesity and diabetes has come to light as well (Poritsanos et al., 2009; Stevenson et al., 2012). Pathogenesis via the stimulation of phospholipase C (GAL2) has been recognized (Lang et al., 2015). IB (insulin growth factor-binding) domain-containing proteins are growth factors, which bind to receptors to exert their functions (Siegfried et al., 1992). Elicitins are a group of plant necrotic proteins exuded by phytopathogenic fungi and oomycetes like Phytophthora, Pythium, Hyaloperonospora, Albugo, etc. (Uhlíková et al., 2016). This PAMP domain is cysteine-rich (about six residues), and it possesses versatile functionality, including the manipulation of host signaling pathways. Some of this domain's sulfur-containing residues have been identified as glycosylation sites. The pythiosis-causing Pythium insidiosum elaborates elicitin, which might be mediating human pathogenesis. Elicitin is a sterol-carrying protein that might be sequestering human cholesterol (Lerksuthirat et al., 2015). The β isoform of elicitin has higher plasma membrane affinity than the α isoform. The genes inf2A and inf2B were identified to induce elicitin activity (Huitema et al., 2005). Interestingly, pathogenic bacteria like Mycobacterium tuberculosis have three inf genes as well.
CT or CTCK (C-terminal cystine knot-like) domains are present in growth factors such as TGFβ (transforming growth factor-beta), NGF (nerve growth factor), PDGF (platelet-derived growth factor) and hCG (human chorionic gonadotropin). The knot formed of six cysteines is conserved in the CT domain, though the proteins harboring it can assume multimeric forms, mediating an array of functions like cell growth, embryonic development, organogenesis, intercellular communication, differentiation, tissue repair and remodeling. This domain occurs in VWF (von Willebrand factor), a glycoprotein involved in cell adhesion and hemostasis (Zhou and Springer, 2014), and in mucins (Iyer and Acharya, 2011). ACTH_domain, which is present in corticotrophins, has been linked to viral immune evasion. According to the literature, SARS (Severe Acute Respiratory Syndrome) and influenza viruses manipulate the host corticosteroid stress response to circumvent the immune response by expressing a protein homologous to host ACTH (adrenocorticotropin hormone) (Wheatland, 2004). As host immunity produces antibodies against the viral ACTH, these antibodies bind to host ACTH as well, leading to adrenal gland injuries and hampering corticosteroid secretion (Wheatland, 2004). Adrenal deficiency and a dearth of ACTH have also been reported in HIV patients (Shashidhar and Shashikala, 2012). ACTH_domain has been detected only in dengue serotype 2 isolates (P14337, P29990, Q9WDA6, etc.). This ACTH-based molecular mimicry mechanism might be linked to the higher virulence of this serotype. Amelins (ameloblastin precursors) are a group of proteins found in mammalian enamel matrix. Ameloblastin plays a role in tooth crystal formation as a growth factor, though it has also been detected in the extracellular matrix during embryogenesis and has been found to play a role in bone repair (Tamburstuen et al., 2011). Ameloblastin binds to calcium and is sensitive to matrix proteases like enamelysin and kallikrein.
A study has found that ameloblastin can regulate genes related to immune responses, by expression of cytokines and induction of STAT (signal transducer and activator of transcription) in the interferon pathway (Tamburstuen et al., 2010). Interestingly, analysis has shown that HCV and HIV have the amelin domain in their glycoproteins. The DEP domain (named after the proteins Dishevelled, Egl-10, and Pleckstrin) is present in G-protein signaling regulatory proteins. This globular domain, made of a three-helix bundle, a β-hairpin and two other β-strands, modulates signal transduction by manipulating GTPase activity (Capelluto et al., 2014; Wong et al., 2000). DDRGK is a domain occurring widely in plant and vertebrate proteins, and it is named after the corresponding amino acid motif. Studies reveal its role in multiple cell signaling pathways, including NF-kappaB signaling (Wu et al., 2010). Cache_2 is an extracellular domain involved in signaling via recognition of small molecules. Proteins forming voltage-gated Ca2+ channels and bacterial chemotaxis receptors possess this domain. This domain has been well studied in Vibrio cholerae (Upadhyay et al., 2016). CHASE (cyclase/histidine kinase-associated sensing extracellular) is a conserved extracellular sensory domain that helps in the perception of environmental changes. As the name indicates, this domain is present in signal transducing systems like histidine kinases, adenylate cyclases, diguanylate cyclases, serine/threonine protein kinases, phosphodiesterases and methyl-accepting chemotaxis proteins. CHASE domains can be of many types based on function, of which CHASE2, 3 and 6 are well studied. CHASE2 domains are part of serine/threonine kinases and are followed by transmembrane helices (Mascher et al., 2006; Zhulin et al., 2003). An adequate number of studies report their presence in the signal sensing proteins of bacteria (Cyanobacteria, etc.); however, their presence in viruses and pollen is a rather new discovery.
AgrB (accessory gene regulator B) family proteins include AgrB from Staphylococcus aureus and FsrB from Enterococcus faecalis, both of which regulate the expression of virulence genes (Robinson et al., 2005). These proteins are part of the quorum-sensing apparatus of the bacteria, coordinating bacterial communication (Hsieh et al., 2008). These signaling genes have also been recently discovered in bacteriophage genomes (Hargreaves et al., 2014). This domain is assumed to perform a regulatory role in the dengue virus. Fig. 1 illustrates the pathogenic mechanisms of these domains. The PRP (prion protein) domain occurs in prion proteins, known to cause neural diseases in animals and humans, such as scrapie, bovine spongiform encephalopathy (BSE), kuru and Creutzfeldt-Jakob disease. The disease progresses when the cellular α-helix-rich prion protein converts into a β-sheet-rich, amyloid fibril-forming form (Krammer et al., 2008; Kupfer et al., 2009). The IGR domain is found in fungal and plant proteins; however, its annotation is sparse and its function is unknown. Antimicrobial21 is a plant peptide with two disulfide bonds, which give the peptide an α-helical hairpin fold topology. This peptide is antimicrobial and antifungal; it binds to fungal conidia, penetrates them and pools in the cytoplasm, leading to fungal death (Gautam et al., 2012; Nolde et al., 2011). Several DUFs (domains of unknown function), though poorly annotated, frequently occur in pathogenesis-related proteins. DUF1237 occurs in an Ebola virus isolate of the Zaire strain (Q6V1Q2). This domain overlaps with the B41 domain, and the domains preceding it (IENR1, DEP, LamG, Lipid_DES, YqgFc) are exactly the same as those in another Zaire isolate (A0A0G2Y8I7), which indicates that DUF1237 might be just a modified form of the B41 domain.
DUF1338 in DENV-1 (P17763) and DENV-4 (Q2YHF0 and Q5UCB8) has a zinc-binding function, and it is part of a putative metal hydrolase (Marchler-Bauer et al., 2014). Further information on these protein domains is available in the SMART database (Ponting et al., 1999). Even if some of the domains are non-functional, the findings indicate homology and phylogenetic conservation among organisms. Also, positional shuffling of the domains affirms the mosaic nature of the viral nucleic acid, which has already been proven in some DNA viruses. Table 1 lists all the pathogenically critical protein domains outlined above. The diverse repertoires of domains have originated by stepwise or drastic reshuffling, depending on the stressors encountered. Several domains co-occur in one protein and crosstalk for critical functions. A large number of the domains discussed above manipulate host actin protein, hormones and neurons. It is striking that, despite being at different levels of the evolutionary hierarchy, organisms share a significant number of domains. The number of identified domains is vast and ever-increasing, and the domains are frequently re-annotated. Yet, the domains characterized here constitute the core of the pathogenesis mechanisms exploited by most pathogens and allergens. Many proteins are intrinsically unstructured (such as stress-encountering proteins), yet they have highly conserved, structured domains. These domains are clues to the phylogenetic origin, evolutionary trajectories and permutation paths leading to the origin of other protein domains. Many of these critical domains occur in hypothetical proteins of pathogenic bacteria, which are normally ignored while searching for drug targets. It can be hypothesized that hypothetical proteins with any of these domains are likely to be virulence factors and are eligible to be targeted.
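The screening idea behind this hypothesis can be illustrated with a minimal sketch. It assumes that domain annotations have already been obtained (for example from SMART or CDD) and stored in a simple dictionary; the protein identifiers, the record layout and the short domain shortlist below are hypothetical and serve only as an illustration, not as a method used in this review.

```python
# Illustrative shortlist only; the names are taken from domains discussed in this review.
PATHOGENICITY_DOMAINS = {"OmpH", "HTH_MARR", "CHASE2", "AgrB", "Elicitin"}


def flag_candidates(annotations, domains=PATHOGENICITY_DOMAINS):
    """Return {protein_id: matching domains} for hypothetical proteins only.

    `annotations` maps a protein ID to a record with a free-text
    'description' and a list of annotated 'domains' (assumed layout).
    """
    flagged = {}
    for protein_id, record in annotations.items():
        # Keep only proteins annotated as hypothetical.
        if "hypothetical" not in record["description"].lower():
            continue
        # Intersect the protein's domains with the shortlist.
        hits = sorted(set(record["domains"]) & domains)
        if hits:
            flagged[protein_id] = hits
    return flagged


# Made-up example records (hypothetical identifiers).
example = {
    "prot_A": {"description": "hypothetical protein", "domains": ["HTH_MARR", "DUF1237"]},
    "prot_B": {"description": "DNA gyrase subunit A", "domains": ["AAA"]},
    "prot_C": {"description": "Hypothetical protein", "domains": ["DUF1338"]},
}

print(flag_candidates(example))
```

In this toy input only prot_A is flagged, because it is annotated as hypothetical and carries a shortlisted domain. Any such flag would be a starting point for experimental validation rather than evidence of virulence.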
Patel has analyzed and reviewed this area extensively, shedding light on the conserved domains and their obligatory role in pathogenesis (Patel, 2016a, 2016b, 2016c, 2017a, 2017b; Patel and Patel, 2016). The current work will be an interesting addition in this direction. To conclude, the most-conserved domains in pathogens and allergens are generally VWC, YARHG, WH1, RICTOR_M, Pro-kuma_activ, IENR1, B41, Y1_Tnp, HOX, HOLI, PLCYc, Hr1, H4, GGDEF, LPD_N, CHASE2, Galanin, Dak2, DALR_1, HAMP, PWI, EFh, Excalibur, CT, PbH1, HELICc, Kelch, Robl_LC7, YaeQ, PreSET, Bet_v_1, GAS2, CHAD, Integrin_B_tail, MHC_II_beta, DISIN, etc. Using these domains as clues, virulence agents and inflammation mediators can be identified. The genes responsible for coding these protein sequences deserve attention. The author declares that there is no competing interest. This work does not involve human participants or animal models. There are no coauthors, so consent is not required.
Collagen IV in normal skin and in pathological processes A noncanonical PWI domain in the N-terminal helicase-associated region of the spliceosomal Brr2 protein HAMP domain conformers that propagate opposite signals in bacterial chemoreceptors Structure of concatenated HAMP domains provides a mechanism for signal transduction Gapped BLAST and PSI-BLAST: a new generation of protein database search programs SBSPKS: structure based sequence analysis of polyketide synthases A novel bacterial Water Hypersensitivity-like protein shows in vivo protection against cold and freeze damage NRPS-PKS: a knowledge-based resource for analysis of NRPS/PKS megasynthases Structure of acid phosphatases Myelin proteolipid protein, basic protein, the small isoform of myelin-associated glycoprotein, and p42MAPK are associated in the Triton X-100 extract of central nervous system myelin SHCBP1 is required for midbody organization and cytokinesis completion The crystal structure of yeast thiamin pyrophosphokinase Resolution of holliday junctions by RuvC resolvase: cleavage specificity and DNA distortion Improved membrane protein topology prediction by domain assignments Role of the structural domains in the functional properties of pancreatic lipase-related protein 2 Molecular cloning, characterisation and expression analysis of melanoma differentiation associated gene 5 (MDA5) of green chromide, Etroplus suratensis Trimethylation of histone H3 lysine 4 impairs methylation of histone H3 lysine 9: regulation of lysine methyltransferases by physical interaction with their substrates The structural basis for substrate versatility of chloramphenicol acetyltransferase CATI A functional comparison of mutations in integrin beta cytoplasmic domains: effects on the regulation of tyrosine phosphorylation, cell spreading, cell attachment and beta1 integrin conformation ADP-ribosylation factor (ARF) interaction is not sufficient for yeast GGA protein function or localization Structural and 
stereochemical analysis of a modular polyketide synthase ketoreductase domain required for the generation of a cis-alkene GrpE, Hsp110/Grp170, HspBP1/Sil1 and BAG domain proteins: nucleotide exchange factors for Hsp70 molecular chaperones Extended-spectrum beta-lactamases in the 21st century: characterization, epidemiology, and detection of this important resistance threat Predicting enzyme subclass by functional domain composition and pseudo amino acid composition RUN domains: a new family of domains involved in Ras-like GTPase signaling Programming of erythromycin biosynthesis by a modular polyketide synthase IgE reactivity patterns in patients with allergic rhinoconjunctivitis to ragweed and mugwort pollens Identification of a putative flavin adenine dinucleotide-binding monooxygenase as a regulator for Myxococcus xanthus development Biophysical and molecular-dynamics studies of phosphatidic acid binding by the Dvl-2 DEP domain Roles of phosphatidate phosphatase enzymes in lipid metabolism Developmentally regulated expression, alternative splicing and distinct sub-groupings in members of the Schistosoma mansoni venom allergenlike (SmVAL) gene family Molecular dynamics simulations elucidate the mode of protein recognition by Skp1 and the F-box domain in the SCF complex Solution structure of the ubiquitin-associated domain of human BMSC-UbP and its complex with ubiquitin GCR2 is a new member of the eukaryotic lanthionine synthetase component C-like protein family Global transcriptional analysis of Burkholderia pseudomallei high and low biofilm producers reveals insights into biofilm production and virulence The WHy domain mediates the response to desiccation in plants and bacteria The YARHG domain: an extracellular domain in search of a function Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila 1.2 A crystal structure of the serine carboxyl proteinase pro-kumamolisin; structure of an intact pro-subtilase 
Structural basis of transcription: RNA polymerase II at 2.8 angstrom resolution Identification of a membrane protein involved in activation of the KinB pathway to sporulation in Bacillus subtilis Structure-function relationships of the vasopressin prohormone domains Vaccinia virus uracil DNA glycosylase has an essential role in DNA synthesis that is independent of its glycosylase activity: catalytic site mutations reduce virulence but not virus replication in cultured cells Tumor necrosis factor receptor pre-ligand assembly domain is an important therapeutic target in inflammatory arthritis Update on the Kelch-like (KLHL) gene family Serine proteases Thousands of rab GTPases for the cell biologist The SET-domain protein superfamily: protein lysine methyltransferases The transcript elongation factor SPT4/SPT5 is involved in auxin-related gene expression in Arabidopsis Structure and function of homodomain-leucine zipper (HD-Zip) proteins NGFFFamide and echinotocin: structurally unrelated myoactive neuropeptides derived from neurophysin-containing precursors in sea urchins Multiple interactions of PRK1 with RhoA. Functional assignment of the Hr1 repeat motif The binding of von Willebrand factor type C domains of Chordin family proteins to BMP-2 and Tsg is mediated by their SD1 subdomain A census of membrane-bound and intracellular signal transduction proteins in bacteria: bacterial IQ, extroverts and introverts Proteolipid protein is necessary in peripheral as well as central myelin ExPASy: the proteomics server for in-depth protein knowledge and analysis CPPsite: a curated database of cell penetrating peptides. 
Database (Oxford) 2012, bas015 Allostery in trypsin-like proteases suggests new therapeutic strategies Mutations in phospholipase DDHD2 cause autosomal recessive hereditary spastic paraplegia (SPG54) Protein domains of unknown function are essential in bacteria The N-terminal domain of the Flo11 protein from Saccharomyces cerevisiae is an adhesin without mannose-binding activity KNOTTIN: the knottin or inhibitor cystine knot scaffold in 2007 Integrative analysis of the human cis-antisense gene pairs, miRNAs and their transcription regulation patterns Metamorphic enzyme assembly in polyketide diversification Carbohydrate-binding domains: multiplicity of biological roles PKHD1 sequence variations in 78 children and adults with autosomal recessive polycystic kidney disease and congenital hepatic fibrosis Comparative analysis of zinc finger proteins involved in plant disease resistance Structure of Xanthomonas axonopodis pv. citri YaeQ reveals a new compact protein fold built around a variation of the PD-(D/E)XK nuclease motif What does the talking?: quorum sensing signalling genes discovered in a bacteriophage genome Role of type II thioesterases: evidence for removal of short acyl chains produced by aberrant decarboxylation of chain extender units Sequence profile of the parallel beta helix in the pectate lyase superfamily The cystine knot is responsible for the exceptional stability of the insecticidal spider toxin ω-Hexatoxin-Hv1a Regulation of Rot expression in Staphylococcus aureus ZNF216 Is an A20-like and IkappaB kinase gamma-interacting inhibitor of NFkappaB activation UNC-71, a disintegrin and metalloprotease (ADAM) protein, regulates motor axon guidance and sex myoblast migration in C. elegans Environmental sensing in actinobacteria: a comprehensive survey on the signaling capacity of this phylum Differences in intensity and specificity of hypersensitive response induction in Nicotiana spp. 
by INF1, INF2A, and INF2B of Phytophthora infestans The HAMP domain structure implies helix rotation in transmembrane signaling Genetics and pathogenesis of polycystic kidney disease Roles of SAM and DDHD domains in mammalian intracellular phospholipase A1 KIAA0725p Arf GTPase-activating protein ASAP1 interacts with Rab11 effector FIP3 and regulates pericentrosomal localization of transferrin receptor-positive recycling endosome Thrombospondin modules and angiogenesis The catalytic domains of thiamine triphosphatase and CyaBlike adenylyl cyclase define a novel superfamily of domains that bind organic phosphates Extensive domain shuffling in transcription regulators of DNA viruses and implications for the origin of fungal APSES transcription factors Tying the knot: the cystine signature and molecularrecognition processes of the vascular endothelial growth factor family of angiogenic cytokines Comparison of amino acids physico-chemical properties and usage of late embryogenesis abundant proteins, hydrophilins and WHy domain Modular evolution of phosphorylation-based signalling systems Inhibition of human serine racemase, an emerging target for medicinal chemistry Pancreatic lipase-related protein 2 digests fats in human milk and formula in concert with gastric lipase and carboxyl ester lipase Glycerol-3-phosphate acyltransferase 1 is essential for the immune response to infection with coxsackievirus B3 in mice The HWE histidine kinases, a new family of bacterial two-component sensor kinases with potentially diverse roles in environmental signaling Delineation of the peptide binding site of the human galanin receptor Crystal structure and activity of Bacillus subtilis YoaJ (EXLX1), a bacterial expansin that promotes root colonization EDD, a novel phosphotransferase domain common to mannose transporter EIIA, dihydroxyacetone kinase, and DegV Structural and functional studies of the HAMP domain of EnvZ, an osmosensing transmembrane histidine kinase in Escherichia coli 
The ZASP-like motif in actininassociated LIM protein is required for interaction with the alpha-actinin rod and for targeting to the muscle Z-line Structure of the neural (N-) cadherin prodomain reveals a cadherin extracellular domain-like fold without adhesive characteristics Beta-lactam antibiotics: from antibiosis to resistance and bacteriology Multiple faces of the SAGA complex Dynamic interactions of Sup35p and PrP prion protein domains modulate aggregate nucleation and seeding Interactions by the fungal Flo11 adhesin depend on a fibronectin type III-like adhesin domain girdled by aromatic bands The tyrosine kinase Pyk2 regulates Arf1 activity by phosphorylation and inhibition of the Arf-GTPase-activating protein ASAP1 The crystal structure of the AAA domain of the ATP-dependent protease FtsH of Escherichia coli at 1.5 A resolution Prion protein misfolding Myristoylation of human LanC-like protein 2 (LANCL2) is essential for the interaction with the plasma membrane and the increase in cellular sensitivity to adriamycin The nicking homing endonuclease I-BasI is encoded by a group I intron in the DNA polymerase gene of the Bacillus thuringiensis phage Bastille Physiology, signaling, and pharmacology of galanin peptides and receptors: three decades of emerging diversity N-cadherin prodomain cleavage regulates synapse formation in vivo Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae Molecular basis for the regulation of angiogenesis by thrombospondin-1 and -2. 
Cold Spring Harb Outer membrane protein H for protective immunity against Pasteurella multocida The elicitin-like glycoprotein, ELI025, is secreted by the pathogenic oomycete Pythium insidiosum and evades host antibody responses Acetylation stabilizes ATP-citrate lyase to promote lipid biosynthesis and tumor growth Z-disc-associated, alternatively spliced, PDZ motif-containing protein (ZASP) mutations in the actinbinding domain cause disruption of skeletal muscle actin filaments in myofibrillar myopathy Functional roles of DExD/H-box RNA helicases in pre-mRNA splicing The bacterial cell division proteins FtsA and FtsZ selforganize into dynamic cytoskeletal patterns Phylogenomic analysis of the uracil-DNA glycosylase superfamily Getting a GRASP on CASP: properties and role of the cytohesin-associated scaffolding protein in immunity CDD: NCBI's conserved domain database ZASP interacts with the mechanosensing protein Ankrd2 and p53 in the signalling network of striated muscle Stimulus perception in bacterial signaltransducing histidine kinases. Microbiol HAMP domain rotation and tilting movements associated with signal transduction in the PhoQ sensor kinase Organization and chromosomal localization of the human and mouse genes coding for LanC-like protein 1 (LANCL1) Thiamine pyrophosphokinase deficiency in encephalopathic children with defects in the pyruvate oxidation pathway HNH family subclassification leads to identification of commonality in the His-Me endonuclease superfamily Molecular and cellular pathogenesis of autosomal recessive polycystic kidney disease. 
Braz Molecular landscape of the interaction between the urease accessory proteins UreE and UreG Global spread of dengue virus types: mapping the 70 year history In vitro binding of the response regulator CitB and of its carboxy-terminal domain to A + T-rich DNA target sequences in the control region of the divergent citC and citS operons of Klebsiella pneumoniae Neuronal profilins in health and disease: relevance for spine plasticity and Fragile X syndrome Structural analysis of the Ras-like G protein MglA and its cognate GAP MglB and implications for bacterial polarity The importance of being flexible: the case of basic region leucine zipper transcriptional regulators Neuronal expression of the proteolipid protein gene in the medulla of the mouse The C-terminal domain of RNA polymerase II functions as a phosphorylation-dependent splicing activator in a heterologous protein Computational structural and functional analysis of hypothetical proteins of Staphylococcus aureus Independent subtilases expansions in fungi associated with animals Biological activities of human mannosebinding lectin bound to two different ligand sugar structures, Lewis A and Lewis B antigens and high-mannose type oligosaccharides Cross-genome clustering of human and C. 