This table lists the bigrams (two-word phrases) found in your study carrel, together with their frequencies. Search and browse the list to learn more about the carrel.
bigram | frequency |
---|---|
amino acid | 1069 |
amino acids | 721 |
dna sequences | 395 |
genome sequence | 368 |
protein sequences | 313 |
secondary structure | 291 |
protein sequence | 239 |
graphical representation | 238 |
sequence alignment | 218 |
sequence data | 208 |
dna sequence | 204 |
acid sequence | 201 |
complete genome | 199 |
protein structure | 197 |
nucleotide sequence | 183 |
genome sequences | 181 |
phylogenetic analysis | 179 |
human genome | 178 |
sequence analysis | 176 |
rna viruses | 175 |
phylogenetic tree | 160 |
active site | 154 |
secondary structures | 150 |
closely related | 147 |
gene expression | 143 |
binding site | 140 |
viral genome | 132 |
sequence accession | 130 |
acid sequences | 129 |
viral sequences | 128 |
nucleic acid | 127 |
machine learning | 124 |
multiple sequence | 120 |
sequence similarity | 120 |
binding sites | 120 |
cord uid | 119 |
large number | 119 |
doc id | 119 |
coding sequences | 116 |
respiratory syndrome | 115 |
genome sequencing | 111 |
viral genomes | 109 |
solid phase | 109 |
nucleotide sequences | 108 |
viral rna | 108 |
acid residues | 107 |
influenza virus | 107 |
highly conserved | 106 |
present study | 106 |
intrinsically disordered | 104 |
molecular dynamics | 103 |
sequence identity | 103 |
data sets | 102 |
acute respiratory | 100 |
antimicrobial peptides | 100 |
biological activity | 99 |
affinity | 99 |
cell lines | 97 |
protein structures | 97 |
phylogenetic trees | 96 |
reading frames | 96 |
structure prediction | 96 |
mass spectrometry | 95 |
rna polymerase | 94 |
peptide synthesis | 94 |
severe acute | 91 |
escherichia coli | 91 |
open reading | 91 |
sequence space | 91 |
immune response | 91 |
sequence comparison | 91 |
infectious diseases | 91 |
side chains | 91 |
reading frame | 86 |
intrinsic disorder | 86 |
protein interactions | 86 |
immune system | 85 |
data set | 85 |
important role | 84 |
gene sequences | 84 |
side chain | 83 |
widely used | 83 |
first | 82 |
neural networks | 80 |
similarity dissimilarity | 80 |
two different | 78 |
sequences based | 77 |
molecular biology | 77 |
crystal structure | 76 |
two sequences | 76 |
nucleic acids | 75 |
protein folding | 75 |
modified | 75 |
nmr spectroscopy | 73 |
conformational changes | 73 |
whole genome | 72 |
wild type | 72 |
immune repertoire | 72 |
signal sequence | 70 |
sequence information | 70 |
codon usage | 70 |
type i | 70 |
generation sequencing | 69 |
maximum likelihood | 69 |
repetitive sequences | 68 |
sequence alignments | 68 |
results show | 68 |
deep learning | 68 |
results suggest | 67 |
binding domain | 67 |
coding regions | 66 |
reference sequence | 66 |
infectious disease | 66 |
represented sequences | 65 |
spike protein | 65 |
molecular weight | 65 |
class i | 64 |
novel coronavirus | 64 |
public health | 64 |
antimicrobial activity | 64 |
graphical representations | 63 |
full length | 62 |
sequencing data | 62 |
identified | 61 |
rna virus | 61 |
query sequence | 61 |
fluorescence | 60 |
electron microscopy | 60 |
free energy | 60 |
genomic sequences | 60 |
protein families | 60 |
small molecules | 60 |
wide range | 59 |
circular dichroism | 59 |
genome size | 59 |
comparative modeling | 59 |
coding region | 58 |
sequence database | 58 |
phase peptide | 58 |
may also | 58 |
coding sequence | 58 |
modern hopfield | 57 |
type ii | 57 |
throughput sequencing | 57 |
membrane proteins | 57 |
cell culture | 56 |
performed using | 56 |
viral proteins | 56 |
disordered proteins | 56 |
structural proteins | 56 |
immune responses | 56 |
attention mechanism | 56 |
biological sequences | 55 |
transcription factors | 55 |
previous studies | 55 |
dengue virus | 55 |
cell receptor | 54 |
binding proteins | 53 |
efficient | 53 |
results indicate | 53 |
modification | 53 |
mosaic virus | 53 |
drug discovery | 53 |
protein engineering | 53 |
reference sequences | 53 |
genetic diversity | 53 |
repetitive dna | 52 |
leaf curl | 52 |
idps idprs | 52 |
high throughput | 52 |
i i | 52 |
based methods | 51 |
recent years | 51 |
capsid protein | 51 |
sequence length | 51 |
repertoire classification | 50 |
fcov zu | 50 |
purification | 50 |
protein coding | 50 |
hopfield networks | 49 |
consensus sequence | 49 |
genetic code | 49 |
rna secondary | 49 |
additional file | 49 |
viral families | 49 |
syndrome coronavirus | 49 |
bloom filter | 49 |
naturally occurring | 49 |
sequence motifs | 49 |
terminal region | 49 |
target protein | 48 |
sequences using | 48 |
draft genome | 48 |
cell line | 48 |
base pairs | 48 |
dependent rna | 48 |
ray crystallography | 48 |
neural network | 48 |
specificity | 47 |
building blocks | 47 |
purified | 47 |
previously described | 47 |
reverse transcriptase | 47 |
binding protein | 46 |
nervous system | 46 |
dna sequencing | 46 |
plasma membrane | 46 |
electrostatic interactions | 46 |
least one | 46 |
immunodeficiency virus | 46 |
see table | 46 |
even though | 46 |
primary sequence | 45 |
dna binding | 45 |
alignment methods | 45 |
similarity measure | 45 |
significant | 45 |
high resolution | 45 |
membrane protein | 45 |
cell surface | 45 |
multiple alignment | 45 |
acid composition | 45 |
protein function | 45 |
sequence homology | 45 |
point mutations | 44 |
first step | 44 |
strand rna | 44 |
novo assembly | 44 |
protein secondary | 44 |
target sequence | 43 |
clinical samples | 43 |
orf ab | 43 |
united states | 43 |
virus discovery | 43 |
numerical characterization | 43 |
national center | 43 |
present work | 43 |
total number | 43 |
signal transduction | 43 |
chain reaction | 43 |
aug codon | 43 |
polymerase chain | 42 |
gel electrophoresis | 42 |
influence | 42 |
biologically active | 42 |
sars coronavirus | 42 |
crystal structures | 42 |
time pcr | 42 |
key role | 42 |
small molecule | 42 |
avian influenza | 42 |
tertiary structure | 42 |
synthetic peptides | 41 |
magnetic resonance | 41 |
terminal domain | 41 |
physicochemical properties | 41 |
virus infection | 41 |
cell death | 41 |
transcription factor | 41 |
molecular evolution | 41 |
viral replication | 41 |
high affinity | 41 |
probe design | 41 |
biological sequence | 41 |
host cell | 41 |
genome project | 41 |
protein interaction | 41 |
alignment search | 41 |
dynamic programming | 41 |
antimicrobial peptide | 41 |
reference genome | 41 |
protein domains | 41 |
hydrogen bonds | 41 |
homology modelling | 40 |
data analysis | 40 |
comparative models | 40 |
binding affinity | 40 |
biological function | 40 |
breast cancer | 40 |
genomic data | 40 |
stranded rna | 40 |
growth factor | 40 |
cancer cells | 40 |
dynamics simulations | 39 |
positively charged | 39 |
virus variation | 39 |
serine protease | 39 |
accession number | 39 |
phase synthesis | 39 |
next generation | 39 |
primary structure | 39 |
commonly used | 39 |
gene sequence | 39 |
viral sequence | 39 |
virus type | 39 |
modifications | 38 |
cell types | 38 |
comparative analysis | 38 |
determine whether | 38 |
stem cells | 38 |
untranslated region | 38 |
sequences within | 38 |
feature extraction | 38 |
nuclear magnetic | 38 |
small number | 38 |
disordered regions | 38 |
containing proteins | 38 |
class ii | 38 |
standard deviation | 38 |
enzymatic activity | 38 |
immunosequencing data | 38 |
ab initio | 37 |
expression levels | 37 |
structural folds | 37 |
three different | 37 |
different species | 37 |
stop codons | 37 |
computational methods | 37 |
solvent accessibility | 37 |
real time | 37 |
receptor binding | 37 |
structural features | 37 |
specific primers | 37 |
comparative genomics | 37 |
common ancestor | 37 |
local alignment | 37 |
coat protein | 37 |
human immunodeficiency | 37 |
template structure | 37 |
implanted signals | 36 |
bat species | 36 |
rna sequences | 36 |
sequence databases | 36 |
high degree | 36 |
total rna | 36 |
fatty acid | 36 |
ebola virus | 36 |
ordered proteins | 36 |
beta globin | 36 |
recent studies | 36 |
efficiency | 36 |
biological processes | 36 |
identification | 36 |
virus genome | 36 |
globin gene | 36 |
complete nucleotide | 36 |
convolutional neural | 35 |
evolutionary relationships | 35 |
metagenomic sequencing | 35 |
two types | 35 |
west nile | 35 |
cell adhesion | 35 |
structural information | 35 |
cyclic peptides | 35 |
distance matrix | 35 |
may play | 35 |
dimensional structure | 35 |
protein domain | 35 |
protein data | 34 |
also known | 34 |
analysis based | 34 |
novel viruses | 34 |
protein synthesis | 34 |
coiled coil | 34 |
protein complexes | 34 |
different types | 34 |
biotechnology information | 34 |
fecal samples | 34 |
cmv dataset | 34 |
peptide sequences | 34 |
fusion protein | 34 |
central nervous | 34 |
web server | 34 |
two major | 34 |
energy functions | 34 |
staphylococcus aureus | 34 |
mammalian cells | 34 |
dna viruses | 34 |
data bank | 34 |
dna polymerase | 34 |
specifically | 33 |
charged residues | 33 |
tumor cells | 33 |
high levels | 33 |
reverse transcription | 33 |
grid search | 33 |
results showed | 33 |
learning methods | 33 |
cell membrane | 33 |
mhc class | 33 |
virus rna | 33 |
analysis revealed | 33 |
virtual screening | 33 |
protein family | 33 |
new method | 33 |
protein kinase | 33 |
synthetic peptide | 33 |
peptides containing | 33 |
also found | 33 |
similarity measures | 33 |
hidden markov | 33 |
auditory display | 33 |
high quality | 33 |
energy function | 33 |
genomic rna | 33 |
disulfide bonds | 33 |
two proteins | 32 |
phage display | 32 |
search tool | 32 |
viral metagenomics | 32 |
input object | 32 |
viral diseases | 32 |
pcr amplification | 32 |
negatively charged | 32 |
proposed method | 32 |
positive selection | 32 |
nanopore sequencing | 32 |
sequencing projects | 32 |
polypeptide chain | 32 |
infected cells | 32 |
success rate | 32 |
first time | 32 |
clinical trials | 32 |
wide variety | 32 |
protein threading | 32 |
inhibitory activity | 32 |
metagenomic analysis | 32 |
large numbers | 32 |
immune status | 31 |
free methods | 31 |
similarity analysis | 31 |
based approach | 31 |
drug design | 31 |
experimental data | 31 |
pcr tests | 31 |
acid residue | 31 |
acid identity | 31 |
molecular mechanisms | 31 |
evolutionary history | 31 |
query sequences | 31 |
euclidean distance | 31 |
structure determination | 31 |
sequencing reads | 31 |
peptide chain | 31 |
mutation rate | 31 |
fold recognition | 31 |
read pairs | 31 |
molecular level | 31 |
cell cultures | 31 |
nonsynonymous mutations | 31 |
outer membrane | 31 |
analysis using | 31 |
signal peptide | 31 |
genomic dna | 30 |
computational biology | 30 |
stranded dna | 30 |
conformational change | 30 |
tomato leaf | 30 |
disulfide bridges | 30 |
input sequences | 30 |
cell cycle | 30 |
viral pathogens | 30 |
based sequence | 30 |
tree decomposition | 30 |
repetitive sequence | 30 |
main proteinase | 30 |
target sequences | 30 |
taxonomic classification | 30 |
full genome | 30 |
new repetitive | 30 |
hyperparameter search | 30 |
influenza viruses | 30 |
directed mutagenesis | 30 |
fluorescent | 30 |
genome annotation | 29 |
difficult | 29 |
infectious peritonitis | 29 |
structural studies | 29 |
rna genome | 29 |
saccharomyces cerevisiae | 29 |
binding domains | 29 |
amyloid fibrils | 29 |
binding properties | 29 |
ligand binding | 29 |
previously reported | 29 |
conserved regions | 29 |
bioactive peptides | 29 |
years ago | 29 |
sequence embedding | 28 |
bioinformatics tools | 28 |
ribosomal frameshifting | 28 |
deep sequencing | 28 |
disordered protein | 28 |
prostate cancer | 28 |
world data | 28 |
genbank database | 28 |
unique sequences | 28 |
feline infectious | 28 |
nucleocapsid protein | 28 |
also used | 28 |
calculated using | 28 |
tandem repeat | 28 |
genomic sequence | 28 |
vast majority | 28 |
middle east | 28 |
among different | 28 |
will provide | 28 |
biological functions | 28 |
protein complex | 28 |
cleavage site | 28 |
protein fold | 28 |
results obtained | 28 |
mutation rates | 28 |
blood pressure | 28 |
structural protein | 28 |
supplementary table | 28 |
structural changes | 28 |
hong kong | 28 |
bloom filters | 27 |
ng ml | 27 |
single nucleotide | 27 |
related sequences | 27 |
virus species | 27 |
large scale | 27 |
table shows | 27 |
gene family | 27 |
using different | 27 |
leader sequence | 27 |
adaptive immune | 27 |
sequencing technologies | 27 |
variation resource | 27 |
well known | 27 |
target proteins | 27 |
broad range | 27 |
significantly | 27 |
related proteins | 27 |
spectral representation | 27 |
will also | 27 |
protein gene | 27 |
monoclonal antibodies | 27 |
monte carlo | 27 |
training set | 27 |
starting point | 27 |
heat shock | 27 |
spinal cord | 27 |
random forest | 27 |
proteins involved | 27 |
different proteins | 27 |
human coronavirus | 27 |
new approach | 27 |
recent advances | 27 |
host range | 27 |
clustering algorithms | 27 |
different methods | 26 |
structure alignment | 26 |
protein design | 26 |
protein binding | 26 |
feature vectors | 26 |
virus genomes | 26 |
gene therapy | 26 |
dissimilarity analysis | 26 |
light scattering | 26 |
public databases | 26 |
copy number | 26 |
confirmed | 26 |
three groups | 26 |
amino group | 26 |
alternative splicing | 26 |
update rule | 26 |
one sequence | 26 |
nile virus | 26 |
protein database | 26 |
immune receptor | 26 |
emerging infectious | 26 |
gene transfer | 26 |
sequence classification | 26 |
structural characterization | 26 |
thermal stability | 26 |
publicly available | 26 |
markov chain | 26 |
novel method | 26 |
coding genes | 26 |
hierarchical clustering | 26 |
cell proliferation | 26 |
zika virus | 26 |
see sect | 26 |
high affi | 26 |
east respiratory | 26 |
quality control | 26 |
genetic material | 26 |
plasmon resonance | 26 |
analysis showed | 26 |
catalytic domain | 26 |
surface plasmon | 26 |
recombinant proteins | 26 |
genome organization | 26 |
genes involved | 26 |
predictive performance | 25 |
pcr assays | 25 |
innate immune | 25 |
data obtained | 25 |
stop codon | 25 |
complex formation | 25 |
determined using | 25 |
hydrogen bond | 25 |
basic local | 25 |
aa sequences | 25 |
force field | 25 |
dna primary | 25 |
described previously | 25 |
cellular processes | 25 |
structural biology | 25 |
chicken astrovirus | 25 |
cell wall | 25 |
allowed us | 25 |
repetitive patterns | 25 |
hepatitis virus | 25 |
vaccine development | 25 |
human health | 25 |
overlapping genes | 25 |
support vector | 25 |
cell penetrating | 25 |
pairwise alignment | 25 |
consensus sequences | 25 |
poorly understood | 25 |
feature vector | 25 |
markov models | 25 |
random dna | 25 |
pairwise sequence | 25 |
threading problem | 25 |
virus strains | 25 |
virus sequences | 25 |
primary sequences | 25 |
false positives | 25 |
powerful tool | 24 |
sequence reads | 24 |
human body | 24 |
indel information | 24 |
chemical ligation | 24 |
coupling reagents | 24 |
genomic analysis | 24 |
complete sequence | 24 |
molecular modeling | 24 |
mitochondrial genome | 24 |
follow relationships | 24 |
terminal part | 24 |
limited number | 24 |
based approaches | 24 |
enzyme activity | 24 |
exclusion chromatography | 24 |
human rhinovirus | 24 |
active sites | 24 |
cell membranes | 24 |
pcr assay | 24 |
natural vector | 24 |
living cells | 24 |
epithelial cells | 24 |
dependent manner | 24 |
genome analysis | 24 |
different sequences | 24 |
purifying selection | 24 |
viral dna | 24 |
human viruses | 24 |
activity relationship | 24 |
better understand | 24 |
database search | 24 |
genetic information | 24 |
vaccine design | 24 |
molecular mechanism | 24 |
catalytic activity | 24 |
false negatives | 24 |
molecular characterization | 24 |
respiratory tract | 24 |
false negative | 24 |
peptide binding | 23 |
defined | 23 |
data suggest | 23 |
learning algorithms | 23 |
sequence comparisons | 23 |
metal ions | 23 |
phylogenetic analyses | 23 |
sequencing technology | 23 |
two distinct | 23 |
accession jx | 23 |
scientific community | 23 |
high level | 23 |
see supplementary | 23 |
fibrils | 23 |
scoring function | 23 |
infectious agents | 23 |
natural products | 23 |
rational design | 23 |
short peptides | 23 |
cysteine residues | 23 |
sequencing errors | 23 |
component spectral | 23 |
threading programs | 23 |
solved structures | 23 |
experimentally determined | 23 |
human pathogens | 23 |
drug development | 23 |
relationships among | 23 |
start codon | 23 |
rna editing | 23 |
better understanding | 23 |
model system | 23 |
mouth disease | 23 |
viral discovery | 23 |
will allow | 23 |
rna structures | 23 |
least two | 23 |
drug delivery | 23 |
penetrating peptides | 23 |
terminal amino | 23 |
cleavage sites | 23 |
molecular cloning | 23 |
rna sequence | 23 |
functional genomics | 22 |
genetic variation | 22 |
known motif | 22 |
ribosomal protein | 22 |
amaryllidaceae alkaloids | 22 |
peptide library | 22 |
amyloid formation | 22 |
immune repertoires | 22 |
neurodegenerative diseases | 22 |
human influenza | 22 |
protein expression | 22 |
room temperature | 22 |
viral infection | 22 |
native chemical | 22 |
sh domain | 22 |
sequence position | 22 |
endoplasmic reticulum | 22 |
using fmoc | 22 |
broad spectrum | 22 |
tree length | 22 |
blot analysis | 22 |
cell epitopes | 22 |
structural similarity | 22 |
different lengths | 22 |
new generation | 22 |
simulated immunosequencing | 22 |
receptor sequences | 22 |
takes place | 22 |
structural genomics | 22 |
also observed | 22 |
homologous recombination | 22 |
event sequence | 22 |
four different | 22 |
event sequences | 22 |
virus strain | 22 |
protein molecules | 22 |
pcr test | 22 |
learning rate | 22 |
also provides | 22 |
complete genomes | 22 |
virus isolates | 22 |
multiple instance | 22 |
biological activities | 22 |
molecular clock | 22 |
challenge stock | 22 |
protease inhibitors | 22 |
free sequence | 22 |
taken together | 22 |
like proteins | 22 |
batch size | 22 |
transition kernel | 22 |
instance learning | 22 |
therapeutic agents | 22 |
cancer cell | 22 |
binding partners | 22 |
phylogenetic relationships | 22 |
also identified | 22 |
viral diversity | 22 |
short sequences | 22 |
clinical microbiology | 22 |
causative agent | 22 |
spike proteins | 22 |
eukaryotic cells | 22 |
cyclic peptide | 22 |
hash functions | 22 |
sequence diversity | 21 |
chemical properties | 21 |
metagenomic data | 21 |
highly variable | 21 |
conformational analysis | 21 |
results will | 21 |
method based | 21 |
protein kinases | 21 |
long sequences | 21 |
time points | 21 |
terminal sequence | 21 |
read length | 21 |
nmr spectra | 21 |
tissue culture | 21 |
aqueous solution | 21 |
analyzed using | 21 |
recent work | 21 |
multiple alignments | 21 |
side effects | 21 |
levenshtein distance | 21 |
fatty acids | 21 |
reverse vaccinology | 21 |
ca i | 21 |
coupled receptors | 21 |
arabidopsis thaliana | 21 |
high accuracy | 21 |
functional proteins | 21 |
gene finding | 21 |
ion channels | 21 |
bacterial genomes | 21 |
ancestral sequence | 21 |
constructed using | 21 |
many cases | 21 |
library preparation | 21 |
logistic regression | 21 |
helix bundle | 21 |
new sequence | 21 |
raw data | 21 |
host immune | 21 |
crucial role | 21 |
solid support | 21 |
size exclusion | 21 |
authors declare | 21 |
highly divergent | 21 |
innate immunity | 21 |
event types | 21 |
mycobacterium tuberculosis | 21 |
distantly related | 21 |
kda protein | 21 |
negative bacteria | 21 |
infectious bronchitis | 21 |
chemical synthesis | 21 |
binding peptides | 21 |
sequence evolution | 21 |
genome assembly | 21 |
viral species | 21 |
acid substitutions | 21 |
ssrna viruses | 21 |
active peptides | 21 |
ncbi taxonomy | 21 |
chemical shift | 20 |
usage bias | 20 |
dimensional structures | 20 |
artificial | 20 |
recurrent neural | 20 |
titration calorimetry | 20 |
monoclonal antibody | 20 |
relatively small | 20 |
solved structure | 20 |
phosphorylation sites | 20 |
studies showed | 20 |
positive class | 20 |
recently developed | 20 |
sequences generated | 20 |
may lead | 20 |
two sets | 20 |
critical role | 20 |
samples collected | 20 |
may represent | 20 |
mg ml | 20 |
accession cp | 20 |
west bengal | 20 |
natively unfolded | 20 |
one another | 20 |
comparative genomic | 20 |
cd spectroscopy | 20 |
conserved domains | 20 |
based method | 20 |
pseudomonas aeruginosa | 20 |
implanted motif | 20 |
many different | 20 |
fluorescent protein | 20 |
completely sequenced | 20 |
new insights | 20 |
molten globule | 20 |
structural data | 20 |
another example | 20 |
carboxylic acid | 20 |
nmr data | 20 |
blast search | 20 |
two new | 20 |
proteins may | 20 |
validation set | 20 |
expression data | 20 |
experimental results | 20 |
multiple sclerosis | 20 |
applied biosystems | 20 |
protein evolution | 20 |
protein aggregation | 20 |
clinical isolates | 20 |
tuple distance | 20 |
cdna library | 20 |
nmr studies | 20 |
burden test | 20 |
phylogenetic inference | 20 |
isothermal titration | 20 |
disease virus | 20 |
molecular recognition | 20 |
molecular basis | 20 |
shotgun sequencing | 20 |
recombination events | 20 |
may provide | 20 |
helical structure | 20 |
many viruses | 20 |
mrna expression | 19 |
related species | 19 |
bat transcriptome | 19 |
identical sequences | 19 |
two groups | 19 |
feline coronavirus | 19 |
plasmodium falciparum | 19 |
human dna | 19 |
will present | 19 |
sequences may | 19 |
ion torrent | 19 |
well conserved | 19 |
signaling pathways | 19 |
functional groups | 19 |
last years | 19 |
instances per | 19 |
tree search | 19 |
ribosomal rna | 19 |
conserved region | 19 |
markov model | 19 |
sequence conservation | 19 |
given sequence | 19 |
nucleotide composition | 19 |
similar sequences | 19 |
extracellular matrix | 19 |
envelope protein | 19 |
field | 19 |
first one | 19 |
commercially available | 19 |
fibril formation | 19 |
language model | 19 |
virus sequence | 19 |
autoimmune diseases | 19 |
natural language | 19 |
related genes | 19 |
allow us | 19 |
antiviral activity | 19 |
much higher | 19 |
supplementary document | 19 |
dual nucleotides | 19 |
supplementary material | 19 |
gene ontology | 19 |
human diseases | 19 |
statistically significant | 19 |
type iii | 19 |
cell growth | 19 |
contains two | 19 |
different parts | 19 |
family members | 19 |
virus replication | 19 |
large datasets | 19 |
two classes | 19 |
known protein | 19 |
false positive | 19 |
final | 19 |
clustering algorithm | 19 |
joint model | 19 |
affinity chromatography | 19 |
hydrogen bonding | 19 |
binding activity | 19 |
domain swapping | 19 |
using nmr | 19 |
fusion proteins | 19 |
peptide analogues | 19 |
also shown | 19 |
across species | 18 |
sh domains | 18 |
last decade | 18 |
hrv serotypes | 18 |
ionic strength | 18 |
proteins encoded | 18 |
peptide sequence | 18 |
bronchitis virus | 18 |
endothelial cells | 18 |
quasispecies sequences | 18 |
extraction method | 18 |
membrane fusion | 18 |
subcellular localization | 18 |
bond formation | 18 |
positive charge | 18 |
single stranded | 18 |
synthesized using | 18 |
fold higher | 18 |
lipid bilayer | 18 |
international committee | 18 |
new hits | 18 |
feature selection | 18 |
expressed sequence | 18 |
amide bond | 18 |
mutational bias | 18 |
two main | 18 |
immune receptors | 18 |
probe sequences | 18 |
biological properties | 18 |
randomly generated | 18 |
energy transfer | 18 |
acid substitution | 18 |
preliminary results | 18 |
deduced amino | 18 |
antimicrobial resistance | 18 |
circular dna | 18 |
implanted motifs | 18 |
highly similar | 18 |
per bag | 18 |
reference databases | 18 |
results demonstrate | 18 |
computer science | 18 |
molecular weights | 18 |
based protein | 18 |
driving force | 18 |
stes mining | 18 |
containing peptides | 18 |
metal binding | 18 |
viral communities | 18 |
unnatural amino | 18 |
possible role | 18 |
protein degradation | 18 |
experimental conditions | 18 |
biological data | 18 |
biomedical research | 18 |
conserved sequence | 18 |
md simulations | 18 |
sequence variation | 18 |
similar results | 18 |
biological information | 18 |
generated data | 18 |
peptide ligands | 18 |
transmission electron | 18 |
curve tree | 18 |
manual curation | 18 |
functional protein | 18 |
target cells | 18 |
pcr products | 18 |
vaccine candidates | 18 |
also detected | 18 |
nucleotide substitutions | 18 |
accession numbers | 18 |
mature peptide | 18 |
translation initiation | 17 |
prion protein | 17 |
terminal fragment | 17 |
antifungal activity | 17 |
rheumatoid arthritis | 17 |
length distributions | 17 |
rice tungro | 17 |
several different | 17 |
stock virus | 17 |
human papillomavirus | 17 |
protein structural | 17 |
dnase i | 17 |
subunit vaccines | 17 |
branch lengths | 17 |
integral membrane | 17 |
peptides derived | 17 |
drug resistance | 17 |
directed graph | 17 |
sequences identified | 17 |
indian isolates | 17 |
available sequences | 17 |
bovine torovirus | 17 |
provide information | 17 |
nested pcr | 17 |
structure modeling | 17 |
cystic fibrosis | 17 |
solution structure | 17 |
nonstructural proteins | 17 |
partially folded | 17 |
gc content | 17 |
external branches | 17 |
sliding window | 17 |
closed curve | 17 |
threading alignment | 17 |
virus genes | 17 |
sequences per | 17 |
molecular interactions | 17 |
silico sensitivity | 17 |
new analogues | 17 |
tyrosine kinase | 17 |
binding capacity | 17 |
human microbiome | 17 |
first aug | 17 |
recombinant protein | 17 |
different protein | 17 |
tissue fecal | 17 |
coronavirus spike | 17 |
single molecule | 17 |
tat tac | 17 |
simulated annealing | 17 |
molecular docking | 17 |
cysteine proteases | 17 |
san diego | 17 |
binding motif | 17 |
high mutation | 17 |
genome evolution | 17 |
bacterial genome | 17 |
viral population | 17 |
peptide libraries | 17 |
herpes simplex | 17 |
gene product | 17 |
blast searches | 17 |
coronavirus associated | 17 |
time consuming | 17 |
also showed | 17 |
studies revealed | 17 |
epitope prediction | 17 |
binding affi | 17 |
repeated sequences | 17 |
novel virus | 17 |
type protein | 17 |
liquid chromatography | 17 |
structural analysis | 17 |
environmental samples | 17 |
homologous proteins | 17 |
natural selection | 17 |
unknown sequences | 17 |
query protein | 17 |
dimensional space | 17 |
novel approach | 17 |
dynamic light | 17 |
shed light | 17 |
dsdna viruses | 17 |
dihydrofolate reductase | 17 |
human cells | 17 |
spatiotemporal event | 17 |
substrate specificity | 17 |
functional analysis | 17 |
human disease | 17 |
novo design | 17 |
long time | 17 |
genome projects | 17 |
molecular mass | 17 |
animal models | 17 |
statistical significance | 17 |
standard genetic | 17 |
andhra pradesh | 17 |
sequences used | 17 |
sequence positions | 17 |
opioid receptor | 17 |
data mining | 17 |
currently available | 17 |
dark matter | 17 |
prior knowledge | 17 |
structural motif | 16 |
data generated | 16 |
sequences obtained | 16 |
fourier transform | 16 |
sequence based | 16 |
rich domains | 16 |
growth rate | 16 |
distance methods | 16 |
random walk | 16 |
genetically diverse | 16 |
new delhi | 16 |
genome segments | 16 |
reference genomes | 16 |
xho i | 16 |
previous work | 16 |
dna replication | 16 |
witness rate | 16 |
probability distribution | 16 |
structure elements | 16 |
gene encoding | 16 |
amino groups | 16 |
rat brain | 16 |
activity relationships | 16 |
novel human | 16 |
allows us | 16 |
two genes | 16 |
peptide conjugates | 16 |
tree width | 16 |
comparative protein | 16 |
accession jq | 16 |
sequences derived | 16 |
homology search | 16 |
like receptors | 16 |
protein primary | 16 |
capsid gene | 16 |
oxidative stress | 16 |
data structures | 16 |
participation index | 16 |
vero cells | 16 |
arginine residues | 16 |
template structures | 16 |
results provide | 16 |
primary protein | 16 |
functional properties | 16 |
data structure | 16 |
read graph | 16 |
coronavirus genomes | 16 |
protein targets | 16 |
substitution model | 16 |
clinical specimens | 16 |
optimal threading | 16 |
bluetongue virus | 16 |
base composition | 16 |
peptide backbone | 16 |
forward primer | 16 |
protein tyrosine | 16 |
known viral | 16 |
yellow fever | 16 |
sample preparation | 16 |
bacterial pathogens | 16 |
pi th | 16 |
coronavirus disease | 16 |
major role | 16 |
much larger | 16 |
expressed protein | 16 |
ligand docking | 16 |
directed graphs | 16 |
sanger sequencing | 16 |
significantly higher | 16 |
synthesis using | 16 |
triple helix | 16 |
hydrophobic regions | 16 |
also show | 16 |
identified using | 16 |
gene prediction | 16 |
bioinformatics resource | 16 |
per repertoire | 16 |
classification system | 16 |
previously published | 16 |
profile | 16 |
native structure | 16 |
many genes | 16 |
base calls | 16 |
methods based | 16 |
rrna gene | 16 |
two decades | 16 |
food intake | 16 |
pi values | 16 |
peptide bond | 16 |
informative sites | 16 |
acid changes | 16 |
complete genomic | 16 |
taxonomic groups | 16 |
detailed analysis | 16 |
small fraction | 16 |
also present | 16 |
important roles | 16 |
sars virus | 16 |
proteins containing | 16 |
cell receptors | 16 |
multiple sequences | 16 |
short sequence | 16 |
using standard | 16 |
two methods | 16 |
mammalian viruses | 16 |
structural motifs | 16 |
like protein | 16 |
time point | 16 |
design method | 16 |
frameshift site | 16 |
disulfide bond | 16 |
chikungunya virus | 16 |
optimal alignment | 16 |
untranslated regions | 16 |
hypervariable regions | 16 |
mitochondrial dna | 16 |
fever virus | 16 |
peptide derivatives | 16 |
increasing number | 16 |
plant species | 16 |
significant role | 16 |
opioid receptors | 16 |
model peptides | 16 |
newly discovered | 16 |
average length | 16 |
complete viral | 16 |
peptide fragments | 16 |
genome duplication | 16 |
synonymous mutations | 16 |
rna viral | 16 |
gene cluster | 16 |
antibiotic resistance | 16 |
viral infections | 16 |
different positions | 16 |
cell epitope | 16 |
prediction methods | 15 |
conserved amino | 15 |
different aspects | 15 |
ci th | 15 |
genetic distance | 15 |
energy model | 15 |
host cells | 15 |
sequence records | 15 |
fold increase | 15 |
viral protein | 15 |
sewage samples | 15 |
see also | 15 |
pairwise alignments | 15 |
large amounts | 15 |
adenovirus type | 15 |
experimental methods | 15 |
metal ion | 15 |
homology modeling | 15 |
entire genome | 15 |
method using | 15 |
also provide | 15 |
chronic hepatitis | 15 |
input sequence | 15 |
best results | 15 |
avian infectious | 15 |
viral isolates | 15 |
human coronaviruses | 15 |
standard deviations | 15 |
running time | 15 |
functionally important | 15 |
cell type | 15 |
four bases | 15 |
transmembrane domain | 15 |
model systems | 15 |
van der | 15 |
sequence tags | 15 |
bias towards | 15 |
microbial communities | 15 |
protein molecule | 15 |
additional information | 15 |
bone marrow | 15 |
tyrosine kinases | 15 |
different time | 15 |
viral family | 15 |
potent inhibitors | 15 |
also possible | 15 |
relative abundance | 15 |
western blot | 15 |
batcv poa | 15 |
indel model | 15 |
unfolded proteins | 15 |
nuclear transfer | 15 |
molecular methods | 15 |
sequence similarities | 15 |
statistical analysis | 15 |
translational modification | 15 |
showed high | 15 |
capsid proteins | 15 |
new viruses | 15 |
horizontal gene | 15 |
serum samples | 15 |
results also | 15 |
binding studies | 15 |
reverse primer | 15 |
refseq records | 15 |
fluorescence spectroscopy | 15 |
scale sequencing | 15 |
sequence divergence | 15 |
complementary dna | 15 |
highly pathogenic | 15 |
results revealed | 15 |
kcal mol | 15 |
mammary gland | 15 |
best hit | 15 |
science foundation | 15 |
host species | 15 |
microbial genome | 15 |
computational complexity | 15 |
derived peptides | 15 |
clinical studies | 15 |
respiratory disease | 15 |
high prevalence | 15 |
standard amino | 15 |
mottle virus | 15 |
tumor growth | 15 |
extraction methods | 15 |
new family | 15 |
efficiently | 15 |
conserved cysteine | 15 |
globular proteins | 15 |
silico validation | 15 |
model organisms | 15 |
rna molecules | 15 |
tissue samples | 15 |
ancestral sequences | 15 |
one example | 15 |
structural elements | 15 |
biological membranes | 15 |
gastrointestinal tract | 15 |
base call | 15 |
genes encoding | 15 |
sequences available | 15 |
annotation pipeline | 15 |
loop structure | 15 |
structural properties | 15 |
sequence read | 15 |
three major | 15 |
human cln | 15 |
proteins will | 15 |
sequences detected | 15 |
random sequences | 15 |
protein fragments | 15 |
several species | 15 |
ray scattering | 15 |
genomic islands | 15 |
coupling reaction | 15 |
new class | 15 |
original sequence | 15 |
proline residues | 15 |
order nidovirales | 15 |
plasmid dna | 15 |
metabolic pathways | 15 |
viral particles | 15 |
whole genomes | 15 |
supplementary file | 15 |
generated using | 15 |
mouse hepatitis | 15 |
plant viruses | 15 |
peptide chemistry | 14 |
streptococcus pneumoniae | 14 |
mil problems | 14 |
nucleotide binding | 14 |
distance measure | 14 |
tumor suppressor | 14 |
related protein | 14 |
fmoc tbu | 14 |
cellular functions | 14 |
read archive | 14 |
binding pocket | 14 |
human serum | 14 |
storage capacity | 14 |
protein alignment | 14 |
new sequences | 14 |
human igg | 14 |
sequence rseq | 14 |
signal processing | 14 |
forest virus | 14 |
state university | 14 |
yet another | 14 |
sequence structure | 14 |
nucleotide substitution | 14 |
protein ligation | 14 |
nearest neighbor | 14 |
configuration | 14 |
ray structure | 14 |
expression patterns | 14 |
valuable information | 14 |
relative frequency | 14 |
joining method | 14 |
like attention | 14 |
like globin | 14 |
aspartic acid | 14 |
known viruses | 14 |
newly synthesized | 14 |
estimated using | 14 |
also performed | 14 |
biologically relevant | 14 |
time course | 14 |
often used | 14 |
human gut | 14 |
different regions | 14 |
significant similarity | 14 |
i will | 14 |
attention weights | 14 |
single amino | 14 |
ruditapes philippinarum | 14 |
minimum number | 14 |
using molecular | 14 |
nmr experiments | 14 |
attention values | 14 |
potential role | 14 |
population genetics | 14 |
evolutionary relationship | 14 |
also called | 14 |
accumulated natural | 14 |
hela cells | 14 |
integer programming | 14 |
sispa method | 14 |
class repertoires | 14 |
mutational robustness | 14 |
rich region | 14 |
branch length | 14 |
tandem repeats | 14 |
many proteins | 14 |
interaction energy | 14 |
protein dynamics | 14 |
software package | 14 |
homology models | 14 |
functional studies | 14 |
experimental evidence | 14 |
bovine serum | 14 |
repeat proteins | 14 |
successfully used | 14 |
extended idps | 14 |
useful tool | 14 |
evolutionary analysis | 14 |
adjacency matrix | 14 |
molecular evolutionary | 14 |
platelet aggregation | 14 |
blood donors | 14 |
messenger rna | 14 |
ray diffraction | 14 |
living organisms | 14 |
random coil | 14 |
simplex virus | 14 |
high performance | 14 |
related viruses | 14 |
high yield | 14 |
molecular function | 14 |
economically important | 14 |
database using | 14 |
water molecules | 14 |
highly significant | 14 |
high sensitivity | 14 |
attention mechanisms | 14 |
obtained using | 14 |
artificial intelligence | 14 |
signaling pathway | 14 |
may help | 14 |
peptides based | 14 |
dissimilarity matrix | 14 |
mitochondrial genomes | 14 |
incomplete purifying | 14 |
visual display | 14 |
default parameters | 14 |
computational tools | 14 |
research institute | 14 |
nk cells | 14 |
cells expressing | 14 |
deletion mutants | 14 |
alignment distribution | 14 |
positive samples | 14 |
language processing | 14 |
net charge | 14 |
spike glycoprotein | 14 |
er membrane | 14 |
proteins based | 14 |
protein intrinsic | 14 |
two stickers | 14 |
rna replication | 14 |
mrna levels | 14 |
take place | 14 |
dna genomes | 14 |
semliki forest | 14 |
flow | 14 |
conceptual model | 14 |
molecular epidemiology | 14 |
near future | 14 |
virus evolution | 14 |
dna fragment | 14 |
life cycle | 14 |
animal species | 14 |
antibacterial activity | 14 |
multiplex pcr | 14 |
anopheles gambiae | 14 |
various types | 14 |
human cell | 14 |
terminal end | 14 |
public domain | 14 |
learning models | 14 |
heavy chain | 14 |
new coronavirus | 14 |
plant amps | 14 |
graduate school | 14 |
porcine circovirus | 14 |
building block | 14 |
ncbi database | 14 |
glutamic acid | 14 |
different levels | 14 |
small angle | 14 |
samples using | 14 |
trypanosoma brucei | 14 |
using primers | 14 |
genbank accession | 14 |
highly selective | 14 |
synonymous codon | 14 |
analysis tools | 14 |
one protein | 14 |
randomly selected | 14 |
primer design | 14 |
low concentrations | 14 |
specific amino | 14 |
new world | 14 |
genome duplications | 13 |
analytical tools | 13 |
tumor cell | 13 |
data indicate | 13 |
much less | 13 |
three dimensional | 13 |
two novel | 13 |
virulence factors | 13 |
useful information | 13 |
dna molecules | 13 |
native state | 13 |
simple way | 13 |
two domains | 13 |
single protein | 13 |
drug targets | 13 |
code set | 13 |
conformational studies | 13 |
interaction networks | 13 |
highly specific | 13 |
situ hybridization | 13 |
protein science | 13 |
repetition number | 13 |
three distinct | 13 |
per primer | 13 |
lipid membrane | 13 |
affinities | 13 |
ordered structure | 13 |
neutral networks | 13 |
similarity matrix | 13 |
helical conformation | 13 |
rapid identification | 13 |
massively parallel | 13 |
based energy | 13 |
binding kinetics | 13 |
nmr structure | 13 |
chemical group | 13 |
resequencing microarray | 13 |
negative control | 13 |
average number | 13 |
new graphical | 13 |
structure graph | 13 |
chemical shifts | 13 |
gives rise | 13 |
two viruses | 13 |
analysis reveals | 13 |
past years | 13 |
data available | 13 |
tadarida brasiliensis | 13 |
help us | 13 |
initiation factor | 13 |
provide new | 13 |
phase method | 13 |
amplicon sequencing | 13 |
early stage | 13 |
lstm model | 13 |
specific binding | 13 |
sequence analyses | 13 |
well studied | 13 |
protease inhibitor | 13 |
reaction conditions | 13 |
interacting proteins | 13 |
agarose gel | 13 |
muscle cells | 13 |
using high | 13 |
essential role | 13 |
one base | 13 |
replication cycle | 13 |
hydrophobic core | 13 |
geographic distribution | 13 |
gene clusters | 13 |
leader sequences | 13 |
time series | 13 |
influenza | 13 |
world health | 13 |
structural basis | 13 |
wide array | 13 |
freely available | 13 |
binding affinities | 13 |
genomic sequencing | 13 |
evolutionary rates | 13 |
protein disorder | 13 |
will help | 13 |
dna fragments | 13 |
previously known | 13 |
serine proteases | 13 |
coronavirus hku | 13 |
tumor necrosis | 13 |
sequence lengths | 13 |
may affect | 13 |
biological systems | 13 |
using two | 13 |
blood samples | 13 |
significant differences | 13 |
will discuss | 13 |
homologous sequences | 13 |
ligand interactions | 13 |
bound state | 13 |
pathogen identification | 13 |
respiratory pathogens | 13 |
larger number | 13 |
data using | 13 |
comparative model | 13 |
next step | 13 |
mutation frequencies | 13 |
previously identified | 13 |
currently used | 13 |
become available | 13 |
viral nucleic | 13 |
population size | 13 |
alignment uncertainty | 13 |
india anand | 13 |
neisseria meningitidis | 13 |
uorescence spectroscopy | 13 |
different cell | 13 |
pathogenicity islands | 13 |
evolutionary models | 13 |
metagenomic samples | 13 |
castv india | 13 |
sequence coverage | 13 |
functional domains | 13 |
comparative study | 13 |
give rise | 13 |
chaos game | 13 |
high specificity | 13 |
provide insight | 13 |
protein surface | 13 |
length distribution | 13 |
statistical properties | 13 |
protein coupled | 13 |
investigated using | 13 |
programmed cell | 13 |
just one | 13 |
hairpin structure | 13 |
different approaches | 13 |
drug candidates | 13 |
calf thymus | 13 |
ms ms | 13 |
sequences without | 13 |
similarity searches | 13 |
serum albumin | 13 |
vp gene | 13 |
viral strains | 13 |
genomic characterization | 13 |
long term | 13 |
polar coordinates | 13 |
functional group | 13 |
clustering methods | 13 |
cellular uptake | 13 |
regulatory elements | 13 |
dna microarrays | 13 |
immune genes | 13 |
vector machine | 13 |
nucleotide divergence | 13 |
subcellular location | 13 |
acid side | 13 |
gene families | 13 |
research groups | 13 |
blood vessels | 13 |
machine translation | 13 |
repeat sequences | 13 |
protein stability | 13 |
diarrhea virus | 13 |
control group | 13 |
multidomain proteins | 13 |
binding assays | 13 |
genetic variants | 13 |
isoelectric point | 13 |
decision tree | 13 |
tail window | 13 |
sequences will | 13 |
nucleotide level | 13 |
four nucleotides | 13 |
three types | 13 |
virus serotype | 13 |
nonstructural protein | 13 |
great interest | 13 |
may contain | 13 |
per year | 13 |
potential therapeutic | 13 |
drosophila melanogaster | 13 |
structure based | 13 |
emerging pathogens | 13 |
much smaller | 13 |
linked immunosorbent | 13 |
microwave assisted | 13 |
viral metagenomic | 13 |
viruses infecting | 13 |
programming algorithm | 13 |
st family | 13 |
olg sequences | 13 |
naturally infected | 13 |
corona virus | 13 |
differentially expressed | 13 |
alignment problem | 13 |
dna damage | 13 |
insectivorous bats | 12 |
also studied | 12 |
proteins within | 12 |
single gene | 12 |
sequence graph | 12 |
base pair | 12 |
detected using | 12 |
group representative | 12 |
relatively low | 12 |
structural positions | 12 |
alternative approach | 12 |
one amino | 12 |
supervised learning | 12 |
new protein | 12 |
national institute | 12 |
relative solvent | 12 |
intensity level | 12 |
may indicate | 12 |
limited set | 12 |
sequence retrieval | 12 |
sequence datasets | 12 |
genetic data | 12 |
structural fold | 12 |
conserved among | 12 |
past decade | 12 |
like structures | 12 |
hcv infection | 12 |
negative class | 12 |
rich peptides | 12 |
structure activity | 12 |
information contained | 12 |
candidate sequences | 12 |
previously characterized | 12 |
provide evidence | 12 |
hemorrhagic fever | 12 |
length genome | 12 |
proteins using | 12 |
highly dynamic | 12 |
membrane permeability | 12 |
previous study | 12 |
will give | 12 |
total synthesis | 12 |
cdna synthesis | 12 |
large data | 12 |
ins i | 12 |
methods used | 12 |
sequence may | 12 |
among others | 12 |
common ancestry | 12 |
evolutionary trees | 12 |
north america | 12 |
new york | 12 |
per residue | 12 |
short peptide | 12 |
immunosorbent assay | 12 |
study revealed | 12 |
see section | 12 |
entire sequence | 12 |
different mechanisms | 12 |
leading eigenvalues | 12 |
rna genomes | 12 |
assembly using | 12 |
classification using | 12 |
environmental factors | 12 |
world immunosequencing | 12 |
protein level | 12 |
peptide arrays | 12 |
average auc | 12 |
contact structures | 12 |
novel peptide | 12 |
current state | 12 |
seq data | 12 |
predicting protein | 12 |
biological databases | 12 |
challenge virus | 12 |
vitro translation | 12 |
heparan sulfate | 12 |
slightly different | 12 |
hopfield network | 12 |
virus variants | 12 |
virus isolation | 12 |
viral disease | 12 |
deeprc model | 12 |
different data | 12 |
new species | 12 |
conserved sequences | 12 |
deep neural | 12 |
double stranded | 12 |
receptor recognition | 12 |
sequencing efforts | 12 |
minion sequencing | 12 |
species level | 12 |
using blast | 12 |
substrate binding | 12 |
new genes | 12 |
computational design | 12 |
protein identification | 12 |
native protein | 12 |
genbank records | 12 |
good solvent | 12 |
gene structure | 12 |
inhibitory effect | 12 |
also able | 12 |
tree construction | 12 |
time scale | 12 |
cdna clones | 12 |
sequence patterns | 12 |
principal component | 12 |
synthesized peptides | 12 |
like peptides | 12 |
sequences showed | 12 |
vaccinia virus | 12 |
expressed genes | 12 |
also proposed | 12 |
polyacrylamide gel | 12 |
yeast genome | 12 |
genome structure | 12 |
catalytic efficiency | 12 |
fold cross | 12 |
genome shotgun | 12 |
previously shown | 12 |
genetic analysis | 12 |
computational approaches | 12 |
ca binding | 12 |
competing interests | 12 |
large fraction | 12 |
alignment using | 12 |
metabolic stability | 12 |
human host | 12 |
physiological processes | 12 |
also reported | 12 |
different families | 12 |
national institutes | 12 |
phylogenetic methods | 12 |
epidemiological data | 12 |
play important | 12 |
two peptides | 12 |
evaluated using | 12 |
unique amino | 12 |
noncoding regions | 12 |
cysteine protease | 12 |
atpase activity | 12 |
acute gastroenteritis | 12 |
comparative sequence | 12 |
hydroxyl group | 12 |
randomly chosen | 12 |
phylogeny estimation | 12 |
comprehensive database | 12 |
human populations | 12 |
adjacency vector | 12 |
rna binding | 12 |
high sequence | 12 |
sample set | 12 |
bacillus anthracis | 12 |
rna recombination | 12 |
implanted signal | 12 |
complete sequences | 12 |
linear regression | 12 |
specific sequences | 12 |
successfully applied | 12 |
rna structure | 12 |
adhesion molecules | 12 |
small size | 12 |
assembled using | 12 |
particular bases | 12 |
bacterial cell | 12 |
pcr product | 12 |
per se | 12 |
using multiple | 12 |
target cell | 12 |
new viral | 12 |
current study | 12 |
enteric viruses | 12 |
go terms | 12 |
moonlighting proteins | 12 |
minmax kernel | 12 |
conserved domain | 12 |
nucleotide identity | 12 |
intermolecular bridges | 12 |
pattern recognition | 12 |
geobacter sulfurreducens | 12 |
done using | 12 |
critical assessment | 12 |
specific sites | 12 |
folding problem | 12 |
last two | 12 |
sequences encoding | 12 |
hydrophobic residues | 12 |
gene duplication | 12 |
tested positive | 12 |
feline coronaviruses | 12 |
cnn kernels | 12 |
northern blot | 12 |
disordered residues | 12 |
close proximity | 12 |
vector method | 12 |
poor solvent | 12 |
transformer attention | 12 |
containing two | 12 |
islet amyloid | 12 |
motif blocks | 12 |
means clustering | 12 |
different groups | 12 |
rep protein | 12 |
genetic drift | 12 |
threading algorithms | 12 |
growth factors | 12 |
based model | 12 |
expression profiles | 12 |
short read | 12 |
two subunits | 12 |
contains three | 12 |
length sequences | 12 |
like receptor | 12 |
genome sizes | 12 |
chemical groups | 12 |
cd spectra | 12 |
peptide bonds | 12 |
likelihood phylogenetic | 12 |
swab samples | 12 |
specific genes | 12 |
necrosis factor | 12 |
rational drug | 12 |
rna synthesis | 12 |
full hyperparameter | 12 |
analysis indicated | 12 |
pyrosequencing reads | 12 |
acid change | 12 |
based drug | 12 |
resonance energy | 12 |
model based | 12 |
charged amino | 12 |
candidate sequence | 12 |
highly potent | 12 |
low abundance | 12 |
posttranslational modifications | 12 |
recombinant dna | 12 |
acid peptide | 12 |
cys residues | 12 |
genetically modified | 12 |
flow cytometry | 12 |
transmembrane proteins | 12 |
nitric oxide | 12 |
surface charge | 12 |
virus proteins | 12 |
tertiary structures | 12 |
game representation | 12 |
obtained results | 12 |
proteolytic cleavage | 12 |
repeat sequence | 12 |
metagenomic approach | 12 |
yellow vein | 12 |
word embedding | 11 |
sheet structure | 11 |
order moments | 11 |
protein concentration | 11 |
fluorinated | 11 |
domain protein | 11 |
forward language | 11 |
success rates | 11 |
atomic force | 11 |
life technologies | 11 |
widely distributed | 11 |
amyloid fi | 11 |
experimental studies | 11 |
southern china | 11 |
using circular | 11 |
gap length | 11 |
university hospital | 11 |
distance matrices | 11 |
numerical representation | 11 |
first case | 11 |
expression vector | 11 |
times higher | 11 |
positive strand | 11 |
scaffold proteins | 11 |
ribosomal frameshift | 11 |
three classes | 11 |
low molecular | 11 |
classification based | 11 |
early stages | 11 |
different strains | 11 |
hiv protease | 11 |
character states | 11 |
phase ii | 11 |
wet lab | 11 |
bootstrap values | 11 |
brain tissue | 11 |
short reads | 11 |
structural feature | 11 |
surface receptors | 11 |
different datasets | 11 |
will show | 11 |
native peptide | 11 |
encephalitis virus | 11 |
new strains | 11 |
based analysis | 11 |
compressed sensing | 11 |
hydrophobic interactions | 11 |
low quality | 11 |
major histocompatibility | 11 |
chain translocation | 11 |
pcr using | 11 |
flexible | 11 |
secondary structural | 11 |
much faster | 11 |
computational resources | 11 |
terminal residues | 11 |
correlation spectroscopy | 11 |
delivery systems | 11 |
tobacco mosaic | 11 |
astrovirus isolate | 11 |
protein conformation | 11 |
two species | 11 |
invertebrate vectors | 11 |
physiological conditions | 11 |
prediction results | 11 |
letter sequence | 11 |
efficacy | 11 |
fully automated | 11 |
virus families | 11 |
scale sequence | 11 |
health problem | 11 |
different countries | 11 |
associated protein | 11 |
value ranges | 11 |
peripheral blood | 11 |
olg design | 11 |
pfam database | 11 |
specific protein | 11 |
exchange chromatography | 11 |
coronavirus sequences | 11 |
will enable | 11 |
indel events | 11 |
intrinsically unstructured | 11 |
dna synthesis | 11 |
bovine coronavirus | 11 |
real number | 11 |
cell biology | 11 |
therapeutic target | 11 |
genetics analysis | 11 |
natural killer | 11 |
test whether | 11 |
hla class | 11 |
three reading | 11 |
rgmov rna | 11 |
hamming distance | 11 |
first identified | 11 |
primer set | 11 |
integrated gradients | 11 |
good quality | 11 |
residue peptide | 11 |
conformational properties | 11 |
cell entry | 11 |
research group | 11 |
acid analysis | 11 |
like virus | 11 |
predicted using | 11 |
per site | 11 |
disease diagnosis | 11 |
sequencing libraries | 11 |
contained within | 11 |
gene regulation | 11 |
test set | 11 |
similarity search | 11 |
computational approach | 11 |
energy term | 11 |
produced using | 11 |
loop regions | 11 |
randomization models | 11 |
peptide concentration | 11 |
specific expression | 11 |
multiple proteins | 11 |
protein genes | 11 |
sequential patterns | 11 |
known structures | 11 |
specific activity | 11 |
approach may | 11 |
diagram graph | 11 |
generation time | 11 |
five different | 11 |
identify novel | 11 |
search space | 11 |
olg pairs | 11 |
knowledge base | 11 |
neuronal ceroid | 11 |
peptide analogs | 11 |
known structure | 11 |
embryonic development | 11 |
steric hindrance | 11 |
unique features | 11 |
flexibility | 11 |
tetrafluoroborates | 11 |
sars cov | 11 |
homo sapiens | 11 |
sequence assembly | 11 |
correlation coefficient | 11 |
hot spots | 11 |
internal nodes | 11 |
ed peptides | 11 |
receptor agonist | 11 |
well suited | 11 |
threshold value | 11 |
sentence generation | 11 |
recently proposed | 11 |
sequence search | 11 |
terminal regions | 11 |
nd protein | 11 |
functional role | 11 |
gb ram | 11 |
amyloid polypeptide | 11 |
tandem mass | 11 |
biochemical pathways | 11 |
global health | 11 |
ion channel | 11 |
nuclear extracts | 11 |
good agreement | 11 |
quaternary structure | 11 |
peptide hybrids | 11 |
lipid transfer | 11 |
signal peptides | 11 |
annotated sequences | 11 |
search programs | 11 |
virus taxonomy | 11 |
transgenic mice | 11 |
gapped blast | 11 |
diagnostic tools | 11 |
built using | 11 |
different viral | 11 |
predicted proteins | 11 |
sequencing techniques | 11 |
evolutionary genetics | 11 |
studied using | 11 |
sequence variability | 11 |
using one | 11 |
catalytic residues | 11 |
aromatic residues | 11 |
newly developed | 11 |
identify new | 11 |
taking advantage | 11 |
gene mutations | 11 |
olg pair | 11 |
high concentrations | 11 |
viral quasispecies | 11 |
free radical | 11 |
selective pressure | 11 |
pathogen detection | 11 |
analysis pipeline | 11 |
health organization | 11 |
proteolytic degradation | 11 |
unknown function | 11 |
coronavirus genome | 11 |
physical properties | 11 |
various biological | 11 |
ceroid lipofuscinosis | 11 |
organic solvents | 11 |
receptor protein | 11 |
sequences related | 11 |
cluster analysis | 11 |
information may | 11 |
basic idea | 11 |
specific proteins | 11 |
protecting groups | 11 |
great importance | 11 |
bacillus subtilis | 11 |
method used | 11 |
environmental conditions | 11 |
sequences among | 11 |
converting enzyme | 11 |
field isolates | 11 |
make use | 11 |
three sequences | 11 |
gs flx | 11 |
enzymatic degradation | 11 |
dna strand | 11 |
random sequence | 11 |
computer program | 11 |
merkel cell | 11 |
chicken astroviruses | 11 |
open access | 11 |
given protein | 11 |
protein docking | 11 |
bovine viral | 11 |
like enzymes | 11 |
conformational behavior | 11 |
sequencing project | 11 |
structure information | 11 |
wild birds | 11 |
gov genome | 11 |
four data | 11 |
vector machines | 11 |
transmembrane region | 11 |
loss function | 11 |
dna virus | 11 |
significantly different | 11 |
constructed sequences | 11 |
results may | 11 |
sequences containing | 11 |
follow relationship | 11 |
structural differences | 11 |
immune evasion | 11 |
hydrophobic amino | 11 |
glycosylation sites | 11 |
novel mammalian | 11 |
correct structural | 11 |
structural stability | 11 |
eukaryotic genomes | 11 |
different amino | 11 |
economic losses | 11 |
extremely high | 11 |
major challenge | 11 |
highly abundant | 11 |
random variants | 11 |
regulatory sequences | 11 |
deletion mutations | 11 |
using sequence | 11 |
individual amino | 11 |
manila clam | 11 |
large set | 11 |
diversity within | 11 |
directed evolution | 11 |
new approaches | 11 |
million people | 11 |
green fluorescent | 11 |
low affinity | 11 |
infected patients | 11 |
detailed description | 11 |
almost identical | 11 |
cdna cloning | 11 |
antiviral immunity | 11 |
microbiome project | 10 |
structural environments | 10 |
systemic spread | 10 |
infantile neuronal | 10 |
viral agents | 10 |
conserved across | 10 |
biorxiv doi | 10 |
flying fox | 10 |
network architectures | 10 |
different virus | 10 |
genetic diseases | 10 |
protein folds | 10 |
primer pairs | 10 |
multiple copies | 10 |
inflammatory | 10 |
catalytic domains | 10 |
genome viruses | 10 |
fas ii | 10 |
discrimination measure | 10 |
nipah virus | 10 |
assembled contigs | 10 |
molecular modelling | 10 |
functional annotation | 10 |
antigen binding | 10 |
surface exposed | 10 |
species using | 10 |
sequence features | 10 |
angiotensin ii | 10 |
disordered region | 10 |
sequence clustering | 10 |
genome coverage | 10 |
different sizes | 10 |
tag tat | 10 |
cov main | 10 |
term memory | 10 |
alpha helix | 10 |
human plasma | 10 |
influenza data | 10 |
clustering method | 10 |
proposed architecture | 10 |
human enteroviruses | 10 |
useful tools | 10 |
rna genes | 10 |
associative polymers | 10 |
tree node | 10 |
brain barrier | 10 |
expert system | 10 |
saudi arabia | 10 |
three proteins | 10 |
various species | 10 |
divergent sequences | 10 |
manually curated | 10 |
main paper | 10 |
still unknown | 10 |
first approach | 10 |
plant defensins | 10 |
contact structure | 10 |
catalytic site | 10 |
higher eukaryotes | 10 |
molecular typing | 10 |
der waals | 10 |
recent developments | 10 |
histocompatibility complex | 10 |
ser thr | 10 |
long range | 10 |
fmoc chemistry | 10 |
structural model | 10 |
surface glycoprotein | 10 |
provides information | 10 |
high purity | 10 |
protein involved | 10 |
predicted protein | 10 |
kernel size | 10 |
endothelial cell | 10 |
rep gene | 10 |
bp long | 10 |
also important | 10 |
conformational transition | 10 |
upper respiratory | 10 |
language models | 10 |
recognition sites | 10 |
much greater | 10 |
probe sets | 10 |
species demarcation | 10 |
food borne | 10 |
gene products | 10 |
provide important | 10 |
healthy controls | 10 |
group i | 10 |
low frequency | 10 |
picobirnaviridae family | 10 |
gpu memory | 10 |
sequence variations | 10 |
high frequency | 10 |
important human | 10 |
haemophilus influenzae | 10 |
avian coronavirus | 10 |
substitution rates | 10 |
releasing hormone | 10 |
kinetic parameters | 10 |
range correlations | 10 |
predict protein | 10 |
experimentally validated | 10 |
polypeptide chains | 10 |
purified using | 10 |
stranded cdna | 10 |
time complexity | 10 |
parallel computing | 10 |
repeated dna | 10 |
novel gene | 10 |
accumulated indicator | 10 |
sequencing centers | 10 |
several methods | 10 |
esgrowth algorithm | 10 |
high similarity | 10 |
receptor antagonist | 10 |
mil problem | 10 |
growth hormone | 10 |
function relationships | 10 |
core regions | 10 |
raw sequence | 10 |
two vertices | 10 |
delhi virus | 10 |
pancreatic lipase | 10 |
genomes using | 10 |
acid level | 10 |
prot database | 10 |
titration curves | 10 |
tree based | 10 |
viral evolution | 10 |
protein regions | 10 |
experimental design | 10 |
bacterial species | 10 |
soluble proteins | 10 |
may result | 10 |
gracilaria chilensis | 10 |
gap penalty | 10 |
disease research | 10 |
first report | 10 |
native proteins | 10 |
acetic acid | 10 |
protein association | 10 |
molecular mechanics | 10 |
highly efficient | 10 |
predicted structure | 10 |
visual inspection | 10 |
threading algorithm | 10 |
conformational states | 10 |
threading energy | 10 |
computation time | 10 |
high coverage | 10 |
simple method | 10 |
china complete | 10 |
low levels | 10 |
dna rna | 10 |
receptor repertoires | 10 |
restriction sites | 10 |
human population | 10 |
remains unknown | 10 |
curl new | 10 |
located within | 10 |
sequence representation | 10 |
event instances | 10 |
using deep | 10 |
rna polymerases | 10 |
protein functions | 10 |
gene function | 10 |
database searches | 10 |
genetic variability | 10 |
leucine codon | 10 |
one way | 10 |
viral populations | 10 |
new era | 10 |
one major | 10 |
schwann cells | 10 |
penetrating peptide | 10 |
signalling pathways | 10 |
hidden state | 10 |
significant sequence | 10 |
genome database | 10 |
modification systems | 10 |
underrepresented sequences | 10 |
model selection | 10 |
relationship among | 10 |
helix formation | 10 |
conceptual modeling | 10 |
known sequence | 10 |
conformational freedom | 10 |
unique peptide | 10 |
lexical constraints | 10 |
coupled receptor | 10 |
new therapeutic | 10 |
fold assignment | 10 |
based models | 10 |
mammalian cell | 10 |
model peptide | 10 |
residues involved | 10 |
maximum number | 10 |
sequencing analysis | 10 |