This is a table of trigrams (three-word sequences) and their frequencies. Search and browse the list to learn more about your study carrel.
trigram | frequency |
---|---|
a href http | 3308 |
as well as | 746 |
a number of | 651 |
li li b | 633 |
td td td | 600 |
img src http | 546 |
the use of | 539 |
a set of | 500 |
li li strong | 463 |
open source software | 424 |
com blog wp | 421 |
span a href | 414 |
li a href | 381 |
li li a | 336 |
edu sandbox eebo | 330 |
a list of | 318 |
the number of | 302 |
com alex concordance | 302 |
phrase zzzz word | 298 |
segsize usepre word | 298 |
zzzz word zzzz | 298 |
cmd wordsearch phrase | 298 |
wordsearch phrase zzzz | 298 |
word zzzz bookcode | 298 |
a span nbsp | 295 |
in order to | 280 |
edu sandbox htrc | 273 |
edu sandbox readings | 258 |
edu emorgan files | 258 |
this is a | 247 |
td tr tr | 246 |
ul li b | 242 |
a li li | 238 |
some of the | 236 |
the reader to | 234 |
p ul li | 233 |
the university of | 231 |
one of the | 230 |
of a librarian | 228 |
life of a | 226 |
p p the | 224 |
on the other | 222 |
when it comes | 219 |
the other hand | 218 |
b source b | 217 |
li b source | 217 |
p blockquote p | 215 |
it comes to | 215 |
tr tr td | 213 |
a href https | 206 |
the form of | 203 |
eric lease morgan | 200 |
bookcode etext segsize | 199 |
etext segsize usepre | 199 |
the a href | 199 |
zzzz bookcode etext | 199 |
td img src | 197 |
of notre dame | 196 |
b keywords b | 196 |
li b keywords | 196 |
the great books | 194 |
the distant reader | 192 |
td tr table | 187 |
the process of | 179 |
a part of | 177 |
cmd term id | 176 |
university of notre | 170 |
be able to | 168 |
all of the | 166 |
in the form | 166 |
li ul p | 165 |
the content of | 163 |
through the use | 157 |
the same time | 157 |
this is the | 157 |
tr align center | 156 |
intended to be | 155 |
li b date | 154 |
to be a | 154 |
align center td | 153 |
on the web | 152 |
at the university | 151 |
the creation of | 150 |
at the same | 148 |
p table align | 147 |
each of the | 146 |
need to be | 142 |
it is a | 142 |
it is not | 142 |
never formally published | 141 |
was never formally | 141 |
some sort of | 140 |
p ol li | 140 |
ought to be | 139 |
td td align | 137 |
linked data is | 137 |
com sandbox liam | 136 |
well as the | 135 |
of all the | 134 |
center td img | 134 |
this text was | 133 |
of linked data | 132 |
the idea of | 130 |
part of the | 129 |
in a text | 126 |
a couple of | 125 |
li ol p | 124 |
linked archival metadata | 124 |
a li ul | 122 |
number of times | 122 |
a span td | 122 |
number of words | 121 |
the whole thing | 121 |
in the corpus | 120 |
br span a | 119 |
of text mining | 118 |
the purpose of | 118 |
can be used | 118 |
any number of | 116 |
use of the | 116 |
p a href | 115 |
many of the | 115 |
as opposed to | 114 |
services against texts | 114 |
the problem of | 113 |
the semantic web | 112 |
a bit of | 110 |
the library profession | 109 |
as linked data | 109 |
text was never | 108 |
one or more | 108 |
to create a | 108 |
be used to | 107 |
a a href | 107 |
a lot of | 106 |
generation library catalogs | 105 |
all of these | 104 |
whether or not | 104 |
p img src | 103 |
to what degree | 103 |
in a document | 102 |
the opportunity to | 102 |
based on the | 101 |
tr table p | 101 |
data and information | 100 |
com water index | 100 |
the digital humanities | 100 |
zzzz bookcode thoreau | 99 |
span td tr | 99 |
and it is | 99 |
in a corpus | 98 |
table align right | 98 |
the value of | 98 |
library of congress | 98 |
align right tr | 98 |
total number of | 98 |
needs to be | 97 |
term id themes | 97 |
in the end | 97 |
p p a | 97 |
computers in libraries | 96 |
width height class | 96 |
td align center | 96 |
with the advent | 96 |
of electronic texts | 95 |
the advent of | 95 |
is possible to | 95 |
in the text | 95 |
it is possible | 94 |
the hesburgh libraries | 94 |
the topic of | 94 |
of the great | 94 |
is intended to | 93 |
there is a | 93 |
called a href | 91 |
there is no | 91 |
br a href | 91 |
a sort of | 91 |
an overview of | 90 |
cmd getwater id | 90 |
plain text files | 90 |
h p the | 89 |
most of the | 89 |
are expected to | 88 |
p p img | 88 |
the purposes of | 88 |
is a good | 87 |
next generation library | 87 |
td align right | 87 |
well as a | 87 |
the list of | 86 |
tcp love search | 86 |
for the purposes | 86 |
browser emerson search | 86 |
my water collection | 86 |
browser thoreau search | 86 |
tcp baxter search | 86 |
com water thumbnails | 86 |
not limited to | 86 |
open access publishing | 86 |
there are a | 85 |
great ideas coefficient | 85 |
p p i | 84 |
the application of | 84 |
my experiences at | 84 |
purpose of the | 84 |
in a sentence | 83 |
the results of | 83 |
the result is | 82 |
right tr align | 82 |
they can be | 81 |
there are many | 80 |
in the collection | 80 |
what is the | 80 |
in other words | 80 |
words in a | 80 |
in the same | 79 |
how to use | 79 |
along the way | 79 |
the result will | 78 |
catalogue of electronic | 78 |
term id formats | 78 |
p p strong | 78 |
the great ideas | 78 |
span violent span | 78 |
the alex catalogue | 78 |
facet terms b | 77 |
pdf local annotated | 77 |
b date read | 77 |
local annotated a | 77 |
b date created | 77 |
li b rights | 77 |
date read b | 77 |
date created b | 77 |
b rights b | 77 |
of open source | 77 |
li b facet | 77 |
b facet terms | 77 |
annotated a li | 77 |
the definition of | 77 |
of the collection | 77 |
li b creator | 77 |
the ability to | 77 |
li b versions | 77 |
the triple store | 76 |
the united states | 76 |
will need to | 76 |
a whole lot | 75 |
seems to be | 75 |
is not a | 75 |
a word cloud | 75 |
target blank http | 75 |
org library virtue | 74 |
p p in | 74 |
available on the | 74 |
of words in | 74 |
version of the | 73 |
an effort to | 73 |
i have been | 73 |
a good time | 73 |
i li li | 73 |
access to the | 73 |
on a timeline | 72 |
alex catalogue of | 72 |
td td img | 72 |
quite a number | 72 |
some of my | 72 |
this posting describes | 71 |
getwater id p | 71 |
but it is | 71 |
go to step | 71 |
id p table | 71 |
to use the | 71 |
the internet archive | 71 |
it a span | 71 |
map it a | 71 |
jpg br span | 71 |
natural language processing | 70 |
as a part | 70 |
expected to be | 70 |
on the topic | 70 |
experiences at the | 70 |
the full text | 70 |
just about any | 69 |
the reader can | 69 |
at notre dame | 69 |
the code lib | 69 |
world wide web | 68 |
each of these | 68 |
table align center | 68 |
a p p | 68 |
of these things | 67 |
to make the | 67 |
list of all | 67 |
align center tr | 66 |
a given word | 66 |
use of a | 66 |
or may not | 66 |
open li li | 66 |
into a single | 65 |
and linked data | 65 |
is used to | 65 |
as a whole | 65 |
different types of | 65 |
problem of find | 64 |
may or may | 64 |
the work of | 64 |
digital humanities computing | 64 |
the linked data | 64 |
it can be | 63 |
the library of | 63 |
is a list | 63 |
of the data | 63 |
this blog posting | 63 |
most frequently used | 62 |
learn how to | 62 |
used to create | 62 |
services against the | 62 |
and in the | 61 |
a type of | 61 |
it is about | 60 |
p this is | 60 |
result will be | 60 |
you want to | 60 |
more or less | 60 |
these sorts of | 60 |
the size of | 60 |
number of documents | 60 |
collections and services | 60 |
of the things | 60 |
will be a | 60 |
items of interest | 60 |
it would be | 59 |
make it easier | 59 |
such as the | 58 |
what are the | 58 |
a plain text | 58 |
to the reader | 58 |
in terms of | 58 |
a myriad of | 58 |
for more detail | 58 |
of a href | 58 |
ul li a | 58 |
to do with | 57 |
items in the | 57 |
for more information | 57 |
part of a | 57 |
align right td | 57 |
in an effort | 57 |
item in the | 57 |
form of a | 57 |
in my opinion | 57 |
will be able | 56 |
formats web articles | 56 |
henry david thoreau | 56 |
was originally published | 56 |
eric morgan infomotions | 56 |
on how to | 55 |
this sort of | 55 |
content of the | 55 |
that can be | 55 |
code lib journal | 55 |
text mining is | 55 |
the whole of | 55 |
a travel log | 55 |
seem to be | 55 |
be applied to | 55 |
point of view | 55 |
is akin to | 55 |
ul li strong | 55 |
humanities computing techniques | 55 |
back to the | 54 |
plot on a | 54 |
is a pre | 54 |
edu sandbox thatcamp | 54 |
the names of | 54 |
such a thing | 54 |
the heart of | 54 |
of open access | 54 |
the most frequently | 54 |
used in the | 54 |
inverse document frequency | 54 |
the frequency of | 54 |
it is the | 54 |
the next step | 53 |
american library association | 53 |
in the world | 53 |
a way to | 53 |
are a few | 53 |
in a given | 53 |
you may or | 53 |
this presentation was | 53 |
provide access to | 53 |
a data structure | 53 |
attachment class wp | 53 |
id attachment class | 53 |
sandbox liam sparql | 53 |
div id attachment | 53 |
png img src | 53 |
is not as | 53 |
cultural heritage institutions | 52 |
was able to | 52 |
of a library | 52 |
it easier to | 52 |
collected this water | 52 |
height class size | 52 |
here is a | 52 |
can also be | 52 |
to do the | 52 |
full text content | 52 |
for the most | 51 |
for a good | 51 |
look at the | 51 |
hathitrust research center | 51 |
are intended to | 51 |
a collection of | 51 |
top tech trends | 51 |
if it were | 51 |
align center img | 51 |
center img src | 51 |
of full text | 51 |
may not know | 51 |
the contents of | 51 |
of the day | 50 |
integrated library system | 50 |
li li the | 50 |
text mining and | 50 |
local annotated readings | 50 |
one way to | 50 |
as much as | 50 |
files in the | 50 |
do the work | 50 |
is expected to | 50 |
a means to | 49 |
is a type | 49 |
to be used | 49 |
i had the | 49 |
at the very | 49 |
described in this | 49 |
similar to the | 49 |
each item in | 49 |
of a text | 49 |
these things are | 49 |
it is an | 49 |
dh notre dame | 49 |
documents my experiences | 49 |
the principles of | 49 |
of bibliographic description | 48 |
the human condition | 48 |
jpg img src | 48 |
the western world | 48 |
the answer is | 48 |
a person can | 48 |
had the opportunity | 48 |
to figure out | 48 |
frequency inverse document | 48 |
books of the | 48 |
in this proposal | 48 |
li ul h | 48 |
publishing linked data | 47 |
goal is to | 47 |
versions of the | 47 |
enabling the reader | 47 |
the location of | 47 |
of digital humanities | 47 |
end of the | 47 |
words or phrases | 47 |
edited version of | 46 |
the center for | 46 |
be used as | 46 |
the current environment | 46 |
is more than | 46 |
as you may | 46 |
is not the | 46 |
in the united | 46 |
of the western | 46 |
the result ought | 46 |
of them are | 46 |
to be read | 46 |
to be the | 46 |
in the future | 46 |
great books of | 46 |
overview of the | 46 |
introduction to the | 46 |
of a number | 46 |
and at the | 46 |
this posting outlines | 46 |
a triple store | 46 |
result ought to | 46 |
declaration of independence | 46 |
can be done | 46 |
is designed to | 46 |
the implementation of | 45 |
authorities names n | 45 |
it easier for | 45 |
conference on digital | 45 |
gov authorities names | 45 |
they are not | 45 |
all of this | 45 |
p p this | 45 |
on digital libraries | 45 |
you will be | 45 |
akin to a | 45 |
the output of | 45 |
the end of | 45 |
can be applied | 45 |
an introduction to | 45 |
take advantage of | 45 |
frequently used words | 45 |
library association annual | 45 |
in order for | 45 |
is one of | 45 |
this is an | 45 |
this travel log | 45 |
this article was | 45 |
the most part | 45 |
of a book | 45 |
linked open data | 44 |
distant reader is | 44 |
of a given | 44 |
term frequency inverse | 44 |
of the digital | 44 |
a perl module | 44 |
are a number | 44 |
p id caption | 44 |
formats journal articles | 44 |
made up of | 44 |
of the library | 44 |
the goals of | 44 |
center tr align | 44 |
to see the | 44 |
of the reader | 44 |
because of the | 43 |
you will need | 43 |
create a list | 43 |
research data management | 43 |
more things in | 43 |
the set of | 43 |
i wrote a | 43 |
here at notre | 43 |
alt width height | 43 |
ralph waldo emerson | 43 |
to read the | 43 |
to do this | 43 |
target blank href | 43 |
to make a | 43 |
and open source | 43 |
to create the | 43 |
to read and | 43 |
enable the reader | 43 |
interface to the | 43 |
configure use constant | 43 |
a target blank | 43 |
plain text a | 43 |
things in common | 43 |
most common words | 42 |
association annual meeting | 42 |
the same breath | 42 |
of the original | 42 |
file not found | 42 |
integrated library systems | 42 |
and a href | 42 |
how to make | 42 |
a given document | 42 |
words and phrases | 42 |
i learned that | 42 |
be used in | 42 |
used to be | 42 |
globally networked computers | 42 |
themes data curation | 42 |
a relational database | 42 |
of the items | 42 |
txt plain text | 42 |
you can see | 42 |
much of the | 42 |
to get a | 41 |
public library of | 41 |
the characteristics of | 41 |
more than a | 41 |
i think the | 41 |
a td td | 41 |
org dc terms | 41 |
with a number | 41 |
of documents in | 41 |
against the database | 41 |
in the current | 41 |
number of ways | 41 |
source software and | 41 |
it does not | 41 |
the existence of | 41 |
is much more | 41 |
in common than | 41 |
the process is | 41 |
step is to | 41 |
version of a | 40 |
to be more | 40 |
a person to | 40 |
com blog feed | 40 |
number of things | 40 |
the library community | 40 |
the most frequent | 40 |
on the internet | 40 |
the fact that | 40 |
in conjunction with | 40 |
a long time | 40 |
a combination of | 40 |
a library catalog | 40 |
the content is | 40 |
of marc records | 40 |
not intended to | 40 |
akin to the | 40 |
of a corpus | 40 |
usr bin perl | 40 |
compared to the | 40 |
phrases in a | 39 |
university of illinois | 39 |
may not be | 39 |
names of people | 39 |
of the corpus | 39 |
next step is | 39 |
this is not | 39 |
a few of | 39 |
of the world | 39 |
generation library catalog | 39 |
library collections and | 39 |
to see how | 39 |
p this posting | 39 |
whole of the | 39 |
the public domain | 39 |
to use a | 39 |
p div id | 39 |
library of america | 39 |
li ul li | 39 |
a program called | 39 |
the download page | 39 |
digital public library | 39 |
the name of | 39 |
and they are | 39 |
the items in | 39 |
of relational databases | 38 |
it has been | 38 |
of the more | 38 |
please see the | 38 |
in a library | 38 |
an open source | 38 |
is a sub | 38 |
think of the | 38 |
i think it | 38 |
to compare amp | 38 |
full text of | 38 |
edu liam files | 38 |
it was a | 38 |
as if it | 38 |
european conference on | 38 |
p pre code | 38 |
but not limited | 38 |
is a process | 38 |
the functionality of | 38 |
the need for | 38 |
like this one | 38 |
a presentation called | 37 |
words in the | 37 |
for each of | 37 |
comes to the | 37 |
data is a | 37 |
pre blockquote p | 37 |
plain text file | 37 |
this can be | 37 |
the very least | 37 |
it to the | 37 |
a corpus of | 37 |
of the project | 37 |
this essay was | 37 |
know how to | 37 |
blank href http | 37 |
used to denote | 37 |
page for more | 37 |
and it was | 37 |
university of michigan | 36 |
article was originally | 36 |
h a id | 36 |
p in the | 36 |
the good work | 36 |
has founding date | 36 |
p blockquote pre | 36 |
tika release apache | 36 |
to denote the | 36 |
purpose is to | 36 |
a h p | 36 |
release apache tika | 36 |
in the file | 36 |
outlines some of | 36 |
the hathitrust research | 36 |
this text documents | 36 |
to have the | 36 |
to learn how | 36 |
is the most | 36 |
full text indexing | 36 |
and how it | 36 |
i was able | 36 |
are the great | 36 |
apache tika release | 36 |
some of these | 36 |
number of people | 36 |
as long as | 36 |
is based on | 36 |
make a store | 36 |
is a set | 36 |
originally published in | 36 |
as a librarian | 36 |
in the past | 36 |
on a map | 36 |
against the result | 36 |
used to describe | 35 |
to be an | 35 |
center for research | 35 |
the file named | 35 |
into plain text | 35 |
a p id | 35 |
the command line | 35 |
common than differences | 35 |
of the internet | 35 |
if you want | 35 |
of words and | 35 |
a few years | 35 |
code lib community | 35 |
sandbox liam id | 35 |
download page for | 35 |
will not be | 35 |
the most significant | 35 |
of library collections | 35 |
associated with the | 35 |
a look at | 35 |
the search results | 35 |
was used to | 35 |
has been released | 35 |
in the middle | 35 |
could be used | 34 |
data is not | 34 |
class aligncenter size | 34 |
the most common | 34 |
i decided to | 34 |
of plain text | 34 |
lib mailing list | 34 |
i want to | 34 |
catholic youth literature | 34 |
the means for | 34 |
number of pages | 34 |
a text and | 34 |
a subset of | 34 |
used as a | 34 |
the structure of | 34 |
text versions of | 34 |
the source code | 34 |
the reader will | 34 |
contains a set | 34 |
save the result | 34 |
the reader is | 34 |
will want to | 34 |
of find amp | 34 |
source software in | 34 |
libraries and librarianship | 34 |
but this time | 34 |
p p for | 34 |
of the time | 34 |
in the public | 34 |
some of them | 34 |
chttp a f | 34 |
is to provide | 34 |
in the first | 34 |
use of computers | 34 |
need to have | 34 |
a td tr | 34 |
height class aligncenter | 34 |
a mailing list | 34 |
hypertext markup language | 34 |
of interest to | 34 |
the future of | 34 |
aligncenter a href | 33 |
read and write | 33 |
to create and | 33 |
of the same | 33 |
p p as | 33 |
out of the | 33 |
on one hand | 33 |
frequency of words | 33 |
caption aligncenter a | 33 |
speech and named | 33 |
put another way | 33 |
allowing the reader | 33 |
i do not | 33 |
the folks at | 33 |
in the process | 33 |
source software is | 33 |
www html main | 33 |
and it will | 33 |
according to the | 33 |
the processes of | 33 |
a single word | 33 |
disk www html | 33 |
of the most | 33 |
main sandbox liam | 33 |
org target blank | 33 |
html main sandbox | 33 |
not so much | 32 |
li li what | 32 |
is in the | 32 |
center for digital | 32 |
the conference was | 32 |
have more things | 32 |
result is a | 32 |
for digital scholarship | 32 |
the context of | 32 |
my alex catalogue | 32 |
to the use | 32 |
out how to | 32 |
a christmas carol | 32 |
i collected this | 32 |
and try again | 32 |
part of http | 32 |
the root of | 32 |
and usually empty | 32 |
is going to | 32 |
img align right | 32 |
the words in | 32 |
it into a | 32 |
in a single | 32 |
the things i | 32 |
for a given | 32 |
optical character recognition | 32 |
to see what | 32 |
but they are | 32 |
for more than | 32 |
next to you | 32 |
balance of the | 32 |
in a number | 32 |
person next to | 32 |
words are used | 32 |
type of http | 32 |
describes how to | 32 |
the person next | 32 |
given at the | 32 |
if they were | 32 |
the life of | 32 |
the balance of | 32 |
state university libraries | 32 |
the dpla will | 32 |
the other end | 32 |
this posting documents | 32 |
o order by | 32 |
width src http | 32 |
to the library | 32 |
have to do | 32 |
the totality of | 32 |
opposed to the | 32 |
the result was | 32 |
of find get | 32 |
context of the | 32 |
and the result | 32 |
the works of | 32 |
of the dpla | 32 |
this process is | 31 |
in regards to | 31 |
want to do | 31 |
to be in | 31 |
the need to | 31 |
but at the | 31 |
it on the | 31 |
code li li | 31 |
the mobile web | 31 |
been able to | 31 |
research resources alliance | 31 |
books and journals | 31 |
a digital library | 31 |
providing access to | 31 |
step in the | 31 |
catholic research resources | 31 |
questions such as | 31 |
a single file | 31 |
the power of | 31 |
greater number of | 31 |
i believe the | 31 |
software in libraries | 31 |
but in the | 31 |
am able to | 31 |
linked data and | 31 |
the same thing | 31 |
is where the | 31 |
sorts of things | 31 |
in the library | 31 |
went on to | 31 |
to describe the | 31 |
of computer technology | 31 |
one of my | 31 |
i am able | 31 |
it is important | 31 |
of the text | 30 |
the world wide | 30 |
problem of use | 30 |
digital humanities and | 30 |
gave a presentation | 30 |
the american library | 30 |
just as importantly | 30 |
of the word | 30 |
right td td | 30 |
if you have | 30 |
a web server | 30 |
into rdf xml | 30 |
to be able | 30 |
this is where | 30 |
to be solved | 30 |
located in the | 30 |
see the list | 30 |
this article describes | 30 |
and named entities | 30 |
is an excellent | 30 |
linked data in | 30 |
from the hathitrust | 30 |
principles of librarianship | 30 |
at least a | 30 |
posting describes how | 30 |
edu tmp early | 30 |
is not about | 30 |
the th century | 30 |
align right src | 30 |
of this workshop | 30 |
the history of | 30 |
a presentation at | 30 |
advantages and disadvantages | 30 |
the way to | 30 |
i wanted to | 30 |
the amount of | 30 |
edu ontologies mods | 30 |
the problem to | 30 |
p i have | 30 |
right src http | 30 |
ul p the | 30 |
word in a | 30 |
figure out how | 30 |
notice how the | 30 |
record in the | 30 |
long time ago | 30 |
of the conference | 30 |
this is done | 30 |
text documents my | 30 |
and if so | 30 |
a span violent | 30 |
require use strict | 30 |
here are a | 30 |
documents in a | 29 |
the case of | 29 |
analysis of the | 29 |
is old is | 29 |
national library of | 29 |
can then be | 29 |
content uploads dpla | 29 |
a is a | 29 |
more information on | 29 |
does not mean | 29 |
an acronym for | 29 |
is new again | 29 |
to participate in | 29 |
to have a | 29 |
the length of | 29 |
the result of | 29 |
a description of | 29 |
of this is | 29 |
through the process | 29 |
as an example | 29 |
the first is | 29 |
length of a | 29 |
resource description framework | 29 |
need to know | 29 |
and there are | 29 |
you will find | 29 |
and this is | 29 |
man is span | 29 |
for research computing | 29 |
i learned about | 29 |
it is also | 29 |
a good idea | 29 |
the world of | 29 |
in the list | 29 |
file for a | 29 |
span man is | 29 |
list of urls | 29 |
i did not | 29 |
i learned a | 29 |
through this process | 29 |
is an acronym | 29 |
answer the question | 29 |
the majority of | 29 |
p p to | 29 |
the collection is | 29 |
old is new | 29 |
to accomplish this | 29 |
blockquote p the | 28 |
in the triple | 28 |
of the semantic | 28 |
are not limited | 28 |
the presentation was | 28 |
in this case | 28 |
include but are | 28 |
of the workshop | 28 |
is not necessarily | 28 |
an xml file | 28 |
all puns intended | 28 |
img width src | 28 |
is all about | 28 |
associated with a | 28 |
written by the | 28 |
could then be | 28 |
and the number | 28 |
width height br | 28 |
and some of | 28 |
book in the | 28 |
right td tr | 28 |
formats magazine articles | 28 |
mining is a | 28 |
the type of | 28 |
of the content | 28 |
youth literature project | 28 |
get back a | 28 |
to learn about | 28 |
idea of the | 28 |
as many as | 28 |
is probably the | 28 |
links to the | 28 |
it easy to | 28 |
designed to be | 28 |
is similar to | 28 |
a chttp a | 28 |
of times the | 28 |
this book is | 28 |
to do some | 28 |
to understand how | 28 |
and have a | 28 |
web articles a | 28 |
to provide a | 28 |
tools to do | 28 |
enables the reader | 28 |
id formats web | 28 |
the result in | 28 |
not able to | 28 |
how to do | 28 |
is not only | 28 |
jpg alt width | 28 |
open access journals | 28 |
of the people | 28 |
where b fs | 28 |
are some of | 28 |
is all but | 28 |
the answers to | 28 |
a given text | 28 |
if nothing happens | 28 |
the same way | 28 |
the lita blog | 28 |
the marc record | 28 |
to some degree | 28 |
one of them | 28 |
one or two | 28 |
that is not | 28 |
p img align | 28 |
are described in | 28 |
as if they | 28 |
presentation was given | 28 |
is not really | 28 |
is a part | 28 |
value of the | 28 |
this is true | 28 |
it is used | 28 |
the word cloud | 28 |
a good example | 28 |
good work of | 28 |
interested in the | 27 |
can be implemented | 27 |
the problem is | 27 |
available as a | 27 |
information about the | 27 |
in the book | 27 |
have a look | 27 |
able to read | 27 |
to write a | 27 |
h summary h | 27 |
controlled vocabulary terms | 27 |
summary h p | 27 |
i need to | 27 |
in a href | 27 |
used to do | 27 |
blockquote p code | 27 |
carolina state university | 27 |
of the other | 27 |
code pre p | 27 |
characteristics of the | 27 |
a linked data | 27 |
top technology trends | 27 |
from the command | 27 |
data into information | 27 |
in the set | 27 |
blog at http | 27 |
the result to | 27 |
be transformed into | 27 |
north carolina state | 27 |
p h summary | 27 |
content from the | 27 |
a good thing | 27 |
is the first | 27 |
strong and strong | 27 |
to search the | 27 |
have been able | 27 |
the first of | 27 |
a greater number | 27 |
a variety of | 27 |
is important to | 27 |
the time and | 27 |
plain text versions | 27 |
it is more | 27 |
libraries of notre | 27 |
in the humanities | 27 |
open access journal | 27 |
txt file for | 27 |
posted on april | 26 |
are akin to | 26 |
of the words | 26 |
table of contents | 26 |
to the index | 26 |
at the download | 26 |
putting it on | 26 |
by doing so | 26 |
other end of | 26 |
tech trends for | 26 |
your text editor | 26 |
if i get | 26 |
all sorts of | 26 |
it will be | 26 |
do with the | 26 |
not seem to | 26 |
is a lot | 26 |
the availability of | 26 |
growing number of | 26 |
more like this | 26 |
word and phrase | 26 |
in this regard | 26 |
applied to the | 26 |
of the works | 26 |
in the life | 26 |
a text editor | 26 |
the following query | 26 |
see the changes | 26 |
a means for | 26 |
make sense of | 26 |
provide the means | 26 |
at a href | 26 |
the middle of | 26 |
up to the | 26 |
is just one | 26 |
this subdirectory contains | 26 |
in this way | 26 |
wiki declaration of | 26 |
libraries at the | 26 |
content of a | 26 |
sets of words | 26 |
need to do | 26 |
of the documents | 26 |
here at the | 26 |
on a world | 26 |
h p this | 26 |
the archival community | 26 |
may need to | 26 |
a lot like | 26 |
the lengths of | 26 |
the home page | 26 |
something like this | 26 |
p h a | 26 |
the results are | 26 |
the beginnings of | 26 |
of the resulting | 26 |
are used to | 26 |
information on how | 26 |
make a book | 26 |
he has been | 26 |
span td td | 26 |
i needed to | 26 |
height br a | 26 |
there were a | 26 |
tfidf in libraries | 26 |
very similar to | 26 |
so we can | 26 |
against the index | 26 |
the development of | 26 |
to the collection | 26 |
the w c | 26 |
saved in the | 26 |
org wiki declaration | 26 |
a piece of | 26 |
and i have | 26 |
li li use | 26 |
list of changes | 25 |
i attended the | 25 |
a zip file | 25 |
what to do | 25 |
rdf and linked | 25 |
semantic web and | 25 |
to text mining | 25 |
and natural language | 25 |
part ii of | 25 |
order to be | 25 |
this release includes | 25 |
the briefest of | 25 |
you can download | 25 |
advent of the | 25 |
available as linked | 25 |
the language of | 25 |
is a collection | 25 |
at the other | 25 |
a coherent whole | 25 |
i hope to | 25 |
do not advocate | 25 |
of the books | 25 |
comes with a | 25 |
a research question | 25 |
library li li | 25 |
the curation of | 25 |
complete works of | 25 |
tei publishing system | 25 |
of changes in | 25 |
perl module called | 25 |
of ead files | 25 |
is also a | 25 |
file li li | 25 |
my a href | 25 |
the complete works | 25 |
in the right | 25 |
provide services against | 25 |
to determine the | 25 |
was originally a | 25 |
code lib conference | 25 |
early english books | 25 |
may be a | 25 |
topic modeling tool | 25 |
org dist lingua | 25 |
some of their | 25 |
when compared to | 25 |
obtain apache tika | 25 |
to obtain apache | 25 |
how to obtain | 25 |
to enable the | 25 |
file named data | 25 |
as a set | 25 |
the concept of | 25 |
of globally networked | 25 |
of the human | 25 |
full list of | 25 |
learn about the | 24 |
list of the | 24 |
words in each | 24 |
many of these | 24 |
open community knowledge | 24 |
it does this | 24 |
the result into | 24 |
in any number | 24 |
is not so | 24 |
of the database | 24 |
dates of conception | 24 |
the north carolina | 24 |
content uploads services | 24 |
by the author | 24 |
elements must be | 24 |
are not really | 24 |
university of chicago | 24 |
this document was | 24 |
i have to | 24 |
the author of | 24 |
code p blockquote | 24 |
programs and scripts | 24 |
characteristics of a | 24 |
find more like | 24 |
of times each | 24 |
and this posting | 24 |
tables of contents | 24 |
the time of | 24 |
and how to | 24 |
community knowledge hypermedia | 24 |
what degree do | 24 |
used words in | 24 |
and why should | 24 |
is located in | 24 |
annual meeting in | 24 |
to query the | 24 |
open archives initiative | 24 |
described in the | 24 |
such as but | 24 |
creation of a | 24 |
the services against | 24 |
you will have | 24 |
and the digital | 24 |
a sparql endpoint | 24 |
and you will | 24 |
the files in | 24 |
into a coherent | 24 |
the marc records | 24 |
the bulk of | 24 |
from my perspective | 24 |
is associated with | 24 |
what sorts of | 24 |
a process for | 24 |
the process was | 24 |
going to be | 24 |
each record in | 24 |
digital library framework | 24 |
if they are | 24 |
lease morgan eric | 24 |
the possibilities of | 24 |
for me to | 24 |
against the content | 24 |
denoting the number | 24 |
lita blog at | 24 |
probably the most | 24 |
and or their | 24 |
whole thing is | 24 |
done against the | 24 |
water from the | 24 |
corpus li li | 24 |
number of years | 24 |
what words are | 24 |
libraries are not | 24 |
if it is | 24 |
knowledge hypermedia administration | 24 |
how can i | 24 |
quickly and easily | 24 |
period of time | 24 |
average number of | 24 |
content uploads pos | 24 |
computer programs and | 24 |
of the texts | 24 |
hypermedia administration and | 24 |
to take a | 24 |
sanity check my | 24 |
created by the | 24 |
no more than | 24 |
why should i | 24 |
can be created | 24 |
for the past | 24 |
problem to solve | 24 |
to use amp | 24 |
can be illustrated | 24 |
the data and | 24 |
the types of | 24 |
frequency li li | 24 |
through the creation | 24 |
in the way | 24 |
for a few | 24 |
i think i | 24 |
the text mining | 24 |
queries can be | 24 |
h p in | 24 |
but there is | 24 |
administration and metadata | 24 |
the levenshtein algorithm | 24 |
to the wider | 24 |
this was originally | 24 |
but that is | 24 |
the national library | 24 |
accomplish this goal | 23 |
report against the | 23 |
a blog posting | 23 |
ask and answer | 23 |
atom publishing protocol | 23 |
be a good | 23 |
is relatively easy | 23 |
for the library | 23 |
a few weeks | 23 |
presentation at the | 23 |
td a href | 23 |
advantage of the | 23 |
to do so | 23 |
blockquote pre code | 23 |
to a href | 23 |
there is the | 23 |
of the book | 23 |
the combination of | 23 |
the meaning of | 23 |
none none none | 23 |
marc records with | 23 |
how to create | 23 |
li li click | 23 |
of the profession | 23 |
wide variety of | 23 |
this essay outlines | 23 |
the water collection | 23 |
code pre blockquote | 23 |
ala annual meeting | 23 |
an analysis of | 23 |
to answer the | 23 |
the right direction | 23 |
i sincerely believe | 23 |
subsets of the | 23 |
i participated in | 23 |
p there are | 23 |
role in the | 23 |
morgan eric morgan | 23 |
com blog water | 23 |
the essence of | 23 |
is a simple | 23 |
of unique words | 23 |
but it does | 23 |
knowledge of the | 23 |
of the given | 23 |
more than one | 23 |
is made up | 23 |
it means to | 23 |
encoded archival description | 23 |
the catholic pamphlets | 23 |
from the internet | 23 |
count and tabulate | 23 |
what is old | 23 |
hesburgh libraries at | 23 |
tools described in | 23 |
any sort of | 23 |
a marc record | 23 |
such as a | 23 |
online public access | 23 |
in a presentation | 23 |
is the way | 23 |
text version of | 22 |
may want to | 22 |
allow you to | 22 |
there are only | 22 |
the next day | 22 |
back a list | 22 |
library services and | 22 |
width height hspace | 22 |
the goal of | 22 |
thank you for | 22 |
more than the | 22 |
score for each | 22 |
browser thoreau graphs | 22 |
a word of | 22 |
of the current | 22 |
from a text | 22 |
the data sets | 22 |
of the water | 22 |
the given word | 22 |
to implement the | 22 |
to the nltk | 22 |
with the exception | 22 |
journal articles a | 22 |
violent fit of | 22 |
of freely available | 22 |
in the afternoon | 22 |
what degree is | 22 |
data has been | 22 |
to a greater | 22 |
make it easy | 22 |
the definitions of | 22 |
id formats journal | 22 |
use of these | 22 |
lease morgan lt | 22 |
data curation a | 22 |
or their frequency | 22 |
the means to | 22 |
if the reader | 22 |
the catholic research | 22 |
purpose of this | 22 |
was given by | 22 |
cloud illustrating the | 22 |
the following command | 22 |
where in the | 22 |
stop word list | 22 |
collection as a | 22 |
rdf triple store | 22 |
of rdf and | 22 |
of services against | 22 |
wish me luck | 22 |
subdirectory contains a | 22 |
we are not | 22 |
of each of | 22 |
to make sense | 22 |
and you can | 22 |
p p finally | 22 |
part i of | 22 |
the realm of | 22 |
text mining techniques | 22 |
with a set | 22 |
are not intended | 22 |
the collection as | 22 |
this is because | 22 |
an opportunity to | 22 |
find is not | 22 |
of publishing linked | 22 |
the total number | 22 |
have not been | 22 |
of the hesburgh | 22 |
a unique identifier | 22 |
of one or | 22 |
number of us | 22 |
to visit the | 22 |
it is very | 22 |
for libraries to | 22 |
new ways to | 22 |
pdf original a | 22 |
the end i | 22 |
provide a means | 22 |
set of books | 22 |
a code li | 22 |
word cloud illustrating | 22 |
to exploit the | 22 |
the field of | 22 |
as the number | 22 |
of the a | 22 |
tcp baxter graphs | 22 |
opportunity to visit | 22 |
whole lot of | 22 |
and the other | 22 |
mentioned in the | 22 |
problem to be | 22 |
while i was | 22 |
on a web | 22 |
of my experiences | 22 |
them to the | 22 |
we there yet | 22 |
the way they | 22 |
of books and | 22 |
of the future | 22 |
themes digital humanities | 22 |
in greater detail | 22 |
a long way | 22 |
of the university | 22 |
the exception of | 22 |
as but not | 22 |
the other day | 22 |
a violent fit | 22 |
a search for | 22 |
cite a href | 22 |
h p i | 22 |
catholic pamphlets project | 22 |
of the two | 22 |
or not the | 22 |
to linked data | 22 |
should i care | 22 |
is not easy | 22 |
the addition of | 22 |
but are not | 22 |
will continue to | 22 |
was a bit | 22 |
be possible to | 22 |
of search results | 22 |
the ala annual | 22 |
of the files | 22 |
tr tr align | 22 |
ol li strong | 22 |
code a href | 22 |
it is time | 22 |
i went to | 22 |
the user to | 22 |
a script called | 22 |
was attended by | 22 |
described how he | 22 |
first of all | 22 |
of the presentation | 22 |
the long haul | 22 |
has something to | 22 |
the creation and | 22 |
dpla beta sprint | 22 |
and a few | 22 |
this was a | 22 |
browser emerson graphs | 22 |
as a person | 22 |
themes libraries and | 22 |
in the story | 22 |
are in the | 21 |
edited copy for | 21 |
a and a | 21 |
to attend the | 21 |
just one example | 21 |
edu blog wp | 21 |
repository all github | 21 |
understand how you | 21 |
possible to create | 21 |
university of toronto | 21 |
in library land | 21 |
articulating a research | 21 |
as illustrated by | 21 |
leave of absence | 21 |
the bibliographic data | 21 |
of early english | 21 |
to analyze the | 21 |
the list is | 21 |
example of how | 21 |
cookies to understand | 21 |
things can be | 21 |
what i learned | 21 |
go back launching | 21 |
easy to use | 21 |
and through the | 21 |
be a list | 21 |
will be the | 21 |
for bibliographic description | 21 |
in the digital | 21 |
i gave a | 21 |
analytics cookies to | 21 |
then it would | 21 |
p h the | 21 |
sponsored by the | 21 |
library and information | 21 |
data for research | 21 |
of data and | 21 |
li go to | 21 |
tcp love graphs | 21 |
id themes data | 21 |
if there were | 21 |
this does not | 21 |
number of other | 21 |
of the top | 21 |
an idea of | 21 |
tr td align | 21 |
is not an | 21 |
from the catalog | 21 |
if there is | 21 |
you need to | 21 |
document was never | 21 |
it is now | 21 |
and saving the | 21 |
a step in | 21 |
in a few | 21 |
not too difficult | 21 |
a tool for | 21 |
to get the | 21 |
copy for eric | 21 |
but there are | 21 |
of years ago | 21 |
com ndlib text | 21 |
university libraries of | 21 |
the plain text | 21 |
for eric lease | 21 |
travel log documents | 21 |
archival descriptions as | 21 |
availability of full | 21 |
few years ago | 21 |
on your computer | 21 |
the more traditional | 21 |
information can be | 21 |
is one way | 21 |
the sorts of | 21 |
flavor of xml | 21 |
oai lod server | 21 |
number of different | 21 |
the first place | 21 |
a pdf document | 21 |
word phrases in | 21 |
tr table h | 21 |
the library catalog | 21 |
in this repository | 21 |
p it is | 21 |
you will learn | 21 |
of the hathitrust | 21 |
may have been | 21 |
the book is | 21 |
able to do | 21 |
this repository all | 21 |
be applied against | 21 |
essay was originally | 21 |
when it came | 21 |
to understand the | 21 |
to count and | 21 |
is really a | 21 |
presentation of the | 21 |
cite walden cite | 21 |
is never done | 21 |
and dissemination of | 21 |
li li go | 21 |
the values of | 21 |
text mining services | 21 |
including but not | 21 |
how you use | 21 |
one of those | 21 |
edited edited copy | 21 |
in the field | 21 |
not have been | 21 |
what it means | 21 |
understanding of the | 21 |
the role of | 21 |
set of perl | 21 |
p div p | 21 |
would not have | 21 |
part of this | 21 |
thanks go to | 21 |
from a set | 21 |
was never published | 21 |
a way of | 21 |
to explore the | 21 |
voyant tools is | 21 |
on top of | 21 |
because of this | 21 |
height src http | 21 |
half of the | 21 |
can be imported | 20 |
one and only | 20 |
some of those | 20 |
to these questions | 20 |
list of words | 20 |
p p there | 20 |
with the internet | 20 |
the query terms | 20 |
none of them | 20 |
such things are | 20 |
but not necessarily | 20 |
went to the | 20 |
from my point | 20 |
documents some of | 20 |
we do not | 20 |
locations in a | 20 |
found in the | 20 |
of the query | 20 |
illustrating the most | 20 |
com sandbox bibframe | 20 |
it or not | 20 |
my point of | 20 |
the dpla can | 20 |
a great deal | 20 |
is too much | 20 |
words phrases in | 20 |
the reader may | 20 |
results in a | 20 |
the wider community | 20 |
an entire corpus | 20 |
the data is | 20 |
are not the | 20 |
i will be | 20 |
getting started with | 20 |
does not have | 20 |
is an introduction | 20 |
idea of a | 20 |
my store rdf | 20 |
it in the | 20 |
the process i | 20 |
take the form | 20 |
a matter of | 20 |
the university libraries | 20 |
is no such | 20 |
the sru interface | 20 |
does not seem | 20 |
to do it | 20 |
word of interest | 20 |
state of the | 20 |
be used for | 20 |
all the words | 20 |
the levenshtein distance | 20 |
of library land | 20 |
work of the | 20 |
what degree are | 20 |
go to the | 20 |
my model rdf | 20 |
in the case | 20 |
of documents containing | 20 |
but rather a | 20 |
these files are | 20 |
be imported into | 20 |
org oclc worldcat | 20 |
no such thing | 20 |
on the good | 20 |
seems to have | 20 |
and i believe | 20 |
the database to | 20 |
great deal of | 20 |
html html a | 20 |
edu downloads catalog | 20 |
prove to be | 20 |
would like to | 20 |
a world map | 20 |
is time to | 20 |
is very important | 20 |
date ad http | 20 |
participate in the | 20 |
left up to | 20 |
the traditional reading | 20 |
a href mailto | 20 |
of these files | 20 |
to view the | 20 |
step for each | 20 |
for feature in | 20 |
atlantic ocean at | 20 |
would have been | 20 |
will be more | 20 |
a a a | 20 |
the body of | 20 |
files need to | 20 |
and i was | 20 |
the content they | 20 |
implemented in a | 20 |
plain text version | 20 |
from the university | 20 |
of computer science | 20 |
water and putting | 20 |
notre dame on | 20 |
mailing list archives | 20 |
least a couple | 20 |
url pointing to | 20 |
local file system | 20 |
google books project | 20 |
a growing number | 20 |
find all the | 20 |
taken with a | 20 |
depending on the | 20 |
and putting it | 20 |
a report against | 20 |
by richard baxter | 20 |
of what the | 20 |
of the first | 20 |
season to taste | 20 |
founding date ad | 20 |
it seems to | 20 |
things in the | 20 |
something like the | 20 |
oclc worldcat a | 20 |
bulk of the | 20 |
not advocate the | 20 |
use and understanding | 20 |
to know the | 20 |
and locations in | 20 |
national science foundation | 20 |
the dpla could | 20 |
a text services | 20 |
where the word | 20 |
or in a | 20 |
our ability to | 20 |
to calculate the | 20 |
all of them | 20 |
hosted by the | 20 |
seems as if | 20 |
the proximity of | 20 |
i was not | 20 |
project gutenberg is | 20 |
an example of | 20 |
content uploads c | 20 |
a few words | 20 |
description of the | 20 |
created from the | 20 |
text mining tools | 20 |
it seems as | 20 |
people want to | 20 |
contents of the | 20 |
here is the | 20 |
advocate the use | 20 |
release and have | 20 |
across a corpus | 20 |
feature in features | 20 |
the tiniest of | 20 |
use of rdf | 20 |
goal of the | 20 |
xml tei a | 20 |
a greater degree | 20 |
the second document | 20 |
for a presentation | 20 |
this is my | 20 |
are listed below | 20 |
the result on | 20 |
they are the | 20 |
ul li li | 20 |
li li find | 20 |
not the problem | 20 |
what they are | 20 |
home page for | 19 |
this one is | 19 |
it is almost | 19 |
tennessee library association | 19 |
are used in | 19 |
than ten years | 19 |
the google books | 19 |
and save the | 19 |
name of a | 19 |
and how many | 19 |
has not been | 19 |
be manifested in | 19 |
search retrieve via | 19 |
computing techniques to | 19 |
a javascript library | 19 |
of a concordance | 19 |
of the mohicans | 19 |
a process of | 19 |
blockquote p a | 19 |
of blog postings | 19 |
records from the | 19 |
sort of work | 19 |
work of others | 19 |
a wide variety | 19 |
for the web | 19 |
to come up | 19 |
answer questions such | 19 |
the integrated library | 19 |
reader to do | 19 |
definition of librarianship | 19 |
li li code | 19 |
to have been | 19 |
the top ten | 19 |
com blog how | 19 |
on the lita | 19 |
i found the | 19 |
source software for | 19 |
they need to | 19 |
more difficult to | 19 |
there was a | 19 |
content in the | 19 |
order to make | 19 |
list of a | 19 |
will find a | 19 |
how they can | 19 |
com photos infomotions | 19 |
public access catalogs | 19 |
of project gutenberg | 19 |
to extract the | 19 |
a thousand words | 19 |
i created a | 19 |
a mixture of | 19 |
then what might | 19 |
making stone soup | 19 |
i had to | 19 |
list of top | 19 |
search results are | 19 |
english books online | 19 |
majority of the | 19 |
come up with | 19 |
when i was | 19 |
td td tr | 19 |
of the information | 19 |
available for downloading | 19 |
more and more | 19 |
from a number | 19 |
these are the | 19 |
more than ten | 19 |
linked data publishing | 19 |
in a database | 19 |
descriptions as linked | 19 |
to transform the | 19 |
this is why | 19 |
the key to | 19 |
large number of | 19 |
last of the | 19 |
of these processes | 19 |
are not necessarily | 19 |
who wants to | 19 |
done with a | 19 |
trends for ala | 19 |
i have created | 19 |
the scholarly communications | 19 |
share some of | 19 |
build on the | 19 |
to know how | 19 |
its primary purpose | 19 |
com gallery valencia | 19 |
in ways that | 19 |
freely available on | 19 |
scholarly communications process | 19 |
of the oldest | 19 |
is to facilitate | 19 |
return a list | 19 |
rdf triple stores | 19 |
code lib mailing | 19 |
give it a | 19 |
the later is | 19 |
an integrated library | 19 |
the subjects of | 19 |
of the following | 19 |
results can be | 19 |
have a number | 19 |
li li create | 19 |
them in a | 19 |
access to information | 19 |
align right a | 19 |
difficult to read | 19 |
things such as | 19 |
is not always | 19 |
to the underlying | 19 |
to give a | 19 |
uploads c l | 19 |
also need to | 19 |
p the result | 18 |
increasing availability of | 18 |
it was also | 18 |
it is really | 18 |
it easy for | 18 |
to make things | 18 |
it is just | 18 |
do a search | 18 |
indexed true stored | 18 |
a matrix of | 18 |
almost always a | 18 |
the project was | 18 |
was originally given | 18 |
items from the | 18 |
be put into | 18 |
with a given | 18 |
of tools for | 18 |
keep stuff safe | 18 |
of particular interest | 18 |
to share some | 18 |
such are the | 18 |
would be a | 18 |
will probably be | 18 |
digital humanities work | 18 |
that needs to | 18 |
space and time | 18 |
learning how to | 18 |
to implement a | 18 |
file as input | 18 |
is was a | 18 |
as simple as | 18 |
was not able | 18 |
into a search | 18 |
search results by | 18 |
in computers in | 18 |
describes how the | 18 |
items in a | 18 |
think it is | 18 |
this text outlines | 18 |
urls pointing to | 18 |
local annotated mini | 18 |
it is difficult | 18 |
linked data principles | 18 |
learned how to | 18 |
the inclusion of | 18 |
a need for | 18 |
to use and | 18 |
much of this | 18 |
h links h | 18 |
many of them | 18 |
subjects and objects | 18 |
set of documents | 18 |
a presentation to | 18 |
based on a | 18 |
to say the | 18 |
the notre dame | 18 |
the interface is | 18 |
the fruits of | 18 |
author of the | 18 |
to go beyond | 18 |
data in the | 18 |
interact with the | 18 |
to a file | 18 |
not the only | 18 |
of a document | 18 |
content they find | 18 |
answers to these | 18 |
to search for | 18 |
word or phrase | 18 |
a reference to | 18 |
included in the | 18 |
none of the | 18 |
any one of | 18 |
be read by | 18 |
blog posting describes | 18 |
published in computers | 18 |
against the grain | 18 |
copies keep stuff | 18 |
into linked data | 18 |
the corpus is | 18 |
is a travel | 18 |
as easy as | 18 |
and their frequencies | 18 |
of stop words | 18 |
documents in the | 18 |
new and different | 18 |
people who work | 18 |
what are some | 18 |
a text file | 18 |
given the full | 18 |
is far from | 18 |
outlined in this | 18 |
whole lot like | 18 |
counting the number | 18 |
the free encyclopedia | 18 |
flesch readability score | 18 |
application programmer interface | 18 |
well as some | 18 |
the great idea | 18 |
but they can | 18 |
the ratio of | 18 |
d d d | 18 |
outlines my experiences | 18 |
text is a | 18 |
better understanding of | 18 |
part iii of | 18 |
number of topics | 18 |
parts of speech | 18 |
sorted by the | 18 |
and number of | 18 |
and distant reading | 18 |
used in conjunction | 18 |
to deal with | 18 |
to display the | 18 |
the increasing availability | 18 |
linked data from | 18 |
creating and maintaining | 18 |
and only one | 18 |
greater than the | 18 |
it were a | 18 |
the differences between | 18 |
the sizes of | 18 |
if you are | 18 |
to an end | 18 |
it is often | 18 |
and one of | 18 |
digital versions of | 18 |
how the word | 18 |
makes it easier | 18 |
size of the | 18 |
to evaluate the | 18 |
at the a | 18 |
believe it is | 18 |
is difficult to | 18 |
pointing to the | 18 |
all the works | 18 |
i attended a | 18 |
the way i | 18 |
and with the | 18 |
travel log documenting | 18 |
too difficult to | 18 |
library virtue htm | 18 |
eye candy by | 18 |
i was there | 18 |
of internet resources | 18 |
text of the | 18 |
ball state university | 18 |
the basis of | 18 |
and their associated | 18 |
sort the result | 18 |
of the pamphlets | 18 |
is a web | 18 |
i believe it | 18 |
new testament manuscripts | 18 |
at the conference | 18 |
this posting is | 18 |
p p after | 18 |
the entire corpus | 18 |
a stop word | 18 |
how to exploit | 18 |
written in perl | 18 |
of it is | 18 |
who work in | 18 |
the other person | 18 |
can be as | 18 |
reader can see | 18 |
more of the | 18 |
some of it | 18 |
of a person | 18 |
purpose was to | 18 |
the description of | 18 |
in the given | 18 |
records in the | 18 |
the developer to | 18 |
for the reader | 18 |
to the data | 18 |
saved in a | 18 |
there are no | 18 |
making it easier | 18 |
results of text | 18 |
all over the | 18 |
to interact with | 18 |
formats technical report | 18 |
it used to | 18 |
for the long | 18 |
yet to be | 18 |
reader to use | 18 |
sense of the | 18 |
to go to | 18 |
written for a | 18 |
of the levenshtein | 18 |
something to do | 18 |
h p img | 18 |
attended by approximately | 18 |
the distribution of | 18 |
document is about | 18 |
in the hathitrust | 18 |
but rather the | 18 |
is a tool | 18 |
of mass digitization | 18 |
a better understanding | 18 |
the people who | 18 |
three or four | 17 |
a p h | 17 |
i tried to | 17 |
a thing called | 17 |
provides an overview | 17 |
the help of | 17 |
we all have | 17 |
them into a | 17 |
of librarianship are | 17 |
suppose to do | 17 |
data from the | 17 |
to the local | 17 |
great books survey | 17 |
enabled me to | 17 |
this text is | 17 |
td img width | 17 |
content uploads ecdl | 17 |
the english language | 17 |
alex alex catalogue | 17 |
version of an | 17 |
a way for | 17 |
beta sprint submission | 17 |
and ngc lib | 17 |
query select distinct | 17 |
traditional reading process | 17 |
reader is a | 17 |
was written for | 17 |
colors a li | 17 |
ol p the | 17 |
types of input | 17 |
and maintenance of | 17 |
i have made | 17 |
to download the | 17 |
it behooves the | 17 |
means to be | 17 |
thumbnail alt a | 17 |
of illinois at | 17 |
through the application | 17 |
com water about | 17 |
mining and natural | 17 |
it came to | 17 |
is easier to | 17 |
by eric lease | 17 |
web and linked | 17 |
texts in the | 17 |
digital library collections | 17 |
changes in the | 17 |
to take advantage | 17 |
concord and merrimack | 17 |
in each item | 17 |
information and knowledge | 17 |
the features of | 17 |
given a text | 17 |
of traditional library | 17 |
the simplest of | 17 |
of the possibilities | 17 |
and how they | 17 |
he went on | 17 |
how to implement | 17 |
presentation to the | 17 |
library catalogs and | 17 |
is a sort | 17 |
and ipod touch | 17 |
much of my | 17 |
side of the | 17 |
the mailing list | 17 |
to this end | 17 |
the skills of | 17 |
web app development | 17 |
was not really | 17 |
as if there | 17 |
com alex alex | 17 |
both of these | 17 |
can be a | 17 |
illustrated by the | 17 |
time and effort | 17 |
sorts of services | 17 |
lends itself to | 17 |
and full text | 17 |
ii of iii | 17 |
have all the | 17 |
on the concord | 17 |
suite of software | 17 |
i now have | 17 |
good idea to | 17 |
what does this | 17 |
of a word | 17 |
target blank img | 17 |
is worth a | 17 |
two of the | 17 |
from my employer | 17 |
in the following | 17 |
was a success | 17 |
p the second | 17 |
considering the fact | 17 |
to the a | 17 |
text as well | 17 |
can be quite | 17 |
can not be | 17 |
open content alliance | 17 |
through a set | 17 |
worth a thousand | 17 |
is about the | 17 |
how the process | 17 |
using voyant tools | 17 |
and that is | 17 |
couple of years | 17 |
work in libraries | 17 |
denoting the location | 17 |
works in the | 17 |
table tr td | 17 |
the things of | 17 |
of an article | 17 |
more important than | 17 |
sandbox liam etc | 17 |
various types of | 17 |
of north carolina | 17 |
img width height | 17 |
width height src | 17 |
after a bit | 17 |
to be human | 17 |
in a previous | 17 |
com sandbox great | 17 |
cite td td | 17 |
as many of | 17 |
the concord and | 17 |
and answer questions | 17 |
the study carrel | 17 |
it is relatively | 17 |
provide a way | 17 |
a data set | 17 |
of what it | 17 |
think this is | 17 |
a large number | 17 |
process is not | 17 |
or just about | 17 |
output of the | 17 |
will have to | 17 |
collecting water and | 17 |
use constant marc | 17 |
and merrimack rivers | 17 |
picture is worth | 17 |
in the center | 17 |
also be used | 17 |
and open access | 17 |
of possible interest | 17 |
of words are | 17 |
in any event | 17 |
are not as | 17 |
a preponderance of | 17 |
the graphic design | 17 |
for text mining | 17 |
there is an | 17 |
a form of | 17 |
close and distant | 17 |
posting documents my | 17 |
are the most | 17 |
against the texts | 17 |
more information about | 17 |
a perl script | 17 |
the result as | 17 |
got this water | 16 |
com blog tfidf | 16 |
repeat step but | 16 |
of times a | 16 |
names n gt | 16 |
result was a | 16 |
to do things | 16 |
reader wanted to | 16 |
has been created | 16 |
of how the | 16 |
of such a | 16 |
query language of | 16 |
create a new | 16 |
the day configure | 16 |
services against text | 16 |
in it he | 16 |
the product of | 16 |
week i will | 16 |
to create model | 16 |
can be put | 16 |
were associated with | 16 |
for a limited | 16 |
and thus the | 16 |
with wilson for | 16 |
my etc etc | 16 |
as in the | 16 |
time and money | 16 |
implementation of the | 16 |
of copies keep | 16 |
is almost trivial | 16 |
day configure use | 16 |
the first part | 16 |
we can build | 16 |
unable to create | 16 |
the sort of | 16 |
a limited period | 16 |
than the others | 16 |
the second day | 16 |
sets of triples | 16 |
run the program | 16 |
he advocated the | 16 |
the semantics of | 16 |
freely available for | 16 |
called the catholic | 16 |
this being the | 16 |
of library and | 16 |
initialize and sanity | 16 |
the dates in | 16 |
sorts of questions | 16 |
the file system | 16 |
and i had | 16 |
can see that | 16 |
time of the | 16 |
li li select | 16 |
we will be | 16 |
saved to the | 16 |
in bibliographic records | 16 |
to a hash | 16 |
with the content | 16 |
edu sandbox cyl | 16 |
there will be | 16 |
but the process | 16 |
easier to read | 16 |
subjects predicates objects | 16 |
direct access to | 16 |
to the idea | 16 |
library virtue tsv | 16 |
my style parser | 16 |
in the realm | 16 |
it describes how | 16 |
called the great | 16 |
my db argv | 16 |
commenced upon a | 16 |
painting in tuscany | 16 |
of data sets | 16 |
goes a long | 16 |
to the book | 16 |
the results to | 16 |
for most of | 16 |
org country http | 16 |
saved to a | 16 |
such thing as | 16 |
of the alex | 16 |
check my output | 16 |
wrestling with wilson | 16 |
images img src | 16 |
some of this | 16 |
and the process | 16 |
can be made | 16 |
to address the | 16 |
data management plans | 16 |
text files are | 16 |
a search box | 16 |
they make it | 16 |
does not require | 16 |
html target blank | 16 |
the autosuggest interface | 16 |
org net c | 16 |
things like the | 16 |
in the hopes | 16 |
to be manifested | 16 |
photos infomotions in | 16 |
is there a | 16 |
of traditional reading | 16 |
relevancy ranking algorithms | 16 |
what types of | 16 |
in the database | 16 |
opportunity to attend | 16 |
the frequencies of | 16 |
vspace hspace a | 16 |
suppose the reader | 16 |
text mining interfaces | 16 |
a replacement for | 16 |
hypertext transfer protocol | 16 |
strong text mining | 16 |
upon and visualize | 16 |
the data into | 16 |
would be possible | 16 |
with the given | 16 |
the content in | 16 |
to some extent | 16 |
was originally written | 16 |
then you can | 16 |
director of the | 16 |
border vspace hspace | 16 |
reader ought to | 16 |
the event was | 16 |
is left up | 16 |
set of marc | 16 |
the software is | 16 |
things with the | 16 |
topic modeling is | 16 |
the computer programmer | 16 |
this proposal are | 16 |
to build on | 16 |
expected to read | 16 |
written against the | 16 |
this water on | 16 |
not want to | 16 |
of the work | 16 |
being the case | 16 |
for the word | 16 |
png colors a | 16 |
the hopes of | 16 |
for linked data | 16 |
search results and | 16 |
it is quite | 16 |
counting and tabulating | 16 |
of a uri | 16 |
than the first | 16 |
is to figure | 16 |
was going to | 16 |
information standards quarterly | 16 |
with the mobile | 16 |
blank img src | 16 |
in these lists | 16 |
the beginning of | 16 |
gnu public license | 16 |
spread full text | 16 |
the digital public | 16 |
great books data | 16 |