"Ei, dem alten Herrn zoll' ich Achtung gern'" Malte Rehbein It’s our department: On Ethical Issues of Digital Humanities1 1 Anecdotal Introduction I am not an ethicist and had not thought through the moral aspects of my profession as a Digital Humanities scholar until quite recently. A historian by training with a focus on Medieval Studies, I mostly considered the objects I studied to be beyond the scope of moral and legal issues: historical figures are long dead and historical events took place in the past, and my research would hardly ever influence the course of history. After this training as a (traditional) historian, I went on to work digitally with historical data, like Digital Humanists do, trying to identify entities, to find correlations, or to visualize patterns within that very data. At the big Digital Humanities gathering in Hamburg, 2012, however, rumours circu- lated that a secret service agency was recruiting personnel at the conference. This agency, they said, was interested in competences and skills in analysing and inter- preting ambiguous, heterogeneous, and uncertain data. I had not been approached in person and until today do not know whether the story is true or not. Nevertheless, just the idea that a secret service might be interested in expertise from the Digital Humanities was a strong enough signal to start thinking about the moral implications of the work we are doing in this field, and it inspired for this essay. In this light, examples of recent research in Digital Humanities such as psychological profiling appear at the same time exciting and frightening. We can observe a typical dual-use problem: something can have good as well as bad consequences according to its use. Is it not a fascinating asset for research to determine a psychological profile 1 This contribution is based on the Institute Lecture “On Ethical Aspects of Digital Humanities (Some Thoughts)” presented by the author at the Digital Humanities Summer Institute, Victoria BC, Canada, 5 June 2015. While the text has been revised and enriched with annotations, its essayistic style has been maintained. 632 Malte Rehbein of a historical figure just through automated analysis of historical textual sources? On the other hand, what would the consequences be if a psychological profile of anyone, living or dead, were to be revealed or circulated without her knowledge or her assignee’s consent? Ethical considerations are more than just a philosophical exercise. They help us to shape the future of our society (and environment) in ways that we want it to be, and they help us minimize risks of changes that we do not want to happen. 2 Setting the Stage: Use and Abuse of Big Data Now and Then 2.1 Big Data This essay uses Big Data as a vehicle for considerations about ethical issues of the Digital Humanities, pars pro toto for the field as a whole with all its facets. Big Data has been a hyped term worldwide for some time now,2 and its methods have reached the Humanities.3 Big Data is a collective term for huge collections of data and their analysis, often characterized by four attributes: volume, velocity, variety, and veracity: • Volume describes vast amounts of data generated or collected. The notion of “vast” is ambiguous, however. Some define it as an amount of data that is too big to be handled by a single computer. Digital Humanities and the use-cases described in this essay will hardly ever reach this amount. 
However, as Manfred Thaller has pointed out in his keynote presentation at the second annual conference of the Digital Humanities im deutschsprachigen Raum,4 Big Data is characterized especially by the multiplication of all four factors described here. Since data in the Humanities is, due to its ambiguous nature, often far more complex than, for instance, engineering data, data in the Humanities can be "big" in this way.

• Velocity describes the speed at which new data is generated and spreads. As people often think of new data generated within seconds or even faster, this is not characteristic of the Humanities, which deal with historical or literary data. But it can become relevant for the Social Sciences, for instance in the analysis of so-called social media data, which can be considered part of Digital Humanities.

• Variety refers to the various types of data processed: multimodal data, data in different structures, or completely unstructured data. This variety is characteristic of Humanities sources, and Digital Humanities offers new methods to interlink various types of data and to process them synchronously.

• Veracity questions the trustworthiness of the data to be processed, its quality and accuracy. The Humanities, especially the historical disciplines, know best of all what it means to critically question the origin, context, and content of data – making veracity a very relevant aspect for the Digital Humanities.

2 Cornelius Puschmann and Jean Burgess, Big Data, Big Questions. Metaphors of Big Data, in: International Journal of Communication 8 (2014), p. 1690–1709.
3 Christof Schöch, Big? Smart? Clean? Messy? Data in the Humanities, in: The Dragonfly's Gaze. Computational analysis of literary texts (August 1, 2013), online available at http://dragonfly.hypotheses.org/443 [last accessed: 30 Nov. 2015].
4 Manfred Thaller, Wenn die Quellen überfließen. Spitzweg und Big Data, Closing Keynote, Graz, 27 February 2015.

Overall, Big Data collections are too big and/or too complex to be handled by traditional techniques of data management (databases) and by algorithmic analysis on a single computer. With this definition of Big Data in mind, one might think of systems like the Large Hadron Collider in Geneva, which is considered the largest research infrastructure and scientific enterprise of all time, or of the telecommunication data produced by mankind – terabytes upon terabytes every second. Compared to this, it might not be appropriate to speak of Big Data in the context of scholarship in the Humanities at all.

Nevertheless, Big Data can act as a metaphor for one of the current major trends in Digital Humanities: data-driven, quantitative research based on an amount of data that a single scholar or even a group of scholars can barely survey, let alone analyse by hand. Such data, due to its amount, complexity, incompleteness, uncertainty, and ambiguity, requires the support of computers and their algorithmic power. For centuries, Humanities scholars have recognised this aspect of their data, but now they have this data at hand in a much larger quantity than ever before.

In general, typical applications for Big Data are well known and described. With regard to ethics, these applications span a broad range in how they are used and what implications this use might cause. This shall be illustrated by the following examples, divided into three groups.
The first group comprises applications in the Sciences and Humanities (the German term Wissenschaft fits better here and will be used henceforth). Incredible amounts of data are investigated, for example for research on the global climate (e. g., at the NASA Center for Climate Simulation), to increase the precision of weather forecasts, to decode the human genome, or in the search for elementary particles at the Large Hadron Collider in Geneva. In a positive view of Wissenschaft, these investigations serve society as a whole.

The second class encompasses applications of Big Data from which particular groups benefit but which might interfere with the interests of others. Depending on one's perspective, such applications can easily be found in the business world. For example, on February 16th, 2012, the New York Times published an article "How Companies Learn Your Secrets". Taking the example of the US retailer Target, the article describes how companies analyse data and then try to predict consumer behaviour in order to tailor and refine their marketing machine. In this context, Andrew Pole proposed a "pregnancy-prediction model", built on algorithms over Big Data, to answer a company's question: "If we wanted to figure out if a customer is pregnant, even if she didn't want us to know, can you do that?"5

In the wake of the revelations by Edward Snowden, Glenn Greenwald and others in 2013,6 a third class of applications has come more and more into public consciousness. Current mass surveillance might be the strongest example of Big Data analysis in which the interests of a very small group are in stark contrast with the values of society at large.

2.2 Big Data Ethics

These three categories form only one, preliminary classification of Big Data applications from a moral perspective – a classification which is, of course, simplistic and disputable. Nevertheless, it shall lead us towards the basic question of ethics: the distinction between right and wrong, or good and evil, when it comes to deciding between different courses of action. It should also be clear by now that there is no single answer to this question, but that different moral perspectives exist: the perspectives of those who conduct Big Data analysis, of those who do basic research so that others can apply these methods, and of the ambient society.

Setting aside the particular scenarios in which Big Data is examined, one might ask what is methodologically typical of it. There are three main methodological areas involved: pattern recognition, linkage of data, and correlation detection. On this basis, people (or machines) begin the process of inference or prediction.

In his 1956 short story The Minority Report, Philip K. Dick depicts a future society in which a police department called Precrime, employing three clairvoyant mutants, called Precogs, is capable of predicting and consequently preventing violent crimes. This world has not seen a single murder in years.

5 Charles Duhigg, How Companies Learn Your Secrets, in: New York Times, 16 February 2012, online available at http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html [last accessed: 30 Nov. 2015].
6 The Snowden files, online available at http://www.theguardian.com/world/series/the-snowden-files [last accessed: 30 Nov. 2015].
In Steven Spielberg's movie of the same name from 2002,7 these Precogs are characterized in more detail: thanks to their extrasensory capabilities, they are worshiped by mankind. However, as the film has it, the only thing they do is search for patterns of crime in their visions: "people start calling them divine – Precogs are pattern recognition filters, nothing more."

Assuredly, Precogs are a fiction. However, modern real-life crime prevention does indeed attempt to find patterns in Big Data collections in order to predict likely locations of criminal behaviour – reported, with varying and, some say, doubtful success rates, from the USA, Germany, and probably other countries. The justifications (or disguises) our governments and secret services offer for mass surveillance – namely, to predict and prevent terror attacks – are real, too. Big Data and its technology (pattern recognition, data linkage, and inference) serve such predictions: weather prediction – pregnancy prediction – crime prediction.

This is not new. For instance, the East German secret service Stasi under the direction of Erich Mielke conceptualized a comprehensive database of human activities and intentions.8 Its goal: "to put together digital data and reports of all of the 16.5 million citizens of the German Democratic Republic."9 In the historically realistic scenario of the Academy Award-winning film The Lives of Others (orig. Das Leben der Anderen),10 the Stasi possesses a type specimen collection of all typewriters in circulation, and they know the machine favoured by each of the writers they are observing. Whenever they come across an anonymous document, they attribute it to a particular writer by comparing its type with the specimens. This is not (yet) Big Data in the modern definition, but it is a form of pattern recognition. In The Lives of Others, the Stasi did not manage to expose Georg Dreyman as the author of an article published in the West German magazine Der Spiegel, an article in which Dreyman revealed the high statistical rate of suicides in East Germany after the suicide of his friend. The East German secret service failed because Dreyman wrote the draft not with his own typewriter but with somebody else's. He behaved untypically.

7 Minority Report, Dir. Steven Spielberg, 2002.
8 Cf. Stefan Wolle, Die heile Welt der Diktatur, Berlin 2013, p. 186.
9 Victor Sebestyen, Revolution 1989: The Fall of the Soviet Empire, New York 2009, p. 121. In comparison to today's Big Data companies, Andrew Keen concludes: "Mielke war ein Datendieb des 20. Jahrhunderts, der die DDR in eine Datenkleptokratie verwandelte. Doch verglichen mit den Datenbaronen des 21. Jahrhunderts war sein Informationsimperium zu regional und zu klein gedacht. Er kam nicht auf den Gedanken, dass Milliarden Menschen in aller Welt ihre persönlichen Daten freiwillig herausrücken könnten." ["Mielke was a data thief of the 20th century who turned the GDR into a data kleptocracy. Yet compared with the data barons of the 21st century, his information empire was conceived too regionally and on too small a scale. It did not occur to him that billions of people all over the world might hand over their personal data voluntarily."] (Andrew Keen, Das digitale Debakel, München 2015, p. 201).
10 Das Leben der Anderen, Dir. Florian Henckel von Donnersmarck, 2006.

Scenarios and applications like this are not new. What is new is their dimension. And what brings these briefly introduced examples together, be they fictitious or real, is that they all rely on so-called probabilistic methods: they do not give us the truth, but the probability that a particular event or behaviour has taken or will take place.
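What a probability of error means at the scale of whole populations can be made concrete with a little arithmetic. The following back-of-the-envelope sketch uses purely hypothetical numbers, chosen only for illustration:

```python
# Back-of-the-envelope sketch (hypothetical numbers): what a "99 % accurate"
# predictor does when the predicted event is rare.
population = 1_000_000   # people screened
actual_cases = 100       # people for whom the prediction is actually true
accuracy = 0.99          # assumed rate of correct classification

true_positives = actual_cases * accuracy                        # ~99 people
false_positives = (population - actual_cases) * (1 - accuracy)  # ~9,999 people

share_wrong = false_positives / (false_positives + true_positives)
print(f"wrongly flagged: {false_positives:,.0f}")
print(f"share of flagged people who are wrongly suspected: {share_wrong:.1%}")
# With these numbers, about 99 % of all flagged people are wrongly suspected.
```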
Even a likelihood of 99 % prediction accuracy means that in one out of a hundred cases, the wrong person will have to suffer the consequences – and, as the sketch above suggests, when the predicted event is rare, the wrongly suspected can far outnumber the correctly identified. Big Data thus raises a number of normative questions and issues. On May 30th, 2014, Kate Crawford published a piece in The New Inquiry under the title "The Anxieties of Big Data. What does the lived reality of big data feel like?" She concludes:

If the big-data fundamentalists argue that more data is inherently better, closer to the truth, then there is no point in their theology at which enough is enough. This is the radical project of big data. It is epistemology taken to its limit. The affective residue from this experiment is the Janus-faced anxiety that is heavy in the air, and it leaves us with an open question: How might we find a radical potential in the surveillant anxieties of the big-data era?11

Ethical questions in Big Data have barely been addressed in the research.12 In 2014, Rajendra Akerkar edited a volume on Big Data Computing.13 In its 540 pages, however, neither legal nor ethical questions are discussed. In the chapter on Challenges and Opportunities by Roberto V. Zicari, for example, opportunities are business opportunities and challenges are mostly technical challenges.14 The volume does not address the individual, organisational, let alone societal risks and consequences of Big Data Computing. This seems to be symptomatic of hyped technologies such as Big Data, and of the technological advancement of our time generally: first we do it, and then we handle the consequences.

Very much alike is the 2013 report on Frontiers in Massive Data Analysis issued by the National Academy of Sciences of the USA. The limitations of data analysis discussed there are merely of a technical nature.

11 Kate Crawford, The Anxieties of Big Data, in: The New Inquiry, 30 May 2014, online available at http://thenewinquiry.com/essays/the-anxieties-of-big-data/ [last accessed: 30 Nov. 2015].
12 More recently, a conference at Herrenhausen, "Big Data in a Transdisciplinary Perspective", discussed legal aspects of Big Data. Its proceedings have not yet been published. A report is available: Christoph Kolodziejski and Vera Szöllösi-Brenig, Big Data in a Transdisciplinary Perspective. Herrenhäuser Konferenz, 22 July 2015, online available at http://www.hsozkult.de/conferencereport/id/tagungsberichte-6084 [last accessed: 30 Nov. 2015].
13 Rajendra Akerkar, Big data computing, Boca Raton 2014.
14 Roberto V. Zicari, Big Data: Challenges and Opportunities, in: Big data computing, ed. by Rajendra Akerkar, Boca Raton 2014, p. 103–128. Ethical challenges are mentioned ("Ensuring that data are used correctly (abiding by its intended uses and relevant laws)") but not further discussed (p. 111).
The report states: "The current report focuses on the technical issues – computational and inferential – that surround massive data, consciously setting aside major issues in areas such as public policy, law, and ethics that are beyond the current scope."15 Bollier makes such issues more explicit: "The rise of large pools of databases that interact with each other clearly elevates the potential for privacy violations, identity theft, civil security and consumer manipulation."16

Even in areas where potential ethical issues are more obvious than in the Humanities, Wissenschaft and the general public are only slowly beginning to realize the implications of Big Data and to demand action. In June 2014, for example, the University of Oxford announced a postdoctoral position in Philosophy on the "ethics of big data": "this pilot project will formulate a blueprint of the ethical aspects, requirements and desiderata underpinning a European framework for the ethical use of Big Data in biomedical research."17 Earlier, on October 24th, 2012, Stephan Noller had called for a general ethics of algorithms (orig. Algorithmen-Ethik) in the German newspaper FAZ to promote control and transparency: "Algorithmen müssen transparent gemacht werden, sowohl in ihrem Einsatz als auch in ihrer Wirkweise." ["Algorithms must be made transparent, both in their deployment and in the way they work."]18 It is clear that a widespread understanding of algorithms is also an urgent necessity.

2.3 Technology is not value-free

In a brief survey of current research, one should not overlook a small publication by Kord Davis from 2012, titled "Ethics of Big Data". Davis' analysis runs as follows:

While big-data technology offers the ability to connect information and innovative new products and services for both profit and the greater social good, it is, like all technology, ethically neutral. That means it does not come with a built-in perspective on what is right or wrong or what is good or bad in using it. Big-data technology has no value framework. Individuals and corporations, however, do have value systems, and it is only by asking and seeking answers to ethical questions that we can ensure big data is used in a way that aligns with those values.19

15 National Research Council, Frontiers in Massive Data Analysis, Washington DC 2013, online available at http://www.nap.edu/read/18374/ [last accessed: 30 Nov. 2015], p. 5.
16 David Bollier, The Promise and Peril of Big Data, Washington DC 2010, p. 33.
17 Job offer for a Postdoctoral Research Fellowship in Ethics of Big Data at the University of Oxford, online available at https://data.ox.ac.uk/doc/vacancy/113435 [last accessed: 30 Nov. 2015].
18 Stephan Noller, Relevanz ist alles. Plädoyer für eine Algorithmen-Ethik, in: Frankfurter Allgemeine Zeitung, 24 October 2012.
19 Kord Davis, Ethics of Big Data, Sebastopol (CA) 2012, p. 8.

While Davis is right in demanding that the discussion of Big Data ethics be embedded in surrounding value systems, he is wrong about the neutrality of technology. His argument reminds us of Francis Bacon, who dreamt this dream of a value-free Wissenschaft in the 17th century.
In the wake of the bombing of Hiroshima and Nagasaki in August 1945, many, such as Max Born, woke up from this dream, recognized the dual-use dilemma of technology, and acknowledged the responsibility of the scientists: "Wir stehen auf einem Scheidewege, wie ihn die Menschheit auf ihrer Wanderung noch niemals angetroffen hat." ["We stand at a crossroads such as mankind has never before encountered on its journey."]20 Closer to our field, Vannevar Bush, who provided an important milestone for the development of the Digital Humanities with his seminal publication As We May Think from 1945, asked how science could come back to the track that leads to the growth of knowledge:

It is the physicists who have been thrown most violently off stride, who have left academic pursuits for the making of strange destructive gadgets, who have had to devise new methods for their unanticipated assignments. […] Now, as peace approaches, one asks where they will find objectives worthy of their best.21

Technology is not value-free. Scientists and scholars develop it. Together with those who apply technology in specific use cases, a huge share of responsibility belongs to them. Computer pioneer Konrad Zuse recognised this. Looking back in his memoir, he describes the qualms (orig. "Scheu") he had at the end of 1944 about developing his machine, the Z4, any further. Implementing conditional jumps would have allowed free control flow:

Solange dieser Draht nicht gelegt ist, sind die Computer in ihren Möglichkeiten und Auswirkungen gut zu übersehen und zu beherrschen. Ist aber der freie Programmablauf erst einmal möglich, ist es schwer, die Grenze zu erkennen, an der man sagen könnte: bis hierher und nicht weiter.22

["As long as this wire is not laid, computers can easily be surveyed and controlled in their possibilities and effects. But once the free flow of a program is possible, it is hard to recognize the limit at which one could say: this far and no further."]

According to Zuse's memoir, his reputation suffered from this "Verantwortungsbewußtsein des Erfinders" ["the inventor's sense of responsibility"].23

There is a second critical aspect of Davis' ethics. His readers are decision-makers in business enterprises. The value system he discusses refers to corporations and to individuals within the corporate structure. He does not address individuals outside the corporation, let alone the ambient society and the world at large: internal, but not external responsibility. For Wissenschaft, however, it is essential that we address both. The freedom to study and to investigate always comes with the responsibility to use this freedom carefully. In Wissenschaft, freedom and responsibility are two sides of the same coin.

20 Max Born, Von der Verantwortung des Naturwissenschaftlers. Gesammelte Vorträge, München 1965, p. 9.
21 Vannevar Bush, As We May Think, in: Atlantic Monthly 176 (July 1945), online available at http://www.theatlantic.com/magazine/archive/1969/12/as-we-may-think/3881/ [last accessed: 30 Nov. 2015].
22 Konrad Zuse, Der Computer – Mein Lebenswerk. Mit Geleitworten von F. L. Bauer und H. Zemanek, Berlin 1984, p. 77.
23 Ibid., p. 77.

3 Case Studies

3.1 Some Thoughts on Digital Humanities

One can easily imagine that Big Data in biomedical research (as seen in the Oxford job posting) opens the door for ethical considerations. But what about the Digital Humanities? Why should we bother? In the context of this question, it is helpful to characterize Digital Humanities as an attempt to offer new practices for the Humanities.
This is mainly facilitated by a) the existence or creation of, and access to, digital data relevant to research in the Humanities, b) the possibility of computer-assisted operations upon this data, and c) modern communication technology, in particular the internet. Overall, this characterizes the Digital Humanities as a hybrid field, suggesting two different perspectives within the scholarly landscape. For both perspectives, ethical discussions play a role.

The first perspective is that of a distinct discipline, with its own research questions, methodology, study programmes, publication venues, and so on – and, of course, values. As a discipline of its own, Digital Humanities needs its Wissenschaftsphilosophie (philosophy of science), including theory24 and ethics. The second perspective, however, sees Digital Humanities as a Hilfswissenschaft (auxiliary science) that provides services for others, a role one might compare with that of mathematics for physics and engineering, or of palaeography for history and medieval studies. This perspective on Digital Humanities is relevant for our ethical discussion, because a Digital Humanist might be tempted to argue that he is only developing methodologies and hence is not responsible for the uses that others make of them.

24 For Digital Humanities as an emerging academic discipline of its own, more theoretical foundation seems to be timely. This is particularly true in the context of Big Data analysis, where proponents are announcing an "end of theory" (provocative: Chris Anderson, The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, in: Wired Magazine 16 (2008), online available at http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory [last accessed: 30 Nov. 2015]). A critical discussion is offered by Rob Kitchin, Big Data, New Epistemologies and Paradigm Shifts, in: Big Data & Society 1 (June 2014), DOI: 10.1177/2053951714528481.

3.2 Early Victims of Digital Humanities: William Shakespeare and Agatha Christie

A first case study comprises the work by Ryan Boyd and James Pennebaker on William Shakespeare. In the context of modern text and language analysis, Pennebaker, a social psychologist, is known for his method of Linguistic Inquiry and Word Count (LIWC). Applying text-analytical methods to the large corpus of the work of William Shakespeare, Boyd and Pennebaker claim to be able to create a psychological signature of an author ("methods allowed for the inference of Shakespeare's […] unique psychological signatures"25) and to confirm the broadly accepted characterization of the playwright as "classically trained" and "socially focused and interested in climbing higher on the social ladder."26 Shakespeare has long been dead, of course, and most likely neither he nor any of his kin has to face the consequences of this research. But the methods employed here are of a general nature and can easily be applied to anyone, living or dead, whether he wants it or not.
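Word-count methods of the LIWC family are, at their core, dictionary lookups. The sketch below illustrates only the general mechanism; the category lists are invented stand-ins, since the real LIWC dictionaries are far larger (and proprietary):

```python
import re

# Toy category dictionaries -- invented stand-ins, not the real LIWC lexicon.
CATEGORIES = {
    "social":    {"friend", "family", "talk", "social", "together"},
    "cognitive": {"think", "know", "because", "reason", "consider"},
    "positive":  {"good", "happy", "love", "fine", "great"},
}

def word_count_profile(text: str) -> dict[str, float]:
    """Share of words per category -- the raw material from which
    word-count methods derive a 'psychological signature'."""
    words = re.findall(r"[a-z]+", text.lower())
    total = max(len(words), 1)
    return {category: sum(word in vocabulary for word in words) / total
            for category, vocabulary in CATEGORIES.items()}

print(word_count_profile(
    "I think we should talk, because good friends consider each other."))
```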
Another prominent "victim" of this kind was Agatha Christie, perhaps the most-read female English writer of all time. In 2009, Ian Lancashire and Graeme Hirst published a study "Vocabulary Changes in Agatha Christie's Mysteries as an Indication of Dementia: A Case Study".27 Lancashire and Hirst analyse the corpus of Christie's work as follows:

Fourteen Christie novels written between ages 34 and 82 were digitized, and digitized copies of her first two mysteries […] were taken from Project Gutenberg. After all punctuation, apostrophes, and hyphens were deleted, each text was divided into 10,000-word segments. The segments were then analysed with the software tools Concordance and the Text Analysis Computing Tools (TACT). We performed three analyses of the first 50,000 words of each novel.28

The result of this fairly straightforward textual analysis indicated that Christie's vocabulary declined significantly over the course of her life and that repetition, such as the usage of indefinite words, increased. For Lancashire and Hirst, this is an indication that Agatha Christie developed dementia.

25 Ryan L. Boyd and James W. Pennebaker, Did Shakespeare Write Double Falsehood? Identifying Individuals by Creating Psychological Signatures With Text Analysis, in: Psychological Science 26 (2015), p. 570–582, here p. 579.
26 Boyd/Pennebaker, Did Shakespeare Write Double Falsehood? (see note 25), p. 579–580.
27 Ian Lancashire and Graeme Hirst, Vocabulary Changes in Agatha Christie's Mysteries as an Indication of Dementia: A Case Study, in: Forgetful Muses: Reading the Author in the Text, Toronto 2010, p. 207–219, online available at http://ftp.cs.toronto.edu/pub/gh/Lancashire+Hirst-extabs-2009.pdf [last accessed: 30 Nov. 2015].
28 Ibid., p. 208.

These techniques operate on text as a sequence of characters. They are agnostic about who has written these texts and for what purpose. In other words, not only texts by well-known and deceased writers can be examined in such a manner. Any text can. Lancashire and Hirst are well aware of this fact and of the potential consequences. Like many technologists, however, their ethics and outlook are strictly positive. "While few present-day patients", they conclude,

have a large online diachronic corpus available for analysis, this will begin to change as more individuals begin to keep, if only by inertia, a life-time archive of e-mail, blogs, professional documents, and the like. [… We can] foresee the possibility of automated textual analysis as a part of the early diagnosis of Alzheimer's disease and similar dementias.29

Early diagnosis of diseases or their prediction might be a wonderful "tool" in the future. Research in this direction aims at something "good", Lancashire and Hirst would argue. Their ethics is utilitarian, in the tradition of Jeremy Bentham and John Stuart Mill. But what happens if this data is used against someone, for instance, to deny an insurance policy? And as textual data becomes more and more easily available – whether we deliver it consciously, for instance in blogs or Facebook posts, or because our e-mails are intercepted – it becomes almost impossible for the individual to avoid this situation.
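Technically, the measurements behind such a study are modest. The following sketch (an illustration under assumptions, not Lancashire and Hirst's actual code) applies the preprocessing they describe – stripping punctuation, apostrophes, and hyphens – and counts the distinct word types in each 10,000-word segment; a falling count across a writer's late works is the kind of vocabulary decline the study reports:

```python
import re

def vocabulary_per_segment(text: str, segment_size: int = 10_000) -> list[int]:
    """Divide a text into fixed-size word segments and report how many
    distinct word types each segment contains."""
    # Reduce the text to lower-case words, discarding punctuation,
    # apostrophes, and hyphens, as in the study's preprocessing.
    words = re.findall(r"[a-z]+", text.lower())
    segments = [words[i:i + segment_size]
                for i in range(0, len(words), segment_size)]
    # Drop a trailing partial segment so the counts stay comparable.
    return [len(set(segment)) for segment in segments
            if len(segment) == segment_size]

# Hypothetical usage, comparing an early and a late novel:
# early, late = open("novel_1922.txt").read(), open("novel_1972.txt").read()
# print(vocabulary_per_segment(early)[:5], vocabulary_per_segment(late)[:5])
```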
3.3 Revealing Your Health Preconditions

Another, related example illustrates that not only the texts and data we provide today, but also data from the past, can lead to individual or societal consequences. An open question in medical research is whether or not there is a genetic predisposition to Alzheimer's disease. Neurologist Hans Klünemann and archivist Herbert Wurster propose that this hypothesis can be tested with historical data.30 Their research uses historical records, parochial death registers from 1750 to 1900, which were digitized, transcribed, and encoded in a database at the archive of the diocese of Passau. They analyse the data for family relations in order to create family trees, and they analyse mortality data to find indicators for Alzheimer's disease.31 Through this, they hope to identify genetic conditions for the development of Alzheimer's disease and, in the future, to be able to predict whether or not someone belongs to such a risk group.

29 Lancashire/Hirst, Vocabulary Changes (see note 27), p. 210.
30 Hans Klünemann, Herbert Wurster and Helmfried Klein, Alzheimer, Ahnen und Archive. Genetisch-Genealogische Alzheimerforschung, in: Blick in die Wissenschaft. Forschungsmagazin der Universität Regensburg 15 (2013), p. 44–51.
31 As dementia and Alzheimer's were not known then, other terms were used as indicators for these diseases. "Gehirnerweichung" [softening of the brain] and "Gehirnwassersucht" [dropsy of the brain] are typical expressions from the sources that Klünemann and Wurster use for their research.

This is a highly interdisciplinary approach with Digital Humanities at its very heart: digitization, digital transcription and encoding, as well as computer-based analysis of historical data make this work. If the approach turns out to work, one can foresee great potential in it. What could be problematic about such research? This data (the digitized church registers) has been made publicly available, searchable, and analysable. Many other archives have done or will do the same. Consequently, however, information about an individual's family and their causes of death becomes public information, and this information can be used, for instance, to evaluate the individual risk of a living descendant for a certain disease – even if this individual has not disclosed any personal information about him or herself. Hence, information about living persons can be inferred from open historical data.

In addition to the question of whether individual rights are affected, these case studies demonstrate typical dual-use problems. On the one hand, family doctors can use the data and its analysis for the early diagnosis of severe diseases. On the other hand, potential employers can also use it, for instance, to pick only those individuals who do not belong to any risk group. There is no easy solution for this problem. Ethical questions appear as dilemmas, in Digital Humanities too.
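The register-mining step at the heart of such research can be pictured in a few lines of code. The record layout below is a pure assumption for illustration – the Passau database is certainly structured differently – but the historical indicator terms are those named above:

```python
# Hypothetical death-register records: (person, family identifier, cause of death).
records = [
    ("Anna M.",  "family_17", "Gehirnwassersucht"),
    ("Josef M.", "family_17", "Schlagfluss"),
    ("Maria K.", "family_23", "Gehirnerweichung"),
]

# Historical expressions taken as indicators of dementia-like diseases.
INDICATOR_TERMS = {"gehirnerweichung", "gehirnwassersucht"}

def flagged_families(records) -> dict[str, list[str]]:
    """Group records by family and collect the persons whose recorded
    cause of death matches one of the indicator terms -- the raw
    material for looking for familial clusters."""
    families: dict[str, list[str]] = {}
    for person, family, cause in records:
        if cause.lower() in INDICATOR_TERMS:
            families.setdefault(family, []).append(person)
    return families

print(flagged_families(records))
# -> {'family_17': ['Anna M.'], 'family_23': ['Maria K.']}
```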
3.4 Another Prominent Victim of DH: J. K. Rowling

In 2013, a quite prominent case of authorship attribution made the rounds. A certain Robert Galbraith had published a novel called The Cuckoo's Calling. Despite positive reviews, the book was at first only an average success on the book market. Three months later, however, rumours began circulating that the real author of The Cuckoo's Calling was J. K. Rowling, who had had such sweeping success with her Harry Potter series. Patrick Juola and Peter Millican analysed the text of The Cuckoo's Calling with methods of forensic stylometry and came to the conclusion that it was quite probable that Rowling was indeed its author – which she afterwards admitted.

Especially when it is a "closed game" as in this case, in which one computes the likelihood with which a text can be attributed to a candidate author (as opposed to the "open game", in which one computes the most likely author of a text), forensic stylometry is a simple method: "language is a set of choices, and speakers and writers tend to fall into habitual, or at least common, choices. Some choices come from dialect […], some from social pressure […], and some just seem to come."32 This leaves stylistic patterns that a computer can measure and compare to corpora of texts already attributed, such as the Harry Potter series. The method has been described and practised since the 19th century (although computers are a late entrant to the game).

For the Digital Humanities, methods like these are – at first sight – fantastic. They offer vast opportunities for fundamental research, for example in studying the history of literature or general history; they allow the testing of existing hypotheses, and they suggest new ones. The moral question, however, is again: at what cost? J. K. Rowling admitted that she would have preferred to remain unrevealed: "Being Robert Galbraith has been such a liberating experience […] It has been wonderful to publish without hype and expectation and pure pleasure to get feedback under a different name."33 Does research in Digital Humanities threaten the effectiveness of a pseudonym and hence an individual's right to privacy and freedom to publish?

32 Patrick Juola, Rowling and "Galbraith": an authorial analysis, in: Language Log Blog, 16 July 2013, online available at http://languagelog.ldc.upenn.edu/nll/?p=5315 [last accessed: 30 Nov. 2015].
33 Quoted after J. K. Rowling's pseudonym: A bestselling writer's fantasy, in: The Boston Globe, 22 July 2013, online available at https://www.bostonglobe.com/opinion/editorials/2013/07/21/with-pseudonym-richard-galbraith-rowling-lives-out-every-writer-fantasy/H9tkYJFB5dAHppCOe963yJ/story.html [last accessed: 30 Nov. 2015].

This kind of research does not only affect individuals. There are consequences for society as a whole, for the world we live in, and for our social interaction. If one thinks the idea of authorship attribution through to its very end, we arrive at a future in which it is impossible to remain anonymous – even when we try. Proponents of mass surveillance and leaders of totalitarian regimes will certainly favour such a scenario; free-speech advocates certainly will not. We have to evaluate carefully the risks that our research carries.

There is yet another interesting aspect to this story: we usually speak of technology and Wissenschaft in the same breath as representing progress. Wissenschaft enhances, it extends, it augments. In the case discussed here, however, we appear to lose a capability through this scientific progress: we will no longer be capable of hiding.
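How little machinery the "closed game" described above requires can be seen in a toy sketch. The following compares relative function-word frequencies with a crude distance measure – a deliberately simplified stand-in in the general spirit of stylometric practice, not the procedure Juola and Millican actually used:

```python
from collections import Counter
import re

# A handful of common function words -- habitual, hard-to-fake choices.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that",
                  "it", "was", "she", "but", "not", "with", "for"]

def style_profile(text: str) -> list[float]:
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def style_distance(text_a: str, text_b: str) -> float:
    """Mean absolute difference between two profiles; smaller values
    mean the two texts are stylistically closer."""
    pa, pb = style_profile(text_a), style_profile(text_b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(FUNCTION_WORDS)

# Hypothetical closed game: is the disputed text closer to the candidate's
# known corpus than to a reference corpus of other authors?
# if style_distance(disputed, known_corpus) < style_distance(disputed, reference):
#     print("attribution to the candidate is the more likely hypothesis")
```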
3.5 Psychological Profiling Through Textual Analysis

In 2013, inspired by Pennebaker's work on the psychological signature of Shakespeare, John Noecker, Michael Ryan, and Patrick Juola published a study of "Psychological profiling through textual analysis".34 This research presumes that the personality of an individual can be classified with the help of psychological profiles or patterns. Based on a typology suggested by Carl Gustav Jung in 1921,35 Katherine Briggs and Isabel Myers developed a classification of their own (the Myers-Briggs type indicator, MBTI),36 in which they classify individuals' preferences along four dichotomies: extraversion versus introversion, sensation versus intuition, thinking versus feeling, and perceiving versus judging. An individual can be, for instance, an ISTJ type: an introverted, sensing thinker who makes decisions quite quickly. Although the validity of this classification, as well as its reliance on questionnaires, is disputable, the Myers-Briggs indicator appears to be quite popular, especially in the USA, where it is used in counselling, team building, social-skill development, and other forms of coaching.

Noecker, Ryan, and Juola formulate a simple hypothesis: the writing style of an individual can serve as a measure of this individual's MBTI, and hence stylometric methods can be used to determine the type indicator. In other words, they propose that automated textual analysis can produce a psychological classification of the author of a given text. For their experiments, Noecker, Ryan, and Juola used a corpus of texts by Dutch authors whose MBTI is known (Luyckx's and Daelemans's Personae: A Corpus for Author and Personality Prediction from Text).37 Noecker, Ryan, and Juola state an average success rate of 75 %. They claim to detect the 'J' type (judging) and the 'F' type (feeling) quite well (91 % and 86 %). For the 'P' types, the perceivers, however, the method does not respond equally well (56 %).38 According to Myers and Briggs, the perceivers are those individuals who are willing to rethink their decisions and plans in favour of new information, those who act more spontaneously than others.

34 John Noecker, Michael Ryan and Patrick Juola, Psychological profiling through textual analysis, in: Literary & Linguistic Computing 28 (2013), p. 382–387, DOI: 10.1093/llc/fqs070.
35 Carl Gustav Jung, Psychologische Typen, Zürich 1921.
36 Cf. A Guide to the Isabel Briggs Myers Papers, online available at http://web.uflib.ufl.edu/spec/manuscript/guides/Myers.htm [last accessed: 30 Nov. 2015].
37 K. Luyckx and W. Daelemans, Personae: A Corpus for Author and Personality Prediction from Text, in: Proceedings of the 6th Language Resources and Evaluation Conference, Marrakech 2008.
38 Noecker/Ryan/Juola, Psychological profiling through textual analysis (see note 34), p. 385.
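Framed as code, the setup just described amounts to one binary text classifier per dichotomy. The sketch below uses a nearest-centroid rule over simple word frequencies – an illustrative reconstruction under stated assumptions, not the authors' implementation, which relies on more refined stylometric features:

```python
from collections import Counter
import re

def features(text: str) -> Counter:
    """Relative word frequencies as a crude stand-in for stylometric features."""
    words = re.findall(r"[a-z]+", text.lower())
    total = max(len(words), 1)
    return Counter({w: c / total for w, c in Counter(words).items()})

def centroid(texts: list[str]) -> Counter:
    """Average feature vector of a (non-empty) group of training texts."""
    acc = Counter()
    for t in texts:
        acc.update(features(t))
    return Counter({w: v / len(texts) for w, v in acc.items()})

def classify_pole(text: str, pole_a_texts: list[str], pole_b_texts: list[str]) -> str:
    """Assign a text to the closer pole of one dichotomy (e.g. 'J' vs. 'P');
    repeating this per dichotomy yields a full four-letter type."""
    f = features(text)
    ca, cb = centroid(pole_a_texts), centroid(pole_b_texts)
    vocabulary = set(f) | set(ca) | set(cb)
    dist_a = sum(abs(f[w] - ca[w]) for w in vocabulary)
    dist_b = sum(abs(f[w] - cb[w]) for w in vocabulary)
    return "pole_a" if dist_a < dist_b else "pole_b"
```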
Again, the texts that these methods are grounded in might be provided consciously and willingly, or unconsciously and unwillingly. Hence, the same moral issue of the use and reuse of scholarly methods arises here and needs to be discussed within the context of these usages. But what about the researcher who develops but does not necessarily apply this technology? In this case, Digital Humanities would play the role of an auxiliary science, providing services for others. As such an auxiliary science, it is tempting to argue that research is value-free, that its sole goal is the development of methods, and that only those who apply these methods have to consider moral consequences – whether they be literary scholars working on Agatha Christie or historians interested in the psychological profiling of historical leaders. However, as argued above, Wissenschaft and technology are never value-free. Everyone who develops something is responsible for considering the potential risks of its usage. Especially when Digital Humanities is understood as a discipline in its own right, these issues have to be addressed and discussed.

4 Elements of an Ethical Framework – Towards a Wissenschaftsethik for Digital Humanities

4.1 Fears of Media Change

With the rough definition of Digital Humanities elaborated above in mind, we next sketch out some of the changes underway during this computational turn. Media change has always been accompanied by anxiety and outspoken criticism. Well-known examples include Plato's critique of writing, which, he argued, led to a degeneration of the human capability of memorizing and, more importantly, of comprehension (in the Phaedrus dialogue); the invention of the printing press, which allowed limitless publication and was said to lead to moral decay; Nietzsche's trouble with the typewriter and how this technology changed his way of thinking;39 the "indoctrination or seize-over of the listener through very close spraying" of sounds by stereophonic headphones;40 and many others. More recently, the internet as a new medium has been criticized as leading towards superficiality and the decline of cognitive capabilities, as Nicholas Carr's rhetorical question "Is Google Making Us Stupid?" suggests.41

39 Robert Kunzmann, Friedrich Nietzsche und die Schreibmaschine, in: Archiv für Kurzschrift, Maschinenschreiben, Bürotechnik 3 (1982), p. 12–13.
40 "Psychoterror durch den Kunstkopf" ["psycho-terror through the dummy head"], quoted after Ralf Bülow, Vor 40 Jahren: Ein Kunstkopf für binaurale Stereophonie, in: heise online (31 August 2013), URL: http://heise.de/-1946286 [last accessed: 30 Nov. 2015].
41 Nicholas Carr, Is Google Making Us Stupid?, in: The Atlantic, 1 July 2008, online available at http://www.theatlantic.com/magazine/archive/2008/07/is-google-making-us-stupid/6868 [last accessed: 30 Nov. 2015].

Let us briefly look at some positive aspects of these changes: writing down knowledge allowed it to grow beyond the memory capability of a single person; the invention of the printing press led to a liberalisation of this knowledge; internet technology and open access might lead to a further democratization, de-imperialization, and de-canonization of knowledge. In the context of the latter, David Berry emphasizes the ubiquitous access to human knowledge,42 which reminds one of Vannevar Bush's memory-extension system Memex:
Technology enables access to the databanks of human knowledge from anywhere, disregarding and bypassing the traditional gatekeepers of knowledge in the state, the universities, and market. […] This introduces not only a moment of societal disorientation with individuals and institutions flooded with information, but also offer a computational solution to them in the form of computational rationalities, what Turing (1950) described as super-critical modes of thought.43

One may regard it as positive or negative,44 but changes in media have always been followed by a dismissal of the old "gatekeepers of knowledge": first the authorities of the classical age, the Christian church and the monasteries, then the publishing houses and governmental control in modern history. Progress dismissed them, but new gatekeepers succeeded them. In a way, Vannevar Bush's vision of memory extension by what we would now call a networked database of knowledge seems to have become reality. Not only do the various types of media converge; man and machine merge as well. Data and algorithms become more and more important for everyday life and work, and those who control these algorithms and "gatekeep" the data wield power. "Code is law", postulates Lawrence Lessig,45 and in the German newspaper DIE ZEIT, Gero von Randow follows this up and proclaims: "Who controls this process, rules the future".46 Apparently, this leaves the door open for manipulation and for mistakes.

4.2 The Sorcerer's Apprentice

The 1980s Czechoslovakian (children's) science-fiction TV series Návštěvníci (The Visitors)47 depicts a peaceful world in the year 2484. In this world, everything is in harmony until the Central Brain of Mankind, a computer, predicts the collision of an asteroid with the Earth, leading to the planet's destruction. The people completely and blindly rely on this Central Brain and start a mission to rescue mankind. The mission fails, and the people are about to evacuate the planet. Then an accidental traveller in time, from the year 1984, comes into this world. What he finds out is very simple: the people had built the machine (the Central Brain) on a crooked surface, which hence caused crooked predictions.

42 David M. Berry, Introduction, in: Understanding Digital Humanities, ed. by David M. Berry, Basingstoke 2012, p. 1–20.
43 Ibid., p. 8–9.
44 Andrew Keen is one to emphasize the negative impact of the vanishing of gatekeepers, because it led to a loss of trust and opens the door for manipulation and propaganda (Keen, Das digitale Debakel (see note 9), p. 184–185).
45 Lawrence Lessig, Code and Other Laws of Cyberspace, New York 1999.
46 "Wer diesen Prozess steuert, beherrscht die Zukunft". Gero von Randow, Zukunftstechnologie: Wer denkt in meinem Hirn?, in: DIE ZEIT, No. 11 (7 March 2014), online available at http://www.zeit.de/2014/11/verschmelzung-mensch-maschine-internet [last accessed: 30 Nov. 2015].
47 Návštěvníci (1981–1983), dir. Jindřich Polák.
The traveller put the Central Brain back into its upright position, from which it could correct its prediction (the Earth was not threatened), and the machine apologized for causing so much trouble. The visitor from a past time did one thing that the people of 2484 did not: he critically (one might say naïvely) approached the computer and challenged its functionality – a capability that the people of 2484 have lost or forgotten. They never thought of questioning the computer's prediction. The moral of this story is that, in the end, it is humans who have to answer for the consequences of actions. This is very much what Joseph Weizenbaum has told us: a computer can make decisions, he would argue, but it has no free choice.

In the future world of The Visitors, one single, central computer steers the fate of mankind. In our present age, it is the ubiquity of computing technology – computers are everywhere – that affects our daily lives. Philosopher Klaus Wiegerling discusses ubiquitous computing48 from an ethical perspective in ways that are highly relevant to the (Digital) Humanities.49 If systems, Wiegerling argues, acquire, exchange, process, and evaluate data on their own, then the materialization of information can no longer be comprehended by people. Personal identity, however, is formed through such comprehension, and making experiences (an important part of which is doubt or resistance) is essential for it. Hence, ubiquitous algorithms might lead to a loss of identity and of personal capabilities and competences. Like the people of 2484 in The Visitors, we start behaving like little children: incapable of determining reality correctly, losing our identity as acting subjects, and limiting our options for action. The "unfriendly takeover" by computers that technology critic Douglas Rushkoff fears for our present society50 has taken place in The Visitors, and it is only someone from the past who saves the present life of the future.

We need to engage more critically with the origin of our data and with the algorithms we are using. One need only look into a university classroom to observe the growing role that search engines and smartphone apps play in decision-making. A typical argument that one often hears is that some information comes 'from the internet'. That this information is not challenged (by questioning who provided it and when, with what intention, on the basis of which sources, etc.) illustrates a lack of information literacy. Additionally, the conclusion that because something 'comes from the internet' it has to be true (or at least valid) illustrates the danger of this attitude and of information illiteracy being abused. Consequently, new gatekeepers of knowledge might emerge all too easily. This incapacity for critical thinking can be observed more and more, from the classroom situation in Digital Humanities to scholarship in general and to society at large.

48 The term appeared around 1988. Cf. Mark Weiser, R. Gold and J. S. Brown, The Origins of Ubiquitous Computing Research at PARC in the Late 1980s, in: IBM Systems Journal 38/4 (1999), p. 693–696.
49 Klaus Wiegerling, Ubiquitous Computing, in: Handbuch Technikethik, ed. by Armin Grunwald, Stuttgart 2013, p. 374–378.
50 Douglas Rushkoff, Present Shock: When Everything Happens Now, New York 2013.
Crawford's observation about Big Data that "If the big-data fundamentalists argue that more data is inherently better, closer to the truth, then there is no point in their theology at which enough is enough"51 leads us to a position that ethicists would call the problem of the Sorcerer's Apprentice, named after Goethe's poem Der Zauberlehrling.52

The poem begins as an old sorcerer departs his workshop, leaving his apprentice with household chores to be done. Tired of fetching water with a pail, the apprentice enchants a broom to do the work for him – using magic for which he is not yet fully trained. The floor is soon awash with water, and the apprentice realizes that he does not know how to stop the broom:

Immer neue Güsse
bringt er schnell herein,
Ach, und hundert Flüsse
stürzen auf mich ein!

["Ever new torrents he quickly carries in – ah, and a hundred rivers pour down upon me!"]

The apprentice splits the broom in two with an axe, but each piece becomes a new broom of its own, takes up a pail, and continues fetching water, now at twice the speed.

Die ich rief, die Geister,
werd' ich nun nicht los

["The spirits that I summoned I now cannot get rid of."]

When all seems lost, the old sorcerer returns and quickly breaks the spell. The poem finishes with the old sorcerer's statement that powerful spirits should only be called by the master himself.

51 Crawford, The Anxieties of Big Data (see note 11).
52 I am grateful to my colleague Christian Thies, Professor of Philosophy at the University of Passau, for sharing his thoughts on this with me.

The analogy to the risks of Big Data is obvious: what initially was useful for handling a large amount of data might get out of control and start ruling us, taking away options that we once had – the normative power of the factual. At some point, we might have no choice anymore but to use data analysis or other computer-based methods for any kind of research in the Humanities. And what would happen if we no longer understood the data and the algorithms, and stopped challenging the machines, like the people of 2484? Wiegerling concludes that it is becoming more and more important today to pinpoint the options available for action, to make transparent the possibilities of intervening in autonomously operating systems, and to enlighten people about the functionality of these systems.53 This should be a core rationale of any training in Digital Humanities, and it is essential that we shape our tools before these tools shape us.54

4.3 Some General Thoughts on a Wissenschaftsethik for the Digital Humanities

New technologies have their good sides and their bad sides, depending on one's perspective. Every change brings forward winners and losers. The big ethical question is how to evaluate, and how to choose and justify, what we are doing. Philosopher Julian Nida-Rümelin has pointed out that for various areas of human conduct, different normative criteria might be appropriate, and that ethics cannot be reduced to one single system of moral rules and principles.55 As we are currently forming Digital Humanities as a discipline of its own, a definition of its own Wissenschaftsethik as a complementary counterpart to its theory of science seems timely. Theory and ethics together make up a philosophy of science. Their role is to clarify what exactly this Wissenschaft is (its ontological determination) and how Wissenschaft is capable of producing reliable knowledge.56 Ethics is part of philosophy and is regarded as a discipline that studies morals (as a noun),
i. e., normative – "moral" in the adjectival sense – systems, judgements, and principles. This is not the place to discuss any particular moral criteria. However, on a more general level, a framework from which these criteria, or a code of conduct, for Digital Humanities might be derived shall be outlined along three areas, following Hoyningen-Huene's systematization:57

1. Moral issues in specific fields of research and in close relation to the objects of study
2. Moral aspects of Wissenschaft as a profession
3. The responsibility of the individual scholar as well as of the scholarly community at large.

All these areas are relevant for Digital Humanities. The first area comes into play, for instance, when one deals with and analyses personal data. Many of the examples discussed above touch on this question. Consider authorship attribution and the case of Rowling. The researchers analyse text, but this text mediates an individual, who then becomes the object of study. Do we violate Rowling's right to privacy or anonymity? Should one (or not) ask this individual whether she objects to the investigation? If we are capable of inferring an individual's genetic disposition to certain diseases just by analysing historical records, should permission be required from this individual when the historical data of his ancestors is going to be made public through digitization?

Scientific and technological progress seems to go more and more hand in hand with an increasing readiness to take risks, as Ulrich Beck criticizes.58 He observes that there are hardly any taboos anymore, or that once-existing taboos are broken. Societal scruples seem to disappear, with the consequence that society increasingly accepts once-questionable conduct without opposition. Beck's observation applies not only to the use of technology but also to research as such. Moreover, this research, Beck argues, is taking place less and less inside the protected environment of a laboratory. Instead, the world as a whole is becoming a laboratory for research. For the objects that Beck discusses, for instance genetically modified plants, it is rather obvious how this 'world as laboratory' threatens the world as a whole. For the Humanities, it is less apparent. However, the tendency might indeed be the same. For instance, Big Data offers the possibility of studying the communication patterns and behaviours of people at large by analysing so-called social media such as Twitter. Unlike an experiment in a laboratory, where people are invited to participate as test subjects, the internet – the virtual world – becomes the new laboratory, where participation is often unwitting and involuntary. In a physical laboratory, we used to ask people for their permission to participate in an experiment (and usually paid them some compensation).

53 Wiegerling, Ubiquitous Computing (see note 49), p. 376.
54 Cf. Keen, Das digitale Debakel (see note 9), p. 20, in analogy to the famous Winston Churchill quote: "We shape our buildings; thereafter they shape us."
55 Julian Nida-Rümelin, Theoretische und angewandte Ethik: Paradigmen, Begründungen, Bereiche, in: Angewandte Ethik: Die Bereichsethiken und ihre theoretische Fundierung, ed. by Julian Nida-Rümelin, Stuttgart 1996, p. 3–85, here p. 63.
56 Thomas Reydon, Wissenschaftsethik, Stuttgart 2013, p. 16.
57 Cf. Reydon, Wissenschaftsethik (see note 56), p. 12–15. The fourth area, a Sozialphilosophie der Wissenschaft, is left out here. It addresses the interplay of Wissenschaft with society.
58 Ulrich Beck, Weltrisikogesellschaft, Frankfurt 2008.
In a physical laboratory, we used to ask people for their permission to participate in an experiment (and usually paid them some compensation). Should we not do the same and obtain informed consent when we regard the internet as a laboratory and use its data? Can we accept the fact that Tweeters become test persons in an experiment without even knowing it?59

The second area of ethics concerns moral aspects of Wissenschaft as a profession. We can divide the ethics of science into two dimensions: first, the internal dimension, which deals with issues affecting individuals within a given scholarly community and that community itself; and second, the external dimension, which deals with consequences for individuals outside this community – for the surrounding society, culture, and nature. Moral aspects of Wissenschaft as a profession belong to the first dimension. What is usually understood here is a code of good practice: work lege artis, do not fabricate, do not falsify, do not plagiarize, honour the work of others, give credit to all who supported you, name your co-authors, do not publish the same thing twice, and various other guidelines that many scholarly communities have given themselves.60

But it is more than that. In 1942, Robert Merton formulated four principles of what distinguishes good from bad science.61 He claimed that scholars should be guided only by the ethics of their profession and not by personal or social values. Between the 1920s and 1940s, he had observed that science was no longer developing autonomously and on its own, but that societal and political forces and their interests were significantly driving it. This led to a loss of trust in the objectivity of scientific results.

59 In fact, research of this kind has already been undertaken in a very problematic manner. Kramer, Guillory, and Hancock manipulated the “News Feeds” of 700,000 Facebook users to study the impact on their mood (Adam D. I. Kramer, Jamie E. Guillory and Jeffrey T. Hancock, Experimental Evidence of Massive-Scale Emotional Contagion through Social Networks, in: Proceedings of the National Academy of Sciences 111, No. 24 (17 June 2014), p. 8788–8790). Facebook was a partner in this experiment, provided access to personal data, and facilitated data manipulation. Informed consent had not been asked of the users. Many raised ethical concerns about this study, for instance in online comments to the publication (http://www.pnas.org/content/111/24/8788.full?sid=750ad790-21a1-4ebc-ba71-9dc0ac5af3d0 [last accessed: 30 Nov. 2015]) and in other media. In an opinion piece in the same journal, Kahn, Vayena, and Mastroianni ask, from a utilitarian view, whether the concept of informed consent “makes sense in social-computing research” and conclude that “best practices have yet to be identified” (Jeffrey P. Kahn, Effy Vayena and Anna C. Mastroianni, Opinion: Learning as We Go: Lessons from the Publication of Facebook’s Social-Computing Research, in: Proceedings of the National Academy of Sciences 111, No. 38 (23 September 2014), p. 13677–13679). A more critical opinion is expressed by Tufekci: Zeynep Tufekci, Engineering the Public: Big Data, Surveillance and Computational Politics, in: First Monday 19 (7 July 2014), DOI: 10.5210/fm.v19i7.4901.
60 For a general framework cf. for instance the memorandum of the Deutsche Forschungsgemeinschaft (1998/2013): Sicherung guter wissenschaftlicher Praxis. Empfehlungen der Kommission “Selbstkontrolle in der Wissenschaft” / Safeguarding Good Scientific Practice. Recommendations of the Commission on Professional Self Regulation in Science.
61 Robert Merton, The normative structure of science, in: The sociology of science: theoretical and empirical investigations, ed. by Robert Merton, Chicago 1973, p. 267–278.
Although Merton’s view on the exclusion of personal and social values no longer holds, today’s Wissenschaftssystem displays a number of characteristics similar to those Merton observed 70 years ago. These not only change the way we work; they also push our research in particular directions and steer and restrict our choices of research topics and methods. These characteristics of today’s Wissenschaftssystem include (among others): permanent pressure to acquire third-party funding, the “publish or perish” principle, a growing need to legitimize research, especially in the Humanities, international competition, and the demand to be “visible” as a researcher. It has to be discussed how these conditions affect the objectivity of our research, especially when, at the same time, a huge amount of data is conveniently at hand to produce analytical results quickly – faster than by traditional methods, but perhaps also less well grounded.

Merton’s principles from 1942 might still serve as guidance. In order to restore the legitimation of and trust in research, he demanded four principles:

1. Universalism: all research has to be measured against impersonal criteria, regardless of its origin. Only then can the best results be produced (a teleological criterion).
2. Communalism: all research is the result of a communal effort (which refers to Newton’s ‘standing on the shoulders of giants’), cannot remain private, and has to be published widely (the modern Open Access, Open Source, and Open Data movements build on this).
3. Selflessness: the behaviour of a researcher has to be guided only by the interest of the scientific community; it is his duty to produce reliable knowledge (a deontological criterion).
4. Organized skepticism: it is the duty of researchers to constantly question their own work and the work of others in order to produce the best possible results.

The latter is particularly important within an emerging field such as the Digital Humanities.

The third area of ethics is more abstract: it deals with the consequences of our research for the world in which we live. In the 17th century, Francis Bacon formulated his ideal of a Wissenschaft that should serve society in order to improve the living conditions of humankind. Science shall be subordinated – teleologically – to this higher good. In Bacon’s time, this aimed especially at understanding nature; knowledge would then empower mankind to master nature.62

62 Cf. Reydon, Wissenschaftsethik (see note 56), p. 82–83.

Scepticism about this view has been raised by many others, among them the philosopher Hans Jonas.63 Technology’s control over nature has become excessive, with the consequence that technology no longer leads towards improving living conditions but towards their destruction.
Jonas formulates an imperative of future ethics: “Handle so, daß die Wirkungen deiner Handlung verträglich sind mit der Permanenz echten menschlichen Lebens auf Erden”64 (“act only according to the maxim that the consequences of your action are in harmony with a permanent existence of true human life on Earth”; translation MR). Jonas does both: he criticizes and he extends (modernizes?) Immanuel Kant’s categorical imperative: “Act only according to that maxim whereby you can, at the same time, will that it should become a universal law without contradiction”. Jonas demands from each scholar the duty to take responsibility for future generations and to preserve what makes up “echtes menschliches Leben”, true human life. “True” indicates that the question of the permanent existence of life goes beyond mere biological existence and procreation; it is probably the Zeitgeist and the current value system of a society that define what “true human” life actually means. Liberty and privacy could be components of such a system in today’s Western world. Any research that threatens the continuity of these values would violate Jonas’ imperative.

For research undertaken in the Digital Humanities, questions like these may arise: How does our social behaviour change when we know that we cannot express ourselves without being monitored? What consequences would follow from this for society? What would a society look like in which the history of diseases and the dispositions of individuals could easily be detected on the basis of Open Access historical data? Is there a risk that we might create future generations in which values like the right to remain anonymous no longer exist – or is there not? And if there is, shall we take this risk, or better not? What measures shall we take to minimize it?

Jonas gives us advice when it comes to finding answers to these questions and hence to deciding among different options for action: he asks us to think of the worst-case scenario first. His heuristic is determined by fear (“Heuristik der Furcht”),65 and the principle of Jonas’ ethics is responsibility, especially for the future. I personally agree with this view and would like to establish the following: as long as the consequences of our research in the Digital Humanities are not sufficiently clear, we should be sensitive to the problems that might arise, we should be careful in our actions, and we as a community should at least have these discussions openly.

63 Hans Jonas, Das Prinzip Verantwortung, Frankfurt 1984.
64 Ibid., p. 36.
65 Ibid., p. 7–8.

5 Conclusion

Wissenschaftsethik refers to all moral and societal aspects of the practice of our Wissenschaft. Nevertheless, it can do no more than problematize and make the stakeholders of the Digital Humanities sensitive to moral questions. It can suggest different perspectives and set a framework within which arguments take place, but it cannot solve dilemmas. The decisions to be made are always up to the individual scholar or – in terms of a code of conduct – to the scholarly community: it’s our department.