How to Analyze Political Attention with Minimal Assumptions and Costs

Kevin M. Quinn, University of California, Berkeley
Burt L. Monroe, The Pennsylvania State University
Michael Colaresi, Michigan State University
Michael H. Crespin, University of Georgia
Dragomir R. Radev, University of Michigan

Previous methods of analyzing the substance of political attention have had to make several restrictive assumptions or been prohibitively costly when applied to large-scale political texts. Here, we describe a topic model for legislative speech, a statistical learning model that uses word choices to infer topical categories covered in a set of speeches and to identify the topic of specific speeches. Our method estimates, rather than assumes, the substance of topics, the keywords that identify topics, and the hierarchical nesting of topics. We use the topic model to examine the agenda in the U.S. Senate from 1997 to 2004. Using a new database of over 118,000 speeches (70,000,000 words) from the Congressional Record, our model reveals speech topic categories that are both distinctive and meaningfully interrelated and a richer view of democratic agenda dynamics than had previously been possible.

Author note: Kevin M. Quinn is Professor of Law, University of California, Berkeley, 490 Simon #7200, Berkeley, CA 94720-7200 (kquinn@law.berkeley.edu). Burt L. Monroe is Associate Professor of Political Science and Director of the Quantitative Social Science Initiative, The Pennsylvania State University, 230 Pond Lab, University Park, PA 16802-6200 (burtmonroe@psu.edu). Michael Colaresi is Associate Professor of Political Science, Michigan State University, 303 South Kedzie Hall, East Lansing, MI 48824 (colaresi@msu.edu). Michael H. Crespin is Assistant Professor of Political Science, University of Georgia, 407 Baldwin Hall, Athens, GA 30602 (crespin@uga.edu). Dragomir R. Radev is Associate Professor, School of Information and Department of Electrical Engineering and Computer Science, University of Michigan, 3310 EECS Building, 1301 Beal Avenue, Ann Arbor, MI 48109-2122 (radev@umich.edu). An earlier version of this article was presented to the Midwest Political Science Association and was awarded the 2006 Harold Gosnell Prize for Excellence in Political Methodology. We would like to thank Steven Abney, Scott Adler, Scott Ainsworth, Frank Baumgartner, Ken Bickers, David Blei, Jake Bowers, Janet Box-Steffensmeier, Patrick Brandt, Barry Burden, Suzie Linn, John Freeman, Ed Hovy, Will Howell, Simon Jackman, Brad Jones, Bryan Jones, Kris Kanthak, Gary King, Glen Krutz, Frances Lee, Bob Luskin, Chris Manning, Andrew Martin, Andrew McCallum, Iain McLean, Nate Monroe, Becky Morton, Stephen Purpura, Phil Schrodt, Gisela Sin, Betsy Sinclair, Michael Ward, John Wilkerson, Dan Wood, Chris Zorn, and seminar participants at UC Davis, Harvard University, the University of Michigan, the University of Pittsburgh, the University of Rochester, Stanford University, the University of Washington, and Washington University in St. Louis for their comments on earlier versions of the article.
We would like to give special thanks to Cheryl Monroe for her contributions toward development of the Congressional corpus in specific and our data collection procedures in general. We would also like to thank Jacob Balazer (Michigan) and Tony Fader (Michigan) for research assistance. In addition, Quinn thanks the Center for Advanced Study in the Behavioral Sciences for its hospitality and support. This article is based upon work supported by the National Science Foundation under grants BCS 05-27513 and BCS 07-14688. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of the National Science Foundation. Supplementary materials, including web appendices and a replication archive with data and R package, can be found at http://www.legislativespeech.org.

American Journal of Political Science, Vol. 54, No. 1, January 2010, Pp. 209-228. (c) 2010, Midwest Political Science Association. ISSN 0092-5853.

What are the subjects of political conflict and attention? How does the mix of topic attention change over time? How do we know? These questions are fundamental to much of political science, including studies of legislative representation (Lowi 1964; Mayhew 1974; Riker 1986), policy agenda change (Baumgartner, Green-Pedersen, and Jones 2006; Baumgartner and Jones 1993; Kingdon 1995), and issue evolution (Carmines and Stimson 1989; Wolbrecht 2000). Conventional approaches to the problem of identifying and coding topic attention have used trained human coders to read documents. The careful and systematic use of human-coder techniques has helped to produce impressive data collections such as the Policy Agendas and Congressional Bills projects in American politics (Adler and Wilkerson 2006; Jones, Wilkerson, and Baumgartner n.d.) and the Comparative Manifesto Project in comparative politics (Budge et al. 2001; Klingemann et al. 2006). The impact and usefulness of these data sources to political science are difficult to overstate. [Footnote 1: As outlined in the cited books and websites, each of these has inspired expansive research programs with books and papers too numerous to cite here.] The great benefit of human-coder techniques is that the mapping of words in a text to a topic category is allowed to be highly complicated and contingent. The downside of human-coder techniques is that reliability can be a challenge, per-document costs are generally high, and both the substance of topics and the rules that govern tagging documents with a specific topic must be assumed known a priori.

Related tasks in political science have also been addressed using computer-checked dictionaries or, more recently, hybrid human/computer ("supervised learning") techniques. For example, event data coding in international relations has benefited enormously from the automated coding of news wire feeds using dictionaries created by the Kansas Event Data System (Gerner et al. 1994), and the Policy Agendas and Congressional Bills Projects have moved toward the use of supervised learning techniques to supplement human coding (Hillard, Purpura, and Wilkerson 2007, 2008). When automated approaches substitute computers for humans, the costs of coding are reduced and the reliability is increased (King and Lowe 2003). As with human coding, however, dictionary methods and hybrid human/computer classification approaches both assume that the substance of topics and the features that identify a particular topic are known a priori.

Here, we describe a statistical method to topic-code political texts over time that provides a reliable and replicable mapping of words into topics. However, unlike most extant approaches, our method estimates both the keywords that identify particular topics and the division of topics from observed data, rather than assuming these features are known with certainty.
Previously, if a researcher was interested in tracking topic attention over time within a set of documents, that researcher needed to bring a great deal of information into the analysis. The researcher first needed to define the substance, number, and subdivisions of each topic. Second, the researcher was required to codify a set of rules or keywords that would allow human coders or a computer to place documents into the researcher-created taxonomy of topics. In contrast, our statistical method of topic-coding text does not require a researcher to know the underlying taxonomy of categories with certainty. Instead, the division of topics and keywords that identify each topic are estimated from the text. Our statistical topic-coding method opens up the exciting possibility of tracking attention within lengthy political corpora that would be prohibitively expensive for human coders. The only additional input required from the investigator is the total number of categories into which texts should be grouped.

To illustrate the usefulness of our approach, we use our statistical model to topic-code the Congressional Record for the 105th to the 108th U.S. Senate. The estimates provide (1) an ontology of topic categories and language choice and (2) a daily data series of attention to different topics in the U.S. Senate from 1997 to 2004. We believe this is the most extensive, temporally detailed map of legislative issue attention that has ever been systematically constructed. We evaluate the validity of our approach by examining (a) the extent to which there is common substantive meaning underlying the keywords within a topic, (b) the semantic relationships across topics, (c) the extent to which our daily measures of topic attention covary with roll calls and hearings on the topic of interest, (d) the relationships between our measures and exogenous events (such as 9/11 or the Iraq War) that are widely perceived to have shifted the focus of attention in particular ways, and (e) the usefulness of the produced data for testing hypotheses of substantive and theoretical interest.

Categorizing Texts: Methods, Assumptions, and Costs

Each method for analyzing textual content imposes its own particular set of assumptions and, as a result, has particular advantages and weaknesses for any given question or set of texts. We focus our attention here on the basic problem of categorizing texts—placing texts into discrete target categories or bins. [Footnote 2: An equally interesting problem is placing texts, or their authors, in a continuous space, the problem addressed by such techniques as WORDSCORES (Laver, Benoit, and Garry 2003; Lowe 2008), WordFish (Slapin and Proksch 2008), and rhetorical ideal point estimation (Monroe and Maeda 2004; Monroe et al. 2007).] Methods of text categorization vary along at least five dimensions: (1) whether they take the target categories as known or unknown, (2) whether the target categories have any known or unknown relationships with one another, (3) whether the relevant textual features (e.g., words, nouns, phrases, etc.) are known or unknown, (4) whether the mapping from features to categories is known or unknown, and (5) whether the categorization process can be performed algorithmically by a machine.
TABLE 1  A Summary of Common Assumptions and Relative Costs Across Different Methods of Discrete Text Categorization
(Columns: Reading | Human Coding | Dictionaries | Supervised Learning | Topic Model)

A. Assumptions
  Categories are known: No | Yes | Yes | Yes | No
  Category nesting, if any, is known: No | Yes | Yes | Yes | No
  Relevant text features are known: No | No | Yes | Yes | Yes
  Mapping is known: No | No | Yes | No | No
  Coding can be automated: No | No | Yes | Yes | Yes

B. Costs
  Preanalysis costs
    Person-hours spent conceptualizing: Low | High | High | High | Low
    Level of substantive knowledge: Moderate/High | High | High | High | Low
  Analysis costs
    Person-hours spent per text: High | High | Low | Low | Low
    Level of substantive knowledge: Moderate/High | Moderate | Low | Low | Low
  Postanalysis costs
    Person-hours spent interpreting: High | Low | Low | Low | Moderate
    Level of substantive knowledge: High | High | High | High | High

We are at pains, in particular, to describe how five ways of categorizing texts—reading, human coding, automated dictionaries, supervised learning, and the topic model we describe here—fill distinctive niches as tools for political science. Each of these five methods comes with unique costs and benefits. We find it useful to think of these costs along two main dimensions: (1) the extent to which the method requires detailed substantive knowledge and (2) the length of time it would take a single person to complete the analysis for a fixed body of text. Each of these two types of costs can be incurred at three stages of the analysis: the preanalysis phase where issues of conceptualization and operationalization are dealt with (perhaps in one or more pilot studies), the analysis phase where the texts of interest are categorized, and the postanalysis phase where the results from the analysis phase are interpreted and assessed for reliability and validity. Tables 1A and 1B depict how five major methods of text categorization compare in terms of their underlying assumptions and costs, respectively. The cell entries in Table 1A represent the minimal assumptions required by each method.

In the most general sense, the fundamental "method" for inferring meaning from text is reading. For example, one reader of a specific journal article might attempt to place that article into one of a set of substantive categories (e.g., legislative studies / agenda setting / methodology / text analysis), while another reader might categorize the text in terms of its relevance (cite / request more information / ignore). Not only might the relevant categories change by reader, but a given reader will create new categories as more information about the text becomes apparent. For some target sets of categories, we could delineate specific features of the text that make particular categories more likely. We can imagine that words like Congress or legislature make it more likely that we place an article under "legislative studies," that typesetting in LaTeX or multiple equations makes it more likely that we place it under "methodology," and so on. For other target concepts, the relevant features are more abstract. To place it in the "cite" bin, we might require that the text display features like importance and relevance. Different readers may disagree on the salient features and their presence or absence in any particular text. This is important for the promise of automation via algorithm. We all use search engines that are useful at helping us find articles that are topically relevant (Google Scholar, JSTOR) or influential (Social Science Citation Index), but we would be
more skeptical of an algorithm that attempted to tell us whether a given article should be cited in our own work or not.

As one might expect—since all automated methods require at least some human reading—the act of reading a text rests on fewer assumptions than other methods of text categorization. The number of topics is not necessarily fixed in advance, the relationships between categories are not assumed a priori, texts can be viewed holistically and placed in categories on a case-by-case basis, and there is no attempt to algorithmically specify the categorization process. This allows maximum flexibility. However, the flexibility comes with nontrivial costs, especially when one attempts to read large, politically relevant texts such as the British Hansard or the U.S. Congressional Record. More specifically, human reading of text requires moderate-to-high levels of substantive knowledge (the language of the text and some contextual knowledge are minimal but nontrivial requirements) and a great deal of time in person-hours per text. [Footnote 3: The Congressional Record currently contains over four billion words and produces another half million—about the length of War and Peace—a day.] Finally, condensing the information in a large text requires a great deal of thought, expertise, and good sense. Even in the best of situations, purely qualitative summaries of a text are often open to debate and highly contested.

Human coding (see, for instance, Ansolabehere, Snowberg, and Snyder 2003; Budge et al. 2001; Ho and Quinn 2008; Jones, Wilkerson, and Baumgartner n.d.; and Klingemann et al. 2006) is the standard methodology for content analysis, and for coding in general, in social science. For such manual coding, the target categories of interest are assumed to be known and fixed. Coders read units of text and attempt to assign one of a finite set of codes to each unit. If the target categories have any relationship to each other (e.g., nesting), it is assumed to be known. There is typically no requirement that the readers use any particular feature in identifying the target category, and the exact mapping from texts to categories is assumed unknown and never made explicit. One can tell, through reliability checking, whether two independent coders reach the same conclusion, but one cannot tell how they reached it. Manual coding is most useful when there are abundant human resources available, the target concepts are clearly defined a priori, but the mapping from texts to categories is highly complex and unknown ("I know it when I see it").

By using clearly defined, mutually exclusive, and exhaustive categories to structure the coding phase, human coding methods require less substantive knowledge than would be necessary in a deep reading of the texts. Nevertheless, the texts do still need to be read by a human (typically a research assistant) who is a competent reader of the language used in the texts. Further, some moderate contextual knowledge is required during this phase so that texts are interpreted in the proper context. While human coding is less costly than deep reading during the analysis phase, it has higher initial costs. In particular, arriving at a workable categorization scheme typically requires expert subject-matter knowledge and substantial human time.

The first steps toward automation can be found in dictionary-based coding, which easily carries the most assumptions of all methods here. Examples include Gerner et al.
(1994), Cary (1977), and Holsti, Brody, and North (1964). In dictionary-based coding, the analyst develops a list (a dictionary) of words and phrases that are likely to indicate membership in a particular category. A computer is used to tally up use of these dictionary entries in texts and determine the most likely category. [Footnote 4: One of the important early dictionary systems is the General Inquirer (Stone et al. 1966).] So, as with manual coding, target categories are known and fixed. Moreover, the relevant features—generally the words or phrases that comprise the dictionary lists—are known and fixed, as is the mapping from those features into the target categories. When these assumptions are met, dictionary-based coding can be fast and efficient.

As with human coding, dictionary methods have very high startup costs. Building an appropriate dictionary is typically an application-specific task that requires a great deal of deep application-specific knowledge and (oftentimes) a fair amount of trial and error. That said, once a good dictionary is built, the analysis costs are as low or lower than any competing method. A large number of texts can be processed quickly and descriptive numerical summaries can be easily generated that make interpretation and validity assessment relatively straightforward.

A more recent approach to automation in this type of problem is supervised learning (Hillard, Purpura, and Wilkerson 2007, 2008; Kwon, Hovy, and Shulman 2007; Purpura and Hillard 2006). Hand coding is done to a subset of texts that will serve as training data and to another subset of texts that serve as evaluation data (sometimes called "test data"). Machine-learning algorithms are then used to attempt to infer the mapping from text features to hand-coded categories in the training set. Success is evaluated by applying the inferred mapping to the test data and calculating summaries of out-of-sample predictive accuracy. Gains of automation are then realized by application to the remaining texts that have not been hand coded. There are a wide variety of possible algorithms and the field is growing. Again, note that target categories are assumed to be known and fixed. Some set of possibly relevant features must be identified, but the algorithm determines which of those are relevant and how they map into the target categories. Some algorithms restrict the mapping from text features to categories to take a parametric form while others are nonparametric. [Footnote 5: In Table 1A, we code the assumptions for the least stringent supervised learning techniques.]
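As a minimal illustration of this supervised workflow, the sketch below uses scikit-learn with a handful of invented hand-coded snippets (the texts, labels, and the choice of a naive Bayes classifier are ours for illustration, not drawn from the projects cited above): a mapping is learned from the training texts, checked against held-out test texts, and then applied to text that was never hand coded.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

# Invented hand-coded training and test data (labels are illustrative only)
train_texts = ["funding for rural schools and teachers",
               "troop deployment and military readiness",
               "student loans and classroom standards",
               "defense appropriations for the armed forces"]
train_labels = ["education", "defense", "education", "defense"]
test_texts = ["teacher pay and school construction",
              "weapons procurement for the navy"]
test_labels = ["education", "defense"]

vectorizer = CountVectorizer()
clf = MultinomialNB()
clf.fit(vectorizer.fit_transform(train_texts), train_labels)

# Evaluate the learned mapping on the held-out test texts ...
preds = clf.predict(vectorizer.transform(test_texts))
print(accuracy_score(test_labels, preds))

# ... then apply it to texts that were never hand coded
print(clf.predict(vectorizer.transform(["appropriations for elementary classrooms"])))
```

Real applications differ mainly in scale and in the richness of the feature sets and learning algorithms, but the train/evaluate/apply structure is the same.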
Since supervised learning methods require some human coding of documents to construct training and test sets, these methods have high startup costs that are roughly the same as human-coding methods. Where they fare much better than human-coding methods is in the processing of the bulk of the texts. Here, because the process is completely automated, a very large number of texts can be assigned to categories quite quickly.

In the same way that supervised learning attempts to use statistical techniques to automate the process of hand coding, our topic model attempts to automate the topic-categorization process of reading. The key assumption shared with reading, and not shared with hand coding, dictionary-based coding, or supervised learning, is that the target categories and their relationships with each other are unknown. The target categories—here, the topics that might be the subject of a particular legislative speech—are an object of inference. We assume that words are a relevant feature for revealing the topical content of a speech, and we assume that the mapping from words to topics takes a particular parametric form, described below. The topic model seeks to identify, rather than assume, the topical categories, the parameters that describe the mapping from words to topic, and the topical category for any given speech.

The topic-modeling approach used in this article has a very different cost structure than all methods mentioned so far. Whereas other methods typically require a large investment in the initial preanalysis stage (human coding, dictionary methods, supervised learning) and/or analysis stage (reading, human coding), our topic model requires very little time or substantive knowledge in these stages of the analysis. Where things are reversed is in the postanalysis phase where methods other than deep reading are relatively costless but where our topic model requires more time and effort (but no more substantive knowledge) than other methods. The nature of the costs incurred by the topic model become more apparent below.

A Model for Dynamic Multitopic Speech

The data-generating process that motivates our model is the following. On each day that Congress is in session a legislator can make speeches. These speeches will be on one of a finite number K of topics. The probability that a randomly chosen speech from a particular day will be on a particular topic is assumed to vary smoothly over time. At a very coarse level, a speech can be thought of as a vector containing the frequencies of words in some vocabulary. These vectors of word frequencies can be stacked together in a matrix whose number of rows is equal to the number of words in the vocabulary and whose number of columns is equal to the number of speeches. This matrix is our outcome variable. Our goal is to use the information in this matrix to make inferences about the topic membership of individual speeches. [Footnote 6: The model we describe below differs from the most similar topic models in the computational linguistics literature (Blei and Lafferty 2006; Blei, Ng, and Jordan 2003; Wang and McCallum 2006) in several particulars. Among these are the dynamic model, the estimation procedure, and, most notably, the nature of the mixture model. In other models, documents have a mixture of topical content. This is perhaps appropriate for complex documents, like scientific articles. In ours, documents have a single topic, but we are uncertain which topic. This is appropriate for political speeches. Ultimately, our assumption allows us to distinguish between, for example, a speech on defense policy that invokes oil, and a speech on energy policy that invokes Iraq.]

We begin by laying out the necessary notation. Let t = 1, ..., T index time (in days); d = 1, ..., D index speech documents; k = 1, ..., K index possible topics that a document can be on; and w = 1, ..., W index words in the vocabulary. For reasons that will be clearer later, we also introduce the function s : {1, ..., D} → {1, ..., T}. s(d) tells us the time period in which document d was put into the Congressional Record. In addition, let Δ^N denote the N-dimensional simplex.

The Sampling Density

The dth document y_d is a W-vector of nonnegative integers. The wth element of y_d, denoted y_{dw}, gives the number of times word w was used in document d. We condition on the total number n_d of words in document d and assume that, if y_d is from topic k,

    y_d ∼ Multinomial(n_d, θ_k).

Here θ_k ∈ Δ^{W−1} is the vector of multinomial probabilities with typical element θ_{kw}. One can think of θ_k as serving as a "prototype speech" on topic k in the sense that it is the most likely word-usage profile within a speech on this topic. This model will thus allow one to think about all the speeches in a dataset as being a mixture of K prototypes plus random error. We note in passing that a Poisson data-generating process also gives rise to the same multinomial model conditional on n_d.
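To make this data-generating process concrete, the following sketch simulates a single speech under the model just described. The numbers of topics, vocabulary size, topic proportions, and random seed are illustrative placeholders, not quantities from the article: a latent topic indicator is drawn for the speech, and its n_d word tokens are then drawn from the corresponding multinomial prototype θ_k.

```python
import numpy as np

rng = np.random.default_rng(0)

K, W = 3, 10          # illustrative numbers of topics and vocabulary stems
n_d = 200             # total word count of the speech (conditioned on)

# Topic proportions for the speech's day, pi_{s(d)}, and topic "prototypes" theta_k
pi_sd = np.array([0.5, 0.3, 0.2])
theta = rng.dirichlet(np.full(W, 1.01), size=K)   # rows are theta_1, ..., theta_K

# z_d ~ Multinomial(1, pi_{s(d)}): the single latent topic of the speech
z_d = rng.multinomial(1, pi_sd)
k = int(z_d.argmax())

# y_d ~ Multinomial(n_d, theta_k): the observed vector of stem counts
y_d = rng.multinomial(n_d, theta[k])

print(k, y_d, y_d.sum())   # topic index, stem counts, and n_d
```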
For purposes of in- terpretation, we will at some points below make use of 6The model we describe below differs from the most similar topic models in the computational linguistics literature (Blei and Laf- ferty 2006; Blei, Ng, and Jordan 2003; Wang and McCallum 2006) in several particulars. Among these are the dynamic model, the estimation procedure, and, most notably, the nature of the mix- ture model. In other models, documents have a mixture of topical content. This is perhaps appropriate for complex documents, like scientific articles. In ours, documents have a single topic, but we are uncertain which topic. This is appropriate for political speeches. Ultimately, our assumption allows us to distinguish between, for example, a speech on defense policy that invokes oil, and a speech on energy policy that invokes Iraq. 214 KEVIN M. QUINN ET AL. the transformation �k = ([ log ( �k1 �k1 ) − c ] , [ log ( �k2 �k1 ) − c ] , . . . , [ log ( �k W �k1 ) − c ])′ where c = W−1 ∑W w=1 log( �kw �k1 ). If we let �t k denote the marginal probabilities that a randomly chosen document is generated from topic k in time period t , we can write the sampling density for all of the observed documents as p(Y | �, �) ∝ D∏ d=1 K∑ k=1 �s (d )k W∏ w=1 � ydw kw . As will become apparent later, it will be useful to write this sampling density in terms of latent data z1, . . . , zD . Here zd is a K -vector with element zd k equal to 1 if doc- ument d was generated from topic k and 0 otherwise. If we could observe z1, . . . , zD we could write the sampling density above as p(Y, Z | �, �) ∝ D∏ d=1 K∏ k=1 ( �s (d )k W∏ w=1 � ydw kw )zd k . The Prior Specification To complete a Bayesian specification of this model we need to determine prior distributions for � and �. We as- sume a semiconjugate Dirichlet prior for �. More specif- ically, we assume �k ∼ Dirichlet (�k ) k = 1, . . . , K . For the data analysis below we assume that �kw = 1.01 for all k and w. This corresponds to a nearly flat prior over �k . This prior was chosen before looking at the data. The prior for � is more complicated. Let �t ∈ �K −1 denote the vector of topic probabilities at time t . The model assumes that a priori zd ∼ Multinomial(1, �s (d )). We reparameterize to work with the unconstrained �t = ( log [ �t 1 �t K ] , . . . , log [ �t (K −1) �t K ])′ . In order to capture dynamics in �t and to borrow strength from neighboring time periods, we assume that �t follows a Dynamic Linear Model (DLM; Cargnoni, Müller, and West 1997; West and Harrison 1997). Specifically, �t = F′t �t + �t �t ∼ N (0, Vt ) t = 1, . . . , T (1) �t = Gt �t−1 + t t ∼ N (0, Wt ) t = 1, . . . , T (2) Here equation (1) acts as the observation equation and equation (2) acts as the evolution equation. We finish this prior off by assuming prior distributions for Vt , Wt , and �0. Specifically, we assume Wt = W for all t and Vt = V for all t in which Congress was in session with V and W both diagonal and Vi i ∼ InvGa mma (a0/2, b0/2) ∀i Wi i ∼ InvGa mma (c 0/2, d0/2) ∀i We assume �0 ∼ N (m0, C0). In what follows, we assume a0 = 5, b0 = 5, c 0 = 1, d0 = 1, m0 = 0, and C0 = 25I. For days in which Congress was not in session we assume that Vt = 10I. We have found that this helps prevent oversmoothing. We note that our substantive results are not terribly sensitive to other, more diffuse, priors for Vi i and Wi i . In a web appendix we detail how models fit with a0 = b0 = c 0 = d0 = 1 and a0 = c 0 = 1, b0 = d0 = 10 produce extremely similar results. 
While we adopt a fairly simple model for the dynamics in the Senate data, the DLM framework that we make use of is extremely general. Details of the Expectation Conditional Maximization (ECM) algorithm used to fit this model are provided in the web appendix. Model fitting takes between 20 minutes and three hours depending on the quality of the starting values and the speed of the computer. No specialized hardware is required.

Viewed as a clustering/classification procedure, the model above is designed for "unsupervised" clustering. At no point does the user pretag documents as belonging to certain topics. As we will demonstrate below in the context of Senate speech data, our model, despite not using user-supplied information about the nature of the topics, produces topic labelings that adhere closely to generally recognized issue areas. While perhaps the greatest strength of our method is the fact that it can be used without any manual coding of documents, it can also be easily adapted for use in semisupervised fashion by constraining some elements of Z to be 0 and 1. It is also possible to use the model to classify documents that were not in the original dataset used to fit the model.

Applying the Topic Model to U.S. Senate Speech, 1997–2004

We present here an analysis of speech in the U.S. Senate, as recorded in the Congressional Record, from 1997 to 2004 (the 105th to the 108th Congresses). In this section, we briefly describe how we process the textual data to serve as input for the topic model and then discuss the specification of the model for this analysis.

Senate Speech Data

The textual data are drawn from the United States Congressional Speech Corpus (Monroe et al. 2006) developed under the Dynamics of Political Rhetoric and Political Representation Project (http://www.legislativespeech.org). [Footnote 7: Corpus (plural corpora) is a linguistic term meaning a textual database.] The original source of the data is the html files that comprise the electronic version of the (public domain) United States Congressional Record, served by the Library of Congress on its THOMAS system (Library of Congress n.d.) and generated by the Government Printing Office (United States Government Printing Office n.d.). These html files correspond (nearly) to separately headed sections of the Record. We identify all utterances by an individual within any one of these sections, even if interrupted by other speakers, as a "speech," and it is these speeches that constitute the document set we model. For the eight-year period under study, there are 118,065 speeches (D) so defined.

The speeches are processed to remove (most) punctuation and capitalization, and then all words are stemmed. [Footnote 8: A word's stem is its root, to which affixes can be added for inflection (vote to voted) or derivation (vote to voter). Stemming provides considerable efficiency gains, allowing us to leverage the shared topical meaning of words like abort, aborts, aborted, aborting, abortion, abortions, abortionist, and abortionists instead of treating the words as unrelated. An algorithm that attempts to reduce words to stems is a stemmer. We use the Porter Snowball II stemmer (for English), widely used in many natural language processing applications (Porter 1980, n.d.).] There are over 150,000 unique stems in the vocabulary of the Senate over this eight-year period, most of which are unique or infrequent enough to contain little information. For the analysis we present here, we filter out all stems that appear in less than one-half of 1% of speeches, leaving a vocabulary of 3,807 (W) stems for this analysis. This produces a 118,065 × 3,807 input matrix of stem counts, which serves as the input to the topic model. This matrix contains observations of just under 73 million words. [Footnote 9: Details of the process are provided in the web appendix.]
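As a small illustration of what stemming does, the snippet below runs an English Snowball stemmer (here NLTK's implementation; any Porter-family stemmer behaves similarly) on several of the abortion-related forms mentioned in the footnote above.

```python
from nltk.stem.snowball import SnowballStemmer  # requires the nltk package

stemmer = SnowballStemmer("english")
forms = ["abort", "aborts", "aborted", "aborting", "abortion", "abortions"]

# Print the stem each surface form maps to; forms that share a stem are
# pooled into a single column of the speech-by-stem count matrix.
print({form: stemmer.stem(form) for form in forms})
```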
8A word’s stem is its root, to which affixes can be added for in- flection (vote to voted) or derivation (vote to voter). Stemming provides considerable efficiency gains, allowing us to leverage the shared topical meaning of words like abort, aborts, aborted, aborting, abortion, abortions, abortionist, and abortionists instead of treating the words as unrelated. An algorithm that attempts to reduce words to stems is a stemmer. We use the Porter Snowball II stemmer (for English), widely used in many natural language processing appli- cations (Porter 1980, n.d.). 9Details of the process are provided in the web appendix. Model Output The model contains millions of parameters and latent variables. We can focus on two subsets of these as defining the quantities of substantive interest, the �’s and the z’s. The � matrix contains K × W(≈ 160, 000) param- eters. Each element �kw of this matrix describes the log- odds of word w being used to speak about topic k. If �kw > �kw′ it is the case that word w is used more often on topic k than word w′. This is the source of the se- mantic content, the meaning, in our model. That is, we use this to learn what each topic is about and how top- ics are related to one another. � describes the intratopic data-generating process, so it can be used to generate new “speeches” (with words in random order) on any topic. It can also be used, in conjunction with the other model pa- rameters, to classify other documents. This is useful either for sensitivity analysis, as noted below, or for connecting the documents from some other setting (newspaper arti- cles, open-ended survey responses) to the topical frame defined by this model. Z is a D × K matrix with typical element zd k . Each of the approximately 5,000,000 zd k values is a 0/1 indicator of whether document d was generated from topic k. The model-fitting algorithm used in this article returns the expected value of Z which we label Ẑ. Because of the 0/1 nature of each zd k , we can interpret ẑd k (the expected value of zd k ) as the probability that document d was generated from topic k. We find that approximately 94% of documents are more than 95% likely to be from a single topic. Thus, we lose very little information by treating the maximum zd k in each row as an indicator of “the topic” into which speech d should be classified, reducing this to D (118,000) parameters of direct interest. Since we know when and by whom each speech was delivered, we can generate from this measures of attention (word count, speech count) to each topic at time scales as small as by day, and for aggregations of the speakers (parties, state delegations, etc.). It is also possible to treat ẑd as a vector of topic probabilities for document d and to then probabilistically assign documents to topics. Model Specification and Sensitivity Analysis We fit numerous specifications of the model outlined in the third section to the 105th–108th Senate data. In particular, we allowed the number of topics K to vary from 3 to 60. For each specification of K we fit several models using different starting values. Mixture models, such as that used here, typically exhibit a likelihood surface that is multimodal. Since the ECM algorithm used to fit the 216 KEVIN M. QUINN ET AL. model is only guaranteed to converge to a local mode, it is typically a good idea to use several starting values in order to increase one’s chances of finding the global optimum. 
Model Specification and Sensitivity Analysis

We fit numerous specifications of the model outlined in the third section to the 105th–108th Senate data. In particular, we allowed the number of topics K to vary from 3 to 60. For each specification of K we fit several models using different starting values. Mixture models, such as that used here, typically exhibit a likelihood surface that is multimodal. Since the ECM algorithm used to fit the model is only guaranteed to converge to a local mode, it is typically a good idea to use several starting values in order to increase one's chances of finding the global optimum.

We applied several criteria to the selection of K, which must be large enough to generate interpretable categories that have not been overaggregated and small enough to be usable at all. Our primary criteria were substantive and conceptual. We set a goal of identifying topical categories that correspond roughly to the areas of governmental competence typically used to define distinct government departments/ministries or legislative committees, such as "Education," "Health," and "Defense." This is roughly comparable to the level of abstraction in the 19 major topic codes of the Policy Agendas Project, while being a bit more fine-grained than the 10 major categories in Rohde's roll-call dataset (Rohde 2004) and more coarse than the 56 categories in the Comparative Manifestos Project. Conceptually, for us, a genuine topic sustains discussion over time (otherwise it is something else, like a proposal, an issue, or an event) and across parties (otherwise it is something else, like a frame). With K very small, we find amorphous categories along the lines of "Domestic Politics," rather than "Education"; as K increases, we tend to get divisions into overly fine subcategories ("Elementary Education"), particular features ("Education Spending"), or specific time-bound debates ("No Child Left Behind"). Results matching our criteria, and similar to each other, occur at K in the neighborhood of 40-45. We present here results for the K = 42 model with the highest maximized log posterior. A series of sensitivity analyses are available in the web appendix.

Reliability, Validity, Interpretation, and Application

This is a measurement model. The evaluation of any measurement is generally based on its reliability (can it be repeated?) and validity (is it right?). Embedded within the complex notion of validity are interpretation (what does it mean?) and application (does it "work"?).

Complicating matters, we are here developing multiple measures simultaneously: the assignment of speeches to topics, the topic categories themselves, and derived measures of substantive concepts, like attention. Our model has one immediate reliability advantage relative to human and human-assisted supervised learning methods. The primary feature of such methods that can be assessed is the human-human or computer-human intercoder reliability in the assignment of documents to the given topic frame, and generally 70-90% (depending on index and application) is taken as a standard. Our approach is 100% reliable, completely replicable, in this regard.

More important are notions of validity. There are several concepts of measurement validity that can be considered in any content analysis. [Footnote 10: The most common is, of course, face validity. Face validity is inherently subjective, generally viewed as self-evident by authors and with practiced skepticism by readers. We believe the results from the model as applied to the Congressional Record (see below) demonstrate significant face validity. But, by definition, there are no external criteria one can bring to bear on the issue of face validity and thus we focus on several other types of validity.] We focus here on the five basic types of external or criterion-based concepts of validity. First, the measures of the topics themselves and their relationships can be evaluated for semantic validity (the extent to which each category or document has a coherent meaning and the extent to which the categories are related to one another in a meaningful way). This speaks directly to how the β matrix can be interpreted.
Then, the derived measures of attention can be evaluated for convergent construct validity (the extent to which the measure matches existing measures that it should match), discriminant construct validity (the extent to which the measure departs from existing measures where it should depart), predictive validity (the extent to which the measure corresponds correctly to external events), and hypothesis validity (the extent to which the measure can be used effectively to test substantive hypotheses). The last of these speaks directly to the issue of how the Z matrix can be applied.

Topic Interpretation and Intratopic Semantic Validity

Table 2 provides our substantive labels for each of the 42 clusters, as well as descriptive statistics on relative frequency in the entire dataset. We decided on these labels after examining β̂_k and also reading a modest number of randomly chosen documents that were assigned a high probability of being on topic k, for k = 1, ..., K. This process also informs the semantic validity of each cluster. Krippendorff (2004) considers this the most relevant form of validity for evaluating a content analysis measure. We discuss these procedures in turn.

In order to get a sense of what words tended to distinguish documents on a given topic k from documents on other topics, we examined both the magnitude of β̂_{kw} for each word w as well as the weighted distance of β̂_{kw} from the center of the β̂ vectors other than β̂_{kw} (denoted β̂_{−kw}). The former provides a measure of how often word w was used in topic k documents relative to other words in topic k documents. A large positive value of β̂_{kw} means that word w appeared quite often in topic k documents.

TABLE 2  Topic Labels and Descriptive Statistics for 42-Topic Model
(Each entry gives the topic label, the percentage of documents / percentage of word stems, and clarifying notes where applicable.)

1. Judicial Nominations (1.0/2.4)
2. Supreme Court / Constitutional (1.1/3.0): incl. impeachment, DOJ, marriage, flag-burning
3. Campaign Finance (0.9/2.4)
4. Abortion (0.5/1.1)
5. Law & Crime 1 [Violence/Drugs] (1.3/1.8): violence, drug trafficking, police, prison
6. Child Protection (0.9/2.6): tobacco, alcohol, drug abuse, school violence, abuse
7. Health 1 [Medical] (1.5/2.4): emph. disease, prevention, research, regulation
8. Social Welfare (2.0/2.8)
9. Education (1.8/4.6)
10. Armed Forces 1 [Manpower] (1.0/1.5): incl. veterans' issues
11. Armed Forces 2 [Infrastructure] (2.3/3.0): incl. bases and civil defense
12. Intelligence (1.4/3.9): incl. terrorism and homeland security
13. Law & Crime 2 [Federal] (1.8/2.7): incl. the FBI, immigration, white-collar crime
14. Environment 1 [Public Lands] (2.2/2.5): incl. water management, resources, Native Americans
15. Commercial Infrastructure (2.0/2.9): incl. transportation and telecom
16. Banking and Finance (1.1/3.1): incl. corporations, small business, torts, bankruptcy
17. Labor 1 [Workers, esp. Retirement] (1.0/1.5): emph. conditions and benefits, esp. pensions
18. Debt / Deficit / Social Security (1.7/4.6)
19. Labor 2 [Employment] (1.4/4.5): incl. jobs, wages, general state of the economy
20. Taxes (1.1/2.7): emph. individual taxation, incl. income and estate
21. Energy (1.4/3.3): incl. energy supply and prices, environmental effects
22. Environment 2 [Regulation] (1.1/2.8): incl. pollution, wildlife protection
23. Agriculture (1.2/2.5)
24. Foreign Trade (1.1/2.4)
25. Procedural 3 [Legislation 1] (2.0/2.8)
26. Procedural 4 [Legislation 2] (3.0/3.5)
27. Health 2 [Economics—Seniors] (1.0/2.6): incl. Medicare and prescription drug coverage
28. Health 3 [Economics—General] (0.8/2.3): incl. provision, access, costs
29. Defense [Use of Force] (1.4/3.7): incl. wars/interventions, Iraq, Bosnia, etc.
30. International Affairs [Diplomacy] (1.9/3.0): incl. human rights, organizations, China, Israel, etc.
31. International Affairs [Arms Control] (0.9/2.3): incl. treaties, nonproliferation, WMDs
32. Symbolic [Tribute—Living] (1.9/1.3)
33. Symbolic [Tribute—Constituent] (3.2/1.9)
34. Symbolic [Remembrance—Military] (2.3/1.9): incl. tributes to other public servants, WWII Memorial
35. Symbolic [Remembrance—Nonmilitary] (2.4/2.3)
36. Symbolic [Congratulations—Sports] (0.6/0.4)
37. Jesse Helms re: Debt (0.5/0.1): almost daily deficit/debt 'boxscore' speeches
38. Gordon Smith re: Hate Crime (0.4/0.1): almost daily speeches on hate crime
39. Procedural 1 [Housekeeping 1] (20.4/1.5)
40. Procedural 5 [Housekeeping 3] (15.5/1.0)
41. Procedural 6 [Housekeeping 4] (6.5/1.6)
42. Procedural 2 [Housekeeping 2] (2.4/0.8)

The weighted distance of β̂_{kw} from the center of the β̂_{−kw}, which we operationalize as

    r_{kw} = ( β̂_{kw} − median_{j≠k}(β̂_{jw}) ) / MAD_{j≠k}(β̂_{jw}),

where MAD represents the median absolute deviation, provides a measure of how distinctive the usage of word w is on topic k documents compared to other documents. To take an example, the word the always has a very high β value, as it is very frequently used. However, it is used roughly similarly across all of the topics, so its value of r is generally quite close to 0. We combine these measures by ranking the elements of β̂_k and r_k and adding the ranks for each word w. This combined index gives us one measure of how distinctive word w is for identifying documents on topic k.
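A compact version of this keyword scoring can be written directly from the definitions above. In the sketch below, beta_hat is a stand-in array rather than the fitted β̂ matrix; the function returns the indices of the top key stems for a topic by combining the rank of β̂_{kw} with the rank of r_{kw}.

```python
import numpy as np

rng = np.random.default_rng(3)
K, W = 42, 3807
beta_hat = rng.normal(size=(K, W))                   # stand-in for the estimated beta matrix

def topic_keys(beta_hat, k, top=10):
    others = np.delete(beta_hat, k, axis=0)          # beta_hat rows for topics j != k
    med = np.median(others, axis=0)
    mad = np.median(np.abs(others - med), axis=0)    # median absolute deviation across other topics
    r_k = (beta_hat[k] - med) / mad                  # distinctiveness of each word for topic k
    # Combined index: rank on beta_kw plus rank on r_kw, with larger values ranked higher
    score = beta_hat[k].argsort().argsort() + r_k.argsort().argsort()
    return np.argsort(-score)[:top]                  # indices of the top key stems

print(topic_keys(beta_hat, k=0))
```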
Table 3 provides the top keys for each topic. [Footnote 11: Longer lists of keywords and index values are provided in the web appendix.] Inspection of these tables produced rough descriptive labels for all of the clusters. After arriving at these rough labels we went on to read a number of randomly chosen speech documents that were assigned to each cluster. In general we found that, with the exception of the procedural categories, the information in the keywords (Table 3, extended) did an excellent job describing the documents assigned to each (substantive) topic. However, by reading the documents we were able to discover some nuances that may not have been apparent in the tables of β̂ values, and those are reflected in the topic labels and clarifying notes of Table 2.

In general, the clusters appear to be homogeneous and well defined. Our approach is particularly good at extracting the primary meaning of a speech, without being overwhelmed by secondary mentions of extraneous topics. For example, since 9/11, the term terrorism can appear in speeches on virtually any topic from education to environmental protection, a fact that undermines information retrieval through keyword search. [Footnote 12: The reader can confirm this by searching on the word terrorism in THOMAS.] It is worth noting that this technique will extract information about the centroid of a cluster's meaning and lexical use. There will be speeches that do not fall comfortably into any category, but which are rare enough not to demand their own cluster. [Footnote 13: As noted above, about 94% of all documents have a better than 95% chance of being generated from a single topic, over 97% of documents have a better than 75% chance of being generated from a single topic, and over 99% have a better than 50% chance of being generated from a single topic. The bulk of the documents that were not clearly on a single topic have high probabilities of being from two or more "procedural" categories and are thus clearly on some procedural topic.]

Reading some of the raw documents also revealed some additional meaning behind the clusters. For instance, two of the clusters with superficially uninformative keywords turn out to be composed exclusively of pro forma "hobby horse" statements by Senator Jesse Helms about the current level of national debt and by Senator Gordon Smith about the need for hate crime legislation.

The β parameters identify words that, if present, most distinguish a document of this topic from all others, for the time period under study and for the Senate as a whole. Our approach does not demand that all legislators talk about all topics in the same way. To the contrary, there is typically both a common set of terms that identifies a topic at hand (as shown in Table 3) and a set of terms that identifies particular political (perhaps partisan) positions, points of view, frames, and so on, within that topic. For example, Table 3 lists the top 10 keys for Judicial Nominations (nomine, confirm, nomin, circuit, hear, court, judg, judici, case, vacanc), all of which are politically neutral references to the topic that would be used by speakers of both parties. Within these topically defined speeches, we can define keys that are at any given time (here the 108th) the most Democratic (which include republican, white, hous, presid, bush, administr, lifetim, appoint, pack, controversi, divis) or the most Republican (which include filibust, unfair, up-or-down, demand, vote, qualifi, experi, distinguish), clearly reflecting the partisan split over Bush appointees and Democratic use of the filibuster to block them. [Footnote 14: These words all appear among the top keys using any of the variance-respecting feature selection techniques described in Monroe, Colaresi, and Quinn (2008). This includes the simplest method, roughly equivalent to ranking words by z-scores in a multinomial model of word choice with party as the only covariate, and a more computationally complex method based on regularization (a technique designed to reduce noise in such feature selection problems).]

Relationships between Topics and Metatopic Semantic Validity

An important feature of the topic model, another sharp contrast with other approaches, is that the β matrix is an estimate of the relationship between each word in the vocabulary and each topical cluster. As a result, we can examine the semantic relationships within and across groups of topics. Given the more than 150,000 parameters in the β matrix, there are many such relationships one might investigate. Here we focus on how the topics relate to each other as subtopics of larger metatopics, how they aggregate. The coherent meaning of the metatopics we find is further evidence of the semantic validity of the topic model as applied to the Congressional Record. This type of validation has not been possible with other approaches to issue coding.

One approach to discovering relationships among the 42 topics is agglomerative clustering of the β vectors, β̂_1, ..., β̂_42, by topic. Agglomerative clustering begins by assigning each of the 42 vectors to its own unique cluster. The two vectors that are closest to each other (by Euclidean distance) are then merged to form a new cluster. This process is repeated until all vectors are merged into a single cluster.
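A sketch of this procedure, using SciPy's hierarchical clustering routines on a stand-in β̂ matrix with placeholder topic labels, is given below. Complete linkage is used to match the criterion described in the notes to Figure 1 below (merging so as to minimize the maximum Euclidean distance between cluster members).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
K, W = 42, 3807
beta_hat = rng.normal(size=(K, W))             # stand-in for the K estimated beta vectors
labels = [f"Topic {k + 1}" for k in range(K)]  # placeholder topic labels

# Complete linkage repeatedly merges the two clusters whose worst-case (maximum)
# pairwise Euclidean distance is smallest, until a single cluster remains
Z = linkage(beta_hat, method="complete", metric="euclidean")

dendrogram(Z, labels=labels, leaf_rotation=90)
plt.ylabel("Height")
plt.tight_layout()
plt.show()
```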
TABLE 3  Topic Keywords for 42-Topic Model
(For each topic, the top 10 or so key stems that best distinguish the topic from all others, sorted by rank(β_{kw}) + rank(r_{kw}) as defined in the text. The topic order is the same as in Table 2, with topic names shortened. Lists of the top 40 keywords for each topic and related information are provided in the web appendix.)

1. Judicial Nominations: nomine, confirm, nomin, circuit, hear, court, judg, judici, case, vacanc
2. Constitutional: case, court, attornei, supreme, justic, nomin, judg, m, decis, constitut
3. Campaign Finance: campaign, candid, elect, monei, contribut, polit, soft, ad, parti, limit
4. Abortion: procedur, abort, babi, thi, life, doctor, human, ban, decis, or
5. Crime 1 [Violent]: enforc, act, crime, gun, law, victim, violenc, abus, prevent, juvenil
6. Child Protection: gun, tobacco, smoke, kid, show, firearm, crime, kill, law, school
7. Health 1 [Medical]: diseas, cancer, research, health, prevent, patient, treatment, devic, food
8. Social Welfare: care, health, act, home, hospit, support, children, educ, student, nurs
9. Education: school, teacher, educ, student, children, test, local, learn, district, class
10. Military 1 [Manpower]: veteran, va, forc, militari, care, reserv, serv, men, guard, member
11. Military 2 [Infrastructure]: appropri, defens, forc, report, request, confer, guard, depart, fund, project
12. Intelligence: intellig, homeland, commiss, depart, agenc, director, secur, base, defens
13. Crime 2 [Federal]: act, inform, enforc, record, law, court, section, crimin, internet, investig
14. Environment 1 [Public Lands]: land, water, park, act, river, natur, wildlif, area, conserv, forest
15. Commercial Infrastructure: small, busi, act, highwai, transport, internet, loan, credit, local, capit
16. Banking / Finance: bankruptci, bank, credit, case, ir, compani, file, card, financi, lawyer
17. Labor 1 [Workers]: worker, social, retir, benefit, plan, act, employ, pension, small, employe
18. Debt / Social Security: social, year, cut, budget, debt, spend, balanc, deficit, over, trust
19. Labor 2 [Employment]: job, worker, pai, wage, economi, hour, compani, minimum, overtim
20. Taxes: tax, cut, incom, pai, estat, over, relief, marriag, than, penalti
21. Energy: energi, fuel, ga, oil, price, produce, electr, renew, natur, suppli
22. Environment 2 [Regulation]: wast, land, water, site, forest, nuclear, fire, mine, environment, road
23. Agriculture: farmer, price, produc, farm, crop, agricultur, disast, compact, food, market
24. Trade: trade, agreement, china, negoti, import, countri, worker, unit, world, free
25. Procedural 3: mr, consent, unanim, order, move, senat, ask, amend, presid, quorum
26. Procedural 4: leader, major, am, senat, move, issu, hope, week, done, to
27. Health 2 [Seniors]: senior, drug, prescript, medicar, coverag, benefit, plan, price, beneficiari
28. Health 3 [Economics]: patient, care, doctor, health, insur, medic, plan, coverag, decis, right
29. Defense [Use of Force]: iraq, forc, resolut, unit, saddam, troop, war, world, threat, hussein
30. International [Diplomacy]: unit, human, peac, nato, china, forc, intern, democraci, resolut, europ
31. International [Arms]: test, treati, weapon, russia, nuclear, defens, unit, missil, chemic
32. Symbolic [Living]: serv, hi, career, dedic, john, posit, honor, nomin, dure, miss
33. Symbolic [Constituent]: recogn, dedic, honor, serv, insert, contribut, celebr, congratul, career
34. Symbolic [Military]: honor, men, sacrific, memori, dedic, freedom, di, kill, serve, soldier
35. Symbolic [Nonmilitary]: great, hi, paul, john, alwai, reagan, him, serv, love
36. Symbolic [Sports]: team, game, plai, player, win, fan, basebal, congratul, record, victori
37. J. Helms re: Debt: hundr, at, four, three, ago, of, year, five, two, the
38. G. Smith re: Hate Crime: of, and, in, chang, by, to, a, act, with, the, hate
39. Procedural 1: order, without, the, from, object, recogn, so, second, call, clerk
40. Procedural 5: consent, unanim, the, of, mr, to, order, further, and, consider
41. Procedural 6: mr, consent, unanim, of, to, at, order, the, consider, follow
42. Procedural 2: of, mr, consent, unanim, and, at, meet, on, the, am

FIGURE 1  Agglomerative Clustering of 42-Topic Model
[Dendrogram of the 42 topics, each leaf labeled with the topic name and its percentages of documents and of words, with merge height on the vertical axis. Branch labels in the figure include Regulation, Western, Core Economy, Social, Domestic, Constitutional / Conceptual, Bills / Proposals, Macro, DHS, Infrastructure, Distributive, International, Policy (Spending / Regulation), Symbolic, and Procedural [Housekeeping].]
Notes: Hierarchical agglomerative clustering of β̂_1, ..., β̂_K. Clustering is based on minimizing the maximum Euclidean distance between cluster members. Each cluster is labeled with a topic name, followed by the percentage of documents and words, respectively, in that cluster.

The results of this process are displayed in the dendrogram of Figure 1. [Footnote 15: The order of topics given in Tables 2 and 3 is as determined here; the labels were determined prior to the agglomerative clustering.] Roughly speaking, the lower the height at which any two topics, or groupings of topics, are connected, the more similar are their word use patterns in Senate debate. [Footnote 16: Further specifics, with code to reproduce this analysis and figure, are provided in the replication archive. Please note that the agglomerative clustering is not part of the model, but rather a tool (analogous to a regression table) for compactly displaying several important features of the estimates.]

Reading Figure 1 from the bottom up provides information about which clusters were merged first (those merged at the lowest height). We see that topics that share a penultimate node share a substantive or stylistic link. Some of these are obvious topical connections, such as between the two health economics clusters or between energy and environmental regulation. Some are more subtle. For example, the "Environment 1 [Public Lands]" category, which is dominated by issues related to management and conservation of public lands and water, and the "Commercial Infrastructure" category are related through the common reference to distributive public works spending. Both contain the words project and area in their top 25 keys, for example. The "Banking / Finance" category and the "Labor 1 [Workers]" category discuss different aspects of economic regulation and intervention,
The “Banking / Finance” category and the “Labor 1 [Workers]” category discuss different aspects of economic regulation and intervention, the former with corporations and consumers, the latter with labor markets. Other connections are stylistic, rather than necessarily substantive. The symbolic categories, for example, all have great, proud, and his as keywords.

We can also read Figure 1 from the top down to get a sense of whether there are recognizable rhetorical metaclusters of topics. Reading from the top down, we see clear clusters separating the housekeeping procedural, hobby-horse, and symbolic speech from the substantive policy areas. The more substantive branch then divides a cluster of conceptual and Constitutional issues from the more concrete policy areas that require Congress to appropriate funds, enact regulations, and so on. Within the concrete policy areas, we see further clear breakdowns into domestic and international policy. Domestic policy is further divided into clusters we can identify with social policy, public goods and infrastructure, economics, and “regional.”

Note that what binds metaclusters is language. The language of the Constitutional grouping is abstract, ideological, and partisan. The social policy grouping is tied together by reference to societal problems, suffering, and need. The public goods / infrastructure grouping is tied together both by the language of projects and budgets and by that of state-versus-state particularism. The most interesting metacluster is the substantively odd “regional” grouping of energy, environment, agriculture, and trade. Exploration of the language used here shows that these are topics that divide rural and/or western senators from the rest—distributive politics at a different level of aggregation.

This approach has the potential to inform ongoing debates about how to characterize the underlying political structure of public policy. Whether such characterization efforts are of interest in and of themselves—we would argue they are—is not of as much relevance as the fact that they are necessary for understanding dimensions of political conflict (Clausen 1973; Poole and Rosenthal 1997), the dynamics of the political agenda (Baumgartner and Jones 2002; Lee 2006), the nature of political representation (Jones and Baumgartner 2005), or policy outcomes (Heitschusen and Young 2006; Katznelson and Lapinski 2006; Lowi 1964). Katznelson and Lapinski (2006) provide an eloquent defense of the exercise and a review of alternative approaches.

Speeches, Roll Calls, Hearings, and Construct Validity

The construct validity of a measure is established via its relationships with other measures. A measure shows evidence of convergent construct validity if it correlates with other measures of the same construct. A measure shows discriminant construct validity when it is uncorrelated with measures of dissimilar constructs (Weber 1990).

Construct validity has a double edge to it. If a new measure differs from an established one, it is generally viewed with skepticism. If a new measure captures what the old one did, it is probably unnecessary. In our case, the model produces measures we expect to converge with others in particular ways and to diverge in others. Consider a specific policy-oriented topic, like abortion. We expect that, typically, a roll call on abortion policy should be surrounded by a debate on the topic of abortion.
This convergent relationship should appear between our measure of attention to abortion in speech and indicators of roll calls on abortion policy.

Figure 2 displays the number of words given in speeches categorized by our model as “Abortion” over time. We also display the roll-call votes in which the official description contains the word abortion. We see the basic convergence expected, with the number of roll calls and the number of words correlated at +0.70. But note also that we expect divergence in the indicators as well. Attention is often given to abortion outside the context of an abortion policy vote, the abortion policy nature of a vote might be unclear from its description, and a particular roll call might receive very little debate attention.

Consider, first, the spikes of debate attention that do not have accompanying roll-call votes. The first such spike is in February of 1998, when no vote was nominally on abortion. The occasion was the Senate confirmation of Clinton’s nominee for Surgeon General, David Satcher, and debate centered around Satcher’s positions on abortion. “Abortion” appears nowhere in the description of the vote. Hand-coding exercises would also not code the vote as abortion. For example, Rohde’s roll-call data (Rohde 2004) cover the House, but if extended to the Senate would clearly characterize the accompanying vote on February 10 as a confirmation vote, within a larger procedural category. None of Clausen (1973), Peltzman (1985), or Poole and Rosenthal (1997) extends forward to 1998, but all code previous Surgeon General confirmations at similar high levels of aggregation. For example, the C. Everett Koop confirmation vote, in 1981, is coded under the Clausen system as “Government Management,” under Peltzman as “Government Organization” (primarily) and “Domestic Social Policy” (secondarily), and under Poole and Rosenthal as “Public Health.”[17] Satcher would have been coded identically in each case. But it is clear from reading the transcript that the debate was about, and that attention was being paid to, abortion.

[17] These codes are all listed in the D-NOMINATE dataset used for Poole and Rosenthal (1997) and archived on Poole’s website, http://www.voteview.com.
FIGURE 2 The Number of Words Spoken on the ‘Abortion’ Topic Per Day
[Daily word counts on the “Abortion” topic, January 1997–November 2004, with roll calls and hearings overlaid. Annotated events: Partial Birth Abortion Ban, 1997; Satcher, Surgeon General; Murray Amendments, DOD; PBA Ban 2000; PBA Ban 2003; Unborn Victims of Violence.]

Another such spike is in March of 2004, when the Unborn Victims of Violence Act establishing penalties for violence against pregnant women was debated. The House vote on this identical bill is coded in the Rohde data under “Crime / Criminal Procedure” (Rohde 2004). Much of the debate attention, however, centered around the implications of the bill and possible amendments for abortion rights. In both cases, the spike in attention to abortion is real—captured by the speech measure and uncaptured by roll-call measures.

Similarly, the speech measure captures subtleties that the roll-call count does not. For example, on or around July 1 in every year from 1997 to 2003, Senator Murray offered an amendment to the Department of Defense Appropriations bill, attempting to restore access to abortions for overseas military personnel. The roll-call measure captures these through 2000, but misses them later. This is because the word abortion was removed from the description, replaced by a more opaque phrase: “to restore a previous policy regarding restrictions on use of Department of Defense medical facilities.” But with speech, these minor spikes in attention can be seen. Moreover, the speech measure captures when the amendment receives only cursory attention (a few hundred words in 1998) and when it is central to the discussion (2000, 2002).

Note also the relationship between speech and hearing data. The hearings data are sparse and generally examined at an annual level. At this level of aggregation, the two measures converge as expected—both show more attention to abortion by the Senate during the Clinton presidency (1997–2000) than during the Bush presidency (2001–4). But at a daily level, the measures are clearly capturing different conceptual aspects of political attention. Higher-cost hearings are more likely to capture attention that is well along toward being formulated as policy-relevant legislation. Speech is lower cost, so it is more dynamic and responsive at the daily level, more reflective of minority interests that may never work their way into policy, and potentially more ephemeral.
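The daily attention series plotted in Figure 2, and the +0.70 convergence with roll calls reported above, can be illustrated with a rough R sketch like the following. It is not the replication code; the data frames speeches (one row per speech, with columns date, topic, and nwords) and rollcalls (one row per roll call, with a column abortion flagging votes whose official description mentions abortion) are hypothetical stand-ins.

    ## Words per day attributed to the hypothetical "Abortion" topic assignment
    abortion_daily <- aggregate(nwords ~ date,
                                data = speeches[speeches$topic == "Abortion", ],
                                FUN  = sum)

    ## Roll calls per day whose description mentions abortion
    rc_daily <- aggregate(abortion ~ date, data = rollcalls, FUN = sum)

    ## Align the two daily series and compute the convergent correlation
    ## (days absent from both frames would need to be added as zeros for a complete series)
    both <- merge(abortion_daily, rc_daily, by = "date", all = TRUE)
    both$nwords[is.na(both$nwords)]     <- 0
    both$abortion[is.na(both$abortion)] <- 0
    cor(both$nwords, both$abortion)

    ## Largest spikes in attention, for checking against known events
    ## (e.g., the Satcher confirmation or the Unborn Victims of Violence debate)
    head(both[order(-both$nwords), c("date", "nwords")], 5)

The same aggregation, applied to the remembrance and use-of-force topics, yields the kind of daily series examined in the next section.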
Exogenous Events and Predictive Validity

Predictive validity refers to an expected correspondence between a measure and exogenous events uninvolved in the measurement process. The term is perhaps a confusing misnomer, as the direction of the relationship is not relevant. This means that the correspondence need not be a pure forecast of events from measures, but can be concurrent or postdictive, and causality can run from events to measures (Weber 1990). Of the limitless possibilities, it suffices to examine two of the most impactful political events in this time period: 9/11 and the Iraq War.

Figure 3a plots the number of words on the topic that corresponds to symbolic speech in support of the military and other public servants. Here we see a large increase in such symbolic speech immediately after 9/11 (the largest spike on the plot is exactly on September 12). There is another large spike on the first anniversary of 9/11 and then a number of consecutive days in March 2003 that feature moderate-to-large amounts of this type of symbolic speech. This corresponds to the beginning of the Iraq War.

FIGURE 3 The Attention to ‘Symbolic [Remembrance—Military]’ and ‘Defense [Use of Force]’ Topics over Time
[(a) The number of words on the ‘Symbolic [Remembrance—Military]’ (‘Fallen Heroes’) topic per day, January 1997–November 2004. Annotated events: Capitol shooting; 9/12/01; 9/11/02; 9/11/03; 9/11/04; Iraq War. (b) The number of words on the ‘Defense [Use of Force]’ topic per day. Annotated events: Iraqi disarmament; Kosovo bombing; Kosovo withdrawal; Iraq War authorization; Iraq supplemental $87b; Abu Ghraib.]

The number of words on the topic dealing with the use of military force is displayed in Figure 3b. The small intermittent upswings in 1998 track with discussions of Iraqi disarmament in the Senate. The bombing of Kosovo is represented as large spikes in spring 1999. Discussion within this topic increased again in May 2000 surrounding a vote to withdraw U.S. troops from the Kosovo peacekeeping operation. Post-9/11, the Afghanistan invasion brought a small wave of military discussion, while the largest spike in the graph (in October 2002) occurred during the debate to authorize military action in Iraq. This was followed, as one would expect, by other rounds of discussion in fall 2003 concerning the emergency supplemental appropriations bill for Iraq and Afghanistan, and in the spring of 2004 surrounding events related to the increasing violence in Iraq, the Abu Ghraib scandal, and the John Negroponte confirmation.

Hypothesis Validity and Application to the Study of Floor Participation

Hypothesis validity—the usefulness of a measure for the evaluation of theoretical and substantive hypotheses of interest—is ultimately the most important sort of validity. In this section we offer one example of the sort of analysis to which attention measures can be applied directly. We return to a discussion of further applications in the concluding discussion.

One direct use of these data is to investigate floor participation itself to answer questions about Congressional institutions, the electoral connection, and policy representation. Prior empirical work has had severe data limitations, depending on low-frequency events (e.g., floor amendments; Sinclair 1989; Smith 1989), very small samples (e.g., six bills; Hall 1996), or moderately sized, but expensive, samples (e.g., 2,204 speeches manually coded to three categories; Hill and Hurley 2002). Our data increase this leverage dramatically and cheaply.

Figure 4 summarizes the results from 50 negative binomial count models of the speech counts on all nonprocedural topics and selected metatopical aggregations, for the 106th Senate, for all 98 senators who served the full session. Selected hypotheses, discussed below, are represented by shaded backgrounds.[18]

Congressional behavior is of core relevance to questions about the existence and possible decline of “norms” of committee deference, specialization, and apprenticeship (Hall 1996; Matthews 1960; Rohde, Ornstein, and Peabody 1985; Shepsle and Weingast 1987; Sinclair 1989; Smith 1989). As noted by Hall, this is a difficult empirical question, as the primary leverage has come from floor amendment behavior, a relatively rare occurrence (1996, 180–81). Figure 4 shows that committee membership, but not necessarily service as chair or ranking member, continues to have a substantial impact on the tendency to participate in debate across policy topics. The apprenticeship norm, as indicated by a negative impact of freshman status, also seems to be present in more technical policy areas, but notably not in common electoral issues like abortion or the size of government.
[18] These are graphical tables (Gelman, Pasarica, and Dodhia 2002; Kastellec and Leoni 2007). Alternative specifications, for both standardized and unstandardized coefficients, and for equivalent models of word count, all show substantively similar results.

Examination of the data over time could further inform the question of decline (Rohde, Ornstein, and Peabody 1985; Sinclair 1989; Smith 1989) and, with the cross-topic variation provided here, the role of expertise costs (Hall 1996) versus norms (Matthews 1960) in both deference and apprenticeship.

Since at least Mayhew, Congressional scholars have also been interested in how career considerations affect the electoral connection (Fenno 1978, 1996; Hill and Hurley 2002; Maltzman and Sigelman 1996; Mayhew 1974). The sixth and seventh rows of Figure 4 identify two career-cycle effects in the electoral connection and symbolic/empathy speech. A senator approaching election is more likely to give speeches in the symbolic (“I am proud to be one of you”) and social (“I care about you”) categories than is one whose next election is further in the future. Conversely, senators who subsequently retired gave many fewer such speeches, adding further evidence to the literature on participatory shirking (Poole and Rosenthal 1997; Rothenberg and Sanders 2000).

The last two rows of Figure 4 provide evidence from two (arbitrary) examples of policy representation, unemployment and agriculture. This reflects the notion of representation as congruence between constituency and representative, a subject of considerable scholarly attention (Ansolabehere, Snyder, and Stewart 2001 is a prominent example, in a literature that traces at least to Miller and Stokes 1963). Previous studies of congruence have generally been limited, on the legislator side, to measures of position based on elite surveys or roll calls. Jones and Baumgartner (2005) examine the year-by-year congruence of relative attention to topic (via hearings) with aggregate (not constituency-level) demand measured by Gallup “most important problem” data.

The party and ideology results (rows four and five) also contain interesting insights for our broader interest in how speech can inform our understanding of the landscape of political competition. Democrats are more likely to speak on social issues and more likely to speak in general. Given that Democrats were in the minority in the 106th Senate, this does lend some support to the assertion that speech is better than more constrained legislative behaviors at revealing thwarted minority party preferences and strategies.

Extremity (absolute DW-NOMINATE score) is associated with more speeches on constitutional, international, and economics topics, but not generally on social issues or geographically driven topics. This could be taken as evidence that the former set of topics is of greater interest to ideological extremists. Or—our view—it could be taken as evidence that these are the topics that define the current content of the primary dimension captured by roll-call-based ideal point estimation procedures. The lack of association between “extremism” and attention to other topics suggests that those other topics define higher dimensions of the political space.
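To make the specification summarized in Figure 4 concrete, the following is a minimal R sketch of one such topic-level count model. It is not the authors' replication code; the data frame sen and its column names (n_topic for the number of speeches a senator gave on one topic, plus stand-ins for the covariates listed in the Figure 4 note) are hypothetical.

    library(MASS)   # provides glm.nb() for negative binomial regression

    ## Standardize the (hypothetical) covariates so coefficients are comparable across topic models
    covars <- c("committee", "chair_ranking", "freshman", "republican", "extremity",
                "election", "retiring", "agriculture", "unemployment")
    sen[covars] <- scale(sen[covars])

    ## One negative binomial model of speech counts on a single topic (98 senators)
    fit <- glm.nb(n_topic ~ committee + chair_ranking + freshman + republican +
                    extremity + election + retiring + agriculture + unemployment,
                  data = sen)

    ## Standardized coefficients with Wald 95% intervals
    cbind(beta = coef(fit), confint.default(fit))

Refitting the same specification for each nonprocedural topic (and for the metatopic aggregates) and plotting the resulting coefficients and intervals column by column yields a graphical table of the kind shown in Figure 4.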
FIGURE 4 Speech Count Models, 106th Senate
[Graphical table: one column per nonprocedural topic and per metatopic aggregate (Constitutional, International, Macroeconomic, Distributive, Regional, Social, Health Economics, Symbolic) plus a Total column; one row per covariate (Committee; Chair/Ranking; Freshman; Republican; Extremity; Election; Retiring; Agriculture; Unemployed).]
Notes: Each column represents a negative binomial model of speeches delivered on a given topic, or group of topics, in the 106th Senate, with one observation per senator who served the entire two years (98 in total). Each row of the table represents a covariate: “Committee” (binary indicating whether the senator is on a topic-relevant committee); “Chair/Ranking” (binary indicating the senator is the chair or ranking member of a topic-relevant committee); “Freshman” (binary); “Republican” (binary); “Extremity” (absolute value of Dimension 1 Poole-Rosenthal DW-NOMINATE scores); “Agriculture” (log of state agricultural income per capita, 1997); “Election” (dummy, up for election in next cycle); “Retiring” (dummy, retired before next election); “Unemployment” (state unemployment rate, 1999). Plotted are standardized betas and 95% confidence intervals, darker where this interval excludes zero. Shaded areas represent hypotheses discussed in the text.

Discussion

In this article we have presented a method for inferring the relative amount of legislative attention paid to various topics at a daily level of aggregation. Unlike other commonly used methods, our method has minimal startup costs, allows the user to infer category labels (as well as the mapping from text features to categories), and can be applied to very large corpora in reasonable time. While other methods have one or more of these features, no other general method possesses all of these desirable properties.

While our method has several advantages over other common approaches to content analysis, it is not without its own unique costs. In particular, the topic model discussed in this article requires more user input after the initial quantitative analysis is completed. Since no substantive information is built directly into the model, the user must spend more time interpreting and validating the results ex post.

This article presents several ways that such interpretation and validation can be performed.
Specifically, we demonstrate how (a) keywords can be constructed and their substantive content assessed, (b) agglomerative clustering can be used to investigate the semantic relationships across topics, (c) construct validity of our daily measures of topic attention can be evaluated by looking at their covariation with roll calls and hearings on the topic of interest, and (d) predictive validity of our measures can be assessed by examining their relationships with exogenous events (such as 9/11 or the Iraq War) that are widely perceived to have shifted the focus of attention in particular ways. In each case, we find strong support for the validity of our measures.

While our method is useful, it will not (and should not) replace other methods. Instead, our data and method supplement and extend prior understandings of the political agenda in ways that have to date been prohibitively expensive or nearly impossible. Our method is particularly attractive when used as an exploratory tool applied to very large corpora. Here it quickly allows new insights to emerge about topic attention measured at very fine temporal intervals (in our example, days). In some applications this will be enough; in others, more detailed (and expensive) confirmatory analysis will be in order.

There are many potential applications for such measures of attention beyond those we have given here. The dynamic richness of our data allows topic-specific examination of policy-agenda dynamics and questions of incrementalism or punctuated equilibrium (Baumgartner and Jones 1993). The dynamic richness also allows us to move beyond static notions of congruence into dynamic notions of responsiveness, illuminating the topics and conditions under which legislators lead or follow public opinion (Jacobs and Shapiro 2000; Stimson, MacKuen, and Erikson 1995).

Moving another step, there are many possible indirect applications of the topic model. Once speeches are separated by topic, we can examine the substantive content—the values and frames—that underlie partisan and ideological competition. We can, for example, track in detail the dynamics by which issues and frames are adopted by parties, absorbed into existing ideologies, or disrupt the nature of party competition (Carmines and Stimson 1989; Monroe, Colaresi, and Quinn 2008; Poole and Rosenthal 1997; Riker 1986).

Further, once we know the content of party competition, we can evaluate the positioning of individual legislators. That is, as hinted above, the topic model is a valuable first step toward estimating ideal points from legislative speech. This allows dynamically rich, topic-by-topic ideal point estimation, and insights into the content and dimensionality of the underlying political landscape (Lowe 2007; Monroe and Maeda 2004; Monroe et al. 2007).

Perhaps most exciting, our method travels beyond English and beyond the Congressional setting, where conventional methods and measures can be prohibitively expensive or difficult to apply. We hope this might provide an important new window into the nature of democratic politics.

References

Adler, E. Scott, and John Wilkerson. 2006. “Congressional Bills Project.” Technical report, University of Washington, Seattle. NSF 00880066 and 00880061.
Ansolabehere, Stephen, Erik C. Snowberg, and James M. Snyder. 2003. “Statistical Bias in Newspaper Reporting: The Case of Campaign Finance.” MIT Department of Political Science Working Paper.
Ansolabehere, Stephen, James M. Snyder, and Charles Stewart. 2001. “Candidate Positioning in U.S. House Elections.” American Journal of Political Science 45(1): 136–59.
Baumgartner, Frank R., Christoffer Green-Pedersen, and Bryan D. Jones. 2006. “Comparative Studies of Policy Agendas.” Journal of European Public Policy 13(7): 959–74.
Baumgartner, Frank R., and Bryan D. Jones. 1993. Agendas and Instability in American Politics. Chicago: University of Chicago Press.
Baumgartner, Frank R., and Bryan D. Jones, eds. 2002. Policy Dynamics. Chicago: University of Chicago Press.
Blei, David M., and John D. Lafferty. 2006. “Dynamic Topic Models.” 23rd International Conference on Machine Learning, Pittsburgh, PA.
Blei, David M., Andrew Y. Ng, and Michael I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3: 993–1022.
Budge, Ian, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara, and Eric Tannenbaum. 2001. Mapping Policy Preferences: Parties, Electors and Governments, 1945–1998. Oxford: Oxford University Press.
Cargnoni, Claudia, Peter Müller, and Mike West. 1997. “Bayesian Forecasting of Multinomial Time Series Through Conditional Gaussian Dynamic Models.” Journal of the American Statistical Association 92(438): 640–47.
Carmines, Edward G., and James A. Stimson. 1989. Issue Evolution: Race and the Transformation of American Politics. Princeton, NJ: Princeton University Press.
Cary, Charles D. 1977. “A Technique of Computer Content Analysis of Transliterated Russian Language Textual Materials: A Research Note.” American Political Science Review 71(1): 245–51.
Clausen, Aage R. 1973. How Congressmen Decide: A Policy Focus. New York: St. Martin’s Press.
Fenno, Richard F. 1978. Home Style: House Members in Their Districts. Boston: Little, Brown.
Fenno, Richard F. 1996. Senators on the Campaign Trail. Norman: University of Oklahoma Press.
Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56(2): 121–30.
Gerner, Deborah J., Philip A. Schrodt, Ronald A. Francisco, and Judith L. Weddle. 1994. “Machine Coding of Event Data Using Regional and International Sources.” International Studies Quarterly 38(1): 91–119.
Hall, Richard L. 1996. Participation in Congress. New Haven, CT: Yale University Press.
Heitschusen, Valerie, and Garry Young. 2006. “Macropolitics and Changes in the U.S. Code: Testing Competing Theories of Policy Production, 1874–1946.” In The Macropolitics of Congress, ed. E. Scott Adler and John S. Lapinski. Princeton, NJ: Princeton University Press, 129–50.
Hill, Kim Quaile, and Patricia A. Hurley. 2002. “Symbolic Speeches in the U.S. Senate and Their Representational Implications.” Journal of Politics 64(1): 219–31.
Hillard, Dustin, Stephen Purpura, and John Wilkerson. 2007. “An Active Learning Framework for Classifying Political Text.” Presented at the annual meeting of the Midwest Political Science Association.
Hillard, Dustin, Stephen Purpura, and John Wilkerson. 2008. “Computer Assisted Topic Classification for Mixed Methods Social Science Research.” Journal of Information Technology and Politics 4(4): 31–46.
Ho, Daniel E., and Kevin M. Quinn. 2008. “Measuring Explicit Political Positions of Media.” Harvard Department of Government Working Paper.
Holsti, Ole R., Richard A. Brody, and Robert C. North. 1964. “Measuring Affect and Action in International Reaction Models: Empirical Materials from the 1962 Cuban Crisis.” Journal of Peace Research 1(3/4): 170–90.
Jacobs, Lawrence R., and Robert Y. Shapiro. 2000. Politicians Don’t Pander: Political Manipulation and the Loss of Democratic Responsiveness. Chicago: University of Chicago Press.
Jones, Bryan D., and Frank R. Baumgartner. 2005. The Politics of Attention: How Government Prioritizes Problems. Chicago: University of Chicago Press.
Jones, Bryan D., John Wilkerson, and Frank R. Baumgartner. n.d. “The Policy Agendas Project.” http://www.policyagendas.org.
Kastellec, Jonathan P., and Eduardo L. Leoni. 2007. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics 5(4): 755–71.
Katznelson, Ira, and John S. Lapinski. 2006. “The Substance of Representation: Studying Policy Content and Legislative Behavior.” In The Macropolitics of Congress, ed. E. Scott Adler and John S. Lapinski. Princeton, NJ: Princeton University Press, 96–126.
King, Gary, and Will Lowe. 2003. “An Automated Information Extraction Tool for International Conflict with Performance as Good as Human Coders: A Rare Events Evaluation Design.” International Organization 57(3): 617–42.
Kingdon, John W. 1995. Agendas, Alternatives, and Public Policies. Boston: Little, Brown.
Klingemann, Hans-Dieter, Andrea Volkens, Judith Bara, Ian Budge, and Michael McDonald. 2006. Mapping Policy Preferences II: Estimates for Parties, Electors, and Governments in Eastern Europe, European Union and OECD 1990–2003. Oxford: Oxford University Press.
Krippendorff, Klaus. 2004. Content Analysis: An Introduction to Its Methodology. 2nd ed. New York: Sage.
Kwon, Namhee, Eduard Hovy, and Stuart Shulman. 2007. “Identifying and Classifying Subjective Claims.” Eighth National Conference on Digital Government Research, Digital Government Research Center.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97: 311–31.
Lee, Frances. 2006. “Agenda Content and Senate Party Polarization, 1981–2004.” Presented at the annual meeting of the Midwest Political Science Association.
Library of Congress. n.d. “THOMAS.” http://thomas.loc.gov.
Lowe, William. 2007. “Factors, Ideal Points, and Words: Connecting Legislators’ Preferences to Legislative Speech.” Measures of Legislators’ Policy Preferences and the Dimensionality of Policy Spaces, Washington University, St. Louis.
Lowe, William. 2008. “Understanding Wordscores.” Political Analysis 16(4): 356–71.
Lowi, Theodore J. 1964. “American Business, Public Policy, Case-Studies, and Political Theory.” World Politics 16: 677–715.
Maltzman, Forrest, and Lee Sigelman. 1996. “The Politics of Talk: Unconstrained Floor Time in the U.S. House of Representatives.” Journal of Politics 58(3): 819–30.
Matthews, Donald. 1960. U.S. Senators and Their World. Chapel Hill: University of North Carolina Press.
Mayhew, David R. 1974. Congress: The Electoral Connection. New Haven, CT: Yale University Press.
Miller, Warren E., and Donald Stokes. 1963. “Constituency Influence in Congress.” American Political Science Review 57: 45–56.
Monroe, Burt L., Michael P. Colaresi, and Kevin M. Quinn. 2008. “Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Partisan Conflict.” Political Analysis 16(4): 372–403.
Monroe, Burt L., and Ko Maeda. 2004. “Rhetorical Ideal Point Estimation: Mapping Legislative Speech.” Society for Political Methodology, Stanford University.
Monroe, Burt L., Cheryl L. Monroe, Kevin M. Quinn, Dragomir Radev, Michael H. Crespin, Michael P. Colaresi, Jacob Balazar, and Steven P. Abney. 2006. “United States Congressional Speech Corpus.” http://www.legislativespeech.org.
Monroe, Burt L., Kevin M. Quinn, Michael P. Colaresi, and Ko Maeda. 2007. “Estimating Legislator Positions from Speech.” Measures of Legislators’ Policy Preferences and the Dimensionality of Policy Spaces, Washington University, St. Louis.
Peltzman, Sam. 1985. “An Economic Interpretation of the History of Congressional Voting in the Twentieth Century.” American Economic Review 75: 656–75.
Poole, Keith T., and Howard Rosenthal. 1997. Congress: A Political-Economic History of Roll-Call Voting. Oxford: Oxford University Press.
Porter, Martin F. 1980. “An Algorithm for Suffix Stripping.” Program 14(3): 130–37.
Porter, Martin F. n.d. http://snowball.tartarus.org/algorithms/english/stemmer.html.
Purpura, Stephen, and Dustin Hillard. 2006. “Automated Classification of Congressional Legislation.” Technical report, John F. Kennedy School of Government.
Riker, William H. 1986. The Art of Political Manipulation. New Haven, CT: Yale University Press.
Rohde, David W. 2004. “Roll-Call Voting Data for the United States House of Representatives, 1953–2004.” Technical report, Political Institutions and Public Choice Program, Michigan State University.
Rohde, David, Norman Ornstein, and Robert Peabody. 1985. “Political Change and Legislative Norms in the U.S. Senate, 1957–74.” In Studies of Congress, ed. Glenn Parker. Washington, DC: Congressional Quarterly Press, 147–88.
Rothenberg, Lawrence S., and Mitchell S. Sanders. 2000. “Severing the Electoral Connection: Shirking in the Contemporary Congress.” American Journal of Political Science 44: 316–25.
Shepsle, Kenneth A., and Barry R. Weingast. 1987. “Why Are Committees Powerful?” American Political Science Review 81: 929–45.
Sinclair, Barbara. 1989. The Transformation of the U.S. Senate. Baltimore: Johns Hopkins University Press.
Slapin, Jonathan B., and Sven-Oliver Proksch. 2008. “A Scaling Model for Estimating Time-Series Positions from Texts.” American Journal of Political Science 52(3): 705–22.
Smith, Steven S. 1989. Call to Order: Floor Politics in the House and Senate. Washington, DC: Brookings Institution.
Stimson, James A., Michael B. MacKuen, and Robert S. Erikson. 1995. “Dynamic Representation.” American Political Science Review 89(3): 543–65.
Stone, Philip J., Dexter C. Dunphy, Marshall S. Smith, and Daniel M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. Cambridge, MA: MIT Press.
United States Government Printing Office. n.d. “The Congressional Record, GPO Access.” http://www.gpoaccess.gov/crecord.
Wang, Xuerui, and Andrew McCallum. 2006. “Topics over Time: A Non-Markov Continuous-Time Model of Topical Trends.” 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia.
Weber, Robert Phillip. 1990. Basic Content Analysis. New York: Sage.
West, Mike, and Jeff Harrison. 1997. Bayesian Forecasting and Dynamic Models. New York: Springer.
Wolbrecht, Christina. 2000. The Politics of Women’s Rights: Parties, Positions, and Change. Princeton, NJ: Princeton University Press.