In Defense of Reflection
Author(s): Simon M. Huttegger
Source: Philosophy of Science, Vol. 80, No. 3 (July 2013), pp. 413-433
Published by: The University of Chicago Press on behalf of the Philosophy of Science Association
Stable URL: http://www.jstor.org/stable/10.1086/671427

Simon M. Huttegger*†

I discuss two ways of justifying reflection principles. First, I propose that an undogmatic reading of dynamic Dutch book arguments provides them with a sound foundation. Second, I show also that minimizing expected inaccuracy leads to a novel argument for reflection principles. The required inaccuracy measures comprise a natural class of functions that can be derived from a generalization of a condition known as propriety or immodesty.
This shows that reflection principles are an essential feature not just of consistent degrees of belief but also of degrees of belief that approximate truth.

Received March 2013; revised May 2013.
*To contact the author, please write to: Department of Logic and Philosophy of Science, University of California, Irvine, Social Science Plaza A, Irvine, CA 92697; e-mail: shuttegg@uci.edu.
†For helpful comments on earlier drafts of this article, I want to thank Kenny Easwaran, Brian Skyrms, Bas van Fraassen, Jonathan Weisberg, Kevin Zollman, and an anonymous referee for this journal.
Philosophy of Science, 80 (July 2013) pp. 413–433. 0031-8248/2013/8003-0009$10.00 Copyright 2013 by the Philosophy of Science Association. All rights reserved.

And then, as he pushed through a hedge into a field untended, there suddenly close before him in the field was, as his father had told, the frontier of twilight. It stretched across the fields in front of him, blue and dense like water; and things seen through it seemed misshapen and shining. (Lord Dunsany, The King of Elfland’s Daughter)

1. Introduction. Reflection principles relate one’s anticipated future opinions to one’s current opinions. One way to couch reflection is to say that my “current opinion about event E must lie in the range spanned by the possible opinions I may come to have about E at later time t, as far as my present opinion is concerned” (van Fraassen 1995, 16). In the theory of arbitrage, the fundamental theorem of asset pricing gives an exact statement of this idea (Skyrms 2006). If the opinions involved are precise probabilities, the foregoing reflection principle reads as follows:

(R1) An agent’s current belief P[A] in event A should lie in the interval spanned by any set of values P_f[A] of her anticipated future degree of belief in A that has probability 1.¹

Two further principles go by the name reflection:

(R2) An agent’s current belief P[A] in event A should be the expectation of her anticipated future degrees of belief P_f[A].

(R3) An agent’s current degree of belief in event A given that her anticipated future degree of belief P_f[A] = r should be equal to r with probability 1, whenever the event P_f[A] = r has positive probability.

Similar statements can be formulated for quantities other than future degrees of belief. This, as well as what I mean by ‘anticipated future degrees of beliefs’, is explained in section 2. The main purpose of section 2 is to introduce the general concepts of conditional expectation and conditional probability and to note in what sense they imply R1 and R2. This can be used to argue that if future degrees of belief are given by conditional probabilities, then R1 and R2 hold. The burden of proof therefore lies on showing that one’s anticipated future degrees of belief should be equal to conditional probabilities. This leads, among other things, to the question of justifying principles like R3.

Dutch book arguments are one way to show that one’s future degrees of belief should be given by a conditional probability. They can also be used to defend a principle such as R3 (see Goldstein 1983; van Fraassen 1984). This approach has drawn rather fierce criticisms on itself and on reflection principles (for discussions and criticisms, see Levi 1987; Christensen 1991; Talbott 1991; Maher 1992; Bacchus, Kyburg, and Thalos 1995; Bovens 1995; Arntzenius 2003; Briggs 2009).
I argue in section 3 that a liberal reading of Dutch book arguments helps in understanding the role of reflection for belief change and puts into context various aspects of the counterexamples to reflection. Reflection, properly understood, turns out to be a requirement of epistemic rationality for any agent who considers herself to update beliefs rationally.

1. The probability 1 qualification is important here and in R3, whenever a probability space is infinite. It can be disregarded in finite probability spaces, where all events have positive probability. This point is sometimes ignored in the philosophical literature.

The main part of my article develops a new justification for reflection principles in terms of expected accuracy. In sections 4 and 5, I show that conditional expectations minimize expected inaccuracy for a natural class of inaccuracy measures that are based on the well-known condition of propriety or immodesty. As a consequence of this result, one can derive generalizations of the reflection principles R1–R3. Approaches based on minimizing expected inaccuracy have recently been used by Greaves and Wallace (2006), Leitgeb and Pettigrew (2010a), and Easwaran (2013) to justify various aspects of Bayesianism. These approaches are in the philosophical tradition of so-called nonpragmatic vindications of probabilism (Joyce 1998), with the distinguishing feature that accuracy is weighed by one’s own prior probabilities.

I would like to emphasize that it is not my goal to show that reflection principles are requirements for rational belief change under all circumstances. Rather, the aim of this article is to explicate both the framework and the assumptions that are required for deriving reflection principles.
This will lead, I hope, to some clarity as to what those of us mean who claim that reflection is a requirement for rational belief change.

2. Conditional Probability and Reflection. I start with a rather careful introduction of the general concepts of conditional probability and conditional expectation. These concepts play a crucial role in understanding reflection principles.

The easiest case is one in which you perform an experiment with finitely many possible results {E_i} that are assumed to form a partition. If you update your probabilities by conditioning on the outcome of the experiment, the expected value of your new probabilities is given by

\[ P[A] = \sum_i P[A \mid E_i]\, P[E_i], \tag{1} \]

where the sum extends over all i such that P[E_i] > 0. Your current probability P[A] is therefore given by the expectation of your conditional probabilities.

This is a very elementary observation. Two questions immediately suggest themselves. First, does this result hold for general probability spaces? And, second, is it necessary that the experimental outcomes form a partition?

To answer these questions, suppose that (Ω, ℱ, P) is a probability space, where Ω is a set of elementary events or, if you wish, possible worlds, and ℱ is a σ-algebra of subsets of Ω. A σ-algebra is just an algebra that is also closed under countable unions and intersections. The members of ℱ are called events or propositions. We assume that P is a countably additive probability measure on (Ω, ℱ).

In what follows, (Ω, ℱ, P) is the subjective probability space of an agent. This means that Ω represents the worlds that the agent considers possible, the events in ℱ are the propositions the agent can express and for which she has probabilities, and P is the agent’s subjective probability measure.
This is tantamount to saying that the agent is synchronically σ-coherent (see Adams 1962).

An ℱ-measurable random variable X is a function from Ω to the real numbers ℝ such that the event X⁻¹(B) is in ℱ for each open subset B of ℝ. Loosely speaking, the events described by ℱ-measurable random variables do not go beyond what is expressible in ℱ. If A is in ℱ, we denote the expectation of X on A by

\[ E[X; A] = \int_A X \, dP. \]

We can now define conditional expectations.² Suppose that X is an ℱ-measurable random variable and that 𝒢 is a sub-σ-algebra of ℱ.³ The conditional expectation E[X|𝒢] of X given 𝒢 is a 𝒢-measurable random variable for which it is true that

\[ E\big[E[X \mid \mathcal{G}]; G\big] = E[X; G] \quad \text{for all } G \in \mathcal{G}, \tag{2} \]

that is, on each G in 𝒢, the random variables X and E[X|𝒢] have the same expectation. It can be shown that E[X|𝒢] exists and is almost surely unique (any two random variables for which [2] holds are equal up to a set of probability 0; see, e.g., Williams [1991] for details).

The general concept of conditional probability is obtained as a special case, by setting X = I_A (the indicator of A).⁴ In this case, E[X|𝒢] = E[I_A|𝒢] = P[A|𝒢]. Hence, the conditional probability P[A|𝒢] is also a 𝒢-measurable random variable.

The random variables P[A|𝒢] and E[X|𝒢] often reduce to their well-known counterparts in finite probability spaces. Suppose that G ∈ 𝒢 is an atom of 𝒢.⁵ Then P[A|𝒢](ω) = P[A|G] for almost every ω in G. This shows that the general concept of conditional probability almost surely agrees with the standard ratio definition of conditional probability in this case.

2. The expectation of X on A should not be confused with the expectation of X given the set A, which is a special case of conditional expectation. The expectation of X on A is generally equal to the product of that conditional expectation with the probability of A.
3. That is, 𝒢 is a σ-algebra, and 𝒢 ⊂ ℱ.
4. The indicator of A can be thought of as the truth value of A in world ω: I_A(ω) = 1 if ω ∈ A, and I_A(ω) = 0 otherwise.
5. That is, P[G] > 0 and P[E] = 0 or P[G − E] = 0, for every E ∈ 𝒢 that is a subset of G.

There are generalizations of (1) for conditional expectations and conditional probabilities. If we set G = Ω in (2), we get

\[ E\big[E[X \mid \mathcal{G}]\big] = E[X], \tag{3} \]

and this implies

\[ E\big[P[A \mid \mathcal{G}]\big] = P[A]. \tag{4} \]

Equations (3) and (4) hold in general probability spaces. We can get back to a simple case like (1) by taking 𝒢 to be generated by a finite partition {E_i} of Ω; then it is true that

\[ E\big[P[A \mid \mathcal{G}]\big] = \sum_i P[A \mid E_i]\, P[E_i], \]

where the sum ranges over all i such that E_i has positive probability. (This follows since any such E_i is an atom of 𝒢.) Hence, (4) is a proper generalization of the law of total probability, and (3) is its extension to random variables.

Reflection enters the picture by observing that (4) looks very similar to R2 and that something like R1 follows from this because of standard properties of expectation. What needs to be clarified, however, is in what sense the conditional probability P[A|𝒢] can capture the agent’s anticipated future degrees of belief referred to in R1 and R2.⁶

There seem to be two possible interpretations. One is to view P[A|𝒢] as a plan to update one’s degree of belief for A after being informed which member of 𝒢 is true. This approach is used by Easwaran (2013). On this reading, 𝒢 is viewed as describing the outcomes of an experiment that provides the agent with a new piece of information. A similar approach is taken by Greaves and Wallace (2006), who consider acts instead of plans.
An act is a probability distribution chosen by the agent in response to receiving some piece of information.⁷ More generally, van Fraassen (1995) talks about policies for opinion change. This also includes conditional expectations E[X|𝒢] and the corresponding reflection principles.

6. This is pointed out by Weisberg (2007).
7. I agree with Easwaran that for general probability spaces, the concept of a plan appears to be superior: P[A|𝒢] is a random variable whose existence is guaranteed. But it need not always be possible to have a function P[·|𝒢] that is a probability measure almost surely. This problem leads to the question of the existence of regular conditional probabilities, on which I say a bit more in sec. 4. For now, let me just say that the probability distributions Greaves and Wallace use as acts need not exist in infinite probability spaces. Both acts and plans can be viewed as dispositions to update in a particular way. See Greaves and Wallace (2006) and Easwaran (2013) for more on this issue.

Understanding anticipated future degrees of belief in terms of plans gives rise to a notion of reflection that Easwaran aptly calls plan reflection. The corresponding principles result if we substitute for ‘anticipated future degree of belief’ the ‘degree of belief an agent plans to have’ in R1 and R2.⁸

On the second interpretation, the agent is assumed to believe with probability 1 that she will update by conditioning on 𝒢. In this case, P[A|𝒢] will be her future degree of belief for A in almost every world ω. This interpretation seems to be used by Weisberg (2007) and Briggs (2009). The assumption that the agent believes with probability 1 that she will condition may of course fail.
But if it holds, her anticipated degrees of belief are given by her conditional probabilities, and principles R1 and R2 again follow.

I am not going to adjudicate between these two interpretations because I think that both are reasonable ways to make precise the notion of anticipated future degrees of belief. Let me point out, however, that neither of them allows for a truly diachronic notion of belief change. Both conceptualize the future beliefs of an agent as something she contemplates from her current point of view. This view of belief change is fundamentally synchronic. It is not inconsistent with, but seems to be at odds with, the way many authors understand conditionalization, namely, as a norm for belief change the agent ought to apply as she actually learns a proposition for certain and moves to a new probability measure.

I will not offer a full-fledged defense of why a synchronic reading of belief change is appropriate. I think there are some reasonable arguments in favor of such a view, and I will mention one in the context of Dutch books; for some others, see Easwaran’s discussion of the difficulties of transferring synchronic norms for conditional probabilities to a truly diachronic setting (Easwaran 2013). What is more important at this point is to notice that taking a synchronic view of belief change seems to be an intrinsic part of the measure-theoretic view of conditional probability as a random variable that lives within an agent’s current probability space.

Thus, we see that the reflection principles R1 and R2 hold for conditional probabilities, provided that they are interpreted in a certain way. Before discussing principle R3, I would like to put emphasis on two features that are needed for (3) and (4) to count as reflection principles. In the first place, X is assumed to be ℱ-measurable; this excludes, for example, the possibility that an agent considers conditional probabilities for events B ⊂ Ω that are not members of ℱ.
Moreover, 𝒢 needs to be a subset of ℱ. The agent does not consider updating on information that she cannot currently express.⁹ If these two assumptions do not hold, conditional expectations as introduced above are not defined.

8. The resulting principles are, I think, those that van Fraassen (1995) has in mind when he says that conditionalization implies reflection.
9. This excludes Sleeping-Beauty-like cases.

Let me now turn to principle R3. One important point about conditional expectation is that the σ-algebra 𝒢 can describe a variety of situations. An example that was already mentioned is the situation in which 𝒢 is generated by the outcomes of an experiment. But 𝒢 might also be generated by the agent’s anticipated future degrees of belief, in the following way. Suppose that the agent finds herself in a highly unstructured learning situation. There is no nontrivial partition or σ-algebra that could serve as a basis for updating her beliefs. Perhaps the agent expects an unexpected informational input, or she cannot describe the events that she learns precisely enough in her language, or maybe she is just going to think about some topic. This kind of situation is called black-box learning in Skyrms (1990).

More formally, consider a random variable Y for an agent’s future degree of belief regarding some fixed event A in ℱ. The value Y(ω) is her anticipated future degree of belief for A, if ω is the true state of the world, and Y captures the black-box nature of the learning situation since nothing whatsoever is assumed about the structure of the agent’s learning experience.
As in the case of conditional probability, Y may be understood in two ways: as a plan to update one’s beliefs in response to the learning experience or as the quantity of which an agent believes, with probability 1, that it will be her degree of belief.

In order for Y to be well-defined, we need to assume that each ω contains information about the agent’s degree of belief in A at world ω. The set Ω must therefore be quite rich. If, for example, the agent considers all real numbers in the interval [0, 1] to be possible future degrees of belief for A, then the worlds in Ω need to reflect all these possible cases. We can also generalize from degrees of belief to estimates of unknown quantities. If X is a random variable, we can let Y be the (currently unknown) future best estimate of X.

Provided that Y is well-defined, let σ(Y) be the smallest σ-algebra making Y measurable.¹⁰ The σ-algebra σ(Y) contains all propositions that can be described by Y. We assume that σ(Y) is a subset of ℱ. This means that the agent already grasps all events that can be expressed with Y.

We thus require both that Y is well-defined and that σ(Y) is a subset of ℱ. I do not take issue with these two assumptions for Y here. I only want to make them explicit and also draw attention to the fact that they are needed for conditional expectations. If E[X|Y] denotes the conditional expectation of X given σ(Y), then E[X|Y] and P[A|Y] exist provided that our two assumptions are met.

10. The set σ(Y) is the smallest σ-algebra that contains all sets of the form Y⁻¹(B), where B is an open subset of ℝ.

Let’s return to R3.
In the present context, this principle says that, with probability 1, Y = P[A|Y]; in fact, this is the precise formulation of R3 for general probability spaces.¹¹ The corresponding formula for random variables X is Y = E[X|Y]. This relation is characteristic of martingales. The reflection principle R3 is, in general, nothing but the martingale property. If we view Y as the currently unknown future best estimate of the true value X of some quantity, then it makes sense that E[X|Y] = Y, if we note that E[X] can be viewed as the current best estimate of X.

The martingale property is not something that follows from any of the results that were mentioned so far. The approach I am going to focus on derives the martingale property within the framework of minimizing expected inaccuracy. But first, I briefly discuss the standard approach, which proceeds in terms of dynamic coherence.

3. The Role of Dutch Books. Since martingales are sequences of fair gambles, it is quite clear that dynamic coherence can be used to justify the martingale property. One dynamic coherence argument for reflection is due to van Fraassen (1984). It is formally the same as David Lewis’s Dutch book argument for conditionalization (see Teller 1973). Goldstein (1983) puts forward a more general coherence argument that also applies to conditional expectations.

I will not repeat any of these arguments in detail here since they are well known. Let me just say that the general structure of a dynamic coherence argument is very similar to a synchronic coherence argument. An agent announces her fair-betting odds concerning events in ℱ, the basic idea being that the agent can choose the betting odds for a proposition but cannot choose the side of the bet.¹² Fair-betting odds are supposed to measure an agent’s beliefs. By a suitable normalization, they are mapped to the real numbers. The resulting numbers are called the agent’s degrees of belief for those events involved in the bets.
A Dutch book argument then shows that if these numbers do not obey the laws of the probability calculus, the agent will end up with a net loss from a set of bets that she individually deems fair.

This line of reasoning can be applied to conditionalization (Teller 1973), reflection (van Fraassen 1984), or Jeffrey conditioning (Armendt 1980; Skyrms 1987b). In the case of reflection, propositions described by one’s future degrees of belief Y for A can be used in the dynamic coherence argument only if σ(Y) is a subset of ℱ. If this holds, then principle R3 or an appropriate generalization follows from dynamic coherence.

11. This subsumes van Fraassen’s original formulation of the reflection principle, which basically states that P[A|Y = r] = r for almost every ω, such that Y(ω) = r if the event {Y = r} has positive probability (van Fraassen 1984). Our formulation of R3 goes significantly beyond this. Even if every event {Y = r} has probability 0, it will be the case that P[A|Y = r] = r for almost every ω, such that Y(ω) = r.
12. This is a mechanism that ideally ensures that the agent announces her fair-betting odds. It is similar to the well-known mechanism for achieving a fair division of a cake between two persons. One person cuts the cake, and the other one chooses a piece.

One feature of a dynamic Dutch book should be emphasized. It requires you to announce your future betting odds today; if you only announce your fair-betting odds at different times, there is no Dutch book to be had (Hacking 1967). This leads to two legitimate interpretations of future degrees of belief in this context; not surprisingly, they are the same as in the previous section.
We have to view future degrees of belief (i.e., future betting odds) either as a plan to update one’s degree of belief or as the degree of belief for A that one believes, with probability 1, is going to be one’s future degree of belief.

The problem with this line of reasoning is that there exist many putative counterexamples to reflection.¹³ Many of these counterexamples are aimed at reflection. But since dynamic coherence implies reflection, they are also used to argue that there must be something wrong with dynamic coherence arguments. I consider dynamic coherence first, before returning to the implications for reflection.

The structure of many counterexamples is that you expect to be in some kind of pathological situation in which your degrees of belief will be bonkers. It is then argued that you certainly do not want your degrees of belief to obey reflection, for this would require your current (sane) beliefs to conform with your future (insane) beliefs. The story of Ulysses and the Sirens is often used as an illustration. Ulysses’s current beliefs are not a mixture of his anticipated future beliefs, for these will be influenced by the Sirens’ song. His beliefs do not observe reflection, but it is not because his current beliefs should be any different.

Not all counterexamples have this structure. But all derive their force from using dynamic coherence arguments in a certain way. It is supposed that a rational agent is required to respond to the existence of a Dutch book by adjusting her prior so that it coheres with her posterior. This, of course, leads to absurdities. But is this a reasonable use of Dutch book arguments?

There is an interpretation of Dutch book arguments that does not lead to the conclusions drawn in the counterexamples to reflection. Several supporters of Dutch book arguments have a fairly modest purpose in mind: dynamic incoherence is used to detect inconsistencies, nothing more, nothing less.
This modest purpose goes back to Ramsey (1931) and is championed, in one form or another, by various authors (e.g., Skyrms 1987a; Armendt 1993; Howson and Urbach 1993; Christensen 1996). Ramsey maintains that the existence of a Dutch book indicates inconsistencies among degrees of belief: “If anyone’s mental condition violated these laws, his choice would depend on the precise form in which the options were offered him, which would be absurd. He could have a book made against him by a cunning bettor and would then stand to lose in any event” (1931, 182). This famous quote suggests that being taken advantage of is not the philosophically important point about a Dutch book, however unfortunate it might be; it is rather that you assign different betting odds (degrees of belief) to equivalent bets (equivalent events; see Skyrms 1987a). Dutch book arguments indicate such inconsistencies. They are tools for diagnosis.

13. See Briggs (2009) for a classification and discussion of counterexamples.

Ramsey refers to synchronic coherence. Something similar can be spelled out for belief change, be it in terms of conditionalization, Jeffrey conditioning, or black-box learning (see Skyrms 1987b). If an agent is diachronically incoherent, then the agent has distinct fair-betting odds (degrees of belief) for equivalent bets (equivalent events). The bets are equivalent, given the underlying Boolean logic and how the agent updates beliefs; equivalent by her own standards, that is to say. Assuming the background logic and synchronic coherence, the epistemic defect brought out by the Dutch book is in this case attributable to how the agent updates beliefs.
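The idea of distinct fair prices for equivalent bets can be made concrete with a toy calculation. The sketch below is not taken from the article; the numbers are hypothetical and chosen so that the current degree of belief in A lies outside the range of the announced future degrees of belief, which is a violation of R1. It assumes that fair-betting odds are normalized to prices of a bet paying 1 if A is true, and that the future prices are announced today.

```python
from fractions import Fraction as Fr

# Hypothetical numbers, for illustration only.
current_price = Fr(3, 5)               # today's fair price for a bet paying 1 if A
future_prices = [Fr(1, 5), Fr(2, 5)]   # announced possible future fair prices for the same bet

def net_payoff(future_price, a_is_true):
    """Agent buys the bet on A today and sells it back later at her future fair price."""
    stake = 1 if a_is_true else 0
    buy_today = stake - current_price   # receive the stake, pay today's price
    sell_later = future_price - stake   # receive the future price, pay out the stake
    return buy_today + sell_later       # the stake cancels: future_price - current_price

# Net result in every combination of future price and truth value of A.
outcomes = [net_payoff(p, truth) for p in future_prices for truth in (True, False)]
```

Every entry of `outcomes` is negative, so the agent loses whatever happens, although she regards each transaction as fair at the time it is priced. On the diagnostic reading, this only signals that her current and anticipated beliefs do not cohere; it does not say which of them, if any, she should revise.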
The important question now is what a dynamically incoherent agent should do. If we take her to be only concerned about the pragmatic aspect of the Dutch book, it is reasonable to conclude that she will change her beliefs in order not to be exploitable. But if we focus on the epistemic defect that is indicated by dynamic incoherence, we are not forced to this conclusion. Instead, a dynamic Dutch book informs the agent that her degrees of belief are inconsistent but says nothing about how the agent should respond to this inconsistency. Prima facie, consistent degrees of belief are an epistemic virtue. But this does not imply that an agent has to establish consistency at any cost; the Dutch book does not override all other considerations that might be important for the agent.

The Ramseyan point of view leads to a tempered understanding of dynamic incoherence. A Dutch book indicates a particular epistemic defect, but it does not say anything about whether or how the agent should change her degrees of belief. The agent’s response will often depend on other considerations.

Understanding dynamic coherence in this way has some important consequences for the counterexamples to reflection. In some of them, the agent might well believe that her belief change will lead her to adopt irrational beliefs. Such is the case in the example of Ulysses and the Sirens. Briggs (2009) points out that in cases like this one, an agent exhibits self-doubt. A rational agent should be able to have such beliefs, without being forced to obey principles such as reflection. But notice that this belief of the agent’s (that updating her beliefs will be irrational) is correctly indicated by dynamic incoherence. The Dutch book argument is actually doing its job, and no absurd consequences can be derived since the agent is not required to avoid the Dutch book by changing her prior.¹⁴

Other counterexamples to reflection involve memory loss. These examples show, correctly I think, that dynamic coherence and reflection are insufficient for all-things-considered rationality. But this conclusion misses the point. The claim is not that dynamic coherence and reflection are sufficient for all-things-considered rationality. The claim is that dynamic incoherence and violations of reflection are indicators of epistemic irrationality. Forgetting is one kind of epistemic irrationality. But it is perfectly rational (in the all-things-considered sense) to prefer a situation in which one is slightly epistemically irrational to a situation in which one is perfectly epistemically rational but has to pay all sorts of nonepistemic costs.

There is another class of counterexamples that I will not discuss. They are similar to the case of Sleeping Beauty in that an agent can learn events that she does not already grasp. This violates our basic assumptions and leads outside the standard mathematical theory of conditional expectation. Therefore, such examples deserve a special treatment.¹⁵

In brief, the following view emerges from these considerations. Dynamic incoherence (understood in a tempered sense and applied to situations that fall within the scope of the theory of conditional expectation) as well as reflection are diagnostic of epistemic irrationality. The epistemic irrationality applies to how an agent updates beliefs since we have assumed that the agent is synchronically rational.
Thus, as long as one ignores larger considerations, an agent cannot violate reflection and at the same time think that she will form her future degrees of belief in an epistemically rational way. If she does consider herself to be epistemically rational, then her probability measure should observe reflection.

Phrased in terms of plans, this is essentially the conclusion reached in van Fraassen (1995) as to what a violation of reflection amounts to: “the person holding this opinion cannot regard herself as following a rational policy for opinion change” (17). Similarly, Skyrms (1990) takes reflection to indicate cases of genuine learning. What my discussion adds to van Fraassen’s and Skyrms’s arguments is that these conclusions derive very naturally from a view of Dutch book arguments as tools for diagnosis. In addition, it allows us to largely deflate the intuitive plausibility of some counterexamples to dynamic coherence and reflection, by observing that a diagnosis of dynamic incoherence does not say anything about how an agent should respond to incoherence. In fact, violations of reflection are indicative of an agent who cannot consider herself to be epistemically rational.

14. Jeffrey (1988) also ties violations of reflection to cases in which the agent expects her belief change to be unreasonable.
15. See, e.g., Schervish, Seidenfeld, and Kadane (2004) for a thoughtful discussion of this issue.

Much of my argument depends on the connection between fair-betting odds and degrees of belief. This connection implies that having distinct fair odds for equivalent bets is the same as having distinct degrees of belief for equivalent events. If this link does not hold, then the agent’s evaluations of bets are inconsistent.
While this is certainly some kind of defect, it need not count as epistemically defective (see Joyce 1998). I think that one can make sense of the connection between betting odds and degrees of belief, in terms of measuring strength of belief by betting behavior. However, this is primarily a measurement-theoretic question that would lead us too far afield. Instead of pursuing this topic further, I introduce another approach for justifying reflection principles in which we start with numerical beliefs as a primitive concept. This approach might be more appealing to those philosophers who entertain fundamental doubts about Dutch book arguments.

4. Reflection and Expected Inaccuracy. We have seen that reflection is a feature of consistent degrees of belief, if degrees of belief are measured by fair-betting behavior. In the next two sections, we see that it is also a natural aspect of degrees of belief within the context of approximating the truth, understood in terms of minimizing expected inaccuracy. We start by considering the case of quadratic inaccuracy measures, which play an important role in accuracy-based justifications of probabilistic concepts (Greaves and Wallace 2006; Leitgeb and Pettigrew 2010a). In the following section, we see how this approach can be generalized to a natural class of inaccuracy measures.

Before I explain the details, let me try to describe the basic idea in a nontechnical way. The framework of this section is a geometrical one. We start with a space whose points are random variables; in particular, truth values of propositions and their estimates are points in this space. We can measure how much the estimate of the truth of a proposition differs from its truth value at a world, by taking the square of the difference between the value of the estimate and the truth value at this world.
Taking the expectation of the squared difference relative to an agent’s current probability measure yields a measure of the expected inaccuracy of the truth estimate. Now, this expectation can be used to define a distance between points in the space of random variables. With respect to this distance, it can be demonstrated that conditional probability is the best estimate of the truth value of a proposition in the sense of having minimal distance to its truth value. Most important, we shall see that this implies the martingale property $P[A \mid Y] = Y$. In other words, the reflection principle R3 is a consequence of this approach.

It is well known that minimizing expected inaccuracy can be used to justify conditionalization (Greaves and Wallace 2006; Leitgeb and Pettigrew 2010b; Easwaran 2013). Moreover, it can be shown that, in a certain sense, conditionalization entails reflection. Hence, minimizing expected inaccuracy leads to reflection for conditionalization, as is noted by Easwaran (2013).

While this is true, the arguments in this section are not merely old wine in new skins. As we have seen in section 2, the claim that conditionalization entails reflection is basically the same as the fact that an event’s prior probability is equal to the expectation of its conditional probability (see eq. [4]). Reflection principles go much beyond that, however. They apply not just to conditional probabilities but to anticipated future degrees of belief in general. The fact that conditionalization entails reflection does not, by itself, allow us to conclude that reflection principles also hold for anticipated future degrees of belief that are not given by conditionalization. This requires a separate treatment.
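The fact just cited, that an event’s prior probability equals the expectation of its conditional probability, can be checked on a toy example. The following Python sketch is only an illustration under stated assumptions: the four worlds, the prior, the event `A`, and the partition are hypothetical choices made for the example and do not come from the article.

```python
from fractions import Fraction

# Hypothetical four-world probability space (illustrative values only).
prior = {
    "w1": Fraction(1, 2),
    "w2": Fraction(1, 4),
    "w3": Fraction(1, 8),
    "w4": Fraction(1, 8),
}
A = {"w1", "w3"}                             # event whose probability is tracked
partition = [{"w1", "w2"}, {"w3", "w4"}]     # possible outcomes of an experiment


def prob(event):
    """Prior probability of a set of worlds."""
    return sum(prior[w] for w in event)


def conditional(event, cell):
    """Probability of `event` after conditioning on the partition cell `cell`."""
    return prob(event & cell) / prob(cell)


# Expectation of the conditional probability, weighted by the prior
# probability of each partition cell.
expected_posterior = sum(prob(cell) * conditional(A, cell) for cell in partition)

print("prior P[A]        =", prob(A))
print("E[P[A | outcome]] =", expected_posterior)
```

Exact fractions are used so that the equality holds without rounding. The sketch covers only conditionalization on a partition; it does not address anticipated future degrees of belief formed in other ways, which is the case that needs separate treatment.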
To make the argument more precise, we start with some technicalities. Let $A$ be an event in $\mathcal{F}$, and suppose that $Y$ is the agent’s estimate of the truth value $I_A$ of $A$. Quadratic inaccuracy measures give the inaccuracy of $Y$ as an estimate of $I_A$ at world $\omega$ as $(I_A(\omega) - Y(\omega))^2$ up to multiplication by a positive constant. For our purposes, the constant can be ignored. The expected inaccuracy of $Y$ as an estimate of $I_A$ is then given by

$$E[(I_A - Y)^2]. \tag{5}$$

The expectation is taken with respect to $P$.

Leitgeb and Pettigrew (2010a) provide axioms that single out quadratic inaccuracy measures as the uniquely legitimate ones. But their arguments only apply to finite probability spaces. Our use of the quadratic inaccuracy measure can therefore be justified in cases in which $\Omega$ is finite, by appealing to Leitgeb and Pettigrew’s axiomatic treatment. [note 16] For infinite probability spaces, the results in this section should be viewed as expository. I provide a fuller treatment in the next one.

Quadratic inaccuracy measures give rise to a geometric structure that is very well known in probability theory. To introduce this structure, consider the set of all square-integrable, $\mathcal{F}$-measurable random variables ($X$ is square integrable if $E[X^2] < \infty$). Square integrability is appropriate in the present context, for otherwise the expected inaccuracy (5) could be infinite or undefined.

16. We assume, unlike Leitgeb and Pettigrew, that $P$ is a probability measure. Thus, we could use the set of axioms provided by Selten (1998) as an alternative justification of quadratic inaccuracy measures.

The set of all square-integrable random variables is a vector space over the real numbers. [note 17] This vector space is called $L^2$.
The expectation of the product of two random variables $E[XY]$ defines an inner product on $L^2$. Variables $X$ and $Y$ are orthogonal if $E[XY] = 0$. [note 18] The norm associated with the inner product is $\|X\| = (E[X^2])^{1/2}$. The norm of $X$ can be understood as the size of $X$. It also gives rise to a notion of distance between $X$ and $Y$, by letting their distance be equal to $\|X - Y\| = (E[(X - Y)^2])^{1/2}$.

The vector space $L^2$ is complete; that is, there are no “holes” in $L^2$. If we have a sequence of square-integrable random variables such that the distance between all but finitely many of them becomes arbitrarily small, the sequence converges to an element of $L^2$. (More technically, with respect to the norm $\|\cdot\|$, all Cauchy sequences of elements of $L^2$ converge to an element in $L^2$.) The limit is not unique, but almost surely unique. If $Y$ is a limit of the sequence, then any $Y'$ with $\|Y - Y'\| = 0$ is also a limit, which means that $Y$ and $Y'$ are the same, except on a set of probability 0.

The geometry of this vector space is very similar to the geometry of Euclidean vector spaces, such as $\mathbb{R}^3$, with the usual dot product between vectors. Euclidean spaces can be generalized by considering spaces whose points are not real-valued vectors and by using inner products other than the dot product. Such spaces are known as Hilbert spaces. Space $L^2$ can be viewed as a Hilbert space up to random variables that agree almost surely (Williams 1991, 65).

Suppose that $K$ is a complete vector subspace of $L^2$. Due to the Hilbert space structure of $L^2$, for any $X$ in $L^2$, there exists a $Y$ in $K$ that minimizes $\|X - Y\|$ such that $Y$ is almost surely unique. Any $Y'$ that agrees with $Y$ except on a set of probability 0 will also minimize $\|X - Y\|$. The random variable $Y$ is the orthogonal projection of $X$ on $K$. That is, $X - Y$ is orthogonal to all random variables $Z$ in $K$ (Williams 1991, 67).

If $\mathcal{G}$ is a sub-σ-algebra of $\mathcal{F}$, then the set $\mathbf{G}$ of square-integrable $\mathcal{G}$-measurable random variables is a complete vector subspace of $L^2$.
Whenever $X$ is square integrable, this means that $E[X \mid \mathcal{G}]$ is in $\mathbf{G}$. [note 19] It can be shown that $E[X \mid \mathcal{G}]$ is the orthogonal projection of $X$ on $\mathbf{G}$ (Williams 1991, 85). This implies that $E[X \mid \mathcal{G}]$ is the closest random variable to $X$ among all $\mathcal{G}$-measurable random variables; see figure 1 for an illustration. Closeness refers to the norm $\|\cdot\|$. By definition, then, $E[X \mid \mathcal{G}]$ minimizes the expected squared error $E[(X - Y)^2]$ among all $\mathcal{G}$-measurable $Y$, and it is the almost surely unique minimizer. The expected value

$$E[(X - Y)^2] \tag{6}$$

is a generalization of the expected inaccuracy in (5), where $X$ is the true value of some quantity and $Y$ is the agent’s estimate of $X$ after learning which event in $\mathcal{G}$ obtained. Thus, $(X(\omega) - Y(\omega))^2$ is the inaccuracy of her estimate at world $\omega$. In this case, the agent minimizes expected inaccuracy by choosing $Y$ to be almost surely equal to $E[X \mid \mathcal{G}]$.

17. If $X$ and $Y$ are square integrable, then so is $\lambda X + \mu Y$ for all real numbers $\lambda, \mu$.
18. The inner product can be used to define a generalized notion of “angle” between two random variables in $L^2$. Just as the dot product of two vectors in $\mathbb{R}^3$ is the product of their norms times the cosine of the angle between them, one can think of the inner product of two random variables as the product of their norms times the cosine of the “angle” between them; this cosine turns out to be the correlation between the variables.
19. If $X$ is square integrable, then so is the conditional expectation $E[X \mid \mathcal{G}]$.

Leitgeb and Pettigrew prove that conditional probability minimizes expected inaccuracy in finite probability spaces (2010b, theorem 3). The considerations above yield a version of this theorem for general countably additive probability spaces.
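The projection property can be made concrete on a finite space, where a sub-σ-algebra is generated by a partition and the conditional expectation is a probability-weighted average on each block. The following Python sketch is a minimal illustration, not part of the article’s argument; the four worlds, the probabilities `P`, the quantity `X`, and the partition `blocks` are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite probability space with worlds 0..3.
P = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 3), Fraction(1, 6)]
X = [Fraction(0), Fraction(4), Fraction(1), Fraction(7)]   # true values of a quantity
blocks = [[0, 1], [2, 3]]   # partition generating the sub-sigma-algebra G


def expectation(Z):
    """E[Z] with respect to P."""
    return sum(p * z for p, z in zip(P, Z))


def cond_exp(Z, blocks):
    """E[Z | G]: on each block, the probability-weighted average of Z."""
    out = [None] * len(Z)
    for block in blocks:
        weight = sum(P[w] for w in block)
        average = sum(P[w] * Z[w] for w in block) / weight
        for w in block:
            out[w] = average
    return out


def sq_error(Z, Y):
    """Expected quadratic inaccuracy E[(Z - Y)^2]."""
    return expectation([(z - y) ** 2 for z, y in zip(Z, Y)])


def g_measurable(a, b):
    """A G-measurable estimate: constant a on block 0 and b on block 1."""
    return [a, a, b, b]


proj = cond_exp(X, blocks)
residual = [x - y for x, y in zip(X, proj)]

# Orthogonality: the residual X - E[X | G] has zero inner product with the
# indicator of every block, hence with every G-measurable random variable.
indicators = [[Fraction(int(w in block)) for w in range(len(P))] for block in blocks]
inner_products = [expectation([r * i for r, i in zip(residual, ind)])
                  for ind in indicators]

print("E[X | G]       =", proj)
print("inner products =", inner_products)
print("minimal error  =", sq_error(X, proj))
```

Replacing `X` by the 0/1 indicator of an event turns `cond_exp` into a conditional probability, which is the case relevant to (5). Any other estimate of the form `g_measurable(a, b)` should yield a strictly larger value of `sq_error`.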
In a first step, observe that the argument above implies that the conditional probability $P[A \mid \mathcal{G}]$ is the $\mathcal{G}$-measurable random variable that minimizes the expected inaccuracy $E[(I_A - Y)^2]$. However, it is not in general possible to choose $P[\cdot \mid \mathcal{G}]$ so that it is a countably additive probability measure almost surely. To get a full analogue of Leitgeb and Pettigrew’s theorem, $P[\cdot \mid \mathcal{G}]$ also needs to be a regular conditional probability: $P[\cdot \mid \mathcal{G}]$ is a regular conditional probability if it is a probability measure on $\mathcal{F}$ for almost every $\omega$ and $P[A \mid \mathcal{G}]$ is $\mathcal{F}$-measurable for each $A$ in $\mathcal{F}$.

Figure 1. Set $\mathbf{G}$ is the complete subspace of $\mathcal{G}$-measurable square-integrable random variables, and $E[X \mid \mathcal{G}]$ is the orthogonal projection of $X$ on $\mathbf{G}$. If $Y$ is not the orthogonal projection of $X$ on $\mathbf{G}$, then it does not minimize the distance to $X$. The set $\mathbf{G}$ can be equal to the set $\mathbf{Y}$, which contains all $\sigma(Y)$-measurable random variables.

It is well known that $P[\cdot \mid \mathcal{G}]$ is a regular conditional probability under the fairly natural assumption that $\Omega$ is a Polish space, that is, a separable completely metrizable topological space. Together with the smallest σ-algebra that contains all open sets of $\Omega$, such spaces are known as standard measure spaces. Any finite $\Omega$ is Polish, as is any $\Omega$ that has the same structure as a real metric vector space. In general, to say that $\Omega$ is a metrizable topological space and that $\mathcal{F}$ is the smallest σ-algebra containing all open sets of $\Omega$ is just to say that it is possible to give a metric distance function between points in $\Omega$ such that $\mathcal{F}$ contains every ball of any radius around every point in $\Omega$. One can think of this in terms of a distance function between possible worlds. The distance function need not be unique.
All that is required is that there exists one such distance function. To say that $\Omega$ is completely metrizable is to say $\Omega$ is complete with respect to the distance function; there are no “holes” in $\Omega$ (just like there are no “holes” in $L^2$). Separability requires that $\Omega$ contain a countable dense subset, so that every point of $\Omega$ can be approximated arbitrarily closely by points of that subset. This could plausibly fail in relevant probability spaces. In sum, though, Polish spaces are the spaces most often encountered in applications of probability theory. Thus, for most probabilistic applications $P[\cdot \mid \mathcal{G}]$ will almost surely be a probability measure on $\mathcal{F}$.

The same kind of reasoning as in the case of probabilities conditional on $\mathcal{G}$ can be used to show that minimizing expected inaccuracy (6) entails reflection. Let the $\mathcal{F}$-measurable random variable $X$ again be the true value of some quantity. Because $E[X]$ minimizes expected inaccuracy (6) with respect to the trivial σ-algebra $\{\emptyset, \Omega\}$, one’s current best estimate of $X$ is its expectation.

Let $Y$ be your anticipated new estimate of $X$ after a black-box learning experience. As before, we assume that $\Omega$ is rich enough for $Y$ to be well-defined and that $\sigma(Y)$ is a sub-σ-algebra of $\mathcal{F}$. In this case $\sigma(Y)$ gives rise to the vector subspace $\mathbf{Y}$ of $L^2$ of all $\sigma(Y)$-measurable random variables. Now suppose that $E[X \mid Y] \neq Y$ on a set of positive probability. Then $Y$ does not minimize expected inaccuracy (6) among all random variables in $\mathbf{Y}$. (See fig. 1 for an illustration.)

It is important to observe that all random variables in $\mathbf{Y}$ are available to the agent, in the sense of availability used by Greaves and Wallace (2006) and Easwaran (2013). The random variables that are available to the agent are those that do not depend on any conceptual resources beyond those implicit in $Y$. These conceptual resources are given by $\sigma(Y)$.
Hence, if $Y$ is available as an update of one’s estimate of $X$, so should any other random variable in $\mathbf{Y}$, whereas it need not be the case that the agent is able to express future degrees of belief that go beyond $\sigma(Y)$. Consider another random variable $Y'$, and suppose, for example, that there is a set $B$ in $\sigma(Y')$ that is not a member of $\sigma(Y)$. Then the conceptual resources implicit in $Y'$ go beyond those of $Y$, and it is not the case that $Y'$ is available for the agent whenever $Y$ is.

Among all random variables in $\mathbf{Y}$, $E[X \mid Y]$ minimizes expected inaccuracy. If $Y \neq E[X \mid Y]$ is the agent’s estimate of $X$, then she does not approximate the true values $X$ as closely as she could given the available random variables. If, however,

$$E[X \mid Y] = Y \tag{7}$$

almost surely, then expected inaccuracy is minimized. Equation (7) is the martingale property and, hence, a general version of principle R3. We also get versions of principles R1 and R2 since $E[Y] = E[E[X \mid Y]] = E[X]$ (the last equality is a standard property of conditional expectations). It follows that minimizing expected inaccuracy in terms of (6) entails reflection for random variables.

Reflection principles for degrees of belief can be obtained by the familiar substitution $X = I_A$. The foregoing arguments then imply that $Y$ minimizes expected inaccuracy if and only if

$$P[A \mid Y] = Y \tag{8}$$

almost surely. Hence, by minimizing expected inaccuracy you update your degrees of belief as if you would condition on $Y$ (cf. Good 1981). It also follows that $E[Y] = E[P[A \mid Y]] = P[A]$. Taken together, this again leads to versions of the three reflection principles R1–R3.

5. Generalizations. In the previous section we assumed that inaccuracy measures are quadratic.
One might wonder to what extent the results for reflection depend on this assumption. This question is especially important in the absence of a more principled justification of quadratic inaccuracy measures for general probability spaces. Some theorems that are proven in Banerjee, Guo, and Wang (2005) can be used to show that the results of the previous section continue to hold for a large class of inaccuracy measures called Bregman distance functions, which can be derived from a salient condition on estimates, namely, a generalization of the condition of propriety or immodesty (see conditions [10] and [11] below). Hence, reflection principles and, in particular, the martingale property follow from a quite natural epistemic condition on the accuracy of estimates.

Suppose that $f: D \to \mathbb{R}$ is a strictly convex differentiable function, where $D$ is an interval in $\mathbb{R}$. Then the Bregman distance function $B_f: D \times D \to \mathbb{R}$ is defined as

$$B_f(x, y) = f(x) - f(y) - (x - y) f'(y),$$

where $f'$ denotes the derivative of $f$. The function $B_f$ is the difference between the value of $f$ at $x$ and the value at $x$ of the first-order Taylor expansion of $f$ around $y$. The function $f(x) = x^2$, for example, gives rise to the distance function $(x - y)^2$. Thus, the quadratic inaccuracy measure is a Bregman distance function. Other examples include the Kullback-Leibler divergence or the Itakura-Saito distance (Banerjee et al. 2005).

Suppose that $X$ is an $\mathcal{F}$-measurable random variable for which both $E[X]$ and $E[f(X)]$ are finite and that $B_f$ is a Bregman distance function. The expected inaccuracy of the estimate $Y$ is now given by

$$E[B_f(X, Y)]. \tag{9}$$

It can be shown that $E[X \mid \mathcal{G}]$ is the (almost surely) unique minimizer of $E[B_f(X, Y)]$ among all $Y$ that are $\mathcal{G}$-measurable (see Banerjee et al.
[2005] for a proof).

This result allows us to draw the same conclusions for all Bregman distance functions as for quadratic inaccuracy measures. In particular, if $Y$ is the agent’s future estimate of $X$ that minimizes expected inaccuracy in terms of (9), then $E[X \mid Y] = Y$ almost surely. From this the reflection principles for conditional expectations and conditional probabilities follow.

One can also show that Bregman distance functions are, under rather mild regularity conditions, the only ones for which the conditional expectation $E[X \mid \mathcal{G}]$ is closest to $X$ among all $\mathcal{G}$-measurable random variables (Banerjee et al. 2005). Let $F: D \times D \to \mathbb{R}$ be a nonnegative continuous function such that $F$ is continuously differentiable in its first argument and $F(x, x) = 0$ for all $x \in D$. If $E[X \mid \mathcal{G}]$ minimizes $E[F(X, Y)]$ among all $Y$ that are $\mathcal{G}$-measurable, then $F$ is a Bregman distance function for some strictly convex function $f$.

Why should one choose a Bregman distance function as one’s measure of inaccuracy? The last result does not provide a useful rationale, for it assumes our desired conclusion: that a random variable’s conditional expectation is its most accurate estimate. However, Banerjee et al. (2005) prove something stronger than this. Suppose that for all measurable random variables $X$ and for all constant random variables $Z$, such that $E[X] \neq Z$,

$$E[F(X, E[X])] < E[F(X, Z)]. \tag{10}$$

That is, $E[X]$ is the unique minimizer of $E[F(X, Z)]$ among all constant random variables $Z$. If $X = I_A$, this means that for any $A \in \mathcal{F}$ and for any constant random variable $Z$ such that $P[A] \neq Z$,

$$E[F(I_A, P[A])] < E[F(I_A, Z)]. \tag{11}$$

Therefore, measured in terms of $F$, your degree of belief $P[A]$ is the most accurate constant estimate of $I_A$ in expectation. Banerjee et al.
(2005) prove that if (10) holds for a continuous nonnegative function $F$ with $F(x, x) = 0$, and if $F$ is continuously differentiable in its first argument, then $F$ is a Bregman distance function for some strictly convex function $f$. Hence, if (10) holds, then $E[X \mid Y]$ is the closest random variable to $X$ among all $\sigma(Y)$-measurable random variables with respect to $F$. We see that, once again, our reflection principles follow, this time from condition (10).

Conditions such as (10) and (11) are known as propriety conditions and have been much discussed in the literature (Greaves and Wallace 2006; Joyce 2009; Easwaran 2013). Degrees of belief obeying a condition like (11) are often called immodest. From the point of view of measuring inaccuracy, modesty appears to be a vice. Modest degrees of belief are self-undermining; they recommend degrees of belief other than the ones the agent is currently holding as being superior. Put differently, the choice of a function $F$ that violates (11) is inconsistent with regarding all one’s current truth estimates as maximally accurate. The same can be said about constant estimates of random variables in (10). Modest estimates violate (10) and are therefore self-undermining. If $E[X]$ is not the best constant estimate of $X$, then adopting a different estimate would be superior in the light of the agent’s own probabilities.

Constant estimates are significant in that they are always available to the agent (Greaves and Wallace 2006). Only constant estimates are measurable relative to the trivial σ-algebra $\{\emptyset, \Omega\}$. One therefore does not need any information about the structure of the probability space in order to form a constant estimate.
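Condition (10) and its consequence for reflection can be explored numerically on a small example. The Python sketch below is a rough illustration under stated assumptions, not a reconstruction of the proofs in Banerjee et al. (2005): the three-world space, the values of `P` and `X`, the grid of candidate estimates, and the future estimate `Y` are all hypothetical, and the derivative of each convex function is supplied by hand.

```python
import math

# Hypothetical three-world example (not from the article): probabilities P
# and a positive quantity X; positivity is needed for the logarithms below.
P = [0.6, 0.2, 0.2]
X = [1.0, 2.0, 6.0]


def bregman(f, fprime):
    """Build B_f(x, y) = f(x) - f(y) - (x - y) * f'(y) from f and f'."""
    return lambda x, y: f(x) - f(y) - (x - y) * fprime(y)


squared = bregman(lambda x: x * x, lambda x: 2 * x)
gen_kl = bregman(lambda x: x * math.log(x), lambda x: math.log(x) + 1)
itakura_saito = bregman(lambda x: -math.log(x), lambda x: -1 / x)


def absolute(x, y):
    """Absolute error, a loss that is not a Bregman distance."""
    return abs(x - y)


def expected_loss(loss, Y):
    """E[loss(X, Y)], where Y lists the estimate adopted at each world."""
    return sum(p * loss(x, y) for p, x, y in zip(P, X, Y))


def best_constant(loss):
    """Grid search over the constant estimates 0.5, 0.6, ..., 6.0."""
    grid = [k / 10 for k in range(5, 61)]
    return min(grid, key=lambda z: expected_loss(loss, [z] * len(X)))


mean = sum(p * x for p, x in zip(P, X))   # E[X], about 2.2

# A future estimate that cannot tell worlds 1 and 2 apart and violates
# reflection there: E[X | Y] is about 4 on that cell, but Y says 3.
Y = [1.0, 3.0, 3.0]
cell = (P[1] * X[1] + P[2] * X[2]) / (P[1] + P[2])
E_X_given_Y = [X[0], cell, cell]

for name, loss in [("squared", squared), ("gen_kl", gen_kl),
                   ("itakura_saito", itakura_saito)]:
    print(name, best_constant(loss),
          expected_loss(loss, Y), expected_loss(loss, E_X_given_Y))
print("absolute", best_constant(absolute))
```

With these numbers, the best constant estimate on the grid under each of the three Bregman distances is the mean 2.2, whereas absolute error is minimized at the median 1.0 and so fails (10). In each Bregman case the reflecting estimate `E_X_given_Y` has lower expected inaccuracy than `Y`. A grid search is of course only suggestive; it is no substitute for the general theorems.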
The results of this and the previous section bear some resemblance to the justifications of conditionalization in Greaves and Wallace (2006) and Easwaran (2013). In both of these articles, conditions very similar to (11) play an important role in showing that conditional probabilities minimize expected inaccuracy when planning to update on the outcomes of an experiment. An experiment is taken to be a partition of $\Omega$. Our new results show that general conditional expectations also minimize inaccuracy. That is, first, the outcomes need not constitute a partition of $\Omega$ but can be a σ-algebra, which is a more general structure. And, second, this σ-algebra can be generated by the future estimate of a quantity. Easwaran has pointed out that reflection in the sense of R2 holds for conditional probabilities given an experiment. Within our framework, we obtain more general reflection principles, ones that also apply to general future estimates of $X$.

Should (10) be viewed as a rationality requirement? This question boils down to asking whether a rational agent can hold constant estimates that are self-undermining. If we consider an agent whose only goal is to approximate the truth, then this should clearly not be the case. Such an agent would, on learning that her constant estimates are not the most accurate ones, either change her constant estimates or have doubts about the appropriateness of the function $F$ as a measure of inaccuracy. In the framework of expected accuracy, it is natural that degrees of belief are estimates of truth values and, more generally, that expectations are estimates of random variables. Violations of (10) and (11) thus amount to saying that the distance measure $F$ itself precludes certain values from being one’s current estimates.
This is not to deny that (10) is a demanding requirement. It would be easy to find examples in which it does not seem to be reasonable for an agent to meet (10), regardless of the costs involved. Like dynamic coherence, maximally accurate constant estimates in the sense of (10) should be viewed as a prima facie requirement of rationality. Propriety does not override all considerations in a larger all-things-considered context. But within the confines of epistemic rationality, we can again conclude that reflection is a rationality requirement, if truth approximation is taken to be the standard of rationality.

6. Concluding Remarks. In section 2, we saw that reflection principles are closely tied to the general concept of expectation conditional on a σ-algebra. Importantly, this does not depend on where the σ-algebra comes from. The σ-algebra might describe the outcomes of an experiment. But it can also be generated by your future beliefs, which is an experiment of sorts in the context of black-box learning.

From this we were led to the conclusion that reflection is a basic feature of rational opinion change. Two epistemic principles that require this are dynamic coherence and accuracy. These two approaches highlight different epistemic virtues of reflection. Dynamic coherence brings out consistency, whereas accuracy emphasizes approximating the truth. In many situations, other considerations also influence one’s anticipated future estimates. But if one only cares about consistency or approximating the truth, then reflection is a requirement for rational change of opinions.

REFERENCES

Adams, Ernest W. 1962. “On Rational Betting Systems.” Archiv für mathematische Logik und Grundlagenforschung 6:7–29, 112–28.
Armendt, Brad. 1980. “Is There a Dutch Book Argument for Probability Kinematics?” Philosophy of Science 47:583–88.
———. 1993. “Dutch Books, Additivity, and Utility Theory.” Philosophical Topics 21:1–20.
Arntzenius, Frank. 2003.
“Some Problems for Conditionalization and Reflection.” Journal of Philosophy 100:356–70.
Bacchus, Fahiem, Henry E. Kyburg, and Mariam Thalos. 1995. “Against Conditionalization.” Synthese 85:475–506.
Banerjee, Arindam, Xin Guo, and Hui Wang. 2005. “On the Optimality of Conditional Expectation as a Bregman Predictor.” IEEE Transactions on Information Theory 51:2664–69.
Bovens, Luc. 1995. “‘P and I Will Believe That Not-P’: Diachronic Constraints on Rational Belief.” Mind 104:737–60.
Briggs, Rachael. 2009. “Distorted Reflection.” Philosophical Review 118:59–85.
Christensen, David. 1991. “Clever Bookies and Coherent Beliefs.” Philosophical Review 100:229–47.
———. 1996. “Dutch Books Depragmatized: Epistemic Consistency for Partial Believers.” Journal of Philosophy 93:450–79.
Easwaran, Kenny. 2013. “Expected Accuracy Supports Conditionalization—and Conglomerability and Reflection.” Philosophy of Science 80:119–42.
Goldstein, Michael. 1983. “The Prevision of a Prevision.” Journal of the American Statistical Association 78:817–19.
Good, Irving J. 1981. “The Weight of Evidence Provided by an Uncertain Testimony or an Uncertain Evidence.” Journal of Statistical Computation and Simulation 13:56–60.
Greaves, Hilary, and David Wallace. 2006. “Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility.” Mind 115:607–32.
Hacking, Ian. 1967. “Slightly More Realistic Personal Probability.” Philosophy of Science 34:311–25.
Howson, Colin, and Peter Urbach. 1993. Scientific Reasoning: The Bayesian Approach. 2nd ed. La Salle, IL: Open Court.
Jeffrey, Richard C. 1988. “Conditioning, Kinematics, and Exchangeability.” In Causation, Chance, and Credence, vol. 1, ed. Brian Skyrms and William L. Harper, 221–55. Dordrecht: Kluwer.
Joyce, James M. 1998.
“A Nonpragmatic Vindication of Probabilism.” Philosophy of Science 65:575–603.
———. 2009. “Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief.” In Degrees of Belief, ed. Franz Huber and Christoph Schmidt-Petri, 263–97. New York: Springer.
Leitgeb, Hannes, and Richard Pettigrew. 2010a. “An Objective Justification of Bayesianism.” Pt. 1, “Measuring Inaccuracy.” Philosophy of Science 77:201–35.
———. 2010b. “An Objective Justification of Bayesianism.” Pt. 2, “The Consequences of Minimizing Inaccuracy.” Philosophy of Science 77:236–72.
Levi, Isaac. 1987. “The Demons of Decision.” Monist 70:193–211.
Maher, Patrick. 1992. “Diachronic Rationality.” Philosophy of Science 59:120–41.
Ramsey, Frank P. 1931. “Truth and Probability.” In Foundations of Mathematics and Other Essays, ed. Richard B. Braithwaite. New York: Harcourt Brace. Repr. in Studies in Subjective Probability, ed. Henry E. Kyburg and Howard E. Smokler (Huntington, NY: Krieger, 1964).
Schervish, Mark J., Teddy Seidenfeld, and Joseph B. Kadane. 2004. “Stopping to Reflect.” Journal of Philosophy 101:315–22.
Selten, Reinhard. 1998. “Axiomatic Characterization of the Quadratic Scoring Rule.” Experimental Economics 1:43–61.
Skyrms, Brian. 1987a. “Coherence.” In Scientific Inquiry in Philosophical Perspective, ed. Nicholas Rescher, 225–342. Pittsburgh: University Press of America.
———. 1987b. “Dynamic Coherence and Probability Kinematics.” Philosophy of Science 54:1–20.
———. 1990. The Dynamics of Rational Deliberation. Cambridge, MA: Harvard University Press.
———. 2006. “Diachronic Coherence and Radical Probabilism.” Philosophy of Science 73:959–68.
Talbott, William. 1991. “Two Principles of Bayesian Epistemology.” Philosophical Studies 62:135–50.
Teller, Paul. 1973. “Conditionalization and Observation.” Synthese 26:218–58.
van Fraassen, Bas C. 1984. “Belief and the Will.” Journal of Philosophy 81:235–56.
———. 1995.
“Belief and the Problem of Ulysses and the Sirens.” Philosophical Studies 77:7–37.
Weisberg, Jonathan. 2007. “Conditionalization, Reflection, and Self-Knowledge.” Philosophical Studies 135:179–97.
Williams, David. 1991. Probability with Martingales. Cambridge: Cambridge University Press.