Accuracy, Language Dependence and Joyce’s Argument for Probabilism

Branden Fitelson*†

Abstract

In this note, I explain how a variant of David Miller’s (1975) argument concerning the language-dependence of the accuracy of predictions can be applied to Joyce’s (1998) notion of the accuracy of “estimates of numerical truth-values” (viz., Joycean credences). This leads to a potential problem for Joyce’s accuracy-dominance-based argument for the conclusion that credences (understood as “estimates of numerical truth-values” in Joyce’s sense) should obey the probability calculus.

1 Miller on the Language Dependence of Predictive Accuracy

Suppose we have two numerical quantities φ and ψ. These might be, for instance, the velocities (in some common units) of two objects, at some time (or some other suitable physical quantity of two objects at a time). Suppose further that we have two sets of predictions concerning the values of φ and ψ, which are entailed by two hypotheses H1 and H2, and let’s denote the truth about the values of φ and ψ (or, if you prefer, the true hypothesis about their values), in our standard units, as T. Let the predictions of H1 and H2, and the true values T of φ and ψ, be given by the following table. [Ignore the α/β columns of the table, for now. I’ll explain the significance of those columns, below.]

        φ       ψ       α       β
H1    0.150   1.225   0.925   2.000
H2    0.100   1.000   0.800   1.700
T     0.000   1.000   1.000   2.000

Table 1: Canonical example of the language dependence of the accuracy of predictions

It seems clear that the predictions of H2 are “closer to the truth T about φ and ψ” than the predictions of H1 are. After all, the predicted values entailed by H2 are strictly in between the values predicted by H1 and the true values entailed by T.
However, as Popper (1972, Appendix 2) showed [using a recipe invented by David Miller (1975)], there exist quantities α and β (as in the table) satisfying both of the following conditions:

1. α and β are symmetrically inter-definable with respect to φ and ψ in the following (linear) way:

\[
\begin{aligned}
\alpha &= \psi - 2\phi, & \qquad \phi &= \beta - 2\alpha, \\
\beta &= 2\psi - 3\phi, & \qquad \psi &= 2\beta - 3\alpha.
\end{aligned}
\]

2. The values for α and β entailed by H2 are strictly “farther from the truth T about α and β” than the values for α and β entailed by H1.

* Department of Philosophy, Rutgers University, 1 Seminary Place, New Brunswick, NJ 08901-1107 [email: branden@fitelson.org]. This is a draft (10/03/11). Final version to appear in Philosophy of Science.

† I would like to thank Kenny Easwaran, Ben Levinstein, David Miller, Wolfgang Schwarz, Mike Titelbaum, Robbie Williams, and two anonymous referees of this journal for useful comments on earlier drafts.

As Miller (1975) explains [see (Miller, 2006, Chapter 11) for a nice historical survey], there is a much more general result in the vicinity. It can be shown that for any pair of false theories H1 and H2 about parameters φ and ψ, many comparative relations of “closer to the truth” between H1 and H2 regarding φ and ψ can be reversed by looking at what the estimates provided by H1 and H2 for φ and ψ entail about quantities α and β, which are symmetrically inter-definable with respect to φ and ψ, via some (linear) inter-translation of the form:

\[
\begin{aligned}
\alpha &= a\psi + b\phi, & \qquad \phi &= a\beta + b\alpha, \\
\beta &= c\psi + d\phi, & \qquad \psi &= c\beta + d\alpha.
\end{aligned}
\]

That is, for many cases in which we judge that “H2 is closer to the truth T about φ and ψ than H1 is” (on many ways of comparing “closeness”) there will exist some member of the above family of symmetric inter-translations such that we will judge that “H1 is closer to the truth T about α and β than H2 is”. In this way, we can often reverse accuracy comparisons of quantitative theories via such re-descriptions of prediction problems.
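The reversal in Table 1 can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the measure of “closeness” (one of the many ways of comparing closeness alluded to above; the text does not single it out). The names `translate` and `error` are hypothetical helpers introduced only for this illustration.

```python
from math import dist

# Predicted (phi, psi) values from Table 1; "T" holds the true values.
phi_psi = {"H1": (0.150, 1.225), "H2": (0.100, 1.000), "T": (0.000, 1.000)}

def translate(x, y):
    """Linear map from condition 1: alpha = psi - 2*phi, beta = 2*psi - 3*phi.
    Because the translation is symmetric, applying the same formulas to
    (alpha, beta) returns (phi, psi)."""
    return (y - 2 * x, 2 * y - 3 * x)

def error(hypothesis, language):
    """Euclidean distance (an assumed closeness measure) from a hypothesis to T."""
    return dist(language[hypothesis], language["T"])

# Re-describe every row of the table in the alpha/beta language.
alpha_beta = {name: translate(*values) for name, values in phi_psi.items()}

# H2 is closer to the truth in the phi/psi language ...
print(error("H2", phi_psi) < error("H1", phi_psi))        # True
# ... but H1 is closer to the truth in the alpha/beta language.
print(error("H1", alpha_beta) < error("H2", alpha_beta))  # True
```

Swapping in a different distance (for instance, the sum of absolute errors) gives the same verdicts for this particular table, although nothing here shows that the reversal holds for every measure.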
As such, many assessments of the accuracy of predictions are language dependent.¹

¹ Strictly speaking, this only becomes a language dependence problem if we adopt the language L_{φψ} in which φ and ψ are primitive parameters, and we treat α and β as defined parameters in L_{φψ} (as opposed to adopting the language L_{αβ}, and treating φ and ψ as defined in L_{αβ}). Otherwise, we could characterize what is going on here as a dependence of “distances from the truth” on a choice of parameters within a single language L_{φψαβ}. I intend this to be a problem of language dependence. So, I assume we start with an adopted language + set of primitive parameters. I thank an anonymous referee for pressing this clarification.

2 Joyce on Probabilism and the “Accuracy” of Credences

According to Joyce (1998), if we view credences (of rational agents) as numerical estimates of truth-values of propositions, then we can give an argument for probabilism that is based on considerations having to do with the “accuracy” of such estimates. I won’t get into all the details of Joyce’s various arguments here. Rather, I will focus on a simple, concrete example that illustrates a (potential) problem of language dependence.

Consider an agent S facing a very simple situation, involving only one atomic sentence P. Suppose that S is logically omniscient (i.e., S assigns the same credences to logically equivalent statements, and he also assigns zero credence to all contradictions and credence one to all tautologies in his toy language). Thus, all that matters concerning S’s coherence (in Joyce’s sense) is whether S’s credences b in P and ¬P sum to one (and are non-negative). Now, following Joyce, we will associate the truth-value True with the number 1 and the truth-value False with the number 0.
Let φ be the numerical value associated with P’s truth-value, and let ψ be the numerical value associated with ¬P’s truth-value (of course, φ and ψ will vary in the obvious ways across the two salient possible worlds: w1, in which P is false, and w2, in which P is true). We can now state (informally) the sort of Theorem(s) that Joyce has been writing about for a number of years.

Theorem (Joyce). If S’s credence function b, construed as providing estimates of φ and ψ, fails to be probabilistic, then there exists a probabilistic b′ that is more accurate than b (according to a suitable “scoring rule”) regarding φ and ψ, in all possible worlds. And, no coherent (probabilistic) credence function is accuracy-dominated in this sense by any incoherent credence function (a key asymmetry).

Joyce makes various assumptions about how to measure “the accuracy of estimates of φ and ψ in a possible world”. The various choices of “scoring rule” that one might make in order to render such “accuracy measurements” will not be important for the issue that I am going to raise here. The phenomenon will arise for any such instantiation of Joyce’s framework. Rather than describing my “reversal theorem” in such general terms, I will illustrate it via a very simple concrete example, regarding our toy agent S, and assuming the Brier Score as our “accuracy measure”. Suppose that S’s credence function (b) assigns the following values to P and ¬P (i.e., b entails the following numerical “estimates” of the quantities φ and ψ).

        φ      ψ
b      1/2    1/4

Table 2: The credence function (b) of our simple incoherent agent (S).

Joyce’s theorem entails the existence of a coherent set of estimates (b′) of φ and ψ, which is more accurate than b (under the Brier Score) in both of the salient possible worlds. I will say that such a b′ Brier-dominates b with respect to φ and ψ. To make things very concrete, let’s look at an example of such a b′ in this case.
The following table depicts the Euclidean-closest such b′, relative to Joyce’s {0, 1}-representation of the truth-values (viz., φ and ψ). [Ignore the α/β columns of the table, for now. I’ll explain their significance, below.]

        φ      ψ      α       β
b      1/2    1/4    9/16    3/16
b′     5/8    3/8    3/4     1/4
w1     0      1      7/16    9/16
w2     1      0      9/16    7/16

Table 3: An example of the language-dependence of Joycean Brier-domination

The estimates entailed by b′ are more accurate, with respect to φ and ψ, in both w1 and w2, according to the Brier Score. A natural question to ask (in light of section 1, above) is whether there is a Miller-style symmetric inter-translation that can reverse this Brier-dominance relation. Interestingly, it can be shown (proof omitted) that there is no linear Miller-style symmetric inter-translation (of the simple form above) that will do the trick. But, there is a slightly more complex (non-linear) symmetric inter-translation that will yield the desired reversal (and it is depicted above). Furthermore, it can be shown that this very same numerical inter-translation will yield such a reversal for any coherent function b′ that Brier-dominates b for this incoherent agent S (with respect to φ and ψ). To be more precise, we have the following theorem about our (particular) agent S:

Theorem. For any coherent function b′ that Brier-dominates S’s credence function b with respect to φ and ψ, there exist quantities α and β that are symmetrically inter-definable with respect to φ and ψ, via the following specific symmetric inter-translations,²

\[
\begin{aligned}
\alpha &= \tfrac{1}{2}\phi + \tfrac{1}{2}\psi + \tfrac{1}{16}\cdot\frac{\phi+\psi}{\phi-\psi}, & \qquad
\phi &= \tfrac{1}{2}\alpha + \tfrac{1}{2}\beta + \tfrac{1}{16}\cdot\frac{\alpha+\beta}{\alpha-\beta}, \\
\beta &= \tfrac{1}{2}\phi + \tfrac{1}{2}\psi - \tfrac{1}{16}\cdot\frac{\phi+\psi}{\phi-\psi}, & \qquad
\psi &= \tfrac{1}{2}\alpha + \tfrac{1}{2}\beta - \tfrac{1}{16}\cdot\frac{\alpha+\beta}{\alpha-\beta},
\end{aligned}
\]

and which are such that b Brier-dominates b′ with respect to α and β.

It is also noteworthy that the true values of α and β “behave like truth-values”, in the sense that (a) the true value of α (β) in w1 (w2) is identical to the true value of β (α) in w2 (w1), and (b) the true values of α and β always sum to one.
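The entries of Table 3 and the claimed reversal can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming the Brier Score is computed as a sum of squared errors (a mean would rank the vectors identically). The helper names `translate`, `brier` and `dominates` are hypothetical and are introduced only for this illustration.

```python
from fractions import Fraction as F

def translate(x, y):
    """Non-linear symmetric inter-translation from the theorem.
    It is its own inverse and is undefined when x == y."""
    half_sum = (x + y) / 2
    shift = F(1, 16) * (x + y) / (x - y)
    return (half_sum + shift, half_sum - shift)

def brier(estimate, truth):
    """Sum of squared errors (assumed form of the Brier Score); lower is better."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))

def dominates(x, y, truths):
    """True if x scores strictly better than y against every truth vector."""
    return all(brier(x, t) < brier(y, t) for t in truths)

b = (F(1, 2), F(1, 4))               # S's incoherent credences in P and not-P
b_prime = (F(5, 8), F(3, 8))         # the coherent function from Table 3
truths = [(F(0), F(1)), (F(1), F(0))]  # (phi, psi) in w1 and in w2

# phi/psi language: b_prime Brier-dominates b.
print(dominates(b_prime, b, truths))  # True

# alpha/beta language: translate everything, and the dominance flips.
translated_truths = [translate(*w) for w in truths]
print(dominates(translate(*b), translate(*b_prime), translated_truths))  # True
```

Exact fractions are used so that the translated values match the table entries exactly (for example, `translate(*b)` gives 9/16 and 3/16) rather than up to floating-point error.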
Indeed, these transformations are guaranteed to preserve coherence of all dominating b′’s, and the “truth-vectors”.³ So, while it is true that there are some aspects of “the truth” with respect to which S’s credence function b is bound to be less accurate than (various) coherent b′’s, it also seems to be the case that (for any such b′) there will be specifiable, symmetrically inter-definable aspects of “the truth” with respect to which the opposite is the case (i.e., with respect to which b is bound to be more accurate than b′).

² Although our translations are more complex than the very simple, linear Miller-style translations above, our translations can be rendered dimensionally homogeneous, by replacing “1/16” in the statement of the translations with “c/16”, where c is in the units of φ and ψ, and c takes the value 1. So amended, our translations would be appropriate for quantities with an associated physical dimension (e.g., velocities). But, because we’re dealing with dimensionless quantities here (e.g., probabilities), dimensional homogeneity is not even a pressing issue for us. See (Szirtes, 2007, Chapter 6) for a useful discussion concerning dimensional homogeneity.

³ A Mathematica notebook that contains verifications of all of the technical claims made in this note is available from the author. The notebook can be downloaded from the following URL: http://fitelson.org/joyce.nb. More general results can be proven (and further constraints can be accommodated on the desired translation scheme). But, all I need (dialectically) is one incoherent agent S for which I can ensure reversals of all such Brier-dominance relations via a single, symmetric inter-translation to/from the φ/ψ representation and the α/β representation. See the last section for further discussion.

In the next section, I consider several possible reactions to this Miller-esque “language dependence of the accuracy of credences” phenomenon. In the end, I think the upshot
will be that Joyce needs to tell us more about (precisely) what he means when he says that “credences are (numerical) estimates of (numerical) truth-values”. Specifically, I think the present phenomenon challenges us to get clearer on the precise content of the accuracy norm(s) that are applicable to (or constitutive of) the Joycean cognitive act of “estimation of the (numerical) truth-value of a proposition”.

3 Some Possible Reactions

3.1 Naturalness/Privileged Language

One might try to maintain that (in some sense) the quantities φ and ψ are “more natural” (in this context) than α and β and/or that the “estimation problem” involving φ and ψ is somehow “privileged” (in comparison to the α/β “estimation problem”). I don’t really see how such an argument would go. First, from the point of view of the α/β-language, the quantities φ and ψ seem just as “gerrymandered” as the quantities α and β might appear from the point of view of Joyce’s preferred numerical representation of the truth-values. Moreover, there is a disanalogy to the case of physical magnitudes like velocity, since truth-values don’t seem to have numerical properties (per se). That is, there is already something a little artificial about thinking of truth-values as the sort of things that can be “numerically estimated” (where the “estimates” are numerically scored for “accuracy”).

3.2 “Asymmetries” in Accuracy-Dominance in the α/β-Language

One might try to find some (new) accuracy-dominance asymmetry between coherent and incoherent credences in the α/β-language. I see two problems with this strategy. First, in the α/β-language (as opposed to the φ/ψ-language), some coherent vectors (in Joyce’s sense) are Brier-dominated by an incoherent vector (witness the example above). Having said that, it is also true that there do exist other coherent credence functions b⋆ that Brier-dominate b with respect to α and β.
As such, we don’t have an “utter reversal” of the (full) asymmetry between coherent and incoherent vectors in the φ/ψ-language. I’m not sure that’s required here (for our purposes). We have certainly broken the (full) asymmetry between coherent and incoherent vectors. Moreover, we could define-up another pair of quantities γ and δ (symmetrically inter-definable with respect to α and β, perhaps relative to a new family of inter-translations) which reverses these new (b⋆ vs b) relations of Brier-dominance, and so on. So, this response just seems to re-iterate the initial problem.

3.3 Disanalogies between “Estimation” and “Prediction”

I think the most promising (and useful) response to the phenomenon is to argue (i) that there are crucial disanalogies between “estimation” (in Joyce’s sense) and “prediction” (in the sense presupposed by Popper and Miller), and (ii) that these disanalogies imply that my “reversal argument” is presupposing something incorrect regarding the norms appropriate to “estimation”. Here, it is important to note that Joyce does not tell us very much about what he means by “estimation”. He does say a few things that are suggestive about what “estimation” is not. Specifically, Joyce clearly thinks:

1. Estimates are not guesses. Joyce (1998, 587) explicitly distinguishes estimation and guessing. Presumably, guessing (as a cognitive act) doesn’t have the appropriate normative structure to ground the sorts of accuracy (and coherence) norms (for credences) that Joyce has in mind.

2. Estimates are not expectations. Joyce (1998, 587–8) explicitly disavows thinking of estimates as expectations. Indeed, this is supposed to be one of the novel and distinctive features of Joyce’s approach. In fact, it’s supposed to be one of the advantages of his argument (over previous, similar arguments). Here, Joyce seems to think that expectations have two sorts of (dialectical) shortcomings, in the present context.
First, he seems to think that they have a pragmatic element, which is not suitable for a non-pragmatic vindication of probabilism. Second, expectations seem to build-in a non-trivial amount of probabilistic structure [via the definition of expectation, which presupposes that b(¬p) = 1 − b(p)], and this makes the assumption that estimates are expectations question-begging in the present context.

3. Estimates are not assertions that the values of the parameters are such-and-so. This is clear (just from the nature of these “estimation problems”), since it’s not a good idea to assert things that you know (a priori) must be false. And, whenever you offer “estimates” (in Joyce’s sense) that are non-extreme, you know (a priori) that the parameters (φ and ψ) do not take the values you are offering as estimates (an analogous point can be made with respect to what b and b′ “assert” about α and β).

These are the only (definite, precise) commitments about “estimates” that I’ve been able to extract from Joyce’s work (apart from the implicit assumption concerning the appropriateness of “scoring” them in terms of “accuracy” using the Brier score). Unfortunately, these negative claims about what Joyce means by “estimation” do not settle whether my “reversal argument” poses a problem for Joyce. Allow me to briefly explain why.

Let ⌜E(x, y) = ⟨p, q⟩⌝ be the claim that ⌜S is committed to the values ⟨p, q⟩ as their “estimates” (in Joyce’s sense) of the quantities ⟨x, y⟩⌝. What we need to know are the conditions under which the following principle (which is implicit in my “reversal argument”) is acceptable, relative to Joyce’s notion of “estimation” (E):

(†) If E(φ, ψ) = ⟨p, q⟩, then E(α, β) = f(p, q), where f is a symmetric inter-translation function that maps values of ⟨φ, ψ⟩ to/from values of ⟨α, β⟩.

Presumably, there will be some symmetric inter-translation functions f (in some contexts) such that (†) is acceptable to Joyce.
The question is: which translation functions f are acceptable in my example above? Since Joyce doesn’t give us a (sufficiently precise) theory of E, it is difficult to answer this question definitively. But, if my “reversal argument” is going to be blocked, then I presume that Joyce would want to reject (†) for our inter-translation function f⋆ above. It is natural to ask precisely what grounds Joyce might have for such a rejection of our f⋆.

It is useful to note that (†) is clearly implausible, under certain interpretations of E. Presumably, if E involves guessing, then one could argue that (†) should not hold (in general). Perhaps it is just fine for S’s guesses about the values of ⟨φ, ψ⟩ to be utterly independent of S’s guesses about ⟨α, β⟩ (at least, to the extent that I understand the “norms of guessing”). Similarly, if E involves expectation, then (†) will demonstrably fail (in general) for non-linear functions like our f⋆. [Although, on an expectation reading of “estimate,” (†) will demonstrably hold for all linear inter-translations.] Unfortunately for Joyce, neither of these interpretations of E is available to him. So, this yields no concrete reasons to reject (†) in our example.

On the other hand, if E involved assertion (as in item 3 above), then (†) would be eminently plausible. On an assertion reading of E, (†) is tantamount to a simple form of deductive closure for assertoric commitments (in the traditional sense). And, this would be very similar to the way Popper and Miller were thinking about the predictions of (deterministic, quantitative) scientific theories.⁴ It seems clear that E is not exactly like that (in this context), but this (alone) doesn’t give us any concrete reasons to reject (†) in this case.

I submit that what we need here is a (sufficiently precise) theory of E, which satisfies Joyce’s explicit commitments (1)–(3) above, and which is also precise enough to explain why (†) should fail for f⋆ (in our example above).
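The remark above about the expectation reading can be illustrated with a small computation. The following is a rough sketch, assuming a hypothetical probability of 5/8 that P is true (the credence that the coherent function in Table 3 assigns to P). The helper names `f_star`, `linear` and `expectation` are introduced only for this illustration, and `linear` reuses the Miller-style translation from section 1 purely as an example of a linear map.

```python
from fractions import Fraction as F

def f_star(x, y):
    """The non-linear symmetric inter-translation from the theorem in section 2."""
    shift = F(1, 16) * (x + y) / (x - y)
    return ((x + y) / 2 + shift, (x + y) / 2 - shift)

def linear(x, y):
    """A linear Miller-style translation (the one used in section 1)."""
    return (y - 2 * x, 2 * y - 3 * x)

p = F(5, 8)                          # assumed probability that P is true
w1, w2 = (F(0), F(1)), (F(1), F(0))  # (phi, psi) when P is false / true

def expectation(values_w1, values_w2):
    """Probability-weighted average of a pair of quantities over w1 and w2."""
    return tuple((1 - p) * a + p * c for a, c in zip(values_w1, values_w2))

expected_phi_psi = expectation(w1, w2)  # equals (p, 1 - p)

# Linear map: translating the expectations gives the expected translated values.
print(linear(*expected_phi_psi) == expectation(linear(*w1), linear(*w2)))  # True

# Non-linear map: the two procedures come apart, so the principle fails here.
print(f_star(*expected_phi_psi) == expectation(f_star(*w1), f_star(*w2)))  # False
```

With these numbers, translating the expected values of φ and ψ yields (3/4, 1/4), whereas the expected values of α and β are (33/64, 31/64).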
At the very least, this note serves as an invitation to Joyce to provide such an (independent) philosophical explication of his E.

⁴ See (Hacking, 1965) for an interesting discussion of different senses of “estimation” and “prediction” in the context of statistical theory.

References

Hacking, I. “Salmon’s Vindication of Induction.” The Journal of Philosophy 62, 10: (1965) 260–266.

Joyce, J. “A nonpragmatic vindication of probabilism.” Philosophy of Science 65, 4: (1998) 575–603.

Miller, D. “The accuracy of predictions.” Synthese 30, 1: (1975) 159–191.

Miller, D. Out of error: Further essays on critical rationalism. Aldershot: Ashgate Publishing Company, 2006.

Popper, K. Objective Knowledge: An Evolutionary Approach. New York: Oxford University Press, 1972, second edition.

Szirtes, T. Applied dimensional analysis and modeling. Burlington, MA: Butterworth-Heinemann, 2007, second edition.