Numerical Methods, Complexity, and Epistemic Hierarchies

Nicolas Fillion and Sorin Bangu*†

Modern mathematical sciences are hard to imagine without appeal to efficient computational algorithms. We address several conceptual problems arising from this interaction by outlining rival but complementary perspectives on mathematical tractability. More specifically, we articulate three alternative characterizations of the complexity hierarchy of mathematical problems that are themselves based on different understandings of computational constraints. These distinctions resolve the tension between epistemic contexts in which exact solutions can be found and those in which they cannot; however, contrary to a persistent myth, we conclude that having an exact solution is not generally more epistemologically beneficial than lacking one.

*To contact the authors, please write to: Nicolas Fillion, Simon Fraser University; e-mail: nfillion@sfu.ca. Sorin Bangu, University of Bergen; e-mail: sorin.bangu@fof.uib.no.

†The current work integrates the two papers we presented at the symposium The Plurality of Numerical Methods in Computer Simulations and Their Philosophical Analysis, held at IHPST Paris, November 2011. We thank Julie Jebeile and Anouk Barberousse for organizing the conference. We also thank the audience for this symposium as well as audiences at the WCPA 2014 and the PSA 2014 meetings. In particular, we would like to thank Bob Batterman, Audrey Yap, and Paul Humphreys for very helpful comments.

Philosophy of Science, 82 (December 2015), pp. 941–955.

1. Introduction. In Extending Ourselves, Humphreys set two desiderata for a scientifically informed philosophical approach to science: (1) "in dealing with issues concerning the application of mathematical models to the world, as empiricists we should drop the orientation of an ideal agent who is completely free from practical computational constraints of any kind," and yet (2) we should not "restrict ourselves to a minimalist position where what is computable is always referred back to the computational competence of human agents" (Humphreys 2004, 124). While the second desideratum (the rejection of 'minimalism') has potentially transforming consequences for how empiricist philosophers look at science, here we will heed the first one—that we should take seriously the 'computational constraints' confronting the scientists. Indeed, when it comes to extracting useful information from mathematical equations describing systems of interest, one often fails to notice how significant the difference is between merely proving that a solution exists and proving that a computational route to access it is available. In addressing the issue of computability, it is essential to draw attention to the important distortions to our understanding of the scientific enterprise that result from identifying the epistemic subject with an 'ideal agent' possessing unlimited calculational resources.
Even though this has only recently become the focus of extensive work in philosophy, this epistemic dimension has played a prominent role in the works of the founders of modern computation theory, such as Turing: "the assumption that as soon as a fact is presented to a mind all consequences of that fact spring into the mind simultaneously with it . . . is a very useful assumption under many circumstances, but one too easily forgets that it is false" (1950, 451). Since the computational route to the solution can be easier or harder to navigate for different types of mathematical problems, it is natural to organize more or less tractable problems into a complexity hierarchy.

Our article articulates and compares different ways in which the computational complexity hierarchy might be understood. We further emphasize that we must also consider theoretical computational constraints, in addition to the practical ones. As we will show, there are constraints that persist no matter how much computational power is available and regardless of how much the epistemic subjects are idealized. In other words, such constraints are grounded in objective facts about what types of solutions may be obtained for a given problem. To clarify the nature and consequences of such limitations, it is important to understand the relations within the variety of exact solutions (algebraic, elementary, closed form, analytic, etc.) and the methods used to obtain and justify numerical solutions, which are typically not exact.

2. Exactly Solvable and Unsolvable Problems. After a mathematical model is generated, the issues of interest come up when the scientist tries to extract useful information from it—or, in scientific parlance, when she tries to 'solve' the model. As is well known, it is only for very simple and drastically simplified models that the equations can be solved exactly (i.e., that an explicit formula can be obtained as a solution).¹ The textbook example is the simple perfect frictionless pendulum with small amplitude of oscillation. Its equation $\theta'' = -(g/L)\theta$ admits such a solution, namely, $\theta(t) = \theta_{\max}\sin\!\big((g/L)^{1/2}t\big)$. For even slightly less idealized models, such as the simple pendulum of arbitrary amplitude, the equation of motion $\theta'' = -(g/L)\sin\theta$ is not tractable in the same straightforward way. One should then use more sophisticated mathematical machinery to explicitly express an exact solution or instead appeal to numerical methods (such as a Runge-Kutta algorithm) to obtain (hopefully) approximate solutions. In other cases, only such inexact numerical values for the solutions are obtainable by computation, which gives rise to a different epistemic context.

1. We will clarify what we mean by "solving the equation with an explicit formula" in sec. 2, where types of exact solutions are distinguished.

The distinction between the two epistemic contexts can be amply illustrated. Here are several well-known scientific examples belonging to each category (Trefethen 2008, 605). Cases in which we can find an explicit formula that solves the problem include solving systems of n linear equations in n unknowns, minimizing n-variable linear functions subject to m linear constraints, and so on. Cases in which no explicit solution is available include finding eigenvalues of $n \times n$ matrices, minimizing functions of several variables, evaluating arbitrary integrals, solving ordinary and partial differential equations, and so on.
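To give a concrete sense of the numerical route, here is a minimal sketch (ours, not part of the original text; in Python, using SciPy's solve_ivp, whose default method is an explicit Runge-Kutta scheme, and with arbitrary illustrative parameter values) that integrates the arbitrary-amplitude pendulum equation and compares the result with the exact small-angle solution:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Arbitrary illustrative parameters.
g, L = 9.8, 1.0
omega = np.sqrt(g / L)
theta_max = 0.8  # large enough that the small-angle idealization breaks down

def pendulum(t, y):
    """Arbitrary-amplitude pendulum, theta'' = -(g/L) sin(theta)."""
    theta, theta_dot = y
    return [theta_dot, -(g / L) * np.sin(theta)]

t = np.linspace(0.0, 10.0, 201)
# Initial conditions chosen to match theta(t) = theta_max sin(omega t) at t = 0.
sol = solve_ivp(pendulum, (0.0, 10.0), [0.0, omega * theta_max],
                t_eval=t, rtol=1e-10, atol=1e-10)

small_angle = theta_max * np.sin(omega * t)  # exact solution of the idealization
print(np.max(np.abs(sol.y[0] - small_angle)))
```

For small amplitudes the two trajectories coincide; as the amplitude grows, the exactly solvable idealization drifts away from the numerically integrated motion.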
It may seem natural to think that the scientist operating in contexts of the second type finds herself in an epistemically disadvantaged position, especially when compared with the circumstances in which explicit exact solutions are available. In this way, the binary distinction—between mathematical problems that afford exact, explicit solutions and those that do not—gives rise to the belief that there is an epistemic hierarchization of the two contexts. It is easy to see why one might presume that possessing exact, explicit expressions for solutions is preferable to not possessing them: we seek these solutions because, presumably, they immediately reveal information about the behavior of systems. Since inexact solutions might fail to do so by not being genuinely informative (when their error is too large), exact solutions are deemed epistemologically superior. The central aim of what follows is to challenge this belief and to maintain instead that the difference is not as pronounced as one might think, since even when explicit solutions are available, numerical considerations (typically associated with inexact solutions) cannot be avoided. We argue that the mere existence of an explicit, exact solution does not necessarily improve one's epistemic position, as complications often occur. These complications require that we adopt a perspective on the role of equations (and their solutions) according to which this hierarchization is rather illusory. To this end, we first turn to a closer examination of the two epistemic contexts.

To begin, it is important to understand some nuances about the notion of an exact solution. There are many types of exact solutions to a mathematical problem, affording different epistemic advantages. It is not uncommon to see the phrases "exact solution," "algebraic solution," "analytic solution," and "closed-form solution" used interchangeably in the literature. Typically, these phrases are used to characterize an epistemological context in which exactness prevails (i.e., in which approximations are of no concern). Yet there are important differences between them. Among these phrases, "exact" is the most general: it refers to any mathematical object that satisfies the conditions constitutive of the problem; the other terms denote particular cases of exact solutions, arranged as shown in figure 1. Consider an arbitrary problem that happens to have a unique function as its exact solution (as is the case for "nice" initial-value problems). The problem is said to have an algebraic solution if the solution can be written as a finite combination of algebraic operations. So, the question whether there is an algebraic solution depends on what is in the class of algebraic functions. Note that the property "having an algebraic solution" depends on the existence of an expression of a given type that captures the function that solves the problem (and not only on the existence of a solution). The same can be said for elementary and closed-form solutions.
The classes of admissible operations for expressions of the solution are as follows:

• Algebraic expressions admit the following operations: addition, subtraction, multiplication, division, and exponentiation with integral and fractional exponents;
• Elementary expressions admit all algebraic operations, plus exponentials and logarithms in general (and so they include trigonometric and inverse trigonometric functions as well);
• Closed-form expressions include all elementary expressions, plus many other "well-understood functions," in particular the so-called special functions (but not arbitrary limits or integrals).²

2. Borwein and Crandall (2013) review seven different approaches to defining what the class of closed-form solutions contains; importantly, those approaches disagree on the extension of the class. Moreover, as they report (from Weisstein), "an infinite sum would generally not be considered closed form. However, the choice of what to call closed form and what not to is rather arbitrary since a new 'closed-form' function could simply be defined in terms of the infinite sum" (50). The idea is that, at a given stage of development of mathematics, any function that is well understood is to be considered closed form.

From their mutual relations, we see that some problems may have a closed-form solution without having an elementary solution, and that some problems may have an elementary solution without having an algebraic solution. For example, consider a fourth-degree polynomial,

$$p(z) = a_0 + a_1 z + a_2 z^2 + a_3 z^3 + a_4 z^4, \qquad (1)$$

and a fifth-degree polynomial,

$$q(z) = b_0 + b_1 z + b_2 z^2 + b_3 z^3 + b_4 z^4 + b_5 z^5. \qquad (2)$$

In the sixteenth century, Ferrari showed that there are explicit formulas giving the roots of equation (1) in terms of radicals, while no such formulas exist for the roots of equation (2) (as proved by Abel and Ruffini at the turn of the nineteenth century). Thus, one might think that finding the roots of p is a different kind of task than finding the roots of q. In technical terms, we say that whereas the quartic has an elementary solution, the quintic does not. The Abel-Ruffini Impossibility Theorem shows that fifth-degree polynomials generally have no algebraic solutions (i.e., they have no solutions expressible with algebraic expressions, which include radicals). That is of course not the same as saying that there is no solution, for the existence of a solution is guaranteed by the Fundamental Theorem of Algebra. Importantly, the Impossibility Theorem states that the solution cannot be expressed using a form that is particularly convenient for the sake of calculations. Moreover, it does not merely say that no such expression has been found so far; rather, it says that no such expression will ever be found. Yet there is an exact solution, and it turns out to be closed form, since it can be expressed in terms of Jacobi elliptic functions (as shown by Hermite in the mid-nineteenth century). So, we can write down an explicit formula for the roots, provided that we use special functions in this expression (and so it is not elementary). This being said, a look at how such roots are obtained in practice today reveals that solving equations (1) and (2) is not that different after all.
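Indeed, in a standard computing environment the quartic and the quintic are handled by one and the same numerical procedure. A minimal sketch (ours, in Python with NumPy; the coefficient values are arbitrary):

```python
import numpy as np

# Coefficients in NumPy's convention (highest degree first); values arbitrary.
p = [1.0, -3.0, 2.0, 5.0, -1.0]        # a quartic: radical formulas exist
q = [1.0, 0.0, -4.0, 1.0, 2.0, -1.0]   # a quintic: provably no radical formulas

# np.roots builds the companion matrix of the polynomial and computes its
# eigenvalues numerically; the same method is applied in both cases.
print(np.roots(p))
print(np.roots(q))
```

Ferrari's explicit formula for the quartic plays no role in the computation.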
If an engineer wants to find the roots of either of these polynomials, she will use a computer. But, while the computer cannot use an explicit (elementary) formula in the case of q, it does not necessarily use one in the case of p—since it might turn out that the use of such a formula slows down the actual computation of the result or provides spurious roots due to numerical instability (we return to this point later).³ Thus, in practice, the possession of an explicit formula does not immediately translate into an epistemic advantage.

3. For instance, in virtually any numerical analysis text, one finds an argument explaining that computing roots of quadratics using the quadratic formula is ill advised, for it may lead to what is known as 'catastrophic cancellation'.

Figure 1. [Diagram showing the nested relations among the types of exact solutions.]

As we have indicated above with the example of the pendulum, some simple physical systems do not have closed-form solutions. The perfect pendulum with arbitrary angle of oscillation is described by a differential equation that turns out not to have an elementary solution.⁴ However, it does have an exact solution, in terms of Jacobi elliptic functions, which is closed form. Be that as it may, instead of being satisfied with this exact nonelementary solution, physicists often approximate the problem by taking the limit $\theta \to 0$. Then the model equation reduces to the simple harmonic oscillator, which has the simple elementary solution $\theta(t) = c_1\sin\omega t + c_2\cos\omega t$, where $\omega = (g/L)^{1/2}$.

4. Another famous example of this situation is the global solution of the n-body problem provided by Wang (1990). It is analytic but does not have a closed-form representation.

Interestingly, if we take a simple harmonic oscillator and then add a linear factor to the model equation, we can again have a situation with no elementary solution. This is also easy to imagine in a physical setup. If you consider a mass attached to a Hookean spring, the model equation would be $x'' = -x$ (suppose the stiffness k is 1), a simple harmonic oscillator. However, in real systems, stiffness is not constant. We can try to understand what would happen if the stiffness increased linearly with time, so that $x'' + tx = 0$. It turns out that the solution of this model equation is the Airy function, which cannot be expressed as an elementary function. Thus, small changes in the physical circumstances can drastically alter the kind of solution afforded by the model equations.⁵

5. Another interesting perturbation of the equation leads to Duffing's equation, which also has a character that supports our argument.

When there is an exact solution but no elementary solution, it is necessary to rely in some way on an infinite series representation of the solution in order to evaluate it at a given time. With respect to calculations, the difficulty with infinite series representations is that we cannot sum an infinite number of terms. It then seems that we can evaluate the solution to arbitrary accuracy by using increasingly long (but finite) truncated series. An interesting situation arises when we have a perfectly good analytic solution in the form of a uniformly convergent Taylor series that converges so slowly that it ends up being of no practical use for computation. The Airy function mentioned above is a good example of this. Numerically, even if the series converges for all x, it might be of little practical use, since the theoretical uniform convergence might not translate into success in numerical contexts. Thus, to the extent that we need concrete numerical values, calculability is crucial.
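A sketch (ours, in Python with SciPy) makes the point concrete: it sums the Maclaurin series of the Airy function Ai(x), which converges for every x, and compares the truncated sum with SciPy's Airy routine:

```python
from scipy.special import airy, gamma

def airy_taylor(x, n_terms=300):
    """Sum the Maclaurin series of Ai(x) term by term.

    From y'' = xy one gets the coefficient recurrence
    a_{n+3} = a_n / ((n+2)(n+3)), with a_2 = 0; here the terms
    t_n = a_n x^n are updated directly to avoid overflow in x**n.
    """
    t0 = 3.0 ** (-2.0 / 3.0) / gamma(2.0 / 3.0)          # a_0 = Ai(0)
    t1 = -(3.0 ** (-1.0 / 3.0)) / gamma(1.0 / 3.0) * x   # a_1 x, a_1 = Ai'(0)
    window = [t0, t1, 0.0]                               # t_2 = 0
    total = t0 + t1
    for n in range(n_terms):
        t_next = window[0] * x ** 3 / ((n + 2) * (n + 3))  # t_{n+3}
        total += t_next
        window = [window[1], window[2], t_next]
    return total

for x in (-1.0, -10.0, -20.0):
    print(x, airy_taylor(x), airy(x)[0])  # truncated series vs. SciPy's Ai(x)
```

At x = −1 the truncated series agrees with the reference value to machine precision; at x = −10 several digits are already lost; at x = −20 the partial sums pass through terms of magnitude around 10²³ on the way to an answer of magnitude at most a few tenths, so in double precision the computed sum is rounding noise, even though the series converges in exact arithmetic.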
Calculability is conceptually most straightforward when we have an expression that we can evaluate, and this obtains when we have finite expressions capturing solutions (i.e., when we have algebraic or perhaps elementary solutions). The requirement of exactness is insufficient to the extent that it allows for solutions that cannot be expressed finitarily. However, as we discuss below, even finitarily representable solutions do not guarantee that no problems will arise in actual calculation.

The upshot of this discussion is this: when only qualitative behavior is of interest, exact solutions are not very important. But when quantitative information is required, exact solutions will in general not give us a straightforward recipe for obtaining numbers. Such a recipe would be within reach, however, if the exact solution could be captured by (or given in the form of) an algebraic or elementary expression. This is why, even when a problem is susceptible of receiving an exact solution, applied mathematicians often approximate the description of the system in order to derive model equations that have a closed-form solution or, even better, elementary or algebraic solutions. However, this implies that, for small changes in our description of the system, the character of the solutions can change significantly. And given that mathematical modeling is an activity practiced in a context where uncertainty is always present, this means that an emphasis on exact solutions will not, in general, guarantee the computability of accurate numerical results. These considerations lead us to a more flexible and inclusive way of dealing with the extraction of quantitative information from mathematical representations, inspired by the work of numerical analysts, as well as to different ways of characterizing the hierarchy of computational complexity.

3. Computational Cost and Numerical Stability. In order to characterize alternative, complementary ways of understanding complexity hierarchies, we distinguish among three forms of equivalence of mathematical problems. We will call them mathematical equivalence, computational equivalence, and numerical equivalence.⁶ To grasp the difference between the first two notions of equivalence, consider the following expression, known as the sample variance of a set of n values:

$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad (3)$$

where $\bar{x}$ is, as usual, the arithmetic mean of the n values, $\bar{x} = (1/n)\sum_{i=1}^{n} x_i$. Calculating the variance using expression (3) involves passing through the set of data twice (first to compute $\bar{x}$, then to accumulate the sum of squares). Another formula is sometimes considered as a replacement for equation (3):

$$s_n^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^{\!2}\right]. \qquad (4)$$

Formulas (3) and (4) are of course mathematically equivalent, since the graphs of the two right-hand-side formulas are identical, as can be shown by elementary manipulations.

6. The distinction between mathematical and numerical equivalence is a standard one in numerical analysis. See, e.g., Dahlquist and Bjorck (1974, 48).
But expression (4) seems to offer a computational advantage: to compute it, we pass through the data only once. We then say that (3) and (4) are not computationally equivalent: (4) is, from this perspective, preferable to (3).

Typically, such computational advantages are measured by counting the number of arithmetic operations; in numerical contexts, this is known as the "flop count," where 'flop' stands for 'floating-point operation'. Formula (3) necessitates n − 1 additions and one division by n to compute $\bar{x}$, n subtractions and n multiplications for the squaring, n − 1 operations for the sum of squares, and one division by n − 1, for a total of 4n operations. But, as one can easily check, formula (4) requires only 3n + 2 operations, so we see that the computational cost of formula (3) is larger than that of (4). Although (3) and (4) are thus, strictly speaking, not computationally equivalent, both computational costs are linear functions, so the difference between the two is often of no practical importance, even for very large collections of data, since modern computers can process gigaflops (10⁹ flops) per second.

Sometimes, however, the difference between computational costs is very important indeed. What matters is the difference between orders of computational cost. The two methods for computing the sample variance above have a linear cost, which we denote by O(n). But if we instead considered an algorithm having a quadratic cost (O(n²)), a cubic cost (O(n³)), or more generally a polynomial cost (O(n^k)) or even an exponential cost (O(c^n)), the difference would be more dramatic (see fig. 2). Such considerations lead to a second way of characterizing a hierarchy of complexity of mathematical problems.

Consider such a case, namely, the problem of finding the determinant of a matrix $A \in \mathbb{R}^{n \times n}$, which arises very often in science. We consider two methods. The first method—a recursive method known as Laplacian determinant expansion by minors, the method taught in introductory linear algebra—uses this formula to compute the determinant:

$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij}, \qquad (5)$$

where i is any row along which we expand, and $M_{ij}$ is the determinant of the $(n-1)\times(n-1)$ matrix obtained by crossing out row i and column j. This sum itself requires a linear number of operations; that is, the cost is a function of the form an + b, or simply O(n). But it requires computing each $M_{ij}$, which is itself the determinant of an $(n-1)\times(n-1)$ matrix, so that the summation for this number will require O(n − 1) operations. Again, this will involve computing the determinants of $(n-2)\times(n-2)$ matrices, which cost O(n − 2) operations each. So, including all the steps of the recursion, we have a cost of $O(n\cdot(n-1)\cdots 2\cdot 1) = O(n!)$, and factorial cost grows even faster than exponential. Clearly, for even a small system, this method will be prohibitively computationally expensive. However, one could use a more affordable strategy—Gaussian elimination—to transform the matrix A into an upper-triangular matrix B. Note that the elementary row operations involved in the transformation (adding a multiple of one row to another) do not change the determinant, so the problems of finding det(A) and det(B) are mathematically equivalent. However, det(B) is simple to find: it is the product of the n diagonal entries. Thus, since the reduction to a triangular matrix uses O(n³) operations (see, e.g., Corless and Fillion 2013, chap. 4), we have an algorithm with cubic cost, vastly smaller than the cost of the Laplacian expansion method. For instance, for a small 12 × 12 matrix, Gaussian elimination requires about 1,700 flops, while the Laplacian expansion method requires about 500 million flops. These are certainly not computationally equivalent.
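A small experiment conveys the scale of this difference. The sketch below (ours, in Python with NumPy) implements both strategies and times them on a random 8 × 8 matrix; by the operation counts above, the 12 × 12 case would take the recursive method roughly twelve thousand times longer again:

```python
import time
import numpy as np

def laplace_det(A):
    """Determinant by recursive Laplacian expansion in minors: O(n!) cost."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    # Expand along the first row: sum of (-1)^j * a_0j * det(minor).
    return sum((-1) ** j * A[0, j] * laplace_det(np.delete(A[1:], j, axis=1))
               for j in range(n))

def gauss_det(A):
    """Determinant by Gaussian elimination with partial pivoting: O(n^3) cost."""
    A = A.astype(float).copy()
    n = A.shape[0]
    det = 1.0
    for k in range(n):
        p = k + int(np.argmax(np.abs(A[k:, k])))  # pivot row
        if p != k:
            A[[k, p]] = A[[p, k]]
            det = -det                # a row swap flips the sign of det
        det *= A[k, k]                # det(B) is the product of the pivots
        A[k + 1:] -= np.outer(A[k + 1:, k] / A[k, k], A[k])
    return det

rng = np.random.default_rng(0)
A = rng.random((8, 8))
for method in (gauss_det, laplace_det):
    start = time.perf_counter()
    value = method(A)
    print(f"{method.__name__}: det = {value:.6f}, {time.perf_counter() - start:.3f} s")
```

On ordinary hardware the elimination method returns essentially instantly, while the recursive method already takes noticeable time at n = 8.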
As we have seen, mathematical equivalence regards the calculations as abstract operations, while computational equivalence takes into account some practical computational constraints. Yet the notion of numerical equivalence goes even deeper into these constraints, and, in fact, this is the notion that underlies modern approaches to scientific computing (as opposed to Turing-style computation and complexity theory). In fact, there is a computing tradition grounded in numerical analysis that significantly differs from and complements the computation theory that is more familiar to philosophers of science (see, e.g., Blum 2004). Below we emphasize key aspects of the concept of numerical stability and how it leads to a different characterization of computational complexity.

Figure 2. Logarithmic plot of different orders of computational cost.

In numerical terms, computational limitations are often given by specifying the number of digits available to represent numbers; this is made essential by the fact that digital computers are unable to handle the infinite number of decimals that a real or complex number may have. Thus, real and complex numbers are replaced by finite "machine numbers."⁷

7. This has important consequences that cannot be neglected; indeed, in limited-precision arithmetic, many field laws do not hold true, e.g., associativity of addition and cancellation of multiplication. Thus, a computer arithmetic has a different algebraic structure.

For the purposes of a simple example, suppose that we work within the confines of what is called '8-digit fixed-point arithmetic'. That is, we have only 8 digits available to represent numbers, so numbers longer than 8 digits get chopped, which gives rise to a round-off error. Under this constraint, let us consider the following set of three values: x₁ = 10,000, x₂ = 10,001, and x₃ = 10,002. For these values, formula (3) computes $s_3^2 = 1$, and formula (4) computes $s_3^2 = 0$. As is immediately evident, this discrepancy is due to the fact that chopping discards the last digits in some cases. This shows that, even if (3) and (4) are mathematically equivalent and almost computationally equivalent in the sense given above, they are not numerically equivalent.
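The discrepancy is easy to reproduce on a modern machine. In the sketch below (ours, in Python with NumPy), single precision (float32, roughly 7 significant decimal digits) plays the role of the 8-digit arithmetic used above:

```python
import numpy as np

x = np.array([10000.0, 10001.0, 10002.0], dtype=np.float32)
n = np.float32(x.size)

# Formula (3), two passes: compute the mean, then sum squared deviations.
mean = x.sum() / n
s2_two_pass = ((x - mean) ** 2).sum() / (n - 1)

# Formula (4), one pass: sum of squares minus square of sum over n.
s2_one_pass = ((x * x).sum() - x.sum() ** 2 / n) / (n - 1)

print(s2_two_pass, s2_one_pass)              # 1.0 versus 0.0 in single precision
print(np.var(x.astype(np.float64), ddof=1))  # 1.0 in double precision
```

The squares $x_i^2$ require nine significant digits, so in single precision their final digits are rounded away before the subtraction in (4) takes place, and the entire signal is lost; formula (3) subtracts the mean first, while the data still carry the relevant digits.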
This situation raises interesting conceptual questions: What is the value of $s_3^2$ after all? We noted that in certain circumstances (a large amount of data), two mathematically equivalent expressions are associated with different computational speeds—and in these circumstances (4) is to be preferred to (3) (although, as we saw, the difference in practice is not very significant). In yet other circumstances (8-digit arithmetic and data of a certain type), the numerical equivalence of the two expressions is lost, despite their mathematical equivalence. Therefore, one must advance a principled reason to select the 'right' expression when calculating concrete values. What is this principle, then?

Before answering the question, note that modern scientific computing generally uses floating-point arithmetic and not fixed-point arithmetic as in the example above.⁸

8. See, e.g., the first appendix and the first chapter of Corless and Fillion (2013). The main advantage of floating-point numbers over fixed-point numbers is that they cover a broader range of approximate real values.

If we work in, say, a standard 16-digit floating-point arithmetic, there are design standards that guarantee that basic operations such as addition, subtraction, multiplication, division, square root, and so on are correctly rounded.⁹ In other words, for any basic real-valued operation $*\colon \mathbb{R}\times\mathbb{R}\to\mathbb{R}$,¹⁰ it will be the case that

$$\mathrm{fl}(x * y) = x * y + \Delta = (x * y)(1 + \delta), \qquad (6)$$

where Δ is a round-off error that is at most half the unit in the last place,¹¹ and δ is the equivalent relative error given by

$$\delta = \frac{\Delta}{x * y} = \frac{\mathrm{fl}(x * y) - x * y}{x * y}. \qquad (7)$$

So the calculated floating-point value of such basic operations will be within a factor of $1 \pm 1.1\times10^{-16}$ of the exact value. When multiple operations are chained together, we obtain a cumulative round-off error in terms of the individual round-off errors. If implementing a problem in floating point using a certain algorithm (or formula) results in a relatively small cumulative round-off error, then we say that the algorithm or formula is numerically stable. The notion of numerical equivalence thus necessitates that we consider sets of problems/functions and their relations, and not only a single function written in mathematically equivalent forms.

9. One such standard that is widespread in the software industry is the IEEE-754 standard.

10. The point generalizes to operations of arbitrary (but finite) arity and to complex numbers.

11. The unit in the last place, denoted ulp, is a constant determined by the parameters of a system of floating-point arithmetic. In a standard 16-digit floating-point system using a binary basis, it is about 1.1 × 10⁻¹⁶. Another related quantity is also used, namely, the "machine epsilon."
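Equation (6) can be checked empirically. In this sketch (ours, in Python), a single floating-point addition is performed and its relative error δ is computed exactly, using rational arithmetic; the error falls below the unit roundoff, about 1.1 × 10⁻¹⁶ in double precision:

```python
from fractions import Fraction
import numpy as np

u = np.finfo(np.float64).eps / 2   # unit roundoff, about 1.1e-16

x, y = 0.1, 0.2                    # two machine numbers
z = x + y                          # fl(x + y), the correctly rounded sum

# Python floats convert to Fraction exactly, so delta is computed exactly.
exact = Fraction(x) + Fraction(y)
delta = (Fraction(z) - exact) / exact
print(float(delta), float(u))      # |delta| <= u, as equation (6) guarantees
```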
Let us examine a simple problem that will exemplify how one chooses between solutions obtained from numerically inequivalent problems. Its careful analysis will point toward the general principle underlying the analysis of scientific computation. Suppose, then, that we are in the happy circumstance that we can solve the model equations of a physical process, and we obtain an explicit solution f:¹²

$$f(x) = x\big(\sqrt{x+1} - \sqrt{x}\big). \qquad (8)$$

12. Matthews and Fink (1999, 28) use this example to illustrate loss of significant figures, but we use it instead to make a more general point about numerical stability.

Suppose, further, that for some reason we are interested in calculating the value f(500). If we had unlimited computational resources, such a value would be given in the form of an infinite array of decimals. However, this value is simply not available to a finite epistemic agent computing it on a finite physical machine. So, within the current context, the question is in fact what the value of f(500) is, given certain computational constraints. We can call the value supposedly obtainable by an ideal (Platonist, if you will) mathematician the 'true' value, but we have to think about what the actual, computable value is, under the given constraints. Suppose our constraint is that we work in a 6-digit floating-point arithmetic in base 10. Then the computed value would be

$$f(500) = 500\big(\sqrt{501} - \sqrt{500}\big) = 500(22.3830 - 22.3607) = 11.1500. \qquad (9)$$

Let us introduce the notation $J_C(a)$, standing for the 'computable' value of J at point a, subject to constraints C. Then $f_C(500) = 11.1500$. At this point, suppose we discover a different expression g,

$$g(x) = \frac{x}{\sqrt{x+1} + \sqrt{x}}, \qquad (10)$$

which, as is immediately clear, is mathematically equivalent to f. In fact, we may regard it as merely an accident that we first expressed the solution as a computation of the expression for f. The solution could have been obtained in the form g to begin with. We can put this in terms of epistemic symmetry: imagine that Fred and Ginger solve the problem in parallel, perhaps in different rooms, and that Fred writes down the solution as f while Ginger obtains it as g. Epistemically speaking, there seems to be perfect symmetry between them. Yet a simple calculation shows that when Ginger computes $g_C(500)$, she gets

$$g_C(500) = \frac{500}{\sqrt{501} + \sqrt{500}} = \frac{500}{22.3830 + 22.3607} = \frac{500}{44.7437} = 11.1748. \qquad (11)$$

The problem we face now is of the type introduced above: which of the two values, 11.1500 or 11.1748, should we use (e.g., to build a bridge)? Recall the aim we stated at the outset: to raise a challenge to the belief that it is preferable to have an analytic solution available because, allegedly, in this case we do not need to appeal to error-theoretic considerations. We just made such an assumption—that we are in the happy circumstance of knowing one exact expression for the solution of the differential equation of interest: f. The point of this example is to show that this was not good enough. When it came to actually calculating values at points of interest, we noticed that this expression of the solution can be questioned. The lesson is that the supposedly superior epistemic situation of possessing an exact explicit solution is in fact on many occasions more precarious—and it is exactly considerations of a numerical nature that show this.
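The asymmetry survives outside the 6-digit toy system. In the following sketch (ours, in Python with NumPy; the reference value is computed to 50 digits with the decimal module), both expressions are evaluated in ordinary double precision:

```python
import numpy as np
from decimal import Decimal, getcontext

def f(x):
    # x * (sqrt(x+1) - sqrt(x)): subtracts two nearly equal square roots.
    return x * (np.sqrt(x + 1) - np.sqrt(x))

def g(x):
    # x / (sqrt(x+1) + sqrt(x)): mathematically equivalent, no cancellation.
    return x / (np.sqrt(x + 1) + np.sqrt(x))

getcontext().prec = 50  # 50-digit reference value of f(500) = g(500)
ref = Decimal(500) * (Decimal(501).sqrt() - Decimal(500).sqrt())

for h in (f, g):
    value = h(500.0)
    rel_err = abs((Decimal(float(value)) - ref) / ref)
    print(h.__name__, value, float(rel_err))
```

In double precision f loses several digits to the subtraction of nearly equal square roots, while g remains correct nearly to the last digit, which is exactly the pattern that the error analysis below predicts.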
This example can be generalized. Suppose, again, that function f is the solution to our model. We also want the physical information encoded in this function; that is, we would like to know the values of this function at various points t. As limited epistemic agents, we are always under certain computational limitations C, so we have to calculate $f_C(t)$. The problem, in its most general form, is that we are not immediately justified in working with this value, since other values exist—and, therefore, we need a principled way to choose among them. We know that other values exist since there are other functions mathematically equivalent to f, and, what is even more disconcerting, we can produce them at will, by simple manipulations of any one of them. The general nature of the difficulty is then clear: given a solution f, we can specify an infinite class of functions $g_k$ such that the following conditions obtain:

1. for all k, $g_k$ is mathematically equivalent to f;
2. it is possible that at least one of them, call it $g_i$, will be computationally inequivalent to f;
3. it is possible that at least one of them, call it $g_i$, is such that $g_{iC}(t) \neq f_C(t)$.¹³

13. With a characterization of approximate solutions in terms of solutions of nearby problems, as found, e.g., in Fillion and Corless (2014), we could even say in addition that $g_{iC}(t) \not\approx f_C(t)$.

Thus, if we construct a complexity hierarchy based on the notion of numerical stability, it will differ in important respects from the two hierarchies mentioned earlier. We want to use a computed solution from an expression that is as computationally simple as possible while being numerically stable. If we return to the alternative expressions of equations (8) and (10), we can use the model of equation (6) to determine whether each expression is numerically stable. If we write the floating-point implementation of (8) and want to find an equivalent real function with error terms, we get

$$\begin{aligned} f_C(x) &= \mathrm{fl}\big(x(\sqrt{x+1} - \sqrt{x})\big)\\ &= x\,\mathrm{fl}(\sqrt{x+1} - \sqrt{x})\,(1+\delta_1)\\ &= x\big(\mathrm{fl}(\sqrt{x+1}) - \mathrm{fl}(\sqrt{x})\big)(1+\delta_2)(1+\delta_1)\\ &= x\big(\sqrt{x+1}\,(1+\delta_3) - \sqrt{x}\,(1+\delta_4)\big)(1+\delta_2)(1+\delta_1). \end{aligned} \qquad (12)$$

We thus see that the worst-case scenario is

$$f_C(x) = f(x)(1+\delta)^3, \qquad (13)$$

where δ is the maximum relative rounding error dictated by the size of the unit in the last place for this system of floating-point arithmetic (in this case, with 6 digits and a = 500, this is about δ = 10⁻⁴). What about g? We have

$$\begin{aligned} g_C(x) &= \mathrm{fl}\!\left(\frac{x}{\sqrt{x+1} + \sqrt{x}}\right)\\ &= \frac{x}{\mathrm{fl}(\sqrt{x+1} + \sqrt{x})}\,(1+\delta_1)\\ &= \frac{x}{\big(\sqrt{x+1}\,(1+\delta_2) + \sqrt{x}\,(1+\delta_3)\big)(1+\delta_4)}\,(1+\delta_1). \end{aligned} \qquad (14)$$

Here, the worst-case scenario is

$$g_C(a) = g(a)(1+\delta), \qquad (15)$$

which is much more robust to round-off errors of maximum magnitude δ. Thus, we can conclude that, once implemented in floating-point arithmetic, g solves a very nearby problem, but f does so to a lesser extent. But, as we have noted, many other candidate functions could come up against g; would that undermine our confidence that g gave us a reliable answer? No, since the error factor for g is of order (1 + δ), so that no other candidate could have an error factor of lower order.
As we see, the argument is quite general and does not depend on the specifics of a particular system of floating-point arithmetic. In fact, the values δ could equally well be thought of as physical perturbations or measurement errors, instead of round-off errors. This is why the expression g will be more information conducive than f, in both a numerical and a physical sense. As a result, we conclude that, even if Fred and Ginger started from epistemically symmetric contexts, their results have asymmetric values. In other words, the epistemic symmetry in the mathematical context does not carry over to the numerical context.

4. Conclusion. When we do not have an exact solution to a mathematical problem, it is important to consider various forms of stability, to determine the similarity between modified problems, and to determine the robustness of our original reference problem to perturbations. By such forms of analysis, we establish that inexact solutions are entirely satisfactory (or not, as the case may be), given a certain modeling context. More precisely, we seek to show that the computational error engendered by a numerical method is small in comparison to the systemic modeling error that we know to be present for physical reasons. But even for exact solutions, the robustness or sensitivity to such factors has to be established. As a result, the alleged superiority of exact solutions is mitigated, provided that one is interested in scientific modeling. This fact has important consequences for the way the hierarchy of complexity of mathematical problems should be conceived, as it incorporates a relationship with nearby problems in terms of both computational complexity and numerical stability.

REFERENCES

Blum, L. 2004. "Computing over the Reals: Where Turing Meets Newton." Notices of the American Mathematical Society 51 (9): 1024–37.
Borwein, J., and R. Crandall. 2013. "Closed Forms: What They Are and Why We Care." Notices of the American Mathematical Society 60 (1): 50–65.
Corless, R. M., and N. Fillion. 2013. A Graduate Introduction to Numerical Methods. New York: Springer.
Dahlquist, G., and A. Bjorck. 1974. Numerical Methods. Englewood Cliffs, NJ: Prentice-Hall. Translated from the 1969 Swedish edition.
Fillion, N., and R. M. Corless. 2014. "On the Epistemological Analysis of Modeling and Computational Error in the Mathematical Sciences." Synthese 191:1451–67.
Humphreys, P. 2004. Extending Ourselves: Computational Science, Empiricism, and Scientific Method. New York: Oxford University Press.
Matthews, J., and K. Fink. 1999. Numerical Methods Using MATLAB. 3rd ed. Englewood Cliffs, NJ: Prentice-Hall.
Trefethen, L. N. 2008. "Numerical Analysis." In The Princeton Companion to Mathematics, ed. T. Gowers, J. Barrow-Green, and I. Leader. Princeton, NJ: Princeton University Press.
Turing, A. 1950. "Computing Machinery and Intelligence." Mind 59 (236): 433–60.
Wang, Q. 1990. "The Global Solution of the n-Body Problem." Celestial Mechanics and Dynamical Astronomy 50 (1): 73–88.