– Confucius

“There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.” That perceptive comment was written in Life on the Mississippi (1883) by Samuel Clemens (1835–1910), known by his pen name, Mark Twain, who has been called “the father of American Literature,” and who wrote The Adventures of Tom Sawyer and The Adventures of Huckleberry Finn. In a similar vein, but more relevant to a mathematical textbook like the one you are reading now, is a remark by the physicist Eugene Wigner. In his paper, “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” Wigner (1960) wrote, “it is important to point out that the mathematical formulation of the physicist’s often crude experience leads in an uncanny number of cases to an amazingly accurate description of a large class of phenomena.”

Dimensional analysis, the subject of this introductory chapter, illustrates this phenomenon well. Consider a simple pendulum, consisting of the Earth’s gravity g acting on a mass m suspended on a string of length l. How does the period of the pendulum depend on m, l, and g? In this chapter, we show how to solve this problem using only dimensional analysis, without any physics or calculus.
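As a minimal sketch of the idea (in Python here, not the book's own R code), suppose the period has the form T = C m^a l^b g^c, where C is a dimensionless constant. Matching the exponents of mass, length, and time on both sides gives three linear equations for a, b, and c. The helper name `solve_linear_system` and the matrix layout are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def solve_linear_system(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination.

    Exact fractions are used so that exponents such as 1/2 are not rounded.
    """
    n = len(matrix)
    # Augmented matrix [A | b] in exact rational arithmetic.
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc
                          for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Rows: exponents of mass (M), length (L), time (T).
# Columns: the variables m, l, g, with [m] = M, [l] = L, [g] = L T^-2.
dimension_matrix = [
    [1, 0, 0],   # mass:   a         = 0
    [0, 1, 1],   # length: b + c     = 0
    [0, 0, -2],  # time:   -2c       = 1
]
period_dimensions = [0, 0, 1]  # the period has dimension T^1

a, b, c = solve_linear_system(dimension_matrix, period_dimensions)
print(a, b, c)  # exponents of m, l, g
```

Under these assumptions the exponents are a = 0, b = 1/2, and c = -1/2, so the period is proportional to the square root of l/g and does not depend on the mass. Dimensional analysis alone cannot determine the constant C.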

“The principal use of dimensional analysis is to deduce from a study of the dimensions of the variables in any physical system certain limitations on the form of any possible relationship between those variables. The method is of great generality and mathematical simplicity” (Bridgman 1969).

When discussing the future of geophysics education, G. K. Vallis (2016) made this perceptive statement:

Scientists will always have personal preferences and differing expertise but combining analytical ideas with simple numerical models can be a very powerful tool in both research and education, and modern tools can be used to enable this at an early stage in the classroom. A numerical model transparently coded in 100 lines and run on a laptop can then play a similar role to that of a rotating tank in illustrating phenomena and explaining what equations mean, and the rift between theory, models and phenomena then never opens. Although conventional books will remain important for years to come, the next textbook or monograph in GFD, or really in any similar field, could to great effect be written using a Jupyter Notebook (formerly IPython Notebook), or similar, which can combine numerical models with conventional text and equations (e.g. LaTex markup), figures, and even symbolic manipulation in a single document, enabling interactive exploration of both analytical and numerical GFD concepts. Such an effort would be a major undertaking, so a collaborative effort may be needed, perhaps like the development of open source software, and the end product would hopefully be free like both beer and speech.

Our book is written from this perspective for the future of mathematics education in climate sciences. The book uses R and the R Notebook. This chapter explains the installation of R and RStudio and demonstrates some basic uses of R. Equivalent Python codes and their Jupyter Notebooks may be found at the book website www.cambridge.org/climatemathematics.

You may have seen statements such as, “The available evidence rejects the null hypothesis at the 5% significance level.” That language certainly sounds “scientific.” What such statements really mean, however, is strongly dependent on context. You might be enthusiastic about trying a new restaurant or seeing a new movie, if 19 out of 20 online reviews were favorable. But you would never get on an airplane if you thought the odds were 1 in 20 that it would crash. We use “statistics” to mean a suite of scientific methods for analyzing data and for drawing credible conclusions from data. We provide basic concepts and useful R codes covering commonly used statistical methods in climate data analysis, so that users can arrive at credible conclusions based on the data, together with a given error probability. To interpret the statistical results in a meaningful way, however, knowledge of climate science is essential. The statistical methods in this chapter have been chosen to focus on making credible inferences in climate science, with a given error probability, based on the analysis of climate data, so that observational data can lead to objective and reliable conclusions. We will first describe a list of statistical indices, such as mean, variance, and quantiles, for climate data. We will then take up probability distributions and statistical inferences. You may be surprised to learn that some well-known and powerful statistical techniques were developed by a scientist who spent his entire career working as a beer brewer at the Guinness Brewery in Dublin, Ireland.
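The statistical indices mentioned above can be illustrated with a short sketch. It is written in Python using only the standard library, not the book's R code, and the anomaly values are hypothetical numbers made up for illustration, not observations.

```python
import statistics

# Hypothetical annual temperature anomalies in degrees Celsius.
# These values are invented for illustration and are not real data.
anomalies = [-0.12, 0.05, 0.18, 0.10, 0.31, 0.27, 0.45, 0.40, 0.62, 0.54]

mean = statistics.mean(anomalies)
# Sample variance, which divides by n - 1 rather than n.
variance = statistics.variance(anomalies)
std_dev = statistics.stdev(anomalies)
# Three cut points dividing the sorted data into four equal-probability groups.
quartiles = statistics.quantiles(anomalies, n=4)

print("mean:", round(mean, 3))
print("variance:", round(variance, 4))
print("standard deviation:", round(std_dev, 3))
print("quartiles:", [round(q, 3) for q in quartiles])
```

Note that different software packages use slightly different interpolation rules for quantiles, so quartiles computed by another tool may differ a little from these.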

Mathematicians have described some aspects of pure mathematics as “beautiful,” using terms such as “elegant,” “deep,” and “general.” This book, however, is about topics in mathematics that have proven to be useful in the application of mathematics to climate science. In deciding what to include in this book, our criteria have been practical rather than aesthetic. To illustrate the usefulness of linear algebra and matrices in climate science, one instructive example is the analysis of sea level pressure data using a singular value decomposition (SVD) method. The term El Niño originally meant the occasional appearance, every few years, of unusually warm surface water in the eastern tropical Pacific. Thanks to research in climate science, we now realize that El Niño is part of a large-scale complex of changes in the atmosphere and ocean with far-reaching effects. SVD and related mathematical methods have played an important role in this major scientific advance.

Emanuel Lasker (1868–1941), one of the greatest chess players of all time, was chess champion of the world for 27 years. He also had earned a doctorate in mathematics, but he gave up mathematics for chess. Lasker once said, “In mathematics, if I find a new approach to a problem, another mathematician might claim that he has a better, more elegant solution. In chess, if anybody claims he is better than I, I can checkmate him.” The practical experience of playing a number of games of chess with each other is the best way to determine which of two chess players is stronger. The practical experience of carrying out a large amount of climate research with various mathematical methods is the best way to determine which methods are most useful in climate science.

This chapter is devoted to a class of climate models known as energy balance models (EBMs). These models have been developed as highly idealized and simplified representations of certain key aspects of a climate system. An EBM is developed on the basis of the principle of energy balance: the climate system in question is assumed to be in a balanced or equilibrium state, such that the incoming solar energy entering the system is equal to the outgoing energy leaving the system. The motions of an atmosphere and an ocean are not explicitly considered in an EBM.

Such an extremely simplified model can aid us in understanding a very complex system, such as Earth’s climate, and in the case of systems much simpler than the Earth’s climate, an EBM may be capable of simulating some climate parameters of the system fairly realistically. In fact, some bodies in the solar system have neither atmosphere nor ocean, and our Moon is one such body. Furthermore, the Moon has no water. As we shall show, a simple EBM can simulate the Moon’s surface temperature quite realistically.

This chapter will also describe applications of an EBM to the planet Earth, despite the drastic simplification that atmospheric and oceanic motions are not considered at all. The unknown variable in an EBM applied to the Earth’s climate is the Earth’s surface air temperature. This chapter considers only a few very simple EBMs, such as a stationary zero-dimensional model with and without albedo feedbacks, and discusses the EBM solutions, their stability, and their climate interpretations and limitations.
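The energy balance principle described above can be sketched for the simplest zero-dimensional case. The balance assumed here is (1 - albedo) S / 4 = emissivity × sigma × T^4, where S is the solar constant. The function name and the numerical values for S and the albedo are illustrative assumptions, and the code is in Python, not the book's R.

```python
# Stefan-Boltzmann constant in W m^-2 K^-4.
STEFAN_BOLTZMANN = 5.670374419e-8


def equilibrium_temperature(solar_constant, albedo, emissivity=1.0):
    """Temperature (K) at which emitted radiation balances absorbed sunlight.

    Assumes a zero-dimensional body: absorbed flux (1 - albedo) * S / 4
    equals emitted flux emissivity * sigma * T**4.
    """
    absorbed = (1.0 - albedo) * solar_constant / 4.0
    return (absorbed / (emissivity * STEFAN_BOLTZMANN)) ** 0.25


# Assumed illustrative inputs: S = 1361 W m^-2 and a planetary albedo of 0.30.
earth_temperature = equilibrium_temperature(1361.0, 0.30)
print(round(earth_temperature, 1), "K")
```

With these assumed inputs the result is roughly 255 K, well below the observed global mean surface air temperature of about 288 K, because this sketch contains no greenhouse effect. Lowering the `emissivity` argument below 1 is one crude way to represent that effect in such a model.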

Let us take stock. In the preceding chapters, we have introduced the programming language R and discussed the mathematical topics of dimensional analysis, basic statistical methods, and matrices and linear algebra. We then surveyed energy balance models in Chapter 5. In the present chapter and the following one, we will discuss several topics involving climate science applications of calculus, focusing on derivatives in the present chapter and on integrals in the following chapter. Before starting these two chapters, we suggest reviewing Appendix D on calculus concepts and methods for climate science.

You may find our treatment of calculus somewhat different from the way you were first taught this subject. Re-learning calculus may thus resemble returning to a place you have visited before and seeing it from a new perspective. In Appendix D, we describe a simple and direct approach to calculus due to René Descartes (1596–1650), which is not at all the same as the conventional method found in most textbooks. Appendix D outlines our rationale for taking this pedagogically novel route. We hope that you will come to agree with our choice of approaches to calculus. In choosing this unusual approach, we have been mindful of a wise comment made by Ferdinand Porsche (1875–1951), the brilliant engineer who founded the Porsche automobile company. Dr. Porsche said, “Change is easy. Improvement is far more difficult.” In this chapter, we cover linear approximation, Newton’s method, linearization of the Stefan–Boltzmann blackbody radiation law, Taylor expansion, and partial derivatives.
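Two of the topics listed above, Newton's method and the linearization of the Stefan–Boltzmann law, can be sketched together. Newton's method repeatedly replaces a function by its linear approximation and solves the linear problem. The sketch below is in Python, not the book's R, and the absorbed flux of 238 W m^-2 and the function names are assumptions chosen only for illustration.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def newton(f, df, x0, tol=1e-10, max_iter=50):
    """Find a root of f by Newton's method, starting from x0."""
    x = x0
    for _ in range(max_iter):
        # The linear approximation f(x) + df(x) * step = 0 gives the update.
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


def linearized_emission(t, t0):
    """Linear approximation of sigma * t**4 around the temperature t0."""
    return SIGMA * t0 ** 4 + 4.0 * SIGMA * t0 ** 3 * (t - t0)


ABSORBED_FLUX = 238.0  # assumed absorbed solar flux, W m^-2


def imbalance(t):
    """Emitted minus absorbed flux; zero at equilibrium."""
    return SIGMA * t ** 4 - ABSORBED_FLUX


def imbalance_derivative(t):
    return 4.0 * SIGMA * t ** 3


root = newton(imbalance, imbalance_derivative, x0=300.0)
print(round(root, 2), "K")

# Compare exact and linearized emission one kelvin away from 255 K.
exact = SIGMA * 256.0 ** 4
approx = linearized_emission(256.0, 255.0)
print(round(exact - approx, 4), "W m^-2 linearization error")
```

The linearization error for a one-kelvin departure is a small fraction of a watt per square meter, which is why the linear form is a useful approximation for small temperature changes.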

We now move on to illustrate several ways in which integrals are used in climate science. We focus first on hydrostatic balance, the equilibrium between the forces of gravity and the vertical pressure gradient that is found in both the ocean and the atmosphere for phenomena characterized by large horizontal length scales. We then introduce the important concept of geopotential, the gravitational potential energy per unit mass at a given atmospheric altitude z with respect to sea level. We provide a general derivation of the hypsometric equation showing that atmospheric pressure decreases exponentially with increasing elevation, and we discuss approximations to this equation. We also show how the hypsometric relationship was used in the 1800s to determine the height of mountains with surprising accuracy, long before modern technology was developed.
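The exponential decrease of pressure with height can be illustrated with a short sketch, assuming a layer with a single mean temperature, so that the thickness between two pressure levels is (R T / g) ln(p_lower / p_upper). The constants, function names, and sample values below are illustrative assumptions, and the code is in Python, not the book's R.

```python
import math

R_DRY = 287.0  # gas constant for dry air, J kg^-1 K^-1
GRAVITY = 9.81  # gravitational acceleration, m s^-2 (assumed constant)


def layer_thickness(p_lower, p_upper, mean_temperature):
    """Thickness (m) of the layer between two pressure levels.

    Assumes a single layer-mean temperature (K). The pressures may be in
    any unit as long as both use the same one, since only their ratio enters.
    """
    scale_height = R_DRY * mean_temperature / GRAVITY
    return scale_height * math.log(p_lower / p_upper)


def pressure_at_height(p_surface, height, mean_temperature):
    """Pressure at a given height (m) above the level where p = p_surface."""
    scale_height = R_DRY * mean_temperature / GRAVITY
    return p_surface * math.exp(-height / scale_height)


# Assumed example: the layer from 1000 hPa to 500 hPa with a mean of 255 K.
thickness = layer_thickness(1000.0, 500.0, 255.0)
print(round(thickness), "m")
```

Inverting the same relation, as in `pressure_at_height`, is the basis for estimating the height of a mountain from pressure readings at its base and summit, provided that a reasonable mean temperature is available.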

We next explore the role of integrals in several fundamental topics in thermodynamics, including work done when a system consisting of air is expanded or compressed into a different volume. We discuss internal energy, enthalpy, entropy, the ideal gas law, and the first law of thermodynamics. We show how, as an application of integration, the Stefan–Boltzmann law can be derived from Planck’s law of radiation. Because our focus is on selected mathematical aspects of these climate subjects rather than on details of the physics, readers interested in learning more about these topics may wish to also consult standard textbook references listed at the end of this chapter, such as the books by Curry and Webster (1990), Feynman et al. (2013), Pierrehumbert (2010), and Wallace and Hobbs (2006).
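The integration that leads from Planck's law to the Stefan–Boltzmann law can be outlined as follows. This is a sketch under the usual assumptions of a blackbody emitting isotropically into a hemisphere (the source of the factor of π); the notation is chosen here for illustration and may differ from that used in the chapter itself.

```latex
% Planck's law for spectral radiance as a function of wavelength
B_\lambda(T) = \frac{2 h c^2}{\lambda^5}\,
               \frac{1}{e^{hc/(\lambda k_B T)} - 1}

% Emitted flux: integrate over all wavelengths, with x = hc/(\lambda k_B T)
F = \pi \int_0^\infty B_\lambda(T)\, d\lambda
  = \frac{2 \pi k_B^4 T^4}{h^3 c^2}
    \int_0^\infty \frac{x^3}{e^x - 1}\, dx

% The remaining integral is a pure number
\int_0^\infty \frac{x^3}{e^x - 1}\, dx = \frac{\pi^4}{15}

% Result: the Stefan-Boltzmann law
F = \sigma T^4, \qquad
\sigma = \frac{2 \pi^5 k_B^4}{15\, h^3 c^2}
       \approx 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}
```

Here h is Planck's constant, c is the speed of light, and k_B is Boltzmann's constant.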

Global climate models, sometimes known as general circulation models (both are abbreviated as GCMs), are a relatively recent development. Until the 1960s, they did not exist, nor did the powerful supercomputers that they require. It is no exaggeration to say that the development of GCMs has revolutionized climate science. GCMs provide us with a virtual Earth, on which we can do controlled experiments, which would (fortunately) be impossible to do on the real Earth. GCMs provide a worthy complement to other research approaches, such as theory and observations. An extraordinarily rapid development of GCMs has occurred since the 1960s. A major aspect of this development has been to extend models of the atmosphere to include models of the ocean circulation and other components of the climate system. For an introductory historical survey of GCMs, see Donner et al. (2011).

At the foundation of GCMs is a set of equations that describe fundamental conservation laws in physics such as the conservation of mass, momentum, and energy. Other conservation laws or principles have proven useful for providing deep physical insight into the dynamics of atmospheres and oceans, such as the conservation of potential vorticity. This chapter introduces examples of both types of conservation laws. The energy balance models (EBMs) described in Chapter 5 are the simplest in a hierarchy of climate models and are based only on a simple form of the conservation of energy: outgoing energy emitted as radiation by the Earth is balanced by incoming energy in the form of solar radiation absorbed by the Earth. Today coupled ocean–atmosphere GCMs, often involving detailed treatments of many aspects of the climate system, from cloud physics to ocean chemistry, occupy the comprehensive end of the spectrum of climate models. Numerical methods are invariably necessary to solve complex climate models. This chapter provides the basic mathematics needed to describe the conservation laws that are the foundation of quantitative climate science.

Research very often includes the need for graphic displays of scientific data. Such graphics are necessary both to carry out the research and to present it to others in formats such as publications and talks. If your objective is to become a master of good graphical design, presenting quantitative scientific information in an artistic way that is both aesthetically appealing and informative, then we commend you for this worthy goal, and we suggest you might begin with careful study of the visualization books by Edward R. Tufte.

Our goal in this book is much more modest. In the final three chapters, beginning with this one, we want to show you how versatile and useful the R language can be in both graphics and data analysis. In this chapter, we demonstrate that R is well suited for producing a wide variety of graphics including simple line plots and color contour maps. We illustrate these capabilities of R using a range of climate data. This chapter is an introduction to the basic skills needed to use R graphics for climate science. These skills are sufficient to meet most needs for climate science research, teaching, and publications. We have divided these skills into the following categories:

- plotting multiple data time series in the same figure, including multiple panels in a figure, adjusting margins, and using proper fonts for text, labels, and axes;
- creating color maps of a climate parameter, such as the surface air temperature on the globe or over a given region; and
- animating plots.

The empirical orthogonal function (EOF) method is a commonly used tool for climate data analysis in research. This chapter provides basic ideas and mathematical theory for the EOF analysis, also known as principal component analysis (PCA) in the statistical literature. It also provides recipe-like R codes for analyzing and visualizing space–time climate data using EOFs and principal components (PCs). The codes can (i) compute EOFs and PCs using the singular value decomposition (SVD) approach, (ii) plot the EOFs on a world map and the PCs as time series, (iii) compute temporal trends, standardized data, and de-trended data over a spatial grid, and (iv) plot the spatial distribution of temporal trends. NCEP/NCAR Reanalysis data are used as examples. As described in Chapter 4, the SVD method helps to reveal the spatial and temporal patterns of any dataset obtained by sampling in space and time. EOFs show spatial patterns of climate data, such as the El Niño warm anomaly pattern of the eastern tropical Pacific. The corresponding temporal patterns are depicted in PCs that can show the times when El Niño events occur. The SVD approach can aid in developing physical insight and visualizing climate information, and thus can help lead to an improved understanding of the phenomena under study. Our description makes EOFs and PCs a natural space–time decomposition technique that can be readily carried out by a simple R command: svd(datamatrix). This method is different from the traditional approach of an eigenvalue problem based on a covariance matrix, which focuses only on spatial patterns.
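The relationship between the SVD approach and the covariance-based eigenvalue approach can be summarized in a few lines. The layout assumed here, with N grid points as rows and Y time steps as columns, is an illustrative convention and may differ from the one used in the chapter.

```latex
% Space-time data matrix: N spatial points (rows) by Y times (columns)
A_{N \times Y} = U D V^{\mathsf{T}}

% Columns of U: spatial patterns (EOFs)
% Columns of V: temporal patterns (PCs)
% Diagonal of D: singular values d_1 \ge d_2 \ge \dots \ge 0

% Connection to the eigenvalue problem of the (unnormalized)
% spatial covariance matrix, which yields only the spatial patterns
A A^{\mathsf{T}} = U D^2 U^{\mathsf{T}}

% Fraction of total variance associated with mode k
\frac{d_k^2}{\sum_j d_j^2}
```

Under this convention, a single SVD delivers the spatial patterns, the temporal patterns, and their relative importance together, whereas the eigenvalue problem for the covariance matrix returns the spatial patterns and the eigenvalues only.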

We are now near the end of our journey. We have discussed many mathematical techniques and tools, and we have demonstrated many capabilities of the programming language R. These include the ability to implement the mathematical tools and to analyze datasets, which might be generated by models or obtained from observations. We have seen that R can carry out many kinds of statistical analyses of data and can produce a wide range of graphical products.

However, between the optimistic beginning and the successful end of almost all research projects, the mischievous gods of science have placed a variety of frustrating obstacles. One of these obstacles is missing data. Data can be missing for many reasons. Instruments can break or malfunction. Communications can fail. Ship tracks or satellite orbits can cause parts of the climate system to be unobserved. Human error can lose or destroy data. And sometimes data that ought to be in a dataset simply cannot be found.

Sometimes, missing data can be created in an effort to develop quality control procedures to flag erroneous data. The history is not perfectly clear, but something like that may have delayed the identification of the Antarctic ozone hole from satellite data. NASA satellite data showing ozone amounts lower than expected were flagged as suspicious. The NASA team later decided that the data were good. Meanwhile, a British team led by Dr. Joseph Farman published its discovery of the ozone hole using ground-based instrumentation. The moral of the story is that one should always devote effort to missing or questionable data.

This appendix covers the dot product, linear regression formulas, solar power flux, and the divergence theorem.

In addition to the dot product of two vectors described in Appendix A, there is another type of vector product: the cross product of two vectors, which is a third vector that is perpendicular to both of the two vectors.

Climate applications of the cross product of two vectors include (i) the Coriolis force, whose expression includes a cross product of the angular velocity vector of the rotating reference frame and the flow velocity vector relative to the rotating reference frame, and (ii) the vorticity vector of a flow, which is the curl of the flow velocity and is written symbolically as a cross product of the del operator and the flow velocity vector. Stokes’ theorem, Green’s theorem, and a GPS planimeter available as a smartphone app are also covered in this appendix.
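The cross product and its use in the Coriolis term can be sketched in a few lines. The Coriolis acceleration is taken here as -2 Ω × v, where Ω is the angular velocity of the rotating frame and v is the velocity relative to that frame. The function names and the sample velocity are illustrative assumptions, and the code is in Python, not the book's R.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)


def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def coriolis_acceleration(omega, velocity):
    """Coriolis acceleration -2 * (omega x velocity) in the rotating frame."""
    return tuple(-2.0 * component for component in cross(omega, velocity))


# Earth's rotation rate in rad s^-1, with the rotation axis along z.
EARTH_ROTATION_RATE = 7.292e-5
omega = (0.0, 0.0, EARTH_ROTATION_RATE)

# Assumed example: a 10 m s^-1 flow along the x axis.
velocity = (10.0, 0.0, 0.0)
acceleration = coriolis_acceleration(omega, velocity)
print(acceleration)
```

Because a cross product is perpendicular to both of its factors, the Coriolis acceleration is always perpendicular to the velocity: it changes the direction of the flow but not its speed.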

The Earth is approximately a sphere in the geometric height coordinate, and for some purposes it may be regarded as a perfect sphere in the geopotential height coordinate. Spherical coordinates are naturally used in climate model equations and in many formulas of climate science. This appendix describes the relationship between the spherical coordinates and Cartesian coordinates, and it explains the representation of differentials in spherical coordinates. These formulas are frequently encountered not only in climate model equations, but also in climate data analysis, such as in the area-factor used in computing covariance matrices and empirical orthogonal functions (EOFs).
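The conversion between spherical and Cartesian coordinates can be sketched as follows, assuming the latitude–longitude convention x = r cos(lat) cos(lon), y = r cos(lat) sin(lon), z = r sin(lat). The function names and the Earth radius value are illustrative assumptions, and the code is in Python, not the book's R.

```python
import math

EARTH_RADIUS = 6.371e6  # mean Earth radius in meters (approximate)


def spherical_to_cartesian(lat_deg, lon_deg, radius=EARTH_RADIUS):
    """Convert latitude and longitude in degrees to Cartesian (x, y, z)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z


def area_weight(lat_deg):
    """Relative area of a grid box on a regular latitude-longitude grid.

    The area element on a sphere is r**2 * cos(lat) * dlat * dlon, so boxes
    of equal angular size shrink in proportion to cos(lat) toward the poles.
    """
    return math.cos(math.radians(lat_deg))


print(spherical_to_cartesian(45.0, 90.0))
print(round(area_weight(60.0), 3))  # a box at 60 degrees vs. one at the equator
```

Weights of this kind are what allow grid boxes near the poles to contribute less than those near the equator when covariances are computed from data on a regular latitude–longitude grid.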

This appendix describes calculus using the simple Descartes’ direct approach without limits, which is a different approach from that of a conventional calculus textbook. A derivative may be regarded intuitively as resembling the slope (i.e., the steepness) of a mountain road. Similarly, an integral resembles the hiker’s total elevation increase, which is a result of the accumulation (called integration) of the small elevation increases of each step. The slope and elevation are a pair of key features experienced by the mountain road hiker, and they form a derivative–antiderivative pair, which is a basic concept of calculus. Each step of the mountain hiker represents the local slope of the mountain road, measured by a small distance forward, and also by a small distance upward. The calculus method is to use the small steps to find various kinds of local and integrated results using these three quantities: the local slope denoted by f(x), the small distance forward denoted by dx, and the small distance upward denoted by dy.
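The hiker analogy can be illustrated numerically: each step contributes a small rise dy = f(x) dx, and the total elevation gain is the sum of these rises. The sketch below is a numerical illustration in Python only, not the appendix's algebraic approach; the slope function f(x) = 2x and the function names are assumptions chosen so that the exact answer is known.

```python
def slope(x):
    """Assumed local slope f(x) = 2x of the road at horizontal position x."""
    return 2.0 * x


def accumulate_elevation(slope_fn, x_start, x_end, n_steps):
    """Add up the small rises dy = f(x) * dx over n_steps equal steps."""
    dx = (x_end - x_start) / n_steps
    total = 0.0
    for i in range(n_steps):
        # Evaluate the slope at the middle of each step.
        x_mid = x_start + (i + 0.5) * dx
        dy = slope_fn(x_mid) * dx
        total += dy
    return total


# The elevation profile with slope 2x is x**2, so the gain from 0 to 3 is 9.
gain = accumulate_elevation(slope, 0.0, 3.0, 1000)
print(round(gain, 6))
```

The accumulated gain agrees with the difference in the elevation function x^2 between the two end points, which is the derivative–antiderivative pairing described above.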

This appendix emphasizes the calculus concepts and their mathematical implementation from various perspectives, such as that of statistics and that of climate science applications. In this appendix, we use R and WolframAlpha to do the actual and sometimes tedious calculations of derivatives and integrals. The appendix contains both single variable calculus and multivariate calculus with climate science examples. It also includes vector calculus through the description of the line integral, surface integral, and volume integral. Two theorems involving these integrals, both of which are very useful in climate science, are found in Appendix A for the divergence theorem and in Appendix B for Stokes’ theorem.

The purpose of this appendix is to show students how to describe and present their solutions to the exercise problems. Only a few sample solutions are presented here. The updated solutions of additional problems and hints can be found on the book website www.cambridge.org/climatemathematics, while the complete solution manual is available only to instructors.