3.2.3. Statistical Analysis of Instrumental Records
Once climate data have been collected and corrected for inhomogeneities, they need to be analysed. The aim of any statistical analysis is to identify systematic behaviour in a data set and hence improve understanding of the processes at work, complementing the theory. Statistical analysis is a search for a signal in the data that can be distinguished from the background noise (see discussion in section 2.4). In climate change research that signal will be a periodic variation, a quasi-periodic variation, a trend, persistence or extreme events in the climate element under analysis (Figure 3.1).
Before undertaking a statistical analysis of a climate record, a number of questions about the task in hand should be asked.
1. What is the purpose of the analysis?
In its simplest form, the statistical analysis may be:
a) descriptive; or
b) investigative (Mitchell et al., 1966).
Descriptive analyses set out solely to document particular aspects of the variations present in the data set (the signal). Indices calculated will include the mean and variance (or standard deviation). The occurrence of extreme events, cycles and trends will also be noted (Gibbs et al., 1978). Significance testing is crucial to this category of analysis, because it establishes whether or not the variation under consideration differs from what one would expect to arise in a random time series.
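The descriptive indices and the comparison against a random time series can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method. The annual anomaly series is synthetic, the helper names `linear_trend` and `trend_p_value` are hypothetical, and a shuffle (permutation) test is used here as one simple way of asking what a random series would produce. Shuffling assumes the values are exchangeable, so it is not appropriate as written for records with strong persistence.

```python
import random
import statistics

def linear_trend(values):
    """Least-squares slope of values against their index (units per time step)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = statistics.fmean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def trend_p_value(values, n_shuffles=2000, seed=0):
    """Share of shuffled series whose absolute slope matches or exceeds the observed one."""
    rng = random.Random(seed)
    observed = abs(linear_trend(values))
    shuffled = list(values)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # destroys any time ordering, keeps the values
        if abs(linear_trend(shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (hits + 1) / (n_shuffles + 1)

# Synthetic (assumed) annual anomalies: a 0.02-per-year trend plus random noise
rng = random.Random(42)
series = [0.02 * year + rng.gauss(0.0, 0.3) for year in range(60)]

mean = statistics.fmean(series)          # descriptive index: mean
variance = statistics.variance(series)   # descriptive index: sample variance
slope = linear_trend(series)             # descriptive index: linear trend
p_value = trend_p_value(series)          # how often chance alone gives such a slope

print(f"mean={mean:.3f} variance={variance:.3f} slope={slope:.4f} p={p_value:.4f}")
```

A small p-value indicates that a slope of this size rarely arises when the same values are placed in random order, which is the sense of significance described above.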
Investigative analyses set out to test a pre-defined hypothesis. The hypothesis should a priori have a sound physical basis. "Does the time series contain an El Niño cycle?" would be an example of a hypothesis that could be investigated.
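One possible way to examine a hypothesis of this kind is to compute a periodogram and look for a concentration of variance at the expected period. The sketch below is illustrative only: the series is synthetic with a built-in 4-year cycle, the function name `periodogram` is hypothetical, and a real investigation would also need to test the peak against an appropriate noise background.

```python
import cmath
import math
import random

def periodogram(values):
    """Return (period, power) pairs at the Fourier frequencies k/n, k = 1..n//2."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]  # remove the mean before transforming
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        result.append((n / k, abs(coeff) ** 2 / n))
    return result

# Synthetic (assumed) 40-year annual series: a 4-year cycle plus random noise
rng = random.Random(1)
series = [math.sin(2 * math.pi * t / 4) + rng.gauss(0.0, 0.3) for t in range(40)]

spectrum = periodogram(series)
peak_period, peak_power = max(spectrum, key=lambda pair: pair[1])
print(f"strongest periodicity near {peak_period:.1f} years")
```

Because the physical hypothesis specifies in advance which periods matter (here, a few years), the analyst knows where in the spectrum to look before seeing the result.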
2. What is the most appropriate data set to use?
Any data set used for a statistical analysis should be:
a) representative of the relevant physical processes;
b) sufficient in quantity to support the statistical method(s) to be used; and
c) accurate and reliable (homogeneous).
To investigate the impact of El Niño on drought in eastern Australia, it is first necessary to identify a representative indicator of El Niño, such as sea surface temperatures in the SE Pacific. Second, a reliable indicator of drought in eastern Australia is required, for example, rainfall. The data set must be long enough to permit testing for a relationship on the time scale under consideration, i.e. does El Niño cause drought in eastern Australia? Since El Niño has a quasi-periodicity of 2 to 5 years, a data set at least 7 to 10 times this length (i.e. up to 50 years) is required to have confidence in the statistical methods. To investigate longer-term trends, the data requirement becomes more stringent.
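The record-length rule of thumb quoted above is simple arithmetic, sketched here for the two ends of the El Niño range. The function name `required_record_length` is hypothetical, and the multiples of 7 and 10 are taken directly from the text.

```python
def required_record_length(period_years, multiples=(7, 10)):
    """Lower and upper record lengths (years) for a signal of the given period."""
    low, high = multiples
    return period_years * low, period_years * high

shortest_cycle = required_record_length(2)  # El Niño recurring every 2 years
longest_cycle = required_record_length(5)   # El Niño recurring every 5 years

print(shortest_cycle)  # (14, 20)
print(longest_cycle)   # (35, 50)
```

The upper figure for the 5-year case gives the 50 years mentioned in the text.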
3. What is the most appropriate technique to use and how should it be applied?
Often it will be clear which statistical method of analysis is required. However, its application may be less straightforward. The nature of the data may determine whether or not a particular technique is valid (or at least the way in which the technique is applied). For example, if the data are not normally distributed then this may invalidate assumptions on which the technique is based. Whatever technique is used, testing statistical significance must be a critical concern.
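As an illustration of checking the normality assumption before applying a technique, the sketch below computes the Jarque-Bera statistic from the sample skewness and kurtosis. This is only one of several possible checks and is not taken from the references cited here. The rainfall values are synthetic, drawn from a skewed distribution purely for demonstration, and the p-value relies on a large-sample chi-square approximation with two degrees of freedom, which can be unreliable for short records.

```python
import math
import random

def jarque_bera(values):
    """Jarque-Bera statistic and approximate p-value for departure from normality."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # Survival function of a chi-square distribution with 2 degrees of freedom
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

# Synthetic (assumed) rainfall totals from a strongly skewed distribution
rng = random.Random(7)
rainfall = [rng.expovariate(1 / 50) for _ in range(200)]

statistic, p_value = jarque_bera(rainfall)
print(f"JB={statistic:.1f} p={p_value:.3g}")
```

A small p-value suggests the data depart from normality, in which case a transformation of the data or a technique that does not assume normality may be more appropriate.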
Barry & Perry (1973) offer a detailed introduction to the mathematical aspects of statistical analysis, with many useful examples. Other useful references are provided by Gani (1975) and Godske (1966). Before concluding this section, however, two points need emphasising. Firstly, much of what has been said about statistical analysis of instrumental records applies equally well to the study of palaeoclimatology, and to the reconstruction of past climates from proxy data (discussed in the next section). Secondly, the statistical analysis of climate data serves to complement and support theories developed to explain the causes (and effects) of climate change. Statistical associations do not prove cause and effect, for they are based solely upon the laws of probability. When analysing and interpreting climate data in the effort to aid understanding of the causes of climate change, it is necessary to bear this in mind.