The term "correlation" denotes the degree of interrelation between two or more variables. The procedure for quantifying that degree of interrelation is called correlation analysis. Correlation analysis can be carried out for both continuous variables and discrete data; the analysis for discrete data, which is the form most often found in engineering practice and digital calculations, is described below.
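Autocorrelation function. Assume a discrete time-series with a finite number of samples N and an average value of x̄,
$$x_{n}, \quad n = 1, 2, \ldots, N \tag{1}$$
The autocorrelation function of this series is defined as:
$$R_{xx}(k) = \frac{1}{N-k} \sum_{n=1}^{N-k} (x_{n} - \bar{x})(x_{n+k} - \bar{x}) \tag{2}$$
which effectively averages all possible products of the time-series and its time-shifted version separated by a time lag k. In practice, formula (2) is preferred in its normalized form
$$\rho_{xx}(k) = \frac{R_{xx}(k)}{R_{xx}(0)} \tag{3}$$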
The value of ρ_{xx} is such that −1 ≤ ρ_{xx} ≤ 1. The autocorrelation function is an average measure of the time-domain properties of the time-series, and is related to the power spectral density function in the frequency domain by the Fourier transform (see Spectral Analysis). If the magnitude of ρ_{xx} decreases with increasing time lag k, there is some degree of randomness in the time-series. If ρ_{xx} changes sign at regular intervals of k, the time-series is periodic. A combination of the two, an oscillating ρ_{xx} whose magnitude decays with k, may imply that the time-series is quasi-periodic, which is often the case in real engineering problems.
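The behaviour described above can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes the mean is removed before the products are formed and that each lag is averaged over its N − k available products. The function names and the test signal (a sine wave with a period of 8 samples) are illustrative choices.

```python
import math


def autocorrelation(x, k):
    """Average product of the mean-removed series and its copy shifted by lag k."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean = sum(x) / n
    products = [(x[i] - mean) * (x[i + k] - mean) for i in range(n - k)]
    # Assumption: divide by the number of available products, N - k.
    return sum(products) / (n - k)


def normalized_autocorrelation(x, k):
    """Autocorrelation at lag k divided by its zero-lag value."""
    r0 = autocorrelation(x, 0)
    if r0 == 0:
        raise ValueError("a constant series has no defined normalized autocorrelation")
    return autocorrelation(x, k) / r0


# Illustrative periodic signal: 64 samples of a sine wave with a period of 8 samples.
series = [math.sin(2 * math.pi * i / 8) for i in range(64)]
rho = [normalized_autocorrelation(series, k) for k in range(9)]

for lag, value in enumerate(rho):
    print(f"k = {lag}: rho_xx = {value:+.3f}")
```

For this signal the normalized autocorrelation swings from +1 at k = 0 to about −1 at half a period (k = 4) and back to about +1 at a full period (k = 8), which is the regular sign change that marks a periodic series.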
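The cross-correlation function for two sets of time-series data, each of N samples and with average values x̄ and ȳ,
$$x_{n}, \; y_{n}, \quad n = 1, 2, \ldots, N$$
is defined as
$$R_{xy}(k) = \frac{1}{N-k} \sum_{n=1}^{N-k} (x_{n} - \bar{x})(y_{n+k} - \bar{y}) \tag{4}$$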
The cross-correlation function and the cross-spectral density function are equivalent measures in the time and frequency domains, respectively, and are related to each other by the Fourier transform (see Spectral Analysis).
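The Fourier-transform relationship can be illustrated with a small numerical check. This sketch makes two assumptions that the text does not state: the lags wrap around (a circular correlation, which treats each series as periodic), and the cross-spectrum is estimated as conj(X)·Y/N from the discrete Fourier transforms of the mean-removed series. All function names and the sample data are illustrative.

```python
import cmath


def dft(values):
    """Plain O(N^2) discrete Fourier transform, kept dependency-free for illustration."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * m * i / n) for i, v in enumerate(values))
        for m in range(n)
    ]


def remove_mean(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def circular_cross_correlation(x, y):
    """Assumption: lags wrap around (periodic extension), unlike a linear lag average."""
    n = len(x)
    dx, dy = remove_mean(x), remove_mean(y)
    return [sum(dx[i] * dy[(i + k) % n] for i in range(n)) / n for k in range(n)]


def cross_spectrum(x, y):
    """Frequency-domain estimate conj(X) * Y / N for the mean-removed series."""
    n = len(x)
    fx, fy = dft(remove_mean(x)), dft(remove_mean(y))
    return [a.conjugate() * b / n for a, b in zip(fx, fy)]


# Illustrative data only.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 2.0, 1.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 1.0, 3.0]

from_time_domain = dft(circular_cross_correlation(x, y))
from_frequency_domain = cross_spectrum(x, y)
max_difference = max(abs(a - b) for a, b in zip(from_time_domain, from_frequency_domain))

# With y = x the same identity gives a real, non-negative power spectrum.
auto_spectrum = dft(circular_cross_correlation(x, x))
```

The two routes agree to within rounding error, so `max_difference` is of the order of floating-point precision. A linear (non-wrapping) lag average would need zero-padding before the same comparison holds.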
The correlation coefficient is defined as the normalized version of formula (4) and is given by
$$\rho_{xy}(k) = \frac{R_{xy}(k)}{\sqrt{R_{xx}(0)\, R_{yy}(0)}} \tag{5}$$
where R_{xx}(0) and R_{yy}(0) are the zero-lag autocorrelation values of x_{n} and y_{n}. The value of ρ_{xy} at a particular time lag k is a measure of the similarity between x_{n} and y_{n} when one series is shifted by k samples relative to the other. The value of ρ_{xy} is such that −1 ≤ ρ_{xy} ≤ 1, and the larger the magnitude of ρ_{xy}, the more strongly correlated x_{n} and y_{n} are at that lag.
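One common use of the correlation coefficient is to find the lag at which two signals are most alike. The sketch below assumes the mean-removed, lag-averaged cross-correlation (divided by the N − k available products) normalized by the square root of the product of the two zero-lag autocorrelations. The function names, the pseudo-random test signal, and the 3-sample delay are hypothetical choices made for illustration.

```python
import math
import random


def cross_correlation(x, y, k):
    """Average product of mean-removed x[n] and y[n + k] over the N - k available pairs."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((x[i] - mean_x) * (y[i + k] - mean_y) for i in range(n - k)) / (n - k)


def correlation_coefficient(x, y, k):
    """Cross-correlation at lag k scaled by the zero-lag autocorrelations of x and y."""
    scale = math.sqrt(cross_correlation(x, x, 0) * cross_correlation(y, y, 0))
    if scale == 0:
        raise ValueError("a constant series has no defined correlation coefficient")
    return cross_correlation(x, y, k) / scale


# Illustrative data: y is x delayed by 3 samples (zero-filled at the start).
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(200)]
delay = 3
y = [0.0] * delay + x[:-delay]

rho = [correlation_coefficient(x, y, k) for k in range(10)]
best_lag = max(range(10), key=lambda k: abs(rho[k]))
print("lag with the strongest correlation:", best_lag)
```

The coefficient peaks at the lag equal to the imposed delay and stays close to zero elsewhere, because the samples of the test signal are otherwise unrelated to one another.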
It should be emphasized that the concept of correlation is different from that of regression. Regression is the procedure of finding a best-fit curve, whereas correlation measures how accurately that curve represents the data.
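The distinction can be made concrete for the simplest case, a straight-line fit. This sketch assumes ordinary least squares for the regression and the zero-lag correlation coefficient (the Pearson coefficient r) for the correlation; the data values and function names are invented for illustration. For a straight line fitted with an intercept, r² equals the fraction of the variance in y that the fitted line accounts for.

```python
import math


def fit_line(x, y):
    """Regression: least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Correlation: zero-lag correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Illustrative data lying close to the line y = x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1, 4.9]

slope, intercept = fit_line(x, y)
r = pearson_r(x, y)

# Fraction of the variance in y explained by the fitted line.
mean_y = sum(y) / len(y)
residual = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
total = sum((b - mean_y) ** 2 for b in y)
explained_fraction = 1.0 - residual / total

print(f"fitted line: y = {slope:.3f} x + {intercept:.3f}")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}, explained fraction = {explained_fraction:.4f}")
```

Here `fit_line` answers the regression question (which line?), while `pearson_r` answers the correlation question (how well does a line describe the data?).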
REFERENCES
Gardner, W. A. (1988) Statistical Spectral Analysis, a Non-probabilistic Theory, Prentice-Hall, Inc., New Jersey.
Lynn, P. A. (1989) An Introduction to the Analysis and Processing of Signals, 3rd Edition, Macmillan Press Ltd., London.
Schwartz, M. and Shaw, L. (1975) Signal Processing: Discrete Spectral Analysis, Detection and Estimation, McGraw-Hill, Inc., USA.