Transforming the Data

Multiple linear regression, principal components regression, and canonical correlation analysis usually work best if the predictands are normally distributed. Precipitation data, for example, are often positively skewed. CPT has an option to transform the predictand data to a normal distribution. The transformation is based on the empirical distribution, and so will work for both positively and negatively skewed data, but may not be very effective if there are numerous ties in the data (for example, many cases of zero precipitation). Each predictand value is converted to a quantile of the empirical distribution, which is then transformed to a normal deviate. The inverse process is applied to convert predictions back to the original distribution. Because the predictions are transformed back to the original distribution, the transformation is effectively hidden from the user, although results such as principal component loadings and regression coefficients apply to the transformed data. The transformation is often effective in eliminating or minimizing instances of forecasts in which the probability of the normal category is smaller than the probabilities of the outer categories (in cases when the categories are climatologically equiprobable).
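The quantile-to-normal-deviate idea can be illustrated with a short Python sketch. This is not CPT's code: the plotting position rank / (n + 1), the mean-rank handling of ties, the linear interpolation used for the inverse, and all function names are assumptions made for illustration only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def empirical_quantiles(values):
    """Quantile of each value as rank / (n + 1); tied values share their mean rank.

    The plotting position is an assumption, chosen so that every
    quantile lies strictly between 0 and 1.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quantiles = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based mean rank of the tied run
        for k in range(start, end + 1):
            quantiles[order[k]] = mean_rank / (n + 1)
        start = end + 1
    return quantiles


def to_normal_deviates(values):
    """Map each value to the normal deviate of its empirical quantile."""
    return [_STD_NORMAL.inv_cdf(q) for q in empirical_quantiles(values)]


def from_normal_deviate(z, training_values):
    """Map a normal deviate back to the original scale.

    Interpolates linearly between the sorted training values; deviates
    beyond the training range are clamped to the smallest or largest value.
    """
    ordered = sorted(training_values)
    n = len(ordered)
    position = _STD_NORMAL.cdf(z) * (n + 1) - 1  # fractional 0-based index
    position = min(max(position, 0.0), n - 1.0)
    lower = int(position)
    upper = min(lower + 1, n - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


rainfall = [0.0, 2.0, 5.0, 12.0, 40.0]      # made-up, positively skewed sample
deviates = to_normal_deviates(rainfall)     # symmetric about zero
tied = to_normal_deviates([0.0, 0.0, 0.0, 5.0])  # the three zeros share one deviate
```

In this sketch the three tied zeros all map to the same deviate, so the transformed sample keeps a lump at one value rather than becoming normal, which is one way to see why numerous ties limit the transformation's effectiveness.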

To activate the transformation, use the Options ~ Data ~ Transform Y Data menu item, which will toggle the option. A tick is shown next to the item if the transformation is activated. Note that the transformation can slow the computation noticeably.

For predictands such as precipitation, it may also be desirable to set an absolute lower limit of zero so that negative values are not predicted. The Options ~ Data ~ Zero-Bound menu item will toggle an option to reset all negative predictions to zero. A tick is shown next to the item if the zero-bound is activated.
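The effect of the zero-bound can be pictured with a minimal Python sketch. The function name and the sample values are hypothetical, and CPT's internal implementation is not shown in this help page.

```python
def apply_zero_bound(predictions):
    """Reset every negative prediction to zero; leave other values unchanged."""
    return [max(0.0, p) for p in predictions]


bounded = apply_zero_bound([-1.5, 0.0, 3.2])  # made-up predictions
```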
