Predictor selection
-cross-validation (CV)
-AIC, AICc, BIC
-hypothesis testing
-Mallows' Cp
-split the data: 3 data sets, with example
-compare PCR, CCA, and other pattern-based regression techniques
-dimension reduction
-dealing with collinearity
-ways of interpreting multivariate regression: CCA, MCA
-examples: PCR, CCA, etc.
 --Philippines
 --pred. comp. of CFS?
 --ENSO

Pitfalls (nuances) of cross-validation
-trick to speed up computation: can be applied to EOFs, not to CCA (Elsner)
-model selection estimate of skill
-bias of cross-validation
 --correlation of a cross-validated climatology forecast is -1
 --not including model selection in CV -- does including it as an additional layer of CV add an additional bias?
 --recomputing climatology
 --recomputing EOFs not needed
-examples ("You spot the problem")
 --Indian PCR
 --IRI PCR
 --not taking out climatology

From Wikipedia, some ways that cross-validation can be misused:
-using cross-validation to assess several models, and stating results only for the model with the best results
 --e.g., using cross-validation to pick predictors, then reporting that cross-validated skill as the skill
-performing an initial analysis to identify the most informative features using the entire data set: if feature selection or model tuning is required by the modeling procedure, it must be repeated on every training set; if cross-validation is used to decide which features to use, an inner cross-validation to carry out the feature selection on every training set must be performed
-allowing some of the training data to also be included in the test set: this can happen due to "twinning" in the data set, whereby some exactly identical or nearly identical samples are present
 --i.e., serial correlation (cf. SST LIM: just leaving out months is not enough)

Seasonal prediction at the IRI
-describe forecast products and how they are made
-describe tailored products/research
-need for probabilities
-estimating probabilities from ensembles
-new methodology?
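The information criteria listed under predictor selection are cheap to compute for a least-squares model; a minimal sketch (the Gaussian least-squares forms below are standard, but the function names are my own):

```python
import math

def ic_scores(rss, n, k):
    """AIC, AICc, and BIC for a Gaussian least-squares model with
    n samples, k fitted parameters, and residual sum of squares rss."""
    aic = n * math.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    bic = n * math.log(rss / n) + k * math.log(n)
    return aic, aicc, bic

def mallows_cp(rss_k, s2_full, n, k):
    """Mallows' Cp for a k-parameter submodel; s2_full is the
    error-variance estimate from the full model."""
    return rss_k / s2_full - n + 2 * k
```

One then keeps the candidate model that minimizes the chosen criterion; note that BIC's penalty (k ln n) exceeds AIC's (2k) once n >= 8, so BIC tends to pick smaller predictor sets.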
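The note that a cross-validated climatology forecast has correlation -1 follows from leave-one-out algebra: the forecast for year i is the mean of all other years, f_i = (T - x_i)/(n - 1) with T the full-record sum, which is a decreasing linear function of x_i. A self-contained check on synthetic data:

```python
import random
import statistics

def loo_climo_forecasts(x):
    """Leave-one-out climatological forecast: for each year,
    forecast the mean of all the other years."""
    n = len(x)
    total = sum(x)
    return [(total - xi) / (n - 1) for xi in x]

def pearson(a, b):
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sum((u - ma) ** 2 for u in a)
                  * sum((v - mb) ** 2 for v in b)) ** 0.5

random.seed(0)
x = [random.gauss(0, 1) for _ in range(30)]   # 30 "years" of data
f = loo_climo_forecasts(x)
print(round(pearson(f, x), 6))  # -1.0, for any data set
```

Since f is an exact linear function of x with negative slope, the correlation is exactly -1 regardless of the data, which is why cross-validated skill of climatology-like forecasts looks artificially bad.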
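The misuse bullets above (screening predictors on the full record, then quoting the cross-validated skill) can be demonstrated with pure noise; the toy setup below is an illustrative sketch of my own, not one of the PCR examples mentioned in the notes:

```python
import random
import statistics

def corr(a, b):
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sum((u - ma) ** 2 for u in a)
                  * sum((v - mb) ** 2 for v in b)) ** 0.5

random.seed(1)
n, p = 40, 200                          # 40 "years", 200 candidate predictors
y = [random.gauss(0, 1) for _ in range(n)]          # pure-noise predictand
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]

# Misuse: screen predictors on the full record, then report that skill.
best = max(range(p), key=lambda j: abs(corr(X[j], y)))
inflated = abs(corr(X[best], y))        # spuriously large despite no signal

# Correct: redo the screening (and the fit) inside every training fold.
f = []
for i in range(n):
    tr = [k for k in range(n) if k != i]
    ytr = [y[k] for k in tr]
    j = max(range(p), key=lambda jj: abs(corr([X[jj][k] for k in tr], ytr)))
    xtr = [X[j][k] for k in tr]
    b = corr(xtr, ytr) * statistics.stdev(ytr) / statistics.stdev(xtr)
    a = statistics.mean(ytr) - b * statistics.mean(xtr)
    f.append(a + b * X[j][i])           # prediction for the held-out year
nested = corr(f, y)                     # near zero: the honest answer

print(round(inflated, 2), round(nested, 2))
```

The screened-on-everything number is large simply because it is the maximum over 200 chance correlations, while the nested estimate correctly reports no skill; this is the "inner cross-validation on every training set" requirement in concrete form.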
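For the serial-correlation/twinning point (adjacent months are near-twins, so just leaving out single months still leaks information), one standard remedy is to drop a buffer of neighbouring samples around each held-out point. A minimal sketch (the function name and buffer scheme are my own, not from the notes):

```python
def buffered_loo_folds(n, buffer):
    """Leave-one-out over n time steps, dropping every training sample
    within `buffer` steps of the test index so that serially correlated
    neighbours cannot act as near-twins of the test month."""
    for i in range(n):
        train = [k for k in range(n) if abs(k - i) > buffer]
        yield train, i

# With a 2-month buffer, predicting month 5 excludes months 3..7.
train, test = list(buffered_loo_folds(12, 2))[5]
print(test, train)  # 5 [0, 1, 2, 8, 9, 10, 11]
```

The buffer width should be at least the decorrelation time of the series; with buffer = 0 this reduces to ordinary leave-one-out.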