Predictor selection
-cross-validation (CV)
-AIC, AICc, BIC
-hypothesis testing
-Mallows' Cp
-split the data: 3 data sets, with example
-compare PCR, CCA, and other pattern-based regression techniques
-dimension reduction
-dealing with collinearity
-ways of interpreting multivariate regression: CCA, MCA
-examples: PCR, CCA, etc.
 --Philippines
 --pred. comp. of CFS?
 --ENSO

Pitfalls (nuances) of cross-validation
-trick to speed up computation: can be applied to EOFs, not to CCA (Elsner)
-model selection estimate of skill
-bias of cross-validation
 --correlation of a cross-validated climatology forecast is -1
 --not including model selection in CV -- does including it as an additional layer of CV add an additional bias?
 --recomputing climatology
 --recomputing EOFs not needed
-examples ("You spot the problem")
 --Indian PCR
 --IRI PCR
 --not taking out climatology

From Wikipedia, some ways that cross-validation can be misused:
-using cross-validation to assess several models, and stating results only for the model with the best results
 --e.g., using cross-validation to pick predictors, then reporting that cross-validated skill as the skill
-performing an initial analysis to identify the most informative features using the entire data set: if feature selection or model tuning is required by the modeling procedure, it must be repeated on every training set; if cross-validation is used to decide which features to use, an inner cross-validation to carry out the feature selection on every training set must be performed
-allowing some of the training data to also be included in the test set: this can happen due to "twinning" in the data set, whereby some exactly identical or nearly identical samples are present
 --i.e., serial correlation (cf. SST LIM: just leaving out months is not enough)

Seasonal prediction at the IRI
-describe forecast products and how they are made
-describe tailored products/research
-need for probabilities
-estimating probabilities from ensembles
-new methodology?
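The information criteria listed under predictor selection are cheap to compute for a least-squares model; a minimal sketch (the Gaussian least-squares forms below are standard, but the function names are my own):

```python
import math

def ic_scores(rss, n, k):
    """AIC, AICc, and BIC for a Gaussian least-squares model with
    n samples, k fitted parameters, and residual sum of squares rss."""
    aic = n * math.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    bic = n * math.log(rss / n) + k * math.log(n)
    return aic, aicc, bic

def mallows_cp(rss_k, s2_full, n, k):
    """Mallows' Cp for a k-parameter submodel; s2_full is the
    error-variance estimate from the full model."""
    return rss_k / s2_full - n + 2 * k
```

One then keeps the candidate model that minimizes the chosen criterion; note that BIC's penalty (k ln n) exceeds AIC's (2k) once n >= 8, so BIC tends to pick smaller predictor sets.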
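The note that a cross-validated climatology forecast has correlation -1 follows from leave-one-out algebra: the forecast for year i is the mean of all other years, f_i = (T - x_i)/(n - 1) with T the full-record sum, which is a decreasing linear function of x_i. A self-contained check on synthetic data:

```python
import random
import statistics

def loo_climo_forecasts(x):
    """Leave-one-out climatological forecast: for each year,
    forecast the mean of all the other years."""
    n = len(x)
    total = sum(x)
    return [(total - xi) / (n - 1) for xi in x]

def pearson(a, b):
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sum((u - ma) ** 2 for u in a)
                  * sum((v - mb) ** 2 for v in b)) ** 0.5

random.seed(0)
x = [random.gauss(0, 1) for _ in range(30)]   # 30 "years" of data
f = loo_climo_forecasts(x)
print(round(pearson(f, x), 6))  # -1.0, for any data set
```

Since f is an exact linear function of x with negative slope, the correlation is exactly -1 regardless of the data, which is why cross-validated skill of climatology-like forecasts looks artificially bad.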
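The misuse bullets above (screening predictors on the full record, then quoting the cross-validated skill) can be demonstrated with pure noise; the toy setup below is an illustrative sketch of my own, not one of the PCR examples mentioned in the notes:

```python
import random
import statistics

def corr(a, b):
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / (sum((u - ma) ** 2 for u in a)
                  * sum((v - mb) ** 2 for v in b)) ** 0.5

random.seed(1)
n, p = 40, 200                          # 40 "years", 200 candidate predictors
y = [random.gauss(0, 1) for _ in range(n)]          # pure-noise predictand
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]

# Misuse: screen predictors on the full record, then report that skill.
best = max(range(p), key=lambda j: abs(corr(X[j], y)))
inflated = abs(corr(X[best], y))        # spuriously large despite no signal

# Correct: redo the screening (and the fit) inside every training fold.
f = []
for i in range(n):
    tr = [k for k in range(n) if k != i]
    ytr = [y[k] for k in tr]
    j = max(range(p), key=lambda jj: abs(corr([X[jj][k] for k in tr], ytr)))
    xtr = [X[j][k] for k in tr]
    b = corr(xtr, ytr) * statistics.stdev(ytr) / statistics.stdev(xtr)
    a = statistics.mean(ytr) - b * statistics.mean(xtr)
    f.append(a + b * X[j][i])           # prediction for the held-out year
nested = corr(f, y)                     # near zero: the honest answer

print(round(inflated, 2), round(nested, 2))
```

The screened-on-everything number is large simply because it is the maximum over 200 chance correlations, while the nested estimate correctly reports no skill; this is the "inner cross-validation on every training set" requirement in concrete form.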
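For the serial-correlation/twinning point (adjacent months are near-twins, so just leaving out single months still leaks information), one standard remedy is to drop a buffer of neighbouring samples around each held-out point. A minimal sketch (the function name and buffer scheme are my own, not from the notes):

```python
def buffered_loo_folds(n, buffer):
    """Leave-one-out over n time steps, dropping every training sample
    within `buffer` steps of the test index so that serially correlated
    neighbours cannot act as near-twins of the test month."""
    for i in range(n):
        train = [k for k in range(n) if abs(k - i) > buffer]
        yield train, i

# With a 2-month buffer, predicting month 5 excludes months 3..7.
train, test = list(buffered_loo_folds(12, 2))[5]
print(test, train)  # 5 [0, 1, 2, 8, 9, 10, 11]
```

The buffer width should be at least the decorrelation time of the series; with buffer = 0 this reduces to ordinary leave-one-out.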