Dataset Information

Once you have selected an input file, CPT will automatically identify which of the three structures the file follows. Depending on the structure of the dataset, you may be asked to define the domain in which you are interested. You will be asked for the domain only if the dataset contains gridded or station data. You can revise these domain settings later using the Edit ~ X Data Domain or Edit ~ Y Data Domain menu items. See Setting the Domain for further details.

Before the domain settings are requested for gridded and station data, CPT performs an initial check on the data structure. If the file is read successfully, the name of the file should appear in the box next to "File name", and once the domain is specified the program will automatically calculate and indicate the total number of variables within it. (In the case of unreferenced data, the total number of indices in the file is calculated instead.) If, for example, you are using a gridded dataset of sea-surface temperatures but select only the data over a small area of 5 by 4 gridpoints, there should be a total of 20 (5x4) gridpoints. If an X input file is opened, it will become the default forecast file as well. If CPT is unable to read the specified file, an error message will be given.
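The gridpoint count in the example above can be checked with a short sketch. This is only an illustration of the arithmetic, not CPT's own code: the function name count_gridpoints, the 2-degree grid, and the domain limits are all assumptions made for the example, and the domain limits are treated as inclusive.

```python
def count_gridpoints(lats, lons, south, north, west, east):
    """Count gridpoints whose coordinates fall inside the domain.

    Hypothetical helper: the domain limits are treated as inclusive.
    """
    n_lat = sum(1 for lat in lats if south <= lat <= north)
    n_lon = sum(1 for lon in lons if west <= lon <= east)
    return n_lat * n_lon


# Assumed 2-degree grid covering 10S-10N, 180E-220E.
lats = [-10 + 2 * i for i in range(11)]
lons = [180 + 2 * j for j in range(21)]

# A domain spanning 4 latitudes and 5 longitudes gives 20 gridpoints.
n_in_domain = count_gridpoints(lats, lons, south=-4, north=2, west=190, east=198)
print(n_in_domain)
```

The count reported by CPT after the domain is set should correspond to this product of the number of latitudes and the number of longitudes inside the domain.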

Upon successful opening of an input file, the dates of the first and last data in the dataset will be indicated. The first year in the X input file does not have to be the same as in the Y input file. You may start the analysis at a later year in either or both of the datasets, particularly if the first years in each dataset differ. If, for example, the X input data start in 1950 and the Y input data start in 1965, you will normally want CPT to ignore the first 15 years in the X input file. In this case, set the first year of the X training period to 1965. CPT will set what it considers to be sensible default values for the starting years by choosing a lead-time of less than one year and using as much of the data as possible (although it will avoid resetting the start year if a replacement file is selected). It will recognise whether the forecast lag spans the year-end, and will adjust the default starting dates accordingly. These defaults can be overridden either by typing the desired years in the respective boxes or by using the attached fly-wheels. Although CPT does permit the forecast lag to exceed one year or to be negative, this would not normally be advisable, because the analysis of climate predictability with lags of one year or more would most likely have little physical basis. A warning is issued if the lag is larger than one year or is negative.
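The logic for choosing start years can be sketched as follows. This is a minimal illustration under stated assumptions, not CPT's actual algorithm: the function names are hypothetical, and each dataset is assumed to be described by its first year and a single calendar month (1 to 12).

```python
def default_start_years(x_first_year, x_month, y_first_year, y_month):
    """Suggest start years giving a lag of 0-11 months with maximal overlap.

    Hypothetical helper: if the Y month falls earlier in the calendar than
    the X month, the lag is assumed to span the year-end, so the Y year is
    one greater than the X year.
    """
    offset = 1 if y_month < x_month else 0
    x_start = max(x_first_year, y_first_year - offset)
    y_start = x_start + offset
    return x_start, y_start


def lag_in_months(x_year, x_month, y_year, y_month):
    """Lag from the X date to the Y date, in months."""
    return (y_year - x_year) * 12 + (y_month - x_month)


def lag_needs_warning(lag):
    """True if the lag is negative or larger than one year."""
    return lag < 0 or lag > 12


# X starts in 1950 and Y in 1965, both for the same month: skip 15 years of X.
same_month = default_start_years(1950, 1, 1965, 1)

# X is October and Y is January, so the lag spans the year-end.
spans_year_end = default_start_years(1950, 10, 1965, 1)
```

With the first pair of inputs the sketch suggests 1965 for both datasets, matching the example above. With the second pair it suggests 1964 for X and 1965 for Y, a lag of three months.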

Beneath the dates, CPT will indicate the number of fields and lagged fields in the file, together with the number of variables within the domain. This number may be less than the total number of variables in the file, depending on the domain settings. The actual number of variables used depends on the handling of missing values and is indicated only after the model is constructed.
