5. Recommended Operation

During operation, FlowCast can empirically generate forecast probabilities in the form of probability distributions, pie charts, or contour maps. It can import a wide range of input data types and provides tools for assessing the quality and characteristics of the data. Finally, it can quantify the skill of different predictive systems both temporally and spatially, showing when and where each system performs well. However, despite the power, flexibility and ease of use that these tools offer, the software does not offer advice or guidance on proper operational practice, nor does it distinguish ‘real’ skill from ‘artificial’ skill.

For example, it is highly recommended that only one predictive system be used operationally, even if skill testing indicates ‘better’ predictors for different seasons. Swapping predictive systems mid-year has two consequences. Firstly, it artificially inflates the apparent skill of the forecast, because the choice of system for each season is made subjectively from the skill shown during training, whereas the forecast itself is made outside the training period, where that selection advantage no longer applies. Secondly, changing systems between successive outlooks often introduces erratic swings in predictions, caused by natural lags between predictor mechanisms. Nevertheless, many users will do this.
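The inflation caused by choosing the ‘best’ predictor for each season can be illustrated with a small simulation. The sketch below is not part of FlowCast. It assumes a hypothetical set of candidate predictors that are pure noise (true skill of zero), uses Pearson correlation as a stand-in skill measure, and picks the best-scoring predictor in training before scoring that same choice on independent data. All function names and parameter values are illustrative assumptions.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def noise(rng, n):
    """n independent standard-normal values (no real signal)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def selection_bias_demo(n_years=30, n_predictors=5, n_seasons=4,
                        n_trials=200, seed=0):
    """Mean training and independent-test skill of the 'best' predictor.

    Every predictor is unrelated to the predictand, so any skill
    seen in training comes purely from the act of selection.
    """
    rng = random.Random(seed)
    train_skill, test_skill = [], []
    for _ in range(n_trials * n_seasons):
        y_train, y_test = noise(rng, n_years), noise(rng, n_years)
        train_r, test_r = [], []
        for _ in range(n_predictors):
            train_r.append(pearson(noise(rng, n_years), y_train))
            test_r.append(pearson(noise(rng, n_years), y_test))
        # Choose the predictor that looked best during training ...
        best = max(range(n_predictors), key=lambda p: train_r[p])
        train_skill.append(train_r[best])
        # ... and score that same choice on data it has not seen.
        test_skill.append(test_r[best])
    return (sum(train_skill) / len(train_skill),
            sum(test_skill) / len(test_skill))


if __name__ == "__main__":
    train, test = selection_bias_demo()
    print(f"mean skill of selected predictor, training: {train:.2f}")
    print(f"mean skill of selected predictor, independent: {test:.2f}")
```

Under these assumptions the selected predictor shows clearly positive skill in training and skill near zero on independent data, even though no predictor has any real relationship with the predictand. Selecting among more candidates, or from shorter records, would be expected to widen the gap.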

The skill testing analyses can also be easily misinterpreted. In the skill charts presented in Figure 10e, regions of high skill often appear seemingly at random throughout the chart. These isolated ‘pockets’ of skill should be treated with extreme caution. It is better to look for regions and patterns in which skill values change gradually over time, and to use these to define periods where forecasting is more reliable. Also, the magnitudes of these values may not be directly comparable between different predictive systems. Some level of subjective assessment is therefore required when comparing predictors, taking into account other factors including training and testing sample sizes, calculation methodology, number of stratifications and/or number of time-series elements.
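One reason for caution about isolated pockets is that a chart with many cells will show some apparently significant values by chance alone. The sketch below is an illustration only, not FlowCast's skill calculation. It assumes a hypothetical chart of 12 by 50 independent cells, 30 years of data per cell, no real skill anywhere, and a two-sided 5% significance threshold for Pearson correlation (about 0.361 for 30 samples).

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_chance_pockets(n_rows=12, n_cols=50, n_years=30,
                         r_crit=0.361, seed=1):
    """Count chart cells that pass the threshold with no real skill.

    r_crit is the two-sided 5% critical correlation for n_years = 30;
    it must be changed if n_years is changed.
    """
    rng = random.Random(seed)
    total = n_rows * n_cols
    hits = 0
    for _ in range(total):
        # Predictor and predictand are independent noise in every cell.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        if abs(pearson(x, y)) > r_crit:
            hits += 1
    return hits, total


if __name__ == "__main__":
    hits, total = count_chance_pockets()
    print(f"{hits} of {total} no-skill cells look 'significant'")
```

By construction, about 5% of the cells are expected to pass the threshold, scattered with no pattern. Real skill charts have neighbouring cells that are correlated in space and time, which this sketch ignores, so it indicates only the order of magnitude of chance results.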

Finally, users are likely to make forecast assessments based solely on these skill score results, without considering the mechanism linking the predictor and predictand elements. For example, skill testing may show that total umbrella sales in Queensland are a good predictor of Australian rainfall. Although the two could be highly correlated, selling umbrellas does not make it rain, so the apparent skill is artificial, and forecasts based on it are likely to be problematic and politically indefensible.