A little information for those wondering how the calculation currently works.
I think the best performing models feature should do much better early in a storm from now on. Previously, early in the storm, I raised the forecast hour at which I determine error in steps of 6 as more data became available, so error could be calculated at forecast hours such as 6, 18 and 30. The problem is that many models do not have forecast positions at those hours. Last night I updated the system to only use forecast position error at hours that are a multiple of 12. Once 60 hours have passed since model data was first available, the 48 hour forecast is what we determine forecast error at for the rest of the life of the storm.
Because Euro data is only available for depressions and higher, it will usually take up to 48 hours after data from that model is first available before it has the opportunity to appear as a best performing model. To have its first 48 hour forecast compared, we need 48 hours of Euro data. If a system skips the invest stage and Euro data is released soon after other models, it could appear sooner.
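As a rough sketch of the multiple-of-12 rule, the function below picks the forecast hour to verify from the hours elapsed since model data was first available. The function name and the fixed 12 hour margin are assumptions for illustration only (the margin is inferred from the 48 hour forecast being reached at 60 hours); the real schedule may step up differently.

```python
def forecast_hour_to_verify(hours_elapsed, step=12, cap=48, margin=12):
    """Return the forecast hour whose error would be checked, or None if too early.

    Assumptions (not taken from the actual system): a fixed `margin` of
    12 hours and a simple round-down to the nearest multiple of `step`.
    """
    usable = hours_elapsed - margin
    if usable < step:
        return None  # not enough data yet for even a 12 hour forecast
    # Round down to a multiple of `step`, never going past the 48 hour cap.
    return min(cap, (usable // step) * step)


for elapsed in (18, 24, 36, 48, 60, 72, 120):
    print(elapsed, forecast_hour_to_verify(elapsed))
```

Under these assumptions the result only ever lands on 12, 24, 36 or 48, so the 6, 18 and 30 hour positions that many models lack are never used.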
After a few days, especially 2.5 days in, the best performing models map should do well. It's just that early on you will see some models get it right that perhaps don't normally do so well. And with the issue I mentioned corrected, it should work better. (But sometimes a model will simply still be lucky.)
You can see from the average error over the life of the storm that many models are missing forecast position error at 6, 18 and 30 hours:
http://hurricanecity.com/models/models.cgi?basin=al&year=2016&storm=03&display=model_error&type=table&latestinvest=1&run=latest&error_type=average&interval=6&hour=120&position_unit=nm&intensity_unit=kts&show_cases=&heat_map=1&hide_zero_hour=&show_bearing=&trend=&show_trend_text=&zero_hour_within=10&intensity_option=&hide_model_names=&extra_info=2
This was only an issue early in the life of the storm. We can't calculate the error at the 48 hour forecast position until 48 hours have elapsed, so I increase the forecast hour slowly as that data becomes available. I wait a little longer so that we usually have more than one run to compare. In the end, for models that come out every 6 hours, we compare the last 24 hours of available model data, up to 4 runs. For models that come out every 12 hours, up to 2 runs.
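To make the "last 24 hours, up to 4 runs" idea concrete, here is a minimal sketch of how the runs to compare might be chosen for one model. The function name, the use of plain hour numbers for run times, and measuring the 24 hour window back from that model's latest run are all assumptions for illustration, not the site's actual code.

```python
def runs_to_compare(run_hours, interval_hours, window_hours=24):
    """Return the most recent runs of one model inside the comparison window.

    run_hours: hours (since model data was first available) with a run.
    interval_hours: how often the model comes out (6 or 12 in the text).
    """
    if not run_hours:
        return []
    max_runs = window_hours // interval_hours  # 4 for 6-hourly, 2 for 12-hourly
    latest = max(run_hours)
    # Keep only runs less than `window_hours` older than the latest run.
    recent = [h for h in sorted(run_hours) if latest - h < window_hours]
    return recent[-max_runs:]


print(runs_to_compare([0, 6, 12, 18, 24, 30], 6))
print(runs_to_compare([0, 12, 24, 36], 12))
```

If a run is missing inside the window, fewer runs are returned, which matches the "up to" wording above.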
Ensemble members are currently removed from the calculation, along with ensemble control members, although not an ensemble mean if available.
Some track models that are duplicates are also removed, such as:
SHIP - Statistical Hurricane Intensity Prediction Scheme (SHIPS) model
We include DSHP which has the same track but a better reflection of intensity as it accounts for land. SHIP does not.
CLP5 - CLImatology and PERsistence model 5-day (CLIPER5)
We do have OCD5 which has the same track. OCD5 though also carries the intensity from DSF5, so we want to display that one.
A complete list of models excluded from the calculation is available here:
http://hurricanecity.com/models/models.cgi?page=bestperformingmodels
It's a long list only in case some ensembles, which may rarely if ever appear, do start appearing.
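The exclusions described above amount to a simple filter applied before error is ranked. The sketch below is hypothetical: the function name is made up, "XX01" is a placeholder rather than a real model identifier, and the ensemble members are passed in by the caller because the real exclusion list is the one linked above, not reproduced here.

```python
# Duplicate-track models named in the text, mapped to the model kept instead.
DUPLICATE_TRACKS = {"SHIP": "DSHP", "CLP5": "OCD5"}


def eligible_models(model_ids, ensemble_member_ids=frozenset()):
    """Drop duplicate-track models and ensemble (or control) members.

    Ensemble means are kept simply by not listing them in
    ensemble_member_ids.
    """
    excluded = set(DUPLICATE_TRACKS) | set(ensemble_member_ids)
    return [m for m in model_ids if m not in excluded]


# "XX01" stands in for an ensemble member; it is not a real identifier.
print(eligible_models(["SHIP", "DSHP", "CLP5", "OCD5", "XX01"], {"XX01"}))
```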
A complete chart of how the calculation currently works is below. It's subject to further improvements.
Jim, the "66 and on" section I emailed you last night was wrong. It is updated correctly below.