The Kazutsugi Numerai tournament and poor performance (September 5, 2019)


Everyone's tournament reputation on Numerai has suffered after the drastic changes put in place 8 weeks ago. What do I mean by drastic? Numerai reduced the number of tournaments each week from 5 tournaments to 1. For the remaining tournament, the forecast performance metric was changed to "correlation" from AUC. And finally the dataset was expanded from 50 anonymous features to 310 features with the forecasting target having five possible values (0/0.25/0.5/0.75/1) rather than binary (0/1).
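To make the switch from AUC to "correlation" concrete, here is a minimal sketch of scoring forecasts against a five-valued target. The post does not say which correlation Numerai computes, so the Spearman (rank) correlation used here is an assumption, and the function names and sample numbers are made up for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_correlation(predictions, targets):
    """Spearman correlation: Pearson correlation of the ranks (an assumed metric)."""
    return pearson(average_ranks(predictions), average_ranks(targets))


# Hypothetical targets drawn from the five possible values, with made-up forecasts.
targets = [0.0, 0.25, 0.5, 0.75, 1.0, 0.5, 0.25, 0.75]
forecasts = [0.41, 0.47, 0.52, 0.49, 0.58, 0.50, 0.53, 0.55]
print(f"rank correlation: {rank_correlation(forecasts, targets):.4f}")
```

Under this assumed metric, only the ordering of the forecasts matters, not their scale, and a negative value means the forecasts ranked the rows worse than chance.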

Over the 4 weeks of realized results so far, only 9 out of 777 staked submissions managed to do well enough to be rewarded.
Kazutsugi 168: 4/184 staked submissions were good enough to be rewarded a total of 28.5694 USD. (2.99NMR+13.44USD, assuming a conversion rate of 5.06USD=1NMR)
Kazutsugi 169: 3/186 staked submissions. 3.7022 USD rewarded. (0.37NMR+1.83USD)
Kazutsugi 170: 2/200 staked submissions. 3.9834 USD rewarded. (0.39NMR+2.01USD)
Kazutsugi 171: 0/207 staked submissions. Somehow 0.02 USD was rewarded, and I assume that's some sort of rounding or web coding error.
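The USD totals above are the NMR part converted at the assumed rate plus the USD part. A small sketch of that arithmetic, using the figures from the list (round 171 is left out because no NMR/USD split is given for it; the dictionary layout and function name are just for illustration):

```python
NMR_TO_USD = 5.06  # conversion rate assumed in the post

# Per-round figures copied from the list above.
rounds = {
    168: {"rewarded": 4, "staked": 184, "nmr": 2.99, "usd": 13.44},
    169: {"rewarded": 3, "staked": 186, "nmr": 0.37, "usd": 1.83},
    170: {"rewarded": 2, "staked": 200, "nmr": 0.39, "usd": 2.01},
}


def total_usd(nmr, usd, rate=NMR_TO_USD):
    """Convert the NMR part to USD at the given rate and add the USD part."""
    return nmr * rate + usd


for number, r in rounds.items():
    total = total_usd(r["nmr"], r["usd"])
    share = r["rewarded"] / r["staked"]
    print(f"Kazutsugi {number}: {total:.4f} USD, "
          f"{share:.1%} of staked submissions rewarded")
```

The same function applied to the Bernie 158 figures quoted below (19 NMR and 76.08 USD) gives a total at the same assumed rate, although the NMR price at the time of that round may have been different.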

Despite the misleading title -- I am clickbaiting -- I don't actually think the poor performance is due to the tournament changes. Bernie 158's payout was 19 NMR and 76.08 USD, with 15/121 staked submissions being good enough. Only marginally better. Numerai contestants' performance is cyclical, which suggests it is correlated with the financial markets. And the recent poor performance again highlights the limitations of machine learning/data science. These new tools are not some oracular magic spells, as some people seem to think and companies sometimes advertise. Machine learning can improve forecasting a bit by essentially relaxing constraints relative to the methodologies that came before it, at the risk of over-fitting and making very wrong forecasts. The magic we sometimes do see has more to do with understanding the data well and coming up with a better way to use it -- and that's hard with the Numerai tournament because the nature of its data is mostly hidden. (There are actually some 'advances' being made with using eras for forecasting, now that Numerai has made clear that eras are chronologically ordered time periods.)

Lastly, it is possible that even with the poor payout, the hedge fund is doing well. The drivers of the poor payout are poor forecasting due to the markets (as indicated by the live correlations being mostly negative) and how Numerai compensates contestants via the correlation metric.