At the moment I’m somewhat split between teaching “Intelligent Systems” courses and thinking about the agriculture data as well as the sports science data. In yesterday’s meeting, we discussed the three data blocks that are available from them. I’ll receive the purged data soon. One thing that was mentioned was the issue of normalization.

Before a neural network can be learned from the data, I have to apply normalization. In earlier work of the sports scientists, this has been done with something like Matlab’s mapminmax, mapping to the interval from 0 to 1 (btw, another function I discovered after having implemented it myself). These are sports science data, consisting of attributes describing the athlete’s training during a four-week pre-tournament period. The target variable is the tournament result itself. I assume I can learn an ANN from those data that predicts the tournament result. The issue here is what to do when an ANN has already been modeled and a new data record comes in with one or more values outside the range that was used in the latest normalization — think of a new minimum time or world record as the target variable.

An example: When I apply the normalization with the same settings as before, the new value will fall outside the range.
We have three original, unnormalized values: 12, 15, 18.
Normalization to [0,1] yields: 0, 0.5, 1
When I apply the same processing settings to the new set of values: 12, 15, 18, 21,
the normalization yields: 0, 0.5, 1, 1.5, since it uses the same min/max as before.
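To make the arithmetic above concrete, here is a minimal sketch in Python rather than Matlab’s mapminmax (the function names `fit_minmax` and `apply_minmax` are mine, just for illustration): the key point is that the min/max are fitted once and then reused unchanged for new records.

```python
def fit_minmax(values):
    """Return the (min, max) of the data used for fitting."""
    return min(values), max(values)

def apply_minmax(values, lo, hi):
    """Scale values to [0, 1] using a previously fitted min/max."""
    return [(v - lo) / (hi - lo) for v in values]

train = [12, 15, 18]
lo, hi = fit_minmax(train)                      # lo = 12, hi = 18
print(apply_minmax(train, lo, hi))              # [0.0, 0.5, 1.0]

# A new record (21) normalized with the *old* settings exceeds 1:
print(apply_minmax([12, 15, 18, 21], lo, hi))   # [0.0, 0.5, 1.0, 1.5]
```

Refitting the min/max on the extended data would of course keep everything in [0, 1], but then all previously normalized records (and the trained network’s input scaling) would no longer match.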

I’m not sure yet what the neural network will do when encountering such an out-of-range input. It seems best to simply give it a try.