At the moment I’m somewhat split between teaching “Intelligent Systems” courses and thinking about the agriculture data as well as the sports science data. In yesterday’s meeting, we discussed the three data blocks that are available from them. I’ll receive the purged data soon. One thing that was mentioned was the issue of normalization.

Before a neural network can be learned from the data, I have to apply normalization. In earlier work of the sports scientists, this has been done with something like Matlab’s mapminmax, mapping to the interval from 0 to 1 (btw, another function I discovered after having implemented it myself). These are sports science data, consisting of attributes describing the athlete’s training during a four-week pre-tournament period. The target variable is the tournament result itself. I assume I can learn an ANN from those data that predicts the tournament result. The issue here is what to do when an ANN has already been modeled and a new data record comes in with one or more values outside the range that was used in the latest normalization — think of a new minimum time or world record as the target variable.

An example: When I apply the normalization with the same settings as before, the new value will fall outside the range.
We have three original, unnormalized values: 12, 15, 18.
Normalization to [0,1] yields: 0, 0.5, 1
When I apply the same processing settings to the new set of values: 12, 15, 18, 21,
the normalization yields: 0, 0.5, 1, 1.5, since it uses the same min/max as before.
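To make the arithmetic above concrete, here is a minimal sketch in Python rather than Matlab’s mapminmax (the function names `fit_minmax` and `apply_minmax` are mine, just for illustration): the key point is that the min/max are fitted once and then reused unchanged for new records.

```python
def fit_minmax(values):
    """Return the (min, max) of the data used for fitting."""
    return min(values), max(values)

def apply_minmax(values, lo, hi):
    """Scale values to [0, 1] using a previously fitted min/max."""
    return [(v - lo) / (hi - lo) for v in values]

train = [12, 15, 18]
lo, hi = fit_minmax(train)                      # lo = 12, hi = 18
print(apply_minmax(train, lo, hi))              # [0.0, 0.5, 1.0]

# A new record (21) normalized with the *old* settings exceeds 1:
print(apply_minmax([12, 15, 18, 21], lo, hi))   # [0.0, 0.5, 1.0, 1.5]
```

Refitting the min/max on the extended data would of course keep everything in [0, 1], but then all previously normalized records (and the trained network’s input scaling) would no longer match.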

I’m not sure yet what the neural network will do when encountering such an out-of-range input. It seems best to simply give it a try.