Data Mining et al - Georg Ruß' PhD Blog - R, clustering, regression, all on spatial data, hence it's:

Details on the agriculture data (II)

By russ

Since the term precision farming is not as new as one might expect (see, e.g., the links at the end of the Wikipedia article on precision farming), the data I am working on has already been collected using methods of precision farming. There was one trial in 2003 for data collection and another one in 2004 for verification.

The collected attributes and amount of data are as mentioned in the last post. The farming variants that determine the amount of fertilizer are as follows:

human decision, based on personal experience of farmer, without in-field variability (13% of data)
human decision with smaller management zones (mapping, 30%)
sensor decision, based on on-line sensor input from the field, uses decision rules that have been mined off-line (30%)
nitrogen variation: progression of fertilizer amount to collect data (13%)
no specific variant (14%)

Next will be a plan on how to construct MLPs from these data and test them. There might be some delay due to my teaching obligations in this year’s winter term.

Posted in agriculture, data mining, English | Comments disabled

Details on the agriculture data (I)

By russ

The data set, kindly provided by Martin Schneider, was obtained from growing winter wheat.
It has roughly 5000 records for small-scale areas of a crop field, which contain the following attributes:

ID: numeric identifier
N1, N2, N3: there are three periods (at least in Germany) where fertilizer is applied; these values store the amount used per area
REIP32, REIP49: indexed value that measures the amount of sunlight reflected from the crop
EM38: electric conductivity of soil
Variant: categorical attribute, describes the management strategy applied to the area under consideration
tractive power: the amount of power that is needed to pull e.g. a plough
yield 2003, 2004: stores the yield from the respective area
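
To make the attribute list concrete, here is a minimal sketch of how one record could be represented in Python. The class name, the field names, the types and the example values are my own assumptions; the post gives neither units nor value ranges.

```python
from dataclasses import dataclass


@dataclass
class FieldRecord:
    """One small-scale area of the crop field (hypothetical layout, units unknown)."""
    record_id: int          # ID
    n1: float               # fertilizer amount, first application
    n2: float               # fertilizer amount, second application
    n3: float               # fertilizer amount, third application
    reip32: float           # reflectance index, first measurement
    reip49: float           # reflectance index, second measurement
    em38: float             # electrical conductivity of the soil
    variant: str            # management strategy applied to this area
    tractive_power: float   # power needed to pull e.g. a plough
    yield_2003: float
    yield_2004: float

    def total_nitrogen(self) -> float:
        """Sum of the three fertilizer applications for this area."""
        return self.n1 + self.n2 + self.n3


# Made-up values, purely to show the shape of a record.
example = FieldRecord(1, 60.0, 40.0, 30.0, 725.0, 728.0, 50.0, "sensor", 1.5, 7.0, 8.0)
```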

The target is quite similar to the one in the sports science category:

learn neural networks from the data
feed the networks with current year’s input data
predict this year’s yield and / or
optimize the amount of fertilizer by simulating different amounts and predicting with the ANNs
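
The last step, simulating different fertilizer amounts and keeping the one with the best prediction, could look roughly like the following sketch. It is written in Python rather than MatLab. The function `toy_predict` is a made-up stand-in for a trained network, and all names and numbers are assumptions for illustration only.

```python
def best_fertilizer(predict, fixed_inputs, candidates):
    """Try each candidate amount and keep the one with the highest predicted yield."""
    scored = [(predict(amount=a, **fixed_inputs), a) for a in candidates]
    best_yield, best_amount = max(scored)
    return best_amount, best_yield


def toy_predict(amount, reip49, em38):
    """Hypothetical stand-in for a trained network; the response curve is made up."""
    return 5.0 + 0.08 * amount - 0.0005 * amount ** 2 + 0.01 * reip49 - 0.02 * em38


candidates = range(0, 121, 10)             # fertilizer amounts to simulate (arbitrary units)
fixed = {"reip49": 728.0, "em38": 50.0}    # made-up sensor readings for one area
amount, predicted = best_fertilizer(toy_predict, fixed, candidates)
```

With a real model, `toy_predict` would be replaced by the trained network's prediction function; the search itself stays a plain loop over candidate amounts.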

Posted in agriculture, data mining, English | Comments disabled

Some more results for the sports science data

By russ

I finally ended up simplifying the whole task and starting from the very beginning. I had two data sets from two athletes with the same training attributes (data columns). The earlier MatLab script did some sort of pretraining with one data set and some sort of main training and cross validation with the second data set. Remember, I am still trying to reproduce the results from the paper (which were generated with Data Engine) using MatLab.
Read the rest of this entry »

Posted in data mining, English, sports science | Comments disabled

Some results for the sports science data

By russ

The prediction capabilities of the neural network that was coded in the last post do not seem to be as good as expected, at least not in the standard configuration. When I feed the data set (which I will not publish here) through the network and the cross validation, the results are as follows:
Read the rest of this entry »

Posted in data mining, English, sports science | Comments disabled

MatLab script v1 for the sports science data

By russ

A well-commented script that tries to model the data mining process of the sports scientists is online.
Below is a quick screenshot for reading; the script can be downloaded here.

There are some steps (two main steps) for training the network:

Since there is not much data available for training, additional data was taken from another athlete.
the network is initialized once and stored in a variable,
the network is pre-trained: it is assumed that it can then better adapt to the actual training data,
the main training is performed starting from the pre-trained network,
this is repeated (number of data) times, and cross validation is carried out.
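
The steps above can be sketched as follows. This is not the MatLab script itself, but a minimal Python illustration under several assumptions: a single linear neuron stands in for the neural network, the repetition is read as leave-one-out cross validation, and all data values are made up.

```python
def predict(w, x):
    """Output of a single linear neuron: bias plus weighted inputs."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def train(weights, data, epochs=200, lr=0.05):
    """Gradient-descent training; returns new weights, the input list is not modified."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, x) - y
            w[0] -= lr * err
            for i, xi in enumerate(x):
                w[i + 1] -= lr * err * xi
    return w


def pretrain_and_cross_validate(other_athlete, athlete, n_inputs):
    initial = [0.0] * (n_inputs + 1)             # initialized once and stored
    pretrained = train(initial, other_athlete)   # pre-training on the other athlete
    errors = []
    for i, (x, y) in enumerate(athlete):         # repeated once per data record
        rest = athlete[:i] + athlete[i + 1:]     # hold out record i
        model = train(pretrained, rest)          # main training starts from the pre-trained weights
        errors.append(abs(predict(model, x) - y))
    return errors


# Made-up records: ([training amounts], tournament result).
other_athlete = [([0.2, 0.1], 1.45), ([0.8, 0.4], 2.8), ([0.5, 0.9], 2.45), ([0.1, 0.6], 1.5)]
athlete = [([0.3, 0.2], 1.9), ([0.7, 0.5], 2.85), ([0.4, 0.8], 2.4),
           ([0.9, 0.3], 3.15), ([0.6, 0.6], 2.7)]
errors = pretrain_and_cross_validate(other_athlete, athlete, n_inputs=2)
```

Each entry of `errors` is the absolute prediction error for the record that was held out in that round.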

Read the rest of this entry »

Posted in data mining, English, sports science | Comments disabled

Details on the sports science data mining process

By russ

The current areas of application of the sports science data mining are

Olympic swimming
archery
disabled swimming

When it comes to the research targets, we are trying to

model the effects of different training strategies on the outcome of an upcoming tournament,
predict the tournament time (or any standardized measure of success) at the Olympic Games.

Read the rest of this entry »

Posted in data mining, English, sports science | Comments disabled

Prediction using sports science data

By russ

This project ties in with earlier work done by Jürgen Edelmann-Nusser and Nico Ganter: predicting athletes’ tournament swimming times using only their training data. It works as follows:

During the athletes’ training sessions, their amount of training in different disciplines (running, strength, stamina) is recorded.
The athletes complete a tournament and their results are recorded as well.
These data, consisting of training times and fields and the respective result in tournament, can be used to train one or more neural networks.
Once the neural networks are trained, one can predict or try to predict the outcome of the upcoming tournament.
Furthermore, one could adapt the athletes’ training strategy by varying the training parameters and applying the strategy with the best predicted tournament result.

Presumably, this work will be done using MatLab and its nnet Neural Networks toolbox. Since I’m on the application side of the work, I will probably be scripting the neural network stuff in MatLab and publishing the scripts here.

Posted in data mining, English, sports science | Comments disabled

Classification using neuroscience data

By russ

The work by Christoph Reichert (diploma thesis, computer science) and his supervisor Jochem Rieger, who works at the neuroscience school of the medical department, seems to be advancing towards a certain cooperation between neuroscience and computer science. In a typical neurological experiment, a subject is presented with a stimulus (an image) and has to choose whether he will recognize that particular image later on. During this time, his brain’s activity is recorded using MEG with high spatial (i.e. loads of sensors) and high temporal resolution. This activity is made accessible to a computer scientist using MatLab.

First, the task is to predict, from brain activity only, whether the subject will recognize the image or not. Due to the high dimensionality of the data, this classification (yes/no) will be performed by an SVM. From the SVM (or its separating hyperplane) the most significant activity that led to the choice of the classification plane can be obtained. Therefore, the classifier contributes to understanding which part of the brain is the most active or most relevant for the given task. Furthermore, a transformation from the spatial to the frequency domain using wavelets showed some more interesting, additional results.
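
How a separating hyperplane can point to the most relevant inputs is perhaps easiest to see in a toy example. The sketch below is not the method used in the thesis. It assumes a linear SVM, trained here with plain sub-gradient descent on the hinge loss, and it ranks the inputs by the absolute value of their weight in the hyperplane normal. The three "sensors" and their values are made up.

```python
def train_linear_svm(samples, labels, epochs=50, lr=0.1, lam=0.01):
    """Sub-gradient descent on the regularized hinge loss (labels are +1 or -1)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]   # regularization shrinks all weights
            if margin < 1.0:                          # sample violates the margin
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def rank_sensors(w):
    """Sensor indices ordered by absolute weight in the hyperplane normal."""
    return sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)


# Made-up data: only sensor 0 separates the two classes.
samples = [(2.0, 0.1, -0.1), (1.5, -0.1, 0.1), (-2.0, 0.1, -0.1), (-1.5, -0.1, 0.1)]
labels = [1, 1, -1, -1]
w, b = train_linear_svm(samples, labels)
ranking = rank_sensors(w)
```

Here `ranking[0]` is the index of the input with the largest weight, which in this toy data is the only sensor that carries information about the class.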

This work will be continued and the results so far look very promising.

Posted in data mining, English, neuroscience | Comments disabled
