Diving deeper into Matlab’s endless built-in functions, I discovered (i.e. read) MathWorks’ nnet manual. I usually abhor user manuals for specific programming languages, but MathWorks has made this one an enjoyable read.
When applying cross validation, one has to choose an appropriate fold size before the data set can be split. An empirical way to do this is to try different fold sizes and check at which size the error reaches a minimum (or some other criterion is met). This is what the latest Matlab script does.
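The idea can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the script from the post: the toy data, the list of candidate fold counts and all variable names are made up, and a plain least-squares fit stands in for the MLP so that the sketch runs without the Neural Network Toolbox.

```matlab
% Toy data (hypothetical): 120 records, 3 attributes, noisy linear target.
n = 120;
X = rand(n, 3);
y = X * [2; -1; 0.5] + 0.1 * randn(n, 1);

foldCounts = [2 3 4 5 6 8 10];         % candidate numbers of folds (assumed)
cvError = zeros(size(foldCounts));

for c = 1:numel(foldCounts)
    k = foldCounts(c);
    idx = randperm(n);                 % shuffle the records once per setting
    foldId = mod(0:n-1, k) + 1;        % assign shuffled records to folds 1..k
    sqErr = 0;
    for f = 1:k
        testIdx  = idx(foldId == f);
        trainIdx = idx(foldId ~= f);
        % Stand-in model (not an MLP): least-squares linear fit with bias term.
        w = [X(trainIdx, :), ones(numel(trainIdx), 1)] \ y(trainIdx);
        pred = [X(testIdx, :), ones(numel(testIdx), 1)] * w;
        sqErr = sqErr + sum((pred - y(testIdx)).^2);
    end
    cvError(c) = sqErr / n;            % mean squared error over all held-out records
end

[minErr, best] = min(cvError);
fprintf('lowest CV error %.4f with %d folds\n', minErr, foldCounts(best));
```

With an MLP in place of the linear fit, each setting would probably have to be repeated several times, since the random weight initialization adds variance of its own.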
To clean up the main code of the cross-validation script, I decided to factor out the part that deals with splitting the data into training and testing sets. After I had written and verified the code, I discovered Matlab’s built-in dividevec function from the Neural Network Toolbox. It does something similar and was introduced in R2006a:
The dividevec function facilitates dividing your data into three distinct sets to be used for training, cross validation, and testing, respectively. Previously, you had to split the data manually.
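For comparison, a manual three-way split might look roughly like the following. It is a sketch under assumptions and does not reproduce dividevec itself: the toy data, the 20 % shares and the variable names are all hypothetical. Records are stored as columns, which is the layout the Neural Network Toolbox works with.

```matlab
% Hypothetical data: one record per column.
p = rand(3, 100);          % inputs: 3 attributes x 100 records
t = sum(p, 1);             % targets: 1 x 100

valFrac  = 0.2;            % assumed share of records for validation
testFrac = 0.2;            % assumed share of records for testing

n = size(p, 2);
idx = randperm(n);         % random order of the record indices
nVal  = round(valFrac * n);
nTest = round(testFrac * n);

valIdx   = idx(1:nVal);
testIdx  = idx(nVal+1 : nVal+nTest);
trainIdx = idx(nVal+nTest+1 : end);

% Inputs and targets are indexed with the same columns so they stay aligned.
trainP = p(:, trainIdx);  trainT = t(:, trainIdx);
valP   = p(:, valIdx);    valT   = t(:, valIdx);
testP  = p(:, testIdx);   testT  = t(:, testIdx);
```

Every record ends up in exactly one of the three sets, because the index ranges are taken from a single permutation.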
In the recent scripts I used Matlab’s rand() function to seed the random number generator, aiming at reproducible results. However, if the data can be modeled with an MLP, then training should converge towards the same solution most of the time, provided that the initial weights are not too different (here they range from -1 to 1).
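A small sketch of what seeding buys: resetting the generator to the same state yields the same initial weights. The seed value and the 4-by-3 weight matrix are arbitrary choices for illustration, and the legacy rand('state', ...) syntax is assumed here; newer Matlab releases recommend rng(seed) instead.

```matlab
seed = 1;                        % arbitrary seed, chosen for illustration
rand('state', seed);             % legacy syntax; newer releases recommend rng(seed)
w1 = 2 * rand(4, 3) - 1;         % hypothetical weight matrix, uniform in [-1, 1]

rand('state', seed);             % reset the generator to the same state
w2 = 2 * rand(4, 3) - 1;

disp(isequal(w1, w2));           % identical draws after reseeding
```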
Yesterday’s entry featured a short cross validation, using a fixed network structure. This structure should be verified and improved.
Just to get an impression of the nature of the data, I slightly edited the Matlab script that I had used with the sports science data and applied it.
I have been mentioned in a blog post by Sandro Saitta. He is already on my blogroll. His blog was one of the reasons I had to start my own blog for documenting my research right here.
Since the term precision farming is not as new as one might expect (see, e.g., the links at the end of the Wikipedia article on precision farming), the data I am working on has already been collected using methods of precision farming. There was one trial in 2003 for data collection and another one in 2004 for verification.
The collected attributes and amount of data are as mentioned in the last post. The farming variants that determine the amount of fertilizer are as follows:
Next will be a plan on how to construct MLPs from these data and test them. There might be some delay due to my teaching obligations in this year’s winter term.
The data set, which was kindly provided by Martin Schneider, was obtained from growing winter wheat.
It has roughly 5000 records for small-scale areas of a crop field, which contain the following attributes:
The target is quite similar to the one in the sports science category:
I finally ended up simplifying the whole task and starting from the very beginning. I had two data sets from two athletes with the same training attributes (data columns). The earlier Matlab script did some sort of pretraining with one data set and some sort of main training and cross validation with the second. Remember, I am still trying to reproduce the results from the paper (which were generated with Data Engine) using Matlab.