Quite a few publication deadlines are approaching in January, and I will submit another paper detailing some of the recent results on the agriculture data to one of them. One of the conferences is ICDM 2008, held in Leipzig, Germany. It targets industrial applications of data mining, and I feel the paper fits in there quite nicely.

I have also received more data sets from Martin Schneider at Martin-Luther-University of Halle-Wittenberg, which will have to be mined. There are quite a lot of interesting tasks to be performed on those data, and that requires thorough planning. I probably won't get to that planning until I return from an organizational business trip to Melbourne, which is meant to start a cooperation project between our research group and the one that I worked with in 2004/2005.

An interesting result so far concerns the three-stage field data I have described in this article. The complete data can be split into parts such that each subsequent part contains more data than the one before, as more sensor measurements become available during the growing season. As expected, the more data we have to predict the yield from, the smaller the prediction error should be. This could be verified, and the result is shown in the figure below: FT3 contains more data than FT2, which in turn contains more data than FT1. The details are given in the paper I will submit to ICDM 2008 soon. Accepted or not, I will probably publish the paper here. Feel free to remind me in case I forget to do so.
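
For readers who want to try this kind of comparison themselves, here is a minimal sketch in Python. It is not the script or the network model used for the paper. It assumes synthetic data with six made-up sensor columns, a plain linear least-squares model as a stand-in for the neural network, and hypothetical names throughout (TRUE_WEIGHTS, STAGES, fit_ols, compare_stages). The stage names FT1 to FT3 are borrowed only to mirror the nesting idea: each stage sees all columns of the previous one plus two more.

```python
import random

# Hypothetical setup: six synthetic "sensor" columns, revealed two at a time.
TRUE_WEIGHTS = [3.0, 3.0, 2.0, 2.0, 1.0, 1.0]   # assumed, for illustration only
STAGES = {"FT1": 2, "FT2": 4, "FT3": 6}          # number of columns per stage


def make_data(n_rows, rng):
    """Generate synthetic rows and a yield that depends linearly on all columns."""
    rows = [[rng.random() for _ in TRUE_WEIGHTS] for _ in range(n_rows)]
    target = [sum(w * x for w, x in zip(TRUE_WEIGHTS, row)) + rng.gauss(0.0, 0.1)
              for row in rows]
    return rows, target


def design(rows, n_cols):
    """Keep the first n_cols columns and prepend a constant bias term."""
    return [[1.0] + row[:n_cols] for row in rows]


def fit_ols(X, y):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(n):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]


def mean_abs_error(X, y, w):
    """Mean absolute difference between predictions X*w and targets y."""
    preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    return sum(abs(p - t) for p, t in zip(preds, y)) / len(y)


def compare_stages(seed=0, n_train=400, n_test=200):
    """Fit one model per stage and return the held-out mean absolute error."""
    rng = random.Random(seed)
    rows, target = make_data(n_train + n_test, rng)
    result = {}
    for name, n_cols in STAGES.items():
        X = design(rows, n_cols)
        w = fit_ols(X[:n_train], target[:n_train])
        result[name] = mean_abs_error(X[n_train:], target[n_train:], w)
    return result


errors = compare_stages()
for name, err in errors.items():
    print(f"{name}: held-out mean absolute error = {err:.3f}")
```

On this synthetic data the held-out error shrinks from FT1 to FT3 because every added column carries real signal. With real field data the gain per stage depends on how informative the additional measurements are.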

Comparison of absolute network errors, for submission to ICDM2008
And the accompanying MATLAB script that produced the above figure.