Georg Ruß' PhD Blog — R, clustering, regression, all on spatial data, hence it's:

July 14th, 2010

ICDM conference and DMA workshop

I’m currently at ICDM in Berlin; the conference took place in Leipzig in the past two years. Apart from the new location at Alexanderplatz, the quality is the same, and the conference is again very nice. Now that I’m a regular participant, I know many of the other attendees, which makes it easy to talk to them without lengthy introductions.

My work presented here is a continuation and extension of the IPMU work presented in Dortmund two weeks ago. Again, the emphasis is on getting data mining people into precision agriculture, where they’re really needed. The other aspect of my work is to make sure that spatial data are treated with spatial models; otherwise many of the assumptions behind non-spatial models (such as independence of the observations) are violated, which leads to misleading results.

In conjunction with ICDM I’m holding my workshop on Data Mining in Agriculture for the first time. It’s going to be held this afternoon, and so far I have seen only one of the three other presenters. The author of the book Data Mining in Agriculture, Antonio Mucherino, told me that he won’t be able to come for urgent personal reasons, which is a pity but understandable.

Some links to the above work: ICDM paper (in Springer LNAI series), DMA workshop paper, the workshop proceedings (of which I’m a co-editor).

January 5th, 2010

R scripts for ICDM’2010

The following is a link to the R scripts which generate the figures used in the ICDM’2010 (to-be-reviewed) paper. The functions for computing the root mean squared error are in 20-*R and 21-*R, where the first is for the non-spatial case and the second is for the spatial analysis, including clustering (which is a one-liner in R, just like many other things). The relevant functions are NonSpatialRegression() and spatialPredictionWithClustering(). The scripts might not be of much use without the data sets, but they can easily be tailored to other data sets. Should you have questions, feel free to drop me a line; I’m happy to answer. You might also consider participating in my workshop on Data Mining in Agriculture (DMA’2010).

Link: Rscripts-icdm2010.tar
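
For readers without the archive at hand, here is a minimal sketch of the two ingredients mentioned above: a root mean squared error function and the k-means one-liner on the sample coordinates. It is not taken from the linked scripts. The helper rmse(), the data frame field, and its columns (x, y, nitrogen, yield) are made up for illustration, and the data are simulated.

```r
# Hypothetical helper, not the function from the linked scripts.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

set.seed(42)

# Simulated stand-in for a field data set: coordinates, one predictor, yield.
n <- 200
field <- data.frame(
  x = runif(n),
  y = runif(n),
  nitrogen = runif(n, min = 50, max = 100)
)
field$yield <- 2 + 0.05 * field$nitrogen + rnorm(n, sd = 0.3)

# Spatial clustering of the sample points: this is the one-liner.
field$zone <- kmeans(field[, c("x", "y")], centers = 5)$cluster

# Non-spatial baseline: linear model, evaluated in-sample for brevity.
fit <- lm(yield ~ nitrogen, data = field)
in_sample_rmse <- rmse(field$yield, predict(fit, newdata = field))
```

Because the clustering uses only the coordinates, each zone is a spatially contiguous group of points, which is what makes it usable for a spatial analysis.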

January 5th, 2010

Paper summary for ICDM’2010

The following is a paper summary for the ICDM 2010 conference, which will be held in Berlin in July. It mainly elaborates on the issue of spatial autocorrelation in the agriculture data I’m using. It refers to my previous publications (2008, 2009) at this conference, where I presented standard regression approaches using different techniques for the task of yield prediction. It seems that, because of spatial autocorrelation, these techniques considerably underestimate the prediction error. I therefore developed an approach based on k-means clustering to enable yield prediction on spatial data sets. The conference reports from the previous years are here: 2008, 2009.
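
To make the autocorrelation issue concrete, the following sketch compares two ways of estimating the prediction error on the same simulated data: cross-validation with randomly assigned folds, and cross-validation where the folds are k-means clusters of the coordinates. Using the clusters as folds is an assumption about how such an approach could look, not a reproduction of the method in the paper. The names cv_rmse(), field, and the regression formula are hypothetical, and the data are simulated.

```r
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

# Cross-validated RMSE of a linear model for a given fold assignment.
# Assumes the response column is called "yield".
cv_rmse <- function(data, folds, formula) {
  errors <- sapply(sort(unique(folds)), function(f) {
    train <- data[folds != f, ]
    held_out <- data[folds == f, ]
    fit <- lm(formula, data = train)
    rmse(held_out$yield, predict(fit, newdata = held_out))
  })
  mean(errors)
}

set.seed(1)

# Simulated field: a smooth spatial trend makes neighbouring points similar.
n <- 400
field <- data.frame(x = runif(n), y = runif(n))
field$yield <- sin(4 * field$x) + cos(3 * field$y) + rnorm(n, sd = 0.1)

k <- 10
# Random folds: held-out points have close neighbours in the training set.
random_folds <- sample(rep(seq_len(k), length.out = n))
# Spatial folds: each held-out set is one contiguous region of the field.
spatial_folds <- kmeans(field[, c("x", "y")], centers = k)$cluster

model <- yield ~ poly(x, 3) + poly(y, 3)
rmse_random <- cv_rmse(field, random_folds, model)
rmse_spatial <- cv_rmse(field, spatial_folds, model)
```

With random folds, every held-out point has near neighbours in the training set, so the error estimate can look better than it would on a genuinely new part of the field. Holding out whole regions removes that advantage. How large the gap is depends on the data and the model.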