I gave my talk at last week’s IFCS/GfKl conference in Dresden, sticking to the 15 minutes allotted to each speaker, plus five minutes for discussion. Unbeknownst to me, Alexander Brenning attended that particular session. He had worked on the final stages of the preagro collaborative research project, in which my co-authors had also been involved. Hence, he knew exactly what I was talking about, knew the data and where they came from, and could therefore raise two important issues:
The first one is that the data I’m working with are probably highly spatially autocorrelated. This stems from the spatial grid used in the preprocessing stage: neighboring data records (10x10m resolution) most likely cannot be treated as independent of each other. If they are treated as independent anyway, a resulting yield prediction might be biased.
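One common way to take such spatial dependence into account during validation is to split the data by spatial blocks instead of by single records, so that direct grid neighbors tend to end up in the same fold. The following is only a minimal sketch of that idea, not the procedure used in my work or suggested in the discussion. The function name `spatial_block_folds`, the 50 m block size, and the synthetic 200 m x 200 m field are all assumptions made for illustration.

```python
from collections import defaultdict


def spatial_block_folds(coords, block_size_m=50.0, n_folds=4):
    """Assign point indices to folds so that all points inside one
    square block of side `block_size_m` land in the same fold."""
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        # Block key: which square of the coarse grid the point falls into.
        key = (int(x // block_size_m), int(y // block_size_m))
        blocks[key].append(idx)

    folds = [[] for _ in range(n_folds)]
    # Distribute whole blocks over the folds in round-robin order.
    for i, key in enumerate(sorted(blocks)):
        folds[i % n_folds].extend(blocks[key])
    return folds


# Hypothetical field: records on a 10 m grid covering 200 m x 200 m.
coords = [(x, y) for x in range(0, 200, 10) for y in range(0, 200, 10)]
folds = spatial_block_folds(coords, block_size_m=50.0, n_folds=4)

for fold_id, fold in enumerate(folds):
    print(f"fold {fold_id}: {len(fold)} records")
```

Each fold can then serve once as the test set while the remaining folds form the training set. Records on either side of a block border are still neighbors, so this reduces the leakage between training and test data rather than removing it completely.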
The second issue is that an inner cross-validation should be performed for selecting model parameters. The idea is outlined in the “Limitations and Misuse” section of the Wikipedia article on cross-validation. I haven’t fully addressed this issue yet, but I’m getting there.
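To make the idea of an inner cross-validation concrete, here is a small sketch of nested cross-validation: the inner loop picks the model parameter using the outer training data only, and the outer loop estimates the error of the whole procedure on data that played no part in that choice. Everything in it is an assumption for illustration: the data are synthetic, and a one-dimensional k-nearest-neighbor regressor stands in for whatever model is actually being tuned.

```python
import random


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def knn_predict(train, x, k):
    """Mean target of the k training points closest to x (1-D input)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, test, k):
    """Mean squared error on `test` of a k-NN model built from `train`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in test) / len(test)


def split(data, fold):
    """Return (training part, test part) for one fold of indices."""
    held_out = set(fold)
    train = [p for i, p in enumerate(data) if i not in held_out]
    test = [data[i] for i in fold]
    return train, test


def cv_error(data, k, n_folds):
    """Ordinary cross-validation error of one parameter value."""
    errors = [mse(*split(data, fold), k) for fold in k_fold_indices(len(data), n_folds)]
    return sum(errors) / len(errors)


def nested_cv(data, candidates, outer_folds=5, inner_folds=4):
    """Outer loop estimates the error; inner loop selects the parameter."""
    outer_errors, chosen = [], []
    for fold in k_fold_indices(len(data), outer_folds):
        outer_train, outer_test = split(data, fold)
        # Parameter selection sees the outer training data only.
        best_k = min(candidates, key=lambda k: cv_error(outer_train, k, inner_folds))
        chosen.append(best_k)
        outer_errors.append(mse(outer_train, outer_test, best_k))
    return sum(outer_errors) / len(outer_errors), chosen


# Synthetic data (an assumption): noisy linear relation, random order.
rng = random.Random(0)
data = []
for _ in range(60):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))

candidates = [1, 3, 5, 9]
estimate, chosen = nested_cv(data, candidates)
print("nested CV error estimate:", estimate)
print("parameter chosen per outer fold:", chosen)
```

The point of the nesting is that the reported error comes from outer test folds that were never used to choose the parameter. Reporting the best inner cross-validation error instead would tend to be too optimistic.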
Therefore, the conference was indeed useful. People from different disciplines, like the geostatistician and mathematician mentioned above, take different approaches to the same problem. In addition, it seems that selecting the best heterogeneity indicators might be a feature selection task, so my research could drift in this direction.
Pure coincidence: Alexander also happened to speak at a workshop held by AgriCon, covering some of the issues he raised after my talk.