For now, here’s the zip file with R scripts: hacc-spatial.zip.

From the thesis abstract:

The second task is concerned with management zone delineation. Based on a literature review of existing approaches, a lack of exploratory algorithms for this task is concluded, in both the precision agriculture and the computer science domains. Hence, a novel algorithm (HACC-spatial) is developed, fulfilling the requirements posed in the literature. It is based on hierarchical agglomerative clustering incorporating a spatial constraint. The spatial contiguity of the management zones is the key parameter in this approach. Furthermore, hierarchical clustering offers a simple and appealing way to explore the data sets under study, which is one of the main goals of data mining.
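
To make the idea of "hierarchical agglomerative clustering incorporating a spatial constraint" more concrete, here is a minimal Python sketch. It is not the HACC-spatial code from the zip file (that is written in R), and it does not reproduce the contiguity factor: it assumes a hard rule under which only spatially adjacent clusters may be merged. The function name `constrained_agglomerative`, the `radius` neighbourhood rule and the use of mean attribute vectors as the merge criterion are all assumptions made for illustration.

```python
import math
from itertools import combinations


def constrained_agglomerative(coords, values, n_clusters, radius):
    """Merge spatially adjacent clusters with the most similar mean values.

    coords: list of (x, y) tuples; values: list of equal-length feature lists.
    Two clusters count as adjacent if any pair of their points lies within
    `radius` (a hypothetical neighbourhood rule chosen for this sketch).
    Returns one cluster label per point.
    """
    n = len(coords)
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices

    # Point-level adjacency, computed once up front.
    adjacent = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if math.dist(coords[i], coords[j]) <= radius:
            adjacent[i].add(j)
            adjacent[j].add(i)

    def mean(members):
        dims = len(values[0])
        return [sum(values[m][d] for m in members) / len(members)
                for d in range(dims)]

    def touching(a, b):
        members_b = set(clusters[b])
        return any(adjacent[m] & members_b for m in clusters[a])

    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            if not touching(a, b):
                continue  # spatial constraint: skip non-adjacent pairs
            d = math.dist(mean(clusters[a]), mean(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        if best is None:  # no adjacent clusters left to merge
            break
        _, a, b = best
        clusters[a].extend(clusters.pop(b))

    # Relabel the surviving clusters as 0, 1, 2, ...
    labels = [None] * n
    for label, cid in enumerate(sorted(clusters)):
        for m in clusters[cid]:
            labels[m] = label
    return labels
```

The sketch recomputes all candidate pairs in every step, so it is only meant for small examples. Stopping the loop at different values of `n_clusters` corresponds to cutting the merge hierarchy at different levels, which is what makes this kind of clustering convenient for exploration.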

The thesis itself can be found here: PhD thesis (32MB pdf). The algorithm is described on pdf page 124 (print page 114): hacc-spatial-algorithm.pdf.

Further explanations and shorter descriptions can be found in two publications, both available in full text: Exploratory Hierarchical Clustering for Management Zone Delineation in Precision Agriculture and Machine Learning Methods for Spatial Clustering on Precision Agriculture Data.

Let me know if you have questions, comments or even successful results from applying the algorithm to your data sets.

There are also two YouTube videos of the clustering (with an additional pre-clustering step, the “initial phase”): F440-REIP32-movie.avi and F611-REIP32-movie.avi. The end of both videos is probably where it gets interesting. Compare the plots for the REIP32 variable of the F440 and F611 data sets (F440: PhD pdf page 185, clustering on page 138; F611: PhD pdf page 195).

**Important points**

- The algorithm was designed to work with spatial data sets: each data record/point in the data set represents a vector of values which also has a location in space (2D/3D).
- The data points should be spatially roughly uniformly distributed (probably with high density, although that doesn’t really matter). That is, the algorithm does not and cannot rely on density differences in the geospatial data distribution.
- The input structure for the R scripts is a SpatialPointsDataFrame with variables. The algorithm (the function) lets you select particular variable(s) for clustering, i.e. you may use multiple variables at once.
- The algorithm is definitely not optimized for speed. It served my purposes well, but may take a while to run on your data.
- The contiguity factor **cf** is subject to experimentation.
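
As a rough aid for the second point above, the following Python sketch measures how evenly a set of points is spread by looking at nearest-neighbour distances. It is not part of the R scripts. The function name `nearest_neighbour_spread` and the idea of using the coefficient of variation as an indicator are assumptions made for illustration, and no particular threshold is implied.

```python
import math
import statistics


def nearest_neighbour_spread(coords):
    """Coefficient of variation of nearest-neighbour distances.

    coords: list of (x, y) tuples with at least two distinct points.
    Values near 0 indicate evenly spaced points (e.g. a regular grid);
    larger values indicate strong local density differences.
    """
    nearest = []
    for i, p in enumerate(coords):
        # Distance from point i to its closest other point.
        nearest.append(
            min(math.dist(p, q) for j, q in enumerate(coords) if j != i)
        )
    return statistics.pstdev(nearest) / statistics.mean(nearest)
```

On a regular grid every point has the same nearest-neighbour distance, so the function returns 0; a data set with a few tight clumps and large gaps gives a clearly larger value.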

Apart from that, there’s not much to comment on (yet). Let me know about questions or issues and I may be able to fix them or list further requirements here.

mail: researchblog@georgruss.ch

]]>The dissertation has now been published by the library and can also be found online here: http://blog.georgruss.de/?page_id=358

]]>

Three things made me want root access:

- It’s my device and I decide what to do with it (and Google does, in this case, since it’s Android).
- Remove some Chinese apps.
- Get FasterFix to work.

After I had root access, I used Ghost Commander because it can simply remount /system as rw or ro. I could have done that via the console, but it’s much more convenient that way.

]]>(this merits a new category at the top level)

]]>With the official date of 23.11.2011, I submitted my dissertation today. Now it is the faculty council’s turn, then the reviewers’, and if everything goes smoothly, it will be my turn at the defense. Subject to approval by the faculty council, the defense will take place on 23.02.2011, 3 pm, in 29-301. The dissertation title matches the title of this blog.

By the way, I paid **42** EUR for the binding. That can’t be a coincidence!

Here’s the script: minihomertool, version 2011-09-24

]]>