Datamining Grasshopper

Daniel Davis – 20 September 2011

Recently I edited out a section from a top-be-published journal article because it meandered off topic and past the word count. Despite the journal editor’s draconian word count, I thought it potentially useful to other people. So here it is, rewritten and free of word counts.

There are 2035 parametric models shared publicly on the Grasshopper forum. I downloaded these models and looked for trends in the way they were structured.

1. Most popular nodes

Top 1-30 nodes

Unsurprisingly the slider was the most popular node with 11,842 occurrences in the 2035 models. Panel, Group and Scribble come in 2nd, 3rd and 9th place, which is the result of people documenting their models (in the forum people know their model will be read so they put some effort into explaining it). 4th place is occupied by ‘List Item,’ which leads a group of list management nodes - Series, Cull, Graft, List Length, Flatten Tree - dominating the top 30 most popular nodes. It is significant that the most popular modelling operation in Grasshopper should be list management. When teaching Grasshopper I find list management to be the hardest topic to explain, and yet Grasshopper's list management tools are some of the most powerful short of pure scripting. It should be pointed out these results are a little biased in that there are many different types of geometric modelling nodes and while no single node is popular, in aggregate they are popular. Nevertheless the list of most popular nodes should serve as a useful guide for teaching Grasshopper and the customisation of the Grasshopper menus.

The full list of most popular nodes can be downloaded here.

2. Strangely unpopular

The neglected nodes speak volumes for how people use Grasshopper. Falling down the bottom of the popularity list is poor old 'Cluster,' who sits in 159th place. There are a few reasons for this:

  1. Cluster has only been available in Grasshopper 0.8 while the models in the Grasshopper forum stretch back to version 0.6.
  2. People are probably more inclined to share small snippets of models on the forum and therefore don't require clusters.
  3. Clusters was buggy when it was re-released.

Yet, if we only look at the models created in Grasshopper 0.8 and only look at the models that contain more than 26 nodes, only 3.6% have one or more clusters in them :( This means most of the models people are creating in Grasshopper, even the really really really large ones, are totally unstructured, like the example in the next section:

3. Model size

The medium size of a model shared on the forum is 26 nodes, probably lower than what you would expect in practice due to people often sharing just small snippets of models. Yet a couple of models contained over 1000 nodes. The largest model I could find on the Grasshopper forum is a monster by the name of 'hybridB02 (2).ghx' (in this thread) weighing 20mb and containing 2584 nodes. It looks a little bit like this:

Biggest Grasshopper model

hybridB02 (2).ghx is a pretty classic case of copy-paste, and if it gets the job done who cares. But this type of unstructured model could be a huge liability if it ever needs to be edited. Say the original pink box is wrong, you edit it but to propagate the edit through the model you need to delete every instance you copy-pasted and copy-paste in the new version. Or, as we often do, start again.

Refactored large Grasshopper model

I was curious about what the model would look like refactored and quickly attempted to refactor it myself (see above). One of my first moves was to reduce the dimensionality of the model. There are approximately 500 sliders in the original model, so the model is effectively no longer parametric since moving the 500 sliders would be just as difficult as moving 500 points on an explicit model by hand. Often a formula is used to reduce dimensionality of a parametric model, however in this case I used rotational symmetry to simplify things. This might have made the geometry too neat for some tastes but you could always use a formula to tweak each rotated instance. The resulting model has 126 nodes and 20 sliders, making exploration of the design space far more viable - although the model is by no means perfect.

Refactoring the model introduced quite a few of the list management nodes. This is both the reason they are popular (they are really useful) and the reason managing lists is problematic (people starting out in Grasshopper are unlikely to use them). In completing this research I had planned to make one more list of my own, a list of popular combinations of nodes but this will likely have to wait for another paper and a more friendly editor. It looks like my friend the cluster will have to wait a while longer too, as I have quite subconsciously left him out of the refactored model.

And if you are in, or near, Melbourne in November, we are going to be hosting a week long computational design workshop with Hugh Whitehead - Director of the Specialist Modelling Group at Fosters. Find out more at http://designingthedynamic.com/