Do relationships matter?

Daniel Davis – 16 March 2010

Banner image for Do relationships matter?

Recently I have been producing parametric models for Antoni Gaudí's Sagrada Família, in Excel. When I agreed to use Excel to produce the models, I signed on certain I would fail, such was the unlikeliness of Excel succeeding in producing parametric architecture. While I cannot say much about the project, I should say that we were only using Excel for research purposes and that the design of the Sagrada Família remains one of the world's most technologically advanced architectural projects. Using Excel drove me pretty close to the wall, particularly at 2am on a Friday as I got the model ready for a deadline and was trying to figure out a formula for a circle intersecting a plane in three dimensions. After two weeks on Excel I am not a convert, but I have begun to re-evaluate the worth of the more technologically advanced tools.
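For the curious, here is roughly what that 2am problem looks like outside of a spreadsheet. This is a minimal Python sketch and not the formula used on the project: the function name, the parameter names and the choice to describe the plane as `plane_normal · x = plane_offset` are all my own assumptions. It writes points on the circle as `center + radius * (cos(t) * u + sin(t) * v)`, substitutes that into the plane equation, and solves the resulting `a*cos(t) + b*sin(t) = c` for the angle `t`.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)


def circle_plane_intersection(center, circle_normal, radius,
                              plane_normal, plane_offset, eps=1e-12):
    """Return the 0, 1 or 2 points where a 3D circle crosses a plane.

    The plane is the set of points x with dot(plane_normal, x) == plane_offset
    (a hypothetical convention chosen for this sketch).
    """
    n = normalize(circle_normal)
    # Build two perpendicular unit vectors u, v that span the circle's plane.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))
    v = cross(n, u)

    # Substituting the circle into the plane equation gives
    # a*cos(t) + b*sin(t) = c.
    a = radius * dot(plane_normal, u)
    b = radius * dot(plane_normal, v)
    c = plane_offset - dot(plane_normal, center)

    amplitude = math.hypot(a, b)
    if amplitude < eps:
        # The circle lies parallel to the plane: it either misses the plane
        # or lies entirely in it. This sketch reports no discrete points.
        return []

    ratio = c / amplitude
    if abs(ratio) > 1.0:
        return []  # the plane passes the circle without touching it

    phase = math.atan2(b, a)
    delta = math.acos(ratio)
    angles = [phase] if delta < eps else [phase + delta, phase - delta]
    return [
        tuple(center[i] + radius * (math.cos(t) * u[i] + math.sin(t) * v[i])
              for i in range(3))
        for t in angles
    ]


# A unit circle lying flat on the ground, cut by the vertical plane x = 0.5.
points = circle_plane_intersection(
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, (1.0, 0.0, 0.0), 0.5
)
```

In Excel each of those intermediate values ends up in its own cell, which goes some way to explaining the hour of the night.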

One of the curiosities of Excel is that cell relationships are hidden. A cell displays data, and the only way to see its relationships is to click on the cell and view the formula. This is the inverse of graph-based parametric modelling tools like Grasshopper and GC, which graphically represent the relationships between nodes but hide the data - in Grasshopper you need to hover over each node to see the enclosed data. Both approaches achieve a similar result, which got me thinking about why architects use the graph-based approach while Excel still uses the spreadsheet-based approach.

I am not the first person to consider this. In 1997, Spreadsheet 2000 competed against Excel with a graph-based spreadsheet (shown above), which looks suspiciously similar to Grasshopper, complete with Bezier curves. Incredibly, Spreadsheet 2000 was itself created with a graph-based programming tool, Prograph, by Steve Wilson, who went on to become the vice president of systems at Oracle, where he still probably maps out systems using graphs. This must have all seemed pretty advanced in 1997, and it still looks advanced today, yet Spreadsheet 2000 failed before reaching the year 2000.

There does not seem to be any analysis of why Spreadsheet 2000 failed [update 27 June 2014: Steve Wilson commented below and explained the real reason, everything beyond here is inaccurate speculation], so I am going to speculate: relationships don't matter. The problem Spreadsheet 2000 was trying to solve was the invisibility of relationships in spreadsheets (Wilson talks about the aims of Spreadsheet 2000 here). The solution was to use a directed graph that exposed the relationships between cells. The graph is successful in doing this, although its verbosity makes it inefficient compared to a normal spreadsheet. For example, the relationship between the top right table and the bottom middle table is inversion. Using a graph, this takes up a significant amount of interface real estate: two lines and an intermediate node depicting 1/x. It could have just as easily been communicated by adding a column to the top right data headed 'Inverse'. The inefficiency of the graph increases with complexity, since more nodes make it more difficult to pick individual relationships out of the spaghetti.
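To make the comparison concrete, here is a small Python sketch of the same inversion expressed both ways. Nothing here comes from Spreadsheet 2000 or Grasshopper: the `Node` class, the node names and the sample values are all hypothetical, invented only to show how much extra structure the graph needs to say "1/x".

```python
from dataclasses import dataclass, field
from typing import Callable, List

values = [2.0, 4.0, 5.0]  # made-up sample data

# Spreadsheet style: the relationship is just one more column in the table.
spreadsheet = [{"Value": x, "Inverse": 1.0 / x} for x in values]


# Graph style: the relationship becomes its own node with two connections.
@dataclass
class Node:
    name: str
    compute: Callable[[List[List[float]]], List[float]]
    inputs: List["Node"] = field(default_factory=list)

    def evaluate(self) -> List[float]:
        # Pull data through the graph by evaluating upstream nodes first.
        return self.compute([node.evaluate() for node in self.inputs])


source = Node("Top right table", lambda _: list(values))
invert = Node("1/x", lambda ins: [1.0 / x for x in ins[0]], [source])
result = Node("Bottom middle table", lambda ins: ins[0], [invert])

column_output = [row["Inverse"] for row in spreadsheet]
graph_output = result.evaluate()
```

Both `column_output` and `graph_output` hold the same numbers. The spreadsheet version spends one dictionary key on the relationship, while the graph version spends a class, three nodes and two connections.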

By exposing the relationships, Spreadsheet 2000 aimed to reduce errors. It does not appear to work that way: there is no correlation between knowing the relationships between cells and the number of errors. Stephen Powell's "A Critical Review of the Literature on Spreadsheet Errors" cites a study in which one group of participants was given a spreadsheet on paper, so they could not see the cell relationships, while another group was given an electronic copy, so they could see the formulas. Both groups identified a similar proportion of the errors in the spreadsheet, 50%.* Anecdotally I agree with this; I have never stared at the Grasshopper graph and gone 'there is my mistake'. I do, however, see errors in the 3D model and then find myself working back through the graph, hovering over each node to see what the data inside looks like.

Spreadsheet 2000's failure seems to be a common occurrence for visual programming paradigms. It is only in parametric architecture that I have really seen graph-based tools like Grasshopper and GC become so widely adopted. I am unsure why this is the case. After using Excel I feel there is still room to improve on these paradigms, particularly the emphasis they place on relationships, which are often only needed at the time of construction. Despite this, unless you are Gaudí, I won't be giving up my graph-based representations again.

Download Spreadsheet 2000 here

* Similar research has been done by the European Spreadsheet Risks Interest Group, a group that gives me considerable comfort knowing that someone is looking out for our accountants and the risks they face.