Sorry about the hiatus. I’ve been working on a fascinating project at Bloomberg lately. It is an extremely challenging combination of working with existing systems and moving to a new infrastructure for handling the ever-increasing amount of data they have to deal with. If you know C++, some Python, a little JavaScript, maybe some Java, and can deal with complicated environments and working with smart people who wrangle huge data, we have positions open. Contact me at klewis95@bloomberg.net or kal@kalx.net.
Back to the topic. What is a spreadsheet? I have been reading “Elements of Programming” by Alexander Stepanov and was surprised to learn that the latest fashion in the computer science world seems to be something I learned as a second-year grad student getting my MA in math at the University of Hawaii. In the course Logic and Set Theory I learned that a mathematical concept is defined by the rules it satisfies. (C++ Concepts [Lite] is a watered-down version of this.) For example, a vector space is any set with a commutative addition and a scalar multiplication that satisfy the distributive laws, among the other usual axioms. Today’s subway reading was http://gauss.cs.ucsb.edu/~aydin/GraphBLAS_API_C.pdf. At this point anyone without a PhD in math might want to stop reading, but I’ll jot down a few thoughts. They can be made mathematically rigorous using the language of Cartesian Closed Categories.
A spreadsheet is a function I → C from a set of indices I to a set of cells C.
An index is just an element of a set, and a cell has a value and perhaps a formula.
A value can be a number, or string, or Boolean, or a reference to one or more cells, or an error. (Or a couple of other things if you’ve been following along with how I stay faithful to the original Microsoft C API.)
A formula is a function from zero or more cells to a value.
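The definitions above could be written down roughly as follows. This is only a sketch in Python: every name in it (Index, Value, Formula, Cell, Spreadsheet) is made up for illustration, indices are assumed to be (row, column) pairs, references and errors are left out of the value type, and a formula is simplified to act on the values of the cells it names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple, Union

# Hypothetical names throughout; none of these come from the Excel C API.
Index = Tuple[int, int]            # I: assumed here to be (row, column)
Value = Union[float, str, bool]    # V: references and errors omitted for brevity


@dataclass(frozen=True)
class Formula:
    """F: a function from zero or more cells (named by index) to a value."""
    args: Tuple[Index, ...]
    fn: Callable[..., Value]


@dataclass(frozen=True)
class Cell:
    """C: a value and perhaps a formula."""
    value: Optional[Value] = None
    formula: Optional[Formula] = None


# S: a spreadsheet is a (partial) function I -> C, modeled as a dict.
Spreadsheet = Dict[Index, Cell]

sheet: Spreadsheet = {
    (0, 0): Cell(value=2.0),
    (0, 1): Cell(value=3.0),
    # A formula cell that adds the two cells to its left.
    (0, 2): Cell(formula=Formula(args=((0, 0), (0, 1)), fn=lambda a, b: a + b)),
}
```

A dict only covers the indices that have been filled in, so strictly speaking it models a partial function; treating a missing index as an empty cell recovers the total function I → C.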
That only defines the type of the spreadsheet. We also have to define what functions can be applied to spreadsheet types.
Since you know how Excel works, these will seem obvious to you.
The function Enter : S × I × (V + F) → S lets you select a cell in your spreadsheet and enter either a value or a formula. Here S is the set of spreadsheets, V the set of values, and F the set of formulas.
The function Delete : S × I → S removes the value and formula at the corresponding index.
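To make the signatures concrete, here is a minimal Python sketch of Enter and Delete as pure functions that return a new spreadsheet instead of modifying the old one. The names and the dict representation are assumptions for illustration, and a cell is collapsed to whatever was last entered at its index (a value or a formula).

```python
from typing import Callable, Dict, Tuple, Union

# Hypothetical types for illustration only.
Index = Tuple[int, int]                  # I
Value = Union[float, str, bool]          # V
Formula = Callable[..., Value]           # F, simplified to a plain callable
Entry = Union[Value, Formula]            # V + F
Spreadsheet = Dict[Index, Entry]         # S


def enter(s: Spreadsheet, i: Index, x: Entry) -> Spreadsheet:
    """Enter : S x I x (V + F) -> S. Returns a new spreadsheet; s is untouched."""
    return {**s, i: x}


def delete(s: Spreadsheet, i: Index) -> Spreadsheet:
    """Delete : S x I -> S. Returns a new spreadsheet without index i."""
    return {k: v for k, v in s.items() if k != i}


empty: Spreadsheet = {}
s1 = enter(empty, (0, 0), 42.0)          # type a number into the first cell
s2 = enter(s1, (0, 1), "hello")          # type a string next to it
s3 = delete(s2, (0, 0))                  # clear the first cell again
```

Because each function returns a fresh spreadsheet, every intermediate state (empty, s1, s2, s3) is still available afterwards, which is exactly what the S → S signatures suggest.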
By now you are getting the hang of things. Move, Copy, Precedents, etc. are just functions.
The tricky thing is evaluation of spreadsheets because there are different ways of doing this. The first step is defining a clean versus a dirty cell. I think you can do that with the simple notation above. But I might be wrong.
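One possible reading of "dirty", and it is only my guess at a definition, is this: after a cell changes, every cell that depends on it, directly or through other cells, is dirty until it is recalculated. The Python sketch below assumes a hypothetical precedents map from each formula cell to the indices it reads, and it says nothing about the order in which dirty cells should then be evaluated.

```python
from typing import Dict, List, Set, Tuple

Index = Tuple[int, int]  # hypothetical (row, column) index


def dirty_after_change(precedents: Dict[Index, Set[Index]], changed: Index) -> Set[Index]:
    """Return the indices whose values may be stale once `changed` is edited."""
    # Invert the precedents relation: who reads each cell?
    dependents: Dict[Index, Set[Index]] = {}
    for cell, precs in precedents.items():
        for p in precs:
            dependents.setdefault(p, set()).add(cell)

    # Walk the dependents transitively; the `dirty` check also guards against cycles.
    dirty: Set[Index] = set()
    stack: List[Index] = [changed]
    while stack:
        for d in dependents.get(stack.pop(), set()):
            if d not in dirty:
                dirty.add(d)
                stack.append(d)
    return dirty


# B1 reads A1, C1 reads B1, and E1 reads D1.
precedents = {
    (0, 1): {(0, 0)},
    (0, 2): {(0, 1)},
    (0, 4): {(0, 3)},
}
stale = dirty_after_change(precedents, (0, 0))  # B1 and C1 become dirty, E1 stays clean
```

Under this reading a clean cell is simply one that is not in the dirty set, so the notion needs nothing beyond the index set I and the Precedents function.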
I’d love for you to follow up on this topic. As a hobbyist programmer who spends far too many hours in Excel, I am deeply intrigued by spreadsheets but it’s hard to come across analysis like the one you’re alluding to. If you have anything at all to share, even academic sources, I would be forever grateful.