Next step: put some tooling around TypeScript to automate everything.
Graph Query Language.
It’s a thing now. Leave it to Facebook to come up with a completely misleading name.
It is true a tree is a graph. Querying is only one of the things it helps you with. It is not a language.
It is a specification that has been around since 2012 that is a huge improvement over REST that let’s you not only query, but provides a simple view for users doing CRUD on legacy systems in an efficient way.
If it has a dirty secret, it is that programmers have to figure out how to turn the specification into something that does more than just an http request/response. The challenge programmers now have is to figure out is how to marshal multiple resources from one call into a response that conforms the query.
This makes it easy for clients. When they ask for something, they get back a response in the same format as they ask for. It allows them to concentrate on the model. The view/presenter/controller becomes trivial for the people actually using the data.
Back to the topic. What is a spreadsheet? I have been reading “Elements of Programming” by Alexander Stepanov and was surprised to learn the latest fashion in the computer science world seems to be something I learned as a grad student in the second year of getting my MA at the University of Hawaii in math. In the course Logic and Set Theory I
learned a mathematical concept is defined by what rules it satisfies. (C++ Concepts [Lite] is a watered down version of this.) A vector space is any set with a commutative addition and scalar multiplication satisfying the distributive law. Today’s subway reading was http://gauss.cs.ucsb.edu/~aydin/GraphBLAS_API_C.pdf. At this point anyone without a PhD in math might want to stop reading, but I’ll jot down a few thoughts. They can be made mathematically rigorous using the language of Cartesian Closed Categories.
A spreadsheet is a function from a set of indices I → C to cells.
An index is just a set and a cell has a value and perhaps a formula.
A value can be a number, or string, or Boolean, or a reference to one or more cells, or an error. (Or a couple of other things if you’ve been following along with how I stay faithful to the original Microsoft C API.)
A function is a function from zero or more cells to a value.
That only defines the type of the spreadsheet. We also have to define what functions can be applied to spreadsheet types.
Since you know how Excel works, these will seem obvious to you.
Enter: S × I × (V + F) → S
lets you select a cell in your spreadsheet and enter either a value or formula.
Delete: S × I → S removes
the value and formula of the corresponding index.
By now you are getting the hang of things. Move, Copy, Precedents, etc. are just functions.
The tricky thing is evaluation of spreadsheets because there are different ways of doing this. The first step is defining a clean versus a dirty cell. I think you can do that with the simple notation above. But I might be wrong.
Apologies to Raymond Chen: https://blogs.msdn.microsoft.com/oldnewthing/
Against my better judgement I rewrote the xll add-in library. Again.
Still learning all the new C++11/14/17 good bits. And trying to avoid the crazy things that seem to be slipping into the latest standards.
No disrespect to women, but I always feel like I’m giving birth every time I try this. The ancient C Excel SDK has its, um, personality. I continue to be floored by how expressive modern C++ has become.
The AddIn class now uses lambda expressions to Register and Unregister on xlAutoOpen and xlAutoClose.
You could probably save a few characters if you wrote this in python/ruby/perl/language du jour, except you can’t write this in those languages. It definitely takes more brain cells to come up with code this tight, but that is where the fun is.
Hitler uses git: https://www.youtube.com/watch?v=CDeG4S-mJts
Why is git called git? Just ask Linus. https://www.youtube.com/watch?v=4XpnKHJAok8. It’s funny how people tell you things if you just listen.
Andrew Tridgell reverse engineered BitKeeper and the company that invested capital in their product decide not to provide a free version to people like Linus.
Linus got unhappy about that so he wrote something he needed to replace that. Software is a funny, fungible thing. It is not like normal business things. Some people can be 10x more productive, not just 10%, maybe even more, but it is difficult to measure.
Andrew Tridgell also reverse engineered the Samba protocol and was the co-inventor of rsync. The only thing I asked for when my team took over the Excel add-ins at Banc of America Securities was a Samba server for the files. The Microsoft file system at that time made it impossible to delete files that were open.
I use rsync every day, and twice on Sundays. It is one of these brilliant things that were just done right. No need to keep changing. Confession time: I use vi. I would not wish that on my worst enemy, but Bill Joy wrote the most efficient editor for turning keystrokes into programs.
Linus wannabes have been spending a lot of effort to make git usable for mere mortals. GitHub is in the news lately. My experience has been annoyance at the sloppy kids hopped up on Red Bull fiddling with one tiny aspect they finally groked and foisting that on users who just want a clear picture on how to use their product. It sounds like the adults have finally showed up.
There is actually a workflow that can be taught to people who are not as smart as Linus. It is on the GitHub website.
It is the complete opposite of how Linus envisioned it would be used. But he built it for himself.
Against my better judgement, I’ve been rewriting the Excel add-in library. After a long day dealing with C++03 limitations at my current gig, I have to confess I am really enjoying applying the latest C++14 goodness to my old code.
Don’t try this at home, kids: https://github.com/keithalewis/xll12
You will need Visual Studio 2015 and be willing to work on your modern C++ chops. Let me know what problems you run into and I’ll make my best effort to get you going. Don’t underestimate your ability to help improve this code. Clone it and send me a pull request.
That reminds me. Time for a git rant.
Working on huge data at Bloomberg these days. After a long day of trying to figure out how they do that I like to think about math.
My attempt at working up to “A monad is just a monoid in the category of endofunctors.” It gets messy when using `SelectMany` in LINQ. Google Bart De Smet. Or maybe you can now just ask Cortana. Technology moves fast these days.
There is no Royal Road. It is still fun to use the grey matter sitting on top of a primate brain in the meantime.
I confess I binged this instead of googled it: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD841.html
I misremembered it as something Gauss said, not Euclid. Feeble human brains still need to be reminded of things learned over 2000 years ago. At least mine does. There is nothing I can add to Dijkstra’s observations other than an anecdote.
Last Thursday I had dinner at the Columbia Faculty House. My friend is taking a break from a course he has been teaching and I offered to fill in. And found out there is a 1000ft elevation change between Morningside Ave and Morningside Dr. Next time I’ll take the 1 train instead of the B line unless I want the exercise.
One of the professors at dinner paid me the nicest complement, after drawing it out a bit. The conversation went something like this:
Me: Hi, I’m Keith Lewis.
Him: <Russian accent>I have heard your name before.</Russian accent>
Me: <Getting Nervous>Oh?</Still Nervous>
Him: <RA>People have told me things about you.</RA>
Me: <CMA?>I’m sure some of those things are probably true.</CMA?>
Him: They have said they have taken your course before and found it very difficult.
Me: <Keeping my mouth shut/>
Him: Then they realized they could do the homework. They also told me they found what you taught them was useful for their job.
Bingo! It is the same thing every semester. Students complain the material is too hard. I go over the previous homework at the beginning of each class and ask them what else they needed to be successful doing it. The answer is always the same: Nothing.
There are some things that can be learned by having your hand held and walking you down the garden path. (How do I get the Visual Studio debugger to start Excel and load an add-in?) But you can only learn the important and truly useful things by spending some time in the limbo of uncomfortable uncertainty. If your brain isn’t hurting, you aren’t learning anything new.
A dead, but not forgotten, man just taught me a new vocabulary word: pleonastic. I need to try and endeavor to avoid and eschew pleonasm.
I like to ignore advice people give me and figure things out for myself. It does me no good. My brand is I don’t give a fuck about my brand. If you find the code I write useful and can point out things I’m missing that can help me improve my open source code you will have my full attention. The sad fact of open source is that people who actually spend the time to look carefully at your code are as rare as rocking horse shit.
Bill Gates might not have been far off when he claimed secretaries would be writing code using some version of BASIC. Larry Wall came up with perl, Guido von Rossum came up with python. I am being politically incorrect, it is a form of censorship that I abhor, but these are all languages for people who can’t understand the complexity of modern computing.
That is not a bad thing, I just call it as I see it. The technology world moves fast these days. Companies have to hire people that can produce things according to their limited ability.
The C++ world seems to have noticed this and is making big strides these days on making “easy things easy,” according to Bjarne Stroustrup’s current mantra.
C++ is anything but easy. But somehow it seems to attract the smart people in this world.
If you have Excel 2013 you might know about the
WEBSERVICE function. Give it a URL and it returns the content as a string. You can use
FILTERXML on the result and a dose of XPath to extract what you are interested in if it happens to be XML.
You can’t make this shit up:
NOTE: The WEBSERVICE function is not available in Excel Online.
Ha! Excel Online is pretty limited. Here is my stupid pet trick trying to figure out how to get that to do something useful: ExcelUDF.
I put together xllinet which provides more functionality than these two functions and works in all versions of Excel. (At least back to Excel 97.) It implements some of the WinINet library. The function
InternetOpenUrl and reads the result into a string. It returns a handle to the string object instead of a string. Excel cuts you off at 32K characters, the handle gives you access to arbitrarily large strings. Even in old versions of Excel with 255 character limits.
You can peek at the result with
STR.GET. That way you have a chance at figuring out what you got back. XML, JSON, CSV, or a mess. Try doing that on the result from http://google.com. If it is XML, you can pass that to
FILTERXML. Charles Goldfarb notwithstanding, I’m happy to let XML aficionados put themselves between angle brackets and flog themselves with attributes all day long. The smart people these days seem to be returning things in JSON. Have a gander at data.gov for more JSON than you can shake a stick at.
There are a lot of parsers out there, but I rolled my own. JSON maps quite naturally to a two column recursive Excel OPER. Excel is basically 2-D, but it is possible to accommodate recursive data structures like JSON. JSON objects are key-value pairs, where the values can be a single value, an array of values, or another JSON object. Single values are either a string, number, Boolean, or null.
Both arrays and objects are OPER ranges. Use
RANGE.GET on the values returned to spelunk the object. Any key that starts with an asterisk has as its value a handle to either and array or object. Keep using
RANGE.GET to dig deeper.
There are two cell menu items to help you with that. ‘Adjust’ works with any array function. It figures out the size needed and plops it in. ‘Expand’ gives you a peek at the handle value. It places the range returned just to the right of the cell containing the handle and selects it so you can delete it once you’ve seen it, just like ‘Adjust’.