Okay. Maybe I’m exaggerating. I’m the last guy in the world to cast aspersions or proselytize when it comes to programming languages. I spent three years at Morgan Stanley writing A+ code and still think that was the most expressive and productive language I’ve ever used. I even wrote a GUI in it, and I have no idea how to write GUIs.
They had a program called COPS for doing Monte Carlo simulations. Nobody knew what the acronym stood for, and the guy who wrote it went over to the group this book was written about. Another example of somebody sitting down in the wrong seat at the right time, getting fired, and trying to capitalize by writing a book. Publishers are great at figuring out how to manipulate these people.
Long story short, I used the NYC police car white-on-blue colors for the app, and got some cred with the A+ gurus at MS for figuring out how to get the police stripe to resize correctly. The first time I showed it to traders they told me to change the color scheme to a black background with shades of grey text. They had to look at it all day.
I’m a Unix guy, born and bred, and thought gcc and vi were ideal tools for writing code. A+ was a shock to me. No need to write a clumsy command line interface, just start up the interpreter and you can look at anything you want. Debugging consisted of setting a breakpoint. Then you just use the language to poke around. Kenneth Iverson was way ahead of his time. Kdb is a legacy of that.
You can see many of those ideas in my http://tukhi.com/ product. It is a spin-off of a product I’ve been developing over the years for my hedge fund clients. The key to performance in Excel is array formulas, especially the FP data type, something you can only get at through the old-school C SDK.
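For anyone who has not seen the FP type, here is a rough sketch of why it is fast. The struct below is a local mirror of the FP12 layout from the SDK header xlcall.h: two 32-bit dimensions followed by the doubles themselves, in row-major order, in one contiguous block. The helpers fp_alloc, fp_sum, and fp_scale are hypothetical names made up for this illustration; they are not part of the SDK or of my product, and a real add-in would get the array from Excel rather than allocating it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Local mirror of the FP12 layout declared in xlcall.h.
// The trailing array really holds rows * columns doubles.
struct FP12 {
    std::int32_t rows;
    std::int32_t columns;
    double array[1];
};

// Hypothetical helper: allocate an FP12 big enough for rows x columns.
// Caller releases it with std::free.
FP12* fp_alloc(std::int32_t rows, std::int32_t columns) {
    assert(rows > 0 && columns > 0);
    std::size_t n = static_cast<std::size_t>(rows) * static_cast<std::size_t>(columns);
    std::size_t bytes = sizeof(FP12) + (n - 1) * sizeof(double);
    auto* fp = static_cast<FP12*>(std::malloc(bytes));
    if (!fp) throw std::bad_alloc{};
    fp->rows = rows;
    fp->columns = columns;
    return fp;
}

// Hypothetical helper: sum every element with one pass over raw doubles.
double fp_sum(const FP12* a) {
    std::size_t n = static_cast<std::size_t>(a->rows) * static_cast<std::size_t>(a->columns);
    const double* p = a->array;
    double s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Hypothetical helper: scale in place, no copies and no per-cell conversion.
void fp_scale(FP12* a, double k) {
    std::size_t n = static_cast<std::size_t>(a->rows) * static_cast<std::size_t>(a->columns);
    double* p = a->array;
    for (std::size_t i = 0; i < n; ++i) p[i] *= k;
}
```

The point is that a whole range is just a pointer to packed doubles, so a loop over it is about as cheap as a loop can be.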
One consistent complaint is that Excel is too slow. I have to rely on Excel recalculation to provide the fodder for collecting statistics over a simulation. Users can enter any formula and use third-party add-ins I have no control over.
If they get 1000 recalculations/second they seem happy, 100 and they suffer in silence, but when it gets down to 10, or even 1, that is their pain point.
Last week I whacked something together to find the upper bound on what Excel is capable of. My clients need millions of data points, more than can fit into even the big grid, and I’ve been using databases for that. ODBC is more performant than you might think, but I was looking for more.
I’ve looked into Hadoop, MongoDB, and other Big Data “solutions.” 1010data has a product far ahead of either of these. Those guys seem to be on the right track. Forget the BI stack, just put data in rows and make typical queries so fast you don’t need a big IT team to make that happen.
I wrote 200 lines of code to create a file of doubles, memory map it in, and iterate over the doubles one at a time. I don’t want to get my clients’ hopes up, but I clocked it at 60,000 recalcs/sec on my Lenovo X220 laptop.
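To give a flavor of the idea, here is a minimal sketch of the write, map, and iterate steps. It is not the code I timed: that runs inside Excel on Windows, while this sketch assumes a POSIX system and uses mmap. The names write_doubles and MappedDoubles are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Write n doubles (0, 1, 2, ...) to a raw binary file.
void write_doubles(const std::string& path, std::size_t n) {
    std::vector<double> data(n);
    for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<double>(i);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(n * sizeof(double)));
    if (!out) throw std::runtime_error("write failed: " + path);
}

// Read-only view of a file of doubles. The OS pages data in on demand,
// so the file can be much larger than what you would want to load up front.
class MappedDoubles {
public:
    explicit MappedDoubles(const std::string& path) {
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);
        struct stat st{};
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("fstat failed: " + path);
        }
        bytes_ = static_cast<std::size_t>(st.st_size);
        assert(bytes_ % sizeof(double) == 0);
        void* p = ::mmap(nullptr, bytes_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd); // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<const double*>(p);
    }
    ~MappedDoubles() { ::munmap(const_cast<double*>(data_), bytes_); }
    MappedDoubles(const MappedDoubles&) = delete;
    MappedDoubles& operator=(const MappedDoubles&) = delete;

    const double* begin() const { return data_; }
    const double* end() const { return data_ + size(); }
    std::size_t size() const { return bytes_ / sizeof(double); }

private:
    const double* data_ = nullptr;
    std::size_t bytes_ = 0;
};
```

Once the file is mapped, a range-based for loop over a MappedDoubles object walks the doubles straight out of the page cache. In the Excel setting, each recalculation would advance one step through that range instead of looping over all of it at once.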
You can see how many lines of code I was able to reduce that to: http://xllcdb.codeplex.com/SourceControl/latest#trunk/xllcdb.h
This is entirely due to the expressive power of the latest C++ standard. Just as Java taught us how productive garbage-collected languages with off-the-shelf libraries can be, the latest C++ standard provides off-the-shelf means to get the highest possible performance out of the silicon using only a few lines of code. It is just a matter of learning how to build on what some of the smartest people in the world spent considerable time and effort producing.