If you have Excel 2013 you might know about the
WEBSERVICE function. Give it a URL and it returns the content as a string. You can use
FILTERXML on the result and a dose of XPath to extract what you are interested in if it happens to be XML.
You can’t make this shit up:
NOTE: The WEBSERVICE function is not available in Excel Online.
Ha! Excel Online is pretty limited. Here is my stupid pet trick trying to figure out how to get that to do something useful: ExcelUDF.
I put together xllinet which provides more functionality than these two functions and works in all versions of Excel. (At least back to Excel 97.) It implements some of the WinINet library. The function
InternetOpenUrl and reads the result into a string. It returns a handle to the string object instead of a string. Excel cuts you off at 32K characters, the handle gives you access to arbitrarily large strings. Even in old versions of Excel with 255 character limits.
You can peek at the result with
STR.GET. That way you have a chance at figuring out what you got back. XML, JSON, CSV, or a mess. Try doing that on the result from http://google.com. If it is XML, you can pass that to
FILTERXML. Charles Goldfarb notwithstanding, I’m happy to let XML aficionados put themselves between angle brackets and flog themselves with attributes all day long. The smart people these days seem to be returning things in JSON. Have a gander at data.gov for more JSON than you can shake a stick at.
There are a lot of parsers out there, but I rolled my own. JSON maps quite naturally to a two column recursive Excel OPER. Excel is basically 2-D, but it is possible to accommodate recursive data structures like JSON. JSON objects are key-value pairs, where the values can be a single value, an array of values, or another JSON object. Single values are either a string, number, Boolean, or null.
Both arrays and objects are OPER ranges. Use
RANGE.GET on the values returned to spelunk the object. Any key that starts with an asterisk has as its value a handle to either and array or object. Keep using
RANGE.GET to dig deeper.
There are two cell menu items to help you with that. ‘Adjust’ works with any array function. It figures out the size needed and plops it in. ‘Expand’ gives you a peek at the handle value. It places the range returned just to the right of the cell containing the handle and selects it so you can delete it once you’ve seen it, just like ‘Adjust’.