Showing posts with label vba. Show all posts
Showing posts with label vba. Show all posts

Monday, 27 February 2012

Why VBA macros got everywhere

Last week, I wrote a post that suggested Visual Basic for Applications (VBA) could be declining in importance for today's analysts. I do think that it is and that it will continue to do so, because it's not evolving to meet the needs of bigger data, or to compete with new and better ways of building dashboards.

VBA has got a valuable lesson to teach though. Never mind that it may now be feeling the pressure from new competitors, how on earth did a BASIC programming language that was bolted to the back of Microsoft Office - and particularly Excel - get to be so important in the first place?

I touched on one of the reasons in that previous post.

"it will let you do things that are otherwise the preserve of IT, which should be the ambition of any good analyst. If you need IT to sort data out for you, then you've failed."

VBA is a fantastic tool for empowering analysts to build their own solutions to problems. It gives analysts the power to create innovative new bits of kit without learning the sort of heavyweight programming, that is the preserve of full-time coders with computer science degrees. What analysts produce in VBA - and I speak from personal experience here - is quite often horrifying to their IT departments. Even very good code by analyst standards is a world away from the way that a good programmer might chose to solve a problem. For one thing, no programmer worth the name would have started their build in VBA.

The thing is, even with that coding deficiency, VBA works. It makes an awful lot of businesses run. (And along the way, it's completely hamstrung Microsoft with Excel upgrades, because it's too embedded in too many places to change it now.)

You could see sloppy programming as a failure on VBA's part; as evidence that all these macros should have been built by IT in a proper language, with version control and a detailed specification. I disagree. I'm an analyst, not an IT person and so I think what VBA did is amazing and we should be trying to repeat it.

VBA survived and it prospered. It did that because it met a need and it's a need that large companies in particular, go out of their way to avoid happening with other bits of software. Excel was the Trojan Horse that put IT capability into the hands of people who aren't supposed to have it. As VBA starts to show its age, we're in danger of drifting towards a world where centralized IT departments control access to data and access to the tools that can work with it.


There's no Trojan horse yet for Big Data. Sure, there's free software which can work with it but as a junior analyst, it's likely to be a struggle to persuade IT that they need to unlock the admin account on your PC so that you can install R, MySQL and FileZilla (and especially that you also need some space on an SQL server.) Getting software paid for will be even harder as the prevailing attitude is still one of, "you've got Office and that's all you need." The majority of IT departments never liked VBA in the first place, because macros that they didn't build crash and cause problems that the IT department is asked to fix. Never mind that a few macros which don't crash are saving hundreds of man hours per year in the finance department.

I should say at this point that I'm not trying to give IT procurement a kicking. What I do want to do is recognise that there are smaller, incredibly useful tools, which don't need a long IT build and which we can't always specify precisely at the start. They need to evolve and they need to be developed by the people who work with data and spend time with clients. By the people whose job it is recognise opportunities for data analysis and exploit them. VBA did that. As VBA ages, what's going to do it in future?

There are three ways that this could go. One is depressing and the other two are interesting.

First, the depressing one. The age of amateur coding within large companies could be coming to a close. I don't think this is all that likely as the benefits are too great, but we might be entering a phase where IT controls access to any kind of developer tools, before the pendulum swings back the other way. What will swing the pendulum back, is larger companies realising that they're taking a pasting from smaller and more agile competitors, where analysis teams are able to run with their ideas.

As a second possibility, somebody could develop the next VBA Trojan Horse. That somebody won't be Microsoft, which is unfortunate because in Office, they've still got the capability to deliver it. Microsoft currently seem most concerned with creating tools for centralised IT to use, which is why from an end-user point of view, all of Microsoft's BI tools are crap. If the Trojan Horse comes, I think it's more likely to be from a new developer delivering a platform that doesn't need admin rights to install on an analyst's PC. That platform could well be cloud based, which is awkward where data is highly confidential, but not an insurmountable problem. In the same way as for VBA, by the time people who are inclined to centralise IT processes work out what's going on, it will be too late.

The final possibility is by far and away my favourite and I think, also the most likely. We could finally recognise the benefits of giving all sorts of teams - not just analysts - some control over the software that they choose to use to do their job. When you think about it, the way we look at software currently is awfully nannying...

"Here's a PC, it's got Office and a web browser on it. That's what you get."

"I could do a much better job with a copy of xxx"

"Write a business case that costs more in terms of your time invested than just paying for the software would have, we'll think about it and get back to you in three months. Probably with a no."

Of course you still need some central control, but there are huge benefits to a flexible approach to software. On a factory production line, you use the tools you're given, but the companies we work in aren't a production line. A better analogy would be a construction site, where you have all sorts of skilled technicians doing different jobs and where you wouldn't dream of telling the carpenter that he can't use his choice of chisel, because it's non-standard.

The CIA recently hinted that it might be heading in a more flexible direction, when it told vendors that it wants to start paying on a metered "pay as you go" basis for its software. You'd do that so that you can install many and different pieces of software and pay for the good ones that end up being used a lot. You'd do it so that you don't have to enforce the same few tools across multiple departments doing different jobs.

We don't know how many analysts the CIA has or what its budget is, because both are classified, but some old guesses put it at around $27bn. The Twin Towers and quite a bit of inflation has happened since then, so I'd say it's a fair bet that we're looking at well above $30bn. That's a lot of analysts and a lot of software. Definitely worth keeping an eye on how they choose to do procurement.

If I were a software developer, I'd be looking for Trojan Horses to sneak my product past IT. Tableau Public is a nice idea, which aims to create critical mass from outside companies by letting bloggers use the software for free. It's not quite there though... Google Docs is probably closest to the cloud idea of not requiring an install and could be the future, but it's nowhere near mature enough for use by analysts. Just a good toy for the minute.

And for me, as once again a small cog in the enormous WPP wheel, I'm hoping that marketing can move to the more flexible software model outlined by the CIA. Back to the original argument, we should recognise what VBA does so well and look for ways to make it happen again. Give staff responsibility for knowing what tools they need and let them do their job even better.

Tuesday, 14 February 2012

Losing touch... or why Excel and VBA won't cut it any more

Thinking through this post is making me feel old. There's going to be a lot of 'in my day' type reminiscing and I'm only 34. It's all this new fangled technology that's doing it. The world's changing fast. I hate people who say that the world's changing fast, but this time it's true.

I got my first proper job twelve years ago this month, as a junior analyst with a small econometrics consultancy and although the statistical techniques I use are roughly the same as back then, I've started to realise that our software tools are going through a revolution. Hence this post - I'd like to stop and look around for a minute to see what's happened.



Fairly quickly after starting that first job, I discovered that data processing in Excel was a hell of a lot faster and easier if you learned Visual Basic for Applications (VBA), so I did. With the help of our IT department and a lot of practice, I got pretty good and it went a long way to getting me promoted because I could make dull work happen quickly, make other peoples' lives easier and build some nice interactive spreadsheet tools for our clients.

Up until fairly recently, if an aspiring analyst asked what they should do to get ahead at work, I'd say get good in Excel. Really good. And learn VBA. The first bit's still true, but VBA? Not so much.

The trouble is, VBA's getting left behind. It's still worth knowing some, but it's nowhere near as important as it was, because creating tools in Excel is nowhere near as important as it used to be. It's also not a good gateway into other types of programming because as a language, its structure is out of date. Although some programming skills are always transferable, you need to pretty much start again when you want to learn another language after VBA.

There's also a problem for the next generation in that they need to get luckier with where they start work to get exposed to the right kit. Everybody uses Excel, so at some point, every inquisitive analyst ends up in VBA. The new generation of tools probably won't be on your PC unless you decide to put them there.

So, you're ambitious and you're six months into your first analyst's role. What do you learn now? Even if your company doesn't use these, this is where I'd start. It's the kit I'm using (and still learning) and it's free, so you can pick it up as a CV booster without buying expensive software. If you're a junior analyst reading Wallpapering Fog then I hope this list might help. You also have excellent taste in blogs, so well done on that.

Let's look at what you need to be able to achieve, as an ambitious analyst...

Collect data

This is much more important than it used to be. Ten years ago, if you didn't have the dataset and the client didn't have it, then you'd have to buy it. Either way, almost certainly it would turn up on a spreadsheet or csv file. You often needed VBA macros to clean it up and make a tidy spreadsheet.

Now, some of your data will arrive like that (so a few simple macros are still handy) but very often, you'll want to trawl the web for it. Senior staff love it when you tell them you can scrape the data that they want off the web, automatically and for free. It will make you famous.

You could learn a proper programming language, but we're statisticians not programmers, so unless you want to do that for yourself anyway, then you need a tool which is designed specifically to work with statistical data. For analysts, R is the new VBA. It's free and it's well worth the effort that it takes to learn.

Learning R gives you the same head-start that VBA gave ten years ago. You don't need to buy new software (just like VBA, which was always in your copy of Excel anyway) and it will let you do things that are otherwise the preserve of IT, which should be the ambition of any good analyst. If you need IT to sort data out for you, then you've failed.

If you get good in Excel and good in R, you'll be in a promising place from which to get your data assembled, which brings me onto...

Process data

Excel worked well when data came in thousands of rows. It still works well for lots of things and the latest versions have finally broken the 65k row limit, but there's a problem. If you throw lots of data at Excel - properly lots - you'll break it. Or wait forever for it to calculate. Excel isn't designed for processing databases and that's what we're working with now.

R can do it, but you need a good level of SQL too, even if it's just to make Access work properly. SQL turns up everywhere and it's easy to learn.

To be fair, you've needed SQL for ages but I keep coming across analysts who aren't comfortable using it. You can't get away with that any more.

Build your models

Excel for the simple ones if you like - it's still a very powerful bit of software. For more complex statistical models, you need something else. Again, R is good. Some of the older competition like SAS (which is another reason to get a good SQL grounding) is starting to look very dated. It's also hugely expensive, particularly when compared to open source.

There's no way I'd adopt SAS now and it's being kept afloat by a legacy of systems embedded in big firms. If you end up using it, fine, but don't learn it unless you have to.

I'd go with R again. And I have.

Make some output


The days of the interactive Excel workbook, emailed to a client, are over. Or rather, they're not quite but they should be and soon will be.

You need to be able to make good looking charts and output in Excel (start here) so that you can illustrate your PowerPoint decks because unfortunately, PowerPoint is still an essential tool to know.

For interactive output, you want dashboards. There's only one bit of kit to learn for the moment and that's Tableau. If you can't persuade your company to buy you a copy, then get the free version and have some fun publishing to Tableau Public. Give it a couple of years and there are going to be some exciting roles around for people who can do good things with this piece of software.


So there you go. Learn a few macros by all means and definitely get very good with the front end of Excel, but take it from someone who's invested a lot of time in VBA and never uses it any more, there's a new world of software coming and you need to learn it. What worked ten years ago, won't cut it in another five.

The scary thing is, that means old buggers like me need to learn a load of new kit, and quickly. Back to the books...