Monday 9 February 2015

How to do football analysis in Tableau | Part 2


"Here's where the fun begins"
Han Solo

So you've read Part 1, you've downloaded and installed Tableau Public and you've got a small Liverpool and Everton dataset to play with. Alternatively, you took the shortcut, downloaded the dataset and came straight here.

Either way, it's time to hit Tableau.

Before you dive into the software, you'll need to create an online Tableau Public account. This will give you somewhere to publish your visualisations and also to save them while you work on them. One of the big restrictions of Tableau Public is that it's cloud-based and you can only save your stuff to Tableau Public's website.

You've set up an account? Great, we'll make use of it later on. Now you can open up the Tableau Public software.

You should be looking at a screen like this...



Click "Open Data" and you get a long list of data sources that Tableau can access. The greyed out ones are for Professional users only - they aren't available in Tableau Public - but our dataset is in Excel, which is fine.

Click Excel, find your football data workbook and open it.

You should see Tableau's data loading screen.



You can do a lot of data manipulation here, including joining different datasets together, but we're just loading up a single Excel worksheet for now. If your Excel workbook has only got one worksheet in it, then Tableau will pick that up automatically. If you have more than one sheet in the workbook, then drag the one with the player data across from the left-hand column to the empty top window.

Click "Go to Worksheet".

We're in! You should be looking at an empty Tableau worksheet.



If you can use Excel pivot tables then you're going to feel at home here quite quickly, but if you can't then don't worry, it's all very straightforward.

Tableau's looked at our dataset and guessed which columns in our data are "Dimensions" and which are "Measures".



Dimensions are things you can split the data by. Player names, team names and positions go in here. "Apps" (number of appearances) shouldn't really be in here but we can deal with it later if we need that data. Tableau's guessed wrong because there are brackets for substitute appearances in the Apps column on our spreadsheet, rather than it just containing simple numbers.
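If you do need the appearance numbers later, the brackets have to be separated out before anything can be summed. Here is a minimal Python sketch of that clean-up. It assumes the Apps column holds text like "15(3)", meaning 15 starts and 3 substitute appearances; that format and the function name parse_apps are assumptions for illustration, so check them against your own spreadsheet.

```python
import re


def parse_apps(value):
    """Split an appearances string such as "15(3)" into (starts, sub_apps).

    Assumed format: "starts(subs)". A plain "15" is read as 15 starts, 0 subs.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:\((\d+)\))?\s*", str(value))
    if match is None:
        raise ValueError(f"Unrecognised Apps value: {value!r}")
    starts = int(match.group(1))
    # The bracketed part is optional, so treat a missing group as zero.
    subs = int(match.group(2)) if match.group(2) else 0
    return starts, subs
```

Once the two numbers are in their own columns, Tableau should treat both as ordinary Measures.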

Everything that isn't a Dimension is a Measure. Measures are data columns that you want to add up, or average, or do some other calculation with. Measures are your numbers. Tableau's put things like Age and Goals and Minutes Played in here, which is what we want.

So, what do you want to see first?

At last week's OptaPro Forum, Simon Gleave (@SimonGleave) showed some nice age distribution charts that plot the ages of players at a club. We could easily draw one of those.

Drag "Age" from the Measures area and drop it into the middle of the table, where it says "Drop field here". Take care to drop it in the middle, not onto the column or row headings areas.


If you're using the sample data, then it will now say 1,330 in the table. If you've put your own data together then you might get a different number because the WhoScored website is regularly updated with new player data.

1,330 is the total of all of the Everton and Liverpool players' ages in our dataset. Useful.

What about the average age of each team? That would be more useful.

Since you dragged in Age, there's now a green lozenge on the "Marks" area that says "SUM(Age)".



You can use this green lozenge to change from Sum to Average. Right click it, find "Measure (Sum)" in the popup and change it to Average.

In the sample dataset, the average age of all of Liverpool's and Everton's players is 26.6.

Let's split it by team. Drag "Team" from Dimensions onto the Rows shelf at the top of the screen and drop it.



Everton are older than Liverpool! We've just learned something.
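If it helps to see what Tableau is doing behind the lozenge, here is a rough Python sketch of the same idea: one measure (Age), one aggregation (sum or average) and one optional split (Team). The four players are made-up numbers for illustration, not the real squads, and the function name aggregate_age is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; not the real squad data.
players = [
    {"Team": "Everton", "Age": 32},
    {"Team": "Everton", "Age": 28},
    {"Team": "Liverpool", "Age": 24},
    {"Team": "Liverpool", "Age": 22},
]


def aggregate_age(rows, func):
    """Apply func (for example sum or mean) to Age within each Team."""
    ages_by_team = defaultdict(list)
    for row in rows:
        ages_by_team[row["Team"]].append(row["Age"])
    return {team: func(ages) for team, ages in ages_by_team.items()}


# Roughly SUM(Age) with nothing on Rows or Columns.
total_age = sum(row["Age"] for row in players)

# Roughly AVG(Age) with Team on the Rows shelf.
avg_by_team = aggregate_age(players, mean)
```

Swapping mean for sum in the last line is the code equivalent of changing the lozenge back from Average to Sum.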

Showing numbers to three decimal places is a bit much, so you can change the default formatting for Age if you want. Right click it in the Measures area, choose "Default Properties" and use "Number Format" - "Number (Custom)" to get rid of the decimals.

You could format numbers directly in the view but the nice thing about changing the default is that now whenever we use Age again, it will always appear without decimal places.

Stop reading for a minute and have a play with this table. It might seem odd to say it in a user guide, but the best way by miles to learn Tableau is to play with it. Drop more measures into the view and try splitting the rows and columns by different dimensions and see how Tableau reacts.


That's enough tables. Tableau's all about the graphics, no? Let's draw a chart.

Use the little tab button at the bottom of the screen - the one that looks like a little bar chart - to create a new empty worksheet. The other button - that looks like a little four pane window - is for creating dashboards. We'll get to that later.



Hopefully you're starting to see that in some ways, Tableau's a lot like Excel. It has worksheets and each worksheet is basically an Excel Pivot Table, with rows and columns and measures.

We're going to use this new sheet to draw an age distribution. That means we'll want to count how many players there are, split by age groups.

In our dataset, Age is a Measure to be summed or averaged, not a Dimension that you can split things by, but Tableau can sort that out for us. Drop Age into the middle of the view, like you did before, and then use "Show Me" to draw a histogram.



We've got a chart!
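Under the hood, a histogram turns the Age measure into age groups (bins) and counts the players in each one. A minimal Python sketch of that step is below. The ages are made up and the bin width of 5 is an assumption; Tableau picks its own bin size, which may well be different.

```python
from collections import Counter

# Made-up ages for illustration only.
ages = [19, 21, 22, 24, 24, 27, 29, 31, 33]

# Assumed bin width in years; Tableau chooses its own.
BIN_SIZE = 5


def age_histogram(values, bin_size):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((age // bin_size) * bin_size for age in values)
    return dict(sorted(counts.items()))


histogram = age_histogram(ages, BIN_SIZE)
```

Each key is the start of an age group and each value is the height of that bar, which is the same count Tableau shows as CNT(Age).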

The Show Me button is the centre of Tableau and it's where you decide what kind of visualisation you want to draw. Think of it like the Excel Chart Gallery, but a lot more powerful.

Depending on what data you've dragged into the view, Tableau will offer you different types of charts in Show Me. This can sometimes get a bit confusing, e.g. you might decide to draw a scatter chart and Tableau says No and greys out the button. It looks at the data you've dragged into the main view and decides you can't draw a scatter with that.

When the type of chart you want is greyed out, look at the tip at the bottom of the "Show Me" box. Tableau will tell you exactly what it needs and when you drag those things in, the option you want will work. Once you start to get used to Tableau and the way that it works, you'll find this happens much less.


Charts in Tableau work exactly like tables and if you get confused, it can often help to think of them that way, or even switch back to a table, sort your data out and then switch back to a chart.

Because charts work like tables, they have rows and columns, and we can split our age chart by team if we want to.

Grab Team from the Dimensions area and drop it just to the left of CNT(Age) on the Rows shelf.





Two charts! Now we can really see the differences in age that are driving Everton's older average.
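In table terms, putting Team on Rows just means the counting is now done separately for each team. A small Python sketch of that, using made-up players and an assumed bin width of 5 years (the function name histogram_by_team is hypothetical):

```python
from collections import Counter

# Made-up (team, age) pairs for illustration only.
players = [
    ("Everton", 33), ("Everton", 31), ("Everton", 27),
    ("Liverpool", 21), ("Liverpool", 24), ("Liverpool", 27),
]

# Assumed bin width in years; Tableau chooses its own.
BIN_SIZE = 5


def histogram_by_team(rows, bin_size):
    """Return one age-bin counter per team, like one small chart per row."""
    counts = {}
    for team, age in rows:
        bin_start = (age // bin_size) * bin_size
        counts.setdefault(team, Counter())[bin_start] += 1
    return counts


charts = histogram_by_team(players, BIN_SIZE)
```

Each entry in charts corresponds to one of the two stacked histograms in the view.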

Try dragging Team back off the Rows shelf and putting it in different places - on the Columns shelf, or into Colour or Label in the Marks area. There is loads of flexibility to create the view that you want.

Tableau has five basic ways of showing you differences in your data.

1. You can split it, using the Rows and Columns shelves.
When you do this, you'll get new rows and columns in a table, or new charts, one for each split.

2. You can vary colour.

3. You can vary size.
(That one doesn't really make sense as a team split - try it. Not every technique is good everywhere.)

4. You can change the shape of datapoints.

5. You can label different items.


Tried all of those? Don't just skim through, I really meant it about playing being the best way to learn. Drag player names in. Swap Age for a different measure. Put Team in Colour and Label. Go nuts. Break stuff. Junk your worksheet and start again if you need to, and remember that Tableau's Undo feature is pretty much bulletproof; Control-Z will always put things back the way they were!


Now that you know the basics of worksheets, you're ready to make your first dashboard...

We'll create some more charts and use them to build an interactive dashboard in Part 3.
