What happens when you mix QlikView with the best football leagues in the globe? In today’s post, I’ll share a little app I created to analyze data about my favorite soccer tournaments like the English Premier League, the German Bundesliga or the Serie A from Italy. So let’s take a break of business oriented dashboards and spend some time reviewing the wins, losses and goals of the last few years.
If you’re a football fan, you’ll stay for the discoveries. If you’re a QlikView enthusiast, you’ll stay for the tips regarding scripting and visualization. If you’re not neither of those, you’ll visit our Random page in order to see funny videos like a pug playing ‘Enter Sandman’ from Metallica in the drums.
As usual, you can download all the related files here. Ready? Let’s get started!
The first challenge in this endeavor was to find a good data source. However, a static XLS file wouldn’t do the trick. Since we’re in the middle of the season and there are matches every week, we need a way to update our file constantly. Even though it’s not the most common way to extract tables, don’t forget that QlikView can retrieve records from external websites (Data from Files > Web Files).
After browsing for a while, I found a great sports portal called SkySports where we could get all the information we needed. In order to load it to our model (and since I didn’t want to copy and paste the same code dozens of times), I relied on one of our oldest friends: the FOR… NEXT loop. The structure is not very complex, so I’ll let the code speak for itself:
With all the data on hand, it was time to define our KPIs. Whilst most of them were already suitable for analysis, I had a little problem: I couldn’t compare directly some of the metrics. For instance, the French Ligue 1 has 20 teams, but Eredivisie (Netherlands) has only 18. In this regard, it’s also unfair to compare the points obtained by Borussia Dortmund and Lyon since they have played 27 and 31 games respectively. Therefore, I came up with a simple calculation to ‘standardize’ the scales called QlikFreak Score (you know marketing is not my strong suit, but bear with me). This KPI is the result of taking the points obtained by a team and dividing them by the total points disputed. For instance, if I got 6 points after playing 5 games, my QlikFreak Score would be 40% (6 out of 15 possible points).
That being said, lets now focus on the design! The layout is fairly simple: the filters are located on top and the analytical objects in the bottom. We start our journey with an overview of the 116 teams in our model by classifying them into different categories: the ones that classify to the Champions or Europa League, the ‘normal’ teams, and the ones that are on risk of relegation.
The table in the bottom displays the most relevant metrics but it uses the Gestalt principle of similarity (yeah, the cell background) to divide them in four groups:
- About the games: Games played, wins, draws, and losses.
- About the goals: Goals for, goals against and goal difference.
- About the points: Points obtained, points disputed and QlikFreak Score.
- About the rank: Team’s rank in the respective league and status.
Since I’m using abbreviations, it won’t hurt to add a little Help Text (Caption > Help Text) so the user can easily remember their meaning.
Another interesting thing about this app is that it uses two color references throughout all its tabs. The first one refers to the tournaments. As you can see in the filter pane, each league has an associated color chip. Accordingly, each team in the table is accompanied by a colored cell so that even the less savvy users can relate Arsenal with the Premier League and Real Madrid with La Liga (This tip was brought to you by the man with the hat. Thanks Rob!)
The other one has to do with the status. It associates the teams that classify to the Champions League with a golden star, the Europa League with a green triangle and the endangered squads (relegation) with a red cross. [I don’t really know how to call the color I used for the ‘normal’ teams]
The second tab focuses on analyzing the performance of a specific team:
In the left, you can see the team’s crest alongside a gauge that displays its QlikFreak Score. Just below those objects, there’s a trend line with the performance of the last 5 seasons (Expressions > Text on axis) and in the bottom, you’ll find a table that provides detailed figures about each one of them. Cool thing, the selected season is always highlighted in the table:
By the way, did you notice that the color reference for each league prevails in this tab?
Since I loaded an INLINE table with the RGB codes, it’s pretty easy to assign dynamic colors to all the charts:
Another interesting feature here is the usage of images. I selected two different techniques for this purpose:
- Team Crest: Reference to an external file (preferably using Relative Paths)
- League Logo: Bundled images using a SUB (I explain this one in my book, but I’ll try to write a post about it in the following weeks)
Now, let me take a moment to explain the scatter plot in the right (I really like this visualization). The x axis represents the team’s rank in the selected season. For instance, speaking about the Bundesliga, it goes from 1 to 18 (since there are only 18 teams). On the other hand, the y axis depicts the QlikFreak Score, so a higher bubble can be interpreted as a better performance.
The selected squad will be highlighted so you can easily see where it stands and there are three reference lines that let you know who will attend the international tournaments or who’s in the red zone (relegation). See that color association again? In this example, it’s not hard to see that the Germans will be represented by 2 teams in the Europa League, being Bayer Leverkusen one of them.
However, this visualization can tell you more stories. Let’s take a look at Ligue 1:
Can you see the amazing gap between Paris St. Germain and the rest of the teams? Clearly, they’re playing on a different level! It looks like we won’t need Nostradamus to predict where the ‘Coup de la Ligue’ will go this year.
Or let’s go back to the 2014 / 2015 championship in Spain. The roles we had that year were very clear:
- The Leaders: Barcelona and Real Madrid
- The Contenders: Atlético, Valencia and Sevilla
- The rest of the league
However, some might prefer to see a more balanced tournament like the Premier League, displaying almost perfectly staggered bubbles.
Just a little bump in the end, which explains why Aston Villa’s fans are so sad lately.
The third tab highlights the goal difference per team (scored minus received). It also has secondary (yet quite interesting) metrics like the average goals per match.
With the flexibility provided by the Interactive Sort option, we can easily see how Barcelona and Real Madrid score almost 3 goals per match while Troyes, Udinese or Hannover 96 are struggling to get at least one. On the other end, Gianluigi Buffon (Juventus) and Manuel Neuer (Beyern Munich) have helped their squads by limiting their opponents’ strikers to one goal every two games.
When applying visual cues to a column, I recommend you to neutralize the RGBs as QlikView’s default colors are a bit too intense. The classic green / red backgrounds work fine in this scenario.
The twin bar chart in the left-hand side deserves a little warning. Since its axis in located in the middle of the graph, we can accurately compare the blue and orange bars against themselves. However, the total size of the bar (goals scored and received altogether) is harder for us to read correctly because we can only rely on its area (there’s no shared axis, so our perception of distance is not an option here). Nevertheless, it can tell us curious facts. For instance, by the time I’m writing this post, there are 5 teams that share the same goal difference in Italy (-5). However, if we only relied on this figure, we might disregard the fact that Sampdoria has almost 1.5 times more goals than Bologna!
Before moving forward, I’d like to remind you that there’s a Multibox in the filter pane that lets you select the team’s rank. For instance, you could pick only the best team of each tournament and evaluate their offensive / defensive ends by comparing the average goals per match. Even though Lionel Messi, Luis Suarez and Neymar are the best offensive party (2.9 goals per game), Paris St. Germain, Bayern Munich and Juventus display a better defensive performance.
Our fourth tab is also one of my favorites (you know how much I like Scatter Plots). This time, the axes represent the goals scored and received. I included a diagonal line so you could easily see who has a positive or negative goal difference (above or below the line). [See how the color reference prevails?]
This type of visualization is great for spotting outliers and finding weird behaviors. For instance, remember when I said that Paris St. Germain was playing on a whole different level? Look at this:
The distance between Ibrahimovic’s team and its rivals is ridiculous! Not only it’s the highest bubble by far (great offense), but it’s also the closest to the y axis (they don’t receive much goals).
However, more goals not always translates into better results. Just look at the Spanish championship:
Despite having some of the most effective strikers, Real Madrid (87 goals) ranks third in La Liga. Actually, their sworn enemies –Atlético de Madrid– have earned more points and rank second as of march, 21st. However, the funny part is that Cristiano Ronaldo and company have scored almost twice as goals as Simeone’s pupils. It looks like keeping low scoreboards is not a bad strategy after all!
The following tab shows general data about all the tournaments. It contains the number of teams alongside their status and interesting metrics such as Goals per Game. To be honest, I didn’t think Eredivisie would be on top of this list, but it claimed the first spot in the last 5 seasons.
Finally, I tried to explain the behavior of each league by calculating some figures around the QlikFreak Score like max and min values, average, median and standard deviation. At first, I thought that they might tell fascinating tales, but I soon realized that these metrics are easily distorted by the outliers (I also built a Box Plot, but I ended up deleting it due to its lack of relevance).
For instance, I thought that a smaller standard deviation could be interpreted as a more competitive league, meaning that all the teams have more or less the same level and get similar amounts of points. In the same manner, a large standard deviation would imply a bigger difference between the qualities of the squads.
Nevertheless, one cannot blindly assume this kind of things. Just look at the 2012 / 2013 Bundesliga: taking into account all the teams, its standard deviation is 16%. But if you take Bayern Munich out of the equation, it drops down to 12%. These metrics are quite sensitive and since we only have a handful of observations, I’d rather not use them. But hey, it was a good exercise!
Finally, this app wouldn’t be complete without one of my favorite charts: a histogram. After failing with my standard deviation attempt, I thought that a histogram (in fact, several of them) would help me visualize how these tournaments behave. I started by creating simple chart based on the QlikFreak Score. With this, you can see that the majority of the teams get, let’s say, between 30% and 55%.
In the left tail, we find the not-so-lucky teams who have obtained less than 20% of the points they have disputed so far like Aston Villa, Cambuur Leeuwarden and Troyes. On the other end, the right tail embodies the most efficient teams like Ajax, Barcelona and Paris St. Germain. Just yesterday, Bayern Munich beat Cologne and became the leader in this area, with an amazing 85% of points obtained.
However, this chart described the 6 tournaments altogether. As I wanted a comparison between them, I used an uncommon approach: a Trellis chart.
To be honest, I’m not the biggest fan of Trellis Charts (or small multiples, as some may call them). I’ve always thought that the idea is really good, but QlikView doesn’t allow you to customize a lot of its features, so I usually end up using other tricks. However, this time it worked pretty well. With a continuous (and shared) x axis in the resulting charts, you can clearly see where the majority of the teams are located. Also, since I added an extra expression, if you hover the cursor over a bar, you can see which teams reside in that particular bin.
So that’s about it, I hope you liked it as much as I did! To be honest, I spent a lot of time playing with this app and picturing new ways of presenting the data. I guess it can’t be helped, maybe it’s one of the side effects of working with QlikView 😛
By the way, I’d love to see how YOU visualize this data as well! Any Dutch team who would like to talk about the Eredivisie (Barry / Frédérique / Wesley)? I bet everyone will go crazy if Ralf Becher used an extension to explain one or two things about the Bundesliga. Perhaps Steve Dark or Richard Pearce can give us some insights about the Premier League as well. And even though it’s not rugby, Byron Van Wyk can probably upload a couple of charts. So, if you have some spare time, why don’t you create your own version of this app?
If you have any comments, be sure to post them in the section below. Thanks for reading. Until next time!