As an avid follower of all things statistics, it is great to see all the soccer blogs popping up to analyse data. This has been made possible in large part by the data released by Manchester City. What is great to see is the wide range of analysis and critique coming from the community. In most cases this community operates outside the game, i.e. they are not Performance Analysts working within clubs. This gives them a certain amount of freedom to publish things that clubs might consider sensitive information.

I think we have barely even started this process in GAA. There are only a handful of bloggers/journalists using data on a semi-regular basis. For many reasons the evolution has been a bit slow but I wanted to make a couple of points.

1. Averages;

While averages can be very useful, they should come with a bit of a warning. I have recently finished a project looking at the effect the quality of the opposition can have on performance. The effect was found to be significant. This basically means that when calculating averages I account for the quality of the opposition: I create one average for ‘Top Teams’ and one for ‘Bottom Teams’, because combining the two masks the ‘real’ performance. We can now benchmark our performance not just against the overall average but against a more specific one. An overall average on its own might mask the true performance.
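To make the masking effect concrete, here is a minimal sketch in Python. The match figures are made up purely for illustration, and the two-tier split (‘Top’ and ‘Bottom’) is a simplified stand-in for however opposition quality is actually graded.

```python
# Hypothetical match records: (opposition tier, shots taken, scores).
# These numbers are invented for illustration only.
matches = [
    ("Top", 20, 9),
    ("Top", 22, 10),
    ("Bottom", 30, 20),
    ("Bottom", 28, 18),
]


def conversion_rate(records):
    """Return total scores divided by total shots for the given records."""
    shots = sum(r[1] for r in records)
    scores = sum(r[2] for r in records)
    return scores / shots


def rates_by_tier(records):
    """Group records by opposition tier and compute one rate per tier."""
    tiers = {}
    for r in records:
        tiers.setdefault(r[0], []).append(r)
    return {tier: conversion_rate(rs) for tier, rs in tiers.items()}


overall = conversion_rate(matches)
by_tier = rates_by_tier(matches)

print(f"Overall: {overall:.1%}")
for tier, rate in by_tier.items():
    print(f"vs {tier} teams: {rate:.1%}")
```

With these invented figures the overall rate sits between the two tier rates, so a performance that looks poor against the overall number could be perfectly respectable against the ‘Top Teams’ benchmark.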

2. Context;

I will be the first to admit that stats will never tell the whole story. A lot of the time you need to look at video. It can be very difficult to get stats to give you all the context you need, but that doesn’t mean you shouldn’t try. For example, if you are looking at shooting averages, comparing them to the Championship average is a great start, but there are other factors you need to consider.

  • Game Situation: Is a score when you are behind by 1 worth much more than when you are ahead by 10? I would think so, and it therefore needs to be considered.
  • Game Time: The stage the match is in might also be important (first minute vs last minute). Coupled with the game situation above, the time at which a score comes might also need to be considered.
  • Shot Location: Where on the pitch a player scores from is also important when comparing performances. Inside/Outside the scoring zone for example.
  • Opposition Quality: Against opponents of different quality you will get different results. You need to factor this into any performance % you measure.
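One way to start putting the factors above into the numbers is to tag every shot with its context and then compute a success rate per group. The sketch below is only an illustration: the field names (`margin`, `minute`, `inside_zone`, `opposition`) and the six shots are hypothetical, not taken from any real dataset.

```python
from collections import defaultdict

# Hypothetical shot log. 'margin' is the shooting team's lead (negative = behind).
shots = [
    {"margin": -1, "minute": 68, "inside_zone": True, "opposition": "Top", "scored": True},
    {"margin": -1, "minute": 70, "inside_zone": False, "opposition": "Top", "scored": False},
    {"margin": 10, "minute": 65, "inside_zone": True, "opposition": "Bottom", "scored": True},
    {"margin": 10, "minute": 12, "inside_zone": True, "opposition": "Bottom", "scored": True},
    {"margin": 0, "minute": 1, "inside_zone": False, "opposition": "Top", "scored": False},
    {"margin": 3, "minute": 40, "inside_zone": False, "opposition": "Bottom", "scored": True},
]


def game_state(shot):
    """Classify the game situation at the time of the shot."""
    if shot["margin"] < 0:
        return "behind"
    if shot["margin"] == 0:
        return "level"
    return "ahead"


def success_by(records, key_fn):
    """Return {group: (success rate, attempts)} for the grouping given by key_fn."""
    tally = defaultdict(lambda: [0, 0])  # group -> [scores, attempts]
    for shot in records:
        group = key_fn(shot)
        tally[group][0] += int(shot["scored"])
        tally[group][1] += 1
    return {group: (scores / attempts, attempts) for group, (scores, attempts) in tally.items()}


by_state = success_by(shots, game_state)
by_zone = success_by(shots, lambda s: s["inside_zone"])
by_opposition = success_by(shots, lambda s: s["opposition"])

for state, (rate, attempts) in by_state.items():
    print(f"{state}: {rate:.0%} from {attempts} shots")
```

Returning the number of attempts alongside each rate is deliberate: the more context you split by, the fewer shots land in each group, so the rate should never be read without its count.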

3. Sample Size

GAA teams play an extremely small number of competitive games. For example, in last year’s championship there were 61 games, and that’s across 33 teams. Donegal played only 7 games. In contrast, Man Utd played 38 games in the EPL alone; add on the FA Cup, League Cup and Champions League and you get close to 61 games for them alone. A sample of that size makes it easier to draw real conclusions from the data. It’s never going to be like this in GAA, so we need to work with what we have got, but we need to be mindful of making statements based on small sample sizes.
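To give a rough feel for what 7 games versus 61 games means, the sketch below computes a Wilson score interval (a standard 95% confidence interval for a proportion) for a similar success rate at both sample sizes. The win counts (5 of 7 and 44 of 61) are hypothetical, and treating games as independent trials is a simplifying assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical records with a similar underlying rate (about 71-72%).
small = wilson_interval(5, 7)    # a 7-game championship season
large = wilson_interval(44, 61)  # a season of roughly 61 games

for label, (low, high) in (("7 games", small), ("61 games", large)):
    print(f"{label}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

The interval from 7 games comes out far wider than the one from 61, which is the statistical version of the warning above: with so few games, very different levels of ‘true’ performance are all consistent with the same results.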


1 Comment

  • liam

    You make an interesting point about the small sample size. I think that unlike soccer, research should focus more on tactics than about team performance, as you won’t ever have enough games to judge a team (especially as beaten teams stop playing, which means you can’t figure out how good they were to adjust the winning team’s performance).

    Certain situations, however, recur dozens of times within games and so even with just 61 games per year the sample size will quickly increase. An example would be fortys (or frees in the same range) – what is the expectation of going for it compared to passing it/dropping it into the box? I’ve been watching gaelic for decades and would have very low confidence in guessing this accurately. If you knew it, however, you could then adjust the expectation based on the player and situation (wind, time remaining) you have at the time.
