Overview
The following three tables contain NHL even strength skater reguarlized adjusted plus-minus (RAPM) data in three, five, and one-year stints. The impacts are in terms of expected goals (xG) per 60 minutes of play using my own model. RAPM leverages a technique called ridge regression to estimate player impacts on offense and defense in the roles they are given. All the models have to go off of is what happened on the ice. They cannot extrapolate what would happen if a player had a different role, different deployment patterns, or played with different teammates. The quality of the teammates and opposition are accounted for of course, but if a player happens to line up with teammates who’s skills do not match up well with his own, tough luck. For more reading about RAPM, it was first popularized in NBA analysis and I would recommend reading this two part series from Justin Jacobs, who has consulted for NBA teams. If you prefer a more hockey-centric overview, this overview from the EvolvingWild twins is a great read.
These estimates are rate stats (per 60 minutes of even strength ice time). Two players who are close in estimated impact might look like they provided the same on-ice value on the surface, but if one player played significantly more minutes than another, that player provided more to his team. That does not make the player “better” per se, the impacts are meant to measure quality by themselves. But grizzled internet sports analysts know value is a function of both quality and opportunity.
If you read the overviews I linked, you will realize tha RAPM values are still very noisy over one year samples. It is best to use larger samples when using RAPM to evaluate players. Five years is the best bet, but that is also an unsatisfyingly long time horizon. Three year samples split the proverbial baby in this regard, having a much more robust sample of play-by-play data to work from than one-year samples but providing results quicker than the five-year RAPM.
The main reason for this type of model struggling to stabilize when fed a season’s worth of play-by-play data is that one season does not seem to be quite enough to adaquately dampen the effects of collinearity. Collinearity takes the form of pairs or groups of teammates who do not see a significant amount of ice time apart and the model correspondingly has trouble isolating those players’ impacts. Three (and certainly five) seasons give each player in the sample many chances to play with a variety of different teammates, due to line/pair changes throughout a multi-year period and roster turnover during the offseason and at the trade deadline.
While we are on the topic of model limitations, there are a couple of items that I think are worth mentioning that I am sure one might question when looking at the results stored in the tables below. The first is that this model is trained to isolate impact in terms of expected goals. The usual caveats for expected goals apply here; obviously the publicly available data does not consider things like passing and goaltender movement. But let us not make perfect the enemy of good, we still have public access to important features such as shot distance and angles, the changes in each between shots, the time between shots, and shift start locations. I think directionally public expected goal models do a good job in bettering our understanding of what happens on the ice. As the British statistician George Box once said, “All models are wrong, some are useful.”
Along the same lines, we know players have some ability to drive goals above expected, whether via their own shot-making talent or their ability to complete high-danger passes (across the royal road, into the slot, to spring rush chances, etc). That talent is not totally captured in here. We would expect good passers to assist in generating higher xG shot attempts (which is not nothing) but players who can consistently outpace their individual expected goals with supreme shooting talent are not credited here (th most notable examples here would probably be a Leon Draisaitl or a Steven Stamkos). Shooting is definitely a skill but the extent to which it is repeatable over a large sample of shots is up for debate. Extreme cases,however, lead to some middling results for players we (the public) generally think of as elite talents. Those players are punished here and it should be documented.
One final note is that this does not include individual player contributions explicitly. There is no information about a player’s on-ice percentages here or their individual contributions (in terms of scoring, shot attempt, scoring chance, or microstat rates). The only information fed to the model is who is on the ice and contextual factors about the game and shift at hand (home/away, shift start location, is the game a back-to-back, score, etc.) Supplying these individual stats would give more insight into player roles and would most likely help the model distinguish who is driving the bus, as it were. It is reasonable to conclude that the model would be more successful in isolating and gauging player impact. I agree with this sentiment and admit it is the main limitation of the pure RAPM approach. In the future I would like to address this concern and see which players improve/worsen with the addition of more player-specific data. That is a project for another day.
I will include tables for one-year, three-year, and five-year values and order them by my subjective view of their usefulness. The data included runs up through the games the last game of the 2023-24 regular season.