Missing ESPN play by play data
By - johnnyg68
Yeah, it's certainly been a challenge this year. For me in particular, I've got people sending me CSVs of play data for games that have none, as well as CSVs with corrections. The former I can get imported pretty quickly if they adhere to the format and there are no missing fields. The latter have been a bit of a challenge to import, even with the CSVs. That's something I could open up to more crowdsourcing if more people are interested; so far it's just volunteers who have approached me.
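For anyone preparing a CSV, a pre-submission check along these lines might catch the two problems mentioned above (wrong format, missing fields). This is only a rough sketch: the column names in `REQUIRED_FIELDS` are hypothetical placeholders, since the actual import format isn't spelled out in this thread, and `validate_play_csv` is a made-up helper name.

```python
import csv
import io

# Hypothetical column names; the real import format is not given in this thread.
REQUIRED_FIELDS = ["game_id", "period", "clock", "offense", "defense", "play_text"]


def validate_play_csv(text):
    """Return a list of problems found; an empty list means nothing was flagged."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    # Stop early if whole columns are absent; row checks would be meaningless.
    missing_cols = [f for f in REQUIRED_FIELDS if f not in header]
    if missing_cols:
        return ["missing columns: " + ", ".join(missing_cols)]

    problems = []
    for row in reader:
        # A short row leaves trailing fields as None, so treat None as empty.
        empty = [f for f in REQUIRED_FIELDS if not (row[f] or "").strip()]
        if empty:
            problems.append(f"line {reader.line_num}: empty " + ", ".join(empty))
    return problems
```

Running it over a file's contents before sending would give a list of line-level complaints to fix, or an empty list if nothing obvious is wrong.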
I honestly have no idea about approaching ESPN and I'm not sure they'd be amenable to the feedback anyway. It seems their inside stats people have another PBP dataset they are using for things, but no clue where they get that from or why it differs from what's publicly available. One potential offseason project I'm mulling is creating some automation to pull play data from non-ESPN sources to fill in the ESPN gaps.
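At its simplest, the gap-filling part of that idea could look something like the sketch below. It assumes plays from both sources have already been grouped by game ID into plain dicts, which is an assumption for illustration and not how any existing pipeline is known to work; `fill_gaps` is a hypothetical name.

```python
def fill_gaps(espn_plays, fallback_plays):
    """Merge two {game_id: [plays]} dicts, preferring ESPN when it has plays.

    A game counts as a gap when ESPN has no entry for it or an empty list.
    """
    merged = dict(espn_plays)
    for game_id, plays in fallback_plays.items():
        if not merged.get(game_id):
            merged[game_id] = plays
    return merged
```

This only handles games with no ESPN plays at all. Games where ESPN has plays that are wrong would need a play-level comparison, which is a harder problem.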
> One potential offseason project I'm mulling is creating some automation to pull play data from non-ESPN sources to fill in the ESPN gaps.
Teams' official websites usually have the play-by-play for games they hosted. For example, here's [Stanford's record](https://gostanford.com/sports/football/stats/2021/ucla/boxscore/35162#play-by-play) of their game against UCLA, which I used since the [ESPN version](https://www.espn.com/college-football/playbyplay/_/gameId/401309858) is totally borked.
Since almost all team sites are on the Sidearm platform, they look pretty homogeneous and might be easy to scrape with automation.
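As a starting point, here's a rough sketch of pulling table rows out of a page using only the Python standard library. It assumes the play-by-play is present in the HTML as ordinary `<table>` rows, which I haven't verified for Sidearm pages (the content may be rendered by JavaScript, in which case this wouldn't see it). `TableRowParser` and `extract_rows` are made-up names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <tr> as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments split by inline tags and collapse whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def extract_rows(html):
    """Return every table row in the HTML string as a list of cell texts."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Which rows are actual plays, and which columns hold down, distance, and play text, would still have to be worked out by looking at a real page.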
Yeah, data is now monetized. Short term win, long term lose.
As far as I can find, there's no way to contact ESPN. I searched for ages a few years ago because we had two players with the same number and they thought my team's QB blocked a punt or something goofy on ST. I wanted to contact them to correct it, but it's impossible as far as I can tell.