NFL Statistical Analysis

As I mentioned in my previous post, both the NFL and college football lack the raw data and the intensive research and analysis that the other three big professional sports enjoy. While I’d like to look at all kinds of sports data, for the time being I’m going to focus on football data.

The first obstacle is the lack of raw data available for recent NFL games. When I say raw data, I primarily mean weekly box scores in an easily parsable format (like XML, CSV, or even plain text). I have found some promising sources, but they just don’t have the exact stats that I’d like. Erik Berg has a wonderful site, full of regularly updated XML files for sports, but it would seem he focuses primarily on the NBA and college basketball. His NFL beta box scores from the 2006 season also don’t have the individual player stats that I’d like. But it’s a good resource for information about the SportsML schema for storing box scores.

The football stats juggernaut, is also a great source for data, and its new beta site,, has a .csv format for lots of the stats on the site, but lacks enough stats for the individual games. This is not to put down the work that they are doing at all… it is definitely amazing and useful, it just doesn’t have exactly what I’m looking for.

So it would seem that my own pickiness means I need to take matters into my own hands. The best remedy I could think of was to find a source that provides box scores in .html format, and then write a “scraping” program that turns the .html file into usable data. I checked out a few different sources (ESPN, CNN/Sports Illustrated, USA Today, Yahoo! Sports), but I finally settled on In terms of ease of parsing the html and depth of the stats listed, it was far and away the best. It also has a good play-by-play for each game, in case that’s data I’m ever interested in using.
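To give a rough idea of what I mean by scraping, here is a minimal Python sketch using only the standard library. It is not my actual program. It assumes the team stats sit in a plain html table with one row per stat and one column per team, which is a made-up layout for illustration. The names `TableRowParser`, `parse_team_stats` and `SAMPLE_HTML` are my own inventions, and the numbers in the sample are placeholders, not real game data.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_team_stats(html):
    """Turn a stat-per-row table into {team: {stat label: value}}.

    Assumes the first row is a header whose first cell is a label
    column and whose remaining cells are team names (hypothetical layout).
    """
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return {}
    header, *body = parser.rows
    teams = header[1:]
    stats = {team: {} for team in teams}
    for label, *values in body:
        for team, value in zip(teams, values):
            stats[team][label] = value  # values stay as strings
    return stats


# Placeholder markup and numbers, invented purely for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Stat</th><th>Home</th><th>Away</th></tr>
  <tr><td>First Downs</td><td>20</td><td>17</td></tr>
  <tr><td>Total Yards</td><td>350</td><td>298</td></tr>
</table>
"""

if __name__ == "__main__":
    print(parse_team_stats(SAMPLE_HTML))
```

A real box score page will have several tables and messier markup, so the hard part is picking out the right table, not the cell-by-cell reading shown here.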

I’m in the process of writing this program (so far it correctly grabs the team stats, but it doesn’t handle the individual stats yet), and my question is: should I put any effort into writing the data back out into either an xml file or a csv file, so that others can use it? Would anyone find this type of data useful? If so, which format would you prefer, xml or csv? Also, the box scores only go back to the 2002 season, but that’s still five years’ worth of data with a lot of statistics, which I prefer over 20 years of minimal data. If you have any advice or input on this process, or any specific requests or ideas, feel free to let me know and I’ll keep you updated on the progress.
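For anyone weighing the two formats, here is a rough Python sketch of what writing the same team stats out as csv and as xml could look like, using the standard `csv` and `xml.etree.ElementTree` modules. The element names (`boxscore`, `team`, `stat`) are ones I made up for this example and are not taken from the SportsML schema. The function names and the numbers in `EXAMPLE_STATS` are placeholders too, not real game data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Placeholder values, invented purely for illustration.
EXAMPLE_STATS = {
    "Home": {"First Downs": 20, "Total Yards": 350},
    "Away": {"First Downs": 17, "Total Yards": 298},
}


def to_csv(stats):
    """Return one csv row per team, with one column per stat label."""
    # Gather the column labels in first-seen order.
    labels = []
    for team_stats in stats.values():
        for label in team_stats:
            if label not in labels:
                labels.append(label)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["team"] + labels)
    for team, team_stats in stats.items():
        # A stat missing for a team becomes an empty field.
        writer.writerow([team] + [team_stats.get(label, "") for label in labels])
    return buffer.getvalue()


def to_xml(stats):
    """Return the same data as a small xml document (made-up element names)."""
    root = ET.Element("boxscore")
    for team, team_stats in stats.items():
        team_el = ET.SubElement(root, "team", name=team)
        for label, value in team_stats.items():
            stat_el = ET.SubElement(team_el, "stat", name=label)
            stat_el.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv(EXAMPLE_STATS))
    print(to_xml(EXAMPLE_STATS))
```

The csv version drops straight into a spreadsheet or stats package, while the xml version has more room for nested data such as individual player stats, which is part of why I’m asking which one people would actually use.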


4 Responses to “NFL Statistical Analysis”

  1. 1 thesportsmaster8000 November 28, 2007 at 5:54 pm

    Hey, I’d be interested in seeing what you end up developing, and I would be interested in seeing your data as a csv or xml file. I’ve been using a statistics program to crunch some stats and give me some data, but it was more or less just messing around. I ran into the same problem you did: not enough stats in an easy-to-transfer format. I’ll be checking back to see what you come up with.

  2. 2 thesportsmaster8000 November 28, 2007 at 11:55 pm

    Hey, I’d be interested in seeing what you come up with as far as a program and data set are concerned. I’ve done some simple stuff using a statistics program (SPSS), but I run into the same problems you do with lack of data and lack of an easy format. I’ll check back to see what more you come up with.

  3. 3 thesportsmaster8000 November 28, 2007 at 11:55 pm


  4. 4 Joel Marcey November 29, 2007 at 12:51 pm

    I too would be *very* interested to see what you come up with, especially if something can be done that is accurate, up-to-date, and cheap (free is always good).

    I have not deeply dived into to find out what they offer as far as stats, how one can use them and pricing, but I would have to imagine they have something pretty compelling. The key factor with them would have to be price, I would think.

    That would be something interesting to find out.
