XML Box Scores Now Available

It’s been a little while, since I was waiting on a new laptop to arrive, but my stats parser for generating NFL box scores in XML is finally ready to be “beta-tested.” By that I mean I have XML box scores for the entire 2006 season, which I’m making available for download. Given the sheer volume of statistics, there’s no way I can check the accuracy of every file. I did, however, run a few tests that aggregate all of the season’s games, and they produced the same season leaders in multiple categories as nfl.com lists on its web site.
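A check along those lines could be sketched roughly as follows. This is only an illustration, not the actual test code: the `player` element and the `name` and `rushing-yards` attributes are placeholder names made up for the example, and the real files may be laid out quite differently.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def season_totals(xml_documents, stat):
    """Sum one stat per player across a collection of box score XML strings.

    Assumes (hypothetically) that each player appears as a <player> element
    with a "name" attribute and the stat stored as an integer attribute.
    """
    totals = Counter()
    for doc in xml_documents:
        root = ET.fromstring(doc)
        for player in root.iter("player"):
            # A missing attribute counts as zero for that game.
            totals[player.get("name")] += int(player.get(stat, "0"))
    return totals


def leaders(totals, n=5):
    """Return the top n (name, total) pairs, highest total first."""
    return totals.most_common(n)
```

The list returned by `leaders` can then be compared by hand against the season leaders page on nfl.com for the same category.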

Unfortunately, WordPress doesn’t allow me to upload zip or XML files, so I’ve uploaded them to a third-party site, mediafire.com. They can be found at this link: http://www.mediafire.com/?23mtrjxwfmy . The file unzips to a folder containing one folder for each of the 17 weeks of the 2006 season, and each week’s folder contains all of the games played that week. The format is very similar to the one described in NFL Box Score XML Format, with a few additions taken from the gamebook for each game.
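Since the download is just a tree of week folders full of XML files, a quick way to look it over is to walk the tree and try to parse every file. The sketch below assumes only that layout (one folder per week, `.xml` files inside). The function name `check_box_scores` is made up for this example, and it tests nothing beyond whether each file is well-formed XML.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_box_scores(root_dir):
    """Try to parse every .xml file under root_dir.

    Returns (games_per_week, failures): a dict mapping each week folder
    name to the number of files that parsed, and a list of
    (path, error message) pairs for files that did not.
    """
    games_per_week = {}
    failures = []
    for xml_path in sorted(Path(root_dir).rglob("*.xml")):
        week = xml_path.parent.name  # folder the game file sits in
        try:
            ET.parse(str(xml_path))
        except ET.ParseError as err:
            # Unclosed or mismatched tags end up here.
            failures.append((str(xml_path), str(err)))
            continue
        games_per_week[week] = games_per_week.get(week, 0) + 1
    return games_per_week, failures
```

Calling `check_box_scores` on the unzipped folder gives a per-week count of files that parsed, plus a list of any files that need a closer look.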

If you download the files and look through the XML, please leave a comment to let me know what you think of the format, or whether there are any improvements you’d like to see. I’m planning to use the 2006 season as a “test run” while I work on a gamebook parser that will be included for the 2007 season (and later used to improve the 2006 files). Thanks, and enjoy!


4 Responses to “XML Box Scores Now Available”

  1. 1 Brian December 14, 2007 at 6:13 pm

    Very cool. Keep up the great work.

  2. 2 SportsPicksSystem December 26, 2007 at 11:43 pm


    I tried to open the file atl-car.xml from the week1 directory. The game-metadata tag is never closed. I think this problem occurs in every file. Do you have the same problem? Anyway, great work.

    Sports Picks System

  3. 3 Steve April 20, 2008 at 6:59 pm

    I too decided in 2007 to scrape the box scores from NFL.Com to use in a neural network. My parser is set up to select the box scores for any given week in any season, or an entire season’s worth if available on nfl.com.

    Basically my steps are:
    1. gather the schedules in text format
    2. let the PC format them as XML
    3. select the schedule to fetch box scores for
    4. iterate through the HTML, in memory, and output the box scores week by week and for a full season as xml.
    5. Then I am able to use this data to complete calculations and format the select stats for a neural network as well.

    I did all of my work in C# in VS 2003. I have since ported it to VS 2008.

    Very interesting. Keep it up.

  4. 4 Fantasy News July 2, 2008 at 9:55 pm

    The sample sheets look great. However, it abbreviates the first name of each player... Why? That makes it harder to add to a database to keep track of things.

    I suppose the nfl.com boxscore must abbreviate the first name as well? If that is the case, then parsing nfl.com might not be the best route to go...

    What language is it coded in? I would love to take a look because I was just about to purchase the NFL boxscore package at xmlteam, so this may be a better route 🙂

    Please keep us updated! I can smell football around the corner
