NFL Box Score XML Format

So, the scraping program that puts NFL data and statistics into a parsable, useful format is coming along well. I’ve got a sample box score in HTML format (from nfl.com) and the corresponding XML file that my program creates. The links are at the bottom of this post, and what I’m looking for right now is some input about the format. I tried to keep the XML schema as close as possible to the logical layout of the nfl.com box score page.

The initial format contains a “game-metadata” section, and two “team” sections. The game-metadata section consists of the names of the two teams playing and is also a placeholder for a bunch of information that I’d like to include in the future, such as the date, day of the week, weather, surface of the playing field, whether or not it’s a dome, etc.
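Here is a rough sketch of that top-level layout, built with Python’s standard xml.etree.ElementTree module. The element names game-metadata, team, and team-metadata come from the description above. The root element name box-score and the attribute names team1, team2, and name are assumptions made for illustration and may not match the real file.

```python
import xml.etree.ElementTree as ET


def build_skeleton(team1, team2):
    """Build an empty box score document: one game-metadata, two team sections."""
    # "box-score" as the root element name is an assumption.
    root = ET.Element("box-score")
    # For now game-metadata only carries the two team names.
    # The attribute names team1/team2 are hypothetical.
    ET.SubElement(root, "game-metadata", {"team1": team1, "team2": team2})
    for name in (team1, team2):
        team = ET.SubElement(root, "team")
        # The "name" attribute is hypothetical; record, win/loss, etc. would go here too.
        ET.SubElement(team, "team-metadata", {"name": name})
    return root


root = build_skeleton("Bills", "Jets")
print(ET.tostring(root, encoding="unicode"))
```

Future fields such as date, weather, or playing surface could be added to game-metadata without touching the team sections.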

Each team has its own section with “team-metadata”, such as team name, win or loss, and current record. It also houses all of the team stats from the box score, labeled intuitively, like passingTouchdowns or fumblesLost. The values are stored so that they can be parsed directly from a String to an integer. For example, what reads in the box score as 17-26 (passing completions and attempts) is listed in the xml file as two separate fields, passingComp="17" and passingAtt="26", so you don’t have to split the value yourself before converting to an int.
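To make the idea concrete, here is a minimal Python sketch of both sides: splitting a combined box score value into two integer fields when writing, and reading them back with a plain int() call. The attribute names passingComp and passingAtt are from the format described above. The element name team-stats and the helper split_pair are made up for this example.

```python
import xml.etree.ElementTree as ET


def split_pair(text):
    """Split a combined box score value such as '17-26' into two integers."""
    first, second = text.split("-")
    return int(first), int(second)


# Writing side: one combined value becomes two separate attributes.
comp, att = split_pair("17-26")
# "team-stats" is a hypothetical element name.
stats = ET.Element("team-stats", {"passingComp": str(comp), "passingAtt": str(att)})
xml_text = ET.tostring(stats, encoding="unicode")

# Reading side: no string splitting needed, just int().
parsed = ET.fromstring(xml_text)
completions = int(parsed.get("passingComp"))
attempts = int(parsed.get("passingAtt"))
print(completions, attempts)
```

Note that this simple split assumes both numbers are non-negative. A combined value that contains a negative number would need more careful handling in the scraper.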

Also under the “team” tag is a section for individual player stats. These are broken down into different categories: passing, rushing, receiving, fumbles, kicking, punting, kickoff returns, punt returns, and defense. Within each category is a list of all the players who recorded a stat for that particular category. So Trent Edwards is listed under the passing and rushing categories but not under kickoff returns or defense. The other option that I was considering was listing each player for a team with all of his stats together, rather than separating them by category. If you leave feedback, please keep this in mind.
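The two layouts are easy to convert between, which may make the choice less critical. The Python sketch below reads a category-grouped team section and regroups it by player. The element and attribute names (passing, rushing, player, name, yards) are assumptions for illustration, and the players and numbers are placeholders, not real stats from the game.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical category-grouped layout with placeholder players and values.
SAMPLE = """
<team>
  <passing>
    <player name="Player A" yards="100"/>
  </passing>
  <rushing>
    <player name="Player A" yards="10"/>
    <player name="Player B" yards="50"/>
  </rushing>
</team>
"""


def by_player(team):
    """Regroup category -> players into player -> category -> stats."""
    players = defaultdict(dict)
    for category in team:
        for player in category.findall("player"):
            # Every attribute except the name is treated as an integer stat.
            stats = {k: int(v) for k, v in player.attrib.items() if k != "name"}
            players[player.get("name")][category.tag] = stats
    return dict(players)


team = ET.fromstring(SAMPLE)
grouped = by_player(team)
print(grouped)
```

A player who only appears in one category simply ends up with a single entry, so nothing is lost in the conversion.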

Without further ado, here is the link to the original box score at nfl.com: Bills’ defense stifles Jets in victory

And here is the link for the xml file that my program generated: buf-nyj.xml

The xml file is currently hosted on my personal school web space, since WordPress has restrictions on uploading xml files. Any advice on a place to permanently store the full set of stats would be appreciated as well. Please keep in mind that this is only a first trial and a sample format, and that your input is needed to make it as useful as possible!