Sean Lahman



Download Lahman’s Baseball Database

The updated version of the database contains complete batting and pitching statistics from 1871 to 2013, plus fielding statistics, standings, team stats, managerial records, post-season data, and more. For more details on the latest release, please read the documentation.

The database can be used on any platform, but please be aware that this is not a standalone application. It is a database that requires Microsoft Access or some other relational database software to be useful.
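As a rough illustration of using "some other relational database software", the sketch below loads comma-delimited text into an in-memory SQLite database with Python's standard library and runs one query. The table name `Batting`, the column names, and the sample rows are assumptions made for illustration, not values taken from the download. The helper `load_csv` is hypothetical as well.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one comma-delimited table from the
# download; the column names and values are illustrative assumptions.
SAMPLE_CSV = """playerID,yearID,HR
playera01,2012,30
playera01,2013,25
playerb01,2013,41
"""


def load_csv(conn, table, text):
    """Create `table` from CSV text, using the header row as column names."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return len(data)


conn = sqlite3.connect(":memory:")
loaded = load_csv(conn, "Batting", SAMPLE_CSV)

# Values are stored as text, so cast before summing home runs per player.
totals = conn.execute(
    "SELECT playerID, SUM(CAST(HR AS INTEGER)) AS hr "
    "FROM Batting GROUP BY playerID ORDER BY hr DESC"
).fetchall()
print(totals)
```

To try this against a real file, one could replace `SAMPLE_CSV` with the contents of a file from the comma-delimited download, read with `open(path, newline="").read()`, after checking its actual column names in the documentation.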


Please help support the Baseball Archive. The database is free, but there are real costs associated with maintaining it and making it available for download. The more popular this site becomes, the more expensive it is to keep things going. Please consider making a donation as a show of your support. Like the PBS folks say, we need your support if we’re going to survive. Click here for more information.


Limited Use License

This database is copyright 1996-2014 by Sean Lahman.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.  For details see: http://creativecommons.org/licenses/by-sa/3.0/


Download 2013 Version 

This release includes playing statistics through the end of the 2013 season.

Click here for a list of revisions made since the first beta release on December 6, 2013.


Download Previous Versions

Some third-party applications don’t work with newer versions of the database. For that reason, we’re making some earlier versions available for download. Please be advised that no support exists for these versions. All questions about using the database with third-party applications should be directed to the makers of that software.

2012
2012 Version – Microsoft Access
2012 Version – comma-delimited version
2012 Version – SQL version

2011
Version 5.9.1 – Microsoft Access
Version 5.9.1 – comma-delimited version
Version 5.9.1 – SQL version

2010
Version 5.8 – Microsoft Access
Version 5.8 – comma-delimited version

2009
Version 5.7 – Microsoft Access
Version 5.7 – comma-delimited version

2008
Version 5.6 – Access
Version 5.6 – comma-delimited version

2007
Version 5.5 – Access
Version 5.5 – comma-delimited version

2006
Version 5.4 – Access
Version 5.4 – comma-delimited version
Version 5.4 – spreadsheet version

2005
Version 5.3 – Access 2000
Version 5.3 – Access 97
Version 5.3 – comma-delimited version
Version 5.3 – spreadsheet version

2004
Version 5.2 – Access 2000
Version 5.2 – Access 97
Version 5.2 – comma-delimited version
Version 5.2 – spreadsheet version

2003
Version 5.1 – Access 2000
Version 5.1 – Access 97
Version 5.1 – comma-delimited version

2002
Version 5.0 – Access 2000
Version 5.0 – Access 97
Version 5.0 – comma-delimited version

2001
Version 4.5 – Access 2000
Version 4.5 – Access 97
Version 4.5 – comma-delimited version

2000
Version 4.0 – Access 97

1999
Version 3.0 – comma-delimited version

24 Responses

  1. […] Lahman (most famous for the Lahman Baseball Database) presents an interesting look at the history of board games by delving into the US Patents for some […]

    by Sean Lahman Looks Back on History of Sports Board Games | One for Five on Nov 29, 2011 at 12:27 am

  2. […] poring over at least 600 names via e-mail/comments and scouring the Lahman database, the field of 512 (!) names is set for the “March Moniker […]

    by March Moniker Madness Field is Set | Value Over Replacement Grit on Feb 19, 2012 at 11:57 pm

  3. […] to normalize the strikeouts for every pitcher in major league history. This was done by using the Lahman Database and applying the following […]

    by High Heat Stats » Crowning New Strikeout Kings – Fully Normalized Strikeout Leaders on Feb 23, 2012 at 5:28 pm

  4. […] The 2011 postseason was quite a ride for Rays pitcher Matt Moore. Not only did Moore pitch 7 dominant innings in game 1 of the ALDS vs. Texas, his 3 relief innings in game 4 gave him 10 postseason IP, more than he had thrown in his regular season career. Moore is the extreme, but there are plenty of active players who have pitched a significant part of their careers in the postseason.  This is particularly true for young pitchers of the Texas Rangers. Here are the pitchers who have the highest percentage of their career innings (regular + postseason) coming in the postseason. All stats were derived from information in the Lahaman Database. […]

    by High Heat Stats » Living in the Postseason – Pitchers with the Highest Percentage of Career IP Coming in the Postseason on Feb 26, 2012 at 1:19 pm

  5. […] took the Lahman Baseball Database and filtered out the top five starters for every squad in history, based on number of game starts […]

    by They have an SP1 through 5, but no A,E,I,O,Us. | Value Over Replacement Grit on Mar 31, 2012 at 9:04 am

  6. […] First off, a bit about my methodology. In past posts on weight, I just used a player’s listed weight on Baseball-Reference. However, many of these weights are comically awry, presumably because a player’s weight can get locked in early in his career, even if he subsequently gains a ton of weight and pitches most of his career at the higher number. So I decided to add a second source–the Lahman database. […]

    by The Heaviest Pitcher-Batter Matchups In MLB History | JunkStats on Apr 11, 2012 at 4:54 pm

  7. […] […]

    by SQL - CycloneFanatic on Jun 8, 2012 at 10:48 pm

  8. […] That got me wondering of the differences in the birth names of Major Leaguers versus the US general public over time.  Here are the most popular boys names in each decade from 1880-1980, with “Public” coming from the Social Security Administration and “MLB” culled from the Lahman database. […]

    by Most Popular First Names in Baseball History – by Birth Decade | Value Over Replacement Grit on Jul 7, 2012 at 8:41 pm

  9. […] spring from an Excel workbook I’ve adapted from the storied, freely-downloadable baseball player database managed by Sean Lahman (who won’t turn away contributions, by the way), one of the go-to sites […]

    by Birth Months and Budding Ballplayers: The Little League Thesis Revisited « spreadsheetjournalism on Aug 24, 2012 at 9:53 am

  10. […] compared a list of typefaces with the first and last names in the Lahman Baseball Database.  Here are the exact matches for either first or last […]

    by The VORG’s “All-Font” Team | Value Over Replacement Grit on Sep 19, 2012 at 10:14 pm

  11. […] the distinction that each has been the only player with that particular last name.    Using the Lahman Baseball Database, I found that there have been approximately 6,300 unique last names through 2011, and then I […]

    by A Lineup of Unique Last Names | Value Over Replacement Grit on Oct 2, 2012 at 9:37 pm

  12. […] Sean Lahman’s Baseball Database: complete batting and pitching statistics back to 1871, plus fielding statistics, standings, team stats, managerial records, post-season data, and more. http://www.seanlahman.com/baseball-archive/statistics/ […]

    by Data Sets: A List in Flux | Citizen-Statistician on Nov 21, 2012 at 10:17 am

  13. […] Data Sources: UN Data Google Public Data US Census Statistical Abstract Data Masher Data.gov MLB Green River ScraperWiki RetroSheet Sean Lahman’s Baseball Database […]

    by Comment-free collection of my most frequently visited statistics sites | Special Guessed on Jan 10, 2013 at 11:07 pm

  14. […] are many different sources for baseball stats, many requiring a fee, but I will be referring to the Sean Lahman Baseball Stats Database. It is open source, so you can just download a version that works for you, and run with it. I am […]

    by Baseball’s All-Star Break: Predicting the Game Using Excel | SoftArtisans on Jul 16, 2013 at 12:38 pm

  15. […] statistics data I’ve been using to create the predictive models can be found using the Lahman database up to 2012 and frankly just scraping straight off yahoo MLB for 2013 stats. I’m not sophisticated really […]

    by July 20 2013 MLB Predictions Vs Actual Winners | Epic99 Sports Analytics on Jul 21, 2013 at 8:49 am

  16. […] read it back into R, but what fun is that??? So, I did some more googling and found this amazing baseball statistics website, created by Sean Lahman where you can download insane amounts of data as .csv files or even as […]

    by My new favorite package: Shiny! | Chit Chat R on Aug 28, 2013 at 7:15 pm

  17. […] using the Lahman Baseball Database as our guide, here are the players with unique names to have played from 1871 through 2012.  We […]

    by The VORG’s List of Unique Names in Baseball History | Value Over Replacement Grit on Aug 29, 2013 at 8:18 am

  18. […] In this example, we’ll use the in-database function ‘corr’ to examine the relationship between MLB team wins and other variables (doubles, fielding percentage and home runs).  I’ve loaded my tables with data found in the Lahman database (found here: http://www.seanlahman.com/baseball-archive/statistics/). […]

    by Correlation Analysis with IBM’s PureData for Analytics (Netezza) | Big Data topics (Netezza, Hadoop, etc) on Sep 11, 2013 at 9:58 pm

  19. […] Sean Lahman and his Baseball Database: Updated yearly with encyclopedic stats on players, managers, teams and franchises.  When combined with the Play Index at Baseball Reference and the files from Retrosheet, you can answer just about any stats-based question you might have. […]

    by A Very VORGy Thanksgiving – 2013 Edition | Value Over Replacement Grit on Nov 27, 2013 at 9:29 am

  20. […] Download or move a large csv file to a box with SSIS  installed (im using a csv file from http://seanlahman.com/baseball-archive/statistics/) […]

    by Quick SSIS Throughput Test | SQL Notes From The Underground on Dec 10, 2013 at 11:10 am

  21. […] Using Run Expectancy Matrices and Stolen Base/Caught Stealing Data from 1993-2010, courtesy of the Lahman Database, we can determine whether the runner should be […]

    by Man On First: Should I Steal? | Batting Leadoff on Dec 12, 2013 at 9:34 am

  22. […] http://www.seanlahman.com/baseball-archive/statistics/ […]

    by The Lahman Database: Season-by-Seasuon data | Boot Camp for New Users of R on Dec 13, 2013 at 11:07 pm

  23. […] begin by visiting http://www.seanlahman.com/baseball-archive/statistics/ and clicking on the 2013 Beta Version, comma-delimited format. This will download a zip file […]

    by Regression of OPS Stats | Analyzing Baseball Data with R on Dec 18, 2013 at 11:34 am

  24. […] the VORG has culled a fairly complete list of all sets of same-named players, as generated from the Lahman Baseball Database.  There have been 506 sets of same-named players in baseball history, totaling 1,098 names.  The […]

    by All the Sets of Same-named Players | Value Over Replacement Grit on Dec 18, 2013 at 10:19 pm

About Sean Lahman

Sean Lahman is an award-winning database journalist and author. He develops interactive databases and data-driven stories for the Rochester Democrat and Chronicle and other Gannett newspapers and websites. He also writes a weekly column on emerging technology and innovation. Prior to joining the Democrat and Chronicle, he was a reporter and columnist with the […]
