Data for Case Studies in Data Science with R

This page links to the data sets for the case studies (chapters) in the book Case Studies in Data Science with R. In some chapters, we access or construct the data programmatically (e.g., via simulation), so there is little to download here. In other cases, we link to other sites so that you can get the most recent versions of those data sets.

1. Predicting Location via Indoor Positioning Systems

2. Modeling Runners' Times in the Cherry Blossom Race

3. Using Statistics to Identify Spam

4. Processing Robot and Sensor Log Files: Seeking a Circular Target

5. Strategies for Analyzing a 12 Gigabyte Data Set: Airline Flight Delays

6. Pairs Trading

7. Simulation Study of a Branching Process

Data are generated during the simulation.

8. A Self-Organizing Dynamic System with a Phase Transition

Data are generated during the simulation.

9. Simulating Blackjack

Data are generated during the simulation. CSV files for the strategies are available, both directly and as the archives CSV.tar.gz and CSV.zip.

10. Baseball: Exploring Data in a Relational Database

11. CIA Factbook Mashup

12. Exploring Data Science Jobs with Web Scraping and Text Mining


Duncan Temple Lang <[email protected]>
Last modified: Sat Nov 15 18:36:23 PST 2014