Week | Topics/Weekly Activities | Due Dates by 11:59 pm Fridays unless noted |
---|---|---|
Week 1: January 8 lecture (pdf), January 10 lab | Introduction to Data Science tools: R, markdown | Lab 1 |
Week 2: January 15 lecture (pdf), January 17 lab | Version Control & Reproducible Research, Git | Lab 2 |
Week 3: January 22 lecture (pdf), January 24 lab (sample solution) | Exploratory Data Analysis | Lab 3 |
Week 4: January 29 lecture (pdf), January 31 lab (sample solution) | Data visualization | HW1, Lab 4 |
Week 5: February 5 lecture (pdf), February 7 lab (sample solution) | Data cleaning and wrangling; ML 1: advanced regression (advanced regression solution) | Lab 5 |
Week 6: February 12 lecture (pdf), February 14 lab (sample solution) | Regular Expressions, Data scraping, using APIs | HW2, Lab 6 |
Week 7: February 21 | Reading Week | |
Week 8: February 26 lecture, February 28 lab (sample solution) | Text mining | Lab 8 |
Week 9: March 4 lecture, March 6 lab (sample solution) | High performance computing, cloud computing | Midterm, Lab 9 |
Week 10: March 11 lecture, March 13 lab (sample solution), lab-b (optional) (sample solution) | ML 2 (trees, rf, xgboost) | Lab 10 |
Week 11: March 18 lecture, March 20 lab11 (sample solution) | Interactive visualization and effective data communication I | HW3, Lab 11 |
Week 12: March 25 lecture, March 27 lab12 | Interactive visualization and effective data communication II | Lab 12 |
Week 13: April 1 lecture, April 3 | Final Project Workshop | HW4 |
Week 15: April 30 | | Final Project, HW5 |

Task | % of Grade |
---|---|
Labs (including attendance) | 10 |
Homework (5) | 25 |
Midterm report | 30 |
Final project | 35 |
knitr by its author, Yihui Xie. There is also a knitr book covering the same ground in more detail.
Makefiles in the data-analysis pipeline, by Lincoln Mullen.
xcode-select --install, or just try to use e.g. git from the terminal and have OS X prompt you to install the tools.
.tex format directly, but it is more useful to just have it available in the background for other tools to use. The MacTeX Distribution is the one to install for macOS.
pandoc-citeproc for processing citations and bibliographies, and pandoc-crossref for producing cross-references and labels.
make what the steps are to create the pieces of a document or program. As you edit and change the various pieces, it automatically figures out which pieces need to be updated and recompiled, and issues the commands to do that. See Karl Broman’s Minimal Make for a short introduction. Make will be installed automatically with Apple’s developer tools.
Many of these websites have APIs for downloading the data. We recommend using APIs to get the data.
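
The way make decides which pieces need to be updated, as described above, can be illustrated with a small Python sketch. This is only a simplified model of the idea (a target is rebuilt when it is missing or older than something it depends on), not make's actual implementation. The function names `needs_rebuild` and `demo` and the file names `report.Rmd` and `report.html` are hypothetical, chosen for illustration.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency.

    Simplified model of make's rule; a missing dependency raises
    FileNotFoundError here, whereas make would try to build it first.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def demo():
    """Show the rule on two temporary files with explicit timestamps."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "report.Rmd")   # hypothetical input
        output = os.path.join(workdir, "report.html")  # hypothetical target
        for path in (source, output):
            open(path, "w").close()

        # Output is newer than its source: nothing to do.
        os.utime(source, (1000, 1000))
        os.utime(output, (2000, 2000))
        before_edit = needs_rebuild(output, [source])

        # Source was edited after the output was built: rebuild needed.
        os.utime(source, (3000, 3000))
        after_edit = needs_rebuild(output, [source])

        return before_edit, after_edit


if __name__ == "__main__":
    print(demo())
```

In a real Makefile the same relationship is declared as a rule naming the target, its dependencies, and the command that rebuilds it; make then performs this timestamp comparison for every rule in the chain.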