Week | Topics/Weekly Activities | Due Dates by 11:59 pm Wednesdays unless noted |
---|---|---|
Week 1: January 9 lecture, January 11 lab | Introduction to Data Science tools: R, markdown | Lab 1 |
Week 2: January 16 lecture, January 18 lab | Version Control & Reproducible Research, Git | Lab 2 |
Week 3: January 23 lecture, January 24 @ 4pm Guest Speaker, January 25 lab (sample solution) | Exploratory Data Analysis, Assignment 1 | Lab3, Reflection |
Week 4: January 30 lecture, paper, Guest Speaker, February 1 lab (sample solution) | Data visualization | HW1, Lab4, Reflection |
Week 5: February 6 lecture, February 8 lab (sample solution) | Data cleaning and wrangling, ML 1 advanced regression, advanced regression solution | Lab5 |
Week 6: February 13 lecture, February 15 lab (sample solution) | Regular Expressions, Big Data, Data scraping, using APIs | HW2, Lab6 |
Week 7: February 20/22 | Reading Week | |
Week 8: February 27 lecture, March 1 lab (sample solution) | Text mining | Lab8 |
Week 9: March 6 lecture, Guest Speaker, March 8 lab (sample solution) | High performance computing, cloud computing | Midterm, Lab9 |
Week 10: March 13 lecture, March 15 lab (sample solution), lab-b (optional) (sample solution) | ML 2 (trees, rf, xgboost) | HW3, Lab10 |
Week 11: March 20 lecture, March 22 lab11 (sample solution) | Interactive visualization and effective data communication I | Lab11 |
Week 12: March 27 lecture, March 29 lab12 | Interactive visualization and effective data communication II | HW4, Lab12 |
Week 13: April 3 lecture, April 5 (lab) | Final Project Workshop | HW5 |
Week 15: April 28 (final exam period) | Final Project | |

Task | % of Grade |
---|---|
Labs | 10 |
Guest speaker reflections | 5 |
Homework (5) | 25 |
Midterm report | 25 |
Final project | 35 |

knitr by its author, Yihui Xie. There is also a knitr book covering the same ground in more detail.

Makefiles in the data-analysis pipeline, by Lincoln Mullen.

xcode-select --install, or just try to use e.g. git from the terminal and have OS X prompt you to install the tools.

.tex format directly, but it is more useful to just have it available in the background for other tools to use. The MacTeX Distribution is the one to install for macOS.

pandoc-citeproc for processing citations and bibliographies, and pandoc-crossref for producing cross-references and labels.

make what the steps are to create the pieces of a document or program. As you edit and change the various pieces, it automatically figures out which pieces need to be updated and recompiled, and issues the commands to do that. See Karl Broman’s Minimal Make for a short introduction. Make will be installed automatically with Apple’s developer tools.

Many of these websites have APIs for downloading the data. We recommend using APIs to get the data.
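
To make the note on make more concrete, here is a minimal Python sketch of the rule it applies when deciding what to update: a target is rebuilt if it is missing or older than any file it depends on. This is an illustration under assumptions, not how make itself is implemented. The function names `needs_rebuild` and `demo` and the file names `report.Rmd` and `report.html` are hypothetical, and real make also follows chains of dependencies, which this sketch omits.

```python
import os
import tempfile
import time


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def demo():
    """Create a throwaway source/output pair and report three rebuild decisions."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "report.Rmd")   # hypothetical file name
        output = os.path.join(workdir, "report.html")  # hypothetical file name

        with open(source, "w") as handle:
            handle.write("draft")
        # The output does not exist yet, so it must be built.
        before_build = needs_rebuild(output, [source])

        with open(output, "w") as handle:
            handle.write("rendered")
        # Set timestamps explicitly so the demo does not depend on clock resolution.
        now = time.time()
        os.utime(source, (now - 10, now - 10))
        os.utime(output, (now, now))
        after_build = needs_rebuild(output, [source])

        # Simulate editing the source after the output was produced.
        os.utime(source, (now + 10, now + 10))
        after_edit = needs_rebuild(output, [source])

    return before_build, after_build, after_edit


if __name__ == "__main__":
    print(demo())
```

Running the script prints `(True, False, True)`: the output is built when missing, left alone while it is newer than its source, and rebuilt once the source changes.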