JSC370 and JSC470: Data Science II and III
Winter 2021
Where and When
 Instructor: David Duvenaud
 Teaching Assistant: Harsh Panchal
 Email: duvenaud@cs.toronto.edu. Please put “JSC370” or “JSC470” in the subject line.
 Location: Zoom (see Quercus for details)
 Time: Tuesdays and Thursdays, 3-5pm
 Office hours: Wednesdays 4-5 by Zoom.
 Course Forum: Discourse
 Course syllabus
Course Structure
Tentative Schedule
Week 1: Background, motivation, course setup
January 12 Lecture: Video  Slides
 We don’t need data scientists, we need data engineers
 Data Science Subreddit: has great discussion of what jobs are available, career trajectories and considerations, common problems, etc.
January 12 Tutorial: Review of Python, Numpy, Pandas, Git, Colab Video
Week 2
January 19: Guest Lecture: Ben Allison, Principal Machine Learning Scientist at Amazon  Video 
January 21: Lecture on Latent variable models and collaborative filtering, intro to Assignment 1

Slides Video  Intro to JAX
 Collaborative Filtering and the Missing at Random Assumption
 If It’s Worth Doing, It’s Worth Doing With Made-Up Statistics
 Intro to probabilistic matrix factorization
Week 3
January 25: Lecture 3: Confounding, censoring, and Assignment 1 lab. Video  Assignment 1
Some links on confounding and Simpson’s paradox:
January 21: Assignment 1 Presentations Video
Week 4
Feb 1st: Assignment 1 due by midnight.
February 2nd: Guest Lecture: Farah Bastien, Manager, Data Science/Data Engineer at MLSE (Maple Leaf Sports & Entertainment Partnership): Sports Analytics for the Leafs and the Raptors. Video
February 4th: Shapley values, causality, and Pearl’s do-calculus. Video
 SHAP values explained exactly how you wished someone explained to you
 A Unified Approach to Interpreting Model Predictions
 Making sense of Shapley values
 Causal Shapley Values
 Problems with Shapley-value-based explanations as feature importance measures
Week 5
February 9: Assignment 2 lab Video
February 11: Assignment 2 presentations Video
Week 6
Feb 17: Assignment 2 due by midnight.
Reading Week
Week 7
February 23: Guest Lecture: Wanying Zhao, Study design at Trillium Foundation Video
February 25: Natural language processing Video
 Latent Semantic Analysis
 Topic Modeling + LDA Slides
 Original LDA Paper
 Illustrated Word2Vec
 RNN + Deep Language Model Slides
 Talk to Transformer
 To What Extent is GPT-3 Capable of Reasoning?
Week 8
March 2nd: Assignment 3 Lab Video
March 4th: Assignment 3 presentations Video
Week 9
March 9: Guest Lecture: Alp Kucukelbir, Chief Scientist, Fero Labs
March 10th: Assignment 3 due by midnight.
March 11: Assignment 4 lab
Week 10
March 16: Lecture 9: Using large off-the-shelf models, outlier detection and Goodhart’s Law, decision theory, and time series
March 18: Assignment 4 presentations
Week 11
March 22: Assignment 4 due by midnight.
March 23: Guest Lecture: Robert Grant, Cancer Genomics
March 25: Assignment 5 lab
Week 12
March 30: Lecture 11: Reproducibility and version control for data
April 1: Assignment 5 presentations
Week 13
April 5: Assignment 5 due by midnight.
April 6: Lecture 12: The future of Data Science
April 8: Paper presentations