JSC370 and JSC470: Data Science II and III
Where and When
- Instructor: David Duvenaud
- Teaching Assistant: Harsh Panchal
- Email: firstname.lastname@example.org. Please put “JSC370” or “JSC470” in the subject line.
- Location: Zoom (see Quercus for details)
- Time: Tuesdays and Thursdays, 3-5pm
- Office hours: Wednesdays 4-5 on Zoom.
- Course Forum: Discourse
- Course syllabus
Week 1: Background, motivation, course setup
January 12 Lecture: Video | Slides
- We don’t need data scientists, we need data engineers
- Data Science Subreddit - Has great discussion of what jobs are available, career trajectories and considerations, common problems, etc.
January 12 Tutorial: Review of Python, NumPy, Pandas, Git, Colab. Video
January 19: Guest Lecture: Ben Allison, Principal Machine Learning Scientist at Amazon. Video
January 21: Lecture on Latent variable models and collaborative filtering, intro to Assignment 1
- Intro to JAX
- Collaborative Filtering and the Missing at Random Assumption
- If It’s Worth Doing, It’s Worth Doing With Made-Up Statistics
- Intro to probabilistic matrix factorization
January 25: Lecture 3: Confounding, censoring, and Assignment 1 lab. Video | Assignment 1
Some links on confounding and Simpson’s paradox:
January 21: Assignment 1 Presentations Video
Feb 1st: Assignment 1 due by midnight.
February 2nd: Guest Lecture: Farah Bastien, Manager, Data Science/Data Engineer at MLSE (Maple Leaf Sports & Entertainment Partnership): Sports Analytics for the Leafs and the Raptors. Video
February 4th: Shapley values, causality, and Pearl’s do-calculus. Video
- SHAP values explained exactly how you wished someone explained to you
- A Unified Approach to Interpreting Model Predictions
- Making sense of Shapley values
- Causal Shapley Values
- Problems with Shapley-value-based explanations as feature importance measures
February 9: Assignment 2 lab Video
February 11: Assignment 2 presentations Video
Feb 17: Assignment 2 due by midnight.
February 23: Guest Lecture: Wanying Zhao, Study design at Trillium Foundation Video
February 25: Natural Language Processing Video
- Latent Semantic Analysis
- Topic Modeling + LDA Slides
- Original LDA Paper
- Illustrated Word2Vec
- RNN + Deep Language Model Slides
- Talk to Transformer
- To What Extent is GPT-3 Capable of Reasoning?
March 2nd: Assignment 3 Lab Video
March 4th: Assignment 3 presentations Video
March 9: Guest Lecture: Alp Kucukelbir, Chief Scientist, Fero Labs
March 10th: Assignment 3 due by midnight.
March 11: Assignment 4 lab
March 16: Lecture 9: Using large off-the-shelf models, Outlier detection and Goodhart’s Law, Decision theory, + time series
March 18: Assignment 4 presentations
March 22: Assignment 4 due by midnight.
March 23: Guest Lecture: Robert Grant, Cancer Genomics
March 25: Assignment 5 lab
March 30: Lecture 11: Reproducibility and version control for data
April 1: Assignment 5 presentations
April 5: Assignment 5 due by midnight.
April 6: Lecture 12: The future of Data Science
April 8: Paper presentations