CMPT 353 Lecture Notes

  1. Course Introduction [“Course Introduction” slides]
    1. This Course [This Course slides]
    2. Offering Strategy [Offering Strategy slides]
    3. Grades [Grades slides]
    4. Exercises [Exercises slides]
    5. Project [Project slides]
    6. Quizzes/Exam [Quizzes/Exam slides]
    7. Us [Us slides]
    8. Lectures and Labs [Lectures and Labs slides]
    9. References [References slides]
    10. Programming [Programming slides]
    11. Expectations [Expectations slides]
    12. Computational Data Science? [Computational Data Science? slides]
    13. Data Science? [Data Science? slides]
    14. Why Data Science? [Why Data Science? slides]
    15. Topics (1) [Topics (1) slides]
  2. Data Analysis Pipeline [“Data Analysis Pipeline” slides]
    1. Your Question [Your Question slides]
    2. Getting Data [Getting Data slides]
    3. Preparing Data [Preparing Data slides]
    4. Analyzing Data [Analyzing Data slides]
    5. Presenting Results [Presenting Results slides]
    6. Creating a Pipeline [Creating a Pipeline slides]
    7. Manual Pipeline Steps [Manual Pipeline Steps slides]
    8. The Pipeline [The Pipeline slides]
  3. Data In Python [“Data In Python” slides]
    1. Built-In Data Structures [Built-In Data Structures slides]
    2. NumPy [NumPy slides]
    3. Operating on Arrays [Operating on Arrays slides]
    4. Pandas [Pandas slides]
    5. Working With Pandas [Working With Pandas slides]
  4. Getting Data [“Getting Data” slides]
    1. Where Data Comes From [Where Data Comes From slides]
    2. Data from Files [Data from Files slides]
    3. Databases [Databases slides]
    4. Web APIs [Web APIs slides]
    5. Scraping HTML [Scraping HTML slides]
    6. File Formats [File Formats slides]
    7. CSV [CSV slides]
    8. JSON [JSON slides]
    9. XML [XML slides]
    10. Others [Others slides]
  5. Extract-Transform-Load
  6. Noise Filtering
  7. Cleaning Data
  8. Stats Review
  9. Inferential Stats
  10. Statistical Tests
  11. Machine Learning
  12. ML: Classification
  13. ML: Other Techniques
  14. Big Data and Spark
  15. How Spark Calculates
  16. Working With Spark
  17. Aside: NumPy/Pandas Speed
  18. Other DataFrame Tools
  19. Communicating
  20. More Data Science

Course home page.

Schedule, Summer 2024

Week  Deliverables (*)  Lecture Hour  Lecture Date  First Slide  Video Link
1                       1             May 6
                        2             May 6
                        3             May 8
2     Exer 1            4             May 13
                        5             May 13
                        6             May 15
3     Exer 2            7             May 20
                        8             May 20
                        9             May 22
4     Exer 3            10            May 27
                        11            May 27
                        12            May 29
5     Exer 4            13            Jun 3
                        14            Jun 3
                        15            Jun 5
6     Exer 5            16            Jun 10
                        17            Jun 10
                        18            Jun 12
7     Exer 6            19            Jun 17
                        20            Jun 17
                        21            Jun 19
8     Exer 7            22            Jun 24
                        23            Jun 24
                        24            Jun 26
9     Exer 8            25            Jul 1
                        26            Jul 1
                        27            Jul 3
10    Exer 9            28            Jul 8
                        29            Jul 8
                        30            Jul 10
11    Exer 10           31            Jul 15
                        32            Jul 15
                        33            Jul 17
12    Exer 11           34            Jul 22
                        35            Jul 22
                        36            Jul 24
13    Exer 12           37            Jul 29
                        38            Jul 29
                        39            Jul 31
14

* Check CourSys for the actual due dates and times.

Quiz instruction slide.