2 Course Outline

CONTENT

A cartoon of a Black woman. She is wearing a beige sweater with a red, blue, and green stripe on it. She looks like she wants answers to questions.

  • Where we learn is important
  • What is this course about?
  • What are we going to learn?
  • When are we getting together?
  • What skills should we have before we take this course?
  • What textbooks will we be reading?
  • Where can we find course materials?
  • What types of assessments are there?
  • What happens if something is missed or late?
  • What will we learn during lectures?
  • What will we learn during labs?
  • What if we aren’t feeling well?

A photo of Dr. Daniel Gillis in front of a rainbow arc.

CONTACT INFO

  • Hi folks – please call me Dan. If you need to reach me, I can be contacted via my UofG email (dgillis@uoguelph.ca). My pronouns are he/him.
  • My office hours will be discussed during class, but you can also book an appointment with me at any time. To book a time with me, please use my Booking Page.
  • The course email is cis4020@socs.uoguelph.ca. This email address will be monitored by the teaching team on a regular basis. Please use this email if you have any questions about the course content or the course deliverables.

Where we learn is important

The Dish With One Spoon Covenant speaks to our collective responsibility to steward and sustain the land and environment in which we live and work, so that all peoples, present and future, may benefit from the sustenance it provides. As we continue to strive to strengthen our relationships with and continue to learn from our Indigenous neighbours, we recognize the partnerships and knowledge that have guided the learning and research conducted in and for this class. We acknowledge that the University of Guelph resides in the ancestral and treaty lands of several Indigenous peoples, including the Attawandaron people and the Mississaugas of the Credit, and we recognize and honour our Anishinaabe, Haudenosaunee, and Métis neighbours. We acknowledge that the work we do here occurs on their traditional lands so that we might work to build lasting partnerships that respect, honour, and value the culture, traditions, and wisdom of those who have lived here since time immemorial.

What is this course about?

Data Science focuses on extracting the important relations in data. The course is intended as a survey of the discipline and focuses on applied computational methods for data analysis. Topics include algorithms, computational and machine learning methods, software tools, and modelling, as they apply to the analysis of and discovery in big data. Want to know more? Check out the Academic Calendar here.

What are we going to learn?

By the end of this course, you should be able to:

  1. Explain the primary concepts and tools used in data analysis.
  2. Create systems that analyze, capture, and format large data sets.
  3. Apply the appropriate algorithms for analyzing different types of data.
  4. Describe different data types and their characteristics.
  5. Describe the different statistical and learning algorithms for data analysis.
  6. Create software that implements analysis algorithms.
  7. Integrate existing software tools and libraries into an analysis system.

When are we getting together?

At this point in time, the course will be delivered face-to-face (with a few possible exceptions). However, as a class, we will discuss our comfort levels and safety in both the class and lab settings – particularly given the ongoing COVID-19 pandemic. Importantly, we will do whatever we need to do to ensure the health and wellness of everyone in the class (including the teaching team, visitors, and our community partner). With that in mind, it’s important for each of us to remain as flexible and patient as possible during the semester in the event that rising case counts necessitate moving to virtual delivery of the course content.

While I want you to attend every class and every lab, I realize that there may be times when this won’t be possible. Please discuss any challenges you might have with me, and I’ll do my best to help you sort them out.

To ensure our time together is productive, I’m going to ask that you do some work in advance of class. This typically involves reading but might involve sketching, watching videos, or creating interpretive dance routines to demonstrate certain computer science topics.

Lec 01: Monday/Wednesday, 10:00 am – 11:20 am, CRCS 117
Lab 01: Monday, 2:30 pm – 3:20 pm, CRCS 117

What skills should we have before we take this course?

Students enrolled in CIS4020 are expected to have the skills and knowledge covered in the prerequisites listed below, as well as strong writing skills, the ability to work in teams, and strong communication skills.

  • Prerequisite: CIS2750 Software Systems Development and Integration
  • Prerequisite: STAT2040 Statistics I
  • Prerequisite: MATH1160 Linear Algebra I

What textbooks will we be reading?

  • Required: This book!
  • Suggested: R for Data Science (2nd Edition), Hadley Wickham et al., 2023
  • Suggested: Doing Data Science, Cathy O’Neil & Rachel Schutt, 2014
  • Suggested: Data Visualization: A Practical Introduction, Kieran Healy, 2018
  • Suggested: Statistics for Data Science, James D. Miller, 2017
  • Suggested: Statistics for Machine Learning, Pratap Dangeti, 2017

Where can we find course materials?

There are a lot of things to cover in this course! But never fear – all course material, news, announcements, and grades will be regularly posted on our CIS4020 website. You can find it here. Please be sure to check the website regularly. While other tools (such as Slack, Discord, Trello, etc.) might be used and are super helpful in their own right, always refer to the course website for course information, or ask the teaching team if you are still unsure.

Community Engagement Materials

We are extremely lucky to be working with a fantastic community partner this year. To help you build a relationship with them, materials describing community-engaged projects and our partner are available on our website and in this book. It’s important for all of us to review these materials before we begin working with our community partner. And remember that our community partners are volunteering their time to support us as we learn about software design.

Challenge Materials

In addition to the regular course materials, I will post on our course website extra articles or other data about the community partner and their overarching challenge. Please use these materials to bring yourself up to speed with them and their challenge, as this will help you build a better solution as well as a rapport with them.

Lecture Materials

Slides and other lecture materials will be made available in advance of the first day of class. You will be able to link to them directly from this book or via the course website. In the event that we offer a virtual session, it will be recorded and the video will be posted as soon as possible after class. Links to the video will be posted on our course website.

Labs

Lab meetings will allow teams the opportunity to meet with TAs to clarify issues or to receive advice pertaining to their project. Some labs will require that you or your team submit work for grading. While most students will submit these materials by the end of the lab, you will have extra time after the lab to complete and submit the materials should you need it. Lab materials will be made available through the course website.

Assignments

Assignment descriptions and rubrics are provided in this book. You should submit all of your assignments through the course website.

Quizzes

Instead of a final exam, the course has several short quizzes. Each quiz will be available on the website until the last day of class (Friday, December 1st, 2023 at 4:30 pm). The quizzes will not be reopened after this date.

What types of assessments are there?

The course has been broken down into several deliverables. Some of these will be submitted during the lab (although, as mentioned, you’ll have more time than just the lab to get this work done), while others can be done on your own time or with your team. Each deliverable has a due date range instead of a single due date. If you are having difficulty with an assignment, please chat with the teaching team as soon as possible. Additionally, if you need more time to get something done, please chat with me sooner rather than later. I almost always say yes to extensions (so long as it doesn’t cause problems for the TAs who have their own schedule of due dates).

Assignments – 50%

  1. Due between September 7th at 8:30 a.m. & December 1st at 4:30 p.m. [10%]
  2. Due between September 29th at 8:30 a.m. & October 13th at 4:30 p.m. [10%]
  3. Due between October 20th at 8:30 a.m. & November 3rd at 4:30 p.m. [10%]
  4. Due between November 20th at 8:30 a.m. & December 1st at 4:30 p.m. [10%]
  5. Due December 1st at 4:30 p.m. [10%]

Note: Assignment 1 will be completed individually. All other assignments will be team contributions.

Quizzes – 25%

  1. Due between September 11th at 8:30 a.m. & September 15th at 4:30 p.m. [5%]
  2. Due between October 2nd at 8:30 a.m. & December 1st at 4:30 p.m. [10%]
  3. Due between October 16th at 8:30 a.m. & December 1st at 4:30 p.m. [10%]

Note: All quizzes (except for quiz 1) will be open until the last day of class (Friday, December 1st, 2023) at 4:30 p.m. Unless otherwise indicated, quizzes are not to be completed as a team. The quizzes will not be re-opened after December 1st at 4:30 p.m. All quizzes can be found on Moodle.

Labs – 25%

  1. Lab 2 is due between September 25th at 2:30 p.m. & October 20th at 4:30 p.m. (extended from October 13th) [5%]
  2. Lab 3 is due between October 2nd at 2:30 p.m. & October 27th at 4:30 p.m. (extended from October 20th) [5%]
  3. Lab 5 is due between October 23rd at 2:30 p.m. & November 17th at 4:30 p.m. [5%]
  4. Lab 7 is due between November 6th at 2:30 p.m. & November 24th at 4:30 p.m. [5%]
  5. Lab 8 is due between November 13th at 2:30 p.m. & December 1st at 4:30 p.m. [5%]

Note: While there are 8 labs, you will be graded on only 5 of them (labs 2, 3, 5, 7, and 8). Some of the labs will be completed with your team; others will be completed individually.
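
To see how the weights above combine into a final grade, here is a minimal sketch in Python. The weights are taken from the breakdown above; the function name `final_grade` and the example scores are hypothetical and used only for illustration. This is not the official grade calculation used by the teaching team.

```python
# Weights (as fractions of the final grade) taken from the breakdown above.
WEIGHTS = {
    "assignments": [0.10, 0.10, 0.10, 0.10, 0.10],  # 5 assignments, 50% total
    "quizzes": [0.05, 0.10, 0.10],                  # 3 quizzes, 25% total
    "labs": [0.05, 0.05, 0.05, 0.05, 0.05],         # labs 2, 3, 5, 7, 8; 25% total
}


def final_grade(scores):
    """Return the weighted final grade (0-100) from per-deliverable percent scores."""
    total = 0.0
    for category, weights in WEIGHTS.items():
        marks = scores[category]
        if len(marks) != len(weights):
            raise ValueError(f"expected {len(weights)} scores for {category}")
        # Each deliverable contributes its weight times its percent score.
        total += sum(w * m for w, m in zip(weights, marks))
    return total


# Hypothetical scores (percent), invented purely for illustration.
example_scores = {
    "assignments": [80, 90, 70, 85, 75],
    "quizzes": [100, 80, 90],
    "labs": [100, 100, 80, 90, 70],
}
print(round(final_grade(example_scores), 2))
```

With these made-up scores, the assignments contribute 40 points, the quizzes 22, and the labs 22, for a final grade of 84.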

What happens if something is missed or late?

Missed Labs: If you are going to miss a lab, please let your team know (and if possible – me too). Together we should be able to work around your absence.

Missed Assessments: If you can’t complete an assignment, quiz, or lab due to medical, psychological, or compassionate reasons, please chat with me.

Accommodation of Religious Obligations: If you are unable to meet an in-course requirement due to religious obligations, please let me know within two weeks of the start of the semester so that we can make alternate arrangements.

Late Deliverables: We will begin grading course deliverables shortly after the last moment they are due. If you have not submitted a course deliverable on time, it will be considered late. You may submit any course deliverable late (with the exception of in-class lab demos and quizzes, which are open until December 1st at 4:30 pm). However, it may take a bit of time for us to provide feedback on late work, as we may need to prioritize other obligations to the course. If you think you might need an extension, please chat with me before the last due date of the deliverable. I will almost always grant an extension – but this will also depend on the availability of the teaching team to provide feedback. Whatever the case, we will work to identify a new due date so that you can be successful in the course. If something comes up suddenly and you are unable to complete a deliverable, please reach out to me as soon as possible so that we can determine options for you to complete the coursework.

Regrades: If you feel your assignment has been graded incorrectly, please present your case (via email) to the instructor. Be specific about what you believe was graded incorrectly. Any material submitted for a regrade will be regraded in its entirety, which could result in your grade being reduced.

What will we learn during lectures?

Week | Topics Covered (order/content may vary) | Links To Relevant Chapters | Links To Slide Decks | Learning Outcomes
Intro | Introduction to CIS4020 | Welcome to CIS4020; What is Community-Engaged Learning?; Setting Expectations; Asking Good Questions | |
1 | The Data Science Process; Exploring Data | The Data Science Process; Exploring Data | The Data Science Process; Exploring Data | 1, 3, 4
2 | Preliminary Data Analysis; Statistical Distributions | Preliminary Data Analysis; Statistical Distributions | Preliminary Data Analysis; Statistical Distributions | 1, 3, 4, 5
3 | Hypothesis Testing; Confidence Intervals | Hypothesis Testing; Confidence Intervals | Hypothesis Testing; Confidence Intervals | 5
4 | Sampling Distributions; Simple Linear Regression; Multiple Linear Regression | Simple Linear Regression; Multiple Linear Regression | Sampling Distributions; Simple & Multiple Linear Regression | 5
5 | Logistic Regression | Logistic Regression | Logistic Regression | 5
6 | Poisson Regression; K Nearest Neighbours | Poisson Regression; K Nearest Neighbours | Poisson Regression; K Nearest Neighbours | 5
7 | Naive Bayes Classifier; Support Vector Machines | Naive Bayes Classifier; Support Vector Machines | Naive Bayes Classifier; Support Vector Machines | 5
8 | Decision Trees; Neural Networks | Decision Trees; Neural Networks | Decision Trees; Neural Networks | 5
9 | K Means; Mean-Shift Clustering | K Means; Mean-Shift Clustering | K Means; Mean-Shift Clustering | 5
10 | Gaussian Mixture Models; Agglomerative Hierarchical Clustering | Gaussian Mixture Models; Agglomerative Hierarchical Clustering | Gaussian Mixture Models; Agglomerative Hierarchical Clustering | 5
11 | In Class Presentations | In Class Presentations | | 6, 7
12 | In Class Presentations | | | 6, 7

What will we learn during labs?

Week | Topics Covered (order/content may vary) | Links To Relevant Chapter | Links To Slide Deck | Learning Outcomes
Intro | No Lab | | |
1 | No Lab | | |
2 | Lab 1: Ethics & Do No Harm | Lab 1: Ethics & Do No Harm | Lab 1: Ethics & Do No Harm | 1
3 | Lab 2: Literature Reviews | Lab 2: Literature Reviews | Lab 2: Literature Reviews | 1, 5
4 | Lab 3: Types of Data & Simulations | Lab 3: Types of Data & Simulations | Lab 3: Types of Data & Simulations | 2, 3, 4, 5, 6
5 | No Lab | | |
6 | Lab 4: Data Visualization | Lab 4: Data Visualization | Lab 4: Data Visualization | 4, 6, 7
7 | Lab 5: Storyboards & Dashboards | Lab 5: Storyboards & Dashboards | Lab 5: Storyboards & Dashboards | 6, 7
8 | Lab 6: Science Communication | Lab 6: Science Communication | Lab 6: Science Communication | 6, 7
9 | Lab 7: Critiquing Science Communication | Lab 7: Critiquing Science Communication | Lab 7: Critiquing Science Communication | 6, 7
10 | Lab 8: Critical Reviews | Lab 8: Critical Reviews | Lab 8: Critical Reviews | 6, 7
11 | No Lab | | |
12 | No Lab | | |

What if we aren’t feeling well?

While your health and wellness are always important, they are particularly important in this class and to me. I want you to put yourself first this semester. We need to do whatever we can to support each other, as well as our family, friends, and community. With that in mind, we need to work together, practice patience and empathy, and remain honest about our needs. Only then can we foster and promote a safe, supportive environment, as well as good physical, emotional, spiritual, cultural, and mental health and wellness for everyone.

“If you are sick, heartbroken, or exhausted, get rest, reach out to someone, and take whatever steps necessary to get well. Work is not more important than your health.”

-Dr. Max Liboiron

If you are experiencing any challenges, please do not hesitate to contact me, and know that there are resources on campus set up to help you out.

  • Medical concerns? Student Health Services at x52131
  • Threats of violence, personal safety? Campus police at x2000
  • Psychological or emotional concerns? Counselling services at x53244
  • Accessibility concerns? SAS at x56208
  • Sexual assault? Campus police at x2000, or counselling services at x53244
  • Mental Health concerns? Please see the Mental Health Resources page here.

Other sources of help can be found at the following links:

  • Student Wellness, Monday to Friday, 8:30 am-4:30 pm, x52131, J.T. Powell Building
  • Counselling Services, Monday to Friday, 8:15 am-4:15 pm, x53244, Level 3, University Centre
  • Campus Safety Office, 24/7, x2000, Trent Building
  • Good2Talk, 1.866.925.5454
  • Here 24/7, 1.844.437.3427

License

Community Engaged Data Science Copyright © 2023 by Daniel Gillis. All Rights Reserved.
