Syllabus - Sta-209 - Sections 01 and 03 (Spring 2025)

Course Information

Instructor:

  • Ryan Miller, Noyce 2218, millerry@grinnell.edu

Class Meetings:

  • Noyce 2401
    • Sec 01 meets from 8:30-9:50am
    • Sec 03 meets from 10-11:20am

Office Hours:

  • Drop-in hours (Noyce 2218): Monday 1:30-2:30pm, Thursday 2:30-3:30pm, Friday 2:30-3:30pm
    • Additional availability by appointment

Course Mentors:

  • Hannah Kim (Sec-01) kimhanna3@grinnell.edu
  • Elene Sturua (Sec-03) sturuael@grinnell.edu
    • Each mentor will host a weekly help session, and you are welcome to attend either mentor’s session. Dates and times for these sessions will be posted when available.

Course Description:

The course covers the application of basic statistical methods such as univariate graphics and summary statistics, basic statistical inference for one and two samples, linear regression (simple and multiple), one- and two-way ANOVA, and categorical data analysis. Students use statistical software to analyze data and conduct simulations. A student who takes Statistics 209 cannot receive credit for Mathematics 115 or Social Studies 115. Prerequisite: Mathematics 124 or 131.

Texts:

There is no required textbook for the course.

Select homework exercises, supplemental readings, and other materials will be drawn from the following sources:

  1. Introduction to Modern Statistics (2nd edition) by Mine Çetinkaya-Rundel and Johanna Hardin. A free version is available at: https://openintro-ims.netlify.app/
  2. Hands on Programming with R by Garrett Grolemund. A free version is available at: https://rstudio-education.github.io/hopr/index.html
  3. Statistics: Unlocking the Power of Data by Lock, Lock, Lock, Lock, and Lock. There are no free versions of this book, but we will use some of the free online resources found here: https://www.lock5stat.com/

Website:

The official course schedule and all course materials will be posted on the following website:

All assignment submissions, grades, and assignment feedback will be managed through P-web, https://pioneerweb.grinnell.edu/, unless other instructions are given.

Aims and Objectives

This course aims to develop in students informed critical and theoretical perspectives on the impacts of data collection and analysis, including the social construction of data production, and the use of algorithmic techniques to process that data.

Put simply, the goal of this course is to prepare students to independently analyze data using justifiable statistical methods while understanding both the strengths and limitations of the data and the analytic methods used.

Learning Objectives

After completing this course, students should be able to:

  1. Utilize data visualization, descriptive statistics, regression, and statistical inference to derive meaningful insights from data.
  2. Correctly apply the statistical methods of hypothesis testing and confidence interval estimation to quantify the presence of variability in data.
  3. Use the R programming environment to create data visualizations and perform basic statistical analyses.
  4. Clearly and concisely communicate findings to statistical and non-statistical audiences.

Policies

Class Sessions

We aim to devote at least 50% of in-class time to “lab”, which involves working in a small group (2-3 students) through tutorials and exercises that analyze data using R. For the first half of the semester, lab groups will be assigned; for the second half, you will work on labs with the members of your class project group.

Content from these labs will appear on exams, so you are responsible for ensuring that you understand the core concepts introduced in each lab. Labs are intended to be worked on collaboratively, and use of “divide and conquer” approaches to answering the lab’s questions may negatively impact your grade.

Attendance

This course involves substantial collaboration, and absences affect not only you but also your classmates, especially when an absence is not communicated in advance. However, I understand that missing class is sometimes necessary. If you will be absent for any reason, please notify me as soon as possible. Showing up late or missing class without notice will negatively impact the “engagement and participation” portion of your final grade.

Late Work

Assignments are generally due at 11:59pm on the posted due date. All deadlines have an automatic 48-hour “partial extension” during which I will accept your submission with a penalty of no more than 10% (i.e., a penalty between 0% and 10% depending upon the circumstances and frequency of late work). After 48 hours, late work may still be accepted on a case-by-case basis, potentially subject to a penalty greater than 10%. Special exceptions involving individual circumstances and unexpected events may be allowed, but they should be arranged as far in advance of an assignment’s deadline as possible, and/or coordinated with academic support staff.

Software

Software is an essential tool of statisticians and will play an important role in this course. We will primarily use R, an open-source statistical software program created by and for statisticians. You will also be expected to write, document, and submit code used on assignments and a course project. You will not be expected to write any code on exams, but you will be expected to interpret code and output that is provided.

You are welcome to use your own personal laptop or a Grinnell College laptop. R is freely available, and you can download it and its UI companion, R Studio, here (note: R must be downloaded and installed before R Studio):

  1. Download R from http://www.r-project.org/
  2. Download R Studio from http://www.rstudio.com/

You may also work on a classroom laptop, all of which will have R and R Studio pre-installed.

Finally, Grinnell hosts an online version of R Studio that you may use while on campus internet: https://rstudio.grinnell.edu/

Academic Honesty

At Grinnell College you are part of a conversation among scholars, professors, and students, one that helps sustain both the intellectual community here and the larger world of thinkers, researchers, and writers. The tests you take, the research you do, the writing you submit: all these are ways you participate in this conversation.

The College presumes that your work for any course is your own contribution to that scholarly conversation, and it expects you to take responsibility for that contribution. That is, you should strive to present ideas and data fairly and accurately, indicate what is your own work, and acknowledge what you have derived from others. This care permits other members of the community to trace the evolution of ideas and check claims for accuracy.

Failure to live up to this expectation constitutes academic dishonesty. Academic dishonesty is misrepresenting someone’s intellectual effort as your own. Within the context of a course, it also can include misrepresenting your own work as produced for that class when in fact it was produced for some other purpose. Additional information can be found here.

Inclusive Classroom

Grinnell College makes reasonable accommodations for students with documented disabilities. To receive accommodations, students must provide documentation to the Coordinator for Disability Resources; information can be found here. If you plan on using accommodations in this course, you should speak with me as early as possible in the semester so that we can discuss ways to ensure your full participation in the course.

Religious Holidays

Grinnell College encourages students who plan to observe holy days that coincide with class meetings or assignment due dates to consult with their instructor in the first three weeks of classes so that they may reach a mutual understanding of how to meet both the terms of their religious observance and the requirements of the course.

Academic Support

If you have other needs not addressed in previous sections, please let me know soon so that we can work together for the best possible learning environment. In some cases, I will recommend consulting with the Academic Advising staff. They are an excellent resource for developing strategies for academic success and can connect you with other campus resources as well: http://www.grinnell.edu/about/offices-services/academic-advising. If I notice that you are encountering difficulty, in addition to communicating with you directly about it, I will also likely submit an academic alert via Academic Advising’s SAL portal. This reminds you of my concern, and it notifies the Academic Advising team and your adviser(s) so that they can reach out to you with additional offers of support.

Grading

Engagement and Participation - 5%

Participation in a lab-heavy course is absolutely critical. During labs you are expected to help your partner(s) learn the material (which goes beyond simply answering the lab questions), and your partner is expected to help further your understanding. Everyone will begin the semester with a baseline participation score of 80%, which will move up or down depending on my subjective assessment of your behavior during class. You can very quickly raise this score by helping your lab partner(s), and working diligently to understand course material during class. Alternatively, you can lower this score by skipping class, letting your lab partner(s) do most of the work, using your phone or surfing the web during class, etc. Reports from lab or project partners that you are not contributing equally to group efforts may also influence this score. If you are ever unsure of your participation standing, you can email me and I am happy to provide you an interim estimate.

Labs - 10%

In-class labs contain embedded questions that you and your lab partner(s) should answer together in a single document. Some lab questions will be scored for accuracy with feedback given, while others may be scored for effort/completion. Additionally, labs are to be completed collaboratively, and if it becomes clear that you and your partner(s) are using a “divide and conquer” approach to answering lab questions, your score on that assignment may be penalized.

Individual Homework - 20%

There will be approximately 10 homework assignments throughout the semester, generally with one assignment due each week (with some exceptions around exam dates). Homework is to be completed individually, and answers that are suspiciously similar may be reported as academic honesty violations. That said, I understand that you may want to discuss homework questions with your classmates. You are welcome to do this so long as your submitted answers are uniquely yours, and if you receive substantial help you acknowledge the contributions of anyone or anything (other than official course materials, instructors, and mentors) that substantively shaped your answers.

Exams (3x) - 45% in total (15% each)

There will be 3 exams throughout the semester, each covering 3-4 weeks of course content. Because our course content builds upon itself, these exams are incidentally cumulative, but the focus will always be on the most recent set of material. Exams are closed-notes, but you will be provided a formula page containing content I do not want you to memorize. You will be given the exam’s formula page as well as a practice version of the exam at least one week prior to the exam date.

  • Mastery Policy: My hope is that everyone thoroughly understands our course content on a schedule aligned with exam dates; however, I am more interested in everyone understanding this content when the course is over, not by a somewhat arbitrary exam date. Consequently, you will have the opportunity to take a replacement version of any or all of the course’s 3 exams during our assigned final exam period for a maximum score of 90% on each exam you re-take. Please note that the intent of this grade cap is to provide an incentive for studying for the original exam and not waiting until finals week. Replacement exams will be approximately the same duration and difficulty level as their counterparts. Anyone receiving a score lower than 90% on an exam is eligible to take the corresponding replacement exam. If you score lower on the replacement than on your original exam, the two scores will be averaged. The intent of this averaging policy is to provide an incentive for you to thoughtfully prepare for any replacement exams you opt to take during finals week rather than deciding to take them on a whim hoping you’ll get a higher score.
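
To make the Mastery Policy arithmetic concrete, here is a minimal sketch in Python (used only for illustration; it is not part of the course software). The function name `final_exam_score` is hypothetical, and the sketch assumes one reading of the policy: the 90% cap is applied to the replacement score first, a replacement that is lower than the original is averaged with it, and otherwise the (capped) replacement stands. If the policy is applied differently in practice, the official rule takes precedence over this sketch.

```python
def final_exam_score(original, replacement=None, cap=90.0):
    """Return the recorded exam score under one reading of the Mastery Policy.

    Assumptions (not stated explicitly in the syllabus):
      - the cap is applied to the replacement score before any comparison;
      - a student at or above the cap is not eligible, so the original stands.
    """
    # No replacement taken, or not eligible (original already at/above the cap).
    if replacement is None or original >= cap:
        return original

    capped = min(replacement, cap)

    # A lower replacement is averaged with the original score.
    if capped < original:
        return (original + capped) / 2

    # Otherwise the capped replacement becomes the recorded score.
    return capped


# Hypothetical scores, for illustration only.
print(final_exam_score(70, 95))  # replacement above the cap is recorded as 90
print(final_exam_score(80, 70))  # lower replacement: average of 80 and 70
print(final_exam_score(70, 85))  # higher replacement below the cap is kept
```

Under this reading, taking a replacement exam can lower a recorded score only when the replacement is worse than the original, and then only by half the difference.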

Project - 20%

The course project is intended to provide you an opportunity to perform your own statistical analysis on real-world data. The final product is a three-page written report accompanied by R code and documentation. You may work on this project individually, or in a group of two or three of your choosing. This project is intended to mirror the USCLAP Competition guidelines, and I encourage any interested students to prepare their project with a competition submission in mind.

There will be several project check-points throughout the semester, and a comprehensive description of the assignment will be made available later in the semester.
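
The category weights listed in this section (5% + 10% + 20% + 45% + 20% = 100%) combine into an overall percentage as a weighted average. The following is a minimal Python sketch of that calculation, for illustration only. The names `WEIGHTS` and `course_grade` are hypothetical, the example scores are made up, and the sketch does not cover late penalties or how percentages map to letter grades, neither of which is specified here.

```python
# Category weights taken from the Grading section (each exam counts 15%).
WEIGHTS = {
    "engagement": 0.05,
    "labs": 0.10,
    "homework": 0.20,
    "exam1": 0.15,
    "exam2": 0.15,
    "exam3": 0.15,
    "project": 0.20,
}


def course_grade(scores):
    """Weighted average of category scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores for a hypothetical student.
example = {
    "engagement": 90, "labs": 95, "homework": 88,
    "exam1": 75, "exam2": 82, "exam3": 90, "project": 85,
}
print(round(course_grade(example), 2))  # approximately 85.65
```

Because the three exams together carry 45% of the weight, improving a single exam score by 10 points moves the overall percentage by 1.5 points.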

Misc

Getting Help

In addition to visiting office hours and completing the recommended readings, there are many other ways in which you can find help on assignments and projects.

The Data Science and Social Inquiry Lab (DASIL) is staffed by mentors who are experienced in R programming and may be able to troubleshoot coding problems you are having.

The Grinnell Math Lab is located on the 2nd floor of Noyce Science Center in Room 2012 and offers drop-in statistics tutoring.

The online platform Stack Overflow is a useful resource for finding user-generated coding solutions to common R problems. Nearly all professionals have needed to “look up” a coding strategy on a site like Stack Overflow at some point in their career, and I have no problem with you doing the same on assignments or projects. However, if you make substantial use of a Stack Overflow answer (i.e., actually integrating lines of code written by someone else into your work, not just getting help identifying the right functions/arguments), the expectation is that you cite or acknowledge doing so.

Large Language Models

Large language models, such as ChatGPT, Microsoft Copilot, or Google Bard, can be useful tools for explaining and fixing errors in your R code, or for helping you understand example code in greater detail than might have been given. You are welcome to use these tools throughout the course; however, you are ultimately responsible for the accuracy of any work you submit. Relying upon a large language model to write for you is risky. The model may hallucinate inaccurate information or generate text that is superficial and lacking sufficient detail. I encourage you to read Professor Erik Simpson’s write-up on writing with LLMs to see some reasons why you shouldn’t lean too heavily on these technologies. Nevertheless, you’re welcome to use large language models in this course in the same way you’d use a website like Stack Overflow or a peer mentor.

Topic List

Consult the course website for a comprehensive list of topics. Below is a tentative list:

  • Exam 1 content:
    • Data visualizations, numerical summaries, contingency tables, confounding variables, regression, and sources of variability/standard error
  • Exam 2 content:
    • Confidence intervals, hypothesis testing, p-values, testing errors, hypothesis testing misconceptions, 1-sample and 2-sample Z/T tests
  • Exam 3 content:
    • Chi-squared tests, analysis of variance (ANOVA), inference for regression models, logistic regression (time-permitting)