Course Syllabus

Welcome to Modern Data Analysis Techniques

This class is team taught by Miguel Morales (Physics) and Bryna Hazelton (eScience). Its goal is to introduce current techniques and best practices in the statistically rigorous analysis of large data sets. The class is organized around four themes: practical statistics, advanced data visualization, collaborative analysis code, and advanced data analysis practices.

Room

Everyone learns much better in person, so please come to class in person whenever possible. The room is a new space, B143, across from the SPS lounge. That said, many advanced students need to travel for research and covid is still around, so we will offer Zoom on demand and will endeavor to record the classes. Send Miguel an email if you want Zoom for a class, and this will be the link we use.

Office Hours

Miguel Morales, Wed 11:00-12:00, plus by appointment or opportunity. C325.

Bryna Hazelton, Mon 2:00-3:00 or by appointment. C-wing 6th floor (eScience Institute).

Grading

As a graduate elective, what you get out of the course largely depends on what you put into it. Further, this class is designed to scale depending on your interests and time. At one end, it is designed to provide motivated students with a firm grounding in advanced statistics and data analysis tools that can be used on a wide range of academic and professional problems. At the other end it is designed to serve as a low-pressure survey of modern analysis techniques. During the first week you will detail what your goals are, and your grade will be based on how well you achieve your goals. There will be no exams, with the homework and final project forming the basis of your grade. 

Syllabus

Themes:  Practical Statistics; Data Visualization; Collaborative Analysis; Advanced Data Analysis Practices

Week 1

Th:  Welcome; course overview; what does sigma mean? (slides)

Homework: Intro quiz

Week 2

T:  Introduction to git & GitHub  (video, slides)

Th:  Statistical building blocks (video, slides)

Homework: Homework #1 (git game)

Week 3

T: Data visualization pt. 1; (video, slides)

Th: Data visualization pt. 2, workshopping plots; (video, slides)

Homework:  Homework #2 (intro to stats)

Stats reference:  

Statistics cheat sheet

Online reference chart

Song paper

Wikipedia entries can be useful; look under ‘related distributions’.

Week 4

T: Statistical questions, trials factors, parameter distributions; (video, slides)

Th: Parameter distributions cont.; Fisher matrix; triangle plots; variable backgrounds; (video, slides)

Homework: Homework #3

Week 5

T: Statistically valid plots; jackknife tests; (video, slides)

Th:  Developing an analysis plan;  (video, slides)

Week 6

T:  Revisiting significance (video, slides)

Th:  Metadata, Provenance & Test Thickets;  (video, slides)

Week 7

T:  Confidence intervals;

Th:  Stats mini-review; the blob, analysis dragons; 

Week 8

T: Machine Learning Joys & Sorrows (slides)

Th:  Machine Learning case studies (slides)

Week 9

T: Blind & semi-blind analyses; data rampages

Th: Thanksgiving

Week 10

T Nov 28:  

Th Nov 30:  Hannah Boyer, Hannah Rarick, Al Snow, Ethan Hansen

Week 11

T Dec 5:  Xan McPherson, Adina Ripkin, Kieran Smout, Carl Thomas

Th Dec 7:  Ella Carlander, Henry Froland, Eli Lilleskov, Iman Fahmy

 

Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).
