Project

Overview

The purpose of this project is to give you the chance to use data analysis skills to investigate a topic that is of interest to you. You’ll form groups of 3-5, decide on some questions that you can analyze using statistics, find or collect a suitable data set, conduct appropriate analysis on your data, and report your findings as a paper.

The paper will be due on the last day of class. Before then, you’ll be asked to submit a proposal and then an outline, as indicated on the calendar. Details about each of these components are given below.

All of the Gradescope submissions for the project are set up as “group assignments.” This means that one person in each group should upload the group’s document to Gradescope and add the other group members before submitting.

Topic Proposal [3 points]

Tell me who’s in your group (3-5 people) and what topic you want to investigate. This will be brief: a few sentences at most.

I encourage you to be creative and choose something that’s interesting to you! Maybe you want to understand what attributes of a CC student’s reading habits most influence their political opinions, or maybe you want to understand how a country’s economy relates to its performance in the Olympics, or… The possibilities are endless.

Here are many data sources if you need some inspiration, but don’t feel like you have to use something from this list; you can find data elsewhere, or you can collect it yourself. I do ask that the data you use for your project be raw data in a data frame (likely in a CSV file that you can import into RStudio like you’ve done in the labs, though other file formats are possible). Don’t use summary data (and in particular, note that a two-way table is summary data; it is not a data frame). If you’re not sure whether some data you’ve found is actually raw data in a data frame, feel free to ask.
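
To make the raw-versus-summary distinction concrete, here is a minimal sketch. It is written in Python rather than the R used in the labs, purely for illustration, and the survey data and column names (student, class_year, reads_news_daily) are made up. Raw data keeps one row per observation; a two-way table keeps only the counts for each combination of two variables.

```python
import csv
import io
from collections import Counter

# Hypothetical raw data: one row per surveyed student, one column per variable.
RAW_CSV = """student,class_year,reads_news_daily
1,first-year,yes
2,first-year,no
3,sophomore,yes
4,sophomore,yes
5,first-year,no
"""

# Raw data in a data frame: every observation keeps its own row.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# A two-way table is a summary: it keeps only the count for each
# combination of the two variables and discards the individual rows.
two_way = Counter((r["class_year"], r["reads_news_daily"]) for r in rows)

print(len(rows), "raw observations")
for (year, reads), count in sorted(two_way.items()):
    print(year, reads, count)
```

You can always build the two-way table from the raw rows, but you can’t go the other way and recover any additional variables from the counts, which is why the project needs the raw data.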

Outline [7 points]

The outline should be roughly one page, and can be written in bullet points. You want to include enough detail here that someone could use your outline as a blueprint for replicating your statistical analysis. For example, you should address things like the following:

Paper [25 points]

On the last day of class, each group will submit a paper. You should write as succinctly as possible: be efficient with words. Also, your paper might have the following components, but if you feel like it makes sense to structure your write-up a little differently, feel free!

Presentation [15 points]

On the last day of class, each group will give a roughly 10-minute presentation to the entire class. You’ll want to share what topic you investigated, what hypotheses you tested, tables and/or graphics summarizing the data you used, what statistical tests you ran, and what the results of your tests were. Everyone in the group should take an equal and active part in the presentation.