# Ling250/450: Homework 4

**Due: Wednesday April 2, 11pm**

This homework gives you some practice with descriptive statistics and asks you to reflect on evidence and beliefs in the context of forming and testing hypotheses. This document is written in Markdown format (.md), which is stored as plain text but can be displayed with nice formatting. To view the Markdown file side-by-side with the rendered version, open the Markdown file in VS Code and use the shortcut ⌘+K then V on Mac or Ctrl+K then V on Windows.

You should submit your completed work **on Blackboard, as a PDF**. To convert your Markdown file to a PDF, you will need to install one extra extension in VS Code. Go to Settings > Extensions, search for "Markdown PDF", and install it. To convert to PDF, use the keyboard shortcut Shift+⌘+P on Mac or Shift+Ctrl+P on Windows to open the "Command Palette". In the Palette, search for the command "Markdown PDF: Export (pdf)" and run it. An HTML file will be created for just a moment while the export is running, and then your PDF should appear in the same folder as the Markdown file.

Here is a guide for writing in Markdown: https://www.markdownguide.org/basic-syntax/

## Q1
This question will have you investigate the vowels.csv dataset we've been using in class (see Blackboard for a download). Pretend you work at some sort of automatic speech recognition company. Your boss wants to know whether the vowel [i] and the vowel [ɪ] are actually different vowels, or if linguists are just making this stuff up. (Linguists would call these the "tense high front vowel" and "lax high front vowel", respectively.) Using **only descriptive statistics**, try to persuade your boss that these either are or are not the same vowel (we'll assume there's no right answer for the purposes of this question).

You should back up your argument with **at least one** measure of central tendency and **at least one** measure of variability per vowel. You should also produce at least one plot to visualize the data you want to show your boss. Include the R code that you used to manipulate the data and get these answers. **Hint**: you'll want to use the dplyr library.

## A1

## Q2
For this question, you will read the short article [The Perils of Post-Hockery](https://ruscio.pages.tcnj.edu/files/2016/08/Ruscio-1998-SI-Post-Hockery.pdf) and answer some questions about it.

- What is the phenomenon of the "hot hand" in basketball (as Ruscio describes it)? What contrast does Ruscio draw between the "common" belief about how the "hot hand" works, and the weaker, more "descriptive" conception of it? What piece of evidence does Ruscio highlight as being missing when players, coaches, and fans talk about someone having a "hot hand"?
- Why does Ruscio describe "computerized data analysis" as problematic when publishing scientific papers? What habits does it promote that interfere with rigorously testing hypotheses? (Keep in mind this was written in 1998.)
- Describe "confirmation bias" in your own terms. Do you have any low-stakes beliefs like zodiac signs, personality tests, or the Blackout Baby Boom that you know evidence probably doesn't support, but are more fun to believe in than not?
- Keeping in mind that some of the 10 questions in the "Hindsight Bias and Overconfidence" section are a little dated now, either do as the text suggests and try to come up with your "90% confidence interval" for each of these questions, or come up with your own similar question (which you don't know the exact answer to). What were your results, either on the 10 questions in the text or the one you came up with?
- In your own words, what is "post-hockery" as Ruscio is describing it in this paper?
- Why am I having you read this paper in a Data Science class, during our section on Hypothesis Testing? What is at least one useful takeaway from this paper that you can keep in mind when developing a hypothesis to test for your final project?

## A2
