
updating explanation of part I to address reasons for dropped observations

aaronshaw
2020-11-24 22:19:47 -06:00
parent 37924e44ba
commit c36ea661e7
3 changed files with 26 additions and 0 deletions

@@ -1723,6 +1723,18 @@ summary(model)</code></pre>
## Multiple R-squared: 0.719, Adjusted R-squared: 0.7108
## F-statistic: 87.01 on 4 and 136 DF, p-value: &lt; 2.2e-16</code></pre>
<p>What do you know. That was it. The difference in <span class="math inline">\(R^2\)</span> is huge!</p>
<p>A little further digging (by Nick Vincent) revealed that these two outliers come from auctions where the Mario Kart game was sold as part of a bundle with other games. You can confirm this by checking the <code>title</code> field of the original dataset with the following block of code:</p>
<pre class="r"><code>data(mariokart)
mariokart %&gt;%
filter(total_pr &gt; 100) %&gt;%
select(id, total_pr, title)</code></pre>
<pre><code>## # A tibble: 2 x 3
## id total_pr title
## &lt;dbl&gt; &lt;dbl&gt; &lt;fct&gt;
## 1 110439174663 327. &quot;Nintedo Wii Console Bundle Guitar Hero 5 Mario Kart &quot;
## 2 130335427560 118. &quot;10 Nintendo Wii Games - MarioKart Wii, SpiderMan 3, et…</code></pre>
<p>What do you make of the textbook authors' decision to drop the observations? Can you make a case for/against doing so? What seems like the right decision and the best way to handle this kind of situation?</p>
</div>
<div id="interpret-some-results" class="section level2">
<h2>Interpret some results</h2>

Binary file not shown.

@@ -124,6 +124,20 @@ summary(
What do you know. That was it. The difference in $R^2$ is huge!
A little further digging (by Nick Vincent) revealed that these two outliers come from auctions where the Mario Kart game was sold as part of a bundle with other games. You can confirm this by checking the `title` field of the original dataset with the following block of code:
```{r}
data(mariokart)
mariokart %>%
filter(total_pr > 100) %>%
select(id, total_pr, title)
```
What do you make of the textbook authors' decision to drop the observations? Can you make a case for/against doing so? What seems like the right decision and the best way to handle this kind of situation?
## Interpret some results
The issues above notwithstanding, we can march ahead and interpret the results of the original model that I fit. Here are some general comments and some specifically focused on the `cond_new` and `stock_photo` variables: