10 Matching Annotations
  1. Jan 2020
    1. \begin{equation} p \text{-value} = P_{H_0}\left[ \lvert \overline{Y} - \mu_{Y,0} \rvert > \lvert \overline{Y}^{act} - \mu_{Y,0} \rvert \right] \tag{3.2} \end{equation} In (3.2), $\overline{Y}^{act}$ is the mean of the sample actually computed. Consequently, in order to compute the $p$-value as in (3.2), knowledge about the sampling distribution of $\overline{Y}$ when the null hypothesis is true is required. However, in most cases the sampling distribution of $\overline{Y}$ is unknown. Fortunately, as stated by the CLT (see Key Concept 2.7), the large-sample approximation $$\overline{Y} \approx \mathcal{N}\left(\mu_{Y,0}, \, \sigma^2_{\overline{Y}}\right), \quad \sigma^2_{\overline{Y}} = \frac{\sigma_Y^2}{n}$$ can be made under the null. Thus, $$\frac{\overline{Y} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}} \sim \mathcal{N}(0,1).$$ So in large samples, the $p$-value can be computed without knowledge of the exact sampling distribution of $\overline{Y}$. Calculating the p-Value when the Standard Deviation is Known. For now, let us assume that $\sigma_{\overline{Y}}$ is known. Then, we can rewrite (3.2) as \begin{align} p \text{-value} =& \, P_{H_0}\left[ \left\lvert \frac{\overline{Y} - \mu_{Y,0}}{\sigma_{\overline{Y}}} \right\rvert > \left\lvert \frac{\overline{Y}^{act} - \mu_{Y,0}}{\sigma_{\overline{Y}}} \right\rvert \right] \\ =& \, 2 \cdot \Phi \left[ - \left\lvert \frac{\overline{Y}^{act} - \mu_{Y,0}}{\sigma_{\overline{Y}}} \right\rvert \right]. \tag{3.3} \end{align} The $p$-value can be seen as the area in the tails of the $\mathcal{N}(0,1)$ distribution that lies beyond $\pm \left\lvert \frac{\overline{Y}^{act} - \mu_{Y,0}}{\sigma_{\overline{Y}}} \right\rvert$

      This part is very confusing
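
      Since the quoted passage compresses several steps, a small numerical sketch may help. The sample, the hypothesized mean, and the (assumed known) standard deviation below are invented for illustration and are not taken from the book; the aim is only to show how equation (3.3) maps onto a couple of lines of R.

      ```r
      # hypothetical sample: 100 draws from a normal population (values invented for illustration)
      set.seed(1)
      y     <- rnorm(100, mean = 5.1, sd = 2)   # observed data
      mu_0  <- 5                                # hypothesized mean under H_0
      sigma <- 2                                # population standard deviation, assumed known here

      # standardized statistic: (Ybar_act - mu_0) / (sigma / sqrt(n))
      t_stat <- (mean(y) - mu_0) / (sigma / sqrt(length(y)))

      # equation (3.3): p-value = 2 * Phi(-|t_stat|), with Phi the standard normal CDF
      2 * pnorm(-abs(t_stat))
      ```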

  2. Dec 2019
    1. distribution of the sample mean $\overline{Y}$ of the Bernoulli distributed random variables $Y_i$, $i=1,\dots,n$, is well approximated by the normal distribution with parameters $\mu_Y = p = 0.5$ and $\sigma^2_Y = p(1-p) = 0.25$ for large

      Is it correct to state that the sample mean is well approximated by the normal distribution with parameters p = 0.5 and sigma squared = 0.25 for large n? Shouldn't it be sigma squared = 0.25/n? (That is, is there a "divided by n" missing?)
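
      A quick simulation can check the point raised here. The sketch below uses arbitrary values for the sample size and the number of replications; it compares the empirical variance of the sample mean of Bernoulli(0.5) draws with p(1 - p) and with p(1 - p)/n.

      ```r
      # variance of the sample mean of Bernoulli(p = 0.5) draws
      set.seed(123)
      n    <- 100      # sample size (arbitrary choice)
      reps <- 10000    # number of simulated samples (arbitrary choice)
      p    <- 0.5

      # sample mean for each replication
      y_bar <- replicate(reps, mean(rbinom(n, size = 1, prob = p)))

      var(y_bar)        # close to p * (1 - p) / n = 0.0025
      p * (1 - p)       # 0.25: the variance of a single Y_i, not of the sample mean
      p * (1 - p) / n   # 0.0025
      ```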

  3. Nov 2019
    1. By setting the argument lower.tail to TRUE we ensure that R computes $1 - P(Y \leq 2)$, i.e., the probability mass in the tail right of $2$.

      I think the correct explanation, according to the R help for this function, is: "lower.tail : logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x]."
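
      The difference is easy to verify in R. The binomial parameters below (size 10, probability 0.5) are assumed for illustration rather than taken from the book; the point is only how lower.tail behaves.

      ```r
      # lower.tail = TRUE (the default) gives P(Y <= 2)
      pbinom(2, size = 10, prob = 0.5)

      # lower.tail = FALSE gives P(Y > 2), the probability mass in the right tail
      pbinom(2, size = 10, prob = 0.5, lower.tail = FALSE)

      # equivalently: 1 - P(Y <= 2)
      1 - pbinom(2, size = 10, prob = 0.5)
      ```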

    2. # compute density at x=-1.96, x=0 and x=1.96

      Why is it relevant to compute the density at specific points (and not over intervals), if the probability that a continuous random variable takes one specific value is zero? I would recommend a note stating that this function is not very useful for computing probabilities of continuous variables, as probabilities can only be measured over intervals. I saw on a few websites that this function gives the height of the density at the chosen point; maybe this could also be noted.
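
      To make the distinction concrete, the short sketch below (standard normal, evaluation points taken from the quoted comment) contrasts the density height returned by dnorm() with an interval probability obtained from pnorm().

      ```r
      # dnorm() returns the height of the density curve at a point, not a probability
      dnorm(c(-1.96, 0, 1.96))     # roughly 0.058, 0.399, 0.058

      # probabilities for a continuous variable come from intervals, e.g. P(-1.96 <= Z <= 1.96)
      pnorm(1.96) - pnorm(-1.96)   # roughly 0.95

      # the same interval probability, obtained by integrating the density
      integrate(dnorm, lower = -1.96, upper = 1.96)
      ```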