FAD1015: Mathematics III — Tutorial 12

Centre for Foundation Studies in Science
Universiti Malaya
Session 2024/2025


Topic: Hypothesis Testing in R

Remember that hypothesis testing for a mean assumes that the sample mean X̄ follows a normal distribution. This holds if X itself follows a normal distribution, or approximately if the sample size is large enough (in practice, n > 30, thanks to the Central Limit Theorem).
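
The large-sample claim can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 40 and the number of repetitions are assumptions chosen for the demonstration, not part of the octopus data.

```r
# Hypothetical simulation: the population is exponential (strongly right-skewed),
# with mean 1 and standard deviation 1.
set.seed(1)
n    <- 40     # sample size, above the n > 30 rule of thumb
reps <- 5000   # number of simulated samples

# Draw `reps` samples of size n and keep the mean of each one
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# The CLT predicts that the sample means are roughly normal,
# centred at 1 with standard deviation 1 / sqrt(n)
mean(sample_means)
sd(sample_means)
1 / sqrt(n)
```

Calling hist(sample_means) afterwards should show a shape much closer to a bell curve than the exponential population itself.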


Case Study: Octopus Weight Analysis

We examine the weight of adult female octopuses fished off the coast of Mauritania. The data can be found here: http://tinyurl.com/yhyetsuw

We would like to estimate the mean weight and construct a 95% confidence interval for this mean.


Steps

Step 1: Read the data (given as OctopusF.txt)

# Read the octopus data
octopus <- read.table("OctopusF.txt", header = TRUE)

Step 2: Select the female octopuses only (remove the males)

octF <- subset(octopus, Sexe == "F")
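
Steps 1 and 2 can be rehearsed without the real file. The sketch below writes a tiny made-up data set to a temporary file and then reads and subsets it. The five rows are invented; only the column names Sexe and weight are taken from the commands in this tutorial.

```r
# Hypothetical stand-in for OctopusF.txt (values are made up for illustration)
demo <- data.frame(Sexe   = c("F", "M", "F", "F", "M"),
                   weight = c(300, 1200, 640, 980, 2100))
path <- tempfile(fileext = ".txt")
write.table(demo, path, row.names = FALSE, quote = FALSE)

# header = TRUE tells read.table that the first line holds the column names
octopus_demo <- read.table(path, header = TRUE)
str(octopus_demo)            # check column names and types
table(octopus_demo$Sexe)     # count rows per sex before subsetting

# Keep only the rows where Sexe is "F"
octF_demo <- subset(octopus_demo, Sexe == "F")
nrow(octF_demo)
```

Checking str() first is worthwhile: if a column contained only the letter F, read.table may read it as the logical value FALSE rather than as text, and Sexe == "F" would then match nothing. Passing colClasses to read.table is one way to force the column type.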

Step 3: Find the summary statistics

summary(octF)

Step 4: Assess normality

From the summary statistics, can you tell if the data is normally distributed or not?
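
One way to approach this question is to compare the mean with the median and to look at how the quartiles sit around the median. The sketch below uses simulated right-skewed values as a stand-in for octF$weight. The gamma distribution and its parameters are assumptions made for illustration only.

```r
# Hypothetical right-skewed weights standing in for octF$weight
set.seed(42)
w <- rgamma(500, shape = 2, scale = 300)

summary(w)

# For symmetric data the mean and the median are close.
# A mean clearly above the median suggests a right skew.
mean(w) - median(w)

# For symmetric data the quartiles lie at similar distances from the median.
q <- quantile(w, c(0.25, 0.50, 0.75))
unname(q[3] - q[2])   # distance from the median to the upper quartile
unname(q[2] - q[1])   # distance from the lower quartile to the median
```

Summary statistics only give a hint. They cannot confirm normality, which is why the graphical and formal checks in the next steps are useful.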

Step 5: Construct a histogram

hist(octF$weight)

Step 6: Check for normality

Do you think this is necessary? Name a few ways to check normality:

  • Shapiro-Wilk test
  • Q-Q plot (quantile-quantile plot), a graphical check rather than a formal test

# Q-Q Plot
qqnorm(octF$weight)
qqline(octF$weight)

# Shapiro-Wilk test
shapiro.test(octF$weight)
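
To practise reading the output of shapiro.test, it can help to run it on data whose shape is known in advance. In the sketch below both samples are simulated, and the sample sizes and parameters are assumptions rather than the octopus data.

```r
set.seed(7)
normal_sample <- rnorm(100, mean = 640, sd = 150)   # hypothetical, normal by construction
skewed_sample <- rexp(100, rate = 1 / 640)          # hypothetical, strongly right-skewed

sw_normal <- shapiro.test(normal_sample)
sw_skewed <- shapiro.test(skewed_sample)

# H0 of the Shapiro-Wilk test: the data come from a normal distribution.
# A small p-value (for example below 0.05) is evidence against normality.
sw_normal$p.value
sw_skewed$p.value

# The W statistic is close to 1 when the data look normal
sw_normal$statistic
sw_skewed$statistic
```

Note that a large p-value does not prove normality. It only means the test found no evidence against it.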

Step 7: State hypotheses

Assuming the normality assumption holds, test whether the mean weight of female octopuses is greater than 640.

State the null and alternative hypothesis:

  • H₀: μ = 640 (or μ ≤ 640)
  • H₁: μ > 640
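
For these hypotheses the test statistic is t = (x̄ − μ₀) / (s / √n), compared with a t-distribution with n − 1 degrees of freedom. The sketch below computes it by hand and compares it with the output of t.test. The sample w is simulated and purely hypothetical.

```r
set.seed(123)
w   <- rnorm(50, mean = 660, sd = 120)   # hypothetical weights, not the real data
mu0 <- 640                               # value of the mean under H0

n      <- length(w)
t_stat <- (mean(w) - mu0) / (sd(w) / sqrt(n))

# Upper-tail p-value, because H1 is mu > 640
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# The same quantities as reported by t.test
res <- t.test(w, mu = mu0, alternative = "greater")
c(manual = t_stat,    t.test = unname(res$statistic))
c(manual = p_greater, t.test = res$p.value)
```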

Step 8: Read about t.test function

Use help(t.test) to read about the details of the t.test function.

help(t.test)

Step 9: Perform t-tests

Run the following t.test commands (in practice σ is unknown) and observe their output. Discuss the differences between the commands.

# Extract the weights once for reuse
weightF <- octF$weight

# Default call: two-sided test with mu = 0
t.test(weightF)

# Two-sided test with specified mean
t.test(weightF, mu = 640)

# One-tailed test (greater)
t.test(weightF, mu = 640, alternative = "greater")

# One-tailed test (less)
t.test(weightF, mu = 640, alternative = "less")

Discussion Points:

  • What does each variant test?
  • How do the p-values differ?
  • When would you use each alternative hypothesis?

Step 10: Draw conclusions

Referring to Step 7, what conclusion can you reach based on the t-test results?
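
A conclusion can be reached either by comparing the p-value with the significance level or by comparing the t statistic with a critical value. Both rules lead to the same decision. The sketch below assumes a significance level of 0.05 and a simulated sample; neither is given in the tutorial.

```r
set.seed(10)
w     <- rnorm(35, mean = 680, sd = 110)   # hypothetical weights
alpha <- 0.05                              # assumed significance level

res <- t.test(w, mu = 640, alternative = "greater")

# Rule 1: reject H0 when the p-value is below alpha
reject_by_p <- res$p.value < alpha

# Rule 2: reject H0 when the t statistic exceeds the upper critical value
t_crit         <- qt(1 - alpha, df = length(w) - 1)
reject_by_crit <- unname(res$statistic) > t_crit

if (reject_by_p) {
  message("Reject H0: there is evidence that the mean exceeds 640")
} else {
  message("Do not reject H0: no evidence that the mean exceeds 640")
}
```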

Step 11: Confidence interval

Find the 95% confidence interval for the population mean (σ is unknown). Observe the result.
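
With σ unknown, the interval is x̄ ± t × s / √n, where t is the 0.975 quantile of the t-distribution with n − 1 degrees of freedom. The sketch below builds it by hand on a simulated, hypothetical sample and compares it with the interval reported by t.test.

```r
set.seed(99)
w <- rnorm(40, mean = 650, sd = 130)   # hypothetical weights

n      <- length(w)
se     <- sd(w) / sqrt(n)              # estimated standard error of the mean
t_crit <- qt(0.975, df = n - 1)        # two-sided 95% critical value

ci_manual <- mean(w) + c(-1, 1) * t_crit * se

# t.test reports the same interval when the alternative is two-sided
ci_ttest <- t.test(w, conf.level = 0.95)$conf.int

ci_manual
ci_ttest
```

With alternative = "greater" or "less", t.test returns a one-sided interval instead, so the two-sided call is the one to use here.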

Step 6 shows that the data do not follow a normal distribution.

Questions to consider:

  • Do you think the results from the t-test are valid when normality is violated?
  • What is an alternative to the t-test when the data are not normally distributed?
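
One commonly used alternative for a single sample is the Wilcoxon signed-rank test, available in R as wilcox.test. The sketch below is a minimal illustration on a simulated small sample. The data are hypothetical, and the test itself assumes that the distribution is symmetric about its centre.

```r
set.seed(5)
# Hypothetical sample: symmetric but heavy-tailed, with n below 30
w <- 640 + 100 * rt(25, df = 3)

# One-sample Wilcoxon signed-rank test of a centre of 640 against "greater"
res <- wilcox.test(w, mu = 640, alternative = "greater")

res$statistic   # V: sum of the ranks of the positive differences w - 640
res$p.value
```

The call mirrors t.test(w, mu = 640, alternative = "greater"), so the two results can be compared side by side.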

Additional Notes

When to use t-test vs alternatives:

Condition                            Test to Use
Normal distribution, σ unknown       One-sample t-test
Non-normal, large sample (n > 30)    t-test (CLT applies)
Non-normal, small sample             Wilcoxon Signed-Rank Test (one sample or paired data);
                                     Mann-Whitney U Test (two independent groups)
Paired data                          Paired t-test
Two independent groups               Two-sample t-test
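
The last two rows differ only in how the two sets of measurements are related. The sketch below uses simulated, hypothetical before/after measurements to show that a paired t-test is the same as a one-sample t-test on the differences, while the default two-sample call treats the groups as independent.

```r
set.seed(11)
# Hypothetical paired measurements on the same 20 individuals
before <- rnorm(20, mean = 640, sd = 100)
after  <- before + rnorm(20, mean = 15, sd = 30)

paired_res <- t.test(after, before, paired = TRUE)
diff_res   <- t.test(after - before, mu = 0)   # one-sample test on the differences
indep_res  <- t.test(after, before)            # two-sample test (Welch by default)

paired_res$p.value
diff_res$p.value
indep_res$p.value
```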

R Functions Reference

Function          Purpose
t.test()          Performs one/two sample t-tests
shapiro.test()    Shapiro-Wilk normality test
qqnorm()          Creates Q-Q plot
qqline()          Adds reference line to Q-Q plot
hist()            Creates histogram
summary()         Summary statistics
subset()          Select subset of data

Related Concepts

  • Hypothesis Testing: overview of the statistical hypothesis testing framework
  • T-Test: test for a mean when the population variance is unknown
  • Null Hypothesis: default assumption to be tested
  • Alternative Hypothesis: claim tested against the null
  • P-Value: probability, under the null hypothesis, of a test statistic at least as extreme as the one observed
  • Confidence Interval: range of plausible values for a parameter
  • Shapiro-Wilk Test: test for normality
  • Q-Q Plot: graphical check for normality
  • Central Limit Theorem: basis for large-sample inference
  • Non-Parametric Tests: tests without a normality assumption
  • Probability Distributions: t-distribution and normal distribution


Source: FAD1015 25-26 Tutorial 12 Questions.pdf