Sample mean and sample proportion are both statistics used to summarize data, but they serve different purposes and are calculated differently. Here’s a detailed comparison of the two:
1. Sample Mean
Definition:
The sample mean is the average of a set of numerical values. It provides a measure of central tendency, indicating where the center of the data lies.
Calculation:
The sample mean (\( \bar{x} \)) is calculated as:
\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\]
where:
- \( n \) = number of observations in the sample
- \( x_i \) = each individual observation
Characteristics:
- Data Type: Used for quantitative (continuous or discrete) data.
- Range: Can take any real number value, depending on the data.
- Sensitivity: The mean is sensitive to outliers (extreme values can significantly affect the mean).
- Distribution: For independent observations from a population with finite variance, the sampling distribution of the sample mean approaches a normal distribution as \( n \) increases (by the Central Limit Theorem), even if the original data are not normally distributed.
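The Central Limit Theorem behaviour described above can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 50, the 2000 repetitions, and the helper name `simulate_sample_means` are all assumptions chosen for the demonstration, not part of the original comparison.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Draw `num_samples` samples of size `n` from an Exp(1) population
    (a skewed, non-normal distribution) and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means(n=50, num_samples=2000)
center = statistics.fmean(means)   # theory predicts a value near 1.0
spread = statistics.stdev(means)   # theory predicts a value near 1/sqrt(50)

print(f"mean of sample means: {center:.3f}")
print(f"std of sample means:  {spread:.3f}")
```

Exp(1) has mean 1 and standard deviation 1, so the sample means should cluster around 1 with a spread close to \( 1/\sqrt{50} \approx 0.141 \), even though the underlying data are strongly skewed.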
Example:
If you have a sample of test scores: 85, 90, 75, 88, the sample mean would be:
\[
\bar{x} = \frac{85 + 90 + 75 + 88}{4} = \frac{338}{4} = 84.5
\]
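The same calculation can be written as a minimal Python sketch. The function name `sample_mean` and the extra score of 5 are assumptions introduced for illustration; the added score is there only to show how a single extreme value pulls the mean.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the observations divided by their count."""
    if not values:
        raise ValueError("sample_mean requires at least one observation")
    return sum(values) / len(values)


scores = [85, 90, 75, 88]
print(sample_mean(scores))  # 84.5

# Hypothetical extra score, added only to illustrate outlier sensitivity.
scores_with_outlier = scores + [5]
print(sample_mean(scores_with_outlier))  # 68.6
```

One unusually low score moves the mean from 84.5 to 68.6, which is the outlier sensitivity noted in the characteristics.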
2. Sample Proportion
Definition:
The sample proportion is the ratio of the number of successes in a sample to the total number of observations in that sample. It is particularly useful when you’re dealing with categorical data.
Calculation:
The sample proportion (\( \hat{p} \)) is calculated as:
\[
\hat{p} = \frac{x}{n}
\]
where:
- \( x \) = number of successes (the count of a certain category or outcome)
- \( n \) = total number of observations in the sample
Characteristics:
- Data Type: Used for categorical (qualitative) data where outcomes can be categorized (e.g., yes/no, success/failure).
- Range: Values range from 0 to 1 (or, expressed as a percentage, from 0% to 100%).
- Interpretation: Represents the observed fraction of successes in the sample, which serves as an estimate of the population proportion \( p \).
- Distribution: For large samples, the sampling distribution of the sample proportion is approximately normal if certain conditions are met. A common rule of thumb requires both \( np \) and \( n(1-p) \) to be greater than 5, and some texts use the stricter threshold of 10.
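The rule of thumb for the normal approximation can be expressed as a small check. This is a sketch under stated assumptions: the function name `normal_approx_ok` and its `threshold` parameter are hypothetical, and in practice the unknown population proportion \( p \) is usually replaced by a hypothesized value or by \( \hat{p} \).

```python
def normal_approx_ok(n, p, threshold=5):
    """Check the rule of thumb: both n*p and n*(1-p) must exceed `threshold`."""
    if n <= 0 or not 0 <= p <= 1:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return n * p > threshold and n * (1 - p) > threshold


print(normal_approx_ok(100, 0.4))  # True: 40 and 60 both exceed 5
print(normal_approx_ok(20, 0.1))   # False: n*p is only 2
```

Passing `threshold=10` applies the stricter version of the rule.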
Example:
If you conducted a survey of 100 people, and 40 of them said they preferred brand A, the sample proportion would be:
\[
\hat{p} = \frac{40}{100} = 0.40 \text{ (or 40\%)}
\]
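The survey example can be reproduced with a short Python sketch. The function name `sample_proportion` and the labels `"A"` and `"other"` are assumptions made for illustration; the survey itself reports only the counts. The last lines also show that the proportion is simply the mean of 0/1 indicator values, which is one way to see how the two statistics relate.

```python
def sample_proportion(outcomes, success):
    """Return the fraction of `outcomes` that equal `success`."""
    if not outcomes:
        raise ValueError("sample_proportion requires at least one observation")
    successes = sum(1 for outcome in outcomes if outcome == success)
    return successes / len(outcomes)


# Hypothetical encoding of the survey: 40 people chose brand A, 60 did not.
responses = ["A"] * 40 + ["other"] * 60
p_hat = sample_proportion(responses, "A")
print(p_hat)  # 0.4

# Coding each response as 1 (success) or 0 (failure) and averaging
# gives the same value.
indicators = [1 if r == "A" else 0 for r in responses]
indicator_mean = sum(indicators) / len(indicators)
print(indicator_mean)  # 0.4
```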
Summary
Feature | Sample Mean | Sample Proportion |
---|---|---|
Definition | Average of numerical data | Ratio of successes in categorical data |
Data Type | Quantitative | Categorical |
Calculation | \(\bar{x} = \frac{\sum x_i}{n}\) | \(\hat{p} = \frac{x}{n}\) |
Range | Any real number | [0, 1] (or percentage) |
Sensitivity | Sensitive to outliers | Not affected by extreme values (each observation counts only as a success or a failure) |
Distribution | Approaches normality (CLT) | Approaches normality (CLT) under certain conditions |
Both statistics are essential in inferential statistics and help researchers draw conclusions about the larger population based on sample data.