Sample‑size considerations for different study designs

Source: YouTube video by Chisquares

Studies that do not require a sample‑size calculation
  • Single‑case investigations (e.g., the first Ebola or COVID‑19 patient) do not require a sample‑size calculation because only one individual is examined.
  • Case series (multiple patients with the same disease) also do not need a formal sample‑size calculation; they are used to describe percentages, means, and other descriptive statistics.

Descriptive vs. Analytic studies

| Study type | Need for sample‑size calculation | Typical purpose |
| --- | --- | --- |
| Descriptive (ecological, cross‑sectional) | Yes – to achieve a desired precision for prevalence or proportion estimates | Estimate population parameters |
| Analytic (case‑control, cohort, randomized trial) | Yes – to detect a prespecified difference (effect size) between two or more groups | Test hypotheses about associations or treatment effects |

Effect size, power, and required sample size

  • Effect size = the smallest clinically relevant difference the investigator wishes to detect between groups (e.g., risk difference, odds ratio).
  • The smaller the effect size, the larger the required sample size.
  • Analogy: detecting a tiny object under a microscope requires a more powerful (larger) microscope; similarly, detecting a tiny effect requires a larger study.
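The inverse relationship between effect size and sample size can be made concrete with the standard normal‑approximation formula for comparing two proportions. This is a minimal sketch with hardcoded z‑values for a two‑sided α = 0.05 and 80 % power, not the video's exact calculator:

```python
import math

def n_per_group(p1, p2):
    """Approximate per-group sample size for detecting a difference between
    two proportions (normal-approximation formula, alpha = 0.05, power = 0.80)."""
    z_a = 1.959964  # z for two-sided alpha = 0.05
    z_b = 0.841621  # z for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Halving the detectable difference roughly quadruples the required size:
print(n_per_group(0.60, 0.40))  # 20-point difference -> 97 per group
print(n_per_group(0.60, 0.50))  # 10-point difference -> 388 per group
```

The quadrupling is exactly the "larger microscope" effect: the required n scales with the inverse square of the difference to be detected.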

Type I and Type II errors

| Error type | Description | Common cause |
| --- | --- | --- |
| Type I (false positive) | Concluding a positive association when none exists | Multiple comparisons / “p‑hacking” (searching for a p‑value < 0.05) |
| Type II (false negative) | Failing to detect a real association | Sample size that is too small |
  • Both errors can coexist in a single study.
  • Which error is more dangerous depends on context (e.g., in drug development, a Type I error may lead to an ineffective drug being marketed, whereas a Type II error may withhold a beneficial drug). The decision must be made case‑by‑case.

Sample size, validity, and precision

  • Validity (internal and external) is determined by sampling procedures, measurement bias, and study design, not by sample size.
  • Precision refers to the width of confidence intervals; larger samples produce narrower intervals (greater precision).
  • A study can be precise but not valid (narrow confidence interval around a biased estimate) or valid but not precise (wide confidence interval around an unbiased estimate).
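The precision half of this distinction is easy to see numerically. A minimal sketch using the normal‑approximation confidence interval for a proportion (the function name is illustrative):

```python
import math

def ci_halfwidth(p, n, z=1.959964):
    """Half-width of a 95% confidence interval for a proportion
    (normal approximation): z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling n halves the interval width (precision improves),
# but a biased sampling scheme stays biased at any n (validity does not).
for n in (100, 400, 1600):
    print(n, round(ci_halfwidth(0.30, n), 3))
```

Note that the interval narrows around whatever the estimate is; if the estimate is biased, a large sample just gives a precise wrong answer.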

Finite‑population correction (FPC)

  • Standard sample‑size formulas for surveys assume the sample is a negligibly small fraction of the population (an effectively infinite population).
  • When the sample exceeds about 5 % of the population, the FPC must be applied to adjust the required size.
  • For most large populations (city, state, country, world) the correction is negligible; it matters only for relatively small populations.
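The correction itself is a one‑line adjustment. A sketch, using n₀ ≈ 384 (the familiar infinite‑population size for a ±5 % margin at 95 % confidence with 50 % prevalence):

```python
import math

def fpc_adjust(n0, N):
    """Apply the finite-population correction to an infinite-population
    sample size n0 for a population of size N: n = n0 / (1 + (n0 - 1)/N)."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))

n0 = 384
print(fpc_adjust(n0, 1_000_000))  # huge population: essentially unchanged (384)
print(fpc_adjust(n0, 2_000))      # small population: noticeably reduced (323)
```

For a city or country the denominator is ≈ 1 and the correction vanishes, matching the rule of thumb that the FPC only matters when the sample exceeds roughly 5 % of the population.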

Multi‑arm trials and multiplicity

  • With k groups, the number of pairwise comparisons is k!/(2!(k − 2)!) = k(k − 1)/2 (e.g., 3 groups → 3 comparisons; 4 groups → 6 comparisons).
  • To control the overall Type I error rate, the Bonferroni adjustment divides the nominal α (e.g., 0.05) by the number of comparisons, yielding a more stringent significance threshold and a larger required sample size.
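Both steps fit in a few lines (a sketch; the helper name is illustrative):

```python
from math import comb

def bonferroni(k, alpha=0.05):
    """Return the number of pairwise comparisons among k arms
    and the Bonferroni-adjusted significance threshold."""
    m = comb(k, 2)        # k! / (2!(k-2)!) = k(k-1)/2
    return m, alpha / m

for k in (3, 4):
    m, a = bonferroni(k)
    print(f"{k} arms: {m} comparisons, adjusted alpha = {a:.4f}")
```

The smaller adjusted α feeds back into the sample‑size formula as a larger critical z‑value, which is why multi‑arm trials need more participants per arm.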

Sample‑size inputs for comparative trials

| Parameter | Typical input | Notes |
| --- | --- | --- |
| Control‑group prevalence | e.g., 60 % | Used as baseline. |
| Effect size | Absolute prevalence difference (e.g., 20 %) or odds ratio (e.g., 1.5) | Choose the metric that matches the planned analysis. |
| Number of arms | 2, 3, … | Determines multiplicity adjustments. |
| Power | 80–90 % (commonly 90 % for drug trials) | Higher power → larger sample. |
| Response/compliance rate | e.g., 60 % | Inflates the sample to offset anticipated non‑response or loss to follow‑up. |
| Multiple‑comparison adjustment | Yes/No | Enables Bonferroni correction. |
  • Example: control prevalence = 60 %, desired reduction to 40 % (20 % absolute difference), 4 arms, 90 % power, 60 % response rate → ≈ 583 participants per arm (total ≈ 2 332).
  • Using an odds‑ratio of 1.5 for the same scenario reduces the required per‑arm size to ≈ 37 (total ≈ 148).
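A rough sketch of how these inputs combine, using the normal approximation with a Bonferroni‑adjusted α and response‑rate inflation. Dedicated calculators (such as the one used in the video) apply their own conventions, so this will not reproduce the worked numbers above exactly:

```python
import math
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.90, n_arms=2, response_rate=1.0):
    """Per-arm sample size for comparing two proportions, with Bonferroni
    adjustment for multiple arms and inflation for expected non-response.
    Normal-approximation sketch only; software may differ."""
    m = math.comb(n_arms, 2)                       # pairwise comparisons
    z_a = NormalDist().inv_cdf(1 - alpha / m / 2)  # adjusted two-sided alpha
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    n = ((z_a * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p1 - p2) ** 2)
    return math.ceil(n / response_rate)            # inflate for non-response

# 60% vs 40%, 90% power, 4 arms, 60% response rate:
print(n_per_arm(0.60, 0.40, n_arms=4, response_rate=0.60))
```

Each refinement (more arms, higher power, lower response rate) pushes the per‑arm size upward, which is the qualitative pattern the worked example illustrates.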

Choosing the appropriate effect metric

  • Prevalence (or risk) difference is preferred when the absolute rates in each arm are known, because it removes ambiguity about direction.
  • Odds ratio is appropriate for case‑control studies where logistic regression will be used.
  • Different metrics yield different sample‑size estimates; the choice must align with the planned statistical model.

Block randomization and arm balance

  • Simple random allocation can produce unequal arm sizes, reducing power because power is driven by the smallest arm.
  • Block randomization forces roughly equal numbers in each arm, preserving power.
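Permuted‑block randomization is simple to sketch (arm labels and block size here are illustrative):

```python
import random

def block_randomize(n_blocks, arms=("A", "B"), block_size=4, seed=42):
    """Permuted-block randomization: every block contains equal numbers of
    each arm, so arm sizes can never drift more than half a block apart."""
    per_arm = block_size // len(arms)
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm  # e.g., ["A", "B", "A", "B"]
        rng.shuffle(block)            # random order within the block
        schedule.extend(block)
    return schedule

seq = block_randomize(5)
print(seq.count("A"), seq.count("B"))  # always equal: 10 and 10
```

With simple (unblocked) randomization of 20 participants, a 13–7 split is entirely possible, and power would be driven by the 7‑person arm.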

Cluster‑randomized trials and design effect

  • In cluster designs, participants within the same cluster are more alike (intra‑cluster correlation, ρ).
  • The design effect (DE) inflates the required sample size:

    DE = 1 + ρ(m − 1)

    where m = the average cluster size.
  • A DE of 1.5 → increase the total sample size by 50 %; DE = 2 → double it.
  • When the exact ρ is unknown, analysts often explore a range of plausible values (e.g., DE ≈ 1.5–4) in a sensitivity table.
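The inflation is mechanical once ρ and m are chosen; a small sensitivity sketch (the n = 400 baseline and ρ values are illustrative):

```python
import math

def design_effect(rho, m):
    """Design effect for a cluster-randomized design:
    DE = 1 + rho * (m - 1), where m is the average cluster size."""
    return 1 + rho * (m - 1)

n_individual = 400  # size from an individually randomized calculation
for rho in (0.01, 0.05, 0.10):
    de = design_effect(rho, m=21)
    print(f"rho={rho}: DE={de:.2f}, inflated n={math.ceil(n_individual * de)}")
```

Even a modest ρ of 0.05 with 21 participants per cluster gives DE = 2, doubling the required total, which is why ignoring clustering badly underpowers a trial.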

Non‑inferiority, superiority, and equivalence trials

| Trial type | Null hypothesis (H₀) | Alternative hypothesis (H₁) | Directionality |
| --- | --- | --- | --- |
| Superiority | No difference between treatments | New treatment better than control | One‑sided (often) |
| Non‑inferiority | New treatment worse than control by the margin Δ or more | New treatment worse by less than Δ (i.e., not unacceptably worse) | One‑sided (in the non‑inferior direction) |
| Equivalence | True difference exceeds ±Δ | True difference lies within ±Δ | Two‑sided |
  • In non‑inferiority trials, setting Δ (the non‑inferiority margin) is critical; it may be based on regulatory guidance, expert consensus, or the lower bound of a confidence interval from prior studies.
  • Sample‑size calculations for non‑inferiority trials use the same machinery as superiority trials but treat the test as one‑sided in the non‑inferior direction; an underpowered trial risks a Type II error, i.e., failing to demonstrate non‑inferiority of a treatment that truly is non‑inferior.
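Under the common simplifying assumption that both treatments truly have the same success proportion, the per‑arm size for a non‑inferiority comparison of proportions can be sketched as follows (one‑sided α = 0.025; the function name and defaults are illustrative, not from the video):

```python
import math
from statistics import NormalDist

def n_noninferiority(p, margin, alpha=0.025, power=0.90):
    """Per-arm sample size to show a new treatment's success proportion
    is not worse than control's (both assumed equal to p) by more than
    `margin`, using a one-sided normal-approximation test."""
    z_a = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_b = NormalDist().inv_cdf(power)
    return math.ceil((z_a + z_b) ** 2 * 2 * p * (1 - p) / margin ** 2)

# 80% success in both arms, 10-point non-inferiority margin:
print(n_noninferiority(p=0.80, margin=0.10))
```

Shrinking the margin Δ has the same effect as shrinking the effect size in a superiority trial: the required n grows with 1/Δ².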

Identifying study designs and common mislabelings

  • Descriptive studies report characteristics without group comparisons.
  • Analytic studies compare groups (case‑control, cohort, randomized trial).
  • Mislabeling (e.g., “interventional case‑control” or “prospective case‑control”) is common; case‑control studies are always observational because exposure is not assigned.
  • Correct identification requires understanding the temporal relationship between exposure and outcome and whether the investigator manipulates exposure.

Sample‑size calculation for diagnostic‑test accuracy

  • Need to estimate sensitivity and specificity separately.
  • Inputs: disease prevalence, expected sensitivity, expected specificity, desired margin of error (e.g., ±5 %), confidence level (e.g., 95 %).
  • Total sample size = n₁ (for sensitivity) + n₂ (for specificity).
  • Example: prevalence = 10 %, sensitivity = 90 %, specificity = 85 %, margin = 5 % → total ≈ 161 participants.
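A widely used version of this calculation is Buderer's method, which sizes the study so that sensitivity and specificity are each estimated within the chosen margin. This sketch illustrates the logic only; tools differ in conventions (e.g., whether they report totals or only the diseased/non‑diseased subgroup sizes), so it will not necessarily reproduce the worked total above:

```python
import math

def buderer_n(prevalence, sens, spec, margin=0.05, z=1.96):
    """Buderer-style sample sizes for diagnostic accuracy: total n needed so
    sensitivity (estimated among the diseased) and specificity (among the
    non-diseased) are each within +/- margin at ~95% confidence."""
    n_for_sens = z**2 * sens * (1 - sens) / margin**2 / prevalence
    n_for_spec = z**2 * spec * (1 - spec) / margin**2 / (1 - prevalence)
    return math.ceil(n_for_sens), math.ceil(n_for_spec)

n1, n2 = buderer_n(prevalence=0.10, sens=0.90, spec=0.85)
print(n1, n2, max(n1, n2))  # recruit enough to satisfy both targets
```

Note how a low prevalence dominates: very few recruits are diseased, so the sensitivity target, not the specificity target, usually drives the total.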

Planning studies: protocol and statistical analysis plan (SAP)

  • A protocol describes the study objectives, design, population, and data‑collection methods.
  • A statistical analysis plan details how data will be analyzed (e.g., which tests, handling of missing data, subgroup analyses) and must be finalized before data collection begins to prevent “moving the goalposts.”
  • Together, the protocol and SAP ensure that the sample‑size calculation aligns with the intended analyses and that the study remains methodologically sound.

Key take‑aways

  1. Sample‑size calculations are essential for any study that aims to estimate parameters with a given precision or to detect a prespecified effect.
  2. The required size grows as the effect size shrinks, as the number of comparisons increases, and as intra‑cluster correlation inflates variance.
  3. Type I errors are driven mainly by multiple testing; Type II errors stem from insufficient power. Context determines which error is more consequential.
  4. Larger samples improve precision but do not guarantee validity; proper design, sampling, and measurement are equally important.
  5. For multi‑arm, cluster, and non‑inferiority trials, specialized adjustments (Bonferroni correction, design effect, one‑sided testing against a non‑inferiority margin) must be incorporated into the sample‑size formula.

These principles provide a systematic framework for determining how many participants are needed across the wide variety of epidemiologic and clinical‑trial designs discussed.
