1
Matching
EPIET
Mahón, 2006
J Stuart & F Simón, 2005
T Grein, 2006
2
Once again … confounding
Exposure, outcome and a third variable (the confounder)
To confound, the third variable must:
Be associated with exposure
- without being a consequence of exposure
Be associated with outcome
- independently of exposure
3
Control of confounders
•
In analysis
–
Stratification
–
Multivariable analysis
•
In study design
–
Randomization (experiment)
–
Restriction
–
Matching
4
Matching
•
Ensures that the confounding factor is equally
distributed across the study groups
–
Controls selected to match specific characteristics of
cases
–
Unexposed selected to match specific characteristics of
exposed
•
Achieves balanced data set that can
–
Prevent confounding (if matched on confounder)
–
Increase study precision
Focus here is on case-control studies, where the implications of matching are more important
5
Types of matching
•
Individual matching
–
Controls selected individually for each case by matching
variable
–
Pairs of individuals (1:1)
–
Selection of more than one control per case (1:n)
•
Frequency matching
–
Number of controls selected according to number of
cases in categories of matching variable
–
Matching done by groups of subjects
•
In both cases, analysis must take matching
design into account
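As a rough illustration of frequency matching, the sketch below draws controls so that each category of the matching variable holds a fixed number of controls per case. The function name `frequency_match`, the `ratio` parameter, the age groups and all records are hypothetical and not taken from the slides.

```python
import random
from collections import Counter, defaultdict


def frequency_match(cases, pool, key, ratio=1, seed=0):
    """Pick `ratio` controls per case within each category of the matching variable."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    cases_per_stratum = Counter(key(c) for c in cases)

    # Group the candidate controls by category of the matching variable.
    pool_by_stratum = defaultdict(list)
    for person in pool:
        pool_by_stratum[key(person)].append(person)

    selected = []
    for stratum, n_cases in cases_per_stratum.items():
        candidates = pool_by_stratum[stratum]
        needed = ratio * n_cases
        if len(candidates) < needed:
            raise ValueError(f"not enough controls in stratum {stratum!r}")
        selected.extend(rng.sample(candidates, needed))
    return selected


# Hypothetical data: the matching variable is age group.
cases = [{"id": i, "age_group": a}
         for i, a in enumerate(["0-4", "0-4", "5-14", "15+", "15+", "15+"])]
pool = [{"id": 100 + i, "age_group": a}
        for i, a in enumerate(["0-4"] * 10 + ["5-14"] * 10 + ["15+"] * 10)]

controls = frequency_match(cases, pool, key=lambda p: p["age_group"], ratio=2)
print(Counter(c["age_group"] for c in controls))
```

Controls are drawn by group rather than attached to a specific case, which is the difference from individual matching.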
6
Individual matching
–
Echovirus meningitis outbreak, Germany, 2001
–
Is swimming in pond “A” risk factor?
–
Case control study with each case matched to one control
Source: A Hauri, RKI Berlin
7
Individual matching
Concordant pairs: case and control have the same exposure status
Discordant pairs: case and control differ in exposure status
8
Individual matching
Matched 2x2 table (use for analysis)
Unmatched 2x2 table (do not use: it ignores the pairing)
9
Individual matching: Analysis
•
Treat each pair as one stratum
•
Calculate Mantel-Haenszel odds ratio
•
Nomenclature matched 2x2 table
$$OR_{MH} = \frac{\sum_i \left[ a_i d_i / n_i \right]}{\sum_i \left[ b_i c_i / n_i \right]}$$
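The Mantel-Haenszel calculation with each pair as its own stratum can be sketched as follows. The cell labels are an assumption about the lost 2x2 nomenclature (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), and the pairs are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio over strata given as (a, b, c, d).

    Assumed cell labels: a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("OR undefined: the sum of b*c/n is zero")
    return numerator / denominator


# Each matched pair is one stratum with n = 2.
# Key: (case exposed?, control exposed?) -> (a, b, c, d)
PAIR_TABLES = {
    (True, True): (1, 1, 0, 0),    # concordant, both exposed
    (True, False): (1, 0, 0, 1),   # discordant, case exposed
    (False, True): (0, 1, 1, 0),   # discordant, control exposed
    (False, False): (0, 0, 1, 1),  # concordant, both unexposed
}

# Hypothetical pairs, for illustration only.
pairs = ([(True, False)] * 3 + [(False, True)] * 1
         + [(True, True)] * 2 + [(False, False)] * 4)

or_mh = mantel_haenszel_or(PAIR_TABLES[p] for p in pairs)
print(or_mh)
```

Only the discordant pairs add to either sum, since a concordant pair has a·d = 0 and b·c = 0.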
14
Individual matching: Analysis
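$$OR_{MH} = \frac{\sum_i \left[ a_i d_i / n_i \right]}{\sum_i \left[ b_i c_i / n_i \right]} = \frac{0e + \tfrac{1}{2}f + 0g + 0h}{0e + 0f + \tfrac{1}{2}g + 0h} = \frac{f}{g} = \frac{\sum \text{discordant pairs where case exposed}}{\sum \text{discordant pairs where control exposed}}$$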
15
Individual matching: Analysis
$$OR_{MH} = \frac{f}{g} = \frac{46}{6} = 7.67$$
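A short sketch of the same calculation starting from pair-level records. The discordant counts (46 and 6) come from the slide. The concordant counts are hypothetical, and the labelling of e as "both exposed" and h as "both unexposed" is an assumption.

```python
from collections import Counter


def matched_pair_or(pairs):
    """Tabulate 1:1 matched pairs and return the cell counts with OR = f / g.

    Each pair is (case_exposed, control_exposed).
    """
    counts = Counter(pairs)
    e = counts[(True, True)]    # concordant: both exposed (assumed label)
    f = counts[(True, False)]   # discordant: case exposed only
    g = counts[(False, True)]   # discordant: control exposed only
    h = counts[(False, False)]  # concordant: both unexposed (assumed label)
    if g == 0:
        raise ValueError("OR undefined: no pairs where only the control is exposed")
    return {"e": e, "f": f, "g": g, "h": h, "or": f / g}


# f = 46 and g = 6 as on the slide; e = 10 and h = 20 are hypothetical.
pairs = ([(True, False)] * 46 + [(False, True)] * 6
         + [(True, True)] * 10 + [(False, False)] * 20)

result = matched_pair_or(pairs)
print(round(result["or"], 2))
```

Changing the hypothetical concordant counts leaves the estimate unchanged, because only f and g enter the ratio.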
16
Matching case to n controls
•
Same principle as 1:1 matching
•
Constitute pairs
–
Pair (1 case, 1 control)
–
Triplet (1 case, 2 controls) yields 2 pairs
–
Quadruplet (1 case, 3 controls) yields 3 pairs
–
Etc
•
Stratified analysis by pairs
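The pair-building step for 1:n matching can be sketched as below: each matched set is broken into case-control pairs and the discordant pairs are counted. The data and the function name `discordant_pair_counts` are hypothetical. When every case has the same number of controls, this ratio coincides with the Mantel-Haenszel estimate computed over the matched sets; with a varying number of controls per case the two can differ.

```python
def discordant_pair_counts(matched_sets):
    """Split each matched set into case-control pairs and count discordant pairs.

    Each matched set is (case_exposed, [control_exposed, ...]).
    Returns (f, g): pairs where only the case is exposed, and pairs where
    only the control is exposed.
    """
    f = g = 0
    for case_exposed, control_exposures in matched_sets:
        for control_exposed in control_exposures:
            if case_exposed and not control_exposed:
                f += 1
            elif control_exposed and not case_exposed:
                g += 1
    return f, g


# Hypothetical triplets (1 case, 2 controls), each yielding 2 pairs.
matched_sets = [
    (True, [False, False]),
    (True, [True, False]),
    (False, [True, False]),
    (True, [False, False]),
    (False, [False, False]),
]

f, g = discordant_pair_counts(matched_sets)
odds_ratio = f / g
print(f, g, odds_ratio)
```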
17
Matching case to n controls
$$OR_{MH} = \frac{\sum \text{discordant pairs where case exposed}}{\sum \text{discordant pairs where control exposed}} = \frac{6}{1}$$
18
Frequency matching: Analysis
19
Frequency matching: Analysis
[Figure panels: Stratum 3, Stratum 4]
20
Frequency matching: Analysis
•
With many strata, stratification quickly leads to
sparse data problem
–
Matching for > 1 confounder
–
Numerous nominal categories
•
Conditional logistic regression
–
Logistic regression for matched data
–
“Conditional” on using discordant pairs only
–
Matching variable itself cannot be analysed
–
Testing for interaction of matching variable possible
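To make the "discordant pairs only" point concrete, here is a minimal sketch of conditional logistic regression for 1:1 pairs with a single exposure, fitted by Newton's method in plain Python. It is an illustration under those restrictions, not a general implementation. The function name is hypothetical, the discordant counts reuse the echovirus example (46 and 6), and the concordant counts are hypothetical.

```python
import math


def conditional_logit_1to1(pairs, iterations=25):
    """Maximise the conditional likelihood for 1:1 matched pairs, one covariate.

    Each pair is (x_case, x_control). Returns the log odds ratio beta.
    Per-pair likelihood: exp(beta*x_case) / (exp(beta*x_case) + exp(beta*x_control)).
    """
    beta = 0.0
    for _ in range(iterations):
        gradient = 0.0
        information = 0.0
        for x_case, x_control in pairs:
            w_case = math.exp(beta * x_case)
            w_control = math.exp(beta * x_control)
            p_case = w_case / (w_case + w_control)
            mean_x = p_case * x_case + (1 - p_case) * x_control
            var_x = (p_case * (x_case - mean_x) ** 2
                     + (1 - p_case) * (x_control - mean_x) ** 2)
            gradient += x_case - mean_x  # zero for concordant pairs
            information += var_x         # zero for concordant pairs
        if information == 0:
            raise ValueError("no discordant pairs: beta cannot be estimated")
        step = gradient / information    # Newton step on the log-likelihood
        beta += step
        if abs(step) < 1e-12:
            break
    return beta


discordant = [(1, 0)] * 46 + [(0, 1)] * 6   # counts from the echovirus example
concordant = [(1, 1)] * 10 + [(0, 0)] * 20  # hypothetical counts

beta_all = conditional_logit_1to1(discordant + concordant)
beta_discordant = conditional_logit_1to1(discordant)
print(round(math.exp(beta_all), 2), round(math.exp(beta_discordant), 2))
```

Concordant pairs add zero to both the gradient and the information, so the estimate with and without them is the same and equals f/g for a single binary exposure.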
21
Why stratified analysis?
•
Matching eliminates the original confounding, but
introduces another confounding factor
•
Controls no longer representative of source
population as selected according to matching
criteria (selection bias)
•
Cases and controls are more alike. If the match is
broken in the analysis, the OR is usually underestimated
•
Matched design = matched analysis
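The effect of breaking the match can be illustrated by collapsing the same pairs into an unmatched 2x2 table. The discordant counts f = 46 and g = 6 are from the echovirus example. The concordant counts e and h are hypothetical, chosen to show a case where the unmatched OR falls below the matched one; other values would give a different gap.

```python
# Pair counts: f and g from the echovirus example, e and h hypothetical.
e, f, g, h = 50, 46, 6, 50  # e: both exposed, h: both unexposed (assumed labels)

# Matched analysis: only discordant pairs are used.
matched_or = f / g

# Breaking the match: collapse the pairs into an ordinary 2x2 table.
cases_exposed = e + f
cases_unexposed = g + h
controls_exposed = e + g
controls_unexposed = f + h
unmatched_or = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

print(round(matched_or, 2), round(unmatched_or, 2))
```

With these numbers the unmatched estimate lies closer to 1 than the matched estimate.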
22
Overmatching
•
20 cases of cryptosporidiosis
•
Associated with attendance at local swimming
pool?
•
Two matched studies
–
Controls from same general practice and nearest date of
birth
–
Cases nominated controls (friend controls)
23
Overmatching
GP, age-matched
OR = f/g = 15/1 = 15
Friend-matched
OR = f/g = 3/1 = 3
24
Advantages of matching
•
Useful method in case-control studies to optimise
resources
•
Can control for complex environmental, genetic,
other factors
–
Siblings, neighbourhood, SES, utilization of health care
•
Can increase study efficiency
–
Overcomes sparse-data problem by balancing strata
–
Maximises information when sample size small
•
Sometimes easier to identify controls
–
Random sample may not be possible
25
Disadvantages of matching
•
Cannot examine risks associated with matching
variable
•
If no controls identified, lose case data, and vice
versa
•
Overmatching on exposure will bias OR towards
1
•
Complicates statistical analysis
•
Residual confounding by poor definition of strata
•
Sometimes difficult to identify appropriate
controls