LinkSolv 8.3 Help Pages and User Guide

How to Select Cases for Linkage

 
 
Test case selection information as match fields. You can use reported information to select records for linkage. For example, suppose you want to link police crash report records to medical treatment records in order to determine the medical outcomes of crashes. One strategy is to use reported information for case selection -- you select only those medical records for which the cause of injury is reported as a motor vehicle crash. A second strategy is to summarize the case selection criteria in a field that you compare for linkage purposes, without actually using the information for case selection. You could set a flag equal to Y on all crash records and equal to Y, N, or missing on each medical record, depending on the reported cause of injury, and compare this flag in your linkage model.

Usually, calculated match probabilities will be essentially the same with either strategy -- they are just two different paths to the same answer -- and which strategy is the better choice is determined by other considerations. Using reported information for case selection decreases false positive links because records thought to be unrelated to the linkage have been removed from the files. In this crash outcome example, you would avoid most medical records for which the cause of injury was unrelated to a crash. However, this strategy also usually increases false negative links (missed links) because the information used for case selection is never perfect -- some medical records have an incorrect or missing cause of injury. The second strategy increases true positive links because it doesn't depend on the flag being accurate, but it might also increase false positive links, particularly when the population flagged Y is a small fraction of the total number of records. You should test both strategies to determine which works better with your data.
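The two strategies described above can be sketched in a few lines of Python. This is only an illustration of the data preparation step, not LinkSolv functionality: the record layout, the field names (id, cause, flag), and the helper functions select_cases and add_flag are all hypothetical.

```python
# Hypothetical medical records; the field names "id" and "cause" are
# assumptions made for this sketch, not LinkSolv field names.
medical_records = [
    {"id": 1, "cause": "motor vehicle crash"},
    {"id": 2, "cause": "fall"},
    {"id": 3, "cause": None},  # cause of injury not reported
]

CRASH = "motor vehicle crash"


def select_cases(records):
    """Strategy 1: keep only records whose reported cause is a crash."""
    return [r for r in records if r["cause"] == CRASH]


def add_flag(records):
    """Strategy 2: keep every record and add a Y / N / missing flag
    that can be compared as a match field."""
    flagged = []
    for r in records:
        if r["cause"] is None:
            flag = None          # missing cause -> missing flag
        elif r["cause"] == CRASH:
            flag = "Y"
        else:
            flag = "N"
        flagged.append({**r, "flag": flag})
    return flagged


selected = select_cases(medical_records)  # records 2 and 3 are dropped
flagged = add_flag(medical_records)       # all three records are kept
```

Note that record 3, which has no reported cause, is lost under the first strategy but remains available for linkage under the second.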
 
Linkage Algorithm Examples. Linkage algorithms detect when a flag is used for linkage and handle such fields as special cases so that calculated match probabilities are correct. The following examples show how this works by applying Bayes' Rule for linkage to calculate the posterior odds of a match given agreement on the flag.
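For reference, the odds form of Bayes' Rule used in the examples can be written as below. The symbols are assumptions introduced here for illustration (M for a truly matched pair, \bar{M} for an unmatched pair, \gamma for agreement on the flag); they are not LinkSolv notation. The block assumes the amsmath package for \text.

```latex
\[
\underbrace{\frac{P(M \mid \gamma)}{P(\bar{M} \mid \gamma)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(M)}{P(\bar{M})}}_{\text{prior odds}}
\times
\underbrace{\frac{P(\gamma \mid M)}{P(\gamma \mid \bar{M})}}_{\text{likelihood ratio } m/u}
\]
```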
 
Example 1. Use all medical records for linkage and use the flag for matching. Assume 1,000 crash records for vehicle occupants and 1,000 medical records for injured patients, including 100 records for patients injured in one of the crashes. For simplicity, assume no missing or incorrect values, so that Flag = Y on all crash records and on the 100 medical records for crash victims, and Flag = N on all other medical records.
 
 
|    | A | B | C |
|----|---|---|---|
| 1  | Parameter | Formula | Result |
| 2  | Record Count A |  | 1000 |
| 3  | Record Count B |  | 1000 |
| 4  | Matched Pairs |  | 100 |
| 5  | Unmatched Pairs | (Record Count A x Record Count B) - Matched Pairs | 999900 |
| 6  | Prior Odds for Agreement | Matched Pairs / Unmatched Pairs | 0.00010001 |
| 7  | Matched Pairs with Flag = Y |  | 100 |
| 8  | m Probability Agreement | Matched Pairs with Flag = Y / Matched Pairs | 1 |
| 9  | Unmatched Pairs with Flag = Y | (Record Count A x Matched Pairs) - Matched Pairs | 99900 |
| 10 | u Probability Agreement | Unmatched Pairs with Flag = Y / Unmatched Pairs | 0.099909991 |
| 11 | Likelihood Ratio Agreement | m Probability Agreement / u Probability Agreement | 10.00900901 |
| 12 | Posterior Odds | Prior Odds x Likelihood Ratio | 0.001001001 |
 
 
 
Example 2. Use the flag for case selection -- select only medical records with Flag = Y for linkage:
 
 
|    | A | B | C |
|----|---|---|---|
| 1  | Parameter | Formula | Result |
| 2  | Record Count A |  | 1000 |
| 3  | Record Count B |  | 100 |
| 4  | Matched Pairs |  | 100 |
| 5  | Unmatched Pairs | (Record Count A x Record Count B) - Matched Pairs | 99900 |
| 6  | Prior Odds for Agreement | Matched Pairs / Unmatched Pairs | 0.001001001 |
| 7  | Matched Pairs with Flag = Y |  | 100 |
| 8  | m Probability Agreement | Matched Pairs with Flag = Y / Matched Pairs | 1 |
| 9  | Unmatched Pairs with Flag = Y | (Record Count A x Matched Pairs) - Matched Pairs | 99900 |
| 10 | u Probability Agreement | Unmatched Pairs with Flag = Y / Unmatched Pairs | 1 |
| 11 | Likelihood Ratio Agreement | m Probability Agreement / u Probability Agreement | 1 |
| 12 | Posterior Odds | Prior Odds x Likelihood Ratio | 0.001001001 |
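The arithmetic in the two examples can be checked with a short Python sketch. It follows the table formulas row by row under the same simplifying assumptions (Flag = Y on every crash record, no missing or incorrect values). The function name posterior_odds and its parameters are hypothetical and are not part of LinkSolv.

```python
def posterior_odds(count_a, count_b, matched, b_flag_y):
    """Posterior odds of a match given agreement on the flag.

    count_a  -- records in file A (crash records, all with Flag = Y)
    count_b  -- records in file B (medical records used for linkage)
    matched  -- true matched pairs (all agree on Flag = Y)
    b_flag_y -- records in file B with Flag = Y
    """
    unmatched = count_a * count_b - matched           # row 5
    prior_odds = matched / unmatched                  # row 6
    m = matched / matched                             # row 8: every match agrees
    unmatched_flag_y = count_a * b_flag_y - matched   # row 9
    u = unmatched_flag_y / unmatched                  # row 10
    likelihood_ratio = m / u                          # row 11
    return prior_odds * likelihood_ratio              # row 12


# Example 1: all 1,000 medical records kept, flag compared as a match field.
example_1 = posterior_odds(1000, 1000, 100, 100)

# Example 2: only the 100 medical records with Flag = Y are selected.
example_2 = posterior_odds(1000, 100, 100, 100)

print(example_1, example_2)  # both are approximately 0.001001
```

In Example 1 the evidence comes from the likelihood ratio (about 10), while in Example 2 it comes from the larger prior odds and the likelihood ratio is 1. The posterior odds are the same either way, which is the sense in which the two strategies are different paths to the same answer.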
 
 
 
 
 