Triple Match Guidelines

Auto Specify  Match A  B, Match B  C, Match A  C A Triple Match means that the same person and event are reported in three different tables, designated Source A, Source B, and Source C. Triple Match specifications are derived from three dual match specifications, so complete your dual matches first. Source A is the table that is specified first in two of the dual matches. In the example, Crash is Source A because it is specified first in both Crash to EMS and Crash to Hospital. Source B is the table that is specified first in one dual match and second in another. EMS is Source B because it is specified first in EMS to Hospital and second in Crash to EMS. Source C is the table that is specified second in two dual matches. Hospital is Source C because it is specified second in both Crash to Hospital and EMS to Hospital. Once Sources A, B, and C are determined, the three dual matches are identified as A  B, B  C, or A  C.
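The source assignment rule can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the program's own code; the function names assign_sources and label_dual_matches are hypothetical, and each dual match is assumed to be given as a (first table, second table) pair.

```python
from collections import Counter

def assign_sources(dual_matches):
    """Assign Source A, B, C from three (first_table, second_table) dual matches.

    Hypothetical helper: A is first in two matches, B is first in one and
    second in one, C is second in two.
    """
    firsts = Counter(first for first, _ in dual_matches)
    seconds = Counter(second for _, second in dual_matches)
    roles = {}
    for table in set(firsts) | set(seconds):
        position_counts = (firsts[table], seconds[table])
        if position_counts == (2, 0):
            roles["A"] = table
        elif position_counts == (1, 1):
            roles["B"] = table
        elif position_counts == (0, 2):
            roles["C"] = table
    if len(roles) != 3:
        raise ValueError("dual matches do not define Sources A, B, and C")
    return roles

def label_dual_matches(dual_matches, roles):
    """Label each dual match as 'AB', 'BC', or 'AC' using the assigned roles."""
    role_of = {table: role for role, table in roles.items()}
    return {(first, second): role_of[first] + role_of[second]
            for first, second in dual_matches}

example = [("Crash", "EMS"), ("Crash", "Hospital"), ("EMS", "Hospital")]
roles = assign_sources(example)
labels = label_dual_matches(example, roles)
```

With the example dual matches, roles maps A to Crash, B to EMS, and C to Hospital, and Crash to EMS is labeled AB.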

Auto Specify  Total Triples The Total number of Triples is estimated as 95% of the smallest number of imputed links from the three dual matches. See Item 6 for more information.
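As a small worked example of this estimate, the sketch below takes 95% of the smallest imputed link count. The link counts are made-up numbers, the function name is hypothetical, and rounding to the nearest whole number is an assumption, since the text does not say how the program rounds.

```python
def estimate_total_triples(links_ab, links_bc, links_ac, fraction=0.95):
    """Estimate Total Triples as a fraction of the smallest imputed link count.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    return round(fraction * min(links_ab, links_bc, links_ac))

# Hypothetical imputed link counts for the three dual matches.
estimate = estimate_total_triples(links_ab=1500, links_bc=1200, links_ac=1800)
```

Here the smallest count is 1200, so the estimate is 95% of 1200.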

Auto Specify  Preferred Triples. Preferred Triples is set to the default value, Common A. See Item 7 for more information.

Auto Specify  Field A, Field B, Field C All fields that are paired as match fields in one or more of the dual matches are listed as triple match fields. If a match field is common to two or more dual matches (a common field from Source A, B, or C), then entries are made for all three fields. In the example, Crash County is a common field from A because it is paired with Scene County in Crash to EMS and with Hospital County in Crash to Hospital. If a match field is used in only one dual match, then no third entry is made. In the example, Std Hour is only paired with Std Hour Call in Crash to EMS. Neither field is paired with a field from Hospital.
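One way to picture how the field rows are assembled is sketched below. It is an assumption about the logic, not the program's actual procedure: each dual match pairing is turned into a partial row, and pairings that share a field from the same source are merged into one row. The function name and the Zip Code field names are hypothetical.

```python
def triple_field_rows(pairings):
    """Build (Field A, Field B, Field C) rows from dual match field pairings.

    pairings: list of (match_label, first_field, second_field), where
    match_label is "AB", "BC", or "AC". Unpaired positions are None.
    """
    rows = []
    for label, first_field, second_field in pairings:
        new_row = {label[0]: first_field, label[1]: second_field}
        merged = dict(new_row)
        remaining = []
        for row in rows:
            # Merge with an existing row that shares a field from the same source.
            if any(row.get(source) == field for source, field in new_row.items()):
                merged = {**row, **merged}
            else:
                remaining.append(row)
        rows = remaining + [merged]
    return [(row.get("A"), row.get("B"), row.get("C")) for row in rows]

example_pairings = [
    ("AB", "Crash County", "Scene County"),
    ("AC", "Crash County", "Hospital County"),
    ("AB", "Std Hour", "Std Hour Call"),
    ("BC", "EMS Zip", "Hospital Zip"),  # hypothetical field names
]
field_rows = triple_field_rows(example_pairings)
```

In this sketch the County row has all three entries, while the Std Hour row has no Field C entry and the Zip Code row has no Field A entry.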

Auto Specify  Group Field Group is set to the default value, 0. See Item 8 for more information.

Total Triples This should be your best estimate of the total number of triple links that exist to be found. Your estimate should be based on reported information and any other prior knowledge you have about the data files in much the same way that you estimate Total Matches for each dual match.

Preferred Triples Candidate triples are selected automatically from tabulated merged pairs by finding all pairs in two different dual matches with a common record. For example, the Crash to EMS pair (Crash record 1, EMS record 2) and the EMS to Hospital pair (EMS record 2, Hospital record 3) share a common EMS record 2 (Common B record). The probability that this candidate triple is a true match is the product of the probability that (Crash record 1, EMS record 2) is a true dual match times the probability that (EMS record 2, Hospital record 3) is a true dual match. Some candidate triples involve only one common record. In the example, this would be the case if the pair (Crash record 1, Hospital record 3) was not in the merged pairs for Crash to Hospital. Otherwise, the probability calculation could be done using pairs with Common A, Common B, or Common C. In such cases, triple probability calculations are done using pairs with the preferred common record you specify. Different choices give different linkage models with different goodness of fit. You should experiment to find the best fit for your data.
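The selection of candidate triples through a Common B record, and the product rule for their probability, can be sketched as follows. This is a simplified illustration; the function name, the dictionary layout of the merged pairs, and the probabilities are all assumptions made for the example.

```python
def candidate_triples_common_b(ab_pairs, bc_pairs):
    """Find candidate triples that share a Source B record.

    ab_pairs: {(a_id, b_id): probability of a true A to B dual match}
    bc_pairs: {(b_id, c_id): probability of a true B to C dual match}
    Returns {(a_id, b_id, c_id): product of the two dual match probabilities}.
    """
    # Index the B to C pairs by their B record for quick lookup.
    bc_by_b = {}
    for (b_id, c_id), prob_bc in bc_pairs.items():
        bc_by_b.setdefault(b_id, []).append((c_id, prob_bc))

    triples = {}
    for (a_id, b_id), prob_ab in ab_pairs.items():
        for c_id, prob_bc in bc_by_b.get(b_id, []):
            triples[(a_id, b_id, c_id)] = prob_ab * prob_bc
    return triples

# Hypothetical merged pairs: Crash to EMS and EMS to Hospital.
crash_ems = {(1, 2): 0.9, (4, 5): 0.8}
ems_hospital = {(2, 3): 0.8, (6, 7): 0.5}
candidates = candidate_triples_common_b(crash_ems, ems_hospital)
```

Only (Crash record 1, EMS record 2, Hospital record 3) becomes a candidate here, with a probability of about 0.72. Choosing Common A or Common C instead would pair up a different two of the three dual matches in the same way.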

Group Sometimes you may want to combine two rows of match fields together for reporting purposes because they involve the same conceptual information. In the example match, Zip Codes are compared in EMS to Hospital for location agreement instead of Counties as in Crash to EMS and Crash to Hospital. You can specify that these two rows should be grouped together by giving them both Group number 1. A second pair of related rows should be given Group number 2, etc.

Perform Triple. When you click on Perform Triple the following steps are performed for each imputation.
1. Candidate triples (Triple__MT1, etc.) and associated posterior probabilities are selected automatically from tabulated merged pairs by finding all pairs in two different dual matches with a common record.
2. Comparison outcomes from the dual matches are tabulated for each match field.
3. Posterior probabilities are calculated for each candidate triple. For dual matches, candidates can only be true matches or unmatched. For triple matches, candidates can be true triples, true A  B only, true B  C only, true A  C only, or unmatched. A posterior probability is calculated for each possibility. Goodness of fit for a triple match with simulated data is measured in the same way as for dual matches.
4. Disposition of each candidate triple is imputed by a random draw from the calculated posterior distribution.
5. Associated dual match pairs are extracted from each imputed triple match. Extracted pairs are listed in tables like the imputed pairs and linked pairs tables for dual matches (Triple__CrashEMS__IP1, Triple__CrashEMS__LP1, etc.). Pairs that are not part of imputed triples are imputed using the same dual match probabilities as before. If the candidate triple in the earlier example became an imputed triple match, then the pair (Crash record 1, Hospital record 3) would be included even though it was not in the merged pairs for the Crash to Hospital dual match. So, there is a potential to pick up missing dual matches by doing a triple match.
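The random draw in step 4 and the pair extraction in step 5 can be sketched as below. This is a minimal illustration under stated assumptions: the posterior values are invented, the function names are hypothetical, and only the extraction for an imputed true triple is shown.

```python
import random

# Disposition codes: T = Triple, AB / BC / AC = that pair only, U = Unmatched.
DISPOSITIONS = ("T", "AB", "BC", "AC", "U")

def impute_disposition(posterior, rng):
    """Draw one disposition at random, weighted by its posterior probability."""
    weights = [posterior[code] for code in DISPOSITIONS]
    return rng.choices(DISPOSITIONS, weights=weights, k=1)[0]

def extract_pairs_from_triple(triple):
    """Return the three dual match pairs implied by an imputed true triple."""
    a_id, b_id, c_id = triple
    return {"AB": (a_id, b_id), "BC": (b_id, c_id), "AC": (a_id, c_id)}

# Hypothetical posterior distribution for one candidate triple.
posterior = {"T": 0.70, "AB": 0.10, "BC": 0.10, "AC": 0.02, "U": 0.08}
rng = random.Random(2024)  # seeded so the draws are repeatable
draws = [impute_disposition(posterior, rng) for _ in range(5)]

pairs = extract_pairs_from_triple((1, 2, 3))
```

Because each imputation draws again from the same distribution, a candidate with a posterior of 0.70 for T would be expected to come out as a triple in roughly 70% of imputations. The extracted AC pair (1, 3) appears even if it was never a merged pair in the Crash to Hospital dual match.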

Review Triple. When you click on Review Triple a table of merged candidate triples is displayed (Triple__MT1, etc.). Candidate triples are numbered sequentially. Each is identified by a different combination of Unique IDs from Sources A, B, and C. Posterior probabilities from the three dual matches are listed as Prob AB, Prob BC, and Prob AC. If a candidate was missing from a dual match (not tabulated as a merged pair), then 0.01 is listed as the posterior probability. You might be able to improve your dual match specifications if you investigate why such pairs were missing from dual match results in the first place. Calculated prior and posterior probabilities are listed for each possible imputed disposition (T = Triple, AB = A  B only, BC = B  C only, AC = A  C only, U = Unmatched). Imputed disposition for each candidate triple is listed using the same coding scheme. Comparison outcomes are listed for all match fields. Outcomes are coded using A for Agree, D for Disagree, and M for Missing. For example, outcome ADD means that a match field Agreed in Match A  B, Disagreed in Match B  C, and Disagreed in Match A  C. Outcome MMA means that a match field was Missing in Match A  B, Missing in Match B  C, and Agreed in Match A  C.
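If you export the review table, the three-letter outcome codes can be expanded with a small helper such as the one below. This is only a convenience sketch; the function name and the returned dictionary layout are assumptions, not part of the program.

```python
OUTCOME_NAMES = {"A": "Agree", "D": "Disagree", "M": "Missing"}
MATCH_ORDER = ("AB", "BC", "AC")  # letter positions in an outcome code

def decode_outcome(code):
    """Expand an outcome code such as 'ADD' into one outcome per dual match."""
    if len(code) != 3 or any(letter not in OUTCOME_NAMES for letter in code):
        raise ValueError(f"not a valid outcome code: {code!r}")
    return {match: OUTCOME_NAMES[letter]
            for match, letter in zip(MATCH_ORDER, code)}

add_outcome = decode_outcome("ADD")
mma_outcome = decode_outcome("MMA")
```

For ADD this gives Agree for AB and Disagree for both BC and AC, matching the example in the text.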
