Comparison of the minimally important difference for two health state utility measures: EQ-5D and SF-6D
Walters, Stephen J.; Brazier, John E.
Quality of Life Research
BACKGROUND: The SF-6D and EQ-5D are both preference-based measures of health. Empirical work is required to determine the smallest change in utility scores that can be regarded as important, and whether this change in utility value is constant across measures and conditions. OBJECTIVES: To use distribution-based and anchor-based methods to determine and compare the minimally important difference (MID) for the SF-6D and EQ-5D across various datasets. METHODS: The SF-6D is scored on a 0.29 to 1.00 scale and the EQ-5D on a -0.59 to 1.00 scale, with a score of 1.00 on both indicating ‘full health’. Patients were followed for a period of time and then asked, using question 2 of the SF-36 as our anchor, whether their general health was much better (5), somewhat better (4), the same (3), somewhat worse (2) or much worse (1) compared with the last time they were assessed. We considered patients whose global rating score was 4 or 2 as having experienced a change equivalent to the MID. This paper describes and compares the MID and standardised response mean (SRM) for the SF-6D and EQ-5D from eight longitudinal studies in 11 patient groups that used both instruments. RESULTS: Across the 11 patient groups, the MID for the SF-6D ranged from 0.011 to 0.097, mean 0.041. The corresponding SRMs ranged from 0.12 to 0.87, mean 0.39, and were mainly in the ‘small to moderate’ range using Cohen’s criteria, supporting the MID results. The mean MID for the EQ-5D was 0.074 (range -0.011 to 0.140) and the SRMs ranged from -0.05 to 0.43, mean 0.24. The mean MID for the EQ-5D was almost double that of the SF-6D. CONCLUSIONS: There is evidence that the MIDs for these two utility measures are not equal and differ in absolute value. The EQ-5D scale has approximately twice the range of the SF-6D scale. Therefore, the estimates of the MID for each scale appear to be proportionally equivalent in the context of the range of utility scores for each scale.
Further empirical work is required to see whether or not this holds true for other utility measures, patient groups and populations. [References: 41]
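
As an illustration only, the anchor-based calculation described in the METHODS and the range comparison made in the CONCLUSIONS could be sketched as follows. This is a minimal sketch and not the authors' code. The function names are hypothetical, and it assumes that the change scores of patients rated "somewhat worse" (2) are sign-reversed before averaging, which the abstract does not state. The SRM is taken as the mean change divided by the standard deviation of the change scores.

```python
from statistics import mean, stdev


def mid_and_srm(changes, anchors):
    """Estimate the MID and SRM from utility change scores and anchor ratings.

    changes: follow-up utility minus baseline utility, one value per patient.
    anchors: global rating of change (1-5) from the anchor question.
    Assumption: changes for "somewhat worse" (2) are sign-reversed so that
    movement in the reported direction counts as positive.
    """
    signed = []
    for change, rating in zip(changes, anchors):
        if rating == 4:        # somewhat better
            signed.append(change)
        elif rating == 2:      # somewhat worse: reverse the sign (assumption)
            signed.append(-change)
        # ratings 1, 3 and 5 are not used for the MID estimate
    mid = mean(signed)
    srm = mid / stdev(signed)  # mean change / SD of change
    return mid, srm


def mid_as_fraction_of_range(mid, scale_min, scale_max):
    """Express a MID as a proportion of the instrument's scoring range."""
    return mid / (scale_max - scale_min)


# Mean MIDs and scale limits as reported in the abstract.
sf6d_share = mid_as_fraction_of_range(0.041, 0.29, 1.00)   # about 0.058
eq5d_share = mid_as_fraction_of_range(0.074, -0.59, 1.00)  # about 0.047
```

On these figures the mean MID is roughly 5 to 6 percent of the scoring range for both instruments, which is the sense in which the two estimates are described as proportionally equivalent.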