Many pay-for-performance (P4P) programs have been implemented in an attempt to improve quality and reduce cost. The vast majority have not demonstrated large improvements, or in many cases any improvement, in quality or cost. Some researchers attribute this to the size of the bonus, the specific metrics measured, a learning curve in P4P design, and a host of other factors.
An interesting article in Health Economics, Sherry (2016), claims that the lack of success may be due to a fundamental flaw in P4P design.
“…the intuition that P4P increases the output of a rewarded service is guaranteed only in the most simple (and unrealistic) cases of P4P plans where a single quality metric is rewarded. When more than one service is rewarded under P4P, the change in the level of each rewarded service is ambiguous due to multitasking between rewarded services. Multiple rewarded services are more likely to increase when unrewarded services earn lower marginal revenue. The change in unrewarded services is also generally ambiguous, even in settings of joint production. A given unrewarded service is more likely to decrease due to multitasking if other competing unrewarded services are more profitable; conversely, it is less likely to decrease if it is jointly produced with other rewarded services.”
To give a simple example, consider the case where a patient should receive three types of preventive services but only services #1 and #2 are rewarded. In this case, it is unclear whether physicians would increase the quantity of services #1 and #2. If services #1 and #2 are rewarded equally but service #2 takes less time to deliver, physicians may increase provision of service #2 but decrease provision of service #1. Further, if physician time is limited, they may be less likely to provide service #3, since this service does not earn a P4P reward. On the other hand, if services #2 and #3 typically occur together, physicians could also end up performing more of service #3.
If this three-service case seems confusing, consider that most P4P programs have tens or hundreds of measures. Because of joint production and multitasking interactions, the net effect on quality is unclear.
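The three-service example can be made concrete with a toy model. The sketch below is not Sherry's model; it is a minimal illustration that assumes a physician with a fixed time budget who maximizes a quadratic payoff, so each service has diminishing marginal revenue. All names and numbers (`Service`, `optimal_quantities`, the fees, time costs, bonus size and 20-hour budget) are hypothetical, and joint production is left out.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    fee: float            # base marginal revenue per unit (hypothetical value)
    time_per_unit: float  # physician hours per unit (hypothetical value)
    bonus: float = 0.0    # P4P reward per unit; 0 means unrewarded


def optimal_quantities(services, hours):
    """Maximize sum((fee + bonus) * q - 0.5 * q**2) s.t. sum(time * q) == hours.

    The first-order conditions give the interior solution
    q_i = (fee_i + bonus_i) - lam * time_i,
    where lam is the shadow price of one hour of physician time.
    """
    weighted_revenue = sum(s.time_per_unit * (s.fee + s.bonus) for s in services)
    time_squared = sum(s.time_per_unit ** 2 for s in services)
    lam = (weighted_revenue - hours) / time_squared
    quantities = {
        s.name: (s.fee + s.bonus) - lam * s.time_per_unit for s in services
    }
    # The closed form is only valid when time is scarce and every q_i >= 0.
    if lam < 0 or min(quantities.values()) < 0:
        raise ValueError("parameters fall outside the interior solution")
    return quantities


def scenario(bonus):
    # Services 1 and 2 get the same bonus; service 1 takes three times as long.
    services = [
        Service("service_1", fee=30.0, time_per_unit=3.0, bonus=bonus),
        Service("service_2", fee=10.0, time_per_unit=1.0, bonus=bonus),
        Service("service_3", fee=10.0, time_per_unit=1.0),
    ]
    return optimal_quantities(services, hours=20.0)


before = scenario(bonus=0.0)
after = scenario(bonus=2.0)
for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

Under these assumed parameters, the equal bonus raises service #2 (1.82 to 3.09) but lowers both the time-intensive rewarded service #1 (5.45 to 5.27) and the unrewarded service #3 (1.82 to 1.09). The bonus makes each hour more valuable, and service #1 uses the most hours per unit. Other parameter choices can push service #1 up instead, which is the ambiguity the paper describes.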
Based on comparative statics from a simple three-service model, Sherry makes three recommendations for improving P4P programs.
- Rewarding fewer quality measures may be preferred. “Rewarding a smaller set of quality measures limits the scope of P4P programs, but may stimulate greater performance improvement if multitasking between rewarded services is mitigated.” Further, rewarding fewer services decreases the administrative complexity and cost of the program.
- Attention should be paid to unrewarded services as well. This point has been made previously in the multitasking literature, most notably by Holmstrom and Milgrom (1991; summary post here).
- The interactions between the production of different types of health services should be considered when designing a P4P program. “When rewarding a single quality measure, selecting a measure that is jointly produced with other desirable health services and quality measures can contribute to broader quality improvements beyond the rewarded measure. Similarly, avoiding measures that induce higher amounts of multitasking can help avoid disruptions in care.”