interventions, coverage and cost limitations, and disease management, the literature shows mixed results for these modalities in improving patient outcomes and controlling cost. Our objective is to evaluate the potential of data mining methods to identify novel risk factors for chronic disease and stratification of enrollee utilization, which can be used to develop new methods for targeting disease management services to maximize benefits to both enrollees and payers.
Key words: substance use disorder; Bayesian belief network; chemical dependency; predictive modeling
Methods: For our evaluation, we used DecisionQ machine learning algorithms to build Bayesian network models of a representative sample of data licensed from Thomson-Reuters’ MarketScan consisting of 185,322 enrollees with three full-year claim records. Data sets were prepared, and a stepwise learning process was used to train a series of Bayesian belief networks (BBNs). The BBNs were validated using a 10 percent holdout set.
Results: The networks were highly predictive, with the two best risk-stratification BBNs producing areas under the curve (AUCs) for SUD positive of 0.948 (95 percent confidence interval [CI], 0.944–0.951) and 0.736 (95 percent CI, 0.721–0.752), respectively, and for SUD negative of 0.951 (95 percent CI, 0.947–0.954) and 0.738 (95 percent CI, 0.727–0.750), respectively. The cost estimation models produced AUCs ranging from 0.72 (95 percent CI, 0.708–0.731) to 0.961 (95 percent CI, 0.95–0.971).
Conclusion: We were able to successfully model a large, heterogeneous population of commercial enrollees, applying state-of-the-art machine learning technology to develop complex and accurate multivariate models that support near-real-time scoring of novel payer populations based on historic claims and diagnostic data. Initial validation results indicate that we can stratify enrollees with SUD diagnoses into different cost categories with a high degree of sensitivity and specificity, and the most challenging issue becomes one of policy. Due to the social stigma associated with the disease and ethical issues pertaining to access to care and individual versus societal benefit, a thoughtful dialogue needs to occur about the appropriate way to implement these technologies.
In 2007, an estimated 19.9 million persons aged 12 or older were current illicit drug users, and 17.0 million people were heavy drinkers.1 Results from one recent study indicate that the risk for cocaine dependence is 5–6 percent among all those who have used the drug.2 While it is generally acknowledged that substance use disorder (SUD) has high healthcare costs, comorbidities, and economic costs to the nation, the development of systematic approaches to disease management in this population has been handicapped by what is still a limited knowledge of disease mechanics compared to diseases such as cancer and heart disease, for which we tend to have a more advanced understanding of disease physiology and genetics supporting evidence-based population intervention and treatment models. While we are making tremendous strides in understanding the genetics and physiology of SUD, there is still a need to enhance our tool set for intervention and management. Further, SUD contributes not only to behavioral health costs but also to overall medical costs, as shown in a large retrospective cohort study by Clark et al. that identified significant medical cost increases in a study cohort of 148,457 Medicaid beneficiaries.3
This paper explores the use of machine learning and Bayesian classification models to develop broadly applicable models for the identification of disease risk factors and stratification models to guide the placement of health plan enrollees with SUD into appropriate disease management programs. While the high costs and morbidities associated with SUD are understood by payers, who manage it through utilization review, acute interventions, coverage and cost limitations, and disease management, the literature shows mixed results for these modalities in improving patient outcomes and controlling cost. A selection bias has been documented in disease management whereby members who are already sick are more motivated to take advantage of disease management.4 This selection bias may contribute to reduced success rates since enrollment and management often occur in response to an acute episode rather than prophylactically to prevent acute episodes. As the literature shows, in appropriately targeted populations, disease management can prove very successful.5 However, the literature is also severely critical of the current state of the art in developing personalized, stratified models of care in behavioral health, as evidenced by the disappointing results in the Matching Alcoholism Treatments to Client Heterogeneity (Project MATCH) study, and there is vigorous debate in the literature over the benefit of these types of models.6–8 Our objective is to evaluate the potential of data mining methods to address some of the shortcomings of current practice.
Our method has the ability to address many of these limitations by supporting more complex rule sets that can effectively account for the inherent complexity of comorbid interactions. Further, we focused on developing our tools from a claims database since this represents a common substrate available to both payers and providers, as the literature often laments the lack of access to good data sets for evaluation.9 By focusing on the most widely available data, we are seeking to develop a set of methods and tools that have the potential to improve patient risk stratification and enrollment in disease management programs to address current shortcomings in practice by developing an individualized model of risk stratification using broad populations and readily available data.10
While there is extensive literature on the comorbidities and impact that SUD has on other acute and chronic conditions, these multiple, complex relationships are often studied in a bivariate context. Assembling these into a robust, useful rule set is nontrivial. Some work has been done using regression modeling and clustering; however, these methods suffer from limitations with respect to their ability to codify complex nonlinear relationships, ingest and model large sample sizes, and provide transparent outputs to users.11–13
We have selected machine-learned Bayesian belief network (BBN) probabilistic classifiers because they address several of these key issues. BBNs allow for the representation of complex, nonlinear systems in a transparent format that is tractable, or easily comprehensible, to the user.14, 15 BBNs are effective at representing complex biological systems in a robust manner.16, 17 The use of Bayesian networks has historically been limited by a high level of inherent computational complexity. However, the advent of increased computational power and the development of machine-learning algorithms allow us to overcome these challenges and develop novel BBNs directly from large, heterogeneous training cohorts.18, 19 The use of BBNs and machine learning is well established in research and clinical practice in areas such as risk prognosis, diagnosis, and expected outcomes in heart disease, cancer, and trauma.20–23 We were unable to identify existing literature exploring the application of this method to risk stratification of SUD enrollees using claims data; thus, we believe this is a novel assessment of the potential value of this technology.
For our evaluation, we used DecisionQ machine-learning algorithms to build Bayesian network models of a representative sample of data licensed from Thomson-Reuters’ MarketScan. The sample contained detailed insurance claim information on 400,000 randomly selected MarketScan enrollees for the years 2004, 2005, and 2006. The data records (which are deidentified) have information on demographics, inpatient admissions (including detailed procedure codes, diagnosis codes and charges), outpatient services, and pharmacy claims. We restricted the sample to the 185,322 individuals who remained enrolled all three years. We used the resulting models to identify key relationships and identify combinations of factors to calculate both the individual risk probability of SUD as well as an individual estimate of total annual future claims given demographic factors and comorbid conditions. Future claims estimates can also be derived by making assumptions about treatment and the impact that treatment may have on utilization.
Definition of SUD and Training Data Set
Thomson Reuters provided a set of 42 tables of information on a randomly selected sample of 400,000 enrollees, aged 18 to 65, from their MarketScan database. The data, which cover the years 2004, 2005, and 2006, include details on each inpatient, outpatient, and pharmacy claim together with demographic information by enrollee. The database is deidentified, but each record has a unique identification field suitable for matching information by enrollee across the various tables.
We began with 29.7 million data records describing three years of clinical history of 400,000 enrollees. Therefore, the process of arranging selected elements by enrollee (“flattening” the data) for modeling was nontrivial. To accomplish it we used SAS routines to merge tables within category, but across years. We then sorted each table by encounter date and used a series of Java applications (operating across a JDBC-ODBC bridge) to extract and aggregate required database fields by unique enrollee. The Java routines produced comma-delimited text files, which were incorporated into a Microsoft Access database.
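The flattening step can be illustrated with a minimal Python sketch. It is not the SAS and Java pipeline described above, only an illustration of collapsing one-row-per-claim records into one row per enrollee. The input keys (enrollee_id, year, setting, mdc, paid) and the paidYYYY output fields are hypothetical; the mdcNCountInp/mdcNCountOut naming mirrors the variable names used in the models.

```python
from collections import defaultdict

def flatten_claims(claims):
    """Collapse one-row-per-claim records into one row per enrollee.

    `claims` is an iterable of dicts with hypothetical keys
    'enrollee_id', 'year', 'setting' ('inp' or 'out'), 'mdc', and 'paid'.
    """
    rows = defaultdict(lambda: defaultdict(float))
    for c in claims:
        row = rows[c["enrollee_id"]]
        # Count diagnoses per Major Diagnostic Category and care setting.
        suffix = "Inp" if c["setting"] == "inp" else "Out"
        row[f"mdc{c['mdc']}Count{suffix}"] += 1
        # Accumulate paid amounts per claim year.
        row[f"paid{c['year']}"] += c["paid"]
    return {eid: dict(row) for eid, row in rows.items()}

# Toy input: four claims for two enrollees (values are made up).
claims = [
    {"enrollee_id": "A", "year": 2004, "setting": "out", "mdc": 20, "paid": 120.0},
    {"enrollee_id": "A", "year": 2004, "setting": "out", "mdc": 20, "paid": 80.0},
    {"enrollee_id": "A", "year": 2005, "setting": "inp", "mdc": 4, "paid": 5000.0},
    {"enrollee_id": "B", "year": 2006, "setting": "out", "mdc": 19, "paid": 60.0},
]
flat = flatten_claims(claims)
```

Each enrollee ends up as a single record of counts and totals, which is the shape a classifier needs as input.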
Upon analysis, we observed that 185,322 of the 400,000 MarketScan population were present in all three years of data. We used that subgroup as our study population. This provided us with a well-defined, representative cohort of adult enrollees. A randomly selected training set of 166,699 (roughly 90 percent) was used for model building. The remaining 10 percent, or 18,623 individuals, became the holdout set, which we used subsequently to validate the models. We identified a set of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes directly associated with the diagnosis of SUD. We defined a database enrollee as having SUD if any of the diagnosis codes in our set appeared in a claim record, either as the primary diagnosis or a nonprimary diagnosis. All enrollees in the three-year cohort who met these criteria were coded as such in our final database. (See Appendix A.)
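The case definition and the random split can be sketched as follows. This is an illustration under stated assumptions, not the study's code. The three ICD-9-CM prefixes are placeholders standing in for the Appendix A code set, and the function names, fixed seed, and exact 10 percent fraction are hypothetical.

```python
import random

# Placeholder ICD-9-CM prefixes for illustration only; the study's actual
# code set is the one listed in its Appendix A.
SUD_PREFIXES = ("303", "304", "305")

def has_sud(diagnosis_codes):
    """True if any primary or nonprimary diagnosis code matches the SUD set."""
    return any(code.startswith(SUD_PREFIXES) for code in diagnosis_codes)

def split_holdout(enrollee_ids, holdout_fraction=0.10, seed=0):
    """Randomly partition enrollees into (training, holdout) lists."""
    ids = sorted(enrollee_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_holdout = round(len(ids) * holdout_fraction)
    return ids[n_holdout:], ids[:n_holdout]

train, holdout = split_holdout(range(1000))
```

Splitting by enrollee rather than by claim keeps every record for a given person on one side of the partition, so the holdout set stays independent of training.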
BBN Model Development
We used our prepared data set to train a series of BBNs to estimate individual risk of SUD as well as expected future healthcare utilization. BBNs have increased in popularity as a method to classify and interpret complex clinical and pathologic information because they more accurately reflect the nonlinear and multifactorial nature of biology.24 A Bayesian network encodes the joint probability distribution of all the variables in a domain by building a network of conditional probabilities. It uses conditional independence assumptions to make the representation tractable. The networks are directed graphs that incorporate parent-child relationships between nodes. Essentially, they provide a hierarchy of how the knowledge of a priori evidence influences the downstream likelihood of an event (e.g., “I know that enrollee X has hypertension; therefore, the probability of kidney disease relative to the overall population is y”). The model offers a transparent, graphical representation of these probabilities that a user can interpret, unlike a neural network, which uses complex calculations that cannot be represented to the user and is thus opaque.
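The hypertension example can be made concrete with a two-node network. The probabilities below are invented purely for illustration and are not drawn from the study data. The sketch shows how a conditional probability table yields the marginal probability of the child node, the relative likelihood once the parent is known, and the reverse inference by Bayes' rule.

```python
# Hypothetical two-node network: hypertension -> kidney disease.
# All probabilities are made-up illustration values.
p_htn = 0.30                                 # P(hypertension)
p_kidney_given = {True: 0.10, False: 0.02}   # P(kidney disease | hypertension state)

# Marginal probability of kidney disease, summing over the parent node.
p_kidney = p_htn * p_kidney_given[True] + (1 - p_htn) * p_kidney_given[False]

# Likelihood of kidney disease relative to the overall population
# once hypertension is known.
relative = p_kidney_given[True] / p_kidney

# Bayes' rule runs the inference in the other direction:
# P(hypertension | kidney disease).
p_htn_given_kidney = p_htn * p_kidney_given[True] / p_kidney
```

A larger network repeats the same bookkeeping node by node, with the conditional independence assumptions keeping each table small.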
We used machine learning to calculate prior probabilities and identify the structure of our BBN. Prior probabilities are derived from the data to be modeled by calculating distributions of discrete states for categorical variables or using binning to convert continuous variables into categorical variables. A heuristic search method is used to generate hypothetical models with different conditional independence assumptions in order to identify the best model structure. The heuristic search method used in this study benefits from two proprietary advances, one a more efficient caching and query system that allows us to consider an order of magnitude more data, and the other a very efficient search architecture that provides additional flexibility in searching for the optimal model structure. These improvements have been shown to perform 1 to 5 percent better than a standard heuristic algorithm in terms of model quality score.25
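Binning and the derivation of prior probabilities can be sketched in a few lines of Python. The bin edges, labels, and function names are hypothetical and are not the cut points used in the study. The sketch only shows the general idea of converting a continuous variable to discrete states and taking the empirical state distribution as the prior.

```python
from collections import Counter

def bin_cost(paid, edges=(1000, 5000, 10000)):
    """Map a continuous paid amount to a categorical range label.

    The default edges are illustrative, not the study's cut points.
    """
    lower = 0
    for edge in edges:
        if paid < edge:
            return f"{lower}-{edge}"
        lower = edge
    return f"{lower}+"

def prior_distribution(values):
    """Empirical probability of each discrete state."""
    counts = Counter(values)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}

paid = [250.0, 800.0, 4200.0, 12000.0]
priors = prior_distribution(bin_cost(p) for p in paid)
```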
The modeling team applied the heuristic search algorithms in a stepwise modeling process to optimize the robustness and utility of each BBN. The objective of this process was to produce the most robust classifier with respect to identification of SUD or stratification into expected utilization categories through better attribute selection and continuous testing.
This process can be summarized as follows:
- preliminary modeling identifies appropriate machine learning parameters, data quality issues, and confounding attributes that reduce model accuracy;
- global modeling sets appropriate machine-learning parameters, prunes attributes, and allows investigators to observe the global data structure;
- naïve modeling operates with an assumption that features driving a specific dependent outcome of interest are mutually independent, therefore providing insight into the direct contribution of individual features; and
- focused modeling runs on subsets of variables identified in the prior steps to derive a more focused BBN than that obtained in global modeling.
Continuous testing is used to score candidate networks and identify the best network and structure, with the objective of balancing exhaustive exploration of features against the risk of overfitting.
Given the high dimensionality of the data being used and the problem under consideration, the team recognized that to maximize predictive power, a series of different classifiers should be trained and independently evaluated using the test set, and then the best classifiers for risk stratification and cost estimation should be selected and used to derive insights and rules for disease management enrollment. As a result, we produced two sets of models: risk stratification models for the identification of SUD enrollees in the broad population, and cost/treatment models for the estimation of utilization, cost, and therapy response within different enrollee subsets.
Each network was validated using a holdout data set of 18,623 enrollees for interset validation. The validation set was further broken into 10 different subsets to provide an estimate of both classifier accuracy and variance of classifier accuracy. The test set predictions were then used to calculate receiver operating characteristic (ROC) curves (sensitivity vs. 1 − specificity) for each model. The ROC curve was calculated by comparing the predicted value for each variable to the known value in the test set on a case-specific basis and then used to calculate area under the curve (AUC), a metric of overall model quality.
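The validation metric can be sketched in plain Python. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which equals the area under the ROC curve. How the holdout set was actually divided into 10 subsets is not stated, so the interleaved split below is an assumption, as are the function names.

```python
from statistics import mean, stdev

def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_subset(scores, labels, k=10):
    """Score k interleaved subsets; return the mean and standard deviation of AUC."""
    results = [auc(scores[i::k], labels[i::k]) for i in range(k)]
    return mean(results), stdev(results)

# Toy data (made up); k=2 only because the example is tiny.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
subset_mean, subset_sd = auc_by_subset(scores, labels, k=2)
```

The pairwise comparison is quadratic in the number of cases; it is written for clarity, and a rank-based formulation would be preferable on a holdout set of this size.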
From our MarketScan population we calculated some basic statistics to describe our study population. Of total enrollees in 2004, 23.9 percent dropped out in 2005, and 53.7 percent had dropped out by 2006. Of enrollees who had any SUD diagnosis in 2004, 21.3 percent dropped out in 2005, and 52.5 percent had dropped out by 2006. These numbers are essentially comparable. Looking at the members of our population who remained enrolled for the entire study period, 4.04 percent had a diagnosis of SUD (either primary or nonprimary, inpatient or outpatient) during the study period, and in each of 2004, 2005, and 2006 the rate of SUD diagnosis ranged from 1.5 to 1.7 percent.
Having identified our SUD enrollees, we applied machine learning to build a Bayesian classifier to describe the associations in our commercial enrollee population. Many clinical and demographic factors are involved in risk stratifying enrollees with SUD. Estimating related utilization involves multiple diseases and multiple diagnoses with multiple mechanisms. BBNs allow us to represent these complex relationships in an efficient and user-friendly manner. Each classifier we trained has a unique hierarchy of information, or structure. These structures help us to identify how different variables influence the expected likelihood of an outcome, such as SUD diagnosis or expected cost range. The structure of the BBN is meaningful in itself in that it provides a hierarchy of conditional dependence, or the likelihood of a given outcome given known information. It is important to note that this is not causality, but rather conditional dependence, which can be thought of as co-occurrence.
Figure 1, one of our risk stratification models, uses a full BBN. In this model, we have flattened each enrollee data record to look at the presence or frequency of individual values of Major Diagnostic Category (MDC), a classification used in the Thomson Medstat database to group diagnoses into major categories (e.g., SUD or cardiovascular disease).
In this figure we can interpret the structure relative to our outcomes of interest, which are highlighted in blue: mdc20CountOut (the count of outpatient SUD diagnoses in a given claims year), mdc20CountInp (the count of inpatient SUD diagnoses in a given claims year), and anyCDAnyYear (whether the enrollee had any diagnoses of SUD during the study period). These outcomes have conditional dependence (represented by lines in the figure) with the following first-order predictors (highlighted in red): industry_inp1 (enrollee industry), mdc19CountInp and mdc19CountOut (counts of inpatient and outpatient behavioral health disorder diagnoses), and mdc4CountOut (count of outpatient diseases of the respiratory system). These first-order predictors are not necessarily causative of SUD, but rather are the most information-rich features for estimating the likelihood of a concurrent SUD diagnosis. These first-order predictors are conditionally dependent in their own right with second-order predictors (highlighted in yellow) including diseases of the nervous system; diseases of the ear, nose, and throat; diseases of the circulatory system; diseases of the kidney and urinary tract; and other health services. The full BBN contains multiple nonlinear relationships representing conditional dependence between variables that predict our outcome of interest.
Figure 2, on the other hand, details a naïve BBN classifier designed to stratify individual enrollee costs based on historical data. The naïve BBN assumes that features associated with a specific dependent outcome of interest are mutually independent. It therefore provides insight into the direct information contribution of individual features. It also supports the development of quantitative contribution reports. This classifier uses available prior information to provide a specific estimate of cost range. In Figure 2, therefore, all the features that are connected to the 2006 cost range outcome (2006PaidRange, in the center of the diagram) influence the estimate of prospective enrollee cost: every feature except those excluded from the network (upper left) acts as a first-order predictor, weighted according to its goodness of fit (strength of association) with the dependent outcome. Knowledge of demographics and claims history can be used to estimate prospective cost.
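The independence assumption behind a naïve classifier can be illustrated with a small categorical naive Bayes sketch. This is a generic textbook formulation with add-one smoothing, not the DecisionQ algorithm. The feature names (priorInpatient, cnsDrug), the two-level cost outcome, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, target):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(r[target] for r in rows)
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    values = defaultdict(set)              # feature -> distinct observed values
    for r in rows:
        for f, v in r.items():
            if f != target:
                feature_counts[(f, r[target])][v] += 1
                values[f].add(v)
    return class_counts, feature_counts, values

def predict_proba(model, evidence):
    """Posterior over classes given evidence, using add-one smoothing."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # class prior
        for f, v in evidence.items():
            # Each feature contributes independently given the class.
            score *= (feature_counts[(f, c)][v] + 1) / (n_c + len(values[f]))
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}

# Toy training records (made up).
rows = [
    {"priorInpatient": "yes", "cnsDrug": "yes", "cost": "high"},
    {"priorInpatient": "yes", "cnsDrug": "no", "cost": "high"},
    {"priorInpatient": "no", "cnsDrug": "no", "cost": "low"},
    {"priorInpatient": "no", "cnsDrug": "yes", "cost": "low"},
    {"priorInpatient": "no", "cnsDrug": "no", "cost": "low"},
]
model = train_naive_bayes(rows, target="cost")
posterior = predict_proba(model, {"priorInpatient": "yes", "cnsDrug": "yes"})
```

Because each feature multiplies into the score separately, the contribution of any single feature can be read off directly, which is what makes per-feature contribution reports straightforward for this model family.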
Figure 3 details an additional naïve BBN, in this instance focused on general population risk stratification. The objective of this BBN is to use historical claims record data to develop an estimate of individual risk of SUD, based upon the overall prevalence in our three-year study cohort, that can be used to identify risk factors that can be disseminated to clinicians and providers to assist in diagnosis. Similar to the model in Figure 2, this model uses historical diagnoses, pharmacy data, enrollee demographic data, and utilization data, but rather than estimating annual cost, it estimates the likelihood of SUD diagnosis within a three-year enrollment period.
For each of the BBNs discussed above, we used our 10 percent holdout set consisting of 18,623 enrollees to validate the models for robustness and statistical quality. For each model, we scored the holdout test set and calculated positive and negative predictive values and area under the curve. The tables detail the validation results for both the risk stratification models and the cost/treatment models.
We tested four different risk stratification models, and identified two risk stratification models with strong characteristics as measured by AUC. These models produced AUCs for SUD positive of 0.948 (95 percent confidence interval [CI], 0.944–0.951) and 0.736 (95 percent CI, 0.721–0.752), respectively, and SUD negative of 0.951 (95 percent CI, 0.947–0.954) and 0.738 (95 percent CI, 0.727–0.750), respectively. We also developed a risk stratification model to segment enrollees positive for SUD into a likely SUD category. For further validation, we used our holdout set and our best risk estimation model to assess the sensitivity of detection and predictive value at different probability thresholds. To clarify, because BBNs are probabilistic, the user has the option of deciding what level of probability constitutes a positive or negative prediction. Table 1 uses our optimal risk stratification model and details sensitivity, specificity, and negative and positive predictive values, as well as estimated cases detected per 100,000 enrollees and number of false positives per true positive as a measure of model robustness.
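The effect of choosing a probability threshold can be sketched as follows. The function and key names are hypothetical, the toy probabilities are invented, and "cases detected per 100,000" is computed here as true positives scaled to 100,000 enrollees, which is an assumption about how that column is defined.

```python
def threshold_metrics(probabilities, labels, threshold):
    """Confusion-matrix summaries when p >= threshold counts as SUD positive."""
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, labels):
        if p >= threshold:
            if y:
                tp += 1
            else:
                fp += 1
        elif y:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (zero denominators) are reported as None.
        return num / den if den else None

    n = tp + fp + tn + fn
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "detected_per_100k": ratio(100_000 * tp, n),
        "false_pos_per_true_pos": ratio(fp, tp),
    }

# Toy predicted probabilities and known SUD status (made up).
probs = [0.9, 0.7, 0.6, 0.4, 0.2, 0.1]
truth = [1, 1, 0, 1, 0, 0]
at_half = threshold_metrics(probs, truth, threshold=0.5)
strict = threshold_metrics(probs, truth, threshold=0.95)
```

Raising the threshold trades sensitivity for fewer false positives; at a very strict threshold no enrollee is flagged and the positive predictive value is undefined.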
In addition to evaluating three-year risk stratification, we also assessed the use of the multiyear model to risk score enrollees on a prospective year. To do this, we used our holdout set of enrollees and used 2004–2005 characteristics to risk score enrollees for a diagnosis of SUD in 2006. We then stratified our enrollee population using the probability of SUD and selected ranked cohorts in sets of 50, 100, 250, and 500. One of the challenges in risk stratifying SUD enrollees is that we believe the condition to be pervasively underdiagnosed as a result of social stigma, ethical issues, limited treatment options, and poor reimbursement.26, 27 To address this effect, we calculated sensitivity (rate of detection) and predictive value (accuracy) on both a one-year and two-year diagnosis threshold for each ranked cohort. Our results are summarized in Table 2. Each ranked cohort is listed individually, and we calculated the sensitivity (detection rate) of an SUD claim for 2005, for 2006, and for 2005 and 2006 combined. It is important to note that these are only claims, and there may be enrollees who are clinically indicated but for whom no claims were filed. We also calculated the positive predictive value, or the probability that an enrollee flagged as a high SUD risk had a claim for SUD in 2005, 2006, or either. We believe that this number is biased downward by the underdiagnosis of SUD. The optimum cohort appears to be the top 250 group. In this group, we can successfully identify one out of three enrollees for two-year risk, and 70 percent of our estimates for this group are accurate.
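The ranked-cohort evaluation can be sketched as a top-k calculation. The function name and the toy scores are hypothetical. Sensitivity is taken as the share of all enrollees with an SUD claim who fall inside the top-k cohort, and positive predictive value as the share of the cohort with a claim.

```python
def ranked_cohort_metrics(risk_scores, had_claim, k):
    """Sensitivity and positive predictive value for the k highest-risk enrollees."""
    order = sorted(range(len(risk_scores)),
                   key=lambda i: risk_scores[i], reverse=True)
    top = order[:k]
    hits = sum(1 for i in top if had_claim[i])
    total_positive = sum(1 for y in had_claim if y)
    sensitivity = hits / total_positive if total_positive else None
    ppv = hits / len(top) if top else None
    return sensitivity, ppv

# Toy risk scores and subsequent SUD claim indicators (made up).
scores = [0.9, 0.8, 0.7, 0.2, 0.1]
claims = [1, 0, 1, 1, 0]
sens_top3, ppv_top3 = ranked_cohort_metrics(scores, claims, k=3)
sens_top2, ppv_top2 = ranked_cohort_metrics(scores, claims, k=2)
```

Widening the cohort can only hold or raise sensitivity, while positive predictive value usually falls, which is the trade-off behind choosing a cohort size.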
We also used our holdout set to evaluate the predictive power of five different BBNs to estimate the 2006 cost based on prior years (2004 and 2005). Based upon these statistics, the best two predictive models are the Naïve Inclusive Cost Model and the Demographics, Diagnoses, and Cost Model. We elected to use the Naïve Inclusive Cost Model for our insights and rules because of the higher input dimensionality it supports. Table 3 details AUC statistics for each range of expected cost.
Our validation analysis showed that our classifiers are robust and can be used to risk stratify diagnosed enrollees and estimate individual expected costs with a high degree of accuracy. To further support this analysis, we also used our holdout set of 18,623 to estimate 2006 cost using 2004–2005 data while suppressing data from each enrollee’s 2006 claims record. To do this, we applied our prospective cost/utilization estimation model (depicted in Figure 2) to predict next-year cost ranges as described in Table 3. The result is a set of estimates of next-year cost assuming no disease management for SUD. We then compared the estimated cost range to the actual known cost range in 2006. Table 4 details the comparison of cost ranges between the predictions and the actual known costs in 2006. In 64 percent of predicted cases, costs were predicted in the correct range, and in 80 percent of cases, costs were within one range of accuracy.
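The two agreement figures (correct range, and within one range) can be computed by treating the cost ranges as an ordered scale. The range labels and toy predictions below are hypothetical and are not the ranges defined in Table 3.

```python
# Hypothetical ordered cost ranges; the study's ranges are those in its Table 3.
COST_RANGES = ["<1k", "1k-5k", "5k-10k", ">10k"]

def range_agreement(predicted, actual, ranges=COST_RANGES):
    """Share of cases in the exact range and within one adjacent range."""
    index = {label: i for i, label in enumerate(ranges)}
    gaps = [abs(index[p] - index[a]) for p, a in zip(predicted, actual)]
    exact = sum(g == 0 for g in gaps) / len(gaps)
    within_one = sum(g <= 1 for g in gaps) / len(gaps)
    return exact, within_one

# Toy predicted and actual ranges (made up).
predicted = ["<1k", "1k-5k", ">10k", "5k-10k", "<1k"]
actual = ["<1k", "5k-10k", ">10k", "<1k", "<1k"]
exact, within_one = range_agreement(predicted, actual)
```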
As we move into an era of evidence-based, information-driven personalized care, there is a need for tools and methods that support individualized patient disease management. While there has been interesting early work in these types of approaches for SUD, the results have largely been disappointing.28–31 However, studies have shown that, if properly implemented, proactive targeted intervention and therapy matching can have a favorable impact on patient outcomes and costs.32, 33 The current paradigm is focused on benefits limitation, broad-based disease management, and carving out behavioral health benefits with success measures focused on total savings rather than individual benefit.34 Our objective in this study was to develop a novel approach, one in which we use sophisticated classification models to identify very specific enrollee subpopulations, some as small as a handful of patients, with very different utilization profiles. These models allow us to develop highly individualized estimates of utilization and the potential benefit of disease management given prior utilization history of those who have been diagnosed with SUD and are thus candidates for disease management. Thus, with validated models we can stratify our disease management efforts and enroll patients into different service levels based upon forward-looking estimates of both utilization and the potential impact that disease management may have on future utilization within the known SUD population.
We can use the models to identify novel insights, extract rules, and develop case studies of how the models would perform when applied to a novel population. Within these populations, we can use enrollee-specific historical information to calculate enrollee-specific estimates of costs over the next 12 months and likely cost savings resulting from successful intervention and disease management, allowing payers and disease managers to develop stratified service levels that are appropriate to the expected risks and benefits of disease management in specific subgroups. As an example, we can estimate the relative risk of recurrent SUD diagnosis in those patients who have been identified through diagnosis using claims history. Table 5 and Table 6 detail the relative risk of a new SUD claim over the next three years based upon inpatient and outpatient diagnoses in the first year of enrollment. These tables use only the primary ICD-9 code from the first claim in 2004, with the first inpatient claim in Table 5 and the first outpatient claim in Table 6, and these data are used to calculate the probability of an SUD diagnosis in the subsequent three years and the corresponding relative risk. The first column also details expected prevalence based upon our study population. This type of data can be used to focus enrollment efforts on patients who would benefit most from disease management support.
The ability to estimate utilization and cost further allows us to detail the relative increase in expected annual cost of selected chronic diseases and trauma when an enrollee also has a diagnosis of SUD. Further, we have flagged additional conditions, such as HIV and eye disorders, which have known associations with SUD.35 Table 7 details the expected annual cost differential for enrollees diagnosed with selected chronic conditions both with and without SUD. In addition to looking at selected high-cost chronic diseases, we also detail several conditions where SUD has a surprising cost impact. Several of these relationships have already been identified in the literature in terms of the relationship to cost, utilization, and outcomes in respiratory disease, trauma, and infectious disease.36–38 Note that these estimates include only cost estimation derived from diagnosis and do not include other factors such as historic pharmacy utilization, which further impact expected cost.
We can also combine multiple known factors in the model to produce combined estimates of risk and cost. In Table 8, we use an enrollee’s pharmacy history (central nervous system [CNS] drugs) and an outpatient diagnosis of diseases of the hepatobiliary system and pancreas to calculate the risk of SUD—in this instance, 31.3 percent, or a 7.8X relative risk.
We can then use the cost models to estimate the expected cost distribution of the enrollee with and without SUD. Table 9 details expected next-year cost distribution without SUD, while Table 10 details the expected next-year cost distribution of an identical case with SUD: the enrollee has a pancreatic disorder and is using some type of CNS drug. Without SUD, the annual cost of this enrollee is expected to be above $10,000 only 43.6 percent of the time. With SUD, the annual cost of this enrollee is expected to exceed $10,000 73.3 percent of the time.
As a further analysis, we used our validated cost model to estimate hypothetical potential savings attributable to disease management. We sought to estimate the reduction in 2006 total enrollee cost if enrollees with SUD were successfully treated at the end of 2005, making the assumption that successful disease management of SUD would change utilization patterns. Accordingly, we suppressed variables that describe utilization in 2004 and 2005 and all variables related to 2006 utilization, and we compared the estimated cost distributions of all 18,623 enrollee cases in our holdout set between those who had SUD and identical matched cases without SUD. Using the analysis above, we calculated an estimated hypothetical 2006 post-treatment cost for each enrollee and calculated an estimated savings against the actual known 2006 cost for each enrollee. We then ranked the entire cohort by SUD risk score and then ranked within each scoring group by estimated savings. We then calculated average per-enrollee savings for each cohort (top 50, 100, 250, and 500) and calculated estimated savings of the following:
- Top 50 enrollees, average savings $23,284 per enrollee
- Top 100 enrollees, average savings $12,317 per enrollee
- Top 250 enrollees, average savings $4,927 per enrollee
- Top 500 enrollees, average savings $2,463 per enrollee
This analysis indicates, for example, that selecting the top 500 enrollees (out of our 18,623-enrollee test set) produces an expected cost reduction benefit of approximately $2,500 per enrollee in annual savings, excluding the costs of disease management and treatment. Restricting our set to the top 250 cases produces an expected savings of approximately $5,000 per enrollee, and by further restricting our disease management population to the top 100 enrollees, we increase our expected average reduction to more than $12,000 per patient. Using this approach, we can stratify a disease management population and tune our marginal benefit to maximize both enrollee benefit and financial return in light of the expected costs and success rate of a given disease management program. The actual return is highly dependent on the individual payer and treatment modality, as the cost and success rate of interventions vary greatly, from as much as $30,000 per month at the Betty Ford Clinic to as little as $300 per month for outpatient programs or $147 per month for clinic-based methadone treatment.39, 40 These cost estimates need to be further adjusted based on expected success and recidivism rates, as these rates can vary significantly.41, 42 Accurate, validated stratification tools can allow payers to make significantly more informed decisions about how disease management strategies can be employed in a stratified way to maximize benefit to both enrollee and plan.
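The idea of tuning marginal benefit can be illustrated by converting the four reported cohort averages into marginal savings per additional enrollee. This is arithmetic on the rounded published averages only, under the assumption that the cohorts are nested (the top 50 are contained in the top 100, and so on). The function name is hypothetical, and small or slightly negative marginal values fall within rounding error.

```python
# (cohort size, reported average savings per enrollee in dollars)
COHORTS = [(50, 23284), (100, 12317), (250, 4927), (500, 2463)]

def marginal_savings(cohorts):
    """Average savings for each incremental band of ranked enrollees.

    Returns (first_rank, last_rank, average savings per enrollee in band),
    assuming each cohort contains all smaller cohorts.
    """
    bands = []
    prev_n, prev_total = 0, 0
    for n, average in cohorts:
        total = n * average  # implied total savings for the cohort
        bands.append((prev_n + 1, n, (total - prev_total) / (n - prev_n)))
        prev_n, prev_total = n, total
    return bands

bands = marginal_savings(COHORTS)
for first, last, per_enrollee in bands:
    print(f"ranks {first}-{last}: ${per_enrollee:,.0f} per additional enrollee")
```

Read literally, the implied cohort totals are nearly identical from the top 100 onward, which would place almost all of the estimated savings among the highest-ranked enrollees. A payer could compare each band's marginal figure against the per-enrollee cost of a given program to decide where to stop enrolling.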
One interesting phenomenon in our stratification exercise was that the expected enrollee savings did not necessarily appear to scale with general utilization. Many patients in lower utilization categories appeared to score higher expected savings than patients in higher utilization categories. This led us to examine specific cases in the model to attempt to understand why this phenomenon occurred. For example, for enrollees with cancer necessitating inpatient care, 90.1 percent of SUD-negative enrollees cost more than $10,000 in 2006, while 97.3 percent of SUD-positive enrollees cost more than $10,000 in 2006. In contrast, for enrollees with respiratory disorders necessitating inpatient care in 2006, 78.3 percent of SUD-negative enrollees cost more than $10,000 in 2006, while 90.4 percent of SUD-positive enrollees cost more than $10,000 in 2006. While enrollees with a diagnosis of cancer and SUD have a much higher expected cost than enrollees with a diagnosis of respiratory disorder and SUD ($70,756 vs. $48,876), the impact of SUD status is more pronounced in respiratory patients than in cancer patients. Ruling out SUD moved 12.1 percent of respiratory patients but only 7.3 percent of cancer patients below $10,000 in costs in 2006 and resulted in an overall expected cost differential of $8,363 for respiratory disease as compared to $6,189 for cancer. As we add other factors, such as medication history, we can develop a rich picture of enrollee segments where SUD appears to impact utilization and cost in the context of other chronic diseases. The difference between these enrollee populations is that SUD appears to impact long-term chronic conditions more heavily than short-term acute conditions. A reasonable hypothesis for this difference is that in conditions where patient compliance and effective pharmacy management are critical to disease management, SUD may negatively impact compliance and significantly increase outpatient and inpatient resource utilization.
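The comparison above can be checked with simple arithmetic. The sketch below recomputes the shift in the share of enrollees above $10,000 from the rounded percentages in the text and expresses each cost differential as a share of the SUD-positive expected cost. The `Segment` class and its field names are hypothetical, and the relative-share measure is an added illustration, not a statistic reported in the study. The rounded inputs give a 7.2-point shift for cancer against the 7.3 quoted in the text; that gap presumably reflects rounding in the underlying values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Figures quoted in the text for one inpatient diagnosis group (2006)."""
    name: str
    pct_over_10k_sud_neg: float   # percent of SUD-negative enrollees above $10,000
    pct_over_10k_sud_pos: float   # percent of SUD-positive enrollees above $10,000
    expected_cost_sud_pos: float  # expected cost with SUD (USD)
    cost_differential: float      # expected cost differential from ruling out SUD (USD)

    def shift_points(self) -> float:
        """Percentage-point change in the share above $10,000 tied to SUD status."""
        return self.pct_over_10k_sud_pos - self.pct_over_10k_sud_neg

    def differential_share(self) -> float:
        """Cost differential as a fraction of the SUD-positive expected cost."""
        return self.cost_differential / self.expected_cost_sud_pos


cancer = Segment("cancer", 90.1, 97.3, 70756, 6189)
respiratory = Segment("respiratory", 78.3, 90.4, 48876, 8363)

if __name__ == "__main__":
    for seg in (cancer, respiratory):
        print(seg.name, round(seg.shift_points(), 1), round(seg.differential_share(), 3))
```

On these inputs the differential is about 17 percent of expected cost for respiratory disorders and about 9 percent for cancer, which is consistent with the observation that SUD status matters more for the lower-cost group.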
At this point, the greatest challenge in implementation is not a technical difficulty but rather a policy challenge. In addition to issues of intervention and disease management costs and success rates that produce widely varying returns for different disease management populations, we also need to account for concerns regarding patient privacy, potential stigma, and restricted access to care when potential benefits are stratified. We can address this dilemma to some degree through the development of enrollment rules that focus on patients post diagnosis, the use of enrollment techniques that allow patients to move between disease management service levels or opt out of the program, and the use of thresholds that severely reduce false positives in identifying those diagnosed with SUD who would best benefit from disease management.43 For example, we could enroll all enrollees who are diagnosed with SUD and ensure that we reach all potential beneficiaries, but at the cost of providing disease management services to those who are unlikely to benefit, both wasting resources and potentially stigmatizing enrollees who have been diagnosed with SUD but for whom a disease management program may provide little benefit. The ability to use accurate stratification technologies has the potential to significantly improve disease management strategies and reimbursement policies relative to the current “blunt” paradigm of benefits limitation for controlling behavioral health costs.44
Using this method, we can develop forward-looking stratified individual estimates of disease risk for each enrollee in our selected population that can be used to identify diagnosed patients at greatest risk of relapse. This estimate takes into account the utilization history, comorbidities, chronic conditions, demographic data, and pharmacy usage of each enrollee. Importantly, we focused on data that are available currently and do not require complex or expensive collection mechanisms to be developed. Further, within this complex matrix, we can estimate 12-month costs (exclusive of disease management and treatment for SUD) for a given enrollee assuming either SUD relapse or SUD rule-out. This allows us to hypothetically match a given enrollee who has been diagnosed with SUD against an identical, SUD-free enrollee to compare the hypothetical impact of SUD on cost. We can use this methodology to rank all appropriate potential disease management programs by i) total estimated cost and ii) total estimated cost differential attributable to SUD disease management, exclusive of disease management costs. These tools should allow payers and providers to make more informed and thoughtful decisions with respect to the design of stratified disease management programs.
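The matched comparison described above can be sketched as a pair of queries against a cost model, one with SUD asserted and one with SUD ruled out, followed by two rankings. This is a minimal illustration and not the study's Bayesian network: `toy_cost_model` is a hypothetical lookup table standing in for the trained model, and its SUD-negative values are obtained by subtracting the cost differentials quoted earlier from the SUD-positive expected costs, which is an assumption about how those figures relate.

```python
# Hypothetical stand-in for the trained cost model: (condition, SUD status) -> expected cost (USD).
# SUD-negative entries are ASSUMED to equal the SUD-positive cost minus the quoted differential.
TOY_EXPECTED_COST = {
    ("cancer", True): 70756,
    ("cancer", False): 70756 - 6189,
    ("respiratory", True): 48876,
    ("respiratory", False): 48876 - 8363,
}


def toy_cost_model(enrollee):
    """Return the expected 12-month cost for an enrollee record (toy lookup)."""
    return TOY_EXPECTED_COST[(enrollee["condition"], enrollee["sud"])]


def sud_cost_differential(enrollee, estimate_cost):
    """Compare an enrollee against an otherwise identical SUD-free counterpart."""
    with_sud = estimate_cost({**enrollee, "sud": True})
    without_sud = estimate_cost({**enrollee, "sud": False})
    return with_sud, with_sud - without_sud


def rank_candidates(enrollees, estimate_cost):
    """Rank by (i) total estimated cost and (ii) cost differential attributable to SUD."""
    scored = [(e["id"], *sud_cost_differential(e, estimate_cost)) for e in enrollees]
    by_total = sorted(scored, key=lambda row: row[1], reverse=True)
    by_differential = sorted(scored, key=lambda row: row[2], reverse=True)
    return by_total, by_differential


enrollees = [
    {"id": "cancer-case", "condition": "cancer", "sud": True},
    {"id": "respiratory-case", "condition": "respiratory", "sud": True},
]
by_total, by_differential = rank_candidates(enrollees, toy_cost_model)
```

With these toy inputs the two rankings disagree: the cancer case leads on total estimated cost, while the respiratory case leads on the differential attributable to SUD. This is the reason both orderings are worth examining when selecting a disease management population.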
We have been able to successfully model a large, heterogeneous population of commercial enrollees, applying state-of-the-art machine learning technology to develop complex and accurate multivariate models that support near-real-time scoring of novel payer populations based on historic claims and diagnostic data. Our initial validation results indicate that we can stratify enrollees with SUD diagnoses with a high degree of sensitivity and specificity, and the most challenging issue becomes one of policy. Due to the social stigma associated with the disease and ethical issues pertaining to access to care and individual versus societal benefit, a thoughtful dialogue needs to occur about the appropriate way to implement these technologies.
In future work, we plan to test these models further by evaluating them against other data sources and to evaluate the social and economic ramifications of this methodology.
Lawrence M. Weinstein, MD, ABHM, is a senior vice president at Catasys, Inc., a healthcare company headquartered in Los Angeles, CA.
Todd A. Radano is an executive vice president and founder at DecisionQ Corporation in Washington, DC.
Timothy Jack, MD, is a behavioral health medical director for Wellpoint Blue Cross and Blue Shield and a managed behavioral healthcare consultant in Los Angeles, CA.
Philip Kalina, MS, is the director of modeling services at DecisionQ Corporation in Washington, DC.
John S. Eberhardt III is an executive vice president and founder at DecisionQ Corporation in Washington, DC.
1. Substance Abuse and Mental Health Services Administration, Office of Applied Studies (2008). Results from the 2007 National Survey on Drug Use and Health: National Findings (NSDUH Series H-34, DHHS Publication No. SMA 08-4343). Rockville, MD.
2. O’Brien, M. S., J. C. Anthony, L. E. O’Dell, A. A. Alomary, M. Vallee, G. F. Koob, R. L. Fitzgerald, and R. H. Purdy. “Risk of Becoming Cocaine Dependent: Epidemiological Estimates for the United States, 2000–2001.” Neuropsychopharmacology 30 (2005): 1006–18.
3. Clark, R. E., M. Samnaliev, and M. P. McGovern. “Impact of Substance Disorders on Medical Expenditures for Medicaid Beneficiaries with Behavioral Health Disorders.” Psychiatric Services 60, no. 1 (2009): 35–42.
4. Linden, A., J. L. Adams, and N. Roberts. Evaluation Methods in Disease Management: Determining Program Effectiveness. Washington, DC: Disease Management Association of America, 2006.
5. Weisner, C. G., T. Ray, J. R. Mertens, D. D. Satre, and C. Moore. “Short-Term Alcohol and Drug Treatment Outcomes Predict Long-Term Outcome.” Drug and Alcohol Dependence 71, no. 3 (2003): 281–94.
6. Alexander, J. A., T. A. Nahra, C. H. Lemak, H. Pollack, and C. I. Campbell. “Tailored Treatment in the Outpatient Substance Abuse Treatment Sector: 1995–2005.” Journal of Substance Abuse Treatment 34, no. 3 (2008): 282–92.
7. Angarita, G. A., S. Reif, S. Pirard, S. Lee, E. Sharon, and D. R. Gastfriend. “No-Show for Treatment in Substance Abuse Patients with Comorbid Symptomatology: Validity Results from a Controlled Trial of the ASAM Patient Placement Criteria.” Journal of Addiction Medicine 1, no. 2 (2007): 79–87.
8. Babor, T. F. “Treatment for Persons with Substance Use Disorders: Mediators, Moderators, and the Need for a New Research Approach.” Journal of Methods in Psychiatric Research 17 (2008): s45–s49.
9. Merkx, M. J. M., G. M. Schippers, M. J. W. Koeter, P. J. Vuijk, S. Oudejans, C. C. Q. de Vries, et al. “Allocation of Substance Use Disorder Patients to Appropriate Levels of Care: Feasibility of Matching Guidelines in Routine Practice in Dutch Treatment Centres.” Addiction 102, no. 3 (2007): 466–74.
10. Buhringer, G. “Allocating Treatment Options to Patient Profiles: Clinical Art or Science?” Addiction 101, no. 5 (2006): 646–52.
11. Collins, S. E., I. Torchalla, M. Schroter, G. Buchkremer, and A. Batra. “Development and Validation of a Cluster-Based Classification System to Facilitate Treatment Tailoring.” Journal of Methods in Psychiatric Research 17 (2008): s65–s69.
12. Chi, F. W., and C. M. Weisner. “Nine-Year Psychiatric Trajectories and Substance Use Outcomes: An Application of the Group-Based Modeling Approach.” Evaluation Review 32, no. 1 (2008): 39–58.
13. Weisner, C. G., T. Ray, J. R. Mertens, D. D. Satre, and C. Moore. “Short-Term Alcohol and Drug Treatment Outcomes Predict Long-Term Outcome.”
14. Jensen, F. An Introduction to Bayesian Networks. New York: Springer-Verlag, 1996.
15. Hofman, J. M., and C. H. Wiggins. “Bayesian Approach to Network Modularity.” Physical Review Letters 100, no. 25 (2008, June 23): 258701.
16. Robin, H., J. S. Eberhardt, M. Armstrong, R. Gaertner, and J. Kam. “Interpreting Diagnostic Assays by Means of Statistical Modeling.” IVD Technology 12, no. 3 (2006, April): 55–63.
17. Maskery, S., Y. Zhang, H. Hu, C. Shriver, J. Hooke, and M. Liebman. “Bayesian Network Analysis of Breast Pathology Diagnoses.” Presented at the 13th Annual International Conference on Intelligent Systems and Molecular Biology. Detroit, MI, June 25–29, 2005.
18. Moraleda, J., and T. Miller. “Ad+tree: A Compact Adaptation of Dynamic Ad-Trees for Efficient Machine Learning on Large Data Sets.” Proceedings of the 4th International Conference on Intelligent Data Engineering and Automated Learning, 2002.
19. Moraleda, J. New Algorithms, Data Structures, and User Interfaces for Machine Learning of Large Datasets with Applications. Doctoral dissertation, Stanford University, Palo Alto, CA, December 2003.
20. Burnside, E. S., D. L. Rubin, J. P. Fine, R. D. Shachter, G. A. Sisney, and W. K. Leung. “Bayesian Network to Predict Breast Cancer Risk of Mammographic Microcalcifications and Reduce Number of Benign Biopsy Results: Initial Experience.” Radiology 240, no. 3 (2006): 666–73.
21. Burd, R. S., M. Ouyang, and D. Madigan. “Bayesian Logistic Injury Severity Score: A Method for Predicting Mortality Using International Classification of Disease-9 Codes.” Academic Emergency Medicine 15, no. 5 (2008): 466–75.
22. Ho, K. M., and M. Knuiman. “Bayesian Approach to Predict Hospital Mortality of Intensive Care Readmissions During the Same Hospitalisation.” Anaesthesia and Intensive Care 36, no. 1 (2008): 38–45.
23. Biagioli, B., S. Scolletta, G. Cevenini, E. Barbini, P. Giomarelli, and P. Barbini. “A Multivariate Bayesian Model for Assessing Morbidity after Coronary Artery Surgery.” Critical Care 10, no. 3 (2006): R94.
24. Burnside, E. S., D. L. Rubin, R. D. Shachter, R. E. Sohlich, and E. A. Sickles. “A Probabilistic Expert System That Provides Automated Mammographic-Histologic Correlation: Initial Experience.” American Journal of Roentgenology 182, no. 2 (2004): 481–88.
25. Moraleda, J. New Algorithms, Data Structures, and User Interfaces for Machine Learning of Large Datasets with Applications.
26. Holder, H. O., and J. A. Blose. “The Reduction of Health Care Costs Associated with Alcoholism Treatment: A 14-Year Longitudinal Study.” Journal of Studies on Alcohol 53, no. 4 (1992, July): 293–302.
27. Karol, D. E., I. N. Schuermeyer, and C. A. Brooker. “The Case of HS: The Ethics of Reporting Alcohol Dependence in a Bus Driver.” International Journal of Psychiatry in Medicine 37, no. 3 (2007): 267–73.
28. Alexander, J. A., T. A. Nahra, C. H. Lemak, H. Pollack, and C. I. Campbell. “Tailored Treatment in the Outpatient Substance Abuse Treatment Sector: 1995–2005.”
29. Angarita, G. A., S. Reif, S. Pirard, S. Lee, E. Sharon, and D. R. Gastfriend. “No-Show for Treatment in Substance Abuse Patients with Comorbid Symptomatology: Validity Results from a Controlled Trial of the ASAM Patient Placement Criteria.”
30. Babor, T. F. “Treatment for Persons with Substance Use Disorders: Mediators, Moderators, and the Need for a New Research Approach.”
31. Merkx, M. J. M., G. M. Schippers, M. J. W. Koeter, P. J. Vuijk, S. Oudejans, C. C. Q. de Vries, et al. “Allocation of Substance Use Disorder Patients to Appropriate Levels of Care: Feasibility of Matching Guidelines in Routine Practice in Dutch Treatment Centres.”
32. Rothbard, A. B., and E. Kuno. “Comparison of Alcohol Treatment and Costs after Implementation of Medicaid Managed Care.” American Journal of Managed Care 12, no. 5 (2006): 285–96.
33. Saitz, R., M. J. Larson, C. LaBelle, J. Richardson, and J. H. Samet. “The Case for Chronic Disease Management for Addiction.” Journal of Addiction Medicine 2, no. 2 (2008): 55–65.
34. Hodgkin, D., C. M. Horgan, D. W. Garnick, and E. L. Merrick. “Benefit Limits for Behavioral Health Care in Private Health Plans.” Administration and Policy in Mental Health Services Research 36, no. 1 (2009): 15–23.
35. Haimovici, R., et al. “Risk Factors for Central Serous Chorioretinopathy: A Case-Control Study.” Ophthalmology 111, no. 2 (2004): 244–49.
36. London, J. A., G. H. Utter, F. Battistella, and D. Wisner. “Methamphetamine Use Is Associated with Increased Hospital Resource Consumption among Minimally Injured Trauma Patients.” Journal of Trauma, Injury, Infection and Critical Care 66, no. 2 (2009): 485–90.
37. Bard, M. R., C. E. Goettler, E. A. Toschlog, S. G. Sagraves, P. J. Schenarts, M. A. Newell, et al. “Alcohol Withdrawal Syndrome: Turning Minor Injuries into a Major Problem.” Journal of Trauma, Injury, Infection and Critical Care 61, no. 6 (2006): 1441–45.
38. Gangl, K., R. Reininger, D. Bernhard, R. Campana, I. Pree, J. Reisinger, et al. “Cigarette Smoke Facilitates Allergen Penetration across Respiratory Epithelium.” Allergy 64, no. 3 (2009): 398–405.
39. Jones, E. S., B. A. Moore, J. L. Sindelar, P. G. O’Connor, R. S. Schottenfeld, and D. A. Fiellin. “Cost Analysis of Clinic and Office-Based Treatment of Opioid Dependence: Results with Methadone and Buprenorphine in Clinically Stable Patients.” Drug and Alcohol Dependence 99, nos. 1–3 (2009): 132–40.
40. The Addiction Recovery Guide. Available at http://www.addictionrecoveryguide.org/treatment/residential/centers.html (accessed June 30, 2009).
41. Cournoyer, L. G., S. Brochu, M. Landry, and J. Bergeron. “Therapeutic Alliance, Patient Behaviour and Dropout in a Drug Rehabilitation Programme: The Moderating Effect of Clinical Subpopulations.” Addiction 102, no. 12 (2007): 1960–70.
42. Weisner, C. G., T. Ray, J. R. Mertens, D. D. Satre, and C. Moore. “Short-Term Alcohol and Drug Treatment Outcomes Predict Long-Term Outcome.”
43. Karol, D. E., I. N. Schuermeyer, and C. A. Brooker. “The Case of HS: The Ethics of Reporting Alcohol Dependence in a Bus Driver.”
44. Hodgkin, D., C. M. Horgan, D. W. Garnick, and E. L. Merrick. “Benefit Limits for Behavioral Health Care in Private Health Plans.”
Article citation: Perspectives in Health Information Management 6, Fall 2009