From Meaningful Data Science to Impactful Decisions: The Importance of Being Causally Prescriptive

This article proposes a framework for transitioning from traditional data science, where the focus is on extracting value from available data, to goal-driven analytical decision-making, where the business objective is defined first and various analytical techniques are integrated in a common setting. We discuss the link between predictive analytics and prescriptive analytics in the context of problem formulation and assert that all prescriptive analytics formulations assume a causal link between decisions and outcomes. We emphasize the role of predictive analytics and causal inference in specifying that causal link accurately and, ultimately, in aligning the analysis with the business objectives. We offer practical examples that integrate the required analytics tasks and describe scenarios where causal inference is and is not required.


INTRODUCTION
Rapid growth in data, industry demand, and technology have created a golden age for data science. More than ever, there is widespread recognition that data and analytics bring business value by enabling informed decisions. The analytics underlying business decisions is often categorized into three types: descriptive, predictive, and prescriptive (Figure 1). The foundational level, descriptive, involves data summaries, visualizations, and observation. The middle level, predictive, involves building models to explain and predict future behavior, with significant contributions from the statistical, machine learning, and econometrics fields. The top level, prescriptive, operationalizes the insights from the previous two levels, taking into consideration business constraints and often utilizing advanced prescriptive analytical methods from fields like operations research and system dynamics; see also Rose (2016), LaRiviere et al. (2016), Frazzetto et al. (2019), Poornima & Pushpalatha (2020), Bertsimas & Kallus (2020), Delen (2020), and Lo (2020).
Successfully integrating all three levels of analytics in the context of business process improvement requires combining data science with decision science (Bobriakov 2019). As argued in de Langhe and Puntoni (2021) and Bojinov, Chen, and Liu (2020), data-driven strategies that are not decision-driven strategies can be seriously flawed. From a methodological point of view, transitioning from data science to decision science requires (1) a holistic view of the process, from data to descriptive, predictive, and prescriptive analytics, and (2) generating the correct inputs through predictive analytics to inform prescriptive analytics. This article analyzes this transition through the lens of causality.
The study of causality, or cause-and-effect relationships, has been around for centuries in multiple fields. While correlation or association between two variables can be observed from available data, it only captures what we "see" about how they are related, not the effect of what we "do" (or manipulate by setting a specific value) to one variable on the other (Pearl & MacKenzie 2018; Illari & Russo 2014). For example, if the displayed times on the watches of the two authors always differ by a minute, the two times are clearly correlated and are both driven by the official time (a confounder). If we move up the time on one watch, there is no reason to expect the other watch will automatically be impacted, as they are not causally related. Now consider a different example: a student's exam score and its relationship to the student's level of preparation for the exam. It may be reasonable to assume that better preparation will result in a better exam score on average, that is, that the relationship between the level of student preparation and exam score is causal. If this is the case, an intervention or manipulation (preparation for the exam) can causally change the probability distribution of a system response or outcome (exam score). By maximizing preparation, the student can maximize the exam score.
Causal inference is a set of methodologies, based on experimental or observational data, for learning about cause-and-effect relationships by eliminating or reducing confounding effects, allowing us to assess what decision or action can cause a desirable outcome and to quantify the causal relationship. Understanding the cause-and-effect relationship enables a better decision to be made. As we will point out in this article, however, in practice one needs to carefully identify the ultimate goal and the levers available to accomplish it. In the student exam preparation example, "preparation" is a vague term. What are the actual levers a student has to prepare for an exam? (These levers are represented by specific terms in prescriptive analytics.) Does preparation mean "time to prepare"? Can we reasonably expect that the time a student spends preparing will increase the final exam score? Even if we assume it will, defining the goal is not obvious. Is the student's goal to perform well on the exam? Or is there an overarching goal, like performing well in the course? If the ultimate goal is to perform well in the course, the student's time may need to be split between different course deliverables, of which the exam will be one but not necessarily the only one. Maximizing the time spent studying for the exam may not necessarily accomplish the goal of performing well in the course overall, even if the causal relationship between studying for the exam and achieving a desirable outcome (a high exam score) is understood and can be estimated. Extending this simplified example to business situations, we will argue in this article that carefully specifying the ultimate goal and mapping out the correct causally prescriptive methodologies for achieving that goal is paramount to impactful decision-making.
The decision theory literature from psychology and philosophy considers causal knowledge essential to decision-making, as humans choose from available options that cause the outcomes they desire; see Hagmayer & Sloman (2005), Sloman & Hagmayer (2006), Joyce (2008), and Hagmayer & Fernbach (2017). While these scholars are mostly concerned with how decisions are made by humans, we are interested in how decisions should be made in a prescriptive fashion; see Sloman (2005: ch. 7) for comments on the difference. Surprisingly, many paradigms used in prescriptive analytics from decision science fields, such as operations research, operations management, and industrial engineering, do not include explicit considerations for causality or causal inference. Considerations for causality are incorporated in paradigms from other analytical fields. For example, Attri, Dev, and Sharma (2013); Ali, Sorooshian, and Kie (2016); Chauhan, Singh, and Jharkharia (2018); and Sorooshian, Tavana, and Ribeiro-Navarrete (2023) describe the DEMATEL and ISM methodologies, which address causal relationships between variables for decision-making based on expert judgment. However, such methodologies do not optimize specific outcomes as in traditional prescriptive analytics formulations and do not explicitly incorporate estimation procedures for causal relationships based on scientific experiments (such as A/B testing and clinical trials) or observational data.
Our contribution is to introduce a practical seven-step causal prescriptive analytics framework that blends important concepts and methodologies from several different analytical fields and specifically addresses the question of when causal inference is necessary to support prescriptive analytics paradigms.We illustrate the usage of the framework with examples.

INTRODUCING THE CAUSAL PRESCRIPTIVE ANALYTICS FRAMEWORK
A common characteristic of many decision problems is that they can be represented as optimization models, making optimization modeling an important prescriptive analytics tool. There are three parts to the classical optimization paradigm: (1) decision variables (actions or quantities under the control of the decision maker), (2) an objective function (the goal expressed in terms of the decision variables), and (3) constraints (requirements on the decision variables). Optimization formulations play an important role in prescriptive analytics, allowing situations to be described in a common language that can then be passed to optimization solvers to obtain an optimal policy.
Because the predictive analytics and optimization communities have traditionally been separate, we find that important steps are missing when prescriptive methodologies like optimization are applied in practice. The practical causal prescriptive analytics framework we propose below replaces the standard optimization paradigm with seven key questions that need to be answered to understand the end-to-end process needed to support informed decision-making (Figure 2):

1. What is the objective or goal? That is, what are you trying to achieve?
The first question for any decision problem is to know what the ultimate goal (Z) is. Although this might seem obvious, in our experience, many analytics and data science projects do not start with a goal, or the goal is poorly defined. For impactful decision-making, we propose defining a goal that one wants to achieve rather than finding a purpose for available data or focusing on exciting methodologies; see Keeney (1996) for various considerations in objective setting.
Lo and Pachamanova, Data Science Journal, DOI: 10.5334/dsj-2023-008

2. What are the outcomes that help achieve your objective?
Defining the outcomes that can be achieved and how they relate to the ultimate goal is critical for understanding whether the goal (Z) is achievable. These immediate outcomes (Y) are driven by actions or decisions (X in Question 3) and are expressed as a function of these decisions.

3. What are the decision variables (options, actions, treatments, or interventions)?
It is important to know what kind of decision (X) one could possibly make in order to influence the outcome (Y from Question 2) so as to ultimately achieve one's goal (Z from Question 1). The decision variables should be defined so that they are actions under the decision maker's control and so that their causal connection to the outcomes (Y) that help achieve the goal (Z) is well understood.

4. What is the available information, such as data, insights, and models?
Information (I) may already exist that can help influence the decision (X) or can be used to define the relationship between X and Y. This information can be based on domain knowledge about the relationship between X and Y or on the availability of experimental or observational data from past decisions and outcomes that can be used to infer the causal relationship between X and Y. Taking advantage of such information allows one to express the causal relationship between X and Y correctly (see Question 5).

5. What is the relationship between X and Y?
Understanding the nature of the relationship between X and Y allows one to build models that represent it accurately. This representation is often estimated through predictive analytics or causal inference techniques, unless the relationship is already known. It is what enters the prescriptive model and is used to identify desired outcomes. We will further address the significance of this step in section 5.
As illustrated in Figure 2, in practice, Questions 1-5 are resolved in an iterative manner.
One may need to revisit the specification of goals, outcomes, and actions multiple times to define them in a way that allows for taking advantage of available information and building an accurate representation of the relationship between X and Y.

6. What constraints need to be taken into consideration?
There are often constraints that limit the range of potential decisions X. Some of these constraints are physical (e.g., one cannot market to a negative number of customers), while others are imposed by business practices.

7. What is an appropriate solution that achieves the goal?
After formulating the problem in the steps above, the final step is to determine an appropriate solution. There may be multiple combinations of methods for solving the prescriptive analytics problem. The choice of method combination and solution depends on available data, time, and resources.
The seven questions above extend and enrich the three components of the classical optimization paradigm to emphasize the implementation and usability of model insights for impact. As depicted in Figure 2, the framework is a cycle: it starts and ends with the stated goal, making sure that the whole process is aligned with accomplishing the goal. Note that the choice of decision variables (X) drives the immediate outcome (Y), which leads to the ultimate goal (Z). Decision-making in this framework is therefore assumed to be causal by nature. However, whether X and Y are defined accurately to support Z (Questions 2 and 3) and whether the correct relationship between X and Y is estimated (Question 5) can have a significant effect on the final goal. In other words, before employing exciting prescriptive analytics methodologies, one needs to understand whether making a decision (X) would affect the goal (Z) in a predictable causal manner through the function that is used to describe their relationship and whether this will ultimately support Z. In some cases, causal inference is needed to estimate the relationship between X and Y, while in others, it is not. Understanding the situations in which causal inference is necessary is critical for selecting business-relevant optimal solutions (Question 7).

Let us illustrate this point and the causal prescriptive analytics framework with a common application: direct marketing. Direct marketing involves an intervention or treatment, such as an email, a direct mail piece, a web advertisement, or a phone call to a customer, that aims to maximize a call to action, such as a product purchase. Marketers typically deal with multiple treatments and/or multiple products and have a fixed budget. The goal is to match each customer with the right treatment and/or product so as to maximize a business outcome, such as sales or expected profit.[1] The seven-step process for this example is as follows (Figure 2):

1. What is your objective or goal? That is, what are you trying to achieve?
From the perspective of the business, the overarching goal is to maximize profit (Z), which means maximizing the customer purchase response that results from direct marketing. This nuance is important, as ideally only customers who are likely to change their behavior and purchase due to direct marketing should be treated (targeted) in order for marketing spend to be allocated purposefully; see, for example, Lo (2002; 2008) and Lo & Pachamanova (2015) for additional discussion. Well-run businesses can incentivize the pursuit of desirable goals by aligning performance metrics with those goals. In this context, the additional number of customers who purchase (or the additional expected revenue) with treatment (e.g., email), relative to no treatment at all or to a business-as-usual intervention that has been used in the past, are examples of metrics that create such incentives. These metrics directly relate to the profitability of the direct marketing campaign and the business's bottom line (Z).

2. What are the outcomes that help achieve your objective?
Profitability is a function of incremental sales volume, which is directly linked to the probability that an individual customer changes behavior as a result of the direct marketing campaign. The outcome (Y in Figure 2) that directly affects the goal (Z) is therefore the total incremental difference in individual customer purchase probabilities (also known as lift) as a result of the marketing campaign.

[1] In addition to being a key tool for many for-profit organizations, direct marketing can also be generalized to other similar situations, such as contacting the right donors in a fundraising program for a nonprofit organization.

3. What are the decision variables (options, actions, treatments, or interventions)?
The actions or decision variables are whether or not to send a particular offer (treatment) to each customer. If there are multiple treatments (or products), one can define the decision variables (X) to be binary (0 or 1) to correspond to the treatment being considered for each customer. If there are M treatments and N customers, there are a total of MN binary decision variables.

4. What is the available information (e.g., data, insights, or models) to influence your decision?
Companies often have data on whether or not customers have purchased a particular product in the past. They also have data on whether or not a customer has been targeted with a particular campaign. However, the data set needs to be assessed carefully in the context of understanding whether an action can lead to customer response. In this context, one needs prior marketing campaign data based on a randomized controlled trial (also known as A/B testing in business applications) that contains response data (whether the customer purchased or not) for each treatment, as well as demographic data as covariates for model estimation. Such data can be used to scientifically test and measure the effectiveness of a treatment, allowing for evaluation of the incremental probability that the customer will purchase as a result of the marketing campaign.

5. What is the relationship between X and Y?
With the availability of the right data set in Step 4, one can estimate the incremental probability of response for a particular customer using causal inference techniques; see, for example, Kane, Lo & Zheng (2014). The total change in response probability because of the marketing campaign can be estimated as a sum-product of the individual response probabilities (lift) with the binary decision variables of whether or not a particular treatment is applied to a particular customer; see Lo & Pachamanova (2015) and Appendix A for a mathematical formulation and an empirical example.
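As a minimal sketch of this step, the lift for each customer segment can be estimated from randomized campaign data as the difference in purchase rates between treated and control units. The records and segments below are purely illustrative, and real uplift models condition on many covariates rather than a single segment label:

```python
# Minimal sketch: estimating lift (incremental response probability) per
# customer segment from randomized (A/B test) campaign data.
# The dataset below is illustrative, not from the article.

from collections import defaultdict

# (segment, treated?, purchased?) records from a past randomized campaign
records = [
    ("young", True, 1), ("young", True, 1), ("young", True, 0), ("young", True, 0),
    ("young", False, 0), ("young", False, 1), ("young", False, 0), ("young", False, 0),
    ("senior", True, 1), ("senior", True, 0), ("senior", True, 0), ("senior", True, 0),
    ("senior", False, 0), ("senior", False, 1), ("senior", False, 0), ("senior", False, 1),
]

def lift_by_segment(records):
    """Difference in purchase rates, treated minus control, per segment."""
    sums = defaultdict(lambda: [0, 0, 0, 0])  # treated purchases, treated n, control purchases, control n
    for seg, treated, purchased in records:
        s = sums[seg]
        if treated:
            s[0] += purchased
            s[1] += 1
        else:
            s[2] += purchased
            s[3] += 1
    return {seg: s[0] / s[1] - s[2] / s[3] for seg, s in sums.items()}

lift = lift_by_segment(records)
print(lift)  # positive lift: treatment helps; negative lift: treatment hurts
```

In this toy data, one segment responds positively to treatment while the other responds negatively, which is exactly the distinction that determines who should be targeted.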

6. What constraints need to be taken into consideration?
The main business constraint is that the total cost of all treatments cannot exceed a fixed marketing budget. There could be additional business constraints, such as the limitation that each customer should receive at most one treatment in this marketing campaign. Physical constraints include restrictions on the decision variables: they need to take only the values 0 or 1 for the formulation of the problem to be meaningful.

7. What is an appropriate solution that achieves the goal?
After going through Steps 1-6, one needs to take a holistic look at the path to a solution and evaluate whether it ultimately addresses the goal in Step 1. In this example, since the goal is to maximize profitability due to direct marketing, a binary integer optimization program can be set up to maximize profitability by determining the right treatment for each customer (i.e., determining the values of the decision variables X that lead to the highest outcome Y, which was determined to lead to the desired goal Z). As a key input to the optimization model, uplift modeling or conditional average treatment effect (CATE) techniques can be employed in Step 5 to estimate the treatment effect as a function of available covariates for each treatment, enabling the prediction of the effect of each treatment over a control (e.g., no action) at the individual or subgroup level. The combination of uplift modeling and constrained optimization requires an integrated predictive and prescriptive analytics approach; see also Lo (2002; 2008).

The causal prescriptive analytics framework can be illustrated through a directed acyclic graph (DAG) (Figure 3). The box around the decision X represents the possible constraints that are "boxing" the feasible values of X.
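The resulting binary program can be sketched for a tiny instance: choose at most one treatment per customer to maximize total estimated lift, subject to a budget. The lift estimates, costs, and budget below are made up for illustration, and a realistic instance with MN variables would be passed to an integer programming solver rather than enumerated by brute force:

```python
# Sketch of the Step 7 binary program for a tiny instance: choose at most one
# treatment per customer to maximize total estimated lift, subject to a budget.
# Lift estimates and costs are illustrative; real instances use an integer
# programming solver rather than brute-force enumeration.

from itertools import product

lift = {  # lift[customer][treatment]: estimated incremental purchase probability
    "c1": {"email": 0.05, "mail": 0.09},
    "c2": {"email": 0.02, "mail": 0.01},
    "c3": {"email": 0.07, "mail": 0.12},
}
cost = {"email": 1.0, "mail": 4.0}
budget = 6.0
options = [None, "email", "mail"]  # None = no treatment (at most one per customer)

best_value, best_plan = 0.0, {}
for choice in product(options, repeat=len(lift)):
    plan = dict(zip(lift, choice))
    total_cost = sum(cost[t] for t in plan.values() if t is not None)
    if total_cost > budget:
        continue  # budget constraint
    value = sum(lift[c][t] for c, t in plan.items() if t is not None)
    if value > best_value:
        best_value, best_plan = value, plan

print(best_plan, best_value)
```

Note that the objective is the sum-product of lift estimates and binary treatment indicators, exactly as in Step 5, and the budget and at-most-one-treatment requirements are the Step 6 constraints.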

APPLICATIONS OF THE CAUSAL PRESCRIPTIVE ANALYTICS FRAMEWORK
Many important problems can be addressed in practice with the causal prescriptive analytics framework introduced in section 2. We list several examples below.
Vehicle routing. Vehicle routing is a common and very difficult problem in prescriptive analytics, with applications ranging from delivery service routing to bus routing (Routing Challenge 2021; Bertsimas et al. 2020; Davenport 2013). For instance, routing a fleet of school buses requires that each student be picked up and delivered to the school within particular time windows while minimizing total travel time or distance, subject to various constraints, such as bus capacity. It can be formulated as a constrained optimization problem by representing all location points where students wait as vertices in a graph and assigning binary decision variables X to correspond to the actions of whether or not to use particular arcs in the graph (Feillet 2010). The outcome Y (e.g., total travel distance) can be expressed as a sum-product of arc distances and decision variables representing whether certain arcs are used. Constraints include the capacity of buses, the fact that pickup and drop-off need to happen within particular time windows, and the fact that the arcs selected in a bus route need to form a continuous path with predetermined start and end points.
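On a toy instance, this logic can be sketched by enumeration: find the visiting order of stops that minimizes total distance out of the school, through every stop, and back. The stop names and arc distances are illustrative, and practical fleets require specialized heuristics or solvers:

```python
# Brute-force sketch of a tiny routing instance: the total distance (outcome Y)
# is a known sum-product of arc distances and route choices (decisions X), so
# no causal inference is involved. Distances are illustrative.

from itertools import permutations

dist = {  # symmetric arc distances between school (S) and stops A, B, C
    ("S", "A"): 4, ("S", "B"): 2, ("S", "C"): 5,
    ("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 6,
}

def d(u, v):
    """Look up the (symmetric) distance of arc (u, v)."""
    return dist[(u, v)] if (u, v) in dist else dist[(v, u)]

def best_route(stops, depot="S"):
    best = None
    for order in permutations(stops):
        path = (depot, *order, depot)
        length = sum(d(path[i], path[i + 1]) for i in range(len(path) - 1))
        if best is None or length < best[0]:
            best = (length, path)
    return best

print(best_route(["A", "B", "C"]))
```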
Workforce scheduling. Workforce scheduling problems appear in multiple business contexts, from large retailers to call centers to hospitals. The main goal is to set a timetable assigning a particular number of employees to shifts (decisions X) so that employee preferences and labor law constraints are taken into consideration while meeting business demands and minimizing total cost (Y = Z); see Daskin (2010: ch. 7-8) and Koole (2013: ch. 5-6).
Inventory management. A factory responsible for producing a product needs to order raw materials in the right amounts (decision X) in order to minimize the total cost (Z), which is the sum of ordering cost (Y1) and inventory holding cost (Y2).
Portfolio construction. The portfolio optimization problem has been central to quantitative investments since Markowitz (1952) proposed considering the trade-off between expected reward and risk to determine the optimal portfolio allocation. Given estimates of future asset expected returns and risk, the goal (Z) is to maximize the future value of the funds, which can be translated as finding the asset weights (X) that maximize the expected portfolio return (Y) for a given target level of portfolio risk.
Pricing. The prices of products and services are often determined by several departments in a company, such as accounting, marketing, and product management. There is a minimum price that can be charged based on the cost of the product, but to set the final price, one needs to estimate the supply of and demand for the product at any given price. If the price is set too high, demand will be lower, but the profit margin per item will be higher, and vice versa. The goal is to determine a price level (X) that results in the maximum profit; see Lo (2020).
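As a sketch, once a demand curve has been estimated, the pricing decision reduces to maximizing profit over candidate price points. The linear demand curve and unit cost below are assumed purely for illustration; estimating the demand response itself is where causal inference may be required, as discussed later:

```python
# Sketch of price optimization under an assumed linear demand curve.
# The demand parameters would in practice be estimated (via RCTs, econometric
# analysis, or conjoint studies); the numbers here are illustrative.

def demand(price):
    """Illustrative linear demand: units sold at a given price."""
    return max(0.0, 1000 - 40 * price)

def profit(price, unit_cost=5.0):
    return (price - unit_cost) * demand(price)

# Grid search over candidate price points (the decision X)
prices = [p / 2 for p in range(10, 51)]  # $5.00 to $25.00 in $0.50 steps
best_price = max(prices, key=profit)
print(best_price, profit(best_price))
```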
Customer retention. This classic problem involves identifying the best customers for special customer retention efforts in order to improve profitability by balancing the costs and benefits of retention. This problem can be formulated as follows: the objective is to maximize profitability (Z) through customer retention (Y) by attempting to retain the appropriate customers (X).
Employee acquisition. The optimal number of employees to be hired is a common but not simple problem with the goal of maximizing overall profitability. For example, in a department store, the employee acquisition decision involves how many additional sales employees to recruit (X) in order to maximize overall incremental profit (Z), defined as sales revenue (Y) minus employment cost.
Digital health. Wearable devices not only report patient vitals but can also be used to provide health-related recommendations to patients. These devices are sometimes offered free by employers or insurers. Relevant decisions include which of a set of messages to display (and when) for each individual in order to achieve positive outcomes, such as minimizing emergency room visits and medical costs (Menictas et al. 2019; Carpenter et al. 2020).
Personalized medicine. The National Institutes of Health (NIH) and the Food and Drug Administration (FDA) jointly proposed personalized medicine in Hamburg and Collins (2010), with the goal of providing patient-level individualized treatment as opposed to the typical one-size-fits-all medical treatment. Within the causal prescriptive analytics framework, this problem can be formulated with decision variables (X) that correspond to the individuals selected to receive treatment so as to maximize effectiveness (Y) and ultimately improve population health (Z). This emerging field has many challenges to overcome, including measurement and optimization.
Government policies. Economic and health care policies have a wide impact on the community, with objectives such as improving health, controlling costs, and improving the economy. Examples include determining the necessary level of interest rates (X) to optimize consumer and business response (Y) and ultimately improve overall economic measures (Z), or tuning health care policy parameters (X) to treat hospitals serving vulnerable populations equitably (Y) and ultimately reduce disparities in treatment across the population (Z).

METHODOLOGIES THAT SUPPORT THE CAUSAL PRESCRIPTIVE ANALYTICS FRAMEWORK
The different stages of the causal prescriptive analytics framework require methodologies from multiple analytical fields. Most generally, they can be separated into three categories: predictive analytics, causal inference, and optimization.
Predictive analytics. Information (I) in Figure 3 may be provided directly or may need to be estimated from historical data using statistical analysis, predictive analytics, or machine learning models. The range of methods includes point and interval estimation, regression-based analysis, econometric time series analysis, decision trees, random forests, gradient boosted trees, Bayesian analysis, and deep learning; see, for example, Mills & Markellos (2008) and Freedman (2009).

Causal inference. While predictive analytics is about predicting an outcome (Y) as accurately as possible using available features (X), where the metric of interest is typically the conditional expected value E(Y | X = x), causal inference is about measuring the effect of a cause on the outcome. The former can be directly addressed by modeling E(Y | X = x) using statistical or machine learning methods for supervised learning to approximate the functional (not necessarily causal) relationship between Y and X. The latter requires specific techniques from the field of causal inference; using the do-calculus notation from Pearl (2000) and Pearl, Glymour, and Jewell (2016), the causal effect of X on Y is denoted by E(Y | do(X = x)), where the do-operator indicates that an intervention sets the value of X, as opposed to simply observing the value of X in a conditional expectation. To measure the causal impact of a decision or intervention, a randomized controlled trial (RCT), where treated and untreated units are randomly split, is often regarded as the gold standard and is recommended whenever possible; see Salsburg (2001: ch. 5), Freedman (2009: ch. 1), Glennerster & Takavarasha (2013), Imbens & Rubin (2015), Pearl & MacKenzie (2018), Leigh (2018), Rosenbaum (2019), and Thomke (2020). In situations where an RCT is unavailable because experimentation is too difficult or too costly, confounding can be reduced by causal inference techniques for observational data. This is itself a large field that cuts across several
academic disciplines and is out of scope for this paper; thus, we provide some key citations below for each major school of thought:

1. The potential outcomes or counterfactual approach and the related propensity score matching from the field of statistics, where it is assumed that hypothetical outcomes exist under the treated and untreated scenarios for each analysis unit, and a single propensity score, defined as the probability that each analysis unit belongs to the treatment group as a function of confounders, can be applied to match treatment and control groups as closely as possible; see Rubin (2006); Rosenbaum (2002; 2010; 2019); Imbens & Rubin (2015); Hernan & Robins (2020); and Dominici, Bargagli-Stoffi & Mealli (2021) for descriptions of the multiple variations of this technique widely applied in the social and medical sciences.

2. The probabilistic graphical approach from artificial intelligence, based on Bayesian networks, is unique in that it is designed to construct and estimate causal relationships (or a DAG) through structure learning, even without detailed domain knowledge of which variables may be causing which, thus allowing us to answer questions about causes of effects in addition to effects of causes; see Pearl (2000; 2012); Spirtes, Glymour & Scheines (2000); Koller & Friedman (2009); Scutari & Denis (2015); Pearl, Glymour & Jewell (2016); and Peters et al. (2017).

This overview of techniques indicates that the appropriate solution to an optimal decision-making problem is often highly multidisciplinary. Therefore, we note that Figure 3 is only a high-level symbolic representation, and the exact detailed relationships can be more complicated.
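As an illustration of the potential outcomes school described above, a much-simplified nearest-neighbor propensity score matching estimator can be sketched as follows. The propensity scores are assumed to have been estimated already (e.g., via a logistic regression on confounders), and all scores and outcomes are illustrative:

```python
# Minimal sketch of propensity score matching: each treated unit is matched to
# the control unit with the nearest (pre-estimated) propensity score, and the
# average treatment effect on the treated (ATT) is the mean outcome difference.
# In practice the propensity score is itself estimated from confounders.

treated = [  # (propensity score, outcome Y) for treated units
    (0.8, 12.0), (0.6, 10.0), (0.7, 11.0),
]
control = [  # (propensity score, outcome Y) for control units
    (0.2, 5.0), (0.55, 8.0), (0.75, 9.0), (0.3, 6.0),
]

def att_nearest_neighbor(treated, control):
    """ATT via 1-nearest-neighbor matching (with replacement) on the score."""
    diffs = []
    for score, y in treated:
        _, y_match = min(control, key=lambda c: abs(c[0] - score))
        diffs.append(y - y_match)
    return sum(diffs) / len(diffs)

print(att_nearest_neighbor(treated, control))
```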

ON CAUSAL INFERENCE AND ITS PLACE IN THE CAUSAL PRESCRIPTIVE ANALYTICS FRAMEWORK
Causal inference is an important tool within the causal prescriptive analytics framework, and the direct marketing example provided in section 2 illustrates how it can be utilized in Step 5 of the framework. We note, however, that causal inference is not always necessary to identify the relationship between X and Y.
How does one differentiate between contexts where causal inference is required and contexts where it is not? It is helpful to think about this question through the illustration in Figure 4. The outcome (Y) is a function of the decision variables (X) and some coefficients (C) that are often estimated from data. If there is a system response mechanism to the decision or intervention represented by the decision variables X in the optimization problem, then the relationship X→C needs to be taken into consideration when estimating Y, and causal inference is required. We outline a few examples next.
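The difference between observing X and intervening on X can be illustrated with a small simulation in which a confounder drives both the decision and the outcome. The data-generating process below is assumed purely for illustration: X has no effect on Y, yet the observational contrast suggests a large effect, while randomizing X (the analogue of the do-operator) recovers the true null effect:

```python
# Simulation contrasting "seeing" and "doing": a confounder U drives both the
# decision X and the outcome Y, while X itself has no effect on Y. The
# observational contrast is biased upward; randomizing X removes the bias.
# The data-generating process is illustrative.

import random

random.seed(0)

def draw(randomize_x):
    u = random.random()                # confounder, uniform on [0, 1]
    if randomize_x:
        x = random.random() < 0.5      # do(X): assigned by coin flip
    else:
        x = random.random() < u        # observed X: driven by U
    y = u + random.gauss(0, 0.1)       # Y depends on U only, not on X
    return x, y

def contrast(randomize_x, n=100_000):
    """Average Y among X = 1 units minus average Y among X = 0 units."""
    y1, n1, y0, n0 = 0.0, 0, 0.0, 0
    for _ in range(n):
        x, y = draw(randomize_x)
        if x:
            y1, n1 = y1 + y, n1 + 1
        else:
            y0, n0 = y0 + y, n0 + 1
    return y1 / n1 - y0 / n0

print("observational:", round(contrast(False), 2))  # near +0.33 (confounded)
print("randomized:", round(contrast(True), 2))      # near 0 (true null effect)
```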

CAUSAL INFERENCE NOT REQUIRED
Many important prescriptive analytics contexts do not require causal inference. This situation happens when there is no system response mechanism to a decision or intervention. Given the choice of a decision (i.e., a specific value of the decision variable), the immediate causal effect on the outcome variable is mathematically known (or straightforward to obtain). We will refer to Panel A of Table 1 for the examples below.

Vehicle routing. Because selecting an arc and adding it to a bus route is not really an intervention (it causally affects the total distance traveled (Y) and also the goal (Z) but does not affect the arc distance itself (C)), the standard vehicle routing problem does not require causal inference.
Workforce scheduling. The action of assigning an employee to a shift (X) does not impact the cost of staffing the employee for the shift (C), which is known in advance; so although the assignments affect the total cost (Y) causally, causal inference is not needed to estimate the relationship between X and Y.

Inventory management. Values such as the annual usage rate and the holding cost per year per item of inventory carried often need to be estimated using historical data (see Shapiro (2007), Pinedo (2009), or Daskin (2010)); however, given the known cost per order and holding cost per unit per year, the mathematical relationship between X (order quantity) and Y (ordering and holding costs) is typically known (assuming the company is a small player in the market), and causal inference is not required.
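For instance, under the classic economic order quantity (EOQ) model, the total annual cost is a known function of the order quantity: with annual demand D, fixed cost K per order, and holding cost h per unit per year, C(Q) = KD/Q + hQ/2, minimized at Q* = sqrt(2DK/h). A sketch with illustrative parameter values:

```python
# The classic economic order quantity (EOQ) model: total annual cost is a
# known mathematical function of the order quantity Q, so no causal inference
# is needed. Parameter values are illustrative.

import math

D = 1200.0   # annual demand (units/year)
K = 50.0     # fixed cost per order
h = 3.0      # holding cost per unit per year

def total_cost(q):
    return K * D / q + h * q / 2  # ordering cost + holding cost

q_star = math.sqrt(2 * D * K / h)  # closed-form cost-minimizing order quantity
print(q_star, total_cost(q_star))
```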
Portfolio construction.To maximize Z through Y, a key step is to predict the expected individual stock returns (C) using historical data or simulations; see, for example, Fabozzi et al. (2007).
Once the expected individual stock returns are predicted, the expected future portfolio return can be calculated as a sum-product of the expected individual returns and the weights of the stocks in the portfolio. Although estimation is still involved in determining C, the relationship between X, C, and Y is a mathematical formula that typically does not require causal inference: the portfolio is assumed to be small relative to the market, so the decision to assign a particular weight to a stock does not affect the stock's expected return.
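A minimal numeric sketch of this sum-product relationship, with hypothetical expected returns and weights:

```python
import numpy as np

# hypothetical expected annual returns (C) for four stocks, as would be
# estimated from historical data or simulations
expected_returns = np.array([0.06, 0.08, 0.05, 0.10])

# portfolio weights (X) are the decision variables; they must sum to 1
weights = np.array([0.25, 0.25, 0.30, 0.20])

# expected portfolio return (Y) is a known sum-product of X and C; under
# the small-investor assumption, choosing X does not move C
portfolio_return = float(weights @ expected_returns)  # 0.07, i.e., 7%
```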

CAUSAL INFERENCE REQUIRED
Causal inference is needed when the decision variable or intervention affects the function representing the relationship between X and Y and, by extension, the ultimate goal Z. Several categories of applications typically require causal inference (see Panel B of Table 1 for a technical summary).
1. Behavioral relationships. Human behaviors in response to individual-level stimuli or interventions are usually unknown and need to be estimated through causal inference.
Pricing. Estimating customer responses to pricing decisions typically requires causal inference and is similar to the direct marketing problem in section 2. This is because the decision variable, price (X), affects sales volume (C) through an unknown price elasticity, and thereby the outcome, sales revenue (Y), and ultimately the objective, profitability (Z). Common methodologies for estimating price elasticity include testing various price points through an RCT; analyzing historical observational data through econometric time series methods when there is price variation in the historical data and an RCT is not feasible; and survey-based conjoint or discrete choice analysis when in-market price changes are infeasible or difficult.
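As a stylized illustration of the RCT route (not any specific study's method), the sketch below simulates an experiment in which prices are randomly assigned, so the price variation is exogenous and the slope of a log-log regression recovers the elasticity; all parameter values are our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical RCT: prices (X) are randomly assigned, so variation in
# price is exogenous and the log-log slope has a causal interpretation
true_elasticity = -1.5
price = rng.uniform(5.0, 15.0, size=500)
log_quantity = 6.0 + true_elasticity * np.log(price) + rng.normal(0, 0.1, 500)

# OLS on the log-log demand model: log Q = a + e * log P
e_hat, a_hat = np.polyfit(np.log(price), log_quantity, 1)
```

With observational data instead of an RCT, the same regression would confound elasticity with whatever drove historical price changes, which is why the econometric alternatives mentioned above are needed.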
Customer retention. Because of limited resources, organizations must select customers for retention efforts (an intervention), for example, outbound call programs with an incentive. The behavioral outcome, the change in retention rate (C), and its relationship with the retention effort (X) are not known in advance and thus need to be estimated through an RCT that tests combinations of treatments, such as incentive, time/day of outreach, frequency, and channel, which is methodologically similar to the direct marketing example. If an RCT is not available, causal inference techniques can be applied to nonexperimental data.
Employee acquisition. Since sales volume (C) may depend on available customer support and customer satisfaction, it is not a simple or known function of the number of staff (X) and may require causal inference to estimate; see Pessach et al. (2020). For example, one may utilize store-to-store variation, as well as variation over time in the number of sales agents, to develop a panel data analysis for assessing the impact on outcome metrics.
Digital health. Causal inference techniques (e.g., RCTs) are required to understand the effect of a message (X) on message-specific health outcomes (C). If multiple messages are eligible for each individual in sequential order, the message-specific outcomes can be aggregated to the overall individual health outcome (Y), such as the number of emergency room visits, which is then translated to medical cost and summarized at the employer level (Z).
Personalized medicine.Medicine can be optimally assigned to appropriate patients (X) in order to maximize population health (Z) through individual treatment effectiveness (C); see Hamburg & Collins (2010) and Yong (2015).The effect of X on C is commonly measured via RCT.
If multiple treatments are available, the overall effect at the individual level (Y) can be obtained from the estimates of C. For example, when the number of vaccines available in a country is limited, health officials must decide who receives vaccination first in order to achieve maximum protection for the whole population.
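Once the individual treatment effects (C) have been estimated causally, the allocation step itself can be purely prescriptive. The sketch below is a deliberately simplified greedy allocation on simulated effect estimates; it assumes effects are independent across individuals (ignoring, e.g., herd-immunity interactions, which would require a richer model).

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical RCT-estimated individual treatment effects (C): the
# reduction in infection risk for each person if vaccinated
effects = rng.uniform(0.0, 0.3, size=10)
doses = 3  # limited supply: the budget constraint on X

# with a fixed dose budget and independent individual effects, the
# population objective (Z) is maximized by treating the individuals
# with the largest estimated effects
chosen = np.argsort(effects)[-doses:]
total_protection = float(effects[chosen].sum())
```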

2. Policy examples.
Government policies have an impact on individuals, organizations, and society as a whole, and such impact is usually assessed through observational data analysis, such as econometric methods, since an RCT is often not feasible.
Health care policy. An example is the setting studied in Gai and Pachamanova (2019), where the impact of a policy (X), the Hospital Readmissions Reduction Program (HRRP), part of the Affordable Care Act (ACA), is analyzed in an effort to reduce excess hospital readmissions (Y = C in this case) and lower health costs (Z) while ensuring equitable treatment for vulnerable populations. Difference-in-differences was employed in the study to compare the pre-HRRP differences in readmission rates between treatment and control groups with their post-HRRP differences.
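To make the difference-in-differences logic concrete, the following is a generic sketch on simulated hospital-level data (not the Gai and Pachamanova (2019) analysis itself; all numbers are illustrative assumptions). The estimator removes both the pre-existing gap between groups and the common time trend, isolating the policy effect.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 1000
treated = rng.integers(0, 2, n)  # hospital subject to an HRRP-like policy
post = rng.integers(0, 2, n)     # observation taken after the policy start
true_effect = -0.03              # policy lowers the readmission rate

# simulated readmission rates with a group difference, a time trend,
# and the policy effect on treated hospitals in the post period
y = (0.20 + 0.02 * treated - 0.01 * post
     + true_effect * treated * post + rng.normal(0, 0.01, n))

# DiD: (treated post - treated pre) minus (control post - control pre)
did = ((y[(treated == 1) & (post == 1)].mean()
        - y[(treated == 1) & (post == 0)].mean())
       - (y[(treated == 0) & (post == 1)].mean()
          - y[(treated == 0) & (post == 0)].mean()))
```

The estimate did recovers the simulated policy effect of roughly -0.03; its causal interpretation rests on the parallel-trends assumption built into this simulation.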
Economic policy. One of the most powerful economic decisions for central banks is setting an appropriate interest rate in order to stimulate the economy when it is weak or to prevent inflation when the economy is too strong. Setting the rate inappropriately can lead to undesirable ripple effects. See Belongia and Ireland (2015) and Kiley and Roberts (2017) for examples of applying advanced econometric methods to measure the impact on various metrics.

CONCLUSION
In this article, we introduced a causal prescriptive analytics framework that outlines the integration of multiple types of analytical techniques, including predictive analytics, machine learning, causal inference, and constrained optimization. These methodologies come from a variety of academic disciplines. We asserted that all prescriptive problems for optimal decision-making are causal by nature: in order to achieve a goal, we make a decision (or take an action) to cause a desirable outcome to happen. However, not all prescriptive problems require causal inference to uncover these relationships. We listed numerous examples where the framework can be applied and discussed several practical examples to illustrate the distinction between problems that require causal inference and problems that do not. We also emphasized the importance of aligning the representation of causality with the ultimate goal. Our practical framework unifies decision-making problems in a common setting, facilitating the discovery of analytics opportunities and transitioning analytical decision-making from data science toward impactful decision science.

APPENDIX A: CONSTRAINED OPTIMIZATION FORMULATION OF THE MULTITREATMENT DIRECT MARKETING PROBLEM
This appendix describes the direct marketing problem mentioned in section 2 in more detail.
Traditional response modeling based on conventional supervised learning aims to estimate the response rate, p_ij, for individual i receiving treatment j. Lo (2002, 2008); Siegel (2011, 2013); Kane, Lo, and Zheng (2014); Lo and Pachamanova (2015); and Haughton et al. (2023) have explained and demonstrated that such an approach is flawed: scientific marketing measures "lift over control" by comparing the response outcome to a control group that receives no treatment in order to causally assess whether a program is successful. To be consistent with this scientific measurement, uplift modeling is required to estimate lift over control. Traditional modeling, which estimates the response rate only, may capture customers who would naturally respond without receiving a treatment, resulting in inefficient targeting and potential waste of resources.
The treatment optimization problem can be formulated as a binary integer programming model that can be solved by exact or heuristic methods. The objective function maximizes the incremental profitability due to direct marketing.
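The model referred to as (A.1) is not reproduced in this text. Based on the symbol definitions that follow, a plausible rendering is the one below; the constraint that each individual receives at most one treatment is our assumption, and the actual model may include additional constraints (e.g., a campaign budget).

```latex
\max_{x}\quad Z \;=\; \sum_{i=1}^{N}\sum_{j=1}^{M} \bigl( r\,\hat{p}'_{ij} - c_{ij} \bigr)\, x_{ij}
\qquad \text{s.t.}\quad \sum_{j=1}^{M} x_{ij} \le 1 \ \ \text{for all } i, \qquad x_{ij} \in \{0,1\},
\tag{A.1}
```

where the revenue term $r \sum_{i,j} \hat{p}'_{ij} x_{ij} = rY$ corresponds to the incremental sales Y scaled by revenue per sale.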
Here Z = incremental profit due to direct marketing; Y = incremental sales due to direct marketing; p̂′_ij (represented by C in Table 1) = estimated lift value (treatment effect over no treatment) for individual i receiving treatment j; r = revenue per sale (assumed constant here, but this can be relaxed); x_ij (decision variable) = 1 if treatment j is assigned to individual i and 0 otherwise; and c_ij = cost of promoting treatment j to individual i. (We provide this example to illustrate the main concepts in our framework. We note that, in practice, the optimization problem can involve multiple treatments and products, and the causal relationships can be more complicated to estimate.) In the above constrained optimization model, the constant r is irrelevant to the optimal solution and thus can be dropped (or assumed to be 1.0), and the p̂′_ij are the key input values, which can be estimated by uplift modeling or CATE techniques based on historical RCT data; see Lo & Pachamanova (2015), Pachamanova et al. (2020), and Haughton et al. (2023: chs. 6-7). The literature also describes a set of uplift modeling techniques for handling observational data through propensity-score-matching types of causal inference techniques; see Athey & Imbens (2015) and Haughton et al. (2023: ch. 9).
To empirically illustrate the benefit of causal prescriptive analytics using uplift modeling over regular supervised learning for traditional response modeling in this application, that is, using estimated lift values p̂′_ij as opposed to estimated response rates p̂_ij in (A.1), we use online retail data for women's and men's merchandise from the Hillstrom data set, MineThatData (minethatdata.com). We follow the clustering-based heuristic optimization procedure described in Lo and Pachamanova (2015) and solve the optimization problem (A.1) to find the optimal treatment quantity for each customer segment.
The procedure from Lo and Pachamanova (2015) is outlined as follows:
1. Develop an uplift model for estimating the lift in response rate using the separate-model approach, which requires developing logistic regression models on the training data for the treatment and control groups, respectively.
2. In the holdout data, compute the lift estimates for both men's and women's merchandise using the estimated uplift model by subtracting the estimated control response rate from the estimated treatment response rate at the individual level.
3. Perform a cluster analysis of individuals using the two lift estimates for men's and women's merchandise as input variables.

4. For each cluster in the holdout data, calculate the cluster-specific sample lift scores for both men's and women's merchandise by taking the difference between the sample mean response rate in the treatment group and the sample mean response rate in the control group.
5. Apply the cluster solution to the new data for future campaigns.
6. Solve the integer program equivalent of (A.1) at the cluster level to maximize overall incremental value.
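The core of Steps 1 and 2, the separate-model uplift approach, can be sketched as follows on simulated RCT data. The logistic models here are minimal gradient-descent stand-ins rather than a production implementation, and the simulated covariate and effect sizes are our own illustrative assumptions, not the Hillstrom data.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Minimal logistic regression via gradient descent, standing in
    for the per-group logistic models of the separate-model approach."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

# simulated RCT with one covariate: the treatment helps only when x > 0
n = 4000
x = rng.normal(size=(n, 1))
t = rng.integers(0, 2, n)                      # random treatment assignment
logit = -1.0 + 0.5 * x[:, 0] + 1.0 * (x[:, 0] > 0) * t
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

# Step 1: one model per group; Step 2: lift = treatment minus control score
w_t = fit_logistic(x[t == 1], y[t == 1])
w_c = fit_logistic(x[t == 0], y[t == 0])
lift = predict(w_t, x) - predict(w_c, x)       # the p-hat-prime inputs
```

Individuals with x > 0 receive noticeably higher lift scores than those with x < 0, which is exactly the signal the subsequent clustering and optimization steps exploit.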
To evaluate the benefit of using uplift modeling over traditional response modeling, we repeat the above procedure using the estimated response rates p̂_ij for men's and women's merchandise as input variables to the cluster analysis in Step 3, as opposed to the lift estimates p̂′_ij; we then apply the resulting cluster solution to the new data in Step 5 and determine the optimal solution to maximize the overall value (instead of incremental value) in Step 6. We then compare the results of the two solutions using the objective function values in (A.1).
The results from the optimization using p̂_ij and p̂′_ij as objective function coefficients are shown in Table A.1a and Table A.1b, respectively. (Following Lo and Pachamanova (2015), we assume that the new data for the future campaign are 10 times the size of the holdout data from the previous campaign.)

Figure 2 Proposed causal prescriptive analytics framework.

Figure 3 Directed acyclic graph (DAG) representation of the proposed causal prescriptive analytics framework.

Figure 4 Causal inference in prescriptive analytics problem formulations.