Lee Merkhofer Consulting Priority Systems
Implementing project portfolio management

Monte Carlo Analysis and Decision Trees

As stated in the previous sub-section of this paper, if probabilities are assigned to uncertainties, those probabilities can be propagated through the project selection decision model to derive probability distributions for the risks to achieving the organization's objectives and the uncertainty over project value. The two most popular ways of doing this are Monte Carlo analysis and decision trees.

Monte Carlo Analysis

Monte Carlo analysis is a form of simulation. Historically, it has been used most often for investigating the behavior of physical systems, for example the rate at which contaminated water migrates underground from a hazardous waste site. However, Monte Carlo analysis is increasingly being used to quantify project risks. To do this, the project decision model is evaluated many times. The inputs for each run of the model are selected randomly in accordance with the probability distributions assigned to those inputs. After each run, the model outputs (that is, the performance of the project relative to each objective and the resulting project value) are recorded. If enough trials are conducted, a frequency plot of the model outputs shows the shapes of their probability distributions. Since the specific inputs that are selected for each model run are generated randomly (according to the specified probabilities), the process is a little like rolling dice (hence the name).

To better illustrate how Monte Carlo analysis works, here is a highly simplified example involving only two uncertainties (most Monte Carlo analyses involve more than two uncertainties).

Imagine that a company manufacturing personal care products is considering introducing a new product. The product is a teeth whitening gel containing a new peroxide-based chemical compound. The compound is very effective at bleaching teeth, but testing shows some users experience gum irritation. The likelihood of irritation varies by individual and depends on the amount of gel that the user inadvertently gets on his or her gums. The irritation is not significant from a health standpoint, but if the risk of irritation is high enough, sales will be hurt and there will be an adverse impact on two of the company's objectives: maximizing profit and creating satisfied customers.

Suppose that industry tests have collected data on the amount of gel that users of teeth whitening systems get on their gums. The data has a mean of 0.02 mg and a standard deviation of 0.005 mg. Suppose tests with volunteer subjects indicate that the likelihood of irritation increases linearly with the amount of exposure and ranges from a low of 5 subjects out of 100 per milligram to a high of 10 out of 100 subjects per milligram. The probability distributions assumed are shown below.

Input distributions

Figure 38:   Distributions assumed for the example Monte Carlo analysis.

The simulation begins with the selection of a sample exposure level (the amount of gel that gets on the user's gums) and sensitivity (how sensitive that particular user happens to be to the gel). These selections are made randomly, governed only by the respective probability distributions.

Suppose, for instance, that the first randomly selected exposure is 0.022 (slightly above the mean exposure) and the selected sensitivity is 0.07 (slightly below the mean). The probability that the user will experience irritation is the product of the exposure and sensitivity. Therefore, the probability of irritation for these sample values is 0.022 mg of exposure times 0.07 incidences of irritation per mg of exposure, which equals 0.00154, indicating that 15.4 out of 10,000 user applications will result in irritation.

Several software tools, including @Risk® and Crystal Ball®, are available for making the random selections from specified probability distributions and then calling on a model (in this case, the model is a simple multiplication) to compute a result. If enough samples and model computations are performed, a probability distribution over results can be generated. Figure 39 shows the distribution obtained from 10,000 trials.

Monte Carlo analysis

Figure 39:   Probability distribution for the fraction of users experiencing irritation.
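The example can be reproduced in a few lines of code. The following is a minimal sketch in Python, not the workflow of any of the commercial tools named above. It assumes that exposure is normally distributed (mean 0.02 mg, standard deviation 0.005 mg, truncated at zero) and, because the text does not state the shape of the sensitivity distribution, that sensitivity is uniform between 0.05 and 0.10 incidences per mg. The function name and random seed are illustrative.

```python
import random
import statistics

def simulate_irritation(trials=10_000, seed=1):
    """Return one irritation probability per trial (exposure x sensitivity)."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        # Exposure in mg: assumed normal, truncated at zero.
        exposure = max(0.0, rng.gauss(0.02, 0.005))
        # Sensitivity in incidences per mg: assumed uniform on [0.05, 0.10].
        sensitivity = rng.uniform(0.05, 0.10)
        # The "model" in this example is a simple multiplication.
        results.append(exposure * sensitivity)
    return results

results = simulate_irritation()
mean_p = statistics.mean(results)
cuts = statistics.quantiles(results, n=20)  # 19 cut points: 5%, 10%, ..., 95%
p05, p95 = cuts[0], cuts[-1]
print(f"mean = {mean_p:.5f}, 5th to 95th percentile = {p05:.5f} to {p95:.5f}")
```

A histogram of `results` gives the kind of frequency plot described above. A different assumed shape for the sensitivity distribution would change the spread of the output.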

With Monte Carlo analysis you can do sensitivity analyses wherein only one uncertainty is allowed to vary while keeping the others fixed. In this way, you can see which uncertainties (risks) have the biggest influence on project value (and focus energies accordingly). The results can be used to investigate changes that might be made to reduce risk. For example, a common application of Monte Carlo analysis for project risk management is measuring uncertainty over project schedule. If the analysis shows that there's a 50 percent probability the project will run a month late, you might want to build an extra month into the schedule.
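A one-at-a-time sensitivity analysis for the teeth whitening example might look like the sketch below. It holds one input at its mean while the other varies and compares the resulting spread in the output. The distributions are the same assumptions as before (normal exposure, uniform sensitivity), and the function and constant names are hypothetical.

```python
import random
import statistics

MEAN_EXPOSURE = 0.02      # mg
MEAN_SENSITIVITY = 0.075  # per mg, midpoint of the assumed 0.05 to 0.10 range

def spread(vary, trials=10_000, seed=2):
    """Std dev of the irritation probability when only `vary` is uncertain."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        if vary == "exposure":
            exposure = max(0.0, rng.gauss(0.02, 0.005))
            sensitivity = MEAN_SENSITIVITY   # held fixed
        else:
            exposure = MEAN_EXPOSURE         # held fixed
            sensitivity = rng.uniform(0.05, 0.10)
        outputs.append(exposure * sensitivity)
    return statistics.stdev(outputs)

spread_exposure = spread("exposure")
spread_sensitivity = spread("sensitivity")
print(f"spread from exposure alone:    {spread_exposure:.6f}")
print(f"spread from sensitivity alone: {spread_sensitivity:.6f}")
```

Under these assumed distributions, exposure accounts for the larger share of the spread, so efforts to reduce uncertainty would be better spent there.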

Correlated Uncertainties in Monte Carlo Analysis

Correlations among uncertainties can have a big impact on risk. For example, the cost to complete a major construction project is impacted both by uncertainty over the costs to complete various tasks and by whether the project is proceeding ahead of or behind schedule. Cost and schedule uncertainties are correlated: if a task is behind schedule, it is likely to cost more to complete. If this particular correlation is ignored, the risk of a cost overrun will be underestimated. Thus, a common case wherein correlated uncertainties are represented in Monte Carlo analysis is analyzing cost uncertainty for major construction projects.

Correlation among random variables can be handled in Monte Carlo analysis by using joint probability distributions. A joint probability distribution is analogous to a single-variable probability distribution, except that it describes the probability of obtaining various combinations of two or more random variables. Common joint probability distributions include the multivariate discrete distribution, multinomial, multivariate hypergeometric, bivariate normal, bivariate lognormal, multivariate normal and multivariate lognormal. The process requires selecting a multivariate distribution and then specifying a correlation structure for the variables. Methods are also available for using the marginal distributions of random variables and their correlation structure without requiring the complete joint distribution.

Bivariate distributions

Figure 40:   Some bivariate distributions for Monte Carlo analysis.
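To illustrate why ignoring a positive correlation understates risk, the sketch below samples two task costs from a bivariate normal distribution and estimates the probability that their total exceeds a budget. All numbers (means of 100, standard deviations of 20, a budget of 240, a correlation of 0.7) are hypothetical, and the bivariate normal is simply the easiest of the joint distributions listed above to sample by hand.

```python
import math
import random

def overrun_probability(rho, budget=240.0, trials=20_000, seed=3):
    """Estimate P(total cost > budget) for two task costs with correlation rho."""
    rng = random.Random(seed)
    overruns = 0
    for _ in range(trials):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Combine two independent standard normals into one that has
        # correlation rho with z1 (the 2x2 Cholesky construction).
        z2_corr = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        cost_a = 100.0 + 20.0 * z1        # hypothetical task cost
        cost_b = 100.0 + 20.0 * z2_corr   # hypothetical task cost
        if cost_a + cost_b > budget:
            overruns += 1
    return overruns / trials

p_independent = overrun_probability(rho=0.0)
p_correlated = overrun_probability(rho=0.7)
print(f"P(overrun), correlation ignored:  {p_independent:.3f}")
print(f"P(overrun), correlation of 0.7:   {p_correlated:.3f}")
```

With these illustrative inputs, the estimated overrun probability is noticeably higher when the correlation is modeled than when the two costs are treated as independent.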

When uncertainties for the Monte Carlo simulation are interdependent, the correlation between the variables should be estimated and the proper joint distributions used for the simulation runs to ensure that the random selection of the inputs does not violate the defined correlation. Frequently, though, there is little or no real-world basis for selecting joint probability distributions for the uncertainties that impact project performance. Thus, although the use of multivariate probability distributions may appear attractive from a theoretical standpoint, practitioners will often ignore correlations unless they are very strong, in which case a deterministic relationship might be assumed such that one variable is a function of the other. In that case, the problem reduces to finding a single-variable probability distribution for the independent variable in the assumed deterministic relationship.

Decision Trees

Decision trees provide a second means for generating probability distributions to describe project risks. Like Monte Carlo analysis, decision trees generate probability distributions for a model's outputs by running the model many times with different inputs while respecting the probabilities of those inputs. With Monte Carlo analysis, the inputs for each model run are selected randomly according to the probability distributions specified for those inputs. With decision trees, the model inputs are chosen systematically based on a tree structure that lays out the possible inputs and their probabilities.

Figure 41 provides an example of a relatively simple but realistic decision tree. This tree was used in an analysis of strategy for introducing a new product. The key decision is whether the company should license the product or produce and market it themselves.

Project risk event tree

Figure 41:   Decision tree for evaluating a strategy for producing and marketing a new product.

As illustrated, a decision tree is a graphic structure composed of nodes and branches. By convention, decisions are represented by squares (yellow in this case). The branches emanating from a decision node represent the possible alternatives for the decision. Uncertainties ("chance nodes") are represented by circles (green). The branches emanating from a chance node represent possible outcomes of the uncertain event. A triangle (blue) indicates a terminal (end) node. The order of the nodes (from left to right) in the tree corresponds to the sequence in which decisions must be made and uncertain outcomes will be revealed.

Using Influence Diagrams with Decision Trees

The problem with decision trees, as illustrated by the above example, is that they are generally too large to be displayed on a single sheet of paper or on a computer screen. Fortunately, even very large decision trees can be summarized succinctly by an influence diagram. An influence diagram shows with nodes the decisions, uncertainties, and other factors needed to compute the measures selected to evaluate the project. The arrows in the influence diagram identify the influences that exist. Because they are smaller and much easier to understand, influence diagrams are the preferred method for designing and explaining decision tree models.

The influence diagram for the above tree is shown below.

Influence diagram

Figure 42:   Influence diagram used to construct the example decision tree.

The decision (represented by the yellow square in the influence diagram) is, as I've already explained, whether to manufacture and market the product in house or to license it. If the company licenses the product, the company will pay no production or marketing costs, but it will receive revenue equal to a negotiated license fee times unit sales (hence the arrows into the revenue node). On the other hand, if the company chooses the in-house option, the company will pay uncertain production and marketing costs (hence the arrows to the green chance nodes). It will also generate some level of local jobs, depending on the magnitude of sales. The value of the project will be a weighted combination of the number of new jobs created and the profit generated. As shown, factors that are neither uncertainties nor decisions are displayed as blue rectangles with rounded corners.

Decision tree software, such as Lumina's Analytica and Syncopation's DPL Portfolio, automatically generates decision trees from influence diagrams. DPL was used to create this tree. The software generates a symmetrical tree regardless of whether the tree could be made smaller by removing portions of the tree that don't apply under certain decisions or uncertain outcomes. That is the case here, since the tree shows production and marketing cost outcomes for the license option in the lower half of the tree even though these outcomes are irrelevant to the company under the license option. Including irrelevant nodes in the tree has no effect on the computations, and it is easier for the software to generate symmetrical trees. (The software allows the user to remove irrelevant portions of the tree if desired, but, since the user interface relies on the influence diagram rather than the tree for communication and understanding, little is lost in making trees symmetrical.)

Quantifying the Decision Trees

To quantify the tree, the user specifies probabilities for each branch emanating from each chance node in the tree. If the chance node represents a continuous variable, as is the case here, the continuous probability distribution must be discretized.
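One simple way to discretize a continuous distribution is the bracket-median approach: split the distribution into equally likely brackets and represent each bracket by its median. The sketch below applies this to a hypothetical normally distributed production cost. It is offered as one common option under stated assumptions, not as the method used to build the example tree, and the function name is illustrative.

```python
from statistics import NormalDist

def bracket_median(dist, branches=3):
    """Discretize `dist` into equally likely branches, each placed at its bracket median."""
    prob = 1.0 / branches
    # Branch i covers cumulative probability [i*prob, (i+1)*prob];
    # its median sits at cumulative probability (i + 0.5) * prob.
    return [(dist.inv_cdf((i + 0.5) * prob), prob) for i in range(branches)]

production_cost = NormalDist(mu=100.0, sigma=20.0)  # hypothetical distribution
branches = bracket_median(production_cost)
for value, prob in branches:
    print(f"cost = {value:6.1f}  probability = {prob:.3f}")
```

Each (value, probability) pair becomes one branch of the chance node, here playing the role of the low, medium, and high outcomes.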

Decision trees handle correlations among uncertainties by requiring the probabilities assigned to the possible outcomes of a chance node to be conditional probabilities, conditioned on the outcomes of the previous uncertainties and decisions along the path leading to the chance node. Thus, for example, if marketing costs happened to be correlated with production costs, the probabilities assigned to the high, medium, and low marketing costs (or the values themselves) would be different depending on which production cost branch they connect to.

Like Monte Carlo analysis, a decision tree is typically linked to a model, often a model implemented in Excel, for computing for each path through the tree the relevant outcomes and their value to the organization. In the example, as explained above, the model computes profit and the number of jobs created, quantities which are displayed at the tree's end nodes. The probability of reaching each end node is the product of the probabilities along the branches that form the path. Knowing the probability of obtaining the values assigned to each end node, the decision tree software plots the risk profile for each decision alternative (by adding the probabilities of all values less than each possible value). The figure below shows the risk profile (cumulative probability distribution) for profit under the "do it ourselves" alternative.

Probabilistic forecasts

Figure 43:   Cumulative probability distribution for profit.
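The mechanics of conditional branch probabilities, path probabilities, and the risk profile can be sketched as follows. Every number here is hypothetical and is not taken from the tree in Figure 41; the marketing cost probabilities are conditioned on the production cost branch to show how a tree encodes correlation, and profit is computed by a deliberately trivial model (revenue minus the two costs).

```python
# All values are hypothetical, chosen only to illustrate the mechanics.
REVENUE = 300.0
P_PRODUCTION = {"low": 0.3, "medium": 0.5, "high": 0.2}
PRODUCTION_COST = {"low": 80.0, "medium": 100.0, "high": 130.0}
MARKETING_COST = {"low": 40.0, "medium": 60.0, "high": 90.0}
# Conditional probabilities: P(marketing cost | production cost).
P_MARKETING_GIVEN = {
    "low":    {"low": 0.6, "medium": 0.3, "high": 0.1},
    "medium": {"low": 0.2, "medium": 0.6, "high": 0.2},
    "high":   {"low": 0.1, "medium": 0.3, "high": 0.6},
}

def end_nodes():
    """Enumerate every path: (profit, probability of reaching that end node)."""
    nodes = []
    for prod, p_prod in P_PRODUCTION.items():
        for mkt, p_mkt in P_MARKETING_GIVEN[prod].items():
            profit = REVENUE - PRODUCTION_COST[prod] - MARKETING_COST[mkt]
            # Path probability is the product of the branch probabilities.
            nodes.append((profit, p_prod * p_mkt))
    return nodes

def risk_profile(nodes):
    """Cumulative probability P(profit <= x) at each distinct profit level."""
    totals = {}
    for profit, prob in nodes:
        totals[profit] = totals.get(profit, 0.0) + prob
    profile, cumulative = [], 0.0
    for profit in sorted(totals):
        cumulative += totals[profit]
        profile.append((profit, cumulative))
    return profile

profile = risk_profile(end_nodes())
for profit, cum_prob in profile:
    print(f"P(profit <= {profit:5.0f}) = {cum_prob:.2f}")
```

Unlike a Monte Carlo run, this enumeration visits every path exactly once, so the resulting distribution is exact for the discretized inputs.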

Decision Trees versus Monte Carlo Analysis

Monte Carlo analysis and decision trees are each powerful tools. Both can be linked to a model for estimating the performance of candidate projects. Whatever outputs the model produces will be assigned a probability distribution. For example, if the decision model estimates project performance over time, say annually, decision trees or Monte Carlo analysis can show how the risks to objectives evolve over time (though care needs to be taken to capture the correlations between the model results in adjacent years). Figure 44 provides an example of this common output.

Probabilistic forecasts

Figure 44:   Characterizing risks shows how uncertainties evolve over time.

Monte Carlo analysis and decision trees each have their own advantages and disadvantages. Monte Carlo analysis can be more efficient if there are lots of uncertainties, whereas decision trees get too large if more than 4 or 5 uncertainties are included in the tree. Decision trees have the advantage of being visual; however, as we've seen, this advantage diminishes as the model becomes larger and more complex. With decision trees, most of the probabilities assigned to the branches of chance nodes are conditional probabilities. Conditional probabilities ("How likely is this outcome given the outcomes and decisions shown earlier in the tree?") are easier for subject matter experts to estimate than the joint probability distributions and correlations required by Monte Carlo analysis. Using decision trees requires discretizing continuous uncertainties, but experience shows that the errors introduced by discretization are minor.

The biggest advantage that decision trees have over Monte Carlo analysis is that a decision tree can be "solved" to identify the value-maximizing choices at each decision node in the tree. The solution technique is called tree rollback. The above tree has been rolled back (rollback values are shown in brackets) and the results show that the alternative of producing and marketing the product in house provides more value than licensing the product (following convention, the highest-value alternative at a decision node is identified with a heavy black line).
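Rollback works from the end nodes back toward the root: a chance node takes the probability-weighted average of its branches, and a decision node takes the value of its best alternative. The sketch below is a minimal illustration on a small hypothetical tree, not the tree in Figure 41; the class names, payoffs, and the added downstream "abandon" decision are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Terminal:
    value: float

@dataclass
class Chance:
    branches: list      # list of (probability, child node)

@dataclass
class Decision:
    name: str
    alternatives: dict  # alternative label -> child node

def rollback(node, choices):
    """Return the rolled-back value of `node`; record the best alternative per decision."""
    if isinstance(node, Terminal):
        return node.value
    if isinstance(node, Chance):
        # Expected value over the chance node's outcomes.
        return sum(p * rollback(child, choices) for p, child in node.branches)
    # Decision node: evaluate each alternative and keep the best.
    values = {label: rollback(child, choices) for label, child in node.alternatives.items()}
    best = max(values, key=values.get)
    choices[node.name] = best
    return values[best]

# Hypothetical tree: sales are high/medium/low with probabilities 0.3/0.5/0.2.
tree = Decision("strategy", {
    "in house": Chance([
        (0.3, Terminal(50.0)),
        (0.5, Terminal(20.0)),
        (0.2, Decision("after low sales", {
            "continue": Terminal(-10.0),
            "abandon": Terminal(-2.0),
        })),
    ]),
    "license": Chance([(0.3, Terminal(25.0)), (0.5, Terminal(15.0)), (0.2, Terminal(5.0))]),
})

choices = {}
value = rollback(tree, choices)
print(f"rolled-back value = {value:.1f}, choices = {choices}")
```

The `choices` dictionary holds the preferred alternative at every decision node, including the downstream one. This simple version keys decisions by name, so it assumes each decision name appears only once in the tree.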

Monte Carlo analysis can likewise identify the preferred alternative if the only decision to be made is an initial choice. However, rolling back a decision tree provides not just the best choice for the initial decision, but also the optimal choices for all downstream decisions. With Monte Carlo analysis it is necessary to specify downstream decisions with rules (e.g., abandon the project if net earnings are below zero). While a given rule of thumb may make sense in some scenarios, there is no guarantee that it is appropriate across the complete range of possibilities investigated by the Monte Carlo analysis. Another advantage of decision trees is that they allow for value of information calculations.
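A value of information calculation compares the expected value of deciding now with the expected value of deciding after an uncertainty has been resolved. The sketch below computes the expected value of perfect information about sales for a hypothetical two-alternative choice; the probabilities and payoffs are invented for illustration and the names are assumptions.

```python
# Hypothetical payoffs for each alternative under each sales outcome.
P_SALES = {"high": 0.3, "medium": 0.5, "low": 0.2}
PAYOFF = {
    "in house": {"high": 50.0, "medium": 20.0, "low": -10.0},
    "license":  {"high": 25.0, "medium": 15.0, "low": 5.0},
}

def expected_value(alternative):
    """Expected payoff of committing to one alternative before sales are known."""
    return sum(P_SALES[s] * PAYOFF[alternative][s] for s in P_SALES)

# Best expected value when the choice is made before sales are known.
value_without_info = max(expected_value(a) for a in PAYOFF)

# Expected value when the sales outcome is learned first and the
# best alternative is then chosen for that outcome.
value_with_info = sum(
    p * max(PAYOFF[a][s] for a in PAYOFF) for s, p in P_SALES.items()
)

evpi = value_with_info - value_without_info
print(f"without information: {value_without_info:.1f}")
print(f"with perfect information: {value_with_info:.1f}")
print(f"expected value of perfect information: {evpi:.1f}")
```

In tree terms, the second calculation corresponds to moving the sales chance node ahead of the decision node and rolling the tree back again. The difference is an upper bound on what it would be worth paying for a sales forecast.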

Risks of the Project Portfolio

Another important reason to consider quantifying project risks is that the overall risk of the project portfolio can then be determined. As noted above, conducting a portfolio of projects reduces risks through risk diversification (hedging), but diversification is less effective for organizations that conduct a smaller number of projects. Also, non-diversifiable risks impact portfolio risks regardless of how large the project portfolio is.

Both Monte Carlo analysis and decision trees can, in theory, be used to quantify the risks of the entire portfolio, provided the decision model computes the combined impact on the organization's objectives and the portfolio value. Correlations among projects must, of course, be quantified. The difficulty is specifying joint probability distributions for the Monte Carlo analysis and the size of the necessary decision tree if there are more than just a few projects. Thus, the more common way to quantify portfolio risks is to define scenarios for risks that impact multiple projects. For example, if the state of the general economy influences the performance of multiple projects, three scenarios might be defined: "improving economy," "status quo," and "recession." Either Monte Carlo analysis or a decision tree could be used to separately estimate the performance of each project under each scenario. Then, portfolio risks can be computed by combining the scenario uncertainties with the (assumed independent) risks computed for each project.
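The scenario approach can be sketched as a two-level simulation: draw one economic scenario for the whole portfolio, then draw each project's value independently given that scenario. The scenario probabilities, project values, and the assumption of five identical projects below are hypothetical. For comparison, the sketch also computes the spread that would result if the common scenario were (wrongly) drawn separately for each project.

```python
import random
import statistics

SCENARIOS = ["improving economy", "status quo", "recession"]
SCENARIO_PROBS = [0.3, 0.5, 0.2]   # hypothetical scenario probabilities
# Hypothetical mean value of each project under each scenario.
MEAN_VALUE = {"improving economy": 12.0, "status quo": 10.0, "recession": 4.0}
PROJECT_SD = 2.0                   # project-specific risk, assumed independent
N_PROJECTS = 5

def portfolio_sd(shared_scenario, trials=20_000, seed=4):
    """Std dev of total portfolio value, with or without a shared scenario."""
    rng = random.Random(seed)
    totals = []
    for _trial in range(trials):
        common = rng.choices(SCENARIOS, weights=SCENARIO_PROBS)[0]
        total = 0.0
        for _project in range(N_PROJECTS):
            if shared_scenario:
                scenario = common  # every project faces the same economy
            else:
                # Ignores the common risk: each project gets its own draw.
                scenario = rng.choices(SCENARIOS, weights=SCENARIO_PROBS)[0]
            total += rng.gauss(MEAN_VALUE[scenario], PROJECT_SD)
        totals.append(total)
    return statistics.stdev(totals)

sd_shared = portfolio_sd(shared_scenario=True)
sd_independent = portfolio_sd(shared_scenario=False)
print(f"portfolio std dev, shared scenario:       {sd_shared:.1f}")
print(f"portfolio std dev, scenarios independent: {sd_independent:.1f}")
```

With these illustrative inputs, treating the economy as a shared risk roughly doubles the estimated spread in portfolio value, because a risk common to all projects cannot be diversified away.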

Failure to account for risks that simultaneously impact multiple investments can have serious consequences. The 2008 financial crisis provides many illustrations. For example, insurance giant American International Group (AIG) used models to assess the risks associated with complicated contracts called credit-default swaps, which totaled more than $400 billion. According to an article in the Wall Street Journal, AIG knew its models left out global market forces and commonly used contract terms (non-diversifiable risks which affected nearly all of AIG's investments). Even so, AIG neglected to expand its risk management models to include these risks. In retrospect, it was clear that the failure to address the common threats caused AIG to vastly underestimate risk and to continue to purchase the dangerous contracts. Were it not for the government bailout, AIG would have collapsed. [6]