"A common misunderstanding about weights is that they convey the relative importance of objectives." 
My step-by-step process for constructing a project selection decision model calls for weights to be assessed near the end of the design effort. As reflected in Figure 36, specifying weights necessarily comes after nearly all of the other significant choices about model design have been made.

Figure 32: Steps for creating a project selection decision model.

Understanding Weights

The term "weights" refers to the w_{i} factors in the additive form of a multiattribute value function [1]:

V(x) = w_{1}V_{1}(x_{1}) + w_{2}V_{2}(x_{2}) + ... + w_{N}V_{N}(x_{N}) (Equation 1)
The use of this term may contribute to the common misunderstanding that the w_{i} factors define the relative importance of the objectives and their corresponding performance measures. If, for example, an organization designs a multicriteria prioritization model with a weight on public safety that is half as large as the weight assigned to net revenue, observers may assume the organization regards public safety as half as important as profits. The relative sizes of the weights do not support such a conclusion. As shown below, weights depend not only on the relative value attributed to improvements in the performance measures but also on the ranges defined for their assessment [2]. Even though conclusions about the meaning of weights are often based on misconceptions, the ease with which such misunderstandings arise underscores the need for care when designing the model and when explaining its logic.

Weights as Scaling Factors

As described previously, it is customary to scale (normalize) single-attribute value functions V_{i}(x_{i}) to go from zero to one [3]. The single-attribute value function V_{i}(x_{i}) converts performance relative to the i'th objective into the relative value of that level of performance. Regardless of the shape of the V_{i}(x_{i}) functions, the values assigned to the worst and best performance levels are zero and one, respectively. As indicated by the form of Equation 1, the w_{i} weights serve as scaling factors that allow the numbers obtained from the single-attribute value functions, indicating relative preference for performance levels, to be compared with one another and summed [4]. If, for example, performance measure x_{A} has a weight w_{A} that is twice the weight w_{B} for measure x_{B}, this should be interpreted as meaning that the decision maker values an increment of 0.1 value points on performance measure x_{A} the same as an increment of 0.2 value points on performance measure x_{B}.
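To make the scaling-factor interpretation concrete, here is a minimal sketch in Python; the measures, ranges, and numbers are illustrative assumptions, not taken from the text:

```python
# Additive multiattribute value: V(x) = sum_i w_i * V_i(x_i).
# Two hypothetical measures, A (range 0..100) and B (range 0..50),
# each with a linear single-attribute value function scaled 0 to 1.

def v_A(x):
    return (x - 0.0) / (100.0 - 0.0)

def v_B(x):
    return (x - 0.0) / (50.0 - 0.0)

w_A, w_B = 2 / 3, 1 / 3   # w_A is twice w_B; the weights sum to one

def total_value(x_a, x_b):
    return w_A * v_A(x_a) + w_B * v_B(x_b)

# A 0.1-value-point gain on A changes total value exactly as much as
# a 0.2-value-point gain on B, because w_A = 2 * w_B:
gain_from_A = w_A * 0.1
gain_from_B = w_B * 0.2
```

With both measures at their best levels, total value is one, as the normalization requires.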
The w_{i} weights control the sensitivity of total value to changes in the various areas of performance represented in the model. If w_{i} is made smaller, changes in the level of performance x_{i} will have less significance for determining total value. Conversely, if w_{i} is made larger, total value will be more sensitive to projects that impact performance in this area.

Weights as Swing Weights

Scaling the multiattribute value function V( · ) and the single-attribute value functions V_{i}(x_{i}) in the normal way means that, if x_{i}^{–} and x_{i}^{+} are the worst and best outcome levels for the i'th performance measure, respectively, and if x^{–} and x^{+} are the worst and best outcome bundles, respectively, then:
V_{i}(x_{i}^{–}) = 0 , V_{i}(x_{i}^{+}) = 1, i = 1, 2, ..., N
V(x^{–}) = 0 , V(x^{+}) = 1

With this normalization, the weights must sum to one:

w_{1} + w_{2} + ... + w_{N} = 1 (Equation 2)
Now suppose there are two outcome bundles, denoted x^{1} and x^{2}, that are identical for every measure except the i'th. The i'th measures for the two bundles are set equal to the worst and best outcomes for that measure, respectively. The performance levels for the measures other than the i'th, designated x^{#}, are at arbitrary levels but are the same for each bundle (the levels can differ from measure to measure, but whatever the levels are for one bundle, they are the same for the second bundle). Expressing these assumptions mathematically:
x^{1} = [x_{i} = x_{i}^{–}; x_{k} = x^{#}, k ≠ i]
x^{2} = [x_{i} = x_{i}^{+}; x_{k} = x^{#}, k ≠ i]

With the definitions as given, going from x^{1} to x^{2} is the swing from the worst to best outcome for the i'th performance measure (other performance measures remaining unchanged). The value increase associated with this swing is:
V(x^{2}) - V(x^{1}) = w_{i}V_{i}(x_{i}^{+}) - w_{i}V_{i}(x_{i}^{–}) = w_{i}
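This identity can be checked numerically; the weights and performance levels below are illustrative assumptions:

```python
# Swing the first measure from worst (0) to best (1) on its value scale,
# holding the others at arbitrary levels x#; the value gain equals w_1.
weights = [0.5, 0.3, 0.2]        # assumed swing weights, summing to one
value_fns = [lambda v: v] * 3    # identity V_i on a 0-to-1 scale (assumed)

def V(bundle):
    return sum(w * f(x) for w, f, x in zip(weights, value_fns, bundle))

x1 = [0.0, 0.4, 0.7]    # measure 1 at its worst; others at x#
x2 = [1.0, 0.4, 0.7]    # measure 1 at its best; others unchanged
swing_gain = V(x2) - V(x1)       # equals w_1 = 0.5
```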
This result shows that each scaling factor w_{i} has a very specific interpretation; namely, w_{i} is the relative value (value expressed on a zero-to-one scale) of obtaining a swing from the worst to best outcome on the i'th performance measure [5]. For this reason, weights are more precisely termed swing weights; they may be obtained by estimating the value of swings from worst to best outcome levels for the performance measures.

Weights as Tradeoffs

Consider two outcome bundles where the outcome levels for every performance measure but two, the i'th and j'th, are the same. For the first of the two outcome bundles, labeled a, the i'th performance measure is at its best level and the j'th performance measure is at its worst level. The other performance measures are at some arbitrary, common level x^{#}:
x^{a} = [x_{i} = x_{i}^{+}; x_{j} = x_{j}^{–}; x_{k} = x^{#}, for k ≠ i, j]
For the second outcome bundle, labeled b, the i'th performance measure is at some special outcome level x_{i}^{☆}, and the j'th performance measure is at its best level, with the other measures set at the common level x^{#}:
x^{b} = [x_{i} = x_{i}^{☆}; x_{j} = x_{j}^{+}; x_{k} = x^{#}, for k ≠ i, j]
Now consider the swings that occur for the individual performance measures if there is a swing from performance bundle x^{a} to performance bundle x^{b}. The j'th performance measure swings from its worst level to its best level, so there is a gain in value equal to w_{j}. Meanwhile, the i'th performance measure swings from its best level to the level x_{i}^{☆}. This swing can be decomposed into two parts: a swing from the best level to the worst level, which produces a loss in value of w_{i}, and a swing from the worst level to the level x_{i}^{☆}, which, since V_{i}(x_{i}^{–}) is zero, produces a gain in value equal to w_{i}V_{i}(x_{i}^{☆}). Thus, the net value of the swing from x^{a} to x^{b} is w_{j} - w_{i} + w_{i}V_{i}(x_{i}^{☆}). Suppose now that the level x_{i}^{☆} is adjusted so that the values of the two outcome bundles are equal. Then, we must have:
w_{j} = w_{i}[1 - V_{i}(x_{i}^{☆})]
Dividing each side by w_{i}:
w_{j}/w_{i} = 1 - V_{i}(x_{i}^{☆})

or, inverting:
w_{i}/w_{j} = 1/[1 - V_{i}(x_{i}^{☆})]
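A numeric sketch of this tradeoff relation; the weight w_i and the judged value V_i(x_i^☆) are illustrative assumptions:

```python
# If setting measure i to a level whose single-attribute value is v_star
# exactly compensates a worst-to-best swing on measure j, the weights
# must satisfy w_j = w_i * (1 - v_star).
w_i = 0.4
v_star = 0.25                 # assumed V_i(x_i^*) from an indifference judgment
w_j = w_i * (1 - v_star)      # 0.3
ratio = w_i / w_j             # 1 / (1 - v_star)
```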
This tradeoff relation shows how indifference judgments determine the ratios of weights. If the decision maker estimates the fraction of the swing for measure x_{i} that is of equal value to the swing for measure x_{j}, we can generate equations that relate the weights for the performance measures [6]. To illustrate, suppose, for convenience, that the decision maker starts by ranking the performance measure swings from most valuable to least valuable. Then, the decision maker specifies the fraction or percentage of the top-ranked swing that would have equal value to the second-ranked swing. Designate the proportion of the top-ranked swing that equals the second-ranked swing p_{12}. Then, if w_{1} and w_{2} are the weights for the swings ranked number one and two:
w_{2} = p_{12}w_{1}
Likewise, if the decision maker estimates the proportion of the top ranked swing that is of equal value to the third ranked swing, designated p_{13}:
w_{3} = p_{13}w_{1}
Continuing in this way, N - 1 equations can be obtained for the N swing weights. To solve for the weights, you can assign "1" to the first, compute a value for each of the other w_{i}, and then use Equation 2 to normalize the weights to sum to one.

Consistency Checks

Additional equations relating weights can be generated to provide consistency checks [7]. For example, if the decision maker estimates the proportion of the second-ranked swing that is of equal value to the third-ranked swing, denoted p_{23}, then, for consistency, we should have:
p_{13} = p_{12}p_{23}
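The ratio-based weight computation and this consistency relation can be sketched together; the judged proportions below are illustrative assumptions:

```python
# Judged proportions of the top-ranked swing equal in value to each
# lower-ranked swing (p_11 = 1 by definition); numbers are assumed.
p1 = [1.0, 0.8, 0.5, 0.2]                 # p_11, p_12, p_13, p_14

raw = [pk * 1.0 for pk in p1]             # set w_1 = 1, then w_k = p_1k * w_1
total = sum(raw)
weights = [w / total for w in raw]        # normalize so weights sum to one

# Consistency check: an independently judged p_23 should satisfy
# p_13 = p_12 * p_23, i.e. p_23 should equal p_13 / p_12 here.
p23_implied = p1[2] / p1[1]
```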
Consistency checks can provide a sense for the reliability of the judgments being provided by the decision makers who are the subjects of the weight assessment process. If serious inconsistencies are identified, they can be pointed out so as to allow decision makers to reconsider their judgments and resolve the inconsistencies.

Weights as Measures of "Importance," Not

Another concept often used to assign weights is "importance": the more important the objective (or its performance measure) is judged to be, the larger the weight assigned to it. Assigning weights based on importance provides the opportunity to specify weights at every level of the objectives hierarchy, not just for the lowest-level objectives. With importance weights, each level of the hierarchy has its own weights, and at each level those weights sum to one (or one hundred, if the weights are expressed as percentages). The weight for each intermediate objective equals the sum of the weights of its subobjectives (the connected objectives directly below it in the hierarchy). An objective with subobjectives will thus have a "category weight," a weight indicating the combined importance of all of its subobjectives. Conveniently, importance weights can be specified from the top down, with the weight for an objective at any level being apportioned across its lower-level subobjectives.

Despite these attractive features, you should avoid the temptation to use importance weights. Importance weights have no theoretical or operational grounding. How do you measure importance? A decision maker might say, for example, that health is "X" times more important than money, but what is the amount of health or money to which that statement applies [8]? Avoiding a fatality is certainly more important than saving $1,000, but is avoiding a cold more important than $10,000? In addition to the theoretical problems with using importance weights, there are practical problems.
Tests, for example, show that weights assigned based on feelings of importance have poor repeatability, meaning that the same decision maker will assign different importance weights to a performance measure at different points in time [42]. Of equal concern, importance weights violate the range sensitivity principle.

The Range Sensitivity Principle

As described above, each w_{i} weight in the normalized, additive value function (Equation 1) equals the value, as determined by the decision maker, of a specified swing in performance measured by the x_{i} performance measure. Typical advice for specifying a swing is to choose either a "local" range or a "global" range. A local range for a performance measure is typically defined as a range that spans the levels of performance seen in the current set of alternatives. The global range is larger; it is typically defined as a range that spans the worst and best levels of performance that are theoretically possible. In truth, weights may be estimated based on any performance range that might be defined. The selection of the range for normalizing the value function is a choice that may be made at the time the model is being designed.

Regardless of how ranges in performance are selected, making the range for a swing smaller should cause its weight to become smaller, and, conversely, making the range for a swing larger should yield a larger weight. This result is known as the range sensitivity principle. Importance weights, because they are assigned independent of any specified swing in performance, obviously violate the range sensitivity principle. In fact, though, tests show that the range sensitivity principle is violated to a lesser or greater degree under nearly every weight assessment method and by nearly every subject. Why? One theory is the anchoring and adjustment bias described in Part 1 of this paper.
When asked to specify the value of a swing, people initially think about the underlying objective. People have many years of intuitive experience thinking about the importance of objectives. Those initial thoughts may create anchors, and, like most anchors, the adjustments away from them tend to be too small. Thus, people have trouble adequately adjusting weights according to the postulated swings. Nearly every study reported in the literature indicates that the range sensitivity principle is violated, at least to a small degree, and quite often significantly so. The unavoidable conclusion from the literature is that people do not and cannot adequately adjust weights to be in accordance with the range sensitivity principle.

Methods for Assessing Weights

A surprisingly large number of methods have been proposed for assessing weights, many of which are simply minor variants of one another. However, as discussed below, even small procedural differences have been shown to sometimes have important consequences [9]. The proposed methods differ both in the nature of the questions posed and in the interpretations given to the judgments obtained. The methods can be categorized as direct, indirect, tradeoff, ranking, pairwise, interval, holistic, and others. The tables below summarize some of the more popular methods useful for obtaining weights for project selection models. Be aware that in cases where original definitions require estimations of "importance," I've rephrased the instructions to seek judgments of the desirability of swings. Computer programs are available for guiding many, if not all, of these methods. Assume for the application of the methods that "worst" and "best" levels of performance have been defined for each measure, thereby allowing swings in performance to be specified.
Nearly all of the proposed methods advise that weights be normalized so that they sum to one, so assume, unless indicated otherwise, that such normalization is the final step in the process for each method.

Direct Methods

Direct weight assessment methods seek direct estimates of weights from decision makers, based on weights being defined as the judged value of specified swings in performance. The main direct methods for obtaining weights are rating and point allocation.
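As a minimal sketch of point allocation (the measures and point assignments are hypothetical), the method amounts to dividing 100 points among the swings and normalizing:

```python
# Hypothetical point-allocation result: 100 points divided among four
# performance-measure swings by the decision maker.
points = {"revenue": 40, "safety": 30, "schedule": 20, "reputation": 10}

total = sum(points.values())                         # 100
weights = {m: p / total for m, p in points.items()}  # normalize to sum to one
```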
Graphical and physical aids are often used to facilitate direct assessment. For example, point allocation can be facilitated by providing decision makers 100 poker chips to be allocated among the swings. Direct rating may be aided by drawing a line on graph paper between points representing the values assigned to the least and most desirable swings. The decision maker then places tick marks along the line indicating the relative values of swings that lie between the values that bound the line. The values corresponding to the locations of the tick marks can then be obtained either graphically or with a software program designed for this purpose. A strength of direct methods is the opportunity to stress that assignments be based on the valuation of swings, which can help reduce the tendency subjects have to be biased by notions of the relative importance of objectives.

Tradeoff Methods

Tradeoff methods require decision makers to provide judgments that establish relationships between pairs of weights.
In ratio and tradeoff methods, performance swings are considered in pairs and presented to decision makers as contrasting outcomes that differ only in the performance measures under consideration. If helpful, comparisons between the values of the outcome swings may be expressed in percentages. The method wherein tradeoffs are expressed at the level of performance units requires performance measures to be continuous and single-attribute value functions to be linear. As an example of how the method works [15], suppose a computer manufacturer must choose between two product designs: One is less costly and the other will get to market sooner. The performance measures are cost per unit and months to market. The method requires the decision maker to first choose the more valuable unit of performance. For example, the judgment might be that reducing time to market by one month is more valuable than reducing the cost to produce each computer by $1. Then, the decision maker must determine the number of the less desired units equal in value to one of the more desired units. For example, it might be estimated that lowering cost per unit by $15 is equally desirable to getting to market one month sooner. Per-unit conversions for other performance measures might be similarly obtained, so the method is well suited to situations where it is desirable to express project value in equivalent dollars. Note that for this weight assessment method, the per-unit conversions are interpreted as weights, so there is no normalization. The equivalent cost method, also called pricing out, is very useful, but requires that performance measures have been defined in terms of units familiar to decision makers, with corresponding single-attribute value functions that are linear in those units (something you should usually aim for regardless).
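The computer manufacturer example can be sketched as follows; the $15-per-month tradeoff comes from the example, while the design numbers are assumptions for illustration:

```python
# Pricing out: the judged tradeoff is that getting to market one month
# sooner is worth a $15 reduction in cost per unit, so months saved can
# be converted to equivalent dollars per unit (no normalization needed).
DOLLARS_PER_MONTH = 15.0

def equivalent_dollars_per_unit(cost_reduction, months_saved):
    return cost_reduction + months_saved * DOLLARS_PER_MONTH

design_cheap = equivalent_dollars_per_unit(cost_reduction=10.0, months_saved=0.0)
design_fast = equivalent_dollars_per_unit(cost_reduction=0.0, months_saved=1.0)
```

Under this judged tradeoff, the faster design delivers more equivalent-dollar value per unit than the cheaper one.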
Estimates of equivalent monetary value may then be obtained based on willingness to pay ("How much would you be willing to spend to move the performance measure from its worst level to its best level of performance?"). Alternatively, equivalent monetary values can sometimes be obtained from available results of cost-benefit analyses (CBA), academic research, government-recommended values, and values being used by other organizations. CBA derives a monetary value per unit of a consequence based on market prices, contingent valuation (people's willingness to pay), and the hedonic price method (analyzing market prices to determine how various factors impact them). For example, if reducing greenhouse gas emissions is an objective, and tons of greenhouse gas emissions released is the performance measure, then CBA results can be used to obtain an equivalent dollar cost per ton of greenhouse gas released. Multiplying the per-ton cost by the estimated number of tons by which emissions might be reduced provides an equivalent monetary value for emissions reductions. Compared to direct methods, obtaining weights through tradeoff methods is more cognitively demanding for decision makers. Also, applying the method typically requires real-time computer support for calculations.

Indirect Methods

While direct methods demand precise estimates from decision makers that lead to precise weights, an alternative is to allow decision makers to express vagueness or uncertainty in their responses to weight assessment questions. An example is provided by the so-called balance beam method, described below.
Indirect methods for weight assessment may in some situations be easier for decision makers than direct methods. Also, some argue that attempting to put a precise value on an inherently imprecise concept is inappropriate, misleading, and conveys a false sense of precision. Since the equations allow for identifying ranges of values for some weights, the method coordinates well with sensitivity analysis.

Ranking (Ordinal) Methods

With these methods, decision makers need only rank the swings. Surrogate values for weights are then derived from the ranking.
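One widely used surrogate is the rank-order-centroid (ROC) formula, which assigns to the k'th-ranked of N swings the weight w_k = (1/N) Σ_{j=k..N} 1/j. A sketch:

```python
# Rank-order-centroid (ROC) surrogate weights: for N ranked swings,
# w_k = (1/N) * sum(1/j for j = k..N); the weights sum to one.
def roc_weights(n):
    return [sum(1.0 / j for j in range(k, n + 1)) / n for k in range(1, n + 1)]

w = roc_weights(4)
# w[0] = (1 + 1/2 + 1/3 + 1/4) / 4 = 25/48; w[3] = (1/4) / 4 = 1/16
```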
Ranking methods are good options for situations where weights must be obtained quickly from many subjects who might be unwilling or unable to devote the greater effort needed to provide cardinal information for computing weights. Of the conversion methods, ROC has gained the most recognition. The weight assessment method proposed by Edwards and Barron, called SMARTER (SMART Exploiting Ranks), is ranking with ROC [11]. SMARTER is presented by the authors as an improvement to SMART because it does not force subjects to provide more difficult cardinal judgments.

Mixed Ordinal-Cardinal Methods

Direct methods for assessing weights demand from decision makers sufficient cardinal judgments to allow the computation of precise values for the complete set of weights. At the other extreme, ranking methods provide no opportunity for decision makers to provide cardinal judgments. Between these two extremes, mixed ordinal-cardinal methods allow decision makers an opportunity to input cardinal judgments while not demanding all of the cardinal judgments needed to compute a precise value for every weight. Two examples of ordinal-cardinal methods are presented below: CROC (Cardinal Rank Ordering of Criteria) [24] and Simos's method [25]. The version of Simos's method described is sometimes referred to as the improved Simos method, because it includes an improvement provided by Figueira and Roy [26]. A challenge for mixed ordinal-cardinal methods is providing an efficient mechanism for enabling decision makers to express cardinal judgments. Both CROC and the Simos method use visual cues for expressing cardinal judgments. CROC is facilitated by a software package that allows the user to express degrees of preference and associated levels of uncertainty using a slider. With Simos's method, users express degrees of preference using white and colored cards.
CROC and the Simos method extend ROC ranking in a way that allows but does not force subjects to express weakly held or vague feelings about their preferences. Because cardinal information may be added to the ordinal results produced by ranking, the methods have the potential to incorporate into the project selection model more accurate and comprehensive assessments of the preferences of decision makers. People seem to like communicating about preferences using the similar graphical means adopted by the two methods. Because the methods allow imprecise representations of preference, they can lessen the reluctance that some decision makers feel toward revealing their true preferences.

Pairwise Comparisons Using Scales Defined with Qualitative Phrases

Pairwise comparison is a popular weight assessment method that has been shown, through many successful applications, to work well for most subjects. Subjects are able to discern and express relatively small differences in preference using this method [29]. Each item is compared with every other item to determine which is preferred and by how much. The usual approach to pairwise comparison calls for providing a scoring scale, with scale levels defined using common qualitative phrases like "slightly preferred," "moderately preferred," and so forth. Such scales are often referred to as semantic scales. Providing a semantic scoring scale with score levels defined using qualitative phrases can greatly ease the subjects' task of providing pairwise comparison judgments. While some decision makers will not be able to provide weights by direct assignment or point allocation, conducting weight assessments through a pairwise comparison method with a semantic scoring scale will almost always be successful. Most of the example scoring scales that you'll find for pairwise comparisons are designed to obtain judgments of the relative "importance" of the pairs being considered.
Accordingly, and rightly so, pairwise comparison methods are often criticized because they allow for different interpretations of scores due to the lack of any quantitative basis for measuring importance. However, simply replacing the term "importance" with "value" in the scale definitions, as I've done in these examples, mitigates these criticisms. Estimating the relative value of alternative swings remains a difficult task; however, in the case of value, the term has a precise meaning and there are natural measures for quantifying it. And, as argued on previous pages, value maps exactly and directly to preference. Consistent with most descriptions of pairwise comparison, I've not included ranking as the first step for conducting the process. However, as I describe below, my approach is always to begin by asking decision makers to rank swings in order of desirability. Ranking is a relatively easy task, and incorporating it into the assessment process provides an opportunity for decision makers to discuss differing viewpoints prior to attempting the more difficult task of quantifying degrees of preference.
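One common way to convert a completed pairwise comparison matrix into weights (a generic technique, not necessarily the procedure used here) is the normalized geometric mean of each row, where entry a[i][j] estimates the preference ratio w_i/w_j. A sketch with illustrative, perfectly consistent judgments:

```python
import math

def weights_from_pairwise(a):
    # Geometric mean of each row, then normalize so the weights sum to one.
    gm = [math.prod(row) ** (1.0 / len(row)) for row in a]
    total = sum(gm)
    return [g / total for g in gm]

# Three swings: swing 1 is judged twice as valuable as swing 2 and four
# times as valuable as swing 3 (hypothetical, consistent judgments).
a = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]
w = weights_from_pairwise(a)   # approximately [4/7, 2/7, 1/7]
```

For a perfectly consistent matrix, this reproduces the underlying weight ratios exactly; for mildly inconsistent judgments, it yields a compromise set of weights.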
Assessing Risk Tolerance

The assessment of weights involves determining the organization's willingness to trade off advancements in the achievement of its various objectives. If decisions about projects affect the level of risk faced by the organization, it is also necessary to establish the organization's willingness to trade off the expected achievement of its objectives in order to reduce risk. Obtaining the organization's willingness to accept risk requires determining the levels of risk involved in project choice decisions, which requires having a model for simulating risk. Discussion of the process for quantifying risk and the organization's willingness to accept risk is the topic of the next part of this paper.

References
