Qualitative GSA

Qualitative GSA deals with the computation of measures that can rank random input parameters in terms of their impact on the function output and the variability thereof. This is done to a degree of accuracy that allows distinguishing between influential and non-influential parameters. If the measures for some input parameters are negligibly small, these parameters can be fixed so that the number of random input parameters decreases for a subsequent quantitative GSA. This section explains the qualitative measures and the trade-off between computational costs and accuracy.

The most commonly used measures in qualitative GSA are the mean EE, \(\mu\), the mean absolute EE, \(\mu^*\), and the standard deviation of the EEs, \(\sigma\). The EE of \(X_i\) is given by one individual function derivative with respect to \(X_i\). The “change in”, or “step of”, the input parameter is denoted by \(\Delta\). The only restriction is that \(X_i + \Delta\) lies in the sample space of \(X_i\). The Elementary Effect, or derivative, is given by

\[d_i^{(j)} = \frac{f(\pmb{X}_{\sim i}^{(j)}, X_i^{(j)} + \Delta^{(i,j)}) - f(\pmb{X}^{(j)})}{\Delta^{(i,j)}},\]

where \(j = 1, \dots, r\) indexes the observations of \(X_i\). Note that the EE, \(d_i^{(j)}\), is equal to the aforementioned local measure, the system derivative \(S_i = \frac{\partial Y}{\partial X_i}\), except that \(\Delta\) does not have to be infinitesimally small. To offset the third drawback of \(d_i\) and \(S_i\), namely that a single base vector \(\pmb{X}\) does not represent the whole input space, one computes the mean EE, \(\mu_i\), based on a random sample of \(X_i\) from the input space. The second drawback, that interaction effects are disregarded, is also offset because the elements of \(\pmb{X}_{\sim i}\) are resampled for each new \(X_i\). The mean EE is given by

\[\mu_i = \frac{1}{r} \sum_{j=1}^{r} d_i^{(j)}.\]

Thus, \(\mu_i\) is the global version of \(d_i^{(j)}\). The standard deviation of the EEs reads \(\sigma_i = \sqrt{\frac{1}{r} \sum_{j=1}^{r} (d_i^{(j)} - \mu_i)^2}\). The mean absolute EE, \(\mu_i^*\), is used to prevent observations of opposite sign from cancelling each other out:

\[\mu_i^* = \frac{1}{r} \sum_{j=1}^{r} \big| d_i^{(j)} \big|.\]
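
To make these definitions concrete, the following minimal numpy sketch estimates \(d_i^{(j)}\), \(\mu_i\), \(\mu_i^*\), and \(\sigma_i\) for a hypothetical test function with uniform inputs and a fixed step. The function, sample size, and step size are illustrative assumptions, not part of any particular sampling scheme.

```python
import numpy as np

def f(x):
    # Hypothetical test function: X_1 enters linearly and through an
    # interaction with X_2.
    return 2.0 * x[..., 0] + x[..., 0] * x[..., 1]

rng = np.random.default_rng(seed=123)
r, delta = 1000, 0.1

# r random base points, rescaled so that X_i + delta stays inside the
# sample space [0, 1].
base = rng.uniform(0.0, 1.0 - delta, size=(r, 2))

ee = np.empty((r, 2))
for i in range(2):
    shifted = base.copy()
    shifted[:, i] += delta                      # perturb only X_i
    ee[:, i] = (f(shifted) - f(base)) / delta   # d_i^{(j)}

mu = ee.mean(axis=0)               # mean EE, mu_i
mu_star = np.abs(ee).mean(axis=0)  # mean absolute EE, mu_i^*
sigma = ee.std(axis=0)             # standard deviation of the EEs, sigma_i
print(mu, mu_star, sigma)
```

In this example, the EE of \(X_1\) equals \(2 + X_2^{(j)}\), so the nonzero \(\sigma_1\) picks up the interaction effect that a purely local derivative at one base point would miss.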

Step \(\Delta^{(i,j)}\) may or may not vary depending on the sample design that is used to draw the input parameters.
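
For intuition, the sketch below contrasts the two cases under common conventions from the screening literature: a constant grid-based step, as in Morris-type trajectory designs, versus a step that varies across \(i\) and \(j\) because the perturbed coordinate is redrawn from the input space, as in radial designs. The grid size and sample dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
k, r = 3, 5  # number of input parameters and of observations

# Fixed step, as in grid-based trajectory designs: a common choice is
# delta = p / (2 * (p - 1)) on a grid with p levels.
p = 4
delta_fixed = p / (2 * (p - 1))

# Varying step, as in radial designs: the perturbed coordinate is
# redrawn from the input space, so Delta^{(i,j)} differs in size and
# sign across parameters i and observations j.
draws_a = rng.uniform(size=(r, k))
draws_b = rng.uniform(size=(r, k))
delta_varying = draws_b - draws_a

print(delta_fixed)    # one scalar step shared by all EEs
print(delta_varying)  # an (r, k) array of individual steps
```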

One last variant is provided in [Smith.2014]: the scaling of \(\mu_i^*\) by \(\frac{\sigma_{X_i}}{\sigma_Y}\). This measure is called the sigma-normalized mean absolute EE:

\[\mu_{i,\sigma}^* = \mu_i^* \frac{\sigma_{X_i}}{\sigma_Y}.\]

This improvement is necessary for a consistent ranking of \(X_i\). Otherwise, the ranking would be distorted by differences in the scale of the input parameters. The reason is that the input space constrains \(\Delta\): if the input space of \(X_i\) is larger, its base value can be changed by a larger \(\Delta\).
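
Building on the first sketch, the following minimal example computes \(\mu_{i,\sigma}^*\). Estimating \(\sigma_{X_i}\) and \(\sigma_Y\) from the same sample is an illustrative shortcut, not a prescription from [Smith.2014].

```python
import numpy as np

def f(x):
    # Same hypothetical test function as in the first sketch.
    return 2.0 * x[..., 0] + x[..., 0] * x[..., 1]

rng = np.random.default_rng(seed=123)
r, delta = 1000, 0.1
base = rng.uniform(0.0, 1.0 - delta, size=(r, 2))

ee = np.empty((r, 2))
for i in range(2):
    shifted = base.copy()
    shifted[:, i] += delta
    ee[:, i] = (f(shifted) - f(base)) / delta

mu_star = np.abs(ee).mean(axis=0)

# Sigma-normalization: rescale each mu_i^* by the standard deviation of
# X_i relative to the standard deviation of Y, both estimated from the
# same sample here (an illustrative shortcut).
sigma_x = base.std(axis=0)
sigma_y = f(base).std()
mu_star_sigma = mu_star * sigma_x / sigma_y
print(mu_star_sigma)
```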

From the aforementioned set of drawbacks of the local derivative \(S_i = \frac{\partial Y}{\partial X_i}\), two drawbacks remain for the EE method. The first drawback is the missing direct link to the output variance \(\text{Var}(Y)\). The second drawback is that the choice of \(\Delta\) is somewhat arbitrary if the derivative is not analytic. To date, the literature has not developed convincing solutions for these issues.

In an attempt to establish a closer link between EE-based measures and Sobol’ indices, [Kucherenko.2009] reaches two conclusions. The first conclusion is that there is an upper bound for the total index, \(S_i^T\), such that

\[S_i^T \leq \frac{\frac{1}{r} \sum_{j=1}^{r} \big(d_i^{(j)}\big)^2}{\pi^2 \sigma_Y^2}.\]

This expression makes use of the squared EEs. In light of this characteristic, the use of \(\sigma_i\) as a measure that aims to capture the variation of \(d_i^{(j)}\) appears less relevant. Nevertheless, the rescaling in this bound makes the interpretation more difficult. The second conclusion is that the Elementary Effects method can lead to false selections for non-monotonic functions. This is also true if functions are non-linear. The reason is linked to the aforementioned second drawback, the arbitrary choice of step \(\Delta\). More precisely, depending on the sampling scheme, \(\Delta\) might be random instead of arbitrary and constant. In both cases, \(\Delta\) can be too large to approximate a derivative. If, for example, the function is highly non-linear to a varying degree with respect to the input parameters \(\pmb{X}\), a step \(\Delta > \epsilon\) can easily distort the results, especially if the characteristic length of function variation is much smaller than \(\Delta\).
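
Both conclusions can be illustrated numerically. In the sketch below, the first input acts through a periodic, hence non-monotonic, term whose period is deliberately set equal to \(\Delta\), the worst case for the method. The test function, step, and sample sizes are contrived for this demonstration.

```python
import numpy as np

def g(x):
    # X_1 acts through a periodic, non-monotonic term; X_2 is linear.
    return np.sin(4.0 * np.pi * x[..., 0]) + x[..., 1]

rng = np.random.default_rng(seed=42)
r, delta = 1000, 0.5  # sin(4 * pi * x) has period 0.5, equal to the step

base = rng.uniform(0.0, 1.0 - delta, size=(r, 2))  # X_i + delta stays in [0, 1]

ee = np.empty((r, 2))
for i in range(2):
    shifted = base.copy()
    shifted[:, i] += delta
    ee[:, i] = (g(shifted) - g(base)) / delta

print(np.abs(ee).mean(axis=0))  # approx. [0, 1]: X_1 is falsely screened out

# EE-based estimate of the upper bound on S_i^T: every EE of X_1 is zero,
# so the estimated bound is zero as well, although X_1 drives a large
# share of Var(Y).
var_y = g(rng.uniform(size=(10**5, 2))).var()
print((ee**2).mean(axis=0) / (np.pi**2 * var_y))
```

Rerunning the sketch with a much smaller step, e.g. \(\Delta = 0.05\), recovers a large \(\mu_1^*\), which is the sense in which a too-large \(\Delta\), rather than the measure itself, drives the false selection.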