BMA Performance


By reducing the dimensionality of the model parameter space, this strategy makes it possible to explore the space in more detail. An alternative strategy is to refine the ensemble by discarding models that use weak attributes. We expect such refinement to improve the BMA performance.

To test the assumption made in section 2 and to refine DT model ensembles obtained with BMA, we propose a new strategy that discards the DT models which use weak attributes. Under this strategy, the BMA technique described in section 2 is first used to collect DT models. Then the posterior probabilities of attribute use in the ensemble of DT models are estimated; these estimates give us posterior information on feature importance. Having obtained the range of these posterior probabilities, we define a threshold value and cut off the attributes whose probabilities fall below it; we call such attributes weak. At the next stage we find the DT models which use these weak attributes, and finally we discard these DT models from the ensemble.
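As a rough sketch of this refinement procedure (the representation of a DT model as a set of used attribute indices plus a posterior weight is an assumption for illustration; the paper does not prescribe one):

```python
def refine_ensemble(models, threshold):
    """Discard DT models that use any weak attribute.

    `models` is a list of (used_attributes, weight) pairs, where
    `used_attributes` is a set of attribute indices and `weight` is the
    model's posterior probability from BMA (weights sum to 1).
    """
    # Posterior probability that each attribute is used in the ensemble:
    # the total weight of the models that use it.
    attr_posterior = {}
    for used, w in models:
        for a in used:
            attr_posterior[a] = attr_posterior.get(a, 0.0) + w

    # Attributes whose posterior falls below the threshold are "weak".
    weak = {a for a, p in attr_posterior.items() if p < threshold}

    # Keep only models that use no weak attribute, then renormalise
    # the remaining posterior weights so they again sum to 1.
    kept = [(used, w) for used, w in models if not (used & weak)]
    total = sum(w for _, w in kept)
    kept = [(used, w / total) for used, w in kept]
    return kept, weak
```

For example, with three models weighted 0.6, 0.3, and 0.1, an attribute used only by the lightest model gets posterior 0.1 and is cut off by any threshold above that, removing that model from the ensemble.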

Obviously, the larger the threshold value, the more attributes are defined as weak, and therefore the larger the proportion of DT models that is discarded. The efficiency of this discarding technique is evaluated in terms of the accuracy of the refined DT ensemble on the test data, and the uncertainty in the ensemble outcomes is evaluated in terms of entropy. Given a set of threshold probability values obtained in a series of experiments, we can expect that there is an optimal threshold value at which the performance peaks, and likewise a threshold value at which the uncertainty is lowest. In the following section we test the proposed technique on the p...
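The entropy measure of ensemble uncertainty could be computed along these lines (a minimal sketch, assuming the ensemble's BMA-averaged class probabilities are available as an array and that entropy is totalled over the test examples, which would match the magnitudes reported later):

```python
import numpy as np

def ensemble_entropy(class_probs):
    """Total predictive entropy of ensemble outputs, in bits.

    `class_probs` is an (n_examples, n_classes) array of BMA-averaged
    class probabilities; entropy is summed over the test examples, so
    larger values mean more overall uncertainty in the decisions.
    """
    p = np.clip(class_probs, 1e-12, 1.0)  # avoid log(0)
    return float(-(p * np.log2(p)).sum())
```

A maximally uncertain two-class ensemble contributes one bit per test example, so four examples at probabilities (0.5, 0.5) give a total entropy of 4.0.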

... middle of paper ...

...hreshold is gradually increased from 0.0 to 0.005. At the same time, the uncertainty in decisions decreases from 478.4 to 469.0 in terms of the entropy E of the ensemble. For comparison, we applied a technique that discards the same weak attributes and then reran the BMA on the data reduced in dimensionality.

From Table 1 we can see that the BMA performance slightly increased from 27.4 to 29.0 when 23 weak attributes were discarded. Discarding 31 attributes resulted in a decrease in the ensemble entropy from 478.3 to 463.6. Overall, both techniques are shown to provide comparable performance and ensemble entropy. However, the attribute-discarding technique tends to show larger variation in performance; within this technique, for each threshold value the DT ensemble must be retrained on data of a new dimensionality.
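The experiment behind a table such as Table 1 amounts to a sweep over threshold values. A self-contained sketch, where `refine` and `evaluate` are hypothetical callables standing in for the refinement and test-set evaluation procedures described in the text:

```python
def sweep_thresholds(thresholds, refine, evaluate):
    """Evaluate the refined ensemble at each cut-off threshold.

    `refine(t)` is assumed to return the ensemble refined at threshold t,
    and `evaluate(ensemble)` to return an (accuracy, entropy) pair
    measured on the test data; both are placeholders.
    """
    rows = []
    for t in thresholds:
        acc, ent = evaluate(refine(t))
        rows.append((t, acc, ent))
    # Pick the threshold with the best test accuracy, breaking ties
    # in favour of lower ensemble entropy.
    best = max(rows, key=lambda r: (r[1], -r[2]))
    return rows, best
```

Note that for the model-discarding technique `refine` only filters an already-collected ensemble, whereas for the attribute-discarding technique it would have to rerun BMA on reduced data at every threshold, which is where the extra retraining cost and variation come from.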
