Comparing size and accuracy of decision trees with and without synchronic arcs

Right, well when predicting x’, the feature vector is (x, y, z, y’, z’), but not x’, so all the variables aside from the one we’re trying to predict with that decision tree are available as inputs.

oh, I see… so, that’s a bit stronger than needed, but if that
doesn’t produce much smaller trees, then I guess it’s harder than I
thought

In response to this, I reran the taxi/decision tree experiment with and without synchronic arcs, and compared the size of the decision trees and their accuracy in both cases.  In all cases, the accuracy of the decision trees with synchronic arcs was better (or at least as good) as the decision tree without.  As far as size goes, in some cases the trees with synchronic arcs are larger and in some cases they are not.

In order to avoid divide by zero problems, both classifiers are said to have at least one error; for some attributes the decision trees can perfectly predict the value with no error at all.

Comparing: @attribute taxi_x-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 1.0
(Num leaves w/o current / Num leaves w/ current): 1.0
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute taxi_y-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 1.0
(Num leaves w/o current / Num leaves w/ current): 1.0
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute taxi_delta_x-0 {-1,0,1}
(Tree size w/o current / Tree size w/ current): 2.625
(Num leaves w/o current / Num leaves w/ current): 3.3636363636363638
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute taxi_delta_y-0 {-1,0,1}
(Tree size w/o current / Tree size w/ current): 2.625
(Num leaves w/o current / Num leaves w/ current): 3.3636363636363638
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute passenger_x-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 0.46153846153846156
(Num leaves w/o current / Num leaves w/ current): 0.5
(Error rate w/o current / Error rate w/ current): 1.7999999999999998

Comparing: @attribute passenger_y-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 0.3
(Num leaves w/o current / Num leaves w/ current): 0.3333333333333333
(Error rate w/o current / Error rate w/ current): 19.0

Comparing: @attribute destination_x-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 1.0
(Num leaves w/o current / Num leaves w/ current): 1.0
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute destination_y-0 {0,1,2,3,4}
(Tree size w/o current / Tree size w/ current): 1.0
(Num leaves w/o current / Num leaves w/ current): 1.0
(Error rate w/o current / Error rate w/ current): 1.0

Comparing: @attribute action-0 {U,D,W,E,PickUp,DropOff,Refuel}
(Tree size w/o current / Tree size w/ current): 0.9667341934108113
(Num leaves w/o current / Num leaves w/ current): 0.9798368971938611
(Error rate w/o current / Error rate w/ current): 1.0632673515409672

Comparing: @attribute fuel-0 {-1,0,1,2,3,4,5,6,7,8,9,10,11,12,13}
(Tree size w/o current / Tree size w/ current): 0.995260663507109
(Num leaves w/o current / Num leaves w/ current): 0.8274111675126904
(Error rate w/o current / Error rate w/ current): 25.999999999999996

Comparing: @attribute fuelDelta-0 {-1,0,1,2,3,4,5,6,7,8,9,10,11,12,13}
(Tree size w/o current / Tree size w/ current): 3.3157894736842106
(Num leaves w/o current / Num leaves w/ current): 2.6
(Error rate w/o current / Error rate w/ current): 25.999999999999996

Comparing: @attribute reward-0 {20,-1,-10,-20}
(Tree size w/o current / Tree size w/ current): 2.022727272727273
(Num leaves w/o current / Num leaves w/ current): 1.605263157894737
(Error rate w/o current / Error rate w/ current): 21.999999999999996

Comparing: @attribute hasPassenger-0 {true,false}
(Tree size w/o current / Tree size w/ current): 0.9135802469135802
(Num leaves w/o current / Num leaves w/ current): 0.7971014492753623
(Error rate w/o current / Error rate w/ current): 20.0

Comparing: @attribute terminal-0 {true,false}
(Tree size w/o current / Tree size w/ current): 2.4
(Num leaves w/o current / Num leaves w/ current): 2.25
(Error rate w/o current / Error rate w/ current): 8.999999999999998

Comparing: @attribute wallN-0 {true,false}
(Tree size w/o current / Tree size w/ current): 1.2027027027027026
(Num leaves w/o current / Num leaves w/ current): 1.25
(Error rate w/o current / Error rate w/ current): 1.5913978494623655

Comparing: @attribute wallS-0 {true,false}
(Tree size w/o current / Tree size w/ current): 1.0704225352112675
(Num leaves w/o current / Num leaves w/ current): 1.117283950617284
(Error rate w/o current / Error rate w/ current): 2.5555555555555554

Comparing: @attribute wallW-0 {true,false}
(Tree size w/o current / Tree size w/ current): 2.15311004784689
(Num leaves w/o current / Num leaves w/ current): 2.33125
(Error rate w/o current / Error rate w/ current): 5.446808510638299

Comparing: @attribute wallE-0 {true,false}
(Tree size w/o current / Tree size w/ current): 3.0174418604651163
(Num leaves w/o current / Num leaves w/ current): 3.111940298507463
(Error rate w/o current / Error rate w/ current): 6.925925925925926

3 thoughts on “Comparing size and accuracy of decision trees with and without synchronic arcs”

1. Michael Littman says:

good idea to check this… what do you make of the results? why would adding more features make the trees bigger? Noise, you think?

2. hut3 says:

Thanks for pointing that out. In these experiments I (perhaps foolishly) used the same set of data for training and testing.

I figured that if the trees with synchronic links are overfitting the data, then the accuracy should show up as reduced accuracy when different sets of data are used for training and testing. As it turns out, the results are essentially the same in that case as well. I think this may be because taxi is a no-noise environment anyway, and overfitting is more of an issue when there is noise in the environment.

It is interesting that the trees with the synchronic links are sometimes bigger. It seems that in a perfect world, even if the trees without the synchronic links are able to perfectly reproduce the rules that create the state data at the current state (thereby “recreating” the synchronic links) those trees can only be bigger.

Here is a comparison of the predictive trees for the reward attribute without and with synchronic links. This is an example where the tree with synchronic links is bigger than the tree without.

Decision tree for @attribute reward-0 {20,-1,-10,-20}
——————
action-1 = U
| wallN-1 = true
| | wallS-1 = true: -20 (19.0)
| | wallS-1 = false: -1 (24.0)
| wallN-1 = false: -1 (169.0)
action-1 = D
| wallN-1 = true
| | wallS-1 = true: -20 (27.0)
| | wallS-1 = false: -1 (21.0)
| wallN-1 = false: -1 (150.0)
action-1 = W
| wallN-1 = true
| | wallS-1 = true: -20 (22.0)
| | wallS-1 = false: -1 (20.0)
| wallN-1 = false: -1 (186.0)
action-1 = E
| wallN-1 = true
| | wallS-1 = true: -20 (13.0)
| | wallS-1 = false: -1 (24.0)
| wallN-1 = false: -1 (181.0)
action-1 = PickUp: -10 (195.0/3.0)
action-1 = DropOff: -10 (220.0)
action-1 = Refuel: -1 (229.0)

Decision tree for @attribute reward-0 {20,-1,-10,-20}
——————
terminal-0 = true: -20 (81.0)
terminal-0 = false
| fuelDelta-0 = -1: -1 (775.0)
| fuelDelta-0 = 0
| | action-1 = U: -10 (0.0)
| | action-1 = D: -10 (0.0)
| | action-1 = W: -10 (0.0)
| | action-1 = E: -10 (0.0)
| | action-1 = PickUp
| | | hasPassenger-0 = true
| | | | hasPassenger-1 = true: -10 (2.0)
| | | | hasPassenger-1 = false: -1 (3.0)
| | | hasPassenger-0 = false: -10 (190.0)
| | action-1 = DropOff: -10 (220.0)
| | action-1 = Refuel: -1 (219.0)
| fuelDelta-0 = 1: -1 (0.0)
| fuelDelta-0 = 2: -1 (1.0)
| fuelDelta-0 = 3: -1 (0.0)
| fuelDelta-0 = 4: -1 (1.0)
| fuelDelta-0 = 5: -1 (1.0)
| fuelDelta-0 = 6: -1 (1.0)
| fuelDelta-0 = 7: -1 (4.0)
| fuelDelta-0 = 8: -1 (0.0)
| fuelDelta-0 = 9: -1 (0.0)
| fuelDelta-0 = 10: -1 (0.0)
| fuelDelta-0 = 11: -1 (0.0)
| fuelDelta-0 = 12: -1 (2.0)
| fuelDelta-0 = 13: -1 (0.0)

As you can see, the tree with synchronic links looks much closer to the true rules of taxi, and alot of the size of the tree is just due to the requirement of having a leaf node for each of the possible (discrete in this case) values of the fuelDelta attribute.

So it seems like the size issue may be due to a combination of more accurately modeling the actual dynamics of the environment (whereas the trees without synchronic links simply can’t achieve that, they are underfitting) as well as simply being an artifact of the way this particular implementation of decision trees works. For the next step I can try and play with the settings for the attributes to make more generalization happen (so that the numerical attributes are treated that way).

3. Michael Littman says:

well, without synchronic links, it should still be possible to capture the dynamics exactly. I guess that’s the fair comparison—given that both capture the dynamics exactly, which is smaller?