## Thoughts on KWIK Learning the Stock Game

In my previous post, I wrote about using entropy to find the structure of the stock game.  In discussions afterward, it came up whether it was provable that the method described was KWIK (Knows What It Knows – Li,  Littman, Walsh – ICML 2008).

István pointed out that this might be difficult due to the log factor in the equation for entropy, but I think I figured out a way around it; the basic answer is to just raise the equation H(X) to 2^{H(X)}, which then removes the log factor.  I wrote a quick breakdown of the math steps, its available here: Removing the log from the equation for entropy [note – Binomial should read Bernoulli].

From there I was unsure what to do, as it didn’t seem there was really anything simple to do as a next step in that line of thinking.  Reading over the KWIK paper again though gave me exactly what I was looking for; estimating the probability of a Bernoulli distribution is KWIK learnable.

So the algorithm I used was just using the conditional Bernoulli distributions to compute the entropy.  The question is, does anything weird happen based on the calculations done with those estimated probabilities to remove the KWIK bound?  Basically, does an accurate estimation of the probability imply an accurate estimation of the entropy?  More specifically, can the product of a number of KWIK-learned values be not KWIK anymore for some reason?

Tagged , , , ,

## 2 thoughts on “Thoughts on KWIK Learning the Stock Game”

1. Michael Littman says:

Ok, so it seems like there are some helpful pieces here. I’d suggest that you take a step back and formalize the question. In mathematical notation, what are you trying to show? The log stuff ought to come in somewhere in the middle of the argument, but you should be clear on the problem statement and what you want to show. I suspect it seems tedious and redundant, but restating the question is really helpful in keeping the ideas clean and easy to understand.

2. hut3 says: