Dear All,
I've created a Java demo showing Reinforcement Learning by
value-gradients in action, available at
http://freespace.virgin.net/michael.fairbank/neuropilot/RLVGdemo/
It shows this new learning algorithm solving a problem very effectively,
and it aims to motivate the new theory and expose flaws in TD(lambda).
The demo is cross-referenced to the accompanying paper. I'd welcome any
comments.
I used this theory to create one of my early Neuropilot demos, available
at
http://freespace.virgin.net/michael.fairbank/neuropilot
Mike Fairbank.