(Q-learning in Python)
Hints for implementing Q-learning
qvalues = {}

class QPlayer:
    def __init__(self, explorationRatio=0.1, discountFactor=0.99, learningRate=0.1):
        self.epsilon = explorationRatio  # exploration ratio: probability of taking a random action (0.1 = 1 chance in 10)
        self.gamma = discountFactor      # discount factor: weight given to future gains relative to the immediate reward
        self.alpha = learningRate        # learning rate: how quickly new experience overwrites older estimates
        self.qvalues = {}                # Q-value table, initially empty

Going further:
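As an illustration of how the three parameters above are typically used, here is a minimal sketch of epsilon-greedy action selection and the standard tabular Q-learning update. The class name `QPlayerSketch`, the methods `q`, `decide` and `learn`, and the nested-dictionary layout of `qvalues` are assumptions made for this example; they are not taken from the original player.

```python
import random


class QPlayerSketch:
    """Hypothetical sketch: method names and table layout are assumptions."""

    def __init__(self, explorationRatio=0.1, discountFactor=0.99, learningRate=0.1):
        self.epsilon = explorationRatio
        self.gamma = discountFactor
        self.alpha = learningRate
        self.qvalues = {}  # assumed layout: state -> {action: estimated value}

    def q(self, state, action):
        # Unseen (state, action) pairs default to 0.0.
        return self.qvalues.get(state, {}).get(action, 0.0)

    def decide(self, state, actions):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if random.random() < self.epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q(state, a))

    def learn(self, state, action, reward, nextState, nextActions):
        # Best value reachable from the next state (0.0 if it has no actions).
        bestNext = max((self.q(nextState, a) for a in nextActions), default=0.0)
        target = reward + self.gamma * bestNext
        old = self.q(state, action)
        # Blend the old estimate with the new target according to alpha.
        self.qvalues.setdefault(state, {})[action] = (1 - self.alpha) * old + self.alpha * target
```

With `explorationRatio=0.0` the player always exploits, which makes the behaviour deterministic and easy to check by hand; a larger `learningRate` makes each call to `learn` move the stored estimate further toward the new target.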