Video and other learning materials
Tags: computer science
We address the problem of competing with any large set of $N$ policies in the non-stochastic bandit setting, where the learner must repeatedly select among $K$ actions but observes only the reward of the chosen action. We present a modification of the Exp4 algorithm of Auer et al. called Exp4.P, whi ...
Year: 2011 Source: videolectures.net