Search Funnelback University
- Refined by:
- Date: 2018
21 - 24 of 24 search results for KA :ZA31 op, where 0 match all words and 24 match some words.
Results that match 2 of 3 words
- POLICY COMMITTEE FOR ADAPTATION IN MULTI-DOMAIN SPOKEN…
  mi.eng.cam.ac.uk/~sjy/papers/gmsv15.pdf
  20 Feb 2018: 5]. Here, we address the problem of decision-making. Moving from a limited domain dialogue system that operates on a relatively modest ontology to an open domain. ... k_A(a,a′) = δ_a(a′) (7) where δ_a(a′) = 1 iff a = a′, 0 otherwise.
- POMDP-based dialogue manager adaptation to extended domains M.…
  mi.eng.cam.ac.uk/~sjy/papers/gbhk13a.pdf
  20 Feb 2018: k_A(a,a′). For a sequence of belief state-action pairs B_t = [(b_0,a_0), ... For the action space kernel, the δ-kernel is used, defined by: k_A(a,a′) = δ_a(a′).
- On-line Active Reward Learning for Policy Optimisation in Spoken…
  mi.eng.cam.ac.uk/~sjy/papers/sgmb16.pdf
  20 Feb 2018: This Gaussian process operates on a continuous space dialogue representation generated in an unsupervised fashion using a recurrent neural network encoder-decoder. ... Zhang and Chaudhuri 2015] Chicheng Zhang and Kamalika Chaudhuri. 2015. Active learning
- Reward Estimation for Dialogue Policy Optimisation Pei-Hao Su, Milica …
  mi.eng.cam.ac.uk/~sjy/papers/sugy18.pdf
  20 Feb 2018: The summary action kernel is defined as: k_A(a,a′) = δ_a(a′) (3). ... Note that the reward model and the dialogue policy are being jointly optimised during the sequence of dialogues.