Search Funnelback University
Refined by: Date: 2018
Did you mean apc53?
11 - 20 of 128 search results for KA :PC53, where 0 match all words and 128 match some words.
Results that match 1 of 2 words
POLICY COMMITTEE FOR ADAPTATION IN MULTI-DOMAIN SPOKEN…
mi.eng.cam.ac.uk/~sjy/papers/gmsv15.pdf (20 Feb 2018): Q(b,a) ∼ GP(0, k((b,a),(b,a))) (2), where the kernel k(·,·) is factored into separate kernels over belief and action spaces, kB(b,b′)kA(a,a′). ... kA(a,a′) = δa(a′) (7), where δa(a′) = 1 iff a = a′, 0 otherwise.
POLICY OPTIMISATION OF POMDP-BASED DIALOGUE SYSTEMS WITHOUT…
mi.eng.cam.ac.uk/~sjy/papers/ghtt12.pdf (20 Feb 2018): k(b,a,b′,a′) = kB(b,b′)kA(a,a′) (3). In addition, if we assume additive noise in the Q-function, Q(b,a) ∼ N(0,σ²), the following ... kA(a,a′) = δa(a′) (14), where δa(a′) = 1 iff a = a′. For the full-space systems, the kernel function ...
is-05-hvs6_final
mi.eng.cam.ac.uk/~sjy/papers/seyo05.pdf (20 Feb 2018): [equation (13) garbled in snippet] Here nr specifies the number of events that occurred r times and a fixed discounting factor was used if they are zero.
Dialogue manager domain adaptation using Gaussian process…
mi.eng.cam.ac.uk/~sjy/papers/gmrs17.pdf (20 Feb 2018): kA(a,a′). For a training sequence of belief state-action pairs B = [(b0,a0), ...], ... kA(a,a′) = δa(a′) (6), where δa(a′) = 1 iff a = a′, 0 otherwise.
ON-LINE POLICY OPTIMISATION OF BAYESIAN SPOKEN DIALOGUE SYSTEMS…
mi.eng.cam.ac.uk/~sjy/papers/gbhk13.pdf (20 Feb 2018): is factored into separate kernels over the summary state and action spaces kC(c,c′)kA(a,a′).
Reward Estimation for Dialogue Policy Optimisation Pei-Hao Su, Milica …
mi.eng.cam.ac.uk/~sjy/papers/sugy18.pdf (20 Feb 2018): kA(a,a′). The policy is optimised using an algorithm called GP-SARSA [7, 45] in which the Q-function is updated by calculating the posterior given the collected belief-action pairs ... The summary action kernel is defined as: kA(a,a′) = δa(a′).
Online_ASRU11.dvi
mi.eng.cam.ac.uk/~sjy/papers/gjty11.pdf (20 Feb 2018): function, Q(b,a) ∼ GP(0, k((b,a),(b,a))), where the kernel k(·,·) is factored into separate kernels over the summary state and action spaces kB(b,b′)kA(a,a′).
hierParsing.dvi
mi.eng.cam.ac.uk/~sjy/papers/heyo03a.pdf (20 Feb 2018): [snippet unrecoverable]
On-line Active Reward Learning for Policy Optimisationin Spoken…
mi.eng.cam.ac.uk/~sjy/papers/sgmb16.pdf (20 Feb 2018): [Zhang and Chaudhuri 2015] Chicheng Zhang and Kamalika Chaudhuri. 2015. Active learning from weak and strong labelers.
Optimisation for POMDP-based Spoken Dialogue Systems M. Gašić, F.…
mi.eng.cam.ac.uk/~sjy/papers/gjty12.pdf (20 Feb 2018): Q(b,a) ∼ GP(0, k((b,a),(b,a))) (41). The kernel k(·,·) is often factored into separate kernels over the belief state and action spaces kB(b,
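The snippets above repeatedly describe the same GP-SARSA kernel factorisation, k((b,a),(b′,a′)) = kB(b,b′)·kA(a,a′), with the action kernel kA(a,a′) = δa(a′) equal to 1 iff a = a′. A minimal sketch of that factorisation (all names are hypothetical; the linear belief kernel here is only a placeholder, not the belief-space kernel used in the papers):

```python
import numpy as np

def belief_kernel(b, b_prime):
    # kB(b, b'): placeholder linear kernel over belief vectors
    # (the papers use various belief-space kernels; this is illustrative)
    return float(np.dot(b, b_prime))

def action_kernel(a, a_prime):
    # kA(a, a') = delta_a(a'): 1 iff a == a', 0 otherwise
    return 1.0 if a == a_prime else 0.0

def q_kernel(ba, ba_prime):
    # k((b,a), (b',a')) = kB(b,b') * kA(a,a')
    (b, a), (b_prime, a_prime) = ba, ba_prime
    return belief_kernel(b, b_prime) * action_kernel(a, a_prime)

b1 = np.array([0.7, 0.3])
b2 = np.array([0.6, 0.4])
print(q_kernel((b1, "confirm"), (b2, "confirm")))  # 0.54: actions match, kB applies
print(q_kernel((b1, "confirm"), (b2, "request")))  # 0.0: actions differ
```

The delta action kernel makes Q-value estimates share information across belief states only when the action is identical, which is why every snippet pairs it with a separate belief (or summary-state) kernel.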