51 - 100 of 154 search results for `broadcast news data` |u:mi.eng.cam.ac.uk
Fully-matching results
Building Multiple Complementary Systems using Directed Decision Trees …
mi.eng.cam.ac.uk/~mjfg/breslin_INTER07.pdf 11 Jan 2008: end. Figure 4: Calculating Decision Tree Divergence. 5. Results. Experiments were performed on a Broadcast News Arabic task. Each system was trained using 101.8 hours of data and a PLP frontend. ... Results are given on three test sets: bnat05 (5.72 -
JOURNAL OF IEEE TRANS. ACOUST., SPEECH, SIGNAL PROCESSING, JULY ...
mi.eng.cam.ac.uk/research/projects/AGILE/publications/sim_SAP06.pdf 10 Oct 2007: Discriminative training of precision matrices was evaluated on an English conversational telephone speech (CTS) task, which consists of multi-speaker spontaneous telephone conversational speech, and an English broadcast news (BN) task, ... 1.1% and -
The Cambridge Multimedia Document Retrieval (MDR) Project : Summary…
mi.eng.cam.ac.uk/reports/full_html/sparckjones_cltr517.html/ 10 Oct 2001: Thus given a stream of broadcast news, passages relevant to the user's information need can be successfully (and preferentially) retrieved. ... The material is all broadcast news, from a variety of sources. -
UNSUPERVISED TRAINING FOR MANDARIN BROADCAST NEWS AND…
mi.eng.cam.ac.uk/~mjfg/wang_ICASSP07.pdf 22 Jun 2007: Experiments were carried out on a Mandarin transcriptions task. Two types of test data were considered, Broadcast News (BN) and Broadcast Conversations (BC). ... [8] J. Ma, S. Matsoukas, O. Kimball, and R. Schwartz, “Unsupervised training on large -
INDICATOR VARIABLE DEPENDENT OUTPUT PROBABILITY MODELLING…
mi.eng.cam.ac.uk/~sjy/papers/tuyo01.pdf 20 Feb 2018: After analysing a two dimensional reestimation example on artificial data, the proposed HMM is evaluated on the 1997 Broadcast News task with a particular focus on spontaneous speech. ... Different topologies of the model were evaluated on the 1997 Broadcast -
IEEE TRANS. ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007 ...
mi.eng.cam.ac.uk/~mjfg/gales_ASL.pdf 22 Nov 2006: II. SEGMENTATION AND CLUSTERING. For Broadcast News transcription, the first stage of processing is to partition the incoming audio data stream into homogeneous segments (the segmentation) and to group these segments into ... A. Training and Test Data Sets -
IMPROVING BROADCAST NEWS TRANSCRIPTION BY LIGHTLY…
mi.eng.cam.ac.uk/reports/svr-ftp/chan_icassp2004.pdf 27 May 2004: broadcast news data (TDT4 corpus) that were carefully chosen by closed-caption filtering [3] [7]. ... Finally, conclusions are given in Section 5. 2. ENGLISH BROADCAST NEWS DATA. -
THE CAMBRIDGE UNIVERSITY SPOKEN DOCUMENT RETRIEVAL SYSTEM S.E.…
mi.eng.cam.ac.uk/reports/svr-ftp/johnson_icassp99.pdf 8 Mar 2000: hours of broadcast news data. ... of broadcast news texts, the LDC-distributed 1995 newswire texts, and the transcriptions of the acoustic training data. -
SPEECH RECOGNITION SYSTEM COMBINATION FOR MACHINE TRANSLATION M.J.F.…
mi.eng.cam.ac.uk/~mjfg/gales_ICASSP07.pdf 22 Jun 2007: 3. STT POST PROCESSING. In processing data such as Broadcast News (BN) or Broadcast Conversations (BCs) for an STT system, the first stage is to segment the data into homogeneous blocks, ... The language models for each of the systems were trained on over -
CONFIDENCE ESTIMATION AND DELETION PREDICTION USING BIDIRECTIONAL…
mi.eng.cam.ac.uk/~mjfg/ALTA/publications/SLT2018_ragni.pdf 31 Aug 2019: [19] J. Ma, S. Matsoukas, O. Kimball, and R. Schwartz, “Unsupervised training on large amounts of broadcast news data,” in ICASSP, 2006. ... [28] J. Ma and S. Matsoukas, “Unsupervised training on a large amount of Arabic broadcast news data,” in -
Effects of Out of Vocabulary Words in Spoken Document ...
mi.eng.cam.ac.uk/reports/svr-ftp/woodland_sigir00.pdf 10 May 2000: We processed this data to automatically detect and remove commercials [2]. Transcription used a simplified version of the HTK broadcast news system which corresponds to the “first-pass” recognition system described in [2]. ... The speech recogniser uses -
Surprise Languages: Rapid-Response Cross-Language IR
mi.eng.cam.ac.uk/~ar527/oard_evia2019.pdf 11 Jun 2019: The audio in the analysis and the development testing (DEV) packs on which we evaluate initial systems consists largely of news and topical broadcasts, along with smaller amounts of CTS data that is similar ... The much higher sampling rates of 44.1 and 48 -
DEVELOPMENT OF A PHONETIC SYSTEM FOR LARGE VOCABULARY ARABIC SPEECH ...
mi.eng.cam.ac.uk/~mjfg/gales-asru07.pdf 11 Jan 2008: The performance and combination of phonetic and graphemic acoustic models are then compared on both Broadcast News (BN) and Broadcast Conversation (BC) data. ... Schwartz, “Unsupervised training on large amount of broadcast news data,” in Proc. -
Segment Generation and Clustering in the HTK Broadcast News…
mi.eng.cam.ac.uk/reports/svr-ftp/hain_darpa98.pdf 8 Mar 2000: 1. Introduction. The transcription of broadcast news requires techniques to deal with the large variety of data types present. ... The distribution of broadcast news data suitable for GMM training can be seen in Table 1. -
paper.dvi
mi.eng.cam.ac.uk/~mjfg/richter_EURO99.pdf 19 Nov 2010: on splicing 9 time frames of 24 dimensional Cepstra, including c0. A context dependent state–clustered allophone system was built on the broadcast news training data. ... Recent improvements to ibm’s speech recognition system for automatic -
IEEE TRANS. ON SAP, VOL. ?, NO. ??, ????? ...
mi.eng.cam.ac.uk/research/projects/AGILE/publications/kai_ASP07.pdf 10 Oct 2007: F. Gales. Abstract: Large vocabulary speech recognition systems are often built using found data, such as broadcast news. ... I. INTRODUCTION. ADAPTIVE training [1], [2] has become increasingly popular as greater use has been made of found data, such as -
tech.dvi
mi.eng.cam.ac.uk/~sjy/papers/bghk13.pdf 20 Feb 2018: likelihood. Other classifiers such as SVMs [13] and MLPs [14] have also been used. This classification approach has been used successfully in LVCSR tasks such as meetings [15, 14, 16] and broadcast news ... [17] J.L. Gauvain, L. Lamel, and G. Adda, -
THE CU-HTK MARCH 2000 HUB5E TRANSCRIPTION SYSTEM
mi.eng.cam.ac.uk/reports/full_html/hain_stw00.html/ 12 Oct 2000: The system used N-gram word-level language models. These were constructed by training separate models for transcriptions of the Hub5 acoustic training data and for Broadcast News data and then ... Since it was not known if LDC-style or MSU-style training -
RECENT ADVANCES IN BROADCAST NEWS TRANSCRIPTION D.Y. Kim, G. ...
mi.eng.cam.ac.uk/reports/svr-ftp/kim_asru2003.pdf 25 Sep 2003: 3. BROADCAST NEWS DATA. 3.1. Acoustic training data. For acoustic model training, the BN-E data released by the LDC in 1997 and 1998 was used. ... Broadcast News Data. Acoustic training data. Development data. Text corpora. Acoustic model building. -
BI-DIRECTIONAL LATTICE RECURRENT NEURAL NETWORKS FOR CONFIDENCE…
mi.eng.cam.ac.uk/~ar527/ragni_icassp2019.pdf 5 Feb 2019: The development data was used to assess the accuracy of confidence estimation approaches. ... [22] J. Ma and S. Matsoukas, “Unsupervised training on a large amount of Arabic broadcast news data,” in ICASSP, 2007. -
CAMBRIDGE UNIVERSITY ENGINEERING DEPARTMENT Automatic Transcription…
mi.eng.cam.ac.uk/~mjfg/hain_tr465.pdf 23 Dec 2004: 50000 most frequent words occurring in 204 million words (MW) of Broadcast News (BN) training data, yielding a vocabulary size of around 55000. ... Again modified Kneser-Ney discounting was used. The BNLM model was trained on 204MW of Broadcast News data -
Joint Uncertainty Decoding for Noise Robust Speech Recognition H. ...
mi.eng.cam.ac.uk/~mjfg/liao_INTER05.pdf 19 Dec 2006: The experiments presented in this paper were artificial in two ways: corrupted speech was simulated by adding noise to clean speech and the compensation parameters were estimated on stereo data. ... Future work will examine real found data, such as broadcast -
is2008.dvi
mi.eng.cam.ac.uk/~mjfg/raut_INTER08.pdf 2 Mar 2009: 1. Introduction. Speech recognition systems are increasingly being built with found data such as broadcast news and conversational telephone speech recordings. ... The speech data was parameterised using 12 PLP Cepstral coefficients plus the 0th order (C0) -
Explicitly Generating Complementary Systems for Large…
mi.eng.cam.ac.uk/~mjfg/breslin_INTER06.pdf 22 Nov 2006: This algorithm is described in detail in the next section, followed by preliminary results on a Broadcast News Mandarin system, before conclusions are drawn. ... 3. Experimental Results. Experiments were performed on a Broadcast News Mandarin task. The -
Combining I-vector Representation and Structured Neural Networks for…
mi.eng.cam.ac.uk/~mjfg/icassp16_wu.pdf 5 Apr 2016: English Broadcast News (BN) transcription task. Two distinct sets of test data are examined. ... The proposed approaches are evaluated on the utterance-level unsupervised adaptation of a large vocabulary continuous English broadcast news transcription task. -
NEW FEATURES IN THE CU-HTK SYSTEM FOR TRANSCRIPTION OF CONVERSATIONAL…
mi.eng.cam.ac.uk/reports/svr-ftp/auto-pdf/hain_icassp01.pdf 9 Aug 2005: N-gram word-level language models were constructed by training separate models on transcriptions of the Hub5 acoustic training data and on Broadcast News data and then merging the resultant language models to effectively ... In order to accommodate both -
THE CU-HTK MANDARIN BROADCAST NEWS TRANSCRIPTION SYSTEM R. Sinha, ...
mi.eng.cam.ac.uk/~mjfg/sinha_ICASSP06.pdf 22 Nov 2006: This data was split between about 34M words from broadcast sources (CCTV, NTDTV, VOA) and about 6M words from newspaper sources. ... Jin, M. Noamany, and T. Shultz, “The ISL RT-04 Mandarin Broadcast News evaluation system,” in Proc. -
GENERAL QUERY EXPANSION TECHNIQUES FOR SPOKEN DOCUMENT RETRIEVAL…
mi.eng.cam.ac.uk/reports/svr-ftp/jourlin_esca99.pdf 10 Apr 2000: The input data is presented to the system as complete episodes of broadcast news shows and these are. ... The HMMs used in TREC-7 were trained on 70 hours of acoustic data and the language model was trained on manually transcribed broadcast news spanning -
Cambridge STT Overview P.C. Woodland, H.Y. Chan, G. Evermann, ...
mi.eng.cam.ac.uk/research/projects/EARS/pubs/woodland_earsfeb04.pdf 23 Mar 2004: Woodland et al.: Cambridge STT Overview. Outline. • Broadcast News. • Lightly supervised discriminative training. • ... Lightly supervised discriminative training on TDT data. • Improve the English Broadcast News system by adding large amounts of -
Knill_IS14_1.dvi
mi.eng.cam.ac.uk/~ar527/knill_is2014a.pdf 10 Nov 2014: These approaches are evaluated on data distributed under the IARPA Babel program [6]. ... Broadcast News Transcription and Understanding Workshop, 1998, pp. 301–305. [11] T. -
Use of Graphemic Lexicons for Spoken Language Assessment K.M. ...
mi.eng.cam.ac.uk/~ar527/knill_is2017.pdf 15 Jun 2018: The DNN structure is 702x10005x6000. A Kneser-Ney trigram LM is trained on 186K words of BULATS test data and interpolated with a general English LM trained on a large broadcast news corpus, using ... 3.2. Grader. The GP grader [6] is trained on independent -
Impact of ASR Performance on Free Speaking Language Assessment ...
mi.eng.cam.ac.uk/~ar527/knill_is2018.pdf 15 Jun 2018: A Kneser-Ney trigram LM is trained on 186k words from the System 1 training data, and interpolated with a general LM trained on Broadcast News English [34], using the SRILM toolkit ... Fiscus, and W. M. Fisher, “Design and Preparation of the 1996 Hub-4 -
This is a placeholder. Final title will be filled later.
mi.eng.cam.ac.uk/~mjfg/yu-asru05.pdf 11 Jan 2008: 1. INTRODUCTION. Adaptive training [1, 2] has become popular as the use of found data, such as Broadcast News, has increased. ... A separate transform is used to represent each homogeneous block of data, e.g. -
Who Really Spoke When? Finding Speaker Turns and Identities in…
mi.eng.cam.ac.uk/reports/full_html/tranter_icassp06.html/ 9 Dec 2006: Who Really Spoke When? Finding Speaker Turns and Identities in Broadcast News Audio. ... The training data used for this task consisted of the Hub-4 1996/7 broadcast news training data. -
Class-based language model adaptation using mixtures of word-class…
mi.eng.cam.ac.uk/reports/svr-ftp/moore_icslp00.pdf 2 Nov 2000: 2 Finding the Topics. A mixture of broadcast news and newswire text was used as training data for the topic model, with 144 million words of Broadcast News text and 25 million words of ... 3 Building a Language Model. A word 4-gram language model was built -
JOURNAL OF IEEE TRANS. ACOUST., SPEECH, SIGNAL PROCESSING, JULY ...
mi.eng.cam.ac.uk/~mjfg/sim_SAP06.pdf 22 Nov 2006: Discriminative training of precision matrices was evaluated on an English conversational telephone speech (CTS) task, which consists of multi-speaker spontaneous telephone conversational speech, and an English broadcast news (BN) task, ... 1.1% and -
An Investigation into the Interactions between Speaker Diarisation…
mi.eng.cam.ac.uk/reports/svr-ftp/tranter_tr464.pdf 9 Oct 2003: 38. 7 Conclusions 42. 8 Acknowledgements 43. A Data 43. A.1 Broadcast News Data. ... The results on the RT-02 Broadcast News evaluation data are given in Table 5. -
SPOKEN DOCUMENT RETRIEVAL FOR TREC-7 AT CAMBRIDGE UNIVERSITY S.E. ...
mi.eng.cam.ac.uk/reports/svr-ftp/johnson_trec7.pdf 8 Mar 2000: TREC-7 1. 2. THE HTK BROADCAST NEWS TRANSCRIPTION SYSTEM. The input data is presented to our HTK transcription system as complete episodes of broadcast news shows and these are first ... The HMMs for TREC-7 used HMMs trained on 70 hours of acoustic data -
POSTERIOR PROBABILITY DECODING, CONFIDENCE ESTIMATION AND SYSTEM…
mi.eng.cam.ac.uk/reports/svr-ftp/evermann_stw00.pdf 5 Oct 2000: All the experiments reported are based on this system. The acoustic models used are triphone and quinphone HMMs trained on data from the Switchboard and CallHome corpora. ... A 4-gram language model was trained on the transcripts of the acoustic training -
LARGE VOCABULARY DECODING AND CONFIDENCE ESTIMATION USING WORD…
mi.eng.cam.ac.uk/reports/svr-ftp/evermann_icassp00.pdf 5 May 2000: It is also interesting to note that the improvement is consistent over the various types of data found in broadcast news. ... Johnson, T.R. Niesler, A. Tuerk, E.W.D. Whittaker, and S.J. Young. The 1997 HTK Broadcast News Transcription System. -
Recent Progress in Large Vocabulary Continuous Speech Recognition: An…
mi.eng.cam.ac.uk/~mjfg/icassp06_tutorial.pdf 22 Feb 2007: English Broadcast News System Description. • Segmentation and clustering: – LIMSI kindly supplied segmentation and clustering. • ... Mandarin Broadcast News System Description • Mandarin specific features (full description in [42] - see ICASSP poster -
IEEE Proof IEEE TRANSACTIONS ON AUDIO, SPEECH, AND ...
mi.eng.cam.ac.uk/~mjfg/kai_ASP06.pdf 4 Apr 2007: I. INTRODUCTION. ADAPTIVE training is a widely used technique to build speech recognition systems on nonhomogeneous data. This type of data occurs in many scenarios, for example, broadcast news and conversational telephone ... 18). where the summation in -
BI-DIRECTIONAL LATTICE RECURRENT NEURAL NETWORKS FOR CONFIDENCE…
mi.eng.cam.ac.uk/~mjfg/ALTA/publications/ICASSP2019_li.pdf 31 Aug 2019: The development data was used to assess the accuracy of confidence estimation approaches. ... [22] J. Ma and S. Matsoukas, “Unsupervised training on a large amount of Arabic broadcast news data,” in ICASSP, 2007. -
paper.2col.dvi
mi.eng.cam.ac.uk/~mjfg/kai_ASP09.pdf 2 Mar 2009: However, in many applications, such as broadcast news transcription or conversational telephone speech, there is no transcription available for the test data. ... Second, as the correct transcriptions are known for the training data, there are no hypothesis -
SPOKEN DOCUMENT RETRIEVAL FOR TREC-8 AT CAMBRIDGE UNIVERSITY S.E. ...
mi.eng.cam.ac.uk/reports/svr-ftp/johnson_trec8.pdf 10 Apr 2000: Since a substantial portion of the data to be transcribed was known to be commercials and thus irrelevant to broadcast news queries, an automatic method of detecting and eliminating such commercials would potentially reduce ... Three fixed backoff -
Audio Indexing and Retrieval of Complete Broadcast News Shows ...
mi.eng.cam.ac.uk/reports/svr-ftp/johnson_riao00.pdf 10 Apr 2000: Abstract. This paper describes a system for retrieving relevant portions of complete broadcast news shows starting with only the audio data. ... 7 Conclusions. This paper has described a system for retrieving relevant portions of complete broadcast news -
Speaker Diarisation for Broadcast News
mi.eng.cam.ac.uk/reports/full_html/tranter_odyssey04.html/ 14 Jun 2004: This paper describes systems developed at CUED and MIT-LL to perform automatic segmentation, clustering and labelling of speakers (and in some cases commercial breaks) in broadcast news data. ... Each data set consists of one 30 minute extract from 6 -
Article Submitted to Computer Speech and Language Automatic…
mi.eng.cam.ac.uk/reports/svr-ftp/auto-pdf/kim_csl04.pdf 9 Aug 2005: 3. Corpora and evaluation measures. Two different sets of data, the Broadcast News (BN) text corpus and the 100-hour Hub-4 BN data set, were available as training data for the experiments ... from the NIST 1998 Hub-4 broadcast news benchmark tests were used -
EARS STT Overview Phil Woodland February 4th 2004 Cambridge ...
mi.eng.cam.ac.uk/research/projects/EARS/pubs/woodland_board_earsfeb04.pdf 23 Mar 2004: STT Collaboration Examples. • Sharing & Preparing Data – Wordwave segmentations (BBN) – Common broadcast news development sets (LIMSI, BBN, CU, SRI) – Shared TDT transcriptions (all sites working on broadcasts) – Shared CTS transcriptions (CU) – -
tech.dvi
mi.eng.cam.ac.uk/~mjfg/liao_tr552.pdf 21 Sep 2007: trained on multistyle data sets such as broadcast news or conversational telephone speech. ... large vocabulary Broadcast News corpus of collected broadcast recordings. 1 Introduction.