1 - 38 of 38 search results for `broadcast news data` |u:mi.eng.cam.ac.uk
  1. Fully-matching results

  2. The Cambridge University Multimedia Document Retrieval Demo System

    mi.eng.cam.ac.uk/reports/full_html/tuerk_riao00demo.html/
    14 Aug 2000: The system is trained on about 150 hours of acoustic training data and 260 million words of broadcast news and newspaper transcriptions. ... The system gives a word error rate of 15.9% on the 1998 Hub4 broadcast news evaluation data.
  3. The Cambridge University Multimedia Document Retrieval Demo System

    mi.eng.cam.ac.uk/reports/full_html/tuerk_sigir00demo.html/
    14 Aug 2000: This system gives a word error rate of 15.9% on the 1998 Hub4 broadcast news evaluation data. ... J. J. Odell, P. C. Woodland, and T. Hain. The CUHTK-Entropic 10xRT Broadcast News Transcription System.
  4. Speaker Clustering Using Direct Maximisation of the MLLR-Adapted…

    mi.eng.cam.ac.uk/reports/full_html/johnson_icslp98.html/
    8 Mar 2000: This paper presents two strategies for clustering broadcast news data segments (found by an automatic segmentation algorithm) for subsequent MLLR adaptation. ... Experiments on various sets of broadcast news data have been carried out to evaluate the
  5. The Cambridge University Spoken Document Retrieval System

    mi.eng.cam.ac.uk/reports/full_html/johnson_icassp99.html/
    8 Mar 2000: its performance using automatic transcriptions of about 50 hours of broadcast news data. ... 132 million words of broadcast news texts, the LDC-distributed 1995 newswire texts, and the transcriptions of the acoustic training data.
  6. Experiments in Broadcast News Transcription

    mi.eng.cam.ac.uk/reports/full_html/woodland_icassp98.html/
    1 Mar 2000: That system was constructed using HMMs trained on the Wall Street Journal (WSJ) corpus as a base and then adapted to individual data types of broadcast news data using supervised maximum ... Siegler M.A., Jain U., Raj B. & Stern R.M. (1997).
  7. Effects of Out of Vocabulary Words in Spoken Document Retrieval

    mi.eng.cam.ac.uk/reports/full_html/woodland_sigir00.html/
    14 Aug 2000: The TREC-8 audio contains 500 hours of US broadcast news data that was recorded between February and June 1998. ... The speech recogniser uses cross-word context-dependent hidden Markov models and a 4-gram language model trained on broadcast news data.
  8. The 1998 HTK Broadcast News Transcription System: Development and…

    mi.eng.cam.ac.uk/reports/full_html/woodland_darpa99.html/
    2 Mar 2000: Significant progress in the accurate transcription of broadcast news data has been made over the last few years so that we are now at a point where such systems can be ... The soft-clustering technique developed at JHU [9] had shown worthwhile reductions
  9. Segment Generation and Clustering in the HTK Broadcast News

    mi.eng.cam.ac.uk/reports/full_html/hain_darpa98.html/
    1 Mar 2000: The distribution of broadcast news data suitable for GMM training can be seen in Table 1. ... Siegler M.A., Jain U., Raj B. & Stern R.M. (1997). Automatic Segmentation, Classification and Clustering of Broadcast News Data [HTML].
  10. The 1997 HTK Broadcast News Transcription System

    mi.eng.cam.ac.uk/reports/full_html/woodland_darpa98.html/
    1 Mar 2000: We have found that using the full system with adaptation results in a 20-25% decrease in word error rate on broadcast news data. ... Siegler M.A., Jain U., Raj B. & Stern R.M. (1997). Automatic Segmentation, Classification and Clustering of Broadcast
  11. General Query Expansion Techniques for Spoken Document Retrieval

    mi.eng.cam.ac.uk/reports/full_html/jourlin_esca99.html/
    2 Mar 2000: The input data is presented to the system as complete episodes of broadcast news shows and these are first converted to a set of segments for further processing [9]. ... The HMMs used in TREC-7 were trained on 70 hours of acoustic data and the language
  12. The CUHTK-Entropic 10xRT Broadcast News Transcription System

    mi.eng.cam.ac.uk/reports/full_html/odell_darpa99.html/
    1 Mar 2000: The CUHTK-Entropic 10xRT Broadcast News Transcription System. J.J. Odell, P.C. ... Cepstral mean normalisation of each segment is applied. Two sets of cross word triphone context dependent HMMs were produced from the 1997 and 1998 Broadcast news
  13. Who Spoke When? - Automatic Segmentation and Clustering for…

    mi.eng.cam.ac.uk/reports/full_html/johnson_eurospeech99.html/
    2 Mar 2000: For the task of identifying potentially unknown anchor speakers within broadcast news shows, the frame classification error rate is very important. ... The 1996 Hub-4 Broadcast News Transcription development data was used for all the experiments reported
  14. Class-based language model adaptation using mixtures of word-class…

    mi.eng.cam.ac.uk/reports/full_html/moore_icslp00.html/
    2 Nov 2000: A mixture of broadcast news and newswire text was used as training data for the topic model, with 144 million words of Broadcast News text and 25 million words of Los ... Whittaker and S.J. Young, The 1997 HTK Broadcast News Transcription System''; DARPA
  15. EXPERIMENTS IN BROADCAST NEWS TRANSCRIPTION P.C. Woodland, T. Hain,…

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_icassp98.pdf
    10 Apr 2000: and then adapted to individual data types of broadcast news data using supervised maximum likelihood linear regression (MLLR) [4, 2]. ... 2. BROADCAST NEWS DATA. This section describes the various data sets that have been used in the experiments reported
  16. POSTERIOR PROBABILITY DECODING, CONFIDENCE ESTIMATION AND SYSTEM…

    mi.eng.cam.ac.uk/reports/full_html/evermann_stw00.html/
    5 Oct 2000: All the experiments reported are based on this system. The acoustic models used are triphone and quinphone HMMs trained on data from the Switchboard and CallHome corpora. ... A 4-gram language model was trained on the transcripts of the acoustic training
  17. THE CU-HTK MARCH 2000 HUB5E TRANSCRIPTION SYSTEM

    mi.eng.cam.ac.uk/reports/full_html/hain_stw00.html/
    12 Oct 2000: The system used N-gram word-level language models. These were constructed by training separate models for transcriptions of the Hub5 acoustic training data and for Broadcast News data and then ... Since it was not known if LDC-style or MSU-style training
  18. SPEAKER CLUSTERING USING DIRECT MAXIMISATION OF THE MLLR-ADAPTED…

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_icslp98.pdf
    10 Apr 2000: This paper presents two strategies for clustering broadcast news data segments (found by an automatic segmentation algorithm) for subsequent MLLR adaptation. ... 5. EXPERIMENTS. Experiments on various sets of broadcast news data have been carried out to
  19. Effects of Out of Vocabulary Words in Spoken Document ...

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_sigir00.pdf
    10 May 2000: We processed this data to automatically detect and remove commercials [2]. Transcription used a simplified version of the HTK broadcast news system which corresponds to the “first-pass” recognition system described in [2]. ... The speech recogniser uses
  20. The 1998 HTK Broadcast News Transcription System: Development and…

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_darpa99.pdf
    8 Mar 2000: The paper is arranged as follows. We first give details of the broadcast news data used in the experiments, then give an outline of the overall system used in the 1997 evaluation. The ... Our experiments with FD on broadcast news data show that overall we
  21. Spoken Document Retrieval for TREC-7 at Cambridge University

    mi.eng.cam.ac.uk/reports/full_html/johnson_trec7.html/
    30 Mar 2000: The input data is presented to our HTK transcription system as complete episodes of broadcast news shows and these are first converted to a set of segments for further processing. ... This was useful, but we went further and developed a new pair of
  22. THE CAMBRIDGE UNIVERSITY SPOKEN DOCUMENT RETRIEVAL SYSTEM S.E.…

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_icassp99.pdf
    8 Mar 2000: hours of broadcast news data. ... of broadcast news texts, the LDC-distributed 1995 newswire texts, and the transcriptions of the acoustic training data.
  23. THE DEVELOPMENT OF THE 1996 HTK BROADCAST NEWS TRANSCRIPTION SYSTEM ...

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_darpa97.pdf
    8 Mar 2000: television and radio broadcast news programmes recorded “off-air”. For the primary partitioned evaluation (PE), the data was pre-segmented into portions that were acoustically homogeneous: i.e. ... For the other portions of FX, a “global” model was
  24. Spoken Document Retrieval for TREC-8 at Cambridge University

    mi.eng.cam.ac.uk/reports/full_html/johnson_trec8.html/
    30 Mar 2000: Since a substantial portion of the data to be transcribed was known to be commercials and thus irrelevant to broadcast news queries, an automatic method of detecting and eliminating such commercials ... Three fixed backoff word-based language models were
  25. Audio Indexing and Retrieval of Complete Broadcast News Shows

    mi.eng.cam.ac.uk/reports/full_html/johnson_riao00.html/
    19 Apr 2000: This paper describes a system for retrieving relevant portions of complete broadcast news shows starting with only the audio data. ... Conclusions. This paper has described a system for retrieving relevant portions of complete broadcast news shows when
  26. WHO SPOKE WHEN? - AUTOMATIC SEGMENTATION AND CLUSTERING FOR…

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_eurospeech99.pdf
    10 Apr 2000: These methods were shown to be an important part of our overall recognition system for broadcast news. ... The 1996 Hub-4 Broadcast News Transcription development data was used for all the experiments reported in this paper.
  27. The 1997 HTK Broadcast News Transcription System P.C. Woodland, T. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_darpa98.pdf
    8 Mar 2000: The rest of the paper is arranged as follows. We first give details of the broadcast news data used in the experiments, and briefly describe our work on segment processing which splits the ... We have found that using the full system with adaptation results
  28. Segment Generation and Clustering in the HTK Broadcast News

    mi.eng.cam.ac.uk/reports/svr-ftp/hain_darpa98.pdf
    8 Mar 2000: 1. Introduction. The transcription of broadcast news requires techniques to deal with the large variety of data types present. ... The distribution of broadcast news data suitable for GMM training can be seen in Table 1.
  29. Class-based language model adaptation using mixtures of word-class…

    mi.eng.cam.ac.uk/reports/svr-ftp/moore_icslp00.pdf
    2 Nov 2000: 2 Finding the Topics. A mixture of broadcast news and newswire text was used as training data for the topic model, with 144 million words of Broadcast News text and 25 million words of ... 3 Building a Language Model. A word 4-gram language model was built
  30. GENERAL QUERY EXPANSION TECHNIQUES FOR SPOKEN DOCUMENT RETRIEVAL…

    mi.eng.cam.ac.uk/reports/svr-ftp/jourlin_esca99.pdf
    10 Apr 2000: The input data is presented to the system as complete episodes of broadcast news shows and these are ... The HMMs used in TREC-7 were trained on 70 hours of acoustic data and the language model was trained on manually transcribed broadcast news spanning
  31. LARGE VOCABULARY DECODING AND CONFIDENCE ESTIMATION USING WORD…

    mi.eng.cam.ac.uk/reports/svr-ftp/evermann_icassp00.pdf
    5 May 2000: It is also interesting to note that the improvement is consistent over the various types of data found in broadcast news. ... Johnson, T.R. Niesler, A. Tuerk, E.W.D. Whittaker, and S.J. Young. The 1997 HTK Broadcast News Transcription System.
  32. The CUHTK-Entropic 10xRT Broadcast News Transcription System J.J.…

    mi.eng.cam.ac.uk/reports/svr-ftp/odell_darpa99.pdf
    8 Mar 2000: The CUHTK-Entropic 10xRT Broadcast News Transcription System J.J. Odell, P.C. Woodland & ... ror rate for systems running in less than 10xRT in the 1998 DARPA broadcast news evaluation.
  33. POSTERIOR PROBABILITY DECODING, CONFIDENCE ESTIMATION AND SYSTEM…

    mi.eng.cam.ac.uk/reports/svr-ftp/evermann_stw00.pdf
    5 Oct 2000: All the experiments reported are based on this system. The acoustic models used are triphone and quinphone HMMs trained on data from the Switchboard and CallHome corpora. ... A 4-gram language model was trained on the transcripts of the acoustic training
  34. THE CU-HTK MARCH 2000 HUB5E TRANSCRIPTION SYSTEM T. Hain, P.C. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/hain_stw00.pdf
    5 Oct 2000: The system used N-gram word-level language models. These were constructed by training separate models for transcriptions of the Hub5 acoustic training data and for Broadcast News data and then ... not known if LDC-style or MSU-style training transcripts
  35. Audio Indexing and Retrieval of Complete Broadcast News Shows ...

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_riao00.pdf
    10 Apr 2000: Abstract. This paper describes a system for retrieving relevant portions of complete broadcast news shows starting with only the audio data. ... 7 Conclusions. This paper has described a system for retrieving relevant portions of complete broadcast news
  36. SPOKEN DOCUMENT RETRIEVAL FOR TREC-7 AT CAMBRIDGE UNIVERSITY S.E. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_trec7.pdf
    8 Mar 2000: TREC-7 1. 2. THE HTK BROADCAST NEWS TRANSCRIPTION SYSTEM. The input data is presented to our HTK transcription system as complete episodes of broadcast news shows and these are first ... The HMMs for TREC-7 used HMMs trained on 70 hours of acoustic data
  37. SPOKEN DOCUMENT RETRIEVAL FOR TREC-8 AT CAMBRIDGE UNIVERSITY S.E. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_trec8.pdf
    10 Apr 2000: Since a substantial portion of the data to be transcribed was known to be commercials and thus irrelevant to broadcast news queries, an automatic method of detecting and eliminating such commercials would potentially reduce ... Three fixed backoff
  38. Reducing Word Error Rates of Found Speech -XPERT Tool ...

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_tr330.pdf
    10 Apr 2000: Broadcast News transcription is also complicated by the presence of different audio conditions. ... The examples of automatic transcriptions analysed using xpert in this report are taken from the 1997 Hub4 Broadcast News evaluation [2] and 1997 TREC6
  39. Submitted in partial requirement for the MPhil in Computer ...

    mi.eng.cam.ac.uk/reports/svr-ftp/johnson_mthesis.pdf
    10 Apr 2000: Success in Identifying Speakers Using Covariance. Figure 2: Results of using distance metrics d1-d13 on broadcast news data. ... Success in Identifying Speakers Using Covariance. Figure 3: Results of using d14-d23 on broadcast
