1 - 34 of 34 search results for `news corpora` |u:mi.eng.cam.ac.uk
  1. Fully-matching results

  2. Bin Jia, Khe Chai Sim et al: CU-HTK RT03 ...

    mi.eng.cam.ac.uk/research/projects/EARS/pubs/jia_rt03s.pdf
    23 Jun 2003: Language Model. • Sources of data (using LDC character-to-word segmentor) – Acoustic training data (modified Kneser-Ney) – News corpora: TDT[2,3,4], China Radio, People’s Daily, Xinhua (Good-. ... Acoustic 206.6 190.8 AcousticNews Corpora 199.6 179.8
  3. Experiments in Broadcast News Transcription

    mi.eng.cam.ac.uk/reports/full_html/woodland_icassp98.html/
    1 Mar 2000: Experiments in Broadcast News Transcription. P.C. Woodland, T. Hain, S.E. Johnson, T. ... FX. all other speech (e.g. spontaneous non-native). Table 1: Broadcast news focus conditions.
  4. 29 Apr 2024: [19] uFACT: Unfaithful alien-corpora training for semantically consistent data-to-text generation. ... We propose uFACT (Un-Faithful Alien Corpora Training), a training corpus construction method for data-to-text (d2t) generation models.
  5. 22 Nov 2006: The two acoustic training data sources, and each of the news corpora, were kept as distinct sources for language model (LM) generation. ... The total contribution from all the news corpora was about 0.12, with the majority from People’s Daily (0.09).
  6. 20 Feb 2018: in-domain utterance pairs, and up to 91.4% when adding the out-of-domain bilingual corpora detailed in Section 2.2. ... [11] J. Tiedemann, “News from OPUS - A collection of multi-lingual parallel corpora with tools and interfaces,” in Recent Advances
  7. 15 Jun 2018: Experiments were conducted on the Penn Tree Bank and BBC Multi-Genre Broadcast News (MGB) corpora, where the proposed approach significantly outperforms standard forms of recurrent models in perplexity. ... PTB consists mainly of text related to finance,
  8. IMPROVING BROADCAST NEWS TRANSCRIPTION BY LIGHTLY…

    mi.eng.cam.ac.uk/reports/svr-ftp/chan_icassp2004.pdf
    27 May 2004: The rest of the paper is organised as follows. In Section 2, we describe the English broadcast news corpora that used in this work. Then, our lightly supervised discriminative training approach is presented ... Rich Transcription Workshop, 2003. [4] D.
  9. 3 Nov 2023: [13] uFACT: Unfaithful alien-corpora training for semantically consistent data-to-text generation. ... We propose uFACT (Un-Faithful Alien Corpora Training), a training corpus construction method for data-to-text (d2t) generation models.
  10. 23 Dec 2004: The total contribution from all the news corpora was about 0.12, with the majority from People’s Daily (0.09). ... All experiments use the interpolated language model with the news corpora. Language Model System (S3) CER (%) dev04.
  11. The 1997 HTK Broadcast News Transcription System

    mi.eng.cam.ac.uk/reports/full_html/woodland_darpa98.html/
    1 Mar 2000: 41-48 (Lansdowne, VA, Feb. 1998). The 1997 HTK BROADCAST NEWS TRANSCRIPTION SYSTEM. ... using the broadcast news training texts, the acoustic training data and 1995 Marketplace transcriptions.
  12. Abstract for evermann_icassp00

    mi.eng.cam.ac.uk/reports/abstracts/evermann_icassp00.html
    27 Jul 2020: The effectiveness of these techniques is demonstrated on the broadcast news and the conversational telephone speech corpora where improvements both in terms of word error rate and normalised cross entropy were
  13. sig-004.dvi

    mi.eng.cam.ac.uk/~sjy/papers/gayo07.pdf
    20 Feb 2018: The review concludes with a case study of LVCSR for Broadcast News and Conversation transcription in order to illustrate the techniques described. ... The N-gram parameters are estimated by counting N-tuples in appropriate text corpora.
  14. 19 Jul 2006: Text data: used to train the ASR language model: – large news corpora available; – systems built on > 1 billion words of data.
  15. STRUCTURAL METADATA RESEARCH IN THE EARS PROGRAM Yang Liu1,5 ...

    mi.eng.cam.ac.uk/reports/svr-ftp/tomalin_icassp05.pdf
    12 May 2005: 2.3. MDE Corpora. Conversational telephone speech (CTS) and broadcast news (BN) are used for the structural event detection tasks in EARS. ... The MDE effort in the EARS program aims to explore these tasks more extensively, using different corpora and
  16. Bitext Alignment for Statistical Machine Translation Yonggang Deng A…

    mi.eng.cam.ac.uk/~wjb31/ppubs/YDengDissertationDec05.pdf
    16 Feb 2008: 72. 5.9 Percentage of Usable Arabic-English Bitext. English tokens for Arabic-English news and UN parallel corpora under different alignment procedures. ... in real data, for example, parallel corpora mined from web pages, automatic bitext.
  17. 9 Jul 2024: This paradigm has shown impressive results on standard summarization tasks such as news summarization [159, 340]. However, there is a challenge in applying a large foundation model to long-document summarization such as
  18. EXPERIMENTS IN BROADCAST NEWS TRANSCRIPTION P.C. Woodland, T. Hain,…

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_icassp98.pdf
    10 Apr 2000: EXPERIMENTS IN BROADCAST NEWS TRANSCRIPTION. P.C. Woodland, T. Hain, S.E. Johnson, T.R. ... Young S.J. (1997) The Development of the 1996 Broadcast News Transcription System.
  19. Bitext Alignment for Statistical Machine Translation

    mi.eng.cam.ac.uk/~wjb31/ppubs/YDengDefenseDec05.pdf
    16 Feb 2008: English Arabic-English. Used all parallel corpora available from LDC C-E: 200M En. ... words (news, all UN bitexts). Y. Deng (Johns Hopkins) Bitext Alignment for SMT 39 / 42.
  20. THE DEVELOPMENT OF THE 1996 HTK BROADCAST NEWS TRANSCRIPTION SYSTEM ...

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_darpa97.pdf
    8 Mar 2000: THE DEVELOPMENT OF THE 1996 HTK BROADCAST NEWS TRANSCRIPTION SYSTEM. P.C. Woodland, M.J.F. ... 5. CONCLUSION This paper has described our initial efforts to develop systems for broadcast news transcription.
  21. paper.dvi

    mi.eng.cam.ac.uk/~mjfg/liao_ICASSP07.pdf
    15 Aug 2007: Experiments are conducted on the Resource Management and Broadcast News corpora. Index Terms— Speech recognition, Robustness. ... 4. EXPERIMENTS. A simplified Broadcast News system based on the 2003 CU-HTK system [11] was evaluated.
  22. The 1997 HTK Broadcast News Transcription System P.C. Woodland, T. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/woodland_darpa98.pdf
    8 Mar 2000: The 1997 HTK Broadcast News Transcription System P.C. Woodland, T. Hain, S.E. ... news development test data and just 15.8% on the 1997 evaluation test set.
  23. tech.dvi

    mi.eng.cam.ac.uk/~mjfg/liao_tr552.pdf
    21 Sep 2007: trained on multistyle data sets such as broadcast news or conversational telephone speech. ... large vocabulary Broadcast News corpus of collected broadcast recordings. 1 Introduction.
  24. RECENT ADVANCES IN BROADCAST NEWS TRANSCRIPTION D.Y. Kim, G. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/kim_asru2003.pdf
    25 Sep 2003: 12, pp. 75-98. [5] D. Graff (2002). “An Overview of Broadcast News Corpora.” Speech Communication, Vol. 37, pp. ... Broadcast News Data. Acoustic training data. Development data. Text corpora. Acoustic model building.
  25. The Cambridge Multimedia Document Retrieval (MDR) Project : Summary…

    mi.eng.cam.ac.uk/reports/full_html/sparckjones_cltr517.html/
    10 Oct 2001: full audio material including non-news items. Figure 1: Details of the TREC data sets. ... 5. Details of the various corpora are given in the tables with the results.
  26. sig-004.dvi

    mi.eng.cam.ac.uk/~mjfg/mjfg_NOW.pdf
    19 Mar 2008: The review concludes with a case study of LVCSR for Broadcast News and Conversation transcription in order to illustrate the techniques described. ... The N-gram parameters are estimated by counting N-tuples in appropriate text corpora.
  27. thesis.dvi

    mi.eng.cam.ac.uk/reports/svr-ftp/nock_thesis.pdf
    14 Jun 2006: time of writing include the transcription of real radio and television news broadcasts (eg. ... [72] finds differences in part-of-speech distributions found in the conversational Switchboard [54], dictated Wall Street Journal [128] and the mixed speaking
  28. Article Submitted to Computer Speech and Language Automatic…

    mi.eng.cam.ac.uk/reports/svr-ftp/auto-pdf/kim_csl04.pdf
    9 Aug 2005: 3. Corpora and evaluation measures. Two different sets of data, the Broadcast News (BN) text corpus and the 100-hour Hub-4 BN data set, were available as training data for the experiments ... For example: News in “Lisa Stark, A. B. C. News, Washington”
  29. LARGE VOCABULARY DECODING AND CONFIDENCE ESTIMATION USING WORD…

    mi.eng.cam.ac.uk/reports/svr-ftp/evermann_icassp00.pdf
    5 May 2000: It is also interesting to note that the improvement is consistent over the various types of data found in broadcast news. ... The effectiveness of these techniques was demonstrated on the broadcast news and the conversational telephone speech corpora where
  30. 11 Jul 2017: Joint Training Methods for Tandem and Hybrid Speech Recognition Systems using Deep Neural Networks. Chao Zhang. Department of Engineering, University of Cambridge. This dissertation is submitted for the degree of Doctor of Philosophy. Peterhouse July
  31. PhD Thesis

    mi.eng.cam.ac.uk/~mjfg/thesis_kcs23.pdf
    16 Nov 2007: 8.1 Summary of various speech training corpora for CTS-E, BN-E and CTS-M 102. ... News (BN) English transcription tasks are used. 2. Hidden Markov Model Speech Recognition.
  32. /home/blue7/jjjb2/2009-03-02_ZH-EN/results/HMMcomp.f2e.ps

    mi.eng.cam.ac.uk/~wjb31/ppubs/jbrunningthesis.pdf
    20 Oct 2010: Figure 1.1: Graphical representation of noisy channel models and decoding. 1 http://news.xinhuanet.com/english/2007-08/31/content_6637522.htm. ... The amount of text in electronic parallel corpora that can be used for this purpose is rapidly increasing:
  33. DESIGN OF FAST LVCSR SYSTEMS G. Evermann & P.C. ...

    mi.eng.cam.ac.uk/reports/svr-ftp/evermann_asru2003.pdf
    23 Sep 2003: More details on the effectiveness of these techniques on Broadcast News can be found in [8]. ... In Proc. IEEE ASRU Workshop, 1997. [4] D. Graff. An Overview of Broadcast News Corpora. Speech Communication, 37:15–26, 2002.
  34. "Refinements in Hierarchical Phrase-Based Translation…

    mi.eng.cam.ac.uk/~wjb31/ppubs/jpino2015HieroRefinementsThesis.pdf
    6 Feb 2015: In order to address this concern and also in order to obtain more data, parallel data can also be extracted automatically from comparable corpora (Smith et al., 2013). ... However, the widespread availability of machine translation and the development of
  35. TOWARDS IMPROVED LANGUAGE MODEL EVALUATION MEASURES Philip Clarkson…

    mi.eng.cam.ac.uk/reports/svr-ftp/auto-pdf/clarkson_eurospeech99.pdf
    9 Aug 2005: Different quantities of the training corpora were used to train each language model, and various cutoffs were applied. ... Christie, and A. Robinson. The Transcription of Broadcast Television and Radio News: The 1996 Abbot System.
