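Preprocessed datasets that save time and cost
Preprocessed datasets are an effective service that can lower the cost of building AI and speed up your work. By minimizing manual work, they make it possible to test AI and validate project quality on a reasonable budget.
Appen offers more than 400 datasets covering all data types (image, speech, text, video, and more). Get started building AI quickly and easily with preprocessed datasets.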
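Speech recognition datasets
- 22,000 hours of speech recognition data in more than 64 languages
- Recorded with smartphone microphones, high-performance microphones, and other recording equipment
- Monologue, free speech, conversation, and a variety of other scenarios
- Quiet environments, offices, homes, vehicle interiors, and a variety of other recording environments
- Transcribed text provided with all datasets
- Pronunciation dictionaries (for some datasets only)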
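Text datasets
- Pronunciation dictionaries covering 98 languages with 5.23 million entries
- Part-of-speech dictionaries covering 22 languages with 3.26 million entries
- NER datasets with more than 1 million entries across 8 languages
- Lexical analyzers supporting 3 languages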
Image datasets
- 13,500 photos and 1,000 black-and-white mugshots
- 12,000 OCR images in languages including Korean, English, Thai, Hindi, Spanish, and Finnish
- A multi-label image database of 2,196 images
- 680 multi-pose, multi-lighting portrait photos
Video datasets
- 100 recordings of crying infants aged 0-3 (1 minute each)
- Videos with Korean, German, and Thai subtitles
Autopilot datasets
- Images of cars and license plates
- In-car ASR datasets in French, Dutch, Spanish, Italian, English, Russian, and other languages
Speech synthesis (TTS) datasets
- Multilingual datasets
- 400 professional voice actors speaking more than 20 different languages
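Dataset applications
Autonomous driving
Driver behavior recognition datasets: detect driver posture, dangerous behavior, and fatigue
Passenger safety monitoring datasets: identify children or pets left in the car and dangerous items
In-car speech datasets: build voice navigation and smart driving features
Vehicle exterior datasets: identify road lanes, obstacles, and parking spaces
Customer service
Natural language processing datasets: build AI-based conversational programs and support smart online customer service
TTS speech datasets: convert text files in real time into natural-sounding speech streams
Smart finance
Financial OCR datasets: automate text transcription and recognition to support contract review and OCR in finance and insurance
Smart home
Speech recognition datasets: support functional prompts and smart interaction for household appliances such as air conditioners
Obstacle image datasets: support object identification and obstacle traversal for robot vacuum cleaners
Smart devices
Face recognition and speech recognition datasets: support deployment of smart device applications
Smart security
Face recognition and dangerous-behavior tracking datasets: support building AI for smart security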
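Get started building AI quickly and easily with preprocessed datasets
Product catalog
Browse our extensive catalog of more than 400 audio, image, video, and text datasets in more than 80 languages. Preprocessed datasets let you work faster.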
Dataset Name | Product Type | Common Use Cases | Recording Device | Unit | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
135 | Text | ASR, TTS, Language Modelling | N/A | 12,000 words | Add | sqi_ALB_PHON | Appen Global | Pronunciation Dictionary | Albanian | Albania | N/A | N/A | N/A | N/A | 12,000 | N/A | text | Albanian (Albania) Pronunciation Dictionary | ||
136 | Text | ASR, TTS, Language Modelling | N/A | 45,000 words | Add | amh_ETH_PHON | Appen Global | Pronunciation Dictionary | Amharic | Ethiopia | N/A | N/A | N/A | N/A | 45,000 | N/A | text | Amharic (Ethiopia) Pronunciation Dictionary | ||
141 | Text | ASR, TTS, Language Modelling | N/A | 11,000 words | Add | ara_DZA_PHON | Appen Global | Pronunciation Dictionary | Arabic | Algeria | N/A | N/A | N/A | N/A | 11,000 | N/A | text | Arabic (Algeria) Pronunciation Dictionary | ||
20 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 29 hours | Add | EAR_ASR001 | Appen Global | Conversational Speech | Arabic | Algeria | Low background noise (home/office) | 496 | 2 | Available on request | 11,327 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For the majority of calls, both speakers (in-line/out-line) were collected and transcribed however, for a smaller number of calls, only one half of the conversation was collected and transcribed 8% landline, 92% mobile | Arabic (Eastern Algeria) conversational telephony | |
137 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | ara_EGY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Egypt | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Arabic (Egypt) Pronunciation Dictionary | ||
114 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 352 hours | Add | ARE_ASR001_CN | Appen China | Scripted Speech | Arabic | Egypt | Low background noise (home/office) | 627 | 1 | 128,908 | 207,576 | 16 | wav | Dataset contains audio with corresponding text prompts Text prompts are not vowelised | Arabic (Egypt) scripted smartphone | |
139 | Text | ASR, TTS, Language Modelling | N/A | 13,000 words | Add | ara_IRQ_POS | Appen Global | Part of Speech Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 13,000 | N/A | text | Arabic (Iraq) Part of Speech Dictionary | ||
138 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | ara_IRQ_PHON | Appen Global | Pronunciation Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Person names | Arabic (Iraq) Pronunciation Dictionary | |
140 | Text | ASR, TTS, Language Modelling | N/A | 48,000 words | Add | ara_LBY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Libya | N/A | N/A | N/A | N/A | 48,000 | N/A | text | Arabic (Libya) Pronunciation Dictionary | ||
65 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 12 hours | Add | MSA_ASR001 | Global Phone | Scripted Speech | Arabic | Tunisia | Low background noise (home/office) | 78 | 1 | 4,908 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Arabic (Modern Standard Arabic) scripted microphone | |
112 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 33 hours | Add | ARY_ASR001 | Appen Global | Conversational Speech | Arabic | Morocco | Low background noise | 180 | 2 | 80,544 | 23,836 | 8 | alaw | Each speaker participated in 1 to 4 conversations. Speakers are identified by a unique 4-digit speaker ID which is recorded in the demographic file Transcription is available in original script and fully reversible Romanised version with accompanying pronunciation lexicon English translation of product transcription is available (ARY_MT001, ARY_ASRMT001) | Arabic (Morocco) conversational telephony | |
113 | Text | MT, Chatbot , Conversational AI | N/A | 80,544 utterances | Add | ARY_MT001 | Appen Global | Conversational Translation | Arabic | Morocco | N/A | 180 | N/A | 80,430 | 23,844 | N/A | text | Corresponding audio, transcription, fully reversible romanised transcription and pronunciation lexicon data are available (ARY_ASR001, ARY_ASRMT001) | Arabic (Morocco) conversational telephony translation | |
143 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add | ara_MAR_PHON | Appen Global | Pronunciation Dictionary | Arabic | Morocco | N/A | N/A | N/A | N/A | 60,000 | N/A | text | Arabic (Morocco) Pronunciation Dictionary | ||
144 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | arb_N/A_PHON | Appen Global | Pronunciation Dictionary | Arabic | N/A | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Arabic (N/A) Pronunciation Dictionary | ||
115 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 322 hours | Add | ARS_ASR001_CN | Appen China | Scripted Speech | Arabic | Saudi Arabia | Low background noise (home/office) | 227 | 1 | 104,574 | 156,282 | 16 | wav | Dataset contains audio with corresponding text prompts Text prompts are not vowelised 300-1000 prompts per speaker covering general content including education, sports, entertainment, travel, culture and technology | Arabic (Saudi Arabia) scripted smartphone | |
146 | Text | ASR, TTS, Language Modelling | N/A | 17,000 words | Add | ara_SDN_PHON | Appen Global | Pronunciation Dictionary | Arabic | Sudan | N/A | N/A | N/A | N/A | 17,000 | N/A | text | Arabic (Sudan) Pronunciation Dictionary | ||
145 | Text | ASR, TTS, Language Modelling | N/A | 75,000 words | Add | ara_ARE_PHON | Appen Global | Pronunciation Dictionary | Arabic | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 75,000 | N/A | text | Arabic (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
120 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 170 hours | Add | ARU_ASR001_CN | Appen China | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise (home/office) | 133 | 1 | 42,352 | 85,775 | 16 | wav | Dataset contains audio with corresponding text prompts Text prompts are not vowelised | Arabic (United Arab Emirates (UAE)) scripted smartphone | |
70 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 48 hours | Add | OrienTel United Arab Emirates MCA (Modern Colloquial Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 880 | 1 | 43,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control | Arabic (United Arab Emirates (UAE)) scripted telephony | |
71 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 31 hours | Add | OrienTel United Arab Emirates MSA (Modern Standard Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 24,500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control | Arabic (United Arab Emirates (UAE)) scripted telephony | |
9 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 86 hours | Add | CGA_ASR001 | Appen Global | Scripted Speech | Arabic | United Arab Emirates (UAE) - Saudi Arabia | Low background noise (home/office) | 150 | 4 | 42,000 | 19,245 | 16 | raw PCM | Fully transcribed with acoustic event tagging derived from the SpeechDAT conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words All transcriptions fully vowelized 280 prompts per speaker including 30 Person names (first name and family name) from a set of 15, 10 single isolated digits 0-10, 8-digit sequences (randomly generated), 200 phonetically balanced sentences, 30 x 10-word phonetically balanced word strings | Arabic (United Arab Emirates (UAE)/ Saudi Arabia) scripted microphone | |
127 | Text | NER, Content Classification, Search Engines | N/A | 20,774 sentences | Add | ARB_NER001 | Appen Global | News NER | Standard Arabic | N/A | N/A | N/A | N/A | 20,774 | Available on request | N/A | text | Arabic NER news text | ||
147 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | asm_IND_PHON | Appen Global | Pronunciation Dictionary | Assamese | India | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Assamese (India) Pronunciation Dictionary | ||
121 | Audio | Baby Monitor, Security & Other Consumer Applications | Mobile phone | 3 hours | Add | CRY_ASR001 | Appen China | Human Sound | N/A | China | Low background noise (home/office) | 100 | 1 | N/A | N/A | 16 | wav | Crying sound of babies 0-3 years old, each lasting around 2 minutes. | Baby crying audio | |
4 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 31 hours | Add | BAH_ASR001 | Appen Global | Conversational Speech | Indonesian | Indonesia | Low background noise | 1,002 | 2 | 30,695 | 11,480 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For a large proportion of calls, only one half of the conversation was collected and transcribed 28% landline, 72% mobile | Bahasa Indonesia conversational telephony | |
150 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | eus_ESP_PHON | Appen Global | Pronunciation Dictionary | Basque | Spain | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Basque (Spain) Pronunciation Dictionary | ||
6 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 47 hours | Add | BEN_ASR001 | Appen Global | Conversational Speech | Bengali | Bangladesh | Mixed (in-car, roadside, home/office) | 1,000 | 2 | 108,923 | 17,922 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words | Bengali (Bangladesh) conversational telephony | |
151 | Text | ASR, TTS, Language Modelling | N/A | 29,000 words | Add | ben_IND_PHON | Appen Global | Pronunciation Dictionary | Bengali | India | N/A | N/A | N/A | N/A | 29,000 | N/A | text | Bengali (India) Pronunciation Dictionary | ||
7 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 38 hours | Add | BUL_ASR001 | Appen Global | Conversational Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 217 | 2 | 86,453 | 22,342 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 49% landline, 51% mobile Conversations cover a range of topics including: Holiday/Leisure, Movies/TV Shows and Work. | Bulgarian (Bulgaria) conversational telephony | |
152 | Text | ASR, TTS, Language Modelling | N/A | 55,000 words | Add | bul_BGR_PHON | Appen Global | Pronunciation Dictionary | Bulgarian | Bulgaria | N/A | N/A | N/A | N/A | 55,000 | N/A | text | Bulgarian (Bulgaria) Pronunciation Dictionary | ||
111 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 22 hours | Add | BUL_ASR002 | Global Phone | Scripted Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 77 | 1 | 8,674 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Bulgarian (Bulgaria) scripted microphone | |
268 | Image | Document Processing, Document Search | Camera, scan | 5,832 documents | Add | IMG_OCR_B2B | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-business documents containing printed text. 38% Premium Quality images including Purchase Order, Payment Advice or Remittance Advice, Order Confirmation and Delivery note; 64% Standard Quality images in various challenging conditions in a wider range of categories including Complaints or Return, Delivery advice, Delivery note, Dunning, Goods receipt, Invoice, Offer, Order confirmation, Pay slip, Payment Advice or Remittance Advice, Purchase Order, Receipt, and Supplier load | Business-to-business printed text document OCR | |
269 | Image | Document Processing, Document Search | Camera, scan | 22,626 documents | Add | IMG_OCR_B2C_Other | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-consumer and miscellaneous other category documents containing text: 37% invoices, 42% receipts, 1% documents with tables, 2% handwritten forms and documents, 2% menus, 11% product labels, 2% posters, 3% street signs. 6 Languages collected in 23+ locales: 11% Arabic, 43% English, 4% French, 4% German, 24% Spanish, 14% Russian | Business-to-consumer/other text document OCR | |
155 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | yue_HKG_POS | Appen Global | Part of Speech Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Traditional | Cantonese (China) Part of Speech Dictionary | |
153 | Text | ASR, TTS, Language Modelling | N/A | 37,000 words | Add | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 37,000 | N/A | text | Simplified | Cantonese (China) Pronunciation Dictionary | |
154 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Traditional | Cantonese (China) Pronunciation Dictionary | |
156 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | cat_ESP_PHON | Appen Global | Pronunciation Dictionary | Catalan | Spain | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Catalan (Spain) Pronunciation Dictionary | ||
157 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add | ceb_PHL_PHON | Appen Global | Pronunciation Dictionary | Cebuano | Philippines | N/A | N/A | N/A | N/A | 20,000 | N/A | text | Cebuano (Philippines) Pronunciation Dictionary | ||
265 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 200 hours | Add | FOREIGNER_ASR001_CN | Appen China | Scripted Speech | Mandarin Chinese | China | Low background noise | 309 | 1 | 16 | wav | This database contains 200 hours of foreigners speaking Chinese from the following countries: Argentina, Egypt, Australia, Russia, the Philippines, Kazakhstan, Korea, Kyrgyzstan, Canada, Kuala Lumpur, Kenya, Laos, Malaysia, Mauritius, the United States, Mongolia, South Africa, Japan, Tajikistan, Thailand, Turkey, Hong Kong, Singapore, India, Indonesia, Vietnam There is no data from South Korea, Brazil, or data recorded by minors. Each session lasts about an hour; sentence duration ranges between 3-10 seconds The content is in the form of an individual reading while being recorded on a mobile phone in a home/office environment. Sensitive data and personal information has been scrubbed. | Chinese (multinational foreigner) scripted smartphone | |||
10 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 39 hours | Add | CRO_ASR001 | Appen Global | Conversational Speech | Croatian | Croatia | Low background noise (home/office) | 200 | 2 | Available on request | 23,919 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 53% landline, 47% mobile Conversations cover a range of topics including: News & Current Affairs, Health and Sport. | Croatian (Croatia) conversational telephony | |
158 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add | hrv_HRV_PHON | Appen Global | Pronunciation Dictionary | Croatian | Croatia | N/A | N/A | N/A | N/A | 20,000 | N/A | text | Croatian (Croatia) Pronunciation Dictionary | ||
11 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 11 hours | Add | CRO_ASR002 | Global Phone | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 94 | 1 | 4,499 | 23,929 | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Croatian (Croatia) scripted microphone | |
116 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 263 hours | Add | CRO_ASR003_CN | Appen China | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 243 | 1 | 73,467 | 136,140 | 16 | wav | Dataset contains audio with corresponding text prompts | Croatian (Croatia) scripted smartphone | |
159 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add | ces_CZE_PHON | Appen Global | Pronunciation Dictionary | Czech | Czech Republic | N/A | N/A | N/A | N/A | 50,000 | N/A | text | Czech (Czech Republic) Pronunciation Dictionary | ||
12 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 31 hours | Add | CZE_ASR001 | Global Phone | Scripted Speech | Czech | Czech Republic | Low background noise (home/office) | 102 | 1 | 12,425 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Czech (Czech Republic) scripted microphone | |
13 | Audio | ASR, Virtual Assistant | Landline only | 93 hours | Add | Czech SpeechDat(E) Dataset | Nuance | Scripted Speech | Czech | Czech Republic | Low background noise | 1,000 | 1 | 52,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich words and sentences | Czech (Czech Republic) scripted telephony | |
161 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | dan_DNK_POS | Appen Global | Part of Speech Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Danish (Denmark) Part of Speech Dictionary | ||
160 | Text | ASR, TTS, Language Modelling | N/A | 107,000 words | Add | dan_DNK_PHON | Appen Global | Pronunciation Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 107,000 | N/A | text | Danish (Denmark) Pronunciation Dictionary | ||
90 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add | Speecon Danish | Nuance | Scripted Speech | Danish | Denmark | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | Danish (Denmark) scripted microphone | |
15 | Audio | ASR, Automatic Captioning, Keyword Spotting | Microphone | 51 hours | Add | DAR_BRC001 | Appen Global | Broadcast Speech | Dari | Afghanistan | Low background noise (studio) | N/A | 1 | Available on request | Available on request | N/A | wav | Dataset is fully transcribed and timestamped Pronunciation lexicon not currently available but can be developed upon request Dataset is largely speech only and does not include music or advertisements Data types include: talk shows, interviews, news broadcasts (excluding news reading by anchors) | Dari (Afghanistan) broadcast | |
14 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 40 hours | Add | DAR_ASR001 | Appen Global | Conversational Speech | Dari | Afghanistan | Low background noise | 500 | 2 | Available on request | 11,168 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Dataset is largely speech only and does not include music or advertisements 13% landline, 87% mobile | Dari (Afghanistan) conversational telephony | |
162 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | prs_AFG_PHON | Appen Global | Pronunciation Dictionary | Dari | Afghanistan | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Dari (Afghanistan) Pronunciation Dictionary | ||
163 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add | luo_KEN_PHON | Appen Global | Pronunciation Dictionary | Dholuo | Kenya | N/A | N/A | N/A | N/A | 20,000 | N/A | text | Dholuo (Kenya) Pronunciation Dictionary | ||
258 | Audio | ASR, Conversational AI, Speech Analytics | Recording pen/microphone | 84.6 hours | Add | DONGBEI_ASR001_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 268 | 1 | 16 | wav | Audio only; transcription not included Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information has been scrubbed. | Dongbei dialect (China) Conversational Speech | |||
259 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 75.2 hours | Add | DONGBEI_ASR002_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 185 | 1 | 8 | wav | Audio only; transcription not included Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information has been scrubbed. | Dongbei dialect (China) Conversational Speech | |||
91 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 47 hours | Add | Speecon Dutch from Belgium | Nuance | Scripted Speech | Dutch | Belgium | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | Dutch (Belgium) scripted microphone | |
33 | Audio | ASR, Virtual Assistant | Microphone | 80 hours | Add | Flemish SpeechDat(II) FDB-1000 (FIXED1FL) | Nuance | Scripted Speech | Dutch | Belgium | Low background noise | 1,000 | 1 | 52,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control | Dutch (Belgium) scripted telephony | |
19 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone and mobile phone | 27 hours | Add | Dutch and Flemish SpeechDat-Car | Nuance | Scripted Speech | Dutch | Netherlands - Belgium | Mixed (in-car) | 302 | 5 | 15,100 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report. 125 prompts per adult speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences and prompts for spontaneous speech | Dutch (Netherlands & Belgium) scripted in-car | |
66 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 36 hours | Add | NLD_ASR001 | Appen Global | Conversational Speech | Dutch | Netherlands | Low background noise | 200 | 2 | Available on request | 14,964 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 51% landline, 49% mobile Conversations cover a range of topics including: Holiday/Leisure, Work and Sport. | Dutch (Netherlands) conversational telephony | |
164 | Text | ASR, TTS, Language Modelling | N/A | 45,000 words | Add | nld_NLD_PHON | Appen Global | Pronunciation Dictionary | Dutch | Netherlands | N/A | N/A | N/A | N/A | 45,000 | N/A | text | Dutch (Netherlands) Pronunciation Dictionary | ||
92 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 68 hours | Add | Speecon Dutch from the Netherlands | Nuance | Scripted Speech | Dutch | Netherlands | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | Dutch (Netherlands) scripted microphone | |
122 | Image | Facial Recognition | Camera | 13,500 images | Add | IMG_FACE_KEN_CN | Appen China | Human Face | N/A | Kenya | Mixed background and lighting conditions | 99 | N/A | N/A | N/A | N/A | jpg | Images of 100 participants containing all combinations of 9 different lighting conditions, 2 different distances between the participant's face and the smartphone, and 7 different camera angles. A random 32 images per person include occlusions such as sunglasses, masks, wigs or hats. A random 36 shots include different facial expressions including stare, open mouth, pout mouth, smile and frown. Lighting conditions: indoor normal light, outdoor normal light, indoor backlight, outdoor backlight, indoor ordinary dark light, full black screen fill light, point light source (white light, street light), neon light (monochromatic red, green and blue, multi-color mixed light), side glare. Distances: 30cm and 50cm. Camera angles: front, left 45°, right 45°, left 15°, right 15°, top 30°, bottom 30° | East African facial images | |
21 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 28 hours | Add | ENA_ASR001 | Appen Global | Conversational Speech | English | Egypt | Low background noise | 250 | 2 | Available on request | 5,619 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins | English (Arabic - Levant/Egypt) conversational telephony | |
166 | Text | ASR, TTS, Language Modelling | N/A | 157,000 words | Add | eng_AUS_PHON | Appen Global | Pronunciation Dictionary | English | Australia | N/A | N/A | N/A | N/A | 157,000 | N/A | text | English (Australia) Pronunciation Dictionary | ||
2 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 92 hours | Add | AUS_ASR001 | Appen Global | Scripted Speech | English | Australia | Low background noise (home/office) | 500 | 1 | 82,500 | 35,137 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 162 prompts (read speech) per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items (from a set of 215), phonetically rich sentences and words | English (Australia) scripted telephony | |
3 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 118 hours | Add | AUS_ASR002 | Appen Global | Scripted Speech | English | Australia | Mixed | 1,000 | 1 | 75,000 | 18,952 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 75 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words The prompts are a mixture of 'read' and 'elicited' items where 5 prompts per script are 'spontaneous free speech' | English (Australia) scripted telephony | |
168 | Text | ASR, TTS, Language Modelling | N/A | 3,000 words | Add | eng_CAN_POS | Appen Global | Part of Speech Dictionary | English | Canada | N/A | N/A | N/A | N/A | 3,000 | N/A | text | English (Canada) Part of Speech Dictionary | ||
167 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add | eng_CAN_PHON | Appen Global | Pronunciation Dictionary | English | Canada | N/A | N/A | N/A | N/A | 50,000 | N/A | text | English (Canada) Pronunciation Dictionary | ||
22 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 144 hours | Add | ENC_ASR001 | Appen Global | Scripted Speech | English | Canada | Mixed | 1,000 | 1 | 99,000 | 12,483 | 8 | alaw or wav | Fully transcribed to SALA II/SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 99 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words | English (Canada) scripted telephony | |
170 | Text | ASR, TTS, Language Modelling | N/A | 18,000 words | Add | eng_HKG_PHON | Appen Global | Pronunciation Dictionary | English | Hong Kong | N/A | N/A | N/A | N/A | 18,000 | N/A | text | English (Hong Kong) Pronunciation Dictionary | ||
271 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 143 hours | Add | ENI_ASR003 | Appen Global | Conversational Speech | English | India | Mixed (home, car, public place, outdoor) | 272 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request | English (India) conversational smartphone | |
25 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 67 hours | Add | ENI_ASR002 | Appen Global | Conversational Speech | English | India | Low background noise | 540 | 2 | 77,565 | 11,646 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 271 telephony conversations are recorded for this project | English (India) conversational telephony | |
172 | Text | ASR, TTS, Language Modelling | N/A | 13,000 words | Add | eng_IND_POS | Appen Global | Part of Speech Dictionary | English | India | N/A | N/A | N/A | N/A | 13,000 | N/A | text | English (India) Part of Speech Dictionary | ||
171 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add | eng_IND_PHON | Appen Global | Pronunciation Dictionary | English | India | N/A | N/A | N/A | N/A | 60,000 | N/A | text | English (India) Pronunciation Dictionary | ||
24 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 217 hours | Add | ENI_ASR001 | Appen Global | Scripted Speech | English | India | Mixed | 2,358 | 1 | 115,541 | 9,190 | 8 | alaw or wav | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 49 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words | English (India) scripted telephony | |
173 | Text | ASR, TTS, Language Modelling | N/A | 12,000 words | Add | eng_IRL_PHON | Appen Global | Pronunciation Dictionary | English | Ireland | N/A | N/A | N/A | N/A | 12,000 | N/A | text | English (Ireland) Pronunciation Dictionary | ||
174 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add | eng_NZL_PHON | Appen Global | Pronunciation Dictionary | English | NZ | N/A | N/A | N/A | N/A | 50,000 | N/A | text | English (NZ) Pronunciation Dictionary | ||
23 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 53 hours | Add | ENF_ASR001 | Appen Global | Conversational Speech | English | Philippines | Low background noise | 450 | 2 | 41,602 | 7,272 | 8 | alaw or wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins | English (Philippines) conversational telephony | |
169 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add | eng_PHL_PHON | Appen Global | Pronunciation Dictionary | English | Philippines | N/A | N/A | N/A | N/A | 5,000 | N/A | text | English (Philippines) Pronunciation Dictionary | ||
165 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add | eng_ARE_PHON | Appen Global | Pronunciation Dictionary | English | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 5,000 | N/A | text | English (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
67 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 33 hours | Add | OrienTel English as spoken in the United Arab Emirates | Nuance | Scripted Speech | English | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 25,500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 51 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control | English (United Arab Emirates (UAE)) scripted telephony | |
104 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 150 hours | Add | UKE_ASR001 | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1,175 | 2 | 298,562 | 24,193 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words | English (United Kingdom) conversational telephony | |
255 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 50 hours | Add | UKE_ASR001B | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1,150 | 2 | Available on request | 13,192 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words | English (United Kingdom) conversational telephony | |
176 | Text | ASR, TTS, Language Modelling | N/A | 155,000 words | Add | eng_GBR_POS | Appen Global | Part of Speech Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 155,000 | N/A | text | English (United Kingdom) Part of Speech Dictionary | ||
175 | Text | ASR, TTS, Language Modelling | N/A | 195,000 words | Add | eng_GBR_PHON | Appen Global | Pronunciation Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 195,000 | N/A | text | English (United Kingdom) Pronunciation Dictionary | ||
99 | Audio | TTS | Headset microphone | 11 hours | Add | TC-STAR female baseline voice Laura | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription | English (United Kingdom) scripted microphone - single female | |
100 | Audio | TTS | Headset microphone | 7 hours | Add | TC-STAR male baseline voice Ian | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription | English (United Kingdom) scripted microphone - single male | |
272 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 50 hours | Add | USE_ASR004 | Appen Global | Conversational Speech | English | United States | Mixed (home, car, public place, outdoor) | 94 | 1 | Available on request | Available on request | 48 | wav | Two person conversations recorded on a smartphone covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request | English (United States - African American) conversational smartphone | |
266 | Text | Virtual Assistant, Chatbot | N/A | 952,677 messages | Add | ENG_SMS001 | Appen Global | SMS text messages | English | United States | N/A | Available on request | N/A | 952,677 | Available on request | N/A | text | This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
267 | Text | Virtual Assistant, Chatbot | N/A | 106,649 messages | Add | ENG_SMS001A | Appen Global | SMS text messages | English | United States | N/A | 390 | N/A | 106,649 | Available on request | N/A | text | This is a subset of ENG_SMS001. This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
270 | Text | Virtual Assistant, Chatbot | N/A | 351,826 messages | Add | ENG_SMS002 | Appen Global | WhatsApp text messages | English | United States | N/A | Available on request | N/A | 351,826 | Available on request | N/A | text | This dataset contains threaded text message conversations between 2 participants, using WhatsApp. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation WhatsApp - Threaded | |
107 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 1000 hours | Add | USE_ASR003 | Appen Global | Conversational Speech | English | United States | Low background noise | 2,000 | 1 | 500,000 | 52,586 | 16 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Conversations cover a wide variety of topics including: study/major/work, hometown, living arrangements, weather and seasons, punctuality, TV programs/film | English (United States) conversational smartphone | |
178 | Text | ASR, TTS, Language Modelling | N/A | 263,000 words | Add | eng_USA_POS | Appen Global | Part of Speech Dictionary | English | United States | N/A | N/A | N/A | N/A | 263,000 | N/A | text | English (United States) Part of Speech Dictionary | ||
177 | Text | ASR, TTS, Language Modelling | N/A | 330,000 words | Add | eng_USA_PHON | Appen Global | Pronunciation Dictionary | English | United States | N/A | N/A | N/A | N/A | 330,000 | N/A | text | English (United States) Pronunciation Dictionary | ||
93 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add | Speecon English (USA) database | Nuance | Scripted Speech | English | United States | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | English (United States) scripted microphone | |
106 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 62 hours | Add | USE_ASR001 | Appen Global | Scripted Speech | English | United States | Low background noise (studio) | 200 | 2 | 80,000 | 18,318 | 48 | raw PCM or wav PCM | Dataset is fully transcribed and timestamped Dataset is formatted according to SALA II/SpeechDAT style conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words Each speaker read 400 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words | English (United States) scripted microphone | |
128 | Text | NER, Content Classification, Search Engines | N/A | 22,768 sentences | Add | ENG_NER001 | Appen Global | News NER | English | N/A | N/A | N/A | N/A | 22,768 | Available on request | N/A | text | English NER news text | ||
132 | Text | NER, Content Classification, Search Engines | N/A | 19,584 sentences | Add | FAR_NER001 | Appen Global | News NER | Iranian Persian | Iran | N/A | N/A | N/A | 19,584 | Available on request | N/A | text | Farsi/Persian NER news text | ||
182 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | fin_FIN_POS | Appen Global | Part of Speech Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Finnish (Finland) Part of Speech Dictionary | ||
125 | Image | Document Processing, Document Search | Camera | 7293 images | Add | IMG_OCR_FIN_CN | Appen China | Document OCR | Finnish | Finland | Mixed lighting conditions | 4 | N/A | N/A | N/A | N/A | jpg | Images containing text, such as billboards / outer packaging / signage / magazines / menus, etc. | Finnish (Finland) printed text OCR | |
181 | Text | ASR, TTS, Language Modelling | N/A | 85,000 words | Add | fin_FIN_PHON | Appen Global | Pronunciation Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 85,000 | N/A | text | Finnish (Finland) Pronunciation Dictionary | ||
142 | Text | ASR, TTS, Language Modelling | N/A | 4,000 words | Add | fra_DZA_PHON | Appen Global | Pronunciation Dictionary | French | Algeria | N/A | N/A | N/A | N/A | 4,000 | N/A | text | Arabic script | French (Algeria) Pronunciation Dictionary | |
5 | Audio | ASR, Virtual Assistant | Landline only | 76 hours | Add | Belgian French SpeechDat(II) FDB-1000 (FIXED1BF) | Nuance | Scripted Speech | French | Belgium | Low background noise | 1,000 | 1 | 53,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control | French (Belgium) scripted telephony | |
36 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 9 hours | Add | FRC_ASR003 | Appen Global | Conversational Speech | French | Canada | Mixed | 68 | 2 | Available on request | 6,022 | 8 | alaw | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins For the majority of calls, only one half of the conversation was collected and transcribed, however, for a smaller number of calls, both speakers (in-line/out-line) were collected and transcribed | French (Canada) conversational telephony | |
183 | Text | ASR, TTS, Language Modelling | N/A | 67,000 words | Add | fra_CAN_PHON | Appen Global | Pronunciation Dictionary | French | Canada | N/A | N/A | N/A | N/A | 67,000 | N/A | text | French (Canada) Pronunciation Dictionary | ||
35 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 46 hours | Add | FRC_ASR002 | Appen Global | Scripted Speech | French | Canada | Low background noise (home/office) | 150 | 1 | 22,500 | 10,755 | 16 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 150 prompts per speaker including digits, digit strings (randomly generated), addresses and phonetically rich sentences and words | French (Canada) scripted microphone | |
34 | Audio | ASR, Virtual Assistant | Mobile phone | 131 hours | Add | FRC_ASR001 | Appen Global | Scripted Speech | French | Canada | Mixed | 1,000 | 1 | 100,000 | 11,697 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 100 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words | French (Canada) scripted telephony | |
275 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 159 hours | Add | FRF_ASR004 | Appen Global | Conversational Speech | French | France | Mixed (home, car, public place, outdoor) | 298 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request | French (France) conversational smartphone | |
40 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 25 hours | Add | FRF_ASR001 | Appen Global | Conversational Speech | French | France | Low background noise | 563 | 2 | Available on request | 11,922 | 8 | alaw | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For the majority of calls, both speakers (in-line/out-line) were collected and transcribed, however, for a smaller number of calls, only one half of the conversation was collected and transcribed | French (France) conversational telephony | |
39 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone and mobile phone | 113 hours | Add | French SpeechDat-Car | Nuance | Scripted Speech | French | France | Mixed (in-car) | 300 | 5 | 37,500 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report Approximately 125 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences and prompts for spontaneous speech 113.7 hours | French (France) In-Car | |
185 | Text | ASR, TTS, Language Modelling | N/A | 95,000 words | Add | fra_FRA_POS | Appen Global | Part of Speech Dictionary | French | France | N/A | N/A | N/A | N/A | 95,000 | N/A | text | French (France) Part of Speech Dictionary | ||
184 | Text | ASR, TTS, Language Modelling | N/A | 112,000 words | Add | fra_FRA_PHON | Appen Global | Pronunciation Dictionary | French | France | N/A | N/A | N/A | N/A | 112,000 | N/A | text | French (France) Pronunciation Dictionary | ||
41 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 26 hours | Add | FRF_ASR003 | Global Phone | Scripted Speech | French | France | Low background noise (home/office) | 98 | 1 | 10,273 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | French (France) scripted microphone | |
37 | Audio | ASR, Virtual Assistant | Landline only | 41 hours | Add | French SpeechDat(II) FDB-1000 | Nuance | Scripted Speech | French | France | Low background noise (home/office) | 1,017 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | French (France) scripted telephony | |
38 | Audio | ASR, Virtual Assistant | Landline only | 305 hours | Add | French SpeechDat(II) FDB-5000 | Nuance | Scripted Speech | French | France | Low background noise | 5,040 | 1 | 237,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 47 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | French (France) scripted telephony | |
60 | Audio | ASR, Virtual Assistant | Landline only | 45 hours | Add | Luxembourgish French SpeechDat(II) FDB-500 (FIXED1LF) | Nuance | Scripted Speech | French | Luxembourg | Low background noise | 614 | 1 | 32,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | French (Luxembourg) telephony | |
273 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 104 hours | Add | DEU_ASR004 | Appen Global | Conversational Speech | German | Germany | Mixed (home, car, public place, outdoor) | 198 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request | German (Germany) conversational smartphone | |
186 | Text | ASR, TTS, Language Modelling | N/A | 146,000 words | Add | deu_DEU_PHON | Appen Global | Pronunciation Dictionary | German | Germany | N/A | N/A | N/A | N/A | 146,000 | N/A | text | German (Germany) Pronunciation Dictionary | ||
16 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 16 hours | Add | DEU_ASR001 | Appen Global | Scripted Speech | German | Germany | Low background noise (studio) | 127 | 2 | 12,700 | 6,826 | 48 | raw PCM | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Each speaker read 100 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words | German (Germany) scripted microphone | |
18 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 25 hours | Add | DEU_ASR003 | Global Phone | Scripted Speech | German | Germany | Low background noise (home/office) | 77 | 1 | 10,085 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | German (Germany) scripted microphone | |
42 | Audio | ASR, Virtual Assistant | Landline only | 31 hours | Add | German SpeechDat (II) FDB-1000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 988 | 1 | 43,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 44 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | German (Germany) telephony | |
43 | Audio | ASR, Virtual Assistant | Landline only | 268 hours | Add | German SpeechDat(II) FDB-4000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 4,000 | 1 | 160,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | German (Germany) telephony | |
61 | Audio | ASR, Virtual Assistant | Landline only | 33 hours | Add | Luxembourgish German SpeechDat(II) FDB-500 (FIXED1LG) | Nuance | Scripted Speech | German | Luxembourg | Low background noise | 500 | 1 | 26,500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | German (Luxembourg) telephony | |
187 | Text | ASR, TTS, Language Modelling | N/A | 27,000 words | Add | deu_CHE_PHON | Appen Global | Pronunciation Dictionary | German | Switzerland | N/A | N/A | N/A | N/A | 27,000 | N/A | text | German (Switzerland) Pronunciation Dictionary | ||
94 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add | Speecon German (Switzerland) database | Nuance | Scripted Speech | German | Switzerland | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | German (Switzerland) scripted microphone | |
68 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 31 hours | Add | OrienTel German Spoken by Turkish | Nuance | Scripted Speech | German | Turkey | Low background noise | 300 | 1 | 15,600 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | German (Turkey) telephony | |
188 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add | ell_GRC_PHON | Appen Global | Pronunciation Dictionary | Greek | Greece | N/A | N/A | N/A | N/A | 5,000 | N/A | text | Greek (Greece) Pronunciation Dictionary | ||
117 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 191 hours | Add | GRE_ASR001_CN | Appen China | Scripted Speech | Greek | Greece | Low background noise (home/office) | 287 | 1 | 54,113 | 68,271 | 16 | wav | Dataset contains audio with corresponding text prompts | Greek (Greece) scripted smartphone | |
189 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add | grn_PRY_PHON | Appen Global | Pronunciation Dictionary | Guarani | Paraguay | N/A | N/A | N/A | N/A | 35,000 | N/A | text | Guarani (Paraguay) Pronunciation Dictionary | ||
190 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | hat_HTI_PHON | Appen Global | Pronunciation Dictionary | Haitian Creole | Haiti | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Haitian Creole (Haiti) Pronunciation Dictionary | ||
277 | Image | Document Processing, Document Search | Camera, scan | 964 images | Add | IMG_OCR_Handwritten | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | This is a subset of IMG_OCR_B2C_Other. Scans and photographs of handwritten forms and handwritten documents. 6 Languages collected in 23+ locales: 8% Arabic, 41% English, 7% French, 2% German, 20% Russian, 22% Spanish | Handwritten text document OCR | |
45 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 33 hours | Add | HAU_ASR002 | Appen Global | Conversational Speech | Hausa | Nigeria | Low background noise | 200 | 2 | Available on request | 7,949 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers | Hausa (Nigeria) conversational telephony | |
191 | Text | ASR, TTS, Language Modelling | N/A | 11,000 words | Add | hau_NGA_PHON | Appen Global | Pronunciation Dictionary | Hausa | Nigeria | N/A | N/A | N/A | N/A | 11,000 | N/A | text | Hausa (Nigeria) Pronunciation Dictionary | ||
44 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 20 hours | Add | HAU_ASR001 | Global Phone | Scripted Speech | Hausa | Cameroon | Low background noise (home/office) | 103 | 1 | 7,895 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Hausa scripted microphone | |
46 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 34 hours | Add | HEB_ASR001 | Appen Global | Conversational Speech | Hebrew | Israel | Low background noise | 200 | 2 | Available on request | 19,250 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 50% landline, 50% mobile Conversations cover a range of topics including: Friends, Family and Studies. | Hebrew (Israel) conversational telephony | |
192 | Text | ASR, TTS, Language Modelling | N/A | 31,000 words | Add | heb_ISR_PHON | Appen Global | Pronunciation Dictionary | Hebrew | Israel | N/A | N/A | N/A | N/A | 31,000 | N/A | text | Hebrew (Israel) Pronunciation Dictionary | ||
48 | Audio | ASR, Conversational AI, Speech Analytics, TTS | Mobile phone and landline | 32 hours | Add | HIN_ASR002 | Appen Global | Conversational Speech | Hindi | India | Mixed | 996 | 2 | Available on request | 12,266 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For the majority of calls, both speakers (in-line/out-line) were collected and transcribed, however, for a smaller number of calls, only one half of the conversation was collected and transcribed 29% landline, 71% mobile | Hindi (India) conversational telephony | |
193 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add | hin_IND_PHON | Appen Global | Pronunciation Dictionary | Hindi | India | N/A | N/A | N/A | N/A | 35,000 | N/A | text | Hindi (India) Pronunciation Dictionary | ||
47 | Audio | ASR, Virtual Assistant, TTS | Mobile phone | 224 hours | Add | HIN_ASR001 | Appen Global | Scripted Speech | Hindi | India | Low background noise | 1,920 | 1 | 96,000 | 9,853 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 50 prompts per speaker including digits, natural numbers, personal, business and place names, web addresses, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words | Hindi (India) scripted telephony | |
126 | Video | Fitness Applications, Action Classification, Gesture Recognition | Mobile phone | 2000 videos | Add | VED_HUMAN_BODY_CN | Appen China | Human Body | N/A | China | Mixed background and lighting conditions | 1000 | N/A | N/A | N/A | N/A | mp4 | Video clips are approximately 10-20 seconds long | Human body movement | |
194 | Text | ASR, TTS, Language Modelling | N/A | 500 words | Add | hun_HUN_PHON | Appen Global | Pronunciation Dictionary | Hungarian | Hungary | N/A | N/A | N/A | N/A | 500 | N/A | text | Hungarian (Hungary) Pronunciation Dictionary | ||
118 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 286 hours | Add | HUN_ASR001_CN | Appen China | Scripted Speech | Hungarian | Hungary | Low background noise (home/office) | 254 | 1 | 94,031 | 201,921 | 16 | wav | Dataset contains audio with corresponding text prompts | Hungarian (Hungary) scripted smartphone | |
49 | Audio | ASR, Virtual Assistant | Landline only | 65 hours | Add | Hungarian SpeechDat(E) | Nuance | Scripted Speech | Hungarian | Hungary | Low background noise | 1,000 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | Hungarian (Hungary) scripted telephony | |
195 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | ibo_NGA_PHON | Appen Global | Pronunciation Dictionary | Igbo | Nigeria | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Igbo (Nigeria) Pronunciation Dictionary | ||
149 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | ind_IDN_POS | Appen Global | Part of Speech Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Indonesian (Indonesia) Part of Speech Dictionary | ||
148 | Text | ASR, TTS, Language Modelling | N/A | 95,000 words | Add | ind_IDN_PHON | Appen Global | Pronunciation Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 95,000 | N/A | text | Indonesian (Indonesia) Pronunciation Dictionary | ||
262 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 100 hours | Add | NMG_ASR001_CN | Appen China | Conversational Speech | Inner Mongolian | China | Low background noise | 200 | 1 | 16 | wav | Audio only; transcription not included Audio recordings cover the following areas: Xilingol League, Tongliao, Hohhot. Each recording session contains about 30 minutes of free dialogue between 2 people. | Inner Mongolian (China) Conversational Speech | |||
32 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 30 hours | Add | FAR_ASR002 | Appen Global | Conversational Speech | Iranian Persian (Farsi) | Iran | Mixed | 1,000 | 2 | Available on request | 12,358 | 8 | wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words | Iranian Persian (Farsi) (Iran) conversational telephony | |
31 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 85 hours | Add | FAR_ASR001 | Appen Global | Scripted Speech | Iranian Persian (Farsi) | Iran | Mixed | 789 | 1 | 38,400 | 8,716 | 8 | alaw | Fully transcribed to OrienTel type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 48 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words | Iranian Persian (Farsi) (Iran) scripted telephony | |
180 | Text | ASR, TTS, Language Modelling | N/A | 1,400,000 words | Add | pes_IRN_POS | Appen Global | Part of Speech Dictionary | Iranian Persian | Iran | N/A | N/A | N/A | N/A | 1,400,000 | N/A | text | Iranian Persian (Iran) Part of Speech Dictionary | ||
179 | Text | ASR, TTS, Language Modelling | N/A | 80,000 words | Add | pes_IRN_PHON | Appen Global | Pronunciation Dictionary | Iranian Persian | Iran | N/A | N/A | N/A | N/A | 80,000 | N/A | text | Iranian Persian (Iran) Pronunciation Dictionary | ||
276 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 256 hours | Add | ITA_ASR005 | Appen Global | Conversational Speech | Italian | Italy | Mixed (home, car, public place, outdoor) | 482 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request | Italian (Italy) conversational smartphone | |
52 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 36 hours | Add | ITA_ASR003 | Appen Global | Conversational Speech | Italian | Italy | Low background noise | 200 | 2 | Available on request | 18,974 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 50% landline, 50% mobile Conversations cover a range of topics including: Travel, Family and Holidays. | Italian (Italy) conversational telephony | |
197 | Text | ASR, TTS, Language Modelling | N/A | 171,000 words | Add | ita_ITA_POS | Appen Global | Part of Speech Dictionary | Italian | Italy | N/A | N/A | N/A | N/A | 171,000 | N/A | text | Italian (Italy) Part of Speech Dictionary | ||
196 | Text | ASR, TTS, Language Modelling | N/A | 197,000 words | Add | ita_ITA_PHON | Appen Global | Pronunciation Dictionary | Italian | Italy | N/A | N/A | N/A | N/A | 197,000 | N/A | text | Italian (Italy) Pronunciation Dictionary | ||
50 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 44 hours | Add | ITA_ASR001 | Appen Global | Scripted Speech | Italian | Italy | Mixed | 200 | 4 | 40,000 | 7,316 | 22 | raw PCM | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 prompts per speaker including 100 command and control type items and 100 phonetically rich sentences | Italian (Italy) scripted microphone | |
53 | Audio | TTS | Microphone | 3 hours | Add | ITA_TTS001 | Appen Global | Scripted Speech | Italian | Italy | Low background noise (studio) | 1 | 1 | 3,300 | Available on request | 22 | raw PCM | Dataset is accompanied by a pronunciation lexicon containing all words spoken in the Dataset 3,300 prompts per speaker including phonetically rich sentences | Italian (Italy) scripted microphone | |
51 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone | 47 hours | Add | ITA_ASR002 | Appen Global | Scripted Speech | Italian | Italy | Mixed (in-car) | 205 | 4 | 35,875 | 10,366 | 48 | raw PCM | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 350 prompts per speaker including digits, street names, generic command and control items, phonetically rich sentences and words Each speaker recorded 1 or 2 sessions including Session 1 in a parked vehicle with the engine running and Session 2 in a vehicle travelling at 60 mph (100 km/h) | Italian (Italy) scripted microphone in-car | |
54 | Audio | ASR, Virtual Assistant | Landline only | 38 hours | Add | Italian Fixed Network Speech SpeechDat(M) Corpus | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 1,000 | 1 | 39,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 39 prompts per speaker including isolated and connected digits, natural numbers, money amounts, spelled words, time and date phrases, yes/no questions, city names, common application words, application words in phrases and phonetically rich sentences | Italian (Italy) telephony | |
55 | Audio | ASR, Virtual Assistant | Landline only | 228 hours | Add | Italian SpeechDat(II) FDB-3000 | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 3,040 | 1 | 134,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 44 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | Italian (Italy) telephony | |
56 | Audio | ASR, Virtual Assistant | Mobile phone | 103 hours | Add | Italian SpeechDat(II) MDB-250 | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 375 | 1 | 19,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 51 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | Italian (Italy) telephony | |
89 | Audio | ASR, Virtual Assistant | Mobile phone | 13 hours | Add | SpeechDat(M) Italian Mobile Network Speech Database | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 342 | 1 | 13,500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | Italian (Italy) telephony | |
199 | Text | ASR, TTS, Language Modelling | N/A | 269,000 words | Add | jpn_JPN_POS | Appen Global | Part of Speech Dictionary | Japanese | Japan | N/A | N/A | N/A | N/A | 269,000 | N/A | text | Japanese (Japan) Part of Speech Dictionary | ||
198 | Text | ASR, TTS, Language Modelling | N/A | 262,000 words | Add | jpn_JPN_PHON | Appen Global | Pronunciation Dictionary | Japanese | Japan | N/A | N/A | N/A | N/A | 262,000 | N/A | text | Japanese (Japan) Pronunciation Dictionary | ||
57 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 33 hours | Add | JPN_ASR001 | Global Phone | Scripted Speech | Japanese | Japan | Low background noise (home/office) | 144 | 1 | 13,067 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Japanese (Japan) scripted microphone | |
95 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 57 hours | Add | Speecon Japanese | Nuance | Scripted Speech | Japanese | Japan | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers | Japanese (Japan) scripted microphone | |
133 | Text | NER, Content Classification, Search Engines | N/A | 20,629 sentences | Add | JPY_NER001 | Appen Global | News NER | Japanese | Japan | N/A | N/A | N/A | 20,629 | Available on request | N/A | text | Japanese NER news text | ||
200 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add | jav_IDN_PHON | Appen Global | Pronunciation Dictionary | Javanese | Indonesia | N/A | N/A | N/A | N/A | 20,000 | N/A | text | Javanese (Indonesia) Pronunciation Dictionary | ||
58 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 15 hours | Add | KAN_ASR001 | Appen Global | Conversational Speech | Kannada | India | Mixed | 178 | 2 | Available on request | 15,660 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words | Kannada (India) conversational telephony | |
109 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 57 hours | Add | KAN_ASR001A | Appen Global | Conversational Speech | Kannada | India | Mixed | 1,000 | 2 | Available on request | 15,660 | 8 | alaw | Approx. 25% of the dataset sessions are transcribed and time stamped - full transcripts can be made available Database is accompanied by a pronunciation lexicon containing all transcribed words 16% Hands-Free car, 16% Landline quiet, 15% Mobile quiet, 17% Moving vehicle, 19% Public place, 17% Roadside | Kannada (India) conversational telephony | |
201 | Text | ASR, TTS, Language Modelling | N/A | 49,000 words | Add | kan_IND_PHON | Appen Global | Pronunciation Dictionary | Kannada | India | N/A | N/A | N/A | N/A | 49,000 | N/A | text | Kannada (India) Pronunciation Dictionary | ||
202 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | kaz_KAZ_PHON | Appen Global | Pronunciation Dictionary | Kazakh | Kazakhstan | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Kazakh (Kazakhstan) Pronunciation Dictionary | ||
204 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | kor_KOR_POS | Appen Global | Part of Speech Dictionary | Korean | South Korea | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Korean (South Korea) Part of Speech Dictionary | ||
203 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | kor_KOR_PHON | Appen Global | Pronunciation Dictionary | Korean | South Korea | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Korean (South Korea) Pronunciation Dictionary | ||
59 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 20 hours | Add | KOR_ASR001 | Global Phone | Scripted Speech | Korean | South Korea | Low background noise (home/office) | 100 | 1 | 8,107 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Korean (South Korea) scripted microphone | |
129 | Text | NER, Content Classification, Search Engines | N/A | 25,830 sentences | Add | KOR_NER001 | Appen Global | News NER | Korean | South Korea | N/A | N/A | N/A | 25,830 | Available on request | N/A | text | Korean NER news text | ||
205 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add | kur_TUR_PHON | Appen Global | Pronunciation Dictionary | Kurmanji | Turkey | N/A | N/A | N/A | N/A | 60,000 | N/A | text | Kurmanji (Turkey) Pronunciation Dictionary | ||
206 | Text | ASR, TTS, Language Modelling | N/A | 9,000 words | Add | lao_LAO_PHON | Appen Global | Pronunciation Dictionary | Lao | Laos | N/A | N/A | N/A | N/A | 9,000 | N/A | text | Lao (Laos) Pronunciation Dictionary | ||
207 | Text | ASR, TTS, Language Modelling | N/A | 71,000 words | Add | lit_LTU_PHON | Appen Global | Pronunciation Dictionary | Lithuanian | Lithuania | N/A | N/A | N/A | N/A | 71,000 | N/A | text | Lithuanian (Lithuania) Pronunciation Dictionary | ||
208 | Text | ASR, TTS, Language Modelling | N/A | 19,000 words | Add | mal_IND_PHON | Appen Global | Pronunciation Dictionary | Malayalam | India | N/A | N/A | N/A | N/A | 19,000 | N/A | text | Malayalam (India) Pronunciation Dictionary | ||
209 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | msa_MYS_PHON | Appen Global | Pronunciation Dictionary | Malaysian | Malaysia | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Malaysian (Malaysia) Pronunciation Dictionary | ||
210 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add | zho_CHN_PHON | Appen Global | Pronunciation Dictionary | Mandarin (Simplified) | China | N/A | N/A | N/A | N/A | 35,000 | N/A | text | Mandarin (Simplified) (China) Pronunciation Dictionary | ||
211 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add | zho_TWN_PHON | Appen Global | Pronunciation Dictionary | Mandarin (Traditional) | Taiwan | N/A | N/A | N/A | N/A | 50,000 | N/A | text | Mandarin (Traditional) (Taiwan) Pronunciation Dictionary | ||
63 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 26 hours | Add | MAC_ASR002 | Global Phone | Scripted Speech | Mandarin Chinese | China | Low background noise (home/office) | 132 | 1 | 10,225 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Mandarin Chinese (China) scripted microphone | |
62 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 323 hours | Add | MAC_ASR001 | Appen Global | Scripted Speech | Mandarin Chinese | China | Mixed | 2,000 | 1 | 200,000 | 7,145 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 98 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items (from a set of 215), phonetically rich sentences and words | Mandarin Chinese (China) scripted telephony | |
131 | Text | NER, Content Classification, Search Engines | N/A | 17,313 sentences | Add | MAC_NER001 | Appen Global | News NER | Mandarin Chinese | China | N/A | N/A | N/A | 17,313 | Available on request | N/A | text | Mandarin NER news text | ||
64 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 15 hours | Add | MAR_ASR001 | Appen Global | Conversational Speech | Marathi | India | Mixed | 180 | 2 | Available on request | 11,908 | 8 | alaw | Approx. 29% of the dataset sessions are transcribed and time stamped - full transcripts can be made available Dataset is accompanied by a pronunciation lexicon containing all transcribed words 17% Hands-Free car, 16% Landline quiet, 19% Mobile quiet, 16% Moving vehicle, 16% Public place, 17% Roadside | Marathi (India) conversational telephony | |
110 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 52 hours | Add | MAR_ASR001A | Appen Global | Conversational Speech | Marathi | India | Mixed | 1,000 | 2 | Available on request | 11,908 | 8 | alaw | Portion of the dataset sessions are transcribed and time stamped - full transcripts can be made available Dataset is accompanied by a pronunciation lexicon containing all transcribed words | Marathi (India) conversational telephony | |
212 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | mar_IND_PHON | Appen Global | Pronunciation Dictionary | Marathi | India | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Marathi (India) Pronunciation Dictionary | ||
213 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | mon_MNG_PHON | Appen Global | Pronunciation Dictionary | Mongolian | Mongolia | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Mongolian (Mongolia) Pronunciation Dictionary | ||
215 | Text | ASR, TTS, Language Modelling | N/A | 3,000 words | Add | nor_NOR_POS | Appen Global | Part of Speech Dictionary | Norwegian | Norway | N/A | N/A | N/A | N/A | 3,000 | N/A | text | Norwegian (Norway) Part of Speech Dictionary | ||
214 | Text | ASR, TTS, Language Modelling | N/A | 115,000 words | Add | nor_NOR_PHON | Appen Global | Pronunciation Dictionary | Norwegian | Norway | N/A | N/A | N/A | N/A | 115,000 | N/A | text | Norwegian (Norway) Pronunciation Dictionary | ||
264 | Image | Image label recognition training | Mobile phone and camera | 2196 images | Add | IMG_TAG_CN | Appen China | Object Image | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | jpg | Multi-scene picture sample library of 2196 images, with the following categories: KTV: 50; Department store: 55; Office: 100; Museum: 63; Electrical appliances: 55; Marine: 191; Car: 50; Handbags: 35; Night view: 54; Sports equipment: 54; Convenience stores: 34; Restaurant: 54; Window scenery: 62; Pets: 82; Ship: 50; Zoo: 70; Clothing store: 53; Beach: 95; Airport: 65 tickets; Gym: 47; Attractions: 77; Crowd: 67; Desert: 73; Beach: 68; Mountain area: 54; Shopping mall: 55; Trees: 85; Sky: 102; Snow: 71; Snow Mountain: 53; Night view: 78; Playground: 94 | Object Image Collection | |||
216 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | ori_IND_PHON | Appen Global | Pronunciation Dictionary | Oriya | India | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Oriya (India) Pronunciation Dictionary | ||
80 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 20 hours | Add | PAP_ASR001 | Appen Global | Conversational Speech | Panjabi | Pakistan | Low background noise | 205 | 2 | Available on request | 7,298 | 8 | alaw | Dataset is fully transcribed and time-stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For 71% of calls, both speakers (in-line/out-line) were collected and transcribed; for the remaining 29% of calls, only one half of the conversation was collected and transcribed | Panjabi (Pakistan) conversational telephony | |
74 | Audio | ASR, Automatic Captioning, Keyword Spotting | Microphone | 51 hours | Add | PAS_BRC001 | Appen Global | Broadcast Speech | Northern Pashto - Southern Pashto | Afghanistan | Low background noise (studio) | N/A | 1 | Available on request | Available on request | N/A | wav | Dataset is fully transcribed and timestamped Pronunciation lexicon not currently available but can be developed upon request Dataset is largely speech only and does not include music or advertisements Data types include: talk shows, interviews, news broadcasts (excluding news reading by anchors) | Pashto (Afghanistan) broadcast | |
73 | Audio | ASR, Conversational AI, Speech Analytics | Microphone | 39 hours | Add | PAS_ASR002 | Appen Global | Conversational Speech | Northern Pashto - Southern Pashto | Afghanistan | Low background noise | 40 | 2 | 34,860 | 9,480 | 16 | wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words A full translation of the transcripts into French is also available as an optional additional purchase Average length of calls: 120 mins where one speaker acts as an interviewer and the other as the interviewee in scenarios similar to TransTAC style (e.g. civil affairs, checkpoints etc.) The interviewer appears in more than one set of dialogues but the interviewee is unique for each set | Pashto (Afghanistan) conversational microphone | |
72 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 55 hours | Add | PAS_ASR001 | Appen Global | Conversational Speech | Northern Pashto - Southern Pashto | Afghanistan | Low background noise | 967 | 2 | Available on request | 13,633 | 8 | wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For the majority of calls, both speakers (in-line/out-line) were collected and transcribed, however, for a smaller number of calls, only one half of the conversation was collected and transcribed 25% landline, 75% mobile | Pashto (Afghanistan) conversational telephony | |
217 | Text | ASR, TTS, Language Modelling | N/A | 65,000 words | Add | pus_AFG_PHON | Appen Global | Pronunciation Dictionary | Pashto | Afghanistan | N/A | N/A | N/A | N/A | 65,000 | N/A | text | Pashto (Afghanistan) Pronunciation Dictionary | ||
219 | Text | ASR, TTS, Language Modelling | N/A | 4,000 words | Add | pol_POL_POS | Appen Global | Part of Speech Dictionary | Polish | Poland | N/A | N/A | N/A | N/A | 4,000 | N/A | text | Polish (Poland) Part of Speech Dictionary | ||
218 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | pol_POL_PHON | Appen Global | Pronunciation Dictionary | Polish | Poland | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Polish (Poland) Pronunciation Dictionary | ||
75 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 25 hours | Add | POL_ASR001 | Global Phone | Scripted Speech | Polish | Poland | Low background noise (home/office) | 99 | 1 | 10,130 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Polish (Poland) scripted microphone | |
119 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 293 hours | Add | POL_ASR002_CN | Appen China | Scripted Speech | Polish | Poland | Low background noise (home/office) | 353 | 1 | 106,674 | 168,544 | 16 | wav | Dataset contains audio with corresponding text prompts | Polish (Poland) scripted smartphone | |
76 | Audio | ASR, Virtual Assistant | Landline only | 78 hours | Add | Polish SpeechDat(E) Database | Nuance | Scripted Speech | Polish | Poland | Low background noise | 1,000 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words | Polish (Poland) scripted telephony | |
78 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 33 hours | Add | PTB_ASR002 | Appen Global | Conversational Speech | Portuguese | Brazil | Low background noise | 200 | 2 | 33,837 | 11,287 | 8 | alaw or wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 63% landline, 38% mobile | Portuguese (Brazil) conversational telephony | |
77 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 26 hours | Add | PTB_ASR001 | Global Phone | Scripted Speech | Portuguese | Brazil | Low background noise (home/office) | 102 | 1 | 10,417 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) | Portuguese (Brazil) microphone | |
221 | Text | ASR, TTS, Language Modelling | N/A | 98,000 words | Add | por_BRA_POS | Appen Global | Part of Speech Dictionary | Portuguese | Brazil | N/A | N/A | N/A | N/A | 98,000 | N/A | text | Portuguese (Brazil) Part of Speech Dictionary | ||
220 | Text | ASR, TTS, Language Modelling | N/A | 102,000 words | Add | por_BRA_PHON | Appen Global | Pronunciation Dictionary | Portuguese | Brazil | N/A | N/A | N/A | N/A | 102,000 | N/A | text | Portuguese (Brazil) Pronunciation Dictionary | ||
79 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 36 hours | Add | PTP_ASR001 | Appen Global | Conversational Speech | Portuguese | Portugal | Low background noise | 200 | 2 | 36,586 | 16,339 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers | Portuguese (Portugal) conversational telephony | |
223 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add | por_PRT_POS | Appen Global | Part of Speech Dictionary | Portuguese | Portugal | N/A | N/A | N/A | N/A | 60,000 | N/A | text | Portuguese (Portugal) Part of Speech Dictionary | ||
222 | Text | ASR, TTS, Language Modelling | N/A | 112,000 words | Add | por_PRT_PHON | Appen Global | Pronunciation Dictionary | Portuguese | Portugal | N/A | N/A | N/A | N/A | 112,000 | N/A | text | Portuguese (Portugal) Pronunciation Dictionary | ||
81 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 37 hours | Add | ROM_ASR001 | Appen Global | Conversational Speech | Romanian | Romania | Low background noise | 200 | 2 | Available on request | 16,658 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 50% landline, 50% mobile. Conversations cover a range of topics including leisure, work and sport. | Romanian (Romania) conversational telephony | |
224 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | ron_ROU_PHON | Appen Global | Pronunciation Dictionary | Romanian | Romania | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Romanian (Romania) Pronunciation Dictionary | ||
82 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 37 hours | Add | RUS_ASR001 | Appen Global | Conversational Speech | Russian | Russia | Low background noise | 200 | 2 | Available on request | 28,284 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 50% landline, 50% mobile. | Russian (Russia) conversational telephony | |
226 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | rus_RUS_POS | Appen Global | Part of Speech Dictionary | Russian | Russia | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Russian (Russia) Part of Speech Dictionary | ||
225 | Text | ASR, TTS, Language Modelling | N/A | 115,000 words | Add | rus_RUS_PHON | Appen Global | Pronunciation Dictionary | Russian | Russia | N/A | N/A | N/A | N/A | 115,000 | N/A | text | Russian (Russia) Pronunciation Dictionary | ||
83 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 31 hours | Add | RUS_ASR002 | Global Phone | Scripted Speech | Russian | Russia | Low background noise (home/office) | 115 | 1 | 12,205 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Russian (Russia) scripted microphone | |
96 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 46 hours | Add | Speecon Russian Database | Nuance | Scripted Speech | Russian | Russia | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, commands (toy, phone and general) for child speakers, phonetically rich words and sentences, and free and elicited spontaneous responses for adult speakers. | Russian (Russia) scripted microphone | |
84 | Audio | ASR, Virtual Assistant | Landline only | 180 hours | Add | Russian SpeechDat(E) Database | Nuance | Scripted Speech | Russian | Russia | Low background noise | 2,500 | 1 | 112,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 45 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich sentences and words. | Russian (Russia) scripted telephony | |
130 | Text | NER, Content Classification, Search Engines | N/A | 29,888 sentences | Add | RUS_NER001 | Appen Global | News NER | Russian | Russia | N/A | N/A | N/A | 29,888 | Available on request | N/A | text | Russian NER news text | ||
227 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | srp_SRB_PHON | Appen Global | Pronunciation Dictionary | Serbian | Serbia | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Serbian (Serbia) Pronunciation Dictionary | ||
257 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 4.5 hours | Add | SHANGHAI_ASR002_CN | Appen China | Conversational Speech | Shanghai dialect | China | Low background noise | 14 | 1 | | | 8 | wav | Audio only; transcription not included. Audio recordings cover the following districts: Shanghai Huangpu District, Xuhui District, Changning District, Jing'an District, Putuo District, Hongkou District, Yangpu District, Pudong New Area. Shanghai suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Shanghai dialect (China) Conversational Speech | |
256 | Audio | ASR, Conversational AI, Speech Analytics | Recording pen/microphone | 21 hours | Add | SHANGHAI_ASR001_CN | Appen China | Conversational Speech | Shanghai dialect | China | Low background noise | 51 | 1 | | | 16 | wav | Audio only; transcription not included. Audio recordings cover the following districts: Shanghai Huangpu District, Xuhui District, Changning District, Jing'an District, Putuo District, Hongkou District, Yangpu District, Pudong New Area. Shanghai suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Shanghai dialect (China) Conversational Speech | |
123 | Image | Document Processing, Document Search | Camera | 200 images | Add | IMG_OCR_MAC_CN | Appen China | Document OCR | N/A | China | Mixed lighting conditions | 30 | N/A | N/A | N/A | N/A | jpg | Text in each image is labeled with line-level bounding boxes. Images contain heavy text in Chinese, including books, publications, posters, receipts, PPT, printed paper, etc. | Simplified Chinese printed text OCR | |
85 | Audio | ASR, Virtual Assistant | Landline only | 65 hours | Add | Slovak SpeechDat(E) Database | Nuance | Scripted Speech | Slovak | Slovakia | Low background noise | 1,000 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich sentences and words. | Slovak (Slovakia) scripted telephony | |
86 | Audio | ASR, Virtual Assistant | Landline only | 76 hours | Add | Slovenian SpeechDat(II) FDB-1000 | Nuance | Scripted Speech | Slovenian | Slovenia | Low background noise (home/office) | 1,000 | 1 | 40,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. Approximately 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich sentences and words. | Slovenian (Slovenia) telephony | |
87 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 50 hours | Add | SOM_ASR001 | Appen Global | Conversational Speech | Somali | Somalia | Low background noise | 1,000 | 2 | Available on request | 23,217 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 1% landline, 99% mobile. | Somali (Somalia) conversational telephony | |
228 | Text | ASR, TTS, Language Modelling | N/A | 76,000 words | Add | som_SOM_PHON | Appen Global | Pronunciation Dictionary | Somali | Somalia | N/A | N/A | N/A | N/A | 76,000 | N/A | text | Somali (Somalia) Pronunciation Dictionary | ||
229 | Text | ASR, TTS, Language Modelling | N/A | 25,000 words | Add | kur_IRQ_PHON | Appen Global | Pronunciation Dictionary | Sorani | Iraq | N/A | N/A | N/A | N/A | 25,000 | N/A | text | Sorani (Iraq) Pronunciation Dictionary | ||
88 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 5 hours | Add | SOR_ASR001 | Appen Global | Conversational Speech | Central Kurdish (Iran) | Iran | Low background noise | 170 | 2 | Available on request | 7,924 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For a large proportion of calls, only one half of the conversation was collected and transcribed. | Sorani (Kurdish) conversational telephony | |
230 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | spa_ARG_PHON | Appen Global | Pronunciation Dictionary | Spanish | Argentina | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Spanish (Argentina) Pronunciation Dictionary | ||
232 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | spa_CHL_PHON | Appen Global | Pronunciation Dictionary | Spanish | Chile | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Spanish (Chile) Pronunciation Dictionary | ||
233 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | spa_COL_PHON | Appen Global | Pronunciation Dictionary | Spanish | Colombia | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Spanish (Colombia) Pronunciation Dictionary | ||
27 | Audio | ASR, Call Centre, Conversational AI, Speech Analytics | Mobile phone and landline | 22 hours | Add | ESL_ASR002 | Appen Global | Conversational Speech | Spanish | Chile-Colombia | Mixed | 84 | 2 | 22,098 | Available on request | 8 | wav | Dataset is fully transcribed and timestamped. Pronunciation lexicon not currently available but can be developed upon request. Call centre style conversations (by 64 customers, 14 agents) in banking and telco domains, primarily using mobile phone. | Spanish (Latin America - Chile and Colombia) conversational telephony | |
26 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 17 hours | Add | ESL_ASR001 | Global Phone | Scripted Speech | Spanish | Costa Rica | Low background noise (home/office) | 100 | 1 | 6,898 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Spanish (Latin America) scripted microphone | |
234 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | spa_PER_PHON | Appen Global | Pronunciation Dictionary | Spanish | Peru | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Spanish (Peru) Pronunciation Dictionary | ||
274 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 223 hours | Add | ESP_ASR003 | Appen Global | Conversational Speech | Spanish | Spain | Mixed (home, car, public place, outdoor) | 414 | 1 | Available on request | Available on request | 48 | wav | Two-person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request. | Spanish (Spain) conversational smartphone | |
231 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | spa_ESP_PHON | Appen Global | Pronunciation Dictionary | Spanish | Spain | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Spanish (Spain) Pronunciation Dictionary | ||
28 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 39 hours | Add | ESP_ASR001 | Appen Global | Scripted Speech | Spanish | Spain | Mixed | 200 | 4 | 40,000 | 6,367 | 22 | raw PCM | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 prompts per speaker including 100 command and control type items and 100 phonetically rich sentences. | Spanish (Spain) scripted microphone | |
30 | Audio | TTS | Microphone | 1 hour | Add | ESP_TTS001 | Appen Global | Scripted Speech | Spanish | Spain | Low background noise (studio) | 1 | 1 | 1,787 | 3,614 | 22 | wav | Dataset is accompanied by a pronunciation lexicon containing all words spoken in the dataset. 1,787 prompts per speaker including phonetically rich sentences. | Spanish (Spain) scripted microphone | |
97 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 46 hours | Add | Speecon Spanish Database | Nuance | Scripted Speech | Spanish | Spain | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, commands (toy, phone and general) for child speakers, phonetically rich words and sentences, and free and elicited spontaneous responses for adult speakers. | Spanish (Spain) scripted microphone | |
235 | Text | ASR, TTS, Language Modelling | N/A | 90,000 words | Add | spa_USA_PHON | Appen Global | Pronunciation Dictionary | Spanish | United States | N/A | N/A | N/A | N/A | 90,000 | N/A | text | Spanish (United States) Pronunciation Dictionary | ||
236 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add | spa_VEN_PHON | Appen Global | Pronunciation Dictionary | Spanish | Venezuela | N/A | N/A | N/A | N/A | 15,000 | N/A | text | Spanish (Venezuela) Pronunciation Dictionary | ||
237 | Text | ASR, TTS, Language Modelling | N/A | 66,000 words | Add | swa_KEN_PHON | Appen Global | Pronunciation Dictionary | Swahili | Kenya | N/A | N/A | N/A | N/A | 66,000 | N/A | text | Swahili (Kenya) Pronunciation Dictionary | ||
239 | Text | ASR, TTS, Language Modelling | N/A | 105,000 words | Add | swe_SWE_POS | Appen Global | Part of Speech Dictionary | Swedish | Sweden | N/A | N/A | N/A | N/A | 105,000 | N/A | text | Swedish (Sweden) Part of Speech Dictionary | ||
238 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add | swe_SWE_PHON | Appen Global | Pronunciation Dictionary | Swedish | Sweden | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Swedish (Sweden) Pronunciation Dictionary | ||
98 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 30 hours | Add | SWE_ASR001 | Global Phone | Scripted Speech | Swedish | Sweden - Finland | Low background noise (home/office) | 98 | 1 | 11,816 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Swedish (Sweden/Finland) microphone | |
240 | Text | ASR, TTS, Language Modelling | N/A | 22,000 words | Add | syl_BGD -IND_PHON | Appen Global | Pronunciation Dictionary | Sylheti | Bangladesh - India | N/A | N/A | N/A | N/A | 22,000 | N/A | text | Sylheti (Bangladesh - India) Pronunciation Dictionary | ||
241 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | tgl_PHL_PHON | Appen Global | Pronunciation Dictionary | Tagalog | Philippines | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Tagalog (Philippines) Pronunciation Dictionary | ||
243 | Text | ASR, TTS, Language Modelling | N/A | 106,000 words | Add | tam_IND_PHON | Appen Global | Pronunciation Dictionary | Tamil | India | N/A | N/A | N/A | N/A | 106,000 | N/A | text | Tamil (India) Pronunciation Dictionary | ||
242 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add | tel_IND_PHON | Appen Global | Pronunciation Dictionary | Telugu | India | N/A | N/A | N/A | N/A | 50,000 | N/A | text | Telugu (India) Pronunciation Dictionary | ||
101 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 28 hours | Add | THA_ASR001 | Global Phone | Scripted Speech | Thai | Thailand | Low background noise (home/office) | 98 | 1 | 14,039 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Thai (Thailand) microphone | |
124 | Image | Document Processing, Document Search | Camera | 1,219 images | Add | IMG_OCR_THA_CN | Appen China | Document OCR | Thai | Thailand | Mixed lighting conditions | 10 | N/A | N/A | N/A | N/A | jpg | Images containing text: shopping receipts, tickets, invoices, taxi slips, etc. | Thai (Thailand) printed text OCR | |
244 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add | tha_THA_PHON | Appen Global | Pronunciation Dictionary | Thai | Thailand | N/A | N/A | N/A | N/A | 30,000 | N/A | text | Thai (Thailand) Pronunciation Dictionary | ||
245 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | tpi_PNG_PHON | Appen Global | Pronunciation Dictionary | Tok Pisin | Papua New Guinea | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Tok Pisin (Papua New Guinea) Pronunciation Dictionary | ||
102 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 41 hours | Add | TUR_ASR001 | Appen Global | Conversational Speech | Turkish | Turkey | Low background noise | 200 | 2 | Available on request | 32,386 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 48% landline, 52% mobile. | Turkish (Turkey) conversational telephony | |
103 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 17 hours | Add | TUR_ASR002 | Global Phone | Scripted Speech | Turkish | Turkey | Low background noise (home/office) | 100 | 1 | 6,950 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Turkish (Turkey) microphone | |
247 | Text | ASR, TTS, Language Modelling | N/A | 257,000 words | Add | tur_TUR_POS | Appen Global | Part of Speech Dictionary | Turkish | Turkey | N/A | N/A | N/A | N/A | 257,000 | N/A | text | Turkish (Turkey) Part of Speech Dictionary | ||
246 | Text | ASR, TTS, Language Modelling | N/A | 255,000 words | Add | tur_TUR_PHON | Appen Global | Pronunciation Dictionary | Turkish | Turkey | N/A | N/A | N/A | N/A | 255,000 | N/A | text | Turkish (Turkey) Pronunciation Dictionary | ||
69 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 118 hours | Add | OrienTel Turkish Database | Nuance | Scripted Speech | Turkish | Turkey | Low background noise | 1,700 | 1 | 76,500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 45 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich sentences and words. | Turkish (Turkey) telephony | |
248 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add | ukr_UKR_PHON | Appen Global | Pronunciation Dictionary | Ukrainian | Ukraine | N/A | N/A | N/A | N/A | 5,000 | N/A | text | Ukrainian (Ukraine) Pronunciation Dictionary | ||
105 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 47 hours | Add | URD_ASR001 | Appen Global | Conversational Speech | Urdu | India - Pakistan | Mixed | 1,000 | 2 | 174,666 | 10,871 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Environments: 9% hands-free car, 7% landline quiet, 34% mobile quiet, 29% public place, 16% roadside. | Urdu (India/Pakistan) conversational telephony | |
250 | Text | ASR, TTS, Language Modelling | N/A | 12,000 words | Add | urd_PAK_POS | Appen Global | Part of Speech Dictionary | Urdu | Pakistan | N/A | N/A | N/A | N/A | 12,000 | N/A | text | Urdu (Pakistan) Part of Speech Dictionary | ||
249 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add | urd_PAK_PHON | Appen Global | Pronunciation Dictionary | Urdu | Pakistan | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Urdu (Pakistan) Pronunciation Dictionary | ||
134 | Text | NER, Content Classification, Search Engines | N/A | 20,634 sentences | Add | URD_NER001 | Appen Global | News NER | Urdu | Pakistan | N/A | N/A | N/A | 20,634 | Available on request | N/A | text | Urdu NER news text | ||
263 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 122 hours | Add | WWE_ASR001_CN | Appen China | Conversational Speech | Uygur | China | Low background noise | 231 | 1 | | | 16 | wav | Audio only; transcription not included. Audio recordings cover the following areas: Hotan dialect, Central dialect. Each recording session contains about 30 minutes of free dialogue between 2 people. | Uygur (China) Conversational Speech | |
108 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 19 hours | Add | VIE_ASR001 | Global Phone | Scripted Speech | Vietnamese | Vietnam | Low background noise (home/office) | 129 | 1 | 18,842 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Vietnamese (Vietnam) microphone | |
251 | Text | ASR, TTS, Language Modelling | N/A | 8,000 words | Add | vie_VNM_PHON | Appen Global | Pronunciation Dictionary | Vietnamese | Vietnam | N/A | N/A | N/A | N/A | 8,000 | N/A | text | Vietnamese (Vietnam) Pronunciation Dictionary | ||
252 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | wuu_CHN_PHON | Appen Global | Pronunciation Dictionary | Wu | China | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Wu (China) Pronunciation Dictionary | ||
261 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 58.6 hours | Add | WUHAN_ASR002_CN | Appen China | Conversational Speech | Wuhan dialect | China | Low background noise | 180 | 1 | | | 8 | wav | Audio only; transcription not included. Audio recordings cover 5 districts of Wuhan: Jiang'an, Jianghan, Qiao Kou, Hanyang and Wuchang. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Wuhan dialect (China) Conversational Speech | |
260 | Audio | ASR, Conversational AI, Speech Analytics | Recording pen/microphone | 44.71 hours | Add | WUHAN_ASR001_CN | Appen China | Conversational Speech | Wuhan dialect | China | Low background noise | 135 | 1 | | | 16 | wav | Audio only; transcription not included. Audio recordings cover 5 districts of Wuhan: Jiang'an, Jianghan, Qiao Kou, Hanyang and Wuchang. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Wuhan dialect (China) Conversational Speech | |
253 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add | hsn_CHN_PHON | Appen Global | Pronunciation Dictionary | Xiang | China | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Xiang (China) Pronunciation Dictionary | ||
254 | Text | ASR, TTS, Language Modelling | N/A | 75,000 words | Add | zul_ZAF_PHON | Appen Global | Pronunciation Dictionary | Zulu | South Africa | N/A | N/A | N/A | N/A | 75,000 | N/A | text | Zulu (South Africa) Pronunciation Dictionary |
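Appen's High-Quality Datasets
Appen offers a range of commercial databases and more than 400 licensable datasets. Our speech databases cover more than 80 languages and dialects and are well suited to AI application scenarios such as TTS and ASR. Get your AI project started quickly and easily with Appen's high-quality datasets and solutions spanning the entire AI lifecycle!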