Work Description

Title: Chatino Speech Corpus Archive Dataset (Open Access; Deposited)

https://creativecommons.org/licenses/by-sa/4.0/
Abstract
  • This dataset results from experiments in building speech technologies for documenting low-resourced or endangered languages. The language chosen for creating the speech corpus and training forced-alignment tools is Eastern Chatino, an unwritten and low-resourced language from Oaxaca, Mexico. To our knowledge, this is the first such resource available under a free Creative Commons license.
Description
  • This zip file contains WAV audio files and annotations. The recordings were made with a ZOOM H6 digital audio recorder and can be played in any audio software that supports WAV files. The annotations can be viewed and edited in ELAN (https://tla.mpi.nl/tools/tla-tools/elan/), a professional tool for creating complex annotations of video and audio resources. Download the dataset using the link below.
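For readers who prefer to inspect the files programmatically rather than in ELAN, the following is a minimal Python sketch using only the standard library. It assumes the annotations are stored in ELAN's XML-based .eaf format and that the recordings are uncompressed PCM WAV files; neither detail is stated in this record. The function names wav_duration_seconds and read_eaf_annotations are illustrative and are not part of the dataset or of ELAN.

```python
import wave
import xml.etree.ElementTree as ET


def wav_duration_seconds(source):
    """Return the duration of a PCM WAV file (path or file object) in seconds."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def read_eaf_annotations(source):
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation.

    `source` is a path or binary file object for an ELAN .eaf file
    (assumption: the dataset's annotations use this format).
    """
    root = ET.parse(source).getroot()
    # Each TIME_SLOT maps an ID to a millisecond offset.
    # Slots without a TIME_VALUE are unaligned and are skipped here.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows
```

Calling read_eaf_annotations on an extracted annotation file should yield one tuple per aligned segment, and wav_duration_seconds on the matching recording gives a quick check that the annotation end times fall within the audio. Annotations that refer to other annotations instead of time slots (ELAN's reference annotations) are not covered by this sketch.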
Depositor
Citations to related material
Resource type
Last modified
  • 2024-06-20
Language
Subject
License
License Comments
  • This data is licensed for reuse under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
Date issued
  • 2016-10-10
To Cite this Work:
Chatino Speech Corpus Archive Dataset [Data set]. Indiana University - DataCORE.

Relationships

Files (Count: 3; Size: 1.72 KB)
