Humanities Data

Description

The real-world speech data set consists of 3686 segments of English speech spoken with different accents. This dataset is provided by the Speech Processing Group at Brno University of Technology, Czech Republic. The majority data corresponds to American accent and only 1.65% corresponds to one of seven other accents (these are referred to as outliers). The speech segments are represented by 400-dimensional so called i-vectors which are widely used state-of-the-art features for speaker and language recognition. Learing Outlier Ensembles: The Best of Both Worlds – Supervised and Unsupervised. Barbora Micenkova, Brian McWilliams, and Ira Assent, KDD ODD2 Workshop, 2014.

Resource Fields

Resource Type:

dataset

Submitted By:

Eva Bacas and Matt Lavin

Date Submitted:

2020-04-24 14:54:12

Access URL:

http://odds.cs.stonybrook.edu/speech-dataset/

Project Open Data Required Fields (version 1.1)

Modified

[No data]

Publisher

[No data]

Contact Name

[No data]

Unique Identifier

[No data]

Public Access Level

[No data]

Project Open Data Additional Fields (version 1.0)

Contact email

[No Data]

Endpoint

[No Data]

Format

mat

Project Open Data Required-if-Applicable Fields (version 1.1)

Access Level Comment

[No Data]

Bureau Code

[No Data]

Program Code

[No Data]

License

[No Data]

Rights

Learing Outlier Ensembles: The Best of Both Worlds – Supervised and Unsupervised. Barbora Micenkova, Brian McWilliams, and Ira A

Spatial

[No Data]

Temporal

[No Data]

Speech dataset

Description

Resource Fields

Tags

Project Open Data Required Fields (version 1.1)

Project Open Data Additional Fields (version 1.0)

Project Open Data Required-if-Applicable Fields (version 1.1)