The Rodrigo Corpus

Description

The Rodrigo corpus was obtained from the digitisation of the book “Historia de España del arçobispo Don Rodrigo”, written in ancient Spanish in 1545. It is a single-writer book in which most pages consist of a single block of well-separated lines of calligraphic text. The dataset is freely available for research purposes. It contains 15,010 images of text lines with their paleographic transcriptions, divided into three partitions: 9,000 text lines for training, 1,000 for validation, and 5,010 for testing.

Resource Fields

Resource Type:

dataset

Submitted By:

Eva Bacas and Matt Lavin

Date Submitted:

2020-04-24 14:54:12


Project Open Data Required Fields (version 1.1)

Modified

[No Data]

Publisher

[No Data]

Contact Name

[No Data]

Unique Identifier

[No Data]

Public Access Level

[No Data]

Project Open Data Additional Fields (version 1.0)

Contact email

[No Data]

Endpoint

[No Data]

Format

png, txt

Project Open Data Required-if-Applicable Fields (version 1.1)

Access Level Comment

[No Data]

Bureau Code

[No Data]

Program Code

[No Data]

License

[No Data]

Rights

Granell, Emilio; Martínez-Hinarejos, Carlos-D. (2018): The Rodrigo corpus. Zenodo. Dataset. https://doi.org/10.5281/zenodo.14900

Spatial

[No Data]

Temporal

[No Data]