Glottography

Glottography is a collection of datasets providing information about the spatial extent of speaker areas, i.e., the areas where particular languages are spoken.

Each dataset is derived from geographic maps published in a source publication as follows:

  1. The shapes depicted in the source maps are turned into GIS vector objects.
  2. The metadata provided in the source publication is used to match each shape to a languoid in the Glottolog catalog.

Note

Multiple features in the source may be mapped to the same Glottolog languoid.
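
The outcome of these two steps can be sketched as follows. This is only an illustration, assuming each digitized shape is stored as a GeoJSON feature whose properties carry the matched Glottocode. The property names `name` and `glottocode`, the coordinates, and the Glottocode values are hypothetical and not taken from any Glottography dataset.

```python
import json
from collections import defaultdict

# Hypothetical digitized shapes. Property names and Glottocodes are
# placeholders chosen for this sketch, not the project's actual schema.
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Northern area", "glottocode": "aaaa1111"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"name": "Southern area", "glottocode": "aaaa1111"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, -2], [1, -2], [1, -1], [0, -2]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"name": "Island area", "glottocode": "bbbb2222"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[5, 5], [6, 5], [6, 6], [5, 5]]],
            },
        },
    ],
}


def features_by_languoid(feature_collection):
    """Group feature names by the Glottocode they were matched to."""
    grouped = defaultdict(list)
    for feature in feature_collection["features"]:
        props = feature["properties"]
        grouped[props["glottocode"]].append(props["name"])
    return dict(grouped)


grouped = features_by_languoid(collection)
print(json.dumps(grouped, indent=2))
```

Here two source features share the placeholder Glottocode `aaaa1111`, which is the many-to-one situation the note describes.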

To provide interoperable data for cross-linguistic analyses, this information is packaged as a CLDF dataset (in the cldf subdirectory of each dataset), structured as follows:

  • The metadata for each shape (a feature, in GeoJSON terminology) is listed in a ContributionTable, with a link to the GeoJSON file containing the shape.
  • A LanguageTable lists the language-level and family-level languoids for which the dataset provides information, with links to the source features that were aggregated. A languoid is covered if features are mapped to it directly or to its "parts", i.e., dialects or sub-groups.
  • Three sets of geo-data:
    • The features as depicted in the source publication.
    • Aggregated language-level speaker areas.
    • Aggregated family-level speaker areas.
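
How the two tables relate can be sketched with plain CSV handling. This is a rough illustration under stated assumptions: the column names `GeoJSON_File`, `Level`, and `Feature_IDs`, the space-separated link format, and all row values are hypothetical. The actual columns are defined by each dataset's CLDF metadata.

```python
import csv
import io

# Hypothetical ContributionTable: one row per source feature.
CONTRIBUTIONS_CSV = """ID,Name,GeoJSON_File
f1,Northern area,features.geojson
f2,Southern area,features.geojson
f3,Island area,features.geojson
"""

# Hypothetical LanguageTable: one row per language- or family-level
# languoid, with space-separated IDs of the aggregated source features.
LANGUAGES_CSV = """ID,Name,Level,Feature_IDs
aaaa1111,Language A,language,f1 f2
bbbb2222,Family B,family,f1 f2 f3
"""


def read_table(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def source_features(languages, contributions):
    """Map each languoid ID to the names of its aggregated features."""
    names = {row["ID"]: row["Name"] for row in contributions}
    return {
        lang["ID"]: [names[fid] for fid in lang["Feature_IDs"].split()]
        for lang in languages
    }


linked = source_features(
    read_table(LANGUAGES_CSV), read_table(CONTRIBUTIONS_CSV)
)
for languoid_id, feature_names in linked.items():
    print(languoid_id, "<-", ", ".join(feature_names))
```

In this sketch the family-level row links to every feature of its member language plus one more, mirroring how family-level areas aggregate the features mapped to their parts.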

Reviewed and released datasets are published in the Glottography community on Zenodo.

To suggest new datasets, open an issue at https://github.com/Glottography/.github/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22new%20dataset%22

Pinned repositories

  1. tutorials (TeX)
  2. pyglottography (Python): Programmatic curation of Glottography datasets
