Datasets made available to the public.

| Name | Date | URL | Tag |
|---|---|---|---|
| South African State Capture Commission Transcripts - Zondo Commission | Oct, 2022 | GitHub | NLP |
| IsiZulu News (articles and headlines) and Siswati News (headlines) Corpora | Oct, 2022 | GitHub | NLP |
| South African Disinformation [Fake News] Website Data - 2020 | Jul, 2021 | Zenodo | NLP |
| Loughran McDonald-SA-2020 Sentiment Word List | Jul, 2021 | UP repository | NLP |
| Umsuka English - isiZulu Parallel Corpus | Jun, 2021 | Zenodo | NLP |
| WordNets for South African Languages | Dec, 2020 | Zenodo | NLP |
| covid19za | Mar, 2020 | GitHub, Zenodo | Soc |
| African Pre-Trained Embeddings | Feb, 2020 | Zenodo | NLP |
| South African News Data | Feb, 2020 | Zenodo | NLP |