Finish MVP for transliterations

code is reasonably well-written
checked that we get seemingly good data back
adding README
adding data
2020-03-24 22:06:08 -07:00
parent 308d462e76
commit 36167295ec
10 changed files with 5828 additions and 23 deletions


@@ -0,0 +1,3 @@
# Transliterations
This part of the project collects transliterations of key phrases related to COVID-19 using Wikidata. We search the Wikidata API for entities in `src/wikidata_search.py`, then run simple SPARQL queries in `src/wikidata_transliterations.py` to collect the labels and aliases of those entities. The labels come with language metadata. This seems to provide a decent initial list of relevant terms across multiple languages.
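
The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code in `src/`: the function names (`build_search_params`, `build_terms_query`, `parse_terms`) are hypothetical, and the sketch only builds the request and parses a response, so it runs without network access. It assumes the search step uses the `wbsearchentities` action of the Wikidata API and that labels and aliases are read via `rdfs:label` and `skos:altLabel` on the Wikidata Query Service.

```python
import re

# Public Wikidata endpoints (the scripts in src/ may configure these differently).
WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_search_params(phrase, language="en", limit=5):
    """Query parameters for an entity search (hypothetical helper)."""
    return {
        "action": "wbsearchentities",
        "search": phrase,
        "language": language,
        "limit": limit,
        "format": "json",
    }


def build_terms_query(qid):
    """SPARQL query returning every label and alias of one entity."""
    # Validate the ID so arbitrary text cannot be spliced into the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata entity ID: {qid!r}")
    return (
        "SELECT ?term ?kind WHERE {\n"
        f"  {{ wd:{qid} rdfs:label ?term . BIND('label' AS ?kind) }}\n"
        "  UNION\n"
        f"  {{ wd:{qid} skos:altLabel ?term . BIND('alias' AS ?kind) }}\n"
        "}"
    )


def parse_terms(sparql_json):
    """Turn a SPARQL JSON result into sorted (language, kind, term) tuples."""
    rows = set()
    for binding in sparql_json["results"]["bindings"]:
        term = binding["term"]
        # Language-tagged literals carry their language code under "xml:lang".
        rows.add((term.get("xml:lang", ""), binding["kind"]["value"], term["value"]))
    return sorted(rows)
```

To run this against the live services, one option is to send `build_search_params(...)` as the query string of a GET request to `WIKIDATA_API`, and the query text as the `query` parameter (with `format=json`) to `SPARQL_ENDPOINT`. The language code in each tuple is the "language metadata" mentioned above.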