covid19/keywords
analysis rename 'transliterations' to 'keywords' 2020-03-31 15:15:01 -07:00
output use 'item' instead of 'entity' 2020-03-31 15:34:34 -07:00
resources rename 'transliterations' to 'keywords' 2020-03-31 15:15:01 -07:00
src use 'item' instead of 'entity' 2020-03-31 15:34:34 -07:00
README.md Improve README.md for keywords 2020-03-31 15:25:51 -07:00
requirements.txt rename 'transliterations' to 'keywords' 2020-03-31 15:15:01 -07:00

Keywords

This code finds trending web searches related to the COVID-19 pandemic using Google Trends (collect_trends.py). It then searches Wikidata for the relevant keywords (wikidata_search.py) in order to find high-quality translations of important words and phrases (wikidata_translations.py). The goal is to support efforts to expand the Observatory to cover information in many languages beyond English.

We search the Wikidata API for entities in src/wikidata_search.py, and then make simple SPARQL queries in src/wikidata_translations.py to collect labels and aliases for those entities. The labels come with language metadata. This seems to provide a decent initial list of relevant terms across multiple languages.
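
The two steps can be sketched roughly as follows. This is an illustrative sketch, not the code in src/: the function names, the `USER_AGENT` string, and the shape of the SPARQL query are assumptions made for the example. It relies on the public Wikidata `wbsearchentities` API action and the Wikidata Query Service endpoint, and uses only the Python standard library.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
# Hypothetical identifier for this example; use your own contact details.
USER_AGENT = "keywords-example/0.1 (illustrative sketch)"


def search_url(term, language="en", limit=5):
    """Build a wbsearchentities request URL for one search term."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def labels_query(entity_ids):
    """Build a SPARQL query returning labels and aliases for the entities."""
    values = " ".join("wd:" + qid for qid in entity_ids)
    return (
        "SELECT ?item ?term ?kind WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  { ?item rdfs:label ?term . BIND('label' AS ?kind) }\n"
        "  UNION\n"
        "  { ?item skos:altLabel ?term . BIND('alias' AS ?kind) }\n"
        "}"
    )


def parse_bindings(result):
    """Flatten SPARQL JSON results into rows that keep the language tag."""
    rows = []
    for binding in result["results"]["bindings"]:
        rows.append({
            # The item comes back as a full URI; keep only the trailing Q-id.
            "item": binding["item"]["value"].rsplit("/", 1)[-1],
            "term": binding["term"]["value"],
            "language": binding["term"].get("xml:lang", ""),
            "kind": binding["kind"]["value"],
        })
    return rows


def fetch_json(url):
    """GET a URL and decode the JSON body (performs a network request)."""
    request = urllib.request.Request(
        url,
        headers={
            "User-Agent": USER_AGENT,
            "Accept": "application/sparql-results+json, application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A run might call `fetch_json(search_url("coronavirus"))`, read the entity IDs from the `search` list in the response, and then pass `SPARQL_ENDPOINT + "?" + urllib.parse.urlencode({"query": labels_query(ids)})` to `fetch_json` before handing the result to `parse_bindings`. Keeping the URL and query builders separate from the network call makes them easy to check offline.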

The output data lives at covid19.communitydata.science.