38fdd07b39
- Renamed articles.txt to something more specific. Changes to both scripts: - Updated filenames to match the new standard. - Reworked the logging code so that it writes to stderr by default. Because logging.basicConfig() can only be called once, this ended up being a bigger change. - Made the scripts output git commits and export to track which code produced which dataset. - Made the programs take files instead of directories as output (this allows us to run the programs more than once a day). Changes to wikipedia_views/scripts/fetch_daily_views.py: - Changed the output to a sequence of JSON dictionaries (one per line), per the standard we agreed on, which is also what Twitter, GitHub, and other dumps do. The previous behavior was to output a single JSON list object. - A number of other small changes and tweaks throughout. |
||
---|---|---|
.. | ||
fetch_daily_views.py | ||
wikiproject_scraper.py |
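
The two main changes described in the commit message (logging to stderr via a single logging.basicConfig() call, and writing one JSON dictionary per line instead of a single JSON list) can be sketched roughly as follows. This is an illustrative sketch only, not the code in fetch_daily_views.py: the function names `configure_logging` and `write_json_lines` and the record fields `article` and `views` are hypothetical.

```python
import io
import json
import logging
import sys


def configure_logging(stream=None, level=logging.INFO):
    """Configure the root logger once; writes to stderr by default.

    logging.basicConfig() does nothing if the root logger already has
    handlers, so this should be called a single time, early in the script.
    (Hypothetical helper, not taken from the repository.)
    """
    logging.basicConfig(
        stream=stream if stream is not None else sys.stderr,
        level=level,
        format="%(asctime)s %(levelname)s %(message)s",
    )


def write_json_lines(records, outfile):
    """Write each record as one JSON dictionary per line; return the count.

    `outfile` is an open text file object, so the caller chooses the
    output file rather than a directory. (Hypothetical helper.)
    """
    count = 0
    for record in records:
        outfile.write(json.dumps(record, sort_keys=True) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    configure_logging()
    # Made-up sample records; the field names are assumptions.
    sample = [
        {"article": "Example_A", "views": 10},
        {"article": "Example_B", "views": 20},
    ]
    buffer = io.StringIO()
    written = write_json_lines(sample, buffer)
    logging.info("wrote %d records", written)
    sys.stdout.write(buffer.getvalue())
```

Because log messages go to stderr and data goes to the output file (stdout in this demo), the two streams stay separate, and a line-per-record file can be read back one `json.loads()` call per line without loading the whole dataset.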