Nate E TeBlunthuis 7d0e020f9d update .gitignore
2020-07-07 12:28:44 -07:00
examples update examples with working streaming 2020-07-07 11:47:17 -07:00
.gitignore update .gitignore 2020-07-07 12:28:44 -07:00
check_comments_shas.py Check the shas when we download dumps 2020-07-06 23:31:52 -07:00
check_submission_shas.py Script for checking shas for submissions. 2020-07-03 13:35:46 -07:00
comments_2_parquet_part1.py Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
comments_2_parquet_part2.py Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
comments_2_parquet.sh Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
helper.py Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
pull_pushshift_comments.sh Check the shas when we download dumps 2020-07-06 23:31:52 -07:00
pull_pushshift_submissions.sh Check the shas when we download dumps 2020-07-06 23:31:52 -07:00
submissions_2_parquet_part1.py Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
submissions_2_parquet_part2.py Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00
submissions_2_parquet.sh Build comments dataset similarly to submissions and improve partitioning scheme 2020-07-07 11:45:43 -07:00