examples                        Script to demonstrate reading parquet.                              2020-07-07 00:51:40 -07:00
.gitignore                      Script to run both parts of submissions_2_parquet.sh               2020-07-06 23:27:18 -07:00
check_comments_shas.py          Check the shas when we download dumps                               2020-07-06 23:31:52 -07:00
check_submission_shas.py        Script for checking shas for submissions.                           2020-07-03 13:35:46 -07:00
comments_2_parquet.py           Cache before sorting so we don't extract twice.                     2020-07-06 22:30:04 -07:00
pull_pushshift_comments.sh      Check the shas when we download dumps                               2020-07-06 23:31:52 -07:00
pull_pushshift_submissions.sh   Check the shas when we download dumps                               2020-07-06 23:31:52 -07:00
submissions_2_parquet_part1.py  Move the spark part of submissions_2_parquet to a separate script.  2020-07-06 22:27:34 -07:00
submissions_2_parquet_part2.py  Move the spark part of submissions_2_parquet to a separate script.  2020-07-06 22:27:34 -07:00
submissions_2_parquet.sh        Script to run both parts of submissions_2_parquet.sh               2020-07-06 23:27:18 -07:00