Commit Graph

21 Commits

Author SHA1 Message Date
02043006e0 updates to scripts 2025-01-15 14:42:48 -06:00
75729e27ad spark updates 2025-01-09 15:40:47 -06:00
092306d777 fixing spark for querying data 2025-01-08 11:54:02 -06:00
8c934d93c5 working on bringing in activity data 2025-01-07 13:08:28 -06:00
83e668bfe5 updating scripts 2025-01-06 12:22:42 -06:00
60dfe7e0a2 worked on phab and wiki activity data collection 2025-01-03 19:54:07 -06:00
ab771e25b7 first parse of tech news files 2024-12-27 13:52:24 -06:00
9510f65255 data cleaning for the raw wikitext conversation dumps 2024-12-19 13:06:16 -06:00
80d12c0a1f phabricator script 2024-12-18 16:55:11 -06:00
db6b140748 pivot for collecting dump data 2024-12-17 15:52:29 -06:00
4eb0b70608 data collection script 2024-12-16 17:22:44 -06:00
mgaughan
a30708557f skeleton script for commit data collection 2024-12-12 14:41:14 -06:00
3c04d1ced5 wip getting wmf spark up, on ice for now 2024-12-12 13:34:46 -06:00
6cb3296e8e getting kibo set for data collection 2024-12-11 17:20:26 -06:00
mgaughan
b0dfcf3a13 more outlining for phabricator collection 2024-12-10 11:58:28 -06:00
mgaughan
6a7ed34089 outline of the wiki event collection 2024-12-10 11:14:40 -06:00
mgaughan
b08e889da6 init work to get wiki activity data 2024-12-10 10:50:11 -06:00
mgaughan
b57e4c15a3 first draft of conversation parsing for wiki talk pages 2024-12-09 14:56:31 -06:00
mgaughan
aa30d02c48 initial jamming on parsing through talk page discussions 2024-12-08 17:38:31 -06:00
mgaughan
4aa3433953 basic stems at importing scraping libraries 2024-12-04 17:22:24 -06:00
mgaughan
288b0d9bae initial commit 2024-12-04 16:39:16 -06:00