-
6f2858dd72
updating for new bivariate plots
main
Matthew Gaughan
2025-11-03 10:04:42 -0800
-
2efd961fed
adding trial survival test and more information about adac variables
Matthew Gaughan
2025-10-27 17:54:14 -0700
-
ab1cb3efea
updated DSL data aggregation
Matthew Gaughan
2025-10-27 10:28:08 -0700
-
e955b4f50f
adding some analysis of modal terms and olmo labels
Matthew Gaughan
2025-10-24 14:10:49 -0700
-
e5ca779900
unified new data and cleaned project directory
Matthew Gaughan
2025-10-24 09:03:54 -0700
-
d6965a33cb
new batched OLMO labels
mgaughan
2025-10-24 10:03:28 -0500
-
0ed72af495
add scripts for other aggregation and merge tasks
Matthew Gaughan
2025-10-23 13:50:27 -0700
-
e3748fa55f
updating collation scripts, more work TODO
Matthew Gaughan
2025-10-21 19:41:36 -0700
-
90311ca136
updating with new human labels
Matthew Gaughan
2025-10-21 15:19:13 -0700
-
b198781aa0
updating some of the scripts for PCA analysis
Matthew Gaughan
2025-10-20 11:09:04 -0700
-
f146016eac
re-done total pca
mgaughan
2025-10-20 12:38:44 -0500
-
2e8b85d3e9
removing erroneous PCA df, going to re-run
Matthew Gaughan
2025-10-20 10:31:54 -0700
-
bf4bc88083
running PCA across both description and reply comment types
mgaughan
2025-10-20 11:30:38 -0500
-
c40e87ff80
updating the repo, cleaning up misc. printout
Matthew Gaughan
2025-10-20 09:13:48 -0700
-
d86233abca
updated PCA analysis
Matthew Gaughan
2025-10-15 10:45:29 -0700
-
0843685707
final run of olmo sentence categorization
mgaughan
2025-10-15 09:51:33 -0500
-
f60f3ef120
updating PCA to account for sentence count and median length
mgaughan
2025-10-14 23:15:14 -0500
-
cb2fe737cd
updating batching script, preparing for run
mgaughan
2025-10-11 07:38:11 -0500
-
186a26f261
backing up renewed PCA analysis
Matthew Gaughan
2025-10-08 14:55:31 -0700
-
840b32a2e4
simple bivariate plots to look at variance, or lack thereof.
Matthew Gaughan
2025-10-07 15:00:59 -0700
-
6fb1801b2a
updating with basic seniority and affiliation data
Matthew Gaughan
2025-10-06 13:55:03 -0700
-
b982973f37
updating human sampling
Matthew Gaughan
2025-10-06 09:37:06 -0700
-
a14b08cfd8
pulling sample for human_labeling
Matthew Gaughan
2025-10-06 09:14:00 -0700
-
83bcc15811
updated with new outcome variable
Matthew Gaughan
2025-10-03 12:01:37 -0700
-
5f157ef532
some updates to PCA
Matthew Gaughan
2025-10-02 09:22:36 -0700
-
7f89fd1966
updated PCA analysis, ready for rob tomorrow
Matthew Gaughan
2025-10-01 20:58:55 -0700
-
f636969541
updated PCA results with dropped rows
mgaughan
2025-10-01 21:28:12 -0500
-
e61d3b6599
updating with DSL power analysis
Matthew Gaughan
2025-09-30 20:17:09 -0700
-
b7c2c9fcd6
unifying current data and some repo cleaning
Matthew Gaughan
2025-09-29 14:10:39 -0700
-
acd8964e73
preliminary EDA on the PCA analysis
Matthew Gaughan
2025-09-25 14:09:39 -0700
-
b21ecb02c3
running PCA on subcomment values, adding new plot for closed_relevance
mgaughan
2025-09-25 10:11:47 -0500
-
e29d4bf59c
cleaning working directory and re-running PCA with final neurobiber vectors
mgaughan
2025-09-25 09:48:23 -0500
-
9d1359af36
updating biberplus and olmo_batched results
mgaughan
2025-09-25 09:20:40 -0500
-
265b930578
updating library to account for re-running PCA
mgaughan
2025-09-23 16:41:32 -0500
-
032975c4f0
updating to collect new batch job labels
mgaughan
2025-09-23 15:09:45 -0500
-
b4f0c8f885
trying to sample the human label rows again
mgaughan
2025-09-22 20:34:31 -0500
-
bcfa688e11
olmo batched for getting the title in there too, i think
mgaughan
2025-09-22 19:18:11 -0500
-
e2413ed955
update to gerrit metadata extraction regex
Matthew Gaughan
2025-09-16 11:37:46 -0700
-
bb67fea96b
hopefully last update to human sampling
mgaughan
2025-09-16 12:16:10 -0500
-
89969daab5
updating labeling sample to be, uh, correct
mgaughan
2025-09-16 11:43:28 -0500
-
d83022f184
sampled comments for human labeling
mgaughan
2025-09-16 11:22:45 -0500
-
f68372572f
updating some scripts
mgaughan
2025-09-14 11:14:16 -0500
-
f9c12bb445
shelving some of the merge work for now
Matthew Gaughan
2025-09-14 09:11:33 -0700
-
77fc3ec541
preparing DSL modeling, looking at OLMO category data
Matthew Gaughan
2025-09-07 13:21:45 -0700
-
99c702fe20
adding batched OLMO results
mgaughan
2025-09-07 11:10:31 -0500
-
6de62f2447
some neurobiber PCA analysis
Matthew Gaughan
2025-09-05 14:59:07 -0700
-
a96fd6db2f
updates and re-running the batched olmo categorization
mgaughan
2025-09-05 13:43:00 -0500
-
f2afb7c981
should be updated and refined pca analysis
mgaughan
2025-09-04 15:47:11 -0500
-
a770d9c668
looking at kpca
mgaughan
2025-09-04 14:30:34 -0500
-
5d4df28f94
backing up the morning before taking a few meetings
mgaughan
2025-09-04 11:21:07 -0500
-
6a5f07872d
looking at subcomment authorship
mgaughan
2025-09-04 11:13:31 -0500
-
0e569ac714
comment_type PCA
mgaughan
2025-09-04 10:57:11 -0500
-
5be22d3bfb
looking at ticket status
mgaughan
2025-09-04 10:46:33 -0500
-
68c95cdb8a
trying to look for the pca, with more specificity
mgaughan
2025-09-04 10:37:57 -0500
-
ccf434db38
looking for new phase pca
mgaughan
2025-09-04 10:25:00 -0500
-
809e858bbf
updating with new pca results
mgaughan
2025-09-04 10:12:34 -0500
-
a3c1a48dc7
trying to run olmo cat distributed, also running kernelPCA.
mgaughan
2025-09-04 09:35:41 -0500
-
a36226eab9
trying to look at the pca_plot 3
mgaughan
2025-09-02 16:04:06 -0500
-
dc23065cc8
trying to look at the pca_plot 2
mgaughan
2025-09-02 15:55:27 -0500
-
b8c12f987b
trying to look at the pca_plot 1
mgaughan
2025-09-02 15:50:47 -0500
-
d97b6e141c
trying to look at the pca_plot 0
mgaughan
2025-09-02 15:37:07 -0500
-
89105b7660
first pass at implementing pca for the style vectors
mgaughan
2025-09-02 15:30:50 -0500
-
b714e8dedb
updates to new script, I guess
Matthew Gaughan
2025-09-02 12:32:41 -0700
-
2d396ceb26
scaffolding out some work TODO on getting the olmo categories to be sentence-level
mgaughan
2025-09-02 12:48:11 -0500
-
53775c51db
removing stale todo list
mgaughan
2025-09-02 12:36:16 -0500
-
1c709f9a69
updating with gerrit information now
Matthew Gaughan
2025-08-07 19:03:20 -0700
-
41de0cbc7a
drop the labels from the FOSSY closed by plot
Matthew Gaughan
2025-07-31 21:51:51 -0700
-
7232f095e0
update to FOSSY tasks resolved plot
Matthew Gaughan
2025-07-31 21:49:32 -0700
-
34c376dbc3
FOSSY resolution share
Matthew Gaughan
2025-07-31 21:47:24 -0700
-
5239a8458a
adding renewed FOSSY heatmap
Matthew Gaughan
2025-07-31 18:30:23 -0700
-
822103ec3a
new task plot
Matthew Gaughan
2025-07-29 14:48:52 -0700
-
b624109f8d
updating with new heatmap for FOSSY presentation
Matthew Gaughan
2025-07-29 14:25:19 -0700
-
c5966518ef
updating similarity vectors
Matthew Gaughan
2025-07-29 13:38:50 -0700
-
23ef7acd01
updating 072525 biberplus labels to reflect that they have been pre-processed
mgaughan
2025-07-29 13:03:46 -0500
-
3e21ac1bb7
updating with OLMO-generated classifications
mgaughan
2025-07-28 17:09:23 -0500
-
9e4c05e347
almost done with the classification task
mgaughan
2025-07-25 15:37:32 -0500
-
862643d5df
building out olmo classification pipeline
mgaughan
2025-07-25 14:18:27 -0500
-
a08a49d04e
adding in analysis of biberplus vectors
Matthew Gaughan
2025-07-23 14:22:20 -0700
-
b0584ec1be
adding biberplus labels
mgaughan
2025-07-23 15:20:26 -0500
-
edd17d3269
updating with biberplus implementation, though not quite solved yet
mgaughan
2025-07-22 16:44:07 -0500
-
2e0665488c
updating with dbscan clustering etc.
Matthew Gaughan
2025-07-16 14:03:51 -0700
-
90e69975d2
preliminary EDA around neurobiber
Matthew Gaughan
2025-07-15 15:15:01 -0700
-
43fb346318
updated the labels to try to store in a better format
mgaughan
2025-07-15 14:17:46 -0500
-
7e8fb1982b
updating with tentative neurobiber labels, need to verify outputs
mgaughan
2025-07-14 15:38:23 -0500
-
c4dd45e344
saving cleaned, unified csv for text modeling
Matthew Gaughan
2025-07-14 08:19:11 -0700
-
8f2409feb0
updating with some structure for discussion analysis stuff
mgaughan
2025-07-11 16:13:26 -0500
-
68ec9c75f6
restructuring the repo for the second phase
mgaughan
2025-07-11 15:14:24 -0500
-
55964c754b
updating with new EDA
Matthew Gaughan
2025-07-07 13:08:58 -0700
-
067fd08dd4
some tidying up following m2 figure creation, more needed
Matthew Gaughan
2025-07-07 10:44:33 -0700
-
7966e92125
updating m2 plots with free y axis
Matthew Gaughan
2025-07-05 14:13:12 -0700
-
6c477f3d49
updating with c3 viz for dependency depth
Matthew Gaughan
2025-07-01 15:58:05 -0700
-
a4d8685c13
updating with new sampling approach for c2 and c3
Matthew Gaughan
2025-07-01 07:45:13 -0700
-
edcb174d42
updated preliminary phabricator EDA with things re: longitudinal data
Matthew Gaughan
2025-06-30 11:30:27 -0700
-
2af7983fdb
cleaned data and updated with some preliminary panel groupings, more longitudinal EDA needed
Matthew Gaughan
2025-06-27 14:01:23 -0700
-
ab1fe8e051
cross-sectional EDA for phase 2 of the project, need to make it longitudinal
Matthew Gaughan
2025-06-23 14:37:15 -0700
-
fd1479775d
updated work for some m2 writing tomorrow
Matthew Gaughan
2025-05-24 16:59:03 -0700
-
3573afbc1a
reorganizing
Matthew Gaughan
2025-05-18 16:50:20 -0700
-
ee31544b15
updating for c2 updates
Matthew Gaughan
2025-05-12 09:44:59 -0700
-
7a28e0e079
updating things for new key term searches across phabricator tasks
Matthew Gaughan
2025-05-08 13:49:50 -0700
-
c3ef44a402
updating bot framework commit plotting, showing announcement
Matthew Gaughan
2025-05-01 21:47:00 -0700