Commit Graph

57 Commits

Author SHA1 Message Date
mgaughan
6fabfdd2ab 34/34 for pass 1 2025-07-29 22:56:09 -04:00
mgaughan
f387493d42 33/34 2025-07-29 20:32:00 -04:00
mgaughan
75e9b1602c 32/34 2025-07-29 10:58:48 -04:00
mgaughan
5648edec4c 31/34 2025-07-28 11:33:07 -04:00
mgaughan
80abfd40fa 30/34 2025-07-27 23:44:47 -04:00
mgaughan
036d42e17e 29/34 2025-07-25 13:01:32 -04:00
mgaughan
3fbc12447f 28/34 2025-07-25 00:49:51 -04:00
mgaughan
cd2a10e621 27/34 2025-07-24 14:30:56 -04:00
mgaughan
72d3e10aea 26/34 2025-07-24 12:22:02 -04:00
mgaughan
7d0b1339fc bertopic information added 2025-07-23 17:18:33 -05:00
mgaughan
28cad5a5fd 25/34 2025-07-23 12:57:24 -04:00
mgaughan
0b8e944a41 24/34 2025-07-22 11:50:50 -04:00
mgaughan
154eadda62 22/34 papers done 2025-07-21 10:12:25 -04:00
mgaughan
2513e718f3 updating with 20/34 2025-07-19 15:57:42 -04:00
mgaughan
21d533d6f1 18/34 2025-07-18 17:30:58 -04:00
mgaughan
c62ff32268 two more papers, up to 15/34 2025-07-16 12:19:39 -04:00
mgaughan
6a354a2914 13/34 2025-07-15 20:19:21 -04:00
mgaughan
a4e0efa955 12/34 2025-07-15 14:11:32 -04:00
mgaughan
6aecc5f5d8 11/34 coded 2025-07-14 11:04:19 -04:00
mgaughan
c37ed9aaa8 10/34 2025-07-11 15:49:28 -04:00
mgaughan
f4cc5b26c5 9/34 2025-07-11 11:19:13 -04:00
mgaughan
d5147f62be 8/34 2025-07-10 22:17:16 -04:00
mgaughan
a7cc04b68a 7/34 papers 2025-07-10 17:22:49 -04:00
mgaughan
08e88c0680 6/34 papers 2025-07-10 15:05:01 -04:00
mgaughan
d24df6c6a4 updating with 5/34 papers 2025-07-10 12:01:08 -04:00
mgaughan
5c664d4736 updating with 4/34 papers first pass 2025-07-09 23:32:12 -04:00
mgaughan
3a4c92642c 1/34 paper done for analysis 2025-07-08 17:11:23 -04:00
mgaughan
0ee2b41792 adding csv to track doc reading/analysis 2025-07-07 22:56:07 -04:00
mgaughan
7141d0d9ad updating with fit topic model 2025-06-25 13:20:18 -05:00
mgaughan
609039e5cc updating with bertopic script and correct norskov 2025-06-24 10:33:03 -05:00
mgaughan
4e3462b35b reorganizing and updating norskov pdf 2025-06-24 09:26:04 -04:00
mgaughan
0fc36abcaf updating with the text/mkdwn version of the pdf studies 2025-06-24 08:14:03 -05:00
mgaughan
18e2fb1e77 updating the quest version with new ocr stuff 2025-06-23 21:52:47 -05:00
mgaughan
9eb4624cc5 adding the studies identified through the snowball sampling 2025-06-19 14:54:23 -04:00
mgaughan
8a51860c09 updating with sample expansion snowballing technique 2025-06-09 16:16:22 -05:00
mgaughan
6c455c88eb deleting unneeded sankey script 2025-06-09 16:05:22 -05:00
mgaughan
c50a3b57ff updating new git organization to remove sif file 2025-06-03 09:52:20 -05:00
mgaughan
ff8ca0b46e updating with new container, collected categorizations 2025-06-03 09:43:25 -05:00
mgaughan
c6f4a244f4 updating (and failing) to plot categorization with sankey diagram 2025-06-02 22:40:48 -05:00
mgaughan
9403c79c44 pulling new olmocr image and new categorization stuff 2025-06-02 21:24:53 -05:00
mgaughan
c5df6cb6c6 removing ill categorizations 2025-06-02 11:35:45 -05:00
mgaughan
63450ba7ef now with updated categorizations 2025-06-02 11:29:59 -05:00
mgaughan
5ed797e971 trying to get olmocr to run, updated categorization values 2025-06-02 11:27:23 -05:00
mgaughan
d8b9ca9dea updating with docker images and categorized citations 2025-06-02 09:01:18 -05:00
mgaughan
c7448f2fc2 trying to load-balance the few-shot a bit more 2025-05-30 21:45:30 -05:00
mgaughan
225d7f53c8 bad categorization data, some restructuring of the repo 2025-05-30 21:36:18 -05:00
mgaughan
9985e190e7 updated with preliminary categorization 2025-05-30 21:20:36 -05:00
mgaughan
c3bb0801a2 ~final~ update to categorization script 2025-05-30 16:39:24 -05:00
mgaughan
86e2cd3ed8 updating with manual dedup of citations 2025-05-30 16:37:03 -05:00
mgaughan
1d63537027 redoing the dedup csv, something wrong with the other one 2025-05-30 13:52:13 -05:00