Nathan TeBlunthuis | 9590e18a07 | bugfix in filling in missing | 2025-01-12 14:27:58 -08:00
Nathan TeBlunthuis | a7182ff3dc | print debugging | 2025-01-12 09:44:25 -08:00
Nathan TeBlunthuis | 31aaa03079 | add flag to run without overwriting completed parts. | 2025-01-12 09:40:52 -08:00
Nathan TeBlunthuis | fcdd2d2272 | bugfix | 2025-01-12 02:39:05 -08:00
Nathan TeBlunthuis | 9ae2d13573 | bugfix | 2025-01-12 01:17:21 -08:00
Nathan TeBlunthuis | 3792f58d15 | print debugging | 2025-01-12 01:12:05 -08:00
Nathan TeBlunthuis | a9a4a6d90b | bugfix | 2025-01-12 01:07:57 -08:00
Nathan TeBlunthuis | 1e2eeadb60 | typo fix | 2025-01-12 01:06:08 -08:00
Nathan TeBlunthuis | a9711fddf5 | set nterms based on the new database | 2025-01-12 01:03:52 -08:00
Nathan TeBlunthuis | f79eb28e31 | fix f-string | 2025-01-12 00:56:17 -08:00
Nathan TeBlunthuis | 1fa2f6c4d2 | bugfix | 2025-01-12 00:54:34 -08:00
Nathan TeBlunthuis | 8f0ce2dba7 | bugfix | 2025-01-12 00:52:38 -08:00
Nathan TeBlunthuis | 2b4cb7fdf6 | bugfix | 2025-01-12 00:49:36 -08:00
Nathan TeBlunthuis | e568ee6db7 | add parameters. | 2025-01-12 00:47:47 -08:00
Nathan TeBlunthuis | b4f9ce0ad2 | support remapping term_ids. | 2025-01-12 00:44:16 -08:00
Nathan TeBlunthuis | 72a4e686ef | bugfix | 2025-01-11 22:59:20 -08:00
Nathan TeBlunthuis | 9c6d7429b2 | fix bug. | 2025-01-11 22:46:43 -08:00
Nathan TeBlunthuis | 4c2ddc7455 | bugfix | 2025-01-11 21:50:07 -08:00
Nathan TeBlunthuis | 1453a57d68 | bugfix | 2025-01-11 21:36:48 -08:00
Nathan TeBlunthuis | 561a6704a3 | make multiproc configurable | 2025-01-11 21:21:53 -08:00
Nathan TeBlunthuis | b2f1c1342f | tweak parallelism in hopes for speed. | 2025-01-11 20:22:18 -08:00
Nathan TeBlunthuis | 4168d0d4cf | pass clusters param through | 2025-01-11 20:09:19 -08:00
Nathan TeBlunthuis | dba0faf125 | bugfix | 2025-01-11 20:02:36 -08:00
Nathan TeBlunthuis | d0f37fe33a | limit output to only the subreddits in clusters. | 2025-01-11 19:52:54 -08:00
Nathan TeBlunthuis | 9892315234 | bugfix | 2025-01-11 19:12:01 -08:00
Nathan TeBlunthuis | 17defcd163 | bugfix. | 2025-01-11 19:07:45 -08:00
Nathan TeBlunthuis | ecc50f0249 | spelling fix. | 2025-01-11 18:59:42 -08:00
Nathan TeBlunthuis | 0613193e9d | support passing in a model object. | 2025-01-11 18:59:25 -08:00
Nathan TeBlunthuis | 3c1d5df97e | add submissions to timeseries. | 2025-01-10 06:20:38 -08:00
Nathan TeBlunthuis | 81e12d1cef | bugfix. | 2024-12-31 14:41:27 -08:00
Nathan TeBlunthuis | c59d251d19 | write clusters and read with spark instead of creating data frame. | 2024-12-31 14:37:50 -08:00
Nathan TeBlunthuis | a8a86c2440 | add timeseries code | 2024-12-31 16:27:04 -06:00
Nathan TeBlunthuis | 79d1826ba4 | enforce min_df constraint in counting lsi features. | 2024-12-30 16:17:31 -08:00
Nathan TeBlunthuis | 3555542862 | use min/max df constraints in counting nterms. | 2024-12-30 16:10:50 -08:00
Nathan TeBlunthuis | a9b296dd73 | bugfix | 2024-12-28 20:18:53 -08:00
Nathan TeBlunthuis | d9db21686d | remove unnecessary isoformat | 2024-12-28 20:08:12 -08:00
Nathan TeBlunthuis | 41fea31fce | bugfix | 2024-12-28 20:04:38 -08:00
Nathan TeBlunthuis | 7aa22c7385 | bugfix | 2024-12-28 20:02:24 -08:00
Nathan TeBlunthuis | f11d4cfc72 | use static tfidf (not weekly) to create tfidf matrix | 2024-12-28 20:00:53 -08:00
Nathan TeBlunthuis | 7b5ac73b2c | use static tfidf (not weekly) to create tfidf matrix | 2024-12-28 19:58:14 -08:00
Nathan TeBlunthuis | e2e7d7dbb1 | more print debugging | 2024-12-28 19:27:42 -08:00
Nathan TeBlunthuis | c317ef6475 | debugging: print the shape | 2024-12-28 19:21:24 -08:00
Nathan TeBlunthuis | c3cce0817e | bugfix | 2024-12-28 14:31:24 -08:00
Nathan TeBlunthuis | c9464f86f7 | interface fix. | 2024-12-28 14:27:56 -08:00
Nathan TeBlunthuis | f3db4efbb1 | pass nterms as int. | 2024-12-28 14:24:24 -08:00
Nathan TeBlunthuis | 27f29e63fa | typo fix. | 2024-12-28 14:18:58 -08:00
Nathan TeBlunthuis | 3f277ad99e | pass weeks as strings. | 2024-12-28 14:10:55 -08:00
Nathan TeBlunthuis | 02ec11f726 | no longer need to convert from spark dates into isoformat. | 2024-12-28 13:55:54 -08:00
Nathan TeBlunthuis | 104b708ff6 | use duckdb not spark to prepare for weekly similarities. | 2024-12-28 13:45:17 -08:00
Nathan TeBlunthuis | 74ee86e443 | add weekly_cosine_similarities script. | 2024-12-25 21:15:38 -08:00