| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Nate E TeBlunthuis | 6a3bfa26ee | bugfix | 2021-04-26 22:31:05 -07:00 |
| Nate E TeBlunthuis | 3a758f1fc8 | Merge branch 'charliepatch' of code:cdsc_reddit into charliepatch | 2021-04-26 13:58:25 -07:00 |
| Nate E TeBlunthuis | 806cfc948f | support passing in list of tfidf vectors. Also lowercases included subreddits. | 2021-04-26 13:20:43 -07:00 |
| Nate E TeBlunthuis | 0fe120e4ab | support passing in list of tfidf vectors. Also lowercases included subreddits. | 2021-04-26 11:44:56 -07:00 |
| Nate E TeBlunthuis | 003a48aea5 | bugfix in weekly similarities | 2021-04-22 10:37:04 -07:00 |
| Nate E TeBlunthuis | ac06a8757a | calculate some user-level attributes to detect bots | 2021-04-20 11:34:36 -07:00 |
| Nate E TeBlunthuis | 01a4c35358 | grid sweep selection for clustering hyperparameters | 2021-04-20 11:33:54 -07:00 |
| Nate E TeBlunthuis | 628a70734b | Merge branch 'master' of code:cdsc_reddit | 2021-04-05 23:21:35 -07:00 |
| Nate E TeBlunthuis | f0176d9f0d | Changes for cosine similarities on klone. | 2021-04-05 23:21:06 -07:00 |
| Nate E TeBlunthuis | 36cb0a5546 | add code for pulling activity time series from parquet. | 2021-03-24 16:08:57 -07:00 |
| Nate E TeBlunthuis | 06430903f0 | add included_subreddits parameter to cosine similarities. | 2021-02-22 18:38:34 -08:00 |
| Nate E TeBlunthuis | 4dc949de5f | Changes from hyak. | 2021-02-22 16:03:48 -08:00 |
| Nate E TeBlunthuis | 140d1bdd17 | fix bug in viz. | 2021-01-27 20:26:15 -08:00 |
| Nate E TeBlunthuis | 554660275f | add visualization for 10000 subreddits based on author-tf similarities. | 2021-01-27 20:22:24 -08:00 |
| Nate E TeBlunthuis | b4dd9acbd8 | Merge branch 'master' of code:cdsc_reddit | 2021-01-27 20:09:23 -08:00 |
|  | dbe4c87f8b | add cluster selection to visualization | 2021-01-27 20:08:07 -08:00 |
| Nate E TeBlunthuis | 3155600514 | remove nsfw subs from topN | 2020-12-28 21:11:44 -08:00 |
| Nate E TeBlunthuis | 4e20dce188 | Updating to support wang-style user overlaps. | 2020-12-24 22:38:04 -08:00 |
| Nate E TeBlunthuis | 56269deee3 | Some improvements to run affinity clustering on larger dataset and compute density. | 2020-12-12 20:42:47 -08:00 |
| Nate E TeBlunthuis | e6294b5b90 | Refactor and reorganze. | 2020-12-08 17:32:20 -08:00 |
| Nate E TeBlunthuis | a60747292e | Add code for running tf-idf at the weekly level. | 2020-12-01 22:54:48 -08:00 |
|  | db5879d6c9 | refactor visualization code. | 2020-11-17 16:46:49 -08:00 |
|  | 13eb95b3b0 | Merge remote-tracking branch 'refs/remotes/origin/master' into master | 2020-11-17 16:33:14 -08:00 |
|  | 2cc897543a | git-annex in nathante@nate-x1:~/cdsc_reddit | 2020-11-17 16:33:13 -08:00 |
| Nate E TeBlunthuis | 1bf206d219 | git-annex in nathante@mox2.hyak.local:/gscratch/comdata/users/nathante/cdsc-reddit | 2020-11-17 16:31:48 -08:00 |
| Nate E TeBlunthuis | f8ff8b2d0f | Update code for clustering + tsne. | 2020-11-17 15:59:20 -08:00 |
| Nate E TeBlunthuis | 82d184d9c6 | Update code for building simlarity matrices. | 2020-11-17 12:52:48 -08:00 |
| Nate E TeBlunthuis | e794214653 | bugfix in completing tfidf similarity matrices. | 2020-11-12 11:47:53 -08:00 |
| Nate E TeBlunthuis | 220a540beb | increase learning rate. | 2020-11-11 16:58:39 -08:00 |
| Nate E TeBlunthuis | cd43a94865 | increase iterations and perplectity and early_exaggeration | 2020-11-11 16:55:39 -08:00 |
| Nate E TeBlunthuis | ca6a8f0896 | increase learning rate | 2020-11-11 16:48:41 -08:00 |
| Nate E TeBlunthuis | ed0e1a8235 | Fix bug in tsne. | 2020-11-11 16:43:41 -08:00 |
| Nate E TeBlunthuis | 6baa08889b | git-annex in nathante@mox2.hyak.local:/gscratch/comdata/users/nathante/cdsc-reddit | 2020-11-11 16:39:44 -08:00 |
| Nate E TeBlunthuis | 4447c60265 | split fitting and plotting tsne. | 2020-11-11 16:38:22 -08:00 |
|  | db53c0138a | Add file to plot related subreddits using tsne. | 2020-11-11 16:05:36 -08:00 |
| Nate E TeBlunthuis | 4c8bd14992 | Bugfix (typo) | 2020-11-10 13:38:11 -08:00 |
| Nate E TeBlunthuis | 39c581bee9 | Reuse code for term and author cosine similarity. | 2020-11-10 13:18:57 -08:00 |
| Nate E TeBlunthuis | 5632a971c6 | Refactor tfidf code to for code resuse. | 2020-11-10 13:18:19 -08:00 |
| Nate E TeBlunthuis | 772f3a8fbd | rename 'idf' files to 'tfidf' | 2020-11-10 13:16:55 -08:00 |
| Nate E TeBlunthuis | 6edd155749 | Improvements to idf code | 2020-11-10 13:12:11 -08:00 |
| Nate E TeBlunthuis | 8b8c45ee2d | Merge branch 'master' of code:cdsc_reddit | 2020-11-02 10:40:12 -08:00 |
| Nate E TeBlunthuis | 3dc17bd27c | add term_cosine_similarity.py | 2020-11-02 10:40:02 -08:00 |
|  | 0882878166 | Add Cosine similarities to README.md | 2020-11-02 09:48:10 -08:00 |
|  | b50b08a3ea | Update Readme. | 2020-11-02 08:42:13 -08:00 |
|  | 9075a8153c | Merge branch 'master' of code:cdsc_reddit into master | 2020-11-01 21:50:44 -08:00 |
|  | 4c78f2c527 | Create README.md | 2020-11-01 21:50:27 -08:00 |
| Nate E TeBlunthuis | 4ced659d19 | Update reddit comments data with daily dumps. | 2020-10-03 16:42:22 -07:00 |
| Nate E TeBlunthuis | 2740f55915 | Compute IDF for terms and authors. | 2020-08-23 11:57:55 -07:00 |
| Nate E TeBlunthuis | 2d425600a8 | Update submissions to parse using the backfill queue. | 2020-08-11 22:37:36 -07:00 |
| Nate E TeBlunthuis | c92b50e050 | bugfix in checking submission shas | 2020-08-11 14:21:54 -07:00 |