Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							ee01ce3e61 
							
						 
					 
					
						
						
							
							Get Parquet test working  
						
						
						
						
						This requires some data smoothing to get read_table and read_parquet
DataFrames to look close enough, but the test now passes and validates
that the data match.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 16:48:58 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							52757a8239 
							
						 
					 
					
						
						
							
							Add noargs test for ikwiki  
						
						
						
						
						This way we can ensure that the parquet code outputs equivalent output.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 15:04:10 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							d413443740 
							
						 
					 
					
						
						
							
							Add numpy to environment  
						
						
						
						
						Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 13:20:28 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							3f94144b1b 
							
						 
					 
					
						
						
							
							Begin adding test for parquet export  
						
						
						
						
						Changed logic for handling anonymous edits so that wikiq handles
the type for editor ids consistently. Parquet can mix int64 and
None, but not int64 and strings - previously the code used the empty
string to denote anonymous editors.
Tests failing. Don't merge yet.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 13:17:30 -05:00 
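
The type-consistency point in commit 3f94144b1b can be illustrated with a minimal sketch. It assumes a hypothetical helper, `normalize_editor_id`, which is not a wikiq function: it maps the old empty-string marker for anonymous editors to `None`, so the column holds only `int` or `None` values, the combination the commit message says Parquet accepts.

```python
from typing import Optional, Union


def normalize_editor_id(raw: Union[int, str, None]) -> Optional[int]:
    """Return an int editor id, or None for an anonymous edit.

    Hypothetical helper for illustration only; not part of wikiq.
    """
    if raw is None or raw == "":
        return None
    return int(raw)


# A legacy-style column mixes ints with "" for anonymous editors.
legacy_column = [42, "", 7]

# After normalization the column holds only int and None.
editor_ids = [normalize_editor_id(value) for value in legacy_column]
column_types = {type(value) for value in editor_ids if value is not None}
```

Keeping the conversion in one function means every writer path sees the same representation of an anonymous editor.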
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							df0ad1de63 
							
						 
					 
					
						
						
							
							Finish test standardization  
						
						
						
						
						Test logic is executed within the WikiqTestCase, while WikiqTester
handles creating and managing the variables tests need.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 10:11:58 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							f3e6cc9392 
							
						 
					 
					
						
						
							
							Begin refactor of tests to make new tests easier to write  
						
						
						
						
						Handle file naming logic centrally rather than requiring a dedicated
class per input file.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-28 09:11:36 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							c8b14c3303 
							
						 
					 
					
						
						
							
							Refactor test temporary file logic and wikiq call pattern  
						
						
						
						
						Test file refreshing and path computation is now handled by a helper.
The wikiq command is now constructed and handled by a single method
rather than in several ad-hoc ways.
The last places relying on the working directory are now removed.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-27 16:24:07 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							4d3900b541 
							
						 
					 
					
						
						
							
							Standardize calling for wikiq in tests  
						
						
						
						
						This way failures show the output of stderr/etc.
Also create path constant strings for use in tests to avoid repetition
and make changes easier.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-27 14:27:49 -05:00 
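
One way to get the effect described in commit 4d3900b541, where a failed call shows its stderr, is to route every invocation through a single wrapper. The sketch below is an assumption about the general pattern, not the repository's code: `call_tool` is a hypothetical name, and a child Python process stands in for wikiq.

```python
import subprocess
import sys
from typing import List


def call_tool(args: List[str]) -> str:
    """Run a command and return stdout; on failure, raise with stderr attached."""
    proc = subprocess.run(args, capture_output=True, text=True)
    if proc.returncode != 0:
        raise AssertionError(
            f"command {args!r} exited with {proc.returncode}\n"
            f"stderr:\n{proc.stderr}"
        )
    return proc.stdout


# Stand-in for the real tool: a child process that writes to stderr and fails.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
try:
    call_tool(failing)
    message = ""
except AssertionError as err:
    message = str(err)
```

Because the wrapper raises `AssertionError`, a test runner such as unittest reports the captured stderr as part of the failure message.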
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							ebc57864f2 
							
						 
					 
					
						
						
							
							Make tests runnable from anywhere  
						
						
						
						
						Tests no longer implicitly require that the caller be in
a specific working directory.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-27 13:40:57 -05:00 
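
A common way to remove the working-directory requirement described in commit ebc57864f2 is to resolve test paths relative to the test file itself. The following is a sketch under that assumption; `data_path`, the `dumps` directory, and `sample.xml` are hypothetical names and do not come from the repository.

```python
import os
from pathlib import Path


def data_path(test_file: str, *parts: str) -> Path:
    """Resolve a path relative to the directory that contains `test_file`."""
    return Path(test_file).resolve().parent.joinpath(*parts)


# In a real test module the first argument would be __file__. A literal path
# is used here so the sketch also runs where __file__ is undefined.
example = data_path(
    os.path.join(os.sep, "repo", "test", "unit_test.py"),
    "dumps",
    "sample.xml",
)
```

The resulting path is absolute, so it stays valid whichever directory the test runner is started from.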
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							3d0bf89938 
							
						 
					 
					
						
						
							
							Move main logic to main()  
						
						
						
						
						This avoids:
1) the main function running when sourcing the file
2) Creating many globally-scoped variables in the main logic
Also begin refactor of test output file logic
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-27 11:10:42 -05:00 
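
The two problems listed in commit 3d0bf89938 are what the standard Python entry-point guard addresses. The sketch below shows the general pattern only; the body of `main` is a placeholder and does not reflect what wikiq does.

```python
import sys
from typing import List, Optional


def main(argv: Optional[List[str]] = None) -> int:
    """Entry point: all working variables stay local to this function."""
    args = list(sys.argv[1:] if argv is None else argv)
    # `args` is a local name, so nothing here leaks into module scope.
    print(f"would process {len(args)} argument(s)")
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main()
```

Accepting an optional `argv` also lets tests call `main` directly with a chosen argument list. Scripts often end with `sys.exit(main())` instead, so that the return value becomes the process exit code.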
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							6d133575c7 
							
						 
					 
					
						
						
							
							Remove resource leaks from tests  
						
						
						
						
						Close subprocesses within tests to fix resource leak warning.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-26 15:08:47 -05:00 
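
The resource leak warning mentioned in commit 6d133575c7 typically appears when a `subprocess.Popen` object is discarded while its pipes are still open. A minimal sketch of one fix, assuming the tests read the child's stdout, is to use `Popen` as a context manager; `run_and_read` is a hypothetical helper and a child Python process stands in for the real command.

```python
import subprocess
import sys


def run_and_read(code: str) -> str:
    """Run a short Python snippet in a child process and return its stdout."""
    # Leaving the `with` block closes the child's pipes and waits for it to
    # exit, so no open file or running process is left behind.
    with subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        output, _ = proc.communicate()
    return output


result = run_and_read("print('ok')")
```

When no streaming is needed, `subprocess.run` gives the same cleanup in a single call.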
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							09a84e7d11 
							
						 
					 
					
						
						
							
							Reformat Wikiq_Unit_Test.py  
						
						
						
						
						Separate out reformatting from editing.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-26 15:07:39 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							9c5bf577e6 
							
						 
					 
					
						
						
							
							Remove unused dependencies and fix spacing  
						
						
						
						
						The "mw" and "numpy" dependencies were unneeded.
Spaces and tabs were inconsistently used.
They are now used consistently, changes via auto-formatter.
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-26 14:15:01 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							4804ecc4b3 
							
						 
					 
					
						
						
							
							Add additional test dependencies  
						
						
						
						
						These are now noted in requirements.txt
Also make dependency on 7zip and ffmpeg explicit in README
Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-26 12:29:49 -05:00 
						 
				 
			
				
					
						
							
							
								Will Beason 
							
						 
					 
					
						
						
						
						
							
						
						
							7a4c41159c 
							
						 
					 
					
						
						
							
							Exclude JetBrains config folder in .gitignore  
						
						
						
						
						Signed-off-by: Will Beason <willbeason@gmail.com> 
						
					 
					
						2025-05-26 10:48:17 -05:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							1aea601a30 
							
						 
					 
					
						
						
							
							[Bugfix] Call the correct matchmake function.  
						
						
						
					 
					
						2021-11-16 16:53:21 -08:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							c437b357db 
							
						 
					 
					
						
						
							
							rename matchmake functions  
						
						
						
					 
					
						2021-11-11 19:09:41 -08:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							bb83d62b74 
							
						 
					 
					
						
						
							
							Add some descriptive comments.  
						
						
						
					 
					
						2021-10-19 16:55:24 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							c285402683 
							
						 
					 
					
						
						
							
							add todos to readme  
						
						
						
					 
					
						2021-10-18 14:14:11 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							b1bea09ad6 
							
						 
					 
					
						
						
							
							fix bugs and unit tests  
						
						
						
					 
					
						2021-10-18 13:33:05 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							9a0c157ebb 
							
						 
					 
					
						
						
							
							bugfix  
						
						
						
					 
					
						2021-10-18 10:15:03 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							ae870fed0b 
							
						 
					 
					
						
						
							
							parquet path is code-complete  
						
						
						
					 
					
						2021-10-17 21:46:31 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							26f6d8f984 
							
						 
					 
					
						
						
							
							remove dependency on pandas.  
						
						
						
					 
					
						2021-10-17 20:24:33 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							ae9a241747 
							
						 
					 
					
						
						
							
							use dataclasses and pyarrow for types.  
						
						
						
					 
					
						2021-10-17 20:21:22 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							d8d20f670b 
							
						 
					 
					
						
						
							
							initial work on parquet support  
						
						
						
					 
					
						2021-10-17 13:22:22 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							cdfa77d66d 
							
						 
					 
					
						
						
							
							remove commented code  
						
						
						
					 
					
						2019-11-11 11:28:48 -08:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							02b3250a36 
							
						 
					 
					
						
						
							
							refactor regex matching in a tidier object oriented style  
						
						
						
					 
					
						2019-11-09 13:07:46 -08:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							414cc5ff2d 
							
						 
					 
					
						
						
							
							validate tests and add asserts and baselines for regex tests.  
						
						
						
					 
					
						2019-11-09 12:19:55 -08:00 
						 
				 
			
				
					
						
							
							
								sohyeonhwang 
							
						 
					 
					
						
						
						
						
							
						
						
							4ccde84529 
							
						 
					 
					
						
						
							
							added regex scanner v2's dump unit test file regextest.xml.bz2  
						
						
						
					 
					
						2019-11-07 14:06:15 -06:00 
						 
				 
			
				
					
						
							
							
								sohyeonhwang 
							
						 
					 
					
						
						
						
						
							
						
						
							f147e1d899 
							
						 
					 
					
						
						
							
							merging pull containing revert-radius with 2nd version of regex scanner w/ unit tests  
						
						
						
					 
					
						2019-11-07 13:28:17 -06:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							c84844cfb5 
							
						 
					 
					
						
						
							
							add unit tests for configuring revert_radius  
						
						
						
					 
					
						2019-10-07 15:02:30 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							c4416d0f1b 
							
						 
					 
					
						
						
							
							make revert radius configurable  
						
						
						
					 
					
						2019-10-07 13:57:49 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							7b856bec86 
							
						 
					 
					
						
						
							
							Merge branch 'master' into regex_scanner  
						
						
						
					 
					
						2019-10-05 18:17:03 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							324ccc8e26 
							
						 
					 
					
						
						
							
							update baseline outputs  
						
						
						
					 
					
						2019-10-05 16:36:07 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							17529cdd48 
							
						 
					 
					
						
						
							
							bugfix, remove old legacy persistence flag  
						
						
						
					 
					
						2019-10-05 16:13:11 -07:00 
						 
				 
			
				
					
						
							
							
								sohyeonhwang 
							
						 
					 
					
						
						
						
						
							
						
						
							7bf4559ceb 
							
						 
					 
					
						
						
							
							changes for regex scanner addition  
						
						
						
					 
					
						2019-10-05 15:36:58 -05:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							fb052ffa33 
							
						 
					 
					
						
						
							
							don't compute persistence by default  
						
						
						
					 
					
						2019-09-22 15:54:17 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							e871023ff5 
							
						 
					 
					
						
						
							
							elaborate docstring for persistence  
						
						
						
					 
					
						2019-09-22 15:11:59 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							7d62ff9fb7 
							
						 
					 
					
						
						
							
							improve help for namespace-include  
						
						
						
					 
					
						2018-09-03 11:30:12 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							f7f5bf8fd4 
							
						 
					 
					
						
						
							
							sub assertEquals assertEqual  
						
						
						
					 
					
						2018-09-03 11:21:49 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							f784c77f60 
							
						 
					 
					
						
						
							
							add namespace filter parameter  
						
						
						
					 
					
						2018-09-03 11:13:48 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							317bafb50d 
							
						 
					 
					
						
						
							
							Merge branch 'advanced_persistence' of code.communitydata.cc:mediawiki_dump_tools into advanced_persistence  
						
						
						
					 
					
						2018-08-23 19:00:49 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							7cd0bf3b9e 
							
						 
					 
					
						
						
							
							Add parameter for selecting specific namespaces.  
						
						
						
					 
					
						2018-08-23 18:49:32 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							d93769c21f 
							
						 
					 
					
						
						
							
							Merge branch 'advanced_persistence' of code.communitydata.cc:mediawiki_dump_tools into advanced_persistence  
						
						
						
					 
					
						2018-08-23 18:27:09 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							afd40c1a45 
							
						 
					 
					
						
						
							
							Merge branch 'advanced_persistence' of code.communitydata.cc:/mediawiki_dump_tools into advanced_persistence  
						
						
						
					 
					
						2018-08-23 18:25:51 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							e4222c45dd 
							
						 
					 
					
						
						
							
							add namespace filter parameter  
						
						
						
					 
					
						2018-08-23 18:25:08 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							829ffcffae 
							
						 
					 
					
						
						
							
							Merge branch 'advanced_persistence' of code.communitydata.cc:/mediawiki_dump_tools into advanced_persistence  
						
						
						
					 
					
						2018-08-23 18:23:36 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							776b73519a 
							
						 
					 
					
						
						
							
							add namespace filter parameter  
						
						
						
					 
					
						2018-08-23 18:23:23 -07:00 
						 
				 
			
				
					
						
							
							
								Nate E TeBlunthuis 
							
						 
					 
					
						
						
						
						
							
						
						
							5b6aaad862 
							
						 
					 
					
						
						
							
							add namespace filter parameter  
						
						
						
					 
					
						2018-08-23 18:02:56 -07:00 
						 
				 
			
				
					
						
					 
					
						
						
						
						
							
						
						
							f468d1a5b6 
							
						 
					 
					
						
						
							
							add support for persistence with segment matching  
						
						
						
					 
					
						2018-08-20 16:08:16 -07:00