now with updated categorizations
parent 5ed797e971
commit 63450ba7ef
File diff suppressed because it is too large
@@ -7,3 +7,21 @@ Copying blob sha256:9fe6e2e61518cba6844870c03b285737daec35e62baf25ae7744629ed3a7
Copying blob sha256:41f16248e682693ff20b3032c1d5e5541cc87c5af898ae2ff9b24d2940e59100
Copying blob sha256:95d7b781703928cf3c4eece39d800cccb76728c375fedf51ecd83833fb25e458
Copying blob sha256:8f6c9048534734f4c873935293b7296225846ceb31c1a158400a67ea170dde7f
Copying blob sha256:ab17245097e491b9368790714f9d90ed447bf0973bd677cfe6f2456d62b72a13
Copying blob sha256:dfecd7e9912b76ed460b8edd5a85f1943666e38a973ab5458177cf2c7c3110e3
Copying blob sha256:464a8f74544589bf7b57f9a4cadcb6681e5ed00758f6c35025e691df4e88e890
Copying blob sha256:61d26dce6d4129f40549457a063c82aca2c606d73ef156d5ac7e495e1d52530a
Copying blob sha256:227a9906e6cccfa3aee837559aeb3fdcbf4409286dd4dd0a37287cfd483c37f6
Copying blob sha256:c826e867602d3c7a5d3b8a552e49d51c58cccf42c31d016a660a50b7f451ef09
Copying blob sha256:d40507eacecbbd8647bcee51d03f8b8cc86044d73cb72448112d49a08b8feaac
Copying blob sha256:93c7cb8303f3b8ca1165c92b4b55a08973e8bd1a1360dd7bc3cb8bd18804d2a8
Copying blob sha256:7f2d4a3887cae1984105738d5887b3ed325095939dfe31e89d5b47212b7f6479
Copying blob sha256:167c57c419bc5ef23ffe823e05c1cac741246ef69352f36a2724e2f4c276f52b
Copying blob sha256:1b456af08bb7c15512a9be77ce1ed44ce87f2c52315c99cbb2a99dd786adb4cb
Copying blob sha256:054fcf1bbe967cf874bfa40161b3c559f8cf03ca1e05a532a69a8edca4d8d0e5
Copying blob sha256:e26ee59fb49e43a8046b8c3812c52cc62bb8e5772e3323ff84c76a5715668c36
Copying blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
Copying blob sha256:da54e4cf5022248356962228df4c357d330f5f87e2ebc0fcff2b766400721cef
Copying blob sha256:684176763e41ba50c8aa61c6e6eb6aec1ac35eea61710971f410dd1a5a2953a8
Copying blob sha256:78c24341e0f9d5ae00c21f5dd0a35adf62f5c1ba2618b5e0c7e45994eb69f6b5
Copying blob sha256:435f630eb19ee65f1b1e2db0d34b278037511d4344ca482b720c6bb1f70b8f58
@@ -12,7 +12,7 @@ olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B").to(device)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B")
#priming prompt
prompt_1 = "For the GIVEN DATA, Please categorize it based on the following numbered characteristics: \n\n 1: YES/NO (Characteristic 1. This is an English language empirical study. English language empirical studies analyze observational or experimental data. We are exlcuding literature reviews from this definition.) \n 2: YES/NO (Characteristic 2. This focuses on free and open source software (FOSS). The focus of this paper is on FOSS projects and ecosystems.) \n 3: YES/NO (Characteristic 3. This focuses on FOSS project evolution. FOSS project evolution is the study of longitudinal changes to the characteristics of free and open source projects.) \n 4: YES/NO (Characteristic 4. This focuses on FOSS project adaptation. FOSS project adaptation describes the intentional changes made to the characteristics of FOSS projects to better align with the project's broader environment.) \n\n Only respond with the appropriate number followed by 'YES' if the characteristic is present in the provided data or 'NO' if it is not (e.g. '1. NO; 2. YES;'). Do not provide any additional information."
prompt_1 = "For the GIVEN DATA, Please categorize it based on the following numbered characteristics: \n\n 1: YES/NO (Characteristic 1. This is an English language empirical study. English language empirical studies discuss data or observations.) \n 2: YES/NO (Characteristic 2. This focuses on free and open source software (FOSS). The focus of this paper is on free or open source software projects and ecosystems.) \n 3: YES/NO (Characteristic 3. This focuses on FOSS project evolution. FOSS project evolution describes changes to free and open source projects and ecosystems.) \n 4: YES/NO (Characteristic 4. This focuses on FOSS project adaptation. FOSS project adaptation describes the intentional changes made by projects to better align with the project's broader environment.) \n\n Only respond with the appropriate number followed by 'YES' if the characteristic is present in the provided data or 'NO' if it is not (e.g. '1. YES; 2. NO;'). Do not provide any additional information."
example_4 = "Example 4: TITLE - Analysis of Open Source Software Evolution Using Evolution Curve Method \n ABSTRACT - Design and evolution of modem information systems is influenced by many factors: technical, organizational, social, and psychological. This is especially true for open source software systems (OSSS), when many developers from different backgrounds interact, share their ideas and contribute towards the development and improvement of a software product. The evolution of all OSSS is a continuous process of source code development, adaptation, improvement and maintenance. Studying changes to the various characteristics of source code can help us understand the evolution of a software system. In this paper, the software evolution process is analyzed using a proposed Evolution curve (E-curve) method, which is based on information theoretic metrics of source code. The method allows identifying major evolution stages and transition points of an analyzed software system. The application of the E-curves is demonstrated for the eMule system. .\n CATEGORIES: 1. YES; 2. YES; 3.YES; 4. NO"
@@ -62,4 +62,4 @@ with open("cites/053025_man_filtered_dedup.csv", mode='r', newline='') as file:
array_of_categorizations.append(cite_dict)
#CSV everything
df = pd.DataFrame(array_of_categorizations)
df.to_csv('053025_olmo_categorized_citations.csv', index=False)
df.to_csv('060225_olmo_categorized_citations.csv', index=False)
@@ -1,9 +1,9 @@
starting the job at: Fri May 30 21:46:19 CDT 2025
starting the job at: Mon Jun 2 09:24:01 CDT 2025
setting up the environment
running the p1 categorization script
cuda
NVIDIA A100-SXM4-80GB
_CudaDeviceProperties(name='NVIDIA A100-SXM4-80GB', major=8, minor=0, total_memory=81153MB, multi_processor_count=108, uuid=841be301-db75-9627-af0f-04d8965fd651, L2_cache_size=40MB)
Loading checkpoint shards: 0%| | 0/6 [00:00<?, ?it/s]
Loading checkpoint shards: 17%|█▋ | 1/6 [00:00<00:04, 1.04it/s]
Loading checkpoint shards: 33%|███▎ | 2/6 [00:02<00:04, 1.16s/it]
Loading checkpoint shards: 50%|█████ | 3/6 [00:03<00:03, 1.32s/it]
Loading checkpoint shards: 67%|██████▋ | 4/6 [00:05<00:02, 1.43s/it]
Loading checkpoint shards: 83%|████████▎ | 5/6 [00:06<00:01, 1.45s/it]
Loading checkpoint shards: 100%|██████████| 6/6 [00:07<00:00, 1.28s/it]
Loading checkpoint shards: 100%|██████████| 6/6 [00:07<00:00, 1.30s/it]
NVIDIA A100-PCIE-40GB
_CudaDeviceProperties(name='NVIDIA A100-PCIE-40GB', major=8, minor=0, total_memory=40442MB, multi_processor_count=108, uuid=c91b110a-9eb1-15b6-ff0a-7aeb47b26ff0, L2_cache_size=40MB)
Loading checkpoint shards: 0%| | 0/6 [00:00<?, ?it/s]
Loading checkpoint shards: 17%|█▋ | 1/6 [00:00<00:04, 1.24it/s]
Loading checkpoint shards: 33%|███▎ | 2/6 [00:01<00:03, 1.08it/s]
Loading checkpoint shards: 50%|█████ | 3/6 [00:02<00:02, 1.05it/s]
job finished, cleaning up
job pau at: Fri May 30 23:21:25 CDT 2025
job pau at: Mon Jun 2 11:08:43 CDT 2025