setting up the environment by loading in conda environment at Thu Sep 4 10:04:55 CDT 2025
running the bertopic job at Thu Sep 4 10:04:55 CDT 2025
----------------------------------------
srun job start: Thu Sep 4 10:04:55 CDT 2025
Job ID: 3272179
Username: nws8519
Queue: gengpu
Account: p32852
----------------------------------------
The following variables are not guaranteed to be the same in the prologue and the job run script
----------------------------------------
PATH (in prologue) : /home/nws8519/.conda/envs/olmo/bin:/software/miniconda3/4.12.0/condabin:/home/nws8519/.local/bin:/home/nws8519/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/lpp/mmfs/bin:/hpc/usertools
WORKDIR is: /home/nws8519
----------------------------------------
W0904 10:05:10.900000 1845275 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766]
W0904 10:05:10.900000 1845275 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.900000 1845275 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0904 10:05:10.900000 1845275 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.900000 1845276 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766]
W0904 10:05:10.900000 1845276 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.900000 1845276 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0904 10:05:10.900000 1845276 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.906000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766]
W0904 10:05:10.906000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.906000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0904 10:05:10.906000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.907000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766]
W0904 10:05:10.907000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
W0904 10:05:10.907000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0904 10:05:10.907000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py:766] *****************************************
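Note: torchrun's OMP_NUM_THREADS=1 default is a safe starting point; if the CPU-bound parts of olmo_parallel_cat.py (tokenization, pandas I/O) turn out to be the bottleneck, the variable can be raised per rank. A minimal sketch, assuming it is placed at the very top of the script before torch or numpy are imported; the SLURM_CPUS_PER_TASK fallback of "1" is an assumption, not something taken from this job:

    import os

    # Assumption: give each rank the CPUs Slurm allocated to it instead of torchrun's default of 1.
    # This must run before importing torch/numpy so their thread pools pick up the value.
    os.environ["OMP_NUM_THREADS"] = os.environ.get("SLURM_CPUS_PER_TASK", "1")

    import torch  # imported only after OMP_NUM_THREADS is pinned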
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py:117: DtypeWarning: Columns (21) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv("/home/nws8519/git/mw-lifecycle-analysis/p2/quest/072525_pp_biberplus_labels.csv")
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py:117: DtypeWarning: Columns (21) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv("/home/nws8519/git/mw-lifecycle-analysis/p2/quest/072525_pp_biberplus_labels.csv")
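Note: the DtypeWarning above is pandas guessing mixed types for column 21 because the CSV is parsed in chunks. It is harmless for this run, but it can be silenced by following the warning's own suggestion. A minimal sketch, assuming low_memory=False is acceptable for the file size; the dtype mapping only shows the idea, and "comment_id" is a hypothetical column name, not one taken from this dataset:

    import pandas as pd

    csv_path = "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/072525_pp_biberplus_labels.csv"

    # Option 1: read the file in one pass so type inference sees every row.
    df = pd.read_csv(csv_path, low_memory=False)

    # Option 2: pin the offending column's dtype explicitly (hypothetical column name).
    df = pd.read_csv(csv_path, dtype={"comment_id": str})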
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 178, in <module>
[rank0]:     main()
[rank0]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 122, in main
[rank0]:     dataset = SentenceDataset(comment_texts, comment_types, priming, typology, instructions)
[rank0]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 76, in __init__
[rank0]:     sentences = split_to_sentences(cleaned_comment)
[rank0]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 106, in split_to_sentences
[rank0]:     return nltk.sent_tokenize(text)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
[rank0]:     tokenizer = _get_punkt_tokenizer(language)
[rank0]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
[rank0]:     return PunktTokenizer(language)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
[rank0]:     self.load_lang(lang)
[rank0]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
[rank0]:     lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
[rank0]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/data.py", line 579, in find
[rank0]:     raise LookupError(resource_not_found)
[rank0]: LookupError:
[rank0]: **********************************************************************
[rank0]:   Resource punkt_tab not found.
[rank0]:   Please use the NLTK Downloader to obtain the resource:
[rank0]:   >>> import nltk
[rank0]:   >>> nltk.download('punkt_tab')
[rank0]:
[rank0]:   For more information see: https://www.nltk.org/data.html
[rank0]:   Attempted to load tokenizers/punkt_tab/english/
[rank0]:   Searched in:
[rank0]:     - '/home/nws8519/nltk_data'
[rank0]:     - '/home/nws8519/.conda/envs/olmo/nltk_data'
[rank0]:     - '/home/nws8519/.conda/envs/olmo/share/nltk_data'
[rank0]:     - '/home/nws8519/.conda/envs/olmo/lib/nltk_data'
[rank0]:     - '/usr/share/nltk_data'
[rank0]:     - '/usr/local/share/nltk_data'
[rank0]:     - '/usr/lib/nltk_data'
[rank0]:     - '/usr/local/lib/nltk_data'
[rank0]: **********************************************************************
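Note: this LookupError is the root cause of the job failure, and every rank hits it. The punkt_tab tokenizer data is missing from every directory on nltk.data.path, and the GPU compute nodes may not have outbound network access, so the simplest fix is to download it once (for example on a login node) into ~/nltk_data, which is already first in the search list above. A guarded version can also live near the top of olmo_parallel_cat.py; a minimal sketch, assuming network access wherever it runs:

    import nltk

    # Download punkt_tab into ~/nltk_data only if it is not already available.
    try:
        nltk.data.find("tokenizers/punkt_tab/english/")
    except LookupError:
        nltk.download("punkt_tab", quiet=True)

If several ranks start at once, pre-downloading the data avoids them racing to write the same files.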
[rank2]: Traceback (most recent call last):
[rank2]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 178, in <module>
[rank2]:     main()
[rank2]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 122, in main
[rank2]:     dataset = SentenceDataset(comment_texts, comment_types, priming, typology, instructions)
[rank2]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 76, in __init__
[rank2]:     sentences = split_to_sentences(cleaned_comment)
[rank2]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 106, in split_to_sentences
[rank2]:     return nltk.sent_tokenize(text)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
[rank2]:     tokenizer = _get_punkt_tokenizer(language)
[rank2]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
[rank2]:     return PunktTokenizer(language)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
[rank2]:     self.load_lang(lang)
[rank2]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
[rank2]:     lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
[rank2]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/data.py", line 579, in find
[rank2]:     raise LookupError(resource_not_found)
[rank2]: LookupError:
[rank2]: **********************************************************************
[rank2]:   Resource punkt_tab not found.
[rank2]:   Please use the NLTK Downloader to obtain the resource:
[rank2]:   >>> import nltk
[rank2]:   >>> nltk.download('punkt_tab')
[rank2]:
[rank2]:   For more information see: https://www.nltk.org/data.html
[rank2]:   Attempted to load tokenizers/punkt_tab/english/
[rank2]:   Searched in:
[rank2]:     - '/home/nws8519/nltk_data'
[rank2]:     - '/home/nws8519/.conda/envs/olmo/nltk_data'
[rank2]:     - '/home/nws8519/.conda/envs/olmo/share/nltk_data'
[rank2]:     - '/home/nws8519/.conda/envs/olmo/lib/nltk_data'
[rank2]:     - '/usr/share/nltk_data'
[rank2]:     - '/usr/local/share/nltk_data'
[rank2]:     - '/usr/lib/nltk_data'
[rank2]:     - '/usr/local/lib/nltk_data'
[rank2]: **********************************************************************
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py:117: DtypeWarning: Columns (21) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv("/home/nws8519/git/mw-lifecycle-analysis/p2/quest/072525_pp_biberplus_labels.csv")
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py:117: DtypeWarning: Columns (21) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv("/home/nws8519/git/mw-lifecycle-analysis/p2/quest/072525_pp_biberplus_labels.csv")
[rank1]: Traceback (most recent call last):
[rank1]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 178, in <module>
[rank1]:     main()
[rank1]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 122, in main
[rank1]:     dataset = SentenceDataset(comment_texts, comment_types, priming, typology, instructions)
[rank1]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 76, in __init__
[rank1]:     sentences = split_to_sentences(cleaned_comment)
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 106, in split_to_sentences
[rank1]:     return nltk.sent_tokenize(text)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
[rank1]:     tokenizer = _get_punkt_tokenizer(language)
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
[rank1]:     return PunktTokenizer(language)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
[rank1]:     self.load_lang(lang)
[rank1]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
[rank1]:     lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
[rank1]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/data.py", line 579, in find
[rank1]:     raise LookupError(resource_not_found)
[rank1]: LookupError:
[rank1]: **********************************************************************
[rank1]:   Resource punkt_tab not found.
[rank1]:   Please use the NLTK Downloader to obtain the resource:
[rank1]:   >>> import nltk
[rank1]:   >>> nltk.download('punkt_tab')
[rank1]:
[rank1]:   For more information see: https://www.nltk.org/data.html
[rank1]:   Attempted to load tokenizers/punkt_tab/english/
[rank1]:   Searched in:
[rank1]:     - '/home/nws8519/nltk_data'
[rank1]:     - '/home/nws8519/.conda/envs/olmo/nltk_data'
[rank1]:     - '/home/nws8519/.conda/envs/olmo/share/nltk_data'
[rank1]:     - '/home/nws8519/.conda/envs/olmo/lib/nltk_data'
[rank1]:     - '/usr/share/nltk_data'
[rank1]:     - '/usr/local/share/nltk_data'
[rank1]:     - '/usr/lib/nltk_data'
[rank1]:     - '/usr/local/lib/nltk_data'
[rank1]: **********************************************************************
[rank3]: Traceback (most recent call last):
[rank3]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 178, in <module>
[rank3]:     main()
[rank3]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 122, in main
[rank3]:     dataset = SentenceDataset(comment_texts, comment_types, priming, typology, instructions)
[rank3]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 76, in __init__
[rank3]:     sentences = split_to_sentences(cleaned_comment)
[rank3]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py", line 106, in split_to_sentences
[rank3]:     return nltk.sent_tokenize(text)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 119, in sent_tokenize
[rank3]:     tokenizer = _get_punkt_tokenizer(language)
[rank3]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 105, in _get_punkt_tokenizer
[rank3]:     return PunktTokenizer(language)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1744, in __init__
[rank3]:     self.load_lang(lang)
[rank3]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
[rank3]:     lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
[rank3]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/nltk/data.py", line 579, in find
[rank3]:     raise LookupError(resource_not_found)
[rank3]: LookupError:
[rank3]: **********************************************************************
[rank3]:   Resource punkt_tab not found.
[rank3]:   Please use the NLTK Downloader to obtain the resource:
[rank3]:   >>> import nltk
[rank3]:   >>> nltk.download('punkt_tab')
[rank3]:
[rank3]:   For more information see: https://www.nltk.org/data.html
[rank3]:   Attempted to load tokenizers/punkt_tab/english/
[rank3]:   Searched in:
[rank3]:     - '/home/nws8519/nltk_data'
[rank3]:     - '/home/nws8519/.conda/envs/olmo/nltk_data'
[rank3]:     - '/home/nws8519/.conda/envs/olmo/share/nltk_data'
[rank3]:     - '/home/nws8519/.conda/envs/olmo/lib/nltk_data'
[rank3]:     - '/usr/share/nltk_data'
[rank3]:     - '/usr/local/share/nltk_data'
[rank3]:     - '/usr/lib/nltk_data'
[rank3]:     - '/usr/local/lib/nltk_data'
[rank3]: **********************************************************************
[rank2]:[W904 10:05:56.100290280 ProcessGroupNCCL.cpp:1476] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
[rank0]:[W904 10:05:56.107999460 ProcessGroupNCCL.cpp:1476] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
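Note: separate from the NLTK failure, the two warnings above point at missing cleanup: the script initializes an NCCL process group but never tears it down, so torch.distributed.destroy_process_group() should run on every exit path. A minimal sketch of that pattern, assuming main() is the torchrun entry point; the init arguments shown are the usual env:// defaults, not necessarily what olmo_parallel_cat.py passes:

    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")  # torchrun supplies rank/world size via env vars
        try:
            ...  # dataset construction and inference go here
        finally:
            # Tear the group down even when an exception (like the LookupError above) escapes.
            dist.destroy_process_group()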
W0904 10:05:57.705000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 1400332 closing signal SIGTERM
W0904 10:05:57.720000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py:900] Sending process 1400334 closing signal SIGTERM
E0904 10:05:57.770000 1400307 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: 1) local_rank: 0 (pid: 1400331) of binary: /home/nws8519/.conda/envs/olmo/bin/python3.11
Traceback (most recent call last):
  File "/home/nws8519/.conda/envs/olmo/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
    run(args)
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
    elastic_launch(
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 270, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-09-04_10:05:57
  host      : qgpu0203
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 1400331)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
E0904 10:05:57.885000 1400308 /gpfs/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: 1) local_rank: 0 (pid: 1400333) of binary: /home/nws8519/.conda/envs/olmo/bin/python3.11
Traceback (most recent call last):
  File "/home/nws8519/.conda/envs/olmo/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
    run(args)
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
    elastic_launch(
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 270, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/home/nws8519/git/mw-lifecycle-analysis/p2/quest/python_scripts/olmo_parallel_cat.py FAILED
------------------------------------------------------------
Failures:
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-09-04_10:05:57
  host      : qgpu0203
  rank      : 2 (local_rank: 0)
  exitcode  : 1 (pid: 1400333)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
Traceback (most recent call last):
  File "/home/nws8519/.conda/envs/olmo/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
    run(args)
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
    elastic_launch(
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 265, in launch_agent
    if result.is_failed():
       ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'is_failed'
Traceback (most recent call last):
  File "/home/nws8519/.conda/envs/olmo/bin/torchrun", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in main
    run(args)
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/run.py", line 883, in run
    elastic_launch(
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 139, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nws8519/.conda/envs/olmo/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 265, in launch_agent
    if result.is_failed():
       ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'is_failed'
srun: error: qgpu0203: tasks 2-3: Exited with exit code 1
srun: error: qgpu0202: tasks 0-1: Exited with exit code 1
unsupervised olmo categorization pau at Thu Sep 4 10:05:58 CDT 2025