CRITICAL logs show unusual exception

The _CRITICAL.logs I see contain

2018-10-12 23:14:51,070 - CRITICAL - yank.mpi - MPI node 1/12 raised an exception and called Abort()! The exception traceback follows


Traceback (most recent call last):
  File "run_yank.py", line 153, in <module>
    main(args.yaml_script_path, args.job_id, args.n_jobs)
  File "run_yank.py", line 141, in main
    experiment_builder.run_experiments()
  File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/experiment.py", line 796, in run_experiments
    send_results_to='all')
  File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/mpi.py", line 531, in distribute
    *other_args, **kwargs)
  File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/mpi.py", line 386, in exec_tasks
    raise error
AttributeError: Group 1/12 Node 1/1 received an exception from another MPI process. Original stack trace follow:
Traceback (most recent call last):
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/mpi.py", line 357, in exec_tasks
        results.append(task(distributed_arg, *other_args, **kwargs))
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/experiment.py", line 3120, in _run_experiment
        built_experiment.run(n_iterations=switch_experiment_interval)
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/experiment.py", line 460, in run
        alchemical_phase = AlchemicalPhase.from_storage(phase)
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/yank.py", line 811, in from_storage
        sampler = sampler_class.from_storage(storage)
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/multistate/multistatesampler.py", line 222, in from_storage
        sampler = cls._instantiate_sampler_from_reporter(reporter)
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/multistate/multistatesampler.py", line 854, in _instantiate_sampler_from_reporter
        options['mcmc_moves'] = reporter.read_mcmc_moves()
      File "/home/chodera/miniconda/lib/python3.6/site-packages/yank/multistate/multistatereporter.py", line 748, in read_mcmc_moves
        mcmc_moves.append(mmtools.utils.deserialize(serialized_move))
      File "/home/chodera/miniconda/lib/python3.6/site-packages/openmmtools/utils.py", line 597, in deserialize
        names.append(serialization.pop(_SERIALIZED_MANGLED_PREFIX + key))
    AttributeError: 'NoneType' object has no attribute 'pop'

I have installed

openmm                    7.4.0             py36_cuda92_1    omnia/label/dev
openmmtools               0.15.0                   py36_0    omnia
yank                      0.23.7                   py36_0    omnia

with openmm 7.3.0.dev-f8dcb72.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

jchodera commented, Oct 14, 2018

Great detective work! I wonder if there’s a way we can raise an exception if the NetCDF file is corrupted.

I’ll spend more time investigating the “safe termination” as discussed in https://github.com/choderalab/yank/issues/1081

jchodera commented, Oct 14, 2018

Could we add error checking for empty strings being returned in this particular case? This seems to occur in multiple different NetCDF files for me.
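
The kind of check asked for here could look roughly like the sketch below. It is not code from yank or openmmtools: the exception class `CorruptedSerializationError` and the helper `check_serialization` are hypothetical names, and the sketch assumes that an empty string read from the NetCDF file has become None (or is still an empty string) by the time it would be handed to `deserialize`. The idea is to fail early with a message that points at the storage file rather than with `'NoneType' object has no attribute 'pop'`.

```python
class CorruptedSerializationError(ValueError):
    """Hypothetical exception: serialized data is missing, empty, or malformed."""


def check_serialization(serialization, source="<unknown>"):
    """Return `serialization` unchanged if it is a dict, otherwise raise.

    Hypothetical helper, not part of yank or openmmtools. `source` is a
    free-form label (for example a variable name in the storage file)
    that is included in the error message.
    """
    if serialization is None:
        raise CorruptedSerializationError(
            f"Serialized data for {source} is None; the storage file may be "
            "corrupted or incompletely written."
        )
    if isinstance(serialization, (str, bytes)) and not serialization.strip():
        raise CorruptedSerializationError(
            f"Serialized data for {source} is an empty string; the storage "
            "file may be corrupted or incompletely written."
        )
    if not isinstance(serialization, dict):
        raise CorruptedSerializationError(
            f"Serialized data for {source} has unexpected type "
            f"{type(serialization).__name__}; expected a dict."
        )
    return serialization
```

A caller would wrap the value before deserializing it, for example `check_serialization(serialized_move, source="mcmc move 0")`, so that a corrupted file produces a descriptive error on the failing MPI node.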
