Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

I'm not sure what is causing the issue, but DeepVariant fails upon reaching this step. Any thoughts on how to fix it? I tried to run it in a Python 2.7 environment, but it still seems to be pulling from Python 3.6.

Running the command:
time /opt/deepvariant/bin/call_variants --outfile "/tmp/tmp9_28zx5u/call_variants_output.tfrecord.gz" --examples "/tmp/tmp9_28zx5u/make_examples.tfrecord@1.gz" --checkpoint "/opt/models/wes/model.ckpt"

I0424 15:59:50.266534 139872277903104 call_variants.py:316] Set KMP_BLOCKTIME to 0
2020-04-24 15:59:50.321136: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel® MKL-DNN to use the following CPU instructions in performance critical operations: AVX2 FMA
To enable them in non-MKL-DNN operations, rebuild TensorFlow with the appropriate compiler flags.
2020-04-24 15:59:50.376605: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2904000000 Hz
2020-04-24 15:59:50.378224: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56a1fd0 executing computations on platform Host. Devices:
2020-04-24 15:59:50.378283: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2020-04-24 15:59:50.380979: I tensorflow/core/common_runtime/process_util.cc:115] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
I0424 15:59:50.447775 139872277903104 modeling.py:563] Initializing model with random parameters
W0424 15:59:50.449538 139872277903104 estimator.py:1821] Using temporary folder as model directory: /tmp/tmp3bl4tsmc
I0424 15:59:50.450443 139872277903104 estimator.py:212] Using config: {'_model_dir': '/tmp/tmp3bl4tsmc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': , '_keep_checkpoint_max': 100000, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f3659263518>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0424 15:59:50.451262 139872277903104 call_variants.py:384] Writing calls to /tmp/tmp9_28zx5u/call_variants_output.tfrecord.gz
W0424 15:59:50.467876 139872277903104 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers.
I0424 15:59:50.501495 139872277903104 data_providers.py:369] self.input_read_threads=8
W0424 15:59:50.501965 139872277903104 deprecation.py:323] From /tmp/Bazel.runfiles_sszxydhb/runfiles/com_google_deepvariant/deepvariant/data_providers.py:374: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.experimental_determinstic.
I0424 15:59:50.681574 139872277903104 data_providers.py:376] self.input_map_threads=48
W0424 15:59:50.681832 139872277903104 deprecation.py:323] From /tmp/Bazel.runfiles_sszxydhb/runfiles/com_google_deepvariant/deepvariant/data_providers.py:381: map_and_batch (from tensorflow.python.data.experimental.ops.batching) is deprecated and will be removed in a future version. Instructions for updating: Use tf.data.Dataset.map(map_func, num_parallel_calls) followed by tf.data.Dataset.batch(batch_size, drop_remainder). Static tf.data optimizations will take care of using the fused implementation.
I0424 15:59:51.794167 139872277903104 estimator.py:1147] Calling model_fn.
W0424 15:59:51.800228 139872277903104 deprecation.py:323] From /tmp/Bazel.runfiles_sszxydhb/runfiles/com_google_deepvariant/deepvariant/modeling.py:885: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Deprecated in favor of operator or tf.math.divide.
W0424 15:59:51.806498 139872277903104 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: Please use layer.__call__ method instead.
I0424 16:00:02.682547 139872277903104 estimator.py:1149] Done calling model_fn.
I0424 16:00:06.021238 139872277903104 monitored_session.py:240] Graph was finalized.
I0424 16:00:06.037272 139872277903104 saver.py:1284] Restoring parameters from /opt/models/wes/model.ckpt
I0424 16:00:10.817819 139872277903104 session_manager.py:500] Running local_init_op.
I0424 16:00:11.060626 139872277903104 session_manager.py:502] Done running local_init_op.
I0424 16:00:12.403780 139872277903104 modeling.py:413] Reloading EMA…
I0424 16:00:12.405867 139872277903104 saver.py:1284] Restoring parameters from /opt/models/wes/model.ckpt
I0424 16:00:48.634510 139872277903104 call_variants.py:402] Processed 1 examples in 1 batches [5816.472 sec per 100]

real 4m2.970s
user 5m54.674s
sys 1m14.107s
I0424 16:03:48.557898 140277446174464 run_deepvariant.py:321] None
Traceback (most recent call last):
  File "/opt/deepvariant/bin/run_deepvariant.py", line 332, in <module>
    app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/opt/deepvariant/bin/run_deepvariant.py", line 319, in main
    subprocess.check_call(command, shell=True, executable='/bin/bash')
  File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'time /opt/deepvariant/bin/call_variants --outfile "/tmp/tmp9_28zx5u/call_variants_output.tfrecord.gz" --examples "/tmp/tmp9_28zx5u/make_examples.tfrecord@1.gz" --checkpoint "/opt/models/wes/model.ckpt"' returned non-zero exit status 247.
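In this thread the root cause turned out to be memory rather than the Python version: exit status 247 can indicate the process was killed for exceeding a memory limit (see the results collected further down, and the resolution in the comments). A minimal sketch of how to check for that, assuming the pipeline runs inside Docker; the container ID, image tag, data paths, and the 64g figure below are placeholders:

    # Did the kernel's OOM killer terminate the container?
    docker inspect --format '{{.State.OOMKilled}} exit={{.State.ExitCode}}' <container-id>

    # If so, retry with an explicit, larger memory limit.
    docker run --rm --memory=64g --memory-swap=64g \
      -v /path/to/input:/input -v /path/to/output:/output \
      google/deepvariant:<version> \
      /opt/deepvariant/bin/run_deepvariant \
        --model_type=WES \
        --ref=/input/reference.fasta \
        --reads=/input/sample.bam \
        --output_vcf=/output/sample.vcf.gz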

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 14

Top GitHub Comments

1 reaction
pgrosu commented, Jun 12, 2020

@ptrebert Glad it worked 😃 DeepVariant is nice, but it is written in a more complex way than it has to be, and adding Docker/Singularity on top of that injects many layers of complexity (not easily exposed), creating opportunities for heisenbugs. Docker/Singularity are really meant for smaller applications, since their interaction with the kernel becomes multiplicative rather than additive for larger applications, which you noticed indirectly through the memory resource requirements.
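One way to confirm that an out-of-memory kill is what produced exit status 247 is to check the host kernel log right after the failure. A rough sketch, assuming you have shell access to the host; the grep patterns and time window are just illustrative:

    # Kernel messages mentioning the OOM killer around the time of the crash
    dmesg -T | grep -i -E 'out of memory|oom-kill|killed process'

    # Or, on systemd hosts, the same information via journalctl
    journalctl -k --since "1 hour ago" | grep -i oom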

0 reactions
ptrebert commented, Jun 12, 2020

@pgrosu A substantial increase (almost doubling) of the memory available to the DeepVariant cluster jobs did the trick; both jobs have now succeeded. Thanks for your help.
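For reference, the kind of change that resolved it here was simply requesting more memory for the cluster job. A minimal sketch, assuming a SLURM cluster and the DeepVariant image run via Singularity; the job directives, paths, shard count, and memory figures below are placeholders rather than the exact values used:

    #!/bin/bash
    #SBATCH --job-name=deepvariant_wes
    #SBATCH --cpus-per-task=16
    #SBATCH --mem=64G        # roughly double the earlier request

    singularity run -B /data:/data docker://google/deepvariant:<version> \
      /opt/deepvariant/bin/run_deepvariant \
        --model_type=WES \
        --ref=/data/reference.fasta \
        --reads=/data/sample.bam \
        --output_vcf=/data/sample.vcf.gz \
        --num_shards=16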

Read more comments on GitHub >

Top Results From Across the Web

exit status 247 for systemd process - Unix Stack Exchange
I have a service that I manage with systemd; to be precise, it runs the daemon for django-background-tasks (but I do not...

AWS-RunPatchBaseline failed to run commands: exit status 247
For Amazon Linux 2 Arch x86_64 (which is most likely your case), exit code 247 corresponds to SYS_WAITID during system call, an error...

Exit code 247 - documentation : TW-50860 - JetBrains YouTrack
After further digging, it looks like an exit code of 247 can mean that Docker killed the process due to memory issues. I'm...

Docker Container exited with code 247 when getting data from ...
I'm running a Flask API inside a Docker Container. My application has to download files from Google Cloud, and sometimes, after some minutes...

NetBackup backups are failing with status code 247 even ...
Backups for a new policy immediately fail with status code 247 (the specified policy is not active) and the Detailed Status from the...
