Remove run_dir from ray.train.backend.BackendExecutor.start_training (Ray 1.9.0 and above)


Search before asking

  • I searched the issues and found no similar feature request.

Description

The run_dir argument in ray.train.backend.BackendExecutor.start_training isn’t used, but it causes the following error: if your host computer and the job cluster run different operating systems, you get a pathlib error because, for example, you can’t instantiate a pathlib.WindowsPath on a Linux system.
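The pathlib behaviour described above can be reproduced without Ray. The following is a minimal sketch using only the standard library; the helper names `foreign_path_class` and `try_instantiate` and the example paths are made up for illustration and are not part of Ray. It shows that the concrete path class belonging to the other OS cannot be instantiated, while the pure variant can.

```python
import os
import pathlib


def foreign_path_class():
    """Return the concrete Path class that does NOT match the current OS."""
    # os.name is "nt" on Windows and "posix" on Linux/macOS.
    return pathlib.PosixPath if os.name == "nt" else pathlib.WindowsPath


def try_instantiate(path_cls, raw):
    """Try to build a path object; return (succeeded, error type name)."""
    try:
        path_cls(raw)
        return True, None
    except NotImplementedError as exc:
        # Newer Python versions raise a subclass of NotImplementedError here.
        return False, type(exc).__name__


# The concrete class for the "other" OS fails, e.g. WindowsPath on Linux.
concrete_ok, concrete_err = try_instantiate(foreign_path_class(), "some/run_dir")

# Pure paths do no I/O and can be created on any OS.
pure_ok, _ = try_instantiate(pathlib.PureWindowsPath, "C:/some/run_dir")

print(concrete_ok, concrete_err, pure_ok)
```

This only illustrates the underlying pathlib restriction. It is an assumption, not something stated in the issue, that the same failure is what surfaces when a path object created on the host is reconstructed on a cluster node running a different OS.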

I had to subclass a lot to circumvent this issue.

Simple fix: remove the run_dir argument since it isn’t used anyway.

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
sidward14 commented, Dec 22, 2021

Hey @amogkam, I’ve submitted the PR!

1 reaction
amogkam commented, Dec 21, 2021

Ah great catch @sidward14! We initially had the run_dir in the BackendExecutor, but have since done some refactoring, so you’re right the run_dir isn’t needed in the BackendExecutor anymore and we can remove it. Would you mind making a PR for this? We’d be happy to get this merged in!


