[Feature][runtime env] `RuntimeEnvAgent` should use the MD5 sum as the key to cache packages downloaded from URIs
Search before asking
- I searched the issues and found no similar feature request.
Description
Currently, runtime_env_agent uses the parsed URI as the key to cache downloaded packages, but this causes two problems:
- Two different URIs with the same protocol can produce the same result from the function parse_uri:
In [1]: from ray._private.runtime_env.packaging import parse_uri
In [2]: parse_uri("s3://test/a_file.zip")
Out[2]: (<Protocol.S3: 's3'>, 's3_test_a_file.zip')
In [3]: parse_uri("s3://test/a/file.zip")
Out[3]: (<Protocol.S3: 's3'>, 's3_test_a_file.zip')
In [4]: parse_uri("s3://test/a/file.zip") == parse_uri("s3://test/a_file.zip")
Out[4]: True
This may cause workers to run in the wrong runtime_env.
- If we use the hash value of the URI to solve problem 1, we then face problem 2. Consider the following usage scenario:
  - Scenario: A user connects to a Ray cluster with Ray Client from a local Mac, and the Python job reads a config file, test.yml. The user puts this config in the relative path "." and uses the runtime_env parameter "working_dir" to package and upload the config file to the Ray cluster. The job finishes, but the result does not match expectations. The user then changes ./test.yml and reruns the code without any other modification. The result of the job submitted to the Ray cluster is the same as last time, because the cached package is reused.
  - My point: The user can make the config change take effect by moving the config file to a different path and updating the local code that submits the job, but this is a poor user experience and reduces productivity.
  - TL;DR: the following test case fails on master:
import os
import sys
import tempfile
import zipfile

import pytest
import ray
from ray.experimental.internal_kv import _internal_kv_get, _internal_kv_put

# `chdir` and the `start_cluster` fixture are assumed to come from Ray's
# test helpers; they are not defined in this snippet.


@pytest.mark.skipif(sys.platform == "win32", reason="Fail to create temp dir.")
def test_same_uri(start_cluster):
    cluster, address = start_cluster
    ray.init(address)
    with tempfile.TemporaryDirectory() as tmp_dir, chdir(tmp_dir):
        check_value1 = b"1"
        check_value2 = b"2"

        # Build two zip packages with the same name but different contents.
        with zipfile.ZipFile("test.zip", "w") as zf:
            with zf.open("test_file", "w") as f:
                f.write(check_value1)
        with open("test.zip", "rb") as f:
            f1_bytest = f.read()
        os.remove("test.zip")

        with zipfile.ZipFile("test.zip", "w") as zf:
            with zf.open("test_file", "w") as f:
                f.write(check_value2)
        with open("test.zip", "rb") as f:
            f2_bytest = f.read()
        os.remove("test.zip")

        assert f1_bytest != f2_bytest

        # Upload the first package under the URI and run a task against it.
        gcs_uri = "gcs://test.zip"
        _internal_kv_put(gcs_uri, f1_bytest, overwrite=True)
        assert _internal_kv_get(gcs_uri) == f1_bytest

        @ray.remote(runtime_env={"working_dir": gcs_uri})
        def f1():
            with open("test_file", "rb") as f:
                value = f.read()
            assert value == check_value1

        ray.get(f1.remote())

        # Overwrite the same URI with the second package; the task should
        # now see the new contents, but the cached package is used instead.
        _internal_kv_put(gcs_uri, f2_bytest, overwrite=True)
        assert _internal_kv_get(gcs_uri) == f2_bytest

        @ray.remote(runtime_env={"working_dir": gcs_uri})
        def f2():
            with open("test_file", "rb") as f:
                value = f.read()
            assert value == check_value2

        ray.get(f2.remote())
I think problem one is a bug, and problem two is very important too. I have seen people overwrite a Python package with the same name and version on our internal PyPI source to save time, not to mention overwriting an S3 file…
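To illustrate the idea in the title, here is a minimal sketch of a content-based cache key. It is not Ray's implementation: `make_zip` and `content_cache_key` are hypothetical helpers written for this example, and the sketch hashes bytes that are already in hand, leaving open how an agent would learn the digest before downloading.

```python
import hashlib
import io
import zipfile


def make_zip(payload: bytes) -> bytes:
    """Build an in-memory zip holding one file, mirroring the test case."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("test_file", payload)
    return buf.getvalue()


def content_cache_key(package_bytes: bytes) -> str:
    """Hypothetical cache key: MD5 hex digest of the package contents."""
    return hashlib.md5(package_bytes).hexdigest()


# Same URI, two different uploads (as in the scenario of problem 2).
uri = "gcs://test.zip"
first = make_zip(b"1")
second = make_zip(b"2")

key_first = content_cache_key(first)
key_second = content_cache_key(second)

# The keys differ even though the URI did not change, so a cache keyed
# this way would not serve the stale package.
print(uri, key_first != key_second)
```

Because the key depends only on the package bytes, overwriting the file behind a URI yields a new cache entry, while re-uploading identical bytes maps to the existing one.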
Use case
No response
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
@architkulkarni @edoakes For problem 2, we will discuss it internally against our use cases and share the results later.
I think it is difficult to keep the local file names human-readable, because according to the AWS doc "Creating object key names", S3 keys support some special characters, such as spaces. If we handled them one by one, it would mean a lot of work and more potential bugs…
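The trade-off in this comment can be sketched as follows. Both functions are hypothetical stand-ins written for this example, not Ray's code: `flattened_name` imitates a readable but lossy naming scheme, and `hashed_name` derives the local name from a digest of the full URI string.

```python
import hashlib
import string


def flattened_name(uri: str) -> str:
    """Hypothetical readable scheme: separators become underscores."""
    return uri.replace("://", "_").replace("/", "_")


def hashed_name(uri: str) -> str:
    """Hypothetical alternative: MD5 hex digest of the full URI string."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()


uris = [
    "s3://test/a_file.zip",
    "s3://test/a/file.zip",
    "s3://test/my file (v2).zip",  # S3 keys may contain spaces
]

for uri in uris:
    print(f"{uri!r:32} {flattened_name(uri)!r:30} {hashed_name(uri)}")

# The readable names collide for the first two URIs and keep the space
# from the third; the hashed names are distinct and contain only hex digits.
hex_only = all(c in string.hexdigits for u in uris for c in hashed_name(u))
```

A digest-based name gives up readability, but it avoids handling each special character separately.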