repro: invalid argument quoting during dictionary interpolation

During dictionary interpolation, values are always enclosed in single quotes. On Windows this always results in an error because the cmd shell does not treat single quotes as quoting characters. On Linux it results in an error at least when the value itself contains a single quote.

Reproduce

A testing dvc.yaml:

vars:
  - param:
    param_1:
      param: testing quotes
    param_2:
      param: test'n quotes
stages:
  test_quotes_1: # This fails on Windows
    cmd: python -c "import sys; assert sys.argv[2] == 'testing quotes', sys.argv[2]" ${param_1} && echo Passed || echo Failed
  test_quotes_2: # This fails on both Windows and Linux
    cmd: python -c "import sys; assert sys.argv[2] == 'test\'n quotes', sys.argv[2]" ${param_2} && echo Passed || echo Failed

1. dvc init --no-scm
2. Create a dvc.yaml as above
3. dvc repro

On Windows: both stages print “Failed”.
On Linux: test_quotes_1 prints “Passed”; test_quotes_2 fails to execute due to a bash syntax error.
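
The Linux failure can be illustrated without DVC. The sketch below rests on stated assumptions: naive_single_quote is a hypothetical stand-in for the quoting described in this report (it is not DVC's actual code), prog is a placeholder command name, and shlex.split only approximates how a POSIX shell tokenizes a command line.

```python
import shlex


def naive_single_quote(value: str) -> str:
    # Hypothetical stand-in: wrap the value in single quotes, escape nothing.
    return f"'{value}'"


def parses_as_posix(command: str) -> bool:
    # shlex.split raises ValueError when a quote is left unterminated.
    try:
        shlex.split(command)
        return True
    except ValueError:
        return False


value = "test'n quotes"
naive_cmd = f"prog --param {naive_single_quote(value)}"
safe_cmd = f"prog --param {shlex.quote(value)}"

print(parses_as_posix(naive_cmd))  # False: the embedded quote ends the string early
print(shlex.split(safe_cmd))       # ['prog', '--param', "test'n quotes"]
```

With shlex.quote the value survives as a single argument, which is what the assertion in test_quotes_2 expects.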

Expected

Both stages print “Passed”

Environment information

Output of dvc doctor:

Windows:

DVC version: 2.27.2 (pip)
---------------------------------
Platform: Python 3.9.7 on Windows-10-10.0.19044-SP0
Subprojects:
        dvc_data = 0.10.0
        dvc_objects = 0.4.0
        dvc_render = 0.0.11
        dvc_task = 0.1.2
        dvclive = 0.10.0
        scmrepo = 0.1.1
Supports:
        http (aiohttp = 3.7.4.post0, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.7.4.post0, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2022.8.2, boto3 = 1.24.78)
Cache types: hardlink
Cache directory: NTFS on C:\
Caches: local
Remotes: local
Workspace directory: NTFS on C:\
Repo: dvc, git

Linux:

DVC version: 2.27.2 (pip)
---------------------------------
Platform: Python 3.9.7 on Linux-4.14.268-205.500.amzn2.x86_64-x86_64-with-glibc2.26
Subprojects:
        dvc_data = 0.10.0
        dvc_objects = 0.4.0
        dvc_render = 0.0.11
        dvc_task = 0.1.2
        dvclive = 0.11.0
        scmrepo = 0.1.1
Supports:
        http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2022.8.2, boto3 = 1.24.59)
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Workspace directory: xfs on /dev/xvda1
Repo: dvc (no_scm)

Issue Analytics

  • State:closed
  • Created a year ago
  • Reactions:2
  • Comments:8 (5 by maintainers)

Top GitHub Comments

2 reactions
aparpara commented, Sep 28, 2022

@daavoo, unfortunately different shells use different parsing rules, so the problem can’t be solved in a platform-independent way in principle. For UNIX-like shells the proper way is to use shlex.quote().

For Windows one can use subprocess.list2cmdline. It is an internal, undocumented function, but I don’t know a better option, and the algorithm is not trivial (see also “How to escape os.system() calls?”). Note that for os.system(), subprocess.list2cmdline is not sufficient: one should also escape special characters. But if you always use the subprocess module, it is enough.
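
For illustration, here is roughly what the two functions return for the values from the report. This is a minimal sketch rather than a proposed patch. Both functions live in the standard library and can be called on any platform, although subprocess.list2cmdline is undocumented, so its behavior is not a guaranteed interface.

```python
import shlex
import subprocess

values = ["testing quotes", "test'n quotes"]

# POSIX shells: wrap in single quotes and splice embedded ones as '"'"'.
posix_quoted = [shlex.quote(v) for v in values]

# Windows (MS C runtime parsing rules): wrap in double quotes when the
# argument contains whitespace; single quotes need no special handling.
windows_quoted = [subprocess.list2cmdline([v]) for v in values]

for value, posix, windows in zip(values, posix_quoted, windows_quoted):
    print(f"{value!r}: posix={posix} windows={windows}")
# 'testing quotes': posix='testing quotes' windows="testing quotes"
# "test'n quotes": posix='test'"'"'n quotes' windows="test'n quotes"
```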

1 reaction
aparpara commented, Sep 28, 2022

Let me clarify. Dict unpacking is not just for fun. The result is usually executed, so it is essential that the resulting command line is well-formed from the point of view of the shell in which it is going to be executed, and that this command line produces exactly the expected result. Different shells use different syntax, so if you pass the same command line to different shells, it may fail to execute or may execute in an unexpected way. I’d understand if you just inserted values without modification. In that case users would have to insert quotes etc. themselves, but at least the result would be easily predictable. But you are trying to modify values: backslash-escape double quotes and add surrounding quotes. On Windows this may lead to utterly puzzling results. E.g. a single backslash will convert to a single double-quote!

vars:
  - param:
    param_3:
      param: '\'
stages:
  test_quotes_3:
    cmd: python -c "import sys; assert sys.argv[2] == '\\', sys.argv[2]" ${param_3} && echo Passed || echo Failed

This gets:

>dvc repro
Running stage 'test_quotes_3':
> python -c "import sys; assert sys.argv[2] == '\\', sys.argv[2]" --param "\" && echo Passed || echo Failed
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AssertionError: "
Failed
Use `dvc push` to send your updates to remote storage.

The algorithm to escape quotes and backslashes correctly is different on Windows and Linux. Fortunately, it is already implemented, so you don’t need to invent it. Just use shlex.quote on Linux and subprocess.list2cmdline on Windows. If you want to understand why your approach fails on Windows, you can consult “A Better Way To Understand Quoting and Escaping of Windows Command Line Arguments”.
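
One possible shape for that suggestion is sketched below. The helper quote_arg and its windows parameter are hypothetical names introduced here for illustration and are not part of DVC. As noted in the earlier comment, list2cmdline alone does not escape cmd metacharacters, so this only covers commands launched through the subprocess module.

```python
import os
import shlex
import subprocess


def quote_arg(value: str, windows: bool = os.name == "nt") -> str:
    """Quote one interpolated value for the target platform (hypothetical helper)."""
    if windows:
        # Undocumented stdlib helper that follows the MS C runtime rules.
        return subprocess.list2cmdline([value])
    return shlex.quote(value)


# The single-backslash case from the example above.
print(quote_arg("\\", windows=True))   # \    (no whitespace, so left unquoted)
print(quote_arg("\\", windows=False))  # '\'  (literal inside single quotes)

# A trailing backslash is doubled so it cannot escape the closing quote.
print(quote_arg("dir with space\\", windows=True))  # "dir with space\\"
```

Passing windows explicitly makes both branches easy to exercise from one machine; the default follows the platform the code runs on.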
