Filenames with unusual characters break check_executables_have_shebangs

The check_executables_have_shebangs hook gets its file paths from the output of the following git command:

git ls-files --stage -- path1 path2

I’m not sure about Linux, but on Windows, when the filename contains an unusual character (sorry about the choice of “unusual” to describe the character, I don’t want to speculate on encoding issues), git escapes the character’s bytes as backslash-prefixed octal sequences and wraps the file path in double quotes (because of the backslashes, I guess):

$ git ls-files --stage -- tests/demo/doc/mañana.txt tests/demo/doc/manana.txt
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0       tests/demo/doc/manana.txt
100644 5ab9dcdd36df0f76f202227ecb7ae8a5baaa456b 0       "tests/demo/doc/ma\303\261ana.txt"

The resulting path variable becomes "tests/demo/doc/ma\303\261ana.txt", and then the script tries to open it, which fails of course because of the quotes and escaping:

Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\user\.cache\pre-commit\repofzzr3u3t\py_env-python3\Scripts\check-executables-have-shebangs.EXE\__main__.py", line 7, in <module>
  File "c:\users\user\.cache\pre-commit\repofzzr3u3t\py_env-python3\lib\site-packages\pre_commit_hooks\check_executables_have_shebangs.py", line 66, in main
    return check_executables(args.filenames)
  File "c:\users\user\.cache\pre-commit\repofzzr3u3t\py_env-python3\lib\site-packages\pre_commit_hooks\check_executables_have_shebangs.py", line 17, in check_executables
    return _check_git_filemode(paths)
  File "c:\users\user\.cache\pre-commit\repofzzr3u3t\py_env-python3\lib\site-packages\pre_commit_hooks\check_executables_have_shebangs.py", line 36, in _check_git_filemode
    has_shebang = _check_has_shebang(path)
  File "c:\users\user\.cache\pre-commit\repofzzr3u3t\py_env-python3\lib\site-packages\pre_commit_hooks\check_executables_have_shebangs.py", line 45, in _check_has_shebang
    with open(path, 'rb') as f:
OSError: [Errno 22] Invalid argument: '"tests/demo/doc/ma\\303\\261ana.txt"'

To fix the quotes issue, the pre-commit script could try to remove them with a .strip('"') call.

For the character escaping, I have no idea! Do you 😄 ?
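
One possible way to undo the escaping, as a rough sketch: the \303\261 sequences in the output above are the octal values of the UTF-8 bytes of "ñ", so they can be turned back into bytes and decoded. The helper name unquote_git_path is hypothetical, and the set of simple escapes handled here is an assumption about what git emits; it is not taken from pre-commit-hooks.

```python
import re

# Assumed subset of the simple C-style escapes git may emit in a quoted path.
_SIMPLE_ESCAPES = {b'n': b'\n', b't': b'\t', b'"': b'"', b'\\': b'\\'}
# A backslash followed by either three octal digits (one byte) or one character.
_ESCAPE_RE = re.compile(rb'\\([0-3][0-7]{2}|.)')


def unquote_git_path(raw):
    """Undo git's path quoting; unquoted paths are returned unchanged.

    Hypothetical helper for illustration only.
    """
    if not (len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')):
        return raw

    def _replace(match):
        token = match.group(1)
        if len(token) == 3:
            # Three octal digits encode a single byte of the original name.
            return bytes([int(token, 8)])
        # Unknown single-character escapes are kept as the bare character.
        return _SIMPLE_ESCAPES.get(token, token)

    inner = raw[1:-1].encode('utf-8')
    # Assumes the file name itself is UTF-8 encoded.
    return _ESCAPE_RE.sub(_replace, inner).decode('utf-8')


print(unquote_git_path('"tests/demo/doc/ma\\303\\261ana.txt"'))
```

A plain .strip('"') would only handle the quotes and leave the backslash sequences in the path, so open() would still fail on it.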

Ref: https://github.com/copier-org/copier/pull/224#issuecomment-665079662

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

asottile commented, Jul 28, 2020 (1 reaction)

you’ll want to add -z to the command above, and then instead of splitlines you’ll use zsplit (similar to this)

feel free to copy that implementation here if there isn’t already something doing similar in pre-commit-hooks
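
A minimal sketch of that suggestion, assuming the usual "mode sha stage<TAB>path" record layout of git ls-files --stage: with -z, git ends each record with a NUL byte and prints paths verbatim, without quotes or escapes. The zsplit below is written from the description above, and the names parse_ls_files_stage and staged_file_modes are hypothetical, not the actual pre-commit-hooks code.

```python
import subprocess


def zsplit(s):
    """Split NUL-delimited output, ignoring leading and trailing NULs."""
    s = s.strip('\0')
    return s.split('\0') if s else []


def parse_ls_files_stage(output):
    """Return (mode, path) pairs from `git ls-files --stage -z` output."""
    entries = []
    for record in zsplit(output):
        # The metadata and the path are separated by a single tab.
        metadata, path = record.split('\t', 1)
        mode, _sha, _stage = metadata.split()
        entries.append((mode, path))
    return entries


def staged_file_modes(paths):
    """Run git for the given paths (hypothetical wrapper, assumes UTF-8 names)."""
    output = subprocess.check_output(
        ('git', 'ls-files', '--stage', '-z', '--') + tuple(paths),
    ).decode('utf-8')
    return parse_ls_files_stage(output)
```

Because the path is never quoted in this mode, it can be passed straight to open() with no stripping or unescaping.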

asottile commented, Jul 28, 2020 (1 reaction)

would you be interested in working on a patch to fix this?
