CommandException: No URLs matched when passing URLs to rm from stdin

I’m trying to use gsutil rm -I and pass a list of URLs to delete through stdin.

For an existing directory in, say, gs://test-bucket/test-dir, these are some commands I’ve tried:

# verify directory exists
$ gsutil ls -d gs://test-bucket/test-dir
gs://test-bucket/test-dir/

$ echo "gs://test-bucket/test-dir" | gsutil -m rm -r -I
CommandException: No URLs matched

$ echo gs://test-bucket/test-dir | gsutil -m rm -r -I
CommandException: No URLs matched

$ gsutil -m rm -r -I <<< "gs://test-bucket/test-dir"
CommandException: No URLs matched

$ gsutil ls -d gs://test-bucket/test-dir | gsutil -m rm -r -I
CommandException: No URLs matched

Am I missing something here?

Issue Analytics

  • State: open
  • Created: 6 years ago
  • Reactions: 1
  • Comments: 15

Top GitHub Comments

7 reactions
yaseenkhanmohmand commented, Sep 20, 2018

Seeing the same problem. Any suggested solutions?

2 reactions
dilipped commented, Feb 18, 2021

Sorry for the delay in response. Our team is currently occupied with other priorities and does not have the bandwidth to address this issue at the moment. However, I did some investigation for future reference.

This seems to happen because url_strs is iterated twice: once here https://github.com/GoogleCloudPlatform/gsutil/blob/d8626ae0ec4b4dc9fd729f115cdeefced4680cb5/gslib/commands/rm.py#L269 if recursion is requested, and again when it is passed to the NameExpansionIterator https://github.com/GoogleCloudPlatform/gsutil/blob/d8626ae0ec4b4dc9fd729f115cdeefced4680cb5/gslib/commands/rm.py#L288

So essentially, we iterate over the same iterator twice, and the second pass finds it already exhausted, which is why no URLs match.
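The following is a minimal sketch of that behavior, not gsutil's actual code. The stdin_iterator helper and the fake_stdin stream are hypothetical stand-ins for StdinIterator and the piped input; the two passes only mimic the recursion check and the name expansion step.

```python
import io


def stdin_iterator(stream):
    """Yield one stripped URL per line; a generator can be consumed only once."""
    for line in stream:
        yield line.strip()


# Hypothetical stand-in for the URLs piped in on stdin.
fake_stdin = io.StringIO("gs://test-bucket/test-dir\n")
url_strs = stdin_iterator(fake_stdin)

# First pass: analogous to the recursion special case scanning the URLs.
first_pass = [url for url in url_strs]

# Second pass: analogous to name expansion; the generator is already exhausted.
second_pass = [url for url in url_strs]

print(first_pass)
print(second_pass)
```

The second list comes back empty, which matches the "No URLs matched" symptom when -r and -I are combined.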

The easy fix would be to convert the iterator to a list, i.e., changing https://github.com/GoogleCloudPlatform/gsutil/blob/d8626ae0ec4b4dc9fd729f115cdeefced4680cb5/gslib/commands/rm.py#L252 to

url_strs = [url for url in StdinIterator()]

But this change can affect users who have a really long list coming from stdin, or users who already use this feature in a pipeline without combining -r and -I, because the whole list would be read into memory before any deletion starts. Note that the bug itself only affects you if you use -r and -I together.
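A rough illustration of that trade-off, again using a hypothetical stdin_iterator rather than gsutil's StdinIterator: the read_log list is only there to show how many lines have been pulled from the input at each point.

```python
import io

LINES = "gs://test-bucket/a\ngs://test-bucket/b\ngs://test-bucket/c\n"


def stdin_iterator(stream, read_log):
    """Yield stripped lines and record each read in read_log (illustration only)."""
    for line in stream:
        read_log.append(line.strip())
        yield line.strip()


# Lazy: asking for the first URL reads only one line of input.
lazy_log = []
lazy = stdin_iterator(io.StringIO(LINES), lazy_log)
first_lazy = next(lazy)
lazy_reads = len(lazy_log)

# Eager: list() drains the whole input before any URL is processed.
eager_log = []
url_strs = list(stdin_iterator(io.StringIO(LINES), eager_log))
eager_reads = len(eager_log)

# The list can be walked as many times as needed.
first_pass = list(url_strs)
second_pass = list(url_strs)
```

With the list, both passes see every URL, at the cost of holding the full input in memory and losing the streaming behavior of the lazy version.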

The ideal fix would be to remove the recursion special case and instead handle the bucket deletion based on the NameExpansionIterator result itself.

A workaround is suggested here: https://github.com/GoogleCloudPlatform/gsutil/issues/490#issuecomment-364611242

Alternatively, you can avoid recursion (the -r option) and pass in the full list of objects instead:

gsutil ls gs://my_bucket/** | gsutil -m rm -I

Note that the above command empties the bucket but does not remove the bucket itself; you will have to run a separate command to remove it.
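For scripting the two steps, a sketch along these lines might work. The function names build_commands and empty_and_remove are hypothetical, the bucket name is a placeholder, and gsutil rb is assumed to be the separate bucket-removal command meant above. Passing argument lists to subprocess keeps the shell from expanding the ** wildcard.

```python
import subprocess


def build_commands(bucket):
    """Return argv lists for listing objects, deleting them, and removing the bucket."""
    url = f"gs://{bucket}"
    list_cmd = ["gsutil", "ls", f"{url}/**"]
    delete_cmd = ["gsutil", "-m", "rm", "-I"]  # no -r, so stdin is read only once
    remove_bucket_cmd = ["gsutil", "rb", url]
    return list_cmd, delete_cmd, remove_bucket_cmd


def empty_and_remove(bucket):
    """Empty the bucket through stdin, then remove the bucket (untested sketch)."""
    list_cmd, delete_cmd, remove_bucket_cmd = build_commands(bucket)
    # Assumption: the listing succeeds; an empty bucket may make gsutil ls fail.
    listing = subprocess.run(list_cmd, check=True, capture_output=True, text=True)
    if listing.stdout.strip():
        subprocess.run(delete_cmd, input=listing.stdout, check=True, text=True)
    subprocess.run(remove_bucket_cmd, check=True)
```

Calling empty_and_remove("my_bucket") would run the equivalent of the pipeline above followed by the bucket removal; it deletes data, so try it on a test bucket first.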
