Can't create indexes from un-seekable fileobj's


In theory, it should be possible to create indexes from un-seekable archive fileobjs (for example, by using peek instead of read to check file headers). I was able to get this to work by modifying an older version of ratarmountcore here (https://github.com/codalab/codalab-worksheets/pull/4212/files#diff-ad5ad76eb55b6437b2e1aa24b324c6d11d176be1926ebea1950a006bfce4efbe). It would be nice if we could do the same for the current version of ratarmountcore, though the current version seems to rely a lot more on seek.
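To illustrate the peek idea, here is a minimal sketch using only the Python standard library. It is not code from ratarmountcore or from the linked pull request; `NonSeekableReader` and `looks_like_gzip` are hypothetical names made up for this example, and the in-memory stream merely stands in for a pipe or socket.

```python
import gzip
import io


class NonSeekableReader(io.RawIOBase):
    """Hypothetical stand-in for a pipe or socket: readable, but refuses to seek."""

    def __init__(self, data: bytes):
        self._inner = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, buffer) -> int:
        chunk = self._inner.read(len(buffer))
        buffer[:len(chunk)] = chunk
        return len(chunk)


GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(stream: io.BufferedReader) -> bool:
    # peek() returns buffered bytes without advancing the read position,
    # so the header is still there for whoever reads the stream next.
    # It may return fewer bytes than requested if the raw read is short.
    return stream.peek(len(GZIP_MAGIC))[:len(GZIP_MAGIC)] == GZIP_MAGIC


payload = b"hello from an unseekable stream"
stream = io.BufferedReader(NonSeekableReader(gzip.compress(payload)))

is_gzip = looks_like_gzip(stream)
# The header was only peeked at, so decompression starts from byte 0.
decompressed = gzip.GzipFile(fileobj=stream, mode="rb").read()
```

With `read` instead of `peek`, the two magic bytes would be consumed and the decompressor would need a `seek(0)` to recover them, which is exactly what an un-seekable fileobj cannot do.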

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

mxmlnkn commented, Aug 17, 2022 (1 reaction)

> Looks like we might need to resolve pauldmccarthy/indexed_gzip#102 first though…

I was going to write something like that before you did 😉.

And it’s not only indexed_gzip. I want to swap indexed_gzip for pragzip sometime in the future, and pragzip might have the same issue. The same goes for bz2 and zstd. I know that you probably only need gzip, but it would feel like a bug if only very specific file formats were supported.

In the end, I agree that it might be useful, but it seems difficult to implement alongside all of ratarmount’s features. It might be implemented as a separate “function” on top of ratarmountcore, because you wouldn’t even need FUSE for that. I imagine something like wget -O- remote.tar.gz | tee downloaded.tar.gz | ratarmount --index-file downloaded.tar.gz.index. Theoretically, detecting that data is being written to stdin could be enough to enter that alternate mode, but we might also trigger it explicitly with something akin to a --create-index-from-stream option. The ratarmount CLI is getting kind of clunky, and I want to redesign it into something akin to git, with subcommands where that makes sense. I haven’t thought about it in depth yet; so far I have only categorized the options in the help output.
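The tee-then-index pipeline imagined above can be sketched in pure Python with the standard library's tarfile stream mode. This is only an illustration under stated assumptions, not ratarmount's implementation or index format: `TeeReader` and `index_tar_stream` are hypothetical names, the "index" is just a list of tuples, and the recorded offsets refer to the decompressed tar stream, so this does not cover the seek points inside the gzip stream that a backend such as indexed_gzip would need.

```python
import gzip
import io
import tarfile


class TeeReader(io.RawIOBase):
    """Hypothetical helper: forwards reads and copies every byte to a sink."""

    def __init__(self, source, sink):
        self._source = source
        self._sink = sink

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, buffer):
        chunk = self._source.read(len(buffer))
        self._sink.write(chunk)
        buffer[:len(chunk)] = chunk
        return len(chunk)


def index_tar_stream(stream):
    """Collects (name, header offset, data offset, size) in a single pass."""
    entries = []
    # Mode "r|*" tells tarfile to treat the input as a forward-only stream.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            entries.append((member.name, member.offset, member.offset_data, member.size))
    return entries


def build_sample_archive():
    """Creates a small .tar.gz in memory to play the role of the download."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz", format=tarfile.USTAR_FORMAT) as archive:
        for name, content in (("a.txt", b"alpha"), ("b.txt", b"bravo!")):
            info = tarfile.TarInfo(name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


archive_bytes = build_sample_archive()
saved_copy = io.BytesIO()  # plays the role of downloaded.tar.gz
tee = TeeReader(io.BytesIO(archive_bytes), saved_copy)

index = index_tar_stream(tee)
# tarfile may stop at the end-of-archive marker, so drain what is left
# to make sure the saved copy is complete.
while tee.read(65536):
    pass

# The recorded offsets locate file contents in the decompressed copy.
tar_bytes = gzip.decompress(saved_copy.getvalue())
name, _, data_offset, size = index[0]
first_file = tar_bytes[data_offset:data_offset + size]
```

The point of the design is that the archive is read exactly once: the consumer never seeks, and the copy on disk is what a later mount would open together with the index.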

Currently, I won’t be able to work on this in a timely manner. I will review PRs though if you find the time to take a deeper look into it.

Edit: One of the problematic ratarmount options for streaming support could be --recursive, because it might try to seek back. But, as with unsupported file formats, ratarmount could check whether the alternate mode has been activated and print an error message.
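A check of that kind could look roughly like the following argparse sketch. It is not ratarmount's actual argument parser: the parser here is a stand-in, --create-index-from-stream is only the hypothetical option floated above, and the list of seek-dependent options is an assumption that contains just --recursive.

```python
import argparse

# Assumption for this sketch: options that may need to seek backwards.
SEEK_DEPENDENT_OPTIONS = ("recursive",)


def build_parser():
    # Stand-in parser with only the two options relevant to the check.
    parser = argparse.ArgumentParser(prog="index-from-stream-sketch")
    parser.add_argument("--create-index-from-stream", action="store_true")
    parser.add_argument("--recursive", action="store_true")
    return parser


def find_stream_conflicts(options):
    """Lists the given options that cannot work in the streaming mode."""
    if not options.create_index_from_stream:
        return []
    return [
        "--" + name.replace("_", "-")
        for name in SEEK_DEPENDENT_OPTIONS
        if getattr(options, name)
    ]


def parse_or_fail(argv):
    parser = build_parser()
    options = parser.parse_args(argv)
    conflicts = find_stream_conflicts(options)
    if conflicts:
        # parser.error() prints the message to stderr and exits with status 2.
        parser.error("not supported when indexing from a stream: " + ", ".join(conflicts))
    return options
```

Calling `parse_or_fail(["--create-index-from-stream", "--recursive"])` would exit with an error, while either option on its own is accepted.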

epicfaace commented, Oct 3, 2022 (0 reactions)

> Currently, I won’t be able to work on this in a timely manner. I will review PRs though if you find the time to take a deeper look into it.

That works!! Thanks. By the way, https://github.com/pauldmccarthy/indexed_gzip/issues/102 is now resolved.
