[RFC] Wrapper/Proxy for Stream


🚀 Feature

I discussed a wrapper class for all streams with @NivekT. The trade-offs are listed below.

Pros:

  • We can add a __del__ method that closes the file stream automatically when the wrapper's reference count drops to 0. That would eliminate the warnings about unclosed streams.
  • A wrapper class can unify the reading API for file streams. (For OnDiskCache, I would prefer a single API for reading a stream; otherwise I have to handle every case separately.)
    • For a local file stream, we can call read() to load everything into memory.
    • When stream=True is set for a large file, requests.Response does not support read(). It only supports reading chunk by chunk through iter_content or __iter__.

Cons:

  • As @NivekT pointed out, a wrapper needs extra care with magic methods.
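
To make the trade-off concrete, here is a minimal sketch of such a wrapper. The class name StreamWrapper and the attribute file_obj are assumptions made for illustration, not an API confirmed by this issue. The sketch forwards ordinary attribute access through __getattr__, closes the stream in __del__, and spells out a few magic methods by hand, because Python looks up special methods on the type and never routes those implicit lookups through __getattr__.

```python
import io


class StreamWrapper:
    """Hypothetical proxy around a file-like object (illustrative names only)."""

    def __init__(self, file_obj):
        self.file_obj = file_obj

    def __getattr__(self, name):
        # Only called when normal lookup fails, so read(), close(), etc.
        # are forwarded to the wrapped stream.
        if name == "file_obj":
            # Avoid infinite recursion if __init__ never set the attribute.
            raise AttributeError(name)
        return getattr(self.file_obj, name)

    # Implicit special-method lookups (iter(x), `with x:`) bypass __getattr__,
    # so every magic method that should work must be forwarded explicitly.
    def __iter__(self):
        return iter(self.file_obj)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()  # forwarded to file_obj.close() via __getattr__

    def __del__(self):
        # Best-effort close when the wrapper is garbage collected.
        try:
            self.file_obj.close()
        except Exception:
            pass


raw = io.BytesIO(b"hello")
wrapped = StreamWrapper(raw)
data = wrapped.read()  # forwarded through __getattr__
del wrapped            # on CPython, __del__ runs as the last reference goes away
```

Note that prompt closing relies on reference counting as implemented in CPython. Other interpreters may run __del__ later, so the context-manager form remains the more predictable way to close the stream.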

Reference: https://github.com/pytorch/data/pull/35#discussion_r728201731, https://github.com/pytorch/data/pull/65#discussion_r730117933

cc: @VitalyFedyunin

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
ejguan commented, Oct 22, 2021

In the case of Extractor or Decompressor, the yielded file handle should also be wrapped by the wrapper, so that the fd is closed automatically and exposes a unified API such as read.

1 reaction
ejguan commented, Oct 22, 2021

You are right about Saver. I am talking about reading data from a file stream. The current workflow is:

urls = IterableWrapper([URL])
fd = urls.open_url() # File handles
data = fd.map(fn=lambda x: x.read(), input_col=1) # <- This step is what I think of as the Downloader
file = data.save_to_disk()

We can let users apply any map function to download data from a file handle. But if we are going to implement a DataPipe that does the same thing, we need to make sure every file handle (stream) sent to that DataPipe can be read.

The reason I prefer a read method is that the stream type varies:

  • String: "".join(fd)
  • Bytes: b"".join(fd)
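
The two cases above can be folded into one helper, sketched here under stated assumptions. The function name read_all is hypothetical, and plain iterators stand in for a streamed response that only supports chunk-by-chunk iteration. The helper prefers read() when the stream has one and otherwise joins the chunks with a separator of the matching type.

```python
import io


def read_all(stream):
    """Return the whole content of `stream` as str or bytes.

    Hypothetical helper: uses read() when available, otherwise joins
    the chunks produced by iterating over the stream.
    """
    read = getattr(stream, "read", None)
    if callable(read):
        return read()
    chunks = list(stream)
    if not chunks:
        # Assumption: an empty chunked stream is treated as empty bytes,
        # since its element type cannot be inferred.
        return b""
    # Pick a separator of the same type as the chunks (str or bytes).
    empty = "" if isinstance(chunks[0], str) else b""
    return empty.join(chunks)


text = read_all(io.StringIO("abc"))        # file-like object with read()
blob = read_all(iter([b"ab", b"c"]))       # chunk iterator without read()
```

Putting this logic behind a single read method on the wrapper would spare a downstream DataPipe from checking the stream type itself.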
