Long filenames caused by long reddit submission titles cause problems on Windows and Linux (NFS) file systems
(NSFW) Warning :: The reddit link listed below for testing is beautiful but is certainly NSFW.
- Ripme version: 1.7.95 (on CentOS 7) & 2.0.1-4-9a05be80 (on Win10)
- Java version:
-> java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)
F:\rips\ripme\build\libs>java -version
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)
- Operating system:
-> uname -a
Linux localhost.localdomain 3.10.0-1160.31.1.el7.x86_64 #1 SMP Thu Jun 10 13:32:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
-> cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
F:\rips\ripme\build\libs>ver
Microsoft Windows [Version 10.0.19043.1110]
- Exact URL you were trying to rip when the problem occurred:
:: (NSFW) Warning :: https://www.reddit.com/user/BlondEvl/comments/osveef/does_anyone_remember_how_in_school_we_were_taught/ :: (NSFW) Warning ::
- Please include any additional information about how to reproduce the problem:
Ripping any number of BlondEvl’s posts will cause this filename too long problem. I can come up with more URLs if needed.
Expected Behavior
Rip the reddit post even with a crazy long submission title, truncating the saved filenames to some maximum length, perhaps configurable. This one seems to hit 256 characters with 255 being the max, I think. If there’s a more graceful way of detecting the maximum length, maybe do that. I found this other issue, which may have prior art to draw from: https://github.com/RipMeApp/ripme/issues/369
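To make the request concrete, here is a minimal sketch of the kind of truncation I have in mind. It is not ripme's actual code: the class `FilenameShortener`, the method `shorten`, and the constant `DEFAULT_MAX_BYTES` are made-up names for illustration. It assumes the limit applies to a single filename measured in UTF-8 bytes (255 is the usual per-name limit on common Linux file systems; a NAS or Windows may differ), and it keeps the file extension intact.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ripme.
final class FilenameShortener {
    // Assumed default: 255 bytes per filename; would ideally be configurable.
    static final int DEFAULT_MAX_BYTES = 255;

    // Returns fileName unchanged if it fits, otherwise trims the stem so that
    // stem + extension encodes to at most maxBytes bytes of UTF-8.
    public static String shorten(String fileName, int maxBytes) {
        if (utf8Length(fileName) <= maxBytes) {
            return fileName;
        }
        int dot = fileName.lastIndexOf('.');
        String ext = dot > 0 ? fileName.substring(dot) : "";      // includes the dot
        String stem = dot > 0 ? fileName.substring(0, dot) : fileName;

        int budget = maxBytes - utf8Length(ext);
        if (budget < 1) {
            throw new IllegalArgumentException("maxBytes too small for extension " + ext);
        }

        // Copy whole code points until the next one would exceed the budget,
        // so a multi-byte character is never cut in half.
        StringBuilder kept = new StringBuilder();
        int used = 0;
        for (int i = 0; i < stem.length(); ) {
            int cp = stem.codePointAt(i);
            int size = utf8Length(new String(Character.toChars(cp)));
            if (used + size > budget) {
                break;
            }
            kept.appendCodePoint(cp);
            used += size;
            i += Character.charCount(cp);
        }
        return kept.append(ext).toString();
    }

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Shorten each name given on the command line and print the result.
        for (String name : args) {
            System.out.println(shorten(name, DEFAULT_MAX_BYTES));
        }
    }
}
```

Measuring bytes rather than characters is a deliberate choice in this sketch, since titles with emoji or accented letters take more than one byte per character. A caller would pass only the filename, not the full path, so any Windows total-path limit would need a separate check.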
This reddit user tends to have realllllly long submission titles that cause problems for ripme. They’re creative but they cause problems. 😃
Actual Behavior
When ripping a reddit URL for a post with a really long submission title, ripme produces a filename that is longer than the allowed number of characters on both Windows and Linux (at least on an NFS mounted file system).
Windows example:
F:\rips\ripme\build\libs>java -jar ripme-2.0.1-4-9a05be80.jar -u "https://www.reddit.com/user/BlondEvl/comments/osveef/does_anyone_remember_how_in_school_we_were_taught/"
20:38:07.832 [main] ERROR com.rarchives.ripme.utils.Utils - Exception:
java.io.IOException: The parameter is incorrect
    at java.io.WinNTFileSystem.canonicalize0(Native Method) ~[?:?]
    at java.io.WinNTFileSystem.canonicalize(WinNTFileSystem.java:438) ~[?:?]
    at java.io.File.getCanonicalPath(File.java:626) ~[?:?]
    at com.rarchives.ripme.utils.Utils.removeCWD(Utils.java:332) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.DownloadFileThread.<init>(DownloadFileThread.java:46) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.AlbumRipper.addURLToDownload(AlbumRipper.java:81) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.AbstractRipper.addURLToDownload(AbstractRipper.java:352) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.AbstractRipper.addURLToDownload(AbstractRipper.java:356) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.AbstractRipper.addURLToDownload(AbstractRipper.java:360) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.AbstractRipper.addURLToDownload(AbstractRipper.java:378) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.rippers.RedditRipper.handleURL(RedditRipper.java:291) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.rippers.RedditRipper.parseJsonChild(RedditRipper.java:203) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.rippers.RedditRipper.getAndParseAndReturnNext(RedditRipper.java:106) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.ripper.rippers.RedditRipper.rip(RedditRipper.java:81) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.App.rip(App.java:104) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.App.ripURL(App.java:297) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.App.handleArguments(App.java:280) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
    at com.rarchives.ripme.App.main(App.java:79) [ripme-2.0.1-4-9a05be80.jar:2.0.1-4-9a05be80]
20:38:07.905 [pool-2-thread-1] ERROR com.rarchives.ripme.utils.Utils - The filename osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accurately_determine_eye_color_besides_blue_and_brown.-96ornzh9ntd71.jpg is to long to be saved on this file system.
Here’s an example of trying to create the same file with echo and output redirection to a filename just like ripme tried to create:
F:\rips\ripme\build\libs>echo foo > osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accurately_determine_eye_color_besides_blue_and_brown.-96ornzh9ntd71.jpg
The filename, directory name, or volume label syntax is incorrect.
Linux example: (/home/ripme/archive is an NFS mounted file system from a NAS)
-> java -jar ripme-1.7.95-jar-with-dependencies.jar -l /home/ripme/archive/testrip -H /dev/null -u "https://www.reddit.com/user/BlondEvl/comments/osveef/does_anyone_remember_how_in_school_we_were_taught/"
Loaded /home/ripme/.config/ripme/rip.properties
Setting locale to en_US
Loaded log4j.properties
Initialized ripme v1.7.95
Set history file to /dev/null
Loading history from /home/ripme/.config/ripme/history.json
[+] Creating directory: /home/ripme/archive/testrip/reddit_user_BlondEvl
Trying to load cookies from config for www.reddit.com
Trying to load cookies from config for reddit.com
Downloading file: https://i.redd.it/96ornzh9ntd71.jpg Retry #1
The filename osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accurately_determine_eye_color_besides_blue_and_brown.-96ornzh9ntd71.jpg is to long to be saved on this file system.
Shortening filename osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accurajpg
[!] Exception while downloading file: https://i.redd.it/96ornzh9ntd71.jpg - /home/ripme/archive/testrip/reddit_user_BlondEvl/osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accura.jpg (No such file or directory)
Downloading file: https://i.redd.it/96ornzh9ntd71.jpg Retry #2
Exception in thread "pool-1-thread-1" java.lang.NullPointerException
    at com.rarchives.ripme.ripper.DownloadFileThread.run(DownloadFileThread.java:257)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
The 1.7.95 version (from the main ripme github project) seems to attempt to shorten the filename per the output but it’s still making the NFS mounted file system angry. I only play file system engineer on TV so I really have no clue what the limits are. I think the limit is 255 characters for the filename. Looks like we’re hitting just over that.
Here’s the character count for the 1.7.95 version after it attempts to shorten the filename:
-> echo osveef-Does_anyone_remember_how_in_school_we_were_taught_eye_color_is_determined_by_this_one_gene_with_2_alleles__brown_dominant_over_blue__Yet_eye_color_is_a_polygenic_trait__currently_over_60_genes_are_linked_to_it._Maybe_soon_DNA_tests_could_accura.jpg | wc -m
256
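One caveat on that count: `echo` appends a trailing newline and `wc -m` counts it, so the shortened name itself would be 255 characters, which puts it right at the usual limit and not over it. To take the newline out of the picture, here is a small sketch that measures a name directly, in both characters and UTF-8 bytes (the per-name limit on common Linux file systems is in bytes; what the NAS behind the NFS mount enforces is something I have not checked). `NameLength` and its methods are made-up names for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical measuring helper, not part of ripme.
final class NameLength {
    // Number of Unicode code points in the name (no trailing newline involved).
    public static int chars(String name) {
        return name.codePointCount(0, name.length());
    }

    // Number of bytes the name occupies when encoded as UTF-8.
    public static int utf8Bytes(String name) {
        return name.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Print both counts for every name passed on the command line.
        for (String name : args) {
            System.out.println(chars(name) + " chars, " + utf8Bytes(name) + " bytes: " + name);
        }
    }
}
```

For an all-ASCII name like the one above the two numbers are the same; they only diverge when the title contains accented letters, emoji, and the like.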
Doesn’t look like the 2.0.1-4-9a05be80 version attempts to shorten the filename.
Top GitHub Comments
so we’d need a parameter somewhere …
Sorry, yeah the directory minus drive and nul char is 255. I was reading this and just said 260 but if you minus the D:\ and the NUL char it is still 255.