Tasks created by importing local storage are not sorted -- when using IO for external storage

Describe the bug When files from local storage (the file system) are synchronized with label-studio through the IO external storage (i.e., added to the project as tasks), the resulting tasks appear in random order. In other words, the task IDs correspond neither to the filenames of the original files nor to the times at which those files were created in the file system.

To Reproduce Steps to reproduce the behavior:

  1. Prepare files in the file system
  2. Synchronize/add files to the label-studio using IO Storage API (local files)
  3. Go to a project
  4. See that the IDs for tasks are random

Expected behavior Task IDs should follow the filename order of the files being added to label-studio.

Screenshots n/a – can provide if really needed

Environment (please complete the following information):

  • OS: Ubuntu and macOS
  • Label Studio Version Release 1.4.1

Additional context

In label_studio/io_storages/localfiles/models.py there is:

for file in path.rglob('*'):

Meanwhile, even the rglob example in the Python documentation wraps the call in sorted().

Suggested solution

[image: screenshot of the suggested change]

I can create a pull request for that, if you want.
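
For illustration, here is a minimal sketch of sorting the `rglob` results before iterating over them. It is not the actual patch to Label Studio: the helper name `iter_sorted_files` and the sample filenames are hypothetical, and only the `path.rglob('*')` call comes from the code quoted above.

```python
import tempfile
from pathlib import Path


def iter_sorted_files(root: Path):
    """Yield regular files under root in sorted path order (hypothetical helper)."""
    # rglob('*') returns entries in a filesystem-dependent order;
    # wrapping it in sorted() orders the resulting Path objects.
    for file in sorted(root.rglob('*')):
        if file.is_file():  # skip directories
            yield file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Create the files out of order on purpose.
        for name in ("b.jpg", "a.jpg", "c.jpg"):
            (root / name).touch()
        print([f.name for f in iter_sorted_files(root)])
        # prints ['a.jpg', 'b.jpg', 'c.jpg']
```

Note that `sorted()` has to collect every path before the loop starts, so this trades the lazy iteration of `rglob` for a deterministic order.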

Issue Analytics

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
makseq commented, May 13, 2022

You are doing an amazing job guys with label-studio, good luck!

Thank you very much for your kind words, they really warm us!

2 reactions
mikbuch commented, May 12, 2022

Hi @dvwright, we chose sorting by filename for several reasons. First, it is the default behavior in most file managers, so we wanted what the user (who may also be a programmer) sees when browsing the filesystem in a file manager to match what is then presented in label-studio. Second, in our case the filename contains date and time information in a standardized form (not going into details), so in our use case the filename itself carries a kind of "creation time". I know that this is different from the time when the file was created in a particular filesystem; nevertheless, we discussed this issue and finally chose sorting by filename.

@makseq , I’m glad that I could help. You are doing an amazing job guys with label-studio, good luck!
