Uploading individual .txt files as individual tasks fails in multiple ways

Describe the bug For the NER use case, where each text file is a separate task, the export of annotated data contains only the text file names in place of the text content, even though labelling the content works without any issues.

To Reproduce Steps to reproduce the behavior:

  1. Set up an NER labelling task with the following config. I used valueType="url" based on the recommendation at https://labelstud.io/guide/tasks.html#Plain-text to have each .txt file as a separate task; it reads: "If you want to import entire plain text files without each line becoming a new labeling task, customize the labeling configuration to specify valueType="url" in the Text tag. See the Text tag documentation."
<View>
  <Labels name="label" toName="text">
    <Label value="label1" background="#FFA39E"/>
    <Label value="label2" background="#D4380D"/>
    <Label value="label3" background="#FFC069"/>
    <Label value="label4" background="#AD8B00"/>
    <Label value="label5" background="#D3F261"/>
    <Label value="label6" background="#389E0D"/>
    <Label value="labe7" background="#5CDBD3"/>
  </Labels>
  <Text name="text" value="$text" valueType="url"/>
</View>
  2. Upload a few .txt files and import them as Time Series (only the Time Series option keeps each .txt file as a separate task).
  3. Annotate the data.
  4. Try to export the annotated data through the API or the UI, in any format. The exported data does not contain the content of the .txt files.

Expected behavior

  1. The text content is visible in the exported data.
  2. When uploading .txt files, there is an option to import them as .txt or .txtl (text or text lines, similar to json or jsonl).
  3. Perhaps a new valueType param in the Text tag to indicate text or textline.
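
Until something like that exists, one possible way to get one task per file with the text stored in the task itself is to convert the .txt files to a JSON task list before importing. The following is a minimal sketch, not a confirmed fix: it assumes that Label Studio's JSON import accepts a list of objects with a "data" field, and that the Text tag is then used as value="$text" without valueType="url". The function names and the source_file field are hypothetical and only serve the illustration.

```python
import json
from pathlib import Path


def build_tasks(txt_dir):
    """Return one task dict per .txt file found directly in txt_dir."""
    tasks = []
    for path in sorted(Path(txt_dir).glob("*.txt")):
        # Inline the whole file so the task carries the text, not a path.
        text = path.read_text(encoding="utf-8")
        # "source_file" is a hypothetical extra field to keep the file name.
        tasks.append({"data": {"text": text, "source_file": path.name}})
    return tasks


def write_tasks(txt_dir, out_file):
    """Write the task list as JSON and return the number of tasks."""
    tasks = build_tasks(txt_dir)
    payload = json.dumps(tasks, ensure_ascii=False, indent=2)
    Path(out_file).write_text(payload, encoding="utf-8")
    return len(tasks)
```

Calling write_tasks("texts", "tasks.json") would produce a single JSON file to import in place of the individual .txt files. Because the text is part of each task, an export should carry it along, assuming the export includes the task data as imported.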

Environment (please complete the following information):

  • Label Studio Version [e.g. v1.01]

Additional context https://label-studio.slack.com/archives/C01SKFX54QK/p1621488696013000

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

2 reactions
dvivarelli commented, Dec 21, 2021

Using the latest Docker build, the workaround with valueType="url" works:

<Text name="text" value="$text" granularity="word" valueType="url"/>

0 reactions
teebu commented, Nov 8, 2022

I’m seeing the path, not the actual text of my file, with 1.6.0rc5.

When I choose the other option I get actual data.
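
If an export has already been produced with paths in place of text, the content can be restored afterwards. This is a minimal sketch under stated assumptions: the export is a JSON list of tasks, each with a "data" dict whose "text" entry holds the file reference, and a local copy of every file exists under the same base name as in that reference. The function name inline_text and the added text_ref field are hypothetical. If the stored name differs from the original file name, the lookup would need adjusting.

```python
from pathlib import Path


def inline_text(exported_tasks, txt_dir, key="text"):
    """Return copies of the tasks with file references replaced by content."""
    txt_dir = Path(txt_dir)
    resolved = []
    for task in exported_tasks:
        data = dict(task.get("data", {}))
        ref = str(data.get(key, ""))
        # Match on the final path component only; any upload prefix is ignored.
        candidate = txt_dir / Path(ref).name
        if ref.endswith(".txt") and candidate.is_file():
            # Keep the original reference under a hypothetical "<key>_ref" field.
            data[key + "_ref"] = ref
            data[key] = candidate.read_text(encoding="utf-8")
        resolved.append({**task, "data": data})
    return resolved
```

The input list is left unmodified, and tasks whose "text" entry does not look like a .txt reference pass through unchanged, so the function can be run over a mixed export.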
