frictionless cli does not read accented characters

Overview

Hi, in Italian, my language, we use a lot of accented characters. For example, “age” is “età”.

If I have this input file

nome,età
andy,47
tom,67

and run extract, I get

# ----
# data: im.csv
# ----

====  ===
nome  etÃ
====  ===
andy   47
tom    67
====  ===

If I run describe, I get

encoding: iso8859-9
format: csv
hashing: md5
name: im
path: im.csv
profile: tabular-data-resource
schema:
  fields:
    - name: nome
      type: string
    - name: "et\xC3"
      type: integer
scheme: file

Thank you


Please preserve this line to notify @roll (lead of this repository)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

aborruso commented, May 19, 2021

Thanks! Releasing a fix

it works, thank you

roll commented, May 19, 2021

@aborruso Yeah, I got the same result, because chardet detects the encoding as iso8859-9.
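
To illustrate why a wrong guess produces "etÃ", here is a minimal Python sketch using only the standard library. It assumes the file is actually UTF-8 encoded, which the successful run with utf-8 suggests. It does not reproduce how frictionless or chardet work internally. In UTF-8, "à" is the two bytes 0xC3 0xA0; read as iso8859-9, they become "Ã" followed by a non-breaking space.

```python
# Header row of the sample file, encoded as UTF-8 (assumed actual file encoding).
header_bytes = "nome,età".encode("utf-8")

# "iso8859_9" is Python's codec name for the iso8859-9 encoding reported by describe.
wrong = header_bytes.decode("iso8859_9")
right = header_bytes.decode("utf-8")

# The second field decoded with the wrong encoding: "et", "Ã", then U+00A0.
wrong_field = wrong.split(",")[1]
print([hex(ord(ch)) for ch in wrong_field])

# Stripping whitespace removes the trailing non-breaking space, leaving "etÃ".
print(wrong_field.strip())
print(right.split(",")[1])
```

The stripped value matches the field name in the outputs above. Whether frictionless strips the header in exactly this way is an assumption.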

It works fine if we set utf-8:

$ frictionless describe data/issue-844.csv --encoding utf-8
# --------
# metadata: data/issue-844.csv
# --------

encoding: utf-8
format: csv
hashing: md5
name: issue-844
path: data/issue-844.csv
profile: tabular-data-resource
schema:
  fields:
    - name: nome
      type: string
    - name: età
      type: integer
scheme: file
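
Before forcing `--encoding utf-8`, it can help to confirm that the file really is valid UTF-8. The following is a small sketch using only the Python standard library. The helper name `is_valid_utf8` is hypothetical and not part of frictionless.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The sample file from the issue, encoded as UTF-8.
utf8_sample = "nome,età\nandy,47\ntom,67\n".encode("utf-8")

# The same header encoded as iso8859-9, where "à" is the single byte 0xE0.
latin5_sample = "nome,età\n".encode("iso8859_9")

print(is_valid_utf8(utf8_sample))    # True
print(is_valid_utf8(latin5_sample))  # False
```

For a file on disk, the same check could be applied to the result of `open(path, "rb").read()`. If it returns True, passing utf-8 explicitly as shown above is a reasonable choice.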