`split': invalid byte sequence in US-ASCII (ArgumentError) on ripper.rb

Metadata

node -v
  v14.5.0

ruby -v
  ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]

uname -a
  Darwin mbp 19.6.0 Darwin Kernel Version 19.6.0: Sun Jul  5 00:43:10 PDT 2020; root:xnu-6153.141.1~9/RELEASE_X86_64 x86_64

cat package.json
  {
    "devDependencies": {
      "@prettier/plugin-ruby": "^0.19.0",
      "prettier": "^2.0.5"
    }
  }

Visual Studio Code
  Version: 1.47.0
  Commit: d5e9aa0227e057a60c82568bf31c04730dc15dcd
  Date: 2020-07-09T08:01:54.115Z (1 wk ago)
  Electron: 7.3.2
  Chrome: 78.0.3904.130
  Node.js: 12.8.1
  V8: 7.8.279.23-electron.0
  OS: Darwin x64 19.6.0

https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode
  v5.1.3

Input

# あ 

      d=[30644250780,9003106878,
  30636278846,66641217692,4501790980,
671_24_603036,131_61973916,66_606629_920,
 30642677916,30643069058];a,s=[],$*[0]
    s.each_byte{|b|a<<("%036b"%d[b.
       chr.to_i]).scan(/\d{6}/)}
        a.transpose.each{ |a|
          a.join.each_byte{\
           |i|print i==49?\
             ($*[1]||"#")\
               :32.chr}
                 puts
                  }

class RipperJS < Ripper
  attr_reader :source, :lines, :__end__

  def initialize(source, *args)
    super(source, *args)

    @source = source # <= source.encoding
    @lines = source.split("\n")
    @__end__ = nil
  end

When Prettier is run through VSCode, source.encoding seems to be detected as “US-ASCII”, and I get the error below.

ripper.rb:25:in `split': invalid byte sequence in US-ASCII (ArgumentError)

When I run bundle exec rbprettier --write on the same file, source.encoding is “UTF-8”, and the file is formatted correctly.
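
The difference between the two runs can be reproduced without the plugin. The following is a minimal sketch, assuming the same UTF-8 bytes reach Ruby in both cases and only the encoding label differs; the sample text and variable names are made up for illustration.

```ruby
# The same bytes ("# あ" followed by a line of code), labeled two different ways.
bytes = "# \xE3\x81\x82\nputs 1".b

as_ascii = bytes.dup.force_encoding(Encoding::US_ASCII)
as_utf8  = bytes.dup.force_encoding(Encoding::UTF_8)

# The three bytes of "あ" are all above 0x7F, so they are invalid in US-ASCII.
puts as_ascii.valid_encoding? # false
puts as_utf8.valid_encoding?  # true

# String#split refuses to work on a string whose bytes do not match its label.
error_message = nil
begin
  as_ascii.split("\n")
rescue ArgumentError => e
  error_message = e.message
end
puts error_message

# With the UTF-8 label the split succeeds.
lines = as_utf8.split("\n")
puts lines.length # 2
```

The bytes never change here; only the label attached to them does, which matches the observation that the same file formats correctly from the command line.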

Issue Analytics

Created 3 years ago
Comments: 11 (4 by maintainers)

Top GitHub Comments

kddnewton commented, Aug 19, 2020

@stmichael that’s the ticket! Thank you for finding that. Because of that investigation, I was finally able to track it down to:

LC_ALL="en_US.US-ASCII" ruby -e 'p "ä"'

That will reliably fail. I will add something that will detect the default encoding and handle it appropriately.
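
One way such handling might look is sketched below. This is not the maintainer's actual fix: normalize_source is a hypothetical helper, and it assumes the file on disk really is UTF-8 even when the locale says otherwise.

```ruby
# Hypothetical helper: relabel the source as UTF-8 when the locale-derived
# encoding (for example US-ASCII) does not match the bytes that were read.
def normalize_source(source)
  return source if source.encoding == Encoding::UTF_8 && source.valid_encoding?

  utf8 = source.dup.force_encoding(Encoding::UTF_8)
  # If the bytes are not valid UTF-8 either, replace the offending bytes
  # instead of letting String#split raise later.
  utf8.valid_encoding? ? utf8 : utf8.scrub("?")
end

# Simulate the string Ruby hands over under LC_ALL="en_US.US-ASCII".
mislabeled = "'\xC3\xA4'\n".b.force_encoding(Encoding::US_ASCII)
fixed = normalize_source(mislabeled)

puts fixed.encoding # UTF-8
puts fixed.split("\n").length
```

Relabeling with force_encoding does not convert any bytes, so this approach would produce wrong results for a file that is genuinely stored in another encoding such as Shift_JIS.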

stmichael commented, Aug 14, 2020

I’m having the same issue but in a different setup. I don’t use any editor integration, just plain command line. My setup:

OS: Debian
Language: JS (NodeJS v12.18.3)
Plugin-Ruby: 0.19.0

Running yarn prettier -l '**/*.rb' in my project prints a whole bunch of errors identical to the ones @github0013 mentioned.

Error: /app/node_modules/@prettier/plugin-ruby/src/ripper.rb:25:in `split': invalid byte sequence in US-ASCII (ArgumentError)

I can’t count them all; there are too many.

To track down the issue, I came up with a minimal example. Create a file test.rb with the content:

'ä'

That’s a valid Ruby program although it doesn’t make any sense. Running yarn prettier -l test.rb will produce the very same issue.

I’m not an expert on character encodings. As I understand it, the default encoding has been UTF-8 since Ruby 2.0. So why does this plugin read the files as US-ASCII?
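
A likely explanation is that the UTF-8 default introduced in Ruby 2.0 applies to the encoding of script files, while strings read from files or standard input are labeled with Encoding.default_external, which Ruby derives from the locale. The sketch below illustrates this by starting a child Ruby process under different locales; default_external_for is a hypothetical helper name, and the results depend on which locales the system has installed.

```ruby
require "rbconfig"

# Start a child Ruby under the given locale and report the encoding it
# would assign to data read from files or standard input.
def default_external_for(locale)
  env = { "LC_ALL" => locale, "RUBYOPT" => nil } # nil unsets RUBYOPT in the child
  command = [RbConfig.ruby, "-e", "print Encoding.default_external.name"]
  IO.popen(env, command, &:read)
end

puts default_external_for("C")           # US-ASCII on Linux and macOS
puts default_external_for("en_US.UTF-8") # UTF-8 where that locale is installed
```

An editor integration or a minimal container often runs with a bare C or POSIX locale, which would explain why the same file works from an interactive shell.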
