Describe the bug

The plugin struggles with utf8mb4-encoded fields, even when the database is set up correctly.

I verified that the plugin is at fault by connecting with the command-line client (Ubuntu package mysql-client-core-8.0) and executing the same query, which succeeded.

To Reproduce

Steps to reproduce the behavior:

  1. Install this plugin in vscode

  2. Ensure you have a database you can connect to

  3. Run the following SQL

    CREATE DATABASE IF NOT EXISTS wp_interview_futz;
    USE wp_interview_futz;
    
    CREATE TABLE IF NOT EXISTS rusty_inc_names (
        id mediumint(9) NOT NULL AUTO_INCREMENT,
        name VARCHAR (255) NOT NULL UNIQUE ,
        is_outdated TIMESTAMP NULL,
        PRIMARY KEY (id)
    ) DEFAULT CHARSET = utf8mb4 DEFAULT COLLATE = utf8mb4_unicode_ci;
    
    CREATE TABLE IF NOT EXISTS rusty_inc_tree (
        name_id mediumint(9) NOT NULL,
        emoji tinytext NOT NULL,
        parent_id mediumint(9),
        PRIMARY KEY (name_id)
    ) DEFAULT CHARSET = utf8mb4 DEFAULT COLLATE = utf8mb4_unicode_ci;
    
    /* This was me messing around with the encoding as I wasn't sure if something was wrong with the sql creating the db & tables */
    ALTER DATABASE wp_interview_futz CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
    
    ALTER TABLE rusty_inc_tree CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    
    ALTER TABLE rusty_inc_tree CHANGE emoji emoji VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    
    INSERT INTO rusty_inc_names (id, name) VALUES (1, "Rusty Corp."), (2, "Food"), (3, "Canine Therapy"), (4, "Massages"), (5, "Games"), (6, "Treats"), (7, "Paw Cosmetics");
    
    INSERT INTO rusty_inc_tree (name_id, emoji, parent_id) VALUES (1, '🐕', null), (2, '🥩', 1), (3, '😌', 1), (4, '💆', 3), (5, '🎾', 3);
    
  4. See the error in the console:

    ER_TRUNCATED_WRONG_VALUE_FOR_FIELD: Incorrect string value: '\xF0\x9F\x90\x95' for column emoji at row 1
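
The bytes in that error message are the UTF-8 encoding of the first emoji in the INSERT. The Python sketch below is not part of the original report; it only illustrates that each of the inserted emoji needs 4 bytes in UTF-8, which is more than MySQL's 3-byte utf8 (utf8mb3) character set can hold. The helper names `utf8_byte_lengths` and `needs_utf8mb4` are made up for this illustration. That the plugin's connection defaults to a 3-byte character set is an assumption, although the maintainer's reply about the connection's default charset points that way.

```python
# The five emoji values from the INSERT statement in step 3.
EMOJI = ["🐕", "🥩", "😌", "💆", "🎾"]


def utf8_byte_lengths(text: str) -> list[int]:
    """Return the UTF-8 encoded length, in bytes, of each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


def needs_utf8mb4(text: str) -> bool:
    """True if any character in text takes 4 bytes in UTF-8.

    Such characters lie above U+FFFF and cannot be stored in a
    3-byte-per-character set such as MySQL's utf8 / utf8mb3.
    """
    return any(n == 4 for n in utf8_byte_lengths(text))


dog = EMOJI[0]
print(dog.encode("utf-8"))                 # b'\xf0\x9f\x90\x95'
print([needs_utf8mb4(e) for e in EMOJI])   # [True, True, True, True, True]
print(needs_utf8mb4("Rusty Corp."))        # False
```

The plain ASCII names in rusty_inc_names fit in any of these character sets, which would explain why only the emoji INSERT fails.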

Expected behavior

The INSERT query succeeds, as it does in the mysql CLI:

Query OK, 5 rows affected (0.03 sec)
Records: 5  Duplicates: 0  Warnings: 0

Desktop (please complete the following information):

  • SQLTools Version: latest stable (downloaded fresh today via packages in vscode)
  • VSCode Version: latest (downloaded fresh today via vscode website)
  • OS: Linux (I would be surprised if this were OS specific)
  • Driver: MySQL/MariaDB
  • Database version: MySQL 5.7

Additional context

After I insert the rows from the CLI, the plugin's table view also renders them incorrectly.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 10

Top GitHub Comments

1 reaction
gjsjohnmurray commented, Aug 1, 2022

In the JSON where your connection is defined, find the "driver": "MySQL" line and add a mysqlOptions property after it to get the following:

        "driver": "MySQL",
        "mysqlOptions": {
          "charset": "utf8mb4_general_ci"
        },

With this in place the published driver should have the same behaviour as you are proposing in your PR.
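
As a rough sketch of that edit, assuming the connection entry has been loaded into a Python dict, the function below adds the `mysqlOptions` property to MySQL entries only. The function name `with_utf8mb4` and the sample fields `name` and `database` are hypothetical; only `driver`, `mysqlOptions`, and `charset` come from the comment above.

```python
import copy
import json


def with_utf8mb4(connection: dict, collation: str = "utf8mb4_general_ci") -> dict:
    """Return a copy of a connection entry with mysqlOptions.charset set.

    Entries whose driver is not "MySQL" are copied without changes.
    Any other keys already present under mysqlOptions are preserved.
    """
    patched = copy.deepcopy(connection)
    if patched.get("driver") == "MySQL":
        patched.setdefault("mysqlOptions", {})["charset"] = collation
    return patched


# Hypothetical connection entry; every value except "driver" is made up.
example = {"name": "local-dev", "driver": "MySQL", "database": "wp_interview_futz"}
print(json.dumps(with_utf8mb4(example), indent=2))
```

The copy leaves the original entry untouched, so the patched settings can be compared with the old ones before anything is written back.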

Another tactic with the current published driver is to run this statement immediately after connection:

SET NAMES 'utf8mb4' COLLATE 'utf8mb4_general_ci';

I am concerned that, by changing what the driver uses as the default charset property for the connection, your PR will break things for other users.

Instead, I suggest extending the driver’s connection config page with a field in which users can optionally specify a string to use as the value of that property. From a MySQL perspective, values entered here should be ones that appear in the result set of a SHOW COLLATION statement on the target server.

0 reactions
gjsjohnmurray commented, Aug 12, 2022

Thanks for testing anyway. Since you’re no longer using the extension I’m closing this.
