LightGBM error special JSON characters in feature name

Hi - great library everyone!

I’m doing tabular prediction with news article data. The LightGBM model fails with the following error:

LightGBMError: Do not support special JSON characters in feature name.
Do not support special JSON characters in feature name.

The upstream issue in LightGBM is here: https://github.com/microsoft/LightGBM/issues/2455

Basically, they check feature names for special JSON characters (e.g. [],{}":) and throw an error if any are found. Could AutoGluon check for these characters and remove any offending features?
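A minimal sketch of the kind of check being requested, assuming the character set is just the examples quoted above (LightGBM's actual check may differ between versions). The helper name find_offending_columns and the sample column names are hypothetical and not part of AutoGluon or LightGBM:

```python
import pandas as pd

# Assumption: this set mirrors the examples quoted in the issue ([],{}":).
# It is not taken from the LightGBM source.
SPECIAL_JSON_CHARS = set('[]{}":,')


def find_offending_columns(df: pd.DataFrame) -> list:
    """Return the column names that contain any of the listed characters."""
    return [col for col in df.columns if SPECIAL_JSON_CHARS & set(str(col))]


# Made-up example frame with two problematic feature names.
df = pd.DataFrame({"title": [1], 'tags["news"]': [2], "word:count": [3]})

offending = find_offending_columns(df)

# Dropping the offending features, as the issue suggests.
cleaned = df.drop(columns=offending)
```

Dropping features discards information, so renaming them (as discussed in the comments) may be preferable in practice.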

Liam

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 14

Top GitHub Comments

12 reactions
oguzhangur96 commented, May 4, 2020

Hi,

I have encountered the same problem. I fixed it before training begins by renaming the columns with a regular expression and a lambda function (pandas). The object named “data” must be a DataFrame.

import re

# Strip every character that is not a letter, digit, or underscore from each column name.
data = data.rename(columns=lambda x: re.sub('[^A-Za-z0-9_]+', '', x))

Could provide data/column headers if needed.

5 reactions
Innixma commented, May 4, 2020

Thanks for the info @oguzhangur96! Right now the main difficulty here is to ensure that column names are always converted when entering the model (whether for feature importances, fit, inference) and also inversely converted back to the original upon leaving. This seems to be something that LightGBM itself should handle, and I don’t have a good understanding of why they haven’t done so.
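The round trip described here (convert names on the way in, restore them on the way out) could be sketched roughly as follows. This is an illustration only, not AutoGluon's implementation: build_name_maps is a hypothetical helper, the importance values are placeholders rather than real model output, and the sketch assumes the sanitized names stay distinct, raising an error otherwise.

```python
import re

import pandas as pd


def build_name_maps(columns):
    """Return (forward, inverse) dicts between original and sanitized names."""
    forward = {col: re.sub(r"[^A-Za-z0-9_]+", "", str(col)) for col in columns}
    if len(set(forward.values())) != len(forward):
        # The inverse mapping would be ambiguous, so refuse to continue.
        raise ValueError("sanitized column names collide")
    inverse = {clean: col for col, clean in forward.items()}
    return forward, inverse


df = pd.DataFrame({"word:count": [3, 5], "title[len]": [10, 12]})
forward, inverse = build_name_maps(df.columns)

# Entering the model: rename to safe names.
safe_df = df.rename(columns=forward)

# Leaving the model: results keyed by safe names are mapped back.
# These numbers are placeholders, not real feature importances.
importances = pd.Series([0.7, 0.3], index=safe_df.columns)
restored = importances.rename(index=inverse)
```

The same pair of dictionaries would have to be applied consistently at fit, inference, and feature-importance time, which is the bookkeeping burden the comment points out.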

Furthermore, it isn’t sufficient to only remove the special characters, because we have to ensure no two columns have the same name. One simple way is to just rename columns 0-n as ‘0’, ‘1’, …‘n-1’, ‘n’, but then this operation would also have to be done on all of the 99% of problems where this is not required, particularly for online-inference of single rows. I have to first benchmark inference times in these situations to ensure our online-inference speed isn’t significantly slowed due to this fix.
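A rough sketch of the positional renaming mentioned above, which avoids collisions by construction. The helper to_positional_names is hypothetical, and the copy it makes is exactly the kind of per-call overhead that would need benchmarking for single-row inference:

```python
import pandas as pd


def to_positional_names(df: pd.DataFrame):
    """Rename columns to '0', '1', ... and return the frame plus the original names."""
    original = list(df.columns)
    renamed = df.copy()  # copies the data; a real implementation may want to avoid this
    renamed.columns = [str(i) for i in range(len(original))]
    return renamed, original


# "a:b" and "a,b" would both collapse to "ab" if special characters were simply removed.
df = pd.DataFrame({"a:b": [1], "a,b": [2], "plain": [3]})
renamed, original = to_positional_names(df)

# A positional name maps back to its original via its integer value.
recovered = original[int(renamed.columns[1])]
```

Because the new names are just positions, no inverse dictionary is needed: the list of original names is enough to translate results back.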
