Output exceeds the size limit. Open the full output data in a text editor
Hello, I hit this error while testing featurewiz. I wanted to do some automatic feature engineering, so I chose the old way, but unfortunately got "Output exceeds the size limit. Open the full output data in a text editor".
Details:
- X shape: (128463, 1341), with mixed string, int, float, and NaN values.
- code:
import featurewiz as FW
outputs = FW.featurewiz(dataname=X.reset_index(drop=True), target=y.reset_index(drop=True),
                        corr_limit=0.70, verbose=2, sep=',',
                        header=0, test_data='', feature_engg='', category_encoders='',
                        dask_xgboost_flag=False, nrows=None)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
f:\Work\jupyter_pipeline\pj01\1.1.0 clean_data.ipynb Cell 126 in <cell line: 1>()
1 if Config.add_feature:
2 # # Add feature
3 # from jinshu_model.build_models import HighDimensionFeatureAdder
(...)
8 # ce = HighDimensionFeatureAdder(max_gmm_component=4, onehot=False)
9 # X = ce.fit_transform(X)
10 import featurewiz as FW
---> 11 outputs = FW.featurewiz(dataname=X.reset_index(drop=True), target=y.reset_index(drop=True), corr_limit=0.70, verbose=2, sep=',',
12 header=0, test_data='',feature_engg='', category_encoders='',
13 dask_xgboost_flag=False, nrows=None)
14 else:
15 ce = CategoricalEncoder()
File c:\Users\ufo\anaconda3\lib\site-packages\featurewiz\featurewiz.py:793, in featurewiz(dataname, target, corr_limit, verbose, sep, header, test_data, feature_engg, category_encoders, dask_xgboost_flag, nrows, **kwargs)
791 print('Classifying features using a random sample of %s rows from dataset...' %nrows_limit)
792 ##### you can use nrows_limit to select a small sample from data set ########################
--> 793 train_small = EDA_randomly_select_rows_from_dataframe(dataname, targets, nrows_limit, DS_LEN=dataname.shape[0])
794 features_dict = classify_features(train_small, target)
795 else:
File c:\Users\ufo\anaconda3\lib\site-packages\featurewiz\featurewiz.py:2977, in EDA_randomly_select_rows_from_dataframe(train_dataframe, targets, nrows_limit, DS_LEN)
2975 test_size = 0.9
...
-> 5842 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
5844 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
5845 raise KeyError(f"{not_found} not in index")
KeyError: "None of [Int64Index([0, 0, 0, 0, 1, 1, 0, 0, 0, 1,\n ...\n 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],\n dtype='int64', length=128463)] are in the [columns]"
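The KeyError message lists 128463 values of 0 and 1, which is consistent with the values of y being looked up as column labels. The following is a minimal pandas-only sketch of that kind of failure, not featurewiz's actual internals; the toy frame, the Series, and the `lookup_error` helper are hypothetical names used for illustration. Newer pandas versions print `Index(...)` instead of `Int64Index(...)` in the message.

```python
import pandas as pd

# Toy stand-ins for the real X (features) and y (labels); purely illustrative.
X = pd.DataFrame({"f1": [0.1, 0.2, 0.3], "f2": ["a", "b", "c"]})
y = pd.Series([0, 1, 0], name="label")


def lookup_error(frame, key):
    """Return the KeyError text raised by frame[key], or None if the lookup works."""
    try:
        frame[key]
    except KeyError as exc:
        return str(exc)
    return None


# Indexing a DataFrame with a non-boolean Series treats its values as column
# labels. None of the labels 0 or 1 exist in X, so pandas raises a KeyError.
message = lookup_error(X, y)
print(message)
```

A lookup with a real column name, such as `lookup_error(X, "f1")`, returns None, which is why passing the target as a column name avoids this error.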
Issue Analytics
- State:
- Created a year ago
- Comments:5 (4 by maintainers)
Top GitHub Comments
Hello @eromoe, I figured out the problem in the first statement. 👍
You must send in the entire train dataframe in the statement below, and the target refers to the name of your target column in the train dataframe. Instead, you sent in X and y. That's the issue!
Also, I noticed that this is a big dataframe, so you might want to set nrows to 10000 or something small so that your dataframe is handled in pandas without blowing up. Hope this finally solves it.
AutoVimal
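The advice above could be sketched roughly as follows: attach y to X as a named column, then pass the combined frame and the column name to featurewiz. The helper names `build_train_frame` and `run_featurewiz` and the column name "target" are assumptions made for this example. The featurewiz call reuses the arguments from the original post, changing only `target` and `nrows`, and has not been verified against a specific featurewiz version.

```python
import pandas as pd


def build_train_frame(X, y, target_name="target"):
    """Combine features and labels into one train dataframe (hypothetical helper)."""
    if target_name in X.columns:
        raise ValueError(f"column {target_name!r} already exists in X")
    train = X.reset_index(drop=True).copy()
    # Use the raw values so the labels align by position, not by index label.
    train[target_name] = y.reset_index(drop=True).to_numpy()
    return train


def run_featurewiz(train, target_name, nrows=10000):
    """Call featurewiz with the whole frame and the target column NAME."""
    import featurewiz as FW  # imported here so the rest runs without featurewiz installed

    return FW.featurewiz(dataname=train, target=target_name, corr_limit=0.70,
                         verbose=2, sep=',', header=0, test_data='',
                         feature_engg='', category_encoders='',
                         dask_xgboost_flag=False, nrows=nrows)


# Toy data standing in for the real X and y.
X = pd.DataFrame({"f1": [0.1, 0.2, 0.3], "f2": ["a", "b", "c"]}, index=[10, 11, 12])
y = pd.Series([0, 1, 0], index=[10, 11, 12])
train = build_train_frame(X, y)
print(train)
```

With the real data, the call would then be `outputs = run_featurewiz(train, "target")`, where the default `nrows=10000` follows the suggestion to keep the sample small.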