Going over the maximum context length when building an index

Hey, I’m getting an error when building an index because gpt_index tries to exceed the model’s maximum context length.

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4169 tokens (3913 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

To reproduce:

from gpt_index import GPTTreeIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = GPTTreeIndex(documents)

# save to disk
index.save_to_disk('index.json')

On this data: https://github.com/awsdocs/amazon-quicksight-user-guide
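For anyone reading the error message above, the numbers add up as in the small sketch below. It only restates the figures quoted in the error (4097, 3913, 256). The helper names are made up for illustration and are not part of gpt_index or the openai package.

```python
# Illustrative check of the token arithmetic quoted in the error message.
# The function names are hypothetical helpers, not a library API.
MAX_CONTEXT_TOKENS = 4097  # model limit quoted in the error
PROMPT_TOKENS = 3913       # prompt size quoted in the error
COMPLETION_TOKENS = 256    # completion budget quoted in the error


def tokens_over_limit(prompt_tokens, completion_tokens, max_context=MAX_CONTEXT_TOKENS):
    """Return how many tokens the request exceeds the limit by (0 if it fits)."""
    requested = prompt_tokens + completion_tokens
    return max(0, requested - max_context)


def max_prompt_tokens(completion_tokens, max_context=MAX_CONTEXT_TOKENS):
    """Largest prompt that still leaves room for the completion."""
    return max_context - completion_tokens


overflow = tokens_over_limit(PROMPT_TOKENS, COMPLETION_TOKENS)
prompt_budget = max_prompt_tokens(COMPLETION_TOKENS)
print(overflow, prompt_budget)
```

So the request is 72 tokens over the limit, and with 256 tokens reserved for the completion the prompt would have to fit in 3841 tokens.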

Top GitHub Comments

ArcadeCityMayor commented, Dec 8, 2022

Realized my directory traversal script included all my build files including node_modules. Once I copied over the specific folders I wanted into a new directory and targeted that folder, the script ran fine.
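A rough sketch of that workaround, using only the Python standard library: copy just the files you want into a clean directory and skip build folders such as node_modules. The excluded folder names, the allowed extensions, and the function name are assumptions for illustration, so adjust them to your project.

```python
import os
import shutil

# Hypothetical defaults; adjust to your project.
EXCLUDED_DIRS = {"node_modules", "build", "dist", ".git"}
ALLOWED_EXTENSIONS = {".md", ".txt"}


def copy_documents(src_root, dst_root,
                   excluded_dirs=EXCLUDED_DIRS,
                   allowed_exts=ALLOWED_EXTENSIONS):
    """Copy allowed files from src_root into dst_root, skipping excluded directories.

    Returns the sorted list of copied paths, relative to src_root.
    dst_root should live outside src_root so the walk does not visit it.
    """
    copied = []
    for dirpath, dirnames, filenames in os.walk(src_root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in excluded_dirs]
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in allowed_exts:
                continue
            src_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(src_path, src_root)
            dst_path = os.path.join(dst_root, rel_path)
            os.makedirs(os.path.dirname(dst_path), exist_ok=True)
            shutil.copy2(src_path, dst_path)
            copied.append(rel_path)
    return sorted(copied)
```

The clean directory can then be passed to SimpleDirectoryReader in place of the original project folder.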

I’m guessing in node_modules there were files that did not include the separator at all (like minified JS), and so the splitter was unable to reduce that to the desired chunk size.
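To illustrate that guess, here is a toy splitter, not gpt_index’s actual implementation. It measures chunk size in characters rather than tokens to keep things simple, and every name in it is hypothetical. A separator-only splitter cannot shrink text that contains no separator, while a variant with a hard-cut fallback can.

```python
def split_by_separator(text, chunk_size, separator=" "):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size cannot be reduced and is emitted as is.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_with_fallback(text, chunk_size, separator=" "):
    """Like split_by_separator, but hard-cuts any chunk that is still too long."""
    result = []
    for chunk in split_by_separator(text, chunk_size, separator):
        for start in range(0, len(chunk), chunk_size):
            result.append(chunk[start:start + chunk_size])
    return result


prose = "the quick brown fox jumps over the lazy dog"
minified = "x" * 50  # stands in for minified JS: no separator anywhere

prose_chunks = split_by_separator(prose, 16)
minified_chunks = split_by_separator(minified, 16)
fallback_chunks = split_with_fallback(minified, 16)
print(prose_chunks, [len(c) for c in minified_chunks], [len(c) for c in fallback_chunks])
```

The prose splits into three chunks that all fit. The separator-free string comes back as one 50-character chunk until the fallback cuts it into pieces of at most 16 characters.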

Apologies if unrelated to OP. Thanks for your help!

ArcadeCityMayor commented, Dec 8, 2022

“@ArcadeCityMayor re: the first issue, do you have a stack trace + is it erroring for you?”

It doesn’t error; it just prints that as, I guess, a warning when the script starts. I’ll try again shortly, after trying the #88 fix and experimenting with my note above about using an incorrect separator.
