Going over the maximum context length when building an index
Hey, I'm getting an error when building an index: gpt_index is trying to exceed the model's maximum context length.
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4169 tokens (3913 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
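The numbers in the error add up directly: the 3913-token prompt plus the 256-token completion budget exceeds the 4097-token window by 72 tokens. A quick sanity check in plain Python, using the values from the error above:

```python
MAX_CONTEXT = 4097       # model's context window (prompt + completion)
prompt_tokens = 3913     # tokens in the prompt gpt_index built
completion_tokens = 256  # completion budget requested

requested = prompt_tokens + completion_tokens
overflow = requested - MAX_CONTEXT

print(requested, overflow)  # 4169 72
```

So even trimming the prompt by 72 tokens (or reducing the completion budget) would make this particular request fit.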
To reproduce:
from gpt_index import GPTTreeIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader('data').load_data()
index = GPTTreeIndex(documents)
# save to disk
index.save_to_disk('index.json')
on this data: https://github.com/awsdocs/amazon-quicksight-user-guide
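A common way to think about the fix is budgeting: the chunk of document text, the prompt template around it, and the requested completion must together fit in the context window. The sketch below shows only that arithmetic; the function name and the 100-token template overhead are hypothetical, not gpt_index's actual internals:

```python
def max_chunk_tokens(max_context: int, num_output: int, template_overhead: int) -> int:
    """Largest document chunk that still leaves room for the
    prompt template and the requested completion."""
    budget = max_context - num_output - template_overhead
    if budget <= 0:
        raise ValueError("no room left for document text")
    return budget

# With the numbers from the error above and a hypothetical
# 100-token prompt template:
print(max_chunk_tokens(4097, 256, 100))  # 3741
```

If the splitter caps every chunk at this budget, the request can never exceed the model limit.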
Issue Analytics
- Created: 9 months ago
- Comments: 9 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Realized my directory traversal script included all my build files, including node_modules. Once I copied the specific folders I wanted into a new directory and targeted that folder, the script ran fine.
I'm guessing node_modules contained files that did not include the separator at all (like minified JS), so the splitter was unable to reduce them to the desired chunk size.
Apologies if unrelated to OP. Thanks for your help!
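The failure mode described in this comment is easy to reproduce with a toy separator-based splitter (a sketch only; gpt_index's real splitter is more involved): any piece of text that contains no separator cannot be split further and passes through oversized.

```python
def split_on_separator(text: str, max_len: int, sep: str = " ") -> list[str]:
    """Greedy splitter: pack separator-delimited pieces into chunks of
    at most max_len characters. A piece longer than max_len that
    contains no separator passes through unsplit."""
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_len:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # may itself exceed max_len
    if current:
        chunks.append(current)
    return chunks

prose = "short words split fine into small chunks"
minified = "(function(){return(42)})();"  # no separator anywhere

print(max(len(c) for c in split_on_separator(prose, 12)))     # 11 — within limit
print(max(len(c) for c in split_on_separator(minified, 12)))  # 27 — oversized chunk
```

This is exactly the minified-JS case: with no occurrence of the separator, the splitter has nowhere to cut, and the oversized chunk later blows the token budget.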
It doesn't error; it just emits that as what I assume is a warning at the beginning of the run. I'll try again shortly after applying the #88 fix and experimenting with my note above about using an incorrect separator.