Going over the maximum context length when building an index

Hey, I'm getting an error when building an index because gpt_index is trying to go over the model's maximum context length.

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4169 tokens (3913 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
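The numbers in that message can be checked by hand: the prompt and the reserved completion together must fit in the context window. The following is a minimal sketch of that arithmetic using only the figures from the error; the helper names are hypothetical and are not part of gpt_index or the OpenAI client.

```python
# Context window size reported in the error message above.
MAX_CONTEXT_TOKENS = 4097


def fits_in_context(prompt_tokens: int, completion_tokens: int,
                    max_context: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if the prompt plus the reserved completion fits in the window."""
    return prompt_tokens + completion_tokens <= max_context


def max_prompt_tokens(completion_tokens: int,
                      max_context: int = MAX_CONTEXT_TOKENS) -> int:
    """Largest prompt that still leaves room for the completion."""
    return max_context - completion_tokens


# Figures from the error: 3913 prompt tokens, 256 reserved for the completion.
requested = 3913 + 256
overflow = requested - MAX_CONTEXT_TOKENS

print(fits_in_context(3913, 256), requested, overflow, max_prompt_tokens(256))
```

So the request is rejected because the prompt is larger than the window allows once the completion budget is reserved, which suggests at least one text chunk was not cut down to size.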

To reproduce:

from gpt_index import GPTTreeIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = GPTTreeIndex(documents)

# save to disk
index.save_to_disk('index.json')

Run on this data: https://github.com/awsdocs/amazon-quicksight-user-guide

Issue Analytics

  • State: closed
  • Created 9 months ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
ArcadeCityMayor commented, Dec 8, 2022

Realized my directory traversal script included all my build files including node_modules. Once I copied over the specific folders I wanted into a new directory and targeted that folder, the script ran fine.
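The workaround described here (copying only the wanted files into a clean directory and pointing the loader at it) could be scripted roughly as follows. This is a sketch using only the standard library; the function name, the excluded directory names, and the allowed extensions are assumptions for illustration and should be adjusted to the project.

```python
import os
import shutil
from typing import List

# Hypothetical defaults, not taken from the issue.
EXCLUDED_DIRS = {"node_modules", ".git", "build", "dist"}
ALLOWED_EXTENSIONS = {".md", ".txt"}


def copy_indexable_files(src_root: str, dst_root: str) -> List[str]:
    """Copy files with allowed extensions into dst_root, skipping excluded directories.

    Returns the sorted relative paths of the files that were copied.
    """
    copied = []
    for current_dir, dir_names, file_names in os.walk(src_root):
        # Prune in place so os.walk never descends into excluded directories.
        dir_names[:] = [d for d in dir_names if d not in EXCLUDED_DIRS]
        for name in file_names:
            if os.path.splitext(name)[1].lower() not in ALLOWED_EXTENSIONS:
                continue
            source = os.path.join(current_dir, name)
            rel_path = os.path.relpath(source, src_root)
            target = os.path.join(dst_root, rel_path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(source, target)
            copied.append(rel_path)
    return sorted(copied)
```

The destination directory can then be passed to SimpleDirectoryReader as in the reproduction snippet. Whether the reader descends into subdirectories depends on the gpt_index version, so that is worth checking before relying on nested folders.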

I’m guessing in node_modules there were files that did not include the separator at all (like minified JS), and so the splitter was unable to reduce that to the desired chunk size.
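That guess can be illustrated with a toy splitter. The sketch below is not gpt_index's actual implementation; it is a hypothetical separator-based splitter that counts characters as a stand-in for tokens. It shows why input without any separator passes through as one oversized chunk, and how a hard cut would bound the size.

```python
from typing import List


def split_on_separator(text: str, separator: str, chunk_size: int) -> List[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is emitted as an oversized chunk,
    because this splitter can only cut at the separator.
    """
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_with_hard_limit(text: str, separator: str, chunk_size: int) -> List[str]:
    """Split on the separator first, then force-cut any chunk that is still too long."""
    result: List[str] = []
    for chunk in split_on_separator(text, separator, chunk_size):
        for start in range(0, len(chunk), chunk_size):
            result.append(chunk[start:start + chunk_size])
    return result


prose = "one two three four"
minified = "x" * 25  # no spaces at all, like minified JS

print(split_on_separator(prose, " ", 9))
print([len(c) for c in split_on_separator(minified, " ", 9)])
print([len(c) for c in split_with_hard_limit(minified, " ", 9)])
```

With ordinary prose every chunk stays within the limit, while the separator-free string comes back as a single chunk larger than the limit unless the hard cut is applied.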

Apologies if unrelated to OP. Thanks for your help!

1 reaction
ArcadeCityMayor commented, Dec 8, 2022

"@ArcadeCityMayor re: the first issue, do you have a stack trace + is it erroring for you?"

It doesn't error; it just prints that, as a warning I guess, when the script starts running. I'll try again shortly, after trying the #88 fix and experimenting with my note above about using an incorrect separator.
