-
I am creating my index with the following lines:

```python
Settings.llm = OpenAI(model="gpt-4-0125-preview", temperature=0.1)
Settings.embed_model = OpenAIEmbedding()
index = VectorStoreIndex(nodes=nodes)
```

So why am I getting an error saying that the model's maximum context length has been exceeded, when the OpenAI documentation specifies that gpt-4-0125-preview supports a 128,000-token context window?
-
Hey @nshern! Great to see you back here - hope you're doing well and ready for another round of troubleshooting. The error message you're seeing is coming from the OpenAI API, not from the LlamaIndex framework. It's indicating that the total number of tokens in a single request exceeds the maximum context length that the model can handle. The discrepancy with the 128k context window you expect from gpt-4-0125-preview most likely means the failing request was made by the embedding model, whose input limit is much smaller (8,192 tokens for the default OpenAI embedding model), not by the LLM itself.
To resolve this issue, you could try the following:

- Reduce the chunk size used when splitting your documents into nodes, so no single node exceeds the embedding model's token limit.
- Explicitly configure the embedding model you intend to use, so the limit being applied is the one you expect.

A sketch of the first option follows below.
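This is a minimal sketch only, assuming your `nodes` are produced from your own `documents` via the `llama_index.core` node parser API; the splitter parameters are illustrative, not prescriptive:

```python
from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Re-split the source documents into smaller nodes so that no single node
# exceeds the embedding model's ~8,192-token input limit.
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
nodes = splitter.get_nodes_from_documents(documents)

index = VectorStoreIndex(nodes=nodes)
```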
I hope this helps! If you have any other questions or need further clarification, feel free to ask.
-
Facing the same issue here running on GPT-4 Turbo; see the error message we get in our backend:

```
2024-05-14 20:01:06 Retrying llama_index.embeddings.openai.base.get_embedding in 6.3257633998151785 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 14607 tokens (14607 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}.
```

Has anyone else found a solution or knows a workaround for this type of problem?
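Note that the failing call is `get_embedding`, so the 8192-token limit is the embedding model's, not the LLM's. One way to find the offending chunks before hitting the API is to count tokens per node locally. This is just a sketch, assuming `nodes` is the list you pass to `VectorStoreIndex` and that the embedding model uses the `cl100k_base` encoding:

```python
import tiktoken

# cl100k_base is the encoding used by OpenAI's text-embedding models.
enc = tiktoken.get_encoding("cl100k_base")
EMBED_LIMIT = 8192  # limit reported in the 400 error above

for i, node in enumerate(nodes):
    n_tokens = len(enc.encode(node.get_content()))
    if n_tokens > EMBED_LIMIT:
        print(f"node {i} has {n_tokens} tokens and will fail to embed")
```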
-
I'm facing the same issue with the TS library. Here's the setting I'm using:
I've tried with other GPT-4 models as well, but no luck.
-
I met the same issue. The solution is to reduce the chunk size, for example as sketched below.
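Just as an illustration (assuming the Python `llama_index.core` Settings API; the data directory is a placeholder), lowering the global chunk size keeps each embedded chunk well under the 8,192-token limit:

```python
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader

# Smaller chunks -> each embedding request stays well under 8,192 tokens.
Settings.chunk_size = 512

documents = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder path
index = VectorStoreIndex.from_documents(documents)
```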
The problem was with the embedding model: the context-length error came from the embedding requests, not from gpt-4-0125-preview.
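For anyone landing here later, a sketch of making the embedding model explicit, so it is clear which context window the 400 error refers to (the model names are examples, not a recommendation):

```python
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# The LLM has a 128k context window, but embeddings are a separate model
# with its own, much smaller input limit, so chunks must fit that limit.
Settings.llm = OpenAI(model="gpt-4-0125-preview", temperature=0.1)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
```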