"Maximum input token count 4919 exceeds limit of 4096 for train data" in model-customization-job/amazon.titan-text-lite-v1:0:4k/nhjsh25oes0i in notebook 03_Model_customization/03_continued_pretraining_titan_text.ipynb
FireballDWF added a commit to FireballDWF/amazon-bedrock-workshop that referenced this issue on Apr 3, 2024: "Fix Maximum input token count 4919 exceeds limit of 4096 for train data in 03_Model_customization/03_continued_pretraining_titan_text.ipynb" (aws-samples#224)
I was able to fix the issue by reducing the chunk size and chunk overlap to 5000 and 1000 characters, respectively. The value 5000 was an assumption; the job would likely still succeed with any chunk size below 10,000 (someone above observed that the model would not get created with a chunk size of 10,000):
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=5000,     # characters per chunk; keeps estimated tokens under the 4096 limit
    chunk_overlap=1000,  # overlap for continuity across chunks
)
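To sanity-check a chunk size before submitting a customization job, you can estimate token counts per chunk. The sketch below is a hypothetical pure-Python approximation (not the notebook's code, and not the LangChain splitter itself): it splits text by characters with overlap and applies the rough heuristic of ~4 characters per token for English text, so 5000 characters is roughly 1250 tokens, well under the 4096 limit.

```python
def split_with_overlap(text, chunk_size=5000, chunk_overlap=1000):
    """Naive character-based splitter approximating the chunking above."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

def estimate_tokens(chunk, chars_per_token=4):
    # Rough heuristic: ~4 characters per token for typical English text.
    return len(chunk) // chars_per_token

sample = "lorem ipsum " * 2000  # ~24,000 characters of sample text
chunks = split_with_overlap(sample)
assert all(len(c) <= 5000 for c in chunks)
assert all(estimate_tokens(c) < 4096 for c in chunks)
```

The actual tokenizer used by the Titan models may count differently, so the error in this issue (4919 tokens from a larger chunk size) shows the safe chunk size has to leave real headroom below the 4096-token limit.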