Description of the bug:
When using max_output_tokens in generate_content to limit the model's output, the following error is thrown when accessing response.text, whereas the expected behaviour is to get the output up to MAX_TOKENS.
ValueError: The response.text quick accessor only works when the response contains a valid Part, but none was returned. Check the candidate.safety_ratings to see if the response was blocked.
The actual response:
response:
GenerateContentResponse(
    done=True,
    iterator=None,
    result=glm.GenerateContentResponse({'candidates': [{'finish_reason': 2, 'index': 0, 'safety_ratings': [{'category': 9, 'probability': 1, 'blocked': False}, {'category': 8, 'probability': 1, 'blocked': False}, {'category': 7, 'probability': 1, 'blocked': False}, {'category': 10, 'probability': 1, 'blocked': False}], 'token_count': 0, 'grounding_attributions': []}]}),
)
This is the code snippet, taken from https://ai.google.dev/tutorials/python_quickstart#generation_configuration, which shows the expected behaviour: get up to 20 tokens of output, then cut off due to MAX_TOKENS.
import google.generativeai as genai

genai.configure(api_key='***')
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(
    'Tell me a story about a magic backpack.',
    generation_config=genai.types.GenerationConfig(
        # Only one candidate for now.
        candidate_count=1,
        # stop_sequences=['x'],  # Commented out: sometimes this works because 'x' appears in the output and the model stops before MAX_TOKENS is reached.
        max_output_tokens=20,
        temperature=1.0,
    ),
)

text = response.text
if response.candidates[0].finish_reason.name == "MAX_TOKENS":
    text += '...'
print(text)
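As a defensive workaround (not an SDK fix), one can bypass the response.text quick accessor and read the candidate's parts directly, treating a part-less candidate as empty text. The safe_text helper below is hypothetical, and it is demonstrated here against stub objects (via types.SimpleNamespace) rather than a live API call, since the real failing response has a candidate with no Part:

```python
from types import SimpleNamespace


def safe_text(response):
    """Join the text of all parts of the first candidate, or return ''
    if the candidate carries no content/parts (e.g. the MAX_TOKENS case
    reported above, where response.text raises ValueError instead)."""
    candidate = response.candidates[0]
    content = getattr(candidate, 'content', None)
    parts = getattr(content, 'parts', []) if content else []
    return ''.join(part.text for part in parts)


# Stub mimicking the failing response: a candidate with no content parts.
empty = SimpleNamespace(candidates=[SimpleNamespace(content=None)])

# Stub mimicking a normal response with one text part.
full = SimpleNamespace(candidates=[SimpleNamespace(
    content=SimpleNamespace(parts=[SimpleNamespace(text='Once upon a time')]))])
```

With this shape, safe_text(empty) yields '' instead of raising, and safe_text(full) yields the part text.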
Actual vs expected behavior:
https://ai.google.dev/tutorials/python_quickstart#generation_configuration
Output similar to the tutorial linked above (roughly 20 tokens of text), not an error when accessing response.text.
Works in AI Studio

Any other information you'd like to share?
Package information
Name: google-generativeai
Version: 0.5.0
Streaming partially works, but the last chunk (the one with finish_reason MAX_TOKENS) is still empty (this could be intended behaviour).
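The streaming observation can be sketched the same way: accumulate text across chunks while skipping any chunk that carries no parts, such as the empty final MAX_TOKENS chunk described above. The chunks list here is a stub standing in for the iterator returned by model.generate_content(..., stream=True); collect_stream is an illustrative helper, not an SDK function:

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate text from streamed chunks, ignoring part-less chunks."""
    pieces = []
    for chunk in chunks:
        candidate = chunk.candidates[0]
        content = getattr(candidate, 'content', None)
        parts = getattr(content, 'parts', []) if content else []
        pieces.append(''.join(part.text for part in parts))
    return ''.join(pieces)


# Stub stream: two normal chunks, then an empty final chunk (as reported).
chunks = [
    SimpleNamespace(candidates=[SimpleNamespace(
        content=SimpleNamespace(parts=[SimpleNamespace(text='Once upon ')]))]),
    SimpleNamespace(candidates=[SimpleNamespace(
        content=SimpleNamespace(parts=[SimpleNamespace(text='a time')]))]),
    SimpleNamespace(candidates=[SimpleNamespace(content=None)]),
]
```

This recovers the partial output even though the final chunk contributes nothing.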