fix(core): skip redundant image captioning for quoted images when main model is multimodal #8398
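When a message containing an image is quoted (for example, a QQ reply to a message with an image), `_process_quote_message` unconditionally falls back to the current main model for image captioning if `default_image_caption_provider_id` is not configured. The main model is therefore called twice (once to caption, once to reply), and the captioning prompt is hard-coded as "Please describe the image content." and cannot be changed.

Changes in this fix:
- When no caption model is configured, first check whether the main model's `modalities` contains "image".
- If the main model is multimodal, skip the extra captioning call; the main model receives the original image directly.
- If the main model is not multimodal, keep the fallback captioning behavior.
- The captioning prompt is now read from the `image_caption_prompt` configuration option instead of being hard-coded.
- The warning log is emitted only when a caption model is configured but the provider does not exist.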
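The branching described in the list above can be sketched as a small pure function. This is only an illustration under stated assumptions: `QuoteImageAction` and `decide_quote_image_action` are hypothetical names that do not appear in the PR, and the real `_process_quote_message` works on provider objects rather than plain values.

```python
from enum import Enum


class QuoteImageAction(Enum):
    """Hypothetical labels for the four outcomes described in the PR."""

    CAPTION_WITH_DEDICATED = "caption_with_dedicated"
    WARN_PROVIDER_MISSING = "warn_provider_missing"
    PASS_IMAGE_TO_MAIN = "pass_image_to_main"
    CAPTION_WITH_MAIN = "caption_with_main"


def decide_quote_image_action(
    img_cap_prov_id: str,
    caption_provider_found: bool,
    main_modalities: list,
) -> QuoteImageAction:
    """Pick how an image in a quoted message should be handled."""
    if img_cap_prov_id:
        # A dedicated caption provider is configured: use it, or warn if it
        # cannot be resolved.
        if caption_provider_found:
            return QuoteImageAction.CAPTION_WITH_DEDICATED
        return QuoteImageAction.WARN_PROVIDER_MISSING
    if "image" in main_modalities:
        # Multimodal main model: skip the extra captioning call.
        return QuoteImageAction.PASS_IMAGE_TO_MAIN
    # Non-multimodal main model: keep the original fallback behavior.
    return QuoteImageAction.CAPTION_WITH_MAIN
```

Keeping the decision separate from the provider calls would also make each branch easy to cover in a unit test.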
Hey - I've found 1 issue, and left some high level feedback:
- In the branch where `img_cap_prov_id` is not set, if `main_prov` exists but is not an instance of `Provider`, we silently skip captioning and logging; consider adding an explicit fallback or at least a debug log so this misconfiguration is visible.
- When reading `modalities` from `main_prov.provider_config`, it may be safer to defensively handle non-dict or missing `provider_config` (e.g., via `getattr(main_prov, 'provider_config', {})`) to avoid unexpected attribute or type errors for custom providers.
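A minimal sketch of the defensive read suggested in the second point, assuming only that a provider may or may not expose a `provider_config` attribute. The helper name `supports_image_input` and the `FakeProvider` class are hypothetical and exist only for illustration.

```python
from typing import Any


def supports_image_input(provider: Any) -> bool:
    """Return True only if provider.provider_config lists "image" in modalities."""
    config = getattr(provider, "provider_config", None)
    if not isinstance(config, dict):
        # Missing or non-dict config: treat the provider as text-only.
        return False
    modalities = config.get("modalities")
    if not isinstance(modalities, (list, tuple, set)):
        # Guard against a bare string, where `"image" in "image"` would be True.
        return False
    return "image" in modalities


class FakeProvider:
    """Stand-in for a custom provider, used only in this example."""

    def __init__(self, provider_config: Any) -> None:
        self.provider_config = provider_config
```

For example, `supports_image_input(FakeProvider(None))` and `supports_image_input(object())` both return `False` instead of raising.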
## Individual Comments
### Comment 1
<location path="astrbot/core/astr_main_agent.py" line_range="832-842" />
<code_context>
             )
-    else:
-        logger.warning("No provider found for image captioning in quote.")
+    elif img_cap_prov_id:
+        logger.warning(
+            "Configured image caption provider %s not found for quote.",
+            img_cap_prov_id,
+        )
 except BaseException as exc:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** When no provider is found and no `img_cap_prov_id` is configured, the failure becomes silent.
This change removes the warning when no captioning provider can be resolved and only logs when a configured `img_cap_prov_id` is missing. If `img_cap_prov_id` is falsy and `get_using_provider` returns a non-`Provider`/`None`, captioning is skipped silently. Please add at least a warning (or debug log) when no captioning-capable provider can be resolved so configuration problems remain visible.
```suggestion
        if llm_resp.completion_text:
            content_parts.append(
                f"[Image Caption in quoted message]: "
                f"{llm_resp.completion_text}"
            )
    elif img_cap_prov_id:
        logger.warning(
            "Configured image caption provider %s not found for quote.",
            img_cap_prov_id,
        )
    else:
        # No specific image caption provider configured and none could be resolved.
        # Log this so misconfiguration or missing capabilities remain visible.
        logger.warning("No provider found for image captioning in quote; skipping image captioning.")
except BaseException as exc:
```
</issue_to_address>
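But the key point is that astrbot is a framework: if we trigger sys.exit, it simply shuts the whole thing down, and that has no effect at all on the next restart.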
Code Review
This pull request improves image captioning in quoted messages by checking if the main chat model natively supports image inputs before attempting an extra captioning call, and introduces a configurable image caption prompt. The review feedback suggests skipping the captioning call entirely when the main model lacks image support and no dedicated provider is configured to prevent API failures. Additionally, it recommends catching Exception instead of BaseException to avoid catching system-exit signals, and suggests refactoring the shared logic and adding unit tests.
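The difference between catching `Exception` and `BaseException` mentioned in this summary can be seen in a small standalone sketch. The names `run_guarded`, `fail`, and `request_exit` are made up for the example and are not part of AstrBot.

```python
import sys


def run_guarded(task, catch):
    """Run task() and report whether an exception of type `catch` was swallowed."""
    try:
        task()
        return "completed"
    except catch as exc:
        return f"swallowed {type(exc).__name__}"


def fail():
    # An ordinary error, such as a failed captioning call.
    raise ValueError("caption call failed")


def request_exit():
    # sys.exit raises SystemExit, which derives from BaseException, not Exception.
    sys.exit(1)
```

With these definitions, `run_guarded(fail, Exception)` swallows the `ValueError`, and `run_guarded(request_exit, BaseException)` swallows the `SystemExit` as well, while `run_guarded(request_exit, Exception)` lets `SystemExit` propagate to the caller. `KeyboardInterrupt` and, since Python 3.8, `asyncio.CancelledError` also derive from `BaseException`, which is relevant for handlers inside async code.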
else:
    # Fallback: main model does not support images.
    prov = main_prov
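If the main model does not support images (`modalities` does not contain `image`) and no dedicated caption model is configured, trying to caption the image with the main model will make the API call fail and log an error. To avoid this call that is bound to fail, we should skip the captioning call, as we do when the main model is multimodal, and log a warning. In addition, if this logic is duplicated between the handling of direct attachments and quoted attachments, consider refactoring it into a shared helper function and writing unit tests for the new functionality.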
else:
    # Main model does not support images, and no dedicated caption provider is configured.
    # Skip captioning to avoid a guaranteed API failure.
    content_parts.append("[Image in quoted message]")
    logger.warning(
        "Main model %s does not support image input and no dedicated image caption provider is configured. Skipping image captioning.",
        main_prov.provider_config.get("id", "unknown"),
    )
    prov = None

References
- When implementing similar functionality for different cases (e.g., direct vs. quoted attachments), refactor the logic into a shared helper function to avoid code duplication.
- New functionality, such as handling attachments, should be accompanied by corresponding unit tests.
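This would break the existing fallback behavior. The logic of the original code is: when no caption model is configured, fall back to letting the main model caption the image itself. Your use case is exactly "the main model is multimodal", so it takes the multimodal branch we added. If the main model is not multimodal, the original fallback behavior should be kept; otherwise users on non-multimodal models would suddenly lose captioning for images in quoted messages.

        "Configured image caption provider %s not found for quote.",
        img_cap_prov_id,
    )
except BaseException as exc: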
This BaseException was already in the original code; I did not introduce it. Do I have to change this too? QwQ
force-pushed from 1854598 to b8c58ab
The ruff format check fails; this does not actually affect functionality.
force-pushed from 7c5d6bd to 0bcc662
Already fixed in 022a5dd, and you have been added as Co-author.
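When a user quotes a message that contains an image and `default_image_caption_provider_id` is not configured, `_process_quote_message` unconditionally falls back to the current main model for image captioning. As a result:
- the main model is called twice (once to caption, once to reply);
- the `image_caption_prompt` configuration option has no effect at all.

For users who have disabled LTM automatic image understanding and rely on quoting an image to let the bot see it, every quoted image hits this problem.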
Modifications
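`astrbot/core/astr_main_agent.py`, `_process_quote_message`: when no caption model is configured, check whether the main model's `modalities` contains `"image"`:
- If the main model is multimodal, skip the extra captioning call. The main model receives the original image directly (added to `req.image_urls` by `build_main_agent`), and only an `[Image in quoted message]` placeholder is kept in the quoted text.
- If the main model is not multimodal, keep the fallback captioning behavior.
- The captioning prompt is read from `image_caption_prompt`.

This is NOT a breaking change.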
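The prompt lookup and the placeholder described in the modifications above can be sketched as follows. This is a hedged illustration, not the code in the PR: the helper names and the plain `dict` settings object are hypothetical, and falling back to the previously hard-coded prompt when the option is unset is an assumption.

```python
from typing import Optional

# The previously hard-coded prompt, used here as an assumed default.
DEFAULT_CAPTION_PROMPT = "Please describe the image content."


def resolve_caption_prompt(settings: dict) -> str:
    """Read image_caption_prompt from settings, falling back to the default."""
    prompt = settings.get("image_caption_prompt")
    if isinstance(prompt, str) and prompt.strip():
        return prompt
    return DEFAULT_CAPTION_PROMPT


def quoted_image_text(caption: Optional[str]) -> str:
    """Build the text inserted into the quoted message for one image."""
    if caption:
        return f"[Image Caption in quoted message]: {caption}"
    # No caption was produced: keep only a placeholder in the quoted text.
    return "[Image in quoted message]"
```

With a multimodal main model, only the placeholder from `quoted_image_text(None)` would appear in the quoted text, since the model receives the image itself.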
Screenshots or Test Results
Verification steps:
- Leave `default_image_caption_provider_id` empty.

Behavior comparison:
Checklist
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
🤓 I have ensured that no new dependencies are introduced.
😮 My changes do not introduce malicious code.