fix(core): skip redundant image captioning for quoted images when main model is multimodal #8398

Closed
Sisyphbaous-DT-Project wants to merge 1 commit into AstrBotDevs:master from Sisyphbaous-DT-Project:fix/quote-image-caption-fallback

Sisyphbaous-DT-Project (Contributor) commented May 28, 2026

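When a user quotes a message that contains an image and default_image_caption_provider_id is not configured, _process_quote_message unconditionally falls back to the current main model to caption the image. As a result:

  1. The main model is called twice (once to caption, once to reply), which adds latency and token cost.
  2. A multimodal main model never sees the original image directly, only a text description of it.
  3. The caption prompt is hardcoded as "Please describe the image content.", so the image_caption_prompt setting has no effect at all.

Users who have turned off LTM automatic image understanding and rely on quoting images to let the bot see them hit this problem every time they quote an image.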

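Modifications

  • astrbot/core/astr_main_agent.py, _process_quote_message:
    When no caption model is configured, check whether the main model's modalities include "image".

    • If the main model is multimodal: skip the extra captioning call (build_main_agent has already added the original image to req.image_urls) and keep only an [Image in quoted message] placeholder in the quoted text.
    • If the main model is not multimodal: keep the fallback captioning behavior, but read the prompt from image_caption_prompt in the config file.
  • This is NOT a breaking change.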
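
The branching described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CaptionPlan`, `plan_quote_image_handling`, and the dict-shaped `settings` argument are hypothetical names, and falling back to the old hardcoded prompt when `image_caption_prompt` is empty is an assumption. Only `modalities`, `image_caption_prompt`, `default_image_caption_provider_id`, and the `[Image in quoted message]` placeholder come from the description.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the old hardcoded prompt is reused when no prompt is configured.
DEFAULT_PROMPT = "Please describe the image content."


@dataclass
class CaptionPlan:
    """Hypothetical result type: what to do with an image in a quoted message."""

    run_caption_call: bool            # whether an extra captioning LLM call is made
    caption_provider: Optional[str]   # "dedicated", "main", or None
    prompt: Optional[str]             # prompt for the captioning call, if any
    placeholder: Optional[str]        # text kept in the quote instead of a caption


def plan_quote_image_handling(
    caption_provider_id: str,
    main_modalities: list[str],
    settings: dict,
) -> CaptionPlan:
    prompt = settings.get("image_caption_prompt") or DEFAULT_PROMPT
    if caption_provider_id:
        # A dedicated caption model is configured: it captions the image.
        return CaptionPlan(True, "dedicated", prompt, None)
    if "image" in main_modalities:
        # Multimodal main model: no extra call, the original image is passed along.
        return CaptionPlan(False, None, None, "[Image in quoted message]")
    # Non-multimodal main model: keep the fallback captioning call.
    return CaptionPlan(True, "main", prompt, None)
```

Returning a plan instead of performing the call keeps the decision easy to unit test without a live provider.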

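Screenshots or Test Results

Verification steps:

  1. Configure a multimodal model as the main chat model and leave default_image_caption_provider_id empty.
  2. In QQ, quote an earlier message that contains an image and send some text.
  3. Check the debug log: the main model should be called only once, and there should be no extra captioning call with "Please describe the image content."

Behavior comparison:

| Scenario | Before | After |
| --- | --- | --- |
| No caption model configured + multimodal main model | Main model called twice (caption → reply) | Main model sees the original image directly, called once |
| No caption model configured + non-multimodal main model | Main model called twice | Fallback kept, prompt configurable |
| Caption model configured | Specified model captions, hardcoded prompt | Specified model captions, prompt read from config |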
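
To make the call counts in the table concrete, here is a toy simulation. `FakeProvider`, its `text_chat` method, and the two `handle_*` functions are hypothetical stand-ins written for this illustration; they are not AstrBot classes and do not reproduce the real request flow.

```python
import asyncio


class FakeProvider:
    """Hypothetical stand-in for a chat model provider that records each call."""

    def __init__(self, modalities):
        self.modalities = modalities
        self.calls = []

    async def text_chat(self, prompt, image_urls=None):
        self.calls.append((prompt, list(image_urls or [])))
        return f"reply to: {prompt}"


async def handle_before(main, user_text, quoted_image):
    # Old behavior: always caption with the main model first, then reply.
    caption = await main.text_chat("Please describe the image content.", [quoted_image])
    return await main.text_chat(f"{user_text}\n[Image Caption in quoted message]: {caption}")


async def handle_after(main, user_text, quoted_image, caption_prompt):
    if "image" in main.modalities:
        # Multimodal main model: pass the original image, no extra caption call.
        return await main.text_chat(f"{user_text}\n[Image in quoted message]", [quoted_image])
    # Non-multimodal main model: fallback captioning with the configured prompt.
    caption = await main.text_chat(caption_prompt, [quoted_image])
    return await main.text_chat(f"{user_text}\n[Image Caption in quoted message]: {caption}")


def count_calls(handler, *args):
    """Run one handler and return how many times the main model was called."""
    main = args[0]
    asyncio.run(handler(*args))
    return len(main.calls)
```

With a multimodal `FakeProvider`, `count_calls(handle_before, ...)` gives 2 and `count_calls(handle_after, ...)` gives 1, matching the first row of the table.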

Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced.

  • 😮 My changes do not introduce malicious code.

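When a message containing an image is quoted (for example, replying to an image message in QQ) and default_image_caption_provider_id is not configured, _process_quote_message unconditionally falls back to the current main model to caption the image. The main model is therefore called twice (once to caption, once to reply), and the caption prompt is hardcoded as "Please describe the image content." and cannot be changed.

Fixes:
- When no caption model is configured, first check whether the main model's modalities include "image"
- If the main model is multimodal, skip the extra captioning call; the main model receives the original image directly
- If the main model is not multimodal, keep the fallback captioning behavior
- Read the caption prompt from the image_caption_prompt setting instead of hardcoding it
- Emit the warning log only when a caption model is configured but its provider does not exist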

dosubot (bot) added the labels size:M (This PR changes 30-99 lines, ignoring generated files.), area:core (The bug / feature is about astrbot's core, backend), and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) on May 28, 2026

sourcery-ai (bot, Contributor) left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • In the branch where img_cap_prov_id is not set, if main_prov exists but is not an instance of Provider, we silently skip captioning and logging; consider adding an explicit fallback or at least a debug log so this misconfiguration is visible.
  • When reading modalities from main_prov.provider_config, it may be safer to defensively handle non-dict or missing provider_config (e.g., via getattr(main_prov, 'provider_config', {})) to avoid unexpected attribute or type errors for custom providers.
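
The defensive read suggested in the second point might look like the sketch below. `supports_image_input` is a hypothetical helper name, and treating a missing or malformed `provider_config` as "no image support" is an assumption made for this illustration; the real project may prefer a different default.

```python
from types import SimpleNamespace
from typing import Any


def supports_image_input(provider: Any) -> bool:
    """Return True only if the provider's config clearly lists "image" as a modality."""
    config = getattr(provider, "provider_config", None)
    if not isinstance(config, dict):
        # Custom providers may lack provider_config or expose a non-dict value.
        return False
    modalities = config.get("modalities")
    if not isinstance(modalities, (list, tuple, set)):
        return False
    return "image" in modalities


# Usage with stand-in objects (not real AstrBot providers):
multimodal = SimpleNamespace(provider_config={"modalities": ["text", "image"]})
text_only = SimpleNamespace(provider_config={"modalities": ["text"]})
custom = object()  # a custom provider with no provider_config attribute

print(supports_image_input(multimodal))  # True
print(supports_image_input(text_only))   # False
print(supports_image_input(custom))      # False
```

Under this default, an unreadable config sends the request down the fallback captioning path instead of raising an attribute or type error.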


Comment on lines 832 to 842

                     if llm_resp.completion_text:
                         content_parts.append(
    -                        f"[Image Caption in quoted message]: {llm_resp.completion_text}"
    +                        f"[Image Caption in quoted message]: "
    +                        f"{llm_resp.completion_text}"
                         )
    -            else:
    -                logger.warning("No provider found for image captioning in quote.")
    +            elif img_cap_prov_id:
    +                logger.warning(
    +                    "Configured image caption provider %s not found for quote.",
    +                    img_cap_prov_id,
    +                )
             except BaseException as exc:


suggestion (bug_risk): When no provider is found and no img_cap_prov_id is configured, the failure becomes silent.

This change removes the warning when no captioning provider can be resolved and only logs when a configured img_cap_prov_id is missing. If img_cap_prov_id is falsy and get_using_provider returns a non-Provider/None, captioning is skipped silently. Please add at least a warning (or debug log) when no captioning-capable provider can be resolved so configuration problems remain visible.

Suggested change

                     if llm_resp.completion_text:
                         content_parts.append(
                             f"[Image Caption in quoted message]: "
                             f"{llm_resp.completion_text}"
                         )
                 elif img_cap_prov_id:
                     logger.warning(
                         "Configured image caption provider %s not found for quote.",
                         img_cap_prov_id,
                     )
    +            else:
    +                # No specific image caption provider configured and none could be resolved.
    +                # Log this so misconfiguration or missing capabilities remain visible.
    +                logger.warning("No provider found for image captioning in quote; skipping image captioning.")
             except BaseException as exc:

Sisyphbaous-DT-Project (Contributor, Author) replied:

But the key point is that AstrBot is a framework: if we trigger sys.exit, it simply shuts the whole thing down, and that has no effect at all on using it after the next restart.

gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request improves image captioning in quoted messages by checking if the main chat model natively supports image inputs before attempting an extra captioning call, and introduces a configurable image caption prompt. The review feedback suggests skipping the captioning call entirely when the main model lacks image support and no dedicated provider is configured to prevent API failures. Additionally, it recommends catching Exception instead of BaseException to avoid catching system-exit signals, and suggests refactoring the shared logic and adding unit tests.

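Comment on lines +808 to +810

                    else:
                        # Fallback: main model does not support images.
                        prov = main_prov

medium

If the main model does not support images (modalities does not include image) and no dedicated caption model is configured, trying to caption the image with the main model will make the API call fail and log an error. To avoid this API call that is bound to fail, we should skip the captioning call, just as we do when the main model is multimodal, and log a warning. In addition, if this logic is duplicated between the handling of direct attachments and quoted attachments, consider refactoring it into a shared helper function and writing unit tests for the new functionality.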

                    else:
                        # Main model does not support images, and no dedicated caption provider is configured.
                        # Skip captioning to avoid a guaranteed API failure.
                        content_parts.append("[Image in quoted message]")
                        logger.warning(
                            "Main model %s does not support image input and no dedicated image caption provider is configured. Skipping image captioning.",
                            main_prov.provider_config.get("id", "unknown"),
                        )
                        prov = None
References
  1. When implementing similar functionality for different cases (e.g., direct vs. quoted attachments), refactor the logic into a shared helper function to avoid code duplication.
  2. New functionality, such as handling attachments, should be accompanied by corresponding unit tests.

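Sisyphbaous-DT-Project (Contributor, Author) replied:

This would break the existing fallback behavior. The original logic is exactly this: when no caption model is configured, fall back to the main model captioning the image itself. Your use case is precisely "the main model is multimodal", so it takes the multimodal branch we added. If the main model is not multimodal, the original fallback behavior should be kept; otherwise users on non-multimodal models would suddenly lose captioning for images in quoted messages.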

                        "Configured image caption provider %s not found for quote.",
                        img_cap_prov_id,
                    )
            except BaseException as exc:

medium

Catching BaseException is generally discouraged because it also catches exceptions related to system exit (such as KeyboardInterrupt and SystemExit), which can make it hard for the application to terminate normally. Consider catching Exception instead so that only ordinary runtime errors are caught.

Suggested change

    -        except BaseException as exc:
    +        except Exception as exc:
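
The difference between the two clauses can be seen in a small standalone sketch. `run_guarded`, `request_exit`, and `fail_normally` are hypothetical helpers written for this illustration and are unrelated to the AstrBot code; the only fact relied on is Python's built-in hierarchy, where `SystemExit` and `KeyboardInterrupt` derive from `BaseException` but not from `Exception`.

```python
def run_guarded(action, catch):
    """Run `action`, swallowing only exceptions of type `catch`; report what happened."""
    try:
        action()
    except catch as exc:
        return f"swallowed {type(exc).__name__}"
    return "completed"


def request_exit():
    # Same exception type that sys.exit() raises.
    raise SystemExit(1)


def fail_normally():
    # Stands in for an ordinary runtime error, e.g. a failed caption request.
    raise ValueError("caption request failed")
```

`run_guarded(fail_normally, Exception)` swallows the `ValueError`, and `run_guarded(request_exit, BaseException)` swallows the exit request as well, while `run_guarded(request_exit, Exception)` lets `SystemExit` propagate to the caller. Since Python 3.8, `asyncio.CancelledError` also derives from `BaseException`, so a `BaseException` handler in async code can swallow task cancellation too.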

Sisyphbaous-DT-Project (Contributor, Author) replied:

This BaseException was already in the original code; I did not introduce it. Do I have to change this too? QwQ

Sisyphbaous-DT-Project force-pushed the fix/quote-image-caption-fallback branch 2 times, most recently from 1854598 to b8c58ab (May 28, 2026 14:01)

Sisyphbaous-DT-Project (Contributor, Author) commented:

The ruff format check fails, but this does not actually affect functionality.

Soulter force-pushed the master branch 2 times, most recently from 7c5d6bd to 0bcc662 (May 28, 2026 16:47)

Soulter (Member) commented May 28, 2026

This has already been fixed in 022a5dd, and you have been added as a co-author.

Soulter closed this May 28, 2026