fix(skills): preserve non-ASCII characters in skill frontmatter#2917
fix(skills): preserve non-ASCII characters in skill frontmatter#2917seiya-koji wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR ensures non-ASCII (Unicode) characters are preserved when generating/rendering SKILL.md YAML frontmatter by enabling Unicode output in YAML serialization, and adds regression tests to prevent reintroducing escaped output.
Changes:
- Enable
allow_unicode=Truein multipleyaml.safe_dump(...)calls that generate SKILL.md frontmatter. - Add a new extension-skills test that installs an extension with a Unicode description and asserts SKILL.md preserves it.
- Add an integration test ensuring Claude skill rendering preserves Unicode characters.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_extension_skills.py | Adds a Unicode-focused extension fixture + SKILL.md assertion to prevent escaped descriptions. |
| tests/integrations/test_integration_claude.py | Adds a regression test for Unicode preservation in rendered skills. |
| src/specify_cli/presets.py | Enables Unicode output when dumping YAML frontmatter for preset/skill reconciliation flows. |
| src/specify_cli/integrations/claude/init.py | Enables Unicode output when dumping YAML frontmatter during skill rendering. |
| src/specify_cli/extensions.py | Enables Unicode output when dumping YAML frontmatter for extension skill registration. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Addressed Copilot's refactoring suggestion in ae74205. It reaches a few more call sites than the original fix, so the diff is a bit larger. |
|
Please address merge conflicts by rebasing on upstream/main |
Skill SKILL.md frontmatter descriptions containing non-ASCII characters were escaped to \uXXXX / \xXX sequences because yaml.safe_dump() was called without allow_unicode=True. - Add allow_unicode=True to the 7 skill/command frontmatter safe_dump sites (extensions, presets, claude integration) - Add regression tests for the render and extension-install paths Follows the approach of github#1936; encoding="utf-8" is already set on the affected write paths, so no encoding change is needed here. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Centralize skill/command frontmatter YAML serialization into a single _utils.dump_frontmatter helper so no call site can drop allow_unicode or diverge on formatting. Route the 7 existing sites through it and drop a now-unused local yaml import. Switch the extension test fixtures to yaml.safe_dump for parity with the production safe-dump/safe-load codepaths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ae74205 to
146b6d5
Compare
|
Rebased the branch onto the latest main. |
Description
Skill
SKILL.mdfrontmatterdescriptionvalues with non-ASCII characters wereescaped to
\uXXXX/\xXX, because theyaml.safe_dump(...)calls that renderskill frontmatter were missing
allow_unicode=True.#1936 fixed this for the YAML I/O that existed then; the skill frontmatter paths
added later (native skills migration) regressed it.
This PR adds
allow_unicode=Trueto the 7 skill/command frontmattersafe_dumpsites (
extensions.py,presets.py,integrations/claude/__init__.py) and addsregression tests for the render and extension-install paths.
Testing
Added regression tests for both frontmatter paths (render and extension-install),
ran the full suite (3718 passed, 45 skipped), and confirmed the fix end-to-end on a
sample project — a unicode extension's description (CJK + accented Latin + emoji)
survived intact in the generated SKILL.md.
uv run specify --helpuv sync && uv run pytestAI Disclosure
Implemented with Claude Code (Opus 4.8).