Python: Fix HTML tags silently dropped in prompt templates #13654
Closed
gambletan wants to merge 2 commits into microsoft:main
Conversation
When a prompt template contains HTML tags like <p>, the XML parser in from_rendered_prompt() would silently drop them because they are not recognized as 'message' or 'chat_history' elements. This fix serializes unrecognized XML elements back to their string form and appends them to the preceding message content, preserving HTML tags like <p>, <div>, etc. in the final prompt sent to the model. Fixes microsoft#13632
Add tests verifying that HTML tags like <p>, <b>, <i> in prompt templates are preserved when parsed by from_rendered_prompt(). Covers the scenario reported in microsoft#13632.
Collaborator
Hi @gambletan, we appreciate the contribution. This issue is already fixed in an open PR: #13633. Closing this as a duplicate.
Summary
Fixes #13632
When a prompt template contains HTML tags like <p>, the ChatHistory.from_rendered_prompt() method wraps the rendered prompt in a <root> element and parses it as XML. The XML parser treats <p> as a valid XML element, but since the code only recognizes <message> and <chat_history> tags, the <p> element (and its content) is silently dropped. This causes the model to receive an empty or truncated prompt.
Root cause
In python/semantic_kernel/contents/chat_history.py, the from_rendered_prompt() method iterates over child XML elements but only handles message and chat_history tags. Any other valid XML element (including HTML tags like <p>, <div>, <b>, etc.) is ignored, losing both the element and its text content.
Fix
When an unrecognized XML element is encountered, serialize it back to its string representation using tostring() and append it (along with any trailing text) to the preceding message's content. This preserves HTML tags in the prompt text sent to the model.
Changes
- python/semantic_kernel/contents/chat_history.py: Added else branch in from_rendered_prompt() to handle unrecognized XML elements by converting them back to string form
- python/tests/unit/contents/test_chat_history.py: Added regression tests for HTML tags (<p>, <b>, <i>) in prompt templates
Test plan
- test_chat_history_from_rendered_prompt_with_html_tags verifies the exact scenario from "Python: Bug: HTML tag <p> getting blocked in Semantic Kernel version 1.39.4" #13632
- test_chat_history_from_rendered_prompt_with_multiple_html_tags verifies multiple HTML tags
- test_chat_history_from_rendered_prompt_html_with_message_tags verifies HTML works alongside <message> tags
- <message> and <chat_history> tags unaffected
🤖 Generated with Claude Code
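The root cause and the fix described in this PR can be illustrated with a small standalone sketch. It is not the code from chat_history.py: the helper name collect_text, the keep_unknown flag, and the use of the standard library's xml.etree.ElementTree are assumptions made for illustration. It only shows how an unrecognized element such as <p> is lost when the loop skips it, and how serializing it with tostring() keeps it.

```python
from xml.etree.ElementTree import fromstring, tostring

ROOT = "root"  # wrapper tag, as described in the summary
KNOWN_TAGS = {"message", "chat_history"}  # tags the parser is said to recognize


def collect_text(rendered: str, keep_unknown: bool) -> str:
    """Hypothetical helper: recover prompt text from a rendered template.

    keep_unknown=False mimics the reported behaviour (unknown elements dropped).
    keep_unknown=True serializes them back to markup, as the fix describes.
    """
    root = fromstring(f"<{ROOT}>{rendered}</{ROOT}>")
    parts = [root.text or ""]
    for child in root:
        if child.tag not in KNOWN_TAGS and keep_unknown:
            # tostring() emits the element together with its trailing text.
            parts.append(tostring(child, encoding="unicode"))
        else:
            # Real code would build message objects for known tags; this
            # sketch keeps only the text that follows the element.
            parts.append(child.tail or "")
    return "".join(parts).strip()


if __name__ == "__main__":
    prompt = "Summarize: <p>some text</p> briefly"
    print(repr(collect_text(prompt, keep_unknown=False)))
    print(repr(collect_text(prompt, keep_unknown=True)))
```

Note that re-serialized markup is not guaranteed to be byte-identical to the input (for example, an empty element may come back in self-closing form), so tests built on this idea should compare against the serialized form rather than the raw template.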