Fix XPath normalize-space function#324

Open
tompng wants to merge 1 commit into
ruby:masterfrom
tompng:fix_normalize_spaces

Conversation

Member

@tompng tompng commented Jun 5, 2026

normalize-space() should return the normalized string value of only the first node in the node-set, consistent with other functions such as string() and number().

doc = Nokogiri::XML.parse(<<~XML)
  <a><b>breakfast     boosts\t\t
  
  concentration   </b><c>
  Coffee beans
  aroma
  </c><d>        Dessert
  \t\t    after    dinner</d></a>
XML
doc.xpath("normalize-space(//text())")
#=> "breakfast boosts concentration"
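The expected behavior can be sketched in plain Ruby, without REXML or Nokogiri. This is only an illustration under stated assumptions: `normalize_space_first` is a hypothetical helper (not an API of either library), and the node-set is modeled as an Array of strings standing in for the text nodes above.

```ruby
# Hypothetical helper, not part of REXML or Nokogiri.
# Models XPath 1.0 normalize-space() over a node-set given as an Array of strings.
def normalize_space_first(value)
  # XPath 1.0 string(): a node-set converts to the string value of its
  # first node; an empty node-set converts to "".
  text = value.is_a?(Array) ? (value.first || "") : value
  # Collapse runs of XPath whitespace (space, tab, CR, LF), then trim both ends.
  text.to_s.gsub(/[ \t\r\n]+/, " ").strip
end

# Stand-ins for the three text nodes in the example document.
texts = [
  "breakfast     boosts\t\t\n\nconcentration   ",
  "\nCoffee beans\naroma\n",
  "        Dessert\n\t\t    after    dinner",
]

puts normalize_space_first(texts)
# prints: breakfast boosts concentration
```

Only the first element contributes to the result; the remaining text nodes are ignored, which is the single-string behavior this PR aims for.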

This bug blocks the implementation of delayed nodeset ordering.

def match(path_stack, node)
  nodeset = [XPathNode.new(node, position: 1)]
  result = expr(path_stack, nodeset)
  case result
  when Array # nodeset ← HERE, normalize-space function returned array of string
    unnode(result).uniq # Changing to sort(result) failed
  else
    [result]
  end
end
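A minimal sketch of why an Array of strings is a problem for this kind of dispatch. Everything here is hypothetical and for illustration only: `PositionedNode`, `classify`, and `sort_nodeset` are made-up names and do not reflect REXML's actual internals.

```ruby
# Hypothetical stand-in for a node that knows its document position.
PositionedNode = Struct.new(:name, :position)

# Mirrors a `case result when Array` dispatch: any Array counts as a nodeset.
def classify(result)
  result.is_a?(Array) ? :nodeset : :scalar
end

# Hypothetical ordering step: sort nodes by document position.
def sort_nodeset(nodes)
  nodes.sort_by(&:position)
end

nodes = [PositionedNode.new("c", 2), PositionedNode.new("b", 1)]
strings = ["breakfast boosts concentration", "Coffee beans aroma"]

p classify(nodes)                  # :nodeset
p classify(strings)                # :nodeset, although these are plain strings
p sort_nodeset(nodes).map(&:name)  # ["b", "c"]

begin
  sort_nodeset(strings)            # strings carry no document position
rescue NoMethodError => e
  puts "cannot order strings as nodes: #{e.class}"
end
```

If a function returns a single string instead, the dispatch falls through to the scalar branch and the ordering step never sees non-node values.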

Copilot AI review requested due to automatic review settings June 5, 2026 13:09

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR changes how normalize-space() behaves in REXML’s XPath functions and updates the associated test expectation to match the new behavior.

Changes:

  • Simplifies Functions::normalize_space to always coerce via string() and return a single normalized string.
  • Updates the test_normalize_space_strings assertion to only expect one normalized value.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/functions/test_base.rb | Updates expected result for normalize-space(//text()) to a single string. |
| lib/rexml/functions.rb | Replaces array-aware normalization logic with a single-string implementation. |

Comment thread lib/rexml/functions.rb
-  else
-    string.to_s.strip.gsub(/\s+/um, ' ')
-  end
+  string(string).strip.gsub(/\s+/um, ' ')
Comment thread test/functions/test_base.rb Outdated
-    "Dessert after dinner",
-  ],
-  normalized_texts)
+  assert_equal(["breakfast boosts concentration"], normalized_texts)