@ahiltenkamp

The Batik CSS parser is no longer actively maintained and supports only CSS 2.0.

As a step towards replacing Batik, all parser-implementation-specific code was moved into the CssParser class.
Because CssParser is the only class with a direct dependency on Batik, it was moved to a separate batik package.
CssScanner now uses the SAC Parser interface instead of CssParser, which extends the Batik Parser.
It also uses its own ParserFactory, which makes it possible to use a different CSS parser without modifying the AntiSamy code.
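
The factory indirection described above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: StyleParser is a hypothetical one-method stand-in for the SAC Parser interface, and NaiveParserFactory and SketchScanner are invented names used only to show how a scanner can depend on a factory interface instead of a concrete parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the SAC parser interface, reduced to one method
// so that the sketch compiles without any external dependency.
interface StyleParser {
    List<String> parseSelectors(String css);
}

// Hypothetical factory abstraction: callers ask for "a parser" and never
// name a concrete implementation.
interface ParserFactory {
    StyleParser makeParser();
}

// The only class that knows which parser is used. In the PR this role is
// played by the Batik-specific package; here it is a toy implementation.
final class NaiveParserFactory implements ParserFactory {
    @Override
    public StyleParser makeParser() {
        return css -> {
            List<String> selectors = new ArrayList<>();
            int pos = 0;
            while (true) {
                int open = css.indexOf('{', pos);
                if (open < 0) {
                    break;
                }
                // Text before each '{' is treated as the selector.
                selectors.add(css.substring(pos, open).trim());
                int close = css.indexOf('}', open);
                if (close < 0) {
                    break;
                }
                pos = close + 1;
            }
            return selectors;
        };
    }
}

// Stand-in for the scanner: it depends on the factory interface only,
// so swapping the parser means swapping the factory, not the scanner.
final class SketchScanner {
    private final ParserFactory factory;

    SketchScanner(ParserFactory factory) {
        this.factory = factory;
    }

    List<String> scan(String css) {
        return factory.makeParser().parseSelectors(css);
    }
}
```

With this shape, replacing the parser would only require passing a different ParserFactory to the scanner's constructor.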

There are two places where the old exception handling looked odd to me, and I don't know why it was written that way. I've marked both places with @todo.

The failing test also fails on the current AntiSamy main branch; I did not try to fix it.

# Introduce ParserFactory, which allows using a different CssParser
# Make CssScanner independent of the concrete parser implementation
# Encapsulate CSS parser implementation-specific code in a separate package and class
@davewichers
Collaborator

@spassarop - Have you reviewed this PR?

@spassarop
Collaborator

Regarding what the PR intends to implement, it looks good to me. Unfortunately, we have no context about the exception handling you are curious about. One of them will be deleted in the next major version anyway.

@davewichers if the styling or anything else you want to check at high level is all right, then it can be merged.

@ahiltenkamp
Author

In CssScanner#L171 I was not sure whether it is OK to catch the CssException here, because previously it was a ParseException. Was that only because there was no CssException before?
The @todo is only a reminder to double-check whether I have broken something at this point.

In CssScanner#L324, in my opinion the Batik ParseException can never happen, but I am not sure whether I missed something. I was a little confused about why it is caught at this point, since the parsing happens later. The HTTP client only returns a string, which is parsed as CSS afterwards. I may be wrong, though, because I have never written such a handler.

@davewichers
Collaborator

@spassarop - can you review/answer these questions? I don't want to merge until the discussion is complete and you are OK with everything.

@spassarop
Collaborator

spassarop commented Nov 24, 2025 via email

@davewichers
Collaborator

@spassarop - Can you do whatever other work is needed to get this ready to be merged in?

@ahiltenkamp
Author

ahiltenkamp commented Jan 8, 2026

My proposal would be not to re-add the ParseException, because re-adding it would lose the separation between the Batik part and the non-Batik part. In my opinion, the pull request should work as it is. I only asked about the exception handling to be sure that I did not miss something.

@spassarop
Collaborator

I get your point. Regarding the TODO comments, they can be removed, and the ParseException can be left out. However, where the code now catches CSSException | IOException e, I request adding java.lang.RuntimeException to compensate for the absence of ParseException (which extends it). That way the Batik exception is not in the common code, and the potential failure flow is still covered by catching the exception and generating a ScanException.

@ahiltenkamp - Can you do that so this is merged?
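
A minimal sketch of the requested catch clause, under stated assumptions: ScanException, VendorParseException, ParseStep, and ScanSketch below are hypothetical stand-ins written for illustration, not AntiSamy or Batik classes. One caveat that may matter when applying the request: Java rejects a multi-catch whose alternatives are related by subclassing, and SAC's CSSException is, to my knowledge, itself a RuntimeException. If so, RuntimeException would replace CSSException in the clause rather than sit next to it.

```java
import java.io.IOException;

// Hypothetical stand-in for AntiSamy's ScanException (a checked exception here).
class ScanException extends Exception {
    ScanException(Throwable cause) {
        super(cause);
    }
}

// Hypothetical parser-specific failure. It is unchecked, mirroring the
// statement in the discussion that ParseException extends RuntimeException.
class VendorParseException extends RuntimeException {
    VendorParseException(String message) {
        super(message);
    }
}

// A parse step that may fail with an I/O error or any unchecked exception.
interface ParseStep {
    void run() throws IOException;
}

final class ScanSketch {
    // Common code names no parser-specific exception type, yet every
    // unchecked parser failure is still converted into a ScanException.
    static void scan(ParseStep step) throws ScanException {
        try {
            step.run();
        } catch (RuntimeException | IOException e) {
            throw new ScanException(e);
        }
    }
}
```

Catching RuntimeException is broader than catching a single parser exception, so unrelated programming errors raised inside the parse step would also surface as a ScanException. That is the trade-off for keeping the Batik type out of the common code.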
