Issue #14747: fix ParenPad to not flag unsupported Tokens #14792

mahfouz72 · 2024-04-13T16:38:09Z

Resolves: #14747
Resolves: #4175

Removed the deep scan from processExpression; it now scans only the first level of a given node (the direct children, not the whole subtree).
Added some tokens that were missed once the deep scan was removed.

Diff Regression config: https://gist.githubusercontent.com/mahfouz72/2b480a919e3a98f5609aeab17fb79b1b/raw/23cfa105f73bc7149494f1e0712be2376ad13fbd/parenpadbase.xml

Diff Regression patch config: https://gist.githubusercontent.com/mahfouz72/79f7d3b270b52b91da18557854ce6e60/raw/9d6c487e7b1df6d511ec2a808566db21759b91e1/parenpadpatch.xml

mahfouz72 · 2024-04-13T16:42:02Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

            }
-            else if (currentNode.hasChildren() && !isAcceptableToken(currentNode)) {
-                // Traverse all subtree tokens which will never be configured
-                // to be launched in visitToken()
-                currentNode = currentNode.getFirstChild();
-                continue;
-            }
-
-            // Go up after processing the last child
-            while (currentNode.getNextSibling() == null && currentNode.getParent() != ast) {
-                currentNode = currentNode.getParent();
-            }
            currentNode = currentNode.getNextSibling();
        }


removed the deep scan and just check the first child and all of its siblings

PS D:\test> cat src/Test3.java
public class Test3 {
    int i = (( (5*4) + ( 4 + 2)));
}
PS D:\test> java -jar checkstyle-10.14.2-all.jar -t src/Test3.java
COMPILATION_UNIT -> COMPILATION_UNIT [2:0]
`--CLASS_DEF -> CLASS_DEF [2:0]
    |--MODIFIERS -> MODIFIERS [2:0]
    |   `--LITERAL_PUBLIC -> public [2:0]
    |--LITERAL_CLASS -> class [2:7]
    |--IDENT -> Test3 [2:13]
    `--OBJBLOCK -> OBJBLOCK [2:19]
        |--LCURLY -> { [2:19]
        |--VARIABLE_DEF -> VARIABLE_DEF [3:4]
        |   |--MODIFIERS -> MODIFIERS [3:4]
        |   |--TYPE -> TYPE [3:4]
        |   |   `--LITERAL_INT -> int [3:4]
        |   |--IDENT -> i [3:8]
        |   |--ASSIGN -> = [3:10]
        |   |   `--EXPR -> EXPR [3:13]
        |   |       |--LPAREN -> ( [3:13]    <-- this and all its siblings only checked
        |   |       |--LPAREN -> ( [3:14]
        |   |       |--PLUS -> + [3:22]
        |   |       |   |--LPAREN -> ( [3:16]    <-- no deep scan so this will not be picked
        |   |       |   |--STAR -> * [3:18]
        |   |       |   |   |--NUM_INT -> 5 [3:17]
        |   |       |   |   `--NUM_INT -> 4 [3:19]
        |   |       |   |--RPAREN -> ) [3:20]
        |   |       |   |--LPAREN -> ( [3:24]
        |   |       |   |--PLUS -> + [3:28]
        |   |       |   |   |--NUM_INT -> 4 [3:26]
        |   |       |   |   `--NUM_INT -> 2 [3:30]
        |   |       |   `--RPAREN -> ) [3:31]
        |   |       |--RPAREN -> ) [3:32]
        |   |       `--RPAREN -> ) [3:33]
        |   `--SEMI -> ; [3:34]
        `--RCURLY -> } [4:0]
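
To make the difference concrete, here is a minimal sketch that contrasts a first-level scan with a deep scan on a toy copy of the EXPR subtree for `(( (5*4) + ( 4 + 2)))`. The `Node` class and both method names are hypothetical stand-ins for illustration; this is not Checkstyle's `DetailAST` and not the actual `processExpression` code.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanSketch {

    /** Toy AST node (hypothetical); only a type label and children. */
    public static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    static boolean isParen(Node node) {
        return node.type.equals("LPAREN") || node.type.equals("RPAREN");
    }

    /** First-level scan: looks at the direct children only. */
    public static int countParensShallow(Node root) {
        int count = 0;
        for (Node child : root.children) {
            if (isParen(child)) {
                count++;
            }
        }
        return count;
    }

    /** Deep scan: looks at every node in the subtree. */
    public static int countParensDeep(Node root) {
        int count = 0;
        for (Node child : root.children) {
            if (isParen(child)) {
                count++;
            }
            count += countParensDeep(child);
        }
        return count;
    }

    /** Builds the EXPR subtree printed above, by hand. */
    public static Node sampleExpr() {
        Node plus = new Node("PLUS",
            new Node("LPAREN"),
            new Node("STAR", new Node("NUM_INT"), new Node("NUM_INT")),
            new Node("RPAREN"),
            new Node("LPAREN"),
            new Node("PLUS", new Node("NUM_INT"), new Node("NUM_INT")),
            new Node("RPAREN"));
        return new Node("EXPR",
            new Node("LPAREN"), new Node("LPAREN"), plus,
            new Node("RPAREN"), new Node("RPAREN"));
    }

    public static void main(String[] args) {
        Node expr = sampleExpr();
        System.out.println("shallow: " + countParensShallow(expr)); // prints "shallow: 4"
        System.out.println("deep: " + countParensDeep(expr));       // prints "deep: 8"
    }
}
```

The four parens nested under PLUS are reached only by the deep scan, which is why PLUS and similar operators would have to become tokens of the check once the deep scan is gone.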

mahfouz72 · 2024-04-13T16:43:32Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

+            TokenTypes.TYPECAST,
+            TokenTypes.STAR,
+            TokenTypes.PLUS,
+            TokenTypes.MINUS,
+            TokenTypes.DIV,
+            TokenTypes.MOD,
+            TokenTypes.LAND,
+            TokenTypes.LOR,
+            TokenTypes.LNOT,


Added some tokens that were missed once the deep scan was removed. (We may need to add more, but these were the obvious ones from the test and input files.)

mahfouz72 · 2024-04-13T16:47:12Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

-     */
-    private boolean isAcceptableToken(DetailAST ast) {
-        return acceptableTokens.get(ast.getType());
-    }
-


We don't need this anymore. It existed to avoid checking the same node twice: during the deep scan, when we reached a subtree rooted at an acceptable token, we did not descend into it because visitToken() would pick it up later.

Now that there is no deep scan, this check is unnecessary. I removed this method, the related class fields that are no longer needed, and the corresponding test.

mahfouz72 · 2024-04-13T16:49:06Z

Github, generate report

mahfouz72 · 2024-04-13T16:59:16Z

Github, generate site

github-actions · 2024-04-13T17:03:26Z

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024170214/index.html

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024170214/checks/whitespace/parenpad.html#Properties

github-actions · 2024-04-13T17:40:02Z

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024173934/reports/diff/index.html

mahfouz72 · 2024-04-13T18:14:14Z

there are huge differences.
I want to know if I'm on the right track before proceeding to add more tokens.

romani · 2024-04-13T22:43:51Z

Please extend the link-check suppression files with the lines that CI suggests.

mahfouz72 · 2024-04-14T00:29:19Z

@romani done. Should I add all the missed tokens identified in the regression report in this PR?

romani · 2024-04-14T02:54:27Z

I do not recommend rushing to add tokens; some of them were left out for good reason. Let's add them one by one, with full attention to the regression diff report.

mahfouz72 · 2024-04-14T03:08:02Z

One by one in separate PRs, or in this PR?
And what about the tokens added so far? All of them were added to cover the parens in the input files that are no longer checked without the deep scan.

nrmancuso · 2024-05-01T13:01:49Z

I do not recommend rushing to add tokens; some of them were left out for good reason. Let's add them one by one, with full attention to the regression diff report.

@romani I don't think we can do this without a bunch of hacks; as soon as we stop "deep scanning", we need to extend the tokens for this check to keep the behavior consistent.

@mahfouz72 can you help us to understand what other tokens we may need to add here, and how you are discovering them?

mahfouz72 · 2024-05-01T13:55:55Z

@nrmancuso we need to add every token that can appear under an EXPR token. For example, all mathematical, logical, and bitwise operators should be added. So far I have added the ones I discovered from the unit tests that failed once the deep scan was removed, but there are more tokens that the input files don't use.

Examples:

bitwise: & , | , ~ , ^ , << , >>
comparison : < <= > >= == !=
assignments : += -= = /= *= etc..

I am discovering them from the regression report. In general, as I stated above, I need to think of every token that can be used in an expression and can be surrounded by parens.

What do you think, should I start working this way, paying full attention to the report to keep the behaviour consistent?

rnveach · 2024-05-01T23:16:19Z

@nrmancuso @romani Could this be a sign we need a ParenPadExpression check? Will someone ever want different spacing for different expression tokens?

mahfouz72 · 2024-05-01T23:41:10Z

Will someone ever want different spacing for different expression tokens?

IMO, no. No one will need to flag x = (3+5) but not x += (3+5). That's why I see this step, #14792 (comment), as making the check weird and an unnecessary mess of tokens (we are talking about 15 to 20 new tokens).

But at the same time, how do we pick up the cases that are missed without the deep scan, without adding all those tokens?

I don't know if this is a good design, but can we leave the deep scan and while scanning we skip tokens that are not any of the tokens mentioned here #14792 (comment)? We would have
isMathematicalOperator(), isBitwiseOperator(), isLogicalOperator(), etc., and skip the validation while deep scanning based on the type of token.
So basically this solution is almost the same as #14792 (comment), but we won't explicitly add them as new tokens of the check; we will check for them internally while deep scanning and ignore all other tokens.
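
A rough sketch of that idea, assuming a toy `Node` type and an internal operator list whose exact contents are a guess (the labels mirror `TokenTypes` constant names, but the thread never fixes the final set). The deep scan descends only through operator nodes, so parens under any other token are never reached. The sample tree is contrived for illustration and is not real Checkstyle output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class FilteredScanSketch {

    /** Toy AST node (hypothetical), not Checkstyle's DetailAST. */
    public static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Assumed internal lists; not configurable by the user.
    static final Set<String> MATHEMATICAL = Set.of("PLUS", "MINUS", "STAR", "DIV", "MOD");
    static final Set<String> LOGICAL = Set.of("LAND", "LOR", "LNOT");
    static final Set<String> BITWISE = Set.of("BAND", "BOR", "BXOR", "BNOT", "SL", "SR");

    static boolean isOperator(Node node) {
        return MATHEMATICAL.contains(node.type)
            || LOGICAL.contains(node.type)
            || BITWISE.contains(node.type);
    }

    static boolean isParen(Node node) {
        return node.type.equals("LPAREN") || node.type.equals("RPAREN");
    }

    /** Counts parens, descending only into operator subtrees. */
    public static int countParens(Node root) {
        int count = 0;
        for (Node child : root.children) {
            if (isParen(child)) {
                count++;
            } else if (isOperator(child)) {
                count += countParens(child);
            }
            // Any other token type is skipped together with its subtree.
        }
        return count;
    }

    /** Contrived tree: parens under PLUS are reached, those under RECORD_PATTERN_DEF are not. */
    public static Node sampleTree() {
        return new Node("EXPR",
            new Node("LPAREN"),
            new Node("PLUS", new Node("LPAREN"), new Node("NUM_INT"), new Node("RPAREN")),
            new Node("RPAREN"),
            new Node("RECORD_PATTERN_DEF", new Node("LPAREN"), new Node("RPAREN")));
    }

    public static void main(String[] args) {
        System.out.println(countParens(sampleTree())); // prints 4
    }
}
```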

rnveach · 2024-05-02T00:01:07Z

but can we leave the deep scan and while scanning we skip tokens that are not any of the tokens mentioned here

What is the difference between leaving the deep scanning (dropping the issue) or leaving the deep scanning while checking this internal only list? My understanding was you built the list from the deep scanning (and/or regression).

I was thinking along the lines that we break this check apart. We drop expression support here and create an expression-only check (we want to do this for another check, #5945), specify all the tokens you suggested, but make them non-configurable (must stop on them) with no deep scanning. I can't really imagine people wanting different configs for different expression tokens. If someone does, we have this isolated check to make it easier to explain.

The problem with deep scanning is that it's easy to lose control, and it's harder to understand what exactly it is looking at. We have had issues before where a check went beyond its boundaries in scanning. We are having a similar discussion for UnusedLocalVariableCheck regarding pitest. Other examples are #5234 and #5124, where I specifically mentioned that branchContains is a dangerous method to use all the time, since there is no restriction on how deep it can search.

mahfouz72 · 2024-05-02T00:23:17Z

What is the difference between leaving the deep scanning (dropping the issue) or leaving the deep scanning while checking this internal only list?

If we leave the deep scan and check only this internal list, we will avoid validating tokens that definitely should not be validated, for example RECORD_PATTERN_DEF, which I have in the issue related to this PR.

One of the main problems with the deep scan was that we were checking tokens that are not in the configuration. If we enforce checking only this internal list, we avoid that problem.

We drop expression support here and create an expression only check

This is a good solution and I am leaning toward it, if we really want to remove the deep scan rather than keep it and use this list hack to make it check only specific tokens.

nrmancuso · 2024-05-15T13:23:41Z

Ok, looks like we have some deep seated issues with the implementation and design of ParenPad. Before we consider writing and designing some new "expression only check", we should clearly define our expectations for ParenPad itself, in terms of user facing behavior. No user cares about (or knows of) any "deep scanning", only false positives/negatives.

IMO, customization (turning support on/off for tokens in our case) is one of the things that is killing us on this check. Take a look at ESLint's take on this check: https://eslint.style/rules/js/space-in-parens

Imagine that we were to implement a check like ESLint's. Would that make our life easier? If we were to write a check as simple as ESLint's, would we need a special check for expressions?

mahfouz72 · 2024-05-18T23:40:24Z

We can simplify this check by making the required tokens LPAREN and RPAREN. We would always stop on those two tokens only and check the padding of each and every LPAREN or RPAREN, as simple as that. In that case there is no need for a special check for expressions; we will stop on all of them.

but with this design, the check will not be configurable; users can't customize it for specific tokens, so I dunno if this is good
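
To illustrate the "one rule for every paren" idea, here is a deliberately simplified sketch that works on a raw source line instead of the AST. It assumes a "no space" policy and ignores string literals, comments, and line breaks, all of which a real check would have to handle; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleParenPadSketch {

    /**
     * Returns the 0-based columns of parentheses that break the
     * assumed "no space" policy: '(' followed by a space, or
     * ')' preceded by a space. No per-token configuration exists.
     */
    public static List<Integer> violations(String line) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '(' && i + 1 < line.length() && line.charAt(i + 1) == ' ') {
                result.add(i);
            } else if (c == ')' && i > 0 && line.charAt(i - 1) == ' ') {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The expression from the Test3 example earlier in this thread.
        System.out.println(violations("int i = (( (5*4) + ( 4 + 2)));")); // prints [9, 19]
    }
}
```

Every paren is treated the same regardless of which token it belongs to, so nothing nested is missed, at the cost of the per-token customization discussed above.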

nrmancuso · 2024-05-25T16:38:28Z

but with this design, the check will not be configurable; users can't customize it for specific tokens, so I dunno if this is good

I propose that we do the following:

Open an issue for making a simple check that is configurable only in the way ESLint's check is.
Close this PR
Allow us to discuss this further in the issue

Generally, I like to start with some simple concept like this and get it out to users and get real feedback, instead of trying to dream up everything a user could ask for initially (this is always doomed to fail, since lots of folks write code differently).

Also, what users may have liked 10 years ago is not necessarily what they like now.

I think it is time to start over with this check.

mahfouz72 · 2024-05-25T21:46:33Z

#14896

mahfouz72 force-pushed the remove-deepscan branch from b1677ed to 0f44b8f Compare April 13, 2024 16:47

mahfouz72 force-pushed the remove-deepscan branch from 0f44b8f to c89bd82 Compare April 13, 2024 17:07

mahfouz72 force-pushed the remove-deepscan branch from c89bd82 to a3f279c Compare April 13, 2024 18:53

mahfouz72 force-pushed the remove-deepscan branch from a3f279c to c16e3bc Compare April 14, 2024 00:00

Issue checkstyle#14747: fix ParenPad to not flag unsupported Tokens

5a8389d

mahfouz72 force-pushed the remove-deepscan branch from c16e3bc to 5a8389d Compare April 14, 2024 00:07

nrmancuso self-assigned this May 7, 2024

rnveach added the high demand label May 8, 2024

mahfouz72 closed this May 25, 2024

mahfouz72 mentioned this pull request May 26, 2024

Create New Parentheses Padding Check #14896

Open
