-
Notifications
You must be signed in to change notification settings - Fork 563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't allow expressions in backticks only identifier #4479
Comments
What's the actual problem here?
As far as I can tell, the cost of this feature in the parser is one straightforward production. There's no special lexer behavior or monadic action. It's hard to imagine a user-visible feature that is less complex as far as the parser is concerned.
Any evidence that this trips beginners up?
language-cst-parser matches the canonical Haskell parser on this feature without complication. Are you encountering difficulty with some other PureScript parser? |
When the first response I read is Back to the issue at hand: The compiler, and all duplication of the tooling, has a cost to maintain and build that is proportional to the features it implements and the accidental complexity of how those features are implemented. This is like any software, the compiler is not special in this regard. Do you need references for this? As with optimizations: You can squeeze as much performance out of every cpu cycle as you can but nothing beats not doing work. The same is for maintaining code. You can use DRY, the 7 steps of naming, law of demeter and or any other strategy to improve maintainability, but noting beats deleting code. So asking the question, Is this feature worth the cost it has? and my hypothesis is no, it doesn't. There are other ways to write the same thing and i don't see backticks used that much in the wild and never seen backtick expression used. I also know that there are teams has formatters that removes backticks all together. When it comes to making it easier for people new to Purescript, its built on the assumption that there already is a good way to express the same thing. When reading code you want to focus as much as possible on the code in front of you, that is why naming is so important. You don't want to go and lookup what a function does because you don't understand what it is doing or don't trust that its doing what it says. This is even worse with language features. A feature that is not used frequently makes it even more difficult for unexperienced developer to know and learn. In best case it can guess from experience, and in worst case they needs to stop what they are doing and go and learn new syntax, which can be harder then reading source of a function that has a bad name. |
On a more technical note @rhendric
I belive that the change would mean that ithe production could instead become a lex token making it much simpler and quicker to parse, but its a bit beside the point. |
Because I was wondering the same thing that Ryan said above, I'll ask that question instead. What is your need? Or perhaps stated differently, when and why did this become an issue for you, @Neppord? I feel like there's probably more context here than what has been stated. |
First, thanks for asking! :) I'm developing the intellij plugin for purescript. And it has partial support for backticks. So there is no need for them to go away for that parser to work, it's a mix between the compiler and the one written in purescript. So my need is not to be able to implement the thing, that is already done. But whenever I read that part of the grammar and think about how often I see it's being used I get an urge to remove it. It's very deeply ingrained in me that even the small improvements over time have a big payoff. It's the 1% improvement a week that leads to (1.01**52) ≈ 68% improvement over a year that is beckoning me. So any easy and simple improvement i can make to any system is a itch i want to scratch. It turns out though that this improvement wasn't so easy to make, there seams to be much more opinions against it then I imagineed and I'm about to give up, on this improvement. I can try to dig deeper into my needs if this wasn't sufficient. |
What needs do you two whant to fulfill by keeping the implementation as is? @rhendric and @JordanMartinez? What have I missed? |
Thanks for providing more context! To restate what I think you said.... You essentially want to build something superb by making many small changes over time (i.e. the IntelliJ plugin for PureScript). Every time you look at the extended infix expressions (as I call them in my reference work), there's an itch that says, "This is a flaw in the language that should be fixed by removing it." Previously, you would just ignore it, though it would bother you. Now you are raising an issue for it. Generally, when making changes to the compiler/language, we need justification for such a change. However, some changes don't really have an objective justification as much as they are subjective opinions. This case initially felt more like a subjective opinion (i.e. "I think extended infix expressions are a bad idea, so let's remove them") rather than an objective one. Hence, why I was initially confused. Thus far, you have described removing these kinds of expressions from the language as an improvement, but it wasn't obvious to me how they are an improvement and whom benefits if such a change was done. I'm guessing whatever "pushback" you feel that you've gotten has been more from this angle (i.e. "Is this an improvement?") than people disagreeing that this should be done. Or put differently, I wanted to make sure this requested change wasn't a symptom of some larger issue (e.g. having trouble writing a parser than can handle this case). Getting back on point. Should this be removed? Here are the topics of conversation that come to my mind:
As for me, I think it's worthwhile to remove this feature because it's not a "necessary" language feature and the workaround isn't burdensome to write. I think the argument for removing it is further strengthened if the below things are true:
If someone goes through the effort of showing exactly how many times this feature is used in the latest package set and the usage count and library count are low, I think arguments for keeping it would be weakened. However, this is just my opinion. |
So first of all, I'm biased: I like and use this feature. I consider it one of the small quality-of-life improvements that PureScript offers over Haskell, and I wish GHC supported it. I would miss it if PureScript dropped it. That doesn't mean I would object to its removal if it caused more problems than benefits, on net. But I am currently utterly unconvinced of any problems it causes (no, I don't need the abstract concept of the maintenance cost of code reiterated; not all lines of code have equal maintenance cost and I need convincing that these specific lines cause maintenance problems, preferably from someone who actually maintains them), and would like to see a stronger case for these problems more than I would want to see a comprehensive survey of how much this feature is used—if it doesn't cause any actual problems for actual users or maintainers, then one happy user is enough to keep it in my opinion, and I am such a one. All that said, I would be more sympathetic to a proposal to make this feature easier for external tools to handle by reducing its interaction with layout. I see no need to support infix expressions spanning multiple lines, for example, if that's difficult for some of these external systems. As a point of information, this feature was added by Phil as discussed in #839. (A link in that issue points to this usage by @natefaubion of seven years ago, and I'm really curious to hear from him about why the apparent about-face in whatever message people are quoting in this thread.) |
Oh, and just to be clear, @Neppord, since we miscommunicated before: when I use the word ‘actual’, I'm not intending to imply that problems you experience are invalid. I mean ‘actual’ as opposed to ‘hypothetical’—I want examples of difficulties you've actually had in your actual experience, not difficulties you hypothesize that you could have or that you hypothesize that PureScript maintainers have. Any problems that you've had in your actual experience are valid and I would like to hear about them, even if we end up disagreeing on the outcome. |
Here is the discussion on discord that made me in the end create this PR, https://discord.com/channels/864614189094928394/872650895744176188/1122110393985286206 |
That usage likely represents the only time I've ever used it, and it was only a sort of "see how it feels" usage. You can probably find an example of any kind of syntax at some point or another in code I've written. At that time I also noted "I could take it or leave it". Since then, I contributed the parser and layout algorithm. I think the layout formulation we have (as baroque as it appears to be) is crucial to the proliferation of (accurate) alternative parsers. It's written in a quirky style that's easy to translate verbatim to other languages, but is otherwise largely impossible to reason about given any mistakes. I feel pretty strongly that, given the current state of the parser and layout algorithm we would likely not add this feature today if it was proposed (especially given hindsight about it's usage), and I nominally support removing features that create additional complexities for layout. Another example (probably a more significant contributor to complexity) being unquoted layout keywords in record accessor position. In particular for backticks, since it isn't a symmetric delimiter, you don't know if any given backtick is an open or close, so you have to:
Ideally, I had wanted nested backticks to be fully prohibited, but this requires forking the expr grammar below the appropriate precedence level. So I opted to just supporting it in layout with the backtracking check. Another alternative could be to only support a single application spine with non-recursive terms (I think this would allow all the examples I've seen), which would allow us to only need to inspect the top item in the layout context like all the other rules. Whether or not it should actually be removed (aside from my strong opinion on the syntax overall) is a different question. I have no intention of doing any of the work necessary to actually make it happen, so that means I'm still in the "take it or leave it" camp unless someone is actually motivated to see it through and just needs an approval. |
I definitely can do the work, that is "the small effort" in my opinion. And it will also benefit me in learning the codebase better. |
The work is more than the compiler though. I'd consider it to also be researching it's usage in the ecosystem, and potentially contributing fixes to make the transition easier if we determine it's worth it. |
I have a project that uses the CST parser that index the package-set to enable some features in the plugin. I ran it on psc-0.15.7 since it was the last one i had indexed and looked for infix expressions that contained more then an identifier, this was the output: {
"these": 4,
"psa-utils": 3,
"ps-cst": 1,
"profunctor": 1,
"optparse": 3,
"halogen-hooks": 1,
"graphql-client": 4,
"datetime": 1
} |
@natefaubion I agree, and as you can see I have already started to gather data. And since i plan for adding a warning first, people will be able to ask for help. |
Summary
The hypothesis is that the feature of having whole expressions in backticks are not used that often. And that it overcomplicates the parser.
Motivation
The feature was added due to the ease of doing so when using a different parsing library. By removing this feature we hope to make both the language simpler to learn for beginner and the parser easier to reproduce.
also:
Proposal
Proposal is to first add a warning that the feature will go a way. and then remove it after 6 months.
Examples
this is currently legal:
and in the future this expression would need to be written like so:
The text was updated successfully, but these errors were encountered: