Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add * wildcard support. #269

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wanderview
Copy link

No description provided.

@wanderview
Copy link
Author

@blakeembrey PTAL. This is the first in a series of pull requests to add changes/features necessary for urlpattern.

This first PR adds * wildcard support.

Note, I was going to squash the commits and add a better commit message, but it seems some newer version of a tool you use in the commit hook added a ton of commas at the end of arrays throughout the codebase. Not sure if you have a way to avoid that. Anyway, happy to try to reformat this however you would like.

Thanks.

@blakeembrey
Copy link
Member

I rebased this PR on master and force pushed to your branch (just a heads up on that one) to fix the formatting issue. Sorry about that!

},
],
[
["/", ["/", ""]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a test for just "" would help ensure it never regresses unexpectedly.

[],
],
[
"/test/*?",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand why this would work, but do we want it? Or is it better/less confusing for * to just be its own concept? E.g. *? wouldn't be allowed at all, * is just a regex .* (maybe even just non-matching if the backward compatibility doesn't care about capturing it?)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blakeembrey this is exactly what I enabled in my PR fyi, (making wildcard optional by default) however, non-capturing I think would be a bad idea because routers, for example, might want to display what trailing path segment was "not found" (it is very useful to use the wildcard like a default in a switch statement). You still want to know what the default ended up being so you can display that on your 404 page or pass it along to a sub-router.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, but if you really want to capture it you still could by doing a capturing group like (.*), it’s not all or nothing. I’m mostly a little iffy on the idea of conflating * inheriting any parameter behavior, but I will think more it. I think it mostly comes down to what we’re trying to add asterisk support for - is it backward compatibility with existing path matching patterns or does it have other features.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we are a bit locked into the behavior with urlpattern now since its shipped in two implementations. Obviously path-to-regexp can make its own decision, but it would mean diverging at least for now.

My argument for why I think this behavior is good is that it treats the implicit prefixes and modifiers consistently for all types of matching groups. It is also simple to explain that a * on its own is an alias for (.*). I feel like other options lead to more complex mental models. (I know that @HansBrende and I disagree on this, though.)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanderview I agree, that it is indeed unfortunate that you are a bit locked in... but IMO, it will be even more unfortunate when you are completely locked in once the feature comes out of experimental status and ships to all major browsers. To put it another way: you are less locked in right now than you will ever be in the future. So... all I'm saying is, let's not be too hasty and make sure the path forward is the one that makes the most sense... because the decisions made right now will affect all developers for eternity... and that is no small thing 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its auxiliary function is simply to change the default pattern for that variable to .*.

That’s not it’s function though. That’d break with the behavior for + and is incompatible with implementing your own matching group definition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it treats the implicit prefixes and modifiers consistently for all types of matching groups

This makes sense, but does it need to be a matching group? For the URLPattern use-case it’s backward compatibility with existing patterns, but do existing patterns use the match for anything?

Another way this can be explained simply is how it’s already been used and how it works in regex. It’s “zero or more of the pattern”. If there’s no pattern, it can assume it is zero or more of anything, so .*.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blakeembrey re: my proposal that "[the wildcard's] auxiliary function is simply to change the default pattern for that variable [that it is quantifying] to .*."

That’s not it’s function though.

But you also said in the comment below it:

It’s “zero or more of the pattern”. If there’s no pattern, it can assume it is zero or more of anything, so .*."

We are saying exactly the same thing here. I'm saying its auxiliary function (i.e. its additional function, besides being a quantifier) is to replace the default pattern with .*, so if the user does not supply a pattern, the pattern would be .*, exactly as you said! This is identical to what you just said... right? If so, your restated definition does not align with @wanderview's, but with mine, since his definition only applies if the variable is unnamed.

That’d break with the behavior for +

I'm not suggesting that the behavior of + should change, because it doesn't need to (although I could tweak my PR to include + as well, if you prefer). Asterisk is the only symbol that has a double-meaning here. It would also be ambiguous which segment would be required for + if we were to implement it with the same auxiliary function as *. If users want exactly the same behavior with + as for *, then what they are really after is a wildcard plus a mandatory path segment, and they could simply make that unambiguous by doing:

/:foo_0/:foo_rest* (if the required segment must be at the beginning)
Or:
/:foo_rest*/:foo_n (if the required segment must be at the end)
Or:
/:foo_leading*/:foo_i/:foo_trailing* (if the required segment can be anywhere)

and is incompatible with implementing your own matching group definition.

Matching groups themselves do not have a "default" (read: "fallback") pattern like variables do, so my definition does not change their previous behavior even if they are quantified by an asterisk (the auxiliary function becomes a no-op in that case).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, but does it need to be a matching group? For the URLPattern use-case it’s backward compatibility with existing patterns, but do existing patterns use the match for anything?

Another way this can be explained simply is how it’s already been used and how it works in regex. It’s “zero or more of the pattern”. If there’s no pattern, it can assume it is zero or more of anything, so .*.

There is no other way to do this kind of matching in path-to-regexp without producing a capture group AFAICT. I took this as a signal that developers liked to have access to what was matched in general. And it does seems likely there will be at least some cases where developers will want to know what matched. So it does seem to have some utility to me and devs who don't care can ignore that capture group output.

Sorry I didn't realize you had these concerns with the feature before we launched it on the urlpattern side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no other way to do this kind of matching in path-to-regexp without producing a capture group AFAICT

We can always introduce a new "token" when parsing, don't need to shove it into the same pattern.

And it does seems likely there will be at least some cases where developers will want to know what matched

Definitely not arguing against that, probably depends on the framework though. E.g. in express.js you'd just do req.url instead.

Sorry I didn't realize you had these concerns with the feature

I think I missed that it was going to act like a group and be modifiable with ?. All the other behavior seems totally reasonable, but making it modifiable makes it a little confusing. E.g. why *? and not ** or *+? Why can't I just use + or ? standalone instead.

Just a standalone * seems reasonable without overloading it with other behavior, if people really want to make the prefix optional they can just {/*}? or use the existing features.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants