Exact Match by non capturing group but allow wildcard for subsegments #223

Open
janus-reith opened this issue May 1, 2020 · 8 comments

@janus-reith

I'm trying to rewrite non-localized paths (e.g. /page1) to localized ones (/de/page1), and I'm looking for help getting the following to work:

  • "/de" or "/en" should NOT be matched

  • "/de/anything" or "/en/anything" should NOT be matched

  • "/test" or "/test/anything" should be matched

  • "/deno" or "/end" should be matched

With my current implementation which is based on:

"/:test((?!login)[^/]+)",

I can meet the first 3 of the 4 requirements, but I can't prevent the "de" exclusion from also catching "deno".
It seems that instead of allowing any characters other than "/" after the lang segment, I need a solution that allows either an exact match or a wildcard that must be preceded by another "/" (so that both /de and /de/anything are covered).

Segmenting these values any further is not a requirement, as I take req.url as a whole and only prepend the locale.

This sandbox should probably make it easy to get what I mean:
https://codesandbox.io/s/path-to-regexp-demo-c9efk?file=/src/index.js
(Result is in the console)

I got a bit stuck here, and any help is appreciated, thank you! :)
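
For reference, the four requirements above can be checked with plain JavaScript regular expressions. This is only a sketch: it uses RegExp directly rather than path-to-regexp, the names `current` and `bounded` are made up for illustration, and `current` is an assumed adaptation of the pattern quoted in the issue to the "de"/"en" locales. The second pattern adds the boundary described above, so a locale is excluded only when it is followed by "/" or the end of the path.

```javascript
// Assumed adaptation of the pattern from the issue: a prefix-only lookahead.
const current = /^\/((?!de|en)[^/]+)/;

// Sketch of the fix: the locale must be a whole segment, i.e. followed
// by "/" or by the end of the path, before the path is excluded.
const bounded = /^\/((?!(?:de|en)(?:\/|$))[^/]+)(?:\/.*)?$/;

const cases = [
  "/de", "/en",                   // should NOT be matched
  "/de/anything", "/en/anything", // should NOT be matched
  "/test", "/test/anything",      // should be matched
  "/deno", "/end",                // should be matched
];

for (const path of cases) {
  console.log(path.padEnd(16), "current:", current.test(path), "bounded:", bounded.test(path));
}
```

With the prefix-only lookahead, "/deno" and "/end" are rejected along with the real locales; the bounded version accepts them while still rejecting "/de" and "/de/anything".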

@blakeembrey
Member

This is pretty complex and a little outside of how path-to-regexp is intended to be used. My suggestion would be to split this into separate routes and have them ordered. Do you know what framework you're using this with? Depending on where, you could just match /:lang first, and if it didn't match your expected language pass it to child routes.

@janus-reith
Author

@blakeembrey Thanks for your help.
I'm using Next.js, which allows an ordered list of matches and passes those that don't match on to the next one. (The functionality is described here: vercel/next.js#9081)

Regarding separate routes: yes, I also had that idea, but I thought that having one expression to match all cases would be cleaner, and I'm just too unskilled with regexes to get this right.
But I'm happy to put this into separate routes if that is easier.

So, with separate routes, would I have

  1. one that does exact matches like /abc or /test or /deno, but does not match /de or /en
  2. one that does exact matches but allows for a wildcard once a "/" has occurred, so it matches /abc/[...] and /test/[...], but not /de[...] and /en[...]?

The general background: I'm creating an i18n example project. I have the names of all routes that should NOT match, because I can import the array of used locales at the point where I define my rewrites, so I can have something like en|de. However, I don't have access to all currently existing paths like /test or /abc at that point, and I want to avoid the duplication of defining all of them there manually.

@blakeembrey
Member

I can't quickly translate it into next.js, but normally I'd do something like this:

// `end: false` lets you match everything under these paths, e.g. `/en` and `/en/test`,
// but not `/end` (because it's not a separate "segment").
const firstRoute = pathToRegexp('/:lang(en|de|...)', { end: false })
const anyOtherRoute = '/...'

@blakeembrey
Member

The key is typically to design a route that matches on what you're trying to do, then build something for the fallback routes. This makes it a lot easier to maintain than trying to do negative route matches.
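
A minimal sketch of that "positive match first, then fallback" idea, written with plain RegExp so it does not depend on any particular path-to-regexp version or on Next.js. The locale list, the `localized` pattern, and the `resolve` helper are all hypothetical names for illustration; the `(?=/|$)` boundary is an assumed stand-in for the segment behaviour described for `end: false`, not the regex the library generates.

```javascript
// Hypothetical locale list; in a real project it would come from the i18n config.
const locales = ["en", "de"];

// First route: the first segment is exactly a known locale.
// "/en" and "/en/test" match; "/end" does not, because "en" is not a whole segment.
const localized = new RegExp(`^/(${locales.join("|")})(?=/|$)`);

// Hypothetical helper: try the positive route first, otherwise fall back
// to prepending a default locale.
function resolve(path, defaultLocale = "de") {
  if (localized.test(path)) return path; // already localized, leave untouched
  return `/${defaultLocale}${path === "/" ? "" : path}`;
}

console.log(resolve("/en/test")); // already localized
console.log(resolve("/page1"));   // falls through to the fallback
console.log(resolve("/deno"));    // not a locale segment, so also a fallback
```

Because the fallback only runs when the positive route fails, no negative lookahead is needed at all.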

@Vadorequest

@blakeembrey I'm running into the exact same issue.

Here is how Next.js allows us to customise this behaviour (RFC specs)

See the rewrites section of vercel/next.js#9081.

Here is how we actually use it, inside /next.config.js:

const rewrites = [
  {
    // XXX Doesn't work locally (maybe because of rewrites), but works online
    source: '/',
    destination: '/api/autoRedirectToLocalisedPage',
  },
  {
    source: `/:locale((?!${allowedLocales.join('|')})[^/]+)(.*)`,
    destination: '/api/autoRedirectToLocalisedPage',
  },
];

Full /next.config.js code source:
https://github.com/UnlyEd/next-right-now/pull/42/files/3cde65a68dc2bb90dcb777f19ffb036bb967c607#diff-5d0c276360a637d1b787a57760665fbeR34-R48

How would you advise tackling this issue? I'm really not familiar with path-to-regexp (or regexes in general). Thank you!

@nicholaschiang

Just a note that if this were possible (it isn't supported by path-to-regexp yet), I think it would solve some issues (e.g. /end would redirect to /en/end properly):

> source = '/((?!en(\\/|$)|fr(\\/|$))[^/]+))(.*)'
'/((?!en(\\/|$)|fr(\\/|$))[^/]+))(.*)'
> options
{ strict: true, sensitive: false, delimiter: '/' }
> regexp = pathToRegexp(source, [], options);
Uncaught TypeError: Capturing groups are not allowed at 7
    at lexer (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:74:31)
    at parse (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:97:18)
    at stringToRegexp (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:329:27)
    at pathToRegexp (/home/nchiang/repos/covid-tutoring/node_modules/path-to-regexp/dist/index.js:403:12)

Note that I added a capture group at the end of each locale (e.g. en(\\/|$)), which ensures that we only treat it as a locale if it is either:

  1. At the end of the URL (denoted by the $).
  2. Directly followed by a slash (denoted by the \\/).

@mxck

mxck commented May 5, 2020

@nicholaschiang we can split the parenthesized group into separate alternatives, e.g. (?!en($|/)) becomes (?!en/|en$).

Working example:

const languagesMask = ['zh-cn', 'en', 'es', 'fr', 'it', 'ru', 'th']
  .map((lang) => [`${lang}/`, `${lang}$`])
  .flat()
  .join('|');

const source = `/((?!${languagesMask})[^/]+)/:path*`
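
To see what the flattened mask does, here is a rough check with plain RegExp. It is a sketch under assumptions: `firstSegment` and `shouldRewrite` are hypothetical names, and the trailing `(?:/.*)?$` only approximates what `/:path*` matches in path-to-regexp. The lookahead itself is built exactly as in the example above.

```javascript
const languages = ["zh-cn", "en", "es", "fr", "it", "ru", "th"];

// Same flattened mask as above: "zh-cn/|zh-cn$|en/|en$|..."
const languagesMask = languages
  .flatMap((lang) => [`${lang}/`, `${lang}$`])
  .join("|");

// Plain-RegExp stand-in for the source pattern (assumption: the optional
// "/..." tail only approximates `/:path*`).
const firstSegment = new RegExp(`^/((?!${languagesMask})[^/]+)(?:/.*)?$`);

// Hypothetical helper: true when the path has no locale prefix yet.
const shouldRewrite = (path) => firstSegment.test(path);

for (const path of ["/en", "/en/about", "/end", "/zh-cn", "/thing", "/items/1"]) {
  console.log(path.padEnd(12), shouldRewrite(path));
}
```

Each locale contributes two alternatives, one ending in "/" and one ending in "$", so no nested group is needed inside the lookahead.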

@TZB-Loong

const languagesMask = ['zh-cn', 'en', 'es', 'fr', 'it', 'ru', 'th']
  .map((lang) => [`${lang}/`, `${lang}$`])
  .flat()
  .join('|');

const source = `/((?!${languagesMask})[^/]+)/:path*`
