Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrectly matches [1-5] against [1-5] #71

Open
papb opened this issue Sep 26, 2020 · 8 comments
Open

Incorrectly matches [1-5] against [1-5] #71

papb opened this issue Sep 26, 2020 · 8 comments
Labels
bug Something isn't working

Comments

@papb
Copy link

papb commented Sep 26, 2020

  • Code: picomatch('[1-5]')('[1-5]')
  • Expected result: false
  • Actual result: true

Apparently this mistake is caused explicitly:

let match = input === glob;

Why? This should not match. Please remove this check.

As a workaround, I am using picomatch.makeRe(pattern).test(str) instead of picomatch(pattern)(str).

@jonschlinkert
Copy link
Member

jonschlinkert commented Sep 26, 2020

Did you try literalBrackets? See the options on the readme.

picomatch('[1-5]', { literalBrackets: true })('[1-5]');

I haven't tried it, let me know if this doesn't work.

Why? This should not match. Please remove this check.

Have you run the unit tests with that check disabled? How many tests fail?

Also, why? Why wouldn't that match? It makes no sense to say that it should fail if the glob exactly matches the input string.

@papb
Copy link
Author

papb commented Sep 27, 2020

Hi @jonschlinkert, thanks for the fast reply!

Did you try literalBrackets?

This is not about literalBrackets, I want the brackets in the glob pattern to behave as the special characters they are.

Have you run the unit tests with that check disabled? How many tests fail?

No. I will do that.

Also, why? Why wouldn't that match? It makes no sense to say that it should fail if the glob exactly matches the input string.

It shouldn't match because the glob pattern [1-5] means any string consisting of one number from 1 to 5. The only strings that should match are '1', '2', '3', '4' and '5'. The string '[1-5]' is not one of them, so it shouldn't match.

Consider a scenario in which I am looking for files whose names are a single digit. I'd use the pattern [0-9]. Unfortunately, a malicious user would be able to create a file called [0-9] on purpose and it would match.

@jonschlinkert
Copy link
Member

This is not about literalBrackets

Did you try setting literalBrackets to false? If that doesn't work we'll mark this as a bug.

I want the brackets in the glob pattern to behave as the special characters they are.

Brackets are legal path characters. We need to handle literal matching as well.

It shouldn't match because the glob pattern [1-5] means any string consisting of one number from 1 to 5. The only strings that should match are '1', '2', '3', '4' and '5'. The string '[1-5]' is not one of them, so it shouldn't match.

Thanks, I was asking about your use case, not how regular expression work.

Unfortunately, a malicious user would be able to create a file called [0-9] on purpose and it would match.

Malicious? Lol k.

Consider a scenario in which I am looking for files whose names are a single digit. I'd use the pattern [0-9].

You could do this:

const isMatch = pattern => {
  const matcher = picomatch(pattern);
  return input => pattern !== input && matcher(input);
};

@papb
Copy link
Author

papb commented Sep 27, 2020

Interesting, after changing the code to the way I thought it should be, I get 15 failing tests:

  1) from the Bash 4.3 spec/unit tests
       should use escaped characters as literals

  2) from the Bash 4.3 spec/unit tests
       should match escaped quotes

  3) extglobs (bash)
       "a\(b" should match "a\(b"

  4) extglobs (bash)
       "a\z" should match "a\z"

  5) extglobs (bash)
       "a\z" should match "a\z"

  6) extglobs (bash)
       "a\z" should match "a\z"

  7) extglobs (bash)
       "a\z" should not match "a\z"

  8) extglobs (minimatch)
       "a\z" should match "a\z"

  9) extglobs (minimatch)
       "a\z" should match "a\z"

  10) extglobs (minimatch)
       "a\z" should not match "a\z"

  11) negation patterns - "!"
       should not negate when inside quoted strings

  12) options
       options.noextglob
         should match literal parens when noextglob is true (issue #116)

  13) options
       options.noextglob
         should not match extglobs when noextglob is true

  14) options
       options.unescape
         should remove backslashes in glob patterns:

  15) slash handling - windows
       should not match literal backslashes with literal forward slashes when windows is disabled

Looks like that check was being useful to work around a few edge cases...

@papb
Copy link
Author

papb commented Sep 27, 2020

Did you try setting literalBrackets to false? If that doesn't work we'll mark this as a bug.

Hmm, I just tried picomatch('[1-5]', { literalBrackets: false })('[1-5]') and it still gives true unfortunately...

Brackets are legal path characters. We need to handle literal matching as well.

Of course, but only if the user requested so :) By the way now I have a tangential documentation question: it says that the default value for literalBrackets is undefined, how does that differ from false?

Thanks, I was asking about your use case, not how regular expression work.

Sorry 😅

@papb
Copy link
Author

papb commented Sep 27, 2020

I have just asked a related question on Unix SE

@jonschlinkert
Copy link
Member

and it still gives true unfortunately...

Ok, thanks for letting me know. Marking as a bug.

I have just asked a related question on Unix SE

Since brackets in bash have special significance that only makes sense in the terminal, picomatch is able to provide more regex matching features you won't find in bash.

Sorry 😅

no need. thanks for helping me uncover a bug.

@jonschlinkert jonschlinkert added the bug Something isn't working label Sep 27, 2020
papb added a commit to papb/picomatch that referenced this issue Sep 27, 2020
@papb
Copy link
Author

papb commented Sep 27, 2020

By the way, for the record, after my modification attempt, one of the assertions that failed is the following:

assert(isMatch('\\*', '\\*'));

However, interestingly, by trying on a real bash terminal, the real behavior also disagrees with this assertion!

$ mkdir temp && cd temp
$ touch \*
$ touch \\\*
$ ls
'*'  '\*'
$ ls \*
'*'

The last command shows that the \* glob in fact didn't match '\*', only '*'. Looks like the test is incorrect? 😮

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants