Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

output errors do not lead to a non-zero exit code #305

Open
yahesh opened this issue May 14, 2024 · 9 comments
Open

output errors do not lead to a non-zero exit code #305

yahesh opened this issue May 14, 2024 · 9 comments

Comments

@yahesh
Copy link

yahesh commented May 14, 2024

When providing trurl with a URL-encoded component that it is unhappy with (e.g. using URL-encoded characters smaller than %20) then the tool will not decode the component as requested but output a "trurl note", swallowing all output and only returning a single line break while the exit code is still 0. This is even the case with the --verify option set.

This could severely break pipelines that use trurl on untrusted URLs. Currently, tools using trurl cannot distinguish between successful parsing and URL-decoding errors based on the exit codes.

# this is okay
% trurl "?" --get "{host}" --verify ; echo "Exitcode: $?"
trurl error: No host part in the URL [?]
trurl error: Try trurl -h for help
Exitcode: 9
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl ".#fragment%00" --get "{fragment}" --verify ; echo "Exitcode: $?" 
trurl note: URL decode error, most likely because of rubbish in the input (fragment)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "./path%00" --get "{path}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (path)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl ".?query%00" --get "{query}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (query)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "user%00:password%00@." --get "{user}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (user)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "user%00:password%00@." --get "{password}" --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (password)


Exitcode: 0
@bagder
Copy link
Member

bagder commented May 14, 2024

It returns 0 when it can parse the URL, which it can in your examples. You might want --no-guess-scheme if you want it to not guess the scheme, as that makes it rather lenient.

@yahesh
Copy link
Author

yahesh commented May 15, 2024

It returns 0 when it can parse the URL, which it can in your examples. You might want --no-guess-scheme if you want it to not guess the scheme, as that makes it rather lenient.

I'm not sure we're talking about the same thing here, because even with --no-guess-scheme set the tool will swallow the output due to the decoding error but still provide an exit code of 0.

# this is okay
% trurl "https://?" --get "{host}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl error: No host part in the URL [https://?]
trurl error: Try trurl -h for help
Exitcode: 9
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://.#fragment%00" --get "{fragment}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (fragment)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://./path%00" --get "{path}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (path)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://.?query%00" --get "{query}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (query)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://user%00:password%00@." --get "{user}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (user)


Exitcode: 0
# THIS SHOULD RETURN A NON-ZERO EXIT CODE
% trurl "https://user%00:password%00@." --get "{password}" --no-guess-scheme --verify ; echo "Exitcode: $?"
trurl note: URL decode error, most likely because of rubbish in the input (password)


Exitcode: 0

@bagder
Copy link
Member

bagder commented May 15, 2024

Yes, because it returns 0 when it can parse the URL and it only adds a "note" about the problem to output what was asked.

We should probably consider making the output problems return non-zero as well.

bagder added a commit that referenced this issue May 15, 2024
Most typically when %00 or something crazy is used in the component and
it is extracted URL decoded. Makes trurl return ERROR_GET (10) now.

Added two tests to verify.

Reported-by: yahesh on github
Fixes #305
@bagder
Copy link
Member

bagder commented May 15, 2024

It struck me we don't return error for those because you might pass in many URLs, and if you get a problem with --get on just one of them how is trurl supposed to behave then? Maybe we need another option... ?

For example , this is a legit command line:

$ trurl curl.se curl.se/?one%00 curl.se/?two -g{query}

@bagder bagder changed the title URL-decode errors do not lead to a non-zero exit code output errors do not lead to a non-zero exit code May 15, 2024
@yahesh
Copy link
Author

yahesh commented May 16, 2024

@bagder Curl has support for a --fail option. --fail in trurl could be described as being more strict than --verify.

@bagder
Copy link
Member

bagder commented May 16, 2024

It's not about "more strict" though. --verify is for being strict about (verifying) the input, this discussion is about using parts of the otherwise fine input when creating output. Basically problems with --get.

The new option would probably be documented as "return error if failing to produce output" or something. I think --fail is an odd name for that. How about --strict-get or --werror (the latter being "treat warnings as errors" as that's the name for the corresponding gcc option). Or something else.

@yahesh
Copy link
Author

yahesh commented May 16, 2024

At the end, the decision is up to you, but I don't understand why you say that --fail is an odd name for the option when the word is even part of how you would document it:

"return error if failing to produce output"

bagder added a commit that referenced this issue May 16, 2024
The strict prefix makes trurl immediately exit with an error code if
such a get can't be done due to URL decoding problems. By default, such
a problem will only make trurl skip that part and silently continue.

Use it like this: --get "{strict:query}"

I also made "url:" a valid prefix, and now we consider the single colon
prefix to be a shortcut for url:. The concept of prefixes scale better
when we use real words instead of single characters.

Ref: #305
@bagder
Copy link
Member

bagder commented May 16, 2024

I created #310, which instead adds a prefix for the get component to ask for "strict" or regular.

@bagder
Copy link
Member

bagder commented May 17, 2024

@yahesh that would make your command lines above return errors in the way you wanted them to, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants