-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Array.shuffle: explicitly validate the result of [rand]. #13168
Conversation
This yields more readable code (the previous code used [get] instead of [unsafe_get] in a way that was quite cryptic), a more precise runtime check (in the previous code, a violation of [rand]'s contract could go unnoticed), and a more understandable exception (namely, [Invalid_argument "Array.shuffle"] instead of [Invalid_argument "array index out of bounds"]).
stdlib/array.ml
Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort | |||
let shuffle ~rand a = (* Fisher-Yates *) | |||
for i = length a - 1 downto 1 do | |||
let j = rand (i + 1) in | |||
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this version instead?
if j < 0 || j > i then invalid_arg "Array.shuffle";
stdlib/array.ml
Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort | |||
let shuffle ~rand a = (* Fisher-Yates *) | |||
for i = length a - 1 downto 1 do | |||
let j = rand (i + 1) in | |||
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would want a better error message than this, for users to know what is going on. For example "Array.shuffle:
rand returned an integer out of range"
. Or one could even be more clear:
Printf.ksprintf invalid_arg
"Array.shuffle: `rand` returned an integer %d \
out of the expected range [0; %d]"
j i
stdlib/array.ml
Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort | |||
let shuffle ~rand a = (* Fisher-Yates *) | |||
for i = length a - 1 downto 1 do | |||
let j = rand (i + 1) in | |||
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but perhaps we could try to slowly move the Stdlib towards less cryptic error messages that require less trips to printf debugging.
Here are gradual improvements.
invalid_arg "Array.shuffle: rand"
invalid_arg "Array.shuffle: rand returned an unexpected integer"
invalid_arg ("Array.shuffle: rand " ^ (string_of_int (i + 1)) ^ " returned " ^
(string_of_int j) ^ ", out of expected [0;" ^ (string_of_int i) "] range")
Personally I would really like to have the last one which has all the information one would ask for in a debug session. It is equivalent to the format string:
"Array.shuffle: rand %d returned %d, out of expected [0;%d] range"
but I suspect we don't want (or can't) to get a Printf
dependency on Array
. If that's actually ok defining invalid_argf
as Printf.ksprintf invalid_arg
is always nice for formatted raises.
Having a separate function like err_rand
for formatting the message or the raise allows to keep the function clean from the nice but long morasses of text (and allows error message reuse of course).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using Printf
or ^
, I would propose to use String.concat
:
String.concat "" [
"Array.shuffle: rand("; string_of_int (i + 1); ")"; " returned "; string_of_int j;
", outside of the expected range [0; "; string_of_int i; "]"
]
mostly because a punctuation like ;
feels less intrusive than ^
.
(hm, that's one case where I would have used [: ... :]
if available)
… violates its contract.
stdlib/array.ml
Outdated
@@ -439,10 +439,18 @@ let stable_sort cmp a = | |||
|
|||
let fast_sort = stable_sort | |||
|
|||
let shuffle_contract_violation i j = | |||
let msg = | |||
"Array.shuffle: rand(" ^ string_of_int (i + 1) ^ ")" ^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not an idiomatic OCaml function call which is going to look weird when you get the error. I mean in man
of my system rand(3)
is actually a thing :-) If you feel parens are needed then I rather have (rand %d)
and or use ocamldoc syntax [rand %d]
but then we should establish a pattern here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yet another way would be to use the compiler's convention for quoting code which I believe is straight quotes nowadays.
stdlib/array.ml
Outdated
let msg = | ||
"Array.shuffle: rand(" ^ string_of_int (i + 1) ^ ")" ^ | ||
" returned " ^ string_of_int j ^ | ||
", outside of the expected range [0; " ^ string_of_int i ^ "]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that using the more brief out of expected range
makes it slightly more probable that the error message fits on a line (assuming 80 columns) and is thus easier to read.
I have a question about how defensive should the OCaml stdlib be with respect to existing functions in other modules? So, where should checking end, where should it begin? Should a "String.make" be validated to be of the right size (and contain the right data)? I'm just curious where to start and where to stop. |
Here the |
@gasche ah, thanks. I didn't read the code carefully enough. |
Use straight quotes as delimiters of call, as in 'rand 3', instead of parentheses as delimiters of argument, as in rand(3). Shorten the message by writing "out of expected range".
Also, abbreviate [string_of_int], for readability.
Thanks all. I have taken your comments into account. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! That's what should have been done in the first place (by me).
@fpottier could you add a Changes entry? |
The dependencies also needs to be updated with |
OK, done. I have updated just
|
This yields more readable code (the previous code used [get] instead of [unsafe_get] in a way that was quite cryptic), a more precise runtime check (in the previous code, a violation of [rand]'s contract could go unnoticed), and a more understandable exception (namely, [Invalid_argument "Array.shuffle"] instead of
[Invalid_argument "array index out of bounds"]).