Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array.shuffle: explicitly validate the result of [rand]. #13168

Merged
merged 6 commits into from
May 17, 2024

Conversation

fpottier
Copy link
Contributor

This yields more readable code (the previous code used [get] instead of [unsafe_get] in a way that was quite cryptic), a more precise runtime check (in the previous code, a violation of [rand]'s contract could go unnoticed), and a more understandable exception (namely, [Invalid_argument "Array.shuffle"] instead of
[Invalid_argument "array index out of bounds"]).

This yields more readable code (the previous code used [get] instead
of [unsafe_get] in a way that was quite cryptic), a more precise
runtime check (in the previous code, a violation of [rand]'s contract
could go unnoticed), and a more understandable exception (namely,
[Invalid_argument "Array.shuffle"] instead of
[Invalid_argument "array index out of bounds"]).
stdlib/array.ml Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort
let shuffle ~rand a = (* Fisher-Yates *)
for i = length a - 1 downto 1 do
let j = rand (i + 1) in
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this version instead?
if j < 0 || j > i then invalid_arg "Array.shuffle";

stdlib/array.ml Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort
let shuffle ~rand a = (* Fisher-Yates *)
for i = length a - 1 downto 1 do
let j = rand (i + 1) in
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle";
Copy link
Member

@gasche gasche May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would want a better error message than this, for users to know what is going on. For example "Array.shuffle: rand returned an integer out of range". Or one could even be more clear:

Printf.ksprintf invalid_arg
  "Array.shuffle: `rand` returned an integer %d \
   out of the expected range [0; %d]"
  j i

stdlib/array.ml Outdated
@@ -442,8 +442,9 @@ let fast_sort = stable_sort
let shuffle ~rand a = (* Fisher-Yates *)
for i = length a - 1 downto 1 do
let j = rand (i + 1) in
if not (0 <= j && j <= i) then invalid_arg "Array.shuffle";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, but perhaps we could try to slowly move the Stdlib towards less cryptic error messages that require less trips to printf debugging.

Here are gradual improvements.

invalid_arg "Array.shuffle: rand"
invalid_arg "Array.shuffle: rand returned an unexpected integer"
invalid_arg ("Array.shuffle: rand " ^ (string_of_int (i + 1)) ^ " returned " ^
  (string_of_int j) ^ ", out of expected [0;" ^ (string_of_int i) "] range")

Personally I would really like to have the last one which has all the information one would ask for in a debug session. It is equivalent to the format string:

"Array.shuffle: rand %d returned %d, out of expected [0;%d] range"

but I suspect we don't want (or can't) to get a Printf dependency on Array. If that's actually ok defining invalid_argf as Printf.ksprintf invalid_arg is always nice for formatted raises.

Having a separate function like err_rand for formatting the message or the raise allows to keep the function clean from the nice but long morasses of text (and allows error message reuse of course).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gasche and @dbuenzli for your suggestions. I have improved the error message (without depending on Printf) and have placed it in a separate function.

Copy link
Member

@Octachron Octachron May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using Printf or ^, I would propose to use String.concat:

    String.concat "" [
      "Array.shuffle: rand("; string_of_int (i + 1); ")"; " returned "; string_of_int j;
      ", outside of the expected range [0; "; string_of_int i; "]"
   ]

mostly because a punctuation like ; feels less intrusive than ^.
(hm, that's one case where I would have used [: ... :] if available)

@gasche gasche self-assigned this May 15, 2024
stdlib/array.ml Outdated
@@ -439,10 +439,18 @@ let stable_sort cmp a =

let fast_sort = stable_sort

let shuffle_contract_violation i j =
let msg =
"Array.shuffle: rand(" ^ string_of_int (i + 1) ^ ")" ^
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not an idiomatic OCaml function call which is going to look weird when you get the error. I mean in man of my system rand(3) is actually a thing :-) If you feel parens are needed then I rather have (rand %d) and or use ocamldoc syntax [rand %d] but then we should establish a pattern here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet another way would be to use the compiler's convention for quoting code which I believe is straight quotes nowadays.

stdlib/array.ml Outdated
let msg =
"Array.shuffle: rand(" ^ string_of_int (i + 1) ^ ")" ^
" returned " ^ string_of_int j ^
", outside of the expected range [0; " ^ string_of_int i ^ "]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that using the more brief out of expected range makes it slightly more probable that the error message fits on a line (assuming 80 columns) and is thus easier to read.

@hannesm
Copy link
Member

hannesm commented May 16, 2024

I have a question about how defensive should the OCaml stdlib be with respect to existing functions in other modules? So, where should checking end, where should it begin? Should a "String.make" be validated to be of the right size (and contain the right data)? I'm just curious where to start and where to stop.

@gasche
Copy link
Member

gasche commented May 16, 2024

Here the rand function is a parameter provided by the user, not the call to another function in the standard library, and we need to check in any case that the result is a valid index in the array.

@hannesm
Copy link
Member

hannesm commented May 16, 2024

@gasche ah, thanks. I didn't read the code carefully enough.

Use straight quotes as delimiters of call, as in 'rand 3',
instead of parentheses as delimiters of argument, as in rand(3).

Shorten the message by writing "out of expected range".
Also, abbreviate [string_of_int], for readability.
@fpottier
Copy link
Contributor Author

Thanks all. I have taken your comments into account.

Copy link
Contributor

@dbuenzli dbuenzli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! That's what should have been done in the first place (by me).

@gasche
Copy link
Member

gasche commented May 17, 2024

@fpottier could you add a Changes entry?

@Octachron
Copy link
Member

The dependencies also needs to be updated with make alldepend to account for the Array -> String dependency.

@fpottier
Copy link
Contributor Author

OK, done. I have updated just stdlib/.depend.

make alldepend also caused changes in .depend, but they seem unrelated to my changes.

@gasche gasche merged commit 66b7bfd into ocaml:trunk May 17, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants