The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

gasche · 2024-05-07T15:31:03Z

(This PR sits on top of #13152 and #13153; for now it is best reviewed by looking only at the latest commit.)

#13152 pessimizes Total matches in transitively mutable position. It gives two examples of code that now generates worse code:

(* case (A) *)
let f : bool option ref -> ... = function
| { contents = None } -> ...
| { contents = Some true } -> ...
| _ when guard () -> ...
| { contents = Some false } (* here *) -> ...

type _ gadt = Int : int t | Bool : bool t

(* case (B) *)
let g : int gadt ref -> ... = function
| { contents = Int } (* here *) -> ...

In the case A, the generated code first checks something about the argument root.0, then it checks a guard, then it goes back to the argument root.0. (We say that there are two "splits", resulting in three submatrices, that are checked in sequence.) When it goes back to root.0, its value could have changed in the meantime, so assuming that root.0.0 could only be false is wrong, and we really needed to make the code worse.

In the case B, the generated code reads the mutable field root.0, then immediately checks all possible cases on this constructor. In this case there cannot have been any mutation "in the meantime", this is the first time ever that we are checking this value. So generating a Match_failure case here was not necessary, and we are generating worse code than we could.

In general it is hard to know for sure "this is the first time we do something on this value". Tracking this precisely might be done with more context information, but it would be a tricky change with quickly diminishing returns. There is however a case where it is very easy to see that we are in this situation: where we have not generated any "split", any logic of the form "check this about the first value, and if that fails check that".

The present PR introduces a new piece of static context information used by the pattern-matching compiler to track this criterion. In code:

type temporality =
  | First
  | Following
(** The [temporality] information tracks information about the
    placement of the current submatrix within the
    whole pattern-matching.

    - [First]: this is the first submatrix on this position seen by values
      that flow into the submatrix.
    - [Following]: there was a split, some other submatrix was tried first
      and failed, and the control jumped to the current submatrix.

    This information is used in {!compute_arg_partial}.
*)

When deciding whether a switch on a given position should be pessimized, we now use a refined criterion: we are in a transitively mutable position and the current temporality is Following, we have already accessed this position before.

In particular, the example g above now produces good code again, as well as any reasonable usage of GADTs in mutable positions that I can think of.

It is, of course, possible to build examples where the heuristic is too coarse-grained and which will generate worse code than they could. For example:

type _ gadt = Int : int t | Bool : bool t

let test : int gadt ref -> unit = function
  | _ when Random.bool () -> ()
  | { contents = Int } -> ()

The [check_partial] heuristics are a coarse-grained approach degrade matching totality information in certain cases -- when clauses contain both reads of mutable fields and either guards or forcing of lazy patterns. It is not quite correct (it is not enough to prevent ocaml#7241), and is not sufficient conceptually for OCaml Multicore. In a Multicore world, other domains may race to update values concurrently, so we cannot assume that only a fixed set of matching constructions may result in side-effects. This heuristic is subsumed by the recent changes to: - make context information accurate by binding mutable fields eagerly - make totality information accurate by tracking the mutability of the current position and can now be retired.

…DT+mutable combinations

gasche mentioned this pull request May 7, 2024

Pattern matching with mutable and lazy patterns is unsound #7241

Open

gasche force-pushed the matching-bug-temporality-heuristic branch 3 times, most recently from 1ebb45c to 4dbadd4 Compare May 10, 2024 13:03

gasche added 3 commits May 13, 2024 17:30

matching: fix the 'totality' aspect of The Pattern-Matching Bug

828e27e

matching: introduce a 'temporality' heuristic to de-pessimize some GA…

a3020d5

…DT+mutable combinations

gasche force-pushed the matching-bug-temporality-heuristic branch from 4dbadd4 to a3020d5 Compare May 13, 2024 15:31

gasche mentioned this pull request May 13, 2024

The Pattern-Matching Bug: propagate mutability of argument positions #13138

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

gasche commented May 7, 2024 •

edited

The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

Are you sure you want to change the base?

The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

Conversation

gasche commented May 7, 2024 • edited

gasche commented May 7, 2024 •

edited