Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Pattern-Matching Bug: introduce a temporality heuristic to de-pessimize certain programs #13154

Open
wants to merge 3 commits into
base: trunk
Choose a base branch
from

Conversation

gasche
Copy link
Member

@gasche gasche commented May 7, 2024

(This PR sits on top of #13152 and #13153; for now it is best reviewed by looking only at the latest commit.)

#13152 pessimizes Total matches in transitively mutable position. It gives two examples of code that now generates worse code:

(* case (A) *)
let f : bool option ref -> ... = function
| { contents = None } -> ...
| { contents = Some true } -> ...
| _ when guard () -> ...
| { contents = Some false } (* here *) -> ...

type _ gadt = Int : int t | Bool : bool t

(* case (B) *)
let g : int gadt ref -> ... = function
| { contents = Int } (* here *) -> ...

In the case A, the generated code first checks something about the argument root.0, then it checks a guard, then it goes back to the argument root.0. (We say that there are two "splits", resulting in three submatrices, that are checked in sequence.) When it goes back to root.0, its value could have changed in the meantime, so assuming that root.0.0 could only be false is wrong, and we really needed to make the code worse.

In the case B, the generated code reads the mutable field root.0, then immediately checks all possible cases on this constructor. In this case there cannot have been any mutation "in the meantime", this is the first time ever that we are checking this value. So generating a Match_failure case here was not necessary, and we are generating worse code than we could.

In general it is hard to know for sure "this is the first time we do something on this value". Tracking this precisely might be done with more context information, but it would be a tricky change with quickly diminishing returns. There is however a case where it is very easy to see that we are in this situation: where we have not generated any "split", any logic of the form "check this about the first value, and if that fails check that".

The present PR introduces a new piece of static context information used by the pattern-matching compiler to track this criterion. In code:

type temporality =
  | First
  | Following
(** The [temporality] information tracks information about the
    placement of the current submatrix within the
    whole pattern-matching.

    - [First]: this is the first submatrix on this position seen by values
      that flow into the submatrix.
    - [Following]: there was a split, some other submatrix was tried first
      and failed, and the control jumped to the current submatrix.

    This information is used in {!compute_arg_partial}.
*)

When deciding whether a switch on a given position should be pessimized, we now use a refined criterion: we are in a transitively mutable position and the current temporality is Following, we have already accessed this position before.

In particular, the example g above now produces good code again, as well as any reasonable usage of GADTs in mutable positions that I can think of.

It is, of course, possible to build examples where the heuristic is too coarse-grained and which will generate worse code than they could. For example:

type _ gadt = Int : int t | Bool : bool t

let test : int gadt ref -> unit = function
  | _ when Random.bool () -> ()
  | { contents = Int } -> ()

@gasche gasche force-pushed the matching-bug-temporality-heuristic branch 3 times, most recently from 1ebb45c to 4dbadd4 Compare May 10, 2024 13:03
The [check_partial] heuristics are a coarse-grained approach degrade
matching totality information in certain cases -- when clauses contain
both reads of mutable fields and either guards or forcing of lazy
patterns.

It is not quite correct (it is not enough to prevent ocaml#7241), and is
not sufficient conceptually for OCaml Multicore. In a Multicore world,
other domains may race to update values concurrently, so we cannot
assume that only a fixed set of matching constructions may result in
side-effects.

This heuristic is subsumed by the recent changes to:
  - make context information accurate by binding mutable fields eagerly
  - make totality information accurate by tracking the mutability of
    the current position
and can now be retired.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant