Use macros from `limits.h` to prevent signed integer wrap-around warnings #13083

MisterDA · 2024-04-08T14:22:21Z

The code is currently correct since we use wrap-around semantics for signed integers (-fwrapv), but:

- it's difficult to communicate that fact to static analyzers, which warn when the minimum integer is computed by left-shifting 1 to the sign bit position (most-significant bit);
- MSVC doesn't support wrap-around semantics, but historically hasn't optimized on that assumption (so no harm), and might innocuously warn.

Using constants from <limits.h> instead allows for self-documenting code and silences these warnings.

Computing the minimum signed integer

From the standard (which I recall doesn't consider wrap-around semantics for signed integers):

> The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. […] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

The problem is that, without wrap-around, the result of 1 << CHAR_BIT * sizeof(int) - 1, used to compute the minimum int, can't be represented in the result type (for a 64-bit type it's 2^63, but the maximum is 2^63-1).
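A minimal sketch of the difference, assuming a two's-complement target without padding bits. The helper names are hypothetical and not taken from the runtime. Shifting an unsigned 1 into the top bit is well defined, whereas the same shift on a signed 1 is not, so the signed minimum is better taken from `<limits.h>`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the sign-bit pattern computed with an unsigned
   shift, which is well defined. Writing `1L << ...` here instead would
   shift a signed 1 into the sign bit, which the standard leaves undefined. */
static unsigned long sign_bit_pattern(void)
{
  return 1UL << (CHAR_BIT * sizeof(long) - 1);
}

/* Hypothetical helper: the minimum long, taken directly from <limits.h>
   with no shift at all. */
static long min_long_from_limits(void)
{
  return LONG_MIN;
}
```

Both helpers describe the same bit pattern, but only the second yields a signed value without relying on `-fwrapv`.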

Introduce the INTNAT_MIN macro to avoid independent re-definitions of this value.

Is a change entry needed?
This also prevents warnings raised under Windows by clang-cl and improves code quality with MSVC.

(I might have confused undefined behavior with unspecified behavior, oh well)

runtime/bigarray.c

dustanddreams · 2024-04-08T14:44:38Z

runtime/caml/config.h

@@ -140,16 +140,19 @@ typedef unsigned char uint8_t;
 typedef long intnat;
 typedef unsigned long uintnat;
 #define ARCH_INTNAT_PRINTF_FORMAT "l"
+#define INTNAT_MIN LONG_MIN


Have you tried moving these defines to runtime/caml/misc.h? runtime/caml/config.h doesn't include <limits.h> but these new macros depend on it, so it would make more sense to define them in a place where <limits.h> is included.

Hmmm, it's true that config.h is missing limits.h, but adding the following to misc.h also seems like wasted duplication.

#if SIZEOF_PTR == SIZEOF_LONG
/* Standard models: ILP32 or I32LP64 */
#define INTNAT_MIN LONG_MIN
#elif SIZEOF_PTR == SIZEOF_INT
/* Hypothetical IP32L64 model */
#define INTNAT_MIN INT_MIN
#elif SIZEOF_PTR == 8
/* Win64 model: IL32P64 */
#define INTNAT_MIN INT64_MIN
#endif

config.h could include limits.h instead: we've switched to C11, and most of the compatibility code around C99 integer types seems to have been added for old MSVC.

The preprocessor logic duplication is unfortunate but probably acceptable (with a comment saying it must match what's in config.h) if adding <limits.h> to config.h is considered too large a change.

I think a Changes entry will be required if config.h now includes <limits.h>.

I'm opting to add limits.h to config.h. I think a follow-up PR could switch all the macros and defines of config.h entirely to C99 fixed-width integers.

NickBarnes · 2024-04-17T13:24:49Z

I'll review this.

NickBarnes

This is all good, a clear improvement.

MisterDA · 2024-04-24T13:51:31Z

> This is all good, a clear improvement.

Thanks, I've rebased on trunk and added you as a reviewer.

xavierleroy · 2024-04-28T16:46:24Z

What about using INTPTR_MIN, INTPTR_MAX and UINTPTR_MAX unconditionally? OCaml's value type is, morally, intptr_t, even though it is not defined as such for historical reasons.

NickBarnes · 2024-04-29T10:49:46Z

> What about using INTPTR_MIN, INTPTR_MAX and UINTPTR_MAX unconditionally? OCaml's value type is, morally, intptr_t, even though it is not defined as such for historical reasons.

This makes sense to me, and could remove the test for SIZEOF_PTR == SIZEOF_LONG etc in config.h. It does need <stdint.h>, but although I see that we use HAS_STDINT_H in config.h, I suspect that parts of the runtime wouldn't compile at all if <stdint.h> were not available.

While we're on the subject, it's surprising to me that we don't seem to have, or use, CAML_INT_MAX and CAML_INT_MIN (or similar names). Maybe this PR would be a reasonable time to introduce them?

xavierleroy · 2024-04-29T11:16:52Z

> although I see that we use HAS_STDINT_H in config.h, I suspect that parts of the runtime wouldn't compile at all if <stdint.h> were not available.

Right. <stdint.h> is standard since C99, and OCaml 5 requires C11, so we should use <stdint.h> unconditionally and remove the configure test for it.

MisterDA · 2024-04-29T17:33:59Z

> What about using INTPTR_MIN, INTPTR_MAX and UINTPTR_MAX unconditionally? OCaml's value type is, morally, intptr_t, even though it is not defined as such for historical reasons.

> While we're on the subject, it's surprising to me that we don't seem to have, or use, CAML_INT_MAX and CAML_INT_MIN (or similar names).

Two good suggestions. I've changed the definitions to use the {u,}intptr_t limits, and namespaced the macros with the CAML_ prefix. It's technically a breaking change to move from UINTNAT_MAX to CAML_UINTNAT_MAX, but opam grep UINTNAT_MAX doesn't return anything.
Would you rather use INTPTR_MIN directly and not have CAML_INTNAT_MIN?
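As a rough sketch of what defining the limits in terms of `<stdint.h>` could look like: the `CAML_` macro names come from this discussion, but the typedefs and the `negation_overflows` helper below are assumptions made for illustration, not the runtime's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch only: intnat/uintnat are modelled as
   intptr_t/uintptr_t. The real runtime defines them in config.h. */
typedef intptr_t intnat;
typedef uintptr_t uintnat;

#define CAML_INTNAT_MIN INTPTR_MIN
#define CAML_INTNAT_MAX INTPTR_MAX
#define CAML_UINTNAT_MAX UINTPTR_MAX

/* Hypothetical helper: -x overflows only when x is the minimum value,
   so the check can be written without any shift or wrap-around. */
static int negation_overflows(intnat x)
{
  return x == CAML_INTNAT_MIN;
}
```

With the limits named this way, no `SIZEOF_PTR` comparison is needed to pick between `LONG_MIN`, `INT_MIN` and `INT64_MIN`.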

NickBarnes · 2024-05-02T10:35:41Z

What I meant about CAML_INT_MAX and CAML_INT_MIN was the max and min values of OCaml's int type. I find these are in fact currently defined in mlvalues.h as Max_long and Min_long (which I think are confusing names!).

MisterDA · 2024-05-13T18:18:14Z

I've rebased this PR.

> What I meant about CAML_INT_MAX and CAML_INT_MIN was the max and min values of OCaml's int type. I find these are in fact currently defined in mlvalues.h as Max_long and Min_long (which I think are confusing names!).

I've introduced CAML_LONG_{MAX,MIN} macros replacing {Max,Min}_long. I think that LONG instead of INT is more consistent with the current naming. Should I retain the former names for compatibility? Are we convinced that this is a good idea?

NickBarnes · 2024-05-14T08:53:02Z

On reflection we shouldn't change Max_long or Min_long in this PR, and I regret suggesting it.
Those names have been fixed for decades and there may be a lot of code out there using them (opam grep immediately finds base_bigstring for instance). If we did change them, or offer new alternatives, IMO it should be CAML_INT_MAX and CAML_INT_MIN, because they are the maximum and minimum values of the OCaml type int.
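To illustrate why the limits of OCaml's int differ from those of intnat, here is a small sketch. It assumes one bit of the machine word is reserved as a tag, so the range is half that of `intptr_t`. The `SKETCH_` names and the helper are hypothetical and are not proposed names for the runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one bit is used as a tag, leaving one bit less than
   intptr_t for the integer itself (63 bits on a 64-bit target). */
#define SKETCH_OCAML_INT_MAX (INTPTR_MAX >> 1)
#define SKETCH_OCAML_INT_MIN (-(INTPTR_MAX >> 1) - 1)

/* Hypothetical helper: does a native integer fit in the narrower range? */
static int fits_in_ocaml_int(intptr_t n)
{
  return n >= SKETCH_OCAML_INT_MIN && n <= SKETCH_OCAML_INT_MAX;
}
```

Both bounds are derived from `INTPTR_MAX` by a right shift of a nonnegative value, so no signed left shift is involved.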

MisterDA · 2024-05-14T10:07:35Z

> On reflection we shouldn't change Max_long or Min_long in this PR, and I regret suggesting it. Those names have been fixed for decades and there may be a lot of code out there using them (opam grep immediately finds base_bigstring for instance).

My thoughts also, I'll remove that commit.

> If we did change them, or offer new alternatives, IMO it should be CAML_INT_MAX and CAML_INT_MIN, because they are the maximum and minimum values of the OCaml type int.

But on 64-bit architectures, only Val_long maps to a 63-bit integer, right? Not Val_int, which is cast to (int).

MisterDA changed the title ~~Limits.h min int~~ Use macros from limits.h to prevent signed integer wrap-around warnings Apr 8, 2024

dustanddreams reviewed Apr 8, 2024

MisterDA force-pushed the limits.h-min-int branch 3 times, most recently from 68289f9 to d36498c Compare April 11, 2024 18:53

dra27 assigned NickBarnes Apr 17, 2024

NickBarnes approved these changes Apr 24, 2024

MisterDA force-pushed the limits.h-min-int branch from d36498c to 3b61291 Compare April 24, 2024 13:51

dustanddreams approved these changes Apr 24, 2024

NickBarnes mentioned this pull request Apr 29, 2024

We depend on <stdint.h> absolutely, not conditionally. #13134

Merged

MisterDA force-pushed the limits.h-min-int branch from 3b61291 to 1b08350 Compare April 29, 2024 17:20

MisterDA force-pushed the limits.h-min-int branch from 1b08350 to 3ba9f9b Compare May 13, 2024 18:16

MisterDA force-pushed the limits.h-min-int branch from 3ba9f9b to b350290 Compare May 14, 2024 10:10

MisterDA added 3 commits May 28, 2024 12:45

Use macros from limits.h for bounds when checking for overflow

ebbff10

Introduce the macro INTNAT_MIN.

Use macros from limits.h when serializing ints and nativeints

f8260d7

This fixes the warning from MSVC raised on -0x80000000.

> warning C4146: unary minus operator applied to unsigned type, result
> still unsigned

The other replacements are made for consistency and, hopefully, legibility.
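A small sketch of why `-0x80000000` draws that warning, assuming a platform where `int` is 32 bits wide. The helper names are hypothetical. The hexadecimal constant does not fit in `int`, so it has type `unsigned int`, and negating it yields another unsigned value rather than the minimum 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* With a 32-bit int, 0x80000000 has type unsigned int, so the unary
   minus is applied to an unsigned operand and the result stays positive. */
static int negated_hex_is_positive(void)
{
  return -0x80000000 > 0;
}

/* The named limit is signed and says what is meant. */
static int32_t min_int32(void)
{
  return INT32_MIN;
}
```

Using `INT32_MIN` avoids the unsigned detour entirely, which is presumably why the named limit also silences the warning.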

Add CAML_{U,}INTNAT_{MIN,MAX} macros exposing {u,}intnat limits

2b71514

MisterDA force-pushed the limits.h-min-int branch from b350290 to 2b71514 Compare May 28, 2024 10:45
