Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: support user-defined variance (allow type parameters as type bounds) #67513

Open
1 of 4 tasks
Garciat opened this issue May 19, 2024 · 16 comments
Open
1 of 4 tasks
Labels
generics Issue is related to generics LanguageChange Proposal v2 A language change or incompatible library change
Milestone

Comments

@Garciat
Copy link

Garciat commented May 19, 2024

Go Programming Experience

Experienced

Other Languages Experience

Java, Haskell, Python, C++

Related Idea

  • Has this idea, or one like it, been proposed before?
  • Does this affect error handling?
  • Is this about generics?
  • Is this change backward compatible? Breaking the Go 1 compatibility guarantee is a large cost and requires a large benefit

Has this idea, or one like it, been proposed before?

Sort of: #47127

Also related: #58590

Does this affect error handling?

No.

Is this about generics?

It relates to type constraints/bounds.

Proposal

Allow type parameters as type bounds:

func cast[T U, U any](t T) U { return t }

Note that this not include things like:

  • [T ~U, U any]
  • [T U | int, U any]
  • [T interface{ U }, U any] -- maybe we do want to allow this? But syntactically, it opens the floodgates
  • [T interface{ U; V }, U any, V any]
  • or any other type expression that is not a single type parameter reference

Language Spec Changes

I don't know exactly how, but some clause in https://go.dev/ref/spec#TypeConstraint would need to be updated.

If [T interface{ U }, U any] is allowed, then https://go.dev/ref/spec#General_interfaces would need to be updated as well.

Informal Change

No response

Is this change backward compatible?

Yes.

Orthogonality: How does this change interact or overlap with existing features?

No response

Would this change make Go easier or harder to learn, and why?

No response

Cost Description

No response

Changes to Go ToolChain

No response

Performance Costs

No response

Prototype

I prototyped the change in Garciat@c69063f.

Type-checking works as expected:

func cycle[A B, B C, C D, D A]() {} // new error: invalid recursive type parameter constraint

func term[T interface{ U }, U any]() {} // error: term cannot be a type parameter

func cast[T U, U any](t T) U { return t } // OK

type X struct{}
func (_ X) M() {}

func main() {
  cast[string, string]("hi")           // error: string does not satisfy string (string is not an interface)
  cast[X, interface{}](X{})            // OK
  cast[X, interface{ M() }](X{})       // OK
  cast[X, interface { M(); N() }](X{}) // error: X does not satisfy interface{M(); N()} (missing method N)
}

Compilation also seems to be doing the right thing; the code runs as expected.

However, getting full type parameter embedding in interfaces ([T interface{ U }, U any]) seems to be a much more involved code change.

@Garciat Garciat added LanguageChange Proposal v2 A language change or incompatible library change labels May 19, 2024
@gopherbot gopherbot added this to the Proposal milestone May 19, 2024
@seankhliao seankhliao added the generics Issue is related to generics label May 19, 2024
@pPanda-beta
Copy link

A small note on performance, I believe the current approach of casting (convert to any first and then cast) has runtime impacts, instead if compiler already knows the interface satisfaction by a type, it can cast it with no runtime cost.

e.g. #58590 (comment)

@ianlancetaylor
Copy link
Contributor

You said what but you didn't say why. What kind of code would this permit us to write that we can't write today? Thanks.

@Garciat
Copy link
Author

Garciat commented May 20, 2024

[U any, T U] lets us express the subtype relationship T <: U between two type variables.
It works when T is a concrete type and U is an interface; and T implements U.
But it also works when both T and U are interfaces; and T subsumes U.

In terms of concrete code, so far I have only come up with the following use case.
It is a sort of workaround (inspired by this Haskell module) for methods not being allowed to add new type parameters or constraints.

// Type witness that T has some relationship to U that allows it to be casted T->U
type TyRel[T any, U any] interface {
  Apply(T) U
  ApplyAll([]T) []U
}

// Type equality relationship; T=T by reflexivity
type TyEq[T any] struct{}
func (_ TyEq[T]) Apply(x T) T { return x }
func (_ TyEq[T]) ApplyAll(xs []T) []T { return xs } // little optimization for some cases
func Refl[T any]() TyRel[T, T] { return TyEq[T]{} }

// T <: U relationship; verified by type bounds
type Implements[T U, U any] struct{} // NEW
func (_ Implements[T, U]) Apply(x T) U { return x }
func (_ Implements[T, U]) ApplyAll(xs []T) []U { .. Apply to each element .. } // not much use, really
func Impl[T U, U any]() TyRel[T, U] { return Implements[T, U]{} }

type Slice[T any] []T

// TyRel[T, string] demands from the caller a witness of the proof that T=string
func (s Slice[T]) JoinStrings(rel TyRel[T, string], sep string) string {
  return strings.Join(rel.ApplyAll(s), sep)
}

// TyRel[T, fmt.Stringer] demands from the caller a witness of the proof that T <: fmt.Stringer
func (s Slice[T]) JoinStringers(rel TyRel[T, fmt.Stringer], sep string) string {
  ss := make(Slice[string], len(s))
  for i, x := range s {
    ss[i] = rel.Apply(x).String()
  }
  return ss.JoinStrings(Refl(), sep) // missing type inference on return type for Refl() here
}

type X struct{}
func (_ X) String() string { return "hi" }

func hello() {
	xs := Slice[X]{{},{},{}}
	// in this context, we know T=X and X<:fmt.Stringer; so we encode the latter as a TyRel value
	// for it to be used inside of `JoinStringers` which has no constraints on `T`
	// so TyRel serves as a sort of 'runtime constraint', although it is statically determined
	xs.JoinStringers(Impl(), ", ") // missing type inference on return type for Impl() here
}

@atdiar
Copy link

atdiar commented May 20, 2024

@Garciat I am not sure I understand. Can you work me through it?

As I understand [U any, T U], it would mean that U can be instantiated by any type and that T is constrained by U.
Meaning that, at instantiation, T would be U.

Pre instantiation, T constrained by U, itself constrained by any, means that we can't say much about T apart from it being a type.

If U was constrained by fmt.Stringer instead, we would simply be able to tell that T is a fmt.Stringer as well.
And that T == U.

Unless you are thinking about partial instantiations where T needs to be instantiated once U is. (let's say the instantiated value for U is an interface).
But the generic code can't depend on the instantiation so I don't know what benefit we would get.

@Garciat
Copy link
Author

Garciat commented May 20, 2024

@atdiar:

As I understand [U any, T U], it would mean that U can be instantiated by any type and that T is constrained by U.
Meaning that, at instantiation, T would be U.

At instantiation, T would be some type such that T <: U (T is a subtype of U; or U is a supertype of T) i.e. T is 'castable' to U. So, not necessarily T=U. Although that's also a possibility iff U is an interface type, because non-interface types cannot be supertypes of other types.

Pre instantiation, T constrained by U, itself constrained by any, means that we can't say much about T apart from it being a type.

The generic code knows that a value of type T can be casted to the type U. That is the 'operation' gained from knowing T <: U. When T is a concrete type, it means that T implements U. And when they're both interface types, it means that T subsumes U.

If U was constrained by fmt.Stringer instead, we would simply be able to tell that T is a fmt.Stringer as well.
And that T == U.

In that case, we would know T <: U <: fmt.Stringer.
For example: T=interface{ M(); N(); String() string } and U=interface{ M(); String() string} would satisfy that relation.

@atdiar
Copy link

atdiar commented May 20, 2024

Thanks :)
That's interesting. The described behavior seems to correspond to the partial instantiation I was alluding to. It would be quite difficult in generic code to know how a type parameter is constrained only after it has been instantiated. Perhaps useful but not before we have some kind of type parameter switch. (instantiation depends on generic code, but generic code does not depend on instantiation currently)

Below for the long-winded thought:

A bit worried that it would make instantiation quite complex because of the order things could be instantiated in. Or even dependent on how type parameters are listed (trivial in simple cases, less so with composite types and inference).

But if we step back some more, it's not even clear to me that the only way we can define "constraining a type parameter by another type parameter" is to equate it to being constrained by an instantiation of that type parameter.

After all, we could also decide that, since at instantiation the type parameter becomes a single type (interface or not), T must be that singleton.

If we have [U any, T U] and we instantiate U with fmt.Stringer, either we decide that:

  1. T should be fmt.Stringer

Or we decide that:

  1. the constraint satisfaction is somewhat lazy, in which case we have some form of partial instantiation.
    T is constrained by fmt.Stringer so we can pass another type to further instantiate,

But this partial instantiation doesn't buy us much because in the generic body, we know nothing about how U is instantiated. So we can't rely on the fact that U was instantiated with a fmt.Stringer or anything else.
At best we would know that T is in U's typeset but we don't know which typeset it is before instantiation.
In fact [U fmt.Stringer, T U] can be replaced by [U fmt.Stringer, T fmt.Stringer] for basic interfaces.
If U is instantiated by a concrete type, T should be U.

Note that since interface implementation and satisfaction are different, it doesn't always work.
[U comparable, T U] can see U instantiated by any i.e. interface{} but then T would be constrained by any???

For union interfaces, we currently have no interface cases so this is still not useful. But if we did, we wouldn't get much information since we would be hard pressed to know which union element is the effective constraint to be used in generic code for T. (assuming the compiler checks for properly discriminated cases but that's not the hard part).

So let's try the first option, where T is equal to fmt.Stringer: now we have a way to create type parameter relations that is just a case similar to what is done for composite types and pointers currently. It's still not too useful a case.
If T== U, then we can simply use U.

Note:
Implementation is useful to define type assertion and value assignment.
So I guess the correct term would be assertable instead of castable?

@Garciat
Copy link
Author

Garciat commented May 20, 2024

@atdiar:

In fact [U fmt.Stringer, T U] can be replaced by [U fmt.Stringer, T fmt.Stringer] for basic interfaces.

Not sure what you mean by basic interfaces, but those two parameter lists do not (should not?) mean the same thing.

[U fmt.Stringer, T U] means T <: U <: fmt.Stringer. i.e. U is assignable to fmt.Stringer; and T is assignable to U and also fmt.Stringer (by transitivity).

[U fmt.Stringer, T fmt.Stringer] means U <: fmt.Stringer & T <: fmt.Stringer. Both T and U are assignable to fmt.Stringer. But there is no relation between T and U.

If U is instantiated by a concrete type, T should be U.

Correct. (Although my prototype currently does not allow this; it's harder to implement. U must be an interface type.)

As for the rest, I think I don't follow.

It sounds to me like you're bringing up the two different contexts that interact with this 'feature': the call/instantiation site (where T U is enforced), and the generic function itself (where T U may be 'used'). But I don't see them as 'interacting' with each other; nor should they. The instantiation site will concern itself with checking that, whatever U is, T <: U holds. And the generic function's body will concern itself with making sure that values of type T are used in a type-safe manner, with the additional knowledge that T is assignable to U, whatever U happens to be.

Either way, you could try cloning and running the prototype. That case of [U fmt.Stringer, T U] compiles and does what I would expect it to do (uphold T <: U <: fmt.Stringer). I think playing around with it could help exemplify those relations. (Ignoring any bugs I may have introduced and any cases that I haven't handled.)

@atdiar
Copy link

atdiar commented May 20, 2024

Just to respond quickly, you're right in a sense but this is a bit more subtle. The idea is that [U fmt.Stringer, T U] would be equivalent to [U fmt. Stringer, T fmt.Stringer] in generic code because the generic code has no real way to rely on information that is provided at instantiation. (the constraints are different! You're right. Just speaking about the code itself here).

Here we know that any type argument passed to U will satisfy fmt.Stringer.
So we know that any type argument passed to T constrained by U will satisfy fmt.Stringer. That's the only constraint that we can use before instantiation.

But if U was passed a subtype of fmt.Stringer, then T would be more constrained than merely fmt.Stringer.

That's what happens with concrete types for instance, we get equality T == U.

Now in generic code, the only information we know about T is that it must satisfy fmt.Stringer at the very least. It may be constrained some more but we don't have access to these other constraints before instantiation. So what is the point?
Perhaps avoiding a wrapper function?
I'd just use the wrapper function with the additional constraints explicitly available at compile time.
The only thing this feature would do is create instantiation time constraints which would simply make some code fail at instantiation more.

Edit: a basic interface is an interface defined as a set of methods that types must implement. (per spec).
https://go.dev/ref/spec#Interface_types

@Garciat
Copy link
Author

Garciat commented May 20, 2024

If I understand correctly, you're pointing out that with [U fmt.Stringer, T U], the method set of T is the same as U's, which is that of fmt.Stringer, which is just { String() string }.

That's true. Inside the generic function, may not call any other methods on T. In terms of the method set, nothing is gained.
But the fact that T <: U is known also grants us assignability from T to U.
That's the relation that is currently not possible to represent in Go generics.
And that's what I'm after here.

It is exemplified by

func cast[U any, T U](t T) { return t }

type X struct {}
func (_ X) M() {}
func (_ X) N() {}
func (_ X) String() string { return "..." }

cast[any, string]("hi")
cast[interface{ M() }, X](X{}).M()
cast[interface{ N()) }, interface{ M(); N() }](X{}).N()

// having access to casting as a function to pass to higher-order functions could be useful:

func MapSlice[T, U any](s []T, f func(T) U) []U { ... }

MapSlice([]X{{}, {}, {}}, cast[fmt.Stringer]) // []fmt.Stringer

Now, I'm not asserting that the semantic gain here is monumental; but it does enable a few things.
Having used generics in other languages with subtyping (e.g. C#), I was surprised when I tried to write [U any, T U] and it didn't work. It seemed obvious to me.

When/if Go allows type parameters on methods, being able to represent T <: U between two type parameters will be a lot more important.

Consider:

type Slice[T any] []T

// given:

func (s Slice[T]) ReplaceAll(f func(T) T) {
  for i, x := range s {
    s[i] = f(x)
  }
}

// we could make that more generic:

func (s Slice[T]) ReplaceAll[T any, R T](f func(T) R) {
  for i, x := range s {
    s[i] = f(x) // we allow `f` to return a more specific type R, as long as it is assignable to T
  }
}

@atdiar
Copy link

atdiar commented May 20, 2024

Yes it grants us assignability.
Now, where would we want to use it?
The code example with parametrizable methods is unlikely to land.
But maybe there can be other examples of where this would be useful?

(Also type constructors are invariant in Go so not sure that it would be as used/useful as let's say C#, but still curious, maybe that can be discussed off of the issue tracker. )

@Garciat
Copy link
Author

Garciat commented May 21, 2024

Thanks for the conversation, though. With that last example I realized that this feature would enable use-site variance, similar to Java. (Whether generic methods are allowed or not.)

@atdiar
Copy link

atdiar commented May 21, 2024

No problem. Thanks too. I hadn't seen that. It's definitely interesting to me.
Still not sure how useful that'd be and hence if that will ever be implemented, but that's interesting nonetheless.

@Garciat
Copy link
Author

Garciat commented May 22, 2024

Now that the motivation for this feature request has (in my mind) pivoted towards use-site variance:

I think that with the advent of extensively-generic packages like iter (#61898), use-site variance could add a degree of flexibility to higher-order functions. For example:

// original:
func Filter[V any](f func(V) bool, seq Seq[V]) Seq[V] {
	return func(yield func(V) bool) bool {
		for v := range seq {
			if f(v) && !yield(v) {
				return false
			}
		}
		return true
	}
}

// with use-site variance, we are able to express that predicates are contravariant
func Filter[V P, P any](f func(P) bool, seq Seq[V]) Seq[V] {
	return func(yield func(V) bool) bool {
		for v := range seq {
			if f(v) && !yield(v) { // Note [1]
				return false
			}
		}
		return true
	}
}

Then it would be possible to filter a sequence of concrete type A by a predicate f that accepts some interface type X that A implements. For example:

type Expr interface { ... }

type CallExpr struct { ... } // implements Expr

func IsConstantFoldable(e Expr) bool { ... }

func FindCallExprs(e Expr) iter.Seq[CallExpr] { ... }

func example(tree Expr) {
  // the predicate acts on `Expr`, a supertype of `CallExpr`
  for constCall := iter.Filter(IsConstantFoldable, FindCallExprs(tree)) {
    // ...
  }
}

Note [1]: the cast from V to P is non-trivial: depending on the kind of the instantiated types, the compiler has to emit code that either a) creates an iface value (allocation + itab lookup), or b) calls runtime.conv*/runtime.assertE2I/etc. This optimization proposal becomes a lot more relevant: #51133.


The language design question then becomes: do we want use-site variance or declaration-site variance?

For declaration-site variance, we could imagine in and out qualifiers (a la C#) to declare contravariance and covariance, respectively:

type Predicate[in T any] func(T) bool

// usage:
func Filter[V any](f Predicate[V], seq Seq[V]) Seq[V] {
	return func(yield func(V) bool) bool {
		for v := range seq {
			if f(v) && !yield(v) {
				return false
			}
		}
		return true
	}
}

// doesn't need to be a generic context:
func FilterCalls(f Predicate[CallExpr], calls []CallExpr) []CallExpr { ... }

// we could pass a predicate of type `Predicate[Expr]`, and the compiler should accept it

Then, variance is no longer checked at instantiation time; and it is instead checked during function argument assignment.

However, when lowering the code, both type-to-interface (static itabs) and interface-to-interface (assertE2I) conversions require compiler support. In this particular case, when casting Predicate[Expr] to Predicate[CallExpr], the compiler could introduce a synthetic wrapper function that performs the cast from CallExpr to Expr. But in a general case, this seems intractable (to me). Perhaps declaration-site variance is best suited for OO languages where upcasts are no-ops in terms of the object pointer.

(And that is probably the reason why Go function types do not support variance.)

@atdiar
Copy link

atdiar commented May 23, 2024

How would one use these higher-order functions?

The benefit is still not very clear to me.
If this is simply about predicates, that does not bring much, no?
Also these predicates are easily modifiable by returning them from higher-order generic functions if we want to use interface type argument?

func PredicateConv[V someinterface] (predicate1 func(someinterface) bool) func(V) bool{
    return func(v V) bool{
        return predicate1(v)
    } 
}

?

We would need some stronger and more concrete examples I think.

Thought about how it could help once we had variadic type parameters (allowing us to generalize functions and wrap the current type constructors) but I actually prefer to avoid having to think about variance, especially in user-code.
Too easy of a footgun.

@ianlancetaylor
Copy link
Contributor

This is an interesting idea. It needs more exploration to fully understand it. It also needs more compelling examples. Are there practical use cases where this would simplify code that people want to write? Many of the examples above seem quite abstract.

@Garciat
Copy link
Author

Garciat commented Jun 7, 2024

Thanks for looking into it @ianlancetaylor

In general, the benefit of declaration-site variance would be most appreciable in code using generic algorithms or data structures, together with interface types (subtyping). It's a common language feature intersection point (subtyping + generics) that a lot of languages have dealt with by adding user-defined variance (e.g. Java, Scala, C#, Rust, TypeScript, Python (mypy), etc.)

Granted, variance is a bit of a niche language feature, since the people most likely to use it are library authors implementing generic code. From a user's perspective, variance "just works" when it is properly defined in library code.

I'll have to think more about 'real life' use cases (other than the above iter package). I'll come back to this thread when/if I do come up with something.

@Garciat Garciat changed the title proposal: spec: allow type parameters as type bounds proposal: spec: support user-defined variance (allow type parameters as type bounds) Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
generics Issue is related to generics LanguageChange Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests

6 participants