Type aliases in Hare currently have two separate uses:
1. To give a name to an existing type for clarity, for example, naming
a function type to avoid writing out the signature every time. In
this case, no distinct type should be created.
2. To create a new distinct type with the same storage as the underlying
type. This is used for void aliases, structs, enums, etc. These types
carry a semantic meaning or an invariant (e.g. that an `io::file`
must be a valid file descriptor on Unix-like systems).
Hare conflates those two cases, which can lead to unintuitive behaviour.
For example, consider two snippets:
	// 1
	let x: (io::file | void) = 2;

	// 2
	type function_a = fn(i64) f64;
	type function_b = fn(f64) f64;
	let y: (*function_a | *function_b) = &math::sinf64;
As of now, the first snippet compiles while the second one fails to
typecheck.
Arguably the first one is not correct, since it allows an `io::file` to
be constructed accidentally, without an explicit cast that communicates
this and acknowledges its invariants. This can be seen in the wild when
a `size` implicitly converts to `io::underread`, constructing an error
type when the programmer didn't expect it.
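For comparison, a minimal sketch of the explicit form, assuming
`io::file` is an alias of `int` as it is on Unix-like systems. The value
2 is simply reused from the first snippet, and this has not been checked
against a specific harec version:

```hare
use io;

export fn main() void = {
	// The cast spells out that an io::file is being constructed from
	// a raw integer, so the reader can see where the invariant
	// (a valid file descriptor) is being asserted.
	let x: (io::file | void) = 2: io::file;
};
```

With stricter assignability this would be the only accepted spelling,
and the uncast version from the first snippet would be rejected.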
In the second case the intention was just to give a name to the function
signature and avoid repeating it every time. The contract here is the
function type, not the alias. A common workaround to avoid the cast has
been to push the pointer into the alias, i.e. to alias the pointer type
itself. This is still not ideal because making the pointer nullable
would require a new type alias and a pointer cast somewhere.
The behaviour of `*alias` is different depending on whether it's used
in a tagged union or as a parameter type. For example, passing a
comparison function into sort::sort doesn't need a cast despite the
function type being aliased.
An obvious solution would be to switch to using a single model for type
aliases instead of two. This is easier said than done.
Always using type synonyms will break tagged unions of void aliases
since all synonyms of void will have the same type hash, and will weaken
safety of types like `io::file` by making `int` always assignable to it.
Always using new distinct types works a lot better: `io::file` and
similar types have stronger type safety since they require a cast to
construct, and tagged unions still work as expected. This isn't to say
this approach is flawless: the function pointer use-case is completely
broken and requires a cast, which adds one more point of failure for
APIs like sort::sort.
So the best solution would be probably to have a syntax to have both
type synonyms and distinct types. I'm not entirely happy with this idea
since it's bound to confuse people.
Other languages, like Rust and Haskell, which have different syntax for
type synonyms and distinct types, do not have anonymous structures or
tagged unions like Hare does: they have to be declared using special
syntax (`data TypeName` in Haskell, `struct`/`enum` in Rust), so no
confusion can happen there.
In Hare it would be possible to declare a type synonym for an anonymous
struct, and it would even behave somewhat correctly. But when two
equivalent structs (declared as synonyms) end up in the same tagged
union, their type hashes will collide; the same thing will happen with
void synonyms. This would be a pretty frustrating mistake to deal with
if it does happen to someone.
One interesting observation is that type synonyms behave a lot like
`def`-globals, and distinct types behave a lot like `let`-globals.
Values of synonyms and `def` are just pasted into the AST, but distinct
types and `let` create new objects.
This introduces an opportunity to use `def` and `let` for type
definitions:
	type let file = int;
	type def f64_to_f64_function = fn(f64) f64;
This even conveniently parallels `typedef` from C, by complete accident.
If mutability overhaul replaces `def` with `const`, this syntax will
still work but might take some getting used to.
A more conventional alternative syntax would be using some kind of
`@alias` or `@synonym` annotation for types and making the default
distinct:
	type file = int;
	type @alias f64_to_f64_function = fn(f64) f64;
	type @synonym f64_to_f64_function = fn(f64) f64;
Let me know if you have other ideas.
On Fri Sep 6, 2024 at 5:33 AM EDT, Alexey Yerin wrote:
> The behaviour of `*alias` is different depending on whether it's used
> in a tagged union or as a parameter type. For example, passing a
> comparison function into sort::sort doesn't need a cast despite the
> function type being aliased.
This has nothing to do with function parameters; assignability semantics
are the same everywhere. The reason for the differing behavior is type
hints. When a unary & expression is provided a pointer type hint, and
the operand's dealiased type is the same as the dealiased secondary type
of the pointer hint, the result of the expression is the type hint. This
mechanism currently doesn't work with tagged union type hints.
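To illustrate, a small sketch of the case where the hint does apply,
assuming the rule works as described above. `unary_fn` and `apply` are
hypothetical names made up for this example, and it has not been checked
against a specific harec version:

```hare
use math;

// Hypothetical alias for a function type, as in the original message.
type unary_fn = fn(f64) f64;

// Hypothetical helper that takes a pointer to the aliased type.
fn apply(f: *unary_fn, x: f64) f64 = f(x);

export fn main() void = {
	// The parameter type provides the pointer type hint *unary_fn.
	// The dealiased type of math::sinf64 matches the dealiased
	// secondary type of the hint, so &math::sinf64 takes the hint's
	// type and no cast is written.
	apply(&math::sinf64, 0.0);

	// An explicitly typed binding provides the same kind of hint.
	let f: *unary_fn = &math::sinf64;
	f(0.0);
};
```

In the tagged union snippet from the original message the hint is
`(*function_a | *function_b)` rather than a pointer type, which is where
the mechanism stops applying.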
So I don't think there's any duality in the semantics of type aliases
themselves, though different type aliases may serve different purposes.
> So the best solution would be probably to have a syntax to have both
> type synonyms and distinct types. I'm not entirely happy with this idea
> since it's bound to confuse people.
Honestly I don't think we need to introduce any mechanism for synonyms.
Having all aliases be distinct is fine. I think all the problems with
aliases are still solvable.
> This even conveniently parallels `typedef` from C, by complete accident.
> If mutability overhaul replaces `def` with `const`, this syntax will
> still work but might take some getting used to.
At least for now, mutability overhaul will most likely leave the `def`
keyword unchanged. I prefer `const` and advocated for that change, but
it's somewhat controversial, and it's really not worth the energy to
bikeshed rn lol
> Let me know if you have other ideas.
I don't think we need to make any large-scale changes to type aliases.
IMO the biggest issue with them is that their assignability semantics
are too lax. They should be made stricter, as you've been looking into.
This also ties in with tagged union assignability.
We may also want to add a "safe cast" operator, so we don't force users
to do potentially unsafe casts more often. The fact that we only have
one cast operator, and that operator is unsafe, has led to lots of
footguns, particularly when working with pointers. Maybe a safe cast
operator could use assignability rules rather than castability rules, or
something like that.
The type hint semantics with unary & are also kind of unintuitive, so we
might want to do something about that as well.