@pithlessly · github · λ

1ML for non-specialists: introduction

This is the first entry in what I intend as a multi-part series.

1ML is a type system designed by Andreas Rossberg and described in a collection of papers by him. When it comes to integrating an ML-style module system into a new language, I think 1ML currently provides the best overall story (although it’s not perfect).

Unfortunately, Rossberg’s papers are pretty technical. He is a good writer, but I feel that right now there is still a communication barrier between academics who are in a position to discuss 1ML in depth and people who are in a position to write new compilers. I see a lot more discussion online that merely references 1ML than I do discussion that seems to deeply understand it.

I’ve been studying 1ML for the last several months, and working on my own prototype implementation (which unfortunately isn’t yet in a publishable state). In an effort to help these ideas percolate down towards mainstream adoption, I want to help demystify some things that I found confusing.

This series will not substitute for reading the paper, but I hope it can be a useful companion. I’ll be working from the version of the paper published in the Journal of Functional Programming in 2018. It’ll be very helpful to be able to look at that paper as you read this.

Prerequisite knowledge

In writing this, I assume the reader:

Goals

By the end of this series of articles, the reader will have the tools they need to understand every single inference rule, including those of the type inference algorithm described in Section 7.

I also want to discuss some implementation concerns that are not mentioned in the paper (such as how to integrate type inference with levels).

Non-goals

A decent portion of the paper is dedicated to meta-theory (e.g. the proofs that the type system is sound, etc.). I have not read these sections in depth, and will not be discussing them here. It is reasonable for compiler authors to take these results as given and not focus on their details.

In some places, I will be criticizing the way the paper presents its ideas. Please do not take these as prescriptions for how it should have been written. I do not have the standing to make prescriptions like that; the paper was written with considerations other than the ones most important to me (such as brevity of the rules, ease of meta-theory, & an intended audience of other academics).

System Fω and the architecture of the type system

At a basic level, 1ML is really just sophisticated syntactic sugar for a version of the lambda calculus called System Fω. If you’re not familiar, this is basically the core type system of Haskell, including higher-kinded type constructors, with all polymorphism being explicit. 1ML terms (I’ll use term and expression interchangeably) are converted into Fω terms and 1ML types are converted into Fω types.

The specific structure of this translation is best saved for a later article. For now, I’ll say that whenever a type $T$ appears in the surface syntax, it is immediately converted to an Fω type written with the metavariable $\Xi$. After this translation, the rules completely forget about $T$ and don’t touch it again. This is why we see similar constructs written with two different syntaxes, like $(= \textbf{type}\ T)$ and $[= \tau]$.

This may seem a little unfamiliar; I was used to seeing type systems that don’t distinguish between the surface syntax for types and the internal representation of types. But of course, a routine like this exists in any real-world type checker; it’s just that many papers would consider it “out of scope” and not discuss it.
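To make the distinction concrete, here is a minimal sketch of what such a routine might look like. The datatype and function names (`SurfaceTyp`, `InternalTyp`, `elaborate`) are hypothetical, not from the paper or any 1ML implementation; the point is only that the checker converts surface types once, up front, and never looks at them again.

```rust
/// Types as written by the user in the surface syntax.
#[derive(Debug, Clone, PartialEq)]
enum SurfaceTyp {
    Bool,
    Arrow(Box<SurfaceTyp>, Box<SurfaceTyp>),
    Sig(Vec<(String, SurfaceTyp)>), // { l : T; ... }
}

/// Types as manipulated internally by the checker.
#[derive(Debug, Clone, PartialEq)]
enum InternalTyp {
    Bool,
    Arrow(Box<InternalTyp>, Box<InternalTyp>),
    Record(Vec<(String, InternalTyp)>),
}

/// Convert a surface type to an internal one. After this point the
/// rest of the checker works only with `InternalTyp`.
fn elaborate(t: &SurfaceTyp) -> InternalTyp {
    match t {
        SurfaceTyp::Bool => InternalTyp::Bool,
        SurfaceTyp::Arrow(a, b) => {
            InternalTyp::Arrow(Box::new(elaborate(a)), Box::new(elaborate(b)))
        }
        SurfaceTyp::Sig(fields) => InternalTyp::Record(
            fields.iter().map(|(l, t)| (l.clone(), elaborate(t))).collect(),
        ),
    }
}
```

In a real 1ML checker the target would of course be the semantic types discussed below, not this toy grammar, but the shape of the pipeline is the same.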

Technical note: order of record fields

Whenever a type system defines a class of objects $C$, it will often come with some invariants (“every $c \in C$ has property $P$”) and some equivalences (“any $c, c' \in C$ are equivalent if $Q$, even if they aren’t the same as written”). In situations like this, it is useful to pay attention to which properties and equivalences are being enforced through the grammar of $C$, and which are defined using the inference rules.

Unfortunately, when it comes to how Fω’s record field labels work, the paper isn’t so clear. Fig. 3, which defines the syntax of Fω, has only this to say:

$$\tau ::= \dots \mid \{\, \overline{l : \tau} \,\} \qquad e ::= \dots \mid \{\, \overline{l = e} \,\}$$

(See below for an explanation of the overline notation.)

However, Fig. 4, which provides typing rules, has this rule for typing field access:

$$\frac{\Gamma \vdash e : \{\, l : \tau,\ \overline{l' : \tau'} \,\}}{\Gamma \vdash e.l : \tau}$$

This rule only makes sense if we understand record fields to be intrinsically unordered. I think we can also reasonably assume that $\overline{l}$ is understood to contain no duplicates. That resolves our question; however, for reasons I’ll get into later, I suspect a type checker will want to keep around the original order of field labels in practice.
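One way to get both properties is to store fields in source order but compare them as an unordered collection. This is a sketch under my own assumptions (the representation and names are hypothetical, not from the paper):

```rust
#[derive(Debug, Clone, PartialEq)]
enum Typ {
    Bool,
    /// Fields are stored in the order the user wrote them...
    Record(Vec<(String, Typ)>),
}

/// ...but equivalence treats them as an unordered set of labeled
/// fields, by sorting both sides on the label before comparing.
fn equiv(a: &Typ, b: &Typ) -> bool {
    match (a, b) {
        (Typ::Bool, Typ::Bool) => true,
        (Typ::Record(fs), Typ::Record(gs)) => {
            if fs.len() != gs.len() {
                return false;
            }
            let mut fs = fs.clone();
            let mut gs = gs.clone();
            fs.sort_by(|(l, _), (m, _)| l.cmp(m));
            gs.sort_by(|(l, _), (m, _)| l.cmp(m));
            fs.iter()
                .zip(gs.iter())
                .all(|((l, t), (m, u))| l == m && equiv(t, u))
        }
        _ => false,
    }
}
```

The source order survives in the `Vec`, which is what you’d want later for, say, a stable runtime record layout, while the typing rules see only the unordered view.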

I want to linger on the type system of Fω for a bit longer. We can think of Fω type expressions $\tau$ as terms in a simply-typed lambda calculus whose base “type” is $\Omega$, the kind of types. This is the core of what gives Fω, and therefore 1ML, its power — type checking can invoke some form of computation at the type level. This power comes with a corresponding burden for the implementer, who must essentially build an interpreter for STLC into their type checker.

The key rule is this one:

$$\frac{\Gamma \vdash e : \tau' \qquad \tau' \equiv \tau \qquad \Gamma \vdash \tau : \Omega}{\Gamma \vdash e : \tau}$$

In words: “whenever an Fω term $e$ is considered to have a type $\tau'$, we could also consider it to have type $\tau$ if $\tau$ is another well-formed type which is equivalent to $\tau'$.”

The type equivalence rules are presented as two-directional equalities ($\equiv$), but in practice, a compiler will want to think of these as one-directional reductions. You can use a technique like normalization by evaluation for this. Bringing types all the way to normal form whenever you construct them is a viable approach, but it gets slow for very large types; there are standard techniques available for being lazier (this is not 1ML-specific).
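To make the “interpreter for STLC” idea concrete, here is a deliberately naive normalizer for type-level lambda terms, using de Bruijn indices and capture-avoiding substitution. This is my own illustrative sketch, not code from the paper; a production checker would use normalization by evaluation rather than repeated substitution, but the reduction being implemented is the same.

```rust
/// Type-level terms of a tiny STLC: variables, lambdas, applications,
/// and one base type of kind Omega.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Var(usize),            // de Bruijn index
    Lam(Box<Ty>),          // type-level function
    App(Box<Ty>, Box<Ty>), // type-level application
    Bool,                  // a base type
}

/// Shift free variables (index >= cutoff) up by `by`.
fn shift(t: &Ty, by: usize, cutoff: usize) -> Ty {
    match t {
        Ty::Var(i) if *i >= cutoff => Ty::Var(i + by),
        Ty::Var(i) => Ty::Var(*i),
        Ty::Lam(b) => Ty::Lam(Box::new(shift(b, by, cutoff + 1))),
        Ty::App(f, a) => Ty::App(Box::new(shift(f, by, cutoff)), Box::new(shift(a, by, cutoff))),
        Ty::Bool => Ty::Bool,
    }
}

/// Substitute `arg` for the variable bound at `depth`.
fn subst(t: &Ty, depth: usize, arg: &Ty) -> Ty {
    match t {
        Ty::Var(i) if *i == depth => shift(arg, depth, 0),
        Ty::Var(i) if *i > depth => Ty::Var(i - 1), // the binder is gone
        Ty::Var(i) => Ty::Var(*i),
        Ty::Lam(b) => Ty::Lam(Box::new(subst(b, depth + 1, arg))),
        Ty::App(f, a) => Ty::App(Box::new(subst(f, depth, arg)), Box::new(subst(a, depth, arg))),
        Ty::Bool => Ty::Bool,
    }
}

/// Bring a type all the way to beta-normal form.
fn normalize(t: &Ty) -> Ty {
    match t {
        Ty::App(f, a) => match normalize(f) {
            Ty::Lam(b) => normalize(&subst(&b, 0, &normalize(a))),
            f => Ty::App(Box::new(f), Box::new(normalize(a))),
        },
        Ty::Lam(b) => Ty::Lam(Box::new(normalize(b))),
        _ => t.clone(),
    }
}
```

With this in hand, the conversion rule above becomes: normalize both types and compare the normal forms for (alpha-)equality.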

Semantic types and modeling them in code

Figure 6 defines the grammar for semantic types. It’s important, so I’ll reproduce it here.

$$
\begin{align*}
\text{(abstracted)} \quad && \Xi \quad &::= \quad \exists \overline{\alpha}.\, \Sigma \\
\text{(large)} \quad && \Sigma \quad &::= \quad \pi \mid \text{bool} \mid [= \Xi] \mid \{\, \overline{l : \Sigma} \,\} \mid \forall \overline{\alpha}.\, \Sigma \to_\eta \Xi \\
\text{(small)} \quad && \sigma \quad &::= \quad \pi \mid \text{bool} \mid [= \sigma] \mid \{\, \overline{l : \sigma} \,\} \mid \sigma \to_{\texttt{I}} \sigma \\
\text{(paths)} \quad && \pi \quad &::= \quad \alpha \mid \pi\, \overline{\sigma} \\
\text{(purity)} \quad && \eta \quad &::= \quad \texttt{P} \mid \texttt{I}
\end{align*}
$$

Note: the meaning of overline notation, as in $\overline{l : \Sigma}$

This is meta-theoretic syntax that just means “zero or more” (sometimes including delimiting commas). For those familiar with Rust’s macro system, it is comparable to something like { $($l:label : $sigma:large_typ),* }.

Large and small semantic types

The first thing we should observe is that large and small types aren’t really defining a new grammar, but instead a subgrammar of Fω types (so we have $\{\sigma\} \subseteq \{\Sigma\} \subseteq \{\Xi\} \subseteq \{\tau\}$). This is why the metavariable $\tau$ doesn’t appear much in the rest of the paper from now on — most of the Fω stuff going on will be restricted to semantic types.

The only wrinkle is the new type formers: the singleton $[= \Xi]$ and the function type with a purity annotation $\Sigma \to_\eta \Xi$. The paper defines these via desugaring to Fω types with single-field records, but an implementation will find it easier to treat them as their own type constructors.
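Here is one way the Fig. 6 grammar could be transcribed directly into datatypes, with the singleton and the purity-annotated arrow as first-class constructors rather than record encodings. All of the names here are my own hypothetical choices:

```rust
/// Purity annotations: P (pure) or I (impure).
#[derive(Debug, Clone, PartialEq)]
enum Purity {
    Pure,
    Impure,
}

/// Paths: a type variable applied to small-type arguments.
#[derive(Debug, Clone, PartialEq)]
struct Path {
    head: u32,        // the variable alpha, as an id
    args: Vec<Small>, // the small-type arguments sigma-bar
}

/// Small types (sigma).
#[derive(Debug, Clone, PartialEq)]
enum Small {
    Path(Path),
    Bool,
    Singleton(Box<Small>),         // [= sigma]
    Record(Vec<(String, Small)>),  // { l : sigma }
    Arrow(Box<Small>, Box<Small>), // sigma ->I sigma (always impure)
}

/// Large types (Sigma).
#[derive(Debug, Clone, PartialEq)]
enum Large {
    Path(Path),
    Bool,
    Singleton(Box<Abstracted>),   // [= Xi]
    Record(Vec<(String, Large)>), // { l : Sigma }
    Arrow {
        binders: Vec<u32>,      // forall alpha-bar
        dom: Box<Large>,        // Sigma
        purity: Purity,         // eta
        cod: Box<Abstracted>,   // Xi
    },
}

/// Abstracted types (Xi): a large type with abstract variables.
#[derive(Debug, Clone, PartialEq)]
struct Abstracted {
    abstracted: Vec<u32>, // the alpha-bar, as variable ids
    body: Large,
}
```

Notice how the grammar’s stratification falls out for free: `Small` arrows carry no binders and no purity choice, and only `Large` positions can mention a full `Abstracted` type.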

Paths

The path grammar is a little wonky. It allows repetition of $\sigma$ in two different ways, suggesting that a path is an Fω type variable applied to a list of lists of small types, like $\pi ::= \alpha\, \overline{\overline{\sigma}}$. My guess is that this is just a mistake; it suggests that paths have structure beyond the Fω types of which they are supposed to be a subgrammar, and nothing in the rest of the paper goes wrong if we just treat it as $\pi ::= \alpha\, \overline{\sigma}$.

I honestly find the whole idea of a path a little poorly motivated. My best attempt to articulate the role paths play in the rest of the paper is that they enforce that a semantic type is (roughly) in normal form (since type-level function applications always have a variable at their head), and that only small types can be applied as arguments to functions. “Path” is really a misnomer from this perspective. (The fact that a set of paths is an input to the subtyping relation $\Gamma \vdash \Sigma' \leq_{\overline{\pi}} \Sigma$ is actually a red herring, as I’ll discuss in a later article.)

Abstracted types

The abstracted type grammar $\Xi ::= \exists \overline{\alpha}.\, \Sigma$ risks giving the reader poor intuitions.¹ You should think of “abstracted types” as corresponding to signatures (a.k.a. module types) in traditional descriptions of ML. They can contain one or more abstract types (the $\overline{\alpha}$), but thinking of them as existentially quantified is too limiting, since they can be specialized to concrete types after the fact by constructs like `T where type t = bool`.

As an illustration of this, you can look at the desugaring rules on page 25, which say that the surface signature `{ type t; x : t }` gets desugared to $\Xi = \exists \alpha : \Omega.\, \{ t : [= \alpha],\ x : \alpha \}$. This is an Fω type which is provably useless!² But this signature isn’t useless — we could readily imagine a functor in OCaml having it as an argument.

In my opinion, writing $\exists$ here is an abuse of notation. A better mental model of an abstracted type is something like $(\overline{\alpha : \kappa}.\ \Sigma)$ — a large type expression that has a number of free variables, and is equipped with a description of those variables (their kinds), but does not prescribe an interpretation for the variables, allowing them to later become existentially quantified, universally quantified, or substituted with concrete types. This is what the paper calls translucency; if you’re familiar with nominal sets, you can also think of it as a binder.
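The binder reading suggests a representation along these lines: a body plus a list of kinded variables, with the consumer deciding later what to do with them. This sketch (hypothetical names `Binder`, `instantiate`; a toy type grammar in place of the real semantic types) shows one of the possible interpretations, substitution with concrete types as in `where type`:

```rust
/// Kinds: the kind of types, and type-constructor kinds.
#[derive(Debug, Clone, PartialEq)]
enum Kind {
    Omega,
    Arrow(Box<Kind>, Box<Kind>),
}

/// A toy stand-in for large semantic types.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Var(u32),
    Bool,
    Record(Vec<(String, Ty)>),
}

/// (alpha-bar : kappa-bar . Sigma): a body with described free
/// variables, committed to no particular quantifier.
#[derive(Debug, Clone, PartialEq)]
struct Binder {
    vars: Vec<(u32, Kind)>, // each abstract variable and its kind
    body: Ty,
}

/// One consumer's choice: substitute concrete types for the bound
/// variables (other consumers might existentially or universally
/// quantify them instead).
fn instantiate(xi: &Binder, args: &[Ty]) -> Ty {
    assert_eq!(xi.vars.len(), args.len());
    fn go(t: &Ty, vars: &[(u32, Kind)], args: &[Ty]) -> Ty {
        match t {
            Ty::Var(v) => vars
                .iter()
                .position(|(b, _)| b == v)
                .map(|i| args[i].clone())
                .unwrap_or(Ty::Var(*v)),
            Ty::Bool => Ty::Bool,
            Ty::Record(fs) => Ty::Record(
                fs.iter().map(|(l, t)| (l.clone(), go(t, vars, args))).collect(),
            ),
        }
    }
    go(&xi.body, &xi.vars, args)
}
```

The point of the representation is exactly what the prose says: nothing in `Binder` itself says “existential”; that interpretation (or another) is chosen by whatever rule consumes it.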

A summary of my implementation recommendations

Collecting the suggestions scattered through this article:

- Keep the original source order of record field labels, even though the typing rules treat fields as unordered.
- Treat the singleton type $[= \Xi]$ and the purity-annotated arrow $\Sigma \to_\eta \Xi$ as their own type constructors, rather than desugaring them to single-field records.
- Treat a path as a type variable applied to a single list of small types, $\pi ::= \alpha\, \overline{\sigma}$.
- Implement type equivalence as one-directional reduction (e.g. via normalization by evaluation), and consider lazier strategies than eager normalization for very large types.

Wrapping up

That’s all I have time for today. In the next entry, I’ll cover the surface syntax of 1ML, the desugaring judgement $\Gamma \vdash T \leadsto \Xi$, the role of the $[= \Xi]$ type, and how I think about phase separation.

Footnotes

1: The use of this notation goes back at least to F-ing modules, so there is precedent for it, and it makes the rules more concise to not have to coerce from $\Xi$ to a “real” existential type, but I still don’t like it. ↩︎

2: This type is isomorphic to $A := \exists \alpha.\, \alpha$, a type whose values cannot be distinguished in any way — there’s no way to write a function $A \to \textbf{bool}$ that isn’t constant. ↩︎