Contents

*

<hmrefbasics2.sig>=
signature HM_REF_BASICS2 = sig
  type var = HindleyMilner2.refvar
  type 'v ty = 'v HindleyMilner2.ty
  val eq  : var ty -> var ty -> bool
  val eqb : var ty *  var ty -> bool
end
Defines eq, eqb, HM_REF_BASICS2, ty, var (links are to index).

<hmref2.sig>=
signature HM_REF2 = sig
  type refvar = HindleyMilner2.binding ref
  <basic datatypes>
  <sig>
  type vars = {ty : HindleyMilner2.refvar list, width : HindleyMilner2.refvar list}
  val freeVars : HindleyMilner2.refvar HindleyMilner2.term -> vars
  val freeVarsPoly : ('c * vars -> vars) ->
                     (HindleyMilner2.refvar, 'c) HindleyMilner2.topty -> vars
end
Defines freeVars, freeVarsPoly, HM_REF2, refvar, vars (links are to index).

We have to know when type variables are covariant or contravariant. These functions return false for width variables.

<sig>= (<-U) [D->]
val covariantIn     : refvar * ty -> bool
val contravariantIn : refvar * ty -> bool
Defines contravariantIn, covariantIn (links are to index).

<basic datatypes>= (<-U U->)
type 'a pty = 'a HindleyMilner2.ty
type 'a pwidth = 'a HindleyMilner2.width
type 'e exp = (refvar, 'e) Typed2.exp
type ty    = refvar HindleyMilner2.ty
type 'c topty = (refvar, 'c) HindleyMilner2.topty
type width = refvar HindleyMilner2.width
Defines exp, pty, pwidth, topty, ty, width (links are to index).

We use mutable cells to represent variables so as to avoid explicit substitutions; instead we implement substitution by assignment. Once assigned a BOUND value, a variable is equivalent to that value.

(N.B. Type variables may not participate in cycles; the unifier contains and occurs check to prevent formation of cycles. Also note, althought the kinds of constructors are not used in this module, they are believed to be useful for elaborating types written by the user.)

Unification changes the contents of variables' cells. When two types don't unify, we raise Unify.

<sig>+= (<-U) [<-D->]
exception Unify of ty * ty
val unify      : ty * ty -> unit
val unifyTypes : ty list * ty list -> unit
val unifyWidth : width * width -> unit
Defines Unify, unify, unifyTypes, unifyWidth (links are to index).

To transform between types and type schemes, we can generalize or instantiate. Generalization involves adding  quantifiers for (``closing over'') free type and constant variables. [When we handle Milner let properly, we will close over only those variables that aren't free in the environment.] Instantiation involves stripping off quantifiers and replacing -bound variables with fresh variables.

We want this module to be usable not only to infer types, but to convert from implicitly-typed to explicitly-typed calculi. This means that when we close over a type or constant variable, we must add a Lambda-abstraction to a term. Similarly, when we instantiate a type or constant variable, we must create a type or constant application of a term. So as not to confuse term manipulations with type manipulations, we require that two kinds of Lambda-abstraction be passed to generalize and that type and constant application be passed to instantiate.

<sig>+= (<-U) [<-D]
type constraints
val instantiate :
  { tyapp : 'exp * ty -> 'exp, widthapp : 'exp * width -> 'exp } ->
  'exp * constraints topty -> constraints * 'exp * ty
val generalize : 
  { tyabs    : refvar * constraints * 'e -> 'e
  , widthabs : refvar * constraints * 'e -> 'e
  } ->
  (constraints * 'e * ty) * {freeEnv:ty list, freeCon:ty list list} ->
  constraints * 'e * constraints topty
Defines constraints, generalize, instantiate (links are to index).

Both of these functions return not only a type but also a modified term, and generalize returns the unused constraints. As is standard, generalize has to know about type variables that are free in the environment, so it can avoid closing over them. The way those arguments are passed is a bit odd; essentially we pass types, not variables, and when we want the variables themselves, we grab the free variables of those types. The advantage of this technique is that a unification does not invalidate the set, whereas if we had a set of variables, we would always be checking to see if one got unified with something when we weren't paying attention. The other strangeness is that, because we're using Nordlander's subtyping technique, we have to avoid closing over variables free in subtype constraints. (Nordlander has proved that subtype constraints are sufficient in his system, since no varialbes free in tau|D are free in Gamma, but I haven't done a similar proof of my system.)

alpha= taufree sigma= alpha| N_alpha.tau N N_alpha, abs(alpha, e), sigma= generalize(N, e, tau) free
<hmrefbasics2.sml>=
structure HMRefBasics2 : HM_REF_BASICS2 = struct
  structure HM = HindleyMilner2
  type var = HindleyMilner2.refvar
  type 'v ty  = 'v HindleyMilner2.ty
  <basics implementation>
end

Defines HM, HMRefBasics2, ty, var (links are to index).

<hmref2.sml>=
structure HindleyMilnerRef2 : HM_REF2 = struct
  structure HM = HindleyMilner2
  structure T = Typed2
  structure N = PreNumeric
  structure V = HM.Vars
  type refvar = HM.refvar 
  <basic datatypes>
  <type utilities>
  <type implementation>
end
Defines HindleyMilnerRef2, HM, N, refvar, T, V (links are to index).

<hmrefprint2.sml>=
functor HMRefPrint2Fun(
  structure HMRef : HM_REF2
  structure HMP : HM_PRINT2_PLUS
    where type HM.refvar' = HindleyMilner2.refvar'
    where type HM.var  = HindleyMilner2.var
) : sig
  type ty
  structure HMP : HM_PRINT2_PLUS
    where type HM.refvar' = HindleyMilner2.refvar'
    where type HM.var  = HindleyMilner2.var
  val errorPair : ty * ty -> (HMP.tree * HMP.tree) HMP.Printer.M
end
 = struct
  structure HM = HindleyMilner2
  structure pp = PP.Short
  structure U = HMP.Unparser
  structure P = HMP.Printer
  structure HMPA = HMP.Ast

  type ty = HM.refvar HM.ty
  structure HMP = HMP

  val >>= = P.bind
  infixr 0 >>= >>==
  fun x >>== f = P.map f x

  <displaying failed unifications>
end  
Defines >>=, errorPair, HM, HMP, HMPA, HMRef, HMRefPrint2Fun, P, pp, ty, U, x (links are to index).

Unification

As in most unifiers, we have to implement an occurs check so we can avoid, e.g., unifying 'a with 'a list. tyFreeIn is that occurs check.

<junk>= [D->]
fun tyFreeIn deref (v, t) =
  let fun f(TYVAR v')        = (case deref v' of SOME t' => f t' | NONE => v = v')
        | f(CON(_, args))    = List.exists argf args
        | f(ARROW(arg, ret)) = f arg orelse f ret
        | f(TUPLE args)      = List.exists f args
        | f(RECORD args)     = List.exists (f o #2) args
      and argf(TYARG    t) = f t
        | argf(WIDTHARG t) = false
  in  f t
  end  
Defines tyFreeIn (links are to index).

We don't expect clients to search for free variables, but we have to make the occurs check available to the module that prints failed unifications.

<junk>+= [<-D]
val tyFreeIn : ('a -> 'a ty option) -> 'a * 'a ty -> bool
Defines tyFreeIn (links are to index).

Unification is implemented by side effects on type variables. Fail is raised anywhere things go wrong; at opportune moments, we catch Fail in order to raise the more informative Unify. ulist helps unify different kinds of lists.

<type implementation>= (<-U) [D->]
exception Unify of ty * ty

local
  exception Fail 
  exception Impossible
  fun ulist u ([], []) = ()
    | ulist u (h1::t1, h2::t2) = (u (h1, h2); ulist u (t1, t2))
    | ulist u  _ = raise Fail
  fun isWidth (HM.WIDTH _) = true
    | isWidth (HM.WIDTHVAR _) = true
    | isWidth _ = false
in
  <unification>
  val unifyTypes = ulist unify
end
Defines Fail, Impossible, isWidth, ulist, Unify, unifyTypes (links are to index).

All the tricky bits are in the treatment of variables.

<unification>= (<-U)
fun unify(t1, t2) =
  let fun tymatch v (HM.TYVAR v') = v = v'
        | tymatch _ _ = false
      fun widthmatch v (HM.WIDTHVAR v') = v = v'
        | widthmatch _ _ = false
      fun u(HM.TYVAR v, t) = uvar tymatch (v, t)
        | u(t, HM.TYVAR v) = uvar tymatch (v, t)
        | u(HM.TUPLE  a1, HM.TUPLE  a2) = ulist unify      (a1, a2)
        | u(HM.RECORD a1, HM.RECORD a2) = ulist unifyNamed (a1, a2)
        | u(HM.CON(n1, a1), HM.CON(n2, a2)) = 
            if n1 = n2 then ulist unify (a1, a2) else raise Fail
        | u(HM.ARROW(a1, r1), HM.ARROW(a2, r2)) = ulist unify ([r1,a1], [r2,a2])
        | u(HM.WIDTHVAR v, w) = uvar widthmatch (v, w)
        | u(w, HM.WIDTHVAR v) = uvar widthmatch (v, w)
        | u(HM.WIDTH w1, HM.WIDTH w2) = if w1 = w2 then () else raise Fail
        | u _ = raise Fail
      and unifyNamed ((n1, t1), (n2, t2)) =
            if n1 = n2 then unify(t1, t2) else raise Fail
      and uvar match (ref (HM.BOUND t), t') = unify(t, t')
        | uvar match (v as ref HM.UNBOUND, t') =
            let val t' = HM.meaning t'
            in  if match v t' then ()
                else if HM.freeIn (v, t') then raise Fail
                else v := HM.BOUND t'
            end
  in  u(t1, t2) handle Fail =>
                  if isWidth t1 orelse isWidth t2 then raise Fail
                  else raise Unify (t1, t2) (* better error msgs *)
  end
val unify = fn x => unify x handle Fail => raise Unify x
val unifyWidth = unify
Defines unify, unifyWidth (links are to index).

Equality testing

<basics implementation>= (<-U)
fun eq t1 t2 = eqb(t1, t2)
and eqb(t1, t2) =
  let fun all p ([], []) = true 
        | all p (h::t, h'::t') = p(h, h') andalso all p (t, t')
        | all _ _ = false
      fun u(HM.TYVAR v, t) = veq(v, t)
        | u(t, HM.TYVAR v) = veq(v, t)
        | u(HM.TUPLE  a1, HM.TUPLE  a2) = all u (a1, a2)
        | u(HM.RECORD a1, HM.RECORD a2) = all unamed (a1, a2)
        | u(HM.CON(n1, a1), HM.CON(n2, a2)) = n1 = n2 andalso all u (a1, a2)
        | u(HM.ARROW(a1, r1), HM.ARROW(a2, r2)) = u(a1, a2) andalso u(r1, r2)
        | u(HM.WIDTHVAR v, w) = veq(v, w)
        | u(w, HM.WIDTHVAR v) = veq(v, w)
        | u(HM.WIDTH w1, HM.WIDTH w2) = w1 = w2
        | u _ = false
      and veq(ref (HM.BOUND t), t') = u(t, t')
        | veq(v as ref HM.UNBOUND, t) =
            let fun isv (HM.TYVAR    (ref (HM.BOUND t))) = isv t
                  | isv (HM.WIDTHVAR (ref (HM.BOUND t))) = isv t
                  | isv (HM.TYVAR    v') = v = v'
                  | isv (HM.WIDTHVAR v') = v = v'
                  | isv _ = false
            in  isv t
            end
      and unamed ((n1, t1), (n2, t2)) = n1 = n2 andalso u(t1, t2)
  in  u(t1, t2)
  end
Defines eq, eqb (links are to index).

Displaying failed unifications

The goal here is to show a pair of types with the non-matching parts highlighted. We have to be careful to tackle the parts of types in the same order we would use in unify. ulist and everything else now use monads to track the printer state. We need extra effort to detect list-length mismatches; otherwise, the mismatches aren't detected until the monadic value is used, and then the raising of Fail takes place in Printer.print, which is outside the scope of the necessary handler. Using lift avoids delaying the recursive call to ulist u.

<displaying failed unifications>= (<-U)
local
  exception Fail 
  exception Impossible
  fun pphighlight l = pp.li [pp.te " >>> ", pp.li l, pp.te " <<< "]
  fun highlight t = U.DECORATE (pphighlight, t)
  fun hitree  x = P.map highlight (HMPA.ty x)
  fun swap (x, y) = (y, x)
  fun pairmap f (x, y) = f x >>= (fn x => f y >>== (fn y => (x, y)))
  val hipair = pairmap hitree
  fun lift {h, t} = h >>= (fn h => t >>== (fn t => {h=h, t=t}))
  fun ulist u ([], []) = P.unit ([], [])
    | ulist u (h1::t1, h2::t2) =
        lift {h=u (h1, h2), t=ulist u (t1, t2)} >>== (fn {h=(h1, h2), t=(t1, t2)} => 
           (h1 :: t1, h2 :: t2))
    | ulist u  _ = raise Fail
in
  <definition of errorPair>
end
Defines Fail, highlight, hipair, hitree, Impossible, lift, pairmap, pphighlight, swap, ulist (links are to index).

The general idea is to re-run the unification, and by catching Fail, to know what to highlight.

<definition of errorPair>= (<-U)
local
  fun u(HM.TYVAR v, t) = uvar(v, t)
    | u(t, HM.TYVAR v) = P.map swap (uvar(v, t))
    | u(t1 as HM.TUPLE a1, t2 as HM.TUPLE a2) = 
         (ulist u (a1, a2) >>== (fn (a1, a2) => (HMP.tuple a1, HMP.tuple a2))
          handle Fail => hipair(t1, t2))
    | u(t1 as HM.RECORD a1, t2 as HM.RECORD a2) =
         (ulist unifyNamed (a1, a2) >>== (fn (a1, a2) =>(HMP.record a1, HMP.record a2))
          handle Fail => hipair(t1, t2))
    | u(t1 as HM.CON(n1, a1), t2 as HM.CON(n2, a2)) = 
         if n1 = n2 then
           ulist u (a1, a2) >>== (fn (a1, a2) => (HMP.con(n1, a1), HMP.con(n2, a2)))
           handle Fail => hipair(t1, t2)
         else
           hipair(t1, t2)
    | u(HM.ARROW(a1, r1), HM.ARROW(a2, r2)) = 
         (ulist u ([r1, a1], [r2, a2]) handle Fail => raise Impossible) >>==
           (fn ([r1, a1], [r2, a2]) => (HMP.arr(a1, r1), HMP.arr(a2, r2))
             | _ => raise Impossible)
    | u(HM.WIDTHVAR v, c) = uvar(v, c)
    | u(c, HM.WIDTHVAR v) = P.map swap (uvar(v, c))
    | u(c1 as HM.WIDTH n1, c2 as HM.WIDTH n2) = 
        if n1 = n2 then pairmap HMPA.width (c1, c2)
        else            hipair (c1, c2)
    | u (t1, t2) = hipair(t1, t2)
  and unifyNamed ((n1, t1), (n2, t2)) =
        if n1 = n2 then
          u (t1, t2) >>== (fn (t1, t2) => ((pp.te n1, t1), (pp.te n2, t2)))
        else
          pairmap HMPA.ty (t1, t2) >>== (fn (t1, t2) =>
            ((pphighlight [pp.te n1], t1), (pphighlight [pp.te n2], t2)))
  and uvar(ref (HM.BOUND t1), t2) = u(t1, t2)
    | uvar(v as ref HM.UNBOUND, t) = 
        let val t = HM.meaning t   
        in  if t = HM.TYVAR v then pairmap HMPA.ty (HM.TYVAR v, t)
            else if HM.freeIn (v, t) then hipair(HM.TYVAR v, t)
            else pairmap HMPA.ty (HM.TYVAR v, t)
        end
in
  val errorPair = u
end
Defines errorPair, u, unifyNamed, uvar (links are to index).

Compute free type variables and free width variables. Everything is backwards until the end

<type implementation>+= (<-U) [<-D->]
val insert = V.add'
type vars = {ty : HindleyMilner2.refvar list, width : HindleyMilner2.refvar list}
fun insertTy    (v, {ty, width}) = {ty = insert(v, ty), width = width}
fun insertWidth (v, {ty, width}) = {ty = ty, width = insert(v, width)}
fun freeVars t =
  let fun ty(HM.TYVAR    (     ref (HM.BOUND t)), l) = ty(t, l)
        | ty(HM.TYVAR    (v as ref (HM.UNBOUND)), l) = insertTy(v, l)
        | ty(HM.WIDTHVAR (     ref (HM.BOUND t)), l) = ty(t, l)
        | ty(HM.WIDTHVAR (v as ref (HM.UNBOUND)), l) = insertWidth(v, l)
        | ty(HM.CON(_, args),                     l) = foldl ty l args
        | ty(HM.ARROW(arg, ret),                  l) = ty(ret, ty(arg, l))
        | ty(HM.TUPLE args,                       l) = foldl ty      l args
        | ty(HM.RECORD args,                      l) = foldl binding l args
        | ty(HM.WIDTH _,                          l) = l
      and binding((n, t),                         l) = ty(t, l)
      val {ty, width} = ty(t, {ty=V.empty, width=V.empty})
      fun rev set = foldl V.add' V.empty (V.listItems set)
  in  {ty = rev ty, width = rev width}
  end  
fun freeVarsPoly conVars t =
  let fun fv (bound, HM.LIFT t) = freeVars t -- bound
        | fv (bound, HM.ALLTY    (v, c, t)) =
            (conVars (c, {ty=V.empty, width=V.empty}) -- bound) ++
            fv (insertTy(v, bound), t)
        | fv (bound, HM.ALLWIDTH (v, c, t)) =
            (conVars (c, {ty=V.empty, width=V.empty}) -- bound) ++
            fv (insertWidth(v, bound), t)
  in  fv({width=V.empty, ty=V.empty}, t)
  end
Defines freeVars, freeVarsPoly, insert, insertTy, insertWidth, vars (links are to index).

Generalization is pretty anti-climactic; the hard work is already done. We just have to avoid closing over the variables that shouldn't be closed over (because they are free in the environment or in the subtype constraints). These variables are passed in by means of free. Then we follow Nordlander, in closing over the variables free in tau. The numeric constraints have to split into constraints that are definitely in or definitely out. Normally, a constraint contained within the variables is the escaping one---here, it's the one to be closed over.

<type implementation>+= (<-U) [<-D->]
fun generalize {tyabs, widthabs} ((N, e, tau), {freeEnv, freeCon}) =
  let val envVars = freeVars (HM.TUPLE freeEnv) 
      val conVars = freeVars (HM.TUPLE (foldr op @ [] freeCon))
      val alphas = freeVars tau -- envVars -- conVars
      val {escapes=N_alphas, mustSolve=N'} = N.split (join alphas) N
      <definition of close>
      val (e, sigma) = close alphas N_alphas (e, tau)
  in  (N', e, sigma)
  end
Defines generalize (links are to index).

We're closing over a set of variables alpha with constraints N, always with the precondition that N alpha.

<definition of close>= (<-U)
fun allty    alpha c tau = HM.ALLTY    (alpha, c, tau)
fun allwidth alpha c tau = HM.ALLWIDTH (alpha, c, tau)
fun freeNumeric ncon =
  foldr (fn (t, vs) => V.union(join (freeVars t), vs)) V.empty (N.freeTerms ncon)
fun close {ty=t::ts, width=w} N (e, tau) =
      let val alphas' = {ty=ts, width=w}
          fun inner invars ncon = V.isSubset(freeNumeric ncon, invars)
          val N' = N.Set.filter (inner (join alphas')) N
          val (e, tau) = close alphas' N' (e, tau)
          val Nbound = N.Set.difference (N, N')
      in  (tyabs (t, Nbound, e), allty t Nbound tau)
      end
  | close {ty=[], width=w::ws} N (e, tau) =
      let val alphas' = {ty=[], width=ws}
          fun inner invars ncon =
            V.exists (fn v => V.member(invars, v)) (freeNumeric ncon)
          val N' = N.Set.filter (inner (join alphas')) N
          val (e, tau) = close alphas' N' (e, tau)
          val Nbound = N.Set.difference (N, N')
      in  (widthabs (w, Nbound, e), allwidth w Nbound tau)
      end
  | close {ty=[], width=[]} N (e, tau) =
      if N.Set.isEmpty N then
        (e, HM.LIFT tau)
      else
        Impossible.impossible "constraint with wandering bound variables"
Defines allty, allwidth, close, freeNumeric (links are to index).

We need set operations on {ty, width} pairs.

<type utilities>= (<-U)
local
  val op - = V.difference
  val op + = V.union
in 
  infix 3 -- ++
  fun {ty=t1, width=w1} -- {ty=t2, width=w2} = {ty = t1 - t2, width = w1 - w2}
  fun {ty=t1, width=w1} ++ {ty=t2, width=w2} = {ty = t1 + t2, width = w1 + w2}
  fun join {ty, width} = ty + width
  fun varsList {ty=t, width=w} = {ty=V.listItems t, width=V.listItems w}
end
Defines join, op, varsList (links are to index).

Instantiation is a bit tricky, because we have to substitute for -bound variables. Rather than substitute repeatedly, which would require a lot of allocation, I save up the substitutions as lists of pairs of variables, then apply them all at once. They also have to be applied to constraints.

<type implementation>+= (<-U) [<-D->]
type constraints = N.Set.set
fun instantiate {tyapp, widthapp} (e, t) =
  let fun inst (e, HM.ALLTY    (v, con, t), vars, cons') =
            let val v' = ref HM.UNBOUND
            in  inst(tyapp   (e, HM.TYVAR    v'), t, (v, HM.TYVAR v') :: vars, con :: cons')
            end
        | inst (e, HM.ALLWIDTH (v, con, t), vars, cons') =
            let val v' = ref HM.UNBOUND
            in  inst(widthapp(e, HM.WIDTHVAR v'), t, (v, HM.WIDTHVAR v') :: vars, con :: cons')
            end
        | inst (e, HM.LIFT t, varmap, cons') = 
            let val subst = HM.listSubst varmap
                fun add (c, s) = foldr (fn (c, s) => N.Set.add'(N.map subst c, s)) s c
            in  (foldl add N.Set.empty cons', e, subst t)
            end
  in  inst (e, t, [], [])
  end
Defines constraints, instantiate (links are to index).

We find positive and negative occurrences of type variables (and only type variables --- these never succeed on width variables).

<type implementation>+= (<-U) [<-D]
fun variant (v, positive) =
  let fun co (HM.TYVAR (     ref (HM.BOUND t  ))) = co t
        | co (HM.TYVAR (v' as ref (HM.UNBOUND))) = positive andalso v = v'
        | co (HM.CON(cons, args)) = List.exists co args
        | co (HM.TUPLE  args)     = List.exists co args
        | co (HM.RECORD args)     = List.exists (fn (n, t) => co t) args
        | co (HM.ARROW(arg, ret)) = co ret orelse variant(v, not positive) arg
        | co (HM.WIDTH n) = false
        | co (HM.WIDTHVAR _) = false
  in  co
  end
fun covariantIn    (v, t) = variant(v, true)  t
fun contravariantIn(v, t) = variant(v, false) t
Defines contravariantIn, covariantIn, variant (links are to index).