2023-01-13
This is a literate Haskell-file. You can open it in GHCi by doing the following:
curl -O https://las.rs/blog/all-you-need-is-hkt-s.lhs
nix develop --impure --expr \
'with import <nixpkgs> {}; mkShell {
buildInputs =
[(haskell.packages.ghc924.ghcWithPackages
(p: [ p.markdown-unlit ])
)];
}' \
-c ghci -pgmL markdown-unlit all-you-need-is-hkt-s.lhs
Haskellers (and even worse, dependent type brained people) are prone to making complicated types to maximise abstraction and correctness. However, you might be surprised to know that you (technically) need nothing more than ADTs, HKTs, and rank-N types to do almost all of the magic you need.
Core theorem: (Roughly) any valid Haskell term (or of a similar language) typed with type families and GADTs, can be reformulated as a semantically equivalent term that can be typed without type families and GADTs.
What do I mean with semantically equivalent? When erasing types, the new term has the same structure, modulo technically unnecessary wrappings using constructors that ought to be newtype constructors. (You could in fact aleviate this by adding more language features, but that would ruin the point, unless there is a way of doing it in a minimal way.)
Type families in Haskell represent uncurriable functions that transform one type-level term into another type-level term.
Let us consider the following trivial example:
type family Not b where
Not 'True = 'False
Not 'False = 'True
It is in fact possible to transform the above into a data family, and in fact turn any type family into a data family (or something equivalent to avoid the limitation wrt. overlapping data family instances, see end of this post):
type NotD :: Bool -> (Bool -> Type) -> Type
data family NotD b k
newtype instance NotD 'True k = NotDTrue (k 'False)
newtype instance NotD 'False k = NotDFalse (k 'True)
How can this be true? The intuitive and informal proof is that, at the type-level terms of data kinds, though not types themselves, will always be used to construct a type.
We lose no power (modulo GHC bugs and Haskell misfeatures) compared to the type family version. In addition, since the data family uses newtype instances, the term can be kept representationally equivalent.
Let us give an example:
data Something (b :: Bool) where
SomethingInt :: Int -> Something 'True
SomethingDouble :: Double -> Something 'False
flipSomething :: Something b -> Something (Not b)
SomethingInt n) = SomethingDouble $ fromIntegral n
flipSomething (SomethingDouble n) = SomethingInt $ truncate n flipSomething (
We can transform the above into the following:
flipSomethingD :: Something b -> NotD b Something
SomethingInt n) = NotDTrue $ SomethingDouble $ fromIntegral n
flipSomethingD (SomethingDouble n) = NotDFalse $ SomethingInt $ truncate n flipSomethingD (
In fact, we can even do the following:
flipSomethingD' :: Something b -> Something (Not b)
= unsafeCoerce flipSomethingD flipSomethingD'
You can try running the above function and confirm it works.
Representionally, it should be equivalent to flipSomething
modulo GHC optimisations that might be inhibitted due to the use of
unsafeCoerce
.
If you look at NotD
, you might notice that it looks
awfully like a GADT.
-- :k NotD
type NotD :: Bool -> (Bool -> Type) -> Type
-- :t NotDTrue
pattern NotDTrue :: forall (k :: Bool -> Type). k 'False -> NotD 'True k
-- :t NotDFalse
pattern NotDFalse :: forall (k :: Bool -> Type). k 'True -> NotD 'False k
In fact, we can formulate the exact same thing, except unfortunately it turns into a data type, hence unnecessary constructor tags and wrappings are added in the representation:
type NotG :: Bool -> (Bool -> Type) -> Type
data NotG b k where
NotGTrue :: k 'False -> NotG 'True k
NotGFalse :: k 'True -> NotG 'False k
flipSomethingG :: Something b -> NotG b Something
SomethingInt n) = NotGTrue $ SomethingDouble $ fromIntegral n
flipSomethingG (SomethingDouble n) = NotGFalse $ SomethingInt $ truncate n flipSomethingG (
You can not do the unsafeCoerce
trick on this to
turn it back into the original flipSomething
, given that
NotG
is (unfortunately) not a newtype.
One thing to note is that we gain slightly more power by turning it
into a GADT, notably, we can match on NotG b k
and thus
find out what b
is.
We can also go from GADTs to normal ADTs using equality proofs instead:
type NotE :: Bool -> (Bool -> Type) -> Type
data NotE b k
= b ~ 'True => NotETrue (k 'False)
| b ~ 'False => NotEFalse (k 'True)
flipSomethingE :: Something b -> NotE b Something
SomethingInt n) = NotETrue $ SomethingDouble $ fromIntegral n
flipSomethingE (SomethingDouble n) = NotEFalse $ SomethingInt $ truncate n flipSomethingE (
However, equalities in constructor constraints are also somewhat
exotic. We can remove them by instead using
Data.Type.Equality.:~:
, but that is in turn defined using a
GADT, which we want to avoid. We turn to Leibniz equality:
type (:~:) :: forall k. k -> k -> Type
newtype (:~:) a b = EqPrf (forall p. p a -> p b)
refl :: a :~: a
= EqPrf id
refl
absurd :: 'False :~: 'True -> a
= error "absurd"
absurd _
type NotEL :: Bool -> (Bool -> Type) -> Type
data NotEL b k
= NotELTrue (b :~: 'True) (k 'False)
| NotELFalse (b :~: 'False) (k 'True)
flipSomethingEL :: Something b -> NotEL b Something
SomethingInt n) =
flipSomethingEL (NotELTrue refl $ SomethingDouble $ fromIntegral n
SomethingDouble n) =
flipSomethingEL (NotELFalse refl $ SomethingInt $ truncate n
ex :: Int
= case flipSomethingEL (SomethingDouble 4.2) of
ex NotELFalse _ (SomethingInt n) -> n
NotELTrue prf _ -> absurd prf
What is a non-trivial example then?
data Nat = Zero | Succ Nat
data Vec (n :: Nat) (a :: Type)
= VNil (n :~: 'Zero)
| forall n'. VCons ('Succ n' :~: n) a (Vec n' a)
data Plus (x :: Nat) (y :: Nat) (k :: Nat -> Type)
= PlusZero (x :~: 'Zero) (k y)
| forall x'. PlusSucc ('Succ x' :~: x) (Plus x' ('Succ y) k)
newtype Flip (f :: a -> b -> Type) (y :: b) (x :: a)
= Flip (f x y)
twoBools :: Plus ('Succ 'Zero) ('Succ 'Zero) (Flip Vec Bool)
= PlusSucc refl $
twoBools PlusZero refl $ Flip $ VCons refl True $ VCons refl True $ VNil refl
Compared to the original approach where youâd simply pass in
VCons True $ VCons True $ VNil
, what weâre doing here is
more akin to passing in an execution trace, i.e. the execution
of the âtype familyâ is included in the program, and the type checker
checks it.
Existential quantification can be expressed as universal quantification instead.
newtype Some f = Some { runSome :: forall r. (forall a. f a -> r) -> r }
some :: f a -> Some f
= Some $ \r -> r x some x
Then we can instead define the above as:
data VUCons' n a n' = VUCons' ('Succ n' :~: n) a (VecU n' a)
data VecU (n :: Nat) (a :: Type)
= VUNil (n :~: 'Zero)
| VUCons (Some (VUCons' n a))
data PlusUSucc' x y k x' = PlusUSucc' ('Succ x' :~: x) (PlusU x' ('Succ y) k)
data PlusU (x :: Nat) (y :: Nat) (k :: Nat -> Type)
= PlusUZero (x :~: 'Zero) (k y)
| PlusUSucc (Some (PlusUSucc' x y k))
twoBoolsU :: PlusU ('Succ 'Zero) ('Succ 'Zero) (Flip VecU Bool)
= PlusUSucc $ some $ PlusUSucc' refl $
twoBoolsU PlusUZero refl $ Flip $ VUCons $ some $ VUCons' refl True $
VUCons $ some $ VUCons' refl True $ VUNil refl
The answer is yes. I assume you are familiar with the SKI calculus, if not, it essentially works the following way:
s :: (a -> b -> c) -> (a -> b) -> a -> c
= (<*>)
s k :: a -> b -> a
= pure
k i :: a -> a
= id i
Using these three primitives, we can write any LC term (with the same restrictions wrt. the type system used).
How can we leverage this here? We can use the CPS trick above to
define newtype
s for the above constructs. You could also
define the AST as an ADT and then have the interpreter be another ADT
(using the Leibniz or GADT trick), but this seems simpler:
type CPS a = (a -> Type) -> Type
infixr 0 ~>
type a ~> b = a -> CPS b
type UnCPS' :: a -> (b -> Type) -> (a ~> b) -> Type
newtype UnCPS' x k f = UnCPS' (f x k)
type UnCPS :: CPS (a ~> b) -> a ~> b
newtype UnCPS x y k = UnCPS (x (UnCPS' y k))
type S2' :: CPS (b ~> c) -> (c -> Type) -> b -> Type
newtype S2' f k x = S2' (UnCPS f x k)
type S2 :: (a ~> b ~> c) -> (a ~> b) -> a ~> c
newtype S2 f g x k = S2 (g x (S2' (f x) k))
type S1 :: (a ~> b ~> c) -> (a ~> b) ~> a ~> c
newtype S1 f g k = S1 (k (S2 f g))
type S :: (a ~> b ~> c) ~> (a ~> b) ~> a ~> c
newtype S f k = S (k (S1 f))
type (<*>) a b = S :@ a :@ b
infixl 4 <*>
type K1 :: a -> b ~> a
newtype K1 x y k = K1 (k x)
type K :: a ~> b ~> a
newtype K x k = K (k (K1 x))
type I :: a ~> a
newtype I x k = I (k x)
infixl 9 :@
type (:@) :: (a ~> c ~> d) -> a -> c ~> d
newtype (:@) x y z k = App (UnCPS (x y) z k)
We use the same CPS trick as used for turning type families into data families, and thus are able to construct the following:
type S :: (a ~> b ~> c) ~> (a ~> b) ~> a ~> c
type K :: a ~> b ~> a
type I :: a ~> a
where a ~> b
represents a type-level computation
using CPS, i.e. itâs defined as
a -> (b -> Type) -> Type
. Using the new
application operator, :@
, we can now move our
pointfreeless programming to the type-level!
type FlipS :: (a ~> b ~> c) ~> b ~> a ~> c
type FlipS = S :@ (S :@ (K :@ S) :@ (S :@ (K :@ K) :@ S)) :@ (K :@ K)
type ComposeS :: (b ~> c) ~> (a ~> b) ~> a ~> c
type ComposeS = S :@ (K :@ S) :@ K
type (.) a b = ComposeS :@ a :@ b
infixr 8 .
type TwiceS :: a ~> (a ~> a ~> b) ~> b
type TwiceS = S :@ (S :@ (K :@ S) :@ (S :@ (K :@ (S :@ I)) :@ K)) :@ K
What does this gain us? Essentially, type-level lambdas without built-in language support. Quite powerful Iâd say.
Can we then possibly go a step further? Could we, replace data kinds, with SKI-Scott-encoded versions?
The answer is very unfortunately ânoâ in Haskell, because universal quantification isnât erased at the type-level.
The issue:
-- In fact Church-encoded, but point stands.
type NatS :: Type
type NatS = forall a. a ~> (a ~> a) ~> a
type ZeroS :: NatS
type ZeroS = K
-- Fails horribly!
{-
/my/file:some:place: error:
⢠Couldn't match kind âa0 -> CPS ((b0 ~> c0) ~> b0)â with âNatâ
Expected kind âNat ~> Natâ,
but âComposeS :@ (S :@ I)â has kind â(a0 ~> ((b0 ~> c0) ~> b0))
-> ((a0 ~> ((b0 ~> c0) ~> c0)) -> Type) -> Typeâ
⢠In the type âComposeS :@ (S :@ I)â
In the type declaration for âSuccSâ
|
79 | type SuccS = ComposeS :@ (S :@ I)
-}
-- type SuccS :: NatS ~> NatS
-- type SuccS = ComposeS :@ (S :@ I)
Though NatS
uses a forall
, the
forall
doesnât work as expected.
The issue becomes more clear if we reformulate it into
type SI :: forall b. ((b ~> b) ~> b) ~> (b ~> b) ~> b
type SI = S :@ I
{-
/my/file:some:place: error:
⢠Couldn't match kind: a0 -> CPS ((b0 ~> b0) ~> b0)
with: forall a. a ~> ((a ~> a) ~> a)
Expected kind â(forall a. a ~> ((a ~> a) ~> a))
~> (forall b. b ~> ((b ~> b) ~> b))â,
but âComposeS :@ SIâ has kind â(a0 ~> ((b0 ~> b0) ~> b0))
-> ((a0 ~> ((b0 ~> b0) ~> b0)) -> Type) -> Typeâ
⢠In the type âComposeS :@ SIâ
In the type declaration for âSuccSâ
|
78 | type SuccS = ComposeS :@ SI
| ^^^^^^^^^^^^^^
-}
type SuccS :: (forall a. a ~> (a ~> a) ~> a) ~> (forall b. b ~> (b ~> b) ~> b)
type SuccS = ComposeS :@ SI
If we take the crux of the error, it says it canât do the following
~> (b0 ~> b0) ~> b0
a0 ~
forall a. a ~> (a ~> a) ~> a
Both typings would be valid for the above type-level term,
if the forall
had been erased. Because it isnât,
and is more akin to an implicit function, GHC canât do anything, since
we donât have real type-level lambdas, and the point is to avoid
type-level lambdas.
Though we have been defeated here, we can still have some fun, albeit Iâve gone back to using real GADTs to help with type (term?) inference:
data VecImpl (n :: Nat) (a :: Type) where
VImplNil :: VecImpl 'Zero a
VImplCons :: a -> VecImpl n a -> VecImpl ('Succ n) a
-- Representation is equal to lists so we steal the instance
instance Show a => Show (VecImpl n a) where
showsPrec = unsafeCoerce $ showsPrec @[a]
show = unsafeCoerce $ show @[a]
showList = unsafeCoerce $ showList @[a]
type VecS1 :: Nat -> Type ~> Type
newtype VecS1 (n :: Nat) (a :: Type) (k :: Type -> Type)
= VecS1 (k (VecImpl n a))
type VecS :: Nat ~> Type ~> Type
newtype VecS (n :: Nat) (k :: (Type ~> Type) -> Type)
= VecS (k (VecS1 n))
type PlusImpl :: Nat -> Nat ~> Nat
data PlusImpl (x :: Nat) (y :: Nat) (k :: Nat -> Type) where
PlusImplZero :: (k y) -> PlusImpl 'Zero y k
PlusImplSucc :: PlusImpl x ('Succ y) k -> PlusImpl ('Succ x) y k
type PlusS :: Nat ~> Nat ~> Nat
newtype PlusS x k = PlusS (k (PlusImpl x))
-- We use SKI stuff to flip stuff around rather than simply using Flip as before
twoBoolsS ::
FlipS :@ VecS :@ Bool . PlusS :@ 'Succ 'Zero)
('Succ 'Zero)
(Identity
twoBoolsS
is what you expect it to be, however, how do
we define it? The answer is that we ask GHC to infer it for us. Insert a
typed hole, i.e. twoBoolsS = _
, and let GHC guide you
forward. Because only one constructor is valid, GHC will
essentially infer the term for you. When you reach the case of
PlusImpl
, you need to try both, and one of them will make
GHC err because itâs invalid. What does the end result look like?
= App
twoBoolsS $ UnCPS $ App $ UnCPS $ App $ UnCPS $ App $ UnCPS $ S $ UnCPS' $ S1
$ UnCPS' $ S2 $ K $ S2' $ UnCPS $ App $ UnCPS $ K $ UnCPS' $ K1 $ UnCPS'
$ S $ UnCPS' $ S1 $ UnCPS' $ S2 $ App $ UnCPS $ PlusS $ UnCPS' $ PlusImplSucc
$ PlusImplZero $ S2' $ UnCPS $ K1 $ UnCPS' $ App $ UnCPS $ App $ UnCPS
$ App $ UnCPS $ App $ UnCPS $ S $ UnCPS' $ S1 $ UnCPS' $ S2 $ App $ UnCPS
$ K $ UnCPS' $ K1 $ S2' $ UnCPS $ App $ UnCPS $ App $ UnCPS $ S $ UnCPS'
$ S1 $ UnCPS' $ S2 $ App $ UnCPS $ App $ UnCPS $ S $ UnCPS' $ S1 $ UnCPS'
$ S2 $ S $ S2' $ UnCPS $ App $ UnCPS $ K $ UnCPS' $ K1 $ UnCPS' $ K $ S2'
$ UnCPS $ App $ UnCPS $ K $ UnCPS' $ K1 $ UnCPS' $ S $ UnCPS' $ S1 $ UnCPS'
$ S2 $ K $ S2' $ UnCPS $ K1 $ UnCPS' $ S1 $ UnCPS' $ S2 $ K1 $ S2' $ UnCPS
$ VecS $ UnCPS' $ VecS1 $ Identity $ VImplCons True $ VImplCons False $ VImplNil
No one said this would be pr*tty. It took me a painstaking amount of
time to do the above with GHCi. I assume it would be faster if you used
HLS. You could theoretically also completely automate it upto the
uninferrable parts, i.e. the two booleans you choose (True
and False
in my case).
Maybe the above wasnât a good idea after all. Maybe there was a reason they added type families (surprisingly). I would (unironically) however recommend using data families instead wherever possible as they are much easier to infer types for (given that they are normal data types mostly).
Iâm especially annoyed by the fact that many people create associated
type families that end in -> Type
, yet donât use a data
family. You have absolutely no good reason most of the time!
There is one notable real-world use case however: Given any complex
data type, you can apply the above transformations to turn it into
something that you can derive Generic
for, if you also turn
any instance of universal quantification into explicit uses of a
Forall
type.
You can also handle recursive types by turning recursion into using
explicit fixpoints using Fix
.
Given this:
type family IsUnit (x :: Type) :: Bool where
IsUnit () = 'True
IsUnit _ = 'False
How do we use the above tricks?
Converting it into a data family doesnât work:
data family IsUnitD (a :: Type) (k :: Bool -> Type)
newtype instance IsUnitD () k = IsUnitDY (k 'True)
newtype instance IsUnitD _ k = IsUnitDN (k 'False)
If all you want to do is improve inference, do the following instead:
newtype IsUnitD (a :: Type) (k :: Bool -> Type) = IsUnitD (k (IsUnit a))
If however you want to see how to replicate this in languages with fewer features, you could naively do:
data IsUnitG (a :: Type) (k :: Bool -> Type) where
IsUnitGY :: (k 'True) -> IsUnitG () k
IsUnitGN :: (k 'False) -> IsUnitG a k
The issue is this is wrong! In the case of ()
, both
constructors are valid. We need an inequality proof:
data IsUnitE (a :: Type) (k :: Bool -> Type)
= IsUnitEY (a :~: ()) (k 'True)
| IsUnitEN (a :~: () -> Void) (k 'False)
data SBool (b :: Bool) where
STrue :: SBool 'True
SFalse :: SBool 'False
handleIsUnitEY :: IsUnitE () SBool -> SBool 'True
IsUnitEY _ b) = b
handleIsUnitEY (IsUnitEN p _) = case p refl of
handleIsUnitEY (
handleIsUnitEN :: IsUnitE Int SBool -> SBool 'False
IsUnitEY _ _) = error "impossible"
handleIsUnitEN (IsUnitEN _ b) = b handleIsUnitEN (
This is one aspect where using explicit equalities is more powerful than using data families and GADTs. If only the above type could be a newtype!
However, there is an issue: It is not clear to me how you would
construct a :~: b -> Void
for two different types
a
and b
(i.e., the heads when in WHNF differ).
Take for example, Int :~: ()
, which we get from the
IsUnitEY
case in handleIsUnitEN
: Can we
somehow reduce this to False :~: True
to make use of the
âtrustedâ function absurd
? Naive solutions end up with an
open branch with Int :~: ()
in scope again, thus resulting
in a cycle, unless you use type families etc., which ruins the
point.
Type theorist. Rolling my own crypto.
This page has a markdown version
Public PGP key (6B66 1F36 59D3 BAE7 0561 862E EA8E 9467 5140 F7F4)