So far I’ve ignored the back and forth on the Scottish referendum on secession from the United Kingdom, but this weekend I decided that it was past time for me to sort it out. For those of you who don’t know me, I’ll mention that I lived for 3.5 years in Scotland quite some time […]

Categories: Functional Programming

Hi *,
Here's a new thing: Blog posts! That's right. A while back, we started a new set of emails on the developers list containing weekly updates, from GHC HQ. But we eventually decided it should be more broad and encompass the work GHC sees as a project - including all the things our contributors do.
So now it's the weekly GHC news - and we (or, well, I) have decided to blogify the weekly emails!
So without further adieu, here's the current recap. The original mailing list copy is available here.
As Gabor mentioned on the list earlier today, I (Austin) accidentally broke the Windows build. Sorry. :( We really need to get Phab building Windows too ASAP... I'm working on a fix this morning.
I sent out the HCAR draft this morning. Please edit it! I think we have a few weeks of lead time however, so we're not in a rush like last time. But I will send reminders. :)
The server migration for ghc.haskell.org seems to have gone pretty smoothly in the past week. It's had plenty of traffic so far. The full migration is still ongoing and I want to complete it this week.
I've finished reorganizing some of the Git and Repository pages after some discussion. We now have the Repositories[1] page, linked to on the left side, which details some notes on the repositories we use, and links to other various pages. I'm thinking of replacing this side-bar "root" with a link to the main Git[2] page, perhaps.
Miscellaneous: ghc.haskell.org and phabricator.haskell.org now sets the Strict-Transport-Security header. This just means you always use SSL now when you visit those pages (so you can't be connection-hijacked via a 503 redirect).
I'm merging some patches at the moment, although the windows fix is currently what I'll push first: Phab:D205, Phab:D204, Phab:D203, Phab:D199, Phab:D194, Phab:D179, Phab:D174. Do let me know if there's anything you want me to look at.
GHC works on Wine apparently for all you Linux users - thanks Mikolaj![3]
Jan had some questions about infrastructure which I just followed up on this morning. In particular: does anyone feel strongly about his first question?[4]
Herbert committed the first part of the Traversable/Foldable changes, by moving the typeclasses to Prelude. This is part of an ongoing series of patches. Things like adding Bifunctor will finally come after this.[5]
Also, added bonus: we'll start including some of the tickets we closed
this week.
Closed tickets for the past week include: #9585, #9545, #9581, #6086, #9558, and #3658.
Please let me know if you have any questions.
[1] https://ghc.haskell.org/trac/ghc/wiki/Repositories
[2] https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/Git
[3] https://www.haskell.org/pipermail/ghc-devs/2014-September/006283.html
[4] https://www.haskell.org/pipermail/ghc-devs/2014-September/006275.html
[5] https://phabricator.haskell.org/D209
2014-09-15T13:47:29Z
thoughtpolice

Categories: Functional Programming

Comparing AVL Trees in C++ and Haskell
posted on 2014-09-15
by Dat Do
This post compares the runtimes of AVL tree operations in C++ vs Haskell. In particular, we insert 713,000 strings from a file into an AVL Tree. This is a \(O(n \log n)\) operation. But we want to investigate what the constant factor looks like in different situations.
Experimental setup: All the code for these tests is available in the github repository. The C++ AVL tree was created in a data structures course that I took recently and the Haskell AVL tree is from the Haskell library Data.Tree.AVL. Additionally, the Haskell code stores the strings as ByteStrings because they are much more efficient than the notoriously slow String. To see how the runtime is affected by files of different sizes, the file was first partitioned into 10 segments. The first segment has 71,300 words, the second 71,300 * 2 words, and so on. Both the C++ and Haskell programs were compiled with the -O2 flag for optimization. The test on each segment is the average runtime of three separate runs.
Here’s the results:
C++vHaskell
C++ is a bit faster than Haskell on the last partition for this test.
I guess this is because Haskell operates on immutable data. Every time a new element is to be inserted into the Haskell AVL tree, new parent nodes must be created because the old parent nodes cannot be changed. This creates quite a bit of overhead. C++ on the other hand, does have mutable data and can simply change the node that a parent node is pointing to. This is faster than making a whole new copy like the Haskell code does.
Is there an easy way to speed up our Haskell code?
There is a Haskell library called parallel that makes parallel computations really convenient. We’ll try to speed up our program with this library.
You might think that it is unfair to compare multithreaded Haskell against C++ that is not multithreaded. And you’re absolutely right! But let’s be honest, manually working with pthreads in C++ is quite the headache, but parallism in Haskell is super easy.
Before we look at the results, let’s look at the parallelized code. What we do is create four trees each with a portion of the set of strings. Then, we call par on the trees so that the code is parallelized. Afterwards, we union the trees to make them a single tree. Finally, we call deepseq so that the code is evaluated.
{-# LANGUAGE TemplateHaskell #-}
import Control.DeepSeq.TH
import Control.Concurrent
import Data.Tree.AVL as A
import Data.COrdering
import System.CPUTime
import qualified Data.ByteString.Char8 as B
import Control.DeepSeq
import Data.List.Split
import System.Environment
import Control.Parallel
$(deriveNFData ''AVL)
-- Inserts elements from list into AVL tree
load :: AVL B.ByteString -> [B.ByteString] -> AVL B.ByteString
load t [] = t
load t (x:xs) = A.push (fstCC x) x (load t xs)
main = do
args <- getArgs
contents <- fmap B.lines $ B.readFile $ args !! 0
let l = splitEvery (length contents `div` 4) contents
deepseq contents $ deepseq l $ return ()
start <- getCPUTime
-- Loading the tree with the subpartitions
let t1 = load empty $ l !! 0
let t2 = load empty $ l !! 1
let t3 = load empty $ l !! 2
let t4 = load empty $ l !! 3
let p = par t1 $ par t2 $ par t3 t4
-- Calling union to combine the trees
let b = union fstCC t1 t2
let t = union fstCC t3 t4
let bt = union fstCC b t
let bt' = par b $ par t bt
deepseq p $ deepseq bt' $ return ()
end <- getCPUTime
n <- getNumCapabilities
let diff = ((fromIntegral (end-start)) / (10^12) / fromIntegral n)
putStrLn $ show diff
Great, so now that the Haskell code has been parallelized, we can compile and run the program again to see the difference. To compile for parallelism, we must use some special flags.
ghc –O2 filename -rtsopts –threaded
And to run the program (-N4 refers to the number of cores).
./filename +RTS –N4
C++vHaskell4
Haskell now gets better runtimes than C++.
Now that we know Haskell is capable of increasing its speeds through parallelism, it would be interesting to see how the runtime is affected by the degree of parallelism.
According to Amdahl’s law, a program that is 100% parallelized will see a proportional speed up based on the number of threads of execution. For example, if a program that is 100% parallelized takes 2 seconds to run on 1 thread, then it should take 1 second to run using 2 threads. The code used for our test, however, is not 100% parallelized since there a union operation performed at the end to combine the trees created by the separate threads. The union of the trees is a \(O(n)\) operation while the insertion of the strings into the AVL tree is a \(O\left(\frac{n \log n }{p}\right)\) operation, where \(p\) is the number of threads. Therefore, the runtime for our test should be
\[O\left(\frac{n\log{n}}{p} + n\right)\]
Here is a graph showing the runtime of the operation on the largest set (713,000 strings) across increasing levels of parallelism.
HaskellParallelization
Taking a look at the results, we can see that the improvement in runtime does not fit the 100% parallelized theoretical model, but does follow it to some extent. Rather than the 2 core runtime being 50% of the 1 core runtime, the 2 core runtime is 56% of the 1 core runtime, with decreasing efficiency as the number of cores increases. Though, it is clear that there are significant improvements in speed through the use of more processor cores and that parallelism is an easy way to get better runtime speeds with little effort.
2014-09-15T00:00:00Z
2014-09-15T00:00:00Z

Categories: Functional Programming

Scandalous! Nick Robinson asks Alex Salmond a question, and Salmond takes seven minutes to answer in detail.On the evening news, Nick Robinson summarises Salmond's answer in a few seconds as 'He didn't answer'.(Above spotted via Arc of Prosperity.)And today, this.I used to be a supporter of the BBC, but it's getting harder and harder to justify.

Categories: Functional Programming

My last post quoted Joe Stiglitz, indirectly, to refute Paul Krugman's fear mongering. Now the man himself has spoken in the Sunday Herald.As Scotland contemplates independence, some, such as Paul Krugman, have questioned the "economics".Would Scotland, going it alone, risk a decline in standards of living or a fall in GDP? There are, to be sure, risks in any course of action: should Scotland stay in the UK, and the UK leave the EU, the downside risks are, by almost any account, significantly greater. If Scotland stays in the UK, and the UK continues in its policies which have resulted in growing inequality, even if GDP were slightly larger, the standards of living of most Scots could fall.Cutbacks in UK public support to education and health could force Scotland to face a set of unpalatable choices - even with Scotland having considerable discretion over what it spends its money on.But there is, in fact, little basis for any of the forms of fear-mongering that have been advanced. Krugman, for instance, suggests that there are significant economies of scale: a small economy is likely, he seems to suggest, not to do well. But an independent Scotland will still be part of Europe, and the great success of the EU is the creation of a large economic zone.Besides, small political entities, like Sweden, Singapore, and Hong Kong have prospered, while much larger entities have not. By an order of magnitude, far more important is pursuit of the right policies.Another example of a non-issue is the currency. There are many currency arrangements that would work. Scotland could continue using sterling - with or without England's consent.Because the economies of England and Scotland are so similar, a common currency is likely to work far better than the euro - even without shared fiscal policy. But many small countries have managed to have a currency of their own - floating, pegged, or "managed."

Categories: Functional Programming

One relentless lie behind 'No' is that Scotland is too wee to make it on its own, counterexamples such as Denmark, Sweden, Singapore, and Hong Kong being conveniently ignored. May this post from Thomas Widmann, a Dane residing in Scotland, help to dispel the disinformation. Pick a random person from somewhere on this planet. Ask them to name an alcoholic drink from Scotland, and it’s very likely they’ll reply “Whisky”. Ask them to name one from Denmark, and they’ll probably be tongue-tied. (They could answer “Gammel Dansk” or “Akvavit”, but they’re just not nearly as famous as whisky.)Now repeat the exercise, but ask about a food item. Again, it’s likely they’ll have heard of haggis but that they’ll be struggling to name anything from Denmark.Now try a musical instrument. Bagpipes and … sorry, cannot think of a Danish one.A sport? Scotland has golf, of course. Denmark can perhaps claim ownership of handball, but it’s not associated with Denmark in the way that golf makes everybody think of Scotland.A piece of clothing? Everybody knows the kilt, but I’d be very surprised if anybody can name one from Denmark.A monster? Everybody knows what’s lurking in Loch Ness, but is there anything scary in Denmark?The only category where Denmark perhaps wins is toys, where Lego surely is more famous than anything from Scotland (but many people don’t know Lego is from Denmark).Denmark is also well-known for butter and bacon, of course, but these aren’t Danish in origin or strongly associated with Denmark in people’s minds.Several famous writers and philosophers were Danish (e.g., Hans Christian Andersen and Søren Kierkegaard), but Scotland can arguably list more names of the same calibre, and the Scottish ones wrote in English, which makes them much more accessible to the outside world.Scottish universities are also ranked better than the Danish ones in recent World rankings.Finally, Scotland has lots of oil and wind, water and waves. Denmark has some, but not nearly as much, and most other countries have less than Denmark.Because of all of this, I don’t worry about the details when it comes to Scottish independence. If Denmark can be one of the richest countries on the planet, of course Scotland can be one too.Yes, there might be a few tough years while the rUK are in a huff and before everything has been sorted out. And of course there will be occasional crises in the future, like in any other country.However, unless you subscribe to the school that Denmark and other small countries like Norway and Switzerland are complete failures because they don’t have nuclear weapons and a permanent seat on the UN’s Security Council, there’s simply no reason to assume Scotland won’t do exceptionally well as an independent country in the longer term.So I’m not worried. Of course there are many details to sort out, but at the end of the day everything will be fine. Scotland will be a hugely successful independent country. Dinna fash yersel!

Categories: Functional Programming

Some of my colleagues have commented on Paul Krugman's financial diatribe, Scots, what the heck. My colleague Nigel Goddard penned a response.While I have a lot of respect for Paul Krugman, his blanket warning against currency unions is misplaced, at least according to Joe Stiglitz, another Nobel prize winning economist (two economists; three opinions). Stiglitz said (at the book festival) that all three proposed models (currency union, use of currency without union, separate currency) can work depending on the characteristics of the economies and, most particularly, the quality of the institutions running the currency (or currencies). For a union to work the economies must be similar in various important ways (level of investment, competitive advantage, etc). rUK and Scotland are similar at present so it can easily work. Over the longer term it may be that Scotland breaks out of the low-investment, low-skill rUK model and goes for a more northern European high-investment, high-skill model (let's hope!). In that case a currency union would over time come under strain and eventually break up (i.e., into separate currencies). But in that case I know which economy I'd rather be in.In the current situation, if the UK's central bank takes a decision contrary to Scotland's interest there is nothing we can do about it. An independent Scotland using UK currency could, if need be, move to its own currency. Krugman's piece says not a word on this option.RBS, Lloyds, and TSB cannot afford to keep their headquarters in Scotland without the backing of the larger UK, though they may well keep much of their operations here. This is touted by Darling as a reason to vote No, but for me it's a reason to vote Yes. Banks too big to fail are an appalling idea; getting rid of them is an enormous benefit of an independent Scotland. I only hope Westminster keeps its promise not to agree a fiscal union!

Categories: Functional Programming

After some initially positive experience with ghc-7.8-rc1 I’ve decided to upstream most of gentoo fixes. On rare arches ghc-7.8.3 behaves a bit bad: ia64 build stopped being able to link itself after ghc-7.4 (gprel overflow) on sparc, ia64 and ppc ghc was not able to create working shared libraries integer-gmp library on ia64 crashed, and […]

Categories: Functional Programming

Two months ago Ivan asked me if we had working darcs-2.8 for ghc-7.8 in gentoo. We had a workaround to compile darcs to that day, but darcs did not work reliably. Sometimes it needed 2-3 attempts to pull a repository. A bit later I’ve decided to actually look at failure case (Issued on darcs bugtracker) […]

Categories: Functional Programming

At last week's CUFP I did a talk called “Haskell tools for satellite operations”. The abstract is:
Since 2013-04 the presenter has been supporting SSC (the Swedish Space Corporation) in operating the telecommunications satellite “Sirius 3” from its Mission Control Center in Kiruna. Functions in the satellite vendor's operations software are breaking down as the orbit of the ageing satellite degrades. To fill in the gaps in software capabilities the presenter has developed several operational tools using Haskell.
The talk will give an overview of the satellite operations environment, the tools developed in Haskell, how they benefitted (and occasionally suffered) from the choice of implementation language, which (public) libraries were critical to their success, and how they were deployed in the satellite operations environment.
A video recording of the talk is available on the CUFP page for the talk and on youtube.
If this interests you be sure to check out the other talk from the “Functional programming in space!” track; Michael Oswald's Haskell in the Misson Control Domain.

Categories: Functional Programming

The Haskell language provides the following guarantee (with caveats): if two programs are equal according to equational reasoning then they will behave the same. On the other hand, Haskell does not guarantee that equal programs will generate identical performance. Consequently, Haskell library writers must employ rewrite rules to ensure that their abstractions do not interfere with performance.Now suppose there were a hypothetical language with a stronger guarantee: if two programs are equal then they generate identical executables. Such a language would be immune to abstraction: no matter how many layers of indirection you might add the binary size and runtime performance would be unaffected.Here I will introduce such an intermediate language named Morte that obeys this stronger guarantee. I have not yet implemented a back-end code generator for Morte, but I wanted to pause to share what I have completed so far because Morte uses several tricks from computer science that I believe deserve more attention.Morte is nothing more than a bare-bones implementation of the calculus of constructions, which is a specific type of lambda calculus. The only novelty is how I intend to use this lambda calculus: as a super-optimizer.NormalizationThe typed lambda calculus possesses a useful property: every term in the lambda calculus has a unique normal form if you beta-reduce everything. If you're new to lambda calculus, normalizing an expression equates to indiscriminately inlining every function call.What if we built a programming language whose intermediate language was lambda calculus? What if optimization was just normalization of lambda terms (i.e. indiscriminate inlining)? If so, then we would could abstract freely, knowing that while compile times might increase, our final executable would never change.RecursionNormally you would not want to inline everything because infinitely recursive functions would become infinitely large expressions. Fortunately, we can often translate recursive code to non-recursive code!I'll demonstrate this trick first in Haskell and then in Morte. Let's begin from the following recursive List type along with a recursive map function over lists:import Prelude hiding (map, foldr)data List a = Cons a (List a) | Nilexample :: List Intexample = Cons 1 (Cons 2 (Cons 3 Nil))map :: (a -> b) -> List a -> List bmap f Nil = Nilmap f (Cons a l) = Cons (f a) (map f l)-- Argument order intentionally switchedfoldr :: List a -> (a -> x -> x) -> x -> xfoldr Nil c n = nfoldr (Cons a l) c n = c a (foldr l c n)result :: Intresult = foldr (map (+1) example) (+) 0-- result = 9Now imagine that we disable all recursion in Haskell: no more recursive types and no more recursive functions. Now we must reject the above program because:the List data type definition recursively refers to itselfthe map and foldr functions recursively refer to themselvesCan we still encode lists in a non-recursive dialect of Haskell?Yes, we can!-- This is a valid Haskell program{-# LANGUAGE RankNTypes #-}import Prelude hiding (map, foldr)type List a = forall x . (a -> x -> x) -> x -> xexample :: List Intexample = \cons nil -> cons 1 (cons 2 (cons 3 nil))map :: (a -> b) -> List a -> List bmap f l = \cons nil -> l (\a x -> cons (f a) x) nilfoldr :: List a -> (a -> x -> x) -> x -> xfoldr l = lresult :: Intresult = foldr (map (+ 1) example) (+) 0-- result = 9Carefully note that:List is no longer defined recursively in terms of itselfmap and foldr are no longer defined recursively in terms of themselvesYet, we somehow managed to build a list, map a function over the list, and fold the list, all without ever using recursion! We do this by encoding the list as a fold, which is why foldr became the identity function.This trick works for more than just lists. You can take any recursive data type and mechanically transform the type into a fold and transform functions on the type into functions on folds. If you want to learn more about this trick, the specific name for it is "Boehm-Berarducci encoding". If you are curious, this in turn is equivalent to an even more general concept from category theory known as "F-algebras", which let you encode inductive things in a non-inductive way.Non-recursive code greatly simplifies equational reasoning. For example, we can easily prove that we can optimize map id l to l:map id l-- Inline: map f l = \cons nil -> l (\a x -> cons (f a) x) nil= \cons nil -> l (\a x -> cons (id a) x) nil-- Inline: id x = x= \cons nil -> l (\a x -> cons a x) nil-- Eta-reduce= \cons nil -> l cons nil-- Eta-reduce= lNote that we did not need to use induction to prove this optimization because map is no longer recursive. The optimization became downright trivial, so trivial that we can automate it!Morte optimizes programs using this same simple scheme:Beta-reduce everything (equivalent to inlining)Eta-reduce everythingTo illustrate this, I will desugar our high-level Haskell code to the calculus of constructions. This desugaring process is currently manual (and tedious), but I plan to automate this, too, by providing a front-end high-level language similar to Haskell that compiles to Morte:-- mapid.mt( \(List : * -> *)-> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b )-> \(id : forall (a : *) -> a -> a) -> \(a : *) -> map a a (id a))-- List(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)-- map( \(a : *)-> \(b : *)-> \(f : a -> b)-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)-> \(x : *)-> \(Cons : b -> x -> x)-> \(Nil: x)-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil)-- id(\(a : *) -> \(va : a) -> va)This line of code is the "business end" of the program:\(a : *) -> map a a (id a)The extra 'a' business is because in any polymorphic lambda calculus you explicitly accept polymorphic types as arguments and specialize functions by applying them to types. Higher-level functional languages like Haskell or ML use type inference to automatically infer and supply type arguments when possible.We can compile this program using the morte executable, which accepts a Morte program on stdin, outputs the program's type stderr, and outputs the optimized program on stdout:$ morte < id.mt∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a → x → x) → x → xλ(a : *) → λ(l : ∀(x : *) → (a → x → x) → x → x) → lThe first line is the type, which is a desugared form of:forall a . List a -> List aThe second line is the program, which is the identity function on lists. Morte optimized away the map completely, the same way we did by hand.Morte also optimized away the rest of the code, too. Dead-code elimination is just an emergent property of Morte's simple optimization scheme.EqualityWe could double-check our answer by asking Morte to optimize the identity function on lists:-- idlist.mt( \(List : * -> *)-> \(id : forall (a : *) -> a -> a) -> \(a : *) -> id (List a))-- List(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)-- id(\(a : *) -> \(va : a) -> va)Sure enough, Morte outputs an alpha-equivalent result (meaning the same up to variable renaming):$ ~/.cabal/bin/morte < idlist.mt∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a → x → x) → x → xλ(a : *) → λ(va : ∀(x : *) → (a → x → x) → x → x) → vaWe can even use the morte library to mechanically check if two Morte expressions are alpha-, beta-, and eta- equivalent. We can parse our two Morte files into Morte's Expr type and then use the Eq instance for Expr to test for equivalence:$ ghciPrelude> import qualified Data.Text.Lazy.IO as TextPrelude Text> txt1 <- Text.readFile "mapid.mt"Prelude Text> txt2 <- Text.readFile "idlist.mt"Prelude Text> import Morte.Parser (exprFromText)Prelude Text Morte.Parser> let e1 = exprFromText txt1Prelude Text Morte.Parser> let e2 = exprFromText txt2Prelude Text Morte.Parser> import Control.Applicative (liftA2)Prelude Text Morte.Parser Control.Applicative> liftA2 (==) e1 e2Right True$ -- `Right` means both expressions parsed successfully$ -- `True` means they are alpha-, beta-, and eta-equivalentWe can use this to mechanically verify that two Morte programs optimize to the same result.Compile-time computationMorte can compute as much (or as little) at compile as you want. The more information you encode directly within lambda calculus, the more compile-time computation Morte will perform for you. For example, if we translate our Haskell List code entirely to lambda calculus, then Morte will statically compute the result at compile time.-- nine.mt( \(Nat : *)-> \(zero : Nat)-> \(one : Nat)-> \((+) : Nat -> Nat -> Nat)-> \((*) : Nat -> Nat -> Nat)-> \(List : * -> *)-> \(Cons : forall (a : *) -> a -> List a -> List a)-> \(Nil : forall (a : *) -> List a)-> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b )-> \( foldr : forall (a : *) -> List a -> forall (r : *) -> (a -> r -> r) -> r -> r )-> ( \(two : Nat) -> \(three : Nat) -> ( \(example : List Nat) -> foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero ) -- example (Cons Nat one (Cons Nat two (Cons Nat three (Nil Nat)))) ) -- two ((+) one one) -- three ((+) one ((+) one one)))-- Nat( forall (a : *)-> (a -> a)-> a-> a)-- zero( \(a : *)-> \(Succ : a -> a)-> \(Zero : a)-> Zero)-- one( \(a : *)-> \(Succ : a -> a)-> \(Zero : a)-> Succ Zero)-- (+)( \(m : forall (a : *) -> (a -> a) -> a -> a)-> \(n : forall (a : *) -> (a -> a) -> a -> a)-> \(a : *)-> \(Succ : a -> a)-> \(Zero : a)-> m a Succ (n a Succ Zero))-- (*)( \(m : forall (a : *) -> (a -> a) -> a -> a)-> \(n : forall (a : *) -> (a -> a) -> a -> a)-> \(a : *)-> \(Succ : a -> a)-> \(Zero : a)-> m a (n a Succ) Zero)-- List( \(a : *)-> forall (x : *)-> (a -> x -> x) -- Cons-> x -- Nil-> x)-- Cons( \(a : *)-> \(va : a)-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)-> \(x : *)-> \(Cons : a -> x -> x)-> \(Nil : x)-> Cons va (vas x Cons Nil))-- Nil( \(a : *)-> \(x : *)-> \(Cons : a -> x -> x)-> \(Nil : x)-> Nil)-- map( \(a : *)-> \(b : *)-> \(f : a -> b)-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)-> \(x : *)-> \(Cons : b -> x -> x)-> \(Nil: x)-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil)-- foldr( \(a : *)-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)-> vas)The relevant line is:foldr Nat (map Nat Nat ((+) one) example) Nat (+) zeroIf you remove the type-applications to Nat, this parallels our original Haskell example. We can then evaluate this expression at compile time:$ morte < nine.mt∀(a : *) → (a → a) → a → aλ(a : *) → λ(Succ : a → a) → λ(Zero : a) → Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))Morte reduces our program to a church-encoded nine.Run-time computationMorte does not force you to compute everything using lambda calculus at compile time. Suppose that we wanted to use machine arithmetic at run-time instead. We can do this by parametrizing our program on:the Int type,operations on Ints, andany integer literals we useWe accept these "foreign imports" as ordinary arguments to our program:-- foreign.mt-- Foreign imports \(Int : *) -- Foreign type-> \((+) : Int -> Int -> Int) -- Foreign function-> \((*) : Int -> Int -> Int) -- Foreign function-> \(lit@0 : Int) -- Literal "1" -- Foreign data-> \(lit@1 : Int) -- Literal "2" -- Foreign data-> \(lit@2 : Int) -- Literal "3" -- Foreign data-> \(lit@3 : Int) -- Literal "1" -- Foreign data-> \(lit@4 : Int) -- Literal "0" -- Foreign data-- The rest is compile-time lambda calculus-> ( \(List : * -> *) -> \(Cons : forall (a : *) -> a -> List a -> List a) -> \(Nil : forall (a : *) -> List a) -> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> List a -> List b ) -> \( foldr : forall (a : *) -> List a -> forall (r : *) -> (a -> r -> r) -> r -> r ) -> ( \(example : List Int) -> foldr Int (map Int Int ((+) lit@3) example) Int (+) lit@4 ) -- example (Cons Int lit@0 (Cons Int lit@1 (Cons Int lit@2 (Nil Int)))) ) -- List ( \(a : *) -> forall (x : *) -> (a -> x -> x) -- Cons -> x -- Nil -> x ) -- Cons ( \(a : *) -> \(va : a) -> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x) -> \(x : *) -> \(Cons : a -> x -> x) -> \(Nil : x) -> Cons va (vas x Cons Nil) ) -- Nil ( \(a : *) -> \(x : *) -> \(Cons : a -> x -> x) -> \(Nil : x) -> Nil ) -- map ( \(a : *) -> \(b : *) -> \(f : a -> b) -> \(l : forall (x : *) -> (a -> x -> x) -> x -> x) -> \(x : *) -> \(Cons : b -> x -> x) -> \(Nil: x) -> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil ) -- foldr ( \(a : *) -> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x) -> vas )We can use Morte to optimize the above program and Morte will reduce the program to nothing but foreign types, operations, and values:$ morte < foreign.mt∀(Int : *) → (Int → Int → Int) → (Int → Int → Int) → Int →Int → Int → Int → Int → Intλ(Int : *) → λ((+) : Int → Int → Int) → λ((*) : Int → Int → Int) → λ(lit : Int) → λ(lit@1 : Int) → λ(lit@2 : Int) → λ(lit@3 : Int) → λ(lit@4 : Int) → (+) ((+) lit@3 lit) ((+) ((+) lit@3 lit@1) ((+) ((+) lit@3 lit@2) lit@4))If you study that closely, Morte adds lit@3 (the "1" literal) to each literal of the list and then adds them up. We can then pass this foreign syntax tree to our machine arithmetic backend to transform those foreign operations to efficient operations.Morte lets you choose how much information you want to encode within lambda calculus. The more information you encode in lambda calculus the more Morte can optimize your program, but the slower your compile times will get, so it's a tradeoff.CorecursionCorecursion is the dual of recursion. Where recursion works on finite data types, corecursion works on potentially infinite data types. An example would be the following infinite Stream in Haskell:data Stream a = Cons a (Stream a)numbers :: Stream Intnumbers = go 0 where go n = Cons n (go (n + 1))-- numbers = Cons 0 (Cons 1 (Cons 2 (...map :: (a -> b) -> Stream a -> Stream bmap f (Cons a l) = Cons (f a) (map f l)example :: Stream Intexample = map (+ 1) numbers-- example = Cons 1 (Cons 2 (Cons 3 (...Again, pretend that we disable any function from referencing itself so that the above code becomes invalid. This time we cannot reuse the same trick from previous sections because we cannot encode numbers as a fold without referencing itself. Try this if you don't believe me.However, we can still encode corecursive things in a non-corecursive way. This time, we encode our Stream type as an unfold instead of a fold:-- This is also valid Haskell code{-# LANGUAGE ExistentialQuantification #-}data Stream a = forall s . MkStream { seed :: s , step :: s -> (a, s) }numbers :: Stream Intnumbers = MkStream 0 (\n -> (n, n + 1))map :: (a -> b) -> Stream a -> Stream bmap f (MkStream s0 k) = MkStream s0 k' where k' s = (f a, s') where (a, s') = k s In other words, we store an initial seed of some type s and a step function of type s -> (a, s) that emits one element of our Stream. The type of our seed s can be anything and in our numbers example, the type of the internal state is Int. Another stream could use a completely different internal state of type (), like this:-- ones = Cons 1 onesones :: Stream Intones = MkStream () (\_ -> (1, ()))The general name for this trick is an "F-coalgebra" encoding of a corecursive type.Once we encode our infinite stream non-recursively, we can safely optimize the stream by inlining and eta reduction:map id l-- l = MkStream s0 k= map id (MkStream s0 k)-- Inline definition of `map`= MkStream s0 k' where k' = \s -> (id a, s') where (a, s') = k s-- Inline definition of `id`= MkStream s0 k' where k' = \s -> (a, s') where (a, s') = k s-- Inline: (a, s') = k s= MkStream s0 k' where k' = \s -> k s-- Eta reduce= MkStream s0 k' where k' = k-- Inline: k' = k= MkStream s0 k-- l = MkStream s0 k= lNow let's encode Stream and map in Morte and compile the following four expressions:map ididmap f . map gmap (f . g)Save the following Morte file to stream.mt and then uncomment the expression you want to test:( \(id : forall (a : *) -> a -> a)-> \( (.) : forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> (a -> c) )-> \(Pair : * -> * -> *)-> \(P : forall (a : *) -> forall (b : *) -> a -> b -> Pair a b)-> \( first : forall (a : *) -> forall (b : *) -> forall (c : *) -> (a -> b) -> Pair a c -> Pair b c )-> ( \(Stream : * -> *) -> \( map : forall (a : *) -> forall (b : *) -> (a -> b) -> Stream a -> Stream b ) -- example@1 = example@2 -> ( \(example@1 : forall (a : *) -> Stream a -> Stream a) -> \(example@2 : forall (a : *) -> Stream a -> Stream a) -- example@3 = example@4 -> \( example@3 : forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> Stream a -> Stream c ) -> \( example@4 : forall (a : *) -> forall (b : *) -> forall (c : *) -> (b -> c) -> (a -> b) -> Stream a -> Stream c ) -- Uncomment the example you want to test -> example@1-- -> example@2-- -> example@3-- -> example@4 ) -- example@1 (\(a : *) -> map a a (id a)) -- example@2 (\(a : *) -> id (Stream a)) -- example@3 ( \(a : *) -> \(b : *) -> \(c : *) -> \(f : b -> c) -> \(g : a -> b) -> map a c ((.) a b c f g) ) -- example@4 ( \(a : *) -> \(b : *) -> \(c : *) -> \(f : b -> c) -> \(g : a -> b) -> (.) (Stream a) (Stream b) (Stream c) (map b c f) (map a b g) ) ) -- Stream ( \(a : *) -> forall (x : *) -> (forall (s : *) -> s -> (s -> Pair a s) -> x) -> x ) -- map ( \(a : *) -> \(b : *) -> \(f : a -> b) -> \( st : forall (x : *) -> (forall (s : *) -> s -> (s -> Pair a s) -> x) -> x ) -> \(x : *) -> \(S : forall (s : *) -> s -> (s -> Pair b s) -> x) -> st x ( \(s : *) -> \(seed : s) -> \(step : s -> Pair a s) -> S s seed (\(seed@1 : s) -> first a b s f (step seed@1)) ) ))-- id(\(a : *) -> \(va : a) -> va)-- (.)( \(a : *)-> \(b : *)-> \(c : *)-> \(f : b -> c)-> \(g : a -> b)-> \(va : a)-> f (g va))-- Pair(\(a : *) -> \(b : *) -> forall (x : *) -> (a -> b -> x) -> x)-- P( \(a : *)-> \(b : *)-> \(va : a)-> \(vb : b)-> \(x : *)-> \(P : a -> b -> x)-> P va vb)-- first( \(a : *)-> \(b : *)-> \(c : *)-> \(f : a -> b)-> \(p : forall (x : *) -> (a -> c -> x) -> x)-> \(x : *)-> \(Pair : b -> c -> x)-> p x (\(va : a) -> \(vc : c) -> Pair (f va) vc))Both example@1 and example@2 will generate alpha-equivalent code:$ morte < example1.mt∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → xλ(a : *) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → st$ morte < example2.mt∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → xλ(a : *) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → vaSimilarly, example@3 and example@4 will generate alpha-equivalent code:$ morte < example3.mt∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → xλ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → st x (λ(s : *) → λ(seed : s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x (λ(va : a) → Pair (f (g va)))))$ morte < example4.mt∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → xλ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀(x : *) → (c → s → x) → x) → x) → va x (λ(s : *) → λ(seed : s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x (λ(va : a) → Pair (f (g va))))We inadvertently proved stream fusion for free, but we're still not done, yet! Everything we learn about recursive and corecursive sequences can be applied to model recursive and corecursive effects!EffectsI will conclude this post by showing how to model both recursive and corecursive programs that have side effects. The recursive program will echo ninety-nine lines from stdin to stdout. The equivalent Haskell program is in the comment header:-- recursive.mt-- The Haskell code we will translate to Morte:---- import Prelude hiding (-- (+), (*), IO, putStrLn, getLine, (>>=), (>>), return )-- -- -- Simple prelude---- data Nat = Succ Nat | Zero---- zero :: Nat-- zero = Zero---- one :: Nat-- one = Succ Zero---- (+) :: Nat -> Nat -> Nat-- Zero + n = n-- Succ m + n = m + Succ n---- (*) :: Nat -> Nat -> Nat-- Zero * n = Zero-- Succ m * n = n + (m * n)---- foldNat :: Nat -> (a -> a) -> a -> a-- foldNat Zero f x = x-- foldNat (Succ m) f x = f (foldNat m f x)---- data IO r-- = PutStrLn String (IO r)-- | GetLine (String -> IO r)-- | Return r---- putStrLn :: String -> IO U-- putStrLn str = PutStrLn str (Return Unit)---- getLine :: IO String-- getLine = GetLine Return---- return :: a -> IO a-- return = Return---- (>>=) :: IO a -> (a -> IO b) -> IO b-- PutStrLn str io >>= f = PutStrLn str (io >>= f)-- GetLine k >>= f = GetLine (\str -> k str >>= f)-- Return r >>= f = f r---- -- Derived functions---- (>>) :: IO U -> IO U -> IO U-- m >> n = m >>= \_ -> n---- two :: Nat-- two = one + one---- three :: Nat-- three = one + one + one---- four :: Nat-- four = one + one + one + one---- five :: Nat-- five = one + one + one + one + one---- six :: Nat-- six = one + one + one + one + one + one---- seven :: Nat-- seven = one + one + one + one + one + one + one---- eight :: Nat-- eight = one + one + one + one + one + one + one + one---- nine :: Nat-- nine = one + one + one + one + one + one + one + one + one---- ten :: Nat-- ten = one + one + one + one + one + one + one + one + one + one---- replicateM_ :: Nat -> IO U -> IO U-- replicateM_ n io = foldNat n (io >>) (return Unit)---- ninetynine :: Nat-- ninetynine = nine * ten + nine---- main_ :: IO U-- main_ = getLine >>= putStrLn-- "Free" variables( \(String : * )-> \(U : *)-> \(Unit : U) -- Simple prelude-> ( \(Nat : *) -> \(zero : Nat) -> \(one : Nat) -> \((+) : Nat -> Nat -> Nat) -> \((*) : Nat -> Nat -> Nat) -> \(foldNat : Nat -> forall (a : *) -> (a -> a) -> a -> a) -> \(IO : * -> *) -> \(return : forall (a : *) -> a -> IO a) -> \((>>=) : forall (a : *) -> forall (b : *) -> IO a -> (a -> IO b) -> IO b ) -> \(putStrLn : String -> IO U) -> \(getLine : IO String) -- Derived functions -> ( \((>>) : IO U -> IO U -> IO U) -> \(two : Nat) -> \(three : Nat) -> \(four : Nat) -> \(five : Nat) -> \(six : Nat) -> \(seven : Nat) -> \(eight : Nat) -> \(nine : Nat) -> \(ten : Nat) -> ( \(replicateM_ : Nat -> IO U -> IO U) -> \(ninetynine : Nat) -> replicateM_ ninetynine ((>>=) String U getLine putStrLn) ) -- replicateM_ ( \(n : Nat) -> \(io : IO U) -> foldNat n (IO U) ((>>) io) (return U Unit) ) -- ninetynine ((+) ((*) nine ten) nine) ) -- (>>) ( \(m : IO U) -> \(n : IO U) -> (>>=) U U m (\(_ : U) -> n) ) -- two ((+) one one) -- three ((+) one ((+) one one)) -- four ((+) one ((+) one ((+) one one))) -- five ((+) one ((+) one ((+) one ((+) one one)))) -- six ((+) one ((+) one ((+) one ((+) one ((+) one one))))) -- seven ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))) -- eight ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))) -- nine ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))) -- ten ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))))) ) -- Nat ( forall (a : *) -> (a -> a) -> a -> a ) -- zero ( \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> Zero ) -- one ( \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> Succ Zero ) -- (+) ( \(m : forall (a : *) -> (a -> a) -> a -> a) -> \(n : forall (a : *) -> (a -> a) -> a -> a) -> \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> m a Succ (n a Succ Zero) ) -- (*) ( \(m : forall (a : *) -> (a -> a) -> a -> a) -> \(n : forall (a : *) -> (a -> a) -> a -> a) -> \(a : *) -> \(Succ : a -> a) -> \(Zero : a) -> m a (n a Succ) Zero ) -- foldNat ( \(n : forall (a : *) -> (a -> a) -> a -> a) -> n ) -- IO ( \(r : *) -> forall (x : *) -> (String -> x -> x) -> ((String -> x) -> x) -> (r -> x) -> x ) -- return ( \(a : *) -> \(va : a) -> \(x : *) -> \(PutStrLn : String -> x -> x) -> \(GetLine : (String -> x) -> x) -> \(Return : a -> x) -> Return va ) -- (>>=) ( \(a : *) -> \(b : *) -> \(m : forall (x : *) -> (String -> x -> x) -> ((String -> x) -> x) -> (a -> x) -> x ) -> \(f : a -> forall (x : *) -> (String -> x -> x) -> ((String -> x) -> x) -> (b -> x) -> x ) -> \(x : *) -> \(PutStrLn : String -> x -> x) -> \(GetLine : (String -> x) -> x) -> \(Return : b -> x) -> m x PutStrLn GetLine (\(va : a) -> f va x PutStrLn GetLine Return) ) -- putStrLn ( \(str : String) -> \(x : *) -> \(PutStrLn : String -> x -> x ) -> \(GetLine : (String -> x) -> x) -> \(Return : U -> x) -> PutStrLn str (Return Unit) ) -- getLine ( \(x : *) -> \(PutStrLn : String -> x -> x ) -> \(GetLine : (String -> x) -> x) -> \(Return : String -> x) -> GetLine Return ))This program will compile to a completely unrolled read-write loop, as most recursive programs will:$ morte < recursive.mt∀(String : *) → ∀(U : *) → U → ∀(x : *) → (String → x → x) → ((String → x) → x) → (U → x) → xλ(String : *) → λ(U : *) → λ(Unit : U) → λ(x : *) → λ(PutStrLn : String → x → x) → λ(GetLine : (String → x) → x) → λ(Return : U → x) → GetLine (λ(va : String) → PutStrLn va (GetLine (λ(va@1 : String) → PutStrLn va@1 (GetLine (λ(va@2 : String) → PutStrLn va@2 (GetLine (λ(va@3 : String) → PutStrLn ... ... GetLine (λ(va@92 : String) → PutStrLn va@92 (GetLine (λ(va@93 : String) → PutStrLn va@93 (GetLine (λ(va@94 : String) → PutStrLn va@94 (GetLine (λ(va@95 : String) → PutStrLn va@95 (GetLine (λ(va@96 : String) → PutStrLn va@96 (GetLine (λ(va@97 : String) → PutStrLn va@97 (GetLine (λ(va@98 : String) → PutStrLn va@98 (Return Unit))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))In contrast, if we encode the effects corecursively we can express a program that echoes indefinitely from stdin to stdout:-- corecursive.mt-- data IOF r s-- = PutStrLn String s-- | GetLine (String -> s)-- | Return r---- data IO r = forall s . MkIO s (s -> IOF r s)---- main = MkIO-- Nothing-- (maybe (\str -> PutStrLn str Nothing) (GetLine Just))( \(String : *)-> ( \(Maybe : * -> *) -> \(Just : forall (a : *) -> a -> Maybe a) -> \(Nothing : forall (a : *) -> Maybe a) -> \( maybe : forall (a : *) -> Maybe a -> forall (x : *) -> (a -> x) -> x -> x ) -> \(IOF : * -> * -> *) -> \( PutStrLn : forall (r : *) -> forall (s : *) -> String -> s -> IOF r s ) -> \( GetLine : forall (r : *) -> forall (s : *) -> (String -> s) -> IOF r s ) -> \( Return : forall (r : *) -> forall (s : *) -> r -> IOF r s ) -> ( \(IO : * -> *) -> \( MkIO : forall (r : *) -> forall (s : *) -> s -> (s -> IOF r s) -> IO r ) -> ( \(main : forall (r : *) -> IO r) -> main ) -- main ( \(r : *) -> MkIO r (Maybe String) (Nothing String) ( \(m : Maybe String) -> maybe String m (IOF r (Maybe String)) (\(str : String) -> PutStrLn r (Maybe String) str (Nothing String) ) (GetLine r (Maybe String) (Just String)) ) ) ) -- IO ( \(r : *) -> forall (x : *) -> (forall (s : *) -> s -> (s -> IOF r s) -> x) -> x ) -- MkIO ( \(r : *) -> \(s : *) -> \(seed : s) -> \(step : s -> IOF r s) -> \(x : *) -> \(k : forall (s : *) -> s -> (s -> IOF r s) -> x) -> k s seed step ) ) -- Maybe (\(a : *) -> forall (x : *) -> (a -> x) -> x -> x) -- Just ( \(a : *) -> \(va : a) -> \(x : *) -> \(Just : a -> x) -> \(Nothing : x) -> Just va ) -- Nothing ( \(a : *) -> \(x : *) -> \(Just : a -> x) -> \(Nothing : x) -> Nothing ) -- maybe ( \(a : *) -> \(m : forall (x : *) -> (a -> x) -> x-> x) -> m ) -- IOF ( \(r : *) -> \(s : *) -> forall (x : *) -> (String -> s -> x) -> ((String -> s) -> x) -> (r -> x) -> x ) -- PutStrLn ( \(r : *) -> \(s : *) -> \(str : String) -> \(vs : s) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> PutStrLn str vs ) -- GetLine ( \(r : *) -> \(s : *) -> \(k : String -> s) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> GetLine k ) -- Return ( \(r : *) -> \(s : *) -> \(vr : r) -> \(x : *) -> \(PutStrLn : String -> s -> x) -> \(GetLine : (String -> s) -> x) -> \(Return : r -> x) -> Return vr ))This compiles to a state machine that we can unfold one step at a time:$ morte < corecursive.mt∀(String : *) → ∀(r : *) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r → x) → x) → x) → xλ(String : *) → λ(r : *) → λ(x : *) → λ(k : ∀(s : *) → s → (s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r → x) → x) → x) → k (∀(x : *) → (String → x) → x → x) (λ(x : *) → λ(Just : String → x) → λ(Nothing : x) → Nothing) (λ(m : ∀(x : *) → (String → x) → x → x) → m (∀(x : *) → (String → (∀(x : *) → (String → x) → x → x) → x) → ((String → ∀(x : *) → (String → x) → x → x) → x) → (r → x) → x) (λ(str : String) → λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x → x) → x) → λ(Return : r → x) → PutStrLn str (λ(x : *) → λ(Just : String → x) → λ(Nothing : x) → Nothing)) (λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x → x) → x) → λ(Return : r → x) → GetLine (λ(va : String) → λ(x : *) → λ(Just : String → x) → λ(Nothing : x) → Just va))I don't expect you to understand that output other than to know that we can translate the output to any backend that provides functions, and primitive read/write operations.ConclusionIf you would like to use Morte, you can find the library on both Github and Hackage. I also provide a Morte tutorial that you can use to learn more about the library.Morte is dependently typed in theory, but in practice I have not exercised this feature so I don't understand the implications of this. If this turns out to be a mistake then I will downgrade Morte to System Fw, which has higher-kinds and polymorphism, but no dependent types.Additionally, Morte might be usable to transmit code in a secure and typed way in distributed environment or to share code between diverse functional language by providing a common intermediate language. However, both of those scenarios require additional work, such as establishing a shared set of foreign primitives and creating Morte encoders/decoders for each target language.Also, there are additional optimizations which Morte might implement in the future. For example, Morte could use free theorems (equalities you deduce from the types) to simplify some code fragments even further, but Morte currently does not do this.My next goals are:Add a back-end to compile Morte to LLVMAdd a front-end to desugar a medium-level Haskell-like language to MorteOnce those steps are complete then Morte will be a usable intermediate language for writing super-optimizable programs.Also, if you're wondering, the name Morte is a tribute to a talking skull from the game Planescape: Torment, since the Morte library is a "bare-bones" calculus of constructions.LiteratureIf this topic interests you more, you may find the following links helpful, in roughly increasing order of difficulty:Data and Codata - A blog post by Dan Piponi introducing the notions of data and codataChurch encoding - A wikipedia article on church encoding (converting things to lambda expressions)Total Functional Programming - A paper by D. A. Turner on total programming using data and codataRecursive types for free! - An unpublished draft by Philip Wadler about F-algebras and F-coalgebrasUnderstanding F algebras - A blog post by Bartosz MilewskiBeyond Church encoding: Boehm-Berarducci isomorphism of algebraic data types and polymorphic lambda-terms - Oleg Kiselyov's collection of notes on Boehm-Berarducci encoding, a more complete version of Church encodingF-algebras and F-coalgebras - wikipedia articles that are short, sweet, and utterly impenetrable unless you already understand the subject

Categories: Functional Programming

Haskell could use some new syntax LAMBDA_RECORD_MODIFY which could be used as follows: import qualified Control.Monad.State as State; data Record { field :: Int }; ... State.modify $ LAMBDA_RECORD_MODIFY { field = ... }; which is equivalent to State.modify $ \x -> x { field = ... } but not having to name the lambda parameter "x" (twice). I suspect this is one of the things lenses are trying to do.

Categories: Functional Programming

This longish post gives Simon's reflections on the implementation of Cloud-Haskell-style static pointers and serialiation. See also StaticPointers.
Much of what is suggested here is implemented, in some form, in two existing projects
Cloud Haskell libraries distributed-static and rank1dynamic. Background in the paper Towards Haskell in the Cloud.
HdpH libraries hdph and hdph-closure. Background in the paper Implementing a high-level distributed-memory parallel Haskell in Haskell
My goal here is to identify the smallest possible extension to GHC, with
the smallest possible trusted code base, that would enable these
libraries to be written in an entirely type-safe way.
Background
Background: the trusted code base
The implementation Typeable class, and its associated functions, in
GHC offers a type-safe abstraction, in the classic sense that
"well typed programs won't go wrong". For example, we in Data.Typeable we have
cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
We expect cast to be type-safe: if cast returns a value Just x then we really do know
that x :: b. Let's remind ourselves of class Typeable:
class Typeable a where
typeRep :: proxy a -> TypeRep
(It's not quite this, but close.) The proxy a argument is
just a proxy for type argument; its value is never inspected
and you can always pass bottom.
Under the hood, cast uses typeRep to get the runtime TypeRep for
a and b, and compares them, thus:
cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
cast x = if typeRep (Proxy :: Proxy a) == typeRep (Proxy :: Proxy b)
then Just (unsafeCoerce x)
else Nothing
Although cast is written in Haskell, it uses unsafeCoerce. For it
to truly be type-safe, it must trust the Typeable instances. If the
user could write a Typeable instance, they could write a bogus one, and
defeat type safety. So only GHC is allowed write Typeable instances.
In short, cast and the Typeable instances are part of the trusted code base, or TCB:
The TCB should be as small as possible
The TCB should have a small, well-defined, statically-typed API used by client code
Client code is un-trusted; if the client code is well-typed, and the TCB is implemented correctly, nothing can go wrong
Background Typeable a and TypeRep
I'll use the Typeable a type class and values of type TypeRep more or less interchangeably. As you can see from the definition of class Typeable above, its payload is simply a constant function returning a TypeRep. So you can think of a Typeable a as simply a type-tagged version of TypeRep.
Of course, a Typeable a is a type class thing, which is hard to pass around explicitly like a value, but that is easily fixed using the "Dict Trick",
well known in Haskell folk lore:
data Dict (c :: Constraint) where
Dict :: forall c. c => Dict c
Now a value of type Dict (Typeable a) is an ordinary value that embodies a Typeable a dictionary. For example:
f :: Dict (Typeable a) -> Dict (Typeable b) -> a -> Maybe b
f Dict Dict val = cast val
The pattern-matches against the Dict constructor brings the Typeable dictionaries
into scope, so they can be used to discharge the constraint arising from the call to cast.
Background: serialisation
I'm going to assume a a type class Serialisable, something like this:
class Serialisable a where
encode :: a -> ByteString
decode :: ByteString -> Maybe (a, ByteString)
'll use "encode" and "decode" as synonyms for "serialise" and "deserialise", because the former are easier to pronounce.
Here's an interesting question: are instances of Serialisable part of the TCB? No, they are not.
Here is a tricky case:
decode (encode [True,False]) :: Maybe (Int, ByteString)
Here I have encode a [Bool] into a ByteString, and then decoded an Int from that ByteString. This may
be naughty or undesirable, but it cannot seg-fault: it is type-safe in the sense above. You can
think of it like this: a decoder is simply a parser for the bits in the ByteString, so a decoder
for (say) Int can fail to parse a full Int (returning Nothing), but it can't return a non-Int.
For the naughtiness, one could imagine that a Cloud Haskell library
might send fingerprints or TypeReps or whatnot to eliminate
potential naughtiness. But even then it is very valuable if the
type-safety of the system does not rely on the CH library. Type
safety depends only on the correctness of the (small) TCB;
naughtiness-safety might additionally depend on the correctness of the
CH library.
Background: static pointers
I'm taking for granted the basic design of the Cloud Haskell paper.
That is,
A type constructor StaticPtr :: * -> *. Intuitively, a value of type StaticPtr t is represented by a static code pointer to a value of type t. Note "code pointer" not "heap pointer". That's the point!
A language construct static , whose type is StaticPtr t if has type t.
In static , the free variables of must all be bound at top level. The implementation almost certainly works by giving a top-level definition with a new name, static34 = .
A function unStatic :: StaticPtr a -> a, to unwrap a static pointer.
Static values are serialisable. Something like instance Serialisable (StaticPtr a). (This will turn out to be not quite right.) Operationally this works by serialising the code pointer, or top-level name (e.g "Foo.static34").
All of this is built-in. It is OK for the implementation of StaticPtr to be part of the TCB.
But our goal is that no other code need be in the TCB.
A red herring. I'm not going to address the question of how to serialise a static pointer. One method would be to serialise a machine address, but that only works if the encoding and decoding ends are running identical binaries. But that's easily fixed: encode a static as the name of the static value e.g. "function foo from module M in package p". Indeed, I'll informally assume an implementation of this latter kind.
In general, I will say that what we ultimately serialise is a StaticName. You can think of a StaticName as package/module/function triple, or something like that. The implementation of StaticName is certainly not part of the client-visible API for StaticPtr; indeed, the type StaticName is not part of the API either. But it gives us useful vocabulary.
Serialising static pointers
We can see immediately that we cannot expect to have instance Serialisable (Static a),
which is what the Cloud Haskell paper proposed. If we had such an instance we would have
encodeStatic :: forall a. StaticPtr a -> ByteString
decodeStatic :: forall a. ByteString -> Maybe (StaticPtr a, ByteString)
And it's immediately apparent that decodeStatic cannot be right.
I could get a ByteString from anywhere, apply decodeStatic to it,
and thereby get a StaticPtr a. Then use
unStatic and you have a value of type a, for, for any type a!!
Plainly, what we need is (just in the case of cast) to do a dynamic typecheck, thus:
decodeStatic :: forall a. Typeable a
=> ByteString -> Maybe (StaticPtr a, ByteString)
Let's think operationally for a moment:
GHC collects all the StaticPtr values in a table, the static pointer table or SPT. Each row contains
The StaticName of the value
A pointer to closure for the value itself
A pointer to its TypeRep
decodeStatic now proceeds like this:
Parse a StaticName from the ByteString (failure => Nothing)
Look it up in table (not found => Nothing)
Compare the TypeRep passed to decodeStatic (via the Typeable a dictionary) with the one ine the table (not equal => Nothing)
Return the value
Side note. Another possibility is for decodeStatic not to take a Typeable a context but instead for unStatic to do so:: unStatic :: Typeable a => StaticPtr a -> Maybe a. But that seems a mess. Apart from anything else, it would mean that a value of type StaticPtr a might or might not point to a value of type a, so there's no point in having the type parameter in the first place. End of side note.
This design has some useful consequences that are worth calling out:
A StaticPtr is serialised simply to the StaticName; the serialised form does not need to contain a TypeRep. Indeed it would not even be type-safe to serialise a StaticPtr to a pair of a StaticName and a TypeRep, trusting that the TypeRep described the type of the named function. Why not? Think back to "Background: serialisation" above, and imagine we said
decode (encode ["wibble", "wobble"])
:: Typeable a => Maybe (StaticPtr a, ByteString)
Here we create an essentially-garbage ByteString by encoding a [String], and try to decode it. If, by chance, we successfully parse a valid StaticName and TypeRep, there is absolutely no reason to suppose that the TypeRep will describe the type of the function.
Instead, the TypeRep of the static pointer lives in the SPT, securely put there when the SPT was created. Not only is this type-safe, but it also saves bandwidth by not transmittingTypeReps.
Since clients can effectively fabricate a StaticName (by supplying decodeStatic with a bogus ByteString, a StaticName is untrusted. That gives the implementation a good deal of wiggle room for how it chooses to implement static names. Even a simple index in the range 0..N would be type-safe!
The motivation for choosing a richer representation for StaticName (eg package/module/name) is not type-safety but rather resilience to change. For example, the Haskell programs at the two ends could be quite different, provided only that they agreed about what to call the static pointers that they want to exchange.
Statics and existentials
Here is something very reasonable:
data StaticApp b where
SA :: StaticPtr (a->b) -> StaticPtr a -> StaticApp b
unStaticApp :: StaticApp a -> a
unStaticApp (SA f a) = unStatic f (unStatic a)
(We might want to add more constructors, but I'm going to focus only on SA.)
A SA is just a pair of StaticPtrs, one for a function and one for an argument. We can securely unwrap it with unStaticApp.
Now, here is the question: can we serialise StaticApps? Operationally, of course yes: to serialise a SA, just serialise the two StaticPtrs it contains, and dually for deserialisation. But, as before, deserialisation is the hard bit. We seek:
decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
But how can we write decodeSA? Here is the beginning of an attempt:
decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
= case decodeStatic bs :: Maybe (StaticPtr (a->b)) of
Nothing -> Nothing
Just (fun, bs1) -> ...
and you can immediately see that we are stuck. Type variable b is not in scope.
More concretely, we need a Typeable (a->b) to pass in to decodeStatic,
but we only have a Typeable b to hand.
What can we do? Tantalisingly, we know that if decodeStatic succeeds in parsing a static StaticName from bs then, when we look up that StaticName in the Static Pointer Table, we'll find a TypeRep for the value. So rather than passing a Typeable dictionary into decodeStatic, we'd like to get one out!
With that in mind, here is a new type signature for decodeStatic that returns
both pieces:
data DynStaticPtr where
DSP :: Typeable a => StaticPtr a -> DynStaticPtr
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)
(The name DynStaticPtr comes from the fact that this data type is extremely similar to the library definition of Dynamic.)
Operationally, decodeStaticK bs fail cont works like this;
Parse a StaticName from bs (failure => return Nothing)
Look it up in the SPT (not found => return Nothing)
Return the TypeRep and the value found in the SPT, paired up with DSP. (Indeed the SPT could contain the DynStaticPtr values directly.)
For the construction of DynStaticPtr to be type-safe, we need to know that the
TypeRep passed really is a TypeRep for the value; so the construction
of the SPT is (unsurprisingly) part of the TCB.
Now we can write decodeSA (the monad is just the Maybe monad, nothing fancy):
decodeSA :: forall b. Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
= do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
; (DSP (arg :: StaticPtr targ), bs2) <- decodeStatic bs1
-- At this point we have
-- Typeable b (from caller)
-- Typeable tfun (from first DSP)
-- Typeable targ (from second DSP)
; fun' :: StaticPtr (targ->b) <- cast fun
; return (SA fun' arg, bs2) }
The call to cast needs Typeable tfun, and Typeable (targ->b). The
former is bound by the first DSP pattern match. The latter is
constructed automatically from Typeable targ and Typeable b, both
of which we have. Bingo!
Notice that decodeSA is not part of the TCB. Clients can freely write code like decodeSA and be sure that it is type-safe.
From static pointers to closures
The original Cloud Haskell paper defines closures like this:
data Closure a where
Clo :: StaticPtr (ByteString -> a) -> ByteString -> Closure a
It is easy to define
unClo :: Closure a -> a
unClo (Clo s e) = unStatic s e
Side note on HdpH
HdpH refines the Cloud Haskell Closure in (at least) two ways. I think (but I am not certain) that this declaration captures the essence:
data Closure a where
Clo :: StaticPtr (ByteString -> a) -> Put () -> a -> Closure a
The refinements are:
The extra argument of type 'a' to avoid costs when we build a closure and then unwrap it with unClo locally, or repeatedly.
The use of Put () rather than a ByteString for the serialised environment, to avoid repeated copying when doing nested serialisation.
Both are importnat, but they are orthogonal to the discussion about static types, so I'll use the CH definition from here on.
Serialising closures
Just as in the case of StaticPtr, it is immediately clear that we
cannot expect to have
decodeClo :: ByteString -> Maybe (Closure a, ByteString)
Instead we must play the same trick, and attempt to define
data DynClosure where
DC :: Typeable a => Closure a -> DynClosure
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
But there's an immediate problem in writing decodeClo:
decodeClo bs
= do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
; (env, bs2) <- decodeByteString bs1
; return (DC (Clo fun env), bs2) } -- WRONG
This won't typecheck because DC needs Typeable a, but we only have Typeable (ByteString -> a)`.
This is Jolly Annoying. I can see three ways to make progress:
Plan A: Provide some (type-safe) way to decompose TypeReps, to get from Typeable (a->b) to Typeable b (and presumably Typeable a as well).
Plan C: Serialise a TypeRep a with every Closure a.
Plan C: Generalise StaticPtr
I like Plan C best. They are each discussed next.
Plan A: Decomposing TypeRep
At the moment, GHC provides statically-typed ways to construct and compare a TypeRep (via cast), but no way to decompose one, at least not in a type-safe way.
It is tempting to seek this function as part of the TCB:
class Typeable a where
typeRep :: proxy a -> TypeRep
decomposeTypeRep :: DecompTR a
data DecompTR a where
TRApp :: (Typeable p, Typeable q) => DecompTR (p q)
TRCon :: TyCon -> DecompTR a
This isn't a bad idea, but it does mean that Typeable a must be implemented (and presumably serialised) using a tree, whereas the current API would allow an implementation consisting only of a fingerprint.
(Thought experiment: maybe a Typeable a, and Dict (Typeable a) can be represented as a tree, but a TypeRep could be just a fingerprint?)
Plan B: serialise TypeRep with Closure
Since we need a Typeable a at the far end, we could just serialise it directly
with the Closure, like this:
encodeClo :: forall a. Typeable a => Closure a -> ByteString
encodeClo (Clo fun env)
= encodeTypeable (proxy :: a)
++ encodeStatic fun
++ encodeByteString env
Here I am assuming (as part of the TBC)
encodeTypeable :: Typeable a => proxy a -> ByteString
decodeTypeable :: ByteString -> Maybe (DynTypeable, ByteString)
data DynTypeable where
DT :: Typeable a => proxy a -> DynTypeable
which serialises a TypeRep. (Or, operationally, perhaps just its fingerprint.)
Now I think we can write decodeClo:
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
= do { (DT (_ :: Proxy a), bs1) <- decodeTypeable
; (DSP (fun :: StaticPtr tfun), bs2) <- decodeStatic bs1
; (env, bs3) <- decodeByteString bs2
; fun' :: StaticPtr (ByteString -> a) <- cast fun
; return (DC (Clo fun' env), bs2) } -- WRONG
But this too is annoying: we have to send these extra TypeReps when morally they are already sitting there in the SPT.
Plan C: Generalising StaticPtr
Our difficulty is that we are deserialising StaticPtr (ByteString -> a) but we want to be given Typeable a not Typeable (ByteString -> a). So perhaps we can decompose the type into a type constructor and type argument, like this:
data StaticPtr (f :: *->*) (a :: *)
unStatic :: StaticPtr f a -> f a
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)
data DynStaticPtr where
DS :: (Typeable f, Typeable a) => StaticPtr (f a) -> DynStaticPtr
Each row of the SPT contains:
The StaticName
The value of type f a
The Typeable f dictionary
The Typeable a dictionary
Now we can define closures thus:
data Closure a where
Clo :: StaticPtr (ByteString ->) a -> ByteString -> Closure a
and these are easy to deserialise:
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
= do { (DSP (fun :: StaticPtr f a), bs1) <- decodeStatic bs
; (env, bs2) <- decodeByteString bs1
-- Here we have Typeable f, Typeable a
; fun' :: StaticPtr (ByteString ->) a <- cast fun
-- This cast checks that f ~ (ByteString ->)
-- Needs Typeable f, Typealbe (ByteString ->)
; return (DC (Clo fun env), bs2) }
-- DC needs Typeable a
I like this a lot better, but it has knock on effects.
The old StaticPtr a is now StaticPtr Id a.
What becomes of our data type for StaticApply? Perhpas
data StaticApp f b where
SA :: StaticPtr f (a->b) -> StaticPtr f b -> StaticApp f b
unStaticApp :: Applicative => StaticApp f b -> f b
ToDo: ...I have not yet followed through all the details
Applying closures
Can we write closureApply? I'm hoping for a structure like this:
closureApply :: Closure (a->b) -> Closure a -> Closure b
closureApply fun arg = Clo (static caStatic) (fun, arg)
caStatic :: ByteString -> b -- WRONG
caStatic bs = do { ((fun,arg), bs1) <- decode bs
; return (unClo fun (unClo arg), bs1) }
This is obviously wrong. caStatic clearly cannot have that type. It would
at least need to be
caStatic :: Typeable b => ByteString -> b
and now there is the thorny question of where the Typeable b dictionary comes from.
ToDo: ...I have stopped here for now
Polymorphism and serialisation
For this section I'll revert to the un-generalised single-parameter StaticPtr.
Parametric polymorphism
Consider these definitions:
rs1 :: Static ([Int] -> [Int])
rs1 = static reverse
rs2 :: Static ([Bool] -> [Bool])
rs2 = static reverse
rs3 :: forall a. Typeable a => Static ([a] -> [a])
rs3 = static reverse
The first two are clearly fine. The SPT will get one row for each of the two monomorphic calls to reverse, one with a TypeRep of [Int] -> [Int] and one with a TypeRep of [Bool] -> [Bool].
But both will have the same code pointer, namely the code for the polymorpic reverse function. Could we share just one StaticName for all instantiations of reverse, perhaps including rs3 as well?
I think we can. The story would be this:
The SPT has a row for reverse, containing
The StaticName for reverse
A pointer to the code for reverse (or, more precisely, its static closure).
A function of type TypeRep -> TypeRep that, given the TypeRep for a returns a TypeRep for [a] -> [a].
When we serialise a StaticPtr we send
The StaticName of the (polymorphic) function
A list of the TypeReps of the type arguments of the function
The rule for static becomes this: the free term variables must all be top level, but it may have free type variables, provided they are all Typeable.
All of this is part of the TCB, of course.
Type-class polymorphism
Consider static sort where sort :: Ord a => [a] -> [a]. Can we make such a StaticPtr. After all, sort gets an implicit value argument, namely an Ord a dictionary. If that dictionary can be defined at top level, well and good, so this should be OK:
ss1 :: StaticPtr ([Int] -> [Int])
ss1 = static sort
But things go wrong as soon as you have polymorphism:
ss2 :: forall a. Ord a => StaticPtr ([a] -> [a])
ss2 = static sort -- WRONG
Now, clearly, the dictionary is a non-top-level free variable of the call to sort.
We might consider letting you write this:
ss3 :: forall a. StaticPtr (Ord a => [a] -> [a])
ss3 = static sort -- ???
so now the static wraps a function expeting a dictionary. But that edges us uncomforatbly close to impredicative types, which is known to contain many dragons.
A simpler alternative is to use the Dict Trick (see Background above):
ss4 :: forall a. StaticPtr (Dict (Ord a) -> [a] -> [a])
ss4 = static sortD
sortD :: forall a. Dict (Ord a) -> [a] -> [a]
sortD Dict xs = sort xs
Now, at the call side, when we unwrap the StaticPtr, we need to supply an explicit Ord dictionary, like this:
...(unStatic ss4 Dict)....
For now, I propose to deal with type classes via the Dict Trick, which is entirely end-user programmable, leaving only parametric polymorphism for built-in support.
2014-09-11T13:34:23Z
simonpj

Categories: Functional Programming

I've heard that my previous blog
post has
caused a bit of confusion, as sarcasm doesn't really come across in text very
well. So let me elaborate (and of course, in the process, kill the joke):Some years back, Erik found a case that was quite difficult to implement using
enumerator. After we cracked our heads on it for long enough, some of us (I
don't actually remember who was involved) decided to work on a new streaming
library. That library ended up being called conduit (thanks to Yitz for the
naming idea). It turns out that most people are unaware of that history, so
when at ICFP, I casually mentioned that Erik was the cause of conduit coming
into existence, some people were surprised. Erik jokingly chastised me for
not giving him enough credit. In response, I decided to write an over-the-top
post giving Erik all credit for conduit. I say over the top, since I made
it seem like there was some large amount of blame being heaped on as well.So to be completely clear:Erik and I are good friends, and this was just a bit of an inside joke turned public.No one has said anything offensive to me at all about conduit. There are obviously differing opinions out there about the best library for a job, but there's nothing offensive about it, just healthy discussion around a complicated topic. My purpose in making a big deal about it was not to express frustration at anyone attacking me, but rather to just play up the joke a bit more.My apologies to anyone who was confused, upset, or worried by the previous
post, it was completely unintentional.

Categories: Functional Programming

FP Complete is looking to expand its Haskell development team. We’re looking
for a Haskeller with a strong background in web UI development. This position
will encompass both work on our core products- such as FP Haskell Center and
School of Haskell- as well as helping customers develop frontends to their
Haskell applications.We will want you to start right away. The will be a contractor position, full
time for at least 3 months, with the intention to continue long-term on a more
or less full-time basis. Additionally, while the main focus of the position
will be UI development, there will be many opportunities to expand into other
areas of focus. This is a telecommute position: you can work from home or wherever you choose,
with little or no travel. Location in North America is ideal; you will work
with colleagues who are on North American and European hours.Skills required:Strong Haskell coding skills.Experience with creating HTML/CSS/Javascript web applications (fat clients a plus).Ideally: experience with both Yesod and Fay for server and client side coding, respectively. (Perk: you’ll get a chance to work with the authors of both tools.)Experience deploying applications into production, especially at large scale, is a plus.Ability to interact with a distributed development team, and to manage your time without an in-person supervisorAbility to work with clients on gathering requirementsGeneral source control/project skills: Git, issue trackingAbility to communicate clearly in issues, bug reports and emailsProficient on a Linux systemPlus: experience with deployment, Docker, and/or CoreOSPlease send resume or CV to
michael@fpcomplete.com. Any existing work-
either a running website or an open source codebase- which you can include
links to will be greatly appreciated as well.

Categories: Functional Programming

George Monbiot writes stirring stuff on indy. Well-reasoned and well-documented.Imagine the question posed the other way round. An independent nation is asked to decide whether to surrender its sovereignty to a larger union. It would be allowed a measure of autonomy, but key aspects of its governance would be handed to another nation. It would be used as a military base by the dominant power and yoked to an economy over which it had no control. ... Most nations faced even with such catastrophes choose to retain their independence – in fact, will fight to preserve it – rather than surrender to a dominant foreign power....The fears the no campaigners have worked so hard to stoke are – by comparison with what the Scots are being asked to lose – mere shadows. As Adam Ramsay points out in his treatise Forty-Two Reasons to Support Scottish Independence, there are plenty of nations smaller than Scotland that possess their own currencies and thrive. Most of the world’s prosperous nations are small: there are no inherent disadvantages to downsizing.Remaining in the UK carries as much risk and uncertainty as leaving. England’s housing bubble could blow at any time. We might leave the European Union....How is the argument altered by the fact that Scotland is considering whether to gain independence rather than whether to lose it? It’s not. Those who would vote no – now, a new poll suggests, a rapidly diminishing majority – could be suffering from system justification.System justification is defined as the “process by which existing social arrangements are legitimised, even at the expense of personal and group interest”. It consists of a desire to defend the status quo, regardless of its impacts. It has been demonstrated in a large body of experimental work, which has produced the following surprising results.System justification becomes stronger when social and economic inequality is more extreme. This is because people try to rationalise their disadvantage by seeking legitimate reasons for their position. In some cases disadvantaged people are more likely than the privileged to support the status quo. One study found that US citizens on low incomes were more likely than those on high incomes to believe that economic inequality is legitimate and necessary.It explains why women in experimental studies pay themselves less than men, why people in low-status jobs believe their work is worth less than those in high-status jobs, even when they’re performing the same task, and why people accept domination by another group. It might help to explain why so many people in Scotland are inclined to vote no....To deny this to yourself, to remain subject to the whims of a distant and uncaring elite, to succumb to the bleak, deferential negativity of the no campaign, to accept other people’s myths in place of your own story: that would be an astonishing act of self-repudiation and self-harm. Consider yourselves independent and work backwards from there; then ask why you would sacrifice that freedom.Scots voting no to independence would be an astonishing act of self-harmA yes vote in Scotland would unleash the most dangerous thing of all - hope(Monbiot is also the author of Heat, my favorite book on climate change.)

Categories: Functional Programming

Polymorphism in Haskell vs C++
posted on 2014-09-10
by Jonathan Dugan
Parametric polymorphism is when you write one function that works on many data types. In C++, this is pretty confusing, but it’s really easy in Haskell. Let’s take a look at an example.
Let’s say we want a function that calculates the volume of a box. In C++, we’d use templates so that our function works with any numeric type:
template
T boxVolume(T length, T width, T height)
{
return length * width * height;
}
Templates have an awkward syntax, but that isn’t too much of a hassle. C++ has much bigger problems. What if in the course of writing your program, you accidentally pass in some strings to this function?
int main()
{
cout << boxVolume("oops","no","strings") << endl;
}
We get this error when we compile with g++:
test.cpp: In instantiation of _T boxVolume(T, T, T) [with T = const char*]_:
test.cpp:22:47: required from here
test.cpp:8:19: error: invalid operands of types _const char*_ and _const char*_ to binary
_operator*_
return length * width * height;
This error message is a little hard to understand because of the templates. If we had written our function to use doubles instead of templates:
double boxVolume(double length, double width, double height)
{
return length * width * height;
}
We would get this simpler error message:
test.cpp: In function _int main()_:
test.cpp:22:47: error: cannot convert _const char*_ to _double_ for argument _1_ to _double
boxVolume(double, double, double)_
cout << boxVolume("oops","nope","bad!") << endl;
We see that this error is shorter and easier to use, as it clearly tells us we cannot pass string literals to our function. Plus there is no superfluous comment about our “instantiation” of boxVolume.
Now let’s try to write a polymorphic boxVolume in Haskell:
boxVolume :: a -> a -> a -> a
boxVolume length width height = length * width * height
When we try to compile, we get the error:
test.hs:2:50:
No instance for (Num a) arising from a use of `*'
Possible fix:
add (Num a) to the context of
the type signature for boxVolume :: a -> a -> a -> a
In the expression: length * width * height
In an equation for `boxVolume':
boxVolume length width height = length * width * height
Uh-oh! An error message! What went wrong? It says that we tried to use the * operator without declaring our parameters as an instance of the Num type class.
But what is a type class? This leads us to ad hoc polymorphism, also known as function overloading. Ad hoc polymorphism is when a function can be applied to different argument types, each with a different implementation. For example, the STL classes stack and queue each have their own push and pop functions, which, although they have the same names, do different things:
stack s;
queue q;
s.push(1); q.push(1);
s.push(2); q.push(2);
s.push(3); q.push(3);
s.pop(); q.pop();
After the above code is executed, the stack s will be left with the numbers 1,2 while the queue q will be left with the numbers 2,3. The function pop behaves differently on stacks and queues: calling pop on a stack removes the item added last, while calling pop on a queue removes the item added first.
Haskell does not support function overloading, except through type classes. For example, if we were to specifically declare our own Stack and Queue classes with push and pop functions:
data Stack = Stack [Int] deriving Show
data Queue = Queue [Int] deriving Show
push :: Stack -> Int -> Stack
push (Stack xs) x = Stack (x:xs)
pop :: Stack -> Stack
pop (Stack []) = Stack []
pop (Stack xs) = Stack (tail xs)
push :: Queue -> Int -> Queue
push (Queue xs) x = Queue (x:xs)
pop :: Queue -> Queue
pop (Queue []) = Queue []
pop (Queue xs) = Queue (init xs)
It results in a compiler error:
stack.hs:11:1:
Duplicate type signatures for `push'
at stack.hs:4:1-4
stack.hs:11:1-4
stack.hs:12:1:
Multiple declarations of `push'
Declared at: stack.hs:5:1
stack.hs:12:1
stack.hs:14:1:
Duplicate type signatures for `pop'
at stack.hs:7:1-3
stack.hs:14:1-3
stack.hs:15:1:
Multiple declarations of `pop'
Declared at: stack.hs:8:1
stack.hs:15:1
Changing the names of our push and pop functions to, say, stackPush, stackPop, queuePush, and queuePop would let the program compile.
A more generic way, however, is to create a type class. Let’s make a Sequence type class that implements our push and pop functions.
class Sequence s where
push :: s -> Int -> s
pop :: s -> s
This type class declaration says that any data type that is an instance of this Sequence type class can use the push and pop operations, or, in other words, can add and remove an Int. By making our Stack and Queue instances of the Sequence type class, both data types can have their own implementations of the push and pop functions!
instance Sequence Stack where
push (Stack xs) x = Stack (x:xs)
pop (Stack []) = Stack []
pop (Stack xs) = Stack (tail xs)
instance Sequence Queue where
push (Queue xs) x = Queue (x:xs)
pop (Queue []) = Queue []
pop (Queue xs) = Queue (init xs)
Replacing our function definitions with these instantiations of the Sequence type class lets our program compile.
Type classes are also an important part of using templates in function definitions. In our function boxVolume, we got an error because we tried to use the * operation without declaring the type variable a as an instance of the Num type class. The Num type class is basically for anything that acts like a number, such as Int, Float, and Double, and it lets you use the common operations of +, -, and *.
Let’s change our function to declare that a is a Num:
boxVolume :: (Num a) => a -> a -> a -> a
boxVolume length width height = length * width * height
This is called adding a class constraint. Whenever we want to declare a template function that relies on other functions, we have to add a class constraint that tells both the user and the compiler which types of data can be put into the function.
If we were to call boxVolume on strings, we would get this simple error message:
ghci> boxVolume "a" "b" "c"
:14:1:
No instance for (Num [Char]) arising from a use of `boxVolume'
Possible fix: add an instance declaration for (Num [Char])
In the expression: boxVolume "a" "b" "c"
In an equation for `it': it = boxVolume "a" "b" "c"
The compiler tells us it can’t evaluate this function because strings aren’t numbers! If we really wanted to, we could make String an instance of the Num type class, and then this function would work! (Of course, why you would want to do that is beyond me.) That’s the power of parametric polymorphism combined with type classes.
So there you have it. In C++, although we can easily implement ad hoc polymorphism through function overloading, parametric polymorphism is a tricky beast. This is made easier in Haskell, especially with the use of type classes. Type classes guarantee that data passed in to functions will work, and guide the user into what they can pass into a function. Use type classes to your advantage when you next write a Haskell program!
2014-09-10T00:00:00Z
2014-09-10T00:00:00Z

Categories: Functional Programming

Getting started with GitHub, vim, and bash
posted on 2014-09-07
by Rashid Goshtasbi and Kyler Rynear
Learning to use git, vim, and bash was hard for us. These tools are so different than the tools we used when we first learned to program. And they’re confusing! But our professor made us use them… and eventually… after we learned the tools… we discovered that we really like them! So we’ve put together a simple video guide to help you learn and enjoy these tools too. We did this as part of the CS100 open source software development class at UC Riverside.
Click here to watch the full playlist on YouTube.
Getting Started with GitHub
This video shows you step by step how to create an account on GitHub. Then we see how to create our first repository called test, and transfer it from GitHub onto our local machine using the git clone command.
Creating a file, pushing to GitHub, and pulling from GitHub
How do we create files and upload them to GitHub? The touch command will create an empty file for you. The vim command will open a file in an advanced text editor that we talk about farther down the page. The git push command sends these files from your local machine up to GitHub, and the git pull command downloads files from GitHub and saves them to your local computer.
Branches
Branches let you work on files without messing up your original code. When you finish your changes, you can merge them into the master branch. This is the best part of version control.
Tags
Most programs have different versions, for example: 1.0, 1.1, 1.2, 2.1 and 2.2.1. The git tag command let’s you create these versions. They’re just like a checkpoint in a Mario game!
Forking & Pull Requests
Let’s say you want to contribute to an open source project, but you don’t have permission. In order to contribute to someone else’s repository, you must first “fork” it to create a repo that you do have push permission on. Then you issue a pull request through the GitHub website. This tells the owner of the original repo that you’ve made some changes they can incorporate.
The README.md file
README.md files are how you document your projects. The README.md should explain your program, give installation instructions, and list known bugs. Basically, it explains to someone else who has absolutely no idea what your program does or how to code, but it enables the user to understand the concepts and basic directions to execute your program. The .md extension at the end of the filename indicates that the file uses markdown formatting. This is a simple way to create nice looking documentation.
Learning vim
vim is an advanced text editor for Unix operating systems. It’s powerful, but all the commands are intimidating for first time users. Even though it’s hard to get used to at first, these videos will help you learn some of the basic commands and get comfortable with vim.
Getting Started
It was difficult at first trying to transverse my code while using vim. I was so used to being able to use my mouse and simply click where I wanted to go. There are many ways to maneuver inside of vim. Some may just use the h,j,k,l, up, down, left, right arrow keys, or the w, e, b keys to move. You can also press gg to go to the top of the code, G to go to the bottom of it, and (any number)G to go to the line number typed before the capital G.)
Cutting, copying, and pasting took a while to get used to when using vim. Sometimes there was something I wanted in my code that was in the instructions for the assignment. In order to paste I would use the p command, but I could not paste things from outside of vim into it. If I had something copied outside of vim, then to paste it into vim I would right click and just click paste. This would paste it wherever the cursor currently is. If you right click to copy, then it will not affect what is copied by using the commands y to copy or the commands d or x to cut. If those commands are used, the just clicking p will paste them. There are other ways to store more than one thing while copying or cutting, but these two ways were the most helpful as I learned how to use vim.
Another personal favorite features of vim, are the shift-a (takes you to the end of the line and into insert mode) and the shift-i (takes you to the beginning of the line and into insert mode) command. You can also press a to append after the cursor position, as well as i to insert before the current cursor position
vim also allows you to use the v or shift-v keys to highlight certain text or lines of code. You can then use other vim commands such as the copy, paste and delete keys to perform your needed actions.
Indentation
At first it felt very time consuming to indent multiple lines. I felt this way until I found about the V command. V lets users highlight a line and pressing up or down can highlight as many lines as they desire. All that was left to do was to type > after everything I wanted to indent was highlighted and it all would indented once to the right. Typing < would instead indent it to the left if I ever wanted to do that.
Deletion
There are two commands for deleting single character. x deletes the character that the cursor is on and moves the cursor to the right; and X deletes the character that the cursor is on and moves the cursor to the left.
The d command is a more powerful way to delete. d can be used with many different things after it. dd will delete the entire line. d$ will delete the rest of the current line. de will delete from where the cursor is up until the end of the word.
Replacing
Lower case r can replace one letter while upper case R can replace one letter with many.
There are three c commands that I regularly use for replacement: ce , which deletes up until the end of the word that the cursor is currently on, then allows you to insert immediately; c$ , which deletes from where the cursor is up until the end of the line, then allows you to insert immediately; and cc , which deletes the whole line that the cursor is on and allows you to insert immediately at the beginning of the line.
Customizing your vim editor with the .vimrc file
Ever wondered how’ve we get our vim editor to work in the way we have it versus the default editor? vim has a file where you can setup it’s defaults such as auto parentheses, auto-indent, and much more. By watching our video above, you can easily create new defaults for your vim editor that can cut time spent formating your text to spend more on coding.
Learning The Terminal
One of the best features of Unix operating systems is the powerful terminal they provide.
The ls command
The ls command is one of the most used terminal commands.
The basic ls command, when run, displays the contents within the current working directory. Passing in a directory name as an argument will display the contents of that directory. It is also possible to pass in a path for a directory to display any directory, regardless of the directory the user is currently in.
If the -a flag is passed in with ls, all items in the current working directory prepended with a . are also displayed, along with the rest of the items.
Passing in the -l flag prints information for each item in the directory in a series of columns on a single line. The first column displays the read, write, and executable permissions for the main user, the group the current user is in, and any user in that order. The next column shows the owner of the item and the next column shows the group owner. The fourth column displays the size, in bytes, of the item. The fifth column displays the moment the item was created, and the last column displays the name of the item.
If the -R flag is passed in, the command will display the contents of the current directory, and then recursively enter every directory within the current directory and display the contents of that directory, then keep going into every directory until there are no more directories in the current directory it is in.
All these options are combinable for different uses. For example, I could use the -l and -a flags to display the information for the items prepended with a . , or use -R and -l together.
The cd and mv commands
The cd and mv commands are crucial commands in order to actually use the terminal. Without cd, I would forever be stuck in their home directory. The mv command is necessary for moving files from one section of the hard drive. The cd command by itself will change the current working directory to the home directory. If passed a directory name that is within the current working directory, the current working directory will be changed to the name of the passed in directory. cd will also take a path as an argument. When a path is passed in, the current working directory will be changed to the directory specified by the path. When cd is passed with .., the directory will go backwards, the directory that the current directory is in.
The mv command will move an item within a certain directory to the directory passed in.
If the destination argument is not a path, the command will look for the destination in the current working directory. The destination argument can be a path, so I can move the item to any directory in the hard drive.
Recording terminal sessions via scripts
With the script command you can record the commands you run in your terminal into a file. By just typing script file_name_here, you can start a script. Also, you don’t need to worry about making a file beforehand, because when you specify the filename, it will make once for you in that name. Then when you’re done, type exit and your terminal will say your script session has ended and re-state the filename in which it recorded all your commands in.
How To SSH (into well server)
Computer Science students have the ability to log into the school’s server using the ssh command. The way to do access the terminal is to type into the command terminal the following text:
ssh your_NetId@bell.cs.ucr.edu
If it is your first time entering the terminal, you will be asked to trust the encryption that the server uses, then prompted to enter the password associated with your NetID. Once doing all those steps, you will be brought to your home directory on the server. To exit the server, type exit into the command prompt and press enter.
A useful command that moves files to and from the remote server onto your home computer is the scp command. To put items from your home computer to the school’s server, type into the command prompt:
scp filename/absolute_path your_NetID@bell.cs.ucr.edu:absolute_path
To move items from the remote server onto your home computer, type into the command prompt:
scp your_NetID@bell.cs.ucr.edu:absolute_path absolute_path
Spectacle App: Using the terminal and vim in one screen
One of the first things I noticed about vim that I initially disliked was that it took over the terminal when I used it. Users with Windows 7 & above automatically have this ability by dragging your screen to the left or right border of your screen. Unfortunately, OS X users don’t have this built in ability. To get around this, OS X users can install the Spectacle App which will enable you to organize multiple windows on your screen with a touch of a buttom. To get around this issue, I started using two terminals instead of just one while I was programming. I would run vim using the first terminal and would run the executable in the second. It was as simple as using :w to save on vim instead of using :wq to save and quit. I could now test my code without ever having to close vim.
perror
When programming for unix based operating systems (which is a primary component of CS100), system calls are a prominent component for code. The perror function captures the error value (if returned) from the system call and prints to stdout an error message based on the system call and the type of error. It takes in one c-string argument, which is a message the user can pass in.
2014-09-07T00:00:00Z
2014-09-07T00:00:00Z

Categories: Functional Programming

Summary An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness. Population Growth Suppose we wish to model population growth of bees via the logistic equation We assume the growth rate … Continue reading →

Categories: Functional Programming

When I was at ICFP last week, it became clear that I had made a huge mistake in
the past three years. A few of us were talking, including Erik de Castro Lopo,
and when I mentioned that he was the original inspiration for creating the
conduit package, everyone else was surprised. So firstly: Erik, I apologize for
not making it clear that you initially kicked off development by finding some
fun corner cases in enumerator that were difficult to debug.So to rectify that, I think it's only fair that I write the following:conduit is entirely Erik's fault.If you love conduit, write Erik a thank you email.More importantly, if you hate conduit, there's no need to complain to me anymore. Erik presumably will be quite happy to receive all such further communications.In other words, it's not my company, I just work here.Thanks Erik :)UPDATE Please also read my follow-up blog
post clarifying this one, just
in case you're confused.

Categories: Functional Programming