Postdoctoral position in Functional and Constraint Programming at KU Leuven

The Declarative Languages and Artificial Intelligence (DTAI) group of KU Leuven (Belgium) invites applicants for a postdoctoral position in the area of functional and constraint programming. The position revolves around domain-specific languages (DSLs) embedded in Haskell for constraint programming. It is part of the EU project GRACeFUL, whose overarching theme is tools for collective decision making. The KU Leuven part of the project is under the direction of prof. Tom Schrijvers.

To apply you must hold a recent PhD (or be about to graduate) related to either functional or constraint programming. Experience in both areas is an advantage.

You will work closely with prof. Schrijvers and his PhD students at KU Leuven, as well as with the GRACeFUL project partners across Europe.

The position is for 3 years. The salary is competitive and the starting date negotiable (but no later than February 1). Moreover, KU Leuven's policy of equal opportunities and diversity applies to this position.

Application procedure: http://people.cs.kuleuven.be/~tom.schrijvers/postdocposition2.html

Categories: Functional Programming

For all you local folks, I'll be giving a talk about my dissertation on November 5th at 4:00–5:00 in Ballantine Hall 011. For those who've heard me give talks about it before, not much has changed since NLCS 2013. But the majority of current CL/NLP, PL, and logic folks haven't seen the talk, so do feel free to stop by.
Abstract: Many natural languages allow scrambling of constituents, or so-called "free word order". However, most syntactic formalisms are designed for English first and foremost. They assume that word order is rigidly fixed, and consequently these formalisms cannot handle languages like Latin, German, Russian, or Japanese. In this talk I introduce a new calculus, the chiastic lambda-calculus, which allows us to capture both the freedoms and the restrictions of constituent scrambling in Japanese. In addition to capturing these syntactic facts about free word order, the chiastic lambda-calculus also captures semantic issues that arise in Japanese verbal morphology. Moreover, the chiastic lambda-calculus can be used to capture numerous non-linguistic phenomena, such as justifying notational shorthands in category theory, providing a strong type theory for programming languages with keyword arguments, and exploring metatheoretical issues around the duality between procedures and values.
2014-10-21T02:39:32Z

Categories: Functional Programming

Hi *,
It's been a few weeks since the last message - and I apologize! We actually are changing the posting time to be Friday now, so hopefully this situation will be corrected preeeeetty quickly from this point forward, and hopefully will give better context over the course of a weekly discussion.
That said, let's begin!
We've seen plenty of changes to GHC itself in the past few weeks. Some of the highlights include:
Some changes to help make Prelude combinators fuse better. David Feuer has been leading a lot of this work, and it's been quite fruitful, with several new things now fusing (like takeWhile, scanl, scanr, and mapAccumL).
Relatedly, Data.List.inits should be far faster thanks to David Feuer (ref: Phab:D329).
The testsuite driver now has preliminary support for Python 3 - which should be useful for platforms out there that sport it, and ones that will use it as the default eventually (such as Fedora 22, possibly).
Some of the initial work by Edward Yang to remove HEAP_ALLOCED from the GHC runtime system has landed. Briefly, HEAP_ALLOCED is a check the RTS uses to determine if some address is part of the dynamic heap - but the check is a bit costly. Edward's full patchset hopes to remove this with an 8% speedup or so on average.
GHC now has a new macro, __GLASGOW_HASKELL_PATCHLEVEL__, which will allow you to determine the point-level release of the GHC you're using. This has been a requested feature in the past that we were a little hesitant to add, but Herbert went through and did it for us. (Ref: Phab:D66)
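As a sketch (not from the announcement itself) of how such a macro might be used with CPP, guarded with #ifdef so it also compiles on compilers where the macro is undefined — the macro spelling is taken from the text above, so double-check it against your GHC's documentation:

```haskell
{-# LANGUAGE CPP #-}
module Main where

main :: IO ()
main =
#ifdef __GLASGOW_HASKELL_PATCHLEVEL__
  -- CPP substitutes the macro with a literal number at compile time.
  putStrLn ("GHC patch level: " ++ show (__GLASGOW_HASKELL_PATCHLEVEL__ :: Int))
#else
  putStrLn "GHC patch level macro not defined (older GHC)"
#endif
```

Either branch compiles, so the module builds unchanged on older GHCs.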
Template Haskell now supports LINE pragmas, thanks to Eric Mertens (ref: Phab:D299).
Sergei Trofimovich revived libbfd debugging support for the runtime system linker, which should be of use to several daring souls out there (ref: Phab:D193).
Several patches from Gintautas Miliauskas have improved the usability of msys and the testsuite on Windows - and he's not done yet!
A few improvements to the x86 code generator were committed by Reid Barton and Herbert Valerio Riedel, improving size/space for certain cases (ref: Phab:D320, Phab:D163).
and more besides that, including some linker improvements, and general cleanups as usual.
The mailing list has been busy (as usual), with some discussions including:
Austin posted some discussion about the tentative 7.10.1 plans - we're still hoping these are accurate, so take note! We hope to freeze mid-November, and release February 2015! [1]
Austin also called for some feedback: GHC HQ has become convinced a 7.8.4 release is needed to fix some showstoppers - so please let him know soon if you're totally incapable of using 7.8 for something! [2]
Alan Zimmerman has asked for some feedback on his proposed "AST Annotations", which will hopefully allow GHC API clients to add richer annotations to GHC's syntactic representations. The motivation is for refactoring tools like HaRe - and I'm sure all input would be appreciated. [3]
Chris Done sparked off a discussion about making GHCi awesomer, and I'm sure everyone can appreciate that! In particular, Chris wanted to discuss programmatic means of controlling GHCi itself, and naturally we need to ask - is the current API not enough, and why? [4]
Yuras Shumovich has implemented a proposal for allowing the Haskell FFI to support C structures natively as return values - this involves interfacing with C ABI rules to properly support structure layout. While Yuras has an initial implementation in Phab:D252, some questions about the feature - including its implementation complexity - remain before it gets merged. [5]
Richard Eisenberg made a modest request: can Phabricator patches have a 'roadmap', so people can instruct reviewers how to read a diff? The answer: yes, and it should be relatively easy to implement, and Austin can do so Real Soon Now™. [6]
Ben Gamari started a big discussion about one-shot event semantics in the I/O manager, with a lot of replies concerning not only the bug, but machines to test the actual change on. With any luck, Ben's fix for the I/O manager and a test machine should come quickly enough. [7]
Herbert Valerio Riedel opened an RFC: Should we look into using AsciiDoc for GHC? Historically, GHC's documentation has been written using DocBook, a verbose but very precise and unambiguous documentation format. However, AsciiDoc offers a much nicer markup language, while retaining DocBook support. In short, it looks like GHC may get a much more clean user manual soon. [8]
Yuras opened another discussion: should we namespace proposals we create on our wiki? What seems uncontroversial can lead to surprising discussion, and the results seem mixed this time. [9]
Geoff Mainland stepped up and fixed Data Parallel Haskell to work with a new version of vector and GHC. Austin had disabled DPH a few weeks prior due to the difficulty of upgrading it and its divergent source trees. With 7.10, GHC will hopefully ship a more modern vector and dph to boot.
Austin asks: can we warn on tabs by default for GHC 7.10? It's an easy change and a minor one - but we should at least ask first. Vote now! [10]
Philip Hölzenspies opened up a discussion about Uniques in GHC, and their impact on the compiler's current design. Philip has a hopeful design to redo Unique values in GHC, and a patch to support it: Phab:D323. [11]
Richard Eisenberg asks: can we somehow integrate GitHub into our development process? While GitHub doesn't have as many nice tools as something like Phabricator, it has a very high inertia factor, and Richard is interested in making the 'first step' as easy as possible for newbies. Discussions about Phab<->GitHub integrations were afoot, as well as general discussion about contributor needs. There were a lot of points brought up, but the conversation has slightly dried up now - but will surely be revived again. [12]
And now, look at all these tickets we closed! Including: #9658, #9094, #9356, #9604, #9680, #9689, #9670, #9345, #9695, #9639, #9296, #9377, #9184, #9684.
[1] http://www.haskell.org/pipermail/ghc-devs/2014-October/006518.html
[2] http://www.haskell.org/pipermail/ghc-devs/2014-October/006713.html
[3] http://www.haskell.org/pipermail/ghc-devs/2014-October/006482.html
[4] http://www.haskell.org/pipermail/ghc-devs/2014-October/006771.html
[5] http://www.haskell.org/pipermail/ghc-devs/2014-October/006616.html
[6] http://www.haskell.org/pipermail/ghc-devs/2014-October/006719.html
[7] http://www.haskell.org/pipermail/ghc-devs/2014-October/006682.html
[8] http://www.haskell.org/pipermail/ghc-devs/2014-October/006599.html
[9] http://www.haskell.org/pipermail/ghc-devs/2014-October/006730.html
[10] http://www.haskell.org/pipermail/ghc-devs/2014-October/006769.html
[11] http://www.haskell.org/pipermail/ghc-devs/2014-October/006546.html
[12] http://www.haskell.org/pipermail/ghc-devs/2014-October/006523.html
2014-10-20T14:14:23Z
thoughtpolice

Categories: Functional Programming

I recently read through this long post entitled Object Oriented Programming is an expensive disaster which must end. I have to admit I largely agree with what he writes, but I don’t think I ever could have written such a well-researched article, and absolutely never one of equal length ;)
It does include some nice quotes and references, and so far I’ve only read one of the many that I bookmarked: Computing Science: Achievements and Challenges (EWD1284). It contains a few quips that, based on other articles I’ve read, seem fairly typical of Dijkstra. I simply love the way he expressed his opinions at times.
This one really ought to have been in the lengthy post on the OOP disaster:
After more than 45 years in the field, I am still convinced that in computing, elegance is not a dispensable luxury but a quality that decides between success and failure; in this connection I gratefully quote from The Concise Oxford Dictionary a definition of “elegant”, viz. “ingeniously simple and effective”. Amen. (For those who have wondered: I don’t think object-oriented programming is a structuring paradigm that meets my standards of elegance.)
And his thoughts on formal methods are well-known of course, as are his thoughts on iterative design. However, I rather like how he expresses a certain level of disgust with the Anglo-Saxon world when writing about those topics:
The idea of a formal design discipline is often rejected on account of vague cultural/philosophical condemnations such as “stifling creativity”; this is more pronounced in the Anglo-Saxon world where a romantic vision of “the humanities” in fact idealizes technical incompetence. Another aspect of that same trait is the cult of iterative design.
It amazes me every time I read something by someone like Dijkstra, just how much things stay the same, even in a field like computing science, which is said to be moving so fast.
2014-10-20T00:00:00Z
2014-10-20T00:00:00Z

Categories: Functional Programming

We have two new updates to Stackage: providing cabal.config files and including Haddock documentation.

Haddock documentation on snapshots

Now all new exclusive snapshots will have Haddock links, which you can access via the following steps:

Go to the stackage.org home page.
Choose an exclusive snapshot, e.g. GHC 7.8, 2014-10-20.
On the snapshot page will be a link in the menu entitled Haddocks.

That link will be to an index page like this, from which you can view documentation of all packages included in the snapshot. This means you can generally view everything in one place, on one high-availability service.

Using Stackage without changing your repo

The recommended way to use Stackage is to simply change your remote-repo field in your .cabal/config file and run cabal update. Henceforth your dependencies will be resolved from Stackage, which is backed by high-availability Amazon S3 storage servers, and you will have successful build plans, compilation and passing tests. Hurrah!

Try Haskell and the upcoming Haskell.org homepage were both developed with Stackage. This meant I could just specify the Stackage snapshot to use in the README, and then the next time I want to upgrade I can just update the snapshot version to get a fresh working build plan.

The issue some people are facing is that they cannot change this remote-repo field, either because they're using a cabal sandbox, which doesn't support this yet, or because they just don't want to.

The solution to this, in my experience, has been for me to manually go and run cabal freeze in a project I've already built to get the cabal.config file and then give these people that file.

Now, it's automated via a cabal.config link on snapshot pages. New developers working on an application who want to use Stackage can do something like this:

$ cabal sandbox init
$ curl http://www.stackage.org//cabal.config > cabal.config
$ cabal install --only-dependencies

This will install their dependencies from Hackage. We can't guarantee this will always work -- as Stackage snapshots sometimes will have a manual patch in the package to make it properly build with other packages -- but otherwise it's as good as using Stackage as a repo.

A cabal freeze output in cabal.config will contain base and similar packages which are tied to the minor GHC version (e.g. GHC 7.8.2 vs GHC 7.8.3 have different base version numbers), so if you get a cabal.config and you get a dependency error about base, you probably need to open up the cabal.config and remove the line with the base constraint. Stackage snapshots used as repos do not have this constraint (they will use whichever base your GHC major version uses).

Another difference is that cabal.config is more like an “inclusive” Stackage snapshot -- it includes packages not known to build together, unlike “exclusive” snapshots, which only contain packages known to build and pass tests together. Ideally every package you need to use (directly or indirectly) will come from an exclusive snapshot. If not, it's recommended that you submit the package name to Stackage; otherwise inclusive snapshots or cabal.config are the fallbacks you have at your disposal.

Categories: Functional Programming

Summary: I've just released a new version of HLint that can spot an unsafePerformIO which should have NOINLINE but doesn't.

I've just released HLint v1.9.10, a tool to suggest improvements to Haskell code. I don't usually bother with release announcements of HLint, as each version just fixes a few things and adds a few hints, and it's all in the changelog (plus there have now been 102 releases). The latest release attempts to make using unsafePerformIO a little bit safer. A common idiom for top-level mutable state in Haskell is:

myGlobalVar :: IORef Int
myGlobalVar = unsafePerformIO (newIORef 17)

That is, define a top-level CAF (a definition with no arguments) and bind it to unsafePerformIO of some created mutable state. But the definition above is unsafe. GHC might decide myGlobalVar is cheap and inline it into several uses, duplicating the variable so that some functions update different versions. Running this code through the latest version of HLint we see:

Sample.hs:2:1: Error: Missing NOINLINE pragma
Found:
  myGlobalVar = unsafePerformIO (newIORef 17)
Why not:
  {-# NOINLINE myGlobalVar #-}
  myGlobalVar = unsafePerformIO (newIORef 17)

HLint has spotted the problem, and suggested the correct fix.

Trying it for real

Let's take the package slave-thread-0.1.0 and run HLint on it. Slave thread is a new package that helps you ensure you don't end up with ghost threads or exceptions being thrown but missed - a useful package.
Running HLint we see:

Sample.hs:19:1: Error: Missing NOINLINE pragma
Found:
  slaves = unsafePerformIO $ Multimap.newIO
Why not:
  {-# NOINLINE slaves #-}
  slaves = unsafePerformIO $ Multimap.newIO

Sample.hs:20:3: Warning: Redundant $
Found:
  unsafePerformIO $ Multimap.newIO
Why not:
  unsafePerformIO Multimap.newIO

Sample.hs:25:1: Error: Eta reduce
Found:
  fork main = forkFinally (return ()) main
Why not:
  fork = forkFinally (return ())

Sample.hs:55:28: Warning: Redundant $
Found:
  PartialHandler.totalizeRethrowingTo_ masterThread $ mempty
Why not:
  PartialHandler.totalizeRethrowingTo_ masterThread mempty

Sample.hs:72:5: Error: Use unless
Found:
  if null then return () else retry
Why not:
  Control.Monad.unless null retry

HLint has managed to spot the missing NOINLINE pragma, and also a handful of tiny cleanups that might make the code a little more readable. Fortunately (and independent of HLint), the NOINLINE pragma was added in slave-thread-0.1.1, so the library no longer suffers from that bug.
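For reference, the idiom with the pragma in place looks like this as a minimal, self-contained sketch (the variable name and values are just for illustration):

```haskell
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- NOINLINE stops GHC from inlining the CAF into each use site,
-- which would duplicate the IORef and split the "global" state.
{-# NOINLINE myGlobalVar #-}
myGlobalVar :: IORef Int
myGlobalVar = unsafePerformIO (newIORef 17)

main :: IO ()
main = do
  modifyIORef myGlobalVar (+ 1)
  readIORef myGlobalVar >>= print  -- prints 18
```

With the pragma, every use of myGlobalVar refers to the same single IORef.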
2014-10-19T21:23:00Z
Neil Mitchell
noreply@blogger.com

Categories: Functional Programming

In an article published in the Guardian yesterday, author Kathleen
Hale recounts how her first book got some negative reviews by
reviewers on a book review website. One reviewer in particular upset
her and Kathleen ends up figuring out the reviewer is using a false
identity, finds out who the reviewer really is and confronts her. The
piece doesn't read to me like some sort of valedictory "I outed a
fraud" type piece (though there are some passages in there which are
questionable in that direction) and equally there are several passages
where Kathleen expresses deep embarrassment and regret for the course
of action she took. This episode, and that article in particular has
caused substantial reaction: currently 600 comments on the Guardian
article plus several other blog posts. There's no shortage of opinion
to be found on Twitter either, as you'd expect.
The course of action that Kathleen took seems to be fairly undisputed
as far as I can find. There is some dispute from some of the other
blog posts as to exactly what was tweeted and said by whom, and there
is dispute over Kathleen's claim that there are factual inaccuracies
made in a review of her book. It is not disputed that the reviewer was
using a false identity and that the reviewer had at least public
Twitter, Facebook, and Instagram accounts under the false
identity. The false identity was also a real name (Blythe Harris), by
which I mean a name which if you introduced yourself by that name, no
one would think you're using a false identity. This is distinct from
claiming to be Peter Rabbit, or Buzz Lightyear.
Many people have equated Kathleen's actions with stalking. My
dictionary defines the verb to stalk as:
1. to follow or approach (game, prey, etc.) stealthily and quietly
2. to pursue persistently and, sometimes, attack (a person with whom one is obsessed, often a celebrity)
(senses 3, 4, ... not relevant)
The second item there certainly fits. The British legal approach, whilst it gives no strict definition, gives examples and guidance:
....following a person, watching or spying on them or forcing
contact with the victim through any means, including social media.
The effect of such behaviour is to curtail a victim's freedom,
leaving them feeling that they constantly have to be careful. In
many cases, the conduct might appear innocent (if it were to be
taken in isolation), but when carried out repeatedly so as to
amount to a course of conduct, it may then cause significant
alarm, harassment or distress to the victim.
I'm glad it includes "social media" there. Some comments have
suggested that stalking "in real life" is worse than online. This
seems bizarre to me: as if through a computer you are not interacting
with other human beings but merely with shiny pixels who have no
emotional capacity. "In real life" is everything we know. Whilst we're
alive we have no personal experience of anything other than "in real
life".
So I'm fairly sold on the whole argument that Kathleen's behaviour
towards this reviewer can be considered stalking and as such is
reprehensible.
To me, the far more interesting issue is the use of anonymity, false
identities and any realistic expectation we have of privacy on the
internet. A number of people who claim to write book reviews on such
sites have suggested that the behaviour of Kathleen is exactly why
they write their reviews under false names. I think there's something
of a contradiction going on here.
But let's work backwards. Firstly, Kathleen, through some social
engineering (she requested from the book review site the address of
the reviewer so that she could post her a copy of the book) got the
address of the book reviewer. She then used a telephone directory and
census results to identify who really lived there (or likely owned the
land). Now the use of the telephone directory seems a bit odd to me:
telephony directories map names to numbers (and maybe addresses). Yes,
you could use it to map an address to a name but it's very
inefficient: you're essentially searching through the whole directory
looking for the address whilst the directory is sorted by name, not
address. So unless it was a very small telephone directory, I don't
really buy that. Using census results is far more creditable: they're
public documents and when they're online, they do allow you to search
by address. In the UK you can only get access to the
raw census details 100 years after the census has been published
which, to a high probability, rules it out as a means to tie an
address to a person who's still alive. You can get statistics and
aggregates from more recent census results but you can't get the raw
data. I'm assuming that in the US there's no such restriction on
access to raw census data. If there is then I don't understand how
Kathleen really managed to get a name for the owner of the property.
Instead, in the UK, if you want to find out who owns some land, you
can pay the land registry £3 and they'll tell you. Presumably there
are means by which you can legally hide this; I'm sure the rich have
figured this out - probably some method by which some fake company in
a tax haven technically "owns" the land and as they're registered
abroad, they don't have to divulge any further details about that
company. So yes, you could argue the Land Registry is profiting from
facilitating stalkers, but equally there are a bunch of legitimate
reasons to need access to such data and I can't think of any sane way
to ensure the use of such a service isn't abused. So from that I
conclude that unless the owner is a millionaire, the owner of any land
is public knowledge.
The use of social engineering to get the address in the first place is
more interesting but very obvious. This sort of thing happens a lot
and sometimes to horrifying consequences (e.g. the Australian DJs who
phoned up a hospital, pretending to be the Queen and Prince of Wales,
enquiring as to the health of the Duchess of Cambridge. The nurse fell
for the hoax and put the call
through. Three days later, the nurse committed suicide). As a species
we are not good at taking the time to verify who we're talking to or
why. Whilst (hopefully) most of us would hang up if our bank
apparently rang us and then asked for our credit card details "for
security" this is largely only because it's in the bank's interest (in
terms of cost of insurance) to reduce fraud, so they've trained us as
such. But in all sorts of other scenarios we implicitly trust people
we've no real reason to. A simple example: ticket inspectors on public
transport. They may be wearing the uniform, but it could be
faked. With their travel-card readers they could be seeing who has the
expensive yearly travel cards, scanning the unique numbers from them
and then using them to program up fraudulent cards. The crypto on
those things is notoriously weak. Has anyone ever requested some means
to verify the identity of a ticket inspector? And even if you could,
how do you know they're not crooked regardless?
So phoning someone up, impersonating someone else, or pretending to
have valid reasons to request the information you're requesting is
always likely to work. It might be illegal in some cases, but it's
certainly human nature to try to be helpful and if you're given a
plausible justification, on what basis could you refuse the request
unless it's contrary to some sort of company policy? In this case, if
you're concerned about anonymity, wouldn't you be concerned about this
possibility, and make use of an anonymous mail box?
Article 8 of the European Convention on Human Rights guarantees an
individual's right to respect for privacy and family life, including
correspondence. Is privacy the same as anonymity? No, definitely not:
In conflating anonymity and privacy, we have failed to see an
important factual difference between them: under the condition of
privacy, we have knowledge of a person’s identity, but not of an
associated personal fact; whereas under the condition of
anonymity, we have knowledge of a personal fact, but not of the
associated person’s identity
The vast violations of our lives by state surveillance as revealed by Snowden over the last year demonstrate the wholesale collation of everything we do, online and off, by our governments. This is both being
able to observe an action and identify the individual who caused it
(thus we have no hope of performing any action anonymously), and being
able to observe an individual and know the actions they take (thus no
privacy). I can't work out whether the ECHR has anything to say on a
right to anonymity; I get the sense that it doesn't try to protect
that. So that's basically saying: "the state shouldn't record your
every move (as that's an invasion of privacy), but moves that we're
interested in, we can know who did them". Of course, we now know
they're recording everything anyway.
We also know that computer systems can always be hacked into - there
is no real security anywhere. Given a skilled and sufficiently funded
adversary, any computer system connected in any way to the internet
can be hacked into. Why? Because humans wrote the software that runs
on those computers and humans are incapable of writing bug-free
software. Look at all the large scale data breaches in recent
history. Nothing is secure.
So we have laws that seem to try and protect privacy, but they're
violated by our own governments, and in any case, we have countless
examples of our inability to store any information securely. So is
there really any hope to be able to exist with anonymity on the
internet?
As ever, it depends who your adversary is. If your adversary is a
government (either your own or some foreign government) then no, you
have no hope. If it's a previous partner of yours who has no
particular computer training, then yes, you're probably going to have
a reasonable chance of being anonymous for a while. But you need to
read up on this and think hard: it's not a trivial undertaking. There
are some good guides as to how to do this, but:
All writers - whether writing under their own names or not -
should be aware of the risks they may incur by hitting 'publish'.
What is the effect of hitting "publish"? It's to put more data points
out there which may lead people to be able to identify you. The fewer
data points out there, the better. So coming back to our book
reviewer, if you want to review books anonymously, and if your
justification for acting anonymously is to avoid being stalked by
authors who don't like your reviews, then why put so many data points
out there? Why have the Facebook page, the Instagram profile with the
faked photos, the Twitter account? Why give your real postal address
to the book review club knowing they're going to post books to it and
might conceivably give your address out to other people?
The social media accounts in particular I find most odd. If you want
to review books then review books. Build your following, your
reputation and cachet on the quality of your reviews. If I'm looking
at a book review I really don't care where you went on holiday, what
your tweets are, or how many pets you have. Putting that information
out there undermines your entire justification for being anonymous: if
you want to be anonymous (i.e. you don't want people to find out who
you are) then why are you putting so much unnecessary information out
there that may allow people to figure out who you are?
Equally, use a name that clearly communicates to me you're trying to
be anonymous: call yourself TheBookReviewer53, DostoyevskyLover or
OrwellWasRight. Doing so doesn't lessen the validity of your opinions
on your chosen subject and is more honest with people reading your
reviews: it's overtly saying "I have reasons to want to exist
anonymously on the internet". It reveals nothing more about your real
identity either: regardless of the obvious fictitious-ness of your
online persona, if you can be found, you can be found.
Researchers show that four data points about a person’s location can identify that person with 95% accuracy. FOUR. You
think you can tweet anonymously from your phone? You think apps like
Whisper allow you to act anonymously? As with pretty much everything
related to the internet and computing, unless you've spent the last 20
years of your life working with computers, studying computers and
thinking very hard about threat models and what data you're putting
out there, and are utterly paranoid, you basically haven't got a
chance. Do you turn off wifi on your phone when you leave the house?
You should. You trust that USB pen drive you're transferring documents
on? You shouldn't.
Finally and most obviously, any attempt at anonymity clearly doesn't
insulate you from the law. As members of various hacking groups such
as LulzSec discovered, you can always be found out by law enforcement
agencies. Yes, you might be able to make it difficult for a poorly
funded person to come after you for libel (which is really just an
indictment of the disgusting relationship between justice and money)
but it's quite a risk to take. If you wouldn't put it in print with
your real name attached, you're placing an awful lot of trust on your
ability to maintain your anonymity against an adversary you probably
don't know as well as you need to.
2014-10-19T15:00:00Z

Categories: Functional Programming

Suppose you are writing a compiler for some programming language or DSL.
If you are doing source to source transformations in your compiler,
perhaps as part of an optimization pass, you will need to construct and
deconstruct bits of abstract syntax. It would be very convenient if we
could write that [...]

Categories: Functional Programming

Just found a really nice little hack that makes working in the GHC interactive
REPL
a little easier and more convenient.
First of all, I added the following line to my ~/.ghci file.
:set -DGHC_INTERACTIVE
All that line does is define a GHC_INTERACTIVE pre-processor
symbol.
Then in a file that I want to load into the REPL, I need to add this to the top
of the file:
{-# LANGUAGE CPP #-}
and then in the file I can do things like:
#ifdef GHC_INTERACTIVE
-- LBS is needed for putStrLn over the lazy ByteString that encodePretty produces.
import qualified Data.ByteString.Lazy.Char8 as LBS
import Data.Aeson.Encode.Pretty

prettyPrint :: Value -> IO ()
prettyPrint = LBS.putStrLn . encodePretty
#endif
In this particular case, I'm working with some relatively large chunks of JSON
and it's useful to be able to pretty print them when I'm in the REPL, but I have
no need for that function when I compile that module into my project.

Categories: Functional Programming

Summary: A few weeks ago Hackage stopped generating docs. I have a script that generates the docs, and also fixes some Haddock bugs.

Update: The Haddock documentation generators are running once again, but may be somewhat behind for a little while.

A few weeks ago Hackage stopped generating documentation, so if you look at recently uploaded pages they tend to either lack docs, or have very alert maintainers who did a manual upload. I've packaged up my solution, which also fixes some pretty annoying Haddock bugs. Given that I can now get docs faster and better with my script, I'll probably keep using it even after Haddock on Hackage gets restarted.

How to use it

You are likely to get better results if you always install the packages you use with documentation.

Ensure you have tar, curl, cp and cabal on your $PATH.
Run cabal update && cabal install neil.
Make a release, don't touch any other code, then make sure you are in the project directory.
Run neil docs --username=YourHackageUsername.
Type in your Hackage password at the prompt.

And like that, your docs are online. To see an example of something that was generated with this process, look at Shake.

What I fixed

I automated the process using scripts originally taken from the lens library, supplemented with suggestions from Reddit. I then do a number of manual fixups.

Haddock makes cross-module links whose target it doesn't know default to types. Concretely, if I write 'Development.Shake.need' in Haddock it generates a link to #t:need, which doesn't exist, when it should be #v:need - I fix that. Update: fixed in Haddock 1.14 or above, as per this ticket.

On Windows, if you use CPP and multiline bird-tick (>) Haddock blocks you get a blank line between each line. I fix that.

If you follow some of the simpler scripts, links outside your package won't work (or at least, didn't for me). I fix that.

The neil tool

The neil tool is my personal set of handy Haskell scripts.
I make all my releases with it (neil sdist), and do lots of checks that my packages conform to my rules (neil check). I also use it for driving my Travis instances. It's in fairly regular flux. Until now, I've always kept it in Darcs/Git and never released it - it's personal stuff tuned to how I work.You might also notice that neil provides a library. Don't use that, I intend to delete it in a few weeks. Update: library removed.
2014-10-17T20:49:00Z
Neil Mitchell
noreply@blogger.com

Categories: Functional Programming

Posted on October 17, 2014
Tags: types
Lately I’ve been reading a lot of type theory literature. In an effort to help my future self, I’m going to jot down a few thoughts on quotient types, the subject of some recent google-fu.
But Why!
The problem quotient types aim to solve is actually a very common one. I’m sure at some point or another you’ve used a piece of data that you wanted to compare for equality. Moreover, determining whether it was equal to another piece properly required some work.
A simple example would be representing rational numbers. A rational number is a ratio of two integers, so let’s just say
type Rational = (Integer, Integer)
Now all is well, we can define a Num instance and what not. But what about equality? Clearly we want equivalent fractions to be equal. That should mean that (2, 4) = (1, 2) since they both represent the same number.
Now our implementation has a sticky point: clearly this isn’t the case on its own! What we really want to say is “(2, 4) = (1, 2) up to trivial rejiggering”.
Haskell’s own Rational type solves this by not exposing a raw tuple. It still exists under the hood, but we only expose smart constructors that will reduce our fractions as far as possible.
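To make the smart-constructor idea concrete, here is a minimal sketch (the names Rat and mkRat are illustrative, not the actual Data.Ratio implementation): by reducing on construction, plain structural equality coincides with equality of rationals.

```haskell
-- Sketch of a normalizing smart constructor (illustrative, not Data.Ratio).
newtype Rat = Rat (Integer, Integer) deriving (Eq, Show)

-- mkRat reduces the fraction and keeps the denominator positive,
-- so the derived Eq is exactly equality of rationals.
mkRat :: Integer -> Integer -> Rat
mkRat a b =
  let g = gcd a b
      s = signum b
  in Rat ((s * a) `div` g, (s * b) `div` g)
```

With this, mkRat 2 4 == mkRat 1 2 holds, at the cost of never exposing the raw tuple.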
This is displeasing in a dependently typed setting, however: we want to be able to formally prove the equality of some things. This “equality modulo normalization” leaves us with a choice. One option is to provide a function which is essentially
foo : (a b : Rational)
-> Either (reduce a = reduce b) (reduce a /= reduce b)
This doesn’t really help us though; there’s no way to express that a should be observationally equivalent to b. This is a problem seemingly as old as dependent types: how can we have a simple representation of equality that captures all the structure we want and none that we don’t?
Hiding away the representation of rationals certainly buys us something, we can use a smart constructor to ensure things are normalized. From there we could potentially prove a (difficult) theorem which essentially states that
=-with-norm : (a b c d : Integer)
-> a * d = b * c -> mkRat a b = mkRat c d
This still leaves us with some woes, however: now a lot of computations become difficult to talk about, since we’ve lost helpful notions like denominator ∘ mkRat a = id and similar. The lack of transparency shifts a lot of the burden of proof onto the code privy to the internal representation of the type, the only place where we know enough to prove such things.
Really what we want to say is “Hey, just forget about a bit of the structure of this type and consider things to be identical up to R”, where R is some equivalence relation, e.g.
a R a
a R b implies b R a
a R b and b R c implies a R c
If you’re a mathematician, this should sound similar. It’s a lot like how we can take a set and partition it into equivalence classes. This operation is sometimes called “quotienting a set”.
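For finite data you can watch this partitioning happen; the following small helper (classes is my own illustrative name, not a library function) groups a list into equivalence classes under a relation:

```haskell
import Data.List (partition)

-- Partition a finite list into equivalence classes under r.
-- Illustrative only: r is assumed to genuinely be an equivalence
-- relation (reflexive, symmetric, transitive) for this to make sense.
classes :: (a -> a -> Bool) -> [a] -> [[a]]
classes _ [] = []
classes r (x:xs) =
  let (same, rest) = partition (r x) xs
  in (x : same) : classes r rest
```

For example, classes (\a b -> a `mod` 3 == b `mod` 3) [0..5] yields [[0,3],[1,4],[2,5]]: the residue classes mod 3.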
For our example above, we really mean that our rational type is quotiented by the relation (a, b) R (c, d) iff a * d = b * c.
Some other things that could potentially use quotienting
Sets
Maps
Integers
Lots of Abstract Types
Basically anything where we want to hide some of the implementation details that are irrelevant for their behavior.
More than Handwaving
Now that I’ve spent some time essentially waving my hand about quotient types what are they? Clearly we need a rule that goes something like
Γ ⊢ A type, E is an equivalence relation on A
———————————————–———————————————————————————————
Γ ⊢ A // E type
Along with the typing rule
Γ ⊢ a : A
——————————————————
Γ ⊢ a : A // E
So all members of the original type belong to the quotiented type, and finally
Γ ⊢ a : A, Γ ⊢ b : A, Γ ⊢ a E b
–——————————————–——————————————————
Γ ⊢ a ≡ b : A // E
Notice something important here, that ≡ is the fancy shmancy judgmental equality baked right into the language. This calls into question decidability. It seems that a E b could involve some non-trivial proof terms.
More than that, in a constructive, proof relevant setting things can be a bit trickier than they seem. We can’t just define a quotient to be the same type with a different equivalence relation, since that would imply some icky things.
To illustrate this problem, imagine we have a predicate P on a type A where a E b implies P a ⇔ P b. If we just redefine the equivalence relation on quotients, P would not be a well-formed predicate on A // E, since a ≡ b : A // E doesn’t mean that P a ≡ P b. This would be unfortunate.
Clearly some subtler treatment of this is needed. To that end I found this paper discussing some of the handling of NuPRL’s quotients enlightening.
How NuPRL Does It
The paper I linked to is a discussion on how to think about quotients in terms of other type theory constructs. In order to do this we need a few things first.
The first thing to realize is that NuPRL’s type theory is different from what you are probably used to. We don’t have a single magical global equality. Instead, we define equality inductively across each type. This means that our equality judgment doesn’t have to be natural in the type it works across; it can do specific things at each case. Perhaps the most frequent example is functional extensionality:
f = g ⇔ ∀ a. f a = g a
Okay, so now that we’ve tossed aside the notion of a single global equality, what else is new? Well, something new is the lens through which many people look at NuPRL’s type theory: PER semantics. Recall that a PER is a relation satisfying
a R b → b R a
a R b ∧ b R c → a R c
In other words, a PER is an equivalence relation that isn’t necessarily reflexive at all points.
The idea is to view types not as some opaque “thingy” but instead to be partial equivalence relations across the set of untyped lambda calculus terms. Inductively defined equality falls right out of this idea since we can just define a ≡ b : A to be equivalent to (a, b) ∈ A.
Now another problem rears its head: what does a : A mean? Even though we’re dealing with PERs, it’s quite reasonable to say something is a member of a type if it’s reflexive. That is to say, each relation is a full equivalence relation for the things we call members of that type. So we can define a : A to be (a, a) ∈ A.
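This definition of membership-as-reflexivity can be sketched as a toy model in Haskell (purely illustrative; NuPRL’s actual semantics is a PER over untyped terms, not a Haskell predicate):

```haskell
-- Toy model of PER semantics: a "type" is a partial equivalence
-- relation on raw terms, and membership (a : A) is reflexivity,
-- i.e. (a, a) ∈ A.
type PER a = a -> a -> Bool

member :: PER a -> a -> Bool
member rel a = rel a a

-- Example: the "type" of even integers. The relation is symmetric
-- and transitive, but only reflexive at even numbers, so it is a
-- genuine PER whose members are exactly the evens.
evenInt :: PER Integer
evenInt x y = even x && even y && x == y
```

Here member evenInt 4 holds while member evenInt 3 does not, even though 3 is a perfectly good raw term.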
Another important constraint: in order for a type family to be well formed, it needs to respect the equality of the type it maps across. In other words, for all B : A → Type, we have (a, a') ∈ A ⇒ (B a = B a') ∈ U. This should seem on par with how we defined function equality, and we call this “type functionality”.
Let’s also touch on another concept: squashed types. The idea is to take a type and throw away all information other than whether or not it’s occupied. There are two basic kinds of squashing, intensional and extensional. In the intensional view we consider two squashed things equal if and only if the types they’re squashing are equal:
A = B
————————————
[A] = [B]
Now we can also consider only the behavior of the squashed type, the extensional view. Since the only behavior of a squashed type is simply existing, our extensional squash type has the equivalence
∥A∥ ⇔ ∥B∥
————————–
∥A∥ = ∥B∥
Now aside from this, the introduction of these types is basically the same: if we can prove that a type is occupied, we can grab its squashed type. Similarly, when we eliminate a squashed type all we get is its trivial occupant, called •.
Γ ⊢ A
———————
Γ ⊢ [A]
Γ, x : [A], Δ[•] ⊢ C[•]
——————————————————————————
Γ, x : [A], Δ[x] ⊢ C[x]
What’s interesting is that when proving an equality judgment, we can unsquash both of these types. This is only because NuPRL’s equality proofs are computationally trivial.
Now with all of that out of the way, I’d like to present two typing rules. First
Γ ⊢ A ≡ A'; Γ, x : A, y : A ⊢ E[x; y] = E'[x; y]; E and E' are PERs
————————————————————————————————————————————————————————————————————
Γ ⊢ A // E ≡ A' // E'
In English, two quotients are equal when the types and their quotienting relations are equal.
Γ, u : x ≡ y ∈ (A // E), v : ∥x E y∥, Δ[u] ⊢ C[u]
———————————————————————————————————————————————————
Γ, u : x ≡ y ∈ (A // E), Δ[u] ⊢ C[u]
There are a few new things here. The first is the Δ[u] notation. This is a result of dependent types: we can have things in our context that depend on u, so we “split” the context into Γ, u, Δ and apply the dependent part of the context, Δ, to the variable it depends on, u.
The long and short of this is that when we’re trying to use an equivalence between two terms of a quotient, we only get the squashed term. It does mean, though, that we only need to provide a squash to get the equality in the first place:
Γ ⊢ ∥x E y∥; Γ ⊢ x : A; Γ ⊢ y : A
——————————————————————————————————
Γ ⊢ x ≡ y : A // E
Remember that we can trivially form an ∥A∥ from A.
Now there’s just one thing left to talk about, using our quotiented types. To do this the paper outlines one primitive elimination rule and defines several others.
Γ, x : A, y : A, e : x E y, a : ND, Δ[ndₐ{x;y}] ⊢ |C[ndₐ{x;y}]|
——————————————————————————————————————————————————————————————–
Γ, x : A // E, Δ[x] ⊢ |C[x]|
ND is an admittedly odd type that’s supposed to represent nondeterministic choice. It has two terms, tt and ff, and they’re considered “equal” under ND. However, nd returns its first argument if it’s fed tt and its second if it’s fed ff. Hence, nondeterminism.
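A quick sketch of ND in Haskell syntax (purely illustrative: Haskell cannot express that the two constructors are judgmentally “equal”, which is exactly what makes ND odd in NuPRL):

```haskell
-- Two constructors that the theory regards as equal...
data ND = TT | FF

-- ...which nd nevertheless computationally distinguishes, modelling
-- a nondeterministic choice between its two remaining arguments.
nd :: ND -> a -> a -> a
nd TT x _ = x
nd FF _ y = y
```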
Now in our rule we use this to indicate that if we’re eliminating some quotiented type we can get any value that’s considered equal under E. We can only be assured that when we eliminate a quotiented type, it will be related by the equivalence relation to x. This rule captures this notion by allowing us to randomly choose some y : A so that x E y.
Overall, this rule simply states that if C is occupied for any term related to x, then it is occupied for C[x].
Wrap up
As with my last post, here are some questions for the curious reader to pursue:
What elimination rules can we derive from the above?
If we’re proving an equality, can we get more expressive rules?
What would an extensional quotient type look like?
Why would we want intensional or extensional?
How can we express quotient types with higher inductive types from HoTT?
The last one is particularly interesting.
Thanks to Jon Sterling for proofreading.
2014-10-17T00:00:00Z

Categories: Functional Programming

TL;DR
In this blogpost I present a proof-of-concept operator $.$, which allows you to replace:
foo x0 x1 x2 ... xN = bar $ qux x0 x1 x2 ... xN
by:
foo = bar $.$ qux
Introduction
This is a literate Haskell file, which means you should be able to just drop it into GHCi and play around with it. You can find the raw .lhs file here. Do note that this requires GHC 7.8 (it was tested on GHC 7.8.2).
> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE FlexibleInstances #-}
> {-# LANGUAGE OverlappingInstances #-}
> {-# LANGUAGE TypeFamilies #-}
> {-# LANGUAGE TypeOperators #-}
> import Data.Char (toLower)
If you have been writing Haskell code for a while, you have undoubtedly used the $ operator to “wrap” some expression with another function, mapping over the result type. For example, we can “wrap” the expression toLower 'A' with print to output the result.
print $ toLower 'A'
It is not unlikely either to have functions that just wrap other functions, e.g.:
> printLower :: Char -> IO ()
> printLower x = print $ toLower x
If the function that is being wrapped (toLower in the previous example) has only one argument, the . operator allows writing a very concise definition of functions which just wrap those single-argument functions.
> printLower' :: Char -> IO ()
> printLower' = print . toLower
However, this gets tedious when the number of arguments increases. Say that we have the following function which takes three arguments (don’t worry about the horrible implementation, but rather focus on the type):
> -- | Formats a double using a simple spec. Doesn't do proper rounding.
> formatDouble
> :: Bool -- ^ Drop trailing '0'?
> -> Int -- ^ #digits after decimal point
> -> Double -- ^ Argument
> -> String -- ^ Result
> formatDouble dropTrailingZero numDigits double =
> let (pre, post) = case break (== '.') (show double) of
> (x, '.' : y) -> (x, y)
> (x, y) -> (x, y)
> post' = take numDigits (post ++ repeat '0')
> pre' = case pre of
> '0' : x -> if dropTrailingZero then x else pre
> _ -> pre
> in pre' ++ "." ++ post'
We can wrap formatDouble using print by successively using the . operator, but the result does not look pretty, nor very readable:
> printDouble :: Bool -> Int -> Double -> IO ()
> printDouble = (.) ((.) ((.) print)) formatDouble
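As an aside (my own note, not part of the original literate source): for any fixed arity you can also chain the composition operator once per argument, since (.) is fmap for functions. It is equally unreadable and still arity-specific, which is what motivates a generic operator in the first place.

```haskell
-- One (.) per argument; for a three-argument function g:
--   compose3 f g x y z == f (g x y z)
compose3 :: (b -> c) -> (a1 -> a2 -> a3 -> b) -> a1 -> a2 -> a3 -> c
compose3 = (.) . (.) . (.)

-- Illustrative use (my own example):
showMax3 :: Int -> Int -> Int -> String
showMax3 = show `compose3` (\x y z -> max x (max y z))
```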
The $.$ operator
This makes one wonder if we can’t define an additional operator $.$ (pronounced holla-holla-get-dolla) which can be used like this:
> printDouble' :: Bool -> Int -> Double -> IO ()
> printDouble' = print $.$ formatDouble
Additionally, it should be generic, as in, it should work for an arbitrary number of arguments, so that we can also have:
> printMax' :: Int -> Int -> IO ()
> printMax' = print $.$ max
> printLower'' :: Char -> IO ()
> printLower'' = print $.$ toLower
From this, it becomes clear that the type of $.$ should be something like:
($.$)
:: (a -> b)
-> (x0 -> x1 -> ... -> xn -> a)
-> (x0 -> x1 -> ... -> xn -> b)
The first question is obviously, can we write such an operator? And if we can, how generic is it?
When my colleague Alex asked me this question, I spent some time figuring it out. I previously thought it was not possible to write such an operator in a reasonably nice way, but after some experiments with the closed type families in GHC 7.8 I managed to get something working. However, the solution is far from trivial (and I now suspect more elegant solutions might exist).
A possible solution
The main ingredient for my solution is type-level composition.
> newtype (f :.: g) a = Comp {unComp :: f (g a)}
> infixl 9 :.:
This way we can represent nested types in an alternative way. If we have, for example:
> ex1 :: Maybe [Int]
> ex1 = Just [3]
We could represent the type of this as the composition of the type constructors Maybe and []:
> ex1Comp :: (Maybe :.: []) Int
> ex1Comp = Comp (Just [3])
We can nest these compositions arbitrarily. If we have three nested type constructors:
> ex2 :: Int -> (Maybe [Int])
> ex2 = \x -> Just [x + 1]
A nested :.: composition is used:
> ex2Comp :: ((->) Int :.: Maybe :.: []) Int
> ex2Comp = Comp (Comp (\x -> Just [x + 1]))
We already gave a hint to the solution here: (->) a (in this case (->) Int) is also a type constructor. We can nest these as well:
> formatDoubleComp
> :: ((->) Bool :.: (->) Int :.: (->) Double) String
> formatDoubleComp = Comp (Comp formatDouble)
Now, let’s think about what happens to f :.: g if f and g are both a Functor:
> instance (Functor f, Functor g) => Functor (f :.: g) where
> fmap f (Comp g) = Comp (fmap (fmap f) g)
The result is a Functor which allows us to fmap deep inside the original functor.
Additionally, we know that (->) a is a Functor widely known as Reader. This shows us that it indeed becomes feasible to apply a function to the final result of a function (in its :.: form), namely just by using fmap:
For example, for the function formatDouble, we get:
> printDoubleComp
> :: ((->) Bool :.: (->) Int :.: (->) Double) (IO ())
> printDoubleComp = fmap print formatDoubleComp
At this point, it is clear that we could try to implement the $.$ operator by:
converting the regular function to its :.: form (toComp);
applying the transformation using fmap;
converting the :.: form of the updated function back to a regular function (fromComp).
E.g.:
> printDouble''
> :: Bool -> Int -> Double -> IO ()
> printDouble'' = fromComp (fmap print (toComp formatDouble))
However, implementing (1) and (3) turns out to be reasonably hard. I think it makes more sense for me to just give a high-level overview: a very substantial amount of explanation would be required to explain this to new Haskellers, and more experienced Haskellers would probably have more fun figuring this out themselves anyway.
We’re going to need the simple Id Functor, let’s inline it here.
> newtype Id a = Id {unId :: a}
> instance Functor Id where
> fmap f (Id x) = Id (f x)
Implementing toComp involves implementing a typeclass with no fewer than three auxiliary type families.
> class ToComp t where
> toComp :: t -> (F t :.: G t) (A t)
> type family F t where
> F (a -> (b -> c)) = (->) a :.: F (b -> c)
> F (b -> c) = (->) b
> type family G t where
> G (a -> (b -> c)) = G (b -> c)
> G (b -> c) = Id
> type family A t where
> A (a -> (b -> c)) = A (b -> c)
> A (b -> c) = c
> instance ( F (a -> b) ~ (->) a
> , G (a -> b) ~ Id
> , A (a -> b) ~ b
> ) => ToComp (a -> b) where
> toComp f = Comp (Id . f)
> instance ( ToComp (b -> c)
> , F (a -> (b -> c)) ~ ((->) a :.: F (b -> c))
> , G (a -> (b -> c)) ~ G (b -> c)
> , A (a -> (b -> c)) ~ A (b -> c)
> ) => ToComp (a -> (b -> c)) where
> toComp f = Comp $ Comp $ unComp . toComp . f
Implementing fromComp requires another typeclass, which in turn requires one auxiliary closed type family.
> class FromComp t where
> fromComp :: t -> C t
> type family C t where
> C (Id a) = a
> C (b -> Id a) = b -> a
> C (((->) b :.: f) a) = b -> C (f a)
> C ((f :.: g :.: h) a) = C ((f :.: g) (h a))
> instance FromComp (Id a) where
> fromComp = unId
> instance FromComp (b -> Id a) where
> fromComp f = unId . f
> instance ( FromComp (f a)
> , C (((->) b :.: f) a) ~ (b -> C (f a))
> ) => FromComp (((->) b :.: f) a) where
> fromComp (Comp f) = fromComp . f
> instance ( FromComp ((f :.: g) (h a))
> , C ((f :.: g :.: h) a) ~ C ((f :.: g) (h a))
> ) => FromComp ((f :.: g :.: h) a) where
> fromComp (Comp (Comp f)) = fromComp (Comp f)
Once we have these, the implementation of $.$ becomes:
> ($.$)
> :: ( ToComp t
> , Functor (F t)
> , Functor (G t)
> , FromComp ((F t :.: G t) b)
> )
> => (A t -> b)
> -> t
> -> C ((F t :.: G t) b)
> f $.$ t = fromComp $ fmap f (toComp t)
…where the implementation is arguably much easier to read than the type signature.
With this loaded up, GHCi now even manages to infer nice and readable type signatures:
*Main> :t print $.$ toLower
print $.$ toLower :: Char -> IO ()
*Main> :t print $.$ formatDouble
print $.$ formatDouble :: Bool -> Int -> Double -> IO ()
It doesn’t do well when type polymorphism is involved though; e.g. for max :: Ord a => a -> a -> a, we get:
*Main> :t print $.$ max
print $.$ max
:: (ToComp (a -> a), FromComp (F (a -> a) (G (a -> a) (IO ()))),
Show (A (a -> a)), Ord a, Functor (F (a -> a)),
Functor (G (a -> a))) =>
a -> C (F (a -> a) (G (a -> a) (IO ())))
Now, two questions remain for the interested reader:
Can the implementation be simplified? In particular, do we need such an unholy number of type families?
Can this be generalized to all Functors, rather than just the Reader Functor? I was able to do something like this, but it made the inferred types a lot less nice, and didn’t work that well in practice.
Thanks to my colleague Alex for proofreading!
2014-10-17T00:00:00Z

Categories: Functional Programming

Ever since I read about systemd-networkd being in the making I was looking forward to trying it out. I kept watching for the package to appear in Debian, or at least ITP bugs. A few days ago, by accident, I noticed that I already have systemd-networkd on my machine: It is simply shipped with the systemd package!
My previous setup was a combination of ifplugd to detect when I plug or unplug the ethernet cable with a plain DHCP entry in /etc/network/interfaces. A while ago I was using guessnet to do a static setup depending on where I am, but I don’t need this flexibility any more, so the very simple approach with systemd-networkd is just fine with me. So after stopping ifplugd and
$ cat > /etc/systemd/network/eth.network <<__END__
[Match]
Name=eth0
[Network]
DHCP=yes
__END__
$ systemctl enable systemd-networkd
$ systemctl start systemd-networkd
I was ready to go. Indeed, systemd-networkd, probably due to the integrated dhcp client, felt quite a bit faster than the old setup. And what’s more important (and my main motivation for the switch): It did the right thing when I put it to sleep in my office, unplug it there, go home, plug it in and wake it up. ifplugd failed to detect this change and I often had to manually run ifdown eth0 && ifup eth0; this now works.
But then I was bitten by what I guess some people call the viral nature of systemd: systemd-networkd would not update /etc/resolv.conf, but rather relies on systemd-resolved. And that requires me to change /etc/resolv.conf to be a symlink to /run/systemd/resolve/resolv.conf. But of course I also use my wireless adapter, which, at that point, was still managed using ifupdown, which would use dhclient, which updates /etc/resolv.conf directly.
So I investigated whether I can use systemd-networkd also for my wireless interface. I am not using NetworkManager or the like, but rather keep wpa_supplicant running in roaming mode, controlled from ifupdown (not sure how that exactly works and what controls what, but it worked). I found out that this setup works just fine with systemd-networkd: I start wpa_supplicant with this service file (which I found in the wpasupplicant repo, but not yet in the Debian package):
[Unit]
Description=WPA supplicant daemon (interface-specific version)
Requires=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device
[Service]
Type=simple
ExecStart=/sbin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I
[Install]
Alias=multi-user.target.wants/wpa_supplicant@%i.service
Then wpa_supplicant will get the interface up and down as it goes, while systemd-networkd, equipped with
[Match]
Name=wlan0
[Network]
DHCP=yes
does the rest.
So suddenly I have a system without /etc/init.d/networking and without ifup. Feels a bit strange, but also makes sense. I still need to migrate how I manage my UMTS modem device to that model.
The only thing that I’m missing so far is a way to trigger actions when the network configuration has changed, like I could with /etc/network/if-up.d/ etc. I want to run things like killall -ALRM tincd and exim -qf. If you know how to do that, please tell me, or answer over at Stack Exchange.
2014-10-14T20:26:13Z
Joachim Breitner
mail@joachim-breitner.de

Categories: Functional Programming

Summary: Shake is not like Make, it has different internal state, which leads to different behaviour. I also store the state in an optimised way.Update: I'm keeping an up to date version of this post in the Shake repo, which includes a number of questions/answers at the bottom, and is likely to evolve over time to incorporate that information into the main text.In order to understand the behaviour of Shake, it is useful to have a mental model of Shake's internal state. To be a little more concrete, let's talk about Files which are stored on disk, which have ModTime value's associated with them, where modtime gives the ModTime of a FilePath (Shake is actually generalised over all those things). Let's also imagine we have the rule:file *> \out -> do need [dependency] runSo file depends on dependency and rebuilds by executing the action run.The Make ModelIn Make there is no additional state, only the file-system. A file is considered dirty if it has a dependency such that:modtime dependency > modtime fileAs a consequence, run must update modtime file, or the file will remain dirty and rebuild in subsequent runs.The Shake ModelFor Shake, the state is:database :: File -> (ModTime, [(File, ModTime)])Each File is associated with a pair containing the ModTime of that file, plus a list of each dependency and their modtimes, all from when the rule was last run. As part of executing the rule above, Shake records the association:file -> (modtime file, [(dependency, modtime dependency)])The file is considered dirty if any of the information is no longer current. In this example, if either modtime file changes, or modtime dependency changes.There are a few consequences of the Shake model:There is no requirement for modtime file to change as a result of run. 
The file is dirty because something changed, after we run the rule and record new information it becomes clean.Since a file is not required to change its modtime, things that depend on file may not require rebuilding even if file rebuilds.If you update an output file, it will rebuild that file, as the ModTime of a result is tracked.Shake only ever performs equality tests on ModTime, never ordering, which means it generalises to other types of value and works even if your file-system sometimes has incorrect times.These consequences allow two workflows that aren't pleasant in Make:Generated files, where the generator changes often, but the output of the generator for a given input changes rarely. In Shake, you can rerun the generator regularly, and using a function that writes only on change (writeFileChanged in Shake) you don't rebuild further. This technique can reduce some rebuilds from hours to seconds.Configuration file splitting, where you have a configuration file with lots of key/value pairs, and want certain rules to only depend on a subset of the keys. In Shake, you can generate a file for each key/value and depend only on that key. If the configuration file updates, but only a subset of keys change, then only a subset of rules will rebuild. Alternatively, using Development.Shake.Config you can avoid the file for each key, but the dependency model is the same.Optimising the ModelThe above model expresses the semantics of Shake, but the implementation uses an optimised model. Note that the original Shake paper gives the optimised model, not the easy to understand model - that's because I only figured out the difference a few days ago (thanks to Simon Marlow, Simon Peyton Jones and Andrey Mokhov). To recap, we started with:database :: File -> (ModTime, [(File, ModTime)])We said that File is dirty if any of the ModTime values change. 
That's true, but what we are really doing is comparing the first ModTime with the ModTime on disk, and the list of second ModTime's with those in database. Assuming we are passed the current ModTime on disk, then a file is valid if:valid :: File -> ModTime -> Boolvalid file mNow = mNow == mOld && and [fst (database d) == m | (d,m) <- deps] where (mOld, deps) = database fileThe problem with this model is that we store each File/ModTime pair once for the file itself, plus once for every dependency. That's a fairly large amount of information, and in Shake both File and ModTime can be arbitrarily large for user rules.Let's introduce two assumptions:Assumption 1: A File only has at most one ModTime per Shake run, since a file will only rebuild at most once per run. We use Step for the number of times Shake has run on this project.Consequence 1: The ModTime for a file and the ModTime for its dependencies are all recorded in the same run, so they share the same Step.Assumption 2: We assume that if the ModTime of a File changes, and then changes back to a previous value, we can still treat that as dirty. In the specific case of ModTime that would require time travel, but even for other values it is very rare.Consequence 2: We only use historical ModTime values to compare them for equality with current ModTime values. We can instead record the Step at which the ModTime last changed, assuming all older Step values are unequal.The result is:database :: File -> (ModTime, Step, Step, [File])valid :: File -> ModTime -> Boolvalid file mNow = mNow == mOld && and [sBuild >= changed (database d) | d <- deps] where (mOld, sBuilt, sChanged, deps) = database file changed (_, _, sChanged, _) = sChangedFor each File we store its most recently recorded ModTime, the Step at which it was built, the Step when the ModTime last changed, and the list of dependencies. We now check if the Step for this file is greater than the Step at which dependency last changed. 
Using the assumptions above, the original formulation is equivalent. Note that instead of storing one ModTime per dependency plus one for the file itself, we now store exactly one ModTime plus two small Step values. We still store each file many times, but we reduce that by creating a bijection between File (arbitrarily large) and Id (a small index) and only storing Id.

Implementing the Model

For those who like concrete details, which might change at any point in the future, the relevant definition is in Development.Shake.Database:

data Result = Result
    {result    :: Value   -- the result when last built
    ,built     :: Step    -- when it was actually run
    ,changed   :: Step    -- when the result last changed
    ,depends   :: [[Id]]  -- dependencies
    ,execution :: Float   -- duration of last run
    ,traces    :: [Trace] -- a trace of the expensive operations
    } deriving Show

The differences from the model are:

ModTime became Value, because Shake deals with lots of types of rules.

The dependencies are stored as a list of lists, so we still have access to the parallelism provided by need, and if we start rebuilding some dependencies we can do so in parallel.

We store execution and traces so we can produce profiling reports.

I haven't shown the File/Id mapping here - that lives elsewhere. I removed all strictness/UNPACK annotations from the definition above, and edited a few comments.

As we can see, the code follows the optimised model quite closely.
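The File-to-Id bijection can be sketched with a pair of maps. This is purely illustrative; Shake's real interning lives elsewhere and differs:

```haskell
import qualified Data.Map as M

type Id = Int

-- Bidirectional interning table: arbitrarily large keys to small indices.
data Interns = Interns { toId :: M.Map String Id, fromId :: M.Map Id String }

emptyInterns :: Interns
emptyInterns = Interns M.empty M.empty

-- Look up a file's Id, allocating the next free Id on first sight.
intern :: String -> Interns -> (Id, Interns)
intern f it@(Interns t f2) = case M.lookup f t of
  Just i  -> (i, it)
  Nothing -> let i = M.size t
             in (i, Interns (M.insert f i t) (M.insert i f f2))
```

With this in place, the dependency lists can store compact Ids rather than repeating each File value.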
2014-10-13T20:53:00Z
Neil Mitchell
noreply@blogger.com

Categories: Functional Programming

Here’s another idea for a video game.
The theme of the game is “be consistent”. It's a minimalist-styled 2D platformer. The core mechanic is that whatever you do the first time, the game retroactively makes that the right action. Examples of how this could work:
At the start, you're standing at the center of a 2×2 checkerboard of background colors (plus appropriate greebles, not perfect squares). Say the top left and bottom right is darkish and the other quadrants are lightish. If you move left, then the darkish stuff is sky, the lightish stuff is ground, and the level extends to the left. If you move right, the darkish stuff is ground, and the level extends to the right.
The first time you need to jump, if you press W or up then that's the jump key, or if you press the space bar then that's the jump key. The other key does something else. (This might interact poorly with an initial “push all the keys to see what they do”, though.)
You meet a floaty pointy thing. If you walk into it, it turns out to be a pickup. If you shoot it or jump on it, it turns out to be an enemy.
If you jump in the little pool of water, the game has underwater sections or secrets. If you jump over the little pool, water is deadly.
2014-10-13T17:46:51Z
Kevin Reid (kpreid)
kpreid@switchb.org

Categories: Functional Programming

FIRST CALL FOR PAPERS

12th International Conference on Mathematics of Program Construction, MPC 2015
Königswinter, Germany, 29 June - 1 July 2015
http://www.cs.ox.ac.uk/conferences/MPC2015/

BACKGROUND

The MPC conferences aim to promote the development of mathematical principles and techniques that are demonstrably practical and effective in the process of constructing computer programs, broadly interpreted.

The 2015 MPC conference will be held in Königswinter, Germany, from 29th June to 1st July 2015. The previous conferences were held in Twente, The Netherlands (1989), Oxford, UK (1992), Kloster Irsee, Germany (1995), Marstrand, Sweden (1998), Ponte de Lima, Portugal (2000), Dagstuhl, Germany (2002), Stirling, UK (2004, colocated with AMAST), Kuressaare, Estonia (2006, colocated with AMAST), Marseille, France (2008), Québec City, Canada (2010, colocated with AMAST), and Madrid, Spain (2012).

TOPICS

Papers are solicited on mathematical methods and tools put to use in program construction. Topics of interest range from algorithmics to support for program construction in programming languages and systems. The notion of "program" is broad, from algorithms to hardware. Some typical areas are type systems, program analysis and transformation, programming-language semantics, security, and program logics. Theoretical contributions are welcome, provided that their relevance to program construction is clear. Reports on applications are welcome, provided that their mathematical basis is evident. We also encourage the submission of "pearls": elegant, instructive, and fun essays on the mathematics of program construction.

IMPORTANT DATES

 * Submission of abstracts: 26 January 2015
 * Submission of full papers: 2 February 2015
 * Notification to authors: 16 March 2015
 * Final version: 13 April 2015

SUBMISSION

Submission is in two stages. Abstracts (plain text, 10 to 20 lines) must be submitted by 26 January 2015. Full papers (pdf) adhering to the LaTeX llncs style must be submitted by 2 February 2015. There is no official page limit, but authors should strive for brevity. The web-based system EasyChair will be used for submission (https://easychair.org/conferences/?conf=mpc2015).

Papers must report previously unpublished work, and must not be submitted concurrently to a journal or to another conference with refereed proceedings. Accepted papers must be presented at the conference by one of the authors. Please feel free to write to mpc2015@easychair.org with any questions about academic matters.

The proceedings of MPC 2015 will be published in Springer-Verlag's Lecture Notes in Computer Science series, as have all the previous editions. Authors of accepted papers will be expected to transfer copyright to Springer for this purpose. After the conference, authors of the best papers will be invited to submit revised versions to a special issue of the Elsevier journal Science of Computer Programming.

PROGRAMME COMMITTEE

Ralf Hinze                University of Oxford, UK (chair)
Eerke Boiten              University of Kent, UK
Jules Desharnais          Université Laval, Canada
Lindsay Groves            Victoria University of Wellington, New Zealand
Zhenjiang Hu              National Institute of Informatics, Japan
Graham Hutton             University of Nottingham, UK
Johan Jeuring             Utrecht University and Open University, The Netherlands
Jay McCarthy              Vassar College, US
Bernhard Möller           Universität Augsburg, Germany
Shin-Cheng Mu             Academia Sinica, Taiwan
Dave Naumann              Stevens Institute of Technology, US
Pablo Nogueira            Universidad Politécnica de Madrid, Spain
Ulf Norell                University of Gothenburg, Sweden
Bruno C. d. S. Oliveira   The University of Hong Kong, Hong Kong
José Nuno Oliveira        Universidade do Minho, Portugal
Alberto Pardo             Universidad de la República, Uruguay
Christine Paulin-Mohring  INRIA-Université Paris-Sud, France
Tom Schrijvers            KU Leuven, Belgium
Emil Sekerinski           McMaster University, Canada
Tim Sheard                Portland State University, US
Anya Tafliovich           University of Toronto Scarborough, Canada
Tarmo Uustalu             Institute of Cybernetics, Estonia
Janis Voigtländer         Universität Bonn, Germany

VENUE

The conference will take place in Königswinter, Maritim Hotel, where accommodation has been reserved. Königswinter is situated on the right bank of the river Rhine, opposite Germany's former capital Bonn, at the foot of the Siebengebirge.

LOCAL ORGANIZERS

Ralf Hinze            University of Oxford, UK (co-chair)
Janis Voigtländer     Universität Bonn, Germany (co-chair)
José Pedro Magalhães  University of Oxford, UK
Nicolas Wu            University of Oxford, UK

For queries about local matters, please write to jv@informatik.uni-bonn.de.

Categories: Functional Programming

The latest update of optparse-applicative triggered me to look over the functions in cblrepo for parsing a few custom command line options. I used to do the parsing in a rather ad-hoc way with lots of use of list functions to split on specific characters. For instance, some option values are pairs of package name and version separated by a comma: PKG-NAME,VERSION. This worked fine and was easy to plug into version 0.10 of optparse-applicative. It was also easily extended to triples, PKG-NAME,VERSION,RELEASE, but it started feeling a bit brittle when some tuples got extended with an optional list of flag assignments, PKG-NAME,VERSION[:FLAG,FLAG,FLAG,...]. The recent release of version 0.11 of optparse-applicative changed the API for custom option value parsers radically; instead of passing a string to the parser, the parser has to use readerAsk to get the string. In short, ReadM turned into a state monad.
In adjusting to the new API I noticed that the code was organised in such a way that some low-level parsing functions were used directly from command line option definitions, while also being used as building blocks for the more complex parsers. This of course meant that the structuring of the functions needed to be changed completely to deal with the API change.
It turns out there already was a parser that was written in a different style (here already adjusted to the 0.11 API):
readerGhcVersion :: ReadM Version
readerGhcVersion = do
  arg <- readerAsk
  case lastMay $ readP_to_S parseVersion arg of
    Just (v, "") -> return v
    _ -> fail $ "cannot parse value `" ++ arg ++ "`"
So I rewrote the rest of the parsers in a similar style. The arguably most complicated is this one:
readPkgNVersion :: ReadP (String, Version)
readPkgNVersion = do
  n <- many (satisfy (/= ','))
  _ <- char ','
  v <- parseVersion
  return (n, v)
readFlag :: ReadP (FlagName, Bool)
readFlag = readNegFlag <++ readPosFlag
  where
    readNegFlag = do
      _ <- char '-'
      n <- many (satisfy (/= ','))
      return (FlagName n, False)
    readPosFlag = do
      n0 <- get
      n <- many (satisfy (/= ','))
      return (FlagName (n0 : n), True)
strCblPkgArgReader :: ReadM (String, Version, FlagAssignment)
strCblPkgArgReader = let
    readWithFlags = do
      (n, v) <- readPkgNVersion
      _ <- char ':'
      fas <- sepBy readFlag (char ',')
      return (n, v, fas)
    readWithoutFlags = do
      (n, v) <- readPkgNVersion
      return (n, v, [])
  in do
    s <- readerAsk
    case lastMay (readP_to_S (readWithFlags <++ readWithoutFlags) s) of
      Just (r, "") -> return r
      _ -> fail $ "Cannot parse: " ++ s
It is slightly longer, but it’s rather a lot easier to read what’s happening after this rewrite. ReadP feels like a lighter option than pulling in parsec as a dependency, but I’d love to hear any comments or suggestions, as well as pointers to how other people deal with parsing of non-trivial types of arguments in combination with optparse-applicative.
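For anyone wanting to try the ReadP approach outside cblrepo, here is a minimal, self-contained sketch using only base's Text.ParserCombinators.ReadP and Data.Version. The helper parsePkg is mine, not a cblrepo function, and instead of lastMay (from the safe package) it keeps only parses that consume the whole input:

```haskell
import Text.ParserCombinators.ReadP
import Data.Version (Version, makeVersion, parseVersion)

-- Parse "PKG-NAME,VERSION", e.g. "mtl,2.2.1".
readPkgNVersion :: ReadP (String, Version)
readPkgNVersion = do
  n <- many1 (satisfy (/= ','))
  _ <- char ','
  v <- parseVersion
  return (n, v)

-- Run the parser, keeping only results that consume the entire input.
parsePkg :: String -> Maybe (String, Version)
parsePkg s = case [r | (r, "") <- readP_to_S readPkgNVersion s] of
  (r:_) -> Just r
  _     -> Nothing
```

For example, parsePkg "mtl,2.2.1" yields Just ("mtl", makeVersion [2,2,1]), while input without a comma yields Nothing.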
2014-10-13T00:00:00Z

Categories: Functional Programming

Non-diffusive atmospheric flow #7: PCA for spatio-temporal data
October 12, 2014
Although the basics of the “project onto eigenvectors of the covariance matrix” prescription do hold just the same in the case of spatio-temporal data as in the simple two-dimensional example we looked at in the earlier article, there are a number of things we need to think about when we come to look at PCA for spatio-temporal data. Specifically, we need to think about data organisation, the interpretation of the output of the PCA calculation, and the interpretation of PCA as a change of basis in a spatio-temporal setting. Let’s start by looking at data organisation.
The Z500 anomaly data we want to analyse has 66 × 151 = 9966 days of data, each of which has 72 × 15 = 1080 spatial points. In our earlier two-dimensional PCA example, we performed PCA on a collection of two-dimensional data points. For the Z500 data, it’s pretty clear that the “collection of points” covers the time steps, and each “data point” is a 72 × 15 grid of Z500 values. We can think of each of those grids as a 1080-dimensional vector, just by flattening all the grid values into a single row, giving us a sequence of “data points” as vectors in ℝ1080 that we can treat in the same kind of way as we did the two-dimensional data points in the earlier example. Our input data thus ends up being a set of 9966 1080-dimensional vectors, instead of 500 two-dimensional vectors (as for the mussel data). If we do PCA on this collection of 1080-dimensional vectors, the PCA eigenvectors will have the same shape as the input data vectors, so we can interpret them as spatial patterns, just by inverting the flattening we did to get from spatial maps of Z500 to vectors – as long as we interpret each entry in the eigenvectors as the same spatial point as the corresponding entry in the input data vectors, everything works seamlessly. The transformation goes like this:
pattern → vector → PCA → eigenvector → eigenpattern
So we have an interpretation of the PCA eigenvectors (which we’ll henceforth call “PCA eigenpatterns” to emphasise that they’re spatial patterns of variability) in this spatio-temporal data case. What about the PCA eigenvalues? These have exactly the same interpretation as in the two-dimensional case: they measure the variance “explained” by each of the PCA eigenpatterns. And finally, the PCA projected components tell us how much of each PCA eigenpattern is present in each of the input data vectors. Since our input data has one spatial grid per time step, the projections give us one time series for each of the PCA eigenvectors, i.e. one time series of PCA projected components per spatial point in the input. (In one way, it’s kind of obvious that we need this number of values to reproduce the input data perfectly – I’ll say a little more about this when we think about what “basis” means in this setting.)
The PCA calculation works just the same as it did for the two-dimensional case: starting with our 1080-dimensional data, we centre the data, calculate the covariance matrix (which in this case is a 1080 × 1080 matrix, the diagonal entries of which measure the variances at each spatial point and the off-diagonal entries of which measure the covariances between each pair of spatial points), perform an eigendecomposition of the covariance matrix, then project each of the input data points onto each of the eigenvectors of the covariance matrix.
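To make those steps concrete, here is a toy, dependency-free version of the pipeline for the two-dimensional case, using the closed-form eigenvalues of a symmetric 2 × 2 matrix in place of hmatrix's SVD. This is a sketch only, not the pca function from the earlier post:

```haskell
type Vec2 = (Double, Double)

mean2 :: [Vec2] -> Vec2
mean2 xs = (avg (map fst xs), avg (map snd xs))
  where avg ys = sum ys / fromIntegral (length ys)

-- Sample covariance matrix entries (cxx, cxy, cyy) of the centred data.
cov2 :: [Vec2] -> (Double, Double, Double)
cov2 xs = ( s [ dx * dx | (dx, _) <- ds ]
          , s [ dx * dy | (dx, dy) <- ds ]
          , s [ dy * dy | (_, dy) <- ds ] )
  where (mx, my) = mean2 xs
        ds = [ (x - mx, y - my) | (x, y) <- xs ]
        s vs = sum vs / fromIntegral (length xs - 1)

-- Eigenvalues of [[cxx, cxy], [cxy, cyy]], largest first.
eigvals2 :: (Double, Double, Double) -> (Double, Double)
eigvals2 (cxx, cxy, cyy) = (tr / 2 + d, tr / 2 - d)
  where tr = cxx + cyy
        d  = sqrt ((cxx - cyy) ^ 2 / 4 + cxy * cxy)
```

For perfectly correlated data such as [(1,1),(2,2),(3,3)], all the variance lands on the first eigenvalue and the second is zero, which is exactly the "one dominant pattern" situation described above.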
We’ve talked about PCA as being nothing more than a change of basis, in the two-dimensional case from the “normal” Euclidean basis (with unit basis vectors pointing along the x- and y-coordinate axes) to another orthonormal basis whose basis vectors are the PCA eigenvectors. How does this work in the spatio-temporal setting? This is probably the point that confuses most people in going from the simple two-dimensional example to the N-dimensional spatio-temporal case, so I’m going to labour the point a bit to make things as clear as possible.
First, what’s the “normal” basis here? Each time step of our input data specifies a Z500 value at each point in space – we have one number in our data vector for each point in our grid. In the two-dimensional case, we had one number for each of the mussel shell measurements we took (length and width). For the Z500 data, the 1080 data values are the Z500 values measured at each of the spatial points. In the mussel shell case, the basis vectors pointed in the x-axis direction (for shell length) and the y-axis direction (for the shell width). For the Z500 case, we somehow need basis vectors that point in each of the “grid point directions”, one for each of the 1080 grid points. What do these look like? Imagine a spatial grid of the same shape (i.e. 72 × 15) as the Z500 data, where all the grid values are zero, except for one point, which has a grid value of one. That is a basis vector pointing in the “direction” of the grid point with the non-zero data value. We’re going to call this the “grid” basis for brevity. We can represent the Z500 value at any spatial point (i,j) as
Z500(i,j) = ∑_{k=1}^{1080} ϕ_k e_k(i,j)

where e_k(i,j) is zero unless k = 15(i−1) + j, in which case it’s one (i.e. it’s exactly the basis element we just described, where we’re numbering the basis elements in row-major order) and ϕ_k is a “component” in the expansion of the Z500 field using this grid basis. Now obviously here, because of the basis we’re using, we can see immediately that ϕ_{15(i−1)+j} = Z500(i,j), but this expansion holds for any orthonormal basis, so we can transform to a basis where the basis vectors are the PCA eigenvectors, just as for the two-dimensional case. If we call these eigenvectors ẽ_k(i,j), then

Z500(i,j) = ∑_{k=1}^{1080} ϕ̃_k ẽ_k(i,j),

where the ϕ̃_k are the components in the PCA eigenvector basis. Now though, the ẽ_k(i,j) aren’t just the “zero everywhere except at one point” grid basis vectors, but they can have non-zero values anywhere.
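The row-major flattening used in this numbering can be written out explicitly (1-based indices over the 72 × 15 grid; the function names here are mine, for illustration):

```haskell
nlat :: Int
nlat = 15  -- points per row in the row-major numbering

-- 1-based (i, j) grid index to 1-based flat index k = 15(i-1) + j.
flatten :: (Int, Int) -> Int
flatten (i, j) = nlat * (i - 1) + j

-- Inverse mapping, recovering the grid position from the flat index.
unflatten :: Int -> (Int, Int)
unflatten k = (q + 1, r + 1)
  where (q, r) = divMod (k - 1) nlat
```

So the first grid point maps to k = 1 and the last to k = 1080, and applying unflatten after flatten gets back the original (i, j).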
Compare this to the case for the two-dimensional example, where we started with data in a basis that had separate measurements for shell length and shell width, then transformed to the PCA basis where the length and width measurements were “mixed up” into a sort of “size” measurement and a sort of “aspect ratio” measurement. The same thing is happening here: instead of looking at the Z500 data in terms of the variations at individual grid points (which is what we see in the grid basis), we’re going to be able to look at variations in terms of coherent spatial patterns that span many grid points. And because of the way that PCA works, those patterns are the “most important”, in the sense that they are the orthogonal (which in this case means uncorrelated) patterns that explain most of the total variance in the Z500 data.
As I’ve already mentioned, I’m going to try to be consistent in terms of the terminology I use: I’m only ever going to talk about “PCA eigenvalues”, “PCA eigenpatterns”, and “PCA projected components” (or “PCA projected component time series”). Given the number of discussions I’ve been involved in in the past where people have been talking past each other just because one person means one thing by “principal component” and the other means something else, I’d much rather pay the price of a little verbosity to avoid that kind of confusion.
Z500 PCA calculation
The PCA calculation for the Z500 data can be done quite easily in Haskell. We’ll show in this section how it’s done, and we’ll use the code to address a couple of remaining issues with how spatio-temporal PCA works (specifically, area scaling for data in latitude/longitude coordinates and the relative scaling of PCA eigenpatterns and projected components).
There are three main steps to the PCA calculation: first we need to centre our data and calculate the covariance matrix, then we need to do the eigendecomposition of the covariance matrix, and finally we need to project our original data onto the PCA eigenvectors. We need to think a little about the data volumes involved in these steps. Our Z500 data has 1080 spatial points, so the covariance matrix will be a 1080 × 1080 matrix, i.e. it will have 1,166,400 entries. This isn’t really a problem, and performing an eigendecomposition of a matrix of this size is pretty quick. What can be more of a problem is the size of the input data itself – although we only have 1080 spatial points, we could in principle have a large number of time samples, enough that we might not want to read the whole of the data set into memory at once for the covariance matrix calculation. We’re going to demonstrate two approaches here: in the first “online” calculation, we’ll just read all the data at once and assume that we have enough memory; in the second “offline” approach, we’ll only ever read a single time step of Z500 data at a time into memory. Note that in both cases, we’re going to calculate the full covariance matrix in memory and do a direct eigendecomposition using SVD. There are offline approaches for calculating the covariance and there are iterative methods that allow you to calculate some eigenvectors of a matrix without doing a full eigendecomposition, but we’re not going to worry about that here.
Online PCA calculation
As usual, the code is in a Gist.
For the online calculation, the PCA calculation itself is identical to our two-dimensional test case and we reuse the pca function from the earlier post. The only thing we need to do is to read the data in as a matrix to pass to the pca function. In fact, there is one extra thing we need to do before passing the Z500 anomaly data to the pca function. Because the Z500 data is sampled on a regular latitude/longitude grid, grid points near the North pole correspond to much smaller areas of the earth than grid points closer to the equator. In order to compensate for this, we scale the Z500 anomaly data values by the square root of the cosine of the latitude – this leads to covariance matrix values that scale as the cosine of the latitude, which gives the correct area weighting. The listing below shows how we do this. First we read the NetCDF data then we use the hmatrix build function to construct a suitably scaled data matrix:
Right z500short <- get innc z500var :: RepaRet3 CShort
-- Convert anomaly data to a matrix of floating point values,
-- scaling by square root of cos of latitude.
let latscale = SV.map (\lt -> realToFrac $ sqrt $ cos (lt / 180.0 * pi)) lat
    z500 = build (ntime, nspace)
             (\t s -> let it = truncate t :: Int
                          is = truncate s :: Int
                          (ilat, ilon) = divMod is nlon
                          i = Repa.Z Repa.:. it Repa.:. ilat Repa.:. ilon
                      in (latscale ! ilat) *
                         (fromIntegral $ z500short Repa.! i)) :: Matrix Double
Once we have the scaled Z500 anomaly data in a matrix, we call the pca function, which does both the covariance matrix calculation and the PCA eigendecomposition and projection, then write the results to a NetCDF file. We end up with a NetCDF file containing 1080 PCA eigenpatterns, each with 72 × 15 data points on our latitude/longitude grid and PCA projected component time series each with 9966 time steps.
One very important thing to note here is the relative scaling of the PCA eigenpatterns and the PCA projected component time series. In the two-dimensional mussel shell example, there was no confusion about the fact that the PCA eigenvectors as we presented them were unit vectors, and the PCA projected components had the units of length measured along those unit vectors. Here, in the spatio-temporal case, there is much potential for confusion (and the range of conventions in the climate science literature doesn’t do anything to help alleviate that confusion). To make things very clear: here, the PCA eigenvectors are still unit vectors and the PCA projected component time series have the units of Z500!
The reason for the potential confusion is that people quite reasonably like to draw maps of the PCA eigenpatterns, but they also like to think of these maps as being spatial patterns of Z500 variation, not just as basis vectors. This opens the door to all sorts of more or less reputable approaches to scaling the PCA eigenpatterns and projected components. One well-known book on statistical analysis in climate research suggests that people should scale their PCA eigenpatterns by the standard deviation of the corresponding PCA projected component time series and the values of the PCA projected component time series should be divided by their standard deviation. The result of this is that the maps of the PCA eigenpatterns look like Z500 maps and all of the PCA projected component time series have standard deviation of one. People then talk about the PCA eigenpatterns as showing a “typical ± 1 SD” event.
Here, we’re going to deal with this issue by continuing to be very explicit about what we’re doing. In all cases, our PCA eigenpatterns will be unit vectors, i.e. the things we get back from the pca function, without any scaling. That means that the units in our data live on the PCA projected component time series, not on the PCA eigenpatterns. When we want to look at a map of a PCA eigenpattern in a way that makes it look like a “typical” Z500 deviation from the mean (which is a useful thing to do), we will say something like “This plot shows the first PCA eigenpattern scaled by the standard deviation of the first PCA projected component time series.” Just to be extra explicit!
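As a concrete (and purely illustrative) statement of that convention, the displayed pattern is the unit eigenpattern multiplied pointwise by the standard deviation of its projected component series; the helper names below are mine:

```haskell
-- Sample standard deviation of a projected component time series.
stddev :: [Double] -> Double
stddev xs = sqrt (sum [ (x - m) ^ 2 | x <- xs ] / fromIntegral (length xs - 1))
  where m = sum xs / fromIntegral (length xs)

-- Scale a unit-vector eigenpattern for display as a "typical 1 SD" map.
scalePattern :: [Double] -> [Double] -> [Double]
scalePattern series pat = map (* stddev series) pat
```

The scaling is for display only; all the analysis continues to use the unit-vector eigenpatterns.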
Offline PCA calculation
The “online” PCA calculation didn’t require any extra work, apart from some type conversions and the area scaling we had to do. But what if we have too much data to read everything into memory in one go? Here, I’ll show you how to do a sort of “offline” PCA calculation. By “offline”, I mean an approach that only ever reads a single time step of data from the input at a time, and only ever writes a single time step of the PCA projected component time series to the output at a time.
Because we’re going to be interleaving calculation and I/O, we’re going to need to make our PCA function monadic. Here’s the main offline PCA function:
pcaM :: Int -> (Int -> IO V) -> (Int -> V -> IO ()) -> IO (V, M)
pcaM nrow readrow writeproj = do
  (mean, cov) <- meanCovM nrow readrow
  let (_, evals, evecCols) = svd cov
      evecs = fromRows $ map evecSigns $ toColumns evecCols
      evecSigns ev = let maxelidx = maxIndex $ cmap abs ev
                         sign = signum (ev ! maxelidx)
                     in cmap (sign *) ev
      varexp = scale (1.0 / sumElements evals) evals
      project x = evecs #> (x - mean)
  forM_ [0..nrow-1] $ \r -> readrow r >>= writeproj r . project
  return (varexp, evecs)
It makes use of a couple of convenience type synonyms:
type V = Vector Double
type M = Matrix Double
The pcaM function takes as arguments the number of data rows to process (in our case, the number of time steps), an IO action to read a single row of data (given the zero-based row index), and an IO action to write a single row of PCA projected component time series data. As with the “normal” pca function, the pcaM function returns the PCA eigenvalues and PCA eigenvectors as its result.
Most of the pcaM function is the same as the pca function. There are only two real differences. First, the calculation of the mean and covariance of the data uses the meanCovM function that we’ll look at in a moment. Second, the writing of the PCA projected component time series output is done by a monadic loop that uses the IO actions passed to pcaM to alternately read, project and write out rows of data (the pca function just returned the PCA projected component time series to its caller in one go).
Most of the real differences to the pca function lie in the calculation of the mean and covariance of the input data:
meanCovM :: Int -> (Int -> IO V) -> IO (V, M)
meanCovM nrow readrow = do
  -- Accumulate values for mean calculation.
  refrow <- readrow 0
  let maddone acc i = do
        row <- readrow i
        return $! acc + row
  mtotal <- foldM maddone refrow [1..nrow-1]
  -- Calculate sample mean.
  let mean = scale (1.0 / fromIntegral nrow) mtotal
  -- Accumulate differences from mean for covariance calculation.
  let refdiff = refrow - mean
      caddone acc i = do
        row <- readrow i
        let diff = row - mean
        return $! acc + (diff `outer` diff)
  ctotal <- foldM caddone (refdiff `outer` refdiff) [1..nrow-1]
  -- Calculate sample covariance.
  let cov = scale (1.0 / fromIntegral (nrow - 1)) ctotal
  return (mean, cov)
Since we don’t want to read more than a single row of input data at a time, we need to explicitly accumulate data for the mean and covariance calculations. That means making two passes over the input data file, reading a row at a time – the maddone and caddone helper functions accumulate a single row of data for the mean and covariance calculations. The accumulator for the mean is pretty obvious, but that for the covariance probably deserves a bit of comment. It uses the hmatrix outer function to calculate (x_i − x̄)(x_i − x̄)ᵀ (where x_i is the i-th data row, as a column vector, and x̄ is the data mean), which is the appropriate contribution to the covariance matrix for each individual data row.
Overall, the offline PCA calculation makes three passes over the input data file (one for the mean, one for the covariance, one to project the input data onto the PCA eigenvectors), reading a single data row at a time. That makes it pretty slow, certainly far slower than the online calculation, which reads all of the data into memory in one go, then does all the mean, covariance and projection calculations in memory, and finally writes out the PCA projected components in one go. However, if you have enough data that you can’t do an online calculation, this is the way to go. You can obviously imagine ways to make this more efficient, probably by reading batches of data rows at a time. You’d still need to do three passes over the data, but batching the reads would make things a bit quicker.
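The same mean/covariance accumulation can be sketched without hmatrix, using plain lists held in memory (outer here is a toy analogue of hmatrix's function; this is for exposition, not performance, and assumes a non-empty row list):

```haskell
type Vec = [Double]
type Mat = [[Double]]

-- Toy outer product: column vector times row vector.
outer :: Vec -> Vec -> Mat
outer u v = [ [ x * y | y <- v ] | x <- u ]

addM :: Mat -> Mat -> Mat
addM = zipWith (zipWith (+))

-- Sample mean and covariance of a list of data rows.
meanCov :: [Vec] -> (Vec, Mat)
meanCov rows = (mean, cov)
  where n     = fromIntegral (length rows)
        mean  = map (/ n) (foldr1 (zipWith (+)) rows)
        diffs = [ zipWith (-) r mean | r <- rows ]
        cov   = map (map (/ (n - 1))) (foldr1 addM [ d `outer` d | d <- diffs ])
```

Each row contributes one rank-one outer product of its deviation from the mean, exactly as caddone does above.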
Visualising the PCA results
There are three things we can look at that come out of the PCA analysis of the Z500 anomaly data: the PCA eigenvalues (best expressed as “fraction of variance explained”), the PCA eigenpatterns and the PCA projected component time series.
First, let’s look at the eigenvalues. This plot shows the fraction of variance explained for the first 100 PCA eigenvalues of the Z500 anomaly data, both individually (blue) and cumulatively (orange):
The eigenvalues are ordered in decreasing order of magnitude in what’s usually called a “scree plot”. The reason for the name is pretty obvious, since the eigenvalues fall off quickly in magnitude giving the graph the look of a cliff face with a talus slope at its foot. We often look for a “knee” in a plot like this to get some idea of how many PCA eigenpatterns we need to consider to capture a good fraction of the total variance in the data we’re looking at. Here we can see that just ten of the PCA eigenpatterns explain about half of the total variance in the Z500 anomaly data (which is a set of 1080-dimensional vectors, remember). However, there’s not all that much of a “knee” in the scree plot here, which is pretty typical for climate and meteorological data – we often see a gradual fall-off in PCA eigenvalue magnitude rather than a discrete set of larger magnitude eigenvalues that we can identify as “the important ones”.
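The quantities in a plot like this are simple to compute from the eigenvalues (the helper names here are mine, for illustration):

```haskell
-- Fraction of total variance explained by each eigenvalue.
explained :: [Double] -> [Double]
explained evals = map (/ sum evals) evals

-- Running cumulative fraction of variance explained.
cumulative :: [Double] -> [Double]
cumulative = scanl1 (+)
```

For eigenvalues [2, 1, 1], the explained fractions are [0.5, 0.25, 0.25] and the cumulative curve reaches 1.0 at the last eigenvalue.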
We can get some idea of what’s going on with this gradual fall-off by looking at the PCA eigenpatterns. As mentioned in the previous section, there is a question about how we scale these for display. To be completely explicit about things, here we’re going to plot PCA eigenpatterns scaled by the standard deviation of the corresponding PCA projected component time series. This gives us “typical one standard deviation” patterns that we can plot with units of geopotential height. These are usually easier to interpret than the “unit vector” PCA eigenpatterns than come out of the PCA calculation.
Here are the first six PCA eigenpatterns for the Z500 anomaly data (you can click on these images to see larger versions; the numbers in parentheses show the fraction of total Z500 anomaly variance explained by each PCA eigenpattern.):
For comparison, here are eigenpatterns 10, 20, …, 60:
The first thing to note about these figures is that the spatial scales of variation for the PCA eigenpatterns corresponding to smaller eigenvalues (i.e. smaller explained variance fractions) are also smaller – for the most extreme case, compare the dipolar circumpolar spatial pattern for the first eigenpattern (first plot in the first group of plots) to the fine-scale spatial features for the 60th eigenpattern (last plot in the second group). This is what we often seen when we do PCA on atmospheric data. The larger spatial scales capture most of the variability in the data so are represented by the first few eigenpatterns, while smaller scale spatial variability is represented by later eigenpatterns. Intuitively, this is probably related to the power-law scaling in the turbulent cascade of energy from large (planetary) scales to small scales (where dissipation by thermal diffusion occurs) in the atmosphere1.
The next thing we can look at, at least in the first few patterns in the first group of plots, are some of the actual patterns of variability these things represent. The first PCA eigenpattern, for example, represents a dipole in Z500 anomaly variability with poles in the North Atlantic just south of Greenland and over mainland western Europe. If you look back at the blocking Z500 anomaly plots in an earlier post, you can kind of convince yourself that this first PCA eigenpattern looks a little like some instances of a blocking pattern over the North Atlantic. Similarly, the second PCA eigenpattern is mostly a dipole between the North Pacific and North America (with some weaker associated variability over the Atlantic), so we might expect this somehow to be related to blocking episodes in the Pacific sector.
This is all necessarily a bit vague, because these patterns represent only part of the variability in the data, with each individual pattern representing only a quite small fraction of the variability (8.86% for the first eigenpattern, 7.46% for the second, 6.27% for the third). At any particular point in time, the pattern of Z500 anomalies in the atmosphere will be made up of contributions from these patterns plus many others. What we hope though is that we can tease out some interesting characteristics of the atmospheric flow by considering just a subset of these PCA eigenpatterns. Sometimes this is really easy and obvious – if you perform PCA and find that there are two leading eigenpatterns that explain 80% of the variance in your data, then you can quite straightforwardly press ahead with analysing only those two patterns of variability, safe in the knowledge that you’re capturing most of what’s going on in your data. In our case, we’re going to try to get some sense of what’s going on by looking at only the first three PCA eigenpatterns (we’ll see why three in the next article). The first three eigenpatterns explain only 22.59% of the total variance in our Z500 anomaly data, so this isn’t obviously a smart thing to do. It does turn out to work and to be quite educational though!
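The explained variance fractions quoted above (8.86%, 7.46%, 6.27%, and the 22.59% cumulative total for the first three patterns) come straight from the PCA eigenvalues. As a minimal sketch of that bookkeeping, here is a hypothetical pair of helpers over plain lists; the function names are illustrative, not the code used in this series, and a real version would of course work on whatever vector type the PCA routine returns:

```haskell
-- Hypothetical sketch: explained variance fractions from PCA eigenvalues.
-- Assumes the eigenvalues are given in decreasing order.

-- Fraction of total variance explained by each eigenpattern.
explainedVariance :: [Double] -> [Double]
explainedVariance evals = map (/ total) evals
  where total = sum evals

-- Cumulative fraction explained by the first 1, 2, 3, ... eigenpatterns;
-- the kth entry is the number to check before truncating to k patterns.
cumulativeVariance :: [Double] -> [Double]
cumulativeVariance = scanl1 (+) . explainedVariance
```

For instance, with eigenvalues `[4, 3, 2, 1]`, `cumulativeVariance` reports that the first two patterns explain 70% of the variance, which is the kind of check used above to decide whether keeping only a few patterns is sensible.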
The last component of the output from the PCA procedure is the time series of PCA projected component values. Here we have one time series (of 9966 days) for each of the 1080 PCA eigenpatterns that we produced. At each time step, the actual Z500 anomaly field can be recovered by adding up all the PCA eigenpatterns, each weighted by the corresponding projected component. You can look at plots of these time series, but they’re not in themselves all that enlightening. I’ll say some more about them in the next article, where we need to think about the autocorrelation properties of these time series.
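To make the reconstruction step concrete: at each time step the anomaly field is the component-weighted sum of the eigenpatterns. Here is a minimal, hypothetical sketch of that sum, with each field flattened to a list of grid-point values; the names and representation are illustrative only (the real analysis uses proper matrix types, and truncating the lists to the first few patterns gives the reduced description discussed above):

```haskell
-- Hypothetical sketch: rebuild one time step's anomaly field by summing
-- eigenpatterns weighted by that time step's projected component values.
-- Each eigenpattern is a flattened list of grid-point values.
reconstruct :: [[Double]]  -- eigenpatterns, one list per pattern
            -> [Double]    -- projected components for this time step
            -> [Double]    -- reconstructed anomaly field
reconstruct patterns comps =
  foldr (zipWith (+)) zeroField weighted
  where
    weighted  = zipWith (\c p -> map (c *) p) comps patterns
    zeroField = replicate (length (head patterns)) 0
```

Using all 1080 patterns recovers the original Z500 anomaly field exactly (up to rounding); dropping all but the first three components gives the truncated view used in the rest of this analysis.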
(As a side note, I’d comment that the PCA eigenpatterns shown above match up pretty well with those in Crommelin’s paper, which is reassuring. The approach we’re taking here, of duplicating the analysis done in an existing paper, is actually a very good way to go about developing new data analysis code – you can see quite quickly if you screw things up as you’re going along by comparing your results with what’s in the paper. Since I’m just making up all the Haskell stuff here as I go along, this is pretty handy!)
But don’t make too much of that, not in any kind of quantitative sense anyway – there’s certainly no obvious power-law scaling in the explained variance of the PCA eigenpatterns as a function of eigenvalue index, unless you look at the data with very power-law-tinted spectacles! I’m planning to look at another paper at some point in the future that will serve as a good vehicle for exploring this question of when and where we can see power-law behaviour in observational data.↩
data-analysis haskell
2014-10-12T18:46:21Z

Categories: Functional Programming

I wanted to try out a recent feature in GHC this week, so I was using HEAD. When I say using, I mean it: I wasn’t developing with it, but using it to build Ivory, our safe C EDSL, as well as some libraries written in Ivory. I want to point out a few dragons that lay […]

Categories: Functional Programming