Functional programming is one type of declarative programming.
In functional programs, it still feels like you're giving explicit instructions, but the language/compiler has more freedom in interpreting details.
Start by thinking of imperative programming, but…
- no state (global variables, etc.);
- no variables (not even for loop counters or while conditions).

How can we get anything done without variables? The emphasis is on evaluating calculations, not executing steps.
We still get function arguments. e.g. x is a function argument, not a variable, in these:
def half_of(x): return x/2
double half_of(double x) { return x/2; }
half_of x = x/2
Recursion can take the place of loops.
What we express is "the calculation we need the result of" instead of "follow these steps". That is what gives the compiler more freedom.
Having no variables means (should mean?) easier debugging: results of a function are determined entirely by its arguments.
Haskell is a functional programming language.
It's very functional: purely functional. It doesn't allow you to sneak imperative tricks in.
Some other functional languages allow some imperative things to happen (Lisp, Scheme, OCaml, Erlang). They are arguably more practical, but [I claim] not as good at helping you learn functional style programming.
So, we're using Haskell.
We will use GHC (the Glasgow Haskell Compiler) as our compiler. It's the standard toolchain for Haskell.
Also, it includes the GHCi interactive environment where you can test expressions/code, as well as interact with programs/modules.
Prelude> 2-4
-2
Prelude> 2*3 /= 6
False
Prelude> reverse [2,3,4,5]
[5,4,3,2]
To load code from a .hs file, use :l
Prelude> :l mycode.hs
[1 of 1] Compiling Main ( mycode.hs, interpreted )
Ok, modules loaded: Main.
*Main>
To reload changes to the .hs file, use :r
*Main> :r
Ok, modules loaded: Main.
Basic data types (incomplete):
- numbers: 4, 6.3, (-4)
- strings: "abc", "6.3"
- lists: [1, 2, 3, 4], ["abc", "def"]
- booleans: True, False
Numeric expressions work like you probably expect:
Expression | Result |
---|---|
3 + 4 | 7 |
3 + 4 * 2 | 11 |
2 ^ 8 | 256 |
3.0 + 4 | 7.0 |
8.0 / 2 | 4.0 |
… and boolean expressions:
Expression | Result |
---|---|
not True | False |
False || True | True |
False && True | False |
6 == 4 | False |
6 /= 4 | True (≠ comparison) |
6 <= 4 | False |
One basic list operation: concatenate:
Expression | Result |
---|---|
[1,2,3] ++ [4,5] | [1,2,3,4,5] |
[] ++ [1,2] | [1,2] |
Another: cons (short for construct):
Expression | Result |
---|---|
3 : [4,5] | [3,4,5] |
2 : (3 : (4 : [])) | [2,3,4] |
2 : 3 : 4 : [] | [2,3,4] |
Note: The (:) operator has a list element on the left, and a list on the right. The element is prepended to the start of the list. The operator is right-associative, so the last two examples are equivalent.
Characters are indicated with single quotes and strings with double quotes (like C):
Expression | Result |
---|---|
'a' | a character |
"a" | a string |
"abc" | a string |
'abc' | an error |
"abc" ++ "def" | "abcdef" |
'a' ++ "bcd" | an error |
Strings are just shorthand for lists of characters (like C):
Expression | Result |
---|---|
"ab" ++ ['c', 'd'] | "abcd" |
'a' : "bc" | "abc" |
['X', 'Y'] == "XY" | True |
Haskell lists are conceptually singly-linked lists: imagine a head with a pointer to the rest of the list.
Not only does 1:2:[] == [1,2], but you could think of the syntax [1,2] as just a shorthand for 1:2:[].
Functions are (obviously) important in functional languages. They are the primary unit of code (as opposed to classes, statements, etc).
Calling functions in Haskell looks different than you might expect: no parens around arguments; just spaces separate arguments. It may take some getting used to.
Most Languages | Haskell Equivalent |
---|---|
f(a, b) | f a b |
f(a, b+1) | f a (b+1) |
f(a, b) + 1 | f a b + 1 or (f a b) + 1 |
f(g(a), b) | f (g a) b |
Haskell knows what is a function and how many arguments it takes: it expects the right number of arguments to follow (except when it doesn't: more later).
So, if f is a two-argument function, this is sensible:
f 1 2
… and these are incomplete and have an extra argument, respectively (but more later):
f 1
f 1 2 3
Defining functions (at least for simple expressions) is easy (filename functions1.hs):
half_of x = x/2
isBig a = a > 100
listify x y = [x, y]
And then you can use them:
Prelude> :l functions1.hs
[1 of 1] Compiling Main ( functions1.hs, interpreted )
Ok, modules loaded: Main.
*Main> half_of 5
2.5
*Main> isBig 123
True
*Main> listify 'a' 'b'
"ab"
A simple expression doesn't let us get much logic into a function.
The most Haskell-ish way to express a condition is pattern matching, where we can express "if the arguments look like this, the result is…".
For example, a function for a logical and:
myAnd False _ = False
myAnd True a = a
Things happening there:
- The first pattern only matches if the first argument is the literal False.
- _ matches any value (but ignores it).
- The second pattern requires the first argument to be True; the result is then the second argument.
The behaviour is a logical and operation:
*Main> :r
Ok, modules loaded: Main.
*Main> myAnd True True
True
*Main> myAnd True False
False
*Main> myAnd False True
False
*Main> myAnd False False
False
Rule: the first definition that matches applies.
isZero 0 = True
isZero _ = False
*Main> isZero 0
True
*Main> isZero 1
False
Recursion works and we can match a cons pattern (first and rest of a list). We could define something like:
myLength [] = 0
myLength (_:xs) = 1 + myLength xs
*Main> myLength [1, 2, 3]
3
*Main> myLength "Hello World"
11
The cons pattern is very common for recursing over a list.
processList [] = …                        -- base case
processList (x:xs) = … processList xs …   -- recursive case
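For instance, here is a small sketch filling in that template (mySum is my example name, not from the notes), adding up the numbers in a list:

```haskell
mySum []     = 0              -- base case: the empty list sums to 0
mySum (x:xs) = x + mySum xs   -- recursive case: first element, plus the sum of the rest
```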
Another example: are two lists equal?
listEqual [] [] = True
listEqual (x:xs) (y:ys) = x == y && listEqual xs ys
*Main> listEqual [1,2,3] [1,2,3]
True
*Main> listEqual [1,2,3] [1,2,4]
False
Problem?
The patterns in the function definition miss the case of non-equal length lists.
*Main> listEqual [1,2,3] [1,2,3,4]
*** Exception: functions1.hs:(16,1)-(17,51): Non-exhaustive patterns in function listEqual
We can add these cases to cover those inputs:
listEqual [] (_:_) = False
listEqual (_:_) [] = False
(or listEqual _ _ = False and use the first-match-only behaviour.)
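One way to assemble the whole definition, using the catch-all and the first-match rule:

```haskell
listEqual [] []         = True
listEqual (x:xs) (y:ys) = x == y && listEqual xs ys
listEqual _ _           = False   -- any remaining case: lists of different lengths
```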
We can match whatever patterns are necessary to solve our problem.
e.g. Does a list have even length? We can do that by matching two elements from the start of a list and recursing.
hasEvenLength [] = True            -- base case: length 0 is even
hasEvenLength [_] = False          -- base case: length 1 is odd
hasEvenLength (_:_:rest) = hasEvenLength rest
Rules for pattern matching:
- A variable name (like x): matches anything and captures the value.
- _: matches anything but we don't need the value.
- A literal (1, [], etc.): matches if equal.

Matching lists:
- []: matches an empty list.
- [a]: matches a one-element list.
- (a:tail): matches a list with at least one element. First is a, rest of list is tail (and could be an empty list).
- (a:b:tail): matches a list with at least two elements. First is a, second is b, remainder of list is tail (could be empty).

For example, if we have the pattern:
(a:b:c)
… then these lists match as follows:
List | a | b | c |
---|---|---|---|
[6,7,8,9] | 6 | 7 | [8,9] |
['a', 'b'] | 'a' | 'b' | [] |
"ab" | 'a' | 'b' | [] (== "" ) |
[100] | no match | | |
Pattern matching produces nice-looking code, but it can't do everything. There are a few ways to write more flexible conditional code.
There is an if/else expression, but I'm banning it: it's not Haskell-ish, and it's often used by students to avoid learning the functional style.
Much nicer: guarded expressions
mySignum x
    | x>0 = 1
    | x<0 = -1
    | otherwise = 0
Analogous to the mathematical function:
\[ \mathit{signum}(x) = \begin{cases} 1 & \text{if } x>0\\ -1 & \text{if } x<0\\ 0 & \text{otherwise}\\ \end{cases} \]
There is also a case expression to give multiple alternatives in a convenient way:
word n = case n of
    1 -> "one"
    2 -> "two"
    3 -> "three"
    _ -> "???"
The case is an expression: it can be used just like any other Haskell expression:
wordWithX n = (case n of
    1 -> "one"
    2 -> "two"
    3 -> "three"
    _ -> "???") ++ "X"
All of the values for the cases are patterns, just like in function definitions:
describeList lst = "The list is " ++ case lst of
    _:_:_:_:_ -> "fairly long"   -- >= 4 elements
    _:_       -> "short"         -- >= 1 element
    []        -> "empty"
*Main> describeList []
"The list is empty"
*Main> describeList [3,4]
"The list is short"
*Main> describeList "hello"
"The list is fairly long"
Since there are no variables, we can't write loops like we're used to in other languages.
Loops (as we think of them) need a variable (i here) that determines when the loop ends: no variable = no way to end the loop.
for ( int i=0; i<10; i++ ) {
    cout << i << '\n';
}
int i = 100;
while ( i>1 ) {
    cout << i << '\n';
    i /= 2;
}
We have recursion, and know (I hope) that recursion can be used instead of any loops.
myLength [] = 0
myLength (_:xs) = 1 + myLength xs
Recursion is going to be the biggest way we repeat logic: time to get comfortable with it.
With recursion and conditionals as we have them, the language is Turing complete. Everything else is just convenience.
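As a small sketch (my example, not from the notes): the halving while-loop above can be written recursively, collecting the values into a list instead of printing them.

```haskell
halvings :: Int -> [Int]
halvings i
    | i > 1     = i : halvings (div i 2)   -- "loop body": keep i, then continue with i/2
    | otherwise = []                        -- loop condition fails: stop
```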
We will also have implicit loops [my term]: an expression that implies repeated application of a function…
Recall the set builder notation from math:
\[ \{x^2 \mid x\in\mathbb{N}, x \text{ is even} \} = \{ 4, 16, 36, \ldots \} \]
It's a very efficient notation to describe a set. Haskell has a very similar notation to build a list…
A list comprehension is a syntax in Haskell to describe a list, similar to the set builder notation.
[Python borrowed list comprehensions: you may have seen them there. Or LINQ in .NET is like list comprehensions and SQL had a baby.]
First part: generators, like \(x\in\mathbb{N}\) in the set notation. The generator creates the source values.
Use the <- operator with an existing list on the right. e.g.
x <- [1,2,3,4]
The generator can be used with any expression to build a list. Each generator value is used as an argument to the expression.
Prelude> [ x^2+1 | x <- [1, 2, 3, 4, 5, 6, 7] ]
[2,5,10,17,26,37,50]
Prelude> [ succ c | c <- "hello" ]
"ifmmp"
You can use more than one generator in a comprehension: every combination of values is produced.
Prelude> [ 10*x + y | x <- [3,4,5], y <- [6,7,8,9]]
[36,37,38,39,46,47,48,49,56,57,58,59]
A second piece we can add: guards, like \(x \text{ is even}\) in the set notation.
A guard is a boolean expression that decides if a value should be included in the result or not.
Prelude> [ x^2+1 | x <- [1, 2, 3, 4, 5, 6, 7] ]
[2,5,10,17,26,37,50]
Prelude> [ x^2+1 | x <- [1, 2, 3, 4, 5, 6, 7], even x ]
[5,17,37]
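Combining the two pieces (a sketch of mine, reusing the two-generator example above): multiple generators plus a guard.

```haskell
Prelude> [ 10*x + y | x <- [3,4,5], y <- [6,7,8,9], even y ]
[36,38,46,48,56,58]
```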
List comprehensions are often used with the .. shorthand for arithmetic sequences…
Expression | Result |
---|---|
[1..6] | [1,2,3,4,5,6] |
[2,4..15] | [2,4,6,8,10,12,14] |
['a'..'m'] | "abcdefghijklm" |
The .. can be used anywhere you need a list, not just in a comprehension. But it makes it easy to write a list comprehension that builds a list of arbitrary size:
firstSquares n = [ i*i | i <- [1..n] ]
firstEvenSquares n = [ i*i | i <- [1..n], even i]
firstEvenSquares' n = [ i*i | i <- [2,4..n]]
*Main> firstSquares 10
[1,4,9,16,25,36,49,64,81,100]
*Main> firstEvenSquares 10
[4,16,36,64,100]
*Main> firstEvenSquares' 10
[4,16,36,64,100]
The notation we have lets us be very expressive in Haskell.
qs [] = []
qs (x:xs) = smaller ++ [x] ++ larger
    where smaller = qs [a | a<-xs, a<=x]
          larger = qs [a | a<-xs, a>x]
… a Quicksort in four lines.
It's easy to create unreadable code in Haskell: any function that does anything interesting probably requires a long expression. Just because it works doesn't mean it's good code.
The let and where clauses let you alias values so you can use them by name or multiple times.
e.g. in the definition of the Quicksort, the where let me (1) turn what could have been one long, ugly expression into three short ones; (2) give the sub-expressions names so they are easier to understand.
qs [] = []
qs (x:xs) = smaller ++ [x] ++ larger
    where smaller = qs [a | a<-xs, a<=x]
          larger = qs [a | a<-xs, a>x]
The let name=… in … structure creates an expression and can be used anywhere you can use any other expression.
Prelude> let x=4 in x*2
8
Prelude> (let x=4 in x*2) ^ 3
512
But it's more useful to simplify things in longer definitions:
qs' [] = []
qs' (x:xs) = let smaller = qs' [a | a<-xs, a<=x]
                 larger = qs' [a | a<-xs, a>x]
             in smaller ++ [x] ++ larger
The where must be part of a function definition (or similar block):
somePowers x = [x, sq x, sq (sq x)]
    where sq n = n*n
In either case, multiple sub-expressions can be defined, separated by linebreaks (or semicolons).
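A small sketch (my example; the names hyp, sq, s are hypothetical): two bindings in one let, separated by a semicolon, which is equivalent to putting each on its own line.

```haskell
hyp a b = let sq x = x*x; s = sq a + sq b in sqrt s
```

So hyp 3 4 gives 5.0.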
The choice between let and where is often one of style: does it make more sense to define the sub-expressions before or after the main expression? It will depend on the context. Do the more readable thing.
qs (x:xs) = smaller ++ [x] ++ larger
    where smaller = qs [a | a<-xs, a<=x]
          larger = qs [a | a<-xs, a>x]
qs' (x:xs) = let smaller = qs' [a | a<-xs, a<=x]
                 larger = qs' [a | a<-xs, a>x]
             in smaller ++ [x] ++ larger
You should use let or where if your code is getting complicated. Break up big expressions; give the sub-parts meaningful names.
[If I see a single expression longer than one moderate-length line, I'm going to refuse to help you. The TAs should take off marks for coding style too.]
The idea of recursion shouldn't be new. But maybe a little review wouldn't hurt…
Function definitions will have two cases (which may be sub-divided further).
Base case: small case(s) where the result is obvious/easy.
Recursive case: you find a smaller subproblem (i.e. closer to the base case), another version of the same problem that will help you find the real solution.
Make a recursive call to solve the subproblem.
Use that sub-result to create the real result.
e.g. Argument n becomes n-1 or n/2. Base case is 0 or 1.
e.g. Argument (x:xs) becomes xs. Base case is [], or occasionally [x].
An example: calculate powers (\(x^y\) for \(x, y\) integers, \(y\ge 0\)).
myPower _ 0 = 1
myPower x y = x * myPower x (y-1)
This uses the facts \(x^0=1\) and \(x^y = x\cdot x^{y-1}\).
Like with any other iteration, we have the opportunity to be more or less efficient.
myPower' x y
    | y==0 = 1
    | even y = half*half
    | odd y = x*half*half
    where half = myPower' x (div y 2)
This uses the facts \(x^0=1\) and \(x^y = {(x^{y/2})}^2\) (and fixes integer division rounding).
Running times?
myPower _ 0 = 1
myPower x y = x * myPower x (y-1)

myPower' x y
    | y==0 = 1
    | even y = half*half
    | odd y = x*half*half
    where half = myPower' x (div y 2)
myPower is \(O(y)\) and myPower' is \(O(\log y)\).
More recursion: Hutton chapter 6 (especially §6.6) and Learn you a Haskell chapter 5.
Many people think that recursion is slow(er than loops). With some languages/tools, it is.
It is also limited by the implementation in some languages: can only recurse to a certain modest depth. (e.g. Python's sys.getrecursionlimit seems to give 3000 by default.)
But limitations in some tools don't imply any conceptual problem.
Counter argument #1: Don't worry about it.
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Don Knuth, or maybe Tony Hoare)
If a recursive implementation is more readable or expressive, then it's better than fast code, except maybe in the most-called few percent. So write the readable recursive version and move on.
Counter argument #2: Clever algorithms beat clever code. Write a good algorithm in the most readable way.
e.g. compare how long these take:
*Main> myPower 1 5000000
1
*Main> myPower' 1 5000000
1
No amount of clever coding is going to save myPower.
Counter argument #3: maybe the problem isn't the code, it's the compiler.
The thing that usually slows recursion (compared to loops) is building the call stack. Let's explore that…
The call stack keeps track of where to return to when the (recursive) function calls exit. If you recurse 1000 levels deep, the call stack needs 1000 entries built/destroyed.
The function call | … calculates | … returns |
---|---|---|
myPower 2 3 | 2 * myPower 2 2 | 8 |
myPower 2 2 | 2 * myPower 2 1 | 4 |
myPower 2 1 | 2 * myPower 2 0 | 2 |
myPower 2 0 | 1 | 1 |
All of the expressions must be stored in-progress (on the call stack, as stack frames) so the recursion can be unwound once recursive calls start returning actual values. That's expensive if the recursion is deep.
Theory: recursion is slower than loops because we have to maintain the call stack. Therefore, if we could get rid of the call stack, it would be faster. Right?
Consider this (slightly stupid) function that adds two (positive) integers:
stupidAdder x 0 = x
stupidAdder x y = stupidAdder (x+1) (y-1)
*Main> stupidAdder 16 3
19
*Main> stupidAdder (-5) 100
95
Again, based on some math we know:
\[ x + 0 = x \\ x + y = (x+1) + (y-1) \]
When we call stupidAdder 16 3, this happens:
The function call | … calculates | … returns |
---|---|---|
stupidAdder 16 3 | stupidAdder 17 2 | 19 |
stupidAdder 17 2 | stupidAdder 18 1 | 19 |
stupidAdder 18 1 | stupidAdder 19 0 | 19 |
stupidAdder 19 0 | 19 | 19 |
The final answer is there in the base case. Why bother keeping the call stack?
We didn't need the call stack in that example because the result is the recursive call by itself, not some expression including the recursive result and other stuff.
e.g. not this:
stupidAdder' x 0 = x
stupidAdder' x y = 1 + stupidAdder' x (y-1)   -- result is 1 + (recursive call): not just the call itself
If we can write recursive functions like this, the compiler could decide it's okay to throw away the call stack. If it does, our code should be faster.
These are called tail recursive functions: the very last thing that happens in the function (the tail) is a function call.
In general, we can talk about a tail call: a function call whose result is returned directly, with nothing else done to it afterwards.
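A quick sketch of the difference (my examples; the names are hypothetical): the first function ends in a tail call, the second does not, because the 1 + still has to happen after the recursive call returns.

```haskell
-- Tail call: the recursive call's result is returned as-is.
lastOrDefault d []     = d
lastOrDefault _ [x]    = x
lastOrDefault d (_:xs) = lastOrDefault d xs

-- Not a tail call: we still have to add 1 after the recursive call returns.
myLength' []     = 0
myLength' (_:xs) = 1 + myLength' xs
```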
Haskell will eliminate tail calls if compiler optimization is turned on. This can only be done if the code is compiled outside GHCi.
$ ulimit -v 4194304     # limit to 4GB memory
$ ghc -O2 tailrec.hs    # non-tail recursive stupidAdder
$ time ./tailrec
tailrec: out of memory
real    0m5.298s
user    0m4.565s
sys     0m0.733s
$ ghc -O2 tailrec.hs    # tail recursive stupidAdder
$ time ./tailrec
500000001
real    0m0.443s
user    0m0.438s
sys     0m0.005s
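The driver might have looked something like this (my guess at the shape of tailrec.hs, not the actual course file), with the stupidAdder definition swapped between the non-tail-recursive and tail-recursive versions between the two compiles:

```haskell
-- Hypothetical tailrec.hs: force a very deep recursion so the difference shows.
stupidAdder :: Int -> Int -> Int
stupidAdder x 0 = x
stupidAdder x y = stupidAdder (x+1) (y-1)   -- tail-recursive version

main :: IO ()
main = print (stupidAdder 1 500000000)      -- prints 500000001
```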
Creating a tail recursive function generally requires passing a partial result as an argument: instead of the call stack keeping track of what you're calculating, you have to do it yourself.
The argument to hold the partial result is the accumulator. In stupidAdder, the first argument was implicitly doing this.
Let's look back at myPower:
myPower _ 0 = 1
myPower x y = x * myPower x (y-1)
We'll introduce an accumulator argument a, and make the rule that this returns \(a\cdot x^y\):
myPowerTailRec a x y
The base case: \(a\cdot x^0 = a\)
myPowerTailRec a _ 0 = a
The recursive case: \(a\cdot x^y = (a\cdot x) \cdot x^{y-1}\)
myPowerTailRec a x y = myPowerTailRec (x*a) x (y-1)
Basically: the call stack used to remember the x *. Now we do the work in the accumulator.
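Assembled, with a wrapper so callers never see the accumulator (the name myPowerTR is mine; the factorial example below hides its helper with a where instead):

```haskell
myPowerTailRec a _ 0 = a
myPowerTailRec a x y = myPowerTailRec (x*a) x (y-1)

myPowerTR x y = myPowerTailRec 1 x y   -- start the accumulator at 1
```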
Running time to find \(1^{400000000}\) with these functions compiled with ghc -O2:
- myPower: …
- myPowerTailRec: …

… but 0.005 s with the logarithmic running time myPower'.
The same trick with a factorial function:
factorial 0 = 1
factorial n = n * factorial (n-1)
Use the rule \(a \cdot n! = (n\cdot a) \cdot (n-1)!\) and this becomes:
factorial' n = factorialTailRec 1 n
    where factorialTailRec a 0 = a
          factorialTailRec a n = factorialTailRec (n*a) (n-1)
The where lets us hide implementation junk from the caller.
Is tail recursion worth the effort? Not usually: it's a premature optimization.
Never if the compiler doesn't do tail-call optimization. (e.g. the usual compilers for Python and Java don't.)
But if you have a function that takes a long time because of the overhead of maintaining the call stack, then maybe. (But consider how weirdly artificial I had to be to create myPower where it was noticeable.)
Modern C compilers can optimize tail calls, and construct the accumulator (in simple cases, at least).
Have a look at an example. Compare the assembly code generated by gcc -O0 and gcc -O2.
This can speed up recursive implementations, and avoid limits on recursion depth.
Implications of the assembly the compilers generate…
- myPower(1, 1000000) will probably cause a stack overflow with -O0, but succeed with -O2.
- The x * might get transformed to the tail call and non-recursive implementation. Maybe not.