Professional Documents
Culture Documents
Hi everyone,
welcome to my Haskell tutorial!
There's other tutorials out there, but you'll like this one the
best for sure: You can just cut and paste the code from this
tutorial bit by bit, and in the process, your new program will
create magically create more and more cool graphics along
the way... The final program will have less than 100 lines of
Haskell[1] and will organize a mass picnic in an arbitrarily-
shaped public park map and will print pretty pictures
showing where everyone should sit! (Here's what the final
product will look like, if you're curious...)
Now you have all you need to run the "Hello World" program
below- Just copy the code into a file called tutorial.hs:
import Data.List
import Text.Regex
import System.Random
import Data.Ord
main = do
putStr "Hello World! Let's have a picnic! \n"
After you've copied the code to a file, just run your new
program from the command line as shown below:
As you can see, there's lots of stuff in this "Hello World" that
doesn't seem necessary... Actually, only the last two lines
(starting at main = do) are really needed to print our
message... The rest is just header stuff that we'll use later in
the tutorial...
At the top, you can see some things we're importing out of
the standard GHC libraries:
The next few lines define different types that we're going to
use in our picnic program... Haskell programmers are really
fond of lots of types, because Haskell compilers are basically
just these crazy type-crunching machines that will take any
types you give them and if they find anything questionable
about how you're using types they will let you know early,
preventing your program from becoming buggy. The way
Haskell compilers handle types puts them head and shoulders
above most other compilers (Read more about Hindley-
Milner type inference here.)
The first few type lines should be pretty self-explanatory,
for a program that's going to work with pictures: A Point is
just a pair of floating numbers, a Color is an rgb triple, a
Polygon is just a list of Points, a Person is just a list of
the answers they gave in the questionaire- Very simple stuff...
To keep things simple I'm building these new types using the
type command, instead of another command called data.
This other command (which you don't need to know about
today) lets you attach a special new type name that you use to
construct your new type- That name makes it even harder to
accidentally use the wrong type in the wrong place in your
program.
Another thing thats "bad" about this code is that we used the
type Int. Unlike all other types in this program, Int has an
upper limit and could theoretically cause our program to err
out if any integers got way too massively big. In Haskell, this
type of integer can't get bigger than 2^31... We could have
used the type Integer instead- This type of integer grows
"magically" and can hold any sized integer, no matter how
big... but it's slower, so we used Int instead, since our
numbers will never get that big.
NEXT
Now that we've got "Hello World" working... let's load the
answers people gave to the questions on the picnicmob.org
website... For testing purposes, I've created some anonymized
data and put it into the file people.txt for you to use- Just save
this file to your computer.
Here's what that file looks like- It's just a list of list of
numbers. Each row is a person and the numbers are the value
answer number they gave on each question:
[
[2,3,3,4,4,3,5,2,2,3,2,2,2,3,2,5,3,1,3,5,2,5,2,2,2,3,2,5,2,3],
[2,3,3,2,3,3,5,2,3,4,2,2,1,1,1,2,1,5,1,4,2,5,2,2,2,2,4,1,1,1],
[2,3,4,3,3,5,5,2,3,5,2,3,2,1,2,4,4,3,3,1,2,5,3,5,2,3,4,1,1,2],
[1,3,3,3,3,3,3,3,4,4,3,3,4,4,2,3,3,3,1,4,3,4,3,3,2,1,3,2,3,3],
[4,1,3,3,3,3,4,1,3,4,3,3,1,2,1,4,4,2,2,4,2,2,2,1,2,3,4,2,5,1],
[2,1,1,2,3,5,4,1,1,1,2,3,5,1,3,2,4,2,3,5,2,5,2,2,2,3,4,2,2,4],
[1,3,4,2,3,5,4,1,3,1,2,2,2,3,1,1,1,2,2,2,2,1,2,1,2,3,3,1,3,3],
[3,3,3,1,3,2,2,1,2,3,2,3,2,3,2,3,3,5,2,1,2,5,2,1,2,1,4,2,2,2],
[1,3,4,3,3,5,5,3,1,4,2,4,1,1,5,1,1,4,2,5,2,5,3,1,2,3,4,2,1,4],
[1,3,4,2,4,5,4,3,2,1,3,4,1,1,4,5,4,5,3,4,2,2,2,1,2,3,3,4,1,4],
[1,1,1,4,4,1,4,3,3,3,4,3,1,2,1,4,1,5,2,2,2,4,2,5,2,1,4,4,1,3],
...
Here's the code that reads in this list of numbers. Adding this
code to our existing program is super easy: Just paste this
new code to the bottom of the "Hello World" program to
make the new code work- Do the same thing, as we go
along, with all the other code fragments in this tutorial.
perfect!!
Ok, so let's dissect this code to figure out how it worked... The
first line read the file "people.txt" and assigned it to the a string
named people_text... Note that we used a left-pointing
arrow < to do the assignment, instead of an equals sign. The
reason for this is that Haskell is very finicky about code that
does I/O, like reading from the disk or printing something on
the screen. In the Haskell way of thinking, code that does this
is pure evil and needs to be imprisoned into its own pen,
separately from the rest of your Haskell program.
The last two lines are obvious- They just print a message on
the screen indicating the length of the people variable.
NEXT
SVG files are a simple file format for specifying images that
you'll probably hear a lot more of in the future- It's
completely resolution independent and already used by many
drawing programs such as Inkscape, which I highly
recommend.
After you run this code, you'll see the following picture
drawn in the file tut0.svg that you can open in your web
browser:
colorize c p = zip (repeat c) p
To start off, I should point one little oddity about Haskell: It's
a lazy language- This means that when your program runs,
only those things are calculated at the last possible moment-
This means Haskell programs can (and usually do) have data
structures in them that are actually infinite in size... repeat
is one example of a function that does this:
The next function, zip, takes two list and zipps 'em up into a
list of pairs: So it'll take our list of colors and our list of
polygons in the variable p and make a list of
(Color,Polygon) pairs, which is what we want- Since
the number of polygons will probably be finite in number, the
zip function will stop using items from the infinite list of
colors and, voila, the infinities magically disappear, through
sheer, unmitigated, abominable laziness.
Ok, so now let's take this function and let's see if we can
make it more elegant
colorize c p = zip (repeat c) p
colorize :: Color > [Polygon] > [(Color,Polygon)]
Earlier I told you that you should look at the last arrow in a
type definition and that all things in front of that arrow are
parameters- While this is a good way to think about Haskell
functions, in reality, the arrow > is actually right
associative... that means this function actually reads as
follows:
colorize :: Color > ([Polygon] > [(Color,Polygon)])
colorize c = zip (repeat c)
colorize c = zip $ repeat c
The dollar sign is really just syntactic sugar but it's really
useful because if we are just feeding single values into
functions, as we usually are, then it forms kind of an
assembly line- So basically, we're taking the value c, running
it through repeat, then running it through zip- This works
exactly the same way as a Unix pipe, if you've ever worked
with that clever concoction before...
colorize c = (zip.repeat) c
Can you figure out why we want to do this? You may notice
we have another value dangling off the back of our function
again- We can snip it off of both sides, just like before,
leaving us with the final, most elegant form:
colorize = zip.repeat
NEXT
This code is pretty much structured the same way as the code
that did the writing of the SVG. In this case we're using
regular expressions to split the data into parts that have each
polygon. The SVG format is actually very complex, so this
code takes some liberties in the format and may fail on some
SVG files- For the purposes of loading some SVG maps into
our program, though, it's great!
What's good about this code?
NEXT
First of all, you notice that this function is actually has two
definitions, with different patterns to the left of the equal sign
((a:b:c:xs) and _) The way Haskell works is that it will
try to match to the first pattern if it can. Then, if that fails, it'll
go to the next one it finds.
NEXT
Now that weve broken up our park into little triangles, it's
pretty easy to write some functions that will partition the park
into chunks by slicing our triangle piles along arbitrary
horizontal and vertical lines. Here's the code that will do that:
let clipTriangle :: (Point -> Point -> Point) -> [Point] -> [Point] -> [Polygon]
clipTriangle i [] [a,b,c] = []
clipTriangle i [a] [b,c] = [[a,i a b,i a c]]
clipTriangle i [a,b] [c] = [[a,i a c,b],[b,i a c,i b c]]
clipTriangle i [a,b,c] [] = [[a,b,c]]
let slice :: (Point -> Bool) -> (Point -> Point -> Point) -> [Polygon] ->
([Polygon],[Polygon])
slice f i t = (clip f,clip $ not.f)
where clip g = concatMap ((uncurry $ clipTriangle i).(partition g)) t
NEXT
Now that we have the tools we need to cut our park into nice
little pieces, how do we figure out the best way to cut it up?
First, we cut the park roughly down the middle in two halves-
If the park is "fat", we cut it vertically... If the park is "tall",
we cut it horizontally. Next, we calculate the surface area of
each side- Using the ratio of surface areas, we can now
randomly break our picnickers into two populations, one for
each side.
We keep doing this until every person has their own space-
Here's the code that makes it work:
let boundingRect :: [Polygon] -> (Float,Float,Float,Float)
boundingRect p = (minimum xs,minimum ys,maximum xs,maximum ys)
where xs = map fst $ concat p
ys = map snd $ concat p
Another cool feature of this code is that it colors all the land
chunks using a rainbow of colors- We do this by converting
our rainbow of colors into an infinitely repeating rainbow
of colors using the cycle function, and then we write a little
esoteric code that will call every successive rainbow color
function (remember, rainbow consisted of a list of color-
applying functions) to make a rainbow-colored map of park
lots.
First of all, every time we need to check the land area for a
region of the park we're recalculating it from scratch- Since
this is done over and over again for the same pieces as the
park is recursively subdivided, a lot of computrons are being
wasted. A better version would "remember" previous triangle
areas using memoization.
NEXT
NEXT
Note that with this function, the islands are now more like peninsulas (or perhaps jetties... or maybe
enclaves would be more appropriate... definitely not buttes, though...) This is the function we'll use
for "walking around".
NEXT
On your mark...
The color of the lines tells you how well two neighbors get
along... The similarityColor function is a little kludge
that creates a nice color from red to blue, depending on how
compatible two people are (as usual in Haskell, the type
signature of Person>Person>Color is pretty much a
dead give-away...)
NEXT
Get Set...
Next, we need to define our four simulated annealing functions... In the left
corner, picnicEnergy tells us the overall happiness of our picnic
population... smaller, low energy values mean greater happiness. In the right
corner, picnicMotion swaps two neighbors randomly to cause motion. In
the front corner, (This is a tag-team match) picnicTemperature gives the
current temperature: We want to start hot, then cool things down so that our
picnic crystalizes in a controlled manner. And finally, in the back,
picnicTransitionProbability takes the current temperature and the energy of
two states and calculates the likelihood a transition into the new state will
occur. For more info, consult you know what...
When we run the program with this new addition, We'll now
get some info on the picnic starting positions:
Well, these four functions are going to be the bread and butter
of our this program- Probably, about 99% of CPU time will
be spent in this little bit of code... Consequently, every little
inefficiency in this code will hurt performance hundredfold...
And boy, is this code inefficient: Because we're storing
everything in lists, it means that any function doing a search,
such as findPerson, will be inefficient to the point of
absurdity... Even worse is the fact that picnicEnergy checks
the happiness of the entire picnicker population, even though
only two people move at any one time and does so in the
most tedious way by checking every questionaire question on
every person against neighbors over and over and over again,
instead of precalculating the compatibility of two people
ahead of time.
NEXT
GO!!!
Before
After
WooHoo! Now let's figure out what this final piece of code is
all about...
Back To lisperati.com