You've all heard it, and you all know it: premature optimization is the root of all evil. But what does that mean? When is an optimization premature? I've run into this 'dilemma' many times at work, where I see both myself and my coworkers judging it by different standards.
Some programmers really don't care about performance at all. They write code that looks perfect, and fix performance issues as they come. Others think about performance all the time, and will happily trade good software engineering principles for improved performance. I think both of these ways of thinking about performance are wrong. Performance is about simple and 'natural' design.
Let me give one recent example from work. A few months back, we rewrote parts of our software to get new features. Most importantly, this involved a new protocol between a client and a server. After a while, users of our software were complaining about a server process consuming an unreasonable amount of memory. After profiling the server in question, we found that they were right: there was no need for that process to use that much memory. While profiling, we found a lot of things consuming an unnecessary amount of memory, but the most important was that a large piece of data was copied many times before being sent over the wire.
Why did we write software like that? Well, while writing the new protocol we thought that it didn't matter. The data was typically not that big, and the server could handle the load fairly well according to benchmarks. It turned out that the user was serving much more data through that server than we had anticipated. Luckily, increasing the JVM heap size kept things running, so it was no catastrophe.
We set out to fix it, and afterwards the amount of 'garbage' created was much lower and memory usage improved significantly. But to get there, we had to introduce a new version of our protocol and refactor code paths to be able to pass data through the stack without copying. It was no more than 5 days of work, but those days could probably have been spent doing more useful things.
But surely, you'd expect the code to become horrible, full of dirty hacks and as unmaintainable as OpenSSL? Actually, it didn't. The refactorings and protocol changes did not worsen the state of our software. In fact, the design became better and more understandable, because we had to think about how to represent the data in a fashion that reduced the amount of copying done. This gave us an abstraction that was easier to work with.
Premature optimizations are really evil if they make your software harder to read and debug. Optimizations at the design level, on the other hand, can save you a lot of trouble later on. I am under the impression that a lot of software is written without a 'performance mindset', and that a lot of manpower is wasted because of this. If we had used 1 day to properly think through our protocol and software design in terms of data flow, we could have spent 4 days at the beach drinking beer.
I like beer.
Having run my toy web service wishsys for a few months, I thought it would be nice to set up a continuous delivery/deployment pipeline for it, so that I could develop a new feature and push it into production as quickly as possible (if it passes all tests). I thought writing a small article about this would be nice as well, as I found few resources for doing CD with Haskell.
The wishsys code is written in Haskell and uses the Yesod web framework. Update: I noticed a comment on reddit mentioning keter, which is essentially what I should have used instead of Debian packages since I'm using Yesod. This guide should still be relevant regarding CI though.
Choosing CI system
After doing a little research, I ended up with two alternatives to use for CI.
I have some experience with Jenkins at work, and since it's very generic, I thought it would be easy to build a Haskell project with it. Since the wishsys code is hosted at GitHub, I was tempted to try Travis, since it integrates well with GitHub. I decided to go with Jenkins instead, mainly because I had a VPS to run it on, and I didn't need the great Heroku support, as I would be deploying to the same VPS. There is a good introduction to CI for Haskell and Yesod at yesodweb.
Git and branches
Having chosen a CI system, I needed to structure the git repository in a way that fits my workflow. Since I wanted to have some control over which changes are pushed out to production, I created a stable branch in the repository, which is the branch from which the production binaries are compiled. All development goes into other branches. For CI to have any meaning, all of these branches should be built all the time so I can be sure that my tests are stable and that they pass. The plan was to create a build job for every branch that builds and tests that particular branch.
Setting up Jenkins
Setting up Jenkins is very easy on Debian, as you just need to add the jenkins debian repo and install packages. Having done that before, I just started on creating a new project for wishsys. Jenkins has a lot of plugins, and since this is a github project, I installed the github plugin for jenkins as well.
Having only the stable branch, I created one job called wishsys-build-stable. To build haskell, I added an entry under "Execute shell", which runs the following commands when building:
    $CABAL sandbox init
    $CABAL --enable-tests install
$CABAL is parameterized to /var/lib/jenkins/.cabal/bin/cabal since I needed a newer cabal version to get the sandbox functionality. I also configured the job to trigger for every commit, though I will probably change it to build as often as possible to ensure there are no unstable tests.
Creating a debian package
The next step was to create Debian packages of the software, so I could easily install it on my VPS (manually or automatically). Since this was unfamiliar territory, it took some time and reading to grasp the package layout, but I found an intro guide for creating Debian packages that helped me. In addition, I added a makefile that invokes all of the cabal commands needed to build my project and to do what the Debian package tools expect.
Creating debian packages in jenkins
I then started looking for a Jenkins plugin to help me build the Debian package, and found two candidates: debian-package-builder and jenkins-debian-glue.
debian-package-builder seemed to only support automatic version manipulation for subversion repositories. Since wishsys uses git, I went for jenkins-debian-glue. Creating a debian package is complicated in itself, and I initially spent some time doing a lot of what jenkins-debian-glue tries to do automatically (automatically creating a tarball of the repo and running git-import-orig and git-buildpackage).
I used this guide to set up the Jenkins jobs. I ended up with a wishsys-debian-source job for building the source package, and a wishsys-debian-binaries job for creating binary packages.
The jobs are run in the following order: wishsys-build-stable -> wishsys-debian-source -> wishsys-debian-binaries
The wishsys-build-stable job is run for each commit, and reuses the git checkout between builds to reduce build times. The wishsys-debian-source job simply creates a tarball of the git repo, and forwards the resulting tarball to the wishsys-debian-binaries job, which does a full clean build of the software before creating the binary package itself.
Setting up a CD pipeline is a great way to get your features tested and into production in minimal time. Though this setup is somewhat Debian-specific, the generic pattern should be reusable. In the future, I would like to avoid building the project twice (once in wishsys-build-stable, and once in wishsys-debian-binaries), but the build times are currently not an issue. Another improvement would be to get HUnit and QuickCheck test reports displayed in Jenkins.
The files necessary for the Debian build are available on GitHub.
Today, I launched my wishsys service, which is just a simple service for creating wish lists with separate access for owners and guests. The original use case was my own wedding, so I created an even simpler version for that using snap. Snap worked great, but I had some hassle building the authentication mechanisms properly.
After the wedding, one of the guests wanted to use the same system for their wedding, so I thought I might as well create a more generic wish list service. This time, I went with yesod, as it seemed to provide more of a platform than a framework.
At Yahoo!, I work on a search platform, and there are a few things I expect from a platform. It should provide
- Access control
- Higher level APIs for request handling
- Good APIs
- Good documentation
- Ease of deployment
- Test framework
Yesod did not let me down. It provides a real book, and not just a bunch of outdated wiki pages. Its solution for storage is excellent. Persistent allows me to write the definition of my data structures in a single place, and automatically generate a database schema and haskell types. I chose to use postgresql as my persistent backend, and by using the scaffold code, getting it working was trivial. Creating request handlers was so easy, I won't even tell you how I did it.
My biggest Yesod issue was authentication, since I had somewhat special requirements: I wanted two users with different access levels (admin and guest). I also missed a method in the authentication system for requesting that a user be logged in, regardless of which authentication backend is used. I ended up looking at what HashDB did internally and copied that (if there is a better way, please let me know).
I used the hamlet template system to write HTML with minimal Haskell clutter. Forms are a pleasure to work with, because I don't have to repeat myself. I just had to create the form in one place, and I could then use it both for generating correct HTML and for parsing the POST request.
I just followed the deployment chapter when deploying, and then the service was suddenly live. Even more important to note is the development server, which automatically compiles the app if something changes. Great for local testing!
My biggest issue with Yesod was understanding the compilation error messages. But once I got things working, Yesod was a great experience. It is one of the few open source projects I've seen that understands what it means to be a platform, and it thinks of your needs before you realize them. Kudos!
Btw, the wishsys source code can be found on GitHub.
I have tried using Haskell for various smaller projects, such as wishsys and a game that I never got very far with. But learning a new programming language by means of hobby projects only works as long as the project is contained and small. For my part, most hobby projects start out with great ideas and grand designs, but end up as a mess since I am unfamiliar with the programming language.
When using a new programming language, time is spent learning the language rather than developing the project. This in turn means that I end up learning the bare minimum to get the job done. And this defeats the purpose of using a project to learn a new language. If the goal is to finish the project, you should use something you know well and feel most productive with. If the goal is to learn a programming language, you should start out with a small project instead.
For me, Project Euler is a great way to learn Haskell, because it contains a lot of problems that Haskell (and functional languages in general) is the perfect tool for solving. The projects I mentioned above involve using databases, multiple threads and other scary real world stuff, but I just wanted to learn Haskell. And better yet, once you have solved a problem, chances are you can find someone with an even more elegant solution written in the same programming language you are using. A great way to learn!
This time I thought I'd share our dinner plans for the next week. We take turns creating the dinner list every week, and next week is my turn!
- Monday: Tortellini with sun dried tomatoes and mozzarella
- Tuesday: Fish with avocado, ruccola salad and hot mustard
- Wednesday: Fennel soup with chicken
- Thursday: Albondigas
- Friday: Salmon with fennel risotto
- Saturday: Breaded cod with salad and potatoes
- Sunday: Home made tomato soup
Let's hope it tastes as good as I think it does.
I have started to teach myself Haskell using the book "Real World Haskell". So far I have only reached chapter 4, but I am already in love with some of the features:
Its strict static type system, which makes it easy to understand what a function does. Moreover, it allows you to think through what your code is going to do as well as make the decisions of what to do for special cases up front. The following is a definition of a function which compares the length of two lists, and returns their order (==, <, >). The definition clearly states that it operates on two lists of any type, and returns a value of type Ordering. Crystal clear!
listCmp :: [a] -> [a] -> Ordering
Partially due to the above point, one can avoid the unpleasant bugs that show up later on because you postponed the decision on what to do with your input.
Pattern matching. I came across this in the Oz programming language when I was in university, but I didn't really understand how powerful and readable everything becomes until using it in Haskell. The following function takes a separator and a list of lists as argument, and combines the lists using the separator:
    intersperse :: a -> [[a]] -> [a]
    intersperse sep []     = []
    intersperse sep (x:[]) = x
    intersperse sep (x:xs) = x ++ [sep] ++ intersperse sep xs
I love how you can just look at the patterns to see which cases are covered by the function, rather than digging into some complex nested if statement.
Readability when using 'where' syntax. This is the implementation of the listCmp function:
    listCmp lhs rhs
        | lengthLhs < lengthRhs = LT
        | lengthLhs > lengthRhs = GT
        | otherwise             = EQ
      where
        lengthLhs = length lhs
        lengthRhs = length rhs
What I like about it is that you can separate the logic performed on values from the function calls, so that when you read the code, you see the actual computation done by the function in the different cases. You can also do this with the let syntax, but I think the above reads really well.
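For comparison, here is a rough sketch of the same function written with let instead of where. The name listCmpLet is made up for this example, and it is only meant to show how the two styles differ in reading order, not to suggest that one is better.

```haskell
-- Same behaviour as listCmp, but the helper values are introduced with
-- let/in before the comparison instead of in a trailing where clause.
-- listCmpLet is a hypothetical name used only for this sketch.
listCmpLet :: [a] -> [a] -> Ordering
listCmpLet lhs rhs =
    let lengthLhs = length lhs
        lengthRhs = length rhs
    in if lengthLhs < lengthRhs then LT
       else if lengthLhs > lengthRhs then GT
       else EQ
```

With let, the definitions come first and the decision comes last, while the where version shows the decision first. Both call length on each list exactly once.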
For a while now, I have been using Ubuntu Linux on my desktop, and it has worked really well. In fact, I even installed Windows 7 on my media center (replacing Linux) just to stop bothering with configuring my system all the time. After I started working at Yahoo!, I did not really feel like having to do extra work at home in order for my computer to function properly. Moreover, I did not have much time left to work on FreeBSD, so I simply reinstalled my desktop with Linux, and that has been working well for almost a year now.
But recently I have sort of missed working on FreeBSD, so I decided to give it a try again from a user perspective. Many of the things I felt were lacking are still lacking. However, the things that were good are still good. So far, I have been able to install all the software that I wanted, but I still feel that we need something better on top of ports in order to make it easier for users. Hopefully, some of the initiatives that I have seen on the mailing list will not die any time soon. Apart from ports, many of the common tasks are pretty manual too. Configuring the system should be more straightforward than having to guess and edit what should be in /etc/rc.conf. Though many of the issues I encounter come from the fact that FreeBSD has a very small userbase, and is simply not prioritized by many companies, there are a lot of things that can be improved regardless of that. If I start doing any more FreeBSD work, it is most likely to be in the "make-it-less-painful-to-use" department.
I just bought two Western Digital 2 TB disks the other day in order to increase storage capacity. I was planning on putting a ZFS mirror on them. I then discovered that the disks use a new drive format called "Advanced Format". This format basically extends the sector size from 512 to 4096 bytes.
The problem is that the disks report their sector size to be 512 rather than 4096 in order for them to work well with existing operating systems. The issues with these disks are discussed here and here.
To summarize, this results in two main problems:
Partitioning tools operate on 512-byte "logical" sectors, which may result in a partition starting at a physical sector boundary that is not aligned to 4096 bytes. With partitioning tools that have not been updated to align partitions to 4k, a single 4k request may then end up touching two physical sectors.
File systems and other disk consumers think the underlying device has a 512-byte sector size, and issue requests that are smaller than 4096 bytes. For a write request, this is catastrophic, because in order to write only part of a physical sector, the disk has to read the sector and modify the part that changed before writing it back to disk (read-modify-write).
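To make the two problems a bit more concrete, here is a small sketch of the arithmetic. The function names are made up for this example, and the model is simplified: it only looks at byte offsets and lengths, and assumes the 4096-byte physical sector size described above.

```haskell
-- Physical sector size in bytes, taken from the text above.
physicalSectorSize :: Integer
physicalSectorSize = 4096

-- Number of physical sectors touched by a request of `len` bytes that
-- starts at byte `offset` on the disk. Assumes len > 0.
-- (Hypothetical helper, not part of any real tool.)
physicalSectorsTouched :: Integer -> Integer -> Integer
physicalSectorsTouched offset len = lastSector - firstSector + 1
  where
    firstSector = offset `div` physicalSectorSize
    lastSector  = (offset + len - 1) `div` physicalSectorSize

-- A write needs a read-modify-write cycle when it does not cover whole
-- physical sectors, i.e. when its start or its length is not a multiple
-- of the physical sector size.
needsReadModifyWrite :: Integer -> Integer -> Bool
needsReadModifyWrite offset len =
    offset `mod` physicalSectorSize /= 0 || len `mod` physicalSectorSize /= 0
```

In this model, a 4096-byte write at offset 0 touches one physical sector and needs no read-modify-write, while the same write at offset 512 touches two sectors and needs a read-modify-write on both. A 1024-byte write always needs one, no matter how it is aligned.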
Dag Erling Smørgrav made a tool to benchmark disk performance using aligned and misaligned writes, mentioned in his post above (svn co svn://svn.freebsd.org/base/user/des/phybs). Here are the results:
    nobby# ./phybs -w /dev/gpt/storage0
       count    size  offset    step    msec     tps    kBps
      131072    1024       0    4096  131771      16     994
      131072    1024     512    4096  136005      16     963
       65536    2048       0    8192   74762      14    1753
       65536    2048     512    8192   71407      15    1835
       65536    2048    1024    8192   73432      15    1784
       32768    4096       0   16384   20710     130    6328
       32768    4096     512   16384   61987      43    2114
       32768    4096    1024   16384   62719      43    2089
       32768    4096    2048   16384   61089      44    2145
       16384    8192       0   32768   14238     245    9205
       16384    8192     512   32768   53348      65    2456
       16384    8192    1024   32768   52868      66    2479
       16384    8192    2048   32768   50914      68    2574
Clearly, using blocks smaller than 4k results in bad performance regardless of alignment. With blocks of 4k or larger, aligned writes (offset 0) are roughly three to four times faster than misaligned ones.
The way I solved this in FreeBSD was to partition the disk manually with gpart and set the partition start to a multiple of 8 (8 * 512 = 4096). All partitions on the disk should start at a sector number that is a multiple of 8.
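The "multiple of 8" rule is easy to check or apply by hand, but here is a small sketch of it anyway. The helper names are made up for this example; the only inputs are the 512-byte logical and 4096-byte physical sector sizes from the text.

```haskell
-- Sector sizes in bytes, as described in the text.
logicalSectorSize, physicalSectorSize :: Integer
logicalSectorSize  = 512
physicalSectorSize = 4096

-- Number of logical sectors per physical sector: 4096 / 512 = 8.
sectorsPerPhysical :: Integer
sectorsPerPhysical = physicalSectorSize `div` logicalSectorSize

-- Is a partition starting at this logical sector number 4k-aligned?
-- (Hypothetical helper for illustration.)
isAligned :: Integer -> Bool
isAligned startSector = startSector `mod` sectorsPerPhysical == 0

-- Round a logical sector number up to the next 4k-aligned one.
-- An already aligned sector number is returned unchanged.
alignUp :: Integer -> Integer
alignUp startSector =
    ((startSector + sectorsPerPhysical - 1) `div` sectorsPerPhysical)
        * sectorsPerPhysical
```

For example, a partition start at logical sector 34 is not aligned, and alignUp moves it to sector 40, while a start at sector 64 is left as it is.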
ZFS uses variable block sizes for its requests, which can pose a problem when the underlying provider reports a sector size of 512 bytes. In order to override this, I used gnop(8), which can create a provider on top of another provider with different characteristics: gnop create -o 4096 -S 4096
The -o parameter makes sure that the new provider does not conflict with the original provider when ZFS tries to detect any filesystems on the disk. The -S parameter sets the sector size of the new provider to 4096, which makes sure that all requests going to the disk from ZFS will be in 4k blocks.
For UFS, the default fixed block size is 16k, so there should be no worries about it using lower block sizes. Moreover, newfs provides a -S parameter, which overrides the sector size of the underlying provider. I have not tried using UFS on these disks, but I don't see any reason for it not working.
After spending a long time trying to figure out why my default locale in GNOME changed after a recent upgrade, I finally found out where to change the locale setting. The problem was that GNOME did not seem to pick up my system locale settings, and the Norwegian characters in my terminal came up as question marks.
As the gnome login manager (gdm) got rewritten, there is now no way to change this locale at the login screen unless it was picked up by gdm. But, as always, reading the documentation helps. After reading
I discovered that I could just edit
and write this:
    [Desktop]
    Language=en_US.UTF-8
    Layout=no
to set the correct locale!
I just learned of sysutils/bsdadminscripts after my previous post about how hard it was to use only packages in FreeBSD. Well, I think I found a partial solution to my problem, as the bsdadminscripts port contains a pkg_upgrade utility, which is able to update your system without a ports tree available, as long as the INDEX file exists on the packages server.
I now use this in combination with my port tinderbox, building the packages I want for my laptop. Then I generate the INDEX file in the tinderbox ports tree, and put it into the packages folder of the tinderbox. Voila! I can now use pkg_upgrade -a, and all packages are upgraded to the latest version.
There are a few things that I think can be improved. One is to have the tinderbox scripts automatically generate the INDEX file and put it into the packages directory, either with a simple command or whenever the ports tree is updated. The other is what I mentioned in my previous post about keeping the official packages properly up to date.
I guess I'm not the typical FreeBSD user, because I do not enjoy using ports much. Mainly this is because I also use it as a desktop. On a powerful server or workstation, ports is fine. It's super flexible and everything works quite well. And kudos to all people working on updating and making improvements to it.
However, using ports on my laptop really makes me cry. Why? If I want to install a port, I have to keep a ports tree on my laptop and actually compile everything. Since I have a pretty weak laptop in terms of processing power, this takes ages. But of course, I can install packages! The thing with packages, however, is that it works really well for a release, but when upgrading later on, I always end up in trouble if I try to use the official FreeBSD packages.
First of all, the package sets following each release get outdated quickly. Second, if I want to update my packages without using ports, I get into trouble. There is no real package upgrade tool that I know of, but I can install portupgrade if I want to, because it has a fancy -PP option, telling it to use packages only. But there are issues with this: portupgrade seems to require that you have a ports tree to work. In addition, when you have the ports tree, portupgrade will look for packages matching the exact version that is in ports, and if the package server does not happen to have the same ports tree as you (a single commit updating a port can break this), it fails.
So what is the solution for me, besides writing a pkg_upgrade? Having a ports tinderbox on a different host to build packages for my laptop (I could use official 8-stable packages for instance, but there always seem to be some packages missing, and some not built). And the upgrade procedure? Move /usr/local and /var/db/pkg away, and reinstall packages. It works ok, but looking at how well this can be handled on other systems, it's a bit silly :/ So, maybe I'll just have to look closer at the pkg_upgrade idea :)
So, on to the constructive part of this rant^Wpost. There is no need to change everything for this to work better. A pkg_upgrade tool can probably reuse a lot from the other pkgtools, such as version checking and dependency checking. However, the hard part is knowing what version to get from the servers. Luckily, the Latest/ directory contains unversioned tarballs of packages that can be examined to get their version. But again, this requires one to download the packages first in order to examine them. Not very bandwidth-friendly. I think a simple approach would be to keep a version list together with the packages, which could be used by pkg_upgrade to check if any new version of a package exists (much like INDEX in /usr/ports I guess). I haven't thought about the hardest question yet: how to handle dependencies and package renaming, but I would think one could allow specifying this in the same file.
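A very rough sketch of the version list idea could look like the following. Everything here is an assumption made for illustration: the "name version" line format, the type and function names, and the simplification that any difference in version string counts as an update. A real tool would need proper version comparison, and this does nothing about dependencies or renamed packages.

```haskell
-- Hypothetical types for this sketch.
type PackageName = String
type Version     = String

-- Parse a version list with one "name version" pair per line, e.g.
-- "gnome-terminal 2.26.3". Lines that do not have exactly two fields
-- are skipped. The file format is an assumption, not an existing one.
parseVersionList :: String -> [(PackageName, Version)]
parseVersionList contents =
    [ (name, version) | [name, version] <- map words (lines contents) ]

-- Compare installed packages against the server's version list and
-- return (name, installed version, available version) for every package
-- whose version string differs. Packages missing from the server's list
-- are left alone.
outdated :: [(PackageName, Version)]
         -> [(PackageName, Version)]
         -> [(PackageName, Version, Version)]
outdated installed available =
    [ (name, have, new)
    | (name, have) <- installed
    , Just new <- [lookup name available]
    , new /= have
    ]
```

The point is only that a small text file next to the packages would be enough to decide what to fetch, without downloading any package tarballs first.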
Update: As I was working against my local package repository, I did not notice that the official package repositories actually contain the INDEX file from the ports tree where the packages are built.
I also think the package building procedures could be changed, because somehow, there are always packages missing (at least several GNOME packages last time I tried). I do not know much about this though, but I would advocate a system where a package is rebuilt on all architectures and supported releases once a commit is made to the affected port.
There, I feel better now :)
Last year in Japan I bought a Cowon iAudio D2 player, which has proven to be quite good. But a few days ago, I thought I'd try to upgrade its firmware. I then discovered that there are four different types of firmware depending on where you bought the player. As I bought it in Japan, my firmware was not compatible with the other firmwares. This is mostly due to small differences in hardware. In my case, I have the possibility of watching Japanese television (not really useful in Norway).
Therefore, I thought I would try to upgrade to the European firmware (a lot more fixes seem to get through to this firmware), but I was a bit afraid that I would brick the player in the process. I looked around at the iaudiophile forums, and finally I found someone who had made the same attempt and succeeded. The procedure was easy, but to be able to use the European firmware, I had to rename the files to have the same file names as the Japanese ones, in order for the player to pick them up. Luckily, it worked for me too. Phew!
As I usually have a few classes at school which require special software, I wanted to be able to run some of this software on my own computer, as there are student versions of some of it. One of these is ModelSim from Mentor Graphics. ModelSim is basically a simulator for hardware designs, and I use it to simulate VHDL. Unfortunately, ModelSim only comes for Windows, Linux and Solaris. As I only run FreeBSD on my laptop, no software for me :( But wait, FreeBSD has the linuxulator, which allows Linux binaries to be run unmodified on a FreeBSD host (it is basically an implementation of Linux syscalls within the FreeBSD kernel). The steps I needed to go through to install the Linux version of ModelSim were pretty easy.
First of all, one of the emulators/linux_base* ports needs to be installed. I chose linux_base-fc6, as I'd like the Linux 2.6 support (although I'm not sure if that is actually needed). After installing the port, a linux userland appears in /compat/linux. To make sure I don't get any problems with programs needing procfs, I mount linprocfs(5) as well.
There, easy! Ready to run linux programs. Now, ModelSim comes with its own installer, which needs a few additional files that you can get at their web-site. However, programs may depend on additional libraries, and this is IMO the most tricky part about the linuxulator. In my case, I got some errors complaining about not finding libXsomething. Luckily, there are a few ports that you can install for the most common libraries. In this case, I had to install x11/linux-xorg-libs. Although a very old version, I was able to run the ModelSim installer and the installed binaries afterwards. Awesome!
Today, I'm sitting in a café in Oslo, waiting until we're leaving for Gardermoen and our flight back to the Netherlands after a one-week vacation in Norway. The weather was nice, and I got to do some skiing at least. However, I was actually supposed to be in the Netherlands already. The reason that I'm not is that we (my girlfriend and I) missed the flight on Sunday. We actually missed it by one day, as we were 100% sure that we were leaving yesterday, so when we showed up at the airport, we were shocked to learn that we were 24 hours late! This was a silly mistake, as neither of us really looked at the date; we always assumed that we were leaving on Monday. Unfortunately, to be able to board the flight that we assumed was our flight, we would have had to pay 3000 NOK extra per ticket! In other words, we had to find other ways of getting back. Luckily, we got to stay at my sister's place last night, and got new tickets for today's flight at approximately the same price as our original tickets (700 NOK per ticket). Hopefully, we'll be back in our apartment tonight :)
After Arnar Mar Sig posted his patchset for an initial skeleton of the AVR32 port almost a year ago, things started to pick up speed at the beginning of this year. The work is done in Perforce, and is progressing well. Currently, the system boots and recognizes most of the hardware, but linker work is required to be able to run init.
So far, I've been working on busdma support, grabbing the source from the mips port and adjusting it as well as implementing support for cache operations on the AVR32. It seems to work for now, as Arnar was able to get the ate(4) device driver to work with it.
The latest work has been to design and implement a generic device clock framework. This is supposed to be used with devices in an architecture independent way, so that devices can be associated with a clock without knowing what clock it is (assigned internally for each architecture). This is necessary for a few devices to avoid #ifdefs all over the place. For instance, the at91_mci device is identical to the one used in AVR32, and it gets the clock frequency based on at91 machine dependent defines. Another use of this would be to export clocks to userland through this interface (AVR32 has a set of generic clocks as well).
Last weekend I imported gvinum into HEAD, and I hope many users (and old users) of gvinum will try it out, as it has some nice improvements. Moving it into HEAD now means it will also become part of 8.0-RELEASE, which is coming later this year, and since it is a lot of changes, the intention is to have it in HEAD for a while before the release process begins. Among the most interesting updates for users are:
- Support more of the old vinum command set
- Fewer panics :)
- Rebuilding and synchronizing plexes can be done while mounted.
- Support for growing striped or raid5 plexes while mounted, meaning that you can just add a new disk to your gvinum configuration, and grow it to cover the new disk.
Damn, reading for exams is really not my favorite thing. It's not that the material is very hard, but motivation is the problem. I always tend to get a bit sloppy with classes where the only form of assessment is the exam, and if the class is not very interesting either, it gets hard. However, these kinds of classes are typically very theoretical courses, and one way I cope with them is to make them practical. For instance, in this course there are a lot of distributed algorithms that the student is expected to know. Some of them are several pages long, and I'm really not the type for keeping all that in my head, and if I did, it would only be because I memorized it. So instead, I tried to implement the algorithms, as it helps with understanding because you can see how they work in action! What I did in this case was to create a node abstraction/class which I could re-use in several algorithms. The node's interface is something like this:
    void send(Message, nodeid);  // Send message to a single node
    void multicast(Message);     // Multicast message to all neighbouring nodes
    void deliver(Message);       // RMI method called by other nodes via their send method
    Message receive();           // Blocking receive method to fetch contents from buffer
The node creation itself adds necessary neighbours, and connections are specified at startup time. The Message class contains most info necessary, but is extended in some algorithms that need extra stuff. I implemented these algorithms using the interface:
- Ricart-Agrawala's mutex algorithm
- Maekawa's mutex algorithm
- Peterson election in unidirectional ring
Some algorithms are really tricky, and I end up spending more time wondering how to implement them than actually doing it, so I guess this technique is not always a good one :)
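To give an idea of how small the core of such an algorithm can be once the messaging is hidden behind a node abstraction, here is a sketch of the decision a node makes in Ricart-Agrawala's mutex algorithm when it receives a request. This is not the code I wrote for the course, which used the RMI-style interface above. The type and function names are made up, and sending, receiving and the queue of deferred replies are left out.

```haskell
-- Hypothetical types for this sketch.
type NodeId    = Int
type Timestamp = Int   -- logical (Lamport) clock value of a request

-- What a node is currently doing with respect to the critical section.
data State
    = Released           -- not interested in the critical section
    | Wanted Timestamp   -- has sent its own request with this timestamp
    | Held               -- currently inside the critical section
    deriving (Eq, Show)

data Action = ReplyNow | Defer
    deriving (Eq, Show)

-- Decide how node `self` answers a request (timestamp, sender).
-- Requests are ordered by timestamp, with the node id breaking ties,
-- which is exactly the lexicographic ordering on tuples.
onRequest :: NodeId -> State -> (Timestamp, NodeId) -> Action
onRequest _    Held          _ = Defer
onRequest _    Released      _ = ReplyNow
onRequest self (Wanted myTs) request
    | (myTs, self) < request   = Defer     -- our own request has priority
    | otherwise                = ReplyNow  -- the other request came first
```

A node enters the critical section once every other node has replied, and on leaving it sends the replies it deferred. That bookkeeping is where most of the implementation time goes.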
Phew, the first quarter of my exchange study is almost over. So far, the stay here in the Netherlands have been very exciting. First of all, we did an awesome project creating a quad rotor controller using a joystick to fly. A demo of the previous years group can be found here. We were actually able to make it work like in this video. The hardware consists of a Xilinx Spartan 3E starter kit running the X32 CPU core developed here at TU-Delft, a PC with serial link to the FPGA board, a joystick connected to the PC, and the Quad Rotor itself connected to the FPGA board via a modified serial link. We implemented the control software, signal filtering etc on the X32 in C, and after optimizations, we had a cycle time almost half of the required, and it flew!
The other course I've been taking is a seminar on wireless sensor networks, which covered nearly all aspects of the topic, with students presenting a paper on a certain subject each week. I presented a paper on reliable energy-aware routing, which was very interesting.
Lastly, I have a course in distributed algorithms, which will finish on April the 4th with an exam. The course teaches various distributed algorithms for synchronization, global state detection, deadlock, locking etc, and goes through several P2P protocols as well.
After this quarter I'll also go home to Norway for a short vacation, finally :)
The past week I've been using some of my time to set up Ikiwiki, and I was able to import my WordPress FreeBSD blog without too much hassle. I had to manually edit some posts, but other than that, most of the work went into getting the tagging stuff right.
After loader support for ZFS was imported into FreeBSD around a month ago, I've been thinking of installing a ZFS-only system on my laptop. I also decided to try out using the GPT layout instead of using disklabels etc.
The first thing I did was to grab a snapshot of FreeBSD CURRENT. However, I discovered that the loader on it doesn't support ZFS, so you have to build your own FreeBSD CD in order to install a working loader! Look in src/release/Makefile and src/release/i386/mkisoimages.sh for how to do this. Since sysinstall doesn't support setting up ZFS, it can't be used, so one has to use the Fixit environment on the FreeBSD install CD instead. I started out by removing the existing partition table on the disk (just writing zeros to the start of the disk will do).
The next step was to set up the GPT with the partitions that I wanted to have. When using GPT in FreeBSD, one should create one partition to contain the initial gptzfsboot loader. In addition, I wanted a swap partition, as well as a partition to use for a zpool for the whole system.
To set up the GPT, I used gpart(8) and looked at the examples in the man page. The first thing to do is to set up the GPT partition scheme, first by creating the partition table and then by adding the appropriate partitions.
gpart create -s GPT ad4
gpart add -b 34 -s 128 -t freebsd-boot ad4
gpart add -b 162 -s 5242880 -t freebsd-swap ad4
gpart add -b 5243042 -s 125829120 -t freebsd-zfs ad4
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad4
This creates the initial GPT, and adds three partitions. The first partition contains the gptzfsboot loader, which is able to recognize and load the loader from a ZFS partition. The second partition is the swap partition (I used 2.5 GB for swap in this case). The third partition is the partition containing the zpool (60 GB). Sizes and offsets are specified in sectors (1 sector is typically 512 bytes). The last command puts the needed bootcode into ad4p1 (freebsd-boot).
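The offsets and sizes above are plain arithmetic on sectors. A minimal sketch of how they can be derived, assuming 512-byte sectors and a first usable sector of 34 (the usual value for a standard GPT on such a disk); the variable and function names are my own and only serve the illustration:

```shell
# Sketch: derive the gpart offsets and sizes from human-readable sizes.
SECTOR_BYTES=512      # assumed sector size
FIRST_USABLE=34       # sectors 0-33 hold the protective MBR and the primary GPT

# Convert a size in MiB to a number of sectors.
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / SECTOR_BYTES ))
}

boot_size=128                        # 128 sectors = 64 KiB for gptzfsboot
swap_size=$(mib_to_sectors 2560)     # 2.5 GiB of swap
zfs_size=$(mib_to_sectors 61440)     # 60 GiB for the zpool

# Each partition starts where the previous one ends.
boot_start=$FIRST_USABLE
swap_start=$(( boot_start + boot_size ))
zfs_start=$(( swap_start + swap_size ))

echo "freebsd-boot: -b $boot_start -s $boot_size"
echo "freebsd-swap: -b $swap_start -s $swap_size"
echo "freebsd-zfs:  -b $zfs_start -s $zfs_size"
```

The printed values match the -b and -s arguments passed to gpart add above, so the same script can be adapted when picking other sizes.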
Having set up the partitions, the hardest part should be done. As we are in the Fixit environment, we can now create the zpool as well.
zpool create data /dev/ad4p3
The zpool should now be up and running. I then decided to create the different filesystems I wanted to have in this pool. I created /usr, /home and /var (I use tmpfs for /tmp).
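Creating those filesystems boils down to one zfs create per dataset. A minimal sketch, assuming the pool is named data as above; the run helper, the create_filesystems function and the DRY_RUN variable are my own illustrative names, and by default the script only prints the commands instead of executing them:

```shell
# Sketch only: DRY_RUN, run() and create_filesystems() are illustrative
# helpers, not part of ZFS.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create one dataset per given name under the given pool.
create_filesystems() {
    pool="$1"
    shift
    for fs in "$@"; do
        run zfs create "${pool}/${fs}"
    done
}

create_filesystems data usr home var
```

Running it with DRY_RUN=0 executes the commands for real. Each dataset inherits its mountpoint from the pool, so data/usr shows up under /data/usr until the mountpoints are changed.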
Then, FreeBSD must be installed on the system. I did this by copying all folders from /dist in the Fixit environment into the zpool. In addition, the /dev folder has to be created. For more details on this, you can follow (http://wiki.freebsd.org/AppleMacbook). At least /dist/boot should be copied in order to be able to boot.
Then, booting has to be set up. First, boot/loader.conf has to contain:
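For a root pool named data, the standard FreeBSD loader settings would look like this. This is a sketch based on the stock tunables rather than a copy of my own file, so adjust the pool name to match yours:

```
zfs_load="YES"
vfs.root.mountfrom="zfs:data"
```

The first line loads the ZFS kernel module at boot, and the second tells the kernel to mount the root filesystem from the data pool.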
Any additional filesystems or swap have to be entered into etc/fstab, in my case:
/dev/ad4p2 none swap sw 0 0
I also entered the following into etc/rc.conf:
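The usual rc.conf knob for ZFS is zfs_enable, which makes the rc scripts mount the ZFS filesystems at boot. A sketch, assuming nothing else ZFS-related is needed in that file:

```
zfs_enable="YES"
```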
In addition, boot/zfs/zpool.cache has to exist so that the zpool can be imported automatically when ZFS loads at system boot. To do this, I had to:
mkdir /boot/zfs
zpool export data && zpool import data
This populates /boot/zfs/zpool.cache in the Fixit environment. Then, I copied zpool.cache to boot/zfs on the zpool:
cp /boot/zfs/zpool.cache /data/boot/zfs
Finally, a basic system should be installed. The last thing to do is to unmount the filesystem(s) and set a few properties:
zfs set mountpoint=legacy data
zfs set mountpoint=/usr data/usr
zfs set mountpoint=/var data/var
zfs set mountpoint=/home data/home
zpool set bootfs=data data
To get all the quirks right, such as permissions etc., you should do a real install, either by making world or by using sysinstall, once booted into the system. Reboot, and you might be as lucky as me and boot into your ZFS-only system :) For further information, take a look at:
- http://wiki.freebsd.org/ZFSOnRoot which contains some information on how to use ZFS as root, but by booting from UFS
- http://wiki.freebsd.org/AppleMacbook which has a nice section on setting up the zpool in a Fixit environment
When rebuilding FreeBSD after this type of install, it's also important that you build with LOADER_ZFS_SUPPORT=YES in order for the loader to be able to read zpools.
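One way to make the setting stick is to put it in /etc/make.conf, so that every rebuild of the source tree picks it up. A sketch, assuming the usual make.conf mechanism:

```
LOADER_ZFS_SUPPORT=YES
```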