Thursday, November 25, 2010

Updates! Pedigree-Apps and POSIX stuff.

Over the past couple of months I've been working less on Pedigree as a whole and working on less intensive projects, such as playing video games, or writing small userspace applications. There is nothing that sorts out early burnout symptoms faster than taking a short break - and that means getting away from low-level programming completely! So progress is definitely slowing down, but that doesn't mean things aren't getting done.

As for the "Pedigree Team": Eduard (eddyb) has now started high school and so has a lot less time to contribute to Pedigree. He's still got a fair few ideas flying through his mind though, and that's always a good thing. Can't deny it - since he first showed up in our IRC channel his behaviour (not to mention his English) has improved significantly. Just have to hope that doesn't go to his head ;-).

I still consider Pedigree a unique project: proof of what can happen when a group of mostly like-minded individuals come together and use their combined skillset to make something really significant. It's a shame people like James and Joerg have left us, but this is just the way things work. I don't know about Joerg, but James has certainly used Pedigree as a springboard into some pretty amazing work with his Horizon project.

Enough talk about logistics and teamwork though! Let's talk about code.

The first and foremost thing I really want to talk about is the "Pedigree-Apps" repository. This was set up a while back as a place for all of our ported packages to sit, and as a staging point for a master package repository for whichever package manager we ended up using.

History lesson time! We initially started using "pacman", Arch Linux's package manager. It's a nice piece of software, and does the job very well. The dependencies it asks for are not quite as nice, however (especially libdownload, which is essentially impossible to find). It also doesn't lend itself particularly well to cross-compiling packages.

So recently I put together a new piece of software called "pup" - "Package Updater for Pedigree." It takes a path to a directory containing a Pedigree-style tree of directories, which it tarballs, compresses, and adds to a package repository somewhere. The databases used on both the server (for synchronising local "available package" lists) and on the client side ("installed packages") are nice and simple SQLite databases that Python can read out of the box with its standard modules. Oh yes, that's right - pup is completely written in Python. Why write pup in Python? The answer is simple: Python is already a Pedigree build dependency. You have to have Python to build Pedigree.
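To make that a little more concrete, here is a rough sketch of the "tarball a tree and register it in a SQLite database" idea. This is not pup's actual code: the function name `make_package`, the archive naming scheme, the `packages.db` filename, and the table layout are all made up for illustration. Only the standard `tarfile` and `sqlite3` modules are used.

```python
import sqlite3
import tarfile
from pathlib import Path


def make_package(tree: Path, name: str, version: str, repo: Path) -> Path:
    """Compress a directory tree and record it in a SQLite package list.

    Everything about the layout here is a hypothetical stand-in for
    whatever pup really does.
    """
    repo.mkdir(parents=True, exist_ok=True)
    archive = repo / f"{name}-{version}.tar.gz"  # hypothetical naming scheme

    # Store members relative to the tree root so the archive can be
    # unpacked straight over a target filesystem root.
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(tree), arcname=".")

    db = sqlite3.connect(str(repo / "packages.db"))  # hypothetical filename
    with db:  # commits on success, rolls back on error
        db.execute(
            "CREATE TABLE IF NOT EXISTS packages "
            "(name TEXT PRIMARY KEY, version TEXT, file TEXT)"
        )
        db.execute(
            "INSERT OR REPLACE INTO packages VALUES (?, ?, ?)",
            (name, version, archive.name),
        )
    db.close()
    return archive
```

A client-side "installed packages" database could follow the same pattern with its own table, which is part of the appeal of keeping everything in plain SQLite.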

In the future the main Pedigree repository will ship with a copy of pup ready to run, and instead of distributing hundreds of megabytes worth of binary files for a hard disk image, anyone who builds Pedigree will be able to choose which packages to add to their hard disk image. This will hopefully reduce the size of source snapshots and shallow clones, and also make it easier to customise a hard disk image to a specific test case.

As for Pedigree-Apps itself, it now has a shiny build system that caters directly to both cross-compiling and automated builds. Package maintainers just provide patches and a set of scripts, one for each phase of the build: prebuild, configuration, build, and install. Packages which provide libraries that future package builds might depend on also get a script to set up library links for the cross-compiler. The rest falls into place. Compared to the old system, this is much easier for package maintainers and also significantly reduces the total size of the repository, which is fantastic for those with slower internet connections.
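For a feel of how a per-phase script setup can be driven, here is a minimal sketch. It is an assumption-laden illustration, not the real Pedigree-Apps build system: the script filenames (`prebuild.sh`, `configure.sh`, and so on), the `run_phases` function, and the rule that missing scripts are simply skipped are all hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical phase names, loosely following the phases described above.
PHASES = ["prebuild", "configure", "build", "install", "links"]


def run_phases(package_dir: Path, env: Optional[Dict[str, str]] = None) -> List[str]:
    """Run each phase script that exists in package_dir, in order.

    Returns the names of the phases that actually ran. A failing script
    raises subprocess.CalledProcessError and stops the build.
    """
    ran = []
    for phase in PHASES:
        script = package_dir / f"{phase}.sh"
        if not script.exists():
            continue  # assumption: phases are optional per package
        # env=None inherits the caller's environment; a cross-compile
        # driver would pass CC, PATH, etc. for the target toolchain here.
        subprocess.run(["sh", str(script)], cwd=str(package_dir), env=env, check=True)
        ran.append(phase)
    return ran
```

Keeping each phase in its own script means an automated builder can stop at the first failure and report exactly which step broke.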

With all that out of the way, I have a few quick things to add about some changes that have gone through the POSIX subsystem. And by quick, I mean bullet points! Yay!

  • Brand new pthread locks
    • I got fairly sick of having every pthread lock function call into the kernel, because it seriously hammered performance. So I spent a few hours rewriting both the spinlock and mutex functions to use atomic operations all the way out in userspace, only calling into the kernel to put a thread to sleep or wake a thread up. The resulting performance boost is significant: a few mutex tests that I ran as a benchmark averaged a completion time very close to that of Linux. Obviously it's hard to truly benchmark such an OS-specific thing, but I think that if a VM is hitting speeds close to those of Linux for a similar construct, I've done pretty well.
  • Rework of statfs/statvfs
    • Only one of these functions is actually standardised - statvfs. The other is not. So I've removed statfs completely and stuck with statvfs, and so far nothing has complained. I have yet to actually implement either function, but I finally have a method of enumerating mounted filesystems in the VFS, which should make an implementation fairly simple.
  • Condition variables
    • I'm hesitant to put this here because I really don't like the implementation I wrote, but I'll do it anyway. Condition variables of the pthread variety are now implemented. Badly. There is very little in the way of atomicity, and they end up being two mutexes tied together that are locked and unlocked non-atomically. It's pretty nasty all up. Eventually I'll rewrite these functions to use their own specific implementation rather than building upon the existing mutex implementation, but that will probably only really come when this implementation causes a bug.
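
The "atomics in userspace, kernel only to sleep or wake" idea from the pthread locks bullet can be modelled roughly like this. To be clear, this is a toy model of one well-known three-state mutex design, not Pedigree's code (which is not Python, and which may use a different scheme entirely). The class name `ModelMutex` and all of its methods are made up; a single `threading.Condition` stands in for both the hardware atomic instructions and the kernel's wait queue, and `kernel_calls` counts how often the model would have entered the kernel.

```python
import threading


class ModelMutex:
    """Toy three-state mutex: 0 = free, 1 = locked, 2 = locked with waiters."""

    def __init__(self):
        self._state = 0
        # Stand-in for hardware atomics AND the kernel wait queue.
        self._cv = threading.Condition()
        self.kernel_calls = 0  # simulated syscalls (sleep or wake)

    def _cas(self, expected, new):
        # Models an atomic compare-and-swap; returns the old value.
        with self._cv:
            old = self._state
            if old == expected:
                self._state = new
            return old

    def _xchg(self, new):
        # Models an atomic exchange; returns the old value.
        with self._cv:
            old, self._state = self._state, new
            return old

    def _sleep_if(self, expected):
        # Models "kernel, sleep me unless the state already changed".
        with self._cv:
            self.kernel_calls += 1
            if self._state == expected:
                self._cv.wait()

    def _wake_one(self):
        # Models "kernel, wake one sleeping thread".
        with self._cv:
            self.kernel_calls += 1
            self._cv.notify(1)

    def lock(self):
        c = self._cas(0, 1)
        if c == 0:
            return  # fast path: uncontended, no kernel involvement
        if c != 2:
            c = self._xchg(2)  # announce that a waiter exists
        while c != 0:
            self._sleep_if(2)
            c = self._xchg(2)  # retry; keep the "waiters" mark on success

    def unlock(self):
        # Only pay for a wake-up if someone might be sleeping.
        if self._xchg(0) == 2:
            self._wake_one()
```

The point of the model is the fast path: an uncontended `lock()` followed by `unlock()` never touches `kernel_calls`, which is where the performance win comes from when every call used to trap into the kernel.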

All up there's a lot still going on with Pedigree. Most tasks are now becoming longer subprojects - this is just a part of building such a massive piece of software and is to be expected.

We're looking at possibly doing a "milestone" release in a month or two just to get our current code out into the wild (which is good for finding bugs and collecting feature requests). Stay tuned!