Friday, August 03, 2012

Moved!

That's right, I've moved to a different blog - and a heap of this content will transfer across.

If you're stumbling across this blog and want to read anything a bit newer - head on over to Ideas, and Code.

Cheers! :)

Monday, October 10, 2011

Pedigree's 'pup' package manager

Recently there's been a lot of activity in the project repositories related to the 'pup' package manager.

I've been working on getting our binaries and other ported-application files out of each repository, moving towards keeping only the specific binaries we need for building Pedigree from scratch, with the remainder being source code and tools.

So now a checkout of Pedigree and an Easy Build run will leave you with quite a bare build of Pedigree - you'll get bash, coreutils, and not too much else. This allows you to choose what goes into your build of Pedigree; if you just want Python and Lynx for web browsing, that's totally alright with us.

'pup' is written in Python and will be available to run in Pedigree as well as on the build machine. At the moment, however, our Python build is a little broken as we are working on bringing proper shared library support to our userspace ports.

In my opinion there's no reason for everybody to download over 250 MB just to get a simple checkout of Pedigree. That's what 'pup' integration is all about.

Minecraft: Fix for Error (4): null

Been having this issue on and off for a while now, but finally nailed the source of the problem today. I run Minecraft behind a Squid 2.7 proxy, which appears to have trouble handling ETags properly.

In particular, it seems to have to do with the If-None-Match header.

Upgraded to Squid 3.0 and I'm able to update Minecraft without using a browser - awesome! :)
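
For anyone unfamiliar with the header involved, here's a minimal sketch of what a conditional request looks like from the client side, using Python's standard urllib. The URL and ETag value are made up for illustration; this is not what the Minecraft launcher actually sends, just the general shape of an If-None-Match request that a proxy has to pass through correctly.

```python
import urllib.request


def conditional_request(url, etag):
    """Build a GET request asking the server (or a proxy in between) to
    answer 304 Not Modified if the resource still matches `etag`."""
    request = urllib.request.Request(url)
    request.add_header("If-None-Match", etag)
    return request


# Hypothetical URL and ETag value, for illustration only.
req = conditional_request("http://example.com/minecraft.jar", '"abc123"')

# urllib normalises stored header names with str.capitalize(),
# hence the lower-case lookup key here.
print(req.get_method(), req.get_header("If-none-match"))
```

Nothing is sent over the network here; the request object is only built and inspected.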

Wednesday, June 01, 2011

And now for something COMPLETELY different!

I know this is technically a Pedigree devblog, but I also do sysadmin work and this might be handy for someone else out there.

In a Windows network environment, folder redirection comes in handy for things such as server-side My Documents. We use folder redirection for the Desktop and Start Menu as well, to provide a set of icons to each user that we can manage centrally.

However, it turns out Folder Redirection can't always redirect to a mapped network drive (e.g. M:). The order of GPO processing means the drive mapping takes place after folder redirection has already been applied. So if you've mapped a bunch of network drives for the user and want to redirect to one of them rather than to a UNC path, it won't work. That's a problem for us, as we map some drives to different locations based on the computer's organisational unit.

So the fix comes from Group Policy Preferences: all that "Folder Redirection" does is set a few registry keys. Group Policy Preferences lets us set those registry keys manually, with the added bonus of the extremely powerful granular targeting (Item-Level targeting). The targeting lets us apply different registry keys to Admin users instead of Teachers, for example.

The relevant keys for the Desktop and Start Menu are:

All Users - common across every user logged into the machine
HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders

Values: Common Programs ("All Programs"), Common Start Menu (Icons outside of "All Programs"), Common Startup ("Startup" folder), Common Desktop (the desktop).

Individual Users
HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders

Values: Desktop, Start Menu, Startup, Programs - each serving the same purpose as its "Common" counterpart above.

So, by setting a combination of these keys, we were able to get the desktop and start menu completely redirected to a network share that's unique per organisational unit. This means modifying the icons for a room involves a simple copy & paste on the file share rather than rebooting a computer for the startup script to take effect.
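
As a rough sketch of what ends up in those registry entries, the snippet below builds the name-to-path mapping for a given root. The root (M:\Icons) and the sub-folder layout under it are assumptions for illustration, not our actual share layout. In practice each pair would be entered as a Registry item in Group Policy Preferences, with item-level targeting deciding who gets which root; the script only prints the mapping.

```python
# Registry key (relative to HKLM or HKCU) that holds the shell folder values.
USER_SHELL_FOLDERS = (
    r"Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders"
)

# Assumed layout of the redirected folders beneath the chosen root.
FOLDERS = {
    "Desktop": r"Desktop",
    "Start Menu": r"Start Menu",
    "Programs": r"Start Menu\Programs",
    "Startup": r"Start Menu\Programs\Startup",
}


def redirection_values(root, all_users=True):
    """Return {value name: path} for the All Users (HKLM) or per-user (HKCU) key."""
    prefix = "Common " if all_users else ""
    return {prefix + name: root + "\\" + sub for name, sub in FOLDERS.items()}


# Hypothetical mapped-drive root, for illustration only.
for name, path in redirection_values(r"M:\Icons").items():
    print("HKLM\\%s : %s = %s" % (USER_SHELL_FOLDERS, name, path))
```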

Sunday, May 15, 2011

Live Streaming some code!

I've set up a Justin.tv channel to which I can stream live coding and debugging of issues in Pedigree and other projects. I gave it a go this afternoon with about an hour's worth of playing around with Pedigree's memset, and I hope to regularly stream new stuff - there's certainly enough to do, after all!

Thursday, February 03, 2011

Package Management & Workflow

For those who don't know, one of my "mini-projects" (of which there are many!) on the Pedigree project is the 'pup' package manager. The idea behind pup is to create a Python-oriented package manager that works great on Pedigree but can also be used for any other operating system that happens to have Python ported. Essentially all of the backend is just standard HTTP GET requests, the packages are just .tar.gz files, and building a package is dead simple (assuming the environment is okay).

The problem with 'pup' is that it currently works out of a git repository which holds every package that will be in a master repository. This is great for editing the build scripts and versioning the pup source code, but when it comes to any form of management it's a nightmare. The solution for this is to write a web-based user interface for managing a package repository.

That's where it gets exciting.

Actually managing packages in their built state is very, very easy. You simply modify the SQLite database containing package information where necessary, and move the package files around. The difficulty comes from building and creating packages. The easiest way to convince users to port more software to Pedigree is to make porting extremely easy - this is the goal of pup and the build scripts in our Pedigree-Apps repository. So any web interface for management has to handle both managing existing packages (update/remove/rename/edit attributes) and bringing new packages into the system.

The workflow I came up with in my head for this is as follows:

  1. User creates a build.sh and related scripts to build the relevant port. This is hopefully tested by the user and works on that system. Creating these scripts is already documented.
  2. These scripts are uploaded to the web interface.
  3. The permissions of the user are checked; depending on these the package will be automatically built on the server straight away, or added to a queue for moderation (we'll assume it was permitted to be built here).
  4. As the build continues, its status is made available to the user.
  5. Once the build completes, the user is notified of the status of the build. Most likely an IRC bot is also present to notify developers/interested parties of the build status.
  6. The package is added to the system and mirrored from the master repository to various other mirrors, ready for users to download the update/new package via pup on their Pedigree installation.

Here's the problem - as far as I can tell nothing like this exists.

So I'm looking at writing this myself, most likely in Python. It should be a fun project for when all the other projects are hitting dead ends, and has the added bonus of reducing my workload on Pedigree when done!
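
To make the workflow above a little more concrete, here's a minimal sketch of steps 3 to 6 as a small state machine. None of this exists yet: the names (Submission, Status, submit) and the idea of passing the build, notification and mirror steps in as plain callables and lists are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Status(Enum):
    QUEUED_FOR_MODERATION = auto()
    BUILDING = auto()
    BUILT = auto()
    FAILED = auto()
    PUBLISHED = auto()


@dataclass
class Submission:
    package: str
    uploader: str
    status: Status = None
    history: list = field(default_factory=list)

    def move_to(self, status):
        # Step 4: every status change is recorded so the user can follow it.
        self.status = status
        self.history.append(status)


def submit(sub, trusted_users, build, notify, mirrors):
    """Run an uploaded set of build scripts through the workflow."""
    # Step 3: untrusted uploads wait for a moderator instead of building.
    if sub.uploader not in trusted_users:
        sub.move_to(Status.QUEUED_FOR_MODERATION)
        notify(sub)
        return sub

    sub.move_to(Status.BUILDING)
    succeeded = build(sub)
    sub.move_to(Status.BUILT if succeeded else Status.FAILED)
    notify(sub)  # Step 5: tell the user (and, say, an IRC bot).

    if succeeded:
        # Step 6: push the package out to every mirror.
        for mirror in mirrors:
            mirror.append(sub.package)
        sub.move_to(Status.PUBLISHED)
    return sub


# Stand-ins for the real build server, notifier and mirrors.
notifications = []
mirror_a, mirror_b = [], []
result = submit(
    Submission("lynx", "alice"),
    trusted_users={"alice"},
    build=lambda sub: True,
    notify=lambda sub: notifications.append((sub.package, sub.status)),
    mirrors=[mirror_a, mirror_b],
)
print([status.name for status in result.history])
```

A real version would run the build asynchronously and persist the status somewhere the web interface can read it; the synchronous calls here just keep the sketch short.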

As an aside, notable changes since the last blog post:
  • We released a feature-test release - Foster Milestone #1.
  • SSE in the kernel, oh my! memset takes this very well, memcpy not so much due to floating point state issues. These are being resolved at the moment.
  • A variety of changes across the POSIX subsystem and within the network stack for improved stability.

Also, we're looking for anyone who's got a few years' worth of OS development under their belt to come on board and work on the project - if you feel like something new and a C++ kernel doesn't worry you, find us in #pedigree on Freenode IRC.

Thursday, November 25, 2010

Updates! Pedigree-Apps and POSIX stuff.

Over the past couple of months I've been working less on Pedigree as a whole and working on less intensive projects, such as playing video games, or writing small userspace applications. There is nothing that sorts out early burnout symptoms faster than taking a short break - and that means getting away from low-level programming completely! So progress is definitely slowing down, but that doesn't mean things aren't getting done.

As for the "Pedigree Team": Eduard (eddyb) has now started high school and so has a lot less time to contribute to Pedigree. He's still got a fair few ideas flying through his mind though and that's always a good thing. Can't deny it - since he first showed up in our IRC channel his behaviour (not to mention his English) has improved significantly. Just have to hope that doesn't go to his head ;-).

I still consider Pedigree a unique project: proof of what can happen when a group of mostly like-minded individuals come together and use their combined skillset to make something really significant. It's a shame people like James and Joerg have left us, but this is just the way things work. I don't know about Joerg, but James has certainly used Pedigree as a springboard into some pretty amazing work with his Horizon project.

Enough talk about logistics and teamwork though! Let's talk about code.

The first and foremost thing I really want to talk about is the "Pedigree-Apps" repository. This was set up a while back as a place for all of our ported packages to sit, and as a staging point for a master package repository for whichever package manager we ended up using.

History lesson time! We initially started using "pacman", Arch Linux's package manager. It's a nice piece of software, and does the job very well. Its dependencies are not quite as nice, however (especially libdownload, which is essentially impossible to find). It also doesn't lend itself particularly well to cross-compiling packages.

So recently I put together a new piece of software called "pup" - "Package Updater for Pedigree." It takes a path to a directory containing a Pedigree-style tree of directories, packs that tree into a compressed tarball, and adds it to a package repository somewhere. The databases used on both the server (for synchronising local "available package" lists) and the client ("installed packages") are nice and simple SQLite databases that Python has modules to read out of the box. Oh yes, that's right - pup is completely written in Python. Why write pup in Python? The answer is simple: Python is already a Pedigree build dependency. You have to have Python to build Pedigree.
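
To give a feel for how little is involved, here's a minimal sketch of that "tarball a tree and register it" step using only the Python standard library. It is not pup's actual code: the function names, the table schema, the file naming and the "applications" directory in the demo are all assumptions for illustration.

```python
import os
import sqlite3
import tarfile
import tempfile


def create_package(tree_dir, name, version, repo_dir, db_path):
    """Pack tree_dir into <name>-<version>.tar.gz and record it in SQLite."""
    os.makedirs(repo_dir, exist_ok=True)
    archive = os.path.join(repo_dir, "%s-%s.tar.gz" % (name, version))
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the tree root so the package can be
        # extracted straight over a Pedigree filesystem.
        tar.add(tree_dir, arcname=".")

    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS packages ("
            "name TEXT, version TEXT, filename TEXT, "
            "PRIMARY KEY (name, version))"
        )
        conn.execute(
            "INSERT OR REPLACE INTO packages VALUES (?, ?, ?)",
            (name, version, os.path.basename(archive)),
        )
    conn.close()
    return archive


def list_packages(db_path):
    """Return every (name, version, filename) row in the package database."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name, version, filename FROM packages ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


# Demo in a throwaway directory with a made-up one-file package.
with tempfile.TemporaryDirectory() as work:
    tree = os.path.join(work, "tree")
    os.makedirs(os.path.join(tree, "applications"))
    with open(os.path.join(tree, "applications", "hello"), "w") as f:
        f.write("#!/bin/sh\necho hello\n")

    db = os.path.join(work, "packages.db")
    archive = create_package(tree, "hello", "1.0", os.path.join(work, "repo"), db)
    with tarfile.open(archive) as tar:
        members = tar.getnames()
    rows = list_packages(db)

print(rows)
```

The client side would be the mirror image of this: fetch the database and the .tar.gz over plain HTTP, extract, and record the package in a local "installed" database.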

In the future the main Pedigree repository will ship with a copy of pup ready to run, and instead of distributing hundreds of megabytes worth of binary files for a hard disk image, anyone who builds Pedigree will be allowed to choose which packages to add to their hard disk image. This will hopefully reduce the size of source snapshots and shallow clones, and also make it easier to customise a hard disk image to a specific testcase.

As for Pedigree-Apps itself, it now has a shiny build system that caters directly to both cross-compiling and automated builds. Package maintainers just provide patches and a set of scripts (for each phase of the build - prebuild, configuration, build, install, and for packages which provide libraries that future package builds might depend on, a script to set up library links for the cross-compiler) and the rest falls into place. Compared to the old system, this is much easier for package maintainers and also significantly reduces the total size of the repository, which is fantastic for those with slower internet connections.

With all that out of the way, I have a few quick things to add about some changes that have gone through the POSIX subsystem. And by quick, I mean bullet points! Yay!

  • Brand new pthread locks
    • I got fairly sick of having every pthread lock function call into the kernel, and it seriously hammered performance. So I spent a few hours rewriting both the spinlock and mutex functions to use atomic operations all the way out in userspace, only calling into the kernel to put a thread to sleep or wake a thread up. The resulting performance boost is significant: a few mutex tests that I ran as a benchmark averaged a completion time very close to that of Linux. Obviously it's hard to truly benchmark such an OS-specific thing, but I think that if a VM is hitting speeds close to that of Linux for a similar construct, I've done pretty well.
  • Rework of statfs/statvfs
    • Only one of these functions is actually standardised - statvfs. The other is not. So I've removed statfs completely and stuck with statvfs, and so far nothing has complained. I'm yet to actually implement either of these functions, but I finally have a method of enumerating mounted filesystems in the VFS making an implementation of these functions fairly simple.
  • Condition variables
    • I'm hesitant to put this here because I really don't like the implementation I wrote, but I'll do it anyway. Condition variables of the pthread variety are now implemented. Badly. There is very little in the way of atomicity: they end up being two mutexes tied together that are locked and unlocked non-atomically. It's pretty nasty all up. Eventually I'll rewrite these functions with their own dedicated implementation rather than building on the existing mutex implementation, but that will probably only happen when this implementation causes a bug.

All up there's a lot still going on with Pedigree. Most tasks are now becoming longer subprojects - this is just a part of building such a massive piece of software and is to be expected.

We're looking at possibly doing a "milestone" release in a month or two just to get our current code out into the wild (which does well for finding both bugs and feature requests). Stay tuned!

Sunday, May 09, 2010

New Core Features

It's been a long time since I posted anything here. Moving to a new city has slowed my development down significantly, but things are getting back to normal again.

Recently there has been a fair bit of activity, and I'll summarise what's been going on in this post. There are a couple of new features, some changes to Pedigree itself, and some bug fixes. In-depth discussion of each, where necessary, follows the list.

  • ZombieQueue (commit d5314bf7)
  • Using > 4 GB of RAM now works, with the x86_64 port (commit bb6b954b)
  • MemoryPool (commit 64454975)
  • UdpLogger (versatile log callbacks) (commit ff5fddba)

ZombieQueue
ZombieQueue solves a problem that has been lying around for quite some time. Things such as signals require that a Process object is deleted as part of it being killed, so it is no longer schedulable. However, this was dealt with originally by a "delete this;" line. Any form of "delete this" can be dangerous when the next line is not a return statement. Even when the next line is a return statement, a reschedule out of the current context and into another, which allocates memory, may overwrite the existing object... making "this" invalid. That said, it did suffice for quite some time.

The idea of ZombieQueue is that you queue these deletions, allowing the thread to be rescheduled (or the call that needs deletion to return) and the object to be deleted at a later time, when it's safe to do so. Eventually ZombieQueue might also have a "delay" parameter, which ensures that it never deletes an object until after a specific time period has passed. This could easily be used for things such as TIME_WAIT handling in the TCP stack.

ZombieQueue has already been integrated into Process, Pipe, and Socket, all of which needed to delete themselves somehow. Now they call into ZombieQueue, which is able to safely delete the object.
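
The pattern is easier to see in a few lines of code. The kernel version is C++; what follows is only a Python sketch of the same idea (hand the object to a queue, let a separate worker destroy it later), with made-up class and method names rather than Pedigree's real interface.

```python
import queue
import threading


class ZombieQueue:
    """Destroys queued objects on a worker thread, never in the caller."""

    def __init__(self):
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def add(self, zombie):
        # Returns immediately; the caller can finish what it was doing
        # (or be rescheduled) before the object actually goes away.
        self._queue.put(zombie)

    def _run(self):
        while True:
            zombie = self._queue.get()
            zombie.destroy()
            self._queue.task_done()

    def wait_until_empty(self):
        """Block until everything queued so far has been destroyed."""
        self._queue.join()


class Process:
    def __init__(self, pid, zombies, log):
        self.pid = pid
        self.alive = True
        self._zombies = zombies
        self._log = log

    def kill(self):
        # The old approach tore the object down right here ("delete this").
        # Instead, mark it unschedulable and defer the destruction.
        self.alive = False
        self._zombies.add(self)

    def destroy(self):
        self._log.append("destroyed %d" % self.pid)


log = []
zombies = ZombieQueue()
Process(42, zombies, log).kill()
zombies.wait_until_empty()
print(log)
```

A "delay" parameter like the one mentioned above could be layered on by storing a deadline alongside each object and having the worker wait until it has passed.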

x86_64 port using > 4 GB of RAM
A TODO all through the physical memory manager for quite some time has been to handle regions above 4 GB in the x86_64 port. Until this change was made, no ranges of memory above 4 GB were actually accessible. In fact, using the x86_64 port on a system with over 4 GB of memory actually caused anything from a panic to a triple fault and reboot.

This has been rectified now. I'm still not sure it'll handle a full 64-bit physical address space yet, but it will handle a massive amount of memory nonetheless.

MemoryPool
This one is something I'm surprised wasn't implemented sooner. The idea of MemoryPool is to take buffer allocation out of the kernel heap, where it consumed far too much memory - sometimes even the full size of the heap.

MemoryPool takes a block of memory, outside of the heap, and splits it up into buffers of a specified size. These buffers are distributed to modules and applications where necessary, and returned to the pool when they are no longer needed. Buffers in, for example, the network stack, are around 2048 bytes (rounded up to the next power of 2 from 1500). Rather than have every incoming packet cause an allocation on the kernel heap of, on average, 800 bytes, this moves the processing into a pool of memory dedicated to the job.

A key feature of MemoryPool is its blocking nature. If an allocation cannot be satisfied by the current pool of memory (i.e. all buffers are already allocated), it will block until a buffer becomes available. Compared to the heap, this means a constant memory usage is enforced. Effectively, the pool provides a strict contract on its memory consumption: a pool will never resize to fit a new allocation.

This does bring up a minor problem: some code can't block - an interrupt handler for example. For this situation, a non-blocking allocation method is available, which merely returns NULL if no buffers are available.
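
Here's a minimal sketch of those semantics in Python: a fixed number of fixed-size buffers, a blocking allocate, and a non-blocking variant that returns None. The real MemoryPool is C++ inside the kernel, so the class layout and method names below are assumptions for illustration only.

```python
import queue
import threading


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


class MemoryPool:
    def __init__(self, buffer_size, buffer_count):
        self.buffer_size = buffer_size
        # All buffers are created up front; the pool never grows.
        self._free = queue.Queue()
        for _ in range(buffer_count):
            self._free.put(bytearray(buffer_size))

    def allocate(self):
        """Block until a buffer is available, then return it."""
        return self._free.get()

    def allocate_now(self):
        """Non-blocking variant: return a buffer, or None if none are free."""
        try:
            return self._free.get_nowait()
        except queue.Empty:
            return None

    def free(self, buffer):
        """Return a buffer to the pool, waking one blocked allocator if any."""
        self._free.put(buffer)


# An Ethernet-sized payload rounds up to a 2048-byte buffer.
pool = MemoryPool(next_power_of_two(1500), buffer_count=2)
first = pool.allocate()
second = pool.allocate()

# The pool is exhausted: the non-blocking call gives up straight away.
exhausted = pool.allocate_now()

# A blocking call waits until another thread hands a buffer back.
threading.Timer(0.05, pool.free, args=(first,)).start()
third = pool.allocate()

print(pool.buffer_size, exhausted, third is first)
```

The non-blocking call is the one an interrupt handler would use, dropping the packet (or whatever is appropriate) when it gets None back.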

UdpLogger
It has long been possible to install custom Log callbacks which take log entries and write them to a destination. Callbacks that already exist include the Serial logger, and a callback is used for the loading screen log dump. For testing on real machines, however, neither of these suffices. None of my test machines have a serial port, for example, but they are all networked via Ethernet.

UdpLogger can only be installed once the route table has been set up, so it's only really useful for debugging if you can actually boot to a shell on the test machine. It sends all log entries to a given IP address and port, so a test machine can dump all its debugging output to a development machine running netcat.

Here it is, in action: [screenshot]

Log::installCallback has now also been updated to dump all the old log entries to the new callback, so every logging destination should have the same data once they are all installed.
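
The two pieces (a callback that ships entries over UDP, and replaying old entries when a callback is installed) can be sketched together in a few lines of Python. This is not the kernel's C++ Log class; the names below are made up, and a local socket on the loopback address stands in for the development machine running netcat.

```python
import socket


class Log:
    def __init__(self):
        self._entries = []
        self._callbacks = []

    def install_callback(self, callback):
        # Replay everything logged so far, so a late-installed
        # destination ends up with the same data as the others.
        for entry in self._entries:
            callback(entry)
        self._callbacks.append(callback)

    def write(self, entry):
        self._entries.append(entry)
        for callback in self._callbacks:
            callback(entry)


def udp_callback(host, port):
    """Return a log callback that sends each entry as one UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(entry):
        sock.sendto(entry.encode("utf-8") + b"\n", (host, port))

    return send


# Stand-in for the development machine: a UDP socket on loopback.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
host, port = receiver.getsockname()

log = Log()
log.write("early boot message")       # logged before any network exists
log.install_callback(udp_callback(host, port))
log.write("route table ready")        # logged after the callback is in place

received = [receiver.recv(4096) for _ in range(2)]
receiver.close()
print(received)
```

On a real setup the receiving end would simply be netcat listening for UDP on the chosen port (the exact flags depend on the netcat variant).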

These are just a couple of exciting new features and fixes that have happened in the past couple of weeks. At this stage we have not planned a second release - we're still working through the bugs from Foster and finally getting around to fixing all those TODOs scattered in the code.

At the moment git master is being kept "stable" - that is, it builds and runs to the bash shell (apart from x86_64) - so feel free to grab the latest code and have a play!