Rolling Your Own Embedded Linux Distribution
Erik Andersen
Senior Software Engineer, Lineo Inc.
390 South 400 West
Lindon, UT 84042
Email: andersen@lineo.com
Abstract
This presentation will cover the software, tools, libraries, and configuration files needed
to construct an embedded Linux operating system. Some of the software available for
constructing embedded Linux systems will be discussed, and selection criteria for which
tools to use for differing embedded applications will be presented. Throughout the paper,
we will construct an embedded Linux distribution to perform a simple task using only the
Linux kernel, the uClibc C library, BusyBox, and a bootloader. The presenter will then
boot up the newly constructed embedded Linux operating system and show that it works
perfectly.
Linux distributions[1] tend to be designed for server and desktop systems. As such, they
deliver a full-featured, comprehensive set of tools for just about every purpose
imaginable. Most Linux distributions, such as RedHat, Debian, or SuSE, provide
hundreds of separate software packages adding up to several gigabytes of software. The
goal of a server or desktop Linux distribution is to provide as much value as possible to
the user; therefore, the large size is quite appropriate. The traditional server and desktop
Linux focus has caused a number of things about the Linux operating system to be much
larger than is desirable for building embedded systems. Since embedded devices
represent a fundamentally different target for Linux, it is appropriate that embedded
devices should use different software than what is commonly used on desktop systems.
Linux has a number of strengths which make it extremely attractive for the next
generation of embedded devices, but it is important that developers use the best software
tools that are available to maximize the advantage of using Linux in the embedded space.
This paper will describe some of the software tools available for building very small
embedded Linux systems.
[1] The term "distribution" is used by the Linux community to refer to a collection of software, including the
Linux kernel, application programs, and needed library code, which makes up a complete running system.
Sometimes, the term "Linux" or "GNU/Linux" is also used to refer to this collection of software.
Introduction
Before we cover the software tools needed to build an embedded Linux system, we have
to answer several questions: Why are we doing this? Why use embedded Linux? What
are the advantages of using Linux? Is Linux small enough to fit inside the target device?
There are a large number of embedded operating systems available that can be used for a
moderate fee to develop high-quality embedded systems, so it's worthwhile to explore
some of the advantages Linux has to offer. The answer to these questions depends largely
on whom you ask. The Free Software Foundation would like everyone to use Free
Software, because it is the morally correct thing to do. Open Source advocates (these are
the people who get the most press time of late) will tell you that Open Source software
development can harness the creative efforts of the best software developers throughout
the world and produce high-quality software as a result. While both these viewpoints are
important, they are not generally sufficient to convince executive managers to switch to
using embedded Linux to develop products.
On the other hand, what managers generally do care about is money, and nine times out
of ten, they choose Linux because it is free, as in gratis. When making the choice
between using Linux or using a proprietary operating system, many fail to consider the
rest of the cost involved. How much does it cost to purchase a development seat using
the chosen operating system? How available are software developers that have
experience using the chosen operating system? If you have an existing staff of software
developers, how much will it cost to retrain them? How much are the per-unit royalty
costs? Can the operating system do the job it is required to do? Is Linux small enough
to fit inside my device?
Now I am going to cheat a bit, and avoid giving any detailed, exhaustive answers to these
questions. This is a technical presentation, not an editorial or an advocacy piece.
Hopefully, if you are reading this paper now, you have already decided that Linux is a
viable option for you.
When I began working on embedded Linux, the last of the preceding questions, "Is Linux
small enough to fit inside my device?", was a difficult problem for us. I work at Lineo
doing embedded Linux software development (my official job title is "Senior Software
Engineer/Code Poet"). In fact, I was the first engineer hired after Lineo shifted its
direction from embedded DOS to focus exclusively on embedded Linux. At Lineo, we
had customers who wanted to deliver embedded devices which would run Linux in
extremely small amounts of flash memory. This was a real challenge for us, since at the
time, we were relying on the same applications which were used in standard Linux
distributions on the desktop.
As I began to analyze how we could save space, it quickly became apparent that there
were three main areas we could attack to shrink the footprint of Embedix (Lineo's
embedded Linux distribution): the kernel, the application programs, and the libraries.
Many of the higher-ups in Linux kernel development (including Linus himself) have been
working on shrinking the footprint of the kernel. Thus, over the past year and a half, I
have focused on the latter two areas, shrinking the footprint of the application programs
and libraries required to produce working embedded Linux systems.
The C Library
Let's take a look at a common embedded Linux system, the Linux Router Project
(http://www.linuxrouter.org/). The Linux Router Project, begun by Dave Cinege, was
and continues to be a very commonly used embedded Linux system. Its self-described
tagline reads "A networking-centric micro-distribution of Linux" which is "small enough
to fit on a single 1.44MB floppy disk, and makes building and maintaining routers, access
servers, thin servers, thin clients, network appliances, and typically embedded systems
next to trivial." If we download a copy of one of the Linux Router Project's "idiot
images" (I grabbed one from a mirror at
ftp://sunsite.unc.edu/pub/Linux/distributions/linux-router/dists/current/idiot-image_1440KB_FAT_2.9.8_Linux_2.0.gz)
and open it up, we can see a few very interesting facts:
[root@sage /tmp]# mount idiot-image_1440KB_FAT_2.9.8_Linux_2.2 /mnt -o loop
[root@slag /tmp]# mkdir test
[root@slag /tmp]# cd test
[root@slag test]# tar -xzf /mnt/ROOT.LRP
[root@slag test]# du -ch bin root sbin usr var
460k bin
8.0k root
264k sbin
12k usr/bin
304k usr/sbin
36k usr/lib/ipmasqadm
40k usr/lib
360k usr
56k var/lib/lrpkg
60k var/lib
4.0k var/spool/cron/crontabs
8.0k var/spool/cron
12k var/spool
76k var
1.2M total
[root@slag test]# du -ch lib
24k lib/POSIXness
1.1M lib
1.1M total
So as we look at the software contained in this embedded Linux system, we quickly
notice that in a software image totaling roughly 2.3 Megabytes, the libraries take up
nearly half the space. If we look even closer at the set of libraries, we quickly find that
the largest single component in the entire system is the GNU C library, in this case
occupying nearly 650k. What is more, this is a very old version of the C library; newer
versions of GNU libc, such as version 2.2.2, are over 1.1 Megabytes all by themselves! There are tools available
from Linux vendors and in the Open Source community which can reduce the footprint of
the GNU C library considerably; however, there are system design issues that may
preclude the use of these tools. Even when these tools are appropriate, there are limits to
the amount of size which can be reclaimed from the GNU C library in this way. In this
paper I will not discuss the techniques of library reduction.
A single file that is so large certainly looks like low hanging fruit. In practice, replacing
the GNU C library for embedded Linux systems has not been an easy job at all.
The origins of uClibc
As I despaired over the large size of the GNU C library, I decided that the best thing to do
would be to find another C library for Linux that would be better suited for embedded
systems. I spent quite a bit of time looking around, and after carefully evaluating the
various Open Source C libraries that I knew of[2], I sadly found that none of them were
suitable replacements. Of all the Open Source C libraries, the library closest to what I
imagined an embedded C library should be was uClibc. However, it also had a lot of
problems at the time-- not the least of which was that uClibc had no central maintainer, it
had no mechanism for supporting multiple architectures, and there had already been
several source tree forks. In short, uClibc was a mess of twisty versions, all different.
After spending some time with the code, I decided to fix it.
With the help of D. Jeff Dionne, one of the originators of uClinux[3], I ported uClibc to run
on Intel compatible x86 CPUs. I then grafted in the header files from glibc 2.1.3 to
simplify software ports, and I cleaned up the resulting breakage. This effort has made
porting software to run with uClibc extremely easy. There were many functions in
uClibc that were either broken or missing, and had to be written from scratch and/or
rewritten. When appropriate, I sometimes grafted in bits of code from the current GNU
C library and libc5. Once the core of the library was reasonably solid, I began adding a
platform abstraction layer to allow uClibc to run on different types of CPUs. Once I had
both the ARM and x86 platforms basically running, I made a few small announcements
to the Internet. At this point, several people began to make regular contributions.
In January 2001, after a great deal of effort both on my part and on the part of the other
contributors to uClibc, I was able to build the first shared library version of uClibc. At
about the same time, a wrapper for the gcc compiler was contributed, which greatly
simplified the process of compiling applications with uClibc.
[2] The Open Source C libraries I evaluated included Al's Free C RunTime library, which is available from
http://www.k9wk.com/cdoc.html, dietlibc, which is available from http://www.fefe.de/dietlibc/,
the minix C library from http://www.cs.vu.nl/cgi-bin/raw/pub/minix/2.0.0/src.tar, the newlib library from
http://sources.redhat.com/newlib/, and the eCos C library from ftp://sourceware.cygnus.com/pub/ecos/
[3] uClinux is a port of Linux designed to run on micro-controllers which lack Memory Management Units
(MMUs) such as the Motorola DragonBall or the ARM7TDMI. The uClinux web site is found at
http://www.uclinux.org/.
Enough Background--Let's make something that works!
Now that I have certainly bored people with the history lesson, we finally get to the fun
part: building our own embedded Linux system. To begin, we need to create a skeleton
filesystem where we will build our embedded Linux distribution, so we will start by
creating an empty filesystem:
[andersen@slag /tmp]$ dd if=/dev/zero of=root_fs bs=1k count=600
600+0 records in
600+0 records out
[root@slag /tmp]# ls -sh /tmp/root_fs
604k /tmp/root_fs
[andersen@slag /tmp]$ mkfs.minix /tmp/root_fs
224 inodes
600 blocks
Firstdatazone=11 (11)
Zonesize=1024
Maxsize=268966912
At this point, we now have a 600k file containing a minix filesystem. In order to copy
files into this filesystem we must now mount it:
[andersen@slag /tmp]$ mount /tmp/root_fs /mnt -o loop -t minix
mount: only root can do that
[andersen@slag /tmp]$ su
Password:
[root@slag /tmp]# mount /tmp/root_fs /mnt -o loop -t minix
Something important to notice here is that we had to have elevated (root) permissions in
order to mount the filesystem. Now that the filesystem is mounted we can copy any file
into it that we want. It is now time to begin compiling some source code.
Compiling uClibc
Before we can compile uClibc, we must first grab a copy of the source code and unpack it
so it is ready to use. For this paper, we will just grab a copy of the daily uClibc snapshot.
[root@slag /tmp]# wget -q ftp://oss.lineo.com/uClibc-snapshot.tar.gz
[root@slag /tmp]# tar -xzf uClibc-snapshot.tar.gz
[root@slag /tmp]# cd uClibc
[root@slag uClibc]#
uClibc has a configuration file, Config, that can be edited to adjust the way the library is
compiled, such as to enable or disable features (e.g., whether debugging support is enabled
or not), and to select a cross-compiler. Since we are going to be targeting a standard Intel
compatible x86 system, no changes to the configuration file are necessary. We can now
begin the compilation process.
[root@slag uClibc]# make
[---------compilation omitted---------]
make[1]: Leaving directory `/tmp/uClibc/unistd'
ranlib libc.a
Finding missing symbols in libc.a ...
partial linking...
No missing symbols found.
gcc -s -nostdlib -shared -o libuClibc.so.1 -Wl,-soname,libuClibc.so.1 tmp/*.o
Finally finished compiling...
[root@slag uClibc]# make install
+ mv -f /lib/libuClibc.so.1 /lib/libuClibc.so.1.old
+ rm -f /lib/libuClibc.so.1.old
+ cp libuClibc.so.1 /lib
+ chmod 644 /lib/libuClibc.so.1
+ chown -R root.root /lib/libuClibc.so.1
+ rm -f /lib/libuClibc.so
+ ln -s /lib/libuClibc.so.1 /lib/libuClibc.so
+ ldconfig
At this point we have a fully compiled uClibc library which is ready to be used.
We also have a compiler wrapper which we can use to compile applications against
the uClibc library. We will put this wrapper to use in just a few minutes. But first, some
more boring historical information.
The Origins of Busybox
As I mentioned earlier, the two pieces of embedded Linux that I chose to tackle were
making smaller libraries and making smaller application programs. A typical Linux
system contains a variety of command-line utilities from numerous different
organizations and independent programmers. Among the most prominent of these utilities
are the GNU shellutils, fileutils, textutils, and similar programs that can be run within a
shell. The GNU utilities are very high-quality, feature-rich programs. That large feature
set comes at the cost of size: they are prohibitively large for embedded systems. After
some investigation, I determined that it would be more efficient to replace them rather
than try to strip them down, so I began looking at alternatives.
Just as with alternative C libraries, there were several choices for small shell utilities:
BSD has a number of utilities which could be used. The Minix operating system, which
was recently released under a free software license, also had many useful utilities. Sash,
the stand alone shell, was also a possibility. After quite a lot of research, the one that
seemed to be the best fit was Busybox. It also appealed to me because I was already
familiar with Busybox from its use on the Debian boot floppies, and because I was
acquainted with Bruce Perens, who was the maintainer. Starting approximately in
October 1999, I began enhancing Busybox and fixing the most obvious problems. Since
Bruce was otherwise occupied and was no longer actively maintaining Busybox, Bruce
eventually consented to let me take full ownership of Busybox.
Since that time, Busybox has gained a large following and attracted development talent
from all over the world. It has been used in commercial products such as IBM's
wristwatch and 3Com's Kerbango Internet Radio, with more appearing all the time. So
many new features and applets have been added to Busybox, that the biggest challenge I
now face is simply keeping up with all of the new patches that get submitted!
So, How Does It Work?
Busybox is a multi-call binary that combines many common Unix utilities into a single
executable. When it is run, Busybox checks if it was executed by running a symlink, and
if the symlink name matches an applet compiled into Busybox, it runs that applet. If it
was run as "busybox", then Busybox will read the command line and try to execute any
applet passed as the first argument. For example:
[root@slag busybox]$ ./busybox date
Wed Feb 28 17:17:52 MST 2001
[root@slag busybox]$ ./busybox echo "hello there"
hello there
[root@slag busybox]$ ln -s ./busybox uname
[root@slag busybox]$ ./uname
Linux
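The dispatch logic just described can be sketched in a few lines of shell. This is only an illustration of the multi-call idea, not how Busybox itself is written (Busybox is a C program). The applet names hello and count, and the function names run_applet and multicall, are made up for this example.

```shell
# Toy multi-call dispatcher (illustration only; all names here are hypothetical).

applet_hello() { echo "hello $*"; }   # prints its arguments
applet_count() { echo "$#"; }         # prints how many arguments it got

# run_applet NAME [ARGS...]: run the applet called NAME, if one was "compiled in".
run_applet() {
    name=$1; shift
    case "$name" in
        hello) applet_hello "$@" ;;
        count) applet_count "$@" ;;
        *) echo "$name: applet not found" >&2; return 127 ;;
    esac
}

# multicall INVOKED_AS [ARGS...]: INVOKED_AS plays the role of argv[0].
multicall() {
    invoked=$(basename "$1"); shift
    if [ "$invoked" = "multicall" ]; then
        # Run under its own name: the first argument selects the applet.
        [ $# -gt 0 ] || { echo "usage: multicall APPLET [ARGS...]" >&2; return 1; }
        run_applet "$@"
    else
        # Run through a link: the link name selects the applet.
        run_applet "$invoked" "$@"
    fi
}

multicall ./multicall hello there   # prints "hello there"
multicall /bin/count a b c          # prints "3"
```

The first call mimics running the binary under its own name, and the second mimics running it through a symlink named count.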
Busybox is designed such that the developer compiling it for his embedded system can
select exactly which applets he wants to include in the final binary. Thus, it is possible to
strip out support for unneeded applets, resulting in a smaller binary with a carefully
selected set of commands. The customization granularity for Busybox even goes one step
further: each applet can contain multiple features that can be turned on or off. Thus, for
example, if you don't want to include command-line completion in the Busybox shell
(lash), or you do not need to mount NFS filesystems, you can simply turn these features
off, further reducing the size of the final Busybox binary.
Compiling Busybox
Let's walk through a normal compile of Busybox. First, we must grab a copy of the
Busybox source code and unpack it so it is ready to use. For this paper, we will just grab
a copy of the daily Busybox snapshot.
[root@slag /tmp]# wget -q http://busybox.lineo.com/busybox.tar.gz
[root@slag /tmp]# tar -xzf busybox.tar.gz
[root@slag /tmp]# cd busybox
Now that we are in the Busybox source directory we can configure Busybox so that it
meets the needs of our embedded Linux system. This is done by editing the file Config.h
so that only the applets and features we want are enabled. Additionally, there are several
configuration settings which can be set in the file named Makefile. These settings are
all prefaced by descriptions which are intended to make it obvious what each setting
does. At this point, we do not need to change anything so we will proceed to the
compilation process.
[root@slag busybox]# make CC=/tmp/uClibc/extra/gcc-uClibc/gcc-uClibc-i386
[---------compilation omitted---------]
[root@slag busybox]# ldd ./busybox
libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)
ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)
[root@slag busybox]# ls -sh ./busybox
140k ./busybox*
And we're done!
Installing Busybox to the Target
If you then want to install Busybox onto your target device, this is most easily done by
typing: make install. The installation script automatically creates all the required
directories (such as /bin, /sbin and the like) and creates appropriate symlinks in those
directories for each applet that was compiled into the Busybox binary.
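The effect of that symlink step can be sketched as follows. This is not the Busybox installation script itself. The helper name install_links and the list of link paths are assumptions chosen for illustration, and the demonstration writes into a throwaway temporary directory rather than a real target.

```shell
# install_links PREFIX LINK...: create one symlink per applet under PREFIX,
# each pointing at /bin/busybox. (Hypothetical helper, not Busybox's own installer.)
install_links() {
    prefix=$1; shift
    for link in "$@"; do
        mkdir -p "$prefix$(dirname "$link")" || return 1
        ln -sf /bin/busybox "$prefix$link" || return 1
    done
}

# Demonstration in a scratch directory standing in for the target root.
root=$(mktemp -d)
install_links "$root" /bin/ls /bin/sh /sbin/init /usr/bin/uname
ls -l "$root/bin" "$root/sbin" "$root/usr/bin"
rm -rf "$root"
```

Each link points at the absolute path /bin/busybox, so it resolves correctly once the directory tree becomes the root filesystem of the target.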
To continue with the loop-mounted filesystem from our earlier example, let's install
Busybox onto it:
[root@slag busybox]# make PREFIX=/mnt install
[---------installation text omitted---------]
There now, that wasn't too difficult. However, we are not yet done. Our loop-mounted
filesystem still does not have a copy of uClibc installed. If you recall a bit earlier, we ran
the 'ldd' command to check the list of shared libraries which Busybox was linked against.
In order for our system to work properly, every shared library listed by 'ldd' must be
included in our target. So now is as good a time as any to install the required libraries.
[root@slag busybox]# ldd ./busybox
libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)
ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)
[root@slag busybox]# mkdir /mnt/lib
[root@slag busybox]# cp /lib/libuClibc.so.1 /mnt/lib
[root@slag busybox]# cp /lib/ld-linux.so.1 /mnt/lib
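Copying each library by hand works for two files, but the same step can be scripted. The following is a minimal sketch, assuming ldd prints lines of the form "name => /path (address)" as in the transcript above. The function names libs_from_ldd and copy_libs are hypothetical.

```shell
# libs_from_ldd: read `ldd` output on stdin, print the resolved library paths.
libs_from_ldd() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# copy_libs BINARY ROOT: copy every shared library BINARY needs into ROOT/lib.
# (Hypothetical helper; needs write permission on ROOT.)
copy_libs() {
    mkdir -p "$2/lib" || return 1
    ldd "$1" | libs_from_ldd | while read -r lib; do
        cp "$lib" "$2/lib/" || return 1
    done
}

# Feeding in the ldd output shown in this paper prints the two library paths.
printf '%s\n' \
    '	libuClibc.so.1 => /lib/libuClibc.so.1 (0x4000c000)' \
    '	ld-linux.so.1 => /lib/ld-linux.so.1 (0x40039000)' | libs_from_ldd
```

With the loop-mounted filesystem from this example, the call would be copy_libs ./busybox /mnt.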
With the Busybox binary and the required shared libraries installed, we really just have a
little bit of wrap-up work to do to finish off our embedded Linux root filesystem. In
particular, we need to create a directory for mounting the proc filesystem, we need a few
entries in /etc, and we need some device special files (device nodes) in /dev. Let's finish
all that work off now.
[root@slag busybox]# mkdir /mnt/dev /mnt/dev/pts /mnt/etc /mnt/etc/init.d /mnt/proc /mnt/tmp
[root@slag busybox]# (cd /dev; cp -a console core full hd[abcd] kmem mem \
    null port ram ram0 ram1 random tty tty0 tty1 tty2 tty3 tty4 urandom vcs \
    vcs0 vcs1 vcsa vcsa0 vcsa1 zero /mnt/dev)
[root@slag busybox]# echo "/dev/root / minix ro 0 1" > /mnt/etc/fstab
[root@sage /tmp]# ldconfig -qr /mnt
[root@sage /tmp]# touch /mnt/etc/profile
[root@slag busybox]# ln -s /proc/mounts /mnt/etc/mtab
Finally, we need to tell init what we want it to do when we boot up. For now we will just
keep this simple, and simply start up a shell.
[root@sage /tmp]# echo "::respawn:/bin/sh" > /mnt/etc/inittab
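Each inittab line has four colon-separated fields: id, runlevels, action, and process. Here the first two are left empty, and the respawn action asks init to restart /bin/sh whenever it exits. As a small illustration (the function name parse_inittab_line is made up for this example), the fields can be split apart in shell like so:

```shell
# parse_inittab_line LINE: print the action and process fields of one inittab entry.
parse_inittab_line() {
    # The first two fields (id and runlevels) may be empty, as in "::respawn:/bin/sh".
    printf '%s\n' "$1" | {
        IFS=: read -r id runlevels action process
        echo "action=$action process=$process"
    }
}

parse_inittab_line "::respawn:/bin/sh"   # prints "action=respawn process=/bin/sh"
```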
We now have our root filesystem finished and ready to go. But we still need to do a little
more work before we can boot up our newly built embedded Linux system. For
simplicity, I will boot the system from a floppy disk. Let's finish things up quickly
now.
[root@sage /tmp]# umount /mnt
[root@sage /tmp]# gzip -9 /tmp/root_fs
[root@sage /tmp]# ls -sh /tmp/root_fs.gz
164k /tmp/root_fs.gz
So now our root filesystem has been compressed and is ready to install on the boot
media. Let's prepare the floppy disk so we can boot from it:
[root@sage /tmp]# mformat A:
[root@sage /tmp]# syslinux /dev/fd0
[root@sage /tmp]# mcopy /tmp/root_fs.gz A:
So we now have a copy of our root filesystem on a floppy disk. Next we need to install a
copy of the Linux kernel. To make things simple, I'm going to install a copy of the
Linux kernel that I am currently running on my system.
[root@sage /tmp]# mcopy /boot/kernel-2.2.19pre3 A:linux
Finally, we need to configure the bootloader (in case you missed it a few steps ago, we
are using the syslinux bootloader for this example). I happen to have a ready to use
syslinux configuration file already in my /tmp directory, so I will now install that to the
floppy disk as well:
[root@sage /tmp]# cat SYSLINUX.CFG
DEFAULT linux
APPEND initrd=root_fs.gz root=/dev/ram0 rw
TIMEOUT 1
PROMPT 0
[root@sage /tmp]# mcopy SYSLINUX.CFG A:
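For readers who do not have such a file lying around, it can be generated with a here-document. The sketch below writes the same four lines into a scratch directory and reports the file size. The function name write_syslinux_cfg is made up for this example. In syslinux, TIMEOUT is counted in tenths of a second, and PROMPT 0 shows the boot prompt only if the user asks for it by holding a key such as Shift or Alt.

```shell
# write_syslinux_cfg FILE: write the bootloader configuration used in this paper.
write_syslinux_cfg() {
    cat > "$1" <<'EOF'
DEFAULT linux
APPEND initrd=root_fs.gz root=/dev/ram0 rw
TIMEOUT 1
PROMPT 0
EOF
}

# Write the file into a scratch directory and report its size.
tmpdir=$(mktemp -d)
write_syslinux_cfg "$tmpdir/SYSLINUX.CFG"
cfg_size=$(( $(wc -c < "$tmpdir/SYSLINUX.CFG") ))
echo "$cfg_size bytes"   # prints "76 bytes"
rm -rf "$tmpdir"
```

The 76 bytes agree with the size that mdir reports for SYSLINUX.CFG on the finished floppy.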
And now, finally, we are done. Our embedded Linux system is complete and ready to
boot. And you know what? It is very, very small. Take a look.
[root@sage /tmp]# mdir A:
Volume in drive A has no label
Volume Serial Number is 05D3-BF29
Directory for A:/
LDLINUX SYS 5860 03-01-2001 6:25
root_fs gz 160348 03-01-2001 6:26 root_fs.gz
linux 510337 03-01-2001 6:26 linux
SYSLINUX CFG 76 03-01-2001 6:39
4 files 676 621 bytes
779 776 bytes free
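The numbers in this listing can be checked with a little arithmetic. The sketch below assumes a standard 1.44MB FAT12 floppy, which has 2847 data sectors of 512 bytes each, and assumes each file occupies a whole number of sectors. The helper name on_disk is made up for this example.

```shell
SECTOR=512
DATA_SECTORS=2847   # assumption: data area of a standard 1.44MB FAT12 floppy

# on_disk SIZE: bytes a file of SIZE bytes occupies, rounded up to whole sectors.
on_disk() {
    echo $(( ($1 + SECTOR - 1) / SECTOR * SECTOR ))
}

total=0       # sum of the file sizes as reported by mdir
allocated=0   # space actually consumed on the disk
for size in 5860 160348 510337 76; do
    total=$(( total + size ))
    allocated=$(( allocated + $(on_disk "$size") ))
done
free=$(( DATA_SECTORS * SECTOR - allocated ))

echo "files: $total bytes, free: $free bytes"
# prints "files: 676621 bytes, free: 779776 bytes"
```

Both figures match the mdir output, so more than half of the floppy is still unused.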
With a carefully optimized Linux kernel (which this kernel unfortunately isn't) we could
expect to have even more free space. And remember, every bit of space we save is
money that embedded Linux developers don't have to spend on expensive flash memory.
So now comes the final test; it is now time to boot from our floppy disk. Here is what
you should see.
[-----------kernel boot messages snipped-----------]
Freeing unused kernel memory: 64k freed
init started: BusyBox v0.50pre (2001.03.01-13:06+0000) multi-call binary -- GPL2
BusyBox v0.50pre (2001.03.01-13:06+0000) Built-in shell (lash)
Enter 'help' for a list of built-in commands.
/ # du -h
136.0k ./bin
1.0k ./sbin
1.0k ./usr/bin
1.0k ./usr/sbin
3.0k ./usr
196.0k ./lib
1.0k ./dev/pts
2.0k ./dev
1.0k ./etc/init.d
5.0k ./etc
1.0k ./proc
1.0k ./tmp
346.0k .
/ #
And there you have it -- how to build the world's smallest embedded Linux system.
Conclusion
The two largest components of a standard Linux system are the utilities and the libraries.
By replacing these with smaller equivalents a much more compact system can be built.
Using Busybox and uClibc allows you to customize your embedded distribution by
stripping out unneeded applets and features, thus further reducing the final image size.
This space savings translates directly into decreased cost per unit as less flash memory
will be required. Combine this with the cost savings of using Linux, rather than a more
expensive proprietary OS, and the reasons for using Linux become very compelling.