
Chapter 5. The Linux Root Filesystem
In this chapter, you will learn about the root filesystem and its structure. You will also be presented with information about the root filesystem's content, the various device drivers available, and its communication with the Linux kernel. We will slowly make the transition to the Yocto Project and the method used to define the Linux root filesystem's content. The necessary information will be presented to make sure that users are also able to customize the rootfs filesystem according to their needs.
The special requirements of the root filesystem will be presented. You will be given information on its content, subdirectories, defined purposes, the various filesystem options available, the BusyBox alternative, and also a lot of interesting features.
When interacting with an embedded environment, many developers start from a minimal root filesystem made available by a distribution provider, such as Debian, and, using a cross-toolchain, enhance it with various packages, tools, and utilities. If the number of packages to be added is large, this can be very troublesome work. Starting from scratch would be an even bigger nightmare. Inside the Yocto Project, this job is automated and there is no need for manual work. The development starts from scratch, and it offers a large number of packages inside the root filesystem to make the work fun and interesting. So, let's move ahead and take a look at this chapter's content to understand more about root filesystems in general.
Interacting with the root filesystem
A root filesystem consists of a directory and file hierarchy. In this file hierarchy, various filesystems can be mounted, revealing the content of a specific storage device. The mounting is done using the mount command, and after the operation is done, the mount point is populated with the content available on the storage device. The reverse operation is called umount and is used to empty the mount point of its content.
The preceding commands are very useful for the interaction of applications with various files, regardless of their location and format. For example, the standard form of the mount command is mount -t type device directory. This command asks the kernel to connect the filesystem found on device, which has the type format mentioned in the command line, to the directory mentioned in the same command. The umount command needs to be issued before removing the device to make sure the kernel caches are written back to the storage device.
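For example, mounting the first partition of an SD card and later removing it safely could look like this (the device name /dev/mmcblk0p1 and the ext4 type are assumptions that depend on your setup):

mount -t ext4 /dev/mmcblk0p1 /mnt
# ... interact with the files under /mnt ...
umount /mnt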
A root filesystem is available in the root hierarchy, also known as /. It is the first available filesystem and also the one on which the mount command is not used, since it is mounted directly by the kernel through the root= argument. The following are the multiple options to load the root filesystem:
- From memory
- From the network using NFS
- From a NAND chip
- From an SD card partition
- From a USB partition
- From a hard disk partition
These options are chosen by hardware and system architects. To make use of these, the kernel and bootloader need to be configured accordingly.
Besides the options that require interaction with a board's internal memory or storage devices, one of the most used methods to load the root filesystem is represented by the NFS option, which implies that the root filesystem is available on your local machine and is exported over the network to your target. This option offers the following advantages:
- The size of the root filesystem will not be an issue due to the fact that the storage space on the development machine is much larger than the one available on the target
- The update process is much easier and can be done without rebooting
- Having access to over the network storage is the best solution for devices with small or even nonexistent internal or external storage
The downside of over the network storage is the fact that a server-client architecture is needed. So, for NFS, the NFS server functionality will need to be available on the development machine. For an Ubuntu host, the required configuration involves installing the nfs-kernel-server package with sudo apt-get install nfs-kernel-server. After the package is installed, the exported directory location needs to be specified and configured. This is done using the /etc/exports file; here, configuration lines similar to /nfs/rootfs <client-IP-address>(rw,no_root_squash,no_subtree_check) appear, where each line defines a location shared over the network with the NFS clients. After the configuration is finished, the NFS server needs to be restarted in this way: sudo /etc/init.d/nfs-kernel-server restart.
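To summarize the server-side setup in one sketch (the export path and the client address are examples that need to match your setup):

sudo apt-get install nfs-kernel-server
# share /nfs/rootfs with one client; adjust the IP address to your target
echo "/nfs/rootfs 192.168.1.111(rw,no_root_squash,no_subtree_check)" | sudo tee -a /etc/exports
sudo /etc/init.d/nfs-kernel-server restart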
For the client side available on the target, the Linux kernel needs to be configured accordingly to make sure that NFS support is enabled, and also that an IP address will be available at boot time. These configurations are CONFIG_NFS_FS=y, CONFIG_IP_PNP=y, and CONFIG_ROOT_NFS=y. The kernel also needs to be configured with the root=/dev/nfs parameter, the IP address for the target, and the NFS server information in the nfsroot=192.168.1.110:/nfs/rootfs form.
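Here is a sketch of how the two components end up wired together, reusing the server address from the preceding paragraph and assuming 192.168.1.111 for the target:

# on the development machine, in /etc/exports
/nfs/rootfs 192.168.1.111(rw,no_root_squash,no_subtree_check)
# on the target, as part of the kernel command line
root=/dev/nfs nfsroot=192.168.1.110:/nfs/rootfs ip=192.168.1.111 rw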
There is also the possibility of having a root filesystem integrated inside the kernel image, that is, a minimal root filesystem whose purpose is to start the full featured root filesystem. This root filesystem is called initramfs. This type of filesystem is very helpful for people interested in fast booting options for smaller root filesystems that only contain a number of useful features and need to be started earlier. It is useful for the fast loading of the system at boot time, but also as an intermediate step before starting the real root filesystem available on one of the storage locations. The root filesystem is the first thing started after the kernel booting process, so it makes sense for it to be available alongside the Linux kernel, residing near the kernel in the RAM memory.
To create an initramfs, its configuration needs to be made available. This happens by defining either the path to the root filesystem directory, the path to a cpio archive, or even a text file describing the content of the initramfs inside the CONFIG_INITRAMFS_SOURCE configuration variable. When the kernel build starts, the content of CONFIG_INITRAMFS_SOURCE will be read and the root filesystem will be integrated inside the kernel image.
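A minimal sketch of this mechanism, assuming a hypothetical description file named initramfs.desc referenced from the kernel configuration, could look like this:

# in the kernel .config
CONFIG_INITRAMFS_SOURCE="initramfs.desc"

# initramfs.desc, written in the gen_init_cpio text format
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /bin 0755 0 0
file /bin/busybox /path/to/busybox 0755 0 0
slink /bin/sh busybox 0777 0 0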
Note
More information about the initramfs filesystem's options can be found inside the kernel documentation files at Documentation/filesystems/ramfs-rootfs-initramfs.txt and Documentation/early-userspace/README.
The initial RAM disk, or initrd, is another mechanism for mounting an early root filesystem. It also needs support enabled inside the Linux kernel and is loaded as a component of the kernel. It contains a small set of executables and directories and represents a transient stage towards the full featured root filesystem. It only represents the final stage for embedded devices that do not have a storage device capable of fitting a bigger root filesystem.
On a traditional system, the initrd is created using the mkinitrd tool, which is, in fact, a shell script that automates the steps necessary for the creation of the initrd. Here is an example of its functionality:
#!/bin/bash

# Housekeeping...
rm -f /tmp/ramdisk.img
rm -f /tmp/ramdisk.img.gz

# Ramdisk Constants
RDSIZE=4000
BLKSIZE=1024

# Create an empty ramdisk image
dd if=/dev/zero of=/tmp/ramdisk.img bs=$BLKSIZE count=$RDSIZE

# Make it an ext2 mountable file system
/sbin/mke2fs -F -m 0 -b $BLKSIZE /tmp/ramdisk.img $RDSIZE

# Mount it so that we can populate
mount /tmp/ramdisk.img /mnt/initrd -t ext2 -o loop=/dev/loop0

# Populate the filesystem (subdirectories)
mkdir /mnt/initrd/bin
mkdir /mnt/initrd/sys
mkdir /mnt/initrd/dev
mkdir /mnt/initrd/proc

# Grab busybox and create the symbolic links
pushd /mnt/initrd/bin
cp /usr/local/src/busybox-1.1.1/busybox .
ln -s busybox ash
ln -s busybox mount
ln -s busybox echo
ln -s busybox ls
ln -s busybox cat
ln -s busybox ps
ln -s busybox dmesg
ln -s busybox sysctl
popd

# Grab the necessary dev files
cp -a /dev/console /mnt/initrd/dev
cp -a /dev/ramdisk /mnt/initrd/dev
cp -a /dev/ram0 /mnt/initrd/dev
cp -a /dev/null /mnt/initrd/dev
cp -a /dev/tty1 /mnt/initrd/dev
cp -a /dev/tty2 /mnt/initrd/dev

# Equate sbin with bin
pushd /mnt/initrd
ln -s bin sbin
popd

# Create the init file
cat >> /mnt/initrd/linuxrc << EOF
#!/bin/ash
echo
echo "Simple initrd is active"
echo
mount -t proc /proc /proc
mount -t sysfs none /sys
/bin/ash --login
EOF

chmod +x /mnt/initrd/linuxrc

# Finish up...
umount /mnt/initrd
gzip -9 /tmp/ramdisk.img
cp /tmp/ramdisk.img.gz /boot/ramdisk.img.gz
Note
More information on initrd can be found at Documentation/initrd.txt.
Using initrd is not as simple as using initramfs. In this case, an archive needs to be copied in a manner similar to the one used for the kernel image, and the bootloader needs to pass its location and size to the kernel so that it can be started. Therefore, in this case, the bootloader also requires initrd support. The central point of the initrd is constituted by the linuxrc file, which is the first script started and is usually used for the purpose of offering access to the final stage of the system boot, that is, the real root filesystem. After linuxrc finishes execution, the kernel unmounts it and continues with the real root filesystem.
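With the U-Boot bootloader, for example, the sequence might be sketched as follows (the load addresses are purely illustrative and board-specific, and the initrd is assumed to be wrapped with mkimage):

tftp 0x20008000 uImage          # load the kernel image into RAM
tftp 0x21000000 ramdisk.uboot   # load the initrd archive into RAM
# pass both addresses; U-Boot forwards the initrd location and size to the kernel
bootm 0x20008000 0x21000000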
Delving into the filesystem
No matter what their provenance is, most of the available root filesystems have the same organization of directories, as defined by the Filesystem Hierarchy Standard (FHS), as it is commonly called. This organization is of great help to both developers and users because it not only mentions a directory hierarchy, but also the purpose and content of the directories. The most notable ones are:
- /bin: This refers to the location of most programs
- /sbin: This refers to the location of system programs
- /boot: This refers to the location for boot options, such as the kernel image, kernel config, initrd, system maps, and other information
- /home: This refers to the user home directories
- /root: This refers to the location of the root user's home directory
- /usr: This refers to user-specific programs and libraries, and mimics parts of the content of the root filesystem
- /lib: This refers to the location of libraries
- /etc: This refers to the system-wide configurations
- /dev: This refers to the location of device files
- /media: This refers to the location of mount points for removable devices
- /mnt: This refers to the mount point location for static media
- /proc: This refers to the mounting point of the proc virtual filesystem
- /sys: This refers to the mounting point of the sysfs virtual filesystem
- /tmp: This refers to the location of temporary files
- /var: This refers to data files, such as logging data, administrative information, or the location of transient data
The FHS changes over time, but not very much. Most of the previously mentioned directories remain the same for various reasons, the simplest one being the fact that they need to ensure backward compatibility.
Note
The latest available information on the FHS is available at http://refspecs.linuxfoundation.org/FHS_2.3/fhs-2.3.pdf.
The root filesystem is started by the kernel, and this is the last step performed by the kernel before it ends the boot phase. Here is the exact code that does this:
/*
 * We try each of these until one succeeds.
 *
 * The Bourne shell can be used instead of init if we are
 * trying to recover a really broken machine.
 */
if (execute_command) {
        ret = run_init_process(execute_command);
        if (!ret)
                return 0;
        pr_err("Failed to execute %s (error %d). Attempting defaults...\n",
               execute_command, ret);
}
if (!try_to_run_init_process("/sbin/init") ||
    !try_to_run_init_process("/etc/init") ||
    !try_to_run_init_process("/bin/init") ||
    !try_to_run_init_process("/bin/sh"))
        return 0;

panic("No working init found. Try passing init= option to kernel. "
      "See Linux Documentation/init.txt for guidance.");
In this code, it can easily be identified that there are a number of locations used to search for the init process that needs to be started before exiting the Linux kernel boot execution. The run_init_process() function is a wrapper around the execve() function, which does not return if no errors are encountered in the call procedure. The called program overwrites the memory space of the executing process, replacing the calling thread and inheriting its PID.
This initialization phase is so old that a similar structure can also be found inside the Linux 1.0 version. It represents the start of user space processing. If the kernel is not able to execute one of the four preceding options in the predefined locations, then the kernel will halt and a panic message will be printed on the console to issue an alert that no init process can be started. So, the user space processing will not start until the kernel space processing is finished.
For the majority of the available Linux systems, /sbin/init is the location from which the kernel spawns the init process; the same affirmation is also true for the Yocto Project's generated root filesystems. It is the first application run in the user space context, but it isn't the only necessary feature of the root filesystem. There are a couple of dependencies that need to be resolved before running any process inside the root filesystem: dynamically linked references that were not resolved earlier, and also dependencies that require external configurations. For the first category of dependencies, the ldd tool can be used to spot the dynamically linked dependencies, but for the second category, there is no universal solution. For example, for the init process, the configuration file is inittab, which is available inside the /etc directory.
For developers not interested in running the default init process, an option is available through the init= kernel command line parameter, where the path to the binary to be executed should be made available. This information is also visible in the preceding code. Customizing the init process is not a method commonly used by developers, because the init process is a very flexible one, which makes a number of startup scripts available.
Every process started after init uses the parent-child relationship, where init acts as the parent for all the processes run in the user space context, and is also the provider of environment parameters. Initially, the init process spawns processes according to the information available inside the /etc/inittab configuration file, which defines the runlevel notion. A runlevel represents the state of the system and defines the programs and services that have been started. There are eight runlevels available, numbered from 0 to 6, plus a special one noted as S. Their purpose is described here:
- 0: This runlevel halts the entire system
- 1: This runlevel enters single-user mode, used for administrative tasks
- 2 to 5: These runlevels enter multi-user mode; traditionally, 5 also starts a graphical interface
- 6: This runlevel reboots the system
- S: This runlevel runs the startup scripts before any other runlevel is entered, and is also used for single-user mode
Each runlevel starts and kills a number of services. The services that are started begin with S, and the ones that are killed begin with K. Each service is, in fact, a shell script that defines the behavior of the service it provides.
The /etc/inittab configuration script defines the runlevels and the instructions applied to all of them. For the Yocto Project, the /etc/inittab looks similar to this:
# /etc/inittab: init(8) configuration.
# $Id: inittab,v 1.91 2002/01/25 13:35:21 miquels Exp $

# The default runlevel.
id:5:initdefault:

# Boot-time system configuration/initialization script.
# This is run first except when booting in emergency (-b) mode.
si::sysinit:/etc/init.d/rcS

# What to do in single-user mode.
~~:S:wait:/sbin/sulogin

# /etc/init.d executes the S and K scripts upon change
# of runlevel.
#
# Runlevel 0 is halt.
# Runlevel 1 is single-user.
# Runlevels 2-5 are multi-user.
# Runlevel 6 is reboot.

l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6

# Normally not reached, but fallthrough in case of emergency.
z6:6:respawn:/sbin/sulogin
S0:12345:respawn:/sbin/getty 115200 ttyS0

# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
#  <id>:<runlevels>:<action>:<process>
#
1:2345:respawn:/sbin/getty 38400 tty1
When the preceding inittab file is parsed by init, the first script that is executed is /etc/init.d/rcS from the si::sysinit:/etc/init.d/rcS line, identified through the sysinit tag. Then, runlevel 5 is entered and the processing of instructions continues until a shell is finally spawned using the /sbin/getty symlink. More information on either init or inittab can be found by running man init or man inittab in the console.
The last stage of any Linux system is represented by the power off or shutdown command. It is very important, because if it's not done appropriately, it can affect the system by corrupting data. There are, of course, multiple options to implement the shutdown scheme, but the handiest ones remain in the form of utilities, such as shutdown, halt, or reboot. There is also the possibility of using init 0 to halt the system, but, in fact, what all of them have in common is the use of the SIGTERM and SIGKILL signals. SIGTERM is used initially to notify the processes about the decision to shut down the system, offering them the chance to perform necessary actions. After this is done, the SIGKILL signal is sent to terminate all the remaining processes.
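A short sketch of these utilities in action (the delay and the warning message are arbitrary):

shutdown -h +5 "Going down for maintenance"   # halt in five minutes, warning logged-in users
reboot                                        # restart the system immediately
init 0                                        # halt by switching to runlevel 0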
Device drivers
One of the most important challenges for the Linux system is allowing applications access to various hardware devices. Notions such as virtual memory, kernel space, and user space do not help in simplifying things, but add another layer of complexity to this information.
A device driver has the sole purpose of isolating hardware devices and kernel data structures from user space applications. A user does not need to know that, to write data to a hard disk, he or she will be required to use sectors of various sizes. The user only opens a file to write inside it and closes it when finished. The device driver is the one that does all the underlying work, hiding such complexities.
Inside the user space, all the device drivers have associated device nodes, which are, in fact, special files that represent a device. All the device files are located in the /dev directory, and they are created using the mknod utility. The device nodes are available under two abstractions:
- Block devices: These are composed of fixed size blocks that are usually used when interacting with hard disks, SD cards, USB sticks, and so on
- Character devices: These are streams of characters that do not have a size, beginning, or end; devices that are not in the form of block devices, such as terminals, serial ports, sound cards, and so on, mostly take this form
Each device has a structure that offers information about it:
- Type: This identifies whether the device node is a character or block device
- Major: This identifies the category of the device
- Minor: This holds the identifier of the device node
The mknod utility that creates the device node uses a triplet of information, such as mknod /dev/testdev c 234 0. After the command is executed, a new /dev/testdev file appears. It should bind itself to a device driver that is already installed and has already defined its properties. If an open command is issued, the kernel looks for the device driver that registered with the same major number as the device node. The minor number is used for handling multiple devices, or a family of devices, with the same device driver. It is passed to the device driver so that it can use it. There is no standard way to use the minor number, but usually, it defines a specific device from a family of devices that share the same major number.
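Continuing the example, the binding can be inspected from the shell (the major number 234 is just the illustrative value used above):

grep 234 /proc/devices   # shows which driver registered major number 234
ls -l /dev/testdev       # the type (c), the major, and the minor are all visible
# crw-r--r-- 1 root root 234, 0 Oct  1 12:00 /dev/testdev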
Using the mknod utility requires manual interaction and root privileges, and lets the developer do all the heavy lifting needed to identify the properties of the device node and its corresponding device driver. The latest Linux systems offer the possibility of automating this process, and also of completing these actions every time a device is detected or disappears. This is done as follows:
- devfs: This refers to a device manager designed as a filesystem that is accessible from both kernel space and user space.
- devtmpfs: This refers to a virtual filesystem that has been available since the 2.6.32 kernel version release, and is an improvement on devfs used for boot time optimizations. It only creates device nodes for hardware available on the local system.
- udev: This refers to the daemon used on server and desktop Linux systems, where it acts as the default device manager.
- mdev: This offers a simpler solution than udev; it is, in fact, a derivation of udev.
Since system objects are also represented as files, the method of interaction with them is simplified for applications. This would not have been possible without the use of device nodes, which are actually files to which normal file interaction functions can be applied, such as open(), read(), write(), and close().
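The same file-like interaction also works from the shell, as this harmless sketch using standard device nodes shows:

dd if=/dev/urandom of=/tmp/random.bin bs=16 count=1   # read 16 bytes from the random number generator
echo "discarded" > /dev/null                          # anything written to the null device disappears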
Filesystem options
The root filesystem can be deployed using a very broad range of filesystem types, and each one does a particular task better than the rest. While some filesystems are optimized for performance, others are better at saving space or even recovering data. Some of the most commonly used and interesting ones will be presented here.
The logical division of a physical device, such as a hard disk or SD card, is called a partition. A physical device can have one or more partitions that cover its available storage space. A partition can be viewed as a logical disk that has a filesystem available for the user's purposes. The management of partitions in Linux is done using the fdisk utility. It can be used to create, list, and destroy partitions, among other general interactions, and it supports more than 100 partition types. To be more precise, 128 partition types are available on my Ubuntu 14.04 development machine.
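A short sketch of fdisk in action (/dev/sdb stands in for whatever device you are partitioning):

sudo fdisk -l         # list the partition tables of all known devices
sudo fdisk /dev/sdb   # partition the chosen device interactively
# inside the interactive session, the l command lists all supported partition types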
One of the most used and well known filesystem partition formats is ext2. Also called the second extended filesystem, it was introduced in 1993 by Rémy Card, a French software developer. It was used as the default filesystem for a large number of Linux distributions, such as Debian and Red Hat Linux, until it was replaced by its younger brothers, ext3 and ext4. It continues to remain the choice of many embedded Linux distributions and flash storage devices.
The ext2 filesystem splits data into blocks, and the blocks are arranged into block groups. Each block group maintains a copy of the superblock and the descriptor table for that block group. The superblock stores configuration information and holds the information required by the booting process; although multiple copies of it are available, usually the first copy, situated in the first block of the filesystem, is the one used. All the data for a file is usually kept in a single block so that searches can be made faster. Each block group, besides the data it contains, holds information about the superblock, the descriptor table for the block group, the inode bitmap and table information, and the block bitmap. The last notion presented here is the inode, or index node, which represents files and directories by their permissions, size, location on disk, and ownership.
There are multiple applications used for interaction with the ext2 filesystem format. One of them is mke2fs, which is used to create an ext2 filesystem on a partition, optionally with a label, as in mke2fs -L <label> /dev/sdb1. Another is the e2fsck command, which is used to verify the integrity of the filesystem, as in e2fsck /dev/sdb1. If no errors are found, this tool gives you information about the partition's filesystem configuration. The utility is also able to fix some of the errors that appear after improper utilization of the device, but it cannot be used in all scenarios.
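Putting these tools together, preparing an ext2 partition could be sketched like this (the /dev/sdb1 partition and the rootfs label are examples):

sudo mke2fs -L rootfs /dev/sdb1    # create an ext2 filesystem with the label rootfs
sudo e2fsck /dev/sdb1              # verify the integrity of the new filesystem
sudo mount -t ext2 /dev/sdb1 /mnt  # mount it for use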
Ext3 is another powerful and well known filesystem. It replaced ext2 and became one of the most used filesystems on Linux distributions. It is in fact similar to ext2, the difference being that it has the possibility of journaling the information available to it. An ext2 file format can be changed into an ext3 file format using the tune2fs -j /dev/sdb1 command. Ext3 is basically seen as an extension of the ext2 filesystem format, one that adds the journaling feature. This is possible because it was engineered to be both forward and backward compatible.
Journaling is a method that logs all the changes made on a filesystem, making the recovery functionality possible. There are also other features that ext3 adds besides the ones already mentioned; here, I am referring to the possibility of skipping the consistency check of the filesystem, mostly because the journal log can be replayed. Another important feature is that it can be mounted without checking whether the shutdown was performed correctly. This takes place because the system does not need to conduct a consistency check at power down.
Ext4 is the successor of ext3, and was built with the idea of improving the performance and the storage limits of ext3. It is also backward compatible with the ext3 and ext2 filesystems, and it adds a number of features:
- Persistent preallocation: This defines the fallocate() system call that can be used to preallocate space, which is most likely in a contiguous form; it is very useful for databases and the streaming of media (a short sketch follows this list)
- Delayed allocation: This is also called allocate-on-flush; it is used to delay the allocation of blocks until the moment the data is flushed to disk, to reduce fragmentation and increase performance
- Multi-block allocation: This is a side effect of delayed allocation, because it allows for data buffering and, at the same time, the allocation of multiple blocks
- Increased subdirectory limit: While ext3 has a limit of 32000 subdirectories, ext4 does not have this limitation, that is, the number of subdirectories is unlimited
- Journal checksumming: This is used to improve reliability
Journalling Flash Filesystem version 2 (JFFS2) is a filesystem designed for NAND and NOR flash memory. It was included in the Linux mainline kernel in 2001, the same year as the ext3 filesystem, although in different months: ext3 was released in November with Linux version 2.4.15, and the JFFS2 filesystem was released in September with the 2.4.10 kernel release. Since it is especially used to support flash devices, it takes into consideration certain things, such as the need to work with small files, and the fact that these devices have a wear level associated with them, problems which it solves or reduces by design. Although JFFS2 is the standard for flash memory, there are also alternatives that try to replace it, such as LogFS, Yet Another Flash File System (YAFFS), and Unsorted Block Image File System (UBIFS).
Besides the previously mentioned filesystems, there are also some pseudo filesystems available, including proc, sysfs, and tmpfs. In the next section, the first two of them will be described, leaving the last one for you to discover by yourself.
The proc filesystem is a virtual filesystem available from the first version of Linux. It was defined to allow the kernel to offer information to the user about the processes that are running, but over time, it has evolved and is now able to not only offer statistics about running processes, but also offer the possibility of adjusting various parameters regarding the management of memory, processes, interrupts, and so on.
With the passing of time, the proc virtual filesystem became a necessity for Linux system users, since it gathered a very large number of user space functionalities. Commands such as top, ps, and mount would not work without it. For example, the mount command issued without a parameter will present proc mounted on /proc in the form of proc on /proc type proc (rw,noexec,nosuid,nodev). This takes place since proc needs to be mounted on the root filesystem on par with directories such as /etc and /home; the /proc directory is used as the destination of the proc filesystem. To mount the proc filesystem, the mount -t proc nodev /proc command, similar to the one used for the other available filesystems, is used. More information on this can be found inside the kernel sources documentation at Documentation/filesystems/proc.txt.
The proc filesystem has the following structure:
- For each running process, there is an available directory inside /proc/<pid>. It contains information about opened files, used memory, CPU usage, and other process-specific information.
- Information on general devices is available inside /proc/devices, /proc/interrupts, /proc/ioports, and /proc/iomem.
- The kernel command line is available inside /proc/cmdline.
- Files used to change kernel parameters are available inside /proc/sys. More information is also available inside Documentation/sysctl.
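A few of these entries can be exercised directly from the shell, as in this sketch (PID 1 and the chosen parameter are examples):

cat /proc/cmdline                        # print the kernel command line
head /proc/1/status                      # inspect process-specific information for init
echo 1 > /proc/sys/net/ipv4/ip_forward   # adjust a kernel parameter at runtime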
The sysfs filesystem is used for the representation of physical devices. It has been available since the introduction of the 2.6 Linux kernel versions, and offers the possibility of representing physical devices as kernel objects and associating device drivers with corresponding devices. It is very useful for tools such as udev and other device managers.
The sysfs directory structure has a subdirectory for every major system device class, and it also has a system buses subdirectory. There is also a tool, systool, that can be used to browse the sysfs directory structure. Similar to the proc filesystem, sysfs is also visible in the form of sysfs on /sys type sysfs (rw,noexec,nosuid,nodev) if the mount command is issued on the console. It can be mounted using the mount -t sysfs nodev /sys command.
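Browsing it from the console looks like this (the eth0 interface name is an assumption that varies between systems):

ls /sys/class                    # one subdirectory per major device class
ls /sys/bus                      # the system buses subdirectory
cat /sys/class/net/eth0/address  # read an attribute, here the MAC address of eth0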
Note
More information on available filesystems can be found at http://en.wikipedia.org/wiki/List_of_file_systems.
Understanding BusyBox
BusyBox was developed by Bruce Perens in 1999 with the purpose of integrating the available Linux tools into a single executable. It has been used with great success as a replacement for a great number of Linux command line utilities. Due to this, and the fact that it is able to fit inside small embedded Linux distributions, it has gained a lot of popularity in the embedded environment. It provides utilities for file interaction, such as cp, mkdir, touch, ls, and cat, as well as general utilities, such as dmesg, kill, fdisk, mount, umount, and many others.
Not only is it very easy to configure and compile, but it is also very easy to use. The fact that it is very modular and offers a high degree of configuration makes it the perfect choice. It may not include all the commands available in a full-blown Linux distribution available on your host PC, but the ones that it does include are more than enough. Also, at the implementation level, these commands are just simpler versions of the full-blown ones, and they are all integrated in one single executable, available as /bin/busybox, with the individual commands available as symbolic links to this executable.
A developer's interaction with the BusyBox source code package is very simple: just configure, compile, and install it, and there you have it. Here are the detailed steps:
- Run the configuration tool and choose the features you want to make available
- Execute make dep to construct the dependency tree
- Build the package using the make command
Tip
Install the executable and symbolic links on the target. People who are interested in interacting with the tool on their workstations should note that if the tool is installed for the host system, then the installation should be done in a location that does not overwrite any of the utilities and startup scripts available to the host.
The configuration of the BusyBox package also has a menuconfig option available, similar to the one available for the kernel and U-Boot, that is, make menuconfig. It is used to show a text menu that can be used for faster configuration and configuration searches. For this menu to be available, the ncurses package first needs to be installed on the system that calls the make menuconfig command.
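As a sketch, the whole configure-build-install cycle can be driven with the standard make targets (CONFIG_PREFIX selects the installation destination and should not point at your host's root):

make defconfig                           # start from the default configuration
make menuconfig                          # optionally tune the feature set
make                                     # build the busybox executable
make CONFIG_PREFIX=/tmp/rootfs install   # install the binary and its symbolic links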
At the end of the process, the BusyBox executable is available. If it's called without arguments, it will present an output very similar to this:
Usage: busybox [function] [arguments]...
   or: [function] [arguments]...

	BusyBox is a multi-call binary that combines many common Unix
	utilities into a single executable. Most people will create a
	link to busybox for each function they wish to use and BusyBox
	will act like whatever it was invoked as!

Currently defined functions:
	[, [[, arping, ash, awk, basename, bunzip2, busybox, bzcat, cat,
	chgrp, chmod, chown, chroot, clear, cp, crond, crontab, cut, date,
	dd, df, dirname, dmesg, du, echo, egrep, env, expr, false, fgrep,
	find, free, grep, gunzip, gzip, halt, head, hexdump, hostid,
	hostname, id, ifconfig, init, insmod, ipcalc, ipkg, kill, killall,
	killall5, klogd, length, ln, lock, logger, logread, ls, lsmod,
	md5sum, mesg, mkdir, mkfifo, mktemp, more, mount, mv, nc, netmsg,
	netstat, nslookup, passwd, pidof, ping, pivot_root, poweroff,
	printf, ps, pwd, rdate, reboot, reset, rm, rmdir, rmmod, route,
	sed, seq, sh, sleep, sort, strings, switch_root, sync, sysctl,
	syslogd, tail, tar, tee, telnet, test, time, top, touch, tr,
	traceroute, true, udhcpc, umount, uname, uniq, uptime, vi, wc,
	wget, which, xargs, yes, zcat
It presents the list of the utilities enabled in the configuration stage. To invoke one of the preceding utilities, there are two options. The first option requires the use of the BusyBox binary followed by the name of the utility called, represented as ./busybox ls, while the second option involves the use of the symbolic links already available in directories, such as /bin, /sbin, /usr/bin, and so on.
Besides the utilities that are already available, BusyBox also offers an implementation alternative for the init program. In this case, the init does not know about runlevels, and all of its configuration is available inside the /etc/inittab file. Another factor that differentiates it from the standard /etc/inittab file is the fact that this one also has its own special syntax. For more information, the examples/inittab file available inside BusyBox can be consulted. There are also other tools and utilities implemented inside the BusyBox package, such as a lightweight version of vi, but I will let you discover them for yourself.
Minimal root filesystem
Now that all the information relating to the root filesystem has been presented to you, it would be a good exercise to describe the must-have components of a minimal root filesystem. This will not only help you understand the rootfs structure and its dependencies better, but will also help with the requirements needed for boot time and size optimization of the root filesystem.
The starting point for describing the components is /sbin/init; here, by using the ldd command, the runtime dependencies can be found. For the Yocto Project, the ldd /sbin/init command returns:

linux-gate.so.1 (0xb7785000)
libc.so.6 => /lib/libc.so.6 (0x4273b000)
/lib/ld-linux.so.2 (0x42716000)
From this information, the /lib directory structure is defined. Its minimal form is:

lib
|-- ld-2.3.2.so
|-- ld-linux.so.2 -> ld-2.3.2.so
|-- libc-2.3.2.so
'-- libc.so.6 -> libc-2.3.2.so
The symbolic links ensure backward compatibility and version immunity for the libraries. The linux-gate.so.1 file in the preceding code is a virtual dynamically linked shared object (vDSO), exposed by the kernel at a well established location. The address where it can be found varies from one machine architecture to another.
After this, init and its runlevels must be defined. A minimal implementation of this is available inside the BusyBox package, so it will also be available inside the /bin directory. Alongside it, a symbolic link for shell interaction is necessary, so this is how the minimal form of the /bin directory will look:

bin
|-- busybox
'-- sh -> busybox
Next, the runlevel needs to be defined. Only one is used in the minimal root filesystem, not because it is a strict requirement, but because it can suppress some BusyBox warnings. This is how the /etc directory will look:

etc
'-- init.d
    '-- rcS
At the end, the console device needs to be available to the user for input and output operations, so the last piece of the root filesystem is inside the /dev directory:

dev
'-- console
Having mentioned all of this, the minimal root filesystem seems to have only five directories and eight files. Its minimal size is below 2 MB, and around 80 percent of that size is due to the C library package. It is also possible to minimize its size by using the Library Optimizer Tool. You can find more information on this at http://libraryopt.sourceforge.net/.
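As a closing sketch, the whole minimal root filesystem described above can be assembled with a handful of commands (the library versions are the ones from the preceding listing and will differ on your system):

#!/bin/bash
mkdir -p rootfs/{bin,dev,etc/init.d,lib}
# the C library and its dynamic loader, as reported by ldd, plus the compatibility symlinks
cp /lib/ld-2.3.2.so /lib/libc-2.3.2.so rootfs/lib
ln -s ld-2.3.2.so rootfs/lib/ld-linux.so.2
ln -s libc-2.3.2.so rootfs/lib/libc.so.6
# BusyBox and the shell symbolic link
cp busybox rootfs/bin
ln -s busybox rootfs/bin/sh
# the single startup script that stands in for full runlevel handling
printf '#!/bin/sh\necho "Minimal rootfs up"\n' > rootfs/etc/init.d/rcS
chmod +x rootfs/etc/init.d/rcS
# the console device node: a character device with major 5 and minor 1
sudo mknod rootfs/dev/console c 5 1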
The Yocto Project
Moving to the Yocto Project, we can take a look at core-image-minimal to identify its content and minimal requirements, as defined inside the Yocto Project. The core-image-minimal.bb image recipe is available inside the meta/recipes-core/images directory, and this is how it looks:
SUMMARY = "A small image just capable of allowing a device to boot." IMAGE_INSTALL = "packagegroup-core-boot ${ROOTFS_PKGMANAGE_BOOTSTRAP} ${CORE_IMAGE_EXTRA_INSTALL} ldd" IMAGE_LINGUAS = " " LICENSE = "MIT" inherit core-image IMAGE_ROOTFS_SIZE ?= "8192"
You can see here that this is similar to any other recipe. The image defines the LICENSE field and inherits a bbclass file, which defines its tasks. A short summary is used to describe it, but otherwise it is very different from normal package recipes: it does not have LIC_FILES_CHKSUM to check for licenses, or a SRC_URI field, mostly because it does not need them. In return, the file defines the exact packages that should be contained in the root filesystem, and a number of them are grouped inside a packagegroup for easier handling. Also, the core-image bbclass file defines a number of other tasks, such as do_rootfs, which are specific only to image recipes.
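From an initialized build directory, producing this image is a single command:

bitbake core-image-minimal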
Constructing a root filesystem is not an easy task for anyone, but Yocto does it with a bit more success. It starts from the base-files recipe, which is used to lay down the directory structure according to the Filesystem Hierarchy Standard (FHS), and, along with it, a number of other recipes are placed. This information is available inside the ./meta/recipes-core/packagegroups/packagegroup-core-boot.bb recipe. As can be seen in the previous example, it inherits a different kind of class, packagegroup.bbclass, which is a requirement for all the available package groups. However, the most important factor is that it clearly defines the packages that constitute the package group. In our case, the core boot package group contains packages such as base-files, base-passwd (which contains the base system master password and group files), udev, busybox, and sysvinit (a System V-like init).
As can be seen in the previously shown file, the BusyBox package is a core component of the Yocto Project's generated distributions. Although information was available about the fact that BusyBox can offer an init alternative, the default Yocto generated distributions do not use this. Instead, they choose to move to the System V-like init, which is similar to the one available for Debian-based distributions. Nevertheless, a number of shell interaction tools are made available through the BusyBox recipe available inside the meta/recipes-core/busybox location. For users interested in enhancing or removing some of the features made available by the busybox package, the same concepts that are available for the Linux kernel configuration are used. The busybox package uses a defconfig file on which a number of configuration fragments are applied. These fragments can add or remove features and, in the end, the final configuration file is obtained. This identifies the final features made available inside the root filesystem.
Inside the Yocto Project, it is possible to minimize the size of the root filesystem by using the poky-tiny.conf distribution policies, which are available inside the meta-yocto/conf/distro directory. When they're used, these policies reduce not only the boot size, but the boot time as well. The simplest example of this is available using the qemux86 machine. Here, changes are visible, but they are somewhat different from the ones already mentioned in the Minimal root filesystem section. The minimization work done on qemux86 was centered around the core-image-minimal image. Its goals are to reduce the size of the resulting rootfs to under 4 MB and the boot time to under 2 seconds.
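As a sketch, selecting these policies is a matter of two variables in the build directory's conf/local.conf file, followed by rebuilding the image with bitbake core-image-minimal:

MACHINE = "qemux86"
DISTRO = "poky-tiny"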
Now, moving to the selected Atmel SAMA5D3 Xplained machine, another rootfs is generated, and its content is quite big. Not only is the packagegroup-core-boot.bb package group included, but other package groups and separate packages are as well. One such example is the atmel-xplained-demo-image.bb image available inside the meta-atmel layer in the recipes-core/images directory:
DESCRIPTION = "An image for network and communication." LICENSE = "MIT" PR = "r1" require atmel-demo-image.inc IMAGE_INSTALL += "\ packagegroup-base-3g \ packagegroup-base-usbhost \ "
Inside this image, there is also another, more generic image definition that is inherited. Here, I am referring to the atmel-demo-image.inc file, and when it is opened, you can see that it contains the core of all the meta-atmel layer images. Of course, if all the available packages are not enough, a developer could decide to add their own. A developer has two possibilities here: to create a new image, or to add packages to an already available one. The end result is built using the bitbake atmel-xplained-demo-image command. The output is available in various forms, and it is highly dependent on the requirements of the defined machine. At the end of the build procedure, the output will be used to boot the root filesystem on the actual board.
Summary
In this chapter, you have learned about the Linux rootfs in general: its organization, its principles, its content and device drivers, and its communication with the Linux kernel. Since root filesystems tend to become larger over time, information about how a minimal filesystem should look was also presented to you.
Besides this information, in the next chapter, you will be given an overview of the available components of the Yocto Project, since most of them are outside Poky. You will also be introduced to, and given a brief gist of, each component. After that chapter, a number of them will be presented in more detail and elaborated on.