The magazine of the Melbourne PC User Group
Linux: The Big Picture
Lars Wirzenius
Lars Wirzenius was at university with Linus Torvalds in 1991 when Linux was
born, and he ran an early copy on his computer. In this article Lars shares some
of his knowledge of the history and development of Linux during the past 12
years — a period that has seen a young man’s idea become a significant
international trend...
The history of computer operating systems begins in the 1950s, with simple
schemes for running batch programs efficiently, minimizing idle time between
programs. A batch program is one that requires no interaction with a user. It
reads all its input from a file (possibly a stack of punch cards) and sends all
its output to another file (possibly to a printer). This is how all computers
used to work.
Then in the early 1960s interactive use started to gain ground: not only
interactive use, but several people using the same computer at the same time
from different terminals. These were known as time-sharing systems and, compared
to the batch systems, they were quite a challenge to implement.
During the 1960s there were many attempts at building good time-sharing systems.
Some of these were university research projects, others were commercial. One
such project was Multics, which was innovative for its time. For example, it had a
hierarchical file system, something we take for granted in modern operating
systems. However, the Multics project did not progress very well. It took years
longer than anticipated to complete and never got a significant share of the
operating system market. One of the participants, Bell Laboratories, withdrew
from the project.
The Bell Labs people who had been involved then created their own operating
system and named it UNIX.
Initially UNIX was distributed free and gained much popularity in universities.
Later, it was given an implementation of the TCP/IP protocol stack and was
adopted as the operating system of choice on many early workstations.
By 1990, UNIX had a strong position in the server market and was especially
strong in universities. Most universities had UNIX systems and computer science
students were exposed to them. Many of them wanted to run UNIX on their own
computers as well. Unfortunately, by that time UNIX had become commercial and
rather expensive. About the only cheap option was Minix, a limited UNIX-like
system written for teaching purposes by Andrew Tanenbaum. There was also 386BSD,
a precursor to NetBSD, FreeBSD and OpenBSD, but that wasn't mature yet and
required hardware more formidable than many had available at home.
Into this scene came Linux, in October 1991. The author, Linus Torvalds, had
used UNIX at the University of Helsinki, and wanted something similar on his PC
at home. Since the commercial alternatives were far too expensive, he started
out with Minix, but wanted something better and soon began to write his own
operating system. After its first release it attracted the attention of several
other hackers. While initially Linux was not really useful except as a toy, it
soon gathered enough features to attract the interest of many people, even those
generally uninterested in operating system development.
Linux itself is only the kernel of an operating system. The kernel is the part
that makes all other programs run. It implements multitasking, manages hardware
devices and generally enables applications to do their thing. All the programs
that the user (or system administrator) actually interacts with are run on top
of the kernel. Some of these are essential: for example, a command line
interpreter (or shell), which is used both interactively and to run shell
scripts, i.e. files corresponding to batch (.BAT) files.
Linus did not write those programs himself; he used existing free versions
instead. This greatly reduced the amount of work required to get a working
environment. In fact, he often changed the kernel to make it easier to get the
existing programs to run on Linux, instead of the other way around.
Most of the critically important system software, including the C compiler, came
from the Free Software Foundation's GNU project. Started in 1984, the GNU
project aims to develop an entire UNIX-like operating system that is completely
free. To credit them, many people like to refer to a Linux system as a GNU/Linux
system. (GNU has its own kernel as well.)
During 1992 and 1993, the Linux kernel gathered all the necessary features it
needed to work as a replacement for UNIX workstations, including TCP/IP
networking and a graphical windowing system (the X Window System). Linux also
received plenty of industry attention, and several small companies were started
to develop and distribute Linux. Dozens of user groups were founded and the
Linux Journal magazine appeared in early 1994.
Version 1.0 of the Linux kernel was released in March, 1994. Since then the
kernel has gone through many development cycles, each culminating in a stable
version. Each development cycle has taken a year or three, and has involved
redesigning and rewriting large parts of the kernel to deal with changes in
hardware (for example, new ways to connect peripherals, such as USB) and to meet
increased speed requirements as people apply Linux to larger and larger systems
(or smaller and smaller ones: embedded Linux is becoming a hot topic).
From a marketing and political point of view, after the 1.0 release the next
huge step occurred in 1998 when Netscape decided to release its Web browser as
free software (the term "open source" was created for this). This was the
occasion that first brought free software to the attention of the whole
computing world. It has taken years of work since then, but free software (or
"open source") has not only become generally accepted but is now often the
preferred choice for many applications.
The Social Phenomenon
Apart from being a technological feat, Linux is also an interesting social
phenomenon. Largely through Linux, the free software movement has gained attention
and recognition. On the way there it got an informal marketing department and
brand, "open source". It is baffling to many outsiders that something as
successful as Linux could be developed by a bunch of unorganised people in their
free time.
The major factor here is the availability of all the source code of the system,
plus a copyright license that allows modifications to be made and distributed.
When the system has many programmers among its users, if they find a problem
they can fairly easily fix it. Additionally, if they think a feature is missing
they can add it themselves. For some reason that is a challenge many programmers
like to take on, even when they're not paid for it: they have an itch (a need),
so they scratch (write the code to fill the need).
It is necessary to have at least one committed developer who puts in a lot of
effort. After a while, once there are enough programmer-users sending in small
changes and improvements, you get a snowball effect. Lots of small changes add
up to rapid overall development. This attracts more users, some of whom will
be programmers. The wheel spins faster.
For operating system development specifically, the input from this large group
of programmer-users results in two important types of improvement: bug fixes
and device drivers. Operating system code often has bugs that occur only rarely
and it can be difficult for the developers to reproduce them. When there are
thousands or more users who are also programmers, the result is a very effective
testing and debugging army.
Most of the code volume in Linux is device drivers. The core, which implements
multitasking and multiuser functionality, is small in comparison. Most device
drivers are independent from each other and interact only with the operating
system core via well defined interfaces. Thus it is fairly easy to write a new
device driver without having to understand the whole complexity of the operating
system. This also lets the main developers concentrate on the core
functionality and leave the device drivers to the people who actually have the
devices.
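The idea of drivers talking to the core only through a well defined interface
can be sketched in a few lines. This is an analogy only: real Linux drivers are
written in C, and every name below (Driver, NullDriver, ZeroDriver, Core) is
made up for illustration.

```python
from typing import Dict, Protocol


class Driver(Protocol):
    """Hypothetical interface that the core expects every driver to provide."""

    def read(self, nbytes: int) -> bytes: ...

    def write(self, data: bytes) -> int: ...


class NullDriver:
    """Discards everything written to it and never returns any data."""

    def read(self, nbytes: int) -> bytes:
        return b""

    def write(self, data: bytes) -> int:
        return len(data)  # pretend every byte was accepted


class ZeroDriver:
    """Returns as many zero bytes as the caller asks for."""

    def read(self, nbytes: int) -> bytes:
        return bytes(nbytes)

    def write(self, data: bytes) -> int:
        return len(data)


class Core:
    """Toy 'core': it knows the Driver interface, not the individual devices."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Driver] = {}

    def register(self, name: str, driver: Driver) -> None:
        self._drivers[name] = driver

    def read(self, name: str, nbytes: int) -> bytes:
        return self._drivers[name].read(nbytes)

    def write(self, name: str, data: bytes) -> int:
        return self._drivers[name].write(data)


core = Core()
core.register("null", NullDriver())
core.register("zero", ZeroDriver())
```

Neither driver knows about the other, and adding a third one means writing one
more class and one more register() call, without touching Core.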
It would be awkward for a central team just to store the thousands of different
sound cards, Ethernet cards, IDE controllers, motherboards, digital cameras,
printers, and so on that Linux supports. The Linux development model is
distributed and spreads the work around quite effectively.
The Linux model is not without problems. When a new device comes onto the market
it can take a few months before a Linux programmer somewhere is interested
enough to write a device driver. Also, some device manufacturers for their own
reasons do not wish to release programming information for their devices. This
can prevent a Linux device driver from being written at all. Fortunately with
the growing global interest in Linux such companies are becoming fewer in
number.
So What Is Linux?
Linux is a UNIX-like multitasking, multiuser 32 and 64 bit operating system for
a variety of hardware platforms and licensed under an open source arrangement.
This is a somewhat brief description and I'll spend the rest of this article
expounding on it.
Being UNIX-like means emulating the UNIX operating system interfaces so that
programs written for UNIX will work for Linux merely by recompiling the code. It
follows that Linux uses mostly the same abstractions as the UNIX system. For
example, the way processes are created and controlled is the same in UNIX and
Linux.
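As a small illustration of that shared process model, here is a sketch using
Python's os module, which wraps the UNIX fork and wait calls. The function name
run_child is made up for this example, and it runs only on UNIX-like systems
such as Linux.

```python
import os


def run_child(exit_code: int) -> int:
    """Fork a child that exits with exit_code; return the code the parent collects."""
    pid = os.fork()  # duplicate the calling process
    if pid == 0:
        # Child process. A real program would typically call os.execvp() here
        # to replace itself with another program.
        os._exit(exit_code)  # exit at once, without running the parent's cleanup
    # Parent process: wait until the child has terminated.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # the child did not exit normally (for example, killed by a signal)
```

The same fork, exec and wait sequence is what a shell does each time it runs a
command.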
There are a number of other operating systems in active use, from Microsoft's
family of Windows versions, through Apple's MacOS, to OpenVMS. Linus Torvalds
chose UNIX as the model for Linux partly for its aesthetic appeal to system
programmers and partly because, of all the operating systems with which he was
familiar, it was the one he knew best.
The UNIX heritage also gives Linux the two most important features: multitasking
and multiuser capabilities. Linux, like UNIX, was designed from the start to run
multiple processes independently of each other. Implementing multitasking well
requires attention at every level of the operating system. It is hard to add
multitasking to an operating system afterwards. That's why the Windows 95 series
and MacOS (before MacOS X) did multitasking somewhat poorly: multitasking was
added to an existing operating system, not designed into a new one. That's also
why the Windows NT series, MacOS X, and Linux do multitasking so much better.
A good implementation of multitasking requires, among other things, proper
memory management. The operating system must use memory protection support in
the processor to protect (running) programs from interfering with each other.
Otherwise a buggy program (that is, almost any program) can easily corrupt the
memory area of another program, or of the operating system itself, causing
anything from weird behaviour to a total system crash with likely loss of data
and unsaved work.
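The effect of this separation can be observed from an ordinary program. The
sketch below shows the effect rather than the processor mechanism behind it: a
child process changes a variable, and the parent's copy stays untouched. The
function name is made up, and the example assumes a UNIX-like system.

```python
import os


def child_write_is_isolated():
    """Return (value seen by the child, value seen by the parent afterwards)."""
    data = [0]
    read_end, write_end = os.pipe()  # channel for the child to report back
    pid = os.fork()
    if pid == 0:
        # Child: modify its own copy of the variable and report what it sees.
        os.close(read_end)
        data[0] = 99
        os.write(write_end, str(data[0]).encode())
        os.close(write_end)
        os._exit(0)
    # Parent: read the child's report, then look at our own copy.
    os.close(write_end)
    child_view = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return child_view, data[0]
```

The child sees 99 while the parent still sees 0, because each process has its
own address space.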
Supporting many concurrent users is easy after multitasking works. You label
each instance of a running program with a particular user and prevent the
program from tampering with other users' files.
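A much simplified sketch of such a check, based on the traditional UNIX
owner/group/other permission bits, might look like this. The function may_write
and its parameters are made up for illustration, and the real kernel checks
involve more than is shown here.

```python
import stat


def may_write(uid, gids, owner_uid, owner_gid, mode):
    """Decide whether a process may write to a file (simplified model).

    uid, gids            -- user id and set of group ids labelling the process
    owner_uid, owner_gid -- owner and group recorded on the file
    mode                 -- the file's permission bits, e.g. 0o644
    """
    if uid == 0:
        return True  # the superuser bypasses the permission bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)  # owner: only the owner bit counts
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)  # group member: the group bit counts
    return bool(mode & stat.S_IWOTH)      # everyone else
```

With mode 0o644, for example, the owner may write to the file but other
ordinary users may not.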
Portable And Scalable
Linux was originally written for an Intel 386 processor, and naturally works on
all successive processors. After about three years of development, work began to
adapt (or port) Linux to other processor families as well. The first one was the
Alpha processor, then developed and sold by the Digital Equipment Corporation.
The Alpha was chosen because Digital graciously donated a system to Linus. Soon
after that, other porting efforts followed. Today, Linux also runs on Sun SPARC and
UltraSPARC, Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, IBM S/390,
MIPS, HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64 and CRIS processors. See
http://kernel.org for details.
Most of those processors are not very common on people's desks. For example,
S/390 is IBM's big mainframe architecture. Here mainframe means the type of
computer inside which you could put your desk, rather than the type that fits on
your desk.
Some of those processors are 32 bit, like the Intel 386. Others are 64 bit, such
as the Alpha. Supporting such different processors has been good for Linux. It
has required designing the system to use proper modularity and good abstractions
and this has improved code quality.
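One small example of what a good abstraction means here: portable code asks the
platform how wide its pointers are instead of assuming 32 bits. The sketch below
uses Python's struct module; the function name is made up.

```python
import struct


def pointer_bits() -> int:
    """Return the width in bits of a native pointer on this machine."""
    return struct.calcsize("P") * 8  # "P" is the format code for a C void pointer
```

On a 386-class system this would report 32, and on an Alpha or x86-64 system 64.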
The large range of supported processors also shows off Linux's scalability: it
works on everything from very small systems, such as embedded computers,
handheld devices, and mobile phones, to very large systems such as the IBM
mainframes.
Using clustering technology, such as Beowulf http://www.beowulf.org, Linux even
runs on supercomputers. For example, the US Lawrence Livermore National
Laboratory bought a cluster with 1920 processors, resulting in one of the five
fastest supercomputers in the world, with a theoretical peak performance of 9.2
teraFLOPS, or 9.2 trillion floating-point operations per second (see
http://lwn.net/Articles/4759/).
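A theoretical peak figure like this is simple arithmetic: the number of
processors times the clock rate times the floating-point operations each
processor can complete per clock cycle. The sketch below reproduces the 9.2
teraFLOPS figure under two assumptions that are not stated in this article: a
clock rate of 2.4 GHz and two floating-point operations per cycle.

```python
def theoretical_peak_flops(processors, clock_hz, flops_per_cycle):
    """Upper bound on floating-point operations per second for a cluster."""
    return processors * clock_hz * flops_per_cycle


# 1920 processors comes from the article. The clock rate and the operations
# per cycle are assumptions made for this example.
peak = theoretical_peak_flops(processors=1920, clock_hz=2.4e9, flops_per_cycle=2)
peak_teraflops = peak / 1e12  # about 9.2
```

Real programs reach only a fraction of this figure, which is why it is called a
theoretical peak.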
Using Linux
The operating system itself is pretty boring to most people. Applications are
necessary to get things done. Traditionally, Linux applications were the same
types of applications used with UNIX: scientific software, databases, and
network services. Also of course, all the tools programmers want for their
craft.
Much of that software seems rather old-fashioned by today's desktop standards.
User interfaces are text based, or they might not exist at all. Indeed, most of
this software has been non-interactive, of the command line, batch processing
variety. Since most users have been experts in the application
domain, this has been good enough.
Thus, Linux first found corporate employment as a file server, mail server, Web
server, or firewall. It was a good platform for running a database, with support
from all major commercial database manufacturers.
In the past few years Linux has also become an interesting, user-friendly option
on the desktop. The KDE http://www.kde.org and Gnome http://www.gnome.org
projects both develop desktop environments and applications that are easy to
learn and effective to use. There are now many desktop applications that people
with Windows or MacOS experience will have no difficulty using.
There is even a professional grade office software package. OpenOffice
http://www.openoffice.org, based on Sun's StarOffice, is free, fully featured,
and file-compatible with Microsoft Office. It includes a word processor,
spreadsheet, and presentation program, competing with Microsoft's Word, Excel,
and PowerPoint.
Linux Distributions
Before installing Linux you must choose a Linux distribution. A distribution is
the Linux kernel plus an installation program plus a set of applications to run
on top of it. There are hundreds of Linux distributions, all serving different
needs.
All distributions use pretty much the same actual software, but they are
different in which software they include, which versions they pick (a stable
version known to work well, or the latest version with all the bells, whistles
and bugs), how the software is pre-configured, and how the system is installed
and managed. For example, OpenOffice, Mozilla (Web browser), KDE and Gnome
(desktop environments), and Apache (Web server) will all work on all
distributions.
Some distributions aim to be general purpose, but most of them are task
specific: they are meant for running a firewall, a Web kiosk, or meant for users
within a particular university or country. Those looking for their first Linux
experience can choose from the three biggest, general purpose distributions -
Red Hat, SuSE, and Debian.
The Red Hat and SuSE distributions are produced by companies of the same names.
They aim to provide an easy installation procedure and a pleasant desktop
experience. They are also good as servers. Both are sold in boxes with an
installation CD and printed manual. Both can also be downloaded via the
Internet.
The Debian distribution is produced by a volunteer organization. Its
installation is less easy - you have to answer some questions during the
installation, questions about things the other distributions deduce
automatically. None of it is complicated as such, but it requires an
understanding of, and information about, hardware that many PC users don't want
to be concerned with. On the other hand, after installation Debian can be
upgraded to
each new release without reinstalling anything.
The easiest way to try out Linux is to use a distribution that works completely
off a CD-ROM. This way, you don't have to install anything. You merely download
the CD-ROM image (.iso file) from the Net and burn it to a disc, or buy a
mass-produced one via the Net. Insert the disc in the drive, then reboot. Not having to
install anything on the hard disk means you can easily switch between Linux and
Windows. Also, since all the Linux files are on a read-only CD-ROM, you can't
accidentally break anything while you're learning.
About the Author
Lars Wirzenius http://liw.iki.fi/liw/ designs and implements embedded telematic
software for Oliotalo http://www.oliotalo.fi at work, and develops Debian at
home.
Reprinted from the May 2003 issue of PC Update, the magazine of Melbourne PC User Group, Australia