The magazine of the Melbourne PC User Group

Databases - Back to the Future
James Duff

This article is based around the writer's odyssey through the IT arena over nearly three decades. Starting off as an analyst programmer on batch systems on mainframe computers in the sixties, I consider myself extremely lucky to have witnessed and been part of the growth of database technology. While many of us would love to have first hand experience in a greater variety of software products, the explosive growth in the number of software tools has rendered impossible the fulfilment of such a desire. I am pleased to share some of this history with you and welcome feedback from readers.

Introduction to Databases

The purpose of this article is to give readers an overview of the history of databases and their current positioning in the business and personal world of information technology.

Let me say at the outset that the time for databases has arrived. They still have a way to go, but unlike spreadsheets and word processors which have had their glory years, databases are finally maturing into tools that users, as well as programmers can understand and use effectively. I believe this is due to the convergence of a more enlightened and computer literate user community, which has progressed through the more intuitive word processor and spreadsheet paradigms, and the push by organisations and vendors into the Client/Server arena associated with the right-sizing movement.

Not so long ago, we spoke of the three primary types of data processing on personal computers: word processors, spreadsheets and databases.

Word processors are forms of software (computer programs) that have taken over from where typewriters left off and allow users to type reports, books, letters and so forth.

Spreadsheets are forms of software that allow users to set out arrays (or tables) of numbers and to perform calculations such as adding a column of figures. They are an extension of calculators.

Databases are forms of software that enable users to set up and maintain

sets of tables, e.g. mailing lists, stock lists, sets of accounts, etc. They are an extension of filing cabinets. It seems nothing is really new, just an evolution of something that has gone before.

These are very simplistic definitions, for nowadays each type of software has extended beyond its initial objectives, to the extent that elements of each are embedded in the others. The distinction is further blurred by the advent of graphics, where visual art forms are built and stored; by communications, which enable computers to share information; and by multimedia, which supplements the graphics with sound, extending the management of sight and sound beyond the computer to include CD and TV.

These are the general forms of computer activities which affect our daily lives, if not already then certainly in the very near future.

Of course, most of us require more than one type of software, and the major software vendors have now provided us with heavily discounted suites of software which cost less than the sum of the individual components. These suites

allow us to move gracefully from one application to another and to embed parts (objects) from one into the other. As an example, an annual report, prepared with a word processor, can have spreadsheet data such as profit & loss reports and balance sheets incorporated in it. This makes the preparation of the report very easy because it is printed out as one document, with standard page formatting.

The Early Mainframe Systems

In order to appreciate where database technology is today, it is useful to reflect on its origins. In the beginning was the word, and the ten commandments were produced by a word processor of the day, not a spreadsheet, well maybe, but certainly not a database.

Without going back that far, let's look at the world of computers, not long before personal computers arrived about a decade and a half ago.

When the first major wave of computers was introduced into the commercial world in the 1960s, they were predominately aimed at large organisations such as banks, insurance companies, military and government, manufacturing operations etc. The majority of systems developed for these businesses were built on databases. I use the term in a very basic sense, as the databases then were very rudimentary in their design. In fact,

the early business computers were extensions of the punched card and accounting machines that had been operating since the turn of the century.

These early computer systems tended to go through an evolutionary process; firstly by implementing the punched card systems onto computer, then extending the amount of data held, and making use of the added calculation facilities. If an organisation's payroll file was stored in personnel number sequence, it was easy to generate the fortnightly cheques into department sequence, or direct deposits into bank and branch sequence, for ease of distribution.

The databases in those days were called files and represented stacks of cards containing accounts, payrolls and all forms of registers: assets, shares, government - taxation, social security, customs etc. These files were sequential in nature and had to be sorted, collated and tabulated to enable the generation of reports and the desired outputs such as bills/accounts, share certificates, payroll and pension cheques or tax refunds. This method was known as batch processing.

Updating in this mode was achieved by sorting a batch of transactions (from a transaction file) into the same key sequence as the master file, e.g. personnel number, and then having an update program read the two files sequentially, matching the keys as it went. Any transaction that matched a master record had the transaction data applied, subject to the rules specified for add, delete and change activities.

Because of the high cost of data processing in those days, information in the files was usually very summary and was backed by an enormous amount of procedures and paperwork. The machines existed mainly to automate the end process. That is, to add them up and send them out. Initially most of the calculations had already been done manually. However, the calculations were soon incorporated into these systems, resulting in greatly improved accuracy and control.

Initially, these master files comprised fixed-length records, each with a set number of fields, similar to the database tables we know today. Then came an advanced feature, variable length records. This enabled a level of sophistication whereby, for example, the systems could store a variable number of allowances or deductions for each employee. These variable numbers are known in modern technical talk as repeating groups- and are not normally allowed in formal database design. In such cases, the programmer in charge of an update program had to cater for an arbitrary maximum number of each repeating group in memory arrays, expand each record as it was read into memory, perform any updating from the transaction stream, then compress the repeating groups prior to writing the record back to the master file.

Another way of achieving the same variability was to have sets of records for each entity, usually with a header record of static data, followed by trailer records containing the variable data.

Validation of data in the batch system environment comprised, in many cases, hard coded arrays, internal to the programs that edited the data. In order to validate the department code for an employee record, the department codes would most likely reside within arrays in the edit programs, only to be amended by the programmers in times of organisational change.

Some of the validation tables were contained in external files, and read into the program's buffers (with necessary limitations in size) as part of its startup routine.

These systems were initially based on magnetic tape master files because disk technology was still expensive and restricted in size. As an example, the first mainframe system I worked on - one of the largest in the country at the time - had a master file spread over forty magnetic tapes (eight reels in each of five capital cities), each with a capacity of 20 MB of data. The computer system, an IBM S/360-40 had only two disk drives, each with 7 MB capacity and just 128 kB of RAM. We complained in those days that the operating system robbed us of 28 kB and that we had only 100 kB in which to cram our heavily overlaid programs - we are still complaining about memory limitations in these days of megabytes aren't we.

Database Systems

While people in the business world were developing the early batch systems, the academics were anticipating advances in storage technology and figuring out much better ways of storing and retrieving data. As soon as reasonable amounts of disk storage became available at reasonable prices, systems started to migrate to the new storage medium. These systems gave rise to rapid access to data via Direct Access to records. The disks became known as Direct Access Storage Devices (DASD). By various techniques, records could be accessed directly, rather than having to be read sequentially along a master file. This was like moving from a tape recorder to a vinyl record: just select the track you want and play it.

The software technology for storing and retrieving the data in a direct manner initially took the form known as ISAM (Indexed Sequential Access Method). This meant that records were given a primary key and could be accessed directly via that key. This formed the basis of future direct access systems.

Then came the hierarchical approach, whereby a series of database tables were classified into two categories of records: masters and details. Masters, such as employee records, were accessed directly via a primary key. A chain of records attached to each master contained a particular set of data for that record, such as allowances or deductions in our payroll example. There would also be a chain for each employee's payment history and any other chains considered necessary. These chains had to be traversed sequentially for each master, thereby providing a mix of direct and sequential processing in this type of database management system.

Early research and development in the area of hierarchical databases came out of NASA and the aerospace industry in the US to become IBM's IMS. Another offering at the time was Cincom's Total, made popular because of its ability to operate across different hardware and operating system platforms. Note, we are talking back in the days of proprietary operating systems here, so the transportability effort for a vendor was quite considerable at that time.

With a view to overcoming the limitations of the hierarchical database method, the network database came into being. This method still contained the two data constructs, masters and details, however its linking structures were too complex for general use.

Relational Databases

The answer seemed to be in the relational database method that was built around a single data construct- the table. This technology was relatively simple and had a supporting theory as well as a practical implementation. The supporting theory is an implementation of relational logic which grew out of work on predicate logic earlier in the century when computers were mere twinkles in eyes. The predicate was extended, or multiplied, to incorporate relations.

As an example, a single predicate is contained in "John (subject) lives in Melbourne (predicate)"; "John lives in Melbourne and is married to Mary" extends the predicate to include a relation- marriage; no pun intended by the fact that John is now related, by marriage, to Mary. I won't try to back myself out of this one, but had better take a different tack before I get myself into a semantic corner. Let's say "John is an employee and has several allowances and deductions (in his pay)." This is a better example of a commercial situation and aligns with the evolution of our sample employee system.

Extending our relational logic into something useful, we now have a relational database which comprises a set of tables, with internal rules for their composition, and external rules for relating them to other tables in the set.

The early works in this area were carried out by E W (Ted) Codd and C J (Chris) Date, and several of their papers and books are available on the subject. E W Codd is considered the father of relational databases and, in recent years, has defended and extended his original specifications of the relational model.

This can become very technical and emotional stuff, but the average PC user just wants to get his database up and running with the least amount of fuss and detail. For the purpose of this article, suffice to say that there are some very basic rules to follow in order to design a properly functioning database system. The following paragraphs will become, by necessity, reasonably disciplined and structured, traits also necessary for those who indulge in database technology.

Database Design

Tables, as defined in relational theory, are nothing different in principle to our first master files, with their fixed length records and fixed number of fields. This is now the starting point for defining the rules, based on relational database principles, which will enable us to analyse a business situation and design our working relational database. Many books have been written on this subject, but hopefully the following captures the pragmatic essentials, using examples as we go. Let me say here that there is no correct way to design a payroll, or any other system which addresses everybody's needs. Each organisation is different and therefore has different requirements. For this reason generic type packages must be evaluated very carefully to ensure they match your particular needs.

Here we go on a condensed theory/design journey. A table, also known as a relation, is a two dimensional array, comprising tuples (rows, records), defined by domains (columns, fields). This can be accomplished in a spreadsheet, or word processor, too. There is no implied or physical order to the tuples in a table. Each tuple, however, must be unique and that is accomplished by defining one or more domains as a (unique) primary key.

An example of a table is employee which consists of employee number (unique key), employee name, date of birth, address etc. If there is a repeating group of domains, e.g. payroll allowances for an employee, each occurrence becomes a tuple in a separate table (allowances) with its own identifying key.

A relational database system differs from a single table/list processor or a spreadsheet/word processor table because it contains more than one table, and can maintain relationships between those tables. Another table in our payroll system would be the payments table containing details of payments made to each employee for each pay period. This table would contain the employee number and the paydate as a concatenated key to uniquely identify the payments made. Other non-key data in the table would be items such as gross, tax and net payment amounts.

Consider another piece of information we may wish to keep in our payroll system: employee department. Assuming we are going to use this information, we would have a table of departments, comprising a department code (unique key) and department name. If we happened to be a national organisation, we would also have a State table or a combination of State and department, ensuring that the occurrences of State/department combinations within the table were unique.

The placement of department information will depend on how static our workforce is, and how conscious the organisation is of profit centre performance. If our staff are reasonably static in their location and salaries are not profit centre sensitive, we would simply place a department column in the employee table. Then, as an employee moves to another department, the employee record is simply updated to reflect the current situation. On the other hand, we could place the location into an employee location history table that would vary over time. The most recent entry, per employee, in this table would be determined by the most recent start date column.

Irrespective of whether department code is placed in the employee table or in the employee history table, each occurrence is checked, at the time of entry of the data into the system, to ensure that there is a corresponding entry in the department table.

The department code column in the employee/employee history table is known as a foreign key, as it has a relationship back to a primary key in another table i.e. the department table. This relationship arrangement is the basis for another term you may have come across- referential integrity.

Referential Integrity

Referential integrity deals with maintaining the truth of relationships within a relational database, i.e. there can be no occurrences of foreign key values without a corresponding occurrence of a primary key within a related table. Referential integrity prevents actions, such as deletions, that would cause inconsistency between tables.

What happens to all those employees working at the Woy Woy branch if someone deletes Woy Woy from the Branch table? I wish I had been working with Codd and Date when they made up these names, for I would not have called these keys foreign; the name has negative overtones; I would have preferred the term related keys.

Referential integrity is one of the tenets of relational theory and may be contained as a function of the DBMS itself, or, like Xbase and many PC level systems, must be programmed explicitly by the programmer. This is a factor to be taken into account when selecting a DBMS.

Cardinality Rules

Relationships, such as those between tables mentioned in the example above, fall into three categories for the purposes of determining and mapping the design of a set of tables to become a relational database. A relationship is determined between any two tables as being one-to-one, one-to-many (and, conversely, many-to-one) or many-to-many.

The simplest to explain initially is the one-to-many relationship. Take the two relations - employee and department. At any point in time, each (one) department may have many employees and, conversely, an employee can belong to only one department. This is a one-to-many relationship between department and employee.

The converse- any employee can belong to only one department - verifies that this is, in fact, a one-to-many relationship, rather than a many-to-many relationship.

The one-to-many relationship sets up the (one) table with department code as the unique primary key, while determining that the (many) employee table contains the foreign key, department code.

While it is not necessarily essential, most foreign keys are of such significant value that they are also secondary keys in their own right in the (many) table. We sometimes wish to access tables in a sequence other than the primary key, in which case we add secondary, or alternate keys. This is one of the major benefits of DBMSs - in theory, we never have to physically sort our tables anymore.

A one-to-one relationship states that, at any point in time, a single occurrence in one table occurs only once in the related table. For most practical purposes this type of relationship does not occur, as related columns are usually contained within the one table. As an example, an employee table and a home address table, for practical purposes, are most often better combined into one, so that each employee row contains the corresponding employee address. This may violate the real world situation where several employees live at the one address, however the benefits of practicality are easily seen in this example.

Finally, there is the many-to-many relationship, where many occurrences in one table may be related to many occurrences in another table. Again an example illustrates this situation.

Assume we have a requirement to record and report on employee payment history. Each employee has many payments (per annum), and on each payday many employees are paid. This appears to be a two-way one-to-many relationship, which actually translates to a many-to-many.

However, in order to implement this construct in our relational database, we do in fact transform it back into two one-to-many relationships- with an intersecting table in the middle to hold the manys.

This is accomplished by having a one-to-many relationship between the employee table and the intersecting employee/payment table, and a one-to-many relationship between the payday table and the employee/payment table. In this instance, the intersecting table contains a concatenated primary key of employee number and payday, which is of course a pair of foreign keys (relating back to the employee and payday tables).

In summary, although there is only one construct in theory i.e. tables, in a relational database system, they can be classified as having two types of data; one set being relatively static, which reflects the structure, elements and rules (organisation structure, employees, products, clients, rates, paydays, awards etc.) of the organisation or problem being addressed, and the other set containing the dynamic data associated with the operation (e.g. sales, invoices, payments, deals, tasks etc). The static data tables usually contain the ones and the dynamic tables usually contain the manys. The manys are also usually linked to several ones.

That, in a nutshell, is the essence of relational database design, and I would now like to move on to something, hopefully, a little lighter. It must be time for a coffee break.

Database Systems and Tools

Reporting from databases, and even from files in past times, has always been a primary activity associated with databases. After all, it is no good having all that data if management cannot look at it and analyse it. Reporting was one of the first IT processes to be automated with the introduction of report generators. Early products such as Easytrieve, and later products such as R&R have certainly taken the drudgery out of writing report programs.

While the relational databases were being researched and developed, it was recognised that a standard language would be ideal, and so Structured Query Language (SQL - pronounced sequel) was born. This standard is now available with most DBMS systems. Australia played a part in the development of SQL, recounted in a PC Update DBMS SIG report in November 1990.

Many products offer an alternative means of providing query and reporting facilities known as Query by Example (QBE), by which the user selects fields from prompt lists of table and column names, operators (equal to, greater than etc) and relational operators (and, or). Many products combine both facilities by providing the non-programming type QBE interface, to collect the selection criteria, and then generating the query itself in SQL. The report generator products mentioned above have accordingly evolved to become SQL based, and can access a variety of database formats on both PC and mainframe platforms.

Although the PC DBMS offerings have been touted as relational they do not possess all of the qualities of a fully-fledged mainframe DBMS. Things such as referential integrity and active data dictionaries have not been visible in many products, including Xbase, but they are slowly emerging.

The Advent of PC DBMS

When micro computers first appeared in the seventies, they could not be used for commercial purposes until spreadsheet software, such as Multiplan for the Apple II personal computer, and word processor software, such as Wordstar for the IBM PC, became available. Beginners All Purpose Symbolic Instruction Code (BASIC) and Pascal each provided high level language capability with rudimentary file processing, thus enabling some application programming capability for small business systems. One early entrant into data management was Turbo Toolbox, which came as an add-on product to Borland's Turbo Pascal. This provided rapid access to tables via the B-tree algorithm, a specialised binary division method.

This algorithm was the basis of the Btrieve database engine offered by SoftCraft, a Texas-based software company. This software layer could be called by C, Pascal, Basic and COBOL programs, and it provided additional database protection by way of begin and end transaction identification and with rollback and restart facilities. This engine is still one of the recognised standard database recording methods, and any products today that claim to have access to several formats, usually include Btrieve.

Returning to those early days of personal computers, the arrival of Ashton-Tate's dBASE II was the event that heralded a relational database and commercial business processing environment on a personal computer. dBASE II had all the elements needed to generate a business application based on relational database principles.

It had a database engine, that element of software which manages the reading and writing of database tables to disk, in this case with the database definition embedded in the database itself; also DDL, a data definition language which is used to define and create the tables and their related indexes and keys,and it had DML, a data manipulation language that enabled the data to be manipulated, analysed and calculated- a true high-level language with easy-to-use screen I/O (input-output) for interaction with the users of the application.

This was the time when I decided to buy my first personal computer. With dBASE II and Turbo Pascal, a proper language, not like that horrible, unstructured BASIC, I could leap tall buildings with a single bound. Having leapt one building I wanted to leap two, but found dBASE had certain limitations, such as only 64 active memory variables, 32 fields per record, and only two files able to be active (with 16 open) at one time. My first personal computer was an Australian Microbee, with the CP/M operating system and twin floppy disk drives. When I later migrated to the DOS-based IBM compatible PC, I was able to take my dBASE, and Pascal programs and tables and run them straight away on the new platform.

This experience was brought back for me to exercise recently when a client with a CP/M system wanted to upgrade to an IBM compatible. He decided on the quantum leap from CP/M to Windows, and Microsoft Access. Fortunately I was able to assist in this migration; an interesting exercise that may be the subject of a later article.

PCs and their software products were off and running, and soon the Lotus 1-2-3 spreadsheet became the killer application; Wordstar, Word and WordPerfect were battling it out in the word processor field; and FoxBASE, dBASE III, Dataflex, Paradox and a myriad of other DBMSs were emerging in their area of speciality.

However, because dBASE had such a good start, many other add-on products emerged around this database and its associated language structure. Quicksilver and FoxBASE were two early products that capitalised on some of the shortcomings of dBASE, one of which was its slow performance.

Other vendors have come into this market and, following the acquisition of dBASE by Borland from its original marketers, Ashton-Tate, and after some period of time considering its options on the legal front, the new owners decided to allow the dBASE language to be public domain. Thereafter, all dBASE-like implemen-tations and associated products became known as Xbase. At this time, Fox Software and many other vendors heaved sighs of relief, as there had been threats of look-and-feel legal suits against them.

Runtimes and EXEs

One of the factors blamed, incorrectly I might add, for dBASE's slow operation was the runtime functionality provided. We should now explore this facility and compare it with the alternative executable option.

Because of its 4GL capabilities, a dBASE application was able to operate in its development environment by interpreting the source programs created by the programmer, or by interpreting commands entered at the dot prompt (similar to the DOS command line).

dBASE programs were not compiled into executable programs, known as.EXE files. If a developer wished to distribute executable code to users, the programs were pseudo compiled and distributed with a runtime base program. This was very much like how BASIC worked and something that many developers found hard to come to grips with, i.e. not having an EXE file to execute. The concept didn't bother the dBASE, and later FoxBASE, programmers though, who found the run-time concept much more efficient as there was essentially only one EXE overhead.

One major problem, however, was cost, particularly in the early days when the database vendors wanted to capitalise on the developers' work and charged a royalty for each runtime module sold.

Enter Clipper, a compiler for dBASE code. This enabled dBASE code to be compiled into executable files for distribution to end users, and did not incur a cost to either the developer or end user.

Moving back for a moment to the subject of runtime versus compiled code. Consider a set of (say) 30 programs operating under a menu structure. The runtime method provides one program that contains all of the subroutines that would normally be contained within the base of an EXE program. The base runtime program serves both as a director to the pseudo-compiled dBASE programs, and a slave to them; Executing low level subroutines and functions as directed by the program code.

In our 30 program suite, there would be one only set of base subroutines contained in the standard runtime program. The relatively small runtime program is loaded fairly quickly into memory, up front, at execution time, and calls into memory those other programs, only when they are required, primarily by the menu selection program.

In a fully compiled environment for our 30 program suite, the problem that existed was how does one decide whether the programs are to be compiled separately, or all together; lumped in with the menu program.

Back in the days of DOS, with its 640 kB RAM limitation, it did not take much effort to build an application that would exceed this limit when compiled into an executable program. The popular Clipper product approached the problem by the use of overlay techniques but from discussions I have had with programmers in that arena, this was always a problem area for the compiler.

On the other hand, if you compiled each program separately, you introduced the overhead of a full set of runtime procedures, within the header of each and every executable program- 30 duplicate sets in our example. Redundant code space and extra loading time, every time a program is called into memory for execution.

Enter FoxBASE, a dBASE II compatible product, and later FoxPro, a dBASE III and dBASE IV compatible product, which used the same runtime philosophy employed by dBASE, but this time in a much more efficient way. The argument was settled for me. One set of runtime procedures, operating the program code at lightning speed, and the need for compiled EXE programs goes out the window. Speed was always Fox Software's major claim for their product, a claim carried forward by FoxPro's new owner, Microsoft. Whenever other DBMS vendors claim speed superiority, FoxPro is always the target benchmark.

The Xbase Language

dBASE has developed from the original dBASE II, through to a short-lived dBASE III, which soon gave way to a multi-user version, dBASE III PLUS. This became a very popular product and many applications written in this version are still operating today. It is still the DBMS of choice for database subjects in many universities and learning institutions.

When the first version of dBASE IV was released it contained many anomalies and caused an enthusiastic database community to become nervous. Such was the pressure upon vendors to get products out to the marketplace, in the time promised by its marketers, that the proper quality control process was not rigorously followed. Fortunately the next two releases of dBASE IV overcame its initial problems. Many lessons were learned from this experience by the vendors and observers.

As well as providing multi user capability, dBASE III PLUS added an advanced Assistant, whereby the operator used pull-down menus to create, manipulate, enquire and report on database tables, without having to write programs. This was a major factor in getting users, rather than programmers, to use dBASE. The concept has been carried through to dBASE IV where it is known as the Control Center.

Xbase is now so entrenched in the PC environment that virtually all applications able to import or export

a database file include dBASE III or dBASE IV table formats. It is also entering the mini computer arena, with offerings on AS/400 and several UNIX implementations. Import/export facilities are common to DBMSs, providing standard text type files that allow transfer between PC applications e.g. spreadsheets, and other database-based applications e.g. project management, and across all hardware platform types.

It seems we are most comfortable using the first spreadsheet, word processor or database system we learnt. For that reason, I am still a dBASE/Xbase proponent, having moved very early into the FoxBASE/FoxPro camp to capitalise on the latter's excellent implementation of this DBMS product. When moving around in my job as a consultant, I find dBASE IV is still a popular product at client sites, and I find its Control Center easy to use.

DBMS and Windows

Although it was still only a DOS product, FoxPro version 2.0 was the first Xbase to incorporate WIMP (windows, icons, mice and pulldown/popup menu) facilities in a text based platform. This was an excellent version, as it provided all the benefits of windowing without suffering any loss of performance usually caused by the overheads of a graphical user interface. This version has since been replaced by a true MS Windows version 2.5, plus an enhanced DOS version 2.5, both of which have recently been further upgraded to version 2.6.

When Borland acquired dBASE from Ashton-Tate, they hit the ground running, so to speak, with their arsenal of object oriented languages to develop dBASE for Windows. This product is currently in beta testing and due for release in Australia in August. This version will be a GUI (Graphical User Interface) and object oriented version, built from the ground up to utilise Windows features to the fullest extent. It will also be client/server enabled, allowing connection to major servers like Sybase, Sequel Server and InterBase.

The Xbase vendors have provided upward compatibility of their DOS products into the Windows environment. This means that many years of DOS investment is easily transferred to Windows.

dBASE's current database market share is just over 55% worldwide. This compares with Paradox at 20%, and Microsoft Access (see below) at 6%. (Source: IDC). It is important to note that this 55% share is effectively DOS.

Together with FoxPro, Clipper and other Xbase products, Xbase is the COBOL of the micros, and is the source of many legacy systems. Legacy systems is a recently coined term, used to describe systems written in old languages on old platforms- read COBOL (COmmon Business Oriented Language) on mainframes. One feature these legacy systems can boast is- they work!

With its monolithic might, Microsoft presented the marketplace with a Windows based relational DBMS that would enable applications to be generated by the press of a Wizard button, with access to all forms of DBMS formats on PCs and mainframes. The product is named Access and has been a very fast selling DBMS. At a bargain $150 (approx.) introductory price, or purchased as part of the Microsoft Office Professional suite, it has picked up 6% of the market in its short life (Source: IDC). Access has been targeted to cover a range of users, from the novice, who recognises the need to have a database rather than a spreadsheet, to the corporate citizen who wishes to access data on a mainframe and operate SQL queries and reports from a PC.

Just weeks after the release of Access, Borland countered with Paradox for Windows which had been re-written from the ground up, not simply upgraded from the DOS version. This included the newer ObjectPal language and came in three options: Regular, Workgroup and Developer; the last one containing a runtime and many other tools to help developers.

Where to Now?

The DBMS world is poised at the crossroads of the traditional Xbase standards and going with any of the several alternative offerings in the marketplace. To add strength to the Xbase school, there is currently an ANSI Xbase standard being developed. I believe it is a safe bet that no one language, product or standard will predominate

the DBMS arena to the same extent that Microsoft has dominated PC operating systems with DOS and Windows. There is certainly a place for many products, as each one has its own flavour, intended to address the needs of a particular set of users in this computer-aware age.

Expanding Horizons

While the mainstream Xbase products have been aimed at the programming community, other products have been positioned to lure the non-programmers. Unfortunately, many of them have had only single file capabilities, being suitable only for mailing lists, labels etc.

dBASE with its Assistant (and later the Control Center), Paradox and ObjectVision have advanced with user friendly interfaces designed to limit the amount of programming required. Many of us just want to develop an event-driven application via the point and click method, without having to write any program source code.

However, it is also convenient to have the flexibility, if necessary, to add that extra function which is unavailable in products not supported by a language. I find the FoxPro FOXAPP program generator and report writer do most of the things I need to generate multi table applications and provide comprehensive reporting facilities.

While the DBMS systems were advancing to attract new followers from the PC user community, the vendors were also looking upward and outward to the world of open systems. Back in the dark ages of computing, competition was rife amongst the hardware vendors who were able to lock clients in to their proprietary operating systems, network protocols, databases and languages. IBM computers could only talk to IBM computers via IBM's networking systems. Start life out on a small DEC VAX and you can scale up, without any change whatsoever to your application code, to any other sized computer, provided it is a DEC VAX.

Come the enlightened eighties, and the vendors had realised that clients did not like this lock-in situation, so they became open. The DBMS vendors played a major part in that movement, ensuring that if an application was written, for example, in COBOL for the Total DBMS, it could run on a number of hardware platforms. The communications arena also started to develop standard protocols and, of course, the UNIX operating system had been lurking around for some time.

Recalling the computer press of the eighties, I believe every year was touted a the year of UNIX.. This is the operating system that boasted a wider coverage of hardware platforms than any other. It has taken nearly two decades, but UNIX can now claim to be the open operating system of the nineties.

The open background was established by segmenting and identifying the several components of systems into specialised functional layers. The communications industry has standardised on the OSI 7-layer model. The layer approach enables users to mix and match components from each of the layers and provides vendors with the ability to selectively address those areas of the market with their specialist products. Against this open background, the DBMS vendors have been extending their own specialist field to incorporate a wider platform coverage.

The Client/Server Evolution

With the advances made in this layered approach, developers can now mix and match front-end GUI programs that can access a variety of DBMSs, via the now standard SQL, operating on different hardware platforms, across a range of locations via several different layers of network protocols. This may sound like a spaghetti arrangement, but the layered/modular/object approach is

the means by which we now have the opportunity and flexibility to utilise the current marvel buzz word- Client/Server.

Client/Server, in simple terms is the ability for a client computer, operating on local resources, to request database or other services from a remote server computer. In a simple example, a PC operating a user friendly GUI data entry program, can send a request to a database server at a remote location to create or update a central database record. Other situations extend the simple one-on-one configuration such that a server to one client process can itself be a client to a further server process. Given Australia's large geographic spread, an example of such a situation could be a GUI front end client PC requesting an application function from a capital city application server, which in turn, as a client, requests a database function from a database server.

This flexibility provides an effective level of access to strategically located data, and provides a level of scalability (ability to add more users to the system without suffering system response degradation) not seen before in unitary based mainframe systems.

To give you an idea of the current groundswell in this evolutionary movement, a recent survey by the US based DBMS magazine revealed that

76% of its readers were actively looking at client/server for solutions to their data processing problems.

Client/Server is being targeted both by those organisations who are down-sizing their operations from mainframes, and by growing organisations who are looking to up-size their operations from an unregimented collection of stand-alone PC applications, or even LANs, into corporate wide systems on a Client/Server basis.

More recently, the down-sizing and up-sizing phrases have been replaced by the more appropriate right-sizing adjective. All this leads to the point that there is a lot of hardware and software out there that does not necessarily have to be thrown away just because an organisation is going to restructure its systems environment. Certainly, new software processes and standards will be deployed to take advantage of the new environment but at the very least, the current investment in PC hardware will in many cases be preserved.

As part of this Client/Server evolution, the DBMS vendors have been positioning - <P4.5MI> right-sizing- their products so that they appeal to a much wider audience. Borland, with its Paradox client and InterBase database server for example, have products which can be used at the PC level, or on a variety of hardware platforms. Microsoft SQL Server, Sybase SQL Server, Oracle Server and many others also provide this multi platform capability.

While it has been contended by the mainframe lobby that PC based systems could not handle the database and transaction volumes called for by large enterprise-wide systems, the following (edited) extract from DBMS magazine of June 1994 quotes samples that would have been considered beyond the reach of PCs only a few short years ago.

"At Kemper National Services .., more than 600 people run an entire insurance claim data processing application on a PC network using Dataflex .. Accelerated Collections Bureau uses a single-server, single-Ethernet segment to service 60 concurrent users accessing a single FoxPro database... CBN down-sized its 500 terminal/mainframe shop to 720 networked PCs running FoxPro. CBN is saving more than one million dollars per year in maintenance costs alone. The main contributor application handles 55,000 transactions per day (about five records per transaction). What about database size? CBN has 40 GB of online transaction data. The contributor's database is 17 GB with 67 million records..."

Business As Usual

Over the years there have been many mergers and takeovers of database and other software products and companies. In or around 1992 we saw the following moves by the major DBMS players: Microsoft took over Fox Software (FoxBASE and FoxPro), Borland took over Ashton-Tate (dBASE), and Computer Associates (CA) took over Nantucket (Clipper). More recently, Microsoft Corp. and Sybase Inc. parted company after a seven year partnership, each taking a version of SQL Server to pursue separate markets. Novell, who purchased the Btrieve and related technology from SoftCraft Inc., have recently hived off that acquisition as a separate stand alone venture called Btrieve Technologies. In 1991, Software Publishing Corp. acquired Superbase from Precision Software Ltd, but has recently cancelled future development of the product, and is looking for a buyer for the product. Meanwhile, it will continue to support the current release 2.0. In view of recent press coverage, this may not be the end of the merger stories.

Whatever happens, DBMSs are definitely now in the fast lane.

Acknowledgments: Borland Australia, Microsoft Australia, DBMS magazine, IDC, R.H. Fees.

Reprinted from the July 1994 issue of PC Update, the magazine of Melbourne PC User Group, Australia

[About Melbourne PC User Group]