Despite Microsoft's recent legal debacle with Stac Electronics, and the subsequent withdrawal of DoubleSpace from its DOS offerings, compression technology remains a cheap way of gaining more precious disk space. Over several years my colleagues and I have used, with varying levels of satisfaction, three of the major players in this arena: SuperStor (SS), Stacker and DoubleSpace (DS). This article examines:
The Technology

First understand that data compression is a sound, mathematically based, and proven technology. Data compression is a "free lunch" in a well-behaved environment; the only measurable cost is the memory required by the device driver, which can be loaded high. Realise that even on older machines a compressed drive will often outperform an uncompressed one! Time is required to decompress all that data in memory, but the net data transfer rate is faster (it's compressed) and the CPU decompresses data a lot faster than the disk could transfer the same data in uncompressed form. The major assumption is that the compressed drive is regularly defragmented.

Lossless compression has been available to, and successfully employed by, PC users for many years, as evidenced by programs like PKZip from PKWare. The problems of real-time transparent compression occur not in the technology but in its implementation. These implementation differences, and how they interact with your operating environment, are where all known problems lie. Nearly all troubles that users experience can be traced to confusion or to odd interactions between the compression software and other parts of the system. Problems usually fall into the following three categories:
Implementation

Although DS legitimised real-time compression, in reality it achieves about 20 percent less compression than either Stacker or SS. For a detailed discussion of implementation issues beyond our needs see the article by Halfhill [1].

Compression drivers are either preloaded, as in DOS 6, or activated as a device in CONFIG.SYS. DS and Stacker (3.1 and later) preload their drivers before CONFIG.SYS, while SS is activated from CONFIG.SYS as a device. In the case of SS, if CONFIG.SYS is corrupted for some reason the compressed drive may not be mounted. If your compression driver is loaded in CONFIG.SYS and you cannot access the compressed drive, then CONFIG.SYS is the first thing to check. Accidentally deleting CONFIG.SYS will not damage the compressed drive; it only prevents access until CONFIG.SYS is correctly re-established and the system rebooted. SuperStor/DS, included with IBM's PC-DOS 6.1, also preloads its driver; this version of SS will soon be the only one available. Stacker and DS obtain information about which compressed drives to mount from STACKER.INI and DBLSPACE.INI respectively, while SS searches available drives to locate compressed volumes.

Once the compressed drive is mounted, it appears to the system as a virtual drive. All file I/O happens normally, except that the device driver intercepts the I/O to compress and decompress files as they are saved to and loaded from the new drive. In essence the compressed drive is one large file, the compressed volume file (CVF). "Some people fear that storing everything in one massive file compromises data integrity. However a number of safeguards and cross-checks are built into the compressions architectures to prevent you from losing information even if the CVF is corrupted. What's most important ... is the integrity of the compression architecture and how readily you can diagnose and repair common problems with disk utilities" [1]. Even so, most users are so hungry for disk space they are willing to accept some misfortune.
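The idea of a driver that compresses on the way in and decompresses on the way out, with everything held in one container, can be sketched in a few lines of Python. This is only a toy model: the class name CompressedVolume and its methods are invented for illustration, zlib stands in for the proprietary algorithms, and real drivers work on disk sectors inside the CVF rather than on whole files.

```python
import zlib


class CompressedVolume:
    """Toy stand-in for a compressed volume file (CVF).

    Hypothetical model for illustration only: DoubleSpace, Stacker and
    SuperStor operate on disk sectors inside a single host file, not on
    a Python dictionary.
    """

    def __init__(self):
        # Maps a file name to its compressed bytes, all in one container.
        self._store = {}

    def write(self, name, data):
        # Compress on the way in; the caller only supplies plain bytes.
        self._store[name] = zlib.compress(data)

    def read(self, name):
        # Decompress on the way out; the caller only sees plain bytes.
        return zlib.decompress(self._store[name])

    def stored_size(self):
        # Space actually consumed inside the container.
        return sum(len(blob) for blob in self._store.values())


volume = CompressedVolume()
original = b"AUTOEXEC.BAT and CONFIG.SYS " * 200
volume.write("README.TXT", original)
round_trip = volume.read("README.TXT")
```

The caller reads back exactly what it wrote, while the container holds fewer bytes than were written. Highly repetitive data such as this compresses far better than typical program files would.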
Known Problems

Mysterious problems are often due to unpredictable interactions between software. Microsoft compiled a list of software that may not work with DS. Some of these programs will not run from any compressed drive; the reasons vary from copy protection schemes to temporary file handling, and even different versions of the ROM BIOS are known to cause problems. Furthermore, DOS really needs a controlled shutdown rather than simply turning off the machine. Data could be in the cache and not yet committed to disk (as per SmartDrive and DOS 6.0). DOS 6.2 ensures SmartDrive is flushed before returning the DOS prompt.

Drive Mapping

All of these compression drivers create a single file (the CVF) which is assigned a drive letter. If C is compressed there must always be an uncompressed host drive to hold the hidden DOS files (MSDOS.SYS, IO.SYS) and the CVF. Both Stacker and DS follow this philosophy for every volume and assign an extra drive letter to each CVF. To maintain user familiarity the CVF may optionally, and by default does, take the letter of the original drive (in the case of DS the original C defaults to H and the CVF becomes C). Both Stacker and DS maintain a log of the compression process in a small portion of the drive which remains uncompressed, H in the case of C. For fully compressed volumes other than C, SS replaces the original drive letter with the CVF. As the SS drive is fully compressed and SS does not maintain a log, there is no advantage in having a useless drive letter assigned. A fully compressed SS host is not accessible after the CVF is mounted.

The drive mapping highlights a difference in implementation approach. Stacker maintains an additional copy of the FAT on the uncompressed host drive; this is not done by SS or DS. The problem, of course, is that if the FATs differ, which copy is correct? Whilst DS does not maintain the extra FAT, its treatment of drive allocation is undoubtedly due to its Stacker heritage.
One small advantage of this approach is that it is often simpler and quicker to back up the single CVF from the host drive than to back up its contents file by file.

Problems Encountered

SuperStor

The following problems and, where known, their solutions are presented in the historical order in which they were encountered. The historical presentation adds to the description as it also reflects the evolution of the technology.

DR-DOS 6 was the first to introduce real-time compression as an integral part of the operating system. Digital Research licensed a tailored version of SS that would only function with its operating system. This implementation was surprisingly resilient and the DR-DOS version of CHKDSK would fix most problems; unfortunately most does not mean all. Occasionally the CVF would decide to protect its contents from any possibility of being further corrupted by placing itself in read-only mode, and nothing could be done to fix the problem and coerce it back to normality. The only solution was to back up the contents and reinstall the drive. Whilst a painfully slow process, this will usually work with any CVF, and in many cases is the only solution.

SS Pro, which was also compatible with MS-DOS, came with its own recovery utility, SSUTIL. This was a major improvement over the standard DR-DOS 6 version but it too had problems. Despite claims in the documentation, none of the DR-DOS 6 utilities (CHKDSK or DISKOPT, the defragmenter) would recognise the CVF. These utilities treated the CVF as a normal volume, which meant that simple problems could no longer be repaired by CHKDSK and defragmenting with DISKOPT was not a good idea. SSUTIL had a built-in defragmenter but this was painfully slow. SSUTIL was now the standard means of recovering a corrupted SS CVF. Unfortunately even SSUTIL could not always get the CVF out of read-only mode; in this event the backup recovery method was the only alternative.
If SSUTIL was able to resurrect the CVF it would do so in one of two ways. It would either identify and optionally delete corrupted files (one could usually live with this) or, more commonly, delete all directories and their contents that occurred, in linear order, from the first directory that contained a corrupted entry. In the latter case one had to back up everything from that directory onwards in the tree. This case was marginally better than backing up the whole CVF.

Norton Utilities Version 7 (NU7) provided a better rescue environment than SSUTIL but still required SSUTIL to initially get the CVF out of read-only mode, an operation that was not always possible and might cause the deletions mentioned previously. NU7 at least did not insist on deleting all directories and their contents past the first corrupt entry. The NU7 (or later) utilities are the best compression recovery tools currently available on the market and are highly recommended.

SuperStor Pro may also have a problem with high-speed machines. On the DX2/66s on which I have tried it, it was unable to cope with high rates of data transfer. A simple "copy" of a large file caused a system crash and the CVF was placed in read-only mode. This may have been BIOS related but I doubt it. SS Pro functioned adequately on DX2/50s, DX/33s and below. AddStor Inc. (SS) has been purchased by IBM, hence the appearance of SS within IBM's PC-DOS 6.1. One would expect the speed problem to be absent from the IBM release: SS/DS.

DoubleSpace

If you did not install the optional programs (Backup, AntiVirus and Undelete) that came with DOS 6.0, the 6.2 Upgrade program will not upgrade them. If you do not have a copy of DBLSPACE.BIN visible the upgrade will not even work, so even if you do not currently use several of the DOS utilities and have them archived elsewhere, put them back in the DOS directory before attempting the upgrade to 6.2.
DOS 6.2 apparently changed the default Buffers value, somewhat arbitrarily, to 10. DS is now partially loaded into the HMA region, leaving less space for other code. With DS you have room for about 15 buffers and without DS there is room for many more. However, Buffers=10 is recommended when using SMARTDRIVE; additional buffers will only duplicate SMARTDRIVE's buffering efforts and actually slow the system.

There were troubles with data corruption with DOS 6.0, so SCANDISK and the DoubleGuard feature of DBLSPACE were introduced to cure that. SCANDISK was found to be causing the total loss of data on certain hard disks (when verify was set on) and that required fixing as well. Unfortunately, even when working well DS is not very good. The alternatives, Stacker (Novell DOS 7) and SS (DR-DOS 6 and IBM PC-DOS 6.1), both provide more space than DS. Additionally, Stacker and SS can create compressed floppies which can be read by any PC (DOS 6.2 DS floppies are only portable to another DOS 6.2 DS system) and IBM's PC-DOS 6.1 (SS) can read Stacker-compressed drives directly (DOS 6.2 can't do that) [2].

Norton Utilities comes with its own excellent cache, NCACHE2. Users aware of the SMARTDRIVE DOS 6 problem who had NU7 made the change. Unfortunately they walked into a different trap. The original NCACHE2 V7 was incompatible with the default setting of DS in DOS 6.2 that enabled automatic detection of compressed floppies (automatic compressed floppy detection was not available in DOS 6.0). When a floppy was accessed the system would perform erratically and usually hang. Disabling automatic compressed floppy detection fixed the problem (place "AutoMount=0" in DBLSPACE.INI), but finding this was a non-trivial exercise for many a consultant. Norton released a patch to the cache to overcome this problem.

Owing to the heritage of DS from Stacker, what follows applies to both of these compression systems.
When a corrupt CVF occurs, and you are unable to salvage the situation in the recommended manner (SCANDISK, NU7+ or CHECK /F in the case of Stacker), this technique may prove useful. Consider, for example, a "file size" problem, as reported by the diagnostic utilities, indicating that the CVF is corrupt. Strangely, this situation is not always repairable by the Norton, DOS or Stacker utilities; nothing seems to be able to get the CVF out of read-only mode. Even the Stacker technical notes specifically dealing with this situation were unsuccessful [3].

Further investigation revealed that the CVF was not contiguous. Firstly, when the CVF is created an unmovable log file is also created somewhere on the host, and it splits the CVF into two pieces. Secondly, and this was a shock, in many cases the end of the CVF was in the centre of the CVF! That is, if the CVF was in three segments, A-B-C, the actual order on disk might be A-C-B! This means that the actual end of the file might be hard up against itself. This of course should be irrelevant, but with nothing to lose try the following: Back up the CVF and have a bootable disk handy. (If you follow later recommendations backup may not be necessary.)
I now perform the above whenever I install Stacker or DS. It ensures a contiguous CVF and thus appears to guarantee the repair utilities will actually work when needed. Step 3 is quite safe if the CVF is empty. You might want to take a copy of the log file before deleting it if this is not the case.

Stacker

Overall, my personal experience leads me to favour Stacker 4, and its superior compression algorithm can increase available disk space by another 25 percent. My only complaint is the time-consuming checking associated with the duplicate FAT. When a discrepancy occurs between the two FATs the user is asked to select one or the other, as if they would know the better choice! Unfortunately on a large drive it can take over an hour for the Stacker CHECK utility to assess the probable consequences of selecting one FAT over the other. The irritating thing is that the estimates are done separately. Consequently one must reassess the tables, another 2+ hours on a large drive, to change the decision. Currently NU8 does not support Stacker 4; when it does, this alone would be a good reason to acquire it.

A strange problem occurs when using Stacker 4, Windows and QEMM 7 Stealth ROM. Clicking on the Windows Control Panel causes Windows to immediately exit to DOS. I cannot formally verify the cause but I suspect Stacker uses the expanded memory page frame in a non-standard way for its internal buffer. The QEMM VHLN parameter tells QEMM to ignore this non-standard access method and solves the problem. Stacker has recently released an update to Stacker 4 to allow 32-bit access to "stacked" drives under Windows for Workgroups 3.11.

Network Problems

An annoying problem can happen with both DS and Stacker when using a network. If your first network drive letter is too low, typically F, then the standard DS installation will prevent network access.
At boot time the PC would see local drive letters A..H (H is the default allocation for the original C drive) and thus will not attach to the "duplicate" non-local drive F (less than H). Of course if the host letter could be made less than F (D say) then the problem is solved, but this is not always possible. Of course one must have LASTDRIVE=E (or less) in CONFIG.SYS.

On systems with large hard drives it is often more manageable to partition the drive into at least C and D. Suppose we compress D and map the host to E. We can still attach to the network as E is less than F. If the machine has a CD-ROM, what letter should it use? In fact in this situation, without reassigning the network drive to a higher letter (an operation which to my knowledge must be done by the computer centre), you cannot access both the CD-ROM and the network simultaneously! If the CD-ROM driver is loaded first it must be F or higher and the previous problem arises; if the network is attached first then MSCDEX currently fails to initialise correctly. I know of no simple way to resolve this problem.

General Advice

Most people want extra disk space, not for their precious data, but to hold all that FATWARE. Software is usually the consumer of disk space, not data. Most clients I deal with seldom have more than 20-30 MB of data. Bearing this in mind, the following setup provides excellent data protection whilst gaining the most from data compression with minimal risk.

Assuming at least an 80 MB drive, partition the drive into a 30 MB C and the rest for D. On C place DOS, all drivers and TSRs etc. required by CONFIG.SYS and AUTOEXEC.BAT, and utilities useful in a crisis (XTree, Norton, AntiViral, ...). Thirty MB should be ample and you may have enough free space left for the Windows Permanent Swap File. Compress D leaving 30 MB uncompressed.
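The layout just described is easy to size with a little arithmetic. The sketch below is illustrative only: the function name plan_layout and its parameters are invented here, and the 2:1 compression ratio is an assumed rule of thumb, not a guaranteed figure for any of the products discussed.

```python
def plan_layout(drive_mb, boot_mb=30, data_mb=30, ratio=2.0):
    """Estimate space under the layout described above.

    boot_mb is the uncompressed C partition, data_mb the uncompressed
    part of the D host, and ratio an ASSUMED average compression ratio.
    """
    # Whatever is left after C and the uncompressed data area becomes the CVF.
    cvf_mb = drive_mb - boot_mb - data_mb
    if cvf_mb < 0:
        raise ValueError("drive too small for this layout")
    return {
        "uncompressed_mb": boot_mb + data_mb,   # C plus the data area on D
        "cvf_physical_mb": cvf_mb,              # real disk space used by the CVF
        "cvf_apparent_mb": cvf_mb * ratio,      # space the CVF appears to offer
    }


small = plan_layout(80)
large = plan_layout(200)
```

Under these assumptions the minimum 80 MB drive leaves only a 20 MB CVF, which appears as roughly 40 MB, so the benefit grows quickly with larger drives.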
Now fully defragment the host as described earlier; as the host (originally D) was empty you can safely delete any files created by the compression process from the host, leaving only the CVF (DBLSPACE.00x or STACVOL.DSK). We now have 30 MB of uncompressed space on the host for that precious data, and a guaranteed contiguous compressed volume for the FATWARE. On a 200 MB drive we would have 60 MB of uncompressed space and about 280 (140 x 2) MB of compressed space. Appreciate that all the compression systems will allow you to vary the amount of uncompressed space on the host as necessary, hence that uncompressed 30 MB on our original D can be varied up or down to suit changing requirements without affecting its contents.

At this point we have all the system software and utilities on an uncompressed C, hence there will never be any booting difficulties caused by compression problems, and the recovery utilities will always be available. All application software, including Windows, goes in the CVF and all data on the uncompressed area of the host. The Windows permanent swap file is placed on either C or the uncompressed portion of the host. Realise that C and the CVF never need backing up; they contain software which can be reinstalled from original disks. Only the uncompressed portion of the host requires regular backup, and this process is now considerably simpler for less experienced staff. Furthermore, the uncompressed data is a lot safer. I usually adjust parameters so that the data drive is D (for data), the CVF application software drive is E and the network drive (F) is unaffected.

Summary
The program SPACEPCD.ZIP on the BBS is a simple utility that has proven itself useful. It identifies host drives and the compression driver in use, summarises available disk space, and in a networking environment identifies all drive resources available. The functionality of the program does not depend in any way on the compression system in use; it simply identifies it. I could only test this with those versions available to me.

References

[1] Halfhill, T.R. (1994): "How Safe is Data Compression." BYTE, February 1994, V 19:2, 56-74.
[2] Wilby, R. (1994): "A Beginners Tale." PC Update, June 1994, p26.
[3] STAC FAX 4702 dated 9/2/94.

Reprinted from the October 1994 issue of PC Update, the magazine of Melbourne PC User Group, Australia