The magazine of the Melbourne PC User Group

There’s Something About Microsoft Files
Major Keary
 

In my experience Microsoft Office makes big, very big, files. The 'fast save' feature contributes because, instead of discarding data that is no longer required, files keep the lot. That is why 'fast save' is faster: each time an edited file is saved, Office simply writes the changes and closes without erasing redundant data. In other words, the more often a file is edited or used, the more detritus it accumulates. Typically the name of the originator will appear multiple times, probably because it is recorded each time the file is opened. Saving font information also contributes to avoidable file obesity.
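The accumulation described above can be pictured with a toy model. The Python sketch below is not Word's actual file format: the ToyDocument class and its methods are hypothetical, and for simplicity a 'fast save' here appends the whole current text rather than only the changes. It simply shows how a save that never discards old data makes the stored file grow with every edit.

```python
class ToyDocument:
    """Hypothetical stand-in for a word processor file (not Word's real format)."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.stored = text.encode("utf-8")  # the bytes "on disk"

    def edit(self, new_text: str) -> None:
        self.text = new_text

    def fast_save(self) -> None:
        # Append the current text; everything saved earlier stays in the file.
        self.stored += self.text.encode("utf-8")

    def full_save(self) -> None:
        # Rewrite the file so that it holds only the current text.
        self.stored = self.text.encode("utf-8")


doc = ToyDocument("First draft of the minutes.")
for revision in ("Second draft of the minutes.", "Final minutes."):
    doc.edit(revision)
    doc.fast_save()

print(len(doc.text), len(doc.stored))   # the stored file is far larger than the text
print(b"First draft" in doc.stored)     # superseded wording is still in the file

doc.full_save()
print(len(doc.text), len(doc.stored))   # after a full save the two sizes match
```

In this model a full save is the only thing that removes the superseded text.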

Another 'feature', which seems to be peculiar to Windows 95/98/ME, is that the kernel does not provide zero-filled memory. A program can therefore be allocated memory that still contains data from earlier use, and that data may be dumped into the open file.
 
When a program calls for memory to be allocated the operating system should ensure the memory provided is empty (zero-filled). Memory holds whatever data has been loaded until either it is flushed or the system is shut down. I have seen Microsoft Word files with the contents of other, unrelated, files embedded in them. That material can't be seen when the file is opened with Word, but it can be seen with a text editor such as UltraEdit. Apart from a potential security problem, it adds to the bloat.
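A text editor is one way to look for such stray material; a few lines of Python can do a similar job. The sketch below is only an illustration: the function name printable_runs and the sample bytes are made up, and it looks for plain ASCII text only, so anything stored in another encoding (Word also stores text as 16-bit characters) would be missed.

```python
import re
from typing import List


def printable_runs(data: bytes, min_len: int = 6) -> List[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for the contents of a binary document file.
sample = b"\x00\x13Quarterly report\x00\xff\x02ab\x00\x00Draft letter to the bank\x07"

for run in printable_runs(sample):
    print(run)
```

To inspect a real file, read it in binary mode, for example open(path, "rb").read(), and pass the result to printable_runs.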
 
There is another reason why Office files are larger than their StarOffice equivalents. When testing StarOffice I was given a 2.2 megabyte .DOC file that contained no excess baggage, but when saved by StarOffice the resulting file was 1.2 megabytes. To test the fidelity of the file I used StarOffice to export both the original and the StarOffice versions to PDF. The PDF files were identical. The answer lies in the way StarOffice stores data: compressed XML.
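How much compression helps with XML is easy to try out. The Python sketch below packs some repetitive XML into a ZIP archive held in memory and compares the sizes. The element names, the file name content.xml and the helper zipped_size are illustrative assumptions, not an extract from a real StarOffice document, and the result says nothing about what accounted for the saving in the file described above.

```python
import io
import zipfile


def zipped_size(name: str, text: str) -> int:
    """Return the size in bytes of a ZIP archive holding text as one member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, text)
    return len(buffer.getvalue())


# Illustrative markup only: the same paragraph element repeated many times.
paragraph = '<text:p text:style-name="Standard">Minutes of the monthly meeting.</text:p>\n'
xml = "<office:body>\n" + paragraph * 500 + "</office:body>\n"

raw = len(xml.encode("utf-8"))
packed = zipped_size("content.xml", xml)
print(raw, packed)   # the packed size is a small fraction of the raw size
```

XML markup repeats the same tags over and over, which is exactly the kind of redundancy that compression removes.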
 
Microsoft files can lock themselves out, so to speak. Office applications appear to write a checksum or CRC (cyclic redundancy check) that is appended when a file is saved or correctly closed. If closure is the result of an unexpected event (power failure, book dropped on keyboard, system crash, etc.) the checksum operation is not performed. When Office attempts to reopen a 'forced-close' file it encounters an incorrect checksum and returns an error message.
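The general idea of a checksum that goes stale can be sketched in Python. This is not Office's actual mechanism, which is only inferred above: the four-byte trailer and the names seal and is_intact are assumptions made for the illustration.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a four-byte little-endian trailer."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def is_intact(blob: bytes) -> bool:
    """Check whether the trailer still matches the payload in front of it."""
    if len(blob) < 4:
        return False
    payload, trailer = blob[:-4], blob[-4:]
    (stored,) = struct.unpack("<I", trailer)
    return zlib.crc32(payload) == stored


good = seal(b"Balance: 100")

# Simulate a forced close: the contents changed, but the trailer was never rewritten.
stale = b"Balance: 900" + good[-4:]

print(is_intact(good))    # True
print(is_intact(stale))   # False
```

A program that refuses to open anything failing such a check would reject the second file even though its contents are perfectly readable, which matches the behaviour described above.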
 
In such cases StarOffice can be used to open the file and 'repair' it simply by using 'save as'. StarOffice can save Microsoft Office files in the original format, or in a StarOffice format. All of which raises the question: why bother using Office in the first place?

Reprinted from the July 2004 issue of PC Update, the magazine of Melbourne PC User Group, Australia
