The magazine of the Melbourne PC User Group

Browsing Your Hard Disk - Offline Browsing
Trevor Gosbell
 

 

There is a lot of good stuff on the Internet. The trouble is I don't want to spend all my online time browsing and reading. Have you ever found yourself asking "Why can't I just download what I want and look at it in my own good time?"

Well, you can. Here are some suggestions for offline browsing.

Use Your Browser

Modern browsers provide good support for offline browsing. Internet Explorer offers two options - Web page saving and synchronize.

To save a Web page select File|Save As on the menu, and IE offers to save "Web page, complete" as the default file type. This will save the current Web page and all of its requisite parts (images, style sheets and so on) to your hard disk.

Easy, but what if the current page is page one of a sequence? If it's more than a few pages, it will get a bit tedious to save each page individually. This is where synchronize comes in.

The easiest way to use Synchronize is to tag existing Favorites for offline browsing. On the menu choose Favorites|Organize Favorites. Click each favorite you want to view offline and click the "Make available offline" check box. Click the "Properties" button to set how much content to download. Now, before you log off, you can click Tools|Synchronize and the selected favorites will be updated for offline viewing.

Netscape provides a similar facility, called Netcaster, for offline browsing.

Use a Batch Download Utility

The main problem with the offline browsing features of the browsers is that you have little control over where the downloaded material is stored, so it's almost impossible to share your downloaded Web pages with other computers. I want to be able to download pages for viewing and printing on my desktop PC and also sit in the beanbag and browse them on my notebook. Enter the download utilities!

There is a stack of freeware and shareware batch downloaders available. These tools allow you to replicate a selected part of a Web site on your hard drive, which you can then browse offline.

I have found wget from GNU to be the most reliable. wget is freely available from http://ftp.sunsite.dk/projects/wget/windows/wget-1.8.1b.zip. Unzip this package and it's ready to run.

wget is a command line utility with an enormous range of command line options. Although it's possible to set up a configuration file to save your commonly used options, I find it just as easy to make a batch file that calls wget with the options I want. I call mine Webget.bat, and it looks like this:

wget -x -r -nc -k -np %1

What the options mean:
-x "force directories" - reproduces the complete path of the file (including the host name from the URL) on your drive
-r "recursive download" - follows links on the current page (and subsequently downloaded pages) to a default depth of 5 links
-nc "no clobber" - don't download a file if you already have it
-k "convert links" - rewrites links in the downloaded pages so they point to your local copies, making offline browsing reliable
-np "no parent" - do not follow any links to directories above the starting directory

With these settings, a Web page and the pages it links to in the same directory (or below it) are downloaded into a replica of the Web site on your computer.
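If you would rather use a script than a batch file, the same idea can be sketched in Python. This is only an illustration: the names build_command and webget are my own, and it assumes wget is somewhere on your path, just as the batch file does.

```python
import subprocess

# The same options used in Webget.bat
WGET_OPTIONS = ["-x", "-r", "-nc", "-k", "-np"]


def build_command(url, wget="wget"):
    """Return the argument list equivalent to: wget -x -r -nc -k -np URL"""
    return [wget] + WGET_OPTIONS + [url]


def webget(url):
    """Run wget with those options and return its exit status.

    Assumes the wget program can be found on the path.
    """
    return subprocess.call(build_command(url))
```

Calling webget with a URL should then behave like running Webget.bat with the same URL, while build_command lets you inspect the command line without downloading anything.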

Example

Put wget.exe and Webget.bat in your path, and you can retrieve a back issue of PC Update by typing the following in a DOS window:

webget http://www.melbpc.org.au/pcupdate/2112/index.htm

This will download index.htm and all of the other articles available in the 2112 directory to your computer, in the directory www.melbpc.org.au/pcupdate/2112/. But other files from www.melbpc.org.au/pcupdate/ or even www.melbpc.org.au will not be downloaded.
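To make the effect of -x and -np a little more concrete, here is a rough Python sketch of the two rules as described above. It is not wget's actual code, only an approximation of its behaviour; the function names are invented for illustration, and details such as port numbers and query strings are ignored.

```python
from urllib.parse import urlsplit


def local_path(url):
    """Approximate where -x stores a URL: the host name followed by the URL path."""
    parts = urlsplit(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # wget's default file name for directory URLs
    return parts.netloc + path


def within_start_directory(start_url, candidate_url):
    """Approximate -np: accept only URLs on the same host, at or below the start directory."""
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)
    # Directory part of the starting URL, always ending in "/"
    start_dir = start.path.rsplit("/", 1)[0] + "/"
    return candidate.netloc == start.netloc and candidate.path.startswith(start_dir)
```

With the starting URL from the example, local_path gives www.melbpc.org.au/pcupdate/2112/index.htm, and within_start_directory accepts any page under /pcupdate/2112/ but rejects http://www.melbpc.org.au/pcupdate/index.htm.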

The good thing about this approach is that I control where the Web pages are downloaded to, so I can move them onto another machine if I want to. Also the URL and path of the original file are recorded in the file structure, so I can easily return to the live Web site for an update or to view other related material.

Note

Many friendly Webmasters provide a zipped copy of their Web site to download for local viewing. This will usually be a faster download than using an automated Web site download.

Use caution when using any of these automated download techniques, as they can quickly get out of hand, downloading megabytes of unwanted material to your computer. If you use wget, read the documentation before changing any of the options shown here. Always make your download settings conservative. You can always go back again later if you didn't get everything you want.
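As one possible way of keeping a download conservative, wget also has options to limit the recursion depth (-l), cap the total amount retrieved (-Q) and pause between retrievals (-w). The Python sketch below only builds such a command line; the function name and the limits chosen (2 levels, 5 megabytes, 1 second) are arbitrary assumptions for illustration, so check the documentation for your version of wget before relying on them.

```python
def conservative_command(url, depth=2, quota="5m", wait_seconds=1):
    """Build a wget argument list that adds explicit limits to the options used earlier.

    The default limits here are illustrative assumptions, not recommendations.
    """
    return [
        "wget", "-x", "-r", "-nc", "-k", "-np",
        "-l", str(depth),         # limit recursion depth (wget's default is 5)
        "-Q", quota,              # download quota for the whole recursive retrieval
        "-w", str(wait_seconds),  # seconds to wait between retrievals
        url,
    ]
```

If the first pass misses something, the limits can be raised and the command run again; because of -nc, files you already have are not downloaded a second time.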

Reprinted from the May 2002 issue of PC Update, the magazine of Melbourne PC User Group, Australia
 
