The magazine of the Melbourne PC User Group

The Missing Link
An e-mail Forum Item Has a Link That Just Doesn't Work...
Gordon Woolf
 
 

Gordon Woolf explains how you can work out what has gone wrong

How often are you sent an e-mail with a link to some information you simply must have — and it doesn't work? For most of us, the answer is "often".

Usually you dash off a reply, and hope you will get a corrected link in time. However, with a little thought, and some information on how hyperlinks work, you may save a lot of time and more than just a little heartache.

Superfluous Characters

The most common troublesome link is the one that has been copied and pasted with an extra stray character, or a character missing.

An example: http://www.worsleypress.com/magbook/index.htm is almost a link to a book I wrote, except that the file name ends not with .HTM but with .HTML, so the working link is http://www.worsleypress.com/magbook/index.html.

I typed a full point at the end of that last paragraph, and that can be a problem too. Always check whether any stray characters at the end of a link are in fact part of it. Common problems are full stops and angle brackets.

In such cases, copy the link and paste it into the address bar of your browser. You can then delete the extra characters before you press Enter or click on the "Go" button.
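
If you handle a lot of pasted links, the same tidy-up can be sketched in a few lines of Python. This is only an illustration: the function name clean_link is made up for this example, and it assumes that a leading angle bracket and trailing brackets or sentence punctuation are never part of the real address, which you should still check by eye.

```python
def clean_link(raw):
    """Remove stray characters that commonly cling to a pasted link."""
    link = raw.strip()                 # surrounding spaces and newlines
    link = link.lstrip("<")            # opening angle bracket
    link = link.rstrip(">.,;:!")       # closing bracket and sentence punctuation
    return link


print(clean_link("<http://www.worsleypress.com/magbook/index.html>."))
# prints http://www.worsleypress.com/magbook/index.html
```

A link that is already clean passes through unchanged.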

Another Type

Next among common problems are those links which carry over to a second line. If you realise this has happened, it is no good copying both lines together, one under the other, and pasting them into the address bar of a browser. You need to combine them, as follows: copy the first line and paste that in, then copy the second line and, with your cursor at precisely the end of the previous paste, add the rest of the link.
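
The joining step can also be done mechanically. The sketch below is a minimal illustration, not a complete tool: join_wrapped_link is a made-up name, the example.com address is a placeholder, and it assumes the real link contains no genuine spaces, so every space or line break can simply be closed up.

```python
def join_wrapped_link(pasted):
    """Join a link that was broken across lines into one unbroken string."""
    # split() with no arguments breaks the text at every run of whitespace,
    # including line breaks, so joining the pieces closes all the gaps.
    return "".join(pasted.split())


wrapped = "http://www.example.com/a/very/long/path/\nto/a/page.html"
print(join_wrapped_link(wrapped))
# prints http://www.example.com/a/very/long/path/to/a/page.html
```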

Usually, long links are the result of extra information after the actual URL for the site, but you can get long addresses too. Just to peeve Gary with one more problem on how he can possibly lay out this article sensibly, the longest registered Web site address is said to be the following German one, locating a publisher of history and technical books:
http://www.wiemenschlichmenschensidzeigthrumgangmitdermuttersprachefrsch.de/

Spaces

Be wary of spaces; there cannot be a space in a URL. They convert to strange things like %20 but in general they should not be there, so make sure you have closed up the multiple parts of a link pasted in this way. Sometimes what looks like a space in an underlined link is in fact an underscore character. So also try that in place of a space, especially if you suspect that someone retyped an address instead of copying and pasting it.
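
When a link seems to contain a space, there are really only three things worth trying, and a short Python sketch can list them for you. The function name space_variants and the example.com address are invented for this illustration.

```python
def space_variants(link):
    """Return the spellings worth trying for a link that shows a space."""
    return [
        link.replace(" ", ""),     # the parts simply need closing up
        link.replace(" ", "_"),    # the "space" was really an underscore
        link.replace(" ", "%20"),  # the space is genuine and must be encoded
    ]


for candidate in space_variants("http://www.example.com/my page.html"):
    print(candidate)
```

Try each candidate in turn in the address bar of your browser.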

Question Marks

Also, be wary of question marks in URLs. Often the text after the question mark is the result of a search query on the site and is not part of the actual address. On sites where logins are used, it is not unknown for that text to identify the individual.

For example, a long URL that would break across two or more lines in most e-mail readers is shown below. It was a page giving details of where mail could not be delivered by the US postal service after the cyclone. It resulted from a search, but nothing after the question mark is in fact needed once you find the information. That may not always be the case, but if such a link does not work it may be worth progressively cutting off the part after the actual page name.
http://www.usps.com.communications/serviceupdates.htm?
from=bannercommunicatins&page=katrina
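
The idea of progressively cutting off the query can be sketched in Python as follows. It is a rough illustration under stated assumptions: the function name progressively_shortened is made up, the example.com address is a placeholder, and the parameters are assumed to be separated by ampersands.

```python
def progressively_shortened(link):
    """List shorter versions of a link, dropping one query item at a time."""
    base, _, query = link.partition("?")
    params = query.split("&") if query else []
    candidates = []
    while params:
        params.pop()  # drop the last remaining item
        rest = "?" + "&".join(params) if params else ""
        candidates.append(base + rest)
    return candidates


link = "http://www.example.com/serviceupdates.htm?from=banner&page=katrina"
for candidate in progressively_shortened(link):
    print(candidate)
# prints http://www.example.com/serviceupdates.htm?from=banner
# then   http://www.example.com/serviceupdates.htm
```

The last candidate is always the bare page address with nothing after the page name.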

A Myth

There is a common myth that you can preserve the whole of a link in an e-mail by enclosing it in <angle brackets>. That often works but it is not unknown for mail readers to interpret anything between such brackets as hypertext code and hide it completely, expecting that there will also be the name of the link outside the angle brackets.

Referral Links

Another kind of link is the referral link. These are commonly used in newsletters and on publication Web sites. They take you to the site they list, but with some extra code that is recorded either at the sending end or by the recipient Web site, to help prove the usefulness of the publication. Such an address might look like this:
http://www.google.com/url?sa=t&ct=res&cd=5&url=http://discuss.agonist.org/yabbse/index.php?board=7;action=display;threadid=11510&ei=lkOZQ-ilBdC4igHs8e3IDA

So let's work out what is happening there. In this case it appears to be a link to Google, but in fact it points to another site called agonist.org and to a subdomain on that site called discuss. Getting sidetracked briefly, subdomains are often just folders on the site which are set up to seem like sites in their own right. A similar format can also be used to refer to a specific computer on a site hosted by more than one server, and it's worth noting the clever use of this by the ABC. They did not call their mail server by the usual name "mail". Instead they call their mail server machine "your", so the e-mail goes to someone@your.abc.net.au

In our agonist example above you are pointed to a subfolder of the discuss folder called "yabbse" and, in that, to a script called "index.php", which is not a static Web page but is instead created on the fly from database information. The link is to a particular discussion on a message board. If you typed just http://discuss.agonist.org/yabbse/index.php?board=7;action=display;threadid=11510, that would actually be sufficient to get to that thread on message board 7.
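
Pulling the real destination out of a referral link like this can be sketched in Python. This is an illustration only: referral_target is a made-up name, and it assumes the destination is carried in a parameter called "url" with the items separated by ampersands, as in the Google-style example. Other referral services may use different names.

```python
from urllib.parse import urlsplit, unquote


def referral_target(link, key="url"):
    """Return the destination carried inside a referral link, if any."""
    query = urlsplit(link).query
    for item in query.split("&"):
        name, _, value = item.partition("=")
        if name == key:
            return unquote(value)  # undo any %xx encoding of the target
    return None


referral = (
    "http://www.google.com/url?sa=t&ct=res&cd=5"
    "&url=http://discuss.agonist.org/yabbse/index.php"
    "?board=7;action=display;threadid=11510"
    "&ei=lkOZQ-ilBdC4igHs8e3IDA"
)
print(referral_target(referral))
# prints http://discuss.agonist.org/yabbse/index.php?board=7;action=display;threadid=11510
```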

Odd Characters

Sometimes odd characters occur in URLs and can cause problems for humans in understanding the URL (even if they will be understood by the browser). For example, %3F is actually ?, %26 is an ampersand and %40 is @.
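
If you want to see what such codes stand for, Python's standard library will translate them. The address below is a made-up example.com one, used purely to show the decoding.

```python
from urllib.parse import unquote

# Each %xx code is a character written as its hexadecimal number.
for code in ("%3F", "%26", "%40", "%20"):
    print(code, "is", repr(unquote(code)))

encoded = "http://www.example.com/search%3Fq%3Dtea%26to%3Dme%40example.com"
print(unquote(encoded))
# prints http://www.example.com/search?q=tea&to=me@example.com
```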

However, in general you should be very wary of Web links that include such characters. They can be used maliciously, as described in the very good article by Michael Horowitz of New York on how spammers, phishers and con men work at http://www.michaelhorowitz.com/linksthatlie.html [Ed: A wonderful find, Gordon! Must be highly recommended for a thorough read; I'd say study it. - GT]

There is more on how Web site addresses can be obscured at http://www.pchelp.org/obscure.htm and, while it is intended to help uncover scammers, it is also useful in helping to understand how Web site addressing works.

Another genuine referral mechanism also uses a trick which has been much exploited by spammers and hackers: the substitution of one link for another. For example, in a webmaster newsletter I received a few days ago, there was in the text a link to FindMyHost.com, but if you let your mouse hover over that link for a while you would get http://www.sitepronews.com/cgi-bin/ct.cgi?id=283.

In this instance the software at the newsletter host knows where to send you and does so after recording another "hit" for id 283. No particular harm in that.
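
Hovering over each link is one way to compare what it says with where it goes. The comparison can also be sketched in Python using the standard html.parser module. The class name LinkLister and the newsletter.example.com address are invented for this illustration, and the sketch assumes simple, well-formed links.

```python
from html.parser import HTMLParser


class LinkLister(HTMLParser):
    """Collect (visible text, real destination) pairs from HTML links."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # destination of the link being read, if any
        self._text = []     # pieces of visible text inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


lister = LinkLister()
lister.feed('<p>Try <a href="http://newsletter.example.com/cgi-bin/ct.cgi?id=283">'
            'FindMyHost.com</a> today.</p>')
for text, href in lister.links:
    print(text, "actually goes to", href)
```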

Yet another trick used by scam merchants does have a genuine use in URLs. This is that anything between the "http://" and the @ character (known as the "commercial at", strudel, monkey's tail, snail etc.) is not part of the actual address. It can be used genuinely for transmitting passwords to the site, and if you are given this kind of address by someone who is not trying to scam you, it may mean they are innocently revealing their password. To see what site is actually involved, ignore anything before the @ character.
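
Python's standard urlsplit function makes the same separation, so it can be used to check which site a link with an @ in it really leads to. The addresses below are placeholders: www.mybank.example is an invented name and 192.0.2.10 is a number reserved for documentation.

```python
from urllib.parse import urlsplit

suspect = "http://www.mybank.example@192.0.2.10/login"
parts = urlsplit(suspect)

print("Text before the @:", parts.username)   # www.mybank.example
print("Site really visited:", parts.hostname)  # 192.0.2.10

# A genuine use: a user name and password sent along with the address.
genuine = urlsplit("ftp://gordon:secret@ftp.example.com/files/")
print(genuine.username, genuine.password, genuine.hostname)
# prints gordon secret ftp.example.com
```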

Another symbol that often appears in URLs is the #, often called the hash, pound or number sign. This is used to indicate a link within the page you are viewing so http://www.mydomain.com/thispage.html#something will point to a section of that same page which has been called "something" and has some code to which your cursor will be sent.
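
Since the part after the # only names a place within the page, it can be split off to find the address of the page itself. A small Python sketch, using the standard urldefrag function on the example address above:

```python
from urllib.parse import urldefrag

page, section = urldefrag("http://www.mydomain.com/thispage.html#something")
print(page)     # http://www.mydomain.com/thispage.html
print(section)  # something
```

If a link with a # in it fails, trying the page address on its own is a reasonable first step.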

Murphy dictates that when you receive a link that does not work, it will be something you'd really like to see. Maybe you will now be able to work out for yourself where you should really go, and I do mean that in the nicest way.

About the Author
Gordon Woolf, a long time member of Melb PC, recently retired from publishing and sold his publishing business to fellow member Geoff Heard. He's now doing book layouts and covers for a US publisher from his home in Hastings on the Mornington Peninsula and hosting several Web sites on a server in Michigan. Contact:
gordon@gcwnet.net

Reprinted from the October 2005 issue of PC Update, the magazine of Melbourne PC User Group, Australia
