HyperText Markup Language (HTML), Web Information, Hints and Tips; the Simplest Websites.

How big-lies.org was built!   No Java, no apps, no WordPress, no Joomla, no plugins, no stylesheets

Using log analyzers to extract information   New in June 2021

© Raeto West 1998, 1999, 2000, 2014, 2021

v. 22 March 2021

HTML tips: simple website HTML with minimal tricks. Almost no Java. No apps. No WordPress


WARNING: These free but 15-year-old computer hints do NOT include packages such as WordPress. They were aimed at people who want to get their ideas onto Internet, and at those wishing to make thorough use of the Net. The HTML still works; people wanting to write HTML might find my comments useful, for example in writing indexes to all their files, so people can find them.
      This was my own notebook: I've found the information, which I've assembled quite carefully, very useful, and some of it indispensable. I've uncompromisingly assumed some experience with computers and Internet hosting companies. My suggestions are somewhat idiosyncratic; they are low-budget; there will be mistakes—apologies for them. No legal responsibility accepted—Rae West

Beginner? Four most important things are: fonts, links, tables, graphics.
[And you'll need some sort of uploader to get your site online: I use and recommend CoffeeCup, which (1) displays your site as it is on your disk or solid state, and the online version, at the same time, with easy uploads and downloads; (2) makes inserting tables, rules, font sizes, unusual characters, fairly easy; (3) prompts for <DIV>, images, and so on; (4) shows how your site appears in different browsers, loading your site into the actual browser; (5) has a search feature within your files, so you can relocate your own contents. All this ... when you've got it set up...
      As with all computer software, it's much more difficult to use than it should be; the designers seem unable to write, unable to explain even simple things. Possibly this is a fossilised habit, laid down when memory and screen space were expensive.   [Unpaid ad inserted Feb 2014, and Mar 2020]

Most sites now use WordPress. I found it difficult to use, and without clear explanations. But my main reason for avoiding it is that it is yet another layer of complication, introducing possible routes for hackers with add-ons which seem to be needed to make it work well. big-lies.org has a few specimens from the BNP site some years ago, which was a WordPress site. They don't display properly now. There is a permanence issue which may be important to you.

[Hosting a site on your own computer, or one of your computers is possible and perhaps worth bearing in mind as an option. If webhost companies come under fire, probably by Jews or their agents, this practice may become widespread. Software would need security against hostile viruses, spam, and 'denial of service' overloads. Feb 2014]

  1. Social Conscience, Informational, and Reforming Websites
  2. HTML? Outline and hints for proud owners of new websites who can't work out what to do next     A bit on Javascript
  3. Internet Browsers. (Explorer? Communicator? AOLPress? Opera? Mosaic? ...)
  4. 'Search Engines': understanding them, popular ones, searching Internet
        Long list of less well-known web searchers
        Metasearchers
        Adding your site(s), general advice
  5. Improving your search engine rankings
  6. How to search files, books, emails and other data on disk for information which you know is somewhere on your hard disk.
  7. What is 'Usenet'? And other e-discussion groups and lists?
  8. Tables and Frames and Pop-up Messages (Alerts)
  9. Graphics
  10. Sound
  11. Creative use of old software
  12. Backups with larger format 'Floppy disks' & writeable CDs
  13. Free Sites. Small Print.     Free software. Free newsletters. News.
  14. Counters
  15. Site Searchers
  16. Security, Downloading, and Software Problems
  17. Domain Names (.com, .org, .net and others)

[Rae West's Intellectual Revisionist Website Home Page          E-mail anything @ big-lies.com if you like this site, or have a technical crit]


1. Social Conscience, Informational, and Reforming Websites

Internet offers something approaching a level playing-field for the spread of information, perhaps for the first time ever. Here are my tips on designing a social conscience, reforming, or informational website.

      Social Conscience/ Reform/ Informative Websites: Some Possibilities

  1. New research: you may have information on disease patterns, injuries, pollution. A well-written article, perhaps with maps or other evidence, could have major impact. A classic pre-Internet example is tetraethyl lead in fuel; another is the suspicion that X-rays of unborn children damaged them. More recent examples could include asthma (where the methodology - fairly simple mapping - may be similar).
  2. Overview science site. Example: the official inquiry into BSE puts daily testimony on the web. But it remains an undigested mass of words - there is still no site outlining the revelations so far. For instance, I've been told that the supposed animal feed link with BSE has been disproved. Unfortunately, nobody, including the mass media, seems to be sorting through and publicising what's been said.
  3. Reviews. Many people would like to read journals or books, but haven't the time. Good reviews, including notes on the contents and fairly detailed quotations, can be valuable. Obviously I don't mean journalistic rubbish. Don't underestimate the time needed for serious work. The same possibility applies to the broadcast media.
  4. Scams, maladministration: you may have information on frauds and scandals. These are often complex and difficult to present, and involve vested interests, so it's doubly unlikely that the conventional media will publish, let alone disentangle them. Internet may be a successful medium for such information - if you can do the work.
  5. Personal experiences: These (so far) are rare on Internet. They may become more common if websites become easier to design. The sort of thing I mean is personal experience with medical, legal, academic, or other 'professional' work which goes wrong. Plenty of potential scope for outrageous libels as well as exposés.
  6. Anti-censorship: Some sites put up pieces which have been censored. They don't necessarily agree with the contents. It might be interesting to try such a site, but, if you try, think out your rules beforehand, then publish and stick to them, so you can't be accused of shifting the goalposts. For example, they should, I'd say, not reveal dangerous information [e.g. how to make weapons]; have some importance [so e.g. a piece on someone's nasty neighbour would be turned down]; have some point or intent [i.e. not be meaningless]; have evidence for every point, and several examples to support any generalisations [to help reduce expressions like 'growing concern over' and 'ever growing tension'].
        Peacefire is an interesting example; it lists site blocking software, such as Net Nanny, Cyber Patrol, BESS, Surfwatch, Smartfilter and so on, and has anecdotes about these products, and includes information on removing them from your computer. They are currently under legal attack by Mattel, of Cyber Patrol, for distributing a program which decrypted Cyber Patrol's banned list of URLs. (Their encryption method has now been changed).
  7. List of links: if you've found, by search engine or email, sites which interest you, you can collect them together and make an interesting site without much more work. Some Kosovo sites are like this.
  8. Paradigm-shifting sites. If you have new ideas or interpretations of AIDS, atom bombs, Darwin, drugs, education, Picasso, sex, Shakespeare, World Wars, or whatever it may be, the Web may be the best place, or even the only practicable place, to air them.
  9. Rediscovery sites of suppressed or little-known facts. (I started my site as a result of my interest in Bertrand Russell, when I found nobody had bothered to put Russell's Vietnam War Crimes Tribunal onto Internet.) As a rather different example, some sites scan in out-of-copyright rationalist books, hoping to oppose religious fanaticisms.
  10. Interviews: a transcribed (or possibly audio) interview (get permission) with someone with new ideas, or with some suppressed information, can be interesting and novel. You might pool questions with other people, anticipate answers so you can requestion, and generally try to prepare carefully. Again, Internet may be the only practicable way to publish such material. Interview with Armstrong, anyone?
What you need to be a Webmaster (or Webmistress)
  1. Emotional and intellectual motivation. And enough time, equipment, money.
  2. A paid-for site is probably best, as ads can contrast absurdly with your material. A domain name is probably good, if you can think of one which you won't want to change.
  3. Some knowledge of computers and HTML: I'd say tables and graphics represent the necessary minimum. Tables allow you to box in your text and pictures, so that they always display in more-or-less the right relative positions. And optimized graphics add information and visual interest.
  4. A statement of the aim of your site may be desirable. Even if some readers don't like your site, at least they will know where you stand.
  5. Material from other people? If other people are willing to work, be grateful! But be prepared for them to have little feeling for evidence, or for how to present it. They may know a lot, but be incapable of explaining what they know. They may think their story is the most important thing in the world, and be unable to co-operate with others. They may withhold crucial portions of evidence.
  6. Be realistic. If you, when you think about it, realize you've never in your life used a search engine to investigate some web topic, is it reasonable to expect millions of people to search for your site? (Don't take claimed figures for web use too seriously, either. Remember Americans are notoriously innumerate and accept bogus statistics of all sorts).
  7. Be accurate—don't make promises you won't keep. For example, if you have an email box, you have an implicit obligation to reply to emails—or at least most emails. It's a sort of insult to ignore emails—especially if you have explicitly stated that you will reply.
  8. Keep your computer functioning properly. Use Scandisk, get a virus detector, keep backups. You may consider developing sites on a PC not connected to a modem, for greater security.
  9. Pay attention to keywords, links, and other search engine tricks. Check search engines and ensure at least some of them show your site.
  10. Have some confidence in your judgement. Remember there's no such thing as a perfect website! If you like humor, put it in—all 'serious' newspapers have cartoons. But, if you think people won't take you seriously, leave humor out. If you like a racy style, use it; but if you prefer a formal style, you may be proved correct. If you decide on long documents (slow loading, but complete), that's fine, if you warn your users. But if you prefer many short files (so each loads fast, but the total loading time is longer) it's your choice. And so on with colors, fonts, and all the other variables.
Watch for these Pitfalls
  1. Aimlessness: try to anticipate the next stage in your site, and the stage after that. Are you aiming for a change in the law? A public investigation? A change in attitude? Anything?
  2. Copyright problems: always ask for copyright permission if reproducing a document or book. (With newspaper articles this is probably not so necessary; some allow you to use pieces provided you include a link to them). If copyright permission is denied, you're still allowed to quote extracts. You may consider you have a moral obligation to put up material; e.g. a publicly-funded investigation into 'Satanic child abuse' was taken off the Web nominally on copyright grounds, but in reality probably because its conclusions made people look incompetent. Are you justified in putting items like this up? I'm not sure. Copyright claims in the US appear to be increasing as a censorship device, or so the story goes.
  3. Evidence Problems: be sure of your ground. You may need to collect clippings or computer files from newspapers and magazines; or videotape TV programs; or audiotape speakers. This can be very tedious, but necessary. Example: Amnesty was believed by one of its founders to have been infiltrated by the British secret services. I have a good quality tape of this being said, several times, by a reputable man who worked for years with the founders, which seems good enough. Anonymous or vague information is more difficult to deal with.
  4. If you quote people, either from written material or orally (perhaps from tape), in my view you should use their exact words. Don't misrepresent or misquote them in any way. (I recently saw an amusing example, where someone, trying to quote some author or other, put 'G-d' in place of 'God'. How can anyone trust a 'quotation' from such a person?)
  5. Libel: this is your concern; I accept absolutely no responsibility. If your evidence is good, you should have few problems at present, as Internet is huge and most legal systems don't want to make themselves look silly by punishing truth. In the UK, libel settlements are much smaller than they were. People may want to avoid giving your site publicity. But all this could change rapidly. Bear in mind you may be asked in the UK to disclose, or 'discover', all your evidence (but then so may they!).
  6. Co-operation failures: big organisations are held together, very firmly, by money, prestige, inertia, economic function, and so on. Opposition is likely to be held together only by fragile threads. So beware of people falling out on trivial pretexts, not doing work they said they would, failing or being unable to make their case, and so on. It happens all the time.
  7. Poor site layout. Try to signpost your site, so people know what's where. Don't have a set of ambiguous or overlapping links, like these - 'News!' 'News About' 'Whats New' 'Index' 'Facts and Figures' 'Guidelines' 'Review' 'More' - where it's uncertain what's meant, or 'Search' without saying what's being searched. And, to help people, signpost for size - in the real world, books, papers, sketches and so on can be instantly assessed; you can see immediately what's involved in reading them. But in the virtual world, a bare web link could lead to one sentence, or hundreds of huge files - you can't tell! In my view, it's worth indicating which links go outside your site (I use a lightning flash symbol, with a left-arrow to hint that surfers might return).
  8. Watch for various types of phoney site and don't make yours too much like them. Examples: some alternative sites are disguised product ad sites (they assume 'alternative' people must be gullible). Some sites are disguised book ads, computer software ads, entertainment ads, or whatever. There are very many social concern sites which are concerned to push some theory, therapy, scare, drug, belief system or whatever. Many sites, e.g. the 'skeptics', and some sites opposing censorship, and ethics sites, are essentially frauds, limiting themselves to unimportant issues and censoring important ones. Examples include FreeRepublic - see Carol Valentine's crit over censorship of Waco. Another is www.grassroots.com, 'Your political action network', in fact run by lawyers, politicians, PR hacks, press agents. A year ago I found a site on 'conspiracies and extremism' by Marc E Fisher also of this type.
        Watch for meaningless awards (though your site might benefit from a few awards—a few nice logos won't do any harm.)
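As a rough sketch of the signposting advice in point 7, a set of links might say what each one leads to, roughly how big it is, and whether it leaves the site. The file names, sizes, and the outside address below are invented for illustration only.

```html
<!-- Hypothetical link list: file names, sizes and the outside address are made up -->
<UL>
<LI><A HREF="transcripts.htm">Trial transcripts, day by day</A> (index page, about 20K; leads to 40 long files)</LI>
<LI><A HREF="summary.htm">One-page summary of the case</A> (about 5K, text only)</LI>
<LI><A HREF="http://www.example.com/">Background reading</A> (outside this site; use your browser's Back button to return)</LI>
</UL>
```

The point is only that each link carries a plain description and a size hint; the exact wording is a matter of taste.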
Learn from these Website Examples
  1. The McSpotlight site was pathbreaking in several ways: the David vs Goliath aspect, the direct putting onto the Web of court transcripts, and the co-operative work involved. But it's badly designed; I've made many attempts, all unsuccessful, to find certain points of interest in their trial transcripts.
  2. Carol Valentine's well-researched and designed Waco site is a one-person site, run on a shoestring, based on well-scanned-in newspaper stories and pictures, and a lot of further research.
  3. This is a co-written site of mine on private hospitals in the UK, the object being to reveal limitations in British law and help change things. It's one long file, a 30-page printout. Every statement is referenced. It exploits the passivity of Internet: seven days a week, journalists can look and plagiarise bits. The alternative is photocopying piles of paper, highlighting bits, scribbling a covering letter, putting in envelopes...
  4. Leading Swedish site on war and peace, Transnational Foundation (TFF) was the first to publish details on the Rambouillet 'Accords'. Standardized site colours and serious approach.
  5. This AIDS site is unfortunately badly designed, in my view. Graphics make it slow-loading, and information is hard to find—all you see is a batch of links, with no hint as to where they lead—the webmaster knows, but isn't telling!
  6. Websitesthatsuck (which I saw in a newspaper promotion) in my view is another terrible site; it's misleadingly titled, being more of a collection of techniques the author doesn't like, and has concealed ads etc.
        (Note too the horrific uninformativeness of much commercial software; I've just been looking at Presto!—nowhere does it say what it's supposed to do, or what an 'inbox' is, or what an 'application' is. And Acrobat, which says it's a 'handy tool for viewing PDF files'. They've spent years on their programs, but as a result they're unable to realise that other people haven't. Read this: 'The import/export wizard allows you to easily import and export information from Internet Explorer like Favorites and Cookies to other applications or a file from your Computer.' Or this: 'Definition term specifies a definition in definition within a definition list'. Surely you can do better than this!)
  7. Put your URL somewhere on your site! (See the end of this piece, for example). Let me illustrate why with an example: a web article, somewhere, claims the famous story of morphine addiction after the American Civil War is untrue. BUT, despite having a printout, I've never been able to relocate this site, which has other, similar, interesting articles on it. If they'd put their URL I could have recovered it from my stored copy.

2. HTML? Outline and hints for proud owners of new websites who can't work out what to do next

  1. Why learn the basics of HTML? Because it can be useful, not just to help design your sites, but for such purposes as editing other people's sites so you can print them out in a format you like, or refer to them more easily.
  2. HTML means 'hypertext markup language'. All this means is that (i) when <HTML> is found, the browser is programmed to treat what follows as a website; (ii) it can be sent in effect as e-mail. (If you're interested in examining the standards for HTML, try World Wide Web Consortium Standards, but be prepared for a very long session with rather unhelpful material. Document types, style sheets etc represent attempts to standardise HTML which in my view won't seriously come into force for a long time).
  3. And be aware that, because HTML started in a ramshackle and not properly thought out way, much professional website material is written including other languages, notably Javascript, so much so that HTML alone isn't adequate to get the most polished effects.
  4. A 'Website' is a collection of linked computer files: one of these must be named index.htm or index.html or possibly default.htm. One file is the minimum size of website: but there can be any number of HTML files, and graphics files, linked to the index. You will start to see why it's called the 'web'.
        If you want a more or less secret file, perhaps for other people to download, or for you to test, you can upload files which aren't linked into your website. (But if you instal a site searcher, the unlinked files may show up on that).
        HTML browsers take any file ending in .HTM or .HTML and convert it into the screen display with specific fonts, colours, and the rest. .GIF and .JPG files are converted into pictures, each browser using its own conversion, so results may differ a bit. Any other file, e.g. a .ZIP file or .EXE file, triggers the downloading sequence - you'll see a box asking you where to download the file. If files are wrongly named, results may vary - for example HTML files without the .HTM or .HTML won't load properly with Netscape, though they are OK with Explorer.
        If you're beginning, make a subdirectory - for example C:\MY-SITE - and put all your website files into it. (You might have C:\MY-TEST to try things out). You can use your browser to load the index file from this directory, and the result should be exactly the same as if downloaded from Internet, except that it will be much faster. This is called 'browsing offline'. Use your ftp (see below) to put your site onto Internet where others can get access to it.
        If you have contrasting sets of material on your site, consider the site as made of 'subsites' (for want of another word). Subsites can be treated as separate for many purposes, for example, entering them on search engines.
  5. The simplest possible HTML for beginners is
    <HTML>
    Hello!
  6. The simplest structure of a useful HTML file is something like this:
    <HTML>
    <HEAD>
    <TITLE>Brief title of your site - about ten words before the end-of-line cuts it off; say 60 characters maximum. All search engines display this.</TITLE>
    <META NAME="DESCRIPTION" CONTENT="Description of your site; about 25 or 30 words, say 150 characters maximum. Many search engines display this.">
    <META NAME="KEYWORDS" LANG="en" CONTENT="List of keywords separated by commas. English language, or whatever, optional. Maximum of 800 characters or so">
    </HEAD>
    <BODY BGCOLOR="color" TEXT="color" BACKGROUND="image.jpg" LINK="blue" ALINK="red" VLINK="purple">

    ALL YOUR STUFF TO SHOW ON THE SCREEN.
    The first words may be displayed by some search engines, so choose them carefully.

    <!-- this structure is for comments; browsers don't show anything in these brackets -->
    </BODY>
    </HTML>
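To show how the skeleton might look once filled in, here is a small index file which also links to two other pages and a picture, so that it forms a tiny 'web' of files. This is only a sketch: the society, the file names (maps.htm, photos.htm, church.gif) and the picture dimensions are all invented, and the files are assumed to sit in the same directory as index.htm.

```html
<HTML>
<HEAD>
<TITLE>Hypothetical Village History Society - Home Page</TITLE>
<META NAME="DESCRIPTION" CONTENT="Local history notes, old maps and photographs, collected by an imaginary village society.">
<META NAME="KEYWORDS" CONTENT="local history, old maps, photographs, village">
</HEAD>
<BODY BGCOLOR="white" TEXT="black" LINK="blue" ALINK="red" VLINK="purple">

<H1>Village History Society</H1>
<P>Notes, maps and photographs on the history of our village.</P>

<!-- Links to two other files kept in the same directory as index.htm -->
<P><A HREF="maps.htm">Old maps</A> | <A HREF="photos.htm">Photographs</A></P>

<!-- A picture file, also in the same directory; the sizes are invented -->
<IMG SRC="church.gif" ALT="Drawing of the village church" WIDTH="200" HEIGHT="150">

</BODY>
</HTML>
```

Saved as index.htm in a directory such as C:\MY-SITE, a file like this can be opened offline in a browser before anything is uploaded.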

  7. Everything in brackets of this sort: < > should be a code with a specific meaning for the browser. If a browser doesn't recognise it (for example, a typo like <HTLM>) it will be ignored, a design decision probably necessary since otherwise every new feature would give error messages in old browsers. Upper and lower case can be jumbled together in these commands - they'll be recognised just the same.
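A small test file along these lines can be loaded offline to see both behaviours for yourself. The tag SPARKLE is made up for the purpose; a browser should normally skip it and simply show the text inside.

```html
<HTML>
<body>
<!-- Mixed upper and lower case: all of these should be recognised -->
<B>Bold</B>, <b>also bold</b>, <I>italic</I>, <i>also italic</i>.
<BR>
<!-- SPARKLE is an invented tag; browsers normally ignore it and show the enclosed text plainly -->
<SPARKLE>This text should appear as ordinary text.</SPARKLE>
</BODY>
</html>
```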
  8. Programs to help you put together your site. I'm assuming here a low-budget approach. This rules out HTML composer programs such as HotDog, HoTMetaL (includes HTML - gettit?), Dreamweaver, Drumbeat, excellent though they may be, except for free ones, for example Netscape's Composer, 1stPage 2000 and AOL's AOLPress, which does its best to combine word processing with HTML display. AOL also has Easy Designer, which I haven't tested. Another free HTML editor (which I haven't tried either) is BBEdit Lite. Claris has a good system (for Mac only). I haven't personally bothered much with Microsoft's FrontPage, biased, perhaps, by the fact that too many newcomers can't use it and that it can be hard to find. I haven't looked into Sitebuilder.
          In my view, you must have a feel for HTML, because all the composer programs ask interminable unhelpful questions about (for example) tables, rows, columns, cells, borders, and so on, which mean nothing unless you already know what they're talking about. I personally use an obsolete version of HoTMetaL just to define complicated tables, as I can't be bothered to work out the rowspan and colspan parameters.
          Another freebie is Arachnophilia, which converts (for example) Word documents into HTML, though it isn't sophisticated enough to distinguish opening and closing quotes; it can be used to add meta tags, such as title and description and keywords, as it has a small editing window. I haven't tried Word 7's Generator - judging by sites I've looked at, it's fine for simplish sites, but then gets more difficult. Netscape's Composer (click Communicator, then Composer, to find it) tends to introduce thousands of extra characters. Both it and AOLPress are useful for inspecting other people's sites, in particular to show up the way tables are nested. 1stPage 2000 is a free Australian web tool with an all-purpose intention; it's 5 Mb and downloads ads. Amaya is another free web authoring package (though I couldn't get it to run, and it did things to one of my hard disks). The simplest web authoring tools, like the free HTML Edit, don't do much beyond inserting the tags, e.g. <B></B>, and positioning the cursor waiting for you.
          Some sites provide skeleton websites, or cookie cutters, into which you insert your own information—life, details, interests, pictures. Xoom for example does this; they have CDs with templates which you edit. I haven't tried this, but suspect it's not as easy as it sounds—your text may be a lot longer than their specimen, and your pictures different sizes, for example. Nevertheless if you can't face the effort of trying to get frames or tables to work this may be a good option.
          (Click here for my free stuff section).
  9. There's no easy way to learn HTML. In my opinion, most of the books are badly written and hard to follow. The online guides tend to be not much help, because they give precedence to ads and have endless tiny bits of information—Webmonkey is an example. Start simple, work up, and don't be too impressed by the fancy stuff—sites run slowly if they're overfreighted with frames, graphics, background patterns, Javascript and sounds; it's possible to do without. There are sites listing features of HTML including for example style sheets and much technical data. It seems unlikely that such things will become standard for a long time.
  10. One way to get the feel of the way ordinary text is converted into fancy typefaces in colour is to look at the way it's done in a site you like. Click on 'view' and then 'source' which you'll find somewhere, to see the 'source code' of what you're looking at with your browser - (how difficult it is to write about these things without jargon!) In this way you can print out a site and learn from it.
  11. You can save other people's websites on your own disk; one individual page can be saved by clicking on 'file' then 'save as' and putting something like c:\file-to-be-saved.html in the file name box. Later, you can browse offline - save phone bills! - by clicking on 'file' then 'open' then selecting c:\file-to-be-saved.html. You will probably have to save pictures too (right click on them - they'll be saved with their original name unless you change this). Entire websites can be downloaded, or 'grabbed', but this is usually not quite as simple as it sounds, since the directory structure has to be copied. Even so, it's an easy thing to program, and it's surprising to find that many site grabbers are hard to use. (It's worth checking grabbed sites by putting them on a disk and trying a different computer, if you can, because cached files and images may be recovered by your browser; when they're eventually wiped, your grabbed site may look feebler or not work properly. It's also worth picking a time of day when traffic is light; GMT 10 am is good. Note also that most sites have large numbers of smallish files, which may take up more hard disk space than you expect, because disk space is handed out in fixed-size blocks, often 32K, so even a tiny file occupies a whole block).
  12. You can load, edit, and write websites using any word processor, but only in 'text' mode. The older, non-HTML versions of 'Word', for example, are perfectly good - but select and save as 'text' type file, which strips away formatting characters. Otherwise you'll include all sorts of characters which you don't want.
  13. To move your HTML files from your disk to your website, you'll need an ftp: that is, a file transfer program, which uses the standard 'file transfer protocol' (FTP). It's the same sort of thing as getting online, except that it needs your password, and is specifically arranged with directories of files in two (or more) windows, from which you can choose such files as you wish to move around, get rid of, or whatever.
  14. The first file on your site, the 'home page', or index, is, by convention, always named index.htm or index.html. If such a file isn't present, the server will usually list your files by name in a non-HTML format. Note that (as far as I know) the file name has to be in lower-case only; you may be able to set your ftp to force lower-case filenames to take care of this potential problem, which can be a source of rather baffling bugs in websites. Bigwig software has a short, free program to force your website file-names into lower-case (and, if you like, it will go through your website and convert the links to lower-case too).
  15. Put your web address (URL='Uniform Resource Locator', beginning, to be safe, http://) somewhere on your site(s). Then, if people print it out, or rediscover it stored in their computer, they'll be able to relocate your site without having to use a search engine.
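One simple way to do this is a small footer at the bottom of each page. The address below is a placeholder, not a real site; substitute your own.

```html
<!-- Placed just before </BODY>; the address is a placeholder, so replace it with your own -->
<HR>
<P><FONT SIZE="2">This page is at <A HREF="http://www.example.org/index.html">http://www.example.org/index.html</A></FONT></P>
```

Because the address appears in the visible text, and not only inside the link code, it survives on printouts and in saved copies.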
  16. Write a clear title in your header, as almost all search engines display this. If you can, squeeze in important keywords. Search engines often follow the title with the description, so try to make the two read smoothly together. Typically these are about ten words and about thirty, so they will take up about one line and several lines respectively.
  17. Minimise use of underlined text. Underlining is the commonest way to show there's a link, so readers may take your underlined words for links and try to click on them.
  18. Avoid flashing text (usually). (The <BLINK> command has been discontinued by Microsoft, but is supported by Netscape). You might try occasionally <MARQUEE> sidescrolling text </MARQUEE> which only works with Microsoft's Explorer. (You can try both—with luck your text will blink or sidescroll, depending on the browser).
  19. A new command is Span Style, like this: <span style="background-color:yellow">text</span>.
  20. 'Headlines' are coded as <H1> through <H6>. H1 is the biggest. This convention is opposite to that for font size, where <FONT SIZE=1> is smallest. Don't worry too much about font sizes; many browsers now allow the font size to be adjusted up or down.
  21. <BR> means return, break, or something like that. It moves to the next line. For a long time, more than one of these was treated just as one, but newer browsers seem to no longer do this. <P> means a paragraph - this inserts two breaks, so the effect is to insert one empty line.
  22. <P align="justify"> makes your text tidy on both margins—as in this paragraph. You may get odd spacing effects, if you have short lines or long words. <P align="center"> centres the subsequent text like a tree, with a ragged left margin reflecting the ragged right margin. <P align="right"> gives a justified right margin; sometimes useful in a column of words to the right of a page, but not generally much use with a language written left-to-right, such as English.
        (There's also a <DIV> command, but I've never been able to work out what it does.)
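The headline, font size, break, and alignment codes from the last few points can be tried together in a fragment like this, pasted between <BODY> and </BODY>. It is only a sketch with made-up text, and the exact sizes and spacing will depend on the browser.

```html
<!-- Headlines: H1 is the largest -->
<H1>Largest headline</H1>
<H3>A middling headline</H3>

<!-- Font sizes run the other way: 1 is the smallest -->
<FONT SIZE="1">Smallest font size</FONT><BR>
<FONT SIZE="5">A large font size</FONT><BR>

<!-- Paragraph alignment -->
<P ALIGN="justify">This paragraph is justified, so both margins are tidy. It needs to run to several lines before the effect shows, so add more text when trying it out.</P>
<P ALIGN="center">Centred text</P>
<P ALIGN="right">Text against the right margin</P>
```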

  23. The two main types of font (the UK word is, or was, fount) are those with serifs - the twiddly bits which have been found by long experience to make reading easier - and sans serif faces. For normal reading, serifs are best; but with the limitations imposed by the dots making computer screens, sans serif is often better, particularly for small print. This is why search engines use them. However, be aware that your instructions can be over-ridden: you can't be sure the font you'd like is loaded into the viewer computer, or, if it is, that it will be actually displayed. Times-like faces are usually set as the browser default, so, for sans serif text, use something like this:
        <FONT FACE="Arial,Verdana,Helvetica,Geneva,sans-serif"> which should work for most browsers. They work through the list until they find one they recognise, so the general-purpose sans-serif goes last as a catch-all. Geneva is for the Mac.
        'System' is a nicely compact bold PC sanserif face, and might be tried if you have lots of text.
        <FONT FACE="Courier,Courier New,monospace"> is typewriter-style text, not proportionally spaced. This gives an old-style appearance. You can use it, invisibly, with non-break spaces (&nbsp;) to mark out fixed-width spaces for paragraphs or tables.
        <FONT FACE="Times,Times New Roman,MS Serif,Palatino"> forces an ordinary serif font.

        <FONT FACE="comic sans, comic sans ms,hobo"> gives a decorative sans serif, rather like clear, straightforward handwriting.

        <FONT FACE="script,brush script mt, monotype corsiva,matura mt script capitals"> has a sporting chance of generating 'handwriting'. So does 'Lucida Handwriting', another handwriting font.
        <FONT FACE="book antiqua,bookman old style"> is a font with rounded serifs.

        <FONT FACE="wingdings"> gives lettering-sized pictures. Or perhaps pictograms is a better word.
  24. Thus <b>bold <i>italic <u>underline <s>strikeout</s></u></i></b> looks like: bold italic underline strikeout.
  25. (This section is only relevant to people who, like me, assemble their HTML directly, and don't use an elaborate program). The font commands are supposed to be nested, so that <FONT something> is cancelled by </FONT>. (This system is also called a 'stack': you put a command on the stack, then, if you remove it, the previous contents take over.) However, bold, italic, strikeout, underline and heading codes, <B>, <I>, <S>, <U>, <H1> and so on, also affect fonts, so a stray </B> can cause odd effects, and, being incorrect HTML, probably shows differently on different browsers. It's probably best when putting together bits of HTML by hand to nest heading codes inside font changes, to be on the safe side. There are similar oddities, generally, I suspect, traceable to HTML being defined rather laxly. For example, this bit of a list </LI><BR><LI> displays differently from <BR></LI><LI> with some browsers, although you might reasonably expect each to put a spare line between the two list elements.
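        A minimal sketch of the nesting rule, with made-up wording: the first line closes its tags in the reverse of the order they were opened; the second overlaps them, which is the kind of slip that may display differently from browser to browser.

```html
<!-- Correctly nested: last opened, first closed -->
<FONT FACE="Arial" SIZE=4><B>Bold <I>bold italic</I> bold again</B></FONT>

<!-- Incorrectly nested: </B> arrives before </I>; avoid this -->
<FONT FACE="Arial" SIZE=4><B>Bold <I>bold italic</B> italic?</I></FONT>
```

        If a page looks odd after hand-editing, counting the opening and closing tags in the offending paragraph is usually the quickest check.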
  26. If you'd like to know the inside info on an unusual font (for example, Greek letters, or wingdings - perhaps you want a small telephone image), write this simple program, with a word processor to help with the repetitive bits:
        <HTML><FONT FACE="wingdings">&#000;=0 &#001;=1 &#002;=2 up to &#255;=255
        And save as test.htm or something similar. The &#nnn; construction (ampersand, hash, a number, then a semicolon) is the HTML convention for a special character given by its number; with this HTML, characters of values 0 through 255 will be displayed, followed helpfully by their number, if you look at the file with a browser. You'll get a list like this: ... Ô=212 ...
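        As a sketch, a cut-down version of such a test file might look like the following. Only four codes are shown (the choice of 40 to 43 is arbitrary); the full file would repeat the pattern for every value. Closing </FONT> before each number is an adjustment assumed here, so that the numbers themselves stay in the ordinary font; what each code displays as depends on the font installed.

```html
<HTML>
<BODY>
<!-- Each entry: one character drawn in Wingdings, then its number in the default font -->
<FONT FACE="wingdings">&#040;</FONT>=40
<FONT FACE="wingdings">&#041;</FONT>=41
<FONT FACE="wingdings">&#042;</FONT>=42
<FONT FACE="wingdings">&#043;</FONT>=43
</BODY>
</HTML>
```

        Changing the FACE name lets the same file be reused to inspect any other font.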
  27. Accents: this table illustrates the standard way to get the common ones:-
    For Â put &Acirc; in your HTML file.
    For ç put &ccedil; in your HTML file.
    For é put &eacute; in your HTML file.
    For ò put &ograve; in your HTML file.
    For Ü put &Uuml; in your HTML file.
  28. The complete list of colors recognised by Microsoft's Internet Explorer (I haven't checked Navigator!) is:
        aliceblue, antiquewhite, aqua, aquamarine, azure, beige, bisque, black, blanchedalmond, blue, blueviolet, brown, burlywood, cadetblue, chartreuse, chocolate, coral, cornflowerblue, cornsilk, crimson, cyan, darkblue, darkcyan, darkgoldenrod, darkgray, darkgreen, darkkhaki, darkmagenta, darkolivegreen, darkorange, darkorchid, darkred, darksalmon, darkseagreen, darkslateblue, darkslategray, darkturquoise, darkviolet, deeppink, deepskyblue, dimgray, dodgerblue, firebrick, floralwhite, forestgreen, fuchsia, gainsboro, ghostwhite, gold, goldenrod, gray, green, greenyellow, honeydew, hotpink, indianred, indigo, ivory, khaki, lavender, lavenderblush, lawngreen, lemonchiffon, lightblue, lightcoral, lightcyan, lightgoldenrodyellow, lightgreen, lightgrey, lightpink, lightsalmon, lightseagreen, lightskyblue, lightslategray, lightsteelblue, lightyellow, lime, limegreen, linen, magenta, maroon, mediumaquamarine, mediumblue, mediumorchid, mediumpurple, mediumseagreen, mediumslateblue, mediumspringgreen, mediumturquoise, mediumvioletred, midnightblue, mintcream, mistyrose, moccasin, navajowhite, navy, oldlace, olive, olivedrab, orange, orangered, orchid, palegoldenrod, palegreen, paleturquoise, palevioletred, papayawhip, peachpuff, peru, pink, plum, powderblue, purple, red, rosybrown, royalblue, saddlebrown, salmon, sandybrown, seagreen, seashell, sienna, silver, skyblue, slateblue, slategray, snow, springgreen, steelblue, tan, teal, thistle, tomato, turquoise, violet, wheat, white, whitesmoke, yellow, yellowgreen.
        So why not master HTML's hexadecimal RGB (red-green-blue) colour notation? A colour is represented by three bytes, each from 0 to 255, or, 00 to FF in hex, where the number system is extended and A represents 10, B 11, ..., F 15, so 0 to 255 becomes 00 through FF. The higher the number, the lighter. This is additive colour, so you're in the strange world where, for example, red plus green gives yellow, and yellow plus blue gives white. Thus #FF0000 is pure red; #00FFFF is saturated blue-green; #000000 is black; and #FFFFF0 is near-white, with the blue reduced a little—a light beige.
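        A few lines to try out the notation, as a rough sketch; the wording is invented, and the colours are the ones worked through above plus yellow.

```html
<!-- Each colour is #RRGGBB: red, green and blue, two hex digits apiece -->
<FONT COLOR="#FF0000">Pure red: red FF, green 00, blue 00</FONT><BR>
<FONT COLOR="#FFFF00">Yellow: red plus green</FONT><BR>
<FONT COLOR="#00FFFF">Blue-green: green plus blue</FONT><BR>
<SPAN STYLE="background-color:#FFFFF0">Near-white background: blue reduced from FF to F0</SPAN>
```

        Changing one pair of digits at a time and reloading the page is a quick way to get a feel for how the three components mix.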
  29. Text in the same colour as its background is automatically treated as 'spam' by some search engines, and the site left unindexed. This was also true of very tiny text, but, since browsers now allow people to change text size, this mayn't be true any longer, though it may be good practice not to have enormous variations in font size.
  30. Link colors are set by this sort of statement:
        <BODY LINK="purple" ALINK="red" VLINK="blue">. LINK is the color before the link's been clicked, and VLINK after it's been visited, in case the link was so unmemorable you forget you've been there. ALINK means active link—it shows you've clicked, in case there's a delay going to the relevant place, and this can be valuable feedback, since without it a user may wonder whether anything's happening. There's much to be said for sticking with the well-established blue unvisited/ purple visited convention.
  31. Everything in double quotes is treated as case sensitive, although, confusingly, this is overridden with e.g. names of colors, or commands like WIDTH.
  32. HTML's standard formulation for links (i.e. click on this to go somewhere else) is
        <A HREF="file name.htm">Underlined message</A>. (This is displayed as Underlined message by the browser, which puts the underlining and colour in automatically, unless told not to). I think the 'A' is probably meant to mean 'anchor'. When clicked, the file will be loaded and displayed in place of the present one. Note that the file name must be exactly correct. That's if it's in the same subdirectory of your site; if it isn't you'll need a formulation like <A HREF="../subdirectory/file name.htm">Underlined message</A>. Or, if it's a link to someone else's site, you'll need this sort of thing:
        <A HREF="http://www.provider.name.htm">Underlined message</A>.

        To allow clicking to move inside a file, rather than having to scroll, two parts are needed: something like this: <A HREF="#label">Underlined message</A>. And the indication of where to move to: <A NAME="label"></a>. (To allow clicking to move somewhere inside a different file, use the filename followed by #label, something like name.htm#label). The label must match exactly - it is case sensitive. If the browser can't find it, nothing will happen. The computer looks for the A NAME label from the top down, so if you have two identical labels, by mistake, it'll only find the first. And if the file isn't fully loaded, it won't find the remoter internal links.

          (I'm sorry if this seems complicated.. it's really not as bad as all that. The logic is, you have to tell it, unambiguously, where to go, so some sort of clear label will be needed both where you click and where you arrive.)
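          The two halves of an internal link might be put together like this; a sketch only, in which the label 'fonts' and the filename tips.htm are invented for illustration.

```html
<!-- Near the top of the file: a link that jumps to the label below -->
<A HREF="#fonts">Jump to the notes on fonts</A>

<!-- Further down the same file: the target. The label must match exactly, including case -->
<A NAME="fonts"></A>
<H2>Notes on fonts</H2>

<!-- From a different file in the same directory: filename followed by #label -->
<A HREF="tips.htm#fonts">Notes on fonts, in tips.htm</A>
```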

          Links without underlining, with color change. Relatively easy effects (not using MOUSEOVER) can be got, at least with Microsoft Internet Explorer. Examples: in the <HEAD> section put
    <style type="text/css">a{text-decoration:none;}</style>
    <style type="text/css">a:hover{color:yellow;}</style>
    <style type="text/css">a{cursor:crosshair;}</style>

  33. Graphics are included in HTML with the standard coding:
        <IMG SRC="filename of picture"> and the recommended, more elaborate, form, e.g.: <IMG SRC="filename of picture" WIDTH=100 HEIGHT=200 ALT="Information about the picture" ALIGN="right" HSPACE=10 VSPACE=10>. The width and height parameters help the program format its data quickly (without them, the screen may stay blank until it gets round to loading images, which is the first time it knows how big they are). If you can't remember the picture's width and height, look at it with your image processor and get the figures from 'image information' or by clicking on 'resize' or 'resample', recording the figures, and then cancelling. The 'Alt' text is displayed if the browser is set not to load images. Some search engines seem to use 'Alt' information, so it's worth making it reinforce the keywords. This example uses ALIGN to position the picture over to the right; there'll be a space of 10 pixels around it, so the words which automatically are arranged around it have an elegant tidy little margin.
        HTML only recognises .GIF and .JPG files ('Graphics Interchange Format' and 'Joint Photographic Experts Group'. See below). Each browser unpacks its pictures itself; so you cannot assume that pictures will look identical in different browsers, or with different monitors, unfortunately.
        You may see the NOSAVE option recommended for your img src definition, to prevent people casually copying your pictures. It isn't part of standard HTML, so don't rely on it.
  34. The HTML specification states that if a space character is found, all space characters immediately following it are ignored, so a series of spaces always displays just as one space. (This was probably to allow the layout of HTML to be tidily arranged on the page.) This means there's no easy way to get the effect of a tab, i.e. six or eight spaces. Use the non-breaking space (&nbsp;) or shifted space (&#160;) to force subsequent spaces. (Confusingly, at least one browser allows spaces not to be collapsed, if you so choose).
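        A small sketch of the effect, with invented text:

```html
<!-- These two lines display identically: the run of spaces collapses to one -->
Name:        Value<BR>
Name: Value<BR>

<!-- Non-breaking spaces are not collapsed, so this keeps a gap about six spaces wide -->
Name:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Value<BR>
```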
  35. See below, in the section on search engines, for an explanation of adding your site to search engines. If your site is controversial, it may disappear from search engines.
  36. Bear in mind that people's computers differ: with present HTML, it is in fact impossible to ensure the effect you want will display.
        (i) The fonts may differ - the font you specify mayn't be on their computer, and the default font size settings on browsers differ. Explorer has a 'soften fonts' option, which, with tiny typefaces, replaces black/white dots with grey, giving a more rounded effect. (ii) Their browsers may differ - there are several Netscape versions (e.g. Navigator and its update Communicator), several Microsoft versions (distinguished by version number), and others, such as AOLPRESS, Mosaic, Opera, and the freeware Neoplanet. (iii) Their screens may be set up differently, notably as regards the resolution: standard are 640 wide x 480, 800 x 600, 1024 x 768, and 1280 by 1024.
        [Important note: you can make the process of changing resolutions easier in Windows 98 by making the screen-setting icon—blue TV screen with strange bits on it—show on the task bar. All you need to do is click Start | Settings | Control Panel | Display | Settings | Advanced | General, and check the little box! Also click 'Ask me before applying', or you may unexpectedly be reset. This makes it easy to examine the effect of different resolutions and types of colour ('high color' and so on).]
        The most common setting is 800 dots wide by 600 high. A website designed for this size will look oversized and expanded when displayed on a 640 by 480 monitor. (Some sites put all their text in a centred table, usually invisible (i.e. with a border of width 0), set to width 640, so screens set to higher resolution get blank upright bands on either side.) (iv) There may be differences in the colour rendering, both in the sense that the number of colours may be fixed low at 256, or some monitors give a dim picture. (And some people may still have black and white). (v) A site may be displayed within a window, which of course will squash the wording together, the graphics remaining the same size. (vi) Some users have WebTV, i.e. a box which uses their TV as a display. Unfortunately TVs, US in particular, don't have as high a resolution as monitors. And as yet they aren't popular—a figure I've seen is fewer than a million users. But look at Web TV developers' site if you'd like to download their software, which is supposed to simulate Web TV on your monitor. So you can have some idea of the appearance. They state their (huge) software is so far only at the test stage. (vii) For that matter, they may use a palmtop...
          Some professional sites set the screen width to 800 pixels, and fill the left-hand 640 pixel column with the most important material. Screens set to 800x600 will display the rightmost narrow column, while users set to 640x480 have to scroll right to see it. This seems to be about the most generally useful compromise, though of course it's a nuisance, a relic of the looseness of the definition of HTML. The 'View' options which normally allow font sizes to be changed can be overridden with Javascript. (It's possible to copy the relevant bits of it).
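          The centred, invisible table mentioned under (iii) might be sketched like this. The figure 640 is from the text; the zero cell padding and spacing are assumptions made here to keep the example plain.

```html
<!-- A centred, borderless table fixed at 640 pixels wide -->
<TABLE WIDTH=640 BORDER=0 ALIGN="center" CELLPADDING=0 CELLSPACING=0>
<TR><TD>
All the page's text and pictures go here. On an 800-wide screen,
blank bands appear on either side of this column.
</TD></TR>
</TABLE>
```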
  37. Websnapshot provides statistics on what equipment people are using on Internet, and how they search etc. Compiled from '100% random sample'. Probably worth poking around in, from time to time. Doesn't tell you general figures for magnitude of web traffic.
        And Cool home pages might be worth a look for design ideas.
  38. Lists are an easy way to introduce some order; the numbered sections you're reading now are part of an 'ordered list' (this means numbers are automatically put in; inadvisable if you change your data and wish to refer to paragraph numbers). The HTML code is simple: <OL><LI> First item </LI><LI> Second item </LI> ... </OL>.
        An 'unordered list' is signalled on by <UL> and off by </UL>. This lists each item next to a 'bullet'.
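        Both kinds of list side by side, as a short sketch with placeholder items:

```html
<!-- Ordered list: the browser supplies the numbers -->
<OL>
<LI>First item</LI>
<LI>Second item</LI>
</OL>

<!-- Unordered list: the browser supplies a bullet for each item -->
<UL>
<LI>A bulleted item</LI>
<LI>Another bulleted item</LI>
</UL>
```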
  39. Tables—rectangular arrangements of data—and Frames—where the screen appears to be cut into independently-scrolling areas—are dealt with in a separate section, below, because of their importance.
  40. Forms can be designed, up to a point, with HTML. I haven't personally bothered much with this. Usually javascript is better. (Web-o-Rama however has some automated HTML help for HTML forms).
  41. If you wish to receive e-mails from people browsing your site, this is the way to do it:
        <A HREF="MAILTO:your.name@your.email.address">Message here - e.g. 'Click to e-mail'</A>. Note: only put this if you're prepared to answer e-mails, at least most of the time, from people who are able to put sentences together. Some sites—for example, in my experience, James Randi's—never seem to respond to messages. The same construction with someone else's e-mail address allows people browsing to send e-mails to someone else. (A comma separator allows both—people browsing your website can send e-mails, e.g. to a newspaper site, and also send a copy to you).
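        For instance (a sketch only; both addresses are made-up placeholders on example domains):

```html
<!-- One recipient -->
<A HREF="mailto:webmaster@example.com">Click to e-mail the webmaster</A>

<!-- Two recipients, separated by a comma -->
<A HREF="mailto:letters@example.org,webmaster@example.com">E-mail the newspaper, with a copy to me</A>
```

        What happens on clicking depends on the visitor having a mail program set up; if there isn't one, the link may do nothing useful.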
  42. Make your site user-friendly: at least describe what it's about and what you get if you click on links. Imagine someone's finger hesitating on the mouse; what incentive is there to click? You know what's there, but they don't. Unlike a book or magazine, it isn't obvious how long or complicated a piece of work is likely to be.
  43. Navigation. A site of any size faces the problem that users may get lost. The easiest solution is to have frequent links back to the home page, as the entry or opening page is usually called. If this page is well-designed, people will be able to continue without problems.
        Another technique is to use a separate frame containing the contents, as in the left hand side of this screen (if you'd selected this feature). But this is complicated; see my discussion on frames.
        Easier is to include a list of links at the top of files—not all, but the principal menu files. This sort of thing: [Home Page] [Chapter 1] [Chapter 2]
        Slightly different is the so-called 'breadcrumb trail' which works if your site is structured hierarchically in some fairly clear way. This is used by search engines of the Yahoo! type. At the top of some files you put this sort of thing:
    Home Page --> Authors --> Steinbeck
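        In HTML, the link bar and the breadcrumb trail might be written roughly as follows. The filenames index.htm, chapter1.htm, chapter2.htm and authors.htm are invented for the sketch.

```html
<!-- Link bar for the top of the principal menu files -->
[<A HREF="index.htm">Home Page</A>]
[<A HREF="chapter1.htm">Chapter 1</A>]
[<A HREF="chapter2.htm">Chapter 2</A>]

<!-- Breadcrumb trail: each arrow is two hyphens followed by &gt; -->
<A HREF="index.htm">Home Page</A> --&gt;
<A HREF="authors.htm">Authors</A> --&gt;
Steinbeck
```

        The last item in the trail is left unlinked, since it names the page the reader is already on.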
  44. You can cut down the size of your files by removing redundant spaces and carriage return characters and tabs; for example, with Word, replace hard return characters and tabs by [space] (not by nothing, or the words at line ends run together); then perform search-and-replace, a number of times, replacing [space][space] by [space]. Leave alone anything inside <PRE>, where the spacing matters. The resulting file may be 20 or 30% shorter than the original - which means it will load faster - but will look just the same on a browser.
          See below, under graphics hints, for ways to cut down the size of graphics files - many sites are almost useless because of the long delays in downloading their colossal picture files.
          On the other hand, don't overdo this. There's a site somewhere on Internet with a competition for 5K-maximum sites. As you'll see, there isn't much anyone can do with such a limitation.
  45. Javascript, despite the name, is a separate language from Java, though the syntax looks similar. (Technically, it's an interpreted language, processed by your browser. Javascript has a drawback in common with all interpreted languages, namely that bugs can remain in them undetected until they're run.) In my view the justifiable uses are for (i) drop-down menus, (which have the advantage of reinforcing keywords), (ii) passwords, (iii) to input name/ address/ interests/ other information. These latter can be processed with the FORM command, if you can work out how to use it. (iv) Some graphics processing, for example dissolving pictures in sequence.
          Sideways scrolling messages and similar things are (arguably) too wasteful of processing time. Javascript Source is one site with downloadable free bits of Javascript. To add to the fun, they contain some errors. Another site is Infohiway, which includes browser compatibility information.
          What is, or was, a popular on-line introduction is by Stefan Koch who says he's written a book on the subject: (Voodoo's Introduction, which downloads as a ZIP file. My version didn't seem to be linked properly; I had to edit it.) This has examples built into it. As with most books on computing, you're not really told what the point of the thing is, just given many examples.
  46. Checking your site:-
    I'd suggest a webmaster should check with Internet Explorer, Netscape Communicator, and Word 2000 for compatibility, i.e. that the appearance approximates to what you're looking for. This is the pragmatic way of testing. Unfortunately it's difficult to test old versions of software, unless you have other older computers, because new software versions usually overwrite earlier ones.
        Some software, for example Aolpress, has a syntax checker; find the 'parse' option, which will show up simple blunders, such as surplus or missed-out > or < symbols. Spyglass is another, commercial, quite thorough, software package, but the free version is too outdated to be much use. SGMLS is another HTML validator. HotMetal (which you can try for 30 days free) has a validator, but I've never been able to work it, as it continually says I have characters after the end of the document, and other unhelpful things. CSE HTML Validator has a 'lite' version free, which is very useful; I recommend it. It has no ads and seems to rely on its users wanting to upgrade to a more complex version. But it's heavy enough for many purposes.
        Some software will provide some sort of site map, for example, the freeware Arachnophilia, which lists files which aren't linked into your site and which perhaps were left over from a now-deleted file, or were never used. Arachnophilia also has a conversion program from Word files into HTML (you have to save the file first as Rich Text Format, .RTF, which writes all the formatting stuff on font size and color etc with the text). My version has the weakness of not knowing about opening and closing quote marks. A few programs map out your site, displaying a diagram with arrows to show which files link, but complicated sites naturally are hard to represent.
        The following websites will download your HTML and check it for you, usually in a fairly primitive way:-
        Dr.HTML, Linkexchange, Netmechanic, Siteinspector (related to LinkExchange), Webmaster, Webmonkey, Website Garage. These (e.g.) check your HTML syntax and browser compatibility, and perhaps see whether your graphics files can be made shorter, and test for links to your site. They may have a spelling check - often rather useless. All these are free, or at least offer a basic service free and charge for other services: WebsiteGarage for instance will present your file the way it appears to different browsers for about $10/month. (NB checking browser compatibility is more difficult than it sounds; any change to your file might alter the appearance unexpectedly, so you may find yourself in an apparently endless series of small adjustments.)
        Automated HTML checks can't be expected to take account of subtleties, and often produce tedious lists of trivia, but it may be worth paying attention to what they say. For example, they may wrongly detect your file as showing text in the same colour as the background; if their software does it, possibly a search engine may, too, and a small adjustment in a color parameter may help you.
        You may also be able to use them to locate links to sites that interest you, something which is difficult to do (though some search engines allow a link construction).
        Some people suggest including coded nonsense keywords in your work; if someone plagiarises it, you may be able to locate it by a search engine.
  47. Checking other people's sites:-
    (Aqui seems to be a directory of links which people have submitted; you may find interesting links to sites—Aqui claims to have had millions of visitors—or you may not.) If you're concerned about ideas theft, or diffusion, you may find you can identify keywords from your site (you could include filenames or codewords for this purpose).
        The Informant (free) will check whether pages have changed (or perhaps appear to have changed) and notify you if/when this happens. Also free is NetMind. Useful if you're awaiting an update, or something like that.
  48. Contrary to what might be imagined, Internet use declines at weekends - at least, that's my experience.
  49. Test your site by downloading it from time to time; don't just rely on what you have on disk. You may find you've missed something out, or left in a superseded file, or made the whole thing too slow or impenetrable. But I hope not!
          You might want to encourage people to bookmark your site, with a message. Unfortunately bookmarking/ setting favorites isn't particularly easy, taking several clicks, and there's no command which works for all browsers. Control-D works with Microsoft's Explorer.
  50. Print out your site with several settings to check for oddities: light colours may come out white, for example, and spoil your layout.
  51. Don't forget file integrity and security. Click for Scandisk, Firewalls, Backups etc
  52. You may like to look at Internet news sites, for example InternetNews. An interesting site is Jakob Nielsen's, not so much for the pontification as for the looks into what may be up-and-coming technologies.

[Back to top]


3. 'Net Browsers'. Explorer? Communicator? Opera? Mosaic...

      All a browser has to do is download HTML files, and convert them into an attractive display, unpacking the picture files and taking care of the font appearance, size, etc. In practice this isn't as simple as it might appear.
      According to my notes Anybrowser is a site which looks at assorted browsers.
      According to figures quoted by people like Microsoft and Netscape, Microsoft's Internet Explorer (IE) and Netscape's Communicator are about joint equal in importance, only a tiny proportion of people using the once-popular Mosaic, which is no longer updated. AOL have their own browser (I think - I haven't checked, deterred by their signing-on apparatus) and perhaps dispute these figures. Other browsers I've come across include Opera and Oracle, each of which has a devoted band of followers. Opera's selling points are, it claims, strict adherence to HTML, and smallness, in the sense it hasn't huge numbers of files—if you have a slow computer it may suit you. (But it doesn't allow font size changes; or background graphics... At present it has a 30-separate-days trial offer.)
      Mosaic was one of the first (or the first?) HTML browsers, but updates were discontinued a long time ago, and it's probably more or less extinct.
      AOLPress was available on AOL's CDs; these used to have separate files in separate directories. Recent versions have the files collected together, so it's impossible to load AOLPress without loading the whole of AOL's stuff; you may not want to do this.
    This little table summarises, I hope correctly, a few miscellaneous points of comparison between MSIE, Netscape and AOLPress:
Point of comparison | Netscape Communicator | MS Internet Explorer | AOLPress
Free? | Yes | Yes | Yes, I think
Aggressive? | Less so. Fewer cookies. | Difficult to get several versions to co-exist, which makes compatibility checking a pain. Lots of cookies. | More self-contained (and has its own tutorial)
Font size change? | Yes. Click view then select | Yes. Click view, then fonts, then size | Alters actual tags. Global font size change needs 'select' of entire file
Print/ not print background patterns? (Confusingly, this is controlled from the browser). | Yes. Click file/ Page setup/ Print backgrounds | Yes. Click View/ Option/ Advance Printing | Haven't tried!
Print preview feature? | Yes. Very attractive and useful as it includes graphics with text. BUT font size changes don't operate; the printout looks the same irrespective of font selection. Nor is the printer layout option of two or four pages per sheet shown. | No. BUT font changes are reflected in the printout! My version leaves fine horizontal lines. | No
Tables? | Not very helpful | Not very helpful | Not very helpful
Frames? | Useless | Useless | Useless
Own Web Searcher? | Yes. 'Search on Internet' button. (It's included in my list of search engines). | Yes. It's MSN. Also in my list. | Yes. AOLNetfind. Also listed.
Switch off cookies? | Easy. Edit/ Preferences/ Advanced/ Select disable or accept only.. | Easy. Select View/ Internet Options/ Advanced/ scroll down to yellow exclamation mark and select disable (or warn) | Doesn't seem to allow this; perhaps I have an old version.
Online/offline distinction? | Clear | Less clear | Not very clear
Switching between online and offline browsing? | Difficult | Difficult | Not quite so difficult?
'Print this page' command? | Yes, but calls printer control panel. | No. Click on file, then print. NB may lose last lines if bottom margin tight. | No. Click on file, then print.
Print frames? | Navigator seems only able to print contents of one frame, not entire screen. Or perhaps I can't be bothered to work out how to do this. | Can print entire page. Printer may have an option for this. | Don't know
Has an HTML WYSIWYG composer? (I.e. shows your work the way a browser sees it) | Yes. And it's free, as perhaps it had to be to compete with Microsoft. Not bad; not incredibly good as the user needs considerable grasp of HTML to work the thing. And it reformats your HTML, which you mayn't want. If you view source code, simple syntax errors flash. | No. | Yes. I think probably one of the best. It has a parser for HTML, which detects errors, and makes fewer changes than Netscape.



4. 'Search Engines': understanding them, searching Internet, adding your site(s)

      A 'search engine' is just a computer program which looks for sequences of characters contained in its own computer files. If it finds them, it reports them in some way. Its files are compiled typically by 'crawlers', 'spiders' or 'robots'. This is why 'engines' are fast (compared with what they might be). The preliminary work has all been done; the engine doesn't search the entire net! This is also why it takes time for new sites to register. The relevant keywords or titles or whatever have to be collected, sorted, merged into the results from other searches, and filed away in a large-capacity storage system. The process has to be more or less continuous, since if (say) ten million sites are to be indexed every two weeks, a single program could spend only about a tenth of a second on each site (two weeks is roughly 1.2 million seconds).
      The outcome is that, when you type in (say) "Elephant man", this phrase can be found and all the files indexed to it assessed by some algorithm which is meant to work out how important this phrase is in the site. The results are sorted and sent back. The processing is specialised work, which presumably is why many searchers seem to rely on software you've met before - e.g. Magellan uses Excite, Infoseek turns up with Go, Disney, Direct Hit, Ultra, and others in some sort of arrangement, Inktomi seems to lurk under many searchers e.g. Compuserve's, Lycos and Hotbot are joined in some way, Nexor, apparently a weapons-related thing, uses Excite, as does AOL Netfind, and so on.
      There are searchers in French (e.g. Voila; not the musical instrument, please), Danish (e.g. Jubii), Dutch (e.g. Ilse), Italian (e.g. Il Ragno), German (e.g. web.de). Intersearch is German and Austrian. Fireball is German, apparently a part of Alta Vista. Some are Inktomi powered, e.g. SwissSearch and a Japanese engine, Goo. A few other examples are GoGreece, Searchmalta, Korean, and English-language ones like Surfchina... but I'll look mainly at English-based ones. Note that it may be possible to get new stuff by trying a foreign searcher. The keyword system allows you, armed with a dictionary, to search even if you don't know the language, then try to decode what you find.

      Among the best known Internet search engines, so far as I can tell or guess, are AAA AOK Matilda, 'the largest outside USA', AltaVista, Excite, Euroferret, Infoseek (now Go), Webcrawler, and Yahoo! (apparently the most profitable and well-known mainly through having been there a long time; but see small-print warning below). Apart from these, the commonest inertial engines are MSN, Netscape search, AOL Netfind and in the UK UKPlus. (See my list below for comments on some of these).
      Yahoo! was differently designed from most engines, though now others are following; it is a directory or selected list (that is, people submit their site for approval, which usually isn't given). Searching with some such engines is only carried out within categories. This makes sense from the time point of view—if you have twenty categories, and each search only looks within one twentieth of your database, everything is much faster. (Yahoo! claims only about 1M sites, while Alta Vista's Raging claims 350M, the highest I know of—yet). But of course miscategorised sites, or hard-to-classify sites, will have trouble. Yahoo, Netscape Search, Lycos, Hotbot and AOL all seem to share the same team of editors (see below under adding your sites to WWW). AAA Matilda has strict categories of this sort. So does Snap! So does Hotbot, but this also uses Inktomi, and in effect has two parallel engines. OpenHere is another directory with a 'focus on safe surfing'; it seems to have little content and I don't recommend it.
      You can't always rely on things they say: Hotbot for example claimed never to remove sites, which is untrue. A number of engines (MSN, Mirago, Excite..) claim to spider entire sites given just the main URL, but none of them as far as I know in fact do this completely.
      If you want to look at what other people say, there are guides to search engines on http://www.searchenginewatch.com, which, as with many of these sites, has unsourced information on the sizes of the databases used by a number of 'engines', and the criteria they search on. It has information on subjects like adding a search engine to one's own site, but it is marred by uninformative outlines to the links it offers. All Search Engines says it lists them all; it has a top 6 and also 25 topics, e.g. Health and Medicine, with topic search engines, such as SciSeek. What may well be the biggest and best site is Search Engines Worldwide, with a searchable list of countries from which you can find search engines. There's also a long list of 'submit free' sites (see below for cautions). And (a site with less information) http://www.free-markets.com/search1.htm. And another is easynett, with another long list. The Spider's Apprentice has search engine information and news updates on takeovers and buyouts. For information e.g. on specialised subjects and universities you might try Geneva University. This site Internet Exploration gives you a longish list of search engines, with bits of information about them, but not enough to be useful.

      Less well-known search engines:

411 Locate (phonebook and yellow page seeker), 4anything (says it has professional editors, perhaps of total site. Doesn't spider. 'If you prefer a prime, guaranteed listing.. low, one-time listing fee..'), a2z (Searches only in categories), About (has what it says are experts: "smart people who care." I was amused to see it seems not to have 'AIDS'. Some sort of membership-only scheme I think), AcidSearch ('your entertainment search resource' - I found this slow loading and know nothing about it), ah-ha [Aims at commercial sites, but free to add anyway. 12 keywords allowed], Alexa (some sort of adjunct to search engines; seems to provide ad info, but I haven't been able to work out what it does), Aliweb (Pre-meta-tag engine, with file info input in strict format, very few pages. I could find nothing interesting), AllTheWeb (Could become important. Connected with 'Fast' in some way. Intends with Dell computers to be first with a billion sites), ANZwers (Yes, an Australia-New Zealand searcher), Apollo (based in Britain; seems slow loading or non-existent), Canada (no prizes for guessing where this searches), Company Sleuth (says it gets legal inside info on companies), DirectHit (spiders; see below), Disinformation (not really a search engine; selected sites which it says are censorship, counter-culture and anti-corporate; disappointingly tiny with feeble material), Disney Internet guide ('the ultimate family guide to the web'. Appropriately, endless cookies), Electric Library (libraries, transcripts, reports..), Euroferret (European sites. Claims to be the biggest Euro search engine. Says it won't index subsite URLs), Euroseek (instructions in many languages; elaborate password system), Fast (Adds URLs almost immediately, unlike almost all SEs. 
See below in spider engines), Find It (seems to be an offer to look things up for money), FindLink ('safe and clean'; has some puzzling banner system), FindWhat (I don't think you can add a URL; you have to pay, and there are various schemes for this), FrequentFinder (the only searcher known to me which searches domain names for meaningful phrases, so e.g. drivelvision.com shows with elvis), Galaxy (seems to be funded by defense and medical interests; $25 non-returnable fee to add URL), Go now subsumes Infoseek (see below in spider search engines), Google (Recently improved after three years' work, it says. 'Returns the right results fast' - it has an 'I feel lucky' option - supposedly with self-modifying algorithm, tho' database(s) aren't given. Easy to add URL; claims to spider. Claims very efficient de-duplication. Lots of non-English language news), Goto (relative newcomer; its selling proposition was simple instructions. Now position is determined by payment [including supposed public services]), Harvest Broker (says it gets its own sites. Not many in total), Hotbot (green. Several months for URLs to register with Inktomi, otherwise categorised, but difficult to add because an ordinary search with it doesn't give results by category, so it's quite difficult to work out where to add, at least in my experience. 'Expert' assessment amused me by having pro-'Holocaust' Jamie McCarthy censoring genocides in Africa and Asia), iBound (yet another categorised searcher), ICQIT! ("I seek it" and "I seek you". Claims to cover entire web every two weeks. Unclear whether it spiders, so it probably doesn't. Online chat facilities - allows you to search for people by email address and/or name - claims to be the biggest chatroom software, competing with AOL's confusingly-named 'Instant Messenger'), I Found It! (genealogy archives), Infohiway (has a free site mapping facility; can be useful to download sites - take advantage of their very fast links. 
Easyish to add URL), Info Tiger (probably small; can email URLs in batches), InfoSpace (phonebook and yellow pages; emails), Internets ('largest filtered collection of useful search engines and newswires anywhere..'), Jayde Online (relatively small; censorship policy; has a hints newsletter. That's what I think, but its blurb says 'second largest search engine directory on the web'. Says it indexes only on title and site description of main site), Jump City (claims to carefully select worthwhile sites, and has its own 'jump code' TM system which I fear I couldn't fathom), LinkStar (company information, US only; unavailable for now), Livelink (related to Pinstripe and Open Text; 'The source for business knowledge'), Looksmart/BeSeen, (another multi-level-category - 24000! - searcher; starting self-published sites, won't include 'offensive material' NB has alphabetical list so you can see the rather banal categories), Lycos (recently started TV ad campaign! Censorship policy—possibly the most censored search engine), Magellan (large, claiming 50M, general purpose; has a 'site voyeur' feature; watch for several similarly-named sites. Easy to add URLs. Seems to be part of Excite!), Mall Park (categorised selling site - commercial sites only, I think), Matilda (Or AAA Matilda. Yes, Australian! Unusually chaotic appearance, with strict categories making it user-hostile, but an index-everything policy, except they charge .com sites), Mirago (categorised British site run by Telecom plus; it claims to spider your whole site from the index, and return regularly; not much good), MSN (Microsoft's Web Searcher; has MSN Encarta section. Doesn't seem special! Categorised. 
URLs can be added - it takes some effort to find where; look for 'help' - after which entire site is supposedly spidered - if so, this is useful and timesaving), NationalDirectory ('least spammed, easiest, most comprehensive', new URLs 'searchable within minutes' - but I couldn't find how to add a URL), NerdWorld (relatively tiny; offers a site index, but apparently only from itself), Netfind (this is AOL's - censored? It's hard to tell whether the censorship is any different), Netscape's Netsearch (part of Netscape, though not a major part. Seems impossible to add URLs directly; there's a tedious mass registration apparently used as a junk mail database), Northern Light (large site with company reports apparently a specialty), Open Text (same as Livelink/ Pinstripe), Peekaboo (business and also 'quality public service information..'), Pinstripe, PlaNET Search (Said it spidered whole site within a few days, with regular repeats. SHUT DOWN Dec 1999), Pronet (International business directory), RagingSearch (Part of AltaVista. Unlike AV, which has a cluttered page, Raging has a near-empty screen. Allows searching "for phrases", -omissions, Upper Case Sensitivity. And for links, domains, text in titles, similar pages; see Help. No info on adding URLs, which presumably is done thru' AV. The 'most powerful search on the web'), REX ('Go get it, BOY!'), SciSeek (specialist science searcher with person(s) checking entries; claims one week turnround), Scrub the Web ('search in real-time. Add your URL instantly'), Search Europe ('.. designed to be comprehensive..'), Search King ('Where YOU Rule the Web!'. URLs added in less than 24 hrs, it claims. Doesn't spider - you supply keywords. Claims to have a click voting popularity system), Searchopolis (aimed "for K-12 students" whatever they are; filtered by N2H2 by "our large staff of trained reviewers". 
I was surprised to find the Russell War Crimes information in there), Search UK (Now searchengine.com. Was business partners, reps etc in UK), Siteshack ('the fastest on the net'. German? Makes you count characters), Snap! (Has a detailed 'membership form' which I expect puts off many people. Copyright Weather Data. NBC. Another multi-category site, like Yahoo! Was easy to add URLs; now needs trawl through categories. (Has alphabetical list feature; mostly disappointing trivia). Claimed to be fastest. I think a CNN site, unlikely to include critical pieces), Starting Point (has regular 'new sites' feature), Superpages ('Business Websites Around the Globe'), Thunderstone (has a serious opinion of itself; wouldn't take web addresses which are subsites with slashes, and now seems impossible to add to), UKDirectory (UK searcher with teething trouble. Also publishes an A4 book of its sites, like a phone book), UKIndex (UK searcher. Enter your own description), UKMax (Another UK searcher. Claims to spider whole site from index and seems quite good at this. Has been trying TV ads in UK), UKPlus (And another. This one is used by Freeserve and is therefore important for UK users. Looks like Yahoo-know-who. Part of 'alleurope', doesn't state pages - probably < 100,000. Seems purely ad driven, probably with pay pages. Unlikely to post serious material? Switched to Infoseek to search web, so target this), Town USA ('free listings of US businesses and municipalities'), Ultra (seems same as Infoseek), USA Online (tiny business site apparently mimicking AOL), (couldn't find WebArrivals - or What's New), What's New Too (looks like a chat line but has its own database), What-U-Seek (has a promising website searcher), WhoIzzy (says it's one of the oldest; currently trying to sell itself), Yep (Related to HitBox counter. Ranks sites by popularity. 
However, only sites registered with Hitbox show up), Zeal.com (Categorised sites, as Yahoo, with internal search engine, but supposed to be 'community driven', i.e. sites are ranked or voted. You're legally required to indemnify them), Zensearch (nonprofit; falsely claims immediate indexing. Appears to be the tiniest of all and virtually useless)

      'Meta' search engines 'cheat' by submitting requests to other people's databases, then returning the answers to you. They have been a growth industry, because they use other people's databases and therefore only need programming skill. But (I'm guessing) they could presumably be cut off from accessing the searchers they use; perhaps they pay a percentage. There are two types:
          De-duplicating meta-searchers (for want of a better expression) which combine the results of their searches. These have the enormous advantage of potentially finding files which happen not to be stored by all search engines: you increase your chance of getting what you want. The first was (I think) Metacrawler, which now uses ten search engines. (Its 'all' button doesn't work). Savvysearch makes use of the largest number of other engines (12, though not at once) and may give very good results. Mamma ('the mother of all search engines') seems to have been a mimic of Metacrawler. Inference Find uses only six searchers. Its display sorts output by type, e.g. .com sites are collected as 'commercial', which can be useful; but it lists only bare titles, so a user is likely to spend more time trying to find a good one than the time theoretically saved. Ask Jeeves says it uses five searchers, and claims it has a natural-language front-end, i.e. allows you to type questions in ordinary English, though this isn't true. It claims to select sites and have made some checkups on commercial sites. A new metasearcher is Chubba, which uses about six engines plus What-U-Seek, but tends to leave graphics cluttering your screen. Metafind allows you to set various parameters. Verio Metasearch allows you to choose the weights to assign to different search engines. I've just found four others: Beaucoup which seems related to (or the same as?) 4anything and Metafind, GoCrawl, InfoZoid, which searches Usenet in addition to a clutch of search engines, and YahooSuck (doesn't state which engines it uses, but seems good). Copernic is a categorised metasearcher and is unusual: it's interactive, downloading its own current set of searchers, and beaming you banner adverts, from its own site, when you're online; it also inserts itself into your browser, which you may not want.
      The Big Hub is interesting not so much as another metasearcher (8 engines) as for 'specialty search engines' by topic; I haven't attempted to check this claim in detail, though it seems unlikely there are 1500.
      No meta-searchers (that I know of) attempt to include the full range of Web information; don't generally expect them to find names and addresses, or yellow page style information, although they should be able to tell you where to find such information.
      My experience is that Meta engines aren't always reliable: you may find that a site which is definitely found by a component search engine nevertheless doesn't show up on a meta engine. Possibly there's a censorship, time limit or perhaps a depth-of-search limitation. On the other hand, sometimes you find the reverse: a phrase which seems not to be present on any engine shows up on a meta-search!
          Non-deduplicating metasearchers simply return results from all their searchers separately. Usually the results display in sequence: OneBlink is one example, Dogpile another, with quite a range of options. Search presents a choice of 11 engines separately selectable. Metasearch last time I looked made you click separately on seven searchers yourself. WebTaxi (under development) is supposed to allow proximity searching, i.e. sets of words fairly near to each other. It uses an unusually wide range of types of search engine; however, you can only look at one at a time. What seems to be a filtered, i.e. censored, search engine is Internets, (but it has lots of Java, and may be best avoided). You can amuse yourself seeing what's been 'filtered'. Internet Sleuth allows you to choose 1 to 6 of its search engines, and seems to return more results than many. Highway 61 just uses five engines. Searchhound is another.
          But some display their findings in windows: All4One is similar, but displays search results from four search engines in separate quarters of your screen. Search Spaniel offers eight searchers plus four specialist engines (for people, newsgroups, shareware, mailing lists) and allows the option of displaying the results in individual windows.
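      The difference between the two types of meta-searcher can be sketched in a few lines of Python. This is only an illustration: the engine names and result lists are made up, and `normalise` is a hypothetical rule of my own for deciding that two URLs point at the same page; real meta-searchers presumably use something more elaborate.

```python
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to a rough key so trivial variants compare equal (an assumed rule)."""
    parts = urlsplit(url.strip().lower())
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    return host + parts.path.rstrip("/")

def merge_deduplicated(results_by_engine):
    """De-duplicating style: keep each page once and note which engines found it."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            entry = merged.setdefault(normalise(url), {"url": url, "engines": []})
            entry["engines"].append(engine)
    # Pages found by more engines come first; ties keep first-seen order.
    return sorted(merged.values(), key=lambda entry: -len(entry["engines"]))

def merge_separately(results_by_engine):
    """Non-deduplicating style: show each engine's list in sequence, repeats and all."""
    return [(engine, url) for engine, urls in results_by_engine.items() for url in urls]

# Hypothetical results from two made-up engines.
sample = {
    "EngineA": ["http://www.example.org/", "http://example.org/beans.htm"],
    "EngineB": ["http://example.org", "http://example.net/other.htm"],
}
combined = merge_deduplicated(sample)   # three distinct pages
separate = merge_separately(sample)     # four rows, one of them a repeat
```

      Lower-casing the whole URL is a shortcut; on servers where file names are case-sensitive it could wrongly merge two different pages.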

      Naturally there are endless complications. For example, keywords may not accurately represent the contents of files, pictures and voice files in any case can't be indexed like this, and there's scope for people to cheat by listing irrelevant words which they hide on screen. Phrases may not be recorded, although obviously they provide greater discriminating scope than single words. Foreign languages and alphabets, and mathematics, may be unfindable. Some engines (Yahoo!, Euroseek) shoehorn their sites into groupings; this (i) makes it more difficult to enter sites, and (ii) tends to produce somewhat irrelevant lists of found sites.
      Most search engines are surprisingly difficult to operate: because (1) ads, which use long graphics files, slow everything up; (2) their designers tuck away examples of how to use them; (3) they may be split up into geographical areas and subject areas; (4) since the search criteria aren't easy to work out, it's hard to guess how far down a list a sought item might be; (5) there's a constant tendency to divert into money-earning areas, so that things which look free turn out not to be; (6) they tend to try to send 'cookies' without saying what they're for; (7) the results may be displayed unhelpfully.
      But there are short cuts where detailed searching isn't wanted: if you just type a logo or name you may find the www and .org or whatever are tried for you.
      The best way to start is probably to type a lot of keywords in the box, preceding them by + to force them to be used, and putting phrases in quotes; you can also try using roots with *, for example flower* which might find flowers, flowering, flowery. Using many, or rare, or elaborate keywords will at least cut down the number of files returned for your inspection. Remember the comment about job ads - the trick is to get replies from the few people you want, not from everybody.
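      As a rough illustration of the query style just described, here is a small Python helper that assembles such a query. The function name and its arguments are my own invention, and the exact syntax accepted varies from engine to engine, so treat it as a sketch.

```python
def build_query(required=(), phrases=(), roots=()):
    """Assemble a query in the +word, "quoted phrase", root* style."""
    terms = ["+" + word for word in required]          # + forces a word to be used
    terms += ['"' + phrase + '"' for phrase in phrases]  # quotes keep a phrase together
    terms += [root + "*" for root in roots]            # * asks for any ending
    return " ".join(terms)

query = build_query(required=["garden", "seeds"],
                    phrases=["baked beans"],
                    roots=["flower"])
print(query)  # +garden +seeds "baked beans" flower*
```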
      To get a feeling for the way these 'engines' work, try to see things from the programmers' viewpoint. They have a choice of looking at hidden meta-statements, or ignoring these. Meta-statements have the huge advantage of allowing key phrases, such as (e.g.) baked beans, President Wilson, Vladimir Ashkenazy, or whatever. It's probably too much to expect a program to pick out keywords from raw text: in a sentence like 'the rain in Spain falls mainly..' the computer can't tell whether 'Spain falls' is a significant phrase. Hence metastatements can be useful. Unfortunately they can be abused - you could for instance put 'free download' in every file and increase your hit rate. So most engines state they don't use them. Altavista is one which does.
      If you wish to test what an engine does, rather than what it says it does, try a phrase like 'baked beans' and then 'beans baked'; if you get the same list, you can probably assume it doesn't store phrases. You might try "blue sky" in double quotes and the two words 'blue sky' separately; if the results are more or less the same, the engine probably indexes individual words, not phrases. You might also try a part-nonsense phrase like "blue mkxfdhg" or +blue +gkhutnw; if you get a list of sites with 'blue' in their titles, obviously the engine is looking for the two words separately, not together, and is likely to waste your time.
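      If you save the two result lists from such a test, the comparison can be done mechanically. The sketch below is in Python; the 0.8 threshold is an arbitrary assumption of mine, and the URLs are placeholders rather than real results.

```python
def overlap(results_a, results_b):
    """Share of pages common to both lists (1.0 means the same set of pages)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def probably_ignores_phrases(phrase_results, reversed_results, threshold=0.8):
    """Guess that the engine indexes single words if both searches agree closely."""
    return overlap(phrase_results, reversed_results) >= threshold

# Placeholder results for 'baked beans' and 'beans baked'.
baked_beans = ["site1.htm", "site2.htm", "site3.htm"]
beans_baked = ["site3.htm", "site1.htm", "site2.htm"]
verdict = probably_ignores_phrases(baked_beans, beans_baked)  # same pages, so True
```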
      I think it's true that search engines are subject to an artefact which causes a bias to short files: imagine a discussion on some topic (e.g. car exhaust); a long detailed file will tend to use this phrase less often, in proportion, than a short file, since it goes into detail on all sorts of points. So my impression is that Internet perhaps seems more trivial than it is.
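      The short-file effect is easy to see with a toy calculation. The Python sketch below assumes an engine that ranks by how often a phrase occurs per hundred words; whether any real engine works this way is a guess, and the two 'files' are invented.

```python
import re

def phrase_density(text, phrase):
    """Occurrences of the phrase per 100 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = text.lower().count(phrase.lower())
    return 100.0 * hits / len(words)

# A short file, and a long file which says the same thing and then goes into detail.
short_file = "Car exhaust is bad. Car exhaust smells."
long_file = short_file + " " + "Detailed discussion of engines, fuels, filters and testing follows. " * 10

short_score = phrase_density(short_file, "car exhaust")
long_score = phrase_density(long_file, "car exhaust")
# Both files mention the phrase twice, but the short one scores far higher.
```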

      It has to be said that some sort of censorship is probably inevitable. Suppose (just one example) that a second-hand car dealer put up an exact duplicate of some popular site, but with the graphics replaced by his own ads: this would look identical to most search engines, and would therefore appear as often as the original. On serious issues, I've been told the CIA has bought a search engine; this would be an expected development, the money being peanuts. The large engines seem to be operated by large companies (e.g. Excite seems to be part of Reuters/UPI. Snap! seems to be CNN plus weather) so one would guess they will censor material perceived as contrary to their or lobbyists' interests. Or they may simply play down sites which pay less. For example, the keyword "Richard Milton", the author of several books on scientific dissent, some material of which is good, didn't show at all on Metacrawler, but appeared on Hotbot. Similarly the Russell Vietnam material on my site does not show up on Hotbot or Lycos. (All this is quite separate from the question of site censorship, where usually the service provider comes under pressure).

      A few notes on spam: a common type is to have many files, saying more or less the same thing, in different directories. A computer can't be expected to infer that the meaning is about the same, and then de-duplicate them. Identical files are perhaps less of a problem, as they are often in explicit mirror files. It's possible to generate unintentional spam - as for instance a book may be listed by chapters, and the title of each file may be the same, or similar; consequently some search engines will list (say) twenty separate chapters, one after the other. It's also possible to generate intentional spam; I recently noticed a Usenet group deliberately choked with nonsense messages. There's nothing to stop anyone putting up thousands of computer-generated nonsense pages. It's impossible to guess how much the relatively reasonable quality of the Web is due to sites being refused or removed.

      A newish type is the self-voting searcher, Goto supposedly being one. This lists a top 500 of interests (Pamela Anderson being #1) which are supposed to reflect people's interests, skewed by payment. What's New? so far as I know is something similar.

      I haven't considered in detail the problems of searching in special databases (legal, medical, sporting, etc) or looking for special program types; for example FTPSearch looks for FTP files. In any case these can be found by searching on keywords like CERN, Gopher, Archie, Archive, Veronica, and so on. Since Internet has special features designed specially to allow entry of queries (the standard elongated boxes, 'radio buttons' and so on,) there may well be specialised search engines for e.g. chess, baseball, or what have you. But if so they won't be easy to find—make a note or bookmark URLs you like!

      The final message here in searching the Web (which I give with some hesitation—my information is imperfect, times change, and anyway your interests may not be mine) is not to be too hasty, and regard the process as something of an art form. For general searches, start with meta-searchers, such as Savvysearch, Search Spaniel, or Metacrawler. You can try lots of keywords to start, then use fewer if the list returned is too short, or do it the other way round, so long as you're fairly systematic.
      If you prefer to stick with a single search engine, my selection is Altavista (but hard to pinpoint), Excite, Infoseek/ Go (but too many results), Webcrawler, and Yahoo! (if you're looking for conventional material, i.e. you want to find what other people are likely to be told). Try Euroferret for sites in Europe. UKMax might be good for UK. You might keep an eye on AllTheWeb. If you still can't find what you want, scroll up from here and try some of the search engines in my long list. And you could try foreign ones for a different angle. You might try the Webring index where more-or-less linked sites can be examined (scroll down, and use their site searcher). About mid-2000 some sort of link between Webring and Yahoo! was announced; I hope this won't have the effect of cutting back choices. Some webrings are good (a handy indicator is simply a look at the home site of a ring) but many have slow ads or are disappointing in other ways; the webring instructions are so badly written it's unsurprising many rings aren't maintained properly, as you'll find if you set up your own. On the other hand at least you can view their hit-rate stats. For Usenet groups (see below), Deja News seems the best (or only?) and it's very good, if you can work out how to use it. Several times I've found interesting websites that people have mentioned in their usenet emails, but not put onto search engines, so this roundabout route can be useful. You can also look at e-discussion groups in the same way as you look at Usenet, though many are private and may be difficult to get into or unwilling to answer questions.
      For specialist lists or specialist search engines try Search Spaniel or All Search Engines and go through the same sort of process.


      Adding your site to Web searchers: Search engines won't spontaneously search for sites, since, for all they know, their authors might have put them up as experiments and not want them listed. And they may not have access to the list of file names. So you have to do it yourself. Start by keeping a notebook. (There may be delays of weeks before your site is indexed). You might practise first with an instant indexer, if you can find one; try Search Europe, for example, or Zensearch, or NationalDirectory, or Scrub the Web, Starting Point. When you get the idea, try the large search engines: Fast Search is a good start. I'd recommend however that your site is in acceptable shape before submitting to major engines; they may blacklist feeble sites (perhaps).
      The usual trick is to find and click on the 'Add URL' button ('URL' means 'uniform resource locator' if you haven't been told!—the idea is that it's standard across the whole of Internet). You may have to navigate via 'Help', or do a search, before you find this. You may be faced with a series of boxes to fill in. You may be asked to describe your site in not more than 25 words, or some similar formula, even if your site has metastatements in it already containing this information; and you may be asked for keywords, even if you've included these. If you intend to add a batch of subsites, it's worth checking that the 'back' key will redisplay your previous site, so you don't have to type in your entire URL each time. A few search engines, for example Yahoo!, require you to find a category and sub-category for your site, something which can be very time-wasting, and the engine may not list your site anyway - Yahoo! has a reputation for throwing out 5/6 or 9/10 (or other fractions) of submissions. The next point may seem a bit obvious: make sure you spell all the characters of your URL(s) correctly; very few search engines check in real time that URLs actually exist, so it's possible to waste hours or days entering URLs with a tiny but invalidating mistake. Before adding URLs to Yahoo! and selective sites, check their listed sites: for example I recall being surprised at the feeble quality, and dead links, in Yahoo!'s UFO section.
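      Since a mistyped URL can waste hours, a quick check before submitting may be worthwhile. The Python sketch below only spots obvious faults in the spelling of a URL; it does not go online, so it cannot tell you whether the page actually exists. The function name and the example addresses are made up.

```python
from urllib.parse import urlsplit

def url_problems(url):
    """Return a list of obvious faults in a URL; an empty list means none were spotted."""
    problems = []
    if url != url.strip() or " " in url:
        problems.append("contains spaces")
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc or "." not in parts.netloc:
        problems.append("host name missing or has no dot")
    return problems

# Check a batch of subsite URLs before typing them into 'Add URL' boxes.
batch = [
    "http://www.example.org/sub/index.htm",   # looks fine
    "htp://www.example.org/",                 # mistyped scheme
    "http//www.example.org",                  # missing colon
]
report = {url: url_problems(url) for url in batch}
```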
      If you're beginning, try Alta Vista (very easy to add; claims to take a few days; takes subsites too). Excite is another two-week wait. MSN says it takes two to four weeks. Try some sites from my huge list, above. Lycos and Hotbot may take months and seem to have some sort of censorship policy.
    If search engines make use of other hardware/software setups, you can save effort by not entering the same information into different search engines which in effect use the same material. Inktomi is the well-known example; Inktomi's own site lists the search engines that use it, (click on 'partners'), but this information tends to be hidden away since the search engines like to pretend they're unique. Inktomi's long list includes Hotbot, Yahoo!, Snap, GoTo, ICQ and others, including Canada. They say they've arranged deals with Findwhat, kanoodle, LatinLOOK, Network Solutions, Oxygen Media, Surfbuzz, Powerize and others. I haven't been able to find a reliable way to add to Inktomi—their site is unhelpful—and none of their engines seems to spider, though one is told that a site added to one is added to all. If not, people with diverse sites have a boring time in prospect.
      Summary of spider search engines, and adding your URL to them: Fifteen or so I've found are (alphabetical; I've improved the URL targeting, so you can click and work through them:) AltaVista [says as a rule of thumb it aims to spider to one level], Direct Hit (needs tedious list of keywords), Euroferret [Moved to Webtop which now has add URL facility and requires confirmation], Euroseek [this is the English page. If it doesn't work, try Euroseek.com], Excite [scroll down; this seems identical to Magellan, which now claims to spider], Fast, or AllThe Web [I think it claims to spider], Go, [was Infoseek. Says it spiders, and seems to have discontinued e-mail reception of bulk URLs. Has its own community, which you can join], Google, Lycos, Mirago [British; not very good], MSN [Microsoft's. Claims to spider. But, inconsistently?, allows multiple URLs to be submitted, at one per day maximum], Northern Light, Searchengine.com, [new to me], UKMax [also British; persistent teething problems], and Webcrawler. Only the main URL is needed, a huge saving of time if you have a large site. They all work, more or less, though don't expect perfection.
    Notes: Lycos, which has a censorship policy, specifically states it isn't likely to spider more than one level, so if your site has subsites each with an index, you may do better to enter each of these separately with Lycos. With Direct Hit you have to add keywords (they give no information as to whether e.g. phrases are accepted or what separators should be used) and it has rankings based on user popularity.
      Some categorised engines seem to share the 'Open Directory Project'; they seem to be AOL, Euroseek, Yahoo, Netscape Search, Hotbot and Lycos, which therefore presumably now return much the same list of sites. This project is therefore very important; you must try to add your site to it. The site says they have about 20,000 editors, and have about 1M sites; it also claims they are "quite passionate about their work", which is just as well, perhaps, as a swift calculation reveals that if each of them examines ten sites an hour they'd take a year to get through 100M websites. It is, I'd say, probably impossible for them to keep up to date. You can't add sites from the project's home page; you have to burrow down into the subcategories. Any of the engines will do to try to add sites. And conversely, an ignorant editor is in a good position to block sites. The categories include horoscopes, under their culture section, an example of the oddly schizoid presentations of the Web; it's said that half the users are graduates, but see, for example, Lycos's top 50 search phrases, which suggest most of the use is by young people and children (but perhaps they are the other half).
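      The swift calculation can be set out as follows, in Python. The editor count, rate and number of websites are the figures quoted above; the ten hours a week is purely my assumption (the text doesn't say how long an editor works), chosen because it makes the total come out at about a year.

```python
editors = 20_000               # editors claimed by the project
sites_per_editor_hour = 10     # examination rate supposed in the text
websites = 100_000_000         # 100M websites
hours_per_week = 10            # ASSUMPTION: a part-time volunteer's weekly stint

hours_each = websites / (editors * sites_per_editor_hour)   # 500.0 hours per editor
weeks = hours_each / hours_per_week                          # 50.0 weeks, about a year

print(hours_each, weeks)  # 500.0 50.0
```

      On a full working week the same 500 hours would of course take only about three months, so the 'year' depends heavily on that assumption.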
      Meta-searchers may have an 'add URL' button, but you'll just be told you can't add URLs, because the meta-searcher has no database of its own. Savvysearch is the only exception (I think)—it adds your site(s) to many engines, and it also e-mails you with the responses.
      'Inertial Research': lots of Internet users probably stick with what their software presents them; they may not even know there's an alternative, and their suppliers won't tell them. Thus most AOL users probably get the AOL menu with AOL Netfind; in UK, Freeserve users probably mostly get the Freeserve page; Gateway buyers are nudged into Yahoo!; and people who've downloaded a new Netscape may find Netscape Search has muscled in, so they get the Netscape page. Or Microsoft's MSN page. Or whatever it may be. So explore other people's views of the web, and enter your URL into their search engines.
      To see if your site is indexed, try searching for URL:http://www.yoursite or "http://www.yoursite" if the searcher allows. (You might imagine the designers would provide an easy way to check. But they usually don't).
      E-mail submissions, where search engines allow this, are useful for well-developed sites, as whole batches of subsites can be submitted. Infoseek is the best example. Sometimes you can switch between a list of your files in Word, say, cutting and pasting a batch into the search engine's submission window.
      Bulk submissions to search engines: sites include Addme, Free Search Engine Submissions, GetHits (part of AddMe), Promote Your Site, Shout!, SubmitNow. So far as I've tried them, these are less easy to use than they sound—in practice, you have to type in all sorts of detail and probably also add yourself to some sellable emailshot database. The free Searchers tend to be very obscure ones and I suspect some to have been manufactured just for the purpose. Some require you to enter your website into all the searchers one after the other. Keeping track of searchers you're supposed to be on is harder. There are various ways in which the process is less free than it might be. And if you have a site divided into subsites, the cost presumably increases with each subsite... I've just checked SimpleSubmit which appears free of any objection. Savvysearch has a submitting facility which reports back to you with an email to tell you what happened.
      How important..? You may be told that bookmarked sites are the most important. Bear in mind that bookmarked sites aren't distinguished from typed-in addresses; in other words, if you can get your URL published somewhere, of course this will increase your hit rate; and you lose nothing by encouraging people to bookmark. But search engines may find people who otherwise would never hear of your site.

[Back to top]


5. Improving your search engine rankings.

      Keywords in Meta Statements: Alta Vista, Excite, Lycos, Netfind, Northern Light and Webcrawler* DON'T use these.
      URL names: Some search engines make use of these: Alta Vista, Hotbot, Infoseek, Lycos, Webcrawler*. So, for example, if you're a world expert in orange juice, a site http://...orange-juice/orange-juice.htm will be rated higher by these searchers.
      Comments: Only Hotbot* recognises these. Presumably, <!-- orange juice --> would boost an orange juice site in Hotbot.
      ALT text for images: Only Alta Vista, Infoseek and Lycos* take account of these statements. The alt statement can be pure keywords, for example: <img src="mount_everest.jpg" alt="orange_juice">, though of course this may look odd if a mouse move causes the text to display. If you like, try a GIF with ALT keywords; take some picture which the site loads anyway and make it 1 by 1. (Why ALT statements? The reason must be that some people's sites are mostly graphics, which the spiders can't read. ALT statements give at least some indication of what's happening).
      Tiny or Small Font Text is used by some sites to pack in lots of keywords in a way which doesn't show too prominently. Such sites will be rejected by Alta Vista, Hotbot, Lycos, MSN and Webcrawler*. However, Infoseek and Northern Light* don't object.
      Invisible text, with the font color the same as the background, is another obvious spamming technique. Only Webcrawler and Netfind* allow this.
      Keywords in text: quite a few sites include (usually at the end) a list of keywords with a notice that these are keywords! Something similar happens on e.g. NASA's website, and many scientific papers, when a list of key phrases, often very childish ones, is given, although it's hard to see why. You could try this—obviously, the words don't need to be small or invisible. Whether search engines object, I have no idea; the only guess I can make is not to overdo this, as search engine algorithms might be expected to look for some sensible ratio of text:keywords. I've put an example at the end of this file.
      Spread of keywords across the site is measured in some way by Alta Vista, Hotbot and Webcrawler*, no doubt to disallow sites which have misleading keywords bunched in meta statements or titles.
      Title is certainly used by some search engines, since they display the site title when they return their list. The usual recommendation is to pack as many keywords as possible into the title. And probably the description.
      Headings. I suspect the use of headings was an early device, before meta tags were well-known. If a site had no meta-tags, by default the searcher would look for <H1> or something similar, perhaps <CENTER>, and assume the first heading announced the purpose of the site. It's probably sound practice to include a heading and include keywords. If you can be bothered, search for the first occurrence of <H in your file(s) and see how it measures up, in respect of meaningfulness, keywords, etc. You don't have to include a giant heading; <H4> may have the same effect.
      Links. Any search engine with a huge database can collect information on links. Excite, Infoseek and Lycos* check links; Infoseek also has some system for assessing the status of links, so that a group of sites, perhaps the same person's, with artificially huge numbers of links to each other aren't boosted. (This may be targeted against 'free-for-all' link sites). And some searchers supposedly factor in some assessment of popularity. This is (perhaps) a good reason for arranging reciprocal links with other sites, and is probably one reason why propagandist sites tend not to include links to opposing sites, apart of course from not advertising them. If a site you have a link to won't reciprocate, it's up to you whether to remove your link or not. Personally, if I like a site or think it of some interest I always put a link; but perhaps this is silly. I've seen a suggestion that you might yourself put other people's sites, if they link to yours, onto search engines if they haven't done it themselves. Some searchers allow link:www.your.url type of enquiries to show you which sites point to yours. (Linkdomain:www.your.url is supposed to work with Hotbot.)
      Spelling Variants. It's often suggested that misspellings should be included in meta tags, on the same principle as foreign restaurant names appearing with different spellings in phone books. If searchers don't use these tags, of course it won't help—you might try to include variants in the text itself. In any case, Infoseek, Lycos and Northern Light* search based on word stems; that is, rather than look for whole words, they abbreviate them to increase their catch. Alta Vista* does NOT do this; it stores the entire phrase, including plurals, and so is more sensitive.
      Crawler page [suggested by Paul Boutin, ex-Hotbot]. The idea is to make an HTML page containing only links, without graphics, external links or anything else. A link must be made to this page, presumably at the start of your index page. The idea is that spiders can be confused; if your site contains any wrong links or confusing HTML, it may get lost. A simple list of your own links should allow it to crawl fully in peace.
      Free site link pages to your main site, i.e. a page which, if found by a search engine, directs to your site. E.g. Geocities/Yahoo is supposed to take the fifth most traffic of US sites. If you set up a free menu page with keywords and so on, anyone searching Geocities sites might find your page, giving you a free link.
      Other link and portal pages. I've seen a site with keywords consisting entirely of spelling errors of keywords, linked to the main site.
      Payment. I haven't tried this! But obviously there's plenty of scope. I believe Sprinks is a new (May 2000) engine of this type. If you're doing this, you may as well start with the most popular, then look at specialist search engines and directories.
      Resubmitting: I've seen a recommendation to resubmit your site every month. Also recommendations to resubmit whenever you make a substantial alteration. And one (by Boutin, who says that no searcher penalises repeat submitters—as far as he's aware) to submit every week. A recent newsletter said a couple of hours a day was all that's needed! My best suggestion is to check at intervals that your site(s) show up on the engines you're mostly interested in; if not, you have no option but to resubmit. Alternatively, have a rolling checklist and spend a fixed length of time on it at a weekend, like painting the Forth Bridge.
      A possible approach is to tailor your site to search engines. You may find it worthwhile to have different entry points to your site designed to be optimised for each search engine. The idea is that a different entry page shows up well in different search engines, though the user sees more or less the same thing. I'm told some people do this! And some software does—'Web Position' and 'Eugenius' for example. But search engines are fighting back against this sort of thing.
      The lists marked * were published by Dilwyn Tng of Make-it-online, in InternetDay's emailed e-zine. Thanks. I haven't checked many of these claims; some may be outdated. I leave these entries in rather bare form, since it's fairly obvious what action to take, e.g. to ask people to link to your site.]
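The check suggested under 'Headings' (find the first <H in each file and see how it reads) is easy to automate. What follows is a rough sketch, not a tested tool; the folder name in the usage note is hypothetical, and a regular expression is only a crude HTML reader.

```python
import re
from pathlib import Path

# Matches the first <H1>..<H6> element; upper or lower case tags both count.
HEADING = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def first_heading(html: str):
    """Return (level, text) of the first heading, or None if there is none."""
    match = HEADING.search(html)
    if match is None:
        return None
    text = TAG.sub("", match.group(2))          # drop nested tags such as <I>
    return int(match.group(1)), " ".join(text.split())


def report(folder: str) -> None:
    """Print the first heading of every .htm/.html file under folder."""
    for path in sorted(Path(folder).rglob("*.htm*")):
        html = path.read_text(encoding="latin-1")   # latin-1 decodes any byte
        print(path, "->", first_heading(html))
```

Calling report("C:/website") would list each file with its first heading, or None where a page has no heading at all; those are the pages to look at first.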

      A useful hint—perhaps the most useful you'll ever hear!—is to exploit other people's work. Search, using phrases relevant to your own site, and examine the source code of the top URLs the engines turn up; if you mimic parts of their structure, you instead may appear near number 1! (Beware though - other people have this idea. An interesting possible counter-attack has started with Pinnacle, which guarantees a 'top 30' place—unless, presumably, 31 people apply to them, and I think only for one combination of keywords—and says it has Java programs which make keywords etc unreadable.)

      Try to judge other sites objectively: if they are better than yours, then a search engine which ranks them higher is doing its job properly! So you may need to improve the content of your site.
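The 'crawler page' described earlier in this section, a page holding nothing but links to your own files, can be generated rather than typed. This is a minimal sketch; the file names and link texts are made up for illustration, and whether any particular spider benefits is the untested claim reported above.

```python
from html import escape


def crawler_page(pages: dict) -> str:
    """Build a links-only HTML page from {relative_url: link_text}."""
    lines = ["<HTML><HEAD><TITLE>Site index</TITLE></HEAD><BODY>"]
    for url, text in pages.items():
        # Plain internal links only: no graphics, no external addresses.
        lines.append(f'<A HREF="{escape(url, quote=True)}">{escape(text)}</A><BR>')
    lines.append("</BODY></HTML>")
    return "\n".join(lines)


# Hypothetical site contents.
page = crawler_page({
    "orange-juice.htm": "Orange juice",
    "a&b.htm": "A & B",
})
print(page)
```

Save the result under some name of your choice and link to it near the top of the index page.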

[Back to top]


6. How to search files, books, e-mails and other data on disk for information which you know is somewhere on your hard disk.

      This is where I put in a plug for antique technology, viz. Norton's utilities for DOS. In particular, there's a text search utility, of a type which seems hardly to exist in Windows; at least I haven't been able to find one. (Windows 98's search—Start menu, then click on the magnifying glass icon—does allow searches for text, unlike Windows 95's, which only allowed file name searches. But the result is only a list of files; each has to be opened to check the contents). This program searches entire directories for a text string, and returns not just the name of the file but a paragraph or so of text around the string, so you can see the text it's embedded in. Probably this anarchic feature makes it difficult for Windows; or perhaps most people just use games or something.

Example. To take a concrete example, I remembered reading, in a downloaded file, that Deborah Lipstadt's holocaust book doesn't actually deal with the points at issue. With the aid of Mr Norton's utility (version 4.2, something like 10 years old) and entering at the DOS prompt

TS C:\WORK "Deborah Lipstadt" /S

my entire hard disk, subdirectory WORK, is looked through for this string. And after a while, I duly found my quotation, from an article in the Skeptic.
      The output can be directed to a floppy disk, preferably a high capacity one, rather than writing to the hard disk, which is increasingly risky with modern disks. In fact, a batch file can be left running to search for many strings, and the floppy later hunted through. I personally recommend this for such tasks as (e.g.) finding whether 'cheer up' is a Shakespearean expression, (it is) or what Noam Chomsky thinks of hack American authors (you can search e.g. for Reinhold Niebuhr in your Chomsky files, or, for that matter, your entire collection of files). My personal typings-in include a lot of Bertrand Russell material, which I can search in the same way - of course it's necessary to have lots of downloaded texts, or material typed in yourself, on your PC for this to give useful results.
      As I say, I haven't found anything similar to this anywhere; it sounds absurd, especially as the idea isn't very difficult to program, but there it is. Recent Nortons seem not to have this, so far as I can judge by reading the advertising notes on their boxes. My version of TS.EXE is smaller than 20K bytes, about a twentieth the size of a typical small Windows utility.
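The idea really isn't difficult to program. The sketch below is not a reconstruction of Norton's TS; it is a stand-in written for illustration, which walks a folder and its subdirectories and reports each hit together with the surrounding text. Ignoring case and showing 200 characters either side are choices made here, not documented TS behaviour.

```python
from pathlib import Path


def search_tree(folder: str, needle: str, context: int = 200):
    """Yield (path, excerpt) for each occurrence of needle under folder.

    The search ignores case and includes subdirectories; the excerpt is the
    text within `context` characters either side of the match.
    """
    target = needle.lower()
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="latin-1")   # latin-1 decodes any byte
        lowered = text.lower()
        start = lowered.find(target)
        while start != -1:
            begin = max(0, start - context)
            end = start + len(target) + context
            # Collapse line breaks so the excerpt prints as one paragraph.
            yield path, " ".join(text[begin:end].split())
            start = lowered.find(target, start + len(target))
```

Usage would be along these lines, with the output redirected to a file if wanted:

```python
for path, excerpt in search_tree(r"C:\WORK", "Deborah Lipstadt"):
    print(path)
    print("   ", excerpt)
```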

[Back to top]


7. What is 'Usenet'?

      Other e-discussion groups and lists
If you'd like your own e-group, the easiest way to start is to use someone else's software. www.egroups.com allows you to start your own group. E-mails are automatically sent between members. Groups can be public or private. Most are private; of the public ones, many have a tiny number of members. Also emails aren't easy to view; only the titles are given, and they are presented in small batches, to maximise ad exposure, so emails tucked away in the middle are hard to find. (Free Agent lists all emails in a group, one after the other).
        listbot.com seems to have more to do with businesses, customers, clients. It says it collects the URLs of all your visitors, so you can email them all in bulk. It has a free version (with ads, which it promises aren't too intrusive).
        Listserv® hosts or catalogues lists; the present site has more than 32,000. Click here for the official catalog. Various search options are offered, including lists with more than 10,000 subscribers. This seems to be updated regularly; there are other sites, e.g. by Vivian Neou, which haven't been updated for several years; and there are specialist lists on language, computer software, and so on.
        Another site is Email Discussion Groups/Lists and Resources which is 'intended as a one-stop information resource about e-mail discussion groups or "lists", as they are sometimes called.'

[Back to top]


8. Tables and Frames and Pop-up Messages

Tables are one of the most useful features of HTML; they allow you to position blocks of text, and pictures, relative to each other, in a way which won't vary too much with different people's computers. They look complicated at first, but less so once you have the feel for the way rows and columns are defined. (I have used an old, free copy of HoTMeTaL just for its table generating facility. New versions don't seem to have this.) They're one reason for the rectangular feel of much of Internet, like old computer games, which always seemed to have characters either going right/left or up/down. With a bit of effort you can introduce curves to reduce that effect.
    To experiment, run your editor and enter <TABLE BORDER=1 WIDTH="100%"><TR><TD BGCOLOR="lightblue"><I>Something</I></TD><TD>Something else</TD></TR></TABLE>, which is one row (rows go across!) and two columns, as, with practice, you can see. Save this file with a .htm extension and view with a browser. You should get
Something | Something else
The border of 1 draws a thin line around the data. If the border is 0, as it usually will be if you don't explicitly give it a value, the table is far more difficult to visualise. Each table data entry is controlled separately from the rest of the table, as I've illustrated with the background color and font. Add some more rows with the <TR> command and put table data into them. The only other things needed are the ROWSPAN and COLSPAN commands; COLSPAN=2, for example, makes an item of data span two columns, not the assumed-by-default single column.
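If typing rows by hand gets tedious, a few lines of script can turn out the same markup. This is only an illustrative sketch: the helper names cell and table are invented here, and the second row is added to show COLSPAN=2 at work.

```python
def cell(text: str, colspan: int = 1, rowspan: int = 1) -> str:
    """One <TD>; COLSPAN/ROWSPAN are written only when they differ from 1."""
    attrs = ""
    if colspan != 1:
        attrs += f" COLSPAN={colspan}"
    if rowspan != 1:
        attrs += f" ROWSPAN={rowspan}"
    return f"<TD{attrs}>{text}</TD>"


def table(rows: list) -> str:
    """Wrap rows (each a list of <TD> strings) in a bordered table."""
    body = "\n".join("<TR>" + "".join(row) + "</TR>" for row in rows)
    return f'<TABLE BORDER=1 WIDTH="100%">\n{body}\n</TABLE>'


html = table([
    [cell("<I>Something</I>"), cell("Something else")],
    [cell("One item spanning both columns", colspan=2)],
])
print(html)
```

Paste the printed markup into a file with a .htm extension and view it with a browser; the second row should stretch across the full width of the table.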
    Tables used to be the only way to print text reversed-out, like this, i.e. in a small rectangle of different colour from the background. But now you can use the span style command. Tables can include links, so that, for example, you can arrange an upright stack of underlined words, any of which can be clicked on.
    Nested tables are the only way to arrange text in tidy blocks. Some search engines have more than a dozen tables (with borders set to 0, so the way it's done is hidden). To examine these, save the source code and then search and replace, setting (say) BORDER=1. When you look at the result with your browser, you'll see how the tables are fitted together.
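The search-and-replace just described can be done in one pass over saved source code. A hedged sketch follows: the function name is invented, and a regular expression is only a rough guide to real-world HTML. It changes explicit zero borders to 1, and adds BORDER=1 to table tags that give no border at all.

```python
import re


def show_table_borders(html: str) -> str:
    """Make every table's outline visible by forcing BORDER=1."""
    # Turn an explicit zero border (BORDER=0, border="0") into BORDER=1.
    html = re.sub(r'\bborder\s*=\s*(["\']?)0\1(?![\w%])', "BORDER=1",
                  html, flags=re.IGNORECASE)

    # Give a border to <TABLE> tags that have no BORDER attribute at all.
    def add_border(match):
        tag = match.group(0)
        if re.search(r"\bborder\s*=", tag, flags=re.IGNORECASE):
            return tag
        return tag[:-1] + " BORDER=1>"

    return re.sub(r"<table\b[^>]*>", add_border, html, flags=re.IGNORECASE)
```

Run it over a copy of the saved page, never the original, and open the result in your browser to see how the tables nest.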
    It's often better to set table width as a percentage, rather than a fixed number of dots, since in that way it will fill the same portion of the screen irrespective of the way it's been set up.
    If you'd like different column widths, e.g. a narrow left column with several wider columns to its right, the syntax is e.g. <TD WIDTH="15%">. Unfortunately, there is a problem with Netscape Communicator, which is less well worked-out than Microsoft's Explorer; the column widths are unpredictable, even when correctly set. There's no easy way round this; you might try (if desperate) defining all columns the same width and combining groups of them with the COLSPAN command. You might also try an invisible GIF, stretching it sideways, but this won't work for people with their screens set differently.

    Frames—where the screen appears to be cut into independently-scrolling areas—need the most complicated HTML of ordinary websites. (Example: see my McLibel site). Frames tend to be slow, since at least three HTML files are necessary—the first to tell the other two what to do. Ease yourself into this by examining a site you like the look of—you'll need to right-click on the component files to separate them all out.
    The easiest way known to me to generate frames is to use the free program Top Dawg. This has a 'frame wizard' which, if not in the Gandalf class, at any rate is fairly effective. (Download and instal his program—PCs only, I think; click on the Frames tab and select the wizard icon). It allows horizontal and vertical frames, and allows you to alter their relative sizes. You still have to work out your URL names. (After typing this, I found Kevin Gunn's free Web-o-Rama has a similar facility).
    The basic idea is fairly simple: there's a 'home' or starting page, which defines how the screen is to be subdivided; there must of course be a minimum of two other files to display, making a minimum total of three files—one to control things, two others to be displayed. The traditional main use is to have a fixed menu down the left margin, while the right portion of the screen is allowed to scroll. My McLibel example shows a different use - it has two scrolling windows, so readers can compare texts side by side. Windows can be defined so they won't scroll; and, as with tables, borders can be selected to show up, or not, so you have the option of making the joins invisible. There's endless scope, including windows within windows. You can provide an optional menu bar; my modern biology fraud 'Why You May Die of Cancer' site (incomplete!) shows this—the main file loads first, then the user has the option to instal a menu bar, loading up the two control programs. So they can ease navigation if they want to. Once the main file is already cached, the overhead isn't very great. (See 'orphan files' below for outline of how to do this; if someone emails, I'll put more detail. Otherwise I won't bother.)
    Another use of frames is the free-floating frame, where a link is displayed in a new window. This uses the same sort of standard HTML code, like this: <a href="page_you_want_in_a_window.htm" target="_blank">Click here</a>. The window will be the previous default size; I haven't found whether the position and shape can be set easily. (Note: when you're using e.g. Explorer, you can open something you're interested in inside a smaller frame by right-clicking on it and choosing 'open in new window'. It's the same idea.)

    Problems with Frames

  1. Loading time: since there must be at least three files, it's best to keep their size down. Clearly, if you're aiming for a hit on your site to generate its front page with not more than 20 or 30K, all three or more files for the frame must add to about 20 or 30K, and even so, since three or more files have to be opened individually, the result will probably be slower than a single file of the same total length. The moral is: if you have long files to load, it's probably better to include them with clickable links, so the user will have a second wait after the main framing has been loaded. (Netscape's Help file illustrates the problem of slowness when several hefty files all have to load into frames). Remember slow modems can be a problem—but so can slow computers, once the material is loaded. It's only recently that true surfing has become possible.
  2. Orphan files: the subsidiary files are perfectly OK in their own right. If, perhaps as a result of a search engine hit, or someone giving the URL, a file intended for a frame is loaded, it will be perfectly good, BUT the user mayn't realise it's intended as part of a larger scheme. You can get round this by including this sort of thing: <a href="home_frame.htm" target="_top">This file belongs in a frame; click here to display</a> where the "_top" loads the uppermost file when the underlined text is clicked on.
  3. Search engines can allow for frames, but it's difficult, especially with nested frames where the highest level is hard to find. Probably some index all the files; others perhaps don't spider the nested levels. For an HTML author, the easiest solution seems to be to put titles and keywords into all the files, and some text in the home file—traditionally something along the lines that, if you haven't got a frames-compatible browser, you won't be able to see the files properly, then put links, with commentary, to all the displayable files, which with luck ought to make the whole thing searchable in a useful way.
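To make the three-file arrangement concrete, here is a sketch that produces a minimal frame set: one control file and two displayed files, each of the latter carrying the 'orphan' link from point 2. The names menu.htm and main.htm, the 20%/80% split and the page text are all invented for illustration; home_frame.htm is the name used in the example above.

```python
def frame_files() -> dict:
    """Return {file name: HTML} for a fixed left menu beside a scrolling page."""
    orphan_link = ('<A HREF="home_frame.htm" TARGET="_top">'
                   "This file belongs in a frame; click here to display</A>")
    home = (
        "<HTML><HEAD><TITLE>Home</TITLE></HEAD>\n"
        '<FRAMESET COLS="20%,80%">\n'
        '  <FRAME SRC="menu.htm" NAME="menu" SCROLLING="no">\n'
        '  <FRAME SRC="main.htm" NAME="main">\n'
        # Text and links for browsers and spiders that ignore frames.
        "  <NOFRAMES><BODY>\n"
        '    <A HREF="menu.htm">Menu</A> | <A HREF="main.htm">Main text</A>\n'
        "  </BODY></NOFRAMES>\n"
        "</FRAMESET></HTML>\n"
    )
    menu = ('<HTML><BODY><A HREF="main.htm" TARGET="main">Main text</A>'
            f"<P>{orphan_link}</BODY></HTML>\n")
    main = f"<HTML><BODY><P>Text goes here.<P>{orphan_link}</BODY></HTML>\n"
    return {"home_frame.htm": home, "menu.htm": menu, "main.htm": main}


def write_all(folder: str = ".") -> None:
    """Write the three files into folder so a browser can open home_frame.htm."""
    for name, html in frame_files().items():
        with open(f"{folder}/{name}", "w") as handle:
            handle.write(html)
```

The NOFRAMES part doubles as the plain-text fallback mentioned in point 3, which gives spiders something to follow.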

    Pop-up Messages (Alerts) can be easily displayed by putting onload='window.alert("Your message")' into the <BODY> statement along with background and link colors. When the HTML is first loaded, an official looking warning message in a box pops up, and remains until the reader clicks OK.
[Back to top]


9. Graphics

[In case you haven't worked this out, right-clicking on images, including animated ones, and including backgrounds, allows them to be saved onto your disk, with the original name or one you choose. These will usually be someone's copyright. (The process with Macs is different—in fact, I don't remember what it is). Another possibility is to look through your Windows files, with your 'My Computer' utility, in the temporary browser (or something) subdirectory, where you may be able to transplant a graphic image.]

Background graphics. My background here is a simple cyan horizontal line on white; it's a tiny file. Large backgrounds may look impressive, but have two problems - (1) downloading time is longer, (2) they look different for people whose screens are set differently: you may of course simply assume most people use 800 by 600.
    If you have Microsoft Office, C:\Program Files\Microsoft Office\Clipart\Publisher\Backgrounds has 30 or 40 backgrounds. If you view them (start with 'My Computer') you'll see many are surprisingly pale. By experiment you'll find that backgrounds have to be weak-looking; the repetition ensures the brain will notice them.

What follows are notes for people who scan in or draw their own images, but find the resulting files too large and want to shrink their pictures. In my opinion, graphics files ought to be cut as much as possible, to free up bandwidth and generally cut down timewasting:—

Useful tip: to capture a screen image (when you're running a program), use Alt- or Ctrl- Print Screen (if the printer's off!); this should put the screen into the clipboard. Open your image-processing program, select edit, select clipboard, and save the clipboard image onto your disk, from which it can be reloaded, cropped, and edited.

If you want a fancy smallish logo or eye-catching picture you may be able to edit a character from Wingdings or a similar typeface. If your image processor allows lettering to be drawn with a shadow, the normal picture—star, comet, phone symbol—can be given a striking 3D appearance.

Note on animated graphics - where part of a picture moves. Animated GIFs (only the .GIF format allows animation; this is because JPG images aren't bitmapped, but divide the image into rectangular areas) are fun, but watch for the overdone 'insect house' effect. They're best used for a definite instructional or attention-getting purpose: say, making some bottom-line figures flash red. Note that animated GIFs work in backgrounds (e.g. twinkling stars can be effective), but you have to be careful if the individual pictures are different sizes.
    An easy way to experiment is to download Microsoft's free GIF Animator. With this you can load and look at stored animated GIFs which you have (you'll almost certainly find some in the Windows cache files). This software also lets you look at non-animated, simple GIFs—and you can set one colour to be transparent with it, so the background shows through, as in my shooting stars graphic. You can also add comments to GIF files—for example, your name. And, though this needs some familiarity with graphics processors, you can design your own portions of GIF and adjust their position to give the animated effect you want. If you can't be bothered with this, cheap CDs are available from e.g. Xoom and Jayde with thousands or millions of these images. Free sites include e.g. Xoom clip empire, rather small for an empire, including e.g. Java buttons, Animfactory which has downloadable free animated graphics, essentially a taster for their CDs, Free graphics, 'over 300,000', with many links, designed for their convenience, and their advertisers', Caboodles, 'over 1.4 million images'.
    There's a site which states that Unisys has copyright of the GIF format, and that anyone using it may be liable to penalties. I don't know the legal status of this site. However, you have been warned!

[Back to top]


10. Sound. Video.

I've avoided sound in the past. Noises, voices and music need large amounts of data, and, therefore, possibly overlong download times. However, here are my outline notes:

Video. Capturing videotape into computers needs special equipment. In order to make computer systems work at all, small pictures are necessary, and there's intensive compression. The result is technically quite a feat but nowhere near as good as the original. I'll only mention that .mov files are a 'Quick Time' format. Another format is .mpg, I presume a moving version of .jpg, as the same distortions creep in as with jpg still images. Windows media player plays these. However, it does not play Real Player's own format, .ram files.
[Back to top]


11. Creative use of old software

I'm afraid this is a plug for another antiquity, which personally I find more useful than most Windows software, namely the DOS version of WordPerfect. This has

[Back to top]


12. Backups to larger format 'Floppy disks'—ZIP and other disks, & writable CDs.

For people who haven't noticed, the Windows 'File Manager' program and the Windows 95/98 'My Computer' click-and-drag copy do what Microsoft copy programs should always have done, but didn't: namely, to allow for overflow of data. Thus, to back up the entire contents of a hard disk folder, My Computer/File Manager's copy will ask for new disks to be put in until the process is finished. This is useful for people who distrust Microsoft's Backup. The early versions were very user-hostile, one of the most unpleasant programs I've used, with worrying warnings particularly in compressed mode about the use of RAM and the initial settings. To store data in compressed form, and allow files to overflow from one disk to the next, these programs need additional data files, and if these are lost, or if the compressed data becomes corrupted (as they say) in some way, you may find it impossible to restore your data.
    Be cautious: avoid this program and do straight, readable copies.

    Rewriteable and write-once CDs: There's an important difference. Rewriteable CDs (the expensive ones) behave like ordinary magnetic media. BUT write-once CDs are trickier: each time you write something, the records of where the file is stored and where its sectors are have to be altered. Since the disk is write-once, there's no fixed place on the disk for this information. This is the reason that the disk has to be made ready for writing, or made ready to be readable by an ordinary PC. The process of converting a write-once disk to mimic an ordinary CD can be slow; it's also slow to convert it back to allow writing.
    Note that writeable CDs may not be checkable with Scandisk; if there's a mistake in them, for example with cross-linked files or some other storage problem, you may have no easy way to detect and correct this. If a CD behaves oddly, you may do better to scrap it.

    NB the sector size on Zip disks is remarkably small—much smaller than typical hard disks, which now have a minimum stored file size of 32K, unless you use the Windows 98 system, which I don't personally recommend, as you may come up against software incompatibilities. So, many small files can be copied onto an unexpectedly small number of such disks. [Note: Zip disks have no connection with zip files, which are files in compressed form, i.e. manipulated by software to be short but expandable to their original length.]
    Zip disks are difficult to undelete—they don't show up in the 'recycle bin'. (You can undelete by going to DOS, LOCKing the zipdisk, and using UNDELETE). This seems a minor reason to prefer large format 'superfloppies'.
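The point about minimum stored file size can be put in figures. In this sketch every file is rounded up to a whole number of allocation units; the 32K unit is the hard disk figure given above, while the 2K unit and the batch of 1,000 files of 3,000 bytes are assumptions chosen purely for contrast, not measured Zip disk values.

```python
import math


def disk_space_used(file_sizes, unit_bytes: int) -> int:
    """Bytes taken on disk when each file is rounded up to whole units."""
    return sum(math.ceil(size / unit_bytes) * unit_bytes for size in file_sizes)


# Illustration: 1,000 small HTML files of 3,000 bytes each (3,000,000 bytes of data).
files = [3_000] * 1_000
on_hard_disk = disk_space_used(files, 32 * 1024)   # 32K figure from the text
on_small_unit = disk_space_used(files, 2 * 1024)   # assumed 2K, for contrast only
print(on_hard_disk, on_small_unit)
```

On these assumed numbers the same files occupy eight times as much room under the 32K scheme, which is why a folder that looks large on the hard disk can fit on fewer removable disks than expected.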

[Back to top]


13. Free Sites. Small Print. Free Software. Free Newsletters. News.

The leading free sites seem to be Fortunecity, Geocities, Tripod, and Xoom. They all claim to have huge numbers of members. Some search engines have free sites as a sideline, e.g. MSN and Jayde. Many of the free websites aren't much more than diary jottings; but some are substantial and interesting. Sometimes you can investigate such hosts by guessing site names, jim.htm for example. Unfortunately, as with other things, you tend to get what you pay for (if you're lucky): all these sites are paid for indirectly e.g. by ads from outside sources, internal ads and promotions, phone subsidies, on-line sales, or being partially locked in. Geocities sites have a small window which opens and displays ads; Xoom has a different system with banners, and specialises in computer sales of discounted software and hardware items. It might in principle be possible to use these sites just for storage, for example, of big files, which could be downloaded free for the user. Unsurprisingly, they've thought of this and discourage such use.
    Freespace.net is a directory, one hopes reasonably complete, of 'free' sites, classified by category (e.g. business, non-profit) and country and language, whether they allow CGI processing, and by Mb of space offered. I doubt whether they're forthright about the downside. A typical larger site is www.hypermart.net, for business websites.

But.. small-print warnings:
      Freeyellow says 'Your... Website is FREE.. if you register your custom domain name.. it will only be $149.50... This will get your custom address setup and HOSTED.. An address MUST BE HOSTED.. or it will not exist. Then there will only be a nominal $29.50 per year maintenance fee...'
      Tripod (I was told by e-mail) is free for a year 'to Premium members'. However, the actual site didn't seem to say so; nor did it say what happened after the year ended.
      Xoom in its small print says: '... If any third party brings a claim, lawsuit or other proceeding against Xoom based on your conduct or use of Xoom services, you agree to compensate Xoom (including its officers, directors, employees and agents) for any and all losses, liabilities, damages or expenses...' I was also unable to find a clear statement about prices; there's lots of 'join free' stuff, but I wasn't able to find that remaining a member was free, too. Xoom asks for complete freedom to use any hosted webpages (but only for ads or promotion). Be careful over copyright claims on your site.
      Yahoo! managed to achieve fame in tandem with Geocities (these tie-ins are likely to become more popular if browsers move toward stricter use of search engines they choose to promote) by claiming virtually all rights of exploitation—copyright, imitation, derivation, etc. etc.—over such sites. Another possibility (which I haven't checked) is that directories like Yahoo! might claim rights over sites which they include. Two URLs I have are boycottyahoo, and a supposedly helpful page by Yahoo!, http://docs.yahoo.com/docs/info/toshelp.html. In any case Yahoo! irritated me by removing my Vietnam War Crimes site from their 'directory'. Possibly the time is overdue for a revision of their credibility, in a downward direction. They now (early 2000) have some link with Reuters, which will no doubt worsen them.

      In the UK (early 2000): even the pay sites give little support (I think Demon, one of the most popular, gives no support to people struggling with websites; Prestel certainly don't, except in the sense of providing a group for people to quiz each other. Nor is this surprising, as a beginner could consume almost indefinite amounts of time wrestling with HTML.) Among UK free services, Freeserve is hosted by an electronics hardware company. Another, Tesconet, is hosted by a supermarket chain, as is LineOne. Amazon has one called TheSeed. I once met Richard Branson (his car had broken down), who now has VirginNet with a sort of white-on-white styling. It's quite hard to investigate these sites without signing on to them, which may incur some sort of liability; your hard disk may have odd things done to it. All I've established about Freeserve is that it has twice as many members in the UK as AOL (signed up in a much shorter time) but that it's not as easy as it sounds to set up a website; I'm told users can phone for a long time without being 'processed'. I don't know how much advertising has to be tolerated, nor how difficult it is to break out of their self-enclosed system. AOL's arrangement suggests that many users probably never realise how fenced in they are. I recently found zyweb which offers HTML outlines, so you can make your own sites relatively painlessly by copying their examples. I'm told it's free but slow.

    Free Downloadable Software: Dutch free site has free material for webmasters. Coolboard is a 'free customized message board'. Freeguestbooks is a Norwegian site. 'Guest Gear' is a guest book by Lycos: Guestbooks are interactive e-mails, so visitors can leave messages which other visitors can read. Guest books are a relatively easy way to make your site interactive; however, they tend to be a bit slow. Be Seen, part of LookSmart, says it has free web design software, counter, site searcher, chatroom etc. Webmaster-resources has articles etc. Microsoft's siteholder (click on products link) had a "wealth of free tools"—doesn't sound quite like Microsoft, does it?—but seems to have gone away. There's a set of sites promising free or shareware files, nearly always with barely enough description of their programs to be useful to you—you may find a piece of software has some problem which makes it unusable to you; for example, it may be outdated or have incomprehensible instructions. Software may have irrelevant advertising or bugs. You may need PERL, UNIX, or other things. (If this happens, why not email the site and tell them? They may genuinely not know). So don't be surprised if interesting-sounding programs turn out not to be there, as out of reach as fairground prizes. Some sites may alter the way your computer downloads, or do unannounced things with cookies, or have other annoying effects. Some programs are just tasters, and not 'free'. So beware. Sites include Shareware, Freecode, Winsite, Completely Free Software, Dave Central (in many ways absolutely typical—huge lists, but it's impossible to tell from the descriptions what the stuff is really like to use—a sort of used-car approach), Filez, Free Stuff Factory, Filepile, Nonags, Only Freeware, 'The Freeware Publishing Site', ZDnet. I recently found FreeSiteTools which is interesting. Some freeware is very good: see below in Graphics on 'IrfanView'. 
Forte Free Agent (for Usenet groups) and Pegasus (for email; a lot of useful features, for example a selective download of headers only, allowing you to not download spam, and easy separation into batches for your future reference), are two other popular freeware programs. I think Listbot is free (a way to manage an e-mailing list; I've never tried this. Check Majordomo for the same function). I quite like WebCopier by Maxim Klimov as a site grabber, to get a snapshot, stored on disk, of the whole of some interesting site at some date. (Click for free download. Warning—this software is not very easy to use, and has bugs, though usually it's OK. You must enter a URL address as a starting point, and remember to click on 'start download', which is hidden under the name of a 'project'. You'll need to click on 'URL filters' to download sites hosted away from the URL. Add the date to the title you give your file, so you have evidence of when it was grabbed—sites change. The folder will be in a subdirectory of c:\Program Files\Webcopier, and the whole folder is relocatable). Another site grabber, also free but with ads, is NetVampire, but for my taste it has a very user-hostile interface. 'Hot Dawg' on Arthur Smith's site has uses—see Tables and Frames, below.
      Magazine cover CD-ROMs have software of varying usefulness. The best advice I can give is that software from a brand name you recognise and like may be the best for you, even if it is supposedly obsolete.
      (Free graphics—see above.)

      Free newsletters can give useful snippets of information; all you have to do is ignore the stuff you don't want, discount repetitive stuff if you've seen it before, and be cautious about hype. They are emailed to you if you ask for them, or 'subscribe' (on the Internet, this usually means free). They all make it easy to unsubscribe, by sending an email. The ones I've found are, alphabetically: Add Me! (claims 180,000; weekly; web tips of varying levels, but often sensible), AOL has a newsletter (surprise! It's entirely concerned with AOL and its sites, and sales links), Internet Day (daily; claims 150,000; part of Jayde; articles supplied by readers, generally as thinly-disguised hard-sell promotions), Lycos News (somewhere on their search engine), NetDummy (sic; the link is an email address; send a message to subscribe. Daily? I seem to have lost my details of this), NetMechanic Webmaster Tips (once a month?), Webmastersonly, SitePoint (by far the smallest; has lots of computer and website news links), Website Journal (weekly by Website Garage, coupled with Netscape; often sensible Webmaster info), Xoom Newsletter (from Xoom; mostly hardware and CD ROM offers).

      News. Or what passes for news. Some sites are news.com, andovernews.com, and isyndicate.com. This is the Drudge Report, a sort of conventional meta-look at all the third-rate sources of the modern media, such as the New York Times.

[Back to top]


14. Counters

The good news is that counters exist, and can in principle tell you how many visitors you have. The less good news is that service providers don't seem to like them, because they take up computer processing time which could be used for better things, and also perhaps because they offer scope for people to try running programs in the heart of their machines; these are the ones with 'cgi' in their code. Having said that, some service providers (Prestel, for example) permit their users this facility. (The providers themselves of course have this sort of information, including the total number of bytes downloaded from each site; perhaps if you ask nicely they might give you some of it). All the commercial (i.e. advert-paid) counters I've seen on other people's sites have been painfully slow and never ever seemed to register their information.
    You may be told that counters (e.g.) count the graphics of your site, or count repeatedly each time a browser moves through a page. It depends on the programming; they may, or may not. If you're in doubt, experiment to see, preferably asking someone else to download from your site. You may get different results if a browser is set so pages aren't stored in caches, but I think this must be unusual, since it's such an advantage from the speed point of view.
    NedStat is a free counter which looks good (based in the Netherlands).
    TheCounter also looks good (and is easy to instal).
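Whether graphics and repeat visits get counted is easier to see with a small experiment. What follows is only a rough sketch in Python, not how any of the counters named here actually work. It assumes your provider will give you raw log lines in the widely used Common Log Format; the sample lines, the list of graphic extensions, and the rule "only status 200 counts as a page view" are all my assumptions for illustration.

```python
import re
from collections import Counter

# Assumption: log lines follow the Common Log Format.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3})')

# Assumption: these extensions mark graphics rather than pages.
GRAPHIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")

def count_hits(log_lines):
    """Return (raw_hits, page_views) as Counters keyed by requested path."""
    raw_hits = Counter()
    page_views = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if not match:
            continue  # skip lines that don't look like requests
        path, status = match.group(1), match.group(2)
        raw_hits[path] += 1
        # Count a page view only for successful, non-graphic requests.
        # A 304 reply means the browser reused its cached copy.
        if status == "200" and not path.lower().endswith(GRAPHIC_EXTENSIONS):
            page_views[path] += 1
    return raw_hits, page_views

# Invented sample lines: one page, one graphic, one cached repeat.
sample = [
    '10.0.0.1 - - [20/Aug/2000:10:00:00 +0000] "GET /index.htm HTTP/1.0" 200 5120',
    '10.0.0.1 - - [20/Aug/2000:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 200 2048',
    '10.0.0.2 - - [20/Aug/2000:10:05:00 +0000] "GET /index.htm HTTP/1.0" 200 5120',
    '10.0.0.2 - - [20/Aug/2000:10:05:30 +0000] "GET /index.htm HTTP/1.0" 304 0',
]
raw, pages = count_hits(sample)
print(sum(raw.values()), sum(pages.values()))
```

With these four invented lines the raw total is 4 but the page-view total is 2, which is the sort of gap you should expect between a counter that counts everything and one that counts pages only.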

Commercial counters include these, none of which I've tried:
      Cool counter. The Basic edition is 'by far the most popular'; it's free, counts any number of pages, but has a banner you have to display. Extreme Counter, with no ads, sounds cheap.
      Fastcounter is another. So is Hitbox which claims to give lots of information; if you put an intrusive green symbol on your site, you can get it free, at least in theory (I couldn't get it to work, twice!) and get yourself listed on yep.com, their in-house search engine. There's Watchwise.
      Getstats is another.
      Pagecount (says it's free but with ad banner; appears to count one page only, and has up to 100 origins of hits) is yet another.
      And Showstat and Sitemeter and Sitetracker ('.. some of the most detailed..') are others. And Stat Track is another.

[Back to top]


15. Site Searchers

The idea is to generate an index for your personal site; when interrogated with a keyword, it should list your files which contain it, in standard format—file name, file title, description. For example, you might search for Constantine or Sherlock or Napoleon or, with a good site searcher, complicated things like chemical warfare. And the result will be a list of files just from the one site. If you have a precise subject in mind which you're fairly sure the site deals with, then a site searcher is a valuable tool, especially with very large sites. However, if my own experience is anything to go by, this feature isn't popular with most users (possibly the time delay has something to do with this)—so if you don't have one, you may not lose much.
      A searcher is supplied as Javascript, and the user has to cut it and paste it into his source HTML code. (You can do this with Word, but be careful to use only text mode). I've experimented with Pinpoint, but couldn't get it to work properly. Hotbot seems to have such a facility, which however I couldn't get to work either. What-U-Seek has a promising free one; in fact I recommend it. It has a password system, and quite a bit of freedom in formatting - you can include your own picture, background graphics, and choose font details of the search results. Many of the check-your-HTML sites (some listed above) offer this. A variant is to have a searcher which operates normally, though obviously this offers your site little advantage. 'Personal Open Directory' is Yahoo-like. There may be a Mac searcher on Ultradesign though I'm uncertain exactly what this company does. Lucene says it's a free downloadable search engine; written in Java. But I haven't checked this in any way. Atomz has a site searcher; I haven't checked it. With this type of feature, if it's free, you will have to put up with banner adverts. If you pay, you can be ad-free. You'll usually have to tell the host site when you want your site indexed, and whether there are files you want unindexed. There may well be restrictions on site size.
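To make the idea of a site searcher concrete, here is a minimal sketch in Python. It is not the method used by What-U-Seek, Atomz or any other service named above; it simply reads each page, notes the file name, title and description, and lists the pages containing a keyword or phrase. The two sample pages and their file names are invented, and the tag-stripping is deliberately crude, so treat it as an illustration only.

```python
import re

def strip_tags(html):
    """Crudely remove tags and collapse whitespace; adequate for simple hand-written HTML."""
    text = re.sub(r"<[^>]*>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

def describe(filename, html):
    """Collect file name, title, meta description and searchable text for one page."""
    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    desc = re.search(r'<meta\s+name="description"\s+content="([^"]*)"', html, re.I)
    return {
        "file": filename,
        "title": title.group(1).strip() if title else "",
        "description": desc.group(1) if desc else "",
        "text": strip_tags(html).lower(),
    }

def search(pages, phrase):
    """List (file, title, description) for every page containing the phrase."""
    wanted = " ".join(phrase.lower().split())
    return [(p["file"], p["title"], p["description"])
            for p in pages if wanted in p["text"]]

# Invented two-page site, for illustration only.
site = {
    "vietnam.htm": '<html><head><title>Vietnam War</title>'
                   '<meta name="description" content="Notes on the war">'
                   '</head><body><p>Use of chemical\n warfare is discussed.</p></body></html>',
    "history.htm": '<html><head><title>History</title></head>'
                   '<body><p>Napoleon and Constantine.</p></body></html>',
}
pages = [describe(name, html) for name, html in site.items()]
print(search(pages, "chemical warfare"))
print(search(pages, "Napoleon"))
```

A real searcher would build its index once, when you tell the host to index your site, rather than rereading every page for each query; the sketch skips that step to stay short.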

[Back to top]


16. Security, Downloading, and Software problems with your computer

msconfig. It's worth knowing that you can control your start-up files and the way your PC is configured. If your computer has picked up software which insists on loading itself, click Start | Run and type msconfig in the letterbox. Click on OK and pick 'Selective startup'. You can turn off 'load startup group items' which gives you a normal desktop, except that it's clear of the programs which are usually automatically loaded. (This may mean e.g. your scanner is turned off).
    And you can tinker about with config.sys, autoexec.bat, win.ini and so on. Warning: I accept no responsibility. You must know what you're doing! Sometimes it's possible to make useful adjustments with this option; with some poking around you may well find you can turn off software you don't want.

It's also worth knowing that you can check what your computer is doing when downloading (or uploading) on your phone line. This is particularly useful with a long file (for example, a .pdf file) where there's no progress bar to show you what's happening. Right-click the modem indicator typically in the bottom right of your screen (my picture shows what it looks like), and select 'status'. Bytes received, and bytes sent, are shown to you. If both stay fixed for any length of time, someone's system may have been locked up or turned off.

Restart. Known as a 'warm boot'; the idea is that programs which have somehow got themselves in a tangle can be closed down or reset. The effect is as though the computer had just been turned on. Recommended if (e.g.) your windows don't display properly, or there are strange conflicts preventing scanners, printers, external disks, or even the keyboard from working properly. Click on Start | Shut Down | Restart. Any data in memory will be lost.

Scandisk (mentioned above) is an important first-line check on your computer. It tests that files are stored correctly. If you switch off a computer without closing it properly, there's a risk that incomplete files may be left on your computer, which may later get tangled up and cross-linked with new ones. Scandisk (Start | Programs | Accessories | System tools | Scandisk) ought therefore to be run from time to time. Tip: Windows can run several programs simultaneously; you may find Scandisk is intolerably slow, presumably because files repeatedly get changed. So switch to DOS mode [Start | Shut Down (sic) | Restart in MS-DOS mode] and type, usually, SCANDISK C: after which exit or win or windows will return you to the desktop.

'Viruses' are the not-very-accurately named bits of program which are designed to attach themselves to other people's computers. In fact many are inept copies of other people's ideas; many don't work. The chances of getting a virus are low; obviously, the people designing PCs must have had this possibility in mind, and there's a strict division of files into those to be viewed, and those which run as programs. My guess is that more PCs are damaged in some way by programs with bugs than through intentional damage.
    There have been recent scares about 'attachments' to e-mails. I don't know if these scares were in fact valid. 'Pegasus' allows you to look at the headers of e-mails without downloading the body, giving some assurance. But if you're worried about this, don't open attachments, but e-mail back and ask for the information to be sent as an ordinary e-mail.
    Other things you can do are to pay for a virus detection program, and/or to consider doing your main work on a computer which isn't connected to anything else and where suspect disks aren't permitted.

Firewalls. This is a fancy name for software which is supposed to check on data being imported down a phone line into your PC. In my view, it's possible for, say, Microsoft, or anyone else, including firewall programmers, to incorporate special unannounced stuff in their own software; so this cannot be completely foolproof. If you want to experiment, try Zonelabs, who offer ZoneAlarm free to non-business users. As always, the instructions are a bit obscure; what exactly does 'locked' mean, for example? But, when surfing the net, you can have the pleasure of turning down companies that insist on probing your PC. A pop-up alert shows when something's happening; and the URL of the contact is listed, in number form, so if curious you can try to see what was happening. You can stop individual programs (e.g. AOL's) from contacting the internet. You can also set the program to prevent any access when the screen saver is on. It also says it checks for Visual Basic attachments to e-mails. (I think this company also has an exclude-ads program, to prevent banner ads from well-known ad sources from downloading).

Keep backups!

[Back to top]


17. Domain Names (.com, .org, .net and others)

Everyone knows that .com sites are supposed to be commercial, .org sites are organisations, .net sites are something vague to do with the Net, and .gov, .mil, .edu have their own uses. It's still true that .com sites are by far the best known, so, even if your site isn't commercial, there's a lot to be said for a dot com address. One convention is to avoid making people guess whether dashes or underscores are used, and simply use a name without spaces—www.normanfinkelstein.com illustrates the principle.

After some unhappiness over a US monopoly, there's been a mushroom growth of hosting sites.
      If your ISP is one which hosts domain names, and you're happy with your ISP, all you have to do is choose your name, register it, and do what's needed to make it all work.
      If your ISP doesn't host domains, the easiest thing is to keep your site where it is and register its URL somewhere else; most such sites (as far as I know) allow you to redirect any hits to your site, however long the name is. You can also move your site to another ISP without much disruption; all you do is change the redirection to the new address. There are (inevitably) complications: you may be allowed the option of keeping your domain name visible in your visitor's browser; or you may choose to allow it to display the real address. I'm uncertain whether any domain hosts allow more than a handful of subdirectories to be shown. (Example: if you are www.johnsmith.com you can sometimes allow johnsmith.com/computers.htm to redirect properly, but you may not have many such options).
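The redirection idea can be pictured as a simple lookup table kept by the domain host. The Python sketch below is my own illustration, not any host's real system: the ISP address is invented, and the rule that only a short list of pages is forwarded individually is an assumption standing in for the "handful of subdirectories" restriction mentioned above.

```python
from urllib.parse import urlsplit

# Hypothetical forwarding table of the kind a domain host might keep.
# The ISP address is invented for illustration.
FORWARDING = {
    "www.johnsmith.com": "http://www.example-isp.net/users/jsmith/",
}

# Assumption: the host forwards only these pages individually and
# sends every other request to the home page.
FORWARDED_PATHS = {"/computers.htm"}

def redirect_target(requested_url):
    """Return the real address a visitor is sent to, or None if the domain isn't registered here."""
    parts = urlsplit(requested_url)
    base = FORWARDING.get(parts.netloc.lower())
    if base is None:
        return None
    if parts.path in FORWARDED_PATHS:
        return base + parts.path.lstrip("/")
    return base

print(redirect_target("http://www.johnsmith.com/computers.htm"))
print(redirect_target("http://www.johnsmith.com/other.htm"))
```

Moving to another ISP then amounts to changing the one entry in the table; visitors keep using the same domain name.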
      Several warnings:
      I'm told that if you check a likely domain name and find it's available, you may then lose it, because the name may be bought speculatively by someone with access to checked-out names. So, if you think of a terrific name, you might do well to be cautious in checking for it, and perhaps be prepared to register it quickly.
      Once your name is registered, although it's theoretically yours, it's physically processed by someone's computers, and they can cause problems by not responding to queries, not answering emails, and so on. They may impose charges for moving to another site. It may be difficult to alter such details as phone numbers. Unfortunately the situation changes fairly fast; it's impossible to advise on the best hosts.
      Hardware: these companies use specialist equipment which is barely known outside the smallish number of people who work with it. So far as I know there's no way to predict who will have the best stuff in a year or two's time. Another complication is resellers, companies who appear not to host sites, but act as sales outfits. UK2net (in Britain) is an example; it is a reseller of joker.com, of Germany, and relies on ads which it's hard to describe as other than misleading. Unless you enjoy wasting your time, read the small print and take care.
[Back to top]


Keywords, key phrases for search engines: Alta Vista, AOL, autoexec.bat, be a webmaster, beginner, computer hints, config.sys, configuring, domains, domain names, free stuff, GIF, JPG, graphics, HTML, HTML edit, Infoseek, Internet, keywords, learn HTML, links, meta searchers, Open Directory, free computer hints, computer tips, optimize, fast load, faster loading, firewall, help, free help, how do I, how to, list, search engines, teach yourself HTML, Internet programming, Internet, msconfig, PC, sound, Sound Recorder, spiders, system.ini, URL, URL submit, win.ini, Yahoo, free sites, my site, my website, websites.
HTML Rae West. All rights reserved. First uploaded 98-05-17. Thorough revision 99-02-08. Browsers, getting onto search engines, site searchers added 99-06-14. Concept suggestions, revised searchers 99-07-03. Frames 99-07-31. Newsletters, other 99-08-05. Open directory 99-09-26. Backup 99-10-13. Sound 2000-03-27. Egroups 2000-06-16. Last revn 2000-08-20. Links to this site are very welcome. After a few months' gap, this was reinstated on 21 Dec 2014. Viewport added 2015-06-21. CoffeeCup comments 4 March 2020

This site is big-lies.org