HTML, Web Information, Hints and Tips: Simplest Websites. No Java, no apps, no WordPress, no Joomla, no plugins
© Raeto West 1998, 1999, 2000, 2014
These free but 15-year-old computer hints do NOT include packages such as WordPress. They were aimed at people who want to get their ideas on Internet, and at those wishing to make thorough use of the Net. The HTML still works; people wanting to write simple .html might find my comments useful - for example, in writing indexes to all their files, so people can find them.
This was my own notebook: I've found the information, which I've assembled quite carefully, very useful, and some of it indispensable. I've uncompromisingly assumed some experience with PCs. My suggestions are somewhat idiosyncratic; they are low-budget; there will be mistakes - apologies for them. No legal responsibility accepted. - Rae West
E-mail anything @ big-lies.com if you like this site, or have a technical crit.
Internet offers something approaching a level playing-field for the spread of information, perhaps for the first time ever. Here are my tips on designing a social-conscience, reform, or informational website.
Social Conscience/ Reform/ Informative Websites: Some Possibilities
<TITLE>Brief title of your site - about ten words before the end-of-line cuts it off; say 60 characters maximum. All search engines display this.</TITLE>
<META NAME="DESCRIPTION" CONTENT="Description of your site; about 25 or 30 words, say 150 characters maximum. Many search engines display this.">
<META NAME="KEYWORDS" LANG="en" CONTENT="List of keywords separated by commas. English language, or whatever, optional. Maximum of 800 characters or so">
<BODY BGCOLOR="color" TEXT="color" BACKGROUND="image.jpg" LINK="blue" ALINK="red" VLINK="purple">
ALL YOUR STUFF TO SHOW ON THE SCREEN.
The first words may be displayed by some search engines, so choose them carefully.
<!-- this structure is for comments; browsers don't show anything in these brackets -->
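Putting these pieces together, a minimal page skeleton might look like this (the title, description, colours and text are of course placeholders of my own, not a real site):

```html
<HTML>
<HEAD>
<TITLE>Brief title of your site</TITLE>
<META NAME="DESCRIPTION" CONTENT="A short description of your site; say 150 characters maximum.">
<META NAME="KEYWORDS" CONTENT="first keyword, second keyword, third keyword">
</HEAD>
<BODY BGCOLOR="white" TEXT="black" LINK="blue" ALINK="red" VLINK="purple">
<!-- comments can go anywhere; browsers show nothing between these brackets -->
All your stuff to show on the screen.
</BODY>
</HTML>
```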
<P align="justify"> makes your text tidy on both margins - as in this paragraph. You may get odd spacing effects, if you have short lines or long words. <P align="center"> centres the subsequent text like a tree, with a ragged left margin reflecting the ragged right margin. <P align="right"> gives a justified right margin; sometimes useful in a column of words to the right of a page, but not generally much use with a language written left-to-right, such as English.
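The three alignments side by side, with illustrative text of my own:

```html
<P ALIGN="justify">A paragraph of body text, tidied on both margins.</P>
<P ALIGN="center">A centred caption or heading</P>
<P ALIGN="right">A right-set note or column entry</P>
```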
(There's also a <DIV> tag: a general-purpose container for a block of text, which takes the same ALIGN attribute but adds no formatting of its own.)
To allow clicking to move inside a file, rather than having to scroll, two parts are needed. First, the link you click: <A HREF="#label">Underlined message</A>. Second, the indication of where to move to: <A NAME="label"></A>. (To allow clicking to move somewhere inside a different file, use the filename followed by #label, something like name.htm#label.) The label must match exactly - it is case-sensitive. If the browser can't find it, nothing will happen. The browser looks for the A NAME label from the top down, so if by mistake you have two identical labels, it'll only find the first. And if the file isn't fully loaded, it won't find links further down the file.
(I'm sorry if this seems complicated... it's really not as bad as all that. The logic is that you have to tell it, unambiguously, where to go, so some sort of clear label will be needed both where you click and where you arrive.)
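As a sketch (the label 'top' and the filename are my own examples):

```html
<!-- where you click: -->
<A HREF="#top">Back to top</A>
<!-- where you arrive: -->
<A NAME="top"></A>
<!-- and, from another file entirely: -->
<A HREF="name.htm#top">Jump to the top of name.htm</A>
```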
Links without underlining, or with colour change: relatively easy effects (not using MOUSEOVER) can be got, at least with Microsoft Internet Explorer, by putting a STYLE block in the <HEAD> section.
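Something like this (my own sketch, not a tested original; IE of this period understood STYLE blocks of this sort, Netscape less reliably):

```html
<HEAD>
<STYLE TYPE="text/css">
A { text-decoration: none }   /* links without underlining */
A:hover { color: red }        /* colour change as the mouse passes - IE only, then */
</STYLE>
</HEAD>
```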
All a browser has to do is download HTML files, and convert them into an attractive display, unpacking the picture files and taking care of the font appearance, size, etc. In practice this isn't as simple as it might appear.
According to my notes, Anybrowser is a site which looks at assorted browsers.
According to figures quoted by people like Microsoft and Netscape, Microsoft's Internet Explorer (IE) and Netscape's Communicator are about joint equal in importance, with only a tiny proportion of people using the once-popular Mosaic, which is no longer updated. AOL have their own browser (I think - I haven't checked, deterred by their signing-on apparatus) and perhaps dispute these figures. Other browsers I've come across include Opera and Oracle, each of which has a devoted band of followers. Opera's selling points are, it claims, strict adherence to HTML, and smallness, in the sense that it hasn't huge numbers of files - if you have a slow computer it may suit you. (But it doesn't allow font size changes, or background graphics... At present it has a 30-separate-days trial offer.)
Mosaic was one of the first (or the first?) HTML browsers, but updates were discontinued a long time ago, and it's probably more or less extinct.
AOL's AOLPress was available on its CDs; these used to have separate files in separate directories. Its recent versions have files collected together, so it's impossible to load this without loading the whole of AOL's stuff; you may not want to do this.
This little table summarises, I hope correctly, a few miscellaneous points of comparison between MSIE, Netscape and AOLPress:
|Feature|Netscape Communicator|MS Internet Explorer|AOLPress|
|Free?|Yes|Yes|Yes, I think|
|Aggressive?|Less so. Fewer cookies.|Difficult to get several versions to co-exist, which makes compatibility checking a pain. Lots of cookies.|More self-contained (and has its own tutorial)|
|Font size change?|Yes. Click View, then select|Yes. Click View, then Fonts, then size|Alters actual tags. A global font size change needs 'select' of the entire file|
|Print/ not print background patterns? (Confusingly, this is controlled from the browser.)|Yes. Click File/ Page Setup/ Print backgrounds|Yes. Click View/ Options/ Advanced/ Printing|Haven't tried!|
|Print preview feature?|Yes. Very attractive and useful, as it includes graphics with text. BUT font size changes don't operate; the printout looks the same irrespective of font selection. Nor is the printer layout option of two or four pages per sheet shown.|No. BUT font changes are reflected in the printout! My version leaves fine horizontal lines.|No|
|Tables?|Not very helpful|Not very helpful|Not very helpful|
|Own Web Searcher?|Yes. 'Search on Internet' button. (It's included in my list of search engines.)|Yes. It's MSN. Also in my list.|Yes. AOLNetfind. Also listed.|
|Switch off cookies?|Easy. Edit/ Preferences/ Advanced/ select disable, or accept only..|Easy. Select View/ Internet Options/ Advanced/ scroll down to the yellow exclamation mark and select disable (or warn)|Doesn't seem to allow this; perhaps I have an old version.|
|Online/offline distinction?|Clear|Less clear|Not very clear|
|Switching between online and offline browsing?|Difficult|Difficult|Not quite so difficult?|
|'Print this page' command?|Yes, but calls printer control panel.|No. Click on File, then Print. NB may lose last lines if the bottom margin is tight.|No. Click on File, then Print.|
|Print frames?|Navigator seems only able to print the contents of one frame, not the entire screen. Or perhaps I can't be bothered to work out how to do this.|Can print the entire page. The printer may have an option for this.|Don't know|
|Has an HTML WYSIWYG composer? (I.e. shows your work the way a browser sees it)|Yes. And it's free, as perhaps it had to be to compete with Microsoft. Not bad; not incredibly good, as the user needs a considerable grasp of HTML to work the thing. And it reformats your HTML, which you mayn't want. If you view source code, simple syntax errors flash.|No.|Yes. I think probably one of the best. It has a parser for HTML, which detects errors, and makes fewer changes than Netscape.|
A 'search engine' is just a computer program which looks for sequences of characters contained in its own computer files. If it finds them, it reports them in some way. Its files are compiled typically by 'crawlers', 'spiders' or 'robots'. This is why 'engines' are fast (compared with what they would be searching live). The preliminary work has all been done; the engine doesn't search the entire net! This is also why it takes time for new sites to register. The relevant keywords or titles or whatever have to be collected, sorted, merged into the results from other searches, and filed away in a large-capacity storage system. The process has to be more or less continuous, since if (say) ten million sites are to be indexed, a single program revisiting them all every two weeks could spend only a fraction of a second on each.
The outcome is that, when you type in (say) "Elephant man", this phrase can be found and all the files indexed to it assessed by some algorithm which is meant to work out how important the phrase is in each site. The results are sorted and sent back. The processing is specialised work, which presumably is why many searchers seem to rely on software you've met before: Magellan uses Excite; Infoseek turns up with Go, Disney, Direct Hit, Ultra, and others in some sort of arrangement; Inktomi seems to lurk under many searchers, e.g. Compuserve's; Lycos and Hotbot are joined in some way; Nexor, apparently a weapons-related thing, uses Excite, as does AOL Netfind; and so on.
There are searchers in French (e.g. Voila - not the musical instrument, please), Danish (e.g. Jubii), Dutch (e.g. Ilse), Italian (e.g. Il Ragno), German (e.g. web.de). Intersearch is German and Austrian. Fireball is German, apparently a part of Alta Vista. Some are Inktomi-powered, e.g. SwissSearch and a Japanese engine, Goo. A few other examples are GoGreece, Searchmalta, Korean, and English-language ones like Surfchina... but I'll look mainly at English-based ones. Note that it may be possible to get new stuff by trying a foreign searcher. The keyword system allows you, armed with a dictionary, to search even if you don't know the language, then try to decode what you find.
Among the best known Internet search engines, so far as I can tell or guess, are AAA AOK Matilda, 'the largest outside USA', AltaVista, Excite, Euroferret, Infoseek (now Go), Webcrawler, and Yahoo! (apparently the most profitable, and well-known mainly through having been there a long time; but see small-print warning below). Apart from these, the commonest 'inertial' engines - the ones users get by default, without choosing - are MSN, Netscape search, AOL Netfind and, in the UK, UKPlus. (See my list below for comments on some of these).
Yahoo! was differently designed from most engines, though now others are following; it is a directory or selected list (that is, people submit their site for approval, which usually isn't given). Searching with some such engines is only carried out within categories. This makes sense from the time point of view: if you have twenty categories, and each search only looks within one twentieth of your database, everything is much faster. (Yahoo! claims only about 1M sites, while Alta Vista's Raging claims 350M, the highest I know of yet). But of course miscategorised sites, or hard-to-classify sites, will have trouble. Yahoo, Netscape Search, Lycos, Hotbot and AOL all seem to share the same team of editors (see below under adding your sites to WWW). AAA Matilda has strict categories of this sort. So does Snap! So does Hotbot, but this also uses Inktomi, and in effect has two parallel engines. OpenHere is another directory with a 'focus on safe surfing'; it seems to have little content and I don't recommend it.
You can't always rely on things they say: Hotbot for example claimed never to remove sites, which is untrue. A number of engines (MSN, Mirago, Excite..) claim to spider entire sites given just the main URL, but none of them, as far as I know, in fact does this completely.
If you want to look at what other people say, there are guides to search engines on http://www.searchenginewatch.com, which as with many of these sites has unsourced information on the sizes of the databases used by a number of 'engines', and the criteria they search on. It has information on subjects like adding a search engine to one's own site, but it is marred by uninformative outlines to the links it offers. All Search Engines says it lists them all; it has a top 6 and also 25 topics, e.g. Health and Medicine, with topic search engines, such as SciSeek. What may well be the biggest and best site is Search Engines Worldwide, with a searchable list of countries from which you can find search engines. There's also a long list of 'submit free' sites - see below for cautions. And (a site with less information) http://www.free-markets.com/search1.htm. And another is easynett, with another long list. The Spider's Apprentice has search engine information and news updates on takeovers and buyouts. For information e.g. on specialised subjects and universities you might try Geneva University. The site Internet Exploration gives you a longish list of search engines, with bits of information about them, but not enough to be useful.
Less well-known search engines:
411 Locate (phonebook and yellow page seeker), 4anything (says it has professional editors, perhaps of total site. Doesn't spider. 'If you prefer a prime, guaranteed listing.. low, one-time listing fee..'), a2z (Searches only in categories), About (has what it says are experts: "smart people who care." I was amused to see it seems not to have 'AIDS'. Some sort of membership-only scheme I think), AcidSearch ('your entertainment search resource' - I found this slow loading and know nothing about it), ah-ha [Aims at commercial sites, but free to add anyway. 12 keywords allowed], Alexa (some sort of adjunct to search engines; seems to provide ad info, but I haven't been able to work out what it does), Aliweb (Pre-meta-tag engine, with file info input in strict format, very few pages. I could find nothing interesting), AllTheWeb (Could become important. Connected with 'Fast' in some way. Intends with Dell computers to be first with a billion sites), ANZwers (Yes, an Australia-New Zealand searcher), Apollo (based in Britain; seems slow loading or non-existent), Canada (no prizes for guessing where this searches), Company Sleuth (says it gets legal inside info on companies), DirectHit (spiders; see below), Disinformation (not really a search engine; selected sites which it says are censorship, counter-culture and anti-corporate; disappointingly tiny with feeble material), Disney Internet guide ('the ultimate family guide to the web'. Appropriately, endless cookies), Electric Library (libraries, transcripts, reports..), Euroferret (European sites. Claims to be the biggest Euro search engine. Says it won't index subsite URLs), Euroseek (instructions in many languages; elaborate password system), Fast (adds URLs almost immediately, unlike almost all SEs.
See below in spider engines), Find It (seems to be an offer to look things up for money), FindLink ('safe and clean'; has some puzzling banner system), FindWhat (I don't think you can add a URL; you have to pay, and there are various schemes for this), FrequentFinder (the only searcher known to me which searches domain names for meaningful phrases, so e.g. drivelvision.com shows with elvis), Galaxy (seems to be funded by defense and medical interests; $25 non-returnable fee to add URL), Go now subsumes Infoseek (see below in spider search engines), Google (Recently improved after three years' work, it says. 'Returns the right results fast' - it has an 'I feel lucky' option - supposedly with self-modifying algorithm, tho' database(s) aren't given. Easy to add URL; claims to spider. Claims very efficient de-duplication. Lots of non-English language news), Goto (relative newcomer; its selling proposition was simple instructions. Now position is determined by payment [including supposed public services]), Harvest Broker (says it gets its own sites. Not many in total), Hotbot (green. Several months for URLs to register with Inktomi, otherwise categorised, but difficult to add because an ordinary search with it doesn't give results by category, so it's quite difficult to work out where to add, at least in my experience. 'Expert' assessment amused me by having pro-'Holocaust' Jamie McCarthy censoring genocides in Africa and Asia), iBound (yet another categorised searcher), ICQIT! ("I seek it" and "I seek you". Claims to cover entire web every two weeks. Unclear whether it spiders, so it probably doesn't. Online chat facilities; allows you to search for people by email address and/or name; claims to be the biggest chatroom software, competing with AOL's confusingly-named 'Instant Messenger'), I Found It! (genealogy archives), Infohiway (has a free site mapping facility; can be useful to download sites - take advantage of their very fast links.
Easyish to add URL), Info Tiger (probably small; can email URLs in batches), InfoSpace (phonebook and yellow pages; emails), Internets ('largest filtered collection of useful search engines and newswires anywhere..'), Jayde Online (relatively small; censorship policy; has a hints newsletter. That's what I think, but its blurb says 'second largest search engine directory on the web'. Says it indexes only on title and site description of main site), Jump City (claims to carefully select worthwhile sites, and has its own 'jump code' TM system which I fear I couldn't fathom), LinkStar (company information, US only; unavailable for now), Livelink (related to Pinstripe and Open Text; 'The source for business knowledge'), Looksmart/BeSeen (another multi-level-category - 24000! - searcher; starting self-published sites, won't include 'offensive material'. NB has alphabetical list so you can see the rather banal categories), Lycos (recently started TV ad campaign! Censorship policy - possibly the most censored search engine), Magellan (large, claiming 50M, general purpose; has a 'site voyeur' feature; watch for several similarly-named sites. Easy to add URLs. Seems to be part of Excite!), Mall Park (categorised selling site - commercial sites only, I think), Matilda (Or AAA Matilda. Yes, Australian! Unusually chaotic appearance, with strict categories making it user-hostile, but an index-everything policy, except they charge .com sites), Mirago (categorised British site run by Telecom plus; it claims to spider your whole site from the index, and return regularly; not much good), MSN (Microsoft's Web Searcher; has MSN Encarta section. Doesn't seem special! Categorised.
URLs can be added - it takes some effort to find where; look for 'help' - after which entire site is supposedly spidered - if so, this is useful and timesaving), NationalDirectory ('least spammed, easiest, most comprehensive', new URLs 'searchable within minutes' - but I couldn't find how to add a URL), NerdWorld (relatively tiny; offers a site index, but apparently only from itself), Netfind (this is AOL's - censored? It's hard to tell whether the censorship is any different), Netscape's Netsearch (part of Netscape, though not a major part. Seems impossible to add URLs directly - there's a tedious mass registration apparently used as a junk mail database), Northern Light (large site with company reports apparently a specialty), Open Text (same as Livelink/ Pinstripe), Peekaboo (business and also 'quality public service information..'), Pinstripe, PlaNET Search (Said it spidered whole site within a few days, with regular repeats. SHUT DOWN Dec 1999), Pronet (International business directory), RagingSearch (Part of AltaVista. Unlike AV, which has a cluttered page, Raging has a near-empty screen. Allows searching "for phrases", -omissions, Upper Case Sensitivity. And for links, domains, text in titles, similar pages - see Help. No info on adding URLs, which presumably is done thru' AV. The 'most powerful search on the web'), REX ('Go get it, BOY!'), SciSeek (specialist science searcher with person(s) checking entries; claims one week turnround), Scrub the Web ('search in real-time. Add your URL instantly'), Search Europe ('.. designed to be comprehensive..'), Search King ('Where YOU Rule the Web!'. URLs added in less than 24 hrs, it claims. Doesn't spider - you supply keywords. Claims to have a click voting popularity system), Searchopolis (aimed "for K-12 students" whatever they are; filtered by N2H2 by "our large staff of trained reviewers".
I was surprised to find the Russell War Crimes information in there), Search UK (Now searchengine.com. Was business partners, reps etc in UK), Siteshack ('the fastest on the net'. German? Makes you count characters), Snap! (Has a detailed 'membership form' which I expect puts off many people. Copyright Weather Data. NBC. Another multi-category site, like Yahoo! Was easy to add URLs; now needs trawl through categories. (Has alphabetical list feature; mostly disappointing trivia). Claimed to be fastest. I think a CNN site, unlikely to include critical pieces), Starting Point (has regular 'new sites' feature), Superpages ('Business Websites Around the Globe'), Thunderstone (has a serious opinion of itself; wouldn't take web addresses which are subsites with slashes, and now seems impossible to add to), UKDirectory (UK searcher with teething trouble. Also publishes an A4 book of its sites, like a phone book), UKIndex (UK searcher. Enter your own description), UKMax (Another UK searcher. Claims to spider whole site from index and seems quite good at this. Has been trying TV ads in UK), UKPlus (And another. This one is used by Freeserve and is therefore important for UK users. Looks like Yahoo-know-who. Part of 'alleurope', doesn't state pages - probably < 100,000. Seems purely ad driven, probably with pay pages. Unlikely to post serious material? Switched to Infoseek to search web, so target this), Town USA ('free listings of US businesses and municipalities'), Ultra (seems same as Infoseek), USA Online (tiny business site apparently mimicking AOL), (couldn't find WebArrivals - or What's New), What's New Too (looks like a chat line but has its own database), What-U-Seek (has a promising website searcher), WhoIzzy (says it's one of the oldest; currently trying to sell itself), Yep (Related to HitBox counter. Ranks sites by popularity.
However, only sites registered with Hitbox show up), Zeal.com (Categorised sites, as Yahoo, with internal search engine, but supposed to be 'community driven', i.e. sites are ranked or voted. You're legally required to indemnify them), Zensearch (nonprofit; falsely claims immediate indexing. Appears to be the tiniest of all and virtually useless)
'Meta' search engines 'cheat' by submitting requests to other people's databases, then returning the answers to you. They have been a growth industry, because they use other people's databases and therefore only need programming skill. But (I'm guessing) they could presumably be cut off from accessing the searchers they use; perhaps they pay a percentage. There are two types:
De-duplicating meta-searchers (for want of a better expression), which combine the results of their searches. These have the enormous advantage of potentially finding files which happen not to be stored by all search engines: you increase your chance of getting what you want. The first was (I think) Metacrawler, which now uses ten search engines. (Its 'all' button doesn't work). Savvysearch makes use of the largest number of other engines (12, though not at once) and may give very good results. Mamma ('the mother of all search engines') seems to have been a mimic of Metacrawler. Inference Find uses only six searchers. Its display sorts output by type, e.g. .com sites are collected as 'commercial', which can be useful; but it lists only bare titles, so a user is likely to spend more time trying to find a good one than the time theoretically saved. Ask Jeeves says it uses five searchers, and claims it has a natural-language front-end, i.e. allows you to type questions in ordinary English - though this isn't true. It claims to select sites and to have made some checkups on commercial sites. A new metasearcher is Chubba, which uses about six engines plus What-U-Seek, but tends to leave graphics cluttering your screen. Metafind allows you to set various parameters. Verio Metasearch allows you to choose the weights to assign to different search engines. I've just found four others: Beaucoup, which seems related to (or the same as?) 4anything and Metafind; GoCrawl; InfoZoid, which searches Usenet in addition to a clutch of search engines; and YahooSuck (doesn't state which engines it uses, but seems good). Copernic is a categorised metasearcher and is unusual: it's interactive, downloading its own current set of searchers, and beaming you banner adverts, from its own site, when you're online; it also inserts itself into your browser, which you may not want.
The Big Hub is interesting not so much as another metasearcher (8 engines) as for 'specialty search engines' by topic; I haven't attempted to check this claim in detail, though it seems unlikely there are 1500.
No meta-searchers (that I know of) attempt to include the full range of Web information; don't generally expect them to find names and addresses, or yellow page style information, although they should be able to tell you where to find such information.
My experience is that meta engines aren't always reliable: you may find that a site which is definitely found by a component search engine nevertheless doesn't show up on a meta engine. Possibly there's censorship, a time limit, or perhaps a depth-of-search limitation. On the other hand, sometimes you find the reverse: a phrase which seems not to be present on any engine shows up on a meta-search!
Non-deduplicating metasearchers simply return results from all their searchers separately. Usually the results display in sequence: OneBlink is one example, Dogpile another, with quite a range of options. Search presents a choice of 11 engines separately selectable. Metasearch last time I looked made you click separately on seven searchers yourself. WebTaxi (under development) is supposed to allow proximity searching, i.e. sets of words fairly near to each other. It uses an unusually wide range of types of search engine; however, you can only look at one at a time. What seems to be a filtered, i.e. censored, search engine is Internets (but it has lots of Java, and may be best avoided). You can amuse yourself seeing what's been 'filtered'. Internet Sleuth allows you to choose 1 to 6 of its search engines, and seems to return more results than many. Highway 61 just uses five engines. Searchhound is another.
But some display their findings in windows: All4One displays search results from four search engines in separate quarters of your screen. Search Spaniel offers eight searchers plus four specialist engines (for people, newsgroups, shareware, mailing lists) and allows the option of displaying the results in individual windows.
Naturally there are endless complications. For example, keywords may not accurately represent the contents of files, pictures and voice files in any case can't be indexed like this, and there's scope for people to cheat by listing irrelevant words which they hide on screen. Phrases may not be recorded, although obviously they provide greater discriminating scope than single words. Foreign languages and alphabets, and mathematics, may be unfindable. Some engines (Yahoo!, Euroseek) shoehorn their sites into groupings; this (i) makes it more difficult to enter sites, and (ii) tends to produce somewhat irrelevant lists of found sites.
Most search engines are surprisingly difficult to operate: because (1) ads, which use long graphics files, slow everything up; (2) their designers tuck away examples of how to use them; (3) they may be split up into geographical areas and subject areas; (4) since the search criteria aren't easy to work out, it's hard to guess how far down a list a sought item might be; (5) there's a constant tendency to divert into money-earning areas, so that things which look free turn out not to be; (6) they tend to try to send 'cookies' without saying what they're for; (7) the results may be displayed unhelpfully.
But there are short cuts where detailed searching isn't wanted: if you just type a logo or name you may find the www and .org or whatever are tried for you.
The best way to start is probably to type a lot of keywords in the box, preceding them by + to force them to be used, and putting phrases in quotes; you can also try using roots with *, for example flower* which might find flowers, flowering, flowery. Using many, or rare, or elaborate keywords will at least cut down the number of files returned for your inspection. Remember the comment about job ads - the trick is to get replies from the few people you want, not from everybody.
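For example, a query of my own devising combining these operators (engines of course vary in exactly what they accept):

```text
+"elephant man" +Victorian +London     forces the phrase and both extra words
flower* +garden                        flowers, flowering, flowery... plus garden
```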
To get a feeling for the way these 'engines' work, try to see things from the programmers' viewpoint. They have a choice of looking at hidden meta-statements, or ignoring these. Meta-statements have the huge advantage of allowing key phrases, such as (e.g.) baked beans, President Wilson, Vladimir Ashkenazy, or whatever. It's probably too much to expect a program to pick out keywords from raw text: in a sentence like 'the rain in Spain falls mainly..' the computer can't tell whether 'Spain falls' is a significant phrase. Hence metastatements can be useful. Unfortunately they can be abused - you could for instance put 'free download' in every file and increase your hit rate. So most engines state they don't use them. Altavista is one which does.
If you wish to test what an engine does, rather than what it says it does, try a phrase like 'baked beans' and then 'beans baked'; if you get the same list, you can probably assume it doesn't store phrases. You might try "blue sky" in double quotes and the two words 'blue sky' separately; if the results are more or less the same, the engine probably indexes individual words, not phrases. You might also try a part-nonsense phrase like "blue mkxfdhg" or +blue +gkhutnw; if you get a list of sites with 'blue' in their titles, obviously the engine is looking for the two words separately, not together, and is likely to waste your time.
I think it's true that search engines are subject to an artefact which causes a bias to short files: imagine a discussion on some topic (e.g. car exhaust); a long detailed file will tend to use this phrase less often, in proportion, than a short file, since it goes into detail on all sorts of points. So my impression is that Internet perhaps seems more trivial than it is.
It has to be said that some sort of censorship is probably inevitable. Suppose (just one example) that a second-hand car dealer put up an exact duplicate of some popular site, but with the graphics replaced by his own ads: this would look identical to most search engines, and would therefore appear as often as the original. On serious issues, I've been told the CIA has bought a search engine; this would be an expected development, the money being peanuts. The large engines seem to be operated by large companies (e.g. Excite seems to be part of Reuters/UPI. Snap! seems to be CNN plus weather), so one would guess they will censor material perceived as contrary to their or lobbyists' interests. Or they may simply play down sites which pay less. For example, the keyword "Richard Milton", the author of several books on scientific dissent, some material of which is good, didn't show at all on Metacrawler, but appeared on Hotbot. Similarly the Russell Vietnam material on my site does not show up on Hotbot or Lycos. (All this is quite separate from the question of site censorship, where usually the service provider comes under pressure).
A few notes on spam: a common type is to have many files, saying more or less the same thing, in different directories. A computer can't be expected to infer that the meaning is about the same, and then de-duplicate them. Identical files are perhaps less of a problem, as they are often in explicit mirror files. It's possible to generate unintentional spam - as for instance a book may be listed by chapters, and the title of each file may be the same, or similar; consequently some search engines will list (say) twenty separate chapters, one after the other. It's also possible to generate intentional spam; I recently noticed a Usenet group deliberately choked with nonsense messages. There's nothing to stop anyone putting up thousands of computer-generated nonsense pages. It's impossible to guess how much the relatively reasonable quality of the Web is due to sites being refused or removed.
A newish type is the self-voting searcher, Goto supposedly being one. This lists a top 500 of interests (Pamela Anderson being #1) which are supposed to reflect people's interests, skewed by payment. What's New? so far as I know is something similar.
I haven't considered in detail the problems of searching in special databases (legal, medical, sporting, etc) or looking for special program types; for example, FTPSearch looks for FTP files. In any case these can be found by searching on keywords like CERN, Gopher, Archie, Archive, Veronica, and so on. Since Internet has special features designed specially to allow entry of queries (the standard elongated boxes, 'radio buttons' and so on), there may well be specialised search engines for e.g. chess, baseball, or what have you. But if so they won't be easy to find - make a note or bookmark URLs you like!
The final message here on searching the Web (which I give with some hesitation - my information is imperfect, times change, and anyway your interests may not be mine) is not to be too hasty, and to regard the process as something of an art form. For general searches, start with meta-searchers, such as Savvysearch, Search Spaniel, or Metacrawler. You can try lots of keywords to start, then use fewer if the list returned is too short, or do it the other way round, so long as you're fairly systematic.
If you prefer to stick with a single search engine, my selection is Altavista (but hard to pinpoint), Excite, Infoseek/ Go (but too many results), Webcrawler, and Yahoo! (if you're looking for conventional material - i.e. you want to find what other people are likely to be told). Try Euroferret for sites in Europe. UKMax might be good for UK. You might keep an eye on AllTheWeb. If you still can't find what you want, scroll up from here and try some of the search engines in my long list. And you could try foreign ones for a different angle. You might try the Webring index, where more-or-less linked sites can be examined (scroll down, and use their site searcher). About mid-2000 some sort of link between Webring and Yahoo! was announced; I hope this won't have the effect of cutting back choices. Some webrings are good (a handy indicator is simply a look at the home site of a ring) but many have slow ads or are disappointing in other ways - the webring instructions are so badly written it's unsurprising many rings aren't maintained properly, as you'll find if you set up your own. On the other hand, at least you can view their hit-rate stats. For Usenet groups (see below), Deja News seems the best (or only?) and it's very good, if you can work out how to use it. Several times I've found interesting websites that people have mentioned in their Usenet postings, but not put onto search engines, so this roundabout route can be useful. You can also look at e-discussion groups in the same way as you look at Usenet, though many are private and may be difficult to get into or unwilling to answer questions.
For specialist lists or specialist search engines try Search Spaniel or All Search Engines and go through the same sort of process.
Adding your site to Web searchers: Search engines won't spontaneously search for sites, since, for all they know, the authors might have put them up as experiments and not want them listed. And they may not have access to the list of file names. So you have to do it yourself. Start by keeping a notebook. (There may be delays of weeks before your site is indexed). You might practise first with an instant indexer, if you can find one; try Search Europe, for example, or Zensearch, NationalDirectory, Scrub the Web, or Starting Point. When you get the idea, try the large search engines: Fast Search is a good start. I'd recommend, however, that your site is in acceptable shape before submitting to major engines; they may blacklist feeble sites (perhaps).
The usual trick is to find and click on the 'Add URL' button ('URL' means 'uniform resource locator', if you haven't been told - the idea is that it's standard across the whole of Internet). You may have to navigate via 'Help', or do a search, before you find this. You may be faced with a series of boxes to fill in. You may be asked to describe your site in not more than 25 words, or some similar formula, even if your site has meta statements in it already containing this information; and you may be asked for keywords, even if you've included these. If you intend to add a batch of subsites, it's worth checking that the 'back' key will redisplay your previous site, so you don't have to type in your entire URL each time. A few search engines, for example Yahoo!, require you to find a category and sub-category for your site, something which can be very time-wasting, and the engine may not list your site anyway - Yahoo! has a reputation for throwing out 5/6 or 9/10 (or other fractions) of submissions. The next point may seem a bit obvious: make sure you spell all the characters of your URL(s) correctly; very few search engines check in real time that URLs actually exist, so it's possible to waste hours or days entering URLs with a tiny but invalidating mistake. Before adding URLs to Yahoo! and selective sites, check their listed sites: for example I recall being surprised at the feeble quality, and dead links, in Yahoo!'s UFO section.
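Since the engines won't check your spelling for you, a few local sanity checks before submitting can save those wasted hours. A minimal sketch (the checks are my own choices, not any engine's rules): verify each URL has a scheme, a plausible host, and no stray whitespace.

```python
from urllib.parse import urlsplit

def looks_valid(url):
    """Cheap local sanity checks on a URL before submission:
    scheme present, host contains a dot, no stray whitespace."""
    if url != url.strip() or " " in url:
        return False
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and "." in parts.netloc

# Hypothetical submission list: one good URL and two typical typos.
urls = [
    "http://www.example.com/index.htm",
    "http//www.example.com/index.htm",   # missing colon after scheme
    "http://www.example .com/",          # stray space in host
]
for u in urls:
    print(u, looks_valid(u))
```

This only catches malformed URLs, not wrong-but-well-formed ones; for those you'd have to fetch the page and check it answers.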
If you're beginning, try Alta Vista (very easy to add; claims to take a few days; takes subsites too). Excite is another two-week wait. MSN says it takes two to four weeks. Try some sites from my huge list, above. Lycos and Hotbot may take months and seem to have some sort of censorship policy.
If search engines make use of other hardware/software setups, you can save effort by not entering the same information into different search engines which in effect use the same material. Inktomi is the well-known example; Inktomi's own site lists the search engines that use it (click on 'partners'), but this information tends to be hidden away, since the search engines like to pretend they're unique. Inktomi's long list includes Hotbot, Yahoo!, Snap, GoTo, ICQ and others, including Canada. They say they've arranged deals with Findwhat, kanoodle, LatinLOOK, Network Solutions, Oxygen Media, Surfbuzz, Powerize and others. I haven't been able to find a reliable way to add to Inktomi - their site is unhelpful - and none of their engines seems to spider, though one is told that a site added to one is added to all. If not, people with diverse sites have a boring time in prospect.
Summary of spider search engines, and adding your URL to them: Fifteen or so I've found are (alphabetical; I've improved the URL targeting, so you can click and work through them): AltaVista [says as a rule of thumb it aims to spider to one level], Direct Hit (needs tedious list of keywords), Euroferret [moved to Webtop, which now has an add URL facility and requires confirmation], Euroseek [this is the English page; if it doesn't work, try Euroseek.com], Excite [scroll down; this seems identical to Magellan, which now claims to spider], Fast, or AllTheWeb [I think it claims to spider], Go [was Infoseek; says it spiders, and seems to have discontinued e-mail reception of bulk URLs; has its own community, which you can join], Google, Lycos, Mirago [British; not very good], MSN [Microsoft's; claims to spider, but, inconsistently?, allows multiple URLs to be submitted, at one per day maximum], Northern Light, Searchengine.com [new to me], UKMax [also British; persistent teething problems], and Webcrawler. Only the main URL is needed, a huge saving of time if you have a large site. They all work, more or less, though don't expect perfection.
Notes: Lycos, which has a censorship policy, specifically states it isn't likely to spider more than one level, so if your site has subsites each with an index, you may do better to enter each of these separately with Lycos. Direct Hit requires you to add keywords (they give no information as to whether e.g. phrases are accepted or what separators should be used) and it has rankings based on user popularity.
Some categorised engines seem to share the 'Open Directory Project'; they seem to be AOL, Euroseek, Yahoo, Netscape Search, Hotbot and Lycos, which therefore presumably now return much the same list of sites. This project is therefore very important; you must try to add your site to it. The site says they have about 20,000 editors and about 1M sites; it also claims they are "quite passionate about their work" - just as well, perhaps, as a swift calculation reveals that if each of them examines ten sites an hour they'd take a year to get through 100M websites. It is, I'd say, probably impossible for them to keep up to date. You can't add sites from the project's home page; you have to burrow down into the subcategories. Any of the engines will do to try to add sites. And conversely, an ignorant editor is in a good position to block sites. The categories include horoscopes, under their culture section, an example of the oddly schizoid presentations of the Web; it's said that half the users are graduates, but see, for example, Lycos's top 50 search phrases, which suggest most of the use is by young people and children - but perhaps they are the other half.
Meta-searchers may have an 'add URL' button, but you'll just be told you can't add URLs, because the meta-searcher has no database of its own. Savvysearch is the only exception (I think): it adds your site(s) to many engines, and it also e-mails you with the responses.
'Inertial Research': lots of Internet users probably stick with what their software presents them - they may not even know there's an alternative, and their suppliers won't tell them. Thus most AOL users probably get the AOL menu with AOL Netfind; in the UK, Freeserve users probably mostly get the Freeserve page; Gateway buyers are nudged into Yahoo!; and people who've downloaded a new Netscape may find Netscape Search has muscled in, so they get the Netscape page. Or Microsoft's MSN page. Or whatever it may be. So explore other people's views of the web, and enter your URL into their search engines.
To see if your site is indexed, try searching for URL:http://www.yoursite or "http://www.yoursite" if the searcher allows. (You might imagine the designers would provide an easy way to check. But they usually don't).
E-mail submissions, where search engines allow this, are useful for well-developed sites, as whole batches of subsites can be submitted. Infoseek is the best example. Sometimes you can switch between a list of your files in Word, say, cutting and pasting a batch into the search engine's submission window.
Bulk submissions to search engines: sites include Addme, Free Search Engine Submissions, GetHits (part of AddMe), Promote Your Site, Shout!, SubmitNow. So far as I've tried them, these are less easy to use than they sound - in practice, you have to type in all sorts of detail and probably also add yourself to some sellable emailshot database. The free searchers tend to be very obscure ones, and I suspect some to have been manufactured just for the purpose. Some require you to enter your website into all the searchers one after the other. Keeping track of searchers you're supposed to be on is harder. There are various ways in which the process is less free than it might be. And if you have a site divided into subsites, the cost presumably increases with each subsite... I've just checked SimpleSubmit, which appears free of any objection. Savvysearch has a submitting facility which reports back to you with an email to tell you what happened.
How important..? You may be told that bookmarked sites are the most important. Bear in mind that bookmarked sites aren't distinguished from typed-in addresses; in other words, if you can get your URL published somewhere, of course this will increase your hit rate; and you lose nothing by encouraging people to bookmark. But search engines may find people who otherwise would never hear of your site.
Keywords in Meta Statements: Alta Vista, Excite, Lycos, Netfind, Northern Light and Webcrawler* DON'T use these.
URL names: Some search engines make use of these: Alta Vista, Hotbot, Infoseek, Lycos, Webcrawler*. So, for example, if you're a world expert in orange juice, a site http://...orange-juice/orange-juice.htm will be rated higher by these searchers.
Comments: Only Hotbot* recognises these. Presumably, <!-- orange juice --> would boost an orange juice site in Hotbot.
ALT text for images: Alta Vista, Infoseek, and Lycos* only take account of these statements. The alt statement can be pure keywords, for example: <img src="mount_everest.jpg" alt="orange_juice">, though of course this may look odd if a mouse move causes the text to display. If you like, try a GIF with ALT keywords; take some picture which the site loads anyway and make it 1 by 1. (Why ALT statements? The reason must be because some people's sites are mostly graphics, which the spiders can't read. ALT statements give at least some indication of what's happening.)
Tiny or Small Font Text is used by some sites to pack in lots of keywords in a way which doesn't show too prominently. Such sites will be rejected by Alta Vista, Hotbot, Lycos, MSN and Webcrawler*. However, Infoseek and Northern Light* don't object.
Invisible text, with the font color the same as the background, is another obvious spamming technique. Only Webcrawler and Netfind* allow this.
Keywords in text: quite a few sites include (usually at the end) a list of keywords with a notice that these are keywords! Something similar happens on e.g. NASA's website, and in many scientific papers, when a list of key phrases, often very childish ones, is given, although it's hard to see why. You could try this - obviously, the words don't need to be small or invisible. Whether search engines object, I have no idea; the only guess I can make is not to overdo this, as search engine algorithms might be expected to look for some sensible ratio of text to keywords. I've put an example at the end of this file.
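What a "sensible ratio" might mean can be sketched concretely. This is purely illustrative (no engine publishes its formula; the page text and keyword set here are invented): count what fraction of the page's words are keywords, and keep that fraction modest.

```python
import re

def keyword_ratio(text, keywords):
    """Fraction of words in the page that are keywords - a rough
    stand-in for the ratio a ranking algorithm might check."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in keywords)
    return hits / len(words)

page = "Orange juice is made from oranges. Fresh orange juice keeps best chilled."
kw = {"orange", "juice"}
print(round(keyword_ratio(page, kw), 2))  # 0.33
```

A page that is nothing but a keyword list would score near 1.0 - the kind of figure one might guess gets a site treated as spam.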
Spread of keywords across the site is measured in some way by Alta Vista, Hotbot and Webcrawler*, no doubt to disallow sites which have misleading keywords bunched in meta statements or titles.
Title is certainly used by some search engines, since they display the site title when they return their list. The usual recommendation is to pack as many keywords as possible into the title. And probably the description.
Headings. I suspect the use of headings was an early device, before meta tags were well-known. If a site had no meta tags, by default the searcher would look for <H1> or something similar, perhaps <CENTER>, and assume the first heading announced the purpose of the site. It's probably sound practice to include a heading and include keywords. If you can be bothered, search for the first occurrence of <H in your file(s) and see how it measures up, in respect of meaningfulness, keywords, etc. You don't have to include a giant heading; <H4> may have the same effect.
Links. Any search engine with a huge database can collect information on links. *Excite, Infoseek and Lycos check links; Infoseek also has some system for assessing the status of links, so that a group of sites, perhaps the same person's, with artificially huge numbers of links to each other aren't boosted. (This may be targeted against 'free-for-all' link sites). And some searchers supposedly factor in some assessment of popularity. This is (perhaps) a good reason for arranging reciprocal links with other sites, and is probably one reason why propagandist sites tend not to include links to opposing sites, apart of course from not advertising them. If a site you have a link to won't reciprocate, it's up to you whether to remove your link or not. Personally, if I like a site or think it of some interest I always put a link; but perhaps this is silly. I've seen a suggestion that you might yourself put other peoples' sites, if they link to yours, onto search engines if they haven't done it themselves. Some searchers allow link:www.your.url type of enquiries to show you which sites point to yours. (Linkdomain:www.your.url is supposed to work with Hotbot.)
Spelling Variants. It's often suggested that misspellings should be included in meta tags, on the same principle as foreign restaurant names appearing with different spellings in phone books. If searchers don't use these tags, of course it won't help - you might try to include variants in the text itself. In any case, Infoseek, Lycos and Northern Light* search based on word stems; that is, rather than look for whole words, they abbreviate them to increase their catch. Alta Vista* does NOT do this; it stores the entire phrase, including plurals, and so is more sensitive.
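Stem-based matching can be sketched as crude suffix stripping. This is a toy version of the idea, not any engine's actual stemmer (real ones, such as the Porter algorithm, are considerably more careful): strip a common English suffix and match on what's left.

```python
def crude_stem(word):
    """Very rough suffix stripping, in the spirit of engines that
    match on word stems rather than whole words. Not a real stemmer."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        # Only strip if a reasonable stem (3+ letters) remains.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ("search", "searches", "searching", "searchers"):
    print(w, "->", crude_stem(w))
```

All four variants collapse to the stem "search", which is why a stemming engine catches pages an exact-phrase engine misses - and vice versa, why AltaVista's whole-word storage is the more precise of the two.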
Crawler page [suggested by Paul Boutin, ex-Hotbot]. The idea is to make an HTML page containing only links, without graphics, external links or anything else. A link must be made to this page, presumably at the start of your index page. The idea is that spiders can be confused; if your site contains any wrong links or confusing HTML, the spider may get lost. A simple list of your own links should allow it to crawl fully in peace.
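Such a links-only page can be generated mechanically from your index page rather than maintained by hand. A minimal sketch (the sample index HTML is invented): pull every href out of the page and emit a bare list of links.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every href in a page, ignoring graphics and all
    other markup, for building a links-only 'crawler page'."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawler_page(index_html):
    """Return a plain HTML page containing only the index's links."""
    collector = LinkCollector()
    collector.feed(index_html)
    items = "\n".join(f'<a href="{u}">{u}</a><br>' for u in collector.links)
    return f"<html><body>\n{items}\n</body></html>"

# Hypothetical index page with a graphic the spider can't read.
index = ('<p>Welcome! <a href="one.htm">One</a> '
         '<img src="logo.gif"> <a href="two.htm">Two</a></p>')
print(crawler_page(index))
```

Re-running this whenever the index changes keeps the crawler page in step with the real site.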
Free site link pages to your main site, i.e. a page which, if found by a search engine, directs to your site. E.g. Geocities/Yahoo is supposed to take the fifth most traffic of US sites. If you set up a free menu page with keywords and so on, anyone searching Geocities sites might find your page, giving you a free link.
Other link and portal pages. I've seen a site with keywords consisting entirely of spelling errors of keywords, linked to the main site.
Payment. I haven't tried this! But obviously there's plenty of scope. I believe Sprinks is a new (May 2000) engine of this type. If you're doing this, you may as well start with the most popular, then look at specialist search engines and directories.
Resubmitting: I've seen a recommendation to resubmit your site every month. Also recommendations to resubmit whenever you make a substantial alteration. And one (by Boutin, who says that no searcher penalises repeat submitters - as far as he's aware) to submit every week. A recent newsletter said a couple of hours a day was all that's needed! My best suggestion is to check at intervals that your site(s) show up on the engines you're mostly interested in; if not, you have no option but to resubmit. Alternatively, have a rolling checklist and spend a fixed length of time on it at a weekend, like painting the Forth Bridge.
A possible approach is to tailor your site to search engines. You may find it worthwhile to have different entry points to your site, each optimised for a particular search engine. The idea is that a different entry page shows up well in each search engine, though the user sees more or less the same thing. I'm told some people do this! And some software does - 'Web Position' and 'Eugenius', for example. But search engines are fighting back against this sort of thing.
The lists marked * were published by Dilwyn Tng of Make-it-online, in InternetDay's emailed e-zine. Thanks. I haven't checked many of these claims; some may be outdated. I leave these entries in rather bare form, since it's fairly obvious what action to take, e.g. to ask people to link to your site.
A useful hint - perhaps the most useful you'll ever hear! - is to exploit other people's work. Search, using phrases relevant to your own site, and examine the source code of the top URLs the engines turn up; if you mimic parts of their structure, you instead may appear near number 1! (Beware though - other people have this idea. An interesting possible counter-attack has started with Pinnacle, which guarantees a 'top 30' place - unless, presumably, 31 people apply to them, and I think only for one combination of keywords - and says it has Java programs which make keywords etc unreadable.)
Try to judge other sites objectively: if they are better than yours, then a search engine which ranks them higher is doing its job properly! So you may need to improve the content of your site.
This is where I put in a plug for antique technology, viz. Norton's utilities for DOS. In particular, there's a text search utility of a type which seems hardly to exist in Windows; at least I haven't been able to find one. (Windows 98's search - Start menu, then click on the magnifying glass icon - does allow searches for text, unlike Windows 95's, which only allowed file name searches. But the result is only a list of files; each has to be opened to check the contents.) This program searches entire directories for a text string, and returns not just the name of the file but a paragraph or so of text around the string, so you can see the text it's embedded in. Probably this anarchic feature makes it difficult for Windows; or perhaps most people just use games or something.
Example. To take a concrete example, I remembered reading, in a downloaded file, that Deborah Lipstadt's holocaust book doesn't actually deal with the points at issue. With the aid of Mr Norton's utility (version 4.2, something like 10 years old) and entering at the DOS prompt
TS C:\WORK "Deborah Lipstadt" /S
the WORK directory of my hard disk (the /S switch covering its subdirectories) is looked through for this string. And after a while, I duly found my quotation, from an article in the Skeptic.
The output can be directed to a floppy disk, preferably a high capacity one, rather than writing to the hard disk, which is increasingly risky with modern disks. In fact, a batch file can be left running to search for many strings, and the floppy later hunted through. I personally recommend this for such tasks as (e.g.) finding whether 'cheer up' is a Shakespearean expression, (it is) or what Noam Chomsky thinks of hack American authors (you can search e.g. for Reinhold Niebuhr in your Chomsky files, or, for that matter, your entire collection of files). My personal typings-in include a lot of Bertrand Russell material, which I can search in the same way - of course it's necessary to have lots of downloaded texts, or material typed in yourself, on your PC for this to give useful results.
As I say, I haven't found anything similar to this anywhere; it sounds absurd, especially as the idea isn't very difficult to program, but there it is. Recent Nortons seem not to have this, so far as I can judge by reading the advertising notes on their boxes. My version of TS.EXE is smaller than 20K bytes, about a twentieth the size of a typical small Windows utility.
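For what it's worth, the idea really isn't difficult to program; here's a minimal modern stand-in (a sketch of the same behaviour, not Norton's code - the context width and error handling are my own choices). It walks a directory tree and reports each file containing the string, with a snippet of surrounding text.

```python
import os

def text_search(root, needle, context=60):
    """Walk a directory tree and report each file containing
    `needle`, with surrounding text - roughly what TS.EXE did."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8",
                          errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file; skip it
            pos = text.find(needle)
            if pos != -1:
                start = max(0, pos - context)
                snippet = text[start : pos + len(needle) + context]
                matches.append((path, snippet))
    return matches
```

Called as `text_search(r"C:\WORK", "Deborah Lipstadt")`, it corresponds to the TS command above, returning the file names with the embedding text.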
Other e-discussion groups and lists
If you'd like your own e-group, the easiest way to start is to use someone else's software. www.egroups.com allows you to start your own group. E-mails are automatically sent between members. Groups can be public or private. Most are private; of the public ones, many have a tiny number of members. Also, emails aren't easy to view; only the titles are given, and they are presented in small batches, to maximise ad exposure, so emails tucked away in the middle are hard to find. (Free Agent lists all emails in a group, one after the other.)
listbot.com seems to have more to do with businesses, customers, clients. It says it collects the URLs of all your visitors, so you can email them all in bulk. It has a free version (with ads, which it promises aren't too intrusive).
Listserv® hosts or catalogues lists; the present site has more than 32,000. Click here for the official catalog. Various search options are offered, including lists with more than 10,000 subscribers. This seems to be updated regularly; there are other sites, e.g. by Vivian Neou, which haven't been updated for several years; and there are specialist lists on language, computer software, and so on.
Another site is Email Discussion Groups/Lists and Resources which is 'intended as a one-stop information resource about e-mail discussion groups or "lists", as they are sometimes called.'
Tables are one of the most useful features of HTML; they allow you to position blocks of text, and pictures, relative to each other, in a way which won't vary too much with different people's computers. They look complicated at first, but less so once you have the feel for the way rows and columns are defined. (I have used an old, free copy of HoTMeTaL just for its table generating facility. New versions don't seem to have this.) They're one reason for the rectangular feel of much of Internet, like old computer games, which always seemed to have characters either going right/left or up/down. With a bit of effort you can introduce curves to reduce that effect.
To experiment, run your editor and enter <TABLE BORDER=1 WIDTH="100%"><TR><TD BGCOLOR="lightblue"><I>Something</I></TD><TD>Something else</TD></TR></TABLE>, which is one row (rows go across!) and two columns, as, with practice, you can see. Save this file with a .htm extension and view with a browser. You should get a bordered one-row, two-column table, with the first cell in italics on a light blue background.
Frames - where the screen appears to be cut into independently-scrolling areas - need the most complicated HTML of ordinary websites. (Example: see my McLibel site). Frames tend to be slow, since at least three HTML files are necessary - the first to tell the other two what to do. Ease yourself into this by examining a site you like the look of - you'll need to right-click on the component files to separate them all out.
The easiest way known to me to generate frames is to use the free program Top Dawg. This has a 'frame wizard' which, if not in the Gandalf class, at any rate is fairly effective. (Download and install his program - PCs only, I think; click on the Frames tab and select the wizard icon). It allows horizontal and vertical frames, and allows you to alter their relative sizes. You still have to work out your URL names. (After typing this, I found Kevin Gunn's free Web-o-Rama has a similar facility).
The basic idea is fairly simple: there's a 'home' or starting page, which defines how the screen is to be subdivided; there must of course be a minimum of two other files to display, making a minimum total of three files - one to control things, two others to be displayed. The traditional main use is to have a fixed menu down the left margin, while the right portion of the screen is allowed to scroll. My McLibel example shows a different use - it has two scrolling windows, so readers can compare texts side by side. Windows can be defined so they won't scroll; and, as with tables, borders can be selected to show up, or not, so you have the option of making the joins invisible. There's endless scope, including windows within windows. You can provide an optional menu bar; my 'modern biology fraud' site, 'Why You May Die of Cancer' (incomplete!), shows this - the main file loads first, then the user has the option to install a menu bar, loading up the two control files. So they can ease navigation if they want to. Once the main file is already cached, the overhead isn't very great. (See 'orphan files' below for an outline of how to do this; if someone emails, I'll put more detail. Otherwise I won't bother.)
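The minimum three-file arrangement can be sketched by generating the files mechanically. This is a toy sketch (the folder name, file names and 30/70 split are my own choices): one control page defining the split, a menu pane, and a main pane the menu's links target.

```python
import os

# The control page: defines the split, displays the other two files.
FRAMESET = """<html><head><title>{title}</title></head>
<frameset cols="30%,70%">
  <frame src="menu.htm" name="menu">
  <frame src="main.htm" name="main">
</frameset></html>"""

def write_frames(folder, title="Frames demo"):
    """Write the minimum three files a framed site needs."""
    os.makedirs(folder, exist_ok=True)
    pages = {
        "index.htm": FRAMESET.format(title=title),
        # target="main" makes the menu's links load in the right pane.
        "menu.htm": ('<html><body><a href="main.htm" '
                     'target="main">Home</a></body></html>'),
        "main.htm": "<html><body><p>Main content.</p></body></html>",
    }
    for name, text in pages.items():
        path = os.path.join(folder, name)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
    return sorted(pages)

print(write_frames("frames_demo"))  # ['index.htm', 'main.htm', 'menu.htm']
```

Opening index.htm in a browser gives the traditional fixed-menu-left, scrolling-content-right layout described above.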
Another use of frames is the free-floating frame, where a link is displayed in a new window. This uses the same sort of standard HTML code, like this: <a href="page_you_want_in_a_window.htm" target="_blank">Click here</a>. The window will be the previous default size; I haven't found whether the position and shape can be set easily. (Note: when you're using e.g. Explorer, you can open something you're interested in inside a smaller frame by right-clicking on it and choosing 'open in new window'. It's the same idea.)
Problems with Frames
Pop-up Messages (Alerts) can be easily displayed by putting onload='window.alert("Your message")' into the <BODY> statement along with background and link colors. When the HTML is first loaded, an official looking warning message in a box pops up, and remains until the reader clicks OK.
[In case you haven't worked this out, right-clicking on images, including animated ones, and including backgrounds, allows them to be saved onto your disk, with the original name or one you choose. These will usually be someone's copyright. (The process with Macs is different - in fact, I don't remember what it is). Another possibility is to look through your Windows files, with your 'My Computer' utility, in the temporary browser (or similar) subdirectory, where you may be able to transplant a graphic image.]
Background graphics. My background here is a simple cyan horizontal line on white; it's a tiny file. Large backgrounds may look impressive, but have two problems: (1) downloading time is longer; (2) they look different for people whose screens are set differently - you may of course simply assume most people use 800 by 600.
If you have Microsoft Office, C:\Program Files\Microsoft Office\Clipart\Publisher\Backgrounds has 30 or 40 backgrounds. If you view them (start with 'My Computer') you'll see many are surprisingly pale. By experiment you'll find that backgrounds have to be weak-looking; the repetition ensures the brain will notice them.
What follows are notes for people who scan in or draw their own images, but find the resulting files too large and want to shrink their pictures. In my opinion, graphics files ought to be cut as much as possible, to free up bandwidth and generally cut down timewasting:
If you want a fancy smallish logo or eye-catching picture, you may be able to edit a character from Wingdings or a similar typeface. If your image processor allows lettering to be drawn with a shadow, the normal picture (star, comet, phone symbol) can be given a striking 3D appearance.
Note on animated graphics - where part of a picture moves. Animated GIFs (only the .GIF format allows animation; the JPG format has no provision for multiple frames, and compresses the image in rectangular blocks) are fun, but watch for the overdone 'insect house' effect. They're best used for a definite instructional or attention-getting purpose: say, making some bottom-line figures flash red. Note that animated GIFs work in backgrounds (e.g. twinkling stars can be effective), but you have to be careful if the individual pictures are different sizes.
An easy way to experiment is to download Microsoft's free GIF Animator. With this you can load and look at stored animated GIFs which you have (you'll almost certainly find some in the Windows cache files). This software also lets you look at non-animated, simple GIFs - and you can set one colour to be transparent with it, so the background shows through, as in my shooting stars graphic. You can also add comments to GIF files - for example, your name. And, though this needs some familiarity with graphics processors, you can design your own portions of GIF and adjust their position to give the animated effect you want. If you can't be bothered with this, cheap CDs are available from e.g. Xoom and Jayde with thousands or millions of these images. Free sites include e.g. Xoom clip empire, rather small for an empire, including e.g. Java buttons; Animfactory, which has downloadable free animated graphics, essentially a taster for their CDs; Free graphics, 'over 300,000', with many links, designed for their convenience, and their advertisers'; and Caboodles, 'over 1.4 million images'.
There's a site which states that Unisys holds a patent on the LZW compression used in the GIF format, and that anyone using it may be liable to penalties. I don't know the legal status of this claim. However, you have been warned!
I've avoided sound in the past. Noises, voices and music need large amounts of data, and therefore possibly overlong download times. However, here are my outline notes:
Video. Capturing videotape into computers needs special equipment. In order to make computer systems work at all, small pictures are necessary, and there's intensive compression. The result is technically quite a feat, but nowhere near as good as the original. I'll only mention that .mov files are a 'QuickTime' format. Another format is .mpg, which I presume is a moving version of .jpg, as the same distortions creep in as with JPG still images. Windows Media Player plays these. However, it does not play RealPlayer's own format, .ram files.
[Back to top]
I'm afraid this is a plug for another antiquity, which personally I find more useful than most Windows software, namely the DOS version of WordPerfect. This has
For people who haven't noticed, the Windows 'File Manager' program and the Windows 95/98 'My Computer' click-and-drag copy do what Microsoft copy programs should always have done, but didn't: namely, allow for overflow of data. Thus, to back up the entire contents of a hard disk folder, My Computer/File Manager's copy will ask for new disks to be put in until the process is finished. This is useful for people who distrust Microsoft's Backup. The early versions of Backup were very user-hostile - one of the most unpleasant programs I've used, with worrying warnings, particularly in compressed mode, about the use of RAM and the initial settings. To store data in compressed form, and allow files to overflow from one disk to the next, these programs need additional data files; if these are lost, or if the compressed data becomes 'corrupted' in some way, you may find it impossible to restore your data.
Be cautious: avoid this program and do straight, readable copies.
Rewriteable and write-once CDs: there's an important difference. Rewriteable CDs (the expensive ones) behave like ordinary magnetic media. BUT write-once CDs are trickier ('write-once' rather than 'write-only', since they can be read as often as you like): each time you write something, the records of where each file is stored, and where its sectors are, have to be altered. Since the disk can't be erased, there's no fixed place on the disk for this information. This is the reason that the disk has to be made ready for writing, or made ready to be readable by an ordinary PC. The process of converting a write-once disk to mimic an ordinary CD can be slow; it's also slow to convert it back to allow writing.
Note that writeable CDs may not be checkable with Scandisk; if there's a mistake in them, for example with cross-linked files or some other storage problem, you may have no easy way to detect and correct this. If a CD behaves oddly, you may do better to scrap it.
NB the allocation unit ('cluster') size on Zip disks is remarkably small - much smaller than on typical hard disks, which now have a minimum stored file size of 32K, unless you use the Windows 98 system (FAT32), which I don't personally recommend, as you may come up against software incompatibilities. So, many small files can be copied onto an unexpectedly small number of such disks. [Note: Zip disks have no connection with zip files, which are files in compressed form, i.e. manipulated by software to be short, but expandable to their original length.]
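The arithmetic behind this: a file always occupies a whole number of clusters, so its on-disk cost is its size rounded up to the next cluster boundary. A sketch (the cluster sizes are illustrative assumptions, not measurements of any particular disk):

```python
def on_disk(size, cluster):
    """Bytes actually consumed on disk by a file of `size` bytes."""
    return -(-size // cluster) * cluster   # ceiling division, scaled back up

small_file = 2000            # a 2K HTML file, say
big_fat16 = 32 * 1024        # 32K clusters, as on a large FAT16 partition
zip_disk = 4 * 1024          # an assumed, smaller, Zip disk cluster

print(on_disk(small_file, big_fat16))   # 32768 - about 30K of slack
print(on_disk(small_file, zip_disk))    # 4096 - far less waste per file
```

A thousand such 2K files would consume about 32Mb on the big-cluster disk but only 4Mb with the small clusters - which is why so many small files fit.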
Zip disks are difficult to undelete - deleted files don't show up in the 'recycle bin'. (You can undelete by going to DOS, LOCKing the Zip disk, and using UNDELETE.) This seems a minor reason to prefer large format 'superfloppies'.
The leading free sites seem to be Fortunecity, Geocities, Tripod, and Xoom. They all claim to have huge numbers of members. Some search engines have free sites as a sideline, e.g. MSN and Jayde. Many of the free websites aren't much more than diary jottings; but some are substantial and interesting. Sometimes you can investigate such hosts by guessing site names, jim.htm for example. Unfortunately, as with other things, you tend to get what you pay for (if you're lucky): all these sites are paid for indirectly e.g. by ads from outside sources, internal ads and promotions, phone subsidies, on-line sales, or being partially locked in. Geocities sites have a small window which opens and displays ads; Xoom has a different system with banners, and specialises in computer sales of discounted software and hardware items. It might in principle be possible to use these sites just for storage, for example, of big files, which could be downloaded free for the user. Unsurprisingly, they've thought of this and discourage such use.
Freespace.net is a directory, one hopes reasonably complete, of 'free' sites, classified by category (e.g. business, non-profit), by country and language, by whether they allow CGI processing, and by Mb of space offered. I doubt whether they're forthright about the downside. A typical larger site is www.hypermart.net, for business websites.
But.. small-print warnings:
Freeyellow says 'Your... Website is FREE.. if you register your custom domain name.. it will only be $149.50... This will get your custom address setup and HOSTED.. An address MUST BE HOSTED.. or it will not exist. Then there will only be a nominal $29.50 per year maintenance fee...'
Tripod (I was told by e-mail) is free for a year 'to Premium members'. However, the actual site didn't seem to say so; nor did it say what happened after the year ended.
Xoom in its small print says: '... If any third party brings a claim, lawsuit or other proceeding against Xoom based on your conduct or use of Xoom services, you agree to compensate Xoom (including its officers, directors, employees and agents) for any and all losses, liabilities, damages or expenses...' I was also unable to find a clear statement about prices; there's lots of 'join free' stuff, but I wasn't able to find that remaining a member was free, too. Xoom asks for complete freedom to use any hosted webpages (but only for ads or promotion). Be careful over copyright claims on your site.
Yahoo! managed to achieve fame in tandem with Geocities (these tie-ins are likely to become more popular if browsers move toward stricter use of search engines they choose to promote) by claiming virtually all rights of exploitation - copyright, imitation, derivation, etc. etc. - over such sites. Another possibility (which I haven't checked) is that directories like Yahoo! might claim rights over sites which they include. Two URLs I have are boycottyahoo, and a supposedly helpful page by Yahoo!, http://docs.yahoo.com/docs/info/toshelp.html. In any case Yahoo! irritated me by removing my Vietnam War Crimes site from their 'directory'. Possibly the time is overdue for a revision of their credibility, in a downward direction. They now (early 2000) have some link with Reuters, which will no doubt worsen them.
In the UK (early-2000): even the pay sites give little support (I think Demon, one of the most popular, gives no support to people struggling with websites; Prestel certainly don't, except in the sense of providing a group for people to quiz each other. Nor is this surprising, as a beginner could consume almost indefinite amounts of time wrestling with HTML.) A UK Freeserve is hosted by an electronics hardware company. Another, Tesconet, is hosted by a supermarket chain, as is LineOne. Amazon has one called TheSeed. I once met Richard Branson (his car had broken down), who now has VirginNet with a sort of white-on-white styling. It's quite hard to investigate these sites without signing onto them, which may incur some sort of liability - your hard disk may have odd things done to it. All I've established about Freeserve is that it has twice as many members in the UK as AOL (signed up in a much shorter time) but that it's not as easy as it sounds to set up a website - I'm told users can phone for a long time without being 'processed'. I don't know how much advertising has to be tolerated, nor how difficult it is to break out of their self-enclosed system. AOL's arrangement suggests that many users probably never realise how fenced in they are. I recently found zyweb which offers HTML outlines, so you can make your own sites relatively painlessly by copying their examples. I'm told it's free but slow.
Free Downloadable Software: Dutch free site has free material for webmasters. Coolboard is a 'free customized message board'. Freeguestbooks is a Norwegian site. 'Guest Gear' is a guest book by Lycos: Guestbooks are interactive e-mails, so visitors can leave messages which other visitors can read. Guest books are a relatively easy way to make your site interactive; however, they tend to be a bit slow. Be Seen, part of LookSmart, says it has free web design software, counter, site searcher, chatroom etc. Webmaster-resources has articles etc. Microsoft's siteholder (click on products link) had a "wealth of free tools" - doesn't sound quite like Microsoft, does it? - but seems to have gone away. There's a set of sites promising free or shareware files, nearly always with barely enough description of their programs to be useful to you - you may find a piece of software has some problem which makes it unusable to you; for example, it may be outdated or have incomprehensible instructions. Software may have irrelevant advertising or bugs. You may need PERL, UNIX, or other things. (If this happens, why not email the site and tell them? They may genuinely not know). So don't be surprised if interesting-sounding programs turn out not to be there, as out of reach as fairground prizes. Some sites may alter the way your computer downloads, or do unannounced things with cookies, or have other annoying effects. Some programs are just tasters, and not 'free'. So beware. Sites include Shareware, Freecode, Winsite, Completely Free Software, Dave Central (in many ways absolutely typical - huge lists, but it's impossible to tell from the descriptions what the stuff is really like to use; a sort of used-car approach), Filez, Free Stuff Factory, Filepile, Nonags, Only Freeware, 'The Freeware Publishing Site', ZDnet. I recently found FreeSiteTools which is interesting. Some freeware is very good: see below in Graphics on 'IrfanView'.
Forte Free Agent (for Usenet groups) and Pegasus (for email; a lot of useful features, for example a selective download of headers only, allowing you to not download spam, and easy separation into batches for your future reference), are two other popular freeware programs. I think Listbot is free (a way to manage an e-mailing list; I've never tried this. Check Majordomo for the same function). I quite like WebCopier by Maxim Klimov as a site grabber, to get a snapshot, stored on disk, of the whole of some interesting site at some date. (Click for free download. Warning - this software is not very easy to use, and has bugs, though usually it's OK. You must enter a URL address as a starting point, and remember to click on 'start download', which is hidden under the name of a 'project'. You'll need to click on 'URL filters' to download sites hosted away from the URL. Add the date to the title you give your file, so you have evidence of when it was grabbed - sites change. The folder will be in a subdirectory of c:\Program Files\Webcopier, and the whole folder is relocatable). Another site grabber, also free but with ads, is NetVampire, but for my taste it has a very user-hostile interface. 'Hot Dawg' on Arthur Smith's site has uses - see Tables and Frames, below.
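The heart of any site grabber is a loop: fetch a page, collect the links on it, and repeat for each link that stays on the same host (the 'URL filters' setting above controls that last part). This sketch shows just the link-collecting step, using only standard Python; the page text and URLs are invented for illustration:

```python
# Collect the <a href=...> targets from one page, resolving relative
# links against the page's own address.
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Turn "page2.htm" into a full URL on the same host
                    self.links.append(urljoin(self.base_url, value))

page = '<a href="page2.htm">Next</a> <a href="http://elsewhere.example/x">Away</a>'
collector = LinkCollector("http://www.example.com/index.htm")
collector.feed(page)
print(collector.links)
# ['http://www.example.com/page2.htm', 'http://elsewhere.example/x']
```

A real grabber would then fetch each collected link in turn (e.g. with urllib), save each page to disk, and feed the new pages back through the collector.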
Magazine cover CD-ROMs have software of varying usefulness. The best advice I can give is that software from a brand name you recognise and like may be the best for you, even if it is supposedly obsolete.
(Free graphicssee above.)
Free newsletters can give useful snippets of information; all you have to do is ignore the stuff you don't want, discount repetitive stuff if you've seen it before, and be cautious about hype. They are emailed to you if you ask them, or 'subscribe' (on the Internet, this means freeusually). They all make it easy to unsubscribe, by sending an email. The ones I've found are, alphabetically: Add Me! (claims 180,000; weekly; web tips of varying levels, but often sensible), AOL has a newsletter (surprise! It's entirely concerned with AOL and its sites, and sales links), Internet Day (daily; claims 150,000; part of Jayde; articles supplied by readers, generally as thinly-disguised hard-sell promotions), Lycos News (somewhere on their search engine), NetDummy (sic; the link is an email address; send a message to subscribe. Daily? I seem to have lost my details of this), NetMechanic Webmaster Tips (once a month?), Webmastersonly, SitePoint (by far the smallest; has lots of computer and website news links), Website Journal (weekly by Website Garage, coupled with Netscape; often sensible Webmaster info), Xoom Newsletter (from Xoom; mostly hardware and CD ROM offers).
News. Or what passes for news. Some sites are news.com, andovernews.com, and isyndicate.com. There's also the Drudge Report, a sort of conventional meta-look at all the third-rate sources of the modern media, such as the New York Times.
The good news is that counters exist, and can in principle tell you how many visitors you have. The less good news is that service providers don't seem to like them, because they take up computer processing time which could be used for better things, and also perhaps because they offer scope for people to try running programs in the heart of their machines - these are the ones with 'cgi' in their code. Having said that, some service providers (Prestel, for example) permit their users this facility. (The providers themselves of course have this sort of information, including the total number of bytes downloaded from each site; perhaps if you ask nicely they might give you some of it.) All the commercial (i.e. advert-paid) counters I've seen on other people's sites have been painfully slow, and never seemed to register their information.
You may be told that counters count the graphics on your site as well as the pages, or count repeatedly each time a browser moves through a page. It depends on the programming; they may, or may not. If you're in doubt, experiment to see, preferably asking someone else to download from your site. You may get different results if a browser is set so pages aren't stored in caches, but I think this must be unusual, since caching is such an advantage from the speed point of view.
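Since the providers' access logs already hold this information, a few lines of code can produce an honest count - and make exactly the distinction just mentioned, between page hits and graphics hits. A sketch (the log lines follow the common Apache format, and the entries are invented):

```python
# Count genuine page views in a server access log, skipping requests
# for graphics files, which would otherwise inflate the total.
log = """\
1.2.3.4 - - [01/Jan/2000:10:00:00 +0000] "GET /index.htm HTTP/1.0" 200 5120
1.2.3.4 - - [01/Jan/2000:10:00:01 +0000] "GET /stars.gif HTTP/1.0" 200 900
5.6.7.8 - - [01/Jan/2000:11:00:00 +0000] "GET /index.htm HTTP/1.0" 200 5120
"""

page_hits = 0
for line in log.splitlines():
    path = line.split('"')[1].split()[1]      # the requested URL
    if not path.endswith((".gif", ".jpg")):   # skip graphics requests
        page_hits += 1

print(page_hits)   # 2 - two genuine page views; the GIF is not counted
```

The same loop could just as easily sum the final field of each line to give total bytes downloaded per page.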
NedStat is a free counter which looks good (based in the Netherlands).
TheCounter also looks good (and is easy to instal).
Commercial counters include these, none of which I've tried:
Cool counter. The Basic edition is 'by far the most popular'; it's free, counts any number of pages, but has a banner you have to display. Extreme Counter, with no ads, sounds cheap.
Fastcounter is another. So is Hitbox which claims to give lots of information; if you put an intrusive green symbol on your site, you can get it free, at least in theory (I couldn't get it to work, twice!) and get yourself listed on yep.com, their in-house search engine. There's Watchwise.
Pagecount (says it's free but with ad banner; appears to count one page only, and has up to 100 origins of hits) is yet another.
And Showstat, Sitemeter and Sitetracker ('.. some of the most detailed..'). And Stat Track.
The idea is to generate an index for your personal site; when interrogated with a keyword, it should list your files which contain it, in standard format - file name, file title, description. For example, you might search for Constantine or Sherlock or Napoleon or, with a good site searcher, complicated things like chemical warfare. And the result will be a list of files just from the one site. If you have a precise subject in mind which you're fairly sure the site deals with, then a site searcher is a valuable tool, especially with very large sites. However, if my own experience is anything to go by, this feature isn't popular with most users (possibly the time delay has something to do with this) - so if you don't have one, you may not lose much.
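The mechanism is straightforward enough to sketch. Here the 'site' is two in-memory strings so the example is self-contained; on a real site you would read each .htm file from disk instead. The file names and text are invented:

```python
# A minimal site searcher: given a keyword, list matching files as
# (file name, file title) pairs - the standard format described above.
import re

site = {
    "sherlock.htm": "<title>Sherlock Holmes</title> Notes on the detective.",
    "napoleon.htm": "<title>Napoleon</title> The emperor, not the detective.",
}

def search_site(keyword):
    results = []
    for name, text in site.items():
        if keyword.lower() in text.lower():
            # Pull the page's <title> out for the listing, if it has one
            match = re.search(r"<title>(.*?)</title>", text, re.I)
            title = match.group(1) if match else name
            results.append((name, title))
    return results

print(search_site("emperor"))   # [('napoleon.htm', 'Napoleon')]
```

A search for 'detective' would list both files, since the word appears in each - which is why a good searcher also needs the description field, so visitors can tell the results apart.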
msconfig. It's worth knowing that you can control your start-up files and the way your PC is configured. If your computer has picked up software which insists on loading itself, click Start | Run and type msconfig in the letterbox. Click on OK and pick 'Selective startup'. You can turn off 'load startup group items' which gives you a normal desktop, except that it's clear of the programs which are usually automatically loaded. (This may mean e.g. your scanner is turned off).
And you can tinker about with config.sys, autoexec.bat, win.ini and so on. Warning: I accept no responsibility. You must know what you're doing! Sometimes it's possible to make useful adjustments with this option; with some poking around you may well find you can turn off software you don't want.
It's also worth knowing that you can check what your computer is doing when downloading (or uploading) on your phone line. This is particularly useful with a long file (for example, a .pdf file) where there's no progress bar to show you what's happening. Right-click the modem indicator typically in the bottom right of your screen (my picture shows what it looks like), and select 'status'. Bytes received, and bytes sent, are shown to you. If both stay fixed for any length of time, someone's system may have been locked up or turned off.
Restart. Known as a 'warm boot': the idea is that programs which have somehow got themselves in a tangle can be closed down or reset. The effect is as though the computer had just been turned on. Recommended if (e.g.) your windows don't display properly, or there are strange conflicts preventing scanners, printers, external disks, or even the keyboard from working properly. Click on Start | Shut Down | Restart. Any unsaved data in memory will be lost.
Scandisk (mentioned above) is an important first-line check on your computer. It tests that files are stored correctly. If you switch off a computer without closing it properly, there's a risk that incomplete files may be left on your computer, which may later get tangled up and cross-linked with new ones. Scandisk (Start | Programs | Accessories | System tools | Scandisk) ought therefore to be run from time to time. Tip: Windows can run several programs simultaneously; you may find Scandisk is intolerably slow, presumably because files repeatedly get changed. So switch to DOS mode [Start | Shut Down (sic) | Restart in MS-DOS mode] and type, usually, SCANDISK C: after which exit or win or windows will return you to the desktop.
'Viruses' are the not-very-accurately named bits of program which are designed to attach themselves to other people's computers. In fact many are inept copies of other people's ideas; many don't work. The chances of getting a virus are low; obviously, the people designing PCs must have had this possibility in mind, and there's a strict division of files into those to be viewed, and those which run as programs. My guess is that more PCs are damaged in some way by programs with bugs than through intentional damage.
There have been recent scares about 'attachments' to e-mails. I don't know if these scares were in fact valid. 'Pegasus' allows you to look at the headers of e-mails without downloading the body, giving some assurance. But if you're worried about this, don't open attachments, but e-mail back and ask for the information to be sent as an ordinary e-mail.
Other things you can do are pay for a virus detection program, and/or consider doing your main work on a computer which isn't connected to anything else and where suspect disks aren't permitted.
Firewalls. This is a fancy name for software which is supposed to check on data being imported down a phone line into your PC. In my view, it's possible for, say, Microsoft, or anyone else, including firewall programmers, to incorporate special unannounced stuff in their own software; so this cannot be completely foolproof. If you want to experiment, try Zonelabs, who offer ZoneAlarm free to non-business users. As always, the instructions are a bit obscure; what exactly does 'locked' mean, for example? But, when surfing the net, you can have the pleasure of turning down companies that insist on probing your PC. A pop-up alert shows when something's happening, and the URL of the contact is listed, in number form, so if curious you can try to see what was happening. You can turn off individual programs (e.g. AOL's) from contacting the Internet, and set the program to prevent any access when the screen saver is on. It also says it checks for Visual Basic attachments to e-mails. (I think this company also has an exclude-ads program, to prevent banner ads from well-known ad sources from downloading.)
Everyone knows that .com sites are supposed to be commercial, .org sites are organisations, .net sites are something vague to do with the Net, and .gov, .mil, .edu have their own uses. It's still true that .com sites are by far the best known, so, even if your site isn't commercial, there's a lot to be said for a dot com address. One convention is to avoid making people guess whether dashes or underscores are used, and simply use a name without spaces - www.normanfinkelstein.com illustrates the principle.
After some unhappiness over a US monopoly, there's been a mushroom growth of hosting sites.
If your ISP is one which hosts domain names, and you're happy with your ISP, all you have to do is choose your name, register it, and do what's needed to make it all work.
If your ISP doesn't host domains, the easiest thing is to keep your site where it is and register its URL somewhere else; most such sites (as far as I know) allow you to redirect any hits to your site, however long the name is. You can also move your site to another ISP without much disruption; all you do is change the redirection to the new address. There are (inevitably) complications: you may be allowed the option of keeping your domain name visible in your visitor's browser; or you may choose to allow it to display the real address. I'm uncertain whether any domain hosts allow more than a handful of subdirectories to be shown. (Example: if you are www.johnsmith.com you can sometimes allow johnsmith.com/computers.htm to redirect properly, but you may not have many such options).
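If you move a site yourself, without a registrar's redirection service, a similar effect can be had with a small page left at the old address. A sketch, in the same style of HTML as the rest of this page (the new address is an invented example):

```html
<!-- Leave this page at the OLD address; it forwards visitors after 5 seconds. -->
<HEAD>
<TITLE>This site has moved</TITLE>
<META HTTP-EQUIV="refresh" CONTENT="5; URL=http://www.newhost.com/johnsmith/">
</HEAD>
<BODY>
This site has moved to
<A HREF="http://www.newhost.com/johnsmith/">www.newhost.com/johnsmith</A>.
</BODY>
```

The ordinary link in the body is there for visitors whose browsers ignore the META refresh; unlike a registrar's redirection, this method does leave the real new address visible in the visitor's browser.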
I'm told that you may find that, if you check a likely domain name, and find it's available, you may lose it, because the name may be bought speculatively by someone with access to checked-out names. So, if you think of a terrific name, you might do well to be cautious in checking for it, and perhaps be prepared to register it quickly.
Once your name is registered, although it's theoretically yours, it's physically processed by someone's computers, and they can cause problems by not responding to queries, not answering emails, and so on. They may impose charges for moving to another site. It may be difficult to alter such details as phone numbers. Unfortunately the situation changes fairly fast; it's impossible to advise on the best hosts.
Hardware: these companies use specialist equipment which is barely known outside the smallish number of people who work with it. So far as I know there's no way to predict who will have the best stuff in a year or two's time. Another complication is resellers, companies who appear not to host sites, but act as sales outfits. UK2net (in Britain) is an example; it is a reseller of joker.com, of Germany, and relies on ads which it's hard to describe as other than misleading. Unless you enjoy wasting your time, read the small print and take care.
[Back to top]