HyperText Markup Language (HTML) is a language to specify the structure of documents for retrieval across the Internet using browser programs of the WorldWideWeb.
HTML is an application of the Standard Generalized Markup Language (SGML), which is the International Standard (ISO 8879) for text markup. The principle is that text markup concentrates on structure rather than appearance, making the files more reusable and leaving the visual details to the end-user software (like the browser you're reading this with now). For the reasons why, see Eliot Kimber's comments.
Details of the specification are in the IETF Draft and the HTML Document Type Definition (DTD). There is a FAQ (Frequently-Asked Questions) document (also available by anonymous ftp from rtfm.mit.edu), and a new book on HTML and the WorldWideWeb is due out shortly.
Simon Spero (ses@tipper.oit.unc.edu) explained it as:
The HTML DTD with its very simple element structure is primarily intended for describing the structural elements that appear on hypertext pages. Not the structure of the documents that comprise those pages (it's too vanilla for that), but the pages themselves. Not the layout of the pages, but the structure.
In a hypertext browsing system, the page is the basic object into which elements are placed, and which is common to all documents across all display technologies. Much of the structure of a document might be implicitly expressed via links between pages.
In order for a page to be displayed and browsed through correctly on a variety of systems - one of the primary design goals for the web - the layout of a page must be described in a sufficiently abstract way as to make sense on a VT100, a NeXT, or an X workstation. This rule was broken somewhat by the <img> tag, introduced by NCSA for their Xmosaic browser, but is otherwise generally intact. Attributes like centering are only really suitable for bitmapped displays with variable spacing. To allow portable display it is much better to indicate the visual role to be played by the attribute (the reason why you wanted it centered), and allow the display engine to decide how that text should be rendered.
The Uniform Resource Locator (URL) is the 'address' of a resource in the Web. The resource could be a file, an index, or a program that does some processing; all of them are referred to using the same format:
scheme://host[:port]/path/filename{#location|?indexterm}
Examples:
- ftp://www.ucc.ie/pub/sgml/p2sg.ps
- http://www.ucc.ie/cgi-bin/acronym?url
- gopher://ds.internic.net/
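As a rough illustration of how the parts of this format fit together, the sketch below splits the example addresses into their components using Python's standard urllib.parse module. The choice of Python and the helper name split_url are assumptions made for illustration only; they are not part of the HTML or URL specifications.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the parts of a URL named in the format above as a dict.

    split_url is a hypothetical helper, introduced only for this sketch.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,          # None when no explicit :port is given
        "path": parts.path,          # includes the filename, if any
        "indexterm": parts.query,    # text after '?'
        "location": parts.fragment,  # text after '#'
    }

for example in (
    "ftp://www.ucc.ie/pub/sgml/p2sg.ps",
    "http://www.ucc.ie/cgi-bin/acronym?url",
    "gopher://ds.internic.net/",
):
    print(split_url(example))
```

The same function handles all three schemes, which reflects the point that files, indexes, and programs are all addressed in one common format.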
There are two exceptions to this format, for Usenet news and for Telnet.
telnet://library:@iruccvax.ucc.ie
news:comp.infosystems.www.users
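To see how these two forms differ from the general format, they can be inspected with Python's standard urllib.parse module, as in the minimal sketch below. Reading `library` as a login name is an assumption based on the usual user:password@host convention; the news address has no host part at all, only a newsgroup name.

```python
from urllib.parse import urlsplit

# Telnet: a login name (assumed) appears before the host, with an empty password.
telnet = urlsplit("telnet://library:@iruccvax.ucc.ie")
print(telnet.scheme, telnet.username, telnet.hostname)

# Usenet news: no '//' and no host, only the newsgroup name after the scheme.
news = urlsplit("news:comp.infosystems.www.users")
print(news.scheme, repr(news.netloc), news.path)
```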