Structured Emacs



Why structured Emacs?

Structured editing is a means of rationalising and ordering your writing so that the structure of your document is taken care of by the editor, leaving you freer to concentrate on the content. Structured editors use markup to identify the structure of your document (headings, paragraphs, lists, etc.), much as some word processors do, but in a rational and reusable form. The international standard for text markup is the Standard Generalized Markup Language (SGML), which is supported in Emacs by psgml-mode.

There are many different applications of SGML: some of the most popular are:

You should read a little more about SGML and how it works before starting to use structured editing, in order to get the maximum benefit from it. You should also read the documentation for the specific SGML application you plan to use.

There is a separate Reference Card for HTML available online.

psgml-mode checks the syntax of SGML according to the relevant Document Type Definition (DTD), so it always knows where you are in the document. This means you need to have a copy of the relevant DTD for the application you are using. Your systems manager should be able to provide a precompiled (.ced) file for you.


Emacs is GNU software. The GNU project (it stands for `GNU's Not Unix') aims to provide a complete operating environment in a portable form, distributable free of charge.

There's an online tutorial you can use (once you have Emacs running) by typing ^h t (the ^ prefix means you hold down the Ctrl key while pressing the letter that follows).


Loading psgml-mode

If psgml-mode has been installed correctly on your machine and your filename ends in .sgml or .html, Emacs should recognise this and load psgml-mode automatically. If this does not happen, you can force any buffer to become SGML-sensitive by typing ESC x sgml-mode RET

Loading the Document Type Definition

In order to recognise the component parts of your document, psgml-mode needs access to the DTD. You must provide this by making the first line of your document a DOCTYPE statement in the form:

<!DOCTYPE type PUBLIC public-id>

or

<!DOCTYPE type SYSTEM directory/filename.DTD>

where

Examples:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<!doctype html system "~/web/html.dtd">
<!DOCTYPE tei.2 SYSTEM "c:/tei/dtds/tei2.dtd">

Note that on a PC system the directory separators must always be UNIX-style forward slashes, not DOS-style backslashes.
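
The two DOCTYPE forms can also be checked outside Emacs. The following is only a rough sketch in Python, not anything psgml-mode itself provides: the names DOCTYPE_RE, parse_doctype and has_backslash_path are made up for illustration, and it assumes the identifier is quoted as in the examples above and that a PUBLIC declaration carries no additional system identifier.

```python
import re

# Matches the two DOCTYPE forms shown above. Keywords are matched
# case-insensitively because the second example is written in lower case.
# Assumption: the identifier is enclosed in double quotes.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<type>[\w.-]+)\s+(?P<kind>PUBLIC|SYSTEM)\s+"(?P<ident>[^"]*)"\s*>',
    re.IGNORECASE,
)


def parse_doctype(line):
    """Return (document type, PUBLIC or SYSTEM, identifier), or None."""
    match = DOCTYPE_RE.match(line.strip())
    if match is None:
        return None
    return (match.group("type"), match.group("kind").upper(), match.group("ident"))


def has_backslash_path(line):
    """True if a SYSTEM identifier uses DOS-style backslashes."""
    parsed = parse_doctype(line)
    return parsed is not None and parsed[1] == "SYSTEM" and "\\" in parsed[2]


for example in (
    '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">',
    '<!doctype html system "~/web/html.dtd">',
    '<!DOCTYPE tei.2 SYSTEM "c:/tei/dtds/tei2.dtd">',
):
    print(parse_doctype(example))
```

A check like has_backslash_path could be run over a directory of files on a PC to catch backslash separators before psgml-mode fails to find the DTD.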

Emacs psgml-mode will read and parse the Document Type Declaration the moment you type something which requires parsing or validation. You can make it read the DTD first by typing ^c^p

Local (UCC) users on the WWW host can type Esc x html RET to initiate the DocType and the parse for a new file.

Once a DTD has been parsed and read correctly, you can make a fast-loading version by typing ESC x sgml-save-dtd RET and giving it a name ending with .ced

This can be reloaded for subsequent sessions with this file or others by typing ESC x sgml-load-dtd RET and giving the name. This avoids having to wait for a lengthy parse if you use a long or complex DTD like the TEI.

Structuring

The difference between structured editing and visual editing is that with structured editing you tell the system what kind of information you are about to type (or have just typed, or want to change), rather than how it is to appear when printed. This lets you uniquely and unambiguously define the component parts of your document entirely separately from their visual instantiation, so that your text becomes reusable, rather than tied to a specific manufacturer's proprietary view of how it `ought' to be. See Eliot Kimber's comments on public vs proprietary markup.

Doing this means you say `I want to start a new paragraph here' rather than `Break the line and indent 5 spaces'. The former describes the logic of what you want; the latter only tells you how it will look on one particular occasion. If you have ever written material for a publisher, you will have come across the problem of how to describe what you want in generic terms, when you don't yet know the actual design or style of typography to be used. SGML lets you do this without interfering with your text in any way.

It does this by enclosing your text in tags, which are named items in angled brackets. The names are defined in the DTD you use, so this paragraph is actually stored as:

<p>It does this by enclosing your text in
<dfn>tags</dfn>, which are named items in angled brackets.
The names are defined in the DTD you use, so this paragraph is
actually stored as:</p>

The elements of your text are thus sandwiched between a start-tag and an end-tag (the end-tag has a slash before the name to identify it as such). Note in the example above how a descriptive element (DFN: a definition of a new term) is identified within a P (paragraph) element. The degree to which elements can be nested inside one another is defined by the DTD: obviously you can't normally have a section heading (for example) beginning inside a paragraph.
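
To make the nesting of start-tags and end-tags visible, the sketch below walks a shortened copy of the sample paragraph and prints each tag at its depth. It uses Python's standard html.parser module purely as a stand-in: unlike psgml-mode it does not read a DTD and does not check whether the nesting is permitted. The names NestingRecorder and tag_events are hypothetical.

```python
from html.parser import HTMLParser


class NestingRecorder(HTMLParser):
    """Record each start-tag and end-tag with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.events = []

    def handle_starttag(self, tag, attrs):
        # A start-tag opens an element one level deeper.
        self.events.append((self.depth, "start", tag))
        self.depth += 1

    def handle_endtag(self, tag):
        # An end-tag closes the element and returns to the outer level.
        self.depth -= 1
        self.events.append((self.depth, "end", tag))


def tag_events(markup):
    """Return a list of (depth, 'start' or 'end', element name) tuples."""
    parser = NestingRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


sample = (
    "<p>It does this by enclosing your text in\n"
    "<dfn>tags</dfn>, which are named items in angled brackets.</p>"
)
for depth, kind, name in tag_events(sample):
    print("  " * depth + kind + "-tag: " + name)
```

The DFN element appears one level inside the P element, matching the description above.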

Editing

All Emacs editing keys operate exactly as normal, with the following major additions. You can use completion on all element names: pressing TAB when being asked for the element name will display a list of all elements valid at this point.

Element insertion and deletion

Editing movement

Special characters

To keep your files portable, you cannot use PC-specific or Mac-specific accent or other special characters. Instead, you use the international standard set of character names from the ISO Latin-1 and other lists, as defined in the DTD you are using.

For example, the ones valid for HTML are listed below. To write a name like René Füßli, you type:

Ren&eacute; F&uuml;&szlig;li

Editing systems which understand SGML and HTML usually let you insert these names from menus.
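
If you already have text typed with accented characters, the conversion to character names can be automated. This is a minimal sketch, assuming Python's standard html.entities table is an acceptable source of names (it covers the HTML names, which include the ISO Latin-1 set but also later additions, so check the result against your own DTD). The function name to_entities is made up for illustration, and the sketch deliberately leaves the markup characters &, < and > alone.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace each non-ASCII character with a named or numeric reference."""
    pieces = []
    for char in text:
        code = ord(char)
        if code < 128:
            # Plain ASCII is already portable; keep it unchanged.
            pieces.append(char)
        elif code in codepoint2name:
            # Use the character name, e.g. 233 becomes &eacute;
            pieces.append("&" + codepoint2name[code] + ";")
        else:
            # No name known: fall back to a numeric character reference.
            pieces.append("&#" + str(code) + ";")
    return "".join(pieces)


print(to_entities("René Füßli"))  # prints Ren&eacute; F&uuml;&szlig;li
```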


ISO Latin-1 characters

&Agrave;  À  capital A, grave accent      &iuml;    ï  small i, diæresis/umlaut   
&agrave;  à  small a, grave accent        &ETH;     Ð  capital Eth, Icelandic
&Aacute;  Á  capital A, acute accent      &eth;     ð  small eth, Icelandic
&aacute;  á  small a, acute accent        &Ntilde;  Ñ  capital N, tilde        
&Acirc;   Â  capital A, circumflex        &ntilde;  ñ  small n, tilde               
&acirc;   â  small a, circumflex          &Ograve;  Ò  capital O, grave accent      
&Atilde;  Ã  capital A, tilde             &ograve;  ò  small o, grave accent             
&atilde;  ã  small a, tilde               &Oacute;  Ó  capital O, acute accent      
&Auml;    Ä  capital A, diæresis/umlaut   &oacute;  ó  small o, acute accent        
&auml;    ä  small a, diæresis/umlaut     &Ocirc;   Ô  capital O, circumflex   
&Aring;   Å  capital A, ring              &ocirc;   ô  small o, circumflex            
&aring;   å  small a, ring                &Otilde;  Õ  capital O, tilde             
&AElig;   Æ  capital AE ligature          &otilde;  õ  small o, tilde               
&aelig;   æ  small ae ligature            &Ouml;    Ö  capital O, diæresis/umlaut 
&Ccedil;  Ç  capital C, cedilla           &ouml;    ö  small o, diæresis/umlaut   
&ccedil;  ç  small c, cedilla             &Oslash;  Ø  capital O, slash                   
&Egrave;  È  capital E, grave accent      &oslash;  ø  small o, slash          
&egrave;  è  small e, grave accent        &Ugrave;  Ù  capital U, grave accent           
&Eacute;  É  capital E, acute accent      &ugrave;  ù  small u, grave accent        
&eacute;  é  small e, acute accent        &Uacute;  Ú  capital U, acute accent      
&Ecirc;   Ê  capital E, circumflex        &uacute;  ú  small u, acute accent        
&ecirc;   ê  small e, circumflex          &Ucirc;   Û  capital U, circumflex          
&Euml;    Ë  capital E, diæresis/umlaut   &ucirc;   û  small u, circumflex            
&euml;    ë  small e, diæresis/umlaut     &Uuml;    Ü  capital U, diæresis/umlaut 
&Igrave;  Ì  capital I, grave accent      &uuml;    ü  small u, diæresis/umlaut      
&igrave;  ì  small i, grave accent        &Yacute;  Ý  capital Y, acute accent      
&Iacute;  Í  capital I, acute accent      &yacute;  ý  small y, acute accent        
&iacute;  í  small i, acute accent        &THORN;   Þ  capital Thorn, Icelandic       
&Icirc;   Î  capital I, circumflex        &thorn;   þ  small thorn, Icelandic         
&icirc;   î  small i, circumflex          &szlig;   ß  small sharp s, German sz           
&Iuml;    Ï  capital I, diæresis/umlaut   &yuml;    ÿ  small y, diæresis/umlaut

Additional characters from ISO 8859-1

&#160; &nbsp;     non-breaking space          &#177; &plusmn; ± plus-or-minus sign          
&#161; &iexcl;  ¡ inverted exclamation mark   &#178; &sup2;   ² superscript two          
&#162; &cent;   ¢ cent sign                   &#179; &sup3;   ³ superscript three        
&#163; &pound;  £ pound sign                  &#180; &acute;  ´ acute accent             
&#164; &curren; ¤ general currency sign       &#181; &micro;  µ micro sign
&#165; &yen;    ¥ yen sign                    &#182; &para;   ¶ pilcrow (paragraph sign) 
&#166; &brvbar; ¦ broken (vertical) bar       &#183; &middot; · middle dot               
&#167; &sect;   § section sign                &#184; &cedil;  ¸ cedilla
&#168; &uml;    ¨ umlaut/dieresis             &#185; &sup1;   ¹ superscript one          
&#169; &copy;   © copyright sign              &#186; &ordm;   º ordinal indicator, male  
&#170; &ordf;   ª ordinal indicator, fem      &#187; &raquo;  » angle quotation mark, right   
&#171; &laquo;  « angle quotation mark, left  &#188; &frac14; ¼ fraction one-quarter          
&#172; &not;    ¬ not sign                    &#189; &frac12; ½ fraction one-half             
&#173; &shy;    ­ soft hyphen                 &#190; &frac34; ¾ fraction three-quarters       
&#174; &reg;    ® registered sign             &#191; &iquest; ¿ inverted question mark        
&#175; &macr;   ¯ macron                      &#215; &times;  × multiply sign                 
&#176; &deg;    ° degree sign                 &#247; &divide; ÷ division sign