pi_1901725
Loedertje is losing it

Keep going, keep going!

pi_1901739
quote:
On Sunday 14 October 2001 23:24, calvobbes wrote:
Loedertje is losing it

Keep going, keep going!


:{
pi_1901748
It has turned into quite a cosy topic after all, hasn't it?

Oh, and briefly back on topic:

[Peeeeeest] text [/Peeeeeest]

:{
  † In Memoriam † Sunday 14 October 2001 @ 23:28:22 #54
13819 Loedertje
Proud GILF.
pi_1901755
A short squatting manual for Amsterdam

--------------------------------------------------------------------------------


Translation of the short squatting manual of the squatting information centres (c.i.s.)

SHORT SQUATTING MANUAL

INTRODUCTION

This is the P.M.O., the short squatting manual of all the squatting centres in Amsterdam. It is a brief summary of parts of the 96/97 squatting manual, updated with more recent information. The squatting advice groups (Kraakspreekuur) are organised by neighbourhood. These groups aim to bring the neighbourhood's empty (unrented) houses to light, and to keep track of renovation and redevelopment in the city. The advice sessions also exist to explain to would-be squatters how to find houses to squat. After reading this manual you will surely have questions; you can go to these groups for more information. If you intend to squat and want to go ahead, you can shortlist some potential houses; the groups can then help you get more information about them. The groups can also help you mobilise people to be present at the moment of the squat. The advice session also decides who will break open the door and who will talk to the police. For the advice group to help you squat, you must show interest and do everything you can on your own; searching for empty houses, for example, matters. The groups are also useful after the squat, to help with the problems you will run into. In cases of vandalism, legalisation and possible evictions you can ask the groups for help and advice. If you squat without the groups' help, please first find out whether other people are interested in squatting the same house. The advice sessions are also a place to exchange experiences with other people, with whom you can think about other possible squats.

1) INFORMATION ON EMPTY HOUSES

The first thing to do is make a list of empty houses that meet your needs. This section gives some useful tips for finding out whether a house is really empty. The first check usually starts with a walk around the city; the best time is during the day, because at night all houses look empty. Write down the addresses and the day you picked them. After this first check you will have a number of houses that could be squatted. The most important information you need to obtain (ask the Kraakspreekuur) is:

1- is the house registered as vacant
2- if so, for how long
3- has the residence permit been withdrawn
4- if so, on what date

In Amsterdam there is the SWD (Stedelijke Woning Dienst), which can tell you how long the house has been vacant (this office cannot necessarily inform you about every house in the city; it operates only in some neighbourhoods, so check). The important information the SWD may give you concerns the house's rent-subsidy value: if it is below the limit of f-1100, the house falls in the social sector; if it is above that rent limit, it belongs to the private sector. If the SWD cannot give you the information, there are other ways to find out how long the house has been empty: look carefully through the windows, doors and letterboxes, and see whether there is a lot of post or there are curtains in the windows. Ring the doorbells and listen whether they work; take care not to draw attention, wedge a matchstick beside the door, and come back later to see whether it has fallen. Another option is to ask the neighbours; naturally you should not say that you want to squat the house, but find out who owns it and whether it is rented out. If the owner is a company, you can ask where it is based; you can get this information from the chamber of commerce. A last way to find out more is through the Bouw Woning Toezicht; there you can ask whether any building permit has been granted. Do this just before squatting the house, to avoid the owner being warned at the last moment. If you find this way that the house really is vacant, there are other things to know: who the owner is, which you can ask the Rijkkadaster (land registry) for 10 guilders. There you can also obtain other information (rents, building work, ...).

Try to find out everything you can about the house; this works to your advantage, since many squats have failed because of poor research. Below is a short summary of what you need to know and where to ask:

1) how long the house has been vacant (SWD and neighbours)
2) size of the house (Woningkarthotheek, Kadaster, neighbours)
3) rent (SWD, Huuradvieskomissie, Woningkarthotheek, neighbours)
4) type of dwelling (SWD and neighbours)
5) renovation, demolition, building permits (SWD, Wijkcentrum, Monumentenzorg, neighbours)
6) name of the owner (Kadaster, chamber of commerce, neighbours)
7) financial position of the owner (mortgage register)
8) type of owner (Wijkcentrum, SPOK, neighbours)

2) PREPARATIONS

Once the research phase is over and you have found something squattable, you can start preparing. We recommend that you first go to the squatting advice sessions (Kraakspreekuur): they have experience with the police, they have the necessary tools, and they can mobilise widely. You may also find someone else interested in the same house; you can work together if your interests coincide, and otherwise knowing about each other helps avoid conflicting squats. The groups (Kraakspreekuur) can also help you after the squat if you need it, so it is worth keeping in touch with them. What do you need if you want to squat? First of all the squatting set: a chair, a table and a bed, which serve to show that you live in the house. You also need a lock to replace the old one; the house may have several doors, in which case you will need several locks. If you decide to squat without the Kraakspreekuur's help, you will need certain tools such as crowbars, screwdrivers, screws, files, hammers, etc. to break open the door and replace the lock. If the neighbours help you squat the house, it is a good idea to arrange a meeting with them; generally, to keep relations good, the Kraakspreekuur could have arranged this.

3) THE SQUAT

Once you have prepared everything, the squat can begin. Arrange things in good time with the people helping you and be clear about all the information you have gathered; who will break open the door and who will talk to the police is usually decided beforehand. Before actually going in, make sure the house really is empty; then you can put your squatting set in place. At first, hide the break-in tools so that the police do not confiscate them. After the squat it is important that the police come to establish that you live in the house; if they do not come, you must call them. Replace the lock while you wait. When they arrive, see to it that no more than two officers come in; police officers often want to harass you under the pretext of helping you. They want to know who you are, how and when you got in, and how long the building had been vacant. Remember that you are not obliged to give your name; if you squatted with a Kraakspreekuur group, give their name (for example, the De Pijp squatting group). The group can tell the police how long the house had been vacant and give the owner's name.

4) WHAT CAN HAPPEN AFTER THE SQUAT?

It is impossible to predict what will happen after the squat; every squat has its own story. It is normal for the owner to be unhappy that the property has been squatted. Remember that nobody, the owner included, may enter the house without your consent once the police have established that you are living there; those are the legal rules. The police cannot enter without a warrant either; if they turn up with one, you have the right to be shown it. The owner may not take the law seriously and start acting on his own, for instance by putting up barricades. In that case turn to SPOK (Spekulatie Onderzoeks Kollektief). This collective is a squatters' group that researches and archives information on the Amsterdam housing market. It draws on experience with tenants and owners and has a lot of historical material on eviction orders and court cases. When you go to SPOK you immediately get information and explanations, for example on how to find the owner and what you can look up in the Dutch civil register, and on how to learn about the financial status of the house by looking at the deed of sale and uncovering any mortgages. The price of the house may also have changed before the squat. If the owner of your house is a company, foundation or firm, it is important to go to the chamber of commerce to find out what kind of entity it is. Also ask whether other companies are registered at the same address.

N.B. Below are three rules that we consider fundamental:

1) Squatting with the Kraakspreekuur is the best way

2) Keep in touch with them; remember that squatting means occupying a place to live, but collaboration can also lead to building other things (CONTROINFORMA)

3) If you want the neighbours' cooperation, always look carefully at whom you approach.

ADDRESSES of Kraakspreekuren in Amsterdam

"DE PIJP" Molli van Ostadestraat 55 huis, Mondays from 19.00 to 20.30.
"WEST" Ratjetoe, Frederik Hendrikstraat 111, Mondays from 20.00 to 21.00.
"CENTRUM" Vrankrijk, Spuistraat 216, Wednesdays from 20.00 to 21.00
"OOSTELIJKE BINNENSTAD" Kalenderpander, Entrepotdok 96, Mondays from 20.00 to 21.00
"SCHINKELBUURT" Binnenpret 1e Schinkelstraat 14-16, Tuesdays from 19.30 to 21.30

  † In Memoriam † Sunday 14 October 2001 @ 23:29:41 #55
13819 Loedertje
Proud GILF.
pi_1901770
quote:
On Sunday 14 October 2001 23:28, Cora wrote:
It has turned into quite a cosy topic after all, hasn't it?

Oh, and briefly back on topic:

[Peeeeeest] text [/Peeeeeest]


Yeppppp

The legal position of squatters vis-à-vis the government.

The government's powers against squatters in theory and in practice.


Thesis by Marcel Schuckink Kool, completing the degree in Dutch law at the Open Universiteit

Supervisor: Dick van Ekelenburg


The Hague, July 2001

This thesis is dedicated to those who are persecuted for their struggle for a more just, freer and happier society, and to those who support them in that struggle.

In the United States, Mumia Abu Jamal has been on death row since 1982, after a show trial, on a charge of murdering a police officer. Every piece of 'evidence' brought against him has been refuted by the defence. Before his arrest Mumia was a celebrated journalist who stood up for the Move organisation, which emerged from the Black Panther movement and faced extreme police violence in the 1970s. From his death-row cell, too, Mumia continues his struggle for a more just society. There is worldwide protest against his sentence. His last legal opportunity to challenge the sentence is coming up soon. On that day, called day X, everyone around the world is called on to protest at embassies or consulates of the United States.
http://www.mumia.nl and http://www.xs4all.nl/~tank/spg/mumia-nl/mumianl.htm

Carlo Giuliani was murdered by the police on 21 July 2001 during protests against the G7 conference in Genoa. For days the city centre had been sealed off to demonstrators, to shield the world leaders attending the conference from all demonstrations. During the resistance to the police terror on display there, Giuliani was shot by an officer and then run over several times by a police van. There are differing accounts of what preceded the officer's shot ...

Anarchist Black Cross is an organisation that stands up for political prisoners worldwide, among other things by writing to them and encouraging them in other ways, and by organising international campaigns to publicise their plight and to call for their release and better treatment.
http://www.chez.com/maloka/ABC/index.htm

Their struggle shows that in many situations the legal system is a farce and merely upholds the status quo. Nevertheless, I believe that legal means, too, must be used to the full in this struggle.

  † In Memoriam † Sunday 14 October 2001 @ 23:30:23 #56
13819 Loedertje
Proud GILF.
pi_1901794
quote:
On Sunday 14 October 2001 23:29, Loedertje wrote:
Yeppppp

[Mega snip]


Have you actually read all those texts yourself, Loedertje? I'm still working through them
:{
  † In Memoriam † Sunday 14 October 2001 @ 23:33:28 #58
13819 Loedertje
Proud GILF.
pi_1901808
The Spanish Revolution (1936)


--------------------------------------------------------------------------------
The role of anarchism in the Spanish Revolution or Spanish Civil War of 1936 is too often absent from histories of this struggle against fascism. Alongside the war millions of workers collectivised the land and took over industry to pursue their vision of a new society. This page tells their story and the story of those who fought alongside them.

--------------------------------------------------------------------------------
As of March 2000 this page was receiving 5000 visits a month

Contents
Introductions
Prelude to Revolution
The Revolution
Eyewitness accounts
Spain and the world
Organisations of the Spanish revolution
Collectives
Women
The May Days
Individual people
Original documents
Songs
Online books
Book Reviews
Interpretations
After the revolution
Introductions
Anarchism and the Spanish Civil War
an excellent overview of the role the anarchists played and their achievements
Glossary of terms
Chronology of the major events
The PDF file of this pamphlet
A summary of the achievements of the Spanish revolution
The People Armed and the People's Army : A film review of Land and Freedom
An overview of the military history of the civil war
The anti-fascist camp in the Spanish revolution
A New World In Their Hearts

Prelude to Revolution
Francisco Ferrer and the Modern School [1901]
The formation of the anarchist unions and the Tragic Week [1909]
Birth of the FAI - Edgar Rodrigues On The Origins Of The Iberian Anarchist Federation (FAI) [1926]
An unexpected dash through Spain, Emma Goldman on conditions under Rivera [1929] *
The Barcelona rent strike of [1931]
Prison letter from Durruti the year before the revolution about CNT tactics [1935]
Daniel Guerin on the anarchist tradition in Spain *
How were the anarchists able to obtain mass popular support in Spain?*
Miners' strikes in Asturies - 1890 to 1998
A history of the Spanish libertarian youth paper 'Ruta'



The international
'Anarchist Platform'

We invite you to look at the 'Anarchist Platform' points and if you agree with them to subscribe to this international anarchist mailing list


Read the 'Anarchist Platform'


The Revolution

Durruti's funeral in Barcelona was attended by 500,000 people

The first days of the Revolution
from El Acratador #54
The first two weeks
by Andrew Flood
The first two years
by the Friends of Durruti
The organisation of the anarchist militia
Anarchist rural Collectives
About the Iron Column
from Jose Peirats
Some quotes from 'Blood of Spain'
about the Spanish Revolution
Durruti's interview with Pierre van Paasen
Jack White's first Spanish impressions
What Spanish anarchism must do to win - Camillo Berneri - October 1936
A Study of the Revolution in Spain
by Stuart Christie
Why the CNT entered government
The Militias in the revolution


Eyewitness accounts and studies of the Revolution
The CNT as I Saw It
by Fenner Brockway
The Collectives in Aragon
by Gaston Leval
Collectives in Spain
by Gaston Leval & others
George Orwell
on the Spanish revolution.
First Spanish impressions
by Jack White
Camillo Berneri's writings from Spain
Interview on militarisation of the militias
War and Revolution
Open letter to comrade Federica Montseny
Madrid, sublime city
Beware, Dangerous Corner!
Counter Revolution on the March
The May Days in Barcelona, 1937
by Augustin Souchy
Emma Goldman
on Spain, 1937
Durruti Is Dead, Yet Living
By Emma Goldman
With the Peasants of Aragon
by Augustin Souchy
Towards a fresh revolution
by the Friends of Durruti
Anarchist rural Collectives
by Deirdre Hogan (also in Spanish as El triunfo de la libertad)
The Tragedy of Spain
Rudolf Rocker's
The Collectives in Revolutionary Spain
by Lucien
Stalin's Foreign Policy in the Spanish Civil War and the Barcelona Uprising of May, 1937
by Jason Wehling
About the Iron Column
from Jose Peirats
A day Mournful and Overcast...
by an "uncontrollable" from the Iron Column.
A soldier returns
letter from a US member of the Durruti Column
Durruti Is Dead, Yet Living
by Emma Goldman
Living Utopia
English transcript of Televisión Española documentary
CNT Newsreel stills
framegrabs from CNT newsreel shot during the Spanish Civil War
A Study of the Revolution in Spain
by Stuart Christie


The above photo is assembled from framegrabs of CNT newsreel of the Durruti Column, I believe this is the International section

--------------------------------------------------------------------------------

Anarchism and the Spanish revolution discussion list

For ongoing discussion about
Anarchism and the Spanish revolution


More details


PDF file for you to print out and distribute:
Anarchism in Action: The Spanish Civil War

--------------------------------------------------------------------------------
Organisations
Birth of the FAI
by Edgar Rodrigues
A study of the Iberian Anarchist Federation
review of Stuart Christie book
The CNT as I Saw It
by Fenner Brockway
Why the CNT entered government
The Friends of Durruti.
The revolutionary message of the 'Friends of Durruti' (BOOK- includes sections of their paper)
Introduction to Friends of Durruti
Their pamphlet Towards a fresh revolution
The Friends of Durruti - A Chronology by Paul Sharkey
Jaime Balius and the Friends of Durruti
Rebuttal of accusations of Marxism
Agustin Guillamon's history of the FOD
Guillamon's flawed history of FoD
another review of Guillamon
Mujeres Libres - anarchist women's organisation
Free Women of Spain
Text of a talk on the Mujeres Libres
The Mujeres Libres Anthem
The organisation of the anarchist militia
Review of Antony Beevor on Militias
About the Iron Column
from Jose Peirats
The Iron Column
quotes from A Study of the Revolution
A day Mournful and Overcast...
by an "uncontrollable" from the Iron Column.
Revolutionary War? A Contribution to the Debate about the Spanish Revolution
The role of the Spanish Communist Party
The Spanish libertarian youth paper 'Ruta'


The Collectives
The economy of the revolution in Spain
Kevin Doyle
Collectives in the Spanish Revolution
a WSM article
The Anarchist Collectives in the Countryside during the Spanish Civil War
by Deirdre Hogan
The Collectives in Revolutionary Spain
by Lucien
Anarchist rural Collectives
by Deirdre Hogan
Collectives in Spain
by Gaston Leval
With the Peasants of Aragon
by Augustin Souchy
The Collectives in Aragon
by Gaston Leval
Innovation in the collectives
from Sam Dolgoff
Collectivization in Catalonia
Augustin Souchy on collectivization in Barcelona during the Spanish Revolution
How were Spanish industrial collectives organized?
from the Anarchist FAQ
How were the Spanish agricultural cooperatives organized
from the Anarchist FAQ
Eyewitness quotes on
the collectives in the Spanish Revolution


--------------------------------------------------------------------------------


Spain and the world
Anarchist Revolution in Spain: a Victim of International Politics
What did non-intervention actually mean
Non - intervention and international involvement in the Spanish Civil War

Anarchist Women Militia in Spain
Women in the Revolution
Conditions for the vast majority of people in Spain in the 1920s and 1930s were appalling. For women they were especially bad. In the two years before the 1936 revolution, two groups of anarchist women in Barcelona and Madrid had begun organising

Mujeres Libres - anarchist women's organisation
Free Women of Spain
Women in The Spanish Revolution - S. African article
Women's liberation & the revolution
Women in the Spanish Revolution
A talk on the Mujeres Libres (Free Women)
The Mujeres Libres Anthem
Federica Montseny Mañé
CNT activist and Minister of Health in 1936

Anarchism and Women's liberation

The May Days in Barcelona (1937)
The May Days in Barcelona, 1937
by Augustin Souchy
The Friends of Durruti.
The revolutionary message of the 'Friends of Durruti' translated for this site by Chekov Feeney
A Spanish translation El Mensaje Revolucionario de "Los Amigos de Durruti"
Their pamphlet Towards a fresh revolution
Jaime Balius' rebuttal of accusations of Marxism
Pablo Ruiz, FoD member on the May days and the Friends of Durruti
Stalin's Foreign Policy in the Spanish Civil War and the Barcelona Uprising of May, 1937
by Jason Wehling
A brief biography of Camillo Berneri - Italian anarchist murdered by the Stalinists in Barcelona during the May Days
Luigi Camillo Berneri
Berneri's writings
Social democracy and communism betrays the revolution
Counter Revolution on the March
The USSR and the CNT: an unconscionable stance
Fragment on post-May opposition to collaboration
A soldier returns - a US member of the Durruti Column on the Stalinist terror
Review of non-anarchist writings on the May Days

People
Buenaventura Durruti
Durruti Is Dead, Yet Living by Emma Goldman
Buenaventura Durruti by Peter E Newell
Buenaventura Durruti by Correo@, Venezuela
a prison letter from Durruti about CNT tactics in 1935
Durruti's interview with Pierre van Paasen
Federica Montseny Mañé
CNT activist and Minister of Health in 1936
Francisco Ferrer
a photo of Ferrer
Emma Goldman on Francisco Ferrer
Ferrer was executed in the aftermath of the Tragic Week
Major General Miguel Garcia Vivancos
Captain Jack White
who formed the Irish Citizen Army and later fought with the CNT militia
Jack White's first Spanish impressions
The anarchist views of Captain Jack White
Jaime Balius - secretary of the 'Friends of Durruti'
Jaime Balius' rebuttal of accusations of Marxism
Camillo Berneri - Italian anarchist murdered by the Stalinists in Barcelona during the May Days
A brief biography of Camillo Berneri
Luigi Camillo Berneri
Berneri's writings
Francisco Barbieri - murdered alongside Berneri

Find out when new pages are added to this site

Subscribe to an announcement list that will tell you when I add new pages anywhere in the Revolt collection. There will be no more than 3 posts a month to this list.


How to subscribe


Documents
Libertarian Communism
Isaac Puente's
Towards a fresh revolution
Friends of Durruti
After the Revolution
by Diego Abad de Santillan
Interview with Buenaventura Durruti
by Pierre Van Paasen
Berneri's writings
articles published 1936 - May '37

Songs of the Revolution
Lyrics of 'Sons of the People' in English and 'Spanish' (ca)
Lyrics of 'A las Barricadas' in English and 'Spanish' (ca)
Lyrics of the Mujeres Libres anthem in English and 'Spanish' (ca)
Lyrics of Los moros que trajo Franco in English and 'Spanish' (ca)


On-line books/pamphlets on Spain
Anarchism in Action: The Spanish Civil War
by Eddie Conlon
The revolutionary message of the 'Friends of Durruti'
George Fontenis, preface by Daniel Guerin
translated for this site by Chekov Feeney
Daniel Guerin's history of the Spanish Civil War
Towards a fresh revolution
by the Friends of Durruti
A day Mournful and Overcast...
by an "uncontrollable" from the Iron Column.
A Study of the Revolution in Spain
by Stuart Christie
The Friends of Durruti - A Chronology
by Paul Sharkey
Book Reviews

The Spanish Civil War by Antony Beevor
The Spanish Anarchists - The heroic years; 1868 - 1936
The Friends of Durruti and Guillamon's 'history'
Agustin Guillamon's history of the FoD
Arms For Spain
The story of the Moscow gold: How the Spanish war was lost
Review of non-anarchist writings on the May Days
Recent Books on Spanish Anarchism - Reviewed by Jon Bekken
We, the Anarchists! A study of the Iberian Anarchist Federation (FAI) 1927-1937, by Stuart Christie.

Check out the Anarchist History page

pages of Anarchist History including the Russian, Spanish and Mexican revolutions, the Paris commune, historical texts and famous individual anarchists. Check out the anarchist history page and discussion list


The meaning of the Spanish Revolution for anarchism today
Does revolutionary Spain show that libertarian socialism can work in practice? from the Anarchist FAQ
The lessons of the Spanish Revolution
Murray Bookchin's essays To Remember Spain
Trotskyist Lies on Anarchism: a review of Felix Morrow's 'Revolution and Counter-Revolution in Spain'
The WSA (US section of the IWA) answers criticism of the CNT from a Trotskyist group.
Spain and its Relevance Today -- Part 1 and Part 2
Anarchists in the Spanish Revolution, by Sam Dolgoff

After the war: Anarchism undefeated
Forgotten heroes, the role Spanish anarchists played in fighting fascism in France in WWII
The resistance to Franco after the Civil War
This account of the making of Behold a Pale Horse includes several references to exiled Spanish anarchists
The Spanish libertarian youth paper 'Ruta' continued to be published in exile
Spanish anarchism and international revolutionary action, 1961-75


Images
CNT Newsreel stills
framegrabs from CNT newsreel shot during the Spanish Civil War
A page of texts and pictures from the Spanish revolution.
A collection of posters from the Spanish revolution.
The Southworth poster collection includes detailed explanations of the images

The Anarchist Movement in Spain today

Anarcho-syndicalism in Spain


Articles in English

The CNT since Franco by Andrew Rothman

Organisations on the web

The Confederación Nacional del Trabajo (CNT)
The Confederación General del Trabajo (CGT)
The Catalan CNT paper Solidaridad Obrera
El Kiosko Libertario
Bienvenida a FEEL
Centre Ascaso Durruti: A Center devoted to the life, times, and philosophy of Francisco Ascaso and Buenaventura Durruti.

Links to anarchists all over the world


Essays

If you have written an essay connected with anarchism and the Spanish revolution email it to me and I'll add it here

The CNT, anarchism and Spain: Challenging State and Class Power
Does Spanish anarchism prove that state and class control is fundamentally inessential?
Links to other SCW web pages
Dana Ward's Spanish Civil War page
Anti-Fascist Action No 15 has several articles on Spain
Eugene W. Plawiuk's excellent site on the Spanish Civil War.
Ireland and the Spanish Civil War


Write for this page
There are quite a lot of issues about anarchism and the Spanish Revolution not covered in this page. If you would like to write an article for it, here are some topics that I would like articles on:

the lyrics in English or Spanish of any revolutionary songs from that period
The anarchist militias in the early days of the war
The militias before, during and after militarization
What the 'anarchist' ministers did in power
A biography of Camillo Berneri [Done]
The role of anarchists from outside Spain in Spain
The role of anarchists from outside Spain in supporting the revolution in Spain
A biography of Francisco Ascaso
A biography of Garcia Oliver
A biography of Federica Montseny [Done]
Any other biographies
The takeover and running of industry in Barcelona
Or indeed any topic that you think is suitable, email me for advice on writing an article or with anything you think is suitable. Anyone writing a piece will be credited at the bottom of it if they wish.


PDF file of this text for you to print out and distribute:
Anarchism in Action: The Spanish Civil War

Part of the
International Anarchism
web pages


This page is part of the Struggle collection

pi_1901813
Jean Piaget (1896-1980)
Jean Piaget was born in Neuchâtel (Switzerland) on August 9, 1896. He died in Geneva on September 16, 1980. He was the oldest child of Arthur Piaget, professor of medieval literature at the University, and of Rebecca Jackson. At age 11, while he was a pupil at Neuchâtel Latin high school, he wrote a short notice on an albino sparrow. This short paper is generally considered the start of a brilliant scientific career comprising over sixty books and several hundred articles.
His interest in mollusks developed during his late adolescence to the point that he had become a well-known malacologist by the time he finished school. He published many papers in the field, which remained of interest to him throughout his life.

After high school graduation, he studied natural sciences at the University of Neuchâtel, where he obtained a Ph.D. During this period, he published two philosophical essays which he considered "adolescence work" but which were important for the general orientation of his thinking.

After a semester spent at the University of Zürich, where he developed an interest in psychoanalysis, he left Switzerland for France. He spent one year working at the Ecole de la rue de la Grange-aux-Belles, a boys' institution created by Alfred Binet and then directed by Théodore Simon, who had developed with Binet a test for the measurement of intelligence. There, he standardized Burt's test of intelligence and did his first experimental studies of the growing mind.

In 1921, he became director of studies at the J.-J. Rousseau Institute in Geneva at the request of Sir Ed. Claparède and P. Bovet.

In 1923, he and Valentine Châtenay were married. The couple had three children, Jacqueline, Lucienne and Laurent whose intellectual development from infancy to language was studied by Piaget.

Successively or simultaneously, Piaget occupied several chairs: psychology, sociology and history of science at Neuchâtel from 1925 to 1929; history of scientific thinking at Geneva from 1929 to 1939; the International Bureau of Education from 1929 to 1967; psychology and sociology at Lausanne from 1938 to 1951; sociology at Geneva from 1939 to 1952, then genetic and experimental psychology from 1940 to 1971. He was, reportedly, the only Swiss to be invited to the Sorbonne from 1952 to 1963. In 1955, he created the International Center for Genetic Epistemology, which he directed until his death.

His research in developmental psychology and genetic epistemology had a single goal: to explain how knowledge grows. His answer is that the growth of knowledge is a progressive construction of logically embedded structures superseding one another by a process of inclusion of lower, less powerful logical means into higher and more powerful ones up to adulthood. Therefore, children's logic and modes of thinking are initially entirely different from those of adults.

Piaget's oeuvre is known all over the world and is still an inspiration in fields like psychology, sociology, education, epistemology, economics and law as witnessed in the annual catalogues of the Jean Piaget Archives. He was awarded numerous prizes and honorary degrees all over the world.

pi_1901819
*hopa*


The XML FAQ
Editor: Peter Flynn (pflynn@ucc.ie)
Originally maintained on behalf of the World Wide Web Consortium's XML Special Interest Group
v. 2.01 (2001-06-19) Frequently Asked Questions about the Extensible Markup Language
Introduction
This is the list of Frequently-Asked Questions about the Extensible Markup Language. It is restricted to questions about XML: if you are seeking answers to questions about HTML, scripts, Java, databases, or penguins, you may find some pointers, but you should probably look elsewhere as well. It is intended as a first resource for users, developers, and the interested reader, and does not form part of the XML Specification.

Thanks
The following people have helped with contributions:

Terry Allen, Tom Borgman, Tim Bray, Robin Cover, Bob DuCharme, Christopher Maden, Eve Maler, Makoto Murata, Peter Murray-Rust, Liam Quin, Michael Sperberg-McQueen, Joel Weber

...plus many other members of the SIG as well as FAQ readers around the world. Please mail any corrections or additions to the editor. Sadly, the form for comments found at the end of previous versions has had to be discontinued due to abuse. Please post questions to the relevant mailing list or newsgroup, not to the editor.

Recent changes
2.0 June 2001 DTD changed from DocBook SGML to QAML XML; removed query form; most questions revised and in some cases rewritten; updated references to new versions of associated standards, recommendations, and working drafts; added pointer to Jon Noring's Unicode test page and NIST's XSLT/XPath test suite; updated Eve Maler's links to the DTD for the spec; added warnings on speling and punk chew asian; added question on namespaces; fixed bug in question on stylesheets; inserted explanation of `document' vs `data' software; added new mailing list on XSL:FO; updated Robin Cover's URL throughout; updated the question on media types for RFC 3023; extended question on graphics to cover SVG. For 2.01 there were minor typos, some updated links (to recent versions of the standards, and in the section on More Information), and a few wording changes. Thanks to James Cummings for a very thorough proofread.

History
Organisation
The FAQ is divided into four sections: General, User, Author, and Developer. The questions are numbered independently within each section. As the numbering may change with each version, comments and suggestions should refer to the version number (see above) as well as the Section and Question Number.

Please submit bug reports, suggestions for improvement, and other comments relating to this FAQ only to the maintainer at pflynn@ucc.ie. Comments about the XML Specification itself and related specifications should be directed to the W3C.

Availability
This is the first entirely XML version: it was delayed due to (human) difficulties about which DTD was most suitable. I finally picked QAML for its simplicity over DocBook, but it has meant a few changes in the internal subset (see the XML file) and a change in the content model for span to allow embedded links.

The XML master is at http://www.ucc.ie/xml/faq.xml. You can download it in text-mode as well;
The HTML version is at http://www.ucc.ie/xml/index.html;
A plaintext (ASCII) version is at http://www.ucc.ie/xml/faq.txt. A notification of the plaintext version is occasionally posted to comp.text.xml for the archives.
For printed copies there are versions for A4 PostScript, A4 PDF, Letter PostScript and Letter PDF configurations available. Viewers can be downloaded for PostScript and PDF formats.
WAP (if anyone's still using it), OEB (eBook) and cHTML versions are in development for your handheld devices.
The FAQ is also available in carbon-based toner on flattened dead trees by sending US$10 (or equivalent) to the editor (email first to check currency and postal address).
Translations (those I know about) are at:

Japanese: http://www.fxis.co.jp/DMS/sgml/cafe/library/etc/xmlfaq.html [Murata Makoto];
Spanish: http://slug.ctv.es/~olea/sgml-esp/xfaq15.html [Jaime Sagarduy];
Korean: http://xml.t2000.co.kr/faq/index.html [Kangchan Lee];
Chinese: http://zxd.webjump.com/xml.html [Neko] and http://weblab.crema.unimi.it/xmlzh/XML_FAQ.htm [Jiang Luqin];
French: http://www.gutenberg.eu.org/pub/GUTenberg/publications/HTML/FAQXML/faqxml-fr.html [Jacques André];
Czech: http://zvon.vscht.cz/ZvonHTML/Translations/xmlFAQ/front_all.html [Miloslav Nic];
You can download the XML logo as a GIF, JPG, or EPS file; and an icon for your file system in ICO (Microsoft Windows), Mac, or XPM (X Window system) format.
List of Questions

A.1. What is XML?
A.2. What is XML for?
A.3. What is SGML?
A.4. What is HTML?
A.5. Aren't XML, SGML, and HTML all the same thing?
A.6. Who is responsible for XML?
A.7. Why is XML such an important development?
A.8. Why not just carry on extending HTML?
A.9. Why do we need all this SGML stuff? Why not just use Word or Notes?
A.10. Where do I find more information about XML?
A.11. Where can I discuss implementation and development of XML?
A.12. What is the difference between XML and C or C++?

B.1. What do I have to do to use XML?
B.2. Why should I use XML instead of HTML?
B.3. Where can I get an XML browser?
B.4. Do I have to switch from SGML or HTML to XML?

C.1. Does XML replace HTML?
C.2. Do I have to know HTML or SGML before I learn XML?
C.3. What does an XML document look like inside?
C.4. How does XML handle white-space in my documents?
C.5. Which parts of an XML document are case-sensitive?
C.6. How can I make my existing HTML files work in XML?
C.7. Is there an XML version of HTML?
C.8. If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?
C.9. I'm used to authoring and serving HTML. Can I learn XML easily?
C.10. Can XML use non-Latin characters?
C.11. What's a Document Type Definition (DTD) and where do I get one?
C.12. How do I create my own DTD?
C.13. Does XML let me make up my own tags?
C.14. I keep hearing about alternatives to DTDs. What's a schema?
C.15. How do I upload or download XML to/from a database?
C.16. How will XML affect my document links?
C.17. Can I do mathematics using XML?
C.18. How does XML handle metadata?
C.19. Can I use Java, ActiveX, etc in XML files?
C.20. Can I use Java to create or manage XML files?
C.21. How do I execute or run an XML file?
C.22. How do I control appearance?
C.23. How do I use graphics in XML?

D.1. Where's the spec?
D.2. What are these terms DTDless, valid, and well-formed?
D.3. Which should I use in my DTD, attributes or elements?
D.4. What else has changed between SGML and XML?
D.5. What's a namespace?
D.6. What XML software can I use today?
D.7. Do I have to change any of my server software to work with XML?
D.8. Can I still use server-side inclusions?
D.9. Can I (and my authors) still use client-side inclusions?
D.10. I'm trying to understand the XML Spec: why does XML have such difficult terminology?
D.11. Is there a Developer's API kit for XML?
D.12. How does XML fit with the DOM?
D.13. Is there a conformance test suite for XML processors?
D.14. How do I include one DTD (or fragment) in another?
D.15. I've already got SGML DTDs: how do I convert them for use with XML?
D.16. What's the story on XML and EDI?

A. General questions
A.1 What is XML?
XML is the Extensible Markup Language. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.

It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is actually a `metalanguage' -- a language for describing other languages -- which lets you design your own customized markup languages for limitless different types of documents. XML can do this because it's written in SGML, the international standard metalanguage for text markup systems (ISO 8879).

A.2 What is XML for?
XML is intended `to make it easy and straightforward to use SGML on the Web: easy to define document types, easy to author and manage SGML-defined documents, and easy to transmit and share them across the Web.'

It defines `an extremely simple dialect of SGML which is completely described in the XML Specification. The goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML.'

`For this reason, XML has been designed for ease of implementation, and for interoperability with both SGML and HTML'

[Quotes are from the XML specification]. XML is not just for Web pages: it can be used to store any kind of structured information, and to enclose or encapsulate information in order to pass it between different computing systems which would otherwise be unable to communicate.

A.3 What is SGML?
SGML is the Standard Generalized Markup Language (ISO 8879:1985), the international standard for defining descriptions of the structure of different types of electronic document. There is an SGML FAQ at http://www.infosys.utas.edu.au/info/sgmlfaq.txt which is posted every month to the comp.text.sgml newsgroup, and the SGML Web pages are at http://xml.coverpages.org/.

SGML is very large, powerful, and complex. It has been in heavy industrial and commercial use for over a decade, and there is a significant body of expertise and software to go with it. XML is a lightweight cut-down version of SGML which keeps enough of its functionality to make it useful but removes all the optional features which make SGML too complex to program for in a Web environment.

ISO standards like SGML are governed by the International Organization for Standardization in Geneva, Switzerland, and voted into or out of existence by representatives from every country's national standards body.

If you have a query about an international standard, you should contact your national standards body for the name of your country's representative on the relevant ISO committee or working group.

If you have a query about your country's representation in Geneva or about the conduct of your national standards body, you should contact the relevant government department in your country, or speak to your public representative.

The representation of countries at the ISO is not a matter for this FAQ. Please do not submit queries to the editor about how or why your ISO representatives have or have not voted on a specific standard.

A.4 What is HTML?
HTML is the HyperText Markup Language (RFC 1866), a small application of SGML used on the Web.

It defines a very simple class of report-style documents, with section headings, paragraphs, lists, tables, and illustrations, with a few informational and presentational items, and some hypertext and multimedia. See the question on extending HTML. There is also an XML version of HTML.

A.5 Aren't XML, SGML, and HTML all the same thing?
Not quite; SGML is the mother tongue, and has been used for describing thousands of different document types in many fields of human activity, from transcriptions of ancient Irish manuscripts to the technical documentation for stealth bombers, and from patients' clinical records to musical notation. SGML is very large and complex, however, and probably overkill for most common applications.

XML is an abbreviated version of SGML, to make it easier for you to define your own document types, and to make it easier for programmers to write programs to handle them. It omits all the options, and most of the more complex and less-used parts of SGML in return for the benefits of being easier to write applications for, easier to understand, and more suited to delivery and interoperability over the Web. But it is still SGML, and XML files may still be processed in the same way as any other SGML file (see the question on XML software).

HTML is just one of the SGML or XML applications, the one most frequently used on the Web.

Technical readers may find it more useful to think of XML as being SGML-- rather than HTML++.

A.6 Who is responsible for XML?
XML is a project of the World Wide Web Consortium (W3C), and the development of the specification is being supervised by their XML Working Group. A Special Interest Group of co-opted contributors and experts from various fields contributed comments and reviews by email.

XML is a public format: it is not a proprietary development of any company. The v1.0 specification was accepted by the W3C as a Recommendation on Feb 10, 1998.

A.7 Why is XML such an important development?
It removes two constraints which were holding back Web developments:

dependence on a single, inflexible document type (HTML);
the complexity of full SGML, whose syntax allows many powerful but hard-to-program options.
XML simplifies the levels of optionality in SGML, and allows the development of user-defined document types on the Web.

A.8 Why not just carry on extending HTML?
HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information.

XML allows groups of people or organizations to create their own customized markup applications for exchanging information in their domain (music, chemistry, electronics, hill-walking, finance, surfing, petroleum geology, linguistics, cooking, knitting, stellar cartography, history, engineering, rabbit-keeping, mathematics, genealogy, etc).

HTML is at the limit of its usefulness as a way of describing information, and while it will continue to play an important role for the content it currently represents, many new applications require a more robust and flexible infrastructure.

A.9 Why do we need all this SGML stuff? Why not just use Word or Notes?
Information on a network which connects many different types of computer has to be usable on all of them. Public information cannot afford to be restricted to one make or model or manufacturer, or to cede control of its data format to private hands. It is also helpful for such information to be in a form that can be reused in many different ways, as this can minimize wasted time and effort. Proprietary data formats, no matter how well documented or publicized, are simply not an option: their control still resides in private hands and they can be changed or withdrawn arbitrarily without notice.

SGML is the international standard for defining this kind of application, but those who need an alternative based on different software for other purposes are entirely free to implement similar services using such a system, especially if they are for private use.

A.10 Where do I find more information about XML?
Online, there's the XML Specification and ancillary documentation available from the W3C; Robin Cover's SGML/XML Web pages with an extensive list of online reference material and links to software; and a summary and condensed FAQ from Tim Bray.

The items listed below are the ones I have been told about. Please mail me if you come across others.

An annual XML Conference is run by the Graphic Communications Association. XML 2001 is in Orlando, Florida, on December 9-14. See the GCA's Web site for details.
The Extreme Markup Languages 2001 conference takes place on 12-17 August at Le Centre Sheraton, Montréal, Canada.
The annual XML Summer School takes place in Oxford on 20-25 July 2001.
There are many other XML events around the world: most of them announced on the mailing lists and newsgroups.

There are lists of books, articles, and software for XML in Robin Cover's SGML and XML Web pages. That site should always be your first port of call: please look there first before asking about software or documentation.

A.11 Where can I discuss implementation and development of XML?
The two principal online media are the Usenet newsgroups and the mailing lists. The newsgroups are comp.text.xml and to a certain extent comp.text.sgml. Ask your Internet Provider how to access these, or use a Web interface like Google.

The general-purpose mailing list for public discussion is XML-L: to subscribe, visit the Web site and click on the link to join. You can also access the XML-L archives from the same URL.
For those developing components for XML there is an xml-dev mailing list. You can subscribe by sending a 1-line mail message to xml-dev-request@lists.xml.org saying just SUBSCRIBE. The xml-dev archives are at OASIS http://lists.xml.org/archives/xml-dev/. Note that this list is for those people actively involved in developing resources for XML. It is not for general information about XML (see this FAQ and other sources) or for general discussion about XML implementation and resources (see below).
There is a list for discussing XSL, the stylesheet language: XSL-List. For details of how to subscribe, see http://www.mulberrytech.com/xsl/xsl-list.
Andrew Watt writes that there is a mailing list specifically for XSL-FO only, on eGroups.com. You can subscribe by sending a message to XSL-FO-subscribe@egroups.com.
When you join a mailing list you will be sent details of how to use it. Please Read The Fine Documentation because it contains important information, particularly about what to do if your company or ISP changes your email address.

Please note that there is a lot of inaccurate and misleading information published in print and on the Web about subscribing to mailing lists. Don't guess: read the documentation.

Mailing lists in other languages
Gianni Rubagotti writes: A new Italian mailing list about XML is born: to subscribe, send a mail message without a subject line but with text saying subscribe XML-IT to majordomo@ananas.usr.dsi.unimi.it. Everyone, Italian or not, who wants to debate about XML in our tongue is welcome.
JP Theberge writes: A French mailing list about XML has been created. To subscribe, send subscribe to xml-request@trisome.com.
Jarno Elovirta writes: a Finnish mailing list about XML has been set up. To subscribe, send an email to majordomo@evitech.fi with subscribe XML-Fin in the message body. The list is also hypermailed for online reference at http://users.evitech.fi/lists/xml-fin/.

A.12 What is the difference between XML and C or C++?
C and C++ (and other languages like FORTRAN, or Pascal, or BASIC, or Java or dozens more) are programming languages with which you specify calculations, actions, and decisions to be carried out in order:

mod curconfig[if left(date,6) = "01-Apr", t.put "April Fool!",
f.put days('31102001','DDMMYYYY')-days(sdate,'DDMMYYYY')
" shopping days to Samhain"];
XML is a markup specification language with which you can design ways of describing information (text or data), usually for storage, transmission, or processing by a program: it says nothing about what you should do with the data (although your choice of element names may hint at what they are for):

<part num="DA42" models="LS AR DF HG KJ" update="2001-11-22">
<name>Camshaft end bearing retention circlip</name>
<image drawing="RR98-dh37" type="SVG" x="476" y="226"/>
<maker id="RQ778">Ringtown Fasteners Ltd</maker>
<notes>Angle-nosed insertion tool <tool id="GH25"/> is
required for the removal and replacement of this item.</notes>
</part>
On its own, an SGML or XML file (and HTML) doesn't do anything. It's a data format which just sits there until you run a program which does something with it. See also the question about how to run or execute XML files.
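
To illustrate that last point, here is a minimal sketch in Python (my choice of language, not the FAQ's) using the standard xml.etree.ElementTree module. The part record is the one shown above; the function name describe_part and the decision about which fields to report are hypothetical. The XML itself says nothing about what to do with the data: that choice lives entirely in the program.

```python
import xml.etree.ElementTree as ET

PART_XML = """<part num="DA42" models="LS AR DF HG KJ" update="2001-11-22">
<name>Camshaft end bearing retention circlip</name>
<image drawing="RR98-dh37" type="SVG" x="476" y="226"/>
<maker id="RQ778">Ringtown Fasteners Ltd</maker>
<notes>Angle-nosed insertion tool <tool id="GH25"/> is
required for the removal and replacement of this item.</notes>
</part>"""


def describe_part(xml_text):
    """Return a one-line summary; what to extract is the program's decision."""
    part = ET.fromstring(xml_text)
    models = part.get("models").split()  # space-separated list in one attribute
    return "{}: {} (by {}, fits {} models)".format(
        part.get("num"),
        part.findtext("name"),
        part.findtext("maker"),
        len(models),
    )


print(describe_part(PART_XML))
```

A different program could take the same file and draw the image, order the tool, or ignore everything except the update date.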

B. Existing users of SGML (including HTML: everyone who browses the Web)
B.1 What do I have to do to use XML?
For the average user of the Web, nothing except use a browser which works with XML (see the question about browsers). Remember some XML components are still being implemented, so some features are still either undefined or have yet to be written. Don't expect everything to work yet!

You can use XML browsers to look at some of the stable XML material, such as Jon Bosak's Shakespeare plays and the molecular experiments of the Chemical Markup Language (CML). There are some more example sources listed at http://xml.coverpages.org/xml.html#examples, and you will find XML (particularly in the disguise of XHTML) being introduced in places where it won't break older browsers.

If you want to start preparations for creating your own XML files, see the questions in the Authors' Section and the Developers' Section.

B.2 Why should I use XML instead of HTML?
Authors and providers can design their own document types using XML, instead of being stuck with HTML. Document types can be explicitly tailored to an audience, so the cumbersome fudging that has to take place with HTML can become a thing of the past: authors and designers are free to invent their own markup elements;
Information content can be richer and easier to use, because the descriptive and hypertext linking abilities of XML are much greater than those of HTML.
XML can provide more and better facilities for browser presentation and performance, using CSS and XSL stylesheets;
It removes many of the underlying complexities of SGML in favor of a more flexible model, so writing programs to handle XML is much easier than doing the same for full SGML.
Information will be more accessible and reusable, because the more flexible markup of XML can be used by any XML software instead of being restricted to specific manufacturers as has become the case with HTML.
Valid XML files are kosher SGML, so they can be used outside the Web as well, in existing SGML environments.

B.3 Where can I get an XML browser?
Remember the XML specification is still relatively new, so a lot of what you see now is experimental, and because the potential number of different XML applications is unlimited, no single browser can be expected to handle 100% of everything.

Some of the generic parts of XML (eg parsing, tree management, searching, formatting, etc) are being combined into general-purpose libraries or toolkits to make it easier for developers to take a consistent line when writing XML applications. Such applications can then be customized by adding semantics for specific markets, or using languages like Java to develop plugins for generic browsers and have the specialist modules delivered transparently over the Web.

MSIE5.5 handles XML but currently still renders it via the HTML model. Microsoft were also the architects of a hybrid (invalid) solution (islands) in which you could embed fragments of XML in HTML files because current HTML-only browsers simply ignored element markup which they didn't recognize, but this has now been superseded by XHTML. MSIE includes an implementation of an obsolete draft of XSLT (WD-xsl): you need to upgrade it and replace the parser (see http://www.netcrucible.com/ for details).
The publicly-released Netscape code (Mozilla) and the almost indistinguishable Netscape 6 (there is no v5) have XML/CSS support, based on James Clark's expat XML parser, and this seems to be more robust, if less slick, than MSIE. Mozilla 0.9 is reported to have some XSLT capability.
The authors of the former MultiDoc Pro SGML browser, CITEC, joined forces with Mozilla to produce a multi-everything browser called DocZilla, which reads HTML, XML, and SGML, with XSL and CSS stylesheets. This runs under NT and Linux and is currently still in the alpha stage. See http://www.doczilla.com for details. This is by far the most ambitious browser project, and is backed by solid SGML expertise, but seems to be rather a long time coming.
Opera now supports XML and CSS on MS-Windows and Linux and is the most complete implementation so far. The browser size is tiny by comparison with the others, but features are good and the speed is excellent, although the earlier slavish insistence on mimicking everything Netscape did, especially the bugs, still shows through in places.
See also the notes on software for authors and developers, and the more detailed list on the XML pages in the SGML Web site at http://xml.coverpages.org/.

B.4 Do I have to switch from SGML or HTML to XML?
No, existing SGML and HTML applications software will continue to work with existing files. But as with any enhanced facility, if you want to view or download and use XML files, you will need to use XML-aware software. There is much more being developed for XML than there ever was for SGML, so a lot of users are moving.

C. Authors of SGML (including writers of HTML: Web page owners)
C.1 Does XML replace HTML?
No. XML itself does not replace HTML: instead, it provides an alternative which allows you to define your own set of markup elements. HTML is expected to remain in common use for some time to come, and a Document Type Definition for HTML is available in XML syntax as well as in original SGML. XML is designed to make the writing of DTDs much simpler than with full SGML. (See the question on DTDs for what one is and why you might want one.)

C.2 Do I have to know HTML or SGML before I learn XML?
No, although it's useful because a lot of XML terminology and practice derives from 15 years' experience of SGML.

Be aware that `knowing HTML' is not the same as `understanding SGML' . Although HTML was written as an SGML application, browsers ignore most of it (which is why so many useful things don't work), so just because something is done a certain way in HTML browsers does not mean it's correct, least of all in XML.

C.3 What does an XML document look like inside?
The basic structure is very similar to most other applications of SGML, including HTML. XML documents can be very simple, with no document type declaration (and so no DTD), and straightforward nested markup of your own design:

<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Stop the planet, I want to get off!</response>
</conversation>
Or they can be more complicated, with a DTD specified (see the question on document types), and maybe an internal subset (local DTD changes in [square brackets]), and a more complex structure:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage SYSTEM "http://www.foo.bar/dtds/typo.dtd"
[<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
<white-space type="vertical" amount="36"/>
<title font="Baskerville" size="24/30"
alignment="centered">Hello, world!</title>
<white-space type="vertical" amount="12"/>
<!-- In some copies the following decoration is
hand-colored, presumably by the author -->
<image location="http://www.foo.bar/fleuron.eps"
type="URL" alignment="centered"/>
<white-space type="vertical" amount="24"/>
<author font="Baskerville" size="18/22"
style="italic">Vitam capias</author>
<white-space type="vertical" class="filler"/>
</titlepage>
Or they can be anywhere between: a lot will depend on how you want to define your document type (or whose you use) and what it will be used for.
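
As a quick check that the simple DTDless example really does stand on its own, the following sketch feeds it to the XML parser in Python's standard library (xml.etree.ElementTree, which checks well-formedness but does not validate against a DTD). The helper name is_well_formed is hypothetical, and Python is my choice of language for illustration.

```python
import xml.etree.ElementTree as ET

SIMPLE_DOC = """<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Stop the planet, I want to get off!</response>
</conversation>"""


def is_well_formed(xml_text):
    """True if the parser accepts the text; no DTD is consulted."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(SIMPLE_DOC))
# Dropping an end-tag breaks the nesting, so the parser rejects the file.
print(is_well_formed(SIMPLE_DOC.replace("</greeting>", "")))
```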

C.4 How does XML handle white-space in my documents?
The SGML rules regarding white-space have been changed for XML. All white-space, including linebreaks, TAB characters, and regular spaces, even between those elements where no text can ever appear, is passed by the parser unchanged to the application (browser, formatter, viewer, converter, etc), identifying the context in which the white-space was found (element content, data content, or mixed content). This means it is the application's responsibility to decide what to do with such space, not the parser's:

insignificant white-space between structural elements (space which occurs where only element content is allowed, ie between other elements, where text data never occurs) will get passed to the application (in SGML this white-space gets suppressed, which is why you can put all that extra space in HTML documents and not worry about it. This is not so in XML);
significant white-space (space which occurs within elements which can contain text and markup mixed together, usually mixed content or PCDATA) will still get passed to the application exactly as under SGML. It is the application's responsibility to handle it correctly.
<chapter>
  <title>
    My title for Section
    1.
  </title>
  <p>
    text
  </p>
</chapter>
The parser must inform the application that white-space has occurred in element content, if it can detect it. (Users of SGML will recognize that this information is not in the ESIS, but it is in the grove.) In the above example, the application will receive all the pretty-printing linebreaks, TABs, and spaces between the elements as well as those embedded in the chapter title. It is the function of the application, not the parser, to decide which type of white-space to discard and which to retain.
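
The effect can be seen with any XML parser. Here is a small sketch using Python's standard xml.etree.ElementTree (an illustrative choice of mine; note that this particular API hands over the white-space but does not report which kind of content it was found in). The function normalized is a hypothetical example of one policy an application might adopt.

```python
import xml.etree.ElementTree as ET

# A pretty-printed chapter, written with explicit \n so the spacing is visible.
CHAPTER = (
    "<chapter>\n"
    "  <title>\n"
    "    My title for Section\n"
    "    1.\n"
    "  </title>\n"
    "  <p>\n"
    "    text\n"
    "  </p>\n"
    "</chapter>"
)

root = ET.fromstring(CHAPTER)
title = root.find("title")

# White-space between <chapter> and <title> (element content) is passed on.
print(repr(root.text))
# So are the linebreaks and indentation inside the title itself.
print(repr(title.text))


def normalized(text):
    """One possible application policy: collapse every run of white-space."""
    return " ".join(text.split())


print(normalized(title.text))
```

Whether to collapse, trim, or preserve the space is entirely up to the application; the parser has simply delivered what was in the file.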

C.5 Which parts of an XML document are case-sensitive?
All of it, both markup and text. This is significantly different from HTML and most other SGML applications. It was done to allow markup in non-Latin-alphabet languages and to obviate problems with case-folding in scripts which are caseless.

Element type names are case-sensitive: you must stick with whatever combination of upper- or lower-case you use to define them (either by first usage or in a DTD). So you can't say <BODY>...</body>: upper- and lower-case must match; thus <IMG/> and <img/> are two different element types;
For well-formed files with no DTD, the first occurrence of an element type name defines the casing;
Attribute names are also case-sensitive, on a per-element basis: for example <PIC width="7in"/> and <PIC WIDTH="6in"/> in the same file exhibit two separate attributes, because the different casings of width and WIDTH distinguish them;
Attribute values are also case-sensitive. CDATA values (eg HRef="MyFile.SGML") always have been, but ID and IDREF attributes are now case-sensitive as well;
All entity names (&Aacute;), and your data content (text), are case-sensitive as always.
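
These rules are easy to try out. The sketch below uses Python's standard xml.etree.ElementTree (my choice for illustration); the helper name parses and the little test documents are made up for the purpose.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """True if the text is accepted by the parser as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The start-tag and end-tag must match in case.
print(parses("<BODY>text</body>"))
print(parses("<body>text</body>"))

# <IMG/> and <img/> are reported as two different element types.
doc = ET.fromstring("<doc><IMG/><img/></doc>")
print([child.tag for child in doc])

# width and WIDTH are separate attributes, so both may even appear together.
pic = ET.fromstring('<PIC width="7in" WIDTH="6in"/>')
print(sorted(pic.attrib.items()))
```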

C.6 How can I make my existing HTML files work in XML?
Either convert them to conform to some new document type (with or without a DTD) and write a stylesheet to go with them; or edit them to conform to XHTML.

It is necessary to convert existing HTML files because XML does not permit end-tag minimization (missing </p>, etc), unquoted attribute values, and a number of other shortcuts which are normal in most HTML DTDs. However, many HTML authoring tools already produce almost (but not quite) well-formed XML. As a preparation for XML, the W3C's HTML Tidy program can clean up some of the formatting mess left behind by inadequate HTML editors, and even separate out some of the formatting to a stylesheet, but there is usually still some hand-editing to do.

Converting to a new document type
If you want to move your files out of HTML into some other DTD entirely, there are already many native XML application DTDs, and several XML versions of popular SGML DTDs like TEI and DocBook to choose from. There is a pilot site run by CommerceNet (http://www.xmlx.com/) for the exchange of XML DTDs.

Alternatively you could just make up your own markup: so long as it makes sense and you create a well-formed file, you should be able to write a CSS or XSLT stylesheet and have your document displayed in a browser.

Converting valid HTML to XHTML
If your HTML files are valid (full formal validation with an SGML parser, not just a simple syntax check), then try validating them as XHTML. If you have been creating clean HTML without embedded formatting then this process should throw up only mismatches in upper/lowercase element and attribute names, and empty elements (plus perhaps the odd non-standard element type name if you use them). Simple hand-editing or a short script should be enough to fix these changes.

If your HTML validly uses end-tag omission, this can be fixed automatically by a normalizer program like sgmlnorm (part of SP) or by the sgml-normalize function in an editor like Emacs/psgml (don't be put off by the names, they both do XML).

If you have a lot of valid HTML files, you could write a script to do this in a programming language which understands SGML/XML markup (such as Omnimark, Balise, SGMLC, or a system using one of the SGML libraries for Perl, Python, or Tcl), or you could even use editor macros if you know what you're doing.


Converting invalid HTML to well-formed XHTML
If your files are invalid HTML (95% of the Web) they can be converted to well-formed DTDless files as follows:

replace the DOCTYPE Declaration with the XML Declaration <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> (the pseudo-attributes must appear in this order: version, encoding, standalone). If there was no DOCTYPE Declaration, just prepend the XML Declaration.
change any EMPTY elements (eg every <ISINDEX>, <BASE>, <META>, <LINK>, <NEXTID> and <RANGE> in the header, and every <IMG>, <BR>, <HR>, <FRAME>, <WBR>, <BASEFONT>, <SPACER>, <AUDIOSCOPE>, <AREA>, <PARAM>, <KEYGEN>, <COL>, <LIMITTEXT>, <SPOT>, <TAB>, <OVER>, <RIGHT>, <LEFT>, <CHOOSE>, <ATOP>, and <OF> in the body of the document) so that they end with /> instead, for example <img src="mypic.gif" alt="Picture"/>;
make all element names and attribute names lowercase;
ensure there are correctly-matched explicit end-tags for all non-empty elements; eg every <p> must have a </p>, etc;
escape all < and & non-markup (ie literal text) characters as &lt; and &amp; respectively (there shouldn't be any isolated < characters to start with);
ensure all attribute values are in quotes.
Be aware that many HTML browsers may not accept XML-style EMPTY elements with the trailing slash, so the above changes may not be backwards-compatible. An alternative is to add a dummy end-tag to all EMPTY elements, so <IMG src="foo.gif"/> becomes <img src="foo.gif"></img>. This is still valid XML provided you guarantee never to put any text content in such elements. Adding a space before the slash (eg <img src="foo.gif" />) may also fool older browsers into accepting XHTML as HTML.
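
As a rough illustration of the mechanical steps listed above (lowercase names, quoted attribute values, escaped literal text, and the trailing-slash form for EMPTY elements), here is a minimal Python sketch built on the standard library's html.parser. The names XmlEmitter, html_to_xml_syntax and EMPTY are made up for this example, and the EMPTY set covers only a few of the element types listed. The sketch drops comments and the DOCTYPE, and it does not supply missing end-tags or repair bad nesting, so treat it as a starting point rather than a converter.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Assumption: a small subset of the EMPTY element types listed above.
EMPTY = {"base", "meta", "link", "img", "br", "hr", "area", "param", "col"}


class XmlEmitter(HTMLParser):
    """Re-emit the parser's events using XML syntax rules."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already lowercased element and attribute names.
        parts = [tag]
        for name, value in attrs:
            # A valueless attribute (eg "compact") gets its own name as value.
            parts.append(name + "=" + quoteattr(name if value is None else value))
        closer = " />" if tag in EMPTY else ">"
        self.out.append("<" + " ".join(parts) + closer)

    def handle_endtag(self, tag):
        # EMPTY elements were already closed by the trailing slash.
        if tag not in EMPTY:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        # Escape literal & and < (and >) in text content.
        self.out.append(escape(data))


def html_to_xml_syntax(html):
    emitter = XmlEmitter()
    emitter.feed(html)
    emitter.close()
    return "".join(emitter.out)


sample = '<P CLASS=intro>Fish & chips<BR><IMG SRC=pic.gif ALT="A pic"></P>'
converted = html_to_xml_syntax(sample)
print(converted)
ET.fromstring(converted)  # raises ParseError if the result is not well-formed
```

Running the result through an XML parser, as in the last line, is a cheap way to find the files that still need hand-editing.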

If your HTML files fall into this category (HTML created by some WYSIWYG editors is frequently invalid) then they will almost certainly have to be converted manually, although if the deformities are regular and carefully constructed, the files may actually be almost well-formed, and you could write a program or script to do as described above. The oddities you may need to check for include:

do the files contain markup syntax errors? For example, are there any missing angle-brackets, backslashes instead of forward slashes on end-tags, or elements which nest incorrectly (eg <B>an element starting <I>inside another</B> but ending outside</I>)?
are there any URLs (eg in hrefs or srcs) which use backslashes instead of forward slashes?
do the files contain markup which conflicts with HTML DTDs, such as headings or lists inside paragraphs, list items outside list environments, header elements like <base> preceding the first <html>, etc?
do the files use imaginary elements which are not in any known HTML DTD? (large amounts of these are used in proprietary markup systems masquerading as HTML). Although this is easy to transform to a DTDless well-formed file (because you don't have to define elements in advance) most proprietary or browser-specific extensions have never been formally defined, so it is often impossible to work out meaningfully where the element types can be used.
Are there any non-ISO Latin-1 (8859-1) characters or wrongly-coded characters in your files? Look especially for native Apple Mac characters left by careless designers, or any of the illegal characters (the 32 characters at decimal codes 128-159 inclusive) inserted by MS-Windows editors. These need to be converted to the correct characters in ISO 8859-1 or the relevant plane of Unicode (and the XML Declaration should show iso-8859-1 encoding unless you specifically know otherwise).
Do your files contain malformed (Mosaic/Netscape-style) comments? Comments must look <!-- like this --> with double-dashes each end and no double dashes in between (safest: no multiple dashes in between).
If you answer Yes to any of these, you can save yourself a lot of grief by fixing those problems first before doing anything else. You will likely then be getting close to having well-formed files.

Markup which is syntactically correct but semantically meaningless or void should be edited out before conversion. Examples are spacing devices such as repeated empty paragraphs or linebreaks, empty tables, invisible spacing GIFs etc: XML uses stylesheets, so you won't need any of these.

Unfortunately there is rather a lot of work to do if your files are invalid: this is why many professional Webmasters will always insist that only valid or well-formed files are used (and why you should instruct designers to do the same), in order to avoid unnecessary manual maintenance and conversion costs later.


C.7 Is there an XML version of HTML?
The W3C has released XHTML as `a reformulation of HTML 4 in XML 1.0'. This specification defines HTML as an XML application, and provides three DTDs corresponding to the ones defined by HTML 4.0. The semantics of the elements and their attributes are as defined in the W3C Recommendation for HTML 4.0. These semantics provide the foundation for future extensibility of XHTML. Compatibility with existing HTML user agents is possible by following a small set of guidelines.


C.8 If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?
Yes, provided you use up-to-date SGML software which knows about the WebSGML Adaptations to ISO 8879 (the features needed to support XML, such as the variant form for EMPTY elements; some aspects of the SGML Declaration such as NAMECASE GENERAL NO; multiple attribute token list declarations, etc).

An alternative is to use an SGML DTD to let you create a fully-normalised SGML file, but one which does not use empty elements; and then remove the DocType Declaration so it becomes a well-formed DTDless XML file.

Most SGML tools now handle XML files well, and provide an option to switch between the two standards (see the pointers in the question on software).


C.9 I'm used to authoring and serving HTML. Can I learn XML easily?
Yes, very easily, but at the moment there is still a need for tutorials, simpler tools, and more examples of XML documents. Well-formed XML documents may look similar to HTML except for some small but very important points of syntax.

The big practical difference is that XML has to stick to the rules. HTML browsers let you serve them broken or corrupt HTML because they don't do a formal parse but elide all the broken bits instead. With XML your files have to be correct or they simply won't work at all. One outstanding problem is that some browsers claiming XML conformance are also broken. Try yours on the test file at http://www.ucc.ie/test.xml.


C.10 Can XML use non-Latin characters?
Yes, the XML Specification explicitly says XML uses ISO 10646, the international standard 31-bit character repertoire which covers most human (and some non-human) languages. This is currently congruent with Unicode and is planned to be a superset of Unicode.

The spec says (2.2): `All XML processors must accept the UTF-8 and UTF-16 encodings of ISO 10646...'. UTF-8 is an encoding of Unicode into 8-bit characters: the first 128 are the same as ASCII, the rest are used to encode the rest of Unicode into sequences of between 2 and 6 bytes. UTF-8 in its single-octet form is therefore the same as ISO 646 IRV (ASCII), so you can continue to use ASCII for English or other unaccented languages using the Latin alphabet. Note that UTF-8 is incompatible with ISO 8859-1 (ISO Latin-1) after code point 127 decimal (the end of ASCII). UTF-16 is an encoding of Unicode into 16-bit characters, which can also represent the 16 supplementary planes by using pairs of 16-bit values (surrogates). UTF-16 is incompatible with ASCII because it uses at least two 8-bit bytes per character.
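
To make the byte-level differences concrete, here is a small Python sketch that encodes the same text three ways. The helper name encode_all and the sample words are just for illustration.

```python
def encode_all(text):
    """Return the bytes produced for text by three encodings."""
    names = ("iso-8859-1", "utf-8", "utf-16-be")
    return {name: text.encode(name) for name in names}


plain = encode_all("cafe")         # pure ASCII
accented = encode_all("caf\u00e9")  # e-acute is U+00E9, outside ASCII

# Pure ASCII text is byte-for-byte identical in UTF-8.
print(plain["utf-8"] == "cafe".encode("ascii"))

# Above ASCII, ISO 8859-1 uses one byte where UTF-8 uses a multi-byte sequence.
print(accented["iso-8859-1"], accented["utf-8"])

# UTF-16 spends two bytes even on ASCII characters.
print(len(plain["utf-16-be"]))
```

This is why a file saved as ISO 8859-1 but declared (or defaulted) as UTF-8 breaks at the first accented letter.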

`...the mechanisms for signalling which of the two are in use, and for bringing other encodings into play, are [...] in the discussion of character encodings.' The XML Specification explains how to specify in your XML file which coded character set you are using.

Use of UCS-4 can only legally be specified in SGML or XML when the WebSGML Adaptations to ISO 8879 are implemented: this enables numbers longer than eight digits to be used in the SGML Declaration.

`Regardless of the specific encoding used, any character in the ISO 10646 character set may be referred to by the decimal or hexadecimal equivalent of its bit string': so no matter which character set you personally use, you can still refer to specific individual characters from elsewhere in the encoded repertoire by using &#dddd; (decimal character code) or &#xHHHH; (hexadecimal character code; the x must be lowercase, but the hex digits may be in either case). The terminology can get confusing, as can the numbers: see the ISO 10646 Concept Dictionary. Rick Jelliffe has XML-ized the ISO character entity sets. Mike Brown's encoding information at http://skew.org/xml/tutorial/ is a very useful explanation of the need for correct encoding. There is an excellent online database of glyphs and characters in many encodings from the Estonian Language Institute server at http://www.eki.ee/letter/.
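
A quick way to see numeric character references at work is to parse a file declared as plain ASCII that still refers to a non-ASCII character. The sketch below uses Python's standard xml.etree.ElementTree; the element name w is arbitrary.

```python
import xml.etree.ElementTree as ET

# The document is declared as US-ASCII, yet it can still name U+00E9
# by its decimal (233) or hexadecimal (E9) code.
doc = (
    b'<?xml version="1.0" encoding="us-ascii"?>'
    b"<w>caf&#233; caf&#xE9; caf&#xe9;</w>"
)

root = ET.fromstring(doc)
words = root.text.split()
print(words)
print(len(set(words)))  # all three spellings resolve to the same string
```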


C.11 What's a Document Type Definition (DTD) and where do I get one?
A DTD is a formal description in XML Declaration Syntax of a particular type of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together. For example, if you want a document type to be able to describe Lists which contain Items, the relevant part of your DTD might contain something like this:

<!ELEMENT List (Item)+>
<!ELEMENT Item (#PCDATA)>
This defines a list as an element type containing one or more items (that's the plus sign); and it defines items as element types containing just plain text (Parsed Character Data or PCDATA). Validating parsers read the DTD before they read your document so that they can identify where every element type ought to come and how each relates to the other, so that applications which need to know this in advance (most editors, search engines, navigators, databases) can set themselves up correctly. The example above lets you create lists like:

<List><Item>Chocolate</Item><Item>Music</Item><Item>Surfing</Item></List>
How the list appears in print or on the screen depends on your stylesheet: you do not normally put anything in the XML to control formatting like you had to do with HTML before stylesheets. This way you can change style easily without ever having to edit the document itself.

A DTD provides applications with advance notice of what names and structures can be used in a particular document type. Using a DTD when editing files means you can be certain that all documents which belong to a particular type will be constructed and named in a consistent and conformant manner. DTDs are less important for processing documents already known to be well-formed, but they are still needed if you want to take advantage of XML's special attribute types like the built-in ID/IDREF cross-reference mechanism.

There are thousands of DTDs already in existence in all kinds of areas (see the SGML/XML Web pages for pointers). Many of them can be downloaded and used freely; or you can write your own (see the question on creating your own DTD). Existing SGML DTDs need to be converted to XML for use with XML systems: read the question on converting SGML DTDs to XML, and expect to see announcements of popular DTDs becoming available in XML format.


C.12 How do I create my own DTD?
You need to use the XML Declaration Syntax (very simple: declaration keywords begin with <! rather than just the open angle bracket, and the way the declarations are formed also differs slightly). Here's an example of a DTD for a shopping list, based on the fragment used in an earlier question:

<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>
It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one (that's the plus sign) but there may be more than one. It also says that the Item element may contain parsed character data (PCDATA, ie text).

Because there is no other element which contains Shopping-List, that element is assumed to be the `root' element, which encloses everything else in the document. You can now use it to create an XML file: give your editor the declarations:

<?xml version="1.0"?>
<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd">
(assuming you put the DTD in that file). Now your editor will let you create files according to the pattern:

<Shopping-List>
<Item>Chocolate</Item>
<Item>Sugar</Item>
<Item>Butter</Item>
</Shopping-List>
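
If you have no validating parser to hand, the two declarations are simple enough to check by hand. The Python sketch below is only a hand-written stand-in for what a validating parser derives from the DTD, and the function name conforms_to_shopping_list is invented for the example; it tests the (Item)+ and (#PCDATA) content models against a parsed document.

```python
import xml.etree.ElementTree as ET


def conforms_to_shopping_list(xml_text):
    """Check a document against the two-line shopping list DTD by hand."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    if root.tag != "Shopping-List":
        return False
    children = list(root)
    # (Item)+ means one or more Item elements and nothing else.
    if not children or any(child.tag != "Item" for child in children):
        return False
    # Element content: only whitespace is allowed between the Items.
    if (root.text or "").strip():
        return False
    if any((child.tail or "").strip() for child in children):
        return False
    # (#PCDATA) means text only, so an Item may not contain elements.
    return all(len(child) == 0 for child in children)


good = """<Shopping-List>
<Item>Chocolate</Item>
<Item>Sugar</Item>
<Item>Butter</Item>
</Shopping-List>"""

print(conforms_to_shopping_list(good))
print(conforms_to_shopping_list("<Shopping-List></Shopping-List>"))
```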
It is possible to develop complex and powerful DTDs of great subtlety, but for any significant use you should learn more about document systems analysis and document type design. See for example Developing SGML DTDs by Maler and el Andaloussi, Prentice Hall, 1997, 0-13-309881-8, which was written for SGML, but perhaps 95% of it applies to XML as well, as XML is much simpler than full SGML -- see the list of restrictions which shows what has been cut out.


C.13 Does XML let me make up my own tags?
No, it lets you make up names for your own elements. If you think tags and elements are the same thing you are already in trouble: read the rest of this question carefully.

Before we start this one, Bob DuCharme notes: Don't confuse the term `tag' with the term `element'. They are not interchangeable. An element usually contains two different kinds of tag: a start-tag and an end-tag, with text or more markup between them.

XML lets you decide which elements you want in your document and then indicate your element boundaries using the appropriate start- and end-tags for those elements. Each <!ELEMENT... declaration defines a class of elements that may or may not be used in a document conforming to that DTD. We call this class of elements an `element type'. Just as the HTML DTD includes the H1 and P element types, your document can have color and price element types.

Non-empty elements are made up of a start-tag, the element's content, and an end-tag. <color>red</color> is a complete instance of the color element. <color> is only the start-tag of the element, showing where it begins; it is not the element itself.

Empty elements are a special case that may be represented either as a pair of start- and end-tags with nothing between them (eg <price retail="123"></price>) or as a single empty element start-tag that has a closing slash to tell the parser `don't go looking for an end-tag to match this' (eg <price retail="123"/>). [Bob DuCharme]
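
The two spellings of an empty element really are the same thing to a parser. A minimal check with Python's standard xml.etree.ElementTree (variable names are arbitrary):

```python
import xml.etree.ElementTree as ET

pair = ET.fromstring('<price retail="123"></price>')  # start-tag plus end-tag
single = ET.fromstring('<price retail="123"/>')       # empty-element tag


def summary(elem):
    """Reduce an element to the parts an application would look at."""
    return (elem.tag, dict(elem.attrib), elem.text or "", len(elem))


print(summary(pair) == summary(single))
# Serializing gives the same bytes for both, whichever form was read.
print(ET.tostring(pair) == ET.tostring(single))
```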


C.14 I keep hearing about alternatives to DTDs. What's a schema?
A DTD is for specifying the structure (only) of an XML file: it gives the names of the elements, attributes, and entities that can be used, and how they fit together. Because DTDs were designed for use with traditional text documents, they have no mechanism for defining the content of elements in terms of data types, because XML has no data types: text is just text. A DTD therefore cannot be used to specify numeric ranges or to define limitations or checks on the text content, only on the markup that surrounds it.

The XML Schema recommendation provides a means of specifying element content in terms of data types, so that document type designers can provide criteria for validating the content of elements as well as the markup itself. Schemas are written as XML files, thus avoiding the need for processing software to be able to read XML Declaration Syntax, which is different from XML Instance Syntax.
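
To illustrate the kind of content check that a DTD cannot express, here is a hand-written Python sketch. It is not XML Schema and uses no schema processor; the element names order and quantity and both function names are hypothetical. It simply shows what `this element must hold a positive integer' means once the file has been parsed.

```python
import xml.etree.ElementTree as ET


def is_positive_integer(text):
    """True if text is an ASCII decimal integer greater than zero."""
    text = (text or "").strip()
    return text.isascii() and text.isdigit() and int(text) > 0


def invalid_quantities(xml_text):
    """Return the content of every quantity element that fails the check."""
    root = ET.fromstring(xml_text)
    return [q.text for q in root.iter("quantity")
            if not is_positive_integer(q.text)]


order = (
    "<order>"
    "<quantity>3</quantity>"
    "<quantity>-1</quantity>"
    "<quantity>lots</quantity>"
    "<quantity/>"
    "</order>"
)
print(invalid_quantities(order))
```

All four quantity elements are perfectly valid against a DTD that declares quantity as (#PCDATA); only a datatype-aware check tells them apart.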

Schemas are now a formal Recommendation, and a number of sites are serving useful applications as both DTDs and Schemas, eg http://www.schema.net and http://www.dtd.com. There is a separate Schema FAQ at http://www.schemavalid.com. The term `vocabulary' is sometimes used to refer to `DTDs and Schemas' together.

Authors and publishers should note that the plural of Schema is Schemas: the use of the singular to do duty for the plural is a foible dear to the semi-literate; the use of the old (Greek) plural schemata is now unnecessary didacticism. Writers should also note that the plural of DTD is DTDs: there is no apostrophe.

Bob DuCharme adds: Many XML developers were dissatisfied with the syntax of the markup declarations described in the XML spec for two reasons. First, they felt that if XML documents were so good at describing structured information, then the description of a document type's own structure (its schema) should be in an XML document instead of written with its own special syntax. In addition to being more consistent, this would make it easier to edit and manipulate the schema with regular document manipulation tools. Secondly, they felt that traditional DTD notation didn't allow document type designers the power to impose enough constraints on the data -- for example, the ability to say that a certain element type must always have a positive integer value, that it may not be empty, or that it must be one of a list of possible choices. This eases the development of software using that data because the developer has less error-checking code to write.


C.15 How do I upload or download XML to/from a database?
Ask your database manufacturer: they all provide XML import and export modules. In some trivial cases there will be a 1:1 match between field and element types; in most cases some programming is required to establish the matches, but this can usually be stored as a procedure so that subsequent uses are simply commands or calls with the relevant parameters.
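
Where you do have to write the mapping yourself, the trivial 1:1 case looks roughly like the Python sketch below. It uses the standard sqlite3 and xml.etree.ElementTree modules; the items table, its columns, and the function names are all invented for the example, and a real database's own import/export module would normally replace this.

```python
import sqlite3
import xml.etree.ElementTree as ET

SCHEMA = "CREATE TABLE items (name TEXT, quantity INTEGER)"


def export_items(conn):
    """Write each row as an Item element: one column as text, one as attribute."""
    root = ET.Element("Shopping-List")
    query = "SELECT name, quantity FROM items ORDER BY name"
    for name, quantity in conn.execute(query):
        item = ET.SubElement(root, "Item", quantity=str(quantity))
        item.text = name
    return ET.tostring(root, encoding="unicode")


def import_items(conn, xml_text):
    """Read Item elements back into rows."""
    for item in ET.fromstring(xml_text).iter("Item"):
        conn.execute(
            "INSERT INTO items (name, quantity) VALUES (?, ?)",
            (item.text, int(item.get("quantity"))),
        )


source = sqlite3.connect(":memory:")
source.execute(SCHEMA)
source.executemany("INSERT INTO items VALUES (?, ?)",
                   [("Chocolate", 1), ("Butter", 2)])
exported = export_items(source)
print(exported)

target = sqlite3.connect(":memory:")
target.execute(SCHEMA)
import_items(target, exported)
rows = target.execute("SELECT name, quantity FROM items ORDER BY name").fetchall()
print(rows)
```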

Users from a database or computer science background should be aware that XML is not a database management system: it is a text markup system. While there are many similarities, some of the concepts of one are simply non-existent in the other: XML does not possess some database-like features in the same way that databases do not possess markup-like ones. It is a common error to believe that XML is a DBMS like Oracle or Access and therefore possesses the same facilities. It doesn't. [PF]


C.16 How will XML affect my document links?
The linking abilities of XML systems are much more powerful than those of HTML, so you'll be able to do much more with them. Existing HREF-style links will remain usable, but the new linking technology is based on the lessons learned in the development of other standards involving hypertext, such as TEI and HyTime, which let you manage bidirectional and multi-way links, as well as links to a span of text (within your own or other documents) rather than to a single point. These features have been available to SGML users for many years, so there is considerable experience and expertise available in using them.

The XML Linking Specification (XLink) and XML Extended Pointer Specification (XPointer) documents contain a detailed draft specification. An XML link can be either a URL or a TEI-style Extended Pointer (XPointer), or both. A URL on its own is assumed to be a resource; if an XPointer or XLink follows it, it is assumed to be a sub-resource of that URL; an XPointer on its own is assumed to apply to the current document (all exactly as with HTML).

An XLink is always preceded by one of #, ?, or |. The # and ? mean the same as in HTML applications; the | means the sub-resource can be found by applying the link to the resource, but the method of doing this is left to the application. An XPointer can only follow a #.

The TEI Extended Pointer Notation (EPN) is much more powerful than the fragment address on the end of some URLs, as it allows you to specify the location of a link end using the structure of the document as well as (or in addition to) known, fixed points like IDs. For example, the linked second occurrence of the word `XPointer' two paragraphs back could be referred to as http://www.ucc.ie/xml/faq.sgml#ID(hypertext).child(2,*).child(2,#element,'p').child(3,#element,'link'), meaning the third link element within the second paragraph within the second object in the element whose ID is hypertext (this question). Count the objects from the start of this question in the XML source (which has the ID hypertext):

the first child object is the title of the question (<q>);
the second child object is the answer (the <a> element);
within the <a> element go to the second paragraph;
count to the third link.
David Megginson has produced an xpointer function for Emacs/psgml which will deduce an XPointer for any location in an XML document.


C.17 Can I do mathematics using XML?
Yes, if the document type you use provides for math. The mathematics-using community is developing software, and there is a MathML Recommendation at the W3C, which is a native XML application. It would also be possible to make XML fragments from other DTDs, such as the long-expired HTML3, the near-obsolete HTML Pro, or ISO 12083 Math, or OpenMath, or one of your own making. Browsers which display some math embedded in SGML already exist (eg DynaText, Panorama, Multidoc Pro).


C.18 How does XML handle metadata?
Because XML lets you define your own markup language, you can make full use of the extended hypertext features (see the question on Links) of XML to store or link to metadata in any format (eg ISO 11179, Dublin Core, Warwick Framework, Resource Description Framework (RDF), and Platform for Internet Content Selection (PICS)).

There are no predefined elements in XML, because it is an architecture, not an application, so it is not part of XML's job to specify how or if authors should or should not implement metadata. You are therefore free to use any suitable method from simple attributes to the embedding of entire Dublin Core/Warwick Framework metadata records. Browser makers may also have their own architectural recommendations or methods to propose.


C.19 Can I use Java, ActiveX, etc in XML files?
This will depend on what facilities the browser makers implement. XML is about describing information; scripting languages and languages for embedded functionality are software which enables the information to be manipulated at the user's end, so these languages do not have any place in an XML file, but in stylesheets like XSL and CSS.

XML itself provides a way to define the markup needed to implement scripting languages: as a neutral standard it neither encourages nor discourages their use, and does not favour one language over another, so the field is wide open.


C.20 Can I use Java to create or manage XML files?
Yes, any programming language can be used to output data from any source in XML format. There is a growing number of front-ends and back-ends for programming environments and data management environments to automate this.

There is a large body of `middleware' written in Java and other languages for managing data either in XML or with XML output. There is a suite of Java tutorials (with source code and explanation) available at http://developerlife.com.

Please do not mail the FAQ editor with questions about your Java programming bugs. Ask one of the Java newsgroups instead.

C.21 How do I execute or run an XML file?
You can't and you don't. XML is not a programming language, so XML files don't `run' or `execute'. XML is a markup specification language and XML files are data: they just sit there until you run a program which displays them (like a browser) or does some work with them (like a converter which writes the data in another format, or a database which reads the data), or modifies them (like an editor).


C.22 How do I control appearance?
In HTML, default styling is built into the browsers because the tagset of HTML is predefined and hardwired into browsers. In XML, where you can define your own tagset, browsers cannot know what names you are going to use and what they will mean, so you need a stylesheet if you want to display the formatted text.

Browsers which read XML will accept and use a CSS stylesheet at a minimum, but you can also use the more powerful XSLT stylesheet language to transform your XML into HTML -- which browsers, of course, already know how to display (and that HTML can still use a CSS stylesheet).

As with any system where files can be viewed at random by arbitrary users, the author cannot know what resources (such as fonts) are on the user's system, so the same care is needed as with HTML using fonts. To invoke a stylesheet from an XML file, include one of the stylesheet declarations:

<?xml-stylesheet href="foo.xsl" type="text/xsl"?>
<?xml-stylesheet href="foo.css" type="text/css"?>
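
These declarations are ordinary processing instructions, so any program can read them. As a sketch of how software might discover which stylesheets a file asks for, the Python below uses the standard xml.dom.minidom; the function name stylesheets_of and the doc root element are made up, and the regular expression only handles double-quoted pseudo-attributes.

```python
import re
from xml.dom import minidom


def stylesheets_of(xml_text):
    """List the pseudo-attributes of each xml-stylesheet declaration."""
    found = []
    for node in minidom.parseString(xml_text).childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data looks like attributes but is plain text.
            found.append(dict(re.findall(r'([\w-]+)="([^"]*)"', node.data)))
    return found


sample = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet href="foo.xsl" type="text/xsl"?>\n'
    '<?xml-stylesheet href="foo.css" type="text/css"?>\n'
    "<doc/>"
)
sheets = stylesheets_of(sample)
print(sheets)
```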
The Cascading Stylesheet Specification (CSS) provides a simple syntax for assigning styles to elements, and has been implemented in most browsers.

The Extensible Stylesheet Language (XSL) has been created for use specifically with XML. Dave Pawson maintains a comprehensive FAQ at http://www.dpawson.co.uk/xsl/xslfaq.html. XSL uses XML syntax (an XSL stylesheet is an XML file) and has widespread support from several major vendors (see the questions on browsers and other software) although current browser support is limited. XSL comes in two flavours:

XSL itself, which is a pure formatting language, and which needs a text formatter like FOP or PassiveTeX to create printable output (both can produce PDF). Currently I am not aware of any Web browsers which support XSL rendering;
XSLT (T for Transformation), which is a language to specify transformations of XML into HTML either inside the browser or at the server before transmission. It can also specify transformations from one vocabulary of XML to another, and from XML to plaintext.
Currently only MS Internet Explorer 5.5 handles XSLT inside the browser (and even that needs some post-installation surgery to remove the obsolete WD-xsl and replace it with the current XSL-Transform processor). But there is a growing use of server-side processors like Cocoon, which let you store your information in XML but serve it auto-converted to HTML, thus allowing the output to be used by any browser. XSLT is also widely used to transform XML into non-SGML formats for input to other systems (for example to transform XML into LaTeX for typesetting).


C.23 How do I use graphics in XML?
Graphics have traditionally just been links which happen to have a picture file at the end rather than another piece of text. They can therefore be implemented in any way supported by the XLink and XPointer specifications (see earlier question), including using similar syntax to existing HTML images. They can also be referenced using XML's built-in NOTATION and ENTITY mechanism in a similar way to standard SGML, as external unparsed entities.

The linking specifications, however, give you much better control over the traversal and activation of links, so an author can specify, for example, whether or not to have an image appear when the page is loaded, or on a click from the user, or in a separate window, without having to resort to scripting.

XML itself doesn't predict or restrict graphic file formats: GIF, JPG, TIFF, PNG, CGM, and SVG at a minimum would seem to make sense; however, vector formats are normally preferred for non-photographic images.


Using entities for images
You cannot embed a raw graphics file (or any other binary [non-text] data) directly into an XML file because any bytes happening to resemble markup would get misinterpreted: you must refer to it by linking (see below). It would, however, in theory be possible to include a text-encoded transformation of a binary file as a CDATA marked section, using something like UUencode with the markup characters ] and > removed from the map so that they could not occur and be misinterpreted, or even simple hexadecimal encoding as used in PostScript. For vector graphics, however, the solution is to use SVG (see below).
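
The hexadecimal option is the simplest of these to try. The Python sketch below wraps arbitrary bytes as hex digits inside an element and recovers them again. The image element, its encoding attribute, and the function names are hypothetical, and the price is a doubling in size.

```python
import binascii
import xml.etree.ElementTree as ET


def embed_binary(data):
    """Wrap raw bytes in an element as hexadecimal text."""
    elem = ET.Element("image", encoding="hex")
    elem.text = binascii.hexlify(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def extract_binary(xml_text):
    """Recover the original bytes from the hexadecimal text."""
    return binascii.unhexlify(ET.fromstring(xml_text).text or "")


# Every possible byte value, including those that look like < & ] and >.
raw = bytes(range(256))
wrapped = embed_binary(raw)
print(len(raw), len(wrapped))
print(extract_binary(wrapped) == raw)
```

Because hex digits never include markup characters, no CDATA section or escaping is needed at all.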

Bob DuCharme adds: All the data in an XML document entity must be parseable XML. You can define an external entity as either a parsed entity (parseable XML) or an unparsed entity (anything else). Unparsed entities can be used for picture files, sound files, movie files, or whatever you like. They can only be referenced from within a document as the value of an attribute (much like a bitmap picture on an HTML Web page is the value of the img element's src attribute) and not part of the actual document. In an XML document, this attribute must be declared to be of type ENTITY, and the entity's declaration must specify a declared NOTATION, because if the entity isn't XML, the XML processor needs to know what it is. For example, in the following document, the colliepic entity is declared to have a JPEG notation, and it's used as the value of the empty dog element's picfile attribute.

<?xml version="1.0"?>
<!DOCTYPE dog [
<!NOTATION JPEG SYSTEM "Joint Photographic Experts Group">
<!ENTITY colliepic SYSTEM "lassie.jpg" NDATA JPEG>
<!ELEMENT dog EMPTY>
<!ATTLIST dog picfile ENTITY #REQUIRED>
]>
<dog picfile="colliepic"/>
The XLink and XPointer linking specifications describe other ways to point to a non-XML file such as a graphic. These offer more sophisticated control over the external entity's position, handling, and appearance within the XML document.
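
To see what an XML processor actually reports for the dog example, here is a sketch using Python's standard xml.parsers.expat. The function name read_pictures is invented, and matching attribute values against entity names is a shortcut: a real processor would consult the ATTLIST to know which attributes are of type ENTITY.

```python
import xml.parsers.expat

DOC = b"""<?xml version="1.0"?>
<!DOCTYPE dog [
<!NOTATION JPEG SYSTEM "Joint Photographic Experts Group">
<!ENTITY colliepic SYSTEM "lassie.jpg" NDATA JPEG>
<!ELEMENT dog EMPTY>
<!ATTLIST dog picfile ENTITY #REQUIRED>
]>
<dog picfile="colliepic"/>"""


def read_pictures(xml_bytes):
    """Collect notations, unparsed entities, and the elements that use them."""
    notations, entities, uses = {}, {}, []
    parser = xml.parsers.expat.ParserCreate()

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_entity(name, is_parameter, value, base,
                  system_id, public_id, notation_name):
        # Only unparsed (NDATA) entities carry a notation name.
        if notation_name is not None:
            entities[name] = (system_id, notation_name)

    def on_start(name, attrs):
        # Shortcut: treat any attribute value naming a known entity as a use.
        for value in attrs.values():
            if value in entities:
                uses.append((name,) + entities[value])

    parser.NotationDeclHandler = on_notation
    parser.EntityDeclHandler = on_entity
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return notations, entities, uses


notations, entities, uses = read_pictures(DOC)
print(notations)
print(entities)
print(uses)
```

Note that the parser never opens lassie.jpg: it only hands the application the file's system identifier and notation, and what to do with them is the application's business.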

pi_1901843
*gosh, this topic is loading slower and slower*

The Net Abuse FAQ
Last changed $Date: 1998/12/23 19:28:32 $, making this $Revision: 3.2 $.
NOTE: Parts of this FAQ may be out of date. Please send me any suggestions or corrections.

The most frequently asked question is always "Who do I complain to about this?"
Please see sections 3.8 through 3.12 for answers.

If you read no other part of this FAQ, read section 3.21.
POLITICS
1.1) What are the news.admin.net-abuse groups, and why were they created?
1.2) (this section has been merged into 1.1)
1.3) What is net-abuse?
1.4) What is the purpose of this FAQ?
1.5) What questions does it leave unanswered?
1.6) Who's responsible for this FAQ?
1.7) Where can I get it?
1.8) Is this the only Net Abuse FAQ?
1.9) I don't understand a word of this.


SPAM, SPAMMERS, and MOOSES
2.1) What is Spam?
2.2) What is Excessive Multi-Posting (EMP)?
2.3) What about cross-posting?
2.4) Where did the term come from?
2.5) Tell me about the Great Spammers.
2.6) Who were Canter and Siegel?
2.7) Where can I get more info on them?
2.8) What should we do about the book?
2.9) Who is Cancelmoose?
2.10) Who are the current spam cancellers?
2.11) Has this problem really been going on for FOUR YEARS?!


NITTY-GRITTY
3.1) Yeah, but how many times is 'X'?
3.2) What is the Breidbart Index (BI)?
3.3) What is NoCeM?
3.4) Is there a blacklist of net-abusers?
3.5) How can I tell if a post is forged?
3.6) How do I know when I've got spam on my hands?
3.7) My group is full of crap. Why isn't it being cancelled?
3.8) OK, I think I've spotted a spam. Who should I mail-bomb?
3.9) OK, I think I've spotted a spam. What should I do?
3.10) What about e-mail spam?
3.11) I e-mailed a complaint to {so-and-so} about their {e-mail, post} and now they're threatening to complain to my system administrator. What should I do?
3.12) List of Basic Administrative Addresses
3.13) What's a cancel-bot?
3.14) Where can I get me one?
3.15) How do spam-cancellers cancel spam?
3.16) Can I sic The Man on these MAKE.MONEY.FAST losers (or other types of net abusers)?
3.17) What is a killfile, and how do I use one?
3.18) How do I killfile all crossposted messages?
3.19) What is the Usenet Death Penalty (UDP)?
3.20) Do all hierarchies have the same rules?
3.21) How about we start a campaign to stop all the spammers?


GROAN
4.1) Why are you net-abuse people such net-cops?
4.2) Isn't cyberporn a bigger issue than spamming?
4.3) Hey, I think my newsgroup is being invaded by alt.syntax.tactical!
4.4) Hey, I think my newsgroup is being invaded by the Usenet Freedom Council!
4.5) Hey, somebody posted an ad in {newsgroup}!
4.6) Hey, so-and-so's not being nice in {newsgroup}!
4.7) Hey, the Good Times virus--
4.8) Hey, there's this (AT&T, Jerry Garcia, whatever) banner message in the newsgroup descriptions!
4.9) Hey, one of those net.cops posted an ad for {something}! Haw! Haw!

--------------------------------------------------------------------------------

POLITICS
1.1) What are the news.admin.net-abuse groups, and why were they created?
Originally, news.admin.net-abuse.misc was created to replace alt.current-events.net-abuse and news.admin.policy. The former was one of the most widely read and respectable alt.* groups, while the latter had become largely a mess of messages cross-posted from a.c-e.n-a and news.admin.misc.
news.admin.net-abuse.misc was then, not surprisingly, for discussions of net-abuse (see "What is net-abuse", below): definitions, occurrences, objections, complaints, battle plans, peace plans, etcetera.

As you can guess, that generated amazing amounts of traffic. By early 1996, it had gotten to the point where it was impossible to keep up with the group without investing hours and hours of time.

In November of 1996, after many months of hard work from Tim Skirvin and others, the news.admin.net-abuse.* groups were reorganized. The charters are stored at:


http://www.uiuc.edu/ph/www/tskirvin/nana

1.3) What is net-abuse?
Since the first net-abuse newsgroup, many curious forms of Usenet behavior have been discussed. Of these, spam is the one most universally accepted as 'net-abuse', which is why it gets its own section below. Other Frequently Aired Complaints are discussed throughout the FAQ.
However, as Neil Pawson says, "it's for abuse *of* the net, NOT abuse *on* the net." Just because somebody does something vile doesn't mean we can do anything about it on n.a.n-a. To qualify as true panic-inspiring net-abuse, an act must interfere with the net-use of a large number of people. Examples of this: newsgroup flooding, widespread or organized forgery campaigns, widespread or organized account hackery, widespread or organized censorship attempts, etcetera.


1.4) What is the purpose of this FAQ?
This FAQ is *not* intended as a comprehensive guide to netiquette. That is covered in RFC 1855. Many things that this FAQ appears to treat lightly are, in fact, extreme breaches of netiquette. The FAQ primarily attempts to answer: are these situations "net-abuse", in the sense that the whole world should hear about them?

1.5) What questions does the FAQ leave unanswered?
Probably quite a few. If you have questions that you think should be added to the FAQ, feel free to contact me -- especially if you also have the answers.

I'd also love to have a section on network/address tracking and informational tools (telnet, traceroute, nslookup, etc.) a la "The Spam-tracker's Handbook". Whatever happened to that?

Anyways, feel free to contribute whole new entries.

1.6) Who's responsible for this FAQ?
It's currently maintained by J.D. Falk (jdfalk@cybernothing.org), and was originally maintained by Scott Southwick (scotty@bluemarble.net). The information has been gleaned from various Usenet sources --primarily posts to the net-abuse groups made by a wide variety of authors-- and so the maintainer must actively disclaim all responsibility for the veracity, advisability and/or legality of anything contained in the FAQ. Thanks to the following people who have contributed to it, or at least discussed its contents in a non-threatening manner:
Arthur Byrne, Pekka Pirinen, Keith "Justified and Ancient" Cochran, Lamont Granquist, Victoria Fike, Steve Patlan, Wilf Leblanc, Seth Cohn, Neil Pawson, Bram Cohen, Mitchell Golden, Rahul Dhesi, Stephen Boursy, Mary Branscombe, David Cortesi, Alexander Lehmann, Greg Lindahl, Jack Hamilton, Morten Welinder, Axel Boldt, Richard Lee, an48985, Phil Pfeiffer, John van Essen, Pierre Beyssac, Michael Shields, Travis Corcoran, Tim Skirvin, Chris Lewis, Daniel J. Barrett, Ricardo H. Gonzalez, Dave Hayes, Ed Falk (no relation), Nathan J. Mehl (Nathan says hi), Peter Kappesser, Robert Braver, Loy Ellen Gross, booter, Johann Beda, Shaun Davis-Gluyas, John R. Birch, Penn Hackney, David Grabiner, Brendan O'Sullivan-Hale, Bob Allisat, John Moreno, and many others we have undoubtedly missed over the years.

Contributions are always warmly welcomed, as are suggestions, corrections and criticism. However, you know where to shove the flames.


1.6.1) What are the big changes made in 1998?
After letting this FAQ languish for a while, I realized that it was time to go through and clean stuff up, as well as adding new information. To tell you the truth, I'm quite dismayed at how little has changed.

This Net Abuse FAQ will continue, however, to focus on usenet. There are a lot of other good documents about e-mail abuse, and that's an area which changes way too often.

1.7) Where can I get it?
This FAQ will be posted thrice monthly (on the 1st, 11th, and 21st) to the following newsgroups:

news.admin.net-abuse.usenet
news.admin.net-abuse.misc
news.admin.net-abuse.bulletins
news.admin.misc
news.groups.questions
news.answers
It will also be available at the various public FAQ archives, including rtfm.mit.edu and its mirror sites. The master hypertext version is available at:

http://www.cybernothing.org/faqs/net-abuse-faq.html

1.8) Is this the only Net Abuse FAQ?
Unfortunately, the topic of Net Abuse is so vast and so controversial that it cannot be covered completely in one document.
Of course, that didn't stop Daniel Barrett from trying, and doing a very good job. He wrote a book (published by O'Reilly Publishing) with the unfortunate but fitting title of Bandits on the Information Superhighway. More information is available at:


http://www.ora.com/item/bandits.html

I've removed much of the rest of this list, because Stan Kalisch III is doing a much better job of keeping his list of news.admin.net-abuse.* Newsgroups' Documents updated. You can view it at:

http://www.crl.com/~sjkiii/news-admin-net-abuse.html, or ftp://ftp.crl.com/users/sj/sjkiii/pub/usenet/news-admin-net-abuse.txt
For an almost totally different viewpoint, see Dave Hayes's long-awaited document, "An Alternative Primer on Net Abuse, Free Speech, and Usenet," which at first denied the existence of this FAQ. You can find it and some related documents at:

http://www.jetcafe.org/~dave/usenet/

My answer to Dave's Alternative Primer is also worth reading:

http://www.cybernothing.org/faqs/dave-hayes.html

There are a number of very good indices of net abuse-related documents:

Fight Spam on the Internet! (Scott Hazen Mueller)
http://spam.abuse.net/

news.admin.net-abuse.* homepage (Tim Skirvin)
http://www.math.uiuc.edu/~tskirvin/home/nana/

1.9) I don't understand a single word of this.
One of the best starting places for learning about Usenet has historically always been Indiana University's Usenet Resources page, which is now at:

http://kb.indiana.edu/menu/usenet.html
It has links to most Usenet primers, netiquette documents and news FAQs, Son-of-RFC-1036, some charters, newsreader man pages, etcetera. Also, perhaps one of the following resources will help:


http://www.landfield.com/usenet/
http://sunsite.unc.edu/usenet-i/
http://www.geocities.com/ResearchTriangle/8211/

--------------------------------------------------------------------------------

SPAM, SPAMMERS, and MOOSES
2.1) What is Spam?
It's a luncheon meat, kinda pink, comes in a can, made by Hormel. Most Americans intuitively, viscerally associate "Spam" with "no nutritive or aesthetic value," though it is still relatively popular (especially in Hawaii) and can be found in almost any grocery store. The canned luncheon meat has its own newsgroup, alt.spam.
The term "spam," as used on this newsgroup, means "the same article (or essentially the same article) posted an unacceptably high number of times to one or more newsgroups." CONTENT IS IRRELEVANT. 'Spam' doesn't mean "ads." It doesn't mean "abuse." It doesn't mean "posts whose content I object to." Spam is a funky name for a phenomenon that can be measured pretty objectively: did that post appear X times? (See 3.1, "Yeah, but how many times is 'X'?")

There have been "customized" spams where each post made some effort to apply to each individual newsgroup, but the general thrust of each article was the same. A huge straw poll on news.admin.policy, news.admin.misc, and alt.current-events.net-abuse (December 1994) showed that as many as 90% of the readers felt that cancellations for these posts were justified. So, simply put: if you plan to post the same or extremely similar messages to dozens of newsgroups, the posts are probably going to get cancelled.

If you feel that a massive multi-post you are planning constitutes an exception, you are more than welcome to run the idea past the readers of news.admin.net-abuse.usenet for feedback first.


2.2) What is Excessive Multi-Posting (EMP)?
Spam (and spam by any other name still stinks.)
Some people feel that "spam" is an inappropriately misleading name for messages of this type. Others feel that "EMP" is misleading. Since spam is the most widely recognized term, that's what we use in this FAQ.


2.3) What about cross-posting?
Here's the difference between cross-posting and multi-posting: cross-posting is where you list all the groups on the Newsgroups: line of a single post. Multi-posting is where you have some idiotic program fire an individual copy of the post to each group. (If you do it manually, that's even more idiotic.) A cross-post only takes up the space of 1 post (one on every newsserver in the world), no matter how many groups; multi-posting takes up the space of dozens or hundreds of posts (on every newsserver in the world), which is why it infuriates so many people.
So, cross-posting is better than multi-posting. It's still very often a bad idea, and if you get carried away it'll still get cancelled (see 3.2, "What is the Breidbart Index (BI)?"). This is often called Excessive Cross-Posting, or ECP. Some folks still call it "velveeta" because they like cutesy names.

If you *must* cross-post, set the followups to a single appropriate group by adding a header line like:


Followup-to: group.name.here

This prevents the readers of all the groups from having to deal with the thread for weeks afterwards if the readers of only one or two of the groups take an interest in it.
You can also add Followup-to: poster, which will (in most newsreaders) ask anybody who tries to follow up to e-mail you directly instead.
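As a rough illustration of the difference, here is a minimal Python sketch that builds the two header lines for one cross-posted article with followups redirected. The function name and the example group names are made up for illustration; the header is spelled "Followup-To" as in RFC 1036, and header names are case-insensitive.

```python
def crosspost_headers(groups, followup_to):
    """Return the Newsgroups: and Followup-To: lines for ONE cross-posted article.

    groups      -- list of newsgroup names the single article is posted to
    followup_to -- one group name, or the keyword "poster"
    """
    return [
        # One article, many groups: names are comma-separated with no spaces.
        "Newsgroups: " + ",".join(groups),
        # Replies are steered to a single group (or to e-mail via "poster").
        "Followup-To: " + followup_to,
    ]


# Hypothetical example groups, for illustration only.
for line in crosspost_headers(["rec.pets.cats", "rec.pets.dogs"], "rec.pets.cats"):
    print(line)
```

A multi-post, by contrast, would be a separate article (with its own Message-ID) for every group, which is exactly what the paragraph above warns against.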


2.4) Where did the term 'Spam' come from?
The prevailing theory is that it is from the song in Monty Python's famous spam-loving vikings sketch that goes, roughly, "Spam spam spam spam, spam spam spam spam, spam spam spam spam..." The vikings, who were sitting in a restaurant whose menu only included dishes made with spam, would sing this refrain over and over, rising in volume until it was impossible for the other characters in the sketch to converse (which was, of course, a large part of the joke.)
The term is rumored to have originated, as far as the Internet is concerned, from the MUD/MUSH community. Blue-haired former newsadmin Nathan J. Mehl tells the most reliable story known to date...


Well, briefly summarized:
My friend-who-shall-remain-nameless was, ah, a younger and callower man, circa 1985 or so, and happened onto one of the original Pern MUSHes during their most Sacred Event -- a hatching. After trying to converse sanely with two or three of the denizens, he came quickly to the conclusion that they are all a bunch of obsessive-compulsive nitwits with no life and less literary taste. (Probably true.)

Editors' Note: another source tells me that this actually happened in the summer of 1991.

So, as the 'eggs' were 'hatching', he assigned a keyboard macro to echo the line:

SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM SPAM

...and proceeded to invoke it once every couple of seconds, until one of the wizards finally booted him off.

...which would have probably been the last that anyone ever heard or thought of it, except that it apparently ingrained itself into the memory of the PernMUSHers, and forever after there was the legend of 'that asshole who spammed us.'

Every once in a while, this story makes it back to my friend, and he tries very hard to keep a straight face...


Another theory is related to throwing a "brick" of the luncheon meat at a rotating metal fan. However, none of the long-time "spam watchers" have any idea where that theory was from before it showed up in a Time magazine article.
The term wasn't first used to describe mass news posting, however. See the Hacker's Jargon File for previous uses of the word.


2.5) Tell me about the Great Spammers.
To paraphrase Yoda, spam does not make one great. However, a surprising number of people prefer infamy to obscurity, and would rather be hated than unknown. Some of those people take up spamming as a way to gain the notoriety that their warped psyches crave.
So as not to duplicate effort, here's an excellent archive devoted to the various bug- and honey-bears of the Net:


The Kook of the Month site (particularly the Net.Legends FAQ)
http://www.ews.uiuc.edu/~tskirvin/faqs/legends.html

Not all of the kooks and legends discussed there are spammers, or even villains. Spam fans should pay particular attention to the entries on Serdar Argic, the spiritual ancestor of today's spammers. In fact, any would-be spammers should try to be more like him. At least he was kinda interesting. Today's kooks are just sociopaths.


2.6) Who were Canter and Siegel?
They were lawyers, authors, and Usenet newbies _par excellence_. Super-newbies. Honorary Permanent Newbies. When they sit around the net, they sit *around the net*...

C+S weren't the first spammers, but they were so gothically clumsy about it, and so intent on making a buck, that people were terrified and infuriated into starting alt.current-events.net-abuse (which has since been replaced by the news.admin.net-abuse.* groups).

Since then, they've parted ways (rumour has it they were married when they spammed, and have since gotten a divorce.) Lawrence Canter was permanently disbarred, in part because of his history of net abuse. Martha Siegel was last heard from a few years ago, when she was trying to go on a lecture tour promoting her new, revised version of the book she and Canter wrote together on how to abuse the net.

2.7) Where can I get more information about them?
The best known source is Thomas Leavitt's "The Canter & Siegel Report," available via anonymous ftp from:


ftp://ftp.armory.com/pub/user/leavitt/
Those files are zipped. Users with access to 1990s technology should check out the WWW versions at:


ftp://ftp.armory.com/pub/user/leavitt/html/cands.report.html
ftp://ftp.armory.com/pub/user/leavitt/html/candsrpt.two.html
ftp://ftp.armory.com/pub/user/leavitt/html/candsrpt.three.html
There's also a wonderful article on the pair available at:


http://www.eye.net/Howling/Kooks/Kreeps/CS2.htm (apparently now an invalid link; anybody know where it went?)
Many, many more docs are available, but I'll stop there, because there's really no reason to dwell on the past. In fact, Canter & Siegel have both posted to news.admin.net-abuse.misc and other groups from time to time (always multiposted -- they seem genetically unable to crosspost), and it has always been quite obvious that all they wanted was to generate more publicity for themselves.


2.8) What should we do about the book?
What book?

2.9) Who is Cancelmoose[tm]?
Cancelmoose[tm] is, to misquote some wise poster, "the greatest public servant the net has seen in quite some time." Once upon a time, the 'Moose would send out spam-cancels and then post notice anonymously to news.admin.policy, news.admin.misc, and alt.current-events.net-abuse. The 'Moose stepped to the fore on its own initiative, at a time (mid 1994) when spam-cancels were irregular and disorganized, and behaved altogether admirably-- fair, even-handed, and quick to respond to comments and criticism, all without self-aggrandizement or martyrdom. Cancelmoose[tm] quickly gained near-unanimous support from the readership of all three above-mentioned groups.
Nobody knows who Cancelmoose[tm] really is, and there aren't even any good rumors. However, the 'Moose now has an e-mail address (moose@cm.org) and a web site (http://www.cm.org.)

By early 1995, several others had stepped into the spam-cancel business, and appeared to be comporting themselves well, after the Moose's manner. The moose has now gotten out of the business, and is more interested in ending spam (and cancels) entirely (see "What is NoCeM?")


2.10) Who are the current spam cancellers?
Chris Lewis and Robert Braver take care of most of the spam (John Milburn has retired from the spam-cancelling biz), while Richard Depew cleans up spews from horribly misconfigured news servers, large misplaced binaries, and the like. Somebody calling himself The Unknown News Administrator has been helping as well, and so have a few others. Michael Scheidell and others deal with problems (usually out-of-area postings) in various local hierarchies.

Overall, Chris Lewis is considered to be the expert on spam cancelling, and one of the experts on Usenet in general.

For a good overview of who's doing what right now, hop over to news.admin.net-abuse.bulletins and check headers. It changes every few months.

2.11) Has this problem really been going on for FOUR YEARS?!
Yes.

The obvious next question is "why hasn't everybody just given up?" Well, some have. Many others have confined their reading to a small, selected set of groups, usually from behind a mass of killfiles and other filtering methods. Some folks even went as far as starting a new, "parallel" usenet alternative, called Usenet2, which you can read about at:

http://www.usenet2.org/
But I think Stanford newsadmin Russ Allbery explained it best in a post to Usenet2's net.subculture.usenet in March of 1998:

http://www.cybernothing.org/cno/docs/russ-usenet.txt

--------------------------------------------------------------------------------

NITTY-GRITTY
3.1) Yeah, but how many times is 'X'?
How many posts does it take to push the spam envelope? To use up all your spam charity points? For a bare-bones spam? To trigger the raging-spam-cancellers-from-Hell?
Among those who agree that spam should be defined solely by quantity,


-----------------> 20 <--------------------
appears to be the magic number, or at least a number so middle-of-the-road that it provokes very little passionate dissent in either direction. Notably, Cancelmoose[tm] refused to set a firm number, in the belief that people would simply post [X-1] messages. It's safe to say that a couple incidents of 19-post spams would cause the magic number to plummet. Thus, 20 should be considered a vague approximation only.
Passionately dissenting note: Rahul Dhesi [dhesi@rahul.net], one of the fathers of the cancel-bot movement, sticks by the following definition:


More than five physically distinct postings with substantially identical content posted within a period of ten days.
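To make that stricter definition concrete, here is a small Python sketch that checks whether a set of posting times crosses it. It assumes somebody has already decided that the posts are "substantially identical"; the function name and parameters are invented for this example and are not part of any cancelling tool.

```python
from datetime import datetime, timedelta


def exceeds_dhesi_threshold(post_times, max_posts=5, window_days=10):
    """True if more than max_posts postings fall within any window_days span.

    post_times -- datetimes of postings already judged substantially identical
    """
    times = sorted(post_times)
    window = timedelta(days=window_days)
    start = 0
    for end, current in enumerate(times):
        # Slide the start of the window forward until it spans <= window_days.
        while current - times[start] > window:
            start += 1
        if end - start + 1 > max_posts:
            return True
    return False


first = datetime(1998, 3, 1)
daily = [first + timedelta(days=d) for d in range(6)]        # six posts in six days
spread = [first + timedelta(days=3 * d) for d in range(6)]   # six posts over fifteen days
print(exceeds_dhesi_threshold(daily), exceeds_dhesi_threshold(spread))
```

Six copies on consecutive days trip the check; the same six copies spread three days apart do not, because no ten-day window ever holds more than four of them.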

The most reliable document describing current spam thresholds and guidelines is a draft FAQ posted weekly to news.admin.net-abuse.misc by Chris Lewis. It also describes the Breidbart Index (see below) in greater detail. That FAQ is now available on the web at:

http://spam.abuse.net/spam/others/thresholds.html
It is important to note that some ISP's set different limits on what their users may or may not do, so if you try to push the envelope with the Breidbart Index it's still quite possible that you'll lose your account.

3.2) What is the Breidbart Index (BI)?
The Breidbart Index (BI) is a measure of the breadth of any multi-posting, cross-posting, or combination of the two. BI is defined as the sum of the square roots of how many newsgroups each article was posted to. If that number approaches 20, then the posts will probably be cancelled by somebody.
For instance, four identical posts to nine newsgroups each (4 times 3) has a BI of 12. However, nine identical posts to four newsgroups each (9 times 2) has a BI of 18.
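The arithmetic is easy to check by machine. The following Python sketch computes the BI exactly as defined above (sum of the square roots of the number of newsgroups each copy went to); the function name is just a label chosen for this example.

```python
import math


def breidbart_index(groups_per_post):
    """Sum of the square roots of the newsgroup count of each copy of the article.

    groups_per_post -- one integer per posted copy: how many groups it was sent to
    """
    return sum(math.sqrt(n) for n in groups_per_post)


four_by_nine = breidbart_index([9] * 4)   # four copies, nine groups each
nine_by_four = breidbart_index([4] * 9)   # nine copies, four groups each
print(four_by_nine, nine_by_four)         # 12.0 18.0
```

Both cases reach 36 newsgroup appearances in total, but the second scores higher because it uses more separate copies: the index penalizes multi-posting more heavily than cross-posting.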


3.3) What is NoCeM?
NoCeM is an end to all this spam, and an end to all this cancelling. With NoCeM (pronounced "No See 'Em"), your newsreader goes out and gets certain posts (from trusted parties) that contain lists of junk articles (ECP, spam, etc.) Your newsreader then hides those articles from you.
Note that right now most NoCeM newsreaders are only for Unix. The only exception is Gnus, the newsreader for EMACS, which will work on any platform that supports a fully functioning version of GNU EMACS.

The move to NoCeM is headed by the Cancelmoose[tm] (moose@cm.org), and the moose's web site has all the info you might want about NoCeM:


http://www.cm.org/

Also check out the newsgroup alt.nocem.misc, which will degenerate into a Big 7 newsgroup (news.lists.nocem?) one of these days.

3.4) Is there a blacklist of net-abusers?
Yes, Axel Boldt maintains the world-renowned "Blacklist of Internet Advertisers" at:

http://math-www.uni-paderborn.de/~axel/BL/blacklist.html

Now, before you get really worried about McCarthyism and such, go and look at Axel's self-imposed rules for maintaining the blacklist. He's much fairer than most of those people deserve.

3.5) How can I tell if a post is forged?
Gandalf (gandalf@digital.net) has written the alt.spam FAQ, or "Figuring out fake E-Mail & Posts," which focuses on how to track spam. It is available at:

http://digital.net/~gandalf/spamfaq.html

For a rough article on forgery, originally constructed for this FAQ out of information contributed by Robert Bonomi, Arthur Byrne, Emma Pease, and Alan Bostick, see:

http://sckb.ucssc.indiana.edu/kb/data/all.afco.html

For more information on headers, see RFC-1036, "Standard for Interchange of Usenet Messages," at:

http://www.cis.ohio-state.edu/htbin/rfc/rfc1036.html

3.6) How can I tell how many newsgroups an article was posted to?
For people who can't use the classic "grepping the newsspool" method, nn or nngrab may be able to help. (The following is adapted from a posting by Lee Rudolph--thanks.)
You can force the Unix newsreader nn to ignore your .newsrc and create a "merged newsgroup" consisting only of articles containing a certain word in their subject line. For instance, to gather all articles at your site containing the word "spam" in their subject line, use this command:


% nngrab spam
That's basically a faster version of


% nn -i -s"spam" -mXx
Caution: this latter method can be a long, tedious process. See the nn man page for more details.


3.7) My group is full of crap. Why isn't it being cancelled?
Lots of groups are full of inappropriate posts, widely crossposted advertising, and so forth -- just pop into misc.misc or alt.sex for as many examples as you can possibly handle.
As annoying as it may be, these posts may not be cancellable spam. Keep in mind that the cancel thresholds err in the favor of the excessive poster, and still leave *lots* of room to post in a manner that most people find inappropriate.

A single, excessively crossposted post can not be cancellable in and of itself. In order for a single post to be cancelled, it would have to be posted to 400 groups (sqrt(400) = 20). This is not possible due to limits of news software.

Robert Braver reports "When checking for spam, I often must pass over groups of messages that are likely considered off-topic intrusions in each of the newsgroups it is posted to, but it doesn't hit the cancel threshold."

One good solution here would be for the newsadmins of a particular locality to come to a consensus for more stringent thresholds for their respective local hierarchies, as has been done in the atl.* and fl.* hierarchies.

Of course, the messages may actually be cancellable spam, especially when you consider the current 45-day window. But, this type can be harder for the automatic spam detectors to find.

Once a slow spam is detected and posted to news.admin.net-abuse.announce, it makes it easier to keep tabs on a particular poster or series of messages in the future. This kind of spam is probably where "field reports" to news.admin.net-abuse.misc are the most useful.


3.8) OK, I'm certain it's spam. Who should I mail-bomb?
Don't mail-bomb anybody. Harassment is illegal everywhere. If somebody's done something truly evil, they'll get enough single responses from individuals to achieve the same effect.

3.9) OK, I'm certain it's spam. What should I do?
Check n.a.n-a.sightings. If somebody's already made a definitive spotting, there's no sense in an "I've seen it, too" post.

Include a *complete* header from one copy of the spam in your post to n.a.n-a.sightings. Set followups to n.a.n-a.misc.

Say how many newsgroups at your site it was posted to; list 20 or more of them. (See 3.6, "How can I tell how many newsgroups an article was posted to?")

Complain politely to the spammer and the Usenet administrator at the spammer's site (whose address should be "usenet@site.name" or "news@site.name"; if that fails, try "abuse" or "postmaster".) Request that the Usenet administrator post a response to n.a.n-a.announce, detailing what actions have been taken. Again, remember to be polite -- it is rare that the administrators are in any way responsible for the message.

3.10) What about e-mail spam?
You can always complain about unsolicited e-mail to both the bozo that sent it to you and the bozo's postmaster. To write to a postmaster, just substitute the perp's username in their address (e.g., bozo@otherwise.lovely.com) with "postmaster" (i.e., postmaster@otherwise.lovely.com.) Please be brief and polite with the postmasters, include a copy of the e-mail you received, and leave the subject-line intact (in case the postmaster wants to set up an auto-responder.)
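The address substitution described above is mechanical, so here is a minimal Python sketch of it. The function name is invented for illustration, and the address is the same made-up one used in the paragraph above.

```python
def postmaster_address(address):
    """Swap the username part of an e-mail address for 'postmaster'."""
    username, at_sign, domain = address.rpartition("@")
    if not at_sign or not username or not domain:
        raise ValueError("not a user@domain address: %r" % address)
    return "postmaster@" + domain


print(postmaster_address("bozo@otherwise.lovely.com"))
# postmaster@otherwise.lovely.com
```

Note that this only rewrites whatever address appears in the message; if the From line was forged, the resulting postmaster address belongs to an innocent site, which is why the full headers matter.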

Be sure to include all the headers (not just From, To, Date, and Subject, which is the default in most mail programs) in your reply, just in case the e-mail was cleverly forged. That way, the postmaster can trace it back to its source if necessary.

For more information, see:

http://spam.abuse.net/

3.11) I e-mailed a complaint to so-and-so about their {post, mail}, and now they're threatening to complain to my system administrator. What should I do?
Let your sys-admin know right away what's happening. Tell them the story, briefly. Offer to supply the post(s) in question, so that your admin doesn't have to go searching. Then keep them updated on any further threats.
If you're brief, polite, and on the right side, you can usually find an ally in your sys-admin.


3.12) List of Basic Administrative Addresses
The search for the best person to complain to at any site has led to much speculation and arguments, even among admins at the same site. However, if a message to the original poster doesn't get you anywhere, somebody at one of the following addresses might be able to help.

abuse
A lot of ISP's and network backbones have created 'abuse' addresses for complaints about net-abuse. That's usually the best place to start.

usenet or news
For Usenet abuse, you can usually reach a news administrator through one or both of these addresses. A notable exception is Compuserve, which utilizes the address <usemail@csi.compuserve.com> (this may change now that AOL has purchased Compuserve.)

postmaster
RFC 822, the document which set most of the current standards for Internet e-mail back in 1982, makes it mandatory for all sites which pass e-mail to have a postmaster address so that problems can be reported. The purpose of postmaster has expanded at many sites to include net-abuse, both e-mail and otherwise.

Administrative or Technical Contacts
If you have access to the whois command, you can type (for example) 'whois example.com' to find out who the administrative and technical contacts are for a domain. This will list their e-mail address, and often their phone and FAX numbers (but remember, be polite, because the contacts aren't usually responsible for their users' misbehavior, and harassment is illegal everywhere.)

Upstream Providers
If none of the above get you anywhere, you can try going to a site's upstream providers. For news, check the Path: header of the original message. To the right, you'll see the originating site. Each site between you and them is separated by an exclamation point, as in the partial example below:

!dummy-host.example.com!nohost.mydomain.com!not-for-mail

As you can see, the message originated at the machine nohost.mydomain.com. The next news hop is dummy-host.example.com, so you'd complain to news@example.com if the admins at mydomain.com were uncooperative.
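Reading a Path: header by eye gets tedious, so here is a small Python sketch that splits one into hops, originating site first. It assumes (as in the partial example above) that the rightmost entry is a placeholder such as "not-for-mail" rather than a host; the function name is invented for this example, and Path: entries can be forged, so treat the result as a hint only.

```python
def path_hops(path_header):
    """Return the sites named in a Path: header, originating site first.

    Assumes the rightmost entry is a username or placeholder (for example
    'not-for-mail'), not a host, and drops it.
    """
    entries = [entry for entry in path_header.strip().split("!") if entry]
    hosts = entries[:-1]            # drop the trailing placeholder entry
    return list(reversed(hosts))    # rightmost host is where the post started


hops = path_hops("!dummy-host.example.com!nohost.mydomain.com!not-for-mail")
print(hops)
# ['nohost.mydomain.com', 'dummy-host.example.com']
```

The first element is the originating site and the second is its upstream news hop, which is the next place to complain if the originating site's admins do not respond.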
For e-mail, determining who's upstream can often be confusing -- many people get it wrong. Unless you're familiar with the whois and traceroute tools, I'd suggest not even bothering.


If you don't have the time or resources to do this research, you can send mail to domain.name@abuse.net, and it will (probably) be sent to the appropriate contact(s) for that domain. You'll need to register with abuse.net the first time you send mail through it.

3.13) What is a cancel-bot?
First off, "cancel-bot" is an unfortunate misnomer, and one that the conventional media have understandably misunderstood. "Bot" implies that something is out there, running unattended, cancelling whatever meets its nefarious qualifications...but that is quite rare, and is only done when both the user and their administrators are completely unwilling to stop spamming. For the most part, all spam-cancels are sent out manually and deliberately by actual human beings. (They happen to use a program that is commonly referred to as a "cancel-bot".)
A cancel-bot, misnomer aside, is a program that sends out cancel messages; you feed it the message-IDs of posts, and it sends out a cancel message for each one (see RFC 1036.) Cancel messages are normally sent out by a newsreader in response to a user's request to cancel a message, using a newsreader command, *if* the user was also the original poster of the message. Sites will ignore cancel messages that don't appear to come from the original poster. Cancel-bots work around this restriction by using header lines that make it look like the original poster sent out the cancel; they'll usually add something like a "Cancelled-By" header line as well, to keep things nominally above-board.

Use of a cancel-bot against anything besides 'consensus spam' outrages people, as it should. See alt.religion.scientology for sample discussions.

For more information on cancels (especially in regards to net abuse), Tim Skirvin has written a very good FAQ, which used to be available at:


http://www.uiuc.edu/ph/www/tskirvin/cancel.faq

3.14) Where can I get me a cancel-bot?
If you have to ask, you should probably wait a while.

3.15) How do the spam-cancellers cancel spam?
They make bloody sure they know how to use their cancel-bot;
They confirm the spam themselves;
They announce their action to n.a.n-a.announce. This prevents everyone from waiting around and wondering whether anyone's done anything.

Here's a standard section from an old cancel-notification post by the beloved Cancelmoose(TM):

The $alz cancel. and Path: cyberspam conventions were followed. [The $alz convention is to create your cancel message-ID by prepending 'cancel.' to the original one. The cyberspam convention is to use- 'Path: cyberspam!usenet' so that sites that do not want your cancels can easily opt out. Please use these when cancelling spam.]
Many more disclaimers are commonly added by modern spam cancellers, in an attempt to reduce confusion and misplaced anger.
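
The two conventions quoted above can be sketched in a few lines of Python. This is only an illustration, not anyone's actual cancel software: the function names are made up, the placement of 'cancel.' inside the angle brackets is an assumption, and the opt-out check simply models the usual rule that a site does not accept an article whose Path already lists one of the site's own names or aliases.

```python
def salz_cancel_id(original_id):
    """Build a cancel Message-ID by prepending 'cancel.' to the original.

    Assumption: the prefix goes inside the angle brackets.
    """
    inner = original_id.strip()
    if not (inner.startswith("<") and inner.endswith(">")):
        raise ValueError("expected a Message-ID of the form <local@host>")
    return "<cancel." + inner[1:-1] + ">"


def site_accepts(path_header, site_names):
    """Return False if any of the site's names or aliases is already in Path.

    site_names is a hypothetical set holding the site's own name plus any
    aliases the news administrator has configured (for example "cyberspam").
    """
    hops = path_header.split("!")
    return not any(hop in site_names for hop in hops)
```

A site aliased to "cyberspam" would therefore skip a cancel carrying 'Path: cyberspam!usenet', while a site without that alias would process it as usual.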

3.16) Can I sic The Man on these MAKE.MONEY.FAST losers (or other types of net abusers)?
You can complain about e-mail or Usenet pyramid schemes (at least those involving Americans somehow) to the Federal Trade Commission:

STAFF CONTACT: Bureau of Consumer Protection
Ms. Broder
bbroder@ftc.gov

Before doing so, consider seriously whether you actually want to encourage government intervention. The number of 'net cases the FTC has been involved in is very low at this point; in an ideal world, it would probably remain that way.
But if you really want to go after MMF lusers (or anybody spamming any type of tax fraud scheme), you can complain to the IRS:


] Subject: Reporting MMF to the IRS [long]
] Date: 11 Mar 1997 09:26:20 -0500
] Reply-To: Inspector Andrew Fried
]
] Over the past six months, my email address has appeared in the "fraud
] killer list", a list of agency contacts used to report potential tax
] fraud violations by the "make money fast" (MMF) Usenet spammers. Since
] complaints such as those don't fall under my specific area of
] jurisdiction, I have been manually forwarding all such messages to the
] appropriate department within my agency.
]
] In order to facilitate routing complaints to the IRS via email, I have
] established two special mailboxes. Email sent to those addresses will
] be automatically forwarded to the correct organizations within the
] Service. This will assure faster delivery and reduce congestion on
] my personal email account. The addresses are as follows:
]
] net-abuse@nocs.insp.irs.gov
] Use this address to report make money fast (MMF) schemes. Mail sent to
] this address will be forwarded to the Criminal Investigation Division
] (CID) for appropriate action.
]
] hotline@nocs.insp.irs.gov
] Mail sent to this address will be forwarded to Internal Security
] (Inspection), the IRS's "internal affairs" type organization. Internal
] Security is responsible for investigating criminal acts which attempt to
] corrupt our tax system. Internal Security is also responsible for the
] protection of all Service employees. Use this address to report
] attempted bribery of IRS employees, conspiracy to defraud the tax
] system, threats against the IRS or IRS employees or any other suspected
] criminal acts affecting the integrity of our tax system. Please don't
] forward the infamous "IRS Abuse" reports here.
]
] Reports of tax fraud should be sent directly to your regional IRS
] Service Center; there is currently no Internet email address for
] reporting those suspected offenses.
]
] Please distribute this message to newsgroup moderators and members of
] your newsgroups. Should you have any other non-tax related questions,
] feel free to write to me directly at:
] afried@nocs.insp.irs.gov
]
] --
] Inspector Andrew Fried IRS Internal Security
] Voice: (202) 622-3535 1111 Constitution Ave, NW
] Fax: (202) 622-8681 Washington, DC 20224

A non-governmental organization which deals in such things (and more) is the National Fraud Information Center, which is funded by grants from major corporations and works in cooperation with federal, state, local and international law enforcement agencies. Their purpose is to organize, classify, and forward "stuff" to the appropriate body: state's a.g., FTC, FBI, Secret Service, wherever.
Thus they are not "law enforcement" and the problems of inaction by local district attorneys, etc. persist (d.a's have "too much work to do" to go after an individual posting a chain letter). You can e-mail them at <nfic@internetmci.com>, or get information from their web page, which is at:


http://www.fraud.org/

For stock fraud and the like, some people have been complaining to the Securities and Exchange Commission at the address <enforcement@sec.gov>. And, they've started prosecuting. Please only send them reports of stock fraud, however -- they don't have the authority to deal with anything else.

3.17) What is a killfile, and how do I use one?
A killfile enables you to permanently avoid reading posts by certain people, or from a certain site, or whose Subject: lines contain particular words... Check out the RN killfile FAQ at:

http://www.cis.ohio-state.edu/hypertext/faq/usenet/killfile-faq/faq.html
If your newsreader doesn't allow killfiling (some news clients call 'em "filters"), write the author of the software and ask them to add support for killfiles. 'The "Good Net-Keeping Seal of Approval" for Usenet Software', which recommends that filtering be included in all news clients, can be viewed at:


http://www.xs4all.nl/%7Ejs/gnksa/
for more information on what makes a good newsreader.

And, for good advice on who to ignore, see the Global Killfile:


http://www.uiuc.edu/ph/www/tskirvin/global/

3.18) How do I killfile all crossposted messages?
It's becoming quite common for people to killfile all messages crossposted to more than X newsgroups, because this cuts down on the amount of blatantly off-topic crap they have to read.
This is simplest to do in the rn family (rn, trn, strn, etcetera) using a killfile entry like the following:


/^Newsgroups: .*,.*,.*,.*,.*,./h:,
That one kills anything posted to six or more groups (the pattern requires five commas), plus all of the followups in that thread (that's what the comma at the end means). For fewer groups, use fewer .* entries; for more groups, use more.
Peter Kappesser suggests a somewhat more efficient form for servers which support the Xref extension to the News Overview database file (if you aren't sure if your server supports it, just check and see if there's an Xref: header in the messages you see. If there is, it does.):


/:.*:.*:.*:.*:.*:/HXref:,
In this, the number of colons equals the threshold number of groups. This is more efficient because the Xref header line is transferred with the NOV file when you enter the group, so trn can process it quickly. If you kill on the Newsgroups line, trn has to fetch from the server at least the header for every article in the group in order to examine it for the kill.
One slight difference is that Xref contains only those groups carried by the server, which may not necessarily be all those listed in Newsgroups. However, this isn't often a problem -- most ECP's are to a dozen or more groups, so it doesn't matter that Newsgroups lists 27 groups while Xrefs only has 18, it's still greater than 6!
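
To see what the two kill patterns actually match, here is a rough Python equivalent. It is a sketch for experimenting, not a replacement for the killfile entries: the function names are invented, and it assumes that the Xref pattern is applied to the header's value only (without the "Xref:" name itself), which is what makes one colon correspond to one group.

```python
import re

# The Newsgroups pattern: five commas, so it matches a line
# listing at least six groups.
NEWSGROUPS_KILL = re.compile(r"^Newsgroups: .*,.*,.*,.*,.*,.")

# The Xref pattern: six colons, one per group carried by the server.
XREF_KILL = re.compile(r":.*:.*:.*:.*:.*:")


def newsgroups_killed(header_line):
    """True if the full 'Newsgroups: ...' line would be killed."""
    return NEWSGROUPS_KILL.search(header_line) is not None


def xref_killed(xref_value):
    """True if the Xref header value (name stripped) would be killed."""
    return XREF_KILL.search(xref_value) is not None


six = "Newsgroups: a.one,a.two,a.three,a.four,a.five,a.six"
five = "Newsgroups: a.one,a.two,a.three,a.four,a.five"
print(newsgroups_killed(six), newsgroups_killed(five))
```

Adding or removing one `.*,` (or one `.*:`) moves the threshold up or down by one group.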


3.19) What is the Usenet Death Penalty (UDP)?
There are two different things commonly referred to as "UDP."
The one least argued about could be called "shunning" or "aliasing," in which a newsadmin (running INN unoff3 or above, or using the 'shun' patch to earlier versions of INN) can add a site's pathhost to their ME line. They simply won't get any messages from that site. Some may consider this censorship, but it fits quite well with the simple but often forgotten concept that a newsadmin can do whatever they want on their own machine so long as it doesn't cause any problems for other newsadmins.

The other Usenet Death Penalty is automatic cancellation of all messages from a site, or from a person, or based on a regular expression. This is sometimes done when a spam (or spew) continues unabated even after the spam cancellers and other net-abuse activists have attempted to contact somebody and ask them to stop. As you can guess, there are arguments about this which have literally been going on for years.

Currently, the general consensus among news.admin.net-abuse.misc participants is that UDP of either type should only be employed after every other method has been tried and failed.

In the useless trivia column, the term "Usenet Death Penalty" was first coined by Eliot Lear. The first software to perform it was written three years earlier by Karl Kleinpaste in 1990, and was 28 lines long. Karl is also known as being the author of the anonymous server software.

The second (previous versions of the FAQ referred to it as the first) was written by Rich $alz (the inventor of INN) in Perl in April, 1993. It was 76 lines long, including instructions for use.


3.20) Do all hierarchies have the same rules?
Nope. This FAQ mainly deals with what's considered net abuse in the "Big 8" (comp.*, humanities.*, misc.*, news.*, rec.*, sci.*, soc.*, and talk.*) and alt.* (we also touch on biz.* a little bit.) But there are many hierarchies -- especially regional and local -- which have begun to adopt much stricter policies on net abuse.
The main reason behind this is that the local hierarchies usually have a smaller target audience. For example, dc.* exists for the Washington, D.C. metropolitan area, fl.* for the state of Florida, and so forth. Long ago in the history of Usenet (okay, it was only two or three years ago) all the news hosts in Florida traded fl.* with each other, and it didn't leak too far out-of-state -- but now, with so many national news providers, you can read fl.* pretty much anywhere in the world.

The point, however, is that just because you have /access/ to a hierarchy doesn't mean your message is appropriate for it. Many locally oriented groups, especially *.forsale and *.jobs groups, are deluged with non-local messages, which are often crossposted to a large number of different, incongruent local hierarchies. While these don't individually set off alarms on the world's spam-watching software, they can make a group become useless for local postings because it's so hard to wade through all the misplaced stuff.

So, most local hierarchies now have people (or, more often, groups of people) watching over them, sending copies of the FAQ or Charter to people who post inappropriately, and -- in extreme situations -- cancelling the misplaced messages. Cancellation after the fact is commonly referred to as "retromoderation," and is still a topic of hot debate.

For more specific information, the Regional Guidelines and Periodic Postings Database can be viewed at:


http://www.unicom.com/regional/

Or, watch the group itself for a while to see if there're rules of any type. Remember that in this case, "a while" means at least two weeks, since FAQs don't get posted every day, and "but I saw other people advertising their thigh cream here!" is a really lame excuse.
There is also a mailing list dedicated to discussing the mechanics and policies that regional FAQ maintainers and retromoderators follow. For more information, contact <us-region-request@megalith.miami.fl.us>.


3.21) How about we start a campaign to stop all the spammers?
We already did -- and it's about time!

http://spam.abuse.net/


--------------------------------------------------------------------------------

GROAN
4.1) I hate net-cops like you people.
Who will watch the watchmen? net-cop.cops like this, apparently. ;} Anyways, anyone who wanted to police the net would be a pig-headed, unrealistic fool. Thankfully, we (the regular participants in news.admin.net-abuse.*) just want to stop spam.

Anyways, if you don't like spam being cancelled at your site, you can alias your site to "cyberspam". (Actually, you can only do that if you're the newsadmin -- but users are subject to the whim of their newsadmin anyway, and if you don't like your newsadmin's policies, you can always just build your own server and get a feed from someplace else.)


4.2) Isn't cyberporn a bigger problem than spamming?
No matter what the more sensationalistic media outlets may try to tell you, "cyberporn" is not a real problem. For more information, see cyberNOTHING's Cyberporn Report, at:

http://www.cybernothing.org/cno/reports/cyberporn.html

As for illegal stuff, like child pornography -- there are existing laws against that in most countries, so those people will go to jail, and good riddance.

Net abuse, as described in this document, is a big problem, and will continue to be a problem unless Something Is Done.

Nevertheless, a case could be made that other issues (Government-imposed censorship, loss of natural resources, etcetera) are more or equally important. But that's not what this FAQ, or the net-abuse newsgroups, are about.

4.3) Hey, I think my group's being invaded by alt.syntax.tactical!
I'm sorry to hear that. Please don't bring that subject up again here. Good luck... Keith "Justified and Ancient" Cochran, who has been wrongfully accused of a.s.t involvement himself, adds: "I would suggest the first thing you do is take a chill pill." (Note that there is no second thing to do. However, you may want to pass the time reading the alt.bigfoot FAQ:

http://www.cis.ohio-state.edu/hypertext/faq/usenet/bigfoot/top.html
--particularly the part about cats.)

See also 3.17, "What is a killfile, and how do I use one?"


4.4) Hey, I think my group's being invaded by the "Usenet Freedom Council!"
The abusive "Usenet Freedom Council" seems to be made up of a number of accounts all owned & operated by Dr. John Grubor, a.k.a. Manus, a.k.a. DrG, a.k.a DrGodFuck, ad nausea infinitum. It used to include former Kook of the Month Steve Boursy, and former Kook of the Month Nominee Vladimir Fomin (who also no longer has access to the net under that pseudonym.)
Now that news.admin.* people have pretty much unanimously killfiled him, he's started going to other newsgroups and attempting to get outraged responses from people by posting what can only be described as patent bullshit.

The best thing to do is ignore him. This is, of course, made easier with a good killfile (see 3.17, "What is a killfile, and how do I use one?"). The REAL "Usenet Freedom Council" was dreamt up by Dave Hayes. The best way to understand it is to view his "Freedom Knights" home page, at:

http://www.jetcafe.org/~dave/usenet/

Afterwards, I'd suggest reading "Dave Hayes / Freedom Knights: An Alternative View," which some feel is a little more realistic (and there are even those who say it's being too nice.)
http://www.cybernothing.org/faqs/dave-hayes.html

4.5) Hey, somebody posted an ad in {newsgroup}!
So?
All right, all right: first, check to see if the post was obviously forged (see 3.5, "How can I tell if a post is forged?")

Then check to see if it's spam (see 2.1, "What is Spam?"). It's probably not. We only want to hear about it if it's spam.

If the ad is off-topic, and you really can't let it go, check out the advice in 4.6, "Hey, so-and-so's not being nice in {newsgroup}!"


4.6) Hey, so-and-so's not being nice in {newsgroup}!
Happens all the time. We don't want to hear about it. However, here are some things you can do (written by Keith "Justified and Ancient" Cochran):

"The first thing to do is take it up with user@some.site. If you can't achieve a mutual understanding, then you _MIGHT_ (note, not WILL, _MIGHT_) want to mail postmaster@some.site with your complaint. If you are going to write to postmaster@some.site, be sure to include the full, unedited post you have a problem with, a short but descriptive summary of why you have a problem with it, and a short, but descriptive explanation of what you would like to have happen.

"Note that this does not apply to MAKE.MONEY.FAST. If you see a copy of M.M.F, just e-mail postmaster@some.site, including the article ID, and the first paragraph of the post."

Of course, the descriptive explanation of what you would like to have happen must also be realistic. Since most ISP's have a policy regarding commercial posts, it's common to ask the postmaster to reiterate or reinforce whatever policy they may have on hand, rather than asking right away for the user to be nuked. It's not nice to tell system administrators what to do -- especially if you don't know the entire situation yourself.

See also 3.17, "What is a killfile, and how do I use one?"


4.7) Hey, the "Good Times" virus--
...is a total, 100%, long-proven hoax. For the complete story, see:

http://www.nsm.smcm.edu/News/GTHoax.html

4.8) Hey, there's this (AT&T, Jerry Garcia, whatever) banner message in the newsgroup descriptions!
We know, we know... It's a fairly common prank to add bunches of newsgroups whose descriptions spell something out. Ask your local news administrator to remove the whole lot.

4.9) Hey, one of those net.cops posted an ad for {something}! Haw! Haw!
"Ad" does not equal "spam".
"Ad" does not equal "net-abuse".

--------------------------------------------------------------------------------
This document is Copyright 1994, 1995, 1996, 1997, and 1998 by Scott Southwick and J.D. Falk. Permission is granted for it to be reproduced electronically on any system connected to the various networks which make up the Internet, USENET, and FidoNet so long as it is reproduced in its entirety, unedited, and with this copyright notice intact.

  Sunday 14 October 2001 @ 23:39:32 #62
16972 Davilex
then hook me!!!
pi_1901878
[23:31:59] <Appie`> crisps from Gamma? YUCK!
http://homocultuur.startkabel.nl
  † In Memoriam † Sunday 14 October 2001 @ 23:43:27 #63
13819 Loedertje
Proud GILF.
pi_1901920
Could you maybe explain that? I don't understand a fuck of it.
quote:
On Sunday 14 October 2001 23:34, calvobbes wrote the following:
*hopa*


The XML FAQ
Editor: Peter Flynn (pflynn@ucc.ie)
Originally maintained on behalf of the World Wide Web Consortium's XML Special Interest Group
v. 2.01 (2001-06-19) Frequently Asked Questions about the Extensible Markup Language
Introduction
This is the list of Frequently-Asked Questions about the Extensible Markup Language. It is restricted to questions about XML: if you are seeking answers to questions about HTML, scripts, Java, databases, or penguins, you may find some pointers, but you should probably look elsewhere as well. It is intended as a first resource for users, developers, and the interested reader, and does not form part of the XML Specification.

Thanks
The following people have helped with contributions:

Terry Allen, Tom Borgman, Tim Bray, Robin Cover, Bob DuCharme, Christopher Maden, Eve Maler, Makoto Murata, Peter Murray-Rust, Liam Quin, Michael Sperberg-McQueen, Joel Weber

...plus many other members of the SIG as well as FAQ readers around the world. Please mail any corrections or additions to the editor. Sadly, the form for comments found at the end of previous versions has had to be discontinued due to abuse. Please post questions to the relevant mailing list or newsgroup, not to the editor.

Recent changes
2.0 June 2001 DTD changed from DocBook SGML to QAML XML; removed query form; most questions revised and in some cases rewritten; updated references to new versions of associated standards, recommendations, and working drafts; added pointer to Jon Noring's Unicode test page and NIST's XSLT/XPath test suite; updated Eve Maler's links to the DTD for the spec; added warnings on speling and punk chew asian; added question on namespaces; fixed bug in question on stylesheets; inserted explanation of `document' vs `data' software; added new mailing list on XSL:FO; updated Robin Cover's URL throughout; updated the question on media types for RFC 3023; Extended question of graphics to cover SVG. For 2.01 there were minor typos, some updated links (to recent versions of the standards, and in the section on More Information), and a few wording changes. Thanks to James Cummings for a very thorough proofread.

History
Organisation
The FAQ is divided into four sections: General, User, Author, and Developer. The questions are numbered independently within each section. As the numbering may change with each version, comments and suggestions should refer to the version number (see above) as well as the Section and Question Number.

Please submit bug reports, suggestions for improvement, and other comments relating to this FAQ only to the maintainer at pflynn@ucc.ie. Comments about the XML Specification itself and related specifications should be directed to the W3C.

Availability
This is the first entirely XML version: it was delayed due to (human) difficulties about which DTD was most suitable. I finally picked QAML for its simplicity over DocBook, but it has meant a few changes in the internal subset (see the XML file) and a change in the content model for span to allow embedded links.

The XML master is at http://www.ucc.ie/xml/faq.xml. You can download it in text-mode as well;
The HTML version is at http://www.ucc.ie/xml/index.html;
A plaintext (ASCII) version is at http://www.ucc.ie/xml/faq.txt. A notification of the plaintext version is occasionally posted to comp.text.xml for the archives.
For printed copies there are versions for A4 PostScript, A4 PDF, Letter PostScript and Letter PDF configurations available. Viewers can be downloaded for PostScript and PDF formats.
WAP (if anyone's still using it), OEB (eBook) and cHTML versions are in development for your handheld devices.
The FAQ is also available in carbon-based toner on flattened dead trees by sending US$10 (or equivalent) to the editor (email first to check currency and postal address).
Translations (those I know about) are at:

Japanese: http://www.fxis.co.jp/DMS/sgml/cafe/library/etc/xmlfaq.html [Murata Makoto];
Spanish: http://slug.ctv.es/~olea/sgml-esp/xfaq15.html [Jaime Sagarduy];
Korean: http://xml.t2000.co.kr/faq/index.html [Kangchan Lee];
Chinese: http://zxd.webjump.com/xml.html [Neko] and http://weblab.crema.unimi.it/xmlzh/XML_FAQ.htm [Jiang Luqin];
French: http://www.gutenberg.eu.org/pub/GUTenberg/publications/HTML/FAQXML/faqxml-fr.html [Jacques André];
Czech: http://zvon.vscht.cz/ZvonHTML/Translations/xmlFAQ/front_all.html [Miloslav Nic];
You can download the XML logo as a GIF, JPG, or EPS file; and an icon for your file system in ICO (Microsoft Windows), Mac, or XPM (X Window system) format.
List of Questions

A.1. What is XML?
A.2. What is XML for?
A.3. What is SGML?
A.4. What is HTML?
A.5. Aren't XML, SGML, and HTML all the same thing?
A.6. Who is responsible for XML?
A.7. Why is XML such an important development?
A.8. Why not just carry on extending HTML?
A.9. Why do we need all this SGML stuff? Why not just use Word or Notes?
A.10. Where do I find more information about XML?
A.11. Where can I discuss implementation and development of XML?
A.12. What is the difference between XML and C or C++?

B.1. What do I have to do to use XML?
B.2. Why should I use XML instead of HTML?
B.3. Where can I get an XML browser?
B.4. Do I have to switch from SGML or HTML to XML?

C.1. Does XML replace HTML?
C.2. Do I have to know HTML or SGML before I learn XML?
C.3. What does an XML document look like inside?
C.4. How does XML handle white-space in my documents?
C.5. Which parts of an XML document are case-sensitive?
C.6. How can I make my existing HTML files work in XML?
C.7. Is there an XML version of HTML?
C.8. If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?
C.9. I'm used to authoring and serving HTML. Can I learn XML easily?
C.10. Can XML use non-Latin characters?
C.11. What's a Document Type Definition (DTD) and where do I get one?
C.12. How do I create my own DTD?
C.13. Does XML let me make up my own tags?
C.14. I keep hearing about alternatives to DTDs. What's a schema?
C.15. How do I upload or download XML to/from a database?
C.16. How will XML affect my document links?
C.17. Can I do mathematics using XML?
C.18. How does XML handle metadata?
C.19. Can I use Java, ActiveX, etc in XML files?
C.20. Can I use Java to create or manage XML files?
C.21. How do I execute or run an XML file?
C.22. How do I control appearance?
C.23. How do I use graphics in XML?

D.1. Where's the spec?
D.2. What are these terms DTDless, valid, and well-formed?
D.3. Which should I use in my DTD, attributes or elements?
D.4. What else has changed between SGML and XML?
D.5. What's a namespace?
D.6. What XML software can I use today?
D.7. Do I have to change any of my server software to work with XML?
D.8. Can I still use server-side inclusions?
D.9. Can I (and my authors) still use client-side inclusions?
D.10. I'm trying to understand the XML Spec: why does XML have such difficult terminology?
D.11. Is there a Developer's API kit for XML?
D.12. How does XML fit with the DOM?
D.13. Is there a conformance test suite for XML processors?
D.14. How do I include one DTD (or fragment) in another?
D.15. I've already got SGML DTDs: how do I convert them for use with XML?
D.16. What's the story on XML and EDI?

A. General questions
A.1 What is XML?
XML is the Extensible Markup Language. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.

It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is actually a `metalanguage' -- a language for describing other languages -- which lets you design your own customized markup languages for limitless different types of documents. XML can do this because it's written in SGML, the international standard metalanguage for text markup systems (ISO 8879).

A.2 What is XML for?
XML is intended `to make it easy and straightforward to use SGML on the Web: easy to define document types, easy to author and manage SGML-defined documents, and easy to transmit and share them across the Web.'

It defines `an extremely simple dialect of SGML which is completely described in the XML Specification. The goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML.'

`For this reason, XML has been designed for ease of implementation, and for interoperability with both SGML and HTML'

[Quotes are from the XML specification]. XML is not just for Web pages: it can be used to store any kind of structured information, and to enclose or encapsulate information in order to pass it between different computing systems which would otherwise be unable to communicate.
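
As a small illustration of storing structured information in XML and reading it back in a program, here is a sketch in Python using the standard library's xml.etree.ElementTree module. The recipe markup is invented purely for this example (the element and attribute names are not from any real DTD), and a real application would normally also define and validate against a document type.

```python
import xml.etree.ElementTree as ET

# A made-up document type for recipes; the names are hypothetical.
DOCUMENT = """<?xml version="1.0"?>
<recipe name="soda bread">
  <ingredient quantity="450" unit="g">flour</ingredient>
  <ingredient quantity="400" unit="ml">buttermilk</ingredient>
  <step>Mix and bake.</step>
</recipe>
"""

# Parse the text into a tree of elements.
root = ET.fromstring(DOCUMENT)

# Pull the structured data out by element name and attribute.
ingredients = [
    (item.text, item.get("quantity"), item.get("unit"))
    for item in root.findall("ingredient")
]

print(root.get("name"))
for name, quantity, unit in ingredients:
    print(quantity, unit, name)
```

The same file could be read by any other XML-aware program on any other system, which is the point of using a public format to pass information between them.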

A.3 What is SGML?
SGML is the Standard Generalized Markup Language (ISO 8879:1985), the international standard for defining descriptions of the structure of different types of electronic document. There is an SGML FAQ at http://www.infosys.utas.edu.au/info/sgmlfaq.txt which is posted every month to the comp.text.sgml newsgroup, and the SGML Web pages are at http://xml.coverpages.org/.

SGML is very large, powerful, and complex. It has been in heavy industrial and commercial use for over a decade, and there is a significant body of expertise and software to go with it. XML is a lightweight cut-down version of SGML which keeps enough of its functionality to make it useful but removes all the optional features which make SGML too complex to program for in a Web environment.

ISO standards like SGML are governed by the International Organization for Standardization in Geneva, Switzerland, and voted into or out of existence by representatives from every country's national standards body.

If you have a query about an international standard, you should contact your national standards body for the name of your country's representative on the relevant ISO committee or working group.

If you have a query about your country's representation in Geneva or about the conduct of your national standards body, you should contact the relevant government department in your country, or speak to your public representative.

The representation of countries at the ISO is not a matter for this FAQ. Please do not submit queries to the editor about how or why your ISO representatives have or have not voted on a specific standard.

A.4 What is HTML?
HTML is the HyperText Markup Language (RFC 1866), a small application of SGML used on the Web.

It defines a very simple class of report-style documents, with section headings, paragraphs, lists, tables, and illustrations, with a few informational and presentational items, and some hypertext and multimedia. See the question on extending HTML. There is also an XML version of HTML.

A.5 Aren't XML, SGML, and HTML all the same thing?
Not quite; SGML is the mother tongue, and has been used for describing thousands of different document types in many fields of human activity, from transcriptions of ancient Irish manuscripts to the technical documentation for stealth bombers, and from patients' clinical records to musical notation. SGML is very large and complex, however, and probably overkill for most common applications.

XML is an abbreviated version of SGML, to make it easier for you to define your own document types, and to make it easier for programmers to write programs to handle them. It omits all the options, and most of the more complex and less-used parts of SGML in return for the benefits of being easier to write applications for, easier to understand, and more suited to delivery and interoperability over the Web. But it is still SGML, and XML files may still be processed in the same way as any other SGML file (see the question on XML software).

HTML is just one of the SGML or XML applications, the one most frequently used in the Web.

Technical readers may find it more useful to think of XML as being SGML-- rather than HTML++.

A.6 Who is responsible for XML?
XML is a project of the World Wide Web Consortium (W3C), and the development of the specification is being supervised by their XML Working Group. A Special Interest Group of co-opted contributors and experts from various fields contributed comments and reviews by email.

XML is a public format: it is not a proprietary development of any company. The v1.0 specification was accepted by the W3C as a Recommendation on Feb 10, 1998.

A.7 Why is XML such an important development?
It removes two constraints which were holding back Web developments:

dependence on a single, inflexible document type (HTML);
the complexity of full SGML, whose syntax allows many powerful but hard-to-program options.

XML simplifies the levels of optionality in SGML, and allows the development of user-defined document types on the Web.

A.8 Why not just carry on extending HTML?
HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information.

XML allows groups of people or organizations to create their own customized markup applications for exchanging information in their domain (music, chemistry, electronics, hill-walking, finance, surfing, petroleum geology, linguistics, cooking, knitting, stellar cartography, history, engineering, rabbit-keeping, mathematics, genealogy, etc).

HTML is at the limit of its usefulness as a way of describing information, and while it will continue to play an important role for the content it currently represents, many new applications require a more robust and flexible infrastructure.

A.9 Why do we need all this SGML stuff? Why not just use Word or Notes?
Information on a network which connects many different types of computer has to be usable on all of them. Public information cannot afford to be restricted to one make or model or manufacturer, or to cede control of its data format to private hands. It is also helpful for such information to be in a form that can be reused in many different ways, as this can minimize wasted time and effort. Proprietary data formats, no matter how well documented or publicized, are simply not an option: their control still resides in private hands and they can be changed or withdrawn arbitrarily without notice.

SGML is the international standard for defining this kind of application, but those who need an alternative based on different software for other purposes are entirely free to implement similar services using such a system, especially if they are for private use.

A.10 Where do I find more information about XML?
Online, there's the XML Specification and ancillary documentation available from the W3C; Robin Cover's SGML/XML Web pages with an extensive list of online reference material and links to software; and a summary and condensed FAQ from Tim Bray.

The items listed below are the ones I have been told about. Please mail me if you come across others.

An annual XML Conference is run by the Graphic Communications Association. XML 2001 is in Orlando, Florida, on December 9-14. See the GCA's Web site for details.
The Extreme Markup Languages 2001 conference takes place on 12-17 August at Le Centre Sheraton, Montréal, Canada.
The annual XML Summer School takes place in Oxford on 20-25 July 2001.
There are many other XML events around the world: most of them announced on the mailing lists and newsgroups.

There are lists of books, articles, and software for XML in Robin Cover's SGML and XML Web pages. That site should always be your first port of call: please look there first before using the form in this FAQ to ask about software or documentation.

A.11 Where can I discuss implementation and development of XML?
The two principal online media are the Usenet newsgroups and the mailing lists. The newsgroups are comp.text.xml and to a certain extent comp.text.sgml. Ask your Internet Provider how to access these, or use a Web interface like Google.

The general-purpose mailing list for public discussion is XML-L: to subscribe, visit the Web site and click on the link to join. You can also access the XML-L archives from the same URL.
For those developing components for XML there is an xml-dev mailing list. You can subscribe by sending a 1-line mail message to xml-dev-request@lists.xml.org saying just SUBSCRIBE. The xml-dev archives are at OASIS http://lists.xml.org/archives/xml-dev/. Note that this list is for those people actively involved in developing resources for XML. It is not for general information about XML (see this FAQ and other sources) or for general discussion about XML implementation and resources (see below).
There is a list for discussing XSL, the stylesheet language: XSL-List. For details of how to subscribe, see http://www.mulberrytech.com/xsl/xsl-list.
Andrew Watt writes that there is a mailing list specifically for XSL-FO only, on eGroups.com. You can subscribe by sending a message to XSL-FO-subscribe@egroups.com.
When you join a mailing list you will be sent details of how to use it. Please Read The Fine Documentation because it contains important information, particularly about what to do if your company or ISP changes your email address.

Please note that there is a lot of inaccurate and misleading information published in print and on the Web about subscribing to mailing lists. Don't guess: read the documentation.

Mailing lists in other languages
Gianni Rubagotti writes: A new Italian mailing list about XML is born: to subscribe, send a mail message without a subject line but with text saying subscribe XML-IT to majordomo@ananas.usr.dsi.unimi.it. Everyone, Italian or not, who wants to debate about XML in our tongue is welcome.
JP Theberge writes: A French mailing list about XML has been created. To subscribe, send subscribe to xml-request@trisome.com.
Jarno Elovirta writes: a Finnish mailing list about XML has been set up. To subscribe, send an email to majordomo@evitech.fi with subscribe XML-Fin in the message body. The list is also hypermailed for online reference at http://users.evitech.fi/lists/xml-fin/.

A.12 What is the difference between XML and C or C++?
C and C++ (and other languages like FORTRAN, or Pascal, or BASIC, or Java or dozens more) are programming languages with which you specify calculations, actions, and decisions to be carried out in order:

mod curconfig[if left(date,6) = "01-Apr", t.put "April Fool!",
f.put days('31102001','DDMMYYYY')-days(sdate,'DDMMYYYY')
" shopping days to Samhain"];
XML is a markup specification language with which you can design ways of describing information (text or data), usually for storage, transmission, or processing by a program: it says nothing about what you should do with the data (although your choice of element names may hint at what they are for):

<part num="DA42" models="LS AR DF HG KJ" update="2001-11-22">
<name>Camshaft end bearing retention circlip</name>
<image drawing="RR98-dh37" type="SVG" x="476" y="226"/>
<maker id="RQ778">Ringtown Fasteners Ltd</maker>
<notes>Angle-nosed insertion tool <tool id="GH25"/> is
required for the removal and replacement of this item.</notes>
</part>
On its own, an SGML or XML file (and HTML) doesn't do anything. It's a data format which just sits there until you run a program which does something with it. See also the question about how to run or execute XML files.
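
As a rough illustration of that last point, here is a minimal sketch in Python (my choice of language, not something the example above depends on) of a program that does something with the part record. It uses the ElementTree module from the standard library; the function name summarize_part is made up for this example.

```python
import xml.etree.ElementTree as ET

# The part record from the example above, unchanged.
PART_XML = """<part num="DA42" models="LS AR DF HG KJ" update="2001-11-22">
<name>Camshaft end bearing retention circlip</name>
<image drawing="RR98-dh37" type="SVG" x="476" y="226"/>
<maker id="RQ778">Ringtown Fasteners Ltd</maker>
<notes>Angle-nosed insertion tool <tool id="GH25"/> is
required for the removal and replacement of this item.</notes>
</part>"""


def summarize_part(xml_text):
    """Return (part number, name, maker, list of models) from a <part> record."""
    part = ET.fromstring(xml_text)
    return (
        part.get("num"),
        part.findtext("name"),
        part.findtext("maker"),
        part.get("models").split(),
    )


if __name__ == "__main__":
    print(summarize_part(PART_XML))
```

The XML says only what the data is; deciding to pull out the name and maker, and to split the models attribute on spaces, is entirely the program's business.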

B. Existing users of SGML (including HTML: everyone who browses the Web)
B.1 What do I have to do to use XML?
For the average user of the Web, nothing except use a browser which works with XML (see the question about browsers). Remember some XML components are still being implemented, so some features are still either undefined or have yet to be written. Don't expect everything to work yet!

You can use XML browsers to look at some of the stable XML material, such as Jon Bosak's Shakespeare plays and the molecular experiments of the Chemical Markup Language (CML). There are some more example sources listed at http://xml.coverpages.org/xml.html#examples, and you will find XML (particularly in the disguise of XHTML) being introduced in places where it won't break older browsers.

If you want to start preparations for creating your own XML files, see the questions in the Authors' Section and the Developers' Section.

B.2 Why should I use XML instead of HTML?
Authors and providers can design their own document types using XML, instead of being stuck with HTML. Document types can be explicitly tailored to an audience, so the cumbersome fudging that has to take place with HTML can become a thing of the past: authors and designers are free to invent their own markup elements.
Information content can be richer and easier to use, because the descriptive and hypertext linking abilities of XML are much greater than those of HTML.
XML can provide more and better facilities for browser presentation and performance, using CSS and XSL stylesheets.
It removes many of the underlying complexities of SGML in favor of a more flexible model, so writing programs to handle XML is much easier than doing the same for full SGML.
Information will be more accessible and reusable, because the more flexible markup of XML can be used by any XML software instead of being restricted to specific manufacturers as has become the case with HTML.
Valid XML files are kosher SGML, so they can be used outside the Web as well, in existing SGML environments.

B.3 Where can I get an XML browser?
Remember the XML specification is still relatively new, so a lot of what you see now is experimental, and because the potential number of different XML applications is unlimited, no single browser can be expected to handle 100% of everything.

Some of the generic parts of XML (eg parsing, tree management, searching, formatting, etc) are being combined into general-purpose libraries or toolkits to make it easier for developers to take a consistent line when writing XML applications. Such applications can then be customized by adding semantics for specific markets, or using languages like Java to develop plugins for generic browsers and have the specialist modules delivered transparently over the Web.

MSIE5.5 handles XML but currently still renders it via the HTML model. Microsoft were also the architects of a hybrid (invalid) solution (islands) in which you could embed fragments of XML in HTML files because current HTML-only browsers simply ignored element markup which they didn't recognize, but this has now been superseded by XHTML. MSIE includes an implementation of an obsolete draft of XSLT (WD-xsl): you need to upgrade it and replace the parser (see http://www.netcrucible.com/ for details).
The publicly-released Netscape code (Mozilla) and the almost indistinguishable Netscape 6 (there is no v5) have XML/CSS support, based on James Clark's expat XML parser, and this seems to be more robust, if less slick, than MSIE. Mozilla 0.9 is reported to have some XSLT capability.
The authors of the former MultiDoc Pro SGML browser, CITEC, joined forces with Mozilla to produce a multi-everything browser called DocZilla, which reads HTML, XML, and SGML, with XSL and CSS stylesheets. This runs under NT and Linux and is currently still in the alpha stage. See http://www.doczilla.com for details. This is by far the most ambitious browser project, and is backed by solid SGML expertise, but seems to be rather a long time coming.
Opera now supports XML and CSS on MS-Windows and Linux and is the most complete implementation so far. The browser size is tiny by comparison with the others, but features are good and the speed is excellent, although the earlier slavish insistence on mimicking everything Netscape did, especially the bugs, still shows through in places.
See also the notes on software for authors and developers, and the more detailed list on the XML pages in the SGML Web site at http://xml.coverpages.org/.

B.4 Do I have to switch from SGML or HTML to XML?
No, existing SGML and HTML applications software will continue to work with existing files. But as with any enhanced facility, if you want to view or download and use XML files, you will need to use XML-aware software. There is much more being developed for XML than there ever was for SGML, so a lot of users are moving.

C. Authors of SGML (including writers of HTML: Web page owners)
C.1 Does XML replace HTML?
No. XML itself does not replace HTML: instead, it provides an alternative which allows you to define your own set of markup elements. HTML is expected to remain in common use for some time to come, and a Document Type Definition for HTML is available in XML syntax as well as in original SGML. XML is designed to make the writing of DTDs much simpler than with full SGML. (See the question on DTDs for what one is and why you might want one.)

C.2 Do I have to know HTML or SGML before I learn XML?
No, although it's useful because a lot of XML terminology and practice derives from 15 years' experience of SGML.

Be aware that `knowing HTML' is not the same as `understanding SGML'. Although HTML was written as an SGML application, browsers ignore most of it (which is why so many useful things don't work), so just because something is done a certain way in HTML browsers does not mean it's correct, least of all in XML.

C.3 What does an XML document look like inside?
The basic structure is very similar to most other applications of SGML, including HTML. XML documents can be very simple, with no document type declaration (and therefore no DTD), and straightforward nested markup of your own design:

<?xml version="1.0" standalone="yes"?>
<conversation>
<greeting>Hello, world!</greeting>
<response>Stop the planet, I want to get off!</response>
</conversation>
Or they can be more complicated, with a DTD specified (see the question on document types), and maybe an internal subset (local DTD changes in [square brackets]), and a more complex structure:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE titlepage SYSTEM "http://www.foo.bar/dtds/typo.dtd"
[<!ENTITY % active.links "INCLUDE">]>
<titlepage id="BG12273624">
<white-space type="vertical" amount="36"/>
<title font="Baskerville" size="24/30"
alignment="centered">Hello, world!</title>
<white-space type="vertical" amount="12"/>
<!-- In some copies the following decoration is
hand-colored, presumably by the author -->
<image location="http://www.foo.bar/fleuron.eps"
type="URL" alignment="centered"/>
<white-space type="vertical" amount="24"/>
<author font="Baskerville" size="18/22"
style="italic">Vitam capias</author>
<white-space type="vertical" class="filler"/>
</titlepage>
Or they can be anywhere between: a lot will depend on how you want to define your document type (or whose you use) and what it will be used for.

C.4 How does XML handle white-space in my documents?
The SGML rules regarding white-space have been changed for XML. All white-space, including linebreaks, TAB characters, and regular spaces, even between those elements where no text can ever appear, is passed by the parser unchanged to the application (browser, formatter, viewer, converter, etc), identifying the context in which the white-space was found (element content, data content, or mixed content). This means it is the application's responsibility to decide what to do with such space, not the parser's:

insignificant white-space between structural elements (space which occurs where only element content is allowed, ie between other elements, where text data never occurs) will get passed to the application (in SGML this white-space gets suppressed, which is why you can put all that extra space in HTML documents and not worry about it. This is not so in XML);
significant white-space (space which occurs within elements which can contain text and markup mixed together, usually mixed content or PCDATA) will still get passed to the application exactly as under SGML. It is the application's responsibility to handle it correctly.
<chapter>
<title>
My title for Section
1.
</title>
<p>
text
</p>
</chapter>
The parser must inform the application that white-space has occurred in element content, if it can detect it. (Users of SGML will recognize that this information is not in the ESIS, but it is in the grove.) In the above example, the application will receive all the pretty-printing linebreaks, TABs, and spaces between the elements as well as those embedded in the chapter title. It is the function of the application, not the parser, to decide which type of white-space to discard and which to retain.
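
To see this in practice, here is a small sketch using the ElementTree module from Python's standard library (one convenient parser, chosen only for illustration). The indentation in the sample is added here so the white-space is visible, and normalize_space is a made-up helper showing just one policy an application might adopt.

```python
import xml.etree.ElementTree as ET

# The chapter example, with pretty-printing indentation added for illustration.
CHAPTER_XML = """<chapter>
  <title>
    My title for Section
    1.
  </title>
  <p>
    text
  </p>
</chapter>"""

chapter = ET.fromstring(CHAPTER_XML)
title = chapter.find("title")

# White-space between <chapter> and <title> (element content) is handed over.
between_elements = chapter.text
# White-space inside the title, linebreaks included, is handed over too.
raw_title = title.text


def normalize_space(text):
    """One possible application policy: collapse runs of white-space."""
    return " ".join(text.split())


if __name__ == "__main__":
    print(repr(between_elements))
    print(repr(raw_title))
    print(normalize_space(raw_title))
```

Note that this particular parser reads no DTD here, so it cannot tell the application which of the white-space occurred in element content: the application has to make that decision itself.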

C.5 Which parts of an XML document are case-sensitive?
All of it, both markup and text. This is significantly different from HTML and most other SGML applications. It was done to allow markup in non-Latin-alphabet languages and to obviate problems with case-folding in scripts which are caseless.

Element type names are case-sensitive: you must stick with whatever combination of upper- or lower-case you use to define them (either by first usage or in a DTD). So you can't say <BODY>...</body>: upper- and lower-case must match; thus <IMG/> and <img/> are two different element types;
For well-formed files with no DTD, the first occurrence of an element type name defines the casing;
Attribute names are also case-sensitive, on a per-element basis: for example <PIC width="7in"/> and <PIC WIDTH="6in"/> in the same file exhibit two separate attributes, because the different casings of width and WIDTH distinguish them;
Attribute values are also case-sensitive. CDATA values (eg HRef="MyFile.SGML") always have been, but ID and IDREF attributes are now case-sensitive as well;
All entity names (&Aacute;), and your data content (text), are case-sensitive as always.
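
The rules above are easy to check with any XML parser. The sketch below uses Python's standard ElementTree module; is_well_formed is a made-up helper name, and the two width attributes are put on one element here (rather than on two as in the text) to show that the parser treats them as distinct.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Start-tag and end-tag casing must match exactly.
matching = is_well_formed("<BODY><IMG/><img/></BODY>")
mismatched = is_well_formed("<BODY>text</body>")

# <IMG/> and <img/> are reported as two different element types.
child_names = [child.tag for child in ET.fromstring("<BODY><IMG/><img/></BODY>")]

# width and WIDTH are two separate attributes, so both can appear at once.
pic = ET.fromstring('<PIC width="7in" WIDTH="6in"/>')

if __name__ == "__main__":
    print(matching, mismatched, child_names, pic.attrib)
```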

C.6 How can I make my existing HTML files work in XML?
Either convert them to conform to some new document type (with or without a DTD) and write a stylesheet to go with them; or edit them to conform to XHTML.

It is necessary to convert existing HTML files because XML does not permit end-tag minimization (missing </p>, etc), unquoted attribute values, and a number of other shortcuts which are normal in most HTML DTDs. However, many HTML authoring tools already produce almost (but not quite) well-formed XML. As a preparation for XML, the W3C's HTML Tidy program can clean up some of the formatting mess left behind by inadequate HTML editors, and even separate out some of the formatting to a stylesheet, but there is usually still some hand-editing to do.

Converting to a new document type
If you want to move your files out of HTML into some other DTD entirely, there are already many native XML application DTDs, and several XML versions of popular SGML DTDs like TEI and DocBook to choose from. There is a pilot site run by CommerceNet (http://www.xmlx.com/) for the exchange of XML DTDs.

Alternatively you could just make up your own markup: so long as it makes sense and you create a well-formed file, you should be able to write a CSS or XSLT stylesheet and have your document displayed in a browser.

Converting valid HTML to XHTML
If your HTML files are valid (full formal validation with an SGML parser, not just a simple syntax check), then try validating them as XHTML. If you have been creating clean HTML without embedded formatting then this process should throw up only mismatches in upper/lowercase element and attribute names, and empty elements (plus perhaps the odd non-standard element type name if you use them). Simple hand-editing or a short script should be enough to make these changes.

If your HTML validly uses end-tag omission, this can be fixed automatically by a normalizer program like sgmlnorm (part of SP) or by the sgml-normalize function in an editor like Emacs/psgml (don't be put off by the names, they both do XML).

If you have a lot of valid HTML files, you could write a script to do this in a programming language which understands SGML/XML markup (such as Omnimark, Balise, SGMLC, or a system using one of the SGML libraries for Perl, Python, or Tcl), or you could even use editor macros if you know what you're doing.

Converting invalid HTML to well-formed XHTML
If your files are invalid HTML (95% of the Web) they can be converted to well-formed DTDless files as follows:

replace the DOCTYPE Declaration with the XML Declaration <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> (the encoding must come before standalone). If there was no DOCTYPE Declaration, just prepend the XML Declaration.
change any EMPTY elements (eg every <ISINDEX>, <BASE>, <META>, <LINK>, <NEXTID> and <RANGE> in the header, and every <IMG>, <BR>, <HR>, <FRAME>, <WBR>, <BASEFONT>, <SPACER>, <AUDIOSCOPE>, <AREA>, <PARAM>, <KEYGEN>, <COL>, <LIMITTEXT>, <SPOT>, <TAB>, <OVER>, <RIGHT>, <LEFT>, <CHOOSE>, <ATOP>, and <OF> in the body of the document) so that they end with /> instead, for example <img src="mypic.gif" alt="Picture"/>;
make all element names and attribute names lowercase;
ensure there are correctly-matched explicit end-tags for all non-empty elements; eg every <p> must have a </p>, etc;
escape all < and & non-markup (ie literal text) characters as &lt; and &amp; respectively (there shouldn't be any isolated < characters to start with);
ensure all attribute values are in quotes.
Be aware that many HTML browsers may not accept XML-style EMPTY elements with the trailing slash, so the above changes may not be backwards-compatible. An alternative is to add a dummy end-tag to all EMPTY elements, so <IMG src="foo.gif"/> becomes <img src="foo.gif"></img>. This is still valid XML provided you guarantee never to put any text content in such elements. Adding a space before the slash (eg <img src="foo.gif" />) may also fool older browsers into accepting XHTML as HTML.
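
Several of the steps above (lowercase names, quoted attribute values, escaped < and &, and the trailing slash on EMPTY elements) can be sketched with the escape and quoteattr functions from Python's standard xml.sax.saxutils module. This is only an illustration of writing out one element correctly: make_element is a made-up helper, and it is no substitute for a real converter that reads the broken HTML in the first place.

```python
from xml.sax.saxutils import escape, quoteattr


def make_element(name, attrs, text=None):
    """Build one lowercase element with quoted attributes and escaped text."""
    name = name.lower()
    attr_text = "".join(
        " %s=%s" % (key.lower(), quoteattr(value)) for key, value in attrs.items()
    )
    if text is None:
        # EMPTY element: XML-style trailing slash, with a space before it
        # for the benefit of older HTML browsers.
        return "<%s%s />" % (name, attr_text)
    return "<%s%s>%s</%s>" % (name, attr_text, escape(text), name)


if __name__ == "__main__":
    print(make_element("IMG", {"SRC": "mypic.gif", "ALT": "Picture"}))
    print(make_element("P", {}, "Fish & chips < 5"))
```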

If your HTML files fall into this category (HTML created by some WYSIWYG editors is frequently invalid) then they will almost certainly have to be converted manually, although if the deformities are regular and carefully constructed, the files may actually be almost well-formed, and you could write a program or script to do as described above. The oddities you may need to check for include:

Do the files contain markup syntax errors? For example, are there any missing angle-brackets, backslashes instead of forward slashes on end-tags, or elements which nest incorrectly (eg <B>an element starting <I>inside another</B> but ending outside</I>)?
Are there any URLs (eg in hrefs or srcs) which use backslashes instead of forward slashes?
Do the files contain markup which conflicts with HTML DTDs, such as headings or lists inside paragraphs, list items outside list environments, header elements like <base> preceding the first <html>, etc?
Do the files use imaginary elements which are not in any known HTML DTD? (large amounts of these are used in proprietary markup systems masquerading as HTML). Although this is easy to transform to a DTDless well-formed file (because you don't have to define elements in advance) most proprietary or browser-specific extensions have never been formally defined, so it is often impossible to work out meaningfully where the element types can be used.
Are there any non-ISO Latin-1 (8859-1) characters or wrongly-coded characters in your files? Look especially for native Apple Mac characters left by careless designers, or any of the illegal characters (the 32 characters at decimal codes 128-159 inclusive) inserted by MS-Windows editors. These need to be converted to the correct characters in ISO 8859-1 or the relevant plane of Unicode (and the XML Declaration should show iso-8859-1 encoding unless you specifically know otherwise).
Do your files contain malformed (Mosaic/Netscape-style) comments? Comments must look <!-- like this --> with double-dashes each end and no double dashes in between (safest: no multiple dashes in between).
If you answer Yes to any of these, you can save yourself a lot of grief by fixing those problems first before doing anything else. You will likely then be getting close to having well-formed files.
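
The check for illegal characters at codes 128-159 is simple to automate. The sketch below is in Python with made-up function names, and it assumes the stray bytes really did come from an MS-Windows editor (code page 1252); if they came from somewhere else, the recoding step will produce the wrong characters.

```python
def find_c1_characters(data):
    """Return (offset, byte value) for every byte in the range 128-159."""
    return [(i, b) for i, b in enumerate(data) if 128 <= b <= 159]


def recode_windows_text(data):
    """Reinterpret the bytes as Windows-1252 and re-encode them as UTF-8.

    Assumes the bytes came from an MS-Windows editor; bytes which
    Windows-1252 leaves undefined will raise UnicodeDecodeError.
    """
    return data.decode("cp1252").encode("utf-8")


# "Smart quotes" and a dash as an MS-Windows editor might have saved them.
sample = b"<p>\x93Hello\x94 \x96 world</p>"

if __name__ == "__main__":
    print(find_c1_characters(sample))
    print(recode_windows_text(sample))
```

Characters like curly quotes have no place in ISO 8859-1 at all, which is why this sketch writes UTF-8: a file recoded this way should declare UTF-8 (or no encoding) in its XML Declaration, not iso-8859-1.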

Markup which is syntactically correct but semantically meaningless or void should be edited out before conversion. Examples are spacing devices such as repeated empty paragraphs or linebreaks, empty tables, invisible spacing GIFs etc: XML uses stylesheets, so you won't need any of these.

Unfortunately there is rather a lot of work to do if your files are invalid: this is why many professional Webmasters will always insist that only valid or well-formed files are used (and why you should instruct designers to do the same), in order to avoid unnecessary manual maintenance and conversion costs later.

C.7 Is there an XML version of HTML?
The W3C has released XHTML as `a reformulation of HTML 4 in XML 1.0'. This specification defines HTML as an XML application, and provides three DTDs corresponding to the ones defined by HTML 4.0. The semantics of the elements and their attributes are as defined in the W3C Recommendation for HTML 4.0. These semantics provide the foundation for future extensibility of XHTML. Compatibility with existing HTML user agents is possible by following a small set of guidelines.

C.8 If XML is just a subset of SGML, can I use XML files directly with existing SGML tools?
Yes, provided you use up-to-date SGML software which knows about the WebSGML Adaptations to ISO 8879 (the features needed to support XML, such as the variant form for EMPTY elements; some aspects of the SGML Declaration such as NAMECASE GENERAL NO; multiple attribute token list declarations, etc).

An alternative is to use an SGML DTD to let you create a fully-normalized SGML file, but one which does not use empty elements; and then remove the DocType Declaration so it becomes a well-formed DTDless XML file.

Most SGML tools now handle XML files well, and provide an option to switch between the two standards (see the pointers in the question on software).

C.9 I'm used to authoring and serving HTML. Can I learn XML easily?
Yes, very easily, but at the moment there is still a need for tutorials, simpler tools, and more examples of XML documents. Well-formed XML documents may look similar to HTML except for some small but very important points of syntax.

The big practical difference is that XML has to stick to the rules. HTML browsers let you serve them broken or corrupt HTML because they don't do a formal parse but elide all the broken bits instead. With XML your files have to be correct or they simply won't work at all. One outstanding problem is that some browsers claiming XML conformance are also broken. Try yours on the test file at http://www.ucc.ie/test.xml.
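
You can see the difference for yourself with any conforming parser. This sketch uses Python's standard ElementTree module (parse_or_report is a made-up helper): the sloppy fragment is the sort of thing an HTML browser would quietly repair, but an XML parser stops and reports where it gave up.

```python
import xml.etree.ElementTree as ET


def parse_or_report(xml_text):
    """Return (root name, None) on success, or (None, (line, column)) on failure."""
    try:
        return ET.fromstring(xml_text).tag, None
    except ET.ParseError as err:
        return None, err.position


# Unclosed elements: tolerated by HTML browsers, fatal in XML.
sloppy = "<p>An unclosed <b>bold paragraph<p>Another one"
clean = "<p>A closed <b>bold</b> paragraph</p>"
# Unquoted attribute values are also fatal.
unquoted = "<img src=foo.gif/>"

if __name__ == "__main__":
    for sample in (clean, sloppy, unquoted):
        print(parse_or_report(sample))
```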

C.10 Can XML use non-Latin characters?
Yes, the XML Specification explicitly says XML uses ISO 10646, the international standard 31-bit character repertoire which covers most human (and some non-human) languages. This is currently congruent with Unicode and is planned to be a superset of Unicode.

The spec says (2.2): `All XML processors must accept the UTF-8 and UTF-16 encodings of ISO 10646...'. UTF-8 is an encoding of Unicode into 8-bit characters: the first 128 are the same as ASCII, the rest are used to encode the rest of Unicode into sequences of between 2 and 6 bytes. UTF-8 in its single-octet form is therefore the same as ISO 646 IRV (ASCII), so you can continue to use ASCII for English or other unaccented languages using the Latin alphabet. Note that UTF-8 is incompatible with ISO 8859-1 (ISO Latin-1) after code point 127 decimal (the end of ASCII). UTF-16 is an encoding of Unicode into 16-bit characters, which uses pairs of 16-bit values (surrogates) to represent the 16 planes beyond the first. UTF-16 is incompatible with ASCII because it uses at least two 8-bit bytes per character.

`...the mechanisms for signalling which of the two are in use, and for bringing other encodings into play, are [...] in the discussion of character encodings.' The XML Specification explains how to specify in your XML file which coded character set you are using.

Use of UCS-4 can only legally be specified in SGML or XML when the WebSGML Adaptations to ISO 8879 are implemented: this enables numbers longer than eight digits to be used in the SGML Declaration.

`Regardless of the specific encoding used, any character in the ISO 10646 character set may be referred to by the decimal or hexadecimal equivalent of its bit string': so no matter which character set you personally use, you can still refer to specific individual characters from elsewhere in the encoded repertoire by using &#dddd; (decimal character code) or &#xHHHH; (hexadecimal character code; the x must be lowercase, but the digits can be in either case). The terminology can get confusing, as can the numbers: see the ISO 10646 Concept Dictionary. Rick Jelliffe has XML-ized the ISO character entity sets. Mike Brown's encoding information at http://skew.org/xml/tutorial/ is a very useful explanation of the need for correct encoding. There is an excellent online database of glyphs and characters in many encodings from the Estonian Language Institute server at http://www.eki.ee/letter/.
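
A few lines of Python (standard library only, and purely as an illustration) make the byte-level differences concrete, and show that decimal and hexadecimal character references name the same character whatever encoding the file itself uses. The sample name is invented for the example.

```python
import xml.etree.ElementTree as ET

# ASCII characters are the same single bytes in UTF-8.
ascii_bytes = "Hello".encode("utf-8")

# Above the ASCII range the encodings part company: e-acute is one byte
# in ISO 8859-1 but a two-byte sequence in UTF-8.
latin1_bytes = "\u00e9".encode("iso-8859-1")
utf8_bytes = "\u00e9".encode("utf-8")

# UTF-16 uses two bytes even for an ASCII character.
utf16_bytes = "A".encode("utf-16-be")

# Decimal and hexadecimal references to the same character (e-acute).
doc = ET.fromstring("<name>Andr&#233; and Andr&#xE9;</name>")
# The hexadecimal digits may also be written in lowercase.
lower = ET.fromstring("<c>&#xe9;</c>")

if __name__ == "__main__":
    print(ascii_bytes, latin1_bytes, utf8_bytes, utf16_bytes)
    print(doc.text, lower.text)
```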

C.11 What's a Document Type Definition (DTD) and where do I get one?
A DTD is a formal description in XML Declaration Syntax of a particular type of document. It sets out what names are to be used for the different types of element, where they may occur, and how they all fit together. For example, if you want a document type to be able to describe Lists which contain Items, the relevant part of your DTD might contain something like this:

<!ELEMENT List (Item)+>
<!ELEMENT Item (#PCDATA)>
This defines a list as an element type containing one or more items (that's the plus sign); and it defines items as element types containing just plain text (Parsed Character Data or PCDATA). Validating parsers read the DTD before they read your document so that they can identify where every element type ought to come and how each relates to the other, so that applications which need to know this in advance (most editors, search engines, navigators, databases) can set themselves up correctly. The example above lets you create lists like:

<List><Item>Chocolate</Item><Item>Music</Item><Item>Surfing</Item></List>
How the list appears in print or on the screen depends on your stylesheet: you do not normally put anything in the XML to control formatting like you had to do with HTML before stylesheets. This way you can change style easily without ever having to edit the document itself.

A DTD provides applications with advance notice of what names and structures can be used in a particular document type. Using a DTD when editing files means you can be certain that all documents which belong to a particular type will be constructed and named in a consistent and conformant manner. DTDs are less important for processing documents already known to be well-formed, but they are still needed if you want to take advantage of XML's special attribute types like the built-in ID/IDREF cross-reference mechanism.

There are thousands of DTDs already in existence in all kinds of areas (see the SGML/XML Web pages for pointers). Many of them can be downloaded and used freely; or you can write your own (see the question on creating your own DTD). Existing SGML DTDs need to be converted to XML for use with XML systems: read the question on converting SGML DTDs to XML, and expect to see announcements of popular DTDs becoming available in XML format.

C.12 How do I create my own DTD?
You need to use the XML Declaration Syntax (very simple: declaration keywords begin with <! rather than just the open angle bracket, and the way the declarations are formed also differs slightly). Here's an example of a DTD for a shopping list, based on the fragment used in an earlier question:

<!ELEMENT Shopping-List (Item)+>
<!ELEMENT Item (#PCDATA)>
It says that there shall be an element called Shopping-List and that it shall contain elements called Item: there must be at least one (that's the plus sign) but there may be more than one. It also says that the Item element may contain parsed character data (PCDATA, ie text).

Because there is no other element which contains Shopping-List, that element is assumed to be the `root' element, which encloses everything else in the document. You can now use it to create an XML file: give your editor the declarations:

<?xml version="1.0"?>
<!DOCTYPE Shopping-List SYSTEM "shoplist.dtd">
(assuming you put the DTD in that file). Now your editor will let you create files according to the pattern:

<Shopping-List>
<Item>Chocolate</Item>
<Item>Sugar</Item>
<Item>Butter</Item>
</Shopping-List>
It is possible to develop complex and powerful DTDs of great subtlety, but for any significant use you should learn more about document systems analysis and document type design. See for example Developing SGML DTDs by Maler and el Andaloussi, Prentice Hall, 1997, 0-13-309881-8, which was written for SGML, but perhaps 95% of it applies to XML as well, as XML is much simpler than full SGML -- see the list of restrictions which shows what has been cut out.
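
To make the meaning of the two declarations concrete, here is a hand-coded check of the same pattern in Python. This is emphatically not how validation is normally done: a validating parser reads the DTD and does all this for you. Python's standard ElementTree module does not validate against DTDs, so the rules are spelled out by hand here, and check_shopping_list is a made-up name.

```python
import xml.etree.ElementTree as ET


def check_shopping_list(xml_text):
    """Return a list of problems; an empty list means the pattern is met.

    Hand-coded equivalent of:
        <!ELEMENT Shopping-List (Item)+>
        <!ELEMENT Item (#PCDATA)>
    """
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "Shopping-List":
        problems.append("root element is %s, not Shopping-List" % root.tag)
    if len(root) == 0:
        problems.append("Shopping-List must contain at least one Item")
    if root.text and root.text.strip():
        problems.append("text is not allowed directly inside Shopping-List")
    for child in root:
        if child.tag != "Item":
            problems.append("unexpected element %s" % child.tag)
        elif len(child) > 0:
            problems.append("Item may contain only text")
        if child.tail and child.tail.strip():
            problems.append("text is not allowed directly inside Shopping-List")
    return problems


GOOD_LIST = """<Shopping-List>
<Item>Chocolate</Item>
<Item>Sugar</Item>
<Item>Butter</Item>
</Shopping-List>"""

if __name__ == "__main__":
    print(check_shopping_list(GOOD_LIST))
    print(check_shopping_list("<Shopping-List></Shopping-List>"))
```

Even for a DTD this small, the hand-written version is longer and easier to get wrong than the two declarations it replaces, which is a fair argument for using a DTD and a validating parser.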

C.13 Does XML let me make up my own tags?
No, it lets you make up names for your own elements. If you think tags and elements are the same thing you are already in trouble: read the rest of this question carefully.

Before we start this one, Bob DuCharme notes: Don't confuse the term `tag' with the term `element'. They are not interchangeable. An element usually contains two different kinds of tag: a start-tag and an end-tag, with text or more markup between them.

XML lets you decide which elements you want in your document and then indicate your element boundaries using the appropriate start- and end-tags for those elements. Each <!ELEMENT... declaration defines a class of elements that may or may not be used in a document conforming to that DTD. We call this class of elements an `element type'. Just as the HTML DTD includes the H1 and P element types, your document can have color and price element types.

Non-empty elements are made up of a start-tag, the element's content, and an end-tag. <color>red</color> is a complete instance of the color element. <color> is only the start-tag of the element, showing where it begins; it is not the element itself.

Empty elements are a special case that may be represented either as a pair of start- and end-tags with nothing between them (eg <price retail="123"></price>) or as a single empty element start-tag that has a closing slash to tell the parser `don't go looking for an end-tag to match this' (eg <price retail="123"/>). [Bob DuCharme]
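
The distinction shows up clearly in what a parser hands to your program: you get elements, never tags. In this Python sketch (standard ElementTree module; describe is a made-up helper) the two ways of writing an empty price element turn out to be the same element, and the color element carries its content with it.

```python
import xml.etree.ElementTree as ET

# The two spellings of an empty element from the text above.
paired = ET.fromstring('<price retail="123"></price>')
single = ET.fromstring('<price retail="123"/>')

# A non-empty element: start-tag, content, end-tag.
color = ET.fromstring("<color>red</color>")


def describe(element):
    """Reduce an element to (name, attributes, text, number of children)."""
    return (element.tag, dict(element.attrib), element.text or "", len(element))


if __name__ == "__main__":
    print(describe(paired))
    print(describe(single))
    print(describe(color))
```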

C.14 I keep hearing about alternatives to DTDs. What's a schema?
A DTD is for specifying the structure (only) of an XML file: it gives the names of the elements, attributes, and entities that can be used, and how they fit together. DTDs were designed for use with traditional text documents, and XML itself has no data types (text is just text), so DTDs have no mechanism for defining the content of elements in terms of data types. A DTD therefore cannot be used to specify numeric ranges or to define limitations or checks on the text content, only on the markup that surrounds it.

The XML Schema recommendation provides a means of specifying element content in terms of data types, so that document type designers can provide criteria for validating the content of elements as well as the markup itself. Schemas are written as XML files, thus avoiding the need for processing software to be able to read XML Declaration Syntax, which is different from XML Instance Syntax.

Schemas are now a formal Recommendation, and a number of sites are serving useful applications as both DTDs and Schemas, eg http://www.schema.net and http://www.dtd.com. There is a separate Schema FAQ at http://www.schemavalid.com. The term `vocabulary' is sometimes used to refer to `DTDs and Schemas' together.

Authors and publishers should note that the plural of Schema is Schemas: the use of the singular to do duty for the plural is a foible dear to the semi-literate; the use of the old (Greek) plural schemata is now unnecessary didacticism. Writers should also note that the plural of DTD is DTDs: there is no apostrophe.

Bob DuCharme adds: Many XML developers were dissatisfied with the syntax of the markup declarations described in the XML spec for two reasons. First, they felt that if XML documents were so good at describing structured information, then the description of a document type's own structure (its schema) should be in an XML document instead of written with its own special syntax. In addition to being more consistent, this would make it easier to edit and manipulate the schema with regular document manipulation tools. Secondly, they felt that traditional DTD notation didn't allow document type designers the power to impose enough constraints on the data -- for example, the ability to say that a certain element type must always have a positive integer value, that it may not be empty, or that it must be one of a list of possible choices. This eases the development of software using that data because the developer has less error-checking code to write.
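
As an illustration of the error-checking code a developer has to write by hand when only a DTD is available, here is a rough Python sketch. The element names (item, quantity, color) and the list of allowed colors are invented for the example; a schema language would let the document type designer state the same three constraints (positive integer, not empty, one of a list of choices) declaratively instead.

```python
import xml.etree.ElementTree as ET

ALLOWED_COLORS = {"red", "green", "blue"}  # hypothetical list of choices


def check_item(item):
    """Return a list of problems found in one <item> element."""
    problems = []
    quantity = (item.findtext("quantity") or "").strip()
    if not quantity:
        problems.append("quantity may not be empty")
    elif not (quantity.isascii() and quantity.isdigit()) or int(quantity) <= 0:
        problems.append("quantity must be a positive integer")
    color = (item.findtext("color") or "").strip()
    if color not in ALLOWED_COLORS:
        problems.append("color must be one of " + ", ".join(sorted(ALLOWED_COLORS)))
    return problems


good = ET.fromstring("<item><quantity>3</quantity><color>red</color></item>")
bad = ET.fromstring("<item><quantity>-2</quantity><color>mauve</color></item>")

print(check_item(good))
print(check_item(bad))
```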


C.15 How do I upload or download XML to/from a database?
Ask your database manufacturer: they all provide XML import and export modules. In some trivial cases there will be a 1:1 match between field and element types; in most cases some programming is required to establish the matches, but this can usually be stored as a procedure so that subsequent uses are simply commands or calls with the relevant parameters.
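
For the trivial 1:1 case, the matching can be sketched in a few lines. The following is only an illustration using Python's built-in sqlite3 and xml.etree.ElementTree modules; the catalog/item layout, the table name and the function names are assumptions made up for the example, and a real database vendor's import/export module will look different.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical export format: one <item> per row, one child element per field.
SAMPLE = """<catalog>
  <item><name>paint</name><color>red</color><price>123</price></item>
  <item><name>brush</name><color>blue</color><price>45</price></item>
</catalog>"""


def load_items(xml_text, connection):
    """Insert each <item> as a row, mapping child elements 1:1 to columns."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS item (name TEXT, color TEXT, price INTEGER)"
    )
    rows = [
        (item.findtext("name"), item.findtext("color"), int(item.findtext("price")))
        for item in ET.fromstring(xml_text).iter("item")
    ]
    connection.executemany("INSERT INTO item VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)


def dump_items(connection):
    """Export the table back to XML using the same element names."""
    root = ET.Element("catalog")
    query = "SELECT name, color, price FROM item ORDER BY rowid"
    for name, color, price in connection.execute(query):
        item = ET.SubElement(root, "item")
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "color").text = color
        ET.SubElement(item, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
count = load_items(SAMPLE, conn)
exported = dump_items(conn)
print(count, exported)
```

Once written, the two functions play the role of the stored procedure mentioned above: later imports and exports are just calls with the relevant parameters.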

Users from a database or computer science background should be aware that XML is not a database management system: it is a text markup system. While there are many similarities, some of the concepts of one are simply non-existent in the other: XML does not possess some database-like features in the same way that databases do not possess markup-like ones. It is a common error to believe that XML is a DBMS like Oracle or Access and therefore possesses the same facilities. It doesn't. [PF]


C.16 How will XML affect my document links?
The linking abilities of XML systems are much more powerful than those of HTML, so you'll be able to do much more with them. Existing HREF-style links will remain usable, but the new linking technology is based on the lessons learned in the development of other standards involving hypertext, such as TEI and HyTime, which let you manage bidirectional and multi-way links, as well as links to a span of text (within your own or other documents) rather than to a single point. These features have been available to SGML users for many years, so there is considerable experience and expertise available in using them.

The XML Linking Specification (XLink) and XML Extended Pointer Specification (XPointer) documents contain a detailed draft specification. An XML link can be either a URL or a TEI-style Extended Pointer (XPointer), or both. A URL on its own is assumed to be a resource; if an XPointer or XLink follows it, it is assumed to be a sub-resource of that URL; an XPointer on its own is assumed to apply to the current document (all exactly as with HTML).

An XLink is always preceded by one of #, ?, or |. The # and ? mean the same as in HTML applications; the | means the sub-resource can be found by applying the link to the resource, but the method of doing this is left to the application. An XPointer can only follow a #.

The TEI Extended Pointer Notation (EPN) is much more powerful than the fragment address on the end of some URLs, as it allows you to specify the location of a link end using the structure of the document as well as (or in addition to) known, fixed points like IDs. For example, the linked second occurrence of the word `XPointer' two paragraphs back could be referred to as http://www.ucc.ie/xml/faq.sgml#ID(hypertext).child(2,*).child(2,#element,'p').child(3,#element,'link'), meaning the third link element within the second paragraph within the second object in the element whose ID is hypertext (this question). Count the objects from the start of this question in the XML source (which has the ID hypertext):

the first child object is the title of the question (<q>);
the second child object is the answer (the <a> element);
within the <a> element go to the second paragraph;
count to the third link.
David Megginson has produced an xpointer function for Emacs/psgml which will deduce an XPointer for any location in an XML document.
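
The counting steps above can be mimicked in a few lines of Python. This is a simplified sketch, not an implementation of the TEI or XPointer notation: the miniature document, the element name qandaentry and the helper names by_id and child are invented for illustration, and only element children are counted.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the FAQ source: one question with the ID "hypertext".
DOC = """<faq>
  <qandaentry id="hypertext">
    <q>How will XML affect my document links?</q>
    <a>
      <p>First paragraph.</p>
      <p>Second paragraph with <link>one</link>, <link>two</link> and <link>three</link>.</p>
    </a>
  </qandaentry>
</faq>"""


def by_id(root, value):
    """Locate the element whose id attribute equals value (the ID(...) step)."""
    for element in root.iter():
        if element.get("id") == value:
            return element
    raise LookupError(value)


def child(parent, n, name=None):
    """Return the n-th (1-based) child element, optionally restricted to a name."""
    candidates = [c for c in parent if name is None or c.tag == name]
    return candidates[n - 1]


root = ET.fromstring(DOC)
# ID(hypertext), then child 2 of any type, then the 2nd p, then the 3rd link.
target = child(child(child(by_id(root, "hypertext"), 2), 2, "p"), 3, "link")
print(target.text)
```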


C.17 Can I do mathematics using XML?
Yes, if the document type you use provides for math. The mathematics-using community is developing software, and there is a MathML Recommendation at the W3C, which is a native XML application. It would also be possible to make XML fragments from other DTDs, such as the long-expired HTML3, the near-obsolete HTML Pro, or ISO 12083 Math, or OpenMath, or one of your own making. Browsers which display some math embedded in SGML already exist (eg DynaText, Panorama, Multidoc Pro).


C.18 How does XML handle metadata?
Because XML lets you define your own markup language, you can make full use of the extended hypertext features (see the question on Links) of XML to store or link to metadata in any format (eg ISO 11179, Dublin Core, Warwick Framework, Resource Description Framework (RDF), and Platform for Internet Content Selection (PICS)).

There are no predefined elements in XML, because it is an architecture, not an application, so it is not part of XML's job to specify how or if authors should or should not implement metadata. You are therefore free to use any suitable method from simple attributes to the embedding of entire Dublin Core/Warwick Framework metadata records. Browser makers may also have their own architectural recommendations or methods to propose.


C.19 Can I use Java, ActiveX, etc in XML files?
This will depend on what facilities the browser makers implement. XML is about describing information; scripting languages and languages for embedded functionality are software which enables the information to be manipulated at the user's end, so these languages do not belong in an XML file itself but in stylesheets such as XSL and CSS.

XML itself provides a way to define the markup needed to implement scripting languages: as a neutral standard it neither encourages nor discourages their use, and does not favour one language over another, so the field is wide open.


C.20 Can I use Java to create or manage XML files?
Yes, any programming language can be used to output data from any source in XML format. There is a growing number of front-ends and back-ends for programming environments and data management environments to automate this.

There is a large body of `middleware' written in Java and other languages for managing data either in XML or with XML output. There is a suite of Java tutorials (with source code and explanation) available at http://developerlife.com.

Please do not mail the FAQ editor with questions about your Java programming bugs. Ask one of the Java newsgroups instead.

C.21 How do I execute or run an XML file?
You can't and you don't. XML is not a programming language, so XML files don't `run' or `execute'. XML is a markup specification language and XML files are data: they just sit there until you run a program which displays them (like a browser) or does some work with them (like a converter which writes the data in another format, or a database which reads the data), or modifies them (like an editor).


C.22 How do I control appearance?
In HTML, default styling is built into the browsers because the tagset of HTML is predefined and hardwired into browsers. In XML, where you can define your own tagset, browsers cannot know what names you are going to use and what they will mean, so you need a stylesheet if you want to display the formatted text.

Browsers which read XML will accept and use a CSS stylesheet at a minimum, but you can also use the more powerful XSLT stylesheet language to transform your XML into HTML -- which browsers, of course, already know how to display (and that HTML can still use a CSS stylesheet).

As with any system where files can be viewed at random by arbitrary users, the author cannot know what resources (such as fonts) are on the user's system, so the same care is needed as with HTML using fonts. To invoke a stylesheet from an XML file, include one of the stylesheet declarations:

<?xml-stylesheet href="foo.xsl" type="text/xsl"?>
<?xml-stylesheet href="foo.css" type="text/css"?>
The Cascading Stylesheet Specification (CSS) provides a simple syntax for assigning styles to elements, and has been implemented in most browsers.

The Extensible Stylesheet Language (XSL) has been created for use specifically with XML. Dave Pawson maintains a comprehensive FAQ at http://www.dpawson.co.uk/xsl/xslfaq.html. XSL uses XML syntax (an XSL stylesheet is an XML file) and has widespread support from several major vendors (see the questions on browsers and other software) although current browser support is limited. XSL comes in two flavours:

XSL itself, which is a pure formatting language, and which needs a text formatter like FOP or PassiveTeX to create printable output (both can produce PDF). Currently I am not aware of any Web browsers which support XSL rendering;
XSLT (T for Transformation), which is a language to specify transformations of XML into HTML either inside the browser or at the server before transmission. It can also specify transformations from one vocabulary of XML to another, and from XML to plaintext.
Currently only MS Internet Explorer 5.5 handles XSLT inside the browser (and even that needs some post-installation surgery to remove the obsolete WD-xsl and replace it with the current XSL-Transform processor). But there is a growing use of server-side processors like Cocoon, which let you store your information in XML but serve it auto-converted to HTML, thus allowing the output to be used by any browser. XSLT is also widely used to transform XML into non-SGML formats for input to other systems (for example to transform XML into LaTeX for typesetting).


C.23 How do I use graphics in XML?
Graphics have traditionally just been links which happen to have a picture file at the end rather than another piece of text. They can therefore be implemented in any way supported by the XLink and XPointer specifications (see earlier question), including using similar syntax to existing HTML images. They can also be referenced using XML's built-in NOTATION and ENTITY mechanism in a similar way to standard SGML, as external unparsed entities.

The linking specifications, however, give you much better control over the traversal and activation of links, so an author can specify, for example, whether or not to have an image appear when the page is loaded, or on a click from the user, or in a separate window, without having to resort to scripting.

XML itself doesn't predict or restrict graphic file formats: GIF, JPG, TIFF, PNG, CGM, and SVG at a minimum would seem to make sense; however, vector formats are normally preferred for non-photographic images.


Using entities for images
You cannot embed a raw graphics file (or any other binary [non-text] data) directly into an XML file because any bytes happening to resemble markup would get misinterpreted: you must refer to it by linking (see below). It would, however, in theory be possible to include a text-encoded transformation of a binary file as a CDATA marked section, using something like UUencode with the markup characters ] and > removed from the map so that they could not occur and be misinterpreted, or even simple hexadecimal encoding as used in PostScript. For vector graphics, however, the solution is to use SVG (see below).
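
The `simple hexadecimal encoding' route can be sketched as follows, again in Python. The element name binary and its name and encoding attributes are made up for this example and are not part of any standard. Because hexadecimal text contains no markup characters at all, this variant does not even need a CDATA marked section.

```python
import xml.etree.ElementTree as ET


def embed_hex(name, payload):
    """Wrap a binary payload in an element as hexadecimal text."""
    element = ET.Element("binary", {"name": name, "encoding": "hex"})
    element.text = payload.hex()
    return ET.tostring(element, encoding="unicode")


def extract_hex(xml_text):
    """Recover the original bytes from an element produced by embed_hex."""
    element = ET.fromstring(xml_text)
    return bytes.fromhex(element.text or "")


# Bytes that would be misread as markup if pasted into the file raw.
raw = b"\x89PNG<&]]>\x00\xff"
document = embed_hex("pixel.png", raw)
restored = extract_hex(document)
print(document)
```

The price is size: every byte becomes two characters, which is one reason linking to the file is the usual answer.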

Bob DuCharme adds: All the data in an XML document entity must be parseable XML. You can define an external entity as either a parsed entity (parseable XML) or an unparsed entity (anything else). Unparsed entities can be used for picture files, sound files, movie files, or whatever you like. They can only be referenced from within a document as the value of an attribute (much like a bitmap picture on an HTML Web page is the value of the img element's src attribute) and not part of the actual document. In an XML document, this attribute must be declared to be of type ENTITY, and the entity's declaration must specify a declared NOTATION, because if the entity isn't XML, the XML processor needs to know what it is. For example, in the following document, the colliepic entity is declared to have a JPEG notation, and it's used as the value of the empty dog element's picfile attribute.

<?xml version="1.0"?>
<!DOCTYPE dog [
<!NOTATION JPEG SYSTEM "Joint Photographic Experts Group">
<!ENTITY colliepic SYSTEM "lassie.jpg" NDATA JPEG>
<!ELEMENT dog EMPTY>
<!ATTLIST dog picfile ENTITY #REQUIRED>
]>
<dog picfile="colliepic"/>

  † In Memoriam † zondag 14 oktober 2001 @ 23:45:04 #64
13819 Loedertje
Trotse GILF.
pi_1901944
Anton Constandse: "Leven tegen de stroom in"


Bert Gasenbeek, Rudolf de Jong, Pieter Edelman (eds.), "Anton Constandse - Leven tegen de stroom in" ("Living Against the Current") - Uitgeverij Papieren Tijger, Breda, and Het Humanistisch Archief, Utrecht, 1999 - 268 pp. ISBN 90 6728 099 2 - NLG 30.00 / EUR 13.61

Some people I never forget. Anton Constandse (1899-1985) is one of those unforgettable personalities. I have fond memories of the times I was able to meet him. A striking, brilliant, modest "craftsman of the free word".

Individualism

Constandse was indeed an anarchist, but as early as 1938 he saw that this ideal cannot be realised, and he took his leave of (political) radical anarchism. With him, however, we can also understand 'anarchism' as a freedom-loving intellectual current (appearing in all kinds of forms and variants) that strives for a 'social individualism'. If we give both its social relativity and its social engagement their proper weight, this individualism turns out to be an inspiring attitude to life and even an indispensable factor in social evolution. Kropotkin (1842-1921), the anarchist Russian prince, called this phenomenon 'mutual aid', which he considered as inescapable as 'the struggle for existence'. After 1938 Constandse's 'anarchism' accordingly became more cultural and literary than 'political' in content. Art and literature are in essence 'anarchist' (individualist). In countless ways artists depict this striving for freedom or put it into words. Art: an act of resistance. Constandse said and wrote noteworthy things about this.

On individualism as an intellectual attitude and a social current he wrote:

"Intellectually, individualism was of great significance for evolution. But its ideas and ideals could only be advanced insofar as collectivities showed themselves willing and able to do so. And thus individual freedom is nevertheless a social ideal." (Italics mine, JB)

This quotation from "Het souvereine ik - Het individualisme van Lao-tse tot Friedrich Nietzsche" (Meulenhoff, 1983) is, as the authors of this book rightly state, "... the credo of Constandse's atheism and anarchism. And is the last sentence not also the credo of every freethinker and every humanist?" Finding the right dynamic balance between individual and society: Constandse's attitude to life in a nutshell.

Atheism, humanism

For Constandse, atheism and humanism were synonyms. Nowadays that is no longer self-evident. Anton understood this too. Although some convictions do not change, his views too evolved over the years. Thus he saw contemporary humanism and "the Humanistisch Verbond as the heir of three intellectual currents, each of which had arrived in its own way at an ethics that transcends religion: besides the freethinkers, these were the socialists and the liberal Protestants." (In the journal "Rekenschap, Driemaandelijks tijdschrift voor wetenschap en cultuur", 1966). In his eyes the VPRO, for example, played an extremely important cultural role. And it should be clear that socialism here does not mean armchair socialism ("zitvleessocialisme") but the historically deep-rooted striving for emancipation of the working class(es) in the social, political, intellectual and cultural sense. Armchair socialism is, unfortunately, a by-product of this emancipation.

Constandse wrote more than a hundred books and many thousands of articles on as many subjects. A selection: anarchism, sexual reform, the poet Herman Gorter, the philosopher Spinoza, Spanish literature, Dada, 'social individualism' and the asocial 'me era', Vietnam, Latin America, the Second World War, the Cold War, imperialism, neocolonialism. He contributed to De Groene Amsterdammer and Vrij Nederland as well as to De Vrije Gedachte. For the Handelsblad (later NRC Handelsblad) he wrote his regular column, and for VPRO radio he read his commentaries. The wonderful thing about Constandse was that even when you disagreed with him, he still managed to hold your attention.

An inevitably very extensive 'annotated bibliography' and an (equally inevitably) extremely concise biography, together with a particularly handy index, conclude this book. I want to recommend this book to everyone. Now that 'ayatollahs' of various stripes see their chance to impose their narrow and stifling ideas once again, a voice like that of Constandse cannot sound loudly and clearly enough. Constandse's thinking compels not imitation but reflection.

© Jan Bontje 2001


pi_1901953
quote:
On Sunday 14 October 2001 23:43 Loedertje wrote:
Could you maybe explain that, I don't understand a fucking thing
Me neither, shoot me
  † In Memoriam † zondag 14 oktober 2001 @ 23:47:22 #66
13819 Loedertje
Trotse GILF.
pi_1901967
Poet-columnist-copywriter from Spijkenisse
1947, Rotterdam



Genre: "bontjes" (short haiku-based poems that serve, in an entirely personal way, as the vehicle for his poetry) - also "light verse", quatrains, aphorisms, columns, song lyrics, short stories, essays, book reviews. All of this both from his own need and on commission.

pi_1902020
*paste*

Section 1: Basic Questions
This section aims to deal with basic questions, addressing the role and
nature of CGI, and its place in Web programming. Questions/answers which
just don't appear to 'fit' under any other section may also be included
here.


1.1: What is CGI?
[ from the CGI reference http://hoohoo.ncsa.uiuc.edu/cgi/overview.html ]

The Common Gateway Interface, or CGI, is a standard for external
gateway programs to interface with information servers such as HTTP servers.
A plain HTML document that the Web daemon retrieves is static,
which means it exists in a constant state: a text file that doesn't change.
A CGI program, on the other hand, is executed in real-time, so that it
can output dynamic information.
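
To make the static/dynamic contrast concrete, here is a minimal sketch of a CGI program, written in Python purely for illustration (the FAQ itself is language-neutral, and the function name render is my own). A CGI program writes a header block, a blank line, and then the body to standard output; the server passes request details in environment variables such as REQUEST_METHOD. Because the body is built at request time, every hit can get a different answer.

```python
#!/usr/bin/env python3
import os
import sys
import time


def render(environ, now):
    """Build a complete CGI response: header lines, a blank line, then the body."""
    method = environ.get("REQUEST_METHOD", "GET")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    body = "Generated at {} UTC for a {} request.\n".format(stamp, method)
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # When run by the server (or from the command line), emit the response.
    sys.stdout.write(render(os.environ, time.time()))
```

Running the file from the command line prints the same text the server would send, which is a handy first check.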


--------------------------------------------------------------------------------

1.2: Is it a script or a program?
The distinction is semantic. Traditionally, compiled executables
(binaries) are called programs, and interpreted programs are usually
called scripts. In the context of CGI, the distinction has become
even more blurred than before. The words are often used interchangeably
(including in this document). Current usage favours the word "scripts"
for CGI programs.


--------------------------------------------------------------------------------

1.3: When do I need to use CGI?
There are innumerable caveats to this answer, but basically any
Webpage containing a form will require a CGI script or program
to process the form inputs.
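
As a rough sketch of what `processing the form inputs' involves, the Python function below decodes form fields the way a CGI program receives them: from the QUERY_STRING environment variable for GET, or from standard input (CONTENT_LENGTH characters) for a url-encoded POST. The function name read_form and the sample field names are assumptions for the example; multipart uploads and character-set handling are deliberately left out.

```python
import io
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Return form fields as a dict of lists, for GET or url-encoded POST."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


# Simulated requests; a real script would pass os.environ and sys.stdin.
get_fields = read_form(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada&lang=perl&lang=c"},
    io.StringIO(""),
)
post_fields = read_form(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "9"},
    io.StringIO("q=cgi+faq"),
)
print(get_fields, post_fields)
```

Passing the environment and input stream as arguments keeps the function testable from the command line without a web server.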


--------------------------------------------------------------------------------

1.4: Should I use CGI or JAVA?
[answer to this non-question hopes to try and reduce the noise level of
the recurrent "CGI vs JAVA" threads].

CGI and JAVA are fundamentally different, and for most applications
are NOT interchangeable.

CGI is a protocol for running programs on a WWW server. Whilst JAVA
can also be used for that, and even has a standardised API (the servlet,
which is indeed an alternative to CGI), the major role of JAVA on the
Web is for clientside programming (the applet).

In certain instances the two may be combined in a single application:
for example a JAVA applet to define a region of interest from a
geographical map, together with a CGI script to process a query
for the area defined.


--------------------------------------------------------------------------------

1.5: Should I use CGI or SSI or ... { PHP/ASP/... }
CGI and SSI (Server-Side Includes) are often interchangeable, and it may
be no more than a matter of personal preference. Here are a few
guidelines:
1) CGI is a common standard agreed and supported by all major HTTPDs.
SSI is NOT a common standard, but an innovation of NCSA's HTTPD
which has been widely adopted in later servers. CGI has the
greatest portability, if this is an issue.
2) If your requirement is sufficiently simple that it can be done
by SSI without invoking an exec, then SSI will probably be
more efficient. A typical application would be to include
sitewide 'house styles', such as toolbars, netscapeised <body>
tags or embedded CSS stylesheets.
3) For more complex applications - like processing a form -
where you need to exec (run) a program in any case, CGI
is usually the best choice.
4) If your transaction returns a response that is not an HTML page,
SSI is not an option at all.

Many more recent variants on the theme of SSI are now available.
Probably the best-known are PHP which embeds server-side scripting
in a pre-html page, and ASP which is Microsoft's version of a
similar interface.


--------------------------------------------------------------------------------

1.6: Should I use CGI or an API?
APIs are proprietary programming interfaces supported by particular
platforms. By using an API, you lose all portability. If you know
your application will only ever run on one platform (OS and HTTPD),
and it has a suitable API, go ahead and use it. Otherwise stick to CGI.


--------------------------------------------------------------------------------

1.7: So what, in a nutshell, are the options for webserver programming?
Too many to enumerate - but I'll try and summarise. Briefly, there
are several decisions you have to make, including:
* Power. Is it up to a complex task?
* Complexity. How much programming manpower is it worth?
* Portability. Might you want to run your program on another system?

So here's an overview of the main options. It's inevitably subjective,
but may be helpful to someone:

Basic SSI:            Simple interface for basic dynamic content.
                      Non-standard - read your server docs.
Enhanced SSI[1]:      Suitable for more complex tasks within
                      an HTML page.
CGI:                  The standardised, portable general-purpose API,
                      not limited to working with HTML pages.
Enhanced CGI-like[2]: Typically gain efficiency but lose portability
                      compared to standard CGI.
Servlets:             An alternative API for JAVA, that overcomes
                      the limitation of JAVA not supporting
                      environment variables.
Server API:           Generally the most powerful and most complex option.

[1] For example, PHP, ASP.
[2] For example, CGI adapted to mod_perl or fastcgi.


--------------------------------------------------------------------------------

1.8: What do I absolutely need to know?
If you're already a programmer, CGI is extremely straightforward, and just
three resources should get you up to speed in the time it takes to read them:
1) Installation notes for your HTTPD. Is it configured to run CGI
scripts, and if so how does it identify that a URL should be executed?
(Check your manuals, READMEs, ISP webpages/FAQs, and if you still can't
find it ask your server administrator).
2) The CGI specification at NCSA tells you all you need to know
to get your programs running as CGI applications.
http://hoohoo.ncsa.uiuc.edu/cgi/interface.html
3) WWW Security FAQ. This is not required to 'get it working', but
is essential reading if you want to KEEP it working!
http://www.w3.org/Security/Faq/www-security-faq.html

If you're NOT already a programmer, you'll have to learn. If you would
find it hard to write, say, a 'grep' or 'cat' utility to run from the
commandline, then you will probably have a hard time with CGI. Make
sure your programs work from the commandline BEFORE trying them with CGI,
so that at least one possible source of errors has been dealt with.


--------------------------------------------------------------------------------

1.9: Does CGI create new security risks?
Yes. Period.
There is a lot you can do to minimise these. The most important thing
to do is read and understand Lincoln Stein's excellent WWW security
FAQ, at http://www.w3.org/Security/Faq/www-security-faq.html


--------------------------------------------------------------------------------

1.10: Do I need to be on Unix?
No, but it helps. The Web, along with the Internet itself, C, Perl,
and almost every other Good Thing in the last 20 years of computing,
originated in Unix. At the time of writing, this is still the
most mature and best-supported platform for Web applications.


--------------------------------------------------------------------------------

1.11: Do I have to use Perl?
No - you can use any programming language you please. Perl is simply
today's most popular choice for CGI applications. Some other widely-
used languages are C, C++, TCL, BASIC and - for simple tasks -
even shell scripts.

Reasons for choosing Perl include its powerful text manipulation
capabilities (in particular the 'regular' expression) and the fantastic
WWW support modules available.


--------------------------------------------------------------------------------

1.12: What languages should I know/use?
It isn't really that important. Use what you're comfortable with,
or what you're constrained (eg by your manager) to use.

If you're just dabbling with programming, Perl is a good choice, simply
because of the wealth of ready-to-run Perl/CGI resources available.

If you're serious about programming, you should be at home in a
range of languages. C, the industry standard, is a must (at least to
the level of comfortably reading other people's code). You'll
certainly want at least one scripting language such as Perl, Python
or Tcl. C++ is also a good idea.

In response to a Usenet newbie question:
> I am seriously wanting to learn some CGI programming languages

J.M. Ivler wrote some eloquent words of wisdom:
> If you want to learn a programming language, learn a programming language.
> If you want to learn how to do CGI programming, learn a programming
> language first.
>
> My book is one of the few that tackles two languages at the same time.
> Why? because it's not about languages (which are just syntax for logic).
> CGI programming is about programming, and how to leverage the experience
> for the person coming to the site, or maintaining the site, or in some way
> meeting some requirements. Language is just a tool to do so.


--------------------------------------------------------------------------------

1.13: Do I have to put it in cgi-bin?
see next question


--------------------------------------------------------------------------------

1.14: Do I have to call it *.cgi? *.pl?
Maybe. It depends on your server installation.

These types of filenames are commonly used conventions - no more.
It is up to the server administrator whether or not CGI scripts are
enabled, and (if so) what conventions tell the server to run or
to print them.

If you are running your own server, read the manual.
If you're on ISP or other rented webspace, check their webpages for
information or FAQs. As a last resort, ask the server administrator.


--------------------------------------------------------------------------------

1.15: What is the "CGI Overhead", and should I be worried about it?
The CGI Overhead is a consequence of HTTP being a stateless protocol.
This means that a CGI process must be initialised for every "hit"
from a browser.

In the first instance, this usually means the server forking a
new process. This in itself is a modest overhead, but it can
become important on a heavily-used server if the number of
processes grows to problem levels.

In the second place, the CGI program must initialise. In the
case of a compiled language such as C or C++ this is negligible,
but there is a small penalty to pay for scripting languages such as Perl.

Thirdly, CGI is often used as 'glue' to a backend program, such as
a database, which may take some considerable time to initialise.
This represents a major overhead, which must be avoided in any
serious application. The most usual solution is for the backend
program to run as a separate server doing most of the work, while
the actual CGI simply carries messages.

Fourthly, some CGI scripts are just plain inefficient, and may
take hundreds of times the resources they need. Programs using
system() or `backtick` notation often fall into this category.

Note that there are ways to reduce or eliminate all these overheads,
but these tend to be system- or server-specific. The best-supported
server is probably Apache, as commercial server-vendors may prefer to
push their proprietary solutions in preference to CGI.


--------------------------------------------------------------------------------

1.16: What do I need to know about file permissions and "chmod"?
Unix systems are designed for multiple users, and include provision
for protecting your work from unauthorised access by other users
of the system. The file permissions determine who is permitted
to do what with your programs, data, and directories. The command
that sets file permissions is chmod.

Web servers typically run as user "nobody". That means that, setting
aside serious bugs (such as those in certain versions of the Frontpage
extensions), your files are absolutely secure from damage through the
webserver. It also means that you may have to make explicit changes to
enable the server to access them in a CGI context.

There are two ways to run CGI:
- by default they run as the webserver user (nobody)
  For most purposes this is safest, as your programs and data
  are protected by the operating system from unauthorised access
  through possible bugs in your CGI. However, when the CGI has
  to write to a file, that file must be writable to every web
  user on the system, and is therefore completely unprotected.
- setuid, they run under your own userid.
  This means that files written by your CGI can be secure.
  On the other hand, any bugs in your CGI could now compromise
  *all* your programs and data on the server.
  As an elementary security precaution, scripts (e.g. Perl) are
  prevented from running setuid by most OSs. The "cgiwrap"
  program offers a workaround for this.

A third way you should *never* permit CGI to be run is:
- as root or setuid root, they can run as any user.
  This is extremely dangerous, as any bugs could compromise the
  entire server, including every user's files. Fortunately only
  the system administrator can install setuid root programs. If
  you are *at all* concerned about security, make sure that no such
  programs (in particular Frontpage extensions) are installed,
  regardless of whether you use them yourself.

For a proper overview, see "man chmod". Some modes that may be useful
in a typical CGI context are:

* CGI programs, 0755
* data files to be readable by CGI, 0644
* directories for data used by CGI, 0755
* data files to be writable by CGI, 0666 (data has absolutely no security)
* directories for data used by CGI with write access, 0777 (no security)
* CGI programs to run setuid, 4755
* data files for setuid CGI programs, 0600 or 0644
* directories for data used by setuid CGI programs, 0700 or 0755
* For a typical backend server process, 4750
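
The modes above can also be applied from a script rather than by hand. What follows is only a minimal sketch in Python: the helper name apply_mode and the labels in CGI_MODES are made up for illustration, and only a few of the modes from the list are included.

```python
import os
import stat

# A few of the modes from the list above; the keys are illustrative labels.
CGI_MODES = {
    "cgi_program": 0o755,
    "readable_data": 0o644,
    "writable_data": 0o666,    # no protection from other local users
    "setuid_program": 0o4755,
}

def apply_mode(path, kind):
    """Set the permission bits for path and return the resulting mode."""
    os.chmod(path, CGI_MODES[kind])
    # S_IMODE strips the file-type bits, leaving only the permission bits.
    return stat.S_IMODE(os.stat(path).st_mode)
```

Returning the mode read back from the file makes it easy to confirm that the change really took effect.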

Finally, if this answer tells you anything you didn't already know,
don't even think about trying to set up a secure server!


--------------------------------------------------------------------------------

1.17: What is CGIWrap, and how does it affect my program?
[ quoted from http://www.umr.edu/~cgiwrap/intro.html ]

> CGIWrap is a gateway program that allows general users to use CGI scripts
> and HTML forms without compromising the security of the http server.
> Scripts are run with the permissions of the user who owns the script. In
> addition, several security checks are performed on the script, which will not
> be executed if any checks fail.
>
> CGIWrap is used via a URL in an HTML document. As distributed, cgiwrap
> is configured to run user scripts which are located in the
> ~/public_html/cgi-bin/ directory.

See http://www.umr.edu/~cgiwrap/


--------------------------------------------------------------------------------

1.18: How do I decode the data in my Form?
The normal format for data in HTTP requests is URLencoded. All Form data
is encoded in a string, of the form
param1=value1&param2=value2&...paramn=valuen
Many non-alphanumeric characters are "escaped" in the encoding:
the character whose hexadecimal number is "XY" will be represented by
the character string "%XY".

Decoding this string is a fundamental function of every CGI library.
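
To illustrate what such a library does, here is a minimal sketch in Python using the standard urllib.parse module. The function name decode_form and the sample string are invented for the example. Note that form encoding also represents a space as "+", which the decoder undoes along with the "%XY" escapes.

```python
from urllib.parse import parse_qs

def decode_form(encoded):
    """Decode a URL-encoded form string into {name: [values]}."""
    # keep_blank_values=True keeps fields that the user left empty.
    return parse_qs(encoded, keep_blank_values=True)

fields = decode_form("name=J%2E+Smith&lang=perl&lang=c&note=")
print(fields["name"][0])   # the %2E escape and the "+" are both decoded
print(fields["lang"])      # a repeated parameter yields several values
```

Each name maps to a list because a form may legitimately send the same parameter more than once.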

Another format is "multipart/form-data", also known as "file upload".
You will get this from the HTML markup
<form method="POST" enctype="multipart/form-data">

(but note you must accept URLencoded input in any case, since not all
browsers support multipart forms).

Most(?) CGI libraries will handle this transparently.

Section 2: HTTP Headers and NPH Scripts
This is a fairly technical section dealing with HTTP, the protocol of
the Web. It also includes NPH, the mechanism by which CGI programs can
return HTTP header information directly to the Client.


2.1: What is HTTP (HyperText Transfer Protocol)?
HTTP is the protocol of the Web, by which Servers and Clients (typically
browsers) communicate. An HTTP transaction comprises a Request sent by
the Client to the Server, and a Response returned from the Server to
the Client.
Every HTTP request and response includes a message header, describing
the message. These are processed by the HTTPD, and may often be
mostly ignored by CGI applications (but see below).
A message body may also be included:
1) A HEAD or GET request sends only a header. Any form data is encoded
in the query string of the request URL, which is available to the CGI
program as the environment variable QUERY_STRING.
2) A POST request sends both header and body. The body typically
comprises data entered by a user in a form.
3) A HEAD request does not expect a body in the response.
4) A GET or POST request will accept a response with or without a body,
according to the header. The body of a response is typically an
HTML document.


--------------------------------------------------------------------------------

2.2: What HTTP request headers can I use?
Most HTTP request headers are passed to the CGI script as environment
variables. Some are guaranteed by the CGI spec. Others are server,
browser and/or application dependent.

To see what _your_ browser and server are telling each other, just use
a trivial little CGI script to print out the environment. In Unix:
#!/bin/sh
echo "Content-type: text/plain"
echo
set

(Just call it "env.cgi" or something, and put it where your server
will execute it. Then point your browser at
http://your.server/path/to/env.cgi ).

This enables you to see at-a-glance what useful server variables are set.
Note that dumping the environment like this within a more complex
script can be a useful debugging technique.
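
The same dump can be produced from inside a script written in another language. A sketch in Python, where dump_environment is an illustrative name and the optional arguments exist only so the function can be tried outside a web server:

```python
import os
import sys

def dump_environment(environ=os.environ, out=sys.stdout):
    """Write a plain-text CGI response listing every environment variable."""
    # The header block ends with a blank line, as in the shell version.
    out.write("Content-type: text/plain\n\n")
    for name in sorted(environ):
        out.write(f"{name}={environ[name]}\n")
```

Called with no arguments it writes to standard output, which is what the server expects from a CGI program.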

For details, see the CGI Environment Variables specification at
http://hoohoo.ncsa.uiuc.edu/cgi/env.html
(which also includes a version of the above script - somewhat more
nicely formatted - online).


--------------------------------------------------------------------------------

2.3: What Environment variables are available to my application?
See previous question. Those you can rely on are documented in NCSA's
pages; those associated with your particular server and browser can
be determined using the above script.


--------------------------------------------------------------------------------

2.4: Why doesn't my script get REMOTE_USER? My page is password-protected.
You will get REMOTE_USER if the _script_ is password protected.
That's all. The page the user is coming from has nothing to do with it.


--------------------------------------------------------------------------------

2.5: What HTTP response headers do I need to know about?
Unless you are using NPH, the HTTPD will insert necessary response
headers on your behalf, always provided it is configured to do so.

However, it is conventional for servers to insert the Content-Type header
based on a page's filename, and CGI scripts cannot rely on this. Hence
the usual advice is to print an explicit Content-Type header.
At least one of "Content-Type", "Status" and "Location" is almost
always required.

A few other headers you may wish to use explicitly are:
  Status      (to set HTTP return code explicitly. Caveats:
              (1) Behaviour is undefined if it conflicts with
              another header. (2) This is NOT an HTTP header.)
  Location    (to redirect the user to another URI, which may or may
              not be on your own server)
  Set-cookie  (Netscape/Nonstandard) Set a cookie
  Refresh     (Netscape/Nonstandard) Clientpull

You can also use general MIME headers: eg "Keywords" for the benefit of
indexers (although in this instance some major search robots have
regrettably introduced a new protocol to do the same thing).

For a detailed reference, see RFC1945 (HTTP/1.0) or RFC2068 (HTTP/1.1).


--------------------------------------------------------------------------------

2.6: What is NPH?
NPH = No Parsed Headers. The script undertakes to print the entire
HTTP response including all necessary header fields. The HTTPD
is thereby instructed not to parse the headers (as it would normally do)
nor add any which are missing.


--------------------------------------------------------------------------------

2.7: Must/should/can I write nph scripts?
Generally, no. It is usually better to save yourself hassle by letting
the HTTPD produce the headers for you.

If you are going to use NPH, be sure to read and understand the HTTP spec at
http://www.w3.org/pub/WWW/Protocols/

Your headers should be complete and accurate, because you're instructing
the HTTPD not to correct them or insert what's missing.

Possible circumstances where the use of NPH is appropriate are:
* When your headers are sufficiently unusual that they might be
differently parsed by different HTTPDs (eg combining "Location:"
with a "Status:" other than 302).
* When returning output over a period of time (eg displaying
unbuffered results of a slow operation in 'real' time).
See RFC1945 (HTTP/1.0) or RFC2068 (HTTP/1.1) for detail.


--------------------------------------------------------------------------------

2.8: Do I have to call it nph-*?
According to NCSA's reference pages, this is the standard for telling
the server that your script is NPH, so this should be a fully portable
convention.


--------------------------------------------------------------------------------

2.9: What is the difference between GET and POST?
Firstly, the HTTP protocol specifies differing usages for the two
methods. GET requests should always be idempotent on the server.
This means that whereas one GET request might (rarely) change some state
on the Server, two or more identical requests will have no further effect.

This is a theoretical point which is also good advice in practice.
If a user hits "reload" on his/her browser, an identical request will be
sent to the server, potentially resulting in two identical database or
guestbook entries, counter increments, etc. Browsers may reload a
GET URL automatically, particularly if cacheing is disabled (as is usually
the case with CGI output), but will typically prompt the user before
re-submitting a POST request. This means you're far less likely to get
inadvertently-repeated entries from POST.

GET is (in theory) the preferred method for idempotent operations, such
as querying a database, though it matters little if you're using a form.
There is a further practical constraint that many systems have built-in
limits to the length of a GET request they can handle: when the total size
of a request (URL+params) approaches or exceeds 1 KB, you are well-advised
to use POST in any case.

In terms of mechanics, they differ in how parameters are passed to the
CGI script. In the case of a POST request, form data is passed on
STDIN, so the script should read from there (the number of bytes to be
read is given by the Content-length header, which the script sees as the
CONTENT_LENGTH environment variable). In the case of GET, the
data is passed in the environment variable QUERY_STRING. The content-type
(application/x-www-form-urlencoded) is identical for GET and POST requests.
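
The mechanics just described can be sketched in a few lines of Python. This is an illustration only: read_form is a made-up name, the environ and stdin arguments exist so the function can be tried without a server, and the sketch assumes URL-encoded data (it does not handle multipart/form-data).

```python
import os
import sys
from urllib.parse import parse_qs

def read_form(environ=os.environ, stdin=None):
    """Return the decoded form fields for a GET or POST request."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # Read exactly CONTENT_LENGTH bytes; do not wait for end-of-file.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        stream = stdin if stdin is not None else sys.stdin.buffer
        raw = stream.read(length).decode("ascii", errors="replace")
    else:
        # GET (and HEAD): the data arrives in the environment.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)
```

Because both methods deliver the same encoding, the script can share one decoder and differ only in where it reads the raw string from.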

Section 3: Techniques: "How do I..."
This section comprises programming hints and tips for a number of popular
tasks. Also included are a number of common questions to which the answer
is "you can't", with the reasons why.


3.1: Can I get information about who is visiting?
*sigh*
Many people keep mailing me questions or suggested hacks to get
visitor information, particularly email addresses. It seems they
won't take "NO" for an answer.

The bottom line is that whatever information is available to _you_
is _equally_ available to every spammer on the net. Therefore when
a browser bug _does_ permit personal data to be collected, it gets
reported and fixed very quickly (one short-lived Netscape 2.0.x
release reportedly had such a bug in its Javascript engine).

You can get some limited information from the environment variables
passed to you by the browser. Relatively few of these are guaranteed
to be available, and some may be misleading. For particular types
of information, see below. For full details, see NCSA's reference pages.


--------------------------------------------------------------------------------

3.2: Can I get the email of visitors?
Why do you want to do this?

The best information available is the REMOTE_ADDR and REMOTE_HOST,
which tell you nothing about the user. Techniques such as "finger@"
are not reliable, are widely disliked, and generally serve only to
introduce long delays in your CGI. Better - as well as more polite -
just to ask your users to fill in a form.

BTW: the "From:" header line (HTTP_FROM variable) is usually only set
by robots, since human visitors to your webpage will not normally want
their addresses collected without permission, and browsers respect this.


--------------------------------------------------------------------------------

3.3: "But I saw some.kool.site display my email address..."
Some sites will play party tricks, which can get *some users'* email
addresses. Possible tell-tale signs of this are inordinate delays
loading a page (fingering @REMOTE_HOST - doesn't often work but
probably can't be detected from the webpage), or a submit button that
appears to do nothing at all (a mailto: form - works well with some
browsers but trivially detectable). As a "snoop" party trick that's
fine, but if you find someone abusing these facilities (eg they send
you junkmail), alert their service provider!


--------------------------------------------------------------------------------

3.4: Can I verify the email addresses people enter in my Form?
Unfortunately people will sometimes enter an incorrect or invalid
email address in your Form. Worse, they may enter a valid but
incorrect email address that will deliver to someone who doesn't
want your mail.

Proposed regexps to match email addresses are sometimes posted.
Most of these will fail against perfectly valid email addresses,
like "S=N.OTHER/OU1=X12345A/RECIPNUM=1/MTA-BASIC@attmail.com"
(which is what your address looks like if you are connected to
the Internet via X400 - and if you think that example is too easy,
check the ones at the end of Eli the Bearded's Email Addressing FAQ).

Probably the most complete parser and checker available for download
is Tom Christiansen's, at
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/ckaddr.gz
Of course, this still says nothing about deliverability.

A frequently-suggested hack that doesn't work is to use
SMTP EXPN or VRFY commands. Modern versions of sendmail permit
administrators to disable these commands, and many sites take
advantage of this facility to protect their users' privacy.

Probably the best way to verify an email address is to send mail to
it, asking the user to respond. Include a clause like "if you have
received this mail in error, please accept our apologies..."


--------------------------------------------------------------------------------

3.5: How can I get the hostname of the remote user?
You can't. Well, not always.

IF it is available, you'll find it in the REMOTE_HOST environment
variable. However, this will more often than not contain the numerical
IP address rather than the IP name of the remote host. Remember that
not all IP addresses have a hostname associated with them; this is the
case of most IP addresses assigned to dialup users, for example. Your
web server may also not perform a reverse lookup on incoming
connections, in which case REMOTE_HOST will contain the IP address even
if it has a corresponding IP name. In the second case, you can do a
reverse lookup yourself in your script, but this is expensive and
should probably be avoided unless absolutely necessary.

Even if you do manage to obtain a hostname, you should be aware that it
may not correspond to the hostname the user is accessing your page
from. It may instead be that of an intervening proxy host.

The short answer is therefore that there is no reliable way of finding
out what the remote user's hostname is.


--------------------------------------------------------------------------------

3.6: Can I get browser details and return different pages?
Why do you want to do this?

Well-written HTML will display correctly in any browser, so the correct
answer to this question is to design a template for your output in good
HTML, and make sure your output is correct.

If you insist on a different answer, you can use the HTTP_USER_AGENT
environment variable. This requires care, and can lead to unexpected
results. For example, checking for "Mozilla" and serving a frameset
to it ensures that you *also* serve the frameset to early (Non-Frame)
Netscapes, me-too browsers (notably Microsoft[1]) and others who have
chosen to lie to you about their browser.

Note also that not every User Agent is a browser. Your page may be
read by a user agent you've never heard of, and then displayed by
100 different browsers. Or retrieved by different browsers from
a cache. Another reason to write good HTML, and not try to
devise a clever or koool substitute.

[1] At the time of writing, only Netscape 2+ supported frames, and
some authors considered them koool. That's changed, but the same
general principle still holds.


--------------------------------------------------------------------------------

3.7: Can I trace where a user has come from/is going to?
HTTP_REFERER might or might not tell you anything. By all means
use it to collect partial statistics if you participate in (say)
an advertising banner scheme. But it is not always set, and may
be meaningless (eg if a user has accessed your page from a bookmark,
and the browser is too dumb to cope with this).

The HTTP protocol forbids relying on Referer information for functionality
in your programs, so don't try it.

You cannot trace outgoing links at all. If you really must try,
point all the external links to your HTTPD and use its redirection
facility (which gives you generally-reliable logs). This is much
more efficient than using a CGI script.

BTW: don't even think about asking Javascript to send you information
on some event: it's a violation of privacy which Netscape fixed as
soon as complaints about its abuse started coming in. If it works
with *your* browser, you should upgrade!


--------------------------------------------------------------------------------

3.8: Can I launch a long process and return a page before it's finished?
[UNIX]
You have to fork/spawn the long-running process.
The important thing to remember is to close all its file descriptors;
otherwise nothing will be returned to the browser until it's finished.
The standard trick to accomplish this is redirection to/from /dev/null:

"long_process < /dev/null > /dev/null 2>&1 &"
print HTML page as usual
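
For scripts not written in shell, the same redirection trick might look like the Python sketch below. The name launch_detached is hypothetical, and the start_new_session option is Unix-only, in keeping with the [UNIX] note above.

```python
import subprocess
import sys

def launch_detached(argv):
    """Start argv in the background with all standard streams on /dev/null."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        close_fds=True,           # do not leak other descriptors to the child
        start_new_session=True,   # detach from the server's process group
    )
```

The call returns as soon as the child has started, so the script can go on to print its HTML page without waiting for the long process.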


--------------------------------------------------------------------------------

3.9: Can I launch a long process which the user interacts with?
This does not fit well with the basic mechanics of the Web, in which
each transaction comprises a single request and response.
If your processing can be done on the Client machine, you can use
a clientside application; for example a Java applet.

For processing on the server, one trick that works well for Clients
running an X server (and far more efficient than a JAVA solution) is:
    if ( fork() ) {
        print HTML page explaining what's going on and advising about xhost
    } else {
        exec ("xterm -display THEIR_DISPLAY -title MY_APP -e MY_PROG ARGS
               < /dev/null > /dev/null 2>&1 &") ;
    }
NOTE: THEIR_DISPLAY is not necessarily the same as REMOTE_HOST or REMOTE_ADDR.
You have to ask users to supply their display (set REMOTE_HOST as default).

A JAVA terminal program will accomplish something similar for the many
users with platforms that support JAVA but not X.


--------------------------------------------------------------------------------

3.10: Can I password-protect my pages?
Yes. Use your HTTPD's authentication, just as you would a basic HTML page.
Now you'll have the identity of every visitor in REMOTE_USER.


--------------------------------------------------------------------------------

3.11: Can I do HTTP authentication using CGI?
It depends on which version of the question you asked.

Yes, you can use CGI to trigger the browser's standard Username/Password
dialogue. Send a response code 401, together with a "WWW-authenticate"
header including details of the authentication scheme and realm:
e.g. (in a non-NPH script)

Status: 401 Unauthorized to access the document
WWW-authenticate: Basic realm="foobar"
Content-type: text/plain

Unauthorised to access this document

What you can do with this is server-dependent, and harder to get right,
since most servers expect to deal with authentication before ever
reaching the CGI (eg through .www_acl or .htaccess).
Thus it cannot usefully replace the standard login sequence, although
it can be applied to other situations, such as re-validating a user -
e.g. after a certain timeout period or if the same person may need to
log in under more than one userid.

What you can never get in CGI is the credentials returned by the user.
The HTTPD takes care of this, and simply sets REMOTE_USER to the
username if the correct password was entered.

For a much longer but outdated treatment of this question,
see my discussion at http://www.webthing.com/tutorials/login.html


--------------------------------------------------------------------------------

3.12: Can I identify users/sessions without password protection?
The most usual (but browser-dependent) way to do this is to set a cookie.
If you do this, you are accepting that not all users will have a 'session'.

An alternative is to pass a session ID in every GET URL, and in hidden
fields of POST requests. This can be a big overhead unless _every_ page
requires CGI in any case.

Another alternative is the Hyper-G[1] solution of encoding a session-id in
the URLs of pages returned:
http://hyper-g.server/session_id/real/path/to/page
This has the drawback of making the URLs very confusing, and causes any
bookmarked pages to generate old session_ids.

Note that a session ID based solely on REMOTE_HOST (or REMOTE_ADDR)
will NOT work, as multiple users may access your pages concurrently
from the same machine.

[1] Actually I don't think that's been true of Hyper-G since sometime
in '96. However, general advances in web server technology, such as
Apache's mod_alias or mod_rewrite, make it straightforward without
the need for CGI.


--------------------------------------------------------------------------------

3.13: Can I redirect users to another page?
For permanent and simple redirection, use the HTTPD configuration file:
it's much more efficient than doing it yourself. Some servers enable
you to do this using a file in your own directory (eg Apache) whereas
others use a single configuration file (eg CERN).

For more complicated cases (eg process form inputs and conditionally
redirect the user), use the "Location:" response header.
If the redirection is itself a CGI script, it is easy to URLencode
parameters to it in a GET request, but don't forget to escape the URL!
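
As a rough illustration of the escaping step, the Python sketch below builds a "Location:" response with its parameters URL-encoded. The function name redirect_response and the URL in the usage note are placeholders, not part of any real server.

```python
from urllib.parse import urlencode

def redirect_response(base_url, params):
    """Build a CGI response that redirects to base_url with encoded params."""
    # urlencode escapes reserved characters in both names and values.
    target = f"{base_url}?{urlencode(params)}" if params else base_url
    # A blank line ends the header block; no body is needed.
    return f"Location: {target}\n\n"
```

For example, redirect_response("http://your.server/cgi-bin/next.cgi", {"q": "a b&c"}) encodes the space and the ampersand in the value so they cannot be mistaken for parameter separators.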


--------------------------------------------------------------------------------

3.14: Can I run a CGI script without returning a new page to the browser?
Yes, but think carefully first: How are your readers going to know
that their "submit" has succeeded? They may hit 'submit' many times!

The correct solution according to the HTTP specification is to
return HTTP status code 204. As an NPH script, this would be:

#!/bin/sh
# do processing (or launch it as background job)
echo "HTTP/1.0 204 No Content"
echo

(as non-NPH, you'd simply replace HTTP/1.0 with the Status: CGI header).

Alan J Flavell has pointed out that this will fail with certain
popular browsers, and suggests a workaround to accommodate them:

[ May 1998 update[1]: I'm deleting Alan's suggestion, because the problem
is mainly of historical interest, and the workaround is no longer
recommended. See his page for a detailed survey and recommendations.
]

His survey is at
http://ppewww.ph.gla.ac.uk/%7Eflavell/status204/results.html

[1] With apologies to Alan for having left it in so long.


--------------------------------------------------------------------------------

3.15: Can I write output to a different Netscape frame?
Yep. The fact you're using CGI makes no difference: use
"target=" in your links as usual. Alternatively, the script
can print a "Window-target:" header. Read Netscape's pages
for detail: these answer all the questions about things like
"getting rid of" or "breaking out of" frames, too.


--------------------------------------------------------------------------------

3.16: Can I write output to several frames at once?
A single CGI script can only ever print to one frame.

However, this limitation may be overcome by using more than one script.
The first script (the URL of the "submit" button) prints a frameset,
typically to a "_parent" or "_top" target. The sources for one or
more of the frames thus generated may also be CGI scripts, to which
you can easily pass parameters (eg encoded in URLs with method GET).
This hack is definitely not recommended. If you find yourself wanting
to update several frames from a single user event, it probably means
you should review the design of your application at a higher level.

Warnings:
1. Don't forget to escape your URLs.
2. This technique results in your server being hit by multiple
concurrent CGI requests. You'll need LOTS of memory, especially
if you use a memory-hog like Perl. It can be a good recipe
for bringing a server to its knees.

Javascript is often a valid alternative here, but note just how silly
it can (and often does) look in a different browser.


--------------------------------------------------------------------------------

3.17: Can I use a CGI script to generate both text and inline images?
Not directly. One script generates one response to one request.

If you want to generate a dynamic page including dynamic images
(say, a report including graphs, all of which depend on user input)
then your primary script will print the usual
<img src="[script-to-generate-image]" alt="[what you asked for]">
and, just as in the multiple frames case, you can pass data to the
image-generating program encoded in a GET URL. Of course, the same
caveats apply: see above.
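
A sketch of how the primary script might build such a tag, in Python. The name image_tag and the script path in the usage note are assumptions for the example. Both the URL and the alt text are escaped for HTML.

```python
from html import escape
from urllib.parse import urlencode

def image_tag(script_url, params, alt_text):
    """Return an <img> tag whose source is an image-generating CGI script."""
    src = f"{script_url}?{urlencode(params)}"
    # escape() turns the "&" between parameters into "&amp;" for HTML.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt_text, quote=True)}">'
```

Calling image_tag("/cgi-bin/graph.cgi", {"year": "2001", "kind": "bar"}, "Sales") gives a tag whose src carries both parameters to the image script in a GET URL.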


--------------------------------------------------------------------------------

3.18: How can I use Caches to make CGI scripts faster and more Net-friendly?
This is currently beyond the scope of this FAQ. However,
there is an excellent introduction to net-friendly webpages, including
CGI pages, at http://vancouver-webpages.com/CacheNow/

A sample cacheing perl/cgi script by Andrew Daviel is available at
http://vancouver-webpages.com/proxy/log-tail.pl


--------------------------------------------------------------------------------

3.19: How can I avoid users hitting "submit" twice?
You can't. You just have to deal with it when they do.

You can avoid re-processing a submission by embedding a unique ID in your
Form each time it is displayed. When you process the form, you enter
the ID in a database. Or, if it's already there, you don't repeat the
processing.

You probably want to expire your database entries after a little time:
an hour should be fine in a typical situation.
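
The ID scheme with expiry can be sketched as follows in Python. This is an assumption-laden illustration: SubmissionLog is a made-up class that keeps its entries in memory, whereas a real CGI program starts afresh on every request and would need a file or database behind it.

```python
import secrets
import time

class SubmissionLog:
    """In-memory stand-in for the database of processed form IDs."""

    def __init__(self, lifetime_seconds=3600):
        self.lifetime = lifetime_seconds
        self.seen = {}  # form ID -> time it was first processed

    def new_form_id(self):
        """Unique ID to embed in a hidden field each time the form is shown."""
        return secrets.token_hex(16)

    def first_time(self, form_id, now=None):
        """Record form_id; return False if it was already processed."""
        now = time.time() if now is None else now
        # Expire old entries so the log does not grow without bound.
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.lifetime}
        if form_id in self.seen:
            return False
        self.seen[form_id] = now
        return True
```

The script would call first_time() with the ID from the submitted form and skip the processing whenever it returns False.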

If you're already using cookies (e.g. a shopping cart), an alternative is
to use the cookie as a unique identifier. This means you also have to
handle the situation where a user deliberately "goes round twice" and
submits the same form with different contents.

If your script may take some time to process, you should also consider
running it as a background job, and returning an immediate
acknowledgement to the user (see question 3.8 if your "immediate" response
gets delayed until processing is complete in any case).


--------------------------------------------------------------------------------

3.20: How can I stop my CGI script reading and writing files as "nobody"?
CGI scripts are run by the HTTPD, and therefore by the UID of the HTTPD
process, which is (by convention) usually a special user "nobody".

There are two basic ways to run a script under your own userid:
(1) The direct approach: use a setuid program.
(2) The double-server approach: have your CGI script communicate
with a second process (e.g. a daemon) running under your userid,
which is responsible for the actual file management.

The direct approach is usually faster, but the client-server architecture
may help with other problems, such as maintaining integrity of a database.

When running a compiled CGI program (e.g. C, C++), you can make it
setuid by simply setting the setuid bit:
e.g. "chmod 4755 myprog.cgi"

For security reasons, this is not possible with scripting languages
(eg Perl, Tcl, shell). A workaround is to run them from a setuid
program, such as cgiwrap.

In most cases where you'd want to use the client-server approach,
the server is a finished product (such as an SQL server) with its
own CGI interface.
A lightweight alternative to this is Don Libes' "expect" package.

Note that any program running under your userid has access to all your
files, and could do serious damage if hacked. Take care!


--------------------------------------------------------------------------------

3.21: How can I prevent my CGI results being cached by the browser?
Firstly, we need to debunk a myth. People asking this question usually
add that they tried "Pragma: no-cache". Whilst this is not actively
wrong, there is no requirement on browsers to take any notice of it,
and most of them don't.

The "Pragma: no-cache" header (now superseded by HTTP/1.1 Cache-Control)
is a directive to proxies. The browser sends it with an HTTP request
to indicate that it wants the request to be dealt with by the original
server and will not accept a proxy's cached document (e.g. when you
use a reload button). The server may send it to tell a proxy not to
cache the document.

Having said all that, a practical hack to get round cacheing is
to use a different URL for your CGI script each time it's called.
This can easily be accomplished by adding a unique identifier such
as current time in the QUERY_STRING or PATH_INFO. The browser will
see a different URL, but the script can just ignore it. Note that
this can be very inefficient, and should be avoided where possible.
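
One way to generate such URLs is sketched below in Python. The function name unique_url and the parameter name "t" are arbitrary choices for the example; the script on the receiving end simply ignores that parameter.

```python
import time
from urllib.parse import urlencode

def unique_url(script_url, params=None, now=None):
    """Append a timestamp parameter so each call yields a distinct URL."""
    query = dict(params or {})
    # "t" is an arbitrary parameter name that the script itself ignores.
    query["t"] = str(time.time_ns() if now is None else now)
    return f"{script_url}?{urlencode(query)}"
```

The now argument is there only to make the result predictable when trying the function out; in normal use the current time is taken automatically.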


--------------------------------------------------------------------------------

3.22: How can I control the default filename when downloading a file via CGI?
(from a newsgroup post by Matthew Healy)

One option, assuming you aren't already using the PATH_INFO
environment variable, is just to call your CGI script with extra
path information.

For example, suppose the URL to your script is actually

http://server.com/scriptname?name1=value1&name2=value2

Instead, try calling it as

http://server.com/scriptname/filename.ext?name1=value1&name2=value2

and note that you need to escape the URL if it's in an HTML page:

http://server.com/scriptname/filename.ext?name1=value1&amp;name2=value2

The browser will then probably offer the name given in the last chunk
as the suggested filename for downloading.

This works because the http server looks for the program file to run,
then passes any extra path to the program as the PATH_INFO variable; the
browser cannot tell where the SCRIPT_NAME part ends and the PATH_INFO
part begins.

This can also be very useful if you want one script to generate more
than one filename -- the script can check the PATH_INFO value and
alter its response accordingly...
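
On the script side, picking the filename out of PATH_INFO might look like this Python sketch. The function name requested_filename and the default name are hypothetical, and the environ argument exists only so it can be tried without a server.

```python
import os
import posixpath

def requested_filename(environ=os.environ, default="download.dat"):
    """Return the last component of PATH_INFO, or a default if it is empty."""
    path_info = environ.get("PATH_INFO", "")
    # URL paths always use "/", so posixpath is used on every platform.
    name = posixpath.basename(path_info)
    return name or default
```

A script serving several files can branch on the returned name to decide what to send.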



  † In Memoriam † Sunday 14 October 2001 @ 23:53:54 #68
13819 Loedertje
Proud GILF.
pi_1902022
Maggots

There is still plenty of life underground,

when at last we close our eyes for good:

the maggots gladly make mincemeat of us,

they know us inside and out.

  † In Memoriam † Sunday 14 October 2001 @ 23:56:41 #69
13819 Loedertje
Proud GILF.
pi_1902044
quote:
On Sunday 14 October 2001 23:45 calvobbes wrote the following:

[..]

Me neither, shoot me



(nice, isn't it?)

pi_1902060
Acda & De Munnik - Naar huis
.
pi_1902067
quote:
On Sunday 14 October 2001 23:56 Loedertje wrote the following:

[..]

[image]


(nice, isn't it?)


  † In Memoriam † Monday 15 October 2001 @ 00:02:31 #72
13819 Loedertje
Proud GILF.
pi_1902096
The Mother of Punk, so what the funk!

Nina Hagen was born in the Eastern sector of Berlin on March 11, 1955, to actress Eva Marie Hagen and writer Hans Hagen. Her parents divorced when she was two; eight years later her mother remarried. Nina's new step-father was the dissident poet-songwriter Wolf Biermann. Throughout her childhood Nina joined various East German youth organizations, although the presence of protester Biermann in her life proved to be a bit of a problem.

When she was 17, she failed her examination at the government-controlled East German Actors School in Berlin-Schöneweide. Instead, she went to Poland for several months, where she sang with a band for the first time. The following year, upon returning to East Germany, she enrolled at the Studio für Unterhaltungsmusik (Studio for Popular Music) and graduated a year later with outstanding honours. As part of her training she toured East Germany for two months.

She spent several more years touring East Germany with the Alfons Wonneberg Orchestra, but, eventually tiring of this, she started her own band, Automobil. From then on she did full-scale concerts, often performing for eight hours straight, and working so hard that she was ordered by her doctor to take a break. She did, but then re-emerged a few months later with another group, Fritzens Dampferband. Tiring of this as well, Nina took the opportunity to leave the country when her step-father was expelled from East Germany in 1976 (in fact, she was practically begged to leave by the authorities at this point). She arrived in the Federal Republic of Germany (that is, West Germany) and soon secured a recording contract.

A year or so later, Nina flew to London to see what the music scene was like in the UK. She wasted no time in meeting the Slits and writing a few songs with that group's vocalist, Ari Up. Back in West Berlin in mid-1977, she met up with the members of her future group, the Nina Hagen Band: guitarist Bernhard Potschka, bassist Manfred Praeker, drummer Herwig Mitteregger and keyboardist Reinhold Heil.

Nina recorded her initial albums in German. The first one, called simply The Nina Hagen Band (1978), was more reminiscent of the American new wave sound than of English punk. Her frenetic, guttural voice and wide vocal range were distinctive on songs such as TV Glotzer (a reworking of the Tubes' "White Punks on Dope"), Gott Im Himmel (a cover of Norman Greenbaum's "Spirit in the Sky") and the powerful, anthemic Auf'm Friedhof. Her second LP, Unbehagen, released in 1979, spawned the single African Reggae, which received a considerable amount of airplay on alternative radio stations.

In 1979, her appearance in the film Cha Cha (HERMAN BROOD!!!) captured the impact of new wave on the Amsterdam underground scene. The soundtrack for this film featured the first of several collaborations with new wave diva Lene Lovich, whom Nina met on the set of the film. They have since maintained a personal friendship and professional relationship. In fact, Nina included a German-language version of Lovich's new wave hit "Lucky Number" (Wir Leben Immer Noch) on Unbehagen; and in 1986 the two of them sang together on "Don't Kill the Animals", an animal-rights song that has since appeared on various compilations.

Nina dissolved her band after the release of their second album in 1979, deciding instead to pursue a solo career. She went on to achieve a certain level of infamy, if not exactly fame, in her home country, as her decidedly anti-establishment lyrics resulted in a high level of press condemnation.

"From 1978 to 1985, the musical career of Nina Hagen flourished throughout the galaxy by virtue of her rigorous intercontinental touring schedules and the availability of her five albums on CBS Records... No matter that Nina Hagen - actress, chanteuse, political firebrand, doting mother, animal rights activist - was an anomaly in the era of punk rock and disco divas: Marlene Dietrich meets Emma Goldman on stage at the Ritz. No matter that awestruck critics in America were at a genuine loss to categorize her and that radio reeled in horror. No matter that her lifestyle went beyond conventional limits of women's liberation more than a decade ago. At the end of the day, as with every great musical artist, the recordings are the true chronicle of events...." (excerpt from liner notes)

Nina continues to enjoy a rock'n'roll lifestyle as well as a certain level of fame and notoriety, especially back home in Germany. In 1985, her concert in Tokyo was accompanied by the Japanese Philharmonic Orchestra and directed by Eberhard Schoener; also in 1985, Nina played live for more than a million fans at "Rock in Rio". More recently, Nina completed the "Brecht" Tour together with German actress Meret Becker. She has also continued to appear in films, such as Portrait Of A Woman Drinker and Pedro Almodovar's Pepi, Luci, Bom. For a more complete listing of Nina's cinematic accomplishments, please check out The Internet Movie Database.

B-live it or not: Nina Hagen has two children: daughter Cosma Shiva (born in 1981, who is becoming somewhat of an accomplished actress herself) and son Otis (born in 1990).

  † In Memoriam † Monday 15 October 2001 @ 00:11:15 #73
13819 Loedertje
Proud GILF.
pi_1902175
Straight-Edge - a subculture!?
(1/4/1997)

Straight-edge, not worth the fuss. Straight-edge, the movement is old news, and is really more Stray-edge: an aberration on the edge. And young people are falling for it en masse.

Straight-edge, abstaining from alcohol, drugs, sex, and in many cases meat as well, is of course not such a crazy idea as a counter-movement. I just find it a pity that the arguments of the straight-edgers I have heard so far are so breathtakingly weak that they wring their own movement's neck in advance. Of course the aim or vision of a movement need not be 100% rational; indeed, to get moving at all, people always need contradictions. Still, the motives behind straight-edge contradict its methods to such a degree that all I can really do is laugh.

Straight-edge, having control over yourself by not letting your mind be influenced by substances from outside. Fortunately many are just consistent enough to count caffeinated drinks as well, but exhaust fumes, contaminants in the water, baby food, tangible media, or clothing made from materials other than those of your own body: THAT is something nobody mentions. Ah, you are a straight-edger? Well done, that even before you were born you had enough control over your own consciousness not to be shaped by your surroundings during your childhood! Other people must envy you; after all, they are slaves to their neural network, and the things they want spring from a brain shaped by the outside world. Their striving for control is a pitiful and above all pointless resistance to the laws of nature. Okay, if you want to talk about control insofar as that illusion can be a reality, then in my opinion you are being a fool anyway when you hang your entire outlook on life on one or two outside substances. Alcoholics or heroin junkies need not be out of control at all, unless they let their drug determine their lives. Whoever does so chooses a life of dependence for themselves, and as long as dependence is self-chosen it offers more freedom than most people will ever experience. Real freedom lies in that little neural network of yours, not in the (bio)chemical balance inside it.

Straight-edge, as a way to live more healthily, makes a lot of sense. Provided you believe in relative concepts such as health. If you do, you will have accepted a thing or two of what your parents told you. Loser! How can you still defend your lifestyle now!? If only you had left it at live & let live, no sex, no drugs, no meat, and not made a political statement of it. Simply by claiming that something is a counter-movement, are you not already stressing that it is the result of an outside influence? If only you had just said that YOU simply felt good about this kind of total abstinence, and had not stooped to irrational accusations against the punk movement. I have just read yet another article about straight-edge in Nieuwe Revu, in which the umpteenth follower of this sect literally says about punk: "The anarchism they are always talking about, being free. But how can you be free if your life is ruled by alcohol and drugs? Then in my opinion you are not free at all. It just doesn't add up. Punks are not at all what they pretend to be. That is why I actually see straight-edge as the only true form of punk. The ultimate form of punk". But then punks did not mix up politics and science. The freedom they pursued had nothing to do with denying the laws of nature that our multiverse obeys. In this respect straight-edge is even more hypocritical than those idiotic action groups that supposedly pursue a better environment by destroying other people's property.

Straight-edge, that's not for me. I will not let myself be dragged along like a brainless puppet by the first counter-movement that comes by (I only let myself be dragged along like a brainless puppet by the chemical reactions in my brain that result from the confrontation with this counter-movement). Straight-edgers are stupid (by now exceptions can only prove the rule) and are blinded by their own statements. That is hardly realistic, but perhaps they simply still need to find something to fill their lives with other than sex and drugs, because that was all their lives consisted of before. Well, then you are even sadder than you are stupid. But enough theatre for now. Ranting the way I am doing here is seldom seen in perspective, so I once again feel obliged to state in advance that I have absolutely nothing against teetotallers, unless they start bragging about their achievements and attach false motives to their noble aim. You should live in the way that feels best to you, but do not pretend to know straight away how things are. If there is ONE thing that shows that the only enemy we have left is our own hypocrisy and ignorance (and that we are in fact only fighting ourselves), then this is it.

  † In Memoriam † Monday 15 October 2001 @ 00:14:46 #74
13819 Loedertje
Proud GILF.
pi_1902216
Justice Tonight/Kick it Over

STAY AROUND DON'T PLAY AROUND
THIS OLD TOWN AND ALL
SEEMS LIKE I GOT TO TRAVEL ON
A LOT OF PEOPLE WON'T GET NO SUPPER TONIGHT
JUSTICE TONIGHT
RUNNIN AND A HIDING TONIGHT
JUSTICE TONIGHT
REMEMBER TO KICK IT OVER
NO ONE WILL GUIDE YOU THROUGH ARMAGIDEON TIME
IT'S ARMAGIDEON
IT'S NOT CHRISTMAS TIME
A LOT OF PEOPLE
A LOT OF PEOPLE USE A CALCULATOR
A LOT OF PEOPLE WON'T GET NO SUPPER TONIGHT
A LOT OF PEOPLE SITTIN' DOWN BY THE LIGHT
THE BATTLE IS GETTIN HOTTER
ARMAGIDEON TIME
ARMAGIDEON
REMEMBER TO KICK IT OVER
ARMAGIDEON TIME
A LOT OF PEOPLE AIN'T GOT NO SUPPER TONIGHT
A LOT OF PEOPLE GOT TO STAND OUT BACK


(The Clash)

  † In Memoriam † Monday 15 October 2001 @ 00:20:33 #75
13819 Loedertje
Proud GILF.
pi_1902265
European youth win their climate bet against the EU
Contributed on Wednesday September 12, @ 12:56PM
EU Commissioner Margot Wallström: "This is a bet that I like to lose"

Remerschen, Luxembourg, September 7th - Today the European Environment Agency (EEA) announced that European youth have won their climate bet against EU Environment Commissioner Wallström. Within eight months, young people from 16 European countries managed to reduce their CO2 emissions by 8% in their schools and at home. About 52,000 so-called 'Betties' participated in this climate campaign. EU Commissioner Wallström: "I am impressed by the work of the Betties. I love to lose this bet".

--------------------------------------------------------------------------------


The European Betties have been working hard on their CO2 emissions since the climate summit in The Hague in November last year. In The Hague, they bet they would beat the EU target of reducing its emissions by 8% in eight years: the Betties said they would do it in 8 months. EU Commissioner Wallström took up the bet, and said the Betties wouldn't make it. Had the Betties lost, they would have had to cycle Mrs. Wallström in a rickshaw to all her meetings in Brussels for a week. Now that the Betties have won, Mrs. Wallström will instead ride her bike from her home in Brussels to work for a month.

About 90 schools from all over Europe managed to reduce their emissions by 8%. This part of The BET was won by the young Europeans. Overall, the Betties saved about 4 million kilograms of CO2. The Betties admit that they wanted to reach 8 million kilograms, but still feel like winners. Henrike Wegener, one of the activists from the coordination office of the campaign in Berlin: 'We proved that CO2 emission reducing can be done easily, fast and cheap. All it takes is your own will and creativity.'

Through the BET campaign, thousands of young Europeans became interested in climate politics, an otherwise rather dull topic for people aged 16 to 25. The Betties reduced their emissions with simple measures: turning down the heating, installing energy-saving light bulbs, insulating rooms, repairing leaking water taps, setting up recycling systems for water and paper. Wegener: 'One would think that these steps are already well known to people, but apparently they are not. There is still a lot of educational work to do'.

The Betties celebrated their win with a so-called climate week in Remerschen, Luxembourg. After the announcement of the winner, Margot Wallström had an informal meeting with the roughly 50 Betties present. Luxembourg's State Secretary for Environment Berger attended the press conference as well.

--------------------------------------------------------------------------------
For further information:
Jeroen Kuiper (European coordinator) 0049 30 7970 6610
