Friday, 8 May 2009

UNICODE-MORSE

UNICODE ARTICLE FROM EYE MAGAZINE

The United Nations of Type



Unicode is one of the great success stories of standardisation. The Unicode Consortium is a not-for-profit organisation founded in 1991 by the major computer companies (including Apple, Microsoft and Adobe), and its mission is to ensure consistency in the digital encoding of typographic characters.

In the original Unicode, each character had an unambiguously defined sixteen-bit code point. Updates to Unicode have made available seventeen ‘planes’ of 65,536 code points each: potentially 1,114,112 different characters, minus several thousand ‘function’ codes. The current version (Unicode 5.0) has the potential to encode all the characters in all the world’s languages, living or dead, realising the dream of Xerox PARC’s Joseph D. Becker that ‘the ultimate goal must be multi-lingual word processing.’
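
That code-space arithmetic is easy to check. A minimal Python sketch (the helper name plane_of is our own illustration, not part of any standard API):

```python
PLANE_SIZE = 0x10000               # 65,536 code points per plane
NUM_PLANES = 17                    # planes 0 to 16

assert PLANE_SIZE * NUM_PLANES == 1_114_112

def plane_of(char: str) -> int:
    """Return the Unicode plane (0-16) a character belongs to."""
    return ord(char) // PLANE_SIZE

print(plane_of("A"))           # 0: the Basic Multilingual Plane
print(plane_of("\U00012000"))  # 1: CUNEIFORM SIGN A sits in a higher plane
```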

Decodeunicode is a not-for-profit organisation that aims to ‘decode’ and demystify Unicode, and to proselytise independently on its behalf. The project was launched in 2004 by the design department at the University of Applied Sciences in Mainz, Germany, where Johannes Bergerhausen is professor of typography, and went online in 2005 as www.decodeunicode.org. It was initially supported by a grant from the German federal government, but is now self-supporting, raising revenue from the sale of graphic materials such as postcards and a giant, limited-edition poster. The website has won many awards, including: 100 Beste Plakate (Germany, Austria, Switzerland), 2004; iF Gold Award, 2005; iF Gold Award, 2006; and Red Dot Best of the Best, 2006; plus a nomination for the Design Award of the Federal Republic of Germany, 2007.

Bergerhausen, originally from Bonn, studied communication design in Düsseldorf and lived and worked in Paris for much of the 1990s. He collaborated with Grapus founders Gérard Paris-Clavel (see Eye no. 27 vol. 7) at Ne Pas Plier and Pierre Bernard (Eye no. 3 vol. 1) at Atelier de Création Graphique before founding his own practice. He conducted a French-funded research project on ASCII in 1998, and returned to Germany in 2000, taking up the Mainz post in 2002. For a while, Decodeunicode was a full-time job for Bergerhausen, but it now takes up about 20 per cent of his working time.

This interview, conducted mainly by email, took place while Professor Bergerhausen was based temporarily in Paris (for research work), and ‘on the road’, lecturing in Bologna, Amiens and Mainz. He has also delivered lectures about Decodeunicode in Beirut, Berlin, Dubai, Frankfurt, London, Paris, Prague, Rotterdam, San Francisco and Weimar. 

His team is currently working to integrate all Unicode characters, including ancient and minority scripts from the other Unicode planes, such as Cuneiform and Byzantine musical symbols, bringing the total to more than 99,000 characters.

John L. Walters 
Is Decodeunicode just for designers and typographers? Or is it for everyone?

Johannes Bergerhausen 
Our project is, on the one hand, directed at experts, e.g. designers, typographers, linguists or programmers. On the other hand, we also receive feedback from a lay public that simply takes pleasure in the character shapes of scripts such as Arabic or Ethiopic. Our website aims to provide popular-science access to the writing systems of the world.

Why you, Johannes Bergerhausen?

I felt that one of the most important revolutions in typography should not happen unnoticed by the public – it has been a kind of idée fixe for me. Anyone with a PC or Mac has access to more characters than Gutenberg ever had. Every time I present the project I feel I am showing people a bizarre, fascinating and completely unknown universe. The people at the Unicode Consortium are doing a very good job, but they don’t communicate with the general public. And they find it hard to make people understand that it is not a question of simply assigning code points to a couple of characters: the technical process of implementation is very complicated.

And of course I am not alone. The core team today varies between three and five people; as many as 180 online contributors have helped, and students on the planned Master’s course at Mainz will be joining.

Can Decodeunicode change people and behaviour and things?

Unicode has already changed the way business people think of type. A corporate font for a modern international organisation can no longer stick to Latin characters alone. Addressing large and small markets in their national language, in their local script, has become much easier, and even saves money. The Decodeunicode project makes a small contribution to this by drawing attention to the subject, and what remains to be done. Since the implementation of Unicode and OpenType, the type design community has shown an increased interest in ‘exotic’ writing systems. More non-Latin fonts than ever are being created. 

When did you first become aware of the political and social aspects of typography and design? 

It was when I discovered that the ISO Latin-1 code of the 1980s was like something from the Cold War: special characters like the German umlaut, the French cedilla or the Spanish tilde are encoded – but try writing a name like Milošević!
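
The point is easy to demonstrate with the legacy codecs that still ship in, for example, Python’s standard library. A minimal sketch:

```python
name = "Milošević"

print(name.encode("utf-8"))   # b'Milo\xc5\xa1evi\xc4\x87' -- Unicode copes
try:
    name.encode("latin-1")    # ISO Latin-1 has no code point for 'š' or 'ć'
except UnicodeEncodeError as err:
    print(err)
```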

Has there been any opposition to Unicode?

The greatest danger lies in the possibility of the computer industry losing interest in encoding missing writing systems, assuming the markets they represent are too small. About 30 living minority scripts have yet to be encoded. Though some of them represent communities of no more than 100,000 people, they, too, provide a market. As yet, none of the leading operating systems (Windows, Mac, Linux, etc.) supplies enough fonts to cover all the Unicode characters. 

Furthermore, keyboard layouts are missing. At the very least, operating systems should be localised for every large language community. Take, for example, Cambodia, with its population of thirteen million. That’s some market!

But some markets will always be smaller than others. Are you not making a point about democracy and equality here, too? 

Yes, I am. Apart from the designer’s existing social responsibility, there is a social responsibility on the part of the big global brands such as Apple, Microsoft or Nokia to support minority communities’ access to the IT world. Attaining this goal might well become easier in the future, with a growing number of devices being equipped with input methods that use soft buttons instead of hard buttons, like the iPhone or the Optimus keyboard developed by the Russian design studio Art. Lebedev.
The so-called ‘other planes’ include many characters, such as Cuneiform or Old Italic, that are used by only a small number of communities. Some software developers see no sense in developing fonts for them. To give an example: in order to save storage space, mobile phones offer only the 10,000 most important Chinese characters as a font. But we need to think beyond that. The storage space of a mobile phone today equals that of a computer two decades ago. In a few years it will be no problem to include everything. Every operating system should then at least offer a default font for each character.

You refer to Unicode as a sort of ‘typographical UN General Assembly’. Is the Consortium similarly unwieldy and complex?

Yes, and you could take the simile further: international organisations are slow. It may take a couple of years for a proposal to include a new character to pass (or be rejected), after which one has to wait for the next update of the operating systems. We are actually talking about a democratisation of characters, regardless of the tools on which (or countries in which) they are to be used in the future. This is a one-off process humanity must undertake. The Unicode Consortium adheres to a few ‘basic laws’ such as, for example, never changing a character’s name, even if it is incorrect, so as not to corrupt the data. However dictatorially inclined designers may be, I cannot imagine any breakaways.

You describe ASCII as ‘one of the most successful codes’ since DNA. How and why did things go wrong in the early 1980s with eight-bit codes such as ISO Latin-1 and the Windows code pages?

In the 1980s, every manufacturer produced their own code chart – they didn’t think of a common standard. Everyone thought their own system was the best, and nobody had in mind the exchange of characters via the internet. It was an evolutionary contest, like Blu-ray versus HD DVD today.
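
That incompatibility is still visible through the legacy codecs preserved in, say, Python. A minimal sketch: one and the same byte value meant four different characters under four code charts of the era:

```python
raw = bytes([0xE4])
for codec in ("latin-1", "iso8859-7", "cp1251", "mac-roman"):
    print(f"{codec:10} -> {raw.decode(codec)}")
# latin-1    -> ä  (Western European)
# iso8859-7  -> δ  (Greek)
# cp1251     -> д  (Windows Cyrillic)
# mac-roman  -> ‰  (classic Mac OS)
```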

Evolution can be a slow and inefficient process. How did universal encoding come about?

Xerox’s Star workstation, which was developed in 1980, featured not only the first commercial graphical user interface with windows, icons and a mouse, but also a sixteen-bit universal code including Asian characters, which may be regarded as a predecessor of Unicode. Joseph D. Becker published the first paper on universal coding in 1988, after discussions between two members of the Xerox PARC (Palo Alto Research Center) staff and one Apple employee.
In 1991, Unicode 1.0.0 was published. The breakthrough came with the support of Unicode by Windows NT in 1993 and XML in 1998. The internet speaks Unicode.
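
How a format born in the sixteen-bit era reaches all seventeen planes is worth a quick look. A small Python sketch (standard library only): a plane-1 character is carried by UTF-16 as a surrogate pair, and by UTF-8, the web’s dominant encoding, as four bytes:

```python
ch = "\U00012000"                   # CUNEIFORM SIGN A, plane 1

# Surrogate-pair arithmetic: how 16-bit units address the higher planes.
offset = ord(ch) - 0x10000
high = 0xD800 + (offset >> 10)      # 0xD808
low  = 0xDC00 + (offset & 0x3FF)    # 0xDC00

assert ch.encode("utf-16-be") == bytes.fromhex("d808dc00")
assert ch.encode("utf-8") == bytes.fromhex("f0928080")
```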

Why is it important that Unicode encodes dead languages? Is it just because we can?

At present, researchers can only work with images of texts in archaic scripts – they can’t use full-text search. They would like to work with these characters in PDFs, presentation documents and emails, too. Once Aramaic characters arrive in one of the next Unicode updates, we can expect interesting results within a few years from Bible scholars analysing the Dead Sea Scrolls.

Will Unicode ever embrace non-alphabetic symbols, such as pictograms and dingbats? Or dance notation, such as Labanotation?

This is one of the fascinating questions for the future of Unicode: what to do about special characters such as astronomical signs, laundry symbols, dingbats, etc. The American linguist Deborah Anderson, the German typographer and researcher Andreas Stötzner and I filed an initial proposal for a new Unicode block called ‘Public Signage’ last year. Western musical symbols and even Byzantine and ancient Greek musical symbols are already encoded. 
Any community wishing to use Labanotation or any other writing system is free to file a proposal with the Unicode Consortium.

In 2004 you made a plea for ‘typographers, linguists and other experts to join in the finalising of the work’. Has this been heeded?

It has, even though we still have a long way to go. Few linguists, apart from Unicode’s Irish co-author Michael Everson, are aware of the problems of dealing with letterforms. Deborah Anderson’s ‘Script Encoding Initiative’ fights for the inclusion of minority scripts. 

We are also working on an update to integrate the lesser-known characters of the Unicode standard (the ‘other planes’). Only a small number of our 60,000 unique monthly visitors actually contribute information. It is my hope that the site will become one of the references for information about Unicode characters. It has many good contributions in German, but we need more contributions in English! 

Every writer (on a computer or mobile device) is now a typesetter. Do ordinary people realise this? 

Some of them do. At least, they ought to be able to tell the apostrophe from the minute sign. I think that dictionaries, in their descriptions of special characters (such as the different dashes: - – — ), should give the corresponding Unicode positions (for example, ‘U+2015 horizontal bar’), as these are the only hardware- and software-independent identification.
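
Python’s standard unicodedata module exposes exactly those machine-independent names, so the look-alikes are easy to inspect:

```python
import unicodedata

# Quotes, primes and dashes that everyday typesetters confuse.
for ch in ["'", "\u2019", "\u2032", "-", "\u2013", "\u2014", "\u2015"]:
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")
# U+0027  '  APOSTROPHE
# U+2019  ’  RIGHT SINGLE QUOTATION MARK
# U+2032  ′  PRIME  (the minute sign)
# U+002D  -  HYPHEN-MINUS
# U+2013  –  EN DASH
# U+2014  —  EM DASH
# U+2015  ―  HORIZONTAL BAR
```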

Do designers care about Decodeunicode? 

Yes. Interest in this strange subject does apparently exist! In 2005, the website counted a monthly average of 11,000 visitors. In 2006, it was 33,000, and in the first four months of 2007, it was more than 65,000 people from 133 countries, making two million page impressions per month. The second edition of our poster sold most of its 600 copies within two months. We have just finished another poster, showing all the characters of the Basic Multilingual Plane (BMP) of Unicode 5.0, and folded as a map. This is much easier to handle, and we have printed 3,000. All profits go to the advancement of the project.

Who else is supporting the cause? 

There are many people working on advancing Unicode. First of all, there is the Unicode community working on proposals to encode new scripts or missing single characters. International working groups meet twice a year somewhere on the planet. There are proposals to include Egyptian hieroglyphs. 

Type designers are aware of Unicode, but only a few people, like Andreas Stötzner, are taking the time to work on proposals to include new characters. He is currently preparing proposals on medieval Latin characters and recently re-filed a proposal to include the disputed German Versal-Eszett (uppercase ß).

But do enough people know about it?

There is not enough feedback from the academic world. Possibly, it has yet to realise that, once encoded in Unicode, a character becomes available ‘in the public domain’. (The German Grüner Punkt organisation refused to have its ‘green dot’ packaging-recycling symbol included in Unicode. In trying to protect its intellectual property, it missed an opportunity to make its sign accessible and known all over the world.)

Erik Spiekermann said to me that ‘the only thing better than the website is the poster’. Did it take long to make?

One day, our database designer, Wenzel Spingler, sent us a text file in which, with the help of a script, he had lined up one character after the next – 65,536 code positions! We felt it would be interesting, fun and useful to get a simple overview of the sheer number of characters, following the tradition of scientific tables. It took some weeks for Siri Poarangan and me to design it, six months for the tutors to implement the nineteen different fonts, several weeks to design missing characters and a month for Siri and me to proof it. It is just one text frame in InDesign. But if you change one letter, you have to wait several minutes for the computer to respond, like on my first Mac LC in 1990.

Has Decodeunicode had an impact on your practice as a designer and educator?

Looking beyond their own Latin alphabet can be useful for designers, typographers and students alike, as it leads to many questions about the theory of design. Unicode differentiates between character and glyph, i.e. between the Platonic idea of a character and the actual shape it takes. You can call it the difference between form and content, or the distinction that von Ehrenfels draws between the ideas of Gehalt and Gestalt.

Could you imagine ‘Unicode the movie’? Is there enough sex and violence to get funding?

Oh! How did you know? We are actually thinking about it . . . But don’t tell anyone!
