Thursday 28 June 2007

Auxiliary Languages

Auxiliary languages, the invented "universal languages", have been a long-standing fascination of mine, one that was re-awoken during an earlier tidying up of my web presence, some years back.

Most of the technical terms will be explained by a quick visit to Google.

Esperanto

Unfortunately, this is the most popular invented language, the one that most people will have heard of, and the only one with easily available treeware documentation, but it has problems. The first is the one that really stopped it for me, back in the '70s and '80s, when all I had was a manual typewriter, with no accented characters…

The very alphabet

The Esperanto alphabet is:

A E I O U; B C Ĉ D F G Ĝ H Ĥ J Ĵ K L M N P R S Ŝ T Ŭ V Z

Maybe you didn't see all those letters - C^, G^, H^, J^, S^ and U-breve - as they don't live in ASCII, or even in the Latin-1 range of common characters. In the full Unicode standard, they are in the next block, the Latin Extended-A set, and according to the character annotations, while U-breve is shared with Latin, the rest are Esperanto-only.
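The block membership is easy to check for yourself. This is only a minimal sketch: the names ESPERANTO_EXTRAS, LATIN_EXTENDED_A and describe are mine, made up for illustration, and the letters are written as escapes so the snippet itself survives an ASCII-only editor.

```python
import unicodedata

# The six accented capitals, written as escapes: C^, G^, H^, J^, S^ and U-breve.
ESPERANTO_EXTRAS = "\u0108\u011c\u0124\u0134\u015c\u016c"

# The Latin Extended-A block runs from U+0100 to U+017F.
LATIN_EXTENDED_A = range(0x0100, 0x0180)


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in ESPERANTO_EXTRAS:
    # Second column is True when the letter sits in Latin Extended-A.
    print(describe(ch), ord(ch) in LATIN_EXTENDED_A)
```

None of the six has a code point below U+0100, which is why neither ASCII nor Latin-1 can hold them.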

For a language that was happy to throw away X in favour of 'ks', the addition of these accented characters seems strange. If you have only ASCII, I'd prefer the following reform.

The letters involved are ones with multiple sounds - C is a 'ts' and C^ a 'ch'; G as in 'gay' and G^ as in 'gem', H^ as the 'ch' in 'loch', J is as the 'y' in 'young' (or the Germanic 'J' in 'Jung'), but J^ is a 'zh', S as in 'gas', but S^ as 'sh' in 'ship', while U-breve is as W.

Why not one of the following?

C->S (merge the 's' and 'ts'), C^->C, G->G, G^->J, J->Y, H^->Q, S^->X, U-breve->W, J^->* if you're stuck with ASCII, or

C->ç (soft C as in French garçon), C^->C, G->G, G^->J, J->Y, H^->Q, S^->X, U-breve->W, J^->Ð using Latin-1.
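Either respelling is mechanical enough to sketch in a few lines. What follows is an illustration only, not a tool anyone actually uses: ASCII_REFORM, LATIN1_REFORM, make_table and respell are names I've made up, and I'm assuming the same mapping applies to capitals. The one subtlety is that the swaps must happen simultaneously (C^ becomes C, but the old C becomes S), which a per-character translation table gives for free.

```python
import unicodedata

# ASCII variant. Keys are lowercase Esperanto letters (accented ones as escapes).
ASCII_REFORM = {
    "c": "s",        # C  -> S
    "\u0109": "c",   # C^ -> C
    "\u011d": "j",   # G^ -> J
    "j": "y",        # J  -> Y
    "\u0125": "q",   # H^ -> Q
    "\u015d": "x",   # S^ -> X
    "\u016d": "w",   # U-breve -> W
    "\u0135": "*",   # J^ -> *
}

# Latin-1 variant: same as above except C -> c-cedilla and J^ -> eth.
LATIN1_REFORM = {**ASCII_REFORM, "c": "\u00e7", "\u0135": "\u00f0"}


def make_table(mapping):
    """Build a str.translate table covering lower and upper case."""
    table = {}
    for src, dst in mapping.items():
        table[ord(src)] = dst
        table[ord(src.upper())] = dst.upper()
    return table


def respell(text, mapping=ASCII_REFORM):
    """Respell Esperanto text; every letter is replaced in a single pass."""
    # Normalise first so that 'c' + combining circumflex becomes one character.
    text = unicodedata.normalize("NFC", text)
    return text.translate(make_table(mapping))


print(respell("kaj"), respell("a\u016d"))
```

Because the table is applied one character at a time, a C^ that has become C is never rewritten again as S.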

The assumptions

The Sapir-Whorf hypothesis is that language constrains the thoughts we can have, as without words, we cannot articulate a thought. Esperanto shows the reverse is true. The language was first published in 1887, and shows some of the values of the time - the word for 'father' is the obvious patro, but the word for 'mother' is patrino, using a feminine modifier. You can't even say "single parent" in the language, as the word for 'parent' only formally exists in the plural, gepatroj ('ge-' meaning "both sexes together"), from which one has to hack off the pluralizing -j to coin a clumsy singular.

Similar value-laden decisions in the vocabulary have us use maldekstra (literally "opposite of right", from dekstra), rather than something based off gauche or sinister, for left. But south is sudo, rather than "malnordo"!

The long words

In natural languages, common words are short words. Esperanto's systematization leads to long words for simple common concepts. For example rust (iron oxide) is rusto; iron rusts (fero rustig^as), and becomes rusted (rustig^inta). In this case, there is the shorter rusta (rusty), but that offers a slightly different shade of meaning in English.

Admittedly, English does have an unusually broad vocabulary with different shades of meaning, having assembled fragments of a number of other languages to a degree that few languages have, so this objection may be an English speaker's idiosyncrasy.

Interlingue

I discovered this one when searching for the ISO-639 language codes for Latin (la) and Classical Greek (there isn't a 2-letter one, but there is an ISO-639-2 standard 3-letter code, grc, for Greek from before the fall of Byzantium). But there are first-class, 2-letter codes for Esperanto (eo), Volapük (vo), Interlingua (ia), and Interlingue (ie). I'd been aware of the first three, but it took Google to unearth the last.

It's a real minority choice among auxiliary languages, being kept alive by the ability of the 'net to bring together scattered individuals of like mind. It starts with the same sort of brief - gather together common European Romance word roots and a systematic grammar, but is much more accessible.

It uses just the 26 ASCII letters. There are some apparently complex rules for pronunciation, such as for whether 'C' is soft or hard, or whether 'T' is sibilant (like in caution) - but they are just the ones we use every day in English. It is quite happy to use doubled letters, usually with H - 'ch' as in 'church', silent h after g (ghetto) or k (khedive) or r (rheumatisme), 'sh' (or occasionally 'sch') as in 'ship', 'th', 'ph' for Greek-derived words that had θ, φ - but 'ss' for a hard S, 'zz' for a 'ts' as in plazza.

I like its use of shorter words - for example, its use of simple and more obviously European e, o for and, or, as opposed to Esperanto's kaj, aŭ (that's u-breve); and it uses forms that are much more natural for an English speaker.

This remains my current favourite.

Esperanto to Interlingue dictionary

Interlingue to Esperanto dictionary

Ceqli

This is the c.2003 state of the language. More at its now somewhat spammed Yahoo! group.

Pronounced Ching-lee, this is a Loglan (Lojban) derived language with a Mandarin-influenced grammar and a global reach in populating its vocabulary, its fundamental words being constrained to alternate vowels and consonants. Being a language of the 1990s, it has a neutral form, pam, for parent, with father and mother being pamzo and pamxi, formed by adding systematic elements meaning man and woman.

It does use some "funny pronunciations" - C is 'ch', Q is 'ng', X is 'sh' (that being only half-funny, being akin to its familiar value in Portuguese, and in Xeres, the old spelling of the Spanish place name Jerez, that gives us the English 'sherry'), so where I've used it as a quasi-gibberish language in my SF, elsewhere on this site, I've written it in a phonetic style.

1 comment:

Baloo said...

Hi. I'm the Ceqli guy. It might interest you that, now that I'm retired, sort of, there's more Ceqli activity. I have a PBWiki at
http://ceqli.pbwiki.com/
I'm retelling the Ranma story at
http://ceqli.pbwiki.com/STORI+HU+SRANMAZO
and I have a Ceqli cartoon blog at
http://ceqlizobloq.wordpress.com/