What is the purpose of Dansk Sprognævn?

Today's Danish newspapers carry several articles describing the relocation of Dansk Sprognævn (the Danish Language Council) from Copenhagen to Bogense, mainly from the employees' point of view (and they are, naturally, not happy).

The newspapers are rather uncritical of the council's description of itself. Berlingske, for example, has a longish interview with the council's director, Sabine Kirchmeier, in which she isn't asked many critical questions, not even when her statements are rather subjective, as in the following:

Dansk Sprognævn is not a closed environment where each of us sits typing behind our own computer. We are a very outward-facing institution that runs research projects and seminars and all sorts of other activities. Some of us are external examiners at the universities and teach at the universities, and we have PhD students who are supervised here, and students who write their master's theses here. In many ways we function like a university department, and if the people at Moderniseringsstyrelsen (the Agency for Modernisation), or wherever the decision was made, had read our annual report, they would have seen that we also receive visits from school classes and hold seminars on language with public institutions and educational institutions. And so on and so forth.

The activities she describes here do indeed sound very much as if they belong in a university department. This is borne out by her answer to the reasonable question of whether they couldn't simply hire some other skilled language people: “No, because there aren't many people with a PhD in our field. New employees first have to be trained for the job, and it takes three years to complete a relevant PhD.” I have worked in the dictionary business for over fifteen years, and this is the first time I have heard of a place where a PhD is required to work there (whereas it is entirely normal at universities).

And according to the law, DSN is not a university department at all:

§ 1. Dansk Sprognævn is a state institution whose task is to follow the development of the Danish language, to give advice and information about the Danish language, and to determine Danish orthography.

Subsection 2. The council shall

1. collect new words, phrases and word usages, including abbreviations,

2. answer linguistic questions from authorities and the public about the structure and use of the Danish language, including giving guidance on the spelling and pronunciation of foreign names,

3. publish writings on the Danish language, in particular guides to the use of the mother tongue, and cooperate with terminology bodies, dictionary editorial teams and public institutions that authorise or register place names, personal names and product names.

Subsection 3. Dansk Sprognævn shall work on a scientific basis. In its work the council shall take into account the function of language as a bearer of tradition and cultural continuity and as a mirror of contemporary culture and social conditions.

Subsection 4. In matters concerning the relationship to other languages, the council negotiates with corresponding bodies in the countries concerned. The council shall in particular cooperate with language councils and corresponding bodies in the Nordic countries.

[…]

§ 2. Dansk Sprognævn edits and publishes the official Danish orthographic dictionary. The orthography determined by the council is published in it.

Subsection 2. In connection with the publication of new editions of the orthographic dictionary, the council may on its own initiative make changes and updates of a non-fundamental nature.

[…]

§ 3. The council issues a report on its work every year. In the report or in some other way, the council publishes at least once a year a selection of the opinions it has given during the year.

It seems to me that the council has over-interpreted “to follow the development of the Danish language” “on a scientific basis”. The two phrases are not in the same sentence, but it seems to me that the council has taken them as carte blanche to create a university department that researches the development of the Danish language. The problem is that this is not necessarily what it is being paid for. This is supported by an email from the minister to Politiken (no free link):

Culture minister Mette Bock (LA) rejects the idea that the relocation will be a problem for the council.

“Dansk Sprognævn today carries out important tasks for the whole country, which it can also carry out from Bogense. The jobs will be a great gain for the town,” she writes in an email to Politiken. […] “The council today carries out two main tasks, which are about giving advice on language and following the development of the language. These are tasks that are carried out digitally and by telephone. They do not require a particular physical location in Copenhagen,” writes Mette Bock.

This is of course really rough on the affected employees, who thought they were going to work at a university department affiliated with the University of Copenhagen and not at a dictionary publisher in Bogense, but that isn't really the government's problem.

It wouldn't surprise me at all if they have been discreetly asked to scale back their scholarly activities (that could, for instance, explain their most recent move from the university to the old broadcasting house, Radiohuset), but have refused to listen. Their structure (with a director, a board and a board of representatives) could well make it hard for the sitting culture minister to push through a change by any means other than moving them.

If Dansk Sprognævn sees itself as a university department, it should probably call for the law to be rewritten to match reality, and after that it should probably be transferred to one of the universities. The alternative is presumably to move to Bogense and hire some new employees who are good at the tasks the law says the council must perform.

Emil fra Lønneberg and Father Christmas

Emil and Father Christmas.
Anna (who has just turned ten) reads aloud to me for a bit every evening to get better at Danish (and I read aloud to her, too). At the moment she is reading Emil fra Lønneberg, and it's going pretty well.

Sometimes it does go wrong, though, as for example when she cheerfully read out the following:

Emil peered up the chimney, and then he saw something funny. In the hole right above his head hung a red Father Christmas, looking down at him.

“Hello there,” said Emil. “Now you'll see someone who can climb!”

The original says “julimåne” (July moon) where she read “julemand” (Father Christmas), but that's not nearly as funny!

She read the following in the same way a couple of pages later, but that may well have been on purpose, because she did it with a mischievous smile:

But in the Katholt lake, among white water lilies, Emil and Alfred swam around in the cool water, and in the sky hung Father Christmas, red as a lantern, shining for them.

“You and me, Alfred,” said Emil.

“Yes, you and me, Emil,” said Alfred. “I should think so!”

Phyllis, incidentally, decided to test Léon on the first passage, and he made the very same mistake as Anna, so it must be an obvious mistake for Danish-Scots.

Scots on Smartphones

Writing Scots using SwiftKey’s predictive keyboard.
I’ve been annoyed for a long time that predictive keyboards wouldn’t recognise Scots at all: every time you used a perfectly normal word, it would get changed to a completely different English word that just happened to look similar.

So I was well chuffed when one of my clients, Scottish Language Dictionaries, got an email from Julien Baley of SwiftKey (a London-based company owned by Microsoft) a few months ago about adding Scots to their predictive keyboard for Android and iOS. I do all the data work for SLD, so of course I was chosen to work with Julien on this.

I extracted the relevant parts of the new edition of the Concise Scots Dictionary and sent them to Julien. In addition, I gave him an early version of a corpus (a collection of texts) of modern Scots. He separately contacted Andy Eagle and got the headwords from his Online Scots Dictionary.

Soon after this, Julien sent me the first version of the keyboard. At this point, it didn’t know the Scots inflections, and it was making some strange substitutions (e.g., aA oweraw), so I advised him on the grammar of Scots and on the substitutions. The final step was to look at words he had found in the corpus that weren’t in the dictionaries, and then the keyboard was ready.

You can download SwiftKey on your Android smartphone today, but if you have an iPhone, you’ll have to wait a few more days (technical issues delayed it).

SwiftKey will learn from the way people use it, so it’ll get better and better.

I hope this will see many more people writing Scots with confidence, and ultimately lead to better support for Scots in programs and on websites. Wouldn’t it be great if Scots was supported in your spellchecker, in Google Translate, and as a Facebook interface language?

PS: I was chuffed to see stories about this in The National, The Herald and Bella Caledonia.

Generation X are disappearing

Hugo Rifkind had an article in The Times today about being a Xennial (too old to be a Millennial, too young for Generation X), and I sent him the following tweet as a reply:

Hugo and 77 other people (so far) were kind enough to like it, so I thought I’d elaborate a bit on my theory.

A lot of the stuff about the Baby Boomers, Generation X and the Millennials can be traced back to Howe and Strauss’s Generations from 1991. This book examined earlier American generations and claimed to identify a four-generation cycle. They then defined the new generations that were emerging at the time and tried to predict their future very roughly. In particular, they expected a huge crisis once the Baby Boomers had started to retire (perhaps around 2020), which Generation X would sort out before handing over power to the Millennials.

This is clearly not what happened – the crises (9/11 + the financial crash) happened much sooner than they expected, while the Baby Boomers were still in office. They actually mentioned this possibility briefly on page 382:

What happens if the crisis comes early? What if the Millennium – the year 2000 or soon thereafter – provides Boomers with the occasion to impose their “millennial” visions on the nation and world? The generation cycle suggests that the risk of cataclysm would be very high.

Furthermore, in their historical analysis they clearly don’t assign a standard length to generations, so they would themselves have expected the generational boundaries in the 20th century to require some tweaking once the big defining events had taken place. It’s therefore completely in their spirit to revisit the definitions they suggested more than 25 years ago.

They actually don’t even stick to four generations per cycle all the time. What they call the Civil War Cycle contains only three. As they write on page 192:

[It is] America’s only three-part cycle – the one whose crisis came too soon, too hard, and with too much ghastly devastation. This cycle is no aberration. Rather, it demonstrates how events can turn out badly – and, from a generational perspective, what happens when they do.

I’m postulating that this has happened again. The crisis came so soon that at least half of Generation X hadn’t yet managed to get high enough up the housing ladder (or build up assets in other ways) to allow them to benefit from the asset boom that was a result of the financial crash. As a result we now have a huge split in most western societies: On the one hand, older people (Baby Boomers and older X’ers) often are asset-rich and have paid off most of their house, as well as having a good pension. Other members of this generation are less rich, but they might at least have a cheap council house that is affordable on their salary or their pension. On the other hand, younger people (Millennials and younger X’ers) don’t tend to have much wealth: They’re either renting in the private sector, or they’ve paid so much money for their house that a crazy amount of their salary is spent on the mortgage. They don’t have decent pensions, and they don’t really expect ever to be able to retire comfortably. They also typically grew up being told to expect a great and prosperous life, and they weren’t expecting things to turn out like this.

I was born in 1972, so right in the middle of Generation X, and I think we felt different from both the Baby Boomers and the Millennials before the financial crash. However, I now feel more and more similar to the Millennials, and further and further removed from the Boomers. So I think we might have to redefine the Baby Boomer generation as stretching all the way to the late 1960s, and the Millennials starting immediately afterwards. (I don’t believe it’s a clean break – whether somebody belongs in one generation or the other ultimately depends on whether they had enough assets when the economy collapsed.)

I think we can now also tell when the Millennial generation ended: The youngsters who don’t remember the time before the financial crash have a different mindset because they didn’t spend their childhood expecting a rich and easy life. They also happen to be the smartphone generation.

So to finish this blog post, let me redefine the generations as follows:

  • The Baby Boomers (too young to remember WWII, and old enough to have built up their wealth before the financial crash): Roughly 1940–1969.
  • The Car-Crash Generation (grew up expecting an easy life, but suddenly the rug got pulled away from under their feet): Roughly 1970–1999.
  • The Smartphone Generation (they don’t remember the easy years, and they live their lives through their smartphones): Roughly 2000–.
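
As a rough sketch, the proposed boundaries can be written as a small Python function. The function name and the "Pre-Boomer" label for anyone born before 1940 are assumptions made for illustration; the cut-off years are the approximate ones from the list above, and the real boundary is fuzzier than a single year.

```python
def generation(birth_year: int) -> str:
    """Map a birth year to the generation labels proposed in this post.

    The cut-off years are approximate; "Pre-Boomer" is a placeholder
    label (an assumption) for years the post does not cover.
    """
    if birth_year < 1940:
        return "Pre-Boomer"  # not defined in the post
    if birth_year <= 1969:
        return "Baby Boomer"
    if birth_year <= 1999:
        return "Car-Crash Generation"
    return "Smartphone Generation"


if __name__ == "__main__":
    for year in (1955, 1972, 1999, 2005):
        print(year, generation(year))
```

Someone born in 1972 lands in the Car-Crash Generation under this scheme, although in practice it depends more on assets at the time of the crash than on the exact year.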

AlphaDiplomacy Zero?

When I was still at university, I did several courses in AI, and in one of them we spent a lot of time looking at why Go was so hard to implement. I was therefore very impressed when DeepMind created AlphaGo two years ago and started beating professional players, because it was sooner than I had expected. And I am now overwhelmed by the version called AlphaGo Zero, which is so much better:

Previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.

It is able to do this by using a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher. The system starts off with a neural network that knows nothing about the game of Go. It then plays games against itself, by combining this neural network with a powerful search algorithm. As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.

I’m wondering whether the same methodology could be used to create a version of Diplomacy.

The game of Diplomacy was invented by Allan B. Calhamer in 1954. The seven players represent the great powers of pre-WWI Europe, but unlike many other board games, there are no dice – nothing is random. In effect it’s more like chess for seven players, except for the addition of diplomacy, i.e., negotiation. For instance, if I’m France and attack England on my own, it’s likely our units will simply bounce; to succeed, I need to convince Germany or Russia to join me, or I need to convince England I’m their friend and that it’ll be perfectly safe to move all their units to Russia or Germany without leaving any of them behind.

Implementing a computer version of Diplomacy without the negotiation aspect isn’t much use (or fun), and implementing human negotiation capabilities is a bit beyond the ability of current computational linguistics techniques.

However, why not simply let AlphaDiplomacy Zero develop its own language? It will probably look rather odd to a human observer, perhaps a bit like Facebook’s recent AI experiment:

Well, weirder than this, of course, because Facebook’s Alice and Bob started out with standard English. AlphaDiplomacy Zero might decide that “Jiorgiougj” means “Let’s gang up on Germany”, and that “Oihuergiub” means “I’ll let you have Belgium if I can have Norway.”

It would be fascinating to study this language afterwards. How many words would it have? How complex would the grammar be? Would it be fundamentally different from human languages? How would it evolve over time?

It would also be fascinating for students of politics and diplomacy to study AlphaDiplomacy Zero’s negotiation strategies (once linguists had translated its language). Would it come up with completely new approaches?

I really hope DeepMind will try this out one day soon. It would be truly fascinating, not just as a board game, but as a study in linguistic universals and politics.

It would tick so many of my boxes in one go (linguistics, AI, Diplomacy and politics). I can’t wait!

The future belongs to small and weird languages

Google Translate and other current machine translation programs are based on bilingual corpora, i.e., collections of translated texts. They translate a text by breaking it into bits, finding similarities in the corpus, selecting the corresponding bits in the other language and then stringing the translation snippets together again. It works surprisingly well, but it means that current machine translation can never get better than existing translations (errors in the corpus will get replicated), and also that it’s practically impossible to add a language that very few translations exist for (this is for instance a challenge for adding Scots, because very few people translate to or from this language).
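
To make the "break it into bits and string the snippets together" idea concrete, here is a deliberately tiny Python sketch. Everything in it is an assumption for illustration: the phrase table is hand-written toy data (a real system would extract and score phrase pairs from a large bilingual corpus), and greedy longest-match lookup stands in for the statistical scoring a real system would use.

```python
# Toy Danish-to-English phrase table (hypothetical data, written by hand).
PHRASE_TABLE = {
    ("god", "morgen"): ["good", "morning"],
    ("god",): ["good"],
    ("min",): ["my"],
    ("hund",): ["dog"],
}


def translate(sentence, table=PHRASE_TABLE, max_len=3):
    """Translate by greedily matching the longest known phrase at each position."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                output.extend(table[chunk])
                i += n
                break
        else:
            # Nothing in the table covers this word: pass it through untouched.
            output.append(words[i])
            i += 1
    return " ".join(output)


if __name__ == "__main__":
    print(translate("God morgen min hund"))
    print(translate("god kat"))
```

The second call shows the limitation described above: "kat" is not in the table, so it comes out untranslated, and any error in the table would be copied straight into the output.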

My prediction is that the next big breakthrough in computational linguistics will involve deducing meaning from monolingual corpora, i.e., figuring out the meaning of a word by analysing how it’s used. If somebody then manages to construct a computational representation of meaning (perhaps aided by brain research), it should theoretically be possible to translate from one language into another without ever having seen a translation before, by turning language into meaning and back into another language. I’ve no idea when this is going to happen, but I presume Google and other big software companies are throwing big money at this problem, so it might not be too far away. My gut feeling would be 10–20 years from now.
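
A very small illustration of "figuring out the meaning of a word by analysing how it’s used" is to count which words appear near each other and compare the resulting count vectors. The Python sketch below assumes a made-up four-sentence corpus and a fixed context window; it is nowhere near a full representation of meaning, only a hint of the idea.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical monolingual toy corpus.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "parliament passes the law",
]

if __name__ == "__main__":
    vectors = cooccurrence_vectors(CORPUS)
    print("cat ~ dog:", cosine(vectors["cat"], vectors["dog"]))
    print("cat ~ parliament:", cosine(vectors["cat"], vectors["parliament"]))
```

Even on this toy data, "cat" comes out closer to "dog" than to "parliament", purely from usage and without any translation or dictionary.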

Interestingly, once this form of machine translation has been invented, translating between two language varieties will be just as easy as translating between two separate languages. So you could translate a text in British English into American English, or formal language into informal, or Geordie into Scouse. You could even ask for Wuthering Heights as J.K. Rowling would have written it.

Also, the computer could be analysing your use of language and start mimicking it – using the same words and phrases with the same pronunciation. In effect, it could start sounding like you (or like your mum, Alex Salmond or Marilyn Monroe if you so desired).

This will have huge repercussions for dialects and small languages.

At the moment, we’re surrounded by big languages – they dominate written materials as well as TV and movies, and most computer interfaces work best in them. It’s also hard to speak a non-standard variety of a big language, because speech recognition and machine translation programs tend to fall over when the way you speak doesn’t conform. Scottish people are very aware of this, as shown by the famous elevator sketch:

However, if my predictions come true, all of that will change. As soon as a corpus exists (and that can include spoken language, not just written texts), the computer should be able to figure out how to speak and understand this variety. Because translation is always easier and more accurate between similar language varieties than between very different ones, people might prefer to get everything translated or dubbed into their own variety. So you will never need to hear RP or American English again if you don’t want to – you can get everything in your own variety of Scottish English instead. Or in broad Scots. Or in Gaelic.

Every village used to have its own speech variety (its patois to use the French term). The Reformation initiated a process of language standardisation, and this got a huge boost when all children started going to school to learn to read and write (not necessarily well, but always in the standard language). When radio was invented, the spoken language started converging, too, and television made this even more ubiquitous. We’re now in a situation where lots of traditional languages and dialects are threatened with extinction.

If computers start being good at picking up the local lingo, all of that will potentially change again. There will be no great incentive to learn a standard variety of a language if your computer can always bridge the gap if other people don’t understand it. The languages of the world might start diverging again. That will be interesting.

Fastern’s E’en and Fastelavn

Today it’s Fastelavn in Denmark. The word comes from Low German vastel-avent, meaning the evening (and by extension the day) before the fast (Lent), which means that it always takes place on the Sunday before Shrove Tuesday.
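
Since Shrove Tuesday falls 47 days before Easter Sunday, the Danish Fastelavn Sunday falls 49 days before it. The Python sketch below works the date out for a given year, using the well-known anonymous Gregorian (Meeus/Jones/Butcher) Easter calculation; the function names are assumptions for illustration.

```python
from datetime import date, timedelta


def easter_sunday(year: int) -> date:
    """Western (Gregorian) Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def shrove_tuesday(year: int) -> date:
    """Shrove Tuesday (Fastern's E'en), 47 days before Easter Sunday."""
    return easter_sunday(year) - timedelta(days=47)


def fastelavn(year: int) -> date:
    """Danish Fastelavn Sunday, two days before Shrove Tuesday."""
    return shrove_tuesday(year) - timedelta(days=2)


if __name__ == "__main__":
    for year in (2018, 2024):
        print(year, fastelavn(year), shrove_tuesday(year))
```

For 2018, for example, this gives Sunday 11 February for Fastelavn and Tuesday 13 February for Shrove Tuesday.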

When I was a kid, we dressed up and went guising, and we took turns beating up a barrel containing sweets with a bat. I’m not sure many kids go guising any more, but the barrel smashing (“at slå katten af tønden”) is still very popular.

Interestingly, in Scotland Shrove Tuesday used to be called Fastern’s E’en, which is clearly etymologically the same word as Fastelavn. It was marked in various ways, but eating pancakes doesn’t seem to have been one of them.

In the Borders it was traditional to have a Baw game:

Here is a list of the various traditions I’ve managed to find on Tobar an Dualchais:

  • On Shrove Tuesday they had a big bannock with a ring, a sixpence and/or a button in it.
  • Up till about the First World War, a ba [handball] game used to be played on Fastern’s E’en (Shrove Tuesday). People ate currant dumplings on that day.
  • On Shrove Tuesday children went round the houses for bannocks containing a ring or a button. The one who got a ring would be first married.
  • Fastern’s Een [Shrove Tuesday] was celebrated with the baking of special cakes.
  • In the Melrose marriage ball game, which replaced an earlier Fastern’s Een handball game, the bride kicks off a rugby ball in the square and the young men scramble for it. There has been an attempt to stop the tradition because of the danger from traffic. In the earlier game, there were no teams, just small groups trying to run away with the ball. It had to be hidden (not in a house) for three days in order to win the game.
  • On Shrove Tuesday Mrs Hailstones’ mother, who was English, used to make big pancakes and sugar them and roll them up. Scottish families did not do this, but Mrs Hailstones used to do it for her own children.
  • Shrove Tuesday was the day before Lent started. There used to be a big feast on the evening of Shrove Tuesday. It was believed that something bad would happen during Easter if Shrove Tuesday was not properly celebrated.
  • The contributor’s great-grandmother (who lived to be 120) was once without a chicken for Shrove Tuesday. A pigeon came in, and so it was killed instead.
  • On Shrove Tuesday children went round the houses for bannocks containing a ring or a button. The one who got a ring would be first married. A button meant an old maid or a bachelor. They had a half day off school on Shrove Tuesday.
  • On Shrove Tuesday a sheep was killed and there was a feast. A broth was made with barley that had been threshed with a ‘cnotag’, a stone with a hollow in the middle.
  • The contributor explains how a bannock was made with barley meal, butter, eggs and sugar, to celebrate Shrove Tuesday. She has never seen it made, but her father saw it in Barra.

It would be nice to see a revival of some of these traditions!