Translation Revisited — some real surprises

More than a year ago I posted a lengthy article on machine translation, reporting the state of things back then, and testing six leading MT products ("MT playoffs, March 2009"). The results had Google in first place, with Language Weaver next, and Microsoft third. I closed the article with this: "Bear in mind this was a single test of 500 words, and just English-Spanish. But things are changing so fast a person really should do their own tests anyway once a year…"

Testing is hard work, harder than you would think. You really should do it, though, if you routinely use these tools. Get a chunk of sample text, run it through the various MT options out there, and compare the output. Seems quick and easy. The hard part, though, is that the output (in my case) is Spanish, and while I read it OK, judging its quality is slow going for me. Added to that, first impressions of MT quality can be deceiving. A butchered first sentence makes a whole paragraph look like crap, and it’s only after digging into it that you might notice what’s good about it.

Well, I did the work, as well as some other research having to do with translation in general, and I’m feeling really good now. All caught up on things. In a nutshell, here’s what I found:

I tested seven MT providers, including all the current leaders. Overall I was very impressed by the quality I was seeing; links to their translation pages are provided at the end of this article, if you want to see what I mean. The biggest surprise by far was that, over the past 16 months, Microsoft must have been grinding away pretty hard in the back room, because they’ve quietly closed the gap with Google. In my opinion they can claim the number one slot now, at least for Spanish. I totally wasn’t expecting that. In fact, I had to stare at it from several directions before I could actually believe it. I used the exact same 500-word passage as last time, so that I could compare the actual output now with the actual output from March 2009. I made a subjective judgment of readability: how pleasant or painful it was to read the translation, and also how close it came to the real meaning of what the author (me) was trying to say. On this measure they stacked up as: 1) Microsoft, 2) Google, 3) SDL & Language Weaver in a tie, and 4) World Lingo & Yahoo in a tie.

For me personally, though, readability isn’t the whole story. In my work, typically writing an e-mail in Spanish, I "mine" two or three different MT translations for individual phrases or words which appeal to me, and clip the ones I want to borrow. So to test this factor, I rewrote about 200 words of the test passage in Spanish, robbing the best phrases, and keeping track of where they came from on a piece of paper. This is when I first became aware that Microsoft was doing something different. I got 9 truly usable phrases from Microsoft, 4 each from Language Weaver & World Lingo, 3 each from Google & Yahoo, and 2 from SDL. Curious, I pulled up the output I had saved from 16 months ago. Side by side, old compared with new, it was quite evident that Microsoft had made some sort of a leap forward. Google had improved, too, but only incrementally. (Language Weaver, oddly, had identical output word for word, no change at all. This could have two explanations. Either they’ve been putting all of their effort into other languages, or Babylon, the partner site I use to reach their engine, isn’t getting their updates.)
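
For anyone who prefers a script to a piece of paper, the phrase tally can be sketched in a few lines of Python. The counts are the ones reported above; the dictionary name and layout are only illustrative assumptions, not part of the original test.

```python
# Usable phrases borrowed from each engine, as counted in the test above.
# The variable names here are hypothetical, chosen for illustration.
usable_phrases = {
    "Microsoft": 9,
    "Language Weaver": 4,
    "World Lingo": 4,
    "Google": 3,
    "Yahoo": 3,
    "SDL": 2,
}

total = sum(usable_phrases.values())

# Sort by count, highest first; engines with equal counts keep their listed order.
ranking = sorted(usable_phrases.items(), key=lambda item: item[1], reverse=True)

for engine, count in ranking:
    share = 100 * count / total
    print(f"{engine:16s} {count:2d} phrases ({share:.0f}% of {total})")
```

Seen this way, Microsoft supplied 9 of the 25 borrowed phrases, more than double any other single engine.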

Not to beat the thing to death, but I did one more test. I started instead with a Spanish test passage from the Wall Street Journal, and translated the other direction. That went quick, because it’s easier for me to see quality differences in English output, of course. On this trial I judged the three statistically-based translations (Google, Microsoft, and Language Weaver) to be indistinguishable in quality. Google maybe had a tiny lead. SDL (which may mix statistical with rule-based, I’m not sure) was next, with Yahoo & World Lingo pretty bad (they both use the old Systran engine) and Reverso definitely last (but hey, they do have a sweet verb conjugator).

If you’ve really read this far, I know you’re feeling cross-eyed. Let’s take a rest, back off a bit, talk about SDL, Babylon, and Language Weaver. If you’ve read some of my prior posts you’ll know I have a crush on Language Weaver. They were early leaders in the statistical approach and, against all odds for a 90-person company, they’ve been giving Microsoft and Google a solid run. They only do one thing — statistically-based machine translation — and they only do it for giant corporations and governments. Babylon partnered with them, and has access to their engine, but it’s only licensed for a couple of sentences at a time. It’s practically unusable, but it has given me a way to test Language Weaver against the others. Now here’s the interesting part (you’ve been very patient). On July 15 — only a week ago — SDL bought Language Weaver for $42 million. SDL is British, and big. It’s best known as an important provider of software and services to professional translators and localization shops. Their product line includes Trados, which is the industry-standard translation memory software used by professionals. Language Weaver, they say, will mostly keep its independence, and remain the same happy bunch of people, happily chunking away on MT problems in Southern California.

I’ve read that Microsoft, while mostly statistically based, is mixing in some rule-based algorithms. Why is this interesting? It seems to be working for them. For example, my translations from Google, while generally pretty sweet, occasionally have grammatical bloopers that must be plain embarrassing, assuming computers get embarrassed. Microsoft seems less prone to these. So as time goes along it will be very interesting to watch SDL, already fairly decent at MT themselves, and now the proud owner of Language Weaver’s technology to boot. They could potentially combine their rules with Language Weaver’s statistics. With this purchase they’ve likely made themselves one of the Big Players.

Just now while writing this I was reminded of another company, AppTek, which frequently shows up in news stories because they never took sides but instead believed that MT could benefit by a mixture of the statistical and rule-based approaches. I hopped over to their site. This is from their FAQ: "What is HMT? Hybrid Machine Translation leverages both RMT and SMT fused together. HMT has higher fluency as well as informativeness." I was surprised to see they actually give free samples from their homepage; so I quickly ran my test passage one more time, and the output was actually pretty good, judging quickly — perhaps in the middle of the pack. I included their free-sample link below with the others, and will bookmark it for myself. AppTek has actually been around almost 20 years. It’s in Virginia. This is just a good reminder that there are lots of players in this fast-moving field, any one of which could emerge by surprise, and suddenly be important.

So taking a deep breath, what’s my recommendation now? For friends who just want to read websites or e-mails in another language, I’d still recommend Google first, because it’s good and it’s so fun to use; but don’t forget to give Microsoft a try. For myself, for my own website (coming your way, like a snail) I’ll be hiring a real translator living in Córdoba. I carefully timed myself while piecing together translations out of MT parts. It turns out that, charging 5 cents a word, I could make $6 an hour, but that’s fictional because it’s crappy quality, if I do say so myself. If my initial website has 5000 words, I can probably buy a really attractive, Argentine-flavored professional translation for $500, and save myself 50 mind-numbing hours of meticulously wrecking my own website by myself. But for e-mails or informal postings, I hate the idea of needing somebody’s help, and I don’t mind looking like my authentic self. So for that I have a system all worked out now that I like a lot. It uses MT and dictionaries together for support while I write. I’ll be using Microsoft and Google together on a regular basis, and every time throwing in one more, rule-based alternative: most likely SDL, sometimes World Lingo, maybe AppTek. The rule-based translations, while not as pretty, are just different, so they often give me interesting alternative phrasings or words. And I’ll certainly want to be there to notice the quiet rollout of Language Weaver under the SDL brand.
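
The do-it-yourself arithmetic above can be checked with a rough sketch. It uses only the figures quoted in that paragraph, kept in whole cents to avoid rounding trouble; the variable names are illustrative assumptions, and the 50 hours in the text is presumably a round-number estimate, not an exact result.

```python
# Figures quoted in the paragraph above, expressed in whole cents.
# All names are hypothetical, chosen for illustration.
diy_rate_cents_per_word = 5        # the 5 cents a word
diy_earnings_cents_per_hour = 600  # the $6 an hour
site_words = 5000                  # size of the initial website
pro_quote_cents = 50_000           # the $500 professional translation

# Implied speed: cents earned per hour divided by cents charged per word.
words_per_hour = diy_earnings_cents_per_hour / diy_rate_cents_per_word

# Hours needed to piece together the whole site at that speed.
hours_needed = site_words / words_per_hour

# Per-word price implied by the professional quote.
pro_rate_cents_per_word = pro_quote_cents / site_words

print(f"{words_per_hour:.0f} words per hour")
print(f"{hours_needed:.1f} hours for {site_words} words")
print(f"{pro_rate_cents_per_word:.0f} cents per word for the professional")
```

On these numbers the MT patchwork runs at about 120 words an hour, so 5000 words would take roughly 42 hours, in the same ballpark as the 50 quoted, and the $500 quote works out to 10 cents a word.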

For the final polish on anything I write I totally LOVE Microsoft Word. It really does have wonderful Spanish proofing tools. Spellchecking in Spanish, grammar checking, and especially the out-of-the-box right-click thesaurus are all first class. And Windows 7 is sweet, too, the way it transparently manages foreign keyboards and language switching (ask me about the very cool United States International keyboard). I’m not a Microsoft nut; I’m just sounding like it, today. They really are doing a spectacular job in this area, and unless you happen to be mucking around in it, it’s not something anyone would be expected to notice.

So that’s what I’m doing for translation now. You can count on me looking around again though, in a year or so. Here are those links:

Pedro translated to Pete

(Sorry, that’s how I signed the 2009 article. I just had to do it again :-)

Oh, one last PS — part of this project involved quickly scanning my old notes. I ran across this gem: "Google and Babel Fish are both based on Systran. They give the exact same results. My guess is that this is the full Systran product, the one I would buy, except metered down to about 150 word blocks. It works terrible on anything complicated…" That’s talking about a world where statistical translation didn’t exist outside the laboratory, Google maxed at 150 words, and the price of entry to MT was $299 for an unpalatable dose of Systran-in-a-box. Which I gladly paid, and cheerfully tolerated. Guess what year it was? 2006.
