Human Quality Evaluation
How do you know if your MT (Machine Translation) engine is giving you value for money? You test it. Asian Absolute’s machine translation services are used for productivity measurement by companies in the UK and globally. It’s the only way to make sure your MT engine is actually working the way it’s supposed to, based on your objectives.
Do you need your engine to produce output which is presentable without post-editing? Are you looking for output which you can bring up to publication standard with as little effort as possible?
You need to choose a quality metric which is going to give you meaningful results. So it makes sense that our first step is to understand your goals. The next is to create bespoke metrics.
Why choose Asian Absolute?
You’ll be working with MT specialists. Highly skilled linguists who know how to interpret automated scores. And who use their own in-depth knowledge and expertise to evaluate post-editing speed and effort.
- Get the help you need to select the best automated metrics – and interpret the results.
- Expert human evaluation gets you a more accurate picture of your MT engine’s performance.
- Rely on native-language translators in more than 120 languages.
- Your evaluator’s experience and qualifications in your field ensure that the appropriateness of terminology in your translations is judged correctly.
- Count on award-winning project management – always meeting the ISO 9001 Quality Management standard.
Asian Absolute helped FTChinese.com in the challenging task of building a world-class translation service. They provide top quality, personal service.
I was extremely impressed by Asian Absolute’s hard work to complete the project to our high standards and within a very tight timeframe.
Many thanks for your help and also for providing an interpreter for the week, she was absolutely fantastic and a real life-saver!
Guinness World Records
Can’t I just use automated metrics? How do I interpret BLEU scores?
There are many metrics used to judge the quality of MT output. Some rely on expert human evaluation. Others are automated metrics which use algorithms to make a judgement of quality.
BLEU (BiLingual Evaluation Understudy) is still regarded by many as one of the best automated metrics. Best, in this context, means its scores have a greater chance of correlating with what a human evaluator might say – though this is by no means guaranteed. BLEU is especially challenged by languages which lack word boundaries, such as Chinese and Japanese.
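To make the idea concrete, here is a minimal sketch of how a BLEU-style score works: clipped n-gram precision combined with a brevity penalty. This is a simplified illustration, not a production implementation – real tools such as sacreBLEU add smoothing and corpus-level counting. All sentences below are invented examples.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Toy sentence-level BLEU: geometric mean of clipped n-gram
    precisions (1-grams up to max_n-grams) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count to how often it appears in the reference
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing in this toy version
    # Brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 1.0, while a perfectly accurate paraphrase – "a feline rested upon the rug" against the reference "the cat sat on the mat" – scores 0.0, because it shares almost no n-grams with the reference. That is the metric's central weakness in one example.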
The word-boundary issue is one specific drawback. But there are more general problems with using automated metrics to measure Machine Translation output:
- You work towards a quality “score” rather than actual quality: by using an automated quality score you put a quantitative number on a qualitative issue.
- Multiple accurate translations: one thought, phrase or sentence might be translatable in a number of ways. Sometimes those ways are completely different to each other. Sometimes they’re still all correct. A human translator would know this. A machine doesn’t.
Asian Absolute can help you choose the most suitable automated metric. NIST, WER, METEOR, LEPOR and several other algorithms have shown promise when it comes to achieving some relationship between their scores and what a human might say.
But human evaluation is still the best judge of the quality of Machine Translation. And it’s likely to remain that way for the foreseeable future…
Measure your machine translation post-editing speed and effort
The whole point of MT is to save you time and money. This means the best metrics for figuring out whether yours is doing so are often post-editing speed and post-editing effort.
Measuring these requires specially trained linguists – experts who understand how to edit MT output efficiently in order to make it fit for purpose. An inexperienced editor, for example, may over-edit, using more time than necessary and bringing the translation to a higher standard than is actually required. Equally, they might under-edit, producing a translation which doesn’t meet the project’s goals.
1. Post-editing speed
Post-editing speed can be easily measured. How long does it take the expert linguists involved to edit your engine’s output?
Obviously, labour time saved will lead to a direct saving in total per-word cost from your Machine Translation solution.
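The arithmetic behind this is straightforward. A small sketch, using entirely hypothetical figures (word count, editing hours, and hourly rate are illustrative, not quoted prices):

```python
def post_editing_stats(word_count, editing_hours, hourly_rate):
    """Throughput and effective per-word cost for a post-editing job.

    word_count    -- words of MT output edited
    editing_hours -- total linguist time spent
    hourly_rate   -- linguist's hourly rate (any currency)
    """
    words_per_hour = word_count / editing_hours
    cost_per_word = (editing_hours * hourly_rate) / word_count
    return words_per_hour, cost_per_word

# Hypothetical job: 6,000 words edited in 8 hours at 30/hour
speed, per_word = post_editing_stats(6000, 8, 30)
```

In this invented example the linguist processes 750 words per hour at an effective 0.04 per word – the faster your engine’s output can be edited, the lower that per-word figure falls.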
2. Post-editing effort
Post-editing effort measures the quality of your MT engine’s output by comparing it with the final post-edited version. The number of changes needed to get from one to the other is sometimes known as the edit distance.
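One common way to quantify this is a word-level Levenshtein distance: the minimum number of word insertions, deletions, and substitutions needed to turn the raw MT output into the post-edited text. A minimal sketch (the sentences are invented examples):

```python
def edit_distance(mt_output, post_edited):
    """Word-level Levenshtein distance between raw MT output and its
    post-edited version, computed with a standard dynamic-programming
    table kept one row at a time."""
    a, b = mt_output.split(), post_edited.split()
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, wa in enumerate(a, 1):
        curr = [i]
        for j, wb in enumerate(b, 1):
            cost = 0 if wa == wb else 1  # substitution cost
            curr.append(min(prev[j] + 1,        # delete word from MT output
                            curr[j - 1] + 1,    # insert word from post-edit
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]
```

The fewer edits your linguists have to make, the better the engine is performing: an output needing zero edits was fit for purpose as delivered.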
Get free quotes on MT engine training productivity measurement in the UK
Contact us 24/7 to find out more about measuring the performance of your Machine Translation, to learn about MT engine training services in general, or to get a free, no-obligation quote on your project today.