Wednesday, May 20, 2020

How does the evaluation of machine translation take place?

This post describes how we actually assess the performance of machine translation (MT). We do not develop MT; we are only users of it, so our evaluation method is very different from a developer's. Here I describe how a translation company can evaluate internally whether a particular MT engine is usable in practice.

For example, when a translation company evaluates MT, it rates the following items:

[MT evaluation items].

  1. omission/addition 
  2. incorrect translation 
  3. grammatical error 
  4. fluency
  5. inconsistency
  6. term appropriateness

Other items could be rated as well, but we do not go beyond these six, because doing so is too time-consuming and too complicated. In fact, I think it is enough to narrow the list down to the following three items:

[Simple MT evaluation items].

  1. omission/addition 
  2. incorrect translation 
  3. grammatical error

The reason I narrowed it down to these three items is that I think it is wrong to expect the following three qualities from MT in the first place:

  1. fluency
  2. consistency
  3. term appropriateness

Therefore, there is no problem in excluding these three items when making an evaluation.
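
As a rough illustration, the two item lists above could be recorded and tallied as follows. This is only a minimal sketch, not the procedure used in the evaluation described here: the function name `tally_errors`, the annotation format (one set of flagged items per segment), the sample data, and the idea of counting flagged segments per item are all assumptions made for the example.

```python
# Item names taken from the lists in this post.
FULL_ITEMS = [
    "omission/addition",
    "incorrect translation",
    "grammatical error",
    "fluency",
    "inconsistency",
    "term appropriateness",
]
# The simple evaluation keeps only the first three items.
SIMPLE_ITEMS = FULL_ITEMS[:3]


def tally_errors(annotations, items=SIMPLE_ITEMS):
    """Count how many segments were flagged for each item in `items`.

    `annotations` holds one entry per translated segment; each entry is the
    set of items a reviewer flagged for that segment (hypothetical format).
    Flags for items outside `items` are ignored.
    """
    counts = {item: 0 for item in items}
    for flagged in annotations:
        for item in flagged:
            if item in counts:
                counts[item] += 1
    return counts


# Made-up reviewer annotations for four MT segments (illustrative only).
sample = [
    {"omission/addition"},
    set(),
    {"incorrect translation", "fluency"},
    {"grammatical error", "incorrect translation"},
]

simple_result = tally_errors(sample)              # three-item evaluation
full_result = tally_errors(sample, FULL_ITEMS)    # six-item evaluation
print(simple_result)
```

With the three-item list, the fluency flag on the third segment is simply ignored, which mirrors the idea of excluding those items from the evaluation.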

* Note that today's article was translated using DeepL.
