This article describes how we specifically assess the performance of machine translation (MT). We do not develop MT; we are only users of it, so our evaluation method differs considerably from a developer's. I will explain how a translation company can evaluate internally whether a particular MT engine is usable in practice.
For example, when a translation company evaluates MT, it typically rates the following items:
[MT evaluation items].
You could rate additional items, but we do not go beyond these: doing so is too time-consuming and too complicated. In fact, I think it is enough to narrow the evaluation down to the following three items:
[Simple MT evaluation items].
The reason I narrowed the evaluation down to these three items is that, in my view, it is wrong to expect the following three things from MT in the first place. Therefore, there is no problem in excluding those three items from the evaluation.
* Note: this article was itself translated using DeepL.