Eval

Created 2023-07-19 | DL


BLEU

Bilingual Evaluation Understudy
MT quality metric
Precision Based
$Precision_n=\frac{\sum\limits_{p \in hyp}\sum\limits_{n-gram \in p}Count_{clip}(n-gram)}{\sum\limits_{p \in hyp} \sum\limits_{n-gram \in p}Count(n-gram)}$
$Count_{clip}(n-gram) = \min(\text{n-gram count in the hypothesis}, \max_{r \in Ref} (\text{n-gram count in r}))$
Weighted geometric mean (a weighted average in log space): n-gram precision decays roughly exponentially as n grows, so the precisions are averaged in the log domain

$\sqrt[\sum_{n=1}^N w_n]{\prod_{n=1}^N p_n^{w_n}}=\exp \left(\frac{1}{\sum_{n=1}^N w_n} \sum_{n=1}^N w_n \ln p_n\right)=\exp \left(\frac{1}{N} \sum_{n=1}^N \ln p_n\right)$

The last equality assumes uniform weights.
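The clipped precision and its geometric mean can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it assumes pre-tokenized input (lists of tokens) and uniform weights, and the names `ngrams`, `modified_precision`, and `geometric_mean` are made up for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hypotheses, references_list, n):
    """Corpus-level clipped n-gram precision.

    hypotheses:      list of token lists, one per sentence
    references_list: for each hypothesis, a list of reference token lists
    """
    clipped_total = 0
    total = 0
    for hyp, refs in zip(hypotheses, references_list):
        hyp_counts = Counter(ngrams(hyp, n))
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, cnt in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], cnt)
        # Clip each hypothesis count by the reference maximum.
        clipped_total += sum(min(cnt, max_ref[gram])
                             for gram, cnt in hyp_counts.items())
        total += sum(hyp_counts.values())
    return clipped_total / total if total else 0.0


def geometric_mean(precisions):
    """Uniformly weighted geometric mean, computed in log space."""
    if any(p <= 0 for p in precisions):
        return 0.0  # assumption: no smoothing, so any zero precision gives 0
    return math.exp(sum(math.log(p) for p in precisions) / len(precisions))
```

With the hypothesis "the the the the the the the" and the references "the cat is on the mat" and "there is a cat on the mat", the clipped unigram count is 2 (the largest number of "the" in any single reference), so the unigram precision is 2/7 rather than 7/7.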

Details

Brevity Penalty
$BP = \begin{cases} 1 & \text{if } c > r \\ e^{(1 - \frac{r}{c})} & \text{if } c \leq r \end{cases}$
$BLEU = BP \cdot \exp \left(\sum\limits_{n=1}^N w_n \log p_n\right)$
$\log BLEU = min(1-\frac{r}{c}, 0) + \sum\limits_{n=1}^N w_n \log p_n$
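As a hedged sketch of how the brevity penalty combines with the n-gram precisions, the snippet below evaluates the log-domain formula directly. It assumes the precisions $p_n$ have already been computed, that $c$ and $r$ are the total hypothesis and reference lengths, and that the weights default to $w_n = 1/N$; the function names are illustrative only.

```python
import math


def brevity_penalty(c, r):
    """BP = 1 if c > r, else exp(1 - r / c)."""
    if c == 0:
        return 0.0  # assumption: an empty hypothesis scores zero
    return 1.0 if c > r else math.exp(1.0 - r / c)


def bleu_from_precisions(precisions, c, r, weights=None):
    """BLEU from precomputed n-gram precisions p_1..p_N.

    c: total hypothesis length, r: total (effective) reference length.
    """
    n = len(precisions)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights summing to 1
    if c == 0 or any(p <= 0 for p in precisions):
        return 0.0  # log p_n is undefined at 0; no smoothing in this sketch
    # log BLEU = min(1 - r/c, 0) + sum_n w_n * log p_n
    log_bleu = min(1.0 - r / c, 0.0) + sum(
        w * math.log(p) for w, p in zip(weights, precisions)
    )
    return math.exp(log_bleu)


if __name__ == "__main__":
    # Hypothesis shorter than the reference, so the penalty applies.
    print(bleu_from_precisions([0.8, 0.6, 0.4, 0.3], c=18, r=20))
```

Returning 0 when any precision is 0 is one common convention; practical toolkits often apply smoothing instead, which this sketch leaves out.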

ROUGE

Recall-Oriented Understudy for Gisting Evaluation

ROUGE-N

An n-gram recall between a candidate summary and a set of reference summaries

$Recall=\frac{TP}{TP+FN}$
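A minimal Python sketch of ROUGE-N recall, assuming pre-tokenized input. Here $TP$ is read as the number of reference n-grams that also occur in the candidate (with counts clipped by the candidate count) and $TP+FN$ as the total number of reference n-grams. The names `ngram_counts` and `rouge_n_recall` are illustrative, not taken from any library.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count all contiguous n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n):
    """N-gram recall of a candidate summary against a set of references."""
    cand = ngram_counts(candidate, n)
    matched = 0  # TP: reference n-grams found in the candidate
    total = 0    # TP + FN: all reference n-grams
    for ref in references:
        ref_counts = ngram_counts(ref, n)
        matched += sum(min(cnt, cand[gram]) for gram, cnt in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


if __name__ == "__main__":
    candidate = "the cat sat on the mat".split()
    references = ["the cat is on the mat".split()]
    print(rouge_n_recall(candidate, references, 1))  # 5 of 6 reference unigrams
    print(rouge_n_recall(candidate, references, 2))  # 3 of 5 reference bigrams
```

In contrast to BLEU, the denominator counts n-grams on the reference side, which is what makes the metric recall-oriented.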

Overview

Author: fl_334

Link: https://www.fl334.com/2023/07/19/Eval/

Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.
