[WIP v2 - deprecated] Unlikelihood token loss #2011

funboarder13920 · 2021-02-19T14:52:12Z

Implement the unlikelihood loss from "Neural Text Generation with Unlikelihood Training"
Add tests for the unlikelihood loss
Refactor the LossComputeBase class
Fix ppl in the statistics: it used to be derived from the training loss, which can differ from NLLLoss
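
For readers unfamiliar with the paper, here is a minimal, framework-free sketch of the token-level unlikelihood objective and of why ppl should be derived from the NLL alone. It is not the code in this PR: the function name, the `pad_idx` default and the toy probabilities are assumptions made for illustration, and the actual implementation works on batched tensors.

```python
import math


def token_unlikelihood_loss(step_probs, target, unlikelihood_coeff, pad_idx=0):
    """Return (total_loss, nll, n_tokens) for a single sequence.

    step_probs[t][v] is the model probability of token v at step t.
    """
    nll = 0.0
    unlikelihood = 0.0
    n_tokens = 0
    for t, gold in enumerate(target):
        if gold == pad_idx:
            continue
        n_tokens += 1
        # Usual likelihood term.
        nll -= math.log(step_probs[t][gold])
        # Negative candidates: tokens already seen in the target,
        # excluding the gold token of this step and padding.
        candidates = set(target[:t]) - {gold, pad_idx}
        for cand in candidates:
            # Penalise probability mass put on previously seen tokens.
            unlikelihood -= math.log(max(1.0 - step_probs[t][cand], 1e-12))
    total = nll + unlikelihood_coeff * unlikelihood
    return total, nll, n_tokens


# Toy example: vocabulary of 3 tokens (0 is padding), target "1 2".
probs = [[0.1, 0.7, 0.2], [0.1, 0.3, 0.6]]
total, nll, n_tokens = token_unlikelihood_loss(probs, [1, 2], 0.5)

# Perplexity comes from the NLL only, not from the total training loss.
ppl = math.exp(nll / n_tokens)
```

Because `total` includes the extra penalty, exponentiating it would overstate the perplexity, which is the motivation for tracking the NLL separately in the statistics.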

francoishernandez

A few comments.

francoishernandez · 2021-02-19T15:23:22Z

.github/workflows/push.yml

+            -heads 4 -transformer_ff 64 \
+            -word_vec_size 16 -report_every 5 \
+            -rnn_size 16 -train_steps 10
+    - name: Test LM training with unlieklihood loss


Typo 'unlikelihood'

francoishernandez · 2021-02-19T15:28:10Z

onmt/utils/loss.py

+            opt.label_smoothing, len(tgt_field.vocab),
+            ignore_index=padding_idx
+        )
+    elif opt.unlikelihood_coeff > 0:


Shall we assert in the opts validation that unlikelihood_coeff isn't compatible with label_smoothing?

I can make them mutually exclusive in the parser, or I can keep them compatible, since unlikelihood_coeff can be added to any loss (although label smoothing and unlikelihood are contradictory).

Yes, let's go for the mutually exclusive way.
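
A minimal sketch of what such a check could look like, assuming the options arrive as an argparse-style namespace. The function name `validate_loss_opts` is hypothetical; only the option names `label_smoothing` and `unlikelihood_coeff` come from the diff.

```python
from argparse import Namespace


def validate_loss_opts(opt):
    """Reject configurations that enable both loss variants at once."""
    if opt.label_smoothing > 0 and opt.unlikelihood_coeff > 0:
        raise ValueError(
            "label_smoothing and unlikelihood_coeff are mutually exclusive; "
            "set at most one of them to a positive value."
        )


# Passes silently: only one of the two options is active.
validate_loss_opts(Namespace(label_smoothing=0.1, unlikelihood_coeff=0.0))
```

A value check like this is used here rather than an argparse mutually exclusive group, because both options have numeric defaults and a group would only look at whether the flags were passed on the command line.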

francoishernandez · 2021-02-19T15:33:17Z

onmt/modules/copy_generator.py

@@ -177,7 +177,7 @@ def forward(self, scores, align, target):
        return loss


-class CommonCopyGeneratorLossCompute(CommonLossCompute):
+class CommonCopyGeneratorLossCompute(LossComputeBase):


I'm not sure I grasp the whole rationale behind the CommonLossCompute/LossComputeBase refactoring. Is the only big remaining difference the log_ppl computation?

(Underlying question is: do we really need both CommonLossCompute and LossComputeBase anymore?)

The _compute_loss, _make_shard_state and the way to use the generator are different between CopyGeneratorLoss and the other classes

We can do it in one class; the code is already not very clear, so it's not going to get worse. If we do that, CopyGenerator will override _compute_loss, while _compute_log_ppl and _compute_alignement_loss will only be used in the compute_loss of the main class.

Yes, I think it would be a bit better to explicitly override this method than to keep a full class whose purpose is unclear unless we look at this specific CopyGeneratorLoss.
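
As a toy illustration of the single-class layout discussed above (not the actual OpenNMT-py code), the sketch below keeps one base class and lets the copy-generator variant override only `_compute_loss`. The method bodies, the `compute` entry point and the `copy_offset` attribute are placeholders invented for this example.

```python
class LossComputeBase:
    """Single base class holding the shared bookkeeping."""

    def _compute_loss(self, token_nlls):
        # Default criterion: sum of per-token negative log-likelihoods.
        return sum(token_nlls)

    def _compute_log_ppl(self, token_nlls):
        # Always derived from the NLL, whatever the training loss is.
        return sum(token_nlls)

    def compute(self, token_nlls):
        # The main entry point is the only caller of the helper methods.
        return self._compute_loss(token_nlls), self._compute_log_ppl(token_nlls)


class CopyGeneratorLossCompute(LossComputeBase):
    """Variant that overrides only the loss computation."""

    copy_offset = 0.25  # placeholder standing in for copy-specific scoring

    def _compute_loss(self, token_nlls):
        # Placeholder: the real copy loss scores tokens differently.
        return sum(token_nlls) + self.copy_offset


base_loss, base_log_ppl = LossComputeBase().compute([0.5, 1.0])
copy_loss, copy_log_ppl = CopyGeneratorLossCompute().compute([0.5, 1.0])
```

With this layout the difference between the two classes is visible in one overridden method, rather than spread across an intermediate class.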

I merged them; the ppl part is not nice. Also, there is a normalization argument that is not used anywhere. I will investigate whether the normalization step disappeared by mistake.

normalization was already unused a year ago:

OpenNMT-py/onmt/utils/loss.py

Line 228 in 7835130

def __init__(self, criterion, generator, normalization="sents",

francoishernandez · 2021-02-19T15:42:48Z

onmt/utils/loss.py

+    """
+
+    def __init__(self, unlikelihood_coeff, ignore_index=-100):
+        assert 0.0 < unlikelihood_coeff


Maybe add an explicit message here?
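
For instance, a sketch along these lines; the class name `UnlikelihoodTokenLoss` is a placeholder, since the diff hunk above does not show the real one.

```python
class UnlikelihoodTokenLoss:  # hypothetical class name
    def __init__(self, unlikelihood_coeff, ignore_index=-100):
        # Fail early with a message that names the offending value.
        assert unlikelihood_coeff > 0.0, (
            "unlikelihood_coeff must be strictly positive, "
            f"got {unlikelihood_coeff}"
        )
        self.unlikelihood_coeff = unlikelihood_coeff
        self.ignore_index = ignore_index
```

Note that assertions are skipped when Python runs with `-O`, so a `ValueError` may be preferable if the check must always run.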

francoishernandez · 2021-02-19T15:43:21Z

onmt/utils/loss.py

+            target.size(0), target.size(1), target.size(0)
+        ).permute(1, 2, 0)
+
+        ctx_cands = (


More explicit variable name? Or at least a comment?

vince62s · 2022-12-07T15:48:05Z

@funboarder13920 @francoishernandez would it be worth updating this for v3 and merging, or shall we drop it?

implement unlikelihood token loss and fix ppl to always be the ppl

6d93dcc

funboarder13920 changed the title ~~Unlikelihood token loss implementation~~ Unlikelihood token loss Feb 19, 2021

francoishernandez reviewed Feb 19, 2021

Valentin Berkes added 3 commits February 19, 2021 16:24

address PR comments + commit missing test file

c1787a8

mutually exclusive label_smoothing and unlikelihood_coeff

770f60d

merge LossComputeBase and CommonLossComputeBase

118a43f

francoishernandez marked this pull request as draft December 19, 2022 17:19

vince62s changed the title ~~Unlikelihood token loss~~ [WIP v2 - deprecated] Unlikelihood token loss Jan 19, 2023
