Duplicate division in relative positional encoding #11

Open
songweige opened this issue Jun 15, 2022 · 1 comment
@songweige

Hey @lucidrains, thanks for maintaining these model implementations. At line 88,

```python
num_buckets //= 2
ret += (n < 0).long() * num_buckets
n = torch.abs(n)
max_exact = num_buckets // 2
```

you set max_exact to half of num_buckets, whose value was already halved at line 84.

I think the division is duplicated and the line should be changed to the identity:

```python
max_exact = num_buckets
```
@oxjohanndiep

I suggest reading the paper "On Scalar Embedding of Relative Positions in Attention Models", which explains the bucketing function implemented here.
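
For reference, here is a minimal sketch of the T5-style bucketing scheme the quoted snippet follows (bidirectional branch only; the function name, signature, and defaults are assumptions for illustration, not the repository's exact code):

```python
import math

import torch


def relative_position_bucket(relative_position, num_buckets=32, max_distance=128):
    # bidirectional case: the first halving splits the buckets between
    # negative and positive relative positions
    ret = 0
    n = -relative_position
    num_buckets //= 2
    ret += (n < 0).long() * num_buckets  # negative offsets land in the upper half
    n = torch.abs(n)

    # the second halving is separate and intentional: of the buckets left for
    # this sign, half map small offsets exactly, the rest cover offsets up to
    # max_distance on a log scale
    max_exact = num_buckets // 2
    is_small = n < max_exact

    # log-spaced buckets for large offsets; only used where is_small is False,
    # so the log(0) case for n == 0 is discarded by torch.where below
    val_if_large = max_exact + (
        torch.log(n.float() / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    ).long()
    val_if_large = torch.min(val_if_large, torch.full_like(val_if_large, num_buckets - 1))

    ret += torch.where(is_small, n, val_if_large)
    return ret
```

With the defaults above, 32 buckets become 16 per sign; offsets 0 through 7 each get their own bucket, and offsets 8 through 127 share the remaining 8 log-spaced buckets (anything farther clamps to the last bucket). So the two divisions serve different purposes rather than duplicating one another.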
