I have a question about layers in the Swin Transformer. The Swin-T architecture is described as having 2, 2, 6, 2 layers at stages 1, 2, 3 and 4.
What does it mean to have 2 layers at the 1st stage and 6 layers at the 3rd stage?
I understand that there are 2 successive Swin Transformer blocks, but I am confused by the term "layers".
Does it mean that at layer 1 the W-MSA block is executed and its output is passed to the SW-MSA block? What happens next? What about layer 2: is the W-MSA block executed again on the output of the SW-MSA block?
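The architecture has four stages, and Swin Transformer blocks come in pairs: a W-MSA block followed by an SW-MSA block. In my understanding, the layer numbers give how many blocks each stage stacks, so 2 layers means one W-MSA/SW-MSA pair and 6 layers means three such pairs run one after another.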
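To make the counting concrete, here is a minimal sketch of how the per-stage numbers could map to blocks. It assumes blocks alternate by index within a stage (even index uses regular windows, odd index uses shifted windows). `block_schedule` is a hypothetical helper written for illustration, not a function from the Swin repository.

```python
from typing import List


def block_schedule(depths: List[int]) -> List[List[str]]:
    """For each stage, list the attention type of every block in order.

    Assumption: within a stage, even-indexed blocks use W-MSA and
    odd-indexed blocks use SW-MSA.
    """
    schedule = []
    for depth in depths:
        stage_blocks = ["W-MSA" if i % 2 == 0 else "SW-MSA" for i in range(depth)]
        schedule.append(stage_blocks)
    return schedule


# Swin-T configuration quoted in the question.
swin_t = block_schedule([2, 2, 6, 2])
for stage, blocks in enumerate(swin_t, start=1):
    print(f"stage {stage}: {' -> '.join(blocks)}")
```

Under this reading, stage 3 of Swin-T runs W-MSA, SW-MSA, W-MSA, SW-MSA, W-MSA, SW-MSA, so yes: after an SW-MSA block the next block in the same stage is a W-MSA block applied to its output.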
Why is the third stage given the highest number of block repetitions? Other Swin variants use [2, 2, 18, 2]. Can this logic be generalised to other modalities?
@zeliu98 @ancientmooner Please help. Others are welcome to share their views too.
Thank you.