
Support for 3D Conv-Net #466

Merged
merged 5 commits into from
May 28, 2024

Conversation

kevinkevin556
Contributor

Hi all,

Thank you for developing such a nice repo. I've been using it in many of my projects for network explainability, and it has been incredibly convenient!

Recently, I've been working with medical datasets using 3D-UNet. However, I noticed that 3D convolution is not yet supported in this library, and there are issues such as #351 requesting this feature. Therefore, I made several changes to GradCAM and BaseCAM to extend GradCAM to support 3D images.

Please let me know if you have any questions or suggestions regarding the changes I've implemented. I'm excited to contribute to this project and look forward to your feedback!

@jacobgil
Owner

jacobgil commented Dec 9, 2023

Hey, sorry for the late reply.
Thanks a lot for this functionality, this will be great to merge.

Is there a way to share an example use case for this: maybe some model and an input image example,
or an image example for the readme?

weights = self.get_cam_weights(input_tensor,
                               target_layer,
                               targets,
                               activations,
                               grads)
weighted_activations = weights[:, :, None, None] * activations
w_shape = (slice(None), slice(None)) + (None,) * (len(activations.shape) - 2)
Owner

This line is a bit less straightforward to understand.
Can you please explain what's going on here?
Do you think there is a way to rewrite it to be clearer?

Contributor Author

That line does exactly the same thing as

# 2D conv
if len(activations.shape) == 4:
  weighted_activations = weights[:, :, None, None] * activations

# 3D conv
elif len(activations.shape) == 5:   
  weighted_activations = weights[:, :, None, None, None] * activations

But I think you are right: it does hurt readability.
I will rewrite the code here.
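The equivalence between the dynamic `w_shape` index and the explicit 2D/3D versions can be checked with a small NumPy sketch (random arrays stand in for the real weights and activations; the shapes are illustrative):

```python
import numpy as np

# Hypothetical 3D-conv activations: (batch, channels, depth, height, width)
activations = np.random.rand(2, 8, 4, 6, 6)
# Per-channel CAM weights: (batch, channels)
weights = np.random.rand(2, 8)

# Dynamic version: append one broadcast (None) axis per spatial dimension,
# so the same line works for both 2D (4-dim) and 3D (5-dim) activations.
w_shape = (slice(None), slice(None)) + (None,) * (activations.ndim - 2)
dynamic = weights[w_shape] * activations

# Explicit 3D version, as written out in the comment above
explicit = weights[:, :, None, None, None] * activations

assert np.allclose(dynamic, explicit)
```

Indexing with a tuple of `slice(None)` and `None` entries is exactly what `weights[:, :, None, None, None]` desugars to, which is why the two forms agree for any number of spatial dimensions.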

@kevinkevin556
Contributor Author

@jacobgil Thanks for your reply!

> Is there a way to share an example use case for this: maybe some model and an input image example, or an image example for the readme?

I added an animation of gradcam-visualized CT scans in the readme.
Hope this can make it clearer.

@Syax19

Syax19 commented Jan 17, 2024

@kevinkevin556 Thanks for providing the code for applying Grad-Cam on 3D CNN!

I have used your code to get the Grad-CAM outputs. My input 3D image tensor has size (1, 1, 24, 224, 224), representing (batch, channel, depth, height, width), and the resulting grayscale_cam output has size (1, 24, 224, 224).
If I take one slice of the output, for example depth=11, i.e. outputs[0][11, :, :] in (height, width), does it correspond to input_image[:, 11, :, :] in (channel, depth, height, width) order?
I ask because every depth slice of the output heatmap looks the same.

Looking forward to your reply, thanks!

@kevinkevin556
Contributor Author

> I have used your code to get the Grad-CAM outputs. My input 3D image tensor has size (1, 1, 24, 224, 224), representing (batch, channel, depth, height, width), and the resulting grayscale_cam output has size (1, 24, 224, 224). If I take one slice of the output, for example depth=11, i.e. outputs[0][11, :, :] in (height, width), does it correspond to input_image[:, 11, :, :] in (channel, depth, height, width) order? I ask because every depth slice of the output heatmap looks the same.

@Syax19 Sorry for the late reply. I'm glad to hear that someone is using it 😄

Although I followed MONAI's convention to assign each dimension in the order of (height, width, depth), the output dimensions should still correspond with your input tensor, as there is no dimension swap when calculating Grad-CAM.

Therefore, the grayscale_cam of size (1, 24, 224, 224) represents dimensions (batch, depth, height, width) in your case.
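A minimal shape check illustrates the correspondence, using zero arrays as stand-ins for the actual tensors discussed above (the sizes match the example in this thread; no Grad-CAM computation is performed here):

```python
import numpy as np

# Stand-ins for the tensors in the discussion above
input_image = np.zeros((1, 1, 24, 224, 224))   # (batch, channel, depth, height, width)
grayscale_cam = np.zeros((1, 24, 224, 224))    # (batch, depth, height, width)

# The heatmap slice at depth index 11 ...
cam_slice = grayscale_cam[0, 11]               # shape (224, 224)

# ... overlays the input slice at the same depth index
img_slice = input_image[0, :, 11]              # shape (1, 224, 224)

print(cam_slice.shape, img_slice.shape)
```

Since no dimension swap happens during the CAM computation, depth index 11 of the heatmap always maps to depth index 11 of the input volume.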

@MoH-assan

@jacobgil
Any update on this feature?

@jacobgil
Owner

This is incredible functionality, thank you so much for contributing this, and sorry for being so late with my reply.
I really want to merge this.
The .gif file weighs 24 MB, which is a bit much; I will look into resizing it.

@jacobgil jacobgil changed the base branch from master to 3d May 28, 2024 18:41
@jacobgil jacobgil changed the base branch from 3d to master May 28, 2024 18:43
@jacobgil jacobgil merged commit 3f6b14d into jacobgil:master May 28, 2024
1 check failed
@jacobgil
Owner

@kevinkevin556 merged!! better late than never. Thank you so much for this contribution!
