Exporting to ONNX (Person REID) #9

Open
joseph-cv opened this issue May 17, 2023 · 10 comments

Comments

@joseph-cv

Hello, how can I export the model to ONNX?

@joseph-cv changed the title from "Exporting to ONNX" to "Exporting to ONNX (Person REID)" on May 17, 2023
@cwhgn
Collaborator

cwhgn commented May 18, 2023

The model trained by SOLIDER is mainly intended for fine-tuning on downstream human tasks, so we have not worked on ONNX export yet. If you want to export to ONNX, you can use third-party functions or tools such as torch.onnx.export.

@22ema

22ema commented Dec 18, 2023

Here are some tips I learned while converting to ONNX. If you use a dynamic batch size when converting the Swin Transformer to ONNX, you must change the code as shown below. This change does not affect the model's performance.

    def window_reverse(self, windows, H, W):
        """
        Args:
            windows: (num_windows*B, window_size, window_size, C)
            H (int): Height of image
            W (int): Width of image
        Returns:
            x: (B, H, W, C)
        """
        window_size = self.window_size
        # B = int(windows.shape[0] / (H * W / window_size / window_size))
        # x = windows.view(B, H // window_size, W // window_size, window_size,
        #                  window_size, -1)
        # x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(B, H, W, -1)
        C = int(windows.shape[-1])
        x = windows.view(-1, H // window_size, W // window_size, window_size, window_size, C)
        x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, H, W, C)
        return x
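A likely reason the original lines break dynamic batching is that `B` is computed as a Python number during tracing, so the exporter may record it as a constant, whereas `-1` lets the batch dimension be inferred at run time. The sketch below checks that the two forms reshape identically for a fixed batch. It uses NumPy as a stand-in for torch so it runs without the model code; `window_partition` is the standard Swin partition written here as an assumption, and all function names are illustrative.

```python
import numpy as np


def window_partition(x, window_size):
    """(B, H, W, C) -> (num_windows*B, window_size, window_size, C)."""
    B, H, W, C = x.shape
    x = x.reshape(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.transpose(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)


def window_reverse_fixed_b(windows, window_size, H, W):
    """Original form: B is computed in Python from the number of windows."""
    B = int(windows.shape[0] / (H * W / window_size / window_size))
    x = windows.reshape(B, H // window_size, W // window_size, window_size, window_size, -1)
    return x.transpose(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)


def window_reverse_inferred_b(windows, window_size, H, W):
    """Modified form: the batch dimension is left as -1 and inferred."""
    C = windows.shape[-1]
    x = windows.reshape(-1, H // window_size, W // window_size, window_size, window_size, C)
    return x.transpose(0, 1, 3, 2, 4, 5).reshape(-1, H, W, C)


# Toy input: batch 2, height 8, width 4, 3 channels, window size 4.
x = np.arange(2 * 8 * 4 * 3, dtype=np.float32).reshape(2, 8, 4, 3)
windows = window_partition(x, window_size=4)
a = window_reverse_fixed_b(windows, 4, 8, 4)
b = window_reverse_inferred_b(windows, 4, 8, 4)

# Both forms should undo the partition exactly.
roundtrip_ok = np.array_equal(a, x) and np.array_equal(b, x)
print("round trip ok:", roundtrip_ok)
```

Because both forms produce the same tensor, swapping one for the other should not change the model's outputs; only the way the batch size enters the traced graph differs.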

@rhett-ye

rhett-ye commented Dec 28, 2023

Can you provide more details about the torch-to-ONNX conversion? When I ran inference with the converted ONNX model, the results were completely different from the torch results, even though I had made sure the inputs were identical. The export also printed this warning:
TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

I also get errors when I try to use onnx-simplifier. Is this a problem with the model?

@22ema

22ema commented Jan 2, 2024

Hello @rhett-ye, happy new year.
Sure, here are more details about the ONNX conversion.
First, here is the code I use to convert the .pth checkpoint to ONNX.

import os
import torch
import torchvision.models as models

from utils.Util import pretrained_model
from config.Config import Config
from net.Swin_Transform.SOLIDER import SOLIDER
from net.Swin_Transform.backbone import swin_transformer
from net.SoliderReid import make_model

def load_model(model, nLabels, pretrained_model_path, activate_func):
    # Note: nLabels is not used in this function.
    torch_model = None
    model_desc = model.lower()
    print(model_desc)
    if "solider_reid_swin_s" in model_desc:
        torch_model = make_model(0.2)
    # Fail early if the model name did not match a known architecture.
    assert torch_model is not None, "torch_model must not be None!!!"
    # Load the checkpoint weights (False = load onto the GPU, not the CPU).
    torch_model = pretrained_model(torch_model, pretrained_model_path, False)
    # Optionally append an activation layer so it becomes part of the export.
    if activate_func == "sigmoid":
        torch_model = torch.nn.Sequential(torch_model, torch.nn.Sigmoid())
    elif activate_func == "softmax":
        # dim=1 assumes a (batch, features) output.
        torch_model = torch.nn.Sequential(torch_model, torch.nn.Softmax(dim=1))
    return torch_model

def convert_onnx(torch_model, onnx_path, sizes):
    torch_model.to("cuda:0")
    batch, channels, height, width = sizes
    input_names = ["input_0"]
    output_names = ["output_0"]
    dynamic_axes = {"input_0" : {0 : 'batch_size'}, "output_0" : {0 : 'batch_size'}}
    if not os.path.isfile(onnx_path):
        dummy_input = torch.randn(batch, channels, height, width, device='cuda')
        torch.onnx.export(
            torch_model,
            dummy_input,
            onnx_path,
            input_names = input_names, 
            output_names = output_names,
            dynamic_axes = dynamic_axes,
            opset_version=11
        )
        print(f"ONNX model exported to {onnx_path}.")
    else:
        print(f"ONNX model {onnx_path} already exists.")
            

if __name__ == "__main__":
    torch.set_grad_enabled(False)
    
    dummy_size = (Config['batch_size'], Config['channels'], Config['height'], Config['width'])
    
    torch_model = load_model(Config['model_name'], 
                             Config['label_number'], 
                             Config['pth_path'], 
                             Config['fn_activate_func'])
    
    convert_onnx(torch_model, Config['onnx_path'], dummy_size)

load_model() loads the model, and convert_onnx() converts it to ONNX format. The Config values are shown below.

Config = \
    {
        'pth_path': './model/pth/p_reid.solider.swin_s.i384x128.sb.v1.0.0.pth',
        'onnx_path': './model/onnx/p_reid.solider.swin_s.i384x128.sb.v1.0.0_b.onnx',
        'fn_activate_func': None,
        'model_name': 'solider_reid_swin_s',
        'label_number': 768,
        'batch_size': 1,
        'channels': 3,
        'width': 128,
        'height': 384
    }

My conversion results are below, tested on several images from Market-1501 (ReID).

## PTH Result
Label=3:	 predict=3 score=0.24012675881385803	 True
Label=2:	 predict=2 score=0.251444011926651	 True
Label=3:	 predict=3 score=0.18431055545806885	 True
Label=1:	 predict=1 score=0.14189831912517548	 True
Label=4:	 predict=4 score=0.16967739164829254	 True
Label=4:	 predict=4 score=0.1858203113079071	 True
Label=2:	 predict=2 score=0.19251590967178345	 True
Label=3:	 predict=3 score=0.18236778676509857	 True
Label=2:	 predict=2 score=0.18501074612140656	 True
Label=4:	 predict=4 score=0.18582163751125336	 True
Label=4:	 predict=4 score=0.15165238082408905	 True
Label=1:	 predict=1 score=0.18481244146823883	 True
Label=3:	 predict=3 score=0.1856241375207901	 True
Label=1:	 predict=1 score=0.33886465430259705	 True
Label=1:	 predict=1 score=0.17545929551124573	 True
Label=2:	 predict=2 score=0.2006452977657318	 True
Label=1:	 predict=1 score=0.16112568974494934	 True
Label=1:	 predict=1 score=0.135588601231575	 True
Label=1:	 predict=1 score=0.2757149636745453	 True
Label=3:	 predict=3 score=0.19793693721294403	 True
Label=2:	 predict=2 score=0.23355929553508759	 True
## ONNX result
Label=3:	 predict=3 score=0.24011623859405518	 True
Label=2:	 predict=2 score=0.2514563202857971	 True
Label=3:	 predict=3 score=0.18432889878749847	 True
Label=1:	 predict=1 score=0.14190667867660522	 True
Label=4:	 predict=4 score=0.16966880857944489	 True
Label=4:	 predict=4 score=0.18584184348583221	 True
Label=2:	 predict=2 score=0.192528635263443	 True
Label=3:	 predict=3 score=0.18238039314746857	 True
Label=2:	 predict=2 score=0.18498477339744568	 True
Label=4:	 predict=4 score=0.18582601845264435	 True
Label=4:	 predict=4 score=0.1516398787498474	 True
Label=1:	 predict=1 score=0.18481333553791046	 True
Label=3:	 predict=3 score=0.185637429356575	 True
Label=1:	 predict=1 score=0.33886411786079407	 True
Label=1:	 predict=1 score=0.17545419931411743	 True
Label=2:	 predict=2 score=0.20063801109790802	 True
Label=1:	 predict=1 score=0.1611219197511673	 True
Label=1:	 predict=1 score=0.1355796605348587	 True
Label=1:	 predict=1 score=0.27569296956062317	 True
Label=3:	 predict=3 score=0.19792437553405762	 True
Label=2:	 predict=2 score=0.23357689380645752	 True
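One way to quantify how close the two runs are is the largest absolute score difference. The sketch below uses the first five score pairs from the lists above; the helper name `max_abs_diff` and the 1e-4 tolerance are assumptions chosen for illustration, not values from the SOLIDER code.

```python
# First five scores copied from the PTH and ONNX results above.
pth_scores = [
    0.24012675881385803,
    0.251444011926651,
    0.18431055545806885,
    0.14189831912517548,
    0.16967739164829254,
]
onnx_scores = [
    0.24011623859405518,
    0.2514563202857971,
    0.18432889878749847,
    0.14190667867660522,
    0.16966880857944489,
]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("score lists must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


diff = max_abs_diff(pth_scores, onnx_scores)
# Assumed tolerance: small float32 drift between runtimes is expected.
within_tolerance = diff < 1e-4
print(f"max abs diff = {diff:.2e}, within tolerance: {within_tolerance}")
```

Differences of this size are typical of floating-point drift between runtimes, whereas a broken conversion usually shows up as differences several orders of magnitude larger.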

Lastly, the Market-1501 evaluation results are as follows. ONNX and pth show the same results.

  • pth
    (image: pth evaluation results)

  • onnx
    (image: onnx evaluation results)

The same warning is displayed when I export. Since it is only a warning and the results match, it does not appear to cause any problem.

If the above does not solve the problem, please share your code.

@rhett-ye

rhett-ye commented Jan 8, 2024

Thanks for your reply! I solved my problem, which was a model loading error. Separately, when I used opset_version=11, I got an exception saying that the torch.roll operation was not supported (opset 14 already supports it), so I re-implemented that part.
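The re-implementation itself was not posted. A common approach, sketched here as an assumption rather than the code actually used, is to express the cyclic shift with slicing and concatenation only. NumPy stands in for torch so the sketch runs anywhere; with tensors the same idea would use slicing and torch.cat. The function name `roll_by_slicing` is illustrative.

```python
import numpy as np


def roll_by_slicing(x, shift, axis):
    """Cyclic shift along one axis using only slicing and concatenation."""
    n = x.shape[axis]
    shift %= n  # normalise negative or oversized shifts into [0, n)
    if shift == 0:
        return x.copy()
    # The last `shift` entries wrap around to the front.
    tail = np.take(x, np.arange(n - shift, n), axis=axis)
    head = np.take(x, np.arange(0, n - shift), axis=axis)
    return np.concatenate((tail, head), axis=axis)


# Swin-style shift on a (B, H, W, C) array: roll H and W by -shift_size.
x = np.arange(2 * 8 * 8 * 3).reshape(2, 8, 8, 3)
shift_size = 3
shifted = roll_by_slicing(roll_by_slicing(x, -shift_size, axis=1), -shift_size, axis=2)

# Cross-check against the library roll.
matches_reference = np.array_equal(
    shifted, np.roll(x, (-shift_size, -shift_size), axis=(1, 2))
)
print("matches np.roll:", matches_reference)
```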

@chenscottus

When I run it, I get this error:
from utils.Util import pretrained_model
ModuleNotFoundError: No module named 'utils.Util'

@22ema

22ema commented Feb 23, 2024

#9 (comment)

I'm sorry. I implemented the pretrained_model module myself, so you will need to implement it yourself or modify the code.

I only provided part of the ONNX conversion code.

@chenscottus

Is it possible for you to share the code, or just the minimum code that works?

@22ema

22ema commented Feb 23, 2024

Okay, here is the code. :D

import torch


def check_keys(model, pretrained_state_dict):
    ckpt_keys = set(pretrained_state_dict.keys())
    model_keys = set(model.state_dict().keys())
    used_pretrained_keys = model_keys & ckpt_keys
    unused_pretrained_keys = ckpt_keys - model_keys
    missing_keys = model_keys - ckpt_keys
    print('Missing keys:{}'.format(len(missing_keys)))
    print('Unused checkpoint keys:{}'.format(len(unused_pretrained_keys)))
    print('Used keys:{}'.format(len(used_pretrained_keys)))
    assert len(used_pretrained_keys) > 0, 'load NONE from pretrained checkpoint'
    return True


def remove_prefix(state_dict, prefix):
    ''' Old style model is stored with all names of parameters sharing common prefix 'module.' '''
    print('remove prefix \'{}\''.format(prefix))
    f = lambda x: x.split(prefix, 1)[-1] if x.startswith(prefix) else x
    return {f(key): value for key, value in state_dict.items()}


def pretrained_model(model, pretrained_path, load_to_cpu):
    print('Loading pretrained model from {}'.format(pretrained_path))
    if load_to_cpu:
        pretrained_dict = torch.load(pretrained_path, map_location=lambda storage, loc: storage)
    else:
        device = torch.cuda.current_device()
        pretrained_dict = torch.load(pretrained_path, map_location=lambda storage, loc: storage.cuda(device))
        # with open(pretrained_path, 'rb') as f:
        #     buffer = io.BytesIO(f.read())
        # pretrained_dict = torch.jit.load(buffer)
    if "state_dicts" in pretrained_dict.keys():
        pretrained_dict = remove_prefix(pretrained_dict['state_dicts'], 'module.')
    else:
        pretrained_dict = remove_prefix(pretrained_dict, 'module.')
    # print(model.state_dict().keys())
    check_keys(model, pretrained_dict)
    model.load_state_dict(pretrained_dict, strict=False)
    return model

Since the Config file simply consists of constant values, I recommend modifying the relevant constants before use.
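Because the loader above calls load_state_dict with strict=False, mismatched keys are skipped silently, so it can help to look at which keys are missing rather than only their counts. The sketch below mirrors the prefix stripping and key comparison with plain dictionaries; the key names and the helper `key_report` are made up for illustration and are not from the SOLIDER checkpoint.

```python
def remove_prefix(state_dict, prefix):
    """Strip a leading prefix such as 'module.' from every key that has it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def key_report(model_keys, ckpt_keys):
    """Split keys into used, missing (model only) and unused (checkpoint only)."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    return {
        "used": sorted(model_keys & ckpt_keys),
        "missing": sorted(model_keys - ckpt_keys),
        "unused": sorted(ckpt_keys - model_keys),
    }


# Hypothetical keys: a model and a checkpoint saved with a 'module.' prefix.
model_keys = ["base.weight", "base.bias", "classifier.weight"]
checkpoint = {
    "module.base.weight": 1,
    "module.base.bias": 2,
    "module.extra.weight": 3,
}

report = key_report(model_keys, remove_prefix(checkpoint, "module."))
for name, keys in report.items():
    print(f"{name}: {keys}")
```

A non-empty "missing" list means some layers keep their random initial weights, which is one way an export can succeed while producing outputs that differ from the expected ones.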

@chenscottus

chenscottus commented Feb 23, 2024 via email
