Add op remainder for all platform #4912

Open
wants to merge 24 commits into
base: master

@FisherWY FisherWY commented Aug 3, 2023

  • Implement Remainder on all platforms
    • arm
    • loongarch
    • mips, working
    • riscv
    • vulkan
    • x86
  • PNNX conversion
  • Unit tests

@tencent-adm

tencent-adm commented Aug 3, 2023

CLA assistant check
All committers have signed the CLA.

@nihui
Member

nihui commented Aug 3, 2023

remainder should be implemented inside binaryop...

@FisherWY
Copy link
Author

FisherWY commented Aug 4, 2023

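Yes. Yesterday I worked from the Paddle documentation, and only after opening the PR did I find that Paddle's and Torch's Remainder differ 😂. I will fix this in the next commit.
Paddle documentation: link
Torch documentation: link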

@codecov-commenter

codecov-commenter commented Aug 5, 2023

Codecov Report

Merging #4912 (1fd5705) into master (c45c01c) will decrease coverage by 0.05%.
Report is 32 commits behind head on master.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #4912      +/-   ##
==========================================
- Coverage   89.81%   89.76%   -0.05%     
==========================================
  Files         306      306              
  Lines       86875    86997     +122     
==========================================
+ Hits        78024    78091      +67     
- Misses       8851     8906      +55     
| Files Changed | Coverage Δ |
|---|---|
| src/layer/binaryop.cpp | 97.19% <0.00%> (-2.20%) ⬇️ |
| src/layer/x86/avx512_mathfun.h | 99.00% <0.00%> (-1.00%) ⬇️ |
| src/layer/x86/avx_mathfun.h | 98.79% <0.00%> (-1.21%) ⬇️ |
| src/layer/x86/binaryop_x86.cpp | 98.11% <0.00%> (-1.69%) ⬇️ |
| src/layer/x86/sse_mathfun.h | 98.78% <0.00%> (-1.22%) ⬇️ |

... and 6 files with indirect coverage changes

@FisherWY FisherWY changed the title from "[WIP] Add op remainder for all platform" to "Add op remainder for all platform" Sep 13, 2023
@nihui
Member

nihui commented Sep 21, 2023

Many CI builds are failing; they need to be fixed.

@FisherWY
Author

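The x86 implementation currently follows the formula given in the Torch documentation, but the results do not seem to match (test_binaryop fails): torch.remainder(a, b) == a - a.div(b, rounding_mode="floor") * b (link) 🤔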
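For reference, the identity quoted above can be written as a scalar C++ function. This is only a minimal sketch: the name `remainder_floor` is hypothetical and is not taken from the PR, and it ignores special cases such as `y == 0`.

```cpp
#include <cmath>

// Hypothetical helper (not from the PR): remainder via floor division,
// following the identity a - floor(a / b) * b.
static float remainder_floor(float x, float y)
{
    // Floor the quotient first, then subtract the scaled divisor.
    return x - std::floor(x / y) * y;
}
```

With this formulation a nonzero result takes the sign of the divisor: `remainder_floor(5.0f, 3.0f)` is 2, `remainder_floor(-5.0f, 3.0f)` is 1, and `remainder_floor(5.0f, -3.0f)` is -1.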

@nihui
Member

nihui commented Sep 26, 2023

        float div_result = x / y;
        float round_result = roundf(div_result);
        float res = x - y * round_result;
        return res;

Isn't the problem that roundf( x / y ) here is not the same as floor division?
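To make the difference concrete, here is a small side-by-side sketch. The function names `rem_round` and `rem_floor` are hypothetical; `rem_round` mirrors the snippet quoted above, and `rem_floor` is one possible floor-based variant, not necessarily what the PR ended up using.

```cpp
#include <math.h>

// Rounds the quotient to the nearest integer, as in the quoted snippet.
static float rem_round(float x, float y)
{
    return x - y * roundf(x / y);
}

// Floors the quotient instead, as the torch.remainder identity requires.
static float rem_floor(float x, float y)
{
    return x - y * floorf(x / y);
}
```

For x = 5 and y = 3 the quotient is about 1.667, so `rem_round` computes 5 - 3 * 2 = -1 while `rem_floor` computes 5 - 3 * 1 = 2. The two agree only when the fractional part of the quotient is below 0.5, for example x = 4 and y = 3, where both return 1.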

@FisherWY
Author

I ran into a strange problem. Steps to reproduce:

  1. Write an implementation in src/layer/binaryop.cpp that always returns 0:
struct binary_op_remainder
{
    float operator()(const float& x, const float& y) const
    {
        return 0.0f;
    }
};
  2. Write the x86 implementation in src/layer/x86/binaryop_x86.cpp, also returning 0:
struct binary_op_remainder
{
    float func(const float& x, const float& y) const
    {
        return 0.0f;
    }
#if __SSE2__
    __m128 func_pack4(const __m128& x, const __m128& y) const
    {
        __m128 res = _mm_setzero_ps();
        return res;
    }
#if __AVX__
    __m256 func_pack8(const __m256& x, const __m256& y) const
    {
        __m256 res = _mm256_setzero_ps();
        return res;
    }
#if __AVX512F__
    __m512 func_pack16(const __m512& x, const __m512& y) const
    {
        __m512 res = _mm512_setzero_ps();
        return res;
    }
#endif // __AVX512F__
#endif // __AVX__
#endif // __SSE2__
};
  3. Build and run the unit tests; the results still differ:
    image: https://user-images.githubusercontent.com/32707008/275434958-8e9949fa-2e45-420b-949d-b218cbb2a881.png
  4. What causes this? (My understanding is that the unit test compares the result from src/layer/binaryop.cpp against each platform's implementation. Is that understanding wrong?)

@nihui
Member

nihui commented Oct 16, 2023

"test layer gpu failed" means the vulkan implementation is not aligned with binaryop.cpp.

@FisherWY
Author

I see, thank you very much!

@github-actions github-actions bot removed the core label Oct 17, 2023
@nihui
Member

nihui commented Oct 20, 2023

Many CI tests are failing qaq

4 participants