Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

【Metal】The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend #114

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

vinsentli
Copy link
Contributor

@vinsentli vinsentli commented Apr 30, 2024

The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend.
For example , on iOS simulator is 256.

https://developer.apple.com/documentation/metal/developing_metal_apps_that_run_in_simulator
When you set arguments for the render or compute command, align constant buffer offsets to 256 bytes.

And I have not test the value on MacOS. Maybe it is same with iOS simulator. I cannot verify it.

From the https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf it shows 256B on M2.
image

@facebook-github-bot
Copy link
Contributor

@corporateshark has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@syeh1
Copy link
Contributor

syeh1 commented May 1, 2024

@vinsentli are you running into this issue while trying to allocate a uniform buffer or a buffer to hold the texture?

Please give an example of the usage so we can better assess next steps.

@corporateshark
Copy link
Contributor

@vinsentli Could you please share some test code snippet or a usage example showing how exactly this value is being used?

@vinsentli
Copy link
Contributor Author

@syeh1 @corporateshark
Does DeviceFeatureLimits::BufferAlignment mean GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT?
I fix the same problem on opengl backend in another pull request:
#113

@vinsentli
Copy link
Contributor Author

https://www.khronos.org/opengl/wiki/Uniform_Buffer_Object#Data_storage
If you bind a uniform buffer with glBindBufferRange, the offset field of that parameter must be a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT (this is a global value, not a per-program or per-block one). Thus, if you want to put the data for multiple uniform blocks in a single buffer object, you must make sure that the data for each within that block matches this alignment.

@vinsentli
Copy link
Contributor Author

This is my use scene: put the data for multiple uniform blocks in a single buffer object.

In that scene,must make sure that the data for each within that block matches this alignment.

@syeh1
Copy link
Contributor

syeh1 commented May 1, 2024

Ok, got it. In that case, iOS is 256b, but M2 would be 32b (looking at the Minimum Constant Buffer offset Alignment row)

@vinsentli
Copy link
Contributor Author

vinsentli commented May 1, 2024

Ok, got it. In that case, iOS is 256b, but M2 would be 32b (looking at the Minimum Constant Buffer offset Alignment row)

On m2 is 32B,means 128byte.@syeh1

@vinsentli
Copy link
Contributor Author

I update the value on macos.
So the alignment on
iOS simulator is 256byte,
on macos is 128 byte,
on iOS phones is 16 byte.

@facebook-github-bot
Copy link
Contributor

@corporateshark has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@tgoulart
Copy link
Contributor

tgoulart commented May 2, 2024

I'm trying to review this, but things are very unclear. Let's start by establishing some things:

  • If we want to keep the OpenGL and Metal PRs separate, let's not mix them in the conversation. It just adds noise.
  • The Metal tables document uses uppercase B for bytes, not bits. IGL's values are also in bytes. There should be no conversion between the two.
  • GPU families got weird after M1 macs. Starting with Apple7, mobile and mac GPUs are in the same family. The limits in the mac2 column represent Intel GPUs, but note that arm macs will also report support for the mac2 family. All of this is in the document linked above.
    • Note that IGL does not distinguish between arm and Intel macs, it just looks at the platform.

Now, onto the speculative:

  1. If I understand the motivation, the type of alignment that matters here is for "constant buffers".
  2. DeviceFeatureLimits::BufferAlignment doesn't necessarily map to constant buffers only, but buffers in general. We might have to add a new enum for it, but we can also just go with the more restrictive of all use cases -- and I say let's go that way until someone comes up with actual code exposing some issue/limitation, as otherwise it's just guessing.
  3. According to the Metal table, iOS physical device alignment should be 16, and mac should be 32. And according to the simulator guide, alignment should be 256. I think these are the more correct numbers to use.
  4. HOWEVER, if we also want to account for the "Buffer alignment for copying an existing texture to a
    buffer" row given the lack of granularity in IGL's enums, we should use 256 as the alignment for mac as well.

IMO, given the lack of real examples, we should go with this simpler/safer alternative:

#if IGL_PLATFORM_MACOS || IGL_PLATFORM_IOS_SIMULATOR
    result = 256;
#else
    result = 16;
#endif

Thoughts?

@vinsentli vinsentli changed the title The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend 【Metal】The DeviceFeatureLimits::BufferAlignment is not always equals 16 byte on Metal backend May 3, 2024
@vinsentli
Copy link
Contributor Author

vinsentli commented May 5, 2024

There is a easy way to verify the alignment value on MacOS。
image

In the ColorSession demo, modify the bufferOffset value from 0 to 1, the Xcode will report an error.

I verify the value on MacOS with arm GPU(Apple M1 Pro), the alignment value is 16 bytes.
【the offset into the buffer color that is bound at buffer index 0 must be a multiple of 16】

image

It aligns with the instructions provided in the documentation.

image image

I verify the value on MacOS with Intel GPU(Apple M1 Pro), the alignment value is 32 bytes.
【the offset into the buffer color that is bound at buffer index 0 must be a multiple of 32】
image

Indeed, it has been confusing me.

Yes , use 256 bytes on MacOS more safer. @tgoulart @corporateshark

#if IGL_PLATFORM_MACOS || IGL_PLATFORM_IOS_SIMULATOR
result = 256;
#else
result = 16;
#endif

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants