Upgrade HadoopTableOperations.version from int32 to long64 #10277
@jkolash can you share a few more details about the 3rd party that is writing this? It would be good to know why this 3rd party writes the version as a long instead of an int.
In this case it is data written by Snowflake.
We aren't particularly interested in writing data to Snowflake, but we are interested in using the Hadoop catalog to read the data after it has landed on S3. Our goal is to have Snowflake write the data to S3 without needing to connect to the Snowflake catalog, then just use S3 after the data has been delivered, so we don't have to "know" it came from Snowflake. I've verified that I can query via Spark 3.4 once I switch the version from int to long.
@jkolash you might want to report this to Snowflake, as the version should currently be an int instead of a long to comply with the implementation in Iceberg.
I looked at other implementations that read version-hint.text to see whether values larger than int32 would break them. duckdb (https://github.com/duckdb/duckdb_iceberg/blob/main/src/common/iceberg.cpp#L220C25-L220C40) uses a string and makes no assumptions about the type beyond that, it seems.
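The duckdb approach described above can be sketched as follows: treat the hint as an opaque string and only use it to build the metadata file name, so the numeric width (int32 vs int64) never matters. This is a minimal illustration, not the actual duckdb or Iceberg code; the `v%s.metadata.json` naming mirrors the Hadoop table layout, and `metadataFileName` is a hypothetical helper.

```java
// Sketch of a string-based approach to the version hint: never parse the
// value numerically, just splice it into the metadata file name.
public class HintAsString {
    // Hypothetical helper: build the metadata file name from the raw hint.
    static String metadataFileName(String versionHint) {
        return "v" + versionHint.trim() + ".metadata.json";
    }

    public static void main(String[] args) {
        // A value well above Integer.MAX_VALUE is handled without issue.
        System.out.println(metadataFileName("4294967296"));
    }
}
```

The trade-off is that purely string-based handling gives up numeric comparison of versions (e.g. picking the latest by value), which is why the Java-side fix discussed in this issue widens the type to long instead.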
Feature Request / Improvement
We are using the Hadoop catalog and have encountered tables written by a 3rd party that encode the version-hint.text value as a number larger than int32 can hold.
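To illustrate the failure mode: a minimal, self-contained sketch (not the actual HadoopTableOperations code; `parseVersionAsLong` is a hypothetical helper) showing that an int-based parse of a hint above Integer.MAX_VALUE (2,147,483,647) throws, while a long-based parse succeeds.

```java
// Demonstrates why an int32 version field breaks on large version hints.
public class VersionHintDemo {
    // Hypothetical helper: parse the hint as long so values above int32 survive.
    static long parseVersionAsLong(String hint) {
        return Long.parseLong(hint.trim());
    }

    public static void main(String[] args) {
        String hint = "4294967296"; // 2^32, larger than any Java int
        try {
            Integer.parseInt(hint);  // an int-based path fails here
        } catch (NumberFormatException e) {
            System.out.println("int32 parse failed: " + e.getMessage());
        }
        System.out.println("long parse ok: " + parseVersionAsLong(hint));
    }
}
```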
I can provide a PR if it is desired; the changes are all isolated to HadoopTableOperations.
The only issue I encountered was that if the Spark driver/worker Iceberg jars were not the same, we'd hit serialization issues, but that is very often the case anyway when upgrading libraries.
Query engine
Spark