[VL] Reading past the end of the stream #5716

Open
FelixYBW opened this issue May 12, 2024 · 0 comments
Labels
bug (Something isn't working), triage

Comments

@FelixYBW
Contributor

Backend

VL (Velox)

Bug description

Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::getOutput failed for [operator: ValueStream, plan node ID: 0]: Error during calling Java code from native code: org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Reading past the end of the stream
Retriable: False
Context: Split [Hive: s3://pinterest-capetown/prod/no_sa_pins_d/dt=2024-05-04/part-r-05087-7d055d77-5d76-4d28-8763-41b32b68afbb.zstd.parquet 3221225472 - 1073741824] Task Gluten_Stage_2_TID_150888
Top-Level Context: Same as context.
Function: read
File: /home/binweiyang/gluten/ep/build-velox/build/velox_ep/./velox/dwio/parquet/thrift/ThriftTransport.h
Line: 52
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorEPKcEEvRKNS1_18VeloxCheckFailArgsET0_
# 3  _ZN8facebook5velox7parquet6thrift24ThriftStreamingTransport4readEPhj
# 4  _ZN6apache6thrift9transport7readAllIN8facebook5velox7parquet6thrift15ThriftTransportEEEjRT_Phj
# 5  _ZN6apache6thrift8protocol17TCompactProtocolTIN8facebook5velox7parquet6thrift15ThriftTransportEE10readBinaryERNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
# 6  _ZN6apache6thrift8protocol4skipINS1_17TCompactProtocolTIN8facebook5velox7parquet6thrift15ThriftTransportEEEEEjRT_NS1_5TTypeE
# 7  _ZN6apache6thrift8protocol4skipINS1_17TCompactProtocolTIN8facebook5velox7parquet6thrift15ThriftTransportEEEEEjRT_NS1_5TTypeE
# 8  _ZN8facebook5velox7parquet6thrift10PageHeader4readEPN6apache6thrift8protocol9TProtocolE
# 9  _ZN8facebook5velox7parquet10PageReader14readPageHeaderEv
# 10 _ZN8facebook5velox7parquet10PageReader10seekToPageEl
# 11 _ZN8facebook5velox7parquet10PageReader11rowsForPageERNS0_4dwio6common21SelectiveColumnReaderEbbRN5folly5RangeIPKiEERPKm
# 12 _ZN8facebook5velox7parquet10PageReader15readWithVisitorINS0_4dwio6common13ColumnVisitorIlNS0_6common9IsNotNullENS5_15ExtractToReaderINS5_28SelectiveIntegerColumnReaderEEELb0EEEEEvRT_
# 13 _ZN8facebook5velox4dwio6common28SelectiveIntegerColumnReader10readHelperINS0_7parquet19IntegerColumnReaderENS0_6common9IsNotNullELb0ENS2_15ExtractToReaderIS3_EEEEvPNS7_6FilterEN5folly5RangeIPKiEET2_
# 14 _ZN8facebook5velox7parquet19IntegerColumnReader4readEiN5folly5RangeIPKiEEPKm
# 15 _ZN8facebook5velox4dwio6common31SelectiveStructColumnReaderBase4readEiN5folly5RangeIPKiEEPKm
# 16 _ZN8facebook5velox4dwio6common31SelectiveStructColumnReaderBase4nextEmRSt10shared_ptrINS0_10BaseVectorEEPKNS2_8MutationE
# 17 _ZN8facebook5velox7parquet16ParquetRowReader4nextEmRSt10shared_ptrINS0_10BaseVectorEEPKNS0_4dwio6common8MutationE
# 18 _ZN8facebook5velox9connector4hive11SplitReader4nextEmRSt10shared_ptrINS0_10BaseVectorEE
# 19 _ZN8facebook5velox9connector4hive14HiveDataSource4nextEmRN5folly10SemiFutureINS4_4UnitEEE
# 20 _ZN8facebook5velox4exec9TableScan9getOutputEv
# 21 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE
# 22 _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
# 23 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
# 24 _ZN6gluten24WholeStageResultIterator4nextEv
# 25 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 26 0x00007f9efab368a8

	at org.apache.gluten.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:39)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:43)
	at org.apache.gluten.utils.InvocationFlowProtection.hasNext(Iterators.scala:135)
	at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
	at org.apache.gluten.utils.PayloadCloser.hasNext(Iterators.scala:35)
	at org.apache.gluten.utils.PipelineTimeAccumulator.hasNext(Iterators.scala:98)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:31)
	at org.apache.gluten.vectorized.GeneralInIterator.hasNext(GeneralInIterator.java:31)
	at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method)
	at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNextInternal(ColumnarBatchOutIterator.java:65)
	at org.apache.gluten.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:37)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:43)
	at org.apache.gluten.utils.InvocationFlowProtection.hasNext(Iterators.scala:135)
	at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
	at org.apache.gluten.utils.PayloadCloser.hasNext(Iterators.scala:35)
	at org.apache.gluten.utils.PipelineTimeAccumulator.hasNext(Iterators.scala:98)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at org.apache.spark.shuffle.ColumnarShuffleWriter.internalWrite(ColumnarShuffleWriter.scala:132)
	at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:236)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1470)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750) 
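
When triaging this, it may help to decode the split range printed in the Context line. The following is a minimal sketch, assuming the two trailing numbers in `Split [Hive: ...]` are the split's start offset and length in bytes, in that order. The helper name `parse_split` and the regular expression are hypothetical and are not part of Gluten or Velox.

```python
import re

# Assumption: the context has the shape "Split [Hive: <path> <start> - <length>]",
# with <start> and <length> given in bytes.
SPLIT_RE = re.compile(
    r"Split \[Hive: (?P<path>\S+) (?P<start>\d+) - (?P<length>\d+)\]"
)

GIB = 1 << 30  # bytes per GiB


def parse_split(context_line):
    """Extract path, start, length, and end offset from a Context line."""
    m = SPLIT_RE.search(context_line)
    if m is None:
        raise ValueError("no Hive split found in context line")
    start = int(m.group("start"))
    length = int(m.group("length"))
    return {
        "path": m.group("path"),
        "start": start,
        "length": length,
        "end": start + length,
    }


# Context line copied from the error report above.
context = (
    "Context: Split [Hive: s3://pinterest-capetown/prod/no_sa_pins_d/"
    "dt=2024-05-04/part-r-05087-7d055d77-5d76-4d28-8763-41b32b68afbb"
    ".zstd.parquet 3221225472 - 1073741824] Task Gluten_Stage_2_TID_150888"
)

split = parse_split(context)
print(f"start  = {split['start'] / GIB:.2f} GiB")
print(f"length = {split['length'] / GIB:.2f} GiB")
print(f"end    = {split['end'] / GIB:.2f} GiB")
```

Under that assumption, the failing split starts at the 3 GiB offset and covers 1 GiB, so the file is larger than 3 GiB and the read that failed belongs to the fourth 1 GiB range. Comparing the computed end offset with the actual object size on S3 could show whether this is the last split of the file.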

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

FelixYBW added the bug and triage labels on May 12, 2024