Segmentation fault (core dumped) in tf.raw_ops.SoftmaxCrossEntropyWithLogits #67531

Open
LongZE666 opened this issue May 14, 2024 · 4 comments
Labels: comp:ops, stale, stat:awaiting response, TF 2.16, type:bug

@LongZE666

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf 2.16.1

Custom code

Yes

OS platform and distribution

Ubuntu 20.04

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

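Calling tf.raw_ops.SoftmaxCrossEntropyWithLogits with empty, mismatched inputs (features of shape [3, 0] and labels of shape [0]) crashes the process with a segmentation fault instead of raising an error.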

Standalone code to reproduce the issue

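import tensorflow as tf

# features is an empty rank-2 tensor: 3 rows, 0 classes.
features = tf.constant([], shape=[3,0], dtype=tf.float64)
# labels is an empty rank-1 tensor, so its shape does not match features.
labels = tf.constant([], shape=[0], dtype=tf.float64)
# On TF 2.16.1 this call crashes with a segmentation fault.
tf.raw_ops.SoftmaxCrossEntropyWithLogits(features=features, labels=labels)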
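
On an affected build, a caller could reject this input before it reaches the op. The sketch below is not part of TensorFlow: validate_xent_shapes is a hypothetical helper, and it assumes the contract given in the op's documentation, where features and labels are both batch_size x num_classes matrices of the same shape. It works on plain shape tuples, so it runs without TensorFlow installed.

```python
def validate_xent_shapes(features_shape, labels_shape):
    """Raise ValueError unless both shapes are rank 2 and identical.

    Hypothetical helper, not a TensorFlow API. Assumes the documented
    contract: features and labels are both [batch_size, num_classes].
    """
    features_shape = tuple(features_shape)
    labels_shape = tuple(labels_shape)
    # Check the rank of each input first so the error names the bad argument.
    for name, shape in (("features", features_shape), ("labels", labels_shape)):
        if len(shape) != 2:
            raise ValueError(f"{name} must be rank 2, got shape {shape}")
    # Both inputs are rank 2 here; now require identical dimensions.
    if features_shape != labels_shape:
        raise ValueError(
            f"features shape {features_shape} does not match "
            f"labels shape {labels_shape}"
        )
    return features_shape


# Shapes taken from the report: features is [3, 0], labels is [0].
try:
    validate_xent_shapes((3, 0), (0,))
except ValueError as err:
    print(f"rejected: {err}")
```

With eager tensors, the shapes could be passed as features.shape.as_list() and labels.shape.as_list() before calling the raw op. The check is deliberately strict; it is a workaround on the Python side and does not replace validation inside the kernel.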

Relevant log output

ASAN Report:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==2083523==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fe71e96a65b bp 0x7ffd6f3a7700 sp 0x7ffd6f3a7620 T0)
==2083523==The signal is caused by a READ memory access.
==2083523==Hint: address points to the zero page.
    #0 0x7fe71e96a65b in std::_Function_handler<void (long, long), tensorflow::functor::XentFunctorBase<Eigen::ThreadPoolDevice, double>::operator()(Eigen::ThreadPoolDevice const&, Eigen::DSizes<long, 2> const&, Eigen::array<long, 2ul> const&, Eigen::array<long, 2ul> const&, Eigen::TensorMap<Eigen::Tensor<double const, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double const, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 2, 1, long>, 16, Eigen::MakePointer>)::{lambda(long, long)#1}>::_M_invoke(std::_Any_data const&, long&&, long&&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x38bcb65b)
    #1 0x7fe6fb7f638b in Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::function<long (long)>, std::function<void (long, long)>) const (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x15a5738b)
    #2 0x7fe71e976680 in tensorflow::SoftmaxXentWithLogitsOp<Eigen::ThreadPoolDevice, double>::Compute(tensorflow::OpKernelContext*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x38bd7680)
    #3 0x7fe74fe0714a in tensorflow::ThreadPoolDevice::Compute(tensorflow::OpKernel*, tensorflow::OpKernelContext*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1cdc14a)
    #4 0x7fe74f98dfe6 in tensorflow::(anonymous namespace)::SingleThreadedExecutorImpl::Run(tensorflow::Executor::Args const&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x1862fe6)
    #5 0x7fe74f86a306 in tensorflow::FunctionLibraryRuntimeImpl::RunSync(tensorflow::FunctionLibraryRuntime::Options, unsigned long, absl::lts_20230802::Span<tensorflow::Tensor const>, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x173f306)
    #6 0x7fe74f8c7f74 in tensorflow::ProcessFunctionLibraryRuntime::RunMultiDeviceSync(tensorflow::FunctionLibraryRuntime::Options const&, unsigned long, std::vector<std::variant<tensorflow::Tensor, tensorflow::TensorShape>, std::allocator<std::variant<tensorflow::Tensor, tensorflow::TensorShape> > >*, std::function<absl::lts_20230802::Status (tensorflow::ProcessFunctionLibraryRuntime::ComponentFunctionData const&, tensorflow::ProcessFunctionLibraryRuntime::InternalArgs*)>) const (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x179cf74)
    #7 0x7fe74f8d58b1 in tensorflow::ProcessFunctionLibraryRuntime::RunSync(tensorflow::FunctionLibraryRuntime::Options const&, unsigned long, absl::lts_20230802::Span<tensorflow::Tensor const>, std::vector<tensorflow::Tensor, std::allocator<tensorflow::Tensor> >*) const (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_framework.so.2+0x17aa8b1)
    #8 0x7fe7179f72eb in tensorflow::KernelAndDeviceFunc::Run(tensorflow::ScopedStepContainer*, tensorflow::EagerKernelArgs const&, std::vector<std::variant<tensorflow::Tensor, tensorflow::TensorShape>, std::allocator<std::variant<tensorflow::Tensor, tensorflow::TensorShape> > >*, tsl::CancellationManager*, std::optional<tensorflow::EagerFunctionParams> const&, std::optional<tensorflow::ManagedStackTrace> const&, tsl::CoordinationServiceAgent*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31c582eb)
    #9 0x7fe71788cc2c in tensorflow::EagerKernelExecute(tensorflow::EagerContext*, absl::lts_20230802::InlinedVector<tensorflow::TensorHandle*, 4ul, std::allocator<tensorflow::TensorHandle*> > const&, std::optional<tensorflow::EagerFunctionParams> const&, tsl::core::RefCountPtr<tensorflow::KernelAndDevice> const&, tensorflow::GraphCollector*, tsl::CancellationManager*, absl::lts_20230802::Span<tensorflow::TensorHandle*>, std::optional<tensorflow::ManagedStackTrace> const&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31aedc2c)
    #10 0x7fe71788f66a in tensorflow::ExecuteNode::Run() (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31af066a)
    #11 0x7fe7179c9939 in tensorflow::EagerExecutor::SyncExecute(tensorflow::EagerNode*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31c2a939)
    #12 0x7fe71787cde4 in tensorflow::(anonymous namespace)::EagerLocalExecute(tensorflow::EagerOperation*, tensorflow::TensorHandle**, int*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31addde4)
    #13 0x7fe717880dd4 in tensorflow::DoEagerExecute(tensorflow::EagerOperation*, tensorflow::TensorHandle**, int*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31ae1dd4)
    #14 0x7fe71788ad26 in tensorflow::EagerExecute(tensorflow::EagerOperation*, tensorflow::TensorHandle**, int*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31aebd26)
    #15 0x7fe706873c33 in tensorflow::EagerOperation::Execute(absl::lts_20230802::Span<tensorflow::AbstractTensorHandle*>, int*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x20ad4c33)
    #16 0x7fe7179bee5e in tensorflow::CustomDeviceOpHandler::Execute(tensorflow::ImmediateExecutionOperation*, tensorflow::ImmediateExecutionTensorHandle**, int*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x31c1fe5e)
    #17 0x7fe6f63f145b in TFE_Execute (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x1065245b)
    #18 0x7fe74dd9f274 in TFE_Py_FastPathExecute_C(_object*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../_pywrap_tensorflow_internal.so+0x3c2274)
    #19 0x7fe6e1d5cccb in pybind11::cpp_function::initialize<pybind11_init__pywrap_tfe(pybind11::module_&)::{lambda(pybind11::args)#61}, pybind11::object, pybind11::args, pybind11::name, pybind11::scope, pybind11::sibling>(pybind11_init__pywrap_tfe(pybind11::module_&)::{lambda(pybind11::args)#61}&&, pybind11::object (*)(pybind11::args), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/_pywrap_tfe.so+0xb7ccb)
    #20 0x7fe6e1e73899 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/_pywrap_tfe.so+0x1ce899)
    #21 0x51ad66  (/usr/bin/python3.11+0x51ad66)
    #22 0x4e75db in _PyObject_MakeTpCall (/usr/bin/python3.11+0x4e75db)
    #23 0x4fb151 in _PyEval_EvalFrameDefault (/usr/bin/python3.11+0x4fb151)
    #24 0x531822 in _PyFunction_Vectorcall (/usr/bin/python3.11+0x531822)
    #25 0x541194 in PyObject_Call (/usr/bin/python3.11+0x541194)
    #26 0x4fefe0 in _PyEval_EvalFrameDefault (/usr/bin/python3.11+0x4fefe0)
    #27 0x62e1b3  (/usr/bin/python3.11+0x62e1b3)
    #28 0x4f3a66 in PyEval_EvalCode (/usr/bin/python3.11+0x4f3a66)
    #29 0x647c36  (/usr/bin/python3.11+0x647c36)
    #30 0x64534f  (/usr/bin/python3.11+0x64534f)
    #31 0x650d14  (/usr/bin/python3.11+0x650d14)
    #32 0x650a63 in _PyRun_SimpleFileObject (/usr/bin/python3.11+0x650a63)
    #33 0x650832 in _PyRun_AnyFileObject (/usr/bin/python3.11+0x650832)
    #34 0x64f786 in Py_RunMain (/usr/bin/python3.11+0x64f786)
    #35 0x61ee0c in Py_BytesMain (/usr/bin/python3.11+0x61ee0c)
    #36 0x7fe7fbe3cd8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
    #37 0x7fe7fbe3ce3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
    #38 0x61ec94 in _start (/usr/bin/python3.11+0x61ec94)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/mnt//venv/tensorflow-2.16.1-asan/lib/python3.11/site-packages/tensorflow/python/platform/../../libtensorflow_cc.so.2+0x38bcb65b) in std::_Function_handler<void (long, long), tensorflow::functor::XentFunctorBase<Eigen::ThreadPoolDevice, double>::operator()(Eigen::ThreadPoolDevice const&, Eigen::DSizes<long, 2> const&, Eigen::array<long, 2ul> const&, Eigen::array<long, 2ul> const&, Eigen::TensorMap<Eigen::Tensor<double const, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double const, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 2, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<double, 2, 1, long>, 16, Eigen::MakePointer>)::{lambda(long, long)#1}>::_M_invoke(std::_Any_data const&, long&&, long&&)
==2083523==ABORTING
@Venkat6871

Hi @LongZE666 ,

  • Sorry for the delay. Could you please try a recent TF version? I tried with tf-nightly and could not reproduce the error, so it may already be fixed for an upcoming release. Please check the gist here.

Thank you!

Venkat6871 added the stat:awaiting response label on May 16, 2024
@LongZE666
Author

I can reproduce this problem on TensorFlow 2.16.1; the same code runs normally on tf-nightly.
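
Because the failure kills the interpreter, comparing builds is easier when the reproducer runs in a child process. The following is a minimal sketch, assuming a POSIX system where subprocess reports death by signal as a negative return code. run_isolated and REPRO are names invented for this example; the snippet inside REPRO is the one from the report.

```python
import signal
import subprocess
import sys

# The reproducer from the report, kept as a string so it can run in a child.
REPRO = """
import tensorflow as tf
features = tf.constant([], shape=[3, 0], dtype=tf.float64)
labels = tf.constant([], shape=[0], dtype=tf.float64)
tf.raw_ops.SoftmaxCrossEntropyWithLogits(features=features, labels=labels)
"""


def run_isolated(snippet, timeout=300):
    """Run a Python snippet in a child interpreter and describe how it ended.

    Hypothetical helper for this example. A crash in the child cannot take
    down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the fatal signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        # The child exited on its own with an error, e.g. a Python exception.
        return f"exited with status {proc.returncode}"
    return "ok"
```

Running run_isolated(REPRO) in an environment with an affected build would be expected to report a fatal signal such as SIGSEGV, while a build without the crash should return "ok" or an exit status. The same call can be repeated in separate virtual environments for 2.16.1 and tf-nightly.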

google-ml-butler (bot) removed the stat:awaiting response label on May 16, 2024
@Venkat6871

Hi @LongZE666 ,

  • Yes, I can also reproduce the same error with version 2.16.1, but it works fine with tf-nightly, so it may be fixed in an upcoming release.

Thank you!

Venkat6871 added the stat:awaiting response label on May 22, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions (bot) added the stale label on May 30, 2024

2 participants