Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch use-after-free bugs in usage of shared ptrs using Clang's attributes #2029

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tchaikov
Copy link
Contributor

@tchaikov tchaikov commented Jan 4, 2024

Clang provides consumed annotations, which are very useful for statically catching use-after-free bugs in specific classes.

That's done by:

  1. Marking class as consumable with: clang::consumable().
  2. Annotates contructors with clang::return_typestate(), to inform new objects are in "unconsumed" state.
  3. Set state of objects moved as "consumed" using clang::set_typestate()
  4. Methods that access the object will be annotated with clang::callable_when(unconsumed).

This way, clang screams whenever a code path tries to access an object that was moved, i.e. has consumed state.

For years, we have been suffering with user-after-free bug around smart pointers.

Therefore, both shared_ptr and lw_shared_ptr will now benefit from this static analysis (not enabled by default).

To enable it, one must use clang, of course, and also provide the -Wconsumed flag. As the analysis can report false positives, it's important to disable -Werror (possibly by overriding it with -Wno-error), so to allow the compilation to run to its completion.

This tool was able to find 4 use-after-free bugs in Scylla so far.

Example of report by clang once it hits a possible use-after-free:
"sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator*'
on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed]
                legacy_reverse_slice_to_native_reverse_slice(*schema, slice.get()),
                        pc, std::move(trace_state), fwd, fwd_mr, monitor);"

@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 4, 2024

this change is copied from https://groups.google.com/g/seastar-dev/c/hKxNokDaHMY/m/t92rRO4dAgAJ , in hope that it can facilitate the review process.

v2:

  • expand the macros for defining the attributes
  • pass -Wno-attributes=clang:: as a public compile option if the compiler is not Clang, per Avi's suggestion.

@@ -608,12 +635,15 @@ public:
explicit operator bool() const noexcept {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quote from Avi's review comment at https://groups.google.com/g/seastar-dev/c/hKxNokDaHMY/m/etJQnAhJAQAJ

What about this?

Copy link
Contributor Author

@tchaikov tchaikov Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method only accesses _p, and does not de-reference or mutate it, so it does not depends on any preconditions imposed on the consumed/unconsumed state of the instance, neither does it set the consumed/unconsumed state, so no need to annotate it.

@tchaikov tchaikov changed the title Catch use-after-free bugs in usage of shared ptrs using Clang's med attributes Catch use-after-free bugs in usage of shared ptrs using Clang's attributes Jan 4, 2024
@tchaikov tchaikov force-pushed the catch-use-after-free-clang-attr branch 4 times, most recently from 766380e to 81c4911 Compare January 4, 2024 05:23
@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 4, 2024

@avikivity @raphaelsc hi Avi and Raphael, could you help review this change?

@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 4, 2024

FWIW, i compiled scylla master HEAD with this change, no -Wconsumed was identified.

@tchaikov tchaikov force-pushed the catch-use-after-free-clang-attr branch from 81c4911 to 94f6744 Compare January 5, 2024 01:22
@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 5, 2024

v3:

  • corrected the attribute lw_shared_ptr::invalidate() to clang::set_typestate(consumed).

@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 5, 2024

@raphaelsc hi Raph, thank you for your review! could you help re-review this change?

@tchaikov
Copy link
Contributor Author

tchaikov commented Jan 6, 2024

@scylladb/seastar-maint hello maintainers, could you help review this change?

@raphaelsc
Copy link
Member

ping for merge.

Copy link
Contributor

@nyh nyh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first time I'm seeing this PR (I believe @avikivity reviewed it in the past), so I'm unfamiliar with the clang feature. Looks like a nice feature, but if I understand correctly, it doesn't actually do anything if you don't explicitly add a -Wconsumed to your build? If that's the case, don't you want to add -Wconsumed by default? If not, why not - and do we need to document this anywhere so users will find it? Do you plan to enable it by default in the ScyllaDB project?

I left a small question that probably reflects more on my ignorance than on the correctness of this patch.

@@ -738,35 +766,35 @@ SEASTAR_MODULE_EXPORT_BEGIN
template <typename T, typename U>
inline
bool
operator==(const shared_ptr<T>& x, const shared_ptr<U>& y) {
operator==(const shared_ptr<T>& x [[clang::param_typestate(unconsumed)]], const shared_ptr<U>& y [[clang::param_typestate(unconsumed)]]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these annotations needed or helpful? This function calls x.get() on x, so should already fail if x is consumed. Or maybe the "unknown" case is the difference here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think so. dropped in the latest revision. actually. i don't think we should enforce any expectation here. as any shared_ptr can be compared with any shared_ptr. and neither should we enforce any expectation when calling get(), as shared_ptr::get() should behave no matter it is consumed or not.

T* operator->() const noexcept {
return _p;
}
[[clang::callable_when("unconsumed", "unknown")]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is "unknown" ok?

Copy link
Contributor Author

@tchaikov tchaikov Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is based on practical reasons. the state of a consumable object is based on annotations, but not all the code paths are annotated. in the code paths where the annotation is missing, and compiler is not able to deduce the state of the object, it's state would be "unknown", but we should not fail the build just because the compiler is not able to figure out the runtime behavior without enough annotations. actually, we have lots of them in scylla.

to make the -Wconsume useful before annotating all of them, adding "unknown" is a good compromise. probably is a must.

@tchaikov
Copy link
Contributor Author

tchaikov commented Feb 2, 2024

This is the first time I'm seeing this PR (I believe @avikivity reviewed it in the past), so I'm unfamiliar with the clang feature. Looks like a nice feature, but if I understand correctly, it doesn't actually do anything if you don't explicitly add a -Wconsumed to your build? If that's the case, don't you want to add -Wconsumed by default? If not, why not - and do we need to document this anywhere so users will find it? Do you plan to enable it by default in the ScyllaDB project?

thank you Nadav, this makes a lot of sense! we should both enable and expose this compile option as a public cflag exposed to the parent project. -Wconsumed is not enabled by the existing compiling options. i just added in the latest revision. and tested using following test:

class [[clang::consumable(unconsumed)]] CleverObject {
public:
    [[clang::return_typestate(unconsumed)]]
    CleverObject() {}

    CleverObject(CleverObject&& other) { other.invalidate(); }

    [[clang::callable_when(unconsumed)]]
    void do_something() { assert(m_valid); }

private:
    [[clang::set_typestate(consumed)]]
    void invalidate() { m_valid = false; }

    bool m_valid { true };
};

SEASTAR_THREAD_TEST_CASE(test_foo) {
    CleverObject object;
    auto other = std::move(object);
    object.do_something();
}

and the build failed as expected

error: invalid invocation of method 'do_something' on object 'object' while it is in the 'consumed' state [-Werror,-Wconsumed]
  219 |     object.do_something();
      |            ^
1 error generated.

…med attributes

Clang provides consumed annotations, which are very useful for
statically catching use-after-free bugs in specific classes.

That's done by:
1) Marking class as consumable with: clang::consumable().
2) Annotates contructors with clang::return_typestate(), to inform
new objects are in "unconsumed" state.
3) Set state of objects moved as "consumed" using clang::set_typestate()
4) Methods that access the object will be annotated with
clang::callable_when(unconsumed).

This way, clang screams whenever a code path tries to access an
object that was moved, i.e. has consumed state.

For years, we have been suffering with user-after-free bug around
smart pointers.

Therefore, both shared_ptr and lw_shared_ptr will now benefit from
this static analysis (not enabled by default).

To enable it, one must use clang, of course, and also provide the
-Wconsumed flag. As the analysis can report false positives,
it's important to disable -Werror (possibly by overriding it with
-Wno-error), so to allow the compilation to run to its completion.

This tool was able to find 4 use-after-free bugs in Scylla so far.

```
Example of report by clang once it hits a possible use-after-free:
"sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator*'
on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed]
                legacy_reverse_slice_to_native_reverse_slice(*schema, slice.get()),
                        pc, std::move(trace_state), fwd, fwd_mr, monitor);"
```

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
@tchaikov tchaikov force-pushed the catch-use-after-free-clang-attr branch from 94f6744 to a6c3f06 Compare February 2, 2024 06:12
@tchaikov
Copy link
Contributor Author

tchaikov commented Feb 2, 2024

clang-17 is not able to compile with -Wconsume enabled:

[29/349] Building CXX object CMakeFiles/seastar.dir/src/core/metrics.cc.o
FAILED: CMakeFiles/seastar.dir/src/core/metrics.cc.o 
/usr/bin/clang++-17 -DBOOST_NO_CXX98_FUNCTION_BASE -DFMT_SHARED -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT -DSEASTAR_HAS_MEMBARRIER -DSEASTAR_HAVE_ASAN_FIBER_SUPPORT -DSEASTAR_HAVE_HWLOC -DSEASTAR_HAVE_NUMA -DSEASTAR_HAVE_SYSTEMTAP_SDT -DSEASTAR_HAVE_URING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_PTHREAD_ATTR_SETAFFINITY_NP -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_STRERROR_R_CHAR_P -DSEASTAR_THREAD_STACK_GUARDS -DSEASTAR_TYPE_ERASE_MORE -Dseastar_EXPORTS -I/home/circleci/project/include -I/home/circleci/project/build/debug/gen/include -I/home/circleci/project/build/debug/gen/src -I/home/circleci/project/src -g -std=gnu++17 -fPIC -U_FORTIFY_SOURCE -Wno-error=unused-result -Wconsumed "-Wno-error=#warnings" -fstack-clash-protection -UNDEBUG -Wall -Werror -Wimplicit-fallthrough -Wdeprecated -Wno-error=deprecated -gz -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT CMakeFiles/seastar.dir/src/core/metrics.cc.o -MF CMakeFiles/seastar.dir/src/core/metrics.cc.o.d -o CMakeFiles/seastar.dir/src/core/metrics.cc.o -c /home/circleci/project/src/core/metrics.cc
/home/circleci/project/src/core/metrics.cc:369:19: error: state of variable 'res_ref' must match at the entry and exit of loop [-Werror,-Wconsumed]
  369 |     for (auto&& i : functions) {
      |                   ^
1 error generated.

while clang-18 is able to build the tree.

@tchaikov
Copy link
Contributor Author

tchaikov commented Feb 2, 2024

even with clang-19 (c761b4a5e4cc003a2c850898e1dc67d2637cfb0c), i still have build failures like

FAILED: compaction/CMakeFiles/compaction.dir/Debug/task_manager_module.cc.o 
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -I/home/kefu/dev/scylladb/build/gen -isystem /home/kefu/dev/scylladb/build/rust -g -O0 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wignored-qualifiers -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -U_FORTIFY_SOURCE -Werror=unused-result -Wconsumed "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT compaction/CMakeFiles/compaction.dir/Debug/task_manager_module.cc.o -MF compaction/CMakeFiles/compaction.dir/Debug/task_manager_module.cc.o.d -o compaction/CMakeFiles/compaction.dir/Debug/task_manager_module.cc.o -c /home/kefu/dev/scylladb/compaction/task_manager_module.cc
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:224:13: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  224 |     while (!table_tasks.empty()) {
      |             ^
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:276:13: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  276 |     while (!keyspace_tasks.empty()) {
      |             ^
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:400:19: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  400 |     for (auto& ti : _local_tables) {
      |                   ^
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:461:19: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  461 |     for (auto& ti : _local_tables) {
      |                   ^
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:496:19: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  496 |     for (auto& ti : _table_infos) {
      |                   ^
/home/kefu/dev/scylladb/compaction/task_manager_module.cc:525:19: error: state of variable 'current_task' must match at the entry and exit of loop [-Werror,-Wconsumed]
  525 |     for (auto& ti : _table_infos) {
      |                   ^
6 errors generated.

i am afraid that this is a show stopper.

@tchaikov
Copy link
Contributor Author

tchaikov commented Feb 2, 2024

i will revisit this change later on. before that, let's mark this change "draft".

@tchaikov tchaikov marked this pull request as draft February 2, 2024 06:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants