-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: slow query log does not report cancellation error code #122722
Comments
As a first attempt, let's see if we can move the logging code into cockroach/pkg/sql/conn_executor_exec.go Lines 1861 to 1918 in e9d1b46
|
Moving to the observability team as discussed with @nkodali |
I tested this out today and it looks like we can move the logging into |
Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: cockroachdb#122722 Release note: None
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: cockroachdb#122722 Release note: None
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Preserving pausible portal logging behaviour: There are some nuances in logging when we have a pausable portal that must be preserved. For pausible portals, logging was triggered as a cleanup action in `execStmtInOpenState` by appending it to a list of cleanup fns in `dispatch`. This is now accomplished by performing logging in the `processCleanupFunc` helper within `execStmtInOpenState`. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
123544: cli: debugzip sensitive fields redaction guarded through cluster settings r=arjunmahishi a=kousiknath The PR #122181 enabled redaction of hostname and ip address in the generated debug zip. But this PR puts that functionality behind a cluster setting - `debug.zip.redact_addresses.enabled` (which is enabled by default). So, if the user wants to skip the redaction of hostname and ip address, they can set this setting to `false` before generating the debug zip. --- ### Possible combinations of `--redact` and `debug.zip.redact_addresses.enabled` | `--redact` | `debug.zip.redact_addresses.enabled` | Hostname, IP address | |--------|--------|--------| | `true` | `true` | Redacted | | `true` | `false` | Not Redacted | | `false` | `true` | Not Redacted | | `false` | `false` | Not Redacted | --- Part of: CRDB-38270 Epic: None Release note (ops change): Add cluster setting `debug.zip.redact_addresses.enabled` which allows the user to enable/disable redaction of fields like hostname and IP addresses 124250: sql,sqlstats: lift out fields required for logging from `dispatchToExecutionEngine` r=abarganier a=xinhaoz **sql: record topLevelQueryStats in instrumentationhelper** Previously topLevelQueryStats were being passed directly into the logging function after execution in dispatchToExecutionEngine. This commit persists the top level stats of the query execution into the planner's instrumentation helper, similar to what we do with sampled query exec stats. This allows us to move the logging operation in dispatchToExecutionEngine to `execStmtInOpenState`. Part of: #122722 Release note: None **sql, sqlstats: construct statement fingerprint id on demand when logging** Currently we construct the stmt fingerprint id at various places to pass to the logging function. We only need the statement fingerprint id for telemetry logging, so this construction is wasteful in other scenarios. This commit moves the stmt fingerprint id generation to the logging function. Summary: - Record the current stmt fingerprint id in the stats collector when writing sql stats. This value is reset at the start of stmt exec. - Construct stmt fingerprint at the time of logging based on the AST in the planner if it was not available in the stats collector. This change is motivated by the desire to move logging from `disatpchToExecutionEngine` to `execStmtInOpenState`. See issue for details. Part of: #122722 Release note: None 124495: logictest: deflake synthetic_privileges test r=rafiss a=rafiss The test was relying on behavior from before a65fea7, where the connection would not be closed for up to 1 second after being cancelled with CANCEL SESSION. After that commit, the connection is closed nearly immediately, so the test was much more likely to hit an error when trying to reuse the connection. I confirmed this was the bug by reverting a65fea7, then adding a sleep in the test before switching to the testuser connection. That made the test hit the same error. fixes #124449 Release note: None Co-authored-by: kousiknath <kousik@cockroachlabs.com> Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging. Preserving pausible portal logging behaviour: There are some nuances in logging when we have a pausable portal that must be preserved. For pausible portals, logging was triggered as a cleanup action in `execStmtInOpenState` by appending it to a list of cleanup fns in `dispatch`. This is now accomplished by performing logging in the `processCleanupFunc` helper within `execStmtInOpenState`. Epic: none Fixes: cockroachdb#122722 Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Describe the problem
The slow query log includes error codes for slow queries. The error code is incorrect for queries that timed out due to
statement_timeout
or that were canceled explicitly.To Reproduce
Now open
cockroach-sql-slow.log
. It contains:Expected behavior
The first line in those logs is correct. But the latter 2 should have an error code of
57014
(instead ofXXUUU
).Environment:
Tested this on v23.2.1, but all versions are affected.
Additional context
The issue is that the error code is added here
cockroach/pkg/sql/conn_executor_exec.go
Lines 479 to 484 in e9d1b46
and here
cockroach/pkg/sql/conn_executor_exec.go
Lines 522 to 529 in e9d1b46
That error rewriting logic is in
execStmtInOpenState
. The slow query log is written indispatchToExecutionEngine
. SinceexecStmtInOpenState
is a caller ofdispatchToExecutionEngine
, that means the error code is only added after the log is written.see https://cockroachlabs.slack.com/archives/C06PEGL22UX/p1713373695222239?thread_ts=1713193977.393989&cid=C06PEGL22UX
Jira issue: CRDB-38034
Epic CRDB-38033
The text was updated successfully, but these errors were encountered: