
Rollup joins across different data sources with different param allocators causing invalidations #8268

Open
pauldheinrichs opened this issue May 15, 2024 · 0 comments · May be fixed by #8279


pauldheinrichs commented May 15, 2024

Describe the bug
When attempting to run orchestration on a Druid data source to rebuild pre-aggregations, the following error is emitted in the API logs.

--- EDIT ---

This may be an underlying issue with something I'm working through. My API node is frequently assuming my Druid models use my default driver's param allocator, in more than just orchestration. It is in fact invalidating my pre-aggregations because it assumes the default driver. I do not have a cube.js file in my instance.

Slack thread: https://cube-js.slack.com/archives/C04NYBJP7RQ/p1715737464066809

(If I'm just doing it wrong... let me know 😓)

{
  "action": "post",
  "selector": {
    "contexts": [{ "securityContext": {} }],
    "timezones": ["UTC"],
    "dataSources": ["druid"]
  }
}
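
For context, this is roughly how I post that body to the Orchestration API. A minimal sketch: /cubejs-system/v1/pre-aggregations/jobs is the documented jobs endpoint, while CUBE_URL, CUBE_TOKEN, and the helper name are placeholders for my deployment.

const CUBE_URL = 'http://localhost:4000'; // placeholder deployment URL
const CUBE_TOKEN = '<JWT signed with the API secret>'; // placeholder auth token

async function triggerPreAggregationBuild() {
  // POST the same selector as above to the pre-aggregations jobs endpoint
  const res = await fetch(`${CUBE_URL}/cubejs-system/v1/pre-aggregations/jobs`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: CUBE_TOKEN,
    },
    body: JSON.stringify({
      action: 'post',
      selector: {
        contexts: [{ securityContext: {} }],
        timezones: ['UTC'],
        dataSources: ['druid'],
      },
    }),
  });
  console.log(await res.json());
}

triggerPreAggregationBuild();

The log output that follows is what comes back after running this.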
{"message":"Downloading external pre-aggregation via query error","queryKey":
[["CREATE TABLE cube_production_pre_aggregations.video_views_video_rollup_count20240415 AS SELECT\n      \"video_views\".video_id \"video_views__video_id\", DATE_TRUNC('day', CAST(TIME_FORMAT(\"video_views\".__time, 'yyyy-MM-dd HH:mm:ss', 'UTC') AS TIMESTAMP)) \"video_views__time_day\", sum(\"video_views\".views) \"video_views__view_count\"\n    FROM\n      TeamVideoView AS \"video_views\"  WHERE (\"video_views\".__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2",["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],{}],[[{"refresh_key":null}]]],"query":"SELECT\n      \"video_views\".video_id \"video_views__video_id\", DATE_TRUNC('day', CAST(TIME_FORMAT(\"video_views\".__time, 'yyyy-MM-dd HH:mm:ss', 'UTC') AS TIMESTAMP)) \"video_views__time_day\", sum(\"video_views\".views) \"video_views__view_count\"\n    FROM\n      TeamVideoView AS \"video_views\"  WHERE (\"video_views\".__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2","values":["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],"targetTableName":"cube_production_pre_aggregations.video_views_video_rollup_count20240415_lnicrtcl_403ejwb4_1j4a3mu","requestId":"6e558a4f-3de1-4e19-a350-fb41acb687a6","newVersionEntry":{"table_name":"cube_production_pre_aggregations.video_views_video_rollup_count20240415","structure_version":"403ejwb4","content_version":"lnicrtcl","last_updated_at":1715801822362,"naming_version":2},"buildRangeEnd":"2024-04-21T23:59:59.999","error":"Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table.__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2","values":["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],"targetTableName":"cube_production_pre_aggregations.video_views_video_rollup_count20240415_lnicrtcl_403ejwb4_1j4a3mu","requestId":"6e558a4f-3de1-4e19-a350-fb41acb687a6","newVersionEntry":{"table_name":"cube_production_pre_aggregations.video_views_video_rollup_count20240415","structure_version":"403ejwb4","content_version":"lnicrtcl","last_updated_at":1715801822362,"naming_version":2},"buildRangeEnd":"2024-04-21T23:59:59.999","error":"Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table

Focusing specifically on Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table

at /cube/node_modules/@cubejs-backend/druid-driver/src/DruidClient.ts:92:19
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at DruidDriver.downloadQueryResults (/cube/node_modules/@cubejs-backend/druid-driver/src/DruidDriver.ts:140:31)
at PreAggregationLoader.refreshReadOnlyExternalStrategy (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:1154:19)
at Object.query (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryCache.ts:602:26)
at QueryQueue.processQuery (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:842:25)

To Reproduce
Steps to reproduce the behavior:

1. Set up a Druid data source alongside a default one (Postgres / Redshift in my case):
   CUBEJS_DATASOURCES: "default,druid"
2. Set up a pre-aggregation on the Druid cube with a refresh key like this (a fuller sketch of the cube follows these steps):
   refreshKey: {
     every: `1 hour`,
     incremental: true,
   },
3. Hit the Orchestration API with the API call above.
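
For reference, here is a sketch of the Druid-backed cube this reproduces against. The cube, measure, and dimension names are reconstructed from the generated SQL in the error log above; the partitionGranularity and field types are my assumptions.

cube(`video_views`, {
  dataSource: `druid`,
  sql: `SELECT * FROM TeamVideoView`,

  measures: {
    view_count: {
      sql: `views`,
      type: `sum`,
    },
  },

  dimensions: {
    video_id: {
      sql: `video_id`,
      type: `string`, // assumed type
    },
    time: {
      sql: `__time`,
      type: `time`,
    },
  },

  preAggregations: {
    video_rollup_count: {
      measures: [CUBE.view_count],
      dimensions: [CUBE.video_id],
      timeDimension: CUBE.time,
      granularity: `day`,
      // assumed: the 20240415 table suffix covers 2024-04-15..2024-04-21
      partitionGranularity: `week`,
      refreshKey: {
        every: `1 hour`,
        incremental: true,
      },
    },
  },
});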

Expected behavior
It seems this process isn't using the Druid driver's expected `?` placeholder, but rather the default driver's param allocator (Redshift / Postgres `$1` syntax in my case).

It would appear that this method is being called with `$1` args already passed in:

[screenshot of the method call omitted]
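
To make the placeholder mismatch concrete (illustrative strings only, not Cube internals):

// What the log shows being generated: the default (Postgres/Redshift)
// param allocator's numbered placeholders...
const postgresStyle = `WHERE __time >= TIME_PARSE($1) AND __time <= TIME_PARSE($2)`;
// ...but Druid SQL only accepts positional `?` placeholders, so Calcite
// parses `$1` as a column reference and fails with
// "Column '$1' not found in any table":
const druidStyle = `WHERE __time >= TIME_PARSE(?) AND __time <= TIME_PARSE(?)`;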

Version:
Latest version

pauldheinrichs changed the title from "Orchestration API on multi datasource using improper driver" to "Multi datasource using improper driver for Orchestration / Pre-Aggregation cache checks" on May 16, 2024
pauldheinrichs changed the title to "Druid Driver assuming wrong parameter argument for partitioned pre-aggregation" on May 17, 2024
pauldheinrichs changed the title to "Druid Driver assuming wrong parameter argument for incremental partitioned pre-aggregation" on May 17, 2024
pauldheinrichs changed the title to "Rollup joins across different data source with different param allocators causing invalidations" on May 19, 2024
pauldheinrichs changed the title to "Rollup joins across different data sources with different param allocators causing invalidations" on May 19, 2024