Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preaggregations are not working from Metabase #8210

Open
Betilopeza opened this issue Apr 30, 2024 · 7 comments
Open

Preaggregations are not working from Metabase #8210

Betilopeza opened this issue Apr 30, 2024 · 7 comments
Labels
question The issue is a question. Please use Stack Overflow for questions.

Comments

@Betilopeza
Copy link

Betilopeza commented Apr 30, 2024

Hello! we are testing preaggregations to improve the latency for the questions to appear in Metabase dashboards. We are currently using:

  • Cube Cloud Latest (0.35.22)
  • Metabase Cloud v1.49.6
  • Google Cloud (have the prod_pre_aggregations dataset already saving this aggregations I did)

For some reason, when we create a preaggregation with measures that worked perfectly fine in our playground, but when queried from Metabase, are not preaggregated as shown below in the example.

image
image
image

The dynamic schema for that table is this one:

{% set account = "accountUuid" %}

{% set interaction = "interactionUuid" %}
{% set time = "time" %}
cubes:
  {%- for cube in load_interaction_tags()["cubes"] %}

  - name: {{ cube.name }}
    sql: {{ cube.sql }}
    dataSource: default

  {%- if cube.dimensions is not none and cube.dimensions|length > 0 %}
    pre_aggregations:
      - name: {{cube.name + "_FullTable" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.isDone
          - InteractionLocations.url
          - MGMLocations.Active_Status
          - MGMLocations.HOA_Type
          - MGMLocations.Plan_Type
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_LocationInteraction3" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.url
          - MGMLocations.Active_Status
          - MGMLocations.HOA_Type
          - MGMLocations.Plan_Type
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_LocationInteraction4" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.url
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_table" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
        refresh_key:
          every: 1 hour

      {%- for dimension in cube.dimensions %}

      - name: {{ dimension.name + "_filter" }} 
        dimensions:
          - {{ dimension.name }}
        refresh_key:
          every: 1 hour

      - name: {{ cube.name + "_distinctUuid" }} 
        measures:
          - distinctUuid
        dimensions:
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
        refresh_key:
          every: 1 hour

      {%- endfor %}

      - name: {{ cube.name + "_distinctUuid" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
        measures:
          - distinctUuid
        refresh_key:
          every: 1 hour

      - name: {{ cube.name + "_distinct" }} 
        measures:
          - distinctUuid
  {%- endif %}

  {%- if cube.joins is not none and cube.joins|length > 0 %}
    joins:
      {%- for join in cube.joins %}
      - name: {{ join.name }}
        relationship: {{ join.relationship }}
        sql: {{ join.sql }}
      {%- endfor %}
  {%- endif %}

  {%- if cube.measures is not none and cube.measures|length > 0 %}
    measures:
      {%- for measure in cube.measures %}
      - name: {{ measure.name }}
        type: {{ measure.type }}
      {%- if measure.sql %}
        sql: {{ measure.sql }}
      {%- endif %}
      {%- endfor %}
  {%- endif %}

  {%- if cube.dimensions is not none and cube.dimensions|length > 0 %}
    dimensions:
      {%- for dimension in cube.dimensions %}
      - name: {{ dimension.name }}
        sql: {{ dimension.sql }}
        type: {{ dimension.type }}
        {% if dimension.primaryKey == True -%}
        primaryKey: true
        public: true
        {% endif -%}
      {%- endfor %}
  {%- endif %}
  {%- endfor %}

And this is our cube.js

const {memoizedFetchAccountPassword} = require("./sql-auth")

function decodeBase64(data) {
    let buff = Buffer.from(data, 'base64');
    return buff.toString('ascii');
}

const deconstructGlobalId = (globalId) => {
    const decoded = decodeBase64(globalId)
    const globalIdParts = decoded.split(":")

    return {
        node: globalIdParts[0],
        uuid: globalIdParts[1]
    }
}

const contextToAppId = (context) => `CUBEJS_APP_${context.cacct}`

const extendContext = (req) => {
  // For SQL API (context is then handled by checkSqlAuth)
    if (req.headers === undefined || req.headers.cacct === undefined) {
        return;
    }

    try {
        const deconstructed = deconstructGlobalId(req.headers.cacct);
        return {cacct: deconstructed["uuid"]};
    } catch (err) {
        console.log("Error extending context: " + err)
    }
}

const queryRewrite = (query, request) => {
    console.log("Rewriting for", JSON.stringify(request))
    const accountId = request.cacct !== undefined ? request.cacct : request.securityContext.cacct;
    
    if (query["dimensions"].length > 0) {
        try {
            var table_name = query["dimensions"][0].substring(0, query["dimensions"][0].indexOf("."));
            } catch {
            var table_name = query["dimensions"][0]["cubeName"]
            }   
    }
    // measures in query
    else if (query["measures"].length > 0) {
        try {
            var table_name = query["measures"][0].substring(0, query["measures"][0].indexOf("."))
            } catch {
            var table_name = query["measures"][0]["cubeName"]
            }
    }

    else if (query["timeDimensions"].length > 0) {
        try {
        var table_name = query["timeDimensions"][0]["dimension"].substring(0, query["timeDimensions"][0]["dimension"].indexOf("."))
        } catch {
            var table_name = query["measures"][0]["cubeName"]
        }
    }

    else {
        console.log("Using account id:", accountId)
        console.log("The query:", query)
    }

    console.log("Using account id:", accountId);

    query.filters.push({
        member: `${table_name}.accountUuid`,
        operator: 'equals',
        values: [accountId],
    });

    return query;
}

const checkSqlAuth = async (req, username) => {
    // Remove this section for containing sensitive information
}

const queueOptions = {
  concurrency: 2,
  executionTimeout: 600,
  orphanedTimeout: 120,
  heartBeatInterval: 120,
};

module.exports = {
    http: {
        cors: {
            origin: '*',
            methods: 'GET,HEAD,PUT,PATCH,POST,DELETE',
            preflightContinue: false,
            allowedHeaders: ['Content-Type', 'Authorization', 'cacct'],
            optionsSuccessStatus: 204,
        },
    },
    scheduledRefreshTimer: 120,
    checkSqlAuth,
    contextToAppId,
    extendContext,
    queryRewrite,
    orchestratorOptions: {
        queryCacheOptions: {
            refreshKeyRenewalThreshold: 120,
            backgroundRenew: true,
            queueOptions,
        },
        preAggregationsOptions: {queueOptions},
    },
    // Figure out ScheduledRefreshContexts later! (probably needed for QueryRewrite)
    // Placeholder to prevent the error message:
    scheduledRefreshContexts: () => [
        {
            securityContext: {
                cacct: '00000000-0000-0000-0000-000000000000'
            }
        }
    ],
};

Please if you can check if we are doing something wrong, or if we are missing something, would be really helpful

@Betilopeza Betilopeza added the question The issue is a question. Please use Stack Overflow for questions. label Apr 30, 2024
@Betilopeza Betilopeza closed this as not planned Won't fix, can't repro, duplicate, stale Apr 30, 2024
@Betilopeza Betilopeza reopened this Apr 30, 2024
@paveltiunov
Copy link
Member

@Betilopeza It seems distinctUuid is a measure however you're using it in GROUP BY. I expect to see it inside COUNT(DISTINCT distinctUuid) aggregation reference. Could you please elaborate on your use case?

@Betilopeza
Copy link
Author

Betilopeza commented May 2, 2024

Hello, yes we tried in the group by, because in the measure sections the preaggregation was not working either! I send you the screenshots.
Preaggregation:
- name: {{ cube.name + "_distinct" }}
measures:
- distinctUuid

image (21)
image (22)
image (23)

@paveltiunov
Copy link
Member

@Betilopeza There're dimensions in this pre-aggregation and this pre-aggregation is non-additive. So it couldn't match a query without dimensions defined. Please see https://cube.dev/docs/product/caching/matching-pre-aggregations.

@Betilopeza
Copy link
Author

We are trying to make work this preaggregation:
Preaggregation:

  • name: {{ cube.name + "_distinct" }}
    measures:
  • distinctUuid

As you can see it actually doesn´t have any dimension, so it´s happenning what I send you in the previous message. In playground works perfectly, but not in metabase. I did't get why you said that we have dimensions

@paveltiunov
Copy link
Member

@Betilopeza Unfortunately we can't reproduce it on our side. If you're in Cube Cloud please prepare a branch with a description on how to reproduce and file a ticket with the support team. We'll take a look.

@Betilopeza
Copy link
Author

Betilopeza commented May 6, 2024

Thanks! Unfortunately, we haven't been able to reach support since February, so I don't think that would be an option.

@igorlukanin
Copy link
Member

Hi @Betilopeza, you can send the link to your branch with a reproduction and any relevant information to me via igor@cube.dev.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question The issue is a question. Please use Stack Overflow for questions.
Projects
None yet
Development

No branches or pull requests

3 participants