Summary and count cause performance issues on large datasets #37

@markbrough

With very large datasets (e.g. 13 million rows), computing the summary and count appears to slow down the response significantly:

babbage/babbage/cube.py

Lines 89 to 96 in 9416105

# Count
count = count_results(self, prep(cuts,
                                 drilldowns=drilldowns,
                                 columns=[1])[0])
# Summary
summary = first_result(self, prep(cuts,
                                  aggregates=aggregates)[0].limit(1))

Without generating the summary and count, the response is returned 2-3 times faster.

It would be useful to make returning these properties optional, e.g. by adding an optional &simple parameter to the request.
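
To illustrate the idea, here is a minimal sketch of an aggregate call that skips the two extra queries when a flag is set. It is not babbage's actual code: it runs against an in-memory SQLite table, and the names `aggregate`, `simple`, `demo_connection`, `facts`, `region` and `amount`, as well as the result keys, are all hypothetical and chosen only for the example.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory fact table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO facts (region, amount) VALUES (?, ?)",
        [("north", 10.0), ("north", 20.0), ("south", 30.0)],
    )
    return conn


def aggregate(conn, simple=False):
    """Aggregate `amount` by `region`.

    When `simple` is true, the count and summary queries are skipped,
    so only one query is sent to the database instead of three.
    """
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM facts "
        "GROUP BY region ORDER BY region"
    ).fetchall()
    result = {
        "cells": [{"region": r, "amount_sum": s} for r, s in rows],
    }

    if not simple:
        # Count: number of drilldown groups (second pass over the data).
        result["count"] = conn.execute(
            "SELECT COUNT(*) FROM "
            "(SELECT 1 FROM facts GROUP BY region) AS g"
        ).fetchone()[0]
        # Summary: the aggregate over all rows (third pass over the data).
        result["summary"] = {
            "amount_sum": conn.execute(
                "SELECT SUM(amount) FROM facts"
            ).fetchone()[0]
        }

    return result


if __name__ == "__main__":
    conn = demo_connection()
    print(aggregate(conn))               # cells, count and summary
    print(aggregate(conn, simple=True))  # cells only
```

In a real request handler, the flag would presumably be read from the query string (e.g. the presence of `simple`) and passed through to the aggregate call. Defaulting it to off keeps existing clients unaffected.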
