Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPU Aggregation (1/8): Aggregator and AggregationLayer #8886

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Pessimistress
Copy link
Collaborator

@Pessimistress Pessimistress commented May 12, 2024

For #7457

The goals of this refactor:

  • Unified Aggregator interface for both GPU aggregation and CPU aggregation
  • Simplified Aggregator and AggregationLayer classes that no longer concern:
    • Different props.data formats (iterable, non-iterable, binary...)
    • Projection systems
    • Specific prop names and update triggers
  • Fully typed and extensively documented

Planned PRs:

  • Aggregator and AggregationLayer
  • GPUAggregator
  • CPUAggregator
  • ScreenGridLayer
  • GridLayer
  • HexagonLayer
  • ContourLayer
  • Remove legacy code

Change List

  • New Aggregator interface
  • New AggregationLayer abstract class
  • Add isDrawable to Layer class to decouple isComposite from drawing behavior. AggregationLayer is a composite layer (renders sublayers) but also participate in the draw cycle.

@coveralls
Copy link

Coverage Status

coverage: 89.578% (-0.2%) from 89.821%
when pulling af9e3ff on x/aggregation-1
into b5a6313 on master.

Copy link
Collaborator

@donmccurdy donmccurdy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this direction a lot! 👏🏻

Left a few questions, nothing critical.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working through the PRs in order, apologies if these questions would be answered by a later PR. :)

Comment on lines +27 to +30
state!: {
gpuAggregator: GPUAggregator | null;
cpuAggregator: CPUAggregator | null;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question – It appears that you plan for both aggregators to have similar public APIs, does the layer need the binary distinction CPU and GPU aggregators as two different props?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, I would expect there to be only an aggregator prop here.

Comment on lines +37 to +40
/** Called to create a GPUAggregator instance */
abstract getGPUAggregator(): GPUAggregator | null;
/** Called to create a CPUAggregator instance if getGPUAggregator() returns null */
abstract getCPUAggregator(): CPUAggregator | null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - If this creates an instance rather than returning an already-assigned one, should it be prefixed 'createFoo' or 'makeFoo' etc.? Is there any assumption that this will only be called once? No objection if you prefer the 'get' prefix here, just wondering.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Why would we wan't to create aggregator in callback as opposed to have it passed in as a prop from the subclass?

* - The number of data points
* - The group that each data point belongs to, by mapping each data point to a _binId_ (integer or array of integers)
* - The value(s) to aggregate, by mapping each data point in each channel to one _weight_
* - The method (_aggregationOperation_) to reduce a list of _weights_ to one number, such as SUM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor – I felt like the term "weights" was a bit misleading in the previous iteration of aggregation... For me it implies that they are coefficients for something, rather than a bunch of numbers we're going to sum or average. I think "values" or "inputs" might work just as well here without that confusion?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without detailed knowledge, but based on your comment, I would vote for values. I would definitely expect weights to be an extra set of parameters used to rebalance things, not the values themselves.
(Though it can depend on perspective: e.g. in a graph, edges typically have weights not values.)

* - The aggregated values (_result_) as a list of numbers for each channel, comprised of one number per bin
* - The [min, max] among all aggregated values (_domain_) for each channel
*
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this summary! 🙏🏻

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Being able to reason about and define Aggregators separately and have a layer that accepts them follows the true spirit of vis.gl composability

I would mostly encourage another pass to remove differences between GPUAggregator and CPUAggregator from being visible in the AggregationLayer. I think you are almost there, see detailed notes.

import {Aggregator} from './aggregator';

// TODO
type GPUAggregator = Aggregator & {destroy(): void};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Why keep destroy() outside of the interface? Making it part of the interface (possibly optional if CPUAggregator just can't be bothered to implement it) would seem to make more sense to me.

Comment on lines +27 to +30
state!: {
gpuAggregator: GPUAggregator | null;
cpuAggregator: CPUAggregator | null;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, I would expect there to be only an aggregator prop here.

Comment on lines +37 to +40
/** Called to create a GPUAggregator instance */
abstract getGPUAggregator(): GPUAggregator | null;
/** Called to create a CPUAggregator instance if getGPUAggregator() returns null */
abstract getCPUAggregator(): CPUAggregator | null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Why would we wan't to create aggregator in callback as opposed to have it passed in as a prop from the subclass?

if (params.changeFlags.extensionsChanged) {
this.state.gpuAggregator?.destroy();
this.state.gpuAggregator = this.getGPUAggregator();
if (this.state.gpuAggregator) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: probably better "ask" the aggregator instead of such switching on aggregator "type".

Along the lines of.

if (this.state.aggregator.usesAttributeManager) {

// Override Layer.finalizeState to dispose the GPUAggregator instance
finalizeState(context: LayerContext) {
super.finalizeState(context);
this.state.gpuAggregator?.destroy();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: IMHO, it would look nicer to call destroy() on both here, even if it was a no-op on the cpuAggregator.


export type AggregationOperation = 'SUM' | 'MEAN' | 'MIN' | 'MAX';

export type AggregationProps = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: TSDoc on the type as well.

};

/**
* _Aggregation_ is a 2-step process:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe start with a more general sentence about the aggregator (perhaps being a base class etc) before diving into the details of the algorithm?

* - The number of data points
* - The group that each data point belongs to, by mapping each data point to a _binId_ (integer or array of integers)
* - The value(s) to aggregate, by mapping each data point in each channel to one _weight_
* - The method (_aggregationOperation_) to reduce a list of _weights_ to one number, such as SUM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without detailed knowledge, but based on your comment, I would vote for values. I would definitely expect weights to be an extra set of parameters used to rebalance things, not the values themselves.
(Though it can depend on perspective: e.g. in a graph, edges typically have weights not values.)

getResultDomain(channel: number): [min: number, max: number];

/** Returns the information for a given bin. */
getBin(index: number): {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe worth breaking out as type for the Bin or AggregatedBin or ...

@@ -41,6 +41,11 @@ export default abstract class CompositeLayer<PropsT extends {} = {}> extends Lay
return true;
}

/** `true` if the layer renders to screen */
get isDrawable(): boolean {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Is this extra wrinkle worth it? Should we just start calling draw on composite layers and just make sure we provide an empty implementation?

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and extensively documented

The Aggregator class deserves extensive user facing documentation, to help end-users define their own aggregators and that could probably be copy pasted from your voluminous comments in the source.

@Pessimistress
Copy link
Collaborator Author

@ibgreen I will add markdown documentation once the API discussion settles. I didn't want to write them prematurely as you can't type check those.

static layerName = 'AggregationLayer';

state!: {
gpuAggregator: GPUAggregator | null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if rather than specifying CPU/GPU as options, the layer could instead accept a list of aggregators, and then pick whichever is compatible (similar to how luma does with the Device). We may want to add support for a webGpuAggregator in addition to webGlAggregator, or who knows what else in the future

@@ -0,0 +1,65 @@
import type {Attribute, BinaryAttribute} from '@deck.gl/core';

export type AggregationOperation = 'SUM' | 'MEAN' | 'MIN' | 'MAX';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding 'COUNT'? For the GPU aggregation case it would avoid uploading a buffer containing just 1s

data: LayerDataSource<DataT>;
};

export default abstract class AggregationLayer<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you thought about how this could work with a TileLayer? I'm working on a cluster layer right now which works with a tiled data source, and aggregates whenever the visible data changes (outputting a single ScatterplotLayer)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants