3.0 #101

bmschmidt · 2023-12-07T17:21:04Z

Major refactor.

3.0.0 includes a number of pent-up breaking changes. The underlying motivation for many of these is to allow the library to fully pass TypeScript compilation tests, with all the stability benefits that provides.

Breaking changes:

The library is now structured as named exports, rather than a single default export. Instead of
```
import Scatterplot from 'deepscatter';
```
a typical first line will be
```
import { Scatterplot } from 'deepscatter';
```
This allows the export of several types that we've found useful at Nomic for advanced scatterplot functions. The initial set of exported items is {Dataset, Bitmask, Scatterplot}.
The distinction between QuadTile and ArrowTile has been eliminated in favor of Tile, and with it the need to provide generics around them throughout the system. Similarly, QuadTileDataset and ArrowDataset are both removed in favor of Dataset.
Instead, the TileProxy object is used to provide a wrapper that can turn anything into a dataset. Although datasets are presumed to be quadtiles right now, formally they can be any collection of Arrow batches structured as a tree. (This is increasingly how I've come to think of the data parts of deepscatter: as a system for navigating dataframes that consist of trees rather than of linear lists of points.)
Deepscatter no longer accepts strings as direct arguments to Scatterplot.plotAPI in places where they were previously cast to functions as lambdas, because linters rightfully get crazy mad about the unsafe use of eval. If you want to use deepscatter in scrollytelling contexts where defining functions as strings inside JSON is convenient (I will still do this myself in static sites), you must turn them into functions before passing them into deepscatter.
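
A minimal sketch of that conversion step, assuming the strings stored in JSON are arrow-function sources such as "d => d * 2". The helper name `lambdaFromString` and the JSON shape are hypothetical and not part of deepscatter. The sketch relies on the standard `Function` constructor, which carries the same risks as eval, so it should only be run on strings you control.

```typescript
// Hypothetical helper (not part of deepscatter): turn a trusted string such as
// "d => d * 2" into a real function before handing it to the plot API.
type NumericLambda = (d: number) => number;

function lambdaFromString(source: string): NumericLambda {
  // The Function constructor evaluates code much like eval does, so only
  // call this on strings you wrote yourself (e.g. from your own static JSON).
  const candidate: unknown = new Function(`"use strict"; return (${source});`)();
  if (typeof candidate !== 'function') {
    throw new TypeError(`Expected a function expression, got: ${source}`);
  }
  return candidate as NumericLambda;
}

// Example: a scrollytelling step stored as JSON with a stringified lambda.
// The "label" and "lambda" keys are made up for this sketch.
const step = JSON.parse('{"label": "double", "lambda": "d => d * 2"}') as {
  label: string;
  lambda: string;
};
const doubled = lambdaFromString(step.lambda);
console.log(doubled(21));
```

Doing the conversion in your own code keeps the eval-like step visible at the call site, which is where a linter exception can be scoped narrowly.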
Shortcuts for passing position and position0 rather than naming the x and y dimensions explicitly have been removed.
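
For illustration, an encoding that names both dimensions explicitly might look like the sketch below. The `{ field, transform }` channel shape and the column names are assumptions based on common deepscatter usage, not something this PR specifies; the local types exist only to keep the sketch self-contained.

```typescript
// Local stand-in types so the sketch compiles on its own; the real
// deepscatter encoding types may differ (assumption).
type PositionChannel = { field: string; transform?: string };
type PositionEncoding = { x: PositionChannel; y: PositionChannel };

// Name x and y explicitly instead of relying on a position shortcut.
const encoding: PositionEncoding = {
  x: { field: 'x', transform: 'literal' },
  y: { field: 'y', transform: 'literal' },
};

console.log(Object.keys(encoding));
```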
The behavior of categorical scales in certain circumstances has been tightened; it is possible, as a result, that places where it was previously possible to treat categorical scales as numbers (referring to the underlying ints) will no longer work. I am not aware of specific such issues at the moment, and will respond quickly to address any that come up.
Dataset and Tile objects can now be instantiated with a manifest that allows listing all the tiles in a dataset. When passed, this allows a dataset to instantiate all tiles at creation time without actually loading any data. This represents a major change for any code that accesses the Tile.record_batch attribute, because it may now error on well-formed tiles since the presence of data is no longer necessary for something to be a Tile. Additionally, the Tile.ready promise has been retired; instead, to check whether the necessary data exists, you must explicitly check Tile.hasLoadedColumn('foo').
(Another way of expressing this change is that where previously there was a 'primary record batch' and 'sidecar batches', in version 3.0 of deepscatter this distinction is much less important; it is possible, for example, to draw a scatterplot without loading the x and y columns if other columns are passed to encoding.x and encoding.y.)
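
A minimal sketch of the explicit check, written against a stand-in interface rather than the real Tile class. Only the `hasLoadedColumn` method name comes from the description above; `TileLike`, `tilesWithColumn`, the `key` field, and the fake tiles are hypothetical and exist only to make the sketch self-contained.

```typescript
// Stand-in for the part of Tile this sketch needs (assumption: the real
// method takes a column name and returns a boolean synchronously).
interface TileLike {
  key: string;
  hasLoadedColumn(name: string): boolean;
}

// Hypothetical helper: keep only tiles whose data for `column` is present,
// instead of assuming every Tile already carries a record batch.
function tilesWithColumn<T extends TileLike>(tiles: T[], column: string): T[] {
  return tiles.filter((tile) => tile.hasLoadedColumn(column));
}

// Fake tiles for demonstration: one with the 'foo' column loaded, and one
// that exists (say, from a manifest) but has no data yet.
const makeTile = (key: string, loaded: string[]): TileLike => ({
  key,
  hasLoadedColumn: (name: string) => loaded.indexOf(name) !== -1,
});

const tiles = [makeTile('0/0/0', ['foo']), makeTile('1/0/0', [])];
const ready = tilesWithColumn(tiles, 'foo');
console.log(ready.map((t) => t.key));
```

Code that previously awaited Tile.ready and then read the record batch would, under this pattern, filter or guard on the specific columns it needs first.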
The tile prioritization rules which previously applied to core tiles in a dataset now apply to all sidecars as well.
It is possible to aggressively load any columns to any depth in the dataset without loading other data using Dataset.spawnDownloads() and Dataset.runDownloads().
The Dataset object is now more independent of the scatterplot, to the point that it can independently run in a non-browser environment like Node. See the unit tests for an example of this.

bmschmidt added 16 commits January 10, 2024 18:16

2.16 release

e5d769f

improve docstring

dcc7d13

part one of no more subclasses

8b58580

more dataset fixes

65908d4

exclude dist and docs

1351cad

release_notes.md

65f5bb0

more type fies

6bb659d

minor typing fix

6d92153

81 errors

ee885e2

prepare big aesthetic refactor

cca34fb

3.0 stash

727143d

add async full recursion

145a4aa

more type schanges

73a800d

mavis beacon teaches typing

e59f3ed

stash

0d90e6c

typing improvements

6e18867

bmschmidt force-pushed the tileproxy-is-all-you-need branch from 869b646 to 6e18867 Compare February 13, 2024 03:12

bmschmidt added 8 commits February 14, 2024 15:09

refactor aesthetics

890eae0

tangent

5c690ba

plots black and white points again

9702df5

fewer errors, svelte-based testing page

4159eec

no errors without strict; restore some arrow-loading

e2247bf

mouseover, test pages

8c86866

improved support for partial loading of tiles and manifests

6c7c119

more support for unevenly loaded tile batches

cf24b9b

bmschmidt marked this pull request as ready for review May 24, 2024 16:10

bmschmidt closed this May 24, 2024
