Tableplot Project Summary

Overview

Tableplot is a Clojure data visualization library inspired by ggplot2's layered grammar of graphics. It provides easy layered graphics by composing Hanami templates with Tablecloth datasets. The library enables creation of interactive visualizations that work with any tool supporting the Kindly data visualization standard, such as Clay and Clojupyter.

Current Version: 1-beta14
Maven Coordinates: org.scicloj/tableplot
License: Eclipse Public License v2.0

Key Features

Layered grammar of graphics similar to ggplot2
Integration with Tablecloth for data processing
Support for Plotly.js and Vega-Lite backends
Kindly-compatible visualizations
Composable plotting functions with pipeline-friendly API
Statistical transformations (smoothing, histograms, density plots)
Multiple plot types (scatter, line, bar, heatmap, 3D surfaces, etc.)

Project Structure

Source Code Organization

src/scicloj/tableplot/v1/
├── plotly.clj          # Main Plotly backend implementation
├── hanami.clj          # Vega-Lite backend via Hanami
├── transpile.clj       # Cross-backend transpilation
├── dag.clj             # Dependency-aware function system
├── xform.clj           # Data transformations
├── util.clj            # Utility functions
└── cache.clj           # Caching mechanisms

Documentation and Examples

docs/                   # Quarto-generated documentation
notebooks/tableplot_book/  # Example notebooks and tutorials
├── plotly_walkthrough.clj    # Plotly API examples
├── hanami_walkthrough.clj    # Hanami/Vega-Lite examples
├── plotly_reference.clj      # Complete Plotly reference
└── transpile_reference.clj   # Cross-backend examples

Core Dependencies

Required Dependencies

tablecloth (7.029.2) - Dataset manipulation and processing
aerial.hanami (0.20.1) - Vega-Lite template system
metamorph.ml (1.2) - Machine learning pipeline integration
fastmath (3.0.0-alpha3) - Mathematical operations and statistics
kindly (4-beta16) - Visualization standard compliance
tempfiles (1-beta1) - Temporary file management
std.lang (4.0.10) - Language utilities

Development Dependencies

noj (2-beta18) - Data science stack (dev/test)
test.check (1.1.1) - Property-based testing
test-runner - Test execution
nrepl (1.3.1) - REPL server

API Overview

Main Namespaces

`scicloj.tableplot.v1.plotly`

The primary Plotly backend providing the richest feature set.

Key Functions:

;; Base plot creation
(plotly/base dataset options)
(plotly/plot template)

;; Layer functions
(plotly/layer-point {:=x :col1 :=y :col2 :=color :col3})
(plotly/layer-line {:=x :time :=y :value})
(plotly/layer-bar {:=x :category :=y :count})
(plotly/layer-histogram {:=x :values :=bins 20})
(plotly/layer-smooth {:=x :x :=y :y :=method :loess})

;; Specialized plots
(plotly/splom dataset {:=columns [:col1 :col2 :col3]})
(plotly/surface matrix)
(plotly/imshow image-data)

`scicloj.tableplot.v1.hanami`

Vega-Lite backend for web-standard visualizations.

Key Functions:

(hanami/base dataset options)
(hanami/layer-point {:=x :col1 :=y :col2})
(hanami/layer-line {:=x :time :=y :value})
(hanami/facet context facet-config)

Common Usage Patterns

Basic Plotting Pipeline

(require '[tablecloth.api :as tc]
         '[scicloj.tableplot.v1.plotly :as plotly])

(-> dataset
    (tc/select-columns [:x :y :category])
    (plotly/layer-point {:=x :x :=y :y :=color :category}))

Layered Plots

(-> dataset
    (plotly/layer-point {:=x :x :=y :y})
    (plotly/layer-smooth {:=x :x :=y :y :=method :loess}))

Statistical Transformations

(-> dataset
    (plotly/layer-histogram {:=x :values :=bins 30})
    (plotly/layer-density {:=x :values :=alpha 0.7}))

Architecture

Dependency-Aware Function System (DAG)

The library uses a dependency system defined in dag.clj. It provides:

defn-with-deps macro creates functions that declare their data dependencies
Automatic caching of intermediate computations
Lazy evaluation of expensive operations
Dependency resolution for complex visualization pipelines
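The ideas in this list can be illustrated with a small, library-independent sketch. It is not the code in dag.clj: the names with-deps and resolve-key are hypothetical, and the real macro may work differently. The sketch only shows how a function can declare the keys it depends on, how those keys are resolved on demand, and how resolved values are reused.

```clojure
;; Hypothetical sketch, NOT Tableplot's dag.clj.
;; A dependent function is a plain function carrying its dependency keys as metadata.
(defn with-deps
  "Attach the dependency keys `dep-ks` to the function `f`."
  [dep-ks f]
  (with-meta f {::deps dep-ks}))

(defn resolve-key
  "Resolve `k` in `context`, a map of keys to plain values or dependent functions.
  Returns [value cache], where `cache` maps every key resolved so far to its value."
  ([context k] (resolve-key context k {}))
  ([context k cache]
   (if (contains? cache k)
     [(get cache k) cache]                       ; already resolved: reuse it
     (let [v (get context k)]
       (if-let [deps (::deps (meta v))]
         ;; Resolve each dependency first, threading the cache through.
         (let [[args cache'] (reduce (fn [[args c] d]
                                       (let [[dv c'] (resolve-key context d c)]
                                         [(assoc args d dv) c']))
                                     [{} cache]
                                     deps)
               result (v args)]
           [result (assoc cache' k result)])
         ;; Plain value: nothing to compute.
         [v (assoc cache k v)])))))

(def context
  {:xs    [1 2 3 6]
   :total (with-deps [:xs] (fn [{:keys [xs]}] (reduce + xs)))
   :mean  (with-deps [:total :xs]
            (fn [{:keys [total xs]}] (/ total (count xs))))})

(first (resolve-key context :mean)) ; => 3
```

Nothing is computed until a key is requested, and :total is computed once even though :mean is the only key asked for directly.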

Data Flow Architecture

Input Dataset (Tablecloth/tech.ml.dataset)
Layer Functions apply aesthetic mappings and transformations
Statistical Computations (optional) - smoothing, binning, etc.
Backend-Specific Rendering (Plotly/Vega-Lite/ECharts)
Kindly Metadata added for tool integration
Visualization Output (JSON specifications)

Cross-Backend Compatibility

The transpile.clj namespace enables cross-backend functionality:

Convert between Plotly and Vega-Lite specifications
Maintain feature parity across backends
Backend-specific optimizations

Implementation Patterns

Aesthetic Mappings

Keys prefixed with = (e.g., :=x, :=y, :=color)
Automatic type inference from dataset columns
Support for continuous and categorical mappings
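As a rough illustration of type inference for aesthetic mappings, the sketch below classifies plain column vectors as continuous or categorical. The helpers infer-field-type and mapping-types are hypothetical and are not part of Tableplot's API; the library's actual inference rules may be richer (for example, for dates).

```clojure
;; Hypothetical helpers for illustration only; not part of Tableplot's API.
(defn infer-field-type
  "Classify a column as :quantitative when every value is a number, else :nominal."
  [values]
  (if (every? number? values) :quantitative :nominal))

(defn mapping-types
  "Given `columns` (column name -> values) and `mappings` (:=-prefixed key -> column name),
  return the inferred type for each mapped aesthetic."
  [columns mappings]
  (into {}
        (map (fn [[aesthetic column]]
               [aesthetic (infer-field-type (get columns column))]))
        mappings))

(def columns
  {:sepal-width [3.5 3.0 3.2]
   :species     ["setosa" "setosa" "virginica"]})

(mapping-types columns {:=x :sepal-width :=color :species})
;; => {:=x :quantitative, :=color :nominal}
```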

Template System

Built on Hanami's substitution key system
Composable plot specifications
Default value propagation and merging
nil values are removed from output (equivalent to hc/RMV)
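The nil-removal behaviour can be pictured with a plain-Clojure sketch. The function remove-nils below is hypothetical and only mimics the effect described above on nested maps; it is not Hanami's or Tableplot's implementation.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical helper: drops map entries whose value is nil, at every nesting level.
(defn remove-nils
  [spec]
  (walk/postwalk
   (fn [x]
     (if (map? x)
       (into {} (remove (comp nil? val)) x)
       x))
   spec))

(remove-nils {:type   "scatter"
              :marker {:size 20 :color nil}
              :name   nil})
;; => {:type "scatter", :marker {:size 20}}
```

Note that this sketch keeps a map that becomes empty after its nil entries are dropped; whether the library prunes such maps is not stated here.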

Caching Strategy

Intermediate results cached by content hash
Statistical computations cached separately
Cache invalidation on data changes
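A minimal sketch of content-keyed caching, assuming an in-memory atom as the store. The name cached-fn is hypothetical and the real cache.clj may be organized differently. Because Clojure collections hash and compare by value, equal inputs hit the same entry and changed data simply produces a new key.

```clojure
;; Hypothetical sketch, not Tableplot's cache.clj.
(defn cached-fn
  "Wrap `f` so that results are stored in `cache` (an atom holding a map)
  under the key [id args]. Equal arguments map to the same entry."
  [cache id f]
  (fn [& args]
    (let [k [id args]]
      (if-let [hit (find @cache k)]
        (val hit)
        (let [result (apply f args)]
          (swap! cache assoc k result)
          result)))))

(def cache (atom {}))
(def calls (atom 0)) ; counts how often the wrapped function really runs

(def cached-mean
  (cached-fn cache :mean
             (fn [xs]
               (swap! calls inc)
               (/ (reduce + xs) (count xs)))))

(cached-mean [2 4 6]) ; computed => 4
(cached-mean [2 4 6]) ; same content, served from the cache => 4
(cached-mean [2 4 9]) ; different content, computed again => 5
@calls                ; => 2
```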

Development Workflow

Setup

# Enter the cloned repository directory
cd tableplot

# Start REPL with dev dependencies
clj -M:dev

# Start REPL with dev dependencies and Clojure-MCP support
clj -M:dev:nrepl

# Run tests
clj -X:test

# Build documentation
# (requires Quarto installation)

Testing

Unit Testing with Clay Test Generation

The project uses Clay's kind/test-last mechanism for generating unit tests from notebooks:

;; In notebook files (e.g., plotly_walkthrough.clj, plotly_reference.clj)
(-> dataset
    (plotly/layer-point {:=x :x :=y :y}))

;; Add test annotation - must contain exactly ONE predicate
(kind/test-last [#(= (-> % plotly/plot :data first :type) "scatter")])

Test Pattern Best Practices:

Single predicate per test: Each kind/test-last takes ONE predicate function
REPL verification: Always test predicates in REPL before adding to file
Template-level tests: Check :aerial.hanami.templates/defaults for mappings and settings
Spec-level tests: Use plotly/plot to realize and test final Plotly.js spec
Layer-specific tests: Navigate to (-> template :aerial.hanami.templates/defaults :=layers first :aerial.hanami.templates/defaults)

Test Coverage (at the time of writing):

plotly_reference.clj: 67 tests (36 original + 31 new spec-level tests)
plotly_walkthrough.clj: 24 tests covering key examples and patterns
hanami_walkthrough.clj: 19 tests covering Hanami/Vega-Lite API
Total: 110 API tests

Common Test Patterns:

Template-level tests (check configuration before realization):

;; Template has correct dataset
(kind/test-last [#(contains? (:aerial.hanami.templates/defaults %) :=dataset)])

;; Layer has correct mappings
(kind/test-last [#(let [layer-defaults (-> % :aerial.hanami.templates/defaults :=layers first :aerial.hanami.templates/defaults)]
                    (and (= (:=x layer-defaults) :sepal-width)
                         (= (:=y layer-defaults) :sepal-length)))])

Spec-level tests (check realized Plotly.js/Vega-Lite output):

;; Plotly: Check trace type in realized spec
(kind/test-last [#(= (-> % plotly/plot :data first :type) "scatter")])

;; Plotly: Check marker properties appear in spec
(kind/test-last [#(= (-> % plotly/plot :data first :marker :size) 20)])

;; Plotly: Check aesthetic mappings create expected structures
(kind/test-last [#(vector? (-> % plotly/plot :data first :marker :color))])

;; Vega-Lite: Check encoding types
(kind/test-last [#(= (-> % hanami/plot :encoding :x :type) :quantitative)])

;; Check multiple traces from grouping
(kind/test-last [#(= (-> % plotly/plot :data count) 3)])

Important Testing Notes:

Use the fully qualified :aerial.hanami.templates/defaults, not the alias form ::ht/defaults (auto-resolved alias keywords don't work in the test generation context)
Prefer spec-level tests over template-level tests - they verify actual output
For Plotly: opacity is at trace level (:opacity), not marker level
For Plotly: z-data in heatmaps/surfaces can be lazy seqs, use seq? not vector?
For Plotly: SPLOM dimensions are seqs, not vectors
For Vega-Lite: encoding types are keywords (:quantitative), not strings
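Several of these notes follow from ordinary Clojure semantics, which the sketch below demonstrates without the library. The map fake-spec is a hand-written stand-in for a realized spec, not actual Tableplot output, and the last form assumes that no alias or namespace named ht exists where it runs.

```clojure
;; Hand-written stand-in for a realized spec; the structure is illustrative only.
(def fake-spec
  {:data [{:type "heatmap"
           :z    (map (fn [row] (map inc row)) [[1 2] [3 4]])}]})

(def z (-> fake-spec :data first :z))

(vector? z)     ; => false, because `map` returns a lazy seq
(seq? z)        ; => true
(sequential? z) ; => true, and it would also accept a vector

;; Keywords and strings never compare equal, so the expected type matters.
(= :quantitative "quantitative") ; => false

;; An auto-resolved keyword such as ::ht/defaults needs the `ht` alias at read time.
;; Without the alias the reader rejects the token.
(def read-result
  (try
    (read-string "::ht/defaults")
    (catch Exception _ :unreadable)))

;; The fully qualified form needs no alias.
(= (keyword "aerial.hanami.templates" "defaults")
   :aerial.hanami.templates/defaults) ; => true
```

When a predicate should not depend on the concrete collection type, sequential? is a reasonable alternative to seq? or vector?.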

Running Tests:

# Run all tests
clj -T:build test

# Run specific test namespaces
clj -M:test -n test.namespace

# Generate tests from notebooks using Clay
# (Tests are auto-generated when rendering notebooks with Clay)

Building and Deployment

# CI pipeline (test + build JAR)
clj -T:build ci

# Deploy to Clojars
clj -T:build deploy

Documentation Generation

Documentation built with Quarto from .qmd files
Notebooks in notebooks/tableplot_book/ provide examples
Generated docs deployed to GitHub Pages

Extension Points

Adding New Plot Types

Define layer function in appropriate backend namespace
Implement aesthetic mapping logic
Add statistical transformation if needed
Update documentation and examples

Custom Statistical Transformations

Implement stat function with defn-with-deps
Define data dependencies and caching strategy
Integrate with existing layer functions

New Backend Support

Create new namespace following plotly.clj pattern
Implement core functions: base, plot, layer functions
Add transpilation support in transpile.clj
Update cross-backend tests

Integration Points

Kindly: Add new visualization kinds
Tablecloth: Extend dataset processing pipeline
Hanami: Create new template types
Clay/Clojupyter: Enhance notebook integration

Common Development Tasks

Adding a New Layer Type

(defn layer-new-type [& {:as options}]
  (plotly/layer {:=mark :new-type} options))

Creating Custom Statistical Functions

(defn-with-deps custom-stat [dataset x-col y-col]
  [dataset :=x :=y]  ; dependencies
  ;; transformation logic goes here;
  ;; computed-values is a placeholder for the derived column values
  (tc/add-column dataset :computed-col computed-values))

Cross-Backend Testing

;; Test same visualization across backends
(def viz-spec {:=x :col1 :=y :col2})
(plotly/layer-point viz-spec)
(hanami/layer-point viz-spec)

This summary provides the essential information needed for an LLM to understand and work effectively with the Tableplot codebase, including its architecture, APIs, patterns, and extension points.
