diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index dcf2d42..b0ee0e0 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -9,26 +9,21 @@ on: jobs: pytest: runs-on: ubuntu-latest + strategy: + matrix: + python-version: ["3.10", "3.11", "3.12"] steps: - name: Check out repo - uses: actions/checkout@v3 + uses: actions/checkout@v4 - name: Set up python - uses: actions/setup-python@v2 + uses: actions/setup-python@v5 with: - python-version: 3.8 - cache: pip - cache-dependency-path: setup.py - - - uses: conda-incubator/setup-miniconda@v2 - with: - python-version: 3.8 + python-version: ${{ matrix.python-version }} - name: Install molplotly + dependencies - run: | - pip install .[test] - pip install rdkit-pypi + run: pip install .[test] - name: Run tests run: pytest --cov molplotly diff --git a/CITATION.cff b/CITATION.cff index 7f6c22f..853fc90 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -8,6 +8,6 @@ authors: given-names: "Rokas" orcid: "https://orcid.org/0000-0001-6397-0002" title: "molplotly" -version: 1.1.1 -date-released: 2022-03-01 +version: 2.0.0 +date-released: 2026-03-17 url: "https://github.com/wjm41/molplotly" diff --git a/PLAN.md b/PLAN.md new file mode 100644 index 0000000..4381c47 --- /dev/null +++ b/PLAN.md @@ -0,0 +1,118 @@ +# molplotly Cleanup & Modernization Plan + +## Summary + +molplotly is a single-module package (~520 lines) that adds interactive molecule hover tooltips to plotly figures using RDKit and Dash. The core idea is solid but the package is broken on PyPI, relies on a deprecated dependency (`JupyterDash`), has fragile packaging, minimal tests, and several unaddressed user issues. + +--- + +## Phase 1: Critical Fixes (Get it working again) + +### 1.1 Replace `JupyterDash` with `Dash` +- **Why**: `JupyterDash` is deprecated and causes doubled plots + port errors on Dash >= 2.10 (issue #31). This is the #1 breakage. 
+- **What**: Adopt the approach from PR #32 (open since Dec 2023, never reviewed) — replace `jupyter-dash` imports with standard `dash.Dash`. Modern Dash (>=2.11) natively supports Jupyter without `JupyterDash`. +- **Files**: `molplotly/main.py`, `tests/test_add_molecules.py` + +### 1.2 Modernize packaging — migrate to `pyproject.toml` +- **Why**: `setup.py` is legacy, `setup_pip.py` uses the removed `distutils`, and the current PyPI release (1.1.8) has mismatched version metadata making it uninstallable (issue #35). +- **What**: + - Create `pyproject.toml` with all metadata, dependencies, and build config + - Delete `setup.py` and `setup_pip.py` + - Set a correct, bumped version (e.g. 2.0.0 given the breaking `JupyterDash` removal) + - Pin minimum dependency versions sensibly: `dash>=2.11.0`, `plotly>=5.0.0`, `rdkit`, `pandas` + - Remove `jupyter-dash`, `werkzeug`, `ipykernel`, `nbformat` from required deps (no longer needed without JupyterDash; Dash pulls in werkzeug transitively) +- **Files**: new `pyproject.toml`, delete `setup.py`, delete `setup_pip.py` + +### 1.3 Update CI/CD +- **Why**: CI uses Python 3.8 (EOL), `rdkit-pypi` (renamed), and `actions/setup-python@v2` (outdated). 
+- **What**: + - Bump to Python 3.10+ (or test matrix 3.10/3.11/3.12) + - Use `actions/checkout@v4`, `actions/setup-python@v5` + - Remove miniconda setup (rdkit installs fine via pip now) + - Install via `pip install .[test]` (which will use pyproject.toml) +- **Files**: `.github/workflows/test.yml` + +--- + +## Phase 2: Code Quality + +### 2.1 Clean up `main.py` +- **Bare except** (line ~369): Catch specific `Exception` or `ValueError` instead of bare `except:` +- **Unused imports / dead code**: Audit and remove +- **Type hints**: Add missing type hints to `test_groups`, `find_correct_column_order`, `find_grouping` +- **Docstrings**: Add/improve docstrings for the public `add_molecules` function and helpers +- **Files**: `molplotly/main.py` + +### 2.2 Improve `__init__.py` +- **Why**: `from .main import *` exports everything including internal helpers +- **What**: Define `__all__` to only export `add_molecules` (the public API), or use explicit imports +- **Files**: `molplotly/__init__.py` + +### 2.3 Add `__version__` +- Use `importlib.metadata` to expose `__version__` from the installed package metadata (single source of truth from `pyproject.toml`) +- **Files**: `molplotly/__init__.py` + +--- + +## Phase 3: Testing + +### 3.1 Expand test coverage +- **Why**: Currently 1 test that only checks `isinstance(app, JupyterDash)` — no functional validation +- **What**: Add tests for: + - Basic scatter plot with molecules + - Color column grouping + - Symbol column grouping + - Facet column support + - Multiple SMILES columns + - Reaction SMILES drawing + - Edge cases: missing SMILES, invalid SMILES, empty dataframe + - Caption formatting functions + - The `find_grouping` logic (unit test the helper directly) +- **Files**: `tests/test_add_molecules.py` (or split into multiple test files) + +--- + +## Phase 4: Documentation & Metadata + +### 4.1 Update README +- Update installation instructions +- Note the `JupyterDash` → `Dash` migration (breaking change for users who 
import JupyterDash type) +- Update the "known issues" section (remove stale items, add current limitations) +- Update badges if any + +### 4.2 Update CITATION.cff +- **Why**: Shows version 1.1.1 and date 2022-03-01, both stale +- **What**: Bump version and date to match the new release + +### 4.3 Update example notebooks +- Verify notebooks still run with the updated code +- Remove/fix any `JupyterDash`-specific patterns + +--- + +## Phase 5: Address Open Issues (nice-to-haves / future work) + +These are lower priority but worth tracking: + +| Issue | Description | Effort | +|-------|-------------|--------| +| #34 | Streamlit integration | Medium — would need a different rendering approach | +| #29 | Embed in existing Dash app | Medium — return a Dash component instead of a full app | +| #26 | Support for stacked bar charts / graph_objects | Small-Medium | +| #4 | Export as standalone HTML | Hard — fundamental limitation of needing a Dash server | +| #20 | Usage question (resolved) | Can be closed | +| #21 | make_subplots support | Medium | + +--- + +## Suggested Execution Order + +1. **Phase 1.1** — Replace JupyterDash (unblocks everything) +2. **Phase 1.2** — pyproject.toml migration (fixes packaging) +3. **Phase 2.1-2.3** — Code cleanup (while we're in the code) +4. **Phase 1.3** — Update CI (validates the above changes) +5. **Phase 3** — Expand tests +6. **Phase 4** — Docs & metadata +7. **Phase 5** — Feature work (separate PRs) + +Phases 1-4 could reasonably be done as a single "v2.0.0" release given the breaking JupyterDash change. 
diff --git a/README.md b/README.md index cae4c57..3fb0228 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ [![Powered by RDKit](https://img.shields.io/static/v1?label=Powered%20by&message=RDKit&color=3838ff&style=flat&logo=data:image/x-icon;base64,AAABAAEAEBAQAAAAAABoAwAAFgAAACgAAAAQAAAAIAAAAAEAGAAAAAAAAAMAABILAAASCwAAAAAAAAAAAADc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/FBT/FBT/FBT/FBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/PBT/PBT/PBT/PBT/PBT/PBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/jIz/jIz/jIz/jIz/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nz/FBT/PBT/ZGT/ZGT/jIz/jIz/jIz/jIz/jIz/jIz/ZGT/ZGT/PBT/FBTc3Nzc3Nz/FBT/PBT/ZGT/ZGT/jIz/jIz/tLT/tLT/jIz/jIz/ZGT/ZGT/PBT/FBTc3Nzc3Nz/FBT/PBT/ZGT/ZGT/jIz/jIz/tLT/tLT/jIz/jIz/ZGT/ZGT/PBT/FBTc3Nzc3Nz/FBT/PBT/ZGT/ZGT/jIz/jIz/jIz/jIz/jIz/jIz/ZGT/ZGT/PBT/FBTc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/jIz/jIz/jIz/jIz/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/PBT/ZGT/ZGT/ZGT/ZGT/ZGT/ZGT/PBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/PBT/PBT/PBT/PBT/PBT/PBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/FBT/FBT/FBT/FBT/FBT/FBTc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nzc3Nz/////+B////AP///gB///wAP//4AB//+AAf//gAH//4AB//+AAf//gAH//8AD///gB///8A////gf////////)](https://www.rdkit.org/) [![PyPI version](https://img.shields.io/pypi/v/molplotly)](https://pypi.python.org/pypi/molplotly) [![PyPI Downloads](https://pepy.tech/badge/molplotly)](https://pepy.tech/project/molplotly) -[![This project supports Python 3.8+](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://python.org/downloads) +[![This project supports Python 3.10+](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://python.org/downloads) `molplotly` is an add-on to `plotly` built on RDKit which allows 2D images of molecules to 
be shown in `plotly` figures when hovering over the data points. @@ -44,7 +44,7 @@ app = molplotly.add_molecules(fig=fig, ) # run Dash app inline in notebook (or in an external server) -app.run_server(mode='inline', port=8700, height=1000) +app.run(port=8700, jupyter_height=1000, jupyter_mode='inline') ``` ### Input parameters @@ -72,15 +72,14 @@ app.run_server(mode='inline', port=8700, height=1000) #### Output parameters -by default a JupyterDash `app` is returned which can be run inline in a jupyter notebook or deployed on a server via `app.run_server()` +by default a Dash `app` is returned which can be run inline in a jupyter notebook or deployed on a server via `app.run()` - The recommended `height` of the app is `50+(height of the plotly figure)`. - For the `port` of the app, make sure you don't pick the same `port` as another `molplotly` plot otherwise the tooltips will clash with each other. Also, apparently on windows port numbers below `8700` are used by other processes so for safety processes keep to numbers above that. ## Can I run this in colab? -JupyterDash is supposed to have support for Google Colab but at some point that seems to have broken.. Keep an eye on the raised issue [here](https://github.com/plotly/jupyter-dash/issues/10)! -Update (1st March 2022): The plots seem to be running again but the hoverboxes are not showing so I don't think it has been fully fixed - I will keep an eye on it in the meantime. +Modern Dash (>=2.11) has native support for running in Jupyter notebooks and Google Colab without any extra dependencies. ## Can I save these plots? 
diff --git a/molplotly/__init__.py b/molplotly/__init__.py index c313e3a..f2c20f4 100644 --- a/molplotly/__init__.py +++ b/molplotly/__init__.py @@ -1 +1,10 @@ -from .main import * \ No newline at end of file +from importlib.metadata import PackageNotFoundError, version + +from .main import add_molecules + +__all__ = ["add_molecules"] + +try: + __version__ = version("molplotly") +except PackageNotFoundError: + __version__ = "unknown" diff --git a/molplotly/main.py b/molplotly/main.py index dd2fbf1..d413d8d 100644 --- a/molplotly/main.py +++ b/molplotly/main.py @@ -1,24 +1,22 @@ from __future__ import annotations import base64 +import itertools +import re import textwrap from io import BytesIO from typing import Callable -import itertools -import re -import pandas as pd import numpy as np -from dash import Input, Output, dcc, html, no_update -from jupyter_dash import JupyterDash +import pandas as pd +import plotly.graph_objects as go +from dash import Dash, Input, Output, dcc, html, no_update from pandas.core.groupby import DataFrameGroupBy from plotly.graph_objects import Figure -import plotly.graph_objects as go - from rdkit import Chem +from rdkit.Chem.Draw import rdMolDraw2D from rdkit.Chem.rdChemReactions import ReactionFromSmarts from rdkit.Chem.rdchem import Mol -from rdkit.Chem.Draw import rdMolDraw2D def str2bool(v: str) -> bool: @@ -162,12 +160,11 @@ def add_molecules( fontfamily: str = "Arial", fontsize: int = 12, reaction: bool = False, -) -> JupyterDash: +) -> Dash: """ A function that takes a plotly figure and a dataframe with molecular SMILES - and returns a dash app that dynamically generates an image of molecules in the hover box + and returns a Dash app that dynamically generates an image of molecules in the hover box when hovering the mouse over datapoints. - ... 
Attributes ---------- @@ -251,7 +248,7 @@ def add_molecules( if not svg_width: svg_width = svg_size - app = JupyterDash(__name__) + app = Dash(__name__) if smiles_col is None and mol_col is None: raise ValueError("Either smiles_col or mol_col has to be specified!") @@ -368,7 +365,7 @@ def display_hover(hoverData, value): if reaction: try: d2d.DrawReaction(ReactionFromSmarts(smiles, useSmiles=True)) - except: + except Exception: d2d.DrawMolecule(Chem.MolFromSmiles(smiles)) else: d2d.DrawMolecule(Chem.MolFromSmiles(smiles)) diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..17a506a --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,43 @@ +[build-system] +requires = ["setuptools>=77"] +build-backend = "setuptools.build_meta" + +[project] +name = "molplotly" +version = "2.0.0" +description = "plotly add-on to render molecule images on mouseover" +readme = "README.md" +license = "Apache-2.0" +requires-python = ">=3.10" +authors = [ + { name = "William McCorkindale", email = "wjm41@cam.ac.uk" }, +] +keywords = ["science", "chemistry", "cheminformatics"] +classifiers = [ + "Development Status :: 4 - Beta", + "Intended Audience :: Science/Research", + "Topic :: Scientific/Engineering :: Chemistry", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", +] +dependencies = [ + "dash>=2.11.0", + "plotly>=5.0.0", + "rdkit", + "pandas", + "numpy", +] + +[project.optional-dependencies] +test = ["pytest", "pytest-cov"] + +[project.urls] +Homepage = "https://github.com/wjm41/molplotly" +Repository = "https://github.com/wjm41/molplotly" +Issues = "https://github.com/wjm41/molplotly/issues" + +[tool.setuptools.packages.find] +include = ["molplotly*"] diff --git a/setup.py b/setup.py deleted file mode 100644 index f0ad78b..0000000 --- a/setup.py +++ /dev/null @@ -1,34 +0,0 @@ -from setuptools import 
find_packages, setup - -setup( - name="molplotly", - version="1.1.6", - description="plotly add-on to render molecule images on mouseover", - long_description=open("README.md").read(), - long_description_content_type="text/markdown", - url="https://github.com/wjm41/molplotly", - author="William McCorkindale", - license="Apache License 2.0", - packages=find_packages(), - install_requires=[ - "dash>=2.0.0", - "werkzeug>=2.0.0", - "jupyter-dash>=0.4.2", - "plotly>=5.0.0", - "rdkit", - "pandas", - "ipykernel", - "nbformat", - ], - extras_require={"test": ["pytest", "pytest-cov"]}, - keywords=["science", "chemistry", "cheminformatics"], - classifiers=[ - "Development Status :: 4 - Beta", - "Intended Audience :: Science/Research", - "Topic :: Scientific/Engineering :: Chemistry", - "License :: OSI Approved :: Apache Software License", - "Programming Language :: Python :: 3", - ], -) - -# TODO - change link in blog \ No newline at end of file diff --git a/setup_pip.py b/setup_pip.py deleted file mode 100644 index fe06ce8..0000000 --- a/setup_pip.py +++ /dev/null @@ -1,40 +0,0 @@ -from distutils.core import setup -from setuptools import setup - -# read the contents of your README file -from pathlib import Path - -this_directory = Path(__file__).parent -long_description = (this_directory / "README.md").read_text() - -setup( - name="molplotly", - packages=["molplotly"], - version="1.1.6", - license="Apache License, Version 2.0", - description="molplotly is an add-on to plotly built on RDKit which allows 2D images of molecules to be shown in scatterplots when hovering over the datapoints.", - long_description=long_description, - long_description_content_type="text/markdown", - author="William McCorkindale", - author_email="wjm41@cam.ac.uk", - url="https://github.com/wjm41/molplotly", - download_url="https://github.com/wjm41/molplotly/archive/refs/tags/v1.1.6.tar.gz", - install_requires=[ - "dash>=2.0.0", - "werkzeug>=2.0.0", - "jupyter-dash>=0.4.2", - "plotly>=5.0.0", - 
"rdkit", - "pandas", - "ipykernel", - "nbformat", - ], - keywords=["science", "chemistry", "cheminformatics"], - classifiers=[ - "Development Status :: 4 - Beta", - "Intended Audience :: Science/Research", - "Topic :: Scientific/Engineering :: Chemistry", - "License :: OSI Approved :: Apache Software License", - "Programming Language :: Python :: 3", - ], -) diff --git a/tests/test_add_molecules.py b/tests/test_add_molecules.py index c070846..56b1e26 100644 --- a/tests/test_add_molecules.py +++ b/tests/test_add_molecules.py @@ -1,37 +1,141 @@ import molplotly -import multiprocessing -import time import pandas as pd import plotly.express as px -from jupyter_dash import JupyterDash +from dash import Dash from . import ROOT df_esol = pd.read_csv(f"{ROOT}/examples/example.csv") df_esol["y_pred"] = df_esol["ESOL predicted log solubility in mols per litre"] df_esol["y_true"] = df_esol["measured log solubility in mols per litre"] +df_esol["delY"] = df_esol["y_pred"] - df_esol["y_true"] -df_esol["delY"] = df_esol["y_pred"] - df_esol["y_true"] -fig_scatter = px.scatter( - df_esol, - x="y_true", - y="y_pred", - color="delY", - title="ESOL Regression (default plotly)", - labels={ - "y_pred": "Predicted Solubility", - "y_true": "Measured Solubility", - "delY": "ΔY", - }, - width=1200, - height=800, -) - - -def test_add_molecules(): - app_scatter = molplotly.add_molecules( - fig=fig_scatter, df=df_esol, smiles_col="smiles", title_col="Compound ID" - ) - - assert isinstance(app_scatter, JupyterDash) +def _make_scatter(**kwargs): + return px.scatter( + df_esol, + x="y_true", + y="y_pred", + labels={ + "y_pred": "Predicted Solubility", + "y_true": "Measured Solubility", + }, + **kwargs, + ) + + +def test_add_molecules_basic(): + """Basic scatter plot returns a Dash app.""" + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, df=df_esol, smiles_col="smiles", title_col="Compound ID" + ) + assert isinstance(app, Dash) + + +def test_add_molecules_with_color(): + 
"""Scatter with discrete color column.""" + df_esol["solubility_class"] = pd.cut( + df_esol["y_true"], bins=3, labels=["low", "mid", "high"] + ) + fig = px.scatter( + df_esol, + x="y_true", + y="y_pred", + color="solubility_class", + labels={ + "y_pred": "Predicted Solubility", + "y_true": "Measured Solubility", + }, + ) + app = molplotly.add_molecules( + fig=fig, + df=df_esol, + smiles_col="smiles", + color_col="solubility_class", + ) + assert isinstance(app, Dash) + + +def test_add_molecules_with_symbol(): + """Scatter with symbol column.""" + df_esol["solubility_class"] = pd.cut( + df_esol["y_true"], bins=2, labels=["low", "high"] + ) + fig = px.scatter( + df_esol, + x="y_true", + y="y_pred", + symbol="solubility_class", + labels={ + "y_pred": "Predicted Solubility", + "y_true": "Measured Solubility", + }, + ) + app = molplotly.add_molecules( + fig=fig, + df=df_esol, + smiles_col="smiles", + symbol_col="solubility_class", + ) + assert isinstance(app, Dash) + + +def test_add_molecules_with_captions(): + """Captions and caption transforms.""" + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, + df=df_esol, + smiles_col="smiles", + caption_cols=["Compound ID", "y_pred"], + caption_transform={"y_pred": lambda x: f"{x:.2f}"}, + ) + assert isinstance(app, Dash) + + +def test_add_molecules_no_img(): + """Show img disabled.""" + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, df=df_esol, smiles_col="smiles", show_img=False + ) + assert isinstance(app, Dash) + + +def test_add_molecules_no_coords(): + """Show coords disabled.""" + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, df=df_esol, smiles_col="smiles", show_coords=False + ) + assert isinstance(app, Dash) + + +def test_add_molecules_multiple_smiles(): + """Multiple SMILES columns produce a dropdown.""" + df_esol["smiles2"] = df_esol["smiles"] + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, df=df_esol, smiles_col=["smiles", "smiles2"] + ) + 
assert isinstance(app, Dash) + + +def test_add_molecules_custom_svg_size(): + """Custom SVG dimensions.""" + fig = _make_scatter() + app = molplotly.add_molecules( + fig=fig, + df=df_esol, + smiles_col="smiles", + svg_height=300, + svg_width=400, + ) + assert isinstance(app, Dash) + + +def test_add_molecules_version(): + """Package exposes a version string.""" + assert hasattr(molplotly, "__version__") + assert isinstance(molplotly.__version__, str)