Interactive molecular content

How to embed interactive content in webpages with Quarto using Bokeh, 3DMol.js and NGL

quantum chemistry
Author

Kjell Jorner

Published

August 13, 2022

Introduction

I recently moved the blog to the Quarto publishing system. Quarto is an evolution of R Markdown and can publish a notebook (.qmd or .ipynb) in various formats, including as blog posts.

In the previous post on visualizing atomic type orbitals, we had some code for interactive visualization with ipywidgets. Unfortunately, it didn’t work in the browser because ipywidgets needs a running Python backend. With Quarto as the publishing system, we can work around that problem.

Thanks to the support for the Observable dialect of JavaScript (OJS) in Quarto, we can create interactive elements which will work on the final static webpage. This will require some level of proficiency with JavaScript, but rest assured that I knew zero JavaScript when I started writing this blog post. The code cells featuring OJS are hidden in this post, but can be shown by clicking on the arrow next to “Code”. But first we will start with a visualization that doesn’t need any JavaScript skills.

Visualizing molecules with molplotly

molplotly is a great add-on to plotly to display data together with 2D images of the associated molecules. It is really easy to use and works nicely in a Jupyter Notebook, but requires a Dash app to run in the background. Here we will instead use Bokeh to create similar plots which can be displayed on a static webpage, although with a bit more effort. See the Bokeh documentation as well as the blog post by iwatobipen and the notebook from OpenEye Software for more ideas.
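To make the "static webpage" point concrete, here is a minimal sketch of how a Bokeh figure can be serialized into a single HTML document that needs no Python process once it is written. The toy data, the `label` column and the file name `toy_scatter.html` are made up for illustration; only `file_html` and `CDN` come from Bokeh itself.

```python
from bokeh.embed import file_html
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.resources import CDN

# Hypothetical toy data, for illustration only
source = ColumnDataSource(
    data={
        "x": [1.0, 2.0, 3.0],
        "y": [0.5, 1.5, 1.0],
        "label": ["a", "b", "c"],
    }
)

# Build a small scatter plot with a hover tooltip
toy = figure(width=400, height=300)
toy.scatter("x", "y", source=source)
toy.add_tools(HoverTool(tooltips=[("label", "@label")]))

# Serialize the figure into one HTML document; BokehJS is loaded from the CDN
html = file_html(toy, CDN, "Toy scatter plot")

# Write the page to disk; it can be opened directly in a browser
with open("toy_scatter.html", "w", encoding="utf-8") as f:
    f.write(html)
```

All the interactivity lives in the JavaScript that Bokeh embeds in the page, which is why the hover tool keeps working without a server. Quarto does the equivalent embedding for us when it renders the notebook.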

We visualize the ESOL dataset,1 downloaded from MoleculeNet.

from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure, show
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Draw  # makes Chem.Draw (and MolDraw2DSVG) available

# Read the csv file
df = pd.read_csv("delaney-processed.csv")

# Get data to plot: molecular weight on the x-axis, log S on the y-axis
all_smiles = df["smiles"]
x = df["Molecular Weight"].values
y = df["measured log solubility in mols per litre"].values

# Create SVGs for each smiles with the "new" RDKit drawing code
imgs = []
for smiles in all_smiles:
    mol = Chem.MolFromSmiles(smiles)
    d2d = Chem.Draw.MolDraw2DSVG(150, 150)
    d2d.DrawMolecule(mol)
    d2d.FinishDrawing()
    svg = d2d.GetDrawingText()
    imgs.append(svg)

# Configure for output in the notebook
output_notebook()

# Load the data into a source and plot
source = ColumnDataSource(
    data={
        "x": x,
        "y": y,
        "imgs": imgs,
        "smiles": all_smiles,
    }
)
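# Set the size when creating the figure; the plot_width/plot_height
# properties were removed in Bokeh 3.0 in favor of width/height
p = figure(width=400, height=300, sizing_mode="scale_width")
p.scatter("x", "y", source=source)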
p.xaxis.axis_label = "Molecular weight"
p.yaxis.axis_label = "log S"

# Create tooltips referencing stored images
TOOLTIPS = """\
    <div>
        <div>
            @imgs{safe}
        </div>
        <div>
            <span>[$index]</span>
        </div>
        <div>
            <span>@smiles{safe}</span>
        </div>
        <div>
            <span>($x, $y)</span>
        </div>
    </div>
"""

# Connect tooltips to plot
p.add_tools(HoverTool(tooltips=TOOLTIPS))

# Show figure
show(p)
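The SVG loop above assumes that every SMILES string parses. That holds for ESOL, but with other datasets `Chem.MolFromSmiles` can return `None`. Below is a minimal sketch of a more defensive variant; the helper name `smiles_to_svg` and the choice of an empty string as a placeholder are my own assumptions, not part of RDKit.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def smiles_to_svg(smiles, size=150):
    """Return an SVG drawing for a SMILES string, or "" if it cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # Keep a placeholder so the list stays aligned with the data rows
        return ""
    drawer = rdMolDraw2D.MolDraw2DSVG(size, size)
    drawer.DrawMolecule(mol)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


# Two valid molecules and one deliberately invalid string
svgs = [smiles_to_svg(s) for s in ["CCO", "c1ccccc1", "not_a_smiles"]]
```

Returning a placeholder instead of skipping the molecule keeps the list the same length as the other columns, which `ColumnDataSource` expects.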