Tutorial

cscully-allison edited this page Dec 7, 2021 · 24 revisions

Advanced Usage Tutorial

Assuming you have gone through the quick-start tutorial, you should now be ready to learn about the more advanced and dynamic features of Roundtrip.

Loading Multiple Files

In the quick start, we loaded a single, static HTML file. While that is helpful for verifying that your install works, it cannot support even a simple visualization, so we will now explore how to load multiple files and modify the HTML with your own CSS and JavaScript code.

Picking up from the Quickstart, you should now have a directory that looks like this:

rt_tutorial
├── rt_templ.html
├── rt_vis.py
├── rt_notebook.ipynb

We are going to expand on this directory by adding a new file, rt_script.js.

touch rt_script.js

Inside of this file we will write a simple script which adds a line to our html:

function createTextElement (type, text) {
    // create a new element of the given type
    const newNode = document.createElement(type);
    newNode.className += 'from-js';
  
    // and give it some content
    const newContent = document.createTextNode(text);
  
    // add the text node to the newly created element
    newNode.appendChild(newContent);
    
    return newNode;
  }

const node =  createTextElement("p", "I was created and added in the JavaScript, and styled by the CSS!");

element.appendChild(node);

In this code we define a function which makes building text nodes very easy. Then we use it to add a paragraph node to our view area, containing the text "I was created and added in the JavaScript, and styled by the CSS!".


Note

In this example we use element, a variable that we never defined ourselves. Roundtrip exposes the node containing all of your loaded web files through this variable. To avoid conflicts with other cells, use element as you would use document when selecting and modifying the DOM. This keeps your code safely inside the visualization area of the current Jupyter cell.


We can now add this JavaScript code into our visualization by modifying the rt_vis.py file to specify one more file path in our call to RT.load_web_files():

...
  @line_magic
  def hello_rt(self, line):
      
      # load files
      RT.load_web_files(["rt_templ.html", "rt_script.js"])

      RT.initialize()
...

When you run the %hello_rt cell in your rt_notebook you should see the text added, but you should also notice that it is not styled. That's because we need to add a css file to our code. Do this by creating a new file:

touch rt_style.css

Now add some styles to your css file:

h1 {
    color: rgba(50,50,180,1);
}

p{
    color:brown;
    font-size: medium;
    font-family: 'Times New Roman', Times, serif;
}

And add them to our list of loaded files as before:

...
  @line_magic
  def hello_rt(self, line):
      
      # load files
      RT.load_web_files(["rt_templ.html", "rt_script.js", "rt_style.css"])

      RT.initialize()
...

When you reload and re-run your notebook, you will see that the big text is colored blue and the smaller text is now brown. This approach is best suited to vanilla JS and simple visualizations, but it allows you to inject many files of any supported type (CSS, HTML, and JS).

Loading Webpack Code

Since this is a library to support complex JavaScript visualizations, we can expect that code bases will grow increasingly complex and benefit from the features provided by bundlers. So, to support more complicated visualizations, we suggest a workflow for managing and loading your code using webpack. This is the recommended approach to loading web files into Roundtrip.

You will need Node.js and Webpack installed on your system for this portion of the tutorial.

We will also need the webpack html plugin, the style loader plugins, as well as d3. Install these from the rt_tutorial root directory:

npm install html-webpack-plugin
npm install style-loader css-loader
npm install d3

At this point your directory should look like this:

rt_tutorial
├── rt_templ.html
├── package.json
├── rt_vis.py
├── rt_notebook.ipynb
├── rt_style.css
├── rt_script.js

You can now create two new directories called src and dist, create a webpack.config.js file, and populate src with an HTML, a JavaScript, and a CSS file.

mkdir src
mkdir dist
touch webpack.config.js
cd src
touch webpack_templ.html webpack_script.js webpack_style.css

After these operations your directory should now look like this:

rt_tutorial
├── rt_templ.html
├── package.json
├── webpack.config.js
├── rt_vis.py
├── rt_notebook.ipynb
├── rt_style.css
├── rt_script.js
├── dist
└── src
     ├── webpack_style.css
     ├── webpack_script.js
     └── webpack_templ.html

The Files

We will populate webpack_templ.html with just a heading and an svg:

<h3>My Histogram</h3>
<svg id='hist-canvas'></svg>

We will put some style information for our histogram bars in the webpack_style.css:

.idle{
    fill: indianred;
    stroke: royalblue;
    stroke-width: 1px;
}

.hovered{
    fill: royalblue;
    stroke: rgb(215, 215, 216);
    stroke-width: 1px;
}

.labels{
    fill: black;
    font: 14px sans-serif;
}

We will put the following into our webpack_script.js file:

import * as d3 from 'd3';
import './webpack_style.css'

class Histogram{
    constructor(svg, width, height, data){
        this.svg = svg;
        this.width = width;
        this.height = height;
        this.margins = {
            left: 40,
            right: 20,
            top: 20,
            bottom: 20
        }
        this.bar_margin = 7;

        this.data = data;
        this.maxMin = this.getMinMax(data);
        let bins = d3.bin();
        this.data_bins = bins(this.data);
        this.num_bins = this.data_bins.length;
        this.bindomains = this.getBinDomains(this.data_bins);

        this.xscale = d3.scaleLinear().range([this.margins.left, this.width-this.margins.right]).domain([0,this.num_bins]);
        this.yscale = d3.scaleLinear().range([this.margins.top+this.margins.bottom,this.height]).domain([this.bindomains.max, this.bindomains.min]);
    }

    getMinMax(arr){
        let minmax = {
            "min": Number.POSITIVE_INFINITY,
            "max": Number.NEGATIVE_INFINITY
        }

        for(let val of arr){
            minmax.min = Math.min(minmax.min, val);
            minmax.max = Math.max(minmax.max, val);
        }

        return minmax;
    }
    
    getBinDomains(bins){
        let minmax = {"max": Number.NEGATIVE_INFINITY, "min": 0}

        for(let bin of bins){
            minmax.max = Math.max(minmax.max, bin.length);
        }

        return minmax
    }

    draw(){
        let bar_width = (this.width/this.data_bins.length)-this.bar_margin;

        let leftAxis = d3.axisLeft(this.yscale);
        this.svg.append('g').attr('transform',`translate(${this.margins.left}, ${-this.margins.top})`).call(leftAxis);


        this.svg.attr('height', this.height)
            .attr('width', this.width);

        let bars = this.svg
                    .selectAll('.bars')
                    .data(this.data_bins);
        
        let labels = this.svg
                        .selectAll('.labels')
                        .data(this.data_bins);
        
        bars.enter().append('rect')
            .attr('class', 'bars idle')
            .attr('width', bar_width)
            .attr('height', (d)=>{return this.height-this.yscale(d.length)})
            .attr('x', (d,i)=>{return this.xscale(i)})
            .attr('y', (d)=>{return this.yscale(d.length)-this.margins.bottom})
            .on('mouseover', (evt, d)=>{
                    d3.select(evt.target).attr('class', 'bars hovered');
                })
            .on('mouseout', (evt,d)=>{d3.select(evt.target).attr('class', 'bars idle');});

        labels.enter()
                .append('text')
                .attr('class', 'labels')
                .attr('x', (d,i)=>this.xscale(i)+(0.5*bar_width))
                .attr('y', this.height - 5)
                .attr('text-anchor', 'middle')
                .html(d=>{return `[${d.x0}, ${d.x1}]`})

        
    }
}

//setup
const svg_select = d3.select(element).select('#hist-canvas');
const data = [];
const num_data = 1000;
for(let i = 0; i < num_data; i++){
    data.push(Math.random()*100);
}

//construction & initial draw
const hist = new Histogram(svg_select, 800, 400, data);
hist.draw();

In our script file you will notice that our code has increased in complexity compared to our prior examples. Using webpack we can now import our CSS files directly, in addition to any JavaScript libraries we download through NPM. Additionally, we can now use ES6 syntax for creating a class that manages our visualization: Histogram.

Inside of this class we provide two functions for getting the domains of our data for d3 scales. We also use the d3.bin generator to automatically sort our generated data into bins. The draw function also provides some styling on hover (mouseover and mouseout).

At the bottom of this code you will see that we are randomly generating data. Because Math.random() samples uniformly, the visualization should show a uniform distribution in which all bars are roughly the same height.
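To make the "roughly the same height" claim concrete, here is a minimal Python sketch that bins uniformly generated values into equal-width bins. It is an illustration only: bin_counts is a hypothetical helper, and d3.bin chooses its own thresholds, so the exact bin edges in the visualization may differ.

```python
import random

def bin_counts(values, num_bins, lo, hi):
    """Count how many values fall into each of num_bins equal-width bins on [lo, hi]."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        # clamp the index so that v == hi lands in the last bin
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts

random.seed(0)  # fixed seed so that reruns produce the same data
# mirrors Math.random()*100 from the JavaScript above
data = [random.random() * 100 for _ in range(1000)]
counts = bin_counts(data, 10, 0, 100)
print(counts)
```

With 1000 values and 10 bins, each count should land near 100, which is why the bars look nearly level.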

Finally, we are going to add a new loader to our rt_vis.py file.

. . .

@magics_class
class MyWebpackVis(Magics):
    def __init__(self, shell):
        super(MyWebpackVis, self).__init__(shell)
        self.shell = shell
    
    @line_magic
    def histogram(self, line):
        RT.load_webpack('dist/webpack_bundle.html')
        RT.initialize()

def load_ipython_extension(ipython):
    ipython.register_magics(MyRTVis)
    ipython.register_magics(MyWebpackVis)

Webpack Configuration

Finally, in the webpack.config.js file we will provide the following settings:

const HtmlWebpackPlugin = require('html-webpack-plugin');
const path = require('path');

module.exports = {
    module:{
        rules:[
            {
                test: /\.css$/i,
                use: ["style-loader", "css-loader"]
            },
        ]
    },
    entry: {
        script: ['./src/webpack_script.js']
    },
    output: {
        publicPath: path.resolve(__dirname, 'dist'),
        filename: '[name]_bundle.js',
        path: path.resolve(__dirname, 'dist')
    },
    optimization: {
        minimize: false
    },
    plugins:[
        new HtmlWebpackPlugin({
            template: './src/webpack_templ.html',
            chunks: ['script'],
            filename: 'webpack_bundle.html'
        })
    ],
    mode: 'production'
}

In this webpack configuration file, there are a few key details which must always be included for Roundtrip to work correctly.

First, the use of the HtmlWebpackPlugin is strongly recommended. Roundtrip's load_webpack function expects an .html file created by this plugin. This workflow benefits you as well, because the HTML file produced here will contain all the necessary references to any scripts you are using in your visualization. It will also save you from generating otherwise static HTML in your JavaScript code.

Second, note the publicPath option under output. For Roundtrip to successfully load files regardless of where the Jupyter notebook server is running, this should be the absolute path to your output directory, which is dist in this case. __dirname holds the path of the directory containing this webpack.config.js file.

Outside of these two details, the rest of the webpack configuration can be set up in whatever way works best for you and your workflow.

Once your configuration file is all set, you should be ready to go. You can build your code by running the following line from the rt_tutorial directory:

npx webpack

After your front end code has been built successfully, you can now run the histogram visualization from the rt_notebook in a new cell with the %histogram magic function. Your output should look more-or-less like the following figure.

An image of a histogram with a random distribution loaded by roundtrip.

Passing Jupyter-Scoped Variables to Javascript

Now we have a histogram which visualizes completely useless and constantly changing data. Wouldn't it be way more useful if we could pass data supplied by the notebook user to this visualization and enable them to visualize any data they want?

Fortunately, we can do that, and this is really where the magic of Roundtrip begins to shine. Roundtrip provides a function, RT.var_to_js(), for this exact purpose. It takes the name of a Jupyter-scoped variable and a key under which the data can be found in our JavaScript code, and it manages the translation from Jupyter to JavaScript, making our data available for use in a loaded visualization.

To demonstrate how this works we are going to add two lines to our histogram line magic function:

    @line_magic
    def histogram(self, line):
        # split() with no separator returns [] when no arguments were given
        args = line.split()

        RT.load_webpack('dist/webpack_bundle.html', cache=False)

        if len(args) > 0:
            RT.var_to_js(args[0], "hist_data")
        
        RT.initialize()
  

The first added line, args = line.split(), sits at the top of the function. It parses any whitespace-separated arguments passed to our magic function into a list (an empty list when no arguments are given). This list contains the names of the variables passed to the function; Roundtrip manages the extraction of the data those names refer to in the Jupyter namespace.

The next new line is RT.var_to_js(args[0], "hist_data"). Here we pass the name of the first variable given to %histogram and specify that its data should be found at the key hist_data on our JavaScript Roundtrip object. We wrap this line in an if statement to check whether the user passed an argument at all. We could also put this call in a try/except block or apply any other validation technique.
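As an aside on argument parsing, the following minimal sketch shows how Python splits a magic's argument string. parse_magic_args is a hypothetical helper and not part of Roundtrip. The point to notice is that split() with no separator returns an empty list for an empty line, whereas split(" ") returns a list containing one empty string, which matters for a len(args) > 0 check.

```python
def parse_magic_args(line):
    """Hypothetical helper: split a magic's argument string into variable names."""
    # split() with no separator drops empty strings and collapses repeated whitespace
    return line.split()

print(parse_magic_args("normal"))         # ['normal']
print(parse_magic_args("normal ranges"))  # ['normal', 'ranges']
print(parse_magic_args(""))               # []
print("".split(" "))                      # [''] has length 1, not 0
```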

We will now modify the JavaScript in our webpack_script.js file to make use of this newly exposed data.

import * as d3 from 'd3';
import './webpack_style.css'

const RT = window.Roundtrip;
...

Near the top of our file we will access the global Roundtrip object from the window variable provided by our browser. We alias it as RT for use throughout this script.

At the bottom of our file we can access the data passed in from the Jupyter side of our code.

...
let data = [];

if(Object.keys(RT).includes('hist_data')){
    data = JSON.parse(RT['hist_data']);

    //construction & initial draw
    const hist = new Histogram(svg_select, 800, 400, data);
    hist.draw();
}
else{
    svg_select.attr('width', 800)
            .append('text')
            .attr('class', 'no-data')
            .attr('y', 32)
            .html("Oh no! There is no data to show.")
}
...

We have now changed the const data to a let data so that we can reassign it with our passed data and removed our Javascript-based random number generation. We have also added a conditional which will print an error message in the visualization if no data has been passed.

Roundtrip stores data passed from Jupyter on the Roundtrip object as a string, and it expects all data passed back to be a string or a basic datatype (int, float, char).

Note On the Python side, Roundtrip provides a default converter for translating basic Python data to JSON, but allows for users to provide custom defined converters in Python to support more complex datatypes. Conversion on the JavaScript side is currently manual and requires users to parse data retrieved from the Roundtrip object as well as encode data passed back to the Roundtrip object.
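The string-based exchange can be illustrated with a short Python sketch. This is not Roundtrip's actual default converter; it only shows, using the standard json module, what it means for a list to cross the boundary as a JSON string and be parsed again on the other side.

```python
import json

values = [0.5, 1.25, 3.0]

# Python side: encode the list as a JSON string before it crosses to the browser
encoded = json.dumps(values)
print(type(encoded).__name__, encoded)  # str [0.5, 1.25, 3.0]

# The JavaScript side would call JSON.parse on this string;
# json.loads is the equivalent operation in Python
decoded = json.loads(encoded)
print(decoded == values)  # True
```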

Finally, we need to add our new class no-data to our css file.

.no-data{
    fill: black;
    font: 32px sans-serif;
}

Once you have implemented the above code changes, run npx webpack again to build out your code. We can now generate a normal dataset very easily using numpy in our Jupyter notebook and see it visualized.


Note At this point you may notice some behavioral errors when you try to re-run your data. During development we suggest that you do the following to fix some cache related errors:

  1. Reload the Kernel
  2. Clear the browser cache
  3. Refresh the Webpage

Relatedly, you may at this point have tried to "run all cells". Due to the asynchronous nature of loading and passing data, Roundtrip does not support running all cells at once in a Jupyter notebook. If you run into issues related to running all cells, try the above steps to reset the Jupyter notebook to a fresh state and run each cell individually.


Create a new cell above the %histogram visualization cell and insert the following code:

import numpy as np
normal = np.random.normal(0, 1, 1000).tolist()

This creates a new list of normally distributed values for us to visualize. Since we did not supply a custom converter to our var_to_js function, we convert the array to a list so that the default converter handles the input more gracefully than it might with a numpy array. We can pass this list into our visualization by providing it as the first argument to our histogram:

%histogram normal

Your output should look something like this:

Histogram with a normal distribution.

Returning Javascript Data to Jupyter

At this point, we have a custom histogram visualization which works with any list of numbers the user wishes to pass into it from the Jupyter notebook. Now what if we want to return information from our visualization back to the notebook for further analysis or visualization? Roundtrip provides an interface for this.

The function lives on the Python side of our code, RT.fetch_data(js_var, jupyter_var), and takes two arguments:

  1. js_var - The key where our JavaScript data was stored in the Javascript Roundtrip object
  2. jupyter_var - The name of a Jupyter notebook-scoped Python variable where we will place retrieved data

To use this retrieval function we define a new line magic method inside of our MyWebpackVis class:

    @line_magic
    def get_histogram_selection(self, line):
      args = line.split(" ")
      
      RT.fetch_data("return_ranges", args[0])

This method takes in a single argument: a Jupyter-scoped variable we will load our returned data into. Currently the fetch_data method does not support automatic conversion of returned data so the data loaded into the args[0] variable will be a string. If the string was stored in a JSON format originally, the decoding will be easy in the scope of the Jupyter notebook.

Note: Functionality to supply a custom converter for this function will be implemented soon.
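Decoding a returned JSON string in the notebook might look like the sketch below. The string assigned to ranges is a hypothetical payload, assumed for illustration to map range labels to arrays of numbers; the real contents depend on what your JavaScript stores.

```python
import json

# Hypothetical payload: a JSON object whose keys are range labels
# and whose values are arrays of numbers
ranges = '{"[0, 0.5]": [0.1, 0.3], "[0.5, 1]": [0.7]}'

selection = json.loads(ranges)  # str -> dict
print(type(selection).__name__)  # dict
print(list(selection.keys()))    # ['[0, 0.5]', '[0.5, 1]']
```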

After we have defined where we expect the data to be found on the JavaScript side of our code and created this interface for the Jupyter notebook user, we need to modify our visualization code to return data to its reserved area: 'return_ranges'. To do that we will add a new interaction to the bars in webpack_script.js and a new style to webpack_style.css.

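constructor(svg, width, height, data){
. . .
        // track the selected bins, keyed by their range label
        this.selected = {};
}

draw(){
. . .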
        bars.enter().append('rect')
            .attr('class', 'bars idle')
            .attr('width', bar_width)
            .attr('height', (d)=>{return this.height-this.yscale(d.length)})
            .attr('x', (d,i)=>{return this.xscale(i)})
            .attr('y', (d)=>{return this.yscale(d.length)-this.margins.bottom})
            .on('mouseover', (evt, d)=>{
                    d3.select(evt.target).attr('class', 'bars hovered');
                })
            .on('mouseout', (evt, d, i)=>{
                let range_key = `[${d.x0}, ${d.x1}]`;
                let found = Object.keys(this.selected).includes(range_key);
                if(!found){
                    d3.select(evt.target).attr('class', 'bars idle');
                }
                else{
                    d3.select(evt.target).attr('class', 'bars selected-bar');
                }
            })
            .on('click', (evt,d)=>{
                let range_key = `[${d.x0}, ${d.x1}]`;
                let found = Object.keys(this.selected).includes(range_key);
                if(found){
                    d3.select(evt.target).attr('class', 'bars hovered');
                    delete this.selected[range_key];
                }else{
                    d3.select(evt.target).attr('class', 'bars selected-bar');
                    this.selected[range_key] = d;
                }

                //loading return data into our roundtrip object
                RT["return_ranges"] = JSON.stringify(this.selected);
            });
. . .

In webpack_style.css, add:

.selected-bar{
    fill: royalblue;
    stroke: indianred;
    stroke-width: 2px;
}

In our JavaScript we added a new event and slightly modified our mouseout event to ensure that the behavior of our visualization is consistent. Most importantly however, you will see that in our click event we now add data to an object that keeps track of our selected ranges, this.selected. At each click, we update this object and store the new selections in our Roundtrip object at the key "return_ranges", the location our fetch_data function expects it to be found.

RT["return_ranges"] = JSON.stringify(this.selected);

Note that we stringify our object before storing it. Just as we have to parse data retrieved from the Roundtrip object, we also need to stringify data before storing it. If you prefer an encoding scheme other than JSON, that is fine, but be sure to encode your data before storing it and stay consistent with what your Python code expects.

Now, whenever a bar is clicked, it will be added to the "return_ranges" location, where it can be retrieved by a call to %get_histogram_selection.

You can verify this behavior by pulling up the rt_notebook we have been working in thus far and adding in three new cells below the histogram visualization cell.

%get_histogram_selection ranges

import json

subselection = []
rng = json.loads(ranges)
for r in rng:
    subselection += rng[r]

subselection = np.array(subselection)
print("Mean: ", subselection.mean())
print("Std Deviation: ", subselection.std())
print("Variance: ", subselection.var())

Note Roundtrip loads data back into the supplied Jupyter-scoped variable asynchronously. This means that Python code run in the same cell as a retrieval will not be able to access the loaded data when executed. So be sure to always run your retrieval function in a separate cell from calculations which will use the data.


Reload your kernel and then run the cells up to the visualization. Click on a few bars and then run the %get_histogram_selection cell. After this, run the subsequent two cells and you will see that the returned data has been successfully loaded into the notebook, ready for analysis.

Your notebook should look something like this now:

An image of a histogram with a normal distribution. The rightmost bars are colored blue, indicating that they have been selected.

Binding Jupyter-Scoped Variables to Javascript with the '?' operator

If you have followed along with everything thus far, you should now have a fancy web-packed JavaScript visualization that supports loading real data from a notebook and returning data back to that notebook. However, you may notice that we have to do a lot of clicking if we modify our source data and want to re-visualize it. Returning data manually, as demonstrated, can also be frustrating, requiring extra clicks and runs after interacting with our vis.

Wouldn't it be great if we could bind our Jupyter-scoped variables to our visualizations so that our vis automatically refreshes when a variable is updated? It would be equally helpful to have that binding go both ways, so that changes made in our visualization effortlessly update the bound variables.

Fortunately, Roundtrip provides a tool for exactly that: the ? operator.

The ? operator can be thought of as a combination pass-by-reference, auto-refresh switch for your Roundtrip-managed visualizations. From the notebook user's perspective, it's as simple as placing a ? before any variable they pass as an argument to a Roundtrip-enabled line magic function. In the example of this tutorial, it will change:

%histogram normal

into:

%histogram ?normal ?ranges
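To see what such a call looks like from the Python side, here is a small sketch. It assumes the line magic receives the argument string unchanged; split_watch_flag is a hypothetical helper written for illustration and is not how Roundtrip necessarily handles the prefix internally.

```python
def split_watch_flag(arg):
    """Hypothetical helper: separate a leading '?' from a variable name."""
    if arg.startswith("?"):
        return arg[1:], True
    return arg, False

# the argument text that follows the magic name in the cell
line = "?normal ?ranges"
parsed = [split_watch_flag(arg) for arg in line.split()]
print(parsed)  # [('normal', True), ('ranges', True)]
```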

For the visualization developer, a little more configuration is required. In our rt_vis.py file we will make the following changes:

. . .
#defined in the global scope above our classes
def py_to_js(data):
  import json
  import numpy
  # numpy arrays are not JSON serializable, so convert them to lists first
  if isinstance(data, numpy.ndarray):
    return json.dumps(data.tolist())
  # plain lists can be serialized directly
  if isinstance(data, list):
    return json.dumps(data)
  return data

def js_to_py(json_data):
  import json
  return json.loads(json_data)

. . .
    @line_magic
    def histogram(self, line):
        args = line.split()

        RT.load_webpack('dist/webpack_bundle.html', cache=False)
        
        if(len(args) > 0):
            RT.var_to_js(args[0], "hist_data", watch=True, to_js_converter=py_to_js)
        
        if(len(args) > 1):
            RT.var_to_js(args[1], "return_ranges", watch=True, from_js_converter=js_to_py)
          
        RT.initialize()
. . .

Here you will notice that we have added a second var_to_js() call and added some optional arguments to our original RT.var_to_js() call. We have also defined some converter functions py_to_js and js_to_py; these converter functions will enable us to manage the unique data which our visualization can support.

Variable Watching

Of primary importance to this functionality is the addition of the watch=True argument. This argument tells Roundtrip to keep track of the Jupyter-scoped variable passed in to this function as well as the Javascript key in the second argument. It will monitor both of these variables in their respective code bases and update their partner on-the-fly when a change has been detected.

The watch argument is False by default. If a notebook user provides a ? variable to a visualization which does not have watch enabled, then Roundtrip will print a warning notifying the notebook user that variable watching is not supported.

Converters

The py_to_js conversion function (passed as to_js_converter) has one argument, data, which will be automatically populated with whatever data the notebook user passed to var_to_js. This function is expected to return a stringified, encoded version of the passed data.

The js_to_py function similarly has one argument which will be automatically populated with the string-encoded data placed in the Javascript Roundtrip object at a particular key. In this example, we can save the notebook user some work by decoding the JSON data returned from the click interaction and returning a Python dict.


Note

Converter functions must be defined outside of a class as normal python functions. Defining them at the top of your visualization loader file, like we show in this example, is recommended.



Note

Be sure to lazy load the imports required for your conversions by placing them inside the conversion functions, as we show here. This ensures that a library is available and loaded when the converter is called.



Note

If you wish to use the data returned from this visualization in a secondary histogram called in another cell, ensure that this output converter returns the data in the same format as the input converter expects. In this example we would need to return a list or a numpy array instead of the dictionary of ranges which we return currently. For more information see Data Binding and Automatic Updating.
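One possible shape for such an output converter is sketched below. js_to_py_flat is a hypothetical variant of js_to_py, and the sample string assumes the stored JSON maps range labels to arrays of numbers, as the click handler in this tutorial produces. It is one way to approach the problem, not the only one.

```python
def js_to_py_flat(json_data):
    """Hypothetical variant of js_to_py that returns one flat list of values."""
    import json  # imported lazily, inside the converter

    selected = json.loads(json_data)
    flat = []
    for values in selected.values():
        flat.extend(values)
    return flat

# hypothetical payload shaped like the stored selection
sample = '{"[-1, 0]": [-0.4, -0.1], "[0, 1]": [0.2]}'
print(js_to_py_flat(sample))  # [-0.4, -0.1, 0.2]
```

Returning a flat list keeps the output in the same format that the input converter accepts, at the cost of discarding the range labels.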


By adding watch=True and defining these custom converters for our two arguments to the histogram we have now enabled automatic updating and two-way data binding on our visualization.

Once the user re-runs the visualization with the ? arguments they will be registered with Roundtrip and will cause the visualization to reload if we run a cell which changes their content. A data analyst now only needs to manipulate their data in their scripts and the changes will be automatically reflected in the visualization. Furthermore, if data is modified in the visualization and stored in our Javascript Roundtrip object it will be automatically stored in the associated variable passed to our line magic function.

In this example, changes to normal will cause the visualization to reload with new data. Point-and-click selections of individual bars will automatically populate ranges with the selected data. We can now take the data in ranges and manipulate it just like we did in the prior example; however, since we supplied a custom converter, the user no longer has to decode the JSON-formatted data. We could also save the user the work of concatenating the returned data by doing so in our converter, if we so chose.

With these watched variables your notebook should now look something like this:

An excerpt of a notebook with a histogram bound to Jupyter data using the "?" operator.

Statistical descriptions of data returned from the roundtrip histogram visualization.

Feel free to play around with this and see what this can do. We suggest some of the following exercises, if you want to explore this functionality in more detail.

Try and generate other distributions and watch the visualization update with each new run of the data generating code.

Click around in the visualization and run any other cell with ranges to see what the output looks like and how it changes.

Add a new cell with a second call to %histogram ?subselection and see how it changes when you run the cell where subselection is generated.

Figure out how to modify the js_to_py converter so that you can pass ranges directly back into another histogram.


Note

If you add or remove cells above a tracked visualization be sure to reload the kernel. Roundtrip does not support variable binding and automatic updating with moving, adding or deleting cells without a kernel refresh.