Tutorial
Assuming you have gone through the quick-start tutorial, you should now be ready to learn about the more advanced and dynamic features of Roundtrip.
In the quick start, we loaded a single, static HTML file. While helpful for verifying that your install is working, a single file cannot support even a simple visualization, so we will now explore how to load multiple files and modify the HTML with your own CSS and JavaScript code.
Picking up from the Quickstart, you should now have a directory that looks like this:
rt_tutorial
├── rt_templ.html
├── rt_vis.py
├── rt_notebook.ipynb
We are going to expand on this directory by adding a new file, rt_script.js.
touch rt_script.js
Inside of this file we will write a simple script which adds a line to our html:
function createTextElement(type, text) {
    // create a new element of the requested type
    const newNode = document.createElement(type);
    newNode.className += 'from-js';
    // and give it some content
    const newContent = document.createTextNode(text);
    // add the text node to the newly created element
    newNode.appendChild(newContent);
    return newNode;
}

const node = createTextElement("p", "I was created and added in the JavaScript, and styled by the CSS!");
element.appendChild(node);
Inside of this code we define a function which makes building text nodes very easy. Then we use it to add a paragraph node to our view area, containing the text "I was created and added in the JavaScript, and styled by the CSS!".
Note
In this example we use element, a variable which we never defined. Roundtrip exposes the node containing all of your loaded web files in this variable. To avoid conflicts with other cells, use element as you would use document when selecting and modifying the DOM. This keeps your code safely inside the visualization area of the current Jupyter cell.
We can now add this JavaScript code into our visualization by modifying the rt_vis.py file to specify one more file path in our call to RT.load_web_files():
...
    @line_magic
    def hello_rt(self, line):
        # load files
        RT.load_web_files(["rt_templ.html", "rt_script.js"])
        RT.initialize()
...
When you run the %hello_rt cell in your rt_notebook you should see the text added, but you should also notice that it is not styled. That's because we have not yet added a CSS file to our code. Do this by creating a new file:
touch rt_style.css
Now add some styles to your css file:
h1 {
    color: rgba(50,50,180,1);
}
p {
    color: brown;
    font-size: medium;
    font-family: 'Times New Roman', Times, serif;
}
And add the new file to our list of loaded files as before:
...
    @line_magic
    def hello_rt(self, line):
        # load files
        RT.load_web_files(["rt_templ.html", "rt_script.js", "rt_style.css"])
        RT.initialize()
...
When you reload and re-run your notebook you will see that the big text is now colored blue and the smaller text is now brown. This workflow is best suited for vanilla JS and simple visualizations, but it allows you to inject many files of any supported type (CSS, HTML, and JS).
Since this is a library to support complex JavaScript visualizations, we can expect that code bases will grow increasingly complex and profit from the many benefits provided by packaging libraries. To support more complicated visualizations, we suggest a workflow for managing and loading your code using webpack. This is the recommended approach to loading web files into Roundtrip.
You will need Node.js and Webpack installed on your system for this portion of the tutorial.
We will also need the HTML webpack plugin (html-webpack-plugin), the style and CSS loaders, as well as d3. Install these from the rt_tutorial root directory:
npm install html-webpack-plugin
npm install style-loader css-loader
npm install d3
At this point your directory should look like this:
rt_tutorial
├── rt_templ.html
├── package.json
├── rt_vis.py
├── rt_notebook.ipynb
├── rt_style.css
├── rt_script.js
You can now create two new directories called src and dist, create a webpack.config.js file, and populate src with an HTML, a JavaScript and a CSS file.
mkdir src
mkdir dist
touch webpack.config.js
cd src
touch webpack_templ.html webpack_script.js webpack_style.css
After these operations your directory should now look like this:
rt_tutorial
├── rt_templ.html
├── package.json
├── webpack.config.js
├── rt_vis.py
├── rt_notebook.ipynb
├── rt_style.css
├── rt_script.js
├── dist
└── src
    ├── webpack_style.css
    ├── webpack_script.js
    └── webpack_templ.html
We will populate webpack_templ.html with just a heading and an svg:
<h3>My Histogram</h3>
<svg id='hist-canvas'></svg>
We will put some style information for our histogram bars in webpack_style.css:
.idle{
    fill: indianred;
    stroke: royalblue;
    stroke-width: 1px;
}
.hovered{
    fill: royalblue;
    stroke: rgb(215, 215, 216);
    stroke-width: 1px;
}
.labels{
    fill: black;
    font: 14px sans-serif;
}
We will put the following into our webpack_script.js file:
import * as d3 from 'd3';
import './webpack_style.css'

class Histogram{
    constructor(svg, width, height, data){
        this.svg = svg;
        this.width = width;
        this.height = height;
        this.margins = {
            left: 40,
            right: 20,
            top: 20,
            bottom: 20
        }
        this.bar_margin = 7;
        this.data = data;
        this.maxMin = this.getMinMax(data);

        let bins = d3.bin();
        this.data_bins = bins(this.data);
        this.num_bins = this.data_bins.length;
        this.bindomains = this.getBinDomains(this.data_bins);

        this.xscale = d3.scaleLinear().range([this.margins.left, this.width-this.margins.right]).domain([0,this.num_bins]);
        this.yscale = d3.scaleLinear().range([this.margins.top+this.margins.bottom,this.height]).domain([this.bindomains.max, this.bindomains.min]);
    }

    getMinMax(arr){
        let minmax = {
            "min": Number.POSITIVE_INFINITY,
            "max": Number.NEGATIVE_INFINITY
        }
        for(let val of arr){
            minmax.min = Math.min(minmax.min, val);
            minmax.max = Math.max(minmax.max, val);
        }
        return minmax;
    }

    getBinDomains(bins){
        let minmax = {"max": Number.NEGATIVE_INFINITY, "min": 0}
        for(let bin of bins){
            minmax.max = Math.max(minmax.max, bin.length);
        }
        return minmax
    }

    draw(){
        let bar_width = (this.width/this.data_bins.length)-this.bar_margin;
        let leftAxis = d3.axisLeft(this.yscale);

        this.svg.append('g').attr('transform',`translate(${this.margins.left}, ${-this.margins.top})`).call(leftAxis);
        this.svg.attr('height', this.height)
            .attr('width', this.width);

        let bars = this.svg
            .selectAll('.bars')
            .data(this.data_bins);
        let labels = this.svg
            .selectAll('.labels')
            .data(this.data_bins);

        bars.enter().append('rect')
            .attr('class', 'bars idle')
            .attr('width', bar_width)
            .attr('height', (d)=>{return this.height-this.yscale(d.length)})
            .attr('x', (d,i)=>{return this.xscale(i)})
            .attr('y', (d)=>{return this.yscale(d.length)-this.margins.bottom})
            .on('mouseover', (evt, d)=>{
                d3.select(evt.target).attr('class', 'bars hovered');
            })
            .on('mouseout', (evt,d)=>{d3.select(evt.target).attr('class', 'bars idle');});

        labels.enter()
            .append('text')
            .attr('class', 'labels')
            .attr('x', (d,i)=>this.xscale(i)+(0.5*bar_width))
            .attr('y', this.height - 5)
            .attr('text-anchor', 'middle')
            .html(d=>{return `[${d.x0}, ${d.x1}]`})
    }
}

//setup
const svg_select = d3.select(element).select('#hist-canvas');
const data = [];
const num_data = 1000;
for(let i = 0; i < num_data; i++){
    data.push(Math.random()*100);
}

//construction & initial draw
const hist = new Histogram(svg_select, 800, 400, data);
hist.draw();
In our script file you will notice that our code has increased in complexity compared to our prior examples. Using webpack we can now import our CSS files directly, in addition to any JavaScript libraries we download through npm. Additionally, we can now use ES6 syntax to create a class that manages our visualization: Histogram.
Inside of this class we provide two functions for getting the domains of our data for d3 scales. We also use the d3.bin generator to automatically sort our data into bins. The draw function also provides some styling on hover (mouseover and mouseout).
At the bottom of this code you will see that we are randomly generating data. Because Math.random() draws from a uniform distribution, the resulting visualization should show bars of roughly the same height.
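The expectation of roughly equal bars can be checked outside the browser. The following is a minimal Python sketch, assuming ten equal-width bins over the range 0 to 100 rather than the thresholds d3 would choose; the function name bin_counts is hypothetical.

```python
import random


def bin_counts(values, num_bins=10, low=0.0, high=100.0):
    """Count how many values fall into each of num_bins equal-width bins."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for value in values:
        # Clamp so that a value equal to `high` lands in the last bin.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


random.seed(0)  # fixed seed so repeated runs give the same sample
data = [random.random() * 100 for _ in range(1000)]  # mirrors Math.random()*100
counts = bin_counts(data)
print(counts)
```

With 1000 uniform samples, each of the ten counts should land somewhere near 100, which is why the rendered bars end up at similar heights.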
Next, we are going to add a new loader to our rt_vis.py file.
. . .
@magics_class
class MyWebpackVis(Magics):
    def __init__(self, shell):
        super(MyWebpackVis, self).__init__(shell)
        self.shell = shell

    @line_magic
    def histogram(self, line):
        RT.load_webpack('dist/webpack_bundle.html')
        RT.initialize()


def load_ipython_extension(ipython):
    ipython.register_magics(MyRTVis)
    ipython.register_magics(MyWebpackVis)
Finally, in the webpack.config.js file we will provide the following settings:
const HtmlWebpackPlugin = require('html-webpack-plugin');
const path = require('path');

module.exports = {
    module:{
        rules:[
            {
                test: /\.css$/i,
                use: ["style-loader", "css-loader"]
            },
        ]
    },
    entry: {
        script: ['./src/webpack_script.js']
    },
    output: {
        publicPath: path.resolve(__dirname, 'dist'),
        filename: '[name]_bundle.js',
        path: path.resolve(__dirname, 'dist')
    },
    optimization: {
        minimize: false
    },
    plugins:[
        new HtmlWebpackPlugin({
            template: './src/webpack_templ.html',
            chunks: ['script'],
            filename: 'webpack_bundle.html'
        })
    ],
    mode: 'production'
}
In this webpack configuration file, there are a few key details which must always be included for Roundtrip to work correctly.
First, the use of the HtmlWebpackPlugin is strongly recommended. Roundtrip's load_webpack function expects an .html file created by this plugin. This workflow benefits you as well, since the HTML file produced here will contain all the necessary references to any scripts you are using in your visualization. It will also save you from generating otherwise static HTML in your JavaScript code.
Second, note the publicPath option under output. For Roundtrip to successfully load files regardless of where the Jupyter notebook server is running, this should be the absolute path to your output directory, dist in this case. __dirname holds the path of the directory containing this webpack.config.js file.
Outside of these two details, the rest of the webpack configuration can be set up in whatever way works best for you and your workflow.
Once your configuration file is all set, you should be ready to go. You can build your code by running the following line from the rt_tutorial directory:
npx webpack
After your front-end code has been built successfully, you can run the histogram visualization from the rt_notebook in a new cell with the %histogram magic function. Your output should look more-or-less like the following figure.
Now we have a histogram which visualizes completely useless and constantly changing data. Wouldn't it be way more useful if we could pass data supplied by the notebook user to this visualization and enable them to visualize any data they want?
Fortunately, we can do that, and this is really where the magic of Roundtrip begins to shine. Roundtrip provides a function, RT.var_to_js(), for this exact purpose. It takes the name of a Jupyter-scoped variable and a key under which the data can be found in our JavaScript code, and manages the translation from Jupyter to JavaScript, making our data available for use in a loaded visualization.
To demonstrate how this works we are going to add two lines to our histogram line magic function:
    @line_magic
    def histogram(self, line):
        args = line.split()
        RT.load_webpack('dist/webpack_bundle.html', cache=False)
        if len(args) > 0:
            RT.var_to_js(args[0], "hist_data")
        RT.initialize()
The first line added, args = line.split(), is at the top of this function. This line simply parses any whitespace-separated arguments passed to our magic function and provides them as a list, which is empty when no arguments were given. This list contains the names of variables passed to the function; Roundtrip manages the extraction of data referred to by those names in the Jupyter namespace.
The next new line is RT.var_to_js(args[0], "hist_data"). In this line we pass the name of the first variable passed to %histogram and specify that the data should be found at the key hist_data on our JavaScript Roundtrip object. We have wrapped this line in an if statement to check whether the user passed an argument. We could also put this call in a try/except block or apply any other validation technique.
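The behavior of the argument parsing is easy to check in plain Python. This sketch assumes only that the magic receives everything after its name as a single string; parse_magic_args is a hypothetical helper, not part of Roundtrip.

```python
def parse_magic_args(line):
    """Split the text following a line magic into a list of variable names."""
    # str.split() with no separator drops empty strings, so a bare
    # `%histogram` produces [] instead of [''].
    return line.split()


print("".split(" "))                       # [''], length 1 even with no arguments
print(parse_magic_args(""))                # []
print(parse_magic_args("normal"))          # ['normal']
print(parse_magic_args("normal  ranges"))  # ['normal', 'ranges']
```

Splitting on a literal space returns a one-element list for an empty line, so a length check on that result is always true. Splitting on whitespace lets the check actually detect a missing argument.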
We will now modify the JavaScript in our webpack_script.js file to make use of this newly exposed data.
import * as d3 from 'd3';
import './webpack_style.css'
const RT = window.Roundtrip;
...
Near the top of our file we access the global Roundtrip object from the window variable provided by our browser. We alias it as RT for use throughout this script.
At the bottom of our file we can access the data passed in from the Jupyter side of our code.
...
let data = [];
if(Object.keys(RT).includes('hist_data')){
    data = JSON.parse(RT['hist_data']);
    //construction & initial draw
    const hist = new Histogram(svg_select, 800, 400, data);
    hist.draw();
}
else{
    svg_select.attr('width', 800)
        .append('text')
        .attr('class', 'no-data')
        .attr('y', 32)
        .html("Oh no! There is no data to show.")
}
...
We have now changed the const data to a let data so that we can reassign it with our passed data, and removed our JavaScript-based random number generation. We have also added a conditional which will print an error message in the visualization if no data has been passed.
Roundtrip stores data passed from Jupyter to the Roundtrip object in a string format and expects all data passed back into it to be in a string format or a basic datatype (int, float, char).
Note On the Python side, Roundtrip provides a default converter for translating basic Python data to JSON, but allows for users to provide custom defined converters in Python to support more complex datatypes. Conversion on the JavaScript side is currently manual and requires users to parse data retrieved from the Roundtrip object as well as encode data passed back to the Roundtrip object.
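To make the string format concrete, here is a sketch of the encoding step in plain Python. It assumes the default converter behaves like json.dumps for a list of numbers, which is an assumption rather than something stated above; the variable names are illustrative.

```python
import json

values = [0.5, 1.25, -2.0]

# Python side: encode the list as a JSON string before it is handed over.
encoded = json.dumps(values)
print(type(encoded).__name__, encoded)  # str [0.5, 1.25, -2.0]

# The JavaScript side would call JSON.parse on this string; json.loads is
# the Python equivalent and recovers an equal list.
decoded = json.loads(encoded)
print(decoded == values)  # True
```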
Finally, we need to add our new class no-data to our CSS file.
.no-data{
    fill: black;
    font: 32px sans-serif;
}
Once you have implemented the above code changes, run npx webpack again to rebuild your code. We can now generate a normally distributed dataset very easily using numpy in our Jupyter notebook and see it visualized.
Note At this point you may notice some behavioral errors when you try to re-run your data. During development we suggest that you do the following to fix some cache related errors:
- Reload the Kernel
- Clear the browser cache
- Refresh the Webpage
Related, you may at this point have tried to "run all cells". Due to the asynchronous nature of loading and passing data, Roundtrip does not support running all cells at once in a Jupyter notebook. If you run into issues related to running all cells, try the above steps to reset the Jupyter notebook to a fresh state and run each cell individually.
Create a new cell above the %histogram visualization cell and insert the following code:
import numpy as np
normal = np.random.normal(0, 1, 1000).tolist()
This creates a new list of normally distributed values for us to visualize. Since we did not supply a custom converter to our var_to_js function, we convert the array to a list so that the default converter handles the input more gracefully than it might a numpy array. We can pass this list into our visualization by providing it as the first argument to our histogram:
%histogram normal
Your output should look something like this:
At this point, we have a custom histogram visualization which works with any list of numbers the user wishes to pass into it from the Jupyter notebook. Now what if we want to return information from our visualization back to our notebook for further analysis or visualization? Roundtrip provides an interface for this.
The function, RT.fetch_data(js_var, jupyter_var), lives on the Python side of our code and takes two arguments:
- js_var - The key where our JavaScript data was stored in the Javascript Roundtrip object
- jupyter_var - The name of a Jupyter notebook-scoped Python variable where we will place retrieved data
To use this retrieval function we define a new line magic method inside of our MyWebpackVis class:
    @line_magic
    def get_histogram_selection(self, line):
        args = line.split(" ")
        RT.fetch_data("return_ranges", args[0])
This method takes in a single argument: the name of a Jupyter-scoped variable we will load our returned data into. Currently the fetch_data method does not support automatic conversion of returned data, so the data loaded into the variable named by args[0] will be a string. If the string was originally stored in JSON format, decoding it in the Jupyter notebook is easy.
Note: Functionality to supply a custom converter for this function will be implemented soon.
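Decoding the returned string in the notebook might look like the sketch below. The sample payload and the helper name decode_selection are hypothetical; the only assumption is that the JavaScript side stored a JSON-encoded value.

```python
import json


def decode_selection(raw):
    """Decode a JSON string fetched from the visualization.

    Returns None when the string is empty or is not valid JSON, for
    example when nothing has been stored on the JavaScript side yet.
    """
    if not raw:
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


sample = '{"[0, 5]": [1.5, 3.0], "[5, 10]": [7.25]}'  # hypothetical payload
print(decode_selection(sample))
print(decode_selection(""))
```

Returning None for an empty or malformed string is one design choice among several; raising an error may be preferable if a missing selection should stop the analysis.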
After we have defined where we expect the data to be found on the JavaScript side of our code and created this interface for the Jupyter notebook user, we need to modify our visualization code to return data to its reserved area: 'return_ranges'. To do that we will add a new interaction to the bars in webpack_script.js and modify our CSS in webpack_style.css.
draw(){
    . . .
    bars.enter().append('rect')
        .attr('class', 'bars idle')
        .attr('width', bar_width)
        .attr('height', (d)=>{return this.height-this.yscale(d.length)})
        .attr('x', (d,i)=>{return this.xscale(i)})
        .attr('y', (d)=>{return this.yscale(d.length)-this.margins.bottom})
        .on('mouseover', (evt, d)=>{
            d3.select(evt.target).attr('class', 'bars hovered');
        })
        .on('mouseout', (evt, d)=>{
            let range_key = `[${d.x0}, ${d.x1}]`;
            let found = Object.keys(this.selected).includes(range_key);
            if(!found){
                d3.select(evt.target).attr('class', 'bars idle');
            }
            else{
                d3.select(evt.target).attr('class', 'bars selected-bar');
            }
        })
        .on('click', (evt,d)=>{
            let range_key = `[${d.x0}, ${d.x1}]`;
            let found = Object.keys(this.selected).includes(range_key);
            if(found){
                d3.select(evt.target).attr('class', 'bars hovered');
                delete this.selected[range_key];
            }else{
                d3.select(evt.target).attr('class', 'bars selected-bar');
                this.selected[range_key] = d;
            }
            //loading return data into our roundtrip object
            RT["return_ranges"] = JSON.stringify(this.selected);
        });
    . . .
.selected-bar{
    fill: royalblue;
    stroke: indianred;
    stroke-width: 2px;
}
In our JavaScript we added a new click event and slightly modified our mouseout event to ensure that the behavior of our visualization stays consistent. Most importantly, you will see that in our click event we now add data to an object that keeps track of our selected ranges, this.selected. This object must exist before the first event fires, so initialize it as an empty object (this.selected = {}) in the Histogram constructor. At each click, we update this object and store the new selections in our Roundtrip object at the key "return_ranges", the location where our fetch_data function expects to find them.
RT["return_ranges"] = JSON.stringify(this.selected);
Note that we stringify our object before storing it. Just as we have to parse data retrieved from the Roundtrip object, we also need to stringify data before storing it. If you prefer an encoding scheme other than JSON, that is fine, but be sure to encode your data before storing it and be consistent with what your Python code expects.
Now, whenever a bar is clicked it will be added to the "return_ranges" location, where it can be retrieved by a call to %get_histogram_selection.
You can verify this behavior by pulling up the rt_notebook we have been working in thus far and adding three new cells below the histogram visualization cell.
%get_histogram_selection ranges

import json
subselection = []
rng = json.loads(ranges)
for r in rng:
    subselection += rng[r]
subselection = np.array(subselection)

print("Mean: ", subselection.mean())
print("Std Deviation: ", subselection.std())
print("Variance: ", subselection.var())
Note Roundtrip loads data back into the supplied Jupyter-scoped variable asynchronously. This means that Python code run in the same cell as a retrieval will not be able to access the loaded data when executed. So be sure to always run your retrieval function in a separate cell from calculations which will use the data.
Reload your kernel and then run the cells up to the visualization. Click on a few bars and then run the %get_histogram_selection cell. After this, run the subsequent two cells and you will see that the returned data has been successfully loaded into the notebook, ready for analysis.
Your notebook should look something like this now:
If you have followed along with everything thus far, you should now have a fancy web-packed JavaScript visualization that supports loading real data from a notebook and returning data back to that notebook. However, you may now notice that we have to do a lot of clicking if we modify our source data and want to re-visualize it. Also, returning this data manually as demonstrated can be frustrating, requiring extra clicks and runs after interacting with our vis.
Wouldn't it be great if we could bind our Jupyter-scoped variables to our visualizations so that our vis automatically refreshes when a variable is updated? Furthermore, it would be equally helpful to have that binding go both ways, so that changes made in our visualization effortlessly update our variables.
Fortunately, Roundtrip provides a tool for exactly that: the ? operator.
The ? operator can be thought of as a combination pass-by-reference, auto-refresh switch for your Roundtrip-managed visualizations. From the notebook user's perspective it's as simple as placing a ? before any variable they are passing as an argument to a Roundtrip-enabled line magic function. In the example of this tutorial, it will change:
%histogram normal
into:
%histogram ?normal ?ranges
For the visualization developer, a little more configuration is required. In our rt_vis.py file we will make the following changes:
. . .
# defined in the global scope above our classes
def py_to_js(data):
    import json
    import numpy

    if isinstance(data, numpy.ndarray):
        return json.dumps(data.tolist())
    if isinstance(data, list):
        # plain lists have no tolist() method, so encode them directly
        return json.dumps(data)
    return data


def js_to_py(json_data):
    import json

    return json.loads(json_data)
. . .
    @line_magic
    def histogram(self, line):
        args = line.split()
        RT.load_webpack('dist/webpack_bundle.html', cache=False)
        if len(args) > 0:
            RT.var_to_js(args[0], "hist_data", watch=True, to_js_converter=py_to_js)
        if len(args) > 1:
            RT.var_to_js(args[1], "return_ranges", watch=True, from_js_converter=js_to_py)
        RT.initialize()
. . .
Here you will notice that we have added a second var_to_js() call and added some optional arguments to our original RT.var_to_js() call. We have also defined two converter functions, py_to_js and js_to_py; these converter functions enable us to manage the particular data formats which our visualization supports.
Of primary importance to this functionality is the addition of the watch=True argument. This argument tells Roundtrip to keep track of the Jupyter-scoped variable passed in to this function as well as the JavaScript key in the second argument. It will monitor both of these variables in their respective code bases and update their partner on the fly when a change has been detected.
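The idea behind watching can be illustrated with a toy model. The class below is not Roundtrip's implementation and none of its names come from the library; it only shows how a change on either side can be pushed to its partner through a pair of converters.

```python
import json


class ToyBinding:
    """Keeps a Python value and a string-encoded 'JavaScript' value in sync."""

    def __init__(self, value, to_js=json.dumps, from_js=json.loads):
        self.to_js = to_js
        self.from_js = from_js
        self.py_value = value
        self.js_value = to_js(value)

    def set_from_python(self, value):
        # A change in the notebook is encoded and pushed to the JS side.
        self.py_value = value
        self.js_value = self.to_js(value)

    def set_from_js(self, encoded):
        # A change in the visualization is decoded and pushed to Python.
        self.js_value = encoded
        self.py_value = self.from_js(encoded)


binding = ToyBinding([1, 2, 3])
binding.set_from_python([4, 5])
print(binding.js_value)   # [4, 5]
binding.set_from_js("[6]")
print(binding.py_value)   # [6]
```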
The watch argument is False by default. If a notebook user provides a ? variable to a visualization which does not have watch enabled, then Roundtrip will print a warning notifying the notebook user that variable watching is not supported.
The py_to_js conversion function has one argument, data, which will be automatically populated with whatever data was passed by the notebook user to var_to_js. This function is expected to return a stringified, encoded version of the passed data.
The js_to_py function similarly has one argument, which will be automatically populated with the string-encoded data placed in the JavaScript Roundtrip object at a particular key. In this example, we can save the notebook user some work by decoding the JSON data returned from the click interaction and returning a Python dict.
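A converter pair can be exercised on its own before it is wired into Roundtrip. The sketch below is a variation on the converters above that avoids importing numpy by checking for a tolist method, which is an assumption about the inputs you expect; FakeArray is a hypothetical stand-in used only to keep the example self-contained.

```python
def py_to_js(data):
    import json  # lazy import, matching the converters above

    # numpy arrays (and similar containers) expose tolist(); plain lists
    # do not, so they are encoded directly.
    if hasattr(data, "tolist"):
        return json.dumps(data.tolist())
    if isinstance(data, list):
        return json.dumps(data)
    return data


def js_to_py(json_data):
    import json

    return json.loads(json_data)


class FakeArray:
    """Hypothetical stand-in for an array type with a tolist() method."""

    def __init__(self, items):
        self._items = list(items)

    def tolist(self):
        return list(self._items)


print(py_to_js([1, 2, 3]))              # [1, 2, 3]
print(py_to_js(FakeArray([0.5, 1.5])))  # [0.5, 1.5]
print(js_to_py('{"[0, 5]": [1.5]}'))    # {'[0, 5]': [1.5]}
```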
Note
Converter functions must be defined outside of a class as normal python functions. Defining them at the top of your visualization loader file, like we show in this example, is recommended.
Note
Be sure to lazy load the imports required for your conversions by defining them inside of the conversion functions as we show here. It ensures that a particular library is absolutely available and loaded when the converter is called.
Note
If you wish to use the data returned from this visualization in a secondary histogram called in another cell, ensure that this output converter returns the data in the same format as the input converter expects. In this example we would need to return a list or a numpy array; instead of the dictionary of ranges which we return currently. For more information see Data Binding and Automatic Updating.
By adding watch=True and defining these custom converters for our two arguments to the histogram, we have now enabled automatic updating and two-way data binding on our visualization.
Once the user re-runs the visualization with the ? arguments, they will be registered with Roundtrip and will cause the visualization to reload whenever we run a cell which changes their content. A data analyst now only needs to manipulate their data in their scripts and the changes will be automatically reflected in the visualization. Furthermore, if data is modified in the visualization and stored in our JavaScript Roundtrip object, it will be automatically stored in the associated variable passed to our line magic function.
In this example, changes to normal will cause the visualization to reload with new data. Point-and-click selections of individual bars will automatically populate ranges with the selected data. We can now take the data in ranges and manipulate it just like we did in the prior example; however, since we supplied a custom converter, the user no longer has to decode the JSON-formatted data. We could also save the user the work of concatenating the returned data by doing it in our converter, if we so chose.
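One possible shape for such a concatenating converter is sketched below. It assumes the stored value is a JSON object mapping each range label to the list of values in that bin, as produced by the click handler shown earlier; the name js_to_py_flat is hypothetical.

```python
def js_to_py_flat(json_data):
    import json  # lazy import, as recommended for converters

    ranges = json.loads(json_data)
    flat = []
    # Concatenate the values of every selected bin into one list.
    for values in ranges.values():
        flat.extend(values)
    return flat


sample = '{"[0, 5]": [1.5, 3.0], "[5, 10]": [7.25]}'  # hypothetical payload
print(js_to_py_flat(sample))  # [1.5, 3.0, 7.25]
```

The trade-off is that the range labels are discarded, so this form suits cases where only the selected values matter.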
With these watched variables your notebook should now look something like this:
Feel free to play around with this and see what this can do. We suggest some of the following exercises, if you want to explore this functionality in more detail.
Try and generate other distributions and watch the visualization update with each new run of the data generating code.
Click around in the visualization and run any other cell with ranges to see what the output looks like and how it changes.
Add a new cell with a second call to %histogram ?subselection and see how it changes when you run the cell where subselection is generated.
Figure out how to modify the js_to_py converter so that you can pass ranges directly back into another histogram.
Note
If you add or remove cells above a tracked visualization be sure to reload the kernel. Roundtrip does not support variable binding and automatic updating with moving, adding or deleting cells without a kernel refresh.