CS 8395-03 - Visual Analytics & Machine Learning
Resources
High-Level Authoring Tools
When you are first gathering/cleaning data and you would like to produce some initial visualizations, I strongly recommend using one of the following tools:
- matplotlib For Python. A standard, general-purpose plotting library, but you can do a fair amount with it.
- ggplot2 For R, this library is more geared towards data visualization than matplotlib; the gg refers to Leland Wilkinson’s Grammar of Graphics.
- Vega-Lite A so-called Grammar of Interactive Graphics, this tool lets you easily declare how to map your data to marks and channels. It supports multiple views, as well as a means of declaring interactions.
- Altair Python library built on Vega-Lite.
- Bokeh Another Python library for data visualization. Good support for interactive visualizations.
- Data Illustrator Like Adobe Illustrator, but for the creation of data visualizations. Intended to have visualization authoring be akin to creating graphics through design tools, and does not require any programming.
- Charticulator Akin to Data Illustrator, but has a larger emphasis on the creation of complex layouts.
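To make "declaring how to map your data to marks and channels" concrete, here is a minimal sketch that builds a Vega-Lite specification as a plain Python dictionary and serializes it to JSON, using only the standard library. The toy data, its field names, and the helper `scatter_spec` are hypothetical and exist only for illustration; the v5 schema URL is an assumption about which Vega-Lite version you target.

```python
import json

# Hypothetical toy data; the field names are made up for illustration.
rows = [
    {"height": 1.62, "weight": 58.0, "group": "a"},
    {"height": 1.75, "weight": 72.5, "group": "b"},
    {"height": 1.80, "weight": 80.1, "group": "a"},
]


def scatter_spec(data, x, y, color):
    """Build a Vega-Lite spec: point marks, with fields bound to x, y, and color channels."""
    return {
        # Assumed schema version; adjust to the Vega-Lite release you use.
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": data},   # inline data
        "mark": "point",            # the mark type
        "encoding": {               # channel -> field mappings
            "x": {"field": x, "type": "quantitative"},
            "y": {"field": y, "type": "quantitative"},
            "color": {"field": color, "type": "nominal"},
        },
    }


spec = scatter_spec(rows, "height", "weight", "group")
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The printed JSON can be pasted into the online Vega editor to render the chart. Altair generates specifications of this same shape from Python, so reading one by hand is a reasonable way to see what the higher-level library is doing.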
Libraries that support Bespoke Visualizations
The above tools may not be sufficient for creating more complex visualizations that require multiple views and intricate coordination of views. For these ends, please see the following:
- d3 Javascript library for creating visualizations. Your visualization is represented as SVG elements that d3 creates and updates from your data.
- React Javascript library for building user interfaces.
- Stardust Scalable Javascript library for creating visualizations. Uses WebGL for efficient rendering, which could be useful if you have a large amount of data to show.
- Three 3D rendering library in Javascript, backed by WebGL.
- Leaflet Javascript library for interactive maps.
- Crossfilter Javascript library for interactions with multivariate datasets, good support for linked views.
- Shiny For R; you can create some reasonably complex, responsive visualizations with this library.
- VTK C++/Python library for creating 3D visualizations. Requires basic understanding of 3D graphics.
- Paraview Visualization tool for creating 3D visualizations. Supports a large set of filters for data processing and visualization. Has support for scripting, so it is customizable to a certain extent.
- OpenGL When all else fails … you can’t beat using straight OpenGL for creating data visualizations. Available for most languages (C++, Python, Javascript, etc.). Steep learning curve, however.
D3 Resources
Should you decide to use D3 as your library for visualization, please see the additional resources below for further reference:
SVG Reference
Javascript Basics
D3 Basics
- API Reference
- How Selections Work
- General Update Pattern I, II, and III
- Nested Selections
- Thinking with Joins
- Mister Nester
Useful Blocks
- Mouse Enter/Exit
- Mouse Move
- Circle-Polygon Intersection
- Geometric Zooming
- Semantic Zooming
- Lab and HCL Color Spaces
In addition, I have a series of lectures on D3 that you might find useful. These are intended for you to download and experiment with on your computer.
- Web Programming, Javascript, SVG
- D3: selection and transformation
- D3: the data join
- D3: scales
- D3: events and transitions
- D3: odds and ends
Machine Learning Libraries
- TensorFlow
- PyTorch TensorFlow and PyTorch are the predominant deep learning libraries for Python. They allow one to build computation graphs (e.g. your network), define loss functions and optimization methods, and offer some support for data management when training the model. They have similar learning curves.
- Keras Keras wraps a lot of the functionality in existing ML libraries, including TensorFlow, to make experimentation easier.
- scikit-learn scikit-learn offers a whole host of ML techniques (classification, clustering, dimensionality reduction, etc.), but does not have support for training deep neural networks.
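As a rough illustration of the pieces mentioned above (a model, a loss function, and an optimization method), here is a plain-Python sketch of the training loop that deep learning libraries automate. It does not use TensorFlow or PyTorch: the gradients are derived by hand for a one-variable linear model, whereas those libraries compute them automatically from the computation graph. The toy data, learning rate, and step count are assumptions chosen only for illustration.

```python
# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def predict(w, b, x):
    """The 'model': a single linear unit."""
    return w * x + b


def mse(w, b):
    """The loss function: mean squared error over the toy data."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(w, b):
    """Hand-derived partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(2.0 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2.0 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
    return dw, db


# The 'optimizer': plain gradient descent with a fixed learning rate.
w, b = 0.0, 0.0
lr = 0.05
initial_loss = mse(w, b)
for step in range(2000):
    dw, db = gradients(w, b)
    w -= lr * dw
    b -= lr * db
final_loss = mse(w, b)

print(f"w={w:.3f} b={b:.3f} loss={final_loss:.6f}")
```

In TensorFlow or PyTorch the `gradients` function disappears: you write only the forward computation and the loss, and the library differentiates it for you, which is what makes larger networks practical.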
Demos
Here is a set of web-enabled demos for some of the papers we will cover throughout the semester: