D3: Nests and Stacks

Grouping with d3.nest()

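If we want to produce a grouped bar plot (as in the Graphical Perception paper), then there are three main bits of information associated with our data that we will need: the group that each bar belongs to, the identity of each bar within its group, and the quantitative value that sets each bar's height.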

We will use nesting to reorganize our data array, giving us a data structure where key gives us the group, and values gives us an array of data for the given group. We will then create a band scale for group, a band scale for bars, and a linear scale for the bar’s height.
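To make the nested structure concrete, the following is a minimal plain-JavaScript sketch of the kind of array that nesting by group produces. It does not use D3: nestByKey is a hypothetical helper written only for illustration, and the sample rows are made up.

```javascript
// Hand-written stand-in for d3.nest().key(keyFn).entries(rows).
// nestByKey is a hypothetical helper, not part of D3.
function nestByKey(rows, keyFn) {
  const lookup = new Map(); // key -> entry, in first-seen order
  for (const row of rows) {
    const key = String(keyFn(row));
    if (!lookup.has(key)) lookup.set(key, { key: key, values: [] });
    lookup.get(key).values.push(row);
  }
  return Array.from(lookup.values());
}

// Made-up rows in the same shape as the tutorial's data (h, id, group).
const sampleRows = [
  { h: 10, id: 'b0', group: 'g1' },
  { h: 25, id: 'b0', group: 'g2' },
  { h: 40, id: 'b1', group: 'g1' },
  { h: 15, id: 'b1', group: 'g2' }
];

const nestedRows = nestByKey(sampleRows, d => d.group);
console.log(JSON.stringify(nestedRows, null, 1));
```

Each entry's key is what the group band scale positions, and each entry's values array supplies the data for the bars drawn inside that group.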

Here is code for achieving this:

// Select the SVG and reserve some padding for the axes.
var svg0 = d3.select('#svg0');
var range_pad = 20;
var width = svg0.attr('width')-range_pad, height = svg0.attr('height')-range_pad;

// Generate 10 sample values, shuffle them, and assign them alternately to groups g1 and g2.
var samples = d3.range(10).map(d => 10*Math.pow(10,d/12));
d3.shuffle(samples);
var data = samples.map((d,i) => ({'h':d,'id':'b'+Math.floor(i/2),'group':i%2==0 ? 'g1' : 'g2'}));

// Nest by group: each entry has a key (the group) and values (that group's data).
var nested_data = d3.nest()
	.key(d => d.group)
	.entries(data);

// Band scale for groups, band scale for bars within a group, linear scale for bar height.
var group_keys = nested_data.map(d => d.key);
var id_keys = d3.set(data, d => d.id).values();
id_keys.sort();
var group_scale = d3.scaleBand().domain(group_keys).range([range_pad,width]).paddingInner(0.3);
var x_scale = d3.scaleBand().domain(id_keys).range([0,group_scale.bandwidth()]).padding(0.1);
var y_scale = d3.scaleLinear().domain([0,d3.max(data, d => d.h)]).range([height,range_pad]);

// Outer data join: one <g> per group, translated to the start of the group's band.
svg0.selectAll('empty').data(nested_data).enter().append('g')
	.attr('transform', d => 'translate('+group_scale(d.key)+',0)');

// Inner data join: one <rect> per datum within each group.
svg0.selectAll('g').selectAll('empty').data(d => d.values).enter().append('rect')
	.attr('x', d => x_scale(d.id)).attr('width', x_scale.bandwidth())
	.attr('y', d => y_scale(d.h)).attr('height', d => (y_scale(0)-y_scale(d.h)))
	.attr('fill', d3.hcl(20,30,70)).attr('stroke', '#000');

// Axes are appended last, so the selectAll('g') above only matches the group elements.
svg0.append('g').attr('transform', 'translate(0,'+height+')').call(d3.axisBottom(group_scale));
svg0.append('g').attr('transform', 'translate('+range_pad+',0)').call(d3.axisLeft(y_scale));



To summarize: we first nest along the group attribute, producing an array in which each item contains all data that share the same group value (e.g. “g1”, “g2”). We then perform a nested data join, first for the groups, and then for the array within each group, to give us the individual bar marks.
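The two-level join can be pictured as two nested loops, where the outer loop supplies a group offset and the inner loop positions each bar relative to it. The sketch below is plain JavaScript with no DOM and no D3. simpleBand is a hypothetical, padding-free stand-in for d3.scaleBand (so the numbers will differ from a padded D3 scale), and the nested array is made up.

```javascript
// Simplified band scale: splits [start, stop] into equal bands, one per
// domain value, with no padding. An assumption for illustration only.
function simpleBand(domain, start, stop) {
  const bandwidth = (stop - start) / domain.length;
  const scale = key => start + domain.indexOf(key) * bandwidth;
  scale.bandwidth = () => bandwidth;
  return scale;
}

// Made-up nested data in the key/values shape produced by nesting.
const nestedGroups = [
  { key: 'g1', values: [{ id: 'b0', h: 10 }, { id: 'b1', h: 40 }] },
  { key: 'g2', values: [{ id: 'b0', h: 25 }, { id: 'b1', h: 15 }] }
];

// One band per group, then one band per bar inside a single group's band.
const groupBand = simpleBand(['g1', 'g2'], 0, 200);
const barBand = simpleBand(['b0', 'b1'], 0, groupBand.bandwidth());

const bars = [];
for (const group of nestedGroups) {      // outer join: one <g> per group
  const offset = groupBand(group.key);   // plays the role of translate(...)
  for (const d of group.values) {        // inner join: one <rect> per datum
    bars.push({
      group: group.key,
      id: d.id,
      x: offset + barBand(d.id),         // absolute x of the bar
      width: barBand.bandwidth()
    });
  }
}
console.log(bars);
```

Because the inner scale's range is [0, bandwidth of a group], a bar's final position is simply the group's offset plus its position within the group.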

Stacking with d3.stack()

For producing stacked bars (as in Graphical Perception), we use d3.stack. This function assumes the data is formatted in a specific way: each object in our data array corresponds to a single stack. Each object must contain the individual attributes that are to be stacked, with each attribute stored under a unique property name whose value is the quantitative data that we aim to encode. In the previous example, the quantitative value was assigned to a property h, and its bar id was assigned to a property id. For stacking, we need to pack all values of a single group together, so each object should now have one property per bar id (b0, b1, and so on), each holding the corresponding data value.
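As an illustration of this reshaping, here is a small plain-JavaScript sketch that turns one-object-per-bar data into one-object-per-group data. toWide is a hypothetical helper name and the values are made up; the tutorial's own listing builds the same shape directly from its samples array.

```javascript
// Long format: one object per bar (made-up values).
const longRows = [
  { h: 10, id: 'b0', group: 'g1' },
  { h: 25, id: 'b0', group: 'g2' },
  { h: 40, id: 'b1', group: 'g1' },
  { h: 15, id: 'b1', group: 'g2' }
];

// Wide format: one object per group, with one property per bar id.
// toWide is a hypothetical helper, not part of D3.
function toWide(rows) {
  const byGroup = new Map();
  for (const row of rows) {
    if (!byGroup.has(row.group)) byGroup.set(row.group, { group: row.group });
    byGroup.get(row.group)[row.id] = row.h; // e.g. wide.b0 = 10
  }
  return Array.from(byGroup.values());
}

const wideRows = toWide(longRows);
console.log(wideRows);
// [ { group: 'g1', b0: 10, b1: 40 }, { group: 'g2', b0: 25, b1: 15 } ]
```

The group property is kept on each object so that a stacked bar can later be placed on the group band scale.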

Given our data, the main thing we need to do is provide a way to access the individual values that make up each stack. This is achieved by calling the keys function on a stack object, passing the names of the data attributes that contain the values to be stacked. Note that d3.stack() returns a stack generator, which is itself a function: to compute the stacked layout, we must call that generator on the data array.

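What is returned by stack is a multidimensional array that is a bit complicated. Stacking computes a cumulative sum per data item. Let’s break it down: the outer array holds one series per key, in the order the keys were given; each series holds one entry per data item; and each entry is a two-element array [y0, y1], where y0 is the running total before that key's value is added and y1 is the running total after. Each series also carries its key, and each entry carries a data property that refers back to the original data item.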
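The cumulative sums can be sketched in plain JavaScript. The function below imitates what a stack generator with default settings computes, as far as this tutorial relies on it; stackSketch is a hypothetical name, the input values are made up, and this is not D3's actual implementation.

```javascript
// Imitation of stacking with default settings: for each key, in order,
// record [running total before, running total after] for every data item.
// stackSketch is a hypothetical helper, not part of D3.
function stackSketch(rows, keys) {
  const totals = rows.map(() => 0); // running total per data item
  return keys.map((key, index) => {
    const series = rows.map((row, i) => {
      const y0 = totals[i];
      const y1 = y0 + row[key];
      totals[i] = y1;
      const entry = [y0, y1];
      entry.data = row;             // back-reference to the original object
      return entry;
    });
    series.key = key;
    series.index = index;
    return series;
  });
}

// Made-up data: one object per group, one property per bar id.
const wide = [
  { group: 'g1', b0: 10, b1: 40 },
  { group: 'g2', b0: 25, b1: 15 }
];

const stacked = stackSketch(wide, ['b0', 'b1']);
// stacked[0] is the 'b0' series: [0, 10] for g1 and [0, 25] for g2
// stacked[1] is the 'b1' series: [10, 50] for g1 and [25, 40] for g2

// The tallest stack is the largest upper value across all entries.
const tallest = Math.max(...stacked.flat().map(d => d[1])); // 50
console.log(tallest);
```

This shape is why the drawing code reads d[0] and d[1] for a rectangle's bottom and top, and d.data.group to find which stack the rectangle belongs to.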

Here is an example:

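// Second chart. This listing reuses samples, id_keys and group_keys from the grouped bar example.
var svg1 = d3.select('#svg1');
var range_pad = 30;
var width = svg1.attr('width')-range_pad, height = svg1.attr('height')-range_pad;

// Reshape into one object per group, with one property per bar id (b0, b1, ...).
var data_1 = [{},{}];
samples.forEach((sample,i) => {
	var g_id = i%2 == 0 ? 0 : 1; // declared locally to avoid creating an implicit global
	data_1[g_id]['b'+Math.floor(i/2)] = sample;
	data_1[g_id]['group'] = 'g'+(g_id+1);
});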

// Build the stack generator, then call it on the data to get one series per key.
var stacker = d3.stack()
	.keys(id_keys);
var stacked_data = stacker(data_1);

// Flatten the series to find the tallest stack, which sets the y domain.
var flattened_stack = d3.merge(stacked_data);
var max_stack = d3.max(flattened_stack, d => d[1]);
var stack_group_scale = d3.scaleBand().domain(group_keys).range([range_pad,width]).paddingInner(0.3).paddingOuter(0.05);
var stack_y_scale = d3.scaleLinear().domain([0,max_stack]).range([height,range_pad]);

// Outer data join: one <g> per series, i.e. one per bar id.
svg1.selectAll('empty').data(stacked_data).enter().append('g');

// Inner data join: one <rect> per entry; d[0] and d[1] are the bottom and top of the segment,
// and d.data is the original object for that group.
svg1.selectAll('g').selectAll('empty').data(d => d).enter().append('rect')
	.attr('x', d => stack_group_scale(d.data.group)).attr('width', stack_group_scale.bandwidth())
	.attr('y', d => stack_y_scale(d[1])).attr('height', d => (stack_y_scale(d[0])-stack_y_scale(d[1])))
	.attr('fill', d3.hcl(20,30,70)).attr('stroke', '#000');

// Axes are appended last, so the selectAll('g') above only matches the series elements.
svg1.append('g').attr('transform', 'translate(0,'+height+')').call(d3.axisBottom(stack_group_scale));
svg1.append('g').attr('transform', 'translate('+range_pad+',0)').call(d3.axisLeft(stack_y_scale));