### Scatter vs Jitter

So I have a client that wants to visualize two variables for a critical chart being sent to the entire company. Naturally, with two variables they were inclined to select a scatter plot. However, the relationship between these two variables is not important to the audience, even though the variables are highly correlated.

Another issue is that the inputs for these variables are on an ordinal scale, so some points have identical values and end up stacked on top of one another. This makes it hard to get an accurate sense of the distribution of the points for each variable, which is very important.
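
To get a feel for how much stacking is going on before picking a chart, you can count how many points share exactly the same pair of values. This is only a rough sketch in Python: the `points` list is made-up data on an assumed 1-5 ordinal scale, and `stacked_points` is a hypothetical helper name, not anything from Tableau.

```python
from collections import Counter

# Made-up ordinal scores (assumed 1-5 scale) for the two variables.
points = [(3, 4), (3, 4), (2, 2), (5, 5), (3, 4), (2, 2), (1, 3)]

def stacked_points(pairs):
    """Return {(a, b): count} for every coordinate shared by two or more points."""
    counts = Counter(pairs)
    return {xy: n for xy, n in counts.items() if n > 1}

overlaps = stacked_points(points)
# Every point after the first at a coordinate is drawn on top of another marker.
hidden = sum(n - 1 for n in overlaps.values())

print(overlaps)
print(f"{hidden} of {len(points)} points sit on top of another marker")
```

If a large share of points is hidden this way, a plain scatter plot will understate the distribution no matter how it is styled.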

So these are the most important considerations when creating this chart:
- Needs to show the value of two discrete variables for a specific point
- Needs to show the relative distribution of all points
- The relationship between these two variables is not important
- Some points have identical values for both variables
- The audience for these charts is varied so a box-plot might be too complex

So given this information, which chart would you recommend? Here is a mock-up of the original chart my client made and my proposed solution, built in Tableau with fake data.

I proposed a parallel coordinate jitter plot. This lets you clearly see the distribution of the data and compare the values of the two variables on the same scale. The connecting line also subtly highlights the comparison of the two variables for a specific point, which is of ancillary value. It also avoids stacking points. The downside is that the x-axis could be confusing, since the horizontal position within each column is randomized and carries no meaning.
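
For anyone who wants to try the idea outside Tableau, here is a minimal sketch of the layout in Python. It is not how my Tableau workbook is built; `jitter_layout`, the `width` parameter, and the sample `records` are all assumptions made for illustration. Each variable gets its own column (x = 0 and x = 1), the y position is the real value, and the x position is nudged by a random offset so identical values no longer stack. Each pair of positions is one connecting line.

```python
import random

def jitter_layout(records, width=0.3, seed=0):
    """Turn (a, b) value pairs into line segments ((xa, a), (xb, b)).

    x is the column centre (0 for the first variable, 1 for the second)
    plus a uniform random offset in [-width, width]; y is left untouched.
    """
    rng = random.Random(seed)  # fixed seed so the layout is repeatable
    segments = []
    for a, b in records:
        xa = 0 + rng.uniform(-width, width)
        xb = 1 + rng.uniform(-width, width)
        segments.append(((xa, a), (xb, b)))
    return segments

# Made-up ordinal data; the first two records are identical on purpose.
records = [(3, 4), (3, 4), (2, 2), (5, 5)]
segments = jitter_layout(records)

for (xa, ya), (xb, yb) in segments:
    print(f"({xa:+.2f}, {ya}) -> ({xb:+.2f}, {yb})")
```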
Which chart would be best? Please let me know your thoughts in the comments. Thank you.
Great challenge!

As is, I'd give the edge to the bottom viz. I think the biggest challenge is distinguishing that there are two different Y axes. Putting the vizs next to each other makes that hard.

Leaving a scatter chart behind hypothetically, what about a bar chart where length is one variable and the other variable is shown by bar width/color?

Alternatively, maybe two different bump charts showing ranking that share highlighting?

New thought, what about a bar chart that shows bars for one variable going up from the axis and bars for the other variable going down from the axis? Or, just two bar charts, one on top of the other?

Jitterplots are the best charts you’ve never heard of. They are a type of scatterplot where only a single metric is used. This means you can easily compare the relative performance of the members in your dataset.

http://bit.ly/2XgqWxN