FOR THIS ASSIGNMENT PLEASE USE THE APP CALLED TABLEAU
Graphs are built in layers
People often think of graphs as common types, such as bar graphs, line graphs, and scatterplots, but it is more useful to think about the features these graphs have in common and what can be modified to change one graph into another when designing. For computer-rendered graphics like those made in Tableau, the computer uses functions to make the chart appear as pixels on your screen. Nearly all visualization software requires selecting the data of interest, the variables to plot on the x and y axes, and the shapes and colors used to represent data on the screen. These are three of the seven layers commonly used as part of a grammar of graphics. With the help of this grammar, computers relate information by combining matrices of these layers when rendering an image.
For this assignment, watch the embedded video and read a brief excerpt posted below about the Grammar of Graphics. Once you learn about the layers that make a computer graphic, analyze one of the data visualizations from the previous assignments for the layers:
For your selected visual, embed a copy of it as an image, and then unpack the seven layers by naming them and describing them in relation to the visual media.
Data – the dataset in long table format.
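As a rough illustration of what "long table format" means, the sketch below reshapes a small wide table (one column per year) into a long table (one row per observation). The table contents, the column names, and the helper to_long are hypothetical and used only for illustration; Tableau has its own tools for this kind of reshaping.

```python
# Hypothetical wide table: one row per city, one column per year.
wide_rows = [
    {"city": "Austin", "2021": 10, "2022": 12},
    {"city": "Boston", "2021": 7, "2022": 9},
]


def to_long(rows, id_column, variable_name, value_name):
    """Reshape wide rows into long rows, one observation per row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every long row
            long_rows.append({
                id_column: row[id_column],
                variable_name: column,
                value_name: value,
            })
    return long_rows


long_rows = to_long(wide_rows, "city", "year", "sales")
for row in long_rows:
    print(row)
```

In the long version, each variable (city, year, sales) sits in its own column, which makes it straightforward to assign a column to an axis, a color, or a subplot.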
Overview of the grammar of graphics
Excerpt from Dipanjan (DJ) Sarkar (2018) A Comprehensive Guide to the Grammar of Graphics for Effective Visualization of Multi-dimensional Data. Towards Data Science. https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149
Understanding the Grammar of Graphics
To understand the Grammar of Graphics, we first need to understand what we mean by a grammar. The following figure summarizes both these aspects briefly.
Basically, a grammar of graphics is a framework which follows a layered approach to describe and construct visualizations or graphics in a structured manner. A visualization involving multi-dimensional data often has multiple components or aspects, and leveraging this layered grammar of graphics helps us describe and understand each component involved in visualization — in terms of data, aesthetics, scale, objects, and so on.
The original grammar of graphics framework was proposed by Leland Wilkinson, which covers all major aspects pertaining to effective data visualization in detail. I would definitely recommend interested readers to check out the book on it, whenever they get a chance!
We will, however, be using a variant of this, known as the layered grammar of graphics framework, which was proposed by Hadley Wickham, reputed Data Scientist and the creator of the famous R visualization package ggplot2 (https://ggplot2.tidyverse.org/). Readers should check out his paper titled ‘A layered grammar of graphics’ (http://vita.had.co.nz/papers/layered-grammar.html), which covers his proposed layered grammar of graphics in detail and also talks about his open-source implementation framework ggplot2, which was built for the R programming language.
Hadley’s layered grammar of graphics uses several layered components to describe any graphic or visualization. Most notably, it has some variations from the original grammar of graphics proposed by Wilkinson as depicted in the following figure.
We illustrate the same using a pyramid architecture to show an inherent layered hierarchy of components. Typically, to build or describe any visualization with one or more dimensions, we can use the components as follows.
Data: Always start with the data, identify the dimensions you want to visualize.
Aesthetics: Confirm the axes based on the data dimensions and the positions of the various data points in the plot. Also check whether any form of encoding is needed, including size, shape, color, and so on, which are useful for plotting multiple data dimensions.
Scale: Do we need to scale the potential values, or use a specific scale to represent multiple values or a range?
Geometric objects: These are popularly known as ‘geoms’. This would cover the way we would depict the data points on the visualization. Should it be points, bars, lines, and so on?
Statistics: Do we need to show some statistical measures in the visualization like measures of central tendency, spread, confidence intervals?
Facets: Do we need to create subplots based on specific data dimensions?
Coordinate system: What kind of a coordinate system should the visualization be based on — should it be cartesian or polar?
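The seven components above can be written down as a plain specification, one field per layer. The sketch below is only an illustration of that idea, not how Tableau or ggplot2 store a chart internally; the class PlotSpec, its field names, and the example dataset and column names are all hypothetical.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PlotSpec:
    """Hypothetical container with one field per layer of the grammar."""
    data: str                          # name of the dataset (long table format)
    aesthetics: Dict[str, str]         # visual channel -> data column
    scale: Dict[str, str] = field(default_factory=dict)  # channel -> scale type
    geom: str = "point"                # geometric object: point, bar, line, ...
    statistics: Optional[str] = None   # e.g. "sum" or "mean"; None means raw values
    facets: Optional[str] = None       # column used to split into subplots
    coordinates: str = "cartesian"     # "cartesian" or "polar"

    def describe(self) -> List[str]:
        """Return one line of text per layer, in the order listed above."""
        return [
            f"Data: {self.data}",
            f"Aesthetics: {self.aesthetics}",
            f"Scale: {self.scale or 'default'}",
            f"Geometric objects: {self.geom}",
            f"Statistics: {self.statistics or 'none'}",
            f"Facets: {self.facets or 'none'}",
            f"Coordinate system: {self.coordinates}",
        ]


# A bar chart of total sales per year (hypothetical dataset and columns).
bar_chart = PlotSpec(
    data="sales_long",
    aesthetics={"x": "year", "y": "sales", "color": "city"},
    scale={"y": "linear"},
    geom="bar",
    statistics="sum",
)

# Changing a single layer turns the same specification into a line chart.
line_chart = replace(bar_chart, geom="line")

for line in bar_chart.describe():
    print(line)
```

Here the bar chart and the line chart share six of their seven layers and differ only in the geometric object, which is the sense in which one graph can be changed into another by modifying a single layer. The same seven lines can also serve as a checklist when unpacking the layers of an existing visualization.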