Each column represents a level of homeownership
mm
Each column represents a level of homeownership
mm
we’ll break a square into columns for each category of the variable,
,
A mosaic plot is a visualization technique suitable for contingency tables that resembles a standardized stacked bar plot with the benefit that we still see the relative group sizes of the primary variable as well.
,
is helpful if the primary variable in the stacked bar plot is relatively imbalanced, e.g., the category has only a third of the observations in the category, making the simple stacked bar plot less useful for checking for an association. The major downside of the standardized version is that we lose all sense of how many cases each of the bars represent
d
The stacked bar plot is most useful when it’s reasonable to assign one variable as the explanatory variable (here homeownership) and the other variable as the response (here application_type) since we are effectively grouping by one variable first and then breaking it down by the others.
k
(also known as filled bar plot)
,
It is difficult to say, based on this plot alone, how different application types vary across the levels of homeownership. Figure 4.2 (b) is a standardized bar plot
We can display the distributions of two categorical variables on a bar plot concurrently. Such plots are generally useful for visualizing the relationship between two categorical variables. Figure 4.2 shows three such plots that visualize Figure 4.2 (a) is a stacked bar plot.
c
A bar plot is a common way to display a single categorical variable. Figure 4.1 (a) displays a bar plot of the homeownership variable.
c
The row totals provide the total counts across each row and the column totals down each column. We can also create a table that shows only the overall percentages or proportions for each combination of categories, or we can create a table for a single variable
c
A table that summarizes data for two categorical variables in this way is called a contingency table. Each value in the table represents the number of times a particular combination of variable outcomes occurre
c