4 Interactive graph
We used a Python script to group and parse the data so that we could create our grouped bar chart, showing the top five Amazon product categories for each of the different demographic groups.
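The grouping step described above could look roughly like the following. This is a minimal sketch, not our actual script: it assumes a pandas DataFrame with one row per purchase, and the column names `year` and `category`, the function name `top_categories`, and the toy data are all hypothetical.

```python
import pandas as pd


def top_categories(purchases, demographic, n=5):
    """Return the n most-purchased categories per (demographic group, year).

    Assumes one row per purchase and hypothetical columns
    named "year" and "category" alongside the demographic column.
    """
    # Count purchases for every (group, year, category) combination.
    counts = (
        purchases
        .groupby([demographic, "year", "category"])
        .size()
        .reset_index(name="purchases")
    )
    # Sort so the largest counts come first within each (group, year);
    # ties are broken alphabetically by category.
    counts = counts.sort_values(
        [demographic, "year", "purchases", "category"],
        ascending=[True, True, False, True],
    )
    # Keep the first n rows of each (group, year) block.
    return counts.groupby([demographic, "year"]).head(n).reset_index(drop=True)


# Toy data for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "age": ["18-24", "18-24", "18-24", "65+", "65+", "65+"],
    "year": [2020] * 6,
    "category": ["BOOK", "BOOK", "PHONE_CASE", "COFFEE", "BOOK", "COFFEE"],
})
print(top_categories(toy, "age", n=1))
```

Calling the function once per demographic column (age, education, gender, income, race) would give one table per selectable variable, which is the shape a grouped bar chart with a variable selector needs.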
From our exploration, we found the bar plot especially interesting because it shows users' top product categories and includes the year, which could give some insight into changes due to COVID-19. The alluvial plot was also eye-opening since it showed the flow between different variables and included the top categories, but it did not include the years. This is why we decided to make an interactive grouped bar plot that includes time and gives the viewer an option to select the demographic variable. This way, they can see what they want to see in a user-friendly way that was not possible with the static versions.
Based on the other visualizations, we decided to include the demographic columns age, education, gender, income, and race. The user can select the demographic variable they want and see how it affects the number of purchases per category. They can check whether the result matches what they expected, or even whether it matches their own shopping behavior. When age is selected, we see behavior similar to the alluvial plot, with books and pet food being frequently purchased categories. However, new top categories also appear: cell phone cases for participants 18-24 years old, and coffee for participants 65 and older, who are the only group with coffee as a top category. When education level is selected, the top categories are about the same across groups; only the amounts differ. For gender, the same categories are again popular, but we see more purchases by female participants, especially for books and pet food. Finally, for income, the top categories are the same as for gender and education, with books and pet food consistently the top two categories and fewer purchases overall as income decreases.
Overall, letting the person select one variable at a time gives a simple and readable way to see that the demographic doesn't really affect the top product categories, but it does affect the number of purchases.