5 Conclusion
In this project, we wanted to explore America’s shopping habits on Amazon and see whether there were any trends based on demographics, or whether world events like the COVID-19 pandemic affected Amazon purchase behavior. We made numerous plots and found several key insights. Looking at the top product categories bought per year, the ranking of categories stayed consistent while the number of purchases in them grew each year. Across the graphs, we saw the number of Amazon purchases rising over time. We did see a decrease in purchases in 2023, although this may reflect a limitation of our dataset: because it is based on voluntary survey data, purchases are not evenly distributed across years, so some years appear to have fewer results.
The data also had numerous columns, the majority of which were categorical. In the future, we would like to combine this dataset with another one so that we have a more even balance of categorical and numerical variables, and to explore the Amazon data further through other graphs, such as scatterplots and contour plots.
We learned more about our data through each plot, but the bar chart and alluvial plot in particular showed us the most popular product categories and the users’ preferences. We also looked at how Amazon purchases are distributed geographically across the US by plotting the total number of Amazon orders on a state map. In the future, we would like to divide this number by each state's population to get a better idea of which states buy the most on Amazon per person.
Using the other plots, we identified which demographic variables were associated with purchase prices and explored that finding further using D3. The interactive bar chart is simple yet effective: it lets the viewer see how choosing a demographic category changes the most popular product categories, which was impossible to see in the static graph.
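The per-person normalization proposed for the state map could be sketched roughly as follows. This is only an illustration, not code from our project: the state names, order counts, and populations are placeholders, and `ordersPerCapita` is a hypothetical helper. It assumes each state's total order count is divided by that state's population and scaled to orders per 100,000 residents.

```javascript
// Placeholder inputs (assumptions for illustration, NOT values from our dataset).
const ordersByState = [
  { state: "State A", orders: 1200 },
  { state: "State B", orders: 300 },
  { state: "State C", orders: 450 },
];
const populationByState = {
  "State A": 6000000,
  "State B": 500000,
  "State C": 1500000,
};

// Hypothetical helper: orders per `per` residents, sorted from highest to lowest rate.
function ordersPerCapita(orders, population, per = 100000) {
  return orders
    // Skip states whose population is missing or zero.
    .filter((d) => population[d.state] > 0)
    .map((d) => ({
      state: d.state,
      rate: (d.orders * per) / population[d.state],
    }))
    .sort((a, b) => b.rate - a.rate);
}

const ranked = ordersPerCapita(ordersByState, populationByState);
console.log(ranked);
```

With these placeholder numbers, the state with the most total orders ("State A") ranks last once population is taken into account, which is the kind of shift a per-person map would reveal.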
Given more time, it would be interesting to add a quiz component to the D3 chart about the top product category for each age group, where users could enter their age and their prediction. Even with our time constraints, we believe that we successfully explored our data, created a compelling data story with static R plots and a D3 interactive plot, and found some insight into who is buying certain products on Amazon.