Showing posts from October, 2024

Module 9 Assignment - Multivariate Analysis

This county map of Florida shows the unemployment rate and voter turnout percentage. Hillsborough County had an unemployment rate of 7.5 percent in 2020, the year COVID-19 affected millions of people across the state. Voter turnout in this county was 52.86 percent. The legend details the rate of unemployment; in Hillsborough County, the rate was relatively low compared to other states in 2020. However, voter turnout remained roughly the same as in other counties in Florida. Please click here to view an interactive version of this map.

Module 8 - Correlation Analysis and ggplot2

For this assignment, the class was tasked with creating a correlation analysis diagram. I chose to create a heatmap of the mtcars data set. The diagram shows the correlation between each pair of variables in the dataset. Tiles in light blue have the highest correlation coefficients, whereas darker tiles have lower coefficients, including the negative correlations.

Below is the code I used to create my heatmap:

    library(ggplot2)   # provides qplot()
    library(reshape2)  # provides melt(), which names the columns Var1, Var2, and value
    qplot(x=Var1, y=Var2, data=melt(cor(mtcars)), fill=value, geom="tile")

I went through several iterations of the qplot function before realizing that the code had been provided. At first, I was confused about how qplot worked and tried to use only two variables that I named myself, before consulting ChatGPT to develop a working solution. I discovered that a heatmap represents the correlation between every pair of variables, which melt() labels Var1 and Var2 and which are mapped to x and y respectively.

This assignment helped me understand how heatmaps can represent correlation analysis of variables within larger datasets.
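For comparison, here is a minimal sketch of the same heatmap written with ggplot() and geom_tile() instead of qplot(), which recent ggplot2 releases mark as deprecated. This is not the code provided for the assignment. The names cor_matrix, cor_long, and heatmap_plot are my own, the reshaping uses base R instead of melt(), and the diverging color scale is an assumption I added so that negative correlations stand out from weak ones.

```r
library(ggplot2)

# Correlation matrix for every pair of columns in mtcars
cor_matrix <- cor(mtcars)

# Reshape to long format with base R; as.table() produces the
# columns Var1, Var2, and Freq, so rename Freq to value
cor_long <- as.data.frame(as.table(cor_matrix))
names(cor_long)[3] <- "value"

# One tile per variable pair, filled by the correlation coefficient.
# The diverging scale (an assumption, not part of the original plot)
# centers white at 0 so negative and positive correlations differ in hue.
heatmap_plot <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "darkred", mid = "white", high = "steelblue",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation",
       title = "Correlation heatmap of mtcars")

print(heatmap_plot)
```

With a diverging scale, a strongly negative pair such as mpg and wt no longer looks like a weakly correlated pair, which is easy to misread on the default single-hue gradient.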

Module 7 - Distribution Analysis

For this module, we were assigned to create a visualization based on distribution analysis. I created a bar graph based on the mpg column of the mtcars dataset in R. Below is my graph:

This graph shows the miles per gallon of each vehicle in the mtcars dataset. A drawback of this graph is that the vehicles are not identified on the x axis, so it does not give the best picture of the distribution for this dataset.

Here is the code:

    data("mtcars")
    head(mtcars)                  # preview the first rows
    counts <- table(mtcars$mpg)   # frequency of each mpg value
    counts2 <- table(mtcars$wt)   # frequency of each weight value
    counts
    View(mtcars)                  # open the data in the viewer
    # draw one bar per car; the bar height is that car's mpg
    barplot(mtcars$mpg, main = "Miles per Gallon", xlab = "MPG", ylab = "count", col = "orange")
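One way to address the drawback noted above is sketched below, using only base R. It is a suggestion, not part of the original assignment: a histogram bins the mpg values so the shape of the distribution is visible, and a second bar plot labels each bar with the car's name. The variable names mpg_hist and mpg_sorted and the 5-mpg bin width are my own choices.

```r
data("mtcars")

# Histogram: group mpg values into 5-mpg bins to show the distribution
mpg_hist <- hist(mtcars$mpg,
                 breaks = seq(10, 35, by = 5),
                 main = "Distribution of Miles per Gallon",
                 xlab = "Miles per gallon",
                 ylab = "Number of cars",
                 col = "orange")

# Bar plot with each vehicle identified: attach the row names
# (car models) to the mpg values and sort from lowest to highest
mpg_sorted <- sort(setNames(mtcars$mpg, rownames(mtcars)))
barplot(mpg_sorted,
        las = 2,          # rotate the car names so they fit
        cex.names = 0.6,  # shrink the labels
        main = "Miles per Gallon by Car",
        ylab = "Miles per gallon",
        col = "orange")
```

The histogram answers the distribution question directly, while the labeled bar plot keeps the per-vehicle view from the original graph.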

Module 6 - Visual Differences and Deviation Analysis in R

For this week's assignment, the class was introduced to the R software environment. We were instructed to create a graph visualization. I chose to use the Jacksonville, FL Sheriff's Department's police shooting data for my data set. I created a data frame using data from 2023. This data contained variables for officer race, gender, tenure, and age. The values for officer gender and race were difficult to graph because there was not much variety (most officers recorded in the data set were white and male), leaving little room to create a dynamic visualization.

I elected to use the variables Officer Age and Officer Tenure for my visualization. I created a scatter plot and a line graph using ggplot2 (with the help of ChatGPT). Below are the graphs:

What these graphs tell us about police-involved shootings within the Jacksonville Sheriff's Department in 2023 is that most of the offending officers are younger than 40, with between 0 and 10 years of tenure behind them....