California School Dashboards Part 3: Modeling the Data

This is part three of a three part series where I work with California School Dashboard data by cleaning, visualizaing, and exploring through modeling. You can read the first part of this series, which shows one way to clean and prepare the data, and the second part of the series, which shows a way to visualize the data .


Now that we’ve looked at cleaning the California School Dashboard data and exploring it through visualization, we’re going to explore it further by looking at the relationship between math and English language arts scores across subgroups in nine school districts. We’ll do this by fitting a line through those data points and visually assessing how well the actual data points resemble that line.

As in the previous two posts, I anonymized the datasets. I won’t include that process here to keep things brief, but if you’re interested in the procedure I used you can view the script and cleaned data on my GitHub page.

This approach is inspired by Hadley Wickham’s “many models” demonstration, which I’ll link to at the end of the post.

Distance from Three

The main number we will be looking at in this analysis is the California School Dashboard’s “distance from three” score. This is the same score we looked at when we visualized the dashboard data in part of two of this series. For review, recall that the distance from three score for English language arts is the

average distance from level 3 of students who took the Smarter Balanced summative assessment in ELA

If that number is positive the student subgroup performed higher than level three on average. If that number is negative, the student subgroup performed lower than level three on average. There’s more on that on the CDE website if you’d like to investigate further.

Do Math Scores Predict English Language Arts Scores?

Let’s explore the first of our nine districts just to see if there’s any hint of a relationship between the math scores and ELA scores. If we plot the distance from three score of math on the x-axis and the distance from three score of English language arts for each school’s subgroup in the first of our nine districts, we see that there is a pretty clear relationship:

all_files[[1]] %>% 
  # Remove school level datapoints 
  filter(! %>%
  ggplot(aes(x = currstatus_math, y = currstatus_ela)) + 
  geom_point() + 
  geom_smooth(method = "lm") +
  labs(title = "Distance from Three: Math vs. English Language Arts", 
       x = "Math Dist from Three", y = "ELA Dist from Three")
## Warning: Removed 37 rows containing non-finite values (stat_smooth).
## Warning: Removed 37 rows containing missing values (geom_point).

Here’s another way to think of it: generally speaking, a subgroup’s difference from three score in math can help us predict that same subgroup’s difference from three score in English language arts.

The line drawn through the plot is the model fit. That means that it goes through the plot in a straight line to best follow the relationship between the math distance of three and the English language arts distance from three in this dataset. This relationship is quantified by the model, but we won’t go into that here because we’re more interested in what this relationship looks like for different subgroups. But before we get to that, we will need to fit the model to the nine districts we’re analyzing.

Fitting the Model to Nine Districts

In order to compare this model fit to all nine of our districts, I arranged the data so that each row was a district and their dataset. I then fit a linear model to each district and calculated the differences between the predicted values and actual values. These differences are stored in a column called resid, which is short for residuals. We’ll talk a little more about residuals later.

Here is some of the code for arranging the data. Feel free to skip to the plots if you want to start examing the results.

I used a function for the linear model so I can apply it to all nine datasets using the purrr package.

dist_mod <- function(d) {
  lm(currstatus_ela ~ currstatus_math, data = d)

Next I used modelr::add_residuals to add the residuals.

mods <- comb %>% 
  mutate(model = map(data, dist_mod), 
         resids = map2(data, model, add_residuals)) 

Finally I used dplyr:unnest to open up the dataset so that each residual value had it’s own row.

resids <- mods %>% 

The Plots

Visualizing Differences Between Subgroups

Recall that our goal here is to see if there are any differences in our math vs. ELA relationship when we look at subgroups. The way we’ll do that is to compare how wrong our linear model got it for each subgroup.

But first a note on missing values:

The original dataset did not include distance from three scores for any subgroup that had less than 11 students. This is a typical practice with education data and is usually done to maintain student confidentiality. Consdquently, there is a pretty sizeable amount of subgroups that are missing distance from three scores. These missing values will be represented as warning messages in R when make our plots.

So let’s see what happens when we remove the math vs. ELA pattern from each dataset and see what we have left over. These are called residual values. (here’s Wikipedia for more reading on residuals). If we create a plot of these leftover values across distance from three math scores, we get this:

ggplot(data = resids, aes(x = currstatus_math, y = resid)) + 
  geom_point(alpha = .25) + 
  facet_wrap(~ districtname) + 
  labs(title = "Residuals Across Math Distance From Three Scores", 
       x = "Math: Distance From Three", 
       y = "Residuals")
## Warning: Removed 1356 rows containing missing values (geom_point).

So what does this tell us? The dots closest to y = 0 represents where the model predicted ELA results better. A residual of 0 would suggest that there was 0 difference between the predicted and actual values. So the next question is can we see any patterns in leftover values when we examine subgroups? To answer that question we’ll color each subgroup in the plot:

ggplot(data = resids, aes(x = currstatus_math, y = resid, color = studentgroup)) + 
  geom_point(alpha = .25) + 
  facet_wrap(~ districtname) + 
  labs(title = "Residuals Across Math Distance From Three Scores", 
       x = "Math: Distance From Three", 
       y = "Residuals")
## Warning: Removed 1356 rows containing missing values (geom_point).

You can start to see some hints that the subgroup “English Learners Only” had larger residual values in some districts, particularly in district_2 and district_7. Recall that leftover values below zero suggest the subgroup had a lower value than expected from the model. Let’s follow this line of questioning and see if we can visually inspect this difference a little more. To do that, we’ll define each subgroup datapoint as either “English Learners Only” or “not English Learners Only”:

resids %>% 
  mutate(elo = ifelse(studentgroup == "English Learners Only", "ELO", "Not ELO")) %>%
  ggplot(data = ., aes(x = currstatus_math, y = resid, color = elo)) + 
  geom_point(alpha = .5) + 
  facet_wrap(~ districtname) +
  labs(title = "Residuals Across Math Distance From Three Scores", 
       x = "Math: Distance From Three", 
       y = "Residuals", 
       color = "")
## Warning: Removed 1356 rows containing missing values (geom_point).

Now it’s a little clearer that there might be some differences in the way the model fits for the ELO subgroup. We won’t go into the differences in the way the California Department of Education distinguishes ELs from ELOs, but if we were continuing this line of inquiry it would be critical to understand how these subgroups are defined. You can read more about the definition of EL and ELO in this CDE document.

Practical Next Steps

Now that we have this information, what do we do with it? Here are some practical next steps for building on the analysis like this and turning it into meaningful action:

Build consensus on conclusions by examining other sources of data. Ask questions like:

  • Are there quantitative and anecdotal datasets, like local measures or weekly quiz scores that reinforce or challenge these findings?
  • Do these findings resonate with the accounts of teachers?

Look at the underlying instructional processes that might logically drive these findings. Ask questions like:

  • What are the similarities and differences in the instructional practices across subgroups?
  • How can we change instructional practices, even if in small ways, to achieve the outcomes we want for all subgroups?

Decide what to try and how to measure. Ask questions like:

  • Once we implement changes, is a repeat of this analysis useful to measure the impact?
  • If our plan was working, how would we expect the residual data to look different from these results?

More on the Many Models Approach

Here’s more on the many models approach as demonstrated by Hadley Wickham:

Ryan Estrellado
Ryan Estrellado
Executive Consultant at South County SELPA, Educator, and Data Scientist

I work at the San Diego South County SELPA, where I design solutions to help SELPAs build equitable systems for students with disabilities.