California Test Scores Versus District Characteristics

Eric Busboom.

Description

The variables used in the following charts are:

  • score_all: Test score averaged across all students in the district
  • score_ses: Test score of high socioeconomic status (SES) students minus the test score of low-SES students; the gap between rich and poor.
  • score_wb: Test score of white students minus the test score of black students; the white-black gap.
  • staff_black_rate: Percentage of teachers who are black.
  • staff_hisp_rate: Percentage of teachers who are Hispanic.
  • staff_whasian_rate: Percentage of teachers who are white or Asian.
  • staff_male_rate: Percentage of teachers who are male.
  • staff_teacher_rate: Ratio of teachers to total staff.

This analysis is limited to districts where the annual cost per student is less than $20,000, which excludes very small districts that have unusual circumstances and should be analyzed independently.
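The filter described above, together with the white-black gap from the variable list, might look like the following in pandas. This is a minimal sketch, not the notebook's actual code: the data values are invented, the `district` column is hypothetical, and it assumes that `cost_per_ada` (which appears in the tables below) is the annual cost per student.

```python
import pandas as pd

# Toy stand-in for the district table; all values are invented for illustration.
districts = pd.DataFrame({
    "district": ["A", "B", "C", "D"],            # hypothetical identifier column
    "cost_per_ada": [9500.0, 12800.0, 31000.0, 19999.0],
    "score_white": [0.40, 0.10, 0.90, 0.25],
    "score_black": [-0.10, -0.30, 0.20, -0.05],
})

COST_LIMIT = 20_000  # dollars per student per year, the limit stated in the text

# Keep only districts under the spending limit (assumes cost_per_ada is that cost).
filtered = districts[districts["cost_per_ada"] < COST_LIMIT].copy()

# White-black gap, following the definition of score_wb in the variable list.
filtered["score_wb"] = filtered["score_white"] - filtered["score_black"]

print(filtered)
```

District C is dropped because its cost exceeds the limit; the remaining rows keep their original index labels.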

Correlation Matrix

[Figure: correlation matrix plot]
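A correlation matrix of this kind can be computed with `DataFrame.corr()`. The sketch below uses invented values and only three of the column names from this document; it assumes Pearson correlation (the pandas default), which the source does not state explicitly.

```python
import pandas as pd

# Invented values; only the column names come from this document.
df = pd.DataFrame({
    "score_all": [0.1, 0.4, -0.2, 0.3, 0.0],
    "staff_whasian_rate": [0.70, 0.85, 0.55, 0.80, 0.65],
    "staff_black_rate": [0.10, 0.02, 0.20, 0.05, 0.12],
})

# Pearson correlation between every pair of columns (the pandas default method).
corr = df.corr()

print(corr.round(3))
```

The result is a square, symmetric table with ones on the diagonal, which a plotting library can then render as a grid of colored cells.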

Detailed Correlations

These tables show the correlations between the variables in the a and b columns.
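One way to get tables with `a`, `b`, and `corr` columns is to reshape the square correlation matrix into one row per pair of variables. The following is a sketch under assumptions: the data values are invented, and the use of `melt` is an illustrative choice rather than the method used to produce the tables in this document.

```python
import pandas as pd

# Invented values; only the column names come from this document.
df = pd.DataFrame({
    "score_all": [0.1, 0.4, -0.2, 0.3, 0.0],
    "staff_whasian_rate": [0.70, 0.85, 0.55, 0.80, 0.65],
    "staff_black_rate": [0.10, 0.02, 0.20, 0.05, 0.12],
})

corr = df.corr()

# Reshape the square matrix into long form: one row per (a, b) pair.
pairs = (
    corr.rename_axis("a")
        .reset_index()
        .melt(id_vars="a", var_name="b", value_name="corr")
)

# Sort from the strongest positive to the strongest negative correlation.
pairs = pairs.sort_values("corr", ascending=False).reset_index(drop=True)

# Show the rows for a single variable, as in the tables in this section.
print(pairs[pairs["a"] == "staff_black_rate"])
```

Self-correlations (where `a` equals `b`) are kept here, so each variable's own row appears with a correlation of 1.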

A higher share of Asian and white teachers is correlated with higher test scores for all students (0.37), black students (0.37), and white students (0.18), and with a larger SES gap (0.30). It has a moderate association with a higher college rate for black students (0.22), but almost none with the college rate for all students (0.04).

    a                   b                    corr
19  staff_whasian_rate  score_all        0.370149
20  staff_whasian_rate  score_black      0.369934
26  staff_whasian_rate  score_ses        0.302490
28  staff_whasian_rate  cgr_black        0.221133
30  staff_whasian_rate  score_white      0.183244
48  staff_whasian_rate  cgr_all          0.044013
49  staff_whasian_rate  score_wb         0.022123
59  staff_whasian_rate  cost_per_ada    -0.024438
77  staff_whasian_rate  enr_black_rate  -0.180846

Higher per-student spending is associated with lower scores for black students (-0.21) and a wider white-black gap (0.18); its correlation with scores for white students is close to zero (0.02).

    a             b                  corr
 0  cost_per_ada  cost_per_ada   1.000000
31  cost_per_ada  score_wb       0.182259
50  cost_per_ada  score_white    0.018462
55  cost_per_ada  score_ses     -0.018445
57  cost_per_ada  score_all     -0.021149
70  cost_per_ada  cgr_all       -0.157765
74  cost_per_ada  cgr_black     -0.170933
80  cost_per_ada  score_black   -0.207649

A higher share of black teachers is associated with lower scores for black students (-0.44) and a lower college rate for black students (-0.25). It is also strongly correlated with enr_black_rate (0.75).

    a                 b                        corr
 5  staff_black_rate  enr_black_rate       0.750346
34  staff_black_rate  score_wb             0.171786
46  staff_black_rate  cost_per_ada         0.049291
66  staff_black_rate  cgr_all             -0.121854
71  staff_black_rate  score_white         -0.162522
76  staff_black_rate  score_ses           -0.176879
83  staff_black_rate  cgr_black           -0.251551
85  staff_black_rate  score_all           -0.263363
88  staff_black_rate  staff_whasian_rate  -0.339720
89  staff_black_rate  score_black         -0.441095

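The share of male teachers shows almost no association with scores for black students (-0.04); its largest correlation with any score is a weak negative one with score_all (-0.12).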

    a                b                        corr
37  staff_male_rate  staff_black_rate     0.131872
38  staff_male_rate  cost_per_ada         0.112925
40  staff_male_rate  score_wb             0.091101
42  staff_male_rate  cgr_all              0.080405
45  staff_male_rate  enr_black_rate       0.068373
61  staff_male_rate  score_black         -0.037697
62  staff_male_rate  score_white         -0.041318
63  staff_male_rate  cgr_black           -0.044097
64  staff_male_rate  score_ses           -0.075816
65  staff_male_rate  score_all           -0.115924
67  staff_male_rate  staff_whasian_rate  -0.145117
72  staff_male_rate  enr_whasian_rate    -0.167281

Detailed Regression Plots

Detailed scatter plots with a regression line for pairs of variables. The two variables are shown in the title of each plot, where score_var is the name of the variable on the y axis and rate_var is the name of the variable on the x axis.
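The regression line in each plot can be thought of as an ordinary least-squares fit of score_var on rate_var. The sketch below fits such a line with `numpy.polyfit` on invented data; it is an assumption that the plots use a simple linear fit, and the array names `rate` and `score` are illustrative.

```python
import numpy as np

# Invented sample: rate on the x axis (rate_var), score on the y axis (score_var).
rate = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
score = np.array([0.05, 0.15, 0.20, 0.35, 0.40])

# Degree-1 least-squares fit; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(rate, score, deg=1)

# Points on the fitted line, which would be drawn over the scatter of (rate, score).
fitted = slope * rate + intercept

print(f"score = {slope:.3f} * rate + {intercept:.3f}")
```

The sign of the slope matches the sign of the correlation between the two variables, so each plot should agree in direction with the corresponding row of the tables above.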

Imputation Test

There is a lot of missing data, so it may be worthwhile to impute missing records. This grid of KDE plots shows the original data series in orange and the imputed (KNN, n=2) data series in blue. Variables whose blue and orange curves align exactly do not require imputation; where the curves diverge greatly (such as cgr_black), the imputation is probably adversely affecting the statistics.
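A KNN imputation with two neighbors could be sketched as follows. The source only states "KNN, n=2", so the use of scikit-learn's `KNNImputer` is an assumption, and the data values are invented. Instead of KDE plots, the sketch compares summary statistics of the original and imputed series.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Invented values; cgr_black has missing records, score_all is complete.
df = pd.DataFrame({
    "score_all": [0.1, 0.4, -0.2, 0.3, 0.0, 0.2],
    "cgr_black": [0.55, np.nan, 0.40, np.nan, 0.45, 0.60],
})

# Fill each missing value with the mean of its 2 nearest neighbors
# (assumption: KNNImputer stands in for the "KNN, n=2" step in the text).
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(
    imputer.fit_transform(df), columns=df.columns, index=df.index
)

# Compare the distribution of the original (non-missing) and imputed series.
summary = pd.DataFrame({
    "original": df["cgr_black"].describe(),
    "imputed": imputed["cgr_black"].describe(),
})

print(summary)
```

A large shift in the mean or standard deviation between the two columns is the numeric counterpart of the diverging curves described above.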