Joe Matula was a high school mathematics teacher and principal before serving as superintendent for 26 years. Now retired, he has been a member of the state’s Performance Evaluation Advisory Council (PEAC) since July 2010 and serves as a PERA facilitator.
PERA implementation may be like the man who fell out of a ten-story building. As he passed the fifth floor, he said, “So far, so good.”
PERA is the acronym for the Performance Evaluation Reform Act, which governs the terms and conditions of teacher and principal evaluation and employment. PERA became Illinois law in 2010 and was followed by additional reforms. It requires that, in every Illinois school district, principals and assistant principals be evaluated by trained, pre-qualified evaluators (often the superintendent), and that evaluations include data and indicators of student growth as a significant factor. Teachers must be evaluated by trained evaluators (usually the principal), and again, student growth must be included. Principals, assistant principals and teachers must be evaluated using four rating categories: Excellent, Proficient, Needs Improvement, or Unsatisfactory.
With some exceptions, PERA implementation begins Sept. 1, 2015, for districts whose student performance ranks in the lowest 20 percent of their type, and Sept. 1, 2016, for all remaining districts. For more information, see IASB’s PERA overview for school board members at www.iasb.com/law/PERAoverview.pdf.
Many school districts in Illinois are seeking direction for PERA implementation. Although this report includes only five districts’ plans, these early decisions can provide a good starting point. Four of these districts must implement PERA by Sept. 1, 2015, so their joint committees recently completed their 180-day bargaining sessions. One is not required to implement until 2016. Although all five are elementary districts, most of the following decisions could apply to high school or unit districts.
As a facilitator, I have no stake in decisions a joint committee makes regarding its PERA teacher evaluation plan. Because my task is to wave a red flag if I see a decision that may backfire, I do not push any particular agenda. Given that, and to my surprise, all five school districts independently made the decisions described below.
All five districts chose to forgo the state’s phase-in option for the student growth portion of PERA. Each could have weighted student growth at 25 percent for the first two years, but all chose to start directly with a 70 percent professional practice and 30 percent student growth split. The committees judged the difference between 25 and 30 percent too small to justify postponing the full measure.
For the two required assessments, schools could choose from three types. Type I assessments are the most standardized, most reliable, least reflective of the classroom curriculum, and scored by an outside entity. Type II assessments — which the districts at hand did not choose — are approved or adopted by a district and are typically a common assessment given by all teachers in a grade level. Type III assessments are the least reliable, but most reflective of classroom curriculum, and must be agreed to by the teacher and the evaluator.
PERA requires that at least one assessment be a Type I or II, and one a Type III. If no Type I or II can be identified, both can be Type IIIs. Any Type I or II may qualify as a Type III if it aligns to the curriculum and measures student learning in that subject area.
All districts opted to make one of their two assessments a Type I assessment for all teachers — classroom teachers as well as art, music, and physical education teachers.
For their Type I assessments in language arts and math, one school district uses NWEA MAP, another uses TerraNova, one uses easyCBM assessments, and two use STAR. Three school districts will base Type I growth on the higher of reading growth and math growth. One district will let the teacher choose between the two. One will average the reading and math growth scores.
All five school districts have made this assessment worth 5 percent of the total rating.
A commitment to Type I for all teachers motivates everyone and draws attention to the performance of the district’s students on the standardized assessment, the most visible and publicized assessment. This commitment also builds instructional collaboration throughout the district, which encourages cross-disciplinary instruction.
For example, a physical education teacher is more likely to contact fourth-grade teachers and ask, “What are your students doing in math?” The fourth-grade teachers may say, “We are starting a unit on measurement.” The P.E. teacher can decide to integrate concepts of perimeter and area in fourth-grade P.E. classes. Students see mathematics concepts everywhere, not just in math class. This focus on math and reading can pervade all curriculum and instructional discussions.
The second reason a Type I assessment works for all teachers is that it makes implementation of PERA fairer and more consistent. Everyone is judged by the same 5 percent, rather than some teachers being evaluated by a Type I and a Type III, some by a Type I and a Type II, and so on. Separate combinations of assessments, each with a different level of reliability, would make for a messy and disparate evaluation process. This is a major concern of all teachers’ associations.
This decision also requires each teacher to be responsible for only one Type III assessment, rather than two.
All districts opted for the second assessment to be a Type III, decided by teacher and evaluator. This assessment is based on teaching a unit with a minimum interval of instruction of four weeks. This allows each teacher to schedule the unit to best suit the instructional calendar. Setting the unit at a minimum of four weeks fits most naturally with the regular flow of instruction, and teachers will not have to manipulate any instruction and assessments just for PERA’s sake.
Another decision the five joint committees faced was whether to assess all of the students a teacher teaches, or just one class. One district decided to include all students. For example, a junior high teacher would have to assess all students who met the attendance criteria, which could easily be over 125 students. Four districts decided to give junior high teachers and special area teachers (such as art and music) the choice of classes for these unit lessons. Grades K-5 teachers, mostly with self-contained classroom groups, will use the same students either way and are largely unaffected by this issue. However, the self-contained classroom teacher is allowed to select the subject area of his or her choice (not necessarily reading or math).
Three districts set a 90 percent attendance rate and two an 85 percent rate, based on the instructional lessons in the designated unit. This means that if a student misses the lesson, his/her data does not count for that teacher. The premise is that it is unfair to evaluate a teacher on student performance when the student is not present for lessons. One school district decided to use school attendance rather than classroom attendance, because it is easier to maintain accurate records.
A significant dilemma for joint committees is how to handle students who are taught the same subject by two or more teachers. By allowing a teacher to select the group of students of his or her choice, a general education teacher and a specialist may end up assessing the same student(s) for the Type III assessment.
In one situation, a specialist, such as a special education teacher or reading specialist, goes into the classroom and teaches a student or small group of students. In that case, either the student’s data can count for both teachers, or the general education teacher cannot select that subject as the unit to be assessed. In other situations, a specialist pulls a student or group of students out of the classroom. This is simpler, as the specialist establishes an assessment independent of the general education teacher.
Some teachers, in special education for example, may have small groups of students, which do not provide a valid statistical sample. Unfortunately, this has to be accepted. If the evaluation plan did not allow small groups, even groups as small as three or four students, these teachers would be unable to comply with PERA, which does not allow exceptions for small sample sizes.
For teacher reviews, all districts opted to establish a mandatory review by the superintendent or designee of all ratings of Needs Improvement or Unsatisfactory. Both teachers and administrators supported the logic behind this requirement. Teachers with the most at-risk ratings would feel more comfortable knowing that someone other than a single individual reviewed their evaluations. Superintendents supported this because of the opportunity to review principals’ thoroughness and evaluation skills. Superintendents also liked the chance to review a rating that holds potential for objection by the teacher. In effect, it was like saying, “If we have to go to battle over this, I want to make sure we have a good case.”
Simple growth or Student Learning Objectives
Among the five school districts, one opted for a simple growth model: the average difference between pre-test and post-test scores. Average growth is computed and placed on a tiered growth scale. For example, an average gain of 20 points out of 46 possible points is 20 divided by 46, or 43 percent growth.
Counting all students who met the attendance requirement, the simple growth scale yields the following ratings:
- Excellent = Average Student Growth of 50 percent or more
- Proficient = Average Student Growth of 25 percent to 49 percent
- Needs Improvement = Average Student Growth of 10 percent to 24 percent
- Unsatisfactory = Average Student Growth of 1 percent to 9 percent
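As a minimal sketch in Python of the simple growth computation described above; the function name, data layout, and attendance filter are illustrative assumptions, not part of any district’s actual plan:

```python
def simple_growth_rating(students, attendance_req=0.90):
    """Map average pre/post growth to the tiered scale above.

    Each student is a tuple (pre, post, possible_points, attendance_rate).
    Students below the attendance requirement are excluded, per the plans
    described in the article.
    """
    eligible = [s for s in students if s[3] >= attendance_req]
    # Average growth = mean point gain divided by mean possible points
    avg_gain = sum(post - pre for pre, post, _, _ in eligible) / len(eligible)
    avg_possible = sum(p for _, _, p, _ in eligible) / len(eligible)
    pct = 100 * avg_gain / avg_possible
    if pct >= 50:
        return "Excellent"
    if pct >= 25:
        return "Proficient"
    if pct >= 10:
        return "Needs Improvement"
    return "Unsatisfactory"

# Worked example from the text: an average gain of 20 points out of 46
# possible is about 43 percent growth, a Proficient rating on this scale.
print(simple_growth_rating([(10, 30, 46, 1.0)]))  # Proficient
```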
Four school districts chose to develop Student Learning Objectives (SLOs). This process allows a teacher to identify an expected growth level for each student or group of students. Based on the knowledge and information he or she has about the students, the teacher can set differentiated growth targets and is then evaluated on the students who meet or exceed those targets. This is a more time-consuming process for teacher and evaluator, as it requires time to meet and agree on growth targets. Since the evaluator, typically the principal, cannot know each student as well as the teacher does, the evaluator must trust the teacher’s judgment. Trust is the key to the student growth process. SLOs can be computed as below:
A sample SLO chart lists each student’s pre-test score, growth target, post-test score, and whether the target was met (Yes/No). In the sample, 3 of 5 students, or 60 percent, met their growth targets.
Counting all students who met the attendance requirement, the SLO scale yields the following ratings:
- Excellent = 76 percent or more of students met targeted growth
- Proficient = 51 percent to 75 percent of students met targeted growth
- Needs Improvement = 25 percent to 50 percent of students met targeted growth
- Unsatisfactory = Less than 25 percent of students met targeted growth
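A minimal sketch, in Python, of the SLO tally described above; the function name and data layout are illustrative assumptions:

```python
def slo_rating(results):
    """Map the percent of students meeting their individual growth targets
    to the tiered SLO scale above. `results` is a list of booleans: True if
    the student's post-test met or exceeded the agreed growth target."""
    pct = 100 * sum(results) / len(results)
    if pct >= 76:
        return "Excellent"
    if pct > 50:
        return "Proficient"
    if pct >= 25:
        return "Needs Improvement"
    return "Unsatisfactory"

# From the sample chart: 3 of 5 students (60 percent) met their targets.
print(slo_rating([True, True, True, False, False]))  # Proficient
```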
A few miscellaneous decisions made by the joint committees were:
- Post-tests will count for student grades because that will motivate students to put forth greater effort.
- If some rare occurrence takes place during the pre-test/post-test interval, teachers may file an Extenuating Circumstances Report for the evaluator to consider an adjustment.
- Teachers will grade all Type III assessments.
- All completed Type III assessments will be stored in the respective classroom or school office, not the district office.
- In three districts, tenured teachers will complete Type III assessments during their off-evaluation years. In the other two districts, tenured teachers will not complete Type IIIs in an off year.
Each joint committee reviewed three options for determining the final rating. The first one, the definition model, describes the final rating in a narrative definition by establishing criteria for the professional practice part by domain. For example, three domains rated as Excellent and one domain as Proficient would equal a final rating of Excellent. The remaining ratings are defined in similar fashion. Nobody liked this one.
The second choice was a matrix, with each rating level (from 4 to 1) of professional practice weighted at 70 percent and each rating level of student growth weighted at 30 percent. These values were added together to create each cell in a 4-by-4 matrix. Nobody liked this one either.
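For illustration only, the weighted values behind such a matrix can be computed as below; the cell layout is an assumption, since the committees’ actual charts are not reproduced here:

```python
# Sketch of the matrix option: each cell combines a professional practice
# rating level (rows, 4 down to 1) and a student growth rating level
# (columns, 4 down to 1) at the 70/30 weighting described above.
matrix = [[round(0.7 * practice + 0.3 * growth, 1) for growth in (4, 3, 2, 1)]
          for practice in (4, 3, 2, 1)]
for row in matrix:
    print(row)
# Top-left cell (both rated 4): 0.7 * 4 + 0.3 * 4 = 4.0
```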
Each committee chose the third option, a mathematical one (see chart). It measures the professional practice portion of PERA using the Danielson Framework. This option uses ratings of 4, 3, 2, and 1 for 21 of the 22 Danielson components, for a maximum of 84 points. The student growth portion, described in the right-hand column of the chart, provides a number of points. Added to the professional practice points, this gives a total that can be compared to the final rating chart in the bottom right-hand corner. Note that all cut scores would be determined by the joint committees (the ones in the sample are arbitrary). Each of these school districts plans to collect survey data from teachers during a pilot phase in early 2015 to adjust cut scores.
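As a sketch of this mathematical option in Python: it assumes, for illustration, a growth maximum of 36 points (so growth carries 30 percent of a 120-point total), and the cut scores are placeholders, since each joint committee sets its own:

```python
def final_rating(component_scores, growth_points,
                 cut_scores=(108, 84, 60)):
    """Sum 21 Danielson component ratings (each 1-4, so professional
    practice totals at most 84 points) with a student growth point value,
    then compare the total to cut scores. The growth point scale and the
    default cut scores here are placeholder assumptions."""
    assert len(component_scores) == 21
    assert all(1 <= s <= 4 for s in component_scores)
    total = sum(component_scores) + growth_points
    for cut, rating in zip(cut_scores,
                           ("Excellent", "Proficient", "Needs Improvement")):
        if total >= cut:
            return rating
    return "Unsatisfactory"

# All components rated 4 (84 points) plus the assumed 36-point growth
# maximum gives the 120-point ceiling.
print(final_rating([4] * 21, 36))  # Excellent
```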
For more about the Danielson Framework, visit danielsongroup.org.
Described above are the primary decisions to be considered by joint committees implementing PERA. I am a five-year member of PEAC (Performance Evaluation Advisory Council), an Illinois superintendent of 26 years and a tenured university professor who taught many teacher evaluation classes and Danielson training sessions. Most importantly, as a facilitator, I listened to the five joint committees independently analyze the above decisions. Given that background, I feel the model described above is the fairest, the most practical and the most educationally sound way to meet the PERA requirements.