Teacher Evaluation: Lions, and Tigers, and Bears–Oh, my!

[A friend told me the other evening that she was tired of checking my blog to still find the Mojo piece. I know what she means. I’ve not made the time to write much lately. I started this post a few weeks ago and finally got back to it over the weekend. This is for you, Margaret.]

My teacher colleagues take great pride in their work—pride in the day-to-day planning and implementation of instruction, in classrooms from Brandywine Hundred to Delmar, from Townsend to Cape Henlopen. They speak about their students with love and dedication, even when sharing funny stories or talking about the frustrations of working with reluctant learners.  Their sense of pride and ownership is remarkable. They think in terms like these: “It’s my class, my room, my kids, my school.” 

However, this year, teachers everywhere feel they have a tiger by the tail. Like the evaluation and rating of anyone’s job performance, teacher evaluation has never been easy, has always caused some anxiety, and has always required time and care for any one teacher to clearly demonstrate his or her abilities as an effective instructor. This year, all of us are trying to get a grasp on recent changes in DPAS II (Delaware Performance Appraisal System) that are intended to revolutionize the way that teachers are evaluated. Meanwhile, it’s tough to get a good hold on that tiger, to keep that hold, and to avoid getting mauled.

I am currently on a leave of absence from my teaching job so that I can serve for three years as the president of the Delaware State Education Association. However, I am also a 6th grade science teacher. Every year for the past 22 of my 39 years, one of my building administrators would come into my classroom for an entire science class and observe me doing what I do best–teaching science. Building administrators have been observing teachers for decades in order to get a sense of a teacher’s instructional prowess and competence. Additionally, classroom observation has been used to gauge the effectiveness of district-developed professional development, as well as to identify teacher compliance with administrative directives.  

In the past, I have been consistently judged a competent, effective teacher and have always been able to meet or exceed each of the many criteria in the state teacher evaluation system. But this year, there will be a new twist to teacher evaluation throughout Delaware. If I were still in the classroom, a significant portion of my evaluation as a science teacher would be based on two pieces of rather unscientific data: 30% of my rating in Component 5 (Student Improvement) would be based on a composite of school-wide scores for math and reading tests taken by all 900 students in my home school; the other 70% would be based on the math or reading scores of a cohort of my science students. I am quite uncertain about how this would turn out. For the first time in my career, I truly fear that I could be judged “Ineffective” or even labeled as “Needs Improvement” based on those damning test results.

Remember, I teach science. Obviously, science instruction involves a fair amount of reading and a good chunk of math. I have always done whatever I could to support 6th grade reading and math literacy. I make sure each year that my students understand the skills and strategies needed to succeed with non-fiction text. I reinforce some reading activities—but, I am not the reading teacher. Every year I do basic instruction in constructing and reading graphs and in analyzing the data we collect. However, I am not the math teacher! 

It’s a jungle out there, and Component 5 is turning out to be a real bear.

It has been my experience that teachers have long understood and responded to accountability. They felt accountable the moment they walked into the classroom. They were instantly and irrefutably accountable to the students in front of them and to the families to whom those students returned each evening.

It is no surprise that Delaware is actually way ahead of the curve in the development of a strong, state-wide teacher evaluation program. Edreform has been a dedicated hot topic throughout this state since 1983. The Delaware Performance Appraisal System, a.k.a. DPAS I (pronounced “D-Pass”), was introduced in the mid-’80s. Prior to DPAS I, individual districts created and enforced their own teacher evaluation systems. DPAS I was well-conceived, and the training program for administrators and teachers was pretty thorough for its time.

During the first half of this decade, we all worked together to design an updated, enhanced state-wide evaluation system. The plan was first piloted in two school districts and rolled out the following year to a few more districts, and the final roll-out was completed in 2008. This is the system on which I have been judged for the past three years. It is really a very good system, based on a framework designed by Charlotte Danielson, a renowned guru of instruction. Ms. Danielson was able to capture all of the substantive elements that make up effective instruction and to describe them clearly and succinctly, along with various charts and rubrics, in Enhancing Professional Practice: A Framework for Teaching, the book on which DPAS II is based. She organized the elements under four domains that make up the first four components of DPAS II. http://charlottedanielson.com/theframeteach.htm

In the past year, a renewed emphasis has been placed on a fifth part of the DPAS program—Component 5: Student Improvement. “The proof is in the pudding,” goes the saying. So, the proof of my value and worth as an effective teacher should be found in the results. Pure and simple. And, across this great country, there are folks who are wedded to the idea that the results can be and should be measured by student test results. They also believe that test results should trump mere classroom observation. It seems simple enough. It should be a matter of input and output.  

INPUT: If one does a careful and mindful combination of all the many things that an outstanding teacher should do (1) to plan and prepare instruction, (2) to create a worthwhile environment for teaching and learning, (3) to implement and deliver a focused, appropriate, accurate, pedagogically sound lesson or instructional program, and (4) to fulfill all professional responsibilities associated with classroom teaching, then, ipso facto, one should get the resultant OUTPUT: a clear, measurable demonstration of (5) commensurate student growth.

[Actually, Danielson doesn’t see it quite that way.  She is confident that she has hit on an authentic and accurate representation of the elements of effective instruction—what it takes to teach and teach well. However, she recognizes that the OUTPUT phase can be difficult to identify and quantify using valid, reliable measures. She is working now on a way to describe what and how this might be “measured.”]

The next step: One should be able to measure this output in order to validate classroom observations of effective teaching practice. It appears that the easiest way to quantify this deliverable would be to administer a test that should measure Johnny’s and Suzy’s level of learning. In the current environment of edreform, the equation seems simple: a less than satisfactory test score = inadequate instruction = ineffective teacher.

Implications? It would appear that substandard student test scores would indicate ineffective teaching–simple cause and effect. It seems to make sense that there would be clear and direct alignment between effective math and reading instruction and students who consistently meet or exceed the standards on both the state math and reading tests. To play off of an old computer acronym: GTI / GTO—good teaching in / good test scores out.  The common thinking is that the inverse would likewise be true.

Oh, did I mention that (a) this is all based on a brand new test and (b) in just a few years, the test will be replaced with another new test?

So, why are teachers distraught? Well, out of the blue, or so it may seem, Component 5, which had previously involved a series of goal-setting and goal-measuring tasks—activities and data collection that had some potential but really had no teeth, and could have been much better managed by districts and the state alike—reared up and bit ‘em on the butt.

There they were, working their way through various approved curriculums and projects when that 5th component of DPAS—the one not delineated within the Danielson framework—took on new meaning and additional weight. Component 5 also now carries with it a framework of labeling and corrective actions, including the ultimate threat of termination.  

Depending upon one’s point of view, this could be good news for modern man or the bad news blues. Whatever it is, and whatever it turns out to be, it is currently rocking the world of teaching in Delaware.      

Disappointment? Yes. Many teachers are disappointed, but mostly, they report being fearful–worried, alarmed, anxious, apprehensive, troubled, distressed, tense, and uneasy—you name it, they feel it. The folks who deliver instruction to some 129,000 students across this state are more than a little concerned about the potential outcomes of the latest iteration of what had previously been a universally respected system of teacher evaluation.  Starting last spring, teachers began to get that uneasy, queasy feeling that Component 5—a piece that is still under development—was turning out differently than anticipated, that it was something both unexpected and unpredictable. Oh, my.


16 Responses to Teacher Evaluation: Lions, and Tigers, and Bears–Oh, my!

  1. John Young says:


    Thanks for a very nice update and for covering the impact side, not just the progress side.

  2. gilda says:

    Great analysis of a complex, and completely political, additional component to teacher evaluation. Might it have something to do with the swelling numbers of employees at the state department of education?

    • Frederika says:

      Not really. They are pedaling as fast as they can. They have created HUGE tasks to accomplish–there may not be enough people to do what they feel needs to be done. There is incredible pressure–external and internal–to get it moving, keep it moving, move it faster and farther. My inclination is not to slow it down but to take the time to do it right. Get the test RIGHT. Arrange a contract to test all four core subjects. Why just test math and reading each year? Reinforces the idea that the other subjects don’t really count. And, for God’s sake, rid yourselves of the notion that any teacher of any subject can have their work and effectiveness judged on the results of that same math and reading testing. Makes NO sense. None at all. Not fair, certainly not valid, and sure looks and smells legally indefensible to experts–if it were to come down to that.

      • gilda says:

        I would imagine that lack of money will be the main reason to continue to do it the wrong way and in a hurry….I believe reason has left the building….

  3. Mike Matthews says:

    Thank you for this, Frederika. In one of our recent PLCs, our data coach passed around a sign-in sheet. Next to our names was a column that said “I feel…” so each of us was supposed to finish the sentence. All five of us in that meeting put something along the lines of “overwhelmed,” “stressed,” and “tired.” I asked the data coach if she would be reporting that data back. She said this was more “for her” and wouldn’t be going anywhere. I told her it would probably be a good idea if they did keep such data and report it to the state to get an idea of how the teachers are feeling. We really are lost at this point. For me, I’m not sure I want to be found.

    • Frederika says:

      I feel… sick that some of the potential and ideals of initiatives that actually have some promise are being squandered. PLCs are like a dream activity to me. It is what my team was trying to do years ago when we worked on some integrated lessons. We trained together, planned together, assessed together, gathered data together, talked and talked–all on our own time. We spent a weekend in Dover getting special training–including the art teacher. Now, PLCs in some districts are controlled from above–micromanaged. Student testing should be getting better–not causing more problems. Teacher evaluation should bring everyone’s attention to doing it right–making a system that genuinely documents and accounts for effective teaching and then provides resources if effectiveness is lacking. It’s a good system–let’s not compromise or degrade it.

  4. Toby W. Paone says:

    Very good post, FJ. Should be read by many.

  5. Pingback: New union prez adds voice to evaluation debate | Delaware Ed

  6. Stressed inDE says:

    I would love to see this published in the News Journal!!!!!

  7. Frederika says:

    Thanks for the link to the comments from the op-ed piece that Delaware EA presidents submitted. And, especially, thanks for standing up for teachers and education unions. That NOthing guy is a lost cause. Of course, effective teaching is the #1 overall factor in student success. Why would that be an argument? However, even the most effective of teachers may have a great deal to overcome with many under-achieving students in our high needs schools. No sense trying to explain how principals can and do exit the poorest and most ineffective teachers from the profession–student test scores are not necessary for that. I have worked for two guys who quite appropriately cleaned house–with my blessings. No teacher wants to work with or protect ineffective teachers. Unions do want to protect basic rights for all members. Too many principals fail to do a good job of weeding out ineffective teachers in the first three years when they are probationary. If they can’t or won’t take care of business then, how are they going to put in the time and effort to do it later? It takes time to document poor teaching. Big job–important job. Certainly not impossible.

  8. Frederika says:

    Check out some of my earliest posts on the problems with tenure. Way back to fall of 2010.

  9. Ancora Imparo says:

    Frederika, thank you for taking the time to communicate to others. Living in the edreform/eduspeak, I am often caught in the whirlwind to slow it down. Definitely should be published. Creating assessments in component 5 scares me…as a parent. I hear teachers say, DCAS doesn’t test grammar, so no need to emphasize…but SAT does assess grammar. Other teachers comment, no need for depth in WW2, since not tested. These tests, rushed for federal compliance, impact instructional, curricular decisions. I heard many teachers ask impacting questions. They were answered by, “this is not the task.” When is impact not an educator’s task? My biggest frustration is that my children are in the system at this time.

  10. Audrey Noble, PhD says:

    This is not the first time the state has used its testing program for purposes beyond those for which it was valid. But evaluating teachers’ performance and endangering careers is a huge step beyond requiring kids to go to summer school or grade retention. Has anyone questioned the state’s technical advisory committee members about the validity of using reading and math test scores to evaluate teachers, especially those who teach other subject areas? I’d bet that the answer would be a resounding “NO”. Good assessment has strong validity and reliability components to it. Does DCAS have either with regard to teacher performance evaluation?

    • Frederika says:

      Audrey: DSEA most certainly has questioned the validity of using math and reading test scores for teachers other than math and reading/English teachers. We have questioned the validity of the use of some “composite” of scores of all students tested–the school-wide score–as even a small part of any teacher’s evaluation. We have directed these questions to the SecofEd and to Linda Rogers who heads up the staff on teacher evaluation. We were informed a month ago that the TAG was arranging for “beta testing” of the plan to use the cohort scores. Apparently they have three school districts that have volunteered to allow them to use the fall scores and the January scores to “run the traps.” Any suggestions from you would be welcomed.
