Monday, January 26, 2015

Washington State Democrats Lead the Way

Last Saturday, the Washington State Democratic Party passed resolution 707 opposing the Common Core State Standards by a margin of over two to one.
Like the coming snowstorm here in the Northeast, let's hope this is the start of a political blizzard. Teacher activist Susan DuFresne has issued a press release together with David Spring and Elizabeth Hanson to help get the word out. Increasingly, mainstream media is simply not covering important education stories with due diligence, perhaps because education journalists tend to have a lower status and do little investigative reporting compared to their colleagues. To help meet the public's need for information on growing resistance to high stakes testing and the Common Core, bloggers such as Anthony Cody provide a reliable alternative.

I got my elementary teaching credential at the University of Washington and taught in the Edmonds School District just north of Seattle for four years before returning to graduate school to complete my PhD. Back in the mid-1990s, there were no high stakes tests. There was a concerted effort on the part of the district to involve expert teachers in developing performance assessment tasks, and I worked on the ones in elementary math together with colleagues. The test prep I provided for my class prior to the one standardized test given in the spring was a homework packet and some test taking strategies. Oh, the (g)olden days! Now test prep IS curriculum, and we've been swimming in those toxic waters long enough for it to seem almost normal in the city schools where I work with my student teachers. It's increasingly clear that we need the solid backing of parents to persuade politicians to say enough is enough. 

In an effort to help educate the public on the growing misuse of standardized tests, I published my first op-ed piece last week in the Journal News.
Please have a look, share if so inclined, and comment if you have something to say. 

Friday, January 2, 2015

Standardized Testing: The Final Frontier

Tests seem so reasonable at first -- teachers teach, and students learn and demonstrate mastery by passing a test. But as Daniel Koretz says at the start of his 2008 book, Measuring Up: What Educational Testing Really Tells Us, “Achievement testing is a very complex enterprise, and as a result, test scores are widely misunderstood and misused.” Now that is what I call an understatement. Furthermore, despite Common Core claims that better standards and tests mean fewer reasons for concern about their misuse, as Vito Perrone of Harvard University pointed out, “Most items on these various standardized tests remain well within the longstanding technology of testing, primarily to support the mechanical scoring procedures. They still seem to be limited instruments with too much influence” (1999, p. 152).

The testing “enterprise” is poised for a warp drive record-breaker of misuse insanity. In a nutshell, here’s how they plan to connect the dots.

A tiny fraction of what a student knows and can do is hypothetically captured, with some modicum of so-called scientific accuracy, by converting the number of correct answers out of the total number of questions on a standardized test to a raw score. Keep in mind that this single raw score is still an error-prone measure of what the student knows: the student may have guessed, correctly or not, or the performance may reflect other circumstances such as illness, distraction, or nerves. The test is also imperfect by design and is likely biased in some ways.

Now that raw score goes through some psychometric process: it is normed to a scale that compares it to other test scores, ranked somewhere between unacceptable and excellent based on someone’s judgment of what students should know and be able to do, or both. This is where all hell breaks loose as that converted score gets used.
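For readers who want to see the arithmetic, here is a minimal sketch of those conversions in Python. Everything in it is an assumption made for illustration: the function names, the norm group, and the cut scores are hypothetical, and real testing programs use far more elaborate scaling procedures than this.

```python
from bisect import bisect_left, bisect_right

# Hypothetical cut scores: (minimum raw score, label). Someone's judgment
# call, not a property of the student.
HYPOTHETICAL_CUTS = [
    (0, "unacceptable"),
    (20, "approaches standard"),
    (30, "meets standard"),
    (36, "excellent"),
]


def percent_correct(num_correct, num_questions):
    """Raw score expressed as a percentage of questions answered correctly."""
    return 100.0 * num_correct / num_questions


def percentile_rank(raw_score, norm_group_scores):
    """Percent of the norm group scoring below raw_score, counting ties as half."""
    ordered = sorted(norm_group_scores)
    below = bisect_left(ordered, raw_score)
    ties = bisect_right(ordered, raw_score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def performance_label(raw_score, cut_scores):
    """Return the label of the highest cut score the raw score reaches."""
    label = None
    for minimum, name in sorted(cut_scores):
        if raw_score >= minimum:
            label = name
    return label


if __name__ == "__main__":
    norm_group = [10, 20, 30, 40]  # hypothetical raw scores of other test takers
    for raw in (29, 30):
        print(raw, percent_correct(raw, 40),
              percentile_rank(raw, norm_group),
              performance_label(raw, HYPOTHETICAL_CUTS))
```

Note that with these made-up cut scores, one more lucky guess moves a student from one label to the next, which is the kind of fragility the raw score already carries before anyone starts using it.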

How might it get used? For one, to tell the students and the parents or guardians how “well” they did, which can involve labeling the converted score with a percentile rank, a grade-level equivalent, or just a descriptive meaning such as “meets standard.” However, it will likely be used in what is called a “high stakes” way to assign students to special education, to hold them back a year, or to track them into homogeneous groups.

The most pernicious use is to group the scores to make claims about the quality of individual teachers. From there, it’s easy to see how tempting it is to make a claim about the quality of a school, and then a whole district. While we’re at it, let’s compare counties, states, regions, countries.

The cold hard truth, in Koretz’s words, is this:
Scores on a single test are now routinely used as if they were a comprehensive summary of what students know or what schools produce (pp. 44-45).
He goes on later to add:
Simply attributing differences in scores to school quality or, similarly, simply assuming that scores themselves are sufficient to reveal educational effectiveness, is unrealistic. And more generally, simple explanations of performance differences are usually naïve. All of this is established science (p. 142).

Things get really tricky when hierarchical linear modeling kicks in to provide a “value-added” way to compare actual scores to a prediction and to use the difference to rate teachers’ effectiveness. Ignoring warnings from experts, policymakers have misused these value-added models, or VAMs, to weigh heavily in the annual evaluation of teachers. Carol Burris, an outspoken principal who opposed this misuse of standardized test scores, recently wrote of a teacher’s lawsuit filed in New York State by my friend, Sheri Lederman, who hopes her case can become “a tipping point” in bringing this damaging, unreliable practice to a grinding halt.
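To make the “value-added” idea concrete, here is a deliberately simplified sketch in Python. It is not hierarchical linear modeling and not any state's actual formula: it swaps in an ordinary least-squares line that predicts this year's score from last year's, then averages each teacher's students' differences between actual and predicted scores. The function names and the sample records are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for predicting ys from xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def value_added(records):
    """Average (actual - predicted) score per teacher.

    records: list of (teacher, prior_score, current_score) tuples.
    """
    priors = [prior for _, prior, _ in records]
    currents = [current for _, _, current in records]
    slope, intercept = fit_line(priors, currents)

    residuals = {}
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        residuals.setdefault(teacher, []).append(current - predicted)
    return {teacher: mean(diffs) for teacher, diffs in residuals.items()}


if __name__ == "__main__":
    # Hypothetical data: two teachers, two students each.
    sample = [
        ("Teacher A", 50, 55),
        ("Teacher A", 60, 65),
        ("Teacher B", 50, 45),
        ("Teacher B", 60, 55),
    ]
    print(value_added(sample))
```

Even in this toy version, the whole “rating” is just the leftover error from a prediction, pinned on the teacher. Anything the prediction leaves out, from a student's bad day to what is happening at home, lands in that number.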

That may be wishful thinking because now the dots are being connected to the colleges and universities that educate teachers. They too are to be evaluated and ranked based on the performance of their candidates for teacher certification on standardized tests, of which there can be more than four in some cases. New federal regulations, currently open for public comment until February 2nd, would require these institutions of higher education to also track their teacher graduates and collect their annual evaluation ratings, including the VAM measure, in order to be considered eligible for the TEACH grant program. (I have previously written of how similar perverse incentives plague the new CAEP accreditation standards for these institutions.)

Here’s a test question for Arne Duncan, our Secretary of Education:
TRUE OR FALSE?
“A program’s ability to train future teachers who produce positive results in student learning [as measured by standardized testing] is a clear and important standard of teacher preparation program quality.” (from p. 63 in proposed regulations document)
Here’s a hint, provided by Benjamin Campbell of Richmond, Virginia on the federal register of comments: “Current research indicates that no more than 14% -- and often far less – of a student’s learning as measured by standard tests – the only standardized measure – can be attributed to the teacher.”

The bad news is that Arne Duncan, and a whole slew of politicians and policymakers in line behind him, think the correct answer to this question is TRUE. They actually believe harsh punitive consequences work and lead to improvement. They think closing schools and teacher education programs is a good idea. They don’t care if any of their plans are based on faulty data, junk science, or illogical statistics. They blithely ignore extant research, recommendations from experts, and, to put it bluntly, common sense. The question remains – what are we going to do about it?


As Captain Jean-Luc Picard would say, “Engage.”

Wednesday, December 17, 2014

'Tis the Season for Mudslinging

It’s the time of year when education bureaucrats get into a bullying mood, and this time their target is teacher education. The U.S. Department of Education has released its proposed teacher preparation regulations, which are open for public comment in the Federal Register through February 2nd, and they contain the same problematic features as the new CAEP standards (see my previous post). Here in New York we’ve had the public release of college-specific certification exam results, with inflammatory headlines proclaiming that future teachers are flunking and are illiterate (here are results). Our education commissioner, John B. King, continues his aggressive agenda to close teacher preparation programs with dubious justifications of accountability and transparency, only now he’s headed to Washington, D.C. to work side by side with Arne Duncan.


There have been some badly needed responses from teacher educators. Fred Kowal described why the release of the certification exam scores was irresponsible and unfair in a radio interview, and my colleague Howard Miller wrote an excellent letter in response to coverage in the New York Times. What has me in a rage is that I know firsthand from my student teachers at Mercy College’s Bronx campus just how impossible life is for them right now. Yet they are getting through all this adversity with admirable professionalism and poise, and I am bursting with pride. The schools where they have been doing student teaching are equally impressed, and their cooperating teachers are despairing that their internships are coming to an end this week. So in our weekly seminar meeting we talked about what might be an appropriate response. They helped me with ideas for the parody below. We hope it makes you laugh, but mostly, we hope it makes you think. These exams are really that bad.

Commissioner Regents And Policymakers Test

 (Time allotted: 60 minutes - tic tic tic)

Should Teachers Be Required to Pass Tests Before They Can Be Certified to Teach?

Use the passages below and the information in the graphic to write two focused responses and an extended response. Your responses should be written for an audience of educated adults. You must maintain an appropriate style and tone and use clear and precise language throughout. With the exception of appropriately identified quotations and paraphrases from the sources provided, your writing must be your own. The final version of your responses should conform to the conventions of edited American English.[1]

Focused Response Assignment: Use Passages A and B to respond to the following assignment.
In a response of approximately 100-200 words, identify which author presents a more compelling argument. Your response must:
- outline the specific claims made in each passage;
- evaluate the validity, relevance, and sufficiency of evidence used to support each claim; and
- include examples from both passages to support your evaluation.

Focused Response Assignment: Use Passage B and the Graphic to respond to the following assignment.
In a response of approximately 100-200 words, explain how the information presented in the two charts can be integrated with the author’s central argument about the impact of licensure requirement reforms on preservice teachers. Your response must:
- explain how specific information presented in the charts either supports or counters the author’s claims, reasoning, and evidence with regard to new licensure requirements; and
- include examples from the passage and the charts to support your explanation.

Extended Response Assignment: Use Passages A and B and the Graphic to respond to the following assignment.
In a response of approximately 400-600 words, present a fully developed strategy for policy reform that satisfies the needs of stakeholders, is informed by current research in the field, and balances benefits and drawbacks of testing preservice teachers prior to initial certification to teach. Your strategy for policy reform must:
- include evidence that you are knowledgeable and understand the issues;
- use research evidence and valid reasoning to support your strategy for reform;
- support the claims made with relevant and sufficient evidence from all three sources; and
- anticipate and address the counterclaims of those who will undoubtedly oppose your strategy for reform.


PASSAGE A 
By Imani Diot 
Pro: Dubious Funders of Education Reform (DFER)

Are most teachers smarter than a fifth grader? Probably not. Researchers from The Institute for Obscure Equations conducted a survey in 2009 and found that over 90% of currently employed elementary teachers could not solve this problem[2]:
How can our students be ready for college and careers if they aren’t taught to decode important secret messages such as these? We will end up with an entire generation that is completely reliant on Google for all knowledge.

Until now, we have asked colleges and universities with programs that prepare teachers to be responsible for ensuring they are smart enough for the job, but it turns out that those professors aren’t smarter than a fifth grader either, so we need a more objective measure. In fact, we need lots of objective measures. Research suggests that the harder the test, and the higher the cut score used to determine a pass rate, the less likely people are to question the validity of the test (Preason, 2012). In the past, teacher tests that were piloted and field tested nationally produced high passing rates and were the subject of ongoing debates about how the results must be unreliable and invalid.

To address this need, new tests have been designed that will ensure we are able to determine who is smart enough to be a smart teacher. Research has shown that smart teachers are hard to come by, because most teachers are lazy and only chose to teach because they want extended vacations and a shorter work day (Grates Foundation Report, 2011). These tests will require the ability to look at very small typeface in a fluorescent-lit room while using multiple tabs and windows to examine documents for hours without a break. These testing conditions have been proven to cause weaker candidates to experience panic, fear, and loss of confidence, especially when distracting noise is present. Testing topics have been preselected to be of interest to only a very small minority of the population to ensure maximum distractibility.

While computer-based testing can resolve most of the concerns about making sure our teachers are smart, practitioners will never give up the argument that they are insufficient measures of how teachers perform in front of real, live students. To counter this argument, a new test was created by brainiacs at Smartford University that will end this debate once and for all. This one requires weeks of preparation, decoding hundreds of pages of documents and guidelines, videotaping in public schools (which can require cumbersome permissions and paperwork), and months of writing up detailed analyses and data summaries. An added bonus is this exam can also lessen the power that colleges and universities have over the preparation of teachers, eventually eliminating the justification for useless theory and foundation courses.

Together with the other tests, we can be sure that only the very smartest and brightest will teach in our nation’s schools. Even if they only stay in the profession for two years, at least they will have provided our students with the googling skills they need to succeed.

PASSAGE B
By Wendy Tessurbad
Con: No Test Is Fair (NTIF)

Teachers today have to be prepared for a myriad of unexpected challenges: a student throws up on you, the door to the classroom breaks and you’re locked inside, or the intercom speaker in your room has a persistent buzzing hum. Can a test predict if you will know how to handle such situations? Of course not. But somehow the public is reassured that if teachers take tests they’ll be well equipped for the problems facing them once they have their own classrooms.

Multiple choice tests are notoriously silly and useless, and research has shown that high scores correlate with good test takers (ETS Report on the SAT, 2002). In the case of licensure tests for teachers, most present short examples and cases for analysis that do not reflect what teachers have to think about in a real-life scenario. Bias is also unavoidable, as in the following example:
Marcus offers you a smushed, half-eaten cupcake. You should:
a) Accept with a smile and eat it immediately to show your appreciation for his kind gesture
b) Tell him you can’t eat gluten but appreciate the thought
c) Split the cupcake into two parts and give one to your teaching assistant
d) Split the cupcake into lots of tiny crumbs so everyone in the class can have some
Were you fooled into thinking b was a good answer? The question is designed to trick celiacs and gluten-sensitive people into choosing that answer, but the correct answer is d, which has the most equitable outcome.

Another problem with the teacher exams is the scoring process. While computers can score multiple-choice questions, they are not very good at scoring essays, at least not yet. People who are hired to score teacher exams are not paid very much, and would probably find a job at a fast food burger restaurant more enjoyable and satisfying. Research has shown that hours and hours of reading similar responses can lead to an inability to distinguish between a STRONG grasp of writing skills and a SATISFACTORY grasp of writing skills (Minduming, 1998) and in at least 15% of the cases studied, to inappropriately selecting NO grasp of writing skills whatsoever. Some testing opponents have even suggested that test scorers outsource the work to their children and bribe them with candy and video games (Nutjob, 2012).

Finally, it’s self-evident that licensure exams are not necessary. Preservice teachers have to pursue their degree while juggling jobs and family responsibilities, they have to write endless lesson and unit plans using original ideas, and prior to student teaching they have to squeeze in time for fieldwork in school settings without conflicting with their current employment. Then for the semester of student teaching, they have to wake up very early every morning even if they were up late studying and writing papers, or if they suffer from various symptoms related to the constant exposure to germs, and they have to do all the work their cooperating teacher doesn’t feel like doing, plus help all the students the cooperating teacher doesn’t know how to help (or doesn’t like), and they have to prepare and deliver perfect lessons for their supervisor’s visits. “I just love student teaching!” said no teacher candidate ever. Furthermore, if they are unable to control the most unruly and disobedient students, they are deemed unfit for teaching and may be told they need to repeat student teaching. Only those people who really, really like kids and the challenges of teaching would put up with all of that for a measly starting salary and unacceptable working conditions. Why also put them through difficult-to-pass exams that cost hundreds of dollars? That’s just rubbing salt in the wound.

Tests are bad, tests are dumb. Here’s my verdict: A pointing down thumb!


GRAPHIC

New York State Preservice Teachers Survey (NYS IHE, 2014)
3,278 teacher candidates throughout New York State were surveyed about the new licensure exams: Academic Literacy Skills Test (ALST), Educating All Students (EAS), Content Specialty Test (CST), and the Teacher Performance Assessment (edTPA). They were asked to check boxes correlating to a range of symptoms reported from pilot testing. Percents reported below have a margin of error of 0.00001.


Exam    Financial ruin   Panic, stress, depression   Need new eyeglasses   Migraine headaches   Broke up with partner
ALST    97%              100%                        99%                   98%                  95%
EAS     95%              92%                         91%                   99%                  93%
CST     96%              97%                         89%                   92%                  97%
edTPA   98%              100%                        92%                   97%                  99%


Washington State Preservice Teachers Survey (WAS IHE, 2014)
Following five years of pilot testing, the edTPA requirement went into effect in Washington State. 1,003 preservice teachers were surveyed about their response to the edTPA requirement.


[1] The wording and format of the CRAP test are identical to those of the ALST. A study guide can be found here.