The Evidence for Building Thinking Classrooms is Weak
What does it mean to be research-based?
Building Thinking Classrooms contains Peter Liljedahl’s comprehensive system for getting kids to actually start thinking in math class. According to Liljedahl’s research, most students spend math class subverting the expectations of their teachers. They “student” and find clever ways to avoid doing any actual thinking. Without thinking, there is no learning, and so teachers need to get kids thinking.
His system is truly comprehensive. The book has fourteen chapters, each describing a different aspect of the classroom. Here is what the first ten chapters say, in brief:
Only assign questions that students solve without direct instruction.
Randomly place students in groups for their classwork.
Give each group one marker and tell them to work on a “vertical non-permanent surface,” i.e., a mounted whiteboard.
Don’t put the desks in rows or columns.
When a student asks for help, only answer their question with another question.
“Give the first thinking task in the first 3-5 minutes of class, give the thinking task with students standing loosely clustered around you.”
Give homework, but do not check it, mark it, or ask about it. Don’t use words like “practice” or “assessment”; instead, use phrases like “this is your opportunity.”
Encourage kids to walk over to successful groups to learn from what they’re doing (“mobilize knowledge”).
Use hints and extensions to keep students in a “flow” state.
End the lesson by teaching from the work that students have already done (“consolidating”).
This system and its methods, he writes, “was almost universally successful. They worked for any grade, in any class and for any teacher.” Wow.
The book is a record of his research, some of which has been published, much of which is appearing in this book for the very first time. I am very interested in research, so let’s take a look at the research informing his book.
There are four papers of Liljedahl’s in the references of BTC.
Liljedahl, P. (2016). Building thinking classrooms: Conditions for problem solving. In P. Felmer, J. Kilpatrick, & E. Pehkonen (Eds.), Posing and solving mathematical problems: Advances and new perspectives (pp. 361–386). Springer.
Liljedahl, P. (2018). On the edges of flow: Student problem solving behavior. In S. Carreira, N. Amado, & K. Jones (Eds.), Broadening the scope of research on mathematical problem solving: A focus on technology, creativity and affect (pp. 505–524). Springer.
Liljedahl, P., & Allan, D. (2013a). Studenting: The case of homework. In M. V. Martinez & A. C. Superfine (Eds.), Proceedings of the 35th Conference for Psychology of Mathematics Education—North American Chapter (pp. 489–492). University of Illinois at Chicago.
Liljedahl, P., & Allan, D. (2013b). Studenting: The case of “now you try one.” In A. M. Lindmeier & A. Heinze (Eds.), Proceedings of the 37th conference of the International Group for the Psychology of Mathematics Education (Vol. 3, pp. 257–264). PME.
In the homework “Studenting” paper, they pick five classrooms, Grades 10-12, and interview kids about whether and how they did their homework. On the basis of the interviews, they conclude that teachers should not mark homework, because marking causes cheating, mimicking, and other undesirable behaviors.
The second Studenting paper involves very brief (1-4 min.) interviews with fifteen 11th graders after a single “traditional” lesson. They classify their behavior/responses as Amotivation, Stalling, Faking, Mimicking, and Reasoning. They conclude that in a traditional math lesson, most students aren’t engaged in thinking.
For the 2018 “On The Edges of Flow” paper, Liljedahl observed a Grade 11 and a Grade 12 classroom. He was looking for instances of students moving in and out of “flow state” into disengagement. He provides a list of six situations he observed that show the various causes of this move, such as “quitting” or “seeking increased challenge.” He concludes that hints and extensions can move students back to flow.
The 2016 paper is a dry run of the Building Thinking Classrooms book. There’s a lot of theory there, but I’m focusing on the results he describes from the data he collects.
First, he thought about his own teaching and came up with a list of nine elements that impact whether students think in class or not. He then observed many (40) other teachers, and decided that these nine elements were important in their classrooms as well.
He developed his “thinking classrooms” system and tested it on his own students and on the teachers that he worked with. You’ve seen that system already up above; it was those bullet points that you skimmed.
So: what did Peter L. do to see if the system worked?
First, he describes a study involving five high school classes. He put them in groups, told them to work with various surfaces (e.g. paper, notebook, whiteboard) and measured how long it took them to start working. Then he rated them on a scale of his own design to capture their “eagerness” or “discussion.” He found that vertical whiteboards led to the most engagement.
He then followed up with teachers who attended his workshops and asked if they took his advice and implemented vertical whiteboards. They told him that they did, and he checked 20 classrooms (he doesn’t say how they were selected) to confirm the reports from his interviews. Pretty much 100% of teachers told him they were still using his ideas.
He also asked teachers if they took his advice on placing students in random groups for their classwork. They likewise told him that they did, and his follow-up visits confirmed this. These teachers were a mix of elementary, middle, and secondary teachers.
And that’s it for the 2016 paper.
Is that all the evidence? The book reports many more results, but in a slightly haphazard way. Here are some typical quotes:
80% of students entered their groups feeling like they were going to be a follower rather than a leader.
It turns out that of the 200–400 questions teachers answer in a day, 90% are some combination of stop-thinking and proximity questions.
Although it is true that the students spent much less time writing, and that they did not fall behind, very few students (35%) actually spent the time listening.
We now observed 75%–100% of students taking notes, depending on the class, and 50% of students referring back to their notes at some point.
About 15% of the students told me that the unit they just finished was made up of a number of subtopics, and they were able to name or describe what those subtopics were. These students, for the most part, scored above 90% on the upcoming test.
There isn’t much context given about where these numbers are coming from, though he frequently notes that they emerged from his research.
We could imagine a version of this book that doesn’t say anything about research at all. “Here are my thoughts on what makes for good teaching,” it would say. “It’s based on my own teaching, observations, and the experiences of the people I’ve worked with. But don’t take my word for it—try it for yourself, and you’ll see that it works wonders.” Would that be OK?
I think the answer is, that would be very OK. That’s just telling people what you think. You have to be allowed to do that.
Now, how differently should we think about this system with the research support that Peter has given us? I’d argue, not much differently. It’s not that there’s no evidence, and it’s not that he’s playing loose with the facts. It’s just that the evidence is weak. It doesn’t support big generalizations. You wouldn’t want to bet the bank on it.
He doesn’t measure learning at all, only engagement.
He only measures engagement for vertical whiteboards and maybe homework, not for any other component of his system.
He counts teacher uptake as evidence of student engagement.
The measurements he reports are all for older kids.
We don’t know how he selected the groups he measured.
We have no idea where the percentages he reports are coming from in his book.
In fact, none of what he did to arrive at his results is particularly transparent, which is probably why his four papers are published in books and conference proceedings, but not in journals.
And while I’m tempted to be chill about this, it does bother me. The book and its supporters make extremely strong claims. “The results of this research sound extraordinary,” Peter L. writes. “In many ways, they are.” And it does seem to me that the book—which has become extremely popular in US math education—owes much of its success to its presentation as a work of strong research.
I don’t know what to do! I’m under zero illusions that people actually care about evidence. Nobody reads the papers. Very few people care about getting this right. And yet, apparently, almost everybody cares a great deal about the perception that some new thing is rooted in research. It opens doors, hearts, and wallets.
What do I want? I want more people to understand the difference between strong and weak evidence. I want people who do understand this to speak up, and hold the entire field to a slightly higher standard. Nobody wants cynicism, but we all need to muster a bit more skepticism about this kind of stuff, and not just for our opponents’ ideas.
Thanks for reading! I’m the author of the book Teaching Math With Examples. If you’d like to support my work, consider purchasing a copy or sharing this post.