Episode Transcript
[00:00:06] Speaker A: If everyone receives a different kind of test, then they should also receive a different kind of education.
[00:00:12] Speaker B: The Internet and then social media allows students and learners to really forge their own path.
[00:00:18] Speaker A: We need to change teacher beliefs as well.
[00:00:22] Speaker B: There have been pushbacks against the kind of entry test to university on the basis of equity. But in the absence of standardized tests, well, what are we going to do to replace that? And then we go back to what we had before standardized tests, which was really just systems of patronage. There is a very clear disconnect between the information and the data you want as a policymaker versus the information, the data you want as a teacher.
[00:00:48] Speaker A: How can you fight for alternative approaches and less perhaps standardized approaches when the end result should be a grade?
[00:01:08] Speaker C: Welcome back to the third episode. And in this one, I think we want to shift the focus a little bit, not so much on the teachers, but on the people that determine assessments and how we can help that lead to better assessments which lead to better teaching and learning. So we're thinking about the assessment learning relationship, but more from the policy and assessment point of view.
And so this should be simple to wrap up.
[00:01:37] Speaker B: It takes 10 minutes.
[00:01:38] Speaker C: I think we can sort this out really easy. I don't know why it takes so much discussion.
So we're thinking, well, let's start with kind of general principles.
So if we're trying to think from an assessment point of view and we're thinking, what's going to help teaching and learning?
What are the kinds of questions we might be thinking of? Kind of looking at you, Nate, you must have thought about this in the development of the Oxford Test of English.
[00:02:05] Speaker B: What kind of example? Yeah, I mean, I suppose so. I mean, when we designed that test, I mean, we were very, very much thinking about what the test was going to be used for.
And I think what we really wanted to do was to think, okay, what kind of language do people use in the real world that that test is going to be used for? So this is a kind of C1 test. It's going to be used for university or professional purposes because that's really what C1 level is really about.
And then think about, well, what do people do in those kind of scenarios? And that's why we integrated a lot of mediation style tasks and activities into it.
So thinking about integrated skills. So reading into writing was kind of really, really important.
Okay, what are people doing? They're reading things and then they're writing about what they've read.
That's what happens at university, therefore that's what your assessment task should look like. And if you can build that in, then hopefully that will have this kind of washback, backwash effect downstream, whereby that's how people will prepare for those activities.
Exactly. I mean, that's what it's all about. Yeah. Because, I mean, this notion of isolating skills is something that we're still struggling with.
I think it goes back to, again, there's this one kind of key text from the 1960s that really introduced the notion of speaking, reading, writing, and listening into language education and testing in particular. And it's been dying hard ever since.
It's a slow death, very slow. And even if you try and move away from that, you will always run up against policymakers, decision makers, whether they're at university or whether they're politicians, saying, no, I need a speaking score, a listening score, a reading score, and a writing score. So the way around that is by having, okay, we'll have integrated skills tasks, but they will contribute to one particular score. So the reading into writing will contribute to writing.
And the speaking into listening, or the listening into speaking, will contribute to speaking, because that's the product, that's the outcome that you're producing. That's the kind of linguistic artifact, almost.
[00:04:09] Speaker A: How do you measure listening and reading, then?
[00:04:11] Speaker B: We don't, in those tasks specifically. Yeah, you can try, but you will not be very successful. Because what we found is that the overwhelming majority of the variability in scores in integrated skills tasks links to the productive part of the test. So, yeah, a reading into writing task will relate more strongly to a writing task than it will to a reading task, because it is predominantly a writing activity, and it's the writing overwhelmingly that you're being judged on. But again, from the point of view of standardization, coming back to the previous episodes, you have to think very, very carefully about what it is you want the test takers or the students to do with that input material, how they're going to use it in the response, and how the markers, the assessors, will look for how it has been used effectively in those responses. That is probably the most difficult aspect of standardizing those tasks.
[00:05:07] Speaker C: But am I also right in thinking that where you have integrated tasks, you also have single skill tasks? So you have single skill writing. So you've kind of got to check.
[00:05:18] Speaker B: Which is a bit of a conceit, really, because there's no such thing as an isolated language skill.
We talk about them as if they really exist. I mean, even in something which is notionally a listening task. Let's take a listening test, and then you're given an audio script, but the multiple choice questions are still written down. So there's still a very heavy reading component at the end of the day in a listening test. And there's no easy way around that, because you have to assess something like listening, which you can't observe, through something that you can observe, and that is interaction with a task on a piece of paper, or it's through verbal interaction. But then again, in verbal interaction you're not typically assessing the listening as such. You're assessing speaking as well. Yeah, yeah. So again, listening is something that doesn't really exist in isolation.
[00:06:08] Speaker C: I always thought it's interesting what you're saying about the policymaker expectations and perhaps some of the psychometrics, which was trying to say we want to know exactly what your listening skills are untainted by other things.
I always thought there might be a practical element in it, in that to organize a listening test in the past at least you had to have everybody in a room listening to a recording.
And to have a speaking test, you had to organize that in a particular way. I mean, the writing and the reading could be integrated, but just because of that organizational thing, you had to say there were separate test components.
[00:06:45] Speaker A: Practicality element.
[00:06:47] Speaker B: Yeah, several practicality elements to it. Yeah. So I mean, the notion of an exam room still exists, but maybe it's no longer about, you know. Well, I mean, it's really about security these days, and ensuring the security of your item bank in a computer adaptive test and ensuring the absence of cheating, which is, you know, how the Chinese did it many millennia ago. The purpose of actually having everybody together in one space was so they could be observed and so you could ensure the absence of cheating, which ensures reliability, ensures validity, ensures dependability and so fairness as well, and equity, which is built upon all this. I know, you know, nobody really links equity to high stakes standardized assessment, but actually that's why it exists. Yes. And there are movements about moving away from standardized testing, particularly in the United States. There have been pushbacks against the kind of entry test to university on the basis of equity. But in the absence of the standardized tests, what are we going to do to replace that? And then we go back to what we had before standardized tests, which was really just systems of patronage and advantage.
[00:08:00] Speaker A: And also if you, if you lose that, then, I mean, how are you going to continue education then? So if everyone receives a different kind of test, then they should also receive a different kind of education. But if they all receive different tests, and then they are herded into education, which is also, in a way, standardized, then the two parts don't match up. But then it should be individualized education for all, which, again, from a policymaker perspective, is complicated and almost impossible.
[00:08:36] Speaker B: Yeah, technology can help with this. I think to a certain extent we are moving towards individualized education. Again, it's a slow advance, but it is increasingly happening.
You know, the Internet and then social media allows students and learners to really forge their own path. And then to a certain extent, we are individualizing assessments through things like computer adaptive testing, which is what the Oxford Test is built upon.
And what that does is, it kind of has an engine, an algorithm, underpinning it, and then it's responsive to whatever test takers do in the test. If they get an item correct, it gives them something which is more challenging. If they get it wrong, it gives them something which is slightly easier. And then the idea is to sort of hone in on and match the difficulty of the test with the test taker's proficiency.
[00:09:25] Speaker A: [Could AI] help with individualization?
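[Editor's note: the adaptive behaviour Speaker B describes above (a correct answer leads to a harder item, a wrong answer to a slightly easier one) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Item` class, the `run_adaptive_test` function, the 1 to 10 difficulty scale and the one-step rule are hypothetical and are not taken from the Oxford Test of English or any real test engine. Operational computer adaptive tests typically rely on statistical models rather than a simple staircase like this.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical test item with a difficulty on an arbitrary 1-10 scale."""
    item_id: str
    difficulty: int


def run_adaptive_test(items, answers_correctly, start_difficulty=5, max_items=5):
    """Administer up to max_items items, stepping the target difficulty.

    answers_correctly is a callable(Item) -> bool that stands in for the
    test taker. Returns the (item, was_correct) history and the final
    target difficulty, used here as a rough proficiency estimate.
    """
    remaining = list(items)
    target = start_difficulty
    history = []
    while remaining and len(history) < max_items:
        # Pick the unused item whose difficulty is closest to the target.
        item = min(remaining, key=lambda it: abs(it.difficulty - target))
        remaining.remove(item)
        correct = answers_correctly(item)
        history.append((item, correct))
        # Correct answer: aim one step harder. Wrong answer: one step easier.
        target = item.difficulty + 1 if correct else item.difficulty - 1
    return history, target


# Illustrative run: a simulated test taker who can answer items up to difficulty 7.
bank = [Item(f"q{d}", d) for d in range(1, 11)]
history, estimate = run_adaptive_test(bank, lambda it: it.difficulty <= 7)
print([it.difficulty for it, _ in history], estimate)
```

[In this sketch the sequence of items climbs while the simulated test taker keeps answering correctly and stops climbing once they start getting items wrong, which is the "honing in" idea described in the conversation.]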
[00:09:28] Speaker B: It can further as well. AI. Yes, of course, if we're moving towards, you know, ideally a vision of AI chatbots, which can actually interact authentically with learners in real time. Yeah, but this is where it gets challenging.
I saw some interesting research at a conference earlier this year. In real world conversation, what people tend to do is, if they're not overlapping in speech, although that does happen a lot, they are overlapping in terms of their thought. And they're usually formulating a response to what someone is saying about midway through that individual's turn. And that's currently what AI cannot replicate at the moment. Because AI, you know, it waits until someone has finished speaking, then it processes, then it formulates its response.
And it's good at doing that. And it can produce a response which is maybe formulated within half a second, but that half a second moves that conversation away from what you would have in an authentic interaction, which is instantaneous. So we're not there yet, but yeah, yeah. If AI can figure out a way, or AI companies can figure out a way, to interpret, you know, in real time what someone is saying, yes, that will be much closer to real world communication.
[00:10:43] Speaker C: Although it does bring to mind Harold Pinter plays so you have these nice dialogues where you've got two people seeming to have a conversation, but actually they're each on their own track because they're not listening to the other person. They're formulating exactly as you're saying, they're formulating their answer. And it's really interesting when you see dialogues happening like this in real life, they're not always responding to what the other person is saying.
[00:11:09] Speaker A: Oh, that is actually very common. Oscar Wilde already.
[00:11:12] Speaker B: Yeah, just waiting to speak in. Yeah, yeah, yeah, yeah.
Hi, I'm Nathaniel Owen, Senior Assessment Research and Analysis Manager at Oxford University Press, and I just wanted to point you in the direction of our fantastic position paper, the Impact of Assessment on Teaching and Creating Positive Washback. Our team of distinguished experts go into detail on the concept of positive washback as well as sharing some recommendations to help schools and policymakers foster positive washback and exploring how emerging technologies are reshaping the way educational institutions approach testing and assessment. Go ahead and download the paper today by clicking on the link in the description and enjoy the read.
[00:11:54] Speaker C: So let's think.
We're working in the Ministry of Education in country X and there is a determination to raise the level of English in that country seen as really important for global economy, engagement, etc.
And somebody says, well, the best way to change it is to change the exam. We make the exam more communicative and then everything else will follow.
Is that something that you would recommend to the minister or what would be your response to that suggestion?
[00:12:32] Speaker A: If I can take this, I'd say, I think it was in the first episode we said that it doesn't work so simply because we need to change teacher beliefs as well.
So I would say it's definitely a good direction, but we need teacher education as well and further continuous professional development to help teachers understand why these changes are happening and what they, or what their responsibility is in the entire process. So yeah, we basically have two layers here. One is the, like, exam design.
[00:13:15] Speaker C: Yeah.
[00:13:16] Speaker A: Which potentially might have a positive impact on teaching, but we also need teachers' buy-in and understanding. So we need this bottom-up layer as well.
[00:13:26] Speaker B: Yeah, yeah, I think I would say it has to be minimally accompanied by a range of accompanying materials that supports the introduction of that test.
As you said, so that the teachers and the students know why the changes are happening, but also know how to deal with those changes as well.
[00:13:42] Speaker C: Yes.
[00:13:43] Speaker B: So any kind of advice around pedagogy, any kind of advice around the kind of activities you can do to support the intention of implementing these changes, so that the effect on teaching and learning can actually take place.
[00:13:55] Speaker C: Yeah.
[00:13:56] Speaker B: Again, one of the major problems, of course, is the fact that at the end of the day you're still going to have one teacher and about 30 kids in a room and that, you know, educational reform, examination reform isn't going to change that reality.
[00:14:08] Speaker C: Yes.
[00:14:08] Speaker B: So very often the teacher will still find that time dominated by classroom management, whatever it is they're trying to teach, whether it's communicative or whether it's just more grammar translation activity.
Yeah.
[00:14:19] Speaker A: Also another thing that tends to happen is that teachers get frustrated whenever a new change is introduced and they think like, oh no, another one, the latest thing. Yeah. And I need to change again.
And that's why their buy-in is so important. I get it, absolutely, that most of their time is taken up by classroom management. But they also, they might be able to accommodate these things and difficulties if they know why this whole thing is happening.
[00:14:48] Speaker C: Yeah.
[00:14:49] Speaker A: And it's not because somebody wants to make their lives difficult. Sometimes it is about that. But yeah, yeah. Not always.
[00:14:57] Speaker C: I mean my experience with this has been a misunderstanding about time scales.
And so a ministry will say, yes, we need to raise the standard. We need to get everybody graduating from school at B1, for example.
And they want that to happen within four years.
[00:15:18] Speaker B: Within the election cycle.
[00:15:19] Speaker C: Within the election cycle. That's right. They're thinking that. And then they spend two years debating or working through what the curriculum change will be.
They'll then spend another year with the impact on the exams.
Then you've got the materials to be there. But what suddenly happens is that everything arrives in year three.
A revised exam, materials not quite aligned to that curriculum, which is not quite thought through. And it all lands on the teachers and they think, all my students are going to fail because I'm not ready for this new exam. And I think recognizing that actually it's a, it's a two term problem.
[00:16:04] Speaker A: Yeah, yeah. Again it's about, I think that policymakers should, should be invited into the classroom and yeah, like they need to see what, what's actually going on. It's a bit similar to how project managers think about a project and then think that, oh, if one person can do this in.
Oh, I think there's this classic example.
If a pregnancy is nine months long, then what if we bring in two people, can it be cut into like 4.5 months and can it be done quicker? And it's this kind of understanding of the other side. And then they might think or plan with different timescales in mind.
[00:16:49] Speaker B: Yeah, yeah, yeah. To accept that there is a pragmatic reality. I mean if you're a teacher, you're interested in the 30 kids in front of you and what works for them.
And that's why sometimes there's a bit of a disconnect between what happens in education research and what happens at the national level. So, I mean, education research is predominantly, well, it's dominated largely by people who are former teachers. That's why a lot of education research is primarily qualitative in nature and very sort of student focused. Whereas as a policymaker, you're interested in the big picture, you're interested in the national picture, not what happens in school X in town Y. You want to influence change at the national level, and you can only do that with large scale quantitative data, which is provided by standardized assessments.
So there is a very clear disconnect between the information and the data you want as a policymaker versus the information, the data you want as a teacher in front of your classroom.
[00:17:45] Speaker C: That's such a good point.
[00:17:46] Speaker B: That's where the divide is.
[00:17:48] Speaker A: Well, that's why they should somehow be brought together. Policymakers entering schools and teachers entering policymaking.
[00:17:54] Speaker B: Michael Gove, I mean, he gets a lot of bad press from teachers and educators who tend to hate him. But to his credit, he did actually attend education conferences and did actually speak to education researchers, which is relatively rare.
[00:18:07] Speaker C: Yes.
[00:18:07] Speaker B: For policymakers.
[00:18:08] Speaker C: Yes.
[00:18:09] Speaker B: So hopefully in the future we'll see a few more education secretaries or people like that going into schools or.
[00:18:15] Speaker A: Well, that would be nice.
[00:18:17] Speaker C: Yeah, yeah, yeah. Okay. So I think. I think we've explored this topic about the kind of institutional impact, the relevance of assessment and learning for policymakers and also for assessment design. I think that's been really interesting. Thank you very much.
[00:18:34] Speaker B: Thank you.
[00:18:35] Speaker A: Thank you.
I just remembered we didn't talk about alternative assessment methods.
[00:18:40] Speaker B: Oh, we should do that. Can we do that?
Or can we fit that in?
[00:18:44] Speaker A: Yeah, I don't. And because that is such a great point, where policies and alternative approaches, assessment.
[00:18:51] Speaker B: What might that look like? Oh, yeah.
[00:18:53] Speaker C: But do you mean for formal assessments or for informal assessments?
[00:18:57] Speaker B: You mean on both?
[00:18:59] Speaker A: Why not both? But then there's the other disconnect. And I just went to, like a discussion a week ago, and we were talking about these alternative approaches, and there's this problem that, okay, maybe the teacher or the school favored this kind of approach, but then at the end of the day, they still need to give a grade or they need to somehow fit every student into one box. So how can you fight for alternative approaches and less perhaps standardized approaches when the end result should be a grade?