Episode 34 · 51:12
TVRS Episode 033
===
[00:00:00]
Richard: So in this episode, do we have a new term? Should we have a new term? What even does the new term mean? And that term is automation in quality. Can we now automate aspects of quality that we couldn't do before, thanks to the latest technologies around AI and gen AI? Or is it just another blurring of words, and everything's technically testing anyway,
and automation and test automation? Or, yeah, are Vernon and Richard onto something? We dunno. So we decided to discuss it. And do we know any more? You'll have to listen and find out. So enjoy.
Vernon: Enjoy folks.
Richard: Hello everybody. Welcome back to the Vernon Richard Show. I am Richard,
Vernon: I'm Vernon.
Richard: [00:01:00] If you didn't know that, where have you been? Um, so, uh, yeah, what's happened in the last week, mate? Anything exciting in your last week? Since we recorded
Vernon: this. Uh, just going, just going down the, the AI rabbit hole a little bit more than I was before. That rhymed.
Um, so yeah, that's been good. That's been fun. Very interesting, I think. But I think I'm just, you know, I'm just looking down the, I haven't really, it's a big ass rabbit hole, right? So
Richard: Hole nice.
Vernon: Because of the way we work in my, uh, in my day job.
Hardcore on the AI stuff. So yeah, that'll be fun. It'd be interesting, lots to read, lots to figure out, lots of experiments to run, lots of things to define and understand and all the rest of it. So yeah, man, I think that's, uh, from a day job point of view, that's the thing. From a, um, side hustle point of view, I've been [00:02:00] doing lots of warm outreach.
That is the terminology. Uh, so that, that's been going well. It's, man, I've caught up with some people I haven't spoke to in years, which is absolutely amazing. Still trying to figure out, um, where the people are hiding that I can sell my thing to
Richard: wonder the money. And
Vernon: if they wanted, need to decide, if they wanted,
Richard: you need to put AI in the title, and then you would be, you'd be fine.
Vernon: Alright. Do you know what I do? I was thinking about that and I, I dunno about in, well it may or may not be in the title, but it definitely needs to be part of the offer. That is for sure. That is for God damn sure. Yeah.
Richard: Nice.
Vernon: Um,
Richard: excellent.
Vernon: Yeah. How about you, man? How it been?
Richard: It's been alright, yeah. Been quite a week at work, it has been, uh, which is nice.
Um. Yeah, it's still a lot of like playing catch up. So I've been trying to get my head around or implementing some more evals for this project that I'm working on. Um, you know, finally getting a decent [00:03:00] pipeline in place, which has kind of not been our problem but didn't exist and that finally does exist.
Okay. Um, yeah, personally got, got a bonus at work and the small paying things, which was nice. Um, bought myself a medium, medium T-shirt, right? All this, uh, exercise is, uh, paying off.
Vernon: Look at him.
Richard: Um, and yeah, keeping this silly, well, not silly, this, uh, Invisalign gum shield in. I'm finally, I think I'm, is, is there
Vernon: any, what?
Is it any at the top? You've got it.
Richard: No, both. Yeah.
Vernon: Bloody hell.
Richard: They're really hard to tell. It's to fix this tooth here. That one there
Vernon: is that the offending one.
Richard: Yeah, that's been chipping this one. Uh, so. It'll take ages, but yeah, they have to push them all back. And then these top ones are being pushed forward ever so slightly to make space for where the new straight teeth will be.
So it's pretty cool. [00:04:00] They show you like step by step pictures of it.
Vernon: Yeah.
Richard: Um, and then you have to go in every, you know, eight weeks for a check to see if it's actually doing what it's meant to do.
Vernon: Make sure you keep them in, young man. Make sure you stick to the routine, 'cause it's easy not to stick to the routine.
Richard: The last thing I did yesterday, which is very topical for today, is I bought the automationinquality.com domain.
Vernon: Good. Good.
Richard: Which is what we're gonna talk about today, um, because we kind of ended last episode on that topic. And I think it's a good opportunity to think a little more broadly, I think around, there's something in my head that thinks automation and quality is kind of the evolution of automation and testing.
Mm-hmm. Purely because. There's more opportunity to build tools, but also I think there's an opportunity to build tools that aren't necessarily testing related tools. And maybe they're more quality related tools. Mm-hmm. [00:05:00] Or maybe I'm just blurring the two worlds. I don't know. Um,
Vernon: maybe, maybe Is the distinction helpful?
That's the thing. That's the, that's the thing to, to, to chat about maybe as well.
Richard: Yeah.
Vernon: Is it, what's, what's the point? It, we might cover some old ground perhaps, but what's, why is the distinction important? That could be interesting to explore.
Richard: Yeah, so what I thought I'd do is I'll, I'll basically share what I spoke to one of my colleagues about today, uh, Kalps.
He does listen. So I was chatting to Kalps today and we're on a bit of a mission internally, right? So it might sound like, you know, going back to basics, and I might have mentioned this last week, I can't remember, but we're currently going through: what is QA? What is QE? What is testing? What is an SDET? What is a test engineer?
What is a QE engineer? Because we're having a few challenges when it comes to making it clear to some of the RFPs and some of the salespeople. So we kind of want to go back to basics a little bit, get a little landing [00:06:00] page made and kind of reverse engineer kind of what we're doing. And I was chatting to Kalps today around.
Some of the kind of throwaway tools that I've been playing with, for example, um, analyzing a story. So, mm-hmm, some people might like this. I'm gonna ruin it 'cause this is what Kalps said. Um, so actually no, I'll, I'll do it the other way around. So I was like, we can now use AI, or an LLM specifically, right,
to evaluate a story. Mm-hmm. So you can take the story and you can go to an LLM, maybe give it some context of your company, and say, does all the products, sorry, does this story make any sense? Is there anything missing from it? Right? And you could have that run automatically as soon as someone marks a ticket as ready or whatever your process is, and have it ping you a message on Slack or send you an email or add some [00:07:00] notes to the Jira ticket, whatever it may be, right?
The options are there. You could have it analyze the acceptance criteria that's there. Does the acceptance criteria match the story? Is there gaps and things like this? And I was like, you know what? That's, that's automation and quality. 'cause you know, you are, you're almost like looking at the quality of the stories.
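To make that first idea concrete, here's a minimal sketch of the kind of story check being described, assuming the OpenAI Python SDK, a Slack incoming webhook, and a ticket payload with key, description, and acceptance-criteria fields. The model name, prompt wording, and field names are illustrative stand-ins, not the actual tooling from the episode.

```python
# Sketch of the story-review idea: when a ticket is marked "Ready", send the
# story and acceptance criteria to an LLM and post its notes to Slack.
# The Slack webhook, ticket fields, and prompt are hypothetical placeholders.
import os
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]  # hypothetical incoming webhook

REVIEW_PROMPT = """You are reviewing a user story before development starts.
Company context: {context}

Story:
{story}

Acceptance criteria:
{criteria}

Does the story make sense? Is anything missing? Do the acceptance criteria
match the story, and are there gaps? Reply as a short bulleted list."""

def review_story(story: str, criteria: str, context: str) -> str:
    # One LLM call; any capable chat model would do here.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": REVIEW_PROMPT.format(
            context=context, story=story, criteria=criteria)}],
    )
    return response.choices[0].message.content

def on_ticket_ready(ticket: dict) -> None:
    """Call this from whatever marks a ticket as Ready (webhook, poller, etc.)."""
    notes = review_story(
        ticket["description"],
        ticket.get("acceptance_criteria", ""),
        context="A short description of your product and team",
    )
    requests.post(SLACK_WEBHOOK, json={"text": f"Story review for {ticket['key']}:\n{notes}"})
```

However it gets wired up, the output here is just notes for a human to read, not a gate that blocks anything.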
And Kalps went, well, just to be devil's advocate, Richard, um, if you go back to ISTQB, I think you'll find that that is static analysis.
Vernon: Oh.
Richard: And I was like, you're not wrong. Um, but then also we just had a quick chat. Is that shift left testing? So is it still testing because you are testing the story, or is it an element of quality to make sure that people are writing coherent stories and coherent acceptance criteria?
So that was kind of the first one. So yeah, that's, let's discuss that one. I guess. What'd you think about that one?
Vernon: [00:08:00] Uh, maybe, maybe it depends on where you draw the distinction of the product, because if you're interacting with the product, what is the product? Is it the application under test? Is it the, uh, the user story, the artifact called the user story that I'm interacting with?
Um, so depending on where you draw that line, you could either say it's, it's, uh, testing or it's quality. Um, yeah. So yeah, neither one is a hard and fast definition, well, at least colloquially. When people say that they're testing, they're usually talking about, um, the software product itself. But by, by thinking more expansively, you can find more
Opportunities to use your skills and, and help, um, the goal, which is to deliver some working software to customers that they absolutely love. So [00:09:00]
Richard: that's exactly what, um, kind of what I said to, to Kalps, I guess. Like, where do you draw the line? It's one of the reasons why I feel like, it feels a bit like a quality gate, like, you know. And I, I had the same view as you, which is like, testing,
yes, it's not about just the physical or the software product. Mm-hmm. But it commonly is, right? Mm-hmm. The whole shifting left is usually to enable yourself to then do the testing easier later. Yeah. Um, so.
So, yeah, basically that was my first idea. Right. There's stuff now, and I think you mentioned one of your team is doing stuff with Jira. So there's stuff now we can do around the words that are in a ticket. Mm-hmm. Um, you know, in the past you could have had a quality [00:10:00] gate that said, is there acceptance criteria?
Mm-hmm. And just checked that, you know, the field or the box acceptance criteria has something in it. But now you could actually analyze that and evaluate that and almost do that before it even comes to me. Like I could work with the PO to help them eliminate, if there is a PO, but like, you know, I could help them eliminate some of those mistakes.
Vernon: But the, but the thing is, the thing is, many of us have been advocating for years that that's what should be happening anyway. Right. So it is, so it's, I was just, I was just on a panel, a lead dev panel where this, this kind of thing came up where it was like, it feels like the advent of, of AI is gonna force us to do all the things that we should have been doing in the first place.
Like all the things that everyone was talking about saying that they do.
Richard: Yeah.
Vernon: Um, I, I, I kind of likened it to me being diagnosed with type [00:11:00] two diabetes, because generally speaking. We all know how to stay fit and healthy, but if you don't move enough and if you don't eat the right things and you don't drink the right things, you don't get enough sleep, et cetera, et cetera, et cetera, then you'll probably get type two diabetes.
You know, that could be an outcome for you. Yeah. And you, it's not 'cause you didn't know, it's just 'cause you didn't do the things. And similarly with software. No one in 2026 is gonna be like, excuse me, pipelines. They're, they're, what are they? I've never heard of these things before. What do you, what do you mean exploratory testing?
You say what is that? That's what I don't even, you know, but we don't always do them and if we do do them, we probably half arse them.
Richard: Yeah.
Vernon: Like it's few and far between
Richard: documentation is the one that's coming up a lot.
Vernon: Documentation.
Richard: I, no one likes doing documentation.
Vernon: Well, I, I, I know, I know someone who seems to like doing documentation.
Shout out to you, Emily. Okay. Um, uh, but [00:12:00] yeah, I just, I just, I just think it's that. I just think it's that. I think for me, we're gonna have to do all these things that we've said we were gonna do. And to be honest, I was gonna say, I think it's a quality thing, because the testing, if the testing is about
interacting with the product, everything else that you are doing, so the story thing, why that's important is because if the stories are better quality, if they're, if they're more, if, if the understanding is shared more closely within the team, then my testing gets, I've, I've added more leverage to my testing.
And it's, and it, and, and you can make that, you could probably make that case for a lot of things that are deemed to be not testing, but you can still leverage your testing skills in, in pursuit of those [00:13:00] things in order to make the testing easier.
Richard: Yeah.
Vernon: So,
Richard: well, that ties nicely into my second idea.
Vernon: Oh, go on.
Richard: If we theoretically made this change, what areas of the system would be impacted? So again, LLM knows your code base. Knows your code, can look at the change, can maybe have a best guess at how it might do the change. Mm-hmm. And then tell you what other areas may or may not be impacted. Because this is exactly what I used to do really well in the meetings because I would be the one with all the knowledge of the system.
I would be understanding how it all relates and it would be, yeah. Yeah, that's a great change. But you do know that, um, profile uses the same code. Oh, oh no, we didn't know. And then the estimate, the estimates change that the, the scope of the ticket changes, the testing scope changes. [00:14:00] And again, I think some of that can now be done programmatically.
Doesn't mean it's gonna get it right, and doesn't mean we shouldn't do it, right? But if the ticket doesn't mention, you know, the knock-on effect of other areas, or that the other areas are gonna need refactoring. Because when I did that in meetings, and I'm making this up completely on the fly, like I reckon I could probably, is that the word I want?
Distill, um,
Vernon: distill.
Richard: Yeah. I dunno. Yeah. In my, what I would do into probably five to 10 questions. Now those questions. Now fear, I I, I could build a tool that would ask those 10 questions with context and spit out an answer or a score or something, which may or may not provide me some insight, and you couldn't do that before.
So to me, that's not, I'm not testing at this point. I'm using AI to gather [00:15:00] information about the problem before everyone else, right? So the dev will end up doing this when they start coding it, right? If they're coding, you know, old school code, right, if that's not there, coding, um,
Vernon: human authored code,
Richard: yeah, good way to phrase it.
But if they're vibing, I imagine the LLM might make them aware of some of the stuff, depending how good they are at prompting, or they may miss it completely. So if we can have a tool do this upfront, I think that's useful insight.
Vernon: Yeah. And I, and I would call that and I would call that quality. Or is it, uh, maybe it's more, maybe that is more testing.
'cause it's more about the risk, isn't it, that you're talking about and trying to surface for the thing that you are building directly. So maybe
Richard: it's almost a step before though, for me. 'cause you, you don't know there's a risk yet, so. Well, you can't, you, you do know there's a risk in the sense you're hypothesizing.
There could be a risk of a knock on.
Vernon: Mm.
Richard: But really you're almost doing that research bit again before you're [00:16:00] designing your tests or even thinking of doing any tests. Mm-hmm. And again, to make the ticket go smoother, to make it easier for the dev who writes it, to make the story more clear, and also to potentially not start building something that's too big.
Mm-hmm. You know, bigger than you thought. Yep. Um. You might not have the capacity to do that. It might be too much of a risk. You could be a week away from code freeze or something, right?
Vernon: Mm-hmm.
Richard: But again, this tool could have all that information fed into it to give you that analysis and say, you know, yeah, this is good to go.
Or actually, no, you are, you are missing some stuff here. So it is, it's testing, but it's, it's, again, it's not, it's not what we commonly class as testing, or what we would commonly automate.
Vernon: I dunno though, right? Because in, in order for you to design and think about what it's, you're gonna test, you do need, to me, that sounds like you are pointing out some [00:17:00] dependencies and some risks and, you know, things like that.
And that is, for me, a part of testing personally. You know, the, the, the risk conversation is the first thing that you'll probably talk about. Or maybe the problem is, and the risk is second, but it's all part, you can't, you, you can't do a very good job of testing without having that conversation about, you know, where is the scope, where's the risk?
Do I need to worry about this part of the application that would, and focus on this part of the system? Because reasons I think that I, I think that's part of testing personally,
Richard: is that, is that because you are doing high quality testing and you are a high quality tester? Because I don't think a lot of, I don't, I, I think people do, but I also think a lot of people just literally look at what the ticket says and test what the ticket says.
Vernon: They, they, so that happens. Even within that, there's a level of risk analysis, but I think you're right. [00:18:00] I've certainly seen that, certainly seen that, where the focus is just on the functionality. And, and there's, there's reasons for that as well. Some of those, it's not always that people are, for want of a better word, not being thorough.
Sometimes it's systemic, there's just so much stuff to get done that I'm trying to just do.
Richard: Yeah.
Vernon: As much as possible. I've seen that happen as well, but um, okay. Yeah.
Richard: Well, you'll definitely see a theme with these ideas; they're mostly centered around the gathering of information. So the next one obviously is PRs, right?
So the dev's gone off and built the thing. Um, and then typically there'd be a PR. You may or may not be involved in the PR, right, but I, mm-hmm, I like to be involved in the PRs. Yep. Um, pull requests, for people who don't know what a PR is. Mm-hmm. Um, and obviously now sometimes a PR is tiny. Sometimes it's [00:19:00] absolutely massive.
So what I tend to be looking at is the quality of the code that was written. But also what files were changed, and again, very similar to the one we've just discussed. What else uses those files
Vernon: exactly, yeah.
Richard: That we may have, we may have missed. So again, with a well-trained, you know, agent or prompt, you could write a lot of that analysis to say, you know, can tell me, tell me the other, I dunno, the five areas of the system that all share this class.
Uh, you could ask it for a summary of the changes that have been made. Um, do you foresee any, you know, downstream effects that aren't covered here? Would you have expected some tests to fail that haven't been run or been changed? Um, you know, all this before we potentially run the build and everything else?
So that, again, just summarizing the change from a various testing slash [00:20:00] quality lens. Like how big is the change, you know, how many files were changed? Um, are there files that were changed that you wouldn't have thought were related to the story? You know, has the dev just snuck a few things in, which I'm not saying they shouldn't do, right?
Because I do that. Yeah. Hell yeah. If it's not in the ticket, it might not be thought about. Right? Because again, the, yeah, the tester may not, the, the other people in the team may not notice. So again, mm-hmm, like almost like a quality gate, but semi-intelligent. You could have it always looking for things like that and just spit out information to you that you're gonna read and go, yeah, yeah.
Ooh. Well that's interesting.
Vernon: Definitely
Richard: didn't wanna test that.
Vernon: And whatcha calling that?
Richard: PR analyzer, I dunno.
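Something like that PR analyzer could start as small as this sketch: it pulls the changed files from a GitHub pull request via the REST API and asks an LLM the kinds of questions listed above. The owner, repo, PR number, model, and the exact questions are illustrative; only the GitHub pull-request files endpoint and the OpenAI client call are real APIs.

```python
# Sketch of the PR analyzer: fetch the changed files for a GitHub PR, then ask
# an LLM what else uses them, whether anything looks unrelated, and which tests
# should have moved. Repo details and the question list are hypothetical.
import os
import requests
from openai import OpenAI

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
client = OpenAI()

def changed_files(owner: str, repo: str, pr_number: int) -> list[dict]:
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/files"
    resp = requests.get(url, headers={"Authorization": f"Bearer {GITHUB_TOKEN}"})
    resp.raise_for_status()
    return resp.json()  # each item has filename, additions, deletions, patch

def analyse_pr(owner: str, repo: str, pr_number: int) -> str:
    files = changed_files(owner, repo, pr_number)
    summary = "\n".join(
        f"{f['filename']} (+{f['additions']}/-{f['deletions']})" for f in files
    )
    prompt = (
        "These files changed in a pull request:\n" + summary + "\n\n"
        "1. Summarise the change.\n"
        "2. Which other areas of the system likely share these files or classes?\n"
        "3. Do any changed files look unrelated to the stated story?\n"
        "4. Would you have expected tests to change that haven't?"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```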
Vernon: Is that, is that, is that automation in testing or is that automation in quality?
Richard: Again, it, it, I, I think this is the fun bit of this conversation because [00:21:00] you are testing the code.
Vernon: Yeah.
Richard: But you're not, you're not, you're not exercising the code, but you are statically analyzing the code.
But you are also doing it to shape the next stage of quality slash testing initiatives. So,
Vernon: okay,
Richard: this information may improve the next, the testing that takes place because it's framing it more, which in a, in a way, for me, feels quality related, even though the next task is a testing task. But again, the lines are blurring so.
Vernon: I think, I think, like I say, anything that is, is gonna amplify the testing, remove the friction from the testing, et cetera, et cetera, et cetera. I think that is more [00:22:00] of a quality focus. 'cause you can just carry on doing the testing without doing that stuff. But instead of, instead of, this doesn't really make any sense, but instead of getting like.
You know, for one unit of effort, you get out one unit of testing. When you do, when you take more of a quality lens to things, when you put in one unit of effort, you get out two units of testing or maybe you get out five units of testing. Yeah. So it's that. It's that kind of, it's that kind of thinking. So if you are, if you are talking about.
And this, and this is where the task analysis thing comes in. It's because you are talking about saying to yourself, oh, it might be useful if I knew which files had been changed as a result of this, of, of this deliverable being, like, implemented. Right? But not everybody thinks like that. Not everybody thinks like that.
Nobody thinks, [00:23:00] where else could there be some risk in this deliverable? It's, it's literally what you said before. It's like, well, the feature said it needs to do A, B, C, so I'm gonna, I'm gonna test A, B, C in the most rudimentary way possible. And then it's, it's, it's off my plate. It's into done or review or whatever.
And I'm, and it's off my, it's out. It's out of my in-tray.
Richard: Yep.
Vernon: Um, so yeah, that's what I think.
Richard: And I think the thing is, from a testing perspective, a lot of people usually discuss it as exercising the product. Mm-hmm. That analysis is not, you're not even exercising the code. The code could be faultless and be perfectly well written.
You are analyzing the fact that there may or may not be gaps or additional changes that weren't factored in. But when I say gaps, I mean gaps in your understanding of what was being changed and why. Uh, and it'd also be interesting, and again, I'm not, [00:24:00] I've not built this, but the analysis from that first tool I mentioned could then be compared to the, the actual implementation and see where they differed, and yeah.
Have they differed because the dev has interpreted the requirements differently? Have they differed because the coding agent that was used interpreted the requirements differently? Um, or did one of them just get it wrong? Like, I, I just think there's so many little nuances that are interesting little nuggets of information that are gonna guide the testing that gets done, or the assessment of quality.
Like that's another way of looking at it. Um, so the next one was, and I've seen people build this already, and I don't think it's that
Like that's another way of looking at it. Um, so the next one was, and I've seen people build this already, and I don't think it's that. Incredible. But it's interesting is the dynamic, um, [00:25:00] dynamic selection of automated tests based on the changes made. So instead of having your traditional pipeline that runs all your tests in the same order every time, it picks out the high priority tests to run first, so you get that mean time to feed back quicker.
If some of those things are broken, so it ain't not running some of your tests, it's just cherry picking 5, 10, 20, whatever to run first. Mm-hmm. And then running the rest afterwards. The rest, which you couldn't do before. Like I, I wrote something 10 years ago that like basically mapped files to areas of the system and did like a heat mapping score.
And then based off that heat mapping score, decided to run tests that were tagged manually, as in I'll tag them all with the number and then it would pick the test with that number. Right. But that was so like, you know, [00:26:00] hacky and rudimentary and um, whereas now, you know, you can. You can use AI to do that.
What do you think about something like that?
Vernon: That, that, to me, that feels like an, an optimization rather than a revolution of testing or quality. It just seems like a, you're making your testing more efficient.
Richard: Mm-hmm.
Vernon: 'cause you are, you are, you know, you are delegating the decision of what is important to the robot, to the AI, the LLM, hoping that it's
a sensible decision, and ultimately all the tests are gonna run anyway, but you just want that extra rapid feedback for the more hazardous part of the, of the platform or the application that you're testing. Um,
Richard: okay. Yeah.
Vernon: So, yeah. Yeah, I think it's an optimization more than anything else.
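For what it's worth, the "optimization" version might look like this sketch: take the git diff and the collected test list, let an LLM pick an order, run those first, then run everything as usual. The prompt and the choice of HEAD~1 as the comparison point are assumptions; the pytest and git commands themselves are standard.

```python
# Sketch of dynamic test prioritisation: the LLM only chooses an order, every
# test still runs in the end. Assumes a pytest suite and a git checkout.
import subprocess
from openai import OpenAI

client = OpenAI()

def changed_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--name-only", "HEAD~1"],
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines()

def all_tests() -> list[str]:
    out = subprocess.run(["pytest", "--collect-only", "-q"],
                         capture_output=True, text=True, check=True)
    return [line for line in out.stdout.splitlines() if "::" in line]

def priority_tests(n: int = 10) -> list[str]:
    prompt = (
        "Changed files:\n" + "\n".join(changed_files()) +
        "\n\nAvailable tests:\n" + "\n".join(all_tests()) +
        f"\n\nList the {n} test ids most likely to catch a regression from these "
        "changes, one per line, nothing else."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.splitlines()[:n]

if __name__ == "__main__":
    subprocess.run(["pytest", *priority_tests()])  # fast feedback on the risky ones
    subprocess.run(["pytest"])                     # then the full suite as usual
```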
Richard: I think I've only got two more in my head, and then I'll obviously, I'll make sure you can [00:27:00] contribute some of these ideas if.
Um, the next one is one that I think a lot of people are already doing, but again, not, I don't think, mainstream. I think it's all the cool kids that are doing it 'cause they've got the money and the tools. But the, the automatic analysis of failed builds, um, being done by an agent, which then in turn will try and fix it and basically open a PR, um, that fixes it.
Now I've tried this and it was very interesting because. At the time I built this, I didn't have as much knowledge on AI as I do now. And often it would always assume the test was wrong and always change, like not, you know, not always, but 80% of the time would change the test to fit the broken software.
Vernon: I think that
Richard: talked
Vernon: about this a little bit.
Richard: Yeah, we did, because it doesn't have enough context to distinguish between the two. Wow. Because we didn't have the Jira tickets, [00:28:00] we didn't have everything else. And when, when I built mine, there was so much knowledge needed that at that time it wasn't accessible, because there weren't MCP servers for everything like there are now.
Mm-hmm. So now I think I could build it much better. Mm-hmm. Um, but I do think there is value in it, because sometimes it can just be a missing dependency. It could be, you forgot to change a locator. I'm not saying that that should go unnoticed, but also, what if you've gone for lunch or there's an all hands meeting for the next two hours? That ain't getting fixed, because you're all away from your screens, and it could be fixed, theoretically, by an agent.
Um, now, is it, it's not testing, it's more a development-esque activity. So it's an investigation process, and it's, you know, you, you're trying to learn why, and then you're gonna analyze that and put that into context and then try and fix [00:29:00] it and then see if you fixed it. Mm-hmm. So, you know, it's mixing dev and testing.
Um, but again, it's not what I would class as testing in terms of, like, the testing's happened, and it's more to do with analysis and reporting and
Vernon: well,
Richard: what we do next.
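Here's a hedged sketch of that failed-build triage idea, deliberately feeding in the ticket text and the recent diff so the model has something to judge "is the test wrong or is the product wrong?" against, which is the trap described above. Drafting a fix for human review rather than merging it is an illustrative design choice, not a claim about how anyone's pipeline actually works.

```python
# Sketch of failed-build triage: give the model the failure, the recent change,
# and the intent behind it, and ask it to classify before it proposes anything.
# All inputs and the "draft only, don't merge" policy are illustrative.
import subprocess
from openai import OpenAI

client = OpenAI()

def recent_diff() -> str:
    return subprocess.run(["git", "diff", "HEAD~1"],
                          capture_output=True, text=True, check=True).stdout

def triage_failure(test_output: str, ticket_text: str) -> str:
    prompt = (
        "A CI build failed.\n\nTest output:\n" + test_output +
        "\n\nRecent code change:\n" + recent_diff() +
        "\n\nTicket / intent behind the change:\n" + ticket_text +
        "\n\nFirst decide: is the product code wrong, the test wrong, or is this "
        "an environment or dependency issue? Explain why. Only then propose a fix "
        "as a unified diff, for a human to review."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```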
Vernon: I don't know. It's, it's alm, it's almost like when you find a bug, you, you encounter a failure in the application and you have to do some research and investigation into the failure.
It kind of feels like that. Um, and there's a, there's a, I've, I've, I've kind of reconnected with an old friend and colleague of mine. Shout out to you, Dan, the Agile guy. He is, man, working with him was so much fun. You know, when you work with people who are really smart and good at something, but they're just so hyped about it that they love explaining stuff to you.
And I, oh, I've just [00:30:00] found, I've just discovered this, that he, that he was one of those people. So working with him was, yeah. Um, and he's got some, I'll link, I'll put it in the show notes. He's, he's, he's got this post, I think he wrote it today or yesterday, where he talks about, he's got two, what does he call them?
He's, he's basically got, he's got, uh, an agent that he treats like a partner, a thinking partner, and then I think he's got another, I'm gonna call it an agent, I'm probably getting the terminology wrong here, but he treats that one like a, like an employee, and they, they don't do the same things.
They've got slightly different missions, they've got definitely different capabilities. And the agent, what should I, let's call it the employee, I can't remember the names, but the employee robot that he's got, she looks for, and it's a she, I think, she monitors some particular environment that he's responsible for, has an interest in, [00:31:00] and when it, when it goes down, this thing will notify the,
this, this thing over here will then alert him to the fact that, oh, it broke, it got fixed, it's all good. And he just wakes up, and he doesn't get, you know, he doesn't get alerted to every little problem that happens. And I'm bringing that up because I think, is it testing? Is it development? I don't know. I don't know.
Dunno, because both, both, both roles want an explanation about what just happened. You might want it for different reasons, but you know, you know when you find a, well, you know, when you're testing something and you find a bug, you run into this bug. There's a, you know, at one extreme there is the bug report that says it doesn't [00:32:00] work.
And then at the other extreme, there's a bug report that says the problem is in this file, in this function, because this variable was called this and it was miss, blah blah. And that, you know, that's, and then there's a load of space in the middle. It's like, figure out where, how much investigation should I do as a tester before I can legitimately hand that over to my dev teammate, who will then have to do the rest.
Do you know what I mean? So,
Richard: yeah.
Vernon: I think, I dunno, I dunno. It's,
Richard: you've just, you've just, uh, overlapped, yeah. But you've just triggered a good thought there, which is, all these things I've just mentioned, from a quality perspective, um, the fact that it could be documented is probably a higher impact on quality than debating whether this is development or testing, because a lot of developers will [00:33:00] do an RCA, right?
But that RCA often isn't in the ticket, 'cause you'll just get the change, or you'll just get the final thing, as in,
Vernon: oh,
Richard: I dunno. Um, the class was missing, right? The import was missing. But you tend not to get the why. Um, yeah, and I think with this now, I think some of the explanations might be better or they'd be written in a way where the human can just disagree and agree.
Uh, and then that usually might lead to better insight, which in the future. If you're centralizing a lot of that insight, you might start to spot common patterns, which ties nicely into my final kind of ideas is there is so much data. Now, if you, there's already a lot of data, right? But if you're doing some of these tools as well, there is so much data now to give you that holistic view of quality.
Where are we falling short? Is [00:34:00] it often that our stories are not clear enough? Are we making mistakes at the code point? Is it our tests aren't comprehensive enough? Like not to play the blame game, but to get a better idea of where our overarching processes seem to stumble down, because we've never really had tools that do that.
We've had tools that focus on testing. We've had tools that focus on the code. We've had tools that focus on Jira. Our processes around them. But now we have tools that can take all of that data.
Vernon: We have a tool,
Richard: and attempt to make sense of it.
Vernon: Yeah.
Richard: That may lead to quality insights such as this happens a lot, this is happening a lot that, you know, we don't see any of this, this has stopped happening, um, which usually would be manually tracked.
So again, I'm not saying that they don't want to, or that people aren't doing them. But we can now automate some of that a lot quicker than we could before.
Vernon: It almost sounds [00:35:00] like what we're saying is if you can document it enough, then the robot can do no wrong.
Richard: No, I'm not saying it's getting it right, it's, it's more the fact that you can now collect stuff and interpret them.
That you couldn't do before. I still think that report lands on a human's desk, and they either agree or disagree. But because there's so much analysis happening and documentation happening and files being saved and comments on PRs, all the things that you, again, you said before should be happening, but often don't, because we can put all the quality gates in,
right, but you just get people writing a dot, and you know what I mean, people bypassing, right? Yes. The robot, the robot will never do that. Well, it might. It might do eventually, but at the moment it won't do that. So we're going to get a lot more data points than we've historically had, which may or may [00:36:00] not lead to insights.
But those insights now could be, you don't have to write a complicated agent, you can just go and ask it to stick all that together and try and make sense of it. It might come out with utter nonsense and you're like, yeah, that was a waste of time, but you could do that in minutes. Whereas previously it takes a lot longer to do that proper root cause analysis of, you know, why bugs have escaped or why that feature took too long, or, um, or it just ends up in conversations.
So I'm not trying to remove the human aspect. You know, we all have conversations and are, you know, root cause calls, but maybe a smart agent could, could do some of that analysis of our overall approach to quality.
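As a sketch of that "stick it all together" step: if the earlier tools drop their outputs somewhere (a hypothetical quality-reports folder here), a few lines can ask for recurring patterns. The directory layout and prompt are made up; the point is that the resulting summary still lands on a human's desk to agree or disagree with.

```python
# Sketch of the "holistic view" idea: gather the artefacts the earlier tools
# produced (story reviews, PR analyses, failure triage notes) and ask for
# recurring patterns. The report directory and prompt are hypothetical.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def quality_trends(report_dir: str = "quality-reports") -> str:
    reports = "\n\n---\n\n".join(
        p.read_text() for p in sorted(Path(report_dir).glob("*.md"))
    )
    prompt = (
        "These are automated quality reports from the last few sprints:\n\n" + reports +
        "\n\nWhat problems keep recurring (unclear stories, missed impact, flaky "
        "tests, escaped bugs)? What has stopped happening? Keep it to a short list "
        "a team could discuss in a retro."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```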
Vernon: There's a, there's a, I'm, I'm not gonna do this article, justice at all, so I will put it in the show notes and I'll urge you all, please go and read this article.
So it's by, uh, John Cutler, in his newsletter. He's a very, very interesting dude. We recommend that you follow him wherever [00:37:00] you can follow him. And he had a, he put out a newsletter, I dunno if it was his most recent one, but it's fairly recent, and he was talking about context. And he was talking about it in, in the context of AI.
Uh, and if I remember rightly, he was, he was basically saying, look, the way we treat context in 2026 land of, of AI is that it's this, it's this thing that you kind of define and then you can just move it around. Give it to an, uh, an AI tool to consume and work with. And he was, I think what he was saying is that that is not what context is.
The context is a result of the interaction between the players, and that interaction is what forms the [00:38:00] context. So if there is no interaction with the AI, is it aware of the context or not? It sounds like not, if I, if I've understood the point of his article. So what he talks about in there is, is the fact that context engineering is about designing the interactions that happen.
But yeah, go read the article. 'cause I'm probably butchering this quite badly.
Richard: I think
Vernon: in a multi, multi-agent, it just gets me thinking, yeah, go on. Because that's, that's, I think you're gonna say in a
Richard: multi-agent system, that makes sense. Yes. But in my, in my perspective here, I'm more thinking of producing information for a human.
Almost put into context.
Vernon: Okay.
Richard: In a multi-agent system, yes. They have to be sharing the fact they're sharing their results with each other is building the context for the next decision that gets made via [00:39:00] an agent. And I guess in what I'm saying here, is that a step before that? Because can we have these agents take app context?
Vernon: Mm-hmm.
Richard: Apply what we've asked them to apply and spit out what they now think it is, for us to digest and disagree and agree on. Whereas in a multi-agent, um, oh, what's the word? Like orchestrated kind of multi-agent system, where they're all working independently and vibing, almost vibing off each other, that context they have, number one, they have to share it with each other, but number two, they also have to be storing it in a place that the next
random agent that might come into the mix can pick it up and run with it. So yeah. Uh, that does, that does make sense to me.
Vernon: Mm.
Richard: I don't think it's what context is in the way we talk about it though, within the context of the context of the context of agents' [00:40:00] context makes sense.
Um, but anyway, tho those. I, I've had more ideas than that, but like conscious of the time and like, I just think there's more options out there than we've, than we've ever had before, and I think people need to start tinkering and playing with them. And so whether it's testing, whether it's quality, what, whether it's static analysis, like whatever it is, we can do it differently than we've ever done it before and programmatically and potentially quicker.
Um, so yeah, what ideas or what'd you think about that? Or have you got any other ideas to add on to that pile?
Vernon: In some ways it's the, it's the same as ever, because, because what, what, what I've always been interested in, I found particularly fascinating and cool and awesome, is when people find a way to [00:41:00] amplify their effectiveness and just make what they're doing better in some way, shape or form.
Um, you know, I, I used to work, I used to, I had a friend who was amazing with, with SQL, a colleague, Phil, shout out to you, Phil. Uh, he, he was always getting onto us about, you don't need to go through the GUI to go do all these steps to create a customer with an order in the right state so that you can do the test.
He would do, he would, he would saying you can, just because we grab an anonymized production data, that there's gonna be a customer in the right state in that. So we just need a SQL query to find a bunch of customers in that state. And then you're golden, and I think. AI is the same thing. The the skill actually is [00:42:00] realizing that there's an opportunity to make yourself more effective in some way, shape or form.
Could be a change in the process, could be the existence of a tool or something like that. Um, kind of doesn't matter, but you need to see the opportunity. You can think of that entrepreneurially, which is where my mind goes based on all my side hustle stuff. I'm like, oh, this is like, you know, entrepreneur.
I've seen a gap in the market for a SQL query, I'm gonna go and design an amazing one, you know? Or you could, you know, the thing that keeps coming up all the time, task analysis. I'm gonna try and catch up with Rob Sabourin about this and say, what actually is going on when I do X? GeePaw Hill talks about this.
You should, and I'll put this in the show notes as well, the, the lump of coding fallacy, where you think that coding is just coding, you sat at the keyboard, and actually it's not. There's three flavors of work that occur [00:43:00] when you are coding, according to GeePaw Hill. Very cool. And I think the same thing applies to testing, and trying to break those things apart and say, oh, first I do this and then I do this, and then I do this and I have to wait for that thing to happen.
And it comes back and then I've gotta go over here and do that. And then you'll start to say, ah, right now I've got the whole process laid out in front of me in some way, shape, or form. I can say that part there, I'm going after that part. I'm gonna make that bit better.
Richard: Yep.
Vernon: I might make a little bit of a tool.
I might, you know. It might be a process change.
Richard: That is a great summary of two things actually. Number one, the conversation I had today on a, on a separate interview was why I got into automation. It was because I wanted to be more efficient. It wasn't like we were told to, it wasn't business value, it was, I feel like I'm not being efficient.
Right. And then number two, everything you've just said there is so perfectly spot on, because those, [00:44:00] those ideas have stemmed from: I'm going to do this anyway, and it's happening regardless of whether I'm going to do it, but now I can codify some of it to make sure I do it to a decent standard every time, or to make sure it doesn't, sorry, to make sure it gets done when I'm not around.
So that's why I kind of bridge the gap between testing and quality. Um, but essentially it is that task analysis of what do I do as a quality engineer or a very skilled tester or an automator that I can now basically hand off bits of it to a, a computer or a robot, or an agent, or whatever you wanna call it, that I couldn't do before.
That's what's got me excited and that's why I, you know, joke about having a different name for it, but it does feel very different. And it's also another thing that I've always struggled with is I almost feel it needs distinguishing because otherwise it would just blur into the noise that [00:45:00] is test automation.
So,
Vernon: well, thinking up what, um, here's, here's an I, I had a, I had a conversation with a colleague at work. Was it last week?
Um, last week, from, from my people team. And the question was, in a world where agents can do everything, what is your performance review about now? Like, what is the skill? What is it? And I think, I think that's kind of what we are getting at in this conversation. It's like, what, you know, you are saying, okay, so there's this skill where I
get to build something. Because for me, I see, it's the, the distinction for me is: the existing skill, which AI doesn't change, is seeing an opportunity to build something or change something or spot an inefficiency or [00:46:00] something that's ineffective or clunky or whatever, and say, I think I can make that better.
I think what you are saying is, with the advent of this new technology called, um, gen AI, the kinds of tools that we can now make for ourselves are now next level. And the kinds of problems that we could identify but not solve, it's like open season now, man. So open season, we can go after them.
Um, and so I think, I think, I think those two things are not the same. Uh, and figuring out what that skill is so we can name it and make it distinct might actually be pretty cool because
Richard: I don't have a good name. I don't have a good name for it. But from tinkering with a lot of this AI tooling and building some agents, there's always this aspect of, it will only ever do [00:47:00] what you tell it to do, right?
So how do we stay on top of some of those instructions,
Vernon: is that what gen AI does?
Richard: well and then also, um, the continued evaluation of what does, what is good enough? Yeah. Right. So in order to know what good enough looks like, you almost have to learn a lot of this stuff yourself anyway. So even though you're not the one bashing the keys, someone still needs to be learning it or we offset more of it to off the shelf tooling.
Start to accept that other people know what good enough is and we all follow suit. Right? Which is never gonna happen 'cause it hasn't happened this far. Right. So I can't imagine it happening anytime soon either.
Vernon: No.
Richard: So I think there's that element of, again, I don't wanna use the word context so much, but that your product space, the, the technologies, the, the, the, the science of your [00:48:00] area and the craft of your area.
They can't be delegated to any of these tools yet because they're not, they just, it's just made up stuff, whereas all the things I just described aren't made up things. They're real
Vernon: bloody hell.
Richard: So,
Vernon: yeah.
Richard: Anyway,
Vernon: heavy stuff.
Richard: That's, let's send it there. Um,
Vernon: yeah, once again, the time flies.
Richard: So, um, I hope you enjoyed that.
Yeah. What, what are you, like, discussing, thinking about building at the moment? Have you been able to build something that you couldn't build before? Is it working? Are you getting value from it? Um, we didn't mention anything today, but we had a chat before we recorded. There's lots of testing buddies and agents being built at the moment.
Um, mm-hmm. It's, it's gonna continue to be stuff like that. Um, so, what are you building with AI, or are you just using AI to help you build the things you've already been building for a long [00:49:00] time? Which again, no issue with that. I just think it's time maybe for the industry to start thinking of what tools can be built that are either other aspects of testing that aren't exercising the product per se, or maybe
there's something in the idea of, we can help automate some aspects of quality. So yeah, let us know, if you listen we'd love to hear your thoughts. Um, yeah, we won't be recording next week, Vernon, just so everyone listening knows. So there might be a, a bit of a gap before the next episode, 'cause I'm gonna be at, where, where
Vernon: will you be?
Richard: I'll be at, I'll be at the Test Automation Days speakers' dinner, um, at the time that this will be getting recorded, we should have been recording next time.
Vernon: Shout
Richard: out. Yeah, if we can find a slot, um, we can do something. Um, or if I've got energy and stuff while I'm there, maybe I can grab a few people for a bit of a chat.
We'll see. Yeah, and the week after that we'll be recording. What's the
Vernon: week [00:50:00] after that
Richard: We'll be recording live at.
In Nottingham, so we're gonna do a live episode there. Not sure on the format yet. Me and Vern, need to find some time to work that out. But it could be similar to the one we did at a TD that time where we got,
Vernon: you
Richard: know, basically grabbed speakers and people as they came past.
Vernon: Yeah.
Richard: Um, I quite like the idea of asking people, um, some questions and kind of turning them into little LinkedIn posts, you know, like ask 20 testers the same question.
And kind of make a little short or a reel of that. Uh, and then we can use that to discuss some of that. Um, but yeah, interviews with probably some of the PeersCon organizers as well. Yeah. So yeah, we're gonna kind of do a little deep dive on the, on the PeersCon, live, not live for you lot, but live for us.
Vernon: If you are at Test Automation Days and you see Rich, make sure you say hi. If you see us at PeersCon, [00:51:00] please come and say hi. Um, be good to meet people in person. And if you've got a microphone, do not be shy. Come and say hello. It'll be awesome.
Richard: Alright,
Vernon: I think that's it, right?
Richard: Goodbye. That's it. Goodbye.
Vernon: See ya