Call Calibration can be a contentious process, though calibration is necessary for anyone doing Quality Assessment (QA). Calibration, for those readers who have never heard of it, is the process of analyzing a phone call together in a group with the goal of having each person scoring calls the same way – thus being "calibrated". There are a myriad of ways that calibration is done – some better than others – but few calibration sessions go by without there being some disagreement – thus the contention.
I sat in calibration with a client the other day and listened to two well-entrenched groups of people debating long and hard about a minuscule matter. The disagreement stemmed from a behavioral element that asked if the CSR “Asked additional questions to further understand the issue”. The element was understood by some to be any question including “Can I have your account number?” while others understood that it was only applicable if the CSR had to ask questions to clarify a vague customer request (e.g. “So, let me get this straight, are you calling on invoice 1234 or invoice 5678?”). I know it seems silly to outsiders, but in QA it is important to ensure that people are analyzing calls and applying the QA elements consistently.
After several minutes I decided it was time to exit and get another cup of coffee. As I took a breather from the fray, it struck me that calibration arguments are often the result of competing points-of-view for which there is not a right or wrong answer. Either position is fine. This was one of those discussions. Typically, I find that it takes a good manager to make the call and get the QA troops on the same page. A capable QA manager will listen to both sides, make a decision and step in to lead:
"This is a great discussion and I see both points-of-view. There is no right or wrong here – both views are valid, but we need to make a decision and move forward. This is how I want us to score this element from now on and here’s why I’m making this decision…"
The key to having efficient, productive calibration is to have capable, strong leadership.
One of the goals of QA is an objective analysis of what took place in a phone call. When developing a QA scale, it’s important to incorporate elements that make it easy for the call analyst to be objective. A good example is "dead air".
Everyone in the call center world knows that "dead air" is a negative. It destroys customer confidence and can even lead to abandoned calls, return calls and unresolved calls. What many QA teams struggle with is how to define how much dead air is too much dead air.
Often, I will find clients who leave the decision to the subjective judgment of the analyst. The QA element might read something like "Didn’t leave caller in dead air" – but then doesn’t define the amount of dead air that’s unacceptable. The result?
- QA Nazis will ding the CSR if there’s a hint of unexplained silence
- QA liberals will never ding the CSR because they understood what the CSR was trying to do, and we wouldn’t want to make the CSR feel bad when the system was hanging up and giving them fits
- QA narcissists will ding the CSR only if it seemed excessive to them
- QA relativists will generally give credit because "if you think that’s dead air you shoulda heard the dead air they used to leave customers in right after they were hired. This is nothin‘!"
If you’re going to be objective, you have to set a standard that is clearly defined and measured. I’ve seen QA scales with a seven second standard, a ten second standard, a fifteen second standard – or a tiered standard depending on whether the CSR explained what they were doing to the customer. How long it should be is a discussion for another time, but in each of these cases, the call scorer is given a clear time boundary. If the dead air exceeds that line you score it down. Simple.
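For those who like to see a rule written down, here is a minimal sketch of what a clearly defined dead air standard might look like in Python. Everything in it is an assumption for illustration: the ten second limit is just one of the standards mentioned above, and the `Silence` record and the function names are hypothetical, not part of any particular QA tool.

```python
from dataclasses import dataclass

# Hypothetical standard for illustration; the right number is your call.
DEAD_AIR_LIMIT_SECONDS = 10.0


@dataclass
class Silence:
    """One stretch of unexplained silence, in seconds from the start of the call."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def dead_air_violations(silences, limit=DEAD_AIR_LIMIT_SECONDS):
    """Return only the silences that run longer than the defined limit."""
    return [s for s in silences if s.duration > limit]


def score_dead_air(silences, limit=DEAD_AIR_LIMIT_SECONDS) -> bool:
    """Give credit (True) only when no silence exceeds the limit."""
    return not dead_air_violations(silences, limit)


# Example call: a 6.5 second pause and a 12 second pause.
call = [Silence(12.0, 18.5), Silence(95.0, 107.0)]
print(score_dead_air(call))  # prints False: the 12 second pause is over the line
```

Because the boundary is a number rather than a feeling, two analysts scoring the same call should land on the same answer for this element.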
Dead air is an easy example – there are others. Take a look at your QA scale or scorecard. Are there elements that open the door to broad differences of interpretation? If so, your QA scores have a broad margin of error.
I am the father of two teenagers. They’re amazing individuals and I love them very, very much. In fact, I love them so much that I’m constantly focused on helping them learn to manage life. Part of that process is managing them and their choices. While I provide them the opportunity to express their desires and make their plans, I ultimately reserve the right to manage when needed.
For example, I love that they have an active social life. Hanging out with quality friends is a great thing – but when their social life begins to take precedence over their responsibilities [shocking, isn’t it?], it’s time for me to step in and manage, to say "no", and to set out the appropriate boundaries and accountability.
QA is often like parenting teenagers. There is a team of wonderful, intelligent people who are analyzing calls and coaching CSRs. But the process still needs a manager. I have watched QA teams struggle and fail because no one is managing the process. Many companies attempt to run QA by democratic process or by committee. By its very nature, QA requires a constant stream of decisions regarding how things will be measured, analyzed and enforced.
Three things I observe in successful QA programs:
- The buck stops somewhere. President Harry Truman’s motto was "the buck stops here". I knew a man who worked with every Presidential administration from Truman through Nixon. His favorite President was Harry Truman. "If you needed a decision made," my friend said, "President Truman always gave you his decision and gave it within the time frame you requested." Truman was a great leader and great leaders emulate his motto. QA needs a leader who is willing to listen to all sides of a given issue, make a decision, and be responsible for that decision.
- QA Analysts are accountable. QA programs hold front-line agents accountable for their performance. Good QA managers will hold their QA analysts and supervisors accountable for appropriately, consistently applying the correct methodology when scoring calls – and put the structure in place for doing so.
- The process is continuously improving. I always remind people that a QA scale is not scripture, and it shouldn’t be the constitution. If something needs to be changed then change it, and don’t make the process as difficult as a constitutional amendment. While constant change can be distracting and work against you – continuous, methodical, timely improvements are required to make QA effective – and they don’t happen without someone managing the process.
Like teenagers need a parent, QA teams need a manager. Managing QA should be a process of leading others to be effective managers themselves. To be a great manager you have to be willing to lead by example, to make decisions and to be accountable for those decisions.
Is anyone managing your QA process?
I was taking part in a client’s call calibration a while back and something that one of the participants said really stuck with me. We were discussing whether or not to "ding" the CSR for a particular behavior in one of those pesky gray areas that the scale didn’t define particularly well (which is a great idea for a future post). One of the coaches was explaining why they would mark the CSR down, admitting that the behavior was a "pet peeve" and they knew they tended to be more sensitive to it.
I walked away from calibration that day thinking about my own QA pet peeves. I think that pet peeves can be a strength, because they make us keenly aware of a particular behavior. It’s less likely that we’ll miss it if it’s a particular pet peeve. Of course, like the rest of life, our strengths may often open us up to weaknesses and "blind spots". While a pet peeve may make me more keenly aware of a behavior, it may also cause me to judge the behavior more harshly than the scale, or the company would.
Since that calibration, I’ve tried to be more aware of myself when I’m analyzing calls. I want to be objective, analyzing the call according to the scale and the desired behavior the company is attempting to motivate. I want my pet peeves to serve me well, rather than finding myself a slave to them.
I received a wonderful e-mail (and a comment) the other day from alert reader, Barbara, at Bright House. Their quality team sounds like they are doing some great work over there. Here’s an excerpt:
"All inside teams must remember to be Switzerland, pretend you dont know that Joe’s tone is just that way or that Caroline is having a bad day.. we refer to it as WWDD- what would division do– this means how would the president -who does know these agents grade this call– then we do the same."
WWDD?! I love this concept. Kudos to the Bright House crew on keeping their objectivity. QA coaches and supervisors often and easily analyze calls relative to what they know about the agent, rather than looking at it from the customer’s perspective. This is a slippery slope that ends in a QA program that has little integrity or validity. In Barbara’s case, they bring objectivity by looking at it from the perspective of their corporate President (who, I assume, would want them to meet/exceed customers’ expectations!).
How about a few others?
WWCE: What Would Customers Expect?
WWEE: What Would Exceed Expectation?
WWWC: What Would "WOW" Customers?
I’ve noticed a pattern while sitting in on calibration sessions with various clients. It’s my theory of relativity in QA. Scores (S) are the result (=) of avoiding two potential conflicts (-C2).
The two critical factors in the equation are:
- Outcome (O) – You make a decision on a behavioral element based on the resulting outcome. For example, you’re scoring a call and the CSR’s voice tone was really flat and robotic. You’re considering "dinging" them on this behavioral element, but you then check to find that marking them down will result in the Overall Service Score being 89.9. If that happens, the CSR won’t make their incentive. If they don’t make their incentive they will be upset and argue the point. You don’t want to deal with the conflict so you figure you’ll "just give it to them".
- History (H) – Let’s say that you’re analyzing a call and the CSR was impatient and interruptive with the customer. You should really "ding" them for this, but once again you know that it will probably result in a confrontation with the CSR. Then you remember that, in the past, the CSR was much worse. So, since their behavior is a relative improvement over past behavior – you "just give it to them".
The problem with both of these scenarios is that you destroy the objectivity of the process and the credibility of your program. The decision of whether to give credit or mark down on a particular behavioral element should be a simple consideration of the standards or the behaviors you’re attempting to drive with the QA scale for that particular element. Factoring in the resulting scores or the CSR’s past behavior turns an objective decision into a relative decision based on criteria outside the scope of the QA scale.
Connie Smith recently posted a common question among call center QA teams. The question is this, if I may paraphrase it:
Is there a behavioral element – a zero-tolerance issue – on the QA form that is so important that to miss it means you get a big, fat zero on the whole form?
Great question! There are some important questions to ask yourself when considering this course of action.
- What are the goals and objectives of the QA process? Definition will help you answer the question at hand. If your QA process is in place to simply spot-check CSRs on critical, show-stopper issues that make or break the call – then having a pass/fail methodology may work (depending on how it’s structured). If you’re using QA to gain an overall picture of the customer’s service experience and/or the overall performance of the CSR, then giving one behavioral element the power to zero out all the other elements is probably not in your best interest.
- Are you using QA as individual performance appraisal and tracking the results? If so, then making the entire result "0" because of one missed element ruins the objectivity of the process. Denying a CSR credit for the things that they did do correctly can damage CSR buy-in and your QA program’s reputation, and it may not stand up to scrutiny should circumstances escalate.
- Realistically, what are the consequences of the behavior? There are some call centers who deal with highly regulated issues and one mistake could place the company at significant risk. So yes, one missed element can be extremely serious. Unfortunately, we’ve also witnessed some call centers who make "zero tolerance" issues, not because of potential regulatory improprieties or impact on customer satisfaction, but because of a manager’s pet peeve.
We’d recommend that, rather than giving a zero for the entire call, you weight the critical behavior so that it has an appropriate and proportionate impact on the resulting overall service score. The CSR still "fails" the call, experiencing the negative consequence of missing the critical issue (which may include reprimand or termination) – but you’re not denying the fact that the CSR did, in fact, adhere to other behavioral expectations.
Of course, you’ll need to have some reasoning (having supporting data helps) to defend your weighting of the critical element. Nevertheless, you can still manage the resulting consequence as a performance management issue, but you’ve put yourself in a much stronger position should your methodology be scrutinized.
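To make the arithmetic concrete, here is a rough sketch in Python of weighting a critical element instead of zeroing out the form. The element names, the weights and the 90 point passing line are all hypothetical, chosen only so the numbers are easy to follow; your own scale and supporting data should drive the real weights.

```python
# Hypothetical passing line for illustration only.
PASSING_SCORE = 90.0

# Hypothetical scale: element name -> weight in points.
# The critical element carries enough weight that missing it fails the call.
WEIGHTS = {
    "verified_identity": 40,  # the critical, heavily weighted element
    "greeting": 10,
    "courtesy": 15,
    "resolved_issue": 25,
    "closing": 10,
}


def overall_score(results, weights=WEIGHTS):
    """Return the weighted percentage of points earned on one call.

    `results` maps each element name to True (credit) or False (marked down).
    """
    total = sum(weights.values())
    earned = sum(w for name, w in weights.items() if results[name])
    return 100.0 * earned / total


# The CSR did everything right except the critical element.
missed_critical = {name: True for name in WEIGHTS}
missed_critical["verified_identity"] = False

score = overall_score(missed_critical)
print(score)                   # prints 60.0
print(score >= PASSING_SCORE)  # prints False: the call still fails
```

In this sketch the CSR fails the call for missing the critical element, yet the 60 points earned on the other behaviors remain on the record rather than being wiped out by a zero.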
I was in a calibration session this morning and a service issue came up as part of the call that was not covered anywhere in the QA scale. It was interesting to watch the mental gymnastics of the group as everyone tried to figure out where to address the issue on the scale. There were multiple suggestions for where to "stick it", but in each case it was like forcing a round peg into a square hole. It just didn’t fit.
When addressing a clear service element that isn’t addressed on your QA form, it will be tempting to just "make it fit", but that mentality creates future problems:
- You have to expand the definition of the element you’re trying to force it into – which will only muddle up the works now and in the future – making the definition cloudy. This only leads to the potential for longer and more confusing calibration sessions, and more difficult call analysis.
- The CSR will scream bloody murder when you try to explain why you scored them down. It won’t make sense and they will be right. I’d question it, too, if I were in their shoes.
- Because it doesn’t fit, it will likely be forgotten where you "stuck it in" and if the situation comes up in the future it will generate the discussion all over again. Arrrgghhhh! I don’t like meetings, anyway. I especially don’t like rehashing issues that have been previously hashed.
So, what was the solution?
We opted to craft a new, clearly articulated element that addressed the situation in question along with other similar problems. It will be easy to score because it’s well defined (thus it will add only a fraction of a second to the call analysis). Because it’s clearly addressed, we don’t have to waste time in future calibration sessions trying to figure out where it goes.
I have been in QA calibration sessions with a handful of different clients the past few weeks. In each one, the participants had scored the call ahead of time and the session started with everyone sharing their scores with the group. The call was then played and everyone went around the group to discuss where there were differences in scoring. One comment I routinely heard at some point in every session:
"Oops. I gave credit for that. I didn’t catch it when I scored it, but now that I hear it again..."
Quality Assessment is a human process, and we all know that human beings aren’t perfect. Just as every CSR will have a "clinker" now and then, every QA analyst will miss an element or two. Nevertheless, every QA analyst has an obligation to the CSR and the customer to be conscientious in their analysis. We control data that can and will have far reaching impact. To that end, it’s important that we take measures to monitor our own performance:
- Avoid scoring calls during "live listening" unless it is a simple spot check that does not impact the CSR’s performance review. The margin of error on live listening is too high to be reliable.
- Don’t score a call on one hearing. Once again, the margin of error is too high.
- Don’t multi-task while you are scoring. While your brain is distracted, you will miss something important. If you’re interrupted while scoring, go back and start the call over again.
- Track QA analyst performance in calibration. By keeping track of pre-calibration scores and comparing them against the consensus score, over time you can discover some interesting trends that can be a reality check for the QA analyst.
- Perform regular audits. An audit of your QA team’s analysis by an objective party can unearth blind spots in your process and provide healthy accountability for your QA team.
- When you find yourself in calibration saying "oops, I didn’t catch that", ask yourself why you missed it. Try to diminish those "misses" by identifying why they happen, and then alter your behavior accordingly.
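The tracking idea in the list above can be sketched in a few lines of Python. This is only one possible way to do it, under my own assumptions: the analyst names and scores are made up, and "bias" and "avg_miss" are labels I chose for the average signed gap and the average absolute gap between a pre-calibration score and the consensus score.

```python
from statistics import mean

# Hypothetical records: (analyst, pre-calibration score, consensus score).
sessions = [
    ("Ana", 92.0, 88.0),
    ("Ana", 85.0, 80.0),
    ("Ben", 78.0, 80.0),
    ("Ben", 90.0, 88.0),
]


def calibration_trends(records):
    """Summarize how far each analyst's scores sit from the consensus.

    bias     > 0 means the analyst tends to score higher than consensus,
             < 0 means lower.
    avg_miss is the average size of the gap, regardless of direction.
    """
    gaps = {}
    for analyst, pre_score, consensus in records:
        gaps.setdefault(analyst, []).append(pre_score - consensus)
    return {
        analyst: {
            "bias": mean(diffs),
            "avg_miss": mean([abs(d) for d in diffs]),
        }
        for analyst, diffs in gaps.items()
    }


for analyst, trend in calibration_trends(sessions).items():
    print(analyst, trend)
```

With this made-up data, Ana consistently scores above consensus, while Ben misses in both directions and his gaps cancel out to no overall lean. Those are two different conversations to have with an analyst, which is why the sketch keeps both numbers.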
One way QA teams often try to make their scoring tool more "simple" is to lump multiple behavioral elements into one line item in the scale, such as:
“Be courteous; use the customer’s name”
These "double-barreled" elements create a natural analytical dilemma. If the CSR was courteous, saying “please” and “thank you” but didn’t use the customer’s name, do you “ding” them? Doing so will usually raise a firestorm of protest from the CSR who will argue that it’s not fair to penalize them for not using the name when they did, in fact, use courtesy.
Usually, the next time the situation arises, the call coach will decide that "dinging" the CSR for not using the name isn’t worth the argument. So, the coach will give the CSR credit for the element and "coach them" on using the customer’s name.
The problem that then arises is that you really aren’t holding the CSR accountable for the behavior of using the customer’s name. CSRs will quickly (and correctly) surmise that as long as they use courtesy words they don’t have to modify their behavior to use the customer’s name, knowing they will never get “dinged” for it.
Maybe it’s time to give your QA scale another look. Any double-barreled elements in there?