Share on facebook
Share on twitter
Share on linkedin
ai and customer trust

Customers Don’t Trust Artificial Intelligence — And They’re Right Not To

In 2019, researchers gathered 228 American students for a study on stress. They offered each participant a Salivary Cortisol Test and Stress Questionnaire. The analysis of both, said the academics, would produce a qualitative assessment of their health and a recommended course of action for managing their stress.

At this point, the experimental group split in two. Researchers told one-half that the data analysis would be performed by a human physician. They informed the other half that artificial intelligence (AI) would analyze their results. All other factors—accuracy, cost, time—remained equal.

The difference in uptake was staggering.

While nearly 40% of students agreed to take the free human-powered test, just one-quarter signed up for the automated option. The study sent a clear message: People do not trust artificial intelligence.

The lack of public trust in artificial intelligence is universal. People mistrust AI insurance underwriters, money managers, and shopping assistants. Nowhere is it more true than customer service. Just four in ten people believe AI has the potential to improve customer service and even fewer are comfortable interacting with the technology.

For businesses, customer skepticism surrounding AI is concerning. Technology is meant to increase our productivity and enhance our output. It promises to strengthen service and deliver an improved customer experience. But progress is only possible when customers trust technology enough to use it.

The challenge of trust in technology is so serious that it has already attracted political intervention. European lawmakers recently presented a risk-based proposal for regulating artificial intelligence within the single market.

“Trust is a must,” announced Margrethe Vestager, Executive Vice-President of Competition at the European Commission. “The EU is spearheading the development of new global norms to make sure AI can be trusted. By setting the standards, we can pave the way to ethical technology worldwide.” 

Under the proposals, AI products would face a battery of ex-ante (quality of training data, human oversight, and so on) and ex-post (post-market surveillance) regulatory requirements. Some applications like social credit products and behavioral manipulation would be entirely prohibited.

Although the EU limited its regulation to so-called “high-risk” applications, it has had a much broader secondary impact. It is among the first pieces of regulation to recognize—and act on—the relational nature of trust. Trust is never unidirectional. Instead, all parties must share an understanding of where, how, and when AI is used. Practically, this means practitioners must focus on the transparency, accountability, and ethics of AI to cultivate trust in their AI-enabled products.

What is trust?

Trust is a nebulous concept, easily felt and poorly understood. According to psychologist Jan P. de Jonge, it’s the “positive expectations a person has towards another person in situations involving risk.” It’s about putting faith in others’ dependability. From a social perspective, it’s essential to our survival. 

“We need to rely on others to help us thrive and survive,” de Jonge says. “In turn, we need to have faith that others can help us do tasks we cannot master or manage. Trust in others is what makes us function.”

Often, experts describe the emergence of interpersonal trust from four pillars: ability, benevolence, integrity, and predictability—or ABIP. Trust in technology is related, although uniquely different. As researchers from Newcastle University explained in a recent paper, technological trust rests on humane, environmental, and technological (HET) qualities.

Engendering trust within artificial intelligence technologies requires us to go one level deeper, to a unique framework created by the same researchers. Here, trust relies not on ABIP or HET, but a quartet of new qualities:

  • Fairness: “Technologies focused on detection or prevention of discrimination and bias in different demographics.”
  • Explainability: “Technologies focused on explaining and interpreting the outcome to the stakeholders (including end-users) in a humane manner.”
  • Auditability: “Technologies focused on enabling third-parties and regulators to supervise, challenge or monitor the operation of the model(s).”
  • Safety: “Technologies focused on ensuring the operation of the model as intended in presence of an active or passive malicious attacker.”

Simply put: Those systems that are both perceptually and objectively fair, explainable, auditable, and safe will engender trust. Those that are not will provoke suspicion. While the FEAS framework devised at Newcastle University succinctly describes the trust-enhancing qualities of artificial intelligence, the leap from theory to practice is devilish and complex.

Consider a cell carrier attempting to increase trust in its AI-powered technology. How does their chatbot practically prevent discrimination? What does humane customer service sound like? How ought their developers architect their self-serve knowledge base to facilitate external audits? Does a focus on safety require extra work from a Chief Information or Security Officer?

To answer these questions, we require more than a framework. We need practical advice.

1. Root out bias—both in data and society

We expect computer systems to be logical, to strip away the discriminations and prejudices held by humans to deliver fair verdicts. Unfortunately, the development of AI has been dogged by allegations of bias: racially biased risk assessment systems used by police forces; recruitment systems consistently biased against women; facial recognition systems that work perfectly for white men and a few other groups; the list goes on.

Eliminating or reducing bias in systems is no mean feat.

The first correction companies must make is to their training data. For AI systems, data is their lifeblood. It is how they learn, analyze, and improve. Trained on data representative of a particular group, AI-powered customer service applications will learn to serve only those people to the detriment of all others.

Ultimately, training data must be representative of the audience its system serves. If it is not, no amount of model tweaks will improve its performance.

But bias creeps into systems in other ways. Societal bias represented accurately in the data remains a key problem. 

“The data is reflecting what’s already there,” says Tiago Ramalho, an ex-Google engineer who now leads technology consultancy Recursive AI. “You can’t just collect more data because the problem is not going to vanish.”

Consider a contact center that uses geographic location in a risk assessment. The system may deem a person high-risk merely because they called from a neighborhood with high crime rates. Taken to the extreme, such a biased system could unfairly lock out entire populations based on little more than their zip code.

To address such bias, Ramalho advises companies to explicitly add variables of interest—gender, ethnicity, socioeconomic status, and so on—and force the model to make their outcomes be the same.

Say a cell carrier develops an AI-powered chatbot to triage incoming queries. Two people contact the bot. They have equal circumstances and requirements, but different genders. The bot may want to treat the calls differently based on trends and connections in its training data. It may recommend men one call plan and women another.

If the carrier wishes its customers to be treated equally, irrespective of gender, the chatbot requires an enforced adjustment. The tweak forces the model to equally represent certain variables—in this case gender—negating underlying biases that may be accurately represented in the training data.

2. Rein in marketing promises

People love talking about artificial intelligence, says Steve Shwartz, veteran AI researcher, statistician, and investor. The term invokes images of hyper-intelligent robots like the Terminator and machiavellian minds like HAL 9000. But today’s technical reality is somewhat different.

“Around 90% of all AI applications use one particular type of AI called supervised learning,” Shwartz says. “It’s a type of machine learning, one we’ve been doing for 50 years. I used to teach supervised learning back in the 1970s.”

The difference between today’s technology and the systems Shwartz once taught is power. We have more powerful computers, neural networks, and deep learning networks. While we can compute very complex functions, the general intelligence of HAL 9000 is still very far away. But this doesn’t stop companies from exaggerating and embellishing.

Consider self-driving cars. Some companies advertise a fully autonomous driving experience—but the small print often rolls back the promise. In most self-driving cars, human drivers must keep their hands on the wheel, ready to take over in the blink of an eye.

“People are overhyping what technology can do,” says Shwartz. “You’ve seen the story of the two guys who recently killed themselves by getting in the back seat while the car was driving? If I had tried that, I would have been in a hundred accidents. I’d be dead a hundred times over.”

In customer service, inflated claims abound. Many—if not most—businesses use chatbots in lieu of human agents. But some go one step further, equipping their bots with fake profile pictures and human names. From the customer’s perspective, there’s nothing to suggest these chatbots are manned by silicon minds, rather than humans. This approach is fine for the simplest of communications, but it falters when customers raise complex queries. Bots regularly become trapped in circular decision trees and spout nonsensical advice. Inevitably, customers become frustrated.

Apart from the practical implications of malfunctioning technology, the disparity between marketing promise and reality damages the reputation of businesses. When people can’t trust the technical claims, they don’t trust the business. 

The antidote to overhyped technology is simple: truth.

Shwartz says we need a radical recalibration on AI communication. Accuracy is key for all product descriptions and marketing claims. Instead of claiming cars have an autopilot, he suggests they use terms like assisted driving. Instead of equipping chatbots with fictitious profile pictures and names, companies should disclose that their frontline customer service is manned by bots.

“The burden of proof should be on the manufacturers,” he says. “They should be clear on what the products can do, can’t do, and the types of things they might do wrong.”

3. Build guardrails, instead of explainability models

Between 2015 and 2018, Ramalho worked on Google’s AI project DeepMind. Part of his role was to explore the explainability of its systems—describing how the AI system worked. But after just six months, he realized it was a fool’s errand. 

“The whole power of a neural network is that it’s extracting non-linear correlations between all the elements in the data,” he says. “If you say, ‘I can reduce this to like one or two simple rules,’ why couldn’t the system have been written with one or two simple rules in the first place?”

The purpose of AI systems is to find connections and correlations that humans can’t. They are, by necessity, black boxes. While there are explainability mechanisms available to developers, Ramalho describes them as “brittle” and questions their use.

Instead, he advises companies to shift their focus further down the line. Instead of explaining the black box, concentrate on the output and safety testing.

“Try to break systems in as many ways as possible and test drive it in real scenarios,” says Ramalho. “In my mind, this is the only robust way to ensure your system is going to work.”

He says companies should build batteries of tests to stress the system with possible failure cases and extreme loads. Developers should act like civil or aeronautical engineers. Just as they break bridges and stress test airplane wing loads, so too should AI developers.

Think back to Microsoft’s ill-fated Twitter experiment Tay. The Twitter bot learned from its community and engaged in “casual and playful conversation.” But within hours, users on the social network had taught the AI to tweet all sorts of misogynistic, racist, and offensive remarks.

Instead of attempting to explain how it learned such behavior, Ramalho suggests the developers ought to have focused on safety testing its output. Such tools already exist. OpenAI, the developer behind language model GPT-3, offers safety testing for free. As part of their publicly available service, they have a separate model on top of GPT-3 to analyze the output and highlight potentially harmful content. This module is live and running in hundreds of different applications, including Viable, a customer feedback analysis service, and Algolia Answers, a new semantic search engine for content archives. 

Ramalho’s approach is liberating. Organizations don’t need to ruminate about why their technology works. Rather they ensure that it delivers fair and impactful results.

Make trust your foundation

For service, experience, and business leaders contemplating an investment in AI, Ramalho has a piece of foundational advice: integrate trust into your design process—early.

You cannot easily retrofit the building blocks of trust—data, product positioning, and guardrails—to an existing product. They must be a core part of your process and product from day one, he says.

This admittedly isn’t easy. Most organizations are optimized for traditional code and technology. The opaque and unexplainable nature of AI poses a novel challenge. It requires fresh perspectives, processes, and environments.

Although the task is multi-layered and multi-faceted, the first step to becoming an organization fit for AI must be trust. Only when practitioners make trust a core part of the design process will their applications deserve our trust.

Leave a Reply

Your email address will not be published.

Ask an expert

Ask an expert Have a question related to this story? Or a thought you’d like an expert to weigh in on? Write to us.