Seeing Further, Not Just More Clearly

Why comparing AI to humans can only take us so far

  • Blog
  • 6 minute read
  • November 28, 2025
Matt Wood

Global and US Commercial Technology & Innovation Officer (CTIO), PwC United States

When early telescopes were developed, they were judged by how well they extended ordinary sight. The clearer and sharper the image, the better the tool. At first, the goal was to see farther using the same frame of reference — just more of it. But over time, telescopes changed. They began to detect forms of light and energy the eye couldn’t perceive. Radio waves, ultraviolet bands, gravitational signatures. These instruments were no longer just improving vision. They were shifting how we observed and what we could know. 

Artificial intelligence is moving through a similar transition. In the beginning, we evaluated systems by comparing them to familiar human abilities. Could a model solve math problems like a student? Could it write like a journalist or reason like a doctor? These were understandable questions. They helped define scope and give a sense of progress. But like early judgments of telescopes, they reflect a limited frame. 

The Human Benchmark 

Comparison to human performance has been a useful starting point. It provides orientation when technology is new and evolving quickly. In many cases, it is still the best available proxy for usefulness or safety. But if it becomes the dominant lens, it can mislead us. Systems that appear “better than human” at a specific task may still be hard to trust, difficult to use, or poorly integrated into existing workflows. 

Framing AI systems as human equivalents also affects how people respond to them. Telling someone a model is more accurate or more efficient may signal that their role is at risk. When systems are introduced this way — even implicitly — they can generate friction or disengagement. This is especially true in professional settings where identity, experience, and responsibility are closely held. 

Terms like “digital worker” or “agentic workforce” often reinforce this problem. They imply symmetry where there is none. They suggest that the goal is to replace rather than support. The result is that the people who are essential to the success of the system may become the ones most reluctant to use it.  

Adoption Follows Usefulness 

The most widely adopted AI systems today do not succeed because they outperform people on standardized tasks. They succeed because they are adaptable, accessible, and easy to apply to a wide range of real-world situations. ChatGPT is one example. It offers general-purpose assistance without asserting authority. It allows for experimentation without requiring buy-in. GitHub Copilot is another. It integrates into a developer’s existing workflow, offering suggestions without interrupting pace or control.

These systems do not compete with human expertise. They offer support in areas where speed, scale, or flexibility matters. Their success is less about capability in isolation and more about context — how they fit, how they feel, and how quickly they become part of a rhythm. 

The telescope analogy applies here, too. The early instruments that tried to sharpen what we already saw were eventually surpassed by those that helped us observe things we could not otherwise detect. But those tools were only valuable if people were willing to use them — and if the insights they generated could be acted on.  

The Shrinking Mirror 

There is also a limit to how far human comparison can take us, and we are approaching it. Many benchmark tasks — language exams, coding problems, visual classification challenges — have already been met or surpassed. In more complex domains, there may be no single human baseline to compare to. And in many frontier areas — such as modeling proteins or genomes — systems now operate in ways that have no direct human analogue at all.

As the models become more capable, the frame of human equivalence becomes less meaningful. Not just because it underestimates the systems, but because it under-describes what they are doing. We do not evaluate a radio telescope by how closely it replicates sight. We evaluate it by whether it reveals something we would otherwise miss. 

New Instruments  

If AI is to reach its full potential, we will need new ways to assess what matters. Instead of asking whether a model is better than a person, we might ask: Does it improve the quality of decisions? Does it reduce the time to insight? Does it surface options that were not previously considered? Does it make people more confident in uncertain conditions? 

These are quieter questions. They don’t lend themselves to headlines or leaderboards. But they are the kinds of questions that emerge when AI becomes embedded in actual work, not just in abstract contests. 

This shift won’t happen all at once. Comparison to human ability may remain useful for certain regulatory, safety, or onboarding decisions. But over time, our understanding of what makes a system valuable will evolve — just as our understanding of what makes a telescope powerful evolved when we stopped asking it to behave like a better eye. 

A Shared Direction  

The most powerful tools do not just reflect our capabilities back to us. They help us work in new ways, see new patterns, and act with new perspective. To do that, they need to be introduced with care. Not as replacements for human roles, but as extensions of human systems. 

We cannot build communities around systems that feel like threats. But we can build shared momentum around tools that help people feel more capable, more informed, and more involved in the work ahead. 

The telescope didn’t make the eye obsolete. It changed what the eye could reach. 

AI might do the same — if we stop asking it to mirror us, and start asking what it might help us see next.
