The New Partnership Model: Human Co-pilots to AI.

Partner with Monsieur Popp
11 min read · Jan 20, 2025


Part One: Understanding AI as a Prediction Technology

I am not embarrassed to admit that before I joined Glean, I was a complete Luddite when it came to AI. I avoided the space both out of fear — ‘have you seen Terminator?!?’ I would say — and out of cynicism towards the shameless capitalist ambitions of Silicon Valley startups — ‘OpenAI’s reorg into a for-profit entity shows they couldn’t avoid the siren song that is profit maximization’. I credit Glean and its leadership for gently helping me take my head out of the sand and opening my eyes to the positive impact of AI. And of course, I went down the rabbit hole. As a self-identified nerd, and a historian by academic training, I was really excited by the fact that AI would actually empower those with a liberal arts background — hello prompt engineering and low-code application development!

But, to quote Spider-Man, with great power comes great responsibility. We have to balance the sanguine acknowledgement that AI will bring forth massive leaps in science, innovation, and healthcare against the sad recognition that, by the same token, it will potentially increase inequality and concentrate power in the hands of the few. I’m particularly wary of the hubbub around AI agents potentially becoming pervasive. I believe that, philosophically, we are implicitly placing too much faith in the autonomy of those agents. Even Microsoft’s nomenclature for its agent, ‘Copilot’, implies that the agent is at our side — but at what point does it get so good at being our co-pilot that it develops its own agency, and eventually independence, supplanting us altogether?

I believe that we need human co-pilots to AI systems and applications, rather than AI co-pilots organizing our day-to-day lives autonomously. An AI society is one where people are empowered by AI, rather than one where AI runs rampant, automating tasks and relegating humans to the sidelines. I will explore this argument in a three-part series. In this first post, I will provide the background to understand that AI is a form of prediction technology, and that it can be productized in the form of point solutions, applications, and systems. In the second article, I will elucidate why human designers need to build and oversee AI systems: machine judgement is no substitute for human judgement. And lastly, I will argue that even when humans launch AI systems, they need to be wary of the unintended externalities brought about by their advent, further buttressing the position that we need humans to continually manage those systems. To those who say that human oversight is fallible in part due to the very nature of humanity, I counter and conclude that our imperfections are all the more reason to lean into our blemishes: we can anticipate them in our designs and account for them by building safeguards that let us continually iterate on the AI systems we build.

Understanding AI through the lens of economics

My thinking first took root when I started to break AI down into its component parts, drawing inspiration directly from Power & Prediction and Prediction Machines, two deep and thoughtful books by Ajay Agrawal, Joshua Gans, and Avi Goldfarb. At the risk of over-simplifying Agrawal, Gans, and Goldfarb’s main arguments, allow me to summarize the broad strokes of their two books. The first innovation in their analysis is to classify AI as a prediction technology that lowers the cost of the inputs that go into making a decision. They arrive at this conclusion by arguing that the type of artificial intelligence that exists today makes decisions less costly because the technology is better at making predictions. Take for instance the following application of AI: the generation of written or spoken words. ChatGPT is a prediction technology because, when prompted with a question, it answers by predicting the next word most contextually relevant to the previous ones. We’re oversimplifying how AI is able to reason, discern the semantic meaning of words, and so on, but the point still stands: ChatGPT outputs new words by predicting which ones make the most sense in the context of the previous ones it encountered. The AI’s generated output is considered a product of intelligence in the sense that only a human was able to accomplish that act previously — but now it can be done by a machine more quickly, which means less costly.
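To make the prediction framing concrete, here is a toy sketch of next-word prediction in Python. The candidate words and their probabilities are pure invention on my part — real language models learn distributions like this over enormous vocabularies — but the essential act is the same: output whichever word the model deems most likely given the context.

```python
import random

# Toy next-word predictor: a hypothetical distribution over candidate
# next words given the context "the cat sat on the". The numbers here
# are made up purely for illustration.
candidate_words = {
    "mat": 0.55,
    "floor": 0.25,
    "roof": 0.15,
    "keyboard": 0.05,
}

def predict_next_word(distribution: dict[str, float]) -> str:
    """Sample the next word in proportion to its predicted probability."""
    words = list(distribution.keys())
    weights = list(distribution.values())
    return random.choices(words, weights=weights, k=1)[0]

context = "the cat sat on the"
print(context, predict_next_word(candidate_words))
```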

The authors continue that decisions are powered by predictions, and AI can form predictions much more quickly and confidently than humans can, because it can build predictions from far larger data sets than humans can intuitively analyze. Better decisions can be made because AI can model out more permutations of possible actions, assign each one a utility score, and then take action — faster than humans can. Let’s unpack that. Think of utilities as weights — i.e., burning your hand would be -10, getting good grades +10, and so on. When a human makes a decision, their purpose is to maximize overall utility.* However, when AI models out a wider variety of outcomes, it may not be able to accurately model the consequences of each action, or simply put, the possible utility.

For example, if I approach a goalie in soccer, how should I kick the ball to score — bend it like Beckham? Chip shot? Dribble past the goalie? Indeed, there is a wide variety of possible actions to take, and each of them can be assigned a utility score based on its outcome. Bend it like Beckham could result in a goal, but also has a chance of hitting the crossbar. A chip shot could also result in a score, but because it is a riskier move, it could also sail above the crossbar, which is embarrassing, an outcome I would rather avoid; I would feel more pain from the embarrassment than pleasure from the goal being scored. If I take the wrong action, the consequence could be dire — ask Mbappé if he’d like another crack at his breakaway in the World Cup where he was denied by the Argentinian goalie. He had modeled in his head different approaches to scoring, but the consequence of missing meant no World Cup title.

We can argue that the utility of scoring to win a World Cup is high, just as missing a shot and losing a World Cup is equally painful. But that pain might be more acute for different people, so they would want to minimize their chance of missing and thus shoot the ball in a manner that has a higher likelihood of hitting the back of the net. Messi might suffer more pain than Mbappé because he had been chasing a World Cup title for far longer, while Mbappé is younger and anticipates he can probably go back to a World Cup, so he might take a riskier shot than Messi. A synonym for this concept is stakes. A consequence has high stakes if it carries a probability of steep disutility as much as it carries a probability of steep utility.
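To make the idea of ‘assigning each action a utility score’ concrete, here is a minimal sketch of an expected-utility calculation over the shot options above. Every probability and utility value in it is made up for illustration; in the authors’ framing, supplying those numbers is exactly the job of judgement.

```python
# Toy expected-utility calculation for the penalty example above.
# Each outcome is a (probability, utility) pair; the values are invented.
shot_options = {
    "bend it like Beckham": {"goal": (0.60, +10), "crossbar": (0.40, -2)},
    "chip shot":            {"goal": (0.50, +10), "sail over": (0.50, -8)},  # embarrassment hurts
    "dribble past goalie":  {"goal": (0.40, +10), "tackled": (0.60, -1)},
}

def expected_utility(outcomes: dict[str, tuple[float, float]]) -> float:
    """Sum probability-weighted utilities over every possible outcome."""
    return sum(prob * utility for prob, utility in outcomes.values())

for shot, outcomes in shot_options.items():
    print(f"{shot}: expected utility = {expected_utility(outcomes):+.2f}")

best = max(shot_options, key=lambda s: expected_utility(shot_options[s]))
print("Pick:", best)
```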

What weighs the scale of utility and disutility? Judgement, which is unique to every person and assigns values to the stakes. It is formed by two elements (dare we say, data inputs into its model). The first is a contextual understanding of the world and its laws (i.e., gravity, inertia, the laws of physics, etc.). The second is based on one’s own experiences — i.e., when I touch the stove while it’s hot, I burn myself, which I view as painful (though some masochists might revel in that feeling). So the relative weight of each prediction — the utility attached to possible outcomes, like the pain or pleasure felt by burning my hand — also triggers a feedback loop that becomes, in and of itself, another data point fed into the judgement model. One can increase the confidence of a prediction by enriching the model with additional feedback data. Note that without human judgement assigning weights to each prediction, the feedback loop breaks down. Certain decisions can be automated because the stakes of the decision are low. However, for certain use cases — like self-driving cars — the stakes are a bit higher. We will err towards human intervention and/or judgement until the prediction model can predict all possible outcomes with 100% confidence, or until we feel the stakes are low enough that we can accept the disutility of an outcome (yin) on the same plane as the utility of its opposite (yang).
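Here is a minimal sketch of that feedback loop, with the class name and mechanics entirely my own assumptions: a human judges each outcome’s utility, and that judgement is folded back in as another data point that raises or lowers the confidence we place in the next prediction.

```python
# Hypothetical sketch: human-assigned utility is the feedback signal
# that updates how much we trust a prediction over time.
class PredictionWithFeedback:
    def __init__(self) -> None:
        self.observations = 0
        self.successes = 0

    @property
    def confidence(self) -> float:
        # Laplace-smoothed success rate: starts at 0.5, shifts with feedback.
        return (self.successes + 1) / (self.observations + 2)

    def record_feedback(self, human_assigned_utility: float) -> None:
        """A human judges the outcome; positive utility counts as a success."""
        self.observations += 1
        if human_assigned_utility > 0:
            self.successes += 1

predictor = PredictionWithFeedback()
for utility in [+10, +5, -3, +8]:   # e.g. a burnt hand would contribute -10
    predictor.record_feedback(utility)
print(f"confidence after feedback: {predictor.confidence:.2f}")
```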

This conclusion forms the first layer of the foundation of the argument that humans must remain in the loop, because human judgement cannot be substituted by a machine’s… for now. Workplace agents trained on your company’s knowledge and data can eventually automate workflows. Deflecting customer tickets comes to mind. Certain queries from customers are going to be repetitive, and the correct response can be predicted with higher confidence because the agent can narrow it down to a few options. But for more complex tickets, where the number of possible responses is higher, a trade-off has to be calculated. Pick the wrong answer, and that could infuriate the customer, lower your NPS, increase your churn rate, and impact your bottom line. However, the decision could be low stakes: the customer could become frustrated in the moment, but not so much that they stop using your service altogether — they may simply ask to talk to the AI agent’s supervisor (a human). Until a machine achieves AGI and can replicate a human’s consciousness completely, we’ll need a human supervisor for the AI agent lest we risk the aforementioned consequences. Only a human can judge whether that consequence is dire enough that the advantages of using an agent (productivity gains and automation) outweigh the disadvantages (losing a customer).
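Here is a hedged sketch of the routing logic implied above — automate only when the agent’s confidence is high and the stakes of a wrong answer are low, and otherwise escalate to the human supervisor. The thresholds, field names, and tickets are assumptions for illustration, not any vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Ticket:
    summary: str
    agent_confidence: float  # 0.0 - 1.0, how sure the agent is of its answer
    stakes: float            # 0.0 - 1.0, cost of getting the answer wrong

def route(ticket: Ticket, confidence_floor: float = 0.9, stakes_ceiling: float = 0.3) -> str:
    """Automate only when confidence is high and stakes are low."""
    if ticket.agent_confidence >= confidence_floor and ticket.stakes <= stakes_ceiling:
        return "auto-respond"
    return "escalate to human supervisor"

tickets = [
    Ticket("How do I reset my password?", agent_confidence=0.97, stakes=0.1),
    Ticket("Your outage cost us a major client", agent_confidence=0.70, stakes=0.9),
]
for t in tickets:
    print(f"{t.summary!r} -> {route(t)}")
```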

Image 1: An action is taken based on a prediction — AI can predict certain outcomes based on specific inputs, which act as training data. A decision is made from all the possible permutations of outcomes it can predict, leveraging judgement — which is assimilated by learning the laws that govern human existence — to pick one.

AI point solutions, AI applications, and AI systems

So where does AI fit into how humans go about their every day? Society is structured by the ways humans organize themselves — professionally in companies and socially through institutions. The building block across both organizing structures, according to the authors, is the task. The authors aptly use the following example to illustrate the concept: there are 146 tasks associated with the process of taking a company public. Another example that I survive every day: as a toddler dad, when I want to figure out what groceries to buy to satisfy my daughter’s insatiable and varying predilections, I look in the fridge; take stock of what food needs to be procured; and then use those inputs to actually grocery shop. Can tasks in that process — just like the tasks in taking a company public — be automated because AI can predict and generate decisions that make one more productive, more efficient, more intelligent? Agrawal, Gans, and Goldfarb’s second innovation is to underline that AI adoption has been slower than anticipated because we are still in the early stages of identifying the use cases that AI can disrupt. Similar to electricity, which took roughly thirty years after its invention to go mainstream, for AI to become mainstream we must first identify AI point solutions, then evolve to designing AI applications, and eventually, AI systems.

Image 2: A task, if it leverages AI to reduce prediction costs, can be automated — the consequence being that the output is produced more quickly and efficiently. When subsequent tasks are grouped, they can form AI applications — like forecasting product demand, or predicting manufacturing defects and flagging them.

Let’s explore each separately. AI point solutions simplify a workflow down to a few tasks. For instance, this could be a singular task like predicting when the fridge is empty enough of food that it should trigger an alert to its owner to go shopping. An AI application is a collection of automated tasks — i.e., automating every task in the process of taking a company public. Nowadays, ‘agentic reasoning’ is the buzzword most associated with this kind of workflow; agents have to reason and then take actions, which can in turn trigger other agents to complete tasks on their own as well.
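As a toy illustration of such a point solution, here is a sketch of the fridge example: a single predicted task (are we about to run out?) that triggers an alert to the owner. The inventory data, threshold, and notify function are all invented for the example.

```python
# Toy "point solution": one task, predicted and automated.
fridge_inventory = {"milk": 1, "eggs": 2, "yogurt": 0, "berries": 5}
restock_threshold = 2  # alert when an item drops to this count or below

def notify_owner(items: list[str]) -> None:
    # Stand-in for a real notification channel (push, email, etc.).
    print("Time to go shopping for:", ", ".join(items))

low_items = [item for item, count in fridge_inventory.items() if count <= restock_threshold]
if low_items:
    notify_owner(low_items)
```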

This cascading, automated set of tasks propagated by autonomous agents — or, in other words, AI applications interacting with one another — constitutes an AI system. Think of the education system, the capital markets system, the political system, etc. In Power and Prediction, the authors self-reflect that their first book over-indexed on the impact of AI on point solutions rather than on systems themselves. I agree with the book’s conclusion that AI systems, or the re-architecture of non-AI ones, will be even more disruptive because they will uncoil decisions that are interconnected with one another. This could in and of itself necessitate building altogether new workflows, because re-architecting the non-AI-infused workflows would be too costly.

Right now, it is difficult to automate tasks in the classroom because the cost of prediction is too high. For instance, curriculums can’t be personalized for each student in classrooms because the cost of tailoring the individual tasks around educating someone on a particular subject is too high; you would have to model out each student’s learning style to predict what education format makes the most sense to them, to the nth degree, n being the total number of students. So instead, we’ve put rules in place — i.e., classes should have thirty students sitting in rows, last forty-five minutes, and follow a prescriptive curriculum that takes them from kindergarten to twelfth grade. Does it make sense to infuse AI applications into that system, or to reinvent a new AI system altogether that improves education outcomes for all? Regardless of which path we take, I hope we can agree that a human will have to design and instrument that system with AI. If we continue to assume that the development and orchestration of AI will be overseen by humans, and assuming agentic workflows do indeed take root, then humans will remain in the driving seat.

*We use the term ‘utility’ here in the Jeremy Bentham sense: a decision maximizes utility when it maximizes the possible happiness of its agent. However, that whole concept has been debated by many philosophers. Kant completely rejected happiness as the basis of utility, and rather believed utility is maximized when decisions act upon universal moral principles. Others, like John Rawls, believed utility is an exercise in maximizing justice, by designing laws that minimize their possible nefarious consequences for the minority. Going into the merits of each philosophy is outside the remit of this series. Safe to say, each philosopher would agree there is a function to maximize while arguing for different variables to consider. The existence of so many competing constructions of utility only reinforces this article’s position: human judgement will have n permutations because n humans will have n interpretations of what utility is. And they will all differ from an AI’s in the first place, if machines attempt to model utility on their own rather than emulate humans’ construction of it.


Written by Partner with Monsieur Popp

Tech Partnerships, Entrepreneurship, Random Musings. Butler to Mademoiselle Popp, chief of staff to Madame Popp. Views are my own.
