Part 1: Understanding AI and Its Underlying Principles

In the evolving AI landscape, data is the new fuel for innovation. As the cost of creation drops, value shifts to data-rich companies, which must navigate a complex interplay of data and innovation on the way to a promising future.

My pet peeve these days - I cannot log in to LinkedIn without someone presenting themselves as an AI expert. Essential to this show of expertise is a list of AI tools that will "transform" the way we work. Add to that the predictions from these AI Nostradamuses (or Quasimodos) of millions of impending job losses, suggesting that everyone from copywriters to mathematicians needs to be afraid of GPT-3/4/n and other such Large Language Models (LLMs).

I don't think many of these commentators have thought through the economics of AI. Innovation in AI is upending many things, but elementary microeconomics still holds true.

What is AI, and how does it do what it does?

If I had to reduce AI to one word, it would be 'prediction'. Let's understand why, and then build further from there. Consider one of the simplest models used for prediction - a linear regression model. Mathematically, we can represent it as:

y = mx + b + e,

where y is the target variable that we wish to predict, x is the feature we are studying, m is the slope of the line, b is the intercept, and e is the error term.
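To make the formula concrete, here is a minimal sketch of fitting m and b by ordinary least squares in plain Python. The spend and sales numbers are made up purely for illustration, and `fit_line` is my own helper name, not something from a library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b


# Made-up data: advertising spend (x) and sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

m, b = fit_line(spend, sales)

# The error term e is whatever the line fails to explain for each point.
residuals = [y - (m * x + b) for x, y in zip(spend, sales)]

# Prediction for a value of x we have not seen.
prediction = m * 6.0 + b
print(f"m={m:.2f}, b={b:.2f}, predicted y at x=6: {prediction:.2f}")
```

The "prediction" is nothing more than plugging a new x into the fitted line.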

This simple model can be used to predict a variety of things, ranging from sales to a patient's response to medication. The Princeton economist Prof. Orley Ashenfelter used a linear regression to accurately predict wine prices, with a rather simple formula:

Wine Quality Score = (0.00117 * Winter Rainfall) + (0.0614 * Average Growing Season Temperature) - (0.00386 * Harvest Rainfall) - 12.145

With this stunningly simple model, Ashenfelter predicted wine prices with greater accuracy than the experts, beating the hoity-toity wine snobs at their own game. Later research has suggested, multiple times, that the wine experts are nothing more than swill merchants, playing make-believe in a fantasy world to which society has given the thumbs up.
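The wine formula is simple enough to write as a single function. This is only a sketch: the coefficients are the ones quoted above, but the units (rainfall in millimetres, temperature in degrees Celsius) are my assumption, and the two vintages are invented to show the direction of each effect.

```python
def wine_quality_score(winter_rain, growing_temp, harvest_rain):
    """Wine quality score using the coefficients quoted in the post.

    Assumed units (not stated in the post): rainfall in mm, temperature in C.
    """
    return (0.00117 * winter_rain
            + 0.0614 * growing_temp
            - 0.00386 * harvest_rain
            - 12.145)


# Two invented vintages, for illustration only.
warm_dry_harvest = wine_quality_score(winter_rain=600, growing_temp=17, harvest_rain=100)
cool_wet_harvest = wine_quality_score(winter_rain=600, growing_temp=15, harvest_rain=300)

# A warmer growing season and a drier harvest both push the score up.
print(warm_dry_harvest > cool_wet_harvest)
```

Only the comparison between vintages matters here; the absolute number depends on the units and scale the original regression used.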

Billy Beane and Paul DePodesta deployed a similar approach to understand the nature of baseball, identifying undervalued skills, and building a team that reached the playoffs on a minimal budget. All predictions.

Linear Regression is wonderful and covers some of our most common use cases. The world is not always very linear, however, and it helps to have models that capture non-linearity as well. Moving on to a more complex model, we have Deep Learning, which uses neural networks with many layers (hence "deep") to make predictions.

Mathematically, we can represent a single neuron as:

y = f(wx + b),

where y is the output of the neuron, w is the weight, x is the input, b is the bias, and f is the activation function.

Consider a neural network designed to recognise handwritten digits - a common task in machine learning (Yann LeCun wrote a famous paper on this topic). The inputs (x) would be the pixel values of the digit image, the weights (w) and biases (b) would be adjusted during training to improve the network's accuracy, and the activation function (f) would transform the weighted inputs into an output that indicates the network's prediction for the digit in the image.
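As a rough sketch, a single neuron computing f(wx + b) over several inputs looks like this in plain Python. The four "pixel" values, the weights and the bias are all hypothetical, and I have picked the sigmoid as the activation function only because it is a common choice. A real digit recogniser would have many such neurons arranged in layers, with the weights learned from data.

```python
import math


def sigmoid(z):
    """A common activation function that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


pixels = [0.0, 0.5, 1.0, 0.5]     # made-up pixel intensities
weights = [0.2, -0.4, 0.9, 0.1]   # hypothetical weights (normally learned)
bias = -0.3                       # hypothetical bias (normally learned)

output = neuron(pixels, weights, bias)
print(f"neuron output: {output:.3f}")
```

Training is the process of nudging `weights` and `bias` until outputs like this one line up with the correct answers.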

Finally, moving on to the thing that has created this new interest in AI - Large Language Models. I grossly oversimplify how they work, but essentially LLMs use a concept called 'attention' to assign weights to different words in the input, almost like highlighting key points in an essay. Fed a vast amount of text during training, the models are statistical parrots generating new text on the basis of relations they have established between the occurrence of words.
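For the curious, the "highlighting" step can be sketched in a few lines. This is a toy version of the weighting used in scaled dot-product attention: each word gets a score from the dot product of a query vector and that word's key vector, and a softmax turns the scores into weights that sum to 1. The vectors below are invented; in a real model they are learned, and these weights are then used to blend further vectors, which I leave out.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score each key against the query, scale by sqrt(dimension), softmax."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(scores)


words = ["the", "cat", "sat"]
keys = [[0.1, 0.0], [0.9, 0.3], [1.0, 1.0]]  # toy key vectors, one per word
query = [1.0, 1.0]                           # toy query vector

weights = attention_weights(query, keys)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

The word whose key lines up best with the query ends up with the largest weight, which is the "highlighted" part of the essay.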

I don't think a lot of people (myself included) fully understand LLMs, since, as with most black-box models, we only see the inputs and the outputs, and don't get to see how the sausage is made.

My crude analogy is this.

Just as geographical locations can be mapped onto a grid, and codified into numerical values, so can words in a dictionary. So we know that NYU (40.7295° N, 73.9965° W) is close to Columbia (40.807384° N, 73.963036° W) and to the east of Stevens Institute in Hoboken (40.7448° N, 74.0256° W), but far from Stanford University (37.4275° N, 122.1697° W).
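Once places are numbers, "close" and "far" become something we can compute. Here is a small sketch using the haversine formula for great-circle distance, with the coordinates quoted above (west longitudes written as negative numbers). The Earth radius of 6371 km is the usual approximation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    radius_km = 6371.0  # mean Earth radius, an approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


campuses = {
    "NYU": (40.7295, -73.9965),
    "Columbia": (40.807384, -73.963036),
    "Stevens": (40.7448, -74.0256),
    "Stanford": (37.4275, -122.1697),
}

nyu = campuses["NYU"]
for name, coords in campuses.items():
    if name != "NYU":
        print(f"NYU to {name}: {haversine_km(*nyu, *coords):.0f} km")
```

Columbia and Stevens come out a few kilometres away, while Stanford is thousands of kilometres off, which matches our intuition about the map.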

[Image: An AI-generated map]

In a similar manner, one can map words across the universe of words, training a model on decades of news stories and millions of books. Using that, we can derive relationships and "distances" between certain words, and conclude that father is close to son and man, Swiss is close to Switzerland, but gun is not in the vicinity of cake. This encodes both the subtleties of language use and many of our inherent biases, but this is the way we use language.
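The same idea in code: give each word a vector and measure how closely two vectors point in the same direction (cosine similarity). The three-dimensional vectors below are hand-made for illustration and come from no trained model; real embeddings have hundreds of dimensions and are learned from text.

```python
import math


def cosine_similarity(a, b):
    """1.0 means same direction, 0 means unrelated, negative means opposed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, invented for this example.
embeddings = {
    "father": [0.9, 0.8, 0.0],
    "son":    [0.8, 0.9, 0.1],
    "gun":    [0.0, 0.2, 0.9],
    "cake":   [0.1, 0.0, -0.8],
}

family = cosine_similarity(embeddings["father"], embeddings["son"])
odd_pair = cosine_similarity(embeddings["gun"], embeddings["cake"])
print(f"father vs son: {family:.2f}")
print(f"gun vs cake:   {odd_pair:.2f}")
```

With these made-up vectors, father and son score close to 1, while gun and cake point in nearly opposite directions.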

We can further add dimensions to usage, such as context, word order, syntax, etc., and then transform it into sequences of vectors. You may like your coffee with whisky, but an LLM is more likely to predict that the word "milk" is a better fit to end the phrase "I drink my espresso with a dash of ____".
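A deliberately crude sketch of "predict the next word": count which word follows a given context in a tiny, made-up corpus and turn the counts into probabilities. A real LLM does not count phrases like this; it learns these tendencies through a neural network over far more text. The spirit, picking the most likely continuation, is the same.

```python
from collections import Counter

# A tiny invented corpus, for illustration only.
corpus = [
    "i drink my espresso with a dash of milk",
    "she takes her coffee with a dash of milk",
    "he likes his tea with a dash of milk",
    "i drink my coffee with a dash of whisky",
]


def next_word_probabilities(sentences, context):
    """Count the words that follow `context` and normalise to probabilities."""
    counts = Counter()
    n = len(context)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == context:
                counts[tokens[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # context never seen in the corpus
    return {word: count / total for word, count in counts.items()}


probs = next_word_probabilities(corpus, ["dash", "of"])
best = max(probs, key=probs.get)
print(probs, "->", best)
```

In this corpus "milk" follows "dash of" three times out of four, so it wins, whatever my own preferences in coffee.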

As the field progresses, models are being trained on more data, with more dimensions, more layers, and more parameters. This allows for a wide range of possibilities, from writing a funny haiku to expressing an "opinion" on a legal case.

The Silicon Catalyst: Accelerating AI's Progress

The rapid advancements in AI are intrinsically linked to the parallel progress in semiconductor technology. These chips, essentially the powerhouse of AI systems, have evolved to meet the escalating computational demands of modern AI applications. Companies like NVIDIA and AMD are spearheading this movement with GPUs, alongside Google with its TPUs: chips that are well suited to, or specifically designed for, accelerating machine learning tasks, thereby enhancing the efficiency and speed of data processing.

Furthermore, the semiconductor industry is witnessing a surge in innovation, with new materials and designs, such as 3D stacking and silicon photonics, coming to the fore. These developments are pivotal in overcoming the limitations of traditional silicon-based chips, offering higher performance and lower energy consumption.

The global chip manufacturing sector is also a hotbed of geopolitical tensions and economic strategies, with nations vying to secure a steady supply chain and reduce dependencies on dominant players. The recent U.S. restrictions on chip exports to certain countries underscore the critical role of semiconductors in not only fuelling the AI revolution but also shaping global economic and geopolitical narratives.

As we explore the economics of AI, understanding the semiconductor landscape becomes vital. It's not just about facilitating AI processes; it's about steering the direction of AI advancements and having a significant bearing on global market dynamics and power structures. That said, a fuller treatment is beyond the scope of this post, and I highly recommend Chris Miller's book on the topic, Chip War.

Continued in Part 2...