
Are We Insane?

  • mariprofundus

I’ve gotten curious about how AI actually works, so I recently read ‘The Scaling Era’, a book by Dwarkesh Patel about the development of AI. Patel is an AI evangelist, very knowledgeable, who writes and podcasts extensively about the subject. The book is basically a series of interviews with technical leaders in the field, including heads of research and CEOs of some of the leading AI companies. Much of the discussion is quite technical, well beyond my fairly limited comprehension of computer science. Nonetheless, the book gave me some idea of the inner workings of AI, though less insight into how it actually works. For example, when I ask ChatGPT a science question, does it collect all the relevant data and run an appropriate statistical analysis, or is it simply summarizing the text of different articles? I believe it is primarily the latter, which is very impressive in its own right, but, as pointed out below, it’s actually not clear what AI is doing ‘behind the scenes’.


The book’s title essentially reveals the main plot line behind current advancements in AI: scaling. Somewhere around six or seven years ago, it was realized that the most effective way to develop the AI models we have today, e.g. large language models like ChatGPT, was to start with a relatively simple neural network computer program and then throw massive amounts of compute and data at it. The alternative, which had been the approach for several decades, is to develop the perfect algorithm that would mimic human intelligence with a few tens or hundreds of thousands of lines of computer code, but, thus far, that hasn’t worked very well. Instead, it’s just brute force: throwing massive compute resources and massive amounts of data at training the models that give rise to the AI programs we use today. In the simplest terms, these models are attempting to create a neural network akin to the human mind.


Compute is essentially computer hardware, the racks and racks of servers that fill data centers larger than sports stadiums. Data is whatever information can be made machine-readable, including essentially everything on the internet. Throw more and more data and more and more compute into your model, and it becomes more and more knowledgeable and able to communicate in more and more seamless ways with humans. In some sense (and exactly what sense is clear to no one), these models are believed to be approximating how the human brain works. Just as we don’t understand how the mind works to create intelligence and consciousness, the inner workings of these AI models are not understood at a mechanistic level, i.e. no one understands how ChatGPT, Gemini, Claude, Grok, or similar models actually generate the outputs that they provide.


The ultimate goal of these AI companies is to reach artificial general intelligence (AGI), that is, an AI model fully as intelligent and capable of reasoning as a human. It is assumed that once this second breakthrough for AI occurs (the first being scaling), AGI will be used to further advance AI. In theory, and in the glint in the eye of some developers, this could quickly lead to super-intelligent AI, or super-AGI: models that will far surpass human intelligence.


One way to think about AGI is to imagine a self-driving taxi programmed to work in San Francisco being flown to London and, the next day, readily navigating London streets: it would adapt to driving on the left side of the road on a completely different grid of streets with similar, but different, rules of the road. All with no programming input. I suppose the super-AGI example is that the taxi flies itself to London! If you are a middle manager at Google, Apple, Meta, Amazon, Oracle, etc., another way to think about it is that a trained AGI system could instantaneously have full understanding of all the thousands of different operations that each of these companies carries out on a minute-by-minute basis and make rapid, logical decisions (based on exactly what might be an open question) that would obviate the need for middle managers.


As I understand it, if the current ‘scaling laws’ hold up, then with X dollars to cover the compute costs, and Y amount of energy to power the compute, it will be possible to achieve models that are equivalent to human intelligence, AGI, in Z years. As it currently stands, X is in the trillions of dollars, and, by some estimates, Y is equivalent to 20 or 30% of the total daily energy consumption of the United States. Z might be 3 to 10 years away (or never). Note that this equation does not take into account the cost of replacing that 20-30% of the US’s total energy consumption diverted to AI for all the other things it currently does, like keeping the lights and heating or cooling on at home, running factories, etc.


Furthermore, as the experts are unanimous in pointing out, since at a fundamental level it isn’t understood how the neural networks driving the current AI models work, we can’t know whether humans will be able to control super-intelligent AGI, or, even if they can, which group of humans will do it, and what that will mean for the rest of humanity. Supposedly something called alignment will be critical in making sure that AGI and super-AGI models align with human interests. Whose interests, and whether alignment protocols will (a) be developed in time for the AGI breakthrough and/or (b) work at all, remain at best fuzzy.


So that’s what I have learned about AI. We are hell-bent on getting to AGI and then super-AGI without knowing the inner workings of the comparatively simple AI we have today. This means we have little chance of understanding AGI, what it will do, or whether it can be controlled. To do this we will spend trillions of dollars and use enormous amounts of energy and resources to develop a technology more disruptive than the advent of personal computers and the internet. To what purpose? The AI evangelists can spew answers to that question about as quickly as ChatGPT can keep encouraging you as you drill deeper and deeper into a subject while giving you less and less plausible answers. We can rest assured, however, that one question ChatGPT, Gemini, Grok, Claude, etc. will never ask us is: Are We Insane?

