Cal: Language model based tools like ChatGPT or Claude, again, they’re built only on understanding language and generating language based on prompts. That’s mainly how they’re being applied. I’m sure this has been your experience, Mike, in using these tools: they can speed up things we were already doing. Help me write this faster, help me generate more ideas than I’d be able to come up with on my own, help me summarize this document. Sort of speeding up tasks. But none of that is “my job doesn’t need to exist,” right? The Turing test we should care about is: when can an AI empty my email inbox on my behalf? And I think that’s an important threshold because it captures a lot more of what cognitive scientists call functional intelligence.
And I think that’s where a lot of the prognostications of big impacts get more interesting.
Mike: Hello and welcome to another episode of Muscle for Life. I’m your host, Mike Matthews. Thank you for joining me today for something a little bit different than the usual here on the podcast, something that may seem a little bit random, which is AI.
Although I selfishly wanted to have this conversation because I find the topic and the technology interesting, and I find the guest interesting (I’m a fan of his work), I also thought that many of my listeners may like to hear the discussion as well, because if they are not already using AI to improve their work, their health and fitness, their learning, their self development, they should be, and almost certainly will be in the near future. And so that’s why I asked Cal Newport to come back on the show and talk about AI. And in case you are not familiar with Cal, he is a renowned computer science professor, author, and productivity expert, and he’s been studying AI and its ramifications for humanity since long before it was cool.
And in this episode, he shares a number of counterintuitive thoughts on the pros and cons of this new technology, how to get the most out of it right now, and where he thinks it is going to go in the future. Before we get started: how many calories should you eat to reach your fitness goals faster? What about your macros?
What types of food should you eat, and how many meals should you eat every day? Well, I created a free 60 second diet quiz that’ll answer those questions for you, and others, including how much alcohol you should drink, whether you should eat more fatty fish to get enough omega 3 fatty acids, what supplements are worth taking and why, and more.
To take the quiz and get your free, personalized diet plan, go to muscleforlife.show/dietquiz. That’s muscleforlife.show/dietquiz. Now answer the questions and learn what you need to do in the kitchen to lose fat, build muscle, and get healthy. Hey Cal, thanks for taking the time to come back on the podcast.
Cal: Yeah, no, it’s good to be back.

Mike: I’ve been looking forward to this selfishly because I’m personally very interested in what’s happening with AI. I use it a lot in my work. It’s basically my little digital assistant now, because so much of my work these days is creating content of different kinds, doing things that require me to create ideas, to think through things. And I find it very helpful. But of course, there’s also a lot of controversy over it, and I thought that might be a good place to start. So the first question I’d like to give to you: everyone listening has heard about AI and what’s happening, to some degree, I’m sure.
And there are a few different schools of thought, from what I’ve seen, in terms of where this technology is and where it may go in the future. There are people who think that it may save humanity, that it may usher in a new renaissance, dramatically reduce the cost of producing products and services, a new age of abundance, prosperity, all of that. And then there seems to be the opposite camp, who think that it’s more likely to destroy everything and possibly even eradicate humanity altogether. And then there also seems to be a third philosophy, which is kind of just a meh: the most likely outcome is probably going to be disappointment. It’s not going to do either of those things. It’s just going to be a technology that is useful for certain people under certain circumstances, just another digital tool that we have. I’m curious as to your thoughts. Where do you fall on that multipolar spectrum?
Cal: Well, you know, I tend to take the Aristotelian approach here, as in Aristotelian ethics, where he talks about how the right target tends to be between extremes. So when you’re trying to figure out particular character traits, Aristotle would say, well, you don’t want to be on one extreme or the other. When it comes to bravery, you don’t want to be foolhardy, but you also don’t want to be a coward. In the middle is what he called the golden mean. That’s actually where I think we probably are with AI. Yes, we get reports that it’s going to take over everything in a positive way, a new utopia. This is sort of an Elon Musk endorsed idea right now.

Mike: Andreessen as well. Marc Andreessen, of Andreessen Horowitz.
Cal: Yes, that’s true. That’s right. But Andreessen Horowitz, you’ve got to take them with a grain of salt, because their goal is to find massive new markets in which to put capital, right? You know, we’re about two years out from Andreessen Horowitz really pushing the idea that a crypto driven internet was going to be the future of all technology, because they were looking for plays, and that kind of died down.
But yeah, Musk is pushing it too. I don’t think we have evidence right now to support the sort of utopian vision. On the other end, you have the p(doom) equals one vision of the Nick Bostrom superintelligence: this is already out of control, and it’s going to recursively improve itself until it takes over the world. Again, most computer scientists I know aren’t sweating that right now, either. I would probably go with something, if I’m going to use your scale, let’s call it meh plus, because I don’t think it’s meh, but I also don’t think it’s one of those extremes. If I had to put money down, and it’s dangerous to put money down on something that’s so hard to predict, you’re probably going to have a change maybe on the scale of something like the internet, the consumer internet.
Let’s think about that for a little bit, right? I mean, that was a transformative technological change, but it didn’t play out with the drasticness that we like to envision, or that we’re more comfortable putting in our predictions. When the internet came along, it created new businesses that didn’t exist before, and it put some businesses out of business. But for the most part, it changed the business we were already doing. We kept doing it, but it changed the day to day reality of it. Professors still profess, car salesmen still sell cars, but it’s different now. You have to deal with the internet. It changed the day to day. That’s probably the safest bet for what the generative AI revolution is going to lead to: not necessarily a drastic wholesale redefinition of what we mean by work or what we do for work, but perhaps a pretty drastic change to the day to day composition of those efforts. Just like someone from 25 years ago wouldn’t be touching email or Google the way a knowledge worker today is constantly touching those tools, but that job might be the same job that was there 25 years ago. It just feels different how it unfolds. That’s, I think, the safe bet right now.
Mike: That aligns with something Altman said in a recent interview I saw where, to paraphrase, he said that he thinks now is the best time to start a company since the advent of the internet, if not in the entire history of technology, because of what he thinks people are going to be able to do with this technology.
I also think of a bet he has with, I forget, a friend of his, on how long it’ll take to see the first billion dollar market cap for a solopreneur’s business, basically a one man business. Obviously it would be in tech, some sort of next big app or something, created by one dude and AI, with a billion dollar plus valuation.
Cal: Yeah. And you know, that’s possible, because think about, for example, Instagram. Great example. I think they had 10 employees when they sold, right?
Mike: It was 10 or 11, and they sold for right around a billion dollars. And how many of those 10 or 11 were engineers just doing engineering that AI could do?
Cal: Yep, that’s probably four. So, right, one AI enhanced programmer. I think that’s an interesting bet to make. And that’s a smarter way, by the way, to think of this from an entrepreneurial angle: making sure you’re leveraging what’s newly made possible by these tools in pursuing whatever business seems like it’s in your sweet spot and seems like a great opportunity, as opposed to what I think is a dangerous play right now, which is trying to build a business around the AI tools themselves in their current form. Because one of a collection of takes I’ve been developing about where we are right now with consumer facing AI, one of these strong takes, is that the existing form factor of generative AI tools, which is essentially a chat interface, where I give carefully engineered prompts that get language model based tools to produce useful text, might be more fleeting than we think. It’s a step towards more intricate tools. So if you’re building a startup around sending text prompts to an LLM, you may actually be building around the wrong technology. You’re building around something that’s not necessarily where this is going to end up in its widest form.
And we know that in part because these chatbot based tools have been out for about a year and a half now. November 2022 was the debut of ChatGPT in this current form factor. They’re very good, but in this current form factor they have not hit the disruption targets that were predicted early on. We don’t see large swaths of the knowledge economy fundamentally transformed by the tools as they’re designed right now, which tells us this form factor of copying and pasting text into a chat box is probably not going to be the form factor that delivers the biggest disruptions. We need to look down the road a little bit at how we’re going to build on top of this capability. The way the average knowledge worker ultimately interacts with this is not going to be typing into a box at chat.openai.com. This is, I think, a preliminary stepping stone in this technology’s development.
Mike: One of the limitations I see currently, in my own use and in talking with some of the people I work with who also use it, is that the quality of its outputs is highly dependent on the quality of the inputs from the person using it.
It really excels at verbal intelligence; general reasoning, not so much. I saw something recently, an informal paper of sorts, where Claude 3 scored about a hundred or so on a general IQ test that was delivered the way you would deliver it to a blind person, whereas GPT’s general IQ on that same test was maybe 85 or something like that. Verbal IQ, though, very high. According to a couple of analyses, GPT scores somewhere in the 150s on verbal IQ. And so what I’ve seen is that it takes an above average verbal IQ in an individual to get a lot of utility out of it in its current form factor.
I’ve seen that as a limiting factor. If somebody hasn’t spent a lot of time dealing with language, they struggle to get to the outcomes it is capable of producing. You can’t just give it something vague: this is kind of what I want, can you just do this for me? You need to be very particular, very deliberate. Sometimes you have to break down what you want into multiple steps and walk it through. So, just echoing what you were saying there: for it to really make major disruptions, it’s going to have to get beyond that, because most people are not going to be able to 100x their productivity with it. They just won’t.
Cal: Yeah, well, look, as we talk I’m writing a draft of a New Yorker piece on using AI for writing. One of the universally agreed on axioms of people who study this is that a language model can’t produce writing of higher quality than what the person using the language model is already capable of, with some exceptions, right? Like if English is not your first language. You have to be the taste function: is this good, is this not good, here’s what this is missing. In fact, one of the interesting preliminary conclusions coming from the work I’m doing on this is that for students who are using language models with paper writing, it’s not saving them time. I think we have this idea that it’s going to be a plagiarism machine: write this section for me and I’ll lightly edit it. That’s not what they’re doing. It’s way more interactive, back and forth. What about this? Let me get this idea. It’s as much about relieving the psychological distress of facing the blank page as it is about trying to speed up or automate part of this effort.
There’s a bigger point here. I’ll make some big takes, let’s take some big swings here. There’s a bigger point I want to underscore, which is, you mentioned Claude is not good at reasoning. GPT-4 is better than earlier GPTs at reasoning, but not even at a moderate human level of reasoning.
But here’s the bigger point I’ve been making recently. The idea that we want to build large language models big enough that, as an accidental side effect, they get better at reasoning, is an incredibly inefficient way to have artificial intelligence do reasoning. The reasoning we see in something like GPT-4, which there’s been some more research on, is a side effect of this language model trying to be very good at producing reasonable text. The whole model is just trained on: you’ve given me a prompt, and I want to expand that prompt in a way that makes sense, given the prompt you gave me. And it does that by generating tokens: given the text that’s in here so far, what’s the best next word, or part of a word, to output? And that’s all it does. Now, in winning this game of producing text that actually makes sense, it has had to implicitly encode some reasoning into its wiring, because sometimes, to expand text in a logical way, if that text captures some sort of logical puzzle, it has to do some reasoning.
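To make that token loop concrete, here is a minimal sketch of the next-word-piece game Cal describes: score every token in the vocabulary given the context so far, emit the most likely one, and repeat. The toy scoring function is a made-up stand-in for a trained network; nothing here reflects any real model’s internals.

```python
# Minimal sketch of the next-token loop described above.
# toy_logits is a made-up stand-in for a trained network.
import numpy as np

VOCAB = ["the", "cat", "sat", "on", "a", "mat", "."]

def toy_logits(context):
    # A real LLM computes these scores with billions of learned weights;
    # here we derive them deterministically from the context length so
    # the loop runs end to end.
    rng = np.random.default_rng(seed=len(context))
    return rng.normal(size=len(VOCAB))

def generate(prompt, steps=5):
    tokens = list(prompt)
    for _ in range(steps):
        logits = toy_logits(tokens)
        probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
        tokens.append(VOCAB[int(np.argmax(probs))])    # greedily emit the likeliest token
    return " ".join(tokens)

print(generate(["the", "cat"]))
```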
But this is a very inefficient way of doing reasoning, to have it be a side effect of building a really good token generation machine. Also, you have to make these things huge just to get that as a side effect. GPT-3.5, which powered the original ChatGPT and had probably around a hundred billion parameters, maybe 175 billion, could do some of this reasoning, but it wasn’t very good. When they went to a trillion plus parameters for GPT-4, this sort of accidental implicit reasoning that was built into it got a lot better. But we’re making these things huge, and this is not an efficient way to get reasoning. So what makes more sense? And this is my big take, what I’ve been arguing recently.
I think the role of language models in particular is actually going to focus more on understanding language. What is it that someone, the user, is saying to me? What does that mean? What are they looking for? And then translating those requests into the very precise formats that other, different types of models and programs
can take as input and deal with. So let’s say, for example, there’s mathematical reasoning, and we want help from an AI model to solve complicated mathematics. The goal is not to keep growing a large language model until it has seen enough math that it implicitly gets better and better. Actually, we already have really good computerized math solving programs, like Mathematica, Wolfram’s program. So what we really want is for the language model to recognize you’re asking about a math problem, put it into the precise language that another program can understand, and have that program do what it does best. It’s not an emergent neural network; it’s more hard coded. Let it solve the math problem, and then you can give the result back to the language model with a prompt for it to tell you, here’s what the answer is. This is the future I think we’re going to see: many more different types of models to do different types of things that we would normally do in the human head.
Many of these models not emergent, not just trained neural networks that we have to study to see what they can do, but very explicitly programmed. And then these language models, which are so fantastic at translating between languages and understanding language, sit at the core of this, taking what we’re saying in natural language as users, turning it into the language of these ensembles of programs, getting the results back, and transforming them into something we can understand. This is a way more efficient way of getting much broader intelligences than growing a token generator larger and larger so that it implicitly gets okay at some of these things. It’s just not an efficient way to do it.
Mike: The multi agent approach to something that would maybe appear to be an AGI like experience, even if it still isn’t one in the sense of, to come back to something you commented on, actually understanding the answer as opposed to regurgitating probabilistically correct text. I think a good example of that is the latest round of Google gaffes, the Gemini gaffes, where it’s saying to put glue in the cheese of the pizza, eat rocks, bugs crawling up your penis hole is normal, all these things, right? The algorithm says, here’s the text, and spits it out, but it doesn’t understand what it’s saying the way a human does, because it doesn’t reflect on it and go, wait a minute, no, we definitely don’t want to be putting glue on the pizza. And so for it to reach that level of human like awareness, I don’t know where that goes. I don’t know enough about the details; you would be able to comment on that a lot better than I would. But the multi agent approach is something anyone can understand, where if you build that up and make it robust enough, it can reach a level where it seems to be highly skilled at basically everything. And that goes beyond the current generalization, which is: generally not that great at anything, other than putting out grammatically perfect text and knowing a bit of something about basically everything.
Cal: Well, let me give you a concrete example. I wrote about this in a New Yorker piece I published in March, and I think it’s an important point. A team from Meta set out to build an AI that could do really well at the board game Diplomacy. And I think this is really important when we think about AGI, or human like intelligence in a very broad way, because the Diplomacy board game, if you don’t know it, is partially a Risk style strategy war game. You move figures on a board, it takes place in World War One era Europe, and you’re trying to take over countries. But the key to Diplomacy is that there’s this human negotiation period. At the beginning of every turn, you have private one on one conversations with each of the other players, and you make plans and alliances. You also double cross: you make a fake alliance with one player so that they’ll move out of a defensive position, so that another player you have a secret alliance with can come in from behind and take over their country. So it’s really considered a game of realpolitik, human to human skill. There was this rumor that Henry Kissinger would play Diplomacy in the Kennedy White House just to sharpen his skill at dealing with all these world leaders. So when we think of AI from a perspective of, ooh, this is getting kind of spooky, what it can do,
winning at a game like Diplomacy is exactly that. It’s playing against real players, pitting them against each other, and negotiating to figure out how to win. They built a bot called Cicero that did really well. They played it on an online, text based chat Diplomacy server called webDiplomacy.net, and it was winning, you know, two thirds of its games by the time they were done. So I interviewed some of the developers for this New Yorker piece. And here’s what’s interesting about it. The first thing they did is they took a language model and trained it on a lot of transcripts of Diplomacy games.
So it was a general language model, and then they further trained it with a lot of data on Diplomacy games. Now you could chat with this model: what do you want to do next? And it would output reasonable descriptions of Diplomacy moves, given what you’ve told it so far about what’s happening in the game. It had probably seen enough examples to generate reasonable text to expand a transcript of a Diplomacy game, moves that match where the players actually are, that make sense. But it was terrible at playing Diplomacy. It just produced reasonable sounding stuff.
Here’s how they built a bot that could win at Diplomacy: they coded a reasoning engine, a Diplomacy reasoning engine. If you give this engine a description of where all the pieces are on the board, what’s going on, and what requests you have from different players, what they want you to do, it can simulate a bunch of futures. Okay, let’s see what would happen if Russia is lying to us, but we go along with this plan. What would they do? Three or four moves from now, we could really get in trouble. Well, what if we lied to them and then they did that? So you’re simulating the future, and none of this is emergent.
Mike: Yeah, it’s like a Monte Carlo type thing.

Cal: It’s a program, yeah. Monte Carlo simulations, exactly. We’ve just hardcoded this thing. And so what they did is have a language model talk to the players. So if you’re a player, you’re like, okay, hey, Russia, here’s what I want to do. The language model would then translate what they were saying into a very formalized language, a very specific format, that the reasoning model understands.
The reasoning model would then figure out what to do, and it would tell the language model, with a prompt added: okay, we want to accept France’s proposal, generate a message to try to get them to accept the proposal, and let’s deny the proposal from Italy, or whatever. And then the language model, which had seen a bunch of Diplomacy games, would write this in the style of a Diplomacy game and output the text that would get sent to the users. That did really well. Not only did it do well, but none of the users, and they surveyed them after the fact, or I think they looked at the forum discussions, none of them even knew they were playing against a bot. They thought they were playing against another human. And this thing did really well with a small language model.
It was an off the shelf research language model, nine billion parameters or something like that, plus this hand coded engine. That’s the power of the multi agent approach. But there’s also another advantage to this approach, which I call intentional AI, or IAI: we’re no longer staring at these systems like an alien mind, not knowing what they’re going to do.
Because now we’re coding the reasoning. We know exactly how this thing is going to decide what move to make. We programmed the Diplomacy reasoning engine. And here’s the interesting part about this example: they decided they didn’t want their bot to lie. That’s a big strategy in Diplomacy, but they didn’t want the bot to lie to human players, for various ethical reasons. And because they were hand coding the reasoning engine, they could just code it to never lie. So when you don’t try to have all of the reasoning and decision making happen in an obfuscated, unpredictable, uninterpretable way inside a giant neural network, but instead have explicit reasoning programs working with this great language model, we have a lot more control over what these things do.
Now we can have a Diplomacy bot that, hey, it can beat human players, that’s scary, but it doesn’t lie, because there’s nothing mysterious about the reasoning. It’s just like we do with a chess playing bot: we simulate lots of different sequences of moves to see which one’s going to end up best. It’s not obfuscated. It’s not unpredictable.
Mike: And it can’t be jailbroken.
Cal: There’s no jailbreaking. We programmed it. So this is the future I see with multi agent: it’s a mixture. When you have generative AIs, generating text or understanding text or producing video or producing images, these very large neural network based models are really, really good at that, and we don’t exactly know how they operate, and that’s fine. But when it comes to planning or reasoning or intention, the evaluation of which of these plans is the right thing to do, or whether the thing you’re going to say or do is correct or incorrect, that can all be super intentional, super transparent, hand coded.
There’s nothing here to escape when we think about it this way. So I think IAI gives us a powerful vision of an AI future, especially in the business context, but also a less scary one, because the language models are kind of scary in the sense that we just trained this thing for a hundred million dollars over months and then said, let’s see what it can do. I think that rightly freaks people out. But this multi agent model, I don’t think it’s nearly as much the Frankenstein’s monster that people fear AI has to be.
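Here is a toy sketch of that “intentional AI” control point. It is not Cicero’s actual planner, which is far more sophisticated; it just shows how, when move selection is ordinary code, a constraint like “never lie” becomes a one line filter rather than a fine tuning hope. All names and scores are invented.

```python
# Toy version of a hand-coded planner with a hard "never lie" rule.
from dataclasses import dataclass

@dataclass
class Plan:
    move: str              # what we will actually do this turn
    message: str           # what we tell the other player
    expected_score: float  # value estimated by simulating futures

def candidate_plans(state):
    # A real engine would simulate many futures to score each plan;
    # these three are hard-coded for illustration.
    return [
        Plan("attack Munich", "I will support Munich", 0.9),   # deceptive, scores highest
        Plan("support Munich", "I will support Munich", 0.7),  # honest
        Plan("hold", "I will hold", 0.4),                      # honest
    ]

def is_honest(plan):
    # The constraint is plain code: the message must describe the real move.
    return plan.message == f"I will {plan.move}"

def choose(state):
    honest = [p for p in candidate_plans(state) if is_honest(p)]
    return max(honest, key=lambda p: p.expected_score)

print(choose(state={}))  # picks "support Munich", never the higher-scoring lie
```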
Mike: One of the easiest ways to increase muscle and strength gain is to eat enough protein, and to eat enough high quality protein.
Now you can do that with food of course, you can get all of the protein you need from food, but many people supplement with whey protein because it is convenient and it’s tasty and that makes it easier to just eat enough protein. And it’s also rich in essential amino acids, which are crucial for muscle building.
And it’s digested well, it’s absorbed well. And that’s why I created Whey Plus, which is a 100 percent natural, grass fed, whey isolate protein powder made with milk from small, sustainable dairy farms in Ireland. Now, why whey isolate? Well, that is the highest quality whey protein you can buy. And that’s why every serving of Whey Plus contains 22 grams of protein with little or no carbs and fat.
Whey Plus is also lactose free, so that means no indigestion, no stomach aches, no gassiness. And it’s also 100 percent naturally sweetened and flavored. And it contains no artificial food dyes or other chemical junk. And why Irish dairies? Well, research shows that they produce some of the healthiest, cleanest milk in the world.
And we work with farms that are certified by Ireland’s Sustainable Dairy Assurance Scheme, SDSAS, which ensures that the farmers adhere to best practices in animal welfare, sustainability, product quality, traceability, and soil and grass management. And all that is why I have sold over 500,000 bottles of Whey Plus and why it has over 6,000 four and five star reviews on Amazon and on my website.
So, if you want a mouth watering, high protein, low calorie whey protein powder that helps you reach your fitness goals faster, you want to try Whey Plus today. Go to buylegion.com/whey, use the coupon code MUSCLE at checkout, and you will save 20 percent on your first order. And if it is not your first order, you will get double reward points, and that is 6 percent cash back. And if you don’t absolutely love Whey Plus, just let us know and we will give you a full refund on the spot. No form, no return is even necessary. You really can’t lose. So go to buylegion.com/whey now, use the coupon code MUSCLE at checkout to save 20 percent or get double reward points, and then try Whey Plus risk free and see what you think. Speaking of fears, there’s a lot of talk about the potential negative impacts on people’s jobs, on economies. Now, you’ve expressed some skepticism about claims that AI will lead to massive job losses, at least in the near future. Can you talk a little bit about that, for people who have that concern because they’ve read that their job is on the list of what AI is replacing in the next however many years? Because you see a lot of that.
Cal: Yeah, no, I think those are still largely overblown right now. I don’t like the methodologies of those studies. In fact, it’s kind of ironic: one of the big early studies that gave specific numbers for what part of the economy is going to be automated used, as its methodology, a language model to categorize whether each given job was something that a language model might one day automate. It was very circular. So here’s where we are now. Language model based tools like ChatGPT or Claude, again, are built only on understanding language and generating language based on prompts, and mainly, and I’m sure this has been your experience, Mike, in using these tools, that’s being applied to speed up things we were already doing. Help me write this faster, help me generate more ideas than I’d be able to come up with on my own, help me summarize this document. Sort of speeding up tasks.
Mike: Help me think through this. Here’s what I’m dealing with. Am I missing anything? I find those types of discussions very helpful.
Cal: Yeah, and that’s another aspect that’s been helpful, and that’s what we’re seeing with students as well. It’s interesting: it’s more of a psychological advantage than an efficiency advantage. Humans are social, so there’s something really interesting going on here, where there’s a rhythm of thinking, going back and forth with another entity, that’s somehow a more comfortable rhythm than just sitting here white knuckling my brain trying to come up with things.
But none of that is “my job doesn’t need to exist,” right? So that’s where we are now: it’s speeding up certain things, or changing the nature of certain things we’re already doing. I argued recently that the next step, the Turing test we should care about, is: when can an AI empty my email inbox on my behalf? And I think that’s an important threshold because it captures a lot more of what cognitive scientists call functional intelligence. A cognitive scientist would say a language model has very good linguistic intelligence, understanding and producing language. The human brain does that, but it also has these other things called functional intelligences: simulating other minds, simulating the future, trying to understand the implications of actions on other actions, building a plan, and then evaluating progress towards the plan.
There are all these other functional intelligences that cognitive scientists break out. Language models can’t do that, but to empty an inbox, you need them. For me to answer this email on your behalf, I have to understand who’s involved. What do they want? What’s the larger objective they’re moving towards? What information do I have that’s relevant to that objective? What information or suggestion can I make that’s going to make the best progress towards that objective? And then, how do I deliver that in a way that’s actually going to work, understanding how the recipients think, what they care about, and what they know, so that it best fits these other minds? That’s a very complicated thing. So that’s going to be more interesting, right? Because that could take more of this administrative overhead off the plate of knowledge workers. Not just speeding up or changing how we do things, but taking things off our plate, which is where things get interesting.
That needs multi agent models, right? Because you have to have the equivalent of the Diplomacy planning bot doing business planning: well, what happens if I suggest this and they do that, what’s going to happen to our project? It needs to have specific objectives programmed in: in this company, this is what matters. And it needs a hard coded list of things it can do, so that when it’s trying to plan what to suggest, it knows: these are the things I’m authorized to do in my position at this company, as in the sketch below. So we need multi agent models for the inbox clearing Turing test to be passed.
That’s where things start to get more interesting. And I think that’s where a lot of the prognostications of big impacts get more interesting. Again, though, I don’t know that it’s going to eliminate large swaths of the economy, but it might really change the character of a lot of jobs, similar to the way the internet or Google or email really changed the character of a lot of jobs versus what they were like before, really changing the day to day rhythm. What we’ve gotten used to in the last 15 years is work built on a lot of unstructured back and forth communication: our day is built on email, Slack, and meetings. Work five years from now, if we cross the inbox Turing test, might feel very different, because a lot of that coordination could be happening between AI agents. It’s going to be a different feel for work, and that could be substantial. But I still don’t see it as knowledge work going away, the way building water run mills or horse and buggies went away. I think it’s more of a character change, probably, but it could be a very significant change if we crack that multi agent functional intelligence problem.
Mike: Do you think that AI augmentation of knowledge work is going to become table stakes if you are a knowledge worker, which I think would also include creative work of any kind? Could we have a scenario where information, knowledge, idea workers who use AI just get to a point where they can outproduce, quantitatively and qualitatively, their peers who don’t, so much so that a lot of the latter group will not have employment in that capacity if they don’t adopt the technology and change?

Cal: Yeah. I mean, I think it’s like internet connected PCs eventually. Everyone in knowledge work had to adopt and use them. You couldn’t survive by, like, the late nineties; you’d be at too big of a disadvantage if you weren’t using an internet connected computer.
You can’t email me, I’m not using word processors, we’re not using digital graphics and presentations; you had to adopt that technology. We saw a similar transition if we go back a hundred years, to electric motors and factory manufacturing. There was about a 20 year period where integration of electric motors was uneven, in factories that before were run by giant steam engines that would turn an overhead shaft, with all the equipment connected to it by belts.
But eventually, and there’s a really nice, often cited business case study written about this, you had to have small motors on each piece of equipment. You were still building the same things, and the equipment was functionally the same. Whatever you’re sewing, shirts or pants, you’re still a factory making pants, and you still have sewing machines. But eventually you had to have a small motor on every sewing machine, connected to a dynamo, because that was just so much more efficient than a giant overhead single speed shaft with everything connected to it by belts.
So we saw that in knowledge work already with internet connected computers. If we get to this functional intelligence AI, I think it’s going to be unavoidable. One way to imagine this technology, and I don’t exactly know how it’ll be delivered, is something like a chief of staff. If you’re a president or a tech company CEO, you have a chief of staff who organizes all the stuff so that you can focus on what’s important. The president of the United States doesn’t check his email inbox to figure out what to work on next. That sort of Leo McGarry character says: all right, here’s who’s coming in next, here’s what you need to know about it, here’s the information, we’ve got to make a decision on whether to deploy troops, you do that, okay, now here’s what’s happening next. You can imagine a world in which AIs play something like that role. So now things like email, and a lot of what we’re doing in meetings, for example, get taken over by the digital chiefs of staff.
They gather what you need. They coordinate with other AI agents to get you the information you need. They deal with the information on your behalf. They deal with the software programs that make sense of this information or calculate it, on your behalf. We could be heading towards a future like that: a lot less administrative overhead and a lot more undistracted thinking, that sort of cognitive focus. That will feel very different. I think that’s actually a much better rhythm of work than what we evolved into over the last 15 years or so in knowledge work. But it could have interesting side effects, because if I can now produce 3x more output because I’m not on email all day, well, that changes the economic nature of my particular sector, because technically we only need a third of me now to get the same amount of work done.
So what do we do? Well, probably the sectors will expand, the economy as a whole will expand, since each person can produce more. We’ll probably also see a lot more jobs show up that didn’t exist before, to capture this surplus cognitive capacity. We’ll have a lot more raw brain cycles available once we don’t have everyone sending and receiving emails once every four minutes. And so we’re going to see, I think, more injection of cognitive cycles into other parts of the economy. I might now hire someone who helps me manage a lot of the paperwork in my household, things like that, just because there’s going to be this excess cognitive capacity. So we’re going to have more thinking done on our behalf. It’s a hard thing to predict, but that’s where things get interesting.
Mike: I think email is a great example of necessary drudgery, and there’s a lot of other necessary drudgery that will also be able to be offloaded. An example is the CIO of my sports nutrition company, who oversees all of our tech stuff and has a long list of projects he’s always working on. He is heavily invested now in working alongside AI. I think he likes GitHub’s Copilot the most, and he’s kind of fine tuned it to how he likes to code and everything. And he has said a couple of things. One, he estimates that his personal productivity with coding is at least 10x.
And he is not a sensationalist; that’s a conservative estimate. And then he has also commented that something he loves about it is that it automates a lot of drudgery code. Typically, you have to reproduce something you’ve already done before, and that’s fine, you can take what you did before, but you have to go through it and make changes. You know what you’re doing, but it’s boring, and it can take a lot of time. He said now he spends very little time on that type of work because the AI is great at it, and so the time he gives to his work is more fulfilling and ultimately more productive. And I can see that effect occurring in many other types of work. Just think about writing. Like you say, you don’t ever have to deal with the scary blank page. Not that that is really an excuse to not put words on the page, but something I’ve personally enjoyed is that, although I don’t believe in writer’s block per se, you can’t even run into idea block, so to speak, because if you’re not sure where to go with a thought, or whether you’re even onto something,
you can jump over to GPT and start a discussion about it. At least in my experience, especially if you get it generating ideas, and you mentioned this earlier, a lot of the ideas are bad and you just throw them away. But always, in my experience, I’ll say always, when I’m going through this kind of process, I get to at least one thing, if not multiple things, that I genuinely like, where I have to say: that’s a good idea, that gives me a spark, I’m going to take that and work with it.
Cal: Yeah, I mean, again, I think this is something we didn’t fully understand, and still don’t fully understand, but we’re learning more about: the rhythms of human cognition, what works and what doesn’t. We’ve underestimated the degree to which the way we work now is highly interruptive and solitary at the same time. I’m trying to write this thing from scratch, which is a very solitary task, but I’m also interrupted a lot by unrelated things. That’s a rhythm that doesn’t fit well with the human mind. A focused, collaborative rhythm is something the human mind is very good at. So if my day unfolds with me interacting back and forth with an agent, maybe that seems really artificial, but I think the reason we’re seeing this actually be useful to people is that it’s probably a more human rhythm for cognition: I’m going back and forth with someone else in a social context, trying to figure something out, and my mind can be completely focused on it. You and I, where you’re a bot in this case, are trying to write this article together. That is more familiar, and I think that’s why it feels less straining than sitting here doing this very abstract thing on my own, staring at a blank page. Programming, you know, is an interesting example, and I’ve been wary about trying to extrapolate too much from programming, because I think it’s a special case.
Because what language models do really well is produce text that matches the prompt you gave for the type of text you’re looking for, and as far as a model is concerned, computer code is just another type of text. If it’s producing English language text, it’s very good at following the rules of grammar, producing grammatically correct language. If it’s producing computer code, it’s very good at following the syntax of programming languages, producing correct code that’s actually going to run. Now, language plays an important role in a lot of knowledge work jobs, but it’s not the main game. It supports the main things you’re doing. I have to use language to request the information I need for what I’m producing. I need to use language to write a summary of the strategy I figured out. So language is a part of it, but it’s not the whole activity. In computer coding, it is the whole activity. The code is the thing I’m trying to produce, and we can just think of that as text that matches a prompt. The models are very good at that. And more importantly, if we look at the knowledge work jobs where English text is the main thing we produce, like writers,
there we typically have incredibly fine tuned standards for what makes good writing good. When I’m writing a New Yorker article, it’s very intricate. It’s not enough to produce grammatically correct language that covers the relevant points, even good points. Everything matters: the sentence construction, the rhythm. But in computer code, we don’t have that. The code just has to be reasonably efficient and run. So the bullseye case for getting the maximum possible knowledge worker productivity out of a language model is producing computer code, like a CIO at a company saying, we need the right programs to do things. We’re not trying to build a program that’s going to have a hundred million customers and has to be the most efficient possible, just something that works and solves the problem I want to solve.
Mike: And there’s no aesthetic dimension. Although, I suppose there might be some pushback there, in that there can be elegant code and inelegant code, but it’s not anywhere near the same degree as when you’re trying to write something that really resonates with other humans in a deep way and inspires different emotions and images and things.
Cal: Yeah, I think that’s right. And elegant code is sort of the language equivalent of polished prose, which these language models actually do very well: this is very polished prose, it doesn’t sound amateur, there are no mistakes in it. That’s often enough, unless you’re trying to do something fantastical and new, in which case the language models can’t help you with programming. If you’re doing something completely different, a super elegant algorithm that changes the way we compute something, they can’t help, but most programming is not that. That’s for the 10x coders to do. So programming is interesting, but for most other knowledge work jobs, I see it more as AI getting the junk out of the way of what the human is doing, more so than doing the final, core thing that matters for the human.
And this is like a lot of my books. A lot of my writing is about digital knowledge work. We have these modes of working that accidentally got in the way of the underlying value producing thing we’re trying to do in the company. The underlying thing I’m trying to do with my brain gets interrupted by the communication, by the meetings, and this is sort of an accident of the way digital knowledge work unfolded. AI can potentially unroll that accident. But it’s not going to be GPT-5 that does it. It’s going to be a multi agent model, where there are language models and hand coded models and company specific bespoke models that all work together. I really think that’s going to be the future.
Mike: Maybe that’s going to be Google’s chance at redemption, because they’ve made a fool of themselves so far compared to OpenAI, even compared to Perplexity. Not to get off on a tangent, but by my lights, Google Gemini should fundamentally work exactly the way Perplexity works. I now go to Perplexity just as often, if not more often. If I have a question and I want an answer with sources cited, and I want more than one line, I go to Perplexity now. I don’t even bother with Google, because Gemini is so unreliable with that. But maybe Google will be the one to bring multi agent into its own. Maybe not; maybe it’ll just be OpenAI.
Cal: They might be. But, you know, I talked about that bot that won at Diplomacy by taking this multi agent approach. The lead designer on that, Noam Brown, got hired away from Meta, and it was OpenAI who hired him. So, interestingly, that’s where he is now. He’s at OpenAI working, industry insiders suspect, on building exactly these sorts of bespoke planning models to connect to language models and extend their capability. Google Gemini also showed the problem of relying on just making language models bigger and having these massive models do everything, as opposed to the IAI model of: okay, we have specific logic,
and these more emergent language understanders. Look what happened a couple of months ago with the controversy where they were trying to fine tune these models to be more inclusive, and it led to completely unpredictable, unintended results, like depicting Black Waffen-SS officers, or refusing to show the founding fathers as white. The main message of that was kind of misunderstood. It was somehow being understood by political commentators as if someone somewhere was programming: don’t show anyone as white, or something like that. But no, what really happens is these models are very complicated.
They do these fine tuning passes because you have these giant models that take hundreds of millions of dollars to train, and you can’t retrain them from scratch. So now you’re worried about it defaulting to showing, say, white people too often when asked these questions, and you give it some examples to nudge it the other way. But these models are so big and dynamic that when you give it a couple of examples, like, show me a doctor, with a reinforcement signal to show a nonwhite doctor, to try to unbias it away from what’s in its data, that can ripple through the model in a way that now you get the SS officers and the founding fathers as American Indians or something like that. It’s because they’re huge. When you’re trying to fine tune a huge thing, you have a small number of fine tuning examples, maybe 100,000 examples, with massive reinforcement signals that fundamentally rewire the first and last layers of these models, and that has huge, unpredictable, dynamic effects.
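A caricature of that dynamic, using a one parameter model instead of a trillion parameter one: a handful of strongly weighted corrective examples doesn’t gently nudge the behavior, it swings it far past the target. The numbers are invented and this is nothing like a real fine tuning pipeline, but it shows how oversized reinforcement signals on a few examples produce overshoot.

```python
# Toy overshoot demo: big updates from a few corrective examples.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w = 2.0  # "pretrained" tendency: P(default behavior) ~ 0.88
print("before fine-tuning:", round(sigmoid(w), 2))

# Five corrective examples, each applied with an oversized step size.
for _ in range(5):
    gradient = sigmoid(w) - 0.0  # target says: never produce the default
    w -= 3.0 * gradient          # huge reinforcement signal per example
print("after fine-tuning: ", round(sigmoid(w), 2))  # collapses far past a gentle nudge
```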
It just underscores the unwieldiness of trying to have one master model that is huge and serves all of these purposes in an emergent manner. It’s an impossible goal. It’s also not what any of these companies want. If you’re OpenAI, if you’re Anthropic, if you’re Google, you do not want a world in which you have one massive model that you talk to through an interface, and that’s everything, and this model has to satisfy all people in all things. You want the world where your AI, your complicated combinations of models, is embedded in all sorts of different stuff that people do, in much smaller form factors with much more specific use cases. ChatGPT, it was an accident that it got so big. It was supposed to be a demo of the type of applications you can build on top of a language model. They didn’t mean for ChatGPT to be used by a hundred million people. That’s why I say: don’t overestimate the importance of this particular form factor for AI.
It was an accident that this is how we got exposed to what language models could do. People do not want to be in this business of a blank text box that anyone, anywhere, can ask anything, an oracle that answers you. That’s not what they want. They want the GitHub Copilot vision: in the particular stuff I already do, AI is there making this very specific thing better and easier, or automating it. So I think they want to get away from the mother model, the oracle model that everything goes through. This is a temporary step, like accessing mainframes through teletypes before, eventually, we got personal computers. This is not going to be the future of our interaction with these things, the oracle blank text box to which all requests go. They’re having so much trouble with this, and they don’t want it. I see these massive trillion parameter models as marketing: look at the cool stuff we can do, associate that with our brand name, so that when we’re offering more of these bespoke tools in the future, all over the place, you’ll remember Anthropic, because you remember Claude was really cool during this period when we were all using chatbots.
Mike: And we did the Golden Gate experiment. Remember how fun that was? A good example of what you were just mentioning, of how you can’t brainwash the bots per se, but you can hold down certain buttons and produce very strange outcomes. For anyone listening, I think it’s still live now, I don’t know how long they’re going to keep it up, but check out Anthropic’s Claude Golden Gate Bridge experiment and fiddle around with it.
Cal: And by the way, think about this objectively. There’s another weird thing going on with the oracle model of AI, which again is why they want to get away from it. We’re in this weird moment where we’re conceptualizing these models as sort of important individuals, and we want to make sure that the way these individuals express themselves is proper. But if you zoom out, this doesn’t necessarily make a lot of sense as something to invest a lot of energy into. You would assume people could understand: this is a language model, a neural network that just produces text to expand stuff you put into it. Hey, it’s going to say all sorts of crazy stuff, because it’s just a text expander, but here are all these useful ways you can use it. You can make it say crazy stuff, and if you want it to write, say, nursery rhymes as if written by Hitler, whatever, it’s a language model; it can do almost anything. It’s a cool tool.
And we want to talk about ways you can build tools on top of it. But we’re in this moment where we got obsessed with treating it like it’s an elected official or something, where the things it says somehow reflect on the character of some entity that actually exists, and so we don’t want it to say certain things. You know, there’s a whole interesting and important field in computer science called algorithmic fairness, or algorithmic bias, where they look at algorithms used for making decisions and watch for biases being unintentionally programmed into those algorithms. This makes a lot of sense. The classic early cases were things like: hey, you’re using an algorithm to make loan approval decisions. You give it all this information about the applicant, and the model is maybe better than a human at figuring out who to give a loan to. But wait a second, depending on the data you trained that model with, it might actually be biased against people from certain backgrounds or ethnic groups in a way that is just an artifact of the data. We’ve got to be careful about that, right?
Mike: Or in a way that may actually be factually accurate and valid, but ethically unacceptable. And so you make a determination.
Cal: Yeah. So right there, if this were just us as humans doing this, there are nuances and determinations we could weigh,
so we've got to be very careful about having a black box do it. But somehow we shifted that attention over to chatbots just producing text. They're not at the core of decisions. The chatbot's text doesn't become canon. It doesn't get taught in schools. It's not used to make these decisions.
It's just a toy that you can mess with, and it produces text. But it became really important that the stuff you get this bot to say has to meet the standards we would hold an individual human to. A huge amount of effort is going into this, and it's really unclear why. So what if I can make a chatbot say something very disagreeable?
I can also just say something very disagreeable. I can search the internet and find things that are very disagreeable. Or you...
Mike: Exactly. You can go poke around on forums about anything. Go spend some time on 4chan, and there you go: that's enough disagreeability for a lifetime.
Cal: So we don't get mad at Google because, hey, I can find websites written by preposterous people saying terrible things, because we know what Google does: it just indexes the web. So there's a lot of effort going into trying to make this oracle-model thing behave, even though the text doesn't have that kind of impact. There was a big scandal along these lines right before ChatGPT came out. I think it was Meta that had this language model, Galactica, which they had trained on a lot of scientific papers. And they had, I think, a really good use case: if you're working on scientific papers, it can help speed up writing sections of them.
It's hard work getting the results in science, but then writing the paper is a pain, and the real value is typically in doing the research, right? So, great: we've trained it on a lot of scientific papers, so it knows the language of scientific papers. It can help you write, say, the interpretation section. Let me tell you the main points; you put them in the right language.
But people were messing around with it, like, hey, we can get this to write fake scientific papers. A famous example was one about the history of bears in space. And they got real spooked and pulled it. But in some sense, it's like, yeah, sure, a thing that can produce scientific-sounding text can produce papers about bears in space.
I could write a fake paper about bears in space; it's not adding some new harm to the world. And this tool would be very useful for specific uses, like: help me write this section of my particular paper.
So when we have this oracle conception of these machines, I think we anthropomorphize them into an entity. And if I created this entity as a company, it reflects on me: what its values are, the things it says. And I want this entity to be sort of culturally appropriate.
You could easily imagine the other way, which is how we thought about these things pre-ChatGPT: hey, we have a model, GPT-3, and you can build applications on it to do things. That had been out for a year, two years. You could build a chatbot on it, or you could build a bot on it that just produces fake scientific papers or whatever.
But we saw it as a program, a language-generating program that you could then build things on top of. Somehow, when we put it into this chat interface, we started thinking of these things as entities, and then we really care about the beliefs and behavior of the entities. It all seems so wasteful to me, because we need to move past the chat-interface era anyway and start integrating these things directly into tools.
No one's worried about the political beliefs of GitHub Copilot, because it's focused on filling in computer code and writing drafts of computer code. Anyway, to try to summarize these various points and bring this to our look at the future: essentially what I'm saying is that in this current era, where the way we interact with these generative AI technologies is through a single chat box,
and the model is an oracle that we do everything through, we're going to keep running into this problem where we begin to treat this thing as an entity. We're going to have to care about what it says, how it expresses itself, and whose team it's on, and a huge amount of resources will have to be invested in that.
And it feels like a waste, because the inevitable future we're heading towards is not one of an all-wise oracle that you talk to through a chatbot to do everything. It's going to be much more bespoke, where networks of AI agents will be customized for the various things we do, just like GitHub Copilot is very customized at helping me write computer code in a programming environment.
There'll be something similar happening when I'm working on my spreadsheet, and there'll be something similar happening with my email inbox. So right now, spending so many resources on whether Claude or Gemini or ChatGPT is, you know, politically correct is a waste, because the role of these large chatbots as oracles is going to go away anyway.
So I am excited for the future where we splinter AI and it becomes more responsive and bespoke, directly working on and helping with the specific things we're doing. That's going to get more interesting for a lot of people, because I do think for a lot of people right now, the copying and pasting, having to make everything linguistic, having to prompt-engineer, is a big enough stumbling block that it's impeding sector-wide disruption. That disruption is going to be much more pronounced once the form factor of these tools is much more integrated into what we're already doing.
Mike: And the LLM will probably be the gateway to that, because of how good it is at coding in particular and how much better it's going to get. It's going to be able to do a lot of the work of getting to these special-use-case multi-agent systems, probably to a degree that without it just wouldn't be possible. It's just too much work otherwise.
Cal: Yeah, I think it's going to be the gateway. If I'm imagining an architecture, the gateways are LLMs: I say something that I want to happen, and the LLM understands the language and translates it into a much more precise, machine-friendly form.
I imagine there'll be some sort of coordinator program that takes that description and starts figuring things out: okay, now we need to use this program to help do this. Let me talk to the LLM: hey, translate this into this language. Now let me talk to that program. So we'll have a coordinator program, but the gateway between humans and that program, and between that program and other programs, is going to be LLMs.
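To make that architecture concrete, here is a minimal sketch of such a coordinator in Python. Everything in it is hypothetical: the llm() helper stands in for any language model API, and the tool names are invented for illustration.

```python
import json

def llm(prompt: str) -> str:
    """Stand-in for a call to whatever language model you use."""
    raise NotImplementedError("wire up a real model here")

# Hypothetical tools the coordinator can drive; a real system would wrap
# calendars, email, spreadsheets, and so on.
TOOLS = {
    "calendar.find_slot": lambda args: {"slot": "Tue 2pm"},
    "email.send": lambda args: {"status": "sent"},
}

def coordinate(user_request: str) -> None:
    # 1. The LLM is only the translation layer: fuzzy human language in,
    #    a precise machine-readable plan out.
    plan = json.loads(llm(
        "Translate this request into a JSON list of steps, each an object "
        f"with 'tool' (one of {sorted(TOOLS)}) and 'args' (an object). "
        "Return only JSON. Request: " + user_request
    ))
    # 2. An ordinary program, not the LLM, walks the plan and runs each tool.
    for step in plan:
        result = TOOLS[step["tool"]](step["args"])
        print(step["tool"], "->", result)
```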
But this is also going to mean they don't have to be so big. If we don't need them to do everything, if we don't need them to play chess games and write in every idiom, we can make them much smaller. If what we really need them to do is understand the human language relevant to the types of business tasks this multi-agent system is going to run on, the LLM can be much smaller, which means we can fit it on a phone.
And more importantly, it can be much more responsive. Sam Altman's been talking about this recently: it's just too slow right now, because these LLMs are so big.
Mike: Even GPT-4o, when you get it into more esoteric token spaces. I mean, it's fine, I'm not complaining; it's a fantastic tool. But I do a fair amount of waiting while it's chewing through everything.
Cal: Yeah, well, because the model is big, right? The actual computation behind a transformer-based language model producing a token is a bunch of matrix multiplications. The weights of the neural network's layers are represented as big matrices, and you multiply matrices by matrices. That's what's happening on GPUs.
But the size of these things is so big that they don't even fit in the memory of a single GPU chip. So you might have multiple GPUs, working flat out, involved just to produce a single token, because these massive matrices are being multiplied.
So if you make the model smaller, it can generate the tokens faster. And what people really want is essentially real-time response. They want to be able to say something and have the text response come back, boom, just like that. Then this becomes a natural interface where I can just talk, not watch it go word by word, but talk, and boom, it does it.
What's next, right?
Mike: Or it even talks back to you. So now you have a commute, say, and you can actually use that time to have a discussion with this highly specific expert about what you're working on, in real time, as if you're talking to somebody on the phone.
Oh, that's good.
Cal: And I think people underestimate how cool this is going to be. We need very small latency, because imagine I'm at my computer and I can just say: okay, get the data from the Jorgensen movie, let's open up Excel here, let's put that into a table, do it the way we did before.
If you're seeing that just happen as you say it, now we're in the linguistic equivalent of Tom Cruise in Minority Report, moving the AR windows around with his special gloves. That's when it gets really important. Sam Altman knows this; he's talking a lot about it. It's not too difficult. We just need smaller models, and we know small models are fine.
As I mentioned in that Diplomacy example, the language model was very small, a factor of 100 smaller than something like GPT-4, and it was fine, because it wasn't trying to be this oracle that anyone could ask about everything, with people constantly prodding at it.
Mike: Nobody was prodding it, like, "Are you an idiot? Come on." It was just really good at Diplomacy language, and it had the reasoning engine.
Cal: And it knew that language really well. And it was really small: 9 billion parameters. So anyway, that's what I'm looking forward to: getting these models smaller. It's an interesting mindset shift. Smaller models, hooked up to other custom programs,
deployed in a bespoke environment. That's the startup play you want to be involved in.
Mike: With a big context window.
Cal: Big context window, yeah. But even that doesn't have to be that big. A lot of the stuff we do doesn't even need a big context window. You can have another program just find the thing that's relevant to what's happening next, and it pastes that into the prompt, and you don't even see it.
Mike: That's true. I'm just thinking selfishly about a writing project, right? You go through your research phase, and you're reading books and articles and transcripts of podcasts, whatever, and you're making your highlights and getting your thoughts together. And you have this corpus. If it were fiction, it would be your story bible, as they say, or codex, right?
You have all this information now, and it's time to start working with it, and it might be a lot, depending on what you're doing. Google's NotebookLM is built around exactly this concept, and I've started to tinker with it in my work. I haven't used it enough, and this is kind of a segue into the final question I want to ask you,
I haven't used it enough to pronounce one way or the other on it. But I like the concept, which is exactly this: cool, you have a bunch of material related to this project you're working on. Put it all into this model, and now it reads it all. And it can do the find-the-password example: you hide a password in a million tokens of text or whatever, and it can find it.
So in a sense it quote-unquote knows, with a high degree of accuracy, everything you put in there. And now you have this bespoke little assistant on the project. It's not trained on your data per se, but you can have that experience. So now you have a very specific assistant you can use, but of course you need a big context window. Maybe you don't need it to be 1.5 million or 10 million tokens, but if it were 50,000 tokens, maybe that's sufficient for an article, though not for a book.
Cal: It does help, though it's worth knowing the architecture. There are a lot of these third-party tools built on language models where you hear people say, "I built this tool where I can now ask this custom model questions about all of our company's quarterly reports from the last 10 years," or something like that.
There's a big business now of consulting firms building these tools. But the way they actually work is that there's an intermediary. So you ask: okay, how have our sales differed between the first quarter of this year and, like, 1998?
These tools don't have 20 years' worth of reports in the context. What actually happens is search. Not the language model; just an old-fashioned program searches those documents to find relevant text, and then it builds a prompt around that. The way a lot of these tools work is that your text is stored in terms of embeddings,
the representations that the language model's neural networks understand, and the tool uses the embedding of your prompt to find conceptually similar text. So it's more sophisticated than text matching; it's not just looking for keywords.
It can leverage a little bit of the language model, how it embeds prompts into a conceptual space, and then find text that lives in a similar conceptual space. But then it just creates a prompt: okay, here's my question; please use the text below in answering this question. And then it has
5,000 tokens' worth of text pasted below. That actually works pretty well, right? All the OpenAI demos from last year, like the plug-in demo with the UN reports, et cetera, worked that way: finding relevant text from a giant corpus and then creating smarter prompts that you, as the user, don't see.
Your prompt is not what goes to the language model. It's a version of your prompt with cut-and-pasted text that was found in the documents. And even that works well.
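As a rough sketch of the retrieval pattern Cal is describing, in Python: the embed() function here is a hypothetical stand-in for whatever embedding model you use; the point is the shape of the pipeline, not any particular vendor's API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in: map text into the model's conceptual (embedding) space."""
    raise NotImplementedError("plug in a real embedding model here")

def build_prompt(question: str, documents: list[str], k: int = 3) -> str:
    # Embed the corpus and the question.
    doc_vecs = np.stack([embed(d) for d in documents])
    q_vec = embed(question)
    # Cosine similarity finds conceptually similar text, not keyword matches.
    sims = (doc_vecs @ q_vec) / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    top = [documents[i] for i in np.argsort(sims)[::-1][:k]]
    # The user's question is silently wrapped with the retrieved passages.
    return ("Please use the text below in answering this question.\n\n"
            "Question: " + question + "\n\nRelevant text:\n"
            + "\n---\n".join(top))
```

The version of the prompt that reaches the language model is build_prompt's output, not the user's original question, which is exactly the intermediary Cal describes.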
Mike: Yeah, I'm just parroting the CIO of my sports coaching company, actually, who knows a lot more about AI than I do. He's really into the research.
He has commented to me a couple of times that when I'm doing that type of work, he recommends stuffing the context window, because if you just give it big PDFs, you don't get nearly as good results as you do when you stuff the context window.
That was just a comment, but we're coming up on time, and I just wanted to ask one more question if you have a few more minutes. It's something you've commented on a number of times, but I wanted to come back to it. In your work now, the highest-quality work that you do is deep in nature. Aside from maybe the personal interactions in your job, your career is in many ways based on coming up with good ideas.
So how are you currently using these LLMs, and specifically, what have you found helpful and useful?
Cal: Well, I'll say that right now, in their current incarnation, I use them very little, outside of specifically experimenting with things for articles about LLMs. Because, as you said, my main livelihood is trying to produce ideas at a very high level.
For academic articles, New Yorker articles, or books, it's a very precise thing: you take in a lot of information, and then your brain, trained over decades of doing this, sits with it and works on it for months and months until you slowly coalesce on, okay, here's the right way to think about this.
That's not something I find to be aided much by generic brainstorming prompts from an LLM. It's way too specific and weird and idiosyncratic for that. And then what I do is write about it, but again, the type of writing I do is incredibly precise.
I have a very specific voice, the rhythm of the sentences, a style. I just write. I'm used to it, and I'm used to the psychology of the blank page and that pain; I've sort of internalized it.
Mike: And I'm sure you have to go through multiple drafts.
The first draft, you're just throwing stuff down. I don't know if it's this way for you, but for me, I have to fight the urge to fix things, just get all the ideas down, and then start refining.
Cal: Yeah, and I'm very used to it. My inefficiency is not the issue; it's not as if speeding that part up by 20% somehow matters.
It might take me months to write an article, and it's about getting the ideas right and sitting with them. Where I do see these tools playing a big role, what I'm waiting for, is this next generation where they become more customized and bespoke and integrated into the things I'm already using.
I'll give you an example. I've been experimenting a lot with GPT-4 for understanding schedule constraints described in natural language, and finding a meeting time that satisfies those constraints.
This is imminently going to be built into something like Google Workspace, and that's going to be fantastic: you can say, in natural language, we need a meeting with Mike and these other people in the next two weeks; here are my constraints; I really want to keep this in the afternoon and, if possible, not on Mondays or Fridays.
But if we really have to do a Friday afternoon, we can, though no later than this. And then the language model, working with these other engines, sends out a scheduling email to the right people. People just respond in natural language with the times that might work. It finds something in the intersection and sends out an invitation to everybody.
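Here is a hedged sketch of that first step, turning loose scheduling language into something machine-checkable. The llm() helper is again a hypothetical stand-in, and the JSON schema is invented for illustration.

```python
import json

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any language model here")

def parse_constraints(request: str) -> dict:
    # The LLM's only job is translation: prose in, structured constraints out.
    prompt = (
        "Extract scheduling constraints from this request as JSON with keys "
        "'attendees' (list of names), 'window_days' (int), "
        "'preferences' (list of soft rules), 'hard_limits' (list of rules). "
        "Return only JSON.\nRequest: " + request
    )
    return json.loads(llm(prompt))

# The kind of input Cal describes:
# parse_constraints("Meeting with Mike and the team in the next two weeks. "
#                   "Afternoons preferred, not Mondays or Fridays, but a "
#                   "Friday afternoon is OK if we really have to.")
```

A conventional program can then take that structured object, compare calendars, and send the emails; the language model never has to do the arithmetic of finding the intersection.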
It sends out an invitation to everybody. You know, that’s really cool. Like that’s gonna make a big difference for me right away. For example, like these type of things or Integrated into Gmail, like suddenly it’s able to highlight a bunch of messages in my inbox and be like, you know what, I can, um, I can handle these for you, like, good.
And it’s like, and they disappear, that’s where this is going to start to enter my world in the way that like GitHub Copilot has already entered the world of computer programmers. So because the thinking and writing I do is so highly specialized, this sort of the impressive but generic ideation and writing abilities of those models isn’t that relevant to me, but the administrative overhead.
That goes around being any type of knowledge worker is poison to me. And so you know that that is the evolution, the turn of this sort of product development crank that I’m really waiting to
Mike: And I'm assuming one of the things we'll probably see in the near future: think about how Gmail currently has these predictive text outputs, where if you like what it's suggesting, you can hit tab and it throws a couple of words in there. I could see that expanding to suggesting an entire reply.
And hey, if you like it, you just go: yeah, sounds great. Next, next, next.
Cal: Yep. Or you'll train it, and this is where you need other programs, not just a language model. You show it examples: these are the common types of messages I get, and you tell it which message is which type. It learns to categorize these messages, and then it can have rules for how to deal with each type.
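A minimal sketch of that idea in Python: few-shot categorization with per-category handling rules. The llm() helper, the example messages, and the rules are all invented for illustration.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in any language model here")

# A handful of your own messages, labeled once. These are the "examples"
# Cal describes showing the system.
EXAMPLES = [
    ("Can we move Thursday's call to Friday?", "scheduling"),
    ("Q3 numbers attached, no action needed.", "fyi"),
    ("Reminder: expense reports are due Monday.", "admin"),
]

# Rules, kept in an ordinary program, for how each category gets handled.
RULES = {"scheduling": "propose times", "fyi": "archive", "admin": "snooze"}

def handle(message: str) -> str:
    # Few-shot prompt: labeled examples first, then the new message.
    shots = "\n".join(f"Message: {m}\nCategory: {c}" for m, c in EXAMPLES)
    category = llm(shots + f"\nMessage: {message}\nCategory:").strip().lower()
    return RULES.get(category, "leave in inbox for a human")
```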
Yeah, it's going to be powerful like that. That's going to start to matter in an interesting way. And information gathering, right? One of the big purposes of meetings in an office environment is that there's certain information or opinions I need, and it's complicated to explain them all,
so we all just get together in a room. But with AI plus control programs, I don't necessarily need everyone to get together. I can explain: I need this information, this information, and a decision on this and this. That AI program might be able to talk to your AI program.
It might be able to gather most of that information with no humans in the loop. And then, where it has questions for people, it gives them to those people's AI agents. At certain points of the day you talk to your agent, it asks you some questions, you respond, it reports back, and all of this gets gathered together.
And when it comes time to work on this project, it's all put on my desk, just like a presidential chief of staff puts the folder on the president's desk. There it is. That's where I think people need to be focused with knowledge work and LLMs, and not get too caught up in thinking about a chat window into an oracle
as being the end-all of what this technology could be. Again, it's when it gets smaller that the impact gets big. That's when things are going to start to get interesting.
Mike: A final comment on all that. I've said a number of times that I'm using it quite a bit in my work, and just in case anybody's wondering, because it may seem to contradict what you said: in some ways my work is very specialized,
and that is where I use it the most. If I think about my health-and-fitness-related work, I've found it helpful at a high level for generating overviews. I want to create some content on a topic, and I want to make sure that I'm being comprehensive, that I'm not forgetting something that should be in there.
So I find it helpful to take something like an outline for an article I want to write and just ask it: does this look right to you? Am I missing anything? How might you make this better? Those types of simple little interactions are helpful. The same applies to specific materials:
is there anything here that seems incorrect to you, or anything you would add to make this better? Sometimes I get utility out of that. But where I've found it most useful is actually hobby work. My original interest in writing was fiction, going back to when I was 17 or 18 years old.
It's been an abiding interest that I put on the back burner to focus on other things for a while. Now I've brought it back, not quite to a front burner; I bring it forward and put it back, bring it forward and put it back. And for that, I've found it extremely helpful. That process started with me reading a bunch of books on storytelling and fiction so I could understand the art and science of storytelling beyond just my individual judgment or taste.
I pulled out highlights and notes, things where I thought, well, that's useful, that's good, and organized them into a system of checklists to go through. So, okay, you want to create characters; there are principles that go into doing this well; here they are in a checklist. Working with GPT in particular through that process is extremely useful, because as the context builds in the chat, say in the case of building a character,
it quote-unquote understands the psychology, probably in some ways more than any human could, because it can produce the right answers to questions given the context of people like the character you're building.
And so much of putting together a story is actually just logical problem-solving. There are maybe some elements you could say are more purely creative, but as you put the scaffolding in place, you've built the constraints of a story world and characters and how things are supposed to work,
and it becomes more and more just logical problem-solving. And because these LLMs are so good with language in particular, it has actually been a lot of fun to see how all these things come together, and it saves a tremendous amount of time. It's not just about copying and pasting the answers;
so much of the material it generates is great. Anyway, just to give context for listeners: that's how I've been using it, both in my fitness work and, where it's actually been more useful, in the fiction hobby.
Cal: Yeah. And one thing to point out about those examples is that they're both focused on the production of text under clearly defined constraints, which language models are fantastic at.
For a lot of knowledge-work jobs, there is text produced as part of the job, but either it's not core, like the text that shows up in emails, or, yeah, they're not getting paid to write the emails. And in that case, the constraints aren't clear, right?
The issue with email text is that the text itself is not complicated, but the constraints are very business- and personality-specific: okay, so-and-so is a little bit nervous about being out of the loop, and we need to make sure they feel better about that, but there's this other initiative going on, and it's too complicated to get these constraints to my language model.
So I think for people who are generating content with clear constraints, which is a lot of what you're doing, these language models are great. And by the way, I think most computer programming is that as well: producing content under very clear constraints. It compiles and solves this problem.
To put this in the context of what I've been saying: for the knowledge workers who don't do that, the impact of these tools is going to come when we can take those other things off their plates individually, by programming the constraints of each task into explicit programs.
Oh, this is an email in this type of company; this is a calendar; whatever. So one way or the other, this is going to get into what most knowledge workers do. But you're in a fantastic position to see the power of this next generation of models up close, because it was already a match for what you're doing.
And as you would describe it, this has really changed the feel of your day; it's opened things up. So I think that's an optimistic look ahead to the future.
Mike: And that's using what is now just this big unwieldy model that's kind of good at a lot of things but not really great at anything, rather than the more specific manner you've been talking about in this interview. Not only is the task specific; I think it's a general tip for anybody listening who wants to get utility out of these tools: the more specific you can be, the better. So in my case, there are many instances where
I want to have a discussion about something related to this story, and I'm working through this little system I'm putting together, but I'm feeding it context.
I'm even defining the terms for it. So, okay, we're going to go through a whole checklist related to creating a premise for a story, but here, specifically, is what I mean by premise. And that is me pulling material from several books that I read and
cobbling together what I think is the definition of premise that I like: this is what we're going for, very specifically, and I feed that into it. So I've been able to do a lot of that as well, which is, again, creating a very specific context for it to work in. And the more hyper-specific I get, the better the results.
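As a small sketch of Mike's technique: the working definition travels with every request, so the model operates inside your context rather than its generic one. The definition text and the example task below are placeholders, not Mike's actual material.

```python
# A hypothetical "defined terms" preamble, assembled from your own reading,
# that gets prepended to every request in the project.
PREMISE_DEFINITION = (
    "For this conversation, 'premise' means: <your own definition, "
    "cobbled together from the storytelling books you've read>."
)

def make_prompt(task: str) -> str:
    # Hyper-specific context first, then the actual request.
    return PREMISE_DEFINITION + "\n\nWith that definition in mind: " + task

# Example usage:
# make_prompt("Walk through the premise checklist for my story about ...")
```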
Cal: Yep. And more and more in the future, the bespoke tools will have all that specificity built in, so you can just do the thing you're already doing, but now suddenly it's much easier.
Mike: Well, I've kept you over; I appreciate the accommodation. I really enjoyed the discussion and want to thank you again.
Before we wrap up, let's let people know where they can find you and your work. You have a new book that recently came out, and if people liked listening to you for this hour and twenty minutes or so, I'm sure they'll like the book, as well as your other books. Thanks so much.
Cal: Yeah. The background on me is that I'm a computer scientist, but I write a lot about the impact of technologies on our life and work, and what we can do about it in response.
You can find out more about me at calnewport.com, and you can find my New Yorker archive at newyorker.com, where I write about these issues. My new book is called Slow Productivity. It's reacting to how digital tools like email, smartphones, and laptops sped up knowledge work until it was overly frenetic and stressful, and how we can reprogram our thinking about productivity to make it reasonable again.
We talked about that when I was on the show before, so definitely check that out as well.
Mike: It gives almost a framework that is actually very relevant to this discussion.
Cal: Oh, yeah. The motivation for that whole book is technology too. Technology changed knowledge work,
and now we have to take back control of the reins. The slow productivity vision of knowledge work is one where AI could definitely play a really good role: it potentially takes a bunch of this freneticism off your plate and allows you to focus more on what matters.
I guess I should mention I have a podcast as well, Deep Questions, where I take questions from my audience about all these types of issues and then get into the weeds, get nitty-gritty, and give some specific advice. You can find that on YouTube as well.
Mike: Awesome. Well, thanks again, Cal. I appreciate it.
Cal: Thanks, Mike. Always a pleasure.
Mike: How would you like to know a little secret that will help you get into the best shape of your life? Here it is: the business model for my VIP coaching service sucks. Boom. Mic drop. And what in the fiddly frack am I talking about? Well, while most coaching businesses try to keep their clients around for as long as possible, I take a different approach.
You see, my team and I don't just help you build your best body ever. I mean, we do that: we figure out your calories and macros, we create custom diet and training plans based on your goals and your circumstances, we make adjustments depending on how your body responds, and we help you ingrain the right eating and exercise habits so you can develop a healthy and sustainable relationship with food and training, and more.
But then there's the kicker: once you are thrilled with your results, we ask you to fire us. Seriously. You've heard the phrase, give a man a fish and you feed him for a day; teach him to fish and you feed him for a lifetime. Well, that summarizes how my one-on-one coaching service works, and that's why it doesn't make nearly as much coin as it could.
But I'm okay with that, because my mission is not just to help you gain muscle and lose fat; it's to give you the tools and the know-how you need to forge ahead in your fitness without me. So dig this: when you sign up for my coaching, we don't just take you by the hand and walk you through the entire process of building a body you can be proud of. We also teach you the all-important whys behind the hows, the key principles and the key techniques you need to understand to become your own coach.
And the best part? It only takes 90 days. So instead of going it alone this year, why not try something different? Head over to muscleforlife.show/VIP, that is muscleforlife.show/VIP, and schedule your free consultation call now, and let's see if my one-on-one coaching service is right for you.
Well, I hope you liked this episode. I hope you found it helpful. And if you did, subscribe to the show, because it makes sure you don't miss new episodes. It also helps me, because it increases the rankings of the show a little bit, which of course makes it a little more easily found by other people who may like it just as much as you.
And if you didn't like something about this episode, or about the show in general, or if you have ideas or suggestions or just feedback to share, shoot me an email at mike@muscleforlife.com, that's muscle F-O-R life dot com, and let me know what I could do better, or just what your thoughts are about what you'd like to see me do in the future.
I read everything myself, and I'm always looking for new ideas and constructive feedback. So thanks again for listening to this episode, and I hope to hear from you soon.