speaker 1: Okay, thank you. So I will first start with some introduction and then we'll get into the actual content of this class. First, my name is Dawn Song. I'm a professor in computer science here at UC Berkeley, and also a co-director of the campus-wide center, the Center on Responsible, Decentralized Intelligence. I'm the instructor for this class, and we also have our co-instructor Xinyun from Google, who is an alum and my former student, teaching this class together with me. We also have our great TAs, Alex and Sehoon, and our great readers, Tara and Ashwin. So this is the teaching staff who will be working together with you this semester. Okay, great. So everyone has been seeing the exciting growth of large language models. The speed of advancement is just astonishing. However, these large language models operate in a very simple manner: they take text as input and produce text as output. What we will cover in this class is the next frontier: large language model agents. Instead of just taking text as input and producing text as output, we use a large language model as the key brain for reasoning and planning for the agent, and we enable the agent to interact with external environments, observe the environments, and take actions in them. The agents will use external tools and also external databases and knowledge bases for retrieval to help them perform these tasks. The rich capabilities of these large language models make LLM agents very flexible, so they can easily operate in diverse environments without much task-specific training. These LLM agents can interact with different types of environments, including, for example, surfing the web through different APIs online. They can also be embodied, even in a robot, operating in the physical world. They can sense the environment through different types of inputs, including, in the embodied setting, rich sensory inputs, and take actions in these diverse environments. Through this interaction with complex and diverse environments, they can update their memory, learn to use tools, interact with humans, and obtain grounding through these interactions as well. And these agents not only interact with environments; they can also interact with other agents through multi-agent interaction and collaboration, including with humans. This multi-agent collaboration can help agents solve even more complex tasks together. So why are LLM agents the next frontier? Why do we need to empower LLMs with the agent framework? For a number of reasons. Solving real-world tasks is never just a single pass of taking text inputs and producing text outputs; it often involves a trial-and-error process. Leveraging external tools and retrieval from external knowledge can help expand an LLM's capabilities. More importantly, this dynamic agentic workflow can facilitate solving complex tasks by enabling task decomposition, allocation of subtasks to specialized modules, and division of labor for project collaboration. And throughout the course, we will also see that multi-agent generation can help inspire better responses. Even though agents are a fairly recent development, we have already seen agents helping transform a wide range of domains, including education, law, finance, healthcare, cybersecurity, and so on.
And the development is really exciting and fast improving. There are many different leaderboards for different agent benchmarks that you can see online, and you can see really fast improvement across all these different agent frameworks. Overall, to better enable agent deployment, there are a number of key challenges that we still need to address. First, we need to improve the reasoning and planning capabilities of agents: LLM agents tend to make mistakes when performing complex tasks end to end, so it's important to improve their reasoning and planning capabilities. We also need to improve embodiment and learning from environmental feedback: LLM agents are still not efficient at recovering from mistakes in long-horizon tasks, and we need to further develop methods for continuous learning and self-improvement, as well as the multimodal understanding, grounding, and world-model capabilities of these agents. Also, as I mentioned, multi-agent collaboration can really help agents provide better solutions for tasks, and developing theory of mind helps multi-agent systems work better as well. Safety and privacy are also very important issues for agents: LLMs are susceptible to adversarial attacks, and they can emit harmful messages or leak private data, and so on. Solving these challenges is really important for deploying LLM agents safely in the real world. And there is human-agent interaction and ethics: how to effectively control agent behaviors and design the interaction modes between humans and agents, to best enable agents to serve human needs, is also really important. So, to help students learn and develop better methods to address these challenges, the course has been designed to cover a broad spectrum of topics across the different layers of the agent framework and across application domains. First, we'll cover key model capabilities, including reasoning, planning, and multimodal understanding. We'll also cover popular real-world agent frameworks, so students can learn how to better design agent applications and use various agentic flows easily. This will help students learn to use these agent frameworks for workflow design, retrieval-augmented generation, and multi-agent systems. We'll also cover a number of exciting application domains for these agents, including software and code development, workflow automation, multimodal applications, and enterprise applications. Finally, we'll cover important topics on agent safety and ethics. To cover this wide range of content, we have assembled an amazing team of guest speakers and researchers. The class will be led by me and Xinyun, and we have this amazing crew of guest speakers to help cover these important topics in class. speaker 2: Before I start, I want to ask one question for everyone: what do you expect from AI? You may take a few seconds to think about it. I can imagine many different answers, like solving the hardest math problems that humans cannot solve, or discovering new scientific theories, or even achieving AGI. My background is machine learning. I don't know how many of you have studied machine learning. As a machine learning person, I have a dream about AI: AI should be able to learn from just a few examples, like what humans usually do.
In the past decades, the machine learning community has made great efforts to develop data-efficient methods, like semi-supervised learning, active learning, transfer learning, and so on. And if you look at the papers from the past decade, people often celebrate one or two points of gain on a benchmark. But in practice, to be honest, those data-efficient approaches more or less failed. Don't feel bad about that; I worked on them myself for years. That led me to think about a different question: what's missing in machine learning? I carried that question with me for years. And finally I found an answer. These days, in particular for people in this course, it seems so obvious: it's about reasoning. Humans can learn from just a few examples because humans can reason, not because of data statistics. It sounds so straightforward. Let's start from a toy problem. In my research, I usually prefer a very simple problem that still captures the core challenge. This problem is called last-letter concatenation. If you are familiar with the neuro-symbolic literature, you will find similar problems there. For this problem, given a person's name as input, the output is the concatenation of the last letter of the first name and the last letter of the last name. For example, for "Elon Musk", the last letter of "Elon" is "n" and the last letter of "Musk" is "k", so the output is "nk". This is so simple. If you had seen this problem a few years ago, you probably would have tried to solve it with a machine learning model. For example, you could use a transformer model, with the name as the encoder input and the output letters produced by the decoder. And then you would find that you probably need thousands of labeled examples to train the model, and finally you might get an accuracy of 85% or 90% or so. Now, think about machine learning methods for such a simple task, simple for humans I mean. If a method requires a vast amount of labeled data to learn it, would you like to call it AI or not? AI means artificial intelligence. I suppose an intelligent model should be able to learn this task from just one or two examples. Now let's see how this problem can be solved by using large language models. I suppose most people here know large language models, but the professors asked me to briefly explain what LLMs are. An LLM is a transformer model trained to predict the next word. For example, given the text "AI is the future", we mask "future", feed "AI is the" as the input, and let the model predict the next word. If the predicted word is not "future", we adjust the model's parameters to make it produce the correct prediction; the method for doing that is called backpropagation. Of course, you can train your model with many, many sentences; for example, you can use all the text from the Internet. If you don't want to go into the details, you can simply think of training LLMs as training parrots to mimic human language. Actually, I posted that sentence online, and one guy replied to me. He said he was very experienced at training parrots and he was doing a good job, so why would he switch to a model? Okay. After training, using the model just mimics the training setup: we can use whatever text we want as input and see what the output would be; the model just predicts the next token. Then you append the generated token to the input and predict the next token again, and that's how you get answers from LLMs.
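For reference, the ground truth for this toy task is a one-liner in code; here is a minimal sketch in Python (the function name is mine, just for illustration):

```python
def last_letter_concat(name: str) -> str:
    """Concatenate the last letter of each word in a name, e.g. 'Elon Musk' -> 'nk'."""
    return "".join(word[-1] for word in name.split())

assert last_letter_concat("Elon Musk") == "nk"
assert last_letter_concat("Barack Obama") == "ka"
```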
And for this problem, we can simply concatenate the examples we have as the input, together with the test example, "Barack Obama" here. We can try this with any LLM and see what happens. You will probably see an output like "k", which of course is not correct, right? "k" is the last letter of "Barack" and "a" is the last letter of "Obama", so the output should be "ka". So this is wrong. This setting is called few-shot prompting. It just mimics the machine learning setup: instead of training a model, we just put the examples in the input. That's the only difference. Now, here is the fix: we need to add the reasoning process before the answer. We just add the explicit process here: the last letter of "Elon" is "n", the last letter of "Musk" is "k", concatenating "n" and "k" leads to "nk". That's called a reasoning process. Similarly for "Barack Obama": now we use this as the new input, and we get a perfect response from the large language model. So, just like for humans, one demonstration is enough to get an accuracy of 100%. That's exactly what I was looking for. We cannot imagine any classical machine learning method achieving this kind of perfect generalization here; there's no way. By the way, don't over-read what I said about machine learning. Machine learning is still so useful and important for doing research. These days I see many naive mistakes on social media, in the news, and even in papers at top conferences, elementary mistakes, mostly from people who have no background in machine learning. Now, this kind of idea of adding intermediate steps was proposed many years ago in the literature. There is an amazing paper by researchers at DeepMind published at ACL 2017. In that paper, they use natural language rationales to solve math problems: they derive the final answer through a series of small steps, and then they train a sequence-to-sequence model from scratch. If you know the chain-of-thought work, you will be surprised by this paper; the authors are just like time travelers. Then, in 2021, a team at OpenAI published an amazing dataset called GSM8K. They followed the idea from that 2017 paper: in this dataset, every problem is followed by a multi-step natural language solution and the final answer. The team created this amazing dataset and used it to fine-tune the GPT-3 model; they greatly scaled up the work by DeepMind. In the same year, 2021, a group of researchers at Google Brain, now part of Google DeepMind, did the work "Show Your Work: Scratchpads for Intermediate Computation with Language Models". They discovered a similar idea independently, but in the domain of program synthesis; that's why they use abstract symbols there instead of natural language. And probably many people know our work on chain-of-thought prompting. Actually, "chain of thought" is not a term we invented; it's just a common English phrase meaning a sequence of thoughts. In that work, we extensively evaluated prompting with intermediate steps and showed amazing results on almost every NLP task. So let's put it all together: in 2017, DeepMind trained with intermediate steps; in 2021, OpenAI fine-tuned LLMs with intermediate steps; in 2021 and 2022, scratchpad and chain-of-thought prompting used intermediate steps. Which part is more important? You can see it here.
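To make the contrast concrete, here is a minimal sketch of the two prompt styles being compared, standard few-shot versus few-shot with an intermediate reasoning step; the query_llm call is a placeholder for whichever LLM API you use:

```python
# Standard few-shot prompt: the demonstration shows only input -> answer.
direct_prompt = (
    "Q: Elon Musk\n"
    "A: nk\n\n"
    "Q: Barack Obama\n"
    "A:"
)

# Few-shot chain-of-thought prompt: the demonstration spells out intermediate steps.
cot_prompt = (
    "Q: Elon Musk\n"
    "A: The last letter of 'Elon' is 'n'. The last letter of 'Musk' is 'k'. "
    "Concatenating 'n' and 'k' leads to 'nk'. The answer is nk.\n\n"
    "Q: Barack Obama\n"
    "A:"
)

# answer = query_llm(cot_prompt)  # placeholder: call your favorite LLM here
```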
Actually, it doesn't matter whether you fine-tune the model or prompt it. What really matters here is the intermediate steps; that's the key. So let me summarize: regardless of training, fine-tuning, or prompting, when provided with examples that include intermediate steps, LLMs will generate responses that also include intermediate steps. Keeping that in mind, here's another question: is it helpful to introduce reasoning strategies in those examples? Humans, when they solve a problem, often have a strategy for solving it. A showcase from our team, the one I am most proud of, is least-to-most prompting, which enables easy-to-hard generalization by decomposition. Probably many people have seen this famous book, "How to Solve It" by Polya, a classic book for math education. There's a chapter about decomposition: if you go into the details, you may lose yourself in the details. So what's the difference made by decomposition? Here is a math problem, and by the way, the math in this talk is at an elementary level; every time before I give a talk I show it to my daughter, and she reviews my slides. The problem says: Elsa has 3 apples, Anna has 2 more apples than Elsa, how many apples do they have together? The difference is that we first show the language model how to break the problem down into subproblems, and then solve them one by one, from the least complex to the most complex. That's why it's called least-to-most prompting. It's a simple idea, but surprisingly powerful. So the idea is to show how to decompose a complex task into simpler tasks, as in the sketch after this paragraph. Here is the SCAN task for compositional generalization. You can look at the examples: given a natural language command, we want to translate it into a sequence of actions that could be executed by a robot, something like that. If you use least-to-most prompting, you get an accuracy of about 99.7%, and we used only 0.1% of the data as demonstration examples. Now, why did I show this task? I actually learned about this task from Xinyun, who is here today. She invented a beautiful neural-symbolic approach to solve this task many years ago. When I first looked at this task, I was very surprised: it looks so straightforward for humans, so why couldn't machines solve it? Finally we made it. And here is another benchmark on compositional generalization. I don't know if everyone knows the concept; roughly speaking, compositional generalization means the test examples are harder than the training or prompting examples, for example, the test problems are longer. Here our approach changes a little bit; it's called dynamic least-to-most prompting. We used only 1% of the data and achieved great results, way better than the SOTA results in the literature, and those SOTA results were obtained with specialized architecture design and training, and, of course, the full dataset. So far, any questions? Otherwise I'll go to the next section. Okay. I suppose this part is quite intuitive for everyone. I have two kids; my daughter is ten years old and my son is seven. Actually, when the chain-of-thought prompting work came out, I heard a very interesting conversation between my daughter and my son. My daughter asked her little brother, what's 17 times 3? The little brother said, I don't know. Then she asked, what's 10 times 3? 30. What's 7 times 3? 21. So what's 17 times 3? Oh yeah, I know, 51.
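As a rough illustration of the two-stage recipe just described, here is a minimal sketch of least-to-most prompting; query_llm is a placeholder for your LLM call, and the prompt wording is mine, not from the paper:

```python
def least_to_most(problem: str, query_llm) -> str:
    # Stage 1: ask the model to decompose the problem into simpler subproblems.
    decompose_prompt = (
        "Break the following problem into simpler subproblems, one per line, "
        "ordered from easiest to hardest:\n" + problem
    )
    subproblems = [s.strip() for s in query_llm(decompose_prompt).splitlines() if s.strip()]

    # Stage 2: solve the subproblems sequentially, feeding earlier answers back in.
    context = problem
    answer = ""
    for sub in subproblems:
        answer = query_llm(f"{context}\n\nQuestion: {sub}\nAnswer:")
        context += f"\nQuestion: {sub}\nAnswer: {answer}"
    return answer  # answer to the final (hardest) subproblem
```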
And the funny thing is, my daughter told me: Daddy, see, chain-of-thought prompting also works in my little brother's brain. Okay, now, why are intermediate steps helpful? You might say that's just so natural for humans, but in our own research we went deeper. After all, language models are just machine learning models, and we wanted to understand what is really happening. This year we published a paper at ICLR 2024, in collaboration with brilliant theoreticians from Stanford, where we give a rigorous mathematical analysis. Here are the two results. A transformer generating intermediate steps can solve any inherently serial problem as long as its depth exceeds a constant threshold, where constant means independent of the input length. However, a transformer generating direct answers either requires a huge depth to solve such problems or cannot solve them at all. Please check the paper if you are interested in the details. In terms of the practical implications of this theory: if you want to solve harder problems, you benefit from generating more intermediate steps, and you could also call external tools, like search, to help produce those intermediate steps. In this LLM agents course, many speakers will talk about how to use external tools, and you can think about the underlying principles and limitations. One of my hobbies is to find problems that my daughter can solve in seconds but LLMs fail at. Okay. So far we have talked about how to use examples to trigger LLMs to reason step by step. Is it possible to trigger step-by-step reasoning without using any examples? Here is an amazing piece of work. Actually, when this paper came out, I thought it was a joke; it turned out not to be, and then it inspired me a lot. It's called "Let's think step by step". Given a question, we don't need any examples; we just need to append "Let's think step by step", and the model can generate the reasoning. It's really cool, but because the approach uses no examples, it's worse than few-shot prompting. One may wonder: can we keep the zero-shot setup but do much better? This brings us to another work of ours, called large language models as analogical reasoners. Again from this beautiful book "How to Solve It": Polya recommends analogical reasoning for solving math problems. When you see a new problem, first ask yourself: do you know a related problem, or a related strategy? Then think about how to use that related problem to solve the new one. I also really like this quote from Banach; if you have studied functional analysis you will know Banach spaces. I was really amazed by it: the great mathematicians are the ones who can see analogies between analogies. So, given this simple problem, of course you could just say "let's think step by step", but now we do it a different way: we ask the model to recall a related problem and then solve this one. You can see that the model indeed recalls relevant examples and knowledge here, not necessarily exactly the same problem. That's useful; that's amazing. We evaluated this on a bunch of benchmarks, and it works really well. You can see that the last row is the analogical reasoner prompt. Of course, you can further optimize the prompt by yourself.
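Here is a minimal sketch of the two zero-shot styles just described, the "let's think step by step" trigger and an analogical-style prompt; the wording and the query_llm placeholder are mine, for illustration only:

```python
question = "Elsa has 3 apples. Anna has 2 more apples than Elsa. How many apples do they have together?"

# Zero-shot chain of thought: just append the trigger phrase.
zero_shot_cot = f"Q: {question}\nA: Let's think step by step."

# Analogical prompting: ask the model to first recall related problems, then solve.
analogical_prompt = (
    f"Problem: {question}\n"
    "First, recall a few related problems and briefly describe how they are solved. "
    "Then, using them as guidance, solve the problem above step by step."
)

# print(query_llm(analogical_prompt))  # placeholder LLM call
```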
The most important thing here is that it's much better than just saying "let's think step by step", shown here as zero-shot CoT, and it's even better than few-shot CoT. This approach can even outperform manually designed prompts, because the model automatically generates relevant exemplars and knowledge tailored to each individual problem. These are results on BIG-Bench, with great performance, and promising results on competitive programming as well. If you are interested in competitive programming, you could try this approach. What we didn't do here is scaling: you could, for example, search the web for related problems and knowledge for the problem you want to solve. So the key idea is to directly generate relevant examples and knowledge for each given problem, instead of using a fixed set of examples as in manual chain-of-thought prompting. Okay, so we have seen that we can use few-shot examples to show the model how to reason step by step, and we can go zero-shot without any examples, just by saying "let's think step by step". Now I could ask another question: is it possible to trigger step-by-step reasoning even without any prompt like "let's think step by step"? You could say, well, today's models from companies are already instruction-tuned; that means many such examples are already in the data mixture used for training or fine-tuning. So yes, we looked at this. In our recent work, chain-of-thought reasoning without prompting, we don't say anything; we just give the problem to the model, even to a pretrained checkpoint rather than a chat model. Let's look at an example here: I have 3 apples, my dad has 2 more apples than me, how many apples do we have together? The approach is actually a very simple change to decoding. At the first decoding step, instead of taking the single most likely token, we look at the top-k candidate tokens; here I list the top 5. We start from each of these first-token candidates and then continue with greedy decoding. So, for example, one candidate starts with "5" and continues as "5 apples". If we start from the token "I", the generation becomes: I have 3 apples, my dad has 2 more apples than me, so he has 5 apples, and so on. That's interesting, right? We didn't say anything about reasoning, but the model can do some reasoning if we start from different tokens. Here's another example: was Nicolas Cage born in an even or odd year? One candidate continuation says he was born in an odd year; another candidate is just "Even" followed by a period; another is "Odd" followed by a period. If the model starts with a short response like that, there is obviously no reasoning, while a longer continuation means the model may be doing some reasoning steps. Actually, the surprising thing is to look at the probabilities of the answer tokens. If you look at the first row here, "Nicolas Cage was born in an odd year", the confidence in the answer is quite low. However, when there is a reasoning path, like the last one, where Cage was born in 1964, an even year, the answer probability finally jumps to 0.98. That's amazing, right? It seems the model is quite well calibrated. I was really surprised by those probabilities: when the model just answers "even" or "odd" directly, the confidence is really low.
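A rough sketch of that decoding trick, branch on the top-k first tokens and then decode greedily, using the Hugging Face transformers library; the model name, k, and generation length here are illustrative choices, and the answer-confidence scoring from the paper is omitted:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the talk is about much larger pretrained models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = ("Q: I have 3 apples. My dad has 2 more apples than me. "
          "How many apples do we have together?\nA:")
inputs = tok(prompt, return_tensors="pt")

# Step 1: instead of committing to the argmax, take the top-k candidate first tokens.
with torch.no_grad():
    first_logits = model(**inputs).logits[0, -1]
topk = torch.topk(torch.softmax(first_logits, dim=-1), k=5)

# Step 2: continue each branch with plain greedy decoding and inspect the outputs.
for prob, token_id in zip(topk.values, topk.indices):
    branch = torch.cat([inputs["input_ids"], token_id.view(1, 1)], dim=-1)
    out = model.generate(branch, max_new_tokens=40, do_sample=False)
    text = tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(f"p(first token) = {prob.item():.2f} -> {text!r}")
```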
So the key observation is that pretrained models already have step-by-step responses among the generations started from the top-k first tokens; we don't need to build any prompt here, it's not needed. And there is higher confidence in the final answer when a step-by-step reasoning path is present. Here is a comparison between greedy decoding and chain-of-thought decoding; we can see that chain-of-thought decoding performs much better. Any questions here? Now let's move to the next topic. Generating intermediate steps is helpful, really helpful, but are there any concerns about generating intermediate steps instead of direct answers? Well, it depends on your problem. Actually, these days we always need to keep in mind that LLMs are probabilistic models that generate next tokens. They are not humans, no matter how human-like the examples look. Keep this in mind. Since it is a probabilistic model, let's see what an LLM actually does in decoding: it computes the argmax of the probability of a reasoning path together with the final answer, given the problem. However, what we want is the argmax of the probability of the final answer given the problem. That's what we learned in machine learning. This doesn't mean the reasoning path is not important; we first have to make sure the final answer is correct and then look at the reasoning path. But the two objectives are not aligned; they are different objectives. Now let's look at this a bit further. How do we compute the probability of a final answer given a problem? We should sum over all possible reasoning paths; that's how marginal probabilities are computed, as we learned in probability courses. Given a math problem, you could find different solutions that lead to the same answer. Okay, so how do we compute that sum in practice? If you have studied machine learning, you know the answer: sampling. So we sample. This leads to our work on self-consistency. Probably many people already know self-consistency, but here I really want you to see the underlying motivation, how we approached this problem from first principles in machine learning. So look at the question here: we have this math problem, we sample responses multiple times, and finally we see that the most frequent answer is 18. Note that what we take here is not the most frequent reasoning path but the most frequent final answer; that's a huge difference. The reasoning path here is a latent variable. This idea is so simple, and by using self-consistency we simply crushed the SOTA results in the literature at that time. I'd say that when doing research, it's really about the idea; you don't have to know a lot. And of course, our explanation of self-consistency is about probability; it's about sampling. More consistent answers are more likely to be correct: look at this plot, when the consistency is more than 80%, the accuracy is nearly 100%. Okay, a quiz: when the LLM outputs a direct answer, without intermediate steps, should we still sample several times and then choose the most common answer? Anyone? Right, in that case the answer is just one token, so this simply recovers the token with the maximum probability. And a further question: what if we changed self-consistency by asking the LLM to generate multiple responses in one pass, instead of sampling multiple times, and then chose the most common answer; does that make sense? No. Great, the answer is no.
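A minimal sketch of the self-consistency recipe just described, sample several reasoning paths and majority-vote only on the final answers; sample_llm and the answer-extraction heuristic are placeholders I made up for illustration:

```python
from collections import Counter
import re

def extract_answer(response: str) -> str:
    """Toy heuristic: take the last number in the response as the final answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else ""

def self_consistency(prompt: str, sample_llm, n_samples: int = 20) -> str:
    # Sample multiple reasoning paths (temperature > 0), keep only the final answers,
    # and return the most frequent one; the reasoning paths are marginalized out.
    answers = [extract_answer(sample_llm(prompt)) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```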
And for both of those quiz questions, we just need to follow this principle: maximize the probability of the final answer given the problem. That's all you need to understand self-consistency; it's a very, very simple principle. In probabilistic machine learning, this is known as maximum marginal inference. One more thing: how about free-form answers? We proposed an extension called universal self-consistency, which asks the LLM itself to select the most consistent response among the samples. The idea is a little bit different, but related, so I put it here. Given this problem, where do people drink less coffee than they do in Mexico, if you look at the sampled answers, each response is worded differently, but the most common response covers Japan, China, and India. Any questions? Otherwise let me move to the next section. To recap: self-consistency samples multiple responses and then chooses the most frequent answer as the final answer. Next, I will talk about limitations. The first one: LLMs can be easily distracted by irrelevant context. From psychology studies, we know that irrelevant information can significantly decrease the problem-solving accuracy of children and even adults. We wanted to check whether this observation also holds for LLMs. Here are some problems. The highlighted text is manually added irrelevant context, for example a sentence about $10 that has nothing to do with the original problem, and you can see that after adding it, the model produces a wrong solution. Interestingly, if we add an instruction like "ignore the irrelevant context", the model immediately notices the distraction and gets the problem right. But it is still hard to fully fix this when the irrelevant context gets large: even if we simply add irrelevant sentences like "the sky is blue and the grass is green" and make the input long enough, you will see a significant performance drop across all LLMs. The next thing I'm going to talk about is that LLMs cannot self-correct reasoning yet. Let's start from a math problem; this one is actually a bit tricky if you look at it. The model gave a wrong answer, and then we prompt the model with "review your previous answer and find problems with your answer". Interestingly, after reviewing, the model recognized the mistake. That sounds amazing, right? Then we prompt again: "based on the problems you found, improve your answer", and the final answer here is correct. However, if the original answer was already correct and we apply the same prompts, the model can change it into a mistake. That's the problem. So overall, allowing LLMs to review their generated responses can help correct inaccurate answers, but it also risks changing correct answers into incorrect ones. We ran extensive studies on benchmarks like GSM8K, CommonsenseQA, and HotpotQA, and we didn't see any improvement from self-correction methods; they just make things worse. Then how do you explain the improvements reported in the literature? If you look at those reported improvements in reasoning, they actually use oracle answers: oracle means you only prompt the LLM to correct its answer when the answer is wrong. The problem is that in practice the model doesn't know whether its answer is correct or wrong; in the oracle setup you told it, because you already knew the answer was wrong.
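For concreteness, here is a minimal sketch of the two-step self-correction prompting discussed above, with no oracle feedback; query_llm is a placeholder, and the prompt wording follows the phrasing used in the talk:

```python
def self_correct(problem: str, query_llm) -> str:
    first = query_llm(f"{problem}\n\nSolve the problem, showing your reasoning.")
    critique = query_llm(
        f"{problem}\n\nYour previous answer:\n{first}\n\n"
        "Review your previous answer and find problems with your answer."
    )
    revised = query_llm(
        f"{problem}\n\nYour previous answer:\n{first}\n\nYour review:\n{critique}\n\n"
        "Based on the problems you found, improve your answer."
    )
    # As discussed above, without oracle feedback this can also turn a correct
    # answer into an incorrect one.
    return revised
```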
There is also the idea of using multi-agent debate to mitigate this: you set up multiple LLM agents, let them debate each other, and try to reach a consensus. We tried this approach too. What actually matters is how many responses are generated in total: for example, with three agents where each generates one response per round, one round produces three responses and another round of debate brings the total to nine. So the fair comparison is self-consistency with the same number of responses, and let's see what happens. We found that multi-agent debate cannot outperform self-consistency, and self-consistency is much simpler: just sample multiple times and take the most frequent answer as the final prediction. So the lesson we learned here is that oracle feedback is needed for LLMs to self-correct. A related point from our work on self-debug: for coding problems, you naturally have unit tests as the oracle. Actually, we started that work quite early, and we didn't frame it as a self-correction work. And finally, let's move to the last topic: premise order matters in LLM reasoning. You know, these days we constantly see new reports showing great results on benchmarks, but given how today's models are trained, they may well have seen those benchmark problems, so such numbers can be misleading. So one thing my team does is to generate variations of existing tasks to test the models, as in the small sketch after this paragraph. Here we just did a simple trick: given an original GSM8K problem, we reorder the sentences a little bit and see whether the model can still solve it. For example, in this original problem there is a sentence saying that she loses ten of them on the way home; we simply move this sentence toward the end and see what happens. And we noticed a drop of about ten points in solving rate across all frontier LLMs. If you compare the responses on the original problem and on the reordered problem, you see that the model essentially only knows how to solve the problem sequentially, following the order of the sentences; it cannot go back and forth. One could say that maybe this is related to semantic understanding rather than reasoning, so we designed another task based on logical inference, which is purer than the math problems; we don't even use real words, just random tokens. Given the rules and the facts, the model does logical inference to answer the query. For the original problems, the rules are ordered according to their use in the inference process, and by the way, not all rules are necessary for answering the query. Then we reorder those rules; note that we only reorder the rules relevant to the query, and the irrelevant rules keep their original positions. Surprisingly, we again saw a significant performance drop across all frontier LLMs. From my personal experience, I think it's really important to carefully design your own experiments when doing research. Okay, now let me summarize the points here. The first one, the most important, is that generating intermediate steps improves LLM performance a lot. You can do that via training, fine-tuning, or prompting with intermediate steps, via zero-shot instructions, or with some kind of special decoding like the chain-of-thought decoding I presented today. Also, self-consistency greatly improves step-by-step reasoning, no matter whether you start from a fine-tuned model or a pretrained one. And we also saw a number of limitations: irrelevant context, self-correction, and premise order.
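As a toy illustration of the reordering probe described above, here is a small sketch that shuffles the premise sentences of a word problem while keeping the question fixed; the problem text and query_llm are placeholders, not examples from the paper:

```python
import random

premises = [
    "Alice has 3 apples.",
    "Bob has 2 more apples than Alice.",
    "Bob gives 1 apple to Carol.",
]
question = "How many apples does Bob have now?"

original = " ".join(premises) + " " + question

reordered_premises = premises[:]
random.shuffle(reordered_premises)
reordered = " ".join(reordered_premises) + " " + question

# answers = query_llm(original), query_llm(reordered)
# Comparing accuracy over many such pairs exposes the sensitivity to premise order.
```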
All of those matter for reasoning performance. So, to close, what will probably come next? I think the most interesting thing is not to say we will work out AGI or solve AGI. The point is to find the right problem to work on and to solve it from first principles, not just to make incremental improvements. And lastly, machine learning is still super important here. Also, I'm currently involved in COLM, the Conference on Language Modeling, together with a bunch of amazing people; it's the first conference dedicated to language modeling. You're all welcome to join. That's it. Thanks.