Explain Format | Joe Ton

Human contribution

Speaking the full draft out loud

AI assistance

Speech-to-text dictation (Grok)

Ramble

Alright, so I’m gonna try to go through this process again and try to give my best thoughts. Right now, I’m struggling with figuring out a solution to make this as transparent as possible. One of the things that I’m thinking about doing is moving to Twitch and YouTube Live and just having people watch me as I’m doing this. What I found is that the vast majority of online content feels like duplicate work, a lot of it seems unoriginal, and the optimization techniques that are used are not that powerful. And I want to figure out a better solution than that. It seems like this is compounded by AI slop, and some other things as well. So, for now, the best solution I can think of is to document as much as possible and to be as real as possible. And, you know, probably work on articulating my thoughts better in a way that would serve a larger audience.
And probably the best way to go about that is to really hone the skill of decomposing complex ideas into simple terms, and in addition to that, provide memorable analogies that are everyday and simple to digest for the audience. So I need to do that over and over again, especially if I’m breaking down an idea that’s very complex, especially in the small world of AI systems performance engineering, where I need to marry the hardware, software, and the AI research together. And then of course we’ve got to apply this to hardware and benchmarks and give an honest opinion about what’s going on.
So, one of the things I’m taking a look at is the MI300 from AMD. I signed up for their AMD Dev program, and it looks like I’m given maybe a hundred dollars worth of free credit to run the MI300, and there’s maybe a possibility of getting additional credits if I do a good job. I know these things can be expensive, so maybe that’s part of the reason why a lot of people don’t do this.
Of course, Nvidia is the other reason: they have a mature environment with CUDA. So, I think the best approach here is to try to be as realistic as possible. Don’t do the fake-it-till-you-make-it kind of thing. It’s kind of weird because I think that in this world of AI slop, the issue that arises over and over again is, can you produce an original thought without the assistance of AI? And that seems almost impossible because you need to keep up with people. Even in the world of prototyping and hardcore coding, there is some use of AI for sure. So, there needs to be some kind of hybrid approach, while also staying grounded in reality. And I’m struggling to figure out the best solution for that.
But what comes to mind is just to be as thoughtful towards the listenership as possible. To respect their intellect. And in addition to that, also be able to go from zero to a hundred. For instance, when I explain an idea to colleagues, I oftentimes lead with the reason why certain things are the way they are. And if that doesn’t stick, then I would use an example to walk through the idea. The issue that comes to mind is that you have to lower your communication to the common denominator. And the common denominator for people who read my blog or listen to my ideas is that they may or may not have any experience with these kinds of things. And everyone starts kind of that way. So, I’m trying to figure out a good balance.
I think that the best approach probably starts with bringing the complex idea into simple terms. Kind of like what they talk about with the Feynman technique, but put it in the words that you know. So don’t try to be someone you’re not. And don’t try to throw in big words. But try to explain it in a way that feels like you can react to it in a simple way. So that’s one. The next step probably is to figure out a way to explain things in a very clear way.
And the only thing I can really think of is providing a very good reason for the purpose of what you’re trying to explain. Which in itself is like selling an idea because there are, of course, many reasons why something exists. But oftentimes, especially the more senior you are as an engineer, you have to sell things. And you can’t just sell many different things because it kind of dilutes the idea itself, which you’re trying to help the customer with. So probably the best way to sell an idea is just to focus on or optimize one thing. Which in itself is very difficult for an engineer because an engineer who thinks about the problem oftentimes needs to think about all the contingencies, all the trade-offs, all the different things that need to happen. But if you’re talking to a non-technical person, if you’re talking to the leadership or to the customer or whatever, you need to convey some level of confidence. In addition, you need to make clear what the solution is. And maybe you would have two solutions at most, but that’s hard enough. Understanding one solution well is very, very hard.
So, to make this idea as clear as possible, I think that in terms of explaining something to a live audience, like on Twitch, for example, or YouTube Live, ensure that you can explain it in simple terms first, and only optimize for the most important thing. So, for example, if I were to explain KV cache, which is a very big thing in inference right now, I guess I would explain it as some form of container that the key and value tensors for each token go into, and it gets filled up very, very quickly the more you use inference. So my focus there is to explain that this container, this KV cache, which lives in HBM, gets filled up very, very quickly. And as a result, we need to figure out optimization techniques that prioritize reducing the size or the number of tensors that go into the HBM.
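To put rough numbers on how quickly that container fills up, here is a minimal sketch in Python. It uses the usual back-of-the-envelope accounting for a transformer decoder: one key tensor and one value tensor per layer, per token. The function name `kv_cache_bytes` and the model shape (32 layers, 8 KV heads, head dimension 128, 2-byte values) are assumptions picked for round numbers, not measurements from any specific model or from the MI300.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate KV cache size in bytes (hypothetical helper, not a library API)."""
    # The leading 2 counts one key tensor plus one value tensor per layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=1)
one_request = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=8192)
sixteen_requests = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=8192,
                                  batch_size=16)

print(per_token // 1024, "KiB per token")                   # 128 KiB per token
print(one_request // 2**30, "GiB for one 8192-token request")   # 1 GiB
print(sixteen_requests // 2**30, "GiB for 16 such requests")    # 16 GiB
```

The point of the sketch is the shape of the growth: the cache scales linearly with both sequence length and the number of concurrent requests, which is why longer contexts and bigger batches eat into HBM so fast.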
And there is help on the way, right, from a hardware perspective. But even then, you have to consider that there will be more KV tensors in the future. As our standard for AI increases, we’re probably expecting better results, and therefore we’re probably expecting more tokens. So, in a way, you know, it’s a continuous battle of the hardware catching up to the software slash AI research.
With that out of the way, there are three layers on which I’m trying to improve my communication when it comes to explaining something. The first layer is probably what we just talked about. And this is just me freeballing here. But the structure is, first explain it in simple terms. Almost point at it. Kind of like explaining to a blind audience. Here’s the idea. Here it is in simple terms. And here’s what we’re gonna focus on. So that’s the first layer.
The second layer is probably that you need to expand upon it as much as possible, without including other stuff that is irrelevant to understanding it, or learning about it, or doing critical thinking. The reason being that it’s hard enough to understand something, and including unnecessary pieces of information can make it overwhelming. You can reach cognitive overload very, very quickly. So, being able to connect the idea and have it in a concrete, simple form is going to be crucial here. So when I talk about the second layer, probably the best approach would be something to the tune of providing a reason, an example, and an analogy. If you want to condense it down, it would be the letters R E A. R for reason, E for example, A for analogy. And the reason why we want those is that it seems like the most complete way to explain something from end to end. The reason gives it purpose, so you’re not just walking in thick fog trying to understand where you’re supposed to go. It’s kind of like the light at the end of the tunnel. There’s some purpose to why it needs to exist.
So, that’s providing the reason first. And then the next thing is the example, because it’s not enough to be able to see the light at the end of the tunnel. You need to be able to walk through the tunnel. So, what this means is that you need to provide a concrete example that instantiates it in my mind or your mind. So, I think providing an example is very important. And if you’re talking to peers, you’re probably moving back and forth on that. Providing a reason and providing an example.
For the uninitiated folks, when they’re learning something, especially if you’re doing this live, you need to be able to provide an analogy or a simple metaphor. And this really drives it home because the mind itself is kind of like a compression system. And it’s trying to connect the dots. And the way you connect the dots is you need to be able to bridge ideas. And the bridge itself is essentially metaphors or analogies. So, for some reason, the brain really loves analogies, really loves something that just highlights a key characteristic and applies that to a simple idea. For whatever reason, the brain loves that. So that’s very important. So, just to recap, the second layer would probably have to do with providing a reason, an example, and an analogy.
The third one is probably a call to action. So in this structure, the first layer is where you kind of point at it and explain it simply. The second layer is probably explain it thoroughly. And the third layer is kind of like a call to action slash bridging to other ideas. So, when we look for discoveries in science, what we’re trying to do is bridge them with other ideas. And a discovery in itself probably needs to, in some way, connect with other ideas. It’s not an isolated activity. So, what needs to happen here is that you need to take that node in the graph and connect it with something else. Another way of initiating this would be something like asking yourself, how does this relate to something I already know?
And what are the other things? And how does it connect? And why is it important for this and that? So you need to explore it and broaden the idea to connect it and help instantiate it long-term. The more you connect it with other ideas, the more it’s like having multiple memory hooks.
And then the last part of the third layer is probably something to do with trade-offs. Because, at the end of the day, you know, when we talk about something as complex as engineering and hardware, software, or AI research, it needs to come from the mindset that there isn’t one singular solution. It needs to be about trade-offs. Because the reality is that everybody has different priorities. The customer has different priorities. The engineers have different priorities. The business folks have different priorities. You need to balance that. And unfortunately, you know, when we talk about AI, sometimes it comes down to taste. And taste, for whatever reason, seems like a human activity. It’s like, regardless of which idea is optimal, here’s my preference. And you have to account for that. Because at the end of the day, when we talk about user experience, we’re not talking about it in terms of the AI experience. We’re talking about it as a human experience. And humans have flaws. Humans have biases. And humans have different preferences and different priorities that they’re attuned to. So, with that, you need to make a decision about trade-offs. So probably the third layer is just connecting the dots and seeing what’s out there in terms of a solution and understanding the trade-offs that you need to make as an engineer.
So that’s kind of my framework. I’m open to changing my thoughts and my approach about that. I’ve done it enough times that I feel confident proposing this idea of three different layers where I have, let’s just say, the first layer, which is the simple definition.
You just point at it, you emphasize the main point in simple terms, and you move on. And then the second one is just REA, which is reason, example, analogy. And then the third layer is just, can you connect it with something else? Can you explore that? In addition to that, can you provide multiple solutions with different trade-offs?
So that is it for now. I wonder how I should distill this idea. And how I want to present this. So, if I were to go to my blog, go to Jotun blog, and I need to write this up. How would I write this up? I would go to data, and then I would go to blog, and then I need to name this somehow. So, what do I name this? Well, okay, so before I name this, what’s particularly interesting here is that we have different .md files. I wonder if there’s a more effective way to structure my blogs. Maybe this is something I can explore later.
But for now, I’m just gonna post this and kinda give the framework or the structure in which I plan to communicate my thoughts. And then probably I need to put more emphasis on analogies in my explanations. That’s probably my thought process. And then I need to figure out a better structure for how I communicate going from paper to hardware benchmarking. So, I probably want to figure out a way to automate this a little bit faster as well. Alright, I’m gonna stop there and just take in this information. And see where it goes. And then I’m just gonna post this.