
79 minutes

Data Science, Artificial Intelligence, Deep Learning

SDS 831: PyTorch Lightning Lit-Serve and Lightning Studios, with Dr. Luca Antiga

Podcast Guest: Luca Antiga

Tuesday Oct 29, 2024

Subscribe on Website, Apple Podcasts, Spotify, Stitcher Radio or TuneIn


PyTorch Lightning is revolutionizing the AI landscape, and Dr. Luca Antiga, CTO of Lightning AI, joins host Jon Krohn to explain how. In this episode, they explore the tools pushing AI development forward, from Lightning Studios to Lit-Serve, and discuss the game-changing rise of small language models that challenge industry giants with precision and speed. Luca also shares his vision for developers in an AI-enhanced world, where coding meets creativity and collaboration with intelligent tools. 


Thanks to our Sponsors:





Interested in sponsoring a Super Data Science Podcast episode? Email natalie@superdatascience.com for sponsorship information.

About Luca Antiga 
Luca, CTO at Lightning since 2022, was an early contributor to PyTorch core and co-authored “Deep Learning with PyTorch” (published by Manning). He started his journey as a researcher in bioengineering, and later co-founded Orobix, a company specializing in building and deploying AI in production settings.

Overview
In this episode, Luca highlights how Lightning AI is accelerating AI development with its suite of accessible, open-source tools, including PyTorch Lightning, Lightning Studios, and Lit-Serve. These tools streamline complex workflows, making it easier for developers to build and deploy advanced models. As AI adoption grows, this user-centric approach helps simplify processes that were previously reserved for experts, making high-performance AI more accessible and efficient than ever.

Luca also sheds light on the potential of small language models, which, despite having fewer parameters than large language models (LLMs), are quickly closing the gap. These compact models can be optimized for specific tasks, delivering comparable performance to larger models but with significantly reduced costs and faster inference times. With ongoing improvements in model architecture and training strategies, these smaller models may soon rival LLMs for various specialized applications, opening new doors in AI technology.

Looking forward, Luca discusses the evolving role of software developers as AI becomes more integrated into the field. While AI tools will augment developers’ work by handling repetitive tasks and suggesting solutions, they won’t replace the nuanced skills developers bring, such as creative problem-solving and systems thinking. Instead, as AI assistants continue to advance, the most valuable skills for developers will be the ability to collaborate with both AI tools and other developers effectively, blending technical acumen with strategic insight to push the boundaries of what’s possible in AI-driven development.

In this episode you will learn:
  • How Lightning AI's open-source tools make AI development faster [11:30]
  • The rise of small language models and how they'll rival LLMs [37:47]
  • Luca's journey from biomedical imaging to deep learning pioneer [52:03]
  • How AI will transform software developer tasks [1:03:05] 

Jon Krohn: 00:00:00
This is episode number 831 with Dr. Luca Antiga, CTO of Lightning AI. Today's episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by ODSC, the Open Data Science Conference. 

00:00:13
Welcome to the Super Data Science Podcast, the most listened to podcast in the data science industry. Each week we bring you inspiring people and ideas to help you build a successful career in data science. I'm your host, Jon Krohn. Thanks for joining me today. And now let's make the complex simple.

00:00:50
Welcome back to the Super Data Science Podcast. Today we are fortunate to have Dr. Luca Antiga on the show. Luca is CTO of Lightning AI, the folks behind the wildly popular open source deep learning framework, PyTorch Lightning. Lightning AI is also one of the world's hottest startups developing AI tools; they've raised over $80 million in venture capital to fulfill their mission. Luca, in addition to being CTO of Lightning AI, is also CTO of Orobix, an Italian AI services company that he co-founded 15 years ago. He holds a PhD in biomedical engineering from Politecnico di Milano.

00:01:24
In addition to being a legendary AI executive and open source contributor, Luca co-authored the book Deep Learning with PyTorch, which the creator of PyTorch, Soumith Chintala, considers to be the definitive treatise on the PyTorch library he created. So I will personally ship five physical copies of this great Deep Learning with PyTorch book to people who comment on or reshare the LinkedIn post that I publish about Luca's episode from my personal LinkedIn account today. Simply mention in your comment or reshare that you'd like the book. I'll hold a draw to select the five book winners next week. So you have until Sunday, November 3rd to get involved with this book contest.

00:02:01
Today's episode will probably appeal most to hands-on practitioners like data scientists, software developers, and ML engineers, but any tech savvy professional could find it valuable. In today's episode, Luca details how Lightning AI's suite of open source tools is making AI development faster and easier. He also talks about the rise of small language models and their potential to rival LLMs on many tasks. He talks about his journey from biomedical imaging to deep learning pioneer, and he gives us his thoughts on how software developers' work will be transformed by AI in the coming years. All right, you ready for this exciting episode? Let's go. 

00:02:40
Luca, welcome to the Super Data Science Podcast. We're so excited to have you on the show. How are you doing today? 

Luca Antiga: 00:02:46
I'm doing great. Hey, Jon. Thanks for having me. 

Jon Krohn: 00:02:49
Yeah, it's my pleasure. And where are you calling in from today, Luca?

Luca Antiga: 00:02:53
I'm calling from Bergamo. It's a city in the north part of Italy, not so far from Milan. People may know it because if they're from Europe and they travel with Ryanair, Bergamo is the Ryanair hub for Milan. So you usually land in Bergamo. See Bergamo, come visit Bergamo. It's a great city. 

Jon Krohn: 00:03:12
How far is it actually? Is it one of those Ryanair things where you're like, "Man, this is really far from the city I wanted to go to"? 

Luca Antiga: 00:03:19
Yeah, kind of. It's 40 minutes, but it could take anywhere from 40 minutes to two hours plus. I don't know. It depends on the time of day. 

Jon Krohn: 00:03:30
Depending on traffic. 

Luca Antiga: 00:03:32
Yeah, lots of it. 

Jon Krohn: 00:03:34
Something I was surprised about with Milan, and I don't know what it's like in Bergamo, but I spent a week in Milan a year ago and I was surprised at the levels of pollution all the time. I mean, there's all kinds of positives. Maybe I should have said that first, because some amazing food. Though interestingly, my best food experiences in Milan were actually relatively expensive. Say my first slice of pizza that I bought in... I can't remember the region now. It was a very central region where there's lots of art. 

Luca Antiga: 00:04:07
Brera? 

Jon Krohn: 00:04:08
Yes, exactly. And I walked into a random place, had a slice of pizza, and I was like, "Wow, this is insane. My whole Milan trip's going to be crazy with food." And then a lot of the really nice restaurants I went to, I was like, "Eh..." 

Luca Antiga: 00:04:21
Yeah. I see, I see. Well, pollution wise, yeah, I hear you. Personally, very personally, Milan is great and everything, but it's not my favorite place actually. I think Milan has a few fundamental issues. And it got better. Now I see it could get even better, right? I'm traveling a lot, and in some respects I think we should, yeah, do a bit better in multiple ways.

Jon Krohn: 00:04:57
Well, the thing that I loved about Italian culture in general was seeing so many people filling the restaurants every evening and talking for hours and hours. 

Luca Antiga: 00:05:08
Oh yeah. 

Jon Krohn: 00:05:09
And that's something any other culture I've ever been associated with could learn from and do better. It seemed like people were really enjoying each other's company and appreciating each day with them. 

Luca Antiga: 00:05:22
Yeah, I think it also changes a lot with the region. So Italy is not a country that has existed in its current form for a very long time; it's only a bit over 150 years. And so you see different vibes from very different parts of Italy. And the language itself, the dialects, if you're into spatial statistics, it's kind of continuous variations on the 2D plane, where if you move in one direction, you get elements from all the neighboring things. And it's crazy. If you move 30 kilometers, 40 kilometers, you get different words, you could get sometimes different food. And so even the vibe of going out and being out. I come from Lake Garda, close to Verona, originally, and people there go out all the time. They do aperitivo all the time. Here in Bergamo, it's kind of a more recent thing. When I moved here originally like 25 years ago, even getting an aperitivo at like 8:30 PM wasn't that common. I was struggling to get that. And now everything changed, also thanks to the airport hub. 

Jon Krohn: 00:06:44
Oh, I was going to joke thanks to Ryanair. 

Luca Antiga: 00:06:48
It's actually true. Maybe not exclusively thanks to that, but you could tell the difference. A lot more tourists came in and the city transforms itself. 

Jon Krohn: 00:06:57
It's like a Ryanair thing. They're like, "The travelers are expecting aperitivo at 5:00 PM Bergamo. We said that this was Milan, so you're going to have to do aperitivo like Milan." 

Luca Antiga: 00:07:07
Yeah, yeah. Well, I think it's jokingly true, and it's true for the better. I think right now definitely visit there. 

Jon Krohn: 00:07:15
And so for our listeners aperitivo are, I mean you could explain it better than me, but my impression or my experience of it was kind of, at least at my time that I was there in November last year, same time of year as this episode is going to be released, in Milan this is around the time that the sun was setting, so kind of 5:00 PM, and it still is pretty warm. There was some days that were very warm, I could just wear a T-shirt, which was nice in November compared to every other European city I'd been in. It was very nice and warm and sunny relative to... I was also in Berlin, Amsterdam, Paris, and those places were all cold and dark. And so I was like, "Oh, Milan, it's so nice and warm." And then at 5:00 PM, just as the sun is starting to get a little bit lower in the sky in that time of year, everybody, well, maybe not everybody, a lot of people are out on the street, like bars on the street having a light cocktail, like a spritzer kind of thing. 

Luca Antiga: 00:08:11
Yeah. Exactly. 

Jon Krohn: 00:08:12
And nuts and olives. 

Luca Antiga: 00:08:14
Yeah, exactly. And then it can be even a bit later, I think 5:00 is kind of early for aperitivo. Usually you get more into the 6:00s and 7:00s because people go out from work. When I was a kid actually on Lake Garda, typical dads would take their city bikes and go on Sunday morning to the center of the town to get aperitivo and then come back a little bit zigzagging their way home. It was very, very typical. 

Jon Krohn: 00:08:51
Also, I love that, that you said city bikes. So it implies that in the community you grew up in, you kind of had this arsenal of different cycles for different scenarios.

Luca Antiga: 00:09:02
When I was a kid, alas, it was a lot of BMX and Stranger Things kind of vibes where bikes were concerned. It was the '80s. But dads used to have these very large, old-style Bianchi city bikes, and I have one right now. And so I turned into one of them actually. 

Jon Krohn: 00:09:26
Nice. And then would you maybe potentially also have a road bike for going up and down the mountains? 

Luca Antiga: 00:09:31
Yeah, especially here in Bergamo, it's quite common because there are a lot of mountains around, so a lot of woods. It's great. 

Jon Krohn: 00:09:38
Very cool. Yeah, a lot of the world's best cycle companies are based in the Italian Mountains for sure. 

Luca Antiga: 00:09:44
Correct, correct. 

Jon Krohn: 00:09:45
Nice. All right, so this has been super fun and super interesting, but except for your talking about this kind of 2D plane and variation over that plane, we haven't done much related to data science yet, so maybe we should dig into that. 

Luca Antiga: 00:09:56
Let's do it. 

Jon Krohn: 00:09:57
So you are the CTO of Lightning AI, the makers of AI Studio, PyTorch Lightning, and many more open source products. We're going to talk about a lot of them in this episode. And we did have another episode earlier this year with Sebastian Raschka. I don't know if you knew that, Luca. So we had episode number 767, which sounds also like a plane, and in 767 Sebastian talked a lot about the different kinds of open source products that PyTorch Lightning creates.

Luca Antiga: 00:10:27
Yeah. 

Jon Krohn: 00:10:30
Sorry, that Lightning AI creates.

Luca Antiga: 00:10:34
I said yeah, you know because... 

Jon Krohn: 00:10:38
It kind of evolved from that, right?

Luca Antiga: 00:10:39
Yeah, absolutely. Yes. 

Jon Krohn: 00:10:41
So PyTorch Lightning was kind of the starting point for Lightning AI.

Luca Antiga: 00:10:44
Correct. 

Jon Krohn: 00:10:44
And then once Lightning AI was established as a separate company based largely around that open source PyTorch Lightning project, lots of additional projects have come out of it. And we enumerated those and went into a lot of them in detail in episode 767 earlier this year. So we have different questions and topics planned for you. We're not going to repeat that kind of thing. But of course we'd love to hear your take on these products as we go through. 

Luca Antiga: 00:11:10
For sure. 

Jon Krohn: 00:11:12
You were actually an early contributor to PyTorch Core, which might've kind of been where some of your interest in PyTorch came from. And you wrote a book, Deep Learning with PyTorch, which was with Manning Publications, if I remember correctly. 

Luca Antiga: 00:11:26
Yeah, yeah. 

Jon Krohn: 00:11:27
And so yeah, can you elaborate a little bit on the suite of products that Lightning AI offers and what kinds of companies, what kinds of teams, what kinds of data scientists that we have listening on this show should these different products interest?

Luca Antiga: 00:11:45
Right, right. Yeah, absolutely. So PyTorch Lightning was born many years before it was released, created by William Falcon, the CEO and founder of Lightning AI. And it was officially released as open source in 2019. Back then, it was already kind of battle tested because William created it while he was interning at... It was actually at FAIR, the Facebook AI lab, and he was doing a PhD at NYU so he had access to a ton of GPUs, and so they were doing self-supervised learning mostly in the vision space. And the need there was to train over thousands of GPUs, which sounded crazy back then and now it's like a medium-sized training job. But a lot of the struggle about setting up things that would work seamlessly on a single node and multiple nodes, sampling data correctly, setting things up correctly so that you don't shoot yourself in the foot and you can focus on the modeling task and so on. All these things were battle tested in PyTorch Lightning back then. 

00:13:03
So when it came out, it immediately resonated with the people that were doing small and large training jobs. And incidentally, I got to know PyTorch Lightning before I got to know William because my company in Italy, Orobix landed on PyTorch Lightning to standardize their training code. And this happened over and over so that now we are at 150 million downloads and 9 million downloads per month, which is very large. 

Jon Krohn: 00:13:36
It's wild. 

Luca Antiga: 00:13:37
Yeah, it's very wild. And in fact, PyTorch Lightning became one of the ways PyTorch lands in organizations because PyTorch Lightning doesn't wrap PyTorch. So if you don't know PyTorch Lightning you may think that it is a wrapper around PyTorch, but you still write your model in pure PyTorch. It just organizes your PyTorch code so that if you break it up into hooks, into different life cycle moments, and have every piece of code organized for, "I need to instantiate optimizers, I need to run the training step," and so on, then the trainer, the PyTorch Lightning trainer, will know what to call when.

00:15:13
So it can take care of a lot of engineering aspects and then calling to your code when it's the right time. And that takes away a lot of surface area for mistakes. That's ultimately what PyTorch Lightning is good at. Because you don't need to know what you shouldn't be worried about, unless you want to really get into the details, which you still can. But for the most part, you should focus on what's your task. And as a researcher or a person at a company, I want to solve my problem and not necessarily everyone has to be an expert at distributed training, right?
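To make the hook structure Luca describes concrete, here is a minimal sketch of how a model is typically organized with PyTorch Lightning. The tiny classifier, learning rate, and my_dataloader name are illustrative placeholders, not anything from the episode:

import torch
from torch import nn
import lightning as L  # recent PyTorch Lightning releases expose this API under the `lightning` package

class LitClassifier(L.LightningModule):
    def __init__(self):
        super().__init__()
        # The model itself is plain PyTorch; Lightning organizes the code rather than wrapping the framework.
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))

    def training_step(self, batch, batch_idx):
        # Hook: the Trainer calls this at the right moment of the loop, on the right device.
        x, y = batch
        loss = nn.functional.cross_entropy(self.net(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        # Hook: where optimizers (and schedulers) are instantiated.
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# The Trainer owns the engineering concerns: devices, precision, multi-node strategy, checkpointing.
# trainer = L.Trainer(max_epochs=3, accelerator="auto", devices="auto")
# trainer.fit(LitClassifier(), train_dataloaders=my_dataloader)  # my_dataloader: a hypothetical DataLoader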

Jon Krohn: 00:15:03
Keith McCormick, the data scientist, LinkedIn Learning author, and many-time guest on this podcast, most recently in Episode #828, will be sharing his “Executive Guide to Human-in-the-Loop Machine Learning and Data Annotation” course this week. In this course, Keith presents a high-level intro to what human-in-the-loop ML is, which will be intriguing even for consumers of AI products. He also introduces why data professionals need to understand this topic for their own AI projects, even if they delegate data annotation to external companies. You can access the new course by following the hashtag #SDSKeith on LinkedIn. Keith McCormick will share a link today, on this episode's release, to allow you to watch the full new course for free. Thank you, Keith! 

00:15:53
Yeah, yeah. So to kind of recap that back to you, PyTorch Lightning is an open source framework for Python that works with PyTorch, the very popular deep learning library. And PyTorch Lightning allows you to avoid mistakes as you train models, as you deploy models, as you parallelize model training or deployment across many GPUs, which is common, especially with large language models, which are very large deep learning models that probably most people who've listened to the show have already heard of, things like GPT-4, and the Llama architectures in terms of open source ones. And so PyTorch Lightning supports that whole ecosystem. And yeah, it's one that I love, one that I've taught on. Actually, when you and I met, Luca, the only time that we've met in person, was at ODSC East in Boston earlier this year, the Open Data Science Conference. And one of the things that I was doing at that conference was a day-long training on large language models featuring, yeah, PyTorch Lightning.

Luca Antiga: 00:16:54
Yeah, I remember. Yeah, yeah, that was great. And then out of there, this vibe of allowing teams to iterate as quickly as possible, leaving some of the details to when you actually need to care for those, but then allowing you to iterate on what you need to do has become kind of the vibe for whatever Lightning AI does. And that applies to studios because once you have your training code that is easy to manage, easy to scale, you need to make it run somewhere. And so the problem of getting access to compute, getting data to move quickly and seamlessly, having access to multiple machines as easily as you would have access to your laptop is what Lightning Studios is for. So the same mental model that you need to have about PyTorch Lightning simplifying distributed training you can use when you think about Lightning Studios and the way it simplifies using cloud computing resources. So we make it as easy as it is to use your laptop, but now you have a thousand laptops that can work in parallel because you have access to the cloud. 

Jon Krohn: 00:18:07
Yeah, and something that I am going to absolutely be doing the next time I offer that same training, which I haven't yet again this year, is updating everything that I was doing. The kind of final module that I cover in my training involves parallelization. And it sounds like Lightning Studios is going to be by far the easiest way for me to be demonstrating that to our audience and having them hook in. 

Luca Antiga: 00:18:29
Yeah, absolutely. And then it is not just for training, right? Right now we're discovering things in a chronological order. But Lightning Studios is really a development and production environment for building AI from training to deployment, to building compound systems. Its sweet spot is building AI systems in general, and there's a lot of need for that. And I think we're offering a very, very smooth experience. And we're rolling out new features all the time, so it's a great time to get into it and try it out. It's very self-serve. You can sign up, get three credits. So we are seeing a lot of people doing that. And as far as the rest of the ecosystem, you don't have to use PyTorch Lightning on Studios and you don't have to use Studios to be able to use PyTorch Lightning, but if you use them together, it is kind of magical. You can just scale seamlessly.

00:19:27
Same thing for LitServe, which is kind of the PyTorch Lightning for serving that we launched earlier this year. It's already being used by a lot of people. It's a very simple framework for serving, kind of like TorchServe, but it's very, very simple. Its internals are as tight and minimal as possible, and that allows contributors to come in and to make it evolve easily. At the same time, it's very fast. And so it's generally faster than TorchServe from our benchmarks. Not because we want to compete, but just because you don't have to worry that it being simple means that you are leaving performance on the table. That's not the case. And so I think that's a very nice way to have people building models or building systems, serve them through an API without having to worry about, "Oh, how do I manage workers? How do I serialize data?" And so on. You need to just implement a few hooks. Again, same thing as before. And then we give you the performance and reduce the surface area for mistakes.
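For a sense of what "implement a few hooks" looks like in practice, here is a rough LitServe sketch following the hook pattern Luca describes. The trivial squaring "model" and the port are placeholders, and the class and method names are taken from LitServe's documented examples at the time of writing, so treat this as a sketch rather than a definitive reference:

import litserve as ls

class SquareAPI(ls.LitAPI):
    def setup(self, device):
        # Runs once per worker: load the model here. A trivial stand-in "model" for illustration.
        self.model = lambda x: x ** 2

    def decode_request(self, request):
        # Pull the input out of the incoming JSON payload.
        return request["input"]

    def predict(self, x):
        # Run inference; LitServe manages workers and batching around this hook.
        return self.model(x)

    def encode_response(self, output):
        # Shape what goes back to the client.
        return {"output": output}

if __name__ == "__main__":
    server = ls.LitServer(SquareAPI(), accelerator="auto")
    server.run(port=8000)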

Jon Krohn: 00:20:50
Nice. And it slipped past me as you were describing this, all of this that you're describing here, this most recent product, what's that called again? 

Luca Antiga: 00:20:56
LitServe. 

Jon Krohn: 00:20:58
LitServe, LitServe, LitServe. Yes, yes, yes, yes, yes, of course. And so- 

Luca Antiga: 00:21:01
We have others, but if we don't have to make the list, it's just you need to understand that when you come to our products, they tend to be minimal and they tend to make you not worry about certain things so that you can move faster. That's the whole thing you can expect from us. 

Jon Krohn: 00:21:18
Excellent. So what do you think it is about PyTorch Lightning that allowed it to be able to grow so meteorically quickly? In just a few years going from its inception with Will Falcon, as you mentioned, the CEO of Lightning AI, being at NYU, being at Meta in the, well at the time Facebook, Facebook AI research FAIR, this iconic lab, and he's developing this open source tooling from scratch and now getting 9 million downloads a month. What do you think it is that contributed to that rise? Is it the kinds of features that you just described where people are seeing, "This just makes my life so much easier. It makes it so much easier to avoid mistakes"? Yeah, is there anything else in addition around the community? 

Luca Antiga: 00:22:07
Yeah, so I think one of the things Will excels at, repeatedly so, is being able to ship things minimally in a fully polished way. So he would always push everyone to get to something that is finished from the point of view of a product. Even the README, we spent a long time on the READMEs. We spent a long time on polishing the documentation and so on. So that's, I think, the first step: a product is not just the code or the abstractions, it's the whole experience. And then the other thing is, yes, you need to speak to struggles. And I think PyTorch Lightning speaks to some of the struggles that you have while doing the work of training, or for LitServe, serving and so on. So if you can avoid worrying about something and focus on the thing you want to focus on, that's the kind of value, without being too opinionated.

00:23:10
And so that's a bit of an art, right? You need to allow users to do what they want to do, at the same time taking some of the burden away. And I think if you look back, in 2019 there were five or six training frameworks, and then ChatGPT came, GenAI came, some of the frameworks survived that shift, some less so. I think PyTorch Lightning is one of the ones that did. And I think at this day and age, it's much less common to find people in teams that want to write one from scratch because the attention has shifted somewhere else. So you know, PyTorch Lightning was there at the right time with the right level of maturity, with the right level of user polish and experience. And that's I think what brought it to be one of the leading frameworks. 

Jon Krohn: 00:24:08
Yeah, and I think the passion of Will was probably a key to making this work; you described some of his principles there. I met him once two years ago at the... Insight Partners runs this annual conference. It's now in its third year; it's called ScaleUp:AI. And two years ago, Will and I were both speakers at one of the first ones. And I met him there, and he had to leave pretty quickly after he did his talk. He didn't stick around to meet with the speakers or talk to the media very much, because he had to go home and write code. 

Luca Antiga: 00:24:41
Yeah, yeah. We all write code all the time. So that's also what makes it interesting. As a team we're all very hands on keyboard. Yeah, it's good. 

Jon Krohn: 00:24:52
Yeah, it's cool. For a product like you have, it's pretty perfect. It makes sense that even the executives in the company, the CTO, the CEO are writing code when the company is a product for developers for data scientists.

Luca Antiga: 00:25:09
Yeah, yeah, for sure. 

Jon Krohn: 00:25:11
Cool. Another product that was introduced just earlier this year by Lightning AI, and which we've never talked about on air before, we didn't talk about it in Sebastian's episode or anything, is something called the Thunder compiler, which Thunder sounds like a nice complement to Lightning. Yeah, so tell us about that. 

Luca Antiga: 00:25:29
Yeah, somebody said, "Oh, thunder is slower than lightning, so why are you making a compiler that is slower?" But we stuck with the same name. But that's okay. Let them wonder. So yeah, so Thunder to me, it's a very interesting endeavor. It's extremely challenging, but it responds to one fundamental question. Nowadays with the sizes of models, with the money being poured into training and inference and so on, the ability for you to run a model at its best on a given hardware for a given set of inputs and workloads and so on kind of depends on all these things together. So there's no, "Oh, I have a perfect kernel. This will power everything across the board on any GPU, on any context size," talking about LLMs and so on. It all depends on: what is the hardware you're running on? What is the memory? What is the memory bandwidth? How large are you sizing your inputs? How much memory do you need to shuttle back and forth, and so on? 

00:26:46
And all these things, finding the right optimizations for the right configuration and combination of model, hardware, inputs and so on, is becoming where the meat is if you want to train or run inference with the best possible performance. So you can either tweak your code and hope that something will get better on the other side. But I think ultimately we're getting into a phase where we need to be able to provide post-source-code optimizations in the easiest way possible. And by that I mean however you write your original model, whether you control it or not, because sometimes it's a Hugging Face model or somebody else wrote that model, and then you need to decide what operations to fuse or how to manage different types of precision in the model or how to offload parts of the computation or throw away some intermediates and recompute them.

00:27:59
And all these decisions are so dependent on a combination of things that the ideal is that you have your computation expressed in a way that is amenable to be transformed either manually or programmatically into something else before you get to execute. And then even executing it, you can choose whether you want a set of kernels or an engine or something else that will execute it. So in a nutshell, program transformation is what compilers do. And our goal is to make it a first-class citizen of a framework. So how can you write a framework so that program transformation becomes something you do either manually or automatically, but where you have access to everything? You can reason through it.

00:28:49
Because right now, up until now, compilers are kind of black boxes. You come with a source code, you throw your source code into something that will do its thing. There you can set flags, activate optimizations that some genius somewhere wrote for you, and then you need to hope for the best. And sometimes this doesn't lead to results that you were expecting and you're left wondering why. And Thunder is a response to that. It's still very much work in progress, but it has some fundamental things that would help you reason through what program transformation is about and what you need to do to improve your performance. Yeah.

Jon Krohn: 00:29:37
Excited to announce, my friends, that the 10th annual ODSC East, that's the Open Data Science Conference East, the one conference you don't want to miss in 2025, is returning to Boston from May 13-15. And I'll be there leading a workshop. ODSC East is three days packed with hands-on sessions and deep dives into cutting-edge AI topics, taught by world-class AI experts, plus many great networking opportunities. No matter your skill level, ODSC East will help you gain the AI skills to take your career to the next level. Why wait? The first 100 tickets are on sale this week with a generous discount at odsc.com/boston. 

00:30:19
Nice. Cool. So when somebody is thinking about doing compiling, what kind of person needs to be thinking about doing that? I, as a data scientist, have hardly ever compiled; it's been over a decade since I've compiled anything. So when might a listener want to... What are some scenarios where they would actually want to be doing some compiling? And then when they do that, Luca, what do they do? Is it importing something, importing a package? Because it's presumably something that happens outside of the context of a Notebook environment?

Luca Antiga: 00:30:56
Well, you can actually ask... Okay, so let's take a couple of steps back. Compiling for whoever has used software means you have source code, you get an executable out. And then I can give you the executable, you run it. And it's not exactly about that that we're talking about. When you have a deep learning model compiling is transforming it, and then coming up with an execution plan that can involve specialized kernels, it can still be in Python, doesn't need to go into C, or maybe it will go into C. You turn it into a way where you optimize some aspects like memory usage or execution time and so on. So compiling is kind of a broad term. We call Thunder a source to source compiler because it doesn't generate kernels per se, but it will transform the computation so that you can then use the kernels that fit your thing the best. 

00:31:54
But when would you want to use a compiler is when you want to squeeze performance out of a single GPU, multiple GPU or multiple machines with multiple GPUs on them. It's all about the performance. If you don't have performance problems, then probably you don't need a compiler. However, nowadays GPUs are expensive and you need to run things that are super large most of the time. And so having a flexible compiler infrastructure is something that will allow you to get the most out of your money. But I think they are very bimodal. There is a very bimodal distribution of users, the users that do not care about any detail and the users who actually want to go in and optimize the heck out of their workload. And we're trying to address both with two different APIs in Thunder. 
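As a rough idea of what using such a compiler can look like from the "don't care about any detail" side, here is a sketch assuming the thunder.jit entry point that the lightning-thunder project documents; the toy MLP and shapes are placeholders, and the exact API may evolve:

import torch
import thunder  # the lightning-thunder package

# A toy model standing in for something you may not fully control (e.g. one pulled from a model hub).
model = torch.nn.Sequential(
    torch.nn.Linear(2048, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 2048),
)
x = torch.randn(8, 2048)

# thunder.jit captures the computation as a trace (a program) that can be transformed,
# e.g. fusing operations or choosing different executors, before anything is run.
jitted = thunder.jit(model)
out = jitted(x)

# The captured and transformed traces are meant to stay inspectable rather than being a black box;
# the project exposes utilities for this (assumed here: thunder.last_traces).
# print(thunder.last_traces(jitted)[-1])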

Jon Krohn: 00:32:50
Very cool. So this allows you in a scenario that would be common today where you're training or deploying large language models, these take up a huge amount of GPU infrastructure, they're very expensive to be running and deploying. And so whether you're aware of it or not, before listening to this episode, it might be helpful to you and your organization to be taking advantage of something like the Thunder compiler to allow you to be making the most of the infrastructure and of the spend that you have.

Luca Antiga: 00:33:23
Yeah, correct. And I mean, Thunder is by far not the only compiler... Actually it is one of the, if you will, less mature ones. If you ever use JAX or PyTorch/XLA, you run on the XLA stack. But even before that, there is torch.compile, which was first released with PyTorch 2.0, and that's the natural choice when you want to switch on a compiler, like, "Let's try to compile this model and see what the effect is." You usually torch.compile the model, and then you run the model that comes out of that, and that interprets the code and then generates optimized kernels to run it. There's Dynamo and Inductor, which are the two pieces of torch.compile: Dynamo captures what your code wants to do, and Inductor generates optimized kernels from that. And so we like to think that Thunder is, again, a complement to torch.compile for certain things. 
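For comparison, switching on torch.compile is typically a one-line change; in this sketch the toy model and shapes are placeholders:

import torch

model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.ReLU(), torch.nn.Linear(512, 512))

# Dynamo captures what the Python code wants to do; Inductor (the default backend)
# generates optimized kernels for it.
compiled = torch.compile(model)

x = torch.randn(16, 512)
out = compiled(x)  # the first call triggers compilation; subsequent calls reuse the compiled code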

00:34:35
And then if you've been following optimizations in the LLM space, there have been a lot of hackers that have gone into, "Oh, let's reimplement this model in C." Like Andrej Karpathy, who was much more than a hacker, of course, but he did this project called llm.c where a lot of hackers this time joined him and optimized training for, initially, GPT-2 size models, all in C. So they had C, they had CUDA. And the whole stack was basically fitting in a couple of files. It was great because with a minimal amount of lines of code, you had everything. And so you didn't need to pierce through layers of abstractions in order to say, "Okay, let's reuse this memory buffer for this other thing because this will allow us to save an allocation," and so on. So that kind of granular control is what makes those projects successful. At the expense, of course, of generality. And same thing for llama.cpp and other projects like that. What we're trying to achieve with Thunder is giving you a basis that is thin enough so that you can reason through the stack without having, again, to pierce through too many layers in order to optimize your thing, and to open it up to you as a practitioner. 

Jon Krohn: 00:35:55
Trying to get those principles right as you described already earlier in the episode, where you're trying to not dictate too much, not make too many assumptions about what the users are going to want to be able to do, while simultaneously offering them flexibility if they want it. 

Luca Antiga: 00:36:09
Yeah, exactly. Exactly. And it's a hard balance to strike but we're well on course I think. We're using it internally. We're very happy with it. So it's good.

Jon Krohn: 00:36:18
Yeah. It's nice to have an organization like yours on the show, like Lightning AI, where I can hand on heart say you guys are nailing this, you're doing a great job. And the kinds of tools that you're developing time and time again are exactly the kinds of tools that data scientists and software developers want for training and deploying their machine learning models faster, cheaper, and more accurately. So thank you. 

Luca Antiga: 00:36:48
Well, thank you. That's great to hear. And we're doing our best. 

Jon Krohn: 00:36:54
Okay, so when we're talking, like we just have been, about very large language models and huge costs on GPUs, that isn't necessarily where everything is going. So we do have this trend towards gigantic models like we saw with GPT-4, with Claude from Anthropic. These models are getting bigger and bigger. But simultaneously there is also a trend towards small language models, SLMs. And during a recent ODSC podcast, we were talking about ODSC earlier, so I assume... I didn't actually listen to this episode, but I'm guessing it was hosted by Seamus McGovern as usual. 

Luca Antiga: 00:37:36
Yeah. 

Jon Krohn: 00:37:37
And so during that recent ODSC podcast on small language models, you discussed deploying SLMs into production. So could you fill us in a bit more on SLMs and what it's like working with them relative to larger models, why somebody might want to use an SLM instead of an LLM?

Luca Antiga: 00:37:59
Yeah, so let's say that ever since there's been this push to make models larger, because capabilities have evolved with scale, we learned from OpenAI, and then everybody else following that trend, that as you scale up a certain architecture, be it transformers or other architectures that have shown similar gains in performance as they scale up, it's kind of a universal property of something doing certain kinds of computation. It doesn't need to be a transformer. But what we know right now is that if you scale things up, the capabilities will also increase for a certain set of data, for a certain task that you want your model to solve. In this case next-token prediction, right? You give it a huge corpus and then say, "Okay, if I give you a chunk of tokens, like fragments of words, what's the next one?" Or, "In your gigantic vocabulary of 30K possible tokens, can you assign a probability to each one of them, of them being next?"
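To make the "probability over the vocabulary" idea concrete, here is a small sketch using GPT-2 from Hugging Face Transformers purely as an illustrative model; neither the model nor the prompt comes from the episode:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Given a chunk of tokens, the model assigns a probability to every token in its vocabulary being next.
inputs = tok("The capital of Italy is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits              # shape: [batch, sequence_length, vocab_size]

probs = torch.softmax(logits[0, -1], dim=-1)     # distribution over the ~50K-token GPT-2 vocabulary
top = probs.topk(5)
print([(tok.decode(i), round(p.item(), 3)) for i, p in zip(top.indices, top.values)])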

00:39:18
And then going through this process, the model generates some sort of internal representation of higher order concepts I would say that you can have an experience with as a human interacting with the thing. And then the other property that we saw, and then I come to your question specifically, is in-context learning. So prior to a certain point in time, to solve every task, you had to train a model on that task, you had to have a lot of data annotated for that task. And then it's still true for many disciplines. I've done a lot of work with AI in manufacturing context, and that's very true there. 

00:40:14
But in language specifically, and then vision as well, in some cases, you can train models that are more generalistic and you can direct them to the subtask or a task that is somehow related to the original task by either fine-tuning them, so using gradient descent to change their parameters or a subset of those, or a layer on top of the parameters you have, to adapt the behavior of the model. And that kind of biases generation in a certain direction. It doesn't really add a lot of knowledge into the thing, but it's probably amplifying things that the model has in it.

00:40:57
But also you can do in-context learning, which means you can explain the task that you want out of the language model by just writing it in the context, and then even providing examples. And those help the model, again, bias its output in a certain direction as well. What has become quite clear, once ChatGPT was released and then Llama came and so on, was that there was a process of miniaturization where practitioners started to look for smaller models that maybe had fewer facts but could still retain those properties of in-context learning and being general and adaptable to different tasks. And what is usually happening is that small models are better at specific tasks than general tasks and are worse at in-context learning. Although this is, incrementally, month after month, and months are very long timescales these days, getting a lot better.

00:42:14
I remember the first time I used Llama 3 8B, I said, "Well, this is giving me a few vibes of the initial ChatGPT," right? And it was running on my MacBook. And then you start seeing 1 or 2 billion parameter models obtained through distillation of larger models that are starting to get that kind of... But they don't need to be perfect. They need to be, A, good enough, and then B, not hallucinate so much. I've seen other models in the past that were small, like the original Phi, that hallucinated a lot; although it had good metrics, the hallucinations were a bit too many. I think Llama 3 8B was one of the models that didn't hallucinate so much, at least on my private test cases that I have. 

Jon Krohn: 00:43:05
The 8 billion, that to me, that's starting to be... That's like a small LLM. I don't know if I would say... It's funny, it's right on the cusp.

Luca Antiga: 00:43:13
It's on the cusp, yeah.

Jon Krohn: 00:43:16
When you're saying 1 billion, 2 billion. For me, that's a small language model for sure. It's interesting when it's like eight, it's like, yeah, it's right on the threshold of it's a smaller large language model, but it's not super portable on modern phones, that kind of thing.

Luca Antiga: 00:43:33
Yeah, yeah. But I mean, this will evolve and probably 1 billion, 2 billion will become... Yeah, as we learn how to distill things. We did some investigation with an intern this summer, and we saw, and we'll probably publish it at some point, that... Actually he already published a part of the findings. And many other papers are looking at the same thing. Even 8 billion models or 2 billion models are undertrained; there is a set of parameters that is doing nothing. And so think about this. We landed on something because we wanted to scale it up, and we saw that just by scaling up we gained a lot of features. But we're still at the very, very early stages of being able to say, "Okay, are we actually using this and all parameters in a way that makes sense? Or are we just shooting and..." It's like the latter is true. And we're seeing that distillation is doing something that kind of compresses and makes things more efficient. 

00:44:40
But again, it goes together with interpretability a little bit; as for squeezing every inch of performance or every inch of capability, we're very far from there. And also facts are polluting and occupying a lot. And so reasoning and knowing everything are two different things. So even having to train on trillion-token-sized datasets and having to memorize all these facts are all things that are pointing in the direction that in a couple of years, in one or two years, we'll get the same good-enough-for-interaction kind of vibe from models that are, yeah, no more than one or two billion for sure. It's a tendency. Like when we had boom boxes, the radios, and then we had Walkmans; it's got to be inevitable. I don't think we're hitting any fundamental property. Yes, there are scaling laws, but scaling laws are a result of conditions. There's a certain set point at which they're being formulated. 

Jon Krohn: 00:46:06
Exactly. 

Luca Antiga: 00:46:07
And then it's universal if you put yourself under those constraints, but it's not universal [inaudible 00:46:14].

Jon Krohn: 00:46:14
Exactly, exactly. Do you ever feel isolated, surrounded by people who don't share your enthusiasm for data science and technology? Do you wish to connect with more like-minded individuals? Well, look no further, Super Data Science community is the perfect place to connect, interact, and exchange ideas with over 600 professionals in data science, machine learning, and AI. In addition to networking, you can get direct support for your career through the mentoring program, where experienced members help beginners navigate. Whether you're looking to learn, collaborate, or advance your career, our community is here to help you succeed. Join Kirill, Hadelin and myself, and hundreds of other members who connect daily. Start your free 14-day trial today at superdatascience.com and become a part of the community. 

00:46:59
So the idea of something like the Chinchilla scaling laws makes sense in the context of a specific budget, for example, or a specific amount of data. And so in those kinds of scenarios with those constraints, they're like, "Okay, if we scale up the number of parameters, it will have this effect." But of course, that doesn't take into account, "How can we be more clever about distillation or pruning to allow us to make the most of all of the parameters in a 1 or 2 billion parameter model?"
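As a back-of-the-envelope illustration of the constraints being referenced, the Chinchilla result (Hoffmann et al., 2022) lands on roughly 20 training tokens per parameter under a fixed compute budget, with training compute commonly approximated as 6 x parameters x tokens; the numbers in this sketch are purely illustrative:

def chinchilla_optimal(params: float) -> tuple[float, float]:
    # Rough compute-optimal heuristic from the Chinchilla paper: ~20 tokens per parameter,
    # with training FLOPs approximated as 6 * N * D.
    tokens = 20 * params
    flops = 6 * params * tokens
    return tokens, flops

tokens, flops = chinchilla_optimal(2e9)  # e.g. a 2-billion-parameter model
print(f"~{tokens:.1e} tokens, ~{flops:.2e} training FLOPs")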

Luca Antiga: 00:47:30
Yeah. 

Jon Krohn: 00:47:31
It's interesting to hear, though unsurprising to me, that you had an intern who was able to discover that even on a 2 billion parameter model, there are lots of parameters that are unused across all tasks. So there's still more juice to be squeezed from that lemon. 

Luca Antiga: 00:47:46
Yeah, for sure. 

Jon Krohn: 00:47:47
It's kind of obvious to me. You hear in the press, in the popular press, "Oh, right now, training a state-of-the-art LLM costs in the area of $100 million. The next generation will be a billion, the next generation will be 10 billion." And I don't think that's true, because of exactly the conversation that we're having here, where yes, that instrument, that hammer of just scaling up model parameters got us from GPT-2 to 3 to 4. But once you are at OpenAI and you're looking at, "Wow, this is going to cost $1 billion or $10 billion to train the next generation of models," and then in deployment you're going to have 10X, 100X costs as well? It's not possible. It doesn't make economic sense when there is so much juice to be squeezed from the lemon in other ways, distillation and pruning, as we already talked about with small language models, and also now with paradigms like we see with o1 from OpenAI, you can be scaling inference-time compute as opposed to training-time compute.

Luca Antiga: 00:48:58
Absolutely. Yeah, that's the other thing I wanted to mention too. Right now we use LLMs, like, zero-shot, or with in-context learning. But you know, again, we're scratching the surface. It's so early. If you read around, it looks like everything, oh, it's all set. But it's not even the beginning. So approaches like... I was reading about Entropix these days, this method to kind of [inaudible 00:49:33] whether the model is confused at a certain point of the generation so that you can give it more time to sample differently. And sampling is just one of the ways you can do things. And think about the effect that LoRA has on weights. It's a surprisingly small amount of information that you're adding, but you can pretty dramatically change the behavior of a model, even with that small amount of numbers. That means that a model has in itself the ability to produce a very diverse set of outputs, and we just need to bias it to produce those. And going through gradient types of methods and optimizations is one way, but I think it is a way that is a bit indirect. 
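Here is a minimal sketch of the LoRA idea being referenced: the original weight matrix stays frozen and a low-rank update is learned on top, so the number of new trainable values is tiny relative to the layer. The layer sizes, rank, and scaling below are illustrative defaults, not anything specific from the episode:

import torch
from torch import nn

class LoRALinear(nn.Module):
    # Wraps a frozen nn.Linear and adds a trainable low-rank update: y = base(x) + (alpha/r) * x A^T B^T
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                                # freeze the original weights (and bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))   # zero init: the update starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
added = layer.A.numel() + layer.B.numel()
frozen = sum(p.numel() for p in layer.base.parameters())
print(f"LoRA adds {added:,} trainable values on top of {frozen:,} frozen ones ({added / frozen:.2%})")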

00:50:24
I think the real opportunities will come when you have a knob that you can turn in real time to tune exactly what you want out of the model, or give the model the ability to bias itself in real time towards behaviors without necessarily going back and doing the whole "add an optimizer, blah, blah, blah, with examples" and so on. That will stay there for sure to create the base models and also to align them and so on. But I think we're getting into an inference-time compute period where the knobs will be a lot more visible. And even from a business perspective and a compound AI systems perspective, that's where we will be able to squeeze a lot more value out of these systems. And we'll learn that small models know a lot more than we think.

Jon Krohn: 00:51:16
Yeah, I agree with you 100%, and I'm looking forward to the lightning knobs module when that comes. 

Luca Antiga: 00:51:20
Yes. Absolutely. Yeah, we're taking it easy in that space, but it's not that we're not thinking about all that stuff. We're always thinking about it. But releasing something is a big commitment. Releasing the product, even if it's a small open source library, you need to feel that you nailed it. So I think we'll see what we do in that space. 

Jon Krohn: 00:51:54
So switching gears a little bit here, turning the knobs a bit on the topic of the episode, I want to go back a little bit into your past. You have a PhD in bioengineering, and over the years you've contributed to lots of open source projects like the Vascular Modeling Toolkit, VMTK, which is for modeling blood vessels for medical images. And I'll include a link to vmtk.org for people who want to check that out. So these days, as our researcher, Serg Masis pointed out to me, when you go to the dentist, you get a high resolution 3D scan of your mouth. And using advanced imaging techniques and AI powered analysis, you can accurately predict cavities and other dental issues before they become visible or symptomatic. So the question that I have for you, given your background in bioengineering, in all the kind of open source development you've done in that space, how do you see this kind of technology evolving further in the coming years, incorporating more AI to allow more early diagnosis in biology and medicine and personalized treatment in healthcare? 

Luca Antiga: 00:53:06
Oh, yeah. This is a very big thing. So VMTK is something I developed in 2003/4. I open sourced it in 2004. I was working with David Steinman; he was at Robarts Research Institute in London, Ontario, where I spent a year, a beautiful year.

Jon Krohn: 00:53:29
There's no way you would know this Luca or expect this, but I've also done research at Robarts in London, Ontario.

Luca Antiga: 00:53:35
Oh my God, amazing. Really? 

Jon Krohn: 00:53:38
And it wasn't that too far off the timeline that you're describing either. It's possible we were there at the same time. 

Luca Antiga: 00:53:44
Oh my God. 

Jon Krohn: 00:53:44
I was an undergraduate student in Waterloo, Ontario from 2003-7. And in 2006 and 2007, maybe even 2005, but definitely 2006 and 2007, I was doing FMRI experiments. 

Luca Antiga: 00:53:58
Oh, of course. 

Jon Krohn: 00:53:59
And we didn't have a research grade scanner in Waterloo. So I would drive my subjects whose brains I was scanning, and these were all humans, and so I'd rent a car and get two or three undergraduate students in with me and we'd drive from Wilfrid Laurier University in Waterloo to the University of Western Ontario to the [inaudible 00:54:23]. 

Luca Antiga: 00:54:22
To Ravi Menon's lab? 

Jon Krohn: 00:54:25
To Ravi Menon's lab. Exactly. That's exactly right.

Luca Antiga: 00:54:28
Yeah, that's amazing. Yeah. 

Jon Krohn: 00:54:32
Yeah, and then we'd get people in the scanners. And the experiments that I was doing at that time, they were neuroscience experiments, and we trained our subjects, our experimental subjects, to be able to integrate information, visual and touch information, to be able to recognize letters. So Philip Servos, who was the professor that I worked with at Wilfrid Laurier, he had this device developed and 3D printed, which at that time was a pretty new technology, and on the front of the device it had LEDs. So it was very easy to train people to be able to recognize letters made out of LEDs. Okay, this is very simple. But on the back of the device, it would pump air. So it had, in the same grid as the LEDs on the front, air jets that would blow air onto the palm of your hand, and we could train the subjects.

00:55:29
So we would do this in Waterloo. We would train the subjects on being able to recognize the outline of the letter, the shape of the letter on the palm of their hand. And then what you could do is you could train them to be able to recognize letters only when LED vision information on the front was combined with tactile information on the back. So you take away half of visual information, half of the tactile information, and so you had to integrate this visual and touch information in order to be able to do the task. And it's a hard task, but people would be better than random, I'm guessing. So definitely there was some learning process that happened here over several hours of training. And then you'd get them in an fMRI scanner and you'd do the same, what areas are being activated in the brain with the visual task, which with the tactile and which with blending together? And the idea there was to be able to identify what areas of the brain are involved in that multi-sensory processing and the integration. 

Luca Antiga: 00:56:32
Yeah, that's amazing. 

Jon Krohn: 00:56:32
So that's kind of interesting. I've never talked about that on air and it's only because you mentioned Robarts. And I've taken you way off course in your episode. But the final thing that I would say that was really interesting about doing these fMRI was so when people were doing the experiment that I just described, you're fully engaged in that task, that's all that you would be doing. And maybe you'd do that for half an hour in the scanner. But then you'd leave the subject in the scanner for maybe an hour after just to do a detailed anatomical MRI of their brain in high detail. And people need to sit there still for the whole thing, which it can be claustrophobic, it's cold- 

Luca Antiga: 00:57:15
It's cold and loud, yeah.

Jon Krohn: 00:57:15
It's cold, yeah. But the thing about Ravi Menon's lab is they had lots of Pink Floyd CDs. So while people were getting these structural scans after the experiment, at least you could listen to Pink Floyd, a whole album, and kind of trip out as you're lying there having your brain scanned. 

Luca Antiga: 00:57:38
Yeah, yeah, yeah. Yeah, it was great. It was a big lab. It still is, because it's been, I think, incorporated into the University of Western Ontario at some point. But it's still there. And you have the molecular biology folks, and then you have the imaging folks. And in the imaging department, there were like 100-plus people when I was there, and they were all doing heavy Python back then, heavy open source stuff. And I was learning just by walking around with my muffin. And so it was great. And so that is what pushed me to release my stuff, because I was doing all these algorithms for applying computational geometry concepts to anatomy, like Voronoi diagrams, and solving partial differential equations on the 3D Voronoi diagrams was a lot of fun. And then I said, yeah, probably it's going to be useful for someone.

00:58:38
And to my surprise, that tool is still used today in bioengineering departments. I don't develop it anymore. And it's been kind of adopted by a group in the US, based at Harvard, doing medical imaging that I also participated in at some point, with ITK, which was, and still is, a tool for medical image analysis, and 3D Slicer. So it has become primarily a plugin for 3D Slicer. But that taught me a lot. And I was doing the whole mailing list thing at the time. At some point I went back and looked; I had responded to like 10,000, 12,000 emails over the course of 10 years helping researchers. But then what would blow me away, I would go to, I don't know, Norway, and then get into a lab and I would see students typing my stupid CLI commands. I was like, "I'm sorry." So it was great. And so I understood the power of open source, and I've always been willing to operate in this area.

00:59:56
The whole medical imaging thing led me to explore deep learning. Because in 2009, I founded my company, Orobix, that I already mentioned, and we got a challenge with a bunch of MRIs, a few thousand MRIs we had to segment, and all my combinations of algorithms didn't quite work. So at that point, we said, "Okay, maybe we try this new thing." Well, actually, first we tried random forests for segmentation. They kind of worked, but they weren't deep enough. And so at some point we said, "Okay, let's try deep learning." And there was this library, Torch7, written in Lua, that had 3D kernels, convolution kernels, and we had to do it in 3D. So we used that, and things kind of worked pretty quickly. I was kind of blown away by the lack of pain in getting results out, and I understood that that was going to be the future. Back then, it was like 2013/14. I remember there were three papers using deep learning in medical imaging on PubMed. That was funny. 

01:01:06
And that's basically what led me to Torch7. Soumith was already maintaining Torch7. I sent a small PR to Torch7 at some point. And then when PyTorch came out in early 2017, I said, "Okay, it's time to contribute to this thing." And so, yeah, I spent a lot of nights on that, and it was great. Very open, very populated by amazingly smart people. So I was like... Do you know torrone? You know what a torrone is? A torrone, a sweet, like the Christmas treat from Italy? 

Jon Krohn: 01:01:49
Oh, I've seen it. It looks visually like- 

Luca Antiga: 01:01:53
It's a brick. It's a long brick of sugar and almonds and egg. I remember I was joking, because we had torrone the other day with my kids, and I was telling them I would eat one of them, the whole thing, on a PyTorch night. And yeah, yeah, it was very demanding. But it was an amazing experience for me. And that led me to write the book and get in touch with Lightning. You can't really overestimate the power of serendipity and just going with it. And it has been like that for me. And so yeah, it's been great. 

Jon Krohn: 01:02:40
You must have complemented that torrone eating with lots of mountain cycling or something, because you look like a-

Luca Antiga: 01:02:48
Yeah, unfortunately not.

Jon Krohn: 01:02:50
You don't look like you've had a torrone a night for years on end. 

Luca Antiga: 01:02:56
Well, yeah. I mean, that happened a few years ago. If I did it today, I would probably be at the hospital. But yeah, it was a few years ago. 

Jon Krohn: 01:03:05
Very cool. It's nice to hear that background into what led you to what you're doing today, the power of serendipity. Looking a bit towards the future, generative AI obviously is transforming how software developers work, how data scientists work. What do you think are the kinds of the new skills and knowledge bases that developers, data scientists need to stay relevant in this generative AI world? 

Luca Antiga: 01:03:32
Yeah, that's interesting, because I hear about a lot of people that complement themselves with language models, and I do as well, maybe not in small doses, it depends on what I do. Honestly, I don't think we've cracked the recipe for working alongside AI for coding yet. There are a few very notable examples. But at the same time, for the more mundane tasks, it's great. Also for complicated things, it can be great. But you need to find your dimension as a... You need to use it as a wall to bounce ideas off, back and forth. That's where I get the most out of it. It helps me. When I have an idea, even a methodological idea, maybe even with some math theory behind it, where I would sometimes look for papers on Google, that process of getting in the vicinity of where you want to be is greatly facilitated by a language model, a powerful one, if you know how to bounce ideas back and forth. And I think that is the skill you need to develop.

01:04:58
Also, I think sometimes I develop on the back end and so on, and I see a lot of things that could be so much easier if only I could delegate them to someone else. And that someone else can be a language model, because those tasks are extremely predictable and repetitive. And yes, you can use libraries that abstract things away, but then what happens when something goes wrong? You've got many layers to peel off, and maybe it's just easier to keep things simple from a library perspective and have AI fill the gap between your willingness to spend time at that level and the task you need to accomplish.

01:05:41
And I do think that in the future, you will just write a function or call a function, and that function doesn't exist until you call it, or it will be cached in many ways. But for sure, you will have to weave everything from the top to the bottom. There will be some layers that may be delegated, again, in that realm of repetitiveness and so on. I don't think we're at the point where we can say that software development will be rendered useless. I think it's more that we're nearing the point where personal automation is within reach. And then what software development becomes when you have personal automation, that will evolve naturally.
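As a purely hypothetical illustration of that "the function doesn't exist until you call it" idea, here is a Python sketch of a decorator that would synthesize a function body on first use and cache it. The ask_llm_for_code helper is an imaginary placeholder, not a real API, and executing generated code like this would need sandboxing in practice.

```python
# Purely hypothetical sketch of "the function doesn't exist until you call it".
# ask_llm_for_code() is an imaginary placeholder for a code-generating model,
# not a real API; exec'ing generated code like this is unsafe outside a sandbox.
import functools

_CACHE = {}  # spec string -> synthesized callable

def ask_llm_for_code(spec: str) -> str:
    """Placeholder: pretend a language model returns Python source for `spec`."""
    raise NotImplementedError("wire up a real model here")

def synthesized(func):
    """Treat `func` as a spec: its docstring describes what to generate."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        spec = func.__doc__ or func.__name__
        if spec not in _CACHE:
            source = ask_llm_for_code(spec)  # generate the body on first call
            namespace = {}
            exec(source, namespace)          # assumes the source defines func.__name__
            _CACHE[spec] = namespace[func.__name__]
        return _CACHE[spec](*args, **kwargs)
    return wrapper

@synthesized
def slugify(title: str) -> str:
    """Lowercase `title`, replace spaces with hyphens, drop other punctuation."""
```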

01:06:37
Of course, if you want to be in your comfort zone of being a super expert at a language and know the ins and outs, that's great. It's probably not the way things will evolve in the future. Although there will still be a need for that, just in smaller numbers. But something I already brought up, I think, at some other time, in another video: when I first interacted with ChatGPT, something crystallized in front of my old-person mind, which is HyperCard.

Jon Krohn: 01:07:23
HyperCard?

Luca Antiga: 01:07:23
Yeah, do you remember HyperCard?

Jon Krohn: 01:07:27
No. HyperCard? 

Luca Antiga: 01:07:29
Yeah. So HyperCard is something that, coincidentally, was created by a person under the influence of a hallucinogenic substance. But it was actually a product by Apple that shipped with classic Mac OS. I don't know if it was software on top of it or shipped with it, I can't remember. But it was an attempt to bring automation and programming and building systems to people that were not programmers. It had HyperTalk, which was a scripting language that resembled English from a syntax perspective. The problem is that it's not the syntax; the problem is that if you need to write a for loop, your mind needs to be crafted in such a way that you understand what a for loop does, and you keep track of variables and blah, blah, blah. So yes, you can write it with a semicolon or [inaudible 01:08:29] or in English, but it doesn't matter in the end, right? 
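To illustrate that point that the hard part is the concept rather than the syntax, here is a rough comparison, with the HyperTalk-style loop reconstructed from memory as comments next to the equivalent Python; either way you still have to think in terms of loops and variables.

```python
# Rough illustration: English-like syntax doesn't remove the need to think
# in loops and variables. The HyperTalk-style version below is reconstructed
# from memory and may not be letter-perfect.
#
# HyperTalk-ish:
#   repeat with i = 1 to 5
#     put i * i into line i of card field "squares"
#   end repeat
#
# Python equivalent: same concept, different surface syntax.
squares = []
for i in range(1, 6):
    squares.append(i * i)

print(squares)  # [1, 4, 9, 16, 25]
```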

01:08:31
So it was trying to solve that problem from an angle of syntax and accessibility, but it didn't really solve the problem throughout, which is, "How can I express what I want in natural language?" So of course, the technology back then didn't allow you to express what you want in natural language, because that was relegated to science fiction movies, and now we're there. But the whole purpose of that thing was: can I allow someone who doesn't have a background in computer science to write their own tech, to have their own tech materialize in front of them, because they need to solve a very specific problem and they want to solve it in the way that fits them, their immediate need? 

01:09:17
And so I think we're at the point where the technology is ready to get there. And I think that's what makes me the most excited: the ability to bring automation, and the ability to express algorithms as an intention rather than having to walk the little robot through every step of the way. So yeah, I think that is what the future of development might be. So in a way it will get more accessible, so that I can just express what I want. And it's also already partially true, but it is just very partial, with agents and so on. And then there will be another set of people that will just dive deep into whatever and use language models to empower them to think faster and to get the results faster, and that's inevitable. 

Jon Krohn: 01:10:34
Nice. Yeah. So basically to kind of summarize back to you what you're describing is that with generative AI, already today we have some of these kinds of personal automations where you can be delegating some software development or data science tasks and over time that will become more reliable, more expansive. But for the foreseeable future at least, the role of software developer, the role of data scientist won't disappear. It's just that there will be more and more automations that you can spin up easily. So it provides more accessibility. It means that you don't necessarily need to be expert in all of the different programming languages that you are developing in, say. 

01:11:16
And so maybe the kinds of skills that become more important in that kind of environment are the kinds of collaborative skills with the team to understand what the product needs, what the business needs, creativity to be coming up with solutions that will really move the needle for some product or organization, and simultaneously principles around architecture and having systems work well. So it moves you up the stack where you don't need to be worried so much about the low-level coding as a software developer or data scientist, more you're thinking at a higher level and maybe sitting a little bit closer to product.

Luca Antiga: 01:11:55
Yeah, that's true. I agree with that. Although it's not quite like Blockly, right? In the sense that you still need to be low-level to understand exactly what you want from a system. So we're not at a level where an automated system will be able to figure out the architecture of something for you. But it will help you iterate much faster at getting that architecture out the door. And I don't think there's any system right now that is ready to just replace the whole full stack. You still need to be full stack, but it's actually easier to be full stack right now, because some of the things you just had to know in order to have an acceptable speed, you don't need to know them all anymore, right?

Jon Krohn: 01:12:44
Yeah, yeah. That's right. And I did also look up this HyperCard, which is a really big deal, and I'm embarrassed that I hadn't heard of it before. So it's nice to learn something completely new like that here. 

Luca Antiga: 01:12:54
There's a whole underground thing of people still doing HyperCard, competitions, communities. Of course, just like with everything, right? 

Jon Krohn: 01:13:06
And you're absolutely right. It was someone named Bill Atkinson. He worked at Apple from 1978 to 1990, and he devised HyperCard on an LSD trip. 

Luca Antiga: 01:13:18
Absolutely. 

Jon Krohn: 01:13:22
There you go.

Luca Antiga: 01:13:22
And the hallucination thing takes on a whole other shape. Yeah. 

Jon Krohn: 01:13:29
Right, GenAI hallucinations. Yeah, very good. So it's been awesome having you on the show, Luca. I enjoyed learning this about HyperCard and everything else that I've learned on the show. Before I let you go, I always ask my guests for a book recommendation. Do you have anything for us?

Luca Antiga: 01:13:47
Yeah, so I'm a bit less about books about AI and so on. I mean, there are many of them that you can read. I wrote one, Sebastian Raschka wrote one, Simone Scardapane wrote an excellent one, Alice's Adventures in a Differentiable Wonderland. All of these are great. But I think sometimes, when you're so into technology all the time, you need to allow your brain to breathe a little bit. And so I think that a book that always fascinated me, it's not necessarily a positive book, but when I read it I always come out on the other side more motivated about what matters. I'm a big Dino Buzzati fan, and Dino Buzzati is an Italian writer, and his short stories are just mind-blowing and phenomenal. 

01:14:50
But his most popular book is called, in English, The Tartar Steppe, and it's the tale of a young officer who spends his life guarding a fortress against some menace that is never going to come. And then, when the menace from the other side of the border kind of arrives, he's so old that he's sent back, and nothing happens. So I find it mind-blowing as a story, and you need to read it, of course, and experience it. But I think it speaks to many of the things that we're living through. And as great visionaries often do, he speaks to universal things, universal parts of us, and we need to know that they're there. And sometimes you cannot just stare them in the face. You just see them with your side vision. And that's what Dino Buzzati is so good at. 

Jon Krohn: 01:15:57
Very cool. I like that. It also occurs to me, I probably should have mentioned this at the outset of the episode, but often Luca, we give away copies, physical copies of our guests' books, and I should be giving away five free physical copies of Deep Learning with PyTorch to listeners.

Luca Antiga: 01:16:19
Oh, nice. 

Jon Krohn: 01:16:19
So I'll be sure to include that in the intro. I script an intro after you and I kind of record this conversation, so I'll include in there that we'll do that book giveaway for our listeners. 

Luca Antiga: 01:16:30
Nice. 

Jon Krohn: 01:16:30
So yeah, there you go. That'll be in there. So I think our listeners know the drill. On the day that this episode comes out, I'll make a LinkedIn post announcing your episode, and people who ask for a copy of Deep Learning with PyTorch on my LinkedIn post will have about a week, typically up until the Sunday, to request that, and then we'll hold a draw and some people will get shipped some books. 

Luca Antiga: 01:16:56
Amazing. Okay. That's nice. Yeah, thanks a lot for the chat. It's been super fun. 

Jon Krohn: 01:17:04
Of course. How should people follow you after this episode? What are the best places to follow you online and hear your thoughts? 

Luca Antiga: 01:17:13
Well, I don't publish a lot of thoughts, but I have a Twitter, X, whatever handle, which is @lantiga. And then I'm on LinkedIn. And then every Friday we do a podcast with my good friend Thomas Viehmann about Thunder, called The Thunder Sessions. And then- 

Jon Krohn: 01:17:38
Really? I didn't know that.

Luca Antiga: 01:17:39
I show up there. We nerd out for like half an hour, 40 minutes, and we make progress every week and we absolutely [inaudible 01:17:49] regular thing. 

Jon Krohn: 01:17:50
The podcast is actually tracking the development of Thunder?

Luca Antiga: 01:17:54
Yeah, yeah. Actually, we use it most of the time. We either explain it or use it. And this is so valuable for us because we experience things that we know we need to improve, and so it's a great way to put ourselves in front of what we're creating. 

Jon Krohn: 01:18:13
Very cool. Love it, Luca. All right, yeah, thank you so much for being on the show. It was great having you on. And we'll have to catch up with you again in the future and hear how the journey is coming along.

Luca Antiga: 01:18:22
Perfect. Thank you so much, Jon. It's been great.

Jon Krohn: 01:18:30
Nice. To recap, in today's episode, Dr. Luca Antiga filled us in on how Lightning AI offers tools like PyTorch Lightning, Lightning Studios and LitServe to simplify AI development and deployment. He talked about how small language models with about one to two billion parameters are rapidly improving and may soon rival larger models for many specialized tasks, at much lower cost and with faster inference. He talked about how, beyond simply scaling up parameter count, there's still much untapped potential in optimizing model architectures and training approaches. He talked about how AI assistants will increasingly augment developers' capabilities in the coming years, but won't fully replace the need for low-level coding skills. And he talked about how the most valuable skills for coders going forward will be creativity, systems thinking and the ability to collaborate effectively with both AI tools and other people. 

01:19:19
As always, you can get all the show notes including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Luca's social media profiles, as well as my own, at superdatascience.com/831. And if you'd like to connect in real life, I've got two things coming up soon in the same week, in very different parts of the world.

01:19:39
On November 12th, I'll be conducting interviews in New York at the ScaleUp:AI conference run by the iconic VC firm, Insight Partners. This is a slickly run conference for anyone keen to learn and network on the topic of scaling up AI startups. One of the people I'll be interviewing will be Andrew Ng, one of the most widely known data science leaders, and I'm very much looking forward to that. It will probably also eventually be a Super Data Science podcast episode.

01:20:04
Also that week I'll be giving a keynote and hosting a half day of talks at Web Summit. That's November 11th to 14th in Lisbon, Portugal. There will be over 70,000 people there. I'm pretty sure it's the biggest tech conference in the world. It'd be cool to see you there too. Other folks speaking include Cassie Kozyrkov, the CEO of Grok, and the Brazilian football legend, Roberto Carlos. 

01:20:30
All right, thanks to everyone on the Super Data Science Podcast team: our podcast manager Ivana Zibert, media editor Mario Pombo, operations manager Natalie Ziajski, researcher Serg Masis, writers Dr. Zara Karschay and Silvia Ogweng, and founder Kirill Eremenko, last but not least. Thanks to all of them for producing another exciting episode for us today. And thanks, of course, to our sponsors for enabling that super team to create this free podcast for you; we couldn't do it without them. You can support our show by checking out our sponsors' links, which are in the show notes, and if you'd like to sponsor the show, you can get the details on how by making your way to jonkrohn.com/podcast. 

01:21:10
Otherwise, other ways you can support us are sharing the episode today with folks that would like it, reviewing the show on your favorite podcasting app or on YouTube, and subscribing, obviously, if you're not already a subscriber. But most importantly of all, I just hope you'll keep on tuning in. I'm so grateful to have you listening and hope I can continue to make episodes you love for years and years to come. Until next time, keep on rocking it out there, and I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.
