82 minutes
SDS 739: AI is Eating Biology and Chemistry, with Dr. Ingmar Schuster
Subscribe on Website, Apple Podcasts, Spotify, Stitcher Radio or TuneIn
At Exazyme, CEO and Co-Founder Ingmar Schuster uses AI to design proteins. He speaks with Jon Krohn about their wider applications in pharmaceuticals and chemistry, how kernel methods make the design of synthetic biological catalysts more efficient, and when to use shallow machine learning over deep learning.
About Ingmar Schuster
Dr. Ingmar Schuster holds a double degree in Computer Science and Linguistics from the University of Tübingen, complemented by a Ph.D. in Artificial Intelligence. His expertise lies in Bayesian models of natural language semantics and Monte Carlo numerical integration methods. After a successful academic career with positions at Fraunhofer FIRST, Université Paris Dauphine, and Freie Universität Berlin, he transitioned to industry applications upon joining Zalando Research. During his tenure at Zalando, Dr. Schuster led the development of an algorithm for precise warehouse stock predictions, streamlining logistics planning. Additionally, he played a pivotal role in crafting an AI-driven COVID test strategy. Driven by his passion for transformative innovations, he departed Zalando to co-found Exazyme, a company blazing a trail in using AI for engineering proteins. In his role as CEO, he leads the development and advancement of new AI models and the product, driving the expansion of protein engineering use cases and refining prediction accuracy.
Overview
The human body cannot function without proteins. Their roles in our bodies as biocatalysts and antibodies make them essential to everyday activities and protection against disease. Each of these proteins comprises a different configuration of amino acids, so researchers must understand those configurations in order to replicate them in the lab. Episode guest Ingmar Schuster explains that protein sequences are straightforward to model with machine learning techniques because each protein is simply a string of amino acids drawn from a 20-letter alphabet. He also says that, as changing just one amino acid in a protein has the power to change its effects, machine learning has an essential role to play in protein design and, therefore, the future of medicine.
When Ingmar spotted this role, he founded Exazyme, a startup for designing proteins with improved functionality. By using biocatalysts as a production resource, companies can make chemicals in more sustainable conditions, reducing waste, energy use and costs. Another major application of designing proteins is cancer care: creating antibodies that can better identify and combat cancers that may otherwise go undetected.
Most recently, Exazyme helped engineer a biocatalyst that enables plants to store more CO2 than is typical. Ingmar's collaborators had already used the brute-force approach, generating over 15,000 variants in the lab; Exazyme's AI then suggested just ten more, two of which succeeded, cutting the protein's energy demand by 50% and tripling its catalytic speed. Ingmar points to Professor of Chemistry and Computer Science Alán Aspuru-Guzik's coinage of the "self-driving lab", an improvement on the self-driving car in that the environment can be almost completely controlled. What self-driving labs mean for startups like Exazyme is that the controlled environment reduces noise in the experiment, enabling proteins to be optimized by machine learning with little interference. Finally, Ingmar emphasizes the incredible potential of computing in biochemistry, considering the reduced energy demands, saving considerable time and money.
Listen to the episode to learn more about the methods Ingmar uses at Exazyme, including a deep dive into kernel methods, the continued need for human input behind AI-driven outcomes, what Ingmar thinks about Europe's strict regulatory landscape for AI, and the benefits of 'shallow' vs deep learning.
In this episode you will learn:
- On designing proteins with AI [03:14]
- Designing proteins at Exazyme [08:22]
- About kernel methods [18:10]
- The importance of human-led approaches in protein research [35:44]
- Europe’s focus on AI regulation [43:45]
- Deep vs shallow in AI [59:35]
- How a background in academia helps with entrepreneurship [1:09:17]
Items mentioned in this podcast:
- This episode is brought to you by Gurobi
- Exazyme
- kernel methods
- passive cooling
- SDS 681: XGBoost: The Ultimate Classifier, with Matt Harrison
- Aapo Hyvärinen on intelligence and suffering
- Frances Arnold, Nobel prizewinner in chemistry
- Painful intelligence by Aapo Hyvärinen
- Noise by Daniel Kahneman, Olivier Sibony and Cass R. Sunstein
- Merantix AI Campus - The Merantix AI Campus’s Chief of Staff is Laurenz Lankes. You can reach him at laurenz@merantix.com to explore opportunities at the Campus for investors, AI startup founders and soon-to-be founders.
- Check out Lesson 2 of Jon Krohn’s “Deep Learning for NLP, 2nd edition” video course (available via oreilly.com) for an introduction to word vectors. You can use this link to get a free 30-day trial of the platform.
- Jon Krohn’s Podcast Page
Follow Ingmar:
Podcast Transcript
Jon Krohn: 00:00:00
This is episode number 739, with Dr. Ingmar Schuster, co-founder and CEO of Exazyme. Today's episode is brought to you by Gurobi, the decision intelligence leader.
00:00:15
Welcome to the Super Data Science Podcast, the most listened-to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I'm your host, Jon Krohn. Thanks for joining me today. And now, let's make the complex simple.
00:00:46
Welcome back to the SuperDataScience Podcast. Today's an exceptional episode with Dr. Ingmar Schuster, a visionary who's pioneering the use of AI to transform biology and chemistry research. Ingmar is CEO and co-founder of Exazyme, a German biotech startup that aims to make chemical design as easy as using an app, thereby shortening the way to solve the world's most pressing problems such as cancer and climate change. Previously, he worked as a research scientist and senior applied scientist at Zalando, the gigantic European e-retailer. He completed his PhD in computer science at Leipzig University, and postdocs at the University Paris Dauphine, and the Freie Universität Berlin, throughout which he focused on using Bayesian and Monte Carlo approaches to model natural language and time series.
00:01:27
Today's episode is on the technical side, so may appeal primarily to hands-on practitioners such as data scientists and machine learning engineers. In this episode, Ingmar details what kernel methods are and how he uses them at Exazyme to dramatically speed the design of synthetic biological catalysts and antibodies for pharmaceutical firms and chemical producers. This has applications including fixing carbon dioxide more effectively than plants, and allowing our own immune system to detect and destroy cancer. He also talks about when shallow machine learning approaches are more valuable than deep learning approaches, why the benefits of AI research far outweigh the risks, and what it takes to become a deep tech entrepreneur like him. All right, you ready for this fascinating episode? Let's go.
00:02:12
Ingmar, welcome to the Super Data Science Podcast. It's great to be with you here live in person at the Merantix AI Campus in Berlin. So Rasmus Rothe, one of the co-founders of the campus, I've known him for something like 15 years, and I had never been to Berlin before. About a month ago, I said, "I'm thinking about making a trip to Berlin. If I did that, could I stop by the AI campus?" And he said, "Absolutely, you can work here as much as you like." And then Arancha Gomez, who leads marketing, suggested to me some great people as guests to interview for the show while I'm here in Berlin. And Ingmar, we're now delighted to have you here.
Ingmar Schuster: 00:03:02
Thank you, Jon, for having me.
Jon Krohn: 00:03:04
So tell us about what you do. You're the CEO and co-founder of Exazyme, which is an AI protein design tool. And so I don't know how much people are aware that proteins are something you can design, that humans now have that power. So yeah, what does that mean?
Ingmar Schuster: 00:03:24
Yeah, that's true. Proteins aren't only something you eat to bulk up. Actually, the reason why we need to eat proteins is that they don't only make muscle fiber; they are responsible for basically all the important chemistry that's going on in our bodies, and in basically every other organism in this world. The most basic chemical task that they fulfill is catalyzing reactions in your body. For example, there's a biocatalyst, which is basically a protein that acts as a chemical catalyst, that takes CO2 out of your bloodstream, transports it into your lung, and releases it in gaseous form so that you can breathe it out.
00:04:10
And every chemical reaction going on in your body has to work at 35, 36 degrees Celsius, which is not what most catalysts that we use in the chemical industry do. So these biocatalysts can do this with very small amounts of energy. So this is one type of protein. And the other type of protein that's used a lot, especially in pharma, are antibodies, which most people will know from the news about Covid. Our bodies naturally build antibodies against diseases; they are basically an adapter so that the immune system can recognize and kill diseases.
Jon Krohn: 00:05:05
Yeah, very good example. I don't know if you roughly know the number of different kinds of proteins. I guess if you consider all the different kinds of antibodies, you're talking about billions of different configurations of proteins in each of our bodies.
Ingmar Schuster: 00:05:22
Yeah, exactly. So the nice thing about proteins from a machine learning standpoint is that there's a very simple way of describing them. Namely, it's just a string over an alphabet of 20 letters. Each letter stands for an amino acid, and you just string them up, and that's it. But because it's 20 letters, and we all know combinatorics, if you go to lengths like 300, 400, you already have more possibilities than atoms in the universe. This is the standard metaphor that everybody uses for combinatorics.
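As a quick sanity check of the combinatorics Ingmar cites, here is a minimal Python sketch; the roughly 10^80 figure is the commonly cited estimate for atoms in the observable universe:

```python
# The 20-letter amino-acid alphabet that protein sequences are drawn from.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

for length in (100, 300, 400):
    n_sequences = len(AMINO_ACIDS) ** length          # 20 ** length
    magnitude = len(str(n_sequences)) - 1             # order of magnitude
    print(f"length {length}: about 10^{magnitude} possible sequences")

# Atoms in the observable universe: on the order of 10^80,
# already dwarfed by sequences of length 100 (about 10^130).
```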
Jon Krohn: 00:06:01
Yes.
Ingmar Schuster: 00:06:03
So there's a lot of combinations, this means you can do a lot of very different things by seemingly tiny changes. Like sometimes just exchanging one letter, one amino acid can have extreme effects on say the speed of catalysis for a biocatalyst.
Jon Krohn: 00:06:24
Right. Right. Right. So yeah, small changes in the amino acid structure, say from some genetic mutation, sometimes won't impact the functionality of the protein, but that's the rare case. Most of the time, changing one of these genetic instructions that encodes that sequence of amino acids you just described will have some kind of effect, and usually a bad effect: of all the possibilities out there, it's probably going to negatively impact the capability of that protein. But then every once in a while, and this is what allows evolution to happen, by chance it ends up being better. Maybe that catalyst works a little bit better, or we're able to detect an infection a little bit more efficiently, and then that individual is more likely to survive and that mutation goes forward into future generations.
Ingmar Schuster: 00:07:26
Yeah, exactly. And what you just hinted upon, evolution. In machine learning, people who have been around for long enough, they've heard of this old method of evolutionary algorithms. This is exactly the idea of having a code that you change randomly, and then maybe you get something that's better for the purpose you're interested in. This is exactly what's happening in organisms.
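For listeners who haven't come across evolutionary algorithms, here is a minimal sketch of the mutate-and-select loop Ingmar describes, applied to a protein-like string. The fitness function is a toy stand-in invented for illustration; in directed evolution it would be a wet-lab measurement such as catalytic speed:

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(sequence: str) -> str:
    """Randomly exchange one position for a random amino acid."""
    pos = random.randrange(len(sequence))
    return sequence[:pos] + random.choice(AMINO_ACIDS) + sequence[pos + 1:]

def fitness(sequence: str) -> float:
    # Toy stand-in for a screening assay: count methionines ("M").
    return sequence.count("M")

parent = "".join(random.choice(AMINO_ACIDS) for _ in range(50))
for generation in range(200):
    children = [mutate(parent) for _ in range(20)]    # random mutations
    best = max(children, key=fitness)
    if fitness(best) >= fitness(parent):              # screen and select
        parent = best

print(parent, fitness(parent))
```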
Jon Krohn: 00:07:54
Mm-hmm. Yeah, these evolutionary algorithms are very cool. But getting back to the AI protein design idea: evolution figures out maybe better combinations just by randomness, and obviously we can't then be selectively breeding people to make that protein more likely in the population or something. So with AI protein design, when you're doing that at Exazyme, are you typically trying to recreate proteins that already exist, or are you sometimes designing new proteins that maybe have better functionality?
Ingmar Schuster: 00:08:35
It's mostly the latter. Because as you say, there's no chance ever to design what's going on in human bodies, in the sense of changing human bodies. There's gene therapy in the making, yes, but that's not the majority of what people are interested in when they design proteins. The majority of the time, people take the biocatalyst out of its host organism, and the host organism could be just bacteria or yeast, sometimes plants, sometimes animals. But then you're interested in the chemical reaction this catalyst catalyzes for making certain chemicals, for example in an industrial setting. BASF, the biggest chemical company, for example, but also many pharma companies, use biocatalysts as production routes. And why are they interested in this? Number one, very often they get much better performance using biocatalysts. They don't get as much waste product, for example. That's the first thing.
00:09:54
The second thing is they can run these reactions in much more sustainable conditions, like lower temperatures and nice solvents like water. They don't need aggressive solvents or anything. So just from an economical standpoint, it's very, very nice to be able to produce stuff with biocatalysts. This is one thing. The other thing is antibodies. The antibodies that we naturally build as humans or mammals are of a kind that works for many, many diseases, but it does not work, for example, for many cancers. Because cancers are really just our own tissue that's misbehaving, it's very hard to distinguish a cancerous part of your body from a healthy part of your body. And this is where the normal antibodies that we naturally produce with our biology don't work. And people from pharma companies come in and say, "Well, we have to do something more advanced. We have to design artificial antibodies in order to combat these types of cancers."
Jon Krohn: 00:11:12
Okay. Okay. So I'm starting to piece together here that it wasn't a coincidence or an accident that the two types of protein categories you kicked off your explanation with were biocatalysts and antibodies. So it sounds like these are two areas that Exazyme specializes in. And so you have two different sets of clients, I guess. One is interested in these kinds of industrial applications, like BASF, and then the other set is this medical application, cancer detection.
Ingmar Schuster: 00:11:45
Actually, yes. So biocatalysts are interesting to chemical companies and industrial biotech companies. The medical application, let's say pharma is interested in both antibodies and biocatalysts, because they use biocatalysts in order to make even their normal drugs in a more sustainable and profitable fashion.
Jon Krohn: 00:12:10
Nice. Yeah, that makes a lot of sense. So, without obviously going into proprietary secrets that you can't disclose on air, are there some concrete examples you can give in each of those sectors?
Ingmar Schuster: 00:12:27
Yes. So we've mostly worked on biocatalysts up to now. One of our recent success cases was where we engineered a biocatalyst that plants use to fix CO2. So basically, take the carbon atom out of CO2 and use it to build up biomass. That's what plants do all day long: they take CO2 out of the air and use the carbon atom to build up sugars and basically everything else that they make, using the energy they get from the sun. And our collaborators designed a new biocatalyst that can do this. Their goal, basically, is to enhance plants to take up and store more CO2 than they usually do. And they were trying standard methods for optimizing this biocatalyst, and they made a lot of protein variants. The first method they used was what in machine learning we jokingly call student descent: somebody reads papers, thinks very deeply about "what can I do?", and then tries out stuff. That's what they did first.
00:13:49
Then they came along with the brute-force method, which is basically an evolutionary algorithm implemented in the wet lab, meaning you just create random changes to the protein sequence, random mutations, that's where the name comes from, and then screen for which variant is improving. And this way, they generated over 15,000 different variants in the lab, and they got to a certain point, and that was it. And we started talking to them, and they were very cautious, but said, "Hey, everybody's talking about AI, so yes, okay, we are willing to give it a try." We gave them 10 more suggestions after the 15,000 that they had made and measured previously. And of these 10 suggestions, two improved the performance. One cut the energy demand in half, and the other tripled the catalytic speed. So yeah, this is our latest big success case.
Jon Krohn: 00:14:59
Gurobi Optimization recently joined us to discuss how you can drive decision-making, giving you the confidence to harness provably optimal decisions. Trusted by 80% of the world's leading enterprises, Gurobi's cutting-edge optimization solver, lightweight APIs, and flexible deployment simplify the data-to-decision journey. Gurobi offers a wealth of resources for data scientists, webinars like a recent one on using Gurobi in Databricks. They provide hands-on training, notebook examples, and an extensive online course. Visit gurobi.com/sds for these resources and exclusive access to a competition illustrating optimization's value, with prizes for top performers. That's G-U-R-O-B-I .com/sds.
00:15:44
Yeah, that's wild. So can you dig a bit into the AI, the machine learning, that you're using to make those recommendations? And were they blown away?
Ingmar Schuster: 00:15:58
They were very surprised. They were quite blown away. They couldn't imagine why the hell this worked. Because the PhD student who worked with us said, "Yeah, but it's very far from the active site." The active site is the place where the action happens in the biocatalyst. "So the mutation your AI suggested is very far from the active site. Do you have any idea why it suggested that?" And I was like, "No, no idea. It's a black box even to me." And so they were trying to get the 3D structure of this protein in order to understand it. And maybe this is also an interesting part, because many, many people know AlphaFold. People in biocatalysis use AlphaFold because it gives them the 3D structure that a protein sequence translates to, through what's called folding.
00:16:59
And from the 3D structure, they're trying to gain insight. We take a shortcut. We don't go through the 3D structure, through folding. We directly go from the sequence to the performance that you're looking to improve, basically with a regression model. And the methodology we use, I won't go into detail, but really the first method that we used is, of all things, kernel methods, which many people still might connect to support vector machines. It didn't change for quite a long time, because there was no need to go to another method. Of course, we also use deep learning nowadays, but we are really big believers in metrics. So whatever performs best, that's what we use.
Jon Krohn: 00:17:54
Yeah. We're going to talk in more detail later on. We have a whole topic area around how deep learning is not always the answer, and shallow methods can be more useful. So it's not surprising for me to hear you say that at this time. And yeah, we'll dig into that more later. For now, you said that you used a kernel method. And yeah, for me, when I hear kernel method, I think only of support vector machines, but the way that you phrased it there, it sounds like it's something else.
Ingmar Schuster: 00:18:23
Yes. So support vector machines are really old-fashioned now in kernel methods.
Jon Krohn: 00:18:28
Oh.
Ingmar Schuster: 00:18:30
Still performing very well, but old-fashioned in the sense that they've been around for a long, long, long time. And the way people thought about kernels back then has gotten several updates. And it's by far not as fast-moving as deep learning research, I guess mainly because people in kernel methods insist on theory. And I think it's a very nice property of many shallow methods that you can do theory, but the fact that you can tinker so much and so easily in deep learning is probably what has gotten deep learning this big following.
Jon Krohn: 00:19:16
Yeah, deep learning famously, there's very little explanation for many of even the core capabilities of these models. It's just, it works.
Ingmar Schuster: 00:19:24
Yeah, true.
Jon Krohn: 00:19:26
Okay. Well, that's really cool. And so maybe it is something proprietary and maybe it's something you can't disclose, but if it's a kernel method that's not an SVM, are you able to tell us generally what this method is called?
Ingmar Schuster: 00:19:40
It's not published actually.
Jon Krohn: 00:19:41
Okay.
Ingmar Schuster: 00:19:42
So that's why I can't give a name because I never had to find one for any paper.
Jon Krohn: 00:19:48
Yes. Okay.
Ingmar Schuster: 00:19:50
But basically, well, unsurprisingly, it's a way to embed a sequence into an embedding space. This is very standard. So in deep learning, you would, for example, embed into a fixed-dimensional space, and then do a regression problem from that, or maybe you would use the regression variant of transformers with an artificial token at the beginning, whatever. Here, it's an embedding into an infinite-dimensional space, in theory at least. And then you do a regression problem from there, which works with very few data points. That's the big advantage of this.
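Exazyme's own kernel is unpublished, so as a stand-in, here is a minimal sketch of the general recipe Ingmar describes, pairing a classic k-mer spectrum kernel with kernel ridge regression; the sequences and performance values are invented for illustration:

```python
import numpy as np
from collections import Counter

def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count all overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Inner product of k-mer count vectors: one classic sequence kernel."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return float(sum(ca[m] * cb[m] for m in ca))

# Toy training data: (variant sequence, measured performance) pairs.
train = [("MKTAYIAK", 0.3), ("MKTAYIWK", 0.9), ("MKSAYIAK", 0.4), ("AKTAYIAK", 0.2)]
seqs = [s for s, _ in train]
y = np.array([v for _, v in train])

# Kernel ridge regression: solve (K + lambda * I) alpha = y.
K = np.array([[spectrum_kernel(a, b) for b in seqs] for a in seqs])
alpha = np.linalg.solve(K + 1e-2 * np.eye(len(seqs)), y)

def predict(seq: str) -> float:
    """Predict a new variant's performance from its kernel similarities."""
    return float(np.array([spectrum_kernel(seq, s) for s in seqs]) @ alpha)

print(predict("MKTAYIVK"))
```

Note how the fitted model has one coefficient per training point, which is where the "as many dimensions as data points" picture Ingmar gives later in the conversation comes from.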
Jon Krohn: 00:20:38
Okay. So yeah, and something that we also will dig into more later, although maybe now is just starting to sound like the right time, is that you come from a background in natural language processing, where we are concerned with taking sequences of characters. The word cat, the word bat, the word mat. There's commonality between the characters in these, but the initial character greatly changes the meaning of the word. And so in natural language processing, the standard today, all the way up to the most sophisticated large language models. At the time of recording, GPT-4 is probably the leader, and then Anthropic's Claude, so these huge large language models are concerned with taking sequences like cat, bat, mat, and part of their processing involves converting them into an embedding space of meaning.
Ingmar Schuster: 00:21:36
Yeah.
Jon Krohn: 00:21:36
And as you say, in natural language processing, there would be a fixed number of dimensions. And this is a hyperparameter that we decide on. The more dimensions you have, theoretically the more granularity you can have in the way that meaning smears around, but there's also more compute, so it's a trade-off. So you come from this natural language processing background, and it seems like there is an analogy to what you're doing now. Is that right?
Ingmar Schuster: 00:22:05
There is an analogy, yes. Although there also is an analogy to time series, which is what I worked on before starting this company. I think this is the beauty of machine learning: when data looks very similar, you can treat it in a very similar fashion. And yes, many companies that have started out in AI for protein design basically use transformer models trained on protein sequences. So it's more or less taking the same model and just training it on a different type of sequence, and that's kind of it. So yes, it's super related, but in the end it's all just statistics, like fitting massive models onto a sequential type of data. Both natural language and proteins are just sequences of characters to the computer, so it doesn't really matter. And if you take an even more abstract view, then it's just sequences of bytes really that you learn to fit, like time series, for example. So time series models are what I initially looked at when developing this kernel embedding of sequences. And it's very simple to just massage your algorithm to take characters instead of numbers.
Jon Krohn: 00:23:46
Very cool. So I think we'll get into the time series stuff a little bit more later on. For now, there's something that I think would be interesting, if you have a way of describing it. I've been hosting this show for almost three years now, and I don't think we've gone, in fact, I'm certain that we haven't gone into any level of detail on kernel methods. So maybe there's a way that you could explain to our audience what this means in a general sense, or you could use an example from your work. But how does it work?
Ingmar Schuster: 00:24:28
Yeah. So one of the modern ways of looking at kernel methods is to look at them as just another type of vector space. For example, in natural language processing, before transformers came around, before everybody hit everything on the head with the same hammer, which is what you do now, people looked at vector space models of the meaning of language. So there was one vector for representing the meaning of dog, another vector for representing the meaning of cat. And there was a matrix for adjectives, for example. So when you said green dog, you had the matrix for green and the vector for dog, and taking the product between the two gave you the new meaning for green dog. And with vectors, you can add them, you can compute angles between them, cosine similarity, this is what people probably have heard of.
00:25:45
You can do exactly these things in kernel methods: matrices, which are then called operators, and vectors, which in this case are just infinite-dimensional vectors, in theory at least. And you can add up these infinite-dimensional vectors to make up more complex objects in this vector space. And really, it takes some getting used to. As I said, people in kernel methods really like to use mathematical language, and sometimes it only looks formal and theoretical, whereas in reality it's a very heuristic method. It takes some getting used to, but in the end, it's just adding together vectors, computing angles, and so on and so forth. The very modern type of these methods allows you to represent probability distributions inside the vector space. So there's a whole host of papers that have been published in recent years, some of them by myself and my collaborators, about how you can represent the distribution of a dataset as a kernel vector space object, and how you can translate between...
00:27:14
For example, how you can take the distribution of a time series at one time point and transport it to the next time point. And then you can do all of these classical, old types of things like component analysis, ICA, PCA, on these operators. We've applied this to high-def video, and you get out very interesting, very meaningful components. Many people will know variational autoencoders, which are another way of trying to get meaningful components out of your data; here, you basically get unsupervised methods in a very simple fashion out of these kernel methods.
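One concrete, published instance of what Ingmar describes, representing the distribution of a dataset as a vector-space object, is the kernel mean embedding. A minimal sketch with a Gaussian kernel, using the maximum mean discrepancy (MMD) as the distance between two embedded distributions:

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def mmd2(X, Y, gamma=1.0):
    """Squared MMD: squared distance between the kernel mean embeddings
    of samples X and Y (a biased, brute-force estimate)."""
    kxx = np.mean([rbf(a, b, gamma) for a in X for b in X])
    kyy = np.mean([rbf(a, b, gamma) for a in Y for b in Y])
    kxy = np.mean([rbf(a, b, gamma) for a in X for b in Y])
    return kxx + kyy - 2.0 * kxy

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(100, 2))   # sample from one distribution
Y = rng.normal(1.0, 1.0, size=(100, 2))   # sample from a shifted one

print(mmd2(X, X[:50]))   # near zero: same underlying distribution
print(mmd2(X, Y))        # clearly larger: different distributions
```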
Jon Krohn: 00:28:03
Like a couple of the recent episodes, I've recorded today's episode live at the Merantix AI campus in Berlin. The inspiring campus is Europe's largest AI coworking hub, as it houses over 1,000 entrepreneurs, researchers, investors, and policymakers. If you'd like to be associated with the exciting, inspiring Merantix AI Campus yourself, I've put the email address of their Chief of Staff Laurenz Lankes in this episode's show notes. Laurenz can fill you in on how Merantix incubates and invests in early-stage AI ventures. If you are a founder or soon to be founder in the AI space, they'd love to hear from you.
00:28:38
Very cool. Yeah, I've been consumed mostly by deep learning in recent years, getting close to a decade now of immersion in that. And there's so much more here in kernel methods, whether support vector machines or not, for me to dig into. Fascinating. Maybe I'll have to have an episode dedicated to it at some point. One particular thing about it that I'm struggling to wrap my head around: you've said a couple of times that the number of dimensions is infinite, in theory at least.
Ingmar Schuster: 00:29:15
In theory, yes.
Jon Krohn: 00:29:16
So going back to the example of natural language processing, where I would use maybe a deep neural network, or a shallow neural network with a now relatively simple and well-known method like Word2Vec, to convert a long corpus of natural language. So taking a whole bunch of books, or all of the language on the internet, you can use that to map into an embedding space the meaning of any of the words in that corpus that occur at least a handful of times.
Ingmar Schuster: 00:29:59
Yeah.
Jon Krohn: 00:30:00
The details of that, it's probably not great for a podcast format to try to explain, but I do have content elsewhere, which I'll put in the show notes, for explanations of how Word2Vec works. But when we do that, as I already said maybe five minutes ago in this episode, we specify as a hyperparameter how many dimensions we want the words to be mapped into in that embedding space. And so I could say, "Okay, well, my downstream natural language task isn't going to be very complicated. Maybe all we're doing is predicting whether a document has positive sentiment or negative sentiment. And so maybe I'll try using just 32 dimensions or 64 dimensions, because I'll prioritize this being cheap compute, cheap cost for me to run this in production and train. But then for a more complex task, like a generative AI task, maybe then I want several hundred dimensions, maybe getting close to the 1,000-dimensions kind of area."
00:31:12
So in those cases, we have a very concrete... For every one of our words, you could think of this as a row, and for every one of the words, we then have as many columns as there are dimensions. And so each word has this very specific place in this high-dimensional space, dictated by that vector, that row that the word has, of length equal to the number of dimensions. And it becomes then easy, you're talking about cosine similarity. We can then take the cosine similarity between the row dog and the row cat, and we'll see, okay, those are probably in a somewhat similar region. They'll have a pretty high cosine similarity score relative to, say, cat and mat, which are very different in meaning, even though there's similarity in the characters. I don't know, so when I think about embedding spaces, it is this very concrete, very specific thing. I know exactly how many rows I'm going to have: that's as many words as I have in my vocabulary. I know exactly how many columns I'm going to have: that's exactly the number of dimensions that I've specified I want in my embedding space. So I can't wrap my head around how you could have an infinite number of dimensions. I don't even know what that means.
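To make Jon's rows-and-columns picture concrete, here is a minimal sketch with hypothetical 4-dimensional vectors invented for illustration; a real Word2Vec model would learn such vectors from a corpus, with the dimensionality set as a hyperparameter:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Each word is a row; each of the 4 columns is one embedding dimension.
embeddings = {
    "dog": np.array([0.9, 0.8, 0.1, 0.0]),
    "cat": np.array([0.8, 0.9, 0.2, 0.1]),
    "mat": np.array([0.1, 0.0, 0.9, 0.8]),
}

print(cosine_similarity(embeddings["dog"], embeddings["cat"]))  # high: similar meaning
print(cosine_similarity(embeddings["cat"], embeddings["mat"]))  # low: similar spelling only
```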
Ingmar Schuster: 00:32:28
Yeah. Yeah. I guess the language just blows people's minds and they're like, "What the hell are you talking about?" So when I say that in theory you have infinite dimensions, in practice it means, not always, but for most kernel algorithms, that you have as many dimensions as you have data points. So you always know: okay, my dataset is of size, I don't know, a million samples, so my vector space is going to have size 1 million. I have a million and one samples, so I have a vector space of, in reality, a million and one dimensions. So there's several things to say about that. When you say typically we fix the dimensionality of the vector space, that's true for many, many methods. So if you have an encoder-decoder architecture and you want to fix the encoding size, you can do that.
00:33:40
If you look at the basic transformer architecture of self-attention and trying to predict the next word, what you really have is a mechanism that looks at all the different words that come before it, in principle no matter how many come before it. Of course, we have practical limitations, because computers can only handle so much data before the compute explodes. But in principle, every word coming before the word that you're trying to predict helps you predict this word. And what you're really learning is which words to look at preferentially, attention basically. So it's exactly the same mechanism. And unsurprisingly, the vanilla transformer has a quadratic computation cost, and kernel methods do as well.
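A small sketch of the structural parallel Ingmar draws here: both self-attention and a kernel method build an n-by-n matrix of pairwise similarities over n items, hence the quadratic cost. Toy data; the Gaussian kernel is just one illustrative choice:

```python
import numpy as np

n, d = 6, 4                                # n tokens / data points, d dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))

# Self-attention scores: an n x n matrix of scaled pairwise dot products...
scores = X @ X.T / np.sqrt(d)
attention = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# ...and a kernel Gram matrix is likewise n x n pairwise similarities.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
gram = np.exp(-0.5 * sq_dists)

print(attention.shape, gram.shape)         # both (n, n): quadratic in n
```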
Jon Krohn: 00:34:42
There you go. Okay, cool. So kernel methods allow you to attend to the most important words ahead of whatever our word of interest is.
Ingmar Schuster: 00:34:51
If you would build a predict-the-next-word type of algorithm with them, then yes, that's how that would work.
Jon Krohn: 00:35:01
Wow, cool. Yeah, that is completely news to me, but also it's starting to make sense to me. So I really appreciate your explanation. So thank you for indulging me on this deep dive into kernel methods, and I definitely have a better understanding. So hopefully some of the audience has come away with a bit of a better understanding as well. So we got onto all of this because we were talking about how Exazyme provides an AI protein design tool. So now we have some idea of how it works. We have some idea of the application areas, namely biocatalysts for chemical companies, and then biocatalysts and antibodies for pharmaceutical companies. How does a human provide value in the process of doing this work? To what extent is this an automated process, or to what extent does a human still get involved and get creative in allowing some of these applications, these methodologies to bear fruit?
Ingmar Schuster: 00:36:10
So I think what a human will always be needed for is setting the goal: what do we want to achieve? And for quite some time, humans will also set up the measurement environment. As Frances Arnold said, who developed directed evolution, which is this brute-force, lab-only method hammer that you can apply to protein optimization, "You get what you screen for." So if you measure with low measurement noise, and you measure the thing that you're actually trying to optimize, then you will do a much better job of optimizing your protein. If you use directed evolution, then you will get quite some value out of good measurement practices. If you use AI methods to make more sense of your measurement data, then you will make even better use of them. And even Frances Arnold, who got the Nobel Prize for this method five years ago, is working on AI methods for protein engineering these days.
Jon Krohn: 00:37:28
Cool. That is a nice historical data point.
Ingmar Schuster: 00:37:31
Yeah.
Jon Krohn: 00:37:33
Awesome. So humans need to decide the objective, and that's it?
Ingmar Schuster: 00:37:38
That is what will always be there. Nowadays, what we often do is we have the machine generate suggestions, and then a human comes aboard and says, "Yeah, but I know that X, Y, Z." And sometimes they also look at the fold, look at the protein fold generated by AlphaFold or ESMFold, there's a bunch of different packages doing this now, and try to gain more insight from that and maybe some biochemical intuition. If you have perfect measurement data, this is not necessary. Perfect measurement data and say 20 data points, then this is not necessary. But often you don't have perfect measurement data. But the vision overall, I think is what-
Jon Krohn: 00:38:35
Of role?
Ingmar Schuster: 00:38:36
Sorry?
Jon Krohn: 00:38:37
The vision of?
Ingmar Schuster: 00:38:38
The vision overall.
Jon Krohn: 00:38:39
Oh, overall. Sorry.
Ingmar Schuster: 00:38:41
Yes.
Jon Krohn: 00:38:42
I was like... Yeah, I got you. Continue.
Ingmar Schuster: 00:38:42
Yeah. The vision overall is, Alán Aspuru-Guzik, who is a professor in Toronto, he's calling this the self-driving lab vision, like self-driving cars, only for a lab. Interestingly, the self-driving lab is much easier than the self-driving car, just because you have a controlled environment; there's no pedestrian walking into your lab robot at any point in time that the robot has to avoid. You can have a very controlled lab. And if you can automate the synthesis of, say, proteins in our case, and the measurement, and then hook this up to machine learning, then basically, after setting up the measurement so that you have low measurement noise, you can just press a button, wait, and have your protein optimized for you, really.
Jon Krohn: 00:39:43
Very cool. Yeah. Self-driving labs, that'll be exciting, and it does seem like we are moving in that direction. I think there are probably some limited numbers of cases where it is happening.
Ingmar Schuster: 00:39:54
Yes. It's been a very recent achievement that several groups around the world, both in industry and academia, have achieved an automated build, measure, learn cycle.
Jon Krohn: 00:40:15
Very cool. Yeah, something definitely to watch out for. And I've had my eye on this topic for a few months now, and I'm thinking of eventually having an episode that is dedicated also to this self-driving lab idea. So we dug up in our research that there was a link between using AI for chemistry and climate change.
Ingmar Schuster: 00:40:43
Yes, in many respects, because we don't have time for the old type of chemistry. Chemistry is one of the most energy intensive sectors that we have. And for example, developing catalysts that use less energy is one of the major levers that we have.
Jon Krohn: 00:41:10
Right.
Ingmar Schuster: 00:41:11
That's number one. Number two, many projects about CO2 fixation use biocatalysts. So one of them we talked about earlier, where our collaborator used our AI to teach plants to take up more CO2 basically. So there's many, many directions where you can use AI technology to help you make more sustainable and more profitable really, chemical routes and processes.
Jon Krohn: 00:41:51
That is the dream situation policy wise, when you don't even really need to be making a policy argument because the economic argument makes it for the corporation. And you think, "Okay, here's another way, this AI-driven way that I could be developing and applying my catalyst to my chemical reactions and save money, and so why not go down that route?"
Ingmar Schuster: 00:42:15
Yeah, exactly. And luckily, molecules don't have privacy rights. So that's one of the reasons why we thought this is a good application of AI.
Jon Krohn: 00:42:26
Yeah, I guess it's something that there's no point in us spending too much time on in this episode, but it is a bit of an interesting thing, where in Europe there is a more advanced regulatory climate relative to the U.S.
Ingmar Schuster: 00:42:44
That's a euphemism. Yes. Yeah, Europe likes to regulate things that barely exist in Europe, because they are not very good at helping the innovation along. That's not quite true, of course; AI exists in Europe, absolutely. But let's say, if European politicians were as good at pushing new technology as they are at regulating it, then I would be very happy.
Jon Krohn: 00:43:12
Yeah. Yeah. That's a good point. And obviously we are here at the Merantix AI Campus in Berlin, and there's a huge amount of AI innovation happening here. But perhaps, as you say, that's a really nice way of putting it: if the same amount of effort went into fostering AI as went into regulating it, even just trying to be like 50/50 on time spent, I guess.
Ingmar Schuster: 00:43:35
Yeah. Absolutely.
Jon Krohn: 00:43:36
Because the regulatory stuff absolutely matters. But you hear a lot, at least from overseas: of the news that I get, it's got to be 99% of the AI-related news I get from Europe that is regulatory related.
Ingmar Schuster: 00:43:52
Yes. Yeah. I think they take their GDPR regulation, so basically European privacy law, as a big success, because it was copied across the world, it seems. But they really also managed to put a lot of roadblocks in companies' ways. I knew a lot of people that were completely unsure about what GDPR meant when it was introduced. And I think now everybody has calmed down, but they could do a better job of communication. Although I think the bigger issue for Europe, really, is that this promise of the unified market has fallen flat. It's way too fragmented, I think. That's probably one of the reasons why the U.S. has been able to grow big tech companies: because they have a big unified market, really.
Jon Krohn: 00:45:09
As we often discuss on air with guests, deep learning is the specific technique behind nearly all of the latest artificial intelligence and machine learning capabilities. If you’ve been eager to learn exactly how deep learning works, my book Deep Learning Illustrated is the perfect place to start. Physical copies of Deep Learning Illustrated are available in seven languages but you can also access it digitally via the O’Reilly learning platform. Within O’Reilly, you’ll find not only my book but also more than 18 hours of corresponding video tutorials if video’s your preferred mode of learning. If you don’t already have access to O’Reilly via your employer or school, you can use our code SDSPOD23 to get a free 30-day trial. That’s S-D-S-P-O-D-2-3. We’ve got a link in the show notes.
00:45:59
Yeah. And the culture is relatively homogeneous as well in the U.S.
Ingmar Schuster: 00:46:03
Yes, exactly.
Jon Krohn: 00:46:05
Yes, there are other languages spoken. And yes, a night out or a dining experience in New Orleans is going to be quite a bit different from New York. But nevertheless, there are still all kinds of commonalities: you're still ordering in the same language, you're getting a lot of the same foods, and things like fast food chains are able to operate across the U.S. and Canada. You could theoretically offer exactly the same thing on the menu everywhere in the country. And I'm sure some McDonald's franchises are making minor menu changes, but they wouldn't need to. You have this relative homogeneity, whereas in Europe, obviously, you have so many different cultures, which is what makes it such a fantastic place to visit, because you're an hour or half-hour flight away from a vastly different culture. But that makes it harder to create a product that is going to suit all of these different cultures' needs.
Ingmar Schuster: 00:47:15
In part, yes. I do enjoy the cultural richness. I guess a less-fragmented market would naturally lead to more homogeneity also in the way people live, yes. But for some products, it is very easy to have a common market, especially digital products, especially B2B types of business. That would be very easy. But it's even hard for us to hire somebody from France and get them health insurance after a year's time.
Jon Krohn: 00:48:01
Oh, really?
Ingmar Schuster: 00:48:02
Yeah, it's unbelievable.
Jon Krohn: 00:48:03
I assumed that that stuff was relatively straightforward.
Ingmar Schuster: 00:48:07
That's what you would assume, yes. I would have before this as well.
Jon Krohn: 00:48:11
Wow. Yeah. That's wild. Yeah, I assumed that, along with the free movement of people, you'd truly be able to hire any talent in the EU.
Ingmar Schuster: 00:48:22
You are. You are. You can hire any talent, only then you have folders of this size of actual physical paper that you have to fill out.
Jon Krohn: 00:48:32
For people not watching on YouTube, Ingmar gestured a stack of papers several feet high.
Ingmar Schuster: 00:48:39
Yes, correct. So of course, this is not the whole truth. You can do some things digitally, but there's a lot of administrative hassle, let's say that.
Jon Krohn: 00:48:53
Cool. Well, moving on from the regulatory stuff and more into the specific expertise that you can provide us on Exazyme-related innovation. There might be listeners out there wondering whether there are risks associated with technologies like AI protein design as well. So maybe a good question to get into would be: with these kinds of technologies that are allowing us to emulate or improve upon biological proteins, what are the risks associated with that? And then what are the benefits?
Ingmar Schuster: 00:49:36
Yeah. The risk is basically the risk of any tool. I don't know. You can use a screwdriver to assemble your cupboard or to help fix your car, and you can use it to stab somebody, right?
Jon Krohn: 00:49:55
Right. Everyone's favorite stabbing tool, the screwdriver.
Ingmar Schuster: 00:49:59
Yes, exactly. And just like any tool really, AI is dual use. And yes, of course you can use it to, I don't know, make a poison, let's say if you want to. But nobody is regulating screwdrivers or knives for that matter, because knives are better stabbing tools.
Jon Krohn: 00:50:28
They are a better stabbing tool.
Ingmar Schuster: 00:50:29
But they are also better in the kitchen. So nobody's forbidding knives, because you can use them to cook delicious meals in much shorter time than if you were using a stick. And the same is true for AI, for protein design or chemical design overall. You can use older methods to try and come up with catalysts that break down CO2, and you can spend enormous amounts of time and money on this, and you will at some point come to the same solutions, only it takes you 100 times or 1,000 times as long and costs as much more. So I would not say that we should ditch this opportunity just because you can also do bad things with it. There's risk anywhere in life, really. I remember being a postdoc in Paris, and there were these really appalling attacks at the Bataclan.
Jon Krohn: 00:51:49
Bataclan?
Ingmar Schuster: 00:51:51
The Bataclan was this venue where some people, I think from Belgium, came and caused a bloodbath.
Jon Krohn: 00:51:57
In an Eagles of Death Metal concert.
Ingmar Schuster: 00:51:59
Yes, exactly.
Jon Krohn: 00:52:00
Yeah.
Ingmar Schuster: 00:52:00
Yes. And some of my colleagues at the university actually lived across the street, and they were really shocked by this. Everybody was shocked by this. And there was all of this military in Paris, walking around the city with machine guns. And I stood in front of an arts museum, and I saw this long queue of people standing there. And yes, they were standing there in order to get through security to get into the museum, but anybody, any terrorist with the same machine gun, could have just gone and shot everybody in the queue. So you do have risk, and you should think about the risk. But in this case, there are so many opportunities attached to it that you have to follow this path.
Jon Krohn: 00:52:55
There's actually a wild vulnerability that I had never thought of, which is that this is the same in airports too.
Ingmar Schuster: 00:53:02
Of course.
Jon Krohn: 00:53:04
Once you're inside through security, of course that's a very safe place to be, but you're sitting ducks when it's a jam-packed queue there.
Ingmar Schuster: 00:53:12
Of course.
Jon Krohn: 00:53:13
And obviously, there's no pre-security-line screening.
Ingmar Schuster: 00:53:17
Of course.
Jon Krohn: 00:53:18
Geez.
Ingmar Schuster: 00:53:18
Yes.
Jon Krohn: 00:53:19
I'm never going to stand in an airplane line feeling safe again.
Ingmar Schuster: 00:53:24
It doesn't matter. You walk around the city. If somebody wants to hurt you, they can.
Jon Krohn: 00:53:29
Yeah. We're sitting ducks in that airplane scenario. Geez. That's the most dangerous part of the flight.
Ingmar Schuster: 00:53:36
Yeah.
Jon Krohn: 00:53:37
Oh, my goodness. Yeah. So yeah, the point is that there are risks associated with AI. It's dual use, just like almost any tool out there, I'm sure, but the benefits outweigh the risks. And I think that's especially true here, when you have technology that can play a part in dramatically transforming industries, as the examples you've provided so far today show, whether it's chemical synthesis or pharmaceuticals. But there are also applications coming in this area, maybe in the coming decades. You talked earlier in this episode about gene therapies, and gene editing is also something that, thanks to CRISPR, has become much easier in recent years to do in a very targeted, very cost-effective way. And so, yeah, I don't know if you have any thoughts on, maybe thinking ahead decades, the kinds of technology that you're working on here, an AI protein design tool. Do you have any big-picture ideas of huge societal benefits that could be unlocked by technologies like this?
Ingmar Schuster: 00:54:52
Apart from what we talked about already, like CO2 recycling is one of the big current trends, antibodies.
Jon Krohn: 00:55:00
Right, CO2 recycling. Yeah. Yeah. Yeah.
Ingmar Schuster: 00:55:02
And designing them with AI is one of the big trends really in the big pharmaceutical companies as well. I think chemistry in general, the design of molecules other than proteins, also has many, many interesting applications. One of my favorites these days is passive daytime cooling, which is not very practical at the moment. The idea, maybe just to give a one-sentence introduction: you have a material, say a paint or something, you paint your house with it, and it's a passive AC. So you don't need any energy, and it cools down whatever it covers, because it reflects sunlight and radiates heat away in a way such that you cool down below ambient temperature. And this, I think, is really one key for combating climate change. Because as the world gets hotter, if everybody installs energy-hungry ACs, we are in a vicious cycle, and this is one way to break it.
00:56:15
And I think in many respects, making new materials that are light and stable at the same time is an important part. And I think overall, the chemical and pharmaceutical industry is really kind of stuck in the 19th century. Not really, but it's been much slower to move. And AI gives us the opportunity to really digitize this industry from the ground up, because doing anything in the physical world is always expensive. So if you can do as much as possible in the computer, with as little energy demand as possible, even for the computer, it will save you huge amounts of time and money.
Jon Krohn: 00:57:12
Yeah. Famously, in recent decades, the cost of developing a pharmaceutical drug has just ballooned astronomically.
Ingmar Schuster: 00:57:19
Yes.
Jon Krohn: 00:57:19
And then that means that downstream, the cost to the patient or the insurance company, and ultimately the patient or their employer, is somehow paying for that. And these huge development costs also create a risk-aversion culture in pharmaceuticals.
Ingmar Schuster: 00:57:45
Yes. Absolutely, yes.
Jon Krohn: 00:57:46
And so yeah, this is a very exciting application area that I've thought about before, for sure. The ability, like you're saying, to digitize, to be able to do in silico these kinds of estimations. You already gave us an amazing example at the outset of this episode, where a client of yours had laboriously, expensively run tests on 15,000 different candidate proteins.
Ingmar Schuster: 00:58:17
Yeah, exactly.
Jon Krohn: 00:58:18
And then you were able to blow their minds by suggesting 10. And some of those 10 seemed like unusual suggestions from the AI, but in many cases, like you described, they ended up having a huge impact. So this in silico approach, yeah, the cost savings. In startups, we talk about whether you can show some 10X kind of multiple, like a 10X efficiency increase. With this, you're talking about many orders of magnitude more than that.
Ingmar Schuster: 00:59:00
Yes, two to three orders of magnitude. Yes, exactly.
Jon Krohn: 00:59:04
So yeah, wild. Very cool. All right. So we promised our listeners much earlier in the conversation, we put a pin in this idea of deep versus shallow ways of achieving these kinds of enormous multiples. So let's dig into that. This is something that you mentioned to me a week ago. We had a pre-call to talk about what kinds of topics we could be covering, and when you found out how technical our audience is here at Super Data Science, you thought a great topic to cover, and this isn't something that I've had on air before, is this idea we alluded to earlier that in many cases shallow approaches to machine learning can be preferable. So I guess maybe you can give us a definition there. I'm assuming in my head this means anything that isn't a deep neural network. So it could be a shallow neural network, or I guess any other machine learning approach?
Ingmar Schuster: 00:59:59
Yeah. So I'm using the word shallow, just as you say, to contrast with deep learning. And I think what would count as shallow is everything with exactly one non-linearity layer, and that's kind of it. Maybe a matrix before: so a linear layer before and a linear layer after. Kernel methods, which we have talked about, are one such example. In many cases, at least, you can also make them deep, but let's not dig into that. So: one linear layer before a non-linearity, one linear layer after, and that's kind of it. Why do I think this is interesting? Because, number one, in extremely many cases, you can get exactly the same performance out of these architectures. And number two, you can often compute them in a much more efficient way. And even if you choose to go sequential, you can do so in a much more efficient way than you could in deep networks.
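One way to make that "linear layer, one non-linearity, linear layer" description concrete is the classic random-features model, sketched below. This is a generic illustration rather than Exazyme's method, and only the final linear layer is trained:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 200, 5, 1000

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)    # toy regression target

W = rng.normal(size=(d, width))                   # linear layer (fixed, random)
H = np.maximum(X @ W, 0.0)                        # exactly one ReLU non-linearity
beta, *_ = np.linalg.lstsq(H, y, rcond=None)      # linear layer (trained)

print(np.mean((H @ beta - y) ** 2))               # training error
```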
01:01:11
So I think there have been a few papers out in recent years that have looked at classical methods, like for example random forests, and made the connection between random forests and deep networks. In kernel methods, there are quite a few papers that show: okay, if you take multilayer perceptrons or several other architectures, ResNets, ConvNets, and take a certain limit, for example the classical limit where you make the layer width go towards infinity, what you get out is a kernel method, really, with one particular kernel. And on the other hand, people have looked at it practically: can you get the same performance out of these types of shallow methods that you get out of neural networks, deep learning methods, when you just scale the number of parameters and the amount of data that you throw at it? And there are many very interesting results, for example by Mikhail Belkin at UC San Diego, who did exactly that, and then showed, basically: same number of parameters, same amount of data, the kernel method performs at least as well as, or sometimes better than, neural network architectures.
01:02:43
Why is this interesting? Because, number one, we can understand these systems theoretically. In general, I think why I find this interesting is just this contrarian viewpoint: you understand more about your standard method, about deep learning, by looking at these other methods and how you make them perform. Something that I sometimes discussed with my leads in my previous company was, they were like, "Yeah, but why don't you try this and that method as well?" And I'm like, "Yeah, but to make this method perform well, I have to invest thinking time." If you invest the same thinking time in what I currently have and think about, okay, what might help this algorithm, you oftentimes get better results, just because you have more insight into the problem that you put into this.
01:03:55
Of course, one nice thing about this deep-learning revolution, we talked about it before, is that the tinkering is really nice and easy oftentimes, and we are converging slowly but surely towards architectures that you can just use in a plug-and-play fashion. Transformers libraries by Hugging Face: you just take this and hit everything with it, and you're kind of done. This also exists in the shallow world. Some colleagues say XGBoost is all you need, because it performs just phenomenally oftentimes.
Jon Krohn: 01:04:37
Yeah. So XGBoost, for people who aren't aware of it, and I think I've done an episode dedicated to it, is like the random forest you described earlier, except that at each stage you're specifically correcting for the error made by the trees before it. And yeah, super, super powerful. It does end up winning a lot of Kaggle competitions.
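[A toy sketch of that residual-correction idea using plain scikit-learn decision trees rather than XGBoost itself; real gradient boosting adds a proper loss, regularization, and many refinements on top of this.]

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(400)

prediction = np.zeros_like(y)
learning_rate = 0.1
for _ in range(100):
    residual = y - prediction                 # error of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += learning_rate * tree.predict(X)  # each new tree corrects it

print("train MSE:", np.mean((y - prediction) ** 2))
```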
Ingmar Schuster: 01:05:01
Absolutely.
Jon Krohn: 01:05:02
That's for sure. Nice. Well, that was a fantastic foray into shallow approaches, and I do agree with a lot of the sentiment you're describing. It's particularly interesting to hear you say that when you've started with some particular approach, whether a kernel method, a random forest, or deep learning, sticking with it and tinkering further is probably a better use of your time than switching to a completely different approach, where the hyperparameters are going to be completely different and you may not have as much expertise in figuring them out.
Ingmar Schuster: 01:05:52
Exactly. Yeah. And the other thing is, people do it in deep learning all the time too. They think, "Okay, how is my data structured? How do I adapt my neural network to be able to capture that?"
Jon Krohn: 01:06:04
Yeah. And so during your Master's in Linguistics, or your PhD focused on natural language processing, were you mostly using these kinds of shallow approaches?
Ingmar Schuster: 01:06:17
I was actually building Bayesian networks. Back at the time I did a lot of Bayesian work; Bayesian nonparametrics was very in vogue then, and I think it developed in parallel to the neural network stuff. In my PhD I was working on modeling the meaning of natural language using this probabilistic language, basically Bayesian networks. So yeah, that was mostly what I did.
Jon Krohn: 01:06:59
Cool. And some people are doing Bayesian deep learning, but you don't come across that very often. Bayesian work is usually shallow, not more than one non-linearity.
Ingmar Schuster: 01:07:07
True.
Jon Krohn: 01:07:07
And so that was probably true in your PhD research.
Ingmar Schuster: 01:07:11
Yeah, this is true. This was true in my PhD research, yes. I think the reason people don't tend to use Bayesian neural networks as much is probably that the really interesting part of the Bayesian approach is getting probability distributions over your predictions, and you can often get those even without computing integrals the way you do in Bayesian inference.
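[One simple reading of that point, sketched with scikit-learn: a plain logistic regression already outputs a probability distribution over classes from point estimates of its weights, with no posterior integral anywhere. This is an illustrative interpretation, not Ingmar's own example.]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# A predictive distribution over classes, from point-estimated
# weights -- no Bayesian posterior integration required
print(clf.predict_proba([[0.2, -0.1]]))
```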
Jon Krohn: 01:07:38
Very cool. So digging into this a little bit more, into your PhD research. It sounds like it's important for you for linguistic theories to be falsifiable and quantifiable, which might be quite a bit different from the kinds of insights we get from language models today, like the big transformer-based LLM approaches.
Ingmar Schuster: 01:08:05
Well, they are quantifiable in the sense of probabilities. You have a probability for the next word being, I don't know, dog, versus the next word being kraken or whatever. So they are very much quantifiable. And I think that really subsumes falsifiability, because if something has probability zero, which almost never happens, then it is false.
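[A minimal sketch of reading off those next-word probabilities from an open language model via the Hugging Face Transformers library; the model choice and prompt are arbitrary assumptions for illustration.]

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("My favorite animal is the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Probability distribution over the next token
probs = torch.softmax(logits[0, -1], dim=-1)
top = probs.topk(5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx.item())!r}: {p.item():.4f}")
```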
Jon Krohn: 01:08:37
That's fascinating. And we could probably go on for ages about your academic work, but to tie it into the conversation we've been having here around Exazyme: coming from your academic background, you did a couple of postdocs after your PhD.
Ingmar Schuster: 01:08:55
Yeah.
Jon Krohn: 01:08:56
How was the transition to industry? Why did you choose to make it? And has the academic background helped? It sounds like the answer is going to be yes, based on how involved you are with developing new, as-yet-unpublished kernel methods, for example. So yeah, it's a multi-part question: why make the decision to go from academia to becoming an entrepreneur, how was the transition, and how does the academic background help you in your entrepreneurial career?
Ingmar Schuster: 01:09:28
I think the most important part about doing a PhD is really learning a way of working very deeply, being able to focus in demanding situations, and coping with pressure. I did my PhD after having my son, which is especially-
Jon Krohn: 01:09:56
Oh, really?
Ingmar Schuster: 01:09:56
My son is playing, and on the other hand you're trying to think deeply, which really taught me to shut out the environment at the right times. Of course, there's a subtle balance you have to strike with actually being there for your kid. But yes, this kind of deep work, being able to cope with pressure, being very self-sufficient and self-motivated, this is something academic work teaches you a lot. And of course you meet many, many super smart people; sometimes not very practical people, that too. But intrinsic motivation is, I think, one of the most important driving forces for doing anything important. Most people don't accomplish big things by hunting for money.
Jon Krohn: 01:11:05
So on that note, for people out there who would maybe like to, as opposed to just being focused on the money itself, but would like to make as big an impact they can, maybe particularly in this kind of biological space, what advice do you have for listeners who would like to be an entrepreneur blending AI and biology?
Ingmar Schuster: 01:11:31
Just get into it, I would say, no matter where you come from. Actually, the funny thing is I did not take any biology or chemistry classes in high school. In my postdoc here in Berlin, I was in a very biochemistry-heavy group of machine learning researchers, and strangely enough I kept a wide distance from all of it: I never touched anything, although I was exposed to chemistry applications all the time. I just learned this stuff for the startup from my co-founder, who's a biochemist. So yeah, just plunge into it. You see that with many people who have had quite an impact on machine learning; I think the transformer paper was basically an internship project. Some of these people didn't do a PhD, and they are doing really fine. So a PhD is not a necessity. What is a necessity is being driven and motivated and keeping at it.
Jon Krohn: 01:12:50
Nice. Yeah. So if people want to be successful applying machine learning and AI to some specific domain, the key is to just get started, that's your first point. And the second thing, you didn't really say this out loud, but your co-founder does have the biological background. So if you're a listener with specific expertise in machine learning and AI, and there's some application area where you feel you could make a big societal impact, you could potentially partner up and co-found a company with somebody who has that domain expertise.
Ingmar Schuster: 01:13:27
Absolutely, yes. And people love working with machine learning people.
Jon Krohn: 01:13:32
Yeah, I imagine. You probably hold more of the cards today as the ML/AI expert, as the data science expert. Awesome. So Ingmar, this has been a fascinating episode; I've learned a ton. Before I let you go, do you have a book recommendation for us?
Ingmar Schuster: 01:13:50
Can I have two?
Jon Krohn: 01:13:51
Of course.
Ingmar Schuster: 01:13:54
One very interesting book I've read has a striking subtitle: the title is Noise, and the subtitle is A Flaw in Human Judgment. It's by, among others, Daniel Kahneman, who got the Nobel Prize in economics, I think, so he's the most well-known of the authors. It's basically a book about how human biases negatively influence human decisions. Kahneman being a scientist, there are tons of studies in it. For example, one study shows that if you give a judge in a criminal court the exact same folder with the exact same facts before lunch and after lunch, the sentencing will be vastly different.
Jon Krohn: 01:14:56
Yeah. Lunch and coffee breaks.
Ingmar Schuster: 01:14:58
Yes.
Jon Krohn: 01:14:58
Yeah, you definitely want to plan your sentencing. If you are facing sentencing right now as a listener, and hopefully there are not many of you out there, you want your lawyer to schedule your hearing right after coffee or right after lunch.
Ingmar Schuster: 01:15:11
Absolutely. So yeah, it's very interesting in that respect, because you're confronted with typical human behaviors, how they can be problematic, and what you can do about them. And interestingly, Kahneman, who's a psychologist, kind of argues for using algorithms, because he says yes, they can reproduce racism, for example, they can have a racist bias, but at least you see it and you can deal with it, taking it out of the algorithm by, I don't know, rebalancing your training data, whatever. Whereas there's at least the same amount of bias in society, but because there's also a lot of variance, you don't see it as easily; the machine always makes the same judgment and doesn't vary so much. So this is a very interesting book.
01:16:17
And the other book, which I'm currently reading, is called Painful Intelligence by Aapo Hyvärinen, who is a machine learning researcher. He's the inventor of FastICA, which some people might know; it's the ICA method in sklearn, for example. And his book is super interesting because it's a super top-notch researcher discussing how machine learning is connected to mindfulness meditation.
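[For anyone who hasn't met FastICA, a minimal sketch of the sklearn implementation Ingmar mentions: two mixed signals are separated back into independent sources. The signals and mixing matrix are made up for illustration.]

```python
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)

# Two independent source signals...
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]
# ...observed only as linear mixtures of each other
A = np.array([[1.0, 0.5], [0.5, 2.0]])
X = S @ A.T

# FastICA recovers the sources up to permutation and scaling
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)
print(S_est.shape)  # (2000, 2)
```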
Jon Krohn: 01:16:53
Whoa.
Ingmar Schuster: 01:16:54
Yes. I almost cried when I read the introduction; it was super, super nice. There were some insights where I thought, "Oh wow, this is something I've thought about myself," because I'm also a meditator. And he goes into a lot of depth. We recently discussed this, and it's actually in a podcast episode that you can look up.
Jon Krohn: 01:17:26
Oh, nice. Yeah. We had Ben Goertzel on the show earlier this year as well. And part of the conversation later in that episode was about the overlap between mindfulness and AI. Yeah, fascinating space.
Ingmar Schuster: 01:17:43
It's very interesting. Take the way we think about the past so much: it's exactly the same as in reinforcement learning, where you replay experiences in order to learn from them better. It's exactly that phenomenon. The same goes for worrying about how you will behave in the future, which is also what AI agents do in order to plan better. So it's no wonder that we do these types of things, and it's no wonder that we tend to be afraid of the future or think about what went wrong in the past, because this helps you avoid mistakes. Of course, at the same time, mindfulness is there to take the sharp edge off of it. So yeah, it's a super interesting connection.
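[A minimal sketch of the experience-replay idea behind that analogy: an agent stores past transitions and revisits random batches of them to learn again. Class and field names here are generic assumptions, not from any particular RL library.]

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so an agent can 'replay' them later."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Revisit a random batch of past experiences, like replaying memories
        return random.sample(list(self.buffer), batch_size)

buf = ReplayBuffer()
for step in range(100):
    buf.push(step, action=0, reward=1.0, next_state=step + 1)
print(len(buf.sample(8)))  # 8
```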
Jon Krohn: 01:18:34
Great suggestions there. I love them both. I wish I had time to read all the books our guests suggest; that would be a dream life for me. So yeah, Noise by Kahneman and others, and then Painful Intelligence. They both sound great. Ingmar, this has been, as I said a minute ago, a really fascinating episode. For people who want to follow you after this episode, what's the best way for them to do that and get your thoughts?
Ingmar Schuster: 01:19:00
LinkedIn is number one. You can also look at our podcast called Machines and Molecules.
Jon Krohn: 01:19:07
Yes, Machines and Molecules. So when you were saying there's a recent podcast episode about it, it's an episode of your own podcast.
Ingmar Schuster: 01:19:14
Exactly.
Jon Krohn: 01:19:14
Yes. Fantastic. Who's the guest? Do you remember?
Ingmar Schuster: 01:19:17
The very latest episode is with exactly Aapo Hyvärinen, who wrote Painful Intelligence.
Jon Krohn: 01:19:21
Oh, my goodness. Wow, wow, wow. That sounds like a great one. I'll be sure to include that link in the show notes. All right, Ingmar, thank you so much for taking the time and having this fascinating conversation with me. Really appreciate it.
Ingmar Schuster: 01:19:36
Thank you, Jon.
Jon Krohn: 01:19:42
Whoa, what a mind Ingmar has, and what an educational, inspiring episode. In it, Ingmar filled us in on what kernel methods are and how they can be used to efficiently model tons of different real-world situations, including the biological catalysts and antibodies Exazyme helps pharmaceutical and chemical companies synthesize far more efficiently. He talked about how AI can play a key role in the coming years to extract carbon dioxide from the atmosphere, enable passive cooling to become mainstream, cure cancer, and completely overhaul the way pharmaceutical R&D is carried out. He talked about how shallow methods, machine learning approaches with at most one non-linearity, like kernel methods and XGBoost, can match or outperform deep learning even on natural language processing tasks. And he talked about how, if you want to be a pioneering AI entrepreneur yourself, you should just start, and perhaps co-found the company with a domain expert if you're the one bringing the ML expertise.
01:20:33
As always, you can get all the show notes, including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Ingmar's social media profiles, as well as my own at superdatascience.com/739. Thanks to my colleagues at Nebula for supporting me while I create content like this Super Data Science episode for you. And thanks of course to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the Super Data Science team for producing another fascinating episode for us today. For enabling that super team to create this free podcast for you, we are deeply grateful to our sponsors. You can support this show by checking out our sponsors' links, which are in the show notes. And if you yourself are interested in sponsoring an episode, you can get the details on how by making your way to jonkrohn.com/podcast.
01:21:16
Otherwise, please share, please review, please subscribe and all those good things. But most importantly, just keep on tuning in. I'm so grateful to have you listening and I hope I can continue to make episodes you love for years and years to come. Until next time, keep on rocking it out there and I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.