SDS 685: Tools for Building Real-Time Machine Learning Applications, with Richmond Alake

Podcast Guest: Richmond Alake

June 6, 2023

This week, it’s all about real-time machine learning (ML) applications, as Richmond Alake shares insights, tools, and career experiences with Jon, delivering a high-energy and high-impact episode that everyone can benefit from. From his work at Slalom Build to his two AI startups, discover the software choices, ML tools, and front-end development techniques used by a leader in the field.
Thanks to our Sponsors:
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
About Richmond Alake
Richmond Alake is an accomplished AI/ML practitioner with a wealth of experience as a computer vision engineer for a London-based startup and a machine learning architect for a leading tech consultancy. In addition to his professional work, Richmond is a highly respected writer in the AI field, boasting thousands of followers on LinkedIn and Medium, and has published articles on top tech blogs such as NVIDIA, Built In, and Neptune.ai. As an advisor to several AI startups, Richmond is an expert in the industry and shares his insights and knowledge with fellow practitioners through his podcast. Richmond is a co-founder of two innovative startups, OpenSpeech and MiniPT. By leveraging AI, he and his team are tackling common problems in content creation and fitness to make these industries more efficient and effective.
Overview
Have you ever wondered what sets the role of a Machine Learning Architect apart from an ML Engineer? Richmond Alake sheds light on this distinction, emphasizing the architect’s responsibility for designing the data framework and ensuring seamless interaction between various systems and technologies. Having recently taken on this new role at Slalom Build, Richmond is currently learning alongside the data engineering team, utilizing tools like Databricks and Amazon Kinesis to effectively manage large-scale streaming projects.
As a self-proclaimed generalist and tinkerer, Richmond founded two startups that combine his passions and skillset. At MiniPT, Richmond and his team utilize the Swift programming language to develop a personal-training iOS app. This innovative app incorporates Apple Core ML and Firebase, offering a low-latency and user-friendly mobile database solution. By leveraging computer vision technology, the app helps users improve their workout sessions by providing real-time feedback on their form, eliminating the need for expensive personal training sessions.
His second startup, OpenSpeech, leverages a Python Flask backend and a React frontend to deliver a generative AI application that converts audio content, like his podcast, into promotional material. He uses the OpenAI API for proofs of concept and then Hugging Face Inference Endpoints to quickly and cheaply get his own APIs up and running.
When he’s not working on his startups, podcast or projects at Slalom Build, Richmond dedicates his time to creating courses for O’Reilly and crafting content for NVIDIA. He believes that writing online content is a valuable “hack” that everyone should adopt. Don’t miss the opportunity to tune in and absorb the high-energy insights from this episode.
In this episode you will learn:
  • What is a Machine Learning Architect? [03:09]
  • Richmond’s startups [12:07]
  • Why Richmond started a podcast [29:51]
  • Richmond’s new course on feature stores [38:05]
  • Why Richmond produces data science content [43:25]
  • Why all data scientists should write [51:30]
Podcast Transcript

Jon Krohn: 00:00:00

This is episode number 685 with Richmond Alake, Machine Learning Architect at Slalom Build. Today’s episode is brought to you by Posit, the open-source data science company, by AWS Cloud Computing Services, and by WithFeeling.ai, the company bringing humanity into A.I. 
00:00:21
Welcome to the SuperDataScience podcast, the most listened-to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I’m your host, Jon Krohn. Thanks for joining me today. And now let’s make the complex simple.
00:00:52
Welcome back to the SuperDataScience podcast. Today I’m joined by the extraordinary force of nature that goes by the name of Richmond Alake. Richmond is a machine learning architect at Slalom Build, a huge Seattle-based consultancy that builds products embedded with analytics, machine learning, and other automations. He’s the Co-Founder of not one but two startups: one that uses computer vision to correct people’s form in the gym, and another, a generative AI startup that works with human speech. Richmond is an epic content creator, including creating courses for O’Reilly and writing for NVIDIA. He previously worked as a Computer Vision Engineer and as a Software Developer, and holds a Master’s in Computer Vision, Machine Learning, and Robotics from the University of Surrey in the UK. Today’s episode will appeal most to technical practitioners, particularly those who incorporate machine learning into real-time applications. But there’s a lot in this episode for anyone who’d like to hear about the latest tools for developing real-time machine learning applications from a leader in the field. In this episode, Richmond details the software choices he’s made up and down the application stack, from databases to machine learning to the front end, across his startups and in the consulting work that he does. He talks about the most valuable real-time ML tools that he teaches in his courses, and why writing for the public is an invaluable hack that everyone should be doing. All right, you ready for this scintillating episode? Let’s go.
00:02:18
Richmond, welcome to the SuperDataScience podcast. It’s awesome to have you here. Where are you calling in from today? 
Richmond Alake: 00:02:25
So, I’m calling in from the United Kingdom. I’m in England, just on the outskirts of London.
Jon Krohn: 00:02:31
Nice. Which direction around London? What’s your football club?
Richmond Alake: 00:02:36
Do you know what? This is gonna sound weird, but I’m not into football. I am into- 
Jon Krohn: 00:02:41
Oh, no kidding. 
Richmond Alake: 00:02:43
I am not into football. 
Jon Krohn: 00:02:44
Oh, wow. 
Richmond Alake: 00:02:44
I am, I’m a gym addict. I’m into lifting weights. 
Jon Krohn: 00:02:50
Nice. All right, fair enough. It just seems like the kind of question that you normally ask an English person. 
Richmond Alake: 00:02:55
It is. 
Jon Krohn: 00:02:56
Yeah. Well, there you go. So, there’s my icebreaker question. Totally bombed. We should just dig right into the technical stuff here. So, Richmond, you are a machine learning architect. I think you’re the first machine learning architect I’ve ever had on the show. So, tell us what a machine learning architect is, how that’s different from machine learning engineering or data science. Yeah, fill us in. 
Richmond Alake: 00:03:26
Yeah, so this is the first role where I’ve been a machine learning architect. My previous roles were computer vision engineer and machine learning engineer. So, how does the architect role differ from an engineering role? Well, as a machine learning architect, you’ve got to start thinking on a system level. You’re thinking about the other aspects of a machine learning system that integrate with the actual machine learning pipeline. So, you’re looking at the data engineering side, you’re looking at the MLOps side, and you’re looking at the front end and the UI, and considering how all of those components of the system architecture, at a very high level, feed into the machine learning component of the system. Because at the core of most modern applications there’s a machine learning component, it’s very crucial to have that machine learning person on your team who actually understands how the whole infrastructure and architecture works.
Jon Krohn: 00:04:23
So, I’m very familiar with like a software architect role in general who is being thoughtful about how the whole system works together. So, would you often work alongside a software architect to figure out how the machine learning components in particular would work smoothly inside of that? 
Richmond Alake: 00:04:40
Good question. So, because this is my first time in this role, it’s a learning journey for me. Right now I’m on a data engineering team, so I’m seeing the work that happens before we get to the machine learning bit. I’m getting a perspective of that side, and it’s exciting, right? I’m working a lot with tools such as Databricks, using streaming platforms like Kinesis, and just understanding the data journey, the data lifecycle, before it gets to the good stuff, which for us machine learning folks is the features. A lot happens to the data on a large-scale project, on a large-scale system, before it gets to those feature sets.
Jon Krohn: 00:05:20
Nice. Let’s actually talk about those two technologies a little bit. You hear about Databricks a lot. Why would somebody consider using Databricks for their data application? 
Richmond Alake: 00:05:29
Yeah, so Databricks handles large-scale projects, large-scale data volumes, and does it with very efficient pipelines that allow you to run all these compute processes, these data processing jobs, in real time. It integrates with a lot of streaming platforms and a lot of data sources, and it allows you to run SQL queries and Python queries. They have a very robust solution for machine-learning-driven initiatives: they have feature stores, they have data catalogs. So, it’s really a one-stop shop. It’s not just for data engineers. Data scientists can explore some of the tool offerings that Databricks has, and they also have some specific offerings for machine learning engineers as well. So, I’m looking at it from a data engineering perspective, where you get to run compute jobs that run automatically. You can orchestrate these processing jobs across different layers: development, testing, production. You create all these data pipelines for different environments and make everything automated and work seamlessly. If you’re in a machine learning role, you won’t necessarily get exposed to that, but if you’re in an architectural role, you need a decent understanding of what happens there. And you mentioned the software architect having an understanding of the software: I have a background in software engineering and web development, so that allows me to start thinking about that end as well: frontend interfaces, backend systems, API systems too.
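For readers who want a concrete picture of the kind of streaming ingestion Richmond describes, here is a minimal sketch of reading a Kinesis stream into Databricks with Spark Structured Streaming. The stream name, region, and event schema are hypothetical, and it assumes the Databricks-provided Kinesis source; it is not code from the episode or from Slalom Build.

```python
# Hypothetical sketch: ingest a Kinesis stream in a Databricks notebook,
# parse JSON events, and write them continuously to a Delta table.
# `spark` is the SparkSession predefined in Databricks notebooks.
from pyspark.sql import functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

event_schema = StructType([
    StructField("shipment_id", StringType()),
    StructField("status", StringType()),
    StructField("lat", DoubleType()),
    StructField("lon", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .format("kinesis")                     # Databricks' built-in Kinesis source
       .option("streamName", "cargo-events")  # hypothetical stream name
       .option("region", "eu-west-2")
       .option("initialPosition", "latest")
       .load())

events = (raw
          .select(F.col("data").cast("string").alias("json"))  # payload is binary
          .select(F.from_json("json", event_schema).alias("e"))
          .select("e.*"))

(events.writeStream
 .format("delta")
 .option("checkpointLocation", "/tmp/checkpoints/cargo-events")
 .outputMode("append")
 .toTable("bronze_cargo_events"))
```

The idea is simply that records land in a stream, get parsed, and flow continuously into a table that downstream feature pipelines can read.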
Jon Krohn: 00:07:07
Right. How is the machine learning model going to interact with the rest of the software system? 
Richmond Alake: 00:07:12
Exactly. So, are we doing a REST API? What does the user interface look like? Yeah, just all of those sorts of considerations.
Jon Krohn: 00:07:21
Nice. And then the other tool that you mentioned there, which actually I hadn’t heard of and I looked up quickly just now as you’ve been speaking, is Amazon Kinesis. So, it looks like a real-time data analytics platform. So, you use that in conjunction with Databricks? 
Richmond Alake: 00:07:35
So, the team uses that in conjunction with Databricks. I’m more on the learning side of it; I’m not really an expert on Amazon Kinesis. Outside of this role, I’d never used it. But we all know what real-time means and we all know what a streaming platform is. Essentially, you want to be able to get data streamed to the end client in bits, in buffers, in an optimized, efficient system. In the realm of Kinesis, I’m still learning, but at a high level we understand what streaming and real-time are, essentially. Everything’s got to be real-time nowadays.
Jon Krohn: 00:08:18
Cool. Yeah. And so the company that you’re doing all of this at is called Slalom Build. So, they’re a tech consulting firm. And so can you maybe give us a couple of case studies, obviously without getting into anything proprietary, but letting us know a couple of case studies of the kind of work that you do there at Slalom Build as a machine learning architect?
Richmond Alake: 00:08:38
Yeah, for sure. So, Slalom Build in the UK is building its machine learning presence, and I was one of the first hires on the machine learning practice team in the UK. But if you go over to the US, they have such a large presence; they work with most of the big tech companies, most of the big banks. So, I talk a lot with the team over in the US to understand what they’re working on and to pick up some of the best practices they use. Sometimes I give some talks to the team over there as well. So, at a high level, we work with the transportation industry, the supply chain industry, retail as well. Any sort of industry that requires some form of technology input, you could find Slalom Build working with large organizations there. In terms of how we split the practice: there’s a data engineering practice, there’s a software engineering practice, there’s a machine learning practice, and we’re really growing, in the UK that is. Sorry, were there any other specific questions you wanted me to dive into?
Jon Krohn: 00:09:44
Oh, no, no. I mean, that was the key area. But I guess, were there specific case studies, like interesting pieces of work that you’ve done?
Richmond Alake: 00:09:54
Okay. So, [inaudible 00:09:54] we’re working with a supply chain client, where we look at their platform and we’re creating an entirely new platform for them to ingest data and understand the lifecycle of a cargo ship from point A to point B in a very data-centric manner. So, that covers everything you can think of within tech: frontend website, APIs, backend databases, streaming platforms, data engineering, and eventually the machine learning engineering comes in, right? Because first you have to make sure you get to a certain maturity level where you actually have that rich dataset, and then you can start to derive intelligence from it. And that’s where the machine learning team comes in. So, organizations are at different levels of maturity, and I know we normally hear people wanting to go machine learning first, but in practice there’s a lot of work to be done to actually make sure you’re ready to start using these machine learning tools.
Jon Krohn: 00:10:53
For sure.
00:11:28
This episode is brought to you by Posit: the open-source data science company. Posit makes the best tools for data scientists who love open-source. Period. No matter which language they prefer. Posit’s popular RStudio IDE and enterprise products, like Posit Workbench, Connect, and Package Manager, help individuals, teams, and organizations scale R & Python development easily and securely. Produce higher-quality analysis faster with great data science tools. Visit Posit.co—that’s P-O-S-I-T dot co—to learn more.
00:11:31
For sure. Yeah, it’s so often the case that companies will bring in a consulting firm to build a machine learning solution, and the data aren’t even structured to start working with. And so you’re looking at months of just structuring the data, or, as it sounds like in your case, there could be situations where you’ll need to get aspects of the software architecture set up beforehand. So, cool, great to hear what you’re doing at Slalom Build and how you’re getting exposure to lots of different projects. You’re getting your feet wet as a machine learning architect for the first time, but that’s not the only job that you have going on. You also have two startups, so tell us about those.
Richmond Alake: 00:12:10
Yeah, so really it comes down to the way I see technology, which is that I see myself as a generalist: the ability to pick up different technologies to solve different problems. I come across different problems and I think about a solution; I like to tinker a bit and try to solve the problem. So, the first one is MiniPT. As I mentioned earlier, I spend most of my time in the gym and I like to train, so-
Jon Krohn: 00:12:39
Yeah, you mentioned that to me. Oh yeah, you did mention that on air when I asked you about football.
Richmond Alake: 00:12:44
Yeah, I did mention it. 
Jon Krohn: 00:12:45
Yeah, yeah, that’s right, because we were talking about the gym before recording, but yeah, you did mention that on air too. So, you love the gym. Tell us, and, for the people who are watching the video recording of this, they might have been able to tell that without you needing to say it. But yeah, you go ahead.
Richmond Alake: 00:13:04
Yeah, thanks. Thanks for the flattery. Yeah, people do mistake me for a personal trainer, but no, I work a remote tech job, so I’m mostly in the office, barely moving. But yeah, that’s why I’m in the gym at least an hour and a half every day. I love it; it’s like my meditation. And I’ve teamed up with a bunch of personal trainers who share the same vision as I do for the gym and tech, and what we’re doing is virtualizing the entire personal training experience. A personal trainer is very expensive in the UK; in London, you could pay upwards of 60 pounds an hour for a personal trainer.
Jon Krohn: 00:13:42
Yeah. It’s the same thing- 
Richmond Alake: 00:13:43
So, what we’re doing is- 
Jon Krohn: 00:13:44
In New York, like $120 for an hour is common. That’s crazy. It’s insane how quickly, like, that’s a gym membership for a month that you spend in an hour. It’s wild. So, yeah, you’ve got the solution.
Richmond Alake: 00:13:57
Yeah, we’ve got the solution. So, what I’m doing is what any sort of AI person would do, which is take that human expertise and just convert that into machine learning models and algorithms, and actually do that in real time. So, what we’re doing is putting all of these functionalities, computer vision, pose estimation, some data-centric algorithms, into an app that can watch you while you work out. And one thing it does is track all your joints and give you real-time form correction. So, one thing I do when I squat, and it kills my knees, is go too low, right? And, you know, “ass to grass” is maybe fine when you’re young, but as you get older, your knees start to get not as strong as they used to be. So really, ideally, you want to be just below 90 degrees.
00:14:51
But what this app actually does is watch you and tell you when you’re maybe going too low, or not going low enough, or maybe your back isn’t straight. And we’re doing this all through the headphones on your phone, through audio, so you get that in real time. We also have different components, such as a post-workout assessment, where you can see how the joints in your body moved during a workout session. So, you can see what angles your knees were at, how low you went in depth, and you actually have a video recording playback of all of this, so you can watch it. And then we give you some tips, in card format, that can improve your next session. I like to say we’re making every session come to life, and we’re making the data speak to you about how you can improve. And we’re doing this in a very user-friendly, intuitive manner. It’s called MiniPT.
Jon Krohn: 00:15:43
MiniPT. So, what’s the tech stack like for that? It sounds like you might have some real-time machine learning going on in there, just like the kind of stuff you were talking about with Slalom Build, but this would probably be a different kind of stack, because this is really mobile-focused, building on your experience as a mobile developer in the past.
Richmond Alake: 00:16:01
Yeah, well, not a mobile developer, but a computer vision engineer. But like I said, I tinker, like, I tinker in a lot of things. So, I think anyone within the tech space-
Jon Krohn: 00:16:13
It was computer vision engineer for mobile. Right? 
Richmond Alake: 00:16:16
For a mobile company. Yeah, for a mobile company.
Jon Krohn: 00:16:18
Yeah. So, yeah, so I blended those things together in my head. 
Richmond Alake: 00:16:22
So, yeah, what’s the tech stack of MiniPT? We’re building the app on iOS, so we’re working with Swift, the programming language. We use a lot of machine learning models. We’ve dabbled with the Vision framework, which is Apple’s own sort of computer vision solution, so you can have pose estimation models. We’ve worked with some of Google’s solutions, ML Kit pose estimation, [inaudible 00:16:47] libraries. So, we’ve experimented with a lot of libraries, and tried to build our own as well. And we’re literally using a plethora of pose estimation models, computer vision models, object detection, to solve this problem, from detecting what exercise you’re doing to detecting what weight you’re actually working with. So, we’re using a bunch of different machine learning models there.
00:17:13
And on the database side, we use Firebase, which is very simple. That allows us to do some very good real-time querying and get feedback from any of the user information we store in the database, and it’s a low-latency solution. It’s very good; it’s good for quickly getting set up and testing out any proofs of concept you have. What else are we doing on MiniPT? Generally, there’s a lot of experimentation; that’s what I have to say. There’s a lot of me doing stuff on the product side, on the tech side, then working with the personal trainers in the gym. I live in a gym [inaudible 00:17:53]. Also, the gym has become our lab, where we test the exercises and try to understand human body motion, dynamic motion as well. So, there’s a lot of experimentation in the field, as I like to say.
Jon Krohn: 00:18:05
Nice. That sounds great. So, you’ve got Swift for your programming language. You use Firebase, which I hadn’t heard of before, for your database, because that’s easy to get set up and it’s low latency. And then you were talking about a couple of different vision [crosstalk 00:00:00] Apple-
Richmond Alake: 00:18:21
The Apple Vision framework. You can use MediaPipe. You could create your own sort of pose estimation model using TensorFlow, convert it into a TensorFlow Lite model that you could put in an iOS app. And Apple has Core ML, which is sort of a code-agnostic way of developing computer vision models. You can use one of the platforms, called Create ML, and you give it a bunch of images for certain tasks and it outputs a model that you can just put into the app. And it works, it works well. But in some use cases, not so well.
Jon Krohn: 00:19:03
Right. So, in some cases you can use something, I guess, a little bit more out of the box, like that Apple Core ML, and then in other situations you need to get down and dirty in TensorFlow and then port it over to TensorFlow Lite to make it portable.
Richmond Alake: 00:19:16
Exactly. And one space we’re exploring, because we work with the human body, is that human bodies come in different shapes and sizes. We’re looking at things such as disability, working with people who might not have the body parts you might expect people to have. It’s a very unique case, but inclusiveness is something that we’re starting with first while we’re still young, and we’ll expand that and find ways to make sure we actually achieve our mission, which is to make fitness affordable for all.
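As a rough illustration of the TensorFlow-to-mobile step Richmond described a moment ago, here is a minimal sketch of converting a trained Keras model to TensorFlow Lite so it can run on-device. The model and file names are hypothetical; this is not MiniPT’s actual pipeline.

```python
# Hypothetical sketch: convert a trained Keras pose-estimation model to
# TensorFlow Lite so it can be bundled into a mobile app (e.g., on iOS).
import tensorflow as tf

# Load a previously trained model (hypothetical file name).
model = tf.keras.models.load_model("pose_estimator.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optional: shrink the model with default post-training quantization,
# which matters for on-device, real-time inference.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("pose_estimator.tflite", "wb") as f:
    f.write(tflite_model)
```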
Jon Krohn: 00:19:47
Very cool. That sounds like a great mission. But not your only startup. So, MiniPT is that first one, but you’ve got another one called OpenSpeech as well. 
Richmond Alake: 00:19:56
Yes. So, I could talk about OpenSpeech and how it started. So, you know, and maybe some of your listeners know, I have my own podcast, The Richmond Alake Podcast. And one thing I always have to do after a podcast is a Medium post, a LinkedIn post, a Twitter post, a newsletter, just generating these different forms of content. And I just thought to myself, I need a tool that can do this at the click of a button, right? And I spoke to my brother about it, and he was very excited, because he’s also a web developer. And over Christmas, I went home, we spent three days together, and we built a basic prototype. With this prototype, you could just upload an audio file and it creates different content from the audio: newsletter, blog post, LinkedIn post, tweets-
Jon Krohn: 00:20:51
Oh, so it’s a generative AI application that takes in the audio, and then, yeah, that sounds super useful. It sounds like something I could use.
Richmond Alake: 00:21:02
Yeah. We’re going to launch in a few weeks. And one thing is, we actually took it a step further, because, as you know, one thing about podcasting is you want to make sure you’re giving your guest enough time to speak. Are you not interrupting them? Are you saying the right words? Are you staying on the same intellectual plane? So, we added an analytics component where we analyze the actual audio: we can tell how much each person is speaking, do some sentiment analysis, and identify what filler words you’re using, which you can use to improve your next podcast. So, I built it for myself, but actually a lot of people can use it. And one specific use case where we’re seeing very good utilization of this tool is in the mental therapy, mental health space.
00:21:58
So, a very close friend of mine, I told him about the tool and he was excited about it, and he owns a mental health practice, and we’re exploring how we can use this tool for mental health sessions. And that comes with a different sort of problem, which is around data privacy: where is the data actually hosted? Who’s hosting the data? Sensitive information. So, we’re trying to solve problems there that you might not need to solve for content creation. Basically, I don’t want my business sitting on a data center somewhere where it gets hacked or whatnot. So, we need to think about different flows to anonymize the information sent into any of these large language models. Either we’re doing it locally, or, if we’re doing it with APIs, we definitely need to find a way to remove any user-sensitive information from there. So, those are the challenges we’re seeing when dealing with the health space.
Jon Krohn: 00:22:58
Right. Yeah. Sounds like you’re doing amazing things. I don’t know how you do all this. This is wild. Like you have, even just within the startups themselves, you have so many different social impact angles. It’s wild. It’s really impressive, Richmond. So, for OpenSpeech, can you tell us a little bit about that stack? You just mentioned that you’re using LLMs, so like- 
Richmond Alake: 00:23:19
Yeah, for sure. So, we’re exploring the fastest things first. For the proof of concept, we’re using OpenAI, so we’re plugging into OpenAI using the API. On the frontend side, we’re using Next.js and React. We’re using Firebase as well, which, like I said, is very quick to start with. On the backend: Python, Flask, AWS, the usual suspects. But that’s just for the proof of concept. To meet certain use cases, we’re looking at bringing a lot of the LLMs, a lot of the generative AI, in-house, and looking at how we can partition the data, and any data centers or cloud services we’re using, to actually meet the requirements for helping people within the health space handle that sensitive data.
00:24:09
So, there’s a lot of experimentation I want to do with building and actually fine-tuning and training LLMs locally, or on-prem. But I won’t lie, I’ve moved away from the technical side of the neural network architectures over the past couple of years. It’s not like my computer vision days, where I had articles exploring the research papers, going deep into the architectures, then showing you how to build, say, AlexNet. I’ve moved away from that into more of the infrastructure and the architecture of-
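For a flavor of the proof-of-concept stack Richmond describes, here is a minimal sketch that transcribes an audio file and repurposes the transcript into a LinkedIn post, using the OpenAI Python client as it looked around the time of this episode. The prompt, model choices, and function names are assumptions, not OpenSpeech’s actual code; in practice something like this would sit behind a Flask endpoint.

```python
# Hypothetical proof-of-concept sketch: podcast audio -> transcript ->
# promotional content, via the OpenAI API (openai<1.0-style Python client).
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def transcribe(path: str) -> str:
    # Whisper API: turn podcast audio into raw text.
    with open(path, "rb") as audio_file:
        result = openai.Audio.transcribe("whisper-1", audio_file)
    return result["text"]

def to_linkedin_post(transcript: str) -> str:
    # Repurpose the transcript into one content format.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Turn podcast transcripts into short, engaging LinkedIn posts."},
            {"role": "user", "content": transcript},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

print(to_linkedin_post(transcribe("episode.mp3")))  # hypothetical file
```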
Jon Krohn: 00:24:48
I’m honestly kind of relieved to hear that, because it would’ve been even more humbling for me if, on top of all of this entrepreneurship stuff you’re doing in architecture, you were also super on top of the neural network stuff. That’s the one piece that I’ve been keeping track of. We have two episodes a week of this show; the Tuesday episodes always have guests, and the Friday episodes sometimes have guests, sometimes they don’t. And on the Friday ones without guests, in recent months I’ve had a lot of episodes specifically on single-GPU LLMs and how you can take these open-source models and fine-tune them to your particular proprietary problems. So, maybe there are some of those Friday episodes from SuperDataScience you can check out.
Richmond Alake: 00:25:35
Yeah, I think one thing I should mention is I use Hugging Face a lot. I love the Inference Endpoints solution. I feel like it’s a very unique solution, where it’s easy to get a REST API to just call a model within the model zoo, and you get access to the datasets. Then you can also modify the operations and extend the functionalities, right? If you start to create these custom endpoints yourself, you get access to all of this compute, and it’s relatively decently priced. So, I really enjoy using Hugging Face. It’s one part of the tech stack we’re exploring.
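For context on what Richmond is describing, here is a minimal sketch of calling a Hugging Face-hosted model over REST. It uses the public Inference API with a summarization model as an illustrative stand-in; a dedicated Inference Endpoint works the same way, just with the URL you receive when you deploy.

```python
# Hypothetical sketch: call a Hugging Face-hosted model over REST.
import os
import requests

# Public Inference API URL; a paid Inference Endpoint would have its own URL.
API_URL = "https://api-inference.huggingface.co/models/facebook/bart-large-cnn"
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

def summarize(text: str) -> str:
    response = requests.post(API_URL, headers=headers, json={"inputs": text})
    response.raise_for_status()
    # This summarization model returns a list of {"summary_text": ...} dicts.
    return response.json()[0]["summary_text"]

print(summarize("Feature stores centralize the storage and serving of ML features..."))
```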
Jon Krohn: 00:26:21
Nice. Yeah, I’m not surprised to hear that. I’m actually currently reading what we might as well call the Hugging Face book: it’s Natural Language Processing with Transformers, and it’s written by three folks from Hugging Face. And it’s a really well-written book, and I’m really enjoying learning so much about Hugging Face. We use Hugging Face as well at my machine learning company, Nebula. We haven’t been using the Inference Endpoints that you just mentioned, but we have obviously been using the Transformers package a lot for very quickly being able to access huge models and fine-tune them to our needs.
Richmond Alake: 00:27:04
Ah, interesting. I had Lewis Tunstall, one of the authors of the book you’re reading, on my podcast-
Jon Krohn: 00:27:12
No way! 
Richmond Alake: 00:27:14
Sometime last year. Cool guy. Very cool. We actually have two episodes together: one is on air and the other hasn’t aired yet; I’m going to put it out hopefully very soon. He’s such a cool guy. We actually went into his background: he used to work at CERN in Switzerland, where you are, on the Large Hadron Collider, before he became a data scientist. So, he had a very good story about how he transitioned into data science, how he was looking for his first data science role, and how he got it. Very interesting. You should reach out to him. Very cool dude. He’s from Australia as well.
Jon Krohn: 00:28:00
Are you stuck between optimizing latency and lowering your inference costs as you build your generative AI applications? Find out why more ML developers are moving toward AWS Trainium and Inferentia to build and serve their Large Language Models. You can save up to 50% on training costs with AWS Trainium chips and up to 40% on inference costs with AWS Inferentia chips. Trainium and Inferentia will help you achieve higher performance, lower costs, and be more sustainable. Check out the links in the show notes to learn more. All right, now back to our show. 
00:28:37
Yeah, I mean, I would love to have him on the show. I will follow up about that one, because, yeah, I’m really enjoying Lewis’ writing right now. So, that would be an awesome guest. And for our listeners who might be confused by what Richmond just said about me being in Switzerland: I didn’t explain this, but I am recording today’s episode from a hotel room in Switzerland. I’m at something called the St. Gallen Symposium in St. Gallen, Switzerland, doing a couple of talks here on AI. Everyone wants to hear about AI right now. And so, if you’re watching the YouTube version, I have a very different background: you actually have Swiss mountains, and churches, and trees, not my usual New York apartment background. But yeah, let’s talk a bit more about your podcast, Richmond. It’s in the same space as the SuperDataScience podcast, so a lot of our listeners might be interested in your show as well. You cover technology, data science, AI and ML, and obviously you have amazing guests like Lewis. I know we’ve had a lot of overlap in guests in the past, when I look at who you’ve had on the show. So, what prompted you to start a podcast? What’s that experience been like? Has it been helpful for your career?
Richmond Alake: 00:29:56
Yeah. Yeah. So, let me start with what prompted me to start the podcast. I guess… do you talk about failures or sort of like semi-failures on this podcast? I could share some of mine. 
Jon Krohn: 00:30:08
Yeah, I certainly have. And yeah, you’re welcome to air your failures on air as well. 
Richmond Alake: 00:30:16
Sure, sure, sure. Just because it might sound like I’m superhuman, but no, I’m not superhuman. So, the podcast sort of started as a means to an end. I was writing, and I got a book deal from a large publishing company within our space. The book deal came as a result of one of my articles that did really well. They reached out to me and said, “Hey, Richmond, this article is very well written. Do you want to write a book?” And I said yes, because I am a yes man. Back then I just said yes to everything that came; I just took a lot on my plate. So, the book was called Standing on the Shoulders of Giants, and it was essentially about how I’ve accelerated my career in machine learning.
00:31:11
I’ve only been in this space for about three years professionally, and one of those years I spent doing a master’s in AI, essentially. So, I’ve accelerated very quickly, or I guess some people think I’ve accelerated very quickly, and they wanted me to write a book about it. And I said, I could write a book about it, but I don’t feel like I have enough experience to fill up a book; I’m going to reach out to some prominent individuals within the space and do some research. And then I thought, okay, why not just do a conversation, a podcast? Then after each podcast I could watch it back, write down some notes, and create a book out of it. Long story short, writing a book is not easy. Writing a book is very different from reading a research paper and then writing an article on Medium about it. It’s a whole different process.
00:32:09
And it turned out that my time management skills, or I myself, was not ready for it. It was a bit overwhelming. Long story short, with the book deal, we just agreed to cut it and move our separate ways. So, I have half a book, which I’ll be releasing as articles. And that’s just to say, look, I’m still working on certain things like productivity, time management, efficient use of my time, and I feel like that’s a lifelong battle as I try different productivity techniques. But the key lesson I learned from it is not to say yes to everything; it’s to know your capacity. But I am glad, because out of that I have a podcast, and from the podcast I saw a problem, which led to OpenSpeech, which my brother and I work on together. We’re going to be doing a little talk at Stripe, and the journey continues, right? So, like a phoenix from the ashes, something rises.
Jon Krohn: 00:33:14
Yeah. So, when you talk about failure: when you dare and you fail, you still succeed somehow anyway. You can’t fail when you dare, because even the initial failure turns into success. And I’ve done episodes on this in the past. My favorite quote is a Latin one; it’s actually the motto of the British Special Air Service. I’m probably going to butcher the Latin pronunciation right now, but it’s “Qui audet adipiscitur”, which means “Who dares wins”. And it’s this idea, exactly like you’re describing, that on the surface you failed at writing this book, but you didn’t, because you learned things about yourself. You learned not to say yes to as many things. You got a podcast out of it that led to amazing connections with people like Lewis, which led to OpenSpeech, and then you’re daring at OpenSpeech. Who knows where that leads, or where daring at the podcast leads? All these things will unfold over so many more years. So, just keep daring and you’ll keep winning, even if there are apparent failures along the way.
Richmond Alake: 00:34:28
Yeah, definitely. And within the space of AI, I thought everything moved very fast before, like two years ago; now it’s ridiculous.
Jon Krohn: 00:34:39
It’s just insane. 
Richmond Alake: 00:34:39
I just hate going on Twitter. I’m tired of going on LinkedIn, because I just feel like I am so behind. And like I said, one thing I do, or used to do anyway, is gauge my worth in the ML space by how much technical knowledge I have. But I know that if I take maybe two, three months off and just dive into the space of Transformers and NLP, I can always learn it and then write a couple of articles on it. But the space is moving crazy fast. I don’t know where we’re going to be when this comes out.
Jon Krohn: 00:35:16
It’s moving crazy fast, and it’s so easy to feel like you’re far behind, because when you have that experience of going on Twitter or going on LinkedIn, you are seeing what everyone has learned. But somehow you get this feeling that you should know all of it, even though one person knows one piece, another person knows another piece, another person knows another piece. When you see all these people in a row, in one space, so knowledgeable about it, you think, ah, everyone knows everything and I don’t know anything, so… [crosstalk 00:35:47]
00:35:49
Yeah. So, it’s easy to feel that way in this field, and it gets easier all the time, for sure. But you’ve put great effort into helping people understand things in this space. You’ve been a computer vision instructor on the O’Reilly platform, and last year you created supporting materials for, and taught, a professional certificate in machine learning and artificial intelligence for Imperial College Business School. For our listeners in the UK, all of them will know what Imperial College is and what a prestigious institution it is: it is often the top-ranked university in the UK, higher than Oxford, higher than Cambridge. So, an amazing institution to be associated with in creating this AI content. But it’s wild to me, and I need to mention it on air, because our American listeners, and probably those in a lot of other parts of the world, won’t even have heard of Imperial. It’s so interesting: Imperial College London, University College London, these are amazing, world-class institutions, often the best universities in the UK, but for some reason knowledge of them hasn’t crossed the Atlantic Ocean to the US.
Richmond Alake: 00:36:59
Yeah, I’m not sure why, but everyone’s familiar with Oxford and Cambridge. Imperial College is definitely among the top universities here in the UK and in the world, and UCL is definitely top for economics; I know a few very smart folks who have gone there. Yeah, that gig came along when I was saying yes to everything. It came through a company called Emeritus, and I taught data science and AI in after-hours lecture sessions. And I’d never done a lecture before; I just said, yes, why not? So, I went into it, created the materials, went in there, taught it, they asked me questions. I did a bit of marking as well, which was very interesting. I was basically a lecturer, though you won’t see “lecturer” on my LinkedIn. But yeah, that was a very fun experience.
Jon Krohn: 00:38:00
Nice. And I understand that at least at the time of us filming this podcast episode, you’re preparing a new course for O’Reilly. I think it’s probably gonna go live. You’re probably going to teach this in O’Reilly before this episode is actually published, but maybe you’ll teach it again. So, what’s this new O’Reilly course all about? 
Richmond Alake: 00:38:18
Yeah, so this one is about feature stores. It’s going to be out June 1st. It’s going to be a live training session running for about three hours, where we talk about, one, the general landscape of modern applications, and the architecture at a very high level. Then we move into what makes modern applications function properly, which is essentially the machine learning components. Then, what makes machine learning components perform properly? You have features, right? So, we look into all of the tooling and infrastructure around features that allows you to deliver features in real time and have efficient feature pipelines and good feature management, and feature stores are one of the tools within the MLOps space that allow this to happen. We’re particularly focusing on Feast, which is an open-source feature store solution. We’ll be going through, at a high level, what Feast is, and we’ll actually be doing some coding: implementing a feature store using Feast, looking at retrieval and serving, and trying to get into real-time solutions using feature stores. And generally just getting to grips with this whole MLOps space and how you can take it on in a practical, hands-on sense. That’s what the course is focused on.
Jon Krohn: 00:39:36
Nice. Yeah, so breaking that down, and maybe repeating back to you some of what you said for our audience: with a lot of machine learning approaches, they’re made powerful by having these pre-computed features derived from the raw data. And the raw data could be any kind of format: it could be natural language data, it could be image data, video data, it could be tabular data. But often, and I think especially in that tabular data case, we can end up with a much more powerful machine learning algorithm if we’re really thoughtful about the features that we compute from the raw data. So, we prime the machine learning algorithm with the most valuable information out of all of the information it could be accessing. And what you’re describing, these feature stores, are tools that allow you to efficiently store a lot of these features, so that, potentially for the kinds of real-time applications that you’re building in your startups or at Slalom Build, you can access a lot of features from complex data instantaneously and run those through a machine learning model right away. Was that a good summary of-
Richmond Alake: 00:40:51
Yes, spot on. Spot on. A feature store is a centralized data store for storing and serving features, and they have solutions for online environments and offline environments: online being doing real-time predictions, real-time inference, and offline being batch predictions, which you could do in model training scenarios or evaluation scenarios. So, feature stores essentially take some of the headache away from redefining features or sharing features across teams. At some point, an organization reaches a maturity level where you have several machine learning teams working on different projects, or even several machine learning teams working on the same project. And if you’ve worked in a notebook, you have all of these processes where you’re doing some feature engineering. How can you take those features and let the team over in the US share the same features, and also keep the same definition of the features and the scope at which your features are created and defined? That essentially allows you to manage them properly.
00:41:55
Feature stores and feature platforms offer a solution for this, and more. So, it’s very relevant now that we’re working with a lot of large-scale data, working across large teams as well, and we’re moving into, well, we are in, delivering inference results in real time. I feel like that tooling and infrastructure will always exist regardless of what model is in vogue. You understand what I’m talking about.
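For readers new to Feast, here is a minimal sketch of online feature retrieval at inference time, assuming a recent version of Feast. The feature view, feature names, and entity are hypothetical, and it assumes a feature repo has already been defined, applied, and materialized; it is not code from Richmond’s course.

```python
# Hypothetical sketch: fetch features from a Feast online store at
# inference time. Assumes a feature repo with a "workout_stats" feature
# view keyed by a "user_id" entity has been applied and materialized.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # directory containing feature_store.yaml

features = store.get_online_features(
    features=[
        "workout_stats:avg_squat_depth",
        "workout_stats:sessions_last_7d",
    ],
    entity_rows=[{"user_id": 1001}],
).to_dict()

# `features` now holds low-latency feature values ready to feed a model,
# with the same feature definitions shared across teams.
print(features)
```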
Jon Krohn: 00:42:34
The future of AI shouldn’t be just about productivity. An AI agent with a capacity to grow alongside you long-term could become a companion that supports your emotional well-being. Paradot, an AI companion app developed by WithFeeling AI, reimagines the way humans interact with AI today. Using their proprietary Large Language Models, Paradot A.I. agents store your likes and dislikes in a long-term memory system, enabling them to recall important details about you and incorporate those details into dialog without LLMs’ typical context-window limitations. Explore what the future of human-A.I. interactions could be like this very day by downloading the Paradot app via the Apple App Store or Google Play, or by visiting paradot.ai on the web. 
00:43:19
Yeah, I think that’s right. This is a great example. I was thinking about this earlier, when you were talking about how the space moves so quickly. It is interesting, however, that there are some skills that we can learn as data scientists or software developers that are timeless. You know, learning data structures and algorithms is going to serve you well in data science, in computer science, in software development, and that’s going to be the case indefinitely. And feature stores are a great example of another one of those skills; this kind of MLOps is going to be really useful. Actually, I recently heard, I’ve been listening a lot to another podcast called Last Week in AI. It’s a news show, and one of the guys on the show is Jeremie Harris, who has been on the SuperDataScience podcast a couple of times.
00:44:13
And when he was on most recently, he mentioned this show, and I thought, that sounds like a really cool way to keep up with the news. And one of the things they mentioned on air, and even as they were saying this they noted they might be exaggerating the number slightly, is that people working on really cutting-edge large language models today could potentially be making eight-figure salaries. And as they said it, they added, maybe it’s not quite eight figures, but very, very large salaries. And the interesting point they made was that those people are not commanding that super-high salary because of their data science abilities, because of their capacity for understanding how transformers work, for example. It’s the operations side of things: being able to train over so many GPUs, across so many machines, and have it work efficiently. That’s how they’re commanding these really big salaries.
Richmond Alake: 00:45:17
Yeah. Let me touch on a couple of things you’ve mentioned. Jeremie Harris: great guy; I also had him on my podcast.
Jon Krohn: 00:45:23
Oh yeah, there you go.
Richmond Alake: 00:45:24
Yeah, he was the first guest, or the second, I think; I think he was the second guest. He’s a great guy, and he’s so far ahead in the way he thinks, because he spoke, two years ago, about AI alignment, and he has a startup, Gladstone AI, I think it’s called, that works on AI safety and AI alignment. That was two years ago, when no one was really talking that much about it. Now it’s literally the hot topic. So, he’s definitely ahead. I don’t even know what he’s thinking now, but I would love to know, because he’s probably thinking five years ahead.
Jon Krohn: 00:46:03
Well, yeah, if you want a bit of a clue into what he might be thinking about: in addition to the Last Week in AI podcast, which is fantastic, I highly recommend it, he recently had a book come out called Quantum Physics Made Me Do It. I think that might give you a glimpse into an even bigger picture of what Jeremie’s seeing. So, yeah, really brilliant guy.
Richmond Alake: 00:46:35
You know what I’ve noticed? A lot of people transition from physics, quantum physics, all that theoretical physics, into data science: Lewis Tunstall, Jeremie Harris… And I was like, what’s going on here?
Jon Krohn: 00:46:54
Yeah, well, I guess it’s because, you know, especially with things like Lewis was dealing with, he was dealing with huge volumes of data. So, it ties into the same kind of operations theme: he developed a lot of expertise in being able to handle huge amounts of data at CERN. The Large Hadron Collider generates absurd amounts of data per millisecond, and then you need to be able to come up with machine learning algorithms to put on top of all of those huge volumes of data to tease out the little bit of signal that’s in there. So, I can see how they were doing data science; they were just called physicists.
Richmond Alake: 00:47:32
Yeah. And spoiler alert, hopefully you get him on here: that was how he transitioned. He saw an algorithm that just processed all the data in, like, minutes; one of his colleagues must have shown it to him, and he was so shocked. He was like, what? Yep, I’m changing jobs; let’s go explore what this data science thing is. But again, very cool guy; you should totally have him on. But yeah, it’s definitely worth mentioning the operations side of machine learning; it’s going to be very, very relevant going forward. It was already relevant; there’s a lot of investment going into the MLOps space as well. Before generative AI was the topic, MLOps was the topic. I know we’ve all forgotten about it, but before generative AI, all the VC money was going straight into MLOps. And it still is.
Jon Krohn: 00:48:27
It still is. So, we recently had an episode come out, episode number 679, with an investor named George Mathew. George is at a hundred-billion-dollar VC and growth equity fund called Insight Partners; they’re one of the largest B2B SaaS investors in the world. And he was talking a lot about LLMOps, so, specialized MLOps to handle all these huge LLMs that are coming out now. And he’s doing a lot of investing in different parts of that stack.
Richmond Alake: 00:49:02
Yeah. And that’s an interesting space I would definitely love to explore. I saw, I forget the name, but someone quite prominent is writing a book on it; the announcement literally came out today. There’s an O’Reilly book coming out on LLMOps, or whatever you want to call it. But yeah, there’s definitely a lot of investment going into the space. And the company that is the foundation for it all is NVIDIA, which has really positioned itself as the supporting platform for all of these large language models, all of the AI stuff that is happening today. They’re in a very unique position. And I’m mentioning them because you were talking about working on the operations side, distributed training, which you would do across several GPUs, and most of those GPUs are NVIDIA GPUs. So, one thing I’m exploring is RAPIDS, which is NVIDIA’s sort of library that handles large-scale data analysis and also provides distributed training with the machine learning library within RAPIDS. I just thought I’d mention that as well.
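For a quick flavor of RAPIDS, here is a minimal sketch using cuDF: pandas-style dataframe operations executed on an NVIDIA GPU. The column names are made up, and it assumes a machine with a supported GPU and the cudf package installed.

```python
# Hypothetical sketch: pandas-style analysis on the GPU with RAPIDS cuDF.
# Requires an NVIDIA GPU and the RAPIDS cudf package.
import cudf

# Build a small dataframe in GPU memory (in practice, cudf.read_csv or
# cudf.read_parquet can load large files directly onto the GPU).
df = cudf.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "session_minutes": [45.0, 60.0, 30.0, 90.0, 75.0],
})

# Aggregations run as CUDA kernels rather than on the CPU.
summary = df.groupby("user_id").agg({"session_minutes": "mean"})
print(summary)
```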
Jon Krohn: 00:50:18
Nice. Is the book that you’re talking about, I’ve just been doing a little bit of research here while you’ve been talking, is the book called Reliable Machine Learning? 
Richmond Alake: 00:50:26
No, this book actually hasn’t even been written yet.
Jon Krohn: 00:50:31
Oh, oh, oh. They just, they just announced- 
Richmond Alake: 00:50:33
Yeah, they just announced that they’re writing it. So, it’ll probably come out in a few months.
Jon Krohn: 00:50:38
Oh, nice. I’m so sorry, I misunderstood; I thought you were saying that today they announced that it had come out, but it’ll be out in a few months. In the meantime, I dug up this book, also an O’Reilly book, called Reliable Machine Learning, by Cathy Chen and four other people. And the way I came across it is that I saw they had a YouTube video where they were talking about LLMOps. So, this might touch on it a bit while we wait for the LLMOps book you mentioned to come out.
Richmond Alake: 00:51:11
Yeah, no, it’s, it’s not my, not my, not- 
Jon Krohn: 00:51:14
No, sorry, sorry. The one that you just mentioned. 
Richmond Alake: 00:51:17
One I mentioned. Yeah, yeah, yeah. 
Jon Krohn: 00:51:20
Yeah, yeah. Standing On The Shoulders of LLMOps coming out. 
Richmond Alake: 00:51:28 [crosstalk 00:51:21] 
Jon Krohn: 00:51:31
Awesome. So, actually, that ties in really nicely to one of the final topics I’ve got for you today, Richmond, which is that for the last couple of years you have been writing a lot. You touched on this in the context of your book and how you’ll still be releasing that content as blog posts. So, you’ve been writing articles on Medium, and then you got picked up by NVIDIA, and you got picked up by Built In, as a contract writer for those folks, by saying yes. And your writing seems to be mostly about data science fundamentals and career advice, such as tips and tricks of the trade. One of your articles is titled “AI Pioneers Write and So Should You”. And Lewis Tunstall is a really great example; we’ve already been talking about him a lot on this episode. Jeremie Harris is another one. So, I guess you could first tell us why you write, but then, even more so, why do you think that data scientists in general should be writing?
Richmond Alake: 00:52:28
Yeah. So, why do I write? For me, writing is a hack. I started writing when I was doing my master’s in computer vision, deep learning, and space robotics, which essentially is AI, as a way to reinforce what I was hearing in lectures and a way to retain knowledge. The lecturers would talk about maybe a deep learning technique; I would research it, then I’d write an article about it to make sure I could explain it in detail, and I’d publish it on Medium. And I did that, so if you go through my writing journey, you’ll see the earlier articles are more about computer vision and deep learning, talking about some algorithms, some of the convolutional neural networks, some of the techniques that go on within them: pooling, regularization.
00:53:19
Those are actually some of my best articles in terms of views. Then I started reading research papers, and I wrote an article about how to read research papers properly after watching a video from Andrew Ng. I just watched the video, communicated the learning, and applied my own process to it. And that also did very well. So, why do I write? I write essentially to make sure that I understand what I’m learning. And within the field of machine learning, you will never stop learning. I think that’s the most times anyone has said “learning” in one sentence. But that’s the thing, right? You have to have a growth mindset within this field because you don’t know what’s gonna come out next. And for you to stay on top, or even just remain relevant, you need to be learning. So, why should other data scientists write? Because writing is a way for you to, one, retain knowledge, and two, create the sort of brand and industry visibility that is gonna accelerate your career far quicker than not writing would. Through writing, I’ve been picked up by NVIDIA, Built In, and several other companies, including Neptune AI, which is a very large MLOps solution. 
Jon Krohn: 00:54:39
Yeah, we’ve had them as a sponsor of the show in the past. 
Richmond Alake: 00:54:42
Oh, nice. So, yeah, you can see some of my blog posts on Neptune AI covering dataset versioning. Really, writing has given me that sort of reach within the machine learning and data science space, where my name can be mentioned in the same space as some of the large companies that operate within it. And that’s a very good benefit when you are looking to get employed, or to go out on your own to do your own consultancy or side projects, or you just want some extra income. I said yes to a lot of things, hence my name is just scattered all over the internet. I’ve had people come to me saying they want to translate my articles into a different language; one recently was Japanese. I was like, yep, go ahead, as long as you give me the credit. So, writing is very… if AI pioneers like Yann LeCun, Andrew Ng, Kai-Fu Lee, whose book I’ve got out here, if those guys write and they are literally at the front of the field, you should be writing as well. 
Jon Krohn: 00:55:47
Yeah. So, speaking of which, I wonder if that’s your book recommendation for us. We ask everyone on the show if they’ve got a book recommendation. Is Kai-Fu Lee’s book your recommendation? 
Richmond Alake: 00:55:56
No, although Kai-Fu Lee’s book is actually very good. I’m gonna do a bit of cheating here and do two. The first one is Architects of Intelligence; I came prepared, I knew you were going to ask this. It’s by Martin Ford. And this book is very good because it has all the insights and knowledge from people that are pioneering the AI field. So, we have Yann LeCun, Fei-Fei Li, Yoshua Bengio; all of them provide their sort of forecast on the space of AI and where it’s going. And it’s very interesting: if you were to read it now, you’d see a lot of the predictions around AGI saying, yep, we’re never gonna achieve AGI, from literally the people pioneering the field. But I bet some of those opinions have changed today, because the space is just moving so quickly.
Jon Krohn: 00:56:51
I mean, at the time of us filming, it was yesterday that Geoffrey Hinton announced that he was resigning from Google because of how quickly it seems we’re barreling towards AGI, and how concerned he is about that. He’s even saying that he regrets being a pioneer of deep learning now. In this New York Times article I was reading, he’s basically saying, we are screwed, and the only way he feels okay about what he’s done is knowing that if he hadn’t figured this stuff out, somebody else would have anyway. He acknowledges that’s the same kind of excuse people have used throughout history when they do bad things. So, it’s really striking to hear him using that kind of language. Like you’re saying, when this book came out a couple of years ago, I doubt he would’ve had those kinds of thoughts.
00:57:51
Personally, I’ve been blown away by how quickly we’ve arrived at the capabilities we have in models like GPT-4. It’s wild how well it can mimic human intelligence, or even exceed it on so many tasks, and I think it’s surprised us. Like we already talked about earlier in this episode, it’s crazy how fast this field is changing. From where we were two years ago to today: if it’s mind-blowing to you, it’s mind-blowing to me too, and we are people in this space; it’s mind-blowing to Geoffrey Hinton. So, who knows where we’re gonna be two years from now. I guess that’s why he’s so concerned, and so many people are so concerned. But anyway, go ahead if you have more to say on that before you get to your second book. 
Richmond Alake: 00:58:38
Yeah, I just wanted to say that more than mind-blowing, it’s overwhelming, because there’s just news everywhere. And the future is a bit uncertain; some people just don’t know where it’s going. I guess people like Geoffrey Hinton would rather hang up their gloves. But I wouldn’t say regret should be what he has towards his contribution to the field, if you get what I mean. Because as much as AI could potentially do a lot of evil, it could do so much good as well. 
Jon Krohn: 00:59:20
Exactly. That’s, that’s the flip side. 
Richmond Alake: 00:59:22
There is so much-
Jon Krohn: 00:59:23
Absolutely. While it will definitely get misused by bad actors, most people are good, and so the majority of uses should be good uses. I’ve got the same optimism as you, Richmond. I think there’s more good than bad that’s gonna come of this, but we need to mitigate risks as much as we can. And if I’m understanding correctly, a big part of why Geoff left Google was so that he could speak freely about his concerns and address them. 
Richmond Alake: 00:59:58
If you want to understand what the future could look like, this next book is AI 2041 by Kai-Fu Lee and Chen Qiufan, whose name I can’t pronounce properly. Kai-Fu Lee is a prominent figure within the AI and machine learning space; he led Google China, if I’m correct. In this book he touches on different ways AI, AGI, and augmented reality can affect us, through stories set in different parts of the world. And it’s a very good book just to understand why people are scared and what social problems could arise in the future, apart from AI taking over the world. It’s a very realistic book that takes the technology we have today and projects it less than 20 years into the future. But I’m sure if he were to write this book again, it would be so different, because the space, again, moves so quickly. 
Jon Krohn: 01:01:06
He called it AI 2041, and he’s probably like, “Man, I should have called that book AI 2025”. 
Richmond Alake: 01:01:10
Yeah, exactly. Calling it AI 2025 would’ve been more appropriate, because again, all the predictions in this book, I bet the authors would probably want to change them now. Right? And that’s the space we’re in. 
Jon Krohn: 01:01:23
Yeah. For our audio-only listeners, Richmond was holding up the Architects of Intelligence book just now as he was saying that the authors would like to change their perspectives. No doubt. So, in order for our listeners to be able to keep up with you as you change your own perspectives in this quickly evolving environment, how should they follow you? What’s the best way?
Richmond Alake: 01:01:45
Yeah, you can follow me on LinkedIn, where I post some of my progress, what I’m working on, and some interesting thoughts about this space as well. I might put one or two jokes there; my brother tells me I’m not funny, but yeah. Beyond LinkedIn, you can follow me on Medium as Richmond Alake, and also on Twitter. I don’t use it that much, but you can connect with me there as well. And if you’re interested in seeing the progress of some of the startups I’ve mentioned, you can check out MiniPT at minipt.co.uk; sign up to the waiting list and you’ll be among the first to get ahold of the app when we launch. Try to get into the gym and have AI train with you. And OpenSpeech will be launching in a closed beta; I’m very much in stealth mode with that startup. 
Jon Krohn: 01:02:31
Nice. I can’t wait to find out that I’m getting my butt too low to the ground in my squats. That’ll be nice, if I’ve reached the mobility threshold where your algorithm’s like, no, you’re too low. I’ll be like, sweet, at least I have mobility. Awesome. Richmond, I have really enjoyed this episode; it’s been so great getting to know you. You and I hadn’t had a conversation before meeting to record this episode, and because I was concerned about losing light in my unusual Swiss hotel room recording circumstances here, we jumped right into recording. So, we basically got to know each other on air, and it’s been such a blast, man. I look forward to catching up with you again in the future. 
Richmond Alake: 01:03:15
Yeah, same here. We can meet in person in New York.
Jon Krohn: 01:03:19
Yeah, absolutely. And we’ll have to have you back on the podcast at some point in the future to check in on how all of these amazing social-impact projects you have going on are coming along. 
Richmond Alake: 01:03:31
Thanks for having me. 
Jon Krohn: 01:03:37
What a fascinating individual Richmond is. It’s wild how much he’s accomplished in such a short time, and given the frequency with which he says yes, his ingenuity, and his work ethic, he’ll no doubt make an enormous impact in his lifetime. In today’s episode, Richmond filled us in on how he uses Databricks and Kinesis to ease the creation of applications that involve large-scale, real-time data streaming; how he’s leveraging the Swift programming language to develop MiniPT, his personal-training iOS app that incorporates Apple Core ML and Firebase, the low-latency, easy-to-use mobile database; how he uses a Python Flask backend and a React frontend for his generative AI application, for which he uses OpenAI APIs for proofs of concept and then Hugging Face Inference to quickly and cheaply get his own APIs up and running; how he loves the open-source Feast feature store for production ML, including real-time inference and offline model training in batches; and how writing for the public is a hack that anyone can take advantage of to force themselves to learn new concepts well. 
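As a rough illustration of that last tooling point, here’s a minimal sketch of a Feast online-feature lookup of the kind used for real-time inference; the repository path, feature view, feature names, and entity key are hypothetical, not taken from the episode.

```python
# A minimal sketch of the open-source Feast feature store: a low-latency
# online lookup at inference time. The repo path, the "user_stats"
# feature view, the feature names, and the entity key are hypothetical.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # points at a Feast feature repository

# Fetch the latest feature values for one entity from the online store
online_features = store.get_online_features(
    features=[
        "user_stats:avg_session_length",
        "user_stats:workouts_per_week",
    ],
    entity_rows=[{"user_id": 1001}],
).to_dict()

# For offline model training in batches, the same feature definitions
# can be queried point-in-time-correctly against the offline store, e.g.
# store.get_historical_features(entity_df=..., features=[...]).to_df()
```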
01:04:38
As always, you can get all the show notes including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Richmond’s social media profiles, as well as my own social media profiles at www.superdatascience.com/685. That’s www.superdatascience.com/685. Your feedback is super helpful for spreading the word about this show. So, if you feel like taking a moment to rate the show on Apple Podcasts or Spotify or whichever platform you listen to it through, that’d be awesome. And if you have feedback about the show, be it positive or constructive, I’d love to hear it. It literally guides me on how I should be tweaking the show. So, please share your feedback with me directly by tagging me in posts or comments on LinkedIn, Twitter, or YouTube. I will read it and I’ll reply. 
01:05:23
Thanks to my colleagues at Nebula for supporting me while I create content like this SuperDataScience episode for you. And thanks of course to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the SuperDataScience team for producing another scintillating episode for us today. For enabling that super team to create this free podcast for you, we are deeply grateful to our sponsors. Please consider supporting this show by checking out our sponsors’ links, which you can find in the show notes. And if you yourself are interested in sponsoring an episode, you can get the details on how by making your way to jonkrohn.com/podcast. Finally, thanks of course to you for listening all the way to the very end of the show. I hope I can continue to make episodes you enjoy for years to come. Well, until next time, my friend, keep on rocking it out there and I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon. 