(00:05):
This is Five-Minute Friday on my “Generative A.I. with Large Language Models” training.
(00:27):
Welcome back to The SuperDataScience Podcast. I’m your host, Jon Krohn. Today, I’m filling you in on my extensive two-hour training that I recently published on YouTube; it’s titled “Generative A.I. with Large Language Models”. Topical. This training is designed to immerse you in the world of large language models, or LLMs for short. In it, we leverage hands-on code demos of the powerful Hugging Face Transformers library and the PyTorch Lightning library. All of the code from the training is available in an accompanying GitHub repo, which, of course, I’ve included for you in the show notes. Most of the code demos are provided in Jupyter notebooks; the advanced demos near the end of the training, which involve training an LLM across multiple GPUs, couldn’t be run from Jupyter, so we’ve provided straight Python scripts for those.
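To give a flavour of the kind of hands-on demo I’m describing, here’s a minimal sketch, not taken from the training’s repo, of loading a pretrained model with the Hugging Face Transformers library and generating text; the model name “gpt2” is just an illustrative choice:

```python
# Minimal illustrative sketch of a Hugging Face Transformers demo:
# load a small pretrained causal language model and generate a continuation.
# The model choice ("gpt2") is an assumption for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```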
(01:17):
I split the training into four comprehensive modules. Module one lays the foundation with an introduction to LLMs. This includes a brief history of Natural Language Processing and an overview of key concepts like transformers, subword tokenization, and autoencoding models. We also explore seminal architectures such as ELMo, BERT, and T5, and of course, the renowned GPT family.
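As a quick illustration of the subword tokenization concept from module one (this snippet is my own, not from the training), a BERT tokenizer splits rarer words into smaller pieces:

```python
# Illustrative example of subword tokenization with a BERT WordPiece tokenizer:
# words outside the base vocabulary get broken into subword units.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.tokenize("Tokenization handles unfamiliar words gracefully"))
# Expected output along the lines of:
# ['token', '##ization', 'handles', 'unfamiliar', 'words', 'gracefully']
```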
(01:41):
Module two covers the breadth of LLM capabilities. In it, we explore LLM playgrounds, and the extraordinary progress of the GPT family, especially GPT-4. We also have a demo notebook focused on calling OpenAI APIs, which is a straightforward but nevertheless super powerful skill for data scientists today.
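For context, a call to the OpenAI API can be as short as the sketch below. This is my own minimal example, assuming the openai Python package (version 1 or later) and an OPENAI_API_KEY environment variable; it isn’t the exact code from the training’s notebook:

```python
# Hedged sketch of calling the OpenAI chat API (assumes the `openai` package,
# version 1 or later, and an OPENAI_API_KEY set in the environment).
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4",  # model name is illustrative
    messages=[{"role": "user", "content": "Explain what an LLM is in one sentence."}],
)
print(response.choices[0].message.content)
```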
(02:01):
In the third module, we dig deep; this module is really the meat of the whole training. In it, we train and deploy large language models. This part of the training takes you through the plethora of hardware options, best practices for efficient LLM training, and how to select and use open-source pre-trained LLMs, fine-tuning them for your own purposes. This third module provides a thorough understanding of multi-GPU training, deployment considerations, and monitoring LLMs in production.
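To sketch what that multi-GPU fine-tuning pattern looks like in practice (a hypothetical outline, not the repo’s actual scripts), a pretrained Hugging Face model can be wrapped in a PyTorch Lightning module, with the Trainer handling the distributed strategy:

```python
# Hypothetical outline of multi-GPU fine-tuning with PyTorch Lightning:
# wrap a pretrained Hugging Face model in a LightningModule and let the
# Trainer handle distribution. Model name, hyperparameters, and device
# counts are illustrative assumptions, not the training's actual code.
import pytorch_lightning as pl
import torch
from transformers import AutoModelForCausalLM

class LitLLM(pl.LightningModule):
    def __init__(self, model_name="gpt2", lr=5e-5):
        super().__init__()
        self.model = AutoModelForCausalLM.from_pretrained(model_name)
        self.lr = lr

    def training_step(self, batch, batch_idx):
        # batch is assumed to contain input_ids, attention_mask, and labels
        outputs = self.model(**batch)
        self.log("train_loss", outputs.loss)
        return outputs.loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=self.lr)

# Distribution is configured on the Trainer, not the model (assumes 4 GPUs):
trainer = pl.Trainer(accelerator="gpu", devices=4, strategy="ddp", max_epochs=1)
# trainer.fit(LitLLM(), train_dataloaders=my_dataloader)  # dataloader not shown
```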
(02:36):
Finally, in the fourth module, we delve into deriving commercial value from LLMs, that is, the applications of all this. We discuss how machine learning can be supported and enhanced by LLMs, tasks that can be automated or augmented with LLMs, and I provide you with some best practices for AI teams and projects. We also take a look ahead to what the future may hold for AI and, as a result, society at large.
(03:02):
Now, obviously I’ve put this training together because I think it will be useful to a lot of people. In two hours, we’ve packed a ton of valuable information in there, and yeah, I’ve put it together because, at an unprecedented pace, LLMs like GPT-4 are transforming the world and revolutionizing the field of data science. The benefits of these models are wide-ranging, from developing machine learning models and commercially successful data-driven products, to boosting the creative capacities of data scientists, pushing them to evolve into data product managers.
(03:34):
This training session was packed at the Open Data Science Conference in Boston, where I filmed it. A big thank you to the ODSC for hosting, and of course, to all of you who attended, asked great questions, and made it such a lively, engaging event.
(03:50):
I took my recording of that and paid a professional editor to create the slick version of my “Generative A.I. with LLMs” training that’s now available in its entirety on YouTube. It has already received over 7,000 views and has been met with approval: 241 thumbs up so far and no thumbs down! To allow you to enjoy the entire training from beginning to end without interruption, I’ve disabled monetization on the video; indeed, I’ve now disabled monetization on all of the educational content on my personal YouTube channel. So, you can enjoy it all, commercial-free, right now.
(04:25):
All right, leave a comment on the video to let me know what you liked or didn’t like about the training. And, of course, let me know what you’d like future trainings of mine to cover.
(04:35):
That’s it for today, all right, cool. Until next time, my friend, keep on rockin’ it out there and I’m looking forward to enjoying another round of the SuperDataScience podcast with you very soon.