90 minutes
SDS 835: AI Systems as Productivity Engines, with You.com’s Bryan McCann
Subscribe on Website, Apple Podcasts, Spotify, Stitcher Radio or TuneIn
AI systems are evolving rapidly, and in this episode, Bryan McCann, CTO of You.com, explains You.com’s unique approach to search, the impact of AI-driven research, and the game-changing potential of AI agents. With a background in natural language processing and philosophy, Bryan joins Jon Krohn to share a fresh perspective on where AI is headed and what it means for the future of work and scientific discovery.
Interested in sponsoring a Super Data Science Podcast episode? Email natalie@superdatascience.com for sponsorship information.
About Bryan McCann
Bryan McCann is the co-founder and CTO of You.com. Previously, he was a Lead Research Scientist at Salesforce Research working on Deep Learning and Natural Language Processing (NLP). He authored the first paper and holds the patent on contextualized word vectors, which led to the transfer learning revolution in NLP with BERT. His work includes early unified models for multi-tasking in NLP and applying language models to biology, where his team generated proteins shown to be as or more effective than those in nature. He received the 1st ever eVe award at SXSW 2021 for his collaboration with author Daniel Kehlmann. Bryan's work comes from a deep philosophical interest in meaning.
Overview
AI systems are expanding beyond traditional search, with You.com creating a “do engine” that connects users with multiple language models for more personalized, in-depth answers. You.com’s “smart mode” provides quick responses, while “research mode” offers detailed insights—serving a range of needs, from simple queries to complex analysis.
Bryan’s conversation with Jon covers how You.com’s technology enables advanced workflows for businesses, with companies in biotech and finance using You.com’s research capabilities to navigate extensive datasets, synthesize information, and streamline decision-making. With AI-driven analysis and context-aware responses, You.com equips organizations to gain insights and increase productivity without growing their teams.
Bryan also explains his work on unified AI models and the surprising parallels between language and biology. By training models to handle diverse tasks through a shared approach, You.com explores new possibilities for AI, from generating proteins to making scientific predictions. Bryan envisions a future where AI agents tackle complex problems, freeing people to focus on creativity, discovery, and meaningful pursuits.
In this episode you will learn:
- (03:55) How You.com’s “do engine” approach connects users to multiple language models
- (11:34) How AI systems at You.com generate optimized, intent-driven queries for better results
- (28:39) You.com’s focus on automated workflows sets it apart from other platforms
- (31:31) AI agents in You.com, with Bryan predicting they’ll outnumber people by 2025
- (41:49) Bryan’s path to unified AI models that can perform diverse tasks
- (50:40) Early experiments with alignment in AI that influenced modern transformers
- (01:04:45) Bryan’s research on controllable text generation
- (01:11:27) Language models applied to protein generation, linking text and biology sequences
Items mentioned in this podcast:
- You.com
- CS229: Machine Learning
- Profluent Bio (the startup built on the ProGen model)
- Moonhub
- OpenAI
- Perplexity
- OpenAI o1
- Jon Krohn’s Deep Learning Study Group
- Word2Vec
- “Learned in Translation: Contextualized Word Vectors” by Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher
- “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer” by Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
- GPT-2
- SDS 739: AI is Eating Biology and Chemistry, with Dr. Ingmar Schuster
- VentureBeat Article on AI Agents
- Searching for meaning in the digital age
- An Elegant Puzzle by Will Larson
- General Topology by Stephen Willard
- Moby Dick by Herman Melville
- SuperDataScience
- Jon Krohn's Mathematical Foundations of Machine Learning Course
- The Super Data Science Podcast Team
Follow Bryan:
Podcast Transcript
Jon Krohn: 00:00:00
This is episode number 835 with Bryan McCann, CTO of You.com.
00:00:12
Welcome to the Super Data Science Podcast, the most listened to podcast in the data science industry. Each week we bring you fun and inspiring people and ideas, exploring the cutting edge of machine learning, AI, and related technologies that are transforming our world for the better. I'm your host, Jon Krohn. Thanks for joining me today. And now let's make the complex simple.
00:00:45
Welcome back to the Super Data Science Podcast, prepared to be blown away by today's tremendously intelligent, successful, and well-spoken guest, Bryan McCann. Bryan is co-founder and CTO of You.com, a prominent AI company that has raised $99 million in venture capital, including a $50 million series B in September that valued the firm at nearly a billion dollars. He was previously lead research scientist at Salesforce and an assistant on courses at Stanford, such as Andrew Ng's wildly popular machine learning course. He holds a Master's in Computer Science, a Bachelor's in Computer Science, and a Bachelor's in Philosophy, all from Stanford University.
00:01:28
Today's episode should be fascinating to anyone interested in AI. In it, Bryan details the philosophical underpinnings of the breakthroughs that led to the leading AI models we have today, as well as the ones that will emerge in the coming years. He talks about how a coding mistake he made serendipitously revealed fundamental insights about meaning and language model development. He talks about why he believes humanity is entering an existential crisis due to AI, but nevertheless remains optimistic about the future. He talks about the fascinating connection between language models and biological proteins, and why AI systems might soon be able to make scientific discoveries humans could never dream of making. All right, you ready for this extraordinary episode? Let's go.
00:02:17
Bryan, welcome to the Super Data Science Podcast. It's awesome to have you here and your audio is so good.
Bryan McCann: 00:02:26
Well, thanks for having me. I'm honored to be here and thank you for helping me with my audio setup today.
Jon Krohn: 00:02:33
Yeah, we had a fun moment. We were scheduled to record several hours earlier than Bryan and I are actually recording, and he went out and bought a microphone. In case anyone's ever wondered, if you really like my sound quality lately, a couple of months ago I bought a Shure, S-H-U-R-E, a Shure MV7+. It's super easy. You can plug it in via USB-C to your laptop, and it has all kinds of built-in things that make the sound really good. And both Bryan and I are in very echo-y rooms, but you can probably barely tell.
Bryan McCann: 00:03:08
I sure hope so.
Jon Krohn: 00:03:11
It sounds good.
Bryan McCann: 00:03:12
Had to go all around town to get this thing. They talked me into the mic stand too. It's like 15 pounds. I'm going to have to carry it on the plane back to New York.
Jon Krohn: 00:03:22
Yeah, from San Francisco, which is where you're recording today, right?
Bryan McCann: 00:03:25
Yeah, yeah.
Jon Krohn: 00:03:27
Nice. All right. Well, so let's dig right into the technical content here. You're the co-founder and CTO of You.com. I'm super grateful to have you on the show because You.com is a really cool company. I've been using You.com for some time now, and you can explain this better than me, but I'll kind of tee you up for it, is that You.com reimagines search by, instead of having a search engine, it's what you describe as a do engine. And so the first product that made a big splash was connecting us to lots of different large language models. So you can go to You.com, it's free to use, or at least there's some kind of amount that you can use it just for free.
Bryan McCann: 00:04:09
Mm-hmm.
Jon Krohn: 00:04:09
You can sign up with your email or log in via Google or Apple authentication. And then there's dozens of different LLMs that you can choose from. So if you want to be using the latest state-of-the-art models from Anthropic or OpenAI or open source models, they're there. So anyway, so that's kind of like... That was my introduction to You.com. It's probably what I use it for the most. But yeah, tell us more. You can do a much better job pitching it than me, I'm sure.
Bryan McCann: 00:04:35
I don't know. I don't know about that, but you did a fantastic job. But if I were to add on, I would say the model agnosticism, if you will, is definitely a big selling point. People love coming to You.com for the latest and greatest to try out OpenAI, Anthropic, Gemini, wherever. A lot of the most recent open source models are typically there. We work with all of these companies and the groups that are making these models to make sure that we're going to have them as early as possible. And it's just really convenient for people to have, say, one place that they can go to try these things out side-by-side to some extent for free. And then they can also go into our more premium modes and get a subscription. And that's where you can access maybe some of the more advanced models to a greater extent, like larger context windows, more advanced file upload features, and some of the premium offerings where we decide which models to use for more complex use cases like really deep research.
Jon Krohn: 00:05:55
So that's something that I'd love to understand. And obviously you're not going to be able to go into intellectual property and spill all your secrets. But-
Bryan McCann: 00:06:02
You never know.
Jon Krohn: 00:06:03
... for example... Yeah. Yeah. Yeah. Listen up, listeners. So instead of clicking on say, GPT-4o, I can click on Smart as the kind of... I mean, I guess that's not the LLM that I'm choosing, but I'm choosing kind of like Smart Mode in You.com. And by the way, listeners, You.com's Y-O-U.com, like not me.com, but You.com, and that must've been an expensive URL.
Bryan McCann: 00:06:34
I'll get to that. Yeah. Yeah. Let me touch on Smart Mode. There's a Research Mode. That's my personal favorite. And there's also a Genius Mode, but we're in the process of essentially combining Research and Genius into something more advanced. So Smart Mode is our version of a GPT-4o, but then surrounded with maybe a dozen other models that are improving and rewriting your queries, that are potentially rewriting prompts and dynamically constructing prompts for your use case and your intent so that we pick the best prompt, the best model, and also try to include a little bit in Smart Mode of citations so that it's a little bit more grounded in search results.
00:07:33
Now the next step up, so Smart Mode is free. That's the free one. That's what you can go to. And if you don't want to think about which model to use, you use that Smart Mode. Research Mode, for me, that's where it's at, because in that one, we're using more advanced models by default for everything. Every time you type in a query, we're not just doing a search behind the scenes, but we're doing multiple searches behind the scenes for all the things that you probably should ask, but you didn't in that original query, going and getting all of those search results back. And then that model is optimized less for... Or that system, really, that mode is optimized less for quick factoid concise answers and more for comprehensiveness and accuracy being really, really grounded in that information.
00:08:27
So if you try a Research Mode and compare it to a Smart Mode response, it's going to be much longer, you're going to get very accurate citations, usually one per sentence essentially, and you're going to get 100 different sources or something like that so that you can use it. You can start to see what it would be used for if you were a biological researcher or an analyst or something like that.
Jon Krohn: 00:08:55
Very cool. And this actually, what you're describing, it reminds me of some of the kinds of things that I'm doing in my own company at Nebula with our LLMs, and this is the kind of thing that you might have some experience with because you are an advisor at literally a competitor of my company. It's called Moonhub, and I dare say Moonhub is the best known of the companies in my space. So these kind of talent acquisition automation platforms.
00:09:24
And well, we have an encoding LLM. So we have 180 million public profiles that we've scraped of people in the US, their professional information, and asynchronously, we have pre-computed vectors using an encoding LLM for each of those 180 million people. So for our listeners, you can think of that as a big table or a big spreadsheet with 180 million rows representing each of these people, each of these profiles. And then there's 1,000 columns that are just numbers. And those 1,000 columns indicate a location in space. And so somewhere in our 1,000 dimensional space, there's data scientists. And then probably nearby them there's data analysts, and software developers are probably not far away.
00:10:21
But servers at restaurants would be in probably quite a different part of this space. And actually the word server is a good example, because what's great about these kinds of sophisticated embeddings, these vectors in the way that they're created with modern LLMs, is that the word server would mean a very different thing in the context of somebody who's talking about liquor and food relative to somebody who's talking about Python and SQL.
Bryan McCann: 00:10:53
Absolutely.
Jon Krohn: 00:10:55
And so anyway, we have that encoding LLM, and then in real time we allow people to write queries, and we convert that query also into that 1,000 dimensional space. And then you can rank those 180 million people in milliseconds and surface the top people in the US for the query that the person put in. But people don't always put in optimal queries, kind of like you were just describing. And so we have our own generative LLM that takes whatever input people provide to us, and we convert that into something that's optimized for our encoding LLM to turn into a vector downstream.
00:11:32
And so it sounds like you guys have done a similar kind of thing where for, I guess, each of the different kinds of models out there, like the OpenAI API or the Claude API or the Cohere API, you've figured out different tricks with prompts and with restructuring, or it sounds like even rerunning in Research Mode, running maybe multiple queries. And so everything that you're saying kind of makes sense to me, and I probably went into way too much detail about my own use case.
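[Editor's note: the retrieval flow Jon describes, pre-computed profile embeddings plus a real-time query embedding ranked by similarity, can be sketched roughly as below. Every name here is illustrative, and the `embed` function is a toy deterministic stand-in for a real encoding LLM, not Nebula's actual stack.]

```python
import hashlib
import math
import random

DIM = 64  # production systems use on the order of 1,000 dimensions

def embed(text: str) -> list[float]:
    """Toy stand-in for an encoding LLM: a deterministic unit vector
    per text. A real system would call a trained encoder model."""
    seed = int(hashlib.sha256(text.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are unit-norm, so cosine similarity is just the dot product.
    return sum(x * y for x, y in zip(a, b))

# Offline step: pre-compute one vector per profile (180 million rows
# in production; three toy rows here).
profiles = {
    "alice": "data scientist, Python, SQL",
    "bob": "data analyst, dashboards, spreadsheets",
    "carol": "restaurant server, food and beverage",
}
index = {name: embed(text) for name, text in profiles.items()}

def rank(query: str, top_k: int = 2) -> list[str]:
    """Online step: embed the query, then rank every profile by
    similarity to it in the shared embedding space."""
    q = embed(query)
    scored = sorted(index, key=lambda n: cosine(index[n], q), reverse=True)
    return scored[:top_k]

print(rank("machine learning engineer"))
```

With a trained encoder instead of the toy one, semantically related profiles (data scientists, data analysts) land near each other, while "server" in a restaurant profile lands far from "server" in an infrastructure profile, which is the contextual behavior Jon highlights.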
Bryan McCann: 00:12:04
No, no, that's great. I love it because that's a big component of our stack as well. You can imagine the flow for us in a Research Mode looking something like a query comes in, we ask an LLM, "Hey, what are the top three, five, ten queries that should be asked?" So it's not just rewriting the query, it's actually a little bit more than that to ask the questions that you're not asking that could be relevant. And then we'll go out and do searches for all of those. So we will go to some sort of search engine like the one we've built internally, but also potentially external tools. Maybe some of the search is done through a vector database the way you're describing, and sometimes it's a lexical-based search. Those things are all part of what I call understanding, an intent understanding. So depending on the intent of the user or the type of query we're dealing with, all of these things can vary.
00:13:18
Sometimes we'll personalize the answers, sometimes we won't. Sometimes we won't do a search at all. But then once you get all of these sources back and bring them in, put them into a... You could just put them into a context window for a generative LLM, but you can spend a lot of time optimizing that prompt. And so what we've found is for every model, there's an ideal way or a more ideal way to write prompts. And actually for every combination of model, and even to some extent like user, you can change these things to get much better responses. We'll have then models that are checking. So we'll go out and instead of a normal RAG-based approach that maybe pulls in some of these documents but pulls in snippets and summarizes them, we'll actually go out and crawl the pages live, again in Research Mode, to make sure that they're 100% up-to-date and fresh.
00:14:24
So yes, you're using the search index as a cache to know where to go, but then you actually do go get the freshest, latest, up-to-date information. And we'll have models then go in and check, okay, did this sentence generated by the language model actually say what that source said? Because sometimes even if you give them a source, they'll hallucinate, right? So you have to check for implication and entailment in both directions. And sometimes it's okay if one direction is off, but again, that depends on the intent, depends on what people are asking.
00:15:03
So again, there's maybe a dozen or so models that will run on a single Research Mode query, which is part of why that might take a little bit longer, but the output is usually so much better and more accurate, and we can see user behavior changing to ask for more of that. People are willing to wait a couple extra seconds if the output is dramatically better. If the output wasn't dramatically better, then speed is king. But if you can do something that feels almost magical, then it's worth waiting for. And now we're seeing this translate into B2B use cases for us, which are where it gets really, really exciting as well.
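[Editor's note: the Research Mode flow Bryan outlines, expanding the query into sub-queries, searching each, then checking the generated sentences against their sources, might be orchestrated along the lines below. Every helper is a hypothetical placeholder for a model or search call, not You.com's actual code.]

```python
# Hypothetical sketch of a Research-Mode-style pipeline.

def expand_query(query: str) -> list[str]:
    """An LLM would propose the questions the user didn't ask."""
    return [query, f"background on: {query}", f"recent findings: {query}"]

def search(sub_query: str) -> list[dict]:
    """The search index acts as a cache of where to look; a real system
    would re-crawl pages live for freshness. Stubbed results here."""
    return [{"url": f"https://example.com/{abs(hash(sub_query)) % 100}",
             "text": f"Source text about {sub_query}."}]

def generate_answer(query: str, sources: list[dict]) -> list[str]:
    """A generative LLM drafts cited sentences grounded in the sources."""
    return [f"{s['text']} [{s['url']}]" for s in sources]

def entailed(sentence: str, sources: list[dict]) -> bool:
    """A checker model would verify entailment in both directions
    between sentence and source; a naive containment check stands in."""
    return any(s["text"] in sentence for s in sources)

def research(query: str) -> list[str]:
    # Fan out: one search per sub-query, pooled into a shared context.
    sources = [doc for sq in expand_query(query) for doc in search(sq)]
    draft = generate_answer(query, sources)
    # Drop any sentence the checker can't ground in a source.
    return [sent for sent in draft if entailed(sent, sources)]

for line in research("effects of drug X on cancer patients"):
    print(line)
```

The dozen-or-so models Bryan mentions would slot in at each stage (query expansion, prompt construction per model, entailment checking), which is why a Research Mode query takes longer than a single LLM call.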
Jon Krohn: 00:15:48
Nice. Can you go into a little bit of an example, like a case study or two where it's... I mean-
Bryan McCann: 00:15:51
Oh, sure. Yeah.
Jon Krohn: 00:15:52
... even explaining maybe even before going to the B2B, when you're using Research Mode, what are some examples of queries you've run recently where you're like, "Wow, I'm really glad I did this in Research Mode as opposed to Smart Mode or instead of in ChatGPT or in Claude"?
Bryan McCann: 00:16:06
100%, yeah. This will be revealing of my daily life as well as one of my secrets, but I will regularly use our Research Mode before I meet any customers. So that's a perfect example of a use case where if I were to just type in the name of a company that I'm about to go in and give some technical pitch to, I'd get some links if I was using something like Google. If I was using ChatGPT, I might get some information about that company. But if I use Research Mode and I type in the company's name, it's also going to ask who maybe the founders are. It's going to ask who the key people I should be talking to are. I can have all of this extra research done for me. And if that takes five, six seconds, that's actually way more efficient than me sitting there trying to think about all the things I should be asking and then making follow-up searches. I want as much done in that one query without me having to think about it as possible.
00:17:18
And so then that translates into some of the B2B use cases we're working with. Maybe biotech companies, VC firms, things like this where you actually start seeing the queries change. So one company we've been working with called Elucidata, they have a lot of researchers who are sifting through millions of research reports, PDFs and CSVs for clinical trial data associated with those research reports. And they're literally trying to figure out what is the next best experiment I could run to get rid of cancer or something like that.
00:18:03
It's a lot of data to understand, for one human mind to understand. And if you wanted to ask a question now of our solution for Elucidata, it's basically like going to a resource mode for them over their data and the public web data. So you can say, "Look, did this drug have a positive effect on cancer patients?" They have all these research reports. We can do search over those. They have all this clinical data that we can actually analyze in a code interpreter or kind of agent for data analysis there. And we can go out to the public web and get research reports that might not be from their proprietary data, but might be from other companies or academia and do the same thing. And with that one question, we're going to bring all of it back, break it down and do more than summarize, but actually synthesize certain parts of their research for them all with very clear citations and attributions so they can click through and verify over time.
00:19:15
Another maybe even better example is a VC who's trying to know, should I invest in company X? Not the company, X, but company Y, let's say. And the same holds. They have all the investments they've made in the past from their firm, they have maybe information in a bunch of CSVs from their due diligence. There's a whole lot of information about that company, very likely on the public web. So when you say, "Should I invest in this company?" what you're really asking, and what our deep Research Mode over this kind of proprietary use case does for you, is, "Well, who are their founders? Who are their key employees? Did their founders have prior exits? How did their last companies do? In addition to that, let's go over all of the due diligence materials you have. Does this fit into your previous investments that were successful? Of the ones that were not successful, if it looks like them, why weren't they successful?" Because again, we're optimizing for comprehensiveness. And those responses, they might take a minute, but all of the VCs that are using it are telling me, "Hey, do more. It could take a week. I don't care."
00:20:39
Because if you can actually answer that question accurately and do all that research for them, it would take them two or three weeks to do the same work. So if it takes one week, that's fine. We don't actually know how to even do something that runs for a week right now, but that's an exciting user behavior change. To be asked a question like that is very, very interesting, and that's why I'm very excited about automating more and more of these workflows. Where now Research Mode on You.com might generate these text reports, but Research Mode for some of these companies will generate slide decks with images because we're calling out to image generation capabilities on top of all of the research we've done. And you can just walk into your investment committee and almost decision made. We try not to have the AI actually make the decision. But again, it's comprehensive, so it gives you all the information you would want to make the decision.
Jon Krohn: 00:21:38
This idea of longer inference times, like taking a minute, or maybe in the future as you guys do more R&D, taking an hour or a day or a week or longer, it's reminiscent for me, although under the hood, I suspect a completely different kind of process, but it's reminiscent for me of OpenAI's o1 model, which is explicitly designed to do that same kind of...to scale at inference time, which seems like there's a lot of potential in that in terms of getting really accurate, really comprehensive, like you're describing, really thoughtful results, and so yeah, it sounds like you're kind of following that same scaling opportunity.
Bryan McCann: 00:22:16
Definitely. There's definitely a similar theme going on here. Their approach is promising. They were focusing primarily on mathematical reasoning and any capabilities like that, whereas we're focused generally on productivity and typically in a company's most valuable space. We're not doing emails and Slack and Notion. It's usually this kind of core biotech research or at the heart of your investment thesis. But it is a very similar theme and I'm super excited about it.
Jon Krohn: 00:22:54
Nice. What does a company get, a VC firm that's working with you or a biotech company, or maybe there's people at companies that are listening that they think, "Wow, I'd love to be using You.com in a B2B use case." What are the advantages of engaging in that way and having that B2B enterprise relationship as opposed to just going to You.com and getting a subscription and using the research tool there?
Bryan McCann: 00:23:21
So, typically what we see is some individuals start with You.com. They're using it, they try out Research Mode. They get a subscription, but then, fairly quickly they want to use it for work and they want to start uploading files, files that maybe they need some data guarantees on, or they want to see shared responses, or something like this. They want to be able to collaborate on those things. And so, then you can move into the teams options on You.com, and you can move into the enterprise options. But once you get further beyond that, once you maybe have some specific needs that aren't offered by the out-of-the-box platform itself, then it really makes sense to reach out to us. Because sometimes we'll help people integrate our APIs so they can build some of their own solutions and their own tooling. So some people just use You.com as it is, and that's essentially, if it's doing everything you need, great. That is what it's there for.
00:24:31
If you need to build an internal tool for your company that looks a lot like You.com, but it can't run on the public web or something like that, well then, you can use all of our APIs, which back Smart Mode as a Smart API and Research Mode as a Research API, and we can make those work over your data. And if you just don't want to think about the rest, we'll help you do that. And then you can have your application. A lot of people at this point, in fact my favorite customers, are the ones that have spent maybe a year or so with five, six, maybe a dozen people trying to build a RAG-like solution using OpenAI and following a blog post to set up a vector database and do this, and they just don't really see the ROI or they don't get it adopted within their company.
00:25:27
A lot of large companies are not seeing adoption, even once they've built their internal RAG tool. And we can come in and bring all these extra models, this extra layer, oftentimes I'll call it a trust layer, on top, to actually make the stuff work and make it trustworthy for your employees. And that's where we will walk them through evaluations if they don't have evaluations set up. And if they do, then we crush those evaluations and show them how much better it will be with our technology on top. At the same time, you're future-proofed against future models coming out from OpenAI, Anthropic, et cetera. It's the same reason you might not want to be locked into a single vendor on anything crucially important. You can come to You.com for that as well.
Jon Krohn: 00:26:20
Well, everything you're saying, Bryan makes a huge amount of sense to me. It sounds, with my entrepreneur hat on, or if I were a VC investing in things, it seems like you guys have figured out all the angles just right in terms of having a great B2C product, a valuable B2B product, getting those enterprise functionalities in there, like collaboration and data protection, customization. And it must be really exciting to be working on a product like this.
Bryan McCann: 00:26:47
It's been a phenomenal year for the company, and it makes it a really exciting time to partner with folks that also have alignment with our longer-term roadmap. If there's anyone out there that kind of knows anything about my background or my co-founder's background where we try to push pretty hard on the innovation front, and so we've always got a million ideas that we're looking for, design partners to work with as well. And if there's just something that you can't quite find out how to do with You.com or otherwise, then I think it still makes sense to chat with us and see whether that's something that we're looking at doing.
Jon Krohn: 00:27:33
Super cool. When people hear some of the functionality that you've been describing so far, particularly, this idea of bringing back results from the web using LLMs to present that, some of our listeners might think of Perplexity, for example. How do you guys distinguish against them?
Bryan McCann: 00:27:52
So Perplexity and us, I would say maybe a year or so ago, year and a half now, very, very similar. They've continued down a path that is seemingly much more focused on the consumer side of things and those Google replacement queries, to be your replacement for a daily driver search engine. We're really not focused on that anymore. For us-
Jon Krohn: 00:28:22
We're not focused on You.com.
Bryan McCann: 00:28:25
We are focused on You, not Google. We're focused on these deeper, more complex automated workflows. That's where our technical and product roadmap is going, where we can... Our functionality is getting further and further away from those quick, concise, quick knowledge-based answers that you can find on the web, and focusing more and more on where do you get the most value in your company? What could be automated there? How could we double your productivity or double your bottom line, or whatever the metric is without changing your head count at all?
00:29:20
Everything related to that, that's what's most important to us. So, our success is really dependent on customer success. It has nothing to do anymore with beating Google or having a certain market share of the search space. That's just something we're not focused on. We're just focused on you, as you eloquently said before.
Jon Krohn: 00:29:45
Ready to take your knowledge in machine learning and AI to the next level? Join SuperDataScience and access an ever-growing library of over 40 courses and 200 hours of content. From beginners to advanced professionals, SuperDataScience has tailored programs just for you, including content on large language models, gradient boosting and AI. With 17 unique career paths to help you navigate the courses, you will stay focused on your goal. Whether you aim to become a machine learning engineer, a generative AI expert, or simply add data skills to your career, SuperDataScience has you covered. Start your 14-day free trial today at superdatascience.com.
00:30:25
Nice. I love that. Another thing that seems to distinguish you from maybe anyone else out there, and that I'd love to dig into in a lot more detail, in fact I have quite a few questions on this coming up, is AI agents. I talked right at the beginning of this episode about how instead of being a search engine, You.com talks about itself as being a do engine. But so far we've mostly been talking about... In some respects we've been talking about doing research, I suppose. But the result of doing that research is still bringing back search results even if they are a lot more comprehensive, a lot more thoughtful. You've covered a lot more bases.
00:31:08
But it seems to me like the AI agents that are really prominent in your platform these days are what make this really a do engine. So maybe you could tell our listeners a bit about your perspective on agentic AI, particularly given that you have predicted, so there's a VentureBeat article that we have in which you predicted that there will be more AI agents than people in 2025, which is next year.
Bryan McCann: 00:31:38
If I have to, I'll just truly create enough to make that true. This is great because it goes back to our origin in starting You.com. We entered startup life, coming out of our research time, our AI research years, and entered into search, in particular, because our intuition was that search would have the most dramatic changes. That really, our relationship to how we retrieve and interact with information would change dramatically. And so, the nature of a search box in 2020 when we started would really change and you could do a lot more. So when we started out, we were talking about a search box, and so we started out building out the foundations of a search engine, but trying to mix in generative AI along the way so that we could do more than just give you 10 blue links.
00:32:56
Today, we call those agents. Everything in that world of trying to do more for users, whether it's booking flights for you, or booking reservations, or whatever it is, taking actions on your behalf is typically referred to as an agent. I maybe have a slightly stricter definition from the last year and a half's worth of agents: letting the LLMs decide what to do. I do think there's a little bit of a sense in which people are calling all software agents now, or anything that does anything automatically, which is probably a little bit of a stretch of the term, but I understand why it's happening. But a year and a half ago we started developing LLMs that were deciding which tools to use, so it's intimately related to tool use. Some of those tools can be a search engine. So, some of our more advanced agents can use search engines or our Research Mode even as a tool in these more complex workflows.
00:34:10
So for us, yes, you can put a query into Research Mode and it can go do all that research for you, or you can build one layer up and have an agent that can go use Research Mode, but then can also interact with some of your own internal tools. And it can also then go write code based on all of that research, and perhaps that code is to do some of the data analysis, or perhaps it's to write VBA code so that you have a slide deck. And so, every time you're changing the tool being used automatically just based on what the LLM wants to do or how you've prompted your agent to work, that's where I see us extending these workflows to do a lot more. And that's where a lot of our users are finding it to be really exciting as well.
00:35:07
It's not just search, it's not just deep research. It's do all that research for me, but then do a bunch of other things with it, whether it's updating my CRM or going and sending some emails, creating slide decks, writing code that's going to run in these integrations. That can be research-based, but those are pretty different actions than just doing research.
Jon Krohn: 00:35:37
Nice. And it seems to me that even though I don't have a subscription to You.com, I still have access to these agents. And so any listener can go create a free account and get things like help with booking travel, that kind of stuff, through the platform for free. They should give it a try.
Bryan McCann: 00:35:54
For sure.
Jon Krohn: 00:35:54
Super cool. And congrats. Recently you had a $50 million fundraising round. That's really exciting. I think this is going to be maybe my last question on You.com. Then I'm going to get into some of your research that you've been doing in publishing. What's that like? When you raise $50 million, what does that change in terms of, I don't know, company structure, the way you do things? Does it change anything at all?
Bryan McCann: 00:36:18
It did change things for us. We'd raised $45 million before that, but over the course of three and a half years. This round was specifically to start building out this B2B side of the business. And so, that meant we had to hire a lot. We had to build teams that were equipped to, well, create and sell and market enterprise solutions. So the org did change and conversations changed, and it wasn't all about the consumer side and these subscriptions. It was suddenly, well, we're trying to do sales, and I'm flying to Germany and London and spending time with these enterprises doing workshops, and helping them learn how to transform their company so that they can actually better use You.com.
00:37:23
Since a lot of the enterprise stuff for us is API based, usage based, it's very different. In many ways it's been quite exciting to see the company shift and change, and my role has changed in ways I just described, and I've really loved it. It's been super exciting. And it means that we've been able to accelerate our revenue growth and keep up the pace that we had been having for the previous years.
Jon Krohn: 00:38:01
Congrats, Bryan, and thank you for taking the time to be on this show while all that is going on. I really appreciate it. I'm sure our audience does as well. So let's dig into your research. Let's move away from You.com, though feel free to still talk about it wherever you want to. Very interesting.
00:38:16
But I want to dig into some of your research that spans not just recent years at You.com, but also years before that, because you've been producing machine learning research for over a decade at Stanford, at Salesforce, and now You.com. You hold over a dozen patents, mostly in natural language processing, in areas like unified models, explainable AI, LLM evaluation, and controllable text generation.
00:38:41
I don't know if you want to give... I've just given an overview, but I don't know if there's anything else that you want to say as a general overview before I start digging into some specific questions, some specific topics.
Bryan McCann: 00:38:54
I think there are some, maybe I can just highlight some of the big themes and motivators that I had when I was doing that research before we get into the specifics. I came at natural language processing. I came at AI from a philosophical perspective, originally. I was doing philosophy of language and computer science on the side for fun. I was interested in meaning, and in the academic world and analytic philosophy, a lot of those questions of meaning become questions of language. But I felt like there was a bit of a... I was running into a dead end, at least, with armchair philosophy. And so, I wanted to use computers to study language. And really I wanted to use computers to study meaning, like what is this thing that we're doing when we make stuff mean things?
00:39:48
And I don't just mean in what I call the logical positivist philosophical way, like true and false and things like that. I don't know, this can be a meaningful conversation. Maybe to someone listening out there, this is a meaningful thing for them to listen to. And it's not just about the content and the semantics, but it's something more than that. And so, I was always looking for that in my research, whether it was my first papers on contextualized word vectors or really my broader pursuit and thesis that we should have unified models for all of AI, that was also driven by this desire to understand meaning. And even in focusing on controllable generation and other areas, it was all about giving us tools for understanding what we were doing. Maybe I'll pause there. You can-
Jon Krohn: 00:40:54
Let's dig into the unified models that you just mentioned there, because that's actually, incidentally, the next topic that I had lined up. So on your website you describe pioneering unified AI systems for NLP, for natural language processing. Tell us more about that. What does that mean? In my mind when I hear about that, it makes me think of how, as LLMs have been scaling, we end up discovering that they can do this modeling of the world, having a world model.
00:41:22
So for example, when OpenAI's Sora generates video, it seems to encode within its model parameters somehow an understanding of how physics works, so that if a ball is moving through the footage in Sora, it continues at the same speed and the same trajectory, or that if it's moving downward, its speed should increase, that kind of thing. And so, when I think of a unified model, I think of this LLM scaling maybe involving more and more modalities, being able to all of a sudden handle any kind of intelligence task, any kind of cognitive task. That's what comes to mind for me. But I don't know if that's what you're talking about when you talk about unified AI systems.
Bryan McCann: 00:42:08
I think that is in the right direction, if not exactly what I'm talking about. When I was starting out in research and when I walked over to the computer science department, I was looking for someone who was working on language. I found my current co-founder, Richard, who was doing his PhD. And he was working on deep learning for natural language processing in his dissertation. By the time-
Jon Krohn: 00:42:35
I've known him for a long time, because I used to run, in 2016, I started running this deep learning study group in New York-
Bryan McCann: 00:42:41
Oh great.
Jon Krohn: 00:42:41
And we would decide together what coursework to follow, and we started off with a deep learning textbook, but then we'd covered the basics. And so we wanted to get to the cutting edge. So we first went through Andrej Karpathy's course with Fei-Fei Li, the general deep learning course, which has a lot of machine vision applications. And then the next thing that we did was Christopher Manning and Richard Socher's Deep Learning for NLP course.
Bryan McCann: 00:43:09
Well, there you go. So, that's exactly the material that I was encountering for the first time in 2013 when I met Richard. And a lot of the core hypotheses that were being explored there, this idea of a word's meaning being associated with the context in which it's used, or I think I'm going to butcher the quote, but there's this guy named Firth, and he says something like, "You shall know the meaning of a word by the company it keeps."
Jon Krohn: 00:43:44
Mathematics forms the core of data science and machine learning. And now, with my Mathematical Foundations of Machine Learning course, you can get a firm grasp of that math, particularly the essential linear algebra and calculus. You can get all the lectures for free on my YouTube channel, but if you don't mind paying a typically small amount for the Udemy version, you get everything from YouTube plus fully worked solutions to exercises and an official course completion certificate. As countless guests on the show have emphasized, to be the best data scientist you can be, you've got to know the underlying math. So, check out the links to my Mathematical Foundations of Machine Learning course in the show notes, or at jonkrohn.com/udemy. That's jonkrohn.com/U-D-E-M-Y.
00:44:31
Exactly. I think it's originally Wittgenstein. The concept.
Bryan McCann: 00:44:35
Exactly, exactly. It goes back to Wittgenstein. So, that's what I was noticing. I was coming over and I was seeing like, "Hey, this deep learning stuff and these word vectors are actually a great way for me to test out some of these philosophical hypotheses about meaning." I was thinking exactly that. I had Wittgenstein in mind, and-
Jon Krohn: 00:44:57
You're right, though. That quote, I think I end up misattributing it. I have many times over the years taught that it's Wittgenstein who said, "You shall know a word by the company it keeps." But I just looked it up quickly, and you're right, it's John Rupert Firth.
Bryan McCann: 00:45:12
It's similar enough to Wittgenstein and his ideas that you just had the same brain connection I did back in 2013, where it's like, "Oh, this is the same. It's basically the same." But they were quoting Firth a lot, and I was thinking a lot about Wittgenstein. And the first demo I saw was Mikolov's demo at NeurIPS in 2013 of Word2vec, king minus man-
Jon Krohn: 00:45:39
Oh, Cool.
Bryan McCann: 00:45:41
Plus this equals this. And I was like, "Whoa, this is working too."
Jon Krohn: 00:45:46
We should be able to do that quickly here on air. So, king - man + woman = queen.
Bryan McCann: 00:45:55
That's right. That's right. That's what it was. And they had a nice demo. You could go and type in different ones, so you could do dentist - teeth + heart. Then you'd get cardiologist. And like, "Oh, this is so cool." And there's something-
Jon Krohn: 00:46:05
And actually, it ties in a way to what we were talking about earlier with vector searching, where the semantic meaning of things is encoded in a high-dimensional space. And it's that same property that Tomas Mikolov, like you're describing with Word2vec, was really capturing. That was my introduction as well to this idea of meaning. It was Christopher Manning, or maybe it was Richard Socher, but I remember the way that Christopher Manning would say it. He introduced me to this idea of smearing meaning, in his Australian accent. As opposed to trying to have discrete points in a tree and using some tree ontology to represent the semantic relatedness of things, instead you allow a vector space to have meaning gently smeared across all of the thousand dimensions or whatever you have in that vector space.
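The vector arithmetic discussed above can be illustrated with a tiny, self-contained sketch. The 2-D vectors below are hand-crafted purely for illustration (real Word2vec embeddings are learned from corpora and have hundreds of dimensions), but the nearest-neighbor analogy mechanics are the same:

```python
import math

# Hand-crafted 2-D "word vectors": dimension 0 ~ royalty, dimension 1 ~ gender.
# Toy values for illustration only; real embeddings are learned from data.
vectors = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c):
    """Solve a - b + c = ?, returning the nearest word (inputs excluded)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = {w: vec for w, vec in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("king", "man", "woman"))    # -> queen
print(analogy("queen", "woman", "man"))   # -> king
```

With real pretrained embeddings, libraries like Gensim expose the same operation via `most_similar(positive=[...], negative=[...])`, which is how demos like the dentist/cardiologist example work.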
Bryan McCann: 00:47:00
But that was a controversial idea at the time. For Mikolov, Richard, Chris Manning, it was not accepted necessarily that you could cram the meaning of a sentence into a vector. That was a pretty radical idea, but it felt so right to me. And more than that, there were some early papers by Collobert and Weston as well. One of them was called NLP (Almost) from Scratch. And I got really obsessed with this idea that at the base this was working: there was some sort of structure to language, and maybe meaning, that we could encode with these models, and it was starting to look like it could actually work. But the way that most of the field still did research and the way that they were building AI was by taking a task like machine translation or sentiment analysis or question answering, whatever it is, named entity recognition. We would pick a task, a conceptually well-defined task, and then you'd architect a neural network for that task and you'd get data for that task. You'd train the model, and then you'd have a model for that task. And that just seemed so wrong to me, just from day one.
Jon Krohn: 00:48:32
You're so much smarter and more forward-looking than me, because I was like, that's how data science will always be.
Bryan McCann: 00:48:39
Oh, no.
Jon Krohn: 00:48:40
So small-minded.
Bryan McCann: 00:48:42
I was like, this is the thing that's wrong. We have this step that we're taking with deep learning for language and vision, et cetera, mostly for perception-based tasks. But why wouldn't we try to teach every model that's going to do a language-based task language? Is it not important in some sense for that model to know all of English and be as fluent as possible before it decides whether a sentence is positive or negative? Now, is it absolutely necessary in all cases? Well, if you have a big enough data set, well, maybe not. But this was also very controversial; most people believed what you just said. I got a lot of pushback. But I had so much conviction, I was like, wow.
00:49:31
But Richard and I agreed on this as well. So my first kind of researchy paper, before I was doing it professionally, was while I was in Richard's class. I met him when he was a PhD student; while I was in school he did his first startup, and then he came back as an adjunct professor to teach the deep learning for natural language processing course. So I did a class project on an all-for-one multitask and multimodal model. I was saying I want to create a single model that's going to do visual question answering, so answering questions over an image, and sentiment analysis, so classifying whether a sentence is positive or negative.
00:50:18
Why? Why did I want to do that? Because when people did visual question answering at the time, the way that you would design that model was actually picking from a list of answers, a list of possible answers to the questions. You weren't generating a sentence to say like, okay, what color are the bananas in the image? And it would say yellow. It wasn't generating a word the way that we generate with an LLM, it was picking yellow or three as a class out of a list of possible answers like it was a giant multiple choice test. Again, felt totally wrong to me because then your model's constrained to that space. It can't really do visual question answering.
00:51:02
And then the same was true for sentiment analysis. You could choose a class, positive or negative, but the words positive and negative weren't anywhere in the model. It was just a class, zero or one, and then we assigned a label positive or negative to that class. So I built a model that was saying, well, you don't get to do that. You have to have a shared vocabulary and you have to pick a word and a sequence of words that do this. It didn't work super well, but that was the idea of saying, one, tasks and the way that we've very nicely and conveniently conceptually defined them are made up. Let's stop doing that. And modalities, differences between modalities are not necessarily made up, but they could be very important for each other. And that's what they then hired me at Salesforce to do. Essentially when I joined Salesforce Research, and there were maybe four of us after Richard's startup got acquired, my whole thing was, okay, how do we make a unified model for NLP? Specifically when I was at Salesforce, I always deep down wanted to do all modalities, but there was enough work to do with NLP that I focused on that first. And for the first year, I tried, I really tried. I built a very generic architecture. I was taking summarization, question answering and translation, which were some of the hardest tasks at the time. They had a shared vocabulary and I was trying to make it work, but the data sets were really early. We didn't have a lot of the common ones, the ones people use today or even a year later. So it didn't really work.
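The contrast Bryan draws above, emitting a bare class index versus generating an answer word that lives in the model's own vocabulary, can be sketched as follows. The keyword scorer here is a hypothetical stand-in for a trained model, and the vocabulary is invented for illustration:

```python
# Two framings of sentiment analysis. In the classifier framing, the model
# emits a class index and the words "positive"/"negative" exist only as
# labels outside the model. In the generative framing, the answer is itself
# a word drawn from the shared vocabulary the model reads and writes.
VOCAB = ["great", "terrible", "movie", "positive", "negative"]

def score(sentence):
    # Hypothetical stand-in for a trained model's sentiment score.
    return sum(+1 if w == "great" else -1 if w == "terrible" else 0
               for w in sentence.split())

def classify(sentence):
    """Classifier framing: output is an opaque class index (0 or 1)."""
    return 1 if score(sentence) > 0 else 0

def generate_answer(sentence):
    """Generative framing: output is a word from the shared vocabulary."""
    word = "positive" if score(sentence) > 0 else "negative"
    assert word in VOCAB  # the answer is inside the model's vocabulary
    return word

print(classify("a great movie"))           # -> 1
print(generate_answer("a great movie"))    # -> positive
print(generate_answer("a terrible movie")) # -> negative
```

The generative framing is what lets one model with one vocabulary answer many differently shaped tasks, which is the unified-model idea Bryan describes.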
Jon Krohn: 00:52:58
It reminds me, kind of, and maybe it was around the same time, of the T5 model from Google.
Bryan McCann: 00:53:02
So this would've been before T5. This would've been 2016 actually.
Jon Krohn: 00:53:10
Right, right, right. That's quite a bit ahead.
Bryan McCann: 00:53:11
The summer of 2016. It didn't work super well. But I backed off and I said, okay, I'm going to try to establish some connection between tasks. So I'm going to show something like: if I train a model to be good on machine translation from English to German, then the part of the model that learns English should be learning something helpful for a model that wants to do question answering on English data on Wikipedia. And that was just the basic connection I wanted to establish, that transfer learning in natural language processing could work beyond just word vectors. People were using word vectors, but they weren't transferring the architecture, the models on top of word vectors. Everybody was using GloVe and Word2Vec, but the layers of the neural net, they were LSTMs at the time, weren't transferable.
Jon Krohn: 00:54:10
That was my next question, exactly, because 2016 predates transformers being used for this kind of thing. And so I was about to ask you if you used some kind of recurrent approach. LSTMs is the answer, which doesn't surprise me. That's what I thought. Yeah.
Bryan McCann: 00:54:23
So I started transferring some bidirectional LSTM layers. So you'd start with Word2Vec or GloVe, then you would train these biLSTM layers for a task like translation or summarization or whatever. And then you'd transfer that and show that you do better on the second task if you learned the first task, which was, again, for some people, counterintuitive. Why would that help? But to me, it just felt so right that, well, you're just learning more of the statistics of English from one task, like translation where we had a lot of data, relatively speaking, versus QA. So actually, you'll do better.
00:55:06
And that was my first paper, where... well, there were two accidental insights that were very important for me when I was doing that research. One was the hypothesis itself: originally I set out to do unified models, and then I backed off to transfer learning and found, oh, context. It's all about context. Word vectors are about context. Now, these contextualized word vectors are about incorporating more context. And if you want to think about cross-modality training, that's extra context. And then for years after, it was obvious to me that we needed to get really, really large context windows because you need to have longer context. So for a long time, I worked on really long context windows.
00:56:01
It was also clear to me that just GPU memory was super important because then I could fit more into context. So a lot of these things just became natural implications of this idea of context, context, context. And then the second idea, which will be really relevant to your transformer question, was I messed up an experiment. I accidentally ran an experiment for this translation model where I trained it for translating English sentences into German sentences, but I accidentally made the word vectors on the English side completely random and untrainable. So they couldn't change. They were just nonsense.
00:56:50
But the German word vectors could change, and the layers on top could change, but those word vectors that are supposed to encode semantics and smear meaning across were random. And the model did just as well as without it. It was fine. So what's going on there? Well, I started thinking, well, that's odd. I've been looking for meaning. I kind of thought there was some meaning in these word vectors. But what's actually happening is my encoder and my decoder, through this attention mechanism, are just aligning these symbols. They're aligning these word vectors so that I can do the translation problem. And so as long as some of them get to move, they can move around so that this random set is aligned and the problem can still be solved.
00:57:47
So the translation problem was really about alignment of symbols and not meaning. It wasn't translating meaning, which was disappointing for me philosophically, because I was like, damn, there's no meaning anywhere in any of this. And very interesting because then I got obsessed with this idea that could I get rid of the recurrence? And the language I used to use was I just need the alignment. It's just an alignment problem. And so when the transformer paper came out, we were trying to get rid of this recurrence for a long time. It was very slow. It didn't really make sense for GPUs. It was bad. So I had tried to make attention only models myself, thinking, oh, there's this attention mechanism and there's just this alignment problem. It's just an alignment problem.
00:58:37
So it was no surprise to me when Transformers came out and it was, oh, attention is all you need. It was like finally someone figured it out and unlocked this part of the problem. That architecture ended up being even more general than I expected, but it was something that several of us in different labs, I think, were on the hunt for. There was something fishy going on with recurrence and recurrent neural networks that didn't need to be there. Then I came back to the unified model approach. In 2017, this paper got published. It was my first published paper at NeurIPS, and Transformers came out, and then the CNN/Daily Mail dataset came out and SQuAD came out from Stanford. So then there were more canonical ways to do these hard tasks.
00:59:34
And I did my first real unified NLP paper, which was never published, very much rejected. And I gave talks on it at Google Brain and Apple and a bunch of different universities, and it really kind of split the group. I don't know, every room I walked into, it was like 50/50 or 70/30 of people saying, "This is a terrible idea." The majority was saying, "This is a bad idea. We should not be doing this," for the same reasons. In engineering, for example, if you're at Google, they were saying, "This is a bad idea. We can't be mixing all of data together and making giant models. How's it going to scale? We need to be decomposing the problem into smaller problems and then having task-specific model-specific things. And we'll always do better on a task if we do that." I don't know, it just felt wrong to me.
01:00:30
So I really wanted to teach a model language so that we could use language to describe what we wanted it to do, and it could use language to do all the things that we do with language, rather than have the model be stuck on some artificial task. And that paper was very, very similar to the T5 paper. Almost identical in concept, except the T5 paper came out 15 months later, I think, something like that. Because BERT came out 15 months after my contextualized word vectors paper, and then T5 came out 15 months after my decaNLP paper. That's how I remember it; it was always 15 months for some reason. I would have this idea, I would do this paper, and then they'd replace my LSTMs with Transformers, and then they would add way more data and way more compute, and then it would be the same, but better.
01:01:33
And honestly, at the time too, I was thinking to myself, I was in this tiny lab at Salesforce Research, like, how do I get the Googles and the Facebooks and everybody to do my research for me? So I kept trying to put out these ideas, and then everybody would get obsessed with contextualized word vectors. And then 15 months later, we'd have ELMo and versions of ELMo and then BERT. And I'd be like, oh, yes. And in the meantime, I'd go work on the unified model stuff. And then for that 15 months, people would be like, oh, maybe we should do T5, and some of these other ones, while I was working on my next thing, which came after that.
01:02:11
So yeah, that was an awesome paper. The T5 paper was, I think, the one that better showed that it was a viable direction. But the same core idea was that you should describe what you want the model to do in language. You should have a generative approach to generating the answer. It should not be picking from a fixed set of classes, although the line gets blurred, but it should be as generative as possible. And yeah, that was the second big one for me, and that's what the unified AI was really about, I suppose. And then I always wanted to incorporate vision, but NLP kept me busy enough while I was in research that I never got around to it myself.
Jon Krohn: 01:03:02
It's been super cool to hear that story. I've been on the edge of my seat this whole time. It's so cool to hear from the inside these things that I've been studying for a decade completely from the outside and just kind of seeing pieces emerge. But yeah, you're right there in the machinery causing some of this stuff to happen. So super cool to hear that from you. Another concept that really interested me from your research and is something that I don't understand. It showed up in our research, and I haven't dug into the research that we did on you enough to know what this means because I knew I could ask you on air. So this is about controllable generation. So you worked on conditional transformer language models for controllable text generation. And so it sounds like the model or the approach was called CTRL, C-T-R-L, like the control key on a keyboard and then summarization with CTRLSUM. So C-T-R-L-S-U-M. Yeah, what is that? What are these things?
Bryan McCann: 01:04:07
Yeah, so I guess to continue the story a little bit from the decaNLP paper, as I was saying, a lot of people did not pick that up and did not think that that was a good direction. But there were two groups that did. There was a small group inside Google that 15 months later did T5, and there was a group inside OpenAI that roughly eight or nine months later released GPT-2. So if you go back and you look at the GPT-2 paper, they'll cite McCann et al. 2018 on this idea that we could be using language as this very generic way to get a model to do things.
01:05:02
When they released GPT-2, it was super exciting. They had this release strategy of not sharing the full model right away. And so I was like, okay, I'm going to go make my own, but I'm going to do it slightly differently. So they had done it the way that I think most people would be familiar with now. They took a language model, they trained it on a bunch of data to predict the next word. So given a sequence of words, maybe 511 tokens, predict the 512th token. And that's how you get your language model, and that's how you generate all this text that we're generating today. More or less, you say, what's the next word? Now what's the next word? And what's the next word after that? And if you have a language model that's really, really good at predicting the next word, then it makes a lot of sense for all sorts of different situations.
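The next-word objective Bryan describes can be shown with a deliberately tiny sketch. A real LLM conditions on hundreds or thousands of prior tokens with a neural network; this toy version just counts bigrams over an invented corpus:

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    """Predict the most frequent follower of `prev`."""
    return counts[prev].most_common(1)[0][0]

def generate(start, n):
    """Repeatedly ask "what's the next word?", as Bryan describes."""
    out = [start]
    for _ in range(n):
        out.append(next_word(out[-1]))
    return " ".join(out)

print(next_word("the"))   # -> cat ("cat" follows "the" most often)
print(generate("the", 3))
```

Swapping the count table for a transformer that scores every word in the vocabulary given the full preceding context is, at this level of abstraction, the step from this sketch to GPT-2.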
01:06:02
But at the time, GPT-2 wasn't that good. It was not GPT-4, it was not GPT-3. It was sometimes generating some sentences that seemed like maybe the right style. People liked that. It could generate a sentence or two, maybe a paragraph of coherent text, before repeating itself. And people started immediately anthropomorphizing these models, like, oh my gosh, it's conscious, it's a person, whatever. And for me, this is all explaining why I took this approach of trying to make the language model very controllable, from an almost moral, ethical perspective. People started doing things like, well, if I type something into a language model and then it generates a bunch of nonsense that's bad and stuff, is that my responsibility?
01:07:08
People started asking these kind of questions, right? Well, if I put this false information on the web, but I didn't write it, the language model wrote it, do I have any real accountability? I didn't really like that direction. As the models were getting better and better, I was like, ah, I'd prefer if they were more controllable. So that one, the accountability is there and we could do what we wanted more. It was hard to get GPT-2, if you wanted to write a poem in the style of a sonnet, in the style of Shakespeare, you'd have to start writing the sonnet in the style of Shakespeare, and then the language model would pick up on that and keep going. But you couldn't say, "Write me a sonnet in the style of Shakespeare about X."
Jon Krohn: 01:07:55
One-shot or often multi-shot prompting to get that to work.
Bryan McCann: 01:07:59
Yeah, exactly. So I just trained it slightly differently. I gave it the source, a little URL: every time I got a document from the internet to train this language model, I put the source or the URL at the start of the sequence. And so every piece of text that the language model learned to generate was always conditioned, so that's why it's conditional generation, on the source and the URL. The nice effect this had was, let's say you take a URL like CNN.com/politics/thepresidentwenttoGermanytoday, and then there's an article associated with that. Well, at inference time, if you've trained it with this structure, then instead of trying to find a clever way to start writing the news article or to prompt the LLM to get what you want, you just say CNN.com/sports/, the name of the article you want, /the date, and then it would write that article.
01:09:14
So it was a much more reliable way of generating it. We were also able to do source attribution through the language model, whereas GPT-2 couldn't do that. You could give it a sentence and say, "Tell me which source this was from by looking at the conditional probabilities of the different sources." That was useful, and you could play with these parameters in a way that felt more like knobs and dials instead of just alchemy with text. And a lot of that's still used for different tasks that have structure, like generating proteins, where you really need to condition on the function and the family of the protein. Whenever you have this really conditional structure, it's useful to have these control codes, as we called them. It gives you just a little bit of an extra constraint on the output.
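A toy recreation of the control-code idea: prepend a source token to every training sequence, key the statistics on that code, and generation becomes steerable by choosing the code at inference time. The codes and mini-corpora here are invented, and bigram counts stand in for CTRL's transformer:

```python
from collections import Counter, defaultdict

# Invented training data: each "document" is tagged with its source code.
training = {
    "<sports>":  ["our team won today"],
    "<finance>": ["markets fell sharply today"],
}

# Bigram counts conditioned on the control code, so each source keeps
# its own next-word statistics.
counts = defaultdict(Counter)
for code, docs in training.items():
    for doc in docs:
        tokens = [code] + doc.split()  # the code conditions the whole sequence
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[(code, prev)][nxt] += 1

def generate(code, n=4):
    """Generate n words in the style of the chosen source code."""
    out = [code]
    for _ in range(n):
        out.append(counts[(code, out[-1])].most_common(1)[0][0])
    return " ".join(out[1:])  # drop the control code itself

print(generate("<sports>"))   # -> our team won today
print(generate("<finance>"))  # -> markets fell sharply today
```

CTRL itself used URLs and domain tags as the codes; the same conditioning is also what enables the source attribution Bryan mentions, by comparing how probable a given text is under each code.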
Jon Krohn: 01:10:14
Very cool. And it ties into maybe my last really technical question that I'm going to get into here based on your research, which is protein generation, which you just mentioned there. So in addition to generating text, you've repurposed language models for protein generation, which makes sense to me as I have a biology background, a neuroscience background. And so I'm aware, though maybe not all of our listeners are, that the proteins in your body do all the functional stuff for you. Every imaginable thing that your body can do happens because of, well, except for some relatively small exceptions, but generally speaking, proteins are doing all of the work. And proteins are a sequence, a one-dimensional sequence, just like a character string, made up of these things called amino acids. Each amino acid has slightly different properties, but basically you create this chain of amino acids, you could think of them as letters of the alphabet, and they allow you to create the vast, incredible amount of functional capability that our bodies have: the proteins that allow your eye to see versus your liver to detoxify alcohol, your skin to do all of the things your skin does. I obviously could go on with examples for hours.
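The one-dimensional-sequence framing maps directly onto code: the 20 standard amino acids each have a one-letter code, so a protein's primary structure is literally a string over a 20-letter alphabet. The validator below is a hypothetical helper written for illustration:

```python
# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def is_valid_protein(seq):
    """Check that a sequence uses only standard amino-acid letters."""
    return len(seq) > 0 and set(seq) <= AMINO_ACIDS

# The first residues of human insulin's B chain, a well-known sequence.
insulin_b_start = "FVNQHLCGSHLV"
print(is_valid_protein(insulin_b_start))  # -> True
print(is_valid_protein("NOT A PROTEIN"))  # -> False (space and "O" are not codes)
```

Treating proteins as strings like this is exactly what lets a language model such as ProGen train directly on large databases of sequences, as discussed next.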
Bryan McCann: 01:11:30
There's a lot of biology out there.
Jon Krohn: 01:11:30
There's a lot of things that our body does, and all of it's just encoded by these one-dimensional sequences of a relatively small number of amino acids, 20-something in humans. So something that's interesting. You have this connection to Moon Hub that we talked about earlier. And this is just a stab in the dark. I don't know what your answer's going to be here, but about a year ago I went to Berlin and I interviewed Ingmar Schuster, who has a startup. They're in the business of doing this. They're in the business of [inaudible 01:12:17]. Yeah, exactly, excellent. And when I was there with Ingmar, he mentioned your co-founder, Richard Socher, having recently been there at the Merantix AI Campus in Berlin. So I don't know, just seems like a connection there in some way, which could be spurious.
Bryan McCann: 01:12:36
Yeah, I'll have to meet him one day. My connection to the protein world has evolved primarily through my co-author on the ProGen paper from the Salesforce days. As you said, we took the CTRL model, trained it for protein generation, and called it ProGen. We were thinking exactly the way you were thinking: there are way more sequences of proteins than secondary and tertiary structures, which are expensive to generate. So what if we could make a model that just depends on the sequences? That spun out into a startup called Profluent. The CEO there, his name's Ali Madani, great guy. They've been doing great work. Because we had shown that you could generate these proteins and you could synthesize them in a wet lab. You could get proteins that did not appear in nature, but had better fitness, lower energy, so they were better overall, better at the tasks that they were designed for. And that became Profluent.
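The sequence framing Bryan and Jon describe can be sketched in a few lines: a protein is a string over the 20-letter amino-acid alphabet, and ProGen-style conditioning prepends control tags for things like family and function, exactly as CTRL prefixed text with control codes. The tag names below are hypothetical, for illustration only, not ProGen's real tag vocabulary.

```python
# The 20 standard amino acids, each written as a one-letter code, form
# the "alphabet" of a protein language model.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode(sequence, tags):
    """Sketch of ProGen-style input: control tags (protein family,
    function) are prepended to the amino-acid sequence, so generation
    can be conditioned on them just like CTRL's control codes."""
    assert all(aa in AMINO_ACIDS for aa in sequence), "non-standard residue"
    return list(tags) + list(sequence)

# Illustrative tag names; a real model would use its own tag vocabulary.
tokens = encode("MKTAYIAK", tags=["<family:lysozyme>", "<fn:hydrolase>"])
print(tokens[:3])  # control tokens come first, then one token per residue
```

From the model's point of view this is just another conditioned sequence-generation problem, which is why a text-trained architecture transferred so naturally.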
01:13:56
I'm still connected to that world. Not through Ingmar directly, but I'd love to talk to him. Maybe we've met and we could re-meet. But to generalize a little bit, I keep pushing a lot of these themes: deep learning versus machine learning, getting out of the way. And for me, that move is getting out of the way of the algorithms as much as possible. So instead of designing features, don't. Just make them parameters. And then we got out of the way with transformers. Instead of having this recurrence and our conceptual biases, let's just have an architecture that more or less just does matrix multiplications and then allows for sharing of information and context. Context, context, context, keep adding context: larger context windows, context vectors, whatever it is. Unify as much as possible, because whether it's vision and language or just different parts of language like code, the fact that code helps with logical tasks in language models, helps you do better on LSAT questions. It's like, oh, that's interesting.
01:15:14
Literally taking CTRL, which was a model trained on English, and then using that to train on proteins gave a much more stable training curve and a faster learning curve than training from scratch. That's odd. What does English have to do with the sequence of amino acids? Well, there's something general enough about learning how to do alignment and sequence generation, some similarity at the core of it all, and I think we need to keep pushing all of this into the natural sciences more and more. So biology, chemistry, physics. I don't know if I've said this before, at least publicly or in a recording, but I feel the same way I felt in 2013: the deep learning transition was good, but our imposition of conceptual tasks on how we were doing AI was bad. And so we need to move towards more unified stuff.
01:16:28
I feel the same way about science. There's something about the way that we've been doing science. This is a little bit constrained by our perceptions and our projections onto the world that perhaps AI, broadly speaking, some sort of computational algorithmic approach could unlock for us. And it might feel very similar. It might feel at first that it's less explainable. People always go back and say, "Well, the move from machine learning to deep learning was less explainable. Mixing all the data, all that's less explainable. Oh, we can't explain what's really in a word vector anymore." I think there's an opportunity to go after something really, really fundamental about our understanding of the universe by getting out of the way and just giving as much context to these systems as possible. I keep my topology book with me and a couple of different [inaudible 01:17:41]. I feel like there's something missing that we probably can't figure out, but maybe AI can for us, and we might not explain it in our current terms, but it'll be a much better predictor of how things work, and we'll find use cases for that.
Jon Krohn: 01:17:59
It makes perfect sense to me. I think you're spot on 100%. I mean, to try to make this maybe a little bit more concrete or explain it in a slightly different way...
Bryan McCann: 01:18:08
Please.
Jon Krohn: 01:18:09
... when we go to university, you studied philosophy and computer science in separate departments, and those separate departments cover a standard curriculum that has evolved over time as this is what's important in philosophy. This is what's important in computer science. You're going to learn algorithms and data structures. Everyone's going to do it over here. But those constraints of saying, "This stuff, philosophy stuff belongs over here in this building with these people, and computer science belongs over here," some people like you study philosophy and computer science, and in some way your mind might be able to then make connections between them and have interesting ideas about semantic meaning and how natural language models could work or unified models could work. But we can only get exposed to so many different things as humans. An AI system can scale way, way, way, way more than us. And it can not just be learning philosophy and computer science, but it can be learning every subject and putting all subjects into a high-dimensional vector representation.
Bryan McCann: 01:19:16
Or a context window.
Jon Krohn: 01:19:18
Right. And somehow in ways that we might not be able to understand it'll be able to make predictions or assimilate ideas across all knowledge in a way that a human never could.
Bryan McCann: 01:19:31
I think so. Yeah. I'm looking forward to the next few decades of science as we learn how to incorporate these tools more and more and maybe our fundamental understanding of the universe will change and we won't necessarily run into some of the problems we have with it now.
Jon Krohn: 01:19:55
And so then maybe this is my last question, it could be potentially a doozy, but you recently wrote a blog post in August called Searching for Meaning in the Age of AI, and in it you forewarned that humanity is at the beginning of an existential crisis. So do you want to explain more about that blog post and maybe just more generally provide us with some insight into where you think this is going? We got a sense right there from your science perspective, which I think is spot on, but what are the implications for humanity in that world where there are superintelligent systems? The episode that I released today of the podcast is on Dario Amodei's 15,000-word blog post, really techno-optimistic about powerful AI. Yeah, I mean, so do you agree that we'll be going towards this potentially utopian society where AI's solving a century's worth of science problems in a decade like Dario describes? How do you see things playing out?
Bryan McCann: 01:21:08
I think I'm a techno-optimist here too. I'm generally very optimistic. I have an even earlier blog post on a similar theme from 2016, maybe some earlier ones too. I was reading The Life of the Mind by Hannah Arendt, and she's talking about Hegel and thinking and all these other things. And I wrote something like, "Our future with AI is going to destroy mankind." I didn't mean it in a Terminator kind of way, but more so because so much of our focus is on truth and knowledge and science, and we may very well be obviated in that process, as you were just saying. The algorithms might be better at all of that than us, but so much of our identity as mankind, or Hegel's world spirit, or a zeitgeist, as some people will use that word, is related to that right now. We've built so much of who we are and our sense of what humanity is on this pursuit of truth, pursuit of knowledge, et cetera, et cetera.
01:22:36
And so I was writing about this in terms of time horizons. At the time, in 2013, 2014, the short term was just, well, we've got to figure out how to make this stuff work, it's all wrong. The middle term is going to be really, really hard because our sense of identity will be challenged more than ever before: from chess to Go to now language, meaning, potentially all of knowledge and science, AI will be better at than us. So in the middle, certainly jobs will be lost. People will have to be transitioned. They'll have to be re-skilled. It's going to happen really, really quickly, and that's going to be hard. The middle term is going to be hard. And then the long term, I was thinking, will probably be the best of all times to be alive.
01:23:37
You could argue that that's already the case, but I think it will definitely be true, especially in the long term, because if you think about us free from that pursuit, in some sense, then we can focus on all of the other things that maybe we are still uniquely special at. And I don't really know what those things are, but as we get pushed out of certain domains, we'll find out what humanity is really special for, if anything. And if nothing, then great. We can just focus on beauty and wonder and admiration and see whether that's stuff that separates us from machines or not.
Jon Krohn: 01:24:22
I love it, man.
Bryan McCann: 01:24:23
Didn't think we were uniquely intelligent, but we may very well be uniquely privileged in the way we perceive the world and can enjoy it by creating meaning.
Jon Krohn: 01:24:33
Right, right, right. I love that. It's beautiful. I couldn't agree more, and I think that's a really great point to end on. But before I let my guest go, and you've been really generous with your time, we've run over on recording time, but quickly, if you have some book recommendations for us, our audience would love that.
Bryan McCann: 01:24:54
Well, from the engineering side, as I was being a CTO and starting on my startup journey and things like that, I really liked, I think it's called An Elegant Puzzle. I like that one as a conversation starter. It's not how I necessarily did things, but it's one I usually have all my people read as they become managers and staff engineers. Pick up any book on topology if you want to try to figure out what's going on in the universe. I think there's something interesting there that we don't quite understand yet, but that might be the new linear algebra in five years or 10 years.
Jon Krohn: 01:25:35
Are you talking about donuts?
Bryan McCann: 01:25:38
Kind of, general topology, and just seeing the world in terms of stuff and functions. I think that's going to be a more important way of seeing things in the future. And I don't know if you want to read along with me, I have a book club and we're reading Moby Dick.
Jon Krohn: 01:25:59
Wow.
Bryan McCann: 01:25:59
So do that one.
Jon Krohn: 01:26:00
That's cool. How do people follow along with the book club?
Bryan McCann: 01:26:05
You can reach out to me if you want.
Jon Krohn: 01:26:11
Just an email?
Bryan McCann: 01:26:12
Yeah. You can email me if you want. You can email me at b@You.com, B at Y-O-U dot com, and we can chat about Moby Dick. Our last couple of books were Ulysses by James Joyce and Middlemarch by George Eliot. One of the fun things that I did with the Ulysses one is, Ulysses takes place in Dublin. It's a 700-page book, and it happens all in one day. That day is called Bloomsday in Ireland, in Dublin. It's June 16th, and so after the pandemic, over 18 months, we'd read a chapter a month, and then we all went to Dublin on Bloomsday and met there in person. Some people I'd never met before and some friends from college. So yeah, don't forget to get your daily dose of literature and fiction to keep your mind nimble and thinking about what's not real yet.
Jon Krohn: 01:27:07
I love that. And this is also a perfect segue to my final question, which is how people should follow you. So you already kindly provided us with your email address. Are there social media platforms or something like that that we can follow you for your thoughts after this episode?
Bryan McCann: 01:27:24
I'm most active right now on LinkedIn, so you can follow me and connect with me there. I do have a Twitter, but really the easiest way is just to go to my website. You can find all the links. It's called, it's just bryanmccann.org, so B-R-Y-A-N M-C-C-A-N-N dot O-R-G. You'll see my links, you'll see my blog posts, you'll see my poetry and my paintings, and all the things that we can potentially connect on and make a meaningful connection on.
Jon Krohn: 01:27:57
Very nice. Yeah, we'll have that website link in the show notes, for sure. Bryan, thank you so much for taking the time. This has been a fascinating episode. I've loved every second of it. I really appreciate you taking the time.
Bryan McCann: 01:28:09
Likewise. Thank you so much for having me. Really great.
Jon Krohn: 01:28:18
Amazing episode. In it, Bryan filled us in on how natural language processing was revolutionized by the idea that words derive meaning from their context, leading to innovations like word embeddings and transformers. He also talked about how early unified AI models showed that training on one language task, like translation, could improve performance on other tasks, like question answering, by learning deeper language understanding. He talked about how You.com distinguishes itself from other information retrieval AI companies like Google and Perplexity by focusing on complex automated workflows and enterprise solutions rather than competing as a search engine. He talked about how language models trained on English text proved surprisingly effective at generating novel protein sequences, suggesting fundamental similarities in how different types of sequences are structured. And he talked about how, while AI may surpass humans at knowledge and science tasks, this could be a good thing: it could free humanity to focus on other unique qualities we may have, like beauty, wonder, and finding meaning.
01:29:24
All right. As always, you can get all the show notes including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Bryan's social media profiles, as well as my own at superdatascience.com/835. Beyond social media, another way we can interact is coming up on December 4th when I'll be hosting a virtual half-day conference on Agentic AI. Very hot topic, don't want to miss it. It'll be an interactive and practical session and it'll feature some of the most influential people in the development of AI agents as speakers. It'll be live in the O'Reilly platform, which many employers and universities provide access to. If you don't already have access, however, you can grab a free 30-day trial of O'Reilly using our special code SDSPOD23. We've got a link to that code available for you in the show notes.
01:30:22
All right, thanks to everyone on the Super Data Science Podcast team for producing yet another extraordinary episode for you today, for all of us today. For enabling that super team to create this free podcast for you, I am so very grateful to our sponsors. You can support the show by checking out our sponsors' links, which are in the show notes, and if you yourself are interested in sponsoring an episode, you can get the details on how to do that by heading to jonkrohn.com/podcast. All right. Share this episode with people who'd love to hear about Bryan's amazing thoughts. Review the episode on your favorite podcasting platform. I think that helps us out. Subscribe, of course, so that you don't miss updates, if you're not already a subscriber. But most importantly, I just hope you'll keep on tuning in.
01:31:11 I'm so grateful to have you listening and hope I can continue to make episodes you love for years and years to come. Until next time, keep on rocking it out there, and I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.