98 minutes
SDS 723: Mathematical Optimization, with Jerry Yurchisin
Subscribe on Website, Apple Podcasts, Spotify, Stitcher Radio or TuneIn
In this week’s episode, host Jon Krohn speaks to Jerry (Jerome) Yurchisin, who explains mathematical optimization with case examples that highlight its differences from statistical or machine learning approaches. Jerry also gives his top recommendations for anyone who wants to get started with mathematical optimization, whatever your preferred programming language may be.
Thanks to our Sponsors:
Interested in sponsoring a Super Data Science Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
About Jerry Yurchisin
Jerry Yurchisin (who also goes by Jerome), has over a decade of experience in operations research, data science, and visualization, and specializes in enhancing decision-making. Before joining Gurobi, Jerry worked in consulting (OnLocation, Inc. & Booz Allen Hamilton), supporting numerous projects by building and customizing mathematical optimization models and leveraging machine learning, applied statistics, and simulation—all to support decision-making through data-driven narratives. Jerry also has a background in college-level mathematics instruction and has experience in career management from his time at Booz Allen Hamilton. Now, at Gurobi, Jerry aims to promote the integration of mathematical optimization into the data science and broader AI communities.
Overview
Mathematical optimization is closely related to data science in that it solves similar problems, but it can offer data scientists a much broader toolkit. This isn’t just hearsay: 80% of America’s leading enterprises use Gurobi, and if you’re wondering why you haven’t yet heard of this ubiquitous decision-making technology, Jerry Yurchisin describes mathematical optimization as the engine of a car, saying, “it is a piece of a car, but you’re not going to get anywhere without an engine”. Jerry’s work at the company involves introducing these additional tools to data scientists and the broader AI community, dispelling misconceptions about the technology and encouraging practitioners to make business decisions confidently and accurately.
One way that mathematical optimization helps its users make such decisions is by first ‘translating’ a business problem into decision variables. These range from continuous variables (quantities that can take any non-negative value), to integer variables (e.g., shipment counts), to binary variables (yes/no choices, such as whether to open a warehouse); a model that combines continuous variables with integer or binary ones is called ‘mixed integer’. The next step is to place constraints around these variables; in the business world, this might mean budget, resource or regulatory requirements. This, Jerry says, grounds the problem in reality and ensures that the results will be actionable. The final step is to add known parameters such as costs and shipment times and combine them into an objective function, which establishes the aim of the exercise, whether that is lowering production costs or ensuring shipments always reach their destination on time. Processes such as these help companies make decisions about profits, costs and productivity rates with confidence.
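The three steps above can be sketched in a few lines of Python. This is a deliberately tiny, invented shipping example solved by brute force purely for illustration; real problems of any size need a solver such as Gurobi rather than enumeration:

```python
from itertools import product

# Known parameters (invented for illustration): cost per truck on each route
cost = {"A": 4, "B": 7}   # dollars per truck sent on route A / route B
demand = 5                # total truckloads that must be delivered
max_trucks = 4            # at most 4 trucks available per route

best = None
# Decision variables: integer number of trucks on each route (0..max_trucks)
for a, b in product(range(max_trucks + 1), repeat=2):
    # Constraint: meet demand exactly
    if a + b != demand:
        continue
    # Objective: minimize total cost
    total = cost["A"] * a + cost["B"] * b
    if best is None or total < best[0]:
        best = (total, a, b)

print(best)  # cheapest feasible plan: (23, 4, 1)
```

Swapping the brute-force loop for a solver is what makes the same three-step recipe scale to thousands of variables.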
Jerry also gives Jon insight into how to use programming tools like Python and R to call the Gurobi solver, which he says greatly simplifies the process of solving business problems. He reassures listeners that, while algebra could be considered a daunting step in the process, Gurobi aims to lower the barrier to entry with a library of Jupyter notebooks that explain notations clearly, as well as webinars on getting started. Jerry emphasizes that mathematics is a universal language and wants more people to be aware of this power, ensuring data scientists always have a seat at the decision-making table.
Finally, Jerry reminisces about his time as Senior Mathematician and Data Scientist at consulting firm Booz Allen Hamilton, a field he first entered through teaching math. At the firm, Jerry became part of a team that built a model to simulate improvements in planning, personnel and resource gathering. He also details his work with the Coast Guard, modeling how weather phenomena might affect patrol vessels in affected areas.
Listen to the episode to hear Jerry explain essential terms in mathematical optimization, such as integer linear programming, when to use mathematical optimization versus ML, and how to predict player injuries in sports.
In this episode you will learn:
- What mathematical optimization is [04:27]
- How Gurobi solver works [29:01]
- How to use Gurobi with Python [36:08]
- Coding and algebra resources [41:14]
- When to use mathematical optimization and machine learning together [54:23]
- Using mathematical optimization in natural language processing [1:01:00]
Items mentioned in this podcast:
- This episode is brought to you by Zerve
- This episode is brought to you by ODSC West, San Francisco (Oct 30 - Nov 2) - SDS special code for 15% off your pass: SUPER
- This episode is brought to you by CloudWolf (30% membership discount included)
- Gurobi Optimization
- Gurobi's public GitHub projects
- Combining Machine Learning and Optimization Modeling in Fantasy Basketball
- Jupyter notebook modeling examples
- Gurobi resource center
- Gurobi events and webinars
- Model Building in Mathematical Programming by H. Paul Williams
- Zero: The Biography of a Dangerous Idea by Charles Seife
Follow Jerry:
Podcast Transcript
Jon Krohn: 00:00:00
This is episode number 723 with Jerry Yurchisin, Data Science Strategist at Gurobi. Today's episode is brought to you by the Zerve data science dev environment, by ODSC, the Open Data Science Conference, and by CloudWolf, the Cloud Skills platform.
00:00:21
Welcome to the Super Data Science Podcast, the most listened-to podcast in the data science industry. Each week, we bring you inspiring people and ideas to help you build a successful career in data science. I'm your host, Jon Krohn. Thanks for joining me today. And now let's make the complex simple.
00:00:52
Welcome back to the Super Data Science Podcast. Today's episode is a special one because it provided me with a powerful tool for solving data science problems that I personally hadn't encountered before, and that's mathematical optimization. Our guide for this journey is the optimization guru, Jerry Yurchisin. Jerry works as a Data Science Strategist at Gurobi Optimization, a leading decision intelligence company that provides mathematical optimization solutions to the likes of Uber, Air France and the National Football League. Previously, he spent eight years as a mathematical consultant where he paired mathematical optimization with machine learning, statistics and simulation to inform decision-making. He was also previously an instructor at the University of North Carolina at Chapel Hill where he obtained his Master's in Operations Research and Statistics. He holds an additional Master's in Applied Math from Ohio University.
00:01:41
Today's episode will probably appeal most to hands-on data science practitioners such as data scientists and machine learning engineers. In this episode, Jerry details what mathematical optimization is and how it works. He provides lots of specific real-world examples where mathematical optimization is a better choice than a statistical or machine learning approach, and he provides his recommended resources for getting started with mathematical optimization in Python or whatever your preferred programming language is and how to get started on that today. All right, you ready for this wicked episode? Let's go.
00:02:18
Jerry, welcome to the Super Data Science Podcast. Awesome to have you on the show. Where are you calling in from today?
Jerry Yurchisin: 00:02:24
I'm in Vienna, Virginia, which is just outside of DC.
Jon Krohn: 00:02:28
All right. That explains a lot of the government work you've done in the past.
Jerry Yurchisin: 00:02:34
Yep, definitely. I think it's a requirement to get into the area. If you don't, then they kick you out. I'm joking. But I've definitely run into a lot of other people who've worked in the same space.
Jon Krohn: 00:02:48
Yeah, it can't hurt. Are you in Vienna there? That's not a town that I've personally been to. Does it get jammed up like a lot of the traffic does around DC and Virginia?
Jerry Yurchisin: 00:02:57
Oh, yeah. It's absolutely horrible, because not only is it close to DC, but there are a lot of offices in McLean, a lot of headquarters there, and where I happen to live now is at an intersection of two major streets. So I have a two-year-old son, and we just go and sit outside in the morning before I take him to school, and he laughs at all the cars and points. I laugh that since I get to work from home, I don't need to deal with that commute. So it works out well for both of us.
Jon Krohn: 00:03:27
That works out perfectly for sure. So we met at ODSC, the Open Data Science Conference West a year ago. It's always around Halloween. So I guess circa Halloween 2022. And if I remember correctly, I met you in a line to get drinks.
Jerry Yurchisin: 00:03:51
Yeah. Holding on to tickets and grabbing more as [inaudible 00:03:56]. Yeah, it was a great conversation, and we were able to talk about what I do and Gurobi, the company I work for now, and how we're different from the rest of the exhibitors there. We're not really a data science company, but we do a lot that helps with problem solving. So we were able to have a nice conversation about how that fits into the data science space.
Jon Krohn: 00:04:26
Exactly. So I'm super excited to dig into this today because I only have the vaguest understanding of mathematical optimization, the kind of work that Gurobi does. It is very closely related to what we typically call data science and it's solving similar kinds of problems. And so I think this is going to be a really mind-expanding episode for people to get oriented to this completely other way of solving problems with data that could be really useful, tackling problems that maybe they've encountered before and they've been like, "Damn, I wish I could do something here." And so mathematical optimization could provide another tool in their belt for that.
Jerry Yurchisin: 00:05:11
And that's the way that we describe it: if you think about data science, and machine learning in particular, that's a hammer that can do a lot of things. Hammers are pretty useful, but sometimes you just need a screwdriver, or you need a saw to be able to build a house or to solve whatever problem. You need different tools. As a data scientist, or a former practicing one (I'm not doing it as much now), I can say machine learning is great for a lot of stuff, but it's not great for other problems, and sometimes you end up using the wrong tool for the job. From a business perspective, you could be leaving money or efficiency on the table. It's great to have a wide skillset, a wide knowledge base of ways to approach problems. And that's what my role is at Gurobi; it comes from a big, wide background of projects and education, all that stuff.
Jon Krohn: 00:06:17
Yeah. It makes perfect sense. And so your title is Data Science Strategist. So what does that mean?
Jerry Yurchisin: 00:06:23
Yeah. So at Gurobi my role is essentially to lead our strategy for taking mathematical optimization and introducing it to data scientists and to the broader AI community, who may not be aware of mathematical optimization or who may have some negative feelings toward it from what they've heard or from prior experiences if they've dabbled in it. There are misconceptions out there; there's old news that just gets thrown around. My job is to help lead our optimization company into the data science space. So I'm in charge of figuring out what topics to cover via webinars and trainings, and I work with the rest of my team to develop content, sales strategies, and all these sorts of ways to help get optimization into the hands of data scientists.
Jon Krohn: 00:07:21
Perfect. So let's get into that. We have tons of questions for you here. We're going to speak generally about mathematical optimization, get into lots of specifics later on. To start big picture, how does Gurobi solve complex problems with a mathematical solver? So what does that mean at a high level?
Jerry Yurchisin: 00:07:44
So the key difference between machine learning and mathematical optimization in particular is that machine learning is great at thinking, in a sense, and optimization is what you need to act. It's more of a decision tool. It helps you work through those very complicated, very monstrous decision problems, as opposed to predicting things. So we often fall under the category of prescriptive analytics more than anything else. That's our main label, I guess, and the way that we talk about ourselves. We also talk about how we fall under decision intelligence as well, because that's the new buzz term for a lot of decision making in the business world. So that's where we fall in, and that's my immediate contrast with machine learning. That's one way that we differ, although there is definitely a lot of overlap, and I'll probably dive into that a little bit more later.
Jon Krohn: 00:08:59
Yeah, for sure. And there are interesting things here. There are some areas of machine learning where people are trying to use machine learning to make decisions, like reinforcement learning problems, for example, framed as an agent taking actions in an environment. It requires a lot of development work to even create an environment for a reinforcement learning agent like that to explore. And it isn't the right way of modeling decision-making for a huge array of problems that you can probably solve with your mathematical solvers. So as an example, you have terms like linear programming, LP, and mixed integer linear programming, MILP. These sound like pretty standard mathematical optimization terms, but in data science, I don't know what these terms mean.
Jerry Yurchisin: 00:10:03
Yeah. I'll explain what a linear programming model is by relating words or phrases in English to some math, and then to code that uses our solver. So we have linear programming models and also MILPs, which we shorten by getting rid of the L; if you want to be hip to the lingo, we just say MIP. Essentially, you need to be able to take a business problem, a decision problem, translate it into math, and then translate that math into code. A linear programming model starts with some basic building blocks. First is what we call decision variables. These are the actual decisions you'd be making if you were to follow the prescription that the model spits out at the end. It could be things like the number of products to make of a certain type, or the number of this type of product to ship from one location to another.
00:11:26
And then there are also things that are a little bit more complicated, like: do I want to open this warehouse? Do I want to create this shipping route? Do I want to offer this new product line? The quantity decisions I talked about a few seconds ago are called continuous decision variables: any number is possible from zero up (we always treat our decision variables as non-negative numbers). Those other decisions, like do I want to open this facility or warehouse, or do I want to take this route, are binary decisions: on/off switches, yes/nos, zero/one. And then we also have decision variables that are integer. Say you're building airplanes. You can't really build a third of an airplane, so it needs to be three, four, five, six, and up. You need that integrality there. Linear programming is when your variables are purely continuous, mixed integer programming is when you have a mixture, and IP is when the variables are purely integer.
00:12:43
That's part of the building blocks. The next is taking those decision variables and formulating constraints. If you're thinking about the shipping problem I was just talking about, maybe you have a limited number of trucks that you can use. That's a constraint. Or you have a limited budget that you can spend on travel or other things; that implies a constraint. You can't send 50 trucks to do something if you only have 10. Adding in those constraints makes the business problem and the modeling more realistic, because it actually guarantees that these things are met. And then the last building block is an objective function. You take those decision variables again and add in some parameters, like costs or shipping times or things of that nature, just the data of the problem, and formulate this objective.
00:13:45
So let's say I want to do all my shipping at minimal cost. Or maybe you want to integrate the revenue you would expect to earn at certain stores given a certain line of products, which sounds like something you can definitely predict (foreshadowing). You take those parameters and multiply them by your decisions, and that gives you one function that says, "Okay, this is going to be my profit, or my revenue, or my costs, or my time to do stuff." And then you want to either maximize or minimize that. So you want to maximize your profit, minimize your costs, maximize efficiency, or all sorts of things like that. It doesn't always have to be based on money, but that's just typically how businesses work. You can have any type of function, and you're trying to either push it up as high as you can or bring it down as low as you possibly can. So the gist of my spiel is: you take a business problem that someone describes, like, "Okay, I want to minimize my costs, and these are the things that I need to do, and here are my constraints, and here's my end goal." You take all of that, and you write some math about it in an algebraic form, which isn't always necessary, but is highly, highly encouraged because it really helps with the transition to the code.
00:15:25
So then you code it all up. You have this awesome Python script or something like that, and for us at Gurobi, there are a bunch of different ways that you can interact with us. Once you get through all that, that's actually when you get to using Gurobi. We are essentially the library that solves the problem. Once you have this problem in a mathematical form and then in a code form, you call Gurobi to do the really, really difficult work of actually finding that optimal solution. So that's how the whole process works, and Gurobi comes in at the very end, where you fire it up to actually solve the problem, because the algorithms are the special sauce, I guess. Actually spitting out the optimal solution is really, really hard to do. If anyone out there is great with complexity theory and things like that, you can look up just exactly where integer programming falls under that, and it's super hard to solve.
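The words-to-algebra translation Jerry describes might look like this for a toy shipping problem, where the symbols \(c_i\) (cost per truck on route \(i\)), \(d\) (demand), and \(u_i\) (truck limit on route \(i\)) are illustrative placeholders for the problem's data:

```latex
\begin{align*}
\min_{x} \quad & \sum_{i} c_i x_i
  && \text{objective: total shipping cost} \\
\text{s.t.} \quad & \sum_{i} x_i \ge d
  && \text{constraint: meet demand } d \\
& x_i \le u_i \quad \forall i
  && \text{constraint: at most } u_i \text{ trucks on route } i \\
& x_i \in \mathbb{Z}_{\ge 0} \quad \forall i
  && \text{decision variables: non-negative integers}
\end{align*}
```

Because every \(x_i\) is integer, this is an IP; relax the integrality and it becomes an LP.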
Jon Krohn: 00:16:35
Tired of hearing about citizen data scientists? Fed up with enterprise salespeople peddling overpriced, mouse-driven data science tools? Have you ever wanted an application that lets you explore your data with code without having to switch environments from notebooks to IDEs? Want to use the cloud without having to become a Dev/Ops engineer? Now you can have it all with Zerve, the world’s first data science development environment that was designed for coders to do visual, interactive data analysis AND produce production stable code. Start building in Zerve for free at zerve.ai. That’s z-e-r-v-e dot a-i.
00:17:13
Interesting. Okay. So let me try to say back to you what I've just learned. So with mathematical optimization techniques, three of the main kind of categories are linear programming, mixed integer linear programming and integer programming. And so with that first one, with LP, we have some continuous outcome that we're predicting like dollars. And then with mixed integer programming, MIP or MILP, a model I'd like to program.
Jerry Yurchisin: 00:17:52
I'm going to use that. I'm definitely going to use that. That is great. [inaudible 00:17:58]
Jon Krohn: 00:18:00
So MILPs are for where you have mixed outcomes, so continuous and integer outcomes. And then integer programming, IP, is when you have a discrete outcome only. You gave the example of the number of planes, but it applies anytime there's a discrete variable you're deciding on. So that gives you these different categories. And then within your individual problem, you need to define your constraints, because we are talking about real business problems. And in that case, there are going to be some real-world constraints, like how much of the product you possibly could make. You gave a perfect, easy-to-understand example there: obviously you can't send 50 trucks to optimize cost or revenue when you only have 10 trucks to work with.
00:18:55
And then you also have to have some objective. And that's interesting because the objective is something that we do have in machine learning as well. So we often have a cost function that we're trying to minimize with approaches like supervised learning and unsupervised learning. But in this case, the objective, it isn't that kind of... When it's a supervised learning problem or an unsupervised learning problem, we have that cost function. It's an abstract quantity.
Jerry Yurchisin: 00:19:27
Exactly, yeah.
Jon Krohn: 00:19:29
You're trying to get at that cost as close to zero as possible. But in this case, the objective, if you're trying to minimize cost that isn't some arbitrary... It's not in some arbitrary units that you're trying to get towards zero, that is dollars, right?
Jerry Yurchisin: 00:19:46
Yeah, exactly.
Jon Krohn: 00:19:47
And then, yeah, similarly on the other side of things, you could also be trying to maximize something, which again, I guess could be dollars in this case, like revenue or profits, that kind of thing.
Jerry Yurchisin: 00:19:58
Yeah. That's typically where most customers land; any business that uses this type of modeling is going to be maximizing profits, minimizing costs, and things like that. That's just the number one unit, I guess: dollars. That's what it all comes down to in the end.
Jon Krohn: 00:20:21
All these business people and the dollars.
Jerry Yurchisin: 00:20:26
Yeah.
Jon Krohn: 00:20:26
Pretty sensible. We can often agree that these concepts are important in business, so people can set those kinds of parameters up themselves. It sounds like a common way to do it, especially if you're a data scientist listening to the show, is with Python, which you probably know. So it would not be uncommon for you to set up all these kinds of parameters in your model with Python. And it sounded like there were other ways we could do it, and it would be interesting for you to delve into that a little bit more. But just before you do that, I'll say that Gurobi is then the thing that does the mathematical optimization. Back with TensorFlow 1.0, we got used to talking about the way information flows in a machine learning problem as a graph.
00:21:28
And so it's kind of like what you end up doing here: you use some Python code, for example, to set up this computational graph of what the objective is, what the constraints are, what kind of model it is. So you're showing all the possible ways that the model could be set up, but then it's the parameters in that model that need to be identified. In machine learning, we're typically using gradient descent, but it sounds like for these kinds of mathematical optimization problems, that's not going to work. Is there some kind of easy way to explain why you wouldn't be able to use gradient descent in a common kind of situation that you encounter, and why you would need to use a Gurobi solver instead?
Jerry Yurchisin: 00:22:28
Yeah. So one of the benefits of using mathematical optimization is that it guarantees two things. One is what we call feasibility: when you set up all of your constraints and your decision variables and all this sort of stuff and you click go, you're guaranteed that all of your constraints will be satisfied. And the number two thing that you're guaranteed is global optimality. And that's something where methods like gradient descent and all these other things...
Jon Krohn: 00:23:05
They can get stuck in a local minimum.
Jerry Yurchisin: 00:23:07
Perfect. Yeah, exactly. So having that global optimal solution is something that is unique to mathematical optimization. It's not a heuristic method, as we call it: an approach that's really good, but where you may not get the actual optimal solution when minimizing your cost function. You could probably set up ways to tweak a heuristic to get a little bit better performance, but by and large, for a lot of problems, good enough is good enough; simpler algorithms will do, and you don't want to spend years tweaking to get that little bit extra. But in a lot of business settings, again, you don't want to leave money on the table. If you're a big company and you can cut fuel costs by 1%, that's huge. So those are a couple of the advantages of using mathematical optimization.
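To make the local-versus-global distinction concrete, here is a small, invented one-dimensional example in Python: gradient descent from an unlucky start settles into a local dip, while an exhaustive check of the region (the kind of certainty a global guarantee provides) finds the true minimum:

```python
def f(x):
    """An invented nonconvex function with two valleys."""
    return x**4 - 3 * x**2 + x

def grad(x):
    """Its derivative, used for gradient descent."""
    return 4 * x**3 - 6 * x + 1

# Gradient descent starting at x = 2 slides into the nearest valley only
x = 2.0
for _ in range(10000):
    x -= 0.01 * grad(x)
local = x  # ends up near x ~ 1.13, a local (not global) minimum

# Exhaustive grid search over the region finds the deeper valley
grid = [i / 1000 for i in range(-2000, 2001)]
global_x = min(grid, key=f)  # near x ~ -1.30

print(local, global_x, f(local), f(global_x))
```

A solver with a global-optimality guarantee rules out exactly the kind of miss that gradient descent makes here.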
00:24:18
I would say that the feasibility guarantee I mentioned shouldn't be overlooked, because nothing would be worse than going to your boss or somebody and saying, "Here's how you should do this," and then all of a sudden you run out of resources, or money, or people, and they look at you like, "Wait, wait, wait, I thought you said this was doable." So those are two of the main reasons why mathematical optimization is super powerful: it guarantees those things. And if those constraints would have to be violated, then the first thing you're going to see from Gurobi or any other solver is, "Hey, your model is infeasible." And you're like, "Okay, well, maybe the recipe that I set for this problem just doesn't work." Or maybe you had a coding error, but that's a possible outcome. Then you dive back in and figure it out. Once you do get an answer, you know those two things are true: it's feasible and globally optimal. The global optimality I want to emphasize is really important, because you can go into a meeting with some people about, okay, here's how I want to solve this problem. Then you can, in words, describe the constraints.
00:25:39
In words you can describe the objective, you can describe the decisions that you're making, and if you get a thumbs up from anybody, like, yep, that's exactly what our problem is. You take this modeling approach and with the solution that you get, you have the confidence to say there does not exist a better solution. There may be some other solutions that are just as good and we can dive into those, but there does not exist anything that is better. That's a real powerful thing to be able to say and have that confidence when it comes to making decisions.
Jon Krohn: 00:26:12
That's really cool, man. As you've been talking, I've been imagining a really simple regression model where you have just one input variable. This is described by y = mx + b, the simplest line equation. If you plot some data points and you're trying to fit that very simple linear equation, instead of using stochastic gradient descent, you can use a mathematical approach very straightforwardly; it's something we've been doing in statistics for a century. You can confirm beyond the shadow of a doubt, in that very simple problem, that you have absolutely the best possible line of fit to those data points, and you're going to get exactly the same result every single time you do it, because it's just math.
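Jon's example in code: for y = mx + b, ordinary least squares has an exact closed-form solution (slope = covariance over variance), so there is no iteration and no randomness. A self-contained sketch with made-up points:

```python
# Made-up data points lying near the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form OLS estimates: m = cov(x, y) / var(x), b = mean_y - m * mean_x
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - m * mean_x

print(m, b)  # identical every run: roughly m ~ 1.94, b ~ 1.09
```

Run it twice and the answer never changes, which is exactly the "no stochasticity" point being made.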
Jerry Yurchisin: 00:27:32
Exactly.
Jon Krohn: 00:27:32
There's no stochasticity, no sampling. You're always going to converge on an exact right answer for how that line should fit the data. In machine learning problems, we end up having some combination of too many data points or too many parameters in our model to be able to use that kind of approach where you try out all of the possibilities and see where the global optimum is. It's technically impossible, like a supercomputer running for a century, or whatever the scenario is. Instead, with machine learning, we use stochastic gradient descent to sample in a compute-efficient way, in a memory-efficient way, in order to try to find what we hope is the global minimum, or the global maximum in a reinforcement learning problem. There are all kinds of assumptions built in. This is a really cool conversation we're having, and it's eye-opening to me; this is something that I didn't know before. We can use these kinds of techniques, like linear programming, mixed integer linear programming, and integer programming, with a solver like the Gurobi solver, and have both feasibility and a globally optimal solution be guaranteed. Are you able to tell us anything about how the solver works?
Jerry Yurchisin: 00:29:05
I can give a high-level overview of how linear programming works and how it relates to integer programming and mixed integer programming. In the most common form of a problem, all of your constraints are linear functions; let's make that quick assumption (if they're not, there are ways to handle it, and we can get to that). Your objective is a linear function, and you want all of your variables to be non-negative, because that's another assumption of linear programming, although again, if you need to violate that, there are things you can do to get around it.
00:29:59
If you have all those components, then what comes out of it is what we call a convex problem. Essentially, a convex set is one where you can take any two points in the set, draw a line that connects them, and that line stays within the set. That's a super quick definition of convexity. By definition, linear programs are like that: everything's linear, and you have this convex set. That is actually super easy to solve. Well, there are a couple of methods, but I'll describe the earliest one, which came about in the 1940s. It's called the simplex method. Essentially, when you have all these linear things, if you think about a two-dimensional or three-dimensional space, you have all these linear inequalities that make up the constraints.
00:31:03
If you think of it in 3D space, they form a cube or some three-dimensional figure, where the number of faces is essentially the number of constraints. What the simplex method does is it finds one corner of that geometric figure and then smartly goes around, point by point, to each place where these constraints intersect. Eventually, it gets to a point where it looks at a specific value calculated within the method, and based on theory that was developed, when you're at this point with this particular value, you're guaranteed to be at the optimal solution. That's how linear programming works: it's based on a lot of mathematical theory about what the optimal solution looks like, so that when this condition holds, you know you're at an optimal solution.
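The corner-point idea can be illustrated directly. For a toy two-variable LP with invented numbers, we can enumerate every intersection of constraint boundaries, keep the feasible ones, and take the best corner; the simplex method reaches the same answer far more cleverly, without visiting every corner:

```python
from itertools import combinations

# Toy LP (invented): maximize 3x + 2y subject to
#   x + y <= 4,  x <= 3,  y <= 3,  x >= 0,  y >= 0
# Each constraint is stored as (a, b, c), meaning a*x + b*y <= c.
cons = [(1, 1, 4), (1, 0, 3), (0, 1, 3), (-1, 0, 0), (0, -1, 0)]

def feasible(x, y, eps=1e-9):
    return all(a * x + b * y <= c + eps for a, b, c in cons)

best = None
# Candidate corners are intersections of pairs of constraint boundaries
for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:          # parallel boundaries never intersect
        continue
    # Solve the 2x2 system a1*x + b1*y = c1, a2*x + b2*y = c2 (Cramer's rule)
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    if feasible(x, y):
        obj = 3 * x + 2 * y       # objective value at this corner
        if best is None or obj > best[0]:
            best = (obj, x, y)

print(best)  # optimal corner: (11.0, 3.0, 1.0)
```

Enumeration blows up combinatorially as constraints and dimensions grow, which is why the simplex method's smart corner-to-corner walk matters.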
Jon Krohn: 00:32:23
Just to interject quickly, that was a really cool visual explanation of how that works. Nice.
Jerry Yurchisin: 00:32:31
Then once you have that linear programming solution, and this is actually something that's solved very, very fast, when you get into the mixed integer world, that's when things get a lot more complicated. Because essentially now you need to solve thousands, tens of thousands, hundreds of thousands of these linear programs. For mixed integer programming, one of the main approaches is what we call branch and bound. Let's say you have a variable X that in your linear program solution is 3.2, and we want it to be three or four. It needs to be an integer. Then essentially, it'll split on the cases: okay, let's assume X is three, and then let's assume X is four, and then keep going.
00:33:22
Then solve more linear programs until you get to a point where all of your decision variables that need to be integers are integers. You just keep splitting: okay, well this one is not integer. If it's a binary variable where things are off or on, and it's 0.4, let's split: let's assume zero, let's assume one, and then keep going. You're just solving a bunch of those until you get to a point, again, where some math says, "Hey, this is your optimal solution." There's more math behind it that tells you which way to go, and a whole bunch of stuff behind it. You can see why this is complex, because I've said the word "math" a bunch of times.
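The split-and-keep-going idea above can be sketched as a tiny branch and bound for a hypothetical 0/1 knapsack (all numbers made up). The "LP relaxation" here is the greedy fill that allows one fractional item, playing the role of the fractional X = 3.2; each branch forces an item fully in or fully out:

```python
# Hypothetical 0/1 knapsack solved by branch and bound.
items = [(60, 10), (100, 20), (120, 30)]  # (value, weight), sorted by value/weight
CAPACITY = 50

def relaxation_bound(idx, value, room):
    """Optimistic bound: fill remaining room greedily, allowing a fractional item."""
    for val, wt in items[idx:]:
        if wt <= room:
            room -= wt
            value += val
        else:
            value += val * room / wt  # fractional piece -- the LP relaxation
            break
    return value

best = 0

def branch(idx, value, room):
    global best
    if value > best:
        best = value  # new incumbent solution
    if idx == len(items):
        return
    # Prune: if even the fractional relaxation can't beat the incumbent, stop.
    if relaxation_bound(idx, value, room) <= best:
        return
    val, wt = items[idx]
    if wt <= room:
        branch(idx + 1, value + val, room - wt)  # branch: take item idx
    branch(idx + 1, value, room)                 # branch: skip item idx

branch(0, 0, CAPACITY)
print(best)  # optimal integer value
```

The pruning step is the "more math that tells you which way to go": whole branches are discarded without ever being solved, which is where solvers earn their keep.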
00:34:06
It's very theoretically based, and doing this smartly is very, very complicated, which is why the value of mathematical optimization is in the solver. To be able to do this efficiently, come up with the right way to approach these problems, and cut corners when possible is a very intensive task from a theoretical perspective. That's why solvers are so important in these problems: that's the hard part. The translation that I was talking about, from words to the math to the code, is easy. Well, it's not easy, but it's much easier than the solving part. That's why solvers like Gurobi are super important.
Jon Krohn: 00:35:08
Be where our data-centric future comes to life at ODSC West 2023, from October 30th to November 2nd. Join thousands of experts and professionals in person or virtually as they all converge and learn the latest in deep learning, large language models, natural language processing, generative AI and other topics driving our dynamic field. Network with fellow AI pros, invest in yourself in their wide range of training, talks and workshops, and unleash your potential at the leading machine learning conference. Open Data Science Conferences are often the highlight of my year. I always have an incredible time. We've filmed many Super Data Science episodes there and now you can use the code SUPER at checkout and you'll get an additional 15% off your pass at odsc.com.
00:35:56
Nice. Great explanations of that. I think I'm grasping the broad strokes here now. When a listener wants to use Gurobi, you mentioned that we could use Python to set up the parameters, including our constraints and our objective. Two things: first, maybe with the Python example, just give us a sense of how to do that, step by step. Then, as a follow-up, if it feels right: other than Python, what other ways could we be doing this in? Are there pros and cons of doing it some other way?
Jerry Yurchisin: 00:36:45
I'll answer the second one first because I forgot it before. I'm an avid R user. I love R. My background is more statistics than other things. I like using R, so you can use R, but also C, Java, and you can call our solver from MATLAB. Pretty much any way anyone codes nowadays, we have an API that can work. We even have a command line interface, so if you just want to use Gurobi from the command line, you can. Not the best way to go about it, but definitely an option. For doing it from Python, we have our Python API called Gurobipy. Essentially, the way to use that is you first see the problem in algebraic form. Let's say I wanted to minimize the cost of shipping. You have your cost parameter for shipping from location A to B, and then you have your decision variable: the amount I'm going to ship from A to B, or am I going to ship from A to B at all?
00:38:19
Then your total cost would be the sum, over all possibilities, of your decision variables times their costs. That's what an objective function looks like. That's also what constraints look like. The objective is a linear expression, and each constraint is a linear inequality. Essentially, it's this quantity times this quantity, plus this quantity times this quantity, plus so on and so forth down the line. You could write that algebraically. Then when you look at the code that actually transforms that, it's very, very similar. We have some functionality that helps with the summations and stuff like that. Essentially, it's the sum of decision variables times coefficients, and it must be less than or equal to some value B for your budget, or something like that.
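The "sum of cost times decision" expression Jerry describes maps almost one-to-one into code. Here is a minimal pure-Python illustration with made-up lanes and numbers; in a solver these shipment amounts would be decision variables rather than fixed values:

```python
# Hypothetical shipping costs per lane (origin, destination) -> cost per unit.
cost = {("A", "X"): 4, ("A", "Y"): 6, ("B", "X"): 5, ("B", "Y"): 3}

# A candidate decision: how much to ship on each lane.
ship = {("A", "X"): 10, ("A", "Y"): 0, ("B", "X"): 2, ("B", "Y"): 8}

# Objective: total cost = sum over all lanes of cost * amount shipped.
total_cost = sum(cost[lane] * ship[lane] for lane in cost)

# A budget constraint is the same kind of linear expression, compared to a bound.
BUDGET = 100
within_budget = total_cost <= BUDGET

print(total_cost, within_budget)
```

A solver's job is then to search over all possible `ship` dictionaries for the one that makes the objective smallest while every such inequality holds.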
00:39:21
That's all it really takes to be able to code it up: just understanding a little bit of the syntax and being able to declare your variables, which is very, very easy. For your variable X, you say X equals, and you have to have created a model object, which again is just one line. You say, I want to add variables to this model. If you have a hundred of them, you can just say range 100. Then there you go, you have 100 decision variables ready to go. If they are dual indexed, like location A to location B, then you can say addVars, which is the code that you would actually type, and then range 10 by range 20 or something like that. That's as simple as it gets. If you then want to make them binary, there's a little thing that says, I want to declare them to be binary. Then poof, you're done. That's all it takes.
00:40:15
All that other stuff about incorporating the integrality, telling Gurobi or other solvers, okay, this has binary variables, you need to split on them and do branch and bound and everything that's part of that, you don't need to do. It just automatically knows once you make those declarations. That's why I really think that way of approaching it, of understanding your business problem, then words to algebra, to code, is a really good way to follow through with it. Because it all makes sense from one step to the next. If you went straight from someone speaking it to the code, there might be a lost connection there. That's why that three-step process is something I suggest.
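The declarations Jerry walks through might look roughly like this in Gurobipy. This is a hedged sketch, not a runnable tutorial: the lane counts and costs are made up, and actually running it requires Gurobi and a license:

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("shipping")

# 10 origins x 20 destinations of shipment variables, declared in one line.
x = m.addVars(range(10), range(20), name="ship")

# Making variables binary is just a declaration; the solver takes care of the
# branch-and-bound machinery automatically.
use_lane = m.addVars(range(10), range(20), vtype=GRB.BINARY, name="use")

# Made-up cost parameters for illustration.
cost = {(i, j): 1 + (i + j) % 5 for i in range(10) for j in range(20)}

# Objective: minimize total shipping cost.
m.setObjective(gp.quicksum(cost[i, j] * x[i, j]
                           for i in range(10) for j in range(20)), GRB.MINIMIZE)

# A budget-style constraint: a linear expression bounded by a number.
m.addConstr(gp.quicksum(x[i, j] for i in range(10) for j in range(20)) <= 500)

m.optimize()
```

With only a budget cap and nonnegative costs, the trivial all-zeros plan is optimal here; the point of the sketch is purely the words-to-algebra-to-code shape of the syntax.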
Jon Krohn: 00:41:05
One of the steps in there from words to algebra to code, that algebra step sounds like it could be potentially intimidating. Are there resources for people that Gurobi provides or you can find elsewhere that help you figure out how to get the words to algebra?
Jerry Yurchisin: 00:41:27
This is one thing that I think is a reason why mathematical optimization hasn't taken off as much as it could have. There is this stigma, I guess you might call it, a kind of gate-keeping in a sense, and it's unintentional, but it's there. There's a stigma that you have to have a PhD in operations research to be able to utilize mathematical optimization. It certainly helps having a PhD, and a PhD in any topic helps you with that topic. I think we can all agree on that. But people think that you need to have that, that those are the only type of people who get these roles, the only type of people who talk to each other about it. There's this very math-heavy lingo that has been developed. If you were to say, okay, I want to learn more about this, you're going to be smacked in the face with math right away. You're going to be seeing symbols and notation that you probably don't understand.
00:42:36
I said your decision variables have to be nonnegative. Everyone understands that verbiage. But instead you'll see X in the set of nonnegative real numbers, which is X, a pitchfork symbol with three prongs, a weird, funny looking bold-face script-y R with a plus sign next to it. You'll be like, all right, I don't know what's going on anymore. I'm lost. This isn't for me. When I say X, nonnegative: okay, yeah, I get it. It could be zero or positive. Perfect. That's one of the reasons why we think mathematical optimization is not as popular as it should be: you get hit with that really fast. That's one of the things for my role at Gurobi in particular: how can we lower that barrier to entry? Not everyone has an exceptional mathematical background, and some people may not even have any math background.
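For reference, the notation Jerry is describing, side by side with its plain-English reading:

```latex
% "x is a nonnegative real number", in notation:
x \in \mathbb{R}_{+}
% the "pitchfork with three prongs" is \in (set membership), the bold
% script-y R is the set of real numbers, and the subscript + restricts
% it to zero or positive, i.e. x >= 0.
```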
00:43:37
That was one of the real awesome things about the data science explosion: anyone can do it, really. You just needed to learn to code. You didn't need to understand the mathematics behind it. You just say, I want to cluster this, here are some of the settings, and then, boom. How all that stuff worked under the hood, you didn't have to worry about. But when you see a lot of optimization stuff out in the wild, on Stack Overflow and everywhere, you just get smacked with math. What we try to do is say, okay, we don't need to use all of the notation. We don't need all the mathematical formality that you'll see out there. We lower that and then introduce those concepts step by step. You do eventually get used to the more complex notation that you'll see. It really helps people understand what's going on and makes it easier to get started.
Jon Krohn: 00:44:43
Nice. Then, so part of your role as a data science strategist is creating these training materials, right?
Jerry Yurchisin: 00:44:49
Yeah. I would say our easiest to access is our library of Jupyter notebooks. A lot of them were made before I got there, and they're made for the more OR, operations research, audience. They do have a little bit of intense notation and things like that. We've started releasing notebooks that are made more for the data science crowd. The story development is a little bit more important, the notation is reduced, and the notation that remains is explained. We just make it a lot easier to read through the problem. You'll see, okay, here's the statement of my objective: I want to minimize my costs. Then the next thing you'll see is the math, sometimes simplified a little bit, and then the next thing you'll see is how to code that up. It's all right there: here's the language, here's the math, often in simplified terms, and here's the code to do it. It's really easy to see these things next to each other, because that's the important part: being able to take the business problem, the problem you're trying to solve, and eventually code it up to get to the optimal solution.
Jon Krohn: 00:46:25
Data Science and Machine Learning jobs increasingly demand Cloud Skills—with over 30% of job postings listing Cloud Skills as a requirement today and that percentage set to continue growing. Thankfully, Kirill & Hadelin, who have taught machine learning to millions of students, have now launched CloudWolf to efficiently provide you with the essential Cloud Computing skills. With CloudWolf, commit just 30 minutes a day for 30 days and you can obtain your official AWS Certification badge. Secure your career's future. Join now at cloudwolf.com/sds for a whopping 30% membership discount. Again that’s cloudwolf.com/sds to start your cloud journey today.
00:47:07
Very cool. Very exciting. We probably have a lot of listeners who are licking their chops, getting ready to tackle those Jupyter notebooks. Because that sounds really fascinating. I mean, almost everything you've talked about in this episode so far has been completely brand-new information to me. You're opening up this whole new world of possibility to me, and it's exciting to know that I can get started even though I don't know the pitchfork symbol all that well.
Jerry Yurchisin: 00:47:34
Yeah, exactly. It's just those simple things. Again, for the people who know mathematical optimization, it's the simplest form of communication. The plus side to understanding all that notation is I can give you my model, and we don't even have to speak the same language verbally, and you'll be able to understand what I'm saying, what my model is. Mathematics is a universal language; that's what enabled people to collaborate, and that's why it's done that way. It makes sense why it all happened, but then it just put up this barrier, which is kind of sad.
00:48:16
I think we're working through it. Other things that we offer: lots of webinars. I did a two-half-day training for data scientists, free too, that really talked about all of these bits and pieces: how to get started, what the notation means, when you need it, when you don't need it. We're definitely trying to expand to this audience, because data scientists are right next to the decision makers. The people who are eventually making decisions based off of your forecasts, off of your models, you're right next to them. Why not be able to help inform and drive that with a solution that does all the fun stuff I mentioned before?
Jon Krohn: 00:49:04
Very cool, Jerry. Yep. We'll be sure to include links to those resources. The notebooks, as well as these two half-day trainings.
Jerry Yurchisin: 00:49:13
People can't make a full day.
Jon Krohn: 00:49:14
That'll be exciting. You've given us a sense, in broad strokes, of the kinds of problems that we might want to be solving with this, without getting into anything proprietary. I know, for example, that 80% of America's largest listed corporations use Gurobi, which is a wild thing: this mathematical solver has the kind of dominance among mathematical optimization solvers that Google has in search, and it seems like the biggest corporations are just like, this is obviously what we should be using here.
Jerry Yurchisin: 00:49:59
What I'll say is the use of a solver tends to be hidden well beneath the layers of some problem-solving tool or something along those lines, so it's not top of mind. We describe the Gurobi Solver as the engine of a car. It is a smallish piece of the whole puzzle of a car, but I'd say it's probably the most important. You're not going to be going anywhere at any speed in a car without an engine. That's how we like to describe our role. One case study that we have that I like to put out there, and it shows the power of optimization, and again, you don't see this in the ad that I'm going to talk about, is that the Gurobi Solver is used in NFL scheduling. The way that they used to create the NFL schedule was literally on a board. You see a commercial where they have little tags and they flip them back and forth and they're like, "Okay, I think we're good. This makes sense to us and we're all happy." They would lock themselves in a room for, I think, eight weeks or a month or something like that, do all this by hand, and then come up with a schedule.
00:51:32
Now they use the Gurobi Solver. There's a lot of time and effort to model the problem correctly, to take all of the constraints that people are saying need to happen, put them into algebra, and put that into code. Definitely a heavy lift, but then you just update the data that you use for the parameters, expected eyes on TVs or something like that, and then you just rerun. And then you can run this over a bunch of different scenarios and look at solutions that are either optimal or near optimal. You've gone from doing this, audio listeners, I'm flicking my finger and sticking it into the air, okay, this one seems good, to being able to look at the solutions that really, really matter.
00:52:19
And recently, I watch a lot of NFL on RedZone and out-of-market games and things like that, and there's a commercial from Amazon. The first part of it is all about how they use AI to do awesome stuff, and they talk about the scheduling. Yeah, we can now look through trillions of combinations. When they say trillions, that's a severe understatement. If they said more than that, people would just be like, that's complete BS, but it is more than that. But there's zero mention of mathematical optimization. There's zero mention of Gurobi. It just says we did this, and it's an AWS commercial. That shows how it gets under-recognized, which again goes to the point I was making before. So that's one. But over 40 industries use us specifically, and even more are using optimization as a whole. It is everywhere. You just don't know about it because it's typically buried.
Jon Krohn: 00:53:38
Yeah, this is wild, man. I hope that this is as eye-opening for so many of our listeners as it is for me, because it is clear that there's a very useful screwdriver-
Jerry Yurchisin: 00:53:49
Exactly.
Jon Krohn: 00:53:50
... where we've been using hammers for so many different problems, and this is awesome. I really appreciate you taking the time to do this and fill us in on the amazing things we can be doing with mathematical optimization. So, to get into some more specific examples of situations where you might want to use mathematical optimization instead of machine learning, or maybe in combination, because that would also be interesting ... Actually, is that something that you could maybe answer right away? Are there situations where you might want to use both a mathematical optimizer and machine learning together?
Jerry Yurchisin: 00:54:28
Yeah. There are three ways that we view the relationship between machine learning and optimization. One: all of those costs and travel times that we were talking about, all of those parameters, the best way to know what those are is through machine learning, taking all of your data and coming up with point estimates for all of those things. And the best way to do that is all of the awesome machine learning tools that we have available to us. Then you take those predictions. Think of a demand forecast for your products in certain locations: I could say I have a pair of shoes that is going to be very popular here, maybe not so popular there, and things like that. That is building out those numerical forecasts. Okay, great. But then when it comes to acting on that and deciding, okay, here's how I should produce them, here's where I should acquire the bits and pieces to make them, here's the number that I should take from this production facility to this warehouse and this warehouse to this store, and all of the other combinations of things, that's where mathematical optimization shines. You can define all of those decisions very explicitly: the number of shoes I'm going to produce of this type at this facility, then ship to this warehouse, and then this warehouse distributes them to these stores.
00:56:15
You can have decision variables that capture all of that. Then you fill in your constraints and say, okay, I only have this much budget to buy the things I need to manufacture the shoes, so there are some of my limitations right there. Or, again, going back to the trucks thing, I only have a certain number of trucks or a certain number of drivers, and all of those business rules become your constraints. That's essentially one of the main differences, again: thinking versus acting. Machine learning helps you think about where your demand is going to be hottest and of what types, but actually acting on it is something that machine learning doesn't do well, because with machine learning, you need to have previously seen the outcome for it to be in your dataset. And that's just not the right way of thinking about this. You need to explore all possibilities, and that's what mathematical optimization and the solver do: they help you explore that 3D figure, all of those possible solutions. Again, it's different tools solving different problems, but that's one way they can work together.
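This "ML thinks, optimization acts" handoff can be sketched end to end. Everything here is hypothetical: the `forecast_demand` stub stands in for a trained ML model's point estimates, and the exhaustive search stands in for what a solver would do far more cleverly:

```python
from itertools import product

def forecast_demand(store):
    """Stand-in for an ML model's demand point estimates (made-up numbers)."""
    return {"downtown": 30, "suburb": 18}[store]

TRUCK_CAPACITY = 40                               # a business rule, i.e. a constraint
PROFIT_PER_UNIT = {"downtown": 5, "suburb": 4}    # made-up economics

stores = ["downtown", "suburb"]
demand = {s: forecast_demand(s) for s in stores}  # step 1: predict

# Step 2: decide. Exhaustive search over integer shipment plans that never
# ship more than the forecast demand and respect the truck's capacity.
best_plan, best_profit = None, -1
for plan in product(*(range(demand[s] + 1) for s in stores)):
    if sum(plan) <= TRUCK_CAPACITY:
        profit = sum(q * PROFIT_PER_UNIT[s] for q, s in zip(plan, stores))
        if profit > best_profit:
            best_plan, best_profit = dict(zip(stores, plan)), profit

print(best_plan, best_profit)
```

The forecast never tells you what to ship; the search over feasible plans does, which is exactly the thinking-versus-acting split described above.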
00:57:40
The second way is, as you mentioned at the very beginning, that machine learning problems are actually optimization problems. You are minimizing a cost function, so you can formulate problems like that, and regression is one example. We actually have a couple of notebook examples. One uses mathematical optimization to switch up the objective of linear regression, which is minimizing the sum of squared errors. What if you wanted to minimize absolute error instead? Well, there's no calculus to help you there. So you could model that as an optimization problem. And then you can also guarantee other things, like, "Hey, I want this set of regression coefficients to be non-decreasing, because that may make sense for these tiers of product. I want my beta one to be less than beta two, less than beta three." That's something you can't do as well with machine learning, at least not very directly. You'd have to maybe add some stuff.
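The absolute-error idea can be shown on a toy scale. A solver would express least absolute deviations as a linear program; this hedged sketch with made-up data instead exploits the fact that the absolute-error objective is convex in the coefficient, so a simple ternary search finds a minimizer:

```python
# Least absolute deviations fit of y = b*x on hypothetical data:
# the alternative objective mentioned above, absolute error instead of squared.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]

def abs_error(b):
    return sum(abs(y - b * x) for x, y in zip(xs, ys))

# abs_error is convex (piecewise linear) in b, so ternary search converges
# to a minimizer; in a solver this would be a linear program instead.
lo, hi = -100.0, 100.0
for _ in range(200):
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if abs_error(m1) < abs_error(m2):
        hi = m2
    else:
        lo = m1
b = (lo + hi) / 2
print(b, abs_error(b))  # fitted slope and its total absolute error
```

Notice that calculus gives no closed form here because the objective has kinks; that is exactly why Jerry frames it as an optimization problem rather than a formula.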
00:58:57
And then another example is feature selection. We have an example that treats features as binary variables, in a sense. Do I want to include this one? Yes, that value is one; no, it's zero. An on/off switch to help you figure out the best subset of features if you only want to include a certain number of them. So that's another way that optimization can be used with machine learning. And there are also optimal decision trees and things like that, so there's a whole bunch of ways to go about it. The third way is the other way around: embedding. This is something we recently released an open source package for, which I'm super excited about. It's taking a trained regression model, something in scikit-learn or XGBoost, and embedding it into a larger optimization model. So if you have that price-demand relationship, you can use XGBoost to train it: in this area, with these other factors, if I set my price to be this, I'm expecting my demand to be that. You can directly embed that into a larger optimization model that takes your whole supply chain into consideration as well. It's a really cool way to leverage mathematical optimization with machine learning.
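The on/off-switch view of feature selection can be sketched by brute force on a tiny made-up dataset. A solver would treat each switch as a binary decision variable and search the subsets cleverly; here we simply enumerate every subset under a cardinality limit and fit least squares via the normal equations:

```python
from itertools import combinations

# Hypothetical data: 4 samples, 3 candidate features,
# with y generated as 2*feature0 + 3*feature2 (feature1 is a decoy).
X = [[1, 1, 0],
     [2, 1, 1],
     [3, 2, 0],
     [4, 2, 1]]
y = [2, 7, 6, 11]

def fit_sse(cols):
    """Least-squares fit restricted to `cols`; returns sum of squared errors."""
    k = len(cols)
    # Normal equations A b = c with A = X'X and c = X'y over selected columns.
    A = [[sum(r[i] * r[j] for r in X) for j in cols] for i in cols]
    c = [sum(r[i] * yi for r, yi in zip(X, y)) for i in cols]
    # Gaussian elimination; fine for these tiny positive-definite systems.
    for p in range(k):
        piv = A[p][p]
        A[p] = [v / piv for v in A[p]]
        c[p] /= piv
        for q in range(k):
            if q != p:
                f = A[q][p]
                A[q] = [a - f * b for a, b in zip(A[q], A[p])]
                c[q] -= f * c[p]
    b = c  # solved coefficients
    preds = [sum(b[t] * r[i] for t, i in enumerate(cols)) for r in X]
    return sum((p - yi) ** 2 for p, yi in zip(preds, y))

MAX_FEATURES = 2  # the cardinality constraint: include at most this many
best = min((cols for k in range(1, MAX_FEATURES + 1)
            for cols in combinations(range(3), k)), key=fit_sse)
print(best, fit_sse(best))  # best subset of feature indices and its error
```

With 3 features the enumeration is trivial; with hundreds it explodes, which is why formulating the switches as binary variables for a MIP solver is the scalable version of the same idea.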
Jon Krohn: 01:00:34
Very, very nice. Great examples there. It's nice to have three ways that we can relate machine learning to mathematical optimization. That's perfect. I wasn't sure if there was going to be an answer to that question.
Jerry Yurchisin: 01:00:45
I wasn't prepared for that.
Jon Krohn: 01:00:49
And so another area of machine learning today that is obviously super popular is natural language processing, with things like large language models. Can mathematical optimization be used in NLP applications? Our researcher Serg pulled up that you've previously discussed how linear programming could be used to catch plagiarism, for example.
Jerry Yurchisin: 01:01:12
Yeah, this was a super fun example to work on. I have to give a lot of credit to Rahul Swamy, an intern who was more than an intern and was with us for a while; he and I worked together on this. Essentially, plagiarism comes down to one thing: I'm going to use words that are similar to yours, change things up a little bit, and essentially copy your work, but make it just different enough that even a trained eye might not be able to recognize it. The example that we created uses the word mover's distance, based on Google's word embeddings. We don't use the embedding model itself at all, we just use its output, so this is the first way I was talking about that ML and optimization can work together. It takes two passages or chunks of text and uses that score to help identify plagiarism. When you set up the linear program, you want to minimize that overall distance, so it shows you how close one document or passage is to another.
01:02:41
And for that objective, although it's minimizing this difference, the unit is arbitrary. It doesn't mean too much on its own, so you have to set some threshold yourself. But you can do that and say, "I have this document and that document." You set it up as what we call a transportation flow problem, or just a min-cost flow, which is a common archetype of a problem in optimization. And if you get a score that you feel is low enough, then you can say, hey, this may be plagiarism. It's a really cool way of taking a very common type of mathematical optimization formulation, one with a lot of canned examples out there, and applying it in a new sense to do something completely new.
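A toy version of the flow idea: match each word of one passage to a word of the other so the total semantic distance is minimized. The words and distances below are invented stand-ins for word-embedding distances, and with equal word weights the transportation problem collapses to an assignment problem small enough to brute-force:

```python
from itertools import permutations

doc_a = ["fast", "car", "drives"]
doc_b = ["quick", "automobile", "moves"]

# Made-up pairwise distances standing in for word-embedding distances.
dist = {
    ("fast", "quick"): 0.1, ("fast", "automobile"): 0.9, ("fast", "moves"): 0.7,
    ("car", "quick"): 0.8, ("car", "automobile"): 0.2, ("car", "moves"): 0.9,
    ("drives", "quick"): 0.7, ("drives", "automobile"): 0.8, ("drives", "moves"): 0.3,
}

# Minimize total matching cost over all one-to-one word matchings.
score = min(sum(dist[a, b] for a, b in zip(doc_a, perm))
            for perm in permutations(doc_b))
print(score)  # a low score suggests the passages say nearly the same thing
```

As Jerry notes, the score's unit is arbitrary: you would compare it against a threshold you calibrate yourself, and for real passages with unequal word weights the LP/min-cost-flow formulation replaces this brute force.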
Jon Krohn: 01:03:41
Nice. I love that. Thank you for digging into that specific example for us. It helps crystallize what mathematical optimization is and how we can use it with machine learning to be even more powerful, even more widely useful. All right. So beyond your work at Gurobi, you do a lot with sports and math yourself. For example, your Math with Jerome website says that it's The Center of the Venn Diagram for Math, Sports, and Fun!
Jerry Yurchisin: 01:04:14
Yeah.
Jon Krohn: 01:04:15
And then a slightly less PG, a little bit more adult way of describing it: on your website, you say that you have two passions in life, sports and math, but that you probably should list beer in that group, and that typically you use beer as an excuse to discuss the other two passions, sports and math. So we can pretend that we're having a beer here. Actually, first of all, this is the most important question in the episode. What kind of beer do you like?
Jerry Yurchisin: 01:04:49
I'll give my top two. One is Guinness, and number two is an old college favorite, PBR.
Jon Krohn: 01:04:58
Oh, yeah. Oh, nice, real classy taste, PBR.
Jerry Yurchisin: 01:05:02
Exactly.
Jon Krohn: 01:05:04
Yeah, I'm a big fan of dark beers like Guinness, for sure. Nice. I'm imagining it right now. I was out for a few drinks last night, so I'm fantasizing right now, but I'm having to take care of the dog, getting over that. So let's discuss math and sports together. What are some of the fun ways of applying math to sports, Jerry?
Jerry Yurchisin: 01:05:33
So, on the website that I have, which I haven't been as active on lately as I'd like, and I'll get into that in a minute, understanding what works in sports is all about prediction: if you can predict the future, then you'll be able to make a lot of really good decisions. I teed it up that way, I guess. That's just how I normally talk now, given my role and everything. Predicting player outputs, understanding things like injury likelihoods, what are safer ways to play sports? Those are a lot of interesting questions, and you see so many commercials about it now, like win probability and things like that. It's all over the place, and it's really awesome for me to see.
01:06:32
And the way I've tried to approach that is to help decompose some things to make them a little more understandable, or to just provide new context for the way people think about something. So one article that I have on there relates physics to the 40-yard dash and other NFL Combine events. You have people running, you have people jumping and all sorts of stuff, and there are some basic concepts of physics here. One is work: the amount of force that is needed to move an object times the distance that you move that object. Very simple thing, but nobody uses it as a way to quantify, hey, I have an offensive lineman who's 360 pounds and runs a 40 in this time, and I have a defensive back who is 205 pounds and runs a 40 in this time. Which is more impressive? I don't know. I do see some other scores out there that take this into consideration.
01:08:03
But the way that I approached it was, hey, this thing in physics exists and it's so easily relatable to the data that you get from the Combine. Let's just mix those two things together and have an actual metric that is quantifiable against other things. I related that to horsepower, which is something a lot of people feel they understand in a sense, maybe not exactly what it is. I went down the rabbit hole of why horsepower is called horsepower; it's something about the amount a horse can lift on a pulley, some box or something like that. I may be way off on that, but I was like, "That's weird." It's a very basic physics concept, and we're just not relating it to a physical phenomenon that we're seeing. So I was like, hey, let's quantify that so we can help compare the performance of these different types of folks, real big linemen versus smaller players, and put them on the same scale with something that already exists.
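A back-of-the-envelope version of that comparison, with crude, illustrative numbers (the player weights and dash times are hypothetical, and the physics is deliberately simplified): estimate the average power needed to bring each player's mass up to their 40-yard-dash speed.

```python
# Illustrative work/power comparison for the 40-yard dash.
YARDS_40_IN_M = 36.576
LB_TO_KG = 0.45359237

def avg_power_watts(weight_lb, dash_seconds):
    mass = weight_lb * LB_TO_KG
    speed = YARDS_40_IN_M / dash_seconds      # rough average speed, m/s
    kinetic_energy = 0.5 * mass * speed ** 2  # work done to get that mass moving, J
    return kinetic_energy / dash_seconds      # power = work / time, W

lineman = avg_power_watts(360, 5.2)  # hypothetical 360 lb lineman
back = avg_power_watts(205, 4.4)     # hypothetical 205 lb defensive back
print(round(lineman), round(back))   # both on one scale, in watts
```

On these made-up numbers the slower lineman actually generates more power than the faster defensive back, which is exactly the kind of apples-to-apples comparison the article argues for (one mechanical horsepower is about 746 watts, so both are near the one-horse mark).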
01:09:11
So that was one interesting article that I put together. Part of the reason why I haven't been active publishing there is that for the last two or three years, I've written for a sports analytics site called numberfire.com. Now I think they're called FanDuel Research. A bunch of great folks there. It was awesome, and I was a part-time writer. I'd come up with an article once or twice a week, or a couple of times over an off-season, and I would dive into forecasting which players are going to do well, which teams are going to win, using metrics that are now a little more common in the sports world, things like win percentages, win probability, expected points, and all that, to help inform people. Hey, who should I start on my fantasy team? What bets are good? Only if sports betting is legal in your area, of course. Putting some analysis behind these decisions that you can make.
Jon Krohn: 01:10:30
Very cool. Have you ever used mathematical optimization for sports analytics?
Jerry Yurchisin: 01:10:36
Every week. I built an optimization model for the sole purpose of building optimal fantasy football teams for DraftKings. Here's an example of where the machine learning is astronomically more important than the mathematical optimization. I'll put it this way, to relate to how we were talking about things before: if your points forecast for players is 100% correct, then if you optimize your lineups with mathematical optimization, no one will beat you. Guaranteed. They may tie you, off of either dumb luck or because they're also using optimization, but you are guaranteed not to lose, because, again, that's part of the awesomeness of optimization: that guaranteed global maximum or minimum. So yeah, I do that every week. I just need to get the points forecast improved and find reliable sources for that. Once I figure that out, I will never lose again.
Jon Krohn: 01:11:59
Awesome. That's very cool. It's nice to be ... Again, this is one of those questions where I was like, "I don't know. Just throwing it out there." Cool to see that you're using your day job knowledge to help with this sports data analyst work. Very cool. I'm not personally a fantasy sports person, but I know a lot of people are, and I suspect that a lot of people that listen to a data science podcast are into it.
Jerry Yurchisin: 01:12:27
Yeah.
Jon Krohn: 01:12:27
So probably have some people now scrambling to find your materials on how they can be leveraging mathematical optimization for their own fantasy sports lineups. Very cool.
Jerry Yurchisin: 01:12:38
Yeah. On that, the first notebook example that we put together was optimizing fantasy basketball lineups using gurobipy. So it's there; if you're interested, you can see how it works and how constraints that you read on a website become a model. I think we may have used FanDuel for that. They have rules like, "Here are the restrictions on your lineup: you need this many players, you need at least one center, one point guard, and so on, but you also have flex positions for any type of guard or any type of player." Those are all telling you the constraints of the model. So you take that, put it into the algebra, then put it in the solver and fire it up. That example exists, and you can see how it all works in the fantasy sports world too.
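The website rules Jerry describes translate directly into constraints. Here's a minimal, hypothetical sketch in plain Python: the players, salaries, point projections, and lineup rules are all made up, and a brute-force search stands in for a real solver so it runs with no dependencies. Gurobi's actual notebook expresses the same idea as linear constraints in gurobipy.

```python
from itertools import combinations

# Hypothetical players: (name, position, salary, projected points).
# Real DraftKings/FanDuel contests have larger rosters and more slots;
# this toy version keeps the search space small enough to brute-force.
players = [
    ("A", "PG", 8000, 42.0),
    ("B", "PG", 6500, 33.5),
    ("C", "C",  7200, 38.0),
    ("D", "C",  5000, 24.0),
    ("E", "SG", 9000, 47.5),
    ("F", "SG", 4800, 21.0),
]

SALARY_CAP = 21000
LINEUP_SIZE = 3

def feasible(lineup):
    """The website's rules restated as constraints: salary cap,
    at least one point guard, and at least one center."""
    positions = [p[1] for p in lineup]
    return (sum(p[2] for p in lineup) <= SALARY_CAP
            and "PG" in positions
            and "C" in positions)

# Exhaustive search over all size-3 lineups guarantees the global
# optimum on a toy instance; a solver like Gurobi gives the same
# guarantee at real scale without enumerating everything.
best = max(
    (l for l in combinations(players, LINEUP_SIZE) if feasible(l)),
    key=lambda l: sum(p[3] for p in l),
)
print([p[0] for p in best], sum(p[3] for p in best))
```

Note that the feasibility check and the objective are kept separate, exactly the split between constraints and the objective function in a mathematical optimization model.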
Jon Krohn: 01:13:33
Fantastic, man. Very cool. So beyond the work that you've been doing for the past two years at Gurobi as a data science strategist, before that you were a senior mathematician and data scientist at Booz Allen Hamilton, more commonly called Booz Allen. In particular, as we alluded to earlier in the episode when you mentioned living near Washington, DC, you were doing a lot of government work in that role, working for the DOD, the Department of Defense, as well as DHS, the Department of Homeland Security.
01:14:21
So I don't know to what extent you can tell us about the work that you were doing there, or just how your career evolved. You did a lot of teaching in the past as well: you have a bachelor's degree in integrated mathematics education. I don't actually know what "integrated" means there, so that's maybe a quick one to explain. You've taught math and stats at the high school, college, and graduate levels for many years, and then you got into consulting as a senior mathematician and data scientist. So tell us a bit about that journey to where you are now. I don't know if there is a traditional path into data science, but you have an interesting path into it that some of our listeners might also value hearing about.
Jerry Yurchisin: 01:15:12
Sure. Yeah. The "integrated" in the bachelor's degree means high school; it's their way of saying high school math. My original career trajectory was that I was going to be a high school math teacher and coach football or wrestling, two sports that I did in high school. I was like, "This is going to be fun," but realized that maybe it wouldn't be. While I still love teaching, at that level it's tough for a bunch of reasons that I won't get into right now. So then I was like, "I still want to teach," and I ended up teaching at a community college for a couple of years, and it was an awesome experience. It really helped me refine how I talk about math, analytics, all of those topics that people may not be familiar with, because I was teaching a lot of people who had very little exposure to them. I might have been teaching their first math class since high school.
01:16:20
So, it really helped me understand how to communicate complex ideas. That's why I think I ended up being a very good fit in the consulting world: again, that's what you're doing, you're explaining. Now it's a little bit more storytelling, so there's obviously some nuance, but being personable, listening, and taking in people's concerns from an educational perspective ("I don't understand this. Can you help me? What do you mean by this?") and then relating that to clients was a natural transition, and I think it worked out very well. I was able to work on a lot of awesome projects. To highlight a couple: probably the longest-running project I worked on was in cyber acquisition, figuring out what we spend money on and what it does for us in that space. I was part of a really large team that built a simulation model to help figure that out: "Hey, if I want to do these types of missions over the next year, what resources do I need? Who do I need? What do they need to be competent in?" All of that sort of stuff, we needed to help decide.
01:17:54
My role there was, given a bunch of data about operations and things like that, to figure out the key parameters. How many people do you need for this? What is the relationship between this type of mission and the time it takes to complete, based on X, Y, and Z? I was given free rein to do anything I felt I needed to do to fill in that missing information. That's where I was able to dive into a lot of unsupervised and supervised learning to help: "Hey, these missions are similar," or, "This is the time it takes on average," or, "Here is the regression model that explains that." So that was a really awesome way to leverage those skills.
01:18:45
Another example of a really cool project that I was able to work on was for the Coast Guard, and it helped answer a question or at least provide some context to the readiness of certain types of vessels under not great conditions. How does a hurricane hitting in the Gulf affect the readiness of certain vessels to be able to patrol and do their normal day-to-day on the East Coast? You don't necessarily think that that's super correlated, but it is because you got to send a lot of extra resources to where the more catastrophic event just hit. I was able to work on that from a statistical perspective, a lot of stats there. And if you're a machine learning person who doesn't dive into the statistical world a lot, I highly recommend it. There are a lot of great approaches to use as well. Not everything needs to be XGBoost.
01:19:52
There's a lot of awesome goodness to use there. But I was given free rein, and this was true of a lot of projects, to figure out the methodology with this boatload, pun intended, of data. How can we quantify this? How can we see what effect something like a hurricane hitting the Gulf has on a larger scale? So I used that data, devised the methodology, and was able to use a lot of statistics. I did use some machine learning there for clustering, but I was able to help highlight, "Hey, this is what happens. This is the impact that this has." The results were a real eye-opener, I think, for a lot of people. Yeah, it's really cool to work on projects that have an impact like that. And part of the reason I liked working in the federal space is that, as we were saying earlier in this episode, dollars and cents are usually the bottom line, and it's refreshing to work in a space where they aren't. You incorporate things like lives saved as an important metric, so that's really cool and a bit refreshing, and it was definitely motivating for me.
01:21:25
And then, the last example I'll quickly hit on is one that actually made me think, "Hey, if I want to do all of the cool things that I know I can do and want to do, maybe I do need to look elsewhere for other types of opportunities." I was leading a project for the US Army where they wanted to understand how their assets degraded over time to help them plan budgeting. It was mainly structural assets: I have this building that already has this amount of stuff wrong with it. What's its condition? How much worse is it going to get in the next year? Like I said, I led a small team to do that analysis, and we ended up with an awesome regression model that was pretty good. I can't remember our R-squareds exactly, but I think they were absurdly high for a real-world problem, in the 0.9s, and the first time I saw it I thought, "This isn't right. That's just not what happens in practice. That happens in textbook examples." But alas, we got that performance, and it was a really cool project.
01:22:58
Again, this one was a little more narrow in scope. They wanted a linear model, so that was a restriction; they needed it that way to implement it in their larger platform, so we weren't able to use fancier methods from the machine learning perspective. But this was an example of a project that was scoped properly to do what the client wanted right away. Still, if I know the condition of all of these assets and their importance for next year, then figuring out where I should invest and what impact that will have is an optimization problem. I have a bucket of money that I can spend on these things, either to improve them or to let them go. I was like, "This is a perfect case for optimization," but since the project was scoped the way it was, it wasn't something we could do. That was kind of frustrating, because it is what it is at that point, but I wanted to take a machine learning model and use its output as an input to an optimization model to help inform decision making. It was pushing a boulder uphill to get that going, and eventually it never materialized, so it was a little disheartening, and I wanted to really use my...
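The pattern Jerry wanted, machine learning predictions feeding an optimization model, can be sketched in a few lines. Everything here is hypothetical (the asset names, repair costs, and predicted condition gains are invented, and the "predicted gain" column stands in for the regression model's output); a brute-force 0/1 knapsack search stands in for the kind of model you'd actually build in a solver like Gurobi.

```python
# ML-then-optimize, in miniature: a regression model predicts each
# asset's condition gain if repaired, and an optimization picks which
# repairs fit the budget. All numbers below are made up.
assets = [
    ("roof",       40, 9.0),  # (name, repair cost in $k, predicted gain)
    ("hvac",       30, 7.0),
    ("foundation", 50, 8.5),
    ("windows",    20, 4.0),
    ("plumbing",   25, 5.5),
]
BUDGET = 90  # $k available to spend

def best_repairs(assets, budget):
    """Exhaustively check every subset of repairs (fine for a handful
    of assets); a real solver handles this at scale with the same
    global-optimality guarantee."""
    best_gain, best_set = 0.0, []
    for mask in range(1 << len(assets)):
        chosen = [a for i, a in enumerate(assets) if mask >> i & 1]
        cost = sum(a[1] for a in chosen)
        gain = sum(a[2] for a in chosen)
        if cost <= budget and gain > best_gain:
            best_gain, best_set = gain, [a[0] for a in chosen]
    return best_set, best_gain

plan, gain = best_repairs(assets, BUDGET)
print(plan, gain)
```

The key design point is that the ML model's predictions appear only as coefficients in the objective; the decision of what to fund is left entirely to the optimization.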
01:24:40
By that point I had done several optimization projects that I didn't mention, but I wanted to combine the two. I was like, "This is the way." It was just hard to get that going, so I thought, "Maybe I need to gather other experiences to help push that forward." After I left Booz Allen, I worked for a really small but really awesome consulting company called OnLocation, in the energy space, so I was able to use my optimization chops by tailoring optimization models to see how certain policies would affect things. That was also really cool. One project there was on renewable energy: how does expanding our renewable energy affect the rest of our energy sources, what we would use, and what would be most cost-effective? Yeah. That was my time as a federal contractor.
Jon Krohn: 01:25:50
Nice. Very cool, Jerry. Well, it's been a fascinating path so far. You've been doing terrific work, and I can't wait to see what happens next. Thanks so much for filling us in on mathematical optimization in this episode. Seriously, as I've said at least once already, I learned a ton. This is terra nova for me, new ground, and it's really exciting to have this screwdriver in my tool belt. Before I let you go, I always ask our guests for a book recommendation. Do you have one for us?
Jerry Yurchisin: 01:26:26
Yeah, and I'm going to cheat a little bit here and do two. Part of the reason I want to do two is, as you said, this is a super new topic to probably a lot of your listeners, so I do want to recommend something that's a little bit more textbooky. It's not as fun, but it's a great book for understanding the use cases for mathematical optimization and getting the basics down. It's called Model Building in Mathematical Programming by H. Paul Williams. What's different about this book is that it focuses on the use case, on the problem, so it's really easy to understand everything I was harping on this whole time: you have the verbal or written problem, then you have the algebra, and then you have the code. It doesn't focus so much on the code, but it has a lot of the first two, so it's a great resource, and I think it would be the best place for someone who wants a more instructional book to learn about mathematical optimization.
01:27:39
Now, the fun book that I really like is called Zero: The Biography of a Dangerous Idea, by Charles Seife. I had to make sure I wrote that down. It's very interesting because one of my favorite classes in undergrad was number theory. It's a bit like hitting a nail with a sledgehammer to make future high school math teachers learn number theory, but I loved it. It was awesome. Through that course, and also a history of mathematics course that I had to take, which was also super awesome, you learn how these number systems developed. The first people to use numbers didn't say, "Oh, well, we have 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Let's write them as these digits." It took thousands of years to get where we are with how we talk about numbers, and one of the interesting things is that zero did not exist for a long, long time. In the first several number systems, zero wasn't a thing. If I ask you to write zero in Roman numerals, you'll just say, "I can't." Exactly. They didn't have zero. The concept was at best vaguely there. So this book goes through the origins of zero: when it came about, who brought the idea around, the symbols they used, how they went about it, and how it spread.
01:29:23
It originated in India, I think, around the 5th century or so, and then it wasn't until around the 12th century that it became accepted in Europe. So it's an interesting tale about how people thought about zero from a philosophical perspective. That gets into the "dangerous" subtext: it was met with resistance and was a challenged idea because it also represented things like non-existence. When you think about it in the broader context of life, nothingness is like, "Oh, that's kind of sad. Hang on a second." So it's a very interesting book, because it hits on how this invention, and zero is an invention, it wasn't always there, had a real cultural and philosophical impact once it became accepted. Highly recommend it. It was a great read.
Jon Krohn: 01:30:37
Fascinating. Great book recommendations, Jerry. Thank you so much for that. If people want to get either your fun math insights or your mathematical optimization insights, what's the best way for them to follow you after this episode?
Jerry Yurchisin: 01:30:52
Sure. Yeah. I am on Threads @MathWithJerome, also on the platform formerly known as Twitter as @MathWithJerome, my website is mathwithjerome.com, and I have a GitHub repository under the same name. It's bare-bones right now, but if you want to see how I did the fantasy optimization, I have that posted there, along with a couple of other things. You can also reach out to me directly at my work email, yurchisin@gurobi.com. That's Y-U-R-C-H-I-S-I-N at gurobi.com. And to get into the stuff that Gurobi has put together, follow us at Gurobi; LinkedIn is probably the best way to get in touch with both of us as well. MathWithJerome on LinkedIn for me, and Gurobi for the company.
01:31:57
Once you're in there, I would say the best place to go website-wise is gurobi.com/learn. There's a whole bunch of resources there, a bunch of stuff I didn't get into today. Specifically, the Burrito Optimization Game is something I want to quickly plug, because it's a great way to understand the complexity of optimization and why it's such an awesome decision-making tool. There's also a competition for folks who just listened to this episode, which we mention in the ad reads and so on. If you go to gurobi.com/sds, you can get to all that as well; it's a competition to see how well you can optimize without an optimizer. So it's a game that you can't win, which is kind of sad, but you can see how close you can get to optimal.
Jon Krohn: 01:32:51
Nice. When people go to that url, gurobi.com/sds, that brings them to a leaderboard of just Super Data Science listeners, right?
Jerry Yurchisin: 01:33:04
You won't see a leaderboard right away. You have to register for the game and play in championship mode. Once you click through to the Burrito Game website, you'll see a championship mode sign; click on that, enter your alias, and in the top box, for the championship code, put in superdatascience, all lowercase, all one word. Then you can see the leaderboard once you click on the Super Data Science tag on the left-hand side. Actually, that's a really good idea: what I'm going to do is add instructions for that to the site, so you can really see what's going on. Again, it's a great way to understand and learn optimization. Once you get it a little bit and you need to explain it to someone, have them play this game, because it's fun and it really helps you understand the complexity.
Jon Krohn: 01:34:07
Nice. Awesome. Thank you so much, Jerry, for taking the time with us and opening our minds to the screwdriver. Yeah. It's been a fascinating episode, so maybe we can catch up with you again in the future.
Jerry Yurchisin: 01:34:20
Yeah, that would be great. It was awesome for me to be on, and I love the podcast and everything, and happy to be back on at some point.
Jon Krohn: 01:34:30
Nice. All right. Catch you soon, Jerry.
Jerry Yurchisin: 01:34:32
All right, thanks.
Jon Krohn: 01:34:38
Nice. I'm delighted to have a screwdriver in my data science tool belt now. In today's episode, Jerry filled us in on how, relative to machine learning and statistical modeling, mathematical optimization is more for taking action, things like decision intelligence. He also told us how the constraints in mathematical optimization guarantee real-world feasibility and how, unlike ML, mathematical optimization guarantees a globally optimal solution. He covered the three categories of ways that machine learning and mathematical optimization complement each other to be extra powerful, and how the Gurobi mathematical optimization solver can be called from all the programming languages that data scientists use most, including Python through the gurobipy library, as well as R, MATLAB, C, and more. And he left us with the tutorials and Jupyter notebooks he has put together, so that you can learn about mathematical optimization hands-on today.
01:35:29
As always, you can get all the show notes including the transcript for this episode, the video recording, any materials mentioned on the show, the URLs for Jerry's social media profiles, as well as my own at superdatascience.com/723. After recording today's episode with Jerry, I did play Gurobi's Burrito Optimization Game, so you can head to gurobi.com/sds to play it yourself as well, and set out burrito trucks all over a fun city with building names like Linear Regression Psychology Services, ReLU Realty, and the Multimodal Distribution Center. Your goal in this game over the course of five fictional days, each with different characteristics, different constraints I suppose, is to optimize your burrito truck locations to maximize your in-game profits.
01:36:16
I was able to earn $12,105 worth of profits in the game, and those are obviously not real dollars, but if you head to gurobi.com/sds, you can compete against me on the leaderboard. You can see if you can beat that total that I got to. You can see how well a perfect mathematical optimization would perform against you as well, and you can win real-life dollars. There's a couple of hundred bucks in Amazon gift certificates available to the top three Super Data Science listeners. Again, that's gurobi.com/sds.
01:36:53
All right, thanks to my colleagues at Nebula for supporting me while I create content like this Super Data Science episode for you, and thanks of course to Ivana, Mario, Natalie, Serg, Sylvia, Zara, and Kirill on the Super Data Science team for producing another wicked episode for us today. You can support this show by checking out our sponsor's links, by sharing, by reviewing, by subscribing, but most of all, just by continuing to tune in. I'm so grateful to have you listening and I hope I can continue to make episodes you love for years and years to come. Until next time, keep on rocking it out there and I'm looking forward to enjoying another round of the Super Data Science Podcast with you very soon.