Jon Krohn: 00:05
This is episode number 686 with Ruth Yakubu, Principal Cloud Advocate at Microsoft.
00:27
Welcome back to the SuperDataScience Podcast. Today I’m joined by Ruth Yakubu, a deep expert on open-source options for ensuring that we deploy AI models responsibly. Ruth has been a cloud expert at Microsoft for nearly seven years. For the past two, she’s been a Principal Cloud Advocate who specializes in AI at the multi-trillion-dollar tech giant. Previously, she worked as a software engineer and manager at the consulting giant Accenture, and she’s been a featured speaker at major global conferences like Web Summit. She studied computer science at the University of Minnesota.
01:02
In this episode, Ruth details the six principles that underlie whether a given AI model is responsible or not, and she details the open-source Responsible AI Toolbox that allows you to quickly assess how your model fares across a broad range of Responsible AI metrics. All right, let’s jump right into our conversation.
01:22
Ruth, welcome to the SuperDataScience Podcast. Where in the world are you calling in from?
Ruth Yakubu: 01:28
Jon, I am calling from Midtown Manhattan. So just in case you hear sirens in the background, just know that I’m in New York; that’s a normal sound for us.
Jon Krohn: 01:43
Nice. Yeah, for sure. I’m also in downtown Manhattan, so maybe we’ll hear the same siren go past your window and then, ten minutes later, go past mine. Although we’re filming in the middle of the day, so it would be more like an hour later.
Ruth Yakubu: 01:58
Exactly. Exactly.
Jon Krohn: 02:00
Nice. So it’s great to have you on the show. We met in person at the Open Data Science Conference, ODSC East in Boston, and we got to know each other over lunch. And I found out what you were talking about at ODSC East, and I thought it would make for a fascinating episode, so,
Ruth Yakubu: 02:20
Oh, thank you.
Jon Krohn: 02:20
Let’s dig. Yeah, let’s dig right into it. So what is Responsible AI, Ruth?
Ruth Yakubu: 02:28
That’s a good question. Funny enough, I used to wonder about this. Is it just a slogan? There are different terms that people use. There’s Ethical AI. I thought that was a different thing. There’s Responsible AI. But in the context of Microsoft, when we talk about Responsible AI, it’s more of a set of principles that we have to adhere to. So there are core areas. Number one, fairness. When you’re developing an AI solution, you need to ask yourself, is it fair to the people who are going to be using it? Is it inclusive? I used to wonder, okay, if it’s fair, then it’s inclusive. But one thing that we have to keep in mind is, let’s say, disabled people. There are about a billion disabled people out there. So when we’re creating our applications, do we take that demographic into consideration?
03:35
So those are areas that we have to think of. Are we thinking outside the box, about people we’re not including in our use cases, that sort of thing. Another area is data and privacy. This is not just, okay, I put a product, an AI system, out there, am I respecting people’s data and privacy? It also comes up when we’re actually building the model. Where did you get that data, Jon, to do your machine learning training? So it’s both sides that we have to take into consideration. Transparency, that’s a big thing now, because AI tends to be a black box. So being able to understand why it made a decision or why it made a mistake matters. There are also heavily regulated industries, so you need to explain, let’s say to a doctor, why you went about predicting that this patient has this type of diagnosis. Then there’s also accountability. This is the area that is on top of everybody’s mind when we create an AI solution: who’s responsible when things go wrong? So it’s more of a practice of being accountable for the AI systems that we create. So at Microsoft, we’ve created these six principles for us to practice within the company, and that’s why we call it Responsible AI.
Jon Krohn: 05:22
Nice. I gotcha. So I was able to jot down at least five: fairness, inclusivity, data and privacy-
Ruth Yakubu: 05:30
And reliability, I forgot reliability.
Jon Krohn: 05:32
Reliability.
Ruth Yakubu: 05:34
Reliability and safety.
Jon Krohn: 05:35
Got it, got it. So fairness, inclusiveness, data and privacy. And it’s interesting that you mentioned that data point. You highlighted it yourself already, but when I think about data privacy, I’m thinking about making sure that our users’ details are safe, that no one’s getting their credit card numbers, that kind of thing.
Ruth Yakubu: 05:55
Yeah.
Jon Krohn: 05:55
Or in the case of Microsoft, you guys have lots of, you know, people’s searches, people’s emails, and obviously a huge number of corporate clients, probably one of the largest numbers of corporate clients worldwide. So obviously there are probably thousands of people at Microsoft working on keeping that user data safe. But it didn’t occur to me, this other piece about where the data comes from. So the provenance of the data, obviously that can also be sensitive. We see a lot of this these days, for example, with the large models that are creating images, we see things like the stamp from the provider of the images. So these data are potentially coming from copyrighted sources.
Ruth Yakubu: 06:42
Exactly.
Jon Krohn: 06:42
Which is where, yeah, we run into issues. So it’s very cool that you highlighted that. Explainability, accountability, and then reliability and safety. I should give you a chance to explain that one now too.
Ruth Yakubu: 06:50
So reliability, that one too, I was like, look, is it reliable or not? That seems pretty straightforward. But think of, let’s say, building a smart car. There are so many use cases that we need to think of. How does it react when it’s rainy? How does it react when it’s around pets or children? How does it react at night? So we have to think of outliers, use cases that are way, way out there. The normal use cases versus the outliers, that’s where reliability comes in, because we’re thinking of everything possible, and the safety of it as well. So I’ll say a smart car is the best analogy I can think of in that category.
Jon Krohn: 07:53
That makes sense to me. That’s a good example. So, cool. So we’ve got the Responsible AI principles, and I understand that, because this is such a centrally important issue to Microsoft, a large team at Microsoft has been working on open-sourcing something called the Responsible AI Toolbox. And this is such an important initiative that it isn’t just Microsoft developers contributing to the project. My understanding is that it’s also open to anybody; whether you’re a Microsoft employee or not, you can contribute to this open-source ecosystem.
Ruth Yakubu: 08:27
Yeah, that’s correct. And for me, I took an interest and a passion in Responsible AI in the past year, because for the longest time, I feel like we used to hear theory, theory, theory, about how all of these areas are important when we’re implementing our solutions. But when you come from a developer or implementation background, or you’re a data scientist, you think of practicality: how can I apply this in the real world? Since this was a huge gap that the data science and AI community needed filled, one of the things organizations did was start building solutions around Responsible AI. One example is a project called Fairlearn: given your model, it can identify areas where your model may not be fair. So if you have sensitive features like age, gender, or ethnicity, your model could potentially have, say, an age bias or a racial bias, that sort of thing.
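(As a minimal sketch of the kind of disaggregated fairness check Ruth describes, here is what a Fairlearn-based assessment could look like; the fitted model, test data, and the “gender” sensitive-feature column are hypothetical placeholders, not anything from the conversation.)

```python
# Minimal Fairlearn sketch: compare model behavior across a sensitive feature.
# Assumes a fitted scikit-learn-style `model` plus X_test / y_test (hypothetical).
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import accuracy_score

metric_frame = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_test,
    y_pred=model.predict(X_test),
    sensitive_features=X_test["gender"],  # hypothetical sensitive column
)

print(metric_frame.by_group)      # per-group accuracy and selection rate
print(metric_frame.difference())  # largest gap between groups, a simple bias signal
```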
09:53
There’s another one called InterpretML. I’m sure a lot of people have worked with SHAP and LIME when it comes to model explainability and interpretability. As you all know, doing all of that from scratch in a notebook is very time-consuming. So these projects have gone ahead and implemented solutions for it. Another one is Error Analysis. They’ve put together a solution where, given your model, even if it says it’s 90% accurate, realistically there are pockets and demographics within your data where your model may be just 20% accurate. So it exposes areas like that. These are organizations that created solutions, and there are also academic institutions that created solutions. Then, kind of like how you mentioned before, Microsoft Research teams have also implemented solutions, because this is an area where the industry, at least the data science community, found that, “Hey, we are lacking. We don’t have any solution that will help developers and ML professionals debug their machine learning models.”
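(For the explainability side Ruth mentions, here is a small sketch using the open-source interpret package’s glassbox model; the training and test data are hypothetical, and this is just one of several ways this kind of interpretability can be produced.)

```python
# Minimal InterpretML sketch: global and local explanations for a glassbox model.
# X_train / y_train / X_test / y_test are assumed to exist (hypothetical data).
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

ebm = ExplainableBoostingClassifier()
ebm.fit(X_train, y_train)

show(ebm.explain_global())                       # which features matter overall
show(ebm.explain_local(X_test[:5], y_test[:5]))  # why these specific rows were scored this way
```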
11:20
So all of this was put together, and what Microsoft has done is package everything into one holistic package for, let’s say, Python users. You’ll be able to call these libraries, and it’s an open-source project and solution. So, like you mentioned, you and I and the public can utilize it for free. We want to make it available to everybody, but also have people contribute to it. So that’s the Responsible AI Toolbox. You’re also going to hear of the Responsible AI Dashboard. It’s the same functionality, but it’s a new feature, announced last year, where all of this is actually integrated into the Azure Machine Learning studio. Data scientists or ML engineers are already doing their e-to-e, or end-to-end, machine learning lifecycle processes there, so why not put everything into one place?
Jon Krohn: 12:40
Nice. So the Responsible AI Dashboard has been integrated into the Azure ML Studio, and that offers people who are already using the Microsoft Azure cloud ecosystem for their machine learning, their end-to-end development and deployment of models, the chance to take advantage of the dashboard. But all of that functionality is also available in the RAI Toolbox, which is open-source.
Ruth Yakubu: 13:03
Yeah, exactly. So literally, when you have your model and your dataset, whether you plug it into the Azure studio or run the insights with the open-source version, you’re going to get the same insights.
Jon Krohn: 13:21
Nice. And so to dig into some of those insight capabilities for our listeners a little bit more: one thing that was interesting to me is that as you were describing this Responsible AI Toolbox, you were talking about these projects, like one group did this and a couple of groups were doing that. So it’s interesting to hear that it sounds like teams that specialize in these particular kinds of things, like teams that specialize in interpretability, collaborated specifically with Microsoft on this Responsible AI Toolbox to provide best-in-class functionality for that particular feature.
Ruth Yakubu: 13:57
Yeah. So it’s funny, one of those projects is Error Analysis, and that’s the very first component that you see in the dashboard. So whether you use the library directly or you come to the Responsible AI Dashboard, the user interface is pretty much the same. And when it comes to model interpretability, some of those use cases are pretty much the same as InterpretML.
Jon Krohn: 14:33
Nice. Cool. So when you say the error analysis dashboard there, it’s interesting, because you mentioned how the Responsible AI Dashboard, with a capital D, is the thing that’s integrated into Azure ML, but you could be in the open-source RAI Toolbox and go into the error analysis dashboard with a lowercase d. So it’s a dashboard for doing error analysis, which allows us to better interpret models and identify situations where the model isn’t treating groups equally, for example. So you could identify situations where a sensitive group has a different error distribution than other groups. And this allows you to diagnose, visually it looks like, what’s happening with either your data or your model in order to fix that issue. So maybe you realize, oh look, for this sensitive group we’re actually collecting data in a different way, or we don’t have enough data, or something like that, and you can patch up the issue.
Ruth Yakubu: 15:41
Yeah, exactly. Kind of like how you mentioned, it makes it very easy for you to identify where the problems are and also quickly mitigate them. One of the features that I really enjoy is the data analysis section, because I feel like data is a very big blind spot. As you identify all these demographics that could potentially have issues in your model, even though when you ran the analysis the accuracy score was 97% and you were like, “Hey, we’re ready to ship this out,” you come to Error Analysis and it finds all these errors. All of these different components build on each other. So you can create cohorts of where those issues are occurring to do further investigation.
16:50
So error analysis is just for you to identify where the issues are. When you go into model overview, with the different metrics that we utilize, it’s hard to pick just one and say, I’m only going to look at accuracy and not recall or precision. So it gives you a holistic way of looking at your model and seeing where the disparities are coming from. And one interesting thing is that your model may be performing very well in one cohort of data that you create, but very poorly in another cohort that you create. So the question is, what is so special about one group versus another? Those are things you can do in model overview. But with data analysis, let’s say you work on a loan application process and you have a whole bunch of applicants submitting their applications.
17:57
You can look at the data distribution, because you can see, okay, where do I have overrepresentation in my data and underrepresentation in my data? Because if you have overrepresentation, chances are your model is going to favor the biggest population. So if I have more, let’s say, married women versus single moms, maybe a single mom has a very good, stable job and a very good credit score, but why is she being denied a loan versus a mother in a two-parent household? So it can show you disparities, and show whether we are treating a population fairly or whether, based on historical data, we are singling them out, that sort of thing. It can expose a lot of things about your data, and cumulatively, with all the different components in the dashboard, I feel like it’s the buildup of a story. At the end of the day, issues that you find in one component can validate and confirm other issues, or even expose more issues. So by the end, the developer has a holistic view: “Oh, so this is how my model is behaving. These are areas where it performs well, and these are areas where it’s very erroneous.”
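(As a rough sketch of the over- and under-representation check Ruth is describing, assuming a hypothetical pandas DataFrame of loan applications with made-up “marital_status” and “approved” columns:)

```python
# Minimal data-distribution sketch: representation and outcome rates per group.
import pandas as pd

applications = pd.read_csv("loan_applications.csv")  # hypothetical file

# Share of applicants in each group; overrepresented groups tend to dominate training.
print(applications["marital_status"].value_counts(normalize=True))

# Approval rate per group; large gaps are worth investigating as potential bias.
print(applications.groupby("marital_status")["approved"].mean())
```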
Jon Krohn: 19:48
Nice. Gotcha. So within the Responsible AI Toolbox, you have these different areas, these different kinds of dashboards, like the error analysis dashboard, which we already talked about in some detail. There’s the interpretability dashboard, which allows you to use techniques like SHAP and LIME, which are well-known techniques for model interpretability. But this interpretability dashboard makes it easier to get going, and you can compare model interpretability alongside these other kinds of diagnostics, like error analysis. So we’ve got the error analysis dashboard and the interpretability dashboard. There’s also a fairness dashboard, which I don’t think we’ve talked about in too much detail yet.
Ruth Yakubu: 20:34
No, the fairness is kind of incorporated into all of the different components that you’re using. So when we talk about the dashboard, everything is just one dashboard that you’re going down through. But fairness assessment-
Jon Krohn: 20:52
I see. So it’s different areas where we can visualize these different aspects in the same place, so you have one pane of glass where you can see all of these kinds of metrics together. So the Responsible AI dashboard, lowercase d, in the Responsible AI Toolbox has all of these different visual widgets, for error analysis, for interpretability, for fairness. And I think this is what you were giving us a sense of earlier, how this Responsible AI Dashboard ties everything together into one place.
Ruth Yakubu: 21:38
Yeah. You’re absolutely right, because it’s actually components. As you build your model, you may say, okay, when I’m creating the dashboard, I don’t want to look at model overview, I just want to focus on data analysis and feature importance, which does the explainability. So you can pick and choose which components you want displayed on that dashboard. They’re easily interchangeable: you can remove some, you can add some.
Jon Krohn: 22:20
Nice, very cool. So you can pick which widgets you want to have in the view. That makes sense. And then, in terms of the nuts and bolts of using this tool, you mentioned that it’s available in Python, and of course it’s open-source, as we’ve mentioned a number of times. But how does somebody actually use this? Where would it come in as part of a data scientist’s flow? Do they import it as a Python package as they’re getting started on modeling? Just explain a bit how we use this as part of our flow as data scientists creating a model. And then, is it a separate browser tab that opens up on our machine, or where does it show up?
Ruth Yakubu: 23:06
Actually, that’s a very good question, because at first I was like, okay, where does it come into the picture? I’m used to training my model, so how does it get to this dashboard, how do I insert it? But you’re right: everything that you’re used to doing stays the same. You cleanse your data, you have your test data, you have your training data, you train your model, you do everything. The next thing is literally one line. It’s like a constructor, and that’s where you’re telling it: here’s my model, here’s my data that you should utilize to evaluate the model, and what type of model am I using, is it a classification use case or regression? That’s it. The next thing is when you go shopping: what components do I want to include? I want to look at error analysis, give me the whole works.
24:02
So error analysis, feature importance, data analysis, model overview. We haven’t talked about counterfactuals, but counterfactuals are actually very important as well, and then causal analysis too. Those are different components that you can add, and all of those are just a few lines of code; you’re just adding to the dashboard object. Once you’re done, you just say run the analysis, and it gathers all the insights from your model, given the test data that you provided. And at the end, it displays a URL. You click on it, and that’s when it opens in the browser as a very nice, easy, interactive dashboard.
25:09
And realistically, you only import two libraries. One is the Responsible AI package, and there’s the RAI widgets one, for the display side of things. This is kind of good, because I mentioned all the different projects that have contributed to it, plus Microsoft Research, and all of these come with different libraries. Can you imagine if you were installing each one on its own? That whole install statement would be crazy. And if you were doing the analysis separately, you’d probably do one analysis, save that notebook, then open another notebook to do another analysis, without realizing that the insights and things you’re discovering are probably all interrelated. So that’s why putting everything in one dashboard and having one holistic view and story of the state of your model and how it’s behaving is very useful for making a data scientist more productive.
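(Putting the workflow Ruth just walked through into code, here is a hedged sketch based on the Responsible AI Toolbox’s responsibleai and raiwidgets packages; the model, DataFrames, target column, and treatment feature are placeholders, and exact arguments may differ slightly depending on your installed version.)

```python
# Sketch of the Responsible AI Toolbox flow: constructor, add components, compute, display.
from responsibleai import RAIInsights
from raiwidgets import ResponsibleAIDashboard

# The "constructor" step: trained model, train/test DataFrames, target column, task type.
rai_insights = RAIInsights(
    model=model,                     # hypothetical fitted model
    train=train_df,                  # hypothetical training DataFrame
    test=test_df,                    # hypothetical test DataFrame
    target_column="loan_approved",   # hypothetical target column
    task_type="classification",
)

# "Go shopping" for components: add only the analyses you want in the dashboard.
rai_insights.explainer.add()         # feature importance / explainability
rai_insights.error_analysis.add()    # error analysis
rai_insights.counterfactual.add(total_CFs=10, desired_class="opposite")
rai_insights.causal.add(treatment_features=["income"])  # hypothetical treatment feature

# Compute all the insights, then launch the interactive dashboard (serves a local URL).
rai_insights.compute()
ResponsibleAIDashboard(rai_insights)
```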
Jon Krohn: 26:28
Sounds perfect. I think I now have my head wrapped around why this is such a value-add, why this is such a straightforward option for any data scientist to be working with, because all of us who are deploying models into production should be mindful of some or maybe all of these Responsible AI issues that you’ve outlined: things like fairness, inclusiveness, data and privacy, explainability, accountability, and reliability and safety. With this Responsible AI Toolbox, we have one install that we need to do, and then we get one dashboard for our model where we can monitor all of these things, do error analysis, interpretability, and model fairness, all, as you said, under one roof, with one install and one holistic view and story.
Ruth Yakubu: 27:15
You summed that up just perfectly. It’s like you’ve used it before.
Jon Krohn: 27:22
I think I’m going to have to. Now I’m going to be recommending it to my team for sure; this seems like a no-brainer to be using. And while we don’t have time today to dig into counterfactuals and how those relate to causality, listeners who are interested can check out episode number 607 with Jennifer Hill, which was released half a year ago and is focused entirely on causality with counterfactuals. And the Responsible AI Toolbox documentation also has information on counterfactuals that you can get right there in the tool, as well as on all of these other topics that we discussed today. I can see that there’s more than we even had time to cover, all of which is, unsurprisingly for such a tightly run ship with as many resources as Microsoft has, very nicely put together and well-documented. So definitely check it out; I’ll have a link to the GitHub repository in the show notes. Ruth, thank you so much for coming on the show and enlightening me and our listeners about this awesome Responsible AI Toolbox. If people want to follow you after the show and hear more about this tool or any other insights that you might have about the industry, how should they follow you?
Ruth Yakubu: 28:45
So my Twitter handle is Ruthie, so R U T H I E, Yakubu, Y A K U B U. And that’s how you can find me on Twitter. And thank you so much Jon, for having me on this show. It’s a pleasure.
Jon Krohn: 29:04
Nice. Thanks to Ruth for coming on and providing us with such practical guidance. In today’s episode, she covered the six Responsible AI principles that Microsoft adheres to and that we ourselves can use as a template, namely fairness, inclusiveness, data and privacy (including data provenance), explainability, accountability, and reliability and safety. She then filled us in on how you can quickly import the open-source Responsible AI Toolbox into your Python code to get a holistic visual dashboard that allows you to evaluate how responsible your model is before deploying it into production. All right, that’s it for today’s episode. Until next time, keep on rockin’ it out there, folks, and I’m looking forward to enjoying another round of the SuperDataScience Podcast with you very soon.