051-Methods for Designing Ethical, Human-Centered AI with Undock Head of Machine Learning, Chenda Bunkasem

Experiencing Data with Brian T. O'Neill

November 03, 2020 | 00:29:54

Show Notes

Chenda Bunkasem is head of machine learning at Undock, where she is focusing on using quantitative methods to influence ethical design. In this episode of Experiencing Data, Chenda and I explore her actual methods for designing ethical AI solutions, as well as how she works with UX and product teams on ML solutions.

We covered:

Quotes From Today’s Episode 

“There's places where machine learning should be used and places where it doesn't necessarily have to be.” - Chenda

“The more interpretability, the better off you always are.” - Chenda

“The most advanced AI doesn't always have to be implemented. People usually skip past this, and they're looking for the best transformer or the most complex neural network. It's not the case. It’s about whether or not the product sticks and the product works alongside the user to aid whatever their endeavor is, or whatever the purpose of that product is. It can be very minimalist in that sense.” - Chenda 

“First we bring domain experts together, and then we analyze the use case at hand, and whatever goes in the middle — the meat, between that — is usually decided through many iterations after meetings, and then after going out and doing some sort of user testing, or user research, coming back, etc.” - Chenda, explaining the Delphi method.

“First you're taking answers on someone's ethical pillars or a company's ethical pillars based off of their intuition, and then you're finding how that solution can work in a more engineering or systems-design fashion.” - Chenda

“I'm kind of very curious about this area of prototyping, and figuring out how fast can we learn something about what the problem space is, and what is needed, prior to doing too much implementation work that we or the business don't want to rewind and throw out.” - Brian

“There are a lot of data projects that get created that end up not getting used at all.”- Brian

Links

Undock website

Chenda's personal website

Substack

Twitter

Instagram

Connect with Chenda on LinkedIn

Transcript

Brian: Hi, everyone. Welcome back to Experiencing Data. This is Brian O'Neill, and today I have Chenda Bunkasem on the line, an AI research scientist in question, right? You're not quite sure, is that what I just heard?

Chenda: [laugh]. Yeah, there's debate within the scientific community about titles. So, you know, you always have to be skeptical.

Brian: Exactly. So, maybe we could jump into whether or not you're a scientist, and what the heck you're doing with machine learning and AI. I saw Chenda on a webinar that was about ethics in this area, and as listeners to the show will know, I tend to think about ethics as kind of Human-Centered Design at scale when we're kind of looking beyond just the end-user or our immediate stakeholders, and we're kind of looking to the outer circles of how the technology affects other people. So, Chenda, tell us a little bit about you, your background, whether or not you think you're a scientist, and what you're doing these days.

Chenda: Yeah, yeah. So, I've been conducting machine learning, artificial intelligence research for quite some time now. I'd say about three years. I happened to accidentally write my computer science dissertation on AI, with just the intention to make a really cool video game script, which is awesome.

And it led me into conducting machine learning research on the Google Stadia project, which is a cloud-based gaming platform, and then eventually at an AI startup at the World Trade Center, focusing on simulations and synthetic data, which I'll go more into later on. So, all of this has just kind of accumulated into what I'm working on now, which is using a lot of these applied strategies for data-driven ethics and data-driven algorithms with very many different applications.

Brian: Tell me about quantitative ethics. I think that's what I first wrote down when I was thinking about talking to you and having you come on the show. So, can you dig into what this is? And I'm also curious to see how you might relate this to your early user experience work that you did as well. We talk a lot about qualitative research when we do user experience design, so I'm curious about these two things: quantitative ethics and the research there, and then also the qualitative piece. Can you talk about that a little bit?

Chenda: Right. So, a lot of the time, especially with regards to ethics and data, there's often an approach that's very, very abstract; it's qualitative, as you said before. And while I was on that panel at Hypergiant, I spoke about using quantitative methods to come up with better decisions for how we ethically design systems. And it's funny because in a way, you're outputting very data-driven decisions with these systems, and it would only make sense intuitively that you'd make data-driven decisions in how they're designed. And so this was the seed for how I started to think about more ethical artificial intelligence research in both the UX sense and in a deeper technical sense.

Brian: So, what's different about your approach? Or how do you go about doing it ethically versus not? Walk us through the process of doing a project, or working on a product, or models, et cetera, et cetera: how do you approach the ethics piece here?

Chenda: So, there are a couple of ways we do this. One of them is to definitely align any sort of research and development along a scale that we call a ‘technology readiness level.’ NASA actually uses it. But what's interesting is that only recently has there emerged what we call a TRL—technology readiness level—label for machine learning systems, and so there's a lot of space for people to come in and add pieces where it might be better to make a decision on TRL seven versus nine and design a system that takes the use case into account.

So, TRL stands for technology readiness level, and labs such as NASA use it—I’m sure SpaceX uses it as well—and it's the meter by which we as researchers can determine whether or not something in the lab or something within research and development is ready to be productionized, or what stage it's at. And oftentimes, it's here that these ideas need to already be honed. They need to be honed and sharpened in a sense where, is what we're researching ethical?

Brian: So, how does one go about getting a score? For example, what's the process of that? And I'm curious, is there any involvement with the people that are in the loop that are going to use this or be affected by it? Is there some type of research activity or something that goes into scoring these things?

Chenda: Right, definitely. So, that's a great question. I actually introduced something called the Delphi method during that panel with Hypergiant, and I find it very interesting because there are very many different sort of combinatorial approaches to decision-making where you either have a blind vote, or you have people contributing with—as we said before—quantitative data. So, exactly how an algorithm works, the exact law or the exact policy around, let's say, user privacy rights. 

It's about designing communities and curating discussions with domain experts who understand the inner workings of their fields, and then these people coming together to help seed how you would create these systems, let's say, taking the use case into account from the get-go. And it's extremely complex. The quantitative feature comes from the fact that the method itself, the Delphi method, isn't just discussion; it's not just debate.

Brian: What else do you guys go into with that?

Chenda: So, what we'll do, and it's interesting because I'm actually taking up a research project in 2021, to submit to an AI ethics conference in the spring on this topic, and so, we're actually still very, very early in its stages, so I'm kind of disclosing or describing how we're starting to form this. But first, we bring domain experts together, and then we analyze the use case at hand, and whatever goes in the middle, the meat between that, is usually decided through very, very many iterations, after meetings, and then after going out and doing some sort of user testing, or user research, coming back, et cetera, et cetera.
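
To make the quantitative side of the Delphi method Chenda describes a bit more concrete, here is a minimal Python sketch of just the scoring step: anonymous expert ratings are collected in rounds and summarized until the group converges. The criterion, the 1-9 scale, and the convergence threshold are illustrative assumptions, not a description of her actual tooling.

```python
import numpy as np

def delphi_round(scores):
    """Summarize one round of anonymous expert ratings on a 1-9 scale.

    Returns the group median and interquartile range; a small IQR is a
    common stopping signal for consensus in Delphi-style processes."""
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    iqr = np.percentile(scores, 75) - np.percentile(scores, 25)
    return median, iqr

# Hypothetical ratings for one ethical criterion ("user privacy risk")
# from five domain experts over two rounds.
round_1 = [3, 7, 8, 5, 6]
round_2 = [6, 7, 7, 6, 6]  # experts revise after seeing round-1 feedback

for i, ratings in enumerate([round_1, round_2], start=1):
    median, iqr = delphi_round(ratings)
    print(f"Round {i}: median={median}, IQR={iqr}")
    if iqr <= 1.0:  # convergence threshold chosen for illustration
        print("Consensus reached; stop iterating.")
        break
```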

Brian: Mm-hm. Can you talk to me about what a test looks like? Like a user test? Or that field research looks like? What do you actually do, like, step by step with somebody?

Chenda: So, the first thing is, whatever client that we decided to work with, or at least in this sense, the company that is conducting this research, we actually start with their ethical pillars first. So, we start with an abstract concept because more often than not, you want to take the inclination you have first on whether or not a use case is this or that, from your feelings. And this is where it gets interesting because, first you're taking answers on someone's ethical pillars or a company's ethical pillars based off of their intuition, and then you're finding how that solution can work in a more engineering or systems design fashion. 

Then you have your solution to how you can approach this problem ethically, or at least how the client ethically wants it, and when you go out and do your user testing, you're usually making sure that the way that the user is interacting with the product—or the end goal—one, has a human in the loop, which AI researchers talk about very often; two, allows the user to understand the systems that they're working with; and the other thing is that it ensures user privacy.

Brian: So, this human-in-the-loop concept is something I talk about on this show all the time. This is kind of a foundational element of designing effective decision support applications and things of this nature. So, can you give an example of how… like, how would you take feedback from one of these sessions and make a change? Like, if we saw—or maybe you've done this before, and you can talk about an adjustment that you might have made to the system based on getting this feedback from the field. So, whether it's interpretability, or some other—or privacy or something like that, do you have an anecdote you could recount to kind of make it concrete?

Chenda: One thing I would definitely add is that the most advanced AI doesn't always have to be implemented. And people usually—they skip past this, and they're looking for the best transformer or the most complex neural network. It's not the case: it's about whether or not the product sticks and the product works alongside the user to aid whatever their endeavor is, or whatever the purpose of that product is. It can be very minimalist in that sense.

Brian: Sure, sure. No, I a hundred percent agree with that. So, do you have an example where maybe the data science approach, the implementation approach for something, changed? Perhaps it changed from a more complex model that was more black box, and moved to one that was perhaps less accurate but had some interpretability because the stickiness factor was there. Is that the type of thing that you would hope to get from this kind of research, or is that information coming too late to be useful?

That would be one thing I would think of if I put my—a lot of the audience for this show tend to be software leaders, technology, data leaders, et cetera, and I'm guessing many of them wouldn't want to find out that late in the process that, oh, we have to do all this rework of the modeling and the engineering because we found out no one will use this if they don't understand how the system came up with this recommendation or whatever. Can you unpack that a little bit for me, your perspective on that?

Chenda: Yeah, really, really good question, actually. So, I described this at the panel as well, but you want to design dynamic systems, and especially within machine learning production-ready systems—there's a difference between this concept of static systems and dynamic systems because models have to also retrain themselves if they want to optimize, let's say, object recognition or whatever the outcome is for that model. You definitely want there to be a bit of a loop as you were speaking about before, a feedback loop. And this can happen early on in the process because you don't want to wait too long even before it gets to the user to see this. 

And you want systems that have features of automation as well—if we're talking about AI systems—automation features that allow for, let's say, the tuning of a hyperparameter. So, there are systems designed like this that will iterate on themselves and then be optimal for the user.
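
As a rough sketch of the kind of automated tuning loop described here, the example below uses scikit-learn's grid search to choose a hyperparameter without a human picking each value. The dataset, model, and parameter grid are stand-ins assumed for illustration; they are not Undock's actual setup.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Placeholder dataset standing in for whatever data a product actually collects.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Automated hyperparameter search: the tuning loop runs without a human
# choosing each value, though a human still reviews the result before shipping.
search = GridSearchCV(
    LogisticRegression(max_iter=2000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

print("Best C:", search.best_params_["C"])
print("Held-out accuracy:", search.best_estimator_.score(X_test, y_test))
```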

Brian: One of the interesting things for me here in this particular question was more around the feedback loop between whether or not the solution is actually, as you call it sticking, like, is the product going to stick and get used for the desired use case? Is this something where you feel like you have to go through the process of actually creating the full automation, you have to pick a model and push that out before you can get that feedback? Or is it possible—and part of the reason I'm asking this is because I know that it sounded like you had some work in the BitChat, you did some user experience research. I'm kind of very curious about this area of prototyping, and figuring out how fast can we learn something about what the problem space is, what is needed, prior to doing too much implementation work that we don't want to rewind, or that the business doesn't necessarily want to rewind and throw out? Do you think it's possible to get informed about which models and which technology approaches we may want to use with machine learning earlier, or do we have to get into some kind of working prototype before we can establish that?

Chenda: That's a really, really good question. The first thing I would lead my answer with on that is: the more interpretability, the better off you always are. If there is a definitive method right now to determine whether or not you should or shouldn't prototype a specific type of model, I'd be very curious to explore it. But the higher the interpretability, the better, and more data is always better.

Brian: Always. I mean, I guess I can think of places where you may not care about exactly knowing how, like a low-risk scenario, a recommender or something like that, so I can see people playing the other side of that argument. I mean, I would generally agree with you that it feels like if the human-in-the-loop is a heavy factor in whether or not the solution is going to be adopted and used, then the right amount of control from their perspective is really important, the right amount of transparency, et cetera. So, do you think it's possible to tease that out early in the technology-building part? Or do you just play it safe and say, “You know what? I'm not going to go—we're not going to use any of these techniques.” It's just more of a gut read. Like, “This will not work; we have enough gut read on the situation from our customers to know that a black box implementation is going to be a showstopper,” so to speak.

Chenda: So, you usually want to know; you don't want to go with intuition in this sense. As I said before, when there's more interpretability, it's easier to determine those things, especially with regards to that human-in-the-loop feature. And as we were talking about before, if you have more data to play around with, that's more data you can split into training, testing, and validation phases with your machine learning, right? And you can switch that data out, and you can also design systems solely for the sorting and rearranging of the data required to make the best model as well. And these are all preemptive kinds of systems design steps that one can take.
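
For readers less familiar with the training/testing/validation split Chenda references, here is a minimal sketch of one common way to carve data into the three phases; the toy data and the 60/20/20 proportions are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy feature matrix and labels standing in for real product data.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 10))
y = rng.integers(0, 2, size=1000)

# First carve off a held-out test set, then split the rest into train/validation.
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=0  # 0.25 of 80% = 20% of the original
)

print(len(X_train), len(X_val), len(X_test))  # 600 / 200 / 200 samples
```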

Brian: Mm-hm. So, it sounds like you're at a new startup, correct? You're working at a tech startup building some kind of software SaaS or some kind of software product. Is that correct?

Chenda: Yeah, definitely.

Brian: Cool. So, I definitely want to hear about that. I'm curious, how involved are you in that kind of design phase, so to speak, the product itself, the interfacing with customers? Do you work tightly with the product management and your product designers, or are you staying a step away from that? What's that relationship like, and how do you guys work it?

Chenda: So, it's really interesting. At this company, Undock, I work very, very closely actually, with the product teams and the marketing teams because, you know, machine learning for product management is very, very particular. And it's very, very interesting because, as we're kind of saying before, this preemptive systems design needs to be noted very, very early on from the get-go. And how a person is experiencing the product is very, very relative to what models are chosen and how the models are architected.

Brian: How do you inform those technology choices?

Chenda: So, I really like this question because the user journey—or the user experience, if you want to refer to it that way—is actually one of the few pieces, especially with Undock, that has to very, very closely align with not just the model architecture, but the data generation and the data collection. We're actually generating data as well, synthetic data, for a lot of this model training, and the reason is because the user experience, or the product experience that we would like the users to feel, needs to be seamless. And so if you're working in a very, very close loop with people who are designing the user journey and designing how the product looks, you have a better idea of what data needs to be collected where and when to give the person that experience, especially with regards to machine learning.
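
Since synthetic data generation comes up here, the snippet below is a minimal sketch of what generating a labeled synthetic dataset can look like using scikit-learn's built-in generator. The sample size, class balance, and noise level are assumptions for illustration, not a description of Undock's pipeline.

```python
from sklearn.datasets import make_classification

# Generate a labeled synthetic dataset; the class balance, feature count,
# and label noise here are illustrative choices only.
X_synth, y_synth = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=8,
    weights=[0.7, 0.3],   # deliberately imbalanced classes
    flip_y=0.02,          # small amount of label noise
    random_state=7,
)

print(X_synth.shape, y_synth.mean())  # feature matrix shape and positive-class rate
```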

Brian: Mm-hm. Was there a particular anecdote or something you might recount from your journey so far with Undock where you're like, “I'm glad I heard this now.” [laugh]. Or something along those lines, or, “Wow, this was really informative to my work or prevents us going down the wrong path.” Is there anything like that that came up so far?

Chenda: [laugh]. Actually, it's funny; kind of. So, Undock has this, we have this phrase that we use, which is time travel, right? We are an AI-powered smart scheduling application that is sort of all-encompassing, with being able to automatically detect whether or not you're free during a certain timeframe, and free with the rest of your coworkers as well. It really, really is kind of revolutionizing how you think about scheduling, especially with regards to remote work because all of your colleagues are now in different time zones. 

And there are so many pieces to determining whether or not—it's almost like having an EA within your laptop, you know? It's almost as if every individual person has one of those as well, and the different pieces with a user journey like that, you got to think to yourself that you're getting smart suggestions all the time and the data that has to be collected and aggregated for the optimal experience that is seamless within how you communicate with your coworkers, and how you’re communicating with the application is quite extensive, actually; it's very complex.

Brian: Tell me about this Undock. What I'm picturing here is it's scheduling facilitation. I use Calendly extensively in my own work, so I'm always curious about reducing time spent on this kind of stuff. So, how is it different than just booking a room and your people, especially if this is inside the corporate walls where everyone's on Exchange or whatever, and you know when everyone's available, et cetera? Is this intended for cross-company where you don't have that visibility into people's schedules? Is that the difference or—like, how does it work?

Chenda: That's actually a good question. Um, you don't actually have complete visibility into your coworkers' schedules, which is quite different from how you might see Google Calendar. But that's what's nice about it, because the more personal features of what your schedule looks like, and how you're planning your days, are actually occluded, while you can still really, really easily and seamlessly figure out when's the best time to meet and when's the best time to talk.

You can schedule meetings in person, you can schedule meetings face-to-face as well, and there's a feature to it that actually makes it kind of social: you have a profile, you can state whether or not you're online, and it's really blending a lot of features from—there's an application called Discord, which a lot of remote workers are using these days, and Calendly, Slack, too. As I said before, we have this thing where we're talking about time travel or controlling time, where you don't actually have to know all the details to get the best outcome.

Brian: Sure, sure. So, what do you want to change about how we develop machine-learning-driven products and services? Is there even a problem, I guess, is a good question.

Chenda: I don't think there's a problem per se. I think we're very clear now that there's places where machine learning should be used and places where it doesn't necessarily have to be. There's obviously that very stark difference between machine learning engineering—which is really software engineering using elements of machine learning—and machine learning research and development, which is massively different. It has a place in science that is very abstract. 

It's all-encompassing, you can use machine learning in your research in biology, you can use machine learning in research in physics, the list goes on, and on, and on. So, it's broad, but it's not. And it's applied in engineering, but it's also applied in research. And it's very all-encompassing, so what I'm really curious to see is just how it changes in the years to come. But I don't think there's anything that should be changed specifically.

Brian: Mm-hm. There's a lot of solutions that end up not—I use “solutions” in quotes—there's a lot of projects that get created that end up not getting used at all. I'm not sure the large majority of people are necessarily thinking about the last mile and kind of that, as you called it, the stickiness of the product, or the solution, or whatever it's going to be, and kind of working backwards from that, and I'm just curious, do you think something needs to change there to get a higher success rate here? And I'm speaking specifically about not the technical blockers: not, like, the data sets are too small, or we can't train the model, or the accuracy is slightly too low for various technical reasons and whatnot. More the human piece, right? It doesn't solve a problem that exists, or it solves it in the wrong way, or it's too complicated or hard to use. More of these kind of last-mile user experience problems. Do you agree that that's a challenge right now, and if so, is there something that could be done differently?

Chenda: I do agree that it's a challenge. I'm sure you've heard many people saying that machine learning is no place to move fast and break things. And it's true. There's this sort of race to innovate. And the question is always, for why?

Brian: Yeah. [laugh]. Throw machine learning at everything.

Chenda: [laugh].

Brian: Yes. Oh, I know.

Chenda: And then the question is, for why, and who's affected? Of course, there's facial recognition, and I think it's a perfect example because what facial recognition really is—without it being the product that it is—is glorified computer graphics, right? AI-aided, machine-learning-aided computer graphics that are advanced enough to pick up the pixels in what a camera can see and make meaning out of that, you know? Now, that's facial recognition. So, you really have to ask yourself, why is it necessary? And who is it helping?

Brian: Mm-hm. Or who's it not helping?

Chenda: Yeah.

Brian: Do you have concerns then, when you're in the middle of a project, or maybe you're working on a feature or some aspect of Undock. I mean, maybe Undock’s a good example of this, like, the question of, should we do this? And should we be using these advanced analytical techniques in this particular area? Should we be looking at someone’s… whatever; their online/offline status, or whatever it may be. Even if you've technically thrown up the dialog that says, “Yes, click here to give permission to do this thing?” 

There's also the question of, does the person really understand what they're handing over in terms of their personal information and how it's being used? I think the comprehension piece is almost separate from the did-we-get-permission piece, and it feels good to just say, well, they checked the box. And at some point, there's personal responsibility, but some of this stuff is pretty technical. So.

Chenda: It's an interesting point. Most people don't really know what they're giving to the products that they're using, actually. And it's very funny because there's this quote that circulates around tech Twitter, which is, “If you're not paying for the product, then you are the product.” And that's more often than not the case, you know, with social media, blah, blah, blah, blah, because it's true.

I don't think it should actually be a part—actually, this is quite an interesting point. I don't think it should actually be a part of their user journey in every single individual product they're using. I think there should be a centralized place where people are well aware of how they interact with products generally, in a more macro perspective.

Brian: Mm-hm. Yeah, I just talked to an interesting MIT sandbox startup, and they're actually thinking about kind of this bank analogy, where the bank is where you have all of your personal data, and companies can pay you for it. So, here's an offer for nine dollars and twenty-seven cents; Google would like to use your email address, and here's how they want to use it. And you can choose whether to disclose it and get paid for it.

And I thought it was a very interesting concept of starting to expose where's my stuff being used. How is it being used? Now, I mean, how transparent they get about that, and whether they just throw up a bunch of legal stuff in front of the offer, I don't know. To me, there's potentially some ethical considerations there as well, especially with people that don't have a lot of money that may just see it as free money. “I just click here and I get $9 a month from Google, and I don't know, like, I didn't have to do anything.” [laugh]. 

So, I still think there's a human effort to understand—there's effort on both sides that's required if you really want to have a long-term strategy of being both ethical but also producing some kind of business value there. I don't know. It's a complicated question, but I wanted to get your take on it because I know you have a heavy ethics consideration in the work that you do.

Chenda: Yeah, yeah, definitely. It's something that I think should really be coming from a centralized place because, despite the methods being different, maybe the algorithm being different, or the user journey, from product to product, the essence is the same. And it's that—this goes into painting a larger picture, but how much of a stronghold these larger technology companies, especially, have on us, our lives, and our data. And to some extent, it shouldn't actually be just every single product creator's job to notify users to that extent; I think we should be taught this, you know?

Brian: Yeah. No, it's a fair—I agree. I mean, at some point, there is a level of personal accountability and responsibility, and it's your choice; it's a free society. I totally agree. I think—I don't feel like the scale is super well balanced at this point.

But it's a complicated [laugh]—it's not a binary, “It’s this way or that way.” It's definitely some kind of scale there. So, I guess we all have to figure out what the right balance is there between customers and what kind of company do we want to work for, and what do we want to do with the work that we do. So, jumping to a slightly different topic, just to kind of start to close this out. 

It's been great to talk to you, and I'm curious, do you have any feeling—you know, having worked in some tech companies and stuff, I'm always interested in, kind of, the relationship between the technical leaders, the product leaders, and the designers, and kind of this trio that's at the backbone of many software companies. Are there changes that you'd like to see there about how that relationship works, in terms of either people understanding your work, or vice versa, or whatever? I'm curious about those relationships. Is there a takeaway from your experience about how you think those teams could be more optimized to work together?

Chenda: I really, really love this question. It's something that I contemplated at my last company and will maneuver differently in this company. But I helped scale my last company from, really, three to four people to now over 100. And seeing how senior leadership communicated with each other, especially representatives from each of these different groups—there have to be translators, there have to be people who exist in translational roles, and they're quite difficult to fill because you have to have an understanding of the hows, but you have to be able to explain the whys.

Brian: Mm-hm. Is there a special type of person, or role, or something where you think that role falls naturally?

Chenda: It's something that I actually think is still being—it's showing itself through in tech companies, whether it's the Big Five, at Google or Facebook, or in startups that are high growth. It's, “Oh, we need someone to sit in on this meeting, who is a creative technologist.” There are these names now that get thrown about that aren't actually just ‘senior engineer,’ et cetera, et cetera, who have an understanding—yeah, even sometimes the social sciences behind why you would do something in the design. Or they have an economics background and understand computational social science and why micro-influencers communicate with each other in this way, you know?

Brian: Cool. Well, I appreciate you sharing your stuff with me. And I did have one last question. When I was checking out your background stuff, what kind of music are you DJing?

Chenda: Oh, [laugh]. So—

Brian: I'm a musician as well, so I was curious to hear what kinds of stuff you're spinning?

Chenda: Oh, that's a good question. So, I'm very, very influenced by the electronic music scene in both the UK and Europe, and I kind of combined that with, mmm, yeah, some other futuristic sounds as well. So, mostly electronic music, but, I mean, it can range from like anyone on—

Brian: New stuff? Old stuff? Drum and bass? Like, newer genres?

Chenda: Drum ‘n’ bass is great. Anything from Brainfeeder. Of course, you have to, like, throw in Aphex Twin sometimes, but then, like, then you mix it with Travis Scott, you know?

Brian: Yeah.

Chenda: [laugh].

Brian: Cool. I have a soft spot for some drum and bass in my life as well, so it's good stuff. [laugh]. Well, Chenda, it's been great to talk to you and where can people follow you? I know you're pretty active on Twitter. Is that the best place to check out your work?

Chenda: Yeah, so Twitter is great. Of course, there's my website. I also have a Substack where I'll be releasing a newsletter. Of course, feel free to follow me on Instagram with anything that's more visual, but I am trying to [00:29:28 unintelligible] everywhere.

Brian: Awesome. And that's C-H-E-N-D-A, bunk like a bunk bed, B-U-N-K-A-S-E-M if anyone's just listening and not reading. So, Chenda, thanks for coming on Experiencing Data and talking to us today.

Chenda: Thank you for having me.

Brian: Take care.
