047 - How Yelp Integrates Data Science, Engineering, UX, and Product Management when Creating AI Products with Yelp’s Justin Norman

September 08, 2020 00:42:53
Experiencing Data with Brian T. O'Neill

Show Notes

In part one of an excellent series on AI product management, LinkedIn Research Scientist Peter Skomoroch and O’Reilly VP of Content Strategy Mike Loukides explained the importance of aligning AI products with your business plans and strategies. In other words, they have to deliver value, and they have to be delivered on time.

Unfortunately, this is much easier said than done. I was curious to learn more about what goes into the complex AI product development process, and so for answers I turned to Yelp VP of Data Science Justin Norman, who collaborated with Peter and Mike in the O’Reilly series of articles. Justin is a career data professional and data science leader with experience in multiple companies and industries, having served as director of research and data science at Cloudera Fast Forward Labs, head of applied machine learning at Fitbit, head of Cisco’s enterprise data science office, and as a big data systems engineer with Booz Allen Hamilton. He also served as a Marine Corps Officer with a focus on systems analytics.

We covered:

Quotes from Today’s Episode

“[My non-traditional background] gave me a really broad understanding of the full stack […] from the physical layer all the way through delivering information to a decision-maker without a lot of time, maybe in an imperfect form, but really packaged for what we’re all hoping to have, which is that value-add information to be able to do something with.” – Justin

“It’s very possible to create incredible data science products that are able to provide useful intelligence, but they may not be fast enough; they may not be […] put together enough to be useful. They may not be easy enough to use by a layperson.” – Justin

“Just because we can do things in AI space, even if they’re automated, doesn’t mean that it’s actually beneficial or a value-add.” – Justin

“I think the most important thing to focus on there is to understand what you need to be able to test and deploy rapidly, and then build that framework.” – Justin

“I think it’s important to have a product management team that understands the maturity lifecycle of building out these capabilities and is able to interject and say, ‘Hey, it’s time for us to make a different investment, either in parallel, once we’ve reached this milestone, or this next step in the product lifecycle.’” – Justin

“…When we talk about product management, there are different audiences. I think [Yelp’s] internal AI product management role is really important because the same concepts of thinking about design, and how people are going to use the service, and making it useful — that can apply to employees just as much as it can to the digital experience that you put out to your end customers.” – Brian

“You hear about these enterprise projects in particular, where the only thing that ever gets done is the infrastructure. And then by the time they get something ready, it’s like the business has moved on, the opportunity’s gone, or some other challenge, or the team gets replaced because they haven’t shown anything, and the next person comes in and wants to do it a different way.” – Brian

Links

Transcript

Brian: Welcome back everybody to Experiencing Data. My name is Brian O’Neill. Today I have the vice president of data science at Yelp and an author and enthusiast, if not expert, in AI product management. So, welcome to the show, Justin Norman. How’s it going?

Justin: It’s going pretty well. Thanks for having me.

Brian: Yeah, yeah. So, I picked up on you from your recent O’Reilly articles that you’ve been writing, I think it was a three part series on AI product management. I think it’s great. I think people should definitely check it out. And I’m going to link up that stuff in the [notes]. But before we get detailed into that, tell my audience a little bit about your background, and what you’re doing today at Yelp, and your work and writing outside of Yelp as well.

Justin: Sure. So, my background is a little bit non-traditional when you think about machine learning, AI, data science, what have you. I actually started my career, after growing up in DC, in the military. So, I spent some time in the Navy, and then at the Naval Academy, studying computer science, mathematical optimization, and then I moved on to sort of the technology field in general, inside of the United States Marine Corps. I did that for several years, for a bit more than five.

And that gave me a really broad understanding of the full stack. And when I say the full stack, I mean from the physical layer all the way through delivering information to a decision-maker without a lot of time, maybe in an imperfect form, but really packaged for what we’re all hoping to have, which is that value-add information to be able to do something with. And that took the form, sometimes, of machine learning products, what we call, actually, an AI product, now. Sometimes it took the form of analyses, sometimes it was actually just aggregation of information from different various sources that we had, and putting that into a visual format, so a really good background in all the different ways that you can process information.

And after I left the government, I spent some time doing management consulting. This was sort of in the heyday of the Bigtable paper, which then translated into Hadoop and that sort of open data platform capability. And I was flying all around the world doing implementations of people’s first big data infrastructures. There wasn’t a whole lot of value capture from that. It was a lot of putting unstructured information in one place and then trying to figure out years later how you would access that, and then turn it into products. But it was really, again, important from a foundational layer to understand the infrastructure and technology that are going to support what we will be building later in the machine learning and AI space.

After that, I got a little bit more focused at Cisco Systems, transitioning through a number of leadership roles, but finally ending up founding one of the first data science teams, and actually the enterprise data science office, and growing that team from really just a couple of people to well over 20, 25 people. And that was the foundation of building this federated data science model throughout all the different functions at the company, and starting really to take AI development very seriously and get into the productization component of it.

After that, I spent some time at Fitbit working on applied machine learning, which was much more focused on machine learning and AI algorithmic development and putting that into an already existing physical product. And then I moved on to Cloudera, where I took over the Fast Forward Labs team, which was an applied machine intelligence team. And that took me farther back into the research side of things with a bit of consulting on top of it, which was really great to understand how do we approach things from the, like, really pushing the boundary of what’s possible perspective. And so that was a wonderful opportunity before I was brought in as the head of data science at Yelp, where I currently am.

Brian: Got it. I’m curious—not that this is the original reason I wanted you on the show, but is there one main difference between doing this type of work in the government military versus outside of it?

Justin: [laughs]. I don’t think there’s one, but I think an aspect that I have found really interesting, having been on both sides of it and having been in some leadership roles on both sides of it, is that in the government, they tend to be very clear about what they want, but not as clear about the types of technologies that need to be synthesized in order to get there. And on the civilian or corporate side, they tend to be a lot more focused on the technologies and really capable around them, but not as clear about what the outcome is. So, it’s been really interesting sitting in the middle of both of those things. Sometimes it’s frustrating but sometimes it’s also an unlock to be able to speak both languages.

Brian: Yeah, that’s really fascinating. What’s an AI product?

Justin: Yeah. Great. Great question.

Brian: [laughs].

Justin: [laughs]. So, I think before we can talk about what an AI product is, we probably need to talk about what a data product is, and I am not going to use my words for that. I would channel the great DJ Patil and call that, “a product that facilitates an end goal through the use of data.” And so it’s actually really similar to what we were just talking about a moment ago: it’s really any type of product, whether it be physical, digital, software-based, or based in some type of other delivery mechanism, that gets you to something you couldn’t do before without using information. And that’s, I think, a really good atomic definition of a data product, which is useful in understanding how an AI product could take that further.

And so how I’d put the AI product world together, is an AI product would be a product that uses some kind of automated machine learning or AI capability—and it’s usually in the form of a machine learning model or AI model—combined with one or more data products that we defined previously, to make or support decisions in a form that’s useful to users. And so there are some things that I said there that might have sounded sort of like in between words that are actually really important to call out. One of them is ‘automated.’ So, it’s very possible to create incredible data science products that are able to provide useful intelligence, but they may not be fast enough; they may not be obfuscated in the way that they’re put together enough to be useful. They may not be easy enough to use by a layperson.

And so the automation component of it is really important for AI products because when you think about the best AI products or the most ubiquitous ones we have, you don’t think at all about the machine learning models powering Google Maps, or Lyft, or Uber. You don’t think about the machine learning models that help you to determine what the forecast is going to be on your weather app. You don’t think about those things at all. You just use the capabilities and they’re useful without you actually having to do engineering work. So, automation is really important there.

The other part about it is—and we’ve talked about it—is ‘useful.’ Just because we can do things in AI space, even if they’re automated, doesn’t mean that it’s actually beneficial or a value add. And I think the last couple of years of AI product development or just AI, in general, have shown a variety of really interesting demos, but maybe they weren’t useful at all from a product perspective. And so, the case we’re trying to make as we’re going through writing, and reading, and trying to do some thought leadership around AI product management is really, how do we turn this into something that people find valuable in their day to day lives, either from a commercial perspective or from a social impact perspective?

Brian: Can you talk to me about that piece of starting with the problem space, and is this, like, “We have an idea that X, Y, and Z could be useful to people,” versus, “Hey, we have this data set that we could do X with,” and then trying to retrofit it into a problem? How do you guys approach that at Yelp? Who comes up with the ideas for these intelligent add-ons, or whether it’s a standalone product or however you want to describe it, I’m always curious about that journey of kind of data first, versus people, business value, outcomes first, that perspective?

Justin: Sure. Yeah, I think that’s a really good question. And I’m often really, actually quite surprised that people think that those two approaches are distinctly different. In my mind, understanding what the problem actually is, and providing value to the business, value to the customer, or user, or however we’re defining the audience is the same thing as doing the first layer of analysis. So, before you’re skipping to building a model, hopefully, the first step in building an AI product is identifying the problem that you actually want to solve. And that would include defining some metrics that demonstrate whether or not you’ve actually succeeded in building a solution that solves for that problem. I know that sounds simple, but it’s just very complex, and sometimes not easy.

And I think AI product managers are ones that have a specific expertise in being able to look at how a machine learning or AI product can impact the sort of more general product space, or more general consumer space that they are concerned about. And I would say also that it’s difficult, sometimes, for businesses that haven’t invested in mature data or machine learning practices to agree on metrics, or even define them in the first place. And a lot of times the personalities, and politics, and trade-offs between the short term and long term goals—and when I say short term, it’s typically the revenue that you can impact the easiest, versus the long term, which is investment that would then lead to larger revenues we’d hope, but certainly over time and with more risk. That kind of misalignment is what I think derails a lot of early AI products.

Brian: Yeah, you kind of framed it as not being mutually distinct, but I feel like particularly in non-digital-native companies, it starts with this, “We have this customer data, and then we have all this purchasing data over here. Hey, data science team, what can we do with all this? I hear we’re supposed to be doing something with this. Sprinkle your dust and get back to me.” And so that’s why I’m asking about that.

And part of that’s maybe a literacy challenge, but I feel like right now, there’s still a need, particularly if you don’t have a functioning product management role, whether it’s by title or just someone’s kind of organically doing the work, this is why these projects fail because the data science practitioners and sometimes the leadership are so focused on wanting to do the technical work, the modeling work, they want to write papers or whatever, they want to work on that and they don’t want to be dealing with the, “Well, what do you want?” problem, the, “What are you trying to do?” problem. So, how do you do that if you don’t have an AI product manager to, kind of, interface there, who fills that? Should the data scientists be kind of stepping up, or is it like, “No, go to your business person and get a role requisition for a product manager.” I think it’s a totally missing role in a lot of these companies, even if you’re not a tech company, per se, where your main product is software, it’s still a critical role if you’re going to do this. So, can you unpack some of that for me?

Justin: Sure. Yeah, I mean, I think you’re hitting on a really important dynamic, and this is about making sort of AI capabilities accessible for all types of businesses, and all types of structures, all skill sets. And I do think there’s a way for every organization to find the part of the data science, machine learning, and AI stack that works for them and that’s meaningful. Not every business is going to need the robust Google-style, deep learning infrastructure, but I do think that there are some problems which were really well-solved by a very broad set of businesses.

And so, how I would unpack that is really to think about, at the atomic level, the maturity of where your business is, and I think that requires some self-awareness. So, if you’re a company that has spent a lot of time investing in talent from a technical perspective, but maybe you haven’t been getting what you expected out of that, or the value hasn’t yet gotten to where you want, that would be a good indicator that there’s a need for this AI product management role to exist explicitly, whether that be in an existing product manager who’s building something that uses a machine learning model, or uses a fair amount of statistical inference-based capabilities, and then what you’re asking that person to do is to spend more time developing that business understanding, and developing that scoping of what is most beneficial for the consumer and what’s beneficial for the user, and then helping to craft that structure into a roadmap that a data science and engineering team can execute. So, that’s one aspect that you could do.

But if you’re a company that doesn’t have the talent, doesn’t have the technical talent, then I think going out and trying to hire a bunch of data scientists without a clear understanding of what your team in particular needs, as well as not having a good idea of what makes a great data scientist from an experience perspective, might be really challenging. So, in that case, you might want to start with a technical hire that has the ability to build a team. And that person might need to lean a bit more on the product management, communications, and strategy side of things, even though they have a technical skill set, and those people exist, but they may not be the person that shows up number one on the Kaggle competition of the day. However, they have the technical skills to be able to get you through the first two or three steps in your machine learning and AI journey. And most importantly, they have the network and the ability to help you build a team, and get you started from a business perspective. So, the advice I would have is to try to understand what your business really needs and where you are today, and then go look for people who can help you advance the practice that you have, not looking for, like, a moonshot solution on day one.

Brian: Yeah. I want to get into when we talk about customers and users. So, I know—and this is going to get into your platform, the Beaker and Bunsen, I did check out your cool article about that, so I want you to talk a little bit about that—but when we talk about users here, I’m guessing in the context of the platform that Yelp has built up, you probably have customers who are quote, “data scientists,” other employees, and you have customers who are Yelp app users, and you have customers that are maybe a product lead or a line-of-business owner, or like, “I run advertising or whatever, and I want to see how my ads are performing.” So, can you talk about the design process of trying to build a solution for all these different personas? And how do you do that? Is that even right, by the way—

Justin: Yeah.

Brian: —is that even an accurate description of who your quote “users” are?

Justin: Yeah, that’s actually really very accurate, and it’s a dynamic that I think is probably not immediately clear when someone says we have an experimentation platform. So, it’s pretty good that we’re having that conversation. So, the simple answer to it is we hired a product manager to build an experimentation platform as a product, rather than saying it’s an internal tool or an infrastructure capability that we’re going to have, and to be honest, that was the way that it started. We realized about a year and a half into it, that there was just a real need for someone to think about all of these questions that you just posed in a really deliberate way. And so that skill set actually transitioned.

So, it’s a pretty good example of that maturation cycle that I was talking about previously. The first person to hold this job was not necessarily a data scientist by trade. They were a very strong product manager, really great at the roadmap program and portfolio management component, as well as building relationships needed in the engineering community to start laying the foundation for some of the core features that were needed for all of these communities. That was the first product manager. The person who holds the role now has a Ph.D. in a technical field. So, now what we’re getting is someone who is able to really take a look at some of the more advanced features that we need—and so these are things like multi-armed bandit, or really providing a structure for more robust causal inference, and then also advancing the UI capabilities that we build to be useful for a consumer that might be less technical, as well as for a fully-featured data science and machine learning team. So, what we did was realize that we actually had to look at the lifecycle of this, just like we would any other product that we had on the platform, and as soon as we started doing that, we really found these unlocks that I’ve been talking about.

Brian: Yeah. So, if I can play that back: the way I kind of see this is, effectively, you’ve built an internal software product, like some tooling and experimentation platforms for both employees to use, but there’s also a customer-facing side of that, so that’s your external view. And so if you’re listening to this show, you have to think, when we talk about product management, there are different audiences, and I think this internal AI product management role is really important because the same concepts of thinking about design, and how people are going to use the service, and making it useful, that can apply to employees just as much as it can to the digital experience that you put out to your end customers. So, is that a fair summary?

Justin: Absolutely. And I could say some of the negative things that can happen when you don’t make that a first-class citizen in your planning [unintelligible].

Brian: Please do.

Justin: So, in other places that aren’t Yelp, and weren’t as deliberate about this, that I’ve been, where you haven’t made the investment in making software usable, you get to a scenario where you actually can’t deploy the advancements or the features that you need to run the business. And so at one of the places that I was, we actually had a patent that was unique that allowed us to do some fairly, I would say, advanced human-and-device-interaction machine learning modeling. So, essentially, you’d be able to get some predictions about what someone’s interactions were, so whether or not they’re running, or jumping, or doing something like that, and make some fairly advanced determinations about what that would mean, and what’s most important for them to invest in from a health perspective, equipment perspective, et cetera. So, this was a very powerful product, and it’s something that exists now in a lot of Apple’s products. But unfortunately, there really wasn’t an understanding of how to make the internal tools for testing, internal tools for development, and productization—so this is, like, implementation of it—available to the lay engineer.

And I’m not talking about someone who’s got 10 years of experience working in academia, building their own machine learning architectures, or someone who has been in digital marketing and understands A/B testing in a way that’s intuitive. I’m talking about someone who knows how to write software in JavaScript, or who knows how to build an iOS app, and giving them the capability to be able to develop their AI products without all the other associated skills to be able to make it a reality. And the challenge became, we were trying to build the infrastructure while we were also trying to do the design and development workflow. And the result was it just went too slowly. And another team from another company beat us to market, and millions of dollars were affected by that. So, I would say that this is something that really needs to be thought of as an organizational competency, rather than it being something that sits on a specific team, and then that team hires for it, builds for it, et cetera.

Brian: So, if you’re going into a brand new place, perhaps a place that’s way less mature with AI as an enabling technology than what Yelp has, do you feel like you need to invest in this platform first before you really try to do any type of application of intelligence into the products, whether it’s in term—for experimentation or whatever, or do you think the goal is let’s allow some A/B experimentation, or let’s do a one-off project which starts the architecture but you’re not really planning out… the end goal isn’t to build the infrastructure, but it’s to enable the experiment or some digital experience? How would you recommend teams approach that? You hear about these enterprise projects in particular, where the only thing that ever gets done is the infrastructure. [laughs]. And then by the time they get something ready, like you said, it’s like, the business has moved on, the opportunity’s gone, or some other challenge, or the team gets replaced because they haven’t shown anything, and the next guy or gal comes in and wants to do it a different way. So, what’s your recommendation on starting small versus nope, it’s an investment. It takes all this plumbing, you got to do it. You have to eat that cost before you have any value. Or maybe it’s not those two choices, it’s a third choice.

Justin: Yeah, I think there’s a third choice. And the good news is that we live in the best time ever to be doing this work. There are new tools and technologies available every day that just simply weren’t possible, given the maturity and scale of software development in the past. So, I think that is where I would lean in. If you’re an emerging company, especially a startup that is just really focused on growth, you maybe haven’t done a lot of instrumentation in some of these areas inside of the application that would get you value from the data.

I think the most important thing to focus on there is to understand what you need to be able to test and deploy rapidly, and then build that framework. And so there are quite a few tools available even today that allow you to go from even a Jupyter Notebook or a really well-encapsulated model format like ONNX, or something like that, and then deploy that via something that’s lightweight, like a YAML specification, and then run your model architectures really extensively through some kind of scheduling orchestration tool. So, you’ve got the Airflows of the world, but certainly, there are quite a few new MLOps capabilities, or new Ops tools, coming out. So, you can string together a very narrow set of tools and technologies to be able to do this work very, very quickly, to do both the deployment and experimentation, but it probably wouldn’t scale, both from a financial perspective and also from a monitoring and extensibility perspective, when you went all the way to 20 million users.
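For readers who want to picture what that lightweight path can look like, here is a minimal sketch, assuming Airflow 2.x, onnxruntime, and a hypothetical model_spec.yaml with model_path, features_path, and output_path keys. It illustrates the general pattern Justin describes, not Yelp's actual stack.

```python
# A minimal sketch of the lightweight pattern described above: a hypothetical
# YAML spec points at an ONNX model artifact, and an Airflow DAG loads it and
# scores a batch of features on a schedule. Paths and keys are placeholders.
from datetime import datetime

import numpy as np
import onnxruntime as ort
import yaml
from airflow import DAG
from airflow.operators.python import PythonOperator


def score_batch(spec_path: str = "model_spec.yaml") -> None:
    # Read the deployment spec: which model artifact to load and where the
    # batch of features comes from (hypothetical keys for illustration).
    with open(spec_path) as f:
        spec = yaml.safe_load(f)

    session = ort.InferenceSession(spec["model_path"])  # e.g. "models/ranker.onnx"
    features = np.load(spec["features_path"]).astype(np.float32)

    input_name = session.get_inputs()[0].name
    preds = session.run(None, {input_name: features})[0]
    np.save(spec["output_path"], preds)


with DAG(
    dag_id="daily_model_scoring",
    start_date=datetime(2020, 9, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="score_batch", python_callable=score_batch)
```

The point of the sketch is the shape of the workflow: a notebook-trained model exported to a portable format, a small declarative spec, and an off-the-shelf scheduler, rather than custom serving infrastructure on day one.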

So, I think it’s important to have a product management team that understands the maturity lifecycle of building out these capabilities and is able to interject and say, “Hey, it’s time for us to make a different investment, either in parallel, once we’ve reached this milestone, or this next step in the product lifecycle.” And I feel like the way to get started is to try to support your product development team, especially the one that’s close to the business, as much as you can to make it easy for them to do these things, but still give them the information they need. But I think as you get more mature, and the idea is to leverage the data in as many ways as you can, certainly, more infrastructure is required.

And so even at Yelp, and I should be clear, it did not start with the decision of, “Let’s build this product, let’s hire a product manager, and then go through this lifecycle.” It started with a dozen, or maybe even more, engineering teams, all recognizing that experimentation and deployment of machine learning and AI capabilities, and even just general application experiences, needed to happen. And so they were doing it in their own ways in their own teams and presenting that to executives. It just became something that was not scalable, and it was actually hard for the leadership to parse all the different methodologies. So, that was the decision to then go centralize this, provide an actual infrastructure. So, even Yelp, which has a very mature architecture around this, went through the maturation lifecycle that I’m talking about.

Brian: Switching over to the last mile or the place where humans get involved with this stuff, you talked about design and UX in your articles about this, about helping out in the ideation phase, and the problem development space, how is it going to be experienced, and you made really great points about no one’s thinking about the routing algorithm in Google Maps, or Uber, and all this kind of stuff. I think that’s great, we’re trying to obfuscate as much of that technical aspect away so that it’s really focused on utility and usability and all that. So, how does working with design help or hinder? I’m especially curious when you’re running experimentation, I could see how that could be hard for some designers who think like, “Oh, this is my job to determine the photos layout on restaurants, or whatever.” And you’re like, “No, actually, we should be showing photos based on a profile, or usage, or whatever.” I read some of your stuff about picking out meals, tagging meals based on popularity. What’s that experience like? And how is design helping you do your work, or making it easier or harder?

Justin: So, design is a customer first, and then they become a partner. And then I think they actually become a leader, or even causal in what gets developed or deployed. And so I’ll go in that order. If you’re a designer that has been in digital products for a while and has kind of understood that what sometimes is attractive isn’t actually engaging, and vice versa, your bias, in my experience, especially working in digital product companies, is to actually want a lot of data about the user journey. And so if you’re a designer, especially a leader in the design space, you would want to be receiving analytics information pretty much constantly about how customers are experiencing the product.

And if you’re building a new feature on that, you’d have some indications or signal about what has worked before, and you have some hypotheses, probably, about what you’d want to put in front of them to test to see if it’s still positive or is engaging. And so even at Yelp, the partnership between design and data science is actually something we’re really focused on, and even improving today, because we recognize that there’s not necessarily a need for there to be an unhealthy tension there. It’s great for us to both be pushed in those directions. But in order for design to make sound decisions over time, they actually need to be able to recognize what the work they’re putting out actually means to the users themselves. And so that’s kind of the first part of it. The partnership begins during the experimentation process. Certainly you have wonderful designs that have been taken as an aggregate from what’s going on, as a pure set, versus what new experiences we think might be engaging or useful in the product, but you’re going to want to make sure that you try three, four, five different types of those things.

And when you get randomly controlled, cohorted testing available, when you get something that allows you to put different experiences in front of users but not dump that out to everyone, it’s very powerful. And so design really needs to give the feedback loop to the machine learning and data science practitioners about what they see the user is doing and how that’s impacting the user journey. So, this is research, user research, as well as actual design from a static perspective, as well as qualitative research, so going into the survey space. And if you’re working together with data science, essentially looking at what the product analytics information is telling you, you can get to a place where a few people, given a good amount of information, can make a decision about what the trade-offs should be and where to invest.

And when this is all done, really, it’s going to come down to this: even if we do see some harm, or we do see some things that are maybe suboptimal from what the data particularly tells you, but it’s just worth it for us, design, I think, is in a good position to make the case to the product leader, whoever that is, that we might want to own that or take that hit. And I think a good data scientist understands that even though the data might say something that’s pretty clear, it doesn’t mean that it’s always correct to do from a business perspective. And in that case, I would say, we might take a backseat to that decision because we have said our piece, and I think what’s right for the business needs to be something that’s ultimately held by leadership and product. So, that’s how I would frame it.

Brian: Is there a particular example you can give from one of those scenarios, like you talked about, where here’s the data, but you’re taking a back seat, or even how qualitative information helped? I think so much about a lot of these technologies being data-driven as purely quantitative, like we’re building intelligence systems that are based on quantitative data. How does a qualitative insight from a research session, or something, affect the work that you do?

Justin: Sure. So, I worked with an airline once, who was very interested in trying to deploy a lot of photo or camera-based tools to look at how they could perform maintenance on their fleet. It’s incredibly expensive to have human inspection, and it’s something that needs to be done, but also at the same time is not something that you can deploy to every single plane in every single location you have. So, what we were able to do is to take a look at what data actually existed to be able to determine whether or not there was a way to automate any of this. And from a design standpoint, we also worked with some, sort of, product leaders who had that design focus and UI focus, and they realized that how technicians and how people interact with the maintenance process actually isn’t super data-driven to begin with. So, we had a problem right at the beginning, in that we could have found the best data in the world, but people don’t have a tablet with them when they’re doing it, right?

Brian: Right.

Justin: So, how are they going to get it? And so the entire process of product development shifted from being, “Okay, let’s just create a machine learning model that’s going to make this easier,” into a scenario where we started to think about, “Okay, well, how do we change the experience of doing maintenance? And does that require machine learning at all, or is it more about us making it a better tool, or more comfortable, or giving them more opportunities to look at things in aggregation?” And that might be something that a computer could do. And so, that’s an example of design really taking a data-driven approach, realizing that data-driven approach doesn’t result in a delighted consumer journey, and then getting instructions back into the data science team in a way that allows us to solve for the right problem with the data we do have.

Brian: Do you find most of the time it’s, like—in the journey at Yelp or even at prior places, have you found that to be something where you got to go through a rocky stage before the happy marriage kind of happens between this user experience focus and the data science piece, or do they quickly get into bed together, and it just kind of happens fast? Or, tell me about that journey that should be expected?

Justin: No, I don’t think that’s something that happens organically. I think we have to remember some of these capabilities, especially the ones that we have in the machine learning and AI space, are less than 20 years old. The math is not new, but the software is. And design, obviously, and user research have been around since there have been products. So, I think there is a bit of a vocabulary and a tools alignment that needs to happen, and that’s something that is easy to do once you realize it’s a thing that you need to do, but it’s not organic.

And then I also think there’s sort of a cultural change. A lot of early data scientists, especially ones that focused on machine learning and AI, came from really hard sciences in academia, and that’s a wonderful place to build great skills, and I’m very glad to be amongst a lot of them. However, it isn’t a place where you get as much exposure to the product development process, and also the methodologies of making decisions are quite different. So, think of hypothesis-based statistical decision making or inference versus causal inference, where you’re sort of saying, “I have a bunch of ideas and confounders, and then I’m going to make a decision based on what I think is true in the moment.”

And so, data scientists and machine learning practitioners are really sometimes myopically focused on the model and how it performs, and UX researchers and designers are sometimes myopically focused on what the users’ experience has been using the product, however it works. And so bringing those two communities together, where the model, that’s actually important, what it’s doing is important. It needs to be interpretable, it needs to function in a way that’s not biased, it needs to be accurate with the idea that the user, while they’re not aware of how the model is functioning, should be having a good experience overall with the tool, or the software, or the experience that they’re having. Like, there can be friction there, and I think that the reality is that a great product manager can be a person who can help to organize and orchestrate the conversation into a really productive outcome. And that’s a great place to have someone who has a bit of both skill sets.

Brian: You talked a little bit, too, in one of these articles about ethics, and also guardrail metrics, and I was curious if you see those as related, and whether or not a focus on Human-Centered Design helps us identify guardrail metrics, and what the ethical considerations are. And practically speaking, I’m curious if you can give an example of maybe how you’ve adjusted something at Yelp for ethical considerations because I think it’s always so abstract when I hear it talked about, and there’s not a lot of concrete examples of what people are doing to actually adjust—either prevent an ethical problem, or to address one that they know is embedded in the data, or some of the processing, or whatever it may be. So, can you talk about guardrail metrics and ethics?

Justin: Yeah, so I think guardrail metrics traditionally apply to the experimentation process, where there are sort of senior business-related metrics that cannot be impacted or should not be impacted by the experimentation process. So, an example of this might be if you’re an advertising business, and your goal is to keep the cost per click, or cost per lead that you’re developing, really low, then you don’t want to run a bunch of experiments that change that metric for the users, because then they’re going to be like, “Well, I’m in this cohort that I didn’t ask to be in, and all of a sudden, I’m seeing things get much worse than they were very rapidly for me, and it’s harder to reach the audience, it’s more expensive to do. Why is this happening?” That’s a bad outcome for an experiment. And so you want to turn that experiment off or change it very quickly. And so that is what guardrail metrics typically are installed to do. I think it’s probably not yet as mature in many companies to apply that to ethical metrics, mainly because those are not as mature, or as easy to add into the machine learning and AI process, as they should be.
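As a rough illustration of the traditional kind of guardrail Justin describes, here is a toy sketch that watches cost per click in a treatment cohort against control and flags the experiment when it degrades past a tolerance. The threshold, data shapes, and names are assumptions for illustration, not Yelp's implementation.

```python
# A toy guardrail check: alongside the experiment's goal metric, watch a
# business-critical metric (here cost-per-click) and flag the experiment if
# the treatment cohort degrades it beyond a tolerance. Illustrative only.
from dataclasses import dataclass


@dataclass
class CohortStats:
    spend: float   # total ad spend attributed to the cohort
    clicks: int    # total clicks observed in the cohort


def cost_per_click(stats: CohortStats) -> float:
    return stats.spend / max(stats.clicks, 1)


def guardrail_breached(control: CohortStats, treatment: CohortStats,
                       max_relative_increase: float = 0.05) -> bool:
    """Return True if treatment CPC is more than 5% worse than control."""
    return cost_per_click(treatment) > cost_per_click(control) * (1 + max_relative_increase)


# Example: the treatment variant made ads noticeably more expensive per click,
# so the experiment should be paused regardless of how its goal metric looks.
control = CohortStats(spend=10_000.0, clicks=8_000)    # CPC = 1.25
treatment = CohortStats(spend=10_000.0, clicks=7_000)  # CPC ≈ 1.43
if guardrail_breached(control, treatment):
    print("Guardrail breached: pause the experiment and investigate.")
```

In practice a check like this would be wrapped in proper statistical testing and run continuously against streaming experiment data; the sketch only captures the decision rule.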

And so I want to be really clear, I think it is possible to measure some of these aspects, or at least the research should be done to make sure that you have a metric available for some of these considerations before you deploy. But in reality, a lot of companies skip that, and that’s how we’ve gotten into some of the problems that we’ve had recently around this space.

So, a practical example that I would give, which is offline but probably needs to be done online in the testing process, is looking through zip codes and making sure that you get a fair representation of the types of people that are in a community. It’s really easy to just sort of say, “Hey, there is a set of five zip codes that all go to use a particular product or go to a certain type of business, and those are the most valuable.” But that’s not really true, is it? It’s the most valuable for the experience you’ve deployed. And that is a design problem because if you’re only reaching one group of people, it’s actually less lucrative, of course: you want to reach the widest audience possible.

And it’s also a bias issue because you can probably guess, if you tune the product towards one community of people that are very, very active on your platform, there’s probably another community who’s not receiving a good experience on it. And that can be happening because your product isn’t very good, or it could be happening because it does not resonate with someone’s experience that they’re having because they have—it’s not a positive experience for women, it’s not a positive experience for people of color, or not a positive experience for any host of different ways that people are. And so guardrail metrics in that context could really help you if you’re able to define those communities. The problem is, you don’t always have that data. And so this is where user research from a qualitative perspective is very, very powerful. And so, one of the things that I think we’re trying to deal with right now is, when you have an online system which is really reacting to streaming information—so did someone click on something? Did someone spend time on a page? What pixel are they looking at? Versus the qualitative research, which is largely survey-based and a lagging indicator. How do you put that together? And we’re still working on that.
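As one way to picture the offline zip-code check Justin mentions, here is a small sketch that compares a cohort's zip-code distribution against the broader population and flags under-represented areas. The threshold and inputs are illustrative assumptions, not a description of Yelp's tooling.

```python
# A rough sketch of the representation check described above: compare how a
# cohort (e.g. the users an experiment actually reached) is distributed across
# zip codes versus the broader population, and flag badly under-represented
# areas. Thresholds and example data are illustrative.
from collections import Counter
from typing import Dict, Iterable


def representation_gaps(population_zips: Iterable[str],
                        cohort_zips: Iterable[str],
                        min_ratio: float = 0.5) -> Dict[str, float]:
    """Return zip codes whose share of the cohort is less than `min_ratio`
    times their share of the population, mapped to that ratio."""
    pop = Counter(population_zips)
    coh = Counter(cohort_zips)
    pop_total = sum(pop.values())
    coh_total = sum(coh.values()) or 1

    gaps = {}
    for zip_code, pop_count in pop.items():
        pop_share = pop_count / pop_total
        coh_share = coh.get(zip_code, 0) / coh_total
        if coh_share < min_ratio * pop_share:
            gaps[zip_code] = coh_share / pop_share
    return gaps


# Example: zip 94124 makes up a fifth of the population but barely appears in
# the cohort the experiment reached, so it gets flagged.
population = ["94103"] * 50 + ["94110"] * 30 + ["94124"] * 20
cohort = ["94103"] * 45 + ["94110"] * 25 + ["94124"] * 2
print(representation_gaps(population, cohort))
```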

Brian: Great feedback. This has really been a fun episode, and you’ve shared just a ton of knowledge here, so I really appreciate you coming on Experiencing Data to chat about this. So, just any closing thoughts or advice for data product leaders, or aspiring data product leaders out there?

Justin: Sure. So, I think the case that we’re making for AI product management is not that it needs to always be a distinct role, but it needs to be a distinct function. And so if you’re in a role where you know that machine learning and AI is either going to be useful for your product, or is already in it, I think it’s really important to explicitly identify who is going to be responsible for that function on the product team. And if there’s no one who has the skill set, it’s worth it to invest, whether that’s hiring, or knowledge development, training. And I think what you’ll see is a massive unlock of adjacent capabilities.

Our series of articles—there’s actually another one coming out, a fourth one, that’s going to be focused much more on maintenance and post-production requirements for product management in AI—our series really allows you to kind of think about what you should be doing from a process perspective, but the best way to build this skill set is of course to experience it, and to do the work. So, I really recommend that a lot of aspiring or emerging product managers in this space start to think of themselves as AI product managers even if that’s not their title, and then begin to work through some of these ideas with the rest of the product team and build that competency.

Brian: Awesome. Great advice, Justin. Where can people keep track of your work and all of that? LinkedIn? Twitter? Do you have some accounts you [crosstalk]?

Justin: Yeah, sure. So, you can find me at @justinJDN on Twitter, and I’m usually sharing something about what Yelp’s data is telling us about the economy, or what I’m playing around with in the machine learning tools space. And on LinkedIn, I typically am more focused on strategy and how to put together the right teams. So, both of those places you can find me, and of course, if you’re interested, feel free to check out the O’Reilly article series as well.

Brian: Yeah, I’ll definitely link those up. So, Justin, awesome, great stuff. Thanks for coming on the show, and best of luck with the rest of the year.

Justin: Thank you.

Brian: All right. Cheers.
