023 - Balancing AI-Driven Automation with Human Intervention When Designing Complex Systems with Dr. Murray Cantor

October 08, 2019 | 00:49:32

Experiencing Data with Brian T. O'Neill

Show Notes

Dr. Murray Cantor has a storied career that spans decades. Recently, he founded Aptage, a company that provides project risk management tools using Bayesian Estimation and machine learning. He’s also the chief scientist at Hail Sports, which focuses on applying precision medicine techniques to sports performance. In his spare time, he’s a consulting mathematician at Pattern Computer, a firm that engineers state-of-the-art pattern recognition solutions for industrial customers.

Join Murray and me as we explore the cutting edge of AI, including explainable AI and balancing automation with human intervention.

Resources and Links

Murray Cantor on LinkedIn

New York Times exposé on the Boeing 737 Max

New York Times article on the 737 Max whistleblower

Quotes from Today’s Episode

“We’re in that stage of this industrial revolution we’re going through with augmenting people’s ability with machine learning. Right now it’s more of a craft than a science. We have people out there who are really good at working with these techniques and algorithms. But they don’t necessarily understand they’re essentially a solution looking for a problem.” — Murray

“A lot of design principles are the same whether or not you have AI. AI just raises the stakes.” — Murray

“The big companies right now are jumping the gun and saying they have explainable AI when they don’t. It’s going to take a while to really get there.” — Murray

“Sometimes it’s not understood by non-designers, but you’re not testing the people. You’re actually testing the system. In fact, sometimes they tell you to avoid using the word test when you’re talking to a participant, and you tell them it’s a study to evaluate a piece of software, or in this case a cockpit, to figure out if it’s the right design or not. It’s so that they don’t feel like they’re a rat in the maze. In reality, we’re studying the maze.” — Brian

“Really fundamental to understanding user experience and design is to ask the question, who is the population of people who are going to use this and what is their range of capability?” – Murray

“Take the implementation hats off and come up with a moonshot vision. From the moonshot, you might find out there are these little tangents that are actually feasible increments. If you never let yourself dream big, you’ll never hit the small incremental steps that you may be able to take.” — Brian

Transcript

Brian:  All right, welcome back to Experiencing Data. Today I’ve got Murray Cantor on the line, who has several different titles in your work, but you’re currently the Chief Scientist at Aptage Consulting. You have a significant background in engineering and data science. So first of all, welcome to the show, and jump in and tell us a little bit about your background, Murray.

Murray:  Thanks Brian, I’m really glad to be here. I’ve been around a long time, so I’ll try to keep this reasonably brief. Back in the early 70s, I got a PhD in Mathematics from Berkeley, and since then I’ve been really interested in the dynamics and economics of innovative products and systems. In the past, I was an industrial mathematician at Shell Research, a project and architecture lead in IBM’s workstation division putting out AIX 3.0, and the former Chief Engineer of the service organization at Rational Software.

Murray:  Back in the day Rational was a leader in software development tools and processes. Rational got bought by IBM, where I became a distinguished engineer focusing on development, governance and analytics.

Murray:  Since leaving IBM I founded Aptage, which isn’t really a consulting company. It’s a product company that delivers services for R&D project and portfolio management. I’m also the Chief Scientist at Hail.Sports, where we’re looking at applying precision medicine techniques to sports performance. And I’m a consulting mathematician to Pattern Computer, where we’re doing state-of-the-art pattern recognition solutions for industrial customers.

Brian:  You’ve covered a lot of different bases here, and I know you have some experience getting involved with user interfaces in addition to user experience. In particular, when we had our pre-call, I know you had some strong feelings about explainability and AI models. So I wanted to talk about XAI a little bit today, as well as some of the concerns you’ve seen coming up with AI-driven interfaces.

Brian:  We talked a little bit about how you’ve done some work with airplanes and airlines, if I recall correctly, so you know something about that. Obviously, there’s a heavy part of the work that’s not necessarily interface-driven or in front of the customer… the deliverables are not necessarily interfaces, but you have those in mind, and you’re thinking about the outcomes that you want. There’s a customer, a person involved, a human in the loop. What is your process when you’re working on these very complicated systems, leveraging AI?

Brian:  How are you approaching the user experience piece and delivering value, and making sure the math and the science and the technology is actually delivering outcomes? Tell us about your process.

Murray:  A lot of design principles are the same whether or not you have AI. AI just raises the stakes. There’s a variety of outside-in design methodologies. I’m a fan of all of them; I’m not particularly wedded to any one in particular. The real question that you often have to face in designing an AI system is the “So what?” attitude. I’ve discovered this pattern, that’s interesting, now what? I had an example of a company that could look at EKGs and predict whether or not you’re likely to have a cardiac event in the next six months.

Murray:  And they couldn’t monetize that, and the reason was: that’s interesting, but so what? This is part of the process question, which is, people want to use artificial intelligence, machine learning, whatever, to do one of two things. They either want to make an intervention in the system, or they want to automate some process. That is, things that people were doing that now they want machines to do; reading X-rays would be an example of that.

Murray:  They just want to automate that process. So first, like any other good design effort, you say, “Why are we doing this? What is this for?” And if it’s to automate something, well, then you understand exactly the role the person whose work you’re trying to automate plays in a process. You just have to understand that role and what the performance parameters are for that role.

Murray:  And then, how does the person whose work is being automated interface with the system now? That’s probably good guidance for how your AI should interface with other parts of the system. And how is that function getting measured? Well, that’s how the AI should be measured, and that’s what you present.

Murray:  If it’s an intervention you’re looking for, like what kind of treatment should I choose in this situation? This comes up in our sports example, what kind of training regimen, what kind of diet should people have? What you want to do is support… understand the decisions that are going to be made, and report to the person who will make the decision, what decisions are going to lead with what probability to what kind of outcome. So you think a lot about designing for intervention.

Brian:  This actually made me think of… I was at an event last night, and we were talking about interventions with AI, and the speaker used one of these scenarios. Let’s say you have an AI that’s doing image recognition, and it’s looking for cancer in X-rays or something, and it says, “I think we’ve identified a cancerous growth here, and based on all the information, the recommended treatment is X,” and the system is 75% sure that that’s the right path.

Brian:  The doctor has three years of experience, maybe, let’s say three or five years of experience, and maybe they’ve seen 100 patients, and they feel like, “No, I actually think Y would be the better one,” and they’re relying on their gut and their experience, which is what we’ve traditionally used in medicine for a long time. They took a rough survey of the room, and the general feeling was, if this were life or death, I’d still go with the doctor, even though the AI has maybe looked at a million scans and its success rate is quite high, versus someone who has less experience and has maybe only seen 100 scenarios.

Brian:  Can you talk about like this friction and these kinds of recommender systems? Any way you might approach the design or the solution, or how do you talk to the customer when you’re building that system about these scenarios? Does this ever come up in your work?

Murray:  I think there’s probably four questions in there. Let me tear them apart a bit.

Brian:  Yeah yeah.

Murray:  Okay, so the first one is that we are getting into how the system made the decision, and understanding the causal model, and the probability that a given change leads to a given outcome, is really important at this stage.

Murray:  So there’s been a lot of really good work in the last couple of decades on causal modeling in general. There’s the recent book by Judea Pearl, The Book of Why, which explains some of this history and some of this thinking. He also recently had an article in Quanta Magazine: to get to real AI, you’ve got to talk about cause and effect, not just matching parameters by training against parameter sets, like in image recognition.

Murray:  So what you try to do now is predict a future outcome, and you don’t have the luxury of doing an experiment. So first, the doctors have not been well trained in this. Most of the people I hang out with would actually trust the AI more than the doctors, believe it or not, because they understand these causal models. Part of the whole revolution of precision medicine is to get to a point where we have these predictive models based on individual parameters that are deeper than the kind of learning that practitioners have, particularly young doctors, which is: they’ve seen a few hundred cases, and they don’t understand the confounding variables very well.

Murray:  So this is an education that just… this is both research and education. We’re just at the beginning of these times. So I would like to believe that in two or three years, people would understand these things better and the doctors would be more comfortable understanding the answer. Now, a better solution in both cases is, if I’m given a 75% chance this treatment might work, and this treatment has risks on its own, and the treatment might kill me, what I would like is more evidence. So a better solution in both cases is not to believe the doctor or the AI system, but to ask the question, “What further evidence would I need to gather, what other additional tests should I take? What kind of biopsy should I have or something that increases the probability of the prediction… of the intervention being more favorable, of getting the favorable outcome of choosing this?”

Murray:  And so the goal at that point is… what the doctor should do is, like in House, the old TV series, he or she should order another test at this point. If the test is expensive, so be it; now we have enough probability of a problem that requires a treatment that the return on investment of a more expensive test really makes sense.
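To make the “gather more evidence” reasoning concrete, here is a rough decision-theoretic sketch. It is not from the episode; every number in it (prior probability, costs, test accuracy) is an invented assumption, and the point is only the shape of the comparison: an expensive confirmatory test can still be worth it when the treatment itself is costly or risky.

```python
# Rough sketch (not from the episode) of the "is another test worth it?" idea:
# compare expected losses with and without an extra confirmatory test.
# All probabilities, costs, and test accuracy below are invented assumptions.
p_disease = 0.75          # current belief after the AI / first screening
cost_test = 5.0           # cost of the extra test (arbitrary units)
cost_treat = 60.0         # cost and risk burden of the treatment itself
loss_untreated = 100.0    # loss if the disease is present and untreated
sens, spec = 0.95, 0.90   # assumed sensitivity/specificity of the new test

def loss_treat_now() -> float:
    # Treat immediately: you always pay the treatment's cost and risk.
    return cost_treat

def loss_skip_treatment() -> float:
    # Do nothing: you pay the big loss whenever the disease is really there.
    return p_disease * loss_untreated

def loss_test_then_decide() -> float:
    # Policy: run the test, treat on a positive result, skip on a negative one.
    p_pos = sens * p_disease + (1 - spec) * (1 - p_disease)
    p_disease_given_neg = ((1 - sens) * p_disease) / (1 - p_pos)
    return cost_test + p_pos * cost_treat + (1 - p_pos) * p_disease_given_neg * loss_untreated

print(f"treat now:         {loss_treat_now():.1f}")
print(f"skip treatment:    {loss_skip_treatment():.1f}")
print(f"test, then decide: {loss_test_then_decide():.1f}  # lowest expected loss here")
```

With these made-up numbers the “test, then decide” policy has the lowest expected loss, which is the shape of the argument Murray is making; with cheaper, safer treatments the comparison can easily flip.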

Brian:  This actually dovetails into my next question. Let’s talk about explainability for a second. If it’s the doctor versus the software, even though that’s not really what’s happening here (we’re all on the same side of the patient, hopefully), suppose there was some explainability around the prediction in the model, and the doctor was able to help the patient work through it and understand why the system came up with this. Maybe that helps the patient understand that this prediction is based on a lot of parameters, personal information about their particular health that factored into it, many more variables than maybe the doctor has seen.

Brian:  I wonder if that is something that the doctor and patient work through together? And what is the role of explainability there? I know you have some strong feelings that we don’t really know exactly how explainability is working in these models. So can you go on a little tangent here about explainability and how it might fit into this situation?

Murray:  Sure. Explainability clearly is really key, and we’re in early days. My strong opinion, which I shared with you earlier, is about the companies out there: IBM, Google, Amazon, or whatever. I’m not pointing fingers at anyone. The big companies right now are sort of jumping the gun and saying they have explainable AI when they don’t. It’s going to take a while to really get there.

Murray:  If you think about it, if you ask a person how they made a decision, often they can’t really explain it themselves. They just say, “Well, I’m just going with my gut.” What does that mean? So explainability of anybody making decisions is actually hard right now. There are two issues here. One is that the early version of the need for explainable AI is just dealing with the problem that people began to notice that AI classification algorithms for things like hiring and giving loans have biases in them. And what they realized pretty quickly is that if the system today has a bias, and you train an AI system to do what the system today does, it’ll reflect the same bias.

Brian:  Right.

Murray:  And so if you give it a training set of how a bank already makes a decision, and the bank is biased, the AI system will become biased, simple as that. Okay, and since a neural net, a deep learning neural net, literally is a black box, you don’t really know what it’s doing. What’s happening is that it has literally maybe a million degrees of freedom in the interfaces between the layers.

Murray:  It has adapted those parameters in the interfaces between layers to match the results of the data, using a gradient descent algorithm of some sort. It’s just pattern matching, just like any other kind of regression, but it has matched so many different parameters that we don’t know why, that we don’t have…
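As an aside, here is a minimal sketch of the “pattern matching with gradient descent” idea Murray describes, shrunk down to two parameters so it fits on a page. The data and learning rate are invented for illustration; a deep net performs the same kind of update over millions of parameters, which is exactly what makes its decisions hard to explain.

```python
# Minimal sketch of fitting parameters by gradient descent on squared error.
# Data, learning rate, and step count are illustrative assumptions only.
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]  # (x, y) pairs
w, b = 0.0, 0.0            # the model's two "degrees of freedom"
learning_rate = 0.05

for step in range(2000):
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y              # prediction error on one example
        grad_w += 2 * err * x / len(data)  # gradient of mean squared error w.r.t. w
        grad_b += 2 * err / len(data)      # gradient w.r.t. b
    w -= learning_rate * grad_w            # nudge the parameters downhill
    b -= learning_rate * grad_b

print(f"fitted parameters: w={w:.2f}, b={b:.2f}")  # roughly w=1.9, b=1.1
```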

Murray:  So looking at the neural net and understanding how decisions are really being made by it, that’s still an open field. It’s going to involve, in the end, two things. One is that we’re looking much more deeply at information geometry, information manifolds, and essentially the deeper mathematics of the structure of these neural nets. There’s some really good work being done in the academic community right now, and that’s really going to help.

Murray:  We’re looking at that in various other places as well. Small companies are looking at it, and presumably it’s being looked at inside the big companies too. So that’s one thing. The other is… let me just make this other point.

Brian:  Sure.

Murray:  It’s that we’re not going to have explainability until we also include the causal models and the stuff that Pearl and his colleagues are writing about. It’s the combination of information geometry and causality which is going to lead to explainability. And we’re in the early days. I’m really looking forward to it; it’s going to be great when we understand this more deeply.

Brian:  I guess I would challenge one thing here. I totally understand the issue with bias: if the training data was biased in some way, then explainability is only as good as what it learned from. Garbage in, garbage out.

Murray:  The explainability there is really very simple: it matched the bias in your training set. There, I’m done. Now what?

Brian:  Right, so are you saying that because of this… I would say, understanding the risks of bias that may be in the system to begin with, if you have a model that can take advantage of having explainability as part of the experience and the display that happens there, I would posit that it’s net better than the complete black box situation, because there are probably times when the explanation that is provided still requires human involvement.

Brian:  There might be that level-two analysis that the person does, especially if they get a recommendation or an explanation that they don’t agree with, or find suspicious, and they may just say, “Hey, this prediction feels weird. I want to feed that information back to the team that created the models, because maybe we’re discovering where we have some bias.” Do you not feel that’s net better than a complete black box situation?

Murray:  Of course. The point is, we have to open up the black box, and there are some ways to do it. That was essentially my point: yes, explainable AI is very important, but it’s not simple. So a way to look at bias is that there’s some sort of latent variable in there lurking behind the decision process. That’s the language of causal models, where you’re looking for the confounding variables.

Murray:  And if you suspect race, as an example, there are techniques for deconfounding the model for those kinds of biases, if you can identify them. You’ll find that in the causality literature: confounding and deconfounding of the data. That’s really cool stuff, and it’s not being used enough yet, but it’s on the rise, and we’re going to see a lot more of it going forward, which is all great.
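For readers who want a feel for what “deconfounding” can look like in practice, here is a minimal sketch of one of the simplest techniques in that literature: stratifying on a suspected confounder and averaging the within-stratum differences, instead of reading the raw difference off the pooled data. The records, column meanings, and confounder are all invented for the example; they are not from the episode.

```python
# Illustrative sketch of adjusting for a suspected confounder by stratification.
# Data and variable names are invented for the example.
from collections import defaultdict

# Each record: (confounder value, received_intervention, good_outcome)
records = [
    ("urban", True, True), ("urban", True, True), ("urban", False, True),
    ("urban", False, False), ("rural", True, False), ("rural", True, True),
    ("rural", False, False), ("rural", False, False), ("rural", False, False),
]

def outcome_rate(rows):
    return sum(1 for r in rows if r[2]) / len(rows) if rows else 0.0

# Naive (potentially confounded) comparison: pool everyone together.
treated = [r for r in records if r[1]]
control = [r for r in records if not r[1]]
naive_effect = outcome_rate(treated) - outcome_rate(control)

# Adjusted comparison: estimate the effect within each stratum of the
# confounder, then average the strata weighted by how common they are.
strata = defaultdict(list)
for r in records:
    strata[r[0]].append(r)

adjusted_effect = 0.0
for value, rows in strata.items():
    weight = len(rows) / len(records)
    t = [r for r in rows if r[1]]
    c = [r for r in rows if not r[1]]
    adjusted_effect += weight * (outcome_rate(t) - outcome_rate(c))

print(f"naive effect:    {naive_effect:+.2f}")
print(f"adjusted effect: {adjusted_effect:+.2f}")
```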

Murray:  The other thing, though, is that there is latent structure. If I look at the black box and try to open it up and understand the structures within it, in other words, is there a geometry to this model? And is that another way to explain how these confounding variables are coming into the picture?

Murray:  And there’s research going on in that too, and that’s what I was mentioning about information geometry. Right now these are complete black boxes; we don’t want them to be, and we want to understand what their deeper structure is. That involves information theory and some modern ways of looking at the geometry of the parameter spaces.

Murray:  There’s something called topological data analysis, information geometry, stuff like that, which is going to give us some deeper insights into the effect of those confounding variables. So all I’m saying is yes, let’s not jump the gun, let’s not pretend we have it yet. Let’s work on it. And in the meantime, having a person in the loop is always a good thing.

Brian:  We had talked a little bit about the… I forget how I came up with the 737 Max situation. I had read an article at the time, and somehow that came up when we were first talking, and you had some opinions about designing the system, especially a system which, in this case, was opaque to the customers.

Murray:  Yeah

Brian:  And it caused death, unfortunately. It was a poor choice.

Brian:  You have some strong feelings. There was a New York Times article that just came out that really unpacked, at least from the writer’s perspective, what happened in that scenario, and my general takeaway was that there is fault. He felt there was strong fault on the training side, in terms of who the airlines are putting in the cockpits to run these systems. And there was also an interface problem, in how this system was opaque to the operators.

Brian:  So can you talk a little bit about this, and some of your opinions about designing the system, an “intelligent system” that’s supposed to be helping the human operator?

Murray:  There’s no single fault. There are lots of faults that went into the 737 fiasco, which in my view is still playing out. So here’s the situation: Boeing needed to put a new kind of engine on a fuselage for which it wasn’t designed.

Murray:  They essentially jury-rigged that by saying, “Okay, we’ve now taken a stable aerodynamic system and made it less stable by putting on these engines for which it wasn’t originally designed. We’re going to treat this as an upgrade, not as a whole new system. We’re going to fix the instability with one sensor and some software. And we’re going to put this out to the whole world, and we know that out there there are going to be pilots with a whole range of capability, but, by the way, we’re not going to tell them we’re doing this.”

Murray:  Okay, so this opens up all kinds of issues about failure processes, and now they’re opening up a safety board, and whatever. Blaming the small airlines for not having enough training is really missing the big point in this. So I was not impressed with the New York Times articles. I think they really missed the whole main point. So what happens when you’re designing an airplane-

Brian:  Actually, let me stop you real quick. Just in case listeners haven’t read up on this, I’m going to summarize it real quickly. Feel free to correct me.

Murray:  Sure.

Brian:  The long and short of it, as I understand it: the 737 has been around for a long time, and they wanted to create a “better version” of it.

Brian:  And if you’re an aircraft manufacturer, you don’t want to create “a new airplane,” because as soon as it’s a new airplane, it has to go through a much longer checklist of approvals before it can be flight ready. And so instead, you call it a 737 Max, and you make your adjustments to a current plane.

Murray:  Right.

Brian:  So it’s more like an adjustment and you don’t have to go through the same hoops.

Brian:  So in this case, it was more fuel efficient, with better engines. But because of the new hardware they wanted to put on the plane, this better engine system, to get those efficiencies they needed to counter some of the aerodynamic effects by putting in what turned out to be an automated software system that would control for some of the new variables the engines introduced. Am I correct? And that’s kind of our starting point for this.

Murray:   Close enough. What they did, the new engines introduced instability in, I think, one of the three axes of how an airplane rotates. It was the attitude. What you want is that an airliner will essentially glide even if an engine fails; it has to be really robust. A jet fighter is a whole different thing; you want those to be able to turn on a dime and win a dogfight.

Murray:  For an airliner, you want it to be very stable, not very agile, but safe. It would probably do really badly in a dogfight. So what happened is they broke their standards of stability, essentially. And they knew that, and that’s why they put in this automated system, which was going to account for that by adjusting the flight surfaces.

Murray:  So now what we have is something a pilot would normally handle: the plane is at the wrong attitude, that is, it’s pitching up or down incorrectly. The pilot is supposed to detect that, make the correction, and manually override it. And so the total system is a combination of the plane, its sensors, its software, and the pilot, and it’s supposed to do the right thing under the right circumstances.

Murray:  The problem, from what we seem to have detected, and by the way, some of the pilots who had these problems were pretty well trained, is that they would do what they were trained to do, what they should have done, and they didn’t know how the plane was going to react to those actions. And so the interface between the plane and the pilot was broken. A predictable action of a pilot led to a worse problem for the plane.

Brian:  Right.

Murray:  And the designers of the interface should have seen this coming, and they should have either put in very severe training requirements, which they didn’t, or designed a system which, when it gets into instability, essentially leaves the pilots able to do the right thing if it can. It turns out that with this thing, there was probably nothing the pilots could have done.

Murray:  In terms of interface and systems design, what you want is to imagine that the operator and the vehicle are really together a total system, which is supposed to behave in some sort of effective, safe manner. And you design the interface to the operator in a way that reinforces that. You see that in modern cars with lane keeping, automatic speed control on the highway, and a whole bunch of other things that reinforce the right behavior of the driver.

Murray:  And eventually, we’re going to get to more and more autonomous driving, particularly in long-haul trucking or something where, in fact, there may not be any driver at all. But in the meantime, what you want is that the feedback loop between the driver and the vehicle is such that they both can play their roles.

Murray:  So here’s another example. Suppose you’re in the air and the radar detects that a collision is imminent. The right thing for the plane to do in this case is to veer upwards and to the left, because that’s the convention. When two planes are about to hit each other, there’s some convention about how they’re supposed to change course to avoid hitting each other.

Murray:  You want the plane to do that automatically, and you probably don’t want that overridable by the pilot. In designing that level of experience and interface, you want the pilot to understand what the plane is doing, be comfortable with the fact that that’s what’s happening, and be glad for it. So you design a system like that.

Murray:  That make sense?

Brian:  Yeah, yeah, no, I understand what you’re saying.

Murray:   And they didn’t do that for the 737 Max.

Brian:  Yeah, I mean, it was unfortunate. I think the article is worth a read, whether you see it as a training issue, putting pilots in the air with not enough experience, or mostly a software issue, or a combination of the two. It definitely…

Murray:  Well.

Brian:  Go ahead.

Murray:  Yeah, I think blaming the pilots is just the wrong thing. The aircraft manufacturer knew the population of pilots they were selling this plane to, and they should have accounted for that.

Brian:  How do you handle that situation, then? I don’t want to speak on behalf of Boeing, obviously I can’t in that case, but there’s always this balance between a prescriptive, automatic system that’s going to predict and potentially enact some kind of action, versus that human intervention piece.

Brian:  So is there a process you think about when you approach this with a client, for example, about when and how much human intervention we are going to allow in the system? Especially if it’s, do we want to expose that knob and let the end user dial it in however they want, or is it really a knob we shouldn’t expose, but we’ll give them a different knob that they can work with?

Brian:  I’m trying to simplify this down a bit.

Murray:   Yes.

Brian:   But there are times when you probably do want that human to be able to override, and times when you don’t. Is there a general way you go about doing that in your work, when working with a customer who’s asking for a system that does some automation? How do you approach that?

Murray:  Yeah, so first of all, you explicitly design these scenarios as part of the specification of the system. And this is what UX people do. So when you’re looking at systems decomposition, you look at the total system and then you decompose it into the vehicle and the operator, for example. And you apply the discipline of systems engineering with that kind of decomposition.

Murray:  But to answer your question more fully, that’s why there are simulations. What you do is you build a faithful simulator, you run a bunch of scenarios, you do system tests with simulations, and you have pilots flying the simulated airplane under enough of these different scenarios to validate that the system is airworthy and that pilots will react appropriately. So you test it just like anything else.

Murray:  And the answer is: here’s the spec, you run system tests with simulations. And if people are crashing the airplane in the simulations, you update the design and [inaudible 00:31:58]. Really, it is a standard kind of thing. It’s a discipline that somehow got lost somewhere in the product. I don’t know exactly what happened inside Boeing, and I probably shouldn’t speculate; I’m not an expert who has studied all the evidence. I’m drawing some conclusions from the outside looking in.

Murray:  But this is the sort of… The reason you have those kind of systems is to avoid exactly these scenarios that happened to the 737 Max.

Brian:  Yeah, I mean, you basically summarized the same thing that most software designers, good ones at least, the ones doing what we should be doing, are doing: validating the system that you’re designing as you go.

Murray:  That’s right.

Brian:  And running… we usually call it a usability study. And sometimes it’s not understood by non-designers, but you’re not testing the people, you’re actually testing the system.

Brian:  You’re testing often the interface.

Murray:  Exactly.

Brian:  It’s not a test. In fact, sometimes they tell you to avoid using the word “test” when you’re talking to a participant; you tell them it’s a study, and we’re here to evaluate a piece of software, or in this case a cockpit, to figure out if it’s the right design or not, so that they don’t feel like they’re a rat in the maze. It’s really studying the maze. And so…

Murray:  That’s perfect. That’s exactly right, and it’s so important that this idea keeps getting out. See, when you’re putting out a system like the aircraft, you have control over how the aircraft works. You don’t have control over who’s going to fly it.

Brian:  Hopefully, there’s some control, right? I mean, I know what you mean but…

Murray:  Well, no, you don’t. I mean, there are regulations and whatever, but you can’t stop it. So the thing is, what you do have is: you know something about the pilots. You know they will have a range of experience, a range of capability, and there is some likelihood, small, that you’re going to put a pilot in there who’s barely capable of flying this kind of plane.

Murray:  The odds of that eventually happening are very high, and the plane should be robust enough to deal with that. So what you do is you plan for the range of capabilities, and the more airplanes you sell, the higher the probability that one of these poor pilots is going to end up flying it. And this happens, by the way. What’s interesting is there was a whole business with Toyota and their quality control, but actually their quality control was probably par for the industry.

Murray:  But they sold more cars than anybody else, so more problems began to show up because they had more cars in the field. What they realized was that they had a higher responsibility than the average carmaker to make their cars safer, because they were selling them to a bigger population. This is really fundamental to understanding user experience and design: you ask the question, who is the population of people who are going to use this, and what is their range of capability?

Brian:  Yeah, and testing the system with one person is not usually enough. A lot of times people ask, “Well, how many people do you need to study?” My favorite answer to that is, “How many will it take for you to believe that the system needs changing, or is wrong?” There’s some math there, but typically you can test somewhere in the range of five to 20 people, and you’re going to start seeing patterns emerge pretty early.

Brian:  And if you keep seeing the same issue over and over, then you probably know it’s time to make an adjustment. But I don’t know that the people working on all these systems are always thinking about this kind of validation; I don’t hear it. This is normal for the design world. It’s kind of normal at this point for most software design that you’re doing some type of validation work.
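As a side note, the “math” Brian alludes to is usually sketched with the classic problem-discovery model, where each participant has some probability of hitting a given usability issue. The sketch below is illustrative only; the 0.31 hit rate is the often-quoted industry average, not a number from the episode.

```python
# Back-of-the-envelope sketch of the "how many participants?" math.
# p_hit is the assumed chance one participant encounters a given issue;
# 0.31 is the commonly cited average, used here purely as an assumption.
def share_of_issues_found(n_participants: int, p_hit: float = 0.31) -> float:
    """Expected fraction of issues seen at least once after n sessions."""
    return 1.0 - (1.0 - p_hit) ** n_participants

for n in (1, 5, 10, 20):
    print(f"{n:2d} participants -> ~{share_of_issues_found(n):.0%} of issues surfaced")
```

With these assumptions, five participants already surface roughly 80 to 85 percent of the issues, which is why patterns tend to emerge early in the five-to-20 range Brian mentions.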

Brian: It’s not so much the case with the data science folks working on some of these kinds of systems, especially if there’s not a large user interface component. So, I know part of your work is helping clients develop data science teams. Tell me a little bit about some of the non-technical skill gaps that you’re seeing with people in the data science and analytics field. Is this one of them, and what are some of the other ones that you see?

Murray:  What’s really interesting about this is that, like in the simulation tests, you’re not going to test 100,000 pilots.

Brian:  Right.

Murray:  And so what you have is a problem, which you brought up, that is really a small data problem: how do you draw the right conclusions from smaller data sets? This is where Bayesian techniques are really important, and these causal models are based on Bayesian techniques.

Murray:  The way I like to put it is, it’s not garbage in, garbage out; it’s uncertainty in, uncertainty out.

Brian:   Right.

Murray:  Right, and the question is, how many people do you have to test? If you’re smart, are you testing an inhomogeneous population? If you’re only testing expert pilots, that’s not going to teach you very much. So you put attention into stratified methods of dealing with more carefully selected populations.

Murray:  Right now people are trained to work primarily with big data sets, and we run them through these deep learning algorithms. The skills you need are different when you’re dealing with smaller data sets where you can’t run big blind experiments. The other thing you can’t do is put 100 good pilots and 100 bad pilots on two different planes and see how many of them crash.

Brian:  Right.

Murray: Maybe you can in the simulator.

Brian:  Right.

Murray:  Right, but the lab experiment you can’t do. So one of the things we’re not seeing a lot of yet, but should be, is introducing causal models into these organizations where people have been trained in big data techniques. One of the things I find myself doing a lot is saying, okay, we don’t have the data for that, so what can we do with the data we have, or data we can afford to get, using more sophisticated probability theory? That’s the first thing.
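To ground the “uncertainty in, uncertainty out” point, here is a minimal Bayesian sketch of learning from a small number of simulator runs. The prior and the pass/fail counts are invented assumptions, not data from the episode; the point is that the small sample updates a belief while keeping the remaining uncertainty visible instead of hiding it.

```python
# Minimal Bayesian sketch: update a Beta prior on the probability that a pilot
# handles a failure scenario safely, using a handful of simulator runs.
# The prior and counts below are illustrative assumptions only.
from math import sqrt

prior_alpha, prior_beta = 2.0, 2.0   # weak prior: "could plausibly go either way"
successes, failures = 7, 3           # e.g., outcomes of 10 simulated scenarios

post_alpha = prior_alpha + successes
post_beta = prior_beta + failures

mean = post_alpha / (post_alpha + post_beta)
var = (post_alpha * post_beta) / ((post_alpha + post_beta) ** 2 * (post_alpha + post_beta + 1))

print(f"posterior mean success rate: {mean:.2f}")
print(f"posterior std deviation:     {sqrt(var):.2f}  # still wide after only 10 runs")
```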

Murray:  So I ended up doing a lot of that, because that’s really all over the place. The second thing, which I just generally find, is again sort of about the maturity of the field. Neural nets and the like work better than we… we don’t understand why they work as well as they do. What happened is we just sort of got across some computational barrier, and suddenly they started working because we made them big enough, okay?

Murray:  And then we started training a bunch of people on how to build big neural nets with all the ad hoc techniques that are out there. And right now, working in these various neural net environments like TensorFlow or whatever, it’s really mostly a matter of trial and error in building and applying them. I wouldn’t call it an art, but a craft.

Murray:  It’s not yet a science. It’s like the early days when they were building engines before they understood what thermodynamics was. We’re in that stage of this industrial revolution we’re going through with augmenting people’s ability with machine learning, but right now it’s more of a craft than a science. And so we have people out there who are really good at working with these techniques and algorithms, but they don’t necessarily understand; they’re essentially a solution looking for a problem.

Murray:  The people with the problem will just start saying, “Well, let’s throw AI at it.” A lot of what will happen, and we’ve seen this over and over again, is that people will throw the algorithm types that they know at it. It’s the hammer looking for a nail, when they don’t have the depth to understand that that particular technique isn’t likely to work with this kind of problem.

Murray:  And so meanwhile, the executives don’t know any better. Getting to a point where we have senior data scientists who have the experience to know which of these various techniques and variants to use, and how to pose the problem with the executives in a way that it can be answered, that is still an evolving capability in the field right now.

Murray:  And so I find, and I think you do this in your own practice, that having seminars and webinars where the various stovepipe functions learn how to collaborate and develop an answer that meets the business need, and is still feasible with the kind of data they have, really helps. The other thing that you’ll often see is, “I have this kind of data, it must be useful for something.”

Brian:  Right.

Murray:   And I much prefer the other question: I have this question, what kind of data do I need to resolve it? What kind of techniques, and how is that done? And maybe I need to be collecting different data than the data I have. So working, again, outside in is a technique that a lot of these people are not taught. They’re taught, “Here’s your data set, do something with it,” not, “Here’s my problem, what data do I need in order to get what I want?”

Brian:   Yeah, I hear this a lot. My perception is that business stakeholders sometimes think, if I hand the data warehouse to a data scientist, they will go look at all the stuff that we have in there and then they will tell us what kind of magic we can create from it.

Murray:   Mmhm.

Brian:  And the data scientist’s perception on the other end is, “I’m waiting for you to give me a great problem, and I’ll let you know what data we need to solve it when you hand it to me.” And so everyone’s scratching their heads, and we get these 85% failure rates on AI and big data projects, because there’s something missing here, which is understanding what problems we could potentially solve. What needs are there? What are the business objectives? What do people need or want to do with these systems or products?

Brian:  And that’s a space where I think designers can help out. We provide that. The deep empathy part of it really helps you ground some of this work.

Murray:  Right.

Brian:  And so that is partly what I try to work on, at least with the seminar: how do you learn to ask the right questions back to a business person if they’re giving you a vague request like, “Can we have some machine learning? Does our product have ML in it? We want to be able to say it has ML in it.”

Brian:  It’s not there, right? But how do you unpack that and get to something where maybe there is a viable way to use ML to create a better experience or some business value, when they don’t know how to ask the question yet? So, back to what you were saying about the skill gap here, is it the situation where the young data science talent only knows how to use, like,

Brian: “I’m looking for places to use a random forest classifier because that’s what I learned”? Or do they have wide enough technical knowledge about the different models and types of algorithms to use, but they’re not asking the question? They’re going in with their favorite one, or the one they know best, because they’re not really sure how to use the other ones.

Brian:  So they try to shoehorn it into every problem they see. Is it a lack of technical skill, or the soft skill piece of figuring out the problem space? Can you unpack that a little bit?

Murray:  Well, first of all, of course it varies, right? But I’ve seen people… you look at these books that say, “Learn TensorFlow in a day,” right?

Brian:  Yeah.

Murray:  You’ve seen these things out there, so that’s the one end. The other end is PhD students who understand these things very deeply, and they’re right now being swept up by the big companies, which is a good thing in a way, right?

Murray:  So that range exists. What I find is that even with solid people, there are so many issues with vocabulary and overloaded terms that neither side has actually been trained to talk to the other.

Brian:  Right.

Murray:  And it’s more like that. And this is what you do, I think. One of the techniques I use when bringing machine learning into an organization is to have the members of the team write a science fiction story.

Murray:  The whole point is, you’re trying to create a vision of what the enterprise would look like in five to 10 years because they have machine learning. What is it the enterprise can do that they can’t do now? What is it they need to have done that they can’t do now? And you may discover that it’s just achieving better performance. It’s the idea of, we’re going to minimize the amount of time someone waits for a train at all times of day.

Murray:  And so we’re going to have a much more sophisticated allocation of resources than we have now, where it’s always on the same fixed schedule, or something. And then we start asking, okay, what would a system look like that fulfills this vision? And we work our way down to various AI approaches to try. But again, it starts at the usual place, which is that everyone should have the same vision of where they’re headed. And I find the story technique is a way to do that.

Brian:  I think that’s super awesome. I love that. We sometimes, or at least I sometimes, in my own practice call them North Star designs. And part of it is…

Murray:  Mmhm.

Brian:  Especially when you have technical people in the room at the early stages, when we’re doing discovery and even starting to sketch solutions out, the idea is to take the implementation hats off and just put them aside.

Murray:  Yes.

Brian:  Because you don’t know…

Brian: You may come up with a moonshot, but from the moonshot you might find out there are these little tangents that are actually feasible things. If you never let yourself dream that big, you’ll never hit the small incremental steps that you may be able to take. So I love the idea of thinking big, and then reality will always kick in. Feasibility will kick in very soon, but there should be some time

Murray:  Right.

Brian:  To let our minds go big.

Murray:  Yeah and then you drill them back down

Brian:  Right.

Murray:   To reality. But until you know what that is, you really can’t ever get to a big-picture plan. So what kind of organization would I need now to implement this vision? And then you get into those kinds of issues as well, but the vision is where you start.

Brian:  Yeah. Murray, this has been a really fun conversation. I wish we could keep going; we’ve covered a lot of great topics here. If someone wanted to follow your work, do you write much? Your LinkedIn, any social media, how could people stay in touch with you?

Murray:  I do write. Pattern Computer is about to publish one of my white papers. The best way to reach me is just on LinkedIn. I have an admirable number of followers, and anyone who wants to connect with me, I’m happy to have them do that. I don’t update my profile all that often, but I do put out papers, and I post them on LinkedIn when I do.

Brian:  Awesome, great.

Murray:  That’s probably the best way to find me, Murray Cantor, C-A-N-T-O-R.

Brian:  Great. Well, I will put your link in the show notes. And thank you so much for coming on Experiencing Data and sharing your background. It’s been really fun to talk to you.

Murray:  Oh, Brian, it’s been a real pleasure. Thank you for having me.

Brian:  Take care.

 
