Tony Muschara
EP
70

The Critical Steps in Workplace Safety

In this episode, Mary Conquest speaks with Tony Muschara. Tony discusses how risk-based thinking and managing critical steps can enhance workplace safety. He provides EHS professionals with practical guidance on how to ensure human error doesn’t cause significant harm to an organization’s people, products or property.

In This Episode

In this episode, Mary Conquest speaks with Tony Muschara, a Human and Organizational Performance (H&OP) specialist and owner of Muschara Error Management Consulting.

He’s also the author of several Safety books, including ‘Risk-Based Thinking’ and its follow-up ‘Critical Steps’, which is the main focus of this fascinating interview.

Tony begins by exploring the role of an EHS professional and reveals his strongly-held views on where responsibility for workplace Safety lies within an organization.

He then describes the 4 elements of risk-based thinking (anticipate, monitor, respond and learn), before sharing his recent research into the importance of critical steps in Safety management. 

Safety professionals will learn what critical steps are, how they can be identified, and what can be done to ensure they are completed successfully - first time, every time - to avoid significant harm.

This discussion is packed with wisdom from Tony’s distinguished career in Safety, including why aiming for excellence isn’t always good enough, when to think slow, how to fail safely, and his mantra: “If you don’t have doubts, you haven’t been paying attention”.

Your approach to Safety will definitely be enhanced by paying attention to this interview!

Transcript

- [Mary] Hi there. Welcome to "Safety Labs by Slice." Today's guest is a longtime safety professional and specialist in human and organizational performance, or H&OP. Over the years, he's seen many theories and approaches come and go, but one thing he felt was missing was an understanding of a risk-based approach to managing human performance.

So, he wrote a book about it, as one does, and he's here to talk about the ideas behind that book, which is entitled, "Critical Steps." Tony Muschara's background includes seven years of active duty in the U.S. Navy Submarine Service and degrees in mechanical engineering and business. In his time with the Navy, he became certified as an engineer of naval nuclear propulsion plants by Naval Reactors.

He's authored several nuclear industry publications while employed with the Institute of Nuclear Power Operations, including the "Human Performance Reference Manual," which was adopted with revisions by the U.S. Department of Energy as Volume 1 of the "Human Performance Improvement Handbook." Since 2008, Tony has been the principal consultant and owner of Muschara Error Management Consulting.

He's a certified performance technologist and the co-author of several books, including "Risk-Based Thinking" and its follow-up, "Critical Steps," which is what we're discussing today. Tony joins us from Atlanta, Georgia. Welcome.

- [Tony] Thank you. I appreciate it.

- So, I'd like to start a little bit outside the book, but with a question about the role of the safety professional. When we last spoke, you mentioned that you think a safety professional is an advocate throughout the organization for understanding risk. Can you elaborate on that a little bit?

- Yeah. Too often, managers tend to cast off their primary responsibility for safety as a collateral duty to a person, you know, an EH&S staff person. And I always tell this story to line managers when I have the opportunity. And I'm a big fan of hamburgers.

I love hamburgers. And I frequent a number of specialty hamburger shops here in the North Atlanta area. But I always use Burger King as an example because you can have it your way, right? You can have it your way. And I'm not a big fan of lettuce. So, I go into a Burger King and order a burger with no lettuce, and then lo and behold, I get a hamburger in my hand that has lettuce on it.

And so, you know, I'm a little miffed about having a burger that's not my way. So, I get upset and I walk into the kitchen and I start talking to the hamburger maker. Well, the next thing you know, I'm being arrested by the police. This didn't actually happen, but you can...

- Okay, good. I was going to ask.

- You can speculate what would happen if I did such a thing. But I said, well, is that the right thing to do? Should I go directly to the worker and complain about a product that was delivered to me that wasn't what I wanted? Who should I talk to? And everyone usually responds, "The manager," you know, the store manager. And it's the manager's responsibility for quality.

It's the manager's responsibility for safety. And so I want to make sure that they understand that if that's their responsibility, they have to have an understanding of how safety happens or how quality happens in their production process. And so I believe that part of the safety professional's duty is to help the line managers understand that safety is not the safety professional's responsibility; he or she is a facilitator who helps the line manager understand what needs to happen when critical or safety-critical activities are being performed or functions in various operations are in progress.

Too often, I've sat in planning meetings where the line manager said, "Okay, let's start with our safety moment. We're going to start with a safety moment." And someone speaks for about two, maybe three minutes about safety, and then everybody can breathe a sigh of relief.

"We've accomplished our safety objective, yep, we've checked that off. Now we can talk about production." And that mindset is antithetical to safety. Safety happens exactly the same time production happens. I always talk about work. Work is a use of force to create value. And so if you're creating value, you're producing something, you're manufacturing something, some operation is occurring, that means that there's a hazard in play.

If there's a hazard in play to create work, which is force over distance, in other words, something changes, then you better be thinking about safety at the same time you're doing production. So, I see the safety professionals... one of the safety professionals' primary duties to the organization is to educate and train the line managers, but not to take on the role or the responsibility for safety.

That's not their purview. It's important for managers to maintain that responsibility.

- Yeah, it's impossible for one person, one isolated person, to do safety for a whole group.

- Right. Right.

- Okay, so let's get into the book. What was the impetus for writing the book? What were you seeing or not seeing that led you to do this?

- I'm going to go back to my first book. My first book was "Risk-Based Thinking: Managing the Uncertainty of Human Error in Operations." And I've been challenged on that title, you know, Managing Uncertainty. Some people believe that you cannot manage uncertainty. I tend to disagree, but "Risk-Based Thinking"... and to be honest with you, in retrospect, perhaps that might not have been the best title for this book.

But what I wanted to communicate to line managers, who I targeted as the primary audience for this book... By the way, this was published back in late 2017, early 2018, thereabouts. And what I realized is that most books on safety tend to be written by academics, and they tend to use words and terminology and vocabulary that's oriented towards other academics.

And so, I wanted to write a book that was oriented towards managers, especially line managers who had the responsibility for safety. But risk-based thinking is a term or a phrase that's also used by ISO, the quality folks, the quality control folks.

And so there's no connection, really there's no connection. In fact, I chose that title before I even knew that ISO used that phrase. So, I didn't do my homework to see if that phrase was already used, but there were no books by that title, so I got it first.

In fact, I also own the website riskbasedthinking.com, so nobody had that domain name either. But anyway, the title communicates that this is all about risk. This is about risk. And too often, managers have tended to think that avoiding human error was the challenge. We don't want people making mistakes.

And in large measure, most of the safety professionals out there thought the same thing. And, you know, remember, this... I started writing this book back in the mid-2000s, you know, 2005, 2006, while I was still working at the Institute of Nuclear Power Operations. And so I had a rough draft back in those days, but then I started my own business, got busy with that. And so it took that long, took 10 years for me to write this book.

But the idea of "Risk-Based Thinking" is based on the four cornerstones of resilience engineering, which Erik Hollnagel published several times in the resilience engineering books that came out over the last 15, 20 years. And it's fundamentally four habits, what I call habits of thought: anticipate, monitor, respond, and learn.

And those are the four mental postures that Dr. Erik Hollnagel had identified as being the hallmarks of organizations that are doing well in safety. And I communicated with him on these issues. And so he doesn't push back on my approach to using it that way, but I wanted to communicate to line managers that here's how you do risk-based thinking.

You know, how do you anticipate, know what to expect, when you're about to do high-risk work? Second is monitor: what do you pay attention to? Know what to pay attention to when you're doing the work, doing high-hazard work. This is where critical steps come in, which is my follow-on book that I'll talk about later, but also the key assets that you want to protect during an activity.

What are those critical safety parameters that you want to stay within to preserve the integrity of an important asset that you don't want to lose control of? And so what do you pay attention to?

The third element of "Risk-Based Thinking" is respond. How do you respond? What do you do to maintain positive control of that activity when you're actually doing work, when you're transferring energy, or you're moving matter, or you're actually transmitting information, such that if you did lose control, you could cause harm.

And so to me, you know, safety's a physics problem. It's fundamentally a physics problem where if you do lose control, you could cause harm. Something changes in a way that you don't want it to change, or you exceed a critical parameter for a particular asset.

And then if you do lose control, another aspect of responding from a risk-based thinking perspective is how would you fail safely? What if you do lose control, what options and alternatives, resources do you have? Barriers, safeguards, what are in play here that if you do lose control of the energy or the matter, how can you fail safely?

Mitigate the harm if you do start to experience harm. And then the fourth element, which is really concurrent with all the preceding three is learn. So, anticipate, monitor, respond, and learn. And now, Erik Hollnagel, he basically looks at learning as a historical...from a historical perspective.

You know, how have we had problems in the past? What I call operating experience. How have we lost control of critical functions, critical processes in the past? Well, I include that, but I also look at what's happening now, which is basically situation awareness. And that's related to monitoring.

Learning is related to monitoring from a situation awareness perspective. And then finally, when the job is over, let's look to the future: what do we need to change in the system? How should we realign the system such that we don't lose control, or we're able to fail more safely than we did this time?

Or let's make sure that everything goes right the first time every time. And so, I look at learning as past, present, and future. So, those are the four elements of risk-based thinking that I want managers to adopt.

- Let's move in or home in a little bit on the critical steps, because you have a view on...first of all, let's talk about critical. Some steps aren't critical and some are. And you define this very clearly in the book. So, what do you consider a critical step?

- The basic definition that we specify in our book... and by the way, I co-authored "Critical Steps" with Ron Farris and Jim Marinus. Both of those individuals are former employees of the Department of Energy. They both work out West, but they're also former Navy nukes. So, all three of us are former Navy nuclear operators.

And so we all three have an operator mindset. But the definition that we land on is: a critical step is a human action that will trigger immediate, irreversible, intolerable harm to some asset. Notice, I'm not talking about hazards.

I'm talking about assets. And that's a mindset change for some people. I would say for your safety professionals, there's a tendency to want to look at, "Okay, let's find all the hazards," you know, the hazards in the workplace that we're going to be working or the hazards associated with that particular task. And that's good.

I'm saying that's not a bad thing, but critical steps is focused on assets. What are the assets, including personnel safety, that we need to pay attention to? And I usually, in a generic way, refer to assets as people, product, and property, you know, in a general categorization. You know, there's various assets.

You know, it may be time, maybe time is an asset that we have to protect. But fundamentally, let me repeat the definition. A critical step is that human action, so one action, that will trigger immediate, irreversible, intolerable harm to an asset if...it's conditional, if that action or a preceding action was done improperly.

Now, that phrase, that "preceding action," that kind of trips people up because this is presuming that earlier actions have been done correctly. All right? And so it's always clarifying to use skydiving as an example, all right? And so you're in an airplane flying at 13,000 feet above the earth, and you have this parachute on.

And what's the aim of skydiving? Jumping out of the aircraft. You know, you jump out of this aircraft and hopefully, you know, the parachute has been packed or rigged properly and it's functional, and that you have the knowledge, you have the understanding of how to deploy that parachute.

So, the critical step is actually jumping across that threshold of that open door of a perfectly good airplane. So, you can't decide after you've jumped that, "I don't want to do this. I want to go back." So, it's irreversible. Now, it's not immediate because there's a certain time between leaving the aircraft and hitting the earth, but in most cases, we're talking about immediate consequences.

And if the parachute does not work, and you don't have an emergency backup parachute, the obvious consequence is death. And so that's what I mean by a preceding action. The critical step is jumping out of the plane, but a preceding action is donning that parachute and making sure that parachute was rigged properly.

That's a lot of words to define critical steps.

- So, when it comes to critical steps, you talk about there's a problem with the concept of aiming for excellence. What's the problem with aiming for excellence, specifically when it comes to critical steps?

- All right. That's a good question. I always challenge managers because that's a common initiative that you see in many organizations, especially productive manufacturing organizations. And excellence, it sounds good and it looks good on paper. It looks good to the paying public.

But fundamentally, excellence is not what you want for a critical step. So, I always ask people, you know, when your company pays your salary, how much of it do you want? Is 99% good enough, or is 99.5% good enough, or is 101% good enough? That's actually bad news for some people because, you know, the company expects you to pay it back if they pay you too much.

Or say you have a loved one that's scheduled for intrusive surgery and they're being rolled into surgery. What's your expectation for the surgeon who's got a scalpel poised over your loved one? Are you looking for excellence there, or are you looking for perfection?

So, there's certain activities that must go right first time, every time, or else you're going to suffer harm. And so there's a quote that Winston Churchill is noted for saying. He says, "It's not enough that we do our best..." 99%, you know, human reliability is 99.9%. He says, "It's not enough that we do our best. Sometimes we have to do what is required."

All right. And there are situations where you have to get the job done right, first time, every time, and that sounds like quality. And so there's some overlap with critical steps with quality and productivity, not just safety, but it applies across the board for any human endeavor.

- When you were talking about assets, I was noticing a connection there too, right? You know, safety, the asset, is generally human, but when you're talking about assets of property, or what was the other one? Productivity?

- Products. Property and products.

- Product, that's quality.

- So, it's important that the frontline workers who are going to do the work of the organization, before they go out and start touching things... I talk about touching things. I talk about touchpoints as part of the management part. But before they start touching things, they ought to know what the most important elements or assets are that they're going to be interacting with. Since they're key to the mission of the organization, to some extent, they need to understand the business.

What's the business? I go back to your safety professionals. Safety professionals must understand the business. Their role is not just safety, but to help the organization produce or create value while maintaining safety. All right. And so, if they don't understand the business, they don't know where it's critical that they do things right the first time.

- Yeah. And that actually jumps ahead to something else I was going to ask about, which is about making the business case.

- The business case, yeah.

- And so in order to make a business case, it behooves you to understand the business.

- They need to understand the business case. And when they go in and talk to management, you know, typically managers, you know, they have a number of initiatives that are ongoing in an organization. And when I'm invited into an organization and I give an executive seminar, the last thing I talk about is the business case.

And I look at the business case from three perspectives. First, what's the benefit? What's the benefit of human and organizational performance, or what I call H&OP... Some people call it HOP, which I think is undignified, but I like H&OP. Managers need to know what benefit, from a safety, quality, and productivity perspective, critical steps and H&OP provide to the organization.

But as with any new initiative, there's going to be some costs involved. So, there's a cost involved for bringing people up to speed. There's a training element. Perhaps there are some resources that need to be invested in to support the implementation of such an initiative.

And then finally, I always ask the last question, what's the risk if you don't do this?

- Yeah. You just mentioned investing in, and I think that right there is the key to making a business case because if leadership sees it simply as a cost, then you've failed in making your business case. If they see it as an investment, you've made the case, right?

- Right. And in my book, the first book, "Risk-Based Thinking," and this is not an original thought or a perspective that I came up with. I can't remember the author's name, but he said that safety is a profit multiplier. I like that concept. It's a profit multiplier.

And if you look at the equation, profit is equal to revenue minus expenses. So, how does H&OP... how do critical steps impact revenue? You're able to deliver a product with the quality required, and that's going to enhance the future income of the organization because people are going to buy it again once they see the reliability of the product.

But on the expense side of the ledger, there's the cost of human error, the cost of losing control: the events, the harm, the damage that you suffer, and the downtime. So, you can see that there's an impact on the profit margin of an organization by using H&OP.

- Okay, I want to pull back a little bit again to the critical steps. So, first of all, I want to clarify, I think this is kind of self-evident, but it's not that you don't care about the other steps, but you are focusing in on the ones that are critical. So, you referenced the book, "Thinking, Fast and Slow," which many people reference in these conversations.

Can you just sum up very quickly what it means to...and of course, obviously, there's a whole book on this, but what it means to think fast, what it means to think slow, and what they have to do with critical steps and with risks?

- Right. Right. I go back to Winston Churchill. When I talk about, you know, the fact that excellence is not good enough at a critical step, but I do make the statement that good enough is good enough when there's nothing at stake. If there's nothing at stake, then good enough is good enough.

And what's good enough from a human reliability standpoint? Human reliability on a good day is 99%. You know, if an operator comes to work and performs 100 actions and they get 99 right, that's generally good enough for an activity where there's no potential for harm, all right? And so that idea is important for frontline workers to understand.

Okay, so what about that one... Can I tell a story real quick?

- Yeah, of course.

- All right. So, let's presume...and don't let me forget the original question about fast and slow, because that's where I'm headed. But let's say, Mary, that maybe, you know, in the future, you know, you would come to work and you're working for me, all right? And you've had a great night's sleep. You have no emotional hangups.

You're not on drugs. You're physically fit, well trained, you know the job. You're, to some extent, ambitious. You want to move up in the organization. And so you're an ideal employee, all right? And so I have a very important job for you, and it has 100 actions, exactly 100 steps, human actions.

And so my question to you, or to the audience, is: what are the chances, what's the probability, of you performing all 100 actions without losing control?

- I know the answer to this because I've read it in the book. So, I think that the typical answer would be, well, 99%.

- Yeah. So, I back up and I ask... All right, it's best done on a dry erase board. But I said, "Okay, let's look at step 1." All right. Step 1, what are the chances of Mary performing step 1 correctly? And everybody said, "Yep, 99%."

All right. Yep, that's correct. All right. Let's go to step 2. Now, I'm making an assumption that nothing changes in the environment, nothing changes with you between step 1 and step 2. And to some extent, that's a gross assumption. And so nothing changes.

So, now what's the probability of performing step 2 correctly? Ninety-nine percent. And the same for step 3, and for all 100 actions: 99%. So, now I ask the question, what's the chances of doing step 1 and step 2 without losing control?

Now, they have to think. And generally, people get the idea that it's 0.99 times 0.99. That's the probability of getting step 1 and step 2 correct. So, fundamentally, for 100 actions, it's 0.99 multiplied by itself 100 times, so 0.99 to the 100th power. And that's only 37%.

So, the chances of getting all 100 right is only 37% for a person who's nominally 99% reliable. So, this goes back to the fast and slow question. So, which of those 100 actions absolutely have to go right? So, that's the question of critical steps.
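To make Tony's arithmetic concrete, here is a minimal sketch in Python, assuming each of the n actions is independent and succeeds with probability 0.99 (the function name and the list of step counts are illustrative, not from the book):

```python
# Probability that a nominally 99%-reliable person performs every
# step of a task correctly: p ** n for n independent actions.
def chance_all_correct(p_per_step: float, n_steps: int) -> float:
    return p_per_step ** n_steps

p = 0.99  # per-step human reliability "on a good day"
for n in (1, 2, 10, 50, 100):
    print(f"{n:>3} steps: {chance_all_correct(p, n):.1%}")
# 100 steps -> ~36.6%, the "only 37%" figure Tony cites
```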

So, we want to know which of those 100 actions have to go right first time, every time. So, what's the state of mind that you want a person to be in when they perform step 55? Maybe step 55 is the critical step in this 100-step activity. That's when we want the person to slow down, slow their thinking down.

We don't want fast thinking. Fast thinking is... some people have perhaps heard of Jens Rasmussen or James Reason's approach to skill-based, rule-based, and knowledge-based performance modes, which Rob Fisher has covered pretty well... he's written a great book called "Mental Models" on that aspect. But if it's a skill-based activity and there's no particular harm, you can be in fast thinking mode, and 99% is good enough.

- So, sort of intuitive or muscle memory, all those things...

- Exactly. That's exactly right.

- Right. But at the same time, if you do lose control of the activity, there's no real harm. You know, there's no real harm. But at step 55, you have to slow your thinking down. And this is where I talk about human performance tools. Human performance tools, which is a section in the book, "Critical Steps," are behavior-oriented tools such as self-checking, peer-checking, three-part communication, and conservative decision-making.

Yeah, you know, I think I said three-part communication. But these are all techniques to slow the brain down. And what human performance tools do is they force you to address risk-based thinking. All human performance tools will help you anticipate, monitor, respond, learn.

And some of the tools are stronger than others. Some address all elements of risk-based thinking, but fundamentally, the tools are slow-thinking activities. That was a long answer to something you wanted to be real quick about.

- No, no. The quick part was just the slow and the fast because I think people do cotton onto that quite quickly. They understand fast thinking, you know, the things that you do almost automatically.

- Unconsciously, yeah.

- Yeah. And slow thinking is just really being present. There are a couple of definitions I'm going to ask about; one of them is positive control. What does that mean?

- All right, let me give you context on positive control. So, in the book, "Critical Steps," you know, the fundamental goal, the principal goal, of managing critical steps is about creating success. We want to assist that frontline operator, technician, craftsman, whomever it might be, to enhance their potential.

Their potential for getting step 55 correct, the one we were just talking about. So, we want to achieve success. But there are four objectives for managing critical steps. The first one is to identify the critical steps. You know, where in this activity are the critical steps? And then the second objective is to exercise positive control of each critical step.

So, how are you going to do that? You know, we talk about pre-job briefings or tail boards or pre-task discussions. And so during that conversation between workers and supervisors, you talk about, "Okay, here are the one or two critical steps. How are we going to do those critical steps? Who's going to do what and when?"

And when I say positive control... You know, I'm an operator at heart. You know, I have an engineering degree in mechanical engineering. But as soon as I left college, I was no longer an engineer. I became an operator. And so I operated nuclear submarines, I operated nuclear power plants commercially, and I even trained operators.

So, my mindset is from an operator's perspective. And every operator will know this. Positive control is defined as what is intended to happen is what happens and that's all that happens.

- That's a good little addition there. Yeah.

- Yeah. So, when they're anticipating performing a critical step, critical act, opening a valve, or turning a switch to start a device, they ask themselves, "Okay, what is intended to happen is what happens and that's all that happens." That's the idea of positive control. And just to close the book on the four elements of managing critical steps, the third one is to fail safely.

You know, if we do lose control, if we don't exercise positive control, you know, how would we mitigate the harm that ensues? Because there will be harm if you lose control at a critical step. So, how can we fail safely, if that's possible? And then the fourth management objective is to align your organization, or realign your organization, to support the first three objectives.

- So, I'm going to jump ahead a little bit because we'll talk about that at the end, about learning. On planning, this is a quote from the book: "If it's important enough to have pre-work discussions, it's important enough to have post-work reviews." So, this kind of ties those two together. What's the role of both of those: pre-work discussions and post-work reviews?

- All right, great. Great question. The book, "Critical Steps," has the subtitle "Managing What Must Go Right in High-Risk Operations." In fact, we delivered our manuscript to CRC Press as "Managing What Must Go Right." Earlier I had used "in high-risk operations" and I had decided to take it off.

But then they wanted that back on, you know, because we're talking about high-risk operations. So, from a manager's perspective, how do you manage critical steps? You have to understand how work occurs. And there are three phases of work. There's the preparation stage, where, you know, now we've planned the work, we have operators or technicians who have been trained.

They've been assigned the task of performing this work. So, now we meet and have a pre-work discussion or a pre-job briefing, whatever terminology an organization uses. And they want to understand what are the critical steps. What are the potential consequences if we do lose control?

What are the error traps and landmines that we could perhaps run into in the workplace? What are the contingencies? What's the stop-work criteria, and so on? You have that kind of conversation. And then when everybody's ready and they've asked their questions, they're comfortable with what they're going to be doing, now we go out and do the work.

That's the execution phase. So, the execution phase is where we create value for the organization. And that's where we actually interface with...the frontline operators or technicians actually interface with assets during the work. Once the work is done, then now there's an opportunity because there was risk involved.

If there's risk involved, that's why we have the pre-job briefing. We want to address the risk. So, if there was risk involved in an activity... Work as done during the execution, work as done is never the same as work as imagined. So, during the pre-job briefing, we're talking about work as imagined based on paper, based on what's written in the procedure, or the checklist, or the work plan that's going to be used to guide the work.

Now we go out into the field and do the work. We execute the work, but that work is never the same as what we intended to do. And so the post-job review, I call it a review, is to understand the difference, the significant difference between work as done and work as imagined, or work as planned is probably a better description.

So, what was different? Does management need to make some adjustments in the way they plan the work, or support the work, or resource the work? Those are the questions that the frontline workers need to address. This post-job review should be quick.

You know, for most work, it'd be about 15, 20 minutes at most. I'm not talking about an after-action review, which can take, you know, an hour plus in some cases. A post-job review is just that: you know, what's the difference between work as done and work as planned? And there ought to be some vehicle or administrative communication device to get that information back to line management so they can respond to anything that they need to change.

- And so often, I mean, let's say that the job was successful, that means that the critical steps went right, but...

- That's not always true.

- Oh, okay.

- That's not always true. In some cases, the frontline adapted. You know, perhaps they actually encountered a critical step that was not anticipated. And that happens more often than not, and that's what I call a landmine. This is a landmine. You know, in a military sense, a landmine is hidden in the battlefield. And so, you know, your soldiers, your workers, encounter this landmine.

So, how did they handle that, all right? So, work as done is not what we plan to do, now we have to adapt. And so we were successful, we performed our job, we accomplished the mission, but now we need to feed back to management, "Here's what we did.

Here's what we planned to do, but here's what we did. And here's what, you know, you management need to be aware of what happened today," and give them the opportunity to make adjustments in the organization of work.

- Yeah. So, another quote on that note that I wanted to ask about was "managing requires aggressive learning." So, managing risk, managing the organization, is this the same as systems learning, this aggressive learning?

- Yes.

- How are they all related?

- When I say aggressive... when the frontline workers encounter situations in the workplace that aren't as anticipated, managers need to understand, "We just put our frontline workers, we put our assets at risk." And when you find a problem, now you're in a timeline.

You're in a timeline. Your time-at-risk clock is running. That means that there's a situation that's vulnerable to losing control. So, there's some problem upstream in the organization, in the system that allowed this situation to occur, whatever that might be. Perhaps it would be worthwhile to use an example.

For instance, two mechanical maintenance technicians are going out into a steam plant. And they're working on 150 PSIG-type steam system, and they have to adjust the packing around two manual valves and then replace the packing on one other valve.

So, these two mechanics walk out into the plant and they were going to start with the one valve that needed the packing replaced, but it was still blowing steam and water, you know, around the stem of the hand wheel.

So, they said, okay, it's been tagged out, it's been isolated, but it was still too hot. So, they went to the other two valves and made the adjustments on the packing so they're not leaking anymore. So, then they come back to this valve they had to replace the packing. The valve stem is dry. There's nothing, you know, being blown by.

So, the two technicians make an assumption: the steam line has been depressurized, when in fact it has not. In other words, what actually happened is the old packing that was there actually expanded enough to seal that stem around that hand wheel. But it still needed to be replaced.

And so the two technicians, you know... This valve was about 15 feet above the floor. And so the one mechanic had to go around to the backside and sit on the railing, and there was no fall protection being used there. And so they start removing the gland follower, and on the last turn of the last nut, it just blew, you know, the steam blew out.

And the one technician could back up because he was on the platform, but the technician...the mechanic on the railing had nowhere to go and he received second-degree burns on his hands, on his thighs, and on his face. And so very serious injury. And so the critical step was removing that last nut on the gland follower, all right?

And so remind me where we were going with that.

- We were talking about aggressive learning and systems learning.

- Oh, aggressive learning. All right, so this is my example. See, I'm susceptible to making mistakes as well as anybody. But the managers need to understand that their system set that up. Their system set that up such that they had a tagout system, which is an organizational function.

And they tagged out...you know, isolated that section of piping, but they did not vent it. Why wasn't it vented? Why wasn't there a vent valve on the tagout list to make sure that there was no residual pressure when they took that gland follower out of the valve? And so that's a system weakness.

And managers ought to have a posture of continuously learning and finding vulnerabilities in how their system creates these situations, or could create these situations. Now, part of that aggressiveness is spending time in the workplace. Too many managers...I had a manager tell me that...

We talked about human and organizational performance, and this manager said, "You're just adding one more brick to my backpack." Well, you know, I'm not trying to add more work to your responsibility. What I'm trying to do here is to help you think through your role as a manager, which is to create the opportunities for success in a high-hazard environment.

And so it's a way you think. I go back to risk-based thinking, and that's what we're talking about. It's not a new program. You know, you've perhaps...I'm sure you've heard this about H&OP, it's not a program, even though perhaps it gets started that way. But fundamentally, you want managers to think risk, right? What's the uncertainty?

All right. I always challenge operators and workers that if you don't have doubts, you haven't been paying attention. You let that sink in. If you don't have doubts, you haven't been paying attention. And that's part of that slow-thinking perspective is where's the uncertainty of our operation? Where's the uncertainty embedded in this activity? Removing that gland follower on the packing gland is like crossing the street.

You know, before you step off the curb, you look both ways to make sure it's a safe thing to do to cross the street. Well, the same thing's true when you start removing the packing on a previously pressurized steam valve. You're crossing the street. And there's that chronic unease, you know, that helps you become aware of uncertainty in your operation.

So, managers need to be aggressive. So they need to be out watching work. They need to be asking questions. They want to look at work as done and understand how the frontline workers are achieving success despite the flaws in the procedure, or despite the flaws in training, inadequacies of resources and tools.

So, they need to know that, and this is... So, part of their job is to get out of their office, spend time with the frontline workers, get their perspective because the workers know what's going on, but too often there's a little bit of fear about sharing negative information, and managers... you know, that's the idea of just culture, so, you know, or restorative culture that Sidney Dekker talks about.

But it's important that they set up an environment that allows people to share what's really happening in the workplace.

- To me, that also sounds like... this chronic unease sounds like identifying assumptions. That's the doubt, right, the doubt part of it. We're coming up on time. I wanted to ask one thing that I think is important for listeners. If they're hearing all this, they understand the idea of critical steps. You said the first step, of course, is to identify which steps are critical, and you talk about critical step mapping.

Can you explain that for us?

- Yeah. There's a chapter in the book called "Critical Step Mapping." And this is a tabletop exercise involving you as the safety professional and the engineering representative involved with the development of a technical procedure, and maybe even representatives of the user group, the operators or the technicians involved with the activity.

And you can basically visualize being in an office or a classroom, everybody's got a computer, and you're looking at the steps of this technical procedure, and you're walking through it step by step by step. And critical step mapping is attempting to find those steps, those actions, in a technical procedure or process that involve a transfer of energy, or a movement of matter, or a transmission of information.

Most of the time for industrialized operations, we're talking about transfers of energy or movements of matter: solids, liquids, or gases from one place to another place. And so we want to find those transitions. Then we ask ourselves, "Okay, where are the touchpoints? Where do the operators actually create that transfer? Where does the work actually happen?"

And then we ask ourselves, once we know what those touchpoints are, again, we're asking those questions on every step. And we ask, "Okay, does this touchpoint satisfy the definition of a critical step?" If we lose control, if that operator loses control at that step, would there be an immediate, irreversible, intolerable harm as a result?

And so, basically, it is very similar to failure modes and effects analysis to some extent, but the focus here is not on identifying error; the focus is on identifying consequence. So, what's the harm? You know, identify the harm. This goes back to the idea of risk. Now, I'm going to digress just real quick. The focus is not on probability.

Humans are horrible at estimating probabilities. And so here, the focus is on consequences. What's the result if we do lose control, regardless of the probability of losing control? And so critical step mapping finds those critical steps. And then it proceeds to identify the risk-important actions, which we haven't talked about... a risk-important action is: put the parachute on, rig the parachute correctly.

Those are preceding actions that have to go right. You explore those, and then you identify the controls, barriers, and safeguards that have to be put in place. So, in a nutshell, that's what critical step mapping is.
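As a rough illustration of that tabletop logic, here is a sketch in Python. The data structure and field names are hypothetical, not from the book; it simply encodes the screening questions Tony describes: is there a pathway, is there a touchpoint, and does losing control meet the immediate, irreversible, intolerable test?

```python
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    pathway: bool            # transfer of energy, movement of matter, or transmission of information?
    touchpoint: bool         # does a human action actually make that transfer here?
    harm_immediate: bool     # if control is lost, is the harm immediate...
    harm_irreversible: bool  # ...irreversible...
    harm_intolerable: bool   # ...and intolerable?

def is_critical(step: Step) -> bool:
    # Consequence-driven test: all three harm criteria must hold at a
    # touchpoint on a pathway; probability of error is deliberately ignored.
    return (step.pathway and step.touchpoint and step.harm_immediate
            and step.harm_irreversible and step.harm_intolerable)

# Tony's skydiving example: rigging the chute is a risk-important
# preceding action; jumping is the critical step.
procedure = [
    Step("Don and rig parachute", False, True, False, False, False),
    Step("Jump from the aircraft", True, True, True, True, True),
]
for s in procedure:
    print(f"{s.action}: {'CRITICAL STEP' if is_critical(s) else 'not a critical step'}")
```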

- So, this book, the "Critical Steps" book only came out last year, I believe. It's not that old. What's your current thinking about these issues? Where do you see challenges? Where do you see hope in the future of this shift, this thinking shift? Whether it's in a wider industry or, you know, if you've heard feedback about the book, what's on your mind right now in this realm?

- Well, we've interacted with the National Safety Council here in the United States, which is a nonprofit, non-governmental organization. And they've endorsed it. And let me read to you the endorsement: "We are in the midst of a renaissance in safety thinking that includes new concepts for managing operations with fatal and serious injury and illness potential. 'Critical Steps' takes us deeper through the use of real-world examples, first principles, and tried and true practices that can be put to immediate use."

You can do this tomorrow. These ideas, you can put to use tomorrow. The last sentence I appreciate: simply put, this book saves lives. In our profession, nothing is more important than saving lives. And I always conclude all my presentations with a famous saying, "Live long and prosper." You know, most everybody recognizes that phrase. But fundamentally, where I'd like to see this go... I hope to talk more with the National Safety Council about putting together a brief training program.

A standalone program where you wouldn't necessarily need me to do the training, just something that people could download free of charge, and basically create a movement. The idea is to create a movement. So, take this information, train your people, and I encourage managers to do the training. But then the people that they train can train others.

And so I encourage people to take this material and use it with their families. There's plenty of critical steps at home. You know, for instance, my daughter had her last child. This was about five years ago. And she's walking down the back stairs of our home, and she's holding the baby with two hands and not using the handrail.

I said, "Kristen, you need to hold the handrail. It only takes one stumble, and where's that baby going?" All right. So, every step was a critical step as far as I was concerned. That's a literal step. All right? But anyway. So, I encourage people to take it home, especially, you know, talk to your teenagers.

Teenagers don't appreciate risk as well as they should. And so teach them to recognize... You know, Will Gadd, I don't know if you know Will Gadd. He's a famous ice climber, and he's done the same thing. He's trained his own children how to recognize and mitigate risk. In fact, when I talk to workers, I talk about "see it, feel it, control it."

See the risk, feel the dread of that risk, because critical steps involve dread, and then control it. What are you going to do to exercise positive control, and what are you going to do to fail safely?

- Yeah. Because as they get older, the consequences do get more serious.

- They do.

- Okay. So, I have a few questions that I like to ask all my guests. If you were to set up training for tomorrow's safety professionals, what interpersonal skills, so not technical skills here, but what interpersonal skill or skills do you think should be a focus for tomorrow's safety professionals?

- I'm biased to some extent with that question because I'm a trainer. I wouldn't say I'm a polished presenter. You know, I'm more of a thinker and an operator. I have an operator perspective. But one thing I learned about myself early on in my career while I was still in the Navy, is I enjoyed educating people.

I enjoyed training people. And what I had to learn was how to facilitate that understanding. And so, I would say one of the key skill sets that a safety professional needs to have is the ability to facilitate understanding.

- Which goes all the way back to the beginning, where we were talking about the role of the safety professional to be an advocate for the understanding of risk.

- They can be an advocate, but if they don't know how to facilitate understanding, it's not going to communicate well.

- Yeah, it's true. You can't just say, "This is important," and then walk away and assume that people will understand you. If you could go back to the beginning of your safety career or your career writ large, what is one piece of advice that you might give to yourself?

- Well, that's a tough question.

- I know it's the hardest question I ask.

- That's a tough question.

- I promise they get easy after this.

- What would I do differently, I guess is...

- Well, no, what's one piece of advice that you might give to young Tony?

- Well, what comes to mind is I took a lot of courses in college, and especially in a mechanical engineering degree program, you take a lot of mathematics, you take a lot of process control-type courses. I took a lot of reactor physics courses, you know, from a nuclear operations perspective, but I didn't understand its application.

The professors didn't say, "Okay, here's how you would encounter these concepts in the real world."

- Right. It felt abstract.

- Well, it was math. All I did was do the math. You know, I got A's on the math, but I didn't know where... you know, proportional-integral controllers, you know, yeah, I can do the math. I did the differential equations. But I didn't understand it until maybe 10 years later, when I was actually teaching controllers to operators: oh, this is what I was doing back in college.

- So, now, in the book itself, there are a ton of references; you and your co-authors draw from a lot of different ideas. So, other than the book itself, where would you point listeners to learn more about some of the topics in our discussion?

Are there particular books, or maybe even websites, or programs, or anything like that?

- Yeah. I was thinking about, you know, other authors that I've paid a lot of attention to. James Reason, you know, he's probably the initiator of this way of thinking. I'd say, you know, his book, "Managing the Risk of Organizational Accidents." There are some current authors that don't necessarily agree with him, and that's okay.

You progress in the safety sciences by disagreement. And people have to come to grips with disagreements. But I'm still a fan of "Managing the Risk of Organizational Accidents." In fact, he has a follow-on book, "Organizational Accidents Revisited."

So, I would encourage, you know, students to get ahold of those two books. I would also say Derek Viner. I use the concept of...in managing critical steps, I talk about pathways and touchpoints, pathways and touchpoints.

And Derek Viner is a British author. He wrote a book called "Occupational Risk Control." And so there's a goodly amount of information there. And his approach to safety is pretty engineering-oriented. You know, he sees it more as a physics problem than a social problem. I would also say Karl Weick and Kathleen Sutcliffe's book, "Managing the Unexpected." They're on their third edition of that book.

That's, I would say, the Bible for most of the HRO enthusiasts out there. But let's see, Andrew Hopkins. I'm a big fan of Andrew Hopkins' books. "Organizing for Safety," I think, is his first book, where he talks about how organizational structure creates safety culture.

So, I'm a big fan of what he's written. And also Nancy Leveson. I'm a big fan of Nancy Leveson out of MIT, "Engineering a Safer World." I'm also a big fan of all the resilience engineering books. But with all that said, I talked to one of the authors in the HRO domain, and I said, "What do you think about resilience engineering?"

And there's a lot of hurt feelings between the authors of HRO versus resilience engineering, and they claim that they've stolen each other's words. But what I encourage safety professionals to do, and line managers, in particular, is use what works. Use what works for your organization.

You know, there's a whole host of different principles associated with human and organizational performance. You know, choose the ones that mean the most and are most effective for your organization.

- Okay. That is a lot of resources. So, people, you have your work cut out for you. Tony, where can our listeners find you on the web?

- Well, there's a couple of places. Obviously, there's muschara.com. That's my consulting website. But I also have websites for the two books: riskbasedthinking.com, and criticalstep.com or criticalsteps.com. You know, I purchased criticalstep.com, singular, and then we wrote the book "Critical Steps," so I had to add an S. So, I own two websites.

You know, one's singular and one's plural.

- All right. So, you don't even have to remember, just...

- No, criticalsteps.com.

- Okay. Well, that is our show for today. Thank you, Tony, for coming to discuss the book and your ideas.

- My pleasure.

- And to our listeners, thanks for showing up on LinkedIn and being part of the conversation and for telling your colleagues about Safety Labs. We really appreciate the support. And, of course, my thanks to the "Safety Labs by Slice" team. Bye for now.

Tony Muschara

After graduating from the U.S. Naval Academy in 1975, Tony served 7 years in the U.S. Navy Submarine Service, where he qualified in submarines and certified as “engineer of naval nuclear propulsion systems.” After resigning from the Navy, Tony trained nuclear control room operators at a commercial electric utility for 3 years. Between 1985 and 2007, while employed by the Institute of Nuclear Power Operations, Tony authored several publications on human performance, including Excellence in Human Performance (1997), Human Performance Tools for Workers, and the Human Performance Reference Manual (both 2006). The U.S. Department of Energy adopted the latter as its Human Performance Improvement Handbook in 2009. Tony began independent consulting in the field of human and organizational performance in 2007, helping managers of industrial organizations manage the operational risks associated with human performance. Tony authored the following two books (published by Routledge and CRC Press, respectively): Risk-Based Thinking: Managing the Uncertainty of Human Error in Operations (2018) and Critical Steps: Managing What Must Go Right in High-Risk Operations (2022).

Find out more about Tony’s work: Muschara

Books recommended by Tony:

Managing the Risk of Organizational Accidents and Organizational Accidents Revisited by James Reason

Occupational Risk Control by Derek Viner

Managing the Unexpected: Assuring High Performance in an Age of Complexity by Karl E. Weick and Kathleen M. Sutcliffe

Organizing for Safety by Andrew Hopkins

Engineering a Safer World by Nancy Leveson