Why don’t we learn from past experiences when it comes to planning new projects? Why aren’t even our best laid plans realistic?
Surely you’ve noticed this – whether it’s getting your taxes done, that big presentation for work, or planning your wedding.
Why do 80-90% of mega projects run over budget and over schedule?
Why has it taken – for example – nearly 100 years to expand the Second Avenue Subway in NYC? The original project was expected to cost 1.4 billion dollars (a 1929 estimate in 2017 dollars) and now with Phase 1 completed ($4.5 billion to build just 3 of the 16 proposed stations), Phase 2 is expected to cost $6 billion.
This phenomenon has been dubbed The Planning Fallacy – the topic of today’s Freakonomics podcast and the inspiration for this post.
Don’t have 45 minutes to listen? Keep reading.
Why do we fall for The Planning Fallacy again and again?
When planning a project we naturally focus on the case at hand, building a simulation in our minds. But our simulations are rosy, idealized, and don’t account for all of the complexities that will inevitably unfold.
We also focus on succeeding, not failing, creating an optimism bias. This means we don’t think enough about all the things that can go wrong.
We’re overly confident, believing in our abilities and the old “this time will be different“ line too much.
We ignore the complexity of integrating all of the parts of a project together.
We intentionally misrepresent a project’s plans in order to get it approved.
We rely too heavily on our subjective judgement instead of the facts and past empirical data.
And of course: incompetence, fraud, deliberate deception, cheating, stealing, and politicking.
Use past projects – even if they’re not exactly comparable – as a benchmark for projects being planned.
Track and score the difference between forecasts and outcomes.
Get stakeholders to put skin in the game, creating rewards and penalties for good and bad performance. #IncentivesMatter.
Use data and algorithms to reduce human biases.
Use good tools to help you focus. Asana co-founder Justin Rosenstein warns against “continuous partial attention” – a state of never fully focusing on any one thing.
Success Building Software
I build projects for a living – mostly product strategy and software for start-ups or innovation groups within larger companies. I plan and execute on projects everyday and I still struggle with the planning fallacy in other areas of my business (did I mention my corporate taxes are due in 7 days?).
But the secret sauce to my successes building products has always been to 1) have personal expertise in what’s being planned and built, 2) refine and go over the plans until your eye bleed looking for possible pitfalls, and 3) have a clear and easy-to-follow process to keep you focused on the right thing at the right time.
Terms & Concepts
The Planning Fallacy – Poorly estimating the timeline, quality, and budget of a planned project while knowing that similar projects have taken longer, cost more, or had sub-par results.
The Optimism Bias – Focusing on the positives of a situation over the negatives.
Overconfidence – Thinking that we’ll perform better than we actually will.
Coordination Neglect – Failing to account for how difficult it is to coordinate efforts and combine all of the individual outputs into one complete system.
Procrastination – Choosing to do things that we enjoy in the short term instead of the things we think will make us better further down the road. In the episode, Katherine Milkman called procrastination a “self-control failure” – my new favorite phrase.
Reference Class Forecasting – Using past and similar projects as a benchmark for how your next project will perform.
Strategic Misrepresentation – Underestimating the costs and over representing the benefits of a project.
Algorithm Aversion – The big thing that Katy Milkman thinks is holding us back from using “data instead of human judgement to make forecasts” better.
Hydrogen Sulfide doesn’t have much to do with strategy but I thought I’d publish this mini-post as Patron-only bonus content anyway since it didn’t make it into my piece on disasters:
Hydrogen Sulfide (H2S) is a funny substance. Not only is it extremely flammable, but it’s also very toxic to humans. Fortunately, it has the very distinct smell of rotten eggs – an early warning sign – above 0.5 parts per billion. Unfortunately, at higher concentrations that smell vanishes after only 2 or 3 sniffs as the nerves in your nose become paralyzed.
The maximum recommended concentration is only 20 ppm. But as the concentration of H2S increases to about 1000 parts per million, one breath will instantly cause you to pass out, collapse, and stop breathing. H2S binds to the enzymes that allow your cells to convert sugar and oxygen into energy – energy your body needs constantly.
H2S has one last nasty property: it’s heavier than air, meaning that H2S hugs the ground and dissipates more slowly than other gases. So what typically happens is that you’re wondering around the refinery, smell rotten eggs, pass out, collapse on the ground, and inhale even higher concentrations of H2S until you stop breathing and die.
Hmm.. maybe there is a strategy lesson here. Hydrogen Sulfide is one of those substances that’s almost too deadly to handle. At my refinery we didn’t handle it much. In fact, we paid another company who had deep expertise in H2S to build an H2S processing plant inside of our refinery. They turned H2S into pure sulfur (S) and water.
Are there parts of your business or life – like hydrogen sulfide – that are too deadly to handle? Can you pass off the risk to someone else? Can you eliminate them altogether? Or can you specialize in handling the most toxic or deadly parts of someone else’s business?
PS: Pure sulfur is pretty flammable too… and you haven’t lived until you’ve seen a sulfur pit fire. As the burning sulfur heats up, it melts, turning blood red while the flame itself it blue…
Sulfur Fire Video
If you listen carefully, you can hear the respirator that the person who’s recording the video is using in order to avoid inhaling the toxic by-products of burning sulfur.
When I first started working at the nation’s largest refinery my boss didn’t have any great projects ready for me so he sent me to “Shift Super” training. There were only 4 of us students – me and 3 unit operators, each with at least 20 years of experience. I was only 19 years old and I barely knew anything about anything.
Each shift supervisor runs a big chunk of the refinery: 2-4 major units. But shift supers needed to know how to supervise any of the 10 or so control centers safely, so on my first day of work I was learning how the entire refinery operated. It felt like a lifetime’s education crammed into 5 days.
After Shift Super training, I began to see more and more of the refinery with my own eyes. What had been a precise line drawn from one perfect cylinder to another perfect cylinder was actually a rust covered 6-inch pipe baking in the Texas heat transporting high-octane, extremely flammable raffinate from a 170 ft separation tower to a temporary holding tank.
Fear of Disaster
One evening – about 2 weeks in – as the refinery was becoming a real place to me, I had this moment of pure panic while driving home.
With so many things that could go wrong at any moment, how was the refinery still standing? How had it even made it until now? Any minor mistake – in the design, production, construction, or operation of any pipe or vessel – could result in a huge disaster. Would the refinery even be there when I returned tomorrow morning?
I couldn’t sleep. The next morning the refinery was still there. And the next. And the next. And my fears slowly morphed into amazement. My refinery was the largest, most complicated system I had ever attempted to wrap my brain around.
BP Texas City Refinery Explosion
But just 29 miles away from my refinery, investigators were trying to piece together what happened during the nation’s worst industrial disaster in nearly 2 decades.
Fifteen people had been killed and 180 injured – dozens were very seriously injured. The cause of each fatality was the same: blunt force trauma.
Windows on houses and businesses were shattered three-quarters of a mile away and 43,000 people were asked to remain indoors while fires burned for hours. 200,000 square feet of the refinery was scorched. Units, tanks, pipes were destroyed and the total financial loss was over $1.5 billion.
The BP Texas City Refinery Explosion was a classic disaster. A series of engineering and communication mistakes led to a 170-foot separation tower in the ISOM unit being overfilled with tens of thousands of gallons of extremely flammable liquid raffinate – the component of gasoline that really gives it a kick. The unit was designed to operate with about 6,000 gallons of liquid raffinate so once the vessel was completely filled, 52,000 gallons of 200 °F raffinate rushed through various attached vapor systems. Hot raffinate spewed into the air in a 20 foot geyser. With a truck idling nearby, an explosion was immanent.
This video from the US Chemical Safety Board (CSB) is easy to consume and well done. The 9 minutes starting at 3:21 explain the Texas City incident in detail:
I’ve studied this and other disasters in detail because in order to prevent disasters we have to understand their anatomy.
The Trigger
The trigger is the most direct cause of a disaster and is usually pretty easy to identify. The spark that ignited the explosion. The iceberg that ruptured the ship’s hull. The levee breaches that flooded 80% of New Orleans.
But the trigger typically only tells a small part of the story and it usually generates more questions than answers: Why was there a spark? Why was highly flammable raffinate spewing everywhere? Why was there so much? What brought these explosive ingredients together after so many people had worked so hard to prevent situations exactly like this?
While the trigger is a critical piece of the puzzle, a thorough analysis of a disaster has to look at the bigger picture.
When The Stars Align
The word disaster describes rapidly occurring damage or destruction of natural or man-made origins. But the word disaster has its roots in the Italian word disastro, meaning “ill-starred” (dis + astro). The sentiment is that the positioning of the stars and planets is astrologically unfavorable.
One of the things I learned from pouring over the incident reports of the Texas City Explosion was that disasters tend to only happen when at least 3 or 4 mistakes are made back to back or simultaneously – when the stars align.
Complex systems typically account for obvious mistakes. But they less frequently account for several mistakes occurring simultaneously. The stars certainly aligned in the case of the Texas City Refinery Explosion:
Employees and contractors were located in fragile wooden portable trailers near dangerous units that were about to start up.
The start-up process for the ISOM unit began just after 2 AM, when workers were tired and conditions were not ideal.
The start-up was done over an 11 hour period meaning that the procedure spanned a shift change – creating many opportunities for miscommunication. Unfortunately, the start-up could have easily been done during a single shift.
At least one operator had worked 30 back to back 12 hour days because of the various turnaround activities at the refinery and BP’s cost-cutting measures.
One liquid level indicator on the vessel that was being filled was only designed to work between a certain narrow range.
Once the unit was filled above the indicator’s upper range, the indicator reports incorrect values near the upper range, misleading operators regarding the true conditions of the liquid level. (ie: at one point the level indicator would report that the liquid levels in the tower were only at 7.9 feet when they were actually over 150 ft)
A backup high level alarm located above the level indicator failed to go off.
The lead operator left the refinery an hour before his shift ended.
Operators did not leave adequate or clear logs for one another meaning that knowledge failed to transfer between key players.
The day shift supervisor arrived an hour late for his shift and therefore missed any opportunity for direct knowledge transfer.
Start-up procedures were not followed and the tower was intentionally filled above the prescribed start-up level because doing so made subsequent steps more convenient for operators.
The valve to let fluids out of the tower was not opened at the correct time even though the unit continued to be filled.
The day shift supervisor left the refinery due to a family emergency and no qualified supervisor was present for the remainder of the unfolding disaster. A single operator was now operating all 3 units in a remote control center, including the ISOM unit that needed special attention during start-up.
Operators tried various things to reduce the pressure at the top of the tower without understanding the circumstances in the tower. One of the things they tried – opening the valve that moved liquids from the bottom of the tower into storage tanks (a step that they had failed to do hours earlier) – caused very hot liquid from the tower to pass through a heat exchanger with the fluid entering the tower. This caused the temperature of the fluid entering the tower to spike, exacerbating the problems even further.
The tower, which was never designed to be filled with more than a few feet of liquid, had now been filled to the very top – 170 feet. With no other place to go, liquid rushed into the vapor systems at the top of the tower.
At this point, no one knew that the tower had been overfilled with boiling raffinate. The liquid level indicator read that the unit was only filled to 7.9 feet.
Over the next 6 minutes, 52,000 gallons of nearly boiling, extremely flammable raffinate rushed out of the top of the unit and into adjacent systems – systems that were only designed to handle vapors, not liquids.
Thousands of gallons of raffinate entered an antiquated system that vents hydrocarbon vapors directly to the atmosphere – called a blowdown drum.
A final alarm – the high level alarm on the blowdown drum – failed to go off. But it was too late. Disaster was already immanent.
Raffinate spewed from the top of the blowdown drum. The geyser was 3 feet wide and 20 feet tall. The hot raffinate instantly began to vaporize, turning into a huge flammable cloud.
A truck, idling nearby, was the ignition source.
The portable trailers were destroyed instantly by the blast wave and most of the people inside were killed. Fires raged for hours, delaying rescue efforts.
Man-made disasters don’t just happen in complex systems. The stars have to align. But the quality of the mistakes matters a lot. Had even one key error above been avoided or caught, this incident wouldn’t have happened. In this case, overfilling the unit by 150,000 gallons of nearly boiling flammable raffinate, set off a chain of events that guaranteed disaster.
The Snowball Effect
Not all mistakes are made equally. Several of the errors in the Texas City Refinery Explosion compounded: Had operators followed the start-up procedure and not filled the tower beyond the designed level, had the tower been better designed to communicate liquid levels over a broader range, had the valve draining the tower been opened at the correct time, had the operators communicated properly between shifts… Had any one of these mistakes been avoided, the tower wouldn’t have been over filled and this disaster would have been prevented.
Miscommunication errors seem to have a special way of compounding and spiraling out of control.
While preventing some of the other mistakes might have mitigated the damage done, failing to understand the quantity of raffinate in the tower ultimately caused the disaster at Texas City.
Preventing Disasters
Think about the complex systems you care about in your business and life. List the raw ingredients for a disaster. What information do decision makers and operators need in order to react appropriately?
Identify the singular points of failure and the obvious triggers. Brainstorm scenarios – both common and uncommon – in order to better understand how different mistakes could interact with one another and how they could snowball out of control.
Pay attention to both the system’s design and the human errors – especially communication errors – that will inevitably arise during normal operation. Think about how you can design the system to be more resilient without sacrificing too much efficiency. What brakes can you build into the process to slow down snowballs?
Where do you need warning alarms? What are the right set-points for each alarm? How vocal do the alarms need to be? What happens when the alarms fail? How often will you test or double check your alarms?
Summary
Disasters tend to happen within large and complex systems. Usually, the immediate cause of a disaster – the literal or figurative spark or trigger – can be readily identified. But there’s almost always a bigger picture, a series of mistakes and errors that led to that spark or gave that spark power. Some of those mistakes set off bigger and bigger problems, which can snowball into something truly catastrophic.
Bottom Line: Understanding the anatomy of a disaster in your world is the first step to designing better systems, procedures, and training to help mitigate damage or prevent disasters altogether.
“At the opening of the 2002 season, the richest [baseball] team, the New York Yankees, had a payroll of $126 million while the two poorest teams, the Oakland A’s and the Tampa Bay Devil Rays, had payrolls of less than a third of that, about $40 million.”
For the Oakland A’s the exact number was $41,942,665. Oakland won 103 games that regular season, while the Texas Rangers had only won 72 and spent $106,915,180. This phenomena was somewhat common actually. Many of the richest teams in Major League Baseball were not delivering results while the Oakland A’s were… consistently.
Let’s look at this another way. Teams have to spend a minimum of $7 million on payroll and a team that’s spending the minimum payroll is expected to win about 49 games during the 162 game season. So, on a dollar-per(-marginal)-win basis the A’s were spending about $650,000 per win while Texas was spending about $4.3 million for each win. What explains this nearly 7x delta in ROI?
The two word answer is simple: Bad Tactics.
Traditional Tactics
Baseball is a sport steeped in tradition and the decade preceding the 2002 season saw teams payrolls rise by tens of millions of dollars per team, up to a 400% increase. These new costs meant that more people were paying attention to how effectively this money was being spent.
In 2002, the vast majority of MLB scouts were still judging players by whether they had a “good face” and by the 5 Tools – running, throwing, fielding, hitting, and hitting power. These subjective metrics were used in place of the enormous data sets that baseball had been collecting since the invention of the box score in 1845.
The data was clear. In 2002, RBIs (runs batted in), stealing bases, bunts, batting average, slugging, foot speed, high school players (vs college), and old (vs new/fresh) pitching arms were all tremendously over-valued in players – and it showed in their salaries.
The following were underpriced: High pitches per at-bat – which wore down pitchers – walks, and any other activity that got a hitter on base instead of out. So despite the availability of the data, the statistics to make sense of it, and the computing power to crunch the numbers, looks and luck were still being priced over results.
The human mind played tricks on itself when it relied exclusively on what it saw, and every trick it played was a financial opportunity for someone who saw through the illusion to the reality.
Baseball teams simply insisted on using bad tactics – which of course amounts to bad strategy. But reliance on knowably bad tactics happen outside of baseball too.
Insider vs Outsider CEOs
A recent episode of the Freakonomics podcast (How to Become a C.E.O.) illustrates another example of reliance on subjective decision making when good, relevant data is available:
A 2009 academic study, which analyzed established public companies from 1986 to 2005, found that internally promoted C.E.O.’s led to at least a 25 percent better total financial performance than external hires.” A 2010 study by Booz & Company similarly found that, in 7 of the 10 previous years, insider C.E.O.s delivered higher market returns than external hires. And yet: external hiring seems to be on the rise: in 2013, between 20 and 30 percent of boards replaced outgoing C.E.O.’s with external hires; a few decades ago, that number was only 8 to 10 percent. Outside hires also tend to be more expensive: their median pay is $3 million more than for inside hires. So, an external hire will, on average, cost you more and perform worse. And yet that’s the trend.
Overpay & Underdeliver
Why do companies overpay for inferior results? Why do baseball teams?
I think the biggest reasons is fear. The fear of humiliation and failure drove both baseball management and corporate boards into the bad tactics of over-paying for inferior results. When you focus on avoiding failure instead of finding success, you’re less likely to see new opportunities and adapt.
There’s also an issue of misaligned incentives at work too. In baseball, owners and managers care more about not being embarrassed by their performance than about wins. A losing team can still be profitable and have a great return. In the business world, board members and CEOs are often scratching one another’s backs and giving one another high paying jobs instead of focused on increasing shareholder value.
And, of course, it’s not always clear who’s delivering value, who’s slacking, or who’s just getting lucky or unlucky – in both a corporate environment and on the baseball diamond.
Recognizing Bad Tactics
So how do you recognize bad tactics?
1) Define what’s important to you.
For boards, they want CEOs who will deliver returns for a fair price. For baseball teams, regular season wins are the key to having a shot at the world series.
2) Look at the data & try to understand how different actions affect the outcomes you care about.
If there’s no data or bad data, start investing in this area. Try to put a value on different skills or results (on-base percentage, walks, or market-cap). How are different variables connected? What’s currently undervalued and what’s overvalued?
3) Ask hard, even contrarian, questions and seek out different perspectives.
Challenge the norms within your sector, culture, or league. Don’t be different just to be different but understand that the standard approach – or even your entire industry – might be severely under-optimized. Seeing reality through the illusion is incredibly valuable.
4) Be honest with yourself.
Embrace your findings. Act on them. Yes, that probably means risking failure.
Moneyball
I was inspired to write this post after reading Michael Lewis’ Moneyball. While I haven’t been a baseball fan since I was about 9 years old, listening to Bill James (one of the key players in all of this) on Russ Robert’s EconTalk got me really excited about the story of the Oakland A’s 2002 season – which was made into a very popular movie as well. I highly recommend readingMoneyball – which uses baseball as an analogy for the tactical and strategic failings of many organizations.
We’ve all heard the phrase “divide and conquer” (divide et impera) but when was the last time you heard concordia res parvae crescunt?
Over 250 years ago – when the 13 colonies were on the cusp of the Revolutionary War with Britain – John Dickinson wrote a series of 12 essays called the Letters from a Farmer in Pennsylvania.
In these letters, Dickinson shared his thoughts on how the colonies should respond to the Townshend Acts (1767) – a set of taxes that the British imposed on glass, lead, paints, paper, and tea before the outbreak of the American Revolution.
In addition to these new taxes, the British we’re also explicitly penalizing the New York colony for refusing to house and feed British troops. Dickinson, known as The “Penman of the Revolution,” wrote the following in response to this so-called New York Restraining Act:
I say, of these colonies; for the cause of one is the cause of all. If the [British] parliament may lawfully deprive New York of any of her rights, it may deprive any, or all the other colonies of their rights; and nothing can possibly so much encourage such attempts, as a mutual inattention to the interests of each other. To divide, and thus to destroy, is the first political maxim in attacking those, who are powerful by their union. (emphasis Dickinson’s)
Said another way: If your goal is to destroy those who are powerful because of their union, first seek to divide them. Further, if you can encourage each party to focus on only their own interests, then division is easier.
Unite & Grow Great
Taking the opposite perspective: If your goal is to grow strong or powerful by nature of your union with others, then seek harmony with them, pay attention to their interests, and refuse to cooperate with one another’s enemies.
Remember: Concordia res parvae crescunt. Small things grow great by concord.