Categories
Culture

Thoughts on „The Phoenix Project“

I had „The Phoenix Project“ on my bookshelf for a long time. When it was mentioned in our Christmas town hall, that finally triggered me to read it. And man, was it a read. In contrast to most other IT books I have read, it’s a novel, and it is written in a very dense way. After a few dozen pages, I recommended it to my wife, who then devoured it even faster than I did. I strongly recommend it to anyone in the IT/SW field, as it shows so many messy everyday anti-patterns, and ways to finally resolve them, in an emotionally engaging way. Even for people to whom the concepts it teaches, like DevOps and Lean, are well known, this book is a strong recommendation.

In this article I want to collect the passages that are highlights to me. I do not attempt to provide a comprehensive summary of the plot or the concepts presented. Others have done that much better; just google for a review. But then again, it is better to read the book and go through the rollercoaster yourself.

Disclaimer: I actually own the 5th anniversary edition of the book, which at the end also contains excerpts from „The DevOps Handbook“. Some citations below are actually from that part.

Change Management

Change Management is one of the recurring topics in the book.

To my surprise, Patty looks dejected. “Look, I’ve tried this before. I’ll tell you what will happen. The Change Advisory Board, or CAB, will get together once or twice. And within a couple of weeks, people will stop attending, saying they’re too busy. Or they’ll just make the changes without waiting for authorization because of deadline pressures. Either way, it’ll fizzle out within a month.”

page 44

This matches my experience with attempts to establish a pragmatic yet effective change management. The easier a Change Management process is to establish, the less proactive it actually is. That means documenting changes that are already done works better than managing changes before work on them actually starts. Good change managers are hard to find: people who really want to maintain a holistic perspective on a change, from business and financial to technical aspects, and who can stand their ground against upper management, who will always (!) want to sidestep any Change Management.

Lack of cross-functional collaboration

I’ve seen this movie before. The plot is simple: First, you take an urgent date-driven project, where the shipment date cannot be delayed because of external commitments made to […] customers. Then you add a bunch of developers who use up all the time in the schedule, leaving no time for testing or operations deployment. And because no one is willing to slip the deployment date, everyone after Development has to take outrageous and unacceptable shortcuts to hit the date.

page 53

I have seen this movie before, too. In an organization with decoupled engineering teams, there is always a blame game between those who come earlier in the chain and those who come later. A typical scene:

Chris replies hotly, “Don’t give me that bullshit about ‘throwing the pig over the wall.’ We invited your people to our architecture and planning meetings, but I can count on one hand the number of times you guys actually showed up. We routinely have had to wait days or even weeks to get anything we need from you guys!”
[…]
Wes rolls his eyes in frustration. “Yeah, it’s true that his people would invite us at the last minute. Seriously, who can clear their calendar on less than a day’s notice?”
[…]
I nod unhappily. This type of all-hands effort is just another part of life in IT, but it makes me angry when we need to make some heroic, diving catch because of someone else’s lack of planning.

pages 53ff, 55

When the people in the later process steps, like testers or IT operations, point out that delayed deliveries from developers lead to a crunch at their end, developers reply that they did in fact involve them early. However, that involvement is often insufficient and fragmented, if it happens in any meaningful way at all.

“Allspaw taught us that Dev and Ops working together, along with QA and the business, are a super-tribe that can achieve amazing things. They also knew that until code is in production, no value is actually being generated, because it’s merely WIP stuck in the system. He kept reducing the batch size, enabling fast feature flow.

page 297

Myth—DevOps Means Eliminating IT Operations, or “NoOps”: Many misinterpret DevOps as the complete elimination of the IT Operations function. However, this is rarely the case. While the nature of IT Operations work may change, it remains as important as ever. IT Operations collaborates far earlier in the software life cycle with Development, who continues to work with IT Operations long after the code has been deployed into production.
Instead of IT Operations doing manual work that comes from work tickets, it enables developer productivity through APIs and self-serviced platforms that create environments, test and deploy code, monitor and display production telemetry, and so forth. By doing this, IT Operations become more like Development (as do QA and Infosec), engaged in product development, where the product is the platform that developers use to safely, quickly, and securely test, deploy, and run their IT services in production.

page 360 (actually from The DevOps Handbook)

Simultaneously, QA, IT Operations, and Infosec are always working on ways to reduce friction for the team, creating the work systems that enable developers to be more productive and get better outcomes. By adding the expertise of QA, IT Operations, and Infosec into delivery teams and automated self-service tools and platforms, teams are able to use that expertise in their daily work without being dependent on other teams.
This enables organizations to create a safe system of work, where small teams are able to quickly and independently develop, test, and deploy code and value quickly, safely, securely, and reliably to customers. This allows organizations to maximize developer productivity, enable organizational learning, create high employee satisfaction, and win in the marketplace.

page 355 (actually from The DevOps Handbook)

Organizing teams in a cross-functional fashion is an evergreen topic to this day. It’s rarely done consistently enough, and when everything goes down the drain, task forces are formed in which exactly that happens (bringing everyone together). See my Corporate SW Engineering Aphorisms.

As Randy Shoup, formerly a director of engineering at Google, observed, large organizations using DevOps “have thousands of developers, but their architecture and practices enable small teams to still be incredibly productive, as if they were a startup.”

page 378 (actually from The DevOps Handbook)

I think this is an underestimated aspect: cutting teams into product/feature verticals will only work at scale if your system and software architecture enable that working mode.

Bottleneck Staff

A core engineer in the book is Brent. He is the go-to expert for everyone, knowing the IT systems inside out. However, this makes him the bottleneck for almost everything, from feature deployment to outage resolution.

Wes nods, “Yep. He’s the guy we need at those meetings to tell those goddamned developers how things work in the real world and what type of things keep breaking in production. The irony, of course, is that he can’t tell the developers, because he’s too busy repairing the things that are already broken.”
[…]
“Probably because someone like me was screaming at him, saying that I absolutely needed his help to get my most important task done. And it’s probably true: For way too many things, Brent seems to be the only one who knows how they actually work.”

page 56, page 115

“Maybe we create a resource pool of level 3 engineers to handle the escalations[…]. The level 3s would be responsible for resolving all incidents to closure, and would be the only people who can get access to Brent—on one condition. If they want to talk with Brent, they must first get Wes’ or my approval,” I say. “They’d be responsible for documenting what they learned, and Brent would never be allowed to work on the same problem twice. I’d review each of the issues weekly, and if I find out that Brent worked a problem twice, there will be hell to pay. For both the level 3s and Brent. […] Based on Wes’ story, we shouldn’t even let Brent touch the keyboard. He’s allowed to tell people what to type and shoulder-surf, but under no condition will we allow him to do something that we can’t document afterward. Is that clear?”
“That’s great,” Patty says. “At the end of each incident, we’ll have one more article in our knowledge base of how to fix a hairy problem and a growing pool of people who can execute the fix.”

page 116

Wes says, […] confirming my worst fears. “[CEO] Steve insisted that we bring in all the engineers, including Brent. He said he wanted a ‘sense of urgency’ and ‘hands on keyboards, not people sitting on the bench.’ Obviously, we didn’t do a good enough job coordinating everyone’s efforts, and…” Wes doesn’t finish his sentence.
Patty picks up where he left off, “We don’t know for sure, but at the very least, the inventory management systems are now completely down, too. […]”

page 178

He pauses and then says emphatically, “Eliyahu M. Goldratt, who created the Theory of Constraints, showed us how any improvements made anywhere besides the bottleneck are an illusion. Astonishing, but true! Any improvement made after the bottleneck is useless, because it will always remain starved, waiting for work from the bottleneck. And any improvements made before the bottleneck merely results in more inventory piling up at the bottleneck.”

page 90

I’ve also come across otherwise smart guys who are of the mistaken belief that if they hold on to a task, something only they know how to do, it’ll ensure job security. These people are knowledge Hoarders.

David Lutz, https://dlutzy.wordpress.com/2013/05/03/the-phoenix-project/

As a solution, Dr. Goldratt defined the “five focusing steps”:
– Identify the system’s constraint.
– Decide how to exploit the system’s constraint.
– Subordinate everything else to the above decisions.
– Elevate the system’s constraint.
– If in the previous steps a constraint has been broken, go back to step one, but do not allow inertia to cause a system constraint.

page 401 (actually from The DevOps Handbook)
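
Goldratt’s point that improvements away from the bottleneck are an illusion can be illustrated with a toy model. The following is only a minimal sketch: it assumes a purely serial value stream in which each stage has a fixed weekly capacity, and the stage names and numbers are made up for illustration (they are not from the book).

```python
def find_constraint(capacities):
    """Return the name of the stage with the lowest capacity."""
    return min(capacities, key=capacities.get)


def pipeline_throughput(capacities):
    """A serial pipeline can never deliver more than its slowest stage."""
    return min(capacities.values())


# Hypothetical capacities in work items per week (illustrative only).
stages = {"development": 10, "testing": 8, "brent": 3, "deployment": 6}

baseline = pipeline_throughput(stages)
# Doubling a non-constraint stage changes nothing for the overall flow.
improved_elsewhere = pipeline_throughput({**stages, "deployment": 12})
# Elevating the constraint is the only change that raises throughput.
improved_constraint = pipeline_throughput({**stages, "brent": 5})

print(find_constraint(stages))   # brent
print(baseline, improved_elsewhere, improved_constraint)  # 3 3 5
```

In this simplified model, doubling deployment capacity leaves throughput at 3, while raising the constraint from 3 to 5 lifts the whole pipeline to 5. Real value streams have queues, rework and variability, so treat this as an intuition aid only.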

All those money quotes highlight that a hero culture is detrimental to a mature organization. Pulling off heroic actions every once in a while may seem unavoidable, but it’s never a sign of good management to depend on them. Use your heroes to deliver kick-ass customer features in an orderly process before your competition does, but don’t require heroes for everyday tasks.

WIP is the silent killer

He gestures broadly with both arms outstretched, “In the 1980s, this plant was the beneficiary of three incredible scientifically-grounded management movements. You’ve probably heard of them: the Theory of Constraints, Lean production or the Toyota Production System, and Total Quality Management. Although each movement started in different places, they all agree on one thing: WIP is the silent killer. Therefore, one of the most critical mechanisms in the management of any plant is job and materials release. Without it, you can’t control WIP.”

page 89

Dominica DeGrandis, one of the leading experts on using kanbans in DevOps value streams, notes that “controlling queue size [WIP] is an extremely powerful management tool, as it is one of the few leading indicators of lead time—with most work items, we don’t know how long it will take until it’s actually completed.”

page 397 (actually from The DevOps Handbook)
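
The link between queue size and lead time in the quote above is commonly expressed through Little’s Law (average lead time = average WIP / average throughput). The quoted passage does not spell this out, so the snippet below is an added illustration with made-up numbers, valid only for a reasonably stable system.

```python
def average_lead_time(wip_items, throughput_per_week):
    """Little's Law: average lead time = average WIP / average throughput."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_week


# Hypothetical team finishing 5 items per week (illustrative numbers).
print(average_lead_time(30, 5))  # 6.0 weeks with 30 items in progress
print(average_lead_time(10, 5))  # 2.0 weeks after limiting WIP to 10
```

With throughput unchanged, cutting WIP from 30 to 10 items cuts the average lead time from six weeks to two, which is why WIP works as a leading indicator.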

Limiting Work in Progress has been one of my guiding principles for a decade. Ever since I have been in project or line management, it has proven to be the most effective way to get a handle on any messy situation. However, it’s not easy at all to limit WIP. Rarely is it just about saying No often enough. More often, it’s about combining efforts in clever ways, breaking complex tasks down, and aligning on exact requirements and expectations. But it all starts with a relentless assessment of the situation.

Stakeholder Management

Uncertain, I ask Steve, “Are we even allowed to say no? Every time I’ve asked you to prioritize or defer work on a project, you’ve bitten my head off. When everyone is conditioned to believe that no isn’t an acceptable answer, we all just became compliant order takers, blindly marching down a doomed path. I wonder if this is what happened to my predecessors, too.”

page 196

It doesn’t work without top management. If yours continuously sidesteps any reasonably pragmatic process and ignores requests for prioritization, it shows their lack of management skills, not yours (but they will make sure you feel the opposite).

Continuous Improvement

“Mike Rother says that it almost doesn’t matter what you improve, as long as you’re improving something. Why? Because if you are not improving, entropy guarantees that you are actually getting worse, which ensures that there is no path to zero errors, zero work-related accidents, and zero loss. […] Rother calls this the Improvement Kata […] He used the word kata, because he understood that repetition creates habits, and habits are what enable mastery. Whether you’re talking about sports training, learning a musical instrument, or training in the Special Forces, nothing is more to mastery than practice and drills. Studies have shown that practicing five minutes daily is better than practicing once a week for three hours. And if you want to create a genuine culture of improvement, you must create those habits.”

page 213

Like the legendary stories of the original Apple Mac OS and Netflix cloud delivery infrastructure, we deployed code that routinely created large-scale faults, thus randomly killing processes or entire servers. Of course, the result was all hell breaking loose for an entire week as our test, and occasionally, production infrastructure crashed like a house of cards. But, over the following weeks, as Development and IT Operations worked together to make our code and infrastructure more resilient to failures, we truly had IT services that were resilient, rugged, and durable.
John [Security] loved this, and started a new project called “Evil Chaos Monkey.” Instead of generating operational faults in production, it would constantly try to exploit security holes, fuzz our applications with storms of malformed packets, try to install backdoors, gain access to confidential data, and all sorts of other nefarious attacks.
Of course, Wes tried to stop this. He insisted that we schedule penetration tests into predefined time frames. However, I convinced him this is the fastest means to institutionalize Erik’s Third Way. We need to create a culture that reinforces the value of taking risks and learning from failure and the need for repetition and practice to create mastery. I don’t want posters about quality and security. I want improvement of our daily work showing up where it needs to be: in our daily work.
John’s team developed tools that stress-tested every test and production environment with a continual barrage of attacks. And like when we first released the chaos monkey, immediately over half their time was spent fixing security holes and hardening the code. After several weeks, the developers were deservedly proud of their work, successfully fending off everything that John’s team was able to throw at them.

page 329

Because we care about quality, we even inject faults into our production environment so we can learn how our system fails in a planned manner. We conduct planned exercises to practice large-scale failures, randomly kill processes and compute servers in production, and inject network latencies and other nefarious acts to ensure we grow ever more resilient. By doing this, we enable better resilience, as well as organizational learning and improvement.

page 376 (actually from The DevOps Handbook)

Using chaos-spreading tools like Chaos Monkey is something we are currently exploring. I don’t know (nor have I researched yet) whether this is done in embedded beyond typical fuzzing approaches, but I see a lot of potential here.

Throughput

I tell them […] about how wait times depend upon resource utilization. “The wait time is the ‘percentage of time busy’ divided by the ‘percentage of time idle.’ In other words, if a resource is fifty percent busy, then it’s fifty percent idle. The wait time is fifty percent divided by fifty percent, so one unit of time. Let’s call it one hour. So, on average, our task would wait in the queue for one hour before it gets worked. “On the other hand, if a resource is ninety percent busy, the wait time is ‘ninety percent divided by ten percent’, or nine hours. In other words, our task would wait in queue nine times longer than if the resource were fifty percent idle.” I conclude, “So, for the Phoenix task, assuming we have seven handoffs, and that each of those resources is busy ninety percent of the time, the tasks would spend in queue a total of nine hours times the seven steps…”
“What? Sixty-three hours, just in queue time?” Wes says, incredulously […]

Patty says, “What that graph says is that everyone needs idle time, or slack time. If no one has slack time, WIP gets stuck in the system. Or more specifically, stuck in queues, just waiting.”

page 235f

Naturally, as the book is about lean concepts, throughput plays an important role. From the passage above I actually learned a new aspect. The calculation of wait time from resource utilization seems a bit counterintuitive, and I still haven’t fully bought it. However, there is certainly a strong correlation here. Your resources, be it staff or tools, should never be occupied above a certain ratio (50%? 80%?), otherwise your whole value stream will go down the drain.
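
The formula from the quote can be tried out directly. The sketch below reproduces only the book’s rule of thumb (wait time = percent busy / percent idle, in units the book arbitrarily calls hours); the function names are made up for illustration. The rule resembles the textbook result for a single-server queue with random arrivals (M/M/1), where the average time in queue is utilization / (1 − utilization) times the average service time, so it is a simplification rather than a precise prediction.

```python
def wait_hours(busy_percent, unit_hours=1.0):
    """Wait time per the book's rule of thumb: percent busy / percent idle."""
    if not 0 <= busy_percent < 100:
        raise ValueError("busy_percent must be in [0, 100)")
    idle_percent = 100 - busy_percent
    return unit_hours * busy_percent / idle_percent


def total_queue_hours(busy_percent, handoffs):
    """Queue time summed over a number of equally loaded handoffs."""
    return handoffs * wait_hours(busy_percent)


for busy in (50, 80, 90, 95):
    print(f"{busy}% busy -> {wait_hours(busy):.1f} h in queue per handoff")

# The Phoenix example: seven handoffs, each resource 90% busy.
print(f"7 handoffs at 90% busy -> {total_queue_hours(90, 7):.0f} h")
```

This prints 1.0, 4.0, 9.0 and 19.0 hours for 50%, 80%, 90% and 95% utilization, and the 63 hours from the quote. The steep rise between 80% and 95% is what makes slack time so valuable.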

Emotional and Motivational Aspects

When people are trapped in this downward spiral for years, especially those who are downstream of Development, they often feel stuck in a system that preordains failure and leaves them powerless to change the outcomes. This powerlessness is often followed by burnout, with the associated feelings of fatigue, cynicism, and even hopelessness and despair. Many psychologists assert that creating systems that cause feelings of powerlessness is one of the most damaging things we can do to fellow human beings—we deprive other people of their ability to control their own outcomes and even create a culture where people are afraid to do the right thing because of fear of punishment, failure, or jeopardizing their livelihood. This can create the conditions of learned helplessness, where people become unwilling or unable to act in a way that avoids the same problem in the future.

pages 372f (actually from The DevOps Handbook)

Never underestimate the impact of employee mood and commitment on your organization’s performance. This is another thing that sounds obvious at first, but looking behind the curtain, listening to coffee room chatter and engaging with people’s honest opinions will always (!) reveal something you need to improve on, completely outside of any hard project KPIs. Listen.

Quality and Safety

In addition to lead times and process times, the third key metric in the technology value stream is percent complete and accurate (%C/A). This metric reflects the quality of the output of each step in our value stream. Karen Martin and Mike Osterling state that “the %C/A can be obtained by asking downstream customers what percentage of the time they receive work that is ‘usable as is,’ meaning that they can do their work without having to correct the information that was provided, add missing information that should have been supplied, or clarify information that should have and could have been clearer.”

page 391 (actually from The DevOps Handbook)

Consider when we have an annual schedule for software releases, where an entire year’s worth of code that Development has worked on is released to production deployment. Like in manufacturing, this large batch release creates sudden, high levels of WIP and massive disruptions to all downstream work centers, resulting in poor flow and poor quality outcomes. This validates our common experience that the larger the change going into production, the more difficult the production errors are to diagnose and fix, and the longer they take to remediate.

page 399 (actually from The DevOps Handbook)

Dr. Sidney Dekker, who also codified some of the key elements of safety culture, observed another characteristic of complex systems: doing the same thing twice will not predictably or necessarily lead to the same result. It is this characteristic that makes static checklists and best practices, while valuable, insufficient to prevent catastrophes from occurring.

page 406 (actually from The DevOps Handbook)

Examples of ineffective quality controls include:
– Requiring another team to complete tedious, error-prone, and manual tasks that could be easily automated and run as needed by the team who needs the work performed
– Requiring approvals from busy people who are distant from the work, forcing them to make decisions without an adequate knowledge of the work or the potential implications, or to merely rubber stamp their approvals
– Creating large volumes of documentation of questionable detail which become obsolete shortly after they are written
– Pushing large batches of work to teams and special committees for approval and processing and then waiting for responses
Instead, we need everyone in our value stream to find and fix problems in their area of control as part of our daily work. By doing this, we push quality and safety responsibilities and decision-making to where the work is performed […]

page 411 (actually from The DevOps Handbook)
Categories
Culture

The best mail filter rule in the world

As someone who uses and contributes to many web services, I accumulate many automated notifications over time. Jira ticket updates, newly created Confluence pages in a folder I watch, GitLab Merge Request status updates, … there is so much going on, and I personally like to read along whenever I have some spare minutes. So, turning off all those notifications is not an option for me. On the other hand, I would get hundreds of such mails in my inbox every day.

So, the first, rather obvious step was to add a filter/rule which moves all mail coming from automailers like jira-no-reply@foo.bar.com to a Notifications folder.

That works in a pretty straightforward manner. It makes my inbox much cleaner, and I can browse the Notifications folder whenever I want. However, now I may miss especially relevant updates. How do I find those?

Turns out, a good heuristic is to just use my own name as an indicator for relevance (at least for myself haha). So, I do not move any mail which contains my first name or my internal user id → those stay in my main inbox!

Of course, besides using your own name (or mine 😉) you can use other terms which indicate relevance.
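
The logic of the rule, independent of any particular mail client, can be sketched in a few lines. This is only an illustration of the decision, not an actual client configuration: the relevance terms are hypothetical placeholders for a first name and an internal user id, and the function name is made up.

```python
# Automailer address taken from the example above.
AUTOMAILERS = {"jira-no-reply@foo.bar.com"}
# Hypothetical relevance terms: replace with your own name / user id.
RELEVANCE_TERMS = ("jane", "jdoe42")


def target_folder(sender, subject, body):
    """Decide where a mail goes: 'Inbox' or 'Notifications'."""
    if sender.lower() not in AUTOMAILERS:
        return "Inbox"  # mail from humans is never filtered
    text = f"{subject}\n{body}".lower()
    # Plain substring match, so "jane" would also match "janet".
    if any(term in text for term in RELEVANCE_TERMS):
        return "Inbox"  # automated, but it mentions me
    return "Notifications"


print(target_folder("jira-no-reply@foo.bar.com", "Ticket updated", "Status: Done"))
print(target_folder("jira-no-reply@foo.bar.com", "Ticket updated", "Assigned to Jane"))
print(target_folder("boss@foo.bar.com", "Lunch?", "12:30?"))
```

The three calls return Notifications, Inbox and Inbox, respectively. In a real mail client the same idea is typically expressed as one rule with an exception condition.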

This allows me to get all notifications, have the most relevant ones in my main inbox and, thus, stay on top of what’s going on without drowning in internal update spam.

Categories
Coding Culture

Corporate SW Engineering Aphorisms

During my recent vacation I was reflecting on patterns I have observed in my 15+ years in a corporate SW engineering environment. My friends and colleagues know me to be very interested in the meta-level of organizational dynamics; my blog is evidence of this.

It’s not so easy (for me) to communicate such patterns in a thought-provoking manner. The internet and software culture offer a very rich collection of takes from much more clever people. You have probably heard of Murphy’s Law, Conway’s Law, or Parkinson’s Law of Triviality. https://matthewreinbold.com/2020/08/03/Technology-Aphorisms has a nice collection of those. However, phrasing laws is a bit too much for my humble self. Instead, I figured aphorisms are more appropriate for my opinions. I have to admit, though, that I can’t perfectly differentiate between aphorism, sententia, maxim, aperçu, and bon mot. I guess this is just me trying to be clever, too 🙂

In task forces you bring together end-to-end teams and miraculously it works. Why do you wait for the task force to form such teams?

Great organizations mature and evolve their software over many years. Others replace it every other year – and call this progress.

While the strategy grows on shiny slides, the engineers wonder who still listens to them.

In a world of infinite content, silence becomes signal.

It’s not the traffic that breaks the system — it’s the architect’s fantasy of it.

In the world of junior architects, no problem is too small for an oversized solution.

Overengineering doesn’t solve problems — it staffs them.

Complexity is job security — for the team that caused it.

Not every repetition is a problem. Some abstractions are worse.

YAGNI is no excuse for architecture – but a compass for its necessity.

Every new DSL saves you five keystrokes – and costs you three days of debugging.

Categories
Culture

Business Trip to Egypt

A few weeks ago, my colleagues Kemal Hajvazovic, Seif Abdelmegeed and I had the chance to visit our partners from Luxoft Egypt in Cairo. It was great to meet the team which is helping us on our ADAS platform software journey in many regards. Great culture and spirit – keep it up!

Thanks for hosting us, Fatmaelzahraa Mohamed & Amr Hussein Taher!

Categories
Coding Culture

Technology Radar #32: Automotive SW perspective

A few days ago, version #32 of Thoughtworks‘ (TW) Technology Radar was published. As in earlier blog posts, I want to review its topics from the perspective of automotive embedded SW engineering. As usual, there is some bias towards cloud technology and machine learning, which is outside my current professional scope (exceptions apply). However, every edition contains enough other concepts, tools and aspects that make me investigate and follow up as best I can, either at work or in my private projects. In this blog post I will try to list those parts which are reasonably new and relevant from an automotive software perspective.

Let’s start with the Techniques sector. The first item TW recommends to adopt is Fuzz Testing. Indeed, it’s a testing approach with great potential which I have barely ever leveraged (I am not alone: „Fuzz testing, or simply fuzzing, is a testing technique that has been around for a long time but it is still one of the lesser-known techniques“). Worth noting: Google has an interesting project called OSS-Fuzz in which they fuzz open source projects for free, and actually find a lot of issues. Fuzz Testing is among the top 10 software engineering practices I want to see in real project life as soon as possible.
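
To make the idea tangible, here is a deliberately naive fuzzing sketch. Everything in it is hypothetical: `parse_frame` is a made-up parser with a planted bug, and the harness simply throws random bytes at it. Production fuzzers are typically coverage-guided and far more capable; this only shows the principle of feeding unexpected input and watching for undocumented failures.

```python
import random


def parse_frame(data: bytes) -> bytes:
    """Hypothetical frame format: first byte = payload length, rest = payload."""
    length = data[0]  # planted bug: fails on empty input
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated frame")  # documented, expected rejection
    return payload


def fuzz(target, iterations=1000, max_len=8, seed=0):
    """Feed random byte strings to target and collect unexpected exceptions."""
    rng = random.Random(seed)  # fixed seed keeps findings reproducible
    crashes = []
    for _ in range(iterations):
        size = rng.randint(0, max_len)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # rejecting malformed input cleanly is fine
        except Exception as exc:  # anything else counts as a finding
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_frame)
print(f"{len(crashes)} unexpected failures, e.g. {crashes[0]}")
```

Every finding here is the empty input triggering an IndexError, an edge case that hand-written tests easily miss.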

The next interesting item, „API request collection as API product artifact“, sounds a bit clunky. I interpret it as a set of sample API requests which help developers adopt APIs inside an ecosystem more quickly. That is definitely desirable, as examples are often more helpful for getting the hang of a specific API than its documentation (not to mention that documentation is still very important, too). One caveat: you need to establish ways to keep the examples/collection up to date so they don’t break after a while when the API evolves.

Then comes Architecture advice process, which resonates very well with current experience: In large software projects, a common challenge is the architectural alignment processes. Traditional methods, like Architecture Review Boards, often slow things down and are linked to poor organizational outcomes. A more effective alternative may be the architectural advice process—a decentralized model where anyone can make architectural choices, as long as they consult with affected stakeholders and experts. This approach supports faster workflows without sacrificing quality, even at scale. Though it may initially seem risky, tools like Architecture Decision Records and advisory groups help keep decisions well-informed. This model is proving successful, even in tightly regulated industries. Andrew Harmel-Law has written an insightful blog post on it.

In the Tools sector, uv is getting a spotlight. uv is the hot shit right now in the Python ecosystem. While I have not used it myself, I see it gradually replacing other Python package managers. This is due to its fast execution, but also its well-designed features, which make it easier to run more or less self-contained Python projects.

Categories
Culture

Business Trip to India

It was a blast at Mercedes-Benz Research and Development India with my team colleagues from Platform Software Integration Testing, Virtual ECU and Adaptive Integration! Thanks for hosting us, for the great achievements and for the inspiration for further innovation!

Categories
Book Culture

Thoughts on “Implementing Lean Software Development”

Reading and summarizing books on lean software development, so you don’t have to. Part 3 (see Part 1 and Part 2).

“Implementing Lean Software Development” was written by Mary and Tom Poppendieck and published in 2007 by Addison-Wesley. The Poppendiecks are quite famous in the lean-agile software development community, as they published the seminal book „Lean Software Development: An Agile Toolkit“ in 2003, the first (recognized) book about bringing lean principles to the software development space. The book reviewed here is a successor aimed at delivering more practical advice. As in the previous parts, my review will not re-iterate lean and agile fundamentals, but rather focus on novel aspects, ideas, and noteworthy pieces.

In the foreword, Jeff Sutherland (co-founder of the Scrum framework) introduces the Japanese terms of Muri (properly loading a system), Mura (never stressing a person, system or process) and Muda (waste):

Yet many managers want to load developers at 110 percent. They desperately want to create a greater sense of “urgency” so developers will “work harder.” They want to micromanage teams, which stifles self-organization. These ill-conceived notions often introduce wait time, churn, death marches, burnout, and failed projects.
When I ask technical managers whether they load the CPU on their laptop to 110 percent they laugh and say, “Of course not. My computer would stop running!” Yet by overloading teams, projects are often late, software is brittle and hard to maintain, and things gradually get worse, not better.

page xix

In their historical review, the authors present a very interesting statistic which should resonate with many of my peers:

Both Toyodas had brilliantly perceived that the game to be played was not economies of scale, but conquering complexity. Economies of scale will reduce costs about 15 percent to 25 percent per unit when volume doubles. But costs go up by 20 percent to 35 percent every time variety doubles. Just-in-Time flow drives out major contributors to the cost of variety. In fact, it is the only industrial model we have that effectively manages complexity.

page 5

As evidence, two papers are cited: „Time – The Next Source of Competitive Advantage“ by George Stalk and „Lean or Sigma?“ by Freddy and Michael Balle. Managers and engineers are becoming increasingly aware of the not-so-visible cost of complexity, typically by experiencing project failure or long-term product degradation.

For the aspect of inventory, the authors provide quite a good metaphor:

Inventory is the water level in a stream, and when the water level is high, a lot of big rocks lurking under the water are hidden. If you lower the water level, the big rocks begin to surface. At that point, you have to clear the rock out of the way, or your boat will crash into them. As the big rocks are removed, you can lower inventory level some more, find more rocks, clear them out of the stream, and keep on going until there are just pebbles left.

page 8

The authors explain why adopting lean practices and a lean mindset is not straightforward, and why many organizations struggle or fail to do so, by pointing at a „cherrypicking“ approach: only some activities of the lean domain are adopted in isolation, like just-in-time or stop-the-line. Instead, they cite a classic:

The truly lean plant […] transfers the maximum number of tasks and responsibilities to those workers actually adding value to the car on the line, and it has in place a system for detecting defects that quickly traces every problem, once discovered, to its ultimate source.

Womack, Jones, Roos: The machine that changed the world, page 99

I think this cannot be overestimated. Too seldom have I seen organizations and management really focusing on the „value creators“ and the impediments they are facing.

In earlier blog posts I already wrote about the differences and similarities between lean manufacturing and lean development. The Poppendiecks provide a table putting both side-by-side (page 14):

Later, in a footnote, the authors refer to a paper by Kajko-Mattsson et al. on the cost of software maintenance. The numbers in the paper’s sources vary a lot; however, for a typical big software project it is obvious that a ratio like this quickly translates to millions of euros or dollars.

The published numbers point out that maintenance costs between 40% to 90% […]. There are very few publications reporting on the cost of each individual maintenance category. The reported ones are the following: (1) corrective maintenance – 16-22% […] (2) perfective maintenance – 55% […], and (3) adaptive maintenance – 25% […].

Kajko-Mattsson et al: Taxonomy of problem management activities, page 1

On the lean principle of waste, the Poppendiecks make a simple but revealing statement:

To eliminate waste, you first have to recognize it. Since waste is anything that does not add value, the first step to eliminating waste is to develop a keen sense of what value really is. There is no substitute for developing a deep understanding of what customers will actually value once they start using the software. In our industry, value has a habit of changing because, quite often, customers don’t really know what they want. In addition, once they see new software in action, their idea of what they want will invariably shift. Nevertheless, great software development organizations develop a deep sense of customer value and continually delight their customers.

page 23

Too often have I experienced software development projects that don’t know what their product, and the value it provides, actually is. Of course, everyone has a vague feeling about what it could be, but putting it in clear words is seldom attempted and easily ends in conflict (a conflict which can be constructive if facilitated well).

On the second principle „Build Quality In“, there are some interesting distinctions on defects and the relation to „inspection“:

According to Shigeo Shingo, there are two kinds of inspection: inspection after defects occur and inspection to prevent defects. If you really want quality, you don’t inspect after the fact, you control conditions so as not to allow defects in the first place. If this is not possible, then you inspect the product after each small step, so that defects are caught immediately after they occur. When a defect is found, you stop-the-line, find its cause, and fix it immediately.
Defect tracking systems are queues of partially done work, queues of rework if you will. Too often we think that just because a defect is in a queue, it’s OK, we won’t lose track of it. But in the lean paradigm, queues are collection points for waste. The goal is to have no defects in the queue, in fact, the ultimate goal is to eliminate the defect tracking queue altogether. If you find this impossible to imagine, consider Nancy Van Schooenderwoert’s experience on a three-year project that developed complex and often-changing embedded software. Over the three-year period there were a total of 51 defects after unit testing with a maximum of two defects open at once. Who needs a defect tracking system for two defects?

page 27

The authors cite two papers by Nancy Van Schooenderwoert („Taming the Embedded Tiger – Agile Test Techniques for Embedded Software“ and „Embedded Agile Project by the Numbers With Newbies“). This resonates well with me, because accumulating too many defects (tickets) is very expensive waste. It’s a kind of inventory with the worst properties. Breaking out of this is not straightforward: I have attempted and failed multiple times to establish a „zero defect policy“ (i.e. as long as there is a defect, no further feature development happens). In that context let me add two more quotes from the book:

The job of tests, and the people that develop and runs tests, is to prevent defects, not to find them.

page 28

“Do it right the first time,” has been interpreted to mean that once code is written, it should never have to be changed. This interpretation encourages developers to use some of the worst known practices for the design and development of complex systems. It is a dangerous myth to think that software should not have to be changed once it is written.

page 29

On the fifth principle of „Deliver Fast“ a very important statement is made:

Caution: Don’t equate high speed with hacking. They are worlds apart. A fast-moving development team must have excellent reflexes and a disciplined, stop-the-line culture. The reason for this is clear: You can’t sustain high speed unless you build quality in.

page 35

Very often I observe a dire need for speed. Of course everyone wants to be faster in the software industry. Competition doesn’t sleep. However, similar to unclear definitions of value and products, I have barely ever seen a clear definition of speed in a software project. Or, probably more correctly: there were competing definitions of speed in people’s, and especially decision makers’, minds. It makes a huge difference whether you beat your team to „push out features now“ and then grind to a halt when quality activities start, or whether you maintain a sustainable pace:

When you measure cycle time, you should not measure the shortest time through the system. It is a bad idea to measure how good you are at expediting, because in a lean environment, expediting should be neither necessary nor acceptable. The question is not how fast can you deliver, but how fast do you repeatedly and reliably deliver a new capability or respond to a customer request.

page 238
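
To illustrate the difference between „how fast can you deliver“ and „how fast do you reliably deliver“, here is a small sketch of my own. The function names, the nearest-rank percentile method, the choice of the 85th percentile and the sample data are all assumptions for illustration and do not come from the book.

```python
def nearest_rank_percentile(values, pct):
    """Return the nearest-rank percentile (pct is an integer from 1 to 100)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def cycle_time_report(cycle_times_days):
    """Summarize cycle times by typical and reliable values, not the best case."""
    return {
        "fastest": min(cycle_times_days),  # the expedited case, not a target
        "median": nearest_rank_percentile(cycle_times_days, 50),
        "p85": nearest_rank_percentile(cycle_times_days, 85),
    }


# Hypothetical cycle times (in days) of ten finished work items.
sample = [2, 3, 3, 4, 5, 5, 6, 8, 13, 21]
print(cycle_time_report(sample))
```

For this made-up sample the fastest item took 2 days, but a promise the team could keep most of the time would be closer to the 85th percentile of 13 days. The sketch expects a non-empty list of finished items.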

The Poppendiecks summarize those effects in two vicious cycles (page 38):

For all the lean principles, the Poppendiecks also discuss myths originating from misinterpreting the principles or applying them wrongly. One which caught my attention was the myth „Optimize by decomposition“. It’s about the proliferation of metrics once an organization starts to reap the benefits of visual management. All of a sudden, there are tens if not hundreds of dashboards, graphs, KPIs, and such flying around. Their recommendation:

When a measurement system has too many measurements the real goal of the effort gets lost among too many surrogates, and there is no guidance for making tradeoffs among them. The solution is to “Measure UP” that is, raise the measurement one level and decrease the number of measurements. Find a higher-level measurement that will drive the right results for the lower level metrics and establish a basis for making trade-offs.

page 40

Speaking about myths, they encourage readers to check which myths apply to their situation – certainly a worthwhile exercise for you, too 🙂

Early specification reduces waste
The job of testing is to find defects
Predictions create predictability
Planning is commitment
Haste makes waste
There is one best way
Optimize by decomposition

page 42

Coming back to the notion of value, the authors are asking the fundamental question how great products are conceived and developed. They write:

In 1991, Clark and Fujimoto’s book Product Development Performance presented strong evidence that great products are the result of excellent, detailed information flow. The customers’ perception of the product is determined by the quality of the flow of information between the marketplace and the development team. The technical integrity of the product is determined by the quality of the information flow among upstream and downstream technical team members. There are two steps you can take to facilitate this information flow: 1) provide leadership, and 2) empower a complete team.

page 52

The book has an extensive chapter on waste with many insightful aspects. I don’t want to repeat all of them, and instead just provide some examples. For example, I found this statement on the relationship between automation and waste/complexity very inspiring:

We are not helping our customers if we simply automate a complex or messy process; we would simply be encasing a process filled with waste in a straight jacket of software complexity. Any process that is a candidate for automation should first be clarified and simplified, possibly even removing existing automation. Only then can the process be clearly understood and the leverage points for effective automation identified.

page 72

In my current position, automation is a key activity, and we try to automate everything in an endeavour to increase speed, quality and convenience. The quote points out that automation can hide or defer complexity. I can confirm this. Even though my team automated the complexity of product variants in the build process, our customers (e.g. manual testers) don’t have a chance to test all the builds we produce. Hence, even though it was made with the best intentions, our automation is overloading the whole system.

Another good comparison between traditional manufacturing and software development is the following table, putting the seven waste equivalents side-by-side (page 74):

On architectural foresight, I like the following statement:

Creating an architectural capability to add features later rather than sooner is good. Extracting a reusable services „framework“ for the enterprise has often proven to be a good idea. Creating a speculative application framework that can be configured to do just about anything has a track record of failure. Understand the difference.

page 76

While discussing Value Streams, the authors dig into effectiveness and efficiency. They are of the opinion that

chasing the phantom of full utilization creates long queues that take far more effort to maintain than they are worth – and actually decreases effective utilization.

page 88

This opinion is not mere speculation; they back it with a good analogy to road traffic and computer utilization:

High utilization is another thing that makes systems unstable. This is obvious to anyone who has ever been caught in a traffic jam. Once the utilization of the road goes above about 80 percent, the speed of the traffic starts to slow down. Add a few more cars and pretty soon you are moving at a crawl. When operations managers see their servers running at 80 percent capacity at peak times, they know that response time is beginning to suffer, and they quickly get more servers. […]

Most operations managers would get fired for trying to get maximum utilization out of each server, because it’s common knowledge that high utilization slows servers to a crawl. Why is it that when development managers see a report saying that 90 percent of their available hours were used last month, their reaction is, „Oh look! We have time for another project!“

pages 101f
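
Why utilization hurts so much can be sketched with a textbook queueing model. Note that the model choice is mine, not the book’s: I assume a single server with random arrivals and random service times (an M/M/1 queue), for which the average time in the system equals the service time divided by (1 – utilization). The function name `slowdown_factor` is made up for this example.

```python
def slowdown_factor(utilization):
    """Average time in system divided by pure service time for an M/M/1 queue.

    M/M/1 is an assumed model; real teams and servers only roughly follow it.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (1.0 - utilization)


for u in (0.5, 0.8, 0.9, 0.95):
    print(f"{u:.0%} utilized -> about {slowdown_factor(u):.0f}x the service time")
```

In this simplified model a request takes about five times its pure service time at 80 percent utilization and about ten times at 90 percent, which matches the intuition behind the traffic jam and the server examples.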

I think in daily work, management typically does not pay enough attention to those basics. It is not that people don’t know that too high a utilization of resources is bad; in my experience quite the opposite is the case. However, the root causes and the remedies are often not considered. Instead, there is a sentiment of capitulation: „Yes I know our team is stressed and overloaded, but we have to get faster nevertheless.“

In order to reduce cycle times, the authors refer to queuing theory, which provides several approaches:

Even out the arrival of work

Minimize the number of things in process

Minimize the size of things in process

Establish a regular cadence

Limit work to capacity

Use pull scheduling

page 103
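
Several of these approaches follow from Little’s Law, a standard result of queueing theory: in a stable system, the average cycle time equals the average number of items in process divided by the average throughput. The following sketch is my own illustration; the function name and the numbers are hypothetical.

```python
def average_cycle_time(work_in_process, throughput_per_week):
    """Little's Law: average cycle time = WIP / throughput (stable system assumed)."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return work_in_process / throughput_per_week


# Hypothetical team finishing 5 items per week.
print(average_cycle_time(30, 5))  # 30 items in process at once
print(average_cycle_time(10, 5))  # same throughput with a WIP limit of 10
```

With unchanged throughput, cutting the number of things in process from 30 to 10 reduces the average cycle time from 6 weeks to 2 weeks in this example, without anyone working „harder“.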

In the chapter „People“, there are many references to William Edwards Deming, a pioneer of quality management. It’s an irony of history that this American taught the fundamentals of what later became Lean in post-war Japan, while he was „discovered“ by the US (industrial) public only in the 1980s. Deming formulated what he called a „System of Profound Knowledge“:

  1. Appreciation of a System: A business is a system. Action in one part of the system will have effects in the other parts. We often call these “unintended consequences.” By learning about systems we can better avoid these unintended consequences and optimize the whole system.
  2. Knowledge of Variation: One goal of quality is to reduce variation. Managers who do not understand variation frequently increase variation by their actions. Critical to this is understanding the two types of variation — Common cause which is variation from the system and Special cause which is variation from outside the system.
  3. Theory of Knowledge: There is no knowledge without theory. Understanding the difference between theory and experience prevents shallow change. Theory requires prediction, not just explanation. While you can never prove that a theory is right, there must exist the possibility of proving it wrong by testing its predictions.
  4. Understanding of Psychology: To understand the interaction between work systems and people, leaders must seek to answer questions such as: How do people learn? How do people relate to change? What motivates people?
https://medium.com/10x-curiosity/system-of-profound-knowledge-ce8cd368ca62

When pursuing change and transformation, it is very important to get the staff on board. This is easier said than done, because employees have a very fine sense for inconsistency. They realize very quickly if, for example, a certain change in mindset is requested from them but not practiced by their supervisors. In engineering projects, the demands and expectations of decision makers are often antagonistic to their communicated strategies and visions. Just consider whether in your organization „quality“ is an essential part of the long-term goals, yet totally overridden by daily task force death marches.

The challenge of achieving quality is handled in another dedicated chapter. The authors point out the importance of „superb, detailed discipline“ to achieve high quality. This is where the famous „5 S’s“ come into play. The book’s authors also transfer them to the software space:

Sort (Seiri): Sort through the stuff on the team workstations and servers, and find the old versions of software and old files and reports that will never be used any more. Back them up if you must, then delete them.

Systematize (Seiton): Desktop layouts and file structures are important. They should be crafted so that things are logically organized and easy to find. Any workspace that is used by more than one person should conform to a common team layout so people can find what they need every place they log in.

Shine (Seiso): Whew, that was a lot of work. Time to throw out the pop cans and coffee cups, clean the fingerprints off the monitor screens, and pick up all that paper. Clean up the whiteboards after taking pictures of the important designs that are sketched there.

Standardize (Seiketsu): Put some automation and standards in place to make sure that every workstation always has the latest version of the tools, backups occur regularly, and miscellaneous junk doesn’t accumulate.

Sustain (Shitsuke): Now you just have to keep up the discipline.

page 191

I really enjoyed reading this book and can absolutely recommend it. It contains a lot of gems, and is probably one of those books you want to read again every other year to rediscover aspects and connect them to new experience.

Categories
Book Culture

Thoughts on „Lean Software Development in Action“

Reading and summarizing books on lean software development, so you don’t have to. Part 2 (see Part 1).

“Lean Software Development in Action” was written by Andrea Janes and Giancarlo Succi and published in 2014 by Springer. The authors are scientists at the University of Bolzano, and the book clearly has a more scientific approach than the last one.

Like the last book – and probably every book – on the matter of lean, agile and software engineering, this one starts with an introduction to each of those aspects. Again, I will not reiterate what lean and agile are, but focus on interesting observations and perspectives exposing new angles on „known stuff“.

The first noteworthy piece is about „tame and wicked projects“. This section refers to work by Rittel and Webber, who came up with a distinction between tame and wicked problems (see also). Poppendieck and Poppendieck extended this to projects, and in this book they are described and identified by the following ten points:

  1. Wicked projects cannot provide a definitive, analytical formulation of the problem they target. Formulating the project and the solution is essentially the same task. Each time you attempt to create a solution, you get a new, hopefully better, understanding of the project.
  2. Wicked projects have no stopping rule telling when the problem they target has been solved. Since you cannot define the problem, it is almost impossible to tell when it has been resolved. The problem-solving process proceeds iteratively and ends when resources are depleted and/or stakeholders lose interest in a further refinement of the currently proposed solution.
  3. Solutions to problems in wicked projects are not true or false, but good or bad. Since there are no unambiguous criteria for deciding if the project is resolved, getting all stakeholders to agree that a resolution is “good enough” can be a challenge.
  4. There is no immediate or ultimate test of a solution to the targeted problem in a wicked project. Solutions to such projects generate waves of consequences, and it is impossible to know how these waves will eventually play out.
  5. Each solution to the problem targeted by a wicked project has irreversible consequences. Care must be placed in managing assumed solutions. Once the website is published or the new customer service package goes live, you cannot take back what was online or revert to the former customer database.
  6. Wicked projects do not have a well-described, widely accepted set of potential solutions. The various stakeholders may have differing views of what are acceptable solutions. It is a matter of judgment as to when enough potential solutions have emerged and which should be pursued.
  7. Each wicked project is essentially unique. There are no well-defined “classes” of solutions that can be applied to a specific case. It is not easy to find analogous projects, previously solved and well documented, so that their solution could be duplicated.
  8. The problem targeted by a wicked project can be considered a symptom of another problem. A wicked project deals with a set of interlocking issues and constraints that change over time, embedded in a dynamic and evolving context.
  9. The causes of a problem targeted by a wicked project can be explained in several ways. There are several stakeholders who have various and changing ideas about what is the project, its nature, its causes, and the associated solution.
  10. The project must not go wrong. Mistake is not an option here. Despite the inability to express the project solution analytically, it is not allowed to fail the project.
page 9

These are some interesting observations which resonate with my experience. One might say „many of those points do not apply to our project as we have a quite clear understanding of our product delivery (like an ECU for a car, providing a certain set of functions/features for the customers)“. However, I think it’s not that simple. Many projects of non-trivial complexity I have been involved in do not only have the goal of releasing a product to the market; there are other, interlinked objectives which give the project as a whole a semi- or non-defined goal. Besides delivering a good, innovative product, these objectives may include financial goals (better return on investment), efficiency gains, the use of new technologies and approaches, and in- or outsourcing of activities. While the projects I know typically start with those objectives defined on a high level, and people are onboarded or recruited with reference to those motivating goals, as soon as the project enters the death march its goals gradually become more fuzzy, unbalanced and volatile. I don’t know of any silver bullets for such situations (yet), but the notion of wicked projects resonates with those and other observations. And isn’t awareness the first step to improvement?

[…] the quality of the final product is seen as a result of the process producing them. This assumption creates a high attention for a high-quality production process according to the credo “prevention is better than healing”

page 41

This lean wisdom sounds trivial; however, I have never seen it realized. I have yet to understand why so many managers ignore the efficiency and effectiveness gains of a proper process and instead decide to continuously apply brute force, which costs them more money, time, energy, motivation and subsequently, of course, quality. And, of course, by a „proper process“ I don’t mean a perfect considers-everything process, which is both unreachable and undesirable.

„The role of standardization“, page 43

I like this illustration showing standardization as a wheel chock for the plan-do-study/check-act cycle. Similar to the paragraph above, it strikes me how many projects and managers re-invent the wheel (haha) and then start the whole process from the bottom. Of course no one wants this or says so, but this is what often happens.

The result is a development approach in which requirements are not “refined down to an implementation,” i.e., taken as the starting point to develop an implementation that represents those requirements, but where the business objectives are mapped to the capabilities of the technical platform to “equally consider and adjust business goals and technical aspects to come to an optimal solution corresponding to the current situation.”

page 63

This is another insightful depiction. It provided me a new perspective on requirements, as they help to close the gap between business goals and technical capabilities. This can be a good approach helping in situations in which a project is lacking a good „feeling“ on the right amount of requirements, between over-specification and under-specification.

page 80

Referring to studies conducted by Herzberg this shows how motivators and hygiene factors influence the motivation of the staff, and how agile methods very well support/complement those.

Later the authors come to write about the „dark side of agile“, on which they also published a paper. As observed by others before, agile statements can be easily twisted and thwarted in good or bad faith to yield an abomination which leads to the opposite and extreme positions of the original intention. Citing Rakitin’s old paper the agile manifesto can be translated as

  • Individuals and interactions over processes and tools: “Talking to people instead of using a process gives us the freedom to do whatever we want.”
  • Working software over comprehensive documentation: “We want to spend all our time coding. Remember, real programmers don’t write documentation.”
  • Customer collaboration over contract negotiation: “Haggling over the details is merely a distraction from the real work of coding. We’ll work out the details once we deliver something.”
  • Responding to change over following a plan: “Following a plan implies we have to think about the problem and how we might actually solve it. Why would we want to do that when we could be coding?”
page 111

The above is one way to twist the agile manifesto in favor of what the authors a few paragraphs later call a „cowboy coder“. This reminds me of the „Programming, Motherfucker“ webpage (thanks, Kris). While such sentiment exists, I often can’t blame the engineers for mocking the agile manifesto and similar approaches that way. Very often, such engineers’ reactions are preceded by even more unfaithful perversions pushed by all sorts of management. Just to bring one example to the table: who has not witnessed a manager throwing new (vague) requirements at the development team every other week, claiming this is what agile is about and that, of course, everyone has to be faster in reacting? Because agile is faster. I could go on.

In chapter 6 the authors start to synthesise and bring lean and software development together. They start by citing the seminal book by Poppendieck and Poppendieck, „Implementing Lean Software Development: From Concept to Cash“. I couldn’t read it yet, as it wasn’t available in print when I tried to get hold of it. They provide the following 7 principles for lean software development:

  1. Eliminate waste;
  2. Build quality—we used the terms “autonomation” and “standardization”;
  3. Create knowledge;
  4. Defer commitment—we used the term “just-in-time”;
  5. Deliver fast—get frequent feedback from the customer and increase learning through frequent deployments;
  6. Respect people—we used the term “worker involvement”;
  7. Optimize the whole—we used the term “constant improvement.”
page 131

Nowadays, those principles may sound obvious. However, the Poppendieck book was published in 2006, and I think at that time many if not all of those principles were not yet clear, nor were best practices and tooling available to realize them.

In a break-out box, a comparison between lean and agile is given:

  • Agile Methods aim to achieve Agility, i.e., the ability to adapt to the needs of the stakeholders.
  • Lean production aims to achieve efficiency, i.e., the ability to produce what the stakeholders need with the least amount of resources possible.
page 144

After this section, which gives some more references to earlier work, the book enters its less interesting but extensive part. Janes and Succi present three methods which are meant to support lean software development. They all have their merits, but I have to admit I don’t catch fire for any of them.

Let me sketch those methods in short: First they introduce the „Goal Question Metric (plus)“, short GQM+. GQM+ is based on GQM, which is a methodology to derive crisp business goals. While I find some of the leading questions worthwhile, the overall concept strikes me as overly complex and hard to grasp.

After this, the authors present the „Experience Factory“. This is essentially an extension of the classic plan-do-study/check-act cycle with additional steps and a „sub-cycle“. It’s a semi-interesting read, but doesn’t convince me in its current form.

Finally, the concept of „Non-invasive Measurement“ is laid out. The goal of this approach is to collect data without distracting the engineers. While such non-invasiveness is indeed desirable, the proposal seems overly complex to me. I mean, there are so many ways of analyzing process flows, code quality, efficiency, etc. Why do the authors describe a database schema for one concrete solution?

All in all, the book „Lean Software Development in Action“ didn’t convince me. Its best part is where lean and agile are described, and the book offers a few interesting new perspectives on them to an already somewhat informed reader. Those aspects I have mostly covered above. The part where the authors bring in their ideas for methodologies which augment existing known approaches is rather weak, probably because it’s about academic ideas with little (not none!) exposure to real project life.

Categories
Book Culture

Thoughts on „Lean-Agile Software Development“

Reading and summarizing books on lean software development, so you don’t have to. Part 1.

Besides agile philosophy, practices, processes & methods, lean is becoming an increasingly recognized topic in software development. At least I can say that about my peer group. After an initial training on the matter, in which I learned about the „general lean“ practices from the industrial production area, I had a lot of questions and doubts about its applicability to software development. Of course there are obvious connections and transfers which one could try, but I was wondering about existing experience, studies, research and best practices. So I checked out the available books and found three. Two of them I have already read, and today I want to start with the first (which I actually read second). Please note: I will not provide introductions to or details on either Lean or Agile, as there is a myriad of online resources available for that, and I assume my readers know at least the agile part very well. Also, as usual in my book reviews, I am less focused on how well-written a book is. My focus is on new thoughts, inspiring ideas, surprising perspectives and, generally speaking, everything which deserves an application in my professional life (and the ones around me).

„Lean-Agile Software Development – Achieving Enterprise Agility“, written by Alan Shalloway, Guy Beaver and James R. Trott, was published in 2010, which means that in the fast-paced software industry it’s already quite old. For this book this was not a downside, as I could compare their takes against the current state of the art.

In the foreword Alan Shalloway makes an interesting observation:

Too long, this industry has suffered from a seemingly endless swing of the pendulum from no process to too much process and then back to no process: from heavyweight methods focused on enterprise control to disciplined teams focused on the project at hand.

page xviii

I can confirm this at various scales. On a grand scale this has been true when you look at the history of software development, starting decades ago. Enterprise processes, which followed the wild west of the early days of computing, were replaced by agile practices. Moreover, even within an organization, down to project level and individuals, the continued conflict about the „right amount of process“ is probably the biggest philosophical debate in software development. Shalloway claims that lean principles can „guide us in this“ and „provides the way“. Let’s see.

On page xxxviii the authors summarize „core beliefs of Lean“, preceded by core beliefs of Agile and Waterfall. As all of those are not taken from „canonical“ sources, let me share the lean ones here, as they are a good first summary:

Even when applied to software development, Lean is not limited to software development teams alone. On page 7 a table lays out the contributions from all parties:

This is easier said than done. In many organizations both business and management are focused on pushing and tracking the delivery team, but spend too little time on their own contributions. Another noteworthy thing here is the notion of the „delivery team“. This is not a team supporting, testing, integrating and generally taking care of delivery; it is actually the software development team. Hence, this seems to be a synonym for more widely used terms like „feature team“. I like the term delivery team, and could imagine combining both into „feature and delivery team“. Each term focuses on one aspect, the former more on the product, the latter more on the activity. In modern software development, I think it’s crucial to combine both in one team. Diminishing either aspect will inevitably lead to suboptimal efficiency, because essential parts are outsourced to other teams.

Lean principles suggest focusing on shortening time-to-market by removing delays in the development process; using JIT methods to do this is more important than keeping everyone busy

page 8

This is a very valuable statement. Too often I see engineers getting dragged into task forces just because they are not overloaded at that moment. As a consequence, this leads to a culture in which everyone wants to be, or at least be perceived as, busy all of the time. Continuous busyness is not sustainable and leads to growing organizational and technical debt. The cited statement instead clarifies that a lean and efficient process is not the same as a process in which everyone is busy all of the time. Essentially, these are two different dimensions.

Eliminating waste is the primary guideline for the Lean practitioner. Waste is code that is more complex than it needs to be. Waste occurs when defects are created. Waste is non-value-added effort required to create a product. Wherever there is waste, the Lean practitioner looks to the system to see how to eliminate it because it is likely that an error will continue to repeat itself, in one form or another, until we fix the system that contributed to

page 10

While reading this, it occurs to me that everything that can be automated but is not is also waste. Manual execution is inherently more error-prone in any software process.

Developers tend to take one of two approaches when forced to handle some design issue on which they are unclear. One approach is to do the simplest thing possible without doing anything to handle future requirements. The other is to anticipate what may happen and build hooks into the system for those possibilities. Both of these approaches have different challenges. The first results in code that is hard to change. […] The second results in code that is more complex than necessary. […]

An alternative approach to both of these is called “Emergent Design.” Emergent Design in software incorporates three disciplines:

  • Using the thought process of design patterns to create application architectures that are resilient and flexible
  • Limiting the implementation of design patterns to only those features that are current
  • Writing automated acceptance- and unit-tests before writing code, both to improve the thought process and to create a test harness

Using design patterns makes the code easy to change. Limiting writing to what you currently need keeps code less complex. Automated testing both improves the design and makes it safe to change. These features of emergent design, taken together, allow you to defer the commitment of a particular implementation until you understand what you actually need to do.

Page 11f

Conflicts around these two approaches are indeed quite common. Both sides are typically able to throw business needs into the ring (pragmatism vs. sustainability). What is more, I often observe parties arguing the opposite of the position they took in the previous conflict. Hence, emergent design sounds like a promising middle ground. I already have some conflicts in mind where I may bring it to the table.

Table 1.2 gives a good translation of industrial production costs and risks to the software world, something my first training on lean was missing:

On assigning people to multiple projects at the same time, the authors cite an interesting study by Aral, Brynjolfsson and Van Alstyne. This study showed that the overall productivity of one person is reduced by 20% each for the second and the third parallel project. This is huge, especially considering that it is often the better engineers who are pulled or pushed into multiple projects and teams to rescue them. As a result, the best engineers' capacity is reduced and spread thin.
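To make the effect tangible, here is a small back-of-the-envelope sketch. It assumes that the 20% is a loss of total capacity (percentage points) for each additional parallel project, and that the remaining capacity is split evenly across projects. Both the function name and this reading of the number are my own assumptions, not taken from the book or the study.

```python
def effective_capacity(projects: int, loss_per_extra_project: float = 0.20) -> float:
    """Fraction of one person's capacity that remains productive.

    Assumption: every project beyond the first costs a flat 20 percentage
    points of total capacity. The cited figure only covers the second and
    third project, so larger values are an extrapolation.
    """
    extra_projects = max(projects - 1, 0)
    return max(1.0 - loss_per_extra_project * extra_projects, 0.0)


for n in (1, 2, 3):
    total = effective_capacity(n)
    # Assumption: the remaining capacity is shared evenly between projects.
    print(f"{n} project(s): {total:.0%} productive in total, {total / n:.0%} per project")
```

Under these assumptions, an engineer on three projects contributes only a fifth of a person to each of them, which illustrates why spreading the best people across many teams is so expensive.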

In the chapter about „Going beyond Scrum“, there is a good summary of Misunderstandings, Inaccurate Beliefs, and Limitations of Scrum:

Misunderstandings commonly held by new Scrum practitioners

  • There is no planning before starting your first Sprint.
  • There is no documentation in Scrum.
  • There is no architecture in Scrum.

Scrum beliefs we think are incorrect

  • Scrum succeeds largely because the people doing the work define how to do the work.
  • Teams need to be protected from management.
  • The product owner is the “one wring-able neck” for what the product should be.
  • When deciding what to build, start with stories: Release planning is a process of selecting stories to include in your release.
  • Teams should be comprised of generalists.
  • Inspect-and-adapt is sufficient.

Limitations of Scrum that must be transcended

  • Self-organizing teams, alone, will improve their processes beyond the team.
  • Every sprint needs to deliver value to the customer.
  • Never plan beyond the current sprint.
  • You can use Scrum-of-Scrums to coordinate interrelated teams working on different products.
  • You can use Scrum without automated acceptance testing or up-front unit tests

page 84

I will not comment on each point. I would confirm the first two sections entirely. The last section points at some aspects „missing“ from Scrum, but I do not think that, for example, the absence of test-driven development is a limitation. Scrum does not claim to describe every aspect of software development.

In general: tables. This book really contains some nice side-by-side comparisons in tabular form. Table 5.1 compares „Scrum and Lean Perspectives“:

The book is also strong in naming typical anti-patterns in agile execution, especially when those anti-patterns clash with a lean mindset.

Some common anti-patterns for Scrum teams are

  • Stories are not completed in an iteration.
  • Stories are too big.
  • Stories are not really prioritized.
  • Teams work on too many things at once.
  • Acceptance tests are not written before coding starts.
  • Quality Assurance/Testing is far behind the developers.

Here are questions we always try to use.

  • Does the team’s workload exceed its capacity?
  • When was the last time you checked your actual work process against the standard process?
  • When was the last time you changed the standard process?
  • Where are the delays in your process?
  • Is all of that WIP necessary?
  • How are you managing your WIP?
  • Are developers and testers in sync?
  • Does the storyboard really help the team keep to its work-flow?
  • Are resources properly associated with the open stories?
  • How much will limited resources affect the team’s work?
  • What resource constraints are you experiencing?
  • Can these constraints be resolved with cross-training or are they something to live with?
  • Does the storyboard reflect constraints and help the team manage them?
  • What needs to be more visible to management?
  • How will you manage your dependencies

page 95f

The authors are clearly not satisfied with the amount of guidance provided by Scrum, nor with the reality around it. In „Going beyond Scrum“, they present their own extended Scrum, called „Scrum#“, in two pages. They also introduce Kanban as a simpler framework. A key concept already mentioned in the last citation, and even more relevant for Kanban, is the „work in progress limit“. I have known WIP limits for some years, having learned about them from former colleagues. The relationship to lean, however, was new to me, and it makes total sense. Focus is soooo important, it cannot be overstated. In my own experience, I would say around 50% of all issues in software projects originate from a lack of focus and too many things going on in parallel. Finally, in their comparison of process frameworks, the authors do not forget about Extreme Programming.
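The idea of a WIP limit can be sketched in a few lines. This is a toy model of my own, not something from the book: the class names, the limit of two and the story names are purely illustrative assumptions.

```python
class WipLimitExceeded(Exception):
    """Raised when pulling an item would exceed the column's WIP limit."""


class KanbanColumn:
    """A single board column that enforces a work-in-progress limit."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.items = []

    def pull(self, item):
        # New work may only be started while the column is below its limit.
        if len(self.items) >= self.wip_limit:
            raise WipLimitExceeded(
                f"'{self.name}' is at its WIP limit of {self.wip_limit}; "
                f"finish something before starting '{item}'"
            )
        self.items.append(item)

    def finish(self, item):
        # Finishing an item frees capacity for the next one.
        self.items.remove(item)


in_progress = KanbanColumn("In progress", wip_limit=2)
in_progress.pull("story A")
in_progress.pull("story B")
try:
    in_progress.pull("story C")  # rejected: two items are already in progress
except WipLimitExceeded as err:
    print(err)

in_progress.finish("story A")
in_progress.pull("story C")  # accepted now that capacity has been freed
```

The point of the sketch is the refusal: the board does not merely display the overload, it forces the team to finish something before starting the next item.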

Before you write a line of code, set up the following:

  • The product
  • The team
  • The environment
  • The architecture

page 109

This sounds simple, yet I have never experienced a software project in which more than one of those points was clear to even a basic degree. Too often, organisations spawn projects with a very fuzzy project idea, an undefined team, an unknown environment and a notion of „architecture will be clarified along the way“. The book goes on to provide guidance on how to set up each point before the first development iterations start. On page 140 the authors present a template for drafting a product vision statement.

The book also spends a chapter on „The Role of Quality Assurance in Lean-Agile Software Development“. It has, in essence, one key recommendation, which is Test-Driven Development (TDD). The authors claim „The role of testers must be one of preventing defects, not finding them“. While TDD has its merits, I think this statement is too simple. It is a bit far from reality to expect testers to have tests ready before every implementation, especially in projects in which even the technological basics are not remotely clear. On the other hand, I am not saying that TDD should not be used wherever it can be applied.

If the customer cannot or will not confirm that you have delivered what they want, you should simply state that you believe that the customer does not value the feature; that it is not a high priority. If it were valuable, they would make the effort to specify the tests. Moreover, you should tell your management team that you recommend not building the functionality. If you are required to build it anyway, go ahead, but know that it could well turn out to be a waste of time.

page 164

This is quite a radical take, and it is a bit hard to match with a typical setup in which customers do not care about the test cases. However, I can imagine that pushing for a project setup in which tests have such a crucial position that both customers and developers give them utmost priority can only benefit the efficiency of the resulting project. The authors recommend always keeping the question „How will I know I’ve done that?“ in mind, which according to them is an ultimate tool to avoid waste.

After focusing on the team scale, the book goes on to widen the scope to full enterprises. Here, too, the authors provide some anti-patterns (excerpts):

  • Teams are not well formed.
  • Large batches of unprioritized requirements are pushed through the organization.
  • There is no mechanism to limit work to resource capacity.
  • Program managers and business sponsors compete for resources rather than working together to maximize the return on them.
  • Automated acceptance testing is not being done. Test-driven development also is not being done. Testing is initiated too late in the development cycle.
  • Code quality is left up to programmers’ personal beliefs.
  • Finding and removing the root causes of problems is not pursued aggressively. Bugs are tolerated as a way of life in the software world. In fact, many organizations utilize bug tracking as status for release readiness.
  • Continuous process improvement is not practiced or valued. Most companies are so busy trying to fix the latest crisis that there is no time to focus on process improvement to avoid causing the next one.

Page 171f

This is a good list, though of course it offers examples rather than a complete catalogue. The second-to-last item, on bug tracking, is reiterated on page 182 when the authors state „For example, it is common for management to track the number of unfixed bugs. It seems like a natural approach to assess how a team is doing. Lean-Agile thinking uses a different approach: Instead of worrying about fixing the bugs, we should concern ourselves with what is causing them.“

In my opinion, all or most of the above points originate from a lack of discipline in the management team, leading to evasive activities with the above symptoms. For example, it is much simpler to track bug lists than to solve root causes in the organization. It is mentally simpler to run from fire to fire than to reflect on fundamental improvements to the process. It is simpler to request new reports in every escalation meeting than to use existing ones continuously to create a sustainable frame for the development team. Who is to blame? I think it's a management culture built on calendars filled 100% with meetings, which leave almost no time to reflect, and on short-sighted office politics. It speaks well of the authors that Alan Shalloway writes:

Some people are natural managers; I am not one of them. Historically, I have always micromanaged. Because I am good in a crisis (often creating and then solving them), when one occurred I would tend to jump in and tell my team how fix it. I knew that this behavior was inhibiting the team’s growth, so I tried delegating—letting the team figure out how to do things on their own—often with very poor results.

I was really abdicating via delegation. I needed to find a way to let the team figure out the solution but remain involved enough to ensure that it would be a good one. Fortunately, Lean management provides a way to do this. With visual controls, I can see the team’s process—I can see how the team is doing at any time—and I can see the team’s outcomes.

If the team gets into trouble, I can actively coach them to improve their results without telling them what to do. Lean gives me a way to become a better manager without resorting to old habits.

page 190

We touched on software architecture earlier. In a separate chapter the authors dive deeper into the question of how to find the sweet spot between too much and too little architectural work.

Build only what you need at the moment and build it in a way that allows for it to be changed readily as you discover new issues.

page 204

and

The purpose of software design is not to build a framework within which all things can fit nicely. It is to define the relationships between the major concepts of the system so that when they change or new requirements emerge, the impact of the changes required is limited to local modifications

page 208

Near the end, the book turns to the origins of lean at Toyota. Interestingly, the authors highlight that

One of the brilliant insights at Toyota was that Lean principles are implemented differently in manufacturing than they are in product development.

This gave rise to another great example of Lean: the Toyota Product Development System, which is a better example for us in software development.

page 215

Now this is quite a revelation. So far, all my lean trainings were about Toyota’s Production System, not the Product Development System. This makes me wonder if our sources are the right ones.

With that, let me close this review. This book was a good read: it explained the existing frameworks, showed their flaws and issues both in theory and in practice, and made concrete recommendations. All in all, I recommend this book if you want to read how agile and lean can be combined in state-of-the-art software development processes.

Categories
Coding Culture

Technology Radar #27: Automotive SW perspective

As written before, I really like the regular updates provided by Thoughtworks in their Technology Radar. My focus is on the applicability of techniques, tools, platforms and languages to automotive software, with a further focus on embedded in-car software. Hence, I am ignoring the pure web-development and machine learning/data analytics material, which usually makes up a huge portion of the whole report. Recently, volume 27 was published. Let’s have a look!

As usual, let's start with a dive into the „Techniques“ sector and its „Adopt“ ring. The first entry we find is „path-to-production mapping“. It's as familiar as it sounds: many of my readers will have heard about Value Stream Mapping or similar process mapping approaches. Thoughtworks themselves state that this one is so obvious, yet they had not covered it in their reports so far. Sometimes the simple ideas are the powerful ones. I can confirm from my own experience that a value stream map laying out all the process steps and inefficiencies in an easy-to-digest manner is a good eye-opener and can help to focus on the real problems instead of beating around the bush.

Something very interesting for all the operating system and platform plans in automotive is the notion of an „incremental developer platform“. The underlying observation of „teams shooting for too much of that platform vision too fast“ is something I can confirm from my own experience. Engineers love to develop sustainable platforms but underestimate the effort required, and management, with its impatience, further undermines platform plans. Following the concept of a „thinnest viable platform“ from the book Team Topologies makes sense here: not shooting too far in the first step, but treating a platform product as an incremental endeavour.

Another one that strikes me is „observability in CI/CD pipelines“. With the increasing number and complexity of CI/CD pipelines in one project, let alone a whole organization, many operational questions arise. And operations always benefit from clear data and a good overview. Recently, a then-student, now colleague, and I designed and built a tool that enables CI/CD monitoring not just for a single repo but for a graph of repos. I hope we can publish/open up this project soon.

In the platforms sector, Backstage entered the „Adopt“ ring. The project is under active development and could indeed be an interesting tool for building an internal software engineering community.

Looking at the tools sector, I liked Hadolint for finding common issues in Dockerfiles.