Kategorien
Book Culture

Thoughts on “Implementing Lean Software Development”

Reading and summarizing books on lean software development, so you dont have to. Part 3 (see Part 1 and Part 2).

“Implementing Lean Software Development” written by Mary and Tom Poppendieck and published 2007 at Addison-Wesley. The Poppendiecks are quite famous in the lean-agile software development community, as they published the constitutive book “Lean Software Development: An Agile Toolkit” in 2003, the first (recognized) book about bringing the lean principles to the software development space. The book reviewed here is a successor book aimed at delivering more practical advice. As in the last parts, my review will not focus on re-iterating lean and agile fundamentals, but rather focus on novelty aspects, ideas, and noteworthy pieces.

In the foreword, Jeff Sutherland (co-founder of the Scrum framework) introduces the Japanese terms of Muri (properly loading a system), Mura (never stressing a person, system or process) and Muda (waste):

Yet many managers want to load developers at 110 percent. They desperately want to create a greater sense of “urgency” so developers will “work harder.” They want to micromanage teams, which stifles
self-organization. These ill-conceived notions often introduce wait time, churn, death marches, burnout, and failed projects.
When I ask technical managers whether they load the CPU on their laptop to 110 percent they laugh and say, “Of course not. My computer would stop running!” Yet by overloading teams, projects are often late, software is brittle and hard to maintain, and things gradually get worse, not better.

page xix

In their historical review the authors bring a very interesting statistics which should resonate with many of my peers:

Both Toyodas had brilliantly perceived that the game to be played was
not economies of scale, but conquering complexity. Economies of scale will reduce costs about 15 percent to 25 percent per unit when volume doubles. But costs go up by 20 percent to 35 percent every time variety doubles. Just-in-Time flow drives out major contributors to the cost of variety. In fact, it is the only industrial model we have that effectively manages complexity.

page 5

As evidence, two papers are given: “Time -The Next Source of Competitive Advantage” by George Stalk and “Lean or Sigma?” by Freddy and Michael Balle. Managers and engineers increasingly become aware about the not-so-visible cost of complexity, typically by experiencing project failure or long-term product degradation.

For the aspect of inventory, the authors provide a quite good methaphor:

Inventory is the water level in a stream, and when the water level is high, a lot of big rocks lurking under the water are hidden. If you lower the water level, the big rocks begin to surface. At that point, you have to clear the rock out of the way, or your boat will crash into them. As the big rocks are removed, you can lower inventory level some more, find more rocks, clear them out of the stream, and keep on going until there are just pebbles left.

page 8

That adoption of lean practices and mindset is not straightforward and many organizations struggle or fail to do so is explained by the authors by pointing at a “cherrypicking” approach. Hence, only some activities of the lean domain are adopted in isolation, like just-in-time or stop-the-line. Instead, they a classic:

The truly lean plant […] transfers the maximum number of tasks and responsibilities to those workers actually adding value to the car on the line, and it has in place a system for detecting defects that quickly
traces every problem, once discovered, to its ultimate source.

Womack, Jones, Roos: The machine that changed the world, page 99

I think this cannot be underestimated. To seldom I have seen organizations and management really focussing on the “value creators” and the impediments those are facing.

In earlier blog posts I already wrote about the differences and similarities in the lean manufacturing and lean development. The Poppendiecks provide a table putting both side-by-side (page 14):

Later, in a footnote, the authors refer to a paper by Kajko-Mattsson et al. on the cost of software maintenance. The paper’s sources vary a lot, however its obvious that considering a typical big software project it becomes clear that this ratio quickly translates to millions of Euro/Dollar.

The published numbers point out that maintenance costs between 40% to 90% […]. There are very few publications reporting on the cost of each individual maintenance category. The reported ones are the
following: (1) corrective maintenance – 16-22% […] (2) perfective maintenance – 55% […], and (3) adaptive maintenance – 25% […].

Kajko-Mattsson et al: Taxonomy of problem management activities, page 1

On the lean principle of waste, the Poppendiecks make a simple but revelating statement:

To eliminate waste, you first have to recognize it. Since waste is anything
that does not add value, the first step to eliminating waste is to develop a keen sense of what value really is. There is no substitute for developing a deep understanding of what customers will actually value once they start using the software. In our industry, value has a habit of changing because, quite often, customers don’t really know what they want. In addition, once they see new software in action, their idea of what they want will invariably shift. Nevertheless, great software development organizations develop a deep sense of customer value and continually delight their customers.

page 23

Too often have I experienced software development projects who dont know what their product and the value they provide actually is. Of course, everyone has a vague feeling about what it could be, but putting it in clear words is seldom attempted and easily ends in conflict (a conflict which can be constructive if facilitated well).

On the second principle “Build Quality In”, there are some interesting distinctions on defects and the relation to “inspection”:

According to Shigeo Shingo, there are two kinds of inspection: inspection after defects occur and inspection to prevent defects.10 If you really want quality, you don’t inspect after the fact, you control conditions so as not to allow defects in the first place. If this is not possible, then you inspect the product after each small step, so that defects are caught immediately after they occur. When a defect is found, you stop-the-line, find its cause, and fix it immediately.
Defect tracking systems are queues of partially done work, queues of rework if you will. Too often we think that just because a defect is in a queue, it’s OK, we won’t lose track of it. But in the lean paradigm, queues are collection points for waste. The goal is to have no defects in the queue, in fact, the ultimate goal is to eliminate the defect tracking queue altogether. If you find this impossible to imagine, consider Nancy Van Schooenderwoert’s experience on a three-year project that developed complex and often-changing embedded software. Over
the three-year period there were a total of 51 defects after unit testing with a maximum of two defects open at once. Who needs a defect tracking system for two defects?

page 27

The authors are citing two papers by Nancy Van Schooenderwoert (“Taming the Embedded Tiger – Agile Test Techniques for Embedded
Software
” and “Embedded Agile Project by the Numbers With Newbies“). This resonates well with me, because accumulating too many defect (tickets) is very expensive waste. Its a kind of inventory with the worst properties. To break out of this is not straightforward, I have attempted and failed multiple times to establish a “zero defect policy” (i.e. as long as there is a defect no further feature development happens). In that context let me at two more quotes from the book:

The job of tests, and the people that develop and runs tests, is to prevent defects, not to find them.

page 28

“Do it right the first time,” has been interpreted to mean that once code is written, it should never have to be changed. This interpretation encourages developers to use some of the worst known practices for the design and development of complex systems. It is a dangerous myth to think that software should not have to be changed once it is written.

page 29

On the fifth principle of “Deliver Fast” a very important statement is made:

Caution: Don’t equate high speed with hacking. They are worlds apart. A fast-moving development team must have excellent reflexes and a disciplined, stop-the-line culture. The reason for this is clear: You can’t sustain high speed unless you build quality in.

page 35

Very often I observe a dire need for speed. Of course everyone wants to be faster in the software industry. Competition doesnt sleep. However, similar to unclear definitions of value and products, I have barely ever seen a clear definition of speed in a software project. Or, probably more correct: there were competing definitions of speed on people’s and especially decision maker’s minds. Its a huge difference to beat your team to “push out features now” and grind to a halt when quality activities are started, or to maintain a sustainable pace:

When you measure cycle time, you should not measure the shortest time through the system. It is a bad idea to measure how good you are at expediting, because in a lean environment, expediting should be neither necessary nor acceptable. The question is not how fast can you deliver, but how fast do you repeatedly and reliably deliver a new capability or respond to a customer request.

page 238

The Poppendiecks are summarizing those effects in two vicious cycles (page 38):

For all the lean principles, the Poppendiecks also discuss myths originating from mis-interpreting the principles or applying them wrongly. One which caught my attention was the myth “Optimize by decomposition”. Its about the proliferation of metrics once an organization starts to apply the benefits of visual management. All of a sudden, there are tens if not hundreds of dashboards, graphs, KPIs, and such flying around. Their recommendation:

When a measurement system has too many measurements the real goal of the effort gets lost among too many surrogates, and there is no guidance for making tradeoffs among them. The solution is to “Measure UP” that is, raise the measurement one level and decrease the number of measurements. Find a higher-level measurement that will drive the right results for the lower level metrics and establish a basis for making trade-offs.

page 40

Speaking about myths, they encourage readers to check which myths apply to their situation – certainly a worthwile exercise also for you 🙂

Early specification reduces waste
The job of testing is to find defects
Predictions create predictability
Planning is commitment
Haste makes waste
There is one best way
Optimize by decomposition

page 42

Coming back to the notion of value, the authors are asking the fundamental question how great products are conceived and developed. They write:

In 1991, Clark and Fujimoto’s book Product Development Performance presented strong evidence that great products are the result of excellent, detailed information flow. The customers’ perception of the product is determined by the quality of the flow of information between the marketplace and the development team. The technical integrity of the product is determined by the quality of the information flow among upstream and downstream technical team members. There are two steps you can take to facilitate this information flow: 1) provide leadership, and 2) empower a complete team.

page 52

The book has an extensive chapter on waste with many insightful aspects. I dont want to repeat all of them, and instead just provide some examples. For example I found this statement on the relationship of automation and waste/complexity very inspiring.

We are not helping our customers if we simply automate a complex or messy process; we would simply be encasing a process filled with waste in a straight jacket of software complexity. Any process that is a candidate for automation should first be clarified and simplified, possibly even removing existing automation. Only then can the process be clearly understood and the leverage points for effective automation identified.

page 72

In my current position, automation is a key activity, and we try to automate everything in an endeavour to increase speed, quality and convenience. The quote points out, that automation can hide or defer complexity. I can confirm this. Even though my team automated the complexity of product variants in the build process, our customers (e.g. manual testers) dont have a chance to test all the build we produce. Hence, even made with best intentions, our automation is overloading the whole.

Another good comparison between traditional manufacturing and software development is the following table, putting the seven waste equivalents side-by-side (page 74):

On architectural foresight, I like the following statement:

Creating an architectural capability to add features later rather than sooner is good. Extracting a reusable services “framework” for the enterprise has often proven to be a good idea. Creating a speculative application framework that can be configured to do just about anything has a track record of failure. Understand the difference.

page 76

While discussing Value Streams, the authors dig into effectiveness and efficiency. They are of the opinion that

chasing the phantom of full utilization creates long queues that take far more effort to maintain than they are worth-and actually decreases effective utilization.

page 88

This opinion is not speculation, they provide a good analogy to road traffic and computer utilization:

High utilization is another thing that makes systems unstable. This is obvious to anyone who has ever been caught in a traffic jam. Once the utilization of the road goes above about 80 percent, the speed of the traffic starts to slow down. Add a few more cars and pretty soon you are moving at a crawl. When operations managers see their servers running at 80 percent capacity at peak times, they know that response time is beginning to suffer, and they quickly get more servers. […]

Most operations managers would get fired for trying to get maximum utilization out of each server, because it’s common knowledge that high utilization slows servers to a crawl. Why is it that when development managers see a report saying that 90 percent of their available hours were used last month, their reaction is, “Oh look! We have time for another project!”

pages 101f

I think in daily work, management typically does not pay enough attention to those basics. It is not that this is not known that too high utilization of resouces is bad, quite the opposite is the case in my experience. However, the root causes and the remedies are often not considered. Instead, there is a sentiment of capitulation: “Yes I know our team is stressed and overloaded, but we have to get faster nevertheless.”

In order to reduce cycle times, the authors refer to queuing theory, which provides several approaches:

Even out the arrival of work

Minimize the number of things in process

Minimize the size of things in process

Establish a regular cadence

Limit work to capacity

Use pull scheduling

page 103

In the chapter “People”, there is a lot of reference to William Edwards Deming, a pioneer of quality management. Its an iron of history, that this American actually was teaching the fundamentals of what leater became Lean in post-war Japan, while he was “discovered” only in the 1980s by the US (industrial) public. Deming formulated a what he called “System of Profound Knowledge”:

  1. Appreciation of a System: A business is a system. Action in one part of the system will have effects in the other parts. We often call these “unintended consequences.” By learning about systems we can better avoid these unintended consequences and optimize the whole system.
  2. Knowledge of Variation: One goal of quality is to reduce variation. Managers who do not understand variation frequently increase variation by their actions. Critical to this is understanding the two types of variation — Common cause which is variation from the system and Special cause which variation from outside the system
  3. Theory of Knowledge: There is no knowledge without theory. Understanding the difference between theory and experience prevents shallow change. Theory requires prediction, not just explanation. While you can never prove that a theory is right, there must exist the possibility of proving it wrong by testing its predictions.
  4. Understanding of Psychology: To understand the interaction between work systems and people, leaders must seek to answer questions such as: How do people learn? How do people relate to change? What motivates people?
https://medium.com/10x-curiosity/system-of-profound-knowledge-ce8cd368ca62

When pursuing change and transformation, it is very important to take the staff on board. This is easier said than done, because the employees have a very fine sense. They realize very quickly, if for example a certain change in mindset is requested from them, but not exercised by their supervisors. In engineering projects, the demands and expectations of decision makers are often antagonistic to their communicated strategies and visions. Just consider if in your organization “quality” is an essential part of your long-term goals, and totally overriden by daily task force death marches.

The challenge to achieve quality is handled in another dedicated chapter. The authors point out the importance of “superb, detailed discipline” to achieve high quality. Here come the famous “5 S’s” into play. The book’s authors transfer them also to the software space:

Sort (Seiri): Sort through the stuff on the team workstations and servers, and find the old versions of software and old files and reports that will never be used any more. Back them up if you must, then delete them.

Systematize (Seiton): Desktop layouts and file structures are important. They should be crafted so that things are logically organized and easy to find. Any workspace that is used by more than one person should conform to a common team layout so people can find what they need every place they log in.

Shine (Seiso): Whew, that was a lot of work. Time to throw out the pop cans and coffee cups, clean the fingerprints off the monitor screens, and pick up all that paper. Clean up the whiteboards after taking pictures of the important designs that are sketched there.

Standardize (Seiketsu): Put some automation and standards in place to make sure that every workstation always has the latest version of the tools, backups occur regularly, and miscellaneous junk doesn’t accumulate.

Sustain (Shitsuke): Now you just have to keep up the discipline.

page 191

I really enjoyed reading this book and can absolutely recommend reading it. It contains a lot of gems, and is probably one of those book you want to read every other year again to re-discover aspects and connect them to new experience.

Kategorien
Book Culture

Thoughts on “Lean Software Development in Action”

Reading and summarizing books on lean software development, so you dont have to. Part 2 (see Part 1).

“Lean Software Development in Action” written by Andrea Janes and Giancarlo Succi and published 2014 at Springer. The authors are scientists at the University of Bolzano, and the book clearly has a more scientific approach than the last one.

As the last – and probably every book – on the matter of lean, agile and software engineering this book starts with an introduction on each of those aspects. Again, I will not reiterate on what lean and agile are, but focus on interesting observations and perspectives exposing new angles on “known stuff”.

The first noteworthy piece is about “tame and wicked projects”. This section is referring to work by Rittel and Webber who came up with a discution between tame and wicked problems (see also). Poppendieck and Poppendieck extended this to projects, and in this book they are described and identified by following ten points:

  1. Wicked projects cannot provide a definitive, analytical formulation of the problem they target. Formulating the project and the solution is essentially the same task. Each time you attempt to create a solution, you get a new, hopefully better, understanding of the project.
  2. Wicked projects have no a stopping rule telling when the problem they target has been solved. Since you cannot define the problem, it is almost impossible to tell when it has been resolved. The problem-solving process proceeds iteratively and ends when resources are depleted and/or stakeholders lose interest in a further refinement of the currently proposed solution.
  3. Solutions to problems in wicked projects are not true or false, but good or bad. Since there are no unambiguous criteria for deciding if the project is resolved, getting all stakeholders to agree that a resolution is “good enough” can be a challenge.
  4. There is no immediate or ultimate test of a solution to the targeted problem in a wicked project. Solutions to such projects generate waves of consequences, and it is impossible to know how these waves will eventually play out.
  5. Each solution to the problem targeted by a wicked project has irreversible consequences. Care must be placed in managing assumed solutions. Once the website is published or the new customer service package goes live, you cannot take back what was online or revert to the former customer database.
  6. Wicked projects do not have a well-described, widely accepted set of potential solutions. The various stakeholders may have differing views of what are acceptable solutions. It is a matter of judgment as to when enough potential solutions have emerged and which should be pursued.
  7. Each wicked project is essentially unique. There are no well-defined “classes” of solutions that can be applied to a specific case. It is not easy to find analogous projects, previously solved and well documented, so that their solution could be duplicated.
  8. The problem targeted by a wicked project can be considered a symptom of another problem. A wicked project deals with a set of interlocking issues and constraints that change over time, embedded in a dynamic and evolving context.
  9. The causes of a problem targeted by a wicked project can be explained in several ways. There are several stakeholders who have various and changing ideas about what is the project, its nature, its causes, and the associated solution.
  10. The project must not go wrong. Mistake is not an option here. Despite the inability to express the project solution analytically, it is not allowed to fail the project.
page 9

These are some interesting observations which resonate with my experience. One might say “many of those points do not apply to our project as we have a quite clear understanding of our product delivery (like an ECU for a car, providing a certain set of functions/features for the customers)”. However I think its not that simple. Many projects of non-trivial complexity I have been involved in do not only have the goal of releasing a product to the market, but there are other, interlinked objectives which give the project in total a semi- or non-defined goal. Besides delivering a good, innnovative product these objectives may consist of e.g. financial goals (better return on investment), efficiency gains, usage of new technologies and approaches and in- or outsourcing of activities. While project I know typically start with those defined on a high-level and people are onboarded or recruited referring to those motivating goals, as soon as the project enters the death march the project’s goals are gradually getting more fuzzy, unbalanced and volatile. I don’t know silver bullets to such situations (yet), but the notion of wicked projects resonates with those and other observations. However, isn’t awareness the first step to improvement?

[…] the quality of the final product is seen as a result of the process pro-
ducing them. This assumption creates a high attention for a high-quality production process according to the credo “prevention is better than healing”

page 41

This lean wisdom sounds trivial, however, I have never seen it realized. I have yet to understand why so many managers ignore the efficiency and effectiveness gains by a proper process and instead decide to continously apply brute force which cost them more money, time, energy, motivation and subsequently of course quality. And, of course, by a “proper process” I dont mean an perfect considers-everything-process which is both unreachable and undesirable.

“The role of standardization”, page 43

I like this illustration showing standardization as a wheel chock for the plan-do-study/check-act cycle. Similar to the paragraph above it strikes me how many projects and managers re-invent the wheel (haha) and then start the whole process from the bottom. Of course no one wants and says this, but this is what often happens.

The result is a development approach in which requirements are not “refined down to an implementation,” i.e., taken as the starting point to develop an implementation that represents those requirements, but where the business objectives are mapped to the capabilities of the technical platform to “equally consider and adjust business goals and technical aspects to come to an optimal solution corresponding to the current situation

page 63

This is another insightful depiction. It provided me a new perspective on requirements, as they help to close the gap between business goals and technical capabilities. This can be a good approach helping in situations in which a project is lacking a good “feeling” on the right amount of requirements, between over-specification and under-specification.

page 80

Referring to studies conducted by Herzberg this shows how motivators and hygiene factors influence the motivation of the staff, and how agile methods very well support/complement those.

Later the authors come to write about the “dark side of agile”, on which they also published a paper. As observed by others before, agile statements can be easily twisted and thwarted in good or bad faith to yield an abomination which leads to the opposite and extreme positions of the original intention. Citing Rakitin’s old paper the agile manifesto can be translated as

  • Individuals and interactions over processes and tools: “Talking to people instead of using a process gives us the freedom to do whatever we want.”
  • Working software over comprehensive documentation: “We want to spend all our time coding. Remember, real programmers don’t write documentation.”
  • Customer collaboration over contract negotiation: “Haggling over the details is merely a distraction from the real work of coding. We’ll work out the details once we deliver something.”
  • Responding to change over following a plan: “Following a plan implies we have to think about the problem and how we might actually solve it. Why would we want to do that when we could be coding?”
page 111

The above is one way to twist the agile manifesto in favor of what the authors a few paragraphs later call a “cowboy coder”. This reminds of the “Programming, Motherfucker” webpage (thanks, Kris). While such sentiment exists, I cant often blame the engineers on mocking the agile manifesto and similar approaches that way. Very often, such engineer’s reactions are preceeded by even more unfaithful perversions pushed by all sorts of management. Just to bring one example to the table: Who has not witnessed a manager throwing new (vague) requirements at the development team every other week claiming this is what agile is about and, of course, everyone has to be faster in reacting. Because agile is faster. I could go on.

In chapter 6 the authors start to synthesise and bring lean and software development together. They start with citing the seminal book of Poppendieck and Poppendieck “Implementing Lean Software Development From Concept to Cash”. I couldnt read it yet as it wasnt available in print when I tried to get a hold of it. So they provide 7 principles for lean software development:

  1. Eliminate waste;
  2. Build quality—we used the terms “autonomation” and “standardization”;
  3. Create knowledge;
  4. Defer commitment—we used the term “just-in-time”;
  5. Deliver fast—get frequent feedback from the customer and increase learning through frequent deployments;
  6. Respect people—we used the term “worker involvement”;
  7. Optimize the whole—we used the term “constant improvement.”
page 131

Nowadays, those principles may sound obvious. However, the Poppendieck book was published in 2006 and I think at that time many if not all of those principles where both not clear nor any best practices and tooling was available back then to realize them.

In a break-out box a comparison between lean and agile is given

  • Agile Methods aim to achieve Agility, i.e., the ability to adapt to the needs of the stakeholders.
  • Lean production aims to achieve efficiency, i.e., the ability to produce what the stakeholders need with the least amount of resources possible.
page 144

After this section which gives some more references to earlier work the book enters its less interesting but extensive part. Janes and Succi present three methods which shall support lean software development. They all have their merits, but I have to admit I dont catch fire for any of them.

Let me sketch those methods in short: First they introduce the “Goal Question Metric (plus)”, short GQM+. GQM+ is based on GQM, which is a methodology to derive crisp business goals. While I find some of the leading questions worthwhile, the overall concept strikes me as overly complex and hard to grasp.

After this, the authors present the “Experience Factory”. This is essentially an extension of the classic plan-do-study/check-act cycle with additional steps and a “sub-cycle”. Its a semi-interesting read, but doesnt convince me in its current form.

Finally, the concept of “Non-invasive Measurement” is laid out. The goal of this approach is to collect data without distracting the engineers. While such non-invasiveness is indeed desirable, the proposal seems overly complex to me. I mean, there are so many ways of analyzing process flows, code quality, efficiency, etc. Why do the authors describe a database scheme for a concrete solution.

All-in-all the book “Lean Software Development in Action” didn’t convince me. Its best part is where lean and agile are described, and the book offers a few interesting new perspectives on them to an already somewhat informed reader. Those aspects I have mostly covered above. The part where the authors bring in their ideas for methodologies which augment existing known approaches is rather weak, probably because its about academic ideas with little (not none!) exposure to real project life.

Kategorien
Book Culture

Thoughts on “Lean-Agile Software Development”

Reading and summarizing books on lean software development, so you dont have to. Part 1.

Besides agile philosophy, practices, processes & methods, lean becomes an increasingly recognized topic around software development. At least I can say that about my peer group. After an initial training on the matter in which I learned about the “general lean” practices from the industrial production area, I had a lot of questions and doubts about its applicability in software development. Of course there are obvious connections and transfers which one could try, but I was wondering about existing experience, studies, research and best practices. So I was checking out the available books and found three. Two of them I already read, and today I want to start with the first (which I actually read second). Please note: I will not provide introductions and details on neither Lean nor Agile, as for such there is a myriad of online resources available, and I assume my readers know at least the agile part very well. Also, as usual in my book reviews, I am less focused on how well-written a book is. My focus is on new thoughts, inspiring ideas, surprising perspectives and generally speaking everything which deserves an application in my professional life (and the ones around me).

“Lean-Agile Software Development – Achieveing Enterprise Agility” written by Alan Shalloway, Guy Beaver and James R. Trott was published in 2010, which means that in the fast-paced software industry its already quite old. For this book this was not a downside, as I could compare their takes against current state-of-the-art.

In the foreword Alan Shalloway makes an interesting observation:

Too long, this industry has suffered from a seemingly endless swing of
the pendulum from no process to too much process and then back to no process: from heavyweight methods focused on enterprise control to disciplined teams focused on the project at hand.

page xviii

I can confirm this from various scales. On a grand scale this has been true when you look at the software development history, starting decades ago. Enterprise processes, which followed the wild-west of the early days of computing, were replaced by agile practices. Even more, even within an organization down to project-level and individuals, the continued conflict about the “right amount of process” is probably the biggest philosophical debate around in software development. Shalloway claims that lean principles can “guide us in this” and “provides the way”. Lets see.

On page xxxviii the authors summarize “core beliefs of Lean”, preceeded by core beliefs of Agile and Waterfall. As all of those are not taken from “canonical” sources, let me share the lean ones here, as they are a first cood summary:

Even when applied to software development, Lean is not limited to software development teams alone. On page 7 a tables lines out the contributions from all parties:

This is easier said than done. In many organizations both business and management are focused on pushing and tracking the delivery team, but spend too less time on their contributions. Another thing noteworthy here is the notion of the “delivery team”. This is not a team supporting, testing, integrating and generally taking care of delivery, no this is actually the software development team. Hence, this seems a synonym to more widely used terms like “feature teams”. I like the term delivery team, and could think about combinding both in “feature and delivery team”. Each term focuses on one aspect, the former more about the product, the latter more about the activity. In modern software development, I think its crucial to combine both in one team. Diminishing on of both aspects will inevitably lead to a suboptimal efficiency because essential parts are outsourced to other teams.

Lean principles suggest focusing on shortening time-to-market by removing delays in the development process; using JIT methods to do this is more important than keeping everyone busy

page 8

This is very valuable statement. Too often I see engineers getting dragged into task forces just because in the moment they are not overloaded. As a consequence, this leads to a culture in which everyone wants to be perceived or actually be busy all of the time. Continuous busy-ness is not sustainable and leads to growing organizational and technical debt. The cited statement instead clarifies that a lean and efficient process doesnt correspond to a process in which everyone is busy all of the time. Essentially we are talking about different dimensions.

Eliminating waste is the primary guideline for the Lean practitioner. Waste is code that is more complex than it needs to be. Waste occurs when defects are created. Waste is non-value-added effort required to create a product. Wherever there is waste, the Lean practitioner looks to the system to see how to eliminate it because it is likely that an error will continue to repeat itself, in one form or another, until we fix the system that contributed to

page 10

While reading this, it comes to my mind that everything which is not automated which can be automated is also a waste. Manual execution is inherently more error-prone in any software process.

Developers tend to take one of two approaches when forced to handle some design issue on which they are unclear. One approach is to do the simplest thing possible without doing anything to handle future requirements. The other is to anticipate what may happen and build hooks into the system for those possibilities. Both of these approaches have different challenges. The first results in code that is hard to change. […] The second results in code that is more complex than necessary. […]

An alternative approach to both of these is called “Emergent Design.” Emergent Design in software incorporates three disciplines:

  • Using the thought process of design patterns to create application architectures that are resilient and flexible
  • Limiting the implementation of design patterns to only those features that are current
  • Writing automated acceptance- and unit-tests before writing code, both to improve the thought process and to create a test harness

Using design patterns makes the code easy to change. Limiting writing to what you currently need keeps code less complex. Automated testing both improves the design and makes it safe to change. These features of emergent design, taken together, allow you to defer the commitment of a particular implementation until you understand what you actually need to do.

Page 11f

Conflicts around the aforementioned two approaches are indeed quite common. Both sides typically are able to throw business needs into the ring (pragmatism vs. sustainability). Even more, I often observe conflicted parties which did take the opposite position in the last conflict. Hence, emergent design sounds like a promising middle ground. I already have ideas in which conflicts I may bring it to the table.

Table 1.2 lists a good transfer of the industrial production costs and risks to the software world, something my first training on lean was missing out on:

On assigning persons to multiple projects at the same time, the authors cite an interesting study by Aral, Brynjolfsson and Van Alstyne. This study showed that the overall productivity of on person is reduced by 20% for the second and third parallel project, each. This is huge, also considering that often the better engineers are pulled/pushed into multiple projects/teams to rescue them. As a result, the best engineer’s capacity is reduced and thinned.

In the chapter about “Going beyond Scrum”, there is a good summary of Misunderstandings, Inaccurate Beliefs, and Limitations of Scrum:

Misunderstandings commonly held by new Scrum practitioners

  • There is no planning before starting your first Sprint.
  • There is no documentation in Scrum.
  • There is no architecture in Scrum.

Scrum beliefs we think are incorrect

  • Scrum succeeds largely because the people doing the work define how to do the work.
  • Teams need to be protected from management.
  • The product owner is the “one wring-able neck” for what the product should be.
  • When deciding what to build, start with stories: Release planning is a process of selecting stories to include in your release.
  • Teams should be comprised of generalists.
  • Inspect-and-adapt is sufficient.

Limitations of Scrum that must be transcended

  • Self-organizing teams, alone, will improve their processes beyond the team.
  • Every sprint needs to deliver value to the customer.
  • Never plan beyond the current sprint.
  • You can use Scrum-of-Scrums to coordinate interrelated teams working on different products.
  • You can use Scrum without automated acceptance testing or up-front unit tests
page 84

I will not comment on each point. The first two sections I would confirm entirely. The last section is pointing at some “missing” aspects in Scrum, but I think just because e.g. test-driven development is missing, its not a limitation. Scrum is not claiming to be describe every aspect of software development.

In general: tables. This book really contains some nice side-by-side comparisons in tabular form. Table 5.1 compares “Scrum and Lean Perspectives”:

The book is also strong in naming typical anti-patterns in agile execution, especially when those anti-patterns clash with lean mindset.

Some common anti-patterns for Scrum teams are

  • Stories are not completed in an iteration.
  • Stories are too big.
  • Stories are not really prioritized.
  • Teams work on too many things at once.
  • Acceptance tests are not written before coding starts.
  • Quality Assurance/Testing is far behind the developers.

Here are questions we always try to use.

  • Does the team’s workload exceed its capacity?
  • When was the last time you checked your actual work process against the standard process?
  • When was the last time you changed the standard process?
  • Where are the delays in your process?
  • Is all of that WIP necessary?
  • How are you managing your WIP?
  • Are developers and testers in sync?
  • Does the storyboard really help the team keep to its work-flow?
  • Are resources properly associated with the open stories?
  • How much will limited resources affect the team’s work?
  • What resource constraints are you experiencing?
  • Can these constraints be resolved with cross-training or are they something to live with?
  • Does the storyboard reflect constraints and help the team manage them?
  • What needs to be more visible to management?
  • How will you manage your dependencies
page 95f

The authors clearly are not satisfied with the amount of guidance provided by and the reality around Scrum. “Going beyond Scrum”, they present their own extended Scrum called “Scrum#” in two pages. They also introduce Kanban as a simpler framework. A key concept already mentioned in the last citation, even more relevant for Kanban, are “work in progress limits”. WIP limits was a known concept to me since some years, learned from former colleagues. The relationship to lean, however, was new to me and it makes total sense. Focus is soooo important, it cant be overrated. In my own experience it would say around 50% of all issues in software projects originate from lack of focus and too many things going on in parallel. Finally in its comparison of process frameworks, the authors do not forget about Extreme Programming.

Before you write a line of code, set up the following:

  • The product
  • The team
  • The environment
  • The architecture
page 109

This sounds simple, still I never experienced a software project in which more than one of those points was clear to a basic degree. Too often, organisations spawn projects with a very fuzzy project idea, an undefined team, unknown environment and a notion of “architecture will be clarified along the way”. The book goes on to provide guidance on how to set up each points before the first development iterations are started. On page 140 the authors present a template for how to draft a product vision statement.

The book also spends a chapter on “The Role of Quality Assurance in Lean-Agile Software Development”. It has, in essence, one key recommendation which is Test Driven Development (TDD). They claim “The role of testers must be one of preventing defects, not finding them”. While TDD has its merits, I think this statement is too simple. It is a bit far from reality to expect testers to have tests ready before every implementation, especially in projects in which even technological basics are not remotely clear. On the other hand, I am not saying TDD is not recommended wherever it can be applied.

If the customer cannot or will not confirm that you have delivered what they want, you should simply state that you believe that the customer does not value the feature; that it is not a high priority. If it were valuable, they would make the effort to specify the tests. Moreover, you should tell your management team that you recommend not building the functionality. If you are required to build it anyway, go ahead, but know that it could well turn out to be a waste of time.

page 164

This is quite some radical take, and it its a bit hard to match to a typical setup in which customers typically do not care about the test cases. However, I can imagine that pushing for a project setup in which tests have such a crucial position that both customers and developer give them utmost priority can nothing but benefit the efficiency of the resulting project. The authors recommend to always keep the question “How will I know I’ve done that?” in mind, which according to them is an ultimate tool tom avoid waste.

After focusing on the team-scale, the book goes on to widen the scope to full enterprises. Also here they provide some anti-patterns (excerpts):

  • Teams are not well formed.
  • Large batches of unprioritized requirements are pushed through the organization.
  • There is no mechanism to limit work to resource capacity.
  • Program managers and business sponsors compete for resources rather than working together to maximize the return on them.
  • Automated acceptance testing is not being done. Test-driven development also is not being done. Testing is initiated too late in the development cycle.
  • Code quality is left up to programmers’ personal beliefs.
  • Finding and removing the root causes of problems is not pursued aggressively. Bugs are tolerated as a way of life in the software world. In fact, many organizations utilize bug tracking as status for release readiness.
  • Continuous process improvement is not practiced or valued. Most companies are so busy trying to fix the latest crisis that there is no time to focus on process improvement to avoid causing the next one.
Page 171f

This is a good list, of course its rather examples than extensive or complete. The second item from the top is re-iterated on page 182 when the authors state “For example, it is common for management to track the number of unfixed bugs. It seems like natural approach to assess how a team is doing. Lean-Agile thinking uses a different approach: Instead of worrying about fixing the bugs, we should concern ourselves with what is causing them.”

In my opinion, all or most of the above points originate from a lack of discipline in the management team, leading to evasive activities with the above symptoms. For examples it is much simpler to track bug lists than to solve root causes in the organization. It is mentally simpler to run from fire to fire than to reflect about fundamental improvements for the process. It is simple to request new reports in every escalation meeting than to use existing ones continuously to create a sustainable frame for the development team. Who is to blame? I think its a management culture based on 100% meetings giving almost no time to reflect and short-sighted office politics. It speaks for the authors, when Alan Shalloway write:

Some people are natural managers; I am not one of them. Historically, I have always micromanaged. Because I am good in a crisis (often creating and then solving them), when one occurred I would tend to jump in and tell my team how fix it. I knew that this behavior was inhibiting the team’s growth, so I tried delegating—letting the team figure out how to do things on their own—often with very poor results.

I was really abdicating via delegation. I needed to find a way to let the team figure out the solution but remain involved enough to ensure that it would be a good one. Fortunately, Lean management provides a way to do this. With visual controls, I can see the team’s process—I can see how the team is doing at any time—and I can see the team’s outcomes.

If the team gets into trouble, I can actively coach them to improve their results without telling them what to do. Lean gives me a way to become a better manager without resorting to old habits.

page 190

Already earlier, we have touched on software architecture. In a separate chapter the authors dive more into the question how to find the sweet spot between too much and too less architectural work.

Build only what you need at the moment and build it in a way that allows for it to be changed readily as you discover new issues.

page 204

and

The purpose of software design is not to build a framework within which all things can fit nicely. It is to define the relationships between the major concepts of the system so that when they change or new requirements emerge, the impact of the change s required is limited to local modifications

page 208

Almost at the end the book comes to speak about the origins of lean at Toyota. Interestingly, they are highlighting that

One of the brilliant insights at Toyota was that Lean principles are implemented differently in manufacturing than they are in product development.

This gave rise to another great example of Lean: the Toyota Product Development System, which is a better example for us in software development.

page 215

Now this is one revelation. So far all trainings were about Toyota’s Production System, not the Product Development System. This makes me wonder if our sources are the right ones.

With that let me close this review. This book was a good read, it explained the existing frameworks, showed their flaws and issues both in theory and practice, made concrete recommendations. All in all this book is a recommendation if you want to read how agile and lean can be combined in state-of-the-art software development processes.

Kategorien
Coding Culture

Technology Radar #27: Automotive SW perspective

As written before, I really like the regular updates provided by Thoughtworks in their Technology Radar. My focus is on the applicability of techniques, tools, platforms and languages for automotive software, with a further focus on embedded in-car software. Hence, I am ignoring pure web-development and machine learning/data analytics stuff which usually makes a huge portion of the whole report. Recently, its volume 27 has been published. Let’s have a look!

As usual, lets start with a dive in the “Technologies” sector and its “Adopt” perimeter. The first entry we can find is about “path-to-production mapping“. Its as familiar as it sounds – many of my readers will have heard about the Value Stream Mapping or similar process mapping approaches. Thoughtworks state by themselves that this one is so obvious, still they didnt cover it in their reports yet. Sometimes, the simple ideas are the powerful ones. I can confirm from my own experience that a value stream map laying out all the process steps and inefficiencies in an easy to digest manner is a good eye opener and can help to focus on the real problems instead of beating around the bush.

Something very interesting for all the operating systems and platform plans in Automotive is the notion of an “incremental developer platform“. The underlying observation that “teams shooting for too much of that platform vision too fast” is something I can confirm from own experience. Engineers love to develop sustainable platforms, but underestimate all the efforts required for it, and management with its impatience is further undermining platform plans. Following the book Team Topologies’ concept of a “thinnest viable platform” makes sense here. Not shooting too far in the first step, but also treating a platform product as an incremental endeavour.

Another one which strikes me is “observability in CI/CD pipelines“. With the increasing amount and complexity of CI/CD pipelines in one project, let alone a whole organization, many operational questions arise. And operations always benefit from clear data and overview. Recently, a then-student and now colleague and me designed and realized a tool which enables CI/CD monitoring for more than one repo, but for a graph of repos. I hope we can publish/open this project anytime soon.

In the platforms sector, Backstage entered the “adopt” perimeter. The project is actively developing forward, and indeed could be an interesting tool for building an internal sw engineering community.

Looking at the tools sector, I liked Hadolint for finding common issues in Dockerfiles.

Kategorien
Culture

Developing a CI/CD maturity model for ECU SW engineering

Continuous Integration in Automotive is like teenage sex:

everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…

(after Dan Anely)

When talking about software integration and its modern CI/CT/CD patterns, there is a lot of confusion and misunderstanding. I was already in multiple rounds in which projects showed their process maturity in shiny colors, however when observing the daily practice it was far away from the presented state. Claiming to do “CI/CD” and really living it is a huge difference. As we seek objectivity in comparisons, we went ahead and started to develop a CI/CD maturity model, applied to ECU software engineering.

As input we took our own experience and a lot of research papers. Of course we could hardly avoid the canonical work Accelerate from Forsgren et al. But even that one cannot be applied in a straightforward manner. For us, it is important to focus on one ECU project, and not the overall carline project. The latter is of course desirable, however far out of our scope and reach.

Without further ado, here is our current state for discussion, review and comments. Feel free to share constructive criticism. At the bottom you will find our references (the ones we already incporporated and the ones we still have on our bucket list).

Unfortunately the social intranet crops the table. You can still read it by clicking the middle mouse button and dragging it.

KPIDescriptionL1: InitialAd-hocL2: ManagedManaged by humansL3: DefinedExecuted by toolsL4: Quantitatively managedEnforced by toolsL5: OptimizingOptimized by tools
Lead time for changesadapted to ECU project:Product delivery lead time as the time it takes to go from code committed to code successfully running in release customers (carline, BROP orga)One month or longerBetween one week and one monthBetween one day and one weekless than one dayless than one hour
Deployment frequencyadapted to ECU project:The frequency of code deployment to internal customers (carline, BROP orga). This can include bug fixes, improved capabilities and new features.Fewer than once every six monthsBetween  once per month and once every six monthsBetween once per week and once per monthBetween once per day and once per weekOn demand, once per day or more often
Mean Time to Restore (MTTR)How quickly can a formerly working feature/capability be restored?(not: How quickly can a bug be fixed!)More than one monthLess than one monthLess than one weekLess than one dayLess than one hour
Change Fail PercentageApplicable ?
Pipeline knowledge bus factorHow many people are able to fix broken pipelines within reasonable time and/or add new pipeline features with significant complexity (doesnt include rocket science)?≤ 1233 + people in various teamsat least one in most project teams
DevOpsAre pipelines managed by the software developers or not?Pipelines are (almost) exclusively managed by dev-team-external staffPipelines are maintained by the developers; new capabilities are added by externalsPipelines are maintained and improved by the developers in almost all casesPipelines are maintained and improved by the developers in all cases
AccessibilityCan all contributors, also from suppliers, access whatever they need for their jobs?Developers can access only fragments of the system, and additional access is hardly possible (only with escalation/new paperwork)Developers can get access to most parts of the system, and for the other parts there is some replacement (libraries)Developers dont have full access, however they can trigger a system build with their changes and get full resultsDevelopers have access to the full project SW and are able to do a full system build.
Review CultureWhat (code) review culture does the project have?Optional reviews without guidelines and unclarity who is actually supposed to give reviews; regular flamewarsMandatory reviews with defined reviewers; defined guidelines; flamewars happen seldomTools conduct essential parts of review (code style, consistency); barely any flamewars
Communication
Release Cycle DurationTime needed from last functional contribution until release is officially deliveredmonthsweeksdayshoursminutes
Delayed deliveriesHow often do delayed contributions lead to special activities during a release cycleEvery timeOften (>50%)Seldom (<25%)Rarely (<10%)Never
Release timelinessHow often are releases not coming in timeOften (>50%)Seldom (<25%)Rarely (<10%)NeverNo release timelines needed
Release scopeAre given timelines determining the planned scope for the next release (excluding catastrophic surprise events)Planned scope is mostly considered unfeasible; priority discussions are ongoing at any time80% of planned scope seems feasible; priority discussions are coming up during the implementation cycle repeatedly100% of planned scope usually is feasiblePlanned scope doesnt fill available capacity, leaving room for refactoring, technical debt reductionThere is no scope from outside the dev team defined, things are delivered when they are done (team is trusted)
Release artifact collectionHow are release artifacts gathered, combined, put togetherEverything manualSome SW manual, some SW automaticall SW automatic, documentation manualSW + documentation automatic
TracabilityConsistency between configurations, requirements, architecture, code, test cases and deliveriesNo consistency/tracability targetedIncomplete manual activitiesMostly ensured by toolsUntracable/inconsistent elements are syntactically not possibleUntracable/inconsistent elements are semantically not possible
DeliveryHow is delivery happening?Ad-hoc without appropriate tooling (mail, usb stick, network drives)Systematic with appriopriate tooling (artifactory, MIC)Automatic delivery from development to customer with manual triggerAutomatic delivery for every release candidate
Ad-hoc customer/management demoability
Feature Toggles
Test automationNo or a bit of exploratory testingTest plan is executed by human testers for all test activitiesAll test activities except for special UX testing are automatedNo contributions can avoid passing the obligatory automated testsAautomatically deriving new test cases
Test activitiesA bit of exploratory testingManual, systematic black box testingStatic code analysisUnit TestingIntegration TestingSystem TestingE2E Acceptance TestingChaos engineering, Fuzzing and Mutation testing
Virtual Targets
Quality CriteriaSW Quality management does not exist/not known/not clearSW Quality management is a side activity and first standards are “in progress”SW quality is someone’s 100% occupation and a quality standard existsSW quality is measured every daySW quality measured gaps are actively closed with highest prio
RegressionsHow often do regressions occur (regression = loss of once working functionality)Regressions happen with every releaseRegressions happen often (>50%)Regressions happen seldom (<25%) and are known before the release is deliveredRegressions happen rarely (<10%) and are known before the released software is assembled
ReportingVideowall, dashboards, KPIsNo or ever changing KPIsDefined KPIs, manually measuredSome KPIs are automatically measured, but its meaningfulness is debated, therefore doesnt play a huge role in decision-makingAutomatically measured and live displayed KPI data, used as significant input by decision-makersLive data KPIs are main source of planning and priorization
A/B Testing
Release CandidatesHow often are release candidates made availableNo release candidates existingBefore a release, release candidates are regularly identified and sharedDaily release candidatesEvery change leads to a release candidate
Master branchWhen do developers contribute to master?Irregularly, feature branches are alive for weeks or moreRegularly, feature branches exist for daysChanges are usually merged to master every day
DevOpsAre pipelines managed by the software developers or not?Pipelines are (almost) exclusively managed by dev-team-external staffPipelines are maintained by the developers; new capabilities are added by externalsPipelines are maintained and improved by the developers in almost all casesPipelines are maintained and improved by the developers in all cases
SW Configuration ManagementHow are product variants and generations managed?No systematic variant management, some variants are just ignored yetsystematic manual management of variants, variants are partially covered by branchesAll variants are managed via configurations and pipelines on same branchSoftware is reused over generations and variants
Reproducability
Customer FeedbackCustomers can be internal, too; not only end-usersCustomer not known/existing, no customer feedback availableInternal proxy customer, providing feedback which doesn’t play a big roleExternal customer is available and his/her feedback plays a relevant roleEnd-users’ feedback is available to the developers and a relevant input for design, planning and priorizationEnd-users’ feedback is main input for design, planning and priorization
HeartbeatIs a regular heartbeat existing?No regular heartbeat (e.g. time between reelases)Regular heartbeat, but often exceptions happenRegular heartbeat without exceptions
IT reliabilitySW Factory is regularly breaking; developers have local alternatives in place to not get blockedSW Factory is often broken..
Developer FeedbackHow quickly do developers get (latest) feedback from respective test activitiesMonths after the respective implementationweeks after the respective implementationdays after the respective implementationhours after the respective implementationminutes after the respective implementation
DevSecOps Stuff
Security Scanning
ISO26262?

Incorporated references:

Kategorien
Coding Culture

Technology Radar #26: Automotive SW perspective

As written before, I really like the regular updates provided by Thoughtworks in their Technology Radar. Since the new version #26 was released a few weeks back, I found now the time to put down my notes. My focus is on the applicability of techniques, tools, platforms and languages for automotive software, with a further focus on embedded in-car software. Hence, I am ignoring pure web-development and machine learning/data analytics stuff which usually makes a huge portion of the whole report. Let’s go!

In the techniques section in the “adopt” circle we initially have “single team remote wall”. In a nutshell I think they mean having a dashboard showing the essential data, kpis and tasks for a remote development team. I think the trick here is the “single” as I assume that most remote teams have dashboards, however usually multiple ones loosely coupled. In my current team, our Scrum Master has created a great Jira dashboard showing some essential data which could give hints at the team’s performance.

The second noteworthy technique is “documentation quadrants”. Referring to documentation.divio.com/ this provides a nice taxonomy of different documentation types. This is very relatable, as I very often experience a fuzzy mixture of all those types scattered in many places. Certainly this is something I will bring to my work network’s attention.

Third, we have “rethinking remote standups”. This follows a general observation that conducting remote daily standups in the same duration and content like the were recommended in former times (e.g. the typical 15 min Scrum daily) does not provide the same amount of alignment within a development team. This is not necessarily because of the the meeting itself, but because other casual sync occasions during the day are happening less in remote setups. In the radar, its recommended to try an extension to one hour, and of course the goal is to decrease the overall meeting load by this. I am thorn on this one, as I was always a fan of crisp daily meetings, avoiding random rambling on topics concerning only parts of the team. Blocking 1 hour for everyone every day sounds like an overshoot approach.

Next there is again the “software bill of materials” topic. This is currently a huge topic in the software industry, there have been very concrete examples recently (e.g. the Log4Shell or NPM package events you probably read about). Tool support to transparently and consistently managing the used software in a bigger project is really needed. While in the web and cloud business there is a growing number of tools, in the embedded world there are only some puzzle pieces. I can currently think of some Yocto support for this, however this covers only Linux parts in usually more complex multi-os automotive ECUs.

“Transitional Architecture” sounds like a promising thing, even though the radar’s description stays a bit vague. Luckily there is an extensive article by Thoughtworks’ Martin Fowler on this approach. In my opinion, managing legacy software in complex setups is one of the key challenges in the whole software industry, even more so in automotive embedded software, which is characterized by the co-existence of decades old technologies with state of the art approaches. Formalizing the transition from on older architecture to a newer makes sense, as usually this transition is often not architecturally covered as extensively as a the target architecture. This leads to misunderstandings, hacky workarounds and other unwanted side effects to a sustainable development.

Going one circle to the outside, in the “assess” perimeter, we first find CUPID. Aimed at replacing the SOLID rules with instead a set of properties of “joyful code”, it has some interesting observations and paradigms. Currently I only skipped over it, I think this deserves more time and maybe a dedicated article. However, I can recommend to check out the well written original blogpost by Dan North.

In the “hold” perimeter we see “miscellaneous platform teams”. In contrast to “platform engineering product teams” described earlier in the radar, this is kind of a degradation form. If a platform team fails to define a clear product goal and identify its customers, usually the scope becomes (or is) very fuzzy, leading to a unclear platform system. Hence, its strongly recommended to avoid this by achieving clarity of what actually is the scope of the team.

In the platforms sector, I could only identify one relevant blip “GitLab CI/CD”. Recently I see a lot of discouragement of using Jenkins, and of course if you use already Gitlab for its other elements (code hosting, code review, issue tracking) you may as well use it for CI/CD pipelines. For sure its better integrated in the overall Gitlab experience. However its just another vendor-specific DSL, so I wonder if there will be practical standardization on the pipeline definition soon.

Looking at the “Tools” sector, I found the reference to the two code search tools Comby and Sourcegraph. Besides offering code search and browsing using abstract syntax tree analysis, they are also offering semi-semantical batch changes, enabling “large scale changes”. Comby is an open source tool, while sourcegraph is commercial. I think I will try at least one of them soon.

Kategorien
Book Culture

Thoughts on “Reinventing Organizations” by Frederic Laloux

Next in my reviews of work-related books we have “Reinventing Organizations” by Frederic Laloux (the illustrated, shortened version). This is going to be short, I essentially read the book in an afternoon.

After some introductions, Laloux gives a historical overview on organizational philosophies, based on Ken Wilber. It starts at the impulsive, pure power-based via hierarchical via modern performance-oriented philosophies to postmodern pluralist ones. One interesting aspect here is, that even the two “older” ones can still be experienced today, the first one e.g. in street gangs/mafia, the second one e.g. in the church or public administration – and of course many companies.

The latter is based on empowerment, values and integrates diverse interest groups, and probably is by most considered already quite new or in many organizations “still upcoming”. However, Laloux claims that we may experience aklready now the dawn of a new, next philosophy, a integral, evolutionary one. He gives names three breakthroughs, which require but also trigger this. First, the increasing complexity of the systems organizations need to cope with, and the need for self-organized, decentral organizations to be able to cope with that. I would say that this approach “extrapolates” the empowerment by “built-in” trust and top-down-ruling-inhibiting organization forms in small, completely autonomous entities.

The second breakthrough is “wholeness” (“Ganzheit” in my German version of the book). It is based on the view that at their workplace, many people show and use only a part of their personality. This leads to loosing many valuable aspects and creates an unhealthy mental tension. Laloux gives some examples how this can be reliefed, e.g. by medition, group-reflection and new meeting-moderation methodologies.

Third, there is a breakthrough called “evolutionary purpose”. Its probably a reference what more recent publications and consultants call “VUCA”(Volatility, Uncertainty, Complex, Ambiguous). According to that perspective–which I have some doubts on, but maybe more in another blog post–the world around is is increasingly “vuca”. Traditionally, we would try to fight this vuca-ness, trying to impose existing problem-solving techniques–and ultimately fail. The other mental approach is to accept, even embrace this situation and–how Laloux phrases is–“dance with it”. Here, Laloux makes an interesting claim: “Most of us react very cynical on mission statements”. Oh yes!

The book contains some interesting food for thought, and it certainly has some relevant observations. Where my personal doubts come in: If I look at an historic scale, the four “traditional” organizational philosophies have been in place 10.000s, 1.000s, 100s and 10s of years, respectively. As I wrote, especially the last one (postmodern, integral) is still a quite new kid on the block. What does it tell us that new even a new philosophy comes up, and if we extrapolate the logarithmic timescale even more, will it be replace in a few months? Just kidding, however I wonder if Laloux’ evolutionary model is just a sub-type of e.g. the postmodern one, or really a new philosophy. And of course – what comes beyond that. Will there be an even more “advanced”, “progressive” organizational philosophy soon? Will things break apart and we go back to the wild days? I dont know…

Kategorien
Book Culture

Thoughts on “How Google Tests Software”

During my recent years as Software Project Manager I learned to appreciate modern software testing beyond just a tool. In earlier blog posts I described my personal hands-on progress on the matter. Thanks to some teachers and mentors I gained knowledge in theory and practice and nowadays call myself a software testing enthusiast (which doesnt mean I am particularly good at it – I just enjoy the topic).

The Book “How Google Tests Software” from James Whittaker, Jason Arbon and Jeff Carollo is already some years old – a fact I didnt know about initially, but some aged concepts made me look into the publishing date. Also some references are not working anymore, including a link to Google’s own “Google Test Automation Conference”. However, its still a great summary on aspects of modern software testing, with a focus on organizational aspects. There is no mention of mutation testing, probabily-based testing and other testing approaches I would call “advanced”, instead there is a lot of focus on the specific roles in and around testing at Google.

In the forewords the authors’ boss Patrick Copeland introduces the term of “Engineering Productivity” and why he has chosen this as the department’s name. I find this term very appealing, especially in its connection to testing. In my domain, automotive, testing is often seen as something which slows projects down and its done in the very end after features have been completely implemented to “drive maturity”. For me, proper testing is a method to be faster after all, and doing it late is possible but is for from being efficient. The Engineering Productivity organization at Google gathers the lead on test activities, respective tooling around testing and the mentoring/training efforts to spread a testing mindset throughout the whole organization.

In the book, the saying “scarcity brings clarity” is coming up multiple times, and the authors say

The first piece of advice I give people when they ask for the keys to our success: Don’t hire too many testers.

How Google Tests Software, page 4

Googlers might be smart, but they are not plentiful. Every TEM we’ve ever hired from outside of Google makes the comment that their project is understaffed. Our response is a collective smile. We know and we’re not going to fix it. It’s through knowing your people and their skills well that a TEM can take a small team and make them perform like a larger team.

How Google Tests Software, page 188

As a young leader I dread the constant understaffing all around me. Such looks into highly regarded companies like Google make me believe its something inevitably in any fast-paced company. Being always short on good people compared to the tasks at hand must yield the most efficient use of people as possible. But how to balance this in a way which satisfies me is something I admit to not have found the key for yet.

In major parts of the book some key roles are described in detail with their relevance to testing: The (“normal”) Software Engineer (SWE), the Software Engineer in Test (SET) and the Test Engineer (TE). On page 7 the roles are summarized in a nutshell, while many many pages repeat and detail this again and again – a style I find very typical for American textbooks. In essence its clear and I assume my mentors must have either read the book years ago or used common root sources. The worksplit seems obvious to me now. SWEs are developing the code, but have to contribute automated testing on all levels to secure the functionality and quality of the system, even when no other test-focused roles are around. SETs are pushing the boundaries of test tooling, enabling new kinds or aspects of test automation wherever they can. TEs are driving testing especially in a “challenger” approach, to catch all the issues SWEs don’t see or dont want to see initially.

I already mentioned that the book doesnt go very deep into test technology. What I found very interesting is that Google separates their test activities simply by “small tests”, “medium tests” and “large tests”. According to the authors they chose to do this because other terms are overloaded and vague throughout the software industry do a degree that people are talking about different things and often didnt recognize.

On pages 40f some nowadays state-of-the-art concepts about how to include test automation into pipelines are brough up. E.g. having test code near to the functional code. Also the classification of small/medium/large tests is added with some concrete data: large tests have to run within 15 mins, small tests are required to execute in less than 100ms. This shows the enourmous compute power Google is able to leverage for their software development. Also there is a good note to quote:

Small tests lead to code quality. Medium and large tests lead to product quality.

How Google Tests Software, page 47

Only both in combination provide long-term value to a company. Having insufficient small tests strongly correlates to bad code quality which will slow down further development. Having insufficient medium and large tests (system tests) lead to many gaps in the functional and non-functional end-to-end testing, thus yielding bad product quality if there is not immense manual testing in place.

On page 48 two very basics rules for test cases are defined, which I find highly relatable: Tests have to be independent of each other so they can be executed in any order (and in parallel). Also tests must not have any persistent side effects.

Interestingly the book makes some references to a build and test execution system with sophisticated dependency graph analysis, without giving the tool’s name. Nowadays I know that they are referring to Bazel. The book must come from a time when it was not yet published.

On pages 54ff the authors describe a “test certified” program which they used to promote the testing inside Google, including some test certified levels for teams to achieve. In a very telling interview I learned a lot that even at Google the testing proponents had to overcome a lot of internal resistance.

The dream of a unified dashboard has haunted Googlers, much like it has at many other companies. Every year or so, a new effort tried to take root to build a centralized bug or project dashboard for all projects at Google.

How Google Tests Software, page 124

Its late and I want to finish this blog post, so I am closing with an inspiring quote from Joel Hynoski, answering why someone should pursue a career in test:

Test is the last frontier of engineering. We’ve solved a lot of the problems of how to develop software effectively, but we’ve still have a green field of opportunity to attack the really meaty problems of testing a product, from how to organize all the technical work that must get done to how we automate effectively, and responsively, and with agility without being too reactive. It’s the most interesting area of software engineering today, and the career opportunities are amazing. You’re not just banging on a piece of software any more, you’re testing the GPU acceleration of your HTML5 site, you’re making sure that you’re optimizing your CPU’s cores to get the best performance, and you’re ensuring that your sandbox is secure.

How Google Tests Software, page 206
Kategorien
Coding Culture

Technology Radar #24: Automotive SW perspective

I am following the Thoughtworks Technology Radar since approximately five years. Since I first became aware about it (IIRC a colleague forwarded a version to me), I enjoyed its volumes as a condensed summary of current trends in the software industry. There is some bias towards cloud technology and machine learning which is out of my current professional interest, however there are enough other concepts/tools/aspects every time which make me investigate and followup to my best possibilities either at work or in my private projects. In this blog post I will try to list those parts which are roughly new and relevent from an automotive software perspective.

As usual, there are some selected topics on the top level they are presenting. Already the first one – Platform Teams Drive Speed To Market – is highly relevant. Its no secret that the automotive industry is widely moving towards in-house platforms, recent example from the news are vw.os (Volkswagen) and MBOS (Mercedes-Benz). The particular recommendation is to handle such platform activities explicitly as products, with all the implications. Only a platform with its own set of features, product vision, dedicated team and lifecycle can actually serve its purpose. A platform handled as a side-gig of an enduser product will neither serve the platform users well nor will it provide sustainable value, plus there would be a high chance of not surviving the first generation.

The second highlighted aspect – Consolidated Convenience over Best in Class – is also very relevant in the automotive SW sector. All of us love the regular discussion which code review tool is the best, and which CI tool is the newest hot shot we must adopt. Integrated workflow tools which integrate requirements/tickets to code, integration, tools and deployment such as gitlab and github (with github actions) promise so much more convenience in inter-tool integration and generally less setup hassle compared to having multiple tools each with its own authentication foo. However, there is also much higher risk of vendor lock-in. This can result in financial dead-ends and of course sooner or later lead to loss of flexibility on adopting new technology in the workflow area.

Jumping over the third highlighted topic and going straight ahead to the fourth – Discerning the Context for Architectural Coupling – this covers the architectural state of the union address. Baseline of the message here is “it depends”. Discussions with buzzwords like “monoliths” and “microservices” are seldom helpful, the real discussion needs to happen on the detailled requirements, constraints and principles.

Going deeper into the four sectors of the radar – Techniques, Tools, Platforms, Languages&Frameworks – lets start with Techniques. The new entry API expand and contract is directly an “adopt” recommendation. It means rather than breaking changes with new API versions too often its more about adding new interfaces while keeping but deprecating old ones, giving consumers a chance to adopt without breaking their usage. I can see immediate benefit of such patterns especially on highly used interfaces such as the vehicle communication interfaces. If “deprecation would be a thing (at scale)” it could relief many breaking integration challenges in the automotive domain.

In the trial circle we find Hypothesis-driven legacy renovation:

Rather than working toward a standard backlog, the team takes ownership of a measurable technical outcome and collectively establishes a set of hypotheses about the problem. They then conduct iterative, time-boxed experiments to verify or disprove each hypothesis in order of priority. The resulting workflow is optimized for reducing uncertainty rather than following a plan toward a predictable outcome.

https://www.thoughtworks.com/radar/techniques/hypothesis-driven-legacy-renovation

Sounds quite interesting. I am, too, used to handling technical stories like user stories. The given approach may indeed yield more efficient and effective collaboration on technical improvements.

Another very relatable technique is Lightweight approach to RFCs. Its something which my personal workstyle already incorporated since long time. I strive to create very fast first drafts for any task and then seeking review comments from my peers. Hence, I can absolutely understand the point of this topic and clearly recommend this. Any organization in need of collaboration and frequent alignment on concepts can benefit from a very lean set of templates encouraging early feedback rather than a culture which expects “perfect first shots”.

Team cognitive load and its linked Inverse Conway Maneuver give also very interesting food for thought. I am well aware about some teams in my environment who struggle a lot with the complexity at hand, and the organzational environment adds to that complexity significantly. The Inverse Conway Maneuver – organizing the team organization around how the proper architecture of a system looks like rather than structuring the architecture around the semi-random existing organizational structure – makes a lot of sense. It chimes well with the notion of Feature Teams and team ownership.

Remote mob programming as an extension of local mob programming, which is by itself the advancement of pair programming is certainly a recommended practice in many situations. Recently I have tried to bring teams together in “remote debug sessions” for hard inter-team nuts (=bugs) with not so great success. There is a lot of hesitance of teams to just spend some focus time together, and time difference in international setups is only one external factor. While (local) pair programming is something I think every team manager should at least try with his team a few times to see if it works out, mob programming depends an order of magnitude more mature team dynamics.

Under “hold” we can find Peer review equals pull request. The baseline here is that pull requests are only one way to conduct peer review, and as such its a very valid claim and call for further action. Looking only at fragmented changes (which 99% of PRs usually are) and reviewing that alone doesnt give context how a codebase looks overall. Full code reviews or audits certainly give great insights for potential refactoring efforts and even re-designs, besides other learnings.

The scaled agile framework SAFe is a hold and a look in the history shows it was since the beginning. Thats interesting. I am no expert in SAFe, and only know it in theory.

Separate code and pipeline ownership is really triggering me while I write this. Thats because I recently gave a presentation about the exact same thing and why both shouldnt be separated. I am wondering if that thought appeared to me myself or if I read about it the radar earlier. Nevertheless, I think anything else than keeping both together cannot really be justified from a professional sw development context (neglecting some extreme theoretical circumstances). Having that said, its not yet an established truths, and organizational structures and other opinions enforce a separation more often than I would wish for.

The tools sector doesnt offer really relevant insights this time, its all about backend intra, web dev, ML and that kindof stuff.

In the platforms sector the first thing which appears intersting is Backstage. Its a platform developed by Spotify which in essence is a simple overview on projects going on within an organization (company). It can be used to advertise cool efforts in the developer community. Something which could be worth a try. However, as a separate tool it may lack synchronizity with the actual developer activities. Integrating such a frontend into gitlab could yield some interesting dynamics, because that could leverage existing information about activitiy of projects and use data directly from the individiual projects/repos in there.

In the languages & frameworks sector again nothing relatable.

In total the new tech radar volume give me some very valuable insights and food for more thought.

Kategorien
Book Coding Culture

Thoughts on Refactoring from Martin Fowler

In my series of book reviews on classics its time to take on another evergreen: Refactoring from Martin Fowler. It was already referenced a multiple times in the earlier books and now it was time.

The concept of refacoring is well-established nowadays, and I would say that only a company culture which incentivizes and motivates a spirit of constant refactoring can be a state-of-the-art software company. In my experience, this is easier said than done, especially when immense release pressure and tight resources lead to the well-known vicious circle of crunching and task forces, which never end. When one release was successfully squeezed out “somehow” and the next one is already knocking on the doors, how could anyone expect refactoring of code which actually “works” (it made it into the last release after all!!!). I had such discussions many times, and until today I dont feel strong enough in my arguments to convince senior managers in typical situations. Arguing with sustainable code and continuous improvement when the other side virtually puts the existence of the whole organization at risk is an uphill battle. But its worth fighting.

In essence when you refactor you are improving the design of the code after it has been written. […] With refactoring you can take a bad design, chaos even, and rework it into well-designed code. Each step is simple, even simplistic. You move a field from one class to another, pull some code out of a method to make into its own method, and push some code up or down a hierarchy. Yet the cumulative effect of these small changes can radically improve the design. It is the exact reverse of the normal notion of software decay.

Martin Fowler: Refactoring, page 9

I like this simple but strong definition, which underlines the cumulative effect of small improvements. Refactoring is not the same as the “grand redesign” or “from scratch” approaches which are often taken and loved, especially when the staff (developers, managers) is changed. Refactoring can achieve the same goals with either the same or a different staff.

After some introductory words Fowler goes ahead with a concrete example. He emphasizes the repetitive nature of changing something a little bit, then running an extensive unit test suite and then committing the changes to a repository. This approach is something I immediately started to exercise. For that I had to setup some of my testing scripts to work locally (they were purely in my CI before), but the effort was totally worth it. With my unit test coverage being 98% nowadays plus mutation testing in place, I dont have to worry a lot to break existing functionality while refactoring, and as a lost resort I have all the extensive API and acceptance testing in my CI before I can merge on master.

The true test for good code is how easy it can be changed.

Martin Fowler: Refactoring, page 77 (translated back to English)

This quote may be controversial among people with low exposure to professional software development (juniors, managers who gave up on programming a while ago), but for professionals in my environment its fortunately common sense. Which doesnt mean that its applied constistently, though.

If someone tells you that their code during refactoring didnt work for a few days you can be quite sure that they didnt apply refactoring as such.

Martin Fowler: Refactoring, page 79 (translated back to English)

Such reworks or redesign of larger parts of a codebase of course can and may happen, too, but I appreciate Fowler’s take on a clear separation. Especially in arguments with upper management, clear terminology can help. If refactoring can mean everything happening to a codebase except adding new functionality, its probably to vague. Restricting its meaning to pure tiny and small changes reduces risks a lot and may help to increase acceptance.

Refactoring Helps You Program Faster

Martin Fowler, Refactoring, page 82

This was and is also my personal strong belief, and I think there is enough evidence that this is a fact. I mentioned earlier sustainability, and if a well-design and maintained code base based on clean code principles and with constant refactoring exists, its the best way to get a graph like the red one below.

Source: https://frontendwebblog.wordpress.com/2020/07/07/notes-from-refactoring/ (who presumably has taken it from Martin Fowler: Refactoring, English version)

Note also the short part in the lower left corner, where poor design for a while may have more functionality than good design. Naturally, with hacky proof of concepts hitting production, you can push out some functionality “fast”, but at the cost of low efficiency later on. On pages 90f (German version) declares this the major argument in favor of refactoring – in the end its about the business value it provides, and that one can be immense.

On pages 86f Fowler discusses if one should reserve dedicated time for refactoring. I heard before of models like “every third sprint is a refactoring sprint” and of course situations where “management decided the next sprint is for refactoring to fully focus on features after that again”. Fowler argues that reserved refactoring slots shouldnt be a thing. I tend to agree with him, by stating that refactoring should be considered in day-to-day work efforts and not be a sepereate activity in contrast to “normal development work” (which it should be a part of). However, as the value of refactorings is hard to measure and aforementioned “implicit” integration into the developer workflow easily gets it ignored or watered down, some accompanying activities and events encouraging refactoring may provide benefit.

What Do I Tell My Manager? […]

Of course, many people say they are driven by quality but are more driven by schedule. In these cases I give my more controversial advice: Don’t tell!

Martin Fowler: Refactoring, page 89

Similar to Bob Martin, Fowler argues with refactoring being an essential part of professional software development, thus its not something a manager should even have the change to interfere with in particular. I agree – mature develops should just do it and get the time needed. Asking management for approval for such a core activitiy in the actual coding, that it would be weird to explicitly ask for it. Do you ask for permission to apply certain design patterns, too? Of course, when asked, management will respond and the answer in probably 80% not what you would do. Management will probably realize at some point of refactoring gets too extensive (if that is even possible), but in the end refactoring is for everyone’s best (see the graph above).

On pages 93f Fowler has some interesting input on the interaction of (feature) branches and refactorings which I didnt see before. In essence its simple: Both are working in opposite directions. If there are multiple branches living in parallel and in one of them refactoring is happening, it may make the other branches unmergable (and vice-versa). So if a company or project wants to encourage refactoring, it should avoid active branches wherever possible.

There are more arguments for refactoring and a myriad of hints and tricks how to apply it, when to apply it and how to carry it out in an optimal way. The second part of the book contains detailled descriptions of all types of refactorings. I have skipped through them, but I know I wouldnt remember them exactly anyways. However, with the gist of things in mind I did a major refactoring of my current pet project, and it was really fun and productive. Now my codebase is even cleaner (I did already a few rounds after reading Clean Code) and I am looking forward repeat it while I continue adding functionality.

It was a very good book and a certain recommend, if you didnt read it already.