Andy Hedges' Blog

4 Symptoms of Dysfunction in Software Teams

Once you’ve been in the industry for a decade or so you start to get a sixth sense of when things aren’t right. However even when your sixth sense isn’t working here are some signals that should raise alarm bells.

1. It Dependencies

Every time you ask someone how something works or where some data is the explanation starts with ‘it depends’.

For example say I enquire how a user’s password is verified, I’d get an answer something like this:

“It depends, if the user registered before 2001 then you need to call the genesis logon system, after 2001 however we switched to active directory, unless the user was a customer of that company we bought in 2005, in which case you’ll need to call the ‘migration’ LDAP they put in place at the time which uses Netscape LDAP so you’ll need JNDI library we patched to work around some bugs. Also if the user is an administrator we keep the credentials in the staff database so you’ll need to do an ODBC look up for that.”

You get the idea, technical debt has built up, poor decisions haven’t been rectified and now there is there a laundry list of ‘it depends’ for every question.

2. Jane Doe

Here’s where we have a systems or component that only one person understands, the cynical might refer to this a mortgage driven development however it often happens because of apathy too. Unfortunately, the system might not be considered sexy to work on and therefore hasn’t been touched other than to keep it ticking over. The trouble is that every time there is a problem only Jane can fix it and she’s not much into do that work either. She’ll do the minimum to keep it ticking over.

The trouble with this is that the system is providing value to the organisation otherwise it could be turned off. It’s very likely if it’s providing value that at some point in the future it will need enhancing and nobody is going to be able to do and that — and then Jane quits.

3. Town Hero

This is the support equivalent of ‘Ask Doe’, it isn’t a case of if this system will fail but when and how dramatically… and it’s always dramatic. When it does go wrong there is only one plucky guy in support who can fix it: our local hero. Undoubtedly he’s not around when things go wrong but everyone knows he’s the only guy that can fix it.

When the inevitable catastrophic failure occurs calls are made to his landline, his mobile and DMs to his twitter account. He’s unreachable, what should we do? We’re doomed. When all seems lost a call is received. He’s been located, but he’s on a beach in Rio. There’s no way for him to connect to take a look. However he’s had an idea he’s VPNing into a server in Hamburg to create a SSH tunnel through the San Francisco DC and from there to the corporate firewall… he’s in! A few minutes later he’s diagnosed the potential problem. People are called to mission control, orders are given, procedures are ignored, changes are made. The hero calls into the bridge and explains the problem, it’s going to be tough and take some time but he can do it — nobody else has a clue what he’s talking about. In the small hours the system is restored, everyone pats each other on the back for pulling together and working as a team — also, more importantly, thank goodness for the hero, what would we do without him?

All is well in the world thanks to that guy, our hero — until next month, when the exact same thing happens again.

The trouble is these systems never get fixed, either because the team that that supports them aren’t qualified to do so, aren’t allowed to do so or because they are too busy, maybe too busy being heroes.

4. Carcinogenic Prototype

This system starts with a great idea, an idea that has a lot of potential. In order to qualify that potential a demo/pilot/prototype is created and as it turns out the potential is realised. The business become very keen to get this prototype into full production — they might just make those targets they thought they were going to miss if this works out. The engineers who built the prototype are pleased because their work is demonstrating value but reticent because it was just a prototype.

The engineers mention this to the group. The system wasn’t designed to go live, it doesn’t scale easily, it doesn’t have good exception handling, it doesn’t do any logging, it can’t be monitored, the list goes on. The engineers are overruled but promised to be given time to fix it up later.

Later never comes because once the system is live feature enhancements come pouring in, the system grows rapidly and sometimes starts to affect other systems too. Unfortunately due to the quality of the system it’s very hard to add features in a clean way and so the technical debt grows, it compounds.

Solution

As with most things in software engineering the technical problems are symptoms of the organisational causes. For each of the symptoms listed above there are numerous organisational reasons for them occurring, I’m going to start to blog each of these organisational issues over time. However at the heart of the problem is a lack of desire to change the status quo, to enforce a level of quality in the engineering and take ownership. Take ownership of your minimum viable quality and stick to it.

Andy Hedges

SOA vs Microservices

The microservice term has been around for a while but for as long as it has been around there has been much arm waving around what it was. It continued as one of those nebulous concepts that meant anything and everything to everyone and no one. In this way it caused a little confusion but served as a token to reference something — but what? The closest you could get was was that the services were smaller than monolithic applications, by some measure that was vague. Of course, as is often the case some people were tacking fairly arbitrary stuff onto the term too. That’s where we were, microservices were smaller than just about the biggest thing anyone can ever create in software: monolithic applications. Monolithic applications are like the Graham Numbers of software architecture.

“Graham has recently established … a bound so vast that it holds the record for the largest number ever used in a serious mathematical proof.” — Martin Gardner (c.1977)

This wasn’t particularly useful but pretty much par for the course in the IT industry, buzzwords come and go and often don’t have concrete meaning attached to them, or have multiple conflicting meanings attached to them. However this is where things took an interesting turn, Martin Fowler, a well known name in the industry made an attempt to document what the terms could mean. Fowler takes a very thorough approach to documenting things and has a rather large following of people, so when he does document something the term:

  • tends to stick
  • have concrete meaning

The trouble was that a number of people to a great or lesser extent, including myself, believed the definition of microservices that Fowler came up with is remarkably similar to SOA as known and documented, with some other practises thrown in. Those other practises were debatably a little orthogonal to the topic at hand, that is, they were good practise but beside the point.

It’s well documented what SOA is and now it is well documented what microservices are. Compare them side by side and most reasonable people, I think, will come to roughly the same conclusion. So how did this happen?

Marketing and communication

SOA is unfortunately a hard concept to grasp initially, it takes a little time, then most people, if they persever have an epiphany and wonder what it was that they didn’t understand in the first place.

Trouble is I think that only those who really went to town on SOA the first time around had that epiphany, but why did some go to town on it and others not.

What happened with SOA is that the waters were muddied with a large amount of esoteric and/or complex language, take a look at the SOA Reference Model for example, it has some very good information in there but it is an intimidating read and I can imagine that not many have read it. I would absolutely forgive anyone for not reading it, it took me about 5 quite determined attempts…

If you contrast this document to Fowler’s blog post then you can see which is more easily digestable. Fowler is (literally) an expert in the written communication of technical concepts, perhaps the best .

Vendor spin

As anyone who’s been in the IT industry for any amount of time will know for any given buzzword there are roughly a billion vendors trying to capitalise on it. This may be as simple as saying it is SOA enabled or microservice compatible but it can take more subtle forms too.

One of these approaches is to create a related standard or pattern. It wasn’t long after SOA started getting discussed as a concept that WS-* appeared on the scene with a seamingly endless parade of complex specifications and reference implementations of said specification. Each specification was dutifully implemented by the big vendors so that they could become ‘service-enabled’.

Unfortunately these specifications were awful, some absurd, moreover these vendor implementations didn’t work very well with each other, indeed some of them needed standards to defined the standards. That is a web-service exposed by one vendor couldn’t be reliable called by a client from another vendor. This left a bad taste for the right thinking engineer and there was violent push to the opposite end of the interface spectrum: ReST, JSON and so forth but that’s another story and one we all learned from.

As I eluded to early it wasn’t just specifications that clouded the SOA picture. In order to sell software licenses it was key to have something big, expensive and most importantly critical to the client once implemented. There is no greater example of this than the ESB. However I’ve written about why you don’t need an ESB before, needless to say this also left a bad taste.

For the record, to my mind, neither ESB or WS-* are desireable in a modern SOA. Unfortunately, sometimes WS-* is the only way to communicate with legacy applications. The industry did learn from all this, I hope.

Time fades

Finally, time fades, engineers like new things, I know I do. I like to play with new technology, think about and discuss new ideas. I’m also forgetful, I forget some of the cool things I once knew. As the expression goes, everything old is new.

Summary

For the most part there isn’t really anything new in microservices over SOA. SOA has some baggage of ESB and WS-*, that is some people confuse that with SOA, unfortunately vendors did that. For the record some of us gave them a hard time for that at the time.

Do we need a new term? I don’t think so, I think it would be more useful to tie down what we mean by SOA, clearly remove the bad things and highlight new thinking. However by renaming I think we revise history, we lose the discussion, the ‘changelog’.

My humble suggestion would be that it is more useful to highlight the difference, what’s new and give those things names, think of these changes as microbuzzwords.

Andy Hedges