I feel a bit of a rant coming on. This is about software developer mindsets, but it's also about "processes", which can be a bit of a scary term to many developers.

I got quite lucky in my career, in two particular ways.

First, in my first job I struck up a pretty good relationship with our CTO. He and I had quite a few conversations about the why and wherefore of what we were doing, from a business rather than a development perspective. That has left me with an interest in the big picture.

Second, I got sucked pretty much from the get-go into ops. We called it system administration then.

For $reasons, I used to be one of the few developers who got root access to production machines in a bunch of companies.

This has helped me get a fairly good understanding of what operations require from developers. It's quite different from what developers require from each other.

In some ways, it's the exact opposite.

The way these two things fit together is through the nebulous concept of "processes". Most people seem to immediately associate the term with some ISO standard and certifications, followed by endless amounts of documentation, and conclude that processes must stand in the way of developers doing what they should be doing, which is programming.

I used to be like that.

So what does operations require from developers?

Developers often think of that in terms of "make things easy for ops", such as giving them a single binary to run instead of having them wade through configuration files. And to a degree, that's a fair enough goal to have for devs.

But the truth is that ops people tend to have some coding ability, and there are a ton of configuration management tools out there and in active use.

Complex setups that can be scripted aren't all that complex.

In fact, what operations require *more* from software is adaptability. The way devs tend to run software is usually *not* the way ops need to run it. Simplicity can be a bit of an own goal.

Or, to put it differently, while ops are end-users, they're not to be confused with grandpa trying to figure out this new-fangled electronic mail thingimajig.

What ops require above anything else from devs is boring.

No, really, that's it. Software should be boring.

That is, it should be predictable.

Configuration file formats changing, interfaces changing, setup procedures changing, requirements changing - all of these things shouldn't happen.

Of course they will, and they will have to. That's where scripting helps ops out again. But that means that any change in what ops can reasonably expect must be documented and the documentation communicated in a manner that ops can deal with.

You cannot communicate this the day before a new release should go live.

Which brings me back to processes.

A process is nothing more than putting this communication in place.

The way this manifests is most typically through checklists. Follow steps a-z in your release from dev to ops.

But what these checklists *encapsulate* is the organisation's understanding of the needs of ops from devs (and vice versa for e.g. bug reporting procedures, etc.)

QA is usually in the middle of this process, adding the required check mark on one of the boxes on the list.

Every organisation has a process for this. Every single one of them. Even if it's "oh I just upload that JS file to the production server", that's a process.

It's a shitty one, but it is one.

What is often missing are the checklists. And some structured discussions around how to get to checklists. All of which is shorthand for communication.

The discussions tend to lead to more or less the same place: you either need a gate, or you need recovery.

A gate is a kind of barrier in the process that is somewhat cumbersome to pass, but if you have the right key, passing isn't all that hard.

QA is such a gate. If you cannot pass QA's requirements - which usually take the form of regression tests and new feature tests - then you cannot go to production.

Recovery refers to the set of procedures you have for rolling back to the last known state in case of a production failure.

Recovery tends to be good in orgs with weak gates and vice versa.

I'm a bit traditionally minded here, and personally prefer my gates to be strong. The industry as a whole tends to prefer strong recovery. There is almost always a mixture of the two employed.
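To make the gate/recovery distinction a bit more concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: `ChecklistItem`, `Release`, `deploy` and the `smoke_test` callback are hypothetical names, not taken from any real tool. The checklist acts as the gate, and falling back to the last known good version is the recovery.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChecklistItem:
    """One box on the release checklist; `check` returns True when ticked."""
    description: str
    check: Callable[[], bool]


@dataclass
class Release:
    """A release candidate plus the checklist it has to pass (hypothetical)."""
    version: str
    checklist: List[ChecklistItem] = field(default_factory=list)

    def failed_items(self) -> List[str]:
        """Return the description of every item whose check did not pass."""
        return [item.description for item in self.checklist if not item.check()]


def deploy(release: Release, current_version: str,
           smoke_test: Callable[[str], bool]) -> str:
    """Return the version that is live after the deployment attempt."""
    # Gate: refuse to go to production while any box is unticked.
    failed = release.failed_items()
    if failed:
        raise RuntimeError("gate closed: " + "; ".join(failed))
    # Recovery: if the new version fails in production,
    # fall back to the last known good version.
    if not smoke_test(release.version):
        return current_version
    return release.version
```

An org with strong gates would put most of its effort into the checklist items; one with strong recovery would put it into the smoke test and the rollback path. The sketch only shows that both can live in the same release step.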

In the end, it barely matters. What matters is that the business goals are met reliably, and the delivery team is not unreasonably stressed out by their combined efforts.

What the whole thing requires regardless of the balance struck here is that developers take some ownership.

This isn't to say that developers "own" that everything goes smoothly. That would be an exaggeration.

But they cannot reasonably have the mindset that throwing code "over the wall" is their job. Their job is to provide to the next down the line - QA, ops, both - exactly what those parties require to do *their* respective jobs.

"Worked on my computer" is a symptom of this being broken, same as "no errors in the CI" or whatever form it takes.

I sometimes exaggerate this and say "your job is only done when the end user - aka grandpa - successfully uses it to solve their problems". This is more for illustration purposes than as a real assignment of responsibilities. Exaggeration is a bit double-edged; people can take it too seriously. But often enough it carries a general point across fairly well.

FOSS developers have a particularly hard time here. They are not part of an overall organization that provides the kind of feedback they need to optimize their output.

What I mean is, even if FOSS developers work towards their company-internal goals and just publish software, the "organization" never includes all the users they actually end up delivering to, and communication with those users often relies on the users initiating it.

I don't know who uses my FOSS software, but they sure do.

As a FOSS developer, you have a few methods at your disposal to help you along here.

One is *prolific* documentation. But it doesn't just have to cover a lot of ground, it has to have entry points tailored to needs you may reasonably expect. The ops perspective is one such entry point.

Another is to double down on predictability. Infamously, Debian lags behind on up-to-date software and has slow "stable" release cycles. But it does just that, and it has generally earned them love from ops.

A third method is actually quite difficult to balance, but is still crucial: you have to bring the barriers to contribution way, way down.

Where this is often difficult is in bug reporting software. I see a lot of popular software using bots to classify bug reports according to developer needs, sometimes closing them automatically when the bot deems the report to be a duplicate or whatever.

There is limited developer bandwidth for dealing with reports, so they need screening of this kind.

Unfortunately, there is a negative result here for would-be contributors.

If would-be contributors run into such bots, or excessive formalism in the report templates, or contributor agreements they need to sign before sending a patch, etc., the most likely result is that they end up contributing less overall.

I can deal with all of that if it solves a problem I can't circumvent. But if it's easier to switch software than to contribute, well...

Lowering barriers may be better long-term.

If nothing else it signals "yes I want your contribution" even if one cannot deal with it immediately, as opposed to sending the signal that contribution is only appreciated conditionally.

This feeds into community management.

I believe - I have no data, sorry - that FOSS projects benefit strongly from human community managers that take care of this screening. And community management can in itself be a form of contribution.

See I started this thread on developers and ops interfacing, but...

... this goes right back to that beginning: FOSS developers that do not visibly invest effort into interfacing with the "next" down the chain, which in this case are random people picking your software up, do not do their job.

They throw code over the wall. They say "it works on my computer", or "in my use-case".

I sometimes hear - heard the other day - that it's the developer's spare time, so one can't make such demands on them.

There's some truth there, absolutely.

So let's say I volunteer for the local fire brigade. We have a lot of volunteer fire response in Germany in smaller towns, which is supplemented by professionals from neighbouring larger towns. Response time often trumps equipment and experience.

Let's say I volunteer, but I don't respond to alarms. Or I do, but I always turn up late.

On the one hand, I'm volunteering my time, so one cannot make demands of me, right?

But the kicker is, I am not actually volunteering for turning up in a red suit with a fire hose. That may be what it looks like, sure.

What I'm volunteering for is a job, one that comprises both rights and responsibilities. If I ignore the responsibilities, people will rightly be upset.

So it is with the role of FOSS developer. Nobody is asking for your coding time. Everyone is asking for your ability to solve user needs. Ignoring the user needs is not doing your *voluntary* job.

In practice, there will always be reasons for not always being able to meet responsibilities, and users should be adequately accepting of those. Practice is always full of complications, on a case-by-case basis.

It's the mindset that matters.


@jens in my previous job, I had more or less blazed the trail toward a form of devops, and was responsible for maintaining the build and deployment system and the packaging for customer deployments, and for educating the product devs on them.

Needless to say I had more than a few words with that one team leader who privately prided himself on not abiding by the process, and on being a rockstar "break stuff" type of diva dev (he was also ingraining that mentality into his team mates), including that time where I had to educate him publicly on the all-devs mailing list on how to properly use the version control system so as not to hamper other teams' progress.

He also had his mind set on not reconciling conflicts regularly with his branch, and at one time was several *months* behind the main branch, which was of course against the agreed practices; I got vindicated when it took his team *two full weeks* to painfully merge their feature branch into the mainline, which I had (easily) predicted.

Fond memories...
