I'm finding more and more on projects that old-fashioned admins are less useful, and that you need developers to script server installations and automate the infrastructure. It's a whole new area, and it's quite exciting.
Brooks called this role the "toolsmith"; the only difference is that now the toolsmith creates things that are used in the production environment and may be full-on products in their own right.
I'd have a minor quibble with that -- as someone who's done both application/product development and "dev ops": there's a difference between writing software that you yourself are the end user of and writing software for others to use/operate. That was the biggest challenge (initially) for me when I transitioned to "software engineering".
I've found that for infrastructure projects (especially open source ones, which may not always have dedicated QA resources available to them) it's a great idea to do "adversarial" testing/documentation: someone other than the developer must be able to follow the documentation and use the software in an environment other than the developer's desktop.
True, but just because it starts as a personal project doesn't mean it has to stay that way. For example, z/OS's TERSE (a compression utility) was once someone's personal project, but is now shipped with the operating system. I'm sure there are many such examples. That said, these personal projects are no longer treated as such; they have gone through testing and so on before being included in the system release.
That's true. Brooks describes this as crossing the "programming product" and "programming systems product" boundaries (I may have the terms slightly off; I don't have the book in front of me).
There's probably a range of software development:
0. "Script" (not necessarily something written in a scripting language; merely a short program written in an ad-hoc manner)
1. Program (could well be in a scripting language)
2. Project
3. Possibly a system (composed of multiple programs and/or projects)
4. Product
5. Company
I can feel the shift from ops to devops happening and can see it everywhere. More managed hosting solutions are coming out; you can upload your code to the cloud and it just works. While this is great for devs, it's a bit tough on us ops folks. While classic LAMP stacks won't be phased out anytime soon, we do have to change with the times, and that's by no means a bad thing. I find myself learning more and more Ruby as I want to shift into that devops category.
"Devops is on the rise primarily due to realization that there is a big gap between developing end-user systems and bare-bones systems administration"
That's the key takeaway here! I've held such a role in a ~10,000+ node environment. I've since moved on to development, but I've been trying to drive this point home to start-ups for years; apparently most still think that:
1) Operations means installing an OS on a machine and doing backups (and maybe occasionally checking how many threads are running in the application server).
2) Since we no longer have to admin bare iron, start-ups don't need operations.
The hiring is done accordingly: the idea is to contract out/outsource/reluctantly hire one or two "IT-type" admins whose goal is merely to respond (reactively) to outages as they happen and (manually) provision servers and networking gear.
Nothing could be more wrong. With the move to commodity and virtualized commodity hardware, running operating systems that were (originally) geared towards developers rather than sysadmins (e.g., Linux vs. Solaris or FreeBSD -- this isn't a stab at the latter two; it's a comment on how much more admin-friendly those OSes are), the assumption should be that unless there's "lights out"-style automation in place, outages are going to be the norm rather than the exception.
I'm a bit confused by your last sentence. Are you saying that the dev OSes will cause more outages, but we can automate things so that they won't be major?
The first thought that popped into my head while reading this article was how ZFS snapshots would work for versioning system deployments. Though I'm sure that's snapshots at a lower level than most people would actually use.
> I'm a bit confused by your last sentence. Are you saying that the dev OSes will cause more outages, but we can automate things so that they won't be major?
I'm saying that the "dev OSes" (really, I just meant Linux), if not automated, are difficult to manage. However, in a "worse is better" sense, that means there's a wealth of operations tools available for them (e.g., Puppet, Chef, bcfg2). Linux administrators are used to working with these tools and to developing their own.
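The core idea behind tools like Puppet/Chef/bcfg2 is idempotent "desired state" convergence: describe the end state, run the tool as often as you like, and it only changes what's out of line. Here's a minimal sketch of that property in Python -- `ensure_line` is a made-up name for illustration, not any real tool's API:

```python
import os

def ensure_line(path, line):
    """Converge on a desired state: `line` must appear in the file at `path`.
    Returns True if a change was made, False if already in the desired state."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line in existing:
        return False  # already converged; running again is a no-op
    with open(path, "a") as f:
        f.write(line + "\n")
    return True  # state was changed to converge

# Running this twice converges once and then does nothing -- that no-op
# property is what makes repeated, unattended runs safe.
```

Real config-management resources (packages, services, users) follow the same check-then-change pattern, just against richer system state.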
At present-day web scale (i.e., no longer a few Ultra-2s for the front end and an E10K for the database), if there's no automation, outages are going to happen as a result of routine failures (e.g., the database master -- or Hadoop namenode -- goes down and there's no automation to fail it over).
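The "lights out" failover I'm describing can be reduced to a very small loop: probe the master, count consecutive failures, and promote a standby once a threshold is crossed. A toy sketch (the `is_healthy`/`promote` callables are stand-ins; a real setup would probe the actual service and repoint a VIP or reconfigure replicas):

```python
def failover_tick(is_healthy, failures, threshold, promote):
    """One monitoring tick. Returns the updated consecutive-failure count;
    triggers `promote` when `threshold` consecutive failures are seen."""
    if is_healthy():
        return 0  # any success resets the counter
    failures += 1
    if failures >= threshold:
        promote()  # e.g., promote a replica, repoint the VIP
        return 0   # reset after acting
    return failures
```

The threshold guards against flapping on a single dropped health check; real systems add jitter, quorum, and fencing on top of this skeleton.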
ZFS snapshots are awesome (I've had a chance to deploy a project to a Solaris environment with ZFS -- and snapshots meant far fewer worries about data loss). Unfortunately, the only real choices for ZFS are FreeBSD and Solaris (I wouldn't try ZFS-FUSE). However, FreeBSD provides no easy way to do many things that are automatic in Linux (e.g., there's no fully unattended installation, although that's less of a problem with virtualization). Solaris has issues of its own, and unfortunately, at present, the biggest issue isn't even technical.
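To make the deployment-versioning idea concrete: a deploy script can take a `zfs snapshot` before touching anything and `zfs rollback` if the deploy goes sour. A hedged sketch that just builds the command lines (the dataset name and labeling scheme are assumptions for illustration; running them requires root on a ZFS host):

```python
import datetime

def snapshot_cmd(dataset, label=None):
    """Build the `zfs snapshot` argv for a pre-deploy checkpoint."""
    if label is None:
        # assumed naming scheme: deploy-YYYYmmdd-HHMMSS
        label = datetime.datetime.now().strftime("deploy-%Y%m%d-%H%M%S")
    return ["zfs", "snapshot", "{0}@{1}".format(dataset, label)]

def rollback_cmd(dataset, label):
    """Build the matching `zfs rollback` argv to undo a bad deploy."""
    return ["zfs", "rollback", "{0}@{1}".format(dataset, label)]

# To actually run one (root on a ZFS host):
#   subprocess.check_call(snapshot_cmd("tank/www"))
```

Because snapshots are copy-on-write, the checkpoint is near-instant and nearly free until the deploy actually diverges from it.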