I can also break servers.

I was migrating an application to our servers. I logged into one of the application’s cluster nodes, an old Sun server, looked around, and decided to transfer all files to our server. Rsync didn’t work: some kind of protocol error. What’s the next best choice? Tar, I thought!

Stressed by several phone calls, I simply began to tar the document root into the /tmp filesystem. Then I turned to something else.

Hours later, I remembered my tar job and switched back to that terminal. I saw an error: “No space left on device”. After that, the connection had timed out. Well, I thought, no problem, because after all it had been idle for about four hours.

Until my attempt to reconnect displayed a “connection refused”.

Whoops. Did I kill that machine? No, can’t be.

But then I found out: yes, it was me.

The /tmp on this old Solaris installation is actually in memory. I was attempting to tar up about 16GB of data. This filled up the RAM of the machine and caused it to randomly kill processes – among others the ssh daemon.
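
In hindsight, a quick size check before starting the archive would probably have caught this. What follows is only a minimal sketch in Python, not something that was run on that machine. The names `tree_size`, `fits` and `safe_to_archive` and the 10% reserve are made up for illustration. It also assumes that the free space reported for a memory-backed /tmp reflects the memory and swap actually available, which is worth verifying on the system in question.

```python
import os
import shutil


def tree_size(root):
    """Sum the sizes of all regular files below root, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link is not counted with its target's size.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits(needed, free, reserve=0.10):
    """True if `needed` bytes fit into `free` bytes while keeping
    a fraction `reserve` of the free space untouched."""
    return needed <= free * (1.0 - reserve)


def safe_to_archive(source, target_dir, reserve=0.10):
    """Compare the size of `source` with the free space in `target_dir`.

    An uncompressed tar archive is slightly larger than the sum of the
    file sizes, which is one reason for the reserve (an assumed value).
    """
    free = shutil.disk_usage(target_dir).free
    return fits(tree_size(source), free, reserve)
```

Calling `safe_to_archive("/path/to/docroot", "/tmp")` (the path is a placeholder) before starting tar would return `False` whenever the document root is larger than what the target filesystem reports as free, minus the reserve.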

Luckily, it was the passive cluster node.

Still, in my defense… that really shouldn’t happen. :-)
