• FigMcLargeHuge@sh.itjust.works
    English
    arrow-up
    46
    ·
    edit-2
    13 hours ago

    A guy in our data center couldn’t figure out who owned a particular machine that he needed to work on. So his solution to figure it out was to let them come to him. He went and pulled out the network cable and waited. He was escorted out a little while later. The moral of the story is don’t go disabling production machines on purpose.

    • Hugin@lemmy.world
      arrow-up
      1
      ·
      35 minutes ago

      Where I worked we had a very important time sensitive project. The server had to do a lot of calculations on a terrain dataset that covered the entire planet.

      The server had a huge amount of RAM and each calculation block took about a week. It could not be saved until the end of the calculation and only that server had the RAM to do the work. So if it went down we could lose almost a week's work.

      Project was due in 6 months and calculation time was estimated to be about 5 1/2 months. So we couldn’t afford any interruptions.

      We had bought a huge UPS meant for a whole server rack. For this one server. It could keep the server up for three days. That way even if we lost power over the weekend it would keep going and we would have time to buy a generator.

      One Friday afternoon the building loses power and I go check on the server room. Sure enough the big UPS with a sign saying "only for project xyz" has a bunch of other servers plugged into it.

      I quickly unplug all but ours. I tell my boss and we go home at 5. Later that day the power comes back on.

      On Monday there are a ton of departments bitching that they came in and their servers were unplugged. Lots of people wanted me fired. My boss backed me and nothing happened but it was stressful.

    • superkret@feddit.org
      arrow-up
      17
      ·
      7 hours ago

      Yeah, I’ve done that before – after asking literally everyone in IT, plus our external consultants, and getting the go-ahead from my team lead and the head of IT.

    • ramble81@lemm.ee
      arrow-up
      41
      ·
      12 hours ago

      Honestly we do that when we ask and no one speaks up. Lovingly called the “scream test” as we wait to see who screams.

          • superkret@feddit.org
            arrow-up
            2
            ·
            1 hour ago

            I don’t understand how that is even possible.
            Are there no logs? No documentation? Does everyone share an admin user with full rights?
            I mean, there has to be a way to find out who accessed the machine last time.

            • ramble81@lemm.ee
              arrow-up
              3
              ·
              edit-2
              1 hour ago

              You’d be surprised with inheriting tech debt. Quite often there’s no documentation, the last person to log in to the system is an admin that quit 3 years ago, but it doesn’t much matter because that’s only for a direct console login which normal users don’t do when accessing the application. With tribal knowledge gone and no documentation, only when you pull the network for a bit do you discover that there was this one random script running on it that was responsible for loading up all the needed data in the current system, even though 9 times out of 10 those scripts turn out to be no longer needed.

              In a perfect world you’d have documentation, architecture and data flow diagrams for everything, but “ain’t nobody got time for that” and it doesn’t happen.

              • superkret@feddit.org
                arrow-up
                1
                ·
                55 minutes ago

                Had that the other way around recently. A docker container failed to come back up after I had updated the host OS.
                Was about ready to restore the snapshot, when I looked further back in the logs on a hunch.
                Turns out that container hadn’t worked before the update either. The software’s developer is long gone, and no one could tell me what it was supposedly doing.

      • FigMcLargeHuge@sh.itjust.works
        English
        arrow-up
        27
        arrow-down
        1
        ·
        12 hours ago

        I guess it depends on where you work. This was a large datacenter for a very large health insurance company. They made it a point later that day to remind people that it was a fireable offense to mess with production machines like that on purpose. And evidently the service he disabled was critical enough that it didn’t take long for the hammer to come down. There were plenty of ways to find out who owned the machine; he just chose the easiest and got fired on the spot for it.