• 0 Posts
  • 1.73K Comments
Joined 2 years ago
Cake day: July 26th, 2023

  • Echo Dot@feddit.uk to Technology@lemmy.world · Cloudfare outage post mortem
    English · 6 up, 9 down · 4 days ago

    There are technical solutions to this. You update half your servers, and if they die you just disconnect them from the network while you fix them and have your unaffected servers take up the load. Yes, this doesn’t get a fix out quickly, but if your update kills your entire system, you’re not going to get the fix out quickly anyway.
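
For anyone who wants that spelled out, here is a rough sketch of the "update half, pull the broken ones, let the rest carry the load" idea. The function name and the `apply_update` / `is_healthy` callbacks are hypothetical placeholders, not any real deployment tool's API, and a real setup would also have to check that the remaining servers can actually handle the traffic.

```python
from typing import Callable, Iterable


def update_half_then_rest(
    servers: Iterable[str],
    apply_update: Callable[[str], None],  # hypothetical: installs the update on one server
    is_healthy: Callable[[str], bool],    # hypothetical: health check for one server
) -> dict:
    """Update the first half of the fleet; only continue if that half stays healthy."""
    fleet = list(servers)
    half = len(fleet) // 2
    first_half, second_half = fleet[:half], fleet[half:]

    for server in first_half:
        apply_update(server)

    broken = [s for s in first_half if not is_healthy(s)]
    if broken:
        # Disconnect the broken servers; everything else keeps serving traffic.
        in_service = [s for s in fleet if s not in broken]
        return {"completed": False, "pulled": broken, "in_service": in_service}

    # The first half survived, so the update goes out to the remaining servers.
    for server in second_half:
        apply_update(server)
    broken = [s for s in second_half if not is_healthy(s)]
    in_service = [s for s in fleet if s not in broken]
    return {"completed": not broken, "pulled": broken, "in_service": in_service}
```

The point of the design is that the untouched half is never at risk until the updated half has passed its health check.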



  • Echo Dot@feddit.uk to Technology@lemmy.world · Cloudfare outage post mortem
    English · 69 up, 9 down · 4 days ago

    So I work in the IT department of a pretty large company. One of the things we do on a regular basis is staged updates: we take a small number of computers and update the software on them to the latest version. Then we leave it for about a week, and if the world doesn’t end we roll the update out to the next group, then the next, and so on until everything is upgraded. We don’t just slap it onto production infrastructure and then go to the pub.

    But apparently our standards are slightly higher than those of an international organisation whose whole purpose is cyber security.
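
A minimal sketch of the staged process described above, in case it helps. Everything here is an assumption for illustration: `staged_rollout`, `apply_update`, `is_healthy`, and `soak` are made-up names, and `soak` stands in for the "leave it for about a week" step (in practice a scheduler or a human, not a function call).

```python
from typing import Callable, Optional, Sequence


def staged_rollout(
    waves: Sequence[Sequence[str]],
    apply_update: Callable[[str], None],  # hypothetical: installs the update on one machine
    is_healthy: Callable[[str], bool],    # hypothetical: "did the world end?" check
    soak: Callable[[], None] = lambda: None,  # hypothetical: wait out the observation period
) -> dict:
    """Update one group at a time, halting as soon as any updated machine is unhealthy."""
    updated: list = []
    halted_at: Optional[int] = None

    for index, wave in enumerate(waves):
        for machine in wave:
            apply_update(machine)
            updated.append(machine)

        soak()  # observation period before touching the next group

        # Re-check everything updated so far, not just the latest group.
        if not all(is_healthy(machine) for machine in updated):
            halted_at = index
            break

    return {"halted_at_wave": halted_at, "updated": updated}
```

Starting with a small first group keeps the blast radius small: a bad update stops at the group where it shows up and never reaches the later ones.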