The issue was not caused, directly or indirectly, by a cyber attack or malicious activity of any kind. Instead, it was triggered by a change to one of our database systems’ permissions which caused the database to output multiple entries into a “feature file” used by our Bot Management system. That feature file, in turn, doubled in size. The larger-than-expected feature file was then propagated to all the machines that make up our network.

The software running on these machines to route traffic across our network reads this feature file to keep our Bot Management system up to date with ever changing threats. The software had a limit on the size of the feature file that was below its doubled size. That caused the software to fail.
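
To illustrate the failure mode described above, here is a minimal sketch of a loader that treats an oversized feature file as a recoverable error and keeps the last known-good set instead of failing. Everything in it is an assumption made for illustration: the names `MAX_FEATURES`, `FeatureSet`, `LoadError`, `load_features`, and `refresh`, the limit value, and the one-feature-per-line file format are hypothetical and are not taken from the actual software.

```rust
/// Hypothetical cap on how many features the software has room for.
const MAX_FEATURES: usize = 200;

#[derive(Debug, Clone, PartialEq)]
pub struct FeatureSet {
    pub names: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    TooManyFeatures { found: usize, limit: usize },
}

/// Parse a feature file (assumed format: one feature name per line).
/// A file over the limit is reported as an error rather than causing a panic.
pub fn load_features(contents: &str) -> Result<FeatureSet, LoadError> {
    let names: Vec<String> = contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect();

    if names.len() > MAX_FEATURES {
        return Err(LoadError::TooManyFeatures {
            found: names.len(),
            limit: MAX_FEATURES,
        });
    }
    Ok(FeatureSet { names })
}

/// Try to apply a new file; if it is rejected, keep the previous set.
pub fn refresh(current: FeatureSet, contents: &str) -> FeatureSet {
    match load_features(contents) {
        Ok(new_set) => new_set,
        Err(err) => {
            eprintln!("rejecting feature file, keeping previous set: {err:?}");
            current
        }
    }
}
```

With this shape, a file that doubles in size makes `refresh` log the problem and return the set that was already in use, so the caller keeps serving traffic with stale but valid data.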

  • panda_abyss@lemmy.ca · ↑ 9 ↓ 35 · 7 hours ago

    Classic example of how dangerous rust is.

    If they had just used Python and run the whole thing in a try block with a bare except, this would never have been an issue.

    • Thallium_X@feddit.org · ↑ 4 · 3 hours ago

      As a next step they should have wrapped everything in a true(while) loop so it automatically restarts and the program never dies

    • Zwuzelmaus@feddit.org · ↑ 11 · 6 hours ago

      So you think there is no error handling possible in Rust?

      Wait until you find out that Python doesn’t write the error handling by itself either…

    • SinTan1729@programming.dev · ↑ 4 ↓ 1 · edited · 4 hours ago

      I hope you’re joking. If anything, Rust makes error handling easier by returning errors as values using the Result monad. As someone else pointed out, they literally used unwrap in their code, which basically means “panic if this ever returns an error”. You don’t do this unless it’s impossible to handle the error inside the program, or if panicking is the behavior you want, e.g. for security reasons.

      Even as an absolute amateur, whenever I post any Rust to the public, the first thing I do is get rid of unwrap as much as possible, unless I intentionally want the application to crash. Even then, I use expect instead of unwrap to have some logging. This is definitely the work of some underpaid intern.

      Also, Python is sloooowwww.
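
      To make the unwrap vs. expect point concrete, here is a small sketch of the options Rust gives you for a fallible call. The function names and the "feature limit" parsing scenario are made up for illustration; only `unwrap`, `expect`, `unwrap_or`, and `str::parse` are real standard-library calls.

      ```rust
      use std::num::ParseIntError;

      /// Panics with a generic message if `raw` is not a number.
      pub fn limit_unwrap(raw: &str) -> usize {
          raw.trim().parse::<usize>().unwrap()
      }

      /// Also panics, but the panic message says what was expected.
      pub fn limit_expect(raw: &str) -> usize {
          raw.trim()
              .parse::<usize>()
              .expect("feature limit must be a non-negative integer")
      }

      /// Hands the error back to the caller, who decides what to do with it.
      pub fn limit_checked(raw: &str) -> Result<usize, ParseIntError> {
          raw.trim().parse::<usize>()
      }

      /// Falls back to a default value instead of crashing.
      pub fn limit_or_default(raw: &str, default: usize) -> usize {
          limit_checked(raw).unwrap_or(default)
      }
      ```

      The first two bring the whole thread down on bad input; the last two keep the program running and leave the decision to the caller.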

    • dan@upvote.au · ↑ 8 · edited · 6 hours ago

      This can happen regardless of language.

      The actual issue is that they should be canarying changes. Push them to a small percentage of servers, and ensure nothing bad happens before pushing them more broadly. At my workplace, config changes are automatically tested on one server, then an entire rack, then an entire cluster, before fully rolling out. The rollout process watches the core logs for things like elevated HTTP 5xx errors.
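
      Roughly, the staged rollout described above could be sketched like this. It is only an illustration of the idea, not anyone's real deployment tooling: `Stage`, `STAGES`, `rollout`, and the `error_rate` callback (standing in for "deploy to this stage, then read the HTTP 5xx rate from the logs") are all hypothetical.

      ```rust
      /// Rollout stages, smallest blast radius first (hypothetical names).
      #[derive(Debug, Clone, Copy, PartialEq)]
      pub enum Stage {
          OneServer,
          Rack,
          Cluster,
          Fleet,
      }

      const STAGES: [Stage; 4] = [Stage::OneServer, Stage::Rack, Stage::Cluster, Stage::Fleet];

      /// Push a change stage by stage. `error_rate` stands in for deploying to a
      /// stage and then measuring its 5xx rate. The rollout halts at the first
      /// stage whose rate exceeds `max_error_rate` and returns that stage.
      pub fn rollout<F>(mut error_rate: F, max_error_rate: f64) -> Result<(), Stage>
      where
          F: FnMut(Stage) -> f64,
      {
          for &stage in STAGES.iter() {
              if error_rate(stage) > max_error_rate {
                  // Stop here; later stages never receive the change.
                  return Err(stage);
              }
          }
          Ok(())
      }
      ```

      A bad config that spikes errors on the first rack returns `Err(Stage::Rack)` and never reaches the cluster or the rest of the fleet.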

    • jimmy90@lemmy.world · ↑ 6 · 6 hours ago

      honestly this was a coding cock-up. there’s a code snippet in the article that unwraps on a Result which you don’t do unless you’re fine with that part of the code crashing

      i think they are turning linters back to max and rooting through all their rust code as we speak