cross-posted from: https://lemy.lol/post/778691

While I want to block content bots, I don’t want to block useful bots like @[email protected] @[email protected]. Because of this, I will block them one by one. I am sharing it here for community benefit. Any addition/removal is welcome.

gist for programmatic use: https://gist.github.com/ismailkarsli/0c6c7aa4f70d1905adea1b30271f16f7

  • marsara9@lemmy.world
    1 year ago

    …I wonder if there’s a programmatic way to detect these bots? Some sort of analysis on their posting behavior?

    If they’re playing nice, they’ll have the bot flag checked in their profile. Then maybe we could build a list of any flagged bot that creates posts, since most of the “good” bots just reply to comments? Anyway, just thinking out loud, but I could easily add a public API to my search engine that returns a list of “posting bots”…
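
A rough sketch of that filter, assuming account records that already carry the bot flag and post/comment counts. The `Account` shape, the handles, and `posting_bots` are all hypothetical names for illustration, not anything from Lemmy or the search engine mentioned above:

```python
from dataclasses import dataclass


@dataclass
class Account:
    handle: str         # hypothetical record shape, e.g. "name@instance"
    is_bot: bool        # the bot flag from the profile
    post_count: int     # top-level posts the account has created
    comment_count: int  # comments/replies the account has written


def posting_bots(accounts):
    """Return handles of flagged bots that create posts at all."""
    return [a.handle for a in accounts if a.is_bot and a.post_count > 0]


# Made-up sample data: one feed bot, one reply-only helper bot, one human.
accounts = [
    Account("news_feed@example.social", True, 250, 0),
    Account("link_fixer@example.social", True, 0, 900),
    Account("human@example.social", False, 40, 300),
]
print(posting_bots(accounts))
```

A reply-only helper bot is skipped because it has no posts, and unflagged accounts are never listed, which is exactly where this idea stops working for bots that don't set the flag.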

    • iso@lemy.lol (OP)
      1 year ago

      They don’t all behave the same way. Maybe we can treat accounts that are marked as bots and have shared around 10-100 posts as content bots.

      • marsara9@lemmy.world
        1 year ago

        Maybe. A second idea: count a post as unanswered if no one has replied after, say, 24 hours. If something like 75-80% of your posts are unanswered and you have at least 100 such posts, you get added to the list?

        The main concern I see with something like this is false positives, where someone real ends up getting blocked.

        I definitely want to think on this some more but it might have some legs.
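
The 24-hour / 75-80% / 100-post rule could be sketched roughly like this. The `Post` record and the function name are hypothetical, and the choice to only count posts older than the grace period in the ratio is an assumption, since a fresh post hasn't had its 24 hours yet:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    created: datetime
    reply_count: int


def looks_like_repost_bot(posts, now, grace=timedelta(hours=24),
                          min_unanswered=100, min_ratio=0.75):
    """Apply the 'mostly unanswered posts' heuristic to one account."""
    # Assumption: posts younger than the grace period are not judged yet.
    settled = [p for p in posts if now - p.created >= grace]
    unanswered = sum(1 for p in settled if p.reply_count == 0)
    if not settled or unanswered < min_unanswered:
        return False
    return unanswered / len(settled) >= min_ratio


now = datetime(2024, 1, 10, 12, 0)
old = now - timedelta(days=2)

# 120 unanswered out of 150 settled posts: ratio 0.8, above the threshold.
feed = [Post(old, 0)] * 120 + [Post(old, 3)] * 30
# 120 unanswered out of 200 settled posts: ratio 0.6, below the threshold.
quiet_human = [Post(old, 0)] * 120 + [Post(old, 3)] * 80

print(looks_like_repost_bot(feed, now))
print(looks_like_repost_bot(quiet_human, now))
```

The `quiet_human` case shows the false-positive worry in miniature: a real person with many ignored posts only escapes because of the ratio threshold, so the numbers would need tuning against real data.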

        • iso@lemy.lol (OP)
          1 year ago

          I think

          • flagged as bot
          • doesn’t respond within n hours
          • has n posts in the last n hours or overall

          are sufficient to determine that a user is a content aggregator bot. The bot flag is an important indicator here. The biggest false positive would be banning a multi-purpose bot that also has a content aggregation feature.
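
The three criteria above could be combined along these lines. This is only a sketch: the `Activity` record, the function name, and every threshold are made up for illustration, and "doesn't respond" is read here as "the account has written no comment of its own for a while", which is one possible interpretation:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Activity:
    is_bot: bool                      # bot flag from the profile
    last_comment: Optional[datetime]  # None if the account never commented
    post_times: List[datetime]        # creation times of its posts


def is_aggregator_bot(a, now, silent_for=timedelta(hours=48),
                      window=timedelta(hours=24), min_posts=20):
    """All three criteria must hold: flagged, silent, and posting a lot."""
    if not a.is_bot:
        return False
    # Assumed reading of "doesn't respond": no comment in `silent_for`.
    silent = a.last_comment is None or now - a.last_comment >= silent_for
    recent_posts = sum(1 for t in a.post_times if now - t <= window)
    return silent and recent_posts >= min_posts


now = datetime(2024, 1, 10, 12, 0)
# 30 posts spread over the last 15 hours.
burst = [now - timedelta(minutes=30 * i) for i in range(30)]

feed_bot = Activity(True, None, burst)
multi_purpose = Activity(True, now - timedelta(hours=1), burst)
human = Activity(False, None, burst)

print(is_aggregator_bot(feed_bot, now))
print(is_aggregator_bot(multi_purpose, now))
print(is_aggregator_bot(human, now))
```

The `multi_purpose` case is the false positive mentioned above: it posts just as much as the feed bot and is only spared here because it also comments, so a bot that aggregates and rarely replies would still be caught.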

    • 𝘋𝘪𝘳𝘬@lemmy.ml
      1 year ago

      I wonder if there’s a programmatic way to detect these bots?

      Technically there is. Bot accounts can be marked as bot accounts, and you can decide whether to show bot accounts. There are buttons for this in the settings.

      But if bots are not marked as bots, there is no way.

      • marsara9@lemmy.world
        1 year ago

        The problem is telling repost bots apart from bots that are helpful, like automod and link redirectors.