• Skull giver@popplesburger.hilciferous.nl
      English
      arrow-up
      26
      ·
      edit-2
      15 days ago

      ActivityPub doesn’t do DMs per se. Many ActivityPub implementations will use AP messages that are not posted on any public list or timeline. Basically, a Tweet with visibility set to “only people mentioned in this thread”.

      This design makes it quite easy for AP servers to misimplement DMs. When a server is asked for all messages of a particular user (to build their timeline), it is easy to forget to filter out the messages that were not published publicly.
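      To make that filtering step concrete, here is a minimal sketch of the check a server has to remember. It assumes messages are plain dicts with ActivityStreams-style `to`/`cc` fields; the helper names and sample data are hypothetical, not taken from any real server. It only checks the full public identifier and ignores compacted forms of it.

```python
# Special ActivityStreams collection that marks an object as public.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def audience(obj):
    """Return every addressee listed in the object's `to` and `cc` fields."""
    recipients = set()
    for field in ("to", "cc"):
        value = obj.get(field) or []
        if isinstance(value, str):  # a single addressee may be a bare string
            value = [value]
        recipients.update(value)
    return recipients


def is_public(obj):
    """True only if the object is explicitly addressed to the public collection."""
    return PUBLIC in audience(obj)


def public_timeline(objects):
    """Keep publicly addressed objects; DMs and followers-only posts are dropped."""
    return [obj for obj in objects if is_public(obj)]


# Hypothetical outbox: one public post, one "DM", one public post via cc.
outbox = [
    {"id": "note-1", "to": [PUBLIC]},
    {"id": "note-2", "to": ["https://example.social/users/bob"]},
    {"id": "note-3", "to": "https://example.social/users/alice/followers", "cc": PUBLIC},
]

print([note["id"] for note in public_timeline(outbox)])
```

      A server that returns `outbox` directly instead of `public_timeline(outbox)` leaks `note-2`, which is the kind of mistake described above.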

      ActivityPub DMs are, in my opinion, not a good feature. This has come up before in Mastodon, where a DM mentioning a third account adds that account to the thread and to the recipients of all future messages (and possibly authorises it to access past messages); one mention gives them full access to your “direct” messages.
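      A toy model of that mention behaviour, purely as an illustration: `reply_recipients` is a made-up helper that mimics "reply to everyone in the thread, plus anyone newly mentioned". It is not Mastodon's actual code, and the handles are placeholders.

```python
def reply_recipients(previous_participants, author, mentions):
    """Addressees of a reply: everyone already in the thread plus new mentions."""
    recipients = set(previous_participants) | set(mentions)
    recipients.discard(author)  # the author does not address themselves
    return recipients


alice, bob, carol = "@alice@a.example", "@bob@b.example", "@carol@c.example"

# Alice DMs Bob, so the thread has two participants.
participants = {alice, bob}

# Bob replies and mentions Carol once.
second = reply_recipients(participants, author=bob, mentions={carol})
participants |= second

# Alice replies without mentioning anyone, yet Carol is still addressed.
third = reply_recipients(participants, author=alice, mentions=set())

print(sorted(third))
```

      In this model a single mention is enough to keep the third account on every later message in the thread.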

      I doubt this scraper did anything wrong here. I think it’s just a matter of a buggy server, or of users sending DMs that aren’t really DMs because of Fediverse software with GUI design flaws.

      Edit: looks like it’s probably a Mastodon bug: https://hackers.town/@thegibson/112604700601089641

      • jherazob@beehaw.org
        English
        arrow-up
        6
        ·
        15 days ago

        I recall somebody’s working on actual E2EE Mastodon DMs, but I couldn’t give you details. I guess we’ll know it’s ready when people start using it.

      • 4am@lemm.ee
        arrow-up
        3
        ·
        15 days ago

        Seems if the messages are sent in an inherently insecure fashion, all one would need to do is set up an instance that purposefully does not filter out all the things it’s supposed to be kind/competent enough to filter out, and boom it has everything.

        • Yes, just like on Twitter, Reddit, and most of the other platforms the Fediverse is trying to replace, server admins are free to read your messages. There’s no encryption. The Fediverse just adds more server admins to the mix.

          I would not recommend using the DM function on most Fediverse platforms for things you’d like to keep private. While in most cases there are no privacy risks, there are also very few guardrails to ensure that.

          You’re better off using a federated platform with encryption support like Matrix or XMPP. Neither of those is very safe if you don’t verify the other party’s keys (nor is any other chat service, even Signal), but both are much safer than Fediverse DMs.

          If it weren’t for the lack of shared credentials, I would’ve expected someone to have added a minimal secure chat client to the Lemmy frontend already, especially on the servers that already host a Matrix server.

        • kevincox@lemmy.ml
          arrow-up
          1
          ·
          15 days ago

          It’s not “inherently insecure”, at least not to that degree. (One could argue that lack of E2EE is insecure.) If you stand up an unrelated instance you shouldn’t be able to access private messages that don’t relate to an account on your instance. So only bugs in your instance, or your conversation partner’s instance, will be able to leak those messages.
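          A rough sketch of why an unrelated instance never sees the message: delivery goes only to the servers of the addressed accounts. This assumes each actor's inbox lives on the same host as its ID and skips followers collections and shared inboxes; `delivery_hosts` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse

# Public collection identifier; it is an audience marker, not a delivery target.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def delivery_hosts(obj):
    """Hosts that receive a copy of the object, derived from its `to`/`cc` fields."""
    hosts = set()
    for field in ("to", "cc"):
        value = obj.get(field) or []
        if isinstance(value, str):  # a single addressee may be a bare string
            value = [value]
        for recipient in value:
            if recipient != PUBLIC:
                hosts.add(urlparse(recipient).hostname)
    return hosts


# Hypothetical DM from an account on a.example to accounts on two other servers.
dm = {
    "id": "https://a.example/notes/1",
    "to": ["https://b.example/users/bob", "https://c.example/users/carol"],
}

print(sorted(delivery_hosts(dm)))
```

          A freshly created `unrelated.example` is not in that set, so it only ever gets the message if one of the involved servers hands it out by mistake.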

  • IllNess@infosec.pub
    arrow-up
    16
    ·
    15 days ago

    If we hit these AI companies with targeted lawsuits, like how Scientology got their way with the IRS, maybe then they will listen and stop stealing our shit.

    The MPAA and RIAA have created all these laws and used our own government against us. Maybe we can use these same laws and do the same.

    • Pekka@feddit.nl
      arrow-up
      5
      ·
      14 days ago

      Maybe we have some bias on this topic, but I had the same thought. Maven is such a well-known tool in IT that I’m surprised they just created a social network with the same name. Until they get a bit famous this won’t be good for SEO.

  • darkphotonstudio@beehaw.org
    arrow-up
    3
    ·
    14 days ago

    I wouldn’t have a problem with all this scraping if these companies had to release the models trained on this data as open source.

    • esaru@beehaw.org
      arrow-up
      2
      ·
      14 days ago

      That’s a great idea. Can we not apply a license to that social content that forces AI models trained on it to be open source?