• renegadespork@lemmy.jelliefrontier.net
    8 days ago

    Your confusion is understandable since MS has called like 4 different products “Copilot”. This refers to the coding assistant built into GitHub for everything from CI/CD to coding itself.

    All code uploaded to GitHub is subject to being scraped by Copilot to both train and provide inference context to its model(s).

    Basically, having your code on GitHub is implicit consent to have it fed to MS's LLMs.

    • The Octonaut@mander.xyz
      8 days ago

      No, it isn’t.

      “Basically” your vibes aren’t an actual answer. Businesses are not forking over millions to give away their code.

      You can have conspiracy theories about it using the code anyway (I’m particularly confused about your use of the word “scrape”, which tells me you don’t know how AI training works, how hosting a website works, or how scraping works - maybe all three?), but surreptitiously using its competitors’ code to train Copilot would be a rare existential threat to Microsoft itself.

      Does GitHub use Copilot Business or Enterprise data to train GitHub’s model?

      No. GitHub does not use either Copilot Business or Enterprise data to train its models.

      https://github.com/features/copilot#faq