Meta Used Almost 82TB in Pirated Books to Train AI

TheDirector


Meta is currently facing a lawsuit in which it is accused of using pirated content to train its Artificial Intelligence (AI) model, Llama. An account on the social network X has shared court documents that provide more details about these practices.


According to the information shared by the @vxunderground account (below), Meta reportedly used the equivalent of 81.7TB of pirated book data to train Llama, with the books taken from platforms such as Anna's Archive, Z-Library and LibGen.


The documents also show internal communications between researchers and concerns about adopting these practices to train AI models. “I don't think we should use pirated material [to train AI]. We have to draw a line,” said one researcher. “[Downloading] torrents from a corporate computer doesn't seem right,” noted another Meta researcher.


As Tom's Hardware reports, the tech giant led by Mark Zuckerberg also sought to ensure that these practices could not be traced back to the company. Such concealment could be used in court as evidence that the use of pirated material was intentional and deliberate on Meta's part.


Unsealed court documents from February 5th, 2024, in Kadrey v. Meta show Meta (formerly Facebook) illegally torrented 81.7TB of data from "shadow libraries" such as Anna's Archive, Z-Library, and LibGen to train Meta artificial intelligence.

Highlights include:
- A senior AI… pic.twitter.com/Bqf60Hhbb6

— vx-underground (@vxunderground) February 8, 2025
