Can AI Be Trained Ethically? New Dataset Shows It’s Possible

  • Hilary Sumner
  • Sep 24, 2025

A team of researchers has shown that it is possible to build large AI datasets entirely from ethically sourced material: their dataset draws on 130,000 English-language books from the Library of Congress, a collection almost twice the size of Project Gutenberg’s. The project adds to recent open-source efforts such as Hugging Face’s FineWeb, which aim to make AI training more transparent and responsible. Experts say a dataset assembled this carefully may not be large enough to power today’s largest AI models, but they hope it will encourage companies to be more open about the data they use.

SUMNER IP LAW PLLC
336 Cumberland Street
Lebanon, PA 17042
Ph: 717.202.5528
Fax: 717.740.2020
Email: hilary@sumneriplaw.com