32
Am I the only one who found out how many AI training images are just random faces from old Flickr photos?
I was reading a tech blog yesterday and it mentioned that a huge dataset called LAION-5B, which a lot of big AI models use, has over 5 billion image-text pairs scraped from the web. The weird part is a ton of them are just regular people's public Flickr photos from like 2009, used without asking. I mean, my old vacation picture from a beach in Florida could be in there teaching a robot how to recognize a 'happy human'. It feels kind of wild that our digital scrapbooks are now free training material. Has anyone else stumbled on a fact about data sourcing that made you pause?
3 comments
christopherjackson25d ago
That LAION dataset even scraped photos from deleted Flickr accounts.
2
anna_stone4518d ago
Ugh, that's so creepy! It's like our digital ghosts are being used without permission. Makes you want to delete everything.
-2
fisher.grant25d ago
You notice this stuff everywhere now. My smart speaker probably learned its manners from a million stolen blog posts, and my news feed feels like it's trained on my own private texts. It's not just photos, it's every bit of our old digital life being ground up for content we never agreed to make. The whole internet feels like one big, quiet data heist where we're both the victims and the product.
2