10
Rant: People keep saying AI bias is just about bad data, but that's missing the whole point
I keep seeing posts that blame biased AI only on the data it's trained on. Sure, that's part of it, but the real problem is who gets to decide what 'good' data even is. I read a report from a group in Berlin that showed a team of 20 people, all from similar backgrounds, picking the training data for a hiring tool. They didn't even see their own blind spots. It's not just about cleaning the data; it's about who's in the room when the system is built. If the people making the AI all think the same way, the bias gets baked in from the start, not just from the numbers. Has anyone else worked on a project where the team makeup actually changed the outcome?
4 comments
lopez.karen15d ago
The 2016 ProPublica study on COMPAS actually showed the opposite story. They found a tool that was less biased than human judges overall, even though the training data had all the usual problems. If you control for the actual outcomes instead of just looking at who built it, sometimes the machine does a better job of ignoring the team's personal hang-ups. I'd rather have a system that was built by 20 similar people but tested on 10,000 diverse outcomes than one built by a "diverse" group that never checks their work against reality.
4
gavina731mo ago
Oh man, you are so right. I saw this happen on a project for a school district. The team was all tech people who went to the same few colleges. They built a system to flag "at risk" students, but it just kept pointing to kids from certain neighborhoods. They thought it was just the data, but it was their own ideas about what "risk" looked like. They never asked any teachers or counselors from those areas to help. The bias was in the first meeting, not the spreadsheet.
0
Totally... saw the same thing with a hiring tool at my old place. Same group of guys picking what "good" looked like, so it just kept finding more of them.
10