In January, the company released a collection of nearly a million photos that were scraped from Flickr and then annotated to describe each subject’s appearance. IBM touted the collection of pictures as a way to help eliminate bias in facial recognition.
However, it didn’t get express consent from the photographers or the subjects in the photos to use them in that manner. Instead, the images were uploaded to Flickr under a “Creative Commons” license, which allows others to use them without paying licensing fees, sometimes for commercial use.
IBM has said that it will remove any images from the set that a photographer or subject wants taken out. Doing so requires photographers to email IBM links to the images they would like to have removed. That makes it easy if you happen across a photo you’d like removed, but it doesn’t help in finding photos in the first place. IBM has not revealed the usernames of any of the users it pulled photos from, so it would be easy for images to go undiscovered.
The photos in IBM’s dataset also do not include usernames or subject names, which would make it difficult for anyone to identify the people in the photos.
Critics of facial recognition say that it could eventually be used to target and profile minorities. At Amazon, a group of shareholders recently proposed that the company stop selling its facial recognition software until the board could determine that it would not threaten people’s civil rights.
In a statement provided to Fortune by an IBM representative, the company said:
IBM has been committed to building responsible, fair and trusted technologies for more than a century and believes it is critical to strive for fairness and accuracy in facial recognition. We take the privacy of individuals very seriously and have taken great care to comply with privacy principles, including limiting the Diversity in Faces dataset to publicly available image annotations and limiting the access of the dataset to verified researchers. Individuals can opt-out of this dataset.