Adobe’s ‘ethical’ Firefly AI was trained on Midjourney images

By Rachel Metz and Brody Ford

AI
Monday, 15 Apr 2024
3:30 PM MYT


An Adobe employee walking the crowd through Adobe Firefly Custom Models and Adobe Firefly Services during the opening keynote at Adobe Summit on March 26, 2024 in Las Vegas. Adobe never made clear publicly that Firefly had trained in part on images from competitors’ tools that are supposedly less ethical. — AP Images for Adobe

When Adobe Inc released its Firefly image-generating software last year, the company said the artificial intelligence model was trained mainly on Adobe Stock, its database of hundreds of millions of licensed images. Firefly, Adobe said, was a “commercially safe” alternative to competitors like Midjourney, which learned by scraping pictures from across the Internet.

But behind the scenes, Adobe also was relying in part on AI-generated content to train Firefly, including from those same AI rivals. In numerous presentations and public posts about how Firefly is safer than the competition due to its training data, Adobe never made clear that its model actually used images from some of these same competitors.

Massive amounts of data are needed to train AI models underlying popular content creation products, and there is increasing scrutiny on AI technology companies over their use of copyrighted materials in this process. Companies like Midjourney, Dall-E maker OpenAI and Stable Diffusion maker Stability AI built their media-generating models with datasets that pull imagery from across the Internet, a practice that has led to outrage and lawsuits from a number of artists.

“This shows the murkiness of the definition of responsible AI, and it also illustrates the difficulties of getting away from, if not the legal, then the social and cultural problems, or ethical problems, with generated content,” said Luke Stark, an assistant professor at Western University in Ontario, who studies the social and ethical impacts of AI.

Adobe’s decision to build Firefly with content the company holds the rights to and content that is in the public domain was meant to differentiate its AI image tool in the fast-growing market for generative artificial intelligence. The company promoted it as a more ethical, legally sound option for customers interested in conjuring images from just a few words but wary of potential copyright issues. It won’t generate content based on the intellectual property of other people or brands, Adobe has said, and will avoid producing harmful images, too.

AI-generated content made it into Firefly’s training set because creators were allowed to submit millions of images into Adobe’s stock marketplace that used the technology from other companies. “Generative AI images from the Adobe Stock collection are a small part of the Firefly training dataset,” wrote Adobe representative Michelle Haarhoff in September on a Discord group for photographers and artists who contribute to the marketplace.

Adobe said a relatively small amount – about 5% – of the images used to train its AI tool was generated by other AI platforms. “Every image submitted to Adobe Stock, including a very small subset of images generated with AI, goes through a rigorous moderation process to ensure it does not include IP, trademarks, recognisable characters or logos, or reference artists’ names,” a company spokesperson said.


Criticism of the practice has come from inside the company: Since the early days of Firefly, there has been internal disagreement on the ethics and optics of ingesting AI-generated imagery into the model, according to multiple employees familiar with its development who asked not to be named because the discussions were private. Some have suggested weaning the system off generated images over time, but one of the people said there are no current plans to do so.

Adobe has taken shots at competitors over their data collection practices. Other models are built on data that is “openly scraped”, Chief Strategy Officer Scott Belsky said last year. One way that Firefly is better than OpenAI’s comparable model is that it shows respect for the creative community by training only on licensed or freely available data, Adobe says on its website. And in a blog post last March titled “Responsible Innovation in the Age of Generative AI,” general counsel Dana Rao pointed out that generative AI “is only as good as the data on which it’s trained.”

“Training on curated, diverse datasets inherently gives your model a competitive edge when it comes to producing commercially safe and ethical results,” he wrote, while pointing out that Adobe trained Firefly on Adobe stock images, licensed content and public domain content in which the copyright has run out.

“Our enterprise customers came to us when we launched Firefly and said, ‘We love what you’re doing, we really appreciate that you’re not stealing all of our intellectual property out on the open Internet’,” Ashley Still, an Adobe senior vice president, said earlier this month during a Bloomberg Intelligence event.

Still, Adobe never made clear publicly that Firefly had trained in part on images from competitors’ tools that are supposedly less ethical. It did, however, outline such details in at least two online discussion groups the company runs on Discord – one for Adobe Stock and another devoted to Firefly – according to messages Bloomberg has viewed.

In March 2023, Adobe unveiled Firefly as a “beta” product. That month, Raúl Cerón, who works with the Adobe Stock community, posted on Discord that the company wasn’t planning to use generated images to train the forthcoming public version of Firefly.

“Once we go live out of beta, we will have a new training database for it, leaving Gen AI content out of it,” he wrote in a post in June.

When Adobe announced the public release of Firefly on Sept 13, the company also paid a special “Firefly bonus” to Adobe Stock contributors “whose content was used to train the first commercial Firefly model”. Contributors who used generative AI were among those who received the bonus payment, according to a Discord message from Mat Hayward, who also works with the Adobe Stock community.

AI-generated imagery in Adobe Stock “enhances our dataset training model, and we decided to include this content for the commercially released version of Firefly,” Hayward wrote.

Brian Penny, a writer and stock image contributor who has submitted thousands of AI-generated images – mostly made with Midjourney – to Adobe Stock, was surprised to get the bonus. He figured as an AI contributor he wouldn’t be eligible. Despite the financial gain, Penny thinks the decision to train Firefly on content such as his is a bad one, and said the company should be more candid about how it’s training the software for creating images.

“They need to be ethical, they need to be more transparent, they need to do more,” he said.

Adobe Stock’s library has boomed since it began formally accepting AI content in late 2022. Today, there are about 57 million images, or about 14% of the total, tagged as AI-generated images. Artists who submit AI images must specify that the work was created using the technology, though they don’t need to say which tool they used. To feed its AI training set, Adobe has also offered to pay contributors to submit large numbers of photos for AI training – such as images of bananas or flags.

Training on AI-generated content probably wouldn’t make Adobe’s Firefly image generator less commercially safe, and the company isn’t required to say what it’s training on as long as it isn’t misleading consumers, said Harvard professor Rebecca Tushnet, who focuses on copyright and advertising law. But training on AI images, such as those created by Midjourney, undermines the idea that Firefly is distinct from competing services, she said.

“Adobe basically wants to position itself as the superior alternative, but it also wants really cheap inputs, and AI is a really good way to get cheap inputs,” she said. – Bloomberg
