ChatGPT, Bard, Claude. The world’s most popular and successful chatbots are trained on data scraped from vast swaths of the Internet, mirroring the cultural and linguistic dominance of the English language and Western perspectives.
This has raised alarms about the lack of diversity in artificial intelligence. There is also the worry that the technology will remain the province of a handful of American companies.
In South Korea, a technological powerhouse, firms are taking advantage of the technology’s malleability to shape AI systems from the ground up to address local needs. Some have trained AI models with sets of data rich in Korean language and culture.
South Korean companies say they’re building AI for Thai, Vietnamese and Malaysian audiences. Others are eyeing customers in Brazil, Saudi Arabia and the Philippines, and in industries like medicine and pharmacy.
This has fuelled hopes that AI can become more diverse, work in more languages, be customised to more cultures and be developed by more countries.
“The more competition is out there, the more systems are going to be robust: socially acceptable, safer, more ethical,” said Zhang Byong-tak, a computer science professor at Seoul National University.
While there are some prominent non-American AI companies, such as France’s Mistral, the recent upheaval at OpenAI, the maker of ChatGPT, has highlighted how concentrated the industry remains.
The emerging AI landscape in South Korea is one of the most competitive and diverse in the world, said Yong Lim, a professor of law at Seoul National University who leads its AI policy initiative.
The country’s export-driven economy has encouraged new ventures to seek ways to tailor AI systems to specific companies or countries.
South Korea is well positioned to build AI technology, developers say, given that it has one of the world’s most wired populations, which generates vast amounts of data that can be used to train AI systems.
Its tech giants have the resources to invest heavily in research. The government has also been encouraging: it has provided companies with money and data that could be used to train large language models, the technology that powers AI chatbots.
Few other countries have the combination of capital and technology required to develop a large language model that can power a chatbot, experts say. They estimate that it costs US$100mil to US$200mil to build a foundational model, the technology that serves as the basis for AI chatbots.
South Korea is still months behind the US in the AI race and may never fully catch up, as the leading chatbots continue to improve with more resources and data.
But South Korean companies believe they can compete.
Rather than going after the global market like their US competitors, companies like Naver and LG have tried to target their AI models to specific industries, cultures or languages instead of pulling from the entire Internet.
“The localised strategy is a reasonable strategy for them,” said Choi Suk-woong, a professor of information systems at the University at Albany. “US firms are focused on general-purpose tools. South Korean AI firms can target a specific area.”
Outside the US, AI prowess appears limited in reach. In China, Baidu’s answer to ChatGPT, called Ernie, and Huawei’s large language model have shown some success at home, but they are far from dominating the global market.
Governments and companies in other nations like Canada, Britain, India and Israel have also said they are developing their own AI systems, though none has yet released a system that can be used by the public.
About a year before ChatGPT was released, Naver, which operates South Korea’s most widely used search engine, announced that it had successfully created a large language model. But the chatbot based on that model, Clova X, was released only this September, nearly a year after ChatGPT’s debut.
Sung Na-ko, an executive at Naver who has led the company’s generative AI project, said the timing of ChatGPT’s release surprised him.
“Up until that point, we were taking a conservative approach to AI services and just cautiously exploring the possibilities,” Sung said.
“Then we realised that the timeline had been accelerated a lot,” he added. “We decided we had to move immediately.”
Now, Naver runs an AI model built for Korean language speakers from the ground up using data from the South Korean government and from its search engine, which has scraped the country’s Internet since 1999.
Clova X recognises Korean idioms and the latest slang – language that US-made chatbots like Bard, ChatGPT and Claude often struggle to understand. Naver’s chatbot is also integrated into the search engine, letting people use the tool to shop and travel.
Outside its home market, the company is exploring business opportunities with the Saudi Arabian government. Japan could be another potential customer, experts said, since Line, a messaging service owned by Naver, is used widely there.
LG has also created its own generative AI model, called Exaone. Generative AI is the type of artificial intelligence capable of creating original content based on inputs. Since Exaone’s creation in 2021, LG has worked with publishers, research centres, pharmaceutical firms and medical companies to tailor its system to their data sets and provide them access to its AI system.
The company is targeting businesses and researchers instead of the general user, said Bae Kyung-hoon, the director of LG AI Research. LG’s subsidiaries have also begun using the company’s AI chatbots.
One of the chatbots, built to analyse chemistry research and chemical equations, has been used by researchers building new materials for batteries, chemicals and medicine.
“Rather than letting the best one or two AI systems dominate, it’s important to have an array of models specific to a domain, language or culture,” said Lee Hong-lak, the chief scientist of LG’s AI research arm.
Another South Korean behemoth, Samsung, recently announced Samsung Gauss, a generative AI model being used internally to compose email, summarise documents and translate text. The company plans to integrate it into its mobile phones and smart home appliances. — ©2023 The New York Times Company