Meta releases AI model for translating speech between dozens of languages

NEW YORK, Aug 22 (Reuters) – Facebook parent company Meta Platforms (META.O) on Tuesday released an AI model capable of translating and transcribing speech in dozens of languages, a potential building block for tools enabling real-time communication across language divides.

The company said in a blog post that its SeamlessM4T model could support translations between text and speech in nearly 100 languages, as well as full speech-to-speech translation for 35 languages, combining technology that was previously available only in separate models.

CEO Mark Zuckerberg has said he envisions such tools facilitating interactions between users from around the globe in the metaverse, the set of interconnected virtual worlds on which he is betting the company’s future.

Meta is making the model available to the public for non-commercial use, the blog post said.

The world’s biggest social media company has released a flurry of mostly free AI models this year, including a large language model called Llama that poses a serious challenge to proprietary models sold by Microsoft-backed (MSFT.O) OpenAI and Alphabet’s (GOOGL.O) Google.

Zuckerberg says an open AI ecosystem works to Meta’s advantage, as the company has more to gain by effectively crowd-sourcing the creation of consumer-facing tools for its social platforms than by charging for access to the models.

Nonetheless, Meta faces the same legal questions as the rest of the industry over the training data ingested to create its models.

In July, comedian Sarah Silverman and two other authors filed copyright infringement lawsuits against both Meta and OpenAI, accusing the companies of using their books as training data without permission.

For the SeamlessM4T model, Meta researchers said in a research paper that they gathered audio training data from 4 million hours of “raw audio originating from a publicly available repository of crawled web data,” without specifying which repository.

A Meta spokesperson did not respond to questions on the provenance of the audio data.

Text data came from datasets created last year that pulled content from Wikipedia and associated websites, the research paper said.

Reporting by Katie Paul, Editing by Rosalba O’Brien

