In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From

Wondering what data OpenAI used to train its buzzy new text-to-video AI? The company’s CTO is similarly unsure.

Mira Murati, OpenAI’s longtime chief technology officer, sat down with The Wall Street Journal’s Joanna Stern this week to discuss Sora, the company’s forthcoming video-generating AI. About halfway through the 10-minute-long interview, Stern straightforwardly asked Murati where the new model’s training data was gleaned from. But Murati, in the most cringe-inducing way possible, couldn’t find an answer beyond vague corporate language.

“We used publicly available data and licensed data,” Murati responded to the resoundingly simple question.

Stern pushed back with more specific source examples: “So, videos on YouTube?”

“I’m actually not sure about that,” said Murati, before rebuffing further queries about whether videos shared to Instagram or Facebook were fed into the model.

“You know, if they were publicly available — publicly available to use,” the CTO answered, “but I’m not sure. I’m not confident about it.”

Stern then inquired about OpenAI’s training-data partnership with the stock image company Shutterstock, asking if videos on the partnered platform were sucked into Sora’s training material. And this time? Murati decided to shut down the line of questioning altogether.

“I’m just not going to go into detail about the data that was used,” Murati continued. “But it was publicly available or licensed data.”

So, in sum, Murati can’t tell you exactly where the videos gobbled up by Sora first came from. But rest assured, the sourceless data was definitely, one hundred percent publicly available or licensed. Convincing stuff!

It’s a bad look all around for OpenAI, which has drawn wide controversy (not to mention multiple copyright lawsuits, including one from The New York Times) for its data-scraping practices. After all, if the company’s CTO can’t firmly tell you where its buzziest new model’s training data came from, it hardly suggests that OpenAI’s higher-ups care much about the issue.

After the interview, Murati reportedly confirmed to the WSJ that Shutterstock videos were indeed included in Sora’s training set. But when you consider the vastness of video content across the web, any clips available to OpenAI through Shutterstock are likely only a small drop in the Sora training data pond.

Online, reactions to the clip were mixed, with many chalking Murati’s tight-lipped responses up to a possible lack of candor.

“So when *the CTO* of OpenAI is asked if Sora was trained on YouTube videos, she says ‘actually I’m not sure’ and refuses to discuss all further questions about the training data,” former LA Times tech columnist Brian Merchant wrote in an X-formerly-Twitter post. “Either a rather stunning level of ignorance of her own product, or a lie — pretty damning either way!”

“You’re the CTO ma’am,” added another netizen, “you should know.”

Others, meanwhile, jumped to Murati’s defense, arguing that if you’ve ever published anything to the internet, you should be perfectly fine with AI companies gobbling it up.

“Why does it matter? That is the question,” said one X user. “I find it insane that people make things public to everyone in the world and then complain when someone uses that public thing. If you want to be private, then be private.”

That latter argument, though, speaks to the bizarre new reality that internet users have now found themselves in. Historically, when someone told you to be careful of what you post online, the reasoning was something akin to “you might regret that later” — and not “a multibillion-dollar AI company might turn a profit by vacuuming that Facebook video of you and your family, or a goofy YouTube video you made with your friends, into a generative AI model.”

Whether Murati was keeping things close to the vest to avoid more copyright litigation or simply just didn’t know the answer, people have good reason to wonder where AI data — be it “publicly available and licensed” or not — is coming from. And moving forward, vague corporate mumbling probably isn’t going to cut it.

More on OpenAI and its data: OpenAI Says It’s Fine to Vacuum Up Everyone’s Content and Charge for It Without Paying Them




The post In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From appeared first on Job From Home Blog.
