In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From

Wondering what data OpenAI used to train its buzzy new text-to-video AI? The company’s CTO is similarly unsure.

Mira Murati, OpenAI’s longtime chief technology officer, sat down with The Wall Street Journal’s Joanna Stern this week to discuss Sora, the company’s forthcoming video-generating AI. About halfway through the 10-minute-long interview, Stern straightforwardly asked Murati where the new model’s training data was gleaned from. But Murati, in the most cringe-inducing way possible, couldn’t find an answer beyond vague corporate language.

“We used publicly available data and licensed data,” Murati responded to the resoundingly simple question.

Stern pushed back with more specific source examples: “So, videos on YouTube?”

“I’m actually not sure about that,” said Murati, before rebuffing further queries about whether videos shared to Instagram or Facebook were fed into the model.

“You know, if they were publicly available — publicly available to use,” the CTO answered, “but I’m not sure. I’m not confident about it.”

Stern then inquired about OpenAI’s data training partnership with the stock image company Shutterstock, asking if videos on the partnered platform were sucked into Sora’s training material. And this time? Murati decided to shut down the line of questioning altogether.

“I’m just not going to go into detail about the data that was used,” Murati continued. “But it was publicly available or licensed data.”

So, in sum, Murati can’t tell you exactly where the videos gobbled up by Sora first came from. But rest assured, the sourceless data was definitely, one hundred percent publicly available or licensed. Convincing stuff!

It’s a bad look all around for OpenAI, which has drawn wide controversy — not to mention multiple copyright lawsuits, including one from The New York Times — for its data-scraping practices. After all, if the company’s CTO can’t firmly tell you where its buzziest new model’s training data was sourced from, it doesn’t exactly communicate a particular amount of care for the issue from OpenAI’s higher-ups.

After the interview, Murati reportedly confirmed to the WSJ that Shutterstock videos were indeed included in Sora’s training set. But when you consider the vastness of video content across the web, any clips available to OpenAI through Shutterstock are likely only a small drop in the Sora training data pond.

Online, reactions to the clip were mixed, with many chalking Murati’s tight-lipped responses up to a possible lack of candor.

“So when *the CTO* of OpenAI is asked if Sora was trained on YouTube videos, she says ‘actually I’m not sure’ and refuses to discuss all further questions about the training data,” former LA Times tech columnist Brian Merchant wrote in an X-formerly-Twitter post. “Either a rather stunning level of ignorance of her own product, or a lie — pretty damning either way!”

“You’re the CTO ma’am,” added another netizen, “you should know.”

Others, meanwhile, jumped to Murati’s defense, arguing that if you’ve ever published anything to the internet, you should be perfectly fine with AI companies gobbling it up.

“Why does it matter? That is the question,” said one X user. “I find it insane that people make things public to everyone in the world and then complain when someone uses that public thing. If you want to be private, then be private.”

That latter argument, though, speaks to the bizarre new reality that internet users have now found themselves in. Historically, when someone told you to be careful of what you post online, the reasoning was something akin to “you might regret that later” — and not “a multibillion-dollar AI company might turn a profit by vacuuming that Facebook video of you and your family, or a goofy YouTube video you made with your friends, into a generative AI model.”

Whether Murati was keeping things close to the vest to avoid more copyright litigation or simply didn’t know the answer, people have good reason to wonder where AI data, be it “publicly available and licensed” or not, is coming from. And moving forward, vague corporate mumbling probably isn’t going to cut it.

More on OpenAI and its data: OpenAI Says It’s Fine to Vacuum Up Everyone’s Content and Charge for It Without Paying Them




The post In Cringe Video, OpenAI CTO Says She Doesn’t Know Where Sora’s Training Data Came From appeared first on Job From Home Blog.
