Voice: This Week's Ranking

The posts in the "Voice" category that earned the most engagement over the past 7 days.

I'm just as surprised nobody in AI voice tech has realized a voice needs background and environmental noise to sound realistic. Even @ElevenLabs, the leader in voice AI, cannot produce voice with background noise or environment reverb sound. AI voices are always going to sound non-passable as human if they don't have that. And it's only me and this other guy even talking about it.

Alexander @alexanderrX_

@levelsio I'm surprised they haven't perfected the sound yet. It always has that empty, artificial feel. It's the biggest giveaway by far. I'm sure they're close to perfecting it.

6d ago

@levelsio @levelsio

Voice

What in the AI noise reduction is happening in this video? It's unlistenable to me. If you remove all background noise it just sounds like fake dubbed AI voices, so bad.

Anjney Midha @AnjneyMidha

*New Lecture* Stanford @CS153Systems '26, Session 8 The Compute Behind Intelligence with Jensen Huang from @nvidia full link in comment - this clip is just the one where he talks about tomatoes

1w ago

うみゆき@AI研究 @umiyuki_ai

Voice

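At last, a lawsuit over AI voice ripoffs. The plaintiff is Tsudaken. The defendant is TikTok. Someone keeps uploading videos to TikTok using a voice trained on Tsudaken, so the suit asks that TikTok be made to take them down. I see: by going after the platform instead of the person actually abusing the voice, you don't need a disclosure request to track down who they are. The legal basis is the right of publicity and the Unfair Competition Prevention Act. TikTok's side is reportedly arguing, "Well, according to the uploader, it wasn't trained on Tsudaken. They say they only trained it on a friend's voice." Sure, there probably are people out there who sound just like Tsudaken, and there is no evidence either way on whether it was trained on Tsudaken. So the question is whether the videos can be forced down merely because the AI voice sounds extremely similar to Tsudaken. Personally, I think it could end simply: if the defendant actually brings this supposed friend of the uploader, the one who sounds just like Tsudaken, in as a witness and has them speak in the Tsudaken voice, they're in the clear; if they can't produce them, the videos get deleted.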

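Yomiuri Shimbun Online @Yomiuri_Online

Kenjiro Tsuda's "deep voice" imitated with generative AI; TikTok sued for removal of the videos... possibly the first lawsuit over unauthorized use of a voice https://www.yomiuri.co.jp/national/20260522-GYT1T00290/ #ニュース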

2d ago

Karan Goel @krandiash

Voice

Our new speech model Sonic-3.5 is now #1 on Artificial Analysis's leaderboard. Less than 2 years ago, we released Sonic-1, the fastest speech model in the world. Sonic-3.5 now brings the best speech model for conversation with the lowest latency in production.

Artificial Analysis @ArtificialAnlys

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS.

Sonic-3.5 is the latest TTS model from @cartesia. It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.

Key takeaways:

➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209
➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters
➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS

See more details and listen to samples below 🧵
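
To put the quoted figures side by side, here is a minimal Python sketch. The Elo scores, prices, and speeds are copied from the post. Everything else is an assumption made for illustration: the function names are hypothetical, and the preference estimate assumes the classic logistic Elo formula with a 400-point scale, which the post does not confirm is what the leaderboard uses.

```python
# Figures quoted in the Artificial Analysis post; nothing here is measured.
MODELS = {
    "Sonic-3.5": {"elo": 1218, "usd_per_1m_chars": 39.0, "chars_per_sec": 105.5},
    "Inworld Realtime TTS 1.5 Max": {"elo": 1194, "usd_per_1m_chars": 35.0, "chars_per_sec": 205.0},
    "Gemini 3.1 Flash TTS": {"elo": 1209, "usd_per_1m_chars": 18.3, "chars_per_sec": 26.3},
}


def expected_preference(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Estimated share of head-to-head votes won by model A over model B.

    ASSUMPTION: classic logistic Elo with a 400-point scale; the
    leaderboard's actual rating method may differ.
    """
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / scale))


def cost_usd(model: str, n_chars: int) -> float:
    """List-price cost of synthesizing n_chars characters."""
    return MODELS[model]["usd_per_1m_chars"] * n_chars / 1_000_000


def seconds_to_generate(model: str, n_chars: int) -> float:
    """Time to produce n_chars characters at the quoted characters per second."""
    return n_chars / MODELS[model]["chars_per_sec"]


if __name__ == "__main__":
    n_chars = 10_000  # hypothetical workload size
    top = MODELS["Sonic-3.5"]["elo"]
    for name, m in MODELS.items():
        print(
            f"{name}: ${cost_usd(name, n_chars):.3f}, "
            f"{seconds_to_generate(name, n_chars):.1f} s, "
            f"Sonic-3.5 preferred {expected_preference(top, m['elo']):.1%} of the time"
        )
```

Under that Elo assumption, a 9-point gap corresponds to only a slight edge in head-to-head preference, and the quoted +16/-16 interval is wider than the gap to Gemini 3.1 Flash TTS, so the top of the ranking is best read as close.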