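Research: This Week's Ranking

Posts that earned the most engagement in the "Research" category over the past 7 days.

Video Generation 86 · Image Generation 81 · Coding 113 · Audio 7 · Agents 159 · Research 125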


Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

6d ago

149K11K14K27M

Andrew Ng @AndrewYNg

Research

There will be no AI jobpocalypse. The story that AI will lead to massive unemployment is stoking unnecessary fear. AI — like any other technology — does affect jobs, but telling overblown stories of large-scale unemployment is irresponsible and damaging. Let’s put a stop to it. I’ve expressed skepticism about the jobpocalypse in previous posts. I’m glad to see that the popular press is now pushing back on this narrative. The image below features some recent headlines. Software engineering is the sector most affected by AI tools, as coding agents race ahead. Yet hiring of software engineers remains strong! So while there are examples of AI taking away jobs, the trends strongly suggest the net job creation is vastly greater than the job destruction — just like earlier waves of technology. Further, despite all the exciting progress in AI, the U.S. unemployment rate remains a healthy 4.3%. Why is the AI jobpocalypse narrative so popular? For one thing, frontier AI labs have a strong incentive to tell stories that make AI technology sound more powerful. At their most extreme, they promote science-fiction scenarios of AI “taking over” and causing human extinction. If a technology can replace many employees, surely that technology must be very valuable! Also, a lot of SaaS software companies charge around $100-$1000 per user/year. But if an AI company can replace an employee who makes $100,000 — or make them 50% more productive — then charging even $10,000 starts to look reasonable. By anchoring not to typical SaaS prices but to salaries of employees, AI companies can charge a lot more. Additionally, businesses have a strong incentive to talk about layoffs as if they were caused by AI. After all, talking about how they’re using AI to be far more productive with fewer staff makes them look smart. This is a better message than admitting they overhired during the pandemic when capital was abundant due to low interest rates and a massive government financial stimulus. 
To be clear, I recognize that AI is causing a lot of people’s work to change. This is hard. This is stressful. (And to some, it can be fun.) I empathize with everyone affected. At the same time, this is very different from predicting a collapse of the job market. Societies are capable of telling themselves stories for years that have little basis in reality and lead to poor society-wide decision making. For example, fears over nuclear plant safety led to under-investment in nuclear power. Fears of the “population bomb” in the 1960s led countries to implement harsh policies to reduce their populations. And worries about dietary fat led governments to promote unhealthy high-sugar diets for decades. Now that mainstream media is openly skeptical about the jobpocalypse, I hope these stories will start to lose their teeth (much like fears of AI-driven human extinction have). Contrary to the predictions of an AI jobpocalypse, I predict the opposite: There will be an AI jobapalooza! AI will lead to a lot more good AI engineering jobs, and I’m also optimistic about the future of the overall job market. What AI engineers do will be different from traditional software engineering, and many of these jobs will be in businesses other than traditional large employers of developers. In non-AI roles, too, the skills needed will change because of AI. That makes this a good time to encourage more people to become proficient in AI, and make sure they’re ready for the different but plentiful jobs of the future! [Original text in The Batch newsletter.]

1w ago

5.4K1.2K3.3K795K

Andrew Ng @AndrewYNg

Research

The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families, leave us with fewer doctors, teachers and scientists, and hurt American competitiveness in AI.

3d ago

12K1.6K7991.3M

今井翔太 / Shota Imai@えるエル @ImAI_Eruel

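Research

With the layoffs at Meta and others and the booing of AI at graduation ceremonies, the argument that AI takes jobs is gaining real momentum, and overseas people have started to ask, "Isn't this the first new technology in history that older people favor and young people push back against?" In fact, there is research arguing that "with AI, young people are getting less work while older workers are getting more," and I covered it when I appeared on BS Fuji this week. The Stanford University paper "Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence" uses real data to analyze what happened to the labor market as generative AI spread. First, the occupations considered "likely to lose jobs to AI" did in fact see employment decline. However, as the figure in the image shows, although employment is trending down overall, a breakdown by age reveals groups whose employment is actually growing. Employment for the "Early Career" group in their 20s and 30s is falling, while employment for older workers is rising, centered on the "Senior" group around their 50s. Originally, the argument that AI takes jobs looked at "at risk / not at risk" occupation by occupation, but this research shows that age is another important axis. In light of this research, I think I can understand why young people in particular push back against AI, to the point that booing breaks out at university graduation ceremonies.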

5d ago

1.9K648683347K

今井翔太 / Shota Imai@えるエル @ImAI_Eruel

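Research

The idea that "bullshit jobs" (meaningless jobs that shouldn't be needed) are the ones more likely to survive in the AI era is actually something a certain famous AI researcher everyone knows has also said. Precisely because these jobs were never needed in the first place, they cannot be taken by AI trained "to be useful" and "to raise productivity"... Looking at current research as well, "useful" jobs do seem to be the ones more affected by AI.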

Jun Rekimoto: 暦本純一@rkmt

The most AI-resistant job is the bullshit job. These jobs could have disappeared even before AI, yet they remain, and there is a structural reason for that..

1w ago

1.8K420593310K

Sasha Rush @srush_nlp

Research

Talk: Training Composer https://www.youtube.com/watch?v=uTgqYeVxy2c Overview of the methods that we use at Cursor to build our model.

4d ago

6618469791K

@levelsio @levelsio

Research

Just a decade ago Japan would have invented this, they love cats and tech But this is a Chinese device by a Chinese startup running a Chinese AI model The center of innovation in Asia has shifted to China, not just production!

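比特币橙子Trader@oragnes

PettiChat, a startup in Hangzhou, China, sells a $118 AI pet collar that uses Alibaba's Qwen model and can translate the sounds and emotions of cats and dogs in real time with 94.6% accuracy. It has already received 10,000 preorders.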

2d ago

1.3K49525379K

Javi Lopez ⛩️@javilopen

Research

⚡ xAI dropped the X algorithm yesterday and I don't get why nobody noticed what's actually in there
I burned $500 on Claude going through every single line
Here's what I found (LONG POST, save it for later):
0/ Every account has an "embedding" attached to it that describes you the way AI models do: in latent space.
It's the internal fingerprint the model keeps of every user, a vector of numbers that sums up how your account behaves (what topics you touch, what engagement you generate, who you interact with). The model uses it every time it decides who to show your posts to.
If your history is good, it stays clean and the model pushes you. If you accumulate negative signals (blocks, mutes, reports, not_interested), it goes toxic and starts penalizing you automatically.
And the trap: it does NOT reset. What you do today stays in there for weeks, poisoning everything you publish after, even if it's good.
That's why getting out of a shadowban or a low-reach streak on X feels like trying to move a giant rusted wheel. It's not your imagination, it's literally that. Cleaning up your embedding is slow and painful, like the impression you have of someone you don't like: no matter how nice they get to you, it's gonna take a while before you trust them.
Another important finding: the embedding doesn't decay on a clock. It decays with NEW engagement entering the system. If you stop posting, the old bad signals stay frozen in there. Nothing overwrites them.
If you start making content the algorithm likes, you'd see improvement after 6 to 8 weeks and a real shift around 12 to 16 weeks, assuming you don't pile up more bad signals along the way.
Why is nobody talking about this? It blows my mind. Finally a confirmation of that "I'm in a bad streak" feeling we've all been through.
1/ First 30 minutes are everything
If your post doesn't get engagement fast, Grok doesn't even evaluate it. No quality score, no deep analysis, no chance of reaching anyone who doesn't follow you. Dead and buried
2/ Post age caps at 80 hours:
POST_AGE_MAX_MINUTES = 4800, bucketed in 1 hour chunks. After that you're in the "overflow bucket" which translates to "ancient, ignore"
Best window: first 0 to 12 hours. After 24 you're already in a worse bucket
Far from rewarding "evergreen" content, X wants a constant stream of fresh meat (literally the opposite of YouTube)
3/ MY BIGGEST FEAR TURNED OUT TO BE UNFOUNDED (supposedly): living in EU posting English for US audience: ZERO direct penalty in theory:
The PostCandidate struct has NO field for author country, IP, or location. Gizmoduck (X's identity service) returns only follower count + screen name. The Phoenix transformer just sees a hash of your author_id
What hurts you indirectly: timezone (your post ages while US sleeps) and the language of the POST itself
So using a VPN to "post from the US" does literally nothing (unlike TikTok or Instagram, by the way)
4/ The 5 negative signals that kill your reach:
The model predicts 22 actions per post. 5 of them are negative weights that get SUBTRACTED from your score:
- not_interested
- block_author
- mute_author
- report
- not_dwelled (people scrolling past your post without stopping)
That last one is brutal tbh. A post that gets ignored is mathematically WORSE than a post that never got published
5/ Shadowbans 100% exist. 4 different kinds:
- Hard drop. X removes your post from everyone's feed without telling you. Applied to posts with serious content (child safety, etc.) or suspended accounts. You don't even find out
- DO_NOT_AMPLIFY label. Literally a field in the code that says "do not amplify this post". If they put it on you, ads stop showing next to your posts → X stops making money from showing you → the system stops pushing you. Full blackout
- BotMaker rules. The internal panel where X employees can manually limit a specific account by hand. The code shows the categories that exist (Content, ContentLimited, Safety, Grok) but does NOT show who they're applied to or why. The tool is documented, the usage isn't
- Poisoned embedding. The worst one, as we saw above. The model has an internal "memory" for every account. If your account racks up enough "not interested" + blocks + mutes + reports over time, that memory goes toxic. From then on, even your good future posts get penalized automatically. Nobody decided this. The model just learned your account gets bad engagement and self-corrected
6/ Only ORIGINAL posts get the "Banger Screen"
Replies and retweets never enter the Grok quality classifier. If you spend your day replying to viral accounts, you're optimizing for the Reply Ranker, NOT for amplification
Want to be discovered out of network? Write originals. There's no other way
7/ Replies to small accounts get spam-scanned. Replies to big accounts get Grok-ranked
Two separate classifiers. The SpamEapiLowFollowerClassifier hits replies to small accounts. The ReplyRanker scores replies to big accounts 0 to 3 with Grok
"First!" or emoji-only replies get a 0. "Sir, this is a Wendy's" energy gets penalized. Basically, if you write replies, they better add something. Otherwise don't bother
8/ 50% of all feed requests are "shadow traffic"
is_sampled(request_id, 0.5) marks half of every feed request as shadow. Many context features (gender inference, demographics, Grok topic preferences) only activate on shadow OR with a feature flag
Translation: you literally cannot know which version of the algorithm any given user is getting. Half your audience is in an experiment at any moment
9/ Dwell (the time a user spends looking at your post before scrolling) is 5x better than getting likes
The scorer has 5 different dwell signals (dwell, cont_dwell_time, click_dwell_time, etc.) but only 1 favorite signal.
- A post with tons of likes but people read it for 1 second and keep scrolling → low score
- A post with few likes but people stay 8 seconds reading it → high score
Optimize for time spent on your post, not for likes!
10/ Things that actually work:
- Get engagement in the first 10 min. DM your friends, ping your community, whatever
- Post in your AUDIENCE'S timezone, not yours. US targeting: 8 to 11am ET (14 to 17 Madrid time)
- Don't post 5 things in a row. AuthorDiversityScorer multiplies each next post by decay^position. By post 4 you're at the floor
- Video ≥ 10 seconds. Below MinVideoDurationMs you lose the full VQV weight
- Videos with audio. Grok runs ASR (speech to text) on every video. No audio = blank signal
- Quote tweet virals in your niche. The model already knows the original engages, your value-add stacks on top
11/ Things that absolutely kill your reach:
- WILD FINDING: threads of 10+ tweets. DedupConversationFilter keeps only 1 tweet per conversation per feed. Megathreads are mathematically a waste
- Reposting the same content. Bloom filters dedupe it
- AI slop. There's literally a slop_score field in the BangerScreen output. They explicitly detect it
- NSFW/violence/hate without tags. Auto MediumRisk = no ads = structural shadowban
- Reply-spamming small accounts. Specific classifier for that
12/ What they DIDN'T release, the sneaky bastards:
The skeleton is public. The dials are not
- Exact numeric values of every weight (FavoriteWeight, ReplyWeight, OonWeightFactor, AuthorDiversityDecay). Live in xai_feature_switches::Params, external config
- The actual Grok prompts (the 7 PToS policy prompts, BangerMiniVlmScreenScore, SafetyPtos). Could literally have any framing in them
- The BotMaker rules that apply DO_NOT_AMPLIFY to specific accounts
- util/phoenix_request.rs, which constructs the final model call
- 25+ xai_* crates referenced but not included
- The production Phoenix weights. They only released the mini version
My theory: they gave us a pretty skinny skeleton of the whole thing they actually have. The muscle (weights) and the brain (prompts and BotMaker rules) are completely opaque. They kept the best parts for themselves, clearly
13/ Cheat sheet so you don't forget:
- First 30 min matter more than anything
- Your location is irrelevant, your timing and language are not
- Shadowbans exist in 4 flavors. Worst is the model quietly poisoning your author embedding from past bad signals. Climbing back up by cleaning your embedding is gonna hurt, but it can be done
- Replies and retweets don't get the quality classifier. Originals do
- Dwell (someone actually staying to look at your post) beats likes 5 to 1
- Half of all traffic is in some experiment at any moment
- They kept the best parts of the algorithm for themselves, but hey, something is something
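A few of the mechanics described in this thread can be sketched in code. The Python below only illustrates the thread's claims and is not the released source: `POST_AGE_MAX_MINUTES = 4800`, the 1-hour buckets, the `is_sampled(request_id, 0.5)` call shape, and the `decay^position` rule are quoted from the thread, while the hash-based sampling, the overflow bucket index, and the `decay` and `floor` values are assumptions chosen for the example (the thread itself says the real weights live in external config).

```python
import hashlib

POST_AGE_MAX_MINUTES = 4800  # quoted in the thread (80 hours)
BUCKET_MINUTES = 60          # "bucketed in 1 hour chunks"
# Assumption: the overflow bucket is the index right after the last hourly bucket.
OVERFLOW_BUCKET = POST_AGE_MAX_MINUTES // BUCKET_MINUTES


def post_age_bucket(age_minutes: float) -> int:
    """Map a post's age to an hourly bucket, or to the overflow bucket past the cap."""
    if age_minutes < 0:
        raise ValueError("age_minutes must be non-negative")
    if age_minutes >= POST_AGE_MAX_MINUTES:
        return OVERFLOW_BUCKET
    return int(age_minutes // BUCKET_MINUTES)


def is_sampled(request_id: str, rate: float) -> bool:
    """Deterministically place a request in the sampled ("shadow") group.

    Assumption: a hash of the request id is compared against the rate;
    the thread only shows the call, not how it is implemented.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return value < rate * 2**64


def author_diversity_multiplier(position: int, decay: float = 0.5,
                                floor: float = 0.125) -> float:
    """Score multiplier for an author's n-th post in a feed (position 0 = first).

    Implements decay**position with a lower bound; decay and floor are
    hypothetical values, not the production ones.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return max(decay ** position, floor)


if __name__ == "__main__":
    for hours in (0, 12, 24, 79, 80, 200):
        print(f"{hours:>3}h old -> bucket {post_age_bucket(hours * 60)}")
    for pos in range(5):
        print(f"post {pos + 1} in a row -> x{author_diversity_multiplier(pos)}")
```

With these hypothetical values, the fourth consecutive post (position 3) already sits at the floor, which is consistent with the thread's "by post 4 you're at the floor" description.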

1w ago

4839263959K

François Chollet @fchollet

Research

Symbolic learning is not a replacement for coding agents, it's a replacement for gradient descent & NNs: a low-level, completely general, extremely scalable new learning substrate.

1w ago

7125331569K

Demis Hassabis @demishassabis

Research

Really cool work from the team reimagining the mouse pointer to be intelligent! Try the prototype in @GoogleAIStudio it's pretty magical.

Google DeepMind@GoogleDeepMind

We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️ These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get things done 🧵

1w ago

2.1K1740221K

うみゆき@AI研究 @umiyuki_ai

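Research

There used to be a few hopes that even if you completely ignored AI, it might just go away if left alone. (1) Performance would quickly plateau and it would end in disappointment → with Mythos, it has already become smart enough to be a real threat to humanity. (2) AI companies would fail to make any money from AI, run huge losses, and eventually go under → Anthropic has gone and turned a profit. (3) If AI became dangerously capable, it would obviously be regulated → it is being left unchecked on the grounds that regulating it would mean losing the race to hostile nations. Those hopes are now gone. From here on, we have no choice but to face a world where AI is a given.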

IFDOCO@IFD_OCO

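When it comes to AI, I'm not trying to one-up anyone. I'm just giving an ordinary heads-up: "This is a staggering technology that overturns things at their foundations, so you'd better face it properly while you can." But it really doesn't get through to the people it doesn't get through to. I think once a person starts unconsciously pushing away what they can't understand, they're done.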

3d ago

654149205138K

すぐる | ChatGPTガチ勢 𝕏 @SuguruKun_ai

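Research

The "prompting course" Google released is great. It's an official course that teaches you how to give instructions to generative AI efficiently in five steps, and you can pick up a wide range of techniques, from writing emails to data analysis and building presentations, in about 6 hours. A certificate is also issued on completion, so it's a must-see for anyone who wants to master ChatGPT for business 👇🧵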

1w ago

2662237627K

うみゆき@AI研究 @umiyuki_ai

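Research

Anthropic had been saying, "Turning a profit is still going to take a while. Wait until 2028," and then customers suddenly came flooding in and it's already in the black!!!!!!!! The "Does AI really make money? No AI company has ever turned a profit lol" conversation is over, and so is the AI bubble theory. AI turned out to be the real deal.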

日本経済新聞電子版（日経電子版）@nikkei

Anthropic expects its first "operating profit" in April-June; AI revenue doubled in 3 months https://www.nikkei.com/article/DGXZQOGN210GZ0R20C26A5000000/?n_cid=SNSTW001&n_tw=1779320842

5d ago

5951147792K

@levelsio @levelsio

Research

That's a really outdated mindset Japan doesn't even have its own LLM like DeepSeek Its biggest model is Rakuten AI which is a finetuned version of Chinese DeepSeek The cope about China not innovating can't last forever

Dragos Roua@dragosroua

Given the Chinese approach to markets, I suspect this was legit invented in Japan, then silently copied in China.

2d ago

925427393K

François Chollet @fchollet

Research

Thinking of AI as a productivity booster for prior workflows is the wrong framing. Like all of the previous waves of computerization/softwarization, AI is a tool that lets you do new things in new ways.

Computers and Society Papers@WGOV

Cognitive offloading and the speedup illusion in human-AI interaction Sunny Yu, Myra Cheng, Ahmad Jabbar, Ilia Sucholutsky, Katherine M. Collins, Dan Jurafsky, Robert D. Hawkins https://arxiv.org/abs/2605.23177 [cs.CY cs.HC]

19h ago

3954914538K

Theo - t3.gg @theo

Research

BREAKING: people who like a thing (ai) choose to invest in the thing (ai) and also work in the thing (ai)

MTS@MTSlive

SITUATION DETECTED: Demis Hassabis, CEO of Google DeepMind, is an angel investor in Anthropic, per FT.

1w ago

1.0K94487K

Sasha Rush @srush_nlp

Research

Been working on text feedback / OPSD in Composer. Really interesting space, and much more to be explored.

Cursor@cursor_ai

We improved Composer by scaling training, generating more complex RL environments, and introducing new learning methods. For example, we use text feedback during RL to learn faster by assigning credit in rollouts spanning hundreds of thousands of tokens.

1w ago

2752813038K

hardmaru @hardmaru

Research

People keep asking if AI will replace software engineers. I believe the exact opposite. Thanks to the Jevons paradox, AI tools are making great engineers 10x more productive, allowing us to tackle much harder, larger-scale problems. We’re expanding our SWE teams at @SakanaAILabs We have 5 new open roles, including English-speaking R&D and Platform roles. Come build the future of AI with us in Tokyo! 🐟

Sakana AI@SakanaAILabs

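[Hiring] Five "Software Engineer" positions are now open! https://t.co/1q07mb3TzE
"As AI advances, will software engineering jobs disappear?" Sakana AI believes the exact opposite. While AI tools are dramatically improving development efficiency, the range and scale of problems we can solve is expanding, as the Jevons paradox suggests, and demand for great Software Engineers is higher than ever. In fact, Sakana AI is stepping up hiring, on an unprecedented scale, of Software Engineers who work at the front lines with AI-assisted tools and bring AI itself into real-world use! We are currently recruiting in the following five specialist areas. See the link for details.
🐙 The challenges that await you
・Enterprise: End-to-end design, development, and operation of AI-powered applications, from frontend to backend
・Defense & Intelligence: Contribute to Japan's defense and intelligence sectors with AI-powered software (*Due to its nature, this position has requirements such as holding Japanese nationality)
・Product: Full-stack development of our own AI products, from UI/UX to backend and infrastructure
・Platform: Design and build the robust infrastructure and data platform that supports LLM agents (English req, Japanese is a plus)
・Research and Development: Bridge ML research and product development by building tools and full-stack infrastructure that accelerate research (English req, Japanese is a plus)
🐡 Who we are looking for
・People with hands-on experience in more than one of Frontend / Backend / Infrastructure
・People who can use AI-assisted coding tools and drive development autonomously within a team
・Experience developing AI systems or launching a product from 0 to 1 is an even bigger plus!
In addition to full-time roles, flexible arrangements such as contract work and internships are available (*varies by position). If you want to deliver cutting-edge AI technology to society with your own hands and create a wave of change, please apply.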

2d ago

224166542K

Soumith Chintala @soumithchintala

Research

Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training: scheduling, storage, networking, reliability, and distributed systems at scale. Hiring in NYC and SF https://t.co/jCx00R6UvB

1w ago

59333053K

swyx @swyx

Research

IMO deep research has been ~dead since o3 and interactivity was always more impt for active learning and eliciting intention thoughtless prompt -> long ass report nobody reads is inferior to read -> think -> ask -> read -> think -> ask

swyx@swyx

getting some yeses getting some nos. have you run a Deep Research recently?

6d ago

27076234K

swyx @swyx

Research

co-sign. a very handy mental framework for what kinds of learning transformers do well today, and why it runs into limitations. when @ankit2119 and i wrote about the need for adversarial world models earlier this year, we were describing a couple of the functions of these rungs of thinking that bring us ever closer to the kolmogorov-limit generator of reality. throwing more params, more power, more everything at a demonstrably inefficient paradigm will be outclassed by the simple solution that can hypothesize and seek truth rather than backfit a house of cards - although the bitter lesson is it is simpler to scale and we may hit agi anyway because human intelligence just isn’t that smart nor plentiful

Rishabh Agarwal@agarwl_

Very well written blog. I think of RL as learning from interventions, and it kinda explains why it's more powerful as a paradigm than supervised learning. Now learning from counterfactuals is something we haven't been historically good at but maybe world modelling+ RL can get us there.

3d ago

8967414K

Soumith Chintala @soumithchintala

Research

more demos on Interaction Models collaboratively doing system design, reading papers, fact-checking with live generative UI

Seongsik Kim@SeongsikKi5837

1. (System design) - The Interaction Models see your screen and collaborates with you live. Here we're building a scalable system architecture together — no copy-pasting, no switching tabs, just thinking out loud and drawing on the screen together.

1w ago

17413032K

Jack Clark @jackclarkSF

Research

Wrote a fictional story for the next issue of Import AI that is pretty different to my previous ones. I think there are beautiful things ahead for humanity as AI gets more powerful and I've tried to capture some of that here. Issue should come out on Tuesday.

3d ago

1084169.7K

Chi Wang @Chi_Wang_

Research

Today, AI disproved a 1946 Erdős conjecture. OpenAI's general-purpose reasoning model found a new infinite family of constructions that exceed the 80-year square-grid bound on planar unit distances. Independently verified by Alon, Bloom, Gowers. What you assumed AI couldn't do — refresh the list. 93s WIRE explainer ↓

5d ago

789175.6K

Furqan Rydhan @FurqanR

Research

AI is a cost per token problem. If a token costs you X and the work it produces earns more, you win. The whole game is finding the loop where token revenue outpaces token spend. Feels a lot like CAC to LTV. ROAS now means Return on AI Spend.
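The arithmetic behind this framing can be sketched as below. The function name, the per-million-token pricing unit, and the example figures are illustrative assumptions, not numbers from the post.

```python
def return_on_ai_spend(tokens_used: int,
                       cost_per_million_tokens: float,
                       revenue_from_work: float) -> float:
    """Ratio of what the token-produced work earns to what the tokens cost.

    A value above 1.0 means token revenue outpaces token spend.
    """
    spend = tokens_used / 1_000_000 * cost_per_million_tokens
    if spend <= 0:
        raise ValueError("token spend must be positive")
    return revenue_from_work / spend


if __name__ == "__main__":
    # Hypothetical job: 2M tokens at $10 per million tokens, producing $50 of value.
    roas = return_on_ai_spend(2_000_000, 10.0, 50.0)
    print(f"Return on AI Spend: {roas:.2f}x")
```

In this hypothetical case the spend is $20 against $50 earned, so the loop is profitable; the same work at five times the token price would not be.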

4d ago

655114.0K

Javi Lopez ⛩️@javilopen

Research

Anyone who isn't learning something these days just doesn't want to. Having Claude, ChatGPT, or Gemini by your side while you learn anything is like having the world's best expert right there with you... Actually, several of them, because you can use them in parallel!

1w ago

55182.9K

Yuki Saito @ysaito_human

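Research

(The speaker has already announced this himself, but) next week, on Tuesday, May 26, Heiga Zen (@heiga_zen) of Google DeepMind Tokyo will give the Special Lecture on Mathematical Engineering (計数工学特別講義). The lecture is aimed at fourth-year students in the department, but other undergraduates, graduate students, faculty, and staff are also welcome to attend, so please come! 😊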

Heiga Zen (全炳河)@heiga_zen

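[University of Tokyo Special Lecture on Mathematical Engineering] On the afternoon of Tuesday, May 26, I will be giving the "Special Lecture on Mathematical Engineering" (計数工学特別講義) in the Department of Mathematical Engineering and Information Physics, Faculty of Engineering, the University of Tokyo. I plan to talk about where Google's AI research and development stands today and where it is headed 🤖 The department's special lectures have a long history as a place to learn about applications of advanced science and technology, mainly from industry practitioners, and I am very honored to be invited. I look forward to meeting the students in Engineering Building 6 on the Hongo Campus ✨ I hope the day will be more than a one-way lecture and that I can discuss AI with everyone who attends 🤝 https://t.co/58v4ZONeoi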

1w ago

391309.1K

てんねん @munou_ac

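Research

Does pasting an external link in the post body get you deboosted? There is no such statement in the published source code, in either the January 2026 version or the May version. That said, the newly added "Grox" does "look at" posts. If there isn't enough context, the post may end up spreading less as a result.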


てんねん@munou_ac

http://x.com/i/article/2055905934002790400

1w ago

25462.3K

Yohei @yoheinakajima

Research

in this embedding analysis, compared to other futuristic tech terms, AGI maps more strongly to anticipation than fear, especially in english, japanese, chinese, and italian it also shows stronger negative associations across several european languages: anger in spanish/italian, disgust in german, and shame/envy in french, german, and portuguese

Yohei@yoheinakajima

now i compared similarity of 12 futuristic tech terms with 12 emotions across 12 languages and then normalized them you can see charts like how does "AGI" feel across languages* *this is in relation to the other 11 terms
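One way such a comparison could be computed is sketched below, assuming cosine similarity between term and emotion embeddings followed by a per-emotion z-score across terms. The post does not say which embedding model or normalization was used, so the z-score step, the function names, and the toy 2-D vectors are all assumptions for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def normalized_affinity(term_vecs: Dict[str, Vector],
                        emotion_vecs: Dict[str, Vector]) -> Dict[str, Dict[str, float]]:
    """Term-emotion similarity, z-scored per emotion across all terms.

    Each score says how strongly a term maps to an emotion relative to
    the other terms, which is the assumed meaning of "normalized" here.
    """
    raw = {t: {e: cosine(tv, ev) for e, ev in emotion_vecs.items()}
           for t, tv in term_vecs.items()}
    scores: Dict[str, Dict[str, float]] = {t: {} for t in term_vecs}
    for e in emotion_vecs:
        column = [raw[t][e] for t in term_vecs]
        mean = sum(column) / len(column)
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
        for t in term_vecs:
            scores[t][e] = 0.0 if std == 0 else (raw[t][e] - mean) / std
    return scores


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings of one language.
    terms = {"AGI": [1.0, 0.0], "robot": [0.0, 1.0], "metaverse": [1.0, 1.0]}
    emotions = {"anticipation": [1.0, 0.0], "fear": [0.0, 1.0]}
    for term, row in normalized_affinity(terms, emotions).items():
        print(term, {e: round(s, 2) for e, s in row.items()})
```

Running the same function once per language, with that language's embeddings for the terms and emotion words, would give one chart per language of the kind the post describes.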

1w ago

9272.9K

てんねん @munou_ac

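Research

Action mecha girl
Zudah! ＿＿＿＿＿＿＿＿＿＿
"Grox," a new element revealed when the source code of the latest algorithm was published
The final installment of the series is "How are posts classified, and how are they linked to similar posts?"
People in the AI illustration community each have their signature themes, right?
In my case: action mecha girl, mecha girl, samurai mecha girl, cute mecha girl, mecha joshi, knight mecha girl, 女子甲生, full-armor mecha girl, Heartbreak Girls, Elf Girl
That's a lot of signature themes!
If you post your signature theme every day, those posts will probably keep reaching the same kind of users.
And those users' recommended timelines will fill up with similar illustrations.
Roughly speaking, what is involved in this control is the embedding.
More precisely, the multimodal post embedding.
By that logic, if you mix different motifs like I do, and the art style also changes every time, things may not work out as described above.
For example, even the same mecha joshi may reach different people each time. So from an account-management standpoint, impressions are unstable.
Conversely, with the same art style and the same motif, impressions should be stable and engagement should be high as well.
That is because the posts reach the same users who like illustrations of that motif.
From a management standpoint, that is the right approach and mine is the bad move.
But that's fine.
I'm making the illustrations I like.
I don't imitate others and I don't pander. I go my own way.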


てんねん@munou_ac

http://x.com/i/article/2057395254752636928

5d ago

48221.8K