Robin Rombach

Image generation

@robrombach

Krawallkrümel. Generative Models at https://t.co/1xqMb617gc, made with ❤️

13K Followers · 561 Following · 738 Posts

Recent posts

Robin Rombach @robrombach

🔥🔥🔥

Modular @Modular

Generate images in less than 1 second. 99% cheaper than NanoBanna. 🚀 😱 Our latest 26.2 release ships FLUX.2 image generation with a 4.1x speedup over torch.compile on NVIDIA Blackwell - translating to a 5.5x TCO advantage with AMD MI355X. Read more ⬇️

3/24/2026

Robin Rombach @robrombach

🤖🤖🤖🌲🌲🌲

Patrick Esser @pess_r

Fixed vision encoders like DINO have driven impressive progress in more learnable representations for generative modeling - but there is no universal variant across modalities, and they do not scale with the generative model.
We introduce our self-supervised framework, Self-Flow, that builds learnability directly into flow models, working in a unified and scalable way across image, video and audio.
Particularly excited about the gains on video-action prediction: Beyond the overall success rate improving substantially, more complex tasks - like "Open and Place" - see some of the clearest gains.
So many interesting research questions to explore to make 🤖 go brrr.
Super glad to be working with my amazing colleagues @hila_chefer, Dominik, @dustin_podell, Vikash, @Vinh_Suhi, Antonio and @robrombach - as well as the whole @bfl_ml team!
arxiv: https://t.co/eP7ip58Tff
project page: https://t.co/GNShpBMEQ1

3/11/2026


Robin Rombach @robrombach

New paper out! We present a training method for multimodal generative models, called Self-Flow, which combines classic flow matching and representation learning.
Why? Unlike most representation alignment methods, our new approach does not require external, pretrained models and thus scales gracefully to joint multimodal training on images, videos and audio.
How? It combines per-timestep flow matching with dual-timestep representation learning, improving the models' internal representations.
This approach outperforms prior methods and shows promising scaling behavior in multimodal pretraining. It also enables downstream applications such as action prediction for embodied AI.
webpage+paper: https://t.co/qzGQGj8JYk
code: https://t.co/edhfdVEqSf
Credit to @hila_chefer, @pess_r, Dominik, @dustin_podell, Vikash, @Vinh_Suhi and Antonio.
If you enjoy doing open research like this, come and join BFL! We are actively hiring 🌲

3/4/2026


Robin Rombach @robrombach

Deep Mandel

1/28/2026


Robin Rombach @robrombach

Hey FLUX, create a gif using my profile pic. make no mistakes bitte

Black Forest Labs @bfl_ml

BFL Skills
Packaged FLUX into a single install command for agents.
Install once. Your coding agent handles the rest - model selection, prompting, API integration. All built in.
Sub-second generation and editing with [klein]. Highest quality with [max]. Text rendering with [flex].
Works in Claude Code, Cursor, and other IDEs.
> npx skills add black-forest-labs/skills

1/26/2026
