@wchan212

William Chan

Image generation

Cofounder/CTO @ideogram_ai

1.5K Followers · 484 Following · 118 Posts

Recent posts

We've been working really hard on Ideogram Character 🥹
input: 1 image
output: consistent characters


https://canadaspends.com/en/spending
pretty cool visualization, very easy to digest and insightful! every Canadian should take a look and understand the revenue/spending

$AMD chip perf is great, but system-level (compiler, networking, cloud ecosystem) perf is still catching up
if your graph deviates from benchmarking models (e.g., Llama), work is needed to extract performance (e.g., gemm tuning, custom flash-attn impl, manual op fusion)
but the future is very bright with Triton, ROCm, MI450 networking, etc...
the path to $AMD success is clear, it's all about execution @AnushElangovan
disclaimer: speaking from the perspective of JAX/XLA on $AMD, PyTorch is most likely much better

Photon Semi @techvisionasia

Think about the next question: who has the best $/memory bandwidth, which is more important than flops? $AMD $TSM $AVGO $MRVL $NVDA

$AMD best $/flop, but significant effort to extract perf
$GOOG TPU is great system perf (networking, goodput), but chip perf behind
$NVDA lets you focus on solving AI, could be best value when you factor in TTM

~30% of my portfolio is $TSM
1. strong mgmt (CC Wei & co)
2. tech leadership (N2/A16/CoWoS/SoIC/CPO)
3. ecosystem (design, ip)
4. relentless execution
5. whether $AMD, $NVDA or someone else wins, $TSM wins
bonus: capex moat
risk: geopolitical

working with $GOOG JAX Pallas, my immediate reaction is that this is Python CUDA. got $AMD MI3xx chips to work quite easily, dense matmuls are fast, but significant effort is needed to tune whole graph perf. IMO, the $NVDA CUDA moat is disappearing, but the total $NVDA ecosystem remains strong

silly question for the chip gurus -- why don’t we see even more hybrid bonding in ML accelerators? N4 reticle limit SRAM/IOD die on bottom, stack N3P/N2 compute tiles on top. is L3 cache too high latency to be useful? or has only $AMD mastered the interconnects? @dylan522p 🙏