This isn't meant to be an AI newsletter, but rather a "what's the most intellectually interesting stuff happening in the world this past week" one, and yet we could probably write a full book just on what has happened in AI this week (without even counting US politics!).
It already feels like the week of the year in AI, and it's only the 4th! So this edition is going to be AI-heavy.
The absolute meltdown on X about DeepSeek's new models has been fascinating to dig into - all of the tenets of the modern AI world are being reconfigured in real time, on a weekly basis, around the world. OpenAI surely wasn't expecting its Operator launch and the $100-500B Stargate announcement to be so thoroughly drowned out by a small team of Chinese quants.
There are so many interesting topics to talk about that we'll try to distill a little bit of each. US vs China is clearly the core one, with people now realizing that what China did for consumer electronics over a century, and for cars over decades, it is now doing in AI in a matter of years. In robotics it has leapfrogged everyone from the start.
We will keep living with this massive tension: China destroying the fabric of US capitalism by annihilating margins, and showing that the world is truly global even in advanced tech. How we navigate the security and biases of these models (they do automatically censor the name of Xi Jinping) will be one of the most interesting things to watch.
Europe is, sadly, as always, nowhere to be seen - with the only glimmer of hope being the AI push by the UK's new administration, led by Matt Clifford. The actual European Union itself is in deep, deep trouble.
What we're reading
The Short Case for Nvidia Stock
I've spent more than an hour reading and understanding this post, because it gives a very linear, all-encompassing, detailed and beginner-friendly description of the whole AI situation. The author uses it all to lay out the different threats to NVIDIA's profit margins, but it's also the best explanation of the various technical innovations that led to DeepSeek R1:
- FP8 vs FP32: by using 8-bit instead of 32-bit numbers they lose some precision but gain vast memory savings (the other models lose that precision in later compression anyway); see the toy sketch after this list.
- Multi-token prediction: "Most Transformer based LLM models do inference by predicting the next token— one token at a time. DeepSeek figured out how to predict multiple tokens while maintaining the quality you'd get from single-token prediction. Their approach achieves about 85-90% accuracy on these additional token predictions, which effectively doubles inference speed without sacrificing much quality. The clever part is they maintain the complete causal chain of predictions, so the model isn't just guessing - it's making structured, contextual predictions."
- Multi-head Latent Attention (MLA): clever compression of the attention key-value cache built into the model.
- GPU communication efficiency: the DualPipe algorithm and custom communication kernels mean they only need about 20 of their GPUs' streaming multiprocessors (SMs) for communication, leaving the rest free for computation.
- Mixture-of-Experts: "The beauty of the MOE model approach is that you can decompose the big model into a collection of smaller models that each know different, non-overlapping (at least fully) pieces of knowledge. DeepSeek's innovation here was developing what they call an "auxiliary-loss-free" load balancing strategy that maintains efficient expert utilization without the usual performance degradation that comes from load balancing."
- Proper Reinforcement Learning: trading a small performance hit for much more readable and consistent outputs.
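To make the FP8-vs-FP32 point concrete, here is a minimal Python sketch of the memory/precision trade-off. It uses naive symmetric int8 quantization purely as a stand-in (my own simplification; DeepSeek actually trains in proper FP8 floating-point formats, which this does not reproduce), but the core idea it illustrates is the same: a quarter of the bytes per weight in exchange for a small, bounded error.

```python
import numpy as np

# Toy stand-in for low-precision weights: symmetric int8 quantization.
# NOTE: real FP8 training (e.g. e4m3) is a different format; this only
# illustrates the memory-vs-precision trade-off described above.

rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal((4096, 4096)).astype(np.float32)

scale = np.abs(weights_fp32).max() / 127.0            # one scale per tensor
weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)
dequantized = weights_int8.astype(np.float32) * scale

print(f"fp32 size: {weights_fp32.nbytes / 1e6:.1f} MB")   # ~67.1 MB
print(f"int8 size: {weights_int8.nbytes / 1e6:.1f} MB")   # ~16.8 MB
print(f"mean abs error: {np.abs(weights_fp32 - dequantized).mean():.5f}")
```

The reconstruction error stays tiny relative to the weights themselves, while the memory (and memory-bandwidth) savings are immediate, which is exactly the trade the post argues DeepSeek is exploiting at scale.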
Another article that explains R1 in detail, also breaking down all the math. The main aspects covered are:
1. Chain-of-Thought reasoning
2. Reinforcement Learning
3. GRPO (Group Relative Policy Optimization)
4. Distillation
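Of the four, GRPO is probably the least familiar, so here is a minimal sketch of its central trick under my own simplified assumptions: several answers are sampled for the same prompt, and each answer's advantage is simply its reward normalized against the group's mean and standard deviation, so no separate value/critic model is needed. (The full algorithm then feeds these advantages into a PPO-style clipped policy update, which this sketch omits.)

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Core GRPO idea: score each sampled answer relative to its own group.

    `rewards` holds one scalar reward per answer sampled for the SAME prompt.
    The advantage is the z-score within that group, which replaces the learned
    value function (critic) that PPO would normally use as a baseline.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: 6 sampled answers to one math problem, reward 1 if correct else 0.
print(group_relative_advantages([1, 0, 0, 1, 1, 0]))
# Correct answers get a positive advantage, wrong ones a negative one; the
# policy is then nudged (via the clipped PPO-style objective, omitted here)
# toward the higher-advantage answers.
```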
A pretty long but very stimulating read by Packy at Not Boring. He paints a wildly optimistic picture which I sadly do not agree with. Pmarca is also one of the optimists here, with Elon at the far doom end of the spectrum and Sama somewhere in the middle. I need to reorder my thoughts a little on this topic, but I do think that "this time it's different", because previous technology couldn't replicate a human 1:1. Previous technology could only augment productivity or take over specific tasks, freeing humans up to do something else. Now all of that "something else" (short of going to the movies or to the beach) can also be done better and faster by a humanoid robot and/or an AI. So... a whole new paradigm starts, fairly well articulated in Consumption must flow.
Chris Sacca on Tim Ferriss's Podcast
It is 3 hours long but it is a must-listen, specifically the bit on how AI is going to reshape everything and what to do about it. Not surprisingly, my view is very close to his: "if you're paying attention, it's overwhelming and we're fucked".
A Revolution in How Robots Learn
"Roboticists increasingly believe that their field is approaching its ChatGPT moment." Extremely generalist piece that could be a good sunday read for someone that hasn't really been paying attention to everything that's going on in robotics and AI, with a parallel to children learning. Another similar article is "What’s next for robots" from MIT Press.
Science Corp. Aims to Plant Ideas in Brains with New Device
"Monkeys with GPUs coming soon to a casino near you" Max is a founder I had invested in as one of my first investments with Transcriptic. He then went on to cofound Neuralink with Elon Musk and departed to create his new company Science Corp. Max is now "creating an external store of neurons affixed to someone’s head and those neurons having the ability to offset some of the effects of serious illnesses. The idea is that, if your brain has been damaged, the manufactured neurons head out from the interface and into your tissue and make new connections that reroute around compromised tissue"
SpiRobs: Logarithmic spiral-shaped robots for versatile grasping across scales
China is just going at a ridiculous speed. Something of this complexity, built this quickly, would have been unimaginable just a few years ago. You really have to watch this video.
Quick bytes:
- ByteDance wasn't waiting around on DeepSeek, and released its own incredibly cheap MoE model. What a week. Here is also their video-to-depth model.
- Guy in Berkeley replicates R1 training. It's legit.
- R1 can run on a Raspberry Pi.
- AI reads floorplans and does the job of an entire architectural firm's department.
- Adam CAD is pretty mindblowing.
- Geoff at Bedrock thinks seed funding might be over.
- Some people are dissing American Dynamism as just marketing / a smoke screen. a16z maintains that edtech is core to American Dynamism, as we have to restructure the full fabric of society to raise a generation that cares about progress. In the meantime, they're closing their UK office (which seemed mostly focused on crypto).
- AI tutors in Nigeria are absolutely smashing it: doing the work of 2 years of human tutoring in just 6 weeks.
- A new European unicorn emerges. Neko raises $260M Series B.
- Quantum computing gets another fund with QDNL Participations.