Title | Author(s) | Date | Word count | HN points |
---|---|---|---|---|
Driving model performance optimization: 2024 highlights | Pankaj Gupta | Jan 14, 2025 | 1530 | - |
Private, secure DeepSeek-R1 in production in US & EU data centers | Amir Haghighat, Philip Kiely | Feb 11, 2025 | 1274 | - |
Testing Llama 3.3 70B inference performance on NVIDIA GH200 in Lambda Cloud | Pankaj Gupta, Philip Kiely | Feb 11, 2025 | 1033 | - |
Baseten Chains is now GA for production compound AI systems | Marius Killinger, Tyron Jung, Rachel Rapp | Feb 12, 2025 | 1123 | - |
How multi-node inference works for massive LLMs like DeepSeek-R1 | Phil Howes, Philip Kiely | Feb 15, 2025 | 1303 | - |