507 Hacker News submissions with 1 point or greater

HN Points HN Title (Links to original post) Submitted Date
586 Uncensor any LLM with abliteration 2024-06-13
240 Microsoft Phi-2 model changes licence to MIT 2024-01-06
197 Space secrets leak disclosure 2024-06-01
181 Best 7B LLM on leaderboards made by an amateur following a medium tutorial 2024-01-05
168 Llama 3 8B is almost as good as Wizard 2 8x22B 2024-04-19
167 Nvidia releases NVLM 1.0 72B open weight model 2024-10-02
163 Explaining the SDXL Latent Space 2024-02-05
152 Hugging Face and Google partner for AI collaboration 2024-01-25
131 A CC-By Open-Source TTS Model with Voice Cloning 2024-11-04
127 FineWeb: Decanting the web for the finest text data at scale 2024-06-02
103 HuggingChat: Chat with Open Source Models 2024-02-21
95 More than 80 AI models from Qualcomm 2024-02-28
94 LLaMA-Pro-8B 2024-01-06
82 Apple/OpenELM: Efficient Open-Source Family Language Models 2024-04-24
75 YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license 2024-04-18
66 Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't 2024-01-22
58 MSFT's WizardLM2 models have been taken down 2024-04-16
54 LiteLlama-460M-1T has 460M parameters trained with 1T tokens 2024-01-07
52 Fine-Tuning LLMs to 1.58bit 2024-09-18
51 LLaMA 3 70B Llamafiles 2024-04-19
47 Improving Parquet Dedupe on Hugging Face Hub 2024-10-08
46 Open-LLM performances are plateauing 2024-06-29
33 Mixtral-8x22B on HuggingFace 2024-04-10
31 General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model 2024-09-11
30 Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat 2024-04-12
30 OpenFLUX.1 2024-10-04
29 Mistral 7B v0.2 2024-03-31
28 Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video 2024-04-16
26 Llama-3.2-3B-Instruct-uncensored 2024-09-27
26 Llama can now see and run on your device – welcome Llama 3.2 2024-09-25
25 New Phi-3.5 Models from Microsoft, including new MoE 2024-08-20
25 LLM: Transformer Is Linear 2024-05-24
23 HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B 2024-11-05
22 Lineage Explorer for open source models – Hugging Face Space 2024-01-18
22 Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included 2024-08-14
21 Llama 3.2 2024-09-25
19 Fine-tune and deploy open LLMs as containers using AIKit - Part 1 2024-06-06
19 makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch 2024-01-23
18 HuggingFace to Replace Git LFS with Xet 2024-08-23
18 Fake Insects: a game where you have to identify AI-generated insects 2024-08-17
18 Mixtral-8x22B-Instruct-v0.1 2024-04-17
18 Hermes-2-Pro-Llama-3-8B 2024-05-01
17 StableLM-2-12B 2024-04-08
16 NuExtract: A LLM for Structured Extraction 2024-06-29
16 An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct 2024-06-09
16 Phi-3 Weights Released 2024-04-23
16 New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks 2024-07-31
16 Miqu 70B – possible leak of the mistral-medium LLM 2024-01-29
15 Ollama can run any GGUF Model on Hugging Face Hub now 2024-10-16
14 Llama-3-70B-Instruct-Gradient-1048k 2024-05-04
14 New finance LLM passed the CFA Level III exam 2024-07-31
14 Run Mistral 7B model using less than 4GB of memory on your Mac with CoreML 2024-07-23
14 Stable Diffusion 3 Medium Released 2024-06-12
14 Pre-computed vector embeddings available on HuggingFace 2024-01-22
13 Yi-9B-200K 2024-03-17
13 An Introduction to Vision-Language Modeling 2024-05-28
12 FineWeb: 15T tokens of the finest data the web has to offer 2024-04-21
12 Language model can listen while speaking 2024-08-07
12 ML for 3D Course on Hugging Face 2024-05-16
12 Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs 2024-04-09
12 Command-R: open weights 35B params / 128k tokens context length model by Cohere 2024-03-11
12 StarCoder2 and The Stack v2: new code LLMs and dataset 2024-02-28
12 Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model 2024-03-28
11 HuggingFace Is Down 2024-02-28
11 Experiments with Bitnet 1.5 (Ngmi) 2024-03-23
11 FalconMamba 7B: The first attention-free and general-purpose pure Mamba model 2024-08-13
11 NPC-Playground, a 3D playground to interact with LLM-powered NPCs 2024-06-05
11 Open LLM Leaderboard 2024-01-02
10 CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher 2024-06-15
10 Whisperfile 2024-08-19
10 Llava Model for Video 2024-05-16
10 Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption 2024-01-31
10 Vector embeddings model for medical literature 2024-01-08
9 Not All Language Model Features Are Linear 2024-05-25
9 Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct 2024-10-16
9 Perspectives for first principles prompt engineering 2024-08-20
9 ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models 2024-05-28
9 Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B 2024-01-04
9 Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B 2024-11-18
9 MIT Researchers Unveil New Method to Improve LLM Inference Performance 2024-10-04
9 Aryn/deformable-detr-DocLayNet – open-source Layout Model 2024-07-31
9 AIMO (AI Math Olympiad) progress prize winning solution 2024-07-10
9 Mistral-7B-v0.3 released on HuggingFace 2024-05-22
9 Microsoft Phi-3 3.8B model with 128k Context 2024-04-23
9 The Stack v2: a 3B files in 600 programming languages dataset 2024-03-07
8 NousResearch/Nous-Hermes-2-Llama-2-70B 2024-02-12
8 Show HN: We made an encrypted DNA testing app using Homomorphic Encryption 2024-10-02
8 NexusRaven-V2-13B 2024-01-25
8 Open-source 70B model surpass GPT-4o and Claude 3.5 on Arena Hard 2024-10-15
8 Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released 2024-09-17
8 Mistral AI Pixtral 2024-09-11
8 Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces 2024-02-14
7 Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone 2024-04-23
7 Am I in the Stack? 2024-03-20
7 Common Corpus: the largest public domain dataset for training LLMs 2024-03-20
7 Hugging Face launches Agents 2.0 2024-05-13
7 OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 2024-02-26
7 Mini-Dust3r: A miniature version of dust3r running in a HuggingFace Space 2024-05-16
7 1B+ words corpus of original texts and experimental post-OCR correction output 2024-04-26
7 Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out 2024-03-14
7 Grandmaster-Level Chess Without Search 2024-02-08
7 Create a Web Interface for Your LLM in Python 2024-01-23
6 New leaderboard drop: Judge Arena 2024-11-19
6 Phased Consistency Model 2024-05-29
6 A Llama 70B finetune that has reflection baked into its weights 2024-09-05
6 Show HN: Understand politics by visualising manifesto embeddings 2024-07-07
6 Mistral releases the v0.3 of its 7B LLM 2024-05-22
6 Idefics2: A Powerful 8B Vision-Language Model for the Community 2024-05-14
6 Show HN: Open-source LLM for data labeling 2024-05-08
6 Dolphin-2.9-Llama3-8B 2024-04-21
6 Introduction to 3D Gaussian Splatting 2024-04-02
5 Gemma-2 2B beats GPT3.5 on Chatbot Arena 2024-07-31
5 FineWeb-Edu: new 1.3T tokens web dataset 2024-06-02
5 Wall Street Journal Hedcut Stable Diffusion Model 2024-01-23
5 Hertz-dev is an open-source model for full-duplex conversational audio 2024-11-16
5 New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Hierarchies 2024-11-11
5 Hugging Face launches HUGS: managed containers for on-premise model deployment 2024-10-23
5 Janus-1.3B: Unifying Multimodal Understanding and Generation 2024-10-18
5 Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling 2024-10-16
5 Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller 2024-10-16
5 Whisper-Large-v3-Turbo 2024-10-03
5 Show HN: Automatic chaptering – From raw transcripts to structured documents 2024-09-09
5 TabReD: A Benchmark of Tabular Machine Learning In-the-Wild 2024-07-04
5 Microsoft releases weights for Florence-2 vision model 2024-06-19
5 Phi-3-medium-128k-instruct 2024-05-22
5 Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs 2024-04-13
5 Gretel: Synthetic Text to SQL Dataset 2024-04-04
5 Detecting performance and ethical vulnerabilities in popular Hugging Face models 2024-03-21
5 Design2Code: How Far Are We from Automating Front-End Engineering? 2024-03-10
5 Genie: Generative Interactive Environments 2024-02-26
5 TTS Arena: Benchmarking TTS Models in the Wild 2024-02-25
5 Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral 2024-02-20
4 Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard 2024-01-26
4 Octopus V4: a graph of language models 2024-05-02
4 Llama-3 8B Instruct 262k 2024-04-26
4 CodeGemma – an official Google release for code LLMs 2024-04-09
4 Apple Open-Sources LLM DCLM-7B 2024-07-19
4 Open LLM Leaderboard v2 2024-06-29
4 Florence 2, Microsoft OCR Modell 2024-06-20
4 Apple OpenELM Instruct Models 2024-04-24
4 Phi-3 Released 2024-04-23
4 GemMoE: An 8x8 Mixture Of Experts based on Gemma 2024-03-13
4 Pearl-3x7B, an xtraordinary Mixture of Experts (MoE) for data science 2024-02-07
4 Introduction to State Space Models (SSM) 2024-01-24
4 HtmlRAG: HTML Is Better Than Plain Text for RAG Systems 2024-11-06
4 Structured generation with Outlines, now in Rust 2024-10-22
4 Llama 3.2 in the Browser with WebGPU 2024-09-30
4 Multimodal TextImage Augmentation for Document Images 2024-09-14
4 'Reflection 70B' AI model could be the answer to pesky LLM hallucinations 2024-09-06
4 Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers 2024-08-14
4 FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving manner 2024-08-13
4 Introduction to Ggml 2024-08-13
4 Google releases Gemma 2 2B, ShieldGemma and Gemma Scope 2024-08-01
4 Gemma 2 2B Release 2024-08-01
4 Extracting Concepts from LLMs: Anthropic's recent discoveries 2024-06-08
4 EasyAnimate: End-to-end solution for high-resolution and long video generation 2024-06-04
4 Grokked Transformers Are Implicit Reasoners 2024-05-27
4 Paligemma: A versatile and lightweight vision-language model (VLM) 2024-05-14
4 4M Context – Llama-3-8B-Instruct 2024-05-09
4 ReFT: Representation Finetuning for Language Models 2024-04-05
4 Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage 2024-03-22
4 Show HN: Chatbot Guardrails Arena 2024-03-21
4 Quanto: A PyTorch Quantization Toolkit 2024-03-18
4 On-device background removal with Transformers.js 2024-02-07
4 SegMoE: Segmind Mixture of Diffusion Experts 2024-02-05
4 NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs 2024-02-03
4 HuggingChat Assistants: Open source models with custom instructions 2024-02-02
3 Show HN: Turn Any Article into a Conversation-Like Podcast 2024-05-22
3 Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI 2024-10-15
3 AI has a problem with objectifying women 2024-05-28
3 Linus Torvalds Chat Bot 2024-02-02
3 ChatQA: Building GPT-4 Level Conversational QA Models 2024-01-19
3 Frames: Factuality, Retrieval, and Reasoning MEasurement Set 2024-10-01
3 Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and it's sick 2024-09-20
3 Chronos-T5 (Tiny) – pretrained time series forecasting models 2024-08-14
3 HF for Legal, an open-source community on Hugging Face 2024-07-01
3 LegalKit, French labeled datasets built for legal ML training 2024-06-27
3 Nvidia releases ChatQA-1.5 in violation of Llama 3 license 2024-05-02
3 Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding 2024-04-26
3 Everyone seems to have forgotten about Gemma 2024-04-25
3 Introducing the Open Chain of Thought Leaderboard 2024-04-23
3 Google Gemma 1.1 2B and 7B instruct 2024-04-06
3 Starcoder-2 2024-02-28
3 DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development 2024-02-09
3 Nous-Hermes-2-SOLAR-10.7B 2024-01-02
3 SemScore: Evaluating LLMs with Semantic Similarity 2024-11-06
3 Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints 2024-10-31
3 Hugging Face Now Automatically Detects Leaked Secrets 2024-09-05
3 Selective fine-tuning of Language Models with Spectrum 2024-09-03
3 Idefics3: Open multimodal model based on Llama-3.1-8B 2024-08-09
3 New Google Gemma 2 2B model 2024-07-31
3 Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth 2024-07-29
3 DiLoCo: Distributed Low-Communication Training of Language Models 2024-07-26
3 The largest math dataset of Olympiad problems for training LLMs 2024-07-21
3 SmolLM – Fast and Remarkably Powerful 2024-07-16
3 Whisper WebGPU: Real-time in-browser speech recognition 2024-06-08
3 UGI Leaderboard – Uncensored General Intelligence 2024-06-07
3 Transformers Are SSMs: Generalized Models and Efficient Algorithms Through 2024-06-04
3 Recovering 4D World from Monocular Video 2024-05-29
3 LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models 2024-05-26
3 Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data 2024-05-26
3 Phi-3 in-browser inference using WebGPU 2024-05-08
3 Show HN: GPT Fine-Tune Formatter 2024-05-07
3 InstantMesh: Efficient 3D Mesh Generation from a Single Image 2024-04-15
3 Mixture of Finetuned and GPT4 Model 2024-04-07
3 H2O-Danube2-1.8B-Chat 2024-04-07
3 Yi-9B 2024-04-05
3 Dolphin-2.8-mistral-7B-v02 2024-04-03
3 Common Corpus – Start of the largest public domain dataset for training LLMs 2024-03-20
3 MoAI: Mixture of All Intelligence for Large Language and Vision Models 2024-03-14
3 OpenChat-3.5-0106-Gemma 2024-03-10
3 Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping 2024-02-23
3 Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens 2024-02-22
3 Stable Diffusion XL Lightning 2024-02-21
3 Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases 2024-02-03
3 Show HN: A lineage explorer for open source models and datasets 2024-01-23
3 Aim – An Apple Collection 2024-01-19
3 LLaVA-3B 2024-01-01
2 Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM 2024-04-19
2 Try Qwen2.5-Coder-32B on HuggingChat 2024-11-12
2 An orthogonalized AI to introduce an unengaged melancholic style 2024-06-13
2 Pearl-7B-slerp, an xtraordinary 7B model for maths 2024-02-05
2 Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station 2024-01-28
2 7B model from Snorkel tops Alpaca Eval 2.0 leaderboard 2024-01-24
2 LongVU – New Video LLM from Meta 2024-10-24
2 Hacker News Comments Dataset 2024-10-11
2 HuggingFace Accelerate 1.0.0 2024-10-07
2 Mistral-Small-Instruct-2409 2024-09-17
2 HuggingChat: Chat with Llama 3.1 (70B and 405B) 2024-07-23
2 Ocean Biodiversity Information System on Hugging Face 2024-07-21
2 CommonCanvas image generation from CC-licensed images – models, dataset released 2024-06-07
2 Show HN: PodGen generate podcasts on any topic 2024-06-01
2 Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models 2024-05-28
2 The Waifu Research Department 2024-05-16
2 Yi-1.5 LLM Models Released 2024-05-12
2 Fietje: An open and efficient LLM for Dutch 2024-05-02
2 Simple Multimodal LLM from Scratch 2024-04-23
2 Stability Releases Code Instruct 3B 2024-04-02
2 Mistral 7B v0.2 2024-04-01
2 PolarsBot, a New HuggingChat Assistant 2024-03-25
2 Easy and low cost model training on HF "DGX cloud" 2024-03-19
2 Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's Leaderboard 2024-02-19
2 LLMs can learn useful guidelines from their own mistakes 2024-02-12
2 Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard 2024-02-11
2 Aanaphi-2 3B 2024-02-09
2 Playground for Hugging Face Models 2024-02-05
2 Hallucinations Leaderboard 2024-01-29
2 Fine-tune Wav2Vec2-BERT for low resource speech recognition 2024-01-23
2 InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds 2024-01-22
2 Yayi2-30B-Llama 2024-01-01
2 Pixtral-Large-Instruct-2411 2024-11-18
2 FLUX.1-Dev LoRA Outfit Generator by TryOn Labs 2024-11-06
2 Contextual Document Embeddings 2024-11-01
2 Code a Simple RAG from Scratch – Hugging Face Community Article 2024-10-30
2 OmniParser for Pure Vision Based GUI Agent 2024-10-25
2 Hugs – Scale Your AI with Open Models 2024-10-23
2 Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins 2024-10-23
2 Pickle Scanning 2024-10-23
2 New Video Generation Model:Allegro 2024-10-22
2 TxT360 2024-10-18
2 Dataset About Where 30k+ Startups Trend 2024-10-18
2 Nvidia Nemotron 2024-10-17
2 Fixing Gradient Accumulation 2024-10-16
2 Animate-X: Universal Character Image Animation with Enhanced Motion 2024-10-15
2 SOTA Open Source Text to Video Model 2024-10-14
2 Exploring the Daily Papers Page on Hugging Face 2024-09-24
2 Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu) 2024-09-23
2 Recreating o1 at Home with Role-Play LLMs 2024-09-21
2 FineVideo: Annotated YouTube Dataset by HuggingFace 2024-09-12
2 Remove Background by Text 2024-09-12
2 Labeled Image generation using Meta Llama 3.5 2024-08-31
2 Scaling robotics datasets with video encoding 2024-08-30
2 New FashionCLIP and SigLIP Classification Demo 2024-08-28
2 Mozilla/TriLM-Llamafile · Hugging Face 2024-08-26
2 Play: How random can a human brain truly be? 2024-08-24
2 FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs 2024-08-21
2 Flux Dev 1 model that creates half_illustration images 2024-08-21
2 LLMs as Image Generators with Canonical Codec Representations 2024-08-19
2 Instant in-browser demo of SmolLM 2024-08-18
2 Marqo-FashionCLIP: New Embedding Model for Fashion 2024-08-14
2 A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine 2024-08-07
2 Generate and Export Segmentation Masks Using Meta's SAMv2 2024-07-31
2 HuggingChat: Chat with Llama 3.1 405B 2024-07-25
2 Meta-Llama-3.1-405B 2024-07-23
2 Apple's DCLM model shares data&training code with weights 2024-07-20
2 Predicting Multiplication with GPT-2 2024-07-20
2 Qwen2 Technical Report 2024-07-16
2 Gemma-2-27B-it llamafile 2024-07-03
2 OpenRAIL: Towards open and responsible AI licensing frameworks (2022) 2024-07-03
2 New LLM Agent writing actions in Python code tops the GAIA agent benchmark 2024-07-01
2 Stable Diffusion 3 Medium Online Demo, Free 2024-06-12
2 To Believe or Not to Believe Your LLM 2024-06-11
2 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs 2024-06-04
2 Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series 2024-05-31
2 Training and Finetuning Embedding Models with Sentence Transformers v3 2024-05-30
2 ChatTTS – open-source TTS model designed specifically for dialogue scenario 2024-05-29
2 Matryoshka Multimodal Models 2024-05-28
2 Aya 23: Open Weight Releases to Further Multilingual Progress 2024-05-28
2 HuggingFace Hub Incident Post Mortem 2024-05-24
2 Cohere Updates Weights for Aya 2024-05-23
2 Hugging Face on AMD Instinct MI300 GPU 2024-05-23
2 Show HN: Generate a Quiz from Any Url 2024-05-17
2 Show HN: EmuBert – the first open encoder model for Australian law 2024-05-14
2 New Yi 1.5 models under Apache 2.0 2024-05-12
2 Building Cost-Efficient Enterprise RAG Applications 2024-05-10
2 Google codegemma-1.1-7B-it 2024-05-03
2 Introduction to Matryoshka Embedding Models 2024-05-03
2 Iterative Reasoning Preference Optimization 2024-05-02
2 GPT-2 2024-05-01
2 Fine-tune Llama 3 with ORPO 2024-04-23
2 In-browser text-to-music generation using musicgen-small 2024-04-20
2 Compression Represents Intelligence Linearly 2024-04-16
2 Bringing serverless GPU inference to Hugging Face users 2024-04-16
2 From Words to Numbers: Your LLM Is a Capable Regressor 2024-04-12
2 Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace 2024-04-11
2 TinyTimeMixer: Open-source time series LLM by IBM 2024-04-09
2 Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction 2024-04-05
2 Command R+ 2024-04-04
2 Demo of Moondream2 vision language model running in browser 2024-04-03
2 Mini-Jamba 2024-04-01
2 Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs 2024-04-01
2 The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 Bits 2024-03-25
2 Cosmopedia: How to create large-scale synthetic data for pre-training 2024-03-21
2 Playground-v2.5-1024px-Aesthetic 2024-03-16
2 Gemini 1.5: Unlocking multimodal understanding across tokens of context 2024-03-15
2 Better RAG 1: Advanced Basics 2024-03-15
2 Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks 2024-03-13
2 LLM Red-Teaming Resistance Leaderboard 2024-03-01
2 Show HN: Visualize how you split your document into chunks for RAG applications 2024-02-27
2 From OpenAI to Open LLMs with Messages API on Hugging Face 2024-02-23
2 C4: colossal cleaned version of Common Crawl's web crawl corpus 2024-02-21
2 Constitutional AI with Open LLMs 2024-02-01
2 Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna AI 2024-01-31
2 AMUSEd: Efficient Text-to-Image Generation 2024-01-29
2 Minillama – 4.1 MB LLM for testing 2024-01-20
2 StableLM 2 Zephyr 1.6B 2024-01-20
2 Local vector embeddings index for analyzing ArXiv papers 2024-01-17
2 Stable Zero123 Model Weights get Released. Text to 3D and image to 3D 2024-01-15
2 Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL 2024-01-10
2 OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally 2024-01-10
2 Revolutionizing AI with Audio Classification via Computer Vision 2024-01-02
1 Show HN: Embedding model for PDF page retrieval 2024-08-08
1 Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune 2024-05-02
1 Get Insulted by AI 2024-02-25
1 Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio 2024-07-29
1 SmolLM2: The new, best, and open small language model 2024-11-01
1 The Romulus model series has been released on Hugging Face 2024-09-11
1 I added context data to the TruthfulQA dataset 2024-08-10
1 Chinese AI Community: open-source Heatmap 2024-07-31
1 Multi-token prediction models and baselines 2024-07-04
1 Mixtral or Llama 70B on Google Spreadsheet Thanks to Hugging Face's API 2024-06-17
1 Stupid Filter Corpus (2007) 2024-05-24
1 MMLU-Pro: Advanced edition of MMLU & new Leaderboard 2024-05-15
1 Ratchet and Phi 3 2024-05-01
1 Snowflake Arctic Instruct Open LLM 2024-04-24
1 LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes 2024-04-08
1 MANATEE(lm): Market Analysis based on language model architectures 2024-03-20
1 Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a Single GPU 2024-03-13
1 Serverless Image Similarity with Upstash Vector and HuggingFace Spaces 2024-02-02
1 Dutch Drug-Related Text Classification Model by NOS 2024-01-25
1 Implement Fractional GPUs in Kubernetes to save upto 50% cost 2024-01-22
1 The next person that says textual modalities gets it 2024-01-10
1 LLaMA Pro: Progressive LLaMA with Block Expansion 2024-01-05
1 Halo: Open-Source Health Tracking with Wearables 2024-11-20
1 Releasing the largest multilingual open pretraining dataset 2024-11-14
1 Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for coding 2024-11-12
1 Providing Open Investment Data – 25 years of data 2024-11-11
1 New Sota Text to Image 2024-10-31
1 Stable Diffusion 3.5 Medium 2024-10-29
1 Kolors Virtual Try-On in the Wild 2024-10-28
1 Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval 2024-10-23
1 Stable Diffusion 3.5-large released 2024-10-22
1 Transformers.js v3: WebGPU Support, New Models and Tasks, and More 2024-10-22
1 Allegro – New Open Source Text to Video Generator from Rhymes AI 2024-10-22
1 Distilabel Synthetic Data Generator on Hugging Face 2024-10-17
1 HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance 2024-10-17
1 Show HN: A dataset of all HN submission texts (2006-2024) in Markdown 2024-10-13
1 Scaling AI-Based Data Processing with Hugging Face and Dask 2024-10-10
1 LLMs Know More Than They Show 2024-10-08
1 Document Similarity Search with ColPali 2024-09-29
1 Prithvi WxC: Foundation Model for Weather and Climate 2024-09-24
1 Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance 2024-09-24
1 HN-Style HuggingFace Daily Papers 2024-09-22
1 Qwen2.5-Coder Technical Report 2024-09-21
1 Introducing Community Tools on HuggingChat 2024-09-20
1 InkubaLM-0.4B: Small language model for low-resource African Languages 2024-08-29
1 Diffusion models are real time game engines 2024-08-29
1 Everchanging Quest: Rogue-like game powered by LLMs 2024-08-21
1 xLSTM Model Trained on Music 2024-08-16
1 Qwen2-VL 2024-08-14
1 Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters 2024-08-07
1 Depth Compare – A Hugging Face space to compare different depth models 2024-07-29
1 Insilico Medicine on Hugging Face 2024-07-27
1 LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs 2024-07-26
1 Spreadsheetllm: Encoding Spreadsheets for Large Language Models 2024-07-24
1 Followgraph for Hugging Face 2024-07-23
1 Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text prompts 2024-07-23
1 Scaling Diffusion Transformers to 16B Parameters 2024-07-19
1 DeepSeek v2 Chat (0628) released 2024-07-18
1 The Rise of Agentic Data Generation 2024-07-15
1 Fast SD3 Medium 2024-07-10
1 Agentic RAG: query reformulation and self-query 2024-07-08
1 Meta LLM Compiler 2024-06-29
4 From Files to Chunks: Improving HF Storage Efficiency 2024-11-20
3 Dataset Card for 1M Bluesky Posts 2024-11-27
3 New 2B vision language model that consumes the least memory 2024-11-26
4 Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg 2024-11-24
3 New synthetic dataset beating MSFT and mistral's SFT recipe 2024-11-22
1 Allegro-TI2V: an open source video generation model 2024-11-27
1 PR Puppet Sora 2024-11-27
2 OpenGPT-X 2024-11-26
1 Lightricks/LTX-Video – first real-time video generation model 2024-11-23
425 Llama-3.3-70B-Instruct 2024-12-06
4 Show HN: LatComp – Compress your image into a small and reversible format 2024-11-30
3 Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer 2024-12-06
3 Quantum+AI Qiskit Code Assistant Open Source model 2024-11-27
3 informatiker/20-million-bluesky-posts 2024-11-29
3 Automated GitHub Issue Creation Using Structured Generation 2024-11-29
3 QwQ-32B-Preview 2024-11-27
2 Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI' 2024-11-28
2 The Lichess database is now on Hugging Face 2024-12-06
2 LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs 2024-12-05
2 Releasing: A dataset of two million Bluesky posts 2024-11-27
1 PaliGemma 2 – New vision language models by Google 2024-12-05
1 Open Source Developers Guide to the EU AI Act 2024-12-03
1 LM Studio using models from Hugging Face 2024-12-02
1 IC Light – Shade Generation Model 2024-12-02
348 A Replacement for BERT 2024-12-19
10 Show HN: Downloadable AI Musical Instruments 2024-12-10
9 Spaces ZeroGPU: Dynamic GPU Allocation for Spaces 2024-12-15
8 Scaling Test Time Compute with Open Models 2024-12-16
5 Moonshine – open-source, real-time speech-to-text in the browser 2024-12-19
3 Welcome to the Falcon 3 Family of Open Models 2024-12-17
3 Meta releases family of multimodal models that comprehend hour-long video 2024-12-16
3 Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset 2024-12-09
2 Just launched MilkDropLM model using 32B parameters 2024-12-20
2 FineMath: the best public math pre-training dataset 2024-12-19
2 I-JEPA Huggingface 2024-12-09
2 FineWeb2 dataset: A sparkling update with 1000s of languages 2024-12-08
1 ModernBERT 2024-12-20
1 Show HN: A ML powered text moderation model that outperforms Open AI 2024-12-14
1 Help Us Rank the Best Background Removal Tools 2024-12-11
1 I need your help to create brain-rot dataset 2024-12-08
1 Phi-4 GGUF 2024-12-14
1 HunyuanVideo and Diffusers Made Easy 2024-12-11
48 DeepSeek v3 beats Claude sonnet 3.5 and way cheaper 2024-12-26
4 DeepSeek-V3-Base 2024-12-25
11 smolagents: A simple library to build AI agents 2025-01-02
10 Phi-4 weights have been released under MIT license 2025-01-08
3 Timeline of AI model releases in 2024 2025-01-01
2 Vdr-2B-multi-v1 a multilingual embedding model for visual document retrieval 2025-01-10
2 Show HN: We collected detailed annotations for text-to-image generation 2025-01-10
2 Hugging Face Smolagents 2025-01-05
2 Hugging Face advocates for Code Agents: agents that write tool calls as code 2025-01-02
2 ModernBERT: Encoder-only Transformer Model Strictly Improving on past work 2025-01-01
2 Polish linguistic and cultural competency benchmark for LLMs 2024-12-31
52 Train faster static embedding models with sentence transformers 2025-01-15
6 Kokoro-TTS 2025-01-13
2 Flex.1-Alpha – A new modded Flux model that can properly handle being fine tuned 2025-01-19
1 Show HN: An Agentic AI dataset for deepfake detection 2025-01-15
394 Open-R1: an open reproduction of DeepSeek-R1 2025-01-28
227 Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser 2025-02-07
49 Janus-Pro: Autoregressive framework unifying multimodal understanding&generation 2025-01-27
39 DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks 2025-01-20
38 Fully autonomous AI agents should not be developed 2025-02-07
20 Selene Mini: Open-sourced SOTA small language-model-as-a-judge 2025-01-29
19 The smallest VLM ever: 250M parameters 2025-01-23
17 DeepSeek R1 2025-01-20
12 Open-source DeepResearch – Freeing our search agents 2025-02-04
6 Microsoft Phi 4 with R1 Reasoning 2025-02-04
5 Open R1: Update #2 2025-02-11
5 Deepseek VL2 Small 2025-02-08
4 Qwen 2.5 Max 2025-01-28
4 Hugging Face open sources a web-browsing agent that uses VLMs 2025-01-24
4 Deepseek R1 Zero 2025-01-20
3 Fine-Tune Deepseek-R1 with a Synthetic Reasoning Dataset 2025-02-11
3 Hugging Face AI Agents Course 2025-02-10
3 HuggingFace open reproduction of R1 data and training pipeline 2025-01-27
3 DeepSeek-R1 on iPhone? (DeepSeek-R1-Distill-Qwen-1.5B-GGUF) 2025-01-21
2 OpenAI o3 just scored 99.8% on CodeForces using brute-force 2025-02-12
2 FinePersonas 2025-02-10
2 #9: Does AI Remember? The Role of Memory in Agentic Workflows 2025-02-03
2 Mistral-Small-24B-Base-2501 2025-01-30
2 Generate Images, Chat with PDF in WebGPU via DeepSeek Janus Pro 1B 2025-01-28
2 The state of open video generation models 2025-01-28
2 Bespoke-Stratos-17k: Open Reasoning Dataset by Distilling DeepSeek-R1 2025-01-27
2 DeepSeek-R1 WebGPU 2025-01-22
1 FP8 DeepSeek R1 Distilled LLMs for SGLang and VLLM 2025-01-29
33 The Ultra-Scale Playbook: Training LLMs on GPU Clusters 2025-02-19
17 Vector Search with DuckDB 2025-02-26
9 Show HN: A Transformer model that preserves logical equivalence 2025-03-02
6 DeepSeek-R1 without CCP censorship 2025-02-20
6 More Efficient Chain-of-Thought Reasoning Through Certainty Probing 2025-02-18
6 SigLIP 2: A better multilingual vision language encoder 2025-02-22
4 LLaSE-G1 A FOSS speech enhancement model 2025-03-08
4 Qwen/QwQ-32B released on Hugging Face 2025-03-06
4 Wan2.1-T2V-14B 2025-02-25
4 The Curse of Depth in Large Language Models 2025-02-13
3 GEN3C: 3D-Informed World-Consistent Video 2025-03-06
3 Microsoft Releases Phi-4-multimodal [pdf] 2025-02-26
3 WanX open weight sota 14B video model release 2025-02-25
3 Step-Audio-Chat: a 132B end-to-end speech-to-speech model 2025-02-17
2 FastRTC: The Real-Time Communication Library for Python 2025-02-25
2 Show HN: Roast Any Website with AI 2025-02-25
2 SWE-Lancer: Can LLMs Earn $1M from Real-World Freelance Software Engineering? 2025-02-18
2 Desklib AI Detector Ranks No 1 on Raid Benchmark for AI Detection 2025-02-17
2 Forget What You Know about LLMs Evaluations – LLMs Are Like a Chameleon 2025-02-13