586 |
Uncensor any LLM with abliteration |
2024-06-13 |
240 |
Microsoft Phi-2 model changes licence to MIT |
2024-01-06 |
197 |
Space secrets leak disclosure |
2024-06-01 |
181 |
Best 7B LLM on leaderboards made by an amateur following a medium tutorial |
2024-01-05 |
168 |
Llama 3 8B is almost as good as Wizard 2 8x22B |
2024-04-19 |
167 |
Nvidia releases NVLM 1.0 72B open weight model |
2024-10-02 |
163 |
Explaining the SDXL Latent Space |
2024-02-05 |
152 |
Hugging Face and Google partner for AI collaboration |
2024-01-25 |
131 |
A CC-By Open-Source TTS Model with Voice Cloning |
2024-11-04 |
127 |
FineWeb: Decanting the web for the finest text data at scale |
2024-06-02 |
103 |
HuggingChat: Chat with Open Source Models |
2024-02-21 |
95 |
More than 80 AI models from Qualcomm |
2024-02-28 |
94 |
LLaMA-Pro-8B |
2024-01-06 |
82 |
Apple/OpenELM: Efficient Open-Source Family Language Models |
2024-04-24 |
75 |
YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license |
2024-04-18 |
66 |
Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't |
2024-01-22 |
58 |
MSFT's WizardLM2 models have been taken down |
2024-04-16 |
54 |
LiteLlama-460M-1T has 460M parameters trained with 1T tokens |
2024-01-07 |
52 |
Fine-Tuning LLMs to 1.58bit |
2024-09-18 |
51 |
LLaMA 3 70B Llamafiles |
2024-04-19 |
47 |
Improving Parquet Dedupe on Hugging Face Hub |
2024-10-08 |
46 |
Open-LLM performances are plateauing |
2024-06-29 |
33 |
Mixtral-8x22B on HuggingFace |
2024-04-10 |
31 |
General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model |
2024-09-11 |
30 |
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat |
2024-04-12 |
30 |
OpenFLUX.1 |
2024-10-04 |
29 |
Mistral 7B v0.2 |
2024-03-31 |
28 |
Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video |
2024-04-16 |
26 |
Llama-3.2-3B-Instruct-uncensored |
2024-09-27 |
26 |
Llama can now see and run on your device – welcome Llama 3.2 |
2024-09-25 |
25 |
New Phi-3.5 Models from Microsoft, including new MoE |
2024-08-20 |
25 |
LLM: Transformer Is Linear |
2024-05-24 |
23 |
HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B |
2024-11-05 |
22 |
Lineage Explorer for open source models – Hugging Face Space |
2024-01-18 |
22 |
Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included |
2024-08-14 |
21 |
Llama 3.2 |
2024-09-25 |
19 |
Fine-tune and deploy open LLMs as containers using AIKit - Part 1 |
2024-06-06 |
19 |
makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch |
2024-01-23 |
18 |
HuggingFace to Replace Git LFS with Xet |
2024-08-23 |
18 |
Fake Insects: a game where you have to identify AI-generated insects |
2024-08-17 |
18 |
Mixtral-8x22B-Instruct-v0.1 |
2024-04-17 |
18 |
Hermes-2-Pro-Llama-3-8B |
2024-05-01 |
17 |
StableLM-2-12B |
2024-04-08 |
16 |
NuExtract: A LLM for Structured Extraction |
2024-06-29 |
16 |
An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct |
2024-06-09 |
16 |
Phi-3 Weights Released |
2024-04-23 |
16 |
New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks |
2024-07-31 |
16 |
Miqu 70B – possible leak of the mistral-medium LLM |
2024-01-29 |
15 |
Ollama can run any GGUF Model on Hugging Face Hub now |
2024-10-16 |
14 |
Llama-3-70B-Instruct-Gradient-1048k |
2024-05-04 |
14 |
New finance LLM passed the CFA Level III exam |
2024-07-31 |
14 |
Run Mistral 7B model using less than 4GB of memory on your Mac with CoreML |
2024-07-23 |
14 |
Stable Diffusion 3 Medium Released |
2024-06-12 |
14 |
Pre-computed vector embeddings available on HuggingFace |
2024-01-22 |
13 |
Yi-9B-200K |
2024-03-17 |
13 |
An Introduction to Vision-Language Modeling |
2024-05-28 |
12 |
FineWeb: 15T tokens of the finest data the web has to offer |
2024-04-21 |
12 |
Language model can listen while speaking |
2024-08-07 |
12 |
ML for 3D Course on Hugging Face |
2024-05-16 |
12 |
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs |
2024-04-09 |
12 |
Command-R: open weights 35B params / 128k tokens context length model by Cohere |
2024-03-11 |
12 |
StarCoder2 and The Stack v2: new code LLMs and dataset |
2024-02-28 |
12 |
Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model |
2024-03-28 |
11 |
HuggingFace Is Down |
2024-02-28 |
11 |
Experiments with Bitnet 1.5 (Ngmi) |
2024-03-23 |
11 |
FalconMamba 7B: The first attention-free and general-purpose pure Mamba model |
2024-08-13 |
11 |
NPC-Playground, a 3D playground to interact with LLM-powered NPCs |
2024-06-05 |
11 |
Open LLM Leaderboard |
2024-01-02 |
10 |
CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher |
2024-06-15 |
10 |
Whisperfile |
2024-08-19 |
10 |
Llava Model for Video |
2024-05-16 |
10 |
Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption |
2024-01-31 |
10 |
Vector embeddings model for medical literature |
2024-01-08 |
9 |
Not All Language Model Features Are Linear |
2024-05-25 |
9 |
Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct |
2024-10-16 |
9 |
Perspectives for first principles prompt engineering |
2024-08-20 |
9 |
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models |
2024-05-28 |
9 |
Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B |
2024-01-04 |
9 |
Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B |
2024-11-18 |
9 |
MIT Researchers Unveil New Method to Improve LLM Inference Performance |
2024-10-04 |
9 |
Aryn/deformable-detr-DocLayNet – open-source Layout Model |
2024-07-31 |
9 |
AIMO (AI Math Olympiad) progress prize winning solution |
2024-07-10 |
9 |
Mistral-7B-v0.3 released on HuggingFace |
2024-05-22 |
9 |
Microsoft Phi-3 3.8B model with 128k Context |
2024-04-23 |
9 |
The Stack v2: a 3B files in 600 programming languages dataset |
2024-03-07 |
8 |
NousResearch/Nous-Hermes-2-Llama-2-70B |
2024-02-12 |
8 |
Show HN: We made an encrypted DNA testing app using Homomorphic Encryption |
2024-10-02 |
8 |
NexusRaven-V2-13B |
2024-01-25 |
8 |
Open-source 70B model surpasses GPT-4o and Claude 3.5 on Arena Hard |
2024-10-15 |
8 |
Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released |
2024-09-17 |
8 |
Mistral AI Pixtral |
2024-09-11 |
8 |
Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces |
2024-02-14 |
7 |
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone |
2024-04-23 |
7 |
Am I in the Stack? |
2024-03-20 |
7 |
Common Corpus: the largest public domain dataset for training LLMs |
2024-03-20 |
7 |
Hugging Face launches Agents 2.0 |
2024-05-13 |
7 |
OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 |
2024-02-26 |
7 |
Mini-Dust3r: A miniature version of dust3r running in a HuggingFace Space |
2024-05-16 |
7 |
1B+ words corpus of original texts and experimental post-OCR correction output |
2024-04-26 |
7 |
Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out |
2024-03-14 |
7 |
Grandmaster-Level Chess Without Search |
2024-02-08 |
7 |
Create a Web Interface for Your LLM in Python |
2024-01-23 |
6 |
New leaderboard drop: Judge Arena |
2024-11-19 |
6 |
Phased Consistency Model |
2024-05-29 |
6 |
A Llama 70B finetune that has reflection baked into its weights |
2024-09-05 |
6 |
Show HN: Understand politics by visualising manifesto embeddings |
2024-07-07 |
6 |
Mistral releases the v0.3 of its 7B LLM |
2024-05-22 |
6 |
Idefics2: A Powerful 8B Vision-Language Model for the Community |
2024-05-14 |
6 |
Show HN: Open-source LLM for data labeling |
2024-05-08 |
6 |
Dolphin-2.9-Llama3-8B |
2024-04-21 |
6 |
Introduction to 3D Gaussian Splatting |
2024-04-02 |
5 |
Gemma-2 2B beats GPT3.5 on Chatbot Arena |
2024-07-31 |
5 |
FineWeb-Edu: new 1.3T tokens web dataset |
2024-06-02 |
5 |
Wall Street Journal Hedcut Stable Diffusion Model |
2024-01-23 |
5 |
Hertz-dev is an open-source model for full-duplex conversational audio |
2024-11-16 |
5 |
New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Hierarchies |
2024-11-11 |
5 |
Hugging Face launches HUGS: managed containers for on-premise model deployment |
2024-10-23 |
5 |
Janus-1.3B: Unifying Multimodal Understanding and Generation |
2024-10-18 |
5 |
Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling |
2024-10-16 |
5 |
Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller |
2024-10-16 |
5 |
Whisper-Large-v3-Turbo |
2024-10-03 |
5 |
Show HN: Automatic chaptering – From raw transcripts to structured documents |
2024-09-09 |
5 |
TabReD: A Benchmark of Tabular Machine Learning In-the-Wild |
2024-07-04 |
5 |
Microsoft releases weights for Florence-2 vision model |
2024-06-19 |
5 |
Phi-3-medium-128k-instruct |
2024-05-22 |
5 |
Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs |
2024-04-13 |
5 |
Gretel: Synthetic Text to SQL Dataset |
2024-04-04 |
5 |
Detecting performance and ethical vulnerabilities in popular Hugging Face models |
2024-03-21 |
5 |
Design2Code: How Far Are We from Automating Front-End Engineering? |
2024-03-10 |
5 |
Genie: Generative Interactive Environments |
2024-02-26 |
5 |
TTS Arena: Benchmarking TTS Models in the Wild |
2024-02-25 |
5 |
Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral |
2024-02-20 |
4 |
Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard |
2024-01-26 |
4 |
Octopus V4: a graph of language models |
2024-05-02 |
4 |
Llama-3 8B Instruct 262k |
2024-04-26 |
4 |
CodeGemma – an official Google release for code LLMs |
2024-04-09 |
4 |
Apple Open-Sources LLM DCLM-7B |
2024-07-19 |
4 |
Open LLM Leaderboard v2 |
2024-06-29 |
4 |
Florence 2, Microsoft OCR Model |
2024-06-20 |
4 |
Apple OpenELM Instruct Models |
2024-04-24 |
4 |
Phi-3 Released |
2024-04-23 |
4 |
GemMoE: An 8x8 Mixture Of Experts based on Gemma |
2024-03-13 |
4 |
Pearl-3x7B, an xtraordinary Mixture of Experts (MoE) for data science |
2024-02-07 |
4 |
Introduction to State Space Models (SSM) |
2024-01-24 |
4 |
HtmlRAG: HTML Is Better Than Plain Text for RAG Systems |
2024-11-06 |
4 |
Structured generation with Outlines, now in Rust |
2024-10-22 |
4 |
Llama 3.2 in the Browser with WebGPU |
2024-09-30 |
4 |
Multimodal TextImage Augmentation for Document Images |
2024-09-14 |
4 |
'Reflection 70B' AI model could be the answer to pesky LLM hallucinations |
2024-09-06 |
4 |
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers |
2024-08-14 |
4 |
FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving manner |
2024-08-13 |
4 |
Introduction to Ggml |
2024-08-13 |
4 |
Google releases Gemma 2 2B, ShieldGemma and Gemma Scope |
2024-08-01 |
4 |
Gemma 2 2B Release |
2024-08-01 |
4 |
Extracting Concepts from LLMs: Anthropic's recent discoveries |
2024-06-08 |
4 |
EasyAnimate: End-to-end solution for high-resolution and long video generation |
2024-06-04 |
4 |
Grokked Transformers Are Implicit Reasoners |
2024-05-27 |
4 |
Paligemma: A versatile and lightweight vision-language model (VLM) |
2024-05-14 |
4 |
4M Context – Llama-3-8B-Instruct |
2024-05-09 |
4 |
ReFT: Representation Finetuning for Language Models |
2024-04-05 |
4 |
Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage |
2024-03-22 |
4 |
Show HN: Chatbot Guardrails Arena |
2024-03-21 |
4 |
Quanto: A PyTorch Quantization Toolkit |
2024-03-18 |
4 |
On-device background removal with Transformers.js |
2024-02-07 |
4 |
SegMoE: Segmind Mixture of Diffusion Experts |
2024-02-05 |
4 |
NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs |
2024-02-03 |
4 |
HuggingChat Assistants: Open source models with custom instructions |
2024-02-02 |
3 |
Show HN: Turn Any Article into a Conversation-Like Podcast |
2024-05-22 |
3 |
Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI |
2024-10-15 |
3 |
AI has a problem with objectifying women |
2024-05-28 |
3 |
Linus Torvalds Chat Bot |
2024-02-02 |
3 |
ChatQA: Building GPT-4 Level Conversational QA Models |
2024-01-19 |
3 |
Frames: Factuality, Retrieval, and Reasoning MEasurement Set |
2024-10-01 |
3 |
Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and it's sick |
2024-09-20 |
3 |
Chronos-T5 (Tiny) – pretrained time series forecasting models |
2024-08-14 |
3 |
HF for Legal, an open-source community on Hugging Face |
2024-07-01 |
3 |
LegalKit, French labeled datasets built for legal ML training |
2024-06-27 |
3 |
Nvidia releases ChatQA-1.5 in violation of Llama 3 license |
2024-05-02 |
3 |
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding |
2024-04-26 |
3 |
Everyone seems to have forgotten about Gemma |
2024-04-25 |
3 |
Introducing the Open Chain of Thought Leaderboard |
2024-04-23 |
3 |
Google Gemma 1.1 2B and 7B instruct |
2024-04-06 |
3 |
Starcoder-2 |
2024-02-28 |
3 |
DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development |
2024-02-09 |
3 |
Nous-Hermes-2-SOLAR-10.7B |
2024-01-02 |
3 |
SemScore: Evaluating LLMs with Semantic Similarity |
2024-11-06 |
3 |
Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints |
2024-10-31 |
3 |
Hugging Face Now Automatically Detects Leaked Secrets |
2024-09-05 |
3 |
Selective fine-tuning of Language Models with Spectrum |
2024-09-03 |
3 |
Idefics3: Open multimodal model based on Llama-3.1-8B |
2024-08-09 |
3 |
New Google Gemma 2 2B model |
2024-07-31 |
3 |
Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth |
2024-07-29 |
3 |
DiLoCo: Distributed Low-Communication Training of Language Models |
2024-07-26 |
3 |
The largest math dataset of Olympiad problems for training LLMs |
2024-07-21 |
3 |
SmolLM – Fast and Remarkably Powerful |
2024-07-16 |
3 |
Whisper WebGPU: Real-time in-browser speech recognition |
2024-06-08 |
3 |
UGI Leaderboard – Uncensored General Intelligence |
2024-06-07 |
3 |
Transformers Are SSMs: Generalized Models and Efficient Algorithms Through |
2024-06-04 |
3 |
Recovering 4D World from Monocular Video |
2024-05-29 |
3 |
LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models |
2024-05-26 |
3 |
Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data |
2024-05-26 |
3 |
Phi-3 in-browser inference using WebGPU |
2024-05-08 |
3 |
Show HN: GPT Fine-Tune Formatter |
2024-05-07 |
3 |
InstantMesh: Efficient 3D Mesh Generation from a Single Image |
2024-04-15 |
3 |
Mixture of Finetuned and GPT4 Model |
2024-04-07 |
3 |
H2O-Danube2-1.8B-Chat |
2024-04-07 |
3 |
Yi-9B |
2024-04-05 |
3 |
Dolphin-2.8-mistral-7B-v02 |
2024-04-03 |
3 |
Common Corpus – Start of the largest public domain dataset for training LLMs |
2024-03-20 |
3 |
MoAI: Mixture of All Intelligence for Large Language and Vision Models |
2024-03-14 |
3 |
OpenChat-3.5-0106-Gemma |
2024-03-10 |
3 |
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping |
2024-02-23 |
3 |
Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens |
2024-02-22 |
3 |
Stable Diffusion XL Lightning |
2024-02-21 |
3 |
Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases |
2024-02-03 |
3 |
Show HN: A lineage explorer for open source models and datasets |
2024-01-23 |
3 |
Aim – An Apple Collection |
2024-01-19 |
3 |
LLaVA-3B |
2024-01-01 |
2 |
Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM |
2024-04-19 |
2 |
Try Qwen2.5-Coder-32B on HuggingChat |
2024-11-12 |
2 |
An orthogonalized AI to introduce an unengaged melancholic style |
2024-06-13 |
2 |
Pearl-7B-slerp, an xtraordinary 7B model for maths |
2024-02-05 |
2 |
Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station |
2024-01-28 |
2 |
7B model from Snorkel tops Alpaca Eval 2.0 leaderboard |
2024-01-24 |
2 |
LongVU – New Video LLM from Meta |
2024-10-24 |
2 |
Hacker News Comments Dataset |
2024-10-11 |
2 |
HuggingFace Accelerate 1.0.0 |
2024-10-07 |
2 |
Mistral-Small-Instruct-2409 |
2024-09-17 |
2 |
HuggingChat: Chat with Llama 3.1 (70B and 405B) |
2024-07-23 |
2 |
Ocean Biodiversity Information System on Hugging Face |
2024-07-21 |
2 |
CommonCanvas image generation from CC-licensed images – models, dataset released |
2024-06-07 |
2 |
Show HN: PodGen generate podcasts on any topic |
2024-06-01 |
2 |
Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models |
2024-05-28 |
2 |
The Waifu Research Department |
2024-05-16 |
2 |
Yi-1.5 LLM Models Released |
2024-05-12 |
2 |
Fietje: An open and efficient LLM for Dutch |
2024-05-02 |
2 |
Simple Multimodal LLM from Scratch |
2024-04-23 |
2 |
Stability Releases Code Instruct 3B |
2024-04-02 |
2 |
Mistral 7B v0.2 |
2024-04-01 |
2 |
PolarsBot, a New HuggingChat Assistant |
2024-03-25 |
2 |
Easy and low cost model training on HF "DGX cloud" |
2024-03-19 |
2 |
Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's Leaderboard |
2024-02-19 |
2 |
LLMs can learn useful guidelines from their own mistakes |
2024-02-12 |
2 |
Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard |
2024-02-11 |
2 |
Aanaphi-2 3B |
2024-02-09 |
2 |
Playground for Hugging Face Models |
2024-02-05 |
2 |
Hallucinations Leaderboard |
2024-01-29 |
2 |
Fine-tune Wav2Vec2-BERT for low resource speech recognition |
2024-01-23 |
2 |
InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds |
2024-01-22 |
2 |
Yayi2-30B-Llama |
2024-01-01 |
2 |
Pixtral-Large-Instruct-2411 |
2024-11-18 |
2 |
FLUX.1-Dev LoRA Outfit Generator by TryOn Labs |
2024-11-06 |
2 |
Contextual Document Embeddings |
2024-11-01 |
2 |
Code a Simple RAG from Scratch – Hugging Face Community Article |
2024-10-30 |
2 |
OmniParser for Pure Vision Based GUI Agent |
2024-10-25 |
2 |
Hugs – Scale Your AI with Open Models |
2024-10-23 |
2 |
Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins |
2024-10-23 |
2 |
Pickle Scanning |
2024-10-23 |
2 |
New Video Generation Model: Allegro |
2024-10-22 |
2 |
TxT360 |
2024-10-18 |
2 |
Dataset About Where 30k+ Startups Trend |
2024-10-18 |
2 |
Nvidia Nemotron |
2024-10-17 |
2 |
Fixing Gradient Accumulation |
2024-10-16 |
2 |
Animate-X: Universal Character Image Animation with Enhanced Motion |
2024-10-15 |
2 |
SOTA Open Source Text to Video Model |
2024-10-14 |
2 |
Exploring the Daily Papers Page on Hugging Face |
2024-09-24 |
2 |
Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu) |
2024-09-23 |
2 |
Recreating o1 at Home with Role-Play LLMs |
2024-09-21 |
2 |
FineVideo: Annotated YouTube Dataset by HuggingFace |
2024-09-12 |
2 |
Remove Background by Text |
2024-09-12 |
2 |
Labeled Image generation using Meta Llama 3.5 |
2024-08-31 |
2 |
Scaling robotics datasets with video encoding |
2024-08-30 |
2 |
New FashionCLIP and SigLIP Classification Demo |
2024-08-28 |
2 |
Mozilla/TriLM-Llamafile · Hugging Face |
2024-08-26 |
2 |
Play: How random can a human brain truly be? |
2024-08-24 |
2 |
FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs |
2024-08-21 |
2 |
Flux Dev 1 model that creates half_illustration images |
2024-08-21 |
2 |
LLMs as Image Generators with Canonical Codec Representations |
2024-08-19 |
2 |
Instant in-browser demo of SmolLM |
2024-08-18 |
2 |
Marqo-FashionCLIP: New Embedding Model for Fashion |
2024-08-14 |
2 |
A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine |
2024-08-07 |
2 |
Generate and Export Segmentation Masks Using Meta's SAMv2 |
2024-07-31 |
2 |
HuggingChat: Chat with Llama 3.1 405B |
2024-07-25 |
2 |
Meta-Llama-3.1-405B |
2024-07-23 |
2 |
Apple's DCLM model shares data&training code with weights |
2024-07-20 |
2 |
Predicting Multiplication with GPT-2 |
2024-07-20 |
2 |
Qwen2 Technical Report |
2024-07-16 |
2 |
Gemma-2-27B-it llamafile |
2024-07-03 |
2 |
OpenRAIL: Towards open and responsible AI licensing frameworks (2022) |
2024-07-03 |
2 |
New LLM Agent writing actions in Python code tops the GAIA agent benchmark |
2024-07-01 |
2 |
Stable Diffusion 3 Medium Online Demo, Free |
2024-06-12 |
2 |
To Believe or Not to Believe Your LLM |
2024-06-11 |
2 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs |
2024-06-04 |
2 |
Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series |
2024-05-31 |
2 |
Training and Finetuning Embedding Models with Sentence Transformers v3 |
2024-05-30 |
2 |
ChatTTS – open-source TTS model designed specifically for dialogue scenario |
2024-05-29 |
2 |
Matryoshka Multimodal Models |
2024-05-28 |
2 |
Aya 23: Open Weight Releases to Further Multilingual Progress |
2024-05-28 |
2 |
HuggingFace Hub Incident Post Mortem |
2024-05-24 |
2 |
Cohere Updates Weights for Aya |
2024-05-23 |
2 |
Hugging Face on AMD Instinct MI300 GPU |
2024-05-23 |
2 |
Show HN: Generate a Quiz from Any Url |
2024-05-17 |
2 |
Show HN: EmuBert – the first open encoder model for Australian law |
2024-05-14 |
2 |
New Yi 1.5 models under Apache 2.0 |
2024-05-12 |
2 |
Building Cost-Efficient Enterprise RAG Applications |
2024-05-10 |
2 |
Google codegemma-1.1-7B-it |
2024-05-03 |
2 |
Introduction to Matryoshka Embedding Models |
2024-05-03 |
2 |
Iterative Reasoning Preference Optimization |
2024-05-02 |
2 |
GPT-2 |
2024-05-01 |
2 |
Fine-tune Llama 3 with ORPO |
2024-04-23 |
2 |
In-browser text-to-music generation using musicgen-small |
2024-04-20 |
2 |
Compression Represents Intelligence Linearly |
2024-04-16 |
2 |
Bringing serverless GPU inference to Hugging Face users |
2024-04-16 |
2 |
From Words to Numbers: Your LLM Is a Capable Regressor |
2024-04-12 |
2 |
Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace |
2024-04-11 |
2 |
TinyTimeMixer: Open-source time series LLM by IBM |
2024-04-09 |
2 |
Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction |
2024-04-05 |
2 |
Command R+ |
2024-04-04 |
2 |
Demo of Moondream2 vision language model running in browser |
2024-04-03 |
2 |
Mini-Jamba |
2024-04-01 |
2 |
Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs |
2024-04-01 |
2 |
The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 Bits |
2024-03-25 |
2 |
Cosmopedia: How to create large-scale synthetic data for pre-training |
2024-03-21 |
2 |
Playground-v2.5-1024px-Aesthetic |
2024-03-16 |
2 |
Gemini 1.5: Unlocking multimodal understanding across tokens of context |
2024-03-15 |
2 |
Better RAG 1: Advanced Basics |
2024-03-15 |
2 |
Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks |
2024-03-13 |
2 |
LLM Red-Teaming Resistance Leaderboard |
2024-03-01 |
2 |
Show HN: Visualize how you split your document into chunks for RAG applications |
2024-02-27 |
2 |
From OpenAI to Open LLMs with Messages API on Hugging Face |
2024-02-23 |
2 |
C4: colossal cleaned version of Common Crawl's web crawl corpus |
2024-02-21 |
2 |
Constitutional AI with Open LLMs |
2024-02-01 |
2 |
Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna AI |
2024-01-31 |
2 |
AMUSEd: Efficient Text-to-Image Generation |
2024-01-29 |
2 |
Minillama – 4.1 MB LLM for testing |
2024-01-20 |
2 |
StableLM 2 Zephyr 1.6B |
2024-01-20 |
2 |
Local vector embeddings index for analyzing ArXiv papers |
2024-01-17 |
2 |
Stable Zero123 Model Weights get Released. Text to 3D and image to 3D |
2024-01-15 |
2 |
Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL |
2024-01-10 |
2 |
OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally |
2024-01-10 |
2 |
Revolutionizing AI with Audio Classification via Computer Vision |
2024-01-02 |
1 |
Show HN: Embedding model for PDF page retrieval |
2024-08-08 |
1 |
Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune |
2024-05-02 |
1 |
Get Insulted by AI |
2024-02-25 |
1 |
Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio |
2024-07-29 |
1 |
SmolLM2: The new, best, and open small language model |
2024-11-01 |
1 |
The Romulus model series has been released on Hugging Face |
2024-09-11 |
1 |
I added context data to the TruthfulQA dataset |
2024-08-10 |
1 |
Chinese AI Community: open-source Heatmap |
2024-07-31 |
1 |
Multi-token prediction models and baselines |
2024-07-04 |
1 |
Mixtral or Llama 70B on Google Spreadsheet Thanks to Hugging Face's API |
2024-06-17 |
1 |
Stupid Filter Corpus (2007) |
2024-05-24 |
1 |
MMLU-Pro: Advanced edition of MMLU & new Leaderboard |
2024-05-15 |
1 |
Ratchet and Phi 3 |
2024-05-01 |
1 |
Snowflake Arctic Instruct Open LLM |
2024-04-24 |
1 |
LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes |
2024-04-08 |
1 |
MANATEE(lm): Market Analysis based on language model architectures |
2024-03-20 |
1 |
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a Single GPU |
2024-03-13 |
1 |
Serverless Image Similarity with Upstash Vector and HuggingFace Spaces |
2024-02-02 |
1 |
Dutch Drug-Related Text Classification Model by NOS |
2024-01-25 |
1 |
Implement Fractional GPUs in Kubernetes to save up to 50% cost |
2024-01-22 |
1 |
The next person that says textual modalities gets it |
2024-01-10 |
1 |
LLaMA Pro: Progressive LLaMA with Block Expansion |
2024-01-05 |
1 |
Halo: Open-Source Health Tracking with Wearables |
2024-11-20 |
1 |
Releasing the largest multilingual open pretraining dataset |
2024-11-14 |
1 |
Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for coding |
2024-11-12 |
1 |
Providing Open Investment Data – 25 years of data |
2024-11-11 |
1 |
New Sota Text to Image |
2024-10-31 |
1 |
Stable Diffusion 3.5 Medium |
2024-10-29 |
1 |
Kolors Virtual Try-On in the Wild |
2024-10-28 |
1 |
Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval |
2024-10-23 |
1 |
Stable Diffusion 3.5-large released |
2024-10-22 |
1 |
Transformers.js v3: WebGPU Support, New Models and Tasks, and More |
2024-10-22 |
1 |
Allegro – New Open Source Text to Video Generator from Rhymes AI |
2024-10-22 |
1 |
Distilabel Synthetic Data Generator on Hugging Face |
2024-10-17 |
1 |
HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance |
2024-10-17 |
1 |
Show HN: A dataset of all HN submission texts (2006-2024) in Markdown |
2024-10-13 |
1 |
Scaling AI-Based Data Processing with Hugging Face and Dask |
2024-10-10 |
1 |
LLMs Know More Than They Show |
2024-10-08 |
1 |
Document Similarity Search with ColPali |
2024-09-29 |
1 |
Prithvi WxC: Foundation Model for Weather and Climate |
2024-09-24 |
1 |
Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance |
2024-09-24 |
1 |
HN-Style HuggingFace Daily Papers |
2024-09-22 |
1 |
Qwen2.5-Coder Technical Report |
2024-09-21 |
1 |
Introducing Community Tools on HuggingChat |
2024-09-20 |
1 |
InkubaLM-0.4B: Small language model for low-resource African Languages |
2024-08-29 |
1 |
Diffusion models are real time game engines |
2024-08-29 |
1 |
Everchanging Quest: Rogue-like game powered by LLMs |
2024-08-21 |
1 |
xLSTM Model Trained on Music |
2024-08-16 |
1 |
Qwen2-VL |
2024-08-14 |
1 |
Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters |
2024-08-07 |
1 |
Depth Compare – A Hugging Face space to compare different depth models |
2024-07-29 |
1 |
Insilico Medicine on Hugging Face |
2024-07-27 |
1 |
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs |
2024-07-26 |
1 |
Spreadsheetllm: Encoding Spreadsheets for Large Language Models |
2024-07-24 |
1 |
Followgraph for Hugging Face |
2024-07-23 |
1 |
Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text prompts |
2024-07-23 |
1 |
Scaling Diffusion Transformers to 16B Parameters |
2024-07-19 |
1 |
DeepSeek v2 Chat (0628) released |
2024-07-18 |
1 |
The Rise of Agentic Data Generation |
2024-07-15 |
1 |
Fast SD3 Medium |
2024-07-10 |
1 |
Agentic RAG: query reformulation and self-query |
2024-07-08 |
1 |
Meta LLM Compiler |
2024-06-29 |
4 |
From Files to Chunks: Improving HF Storage Efficiency |
2024-11-20 |
3 |
Dataset Card for 1M Bluesky Posts |
2024-11-27 |
3 |
New 2B vision language model that consumes the least memory |
2024-11-26 |
4 |
Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg |
2024-11-24 |
3 |
New synthetic dataset beating MSFT and mistral's SFT recipe |
2024-11-22 |
1 |
Allegro-TI2V: an open source video generation model |
2024-11-27 |
1 |
PR Puppet Sora |
2024-11-27 |
2 |
OpenGPT-X |
2024-11-26 |
1 |
Lightricks/LTX-Video – first real-time video generation model |
2024-11-23 |
425 |
Llama-3.3-70B-Instruct |
2024-12-06 |
4 |
Show HN: LatComp – Compress your image into a small and reversible format |
2024-11-30 |
3 |
Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer |
2024-12-06 |
3 |
Quantum+AI Qiskit Code Assistant Open Source model |
2024-11-27 |
3 |
informatiker/20-million-bluesky-posts |
2024-11-29 |
3 |
Automated GitHub Issue Creation Using Structured Generation |
2024-11-29 |
3 |
QwQ-32B-Preview |
2024-11-27 |
2 |
Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI' |
2024-11-28 |
2 |
The Lichess database is now on Hugging Face |
2024-12-06 |
2 |
LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs |
2024-12-05 |
2 |
Releasing: A dataset of two million Bluesky posts |
2024-11-27 |
1 |
PaliGemma 2 – New vision language models by Google |
2024-12-05 |
1 |
Open Source Developers Guide to the EU AI Act |
2024-12-03 |
1 |
LM Studio using models from Hugging Face |
2024-12-02 |
1 |
IC Light – Shade Generation Model |
2024-12-02 |
348 |
A Replacement for BERT |
2024-12-19 |
10 |
Show HN: Downloadable AI Musical Instruments |
2024-12-10 |
9 |
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces |
2024-12-15 |
8 |
Scaling Test Time Compute with Open Models |
2024-12-16 |
5 |
Moonshine – open-source, real-time speech-to-text in the browser |
2024-12-19 |
3 |
Welcome to the Falcon 3 Family of Open Models |
2024-12-17 |
3 |
Meta releases family of multimodal models that comprehend hour-long video |
2024-12-16 |
3 |
Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset |
2024-12-09 |
2 |
Just launched MilkDropLM model using 32B parameters |
2024-12-20 |
2 |
FineMath: the best public math pre-training dataset |
2024-12-19 |
2 |
I-JEPA Hugging Face |
2024-12-09 |
2 |
FineWeb2 dataset: A sparkling update with 1000s of languages |
2024-12-08 |
1 |
ModernBERT |
2024-12-20 |
1 |
Show HN: A ML powered text moderation model that outperforms Open AI |
2024-12-14 |
1 |
Help Us Rank the Best Background Removal Tools |
2024-12-11 |
1 |
I need your help to create brain-rot dataset |
2024-12-08 |
1 |
Phi-4 GGUF |
2024-12-14 |
1 |
HunyuanVideo and Diffusers Made Easy |
2024-12-11 |
48 |
DeepSeek v3 beats Claude sonnet 3.5 and way cheaper |
2024-12-26 |
4 |
DeepSeek-V3-Base |
2024-12-25 |
11 |
smolagents: A simple library to build AI agents |
2025-01-02 |
10 |
Phi-4 weights have been released under MIT license |
2025-01-08 |
3 |
Timeline of AI model releases in 2024 |
2025-01-01 |
2 |
Vdr-2B-multi-v1 a multilingual embedding model for visual document retrieval |
2025-01-10 |
2 |
Show HN: We collected detailed annotations for text-to-image generation |
2025-01-10 |
2 |
Hugging Face Smolagents |
2025-01-05 |
2 |
Hugging Face advocates for Code Agents: agents that write tool calls as code |
2025-01-02 |
2 |
ModernBERT: Encoder-only Transformer Model Strictly Improving on past work |
2025-01-01 |
2 |
Polish linguistic and cultural competency benchmark for LLMs |
2024-12-31 |
52 |
Train faster static embedding models with sentence transformers |
2025-01-15 |
6 |
Kokoro-TTS |
2025-01-13 |
2 |
Flex.1-Alpha – A new modded Flux model that can properly handle being fine tuned |
2025-01-19 |
1 |
Show HN: An Agentic AI dataset for deepfake detection |
2025-01-15 |
394 |
Open-R1: an open reproduction of DeepSeek-R1 |
2025-01-28 |
227 |
Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser |
2025-02-07 |
49 |
Janus-Pro: Autoregressive framework unifying multimodal understanding&generation |
2025-01-27 |
39 |
DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks |
2025-01-20 |
38 |
Fully autonomous AI agents should not be developed |
2025-02-07 |
20 |
Selene Mini: Open-sourced SOTA small language-model-as-a-judge |
2025-01-29 |
19 |
The smallest VLM ever: 250M parameters |
2025-01-23 |
17 |
DeepSeek R1 |
2025-01-20 |
12 |
Open-source DeepResearch – Freeing our search agents |
2025-02-04 |
6 |
Microsoft Phi 4 with R1 Reasoning |
2025-02-04 |
5 |
Open R1: Update #2 |
2025-02-11 |
5 |
Deepseek VL2 Small |
2025-02-08 |
4 |
Qwen 2.5 Max |
2025-01-28 |
4 |
Hugging Face open sources a web-browsing agent that uses VLMs |
2025-01-24 |
4 |
Deepseek R1 Zero |
2025-01-20 |
3 |
Fine-Tune Deepseek-R1 with a Synthetic Reasoning Dataset |
2025-02-11 |
3 |
Hugging Face AI Agents Course |
2025-02-10 |
3 |
HuggingFace open reproduction of R1 data and training pipeline |
2025-01-27 |
3 |
DeepSeek-R1 on iPhone? (DeepSeek-R1-Distill-Qwen-1.5B-GGUF) |
2025-01-21 |
2 |
OpenAI o3 just scored 99.8% on CodeForces using brute-force |
2025-02-12 |
2 |
FinePersonas |
2025-02-10 |
2 |
#9: Does AI Remember? The Role of Memory in Agentic Workflows |
2025-02-03 |
2 |
Mistral-Small-24B-Base-2501 |
2025-01-30 |
2 |
Generate Images, Chat with PDF in WebGPU via DeepSeek Janus Pro 1B |
2025-01-28 |
2 |
The state of open video generation models |
2025-01-28 |
2 |
Bespoke-Stratos-17k: Open Reasoning Dataset by Distilling DeepSeek-R1 |
2025-01-27 |
2 |
DeepSeek-R1 WebGPU |
2025-01-22 |
1 |
FP8 DeepSeek R1 Distilled LLMs for SGLang and VLLM |
2025-01-29 |
33 |
The Ultra-Scale Playbook: Training LLMs on GPU Clusters |
2025-02-19 |
17 |
Vector Search with DuckDB |
2025-02-26 |
9 |
Show HN: A Transformer model that preserves logical equivalence |
2025-03-02 |
6 |
DeepSeek-R1 without CCP censorship |
2025-02-20 |
6 |
More Efficient Chain-of-Thought Reasoning Through Certainty Probing |
2025-02-18 |
6 |
SigLIP 2: A better multilingual vision language encoder |
2025-02-22 |
4 |
LLaSE-G1 A FOSS speech enhancement model |
2025-03-08 |
4 |
Qwen/QwQ-32B released on Hugging Face |
2025-03-06 |
4 |
Wan2.1-T2V-14B |
2025-02-25 |
4 |
The Curse of Depth in Large Language Models |
2025-02-13 |
3 |
GEN3C: 3D-Informed World-Consistent Video |
2025-03-06 |
3 |
Microsoft Releases Phi-4-multimodal [pdf] |
2025-02-26 |
3 |
WanX open weight sota 14B video model release |
2025-02-25 |
3 |
Step-Audio-Chat: a 132B end-to-end speech-to-speech model |
2025-02-17 |
2 |
FastRTC: The Real-Time Communication Library for Python |
2025-02-25 |
2 |
Show HN: Roast Any Website with AI |
2025-02-25 |
2 |
SWE-Lancer: Can LLMs Earn $1M from Real-World Freelance Software Engineering? |
2025-02-18 |
2 |
Desklib AI Detector Ranks No 1 on Raid Benchmark for AI Detection |
2025-02-17 |
2 |
Forget What You Know about LLMs Evaluations – LLMs Are Like a Chameleon |
2025-02-13 |