899 Hacker News submissions for HuggingFace

HN Points HN Title (Links to submission) Submitted Date
586 Uncensor any LLM with abliteration 2024-06-13
415 Try Stable Diffusion's Img2Img Mode 2022-08-29
323 MonadGPT – What would have happened if ChatGPT was invented in the 17th century? 2023-11-24
252 LLM in a Flash: Efficient LLM Inference with Limited Memory 2023-12-20
240 Microsoft Phi-2 model changes licence to MIT 2024-01-06
238 Falcon 180B 2023-09-06
229 OpenLLaMA 13B Released 2023-06-18
218 T0* – Series of encoder-decoder models trained on a large set of different tasks 2021-10-18
214 Hugging Face Releases Agents 2023-05-10
211 A neural network to auto-complete your thoughts 2019-09-17
200 PaddleOCR: Lightweight, 80 Language OCR 2021-09-09
197 Space secrets leak disclosure 2024-06-01
185 BigCode Project Releases StarCoder: A 15B Code LLM 2023-05-04
181 Best 7B LLM on leaderboards made by an amateur following a medium tutorial 2024-01-05
180 AnimeGANv2: Convert Face Portraits into Anime 2021-11-09
179 Stability.ai sent a take down request to Runway ML's SD v1.5 citing IP Leak 2022-10-20
175 We raised $100M for open and collaborative machine learning 2022-05-09
168 Llama 3 8B is almost as good as Wizard 2 8x22B 2024-04-19
168 SantaCoder: A new 1.1B code model for generation and infilling 2022-12-22
167 Nvidia releases NVLM 1.0 72B open weight model 2024-10-02
165 StackLlama: A hands-on guide to train LlaMa with RLHF 2023-04-06
163 Explaining the SDXL Latent Space 2024-02-05
160 BLOOM: The largest open multilingual language model 2022-07-12
152 Hugging Face and Google partner for AI collaboration 2024-01-25
137 Wordalle – Guess the prompt used to generate a set of images from DalleMini 2022-07-01
131 Mistral-8x7B-Chat 2023-12-10
131 A CC-By Open-Source TTS Model with Voice Cloning 2024-11-04
127 FineWeb: Decanting the web for the finest text data at scale 2024-06-02
117 The age of machine learning as code has arrived 2021-10-22
115 Yi-34B-Chat 2023-11-24
107 GPT-3.5 and Wolfram Alpha via LangChain 2023-01-18
105 The Falcon has landed in the Hugging Face ecosystem 2023-06-05
103 HuggingChat: Chat with Open Source Models 2024-02-21
102 Hugging Face and AWS partner to make AI more accessible 2023-02-21
101 HuggingFace Training Cluster as a Service 2023-09-05
95 More than 80 AI models from Qualcomm 2024-02-28
95 Segmind Stable Diffusion – A smaller version of Stable Diffusion XL 2023-10-25
94 LLaMA-Pro-8B 2024-01-06
93 HuggingChat 2023-04-25
88 Yarn-Mistral-7B-128k 2023-11-11
82 Apple/OpenELM: Efficient Open-Source Family Language Models 2024-04-24
78 Sparse LLM Inference on CPU: 75% fewer parameters 2023-10-19
77 Pokemon GAN 2022-02-14
75 YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license 2024-04-18
73 Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022) 2023-11-20
69 Few-Shot Learning in Practice: GPT-Neo & 'HuggingFace' Accelerated Inference API 2021-06-04
66 Multimodal Neurons in Pretrained Text-Only Transformers 2023-08-04
66 Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't 2024-01-22
61 Find images from movies based on what you draw 2021-10-13
61 HuggingChat – ChatGPT alternative with open source models 2023-12-15
58 MSFT's WizardLM2 models have been taken down 2024-04-16
58 OpenLLaMA 7B Training Completed to 1T Tokens 2023-06-07
57 Phi-2 2023-12-13
56 Dolphin-2_6-Phi-2 2023-12-24
55 Alibaba releases 72B LLM with 32k context length 2023-11-30
54 LiteLlama-460M-1T has 460M parameters trained with 1T tokens 2024-01-07
54 Large Language Models: A New Moore's Law? 2021-10-27
52 Fine-Tuning LLMs to 1.58bit 2024-09-18
51 LLaMA 3 70B Llamafiles 2024-04-19
47 Improving Parquet Dedupe on Hugging Face Hub 2024-10-08
47 Open LLAMA 13B released, trained on 1T tokens 2023-06-19
46 DALL·E Mini 2022-04-11
46 Open-LLM performances are plateauing 2024-06-29
46 The AI Research Residency Program 2022-03-23
45 Show HN: Interpretable Text Classification and Clustering in the Browser 2021-12-20
41 4-Bit Quantization and QLoRA 2023-05-25
40 BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat 2023-05-19
40 What's Going on with the Open LLM Leaderboard? 2023-06-23
39 Show HN: Deep Learning Personas 2017-02-17
39 Kai-Fu Lee's Yi-34B uses exactly Llama's architecture except for 2 tensors renamed 2023-11-14
37 Zephyr 7B – Mistral Finetune that responds like ChatGPT 2023-10-15
36 Whisper Jax: Transcribe 1 hour of audio in under 15 seconds 2023-04-22
34 MistralLite by Amazon Web Services 2023-11-01
33 Mixtral-8x22B on HuggingFace 2024-04-10
31 General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model 2024-09-11
30 Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat 2024-04-12
30 OpenFLUX.1 2024-10-04
29 Mistral 7B v0.2 2024-03-31
29 Mixture of Experts Explained 2023-12-11
29 TinyLlama at 2T of 3T 2023-11-19
28 Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video 2024-04-16
27 Real-Time Latent Consistency Model 2023-10-30
27 Language Modeling Is Compression 2023-09-21
26 Llama-3.2-3B-Instruct-uncensored 2024-09-27
26 Pixel Art XL: Stable Diffusion XL for Pixel Art 2023-08-03
26 UC Berkeley's open-source Vicuna LLM chatbot released new improved model weights 2023-04-14
26 Llama can now see and run on your device – welcome Llama 3.2 2024-09-25
25 Llama 1.3B Trained on 200B Tokens for Commercial Use 2023-04-28
25 New Phi-3.5 Models from Microsoft, including new MoE 2024-08-20
25 LLM: Transformer Is Linear 2024-05-24
24 NousResearch/Nous-Hermes-2-Yi-34B 2023-12-26
23 Accelerating Stable Diffusion XL Inference with Jax on Cloud TPU v5e 2023-10-03
23 Show HN: DALL·E mini – Generate images from text 2021-09-18
23 HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B 2024-11-05
22 Lineage Explorer for open source models – Hugging Face Space 2024-01-18
22 Llama 22B: 13B V2 with 33B attention heads frankensteined on 2023-08-18
22 Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included 2024-08-14
21 Mistral-7B-OpenOrca. First 7B model to beat all other models <30B 2023-10-02
21 Würstchen: Fast Diffusion for Image Generation 2023-09-13
21 Llama 3.2 2024-09-25
20 Code Generation with HuggingFace 2022-06-07
19 Ernie-ViLG better anime quality than Stable Diffusion 2022-09-01
19 Fine-tune and deploy open LLMs as containers using AIKit - Part 1 2024-06-06
19 makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch 2024-01-23
19 AMD and: Large Language Models Out-of-the-Box Acceleration with AMD GPU 2023-12-13
18 This Pokémon Does Not Exist: Using AI models to create fake cards that look real 2022-03-22
18 HuggingFace to Replace Git LFS with Xet 2024-08-23
18 GPT-NeoX 2022-12-14
18 Fake Insects: a game where you have to identify AI-generated insects 2024-08-17
18 Mixtral-8x22B-Instruct-v0.1 2024-04-17
18 Stable Diffusion Multiplayer 2022-10-30
18 Encrypted Large Language Models with Homomorphic Encryption 2023-08-03
18 Hermes-2-Pro-Llama-3-8B 2024-05-01
18 Orca 2: Teaching Small Language Models How to Reason 2023-11-21
17 Show HN: MiniSearch, a minimalist search engine with integrated browser-based AI 2023-10-15
17 StableLM-2-12B 2024-04-08
17 Gemini vs. GPT-4V: A Preliminary Comparison Through Qualitative Cases 2023-12-28
17 Una-Cybertron-7B 2023-12-08
17 GPT Baker lets you build your own open-source GPTs 2023-11-23
17 Deploy Livebook (Elixir) Notebooks as Apps to Hugging Face Spaces 2023-06-15
17 ChatRWKV 2023-03-23
16 NuExtract: A LLM for Structured Extraction 2024-06-29
16 An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct 2024-06-09
16 Phi-3 Weights Released 2024-04-23
16 New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks 2024-07-31
16 Miqu 70B – possible leak of the mistral-medium LLM 2024-01-29
16 New Stable Diffusion model trained on high quality Art 2022-12-11
15 Ollama can run any GGUF Model on Hugging Face Hub now 2024-10-16
15 Hugging Face just released a new NLP course 2021-06-15
14 Llama-3-70B-Instruct-Gradient-1048k 2024-05-04
14 New finance LLM passed the CFA Level III exam 2024-07-31
14 Airoboros-13B: 98% against GPT-3.5 2023-05-22
14 Run Mistral 7B model using less than 4GB of memory on your Mac with CoreML 2024-07-23
14 Stable Diffusion 3 Medium Released 2024-06-12
14 Pre-computed vector embeddings available on HuggingFace 2024-01-22
13 Create a GPT3 powered Q&A Chatbot for *any* GitHub repo by posting its link 2023-02-05
13 Yi-9B-200K 2024-03-17
13 An Introduction to Vision-Language Modeling 2024-05-28
12 Attention Sinks in LLMs for endless fluency 2023-10-09
12 FineWeb: 15T tokens of the finest data the web has to offer 2024-04-21
12 Idefics: Open Access 60B multimodal model 2023-08-22
12 Google AI just released Flan-T5 models 2022-10-24
12 Poster2Plot: Generate Movie/T.V show plot from a poster using Machine Learning 2021-11-25
12 Language model can listen while speaking 2024-08-07
12 ML for 3D Course on Hugging Face 2024-05-16
12 Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs 2024-04-09
12 Command-R: open weights 35B params / 128k tokens context length model by Cohere 2024-03-11
12 StarCoder2 and The Stack v2: new code LLMs and dataset 2024-02-28
12 Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model 2024-03-28
12 Stable Diffusion on multiplayer: Internet at its best 2022-10-30
11 HuggingFace Is Down 2024-02-28
11 30B uncensored OSS model with no guardrails 2023-11-07
11 The Stack: 3 TB of permissively licensed source code in 30 programming languages 2022-10-31
11 Experiments with Bitnet 1.5 (Ngmi) 2024-03-23
11 Hierarchical Masked 3D Diffusion Model for Video Outpainting 2023-09-06
11 FalconMamba 7B: The first attention-free and general-purpose pure Mamba model 2024-08-13
11 NPC-Playground, a 3D playground to interact with LLM-powered NPCs 2024-06-05
11 Open LLM Leaderboard 2024-01-02
11 Shallow Feed-Forward Neural Networks as Alternative to Attention in Transformers 2023-11-21
10 CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher 2024-06-15
10 Whisperfile 2024-08-19
10 Llava Model for Video 2024-05-16
10 Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption 2024-01-31
10 Vector embeddings model for medical literature 2024-01-08
10 From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting 2023-09-11
10 Origin of LLMs: An Evolutionary Tree and Graph for 15K Large Language Models 2023-07-20
10 Show HN: Image Filtering App Using Homomorphic Encryption 2023-02-23
10 CMFNet: AI Image Deblurring 2022-02-27
9 Not All Language Model Features Are Linear 2024-05-25
9 Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct 2024-10-16
9 Stable Diffusion XL Inpainting model released 2023-09-01
9 Opentensor and Cerebras announce BTLM-3B-8K, a leading 3B param. language model 2023-07-24
9 Perspectives for first principles prompt engineering 2024-08-20
9 ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models 2024-05-28
9 Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B 2024-01-04
9 LLM Arena. Mistral-small best open model. Gemini Pro beaten by 2 open models 2023-12-17
9 Meta-llama (Meta Llama 2) 2023-07-18
9 Summary of the Tokenizers 2023-02-07
9 Show HN: Sentiment Analysis on Encrypted Data with Homomorphic Encryption 2022-11-21
9 RunwayML fine tuned Stable Diffusion 1.5 model 2022-10-20
9 Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B 2024-11-18
9 MIT Researchers Unveil New Method to Improve LLM Inference Performance 2024-10-04
9 Aryn/deformable-detr-DocLayNet – open-source Layout Model 2024-07-31
9 AIMO (AI Math Olympiad) progress prize winning solution 2024-07-10
9 Mistral-7B-v0.3 released on HuggingFace 2024-05-22
9 Microsoft Phi-3 3.8B model with 128k Context 2024-04-23
9 The Stack v2: a 3B files in 600 programming languages dataset 2024-03-07
8 NousResearch/Nous-Hermes-2-Llama-2-70B 2024-02-12
8 Gradio-Lite: Serverless Gradio Running in the Browser 2023-10-25
8 Show HN: Parley: The RPG where you Negotiate with Bandits 2023-04-26
8 Show HN: We made an encrypted DNA testing app using Homomorphic Encryption 2024-10-02
8 NexusRaven-V2-13B 2024-01-25
8 Generate 1 page comic by text 2023-09-03
8 Drag Your GAN: Interactive Point-Based Manipulation on Generative Image Manifold 2023-05-23
8 Open-source 70B model surpasses GPT-4o and Claude 3.5 on Arena Hard 2024-10-15
8 Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released 2024-09-17
8 Mistral AI Pixtral 2024-09-11
8 Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces 2024-02-14
8 Show HN: Open-source model to chat with your documents/data 2023-08-14
8 Yes, Transformers Are Effective for Time Series Forecasting (+ Autoformer) 2023-06-25
8 Hugging Face OpenAssistant 2023-06-24
8 Dataset of 35,316,999 HackerNews Posts and Comments (2006 – 2023) 2023-04-24
8 Show HN: Athelas – Automagically Repair Broken Code 2023-01-03
7 Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone 2024-04-23
7 TinyStories: How Small Can Language Models Be and Still Speak Coherent English? 2023-05-16
7 Am I in the Stack? 2024-03-20
7 Common Corpus: the largest public domain dataset for training LLMs 2024-03-20
7 Introducing “Clerkie”: A LangChain Q&A bot for AI developers 2023-01-18
7 Show HN: Step up your Midjourney AI images with this prompt autocomplete 2022-09-10
7 Hugging Face launches Agents 2.0 2024-05-13
7 OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 2024-02-26
7 Microsoft's Orca 7B may violate OpenAI's Terms of Use 2023-12-05
7 Stable Beluga 2 – Llama2 70B finetuned on an Orca style Dataset by Stability AI 2023-07-28
7 Databricks’ dolly-v2-12B, an instruction-following large language model 2023-04-12
7 Cerebras releases its own open source GPT models (Apache 2.0 License) 2023-03-28
7 Mini-Dust3r: A miniature version of dust3r running in a HuggingFace Space 2024-05-16
7 1B+ words corpus of original texts and experimental post-OCR correction output 2024-04-26
7 Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out 2024-03-14
7 Grandmaster-Level Chess Without Search 2024-02-08
7 Create a Web Interface for Your LLM in Python 2024-01-23
7 Show HN: Interactively explore your Hugging Face dataset with one line of code 2023-10-25
7 Show HN: DocQuery, an OSS tool for analyzing documents with LLMs 2022-09-01
7 Show HN: State-of-the-Art Conversational AI 2019-05-09
6 CodeFusion: A Pre-Trained Diffusion Model for Code Generation 2023-10-30
6 OpenChat 3.5: 7B model with comparable perf to ChatGPT 2023-11-02
6 New leaderboard drop: Judge Arena 2024-11-19
6 Phased Consistency Model 2024-05-29
6 Generate Illusions with Stable Diffusion 2023-09-16
6 Mann-E, an open source Equivalent of Midjourney reached its version 4.1.3 2023-03-04
6 A Llama 70B finetune that has reflection baked into its weights 2024-09-05
6 Show HN: Understand politics by visualising manifesto embeddings 2024-07-07
6 Mistral releases the v0.3 of its 7B LLM 2024-05-22
6 Idefics2: A Powerful 8B Vision-Language Model for the Community 2024-05-14
6 Show HN: Open-source LLM for data labeling 2024-05-08
6 Dolphin-2.9-Llama3-8B 2024-04-21
6 Introduction to 3D Gaussian Splatting 2024-04-02
6 Qwen is a large language model series by Alibaba Cloud 2023-09-27
6 Show HN: TCO Calculator to compare on-prem LLM deployment vs. OpenAI and Co 2023-08-21
6 Llama-2-70B-instruct-v2 2023-08-03
6 Falcon 40B-Instruct GGML 2023-06-15
6 RWKV – An RNN with the Advantages of a Transformer 2023-05-15
6 Assisted Generation: a new direction toward low-latency text generation 2023-05-11
6 Databricks Publishes a Version of Dolly LLM to Hugging Face 2023-03-30
6 Hugging Face introduces Pull Requests and Discussions 2022-05-25
5 TinyLlama a 1.1B Llama model trained on 3T tokens reaches 1.0 release 2023-12-31
5 Gemma-2 2B beats GPT3.5 on Chatbot Arena 2024-07-31
5 FineWeb-Edu: new 1.3T tokens web dataset 2024-06-02
5 Wall Street Journal Hedcut Stable Diffusion Model 2024-01-23
5 New Mixtral HQQ Quantized 4-bit/2-bit configuration 2023-12-18
5 Personal co-pilot with a fine-tuning and a VSCode extension 2023-10-31
5 Segment Anything Model (Sam) in the Browser with Rust and WASM 2023-09-16
5 SD-XL 1.0 Model Card 2023-07-26
5 AI Policy: Open ML Considerations in the EU AI Act 2023-07-26
5 Modified Version of Apache 2.0 License with Royalty Payments 2023-05-26
5 Creating a Coding Assistant with StarCoder 2023-05-10
5 CLIP Interrogator 2022-10-22
5 Blip: Image Captioning and Visual Question Answering AI 2022-02-26
5 Hertz-dev is an open-source model for full-duplex conversational audio 2024-11-16
5 New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Hierarchies 2024-11-11
5 Hugging Face launches HUGS: managed containers for on-premise model deployment 2024-10-23
5 Janus-1.3B: Unifying Multimodal Understanding and Generation 2024-10-18
5 Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling 2024-10-16
5 Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller 2024-10-16
5 Whisper-Large-v3-Turbo 2024-10-03
5 Show HN: Automatic chaptering – From raw transcripts to structured documents 2024-09-09
5 TabReD: A Benchmark of Tabular Machine Learning In-the-Wild 2024-07-04
5 Microsoft releases weights for Florence-2 vision model 2024-06-19
5 Phi-3-medium-128k-instruct 2024-05-22
5 Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs 2024-04-13
5 Gretel: Synthetic Text to SQL Dataset 2024-04-04
5 Detecting performance and ethical vulnerabilities in popular Hugging Face models 2024-03-21
5 Design2Code: How Far Are We from Automating Front-End Engineering? 2024-03-10
5 Genie: Generative Interactive Environments 2024-02-26
5 TTS Arena: Benchmarking TTS Models in the Wild 2024-02-25
5 Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral 2024-02-20
5 DeciLM-7B 2023-12-12
5 Nash Learning from Human Feedback 2023-12-05
5 Real-time image generation demo on Gradio 2023-11-12
5 Convert a transformers model to Core ML 2023-04-06
5 Wikipedia Txtai Embeddings Index 2023-03-21
5 Show HN: Get the gist of anyone's Twitter feed 2023-02-24
5 Illustrating RLHF that's critical for ChatGPT 2022-12-09
5 Stable Diffusion Webapp 2022-09-28
5 The World’s Largest Open Multilingual Language Model: Bloom 2022-08-15
5 Wikipedia assistant directly answers your questions 2022-02-15
5 Show HN: Generate a neural hash collision with any two images 2021-08-28
5 Show HN: Try Apple’s Neural Hash in the Browser 2021-08-25
4 Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard 2024-01-26
4 Octopus V4: a graph of language models 2024-05-02
4 Llama-3 8B Instruct 262k 2024-04-26
4 CodeGemma – an official Google release for code LLMs 2024-04-09
4 Solar 10.7B: Elevating AI, Effortlessly 2023-12-27
4 WhiteRabbitNeo model series can be used for offensive/defensive cybersecurity 2023-12-20
4 Eric Hartford releases uncensored dolphin-2.5-mixtral-8x7B 2023-12-14
4 XTTS: New Generative model for Voice (weights released on HF) 2023-09-15
4 Prompt Injection Detection Model 2023-06-14
4 GPT-2 Output Detector 2022-12-05
4 Chat with Gandalf (GPTJ-6B) 2021-10-14
4 Apple Open-Sources LLM DCLM-7B 2024-07-19
4 Open LLM Leaderboard v2 2024-06-29
4 Florence 2, Microsoft OCR Model 2024-06-20
4 Apple OpenELM Instruct Models 2024-04-24
4 Phi-3 Released 2024-04-23
4 GemMoE: An 8x8 Mixture Of Experts based on Gemma 2024-03-13
4 Pearl-3x7B, an xtraordinary Mixture of Experts (MoE) for data science 2024-02-07
4 Introduction to State Space Models (SSM) 2024-01-24
4 Distributed Inference and Fine-Tuning of Large Language Models over the Internet 2023-12-17
4 Distil-Whisper: Distil-Small.en 2023-12-14
4 2-bit and 4-bit versions of Mixtral 2023-12-11
4 Nous-Capybara-34B-200k 2023-11-14
4 An open-source and privacy-by-design Conversational AI in-browser 2023-09-22
4 Large Language Models for Compiler Optimization 2023-09-14
4 Gaussian viewer streaming splats in web browser 2023-09-12
4 Puma: Secure Inference of LLaMA-7B in Five Minutes 2023-07-25
4 FreeWilly2: New LLM from Stability AI 2023-07-24
4 40B LLM wants to charge 10% royalty on revenue? 2023-05-26
4 Falcon-40B 2023-05-26
4 Fully Open Source LLM Chat App – Chat about the Transformers Docs 2023-03-14
4 Karlo, the first open source DALL-E 2 replication is here 2022-12-21
4 Show HN: Thought Leadership as a Service 2022-06-09
4 HtmlRAG: HTML Is Better Than Plain Text for RAG Systems 2024-11-06
4 Structured generation with Outlines, now in Rust 2024-10-22
4 Llama 3.2 in the Browser with WebGPU 2024-09-30
4 Multimodal TextImage Augmentation for Document Images 2024-09-14
4 'Reflection 70B' AI model could be the answer to pesky LLM hallucinations 2024-09-06
4 Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers 2024-08-14
4 FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving manner 2024-08-13
4 Introduction to Ggml 2024-08-13
4 Google releases Gemma 2 2B, ShieldGemma and Gemma Scope 2024-08-01
4 Gemma 2 2B Release 2024-08-01
4 Extracting Concepts from LLMs: Anthropic's recent discoveries 2024-06-08
4 EasyAnimate: End-to-end solution for high-resolution and long video generation 2024-06-04
4 Grokked Transformers Are Implicit Reasoners 2024-05-27
4 Paligemma: A versatile and lightweight vision-language model (VLM) 2024-05-14
4 4M Context – Llama-3-8B-Instruct 2024-05-09
4 ReFT: Representation Finetuning for Language Models 2024-04-05
4 Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage 2024-03-22
4 Show HN: Chatbot Guardrails Arena 2024-03-21
4 Quanto: A PyTorch Quantization Toolkit 2024-03-18
4 On-device background removal with Transformers.js 2024-02-07
4 SegMoE: Segmind Mixture of Diffusion Experts 2024-02-05
4 NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs 2024-02-03
4 HuggingChat Assistants: Open source models with custom instructions 2024-02-02
4 TinyLlama Reaches 3T Checkpoint 2023-12-28
4 Obsidian-3B 2023-11-25
4 Yarn-Llama-2-70B-32k 2023-11-20
4 SDXL in 4 steps with Latent Consistency LoRAs 2023-11-09
4 Zephyr 7B 2023-10-27
4 Apple/coreml-stable-diffusion-XL-base-iOS 2023-09-30
4 DeepSpeed-Chat: Easy RLHF Training of ChatGPT-Like Models at All Scales 2023-08-04
4 Deploy LLMs with Hugging Face Inference Endpoints 2023-07-04
4 Instruct-Codegen: open-source instruction following codegen model 2023-05-27
4 MPT-7B-StoryWriter-65k+: LLM for super long contexts (Apache 2.0) 2023-05-05
4 BioGPT for Biomedical Scientific Discovery 2023-02-07
4 Using LoRA for Efficient Stable Diffusion Fine-Tuning 2023-01-26
4 From GPT2 to Stable Diffusion: Hugging Face Arrives to the Elixir Community 2022-12-09
4 Stable Diffusion pre-loaded with 250 community textual inversion concepts 2022-09-14
4 Overview of how Stable Diffusion works 2022-08-27
4 Editing Videos by Editing Text 2022-05-23
4 Latent Diffusion, open source alternative to DALL·E 2 2022-04-13
4 Snowball Fight: a multi-agent environment for ML-Agents 2021-12-02
4 Demucs: Music Source Separation in the Waveform Domain 2021-10-22
4 Deep Learning over the Internet: Training Language Models Collaboratively 2021-07-20
4 Hugging Face: How to train a new language model from scratch 2020-02-14
4 NLP: State-of-the-art neural coreference resolution system 2017-07-07
3 MiniLM-L6-v2 maps paragraphs to 384 dimension vector for clustering or search 2023-03-21
3 Show HN: Turn Any Article into a Conversation-Like Podcast 2024-05-22
3 Phi-1.5 (1.3B Outperforms Llama 2 7B) 2023-09-12
3 GPT-2B-001 2023-04-20
3 Model converts casual text to formal 2021-10-19
3 Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI 2024-10-15
3 AI has a problem with objectifying women 2024-05-28
3 Linus Torvalds Chat Bot 2024-02-02
3 ChatQA: Building GPT-4 Level Conversational QA Models 2024-01-19
3 10.7B Solar: Elevating Performance with Upstage Depth Up Scaling 2023-12-18
3 Voice Chat with Mistral 7B 2023-10-16
3 Hugging Face partners with AMD to accelerate state-of-the-art models 2023-06-14
3 Frames: Factuality, Retrieval, and Reasoning MEasurement Set 2024-10-01
3 Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and it's sick 2024-09-20
3 Chronos-T5 (Tiny) – pretrained time series forecasting models 2024-08-14
3 HF for Legal, an open-source community on Hugging Face 2024-07-01
3 LegalKit, French labeled datasets built for legal ML training 2024-06-27
3 Nvidia releases ChatQA-1.5 in violation of Llama 3 license 2024-05-02
3 Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding 2024-04-26
3 Everyone seems to have forgotten about Gemma 2024-04-25
3 Introducing the Open Chain of Thought Leaderboard 2024-04-23
3 Google Gemma 1.1 2B and 7B instruct 2024-04-06
3 Starcoder-2 2024-02-28
3 DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development 2024-02-09
3 Nous-Hermes-2-SOLAR-10.7B 2024-01-02
3 Solar 10.7B 2023-12-27
3 Transformer.js: Machine Learning for the Web 2023-12-09
3 PixArt-α: Fast Training of Diffusion Transformer for Text-to-Image Synthesis 2023-12-04
3 Laiyer AI Released Its Open Source Prompt Injection Model 2023-11-29
3 LZMD: Lempel-Ziv Montecarlo Diffusion file format 2023-11-29
3 Faster MusicGen Generation with Streaming 2023-10-06
3 Llama 2 on Amazon SageMaker a Benchmark 2023-09-26
3 LoRA Roulette 2023-09-22
3 Open-source AI Discord bots with HuggingFace 2023-08-17
3 StableBeluga-7B 2023-07-29
3 MPT-30B – Apache 2.0 licensed LLM 2023-07-22
3 Show HN: I created a first-of-its-kind open corpus of Australian law 2023-06-26
3 Show HN: DocsGPT-7B – purpose optimised and finetuned model for documentation QA 2023-06-16
3 Alpaca Dataset Translated into Polish 2023-04-12
3 Bert 101 State of the Art NLP Model Explained 2022-03-02
3 Transformers Can Do Bayesian Inference 2021-11-09
3 SemScore: Evaluating LLMs with Semantic Similarity 2024-11-06
3 Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints 2024-10-31
3 Hugging Face Now Automatically Detects Leaked Secrets 2024-09-05
3 Selective fine-tuning of Language Models with Spectrum 2024-09-03
3 Idefics3: Open multimodal model based on Llama-3.1-8B 2024-08-09
3 New Google Gemma 2 2B model 2024-07-31
3 Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth 2024-07-29
3 DiLoCo: Distributed Low-Communication Training of Language Models 2024-07-26
3 The largest math dataset of Olympiad problems for training LLMs 2024-07-21
3 SmolLM – Fast and Remarkably Powerful 2024-07-16
3 Whisper WebGPU: Real-time in-browser speech recognition 2024-06-08
3 UGI Leaderboard – Uncensored General Intelligence 2024-06-07
3 Transformers Are SSMs: Generalized Models and Efficient Algorithms Through 2024-06-04
3 Recovering 4D World from Monocular Video 2024-05-29
3 LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models 2024-05-26
3 Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data 2024-05-26
3 Phi-3 in-browser inference using WebGPU 2024-05-08
3 Show HN: GPT Fine-Tune Formatter 2024-05-07
3 InstantMesh: Efficient 3D Mesh Generation from a Single Image 2024-04-15
3 Mixture of Finetuned and GPT4 Model 2024-04-07
3 H2O-Danube2-1.8B-Chat 2024-04-07
3 Yi-9B 2024-04-05
3 Dolphin-2.8-mistral-7B-v02 2024-04-03
3 Common Corpus – Start of the largest public domain dataset for training LLMs 2024-03-20
3 MoAI: Mixture of All Intelligence for Large Language and Vision Models 2024-03-14
3 OpenChat-3.5-0106-Gemma 2024-03-10
3 Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping 2024-02-23
3 Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens 2024-02-22
3 Stable Diffusion XL Lightning 2024-02-21
3 Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases 2024-02-03
3 Show HN: A lineage explorer for open source models and datasets 2024-01-23
3 Aim – An Apple Collection 2024-01-19
3 LLaVA-3B 2024-01-01
3 Dolphin-2.6-Mistral-7B 2023-12-29
3 MonadGPT 2023-12-28
3 MiniMA-2-3B 2023-12-27
3 WaveCoder: Widespread Versatile Enhanced Instruction Tuning with Refine Data Gen 2023-12-26
3 StarVector: Generating Scalable Vector Graphics Code from Images 2023-12-20
3 AITube - Youtube but everything is AI generated 2023-12-15
3 Refact-1.6B 2023-12-08
3 Llama-2-7B-chat-mlx for Apple’s new MLX framework 2023-12-06
3 NeuralHermes-2.5-Mistral-7B 2023-11-29
3 Tulu-2-Dpo-70B 2023-11-21
3 Show HN: New Launch OrionStar-Yi-34B-Chat beats Llama2-70B and GPT-3.5-turbo 2023-11-20
3 Nvidia nemotron-3-8B-base-4k 2023-11-16
3 Optimizing LLMs in Production 2023-11-15
3 HuggingFace Daily Papers 2023-11-14
3 Make your llama generation time fly with AWS Inferentia2 2023-11-11
3 Show HN: Face-Stylization – Create face styling with just 8 images 2023-11-09
3 Document Question Answering 2023-10-30
3 Apple's LLMs and other GenAI models on HuggingFace 2023-10-19
3 Using HuggingFace to Train a GPT-2 Model for Music Generation 2023-10-09
3 MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators 2023-09-19
3 Generative Image Dynamics 2023-09-15
3 OpenHermes-13B based on Llama-2 2023-09-07
3 Llama2.c LLM: ported to Rust and running in the browser 2023-09-07
3 Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2 2023-09-01
3 Fine-tuned CodeLlama beats GPT-4 on HumanEval 2023-08-27
3 LoRA the Explorer 2023-08-17
3 Fine-tune Llama 2 with DPO 2023-08-08
3 Show HN: Goat-7B LLM, a new SOTA among the open-source 7B models 2023-07-25
3 How is ChatGPT's behavior changing over time? 2023-07-19
3 Show HN: New control net model for AI art QRcode 2023-06-27
3 Show HN: Bert-Based Classification Model for Google Local Listings 2023-06-26
3 Mosaic ML: MPT-30B-Chat 2023-06-25
3 Video Composer: Create videos using GPT-4 and FFmpeg 2023-06-15
3 MusicGen from Meta on Hugging Face 2023-06-09
3 OpenLLaMA 7B Released 2023-06-07
3 WizardLM-30B 2023-06-06
3 Can AI Code? 2023-06-05
3 Constrained Text Generation with Transformers 2023-05-22
3 StarCoder: A State-of-the-Art LLM for Code 2023-05-05
3 Swift Diffusers: Fast Stable Diffusion for Mac 2023-04-02
3 Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU 2023-03-12
3 Parameter-Efficient Fine-Tuning Billion-Scale Models on Low-Resource Hardware 2023-02-10
3 Finetuned Stable Diffusion: open, free, beautiful results near to Midjourney 2022-12-28
3 Hugging Face Machine Learning Demos Are Now on ArXiv 2022-11-17
3 Pony Diffusion 2022-10-01
3 Show HN: Audio Intelligence Dashboard 2022-09-26
3 Fast Bloom Inference with DeepSpeed and Accelerate 2022-09-15
3 YOLOv6: Real-Time Object Detection Demo 2022-07-15
3 An Introduction to Deep Reinforcement Learning 2022-05-13
3 Transform natural language queries to vector search SQL 2022-04-19
3 Single Image to 3D in the Browser 2022-04-15
3 JPEG Artifacts Removal 2022-04-12
3 Multimodal Augmentation of Generative Models Through Adapter-Based Finetuning 2022-03-20
3 AI Line Drawing Generation 2022-03-11
3 OCR Model Beats Captcha 2022-02-23
3 Fairseq S2: Scalable Speech Synthesis 2022-01-21
3 Show HN: JoJoGAN, face to JoJo, Disney and Arcane style 2021-12-29
3 AI Arcane style selfie filter 2021-12-11
3 Build composable NLP workflows with txtai 2021-11-09
3 Coqui TTS: Multi Language Synthetic Text-to-Speech 2021-09-14
3 Super Resolution Models 2021-09-13
3 SpaCy added to the Hugging Face hub 2021-07-13
3 Pretrain GPT-Neo for Open Source GitHub Copilot Model 2021-07-03
3 Scaling up BERT-like model Inference on modern CPU 2021-05-08
3 A prompt is worth 1000 data points: combining GPT3 prompting and fine-tuning 2021-04-07
3 Understanding BigBird's Block Sparse Attention 2021-04-07
3 Hugging Face – The AI community building the future 2021-04-05
3 GPT Neo on Hugging Face Transformers with Inference API 2021-03-31
3 Write with Transformer 2019-12-06
2 Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM 2024-04-19
2 Show HN: A Reassuring Parables Generator 2021-12-25
2 Try Qwen2.5-Coder-32B on HuggingChat 2024-11-12
2 An orthogonalized AI to introduce an unengaged melancholic style 2024-06-13
2 Pearl-7B-slerp, an xtraordinary 7B model for maths 2024-02-05
2 Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station 2024-01-28
2 7B model from Snorkel tops Alpaca Eval 2.0 leaderboard 2024-01-24
2 Run Deepseek Coder LLM locally 2023-12-03
2 Releasing Swift Transformers: Run On-Device LLMs in Apple Devices 2023-08-08
2 Stable Diffusion Bias Explorer 2022-11-09
2 LongVU – New Video LLM from Meta 2024-10-24
2 Hacker News Comments Dataset 2024-10-11
2 HuggingFace Accelerate 1.0.0 2024-10-07
2 Mistral-Small-Instruct-2409 2024-09-17
2 HuggingChat: Chat with Llama 3.1 (70B and 405B) 2024-07-23
2 Ocean Biodiversity Information System on Hugging Face 2024-07-21
2 CommonCanvas image generation from CC-licensed images – models, dataset released 2024-06-07
2 Show HN: PodGen generate podcasts on any topic 2024-06-01
2 Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models 2024-05-28
2 The Waifu Research Department 2024-05-16
2 Yi-1.5 LLM Models Released 2024-05-12
2 Fietje: An open and efficient LLM for Dutch 2024-05-02
2 Simple Multimodal LLM from Scratch 2024-04-23
2 Stability Releases Code Instruct 3B 2024-04-02
2 Mistral 7B v0.2 2024-04-01
2 PolarsBot, a New HuggingChat Assistant 2024-03-25
2 Easy and low cost model training on HF "DGX cloud" 2024-03-19
2 Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's Leaderboard 2024-02-19
2 LLMs can learn useful guidelines from their own mistakes 2024-02-12
2 Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard 2024-02-11
2 Aanaphi-2 3B 2024-02-09
2 Playground for Hugging Face Models 2024-02-05
2 Hallucinations Leaderboard 2024-01-29
2 Fine-tune Wav2Vec2-BERT for low resource speech recognition 2024-01-23
2 InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds 2024-01-22
2 Yayi2-30B-Llama 2024-01-01
2 Mixtral_7Bx2_MoE 2023-12-24
2 Universal AnglE Sentence Embedding: New SOTA on MTEB Leaderboard 2023-12-05
2 Non-engineers guide: Train a LLaMA 2 chatbot 2023-12-02
2 AutoTrain: (not just)LLM finetuning without code and infra 2023-11-23
2 How do you think LLM inference on CPUs? 2023-11-03
2 State-of-the-Art Ember embedding model for retrieval augmented generation 2023-10-20
2 Large Language Models as Analogical Reasoners 2023-10-05
2 QR Code Monster 2023-10-02
2 CausalLM is not optimal for in-context learning 2023-08-15
2 Count tokens used by GPT-4 and Llama for large texts (> 50k characters) 2023-08-05
2 Apply ControlNet to a Video 2023-08-01
2 Making real-time ML-powered web games with Transformers.js 2023-07-05
2 LLaMA: Large Language Model Meta AI 2023-03-17
2 Small Stable Diffusion 2023-01-19
2 Dreambooth training UI for training a model for less than US$0.80 2022-12-01
2 Stable Diffusion: Generating One Image a Second 2022-10-15
2 VToonify Web Demo for Portrait Video Style Transfer 2022-10-04
2 CodeParrot: Train and evaluate your own CoPilot model 2021-12-10
2 SummerTime: Text Summarization Toolkit for Non-Experts Web Demo 2021-09-01
2 HuggingFace AutoNLP 2021-02-27
2 Pixtral-Large-Instruct-2411 2024-11-18
2 FLUX.1-Dev LoRA Outfit Generator by TryOn Labs 2024-11-06
2 Contextual Document Embeddings 2024-11-01
2 Code a Simple RAG from Scratch – Hugging Face Community Article 2024-10-30
2 OmniParser for Pure Vision Based GUI Agent 2024-10-25
2 Hugs – Scale Your AI with Open Models 2024-10-23
2 Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins 2024-10-23
2 Pickle Scanning 2024-10-23
2 New Video Generation Model: Allegro 2024-10-22
2 TxT360 2024-10-18
2 Dataset About Where 30k+ Startups Trend 2024-10-18
2 Nvidia Nemotron 2024-10-17
2 Fixing Gradient Accumulation 2024-10-16
2 Animate-X: Universal Character Image Animation with Enhanced Motion 2024-10-15
2 SOTA Open Source Text to Video Model 2024-10-14
2 Exploring the Daily Papers Page on Hugging Face 2024-09-24
2 Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu) 2024-09-23
2 Recreating o1 at Home with Role-Play LLMs 2024-09-21
2 FineVideo: Annotated YouTube Dataset by HuggingFace 2024-09-12
2 Remove Background by Text 2024-09-12
2 Labeled Image generation using Meta Llama 3.5 2024-08-31
2 Scaling robotics datasets with video encoding 2024-08-30
2 New FashionCLIP and SigLIP Classification Demo 2024-08-28
2 Mozilla/TriLM-Llamafile · Hugging Face 2024-08-26
2 Play: How random can a human brain truly be? 2024-08-24
2 FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs 2024-08-21
2 Flux Dev 1 model that creates half_illustration images 2024-08-21
2 LLMs as Image Generators with Canonical Codec Representations 2024-08-19
2 Instant in-browser demo of SmolLM 2024-08-18
2 Marqo-FashionCLIP: New Embedding Model for Fashion 2024-08-14
2 A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine 2024-08-07
2 Generate and Export Segmentation Masks Using Meta's SAMv2 2024-07-31
2 HuggingChat: Chat with Llama 3.1 405B 2024-07-25
2 Meta-Llama-3.1-405B 2024-07-23
2 Apple's DCLM model shares data & training code with weights 2024-07-20
2 Predicting Multiplication with GPT-2 2024-07-20
2 Qwen2 Technical Report 2024-07-16
2 Gemma-2-27B-it llamafile 2024-07-03
2 OpenRAIL: Towards open and responsible AI licensing frameworks (2022) 2024-07-03
2 New LLM Agent writing actions in Python code tops the GAIA agent benchmark 2024-07-01
2 Stable Diffusion 3 Medium Online Demo, Free 2024-06-12
2 To Believe or Not to Believe Your LLM 2024-06-11
2 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs 2024-06-04
2 Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series 2024-05-31
2 Training and Finetuning Embedding Models with Sentence Transformers v3 2024-05-30
2 ChatTTS – open-source TTS model designed specifically for dialogue scenario 2024-05-29
2 Matryoshka Multimodal Models 2024-05-28
2 Aya 23: Open Weight Releases to Further Multilingual Progress 2024-05-28
2 HuggingFace Hub Incident Post Mortem 2024-05-24
2 Cohere Updates Weights for Aya 2024-05-23
2 Hugging Face on AMD Instinct MI300 GPU 2024-05-23
2 Show HN: Generate a Quiz from Any Url 2024-05-17
2 Show HN: EmuBert – the first open encoder model for Australian law 2024-05-14
2 New Yi 1.5 models under Apache 2.0 2024-05-12
2 Building Cost-Efficient Enterprise RAG Applications 2024-05-10
2 Google codegemma-1.1-7B-it 2024-05-03
2 Introduction to Matryoshka Embedding Models 2024-05-03
2 Iterative Reasoning Preference Optimization 2024-05-02
2 GPT-2 2024-05-01
2 Fine-tune Llama 3 with ORPO 2024-04-23
2 In-browser text-to-music generation using musicgen-small 2024-04-20
2 Compression Represents Intelligence Linearly 2024-04-16
2 Bringing serverless GPU inference to Hugging Face users 2024-04-16
2 From Words to Numbers: Your LLM Is a Capable Regressor 2024-04-12
2 Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace 2024-04-11
2 TinyTimeMixer: Open-source time series LLM by IBM 2024-04-09
2 Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction 2024-04-05
2 Command R+ 2024-04-04
2 Demo of Moondream2 vision language model running in browser 2024-04-03
2 Mini-Jamba 2024-04-01
2 Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs 2024-04-01
2 The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 Bits 2024-03-25
2 Cosmopedia: How to create large-scale synthetic data for pre-training 2024-03-21
2 Playground-v2.5-1024px-Aesthetic 2024-03-16
2 Gemini 1.5: Unlocking multimodal understanding across tokens of context 2024-03-15
2 Better RAG 1: Advanced Basics 2024-03-15
2 Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks 2024-03-13
2 LLM Red-Teaming Resistance Leaderboard 2024-03-01
2 Show HN: Visualize how you split your document into chunks for RAG applications 2024-02-27
2 From OpenAI to Open LLMs with Messages API on Hugging Face 2024-02-23
2 C4: colossal cleaned version of Common Crawl's web crawl corpus 2024-02-21
2 Constitutional AI with Open LLMs 2024-02-01
2 Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna AI 2024-01-31
2 AMUSEd: Efficient Text-to-Image Generation 2024-01-29
2 Minillama – 4.1 MB LLM for testing 2024-01-20
2 StableLM 2 Zephyr 1.6B 2024-01-20
2 Local vector embeddings index for analyzing ArXiv papers 2024-01-17
2 Stable Zero123 Model Weights get Released. Text to 3D and image to 3D 2024-01-15
2 Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL 2024-01-10
2 OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally 2024-01-10
2 Revolutionizing AI with Audio Classification via Computer Vision 2024-01-02
2 Chatglm3-6B-32k 2023-12-29
2 DreaMoving: A Human Video Generation Framework Based on Diffusion Models 2023-12-28
2 Dream-Talk: Realistic Audio-Driven Single Image Talking Face Generation 2023-12-24
2 Time Is Encoded in the Weights of Finetuned Language Models 2023-12-22
2 2023, Year of Open LLMs 2023-12-19
2 Hugging Face releases Optimum-Nvidia to accelerate LLM inference 2023-12-07
2 Open LLM Leaderboard: DROP deep dive 2023-12-02
2 Starling-RM-7B-Alpha 2023-11-27
2 Intel: neural-chat-7B-v3-1 2023-11-16
2 Whisper Large v3 2023-11-09
2 MonadGPT – OS ChatGPT-like for the 17th century 2023-11-09
2 OpenHermes-2.5-Mistral-7B 2023-11-08
2 Yi-34B, 76.3 on MMLU, Apache 2.0 2023-11-04
2 Templates for Chat Models 2023-10-17
2 HF Shopify Image Background Replacement 2023-10-12
2 OpenWebMath, a dataset containing every math docs found on the internet 2023-10-11
2 Paper Page – NExT-GPT: Any-to-Any Multimodal LLM 2023-09-12
2 Using Machine Learning to Improve Language Metadata on the Hugging Face Hub 2023-09-12
2 Open ASR Leaderboard 2023-09-07
2 Show HN: A LLM pull request review tool [feedback wanted] 2023-09-07
2 Technology Innovation Institute Releases Falcon 180B LLM 2023-09-06
2 Hugging Face Tutorial for Unity RL Agents 2023-08-31
2 Dolma: The Largest Open Dataset For Training Language Models 2023-08-24
2 WizardMath: Empowering Math Reasoning for LLM via Reinforced Evol-Instruct 2023-08-15
2 Hugging Face Launches Tools for Running LLMs on Apple Devices 2023-08-09
2 Open sourcing OpenAI’s function calling 2023-08-08
2 Autotrain – Create powerful AI models without code 2023-07-30
2 Understanding Embeddings 2023-07-28
2 Scaling TransNormer to 175B Parameters 2023-07-28
2 Llama 2 is here – get it on Hugging Face 2023-07-19
2 Building an AI WebTV 2023-07-18
2 Open-Source Text Generation and LLM Ecosystem at Hugging Face 2023-07-17
2 OpenOrca-Preview1 2023-07-12
2 Large Language Models can complete complex non linguistic patterns in context 2023-07-11
2 Whisper Web: Speech recognition in the web browser 2023-07-10
2 Chat with Falcon-7B-instruct demo 2023-07-08
2 OpenChat: Less is More for Open-source Models 2023-07-06
2 Can foundation models label data like humans? 2023-07-05
2 Are Text-to-image models biased? 2023-07-03
2 Orca: Progressive Learning from Complex Explanation Traces of GPT-4 2023-07-01
2 Can foundation models label data like humans? 2023-06-30
2 A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion 2023-06-30
2 Hugging Face – Transformers Agents 4.30 with local agents 2023-06-28
2 DragGan – Interactive Point-Based Manipulation on the Generative Image Manifold 2023-06-26
2 QR Code Conditioned ControlNet Models for Stable Diffusion 1.5 and 2.1 2023-06-16
2 Cluster and Visualise 100K Wines by Tasting Notes with T-SNE 2023-06-11
2 Hugging Face and IBM partner on watsonx.ai, next-gen enterprise studio for AI 2023-05-28
2 HuggingFace Demo: DragGAN 2023-05-26
2 Audit shows that safetensors is safe and ready to become the default 2023-05-23
2 A Dive into Text-to-Video Models 2023-05-15
2 HuberChat, a Chatbot trained on HubermanLab podcast (OpenAI key required) 2023-05-10
2 Demo: Code Completion with replit-code-v1-3B 2023-05-03
2 RLHF – Hugging Face Course 2023-04-27
2 Ekimetrics launches a “ChatGPT” dedicated to climate 2023-04-07
2 Alpaca GarbageCollector – Curating high-quality data for open-source LLMs 2023-04-04
2 Text2Video-Zero 2023-03-26
2 Train your own ControlNet models with diffusers 2023-03-24
2 Open source models for various Machine Learning tasks 2023-03-08
2 Ultra Fast ControlNet with Hugging Face Diffusers 2023-03-03
2 Using Stable Diffusion with Core ML on Apple Silicon 2023-02-22
2 HuggingFace/Transformers-Stats 2023-02-20
2 Playable Demo for MarioGPT: Open-Ended Text2Level Generation Through LLMs 2023-02-18
2 Faster Training and Inference: Habana Gaudi -2 vs. Nvidia A100 80GB 2023-02-16
2 Speech Synthesis, Recognition, and More with SpeechT5 2023-02-09
2 Threat actors using HuggingFace to deliver malware 2023-02-07
2 Generating Human Motion from Textual Descriptions (T2M-GPT) 2023-01-31
2 AI for Game Development: 3D Asset Generation 2023-01-20
2 Show HN: ML Q&A – Get answers to questions about ML frameworks 2023-01-05
2 Probabilistic Time Series Forecasting with Transformers 2022-12-02
2 Fine-Tune Whisper for Multilingual ASR with Transformers 2022-11-23
2 Ask a question, YouTube and OpenAI Whisper will try to answer 2022-10-28
2 Show HN: Ask YouTube – search for specific answers in videos 2022-10-28
2 New Google big language model Flan-T5 available on HuggingFace 2022-10-22
2 The Annotated Diffusion Model 2022-09-13
2 Text2Human: Text-Driven Controllable Human Image Generation 2022-08-04
2 Highly Accurate Dichotomous Image Segmentation 2022-07-31
2 The Technology Behind BLOOM Training 2022-07-23
2 BLOOM Language Model 2022-07-04
2 GPT4-Chan – Conditions for Availability 2022-06-24
2 Hugging Face Hub: discover and share ML models, datasets, and demos 2022-06-01
2 Decision Transformers on Hugging Face 2022-06-01
2 Mask Transfiner for High-Quality Instance Segmentation 2022-04-17
2 MultiMAE: Multi-modal Multi-task Masked Autoencoders 2022-04-16
2 Self-Distilled StyleGAN: Towards Generation from Internet Photos Gradio Demo 2022-04-05
2 CVPR2022 Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer 2022-03-24
2 Show HN: HF-BERTopic – Transformer based topic modeling in the browser 2022-02-02
2 Turn a Photo into an Animation 2022-01-29
2 DeepPrivacy: GANs for Face Anonymization 2022-01-24
2 Show HN: HN-KeyBERT: AI KeyPhrase extraction in the browser 2022-01-24
2 Similarity search for current Hacker News front page titles 2022-01-23
2 Steerable discovery of neural audio effects 2021-12-12
2 Image GPT 2021-12-09
2 AnimeGANV2 on Videos 2021-12-02
2 Snowball Fight, a deep reinforcement learning environment 2021-12-02
2 Facebook XLS Speech Translation 2021-11-18
2 Kinda-English RuDALL-E: Generate Images from Text 2021-11-10
2 Hugging Face Announces Infinity: Ultra-Fast Inference in Your Own Infrastructure 2021-09-28
2 Introducing Optimum: The Optimization Toolkit for Transformers at Scale 2021-09-14
2 Real-Esrgan: Training Real-World Blind Super-Resolution with Pure Synthetic Data 2021-08-28
2 Using and Mixing Hugging Face Models 2021-06-15
2 Hugging Face Course on Transformer Models 2021-06-15
2 Accelerate – Run PyTorch training on any kind of device 2021-05-16
2 Singlish/Manglish Pre-Trained Deep Learning NLP Model (Bert) 2020-08-29
2 Summary of the models available in the transformers library 2020-06-05
2 Electra is now integrated in huggingface/transformers 2020-04-06
2 Pytorch-Transformers: An Updated PyTorch Library for NLP 2019-07-17
1 Show HN: Embedding model for PDF page retrieval 2024-08-08
1 Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune 2024-05-02
1 Show HN: Elon Musk's Tweet Classifier 2022-04-30
1 Get Insulted by AI 2024-02-25
1 Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio 2024-07-29
1 With LLMs we can create an open-source Library of Alexandria 2023-09-28
1 Show HN: Find Your Celebrity Lookalike (With AI) 2023-01-04
1 Stable Diffusion trained with “El Risitas” dataset 2022-10-27
1 EleutherAI's GPT-J 2021-09-29
1 SmolLM2: The new, best, and open small language model 2024-11-01
1 The Romulus model series has been released on Hugging Face 2024-09-11
1 I added context data to the TruthfulQA dataset 2024-08-10
1 Chinese AI Community: open-source Heatmap 2024-07-31
1 Multi-token prediction models and baselines 2024-07-04
1 Mixtral or Llama 70B on Google Spreadsheet Thanks to Hugging Face's API 2024-06-17
1 Stupid Filter Corpus (2007) 2024-05-24
1 MMLU-Pro: Advanced edition of MMLU & new Leaderboard 2024-05-15
1 Ratchet and Phi 3 2024-05-01
1 Snowflake Arctic Instruct Open LLM 2024-04-24
1 LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes 2024-04-08
1 MANATEE(lm): Market Analysis based on language model architectures 2024-03-20
1 Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a Single GPU 2024-03-13
1 Serverless Image Similarity with Upstash Vector and HuggingFace Spaces 2024-02-02
1 Dutch Drug-Related Text Classification Model by NOS 2024-01-25
1 Implement Fractional GPUs in Kubernetes to save up to 50% cost 2024-01-22
1 The next person that says textual modalities gets it 2024-01-10
1 LLaMA Pro: Progressive LLaMA with Block Expansion 2024-01-05
1 DiffMorpher – Using Diffusion Models for Image Morphing 2023-12-24
1 Tencent Announces AppAgent 2023-12-22
1 How Do Prompt Injection Scanners Perform? A Benchmark 2023-12-07
1 Show HN: ChatData – an open-source ChatGPT-like chatbot 2023-11-29
1 3D Gaussian Splat Viewer (top item) 2023-10-23
1 Who loves you Hacker News? 2023-10-12
1 Curious about Causality and Generative Models? Check Out This Demo 2023-07-26
1 Have You Tried AWS Inferentia2 for ML Deployments? 2023-07-16
1 Open Source LLM Inference DLC 2023-06-29
1 WizardCoder: Empowering Code Large Language Models with Evol-Instruct 2023-06-15
1 Text Embedding Benchmark (MTEB) Leaderboard 2023-02-20
1 Diffusion Models Live Event with Hugging Face 2022-11-25
1 Train a language model with Megatron-LM and convert it to Transformers 2022-09-13
1 Multilingual GPT model with 1.3B parameters trained on 25 languages 2022-05-01
1 Hugging Face Model Comparator Space Builder 2022-03-28
1 Deploy Hugging Face Models Easily with Amazon SageMaker 2021-07-20
1 HuggingFace Transformers Meets Vision 2021-05-13
1 Neural network chatbot trained through Neuralconvo, a Torch library 2016-09-08
1 Halo: Open-Source Health Tracking with Wearables 2024-11-20
1 Releasing the largest multilingual open pretraining dataset 2024-11-14
1 Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for coding 2024-11-12
1 Providing Open Investment Data – 25 years of data 2024-11-11
1 New SOTA Text to Image 2024-10-31
1 Stable Diffusion 3.5 Medium 2024-10-29
1 Kolors Virtual Try-On in the Wild 2024-10-28
1 Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval 2024-10-23
1 Stable Diffusion 3.5-large released 2024-10-22
1 Transformers.js v3: WebGPU Support, New Models and Tasks, and More 2024-10-22
1 Allegro – New Open Source Text to Video Generator from Rhymes AI 2024-10-22
1 Distilabel Synthetic Data Generator on Hugging Face 2024-10-17
1 HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance 2024-10-17
1 Show HN: A dataset of all HN submission texts (2006-2024) in Markdown 2024-10-13
1 Scaling AI-Based Data Processing with Hugging Face and Dask 2024-10-10
1 LLMs Know More Than They Show 2024-10-08
1 Document Similarity Search with ColPali 2024-09-29
1 Prithvi WxC: Foundation Model for Weather and Climate 2024-09-24
1 Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance 2024-09-24
1 HN-Style HuggingFace Daily Papers 2024-09-22
1 Qwen2.5-Coder Technical Report 2024-09-21
1 Introducing Community Tools on HuggingChat 2024-09-20
1 InkubaLM-0.4B: Small language model for low-resource African Languages 2024-08-29
1 Diffusion models are real time game engines 2024-08-29
1 Everchanging Quest: Rogue-like game powered by LLMs 2024-08-21
1 xLSTM Model Trained on Music 2024-08-16
1 Qwen2-VL 2024-08-14
1 Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters 2024-08-07
1 Depth Compare – A Hugging Face space to compare different depth models 2024-07-29
1 Insilico Medicine on Hugging Face 2024-07-27
1 LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs 2024-07-26
1 Spreadsheetllm: Encoding Spreadsheets for Large Language Models 2024-07-24
1 Followgraph for Hugging Face 2024-07-23
1 Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text prompts 2024-07-23
1 Scaling Diffusion Transformers to 16B Parameters 2024-07-19
1 DeepSeek v2 Chat (0628) released 2024-07-18
1 The Rise of Agentic Data Generation 2024-07-15
1 Fast SD3 Medium 2024-07-10
1 Agentic RAG: query reformulation and self-query 2024-07-08
1 Meta LLM Compiler 2024-06-29
4 From Files to Chunks: Improving HF Storage Efficiency 2024-11-20
3 Dataset Card for 1M Bluesky Posts 2024-11-27
3 New 2B vision language model that consumes the least memory 2024-11-26
4 Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg 2024-11-24
3 New synthetic dataset beating MSFT and mistral's SFT recipe 2024-11-22
1 Allegro-TI2V: an open source video generation model 2024-11-27
1 PR Puppet Sora 2024-11-27
2 OpenGPT-X 2024-11-26
1 Lightricks/LTX-Video – first real-time video generation model 2024-11-23
425 Llama-3.3-70B-Instruct 2024-12-06
4 Show HN: LatComp – Compress your image into a small and reversible format 2024-11-30
3 Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer 2024-12-06
3 Quantum+AI Qiskit Code Assistant Open Source model 2024-11-27
3 informatiker/20-million-bluesky-posts 2024-11-29
3 Automated GitHub Issue Creation Using Structured Generation 2024-11-29
3 QwQ-32B-Preview 2024-11-27
2 Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI' 2024-11-28
2 The Lichess database is now on Hugging Face 2024-12-06
2 LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs 2024-12-05
2 Releasing: A dataset of two million Bluesky posts 2024-11-27
1 PaliGemma 2 – New vision language models by Google 2024-12-05
1 Open Source Developers Guide to the EU AI Act 2024-12-03
1 LM Studio using models from Hugging Face 2024-12-02
1 IC Light – Shade Generation Model 2024-12-02
348 A Replacement for BERT 2024-12-19
10 Show HN: Downloadable AI Musical Instruments 2024-12-10
9 Spaces ZeroGPU: Dynamic GPU Allocation for Spaces 2024-12-15
8 Scaling Test Time Compute with Open Models 2024-12-16
5 Moonshine – open-source, real-time speech-to-text in the browser 2024-12-19
3 Welcome to the Falcon 3 Family of Open Models 2024-12-17
3 Meta releases family of multimodal models that comprehend hour-long video 2024-12-16
3 Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset 2024-12-09
2 Just launched MilkDropLM model using 32B parameters 2024-12-20
2 FineMath: the best public math pre-training dataset 2024-12-19
2 I-JEPA Hugging Face 2024-12-09
2 FineWeb2 dataset: A sparkling update with 1000s of languages 2024-12-08
1 ModernBERT 2024-12-20
1 Show HN: A ML powered text moderation model that outperforms Open AI 2024-12-14
1 Help Us Rank the Best Background Removal Tools 2024-12-11
1 I need your help to create brain-rot dataset 2024-12-08
1 Phi-4 GGUF 2024-12-14
1 HunyuanVideo and Diffusers Made Easy 2024-12-11
48 DeepSeek v3 beats Claude sonnet 3.5 and way cheaper 2024-12-26
4 DeepSeek-V3-Base 2024-12-25
11 smolagents: A simple library to build AI agents 2025-01-02
10 Phi-4 weights have been released under MIT license 2025-01-08
3 Timeline of AI model releases in 2024 2025-01-01
2 Vdr-2B-multi-v1 a multilingual embedding model for visual document retrieval 2025-01-10
2 Show HN: We collected detailed annotations for text-to-image generation 2025-01-10
2 Hugging Face Smolagents 2025-01-05
2 Hugging Face advocates for Code Agents: agents that write tool calls as code 2025-01-02
2 ModernBERT: Encoder-only Transformer Model Strictly Improving on past work 2025-01-01
2 Polish linguistic and cultural competency benchmark for LLMs 2024-12-31