AI Research Review - Merging Models Modulo Permutation Symmetries
In this week's AI Research Review, we discuss "Git Re-Basin: Merging Models Modulo Permutation Symmetries". The authors conjecture that the loss landscapes of sufficiently wide neural networks contain, in effect, a single basin once permutation symmetries are accounted for: reordering a layer's hidden units (and permuting the adjacent weight matrices to match) produces a different parameter vector that computes exactly the same function. Key findings include that linear mode connectivity, the ability to linearly interpolate between two sets of trained weights without an increase in loss, appears to be an emergent property of SGD, and that two models trained with different random initializations can be aligned and merged with little to no loss barrier. Three methods are proposed for finding the permutation that maps one model's neurons onto another's: matching activations, matching weights, and learning permutations with a straight-through estimator. The paper's findings could impact distributed model training, federated learning, and model optimization, suggesting that the many minima found by SGD may, up to permutation, form a single shared low-loss basin.
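To make the weight-matching idea concrete, here is a minimal NumPy sketch for a one-hidden-layer MLP. It is not the paper's reference implementation: the parameter layout, function names, and the use of SciPy's `linear_sum_assignment` (the Hungarian algorithm) to solve the neuron-matching problem are illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def permute_hidden_units(params, perm):
    """Reorder the hidden units of a one-hidden-layer MLP.

    Permuting the rows of W1/b1 and the columns of W2 by the same
    index array leaves the computed function unchanged -- this is
    the permutation symmetry the paper exploits.
    """
    W1, b1, W2, b2 = params
    return [W1[perm], b1[perm], W2[:, perm], b2]

def weight_matching_perm(params_a, params_b):
    """Find the permutation of B's hidden units best aligned with A's.

    cost[i, j] measures the similarity of A's unit i to B's unit j,
    summed over incoming (W1) and outgoing (W2) weights; maximizing
    the total similarity is a linear assignment problem.
    """
    W1a, _, W2a, _ = params_a
    W1b, _, W2b, _ = params_b
    cost = W1a @ W1b.T + W2a.T @ W2b          # shape (h, h)
    _, cols = linear_sum_assignment(cost, maximize=True)
    return cols                                # cols[i] = B-unit matched to A's unit i

def merge(params_a, params_b, lam=0.5):
    """Align B's hidden units to A's, then linearly interpolate."""
    perm = weight_matching_perm(params_a, params_b)
    params_b = permute_hidden_units(params_b, perm)
    return [(1 - lam) * pa + lam * pb for pa, pb in zip(params_a, params_b)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shapes = [(8, 4), (8,), (2, 8), (2,)]      # W1, b1, W2, b2
    a = [rng.normal(size=s) for s in shapes]
    b = [rng.normal(size=s) for s in shapes]
    merged = merge(a, b, lam=0.5)
    print([p.shape for p in merged])
```

For deeper networks, the paper solves one such assignment problem per layer and iterates layer by layer until the permutations stop changing, since permuting one layer changes the best match for its neighbors.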
Company
AssemblyAI
Date published
Nov. 16, 2022
Author(s)
Yash Khare
Word count
245
Language
English
Hacker News points
None found.