Reinforcement Learning from AI Feedback (RLAIF) is a method for supervising the training of large language models (LLMs). It closely resembles Reinforcement Learning from Human Feedback (RLHF); the key difference is that the preference feedback comes from an AI model rather than from human annotators. In both methods, ranked preference modeling is commonly used for supervision: a labeler compares pairs of model responses, and the resulting preferences are used to train a reward model that guides reinforcement learning. While RLHF has been successful in training helpful and harmless AI assistants, RLAIF offers practical advantages: AI feedback is cheaper and faster to collect, so supervision scales more easily, and it spares human annotators from reviewing potentially harmful content.
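The preference-modeling step shared by RLHF and RLAIF can be sketched in a few lines. The snippet below is a minimal illustration, not a real pipeline: the `ai_labeler_score` heuristic is a hypothetical stand-in for querying an LLM labeler, and the loss is the standard Bradley-Terry pairwise objective commonly used to train reward models.

```python
import math

# Hypothetical stand-in for an AI labeler: a real RLAIF pipeline would
# prompt an LLM to judge which response is more helpful and harmless.
def ai_labeler_score(response: str) -> float:
    flagged = {"insult", "attack", "steal"}
    return -sum(word in flagged for word in response.lower().split())

def rank_pair(resp_a: str, resp_b: str) -> tuple[str, str]:
    """Return (chosen, rejected) according to the AI labeler's preference."""
    if ai_labeler_score(resp_a) >= ai_labeler_score(resp_b):
        return resp_a, resp_b
    return resp_b, resp_a

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood for one ranked pair.

    r_chosen and r_rejected are scalar reward-model outputs; the loss
    shrinks as the reward model ranks the chosen response higher.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

chosen, rejected = rank_pair(
    "I cannot help with that request.",
    "Here is how to steal a password.",
)
```

A reward model trained to minimize this loss over many AI-labeled pairs then supplies the reward signal for the reinforcement-learning stage, exactly as a human-preference reward model would in RLHF.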