Create a custom practice set
Pick category, difficulty, number of questions, and time limit. Start instantly with your own quiz.
No weekly quiz is published yet. Check the weekly page for the latest updates.
Answer: Reinforcement / RLHF
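Reinforcement Learning from Human Feedback (RLHF) first trains a reward model on human preference comparisons between model outputs, then fine-tunes the LLM with reinforcement learning to maximize the reward that model assigns. It is a key technique for aligning AI behavior with human values.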
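The two stages can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two-number feature vectors, the three candidate responses, the linear reward model, and the helper names `train_reward_model` and `optimize_policy`. It uses a Bradley-Terry style pairwise loss for the reward model and exact gradient ascent on a softmax policy over a fixed candidate set. Practical RLHF systems use neural networks and typically add a penalty that keeps the policy close to the original model, which this sketch omits.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def reward(w, x):
    """Linear reward r(x) = w . x (a toy stand-in for a learned reward model)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Stage 1: fit w so that human-preferred responses score higher.

    Minimizes the pairwise loss -log(sigmoid(r(chosen) - r(rejected))).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Descent direction of the loss w.r.t. w is (1 - sigmoid(margin)) * diff.
            step = lr * (1.0 - sigmoid(margin))
            w = [wi + step * di for wi, di in zip(w, diff)]
    return w

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def optimize_policy(w, responses, lr=0.5, steps=200):
    """Stage 2: raise the probability of responses the reward model scores highly.

    The policy is a softmax over a fixed list of candidate responses, and the
    update is exact gradient ascent on the expected reward.
    """
    logits = [0.0] * len(responses)
    rewards = [reward(w, x) for x in responses]
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # d E[r] / d logit_i = p_i * (r_i - E[r])
        logits = [l + lr * p * (r - expected)
                  for l, p, r in zip(logits, probs, rewards)]
    return softmax(logits)

# Hypothetical features per response: [helpfulness, rudeness].
responses = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Hypothetical human preferences as (chosen, rejected) pairs.
preferences = [
    (responses[0], responses[1]),
    (responses[0], responses[2]),
    (responses[1], responses[2]),
]

w = train_reward_model(preferences, dim=2)
probs = optimize_policy(w, responses)
print("reward weights:", w)
print("policy probabilities:", probs)
```

In this toy setup the learned weights reward helpfulness and penalize rudeness, so the optimized policy places most of its probability on the first response.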