GK Question

technology hard fill_blank

The technique that enables LLMs to learn from feedback without explicit labels is called ________ Learning.

Answer: Reinforcement / RLHF

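Reinforcement Learning from Human Feedback (RLHF) first trains a reward model on human preference judgments (typically comparisons between two candidate responses), then fine-tunes the LLM with reinforcement learning to maximize the reward that model predicts. It is a key technique for aligning AI systems with human values.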
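
A minimal sketch of the reward-model stage, assuming each response has already been reduced to a small numeric feature vector (real systems score text with a neural network). The names `reward`, `train_reward_model`, and the toy `pairs` data are hypothetical and for illustration only. It fits a linear reward so that the human-preferred response in each pair scores higher, using the pairwise logistic (Bradley-Terry style) loss commonly used for this step:

```python
import math

def reward(w, x):
    """Linear reward: dot product of weights and response features (illustrative)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit weights so that reward(chosen) > reward(rejected) for each pair.

    Minimizes -log(sigmoid(reward(chosen) - reward(rejected))) by gradient descent.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            # The gradient step is scaled by 1 - sigmoid(margin):
            # large when the model still ranks the pair wrongly, near 0 when it is confident.
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                w[i] += lr * scale * (chosen[i] - rejected[i])
    return w

# Toy preference data (assumed): each item is (features of preferred, features of rejected).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.2], [0.1, 0.8]),
]
w = train_reward_model(pairs, dim=2)

for chosen, rejected in pairs:
    print(reward(w, chosen) > reward(w, rejected))
```

In full RLHF, the learned reward would then serve as the training signal for a reinforcement learning step on the LLM itself; that second stage is not shown here.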

Topic Advanced AI/ML
Exam Relevance UPSC, Banking, SSC