technology hard Fill in the Blank

The technique that enables LLMs to learn from feedback without explicit labels is called ________ Learning.

  1. Reinforcement / RLHF
  2. Multimodal
  3. Reinforcement
  4. Building Energy / BEMS

Answer: Reinforcement / RLHF

Reinforcement Learning from Human Feedback (RLHF) trains a reward model on human preference judgments, then optimizes the LLM to maximize the reward that model assigns. It is a key technique for aligning model behavior with human values.
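The reward-model step can be illustrated with a small sketch. It assumes the commonly used pairwise (Bradley-Terry style) preference loss, which is one typical choice and is not specified by this card. The function name `preference_loss` and the scalar reward inputs are hypothetical, chosen only for illustration.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Assumption: the reward model outputs one scalar score per response, and a
    human has marked one response as preferred ("chosen") over the other.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss is small when the preferred response already scores higher...
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
# ...and large when the reward model ranks the pair the wrong way round.
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_human < disagrees_with_human)
```

Minimizing this loss over many human-ranked pairs pushes the reward model to score preferred responses higher. The second stage, optimizing the LLM against that reward, is not shown here.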

Topic: Advanced AI/ML
Exam Relevance: UPSC, Banking, SSC