The authors divide the DGB problem into two dimensions: competence (learn as well as possible) and performance (act just as well as necessary). This dichotomy between competence and performance is well known and studied in linguistics, where it was proposed by Noam Chomsky. Their approach addresses both dimensions with reinforcement learning (RL). Offline training is used to bootstrap the learning process. This can be done by letting the agent play against itself (self-learning), against other pre-programmed agents, or against human players.
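
The split between competence and performance, together with offline bootstrapping, can be illustrated with a minimal sketch. It is not the authors' implementation. The one-shot game, the `PAYOFF` table standing in for a pre-programmed opponent, the class `AdaptiveAgent`, and the rank-based rule in `adapt` are all assumptions made for illustration. The agent learns action values as well as it can (competence), then deliberately plays a lower-ranked action when it is ahead (performance).

```python
import random

# Hypothetical payoff of each action against a scripted opponent
# (illustrative numbers, not taken from the text).
PAYOFF = [0.1, 0.9, 0.5]


class AdaptiveAgent:
    """Tabular value learner for a one-shot game with a single state."""

    def __init__(self, n_actions, alpha=0.1, seed=0):
        self.q = [0.0] * n_actions      # learned action values
        self.alpha = alpha              # learning rate
        self.rank = 0                   # 0 = play the best-known action
        self.rng = random.Random(seed)  # seeded for reproducibility

    def learn(self, action, reward):
        # Competence: move the estimate toward the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])

    def ranked_actions(self):
        # Actions ordered from best to worst estimated value.
        return sorted(range(len(self.q)), key=lambda a: self.q[a], reverse=True)

    def act(self):
        # Performance: play the action at the current rank, not always the best.
        return self.ranked_actions()[self.rank]

    def adapt(self, score_diff):
        # Assumed rule: when ahead, pick a weaker action; when behind, a stronger one.
        if score_diff > 0:
            self.rank = min(self.rank + 1, len(self.q) - 1)
        elif score_diff < 0:
            self.rank = max(self.rank - 1, 0)


def offline_train(agent, payoff, episodes=2000, epsilon=0.3):
    """Bootstrap the value estimates before facing a human player."""
    for _ in range(episodes):
        if agent.rng.random() < epsilon:
            action = agent.rng.randrange(len(payoff))  # explore
        else:
            action = agent.ranked_actions()[0]         # exploit
        agent.learn(action, payoff[action])


agent = AdaptiveAgent(n_actions=len(PAYOFF))
offline_train(agent, PAYOFF)
```

During play, one would call `agent.adapt(score_diff)` after each round, with a positive value when the agent is ahead of the player, and then `agent.act()` to choose the next move. Keeping the learned values untouched while only shifting the rank is one way to keep the two dimensions separate: the agent never forgets what it learned, it only chooses how much of it to use.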