Isn’t the reward function in reinforcement learning something like a desire the agent has? I mean, training works because we give it some function to minimize or maximize, a goal that it strives for. Sure, it’s a mathematical way of doing it, and in no way as complex as the different and sometimes conflicting desires and goals I have as a human. But nonetheless, I’d consider this a desire and a reason to do something at all, or machine learning wouldn’t work in the first place.
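To make concrete what I mean by "a function to maximize", here is a minimal toy sketch of my own (an epsilon-greedy two-armed bandit, not any particular library's API). The name `train_bandit`, the fixed payouts, and the parameter values are all made up for illustration. The only thing the agent ever "cares" about is the number it gets back.

```python
import random


def train_bandit(rewards, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with fixed (hypothetical) payouts."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # agent's running guess of each arm's reward
    counts = [0] * len(rewards)       # how often each arm was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random arm.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the arm that currently looks most rewarding.
            action = max(range(len(rewards)), key=lambda a: estimates[a])

        reward = rewards[action]  # the reward signal: the only feedback it gets
        counts[action] += 1
        # Incremental average of the rewards seen for this arm.
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Arm 1 pays more than arm 0 (assumed values, chosen for illustration).
estimates, counts = train_bandit([0.2, 1.0])
preferred = max(range(len(counts)), key=lambda a: counts[a])
print("pull counts:", counts, "preferred arm:", preferred)
```

Nothing in there tells the agent which arm to prefer. It ends up pulling the higher-paying arm most of the time purely because that maximizes the reward, which is the sense in which I'd call the reward a goal.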