This is awesome! I'm just getting started with ROS, and seeing that you've already done something I need for a project is so cool. Thank you so much for posting this!
Oh, I think I understand now. The agent does want to "take watch 1 from drawer 1", but the controller ignores the "from where" part: it only uses the mask to complete the action. The agent is just lucky because there is indeed a watch near drawer 1.
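
To check my understanding, here is a toy sketch of that behavior. Everything in it is made up for illustration (`parse_action`, `execute`, the `scene` dictionary, the pixel-set mask); it is not the actual controller code, just the logic as I read it: the receptacle gets parsed out of the command and is then never used.

```python
import re

# Hypothetical toy controller: NOT the real implementation, only the
# behavior described above (the mask decides, the receptacle is ignored).

def parse_action(command):
    """Split 'take watch 1 from drawer 1' into (verb, object, receptacle)."""
    match = re.fullmatch(r"(\w+) (\w+ \d+) from (\w+ \d+)", command)
    if match is None:
        raise ValueError(f"unrecognized command: {command!r}")
    return match.groups()

def execute(command, mask, scene):
    """Act on whichever object the mask covers; never check the receptacle."""
    verb, obj, receptacle = parse_action(command)  # receptacle is parsed, then unused
    for name, info in scene.items():
        if info["pixel"] in mask:
            return f"{verb} {name} (actually {info['location']})"
    return "nothing under the mask"

# Assumed scene: the watch is near drawer 1, not inside it.
scene = {"watch 1": {"pixel": (3, 4), "location": "on desk 1"}}
mask = {(3, 4)}  # assumed mask: a set of pixel coordinates

print(execute("take watch 1 from drawer 1", mask, scene))
```

In this sketch the action succeeds even though the watch is not in drawer 1, which is the "lucky" case: the command names the wrong place, but the mask happens to land on a real watch.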