This is a great leap forward and indeed a major breakthrough. Understanding language plus the physical world and making sense of both is mind-blowing. Well done, team @Covariant!
As soon as there is contact, a robot has to make its decisions mainly based on touch and no longer just on vision. Of course, you can ignore that if your solution only has to work in the narrow suck-and-place niche, but it won't scale to the general setting, where the robot has to perform dexterous manipulations on deformable objects and use tools that were made for human hands.