In a company, daily work activities are carried out through interaction between robots and humans. In particular, two robots take care of transporting objects from one location to another, leaving the less mechanical tasks to humans.
The task of the two robots is therefore to transport an object from an initial position to a final position while avoiding collisions with other objects and machinery in the company. Furthermore, to keep a firm grasp on the object so that it does not slip, the robots must stay at a desired distance from each other.
Unfortunately, one of the two robots sometimes malfunctions, causing small deviations in its direction of movement. In particular, with probability 0.7 the robot keeps the desired direction, with probability 0.2 it deviates to the left, and with probability 0.1 it deviates to the right.
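The deviation model above can be written as a small probability table. The following is a minimal sketch, not a prescribed implementation: it assumes the six movement directions are indexed 0 to 5 in counterclockwise order, so that "left" is the next index and "right" is the previous one. The function names are hypothetical.

```python
import random

# Probabilities stated in the text for the malfunctioning robot.
P_STRAIGHT, P_LEFT, P_RIGHT = 0.7, 0.2, 0.1


def direction_distribution(intended):
    """Return {actual_direction: probability} for an intended direction 0..5.

    Assumption: directions are indexed counterclockwise, so "left" is
    intended + 1 and "right" is intended - 1 (modulo 6).
    """
    return {
        intended % 6: P_STRAIGHT,
        (intended + 1) % 6: P_LEFT,
        (intended - 1) % 6: P_RIGHT,
    }


def sample_direction(intended, rng):
    """Draw the direction actually taken, using the table above."""
    dist = direction_distribution(intended)
    directions = list(dist)
    weights = [dist[d] for d in directions]
    return rng.choices(directions, weights=weights, k=1)[0]
```

For example, `sample_direction(0, random.Random(0))` returns 0, 1, or 5, with 0 being the most likely outcome.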
To better visualize the situation, suppose the environment is represented by a grid of hexagonal cells.
Suppose that at the initial position the two robots are one cell apart, with the object between them. This is the desired configuration, which should be maintained. At each time instant, each robot can move in one of six directions, since each cell is adjacent to six other cells.
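One way to make the hexagonal grid concrete is with axial coordinates (q, r), a common convention for hexagonal grids. The text does not specify a coordinate system, so this choice, along with the names `HEX_DIRECTIONS`, `neighbors`, and `hex_distance`, is an assumption made for illustration.

```python
# Six unit moves in axial coordinates (q, r). Consecutive entries are
# adjacent directions, so index +1 / -1 is a 60-degree turn.
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(cell):
    """Return the six cells adjacent to `cell`."""
    q, r = cell
    return [(q + dq, r + dr) for dq, dr in HEX_DIRECTIONS]


def hex_distance(a, b):
    """Number of moves needed to go from cell a to cell b on an empty grid."""
    dq = a[0] - b[0]
    dr = a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2
```

With this representation, the distance between the two robots can be tracked as `hex_distance(robot_a, robot_b)` and compared against the desired value.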
The cost function accounts not only for the time steps that elapse, but also for the time lost when robots collide with obstacles. It also includes a cost related to the distance between the robots. Note that if the malfunctioning robot collides with an obstacle, it stays put for one time step (one second).
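A per-step cost with these three components might look like the sketch below. The text does not give the weights or the exact form of the distance term, so `collision_penalty`, `distance_weight`, and the absolute-deviation form are all assumptions chosen for illustration.

```python
def stage_cost(collisions, distance, desired_distance,
               collision_penalty=1.0, distance_weight=1.0):
    """Cost of one time step (hypothetical form and weights).

    collisions: number of robot-obstacle collisions in this step
    distance: current distance between the two robots, in cells
    desired_distance: distance that keeps a firm grasp on the object
    """
    time_cost = 1.0  # every step takes one second
    collision_cost = collision_penalty * collisions
    distance_cost = distance_weight * abs(distance - desired_distance)
    return time_cost + collision_cost + distance_cost
```

A step with no collision and the robots at the desired distance then costs exactly one time unit, and any collision or deviation from the desired distance adds to it.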
The goal is to determine an optimal policy that minimizes the time required to complete the task. Assume that the environment is two-dimensional. For simplicity, suppose that the object cannot collide with obstacles, and assume that the two robots may occupy the same cell, which is not considered a collision.
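The text does not say which method should be used to find the policy. One standard option for problems with random transitions and additive costs is value iteration, sketched below in a generic form. The interface (`transitions`, `cost`, a single `goal` state) is hypothetical and would need to be adapted to the joint state of the two robots.

```python
def value_iteration(states, actions, transitions, cost, goal,
                    tol=1e-9, max_iters=10_000):
    """Minimize expected total cost to reach `goal`.

    transitions(s, a) -> list of (probability, next_state)
    cost(s, a) -> immediate cost of taking action a in state s
    Returns (V, policy): expected cost-to-go and the greedy action per state.
    """
    def q_value(s, a, V):
        # Immediate cost plus expected cost-to-go of the successor state.
        return cost(s, a) + sum(p * V[s2] for p, s2 in transitions(s, a))

    V = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # the goal is terminal and has zero cost-to-go
            best = min(q_value(s, a, V) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break

    policy = {
        s: min(actions, key=lambda a: q_value(s, a, V))
        for s in states if s != goal
    }
    return V, policy
```

In the robot problem, a state would include at least the cells of both robots, an action would be a pair of intended directions, and `transitions` would apply the 0.7 / 0.2 / 0.1 deviation probabilities to the malfunctioning robot.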