Deep Mimic example in Webots? #18
Replies: 4 comments
-
Hello @rohit-kumar-j! Could you please elaborate on how you imagine the deep mimic example being implemented in Webots?
-
Hi @tsampazk! Thank you for your reply! You can do a
...and run the following:
I was hoping that this example could be implemented in a less cumbersome way (though not necessarily) in Webots than in the original Deep Mimic code (in Mujoco) and the example provided in pybullet. I reckon that the pybullet example has the env, action space, reward function, etc. all custom-defined.

Perhaps there could be a predefined 'general' environment, based on the deep-mimic code (but upgraded and implemented for Webots), that can be used for many other purposes, such as using the same environment for various other robots. This could also serve as a common library for dynamic robot simulations. Of course, Webots, being primarily a GUI-based simulator, would have much of the functionality in the GUI itself, and the code would be relatively simpler (compared to the pybullet implementation).

As an example implementation, we could translate the deep mimic example (above), similar to the cart-pole example. The development could perhaps be supplemented by support from Webots on their Discord server or other community forums.

Warm Regards,
Rohit
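To make the idea of a 'general' environment a little more concrete, here is a minimal sketch of what a robot-agnostic interface could look like. Everything in it is hypothetical: `MimicEnvBase`, `read_joint_positions`, `apply_action`, and the toy `EchoRobotEnv` are illustrative names, not part of deepbots or Webots, and the reward is a plain negative squared error chosen only for brevity.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple


class MimicEnvBase(ABC):
    """Hypothetical robot-agnostic imitation environment (not a deepbots class)."""

    def __init__(self, reference_motion: Sequence[Sequence[float]]):
        # One list of target joint positions per control step.
        self.reference_motion = reference_motion
        self.frame = 0

    @abstractmethod
    def read_joint_positions(self) -> List[float]:
        """Robot-specific: return the current joint positions."""

    @abstractmethod
    def apply_action(self, action: Sequence[float]) -> None:
        """Robot-specific: send the action to the motors."""

    def reset(self) -> List[float]:
        self.frame = 0
        return self.read_joint_positions()

    def step(self, action: Sequence[float]) -> Tuple[List[float], float, bool]:
        self.apply_action(action)
        pose = self.read_joint_positions()
        target = self.reference_motion[self.frame]
        # Placeholder reward: negative squared distance to the reference pose.
        reward = -sum((p - t) ** 2 for p, t in zip(pose, target))
        self.frame += 1
        done = self.frame >= len(self.reference_motion)
        return pose, reward, done


class EchoRobotEnv(MimicEnvBase):
    """Toy robot whose joints jump straight to the commanded positions."""

    def __init__(self, reference_motion):
        super().__init__(reference_motion)
        self.joints = [0.0] * len(reference_motion[0])

    def read_joint_positions(self):
        return list(self.joints)

    def apply_action(self, action):
        self.joints = list(action)


env = EchoRobotEnv([[0.1, 0.2], [0.3, 0.4]])
env.reset()
pose, reward, done = env.step([0.1, 0.2])  # perfect tracking of frame 0
```

Under this assumption, supporting a new robot would only mean overriding the two robot-specific methods, while the reference motion and reward logic stay shared.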
-
Thanks for the detailed response! The generic environment that can support multiple robots using Deep Mimic sounds like a great idea and a nice addition to deepbots examples. I will list some thoughts regarding the implementation:
Feel free to share what you think about all these and/or any additional thoughts that you might have.
-
Thank you for your reply! I now have some more insight into the implementation of Deep Mimic after some research. I might be wrong in my understanding, but here goes:

I did find this video of the implementation of deep mimic in PyBullet. Here, the ghost of the robot (the lighter-shaded robot) is merely a representation of the reward function (lines 42-79). It is not actually in a physics environment and does not experience any forces or collisions. The darker-shaded robot is supposed to follow the 'ghost' and thereby earn the reward. The ghost itself does not have a controller. It uses the methods listed in humanoid_pose_interpolator.py for the spherical linear interpolation. I think that both the actual robot and the ghost share the same group of methods to interpolate joints,
There is a custom method for interpolating motion to the joints that are referenced from the mocap files. These mocap files follow a pattern of having 43 indices. The first 3 are for the
Indeed, implementing a specific example first and then exporting a new URDF model to swap out with the humanoid is the better option. I think we can use the given humanoid.urdf at first, as the methods are already predefined in the original code within PyBullet's implementation of deep mimic.
If we can find a way to make the 'ghost' of the humanoid robot transform in space through the keyframes, within the specified keyframe durations, then we could, in theory, train the RL agent using the RobotSupervisor class after adding some additional functionality. The reward function in PyBullet's implementation is given in the getReward() method. This would not necessarily require a robot controller (teacher controller), and robots with similar physical structures could be trained.
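As a rough illustration of moving a ghost through keyframes within their durations, the sketch below interpolates a root position linearly and a root orientation with spherical linear interpolation. It is not taken from the PyBullet code: the keyframe layout `(duration_s, position, quaternion)` and the function names `slerp` and `sample_ghost_pose` are assumptions made for this example.

```python
import math


def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # take the shorter arc
        q1 = [-c for c in q1]
        dot = -dot
    if dot > 0.9995:  # nearly parallel: fall back to linear interpolation
        out = [a + t * (b - a) for a, b in zip(q0, q1)]
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        out = [s0 * a + s1 * b for a, b in zip(q0, q1)]
    norm = math.sqrt(sum(c * c for c in out))
    return [c / norm for c in out]


def sample_ghost_pose(keyframes, time_s):
    """Return (position, quaternion) of the ghost at time_s.

    keyframes: list of (duration_s, position, quaternion); duration_s is
    assumed to be the time spent moving from that keyframe to the next one.
    """
    elapsed = 0.0
    for (duration, p0, q0), (_, p1, q1) in zip(keyframes, keyframes[1:]):
        if duration > 0.0 and time_s < elapsed + duration:
            t = (time_s - elapsed) / duration
            position = [a + t * (b - a) for a, b in zip(p0, p1)]
            return position, slerp(q0, q1, t)
        elapsed += duration
    # Past the end of the clip: hold the last keyframe.
    _, p_last, q_last = keyframes[-1]
    return list(p_last), list(q_last)


# Two keyframes: identity orientation, then a half turn about one axis.
frames = [
    (1.0, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]),
    (1.0, [2.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]),
]
mid_position, mid_quaternion = sample_ghost_pose(frames, 0.5)
```

In Webots, the sampled pose could then be written to a purely visual ghost node from a supervisor, so that the ghost never takes part in the physics, which matches the description above.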
I think we do not need a predefined teacher controller, but a reward function 'ghost' transforming through space.
Yes, we need an RL agent that works on this or a similar reward function.
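For reference, a reward of this kind can be sketched as an exponentiated negative pose error between the robot and the ghost. This is a simplified stand-in, not the getReward() method itself: it compares scalar joint angles only, and the name `imitation_reward` and the default `scale` value are assumptions made for illustration.

```python
import math


def imitation_reward(robot_joints, ghost_joints, scale=2.0):
    """Reward in (0, 1]: 1.0 when the robot matches the ghost exactly."""
    squared_error = sum((r - g) ** 2 for r, g in zip(robot_joints, ghost_joints))
    return math.exp(-scale * squared_error)


perfect = imitation_reward([0.1, -0.4, 0.7], [0.1, -0.4, 0.7])
sloppy = imitation_reward([0.0, 0.0, 0.0], [0.1, -0.4, 0.7])
```

The exponential keeps the reward bounded and makes it fall off smoothly as the robot drifts away from the ghost pose.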
Yes, the student needs methods to gather information from the action executed by the teacher (ghost) and generate an action based on the action weights.
I agree. This needs an efficient implementation of the originally provided deep_mimic code. Since Webots combines GUI-based and code-based development, it might be inherently less cumbersome.

Warm Regards,
Rohit

PS: This was a long reply. I will try to shorten them from now on or reply in multiple comments.
-
Hi! Could you integrate/translate a deep mimic environment example from HERE?