Pose Detection and Game Animations

jasonrespress
Jul 27, 2024

I am currently working on integrating MediaPipe into Unreal Engine 5 to generate character animations, with the goal of making game prototyping easier.

Design

Below is an early mockup:

Design Pillars:

Security and Privacy
- Local AI Model Inference
- Does not require an internet connection for use
Friendly Animation Interface
- Drag and Drop interaction with the AI
Real-Time Performance
- AI inference runs efficiently on both CPU and GPU

Top Features:

- Pose generation from image or video
- Model selection and fine-tuning
- Bone and joint matching
- Custom control rig editor
- Pose library

The basic premise is that a user can drag a photo or movie into the UI and the software will generate a pose. Sounds pretty simple, right? Well...

Current Progress:

So far, I have been able to import an image or video, run inference, and "affect" the control rig. This is remarkable progress, but I have run into a few roadblocks that will take another week or two to overcome:

1. MediaPipe's output for global landmarks has an artifact where depth is calculated with an offset when viewing points from top to bottom of the image
2. Using IK offset joints is not sufficient
3. Joint rotation must be calculated prior to creating a pose

Above is what my interface looks like in the engine. Roadblock #3 is apparent, as the torso should be leaning slightly forward. This is not too bad; one can establish a local plane of rotation around the hips and apply that rotation to the torso control. However, there is an additional layer of complexity: the order of operations in which the rotations are applied.
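As a rough illustration of the hip-plane idea, here is a minimal sketch in plain Python. It assumes four 3D landmarks (left/right hip and left/right shoulder, which are indices 23, 24, 11, and 12 in MediaPipe's pose model) are already available as (x, y, z) tuples. The function names `torso_basis` and `lean_angle_deg` are hypothetical, and the sign of the forward axis depends on the coordinate convention in use, so treat this as a starting point rather than the method used in the plugin.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def midpoint(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)

def torso_basis(l_hip, r_hip, l_shoulder, r_shoulder):
    """Build an orthonormal (right, up, forward) frame anchored at the hips."""
    # The hip line defines the local "right" axis.
    right = normalize(sub(r_hip, l_hip))
    # The spine runs from the hip midpoint to the shoulder midpoint.
    spine = sub(midpoint(l_shoulder, r_shoulder), midpoint(l_hip, r_hip))
    # "forward" is the normal of the plane spanned by the hip line and the spine.
    forward = normalize(cross(right, spine))
    # Re-derive "up" so the three axes are mutually perpendicular.
    up = cross(forward, right)
    return right, up, forward

def lean_angle_deg(torso_up, reference_up):
    """Angle in degrees between the torso's up axis and a reference up axis."""
    c = max(-1.0, min(1.0, dot(torso_up, normalize(reference_up))))
    return math.degrees(math.acos(c))
```

The resulting frame could then be converted to a rotator for the torso control; how that maps onto Unreal's axes is left open here, since it depends on the rig.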

The result is far less favorable when trying to apply a pose with rotated limbs. Earlier in the post, I mentioned that I am moving the IK controls on joints to create a pose. This is both the right and the wrong way to make a pose from this data. It works for simple positions, but moving an IK control laterally eventually reaches a point where minor joint rotations are also required.
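To make the joint rotation requirement more concrete, the sketch below derives a bend angle and rotation axis for a single joint from three landmark positions (for example shoulder, elbow, and wrist). The function name `joint_bend` is hypothetical, and this is only one way such a rotation might be estimated; it ignores twist along the bone, which landmark positions alone cannot determine.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def joint_bend(parent, joint, child):
    """Return (angle_deg, axis) rotating the parent bone direction onto the child bone.

    angle_deg is 0 for a straight limb. axis is a unit vector, or None when the
    limb is straight (or folded flat) and the axis is therefore undefined.
    """
    upper = sub(joint, parent)   # e.g. shoulder -> elbow
    lower = sub(child, joint)    # e.g. elbow -> wrist
    len_upper = math.sqrt(dot(upper, upper))
    len_lower = math.sqrt(dot(lower, lower))
    if len_upper == 0.0 or len_lower == 0.0:
        raise ValueError("landmarks must not coincide")
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot(upper, lower) / (len_upper * len_lower)))
    angle_deg = math.degrees(math.acos(cos_angle))
    axis = cross(upper, lower)
    axis_len = math.sqrt(dot(axis, axis))
    if axis_len < 1e-9:
        return angle_deg, None
    return angle_deg, (axis[0] / axis_len, axis[1] / axis_len, axis[2] / axis_len)
```

An angle and axis like this could be applied to the joint before the IK control is positioned, so the IK solve starts from a limb that is already roughly oriented.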

Finally, roadblock #1 is partially solved by letting the user supply a camera angle to account for depth. This adds a bit of complexity to the user experience, but the gain in accuracy should be a good payoff.
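One way a camera angle could be applied is to rotate every landmark about the horizontal axis by the camera's pitch before building the pose. The sketch below assumes landmarks are (x, y, z) tuples with x as the horizontal axis; the function name `compensate_camera_pitch`, the axis choice, and the sign of the angle are all assumptions for illustration, not a description of the actual correction in the plugin.

```python
import math

def compensate_camera_pitch(landmarks, pitch_deg):
    """Rotate each (x, y, z) landmark about the x axis by pitch_deg degrees."""
    theta = math.radians(pitch_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Standard rotation about x: x is unchanged, y and z mix.
    return [(x, y * c - z * s, y * s + z * c) for (x, y, z) in landmarks]
```

Because the same rotation is applied to every point, distances between landmarks are preserved; only the depth that a tilted camera bakes into the vertical axis is redistributed.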

The plan for what is next

The workflow within the engine is feature-complete yet buggy. The bugs originate from data processing after inference. I plan to make the data more accurate and account for more joint rotations.
The UI does not yet address the UX issues discovered during usability testing. I will be working on that as well.
