# Synth Training

On-device reinforcement learning for Synth humanoids using TorchSharp. Train directly in the Unity editor or on Meta Quest, with no external Python server needed.
## Features

- **Dual Algorithm Support** – SAC (off-policy) and PPO (on-policy) via a shared `BaseTrainingSkill` abstraction. Choose the right algorithm for each task.
- **DeepMimic Imitation Learning** – `ImitationLearningSkill` tracks reference AnimationClips using pose, velocity, and key-body rewards, with multi-clip support and hard negative mining.
- **Continuous Learning** – `ContinuousLearningSkill` for persistent, always-on training with phase-based reward shaping and contact-based micro-rewards.
- **Inference Mode** – Run trained policies without the training loop. Deterministic or stochastic action modes, with automatic model loading from saved checkpoints.
- **Platform-Adaptive** – macOS (Metal/MPS GPU), Android/Quest (CPU), Windows (CPU). The training thread auto-throttles based on platform capabilities.
- **Model Deployment Pipeline** – Automatic packaging of trained models into builds via `IPreprocessBuild`/`IPostprocessBuild` hooks. First-launch extraction on device via `ModelBootstrap`.
- **Double-Buffered CPU Inference** – PPO uses lock-free CPU inference clones, allowing the main thread to run inference while the training thread updates GPU weights concurrently.
- **Progressive Action Curriculum** – Unlock joints in stages as the agent improves, with automatic target-entropy adjustment.
- **Live Training Dashboard** – Editor window (`Synth/Training Dashboard`) with real-time graphs for reward components, losses, alpha, SPS, and skill-specific diagnostics.
- **Motion Reference Tooling** – Extract reference motion from AnimationClips, play it back on non-MuJoCo characters, and visually validate motion extraction pipelines.
- **Atomic State Persistence** – Crash-safe save/load using a temporary file and atomic rename. Survives interrupted writes.
- **IL2CPP Compatible** – Custom bridge for TorchSharp on IL2CPP (Quest/Android). A static forward-slot pool avoids marshalling issues.
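The atomic save/load described above follows a standard pattern: write the new state to a temporary file in the same directory, then atomically rename it over the target, so readers never see a half-written file. A minimal Python sketch of the idea (illustrative only; the package's `StatePersister` is C#):

```python
import os
import tempfile

def atomic_save(path: str, data: bytes) -> None:
    """Crash-safe write: readers never observe a partially written file."""
    dir_name = os.path.dirname(path) or "."
    # Temp file must live on the same filesystem as the destination
    # so the final rename is a single atomic metadata operation.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX and Windows
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies mid-write, the target file still holds the previous complete checkpoint and only a stray `.tmp` file is left behind.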
## Ecosystem

synth-training is part of a three-package architecture for creating, training, and interacting with physics-simulated humanoids:
| Package | Role | |
|---|---|---|
| **synth-core** | Humanoid creation, MuJoCo physics, skill architecture | Required |
| **synth-training** (this repo) | On-device RL training (SAC + PPO) and inference via TorchSharp | – |
| **synth-vr** | Mixed reality interaction on Meta Quest | Optional |
synth-core provides the physics body, motor system, and extensible skill/sense interfaces that synth-training builds on. This package implements `ISynthSkill` to add learning directly in Unity. When combined with synth-vr, training runs live on Meta Quest while you physically interact with the Synth in your room.
## Requirements

- Unity 6000.x or later
- synth-core package
- MuJoCo Unity plugin (`org.mujoco`) – via the arghyasur1991/mujoco fork (`synth-patches` branch)
- TorchSharp fork (`unity-il2cpp-support` branch) – includes the IL2CPP bridge for Quest/Android
- Platform-specific native LibTorch libraries (see build instructions below)
## Build Prerequisites (for native libraries)
| Requirement | Purpose |
|---|---|
| .NET SDK 8+ | Build TorchSharp managed DLL |
| CMake 3.18+ | Cross-compile LibTorchSharp for Android |
| Android NDK r26+ | Android arm64 cross-compilation |
| PyTorch source (v2.7.1) | Build LibTorch for Android (via submodule or clone) |
## Installation

Add to `Packages/manifest.json`:

```
{
  "dependencies": {
    "com.genesis.synth.training": "https://github.com/arghyasur1991/synth-training.git",
    "com.genesis.synth": "https://github.com/arghyasur1991/synth-core.git",
    "org.mujoco": "https://github.com/arghyasur1991/mujoco.git?path=unity#synth-patches"
  }
}
```
## Native Libraries

TorchSharp requires platform-specific native libraries. Build and deploy them using the included scripts:

```
# macOS (builds TorchSharp from source, deploys to Unity project)
./scripts/setup_torchsharp_macos.sh /path/to/YourUnityProject

# Android arm64 (cross-compiles LibTorch + LibTorchSharp)
./scripts/setup_torchsharp_android.sh /path/to/YourUnityProject
```
| Platform | Libraries | Deployment Location |
|---|---|---|
| macOS arm64 | `libtorch.dylib`, `libtorch_cpu.dylib`, `libc10.dylib`, `libLibTorchSharp.dylib` | `Assets/Plugins/arm64/` |
| Android arm64 | `libLibTorchSharp.so` | `Assets/Plugins/Android/arm64-v8a/` |

The managed `TorchSharp.dll` is deployed to `Assets/Packages/TorchSharp/`.
## Quick Start

### Imitation Learning (PPO)

1. Set up a Synth using synth-core (see its README).
2. Add `ImitationLearningSkill` to your Synth prefab.
3. Assign one or more reference AnimationClips.
4. Press Play – PPO training tracks the reference motion using DeepMimic rewards.
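The DeepMimic reward combines several exponentiated tracking errors into a weighted sum. A numpy sketch of that shape (the weights and error scales below follow common DeepMimic defaults and are assumptions, not necessarily this package's values):

```python
import numpy as np

def deepmimic_reward(pose_err, vel_err, root_err, key_body_err,
                     w=(0.65, 0.1, 0.1, 0.15),       # assumed weights
                     scales=(2.0, 0.1, 10.0, 40.0)):  # assumed error scales
    """Weighted sum of exp(-scale * squared_error) terms, each in (0, 1].

    A perfect match of the reference motion yields reward 1.0; each term
    decays smoothly as its tracking error grows.
    """
    errs = (pose_err, vel_err, root_err, key_body_err)
    terms = [np.exp(-s * e) for s, e in zip(scales, errs)]
    return float(np.dot(w, terms))
```

The multiplicative decay per component means no single term can be ignored: large error in any channel drives that channel's contribution toward zero.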
### Continuous Learning (SAC)

1. Add `ContinuousLearningSkill` to your Synth prefab.
2. Configure SAC hyperparameters in the inspector.
3. Press Play – training begins automatically with contact-based rewards.
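One SAC hyperparameter worth understanding before tuning is the entropy temperature `alpha`, which SAC commonly adjusts automatically toward a target entropy (the dashboard's "alpha" graph tracks this). A numpy sketch of a single temperature update step (illustrative math, not the package's implementation):

```python
import numpy as np

def alpha_update(log_alpha, log_probs, target_entropy, lr=3e-4):
    """One gradient step on J(alpha) = -alpha * (log_pi + target_entropy).

    When policy entropy falls below the target, mean(log_pi) + target > 0,
    the gradient is negative, and log_alpha rises -> more exploration.
    When entropy exceeds the target, alpha shrinks -> more exploitation.
    """
    alpha = np.exp(log_alpha)
    grad = -alpha * float(np.mean(log_probs) + target_entropy)
    return log_alpha - lr * grad
```

Optimizing `log_alpha` rather than `alpha` keeps the temperature strictly positive without any explicit clamping.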
### Inference Only

1. Check **Inference Only** on any training skill component.
2. Optionally uncheck **Deterministic Inference** for stochastic (noisy) actions.
3. Press Play – the policy runs from saved weights without training.
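Deterministic versus stochastic inference typically comes down to how a tanh-squashed Gaussian policy head is read out: take the mean, or sample around it. A numpy sketch of the distinction (function name and shapes are illustrative, not this package's API):

```python
import numpy as np

def select_action(mean, log_std, deterministic=True, rng=None):
    """Tanh-squashed Gaussian head.

    deterministic -> tanh(mean): repeatable, good for deployment.
    stochastic    -> tanh(mean + std * noise): retains exploration jitter.
    Either way the tanh bounds every action component to (-1, 1).
    """
    if deterministic:
        pre_tanh = mean
    else:
        rng = rng or np.random.default_rng()
        pre_tanh = mean + np.exp(log_std) * rng.standard_normal(mean.shape)
    return np.tanh(pre_tanh)
```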
## Model Deployment (Quest Builds)

Trained models are automatically packaged into builds:

1. Create `Assets/Resources/SynthBuildSettings.asset` via **Assets > Create > Synth > Build Settings** (auto-created on first build if missing).
2. Build for Android/Quest – models are copied from `persistentDataPath` to `StreamingAssets` pre-build.
3. On first launch, `ModelBootstrap` extracts models to `persistentDataPath` on the device. `StreamingAssets` copies are cleaned up post-build (configurable).
## Architecture

```
BaseTrainingSkill (MonoBehaviour, ISynthSkill)
├── observe → BuildFullObs → normalize ALL → policy → action
├── Inference mode: obs → deterministic/stochastic action (no training loop)
├── Training mode: obs → action → reward → store → background train thread
│
├── ImitationLearningSkill (PPO)
│   ├── DeepMimic reward (pose, velocity, root, key-body)
│   ├── Multi-clip library with hard negative mining
│   └── Reference motion advancement via AdvanceTime()
│
└── ContinuousLearningSkill (SAC)
    ├── Contact-based micro-rewards
    ├── Progressive action curriculum
    └── Prioritized experience replay
```
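The `normalize ALL` step in the pipeline above is typically a running mean/variance normalizer fitted online (Welford's algorithm), which keeps observation channels on comparable scales as their statistics drift during training. An illustrative Python sketch (not the package's `ObservationNormalizer`):

```python
import numpy as np

class RunningNormalizer:
    """Online mean/variance (Welford) used to whiten observations."""

    def __init__(self, size, clip=5.0, eps=1e-8):
        self.mean = np.zeros(size)
        self.m2 = np.zeros(size)    # sum of squared deviations
        self.count = 0
        self.clip, self.eps = clip, eps

    def update(self, obs):
        """Fold one observation into the running statistics."""
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)

    def normalize(self, obs):
        """Whiten and clip so outliers cannot saturate the network."""
        var = self.m2 / max(self.count - 1, 1)
        z = (obs - self.mean) / np.sqrt(var + self.eps)
        return np.clip(z, -self.clip, self.clip)
```

Because the statistics are part of the policy's input contract, a real implementation must save and restore them alongside the network weights at checkpoint time.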
## Package Structure

```
synth-training/
├── Runtime/
│   ├── Skills/          BaseTrainingSkill, ImitationLearningSkill,
│   │                    ContinuousLearningSkill
│   ├── Agent/           PPOAgent, SACAgent, StructuredActorNetwork
│   ├── Training/        ISkillTrainer, BaseSkillTrainer,
│   │                    PPOSkillTrainer, SACSkillTrainer,
│   │                    RolloutBuffer, ReplayBuffer, TrainingThread
│   ├── Reward/          DeepMimicReward, ContinuingReward
│   ├── Curriculum/      ActionCurriculum
│   ├── Observation/     ObservationNormalizer
│   ├── Build/           SynthBuildSettings, ModelBootstrap
│   ├── Diagnostics/     TrainingMetrics
│   ├── Persistence/     StatePersister
│   ├── MotionReference/ MotionClipExtractor, MotionReferenceData,
│   │                    ReferenceAnimationPlayer, MotionExtractionTestBench
│   └── Utility/         LearningLogger, TorchSharpLoader
├── Editor/
│   ├── TrainingDashboard.cs
│   ├── SynthModelBuildProcessor.cs
│   └── ContinuousLearningSkillEditor.cs
├── scripts/
│   ├── setup_torchsharp_macos.sh
│   └── setup_torchsharp_android.sh
└── tools~/
    └── torchsharp_android/  CMakeLists.txt, android_stubs.cpp
```
## Supported Platforms
| Platform | Device | Status |
|---|---|---|
| macOS Metal (MPS) | Mac editor | GPU training + CPU inference |
| Android CPU | Meta Quest 3 | Throttled training, inference mode |
| Windows CPU | Windows editor | Supported |
## License

Apache-2.0 – see LICENSE for details.