HuMo AI

Multi-modal input, human-centric video with consistent subject & audio-visual sync

AI Tool English

About HuMo AI

HuMo AI is a human-centric video generation tool co-developed by Tsinghua University and ByteDance. It accepts multi-modal inputs (text, image, audio) in three modes: TI (Text+Image), TA (Text+Audio), and TIA (Text+Image+Audio).
It targets two common problems in generated video, subject inconsistency and audio-visual mismatch. No advanced skills are needed, which makes it suitable for creators who want an efficient way to produce high-quality video.

Website Information

Category: AI Tool

Language: English

Added: Dec 16, 2025

Updated: Mar 11, 2026

Submitted by

zhao68733

Member since Dec 2025
