When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
Zhengyang Sun, Yu Chen, Xin Zhou, Xiaofan Li, Xiwu Chen, Dingkang Liang, Xiang Bai