| _ | text2air | _ |
|---|---|---|
| | text2air | S: Fine-Grained Airplane Image Generation in Remote Sensing from Nature Language |
| _ | text2avatar | _ |
|---|---|---|
| | text2avatar | : Articulated 3D Avatar Creation With Text Instructions |
| _ | text2concept | _ |
|---|---|---|
| | text2concept | : Concept Activation Vectors Directly from Text |
| _ | text2hoi | _ |
|---|---|---|
| | text2hoi | : Text-Guided 3D Motion Generation for Hand-Object Interaction |
| _ | text2im | _ |
|---|---|---|
| Im2Text and | text2im | : Associating Images and Texts for Cross-Modal Retrieval |
| _ | text2lidar | _ |
|---|---|---|
| | text2lidar | : Text-guided Lidar Point Cloud Generation via Equirectangular Transformer |
| _ | text2live | _ |
|---|---|---|
| | text2live | : Text-Driven Layered Image and Video Editing |
| _ | text2loc | _ |
|---|---|---|
| | text2loc | : 3D Point Cloud Localization from Natural Language |
| _ | text2mask | _ |
|---|---|---|
| SegGen: Supercharging Segmentation Models with | text2mask | and Mask2Img Synthesis |
| _ | text2mesh | _ |
|---|---|---|
| | text2mesh | : Text-Driven Neural Stylization for Meshes |
| _ | text2performer | _ |
|---|---|---|
| | text2performer | : Text-Driven Human Video Generation |
| _ | text2place | _ |
|---|---|---|
| | text2place | : Affordance-aware Text Guided Human Placement |
| _ | text2po | _ |
|---|---|---|
| | text2po | s: Text-to-Point-Cloud Cross-Modal Localization |
| _ | text2qr | _ |
|---|---|---|
| | text2qr | : Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation |
| _ | text2room | _ |
|---|---|---|
| | text2room | : Extracting Textured 3D Meshes from 2D Text-to-Image Models |
| _ | text2scene | _ |
|---|---|---|
| | text2scene | : Generating Compositional Scenes From Textual Descriptions |
| | text2scene | : Text-driven Indoor Scene Stylization with Part-Aware Details |
| _ | text2shape | _ |
|---|---|---|
| HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D | text2shape | Generation |
| | text2shape | : Generating Shapes from Natural Language by Learning Joint Embeddings |
| _ | text2sign | _ |
|---|---|---|
| | text2sign | : Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks |
| _ | text2sketch | _ |
|---|---|---|
| | text2sketch | : Learning Face Sketch from Facial Attribute Text |
| _ | text2tex | _ |
|---|---|---|
| | text2tex | : Text-driven Texture Synthesis via Diffusion Models |
| _ | text2video | _ |
|---|---|---|
| Spatial-Temporal Graphs for Cross-Modal | text2video | Retrieval |
| | text2video | -Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators |
| | text2video | : An End-to-end Learning Framework for Expressing Text With Videos |