Keith Price Bibliography KWIC Details for text

Index for text

_ text _

2D and 3D Video Scene text Classification

2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene text Recognition

360PanT: Training-Free text-Driven 360-Degree Panorama-to-Panorama Translation

3D Highlighter: Localizing Regions on 3D Shapes via text Descriptions

3D Human Motion Generation from the text Via Gesture Action Classification and the Autoregressive Model

3D-Aware text-Driven Talking Avatar Generation

3D-SceneDreamer: text-Driven 3D-Consistent Scene Generation

3D-VisTA: Pre-trained Transformer for 3D Vision and text Alignment

4D-fy: text-to-4D Generation Using Hybrid Score Distillation Sampling

A-STAR: Test-time Attention Segregation and Retention for text-to-image Synthesis

ABCNet v2: Adaptive Bezier-Curve Network for Real-Time End-to-End text Spotting

ABCNet: Real-Time Scene text Spotting With Adaptive Bezier-Curve Network

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene text Spotting

Ablating Concepts in text-to-Image Diffusion Models

Accurate Arbitrary-Shaped Scene text Detection via Iterative Polynomial Parameter Regression

Accurate Detection for Scene texts with a Cascaded CNN Networks

Accurate Scene text Detection Through Border Semantics Awareness and Bootstrapping

Accurate Scene text Detection Via Scale-Aware Data Augmentation and Shape Similarity Constraint

Accurate Scene text Recognition Based on Recurrent Neural Network

Accurate Scene text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Accurate Segmentation-Based Scene text Detector with Context Attention and Repulsive Text Border, An

Accurate text localization in images based on SVM output scores

Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped text, An

Accurate video text detection through classification of low and high contrast images

Accurate, data-efficient, unconstrained text recognition with convolutional neural networks

ACE: Anti-Editing Concept Erasure in text-to-Image Models

Acquire and then Adapt: Squeezing out text-to-Image Model for Image Restoration

ActBERT: Learning Global-Local Video-text Representations

Active Collection of Land Cover Sample Data from Geo-Tagged Web texts

Active Contours Network to Straighten Distorted text Lines

active learning approach to frequent itemset-based text clustering, An

Active Learning With Complementary Sampling for Instructing Class-Biased Multi-Label text Emotion Classification

Activity Recognition Applications from Contextual Video-Text Fusion

ActivityCLIP: Enhancing group activity recognition by mining complementary information from text to supplement image modality

Actor and Action Modular Network for text-Based Video Segmentation

AdaBoost for text Detection in Natural Scene

Adapting Style and Content for Attended text Sequence Recognition

Adapting text-to-Image Generation with Feature Difference Instruction for Generic Image Restoration

Adaptive Algorithm for text Detection from Natural Scenes, An

Adaptive Boundary Proposal Network for Arbitrary Shape text Detection

Adaptive Correlation Filtering Method for text-Based Person Search, An

Adaptive Fuzzy text Segmentation in Images with Complex Backgrounds Using Color and Texture

Adaptive fuzzy wavelet algorithm for text-independent speaker recognition

Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured text

Adaptive Latent Graph Representation Learning for Image-text Matching

Adaptive method for multi colored text binarization

Adaptive multi-text union for stable text-to-image synthesis learning

Adaptive Offline Quintuplet Loss for Image-text Matching

Adaptive Region Growing Color Segmentation for text Using Irregular Pyramid

Adaptive Scene text Detection Based on Transferring Adaboost

Adaptive scene-text binarisation on images captured by smartphones

Adaptive Script-Independent Block-Based text Line Extraction, An

Adaptive Script-Independent text Line Extraction

Adaptive text Recognition Through Visual Matching

Adding Conditional Control to text-to-Image Diffusion Models

Addressing Information Inequality for text-Based Person Search via Pedestrian-Centric Visual Denoising and Bias-Aware Alignments

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene text Detection

Advance One-Shot Multispectral Instance Detection With text's Supervision

Advancing Zero-Shot Digital Human Quality Assessment Through text-Prompted Evaluation

Adversarial and Isotropic Gradient Augmentation for Image Retrieval With text Feedback

Adversarial Attribute-text Embedding for Person Search With Natural Language Query

Adversarial learning based attentional scene text recognizer

Adversarial Representation Learning for text-to-Image Matching

Adversarial Robustification via text-to-image Diffusion Models

Adversarial Synthesis of Human Pose from text

Adversarial text to Continuous Image Generation

Adversarial Training Lattice LSTM for Named Entity Recognition of Rail Fault texts

ADVMIX: Data Augmentation for Accurate Scene text Spotting

AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

AEA-FIRM: Adaptive Elastic Alignment with Fine-Grained Representation Mining for text-Based Aerial Pedestrian Retrieval

Aesthetic text Logo Synthesis via Content-aware Layout Inferring

Affective Image Editing: Shaping Emotional Factors via text Descriptions

Affective Image Filter: Reflecting Emotions from text to Images

Agent-Based Control Prompt Tuning for Video-text Retrieval

Aggregating Image and text Quantized Correlated Components

Aggregating Local and Global text Features for Linguistic Steganalysis

Aggregating Local Context for Accurate Scene Text Detection

AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of text-to-Video Generation with LMM

AITTI: Learning Adaptive Inclusive Token for text-to-Image Generation

Aletheia: An Advanced Document Layout and text Ground-Truthing System for Production Environments

Algorithm for Colour-Based Natural Scene text Segmentation, An

Algorithm for Matching OCR-Generated text Strings, An

Algorithm for Reducing text Line Candidates of Incorrect Orientation, An

Algorithm for text page up/down orientation determination

Algorithms for compressing compound document images with large text/background overlap

ALIF: A dataset for Arabic embedded text recognition in TV broadcast

Align and Retrieve: Composition and Decomposition Learning in Image Retrieval With text Feedback

Align Your Gaussians: text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Aligning text and Document Illustrations: Towards Visually Explainable Digital Humanities

Aligning text-to-Image Diffusion Models With Constrained Reinforcement Learning

AlignIT: Enhancing Prompt Alignment in Customization of text-to-Image Models

Alignment and Generation Adapter for Efficient Video-text Understanding

Alignment of Curved text Strings for Enhanced OCR Readability

Alignment of free layout color texts for character recognition

Alignment of Paragraphs in Bilingual texts Using Bilingual Dictionaries and Dynamic Programming

All You Need Is a Second Look: Towards Arbitrary-Shaped text Detection

ALR-GAN: Adaptive Layout Refinement for text-to-Image Synthesis

alternative framework for univariate filter based feature selection for text categorization, An

ALTID : Arabic/Latin text Images Database for recognition research

AMITA: Attribute-Guided Masked Image-text Alignment for Multi-Label Image Representation

AMO Sampler: Enhancing text Rendering with Overshooting

Analysis of Features and Metrics for Alignment in text-Dependent Voice Conversion

Analysis of the Novel Transformer Module Combination for Scene text Recognition

Analytical evaluation of term weighting schemes for text categorization

anchor-free region proposal network for Faster R-CNN-based text detection approaches, An

Ancient document analysis based on text line extraction

AniClipart: Clipart Animation with text-to-Video Priors

Animatabledreamer: text-guided Non-rigid 3d Model Generation and Reconstruction with Canonical Score Distillation

AniMo: Species-Aware Model for text-Driven Animal Motion Generation

Annotated Databases for the Recognition of Screen-Rendered text

ANNP: a neural network parser for real world texts

Anonymizing Temporal Phrases in Natural Language text to be Posted on Social Networking Services

Anti-DreamBooth: Protecting users from personalized text-to-image synthesis

Anycontrol: Create Your Artwork with Versatile Control on text-to-image Generation

AnyFace++: A Unified Framework for Free-Style text-to-Face Synthesis and Manipulation

AnyFace: Free-style text-to-Face Synthesis and Manipulation

AON: Towards Arbitrarily-Oriented text Recognition

Application of autoregressive models to the study of the temporal structure of a handwritten text

Application of Cluster Detection to text and Picture Processing, An

Application of Novel Chaotic Neural Networks to text Classification Based on PCA

Application of Planar Motion Segmentation for Scene text Extraction

Apply Hierarchical-Chain-of-Generation to Complex Attributes text-to-3D Generation

Applying GIS and text Mining Methods to Twitter Data to Explore the Spatiotemporal Patterns of Topics of Interest in Kuwait

Applying the conjugate gradient method for text document categorization

approach for detecting and cleaning of struck-out handwritten text, An

approach for handwritten Chinese text recognition unifying character segmentation and recognition, An

Approach for Recognizing text Labels in Raster Maps, An

approach to extracting the target text line from a document image captured by a pen scanner, An

approach to get overall emotion from comment text towards a certain image uploaded to social network using Latent Semantic Analysis, An

Approximate String Match for Garbled text with Various Accuracies, An

Arabic character recognition system: A statistical approach for recognizing cursive typewritten text

Arabic hand-written text-line extraction

Arabic handwritten texts clusterization based on Feature Relation Graph (FRG)

Arabic ligatures: Analysis and application in text recognition

Arabic text detection in videos using neural and boosting-based approaches: Application to video indexing

Arbitrarily oriented text detection using geodesic distances between corners and skeletons

Arbitrarily Shaped Scene text Detection With a Mask Tightness Text Detector

Arbitrarily shaped scene text detection with dynamic convolution

Arbitrarily-Oriented text Detection in Low Light Natural Scene Images

Arbitrary Shape Scene text Detection With Adaptive Text Region Representation

Arbitrary Shape text Detection using Transformers

Arbitrary Shape text Detection via Boundary Transformer

Arbitrary Shape text Detection via Segmentation with Probability Maps

Arbitrary Style Guidance for Enhanced Diffusion-Based text-to-Image Generation

Arbitrary-Oriented Scene text Detection via Rotation Proposals

Arbitrary-Shape Scene text Detection via Visual-Relational Rectification and Contour Approximation

architecture for handwritten text recognition systems, An

Are 2D-LSTM really dead for offline text recognition?

Are All Combinations Equal? Combining textual and Visual Features with Multiple Space Learning for Text-based Video Retrieval

Are Digraphs Good for Free-text Keystroke Dynamics?

Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in text

ARES: text-Driven Automatic Realistic Simulator for Autonomous Traffic

ARRPNGAN: text-to-image GAN with attention regularization and region proposal networks

ArtAdapter: text-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation

ArtGlyphDiffuser: text-driven artistic glyph generation via Style-to-CLIP Projection and Multi-Level Controlled diffusion

ARTIST: Improving the Generation of text-Rich Images with Disentangled Diffusion Models and Large Language Models

Artistic Style Transfer via Fine-Grained text Guidance and Contrastive Semantics Similarity

ART•V: Auto-Regressive text-to-Video Generation with Diffusion Models

ASAYAR: A Dataset for Arabic-Latin Scene text Localization in Highway Traffic Panels

Assessing Affective Dimensions of Play in Psychodynamic Child Psychotherapy via text Analysis

Assessing Image and text Generation with Topological Analysis and Fuzzy Logic

Assessing similarity in handwritten texts

Assessing User Experience of text Readability with Eye Tracking in Virtual Reality

Assistive text Reading from Complex Background for Blind Persons

Associating text and graphics for scientific chart understanding

ASTER: An Attentional Scene text Recognizer with Flexible Rectification

ASTS: A Unified Framework for Arbitrary Shape text Spotting

Asymmetric Cross-Scale Alignment for text-Based Person Search

ATA: Adaptive Transformation Agent for text-Guided Subject-Position Variable Background Inpainting

ATM: Attentional text Matting

AToM: Aligning text-to-Motion Model at Event-Level with GPT-4Vision Reward

ATT3D: Amortized text-to-3D Object Synthesis

Attend, Correct and Focus: A Bidirectional Correct Attention Network for Image-text Matching

Attention Calibration for Disentangled text-to-Image Personalization

Attention Guidance by Cross-Domain Supervision Signals for Scene text Recognition

attention-based row-column encoder-decoder model for text recognition in Japanese historical documents, An

Attention-Bridged Modal Interaction for text-to-Image Generation

Attentionhand: text-driven Controllable Hand Image Generation for 3d Hand Reconstruction in the Wild

AttnGAN: Fine-Grained text to Image Generation with Attentional Generative Adversarial Networks

Attribute-Centric Compositional text-to-Image Generation

Attribute-Centric Cross-Modal Alignment for Weakly Supervised text-Based Person Re-ID

AttriDiffuser: Adversarially enhanced diffusion model for text-to-facial attribute image synthesis

AttT2M: text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism

Audio Visual Segmentation through text Embeddings

Audio-Enhanced text-to-Video Retrieval using Text-Conditioned Feature Alignment

Authenticating Binary text Documents Using a Localising OMAC Watermark Robust to Printing and Scanning

Automated cartographic text placement

Automated Detection of Adverse Drug Events from Older Patients' Electronic Medical Records Using text Mining

automatic algorithm for text skew estimation in document images using recursive morphological transforms, An

Automatic annotation of unique locations from video and text

Automatic Chinese text Classification Using Character-Based and Word-Based Approach

Automatic Concept Discovery from Parallel text and Visual Corpora

Automatic Detection and Localization of Natural Scene text in Video

Automatic detection and recognition of Korean text in outdoor signboard images

Automatic diacritization of Arabic text using recurrent neural networks

Automatic discrimination of text and non-text natural images

Automatic document classification using text and images

Automatic dottization of Arabic text (Rasms) using deep recurrent neural networks

Automatic extraction of correlation-entropy features for text document analysis directly in run-length compressed domain

Automatic Feature Extraction and text Recognition From Scanned Topographic Maps

Automatic identification and skew estimation of text lines in real scene images

Automatic Identification of text in Digital Video Key Frames

Automatic image-text alignment for large-scale web image indexing and retrieval

Automatic Inpainting Scheme for Video text Detection and Removal

Automatic Labeling for Scene text Database

Automatic news video segmentation and categorization based on closed-captioned text

Automatic performance evaluation for video text detection

Automatic Performance Evaluation Protocol for Video text Detection Algorithms, An

Automatic recognition of printed arabic text using neural network classifier

Automatic Recognition of Printed Farsi texts

Automatic Segmentation of Printed Persian (Farsi) text

Automatic segmentation of the IAM off-line database for handwritten English text

Automatic separation of machine-printed and hand-written text lines

Automatic text area segmentation in natural images

Automatic text Detection and Recognition

Automatic text detection and removal in video sequences

Automatic text Detection and Tracking in Digital Video

Automatic text detection for mobile augmented reality translation

Automatic text Extraction from Arabic Newspapers

Automatic text Extraction from Video for Content-Based Annotation and Retrieval

Automatic text Extraction in Digital Video Based on Motion Analysis

Automatic text Location in Images and Video Frames

Automatic text location in natural scene images

Automatic text location using cluster-based template matching

Automatic text processing

Automatic text segmentation from complex background

Automatic tracing and extraction of text-line and word segments directly in JPEG compressed document images

Automatic writer identification from text line images

Autonomous Document Cleaning: A Generative Approach to Reconstruct Strongly Corrupted Scanned texts

Autonomous text Capturing Robot Using Improved DCT Feature and Text Tracking

AutoSplice: A text-prompt Manipulated Image Dataset for Media Forensics

Autostr: Efficient Backbone Search for Scene text Recognition

Auxiliary captioning: Bridging image-text matching and image captioning

AvatarCraft: Transforming text into Neural Human Avatars with Parameterized Shape and Pose Control

AvatarStudio: High-Fidelity and Animatable 3D Avatar Creation from text

Awesome Typography: Statistics-Based text Effects Transfer

Background-Insensitive Scene text Recognition with Text Semantic Segmentation

Bag of Embedded Words learning for text retrieval

Bag of features approach for offline text-independent Chinese writer identification

Balancing Optimization Strategies and Practical Goals: An Efficient Scene text Detector

BAMG: text-based Person Re-identification via Bottlenecks Attention and Masked Graph Modeling

Baseline detection of multi-lingual unconstrained handwritten text lines

BATINeT: Background-Aware text to Image Synthesis and Manipulation Network

Bayesian Similarity Model Estimation for Approximate Recognized text Search

Bayesian Super-Resolution of text in Video with a Text-Specific Bimodal Prior

Bayesian-based method of unconstrained handwritten offline Chinese text line recognition, A

BDNet: A BERT-based dual-path network for text-to-image cross-modal person re-identification

Be Yourself: Bounded Attention for Multi-subject text-to-image Generation

Beatrix: A Self-Learning System for Off-Line Recognition of Handwritten texts

Being Comes from Not-Being: Open-Vocabulary text-to-Motion Generation with Wordless Training

Belief Mining in Persian texts Based on Deep Learning and Users' Opinions

Benchmark for Chinese-English Scene text Image Super-resolution, A

Benchmark for Controllable text-Image-to-Video Generation, A

Benchmarking Robustness to text-Guided Corruptions

better fitness measure of a text-document for a given set of keywords, A

Beyond Coarse-grained Matching in Video-text Retrieval

Beyond One and Two Tower: Cross-Modal Consensus Learning for Image-text Retrieval

Beyond text QA: Multimedia Answer Generation by Harvesting Web Information

Beyond text: Frozen Large Language Models in Visual Signal Comprehension

Beyond verbs: Understanding actions in videos with text

Beyond visual semantics: Exploring the role of scene text in image understanding

Bi-Attention enhanced representation learning for image-text matching

Bi-Directional Image-text Retrieval With Position Attention and Similarity Filtering

Bi-Directional Spatial-Semantic Attention Networks for Image-text Matching

Bi-directional Training for Composed Image Retrieval via text Prompt Learning

Bi-modal Handwritten text Corpus: Baseline Results, A

Bi-modal Handwritten text Recognition (BiHTR) ICPR 2010 Contest Report

Bi-tonal image non-text matter removal with run length and connected component analysis

Bi-VLGM: Bi-Level Class-Severity-Aware Vision-Language Graph Matching for text Guided Medical Image Segmentation

Bidirectional extraction and recognition of scene text with layout consistency

Bilevel Feature Extraction-Based text Mining for Fault Diagnosis of Railway Systems

Bilingual text Classification

Bilingual, Open World Video text Dataset and Real-Time Video Text Spotting With Contrastive Learning, A

BiLMa: Bidirectional Local-Matching for text-based Person Re-identification

Bimodal beta mixture distribution for enhanced OOD inner-differentiation in multi-class text classification

Binarization and cleanup of handwritten text from carbon copy medical form images

Binarization of low quality text using a Markov random field model

Binarization-Free Clustering Approach to Segment Curved text Lines in Historical Manuscripts, A

Binary text image compression using overlapping rectangular partitioning

Binary text image file preprocessing to account for printer dot gain

Biometric Recognition Based on Free-text Keystroke Dynamics

Biometric recognition using online uppercase handwritten text

Bipartite Graph Coarsening for text Classification Using Graph Neural Networks

BiSeR-LMA: A Bidirectional Semantic Reasoning and Large Model Enhancement Approach for text-Video Cross-Modal Retrieval

BizGen: Advancing Article-level Visual text Rendering for Infographics Generation

Blended Diffusion for text-driven Editing of Natural Images

Blending-NeRF: text-Driven Localized Editing in Neural Radiance Fields

Blind Deblurring of text Images Using a Text-Specific Hybrid Dictionary

Blind deblurring text images via Beltrami regularization

blind deconvolution model for scene text detection and recognition in video, A

Blind Source Separation Techniques for Detecting Hidden texts and Textures in Document Images

Blind text images deblurring based on a generative adversarial network

BlobGEN-Vid: Compositional text-to-Video Generation with Blob Video Representations

Block Segmentation and text Extraction in Mixed Text/Image Documents

BLSTM-based handwritten text recognition using Web resources

BLTRCNN-Based 3-D Articulatory Movement Prediction: Learning Articulatory Synchronicity From Both text and Audio Inputs

Boosting SpLSA for text Classification

Boosting text-To-Image Person Re-Identification With Generative Hard Negative

Boosting Weakly-Supervised Temporal Action Localization with text Information

Boosting-based transductive learning for text detection

Bootstrapping text Recognition from Stop Words

Bordernet: An Efficient Border-attention text Detector

BOTH2Hands: Inferring 3D Hands from Both text Prompts and Body Dynamics

Bottom-Up Scene text Detection with Markov Clustering Networks

Boundary text Spotter: Toward Arbitrary-Shaped Scene Text Spotting

Boundary-Aware Arbitrary-Shaped Scene text Detector With Learnable Embedding Network

Box It to Bind It: Unified Layout Control and Attribute Binding in text-to-Image Diffusion Models

BoxDiff: text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Breaking text-Based CAPTCHA with Sparse Convolutional Neural Networks

Breaking text-based CAPTCHAs with variable word and character orientation

Breaking The Limits of text-conditioned 3D Motion Synthesis with Elaborative Descriptions

BreakingNews: Article Annotation by Image and text Processing

Breathing Life Into Sketches Using text-to-Video Priors

Bridge-GAN: Interpretable Representation Learning for text-to-Image Synthesis

Bridging Different Language Models and Generative Vision Models for text-to-image Generation

Bridging Synthetic and Real Worlds for Pre-training Scene text Detectors

Bridging the Gap Between Audio and text Using Parallel-Attention for User-Defined Keyword Spotting

Bridging the Gap Between End-to-End and Two-Step text Spotting

Bridging Video and text: A Two-Step Polishing Transformer for Video Captioning

Bridging Video-text Retrieval with Multiple Choice Questions

BRsyn-Caps: Chinese text Classification Using Capsule Network Based on Bert and Dependency Syntax

BTS: A Bi-lingual Benchmark for text Segmentation in the Wild

Building compact recognizer with recognition rate maintained for on-line handwritten Japanese text recognition

Building text features for object image classification

BURSTS: A bottom-up approach for robust spotting of texts in scenes

ByTheWay: Boost Your text-to-Video Generation Model to Higher Quality in a Training-free Way

C-CLIP: Contrastive Image-text Encoders to Close the Descriptive-Commentative Gap

C-Net: A Compression-Based Lightweight Network for Machine-Generated text Detection

C4Synth: Cross-Caption Cycle-Consistent text-to-Image Synthesis

Cache-aided cross-modal correlation correction for unsupervised cross-domain text-based person search

CAETFN: Context Adaptively Enhanced Text-Guided Fusion Network for Multimodal Sentiment Analysis

CAMEL: CAusal Motion Enhancement Tailored for Lifting text-Driven Video Editing

Camera based degraded text recognition using grayscale feature

Camera text Recognition based on Perspective Invariants

Camera-based analysis of text and documents: a survey

CAMP: Cross-Modal Adaptive Message Passing for text-Image Retrieval

CamType: assistive text entry using gaze with an off-the-shelf webcam

Can Generative Adversarial Networks Teach Themselves text Segmentation?

Can text-to-Video Generation help Video-Language Alignment?

Canny text Detector: Fast and Robust Scene Text Localization Algorithm

Cap4Video: What Can Auxiliary Captions Do for text-Video Retrieval?

Capacity of text Marking Channel

CapsFusion: Rethinking Image-text Data at Scale

Caption text extraction for indexing purposes using a hierarchical region-based image model

Caption text recognition in video frames by MAP matching

cascade detector for text detection in natural scene images, A

Cascaded Segmentation-Detection Networks for text-Based Traffic Sign Detection

Caseg: CLIP-Based Action Segmentation with Learnable text Prompt

CAT-TPT: Class-Agnostic text-based Test-time Prompt Tuning for Vision-Language Models

CatVersion: Concatenating Embeddings for Diffusion-Based text-to-Image Personalization

Causality-Driven Explainable Multimodal Fusion With Visual-text Parallel Computing for Cloth-Changing Pedestrian Re-Identification

CBNet: A Plug-and-Play Network for Segmentation-Based Scene text Detection

CCDPlus: Towards Accurate Character to Character Distillation for text Recognition

CDistNet: Perceiving Multi-domain Character Distance for Robust text Recognition

CDtext: Scene text detector based on context-aware deformable transformer

CE-text: A context-Aware and embedded text detector in natural scene images

CelebV-text: A Large-Scale Facial Text-Video Dataset

Center text Spotter: A Novel Text Spotter for Autonomous Unmanned Vehicles

CFOR: Character-First Open-Set text Recognition via Context-Free Learning

CGNN: Caption-assisted graph neural network for image-text retrieval

Challenges in Content-Based Image Indexing of Cultural Heritage Collections: Support vector machine active learning with applications to text classification

Character Energy and Link Energy-Based text Extraction in Scene Images

Character extraction in web image for text recognition

Character feature Alignment-based scene text spotter

Character Grounding and Re-identification in Story of Videos and text Descriptions

Character Position-Aware Compression Framework for Screen text Image, A

Character Region Attention for text Spotting

Character Region Awareness for text Detection

Character Segmentation of Handwritten Bangla text by Vertex Characterization of Isothetic Covers

Character Segmenting Techniques for Handwritten text : A Survey

Character-Aware Sampling and Rectification for Scene text Recognition

Character-Level Interaction in Computer-Assisted Transcription of text Images

Character-Level Interaction in Multimodal Computer-Assisted Transcription of text Images

Character-like region verification for extracting text in scene images

Character-Position-Free On-Line Handwritten Japanese text Recognition by Two Segmentation Methods

Character-Stroke Detection for text-Localization and Extraction

Characterization and classification of semantic image-text relations

Characterness: An Indicator of text in the Wild

Chat-edit-3d: Interactive 3d Scene Editing via text Prompts

ChatGen: Automatic text-to-Image Generation From FreeStyle Chatting

ChatTraffic: text-to-Traffic Generation via Diffusion Model

Check, Locate, Rectify: A Training-Free Layout Calibration System for text-to-Image Generation

Chinese Street View text: Large-Scale Chinese Text Reading With Partially Supervised Learning

Chinese text distinction and font identification by recognizing most frequently used characters

Chinese text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning

Chinese/Kanji text and Data Processing

Choose What You Need: Disentangled Representation Learning for Scene text Recognition, Removal and Editing

Circle text Expansion as Low-Rank Textures

CiteTracker: Correlating Image and text for Visual Tracking

CKD: Cross-Task Knowledge Distillation for text-to-Image Synthesis

Class dependent feature scaling method using naive Bayes classifier for text datamining

Class-agnostic Object Counting with text-to-image Diffusion Model

Class-Aware Mask-guided feature refinement for scene text recognition

Class-Balanced text to Image Synthesis With Attentive Generative Adversarial Network

Class-dependent projection based method for text categorization

Classification Architecture Based on Connected Components for text Detection in Unconstrained Environments, A

Classification of Machine Printed and Handwritten texts Using Character Block Layout Variance

Classification of Noisy Free-text Prostate Cancer Pathology Reports Using Natural Language Processing

Classification of text Documents

Classification of text documents based on score level fusion approach

Classification with reject option in text categorisation systems

Classifying networked text data with positive and unlabeled examples

CLEval: Character-Level Evaluation for text Detection and Recognition Tasks

CLIP is Almost All You Need: Towards Parameter-Efficient Scene text Retrieval without OCR

CLIP is Also an Efficient Segmenter: A text -Driven Approach for Weakly Supervised Semantic Segmentation

CLIP-Actor: text -Driven Recommendation and Stylization for Animating Human Meshes

CLIP-Driven Fine-Grained text -Image Person Re-Identification

CLIP-Event: Connecting text and Images with Event Structures

CLIP-Forge: Towards Zero-Shot text -to-Shape Generation

CLIP-GAN: Stacking CLIPs and GAN for Efficient and Controllable text -to-Image Synthesis

CLIP-NeRF: text -and-Image Driven Manipulation of Neural Radiance Fields

CLIP2GAN: Toward Bridging text With the Latent Space of GANs

CLIP2Protect: Protecting Facial Privacy Using text -Guided Makeup via Adversarial Latent Search

Clip2Sam: Enhanced End-to-End text -to-Image Segmentation and Image Diffusion System

CLIPAG: Towards Generator-Free text -to-Image Generation

CLIPDraw++: text -to-Sketch Synthesis with Simple Primitives

CLIPstyler: Image Style Transfer with a Single text Condition

CLIPTER: Looking at the Bigger Picture in Scene text Recognition

CLIPtone: Unsupervised Learning for text -Based Image Tone Adjustment

Cloud of Line Distribution and Random Forest Based text Detection from Natural/Video Scene Images

Clustering-Based Approach to the Separation of text Strings from Mixed Text/Graphics Documents, A

CM-Net: Concentric Mask Based Arbitrary-Shaped text Detection

CMA-CLIP: Cross-Modality Attention Clip for text -Image Classification

CMFG: Cross-model Fine-grained Feature Interaction for text -video Retrieval

CMMLoc: Advancing text -to-PointCloud Localization with Cauchy-Mixture-Model Based Framework

CMPD: Using Cross Memory Network With Pair Discrimination for Image- text Retrieval

CMT-CO: Contrastive Learning with Character Movement Task for Handwritten text Recognition

CNN for text Detection, Convolutional Neural Network

CNN-based text image super-resolution tailored for OCR

CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video- text Dataset

coarse-to-fine scene text detection method based on Skeleton-cut detector and Binary-Tree-Search based rectification, A

Code-Mixing and Code-Switching on Social Media text : A Brief Survey

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image- text Retrieval

Coding with ASCII: compact, yet text -based 3D content

Cognition Transferring and Decoupling for text -Supervised Egocentric Semantic Segmentation

Cognitive Themes Emerging from Air Photo Interpretation text s Published to 1960

Cogview3: Finer and Faster text -to-image Generation via Relay Diffusion

Collaborative Vision- text Representation Optimizing for Open-vocabulary Segmentation

Color Based Image Segmentation and its Application to text Segmentation, A

Color segmentation for text extraction

Color structure recovering in strong specular text regions

Color text extraction from camera-based images: The impact of the choice of the clustering distance

Color text extraction with selective metric-based clustering

Color text image binarization based on binary texture analysis

Coloring with Words: Guiding Image Colorization Through text -Based Palette Generation

Colour text segmentation in web images based on human perception

colour text /graphics separation based on a graph representation, A

Combination of global and local con text s for text/non-text classification in heterogeneous online handwritten documents

combined Convolutional Neural Network and Dynamic Programming approach for text line normalization, A

Combined orientation and skew detection using geometric text -line modeling

Combining Deep and Ad-hoc Solutions to Localize text Lines in Ancient Arabic Document Images

Combining diverse on-line and off-line systems for handwritten text line recognition

Combining diverse systems for handwritten text line recognition

Combining HMM classifiers in a handwritten text recognition system

Combining Statistical Measures to Find Image text Regions

Combining Structure and Parameter Adaptation of HMMs for Printed text Recognition

Combining text and image information in content-based retrieval

Combining text and prosodic analysis for prominent word detection

COME: Clip-OCR and Master ObjEct for text image captioning

Comic text Detection Using Neural Network Approach

COMIM-GAN: Improved text -to-Image Generation via Condition Optimization and Mutual Information Maximization

CoMM: A Coherent Interleaved Image- text Dataset for Multimodal Understanding and Generation

Commercial Quality text : What Does it Take?

Commonsense-Guided Semantic and Relational Consistencies for Image- text Retrieval

comparative study of features for handwritten Bangla text recognition, A

Comparative Study of HMM and BLSTM Segmentation-Free Approaches for the Recognition of Handwritten text -Lines

Comparative Study to Evaluate a text -Independent Speaker Identification Engine for Arabic Speakers Using a CHMM-Based Approach, A

Comparing Data-driven and Phonetic N-gram Systems for text -Independent Speaker Verification

Comparison of Approaches for Automated text Extraction from Scholarly Figures, A

Comparison of clustering methods: A case study of text -independent speaker modeling

Comparison of some thresholding algorithms for text /background segmentation in difficult document images

Comparison of text String Similarity Algorithms for POI Name Harmonisation, A

comparison study on multiple binary-class SVM methods for unilabel text categorization, A

Compass Control: Multi Object Orientation Control for text -to-Image Generation

Compensating for the Incomplete With the Complete: An Efficient Scene text Detector

COMPGS: Unleashing 2D Compositionality for Compositional text -to-3D via Dynamically Optimizing 3D Gaussians

Complementarity-Aware Space Learning for Video- text Retrieval

complete OCR for printed Hindi text in Devanagari script, A

Complete Pyramidal Geometrical Scheme for text Based Image Description and Retrieval, A

Complying with Privacy Legislation: From Legal text to Implementation of Privacy-Aware Location-Based Services

component-tree based method for user-intention guided text extraction, A

Components Regulated Generation of Handwritten Chinese text -lines in Arbitrary Length

Composing Object Relations and Attributes for Image- text Matching

Composing text and Image for Image Retrieval - an Empirical Odyssey

Composite Script Identification and Orientation Detection for Indian text Images

Compositional coding capsule network with k-means routing for text classification

Compositional Image- text Matching and Retrieval by Grounding Entities

Compositional Learning of Image- text Query for Image Retrieval

Compositional Mixture Representations for Vision and text

comprehensive method for multilingual video text detection, localization, and extraction, A

comprehensive neural-based approach for text recognition in videos using natural language processing, A

Comprehensive regional guidance for attention map semantics in text -to-image diffusion models

comprehensive scheme for tattoo text detection, A

Comprehensive Study of Decoder-Only LLMs for text -to-Image Generation, A

comprehensive study of hybrid neural network hidden Markov model for offline handwritten Chinese text recognition, A

Comprehensive Survey of Transformers in text Recognition: Techniques, Challenges, and Future Directions, A

Computational Topology in text Mining

Computer Assisted Transcription for Ancient text Images

Computer Assisted Transcription of Handwritten text Images

Computer Assisted Transcription of text Images: Results on the GERMANA Corpus and Analysis of Improvements Needed for Practical Use

Computer Interpretation of English text and Picture Patterns

Con- text : Text Detection for Fine-Grained Object Classification

Concept decompositions for short text clustering by identifying word communities

Concept Weaver: Enabling Multi-Concept Fusion in text -to-Image Models

ConceptCraft: One-Shot Personalized text -to-Image Generation via Object-Background Disentanglement

ConceptGuard: Continual Personalized text -to-Image Generation with Forgetting and Confusion Mitigation

Concepts-Locations-Emotions: Semantic Analysis and Visualization of Climate Change text s

Conceptual 12M: Pushing Web-Scale Image- text Pre-Training To Recognize Long-Tail Visual Concepts

Conditional Feature Learning Based Transformer for text -Based Person Search

Conditional Image- text Embedding Networks

conditional random field approach for face identification in broadcast news using overlaid text , A

Conditional random field for text segmentation from images with complex background

Conditional text Image Generation with Diffusion Models

Confidence Measures for Error Correction in Interactive Transcription Handwritten text

Configurable text Stamp Identification Tool with Application of Fuzzy Logic

CONFORM: Contrast is All You Need For High-Fidelity text -to-Image Diffusion Models

Connected and Degraded text Recognition Using Hidden Markov Model

Connected Component Level Discrimination of Handwritten and Machine-Printed text Using Eigenfaces

Connecting Consistency Distillation to Score Distillation for text -to-3d Generation

Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned text corpora

Connecting NeRFs, Images, and text

Consensus-aware Visual-semantic Embedding for Image- text Matching

Consistent Partition and Labelling of text Blocks

Consistent3D: Towards Consistent High-Fidelity text -to-3D Generation with Deterministic Sampling Prior

Constructing the Discriminative Kernels Using GMM for text -Independent Speaker Identification

Content and Style Aware Generation of text -Line Images for Handwriting Recognition

Content Based Image and Video Retrieval Using Embedded text

Content-based image retrieval with pachinko allocation model and a combination of colour, text ure and text features

Content-Based Query of Image Databases, Inspirations from text Retrieval: Inverted Files, Frequency-based Weights and Relevance Feedback

Content-based query of image databases: Inspirations from text retrieval

Con text Driven Text Segmentation and Recognition

Con text Perception Parallel Decoder for Scene Text Recognition

Con text Supplied by Text or Language

Con text -Aware Attention Network for Image-Text Retrieval

Con text -Aware Hierarchical Transformer for Fine-Grained Video-Text Retrieval

Con text -aware relation enhancement and similarity reasoning for image-text retrieval

Con text -Aware Text-Based Binary Image Stylization and Synthesis

Con text -based text detection in natural scenes

Con text -CIR: Learning from Concepts in Text for Composed Image Retrieval

Con text 2Rec: Leveraging comment text semantics and sequential features for enhanced recommendation systems

Con text ual Text Block Detection Towards Scene Text Understanding

Con text ual text/non-text stroke classification in online handwritten notes with conditional random fields

Continual Learning for Cross-Modal Image- text Retrieval Based on Domain-Selective Attention

Continuous approach to segmentation of handwritten text

Contour Restoration of text Components for Recognition in Video/Scene Images

contour-based approach to 3D text labeling on triangulated surfaces, A

Contour-Based Robust Algorithm for text Detection in Color Images, A

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene text Detection

Contra: (con) text (tra)nsformer for Cross-modal Video Retrieval

Contrastive author-aware text clustering

Contrastive Denoising Score for text -Guided Latent Diffusion Image Editing

Contrastive Transformer Learning With Proximity Data Generation for text -Based Person Search

Contribution of recurrent connectionist language models in improving LSTM-based Arabic text recognition in videos

Contribution to the Discrimination of the Medieval Manuscript text s: Application in the Palaeography

Control4D: Efficient 4D Portrait Editing With text

Controllable Artistic text Style Transfer via Shape-Matching GAN

Controllable Multi-Lingual Multi-Speaker Multi-Style text -to-Speech Synthesis With Multivariate Information Minimization, A

Controllable text -to-3D Generation via Surface-Aligned Gaussian Splatting

Controllable text -to-Image Synthesis for Multi-Modality MR Images

Controllable Video Generation With text -Based Instructions

Controlling Human Shape and Pose in text -to-Image Diffusion Models via Domain Adaptation

Controlnet-xs: Rethinking the Control of text -to-image Diffusion Models as Feedback-control Systems

Convolutional Neural Network Based text Steganalysis

Convolutional Neural Network-Based Chinese text Detection Algorithm via Text Structure Modeling, A

Convolutional Neural Networks for Direct text Deblurring

Convolutional Recurrent Neural Network for the Handwritten text Recognition of Historical Greek Manuscripts, A

Convolutional recurrent neural networks with hidden Markov model bootstrap for scene text recognition

COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated text s

CookGAN: Causality Based text -to-Image Synthesis

Cooperative Game Modeling With Weighted Token-Level Alignment for Audio- text Retrieval

COPT: Unsupervised Domain Adaptive Segmentation Using Domain-agnostic text Embeddings

Copyright protection for the electronic distribution of text documents

Corpus-based HIT-MW database for offline recognition of general-purpose Chinese handwritten text

Correcting document image warping based on regression of curved text lines

Correlated Topic Modeling for Short text s in Spherical Embedding Spaces

CoSER: Towards Consistent Dense Multiview text -To-Image Generator for 3D Creation

CosmicMan: A text -to-Image Foundation Model for Humans

CoSMo: Content-Style Modulation for Image Retrieval with text Feedback

Cost-Effective Adversarial Attacks against Scene text Recognition

Could scene con text be beneficial for scene text detection?

Countering Personalized text -to-Image Generation with Influence Watermarks

Counting Guidance for High Fidelity text -to-Image Synthesis

Coupled Snakelets for Curled text -Line Segmentation from Warped Document Images

Coverless Image Steganography Based on Semantic-Controlled text -to-Image Generation

Cpgan: Content-parsing Generative Adversarial Networks for text -to-image Synthesis

Cps-STS: Bridging the Gap Between Content and Position for Coarse-Point-Supervised Scene text Spotter

Create Your World: Lifelong text -to-Image Diffusion

Creating generic text summaries

Creation and Analysis of a Corpus of text Rich Indian TV Videos

CRF Based Scheme for Overlapping Multi-colored text Graphics Separation, A

Crime Prediction and Monitoring in Porto, Portugal, Using Machine Learning, Spatial and text Analytics

CrisisHateMM: Multimodal Analysis of Directed and Undirected Hate Speech in text -Embedded Images from Russia-Ukraine Conflict

Cross Initialization for Face Personalization of text -to-Image Models

Cross-Dataset Study for text -based 3D Human Motion Retrieval, A

Cross-Domain detection of AI-Generated text : Integrating linguistic richness and lexical pair dispersion via deep learning

Cross-Domain Multi-Modal Few-Shot Object Detection via Rich text

Cross-Lingual text Image Recognition via Multi-Hierarchy Cross-Modal Mimic

Cross-Lingual text Image Recognition via Multi-Task Sequence to Sequence Learning

Cross-Modal Adaptive Dual Association for text -to-Image Person Retrieval

Cross-Modal and Hierarchical Modeling of Video and text

Cross-Modal Contrastive Learning for text -to-Image Generation

Cross-modal domain adaptation for text -based regularization of image semantics in image retrieval systems

Cross-Modal Dynamic Networks for Video Moment Retrieval With text Query

Cross-Modal Feature Fusion-Based Knowledge Transfer for text -Based Person Search

Cross-modal feature learning and alignment network for text -image person re-identification

Cross-Modal Implicit Relation Reasoning and Aligning for text -to-Image Person Retrieval

Cross-modal independent matching network for image- text retrieval

Cross-modal knowledge learning with scene text for fine-grained image classification

Cross-Modal Person Search: A Coarse-to-Fine Framework using Bi-Directional text -Image Matching

Cross-Modal Progressive Perspective Matching Network for Remote Sensing Image- text Retrieval

Cross-modal Scene Graph Matching for Relationship-aware Image- text Retrieval

Cross-Modal Semantic Matching Generative Adversarial Networks for text -to-Image Synthesis

Cross-Modal text Steganography Against Synonym Substitution-Based Text Attack

Cross-Modal Uncertainty Modeling With Diffusion-Based Refinement for text -Based Person Retrieval

Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and text Guidance

Crossing the lines: making optimal use of con text in line-based Handwritten Text Recognition

Crossmodal Translation Based Meta Weight Adaption for Robust Image- text Sentiment Analysis

Crypto-stego System for Securing text and Image Data

CSA: Cross-scale alignment with adaptive semantic aggregation and filter for image- text retrieval

CT-GAN: A conditional Generative Adversarial Network of transformer architecture for text -to-image

CT-Net: Arbitrary-Shaped text Detection via Contour Transformer

CTIGEN-CDM: Controlled text -to-Image Generation Using Cropped Diffusion Models

Ctrl-Room: Controllable text -to-3D Room Meshes Generation with Layout Constraints

Curriculum learning for printed text line recognition of ligature-based scripts

Cursive Script, Historical Documents, text Line Segmentation, Script Line, Segmentation, Text Line Extraction

Cursive stroke sequencing for handwritten text documents recognition

Curved scene text detection via transverse and longitudinal sequence connection

Customization Assistant for text -to-image Generation

Customize-a-video: One-shot Motion Customization of text -to-video Diffusion Models

Customizing 360-Degree Panoramas through text -to-Image Diffusion Models

CustomListener: text -Guided Responsive Interaction for User-Friendly Listening Head Generation

CycleMatch: A cycle-consistent embedding network for image- text matching

DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for text -to-Image Generation

DAE-GAN: Dynamic Aspect-aware GAN for text -to-Image Synthesis

DALL-EVAL: Probing the Reasoning Skills and Social Biases of text -to-Image Generation Models

DART: Disease-aware Image- text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation

Data Augmentation for Scene text Recognition

data base for arabic handwritten text recognition research, A

Data Embedding in text for a Copier System

Data-Hiding Capacity Improvement for text Watermarking Using Space Coding Method

Database for Arabic Handwritten text Image Recognition and Writer Identification, A

Database for Arabic Printed text Recognition Research

Database for Handwritten text Recognition Research, A

Database for Offline Arabic Handwritten text Recognition, A

dataset for Arabic text detection, tracking and recognition in news videos- AcTiV, A

Dataset to Support Sexist Content Detection in Arabic text , A

Datenerf: Depth-aware text -based Editing of Nerfs

DATID-3D: Diversity-Preserved Domain Adaptation Using text -to-Image Diffusion for 3D Generative Model

DCDM: Diffusion-conditioned-diffusion Model for Scene text Image Super-resolution

De-Diffusion Makes text a Strong Cross-Modal Interface

De-rendering Stylized text s

Debiased Video- text Retrieval via Soft Positive Sample Calibration

Debiasing Framework For Attribute Binding In Diffusion-Based text -To-Image Generation, A

Deblurring text Images via L0-Regularized Intensity and Gradient Prior

Deep Automated text Scoring Model Based on Memory Network

Deep Belief Networks Based Toponym Recognition for Chinese text

Deep BLSTM neural networks for unconstrained continuous handwritten text recognition

Deep Boosting Learning: A Brand-New Cooperative Approach for Image- text Matching

Deep Convolutional Deblurring and Detection Neural Network for Localizing text in Videos, A

Deep correlation for matching images and text

Deep Cross-Modal Projection Learning for Image- text Matching

Deep Direct Regression for Multi-oriented Scene text Detection

Deep feature extraction with tri-channel text ual feature map for text classification

Deep Features for text Spotting

Deep Geometric Moments Promote Shape Consistency in text -to-3D Generation

Deep image compression using scene text quality assessment

Deep learning and recurrent connectionist-based approaches for Arabic text recognition in videos

deep learning approach to handwritten text recognition in the presence of struck-out text, A

Deep Learning for Image-to- text Generation: A Technical Overview

Deep Learning in the Domain of Multi-Document text Summarization

Deep Matching Prior Network: Toward Tighter Multi-oriented text Detection

Deep Multi-Scale Con text Aware Feature Aggregation for Curved Scene Text Detection

Deep Neural Network Based 3D Articulatory Movement Prediction Using Both text and Audio Inputs

Deep neural network based hidden Markov model for offline handwritten Chinese text recognition

Deep Neural Network with Attention Model for Scene text Recognition

Deep Relational Reasoning Graph Network for Arbitrary Shape text Detection

Deep Reward Supervisions for Tuning text -to-image Diffusion Models

Deep text Spotter: An End-to-End Trainable Scene Text Localization and Recognition Framework

DeepErase: Weakly Supervised Ink Artifact Removal in Document text Images

DeepEraser: Deep Iterative Con text Mining for Generic Text Eraser

DeepSolo: Let Transformer Decoder with Explicit Points Solo for text Spotting

DeepWriterID: An End-to-End Online text -Independent Writer Identification System

Deformable scene text detection using harmonic features and modified pixel aggregation network

Deformation Robust text Spotting with Geometric Prior

Deformation-Invariant Networks for Handwritten text Recognition

Degraded Gray-Scale text Recognition Using Pseudo-2D Hidden Markov-Models and N-Best Hypotheses

Delaunay triangulation based text detection from multi-view images of natural scene

DeltaEdit: Exploring text -free Training for Text-Driven Image Manipulation

Dense Chained Attention Network for Scene text Recognition

Dense prediction for text line segmentation in handwritten document images

Dense text -to-Image Generation with Attention Modulation

density-based approach for text extraction in images, A

Dependability Feature Learning Based on Sample Generation for Unsupervised text -to-Image Person Re-Identification

Dependence Models for Searching text in Document Images

Deriving a Priori Co-occurrence Probability Estimates for Object Recognition from Social Networks and text Processing

Deriving Symbol Dependent Edit Weights for text Correction: The Use of Error Dictionaries

Design and Evaluation of Features That Best Define text in Complex Scene Images

Design and Preliminary Evaluation of a Finger-Mounted Camera and Feedback System to Enable Reading of Printed text for the Blind, The

DesignDiffusion: High-Quality text -to-Design Image Generation with Diffusion Models

Detect Arbitrary-Shaped text via Adaptive Thresholding and Localization Quality Estimation

Detect Visual Spoofing in Unicode-Based text

Detect-and-Guide: Self-regulation of Diffusion Models for Safe text -to-Image Generation via Guideline Token Optimization

Detected text -Based Image Retrieval Approach for Textual Images

Detecting and reading text in natural scenes

Detecting Arbitrarily Oriented text Labels in Early Maps

Detecting dense text in natural images

Detecting Misspelled Words in Turkish text Using Syllable n-gram Frequencies

Detecting moving text in video using temporal information

Detecting natural scenes text via auto image partition, two-stage grouping and two-layer classification

Detecting Oriented text in Natural Images by Linking Segments

Detecting Origin Attribution for text -to-Image Diffusion Models

Detecting Signs of Depression Using Social Media text s Through an Ensemble of Ensemble Classifiers

Detecting Tampered Scene text in the Wild

Detecting text Areas and Decorative Elements in Ancient Manuscripts

Detecting text in Natural Image with Connectionist Text Proposal Network

Detecting text in Natural Scenes Based on a Reduction of Photometric Effects: Problem of Color Invariance

Detecting text in Natural Scenes Based on a Reduction of Photometric Effects: Problem of Text Detection

Detecting text in natural scenes with stroke width transform

Detecting text in Scene and Traffic Guide Panels With Attention Anchor Mechanism

Detecting text in the Wild with Deep Character Embedding Network

Detecting text Lines in Handwritten Documents

Detecting text s of arbitrary orientations in natural images

Detecting Traffic Information From Social Media text s With Deep Learning Approaches

Detecting Video text s Using Spatial-Temporal Wavelet Transform

Detection and Interpretation of text Information in Noisy Video Sequences

Detection and Location of Multicharacter Sequences in Lines of Imaged text

Detection and rectification of arbitrary shaped scene text s by using text keypoints and links

Detection and Segmentation of Antialiased text in Screen Images

Detection Approaches for Table Semantics in text

Detection of Curved text in Video: Quad Tree Based Method

Detection of curved text path based on the fuzzy curve-tracing (FCT) algorithm

Detection of Data Hiding in Binary text Images

Detection of text marks on moving vehicles

Detection of text on road signs from video

Detection of text Region and Segmentation from Natural Scene Images

Detection of text regions from digital engineering drawings

Determining Number of Clusters Using Firefly Algorithm with Cluster Merging for text Clustering

Deterministic Turing Machine for Con text Sensitive Translation of Braille Codes to Urdu Text, A

Devanagari and Bangla text Extraction from Natural Scene Images

Devanagari text Recognition: A Transcription Based Formulation

Development and Evaluation of text Localization Techniques Based on Structural Texture Features and Neural Classifiers

Development of a Robust and Compact On-Line Handwritten Japanese text Recognizer for Hand-Held Devices

Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for text -To-Video Generation, The

DF-GAN: A Simple and Effective Baseline for text -to-Image Synthesis

diabetic retinopathy classification method based on image- text contrastive learning, A

Dial: Dense Image- text Alignment for Weakly Supervised Semantic Segmentation

DiCTI: Diffusion-based Clothing Designer via text -guided Input

Dictionary design for text image compression with JBIG2

Dictionary-guided Scene text Recognition

Diff-tracker: text -to-image Diffusion Models are Unsupervised Trackers

DiffAgent: Fast and Accurate text -to-Image API Selection with Large Language Model

DiffBoost: Enhancing Medical Image Segmentation via text -Guided Diffusion Model

Different Approaches to Bilingual text Classification Based on Grammatical Inference Techniques

Differentiable Duration Refinement Using Internal Division for Non-Autoregressive text -to-Speech

Differential-Processing Extraction Approach to text and Image Segmentation, A

Differentiation of alphabets in handwritten text s

Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between text and Vision for Zero-Shot Image Captioning

Diffusion for Description or text to Image Generation

Diffusion for Layout Control in text to Image Generation

Diffusion in the Dark: A Diffusion Model for Low-Light text Recognition

Diffusion Models in 3D Synthesis, text to 3D Models

Diffusion Soup: Model Merging for text -to-image Diffusion Models

Diffusion-based Blind text Image Super-Resolution

Diffusion-Enhanced Test-Time Adaptation with text and Image Augmentation

Diffusion-SDF: text -to-Shape via Voxelized Diffusion

DiffusionCLIP: text -Guided Diffusion Models for Robust Image Manipulation

DiffusionGAN3D: Boosting text -guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

Diffusionpen: Towards Controlling the Style of Handwritten text Generation

DiffusionRet: Generative text -Video Retrieval with Diffusion Model

DiffusionSTR: Diffusion Model for Scene text Recognition

Digital image analysis to enhance underwritten text in the Archimedes palimpsest

Digital Ink Recognition Server for Handwritten Japanese text , A

DINOv2 Meets text : A Unified Framework for Image- and Pixel-Level Vision-Language Alignment

Diphone Spanish text -to-speech synthesizer

Direct Regression Scene text Detector With Position-Sensitive Segmentation, A

Direct text to Speech Translation System Using Acoustic Units

Direct Unsupervised text Line Extraction from Colored Historical Manuscript Images Using DCT

DIRECT-3D: Learning Direct text -to-3D Generation on Massive Noisy 3D Data

Direct2.5: Diverse text -to-3D Generation via Multi-view 2.5D Diffusion

Discovering Low-Rank Shared Concept Space for Adapting text Mining Models

Discovering meaningful multimedia patterns with audio-visual concepts and associated text

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video- text Retrieval

Discrete Joint Semantic Alignment Hashing for Cross-Modal Image- text Search

Discrete-continuous Action Space Policy Gradient-based Attention for Image- text Matching

Discrimination of machine-printed from handwritten text using simple structural characteristics

Discriminative Class Tokens for text -to-Image Diffusion Models

Discriminative Kernel-Based Approach to Rank Images from text Queries, A

Discriminative Model for On-line Handwritten Japanese text Retrieval, A

Discriminative Probing and Tuning for text -to-Image Generation

discriminative semi-Markov model for robust scene text recognition, A

Disease-Specific Extraction of text from Cardiac Echo Videos for Decision Support

DisenDreamer: Subject-Driven text -to-Image Generation With Sample-Aware Disentangled Tuning

Disentangled Clothed Avatar Generation from text Descriptions

Disentangled Contour Learning for Quadrilateral text Detection

Disentangling Inter- and Intra-Video Relations for Multi-Event Video- text Retrieval and Grounding

Disentangling Subject-Irrelevant Elements in Personalized text -to-Image Diffusion via Filtered Self-Distillation

Dissecting Deep Metric Learning Losses for Image- text Retrieval

Distilling Knowledge of Bidirectional Language Model for Scene text Recognition

Distinction between handwritten and machine-printed text based on the bag of visual words model

Distinguishing between Handwritten and Machine Printed text in Bank Cheque Images

Distinguishing mathematics notation from English text using computational geometry

Distinguishing text /Non-Text Natural Images with Multi-Dimensional Recurrent Neural Networks

Distributional semantics of objects in visual scenes in comparison to text

Diverse text -to-3d Synthesis with Augmented Text Embedding

Diversified text -to-image generation via deep mutual information estimation

DiZNet: An end-to-end text detection and recognition algorithm with detail in text zone

DM-GAN: Dynamic Memory Generative Adversarial Networks for text -To-Image Synthesis

DM-PCL: text -Driven Dual-Modal Prototype Consistency Learning for Weakly-Supervised Few-Shot Part Segmentation

DMF-GAN: Deep Multimodal Fusion Generative Adversarial Networks for text -to-Image Synthesis

Do text -free Diffusion Models Learn Discriminative Visual Representations?

DOC: text Recognition via Dual Adaptation and Clustering

DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for text -rich Document Understanding

Document Analysis System Based on text Line Matching of Multiple OCR Outputs, A

Document dewarping via text -line based optimization

Document filtering for fast approximate string matching of erroneous text

Document Image De-warping Based on Detection of Distorted text Lines

Document Image Dewarping using Robust Estimation of Curled text Lines

Document image ground truth generation from electronic text

Document Mining Based on Semantic Understanding of text

Document Rectification Approach Dealing with Both Perspective Distortion and Warping Based on text Flow Curve Fitting, A

Document segmentation and classification into musical scores and text

Document skew detection/control system for printed document images containing a mixture of pure text lines and non-text portions

Does text attract attention on e-commerce images: A novel saliency prediction dataset and method

Domain adaptive multigranularity proposal network for text detection under extreme traffic scenes

Domain Generalization in CLIP via Learning with Diverse text Prompts

Domain-Complementary Prior With Fine-Grained Feedback for Scene text Image Super-Resolution

Don't Forget Me: Accurate Background Recovery for text Removal via Modeling Local-Global Context

Dot text Detection Based on FAST Points

Double supervision for scene text detection and recognition based on BMINet

Doubly Abductive Counterfactual Inference for text -Based Image Editing

Downtown Osaka Scene text Dataset

DragText: Rethinking Text Embedding in Point-Based Image Editing

Dream-in-Style: text -to-3D Generation Using Stylized Score Distillation

Dream3D: Zero-Shot text -to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

DreamArtist: Controllable One-Shot text -to-Image Generation via Positive-Negative Adapter

DreamAvatar: text -and-Shape Guided 3D Human Avatar Generation via Diffusion Models

DreamBlend: Advancing Personalized Fine-Tuning of text -to-Image Diffusion Models

DreamBooth3D: Subject-Driven text -to-3D Generation

DreamBooth: Fine Tuning text -to-Image Diffusion Models for Subject-Driven Generation

DreamControl: Control-Based text -to-3D Generation with 3D Self-Prior

Dreamdissector: Learning Disentangled text -to-3d Generation from 2d Diffusion Priors

Dreamdrone: text -to-image Diffusion Models Are Zero-shot Perpetual View Generators

DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent text -to-Image Personalization

Dreammesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3d Generation

DreamPropeller: Supercharge text -to-3D Generation with Parallel Sampling

Dreamreward: text -to-3d Generation with Human Preference

Dreamscene360: Unconstrained text -to-3d Scene Generation with Panoramic Gaussian Splatting

Dreamscene: 3d Gaussian-based text -to-3d Scene Generation via Formation Pattern Sampling

DreamStone: Image as a Stepping Stone for text -Guided 3D Shape Generation

DreamText: High Fidelity Scene Text Synthesis

Dreamview: Injecting View-specific text Guidance Into Text-to-3d Generation

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable text -to-Image Diffusion Generation

DSTA: Reinforcing Vision-Language Understanding for Scene- text VQA With Dual-Stream Training Approach

DSText V2: A comprehensive video text spotting dataset for dense and small text

DTLLM-VLT: Diverse text Generation for Visual Language Tracking Based on LLM

DU-Net: A Dual U-Net for semantic text -guided style transfer

Dual Adversarial Inference for text -to-Image Synthesis

Dual Alignment Unsupervised Domain Adaptation for Video- text Retrieval

dual branch graphic text detection network based on progressive Domain adaptation, A

Dual Encoding for Video Retrieval by text

Dual Relation Network for Scene text Recognition

Dual Stream Relation Learning Network for Image- text Retrieval

Dual-branch scale disentanglement for text -video retrieval

Dual-Level Representation Enhancement on Characteristic and Context for Image-Text Retrieval

Dual-path CNN with Max Gated block for text -based person re-identification

Dual-Path Rare Content Enhancement Network for Image and text Matching

DUET: Detection Utilizing Enhancement for text in Scanned or Captured Documents

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations Without text Alignment

DVHMM: variable length text recognition error model

Dynamic Attention Analysis for Backdoor Detection in text -to-Image Diffusion Models

Dynamic Contrastive Distillation for Image- text Retrieval

Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End text Spotting

Dynamic Prompt Optimizing for text -to-Image Generation

Dynamic receptive field adaptation for scene text recognition

Dynamic recognition in the omni-writer frame: Application to hand-printed text recognition

Dynamic semantic prototype perception for text -video retrieval

Dynamic sparse and weight allocation-based text -driven person retrieval

Dynamic text Line Segmentation for Real-Time Recognition of Chinese Handwritten Sentences

Dynamic Texts From UAV Perspective Natural Images

Dynamic Unilateral Dual Learning for text to Image Synthesis

Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking for Image- text Retrieval

Dynamic Word Based text Compression

DynTypo: Example-Based Dynamic text Effects Transfer

Dysen-VDM: Empowering Dynamics-Aware text -to-Video Diffusion with LLMs

E.T. the Exceptional Trajectories: text -to-Camera-Trajectory Generation with Character Awareness

E2VTS: Energy-Efficient Video text Spotting from Unmanned Aerial Vehicles

E4C: Enhance Editability for text -Based Image Editing by Harnessing Efficient CLIP Guidance

EA-VTR: Event-aware Video- text Retrieval

Eaformer: Scene text Segmentation with Edge-aware Transformers

Early feature stream integration versus decision level combination in a multiple classifier system for text line recognition

Earthquake Information Extraction and Comparison from Different Sources Based on Web text

EAST: An Efficient and Accurate Scene text Detector

ECLIPSE: A Resource-Efficient text -to-Image Prior for Image Generations

EDA: Explicit text -Decoupling and Dense Alignment for 3D Visual Grounding

Edge Approximation text Detector

Edge Based Binarization for Video text Images

Edge guided and Fourier attention-based Dual Interaction Network for scene text erasing

Edge-Based Features for Localization of Artificial Urdu text in Video Images

Edge-based method for text detection from complex document images

Edge-based text localization and character segmentation algorithms for automatic slab information recognition

EdgeRelight360: text -Conditioned 360-Degree HDR Image Generation for Real-Time On-Device Video Portrait Relighting

Edit Probability for Scene text Recognition

Editing Implicit Assumptions in text -to-Image Diffusion Models

Educational video understanding: Mapping handwritten text to textbook chapters

EESSO: Exploiting Extreme and Smooth Signals via Omni-frequency learning for text -based Person Retrieval

Effect of Improved Path Evaluation for On-line Handwritten Japanese text Recognition

Effective 3D text Recurrent Voting Generator for Metaverse, An

Effective and efficient video text extraction using key text points

Effective feature descriptor-based new framework for off-line text -independent writer identification

effective method for text line segmentation in historical document images, An

effective sentence-extraction technique using contextual information and statistical approaches for text summarization, An

Effective shrinkage of large multi-class linear SVM models for text categorization

Effective text localization in natural scene images with MSER, geometry-based grouping and AdaBoost

Effective Uyghur Language text Detection in Complex Background Images for Traffic Prompt Identification

Effective video text detection using line features

Effectively localize text in natural scene images

effectiveness of T5, GPT-2, and BERT on text -to-image generation task, The

Efficiency investigation of manifold matching for text document classification

Efficient Algorithm for Segmenting Warped text -Lines in Document Images, An

Efficient and Accurate Arbitrary-Shaped text Detection With Pixel Aggregation Network

Efficient and flexible text extraction from document pages

Efficient Automatic text Location Method and Content-Based Indexing and Structuring of Video Database

Efficient Character Skew Rectification in Scene text Images

Efficient Exploration of Image Classifier Failures with Bayesian Optimization and text -to-Image Models

Efficient Exploration of text Regions in Natural Scene Images Using Adaptive Image Sampling

Efficient graph-based dictionary search and its application to text -image searching

Efficient Image- text Retrieval via Keyword-Guided Pre-Screening

Efficient indexing for Query By String text retrieval

Efficient Industrial System for Vehicle Tyre (Tire) Detection and text Recognition Using Deep Learning, An

Efficient label-free pruning and retraining for text -VQA Transformers

Efficient Light Balancing Techniques for text Images in Video Presentation Systems

Efficient Method for Offline text Independent Writer Identification, An

Efficient Method for text Detection in Video Based on Stroke Width Similarity, An

Efficient Multimodal Aggregation Network for Video- text Retrieval, An

Efficient Scene text localization and recognition with local character refinement

Efficient side information encoding for text hardcopy documents

Efficient System for Hazy Scene text Detection using a Deep CNN and Patch-NMS, An

Efficient text analyser with prosody generator-driven approach for Mandarin text-to-speech

Efficient text Capture Method for Moving Robots Using DCT Feature and Text Tracking, An

Efficient text Classification Using Tree-structured Multi-linear Principal Component Analysis

Efficient text independent speaker recognition with wavelet feature selection based multilayered neural network using supervised learning algorithm

Efficient text localization in born-digital images by local contrast-based segmentation

Efficient text Segmentation Technique Based on Naive Bayes Classifier, An

Efficient text -based Person Search via Single-stage Identity-guided Attribute Parsing and Alignment

Efficient text -Guided 3D-Aware Generation With Score Distillation on 3D Distribution

Efficient text -to-Image Generation: An Adaptive Step Schedule Controller for Diffusion Models

Efficient Token-Guided Image- text Retrieval With Consistent Multimodal Contrastive Training

Efficient Transcript Mapping to Ease the Creation of Document Image Segmentation Ground Truth with text -Image Alignment

Efficient video text detection using edge features

Efficient video text recognition using multiple frame integration

Efficient Visual Search of Videos Cast as text Retrieval

EGO-LM: An efficient, generic, and out-of-the-box language model for handwritten text recognition

EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free text -to-Video Generation

Eigenspace method for text retrieval in historical document images

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation

Elucidating Optimal Reward-Diversity Tradeoffs in text -to-Image Diffusion Models

Embedded Application for Degraded text Recognition, An

Embedded Bernoulli Mixture HMMs for Continuous Handwritten text Recognition

embedded method: Improve the relevance of text and face image with enhanced face attributes, An

Embil: An English-manipuri Bi-lingual Benchmark for Scene text Detection and Language Identification

Emergent Visual-semantic Hierarchies in Image- text Representations

EmoGen: Emotional Image Content Generation with text -to-Image Diffusion Models

EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media text

EmoSphere++: Emotion-Controllable Zero-Shot text -to-Speech Via Emotion-Adaptive Spherical Vector

Emotion Correlation Mining Through Deep Learning Models on Natural Language text

Emotion Recognition in text for 3-D Facial Expression Rendering

EmotionAlBERTo: Emotion Recognition of Italian Social Media Texts Through BERT

Empathy Detection From text , Audiovisual, Audio or Physiological Signals: A Systematic Review of Task Formulations and Machine Learning Methods

Empirical Study and Analysis of text -to-image Generation Using Large Language Model-powered Textual Representation, An

Empirical Study of Scaling Law for Scene text Recognition, An

EMU: Effective Multi-Hot Encoding Net for Lightweight Scene text Recognition With a Large Character Set

Encapsulated Composition of text -to-Image and Text-to-Video Models for High-Quality Video Synthesis

Encoding Video Narration as text

End-to-End Approach for Handwriting Recognition: From Handwritten text Lines to Complete Pages, An

End-to-End Handwritten Paragraph text Recognition Using a Vertical Attention Network

End-to-end interactive joint model: Clause-phrase multi-task learning for suicidal ideation cause extraction (SICE) in Chinese Weibo text

end-to-end model for multi-view scene text recognition, An

End-to-end OCR text Re-organization Sequence Learning for Rich-text Detail Image Comprehension, An

End-to-End page-Level assessment of handwritten text recognition

End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for text -Video Retrieval

End-to-end scene text recognition

End-to-end scene text recognition using tree-structured models

End-to-end text recognition with convolutional neural networks

End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene text Recognition, An

End-to-End Video text Detection with Online Tracking

End-to-End Video text Spotting with Transformer

Enforcing similarity constraints with integer programming for better scene text recognition

Enhanced Active Contour Method for Locating text

Enhanced Feature Extraction Framework for Cross-Modal Image- text Retrieval, An

Enhanced Generative Structure Prior for Chinese text Image Super-Resolution

Enhanced Motion- text Alignment for Image-to-Video Transfer Learning

Enhanced Network Embedding with text Information

Enhanced Probabilistic Neural Network Approach Applied to text Classification, An

Enhanced Semantic Similarity Learning Framework for Image- text Matching

Enhanced text Extraction from Arabic Degraded Document Images Using EM Algorithm

Enhancement and feature extraction for images of incised and ink texts

Enhancement of camera captured text images with specular reflection

Enhancement of text images using a context based nonlinear interpolative vector quantization method

Enhancing 3D Fidelity of text -to-3D using Cross-View Correspondences

Enhancing Diffusion Models with text -encoder Reinforcement Learning

Enhancing energy minimization framework for scene text recognition with top-down cues

Enhancing fine-detail image synthesis from text descriptions by text aggregation and connection fusion module

Enhancing Handwritten text Recognition with N-gram sequence decomposition and Multitask Learning

Enhancing knowledge distillation for semantic segmentation through text -assisted modular plugins

Enhancing Micro Gesture Recognition for Emotion Understanding via Context-Aware Visual-Text Contrastive Learning

Enhancing Scene text Detection via Fused Semantic Segmentation Network with Attention

Enhancing Scene text Detectors with Realistic Text Image Synthesis Using Diffusion Models

Enhancing Semantic Fidelity in text -to-image Synthesis: Attention Regulation in Diffusion Models

Enhancing Tampered text Detection Through Frequency Feature Fusion and Decomposition

Enhancing text -Based Person Retrieval by Combining Fused Representation and Reciprocal Learning With Adaptive Loss Refinement

Enhancing text -like edges in digital images

Enhancing text -Video Retrieval Performance With Low-Salient but Discriminative Objects

Enhancing the Video Editing Capabilities of text -to-Video Generators Using DDPM Inversion

Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided text Prompts

Enriching Video Captions With Contextual Text

Ensemble Methods to Improve the Performance of an English Handwritten text Line Recognizer

Episodic Learning Network for text Detection on Human Bodies in Sports Images, An

ER-Chat: A text -to-Text Open-Domain Dialogue Framework for Emotion Regulation

EraseNet: End-to-End text Removal in the Wild

Erasing Scene text with Weak Supervision

ERNIE-ViLG 2.0: Improving text -to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts

Erp-Aware text -To-360 Panorama Diffusion Model

ESA: External Space Attention Aggregation for Image- text Retrieval

Escaping Plato's Cave: Towards the Alignment of 3D and text Latent Spaces

ESIR: End-To-End Scene text Recognition via Iterative Image Rectification

Estate: Expert-Guided State text Enhancement for Zero-Shot Industrial Anomaly Detection

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

Estimating the Orientation and Recovery of text Planes in a Single Image

Estimating the readability of handwritten text : A Support Vector Regression based approach

Estimating the Semantics via Sector Embedding for Image- text Retrieval

Estimating urban flooding depth by integrating multimodal image- text data: A segment-level direct preference optimization-based multimodal large language model

Estimation of Skew Angle in text -Image Analysis by SLIDE: Subspace-Based Line Detection

Evaluating a Hidden Markov Model of Syntax in a text Recognition System

Evaluating and Improving Compositional text -to-Visual Generation

Evaluating Data Attribution for text -to-Image Models

Evaluating OCR and Non-OCR text Representations for Learning Document Classifiers

Evaluating text -to-Image Matching using Binary Image Selection (BISON)

Evaluating text -to-Video Alignment: A Hierarchical Benchmark for Video Generation Models

Evaluating text -to-visual Generation with Image-to-text Generation

Evaluation of HMM-Based Techniques for the Recognition of Screen Rendered text , An

Evaluation of Model-Based Retrieval Effectiveness with OCR text

Evaluation of neural network language models in handwritten Chinese text recognition

Evaluation of the Concatenative Turkish text -to-Speech System

Evaluation of the Optimal Topic Classification for Social Media Data Combined with text Semantics: A Case Study of Public Opinion Analysis Related to COVID-19 with Microblogs

Event-Guided Procedure Planning from Instructional Videos with text Supervision

evidence-based model of saliency feature extraction for scene text analysis, An

Evolution Maps for Connected Components in text Documents

example-based prior model for text image super-resolution, An

Expanding Large Pre-trained Unimodal Models with Multimodal Information Injection for Image- text Multimodal Classification

Experimental Evaluation of OCR text Representations for Learning Document Classifiers, An

Experimental Investigation of text -Based CAPTCHA Attacks and Their Robustness, An

Experimental Study of Pruning Techniques in Handwritten text Recognition Systems, An

Experimental System for Office Document Handling and text Recognition, An

Experiments in text Recognition with Binary N-Gram and Viterbi Algorithms

Experiments in text Recognition with the Modified Viterbi Algorithm

Experiments in the Recognition of Handprinted text : Part I Character Recognition

Explain2Attack: text Adversarial Attacks via Cross-Domain Interpretability

Explaining Semantic text Similarity in Knowledge Graphs

Explicitly-Decoupled text Transfer With Minimized Background Reconstruction for Scene Text Editing

Exploiting Color Information for Better Scene text Recognition

Exploiting colour information for better scene text detection and recognition

Exploiting Unlabeled Videos for Video- text Retrieval via Pseudo-Supervised Learning

Exploring AIGC Video Quality: A Focus on Visual Harmony, Video- text Consistency and Domain Distribution Gap

Exploring Effective Interactive text -Based Video Search in vitrivr

Exploring Fine-Grained Visual- text Feature Alignment With Prompt Tuning for Domain-Adaptive Object Detection

Exploring Global and Local Linguistic Representations for text -to-Image Synthesis

Exploring Phrase Grounding without Training: Contextualisation and Extension to Text-Based Image Retrieval

Exploring Phrase-level Grounding with text -to-image Diffusion Model

Exploring Pre-trained text -to-video Diffusion Models for Referring Video Object Segmentation

Exploring Sparse MoE in GANs for text -conditioned Image Synthesis

Exploring Sparse Spatial Relation in Graph Inference for text -Based VQA

Exploring text representation impact on K-means based arabic text documents clustering

Exploring text -to-Motion Generation with Human Preference

Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene text Detection

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for text -to-Image Synthesis

Exploring The Potential of Vision-Language Models for Pure-Image and text -Guided-Image Saliency Prediction

Exploring the Spatial Distribution Characteristics of Emotions of Weibo Users in Wuhan Waterfront Based on Gender Differences Using Social Media Texts

Exploring the Spatiotemporal Patterns of Residents' Daily Activities Using text -Based Social Media Data: A Case Study of Beijing, China

Exposing fake images generated by text -to-image diffusion models

Expressive Image Generation and Editing with Rich text

Expressive text -to-Image Generation with Rich Text

Expressive visual text -to-speech as an assistive technology for individuals with autism spectrum conditions

Expressive Visual text -to-Speech Using Active Appearance Models

Extended Bi-gram Features in text Categorization

Extended character defect model for recognition of text from maps

Extending TrOCR for text Localization-Free OCR of Full-Page Scanned Receipt Images

External Word Segmentation of Off-Line Handwritten text Lines

Extracting Curved text Lines Using Local Linearity of Text Line

Extracting Spatio-Temporal Information from Chinese Archaeological Site text

Extracting text From Greyscale Images

Extracting text from WWW Images

Extraction and Recognition of Bangla Texts from Natural Scene Images Using Cnn

Extraction of Handwritten text from Carbon Copy Medical Form Images

Extraction of line-word-character segments directly from run-length compressed printed text -documents

Extraction of Lines of Texts in Unconstrained Handwritten Documents

Extraction of Nom text Regions from Stele Images Using Area Voronoi Diagram

Extraction of Pluvial Flood Relevant Volunteered Geographic Information (VGI) by Deep Learning from User Generated Texts and Photos

Extraction of Projection Profile, Run-Histogram and Entropy Features Straight from Run-Length Compressed text -Documents

Extraction of special effects caption text events from digital video

Extraction of Spelling Variations from Language Structure for Noisy text Correction

Extraction of text boxes from Engineering Drawings

Extraction of text Lines and Text Blocks on Document Images Based on Statistical Modeling

Extraction of text Words in Document Images Based on a Statistical Characterization

Extraction Of Thematically Relevant text From Images

extractive text summarization technique for Bengali document(s) using K-means clustering algorithm, An

Extractive text Summarization Using Topological Features

Extrapolate azimuth angles: text and edge guided ISAR image generation based on foundation model

Extremely Low-Light Image Enhancement with Scene text Restoration

Eyes Closed, Safety on: Protecting Multimodal LLMs via Image-to- text Transformation

FA-GAN: Feature-Aware GAN for text to Image Synthesis

Face typing: Vision-based perceptual interface for hands-free text entry with a scrollable virtual keyboard

FaceCLIP: Facial Image-to-Video Translation via a Brief text Description

FaceCLIPNeRF: text -driven 3D Face Manipulation using Deformable Neural Radiance Fields

Faces a la Carte: text -to-Face Generation via Attribute Disentanglement

Faces that Speak: Jointly Synthesising Talking Face and Speech from text

Facial Action Unit Recognition Enhanced by text Descriptions of FACS

Facsimile device with skew correction and text line direction detection

Factorizing text -to-video Generation by Explicit Image Conditioning

Factors in Emotion Recognition With Deep Learning Models Using Speech and text on Multiple Corpora

FakeInversion: Learning to Detect Images from Unseen text -to-Image Models by Inverting Stable Diffusion

Fantasia3D: Disentangling Geometry and Appearance for High-quality text -to-3D Content Creation

FARNet: Fragmented affinity reasoning network of text instances for arbitrary shape text detection

Fashion Image Retrieval with text Feedback by Additive Attention Compositional Learning

Fast and accurate scene text understanding with image binarization and off-the-shelf OCR

Fast and Accurate text Detection in Natural Scene Images with User-Intention

Fast and effective text detection

fast and efficient method for extracting text paragraphs and graphics from unconstrained documents, A

Fast and Efficient text Steganalysis Method, A

Fast and Flexible Statistical Method for text Extraction in Document Pages, A

Fast and memory efficient text image compression with JBIG2

Fast and robust text detection in images and video frames

fast and robust text spotter, A

Fast and simple text replacement algorithm for text-based augmented reality

Fast Appearance-Based Full- text Search Method for Historical Newspaper Images, A

Fast Approximate Modelling of the Next Combination Result for Stopping the text Recognition in a Video

Fast Coding-Mode Selection and CU-Depth Prediction Algorithm Based on text -Block Recognition for Screen Content Coding

fast hierarchical method for multi-script and arbitrary oriented scene text extraction, A

Fast Lexicon-Based Scene text Recognition with Sparse Belief Propagation

fast multiresolution text line and non text-line structures extraction and discrimination scheme for document image analysis, A

Fast online incremental approach of unseen place classification using disjoint- text attribute prediction

Fast perspective recovery of text in natural scenes

Fast scene text localization by learning-based filtering and verification

Fast Selection of Small and Precise Candidate Sets from Dictionaries for text Correction Tasks

Fast Supervised Topic Models for Short text Emotion Detection

Fast text categorization using concise semantic analysis

Fast text line detection by finding linear connected components on Canny edge image

Fast text line extraction in document images

Fast text /graphics resolution improvement using wavelet based denoising and chain-code table lookup

Fast Uyghur text detection in videos based on learning of baseline feature

Fast Uyghur text Detector for Complex Background Images, A

Fast(er) Reconstruction of Shredded text Documents via Self-Supervised Deep Asymmetric Metric Learning

Fast, Accurate, and Lightweight Memory-Enhanced Embedding Learning Framework for Image- text Retrieval

FAST: Facilitated and Accurate Scene text Proposals through FCN Guided Pruning

FastCLIPstyler: Optimisation-free text -based Image Style Transfer Using Style Representations

FastEdit: fast text -guided single-image editing via semantic-aware diffusion fine-tuning

FASTER: A Font-Agnostic Scene text Editing and Rendering Framework

FASText: Efficient Unconstrained Scene Text Detector

FastFaceCLIP: A lightweight text -driven high-quality face image manipulation

FastVideoEdit: Leveraging Consistency Models for Efficient text -to-Video Editing

FateZero: Fusing Attentions for Zero-shot text -based Video Editing

FC-Render: Adaptive Font- and Color-Aware text Diffusion Model

FDS: Frequency-Aware Denoising Score for text -Guided Latent Diffusion Image Editing

Feature Embedding Based text Instance Grouping for Largely Spaced and Occluded Text Detection

Feature extracted from wavelet decomposition using biorthogonal Riesz basis for text -independent speaker recognition

Feature extracted from wavelet eigenfunction estimation for text -independent speaker recognition

Feature First: Advancing Image- text Retrieval Through Improved Visual Features

Feature Fusion Network for Scene text Detection

Feature Representations for Scene text Character Recognition: A Comparative Study

Feature selection for event extraction in biomedical text

Feature selection to recognize text from palm leaf manuscripts

Feature selection using hybrid poor and rich optimization algorithm for text classification

Feature subset selection using naive Bayes for text classification

Feature Weight Optimization and Pruning in Historical text Recognition

FedSH: Towards Privacy-Preserving text -Based Person Re-Identification

FeedEdit: text -Based Image Editing with Dynamic Feedback Regulation

FERGI: Automatic Scoring of User Preferences for text -to-Image Generation from Spontaneous Facial Expression Reaction

FETNet: Feature erasing and transferring network for scene text removal

Few Could Be Better Than All: Feature Sampling and Grouping for Scene text Detection

Few shots are all you need: A progressive learning approach for low resource handwritten text recognition

Few-shot Hierarchical text Classification with Bidirectional Path Constraint by label weighting

Few-Shot text Style Transfer via Deep Feature Similarity

Fg-T2M++: LLMs-Augmented Fine-Grained text Driven Human Motion Generation

Fg-T2M: Fine-Grained text -Driven Human Motion Generation via Diffusion Model

FHT: An Unconstraint Farsi Handwritten text Database

Find More Accurate text Boundary for Scene Text Detection

Find text in Documents

Find text in Video Scenes

Finding Hidden Semantics of text Tables

Finding structure in noisy text : Topic classification and unsupervised clustering

Finding text In Images

Finding text in Natural Scenes by Figure-Ground Segmentation

Finding text Regions using Localised Statistical Measures

Fine-Grained Erasure in text -To-Image Diffusion-Based Foundation Models

Fine-Grained Image- text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation

Fine-grained Image- text Matching by Cross-modal Hard Aligning Network

Fine-grained Image- text Retrieval via Complementary Feature Learning

Fine-Grained Image- text Retrieval via Discriminative Latent Space Learning

Fine-grained semantic oriented embedding set alignment for text -based person search

Fine-Grained Video- text Retrieval With Hierarchical Graph Reasoning

Fine-Grained Visual text Prompting

Fine-Granularity Alignment for text -Based Person Retrieval Via Semantics-Centric Visual Division

Fine-Tuning text -To-Image Diffusion Models for Class-Wise Spurious Feature Generation

FineControlNet: Fine-level text Control for Image Generation with Spatially Aligned Text Control Injection

FineLIP: Extending CLIP's Reach via Fine-Grained Alignment with Longer text Inputs

Finematch: Aspect-based Fine-grained Image and text Mismatch Detection and Correction

Fisher Linear Discriminant Analysis for text -image combination in multimedia information retrieval

FlashEval: Towards Fast and Accurate Evaluation of text -to-Image Diffusion Generative Models

Flexible text Recovery from Degraded Typewritten Historical Documents

Flick Typing: A New VR text Input System Based on Space Gestures

FlipSketch: Flipping Static Drawings to text -Guided Sketch Animations

Focal text : an Accurate Text Detection with Focal Loss

Focal Visual- text Attention for Memex Question Answering

Focal Visual- text Attention for Visual Question Answering

Focus Entirety and Perceive Environment for Arbitrary-Shaped text Detection

Focus on Scene text Using Deep Reinforcement Learning

Focus-N-Fix: Region-Aware Fine-Tuning for text -to-Image Generation

FocusCLIP: Focusing on Anomaly Regions by Visual- text Discrepancies

Focusing Attention: Towards Accurate text Recognition in Natural Images

Font Recognition and Con text ual Processing for More Accurate Text Recognition

Font Watermarking Network for text Images

Fontender: Interactive Japanese text Design with Dynamic Font Fusion Method for Comics

Food3D: text -Driven Customizable 3D Food Generation With Gaussian Splatting

Foreground and background separated image style transfer with a single text condition

Foreground and text -lines Aware Document Image Rectification

Foreground text Extraction in Color Document Images for Enhanced Readability

Foreground text segmentation in complex color document images using Gabor filters

Forged text detection in video, scene, and document images

Forget-Me-Not: Learning to Forget in text -to-Image Diffusion Models

Formal Distance vs. Association Strength in text Processing

Formalization of On-Line Handwritten Japanese text Recognition Free from Line Direction Constraint, A

FOTS: Fast Oriented text Spotting with a Unified Network

Fourier Contour Embedding for Arbitrary-Shaped text Detection

Fractals Based Multi-Oriented text Detection System for Recognition in Mobile Video Images

Fractional poisson enhancement model for text detection and recognition in video frames

Framework for Detecting and Selecting text Line Candidates of Correct Orientation, A

Framework for Performance Evaluation of Face, text , and Vehicle Detection and Tracking in Video: Data, Metrics, and Protocol

Free-editor: Zero-shot text -driven 3d Scene Editing

Free- text keystroke dynamics authentication for Arabic language

FREE: A Fast and Robust End-to-End Video text Spotter

FreeControl: Training-Free Spatial Control of Any text -to-Image Diffusion Model with Any Condition

Freemotion: A Unified Framework for Number-Free text -to-Motion Synthesis

Frequency-selective countnet: Enhancing text -guided object counting with frequency features

Frequent Itemsets Methods for text Clustering

Fringe Map Based text Line Segmentation of Printed Telugu Document Images

From External to Internal: Structuring Image for text -to-Image Attributes Manipulation

From senses to text s: An all-in-one graph-based approach for measuring semantic similarity

From text Detection to Text Segmentation: A Unified Evaluation Scheme

From text to Speech: A Multimodal Cross-Domain Approach for Deception Detection

From text to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification

From Two to One: A New Scene text Recognizer with Visual Language Modeling Network

From Words to Structured Visuals: A Benchmark and Framework for text -to-Diagram Generation and Editing

FT2TF: First-Person Statement text -to-Talking Face Generation

Full- text Access to Historical Newspapers

Full- text Search System for Images of Hand-Written Cursive Documents, A

Fully convolutional network with dilated convolutions for handwritten text line segmentation

Fully convolutional recurrent network for handwritten Chinese text recognition

Fully Shareable Scene text Recognition Modeling for Horizontal and Vertical Writing

Fundamental Visual Concept Learning From Correlated Images and text

Furniture-geek: Understanding fine-grained furniture attributes from freely associated text and tags

Further explorations in text alignment with handwritten documents

Further reduced form of wavelet feature for text independent speaker recognition

Fused text Segmentation Networks for Multi-oriented Scene Text Detection

Fusion Encoder with Multi-Task Guidance for Cross-Modal text -Image Retrieval in Remote Sensing, A

Fusion of Speech, Faces and text for Person Identification in TV Broadcast

Fusion Strategy for the Single Shot text Detector, A

fuzzy find matching tool for image text analysis, A

Fuzzy Inference-Based Models for Extractive text Summarization

Fuzzy Semantics for Arbitrary-Shaped Scene text Detection

Fuzzy text /non-text classification of document images based on morphological operator, wavelet transform, and strong feature vector

GA-DAN: Geometry-Aware Domain Adaptation Network for Scene text Detection and Recognition

Gabor filter based block energy analysis for text extraction from digital document images

GADNet: Improving image- text matching via graph-based aggregation and disentanglement

GALIP: Generative Adversarial CLIPs for text -to-Image Synthesis

Gamma correction acceleration for real-time text extraction from complex colored images

GAN-TSTEGA: text Steganography Based on Generative Adversarial Networks

GANFusion: Feed-Forward text -to-3D with Diffusion in GAN Space

Garmentaligner: text -to-garment Generation via Retrieval-augmented Multi-level Corrections

Gated Cross Word-visual Attention-driven Generative Adversarial Networks for text -to-image Synthesis

Gatha: Relational Loss for enhancing text -based style transfer

Gaussctrl: Multi-view Consistent text -driven 3d Gaussian Splatting Editing

Gaussian Constrained Attention Network for Scene text Recognition

Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images

Gaussian Mixture Modeling of Neighbor Characters for Multilingual text Extraction in Images

GaussianDreamer: Fast Generation from text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

GaussianEditor: Editing 3D Gaussians Delicately with text Instructions

Gaussians-to-Life: text -Driven Animation of 3D Gaussian Splatting Scenes

GCNs-Based Con text -Aware Short Text Similarity Model

Gender Bias in text -to-Video Generation Models: A Case Study of Sora

General and domain-specific techniques for detecting and recognizing superimposed text in video

general approach for multi-oriented text line extraction of handwritten documents, A

Generalized Interpolative Vector Quantization Method for Jointly Optimal Quantization, Interpolation, and Binarization of text Images, A

Generalizing Edit Distance to Incorporate Domain Information: Handwritten text Recognition as a Case-Study

Generalizing to Unseen Domains via text -guided Augmentation: A Training-free Approach

Generatect: text -conditional Generation of 3d Chest CT Volumes

Generating Diverse and Natural 3D Human Motions from text

Generating Holistic 3D Scene Abstractions for text -Based Image Retrieval

Generating Human Interaction Motions in Scenes with text Control

Generating Human Motion in 3D Scenes from text Descriptions

Generation of Viewed Image Captions From Human Brain Activity Via Unsupervised text Latent Space

Generative Adversarial Approach for Zero-Shot Learning from Noisy text s, A

Generative Adversarial Network for text -to-Face Synthesis and Manipulation with Pretrained BERT Model

Generative Adversarial Networks Based on Dynamic Word-Level Update for text -to-Image Synthesis

Generative and Discriminative Fuzzy Restricted Boltzmann Machine Learning for text and Image Classification

Generative Image Steganography Based on text -to-Image Multimodal Generative Model

Generative Negative text Replay for Continual Vision-Language Pretraining

Generative Photography: Scene-Consistent Camera Control for Realistic text -to-Image Synthesis

Generative text Convolutional Neural Network for Hierarchical Document Representation Learning

generic method for determining the up/down orientation of text in Roman and non-Roman scripts, A

generic method for determining up/down orientation of text in Roman and non-Roman scripts, A

GeoAnnotator: A Collaborative Semi-Automatic Platform for Constructing Geo-Annotated text Corpora

Geometry Normalization Networks for Accurate Scene text Detection

Geometry-Aware Scene text Detection with Instance Transformation Network

Geospatial Semantics Analysis of the Qinghai-Tibetan Plateau Based on Microblog Short text s

Geotagging text Content With Language Models and Feature Mining

Getting it Right: Improving Spatial Consistency in text -to-image Models

GHOST: Grounded Human Motion Generation with Open Vocabulary Scene-and- text Contexts

GLASS: Global to Local Attention for Scene- text Spotting

GLIGEN: Open-Set Grounded text -to-Image Generation

Global-aware Fragment Representation Aggregation Network for image- text retrieval

Global-local prompts guided image- text embedding, alignment and aggregation for multi-label zero-shot learning

Global-Shared text Representation Based Multi-Stage Fusion Transformer Network for Multi-Modal Dense Video Captioning

Glyph-BYT5: A Customized text Encoder for Accurate Visual Text Rendering

GlyphMastero: A Glyph Encoder for High-Fidelity Scene text Editing

goal-oriented verification-based approach for target text line extraction from a document image captured by a pen scanner, A

Good Seed Makes a Good Crop: Discovering Secret Seeds in text -to-Image Diffusion Models

GPT-4V(ision) is a Human-Aligned Evaluator for text -to-3D Generation

GPT-Based text -to-SQL for Spatial Databases

GPT4Motion: Scripting Physical Motions in text -to-Video Generation via Blender-Oriented GPT Planning

GradBias: Unveiling Word Influence on Bias in text -to-Image Generative Models

Gradient Difference Based Technique for Video text Detection, A

Gradient Vector Flow and Grouping-Based Method for Arbitrarily Oriented Scene text Detection in Video Images

Gradient-based approach to offline text -independent Persian writer identification

GraDual: Graph-based Dual-modal Representation for Image- text Matching

Granularity-Aware Single-Point Scene text Spotting With Sequential Recurrence Self-Attention

Graph based method for Arabic text summarization

Graph Clustering-Based Ensemble Method for Handwritten text Line Segmentation

Graph Structured Network for Image- text Matching

Graph-based Method to Remove Interferential Curve From text Image, A

Graph-Based Segmentation and Feature-extraction Framework for Arabic text Recognition, A

Graph-Based text Segmentation Using a Selected Channel Image

Graph-empowered text -to-SQL generation on Electronic Medical Records

Graphical Figure Classification Using Data Fusion for Integrating text and Image Features

Graphics and Scene text Classification in Video

Graphological Analysis of Handwritten text Documents for Human Resources Recruitment

Grid Diffusion Models for text -to-Video Generation

Grit: A Generative Region-to- text Transformer for Object Understanding

Grounded Image text Matching with Mismatched Relation Reasoning

Grounded text -to-Image Synthesis with Attention Refocusing

Grounding Visual Representations with text s for Domain Generalization

Grouping text lines in freeform handwritten notes

Grouping Using Factor Graphs: An Approach for Finding text with a Camera Phone

GroupViT: Semantic Segmentation Emerges from text Supervision

GSAM+Cutie: text -Promptable Tool Mask Annotation for Endoscopic Video

Guided text Spotting for Assistive Blind Navigation in Unfamiliar Indoor Environments

Guiding Prototype Networks with label semantics for few-shot text classification

Gvgen: text -to-3d Generation with Volumetric Representation

HACG: Leveraging Hierarchical Alignment and Caption Generation for text -Video Retrieval

HairCLIP: Design Your Hair by text and Reference Image

Hallucination Elimination and text Annotation Framework for Large Vision-Language Models in Traffic Scenarios

HAM: Hidden Anchor Mechanism for Scene text Detection

Hand-Gesture Based text Input for Wearable Computers

Hand-written text recognition based on a new formulation

HanDiffuser: text -to-Image Generation with Realistic Hand Appearances

Handwriting Recognition: Tablet PC text Input

handwritten ancient text detector based on improved feature pyramid network, A

Handwritten and Machine Printed text Separation in Document Images Using the Bag of Visual Words Paradigm

Handwritten and Printed text Segmentation: A Signature Case Study

Handwritten and Printed text Separation: Linearity and Regularity Assessment

Handwritten and Typewritten text Identification and Recognition Using Hidden Markov Models

Handwritten Arabic text recognition using Deep Belief Networks

Handwritten Arabic text recognition using multi-stage sub-core-shape HMMs

Handwritten Chinese text line segmentation by clustering with distance metric learning

Handwritten Chinese text Recognition by Integrating Multiple Contexts

Handwritten Chinese/Japanese text Recognition Using Semi-Markov Conditional Random Fields

Handwritten document image segmentation into text lines and words

Handwritten Signature and text based User Verification using Smartwatch

Handwritten text Generation from Visual Archetypes

Handwritten text Generation via Disentangled Representations

Handwritten text Line Identification in Indian Scripts

Handwritten text Line Segmentation by Shredding Text into its Lines

Handwritten text Localization in Skewed Documents

Handwritten text Recognition for Marriage Register Books

Handwritten text recognition through writer adaptation

Handwritten text Retrieval Using Two-Stage Pattern Matching with Handwritten Query

Handwritten text segmentation using average longest path algorithm

Handwritten text Segmentation Using Elastic Shape Analysis

Handwritten text Separation from Annotated Machine Printed Documents Using Markov Random Fields

Handwritten text s for Personality Identification Using Convolutional Neural Networks

Harivo: Harnessing text -to-image Models for Video Generation

Harnessing text Insights With Visual Alignment for Medical Image Segmentation

Harnessing text -to-image Diffusion Models for Category-agnostic Pose Estimation

Harnessing the Power of MLLMs for Transferable text -to-Image Person ReID

Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing text Spotting Performance

Harnessing the Power of text -image Contrastive Models for Automatic Detection of Online Misinformation

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity text -to-Image Synthesis

HD-Fusion: Detailed text -to-3D Generation Leveraging Multiple Noise Estimation

Head-Mounted Device for Recognizing text in Natural Scenes, A

HeadEvolver: text to Head Avatars via Expressive and Attribute-Preserving Mesh Deformation

Headstudio: text to Animatable Head Avatars with 3d Gaussian Splatting

Heterogeneous Graph to Abstract Syntax Tree Framework for text -to-SQL, A

HexaGen3D: StableDiffusion is One Step Away from Fast and Diverse text -to-3D Generation

HFENet: Hybrid Feature Enhancement Network for Detecting text s in Scenes and Traffic Panels

HGAN: Hierarchical Graph Alignment Network for Image- text Retrieval

HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene text Detection

Hi-SAM: Marrying Segment Anything Model for Hierarchical text Segmentation

Hidden Bawls, Whispers, and Yelps: Can text Convey the Sound of Speech, Beyond Words?

Hidden Markov Model for Language Syntax in text Recognition, A

Hidden Markov Model-Based Ensemble Methods for Offline Handwritten text Line Recognition

Hierarchical Adaptive Filtering Network for text Image Specular Highlight Removal

Hierarchical Feature Aggregation Based on Transformer for Image- text Matching

Hierarchical online NMF for detecting and tracking topic hierarchies in a text stream

Hierarchical Shape Primitive Features for Online text -independent Writer Identification

Hierarchical Spatio-temporal Decoupling for text -to-Video Generation

Hierarchical text Spotter for Joint Text Spotting and Layout Analysis

Hierarchically-Fused Generative Adversarial Network for text to Realistic Image Synthesis

HierCode: A lightweight hierarchical codebook for zero-shot Chinese text recognition

HierLabelNet: A Two-Stage LLMs Framework with Data Augmentation and Label Selection for Geographic text Classification

high-capacity text watermarking method based on geometric micro-distortion, A

High-Dimensional Access Method for Approximated Similarity Search in text Mining, A

Highly Transparent and Secure Scheme for Concealing text Within Audio

Histogram-Based Two-Stage Adaptive Character Segmentation for Transcription of Inter-Point Hindi Braille to text , A

Historical Handwritten text Images Word Spotting Through Sliding Window HOG Features

HiT: Hierarchical Transformer with Momentum Contrast for Video- text Retrieval

HMM-Based Approach for text Region Detection in Coded Video Bitstreams

HMM-Based Multi Oriented text Recognition in Natural Scene Image

HMM-Based Recognizer with Segmentation-free Strategy for Unconstrained Chinese Handwritten text

HOI-Diff: text -Driven Synthesis of 3D Human-Object Interactions using Diffusion Models

HOIAnimator: Generating text -Prompt Human-Object Animations Using Novel Perceptive Diffusion Models

Holistic Features are Almost Sufficient for text -to-Video Retrieval

Holistic Vertical Regional Proposal Network for Scene text Detection

HOVER: Hyperbolic Video- text Retrieval

How Good Is Good Enough? Establishing Quality Thresholds for the Automatic text Analysis of Retro-Digitized Comics

How is Visual Attention Influenced by text Guidance? Database and Model

How Much Handwritten text Is Needed for Text-Independent Writer Verification and Identification

How to Make Cross Encoder a Good Teacher for Efficient Image- text Retrieval?

HowTo100M: Learning a text -Video Embedding by Watching Hundred Million Narrated Video Clips

HRS-Bench: Holistic, Reliable and Scalable Benchmark for text -to-Image Models

HTD: A Fast Human-centered text -locating Method for Auxiliary Reading

HTR-VT: Handwritten text recognition with vision transformer

Human Motion Aware text -to-Video Generation with Explicit Camera Control

Human Preference Score: Better Aligning text -to-image Models with Human Preference

Human-centered Interactive Learning via MLLMs for text -to-Image Person Re-identification

HumanGaussian: text -Driven 3D Human Generation with Gaussian Splatting

Hybrid Algorithm for Con text ual Text Recognition, A

Hybrid approach for Farsi/Arabic text detection and localisation in video frames

Hybrid Approach to Detect and Localize text s in Natural Scene Images, A

Hybrid Approach to Detect text s in Natural Scenes by Integration of a Connected-Component Method and a Sliding-Window Method, A

Hybrid approach to efficient text extraction in complex color images

Hybrid Chinese/English text detection in images and video frames

Hybrid Con text ual Text Recognition with String Matching

Hybrid Deep Architecture for Robust Recognition of text Lines of Degraded Printed Documents, A

hybrid method based on estimation of distribution algorithms to train convolutional neural networks for text categorization, A

Hybrid Network For End-To-End text -Independent Speaker Identification

Hybrid R-BILSTM-C Neural Network Based text Steganalysis, A

Hybrid word/Part-of-Arabic-Word Language Models for arabic text document recognition

HybridEditDif: text and Exemplar Guided Image Editing with Diffusion Models

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and text s

Hyper-3DG: text -to-3D Gaussian Generation via Hypergraph

HyperDreamBooth: HyperNetworks for Fast Personalization of text -to-Image Models

HyperStyle3D: text -Guided 3D Portrait Stylization via Hypernetworks

Hypothesis Preservation Approach to Scene text Recognition with Weighted Finite-State Transducer

hypothesize-and-verify framework for text recognition using deep recurrent neural networks, A

Hy text : A Scene-Text Extraction Method for Video Retrieval

I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for text -Guided Multi-Mask Inpainting

I2T2I: Learning text to image synthesis with textual data augmentation

I2T: Image Parsing to text Description

I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene text Detection

IAM-OnDB: An on-line English sentence database acquired from handwritten text on a whiteboard

IBM Rich Transcription 2007 Speech-to- text Systems for Lecture Meetings, The

IBN-STR: A Robust text Recognizer for Irregular Text in Natural Scenes

ICA Based Approach for Complex Color Scene text Binarization, An

ICDAR 2005 text locating competition results

ICDAR 2011 Robust Reading Competition - Challenge 1: Reading text in Born-Digital Images (Web and Email)

ICDAR 2011 Robust Reading Competition Challenge 2: Reading text in Scene Images

ICDAR 2011: Arabic Recognition Competition: Multi-font Multi-size Digitally Represented text

ICDAR 2015 competition HTRtS: Handwritten text Recognition on the tranScriptorium dataset

ICDAR 2015 competition on text line detection in historical documents

ICDAR 2015 contest on MultiSpectral text Extraction (MS-TEx 2015)

ICDAR2013 Competition on Multi-font and Multi-size Digitally Represented Arabic text

ICDAR2015 competition on text Image Super-Resolution

ICPR 2020 Competition on text Block Segmentation on a NewsEye Dataset

ICPR 2020 Competition on text Block Segmentation on a Newseye Dataset

ICPR2016 contest on Arabic text detection and Recognition in video frames - AcTiVComp

ICPR2020 Competition on text Detection and Recognition in Arabic News Video Frames

ICT-QA: Question Answering Over Multi-Modal Con text s Including Image, Chart, and Text Modalities

IDAdapter: Learning Mixed Features for Tuning-Free Personalization of text -to-Image Models

IDBNet: Improved differentiable binarisation network for natural scene text detection

IDEA: Inverted text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Identification of personality traits from handwritten text documents using multi-label classification models

Identifying Handwritten text in Mixed Documents

Identifying SCADA Systems and Their Vulnerabilities on the Internet of Things: A text -Mining Approach

Identifying visual attributes for object recognition from text and taxonomy

Identity-Preserving text -To-Video Generation by Frequency Decomposition

iEdit: Localised text -guided Image Editing with Weak Supervision

Illegible text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks

Illusion of Unlearning: The Unstable Nature of Machine Unlearning in text -to-Image Diffusion Models, The

iLogBook: Enabling text -Searchable Event Query Using Sparse Vehicle-Mounted GPS Data

Im2 text and Text2Im: Associating Images and Texts for Cross-Modal Retrieval

Image and text Coupling for Creating Electronic Books from Manuscripts

Image and text fusion for UPMC Food-101 using BERT and CNNs

Image Binarization for End-to-End text Understanding in Natural Images

Image clustering using generated text centroids

Image Dataset of text Patches in Everyday Scenes, An

Image Generation Method of Bird text Based on Improved StackGAN

Image is Worth Multiple Words: Multi-Attribute Inversion for Constrained text -To-Image Synthesis, An

Image Over text : Transforming Formula Recognition Evaluation with Character Detection Matching

Image Overlay text Detection Based on JPEG Truncation Error Analysis

Image Retrieval for Visual Localization via Scene text Detection and Logo Filtering

Image Search With text Feedback by Visiolinguistic Attention Learning

Image Segmentation Using text and Image Prompts

Image text Detection Using a Bandlet-Based Edge Detector and Stroke Width Transform

Image-based Document Vectors for text Retrieval

Image- text Co-Decomposition for Text-Supervised Semantic Segmentation

Image- text Embedding Learning via Visual and Textual Semantic Reasoning

Image- text feature learning for unsupervised visible-infrared person re-identification

Image- text Matching, Image Text Retrieval, Image-Text Retrieval

Image- text Multimodal Emotion Classification via Multi-View Attentional Network

Image- text Pre-Training for Logo Recognition

Image- text Retrieval With Cross-Modal Semantic Importance Consistency

Image- text -Image Knowledge Transfer for Lifelong Person Re-Identification With Hybrid Clothing States

Image-to-Character-to-Word Transformers for Accurate Scene text Recognition

Image-to- text Conversion and Aspect-Oriented Filtration for Multimodal Aspect-Based Sentiment Analysis

Image/ text filtering system and method

Imaged Document text Retrieval Without OCR

Imagen Editor and EditBench: Advancing and Evaluating text -Guided Image Inpainting

Imagic: text -Based Real Image Editing with Diffusion Models

IMMA: Immunizing text -to-image Models Against Malicious Adaptation

Impact of Character Models Choice on Arabic text Recognition Performance

Impact of OCR Accuracy and Feature Transformation on Automatic text Classification, The

Impact of OCR Errors on Automated Classification of OCR Japanese text s with Parts-of-Speech Analysis, An

Impact of online handwriting recognition performance on text categorization

Impact of Pre-Processing on Recognition of Cursive Video text

Imperceptible Backdoor Attacks on text -Guided 3D Scene Grounding

Implementation of Advanced Encryption Standard for encryption and decryption of images and text on a GPU

Implementation of Three text to Speech Systems for Kurdish Language

Implicit Bias Injection Attacks against text -to-Image Diffusion Models

Implicit Feature Alignment: Learn to Convert text Recognizer to Text Spotter

Improved Component Tree Based Approach to User-Intention Guided text Extraction from Natural Scene Images, An

Improved Document Skew Detection Based on text Line Connected-component Clustering

Improved Gini-Index Algorithm to Correct Feature-Selection Bias in text Classification

Improved Legibility of text for Multiprojector Tiled Displays

Improved localization accuracy by LocNet for Faster R-CNN based text detection in natural scene images

Improved Method Based on Weighted Grid Micro-structure Feature for text -Independent Writer Recognition, An

Improved SAR Ship Classification Method Using text -to-Image Generation-Based Data Augmentation and Squeeze and Excitation, An

Improved Scene text Extraction Method Using Conditional Random Field and Optical Character Recognition, An

Improved shot boundary detection method based on text edges

Improved text -detection methods for a camera-based text reading system for blind persons

Improved Zero-Shot Classification by Adapting VLMs with text Descriptions

Improvement of video text recognition by character selection

Improving accuracy of arbitrary-shaped text detection using ResNet-152 backbone-based pixel aggregation network

Improving Cross-Modal Constraints: text Attribute Person Search With Graph Attention Networks

Improving Cross-Modal Image- text Retrieval With Teacher-Student Learning

Improving Description-Based Person Re-Identification by Multi-Granularity Image- text Alignments

Improving distinctiveness in video captioning with text -video similarity

Improving End-to-End text Image Translation From the Auxiliary Text Translation Task

Improving Faithfulness of text -to-Image Diffusion Models through Inference Intervention

Improving Fine-Grained Understanding for Retrieval in Human Motion and text

Improving Full- text Precision on Short Queries Using Simple Constraints

Improving Handwritten Chinese text Recognition by Confidence Transformation

Improving handwritten Chinese text recognition using neural network language models and convolutional neural network shape models

Improving Image Recognition by Retrieving from Web-Scale Image- text Data

Improving image similarity measures for image browsing and retrieval through latent space learning between images and long text s

Improving Image- text Matching by Integrating Word Sense Disambiguation

Improving Image- text Matching With Bidirectional Consistency of Cross-Modal Alignment

Improving Multi-class text Classification with Naive Bayes

Improving Multiclass text Classification with the Support Vector Machine

Improving OCR text Categorization Accuracy with Electronic Abstracts

Improving Offline Handwritten text Recognition with Hybrid HMM/ANN Models

Improving Open-Vocabulary Scene text Recognition

Improving patch-based scene text script identification with ensembles of conjoined networks

Improving Persian text Classification Using Persian Thesaurus

Improving Scene text Detection by Scale-Adaptive Segmentation and Weighted CRF Verification

Improving text Classifier Performance based on AUC

Improving text -Based Person Search by Spatial Matching and Adaptive Threshold

Improving text -guided Object Inpainting with Semantic Pre-inpainting

Improving text -image Matching with Adversarial Learning and Circle Loss for Multi-modal Steganography

Improving Vision-and-language Navigation with Image- text Pairs from the Web

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image- text Retrieval

In-air Handwritten Chinese text Recognition with Attention Convolutional Recurrent Network

In-air handwritten Chinese text recognition with temporal convolutional recurrent network

In-Style: Bridging text and Uncurated Videos with Style Transfer for Text-Video Retrieval

Incorporating Language Syntax in Visual text Recognition with a Statistical-Model

Incorporating Self-attention Mechanism and Multi-task Learning into Scene text Detection

Incremental Approach to text Representation, Categorization, and Retrieval, An

Incremental Detection of text on Road Signs

Incremental text -to-Speech Synthesis Using Pseudo Lookahead With Large Pretrained Language Model

Indexing On-line Handwritten text s Using Word Confusion Networks

Indexing text Events in Digital Video Databases

InducT-GCN: Inductive Graph Convolutional Networks for text Classification

Industrial Scene text Detection With Refined Feature-Attentive Network

Inferential Rules for Identifying Answers in TOEFL text s

Inferring Semantic Layout for Hierarchical text -to-Image Synthesis

Infinite Liouville mixture models with application to text and texture categorization

Inflation with Diffusion: Efficient Temporal Adaptation for text -to-Video Super-Resolution

Influence of text line segmentation in Handwritten Text Recognition

Information Detection for the Process of Typhoon Events in Microblog text : A Spatio-Temporal Perspective

Information Extraction and Classification from Free text Using a Neural Approach

Information extraction from scanned invoice images using text analysis and layout features

Information fusion for text classification an experimental comparison

Information Theoretic text Classification Using the Ziv-Merhav Method

InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot text -based Video Editing

Initialized and Guided EM-clustering of Sparse Binary Data with Application to text Based Documents

Initno: Boosting text -to-Image Diffusion Models via Initial Noise Optimization

Injecting text Clues for Improving Anomalous Event Detection From Weakly Labeled Videos

InNeRF360: text -Guided 3D-Consistent Object Inpainting on 360° Neural Radiance Fields

inpainting system for automatic image structure- text ure restoration with text removal, An

Inspecting the Geographical Representativeness of Images from text -to-Image Models

Instance-wise distribution control of text -to-image diffusion models

InstanceCap: Improving text -to-Video Generation via Instance-aware Structured Caption

Instant3D: Instant text -to-3D Generation

InstantBooth: Personalized text -to-Image Generation without Test-Time Finetuning

Instruction-Augmented Multimodal Alignment for Image- text and Element Matching

Instruction-Guided Scene text Recognition

Instructive3D: Editing Large Reconstruction Models with text Instructions

Instrumental Assessment of Prosodic Quality for text -to-Speech Signals

Integrated Algorithm for text Recognition: Comparison with a Cascaded Algorithm, An

Integrated text and Line-Art Extraction from a Topographic Map

Integrating Geometric Con text for Text Alignment of Handwritten Chinese Documents

Integrating Knowledge Sources in Devanagari text Recognition System

Integrating Language Guidance Into Image- text Matching for Correcting False Negatives

Integrating Language Model in Handwritten Chinese text Recognition

Integrating multiple character proposals for robust scene text extraction

Integrating Visual, Audio and text Analysis for News Video

Integrating word level knowledge in text recognition

Integration of Linguistic and Geospatial Features Using Global Con text Embedding for Automated Text Geocoding, The

Intelligent Typography: Artistic text Style Transfer for Complex Texture and Structure

IntelliSearch: Intelligent Search for Images and text on the Web

Inter-Intra Modal Representation Augmentation With DCT-Transformer Adversarial Network for Image- text Matching

InteractDiffusion: Interaction Control in text -to-Image Diffusion Models

Interactive Enhancement of Handwritten text through Multi-resolution Gaussian

Interactive Image Manipulation with Complex text Instructions

Interactive Off-Line Handwritten text Transcription Using On-Line Handwritten Text as Feedback

Interactive System to Extract Structured text from a Geometrical Representation, An

Interactive text books; Embedding Image Processing Operator Demonstrations in Text

Interfusion: text -driven Generation of 3d Human-object Interaction

Interleaved text /image Deep Mining on a large-scale radiology database

Interpretation of The Function of The Obelisk of Augustus in Rome From Antique text s to Present Time Virtual Reconstruction

Interword distance changes represented by sine waves for watermarking text images

Intra-modal consistency for image- text retrieval through soft-label distillation

Intra-Modal Constraint Loss for Image- text Retrieval

Intriguing Properties of Diffusion Models: An Empirical Study of the Natural Attack Capability in text -to-Image Generative Models

Inverse-Like Antagonistic Scene text Spotting via Reading-Order Estimation and Dynamic Sampling

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of text -to-Video Diffusion Models

Investigation on LLMs' Visual Understanding Ability Using SVG for Image- text Bridging, An

Investigations in Psychological Stress Detection from Social Media text using Deep Architectures

IOS-Net: An inside-to-outside supervision network for scale robust text detection in the wild

IPAD: Iterative, Parallel, and Diffusion-Based Network for Scene text Recognition

Irregular text block recognition via decoupling visual, linguistic, and positional information

Is An Image Worth Five Sentences? A New Look into Semantics for Image- text Matching

Is Arabic text categorization a solved task?

ISL RT-07 Speech-to- text System, The

ISTD-DLA: Industrial Scene text Detection Method Based on Dynamic Local-Aware Aggregation Network

It's All About The Scale: Efficient text Detection Using Adaptive Scaling

ITACLIP: Boosting Training-Free Semantic Segmentation with Image, text , and Architectural Enhancements

IterVM: Iterative Vision Modeling Module for Scene text Recognition

ITI-Gen: Inclusive text -to-Image Generation

JECL: Joint Embedding and Cluster Learning for Image- text Pairs

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized text -to-Image Generation

Joint architecture and knowledge distillation in CNN for Chinese text recognition

Joint embeddings with multimodal cues for video- text retrieval

Joint Handwritten text Recognition and Word Classification for Tabular Information Extraction

Joint Image- text News Topic Detection and Tracking by Multimodal Topic And-Or Graph

Joint Inference of Objects and Scenes With Efficient Learning of text -Object-Scene Relations

Joint Intra & Inter-Grained Reasoning: A New Look Into Semantic Consistency of Image- text Retrieval

Joint representation learning for text and 3D point cloud

Joint stroke classification and text line grouping in online handwritten documents with edge pooling attention networks

Joint Token and Feature Alignment Framework for text -Based Person Search

Joint Video and text Parsing for Understanding Events and Answering Queries

Joint Visual Semantic Reasoning: Multi-Stage Decoder for text Recognition

Jointdreamer: Ensuring Geometry Consistency and text Congruence in Text-to-3d Generation via Joint Score Distillation

JPEG2000 Compatible Watermarking of text in Images

Kanji recognition in scene images without detection of text fields: robust against variation of viewpoint, contrast, and background texture

KDProR: A Knowledge-decoupling Probabilistic Framework for Video- text Retrieval

Kernel Adaptive Convolution for Scene text Detection via Distance Map Prediction

kernel trick for sequences applied to text -independent speaker verification systems, A

Kernel-Based Mixture Mapping for Image and text Association

Keystroke Biometric Recognition Studies on Long- text Input under Ideal and Application-Oriented Conditions

Keyword spotting in handwritten documents based on a generic text line HMM and a SVM verification

Keyword Spotting in Online Handwritten Documents Containing text and Non-text Using BLSTM Neural Networks

KHATT: An open Arabic offline handwritten text database

KHATT: Arabic Offline Handwritten text Database

Khmerst: A Low-resource Khmer Scene text Detection and Recognition Benchmark

Knowing Where to Focus: Attention-Guided Alignment for text -based Person Search

Knowledge Mining with Scene text for Fine-Grained Recognition

Knowledge-Driven Generative Adversarial Network for text -to-Image Synthesis

KOHTD: Kazakh offline handwritten text dataset

KT-GAN: Knowledge-Transfer Generative Adversarial Network for text -to-Image Synthesis

K text : Arbitrary shape text detection using modified K-Means

L-Verse: Bidirectional Generation Between Image and text

Label embedding for text recognition

Label Embedding: A Frugal Baseline for text Recognition

Label Incorporated Graph Neural Networks for text Classification

Label or Message: A Large-Scale Experimental Survey of text s and Objects Co-Occurrence

LAM Dataset: A Novel Benchmark for Line-Level Handwritten text Recognition, The

Language Adaptive Methodology for Handwritten text Line Segmentation

Language Identification for Printed text Independent of Segmentation

Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene text Detection and Spotting

language model using variable length tokens for open-vocabulary Hangul text recognition, A

Language-Aware Soft Prompting: text -to-Text Optimization for Few- and Zero-Shot Adaptation of V&L Models

Language-Independent text Lines Extraction Using Seam Carving

Language-Independent text -Line Extraction Algorithm for Handwritten Documents

Laplacian Approach to Multi-Oriented text Detection in Video, A

Laplacian Method for Video text Detection, A

Large scalability in document image matching using text retrieval

Large Scale Scene text Verification with Guided Attention

Large-Lexicon Attribute-Consistent text Recognition in Natural Images

Large-Scale text -to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

LarTap: A Luminance-Aware Framework With text -Correlation Priors for Multi-Exposure Image Fusion

LASP: text -to-Text Optimization for Language-Aware Soft Prompting of Vision and Language Models

Latent Guard: A Safety Framework for text -to-image Generation

Latenteditor: text Driven Local Editing of 3d Scenes

LaTeRF: Label and text Driven Object Radiance Fields

LaTr: Layout-Aware Transformer for Scene- text VQA

Latte3d: Large-scale Amortized text -to-enhanced3d Synthesis

LATTECLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic text s

Layerdiff: Exploring text -guided Multi-layered Composable Image Synthesis via Layer-collaborative Diffusion Model

layered method for determining manga text bubble reading order, A

Layout and language: exploring text block discovery in tables using linguistic resources

Layout-Agnostic Scene text Image Synthesis with Diffusion Models

Layout-Bridging text -to-Image Synthesis

LayoutFormer: Hierarchical text Detection Towards Scene Text Understanding

LCM-Lookahead for Encoder-based text -to-image Personalization

LD-ZNet: A Latent Diffusion Approach for text -Based Image Segmentation

Learn to Augment: Joint Data Augmentation and Network Optimization for text Recognition

Learned Image Compression with text Quality Enhancement

Learning a Limited text Space for Cross-Media Retrieval

Learning Aligned Image- text Representations Using Graph Attentive Relational Network

Learning analytics system for assessing students' performance quality and text mining in online communication

Learning and Integrating Multi-Level Matching Features for Image- text Retrieval

Learning Audio-guided Video Representation with Gated Attention for Video- text Retrieval

Learning bottom-up text attention maps for text detection using stroke width transform

Learning by Imagination: A Joint Framework for text -Based Image Manipulation and Change Captioning

Learning CLIP Guided Visual- text Fusion Transformer for Video-based Pedestrian Attribute Recognition

Learning Coarse-to-Fine Graph Neural Networks for Video- text Retrieval

Learning confidence transformation for handwritten Chinese text recognition

Learning Continuous 3D Words for text -to-Image Generation

Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using text and Sketch

Learning Deep Structure-Preserving Image- text Embeddings

Learning Disentangled Identifiers for Action-Customized text -to-Image Generation

Learning Dual Semantic Relations With Graph Attention for Image- text Matching

Learning From Short text Streams With Topic Drifts

Learning From text : A Multimodal Face Inpainting Network for Irregular Holes

Learning from Video and text via Large-Scale Discriminative Clustering

Learning Generative Structure Prior for Blind text Image Super-resolution

Learning Linguistic Association Towards Efficient text -Video Retrieval

Learning Markov Clustering Networks for Scene text Detection

Learning Multi-Dimensional Human Preference for text -to-Image Generation

Learning multi-view embedding in joint space for bidirectional image- text retrieval

Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image- text Matching

Learning Semantic Polymorphic Mapping for text -Based Person Retrieval

Learning Semantic Relationship among Instances for Image- text Matching

Learning Semantic text Features for Web Text-Aided Image Classification

Learning Shape-Aware Embedding for Scene text Detection

Learning Shape-Color Diffusion Priors for text -Guided 3D Object Generation

Learning Spatial-Semantic Con text with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

Learning Spatially-Variable Filters for Super-Resolution of text

Learning text -Line Segmentation Using Codebooks and Graph Partitioning

Learning text -to-Video Retrieval from Image Captioning

Learning the Lexicon from raw text s for open-vocabulary Korean word recognition

Learning to Detect Scene text Using a Higher-Order MRF with Belief Propagation

Learning to detect, localize and recognize many text objects in document images from few examples

Learning to Embed Semantic Similarity for Joint Image- text Retrieval

Learning to Generate Semantic Layouts for Higher text -Image Correspondence in Text-to-Image Synthesis

Learning to Generate text -Grounded Mask for Open-World Semantic Segmentation from Only Image-Text Pairs

Learning to Group text Lines and Regions in Freeform Handwritten Notes

Learning to Localize Actions in Instructional Videos with Llm-based Multi-pathway text -video Alignment

Learning to Read L'Infinito: Handwritten text Recognition with Synthetic Training Data

Learning to Sample Effective and Diverse Prompts for text -to-Image Generation

Learning to Sort Handwritten text Lines in Reading Order through Estimated Binary Order Relations

Learning to summarize web image and text mutually

Learning to Super-Resolve Blurry Face and text Images

Learning transferable features in meta-learning for few-shot text classification

Learning Two-Branch Neural Networks for Image- text Matching Tasks

Learning Visual Compound Models from Parallel Image- text Datasets

Learning Visual Generative Priors without text

Lecture Video Enhancement and Editing by Integrating Posture, Gesture, and text

LEDITS++: Limitless Image Editing Using text -to-Image Models

LeftRefill: Filling Right Canvas based on Left Reference through Generalized text -to-Image Diffusion Model

Legit: text Legibility for User-Generated Media

Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in text -to-image Diffusion Models

Length Normalization in Degraded text Collections

Length-sensitive Language-bound Recognition Network for Multilingual text Recognition, A

Leveraging Multimodal Large Language Models for Joint Discrete and Continuous Evaluation in text -to-Image Alignment

Leveraging Smart Devices for Scene text Preserved Image Stylization: A Deep Gaming Approach

Leveraging Style and Content features for text Conditioned Image Retrieval

Leveraging surrounding con text for scene text detection

Leveraging text Localization for Scene Text Removal via Text-aware Masked Image Modeling

Leveraging the Mixed- text Segmentation Problem to Design Secure Handwritten CAPTCHAs

Lexicon based feature extraction for emotion text classification

Lexicon Generation for Emotion Detection from text

Lexicon-based offline recognition of Amharic words in unconstrained handwritten text

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image- text Sparse Retrieval

Lifelong Learning for text Steganalysis Based on Chronological Task Sequence

Light Weight text Extraction Technique for Hand-Held Device, A

light-weight text image processing method for handheld embedded cameras, A

Lightweight Attentional Feature Fusion: A New Baseline for text -to-Video Retrieval

Lightweight dynamic conditional GAN with pyramid attention for text -to-image synthesis

Lightweight Multi-Grained Image- text Retrieval Paradigm via Cascaded Representation Learning and Parameter-Free Feature Aggregation, A

Lightweight text -Driven Image Editing With Disentangled Content and Attributes

LIMITR: Leveraging Local Information for Medical Image- text Representation

Line Segmentation for Grayscale text Images of Khmer Palm Leaf Manuscripts

Line-Direction-Free and Character-Orientation-Free On-Line Handwritten Japanese text Recognition System, A

Linecounter: Learning Handwritten text Line Segmentation By Counting

LinGen: Towards High-Resolution Minute-Length text -to-Video Generation with Linear Computational Complexity

Linguistic Hallucination for text -Based Video Retrieval

Linguistic integration information in the aabatas arabic text analysis system

Linguistic Steganalysis via text Dual Attention Fusing Statistical and Multi-Layer Semantic Features

Linguistics-aware Masked Image Modeling for Self-supervised Scene text Recognition

Link the Head to the Beak: Zero Shot Learning from Noisy text Description at Part Precision

Linking Image and text with 2-Way Nets

Linking text and visual concepts semantically for cross modal multimedia search

LISTER: Neighbor Decoding for Length-Insensitive Scene text Recognition

LiT: Zero-Shot Transfer with Locked-Image text Tuning

Livephoto: Real Image Animation with text -guided Motion Control

Local Action-guided Motion Diffusion Model for text -to-motion Generation

Local Binary Pattern-Based Features for text Identification of Web Images

Local Gradient Difference Features for Classification of 2D-3D Natural Scene text Images

Local Skew Angle Estimation from Background Space in text Regions

Local variance image-based for scene text binarization under illumination effects

Local-enhanced representation for text -based person search

Local-Global Video- text Interactions for Temporal Grounding

Localization and Manipulation of Immoral Visual Cues for Safe text -to-Image Generation

Localization, extraction and recognition of text in Telugu document images

localization/verification scheme for finding text in images and video frames based on contrast independent features and machine learning methods, A

Localize, Group, and Select: Boosting text -VQA by Scene Text Modeling

Localized Concept Erasure for text -to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Localizing and segmenting text in images and videos

Localizing blurry and low-resolution text in natural images

Localizing Object-level Shape Variations with text -to-Image Diffusion Models

Localizing scene text s by fuzzy inference systems and low rank matrix recovery model

Localizing text in Scene Images by Boundary Clustering, Stroke Segmentation, and String Fragment Classification

LOCAT: Localization-Driven text Watermarking via Large Language Models

Locating text in Color Documents

Locating text in Complex Color Images

Locating text in Images Based on the Smooth Gray-Level Detection

Locating text in images using matched wavelets

Locating Uniform-colored text in Video Frames

LocVTP: Video- text Pre-training for Temporal Localization

LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese text Line Recognition

LoGoPrompt: Synthetic text Images Can Be Good Visual Prompts for Vision-Language Models

Long-CLIP: Unlocking the Long- text Capability of CLIP

Long-FAS: Cross-domain face anti-spoofing with long text guidance

Longest Common Subsequence Algorithm Suitable for Similar text Strings, A

Look More Than Once: An Accurate Detector for text of Arbitrary Shapes

Looking at Words and Points with Attention: A Benchmark for text -to-Shape Coherence

Looking from a Higher-level Perspective: Attention and Recognition Enhanced Multi-scale Scene text Segmentation

LoSh: Long-Short text Joint Prediction Network for Referring Video Object Segmentation

lossy/lossless compression method for printed typeset bi-level text images based on improved pattern matching, A

Lost in Translation: Latent Concept Misalignment in text -to-image Diffusion Models

Lost Your Style? Navigating with Semantic-Level Approach for text -to-Outfit Retrieval

LoTeR: Localized text prompt refinement for zero-shot referring image segmentation

Low Complexity Sign Detection and text Localization Method for Mobile Applications, A

LucidDreamer: Towards High-Fidelity text -to-3D Generation via Interval Score Matching

LuoJiaHOG: A hierarchy oriented geo-aware image caption dataset for remote sensing image- text retrieval

L_0-Regularized Intensity and Gradient Prior for Deblurring text Images and Beyond

M-Adaptor: text -Driven Whole-Body Human Motion Generation

M2d2m: Multi-Motion Generation from text with Discrete Diffusion Models

M3TTS: Multi-modal text -to-speech of multi-scale style control for dubbing

MA-CRNN: a multi-scale attention CRNN for Chinese text line recognition in natural scenes

MAAN: Memory-Augmented Auto-Regressive Network for text -Driven 3D Indoor Scene Generation

MAC: Masked Contrastive Pre-Training for Efficient Video- text Retrieval

Machine Learning Approach to Hypothesis Decoding in Scene text Recognition, A

Machine printed text and handwriting identification in noisy document images

Machine reading of camera-held low quality text images: An ICA-based image enhancement approach for improving OCR accuracy

Machine reading of handwritten text information in field technician's maps

Machine recognition and correction of printed Arabic text

Machine Recognition of Multi Font Printed Arabic text s

Machine Recognition of Optically Captured Machine Printed Arabic text

Machine Recognition of Printed Kannada text

Machine-printed and hand-written text lines identification

MADA: Multi-Window Attention and Dual-Alignment for Image- text Retrieval

MAGAE: Multi-Level Alignment Over Aggregation Semantic Graph With Attribute Enhancement for text -Based Vehicle Retrieval

Magic3D: High-Resolution text -to-3D Content Creation

MAGIC: Multi-granularity domain adaptation for text recognition

MagicFusion: Boosting text -to-Image Generation Performance by Fusing Diffusion Models

Major Components of a Complete text Reading System

Make It Count: text -to-Image Generation with an Accurate Number of Objects

Make It Move: Controllable Image-to-Video Generation with text Descriptions

Make-A-Scene: Scene-Based text -to-Image Generation with Human Priors

Make-An-Animation: Large-Scale text -conditional 3D Human Motion Generation

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from text

Making the Most of text Semantics to Improve Biomedical Vision-Language Processing

Making the V in text -VQA Matter

Mandarin text -to-Speech Front-End With Lightweight Distilled Convolution Network

Manga text Detection with Manga-specific Data Augmentation and Its Applications on Emotion Analysis

ManiCLIP: Multi-attribute Face Manipulation from text

ManiGAN: text -Guided Image Manipulation

ManiTrans: Entity-Level text -Guided Image Manipulation via Token-wise Semantic Alignment and Generation

MANTA: A Large-Scale Multi-View and Visual- text Anomaly Detection Dataset for Tiny Objects

Many Hands Make Light Work: Transferring Knowledge from Auxiliary Tasks for Video- text Retrieval

Marking text Documents

Marking text features of document images to deter illicit dissemination

Markov Model Order Optimization for text Recognition

Markov Random Field Based text Identification from Annotated Machine Printed Documents

MarkovGen: Structured Prediction for Efficient text -to-Image Generation

Markovian Engine for text Recognition: Cursive Arabic Text, Statistical Features and Interconnected HMMs, A

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity text -to-3D Content Creation

Mask R-CNN With Pyramid Attention Network for Scene text Detection

Mask text spotter v3: Segmentation Proposal Network for Robust Scene Text Spotting

Mask text Spotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

MaskDiffusion: Boosting text -to-Image Consistency with Conditional Mask

Masked and Permuted Implicit Con text Learning for Scene Text Recognition

Masked text Pre-Training for Scene Text Detection

MASTER: Multi-aspect non-local network for scene text recognition

Masterweaver: Taming Editability and Face Identity for Personalized text -to-image Generation

Mathematical properties of the native integral ratio handwriting and text extraction technique

Matryoshka Learning With Metric Transfer for Image- text Matching

Maxfusion: Plug&play Multi-modal Generation in text -to-Image Diffusion Models

Maximum Likelihood Discriminant Feature for text -Independent Speaker Verification

Maximum Margin Approach to Learning text Classifiers Methods, Theory and Algorithms, The

Maximum Spanning Trees For text Segmentation

maximum-likelihood approach to segmentation-based recognition of unconstrained handwriting text , A

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex text -to-Image Generation

mDRA: A Multimodal Depression Risk Assessment Model Using Audio and text

MEAN: Multi-Element Attention Network for Scene text Recognition

Medblip: Bootstrapping Language-image Pretraining from 3d Medical Images and text s

Medical-Image Retrieval Based on Knowledge-Assisted text and Image Indexing

MedSyn: text -Guided Anatomy-Aware Synthesis of High-Fidelity 3-D CT Images

Memorize, Associate and Match: Embedding Enhancement via Fine-Grained Alignment for Image- text Retrieval

Memory-Efficient Models for Scene text Recognition via Neural Architecture Search

MER-CAPF: Audio- text emotion recognition through cross-attention mechanism and multi-granularity pooling strategy

MESA: text -Driven Terrain Generation Using Latent Diffusion and Global Copernicus Data

Met-MLTS: Leveraging Smartphones for End-to-End Spotting of Multilingual Oriented Scene text s and Traffic Signs in Adverse Meteorological Conditions

MetaCloak: Preventing Unauthorized Subject-Driven text -to-Image Diffusion-Based Synthesis via Meta-Learning

MetaHTR: Towards Writer-Adaptive Handwritten text Recognition

MetaWriter: Personalized Handwritten text Recognition Using Meta-Learned Prompt Tuning

Method and apparatus for detecting running text in an image

Method and apparatus for the resolution enhancement of gray-scale images that include text and line art

Method and system for recognizing a boundary between characters in handwritten text

Method for automatic recognition of white blocks as well as text , graphics and/or gray image areas on a printed master

method for detecting text of arbitrary shapes in natural scenes that improves text spotting, A

method for discovering knowledge in texts, A

Method for Extracting text from Stone Inscriptions Using Character Spotting, A

Method for identification and compression of facsimile symbols in text processing systems

Method for identifying word bounding boxes in text

Method for Semantic Relatedness Based Query Focused text Summarization, A

Method for text Localization and Recognition in Real-World Images, A

Method for Transformer Oil Leakage Detection text Generation Using Combined Large and Small Models

Method for unconstrained text detection in natural scene image

method for variable quantization in JPEG for improved text quality in compound documents, A

method of N-grams in large-scale clustering of DNA texts, The

Method of separating text and graphs in digital image data

Methods for text segmentation from scene images

Metric Learning for text Documents

Mevg: Multi-event Video Generation with text -to-video Models

MF-GAN: Multi-conditional Fusion Generative Adversarial Network for text -to-Image Synthesis

MFECLIP: CLIP With Mapping-Fusion Embedding for text -Guided Image Editing

MicroCinema: A Divide-and-Conquer Approach for text -to-Video Generation

MIGC: Multi-Instance Generation Controller for text -to-Image Synthesis

MILES: Visual BERT Pre-training with Injected Language Semantics for Video- text Retrieval

Mimir: Improving Video Diffusion Models for Precise text Understanding

Minimal Interaction Touchless text Input with Head Movements and Stereo Vision

Minimum Error Rate Training for PHMM-Based text Recognition

Minimum Risk Training for Handwritten Chinese/Japanese text Recognition Using Semi-Markov Conditional Random Fields

Minimum-risk training for semi-Markov conditional random fields with application to handwritten Chinese/Japanese text recognition

Mining conversational text for procedures with applications in contact centers

Mining False Positive Examples for text -Based Person Re-Identification

Mining the displacement of max-pooling for text recognition

Minority-Focused text -to-Image Generation via Prompt Optimization

MirrorGAN: Learning text -To-Image Generation by Redescription

(Mis?-) Using DRT for Generation of Natural Language text from Image Sequences

MISL: Multi-grained image- text semantic learning for text-guided image inpainting

Mismatch Quest: Visual and textual Feedback for Image-Text Misalignment

Mita: An Information Extraction Approach to the Analysis of Free-Form text in Life-Insurance Applications

Mixdq: Memory-efficient Few-step text -to-image Diffusion Models with Metric-decoupled Mixed Precision Quantization

Mixed-Supervised Scene text Detection With Expectation-Maximization Algorithm

Mobile visual search on printed documents using text and low bit-rate features

MobileCLIP: Fast Image- text Models through Multi-Modal Reinforced Training

Mobilediffusion: Instant text -to-image Generation on Mobile Devices

Modality Disentangled Discriminator for text -to-Image Synthesis

Model and Data Integrated Transfer Learning for Unstructured Map text Detection

Model Based text Line Segmentation Method for Off-line Handwritten Documents, A

model for detecting and merging vertically spanned table cells in plain text documents, A

Model of On-line Handwritten Japanese text Recognition Free from Line Direction and Writing Format Constraints, A

model-based approach to offline text -independent Arabic writer identification and verification, A

Model-Based System Specification With Tesperanto: Readable text From Formal Graphics

Modeling Motion with Multi-Modal Features for text -Based Video Segmentation

Modeling of image, video and text fusion quality data packet system for aerospace complex products based on business intelligence

Modeling Stroke Mask for End-to-End text Erasing

Modeling Thousands of Human Annotators for Generalizable text -to-Image Person Re-identification

Modern vs Diplomatic Transcripts for Historical Handwritten text Recognition

Moment-Based Image Normalization for Handwritten text Recognition

Monkey: Image Resolution and text Label are Important Things for Large Multi-Modal Models

Mono-font Cursive Arabic text Recognition Using Speech Recognition System

Morality Classification in Natural Language text

MORAN: A Multi-Object Rectified Attention Network for scene text recognition

More Grounded Image Captioning by Distilling Image- text Matching Model

More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image- text Matching

More than Words: In-the-Wild Visually-Driven Prosody for text -to-Speech

Morpheus: text -Driven 3D Gaussian Splat Shape and Color Stylization

MorphNeRF: text -Guided 3D-Aware Editing via Morphing Generative Neural Radiance Fields

Morphological Approach for text -Line Segmentation in Handwritten Documents, A

Morphological Approach to text String Extraction from Regular Periodic Overlapping Text-Background Images, A

Morphological text Extraction from Images

Morphology-based hierarchical representation with application to text segmentation in natural images

Morphology-based text line extraction

MorphText: Deep Morphology Regularized Accurate Arbitrary-Shape Scene Text Detection

Mosaicing-by-recognition for video-based text recognition

Mosaicing-by-recognition: a technique for video-based text recognition

MOST: A Multi-Oriented Scene text Detector with Localization Refinement

MotiF: Making text Count in Image Animation with Motion Focal Loss

MotionDiffuse: text -Driven Human Motion Generation With Diffusion Model

Motiondirector: Motion Customization of text -to-video Diffusion Models

Moto: Enhancing Embedding with Multiple Joint Factors for Chinese text Classification

Movie fill in the blank by joint learning from video and text with adaptive temporal attention

Movie/Script: Alignment and Parsing of Video and text Transcription

MPEG-7 Video text Description Scheme for Superimposed Text in Images and Video

MRF based text binarization in complex images using stroke feature

MRF Model for Binarization of Natural Scene text , An

MRN: Multiplexed Routing Network for Incremental Multilingual text Recognition

MRP-GAN: Multi-resolution parallel generative adversarial networks for text -to-image synthesis

MSCap: Multi-Style Image Captioning With Unpaired Stylized text

MSDLF-K: A Multimodal Feature Learning Approach for Sentiment Analysis in Korean Incorporating text and Speech

MSER-Based Real-Time text Detection and Tracking

MSR-Video to text dataset with clean annotations, The

MSSA: A Multi-Scale Semantic-Aware Method for Remote Sensing Image- text Retrieval

MTA-CLIP: Language-guided Semantic Segmentation with Mask- text Alignment

MTADiffusion: Mask text Alignment Diffusion Model for Object Inpainting

MTGT: Multiscale text Feature-Guided Transformer in medical image segmentation

MTRNet++: One-stage mask-based scene text eraser

MUGEN: A Playground for Video-Audio- text Multimodal Understanding and GENeration

MULAN: A Multi Layer Annotated Dataset for Controllable text -to-Image Generation

MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned text Embedding and Alternating Training

Multi language text detection using fast stroke width transform

Multi scale mirror connection based encoder decoder network for text localization

Multi-branch Network with Ensemble Learning for text Removal in the Wild

Multi-Concept Customization of text -to-Image Diffusion

Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Multi-Dimensional Quality Assessment for text -to-3D Assets: Dataset and Model

Multi-event Video- text Retrieval

Multi-fractal Modeling for On-line text -Independent Writer Identification

Multi-Grained Vision-and-Language Model for Medical Image and text Alignment

Multi-Granularity Aggregation Transformer for Joint Video-Audio- text Representation Learning

Multi-Granularity Matching Transformer for text -Based Person Search

Multi-granularity Prediction for Scene text Recognition

Multi-Granularity Prediction with Learnable Fusion for Scene text Recognition

Multi-Group Proportional Representation for text -to-Image Models

Multi-head Self-relation Network for Scene text Recognition, A

Multi-Label Generalized Zero Shot Chest X-Ray Classification by Combining Image- text Information With Feature Disentanglement

Multi-label text Classification Approach for Sentence Level News Emotion Analysis

Multi-layer feature fusion based image style transfer with arbitrary text condition

Multi-Layer Probabilistic Association Reasoning Network for Image- text Retrieval

Multi-lingual scene text detection and language identification

Multi-lingual text recognition from video frames

Multi-Modal Architecture With Spatio-Temporal- text Adaptation for Video-Based Traffic Accident Anticipation, A

Multi-modal Contextual Graph Neural Network for Text Visual Question Answering

Multi-Modal Fusion Network for Rumor Detection with texts and Images

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene text

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

Multi-Modal Reasoning Graph for Scene- text Based Fine-Grained Image Classification and Retrieval

Multi-Modal Reference Learning for Fine-Grained text -to-Image Retrieval

Multi-Modal Representation Learning with text -Driven Soft Masks

Multi-modal text Recognition Networks: Interactive Enhancements Between Visual and Semantic Features

Multi-Modal Topic Model for Image Annotation Using text Analysis, A

Multi-Orientation Scene text Detection with Adaptive Clustering

Multi-orientation scene text detection with multi-information fusion

Multi-Oriented and Multi-Lingual Scene text Detection With Direct Regression

Multi-oriented Bangla and Devnagari text recognition

Multi-oriented English text Line Identification

Multi-oriented Scene text Detection via Corner Localization and Region Segmentation

Multi-oriented text detection from natural scene images based on a CNN and pruning non-adjacent graph edges

Multi-oriented text Detection with Fully Convolutional Networks

Multi-Oriented text Extraction in Stylistic Documents

Multi-oriented touching text character segmentation in graphical documents using dynamic programming

Multi-phase recognition of multifont photoscript Arabic text

multi-plane approach for text segmentation of complex document images, A

Multi-polarity text segmentation using graph theory

Multi-resolution form of SVD for text -independent speaker recognition

Multi-Resolution Pathology-Language Pre-training Model with text -Guided Visual Representation

Multi-Scale Feature Fusion Based on Piecewise Polynomial Activation Function for Image- text Matching

Multi-scale sequential network for semantic text segmentation and localization

Multi-scale text Line Segmentation Method in Freestyle Handwritten Documents, A

Multi-scale video text detection based on corner and stroke width verification

multi-scenario text generation method based on meta reinforcement learning, A

Multi-schema prompting powered token-feature woven attention network for short text classification

Multi-script and Multi-oriented text Localization from Scene Images

Multi-script iterative steerable directional filtering for handwritten text line extraction

Multi-script text Extraction from Natural Scenes

Multi-script text versus non-text classification of regions in scene images

Multi-Script-Oriented text Detection and Recognition in Video/Scene/Born Digital Images

Multi-sensor text classification experiments: A comparison

Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained text -to-Image Synthesis

Multi-Sentence Complementarily Generation for text -to-Image Synthesis

Multi-Speaker text -to-Speech Training With Speaker Anonymized Data

Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene text Detection in Video Images

Multi-stage HMM based Arabic text recognition with rescoring

Multi-strategy tracking based text detection in scene videos

Multi-Style Shape Matching GAN for text Images

Multi- text Guidance Is Important: Multi-Modality Image Fusion via Large Generative Vision-Language Model

Multi-Track Timeline Control for text -Driven 3D Human Motion Generation

Multi-View User Preference Modeling for Personalized text -to-Image Generation

Multi-View Visual Semantic Embedding for Cross-Modal Image- text Retrieval

Multi3DRefer: Grounding text Description to Multiple 3D Objects

Multifractal Characterization of texts for Pattern Recognition: On the Complexity of Morphological Structures in Modern and Ancient Languages

Multigap: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs

Multilabel text Classification With Incomplete Labels: A Safe Generative Model With Label Manifold Regularization and Confidence Constraint

Multilateral Semantic Relations Modeling for Image text Retrieval

Multilevel Semantic Interaction Alignment for Video- text Cross-Modal Retrieval

Multilevel text -Line Segmentation Framework for Handwritten Historical Documents, A

Multilingual Artificial text Detection Using a Cascade of Transforms

Multilingual text -to-Image Person Retrieval via Bidirectional Relation Reasoning and Aligning

Multimodal alignment of event and text streams in spiking neural networks for human action recognition

Multimodal grid features and cell pointers for scene text visual question answering

Multimodal interactive transcription of text images

Multimodal Meme Classification Identifying Offensive Content in Image and text

Multimodal Neurons in Pretrained text -Only Transformers

Multimodal Processing and Interaction: Audio, Video, text

Multimodal Sentiment Analysis With Image- text Interaction Network

Multimodal Topic Modeling by Exploring Characteristics of Short text Social Media

Multimodal-LLM Agent For text -Driven Multi-Attribute Face Editing

Multioriented and Curved text Lines Extraction From Indian Documents

Multioriented Video Scene text Detection Through Bayesian Classification and Boundary Growing

multiple agent architecture for handwritten text recognition, A

Multiple attention encoded cascade R-CNN for scene text detection

Multiple Classifier Approach for the Recognition of Screen-Rendered text , A

Multiple Document Datasets Pre-training Improves text Line Detection With Deep Neural Networks

Multiple Geometry Transform Estimation from Single Camera-Captured text Image

Multiple Handwritten text Line Recognition Systems Derived from Specific Integration of a Language Model

Multiple Learned Dictionaries Based Clustered Sparse Coding for the Super-Resolution of Single text Image

Multiple Positives Enhanced NCE Loss for Image- text Retrieval, A

Multitwine: Multi-Object Compositing with text and Layout Control

Multivariate Feedback-Based Image- text Joint Learning for Sketch-Less Facial Image Retrieval

Multiview text Imagination Network Based on Latent Alignment for Image-Text Matching, A

MuLTReNets: Multilingual text recognition networks for simultaneous script identification and handwriting recognition

MUST-VQA: Multilingual Scene- text VQA

Mutually Guided Dual-Task Network for Scene text Detection

Mutually textual and Visual Refinement Network for Image-Text Matching, A

MV-Adapter: Multimodal Video Transfer Learning for Video text Retrieval

MVCM: Enhancing Multi-View and Cross-Modality Alignment for Medical Visual Question Answering and Medical Image- text Retrieval

MVPortrait: text -Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

N-Gram-Based text Categorization

Name your style: text -guided artistic style transfer

Narrating the Video: Boosting text -Video Retrieval via Comprehensive Utilization of Frame-Level Captions

Natural Language Watermarking Using Semantic Substitution for Chinese text

Natural scene text detection based on SWT, MSER and candidate classification

Natural Scene text Detection with Multi-channel Connected Component Segmentation

Natural scene text detection with multi-layer segmentation and higher order conditional random field based analysis

NaturalSpeech: End-to-End text -to-Speech Synthesis With Human-Level Quality

Navigating text -to-image Generative Bias Across Indic Languages

NCAP: Scene text Image Super-Resolution with Non-CAtegorical Prior

Negative-Aware Attention Framework for Image- text Matching

Negative-Prompt Inversion: Fast Image Inversion for Editing with text -Guided Diffusion Models

NEOCR: A Configurable Dataset for Natural Image text Recognition

neural model for text localization, transcription and named entity recognition in full pages, A

Neural network-based prediction of the stopping moment for text recognition in a video stream

Neural Network-based text Location for News Video Indexing

Neural network-based text location in color images

Neural Sign Actors: A diffusion model for 3D sign language production from text

Neuro or Symbolic? Fine-Tuned Transformer With Unsupervised LDA Topic Clustering for text Sentiment Analysis

Neuro-Symbolic Evaluation of text -to-Video Models using Formal Verification

Neuron-Based Spiking Transmission and Reasoning Network for Robust Image- text Retrieval

New Approach Based on texture and Geometric Features for Text Detection

New Approach for Overlay text Detection and Extraction From Complex Video Scene, A

new approach for text -independent speaker recognition, A

new approach for video text detection, A

New Approach towards text Filtering, A

New Arabic Printed text Image Database and Evaluation Protocols, A

New Binarization Approach Based on text Block Extraction

New Block Partitioned text Feature for Text Verification, A

new deep CNN for 3D text localization in the wild through shadow removal, A

New Deep Wavefront Based Model for text Localization in 3D Video, A

new edge-based text verification approach for video, A

New Fourier-Statistical Features in RGB Space for Video text Detection

New Fuzzy Hierarchical Classification Based on SVM for text Categorization, A

New Gradient Based Character Segmentation Method for Video text Recognition, A

new hybrid method to detect text in natural scene, A

new instrumented approach for translating American Sign Language into sound and text , A

New Language-Independent Deep CNN for Scene text Detection and Style Transfer in Social Media Images, A

New Method for Arabic text Detection in Natural Scene Image Based on the Color Homogeneity, A

New Method for Arabic text Detection in Natural Scene Images, A

new method for detection and prediction of occluded text in natural scene images, A

New Method for Handwritten Scene text Detection in Video, A

new method for multi-oriented graphics-scene-3D text classification in video, A

New Method for text Verification Based on Random Forests, A

New Method for text -Line Segmentation for Warped Documents, A

New Method for Word Segmentation from Arbitrarily-Oriented Video text Lines, A

New Method for Writer Identification and Verification Based on Farsi/Arabic Handwritten texts, A

new multi-modal approach to bib number/ text detection and recognition in Marathon images, A

New Nearest Neighbor Rule for text Categorization, A

new robust algorithm for video text extraction, A

new scheme for unconstrained handwritten text -line segmentation, A

new segmentation technique for omnifont Farsi text , A

new segmentation technique of Arabic text , A

New Smoothing Method for Lexicon-Based Handwritten text Keyword Spotting, A

New Strategy for Reducing Errors in Scene text Detection, A

new structural technique for recognizing printed Arabic text , A

New Symmetry Based on Proximity of Wavelet-Moments for text Frame Classification in Video, A

New Technique for Multi-Oriented Scene text Line Detection and Tracking in Video, A

New text Extraction Method Incorporating Local Information, A

New text -Line Alignment Approach Based on Piece-Wise Painting Algorithm for Handwritten Documents, A

New Type of Feature: Loose N-Gram Feature in text Categorization, A

new unified method for detecting text from marathon runners and sports players in video (PR-D-19-01078R2), A

New Video Images text Localization Approach Based on a Fast Hough Transform, A

New Wavelet and Color Features for text Detection in Video

new wavelet-Laplacian method for arbitrarily-oriented character segmentation in video text lines, A

Newmove: Customizing text -to-video Models with Novel Motions

News2meme: An Automatic Content Generator from News Based on Word Subspaces from text and Image

NIVeL: Neural Implicit Vector Layers for text -to-Vector Generation

Noise Diffusion for Enhancing Semantic Faithfulness in text -to-Image Synthesis

Noise-aware Learning from Web-crawled Image- text Data for Image Captioning

NoiseCollage: A Layout-Aware text -to-Image Diffusion Model Based on Noise Cropping and Merging

Noisy text Categorization

Noisy-Aware Unsupervised Domain Adaptation for Scene text Recognition

Noisy-Correspondence Learning for text -to-Image Person Re-Identification

Non-Local text Image Reconstruction

Non-negative Sparse Semantic Coding for text categorization

non-stationary density model to separate overlapped texts in degraded documents, A

Non-Uniform Slant Correction for Handwritten text Line Recognition

Not Just text : Uncovering Vision Modality Typographic Threats in Image Generation Models

Not Only text : Exploring Compositionality of Visual Representations in Vision-Language Models

Novel Algorithm for text Detection and Localization in Natural Scene Images, A

novel automated depression detection technique using text transcript, A

novel binarization approach for text in images, A

Novel Data Independent Approach for Conversion of Hand Punched Kannada Braille Script to text and Speech, A

Novel Data Representation for text Extraction from Multispectral Historical Document Images

novel domain independent scene text localizer, A

Novel Edge Features for text Frame Classification in Video

Novel Fuzzy Logic-Based text Classification Method for Tracking Rare Events on Twitter, A

Novel Illumination-Balance Technique for Improving the Quality of Degraded text -Photo Images, A

Novel Integrated Framework for Learning both text Detection and Recognition, A

Novel Method for Embedded text Segmentation Based on Stroke and Color, A

novel method for straightening curved text -lines in stylistic documents, A

novel method of text line segmentation for historical document image of the uchen Tibetan, A

Novel Multi-oriented Chinese text Extraction Approach from Videos, A

novel mutual nearest neighbor based symmetry for text frame classification in video, A

novel scene text detection algorithm based on convolutional neural network, A

Novel Sub-character HMM Models for Arabic text Recognition

Novel System for Robust text Location and Recognition of Book Covers, A

Novel text Detection System Based on Character and Link Energies, A

novel text structure feature extractor for Chinese scene text detection and recognition, A

Novel text -Independent Speaker Verification System Using Ant Colony Optimization Algorithm, A

novel triangulation procedure for thinning hand-written text , A

novel two-stage algorithm for baseline estimation and correction in Farsi and Arabic handwritten text line, A

Novel Visual Representation on text Using Diverse Conditional GAN for Visual Recognition, A

Novice and Expert Performance of KeyScretch: A Gesture-Based text Entry Method for Touch-Screens

NTIRE 2025 challenge on text to Image Generation Model Quality Assessment

Null- text Inversion for Editing Real Images using Guided Diffusion Models

OASIS: Object-guided Attention for text -conditional Diffusion Synthesis of Human Interaction Sequences

Object proposals for text extraction in the wild

Object Reading: text Recognition for Object Recognition

Object-aware Query Perturbation for Cross-modal Image- text Retrieval

Object-conditioned Energy-based Attention Map Alignment in text -to-image Diffusion Models

Object-Driven text -To-Image Synthesis via Adversarial Training

Object-level semantic alignment for enhancing fidelity in text -to-image generation with diffusion models

Objective Distortion Measure for Binary text Image Based on Edge Line Segment Similarity

Objective Function Design for MCE-Based Combination of On-line and Off-line Character Recognizers for On-line Handwritten Japanese text Recognition

Occluded text Detection and Recognition in the Wild

Occlusion-Aware text -Image-Point Cloud Pretraining for Open-World 3D Object Recognition

OCR and Voting Shell Fulfilling Specific text Analysis Requirements

OCR of Printed Telugu text with High Recognition Accuracies

OCR Pipeline and Semantic text Analysis for Comics, An

OCR-VQGAN: Taming text -within-Image Generation

OCRSpell: An Interactive Spelling Correction System for OCR Errors in text

ODM: A text -Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

Off-line Chinese Writer Retrieval System Based on text -sensitive Writer Identification, An

Offline arabic handwritten text recognition: A Survey

Offline handwritten Arabic cursive text recognition using Hidden Markov Models and re-ranking

Offline Recognition of Large Vocabulary Cursive Handwritten text

Offline recognition of omnifont Arabic text using the HMM ToolKit (HTK)

Offline Recognition of Unconstrained Handwritten texts Using HMMs and Statistical Language Models

Offline text -independent writer identification using codebook and efficient code extraction methods

Old fashion text -based image retrieval using FCA

Omnifont recognition of text using topological recognition techniques

OMNIPARSER: A Unified Framework for text Spotting, Key Information Extraction and Table Recognition

On appearance-based feature extraction methods for writer-independent handwritten text recognition

On Calibration of Scene- text Recognition Models

On Combining Multiple Segmentations in Scene text Recognition

On Manipulating Scene text in the Wild with Diffusion Models

On optimal stopping strategies for text recognition in a video stream as an application of a monotone sequential decision model

On partitioning a dictionary for visual text recognition

On Recognizing texts of Arbitrary Shapes with 2D Self-Attention

On the Behavior of Contrastive Regularization in Improving Chinese text Recognizer

On the Detection of Images Generated from text

On the discriminability of keystroke feature vectors used in fixed text keystroke authentication

On the Evaluation of Handwritten text Line Detection Algorithms

On the General Value of Evidence, and Bilingual Scene- text Visual Question Answering

On the Generalization of Handwritten text Recognition Models

On the influence of vocabulary size and language models in unconstrained handwritten text recognition

On the Modification of Binarization Algorithms to Retain Grayscale Information for Handwritten text Recognition

On the Processing of Fuzzy Patterns for text Independent Phonetic Speech Segmentation

On the Scalability of Diffusion-based text -to-Image Generation

On the Segmentation of text in Videos

On the use of Bernoulli mixture models for text classification

On the use of duration-corrected N-best hypotheses for text recognition in gray-scale document images

On Vocabulary Reliance in Scene text Recognition

On-Device text Image Super Resolution

On-Line Handwritten Japanese text Recognition Free from Constrains on Line Direction and Character Orientation

On-line Handwritten Japanese text Recognition System Free from Line Direction and Character Orientation Constraints, An

On-Line Handwritten text Line Detection Using Dynamic Programming

On-line Handwritten text Search Method Based on Directional Feature Matching, An

On-line recognition of handwritten Renqun shorthand for fast mobile Chinese text entry

On-line Writing-box-free Recognition of Handwritten Japanese text Considering Character Size Variations

ONE-DM: One-shot Diffusion Mimicker for Handwritten text Generation

One-shot Compositional Data Generation for Low Resource Handwritten text Recognition

One-Shot Doc Snippet Detection: Powering Search in Document Beyond text

One-Step Diffusion for Real-World Image Super-Resolution via Degradation Removal and text Prompts

One-Way Ticket: Time-Independent Unified Encoder for Distilling text -to-Image Diffusion Models

Online Biterm Topic Model based short text stream classification using short text expansion and concept drifting detection

Online text -Independent Writer Identification Based on Stroke's Probability Distribution Function

Online text -independent Writer Identification Based on Temporal Sequence and Shape Codes

Ontology-Based text Mining Method to Develop D-Matrix From Unstructured Text, An

Opaque Document Imaging: Building Images of Inaccessible texts

Open set classification of untranscribed handwritten text image documents

Open-Set text Recognition via Character-Context Decoupling

Open-Vocabulary 3D Semantic Segmentation with text -to-Image Diffusion Models

Open-Vocabulary Panoptic Segmentation with text -to-Image Diffusion Models

Open-vocabulary recognition of machine-printed Arabic text using hidden Markov models

Open-Vocabulary text -Driven Human Image Generation

OpenBias: Open-Set Bias Detection in text -to-Image Generative Models

OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image- text Generation

Opinion mining from noisy text data

OPMP: An Omnidirectional Pyramid Mask Proposal Network for Arbitrary-Shape Scene text Detection

Optical character correction of large-curvature annular sector text in polar coordinate system

Optical flow based dynamic curved video text detection

Optical modelling and language modelling trade-off for Handwritten text Recognition

Optimal Boxes: Boosting End-to-End Scene text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning

Optimal Classification Model for text Detection and Recognition in Video Frames

Optimal text /Background Color Combination of LED Information Boards for Visibility Improvement Based on Psychological Measurements, An

Optimal word order for non-causal text generation with Large Language Models: The Spanish case

Optimizing the class information divergence for transductive classification of texts using propagation in bipartite graphs

Optimizing the integration of a statistical language model in HMM based offline handwritten text recognition

Orientation and Scale Invariant text Region Extraction in WWW Images

Orientation Robust text Line Detection in Natural Images

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page text Recognition by learning to unfold

Oscillating Feature Subset Search Algorithm for text Categorization

OST: Refining text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

OTE: Exploring Accurate Scene text Recognition Using One Token

Out of vocabulary word detection and recovery in Arabic handwritten text recognition

Outline Generation Transformer for Bilingual Scene text Recognition

Overview of text -Based Person Search: Recent Advances and Future Directions, An

P-CLIP: Progressive Discrepancy Learning for One-Shot text -to-Image Person Re-Identification

PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese text Recognition

Paint-it: text -to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering

Pair-Copula Based Scheme for text Extraction from Digital Images, A

PairAug: What Can Augmented Image- text Pairs Do for Radiology?

Pairwise optimized Rocchio algorithm for text categorization

PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped text

Pangu-draw: Advancing Resource-efficient text -to-image Synthesis with Time-decoupled Training and Reusable Coop-diffusion

PanoDreamer: Consistent text to 360-Degree Scene Generation

Paragraph text segmentation into lines with Recurrent Neural Networks

Parameter efficient finetuning of text -to-image models with trainable self-attention layer

Parametric Spectral-Based Method for Verification of text in Videos, A

Parco: Part-coordinating text -to-motion Synthesis

Parrot Captions Teach CLIP to Spot text

Parrot: Pareto-optimal Multi-reward Reinforcement Learning Framework for text -to-image Generation

Part-based method on handwritten text s

Partial Scene text Retrieval

Parts2Words: Learning Joint Embedding of Point Clouds and text s by Bidirectional Matching Between Parts and Words

PathLDM: text conditioned Latent Diffusion Model for Histopathology

Pay attention to what you read: Non-recurrent handwritten text -Line recognition

Pea-diffusion: Parameter-efficient Adapter with Knowledge Distillation in Non-english text -to-image Generation

Pen Acoustic Emissions for text and Gesture Recognition

Perceptive Vision for Headline Localisation in Bangla Handwritten text Recognition

Performance Analysis of text Halftone Modulation

Performance Evaluation of text Detection and Tracking in Video

Person Identification Using text and Image Data

Person Search by text Attribute Query As Zero-Shot Learning

PersonaBooth: Personalized text -to-Motion Generation

Personalised video summarisation using video- text multi-modal fusion

Personalized Residuals for Concept-Driven text -to-Image Generation

Personalized text snippet extraction using statistical language models

Perspective Scene text Recognition with Feature Compression and Ranking

PETR: Rethinking the Capability of Transformer-Based Language Model in Scene text Recognition

PFAN++: Bi-Directional Image- text Retrieval With Position Focused Attention Network

Phenology description is all you need! mapping unknown crop types with remote sensing time-series and LLM generated text alignment

Photographic text -to-Image Synthesis with a Hierarchically-Nested Adversarial Network

PhotoOCR: Reading text in Uncontrolled Conditions

PhyS-EdiT: Physics-aware Semantic Image Editing with text Description

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded text -to-Video Generation

PI3D: Efficient text -to-3D Generation with Pseudo-Image Diffusion

PIA: Your Personalized Image Animator via Plug-and-Play Modules in text -to-Image Models

Picture and text Query and Archiving System, A

Picture is Worth More Than 77 text Tokens: Evaluating CLIP-Style Models on Dense Captions, A

PIDRo: Parallel Isomeric Attention with Dynamic Routing for text -Video Retrieval

Piece-wise linearity based method for text frame classification in video

Pitch Based Segmentation and Recognition of Dot-Matrix text

Pitman Shorthand inspired model for plain text compression

Pixart-sigma: Weak-to-strong Training of Diffusion Transformer for 4k text -to-image Generation

Pixel-Based Evaluation Method for text Detection in Color Images, A

PixTextGAN: structure aware text image synthesis for license plate recognition

Plan, Posture and Go: Towards Open-vocabulary text -to-motion Generation

Platypus: A Generalized Specialist Model for Reading text in Various Forms

Plda-based system for text -prompted password speaker verification

Plot: text -based Person Search with Part Slot Attention for Corresponding Part Discovery

Plug-and-Play Diffusion Features for text -Driven Image-to-Image Translation

Plug-and-Play Interpretable Responsible text -to-Image Generation via Dual-Space Multi-facet Concept Control

Plug-and-Play Regulators for Image- text Matching

Plugnet: Degradation Aware Scene text Recognition Supervised by a Pluggable Super-resolution Unit

PMMN: Pre-Trained Multi-Modal Network for Scene text Recognition

PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved text -to-Image Diffusion

PointCloud- text Matching: Benchmark Dataset and Baseline

PolygloNet: Multilingual Approach for Scene text Recognition Without Language Constraints

Polygon-based technique for the automatic classification of text and graphics components from digitized paper-based forms

Polygon-Free: Unconstrained Scene text Detection with Box Annotations

pooling based scene text proposal technique for scene text reading in the wild, A

Portable and fast text detection

Portmanteauing Features for Scene text Recognition

Position-Guided text Prompt for Vision-Language Pre-Training

Post-training Quantization with Progressive Calibration and Activation Relaxing for text -to-image Diffusion Models

PosterMaker: Towards High-Quality Product Poster Generation with Accurate text Rendering

Powerful and Flexible: Personalized text -to-image Generation via Reinforcement Learning

PQPP: A Joint Benchmark for text -to-Image Prompt and Query Performance Prediction

PR-CLIP: Cross-Modal Positional Reconstruction for Remote Sensing Image- text Retrieval

Pre-Training a Graph Recurrent Network for text Understanding

PreciseCam: Precise Camera Control for text -to-Image Generation

Precisecontrol: Enhancing text -to-image Diffusion Models with Fine-grained Attribute Control

Predicated Diffusion: Predicate Logic-Based Attention Guidance for text -to-Image Diffusion Models

Predict, Prevent, and Evaluate: Disentangled text -Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

Predicting audio-visual salient events based on visual, audio and text modalities for movie summarization

Predicting Emotional Responses to Long Informal text

Predicting Motivations of Actions by Leveraging text

Predicting Visual Features From text for Image and Video Caption Retrieval

PRESENT: Zero-Shot text -to-Prosody Control

Preserve or Modify? Context-Aware Evaluation for Balancing Preservation and Modification in Text-Guided Image Editing

Preserving privacy without compromising accuracy: Machine unlearning for handwritten text recognition

PreSTU: Pre-Training for Scene- text Understanding

Primitive Representation Learning for Scene text Recognition

Printed text Discrimination

Printed text Featuring Using the Visual Criteria of Legibility and Complexity

Printed text segmentation using distance transform

Prior knowledge guided text to image generation

Prior Preserved text -to-Image Personalization Without Image Regularization

Probabilistic Hierarchical Clustering Method for Organising Collections of text Documents, A

Probabilistic Kernels for Improved text -to-Speech Alignment in Long Audio Tracks

probabilistic model derived term weighting scheme for text classification, A

Processing of Binary Images of Handwritten text Documents

Processing of Off-Line Handwritten text : Polygonal-Approximation and Enforcement of Temporal Information

Progressive Contour Regression for Arbitrary-Shape Scene text Detection

Progressive Feature Mining and External Knowledge-Assisted text -Pedestrian Image Retrieval

Progressive Human Motion Generation Based on text and Few Motion Frames

Progressive Rendering Distillation: Adapting Stable Diffusion for Instant text -to-Mesh Generation without 3D Data

Progressive scene text erasing with self-supervision

Progressive Spatio-Temporal Prototype Matching for text -Video Retrieval

Progressive text -Semantic-Aware Generative Adversarial Network for Image Fusion

Progressive text -to-Face Synthesis with Generative Adversarial Network

Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward text -to-3D Scene Generation

Prompt Augmentation for Self-supervised text -guided Image Manipulation

Prompt Switch: Efficient CLIP Adaptation for text -Video Retrieval

Prompt Tuning Inversion for text -Driven Image Editing Using Diffusion Models

Prompt-Free Diffusion: Taking text Out of Text-to-Image Diffusion Models

Prompt2Perturb (P2P): text -Guided Diffusion-Based Adversarial Attacks on Breast Ultrasound Images

PromptAD: Zero-shot Anomaly Detection using text Prompts

Prompting Hard or Hardly Prompting: Prompt Inversion for text -to-Image Diffusion Models

Proposal for a text -Indicated Writer Verification Method, A

Proposal of the hybrid spectral gradient method to extract character- text regions from general scene images

Protip: Probabilistic Robustness Verification on text -to-image Diffusion Models Against Stochastic Perturbation

Prototype-guided text -based person search on rich Chinese descriptions

Psg-adapter: Controllable Planning Scene Graph for Improving text -to-image Diffusion

Pull Pole Points to text Contour by Magnetism: A Real-Time Scene Text Detector

Pure Transformer with Integrated Experts for Scene text Recognition

Push the limit of scene text recognition using character and text length guided text super-resolution

Pushing the Performance Limit of Scene text Recognizer without Human Annotation

PYRAD-DCNN: A Fully Convolutional Neural Network to Replace BLSTM in Offline text Recognition Systems

Pyrboxes: An efficient multi-scale scene text detector with feature pyramids

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for text -to-Vision Content

quad tree based method for blurred and non-blurred video text frames classification through quality metrics, A

Quadrilateral Scene text Detector with Two-Stage Network Architecture, A

Quality Assessment for text -to-Image Generation: A Survey

Quality inspection of printed text s

Quality-related English text classification based on recurrent neural network

QWERTY- and 8pen- Based Touchless text Input with Hand Movement

R-Net: A Relationship Network for Efficient and Accurate Scene text Detection

R.A.C.E.: Robust Adversarial Concept Erasure for Secure text -to-image Diffusion Model

R2CNN: Rotational Region CNN for Arbitrarily-Oriented Scene text Detection

Rail Transit Line-Sign text Detection With Patch-Based Region Proposal Network

Random Subspace Method in text Categorization

Ranni: Taming text -to-Image Diffusion for Accurate Instruction Following

Rapid Evaluation of the Handwriting Performance for Gesture Based text Input

Re-ranking and TOPSIS-based ensemble feature selection with multi-stage aggregation for text categorization

Re-ranking image- text matching by adaptive metric fusion

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene text Recognition

Read text from Signs in General Scenes

Reading Arbitrary-shaped Scene text from Images Through Spline Regression and Rectification

Reading Newspaper text

Reading text in the Wild from Compressed Images

Reading text in the Wild with Convolutional Neural Networks

Reading-Life Log: Technologies to Recognize text s That We Read, The

Reading-Strategy Inspired Visual Representation Learning for text -to-Video Retrieval

ReADS: A Rectified Attentional Double Supervised Network for Scene text Recognition

Real time image enhancement and segmentation for sign/ text detection

Real time image enhancement for both text and color photo images

Real-Time Lexicon-Free Scene text Localization and Recognition

Real-time Lexicon-free Scene text Retrieval

Real-Time Scene text Detection Based on Stroke Model

Real-Time Scene text Detection With Differentiable Binarization and Adaptive Scale Fusion

Real-time scene text localization and recognition

Real-Time Scene text to Speech System, A

Real-Time text Detection With Similar Mask in Traffic, Industrial, and Natural Scenes

Real-Time text Steganalysis Based on Multi-Stage Transfer Learning

Real-time text tracking in natural scenes

real-time text -independent speaker identification system, A

Real-Time Visual Analytics for text Streams

RealCustom: Narrowing Real text Word for Real-Time Open-Domain Text-to-Image Customization

RealDTT: Towards A Comprehensive Real-World Dataset for Tampered text Detection

RealmDreamer: text -Driven 3D Scene Generation with Inpainting and Depth Diffusion

Realtime multi-scale scene text detection with scale-based region proposal network

Reasoning elicitation and multi-granularity contrastive learning for text -rich image understanding in large vision-language models

Receler: Reliable Concept Erasing of text -to-image Diffusion Models via Lightweight Erasers

Recipe for Scaling up text -to-Video Generation with Text-free Videos, A

Recipe2Video: Synthesizing Personalized Videos from Recipe text s

ReCo: Region-Controlled text -to-Image Generation

Recognising text in Real Scenes

Recognition based text localization from natural scene images

Recognition of Apparent Personality Traits from text and Handwritten Images

Recognition of Arabic Machine-Printed Cursive text

Recognition of Bangla text from scene images through perspective correction

Recognition of cursive video text using a deep learning framework

Recognition of Hand-Written Archive text Documents

Recognition of Handwritten Chinese text by Segmentation: A Segment-Annotation-Free Approach

Recognition of Indian multi-oriented and curved text

Recognition of Multi-oriented, Multi-sized, and Curved text

Recognition of Noise Polyfont Printed text Using Combined HMMS, The

Recognition of Pornographic Web Pages by Classifying text s and Images

Recognition of printed arabic text based on global features and decision tree learning techniques

Recognition of Printed Arabic text Using Neural Networks

Recognition of printed Devanagari text using BLSTM Neural Network

Recognition of Printed text under Realistic Conditions

Recognition of Screen-Rendered text

Recognition of Video text through Temporal Integration

Recognition-Based Segmentation of Nom Characters from Body text Regions of Stele Images Using Area Voronoi Diagram

Recognition-Synergistic Scene text Editing

Recognize text in General Scenes

Recognizing Chinese text s with 3D Convolutional Neural Network

Recognizing irregular entities in biomedical text via deep neural networks

Recognizing Multiple text Sequences from an Image by Pure End-to-End Learning

Recognizing perspective scene text with context feature

Recognizing semantic correlation in image- text Weibo via feature space mapping

Recognizing text Elements for SVG Comic Compression and Its Novel Applications

Recognizing text in historical maps using maps from multiple time periods

Recognizing text in raster maps

Recognizing text with a CNN

Recognizing text with Perspective Distortion in Natural Scenes

Recognizing text -Based Traffic Guide Panels with Cascaded Localization Network

Recognizing text -Based Traffic Signs

Recon: Training-free Acceleration for text -to-image Synthesis with Retrieval of Concept Prompt Trajectories

Reconsidering Tourism Destination Images by Exploring Similarities between Travelogue text s and Photographs

Rectification and recognition of text in 3-D scenes

Rectifying Perspective Views of text in 3D Scenes Using Vanishing Points

Recurrent Affine Transformation for text -to-Image Synthesis

Recurrent Global Convolutional Network for Scene text Detection

Recurrent Highway Networks with Attention Mechanism for Scene text Recognition

Redefining the DCT-based feature for scene text detection: Analysis and comparison of spatial frequency-based features

Redif Extraction in Handwritten Ottoman Literary text s

Reduced annotation based on deep active learning for arabic text detection in natural scene images

Reference-Aware Adaptive Network for Image- text Matching

Referring Image Segmentation Using text Supervision

Refine, Control and Distill: A text -to-Image Framework for Faithful Image Generation

Refining text -to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation

Region Reinforcement Network With Topic Constraint for Image- text Matching

Region-Aware Arbitrary-Shaped text Detection With Progressive Fusion

Region-Based Discriminative Feature Pooling for Scene text Recognition

Regularizing Visual Semantic Embedding With Contrastive Learning for Image- text Matching

Reinforcement Shrink-Mask for text Detection

Rejection Strategies for Offline Handwritten text Line Recognition

ReLaText: Exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks

Relation Graph Reasoning for Image- text Matching

Relation Mining and Visualization Framework for Automated text Summarization, A

Relation-Guided Network for Image- text Retrieval

Reliable and Efficient Concept Erasure of text -to-image Diffusion Models

Reliable Phrase Feature Mining for Hierarchical Video- text Retrieval

Remote Sensing Cross-Modal text -Image Retrieval Based on Attention Correction and Filtering

Remote Sensing Image Augmentation Based on text Description for Waterside Change Detection

Remote Sensing Image Generation via Object text Decoupling

Removing Distributional Discrepancies in Captions Improves Image- text Alignment

RenAIssance: A Survey Into AI text -to-Image Generation in the Era of Large Model

Report from the AND 2009 working group on noisy text datasets

Representation and Recognition of text Using Hidden Markov Models, The

Representation learning for very short text s using weighted word embedding aggregation

Representation transfer and data cleaning in multi-views for text simplification

Residual Dual Scale Scene text Spotting by Fusing Bottom-Up and Top-Down Processing

ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video- text Data Streams

ReStGAN: A step towards visually guided shopper experience via text -to-image synthesis

Retaining Knowledge and Enhancing Long- text Representations in CLIP through Dual-Teacher Distillation

Rethinking Diffusion for text -Driven Human Motion Generation: Redundant Representations, Evaluation, and Masked Autoregression

Rethinking Noisy Video- text Retrieval via Relation-aware Alignment

Rethinking text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

Rethinking Training for De-biasing text -to-Image Generation: Unlocking the Potential of Stable Diffusion

Rethinking Video- text Understanding: Retrieval from Counterfactually Augmented Data

Retrieval Methods for English- text with Misrecognized OCR Characters

Retrieval Strategies for Noisy text

Revealing Directions for text -Guided 3D Face Editing

Review of Cross-Modal Image- text Retrieval in Remote Sensing, A

Review of Segmentation and Contextual Analysis Techniques for Text Recognition, A

Revisiting Scene text Recognition: A Data Perspective

RIATIG: Reliable and Imperceptible Adversarial text -to-Image Generation with Natural Prompts

Rich Human Feedback for text -to-Image Generation

RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in text -to-3D

Rickrolling the Artist: Injecting Backdoors into text Encoders for Text-to-Image Synthesis

RiFeGAN2: Rich Feature Generation for text -to-Image Synthesis From Constrained Prior Knowledge

RiFeGAN: Rich Feature Generation for text -to-Image Synthesis From Prior Knowledge

RLita: A Region-Level Image- text Alignment Method for Remote Sensing Foundation Model

RLST: A Reinforcement Learning Approach to Scene text Detection Refinement

RMGNet: The Progressive Relationship-Mining Graph Neural Network for text -to-Image Person Re-Identification

robust algorithm for text detection in color images, A

Robust Algorithm for text String Separation from Mixed Text/Graphics Images, A

Robust and Accurate text Stroke Segmentation

robust and multiscale document image segmentation for block line/ text line structures extraction, A

Robust and Non-Negative Collective Matrix Factorization for text -to-Image Transfer Learning

Robust and parallel Uyghur text localization in complex background images

Robust and Secure Data Hiding for PDF text Document

robust approach for recognition of text embedded in natural scenes, A

robust approach for text detection from natural scene images, A

Robust Approach to Extraction of text s from Camera Captured Images, A

robust approach to text line grouping in online handwritten Japanese documents, A

Robust Binarization for Video text Recognition

Robust Color-Independent text Detection Method from Complex Videos, A

Robust detection of stylized text events in digital video

Robust Disaster Assessment from Aerial Imagery Using text -to-Image Synthetic Data

Robust Extraction of text from Camera Images

Robust Extraction of text in Video

Robust Hashing With Bilinear Drift for Image- text Retrieval

robust hybrid approach for text line segmentation in historical documents, A

Robust Lexicon-Free Confidence Prediction for text Recognition

Robust Local Scoring Function for text -Independent Speaker Verification

Robust Model for On-Line Handwritten Japanese text Recognition, A

Robust outdoor text detection using text intensity and shape features

Robust Scene text Detection for Multi-script Languages Using Deep Learning

Robust Scene text Detection for Partially Annotated Training Data

Robust scene text detection using integrated feature discrimination

Robust Scene text Detection with Convolution Neural Network Induced MSER Trees

Robust Scene text Detection with Deep Feature Pyramid Network and CNN based NMS Model

Robust Scene text Recognition with Automatic Rectification

Robust scene text understanding with OCR token and word alignment for Text-VQA and text-caption

Robust seed-based stroke width transform for text detection in natural images

Robust Segmentation Technique for Line, Word and Character Extraction from Kannada text in Low Resolution Display Board Images, A

Robust skew detection in mixed text /graphics documents

Robust Split-and-Merge text Segmentation Approach for Images, A

Robust stereo correspondence for documents by matching connected components of text -lines with dynamic programming

Robust stereo matching for document images using parameter selection of text -line extraction

robust system for text extraction in video, A

Robust System For Thresholding And Skew Detection In Mixed text /graphics Documents, A

robust technique for text extraction in mixed-type binary documents, A

Robust text detection from binarized document images

Robust text detection in natural images with edge-enhanced Maximally Stable Extremal Regions

Robust text Detection in Natural Scene Images

Robust text Detection in Natural Scene Images by Generalized Color-Enhanced Contrasting Extremal Region and Neural Networks

Robust text Detection with Vertically-Regressed Proposal Network

Robust text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation

Robust text Line Segmentation for Historical Manuscript Images Using Color and Texture

Robust text Segmentation in Low Quality Images via Adaptive Stroke Width Estimation and Stroke Based Superpixel Grouping

Robust text segmentation using graph cut

Robust text watermarking based on average skeleton mass of characters against cross-media attacks

Robust Two Level Classification Algorithm for text Localization in Documents, A

Robust video text segmentation and recognition with multiple hypotheses

Robust Video- text Retrieval Via Noisy Pair Calibration

Robust Wavelet Transform Based Technique for Video text Detection, A

Robustly Recognizing Irregular Scene text by Rectifying Principle Irregularities

Robustscanner: Dynamically Enhancing Positional Clues for Robust text Recognition

Rolling bilateral filter-based text image deblurring

Rotation and script independent text detection from video frames using sub pixel mapping

Rotation-Sensitive Regression for Oriented Scene text Detection

Rough-fuzzy based scene categorization for text detection and recognition in video

RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection

RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to- text Adversarial Attacks

RUArt: A Novel text -Centered Solution for Text-Based Visual Question Answering

Rule Based Contextual Post-Processing for Devanagari Text Recognition

RVMamba: Selective text -Vision Mamba for Referring Video Object Segmentation

SAC: Semantic Attention Composition for text -Conditioned Image Retrieval

SAFE: Scale Aware Feature Encoder for Scene text Recognition

Safeguard text -to-image Diffusion Models with Human Feedback Inversion

SaHAN: Scale-Aware Hierarchical Attention Network for Scene text Recognition

SALAD: Skeleton-aware Latent Diffusion for text -driven Motion Generation and Editing

Salient Guided text Detection in E-Commerce Images

Salient Object-Aware Background Generation using text -Guided Diffusion Models

SAM: Self Attention Mechanism for Scene text Recognition Based on Swin Transformer

Sample-aware Data Augmentor for Scene text Recognition

SAMWISE: Infusing Wisdom in SAM2 for text -Driven Video Segmentation

SARAT-a system for the recognition of Arabic printed text

SAST: Semantic-Aware stylized text -to-Image generation

SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker text -to-Speech Systems

Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and text Data

Scale and Orientation Invariant text Segmentation for Born-Digital Compound Images

Scale robust deep oriented- text detection network

Scale Up Composed Image Retrieval Learning via Modification text Generation

Scale-aware Polar Representation for Arbitrarily-shaped text Detection

Scale-Invariant Multi-Oriented text Detection in Wild Scene Image

Scale-Residual Learning Network for Scene text Detection

Scaledreamer: Scalable text -to-3d Synthesis with Asynchronous Score Distillation

Scaling Down text Encoders of Text-to-Image Diffusion Models

Scaling up GANs for text -to-Image Synthesis

SCATTER: Selective Context Attentional Scene Text Recognizer

ScenarioDiff: text -to-video Generation with Dynamic Transformations of Scene Conditions

Scene Graph Driven text -Prompt Generation for Image Inpainting

Scene Retrieval for Video Summarization Based on text -to-Image GAN

Scene text Character Recognition Using Spatiality Embedded Dictionary

Scene text Deblurring Using Text-Specific Multiscale Dictionaries

Scene text detection and recognition with advances in deep learning: A survey

Scene text Detection and Recognition: The Deep Learning Era

Scene text Detection and Segmentation Based on Cascaded Convolution Neural Networks

Scene text Detection and Tracking for a Camera-Equipped Wearable Reading Assistant for the Blind

Scene text detection based on component-level fusion and region-level verification

Scene text detection based on multi-scale SWT and edge filtering

Scene text Detection Based on Robust Stroke Width Transform and Deep Belief Network

Scene text detection based on skeleton-cut detector

Scene text Detection in Foggy Weather Utilizing Knowledge Distillation of Diffusion Models

Scene text detection method based on the hierarchical model

Scene text detection suitable for parallelizing on multi-core

Scene text detection using adaptive color reduction, adjacent character model and hybrid verification strategy

Scene text detection using graph model built upon maximally stable extremal regions

Scene text detection using sequential nontext filtering

Scene text detection using sparse stroke information and MLP

Scene text Detection Using Superpixel-Based Stroke Feature Transform and Deep Learning Based Region Classification

Scene text Detection via Connected Component Clustering and Nontext Filtering

Scene text Detection via Deep Semantic Feature Fusion and Attention-based Refinement

Scene text Detection via Integrated Discrimination of Component Appearance and Consensus

Scene text detection via stroke width

Scene text Detection with Adaptive Line Clustering

Scene text detection with extremal region based cascaded filtering

Scene text Detection with Recurrent Instance Segmentation

Scene text detection with robust character candidate extraction method

Scene text Detection with Selected Anchors

Scene text detection with superpixels and hierarchical model

Scene text Extraction and Translation for Handheld Devices

Scene text extraction based on edges and support vector regression

Scene text Extraction by Superpixel CRFs Combining Multiple Character Features

Scene text Extraction in Complex Images

Scene text extraction in natural scene images using hierarchical feature combining and verification

Scene text Extraction Using Focus of Mobile Camera

Scene text Extraction with Edge Constraint and Text Collinearity

Scene text extraction with local symmetry transform

Scene text Identification by Leveraging Mid-level Patches and Context Information

Scene text Image Super-resolution based on Text-conditional Diffusion Models

Scene text Image Super-Resolution in the Wild

Scene text Image Super-Resolution Via Semantic Distillation and Text Perceptual Loss

Scene text Localization and Recognition with Oriented Stroke Detection

Scene text Localization Using Gradient Local Correlation

Scene text Recognition and Retrieval for Large Lexicons

Scene text recognition by learning co-occurrence of strokes based on spatiality embedded dictionary

Scene text Recognition in Mobile Applications by Character Descriptor and Structure Configuration

Scene text Recognition Models Explainability Using Local Features

Scene text recognition using a Hough forest implicit shape model and semi-Markov conditional random fields

Scene text Recognition Using Co-occurrence of Histogram of Oriented Gradients

Scene text Recognition using Higher Order Language Priors

Scene text Recognition Using Part-Based Tree-Structured Character Detection

Scene text Recognition Using Progressive Rectification Network And Spelling Error Correction Language Model

Scene text recognition using residual convolutional recurrent neural network

Scene text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation

Scene text recognition using sparse coding based features

Scene text Recognition Using Structure-Guided Character Detection and Linguistic Knowledge

Scene text Recognition with a Hough Forest Implicit Shape Model

Scene text recognition with CNN classifier and WFST-based word labeling

Scene text recognition with deeper convolutional neural networks

Scene text Recognition with Permuted Autoregressive Sequence Models

Scene text Recognition with Self-supervised Contrastive Predictive Coding

Scene text Recognition: No Country for Old Men?

Scene text rectification using glyph and character alignment properties

Scene text Removal, Text Erasing

Scene text Retrieval via Joint Text Detection and Similarity Learning

Scene text Script Identification with Convolutional Recurrent Neural Networks

Scene text Segmentation Based on Local Image Phase Information and MSER Method

Scene text Segmentation by Paired Data Synthesis

Scene text Segmentation via Inverse Rendering

Scene text Segmentation with Multi-level Maximally Stable Extremal Regions

Scene text Telescope: Text-Focused Scene Image Super-Resolution

Scene text Visual Question Answering

Scene text , Assistance for Visually Impaired

Scene- text Oriented Referring Expression Comprehension

Scene- text Synthesis Engine Achieved Through Learning From Decomposed Real-World Data, A

Scene- text -Detection Method Robust Against Orientation and Discontiguous Components of Characters

SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and text

Scientometric Full- text Analysis of Papers Published in Remote Sensing between 2009 and 2021

SciOL and MuLMS-Img: Introducing A Large-Scale Multimodal Scientific Dataset and Models for Image- text Tasks in the Scientific Domain

SCOB: Universal text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap

SCoRD: Subject-Conditional Relation Detection with text -Augmented Data

ScrabbleGAN: Semi-Supervised Varying Length Handwritten text Generation

Screen-rendered text images recognition using a deep residual network based segmentation-free method

Scribble-Guided Diffusion for Training-Free text -to-Image Generation

Script and nature differentiation for Arabic and Latin text images

Script-Free text Line Segmentation Using Interline Space Model for Printed Document Images

Script-Independent text Line Segmentation in Freestyle Handwritten Documents

Script-independent, HMM-based text Line Finding for OCR

Sculpt3D: Multi-View Consistent text -to-3D Generation with Sparse 3D Prior

SCUT-COUCH text line_NU: An Unconstrained Online Handwritten Chinese Text Lines Dataset

SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents

SD-Prompt: Learnable and Adaptive Prompts for Enhancing Subject-Driven text -to-Image Synthesis

Search method and apparatus for locating digitally stored content, such as visual images, music and sounds, text , or software, in storage devices on a computer network

Searching a High Performance Feature Extractor for text Recognition Network

Searching OCR'ed text : An LDA Based Approach

Searching through a Speech Memory for text -Independent Speaker Verification

See Finer, See More: Implicit Modality Alignment for text -based Person Retrieval

See-Through- text Grouping for Referring Image Segmentation

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene text Recognition

Seek Common Ground While Reserving Differences: Semi-Supervised Image- text Sentiment Recognition

SeeTek: Very Large-Scale Open-set Logo Recognition with text -Aware Metric Learning

SegINR: Segment-Wise Implicit Neural Representation for Sequence Alignment in Neural text -to-Speech

SegLink++: Detecting Dense and Arbitrary-shaped Scene text by Instance-aware Component Grouping

Segmentation and Classification of Mixed text /Graphics/Image Documents

Segmentation and Recognition of Continuous Handwriting Chinese text

Segmentation and Recognition of Dimensioning text from Engineering Drawings

Segmentation and Word Spotting Methods for Printed and Handwritten Arabic text s: A Comparative Study

Segmentation Method of Single- and Multiple-Touching Characters in Offline Handwritten Japanese text Recognition, A

Segmentation of Bangla unconstrained handwritten text

Segmentation of On-Line Freely Written Japanese text Using SVM for Improving Text Recognition

Segmentation of on-line handwritten Japanese text of arbitrary line direction by a neural network for improving text recognition

Segmentation of On-Line Handwritten Japanese text Using SVM for Improving Text Recognition

Segmentation of stick text based on sub connected area analysis

Segmentation of text and graphics

Segmentation of text and Graphics from Document Images

Segmentation of text From Color Map Images

Segmentation of text , picture and lines of a document image

Segmentation of text /image documents using texture approaches

Segmentation of Uniform Colored text from Color Graphics Background

Segmentation of Very Low Resolution Screen-Rendered text

Segmentation-Aware text -Guided Image Manipulation

Segmentation-Free Approach to text Recognition with Application to Arabic Text, A

Segmentation-Free Guidance for text -to-Image Diffusion Models

Segmentation-free handwritten Chinese text recognition with LSTM-RNN

Segmented handwritten text recognition with recurrent neural network classifiers

Segmenting a page of a document into areas which are text and areas which are halftone

Segmenting Messy text : Detecting Boundaries in Text Derived from Historical Newspaper Images

Segmenting text Images with Massively Parallel Machines

Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image- text Matching

Selectively Informative Description can Reduce Undesired Embedding Entanglements in text -to-Image Personalization

Self-Adaptive Image- text Fusion for Medical Image Classification

Self-attention based text Knowledge Mining for Text Detection

Self-Cross Diffusion Guidance for text -to-Image Synthesis of Similar Subjects

Self-Discovering Interpretable Diffusion Latent Directions for Responsible text -to-Image Generation

Self-learning structure for text localization

Self-Organized text Detection with Minimal Post-processing via Border Learning

Self-paced Learning to Improve text Row Detection in Historical Documents with Missing Labels

Self-supervised adaptation for on-line script text recognition

Self-supervised Character-to-Character Distillation for text Recognition

Self-supervised deep reconstruction of mixed strip-shredded text documents

Self-Supervised Discovery of Cross-Lingual Shared Knowledge for Continual text Recognition

Self-Supervised Implicit Glyph Attention for text Recognition

Self-Supervised Learning for text Recognition: A Critical Survey

Self-Supervised Learning of Visual Features through Embedding Images into text Topic Spaces

Self-supervised writer adaptation using perceptive concepts: application to on-line text recognition

Self-Training for Domain Adaptive Scene text Detection

Self-training for Handwritten text Line Recognition

SEM-CS: Semantic Clipstyler for text -Based Image Style Transfer

SEMACOL: Semantic-enhanced multi-scale approach for text -guided grayscale image colorization

Semantic and Morphological Information Guided Chinese text Classification

Semantic Controllable Long text Steganography Framework Based on LLM Prompt Engineering and Knowledge Graph, A

Semantic Correlation Mining between Images and text s with Global Semantics and Local Mapping

Semantic Distance Adversarial Learning for text -to-Image Synthesis

Semantic Indexing of Multimedia Content Using Visual, Audio, and text Cues

Semantic Integration of Information Through Relation Mining: Application to Bio-medical text Processing

Semantic keyword extraction via adaptive text binarization of unstructured unsourced video

Semantic Object Accuracy for Generative text -to-Image Synthesis

Semantic Oriented text Clustering Based on RDF

Semantic Proximity Based System of Arabic text Indexation, A

Semantic Role Aware Correlation Transformer for text To Video Retrieval

Semantic role-based representations in text classification

Semantic Similarity Distance: Towards better text -image consistency metric in text-to-image generation

Semantic text Summarization of Long Videos

Semantic-Aware Video text Detection

Semantic-Compensated and Attention-Guided Network for Scene text Detection

Semantic-Preserving Metric Learning for Video- text Retrieval

Semantic-Spatial Attention for Refined Object Placement in text -to-Image Synthesis

Semantically Consistent Hierarchical text to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network

Semantically consistent text to fashion image synthesis with an enhanced attentional generative adversarial network

Semantically Invariant text -to-Image Generation

Semantics Disentangling for text -To-Image Generation

Semantics-Enhanced Adversarial Nets for text -to-Image Synthesis

Semi-automatic news video annotation framework for Arabic text

Semi-Incremental Recognition Method for On-Line Handwritten Japanese text , A

Semi-Incremental Recognition of On-Line Handwritten Japanese text

Semi-supervised learning for text -line detection

Semi-supervised network embedding with text information

Semi-Supervised Pixel-Level Scene text Segmentation by Mutually Guided Network

Semi-Supervised Scene text Recognition

Semi-Supervised text Classification With Universum Learning

Semi-Supervised text Detection With Accurate Pseudo-Labels

Semi-Supervised text -Based Person Search

Semiautomatic Ground Truth Generation for text Detection and Recognition in Video Images

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end text Spotting

SemStyle: Learning to Generate Stylised Image Captions Using Unaligned text

Sense discovery via co-clustering on images and text

Sentence level text classification in the Kannada language: A classifier's perspective

Sentiment analysis based on text information enhancement and multimodal feature fusion

Sentiment Similarity-oriented Attention Model with Multi-task Learning for text -based Emotion Recognition, A

Separate Images and Graphics from text

Separate, Locate, and Align: Determine Con text Relation of Scene Text From Multiple Perspectives in TextVQA

Separating Content from Style Using Adversarial Learning for Recognizing text in the Wild

Separating handwritten material from machine printed text using hidden Markov models

Separating Handwritten text from Non-Textual Interference

Separating Lines of text in Free-Form Handwritten Historical Documents

Separating text and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding

Separation of overlapping text from graphics

Separation of touching and overlapping words in adjacent lines of handwritten text

Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised text Recognition

Seq2seq-based Model with Global Semantic Con text for Scene Text Recognition, A

Sequence as a Whole: A Unified Framework for Video Action Localization With Long-Range text Query

Sequence to Sequence -- Video to text

Sequence-to-Sequence Contrastive Learning for text Recognition

Sequence-To-Sequence Domain Adaptation Network for Robust text Image Recognition

Sequential alignment attention model for scene text recognition

Sequential Deformation for Accurate Scene text Detection

Sequential Monte Carlo video text segmentation

Sequential text s Driven Cohesive Motions Synthesis with Natural Transitions

Sequential Transformer for End-to-End Video text Detection

Sequential visual and semantic consistency for semi-supervised text recognition

set of benchmarks for Handwritten text Recognition on historical documents, A

SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene text Recognition

SGDM: An Adaptive Style-Guided Diffusion Model for Personalized text to Image Generation

Shape My Moves: text -Driven Shape-Aware Synthesis of Human Motions

Shape Robust text Detection With Progressive Scale Expansion Network

Shape-Aware text -Driven Layered Video Editing

Shape-DNA: Effective Character Restoration and Enhancement for Arabic text Documents

Shape-Matching GAN++: Scale Controllable Dynamic Artistic text Style Transfer

ShapeScaffolder: Structure-Aware 3D Shape Generation from text

ShapeWords: Guiding text -to-Image Synthesis with 3D Shape-Aware Prompts

Shatter and Gather: Learning Referring Image Segmentation with text Supervision

SHE-Net: Syntax-Hierarchy-Enhanced text -Video Retrieval

Sherpa3D: Boosting High-Fidelity text -to-3D Generation via Coarse 3D Prior

Shifted Diffusion for text -to-image Generation

ShotAdapter: text -to-Multi-Shot Video Generation with Diffusion Models

Show-1: Marrying Pixel and Latent Diffusion Models for text -to-Video Generation

Shuffle and Divide: Contrastive Learning for Long text

SiamCLIM: text -Based Pedestrian Search Via Multi-Modal Siamese Contrastive Learning

Sigma-Lognormal Model for Handwritten text CAPTCHA Generation, A

Sign Detection Based text Localization in Mobile Device Captured Scene Images

Signing Avatars: Multimodal Challenges for text -to-sign Generation

Silent Branding Attack: Trigger-free Data Poisoning Attack on text -to-Image Diffusion Models

SILMM: Self-Improving Large Multimodal Models for Compositional text -to-Image Generation

SimAC: A Simple Anti-Customization Method for Protecting Face Privacy Against text -to-Image Synthesis of Diffusion Models

SimAN: Exploring Self-Supervised Representation Learning of Scene text via Similarity-Aware Normalization

Similarity Search on Semantic Trajectories Using text Processing

Similarity Shuffled Criss-Cross Transformer With Angle Loss for Image- text Matching

SimMotionEdit: text -Based Human Motion Editing with Motion Similarity Prediction

Simple and Effective Multi-word Query Spotting in Handwritten text Images

Simple and Robust Correlation Filtering Method for text -Based Person Search, A

Simple Framework for text -Supervised Semantic Segmentation, A

Simulated Annealing Clustering of Chinese Words for Con text ual Text Recognition

Simulated annealing-based text clustering

SINE: SINgle Image Editing with text -to-Image Diffusion Models

Single Shot Scene text Retrieval

Single Shot text Detector with Regional Attention

Single-frame text super-resolution: a bayesian approach

Single-Line text Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition

Six-CD: Benchmarking Concept Removals for text -to-image Diffusion Models

SKED: Sketch-guided text -based 3D Editing

skeleton based descriptor for detecting text in real scene images, A

Skeleton Filter: A Self-Symmetric Filter for Skeletonization in Noisy text Images

Sketch and text Guided Diffusion Model for Colored Point Cloud Generation

Sketch is Worth a Thousand Words: Image Retrieval with text and Sketch, A

SketchBird: Learning to Generate Bird Sketches from text

Skew Angle Detection and Correction in text Images Using RGB Gradient

Skew correction and line extraction in binarized printed text images

Skew Detection and text Line-Position Determination in Digitized Documents

Skew detection for complex document images using robust borderlines in both text and non-text regions

Skew detection of text in a noisy digitized image

Skewed text correction based on the improved Hough transform

Skews in the Phenomenon Space Hinder Generalization in text -to-image Generation

SleeperMark: Towards Robust Watermark against Fine-Tuning text -to-image Diffusion Models

Sliding Line Point Regression for Shape Robust Scene text Detection

SLOAN: Scale-Adaptive Orientation Attention Network for Scene text Recognition

SMAN: Stacked Multimodal Attention Network for Cross-Modal Image- text Retrieval

SmartBrush: text and Shape Guided Object Inpainting with Diffusion Model

Smile: Sequence-to-Sequence Domain Adaptation with Minimizing Latent Entropy for text Image Recognition

SNAC: Speaker-Normalized Affine Coupling Layer in Flow-Based Architecture for Zero-Shot Multi-Speaker text -to-Speech

Snap Video: Scaled Spatiotemporal Transformers for text -to-Video Synthesis

SnapGen: Taming High-Resolution text -To-Image Models for Mobile Devices with Efficient Architectures and Training

Snooper text : A multiresolution system for text detection in complex visual scenes

Snooper text : A text detection system for automatic indexing of urban scenes

Snoopertrack: text detection and tracking for outdoor videos

SNP-S3: Shared Network Pre-Training and Significant Semantic Strengthening for Various Video- text Tasks

So Many Heads, So Many Wits: Multimodal Graph Reasoning for text -Based Visual Question Answering

Social Image- text Sentiment Classification With Cross-Modal Consistency and Knowledge Distillation

Sounding Video Generator: A Unified Framework for text -Guided Sounding Video Generation

Source-Free Image- text Matching via Uncertainty-Aware Learning

Space-Time Diffusion Features for Zero-Shot text -Driven Motion Transfer

sparse version of the ridge logistic regression for large-scale text categorization, A

Sparsectrl: Adding Sparse Controls to text -to-video Diffusion Models

Spatial and Color Spaces Combination for Natural Scene text Extraction

Spatial and Spectral Based Segmentation of text in Multispectral Images of Ancient Documents

Spatial con text -based Self-Supervised Learning for Handwritten Text Recognition

Spatial Transport Optimization by Repositioning Attention Map for Training-Free text -to-Image Synthesis

Spatially Prioritized and Persistent text Detection and Decoding

Spatio-Temporal Relevance Classification from Geographic text s Using Deep Learning

Spatiotemporal Typhoon Damage Assessment: A Multi-Task Learning Method for Location Extraction and Damage Identification from Social Media text s

SPCL: Semantic Polymorphism and Commonality Learning for text -Based Person Retrieval

Special issue on camera-based text and document recognition

Special issue on deep learning for video text analysis

Special Issue on Noisy text Analytics

Special Issue on Noisy text Analytics, II

Special Issue on Noisy text Analytics, III

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of text -to-Image Diffusion Models to Learn Any Unseen Style

Specific Category Region Proposal Network for text Detection in Natural Scene

Specific Diverse text -to-Image Synthesis via Exemplar Guidance

Spectral approach to find number of clusters of short- text documents

Spectral Fluctuation Method: A text ure-Based Method to Extract Text Regions in General Scene Images

SpectralCLIP: Preventing Artifacts in text -Guided Style Transfer from a Spectral Perspective

SpeechPalette: A Comprehensive Speech Editing Method for text -Based Speech Editing, One-Shot TTS and Attributes Editing

Speedupnet: A Plug-and-play Adapter Network for Accelerating text -to-image Diffusion Models

SPEye: A Calibration-Free Gaze-Driven text Entry Technique Based on Smooth Pursuit

Spherical Linear Interpolation and text -Anchoring for Zero-Shot Composed Image Retrieval

Split-net: Dual transformer encoder with splitting scene text image for script identification

Spontaneous Handwriting text Recognition and Classification Using Finite-State Models

Spotlight text Detector: Spotlight on Candidate Regions Like a Camera

Spotting Phrases in Lines of Imaged text

SPS-SQL: Enhancing text -to-SQL generation on small-scale LLMs with pre-synthesized queries

SPTS v2: Single-Point Scene text Spotting

ST-LDM: A Universal Framework for text -grounded Object Generation in Real Images

Stable Preference: Redefining Training Paradigm of Human Preference Model for text -to-image Synthesis

Stable text line detection

StableID: Multimodal learning for stable identity in personalized text -to-Face generation

StableVideo: text -driven Consistency-aware Diffusion Video Editing

Stacked Cross Attention for Image- text Matching

StackGAN: text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks

StacMR: Scene- text Aware Cross-Modal Retrieval

STAN: A sequential transformation attention-based network for scene text recognition

STAR-Net: A SpaTial Attention Residue Network for Scene text Recognition

STARS: Semantics-Aware text -guided Aerial Image Refinement and Synthesis

StarVector: Generating Scalable Vector Graphics Code from Images and text

State Estimation in a Document Image and Its Application in text Block Identification and Text Line Extraction

State-of-the-Art in Action: Unconstrained text Detection

Static text region detection in video sequences using color and orientation consistencies

Statistical Approach for Phrase Location and Recognition within a text Line: An Application to Street Name Recognition, A

Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme

Statistical text Line Analysis in Handwritten Documents

Steerable Directional Local Profile Technique for Extraction of Handwritten Arabic text Lines, A

STEFANN: Scene text Editor Using Font Adaptive Neural Network

Steganalysis for text , Documents

STEP - Towards Structured Scene- text Spotting

STEPS: Sequential Probability Tensor Estimation for text -to-Image Hard Prompt Search

STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from text -to-Image Diffusion Models

Stochastic text Models for Music Categorization

Stop Word Location and Identification for Adaptive text Recognition

Store classification using text -Exemplar-Similarity and Hypotheses-Weighted-CNN

Story Segmentation in News Videos Using Visual and text Cues

Story Visualization by Online text Augmentation with Context Memory

StoryDALL-E: Adapting Pretrained text -to-Image Transformers for Story Continuation

STPNet: Scale-Aware text Prompt Network for Medical Image Segmentation

Straight-Line Approximation and 1D Representation of Off-Line Handwritten text

Straightening warped text lines using polynomial regression

Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene text Recognition

Stratified Multi-Task Learning for Robust Spotting of Scene text s

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from text

StreamMel: Real-Time Zero-Shot text -to-Speech Via Interleaved Continuous Autoregressive Modeling

Street View text Recognition With Deep Learning for Urban Scene Understanding in Intelligent Transportation Systems

Stretching deep architectures for text recognition

String Matching, text Matching

String-level learning of confidence transformation for Chinese handwritten text recognition

STRIVE: Scene text Replacement In Videos

stroke filter and its application to text localization, A

Stroke Filter for text Localization in Video Images

Stroke Segmentation and Recognition from Bangla Online Handwritten text

Stroke Verification with Gray-level Image for Hangul Video text Recognition

Stroke-Based Scene text Erasing Using Synthetic Data for Training

Strokelets: A Learned Multi-Scale Mid-Level Representation for Scene text Recognition

Strokelets: A Learned Multi-scale Representation for Scene text Recognition

Structural feature-based event clustering for short text streams

Structure-Aware Generative Adversarial Network for text -to-Image Generation

Structured Human Assessment of text -to-Image Generative Models

Structuring low-quality videotaped lectures for cross-reference browsing by video text analysis

Study on Automatic Chinese text Classification, A

Style Transformer With Common Knowledge Optimization for Image- text Retrieval, The

Style-A-Video: Agile Diffusion for Arbitrary text -Based Video Style Transfer

Style-Editor: text -driven object-centric style editing

Style-Preserving Diffusion for Scene text Editing

StyleCLIP: text -Driven Manipulation of StyleGAN Imagery

StyleMC: Multi-Channel Based Fast text -Guided Image Generation and Manipulation

StyleStudio: text -Driven Style Transfer with Selective Control of Style Elements

StyleT2I: Toward Compositional and High-Fidelity text -to-Image Synthesis

Stylized text -to-Fashion Image Generation

Sub-structure Learning Based Handwritten Chinese text Recognition

subtractive clustering scheme for text -independent online writer identification, A

Super-resolution Enhancement of text Image Sequences

Super-Resolved Binarization of text Based on the FAIR Algorithm

Superresolution-based Enhancement of text in Digital Video

supervised algorithm with a new differentiated-weighting scheme for identifying the author of a handwritten text , A

Supervised and Traditional Term Weighting Methods for Automatic text Categorization

Supervised Domain Adaptation from Scene text Recognition for Licence Plate Recognition

Supervised semantic relation mining from linguistically noisy text documents

support vector approach for cross-modal search of images and text s, A

Support vector machine-based approach for text description from the video

Support vector machine-based text detection in digital video

Suppression of non- text components in handwritten document images

Surgical text -to-image generation

Surprisingly Straightforward Scene text Removal Method with Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis, The

Survey of text Watermarking in the Era of Large Language Models, A

survey on camera-captured scene text detection and extraction: towards Gurmukhi script, A

survey on methods, datasets and implementations for scene text spotting, A

survey on text generation using generative adversarial networks, A

SVGDreamer++: Advancing Editability and Diversity in text -Guided SVG Generation

SVGDreamer: text Guided SVG Generation with Diffusion Model

SViTT: Temporal Learning of Sparse Video- text Transformers

Swap Attention in Spatiotemporal Diffusions for text -to-Video Generation

Swap text : Image Based Texts Transfer in Scenes

SwiftBrush: One-Step text -to-Image Diffusion Model with Variational Score Distillation

SwiftEdit: Lightning Fast text -Guided Image Editing via One-Step Diffusion

Swin text Spotter v2: Towards Better Synergy for Scene Text Spotting

Swin text Spotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

SWT voting-based color reduction method for detecting text in natural scene images

Symbolic Subtraction from Fixed Formatted Graphics and text from Filled in Forms

Symbolization of Regional Elements Based on Local-Chronicle text Mining and Image-Feature Extraction, The

Symmetric-key block cipher for image and text cryptography

Symmetry-based object proposal for text detection

Symmetry-based text line detection in natural scenes

Symmetry-Constrained Rectification Network for Scene text Recognition

Syn3DTxt: Embedding 3D Cues for Scene text Generation

Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to- text Translation

Synthesizing Talking Faces from text and Audio: An Autoencoder and Sequence-to-Sequence Convolutional Neural Network

Synthetic Data for text Localisation in Natural Images

Synthetic-to-real Unsupervised Domain Adaptation for Scene text Detection in the Wild

Synthetically Supervised Feature Learning for Scene text Recognition

System and method for automatically distinguishing between graphic information and text information of image data

System for Bangla Online Handwritten text , A

System for Handwritten and Machine-Printed text Separation in Bangla Document Images, A

system for the off-line recognition of handwritten text , A

T-HOG: An effective gradient-based descriptor for single line text regions

T-REX2: Towards Generic Object Detection via text -visual Prompt Synergy

T-Skeleton: Accurate scene text detection via instance-aware skeleton embedding

t-SS3: A text classifier with dynamic n-grams for early risk detection over text streams

t-Test feature selection approach based on term frequency for text categorization

T-VSL: text -Guided Visual Sound Source Localization in Mixtures

T2I-CompBench++: An Enhanced and Comprehensive Benchmark for Compositional text -to-Image Generation

T2ishield: Defending Against Backdoors on text -to-image Diffusion Models

T2TD: text -3D Generation Model Based on Prior Knowledge Guidance

T2V-CompBench: A Comprehensive Benchmark for Compositional text -to-video Generation

T2V2T: text -to-Video-to-Text Fusion for Text-to-Video Retrieval

T2VBench: Benchmarking Temporal Dynamics for text -to-Video Generation

T2VLAD: Global-Local Sequence Alignment for text -Video Retrieval

TA2V: text -Audio Guided Video Generation

TAB: text -Align Anomaly Backbone Model for Industrial Inspection Tasks

Tablet identification using support vector machine based text recognition and error correction by enhanced n-grams algorithm

TACMT: text -aware cross-modal transformer for visual grounding on high-resolution SAR images

TACo: Token-aware Cascade Contrastive Learning for Video- text Alignment

TACT: text attention based CNN-Transformer network for polyp segmentation

TADA! text to Animatable Digital Avatars

Tag2Pix: Line Art Colorization Using text Tag With SECat and Changing Loss

Tag: text Prompt Augmentation for Zero-shot Out-of-distribution Detection

Tagging Webcast text in Baseball Videos by Video Segmentation and Text Alignment

Tailored Visions: Enhancing text -to-Image Generation with Personalized Prompt Rewriting

Tailoring text for automatic layouting of newspaper pages

TalkCLIP: Talking Head Generation with text -Guided Expressive Speaking Styles

TAM-TR: text -guided attention multi-modal transformer for object detection in UAV images

Taming Mode Collapse in Score Distillation for text -to-3D Generation

Taming Stable Diffusion for text to 360° Panorama Image Generation

TAP: text -Aware Pre-training for Text-VQA and Text-Caption

TAPS3D: text -Guided 3D Textured Shape Generation from Pseudo Supervision

Target-level Sentiment Analysis Based on Image and text Fusion

TASDF-Stega: High Capacity Secure text -Audio Joint Steganography Using Diffusion Latent Space

Task Grouping for Multilingual text Recognition

TC4D: Trajectory-Conditioned text -to-4D Generation

TCATD: text Contour Attention for Scene Text Detection

TCFF-Adapter: text -Driven Adaption of CLIP for Few-Shot Image Classification

TCP: text -Guided Cascade Network for Pedestrian Crossing Intention Prediction

TE141K: Artistic text Benchmark for Text Effect Transfer

Teach text : CrossModal Generalized Distillation for Text-Video Retrieval

Teach text : CrossModal text-video retrieval through generalized distillation

TECA: text -Guided Generation and Editing of Compositional 3D Avatars

TeCH: text -Guided Reconstruction of Lifelike Clothed Humans

Technique for Segmentation of Gurmukhi text , A

Tecm-clip: text -based Controllable Multi-attribute Face Image Manipulation

TediGAN: text -Guided Diverse Face Image Generation and Manipulation

TEDRA: text -Based Editing of Dynamic and Photoreal Actors

Tela: text to Layer-wise 3d Clothed Human Generation

Tell Me What Happened: Unifying text -guided Video Completion via Multimodal Masked Video Generation

Tell Your Story: text -Driven Face Video Synthesis with High Diversity via Adversarial Learning

Tem-adapter: Adapting Image- text Pretraining for Video Question Answer

TeMO: Towards text -Driven 3D Stylization for Multi-Object Meshes

Template Based Segmentation of Touching Components in Handwritten text Lines

Temporal Multimodal Graph Transformer With Global-Local Alignment for Video- text Retrieval

Temporal prompt guided visual- text -object alignment for zero-shot video captioning

Temporal video segmentation with natural language using text -video cross attention and Bayesian order-priors

TEMSA: text enhanced modal representation learning for multimodal sentiment analysis

Tensor representation learning based image patch analysis for text identification and recognition

Tensor Voting Based text Localization in Natural Scene Images

Term relevance dependency model for text classification

TETFN: A text enhanced transformer fusion network for multimodal sentiment analysis

TeViR: text -to-Video Reward With Diffusion Models for Efficient Reinforcement Learning

TEXDC: text -driven Disease-aware 4d Cardiac Cine MRI Images Generation

TexFusion: Synthesizing 3D text ures with Text-Guided Image Diffusion Models

Texgen: text -guided 3d Texture Generation with Multi-view Sampling and Resampling

text alignment in early printed books combining deep learning and dynamic programming

text alignment with handwritten documents

text analysis using local energy

text and Documents in the Deep Learning Era

text and Image Guided 3D Avatar Generation and Manipulation

text and Image Sharpening of Scanned Images in the JPEG Domain

text and Layout Information Extraction from Document Files of Various Formats Based on the Analysis of Page Description Language

text and Non-Text Latent Feature Disentanglement for Screen Content Image Compression

text and non-text segmentation based on connected component features

text and non-text separation in offline document images: a survey

text and picture segmentation by the distribution analysis of wavelet coefficients

text and User Generic Model for Writer Verification Using Combined Pen Pressure Information From Ink Intensity and Indented Writing on Paper

text Area Detection in Digital Documents Images Using Textural Features

text area localization under complex-background using wavelet decomposition

text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution, A

text Augmented Correlation Transformer For Few-shot Classification & Segmentation

text baseline detection, a single page trained system

text Baseline Recognition Using a Recurrent Convolutional Neural Network

text binarization in color documents

text Block Segmentation in Comic Speech Bubbles

text box proposals for handwritten word spotting from documents

text Categorization Approach for Music Style Recognition, A

text Categorization: A Symbolic Approach

text Change Detection in Multilingual Documents Using Image Comparison

text Classification and Document Layout Analysis of Paper Fragments

text classification with the support of pruned dependency patterns

text Co-Detection in Multi-View Scene

text Compression-Aided Transformer Encoding

text data extraction from microfilm images of punched cards

text degradations and OCR training

text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning

text Detection and Localization in Complex Scene Images using Constrained AdaBoost Algorithm

text Detection and Recognition in Imagery: A Survey

text detection and recognition in images and video frames

text detection and recognition in natural scene with edge analysis

text Detection and Recognition in Real World Images

text detection and recognition in urban scenes

text Detection and Recognition on Traffic Panels From Street-Level Imagery Using Visual Appearance

text detection and restoration in natural scene images

text Detection and Translation from Natural Scenes

text detection based on convolutional neural networks with spatial pyramid pooling

text Detection for Video Analysis

text detection from natural scene images using topographic maps and sparse representations

text Detection from Natural Scene Images: Towards a System for Visually Impaired Persons

text detection from scene images using sparse representation

text detection in color scene images based on unsupervised clustering of multi-channel wavelet features

text detection in continuous tone image segments

text Detection in Digital Images Captured with Low Resolution Under Nonuniform Illumination Conditions

text detection in images based on unsupervised classification of edge-based features

text detection in images based on unsupervised classification of high-frequency wavelet coefficients

text detection in images using sparse representation with discriminative dictionaries

text detection in manga by combining connected-component-based and region-based classifications

text Detection in Natural Images Using Bio-inspired Models

text Detection in Natural Images Using Localized Stroke Width Transform

text Detection in Natural Scene Images by Stroke Gabor Words

text detection in natural scene images with user-intention

text detection in natural scene with edge analysis

text detection in natural scenes using Gradient Vector Flow-Guided symmetry

text detection in nature scene images using two-stage nontext filtering

text detection in scene images based on exhaustive segmentation

text detection in stores using a repetition prior

text Detection of Two Major Indian Scripts in Natural Scene Images

text detection on camera acquired document images using supervised classification of connected components in wavelet domain

text Detection System for Natural Scenes with Convolutional Feature Learning and Cascaded Classification, A

text Detection Using Edge Gradient and Graph Spectrum

text Detection, Find Text in General Scenes, Scene Text

text detection, localization, and tracking in compressed video

text Detection, Scene Text, Curved Text, Arbitrary Orientation

text Detection, Tracking and Recognition in Video: A Comprehensive Survey

text Detector Based on the Specific Text Prompt, A

text discrimination method and related apparatus

text Driven Face-Video Synthesis Using GMM and Spatial Correlation

text Driven Temporal Segmentation of Cricket Videos

text effects transfer via distribution-aware texture synthesis

text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps

text Encryption: Hybrid cryptographic method using Vigenere and Hill Ciphers

text Enhancement by PDE's Based Methods

text Enhancement for Laser Copiers

text Enhancement Network for Cross-Domain Scene Text Detection

text enhancement with asymmetric filter for video OCR

text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model

text extraction from color documents-clustering approaches in three and four dimensions

text Extraction from Colored Book and Journal Covers

text extraction from degraded document images

text extraction from gray scale document images using edge information

text extraction from gray scale historical document images using adaptive local connectivity map

text Extraction from Grey Scale Page Images by Simple Edge Detectors

text extraction from images captured via mobile and digital devices

text extraction from name cards with complex design

text extraction from scene images by character appearance and structure modeling

text Extraction from Street Level Images

text Extraction from Video Using Conditional Random Fields

text extraction from web images based on a split-and-merge segmentation method using colour perception

text extraction in complex color documents

text Extraction in Digital News Video Using Morphology

text Extraction in MPEG Compressed Video for Content-based Indexing

text extraction in real scene images on planar planes

text Extraction Using Component Analysis and Neuro-fuzzy Classification on Complex Backgrounds

text Extraction Using Pyramid

text Flow: A Unified Text Detection System in Natural Scene Images

text From Corners: A Novel Approach to Detect Text and Caption in Videos

text generation and multi-modal knowledge transfer for few-shot object detection

text Generation, Text Synthesis, Text Placement on Maps

text Geolocation Prediction via Self-Supervised Learning

text Grouping Adapter: Adapting Pre-Trained Text Detector for Layout Analysis

text Growing on Leaf

text Guided Person Image Synthesis

text Identification for Document Image Analysis Using a Neural Network

text Identification in Complex Background Using SVM

text identification in noisy document images using Markov random field

text Image Classifier Using Image-Wise Annotation

text Image Clean-Up and Thresholding: A Comparative Study

text Image Compression Using Soft Pattern Matching

text Image Deblurring Using Kernel Sparsity Prior

text Image Deblurring Using Text-Specific Properties

text Image Deblurring via Intensity Extremums Prior

text in Everything

text in Scenes, Stroke Based, Contour Based

text in the dark: Extremely low-light text image enhancement

text independent speaker gender recognition using lip movement

text Independent Writer Identification for Bengali Script

text independent writer identification of Arabic manuscripts and the effects of writers increase

text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features

text information extraction in images and video: a survey

text Input System Using Online Overlapped Handwriting Recognition for Mobile Devices

text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation

text line and word segmentation of handwritten documents

text line bounding system

text Line Characterization by Connected Component Transformations

text Line Detection for Heterogeneous Documents

text Line Detection in Corrupted and Damaged Historical Manuscripts

text Line Detection in Document Images: Towards a Support System for the Blind

text line detection in handwritten documents

text Line Detection in Unconstrained Handwritten Documents Using a Block-Based Hough Transform Approach

text Line Detection Method for Mathematical Formula Recognition, A

text line extraction for historical document images

text line extraction from multi-skewed handwritten documents

text line extraction in document images

text Line Extraction in Documents

text line extraction in graphical documents using background and foreground information

text Line Extraction in Handwritten Document with Kalman Filter Applied on Low Resolution Image

text Line Extraction Method Using Domain-Based Active Contour Model

text line extraction strategy for palm leaf manuscripts

text Line Extraction Using Adaptive Partial Projection for Palm Leaf Manuscripts from Thailand

text Line Extraction Using DMLP Classifiers for Historical Manuscripts

text Line Extraction Using Fully Convolutional Network and Energy Minimization

text line segmentation and word recognition in a system for general writer independent handwriting recognition

text Line Segmentation Based on Morphology and Histogram Projection

text Line Segmentation for Unconstrained Handwritten Document Images Using Neighborhood Connected Component Analysis

text line segmentation in Chinese handwritten text images

text Line Segmentation in Handwritten Documents Using Mumford-Shah Model

text Line Segmentation in Images of Handwritten Historical Documents

text Line Segmentation of Historical Arabic Documents

text line segmentation of historical documents: a survey

text line segmentation using a fully convolutional network in handwritten document images

text Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis

text Localization and Extraction from Complex Color Images

text Localization and Extraction from Complex Gray Images

text Localization and Recognition in Complex Scenes Using Local Features

text Localization Based on Fast Feature Pyramids and Multi-Resolution Maximally Stable Extremal Regions

text Localization in Born-Digital Images of Advertisements

text Localization in Natural Images Using Stroke Feature Transform and Text Covariance Descriptors

text Localization in Natural Scene Images Based on Conditional Random Field

text Localization in Real-World Images Using Efficiently Pruned Exhaustive Search

text Localization in Web Images Using Probabilistic Candidate Selection Model

text localization using image cues and text line information

text localization, enhancement and binarization in multimedia documents

text locating from natural scene images using image intensities

text location in complex images

text Mining in Remotely Sensed Phenology Studies: A Review on Research Development, Main Topics, and Emerging Issues

text Mining the Contributors to Rail Accidents

text Motion Translator: A Bi-directional Model for Enhanced 3d Human Motion Generation from Open-vocabulary Descriptions

text OCR by Solving a Cryptogram

text only Analysis, Natural Language

text optimization with latent inversion for non-rigid image editing

text Page Recognition Using Grey-Level Features and Hidden Markov-Models

text Parsing Using Spatial Information for Recognizing Addresses in Mail Pieces

text Particles Multi-band Fusion for Robust Text Detection

text Position-Aware Pixel Aggregation Network With Adaptive Gaussian Threshold: Detecting Text in the Wild

text Prior Guided Scene Text Image Super-Resolution

text Prompt Region Decomposition for Effective Facial Expression Recognition

text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection

text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding

text Query to Web Image to Video: A Comprehensive Ad-hoc Video Search

text reading algorithm for natural images, A

text Recognition - Real World Data and Where to Find Them

text Recognition and Retrieval System for e-Business Image Management, A

text recognition from grey level images using hidden Markov models

text Recognition in Images Based on Transformer with Hierarchical Attention

text recognition in multimedia documents: A study of two neural-based OCRs using and avoiding character segmentation

text Recognition in Real Scenarios with a Few Labeled Samples

text Recognition in the Wild: A Survey

text recognition of low-resolution document images

text Recognition System for Japanese Documents

text recognition using deep BLSTM networks

text Recognition: From Pixels to Meaning

text Region Conditional Generative Adversarial Network for Text Concealment in the Wild

text region extraction and text segmentation on camera-captured document style images

text Region Extraction from Quality Degraded Document Images

text Region Extraction From Scene Images Using AGF and MSER

text region extraction in a document image based on the Delaunay tessellation

text retrieval from early printed books

text scanner with text detection technology on image sequences

text search for medieval manuscript images

text segmentation and recognition in complex background based on markov random field

text Segmentation by Clustering Cohesion

text Segmentation for MRC Document Compression

text Segmentation from Complex Background Using Sparse Representations

text segmentation in color images using tensor voting

text Segmentation in Colour Posters from the Spanish Civil War Era

text segmentation in natural scenes using Toggle-Mapping

text Segmentation in Unconstrained Hand-Drawings in Whiteboard Photos

text Segmentation of Consumer Magazines in PDF Format

text Segmentation Using Gabor Filters for Automatic Document Processing

text segmentation using superpixel clustering

text selection by structured light marking for hand-held cameras

text Separation from Mixed Documents Using a Tree-Structured Classifier

text Similarity Measurement Method Based on BiLSTM-SECapsNet Model

text Spotting Transformers

text String Detection From Natural Scenes by Structure-Based Partition and Grouping

text String Extraction from Images of Color-Printed Documents

text Synopsis Generation for Egocentric Videos

text to 3D Synthesis, Text to 3D Generation

text to Image for Multi-Label Image Recognition With Joint Prompt-Adapter Learning

text to Image Generation with Semantic-Spatial Aware GAN

text To Image Synthesis With Erudite Generative Adversarial Networks

text to image synthesis with multi-granularity feature aware enhancement Generative Adversarial Networks

text to Image, Image Based Rendering

text to photo-realistic image synthesis via chained deep recurrent generative adversarial network

text to Video Synthesis, Text to Motion

text to visual synthesis with appearance models

text Verification in an Automated System for the Extraction of Bibliographic Data

text vs. Non-Text Regions

text watermarking algorithm based on word classification and inter-word space statistics, A

text with Knowledge Graph Augmented Transformer for Video Captioning

text zone classification using unsupervised feature learning

text - and speech-based phonotactic models for spoken language identification of Basque and Spanish

text -anchored Score Composition: Tackling Condition Misalignment in Text-to-image Diffusion Models

text -Attentional Convolutional Neural Network for Scene Text Detection

text -augmented Multi-Modality contrastive learning for unsupervised visible-infrared person re-identification

text -aware balloon extraction from manga

text -aware image dehazing using stroke width transform

text -Based Audio Retrieval by Learning From Similarities Between Audio Captions

text -Based Fine-Grained Emotion Prediction

text -based Geometric Normalization for Robust Watermarking of Digital Maps

text -based image retrieval using progressive multi-instance learning

text -Based Localization of Moments in a Video Corpus

text -based Person Search via Attribute-aided Matching

text -Based Person Search via Cross-Modal Alignment Learning

text -based person search via fine-grained cross-modal semantic alignment

text -Based Temporal Localization of Novel Events

text -based visual context modulation neural model for multimodal machine translation, A

text -Centric multimodal sentiment analysis with asymmetric fine-tuning

text -Conditional Attribute Alignment Across Latent Spaces for 3D Controllable Face Image Synthesis

text -Conditioned Generative Model of 3D Strand-Based Human Hairstyles

text -conditioned Resampler For Long Form Video Understanding

text -Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models

text -Controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion

text -Derived Relational Graph-Enhanced Network for Skeleton-Based Action Segmentation

text -Driven Automatic Frame Generation Using MPEG-4 Synthetic/Natural Hybrid Coding for 2-D Head-and-Shoulder Scene

text -Driven Fashion Image Editing with Compositional Concept Learning and Counterfactual Abduction

text -Driven Generative Domain Adaptation with Spectral Consistency Regularization

text -Driven Image Editing via Learnable Regions

text -Driven Medical Image Segmentation With LLM Semantic Bridge and LLM Prompt Bridge

text -driven Stylization of Video Objects

text -Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos

text -Driven Video Acceleration: A Weakly-Supervised Reinforcement Learning Method

text -Edge-Box: An Object Proposal Approach for Scene Texts Localization

text -Enhanced Data-Free Approach for Federated Class-Incremental Learning

text -Enhanced Scene Image Super-Resolution via Stroke Mask and Orthogonal Attention

text -Enriched Air Traffic Flow Modeling and Prediction Using Transformers

text -free diffusion inpainting using reference images for enhanced visual fidelity

text -graphics separation to detect logo and stamp from color document images: A spectral approach

text -Guided 3D Face Synthesis: From Generation to Editing

text -guided camouflaged object detection

text -Guided Coarse-to-Fine Fusion Network for robust remote sensing visual question answering

text -guided distillation learning to diversify video embeddings for text-video retrieval

text -Guided Explorable Image Super-Resolution

text -Guided Eyeglasses Manipulation With Spatial Constraints

text -Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning

text -Guided Facial Image Manipulation for Wild Images via Manipulation Direction-Based Loss

text -guided Fourier Augmentation for long-tailed recognition

text -Guided Generation and Refinement Model for Image Captioning, A

text -Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks

text -Guided Human Image Manipulation via Image-Text Shared Space

text -Guided Multi-Class Multi-Object Tracking for Fine-Grained Maritime Rescue

text -Guided Neural Network Training for Image Recognition in Natural Scenes and Medicine

text -Guided Object Detector for Multi-modal Video Question Answering

text -Guided Patch Scoring and Local Distortion Guidance for Image Quality Assessment

text -Guided Prototype Generation for Occluded Person Re-Identification

text -Guided Reconstruction Network for Sentiment Analysis With Uncertain Missing Modalities

text -Guided Semantic Alignment Network With Spatial-Frequency Interaction for Infrared-Visible Image Fusion Under Extreme Illumination

text -guided Sparse Voxel Pruning for Efficient 3D Visual Grounding

text -Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation

text -Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation

text -guided Video Masked Autoencoder

text -guided visual representation learning for medical image retrieval systems

text -guided weakly supervised framework for dynamic facial expression recognition

text -IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion

text -Image Alignment for Diffusion-Based Perception

text -image separation in Devanagari documents

text -image super-resolution through anchored neighborhood regression with multiple class-specific dictionaries

text -Independent Online Writer Identification Using Hidden Markov Models

text -independent Persian writer identification based on feature relation graph (FRG), A

text -independent speaker identification using Radon and discrete cosine transforms based features from speech spectrogram

text -independent speaker recognition using graph matching

text -independent speaker verification with ant colony optimization feature selection and support vector machine

text -independent voice conversion using deep neural network based phonetic level features

text -Independent Writer Identification and Verification on Offline Arabic Handwriting

text -Independent Writer Identification and Verification Using Textural and Allographic Features

text -Independent Writer Identification Based on Fusion of Dynamic and Static Features

text -Independent Writer Identification on Online Arabic Handwriting

text -independent writer identification using convolutional neural network

text -independent writer identification using SIFT descriptor and contour-directional feature

text -independent writer recognition using multi-script handwritten texts

text -indicated writer verification using hidden Markov models

text -Injected Discriminative Model for Remote Sensing Visual Grounding

text -instance graph: Exploring the relational semantics for text-based visual question answering

text -Line Detection in Camera-Captured Document Images Using the State Estimation of Connected Components

text -line examination for document forgery detection

text -line Extraction and Character Recognition of Document Headlines with Graphical Designs Using Complementary Similarity Measure

text -Line Extraction and Character Recognition of Japanese Newspaper Headlines With Graphical Designs

text -Line Extraction in Handwritten Chinese Documents Based on an Energy Minimization Framework

text -Line Extraction Using a Convolution of Isotropic Gaussian Filter with a Set of Line Filters

text -mining based journal splitting

text -only weakly supervised learning framework for text spotting via text-to-polygon generator, A

text -Pose Estimation in 3D Using Edge-Direction Distributions

text -RGNNs: Relational Modeling for Heterogeneous Text Graphs

text -Scene Retrieval for Driving Scenes in Transportation Cyber-Physical Systems

text -to-3D Generation with Bidirectional Diffusion Using Both 2D and 3D Priors

text -to-3D using Gaussian Splatting

text -to-Floorplan Synthesis via Graph-Conditioned Diffusion Processes

text -to-Image Diffusion Models are Great Sketch-Photo Matchmakers

text -to-image Editing by Image Information Removal

text -to-Image Generation Grounded by Fine-Grained User Attention

text -to-Image Generation via Semi-Supervised Training

text -to-Image Models for Counterfactual Explanations: A Black-Box Approach

text -to-Image Person Re-Identification Based on Multimodal Graph Convolutional Network

text -to-Image Synthesis based on Object-Guided Joint-Decoding Transformer

text -to-Image Synthesis for Domain Generalization in Face Anti-Spoofing

text -to-image synthesis with self-supervised bi-stage generative adversarial network

text -to-image synthesis with self-supervised learning

text -to-Image Vehicle Re-Identification: Multi-Scale Multi-View Cross-Modal Alignment Network and a Unified Benchmark

text -to-image via mask anchor points

text -to-Speech With Lip Synchronization Based on Speech-Assisted Text-to-Video Alignment and Masked Unit Prediction

text -to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression

text -to-Traffic Generative Adversarial Network for Traffic Situation Generation

text -to-video: a semantic search engine for internet videos

text -Tracking Wearable Camera System for the Blind

text -tracking wearable camera system for visually-impaired people

text -Video Completion Using Structure Repair and Texture Propagation

text -Video Knowledge Guided Prompting for Weakly Supervised Temporal Action Localization

text -video retrieval re-ranking via multi-grained cross attention and frozen image encoders

text -Video Retrieval With Global-Local Semantic Consistent Learning

text -Visual Prompting for Efficient 2D Temporal Video Grounding

text /continuous tone image decision processor

text /graphic labelling of ancient printed documents

text /graphic separation using a sparse representation with multi-learned dictionaries

text /Graphics Segmentation in Architectural Floor Plans

text /Graphics Separation Revisited

text /image separation method

text /Non-Text Image Classification in the Wild with Convolutional Neural Networks

text /Non-text Ink Stroke Classification in Japanese Handwriting Based on Markov Random Fields

text /shape classifier for mobile applications with handwriting input

text 2Avatar: Articulated 3D Avatar Creation With Text Instructions

text 2Concept: Concept Activation Vectors Directly from Text

text 2HOI: Text-Guided 3D Motion Generation for Hand-Object Interaction

text 2LiDAR: Text-guided Lidar Point Cloud Generation via Equirectangular Transformer

text 2LIVE: Text-Driven Layered Image and Video Editing

text 2Mesh: Text-Driven Neural Stylization for Meshes

text 2Performer: Text-Driven Human Video Generation

text 2place: Affordance-aware Text Guided Human Placement

text 2Pos: Text-to-Point-Cloud Cross-Modal Localization

text 2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

text 2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

text 2Scene: Text-driven Indoor Scene Stylization with Part-Aware Details

text 2Sketch: Learning Face Sketch from Facial Attribute Text

text 2Tex: Text-driven Texture Synthesis via Diffusion Models

text 2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

text 2Video: An End-to-end Learning Framework for Expressing Text With Videos

text AdaIN: Paying Attention to Shortcut Learning in Text Recognizers

text Adapter: Self-Supervised Domain Adaptation for Cross-Domain Text Recognition

text Aug: Test Time Text Augmentation for Multimodal Person Re-Identification

text Boxes++: A Single-Shot Oriented Scene Text Detector

text Catcher: a method to detect curved and challenging text in natural scenes

text ContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade

text Craftor: Your Text Encoder can be Image Quality Controller

text DCT: Arbitrary-Shaped Text Detection via Discrete Cosine Transform Mask

text Diff: Enhancing scene text image super-resolution with mask-guided residual diffusion models

text diffuser-2: Unleashing the Power of Language Models for Text Rendering

text Dragon: An End-to-End Framework for Arbitrary Shaped Text Spotting

text Face: Text-to-Style Mapping Based Face Generation and Manipulation

text Field: Learning a Deep Direction Field for Irregular Scene Text Detection

text Finder: An Automatic System to Detect and Recognize Text in Images

text invision: Text and Prompt Complexity Driven Visual Text Generation Benchmark

text ManiA: Enriching Visual Feature by Text-driven Manifold Augmentation

text Mesh: Generation of Realistic 3D Meshes From Text Prompts

text Mountain: Accurate scene text detection via instance segmentation

text NeRF: A Novel Scene-Text Image Synthesis Method Based on Neural Radiance Fields

text Net: Irregular Text Reading from Images with an End-to-End Trainable Network

text OCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

text Place: Visual Place Recognition and Topological Localization Through Reading Scene Texts

text Proposals: A text-specific selective search algorithm for word spotting in the wild

text ron: Weakly Supervised Multilingual Text Detection through Data Programming

text RS: Deep Bidirectional Triplet Network for Matching Text to Remote Sensing Images

text s as Images in Prompt Tuning for Multi-Label Image Recognition

text s as points: Scene text detection with point supervision

text SLAM: Visual SLAM With Semantic Planar Text Features

text Snake: A Flexible Representation for Detecting Text of Arbitrary Shapes

text SRNet: Scene Text Super-Resolution Based on Contour Prior and Atrous Convolution

text StyleBrush: Transfer of Text Aesthetics From a Single Example

text ual Alchemy: CoFormer for Scene Text Understanding

text ual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching

text ual Visual Semantic Dataset for Text Spotting

text ual-visual Logic Challenge: Understanding and Reasoning in Text-to-image Generation

text ure-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm

TFRGAN: Leveraging text Information for Blind Face Restoration with Extreme Degradation

TF²: Few-Shot text -Free Training-Free Defect Image Generation for Industrial Anomaly Inspection

TG-TSGNet: A text -Guided Arbitrary-Resolution Terrain Scene Generation Network

There and Back Again: 3D Sign Language Generation from text Using Back-Translation

Thinking Fast and Slow: Efficient text -to-Visual Retrieval with Transformers

Three decision levels strategy for Arabic and Latin text s differentiation in printed and handwritten natures

Three-Dimensional Lip Motion Network for text -Independent Speaker Recognition

Thresholding video images for text detection

TI2V-Zero: Zero-Shot Image Conditioning for text -to-Video Diffusion Models

TIAM - A Metric for Evaluating Alignment in text -to-Image Generation

Tibet: Identifying and Evaluating Biases in text -to-image Generative Models

Ticker: An Adaptive Single-Switch text Entry Method for Visually Impaired Users

TIED: A Cycle Consistent Encoder-Decoder Model for text -to-Image Retrieval

TieNet: text -Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays

TIFA: Accurate and Interpretable text -to-Image Faithfulness Evaluation with Question Answering

Tightness-Aware Evaluation Protocol for Scene text Detection

TIMTQE: Benchmarking Machine Translation Quality Estimation for text Images

TIPS: text -Induced Pose Synthesis

TISE: Bag of Metrics for text -to-Image Synthesis Evaluation

TJCMNet: An Efficient Vision- text Joint Identity Clues Mining Network for Visible-Infrared Person Re-Identification

Tk- text : Multi-shaped Scene Text Detection via Instance Segmentation

TKDN: Scene text Detection via Keypoints Detection

TLDR: text Based Last-Layer Retraining for Debiasing Image Classifiers

TlTScore: Towards Long-Tail Effects in text -to-Visual Evaluation with Generative Foundation Models

TLWSR: Weakly supervised real-world scene text image super-resolution using text label

TM2D: Bimodality Driven 3D Dance Generation via Music- text Integration

TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and text s

TMR: text -to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis

To Speak or to text : Effects of Display Type and I/O Style on Mobile Virtual Humans Nurse Training

To text or not to text- drivers' interpretation of traffic situations as the basis for their decision to (not) engage in text messaging

Token-Mixer: Bind Image and text in One Embedding Space for Medical Image Reporting

TokenBinder: text -Video Retrieval with One-to-Many Alignment Paradigm

TokenCompose: text -to-Image Diffusion with Token-Level Supervision

Tokenfocus-VQA: Enhancing text -to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs

Tool for Ground-Truthing text Lines and Characters in Off-Line Handwritten Chinese Documents, A

Top-down and bottom-up cues for scene text recognition

Topic Language Model Adaption for Recognition of Homologous Offline Handwritten Chinese text Image

Total- text : toward orientation robustness in scene text detection

Toward Automation in text -Based Video Retrieval with LLM Assistance

Toward Integrated Scene text Reading

Toward Open-World text -Driven Face Generation and Manipulation via StyleGAN3

Toward real text manipulation detection: New dataset and new solution

Toward text -independent Cross-lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset

Toward Understanding WordArt: Corner-Guided Transformer for Scene text Recognition

Toward Verifiable and Reproducible Human Evaluation for text -to-Image Generation

Towards Accurate Scene text Recognition With Semantic Reasoning Networks

Towards Accurate text -based Image Captioning with Content Diversity Exploration

Towards an Extensible and text -Oriented Analytical Semantic Trajectory Framework

Towards Automated Transcription of Label text from Pinned Insect Collections

Towards Cycle-Consistent Models for text and Image Retrieval

Towards Effective Usage of Human-Centric Priors in Diffusion Models for text -based Human Image Generation

Towards End-to-End text Spotting in Natural Scenes

Towards End-to-End text Spotting with Convolutional Recurrent Neural Networks

Towards End-to-End Unified Scene text Detection and Layout Analysis

Towards Fast and Accurate Image- text Retrieval With Self-Supervised Fine-Grained Alignment

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image- text Pre-Training

Towards Generic text -Line Extraction

Towards High-Fidelity text -Guided 3D Face Generation and Manipulation Using only Images

Towards Implicit text -Guided 3D Shape Generation

Towards Improved text -Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Towards Interactive Facial Image Inpainting by text or Exemplar Image

Towards Language-Free Training for text -to-Image Generation

Towards Modelling an Attention-Based text Localization Process

Towards Open Domain text -driven Synthesis of Multi-person Motions

Towards open-set text recognition via label-to-prototype learning

Towards robust and efficient text sign reading from a mobile phone

Towards Robust Curve text Detection With Conditional Spatial Expansion

Towards Robust Tampered text Detection in Document Image: New Dataset and New Solution

Towards Robust text -Guided Image Compression Under Modality Missing

Towards Scalable Human-aligned Benchmark for text -guided Image Editing

Towards Specific Domain Prompt Learning via Improved text Label Optimization

Towards text -guided 3D Scene Composition

Towards the Unseen: Iterative text Recognition by Distilling from Errors

Towards Unconstrained End-to-End text Spotting

Towards Understanding and Quantifying Uncertainty for text -to-Image Generation

Towards Understanding Cross and Self-Attention in Stable Diffusion for text -Guided Image Editing

Towards Unified Scene text Spotting Based on Sequence Generation

Towards Weakly Supervised text -to-Audio Grounding

Towards Weakly-Supervised text Spotting using a Multi-Task Transformer

Towards Zero-Shot Multi-Speaker Multi-Accent text -to-Speech Synthesis

TP-LReID: Lifelong person re-identification using text prompts

TP2O: Creative text Pair-to-object Generation Using Balance Swap-Sampling

TPA-Seg: Multi-Class Nucleus Segmentation Using text Prompts and Cross-Attention

TPA3D: Triplane Attention for Fast text -to-3d Generation

TPD-STR: text Polygon Detection with Split Transformers

TPEech: Target Speaker Extraction and Noise Suppression With Historical Dialogue text Cues

TPWGAN: Wavelet-aware text prior guided super-resolution for scene text images

Trace Controlled text to Image Generation

Tracking Based Multi-Orientation Scene text Detection: A Unified Framework With Dynamic Programming

Traffic Video Event Retrieval via text Query using Vehicle Appearance and Motion Attributes

Training on severely degraded text -line images

Training-Free Color-Style Disentanglement for Constrained text -to-Image Synthesis

Training-Free Location-Aware text -to-Image Synthesis

Training-free subject-enhanced attention guidance for compositional text -to-image generation

TrAME: Trajectory-Anchored Multi-View Editing for text -Guided 3D Gaussian Manipulation

Transcript Mapping for Handwritten text Lines Using Conditional Random Fields

Transferable Adversarial Attacks for Deep Scene text Detection

Transferring Image-CLIP to Video- text Retrieval via Temporal Relations

Transferring Knowledge From text to Video: Zero-Shot Anticipation for Procedural Actions

Transform invariant text extraction

Transformation of arc-form- text to linear-form-text suitable for OCR

Transformer models for enhancing AttnGAN based text to image generation

Transformer Reasoning Network for Image- text Matching and Retrieval

Transformer-based text Detection in the Wild

Transparent text Detection and Background Recovery

TransPixeler: Advancing text -to-Video Generation with Transparency

Trans text Net: Transducing Text for Recognizing Unseen Visual Relationships

Tree structure for word extraction from handwritten text lines

TriCoLo: Trimodal Contrastive Loss for text to Shape Retrieval

TriMatch: Triple Matching for text -to-Image Person Re-Identification

Trinity Detector: text -Assisted and Attention Mechanisms Based Spectral Fusion for Diffusion Generation Image Detection

TRIS: A multimodal and multitask framework for unifying text -image retrieval and referring image segmentation

TRTST: Arbitrary High-Quality text -Guided Style Transfer With Transformers

True color distributions of scene text and background

TS-RNN: text Steganalysis Based on Recurrent Neural Networks

TS2-Net: Token Shift and Selection Transformer for text -Video Retrieval

TSA-SCC: text Semantic-Aware Screen Content Coding With Ultra Low Bitrate

TSINIT: A Two-Stage Inpainting Network for Incomplete text

TTD: text -tag Self-distillation Enhancing Image-text Alignment in CLIP to Alleviate Single Tag Bias

TTDNet: An End-to-End Traffic text Detection Framework for Open Driving Environments

TTS: Hilbert Transform-Based Generative Adversarial Network for Tattoo and Scene text Spotting

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for text -to-Video Generation

Tuning-Free Image Customization with Image and text Guidance

Turbo3D: Ultra-fast text -to-3D Generation

Turboedit: Instant text -based Image Editing

TurboFill: Adapting Few-step text -to-image Model for Fast Image Inpainting

Turning a CLIP Model into a Scene text Detector

Turning a CLIP Model Into a Scene text Spotter

TV Commercial Detection Based on Shot Change and text Extraction

TV Program Classification Based on Face and text Processing

TV program segmentation using text -visual analysis

TVI-MFAN: A text -Visual Interaction Multilevel Feature Alignment Network for Visual Grounding in Remote Sensing

TVMTrailer: A text -Video-Music AIGC Framework for Film Trailer Generation

TWD: A New Deep E2E Model for text Watermark/Caption and Scene Text Detection in Video

Twitter Stream Analysis, Tweets, text s, SMS, Internet

Two approaches for text segmentation in web images

Two combination stages of clustered One-Class Classifiers for writer identification from text fragments

Two Stage SVM and kNN text Documents Classifier

Two-Level Rectification Attention Network for Scene text Recognition, A

Two-Pass Clustering Technique for Orientation-Invariant and Language-Independent text Localization

Two-stage hybrid binarization around fringe map based text line segmentation for document images

two-stage method for text line detection in historical documents, A

Two-stage Multimodality Fusion for High-performance text -based Visual Question Answering

Two-stage partial image- text clustering (TPIT-C)

two-stage scheme for text detection in video images, A

Two-Stage Seamless text Erasing on Real-World Scene Images

Txt2Img-MHN: Remote Sensing Image Generation From text Using Modern Hopfield Networks

Type-2 Fuzzy GMMs for Robust text -Independent Speaker Verification in Noisy Environments

Type-R: Automatically Retouching Typos for text -to-Image Generation

Typing in Mid Air: Assessing One- and Two-Handed text Input Methods of the Microsoft HoloLens 2

Typographical Features for Scene text Recognition

Typography With Decor: Intelligent text Style Transfer

UATST: Towards unpaired arbitrary text -guided style transfer with cross-space modulation

UATVR: Uncertainty-Adaptive text -Video Retrieval

Udiff text : A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

UFineBench: Towards text -based Person Retrieval with Ultra-Fine Granularity

UFOGen: You Forward Once Large Scale text -to-Image Generation via Diffusion GANs

UHaT: Urdu handwritten text dataset

Ump: Unified Modality-Aware Prompt Tuning for text -Video Retrieval

Unambiguous Scene text Segmentation With Referring Expression Comprehension

Unambiguous text Localization and Retrieval for Cluttered Scenes

Unambiguous text Localization, Retrieval, and Recognition for Cluttered Scenes

Unconstrained end-to-end text reading with feature rectification

Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from text to Image via CLIP Inversion

Uncorrelated Geo- text Inhibition Method Based on Voronoi K-Order and Spatial Correlations in Web Maps

Uncovering the Disentanglement Capability in text -to-Image Diffusion Models

Uncurated Image- text Datasets: Shedding Light on Demographic Bias

Understand Layout and Translate text : Unified Feature-Conductive End-to-End Document Image Translation

Understanding and Mitigating Toxicity in Image- text Pretraining Datasets: A Case Study on LLaVA

Understanding Handwritten text in a Structured Environment

Understanding Plane Geometry Problems by Integrating Relations Extracted from text and Diagram

Understanding Video Scenes through text : Insights from Text-based Video Question Answering

UniCanvas: Affordance-Aware Unified Real Image Editing via Customized text -to-Image Generation

Unidream: Unifying Diffusion Priors for Relightable text -to-3d Generation

Unified Adaptive Relevance Distinguishable Attention Network for Image- text Matching

Unified Approach for text -and Image-Guided 4D Scene Generation, A

Unified Coarse-to-Fine Alignment for Video- text Retrieval

Unified Contrastive Learning in Image- text -Label Space

Unified Framework for Multioriented text Detection and Recognition, A

Unified Framework for Tracking Based text Detection and Recognition from Web Videos, A

unified framework of data augmentation using large language models for text -based cross-modal retrieval, A

Unified learning for image- text alignment via multi-scale feature fusion

unified method for augmented incremental recognition of online handwritten Japanese and English text , A

Unified Performance Evaluation for OCR Zoning: Calculating Page Segmentation's Score, That Includes text Zones, Tables and Non-text Objects

Unified Pre-training with Pseudo text s for Text-To-Image Person Re-identification

Unified Prompt Attack Against text -to-Image Generation Models

Unified text Extraction Method for Instructional Videos, A

Unifying Vision, text , and Layout for Universal Document Processing

UniMultNet: Action recognition method based on multi-scale feature fusion and video- text constraint guidance

Uniprocessor: A text -induced Unified Low-level Image Processor

unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network, A

Unit Selection Using Linguistic, Prosodic and Spectral Distance for Developing text -to-Speech System in Hindi

UniTAB: Unifying text and Box Outputs for Grounded Vision-Language Modeling

Uniter: Universal Image- text Representation Learning

UniTMGE: Uniform text -Motion Generation and Editing Model via Diffusion

Unleashing text -to-Image Diffusion Models for Visual Perception

Unleashing text -to-image Diffusion Prior for Zero-shot Image Captioning

Unlocking text ual and Visual Wisdom: Open-vocabulary 3d Object Detection Enhanced by Comprehensive Guidance from Text and Image

Unpaired Image- text Matching via Multimodal Aligned Conceptual Knowledge

Unsupervised Alignment of News Video and text Using Visual Patterns and Textual Concepts

Unsupervised Approach for Video text Localization, An

Unsupervised Block Covering Analysis for text -Line Segmentation of Arabic Ancient Handwritten Document Images

Unsupervised categorization of heterogeneous text images based on fractals

Unsupervised clustering of text entities in heterogeneous grey level documents

Unsupervised Co-Generation of Foreground-Background Segmentation from text -to-Image Synthesis

Unsupervised Compositional Concepts Discovery with text -to-Image Generative Models

Unsupervised Cross-Modal Hashing Method Robust to Noisy Training Image- text Correspondences in Remote Sensing, An

Unsupervised deep learning for text line segmentation

Unsupervised Domain Adaptation via Class Aggregation for text Recognition

Unsupervised Domain Adaptation with Imbalanced Character Distribution for Scene text Recognition

Unsupervised Image and text Fusion for Travel Information Enhancement

Unsupervised language model adaptation for handwritten Chinese text recognition

Unsupervised Prompt Tuning for text -Driven Object Detection

Unsupervised refinement of color and stroke features for text binarization

Unsupervised Segmentation of text Fragments in Real Scenes

Unsupervised Speech text Localization in Comic Images

Unsupervised text Segmentation Using Color and Wavelet Features

Unsupervised text -to-image synthesis

Unsupervised writer adaptation applied to handwritten text recognition

Unveiling and Mitigating Memorization in text -to-image Diffusion Models Through Cross Attention

UP-Person: Unified Parameter-Efficient Transfer Learning for text -Based Person Retrieval

Urdu handwritten text recognition: a survey

Usage-Oriented Performance Evaluation for text Localization Algorithms

Use of a Dictionary in Conjunction with a Handwritten text s Recognizer

Use of an Evolutive Base of Models in a System for Reading Printed text s

Use of Captions and Other Collateral text in Understanding Photographs

Use of Collateral text in Image Interpretation

Use of Collateral text in Understanding Photos in Documents

Use of Global Con text in Text Recognition, The

Use of the Hough transform to separate merged text /graphics in forms

USER: Unified Semantic Enhancement With Momentum Contrast for Image- text Retrieval

Using a boosted tree classifier for text segmentation in hand-annotated documents

Using a Probabilistic Syllable Model to Improve Scene text Recognition

Using Adaptive Run Length Smoothing Algorithm for Accurate text Localization in Images

Using an Exact Performance of Hough Transform for Image text Segmentation

Using Biographical text s as Linked Data for Prosopographical Research and Applications

Using double attention for text tattoo localisation

Using Hidden Markov Models as a Tool for Handwritten text Line Segmentation

Using histogram representation and Earth Mover's Distance as an evaluation tool for text detection

Using irregular pyramid for text segmentation and binarization of gray scale images

Using Kernel Density Classifier with Topic Model and Cost Sensitive Learning for Automatic text Categorization

Using Large text To Image Models with Structured Prompts for Skin Disease Identification: A Case Study

Using Mouse Feedback in Computer Assisted Transcription of Handwritten text Images

Using Multimodal Contrastive Knowledge Distillation for Video- text Retrieval

Using Multiple Frame Integration for the text Recognition of Video

Using Object Information for Spotting text

Using pyramid of histogram of oriented gradients on natural scene text recognition

Using Readers' Highlighting on Monochromatic Documents for Automatic text Transcription and Summarization

Using Scale-Space Anisotropic Smoothing for text Line Extraction in Historical Documents

Using Shape and Layout Information to Find Signatures, text , and Graphics

Using text to Teach Image Retrieval

Using Typical Testors for Feature Selection in text Categorization

Using web search engines to improve text recognition

Using Webcast text for Semantic Event Detection in Broadcast Sports Video

UT-GAN: A Novel Unpaired text ual-Attention Generative Adversarial Network for Low-Light Text Image Enhancement

Utilization of text ure, contrast and color homogeneity for detecting and recognizing text from video frames

Uyghur Language text Detection in Complex Background Images Using Enhanced MSERs

Uyghur text Localization with Fast Component Detection

VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for text -to-Image Generative Models

Variable-Length Speaker Conditioning in Flow-Based text -to-Speech

Variance Based Image Binarization Scheme and Its Application in text Segmentation, A

Variational Bayes Method for Handwritten text Line Segmentation, A

Variational Distribution Learning for Unsupervised text -to-Image Generation

Variational DNN embeddings for text -independent speaker verification

VATr++: Choose Your Words Wisely for Handwritten text Generation

VCD- text ure: Variance Alignment Based 3D-2D Co-Denoising for Text-Guided Texturing

Vector Field Decomposition-Based Flow Matching for Zero-Shot Cross-Lingual text -to-Speech

Vector Quantized Diffusion Model for text -to-Image Synthesis

VectorFusion: text -to-SVG by Abstracting Pixel-Based Diffusion Models

VerbDiff: text -Only Diffusion Models with Enhanced Interaction Awareness

Verisimilar Image Synthesis for Accurate Detection and Recognition of text s in Scenes

Versatile Diffusion: text , Images and Variations All in One Diffusion Model

Vertical bar detection for gauging text similarity of document images

Vesselness for text detection in historical document images

VGSG: Vision-Guided Semantic-Group Network for text -Based Person Search

VicTR: Video-conditioned text Representations for Activity Recognition

Video Analysis -- Captions, text , Video Text

Video and text Matching with Conditioned Embeddings

Video and text semantic center alignment for text-video cross-modal retrieval

Video captioning with text -based dynamic attention and step-by-step learning

Video Diffusion, Video Synthesis, text to Video

Video Frame-wise Explanation Driven Contrastive Learning for Procedural text Generation

Video Generation from text Employing Latent Path Construction for Temporal Modeling

Video google: A text retrieval approach to object matching in videos

Video Question Answering Using Clip-Guided Visual- text Attention

Video Question Answering with Iterative Video- text Co-tokenization

Video Scene text Frames Categorization for Text Detection and Recognition

Video Script Identification Based on text Lines

Video search in concept subspace: a text -like paradigm

Video Search with CLIP and Interactive text Query Reformulation

Video text detection and recognition: Dataset and benchmark

Video text Detection System Based on Automated Training, A

Video text Detection With Robust Feature Representation

Video text Extraction Using the Fusion of Color Gradient and Log-Gabor Filter

video text location method based on background classification, A

Video text recognition using feature compensation as category-dependent feature extraction

Video text recognition using sequential Monte Carlo and error voting methods

Video text Tracking with a Spatio-Temporal Complementary Model

Video, text , and Speech-Driven Realistic 3-D Virtual Head for Human-Machine Interface, A

Video-ColBERT: Con text ualized Late Interaction for Text-to-Video Retrieval

Video- text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

Video- text Compliance: Activity Verification Based on Natural Language Instructions

Video- text Representation Learning via Differentiable Weak Temporal Alignment

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video- text Models

VideoDirector: Precise Video Editing via text -to-Video Models

VideoDreamer: Customized Multi-Subject text -to-Video Generation With Disen-Mix Finetuning on Language-Video Foundation Models

VideoMage: Multi-Subject and Motion Customization of text -to-Video Diffusion Models

ViewDiff: 3D-Consistent Image Generation with text -to-Image Models

ViLEM: Visual-Language Error Modeling for Image- text Retrieval

ViLNM: Visual-Language Noise Modeling for text -to-Image Person Retrieval

VimTS: A Unified Video and Image text Spotter for Enhancing the Cross-Domain Generalization

VinTAGe: Joint Video and text Conditioning for Holistic Audio Generation

VIRES: Video Instance Repainting via Sketch and text Guided Generation

Vision and text Transformer for Predicting Answerability on Visual Question Answering

Vision-Aware text Features in Referring Image Segmentation: From Object Understanding to Context Understanding

Vision-Language Matching for text -to-Image Synthesis via Generative Adversarial Networks

Vision-Language Pre-Training for Boosting Scene text Detectors

Vision-Language Relational Transformer for Video-to- text Generation

ViSTA: Vision and Scene text Aggregation for Cross-Modal Retrieval

Visual and text prompts guided interpretable network for universal low-dose CT MAR

Visual Attention Based Approach to text Extraction, A

Visual enhancement of incised text

Visual Re-ranking with Natural Language Understanding for text Spotting

Visual Semantic Reasoning for Image- text Matching

Visual Semantics: Extracting Visual Information from text Accompanying Pictures

Visual speaker authentication with random prompt text s by a dual-task CNN framework

Visual text Correction

Visual text Generation in the Wild

Visual text Recognition Through Contextual Processing

Visual Word Embedding for text Classification

Visual-Aware text as Query for Referring Video Object Segmentation

Visual-relation Conscious Image Generation from Structured- text

Visual- text ual Capsule Routing for Text-Based Video Segmentation

Visualizing Unstructured text Sequences Using Iterative Visual Clustering

Visually-Enabled Active Deep Learning for (Geo) text and Image Classification: A Review

VisualRAG: Knowledge-Guided Retrieval Augmentation for Image- text Matching

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

ViTA: An Efficient Video-to- text Algorithm using VLM for RAG-based Video Analysis System

Viterbi algorithm as an aid in text recognition, The

Vividdreamer: Invariant Score Distillation for Hyper-realistic text -to-3d Generation

VMC: Video Motion Customization Using Temporal Attention Adaption for text -to-Video Diffusion Models

VODiff: Controlling Object Visibility Order in text -to-Image Generation

VOLTER: Visual Collaboration and Dual-Stream Fusion for Scene text Recognition

VolTex: Food Volume Estimation Using text -Guided Segmentation and Neural Surface Reconstruction

VoP: text -Video Co-Operative Prompt Tuning for Cross-Modal Retrieval

Vox-E: text -guided Voxel Editing of 3D Objects

VP3D: Unleashing 2D Visual Prompt for text -to-3D Generation

VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image- text Matching

VSRNet: End-to-end video segment retrieval with text query

VTC: Improving Video- text Retrieval with User Comments

VTD-FCENet: A Real-Time HD Video text Detection with Scale-Aware Fourier Contour Embedding

VTPL: Visual and text prompt learning for visual-language models

VTQA: Visual text Question Answering via Entity Alignment and Cross-Media Reasoning

VX2 text : End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

W-A net: Leveraging Atrous and Deformable Convolutions for Efficient text Detection

Wacnet: Word Segmentation Guided Characters Aggregation Net for Scene text Spotting With Arbitrary Shapes

Was: Dataset and Methods for Artistic text Segmentation

Watch Your Steps: Local Image and Scene Editing by text Instructions

Watch Your Strokes: Improving Handwritten text Recognition with Deformable Convolutions

Watermark Removal Attack Against text -to-Image Generative Model Watermarking

Watermarking JBIG2 text Region for Image Authentication

Watermarking text document images using edge direction histograms

WaterVG: Waterway Visual Grounding Based on text -Guided Vision and mmWave Radar

Wave: Warping Ddim Inversion Features for Zero-shot text -to-video Editing

Wavelet feature domain adaptive noise reduction using learning algorithm for text -independent speaker recognition

Wavelet feature selection based neural networks with application to the text independent speaker identification

Wavelet-gradient-fusion for video text binarization

Weak supervision for generating pixel-level annotations in scene text segmentation

Weakly Supervised Attention Rectification for Scene text Recognition

Weakly Supervised Salient Object Detection with text Supervision

Weakly Supervised text -based Person Re-Identification

Weakly Supervised Video Moment Retrieval From text Queries

Weakly Supervised Video Representation Learning with Unaligned text for Sequential Videos

Weakly-Supervised 3D Spatial Reasoning for text -Based Visual Question Answering

Weakly-Supervised Alignment of Video with text

Weakly-Supervised text -driven Contrastive Learning for Facial Behavior Understanding

WEB Image Classification Based on the Fusion of Image and text Classifiers

Webly Supervised Image- text Embedding with Noisy Tag Refinement

WECROMCL: Weakly Supervised Cross-modality Contrastive Learning for Transcription-only Supervised text Spotting

Weighted Graph Embedding Feature with Bi-Directional Long Short-Term Memory Classifier for Multi-Document text Summarization

Well-calibrated confidence measures for multi-label text classification with a large number of labels

WeStcoin: Weakly-Supervised Con text ualized Text Classification with Imbalance and Noisy Labels

We text : Scene Text Detection under Weak Supervision

WETM: A word embedding-based topic model with modified collapsed Gibbs sampling for short text

What Are You Talking About? text -to-Image Coreference

What does scene text tell us?

What If We Only Use Real Datasets for Scene text Recognition? Toward Scene Text Recognition With Fewer Labels

What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions

What is the Real Need for Scene text Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties

What Is Wrong With Scene text Recognition Model Comparisons? Dataset and Model Analysis

What Machines See Is Not What They Get: Fooling Scene text Recognition Models With Adversarial Text Images

When IC meets text : Towards a rich annotated integrated circuit text dataset

Where you edit is what you get: text -guided image editing with region-based attention

Which super-resolution algorithm is proper for Farsi text image sequences

Who's Waldo? Linking People Across text and Images

Whole is Greater than Sum of Parts: Recognizing Scene text Words

Wikipedia-based semantic tensor space model for text analytics, A

Word Extraction from On-Line Handwritten text Lines

Word Image Matching as a Technique for Degraded text Recognition

Word segmentation in handwritten Korean text lines based on gap clustering techniques

Word segmentation of printed text lines based on gap clustering and special symbol detection

Word separation of unconstrained handwritten text lines in PCR forms

Word Shape Analysis in a Knowledge-Based System for Reading text

Word spotting and recognition via a joint deep embedding of image and text

Wordfence: text detection in natural images with border awareness

Wordrobe: text -guided Generation of Textured 3d Garments

Words Matter: Scene text for Image Classification and Retrieval

Words or Vision: Do Vision-Language Models Have Blind Faith in text ?

WordSup: Exploiting Word Annotations for Character Based text Detection

WOUAF: Weight Modulation for User Attribution and Fingerprinting in text -to-Image Diffusion Models

Write a Classifier: Predicting Visual Classifiers from Unstructured text

Writer identification using text line based features

Writer-aware CNN for parsimonious HMM-based offline handwritten Chinese text recognition

Writing speed normalization for on-line handwritten text recognition

X-Edit: Detecting and Localizing Edits in Images Altered by text -Guided Diffusion Models

X-Mesh: Towards Fast and Accurate text -driven 3D Stylization via Dynamic Textual Guidance

X-Pool: Cross-Modal Language-Video Attention for text -Video Retrieval

You'll Never Walk Alone: A Sketch and text Duet for Fine-Grained Image Retrieval

Your Student is Better than Expected: Adaptive Teacher-Student Collaboration for text -Conditional Diffusion Models

Zero-Painter: Training-Free Layout Control for text -to-Image Synthesis

Zero-Shot Composed Image Retrieval Considering Query-Target Relationship Leveraging Masked Image- text Pairs

Zero-Shot Contrastive Loss for text -Guided Diffusion Image Style Transfer

Zero-shot skeleton-based action recognition with dual visual- text alignment

Zero-shot spatial layout conditioning for text -to-image diffusion models

Zero-Shot Styled text Image Generation, but Make It Autoregressive

Zero-Shot Temporal Action Detection by Learning Multimodal Prompts and text -Enhanced Actionness

Zero-Shot text Classification with Semantically Extended Graph Convolutional Network

Zero-Shot text -Driven Dynamic Neural Radiance Fields Stylization

Zero-Shot text -Guided Object Generation with Dream Fields

Zero-Shot text -to-Parameter Translation for Game Character Auto-Creation

Zero-Shot Video Moment Retrieval With Angular Reconstructive text Embeddings

ZeroCap: Zero-Shot Image-to- text Generation for Visual-Semantic Arithmetic

Zone identification in the printed Gujarati text

4233 for text

_	text	_
2D and 3D Video Scene	text	Classification
2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene	text	Recognition
360PanT: Training-Free	text	-Driven 360-Degree Panorama-to-Panorama Translation
3D Highlighter: Localizing Regions on 3D Shapes via	text	Descriptions
3D Human Motion Generation from the	text	Via Gesture Action Classification and the Autoregressive Model
3D-Aware	text	-Driven Talking Avatar Generation
3D-SceneDreamer:	text	-Driven 3D-Consistent Scene Generation
3D-VisTA: Pre-trained Transformer for 3D Vision and	text	Alignment
4D-fy:	text	-to-4D Generation Using Hybrid Score Distillation Sampling
A-STAR: Test-time Attention Segregation and Retention for	text	-to-image Synthesis
ABCNet v2: Adaptive Bezier-Curve Network for Real-Time End-to-End	text	Spotting
ABCNet: Real-Time Scene	text	Spotting With Adaptive Bezier-Curve Network
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene	text	Spotting
Ablating Concepts in	text	-to-Image Diffusion Models
Accurate Arbitrary-Shaped Scene	text	Detection via Iterative Polynomial Parameter Regression
Accurate Detection for Scene	text	s with a Cascaded CNN Networks
Accurate Scene	text	Detection Through Border Semantics Awareness and Bootstrapping
Accurate Scene	text	Detection Via Scale-Aware Data Augmentation and Shape Similarity Constraint
Accurate Scene	text	Recognition Based on Recurrent Neural Network
Accurate Scene	text	Recognition with Efficient Model Scaling and Cloze Self-Distillation
Accurate Segmentation-Based Scene	text	Detector with Context Attention and Repulsive Text Border, An
Accurate	text	localization in images based on SVM output scores
Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped	text	, An
Accurate video	text	detection through classification of low and high contrast images
Accurate, data-efficient, unconstrained	text	recognition with convolutional neural networks
ACE: Anti-Editing Concept Erasure in	text	-to-Image Models
Acquire and then Adapt: Squeezing out	text	-to-Image Model for Image Restoration
ActBERT: Learning Global-Local Video-	text	Representations
Active Collection of Land Cover Sample Data from Geo-Tagged Web	text	s
Active Contours Network to Straighten Distorted	text	Lines
active learning approach to frequent itemset-based	text	clustering, An
Active Learning With Complementary Sampling for Instructing Class-Biased Multi-Label	text	Emotion Classification
Activity Recognition Applications from Con	text	ual Video-Text Fusion
ActivityCLIP: Enhancing group activity recognition by mining complementary information from	text	to supplement image modality
Actor and Action Modular Network for	text	-Based Video Segmentation
AdaBoost for	text	Detection in Natural Scene
Adapting Style and Content for Attended	text	Sequence Recognition
Adapting	text	-to-Image Generation with Feature Difference Instruction for Generic Image Restoration
Adaptive Algorithm for	text	Detection from Natural Scenes, An
Adaptive Boundary Proposal Network for Arbitrary Shape	text	Detection
Adaptive Correlation Filtering Method for	text	-Based Person Search, An
Adaptive Fuzzy	text	Segmentation in Images with Complex Backgrounds Using Color and Texture
Adaptive fuzzy wavelet algorithm for	text	-independent speaker recognition
Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured	text
Adaptive Latent Graph Representation Learning for Image-	text	Matching
Adaptive method for multi colored	text	binarization
Adaptive multi-	text	union for stable text-to-image synthesis learning
Adaptive Offline Quintuplet Loss for Image-	text	Matching
Adaptive Region Growing Color Segmentation for	text	Using Irregular Pyramid
Adaptive Scene	text	Detection Based on Transferring Adaboost
Adaptive scene-	text	binarisation on images captured by smartphones
Adaptive Script-Independent Block-Based	text	Line Extraction, An
Adaptive Script-Independent	text	Line Extraction
Adaptive	text	Recognition Through Visual Matching
Adding Conditional Control to	text	-to-Image Diffusion Models
Addressing Information Inequality for	text	-Based Person Search via Pedestrian-Centric Visual Denoising and Bias-Aware Alignments
ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene	text	Detection
Advance One-Shot Multispectral Instance Detection With	text	's Supervision
Advancing Zero-Shot Digital Human Quality Assessment Through	text	-Prompted Evaluation
Adversarial and Isotropic Gradient Augmentation for Image Retrieval With	text	Feedback
Adversarial Attribute-	text	Embedding for Person Search With Natural Language Query
Adversarial learning based attentional scene	text	recognizer
Adversarial Representation Learning for	text	-to-Image Matching
Adversarial Robustification via	text	-to-image Diffusion Models
Adversarial Synthesis of Human Pose from	text
Adversarial	text	to Continuous Image Generation
Adversarial Training Lattice LSTM for Named Entity Recognition of Rail Fault	text	s
ADVMIX: Data Augmentation for Accurate Scene	text	Spotting
Ae	text	spotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
AEA-FIRM: Adaptive Elastic Alignment with Fine-Grained Representation Mining for	text	-Based Aerial Pedestrian Retrieval
Aesthetic	text	Logo Synthesis via Content-aware Layout Inferring
Affective Image Editing: Shaping Emotional Factors via	text	Descriptions
Affective Image Filter: Reflecting Emotions from	text	to Images
Agent-Based Control Prompt Tuning for Video-	text	Retrieval
Aggregating Image and	text	Quantized Correlated Components
Aggregating Local and Global	text	Features for Linguistic Steganalysis
Aggregating Local Con	text	for Accurate Scene Text Detection
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of	text	-to-Video Generation with LMM
AITTI: Learning Adaptive Inclusive Token for	text	-to-Image Generation
Aletheia: An Advanced Document Layout and	text	Ground-Truthing System for Production Environments
Algorithm for Colour-Based Natural Scene	text	Segmentation, An
Algorithm for Matching OCR-Generated	text	Strings, An
Algorithm for Reducing	text	Line Candidates of Incorrect Orientation, An
Algorithm for	text	page up/down orientation determination
Algorithms for compressing compound document images with large	text	/background overlap
ALIF: A dataset for Arabic embedded	text	recognition in TV broadcast
Align and Retrieve: Composition and Decomposition Learning in Image Retrieval With	text	Feedback
Align Your Gaussians:	text	-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models
Aligning	text	and Document Illustrations: Towards Visually Explainable Digital Humanities
Aligning	text	-to-Image Diffusion Models With Constrained Reinforcement Learning
AlignIT: Enhancing Prompt Alignment in Customization of	text	-to-Image Models
Alignment and Generation Adapter for Efficient Video-	text	Understanding
Alignment of Curved	text	Strings for Enhanced OCR Readability
Alignment of free layout color	text	s for character recognition
Alignment of Paragraphs in Bilingual	text	s Using Bilingual Dictionaries and Dynamic Programming
All You Need Is a Second Look: Towards Arbitrary-Shaped	text	Detection
ALR-GAN: Adaptive Layout Refinement for	text	-to-Image Synthesis
alternative framework for univariate filter based feature selection for	text	categorization, An
ALTID : Arabic/Latin	text	Images Database for recognition research
AMITA: Attribute-Guided Masked Image-	text	Alignment for Multi-Label Image Representation
AMO Sampler: Enhancing	text	Rendering with Overshooting
Analysis of Features and Metrics for Alignment in	text	-Dependent Voice Conversion
Analysis of the Novel Transformer Module Combination for Scene	text	Recognition
Analytical evaluation of term weighting schemes for	text	categorization
anchor-free region proposal network for Faster R-CNN-based	text	detection approaches, An
Ancient document analysis based on	text	line extraction
AniClipart: Clipart Animation with	text	-to-Video Priors
Animatabledreamer:	text	-guided Non-rigid 3d Model Generation and Reconstruction with Canonical Score Distillation
AniMo: Species-Aware Model for	text	-Driven Animal Motion Generation
Annotated Databases for the Recognition of Screen-Rendered	text
ANNP: a neural network parser for real world	text	s
Anonymizing Temporal Phrases in Natural Language	text	to be Posted on Social Networking Services
Anti-DreamBooth: Protecting users from personalized	text	-to-image synthesis
Anycontrol: Create Your Artwork with Versatile Control on	text	-to-image Generation
AnyFace++: A Unified Framework for Free-Style	text	-to-Face Synthesis and Manipulation
AnyFace: Free-style	text	-to-Face Synthesis and Manipulation
AON: Towards Arbitrarily-Oriented	text	Recognition
Application of autoregressive models to the study of the temporal structure of a handwritten	text
Application of Cluster Detection to	text	and Picture Processing, An
Application of Novel Chaotic Neural Networks to	text	Classification Based on PCA
Application of Planar Motion Segmentation for Scene	text	Extraction
Apply Hierarchical-Chain-of-Generation to Complex Attributes	text	-to-3D Generation
Applying GIS and	text	Mining Methods to Twitter Data to Explore the Spatiotemporal Patterns of Topics of Interest in Kuwait
Applying the conjugate gradient method for	text	document categorization
approach for detecting and cleaning of struck-out handwritten	text	, An
approach for handwritten Chinese	text	recognition unifying character segmentation and recognition, An
Approach for Recognizing	text	Labels in Raster Maps, An
approach to extracting the target	text	line from a document image captured by a pen scanner, An
approach to get overall emotion from comment	text	towards a certain image uploaded to social network using Latent Semantic Analysis, An
Approximate String Match for Garbled	text	with Various Accuracies, An
Arabic character recognition system: A statistical approach for recognizing cursive typewritten	text
Arabic hand-written	text	-line extraction
Arabic handwritten	text	s clusterization based on Feature Relation Graph (FRG)
Arabic ligatures: Analysis and application in	text	recognition
Arabic	text	detection in videos using neural and boosting-based approaches: Application to video indexing
Arbitrarily oriented	text	detection using geodesic distances between corners and skeletons
Arbitrarily Shaped Scene	text	Detection With a Mask Tightness Text Detector
Arbitrarily shaped scene	text	detection with dynamic convolution
Arbitrarily-Oriented	text	Detection in Low Light Natural Scene Images
Arbitrary Shape Scene	text	Detection With Adaptive Text Region Representation
Arbitrary Shape	text	Detection using Transformers
Arbitrary Shape	text	Detection via Boundary Transformer
Arbitrary Shape	text	Detection via Segmentation with Probability Maps
Arbitrary Style Guidance for Enhanced Diffusion-Based	text	-to-Image Generation
Arbitrary-Oriented Scene	text	Detection via Rotation Proposals
Arbitrary-Shape Scene	text	Detection via Visual-Relational Rectification and Contour Approximation
architecture for handwritten	text	recognition systems, An
Are 2D-LSTM really dead for offline	text	recognition?
Are All Combinations Equal? Combining	text	ual and Visual Features with Multiple Space Learning for Text-based Video Retrieval
Are Digraphs Good for Free-	text	Keystroke Dynamics?
Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in	text
ARES:	text	-Driven Automatic Realistic Simulator for Autonomous Traffic
ARRPNGAN:	text	-to-image GAN with attention regularization and region proposal networks
ArtAdapter:	text	-to-Image Style Transfer using Multi-Level Style Encoder and Explicit Adaptation
ArtGlyphDiffuser:	text	-driven artistic glyph generation via Style-to-CLIP Projection and Multi-Level Controlled diffusion
ARTIST: Improving the Generation of	text	-Rich Images with Disentangled Diffusion Models and Large Language Models
Artistic Style Transfer via Fine-Grained	text	Guidance and Contrastive Semantics Similarity
ART•V: Auto-Regressive	text	-to-Video Generation with Diffusion Models
ASAYAR: A Dataset for Arabic-Latin Scene	text	Localization in Highway Traffic Panels
Assessing Affective Dimensions of Play in Psychodynamic Child Psychotherapy via	text	Analysis
Assessing Image and	text	Generation with Topological Analysis and Fuzzy Logic
Assessing similarity in handwritten	text	s
Assessing User Experience of	text	Readability with Eye Tracking in Virtual Reality
Assistive	text	Reading from Complex Background for Blind Persons
Associating	text	and graphics for scientific chart understanding
ASTER: An Attentional Scene	text	Recognizer with Flexible Rectification
ASTS: A Unified Framework for Arbitrary Shape	text	Spotting
Asymmetric Cross-Scale Alignment for	text	-Based Person Search
ATA: Adaptive Transformation Agent for	text	-Guided Subject-Position Variable Background Inpainting
ATM: Attentional	text	Matting
AToM: Aligning	text	-to-Motion Model at Event-Level with GPT-4Vision Reward
ATT3D: Amortized	text	-to-3D Object Synthesis
Attend, Correct and Focus: A Bidirectional Correct Attention Network for Image-	text	Matching
Attention Calibration for Disentangled	text	-to-Image Personalization
Attention Guidance by Cross-Domain Supervision Signals for Scene	text	Recognition
attention-based row-column encoder-decoder model for	text	recognition in Japanese historical documents, An
Attention-Bridged Modal Interaction for	text	-to-Image Generation
Attentionhand:	text	-driven Controllable Hand Image Generation for 3d Hand Reconstruction in the Wild
AttnGAN: Fine-Grained	text	to Image Generation with Attentional Generative Adversarial Networks
Attribute-Centric Compositional	text	-to-Image Generation
Attribute-Centric Cross-Modal Alignment for Weakly Supervised	text	-Based Person Re-ID
AttriDiffuser: Adversarially enhanced diffusion model for	text	-to-facial attribute image synthesis
AttT2M:	text	-Driven Human Motion Generation with Multi-Perspective Attention Mechanism
Audio Visual Segmentation through	text	Embeddings
Audio-Enhanced	text	-to-Video Retrieval using Text-Conditioned Feature Alignment
Authenticating Binary	text	Documents Using a Localising OMAC Watermark Robust to Printing and Scanning
Automated cartographic	text	placement
Automated Detection of Adverse Drug Events from Older Patients' Electronic Medical Records Using	text	Mining
automatic algorithm for	text	skew estimation in document images using recursive morphological transforms, An
Automatic annotation of unique locations from video and	text
Automatic Chinese	text	Classification Using Character-Based and Word-Based Approach
Automatic Concept Discovery from Parallel	text	and Visual Corpora
Automatic Detection and Localization of Natural Scene	text	in Video
Automatic detection and recognition of Korean	text	in outdoor signboard images
Automatic diacritization of Arabic	text	using recurrent neural networks
Automatic discrimination of	text	and non-text natural images
Automatic document classification using	text	and images
Automatic dottization of Arabic	text	(Rasms) using deep recurrent neural networks
Automatic extraction of correlation-entropy features for	text	document analysis directly in run-length compressed domain
Automatic Feature Extraction and	text	Recognition From Scanned Topographic Maps
Automatic identification and skew estimation of	text	lines in real scene images
Automatic Identification of	text	in Digital Video Key Frames
Automatic image-	text	alignment for large-scale web image indexing and retrieval
Automatic Inpainting Scheme for Video	text	Detection and Removal
Automatic Labeling for Scene	text	Database
Automatic news video segmentation and categorization based on closed-captioned	text
Automatic performance evaluation for video	text	detection
Automatic Performance Evaluation Protocol for Video	text	Detection Algorithms, An
Automatic recognition of printed arabic	text	using neural network classifier
Automatic Recognition of Printed Farsi	text	s
Automatic Segmentation of Printed Persian (Farsi)	text
Automatic segmentation of the IAM off-line database for handwritten English	text
Automatic separation of machine-printed and hand-written	text	lines
Automatic	text	area segmentation in natural images
Automatic	text	Detection and Recognition
Automatic	text	detection and removal in video sequences
Automatic	text	Detection and Tracking in Digital Video
Automatic	text	detection for mobile augmented reality translation
Automatic	text	Extraction from Arabic Newspapers
Automatic	text	Extraction from Video for Content-Based Annotation and Retrieval
Automatic	text	Extraction in Digital Video Based on Motion Analysis
Automatic	text	Location in Images and Video Frames
Automatic	text	location in natural scene images
Automatic	text	location using cluster-based template matching
Automatic	text	processing
Automatic	text	segmentation from complex background
Automatic tracing and extraction of	text	-line and word segments directly in JPEG compressed document images
Automatic writer identification from	text	line images
Autonomous Document Cleaning: A Generative Approach to Reconstruct Strongly Corrupted Scanned	text	s
Autonomous	text	Capturing Robot Using Improved DCT Feature and Text Tracking
AutoSplice: A	text	-prompt Manipulated Image Dataset for Media Forensics
Autostr: Efficient Backbone Search for Scene	text	Recognition
Auxiliary captioning: Bridging image-	text	matching and image captioning
AvatarCraft: Transforming	text	into Neural Human Avatars with Parameterized Shape and Pose Control
AvatarStudio: High-Fidelity and Animatable 3D Avatar Creation from	text
Awesome Typography: Statistics-Based	text	Effects Transfer
Background-Insensitive Scene	text	Recognition with Text Semantic Segmentation
Bag of Embedded Words learning for	text	retrieval
Bag of features approach for offline	text	-independent Chinese writer identification
Balancing Optimization Strategies and Practical Goals: An Efficient Scene	text	Detector
BAMG:	text	-based Person Re-identification via Bottlenecks Attention and Masked Graph Modeling
Baseline detection of multi-lingual unconstrained handwritten	text	lines
BATINeT: Background-Aware	text	to Image Synthesis and Manipulation Network
Bayesian Similarity Model Estimation for Approximate Recognized	text	Search
Bayesian Super-Resolution of	text	in Video with a Text-Specific Bimodal Prior
Bayesian-based method of unconstrained handwritten offline Chinese	text	line recognition, A
BDNet: A BERT-based dual-path network for	text	-to-image cross-modal person re-identification
Be Yourself: Bounded Attention for Multi-subject	text	-to-image Generation
Beatrix: A Self-Learning System for Off-Line Recognition of Handwritten	text	s
Being Comes from Not-Being: Open-Vocabulary	text	-to-Motion Generation with Wordless Training
Belief Mining in Persian	text	s Based on Deep Learning and Users' Opinions
Benchmark for Chinese-English Scene	text	Image Super-resolution, A
Benchmark for Controllable	text	-Image-to-Video Generation, A
Benchmarking Robustness to	text	-Guided Corruptions
better fitness measure of a	text	-document for a given set of keywords, A
Beyond Coarse-grained Matching in Video-	text	Retrieval
Beyond One and Two Tower: Cross-Modal Consensus Learning for Image-	text	Retrieval
Beyond	text	QA: Multimedia Answer Generation by Harvesting Web Information
Beyond	text	: Frozen Large Language Models in Visual Signal Comprehension
Beyond verbs: Understanding actions in videos with	text
Beyond visual semantics: Exploring the role of scene	text	in image understanding
Bi-Attention enhanced representation learning for image-	text	matching
Bi-Directional Image-	text	Retrieval With Position Attention and Similarity Filtering
Bi-Directional Spatial-Semantic Attention Networks for Image-	text	Matching
Bi-directional Training for Composed Image Retrieval via	text	Prompt Learning
Bi-modal Handwritten	text	Corpus: Baseline Results, A
Bi-modal Handwritten	text	Recognition (BiHTR) ICPR 2010 Contest Report
Bi-tonal image non-	text	matter removal with run length and connected component analysis
Bi-VLGM: Bi-Level Class-Severity-Aware Vision-Language Graph Matching for	text	Guided Medical Image Segmentation
Bidirectional extraction and recognition of scene	text	with layout consistency
Bilevel Feature Extraction-Based	text	Mining for Fault Diagnosis of Railway Systems
Bilingual	text	Classification
Bilingual, Open World Video	text	Dataset and Real-Time Video Text Spotting With Contrastive Learning, A
BiLMa: Bidirectional Local-Matching for	text	-based Person Re-identification
Bimodal beta mixture distribution for enhanced OOD inner-differentiation in multi-class	text	classification
Binarization and cleanup of handwritten	text	from carbon copy medical form images
Binarization of low quality	text	using a Markov random field model
Binarization-Free Clustering Approach to Segment Curved	text	Lines in Historical Manuscripts, A
Binary	text	image compression using overlapping rectangular partitioning
Binary	text	image file preprocessing to account for printer dot gain
Biometric Recognition Based on Free-	text	Keystroke Dynamics
Biometric recognition using online uppercase handwritten	text
Bipartite Graph Coarsening for	text	Classification Using Graph Neural Networks
BiSeR-LMA: A Bidirectional Semantic Reasoning and Large Model Enhancement Approach for	text	-Video Cross-Modal Retrieval
BizGen: Advancing Article-level Visual	text	Rendering for Infographics Generation
Blended Diffusion for	text	-driven Editing of Natural Images
Blending-NeRF:	text	-Driven Localized Editing in Neural Radiance Fields
Blind Deblurring of	text	Images Using a Text-Specific Hybrid Dictionary
Blind deblurring	text	images via Beltrami regularization
blind deconvolution model for scene	text	detection and recognition in video, A
Blind Source Separation Techniques for Detecting Hidden	text	s and Textures in Document Images
Blind	text	images deblurring based on a generative adversarial network
BlobGEN-Vid: Compositional	text	-to-Video Generation with Blob Video Representations
Block Segmentation and	text	Extraction in Mixed Text/Image Documents
BLSTM-based handwritten	text	recognition using Web resources
BLTRCNN-Based 3-D Articulatory Movement Prediction: Learning Articulatory Synchronicity From Both	text	and Audio Inputs
Boosting SpLSA for	text	Classification
Boosting	text	-To-Image Person Re-Identification With Generative Hard Negative
Boosting Weakly-Supervised Temporal Action Localization with	text	Information
Boosting-based transductive learning for	text	detection
Bootstrapping	text	Recognition from Stop Words
Bordernet: An Efficient Border-attention	text	Detector
BOTH2Hands: Inferring 3D Hands from Both	text	Prompts and Body Dynamics
Bottom-Up Scene	text	Detection with Markov Clustering Networks
Boundary	text	Spotter: Toward Arbitrary-Shaped Scene Text Spotting
Boundary-Aware Arbitrary-Shaped Scene	text	Detector With Learnable Embedding Network
Box It to Bind It: Unified Layout Control and Attribute Binding in	text	-to-Image Diffusion Models
BoxDiff:	text	-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Breaking	text	-Based CAPTCHA with Sparse Convolutional Neural Networks
Breaking	text	-based CAPTCHAs with variable word and character orientation
Breaking The Limits of	text	-conditioned 3D Motion Synthesis with Elaborative Descriptions
BreakingNews: Article Annotation by Image and	text	Processing
Breathing Life Into Sketches Using	text	-to-Video Priors
Bridge-GAN: Interpretable Representation Learning for	text	-to-Image Synthesis
Bridging Different Language Models and Generative Vision Models for	text	-to-image Generation
Bridging Synthetic and Real Worlds for Pre-training Scene	text	Detectors
Bridging the Gap Between Audio and	text	Using Parallel-Attention for User-Defined Keyword Spotting
Bridging the Gap Between End-to-End and Two-Step	text	Spotting
Bridging Video and	text	: A Two-Step Polishing Transformer for Video Captioning
Bridging Video-	text	Retrieval with Multiple Choice Questions
BRsyn-Caps: Chinese	text	Classification Using Capsule Network Based on Bert and Dependency Syntax
BTS: A Bi-lingual Benchmark for	text	Segmentation in the Wild
Building compact recognizer with recognition rate maintained for on-line handwritten Japanese	text	recognition
Building	text	features for object image classification
BURSTS: A bottom-up approach for robust spotting of	text	s in scenes
ByTheWay: Boost Your	text	-to-Video Generation Model to Higher Quality in a Training-free Way
C-CLIP: Contrastive Image-	text	Encoders to Close the Descriptive-Commentative Gap
C-Net: A Compression-Based Lightweight Network for Machine-Generated	text	Detection
C4Synth: Cross-Caption Cycle-Consistent	text	-to-Image Synthesis
Cache-aided cross-modal correlation correction for unsupervised cross-domain	text	-based person search
CAETFN: Con	text	Adaptively Enhanced Text-Guided Fusion Network for Multimodal Sentiment Analysis
CAMEL: CAusal Motion Enhancement Tailored for Lifting	text	-Driven Video Editing
Camera based degraded	text	recognition using grayscale feature
Camera	text	Recognition based on Perspective Invariants
Camera-based analysis of	text	and documents: a survey
CAMP: Cross-Modal Adaptive Message Passing for	text	-Image Retrieval
CamType: assistive	text	entry using gaze with an off-the-shelf webcam
Can Generative Adversarial Networks Teach Themselves	text	Segmentation?
Can	text	-to-Video Generation help Video-Language Alignment?
Canny	text	Detector: Fast and Robust Scene Text Localization Algorithm
Cap4Video: What Can Auxiliary Captions Do for	text	-Video Retrieval?
Capacity of	text	Marking Channel
CapsFusion: Rethinking Image-	text	Data at Scale
Caption	text	extraction for indexing purposes using a hierarchical region-based image model
Caption	text	recognition in video frames by MAP matching
cascade detector for	text	detection in natural scene images, A
Cascaded Segmentation-Detection Networks for	text	-Based Traffic Sign Detection
Caseg: CLIP-Based Action Segmentation with Learnable	text	Prompt
CAT-TPT: Class-Agnostic	text	-based Test-time Prompt Tuning for Vision-Language Models
CatVersion: Concatenating Embeddings for Diffusion-Based	text	-to-Image Personalization
Causality-Driven Explainable Multimodal Fusion With Visual-	text	Parallel Computing for Cloth-Changing Pedestrian Re-Identification
CBNet: A Plug-and-Play Network for Segmentation-Based Scene	text	Detection
CCDPlus: Towards Accurate Character to Character Distillation for	text	Recognition
CDistNet: Perceiving Multi-domain Character Distance for Robust	text	Recognition
CD	text	: Scene text detector based on context-aware deformable transformer
CE-	text	: A context-Aware and embedded text detector in natural scene images
CelebV-	text	: A Large-Scale Facial Text-Video Dataset
Center	text	Spotter: A Novel Text Spotter for Autonomous Unmanned Vehicles
CFOR: Character-First Open-Set	text	Recognition via Context-Free Learning
CGNN: Caption-assisted graph neural network for image-	text	retrieval
Challenges in Content-Based Image Indexing of Cultural Heritage Collections: Support vector machine active learning with applications to	text	classification
Character Energy and Link Energy-Based	text	Extraction in Scene Images
Character extraction in web image for	text	recognition
Character feature Alignment-based scene	text	spotter
Character Grounding and Re-identification in Story of Videos and	text	Descriptions
Character Position-Aware Compression Framework for Screen	text	Image, A
Character Region Attention for	text	Spotting
Character Region Awareness for	text	Detection
Character Segmentation of Handwritten Bangla	text	by Vertex Characterization of Isothetic Covers
Character Segmenting Techniques for Handwritten	text	: A Survey
Character-Aware Sampling and Rectification for Scene	text	Recognition
Character-Level Interaction in Computer-Assisted Transcription of	text	Images
Character-Level Interaction in Multimodal Computer-Assisted Transcription of	text	Images
Character-like region verification for extracting	text	in scene images
Character-Position-Free On-Line Handwritten Japanese	text	Recognition by Two Segmentation Methods
Character-Stroke Detection for	text	-Localization and Extraction
Characterization and classification of semantic image-	text	relations
Characterness: An Indicator of	text	in the Wild
Chat-edit-3d: Interactive 3d Scene Editing via	text	Prompts
ChatGen: Automatic	text	-to-Image Generation From FreeStyle Chatting
ChatTraffic:	text	-to-Traffic Generation via Diffusion Model
Check, Locate, Rectify: A Training-Free Layout Calibration System for	text	-to-Image Generation
Chinese Street View	text	: Large-Scale Chinese Text Reading With Partially Supervised Learning
Chinese	text	distinction and font identification by recognizing most frequently used characters
Chinese	text	Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning
Chinese/Kanji	text	and Data Processing
Choose What You Need: Disentangled Representation Learning for Scene	text	Recognition, Removal and Editing
Circle	text	Expansion as Low-Rank Textures
CiteTracker: Correlating Image and	text	for Visual Tracking
CKD: Cross-Task Knowledge Distillation for	text	-to-Image Synthesis
Class dependent feature scaling method using naive Bayes classifier for	text	datamining
Class-agnostic Object Counting with	text	-to-image Diffusion Model
Class-Aware Mask-guided feature refinement for scene	text	recognition
Class-Balanced	text	to Image Synthesis With Attentive Generative Adversarial Network
Class-dependent projection based method for	text	categorization
Classification Architecture Based on Connected Components for	text	Detection in Unconstrained Environments, A
Classification of Machine Printed and Handwritten	text	s Using Character Block Layout Variance
Classification of Noisy Free-	text	Prostate Cancer Pathology Reports Using Natural Language Processing
Classification of	text	Documents
Classification of	text	documents based on score level fusion approach
Classification with reject option in	text	categorisation systems
Classifying networked	text	data with positive and unlabeled examples
CLEval: Character-Level Evaluation for	text	Detection and Recognition Tasks
CLIP is Almost All You Need: Towards Parameter-Efficient Scene	text	Retrieval without OCR
CLIP is Also an Efficient Segmenter: A	text	-Driven Approach for Weakly Supervised Semantic Segmentation
CLIP-Actor:	text	-Driven Recommendation and Stylization for Animating Human Meshes
CLIP-Driven Fine-Grained	text	-Image Person Re-Identification
CLIP-Event: Connecting	text	and Images with Event Structures
CLIP-Forge: Towards Zero-Shot	text	-to-Shape Generation
CLIP-GAN: Stacking CLIPs and GAN for Efficient and Controllable	text	-to-Image Synthesis
CLIP-NeRF:	text	-and-Image Driven Manipulation of Neural Radiance Fields
CLIP2GAN: Toward Bridging	text	With the Latent Space of GANs
CLIP2Protect: Protecting Facial Privacy Using	text	-Guided Makeup via Adversarial Latent Search
Clip2Sam: Enhanced End-to-End	text	-to-Image Segmentation and Image Diffusion System
CLIPAG: Towards Generator-Free	text	-to-Image Generation
CLIPDraw++:	text	-to-Sketch Synthesis with Simple Primitives
CLIPstyler: Image Style Transfer with a Single	text	Condition
CLIPTER: Looking at the Bigger Picture in Scene	text	Recognition
CLIPtone: Unsupervised Learning for	text	-Based Image Tone Adjustment
Cloud of Line Distribution and Random Forest Based	text	Detection from Natural/Video Scene Images
Clustering-Based Approach to the Separation of	text	Strings from Mixed Text/Graphics Documents, A
CM-Net: Concentric Mask Based Arbitrary-Shaped	text	Detection
CMA-CLIP: Cross-Modality Attention Clip for	text	-Image Classification
CMFG: Cross-model Fine-grained Feature Interaction for	text	-video Retrieval
CMMLoc: Advancing	text	-to-PointCloud Localization with Cauchy-Mixture-Model Based Framework
CMPD: Using Cross Memory Network With Pair Discrimination for Image-	text	Retrieval
CMT-CO: Contrastive Learning with Character Movement Task for Handwritten	text	Recognition
CNN for	text	Detection, Convolutional Neural Network
CNN-based	text	image super-resolution tailored for OCR
CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-	text	Dataset
coarse-to-fine scene	text	detection method based on Skeleton-cut detector and Binary-Tree-Search based rectification, A
Code-Mixing and Code-Switching on Social Media	text	: A Brief Survey
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-	text	Retrieval
Coding with ASCII: compact, yet	text	-based 3D content
Cognition Transferring and Decoupling for	text	-Supervised Egocentric Semantic Segmentation
Cognitive Themes Emerging from Air Photo Interpretation	text	s Published to 1960
Cogview3: Finer and Faster	text	-to-image Generation via Relay Diffusion
Collaborative Vision-	text	Representation Optimizing for Open-vocabulary Segmentation
Color Based Image Segmentation and its Application to	text	Segmentation, A
Color segmentation for	text	extraction
Color structure recovering in strong specular	text	regions
Color	text	extraction from camera-based images: The impact of the choice of the clustering distance
Color	text	extraction with selective metric-based clustering
Color	text	image binarization based on binary texture analysis
Coloring with Words: Guiding Image Colorization Through	text	-Based Palette Generation
Colour	text	segmentation in web images based on human perception
colour	text	/graphics separation based on a graph representation, A
Combination of global and local con	text	s for text/non-text classification in heterogeneous online handwritten documents
combined Convolutional Neural Network and Dynamic Programming approach for	text	line normalization, A
Combined orientation and skew detection using geometric	text	-line modeling
Combining Deep and Ad-hoc Solutions to Localize	text	Lines in Ancient Arabic Document Images
Combining diverse on-line and off-line systems for handwritten	text	line recognition
Combining diverse systems for handwritten	text	line recognition
Combining HMM classifiers in a handwritten	text	recognition system
Combining Statistical Measures to Find Image	text	Regions
Combining Structure and Parameter Adaptation of HMMs for Printed	text	Recognition
Combining	text	and image information in content-based retrieval
Combining	text	and prosodic analysis for prominent word detection
COME: Clip-OCR and Master ObjEct for	text	image captioning
Comic	text	Detection Using Neural Network Approach
COMIM-GAN: Improved	text	-to-Image Generation via Condition Optimization and Mutual Information Maximization
CoMM: A Coherent Interleaved Image-	text	Dataset for Multimodal Understanding and Generation
Commercial Quality	text	: What Does it Take?
Commonsense-Guided Semantic and Relational Consistencies for Image-	text	Retrieval
comparative study of features for handwritten Bangla	text	recognition, A
Comparative Study of HMM and BLSTM Segmentation-Free Approaches for the Recognition of Handwritten	text	-Lines
Comparative Study to Evaluate a	text	-Independent Speaker Identification Engine for Arabic Speakers Using a CHMM-Based Approach, A
Comparing Data-driven and Phonetic N-gram Systems for	text	-Independent Speaker Verification
Comparison of Approaches for Automated	text	Extraction from Scholarly Figures, A
Comparison of clustering methods: A case study of	text	-independent speaker modeling
Comparison of some thresholding algorithms for	text	/background segmentation in difficult document images
Comparison of	text	String Similarity Algorithms for POI Name Harmonisation, A
comparison study on multiple binary-class SVM methods for unilabel	text	categorization, A
Compass Control: Multi Object Orientation Control for	text	-to-Image Generation
Compensating for the Incomplete With the Complete: An Efficient Scene	text	Detector
COMPGS: Unleashing 2D Compositionality for Compositional	text	-to-3D via Dynamically Optimizing 3D Gaussians
Complementarity-Aware Space Learning for Video-	text	Retrieval
complete OCR for printed Hindi	text	in Devanagari script, A
Complete Pyramidal Geometrical Scheme for	text	Based Image Description and Retrieval, A
Complying with Privacy Legislation: From Legal	text	to Implementation of Privacy-Aware Location-Based Services
component-tree based method for user-intention guided	text	extraction, A
Components Regulated Generation of Handwritten Chinese	text	-lines in Arbitrary Length

Index for "t"

Last update: 26-Feb-26 11:52:11
Use price@usc.edu for comments.