_ | text | _ |
2D and 3D Video Scene | text | Classification |
2LSPE: 2D Learnable Sinusoidal Positional Encoding using Transformer for Scene | text | Recognition |
3D Highlighter: Localizing Regions on 3D Shapes via | text | Descriptions |
3D Human Motion Generation from the | text | Via Gesture Action Classification and the Autoregressive Model |
3D-VisTA: Pre-trained Transformer for 3D Vision and | text | Alignment |
A-STAR: Test-time Attention Segregation and Retention for | text | -to-image Synthesis |
ABCNet v2: Adaptive Bezier-Curve Network for Real-Time End-to-End | text | Spotting |
ABCNet: Real-Time Scene | text | Spotting With Adaptive Bezier-Curve Network |
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene | text | Spotting |
Ablating Concepts in | text | -to-Image Diffusion Models |
Accurate Arbitrary-Shaped Scene | text | Detection via Iterative Polynomial Parameter Regression |
Accurate Detection for Scene | text | s with a Cascaded CNN Networks |
Accurate Scene | text | Detection Through Border Semantics Awareness and Bootstrapping |
Accurate Scene | text | Detection Via Scale-Aware Data Augmentation and Shape Similarity Constraint |
Accurate Scene | text | Recognition Based on Recurrent Neural Network |
Accurate Segmentation-Based Scene | text | Detector with Context Attention and Repulsive Text Border, An |
Accurate | text | localization in images based on SVM output scores |
Accurate Threshold Insensitive Kernel Detector for Arbitrary Shaped | text | , An |
Accurate video | text | detection through classification of low and high contrast images |
Accurate, data-efficient, unconstrained | text | recognition with convolutional neural networks |
ActBERT: Learning Global-Local Video- | text | Representations |
Active Collection of Land Cover Sample Data from Geo-Tagged Web | text | s |
Active Contours Network to Straighten Distorted | text | Lines |
active learning approach to frequent itemset-based | text | clustering, An |
Active Learning With Complementary Sampling for Instructing Class-Biased Multi-Label | text | Emotion Classification |
Activity Recognition Applications from Con | text | ual Video-Text Fusion |
Actor and Action Modular Network for | text | -Based Video Segmentation |
AdaBoost for | text | Detection in Natural Scene |
Adapting Style and Content for Attended | text | Sequence Recognition |
Adaptive Algorithm for | text | Detection from Natural Scenes, An |
Adaptive Boundary Proposal Network for Arbitrary Shape | text | Detection |
Adaptive Fuzzy | text | Segmentation in Images with Complex Backgrounds Using Color and Texture |
Adaptive fuzzy wavelet algorithm for | text | -independent speaker recognition |
Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured | text | |
Adaptive Latent Graph Representation Learning for Image- | text | Matching |
Adaptive method for multi colored | text | binarization |
Adaptive Offline Quintuplet Loss for Image- | text | Matching |
Adaptive Region Growing Color Segmentation for | text | Using Irregular Pyramid |
Adaptive Scene | text | Detection Based on Transferring Adaboost |
Adaptive scene- | text | binarisation on images captured by smartphones |
Adaptive Script-Independent Block-Based | text | Line Extraction, An |
Adaptive Script-Independent | text | Line Extraction |
Adaptive | text | Recognition Through Visual Matching |
Adding Conditional Control to | text | -to-Image Diffusion Models |
Addressing Information Inequality for | text | -Based Person Search via Pedestrian-Centric Visual Denoising and Bias-Aware Alignments |
ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene | text | Detection |
Adversarial and Isotropic Gradient Augmentation for Image Retrieval With | text | Feedback |
Adversarial Attribute- | text | Embedding for Person Search With Natural Language Query |
Adversarial learning based attentional scene | text | recognizer |
Adversarial Representation Learning for | text | -to-Image Matching |
Adversarial Synthesis of Human Pose from | text | |
Adversarial Training Lattice LSTM for Named Entity Recognition of Rail Fault | text | s |
ADVMIX: Data Augmentation for Accurate Scene | text | Spotting |
Ae | text | spotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting |
Aesthetic | text | Logo Synthesis via Content-aware Layout Inferring |
Affective Image Filter: Reflecting Emotions from | text | to Images |
Aggregating Image and | text | Quantized Correlated Components |
Aggregating Local and Global | text | Features for Linguistic Steganalysis |
Aggregating Local Con | text | for Accurate Scene Text Detection |
Aletheia: An Advanced Document Layout and | text | Ground-Truthing System for Production Environments |
Algorithm for Colour-Based Natural Scene | text | Segmentation, An |
Algorithm for Matching OCR-Generated | text | Strings, An |
Algorithm for Reducing | text | Line Candidates of Incorrect Orientation, An |
Algorithm for | text | page up/down orientation determination |
Algorithms for compressing compound document images with large | text | /background overlap |
ALIF: A dataset for Arabic embedded | text | recognition in TV broadcast |
Aligning | text | and Document Illustrations: Towards Visually Explainable Digital Humanities |
Alignment and Generation Adapter for Efficient Video- | text | Understanding |
Alignment of Curved | text | Strings for Enhanced OCR Readability |
Alignment of free layout color | text | s for character recognition |
Alignment of Paragraphs in Bilingual | text | s Using Bilingual Dictionaries and Dynamic Programming |
All You Need Is a Second Look: Towards Arbitrary-Shaped | text | Detection |
ALR-GAN: Adaptive Layout Refinement for | text | -to-Image Synthesis |
alternative framework for univariate filter based feature selection for | text | categorization, An |
ALTID : Arabic/Latin | text | Images Database for recognition research |
Analysis of Features and Metrics for Alignment in | text | -Dependent Voice Conversion |
Analysis of the Novel Transformer Module Combination for Scene | text | Recognition |
Analytical evaluation of term weighting schemes for | text | categorization |
anchor-free region proposal network for Faster R-CNN-based | text | detection approaches, An |
Ancient document analysis based on | text | line extraction |
Annotated Databases for the Recognition of Screen-Rendered | text | |
ANNP: a neural network parser for real world | text | s |
Anonymizing Temporal Phrases in Natural Language | text | to be Posted on Social Networking Services |
Anti-DreamBooth: Protecting users from personalized | text | -to-image synthesis |
AnyFace: Free-style | text | -to-Face Synthesis and Manipulation |
AON: Towards Arbitrarily-Oriented | text | Recognition |
Application of autoregressive models to the study of the temporal structure of a handwritten | text | |
Application of Cluster Detection to | text | and Picture Processing, An |
Application of Novel Chaotic Neural Networks to | text | Classification Based on PCA |
Application of Planar Motion Segmentation for Scene | text | Extraction |
Applying GIS and | text | Mining Methods to Twitter Data to Explore the Spatiotemporal Patterns of Topics of Interest in Kuwait |
Applying the conjugate gradient method for | text | document categorization |
approach for detecting and cleaning of struck-out handwritten | text | , An |
Approach for Recognizing | text | Labels in Raster Maps, An |
approach to extracting the target | text | line from a document image captured by a pen scanner, An |
approach to get overall emotion from comment | text | towards a certain image uploaded to social network using Latent Semantic Analysis, An |
Approximate String Match for Garbled | text | with Various Accuracies, An |
Arabic character recognition system: A statistical approach for recognizing cursive typewritten | text | |
Arabic hand-written | text | -line extraction |
Arabic handwritten | text | s clusterization based on Feature Relation Graph (FRG) |
Arabic ligatures: Analysis and application in | text | recognition |
Arabic | text | detection in videos using neural and boosting-based approaches: Application to video indexing |
Arbitrarily oriented | text | detection using geodesic distances between corners and skeletons |
Arbitrarily Shaped Scene | text | Detection With a Mask Tightness Text Detector |
Arbitrarily shaped scene | text | detection with dynamic convolution |
Arbitrarily-Oriented | text | Detection in Low Light Natural Scene Images |
Arbitrary Shape Scene | text | Detection With Adaptive Text Region Representation |
Arbitrary Shape | text | Detection using Transformers |
Arbitrary Shape | text | Detection via Boundary Transformer |
Arbitrary Shape | text | Detection via Segmentation with Probability Maps |
Arbitrary Style Guidance for Enhanced Diffusion-Based | text | -to-Image Generation |
Arbitrary-Oriented Scene | text | Detection via Rotation Proposals |
Arbitrary-Shape Scene | text | Detection via Visual-Relational Rectification and Contour Approximation |
architecture for handwritten | text | recognition systems, An |
Are 2D-LSTM really dead for offline | text | recognition? |
Are All Combinations Equal? Combining | text | ual and Visual Features with Multiple Space Learning for Text-based Video Retrieval |
Are Digraphs Good for Free- | text | Keystroke Dynamics? |
Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in | text | |
ARRPNGAN: | text | -to-image GAN with attention regularization and region proposal networks |
ASAYAR: A Dataset for Arabic-Latin Scene | text | Localization in Highway Traffic Panels |
Assessing Affective Dimensions of Play in Psychodynamic Child Psychotherapy via | text | Analysis |
Assessing Image and | text | Generation with Topological Analysis and Fuzzy Logic |
Assessing similarity in handwritten | text | s |
Assessing User Experience of | text | Readability with Eye Tracking in Virtual Reality |
Assistive | text | Reading from Complex Background for Blind Persons |
Associating | text | and graphics for scientific chart understanding |
ASTER: An Attentional Scene | text | Recognizer with Flexible Rectification |
ASTS: A Unified Framework for Arbitrary Shape | text | Spotting |
Asymmetric Cross-Scale Alignment for | text | -Based Person Search |
ATM: Attentional | text | Matting |
ATT3D: Amortized | text | -to-3D Object Synthesis |
Attend, Correct and Focus: A Bidirectional Correct Attention Network for Image- | text | Matching |
attention-based row-column encoder-decoder model for | text | recognition in Japanese historical documents, An |
AttnGAN: Fine-Grained | text | to Image Generation with Attentional Generative Adversarial Networks |
AttT2M: | text | -Driven Human Motion Generation with Multi-Perspective Attention Mechanism |
Audio-Enhanced | text | -to-Video Retrieval using Text-Conditioned Feature Alignment |
Authenticating Binary | text | Documents Using a Localising OMAC Watermark Robust to Printing and Scanning |
Automated cartographic | text | placement |
Automated Detection of Adverse Drug Events from Older Patients' Electronic Medical Records Using | text | Mining |
automatic algorithm for | text | skew estimation in document images using recursive morphological transforms, An |
Automatic annotation of unique locations from video and | text | |
Automatic Chinese | text | Classification Using Character-Based and Word-Based Approach |
Automatic Concept Discovery from Parallel | text | and Visual Corpora |
Automatic Detection and Localization of Natural Scene | text | in Video |
Automatic detection and recognition of Korean | text | in outdoor signboard images |
Automatic diacritization of Arabic | text | using recurrent neural networks |
Automatic discrimination of | text | and non-text natural images |
Automatic document classification using | text | and images |
Automatic dottization of Arabic | text | (Rasms) using deep recurrent neural networks |
Automatic extraction of correlation-entropy features for | text | document analysis directly in run-length compressed domain |
Automatic Feature Extraction and | text | Recognition From Scanned Topographic Maps |
Automatic identification and skew estimation of | text | lines in real scene images |
Automatic Identification of | text | in Digital Video Key Frames |
Automatic image- | text | alignment for large-scale web image indexing and retrieval |
Automatic Inpainting Scheme for Video | text | Detection and Removal |
Automatic Labeling for Scene | text | Database |
Automatic news video segmentation and categorization based on closed-captioned | text | |
Automatic performance evaluation for video | text | detection |
Automatic Performance Evaluation Protocol for Video | text | Detection Algorithms, An |
Automatic recognition of printed arabic | text | using neural network classifier |
Automatic Recognition of Printed Farsi | text | s |
Automatic Segmentation of Printed Persian (Farsi) | text | |
Automatic segmentation of the IAM off-line database for handwritten English | text | |
Automatic separation of machine-printed and hand-written | text | lines |
Automatic | text | area segmentation in natural images |
Automatic | text | Detection and Recognition |
Automatic | text | detection and removal in video sequences |
Automatic | text | Detection and Tracking in Digital Video |
Automatic | text | detection for mobile augmented reality translation |
Automatic | text | Extraction from Arabic Newspapers |
Automatic | text | Extraction from Video for Content-Based Annotation and Retrieval |
Automatic | text | Extraction in Digital Video Based on Motion Analysis |
Automatic | text | Location in Images and Video Frames |
Automatic | text | location in natural scene images |
Automatic | text | location using cluster-based template matching |
Automatic | text | processing |
Automatic | text | segmentation from complex background |
Automatic tracing and extraction of | text | -line and word segments directly in JPEG compressed document images |
Automatic writer identification from | text | line images |
Autonomous Document Cleaning: A Generative Approach to Reconstruct Strongly Corrupted Scanned | text | s |
Autonomous | text | Capturing Robot Using Improved DCT Feature and Text Tracking |
AutoSplice: A | text | -prompt Manipulated Image Dataset for Media Forensics |
Autostr: Efficient Backbone Search for Scene | text | Recognition |
AvatarCraft: Transforming | text | into Neural Human Avatars with Parameterized Shape and Pose Control |
Awesome Typography: Statistics-Based | text | Effects Transfer |
Background-Insensitive Scene | text | Recognition with Text Semantic Segmentation |
Bag of Embedded Words learning for | text | retrieval |
Bag of features approach for offline | text | -independent Chinese writer identification |
Baseline detection of multi-lingual unconstrained handwritten | text | lines |
BATINeT: Background-Aware | text | to Image Synthesis and Manipulation Network |
Bayesian Similarity Model Estimation for Approximate Recognized | text | Search |
Bayesian Super-Resolution of | text | in Video with a Text-Specific Bimodal Prior |
Bayesian-based method of unconstrained handwritten offline Chinese | text | line recognition, A |
BDNet: A BERT-based dual-path network for | text | -to-image cross-modal person re-identification |
Beatrix: A Self-Learning System for Off-Line Recognition of Handwritten | text | s |
Being Comes from Not-Being: Open-Vocabulary | text | -to-Motion Generation with Wordless Training |
Benchmark for Chinese-English Scene | text | Image Super-resolution, A |
Benchmark for Controllable | text | -Image-to-Video Generation, A |
Benchmarking Robustness to | text | -Guided Corruptions |
better fitness measure of a | text | -document for a given set of keywords, A |
Beyond | text | QA: Multimedia Answer Generation by Harvesting Web Information |
Beyond verbs: Understanding actions in videos with | text | |
Beyond visual semantics: Exploring the role of scene | text | in image understanding |
Bi-Attention enhanced representation learning for image- | text | matching |
Bi-Directional Image- | text | Retrieval With Position Attention and Similarity Filtering |
Bi-Directional Spatial-Semantic Attention Networks for Image- | text | Matching |
Bi-modal Handwritten | text | Corpus: Baseline Results, A |
Bi-modal Handwritten | text | Recognition (BiHTR) ICPR 2010 Contest Report |
Bi-tonal image non- | text | matter removal with run length and connected component analysis |
Bidirectional extraction and recognition of scene | text | with layout consistency |
Bilevel Feature Extraction-Based | text | Mining for Fault Diagnosis of Railway Systems |
Bilingual | text | Classification |
BiLMa: Bidirectional Local-Matching for | text | -based Person Re-identification |
Binarization and cleanup of handwritten | text | from carbon copy medical form images |
Binarization of low quality | text | using a Markov random field model |
Binarization-Free Clustering Approach to Segment Curved | text | Lines in Historical Manuscripts, A |
Binary | text | image compression using overlapping rectangular partitioning |
Binary | text | image file preprocessing to account for printer dot gain |
Biometric Recognition Based on Free- | text | Keystroke Dynamics |
Biometric recognition using online uppercase handwritten | text | |
Bipartite Graph Coarsening for | text | Classification Using Graph Neural Networks |
Blended Diffusion for | text | -driven Editing of Natural Images |
Blending-NeRF: | text | -Driven Localized Editing in Neural Radiance Fields |
Blind Deblurring of | text | Images Using a Text-Specific Hybrid Dictionary |
blind deconvolution model for scene | text | detection and recognition in video, A |
Blind Source Separation Techniques for Detecting Hidden | text | s and Textures in Document Images |
Blind | text | images deblurring based on a generative adversarial network |
Block Segmentation and | text | Extraction in Mixed Text/Image Documents |
BLSTM-based handwritten | text | recognition using Web resources |
BLTRCNN-Based 3-D Articulatory Movement Prediction: Learning Articulatory Synchronicity From Both | text | and Audio Inputs |
Boosting SpLSA for | text | Classification |
Boosting Weakly-Supervised Temporal Action Localization with | text | Information |
Boosting-based transductive learning for | text | detection |
Bootstrapping | text | Recognition from Stop Words |
Bordernet: An Efficient Border-attention | text | Detector |
Bottom-Up Scene | text | Detection with Markov Clustering Networks |
Boundary | text | Spotter: Toward Arbitrary-Shaped Scene Text Spotting |
Boundary-Aware Arbitrary-Shaped Scene | text | Detector With Learnable Embedding Network |
BoxDiff: | text | -to-Image Synthesis with Training-Free Box-Constrained Diffusion |
Breaking | text | -Based CAPTCHA with Sparse Convolutional Neural Networks |
Breaking | text | -based CAPTCHAs with variable word and character orientation |
Breaking The Limits of | text | -conditioned 3D Motion Synthesis with Elaborative Descriptions |
BreakingNews: Article Annotation by Image and | text | Processing |
Bridge-GAN: Interpretable Representation Learning for | text | -to-Image Synthesis |
Bridging Video and | text | : A Two-Step Polishing Transformer for Video Captioning |
Bridging Video- | text | Retrieval with Multiple Choice Questions |
BRsyn-Caps: Chinese | text | Classification Using Capsule Network Based on Bert and Dependency Syntax |
BTS: A Bi-lingual Benchmark for | text | Segmentation in the Wild |
Building compact recognizer with recognition rate maintained for on-line handwritten Japanese | text | recognition |
Building | text | features for object image classification |
BURSTS: A bottom-up approach for robust spotting of | text | s in scenes |
C4Synth: Cross-Caption Cycle-Consistent | text | -to-Image Synthesis |
Camera based degraded | text | recognition using grayscale feature |
Camera | text | Recognition based on Perspective Invariants |
Camera-based analysis of | text | and documents: a survey |
CAMP: Cross-Modal Adaptive Message Passing for | text | -Image Retrieval |
CamType: assistive | text | entry using gaze with an off-the-shelf webcam |
Can Generative Adversarial Networks Teach Themselves | text | Segmentation? |
Canny | text | Detector: Fast and Robust Scene Text Localization Algorithm |
Cap4Video: What Can Auxiliary Captions Do for | text | -Video Retrieval? |
Capacity of | text | Marking Channel |
Caption | text | extraction for indexing purposes using a hierarchical region-based image model |
Caption | text | recognition in video frames by MAP matching |
cascade detector for | text | detection in natural scene images, A |
Cascaded Segmentation-Detection Networks for | text | -Based Traffic Sign Detection |
CDistNet: Perceiving Multi-domain Character Distance for Robust | text | Recognition |
CD | text | : Scene text detector based on context-aware deformable transformer |
CE- | text | : A context-Aware and embedded text detector in natural scene images |
CelebV- | text | : A Large-Scale Facial Text-Video Dataset |
Center | text | Spotter: A Novel Text Spotter for Autonomous Unmanned Vehicles |
CGNN: Caption-assisted graph neural network for image- | text | retrieval |
Challenges in Content-Based Image Indexing of Cultural Heritage Collections: Support vector machine active learning with applications to | text | classification |
Character Energy and Link Energy-Based | text | Extraction in Scene Images |
Character extraction in web image for | text | recognition |
Character Grounding and Re-identification in Story of Videos and | text | Descriptions |
Character Region Attention for | text | Spotting |
Character Region Awareness for | text | Detection |
Character Segmentation of Handwritten Bangla | text | by Vertex Characterization of Isothetic Covers |
Character Segmenting Techniques for Handwritten | text | : A Survey |
Character-Aware Sampling and Rectification for Scene | text | Recognition |
Character-Level Interaction in Computer-Assisted Transcription of | text | Images |
Character-Level Interaction in Multimodal Computer-Assisted Transcription of | text | Images |
Character-like region verification for extracting | text | in scene images |
Character-Position-Free On-Line Handwritten Japanese | text | Recognition by Two Segmentation Methods |
Character-Stroke Detection for | text | -Localization and Extraction |
Characterization and classification of semantic image- | text | relations |
Characterness: An Indicator of | text | in the Wild |
Chinese Street View | text | : Large-Scale Chinese Text Reading With Partially Supervised Learning |
Chinese | text | distinction and font identification by recognizing most frequently used characters |
Chinese | text | Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning |
Chinese/Kanji | text | and Data Processing |
Circle | text | Expansion as Low-Rank Textures |
CiteTracker: Correlating Image and | text | for Visual Tracking |
CKD: Cross-Task Knowledge Distillation for | text | -to-Image Synthesis |
Class dependent feature scaling method using naive Bayes classifier for | text | datamining |
Class-Aware Mask-guided feature refinement for scene | text | recognition |
Class-Balanced | text | to Image Synthesis With Attentive Generative Adversarial Network |
Class-dependent projection based method for | text | categorization |
Classification Architecture Based on Connected Components for | text | Detection in Unconstrained Environments, A |
Classification of Machine Printed and Handwritten | text | s Using Character Block Layout Variance |
Classification of Noisy Free- | text | Prostate Cancer Pathology Reports Using Natural Language Processing |
Classification of | text | Documents |
Classification of | text | documents based on score level fusion approach |
Classification with reject option in | text | categorisation systems |
Classifying networked | text | data with positive and unlabeled examples |
CLEval: Character-Level Evaluation for | text | Detection and Recognition Tasks |
CLIP is Also an Efficient Segmenter: A | text | -Driven Approach for Weakly Supervised Semantic Segmentation |
CLIP-Actor: | text | -Driven Recommendation and Stylization for Animating Human Meshes |
CLIP-Driven Fine-Grained | text | -Image Person Re-Identification |
CLIP-Event: Connecting | text | and Images with Event Structures |
CLIP-Forge: Towards Zero-Shot | text | -to-Shape Generation |
CLIP-NeRF: | text | -and-Image Driven Manipulation of Neural Radiance Fields |
CLIP2Protect: Protecting Facial Privacy Using | text | -Guided Makeup via Adversarial Latent Search |
CLIPstyler: Image Style Transfer with a Single | text | Condition |
CLIPTER: Looking at the Bigger Picture in Scene | text | Recognition |
Cloud of Line Distribution and Random Forest Based | text | Detection from Natural/Video Scene Images |
Clustering-Based Approach to the Separation of | text | Strings from Mixed Text/Graphics Documents, A |
CM-Net: Concentric Mask Based Arbitrary-Shaped | text | Detection |
CMA-CLIP: Cross-Modality Attention Clip for | text | -Image Classification |
CMFG: Cross-model Fine-grained Feature Interaction for | text | -video Retrieval |
CMPD: Using Cross Memory Network With Pair Discrimination for Image- | text | Retrieval |
CMT-CO: Contrastive Learning with Character Movement Task for Handwritten | text | Recognition |
CNN for | text | Detection, Convolutional Neural Network |
CNN-based | text | image super-resolution tailored for OCR |
CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video- | text | Dataset |
coarse-to-fine scene | text | detection method based on Skeleton-cut detector and Binary-Tree-Search based rectification, A |
Code-Mixing and Code-Switching on Social Media | text | : A Brief Survey |
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image- | text | Retrieval |
Coding with ASCII: compact, yet | text | -based 3D content |
Cognitive Themes Emerging from Air Photo Interpretation | text | s Published to 1960 |
Color Based Image Segmentation and its Application to | text | Segmentation, A |
Color segmentation for | text | extraction |
Color structure recovering in strong specular | text | regions |
Color | text | extraction from camera-based images: The impact of the choice of the clustering distance |
Color | text | extraction with selective metric-based clustering |
Color | text | image binarization based on binary texture analysis |
Coloring with Words: Guiding Image Colorization Through | text | -Based Palette Generation |
Colour | text | segmentation in web images based on human perception |
colour | text | /graphics separation based on a graph representation, A |
Combination of global and local con | text | s for text/non-text classification in heterogeneous online handwritten documents |
combined Convolutional Neural Network and Dynamic Programming approach for | text | line normalization, A |
Combined orientation and skew detection using geometric | text | -line modeling |
Combining Deep and Ad-hoc Solutions to Localize | text | Lines in Ancient Arabic Document Images |
Combining diverse on-line and off-line systems for handwritten | text | line recognition |
Combining diverse systems for handwritten | text | line recognition |
Combining HMM classifiers in a handwritten | text | recognition system |
Combining Statistical Measures to Find Image | text | Regions |
Combining Structure and Parameter Adaptation of HMMs for Printed | text | Recognition |
Combining | text | and image information in content-based retrieval |
Combining | text | and prosodic analysis for prominent word detection |
COME: Clip-OCR and Master ObjEct for | text | image captioning |
Comic | text | Detection Using Neural Network Approach |
COMIM-GAN: Improved | text | -to-Image Generation via Condition Optimization and Mutual Information Maximization |
Commercial Quality | text | : What Does it Take? |
Commonsense-Guided Semantic and Relational Consistencies for Image- | text | Retrieval |
comparative study of features for handwritten Bangla | text | recognition, A |
Comparative Study of HMM and BLSTM Segmentation-Free Approaches for the Recognition of Handwritten | text | -Lines |
Comparative Study to Evaluate a | text | -Independent Speaker Identification Engine for Arabic Speakers Using a CHMM-Based Approach, A |
Comparing Data-driven and Phonetic N-gram Systems for | text | -Independent Speaker Verification |
Comparison of Approaches for Automated | text | Extraction from Scholarly Figures, A |
Comparison of clustering methods: A case study of | text | -independent speaker modeling |
Comparison of some thresholding algorithms for | text | /background segmentation in difficult document images |
Comparison of | text | String Similarity Algorithms for POI Name Harmonisation, A |
comparison study on multiple binary-class SVM methods for unilabel | text | categorization, A |
Complementarity-Aware Space Learning for Video- | text | Retrieval |
complete OCR for printed Hindi | text | in Devanagari script, A |
Complete Pyramidal Geometrical Scheme for | text | Based Image Description and Retrieval, A |
Complying with Privacy Legislation: From Legal | text | to Implementation of Privacy-Aware Location-Based Services |
component-tree based method for user-intention guided | text | extraction, A |
Components Regulated Generation of Handwritten Chinese | text | -lines in Arbitrary Length |
Composing | text | and Image for Image Retrieval - an Empirical Odyssey |
Composite Script Identification and Orientation Detection for Indian | text | Images |
Compositional coding capsule network with k-means routing for | text | classification |
Compositional Learning of Image- | text | Query for Image Retrieval |
Compositional Mixture Representations for Vision and | text | |
comprehensive method for multilingual video | text | detection, localization, and extraction, A |
comprehensive neural-based approach for | text | recognition in videos using natural language processing, A |
comprehensive scheme for tattoo | text | detection, A |
comprehensive study of hybrid neural network hidden Markov model for offline handwritten Chinese | text | recognition, A |
Computational Topology in | text | Mining |
Computer Assisted Transcription for Ancient | text | Images |
Computer Assisted Transcription of Handwritten | text | Images |
Computer Assisted Transcription of | text | Images: Results on the GERMANA Corpus and Analysis of Improvements Needed for Practical Use |
Computer Interpretation of English | text | and Picture Patterns |
Con- | text | : Text Detection for Fine-Grained Object Classification |
Concept decompositions for short | text | clustering by identifying word communities |
Concepts-Locations-Emotions: Semantic Analysis and Visualization of Climate Change | text | s |
Conceptual 12M: Pushing Web-Scale Image- | text | Pre-Training To Recognize Long-Tail Visual Concepts |
Conditional Feature Learning Based Transformer for | text | -Based Person Search |
Conditional Image- | text | Embedding Networks |
conditional random field approach for face identification in broadcast news using overlaid | text | , A |
Conditional random field for | text | segmentation from images with complex background |
Conditional | text | Image Generation with Diffusion Models |
Confidence Measures for Error Correction in Interactive Transcription Handwritten | text | |
Configurable | text | Stamp Identification Tool with Application of Fuzzy Logic |
Connected and Degraded | text | Recognition Using Hidden Markov Model |
Connected Component Level Discrimination of Handwritten and Machine-Printed | text | Using Eigenfaces |
Connecting modalities: Semi-supervised segmentation and annotation of images using unaligned | text | corpora |
Consensus-aware Visual-semantic Embedding for Image- | text | Matching |
Consistent Partition and Labelling of | text | Blocks |
Constructing the Discriminative Kernels Using GMM for | text | -Independent Speaker Identification |
Content and Style Aware Generation of | text | -Line Images for Handwriting Recognition |
Content Based Image and Video Retrieval Using Embedded | text | |
Content-based image retrieval with pachinko allocation model and a combination of colour, | text | ure and text features |
Content-Based Query of Image Databases, Inspirations from | text | Retrieval: Inverted Files, Frequency-based Weights and Relevance Feedback |
Content-based query of image databases: Inspirations from | text | retrieval |
Con | text | Driven Text Segmentation and Recognition |
Con | text | Supplied by Text or Language |
Con | text | -Aware Attention Network for Image-Text Retrieval |
Con | text | -Aware Hierarchical Transformer for Fine-Grained Video-Text Retrieval |
Con | text | -Aware Text-Based Binary Image Stylization and Synthesis |
Con | text | -based text detection in natural scenes |
Con | text | ual Text Block Detection Towards Scene Text Understanding |
Con | text | ual text/non-text stroke classification in online handwritten notes with conditional random fields |
Continual Learning for Cross-Modal Image- | text | Retrieval Based on Domain-Selective Attention |
Continuous approach to segmentation of handwritten | text | |
Contour Restoration of | text | Components for Recognition in Video/Scene Images |
contour-based approach to 3D | text | labeling on triangulated surfaces, A |
Contour-Based Robust Algorithm for | text | Detection in Color Images, A |
ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene | text | Detection |
Contra: (con) | text | (tra)nsformer for Cross-modal Video Retrieval |
Contrastive author-aware | text | clustering |
Contribution of recurrent connectionist language models in improving LSTM-based Arabic | text | recognition in videos |
Contribution to the Discrimination of the Medieval Manuscript | text | s: Application in the Palaeography |
Controllable Artistic | text | Style Transfer via Shape-Matching GAN |
Controllable Multi-Lingual Multi-Speaker Multi-Style | text | -to-Speech Synthesis With Multivariate Information Minimization, A |
Controllable Video Generation With | text | -Based Instructions |
Convolutional Neural Network Based | text | Steganalysis |
Convolutional Neural Network-Based Chinese | text | Detection Algorithm via Text Structure Modeling, A |
Convolutional Neural Networks for Direct | text | Deblurring |
Convolutional Recurrent Neural Network for the Handwritten | text | Recognition of Historical Greek Manuscripts, A |
Convolutional recurrent neural networks with hidden Markov model bootstrap for scene | text | recognition |
COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated | text | s |
CookGAN: Causality Based | text | -to-Image Synthesis |
Cooperative Game Modeling With Weighted Token-Level Alignment for Audio- | text | Retrieval |
Copyright protection for the electronic distribution of | text | documents |
Corpus-based HIT-MW database for offline recognition of general-purpose Chinese handwritten | text | |
Correcting document image warping based on regression of curved | text | lines |
CoSMo: Content-Style Modulation for Image Retrieval with | text | Feedback |
Cost-Effective Adversarial Attacks against Scene | text | Recognition |
Could scene con | text | be beneficial for scene text detection? |
Coupled Snakelets for Curled | text | -Line Segmentation from Warped Document Images |
CPGAN: Content-parsing Generative Adversarial Networks for | text | -to-image Synthesis |
Creating generic | text | summaries |
Creation and Analysis of a Corpus of | text | Rich Indian TV Videos |
CRF Based Scheme for Overlapping Multi-colored | text | Graphics Separation, A |
Crime Prediction and Monitoring in Porto, Portugal, Using Machine Learning, Spatial and | text | Analytics |
CrisisHateMM: Multimodal Analysis of Directed and Undirected Hate Speech in | text | -Embedded Images from Russia-Ukraine Conflict |
Cross-Lingual | text | Image Recognition via Multi-Hierarchy Cross-Modal Mimic |
Cross-Lingual | text | Image Recognition via Multi-Task Sequence to Sequence Learning |
Cross-Modal and Hierarchical Modeling of Video and | text | |
Cross-Modal Contrastive Learning for | text | -to-Image Generation |
Cross-modal domain adaptation for | text | -based regularization of image semantics in image retrieval systems |
Cross-Modal Dynamic Networks for Video Moment Retrieval With | text | Query |
Cross-Modal Implicit Relation Reasoning and Aligning for | text | -to-Image Person Retrieval |
Cross-Modal Person Search: A Coarse-to-Fine Framework using Bi-Directional | text | -Image Matching |
Cross-modal Scene Graph Matching for Relationship-aware Image- | text | Retrieval |
Cross-Modal Semantic Matching Generative Adversarial Networks for | text | -to-Image Synthesis |
Cross-Modal | text | Steganography Against Synonym Substitution-Based Text Attack |
Crossing the lines: making optimal use of con | text | in line-based Handwritten Text Recognition |
Crypto-stego System for Securing | text | and Image Data |
CT-GAN: A conditional Generative Adversarial Network of transformer architecture for | text | -to-image |
CT-Net: Arbitrary-Shaped | text | Detection via Contour Transformer |
Curriculum learning for printed | text | line recognition of ligature-based scripts |
Cursive Script, Historical Documents, | text | Line Segmentation, Script Line, Segmentation, Text Line Extraction |
Cursive stroke sequencing for handwritten | text | documents recognition |
Curved scene | text | detection via transverse and longitudinal sequence connection |
CycleMatch: A cycle-consistent embedding network for image- | text | matching |
DAC-GAN: Dual Auxiliary Consistency Generative Adversarial Network for | text | -to-Image Generation |
DAE-GAN: Dynamic Aspect-aware GAN for | text | -to-Image Synthesis |
DALL-EVAL: Probing the Reasoning Skills and Social Biases of | text | -to-Image Generation Models |
Data Augmentation for Scene | text | Recognition |
data base for Arabic handwritten | text | recognition research, A |
Data Embedding in | text | for a Copier System |
Data-Hiding Capacity Improvement for | text | Watermarking Using Space Coding Method |
Database for Arabic Handwritten | text | Image Recognition and Writer Identification, A |
Database for Arabic Printed | text | Recognition Research |
Database for Handwritten | text | Recognition Research, A |
Database for Offline Arabic Handwritten | text | Recognition, A |
dataset for Arabic | text | detection, tracking and recognition in news videos - AcTiV, A |
Dataset to Support Sexist Content Detection in Arabic | text | , A |
DATID-3D: Diversity-Preserved Domain Adaptation Using | text | -to-Image Diffusion for 3D Generative Model |
De-rendering Stylized | text | s |
Debiased Video- | text | Retrieval via Soft Positive Sample Calibration |
Deblurring | text | Images via L0-Regularized Intensity and Gradient Prior |
Deep Automated | text | Scoring Model Based on Memory Network |
Deep Belief Networks Based Toponym Recognition for Chinese | text | |
Deep BLSTM neural networks for unconstrained continuous handwritten | text | recognition |
Deep Convolutional Deblurring and Detection Neural Network for Localizing | text | in Videos, A |
Deep correlation for matching images and | text | |
Deep Cross-Modal Projection Learning for Image- | text | Matching |
Deep Direct Regression for Multi-oriented Scene | text | Detection |
Deep feature extraction with tri-channel | text | ual feature map for text classification |
Deep Features for | text | Spotting |
Deep image compression using scene | text | quality assessment |
Deep learning and recurrent connectionist-based approaches for Arabic | text | recognition in videos |
deep learning approach to handwritten | text | recognition in the presence of struck-out text, A |
Deep Learning for Image-to- | text | Generation: A Technical Overview |
Deep Learning in the Domain of Multi-Document | text | Summarization |
Deep Matching Prior Network: Toward Tighter Multi-oriented | text | Detection |
Deep Multi-Scale Con | text | Aware Feature Aggregation for Curved Scene Text Detection |
Deep Neural Network Based 3D Articulatory Movement Prediction Using Both | text | and Audio Inputs |
Deep neural network based hidden Markov model for offline handwritten Chinese | text | recognition |
Deep Neural Network with Attention Model for Scene | text | Recognition |
Deep Relational Reasoning Graph Network for Arbitrary Shape | text | Detection |
Deep | text | Spotter: An End-to-End Trainable Scene Text Localization and Recognition Framework |
DeepErase: Weakly Supervised Ink Artifact Removal in Document | text | Images |
DeepSolo: Let Transformer Decoder with Explicit Points Solo for | text | Spotting |
DeepWriterID: An End-to-End Online | text | -Independent Writer Identification System |
Deformable scene | text | detection using harmonic features and modified pixel aggregation network |
Deformation Robust | text | Spotting with Geometric Prior |
Deformation-Invariant Networks for Handwritten | text | Recognition |
Degraded Gray-Scale | text | Recognition Using Pseudo-2D Hidden Markov-Models and N-Best Hypotheses |
Delaunay triangulation based | text | detection from multi-view images of natural scene |
DeltaEdit: Exploring | text | -free Training for Text-Driven Image Manipulation |
Dense Chained Attention Network for Scene | text | Recognition |
Dense prediction for | text | line segmentation in handwritten document images |
Dense | text | -to-Image Generation with Attention Modulation |
density-based approach for | text | extraction in images, A |
Dependence Models for Searching | text | in Document Images |
Deriving a Priori Co-occurrence Probability Estimates for Object Recognition from Social Networks and | text | Processing |
Deriving Symbol Dependent Edit Weights for | text | Correction: The Use of Error Dictionaries |
Design and Evaluation of Features That Best Define | text | in Complex Scene Images |
Design and Preliminary Evaluation of a Finger-Mounted Camera and Feedback System to Enable Reading of Printed | text | for the Blind, The |
Detect Arbitrary-Shaped | text | via Adaptive Thresholding and Localization Quality Estimation |
Detect Visual Spoofing in Unicode-Based | text | |
Detected | text | -Based Image Retrieval Approach for Textual Images |
Detecting and reading | text | in natural scenes |
Detecting Arbitrarily Oriented | text | Labels in Early Maps |
Detecting dense | text | in natural images |
Detecting Misspelled Words in Turkish | text | Using Syllable n-gram Frequencies |
Detecting moving | text | in video using temporal information |
Detecting natural scenes | text | via auto image partition, two-stage grouping and two-layer classification |
Detecting Oriented | text | in Natural Images by Linking Segments |
Detecting Tampered Scene | text | in the Wild |
Detecting | text | Areas and Decorative Elements in Ancient Manuscripts |
Detecting | text | in Natural Image with Connectionist Text Proposal Network |
Detecting | text | in Natural Scenes Based on a Reduction of Photometric Effects: Problem of Color Invariance |
Detecting | text | in Natural Scenes Based on a Reduction of Photometric Effects: Problem of Text Detection |
Detecting | text | in natural scenes with stroke width transform |
Detecting | text | in Scene and Traffic Guide Panels With Attention Anchor Mechanism |
Detecting | text | in the Wild with Deep Character Embedding Network |
Detecting | text | Lines in Handwritten Documents |
Detecting | text | s of arbitrary orientations in natural images |
Detecting Traffic Information From Social Media | text | s With Deep Learning Approaches |
Detecting Video | text | s Using Spatial-Temporal Wavelet Transform |
Detection and Interpretation of | text | Information in Noisy Video Sequences |
Detection and Location of Multicharacter Sequences in Lines of Imaged | text | |
Detection and rectification of arbitrary shaped scene | text | s by using text keypoints and links |
Detection and Segmentation of Antialiased | text | in Screen Images |
Detection Approaches for Table Semantics in | text | |
Detection of Curved | text | in Video: Quad Tree Based Method |
Detection of curved | text | path based on the fuzzy curve-tracing (FCT) algorithm |
Detection of Data Hiding in Binary | text | Images |
Detection of | text | marks on moving vehicles |
Detection of | text | on road signs from video |
Detection of | text | Region and Segmentation from Natural Scene Images |
Detection of | text | regions from digital engineering drawings |
Determining Number of Clusters Using Firefly Algorithm with Cluster Merging for | text | Clustering |
Deterministic Turing Machine for Con | text | Sensitive Translation of Braille Codes to Urdu Text, A |
Devanagari and Bangla | text | Extraction from Natural Scene Images |
Devanagari | text | Recognition: A Transcription Based Formulation |
Development and Evaluation of | text | Localization Techniques Based on Structural Texture Features and Neural Classifiers |
Development of a Robust and Compact On-Line Handwritten Japanese | text | Recognizer for Hand-Held Devices |
DF-GAN: A Simple and Effective Baseline for | text | -to-Image Synthesis |
Dictionary design for | text | image compression with JBIG2 |
Dictionary-guided Scene | text | Recognition |
Different Approaches to Bilingual | text | Classification Based on Grammatical Inference Techniques |
Differential-Processing Extraction Approach to | text | and Image Segmentation, A |
Differentiation of alphabets in handwritten | text | s |
Diffusion for | text | to Image Generation |
Diffusion-SDF: | text | -to-Shape via Voxelized Diffusion |
DiffusionCLIP: | text | -Guided Diffusion Models for Robust Image Manipulation |
DiffusionRet: Generative | text | -Video Retrieval with Diffusion Model |
DiffusionSTR: Diffusion Model for Scene | text | Recognition |
Digital image analysis to enhance underwritten | text | in the Archimedes palimpsest |
Digital Ink Recognition Server for Handwritten Japanese | text | , A |
Diphone Spanish | text | -to-speech synthesizer |
Direct Regression Scene | text | Detector With Position-Sensitive Segmentation, A |
Direct | text | to Speech Translation System Using Acoustic Units |
Direct Unsupervised | text | Line Extraction from Colored Historical Manuscript Images Using DCT |
Discovering Low-Rank Shared Concept Space for Adapting | text | Mining Models |
Discovering meaningful multimedia patterns with audio-visual concepts and associated | text | |
Discrete Joint Semantic Alignment Hashing for Cross-Modal Image- | text | Search |
Discrete-continuous Action Space Policy Gradient-based Attention for Image- | text | Matching |
Discrimination of machine-printed from handwritten | text | using simple structural characteristics |
Discriminative Class Tokens for | text | -to-Image Diffusion Models |
Discriminative Kernel-Based Approach to Rank Images from | text | Queries, A |
Discriminative Model for On-line Handwritten Japanese | text | Retrieval, A |
discriminative semi-Markov model for robust scene | text | recognition, A |
Disease-Specific Extraction of | text | from Cardiac Echo Videos for Decision Support |
Disentangled Contour Learning for Quadrilateral | text | Detection |
Dissecting Deep Metric Learning Losses for Image- | text | Retrieval |
Distilling Knowledge of Bidirectional Language Model for Scene | text | Recognition |
Distinction between handwritten and machine-printed | text | based on the bag of visual words model |
Distinguishing between Handwritten and Machine Printed | text | in Bank Cheque Images |
Distinguishing mathematics notation from English | text | using computational geometry |
Distinguishing | text | /Non-Text Natural Images with Multi-Dimensional Recurrent Neural Networks |
Distributional semantics of objects in visual scenes in comparison to | text | |
Diversified | text | -to-image generation via deep mutual information estimation |
DM-GAN: Dynamic Memory Generative Adversarial Networks for | text | -To-Image Synthesis |
DOC: | text | Recognition via Dual Adaptation and Clustering |
Document Analysis System Based on | text | Line Matching of Multiple OCR Outputs, A |
Document dewarping via | text | -line based optimization |
Document filtering for fast approximate string matching of erroneous | text | |
Document Image De-warping Based on Detection of Distorted | text | Lines |
Document Image Dewarping using Robust Estimation of Curled | text | Lines |
Document image ground truth generation from electronic | text | |
Document Mining Based on Semantic Understanding of | text | |
Document Rectification Approach Dealing with Both Perspective Distortion and Warping Based on | text | Flow Curve Fitting, A |
Document segmentation and classification into musical scores and | text | |
Document skew detection/control system for printed document images containing a mixture of pure | text | lines and non-text portions |
Does | text | attract attention on e-commerce images: A novel saliency prediction dataset and method |
Domain adaptive multigranularity proposal network for | text | detection under extreme traffic scenes |
Don't Forget Me: Accurate Background Recovery for | text | Removal via Modeling Local-Global Context |
Dot | text | Detection Based on FAST Points |
Downtown Osaka Scene | text | Dataset |
Dream3D: Zero-Shot | text | -to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models |
DreamBooth3D: Subject-Driven | text | -to-3D Generation |
DreamBooth: Fine Tuning | text | -to-Image Diffusion Models for Subject-Driven Generation |
DreamStone: Image as a Stepping Stone for | text | -Guided 3D Shape Generation |
DS | text | V2: A comprehensive video text spotting dataset for dense and small text |
Dual Adversarial Inference for | text | -to-Image Synthesis |
Dual Alignment Unsupervised Domain Adaptation for Video- | text | Retrieval |
Dual Encoding for Video Retrieval by | text | |
Dual Relation Network for Scene | text | Recognition |
Dual-Level Representation Enhancement on Characteristic and Con | text | for Image-Text Retrieval |
Dual-path CNN with Max Gated block for | text | -based person re-identification |
Dual-Path Rare Content Enhancement Network for Image and | text | Matching |
DUET: Detection Utilizing Enhancement for | text | in Scanned or Captured Documents |
DVHMM: variable length | text | recognition error model |
Dynamic Contrastive Distillation for Image- | text | Retrieval |
Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End | text | Spotting |
Dynamic receptive field adaptation for scene | text | recognition |
Dynamic recognition in the omni-writer frame: Application to hand-printed | text | recognition |
Dynamic | text | Line Segmentation for Real-Time Recognition of Chinese Handwritten Sentences |
Dynamic | text | s From UAV Perspective Natural Images |
Dynamic Unilateral Dual Learning for | text | to Image Synthesis |
Dynamic Word Based | text | Compression |
DynTypo: Example-Based Dynamic | text | Effects Transfer |
E2VTS: Energy-Efficient Video | text | Spotting from Unmanned Aerial Vehicles |
Early feature stream integration versus decision level combination in a multiple classifier system for | text | line recognition |
Earthquake Information Extraction and Comparison from Different Sources Based on Web | text | |
EAST: An Efficient and Accurate Scene | text | Detector |
EDA: Explicit | text | -Decoupling and Dense Alignment for 3D Visual Grounding |
Edge Based Binarization for Video | text | Images |
Edge-Based Features for Localization of Artificial Urdu | text | in Video Images |
Edge-based method for | text | detection from complex document images |
Edge-based | text | localization and character segmentation algorithms for automatic slab information recognition |
Edit Probability for Scene | text | Recognition |
Editing Implicit Assumptions in | text | -to-Image Diffusion Models |
Educational video understanding: Mapping handwritten | text | to textbook chapters |
EESSO: Exploiting Extreme and Smooth Signals via Omni-frequency learning for | text | -based Person Retrieval |
Effect of Improved Path Evaluation for On-line Handwritten Japanese | text | Recognition |
Effective 3D | text | Recurrent Voting Generator for Metaverse, An |
Effective and efficient video | text | extraction using key text points |
Effective feature descriptor-based new framework for off-line | text | -independent writer identification |
effective method for | text | line segmentation in historical document images, An |
effective sentence-extraction technique using con | text | ual information and statistical approaches for text summarization, An |
Effective shrinkage of large multi-class linear SVM models for | text | categorization |
Effective | text | localization in natural scene images with MSER, geometry-based grouping and AdaBoost |
Effective Uyghur Language | text | Detection in Complex Background Images for Traffic Prompt Identification |
Effective video | text | detection using line features |
Effectively localize | text | in natural scene images |
effectiveness of T5, GPT-2, and BERT on | text | -to-image generation task, The |
Efficiency investigation of manifold matching for | text | document classification |
Efficient Algorithm for Segmenting Warped | text | -Lines in Document Images, An |
Efficient and Accurate Arbitrary-Shaped | text | Detection With Pixel Aggregation Network |
Efficient and flexible | text | extraction from document pages |
Efficient Automatic | text | Location Method and Content-Based Indexing and Structuring of Video Database |
Efficient Character Skew Rectification in Scene | text | Images |
Efficient Exploration of | text | Regions in Natural Scene Images Using Adaptive Image Sampling |
Efficient graph-based dictionary search and its application to | text | -image searching |
Efficient indexing for Query By String | text | retrieval |
Efficient Industrial System for Vehicle Tyre (Tire) Detection and | text | Recognition Using Deep Learning, An |
Efficient Light Balancing Techniques for | text | Images in Video Presentation Systems |
Efficient Method for Offline | text | Independent Writer Identification, An |
Efficient Method for | text | Detection in Video Based on Stroke Width Similarity, An |
Efficient Multimodal Aggregation Network for Video- | text | Retrieval, An |
Efficient Scene | text | localization and recognition with local character refinement |
Efficient side information encoding for | text | hardcopy documents |
Efficient System for Hazy Scene | text | Detection using a Deep CNN and Patch-NMS, An |
Efficient | text | analyser with prosody generator-driven approach for Mandarin text-to-speech |
Efficient | text | Capture Method for Moving Robots Using DCT Feature and Text Tracking, An |
Efficient | text | Classification Using Tree-structured Multi-linear Principal Component Analysis |
Efficient | text | independent speaker recognition with wavelet feature selection based multilayered neural network using supervised learning algorithm |
Efficient | text | localization in born-digital images by local contrast-based segmentation |
Efficient | text | Segmentation Technique Based on Naive Bayes Classifier, An |
Efficient | text | -based Person Search via Single-stage Identity-guided Attribute Parsing and Alignment |
Efficient Token-Guided Image- | text | Retrieval With Consistent Multimodal Contrastive Training |
Efficient Transcript Mapping to Ease the Creation of Document Image Segmentation Ground Truth with | text | -Image Alignment |
Efficient video | text | detection using edge features |
Efficient video | text | recognition using multiple frame integration |
Efficient Visual Search of Videos Cast as | text | Retrieval |
Eigenspace method for | text | retrieval in historical document images |
ELITE: Encoding Visual Concepts into | text | ual Embeddings for Customized Text-to-Image Generation |
Embedded Application for Degraded | text | Recognition, An |
Embedded Bernoulli Mixture HMMs for Continuous Handwritten | text | Recognition |
embedded method: Improve the relevance of | text | and face image with enhanced face attributes, An |
Embil: An English-Manipuri Bi-lingual Benchmark for Scene | text | Detection and Language Identification |
EmoLabel: Semi-Automatic Methodology for Emotion Annotation of Social Media | text | |
Emotion Correlation Mining Through Deep Learning Models on Natural Language | text | |
Emotion Recognition in | text | for 3-D Facial Expression Rendering |
EmotionAlBERTo: Emotion Recognition of Italian Social Media | text | s Through BERT |
EMU: Effective Multi-Hot Encoding Net for Lightweight Scene | text | Recognition With a Large Character Set |
Encoding Video Narration as | text | |
End-to-End Handwritten Paragraph | text | Recognition Using a Vertical Attention Network |
end-to-end model for multi-view scene | text | recognition, An |
End-to-end OCR | text | Re-organization Sequence Learning for Rich-text Detail Image Comprehension, An |
End-to-End page-Level assessment of handwritten | text | recognition |
End-to-End Pre-Training With Hierarchical Matching and Momentum Contrast for | text | -Video Retrieval |
End-to-end scene | text | recognition |
End-to-end scene | text | recognition using tree-structured models |
End-to-end | text | recognition with convolutional neural networks |
End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene | text | Recognition, An |
End-to-End Video | text | Detection with Online Tracking |
Enforcing similarity constraints with integer programming for better scene | text | recognition |
Enhanced Active Contour Method for Locating | text | |
Enhanced Network Embedding with | text | Information |
Enhanced Probabilistic Neural Network Approach Applied to | text | Classification, An |
Enhanced | text | Extraction from Arabic Degraded Document Images Using EM Algorithm |
Enhancement and feature extraction for images of incised and ink | text | s |
Enhancement of camera captured | text | images with specular reflection |
Enhancement of | text | images using a context based nonlinear interpolative vector quantization method |
Enhancing energy minimization framework for scene | text | recognition with top-down cues |
Enhancing fine-detail image synthesis from | text | descriptions by text aggregation and connection fusion module |
Enhancing Handwritten | text | Recognition with N-gram sequence decomposition and Multitask Learning |
Enhancing Scene | text | Detection via Fused Semantic Segmentation Network with Attention |
Enhancing | text | -like edges in digital images |
Enhancing the Video Editing Capabilities of | text | -to-Video Generators Using DDPM Inversion |
Enriching Video Captions With Con | text | ual Text |
Ensemble Methods to Improve the Performance of an English Handwritten | text | Line Recognizer |
Episodic Learning Network for | text | Detection on Human Bodies in Sports Images, An |
ER-Chat: A | text | -to-Text Open-Domain Dialogue Framework for Emotion Regulation |
EraseNet: End-to-End | text | Removal in the Wild |
Erasing Scene | text | with Weak Supervision |
ERNIE-ViLG 2.0: Improving | text | -to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts |
ESA: External Space Attention Aggregation for Image- | text | Retrieval |
ESIR: End-To-End Scene | text | Recognition via Iterative Image Rectification |
ES | text | Spotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer |
Estimating the Orientation and Recovery of | text | Planes in a Single Image |
Estimating the readability of handwritten | text | : A Support Vector Regression based approach |
Estimation of Skew Angle in | text | -Image Analysis by SLIDE: Subspace-Based Line Detection |
Evaluating a Hidden Markov Model of Syntax in a | text | Recognition System |
Evaluating Data Attribution for | text | -to-Image Models |
Evaluating OCR and Non-OCR | text | Representations for Learning Document Classifiers |
Evaluating | text | -to-Image Matching using Binary Image Selection (BISON) |
Evaluation of HMM-Based Techniques for the Recognition of Screen Rendered | text | , An |
Evaluation of Model-Based Retrieval Effectiveness with OCR | text | |
Evaluation of neural network language models in handwritten Chinese | text | recognition |
Evaluation of the Concatenative Turkish | text | -to-Speech System |
Evaluation of the Optimal Topic Classification for Social Media Data Combined with | text | Semantics: A Case Study of Public Opinion Analysis Related to COVID-19 with Microblogs |
Event-Guided Procedure Planning from Instructional Videos with | text | Supervision |
evidence-based model of saliency feature extraction for scene | text | analysis, An |
Evolution Maps for Connected Components in | text | Documents |
example-based prior model for | text | image super-resolution, An |
Expanding Large Pre-trained Unimodal Models with Multimodal Information Injection for Image- | text | Multimodal Classification |
Experimental Evaluation of OCR | text | Representations for Learning Document Classifiers, An |
Experimental Investigation of | text | -Based CAPTCHA Attacks and Their Robustness, An |
Experimental Study of Pruning Techniques in Handwritten | text | Recognition Systems, An |
Experimental System for Office Document Handling and | text | Recognition, An |
Experiments in | text | Recognition with Binary N-Gram and Viterbi Algorithms |
Experiments in | text | Recognition with the Modified Viterbi Algorithm |
Experiments in the Recognition of Handprinted | text | : Part I Character Recognition |
Explain2Attack: | text | Adversarial Attacks via Cross-Domain Interpretability |
Explaining Semantic | text | Similarity in Knowledge Graphs |
Exploiting Color Information for Better Scene | text | Recognition |
Exploiting colour information for better scene | text | detection and recognition |
Exploring Effective Interactive | text | -Based Video Search in vitrivr |
Exploring Global and Local Linguistic Representations for | text | -to-Image Synthesis |
Exploring Phrase Grounding without Training: Con | text | ualisation and Extension to Text-Based Image Retrieval |
Exploring Sparse Spatial Relation in Graph Inference for | text | -Based VQA |
Exploring | text | representation impact on K-means based arabic text documents clustering |
Exploring the Capacity of an Orderless Box Discretization Network for Multi-orientation Scene | text | Detection |
Exploring the Spatial Distribution Characteristics of Emotions of Weibo Users in Wuhan Waterfront Based on Gender Differences Using Social Media | text | s |
Exploring the Spatiotemporal Patterns of Residents' Daily Activities Using | text | -Based Social Media Data: A Case Study of Beijing, China |
Exposing fake images generated by | text | -to-image diffusion models |
Expressive | text | -to-Image Generation with Rich Text |
Expressive visual | text | -to-speech as an assistive technology for individuals with autism spectrum conditions |
Expressive Visual | text | -to-Speech Using Active Appearance Models |
Extended Bi-gram Features in | text | Categorization |
Extended character defect model for recognition of | text | from maps |
Extending TrOCR for | text | Localization-Free OCR of Full-Page Scanned Receipt Images |
External Word Segmentation of Off-Line Handwritten | text | Lines |
Extracting Curved | text | Lines Using Local Linearity of Text Line |
Extracting Spatio-Temporal Information from Chinese Archaeological Site | text | |
Extracting | text | From Greyscale Images |
Extracting | text | from WWW Images |
Extraction and Recognition of Bangla | text | s from Natural Scene Images Using CNN |
Extraction of Handwritten | text | from Carbon Copy Medical Form Images |
Extraction of line-word-character segments directly from run-length compressed printed | text | -documents |
Extraction of Lines of | text | s in Unconstrained Handwritten Documents |
Extraction of Nom | text | Regions from Stele Images Using Area Voronoi Diagram |
Extraction of Pluvial Flood Relevant Volunteered Geographic Information (VGI) by Deep Learning from User Generated | text | s and Photos |
Extraction of Projection Profile, Run-Histogram and Entropy Features Straight from Run-Length Compressed | text | -Documents |
Extraction of special effects caption | text | events from digital video |
Extraction of Spelling Variations from Language Structure for Noisy | text | Correction |
Extraction of | text | boxes from Engineering Drawings |
Extraction of | text | Lines and Text Blocks on Document Images Based on Statistical Modeling |
Extraction of | text | Words in Document Images Based on a Statistical Characterization |
Extraction Of Thematically Relevant | text | From Images |
extractive | text | summarization technique for Bengali document(s) using K-means clustering algorithm, An |
Extractive | text | Summarization Using Topological Features |
Extremely Low-Light Image Enhancement with Scene | text | Restoration |
FA-GAN: Feature-Aware GAN for | text | to Image Synthesis |
Face typing: Vision-based perceptual interface for hands-free | text | entry with a scrollable virtual keyboard |
FaceCLIPNeRF: | text | -driven 3D Face Manipulation using Deformable Neural Radiance Fields |
Faces a la Carte: | text | -to-Face Generation via Attribute Disentanglement |
Facsimile device with skew correction and | text | line direction detection |
Factors in Emotion Recognition With Deep Learning Models Using Speech and | text | on Multiple Corpora |
Fantasia3D: Disentangling Geometry and Appearance for High-quality | text | -to-3D Content Creation |
FARNet: Fragmented affinity reasoning network of | text | instances for arbitrary shape text detection |
Fashion Image Retrieval with | text | Feedback by Additive Attention Compositional Learning |
Fast and accurate scene | text | understanding with image binarization and off-the-shelf OCR |
Fast and Accurate | text | Detection in Natural Scene Images with User-Intention |
Fast and effective | text | detection |
fast and efficient method for extracting | text | paragraphs and graphics from unconstrained documents, A |
Fast and Efficient | text | Steganalysis Method, A |
Fast and Flexible Statistical Method for | text | Extraction in Document Pages, A |
Fast and memory efficient | text | image compression with JBIG2 |
Fast and robust | text | detection in images and video frames |
fast and robust | text | spotter, A |
Fast and simple | text | replacement algorithm for text-based augmented reality |
Fast Appearance-Based Full- | text | Search Method for Historical Newspaper Images, A |
Fast Approximate Modelling of the Next Combination Result for Stopping the | text | Recognition in a Video |
Fast Coding-Mode Selection and CU-Depth Prediction Algorithm Based on | text | -Block Recognition for Screen Content Coding |
fast hierarchical method for multi-script and arbitrary oriented scene | text | extraction, A |
Fast Lexicon-Based Scene | text | Recognition with Sparse Belief Propagation |
fast multiresolution | text | line and non text-line structures extraction and discrimination scheme for document image analysis, A |
Fast online incremental approach of unseen place classification using disjoint- | text | attribute prediction |
Fast perspective recovery of | text | in natural scenes |
Fast scene | text | localization by learning-based filtering and verification |
Fast Selection of Small and Precise Candidate Sets from Dictionaries for | text | Correction Tasks |
Fast Supervised Topic Models for Short | text | Emotion Detection |
Fast | text | categorization using concise semantic analysis |
Fast | text | line detection by finding linear connected components on Canny edge image |
Fast | text | line extraction in document images |
Fast | text | /graphics resolution improvement using wavelet based denoising and chain-code table lookup |
Fast Uyghur | text | detection in videos based on learning of baseline feature |
Fast Uyghur | text | Detector for Complex Background Images, A |
Fast(er) Reconstruction of Shredded | text | Documents via Self-Supervised Deep Asymmetric Metric Learning |
FAST: Facilitated and Accurate Scene | text | Proposals through FCN Guided Pruning |
FAS | text | : Efficient Unconstrained Scene Text Detector |
FateZero: Fusing Attentions for Zero-shot | text | -based Video Editing |
Feature Embedding Based | text | Instance Grouping for Largely Spaced and Occluded Text Detection |
Feature extracted from wavelet decomposition using biorthogonal Riesz basis for | text | -independent speaker recognition |
Feature extracted from wavelet eigenfunction estimation for | text | -independent speaker recognition |
Feature First: Advancing Image- | text | Retrieval Through Improved Visual Features |
Feature Fusion Network for Scene | text | Detection |
Feature Representations for Scene | text | Character Recognition: A Comparative Study |
Feature selection for event extraction in biomedical | text | |
Feature selection to recognize | text | from palm leaf manuscripts |
Feature selection using hybrid poor and rich optimization algorithm for | text | classification |
Feature subset selection using naive Bayes for | text | classification |
Feature Weight Optimization and Pruning in Historical | text | Recognition |
FETNet: Feature erasing and transferring network for scene | text | removal |
Few Could Be Better Than All: Feature Sampling and Grouping for Scene | text | Detection |
Few shots are all you need: A progressive learning approach for low resource handwritten | text | recognition |
Few-Shot | text | Style Transfer via Deep Feature Similarity |
Fg-T2M: Fine-Grained | text | -Driven Human Motion Generation via Diffusion Model |
FHT: An Unconstraint Farsi Handwritten | text | Database |
Find More Accurate | text | Boundary for Scene Text Detection |
Find | text | in Documents |
Find | text | in Video Scenes |
Finding Hidden Semantics of | text | Tables |
Finding structure in noisy | text | : Topic classification and unsupervised clustering |
Finding | text | In Images |
Finding | text | in Natural Scenes by Figure-Ground Segmentation |
Finding | text | Regions using Localised Statistical Measures |
Fine-grained Image- | text | Matching by Cross-modal Hard Aligning Network |
Fine-grained Image- | text | Retrieval via Complementary Feature Learning |
Fine-Grained Image- | text | Retrieval via Discriminative Latent Space Learning |
Fine-Grained Video- | text | Retrieval With Hierarchical Graph Reasoning |
Fisher Linear Discriminant Analysis for | text | -image combination in multimedia information retrieval |
Flexible | text | Recovery from Degraded Typewritten Historical Documents |
Flick Typing: A New VR | text | Input System Based on Space Gestures |
Focal | text | : an Accurate Text Detection with Focal Loss |
Focal Visual- | text | Attention for Memex Question Answering |
Focal Visual- | text | Attention for Visual Question Answering |
Focus on Scene | text | Using Deep Reinforcement Learning |
Focusing Attention: Towards Accurate | text | Recognition in Natural Images |
Font Recognition and Con | text | ual Processing for More Accurate Text Recognition |
Font Watermarking Network for | text | Images |
Fontender: Interactive Japanese | text | Design with Dynamic Font Fusion Method for Comics |
Foreground and background separated image style transfer with a single | text | condition |
Foreground and | text | -lines Aware Document Image Rectification |
Foreground | text | Extraction in Color Document Images for Enhanced Readability |
Foreground | text | segmentation in complex color document images using Gabor filters |
Forged | text | detection in video, scene, and document images |
Formal Distance vs. Association Strength in | text | Processing |
Formalization of On-Line Handwritten Japanese | text | Recognition Free from Line Direction Constraint, A |
FOTS: Fast Oriented | text | Spotting with a Unified Network |
Fourier Contour Embedding for Arbitrary-Shaped | text | Detection |
Fractals Based Multi-Oriented | text | Detection System for Recognition in Mobile Video Images |
Fractional poisson enhancement model for | text | detection and recognition in video frames |
Framework for Detecting and Selecting | text | Line Candidates of Correct Orientation, A |
Framework for Performance Evaluation of Face, | text | , and Vehicle Detection and Tracking in Video: Data, Metrics, and Protocol |
Free- | text | keystroke dynamics authentication for Arabic language |
FREE: A Fast and Robust End-to-End Video | text | Spotter |
Frequent Itemsets Methods for | text | Clustering |
Fringe Map Based | text | Line Segmentation of Printed Telugu Document Images |
From External to Internal: Structuring Image for | text | -to-Image Attributes Manipulation |
From senses to | text | s: An all-in-one graph-based approach for measuring semantic similarity |
From | text | Detection to Text Segmentation: A Unified Evaluation Scheme |
From | text | to Speech: A Multimodal Cross-Domain Approach for Deception Detection |
From | text | to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification |
From Two to One: A New Scene | text | Recognizer with Visual Language Modeling Network |
Full- | text | Access to Historical Newspapers |
Full- | text | Search System for Images of Hand-Written Cursive Documents, A |
Fully convolutional network with dilated convolutions for handwritten | text | line segmentation |
Fully convolutional recurrent network for handwritten Chinese | text | recognition |
Fully Shareable Scene | text | Recognition Modeling for Horizontal and Vertical Writing |
Fundamental Visual Concept Learning From Correlated Images and | text | |
Furniture-geek: Understanding fine-grained furniture attributes from freely associated | text | and tags |
Further explorations in | text | alignment with handwritten documents |
Further reduced form of wavelet feature for | text | independent speaker recognition |
Fused | text | Segmentation Networks for Multi-oriented Scene Text Detection |
Fusion Encoder with Multi-Task Guidance for Cross-Modal | text | -Image Retrieval in Remote Sensing, A |
Fusion of Speech, Faces and | text | for Person Identification in TV Broadcast |
Fusion Strategy for the Single Shot | text | Detector, A |
fuzzy find matching tool for image | text | analysis, A |
Fuzzy Inference-Based Models for Extractive | text | Summarization |
Fuzzy Semantics for Arbitrary-Shaped Scene | text | Detection |
GA-DAN: Geometry-Aware Domain Adaptation Network for Scene | text | Detection and Recognition |
Gabor filter based block energy analysis for | text | extraction from digital document images |
GALIP: Generative Adversarial CLIPs for | text | -to-Image Synthesis |
Gamma correction acceleration for real-time | text | extraction from complex colored images |
GAN-TSTEGA: | text | Steganography Based on Generative Adversarial Networks |
Gated Cross Word-visual Attention-driven Generative Adversarial Networks for | text | -to-image Synthesis |
Gatha: Relational Loss for enhancing | text | -based style transfer |
Gaussian Constrained Attention Network for Scene | text | Recognition |
Gaussian mixture modeling and learning of neighboring characters for multilingual | text | extraction in images |
Gaussian Mixture Modeling of Neighbor Characters for Multilingual | text | Extraction in Images |
GCNs-Based Con | text | -Aware Short Text Similarity Model |
General and domain-specific techniques for detecting and recognizing superimposed | text | in video |
general approach for multi-oriented | text | line extraction of handwritten documents, A |
Generalized Interpolative Vector Quantization Method for Jointly Optimal Quantization, Interpolation, and Binarization of | text | Images, A |
Generalizing Edit Distance to Incorporate Domain Information: Handwritten | text | Recognition as a Case-Study |
Generating Diverse and Natural 3D Human Motions from | text | |
Generating Holistic 3D Scene Abstractions for | text | -Based Image Retrieval |
Generation of Viewed Image Captions From Human Brain Activity Via Unsupervised | text | Latent Space |
Generative Adversarial Approach for Zero-Shot Learning from Noisy | text | s, A |
Generative Adversarial Network for | text | -to-Face Synthesis and Manipulation with Pretrained BERT Model |
Generative Adversarial Networks Based on Dynamic Word-Level Update for | text | -to-Image Synthesis |
Generative and Discriminative Fuzzy Restricted Boltzmann Machine Learning for | text | and Image Classification |
Generative Negative | text | Replay for Continual Vision-Language Pretraining |
Generative | text | Convolutional Neural Network for Hierarchical Document Representation Learning |
generic method for determining the up/down orientation of | text | in Roman and non-Roman scripts, A |
generic method for determining up/down orientation of | text | in Roman and non-Roman scripts, A |
GeoAnnotator: A Collaborative Semi-Automatic Platform for Constructing Geo-Annotated | text | Corpora |
Geometry Normalization Networks for Accurate Scene | text | Detection |
Geometry-Aware Scene | text | Detection with Instance Transformation Network |
Geospatial Semantics Analysis of the Qinghai-Tibetan Plateau Based on Microblog Short | text | s |
Geotagging | text | Content With Language Models and Feature Mining |
GLASS: Global to Local Attention for Scene- | text | Spotting |
GLIGEN: Open-Set Grounded | text | -to-Image Generation |
Global-Shared | text | Representation Based Multi-Stage Fusion Transformer Network for Multi-Modal Dense Video Captioning |
goal-oriented verification-based approach for target | text | line extraction from a document image captured by a pen scanner, A |
Gradient Difference Based Technique for Video | text | Detection, A |
Gradient Vector Flow and Grouping-Based Method for Arbitrarily Oriented Scene | text | Detection in Video Images |
Gradient-based approach to offline | text | -independent Persian writer identification |
GraDual: Graph-based Dual-modal Representation for Image- | text | Matching |
Graph based method for Arabic | text | summarization |
Graph Clustering-Based Ensemble Method for Handwritten | text | Line Segmentation |
Graph Structured Network for Image- | text | Matching |
Graph-based Method to Remove Interferential Curve From | text | Image, A |
Graph-Based Segmentation and Feature-extraction Framework for Arabic | text | Recognition, A |
Graph-Based | text | Segmentation Using a Selected Channel Image |
Graphical Figure Classification Using Data Fusion for Integrating | text | and Image Features |
Graphics and Scene | text | Classification in Video |
Graphological Analysis of Handwritten | text | Documents for Human Resources Recruitment |
Grounded Image | text | Matching with Mismatched Relation Reasoning |
Grounding Visual Representations with | text | s for Domain Generalization |
Grouping | text | lines in freeform handwritten notes |
Grouping Using Factor Graphs: An Approach for Finding | text | with a Camera Phone |
GroupViT: Semantic Segmentation Emerges from | text | Supervision |
Guided | text | Spotting for Assistive Blind Navigation in Unfamiliar Indoor Environments |
HairCLIP: Design Your Hair by | text | and Reference Image |
HAM: Hidden Anchor Mechanism for Scene | text | Detection |
Hand-Gesture Based | text | Input for Wearable Computers |
Hand-written | text | recognition based on a new formulation |
Handwriting Recognition: Tablet PC | text | Input |
handwritten ancient | text | detector based on improved feature pyramid network, A |
Handwritten and Machine Printed | text | Separation in Document Images Using the Bag of Visual Words Paradigm |
Handwritten and Printed | text | Segmentation: A Signature Case Study |
Handwritten and Printed | text | Separation: Linearity and Regularity Assessment |
Handwritten and Typewritten | text | Identification and Recognition Using Hidden Markov Models |
Handwritten Arabic | text | recognition using Deep Belief Networks |
Handwritten Arabic | text | recognition using multi-stage sub-core-shape HMMs |
Handwritten Chinese | text | line segmentation by clustering with distance metric learning |
Handwritten Chinese | text | Recognition by Integrating Multiple Contexts |
Handwritten Chinese/Japanese | text | Recognition Using Semi-Markov Conditional Random Fields |
Handwritten document image segmentation into | text | lines and words |
Handwritten Signature and | text | based User Verification using Smartwatch |
Handwritten | text | Generation from Visual Archetypes |
Handwritten | text | Generation via Disentangled Representations |
Handwritten | text | Line Identification in Indian Scripts |
Handwritten | text | Line Segmentation by Shredding Text into its Lines |
Handwritten | text | Localization in Skewed Documents |
Handwritten | text | Recognition for Marriage Register Books |
Handwritten | text | recognition through writer adaptation |
Handwritten | text | Retrieval Using Two-Stage Pattern Matching with Handwritten Query |
Handwritten | text | segmentation using average longest path algorithm |
Handwritten | text | Segmentation Using Elastic Shape Analysis |
Handwritten | text | Separation from Annotated Machine Printed Documents Using Markov Random Fields |
Handwritten | text | s for Personality Identification Using Convolutional Neural Networks |
Harnessing the Power of | text | -image Contrastive Models for Automatic Detection of Online Misinformation |
Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity | text | -to-Image Synthesis |
Head-Mounted Device for Recognizing | text | in Natural Scenes, A |
Heterogeneous Graph to Abstract Syntax Tree Framework for | text | -to-SQL, A |
HFENet: Hybrid Feature Enhancement Network for Detecting | text | s in Scenes and Traffic Panels |
HGAN: Hierarchical Graph Alignment Network for Image- | text | Retrieval |
HGR-Net: Hierarchical Graph Reasoning Network for Arbitrary Shape Scene | text | Detection |
Hidden Bawls, Whispers, and Yelps: Can | text | Convey the Sound of Speech, Beyond Words? |
Hidden Markov Model for Language Syntax in | text | Recognition, A |
Hidden Markov Model-Based Ensemble Methods for Offline Handwritten | text | Line Recognition |
Hierarchical Feature Aggregation Based on Transformer for Image- | text | Matching |
Hierarchical online NMF for detecting and tracking topic hierarchies in a | text | stream |
Hierarchical Shape Primitive Features for Online | text | -independent Writer Identification |
Hierarchically-Fused Generative Adversarial Network for | text | to Realistic Image Synthesis |
high-capacity | text | watermarking method based on geometric micro-distortion, A |
High-Dimensional Access Method for Approximated Similarity Search in | text | Mining, A |
Highly Transparent and Secure Scheme for Concealing | text | Within Audio |
Histogram-Based Two-Stage Adaptive Character Segmentation for Transcription of Inter-Point Hindi Braille to | text | , A |
Historical Handwritten | text | Images Word Spotting Through Sliding Window HOG Features |
HiT: Hierarchical Transformer with Momentum Contrast for Video- | text | Retrieval |
HMM-Based Approach for | text | Region Detection in Coded Video Bitstreams |
HMM-Based Multi Oriented | text | Recognition in Natural Scene Image |
HMM-Based Recognizer with Segmentation-free Strategy for Unconstrained Chinese Handwritten | text | |
Holistic Vertical Regional Proposal Network for Scene | text | Detection |
How Good Is Good Enough? Establishing Quality Thresholds for the Automatic | text | Analysis of Retro-Digitized Comics |
How Much Handwritten | text | Is Needed for Text-Independent Writer Verification and Identification |
HowTo100M: Learning a | text | -Video Embedding by Watching Hundred Million Narrated Video Clips |
HRS-Bench: Holistic, Reliable and Scalable Benchmark for | text | -to-Image Models |
HTD: A Fast Human-centered | text | -locating Method for Auxiliary Reading |
Human Preference Score: Better Aligning | text | -to-image Models with Human Preference |
Hybrid Algorithm for Con | text | ual Text Recognition, A |
Hybrid approach for Farsi/Arabic | text | detection and localisation in video frames |
Hybrid Approach to Detect and Localize | text | s in Natural Scene Images, A |
Hybrid Approach to Detect | text | s in Natural Scenes by Integration of a Connected-Component Method and a Sliding-Window Method, A |
Hybrid approach to efficient | text | extraction in complex color images |
Hybrid Chinese/English | text | detection in images and video frames |
Hybrid Con | text | ual Text Recognition with String Matching |
Hybrid Deep Architecture for Robust Recognition of | text | Lines of Degraded Printed Documents, A |
hybrid method based on estimation of distribution algorithms to train convolutional neural networks for | text | categorization, A |
Hybrid Network For End-To-End | text | -Independent Speaker Identification |
Hybrid R-BILSTM-C Neural Network Based | text | Steganalysis, A |
Hybrid word/Part-of-Arabic-Word Language Models for arabic | text | document recognition |
Hypothesis Preservation Approach to Scene | text | Recognition with Weighted Finite-State Transducer |
hypothesize-and-verify framework for | text | recognition using deep recurrent neural networks, A |
Hy | text | : A Scene-Text Extraction Method for Video Retrieval |
I2T2I: Learning | text | to image synthesis with textual data augmentation |
I2T: Image Parsing to | text | Description |
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-Shaped Scene | text | Detection |
IAM-OnDB: An on-line English sentence database acquired from handwritten | text | on a whiteboard |
IBM Rich Transcription 2007 Speech-to- | text | Systems for Lecture Meetings, The |
IBN-STR: A Robust | text | Recognizer for Irregular Text in Natural Scenes |
ICA Based Approach for Complex Color Scene | text | Binarization, An |
ICDAR 2005 | text | locating competition results |
ICDAR 2011 Robust Reading Competition - Challenge 1: Reading | text | in Born-Digital Images (Web and Email) |
ICDAR 2011 Robust Reading Competition Challenge 2: Reading | text | in Scene Images |
ICDAR 2011: Arabic Recognition Competition: Multi-font Multi-size Digitally Represented | text | |
ICDAR 2015 competition HTRtS: Handwritten | text | Recognition on the tranScriptorium dataset |
ICDAR 2015 competition on | text | line detection in historical documents |
ICDAR 2015 contest on MultiSpectral | text | Extraction (MS-TEx 2015) |
ICDAR2013 Competition on Multi-font and Multi-size Digitally Represented Arabic | text | |
ICDAR2015 competition on | text | Image Super-Resolution |
ICPR 2020 Competition on | text | Block Segmentation on a NewsEye Dataset |
ICPR 2020 Competition on | text | Block Segmentation on a Newseye Dataset |
ICPR2016 contest on Arabic | text | detection and Recognition in video frames - AcTiVComp |
ICPR2020 Competition on | text | Detection and Recognition in Arabic News Video Frames |
IDBNet: Improved differentiable binarisation network for natural scene | text | detection |
Identification of personality traits from handwritten | text | documents using multi-label classification models |
Identifying Handwritten | text | in Mixed Documents |
Identifying SCADA Systems and Their Vulnerabilities on the Internet of Things: A | text | -Mining Approach |
Identifying visual attributes for object recognition from | text | and taxonomy |
Illegible | text | to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks |
iLogBook: Enabling | text | -Searchable Event Query Using Sparse Vehicle-Mounted GPS Data |
Im2 | text | and Text2Im: Associating Images and Texts for Cross-Modal Retrieval |
Image and | text | Coupling for Creating Electronic Books from Manuscripts |
Image and | text | fusion for UPMC Food-101 using BERT and CNNs |
Image Binarization for End-to-End | text | Understanding in Natural Images |
Image Dataset of | text | Patches in Everyday Scenes, An |
Image Generation Method of Bird | text | Based on Improved StackGAN |
Image Overlay | text | Detection Based on JPEG Truncation Error Analysis |
Image Retrieval for Visual Localization via Scene | text | Detection and Logo Filtering |
Image Search With | text | Feedback by Visiolinguistic Attention Learning |
Image Segmentation Using | text | and Image Prompts |
Image | text | Detection Using a Bandlet-Based Edge Detector and Stroke Width Transform |
Image-based Document Vectors for | text | Retrieval |
Image- | text | Embedding Learning via Visual and Textual Semantic Reasoning |
Image- | text | Matching, Image Text Retrieval, Image-Text Retrieval |
Image- | text | Multimodal Emotion Classification via Multi-View Attentional Network |
Image- | text | Pre-Training for Logo Recognition |
Image- | text | Retrieval With Cross-Modal Semantic Importance Consistency |
Image-to-Character-to-Word Transformers for Accurate Scene | text | Recognition |
Image/ | text | filtering system and method |
Imaged Document | text | Retrieval Without OCR |
Imagen Editor and EditBench: Advancing and Evaluating | text | -Guided Image Inpainting |
Imagic: | text | -Based Real Image Editing with Diffusion Models |
Impact of Character Models Choice on Arabic | text | Recognition Performance |
Impact of OCR Accuracy and Feature Transformation on Automatic | text | Classification, The |
Impact of OCR Errors on Automated Classification of OCR Japanese | text | s with Parts-of-Speech Analysis, An |
Impact of online handwriting recognition performance on | text | categorization |
Impact of Pre-Processing on Recognition of Cursive Video | text | |
Implementation of Advanced Encryption Standard for encryption and decryption of images and | text | on a GPU |
Implementation of Three | text | to Speech Systems for Kurdish Language |
Implicit Feature Alignment: Learn to Convert | text | Recognizer to Text Spotter |
Improved Component Tree Based Approach to User-Intention Guided | text | Extraction from Natural Scene Images, An |
Improved Document Skew Detection Based on | text | Line Connected-component Clustering |
Improved Gini-Index Algorithm to Correct Feature-Selection Bias in | text | Classification |
Improved Legibility of | text | for Multiprojector Tiled Displays |
Improved localization accuracy by LocNet for Faster R-CNN based | text | detection in natural scene images |
Improved Method Based on Weighted Grid Micro-structure Feature for | text | -Independent Writer Recognition, An |
Improved Scene | text | Extraction Method Using Conditional Random Field and Optical Character Recognition, An |
Improved shot boundary detection method based on | text | edges |
Improved | text | -detection methods for a camera-based text reading system for blind persons |
Improvement of video | text | recognition by character selection |
Improving Cross-Modal Constraints: | text | Attribute Person Search With Graph Attention Networks |
Improving Cross-Modal Image- | text | Retrieval With Teacher-Student Learning |
Improving Description-Based Person Re-Identification by Multi-Granularity Image- | text | Alignments |
Improving distinctiveness in video captioning with | text | -video similarity |
Improving End-to-End | text | Image Translation From the Auxiliary Text Translation Task |
Improving Full- | text | Precision on Short Queries Using Simple Constraints |
Improving Handwritten Chinese | text | Recognition by Confidence Transformation |
Improving handwritten Chinese | text | recognition using neural network language models and convolutional neural network shape models |
Improving Image Recognition by Retrieving from Web-Scale Image- | text | Data |
Improving image similarity measures for image browsing and retrieval through latent space learning between images and long | text | s |
Improving Multi-class | text | Classification with Naive Bayes |
Improving Multiclass | text | Classification with the Support Vector Machine |
Improving OCR | text | Categorization Accuracy with Electronic Abstracts |
Improving Offline Handwritten | text | Recognition with Hybrid HMM/ANN Models |
Improving Open-Vocabulary Scene | text | Recognition |
Improving patch-based scene | text | script identification with ensembles of conjoined networks |
Improving Persian | text | Classification Using Persian Thesaurus |
Improving Scene | text | Detection by Scale-Adaptive Segmentation and Weighted CRF Verification |
Improving | text | Classifier Performance based on AUC |
Improving | text | -Based Person Search by Spatial Matching and Adaptive Threshold |
Improving | text | -image Matching with Adversarial Learning and Circle Loss for Multi-modal Steganography |
Improving Vision-and-language Navigation with Image- | text | Pairs from the Web |
IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image- | text | Retrieval |
In-air Handwritten Chinese | text | Recognition with Attention Convolutional Recurrent Network |
In-air handwritten Chinese | text | recognition with temporal convolutional recurrent network |
In-Style: Bridging | text | and Uncurated Videos with Style Transfer for Text-Video Retrieval |
Incorporating Language Syntax in Visual | text | Recognition with a Statistical-Model |
Incorporating Self-attention Mechanism and Multi-task Learning into Scene | text | Detection |
Incremental Approach to | text | Representation, Categorization, and Retrieval, An |
Incremental Detection of | text | on Road Signs |
Incremental | text | -to-Speech Synthesis Using Pseudo Lookahead With Large Pretrained Language Model |
Indexing On-line Handwritten | text | s Using Word Confusion Networks |
Indexing | text | Events in Digital Video Databases |
InducT-GCN: Inductive Graph Convolutional Networks for | text | Classification |
Industrial Scene | text | Detection With Refined Feature-Attentive Network |
Inferential Rules for Identifying Answers in TOEFL | text | s |
Inferring Semantic Layout for Hierarchical | text | -to-Image Synthesis |
Infinite Liouville mixture models with application to | text | and texture categorization |
Influence of | text | line segmentation in Handwritten Text Recognition |
Information Detection for the Process of Typhoon Events in Microblog | text | : A Spatio-Temporal Perspective |
Information Extraction and Classification from Free | text | Using a Neural Approach |
Information extraction from scanned invoice images using | text | analysis and layout features |
Information fusion for | text | classification: an experimental comparison |
Information Theoretic | text | Classification Using the Ziv-Merhav Method |
InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot | text | -based Video Editing |
Initialized and Guided EM-clustering of Sparse Binary Data with Application to | text | Based Documents |
inpainting system for automatic image structure- | text | ure restoration with text removal, An |
Inspecting the Geographical Representativeness of Images from | text | -to-Image Models |
Instrumental Assessment of Prosodic Quality for | text | -to-Speech Signals |
Integrated Algorithm for | text | Recognition: Comparison with a Cascaded Algorithm, An |
Integrated | text | and Line-Art Extraction from a Topographic Map |
Integrating Geometric Con | text | for Text Alignment of Handwritten Chinese Documents |
Integrating Knowledge Sources in Devanagari | text | Recognition System |
Integrating Language Guidance Into Image- | text | Matching for Correcting False Negatives |
Integrating Language Model in Handwritten Chinese | text | Recognition |
Integrating multiple character proposals for robust scene | text | extraction |
Integrating Visual, Audio and | text | Analysis for News Video |
Integrating word level knowledge in | text | recognition |
Integration of Linguistic and Geospatial Features Using Global Con | text | Embedding for Automated Text Geocoding, The |
Intelligent Typography: Artistic | text | Style Transfer for Complex Texture and Structure |
IntelliSearch: Intelligent Search for Images and | text | on the Web |
Inter-Intra Modal Representation Augmentation With DCT-Transformer Adversarial Network for Image- | text | Matching |
Interactive Enhancement of Handwritten | text | through Multi-resolution Gaussian |
Interactive Image Manipulation with Complex | text | Instructions |
Interactive Off-Line Handwritten | text | Transcription Using On-Line Handwritten Text as Feedback |
Interactive System to Extract Structured | text | from a Geometrical Representation, An |
Interactive | text | books; Embedding Image Processing Operator Demonstrations in Text |
Interleaved | text | /image Deep Mining on a large-scale radiology database |
Interpretation of The Function of The Obelisk of Augustus in Rome From Antique | text | s to Present Time Virtual Reconstruction |
Interword distance changes represented by sine waves for watermarking | text | images |
Intra-Modal Constraint Loss for Image- | text | Retrieval |
Inverse-Like Antagonistic Scene | text | Spotting via Reading-Order Estimation and Dynamic Sampling |
Investigations in Psychological Stress Detection from Social Media | text | using Deep Architectures |
IOS-Net: An inside-to-outside supervision network for scale robust | text | detection in the wild |
Is An Image Worth Five Sentences? A New Look into Semantics for Image- | text | Matching |
Is Arabic | text | categorization a solved task? |
ISL RT-07 Speech-to- | text | System, The |
It's All About The Scale: Efficient | text | Detection Using Adaptive Scaling |
IterVM: Iterative Vision Modeling Module for Scene | text | Recognition |
ITI-Gen: Inclusive | text | -to-Image Generation |
JECL: Joint Embedding and Cluster Learning for Image- | text | Pairs |
Joint architecture and knowledge distillation in CNN for Chinese | text | recognition |
Joint embeddings with multimodal cues for video- | text | retrieval |
Joint Handwritten | text | Recognition and Word Classification for Tabular Information Extraction |
Joint Image- | text | News Topic Detection and Tracking by Multimodal Topic And-Or Graph |
Joint Inference of Objects and Scenes With Efficient Learning of | text | -Object-Scene Relations |
Joint representation learning for | text | and 3D point cloud |
Joint stroke classification and | text | line grouping in online handwritten documents with edge pooling attention networks |
Joint Token and Feature Alignment Framework for | text | -Based Person Search |
Joint Video and | text | Parsing for Understanding Events and Answering Queries |
Joint Visual Semantic Reasoning: Multi-Stage Decoder for | text | Recognition |
JPEG2000 Compatible Watermarking of | text | in Images |
Kanji recognition in scene images without detection of | text | fields: robust against variation of viewpoint, contrast, and background texture |
kernel trick for sequences applied to | text | -independent speaker verification systems, A |
Kernel-Based Mixture Mapping for Image and | text | Association |
Keystroke Biometric Recognition Studies on Long- | text | Input under Ideal and Application-Oriented Conditions |
Keyword spotting in handwritten documents based on a generic | text | line HMM and a SVM verification |
Keyword Spotting in Online Handwritten Documents Containing | text | and Non-text Using BLSTM Neural Networks |
KHATT: An open Arabic offline handwritten | text | database |
KHATT: Arabic Offline Handwritten | text | Database |
Knowledge Mining with Scene | text | for Fine-Grained Recognition |
Knowledge-Driven Generative Adversarial Network for | text | -to-Image Synthesis |
KOHTD: Kazakh offline handwritten | text | dataset |
KT-GAN: Knowledge-Transfer Generative Adversarial Network for | text | -to-Image Synthesis |
K | text | : Arbitrary shape text detection using modified K-Means |
L-Verse: Bidirectional Generation Between Image and | text | |
Label embedding for | text | recognition |
Label Embedding: A Frugal Baseline for | text | Recognition |
Label Incorporated Graph Neural Networks for | text | Classification |
Label or Message: A Large-Scale Experimental Survey of | text | s and Objects Co-Occurrence |
LAM Dataset: A Novel Benchmark for Line-Level Handwritten | text | Recognition, The |
Language Adaptive Methodology for Handwritten | text | Line Segmentation |
Language Identification for Printed | text | Independent of Segmentation |
Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene | text | Detection and Spotting |
language model using variable length tokens for open-vocabulary Hangul | text | recognition, A |
Language-Independent | text | Lines Extraction Using Seam Carving |
Language-Independent | text | -Line Extraction Algorithm for Handwritten Documents |
Laplacian Approach to Multi-Oriented | text | Detection in Video, A |
Laplacian Method for Video | text | Detection, A |
Large scalability in document image matching using | text | retrieval |
Large Scale Scene | text | Verification with Guided Attention |
Large-Lexicon Attribute-Consistent | text | Recognition in Natural Images |
LASP: | text | -to-Text Optimization for Language-Aware Soft Prompting of Vision and Language Models |
LaTeRF: Label and | text | Driven Object Radiance Fields |
LaTr: Layout-Aware Transformer for Scene- | text | VQA |
layered method for determining manga | text | bubble reading order, A |
Layout and language: exploring | text | block discovery in tables using linguistic resources |
Layout-Bridging | text | -to-Image Synthesis |
LD-ZNet: A Latent Diffusion Approach for | text | -Based Image Segmentation |
Learn to Augment: Joint Data Augmentation and Network Optimization for | text | Recognition |
Learning a Limited | text | Space for Cross-Media Retrieval |
Learning Aligned Image- | text | Representations Using Graph Attentive Relational Network |
Learning analytics system for assessing students' performance quality and | text | mining in online communication |
Learning and Integrating Multi-Level Matching Features for Image- | text | Retrieval |
Learning bottom-up | text | attention maps for text detection using stroke width transform |
Learning by Imagination: A Joint Framework for | text | -Based Image Manipulation and Change Captioning |
Learning CLIP Guided Visual- | text | Fusion Transformer for Video-based Pedestrian Attribute Recognition |
Learning Coarse-to-Fine Graph Neural Networks for Video- | text | Retrieval |
Learning confidence transformation for handwritten Chinese | text | recognition |
Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using | text | and Sketch |
Learning Deep Structure-Preserving Image- | text | Embeddings |
Learning Dual Semantic Relations With Graph Attention for Image- | text | Matching |
Learning From Short | text | Streams With Topic Drifts |
Learning from Video and | text | via Large-Scale Discriminative Clustering |
Learning Generative Structure Prior for Blind | text | Image Super-resolution |
Learning Linguistic Association Towards Efficient | text | -Video Retrieval |
Learning Markov Clustering Networks for Scene | text | Detection |
Learning multi-view embedding in joint space for bidirectional image- | text | retrieval |
Learning Relationship-Enhanced Semantic Graph for Fine-Grained Image- | text | Matching |
Learning Semantic Relationship among Instances for Image- | text | Matching |
Learning Semantic | text | Features for Web Text-Aided Image Classification |
Learning Shape-Aware Embedding for Scene | text | Detection |
Learning Spatial-Semantic Con | text | with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition |
Learning Spatially-Variable Filters for Super-Resolution of | text | |
Learning | text | -Line Segmentation Using Codebooks and Graph Partitioning |
Learning the Lexicon from raw | text | s for open-vocabulary Korean word recognition |
Learning to Detect Scene | text | Using a Higher-Order MRF with Belief Propagation |
Learning to detect, localize and recognize many | text | objects in document images from few examples |
Learning to Embed Semantic Similarity for Joint Image- | text | Retrieval |
Learning to Generate Semantic Layouts for Higher | text | -Image Correspondence in Text-to-Image Synthesis |
Learning to Generate | text | -Grounded Mask for Open-World Semantic Segmentation from Only Image-Text Pairs |
Learning to Group | text | Lines and Regions in Freeform Handwritten Notes |
Learning to Read L'Infinito: Handwritten | text | Recognition with Synthetic Training Data |
Learning to Sort Handwritten | text | Lines in Reading Order through Estimated Binary Order Relations |
Learning to summarize web image and | text | mutually |
Learning to Super-Resolve Blurry Face and | text | Images |
Learning transferable features in meta-learning for few-shot | text | classification |
Learning Two-Branch Neural Networks for Image- | text | Matching Tasks |
Learning Visual Compound Models from Parallel Image- | text | Datasets |
Lecture Video Enhancement and Editing by Integrating Posture, Gesture, and | text | |
Length Normalization in Degraded | text | Collections |
Length-sensitive Language-bound Recognition Network for Multilingual | text | Recognition, A |
Leveraging Smart Devices for Scene | text | Preserved Image Stylization: A Deep Gaming Approach |
Leveraging Style and Content features for | text | Conditioned Image Retrieval |
Leveraging surrounding con | text | for scene text detection |
Leveraging the Mixed- | text | Segmentation Problem to Design Secure Handwritten CAPTCHAs |
Lexicon based feature extraction for emotion | text | classification |
Lexicon Generation for Emotion Detection from | text | |
Lexicon-based offline recognition of Amharic words in unconstrained handwritten | text | |
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image- | text | Sparse Retrieval |
Lifelong Learning for | text | Steganalysis Based on Chronological Task Sequence |
Light Weight | text | Extraction Technique for Hand-Held Device, A |
light-weight | text | image processing method for handheld embedded cameras, A |
Lightweight Attentional Feature Fusion: A New Baseline for | text | -to-Video Retrieval |
Lightweight dynamic conditional GAN with pyramid attention for | text | -to-image synthesis |
Lightweight | text | -Driven Image Editing With Disentangled Content and Attributes |
LIMITR: Leveraging Local Information for Medical Image- | text | Representation |
Line Segmentation for Grayscale | text | Images of Khmer Palm Leaf Manuscripts |
Line-Direction-Free and Character-Orientation-Free On-Line Handwritten Japanese | text | Recognition System, A |
Linecounter: Learning Handwritten | text | Line Segmentation By Counting |
Linguistic integration information in the aabatas arabic | text | analysis system |
Link the Head to the Beak: Zero Shot Learning from Noisy | text | Description at Part Precision |
Linking Image and | text | with 2-Way Nets |
Linking | text | and visual concepts semantically for cross modal multimedia search |
LISTER: Neighbor Decoding for Length-Insensitive Scene | text | Recognition |
LiT: Zero-Shot Transfer with Locked-Image | text | Tuning |
Local Binary Pattern-Based Features for | text | Identification of Web Images |
Local Gradient Difference Features for Classification of 2D-3D Natural Scene | text | Images |
Local Skew Angle Estimation from Background Space in | text | Regions |
Local variance image-based for scene | text | binarization under illumination effects |
Local-Global Video- | text | Interactions for Temporal Grounding |
Localization, extraction and recognition of | text | in Telugu document images |
localization/verification scheme for finding | text | in images and video frames based on contrast independent features and machine learning methods, A |
Localize, Group, and Select: Boosting | text | -VQA by Scene Text Modeling |
Localizing and segmenting | text | in images and videos |
Localizing blurry and low-resolution | text | in natural images |
Localizing Object-level Shape Variations with | text | -to-Image Diffusion Models |
Localizing scene | text | s by fuzzy inference systems and low rank matrix recovery model |
Localizing | text | in Scene Images by Boundary Clustering, Stroke Segmentation, and String Fragment Classification |
Locating | text | in Color Documents |
Locating | text | in Complex Color Images |
Locating | text | in Images Based on the Smooth Gray-Level Detection |
Locating | text | in images using matched wavelets |
Locating Uniform-colored | text | in Video Frames |
LocVTP: Video- | text | Pre-training for Temporal Localization |
LODENet: A Holistic Approach to Offline Handwritten Chinese and Japanese | text | Line Recognition |
LoGoPrompt: Synthetic | text | Images Can Be Good Visual Prompts for Vision-Language Models |
Longest Common Subsequence Algorithm Suitable for Similar | text | Strings, A |
Look More Than Once: An Accurate Detector for | text | of Arbitrary Shapes |
Looking at Words and Points with Attention: A Benchmark for | text | -to-Shape Coherence |
Looking from a Higher-level Perspective: Attention and Recognition Enhanced Multi-scale Scene | text | Segmentation |
lossy/lossless compression method for printed typeset bi-level | text | images based on improved pattern matching, A |
Low Complexity Sign Detection and | text | Localization Method for Mobile Applications, A |
L_0-Regularized Intensity and Gradient Prior for Deblurring | text | Images and Beyond |
M3TTS: Multi-modal | text | -to-speech of multi-scale style control for dubbing |
MA-CRNN: a multi-scale attention CRNN for Chinese | text | line recognition in natural scenes |
Machine Learning Approach to Hypothesis Decoding in Scene | text | Recognition, A |
Machine printed | text | and handwriting identification in noisy document images |
Machine reading of camera-held low quality | text | images: An ICA-based image enhancement approach for improving OCR accuracy |
Machine reading of handwritten | text | information in field technician's maps |
Machine recognition and correction of printed Arabic | text | |
Machine Recognition of Multi Font Printed Arabic | text | s |
Machine Recognition of Optically Captured Machine Printed Arabic | text | |
Machine Recognition of Printed Kannada | text | |
Machine-printed and hand-written | text | lines identification |
Magic3D: High-Resolution | text | -to-3D Content Creation |
MagicFusion: Boosting | text | -to-Image Generation Performance by Fusing Diffusion Models |
Major Components of a Complete | text | Reading System |
Make It Move: Controllable Image-to-Video Generation with | text | Descriptions |
Make-A-Scene: Scene-Based | text | -to-Image Generation with Human Priors |
Make-An-Animation: Large-Scale | text | -conditional 3D Human Motion Generation |
Making the Most of | text | Semantics to Improve Biomedical Vision-Language Processing |
Making the V in | text | -VQA Matter |
Mandarin | text | -to-Speech Front-End With Lightweight Distilled Convolution Network |
Manga | text | Detection with Manga-specific Data Augmentation and Its Applications on Emotion Analysis |
ManiGAN: | text | -Guided Image Manipulation |
ManiTrans: Entity-Level | text | -Guided Image Manipulation via Token-wise Semantic Alignment and Generation |
Many Hands Make Light Work: Transferring Knowledge from Auxiliary Tasks for Video- | text | Retrieval |
Marking | text | Documents |
Marking | text | features of document images to deter illicit dissemination |
Markov Model Order Optimization for | text | Recognition |
Markov Random Field Based | text | Identification from Annotated Machine Printed Documents |
Markovian Engine for | text | Recognition: Cursive Arabic Text, Statistical Features and Interconnected HMMs, A |
Mask R-CNN With Pyramid Attention Network for Scene | text | Detection |
Mask | text | spotter v3: Segmentation Proposal Network for Robust Scene Text Spotting |
Mask | text | Spotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes |
MASTER: Multi-aspect non-local network for scene | text | recognition |
Mathematical properties of the native integral ratio handwriting and | text | extraction technique |
Maximum Likelihood Discriminant Feature for | text | -Independent Speaker Verification |
Maximum Margin Approach to Learning | text | Classifiers Methods, Theory and Algorithms, The |
Maximum Spanning Trees For | text | Segmentation |
maximum-likelihood approach to segmentation-based recognition of unconstrained handwriting | text | , A |
MEAN: Multi-Element Attention Network for Scene | text | Recognition |
Medical-Image Retrieval Based on Knowledge-Assisted | text | and Image Indexing |
Memorize, Associate and Match: Embedding Enhancement via Fine-Grained Alignment for Image- | text | Retrieval |
Memory-Efficient Models for Scene | text | Recognition via Neural Architecture Search |
Met-MLTS: Leveraging Smartphones for End-to-End Spotting of Multilingual Oriented Scene | text | s and Traffic Signs in Adverse Meteorological Conditions |
MetaHTR: Towards Writer-Adaptive Handwritten | text | Recognition |
Method and apparatus for detecting running | text | in an image |
Method and apparatus for the resolution enhancement of gray-scale images that include | text | and line art |
Method and system for recognizing a boundary between characters in handwritten | text | |
Method for automatic recognition of white blocks as well as | text | , graphics and/or gray image areas on a printed master |
method for detecting | text | of arbitrary shapes in natural scenes that improves text spotting, A |
method for discovering knowledge in | text | s, A |
Method for Extracting | text | from Stone Inscriptions Using Character Spotting, A |
Method for identification and compression of facsimile symbols in | text | processing systems |
Method for identifying word bounding boxes in | text | |
Method for Semantic Relatedness Based Query Focused | text | Summarization, A |
Method for | text | Localization and Recognition in Real-World Images, A |
Method for unconstrained | text | detection in natural scene image |
method for variable quantization in JPEG for improved | text | quality in compound documents, A |
method of N-grams in large-scale clustering of DNA | text | s, The |
Method of separating | text | and graphs in digital image data |
Methods for | text | segmentation from scene images |
Metric Learning for | text | Documents |
MF-GAN: Multi-conditional Fusion Generative Adversarial Network for | text | -to-Image Synthesis |
MFECLIP: CLIP With Mapping-Fusion Embedding for | text | -Guided Image Editing |
MILES: Visual BERT Pre-training with Injected Language Semantics for Video- | text | Retrieval |
Minimal Interaction Touchless | text | Input with Head Movements and Stereo Vision |
Minimum Error Rate Training for PHMM-Based | text | Recognition |
Minimum Risk Training for Handwritten Chinese/Japanese | text | Recognition Using Semi-Markov Conditional Random Fields |
Minimum-risk training for semi-Markov conditional random fields with application to handwritten Chinese/Japanese | text | recognition |
Mining conversational | text | for procedures with applications in contact centers |
Mining False Positive Examples for | text | -Based Person Re-Identification |
Mining the displacement of max-pooling for | text | recognition |
MirrorGAN: Learning | text | -To-Image Generation by Redescription |
(Mis?-) Using DRT for Generation of Natural Language | text | from Image Sequences |
MISL: Multi-grained image- | text | semantic learning for text-guided image inpainting |
Mita: An Information Extraction Approach to the Analysis of Free-Form | text | in Life-Insurance Applications |
Mixed-Supervised Scene | text | Detection With Expectation-Maximization Algorithm |
Mobile visual search on printed documents using | text | and low bit-rate features |
Modality Disentangled Discriminator for | text | -to-Image Synthesis |
Model and Data Integrated Transfer Learning for Unstructured Map | text | Detection |
Model Based | text | Line Segmentation Method for Off-line Handwritten Documents, A |
model for detecting and merging vertically spanned table cells in plain | text | documents, A |
Model of On-line Handwritten Japanese | text | Recognition Free from Line Direction and Writing Format Constraints, A |
model-based approach to offline | text | -independent Arabic writer identification and verification, A |
Model-Based System Specification With Tesperanto: Readable | text | From Formal Graphics |
Modeling Motion with Multi-Modal Features for | text | -Based Video Segmentation |
Modeling of image, video and | text | fusion quality data packet system for aerospace complex products based on business intelligence |
Modeling Stroke Mask for End-to-End | text | Erasing |
Modern vs Diplomatic Transcripts for Historical Handwritten | text | Recognition |
Moment-Based Image Normalization for Handwritten | text | Recognition |
Mono-font Cursive Arabic | text | Recognition Using Speech Recognition System |
Morality Classification in Natural Language | text | |
MORAN: A Multi-Object Rectified Attention Network for scene | text | recognition |
More Grounded Image Captioning by Distilling Image- | text | Matching Model |
More Than Just Attention: Improving Cross-Modal Attentions with Contrastive Constraints for Image- | text | Matching |
More than Words: In-the-Wild Visually-Driven Prosody for | text | -to-Speech |
Morphological Approach for | text | -Line Segmentation in Handwritten Documents, A |
Morphological Approach to | text | String Extraction from Regular Periodic Overlapping Text-Background Images, A |
Morphological | text | Extraction from Images |
Morphology-based hierarchical representation with application to | text | segmentation in natural images |
Morphology-based | text | line extraction |
Morph | text | : Deep Morphology Regularized Accurate Arbitrary-Shape Scene Text Detection |
Mosaicing-by-recognition for video-based | text | recognition |
Mosaicing-by-recognition: a technique for video-based | text | recognition |
MOST: A Multi-Oriented Scene | text | Detector with Localization Refinement |
Moto: Enhancing Embedding with Multiple Joint Factors for Chinese | text | Classification |
Movie fill in the blank by joint learning from video and | text | with adaptive temporal attention |
Movie/Script: Alignment and Parsing of Video and | text | Transcription |
MPEG-7 Video | text | Description Scheme for Superimposed Text in Images and Video |
MRF based | text | binarization in complex images using stroke feature |
MRF Model for Binarization of Natural Scene | text | , An |
MRN: Multiplexed Routing Network for Incremental Multilingual | text | Recognition |
MRP-GAN: Multi-resolution parallel generative adversarial networks for | text | -to-image synthesis |
MSCap: Multi-Style Image Captioning With Unpaired Stylized | text | |
MSER-Based Real-Time | text | Detection and Tracking |
MSR-Video to | text | dataset with clean annotations, The |
MTRNet++: One-stage mask-based scene | text | eraser |
MUGEN: A Playground for Video-Audio- | text | Multimodal Understanding and GENeration |
Multi language | text | detection using fast stroke width transform |
Multi scale mirror connection based encoder decoder network for | text | localization |
Multi-branch Network with Ensemble Learning for | text | Removal in the Wild |
Multi-Concept Customization of | text | -to-Image Diffusion |
Multi-dimensional long short-term memory networks for artificial Arabic | text | recognition in news video |
Multi-event Video- | text | Retrieval |
Multi-fractal Modeling for On-line | text | -Independent Writer Identification |
Multi-Granularity Aggregation Transformer for Joint Video-Audio- | text | Representation Learning |
Multi-Granularity Matching Transformer for | text | -Based Person Search |
Multi-granularity Prediction for Scene | text | Recognition |
Multi-head Self-relation Network for Scene | text | Recognition, A |
Multi-label | text | Classification Approach for Sentence Level News Emotion Analysis |
Multi-lingual scene | text | detection and language identification |
Multi-lingual | text | recognition from video frames |
Multi-modal Con | text | ual Graph Neural Network for Text Visual Question Answering |
Multi-Modal Fusion Network for Rumor Detection with | text | s and Images |
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene | text | |
Multi-Modal Reasoning Graph for Scene- | text | Based Fine-Grained Image Classification and Retrieval |
Multi-Modal Representation Learning with | text | -Driven Soft Masks |
Multi-modal | text | Recognition Networks: Interactive Enhancements Between Visual and Semantic Features |
Multi-Modal Topic Model for Image Annotation Using | text | Analysis, A |
Multi-Orientation Scene | text | Detection with Adaptive Clustering |
Multi-orientation scene | text | detection with multi-information fusion |
Multi-Oriented and Multi-Lingual Scene | text | Detection With Direct Regression |
Multi-oriented Bangla and Devnagari | text | recognition |
Multi-oriented English | text | Line Identification |
Multi-oriented Scene | text | Detection via Corner Localization and Region Segmentation |
Multi-oriented | text | detection from natural scene images based on a CNN and pruning non-adjacent graph edges |
Multi-oriented | text | Detection with Fully Convolutional Networks |
Multi-Oriented | text | Extraction in Stylistic Documents |
Multi-oriented touching | text | character segmentation in graphical documents using dynamic programming |
Multi-phase recognition of multifont photoscript Arabic | text | |
multi-plane approach for | text | segmentation of complex document images, A |
Multi-polarity | text | segmentation using graph theory |
Multi-resolution form of SVD for | text | -independent speaker recognition |
Multi-scale sequential network for semantic | text | segmentation and localization |
Multi-scale | text | Line Segmentation Method in Freestyle Handwritten Documents, A |
Multi-scale video | text | detection based on corner and stroke width verification |
multi-scenario | text | generation method based on meta reinforcement learning, A |
Multi-script and Multi-oriented | text | Localization from Scene Images |
Multi-script iterative steerable directional filtering for handwritten | text | line extraction |
Multi-script | text | Extraction from Natural Scenes |
Multi-script | text | versus non-text classification of regions in scene images |
Multi-Script-Oriented | text | Detection and Recognition in Video/Scene/Born Digital Images |
Multi-sensor | text | classification experiments: A comparison |
Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained | text | -to-Image Synthesis |
Multi-Spectral Fusion Based Approach for Arbitrarily Oriented Scene | text | Detection in Video Images |
Multi-stage HMM based Arabic | text | recognition with rescoring |
Multi-strategy tracking based | text | detection in scene videos |
Multi3DRefer: Grounding | text | Description to Multiple 3D Objects |
Multifractal Characterization of | text | s for Pattern Recognition: On the Complexity of Morphological Structures in Modern and Ancient Languages |
Multigap: Multi-pooled inception network with | text | augmentation for aesthetic prediction of photographs |
Multilabel | text | Classification With Incomplete Labels: A Safe Generative Model With Label Manifold Regularization and Confidence Constraint |
Multilateral Semantic Relations Modeling for Image | text | Retrieval |
Multilevel | text | -Line Segmentation Framework for Handwritten Historical Documents, A |
Multilingual Artificial | text | Detection Using a Cascade of Transforms |
Multimodal grid features and cell pointers for scene | text | visual question answering |
Multimodal interactive transcription of | text | images |
Multimodal Meme Classification Identifying Offensive Content in Image and | text | |
Multimodal Neurons in Pretrained | text | -Only Transformers |
Multimodal Processing and Interaction: Audio, Video, | text | |
Multimodal Sentiment Analysis With Image- | text | Interaction Network |
Multimodal Topic Modeling by Exploring Characteristics of Short | text | Social Media |
Multioriented and Curved | text | Lines Extraction From Indian Documents |
Multioriented Video Scene | text | Detection Through Bayesian Classification and Boundary Growing |
multiple agent architecture for handwritten | text | recognition, A |
Multiple attention encoded cascade R-CNN for scene | text | detection |
Multiple Classifier Approach for the Recognition of Screen-Rendered | text | , A |
Multiple Document Datasets Pre-training Improves | text | Line Detection With Deep Neural Networks |
Multiple Geometry Transform Estimation from Single Camera-Captured | text | Image |
Multiple Handwritten | text | Line Recognition Systems Derived from Specific Integration of a Language Model |
Multiple Learned Dictionaries Based Clustered Sparse Coding for the Super-Resolution of Single | text | Image |
Multiple Positives Enhanced NCE Loss for Image- | text | Retrieval, A |
Multiview | text | Imagination Network Based on Latent Alignment for Image-Text Matching, A |
MuLTReNets: Multilingual | text | recognition networks for simultaneous script identification and handwriting recognition |
MUST-VQA: Multilingual Scene- | text | VQA |
Mutually Guided Dual-Task Network for Scene | text | Detection |
N-Gram-Based | text | Categorization |
Name your style: | text | -guided artistic style transfer |
Natural Language Watermarking Using Semantic Substitution for Chinese | text | |
Natural scene | text | detection based on SWT, MSER and candidate classification |
Natural Scene | text | Detection with Multi-channel Connected Component Segmentation |
Natural scene | text | detection with multi-layer segmentation and higher order conditional random field based analysis |
Negative-Aware Attention Framework for Image- | text | Matching |
NEOCR: A Configurable Dataset for Natural Image | text | Recognition |
neural model for | text | localization, transcription and named entity recognition in full pages, A |
Neural network-based prediction of the stopping moment for | text | recognition in a video stream |
Neural Network-based | text | Location for News Video Indexing |
Neural network-based | text | location in color images |
Neuron-Based Spiking Transmission and Reasoning Network for Robust Image- | text | Retrieval |
New Approach Based on | text | ure and Geometric Features for Text Detection |
New Approach for Overlay | text | Detection and Extraction From Complex Video Scene, A |
new approach for | text | -independent speaker recognition, A |
new approach for video | text | detection, A |
New Approach towards | text | Filtering, A |
New Arabic Printed | text | Image Database and Evaluation Protocols, A |
New Binarization Approach Based on | text | Block Extraction |
New Block Partitioned | text | Feature for Text Verification, A |
new deep CNN for 3D | text | localization in the wild through shadow removal, A |
New Deep Wavefront Based Model for | text | Localization in 3D Video, A |
new edge-based | text | verification approach for video, A |
New Fourier-Statistical Features in RGB Space for Video | text | Detection |
New Fuzzy Hierarchical Classification Based on SVM for | text | Categorization, A |
New Gradient Based Character Segmentation Method for Video | text | Recognition, A |
new hybrid method to detect | text | in natural scene, A |
new instrumented approach for translating American Sign Language into sound and | text | , A |
New Language-Independent Deep CNN for Scene | text | Detection and Style Transfer in Social Media Images, A |
New Method for Arabic | text | Detection in Natural Scene Image Based on the Color Homogeneity, A |
New Method for Arabic | text | Detection in Natural Scene Images, A |
new method for detection and prediction of occluded | text | in natural scene images, A |
New Method for Handwritten Scene | text | Detection in Video, A |
new method for multi-oriented graphics-scene-3D | text | classification in video, A |
New Method for | text | Verification Based on Random Forests, A |
New Method for | text | -Line Segmentation for Warped Documents, A |
New Method for Word Segmentation from Arbitrarily-Oriented Video | text | Lines, A |
New Method for Writer Identification and Verification Based on Farsi/Arabic Handwritten | text | s, A |
new multi-modal approach to bib number/ | text | detection and recognition in Marathon images, A |
New Nearest Neighbor Rule for | text | Categorization, A |
new robust algorithm for video | text | extraction, A |
new scheme for unconstrained handwritten | text | -line segmentation, A |
new segmentation technique for omnifont Farsi | text | , A |
new segmentation technique of Arabic | text | , A |
New Smoothing Method for Lexicon-Based Handwritten | text | Keyword Spotting, A |
New Strategy for Reducing Errors in Scene | text | Detection, A |
new structural technique for recognizing printed Arabic | text | , A |
New Symmetry Based on Proximity of Wavelet-Moments for | text | Frame Classification in Video, A |
New Technique for Multi-Oriented Scene | text | Line Detection and Tracking in Video, A |
New | text | Extraction Method Incorporating Local Information, A |
New | text | -Line Alignment Approach Based on Piece-Wise Painting Algorithm for Handwritten Documents, A |
New Type of Feature: Loose N-Gram Feature in | text | Categorization, A |
new unified method for detecting | text | from marathon runners and sports players in video (PR-D-19-01078R2), A |
New Video Images | text | Localization Approach Based on a Fast Hough Transform, A |
New Wavelet and Color Features for | text | Detection in Video |
new wavelet-Laplacian method for arbitrarily-oriented character segmentation in video | text | lines, A |
News2meme: An Automatic Content Generator from News Based on Word Subspaces from | text | and Image |
Noise-aware Learning from Web-crawled Image- | text | Data for Image Captioning |
Noisy | text | Categorization |
Non-Local | text | Image Reconstruction |
Non-negative Sparse Semantic Coding for | text | categorization |
non-stationary density model to separate overlapped | text | s in degraded documents, A |
Non-Uniform Slant Correction for Handwritten | text | Line Recognition |
Novel Algorithm for | text | Detection and Localization in Natural Scene Images, A |
novel automated depression detection technique using | text | transcript, A |
novel binarization approach for | text | in images, A |
Novel Data Independent Approach for Conversion of Hand Punched Kannada Braille Script to | text | and Speech, A |
Novel Data Representation for | text | Extraction from Multispectral Historical Document Images |
Novel Edge Features for | text | Frame Classification in Video |
Novel Fuzzy Logic-Based | text | Classification Method for Tracking Rare Events on Twitter, A |
Novel Illumination-Balance Technique for Improving the Quality of Degraded | text | -Photo Images, A |
Novel Integrated Framework for Learning both | text | Detection and Recognition, A |
Novel Method for Embedded | text | Segmentation Based on Stroke and Color, A |
novel method for straightening curved | text | -lines in stylistic documents, A |
novel method of | text | line segmentation for historical document image of the uchen Tibetan, A |
Novel Multi-oriented Chinese | text | Extraction Approach from Videos, A |
novel mutual nearest neighbor based symmetry for | text | frame classification in video, A |
novel scene | text | detection algorithm based on convolutional neural network, A |
Novel Sub-character HMM Models for Arabic | text | Recognition |
Novel System for Robust | text | Location and Recognition of Book Covers, A |
Novel | text | Detection System Based on Character and Link Energies, A |
novel | text | structure feature extractor for Chinese scene text detection and recognition, A |
Novel | text | -Independent Speaker Verification System Using Ant Colony Optimization Algorithm, A |
novel triangulation procedure for thinning hand-written | text | , A |
novel two-stage algorithm for baseline estimation and correction in Farsi and Arabic handwritten | text | line, A |
Novel Visual Representation on | text | Using Diverse Conditional GAN for Visual Recognition, A |
Novice and Expert Performance of KeyScretch: A Gesture-Based | text | Entry Method for Touch-Screens |
Null- | text | Inversion for Editing Real Images using Guided Diffusion Models |
Object proposals for | text | extraction in the wild |
Object Reading: | text | Recognition for Object Recognition |
Object-Driven | text | -To-Image Synthesis via Adversarial Training |
Objective Distortion Measure for Binary | text | Image Based on Edge Line Segment Similarity |
Objective Function Design for MCE-Based Combination of On-line and Off-line Character Recognizers for On-line Handwritten Japanese | text | Recognition |
Occluded | text | Detection and Recognition in the Wild |
OCR and Voting Shell Fulfilling Specific | text | Analysis Requirements |
OCR of Printed Telugu | text | with High Recognition Accuracies |
OCR Pipeline and Semantic | text | Analysis for Comics, An |
OCR-VQGAN: Taming | text | -within-Image Generation |
OCRSpell: An Interactive Spelling Correction System for OCR Errors in | text | |
Off-line Chinese Writer Retrieval System Based on | text | -sensitive Writer Identification, An |
Offline arabic handwritten | text | recognition: A Survey |
Offline handwritten Arabic cursive | text | recognition using Hidden Markov Models and re-ranking |
Offline Recognition of Large Vocabulary Cursive Handwritten | text | |
Offline recognition of omnifont Arabic | text | using the HMM ToolKit (HTK) |
Offline Recognition of Unconstrained Handwritten | text | s Using HMMs and Statistical Language Models |
Offline | text | -independent writer identification using codebook and efficient code extraction methods |
Old fashion | text | -based image retrieval using FCA |
Omnifont recognition of | text | using topological recognition techniques |
On appearance-based feature extraction methods for writer-independent handwritten | text | recognition |
On Calibration of Scene- | text | Recognition Models |
On Combining Multiple Segmentations in Scene | text | Recognition |
On optimal stopping strategies for | text | recognition in a video stream as an application of a monotone sequential decision model |
On partitioning a dictionary for visual | text | recognition |
On Recognizing | text | s of Arbitrary Shapes with 2D Self-Attention |
On the discriminability of keystroke feature vectors used in fixed | text | keystroke authentication |
On the Evaluation of Handwritten | text | Line Detection Algorithms |
On the General Value of Evidence, and Bilingual Scene- | text | Visual Question Answering |
On the influence of vocabulary size and language models in unconstrained handwritten | text | recognition |
On the Modification of Binarization Algorithms to Retain Grayscale Information for Handwritten | text | Recognition |
On the Processing of Fuzzy Patterns for | text | Independent Phonetic Speech Segmentation |
On the Segmentation of | text | in Videos |
On the use of Bernoulli mixture models for | text | classification |
On the use of duration-corrected N-best hypotheses for | text | recognition in gray-scale document images |
On Vocabulary Reliance in Scene | text | Recognition |
On-Device | text | Image Super Resolution |
On-Line Handwritten Japanese | text | Recognition Free from Constrains on Line Direction and Character Orientation |
On-line Handwritten Japanese | text | Recognition System Free from Line Direction and Character Orientation Constraints, An |
On-Line Handwritten | text | Line Detection Using Dynamic Programming |
On-line Handwritten | text | Search Method Based on Directional Feature Matching, An |
On-line recognition of handwritten Renqun shorthand for fast mobile Chinese | text | entry |
On-line Writing-box-free Recognition of Handwritten Japanese | text | Considering Character Size Variations |
One-shot Compositional Data Generation for Low Resource Handwritten | text | Recognition |
One-Shot Doc Snippet Detection: Powering Search in Document Beyond | text | |
Online Biterm Topic Model based short | text | stream classification using short text expansion and concept drifting detection |
Online | text | -Independent Writer Identification Based on Stroke's Probability Distribution Function |
Online | text | -independent Writer Identification Based on Temporal Sequence and Shape Codes |
Ontology-Based | text | Mining Method to Develop D-Matrix From Unstructured Text, An |
Opaque Document Imaging: Building Images of Inaccessible | text | s |
Open set classification of untranscribed handwritten | text | image documents |
Open-Set | text | Recognition via Character-Context Decoupling |
Open-Vocabulary Panoptic Segmentation with | text | -to-Image Diffusion Models |
Open-vocabulary recognition of machine-printed Arabic | text | using hidden Markov models |
Opinion mining from noisy | text | data |
OPMP: An Omnidirectional Pyramid Mask Proposal Network for Arbitrary-Shape Scene | text | Detection |
Optical character correction of large-curvature annular sector | text | in polar coordinate system |
Optical flow based dynamic curved video | text | detection |
Optical modelling and language modelling trade-off for Handwritten | text | Recognition |
Optimal Boxes: Boosting End-to-End Scene | text | Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning |
Optimal | text | /Background Color Combination of LED Information Boards for Visibility Improvement Based on Psychological Measurements, An |
Optimizing the class information divergence for transductive classification of | text | s using propagation in bipartite graphs |
Optimizing the integration of a statistical language model in HMM based offline handwritten | text | recognition |
Orientation and Scale Invariant | text | Region Extraction in WWW Images |
Orientation Robust | text | Line Detection in Natural Images |
OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page | text | Recognition by learning to unfold |
Oscillating Feature Subset Search Algorithm for | text | Categorization |
Out of vocabulary word detection and recovery in Arabic handwritten | text | recognition |
Outline Generation Transformer for Bilingual Scene | text | Recognition |
PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese | text | Recognition |
Pair-Copula Based Scheme for | text | Extraction from Digital Images, A |
Pairwise optimized Rocchio algorithm for | text | categorization |
PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped | text | |
Paragraph | text | segmentation into lines with Recurrent Neural Networks |
Parametric Spectral-Based Method for Verification of | text | in Videos, A |
Part-based method on handwritten | text | s |
Parts2Words: Learning Joint Embedding of Point Clouds and | text | s by Bidirectional Matching Between Parts and Words |
Pay attention to what you read: Non-recurrent handwritten | text | -Line recognition |
Pen Acoustic Emissions for | text | and Gesture Recognition |
Perceptive Vision for Headline Localisation in Bangla Handwritten | text | Recognition |
Performance Analysis of | text | Halftone Modulation |
Performance Evaluation of | text | Detection and Tracking in Video |
Person Identification Using | text | and Image Data |
Person Search by | text | Attribute Query As Zero-Shot Learning |
Personalized | text | snippet extraction using statistical language models |
Perspective Scene | text | Recognition with Feature Compression and Ranking |
PETR: Rethinking the Capability of Transformer-Based Language Model in Scene | text | Recognition |
PFAN++: Bi-Directional Image- | text | Retrieval With Position Focused Attention Network |
Photographic | text | -to-Image Synthesis with a Hierarchically-Nested Adversarial Network |
PhotoOCR: Reading | text | in Uncontrolled Conditions |
Picture and | text | Query and Archiving System, A |
PIDRo: Parallel Isomeric Attention with Dynamic Routing for | text | -Video Retrieval |
Piece-wise linearity based method for | text | frame classification in video |
Pitch Based Segmentation and Recognition of Dot-Matrix | text | |
Pitman Shorthand inspired model for plain | text | compression |
Pixel-Based Evaluation Method for | text | Detection in Color Images, A |
Pix | text | GAN: structure aware text image synthesis for license plate recognition |
Plda-based system for | text | -prompted password speaker verification |
Plug-and-Play Diffusion Features for | text | -Driven Image-to-Image Translation |
Plug-and-Play Regulators for Image- | text | Matching |
Plugnet: Degradation Aware Scene | text | Recognition Supervised by a Pluggable Super-resolution Unit |
PMMN: Pre-Trained Multi-Modal Network for Scene | text | Recognition |
PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved | text | -to-Image Diffusion |
PolygloNet: Multilingual Approach for Scene | text | Recognition Without Language Constraints |
Polygon-based technique for the automatic classification of | text | and graphics components from digitized paper-based forms |
Polygon-Free: Unconstrained Scene | text | Detection with Box Annotations |
pooling based scene | text | proposal technique for scene text reading in the wild, A |
Portable and fast | text | detection |
Portmanteauing Features for Scene | text | Recognition |
Position-Guided | text | Prompt for Vision-Language Pre-Training |
Predict, Prevent, and Evaluate: Disentangled | text | -Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model |
Predicting audio-visual salient events based on visual, audio and | text | modalities for movie summarization |
Predicting Emotional Responses to Long Informal | text | |
Predicting Motivations of Actions by Leveraging | text | |
Predicting Visual Features From | text | for Image and Video Caption Retrieval |
PreSTU: Pre-Training for Scene- | text | Understanding |
Primitive Representation Learning for Scene | text | Recognition |
Printed | text | Discrimination |
Printed | text | Featuring Using the Visual Criteria of Legibility and Complexity |
Printed | text | segmentation using distance transform |
Prior knowledge guided | text | to image generation |
Probabilistic Hierarchical Clustering Method for Organising Collections of | text | Documents, A |
Probabilistic Kernels for Improved | text | -to-Speech Alignment in Long Audio Tracks |
probabilistic model derived term weighting scheme for | text | classification, A |
Processing of Binary Images of Handwritten | text | Documents |
Processing of Off-Line Handwritten | text | : Polygonal-Approximation and Enforcement of Temporal Information |
Progressive Contour Regression for Arbitrary-Shape Scene | text | Detection |
Progressive scene | text | erasing with self-supervision |
Progressive Spatio-Temporal Prototype Matching for | text | -Video Retrieval |
Progressive | text | -to-Face Synthesis with Generative Adversarial Network |
Prompt Switch: Efficient CLIP Adaptation for | text | -Video Retrieval |
Prompt Tuning Inversion for | text | -Driven Image Editing Using Diffusion Models |
Proposal for a | text | -Indicated Writer Verification Method, A |
Proposal of the hybrid spectral gradient method to extract character- | text | regions from general scene images |
Pure Transformer with Integrated Experts for Scene | text | Recognition |
Pushing the Performance Limit of Scene | text | Recognizer without Human Annotation |
PYRAD-DCNN: A Fully Convolutional Neural Network to Replace BLSTM in Offline | text | Recognition Systems |
Pyrboxes: An efficient multi-scale scene | text | detector with feature pyramids |
quad tree based method for blurred and non-blurred video | text | frames classification through quality metrics, A |
Quadrilateral Scene | text | Detector with Two-Stage Network Architecture, A |
Quality inspection of printed | text | s |
Quality-related English | text | classification based on recurrent neural network |
QWERTY- and 8pen- Based Touchless | text | Input with Hand Movement |
R-Net: A Relationship Network for Efficient and Accurate Scene | text | Detection |
R2CNN: Rotational Region CNN for Arbitrarily-Oriented Scene | text | Detection |
Random Subspace Method in | text | Categorization |
Rapid Evaluation of the Handwriting Performance for Gesture Based | text | Input |
Re-ranking and TOPSIS-based ensemble feature selection with multi-stage aggregation for | text | categorization |
Re-ranking image- | text | matching by adaptive metric fusion |
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene | text | Recognition |
Read | text | from Signs in General Scenes |
Reading Arbitrary-shaped Scene | text | from Images Through Spline Regression and Rectification |
Reading Newspaper | text | |
Reading | text | in the Wild from Compressed Images |
Reading | text | in the Wild with Convolutional Neural Networks |
Reading-Life Log: Technologies to Recognize | text | s That We Read, The |
Reading-Strategy Inspired Visual Representation Learning for | text | -to-Video Retrieval |
ReADS: A Rectified Attentional Double Supervised Network for Scene | text | Recognition |
Real time image enhancement and segmentation for sign/ | text | detection |
Real time image enhancement for both | text | and color photo images |
Real-Time Lexicon-Free Scene | text | Localization and Recognition |
Real-time Lexicon-free Scene | text | Retrieval |
Real-Time Scene | text | Detection Based on Stroke Model |
Real-Time Scene | text | Detection With Differentiable Binarization and Adaptive Scale Fusion |
Real-time scene | text | localization and recognition |
Real-Time Scene | text | to Speech System, A |
Real-Time | text | Steganalysis Based on Multi-Stage Transfer Learning |
Real-time | text | tracking in natural scenes |
real-time | text | -independent speaker identification system, A |
Real-Time Visual Analytics for | text | Streams |
Realtime multi-scale scene | text | detection with scale-based region proposal network |
Recipe2Video: Synthesizing Personalized Videos from Recipe | text | s |
ReCo: Region-Controlled | text | -to-Image Generation |
Recognising | text | in Real Scenes |
Recognition based | text | localization from natural scene images |
Recognition of Apparent Personality Traits from | text | and Handwritten Images |
Recognition of Arabic Machine-Printed Cursive | text | |
Recognition of Bangla | text | from scene images through perspective correction |
Recognition of cursive video | text | using a deep learning framework |
Recognition of Hand-Written Archive | text | Documents |
Recognition of Handwritten Chinese | text | by Segmentation: A Segment-Annotation-Free Approach |
Recognition of Indian multi-oriented and curved | text | |
Recognition of Multi-oriented, Multi-sized, and Curved | text | |
Recognition of Noise Polyfont Printed | text | Using Combined HMMS, The |
Recognition of Pornographic Web Pages by Classifying | text | s and Images |
Recognition of printed arabic | text | based on global features and decision tree learning techniques |
Recognition of Printed Arabic | text | Using Neural Networks |
Recognition of printed Devanagari | text | using BLSTM Neural Network |
Recognition of Printed | text | under Realistic Conditions |
Recognition of Screen-Rendered | text | |
Recognition of Video | text | through Temporal Integration |
Recognition-Based Segmentation of Nom Characters from Body | text | Regions of Stele Images Using Area Voronoi Diagram |
Recognize | text | in General Scenes |
Recognizing Chinese | text | s with 3D Convolutional Neural Network |
Recognizing irregular entities in biomedical | text | via deep neural networks |
Recognizing Multiple | text | Sequences from an Image by Pure End-to-End Learning |
Recognizing perspective scene | text | with context feature |
Recognizing semantic correlation in image- | text | Weibo via feature space mapping |
Recognizing | text | Elements for SVG Comic Compression and Its Novel Applications |
Recognizing | text | in historical maps using maps from multiple time periods |
Recognizing | text | in raster maps |
Recognizing | text | with a CNN |
Recognizing | text | with Perspective Distortion in Natural Scenes |
Recognizing | text | -Based Traffic Guide Panels with Cascaded Localization Network |
Recognizing | text | -Based Traffic Signs |
Reconsidering Tourism Destination Images by Exploring Similarities between Travelogue | text | s and Photographs |
Rectification and recognition of | text | in 3-D scenes |
Rectifying Perspective Views of | text | in 3D Scenes Using Vanishing Points |
Recurrent Affine Transformation for | text | -to-Image Synthesis |
Recurrent Global Convolutional Network for Scene | text | Detection |
Recurrent Highway Networks with Attention Mechanism for Scene | text | Recognition |
Redefining the DCT-based feature for scene | text | detection: Analysis and comparison of spatial frequency-based features |
Redif Extraction in Handwritten Ottoman Literary | text | s |
Reduced annotation based on deep active learning for arabic | text | detection in natural scene images |
Referring Image Segmentation Using | text | Supervision |
Region Reinforcement Network With Topic Constraint for Image- | text | Matching |
Region-Aware Arbitrary-Shaped | text | Detection With Progressive Fusion |
Region-Based Discriminative Feature Pooling for Scene | text | Recognition |
Regularizing Visual Semantic Embedding With Contrastive Learning for Image- | text | Matching |
Reinforcement Shrink-Mask for | text | Detection |
Rejection Strategies for Offline Handwritten | text | Line Recognition |
ReLa | text | : Exploiting visual relationships for arbitrary-shaped scene text detection with graph convolutional networks |
Relation Graph Reasoning for Image- | text | Matching |
Relation Mining and Visualization Framework for Automated | text | Summarization, A |
Relation-Guided Network for Image- | text | Retrieval |
Remote Sensing Image Augmentation Based on | text | Description for Waterside Change Detection |
Report from the AND 2009 working group on noisy | text | datasets |
Representation and Recognition of | text | Using Hidden Markov Models, The |
Representation learning for very short | text | s using weighted word embedding aggregation |
Representation transfer and data cleaning in multi-views for | text | simplification |
Residual Dual Scale Scene | text | Spotting by Fusing Bottom-Up and Top-Down Processing |
ReStGAN: A step towards visually guided shopper experience via | text | -to-image synthesis |
Rethinking | text | Segmentation: A Novel Dataset and A Text-Specific Refinement Approach |
Retrieval Methods for English- | text | with Misrecognized OCR Characters |
Retrieval Strategies for Noisy | text | |
Review of Segmentation and Con | text | ual Analysis Techniques for Text Recognition, A |
Revisiting Scene | text | Recognition: A Data Perspective |
RIATIG: Reliable and Imperceptible Adversarial | text | -to-Image Generation with Natural Prompts |
Rickrolling the Artist: Injecting Backdoors into | text | Encoders for Text-to-Image Synthesis |
RiFeGAN2: Rich Feature Generation for | text | -to-Image Synthesis From Constrained Prior Knowledge |
RiFeGAN: Rich Feature Generation for | text | -to-Image Synthesis From Prior Knowledge |
RLST: A Reinforcement Learning Approach to Scene | text | Detection Refinement |
robust algorithm for | text | detection in color images, A |
Robust Algorithm for | text | String Separation from Mixed Text/Graphics Images, A |
Robust and Accurate | text | Stroke Segmentation |
robust and multiscale document image segmentation for block line/ | text | line structures extraction, A |
Robust and Non-Negative Collective Matrix Factorization for | text | -to-Image Transfer Learning |
Robust and parallel Uyghur | text | localization in complex background images |
Robust and Secure Data Hiding for PDF | text | Document |
robust approach for recognition of | text | embedded in natural scenes, A |
robust approach for | text | detection from natural scene images, A |
Robust Approach to Extraction of | text | s from Camera Captured Images, A |
robust approach to | text | line grouping in online handwritten Japanese documents, A |
Robust Binarization for Video | text | Recognition |
Robust Color-Independent | text | Detection Method from Complex Videos, A |
Robust detection of stylized | text | events in digital video |
Robust Extraction of | text | from Camera Images |
Robust Extraction of | text | in Video |
robust hybrid approach for | text | line segmentation in historical documents, A |
Robust Lexicon-Free Confidence Prediction for | text | Recognition |
Robust Local Scoring Function for | text | -Independent Speaker Verification |
Robust Model for On-Line Handwritten Japanese | text | Recognition, A |
Robust outdoor | text | detection using text intensity and shape features |
Robust Scene | text | Detection for Multi-script Languages Using Deep Learning |
Robust Scene | text | Detection for Partially Annotated Training Data |
Robust scene | text | detection using integrated feature discrimination |
Robust Scene | text | Detection with Convolution Neural Network Induced MSER Trees |
Robust Scene | text | Detection with Deep Feature Pyramid Network and CNN based NMS Model |
Robust Scene | text | Recognition with Automatic Rectification |
Robust seed-based stroke width transform for | text | detection in natural images |
Robust Segmentation Technique for Line, Word and Character Extraction from Kannada | text | in Low Resolution Display Board Images, A |
Robust skew detection in mixed | text | /graphics documents |
Robust Split-and-Merge | text | Segmentation Approach for Images, A |
Robust stereo correspondence for documents by matching connected components of | text | -lines with dynamic programming |
Robust stereo matching for document images using parameter selection of | text | -line extraction |
robust system for | text | extraction in video, A |
Robust System For Thresholding And Skew Detection In Mixed | text | /graphics Documents, A |
robust technique for | text | extraction in mixed-type binary documents, A |
Robust | text | detection from binarized document images |
Robust | text | detection in natural images with edge-enhanced Maximally Stable Extremal Regions |
Robust | text | Detection in Natural Scene Images |
Robust | text | Detection in Natural Scene Images by Generalized Color-Enhanced Contrasting Extremal Region and Neural Networks |
Robust | text | Detection with Vertically-Regressed Proposal Network |
Robust | text | Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation |
Robust | text | Line Segmentation for Historical Manuscript Images Using Color and Texture |
Robust | text | Segmentation in Low Quality Images via Adaptive Stroke Width Estimation and Stroke Based Superpixel Grouping |
Robust | text | segmentation using graph cut |
Robust Two Level Classification Algorithm for | text | Localization in Documents, A |
Robust video | text | segmentation and recognition with multiple hypotheses |
Robust Video- | text | Retrieval Via Noisy Pair Calibration |
Robust Wavelet Transform Based Technique for Video | text | Detection, A |
Robustly Recognizing Irregular Scene | text | by Rectifying Principle Irregularities |
RobustScanner: Dynamically Enhancing Positional Clues for Robust | text | Recognition |
Rolling bilateral filter-based | text | image deblurring |
Rotation and script independent | text | detection from video frames using sub pixel mapping |
Rotation-Sensitive Regression for Oriented Scene | text | Detection |
Rough-fuzzy based scene categorization for | text | detection and recognition in video |
RSCA: Real-time Segmentation-based Con | text | -Aware Scene Text Detection |
RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to- | text | Adversarial Attacks |
RUArt: A Novel | text | -Centered Solution for Text-Based Visual Question Answering |
Rule Based Con | text | ual Post-Processing for Devanagari Text Recognition |
SAC: Semantic Attention Composition for | text | -Conditioned Image Retrieval |
SAFE: Scale Aware Feature Encoder for Scene | text | Recognition |
SaHAN: Scale-Aware Hierarchical Attention Network for Scene | text | Recognition |
SAM: Self Attention Mechanism for Scene | text | Recognition Based on Swin Transformer |
Sample-aware Data Augmentor for Scene | text | Recognition |
SARAT-a system for the recognition of Arabic printed | text | |
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker | text | -to-Speech Systems |
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and | text | Data |
Scale and Orientation Invariant | text | Segmentation for Born-Digital Compound Images |
Scale robust deep oriented- | text | detection network |
Scale-aware Polar Representation for Arbitrarily-shaped | text | Detection |
Scale-Invariant Multi-Oriented | text | Detection in Wild Scene Image |
Scale-Residual Learning Network for Scene | text | Detection |
Scaling up GANs for | text | -to-Image Synthesis |
SCATTER: Selective Con | text | Attentional Scene Text Recognizer |
Scene Graph Driven | text | -Prompt Generation for Image Inpainting |
Scene Retrieval for Video Summarization Based on | text | -to-Image GAN |
Scene | text | Character Recognition Using Spatiality Embedded Dictionary |
Scene | text | Deblurring Using Text-Specific Multiscale Dictionaries |
Scene | text | detection and recognition with advances in deep learning: A survey |
Scene | text | Detection and Recognition: The Deep Learning Era |
Scene | text | Detection and Segmentation Based on Cascaded Convolution Neural Networks |
Scene | text | Detection and Tracking for a Camera-Equipped Wearable Reading Assistant for the Blind |
Scene | text | detection based on component-level fusion and region-level verification |
Scene | text | detection based on multi-scale SWT and edge filtering |
Scene | text | Detection Based on Robust Stroke Width Transform and Deep Belief Network |
Scene | text | detection based on skeleton-cut detector |
Scene | text | detection method based on the hierarchical model |
Scene | text | detection suitable for parallelizing on multi-core |
Scene | text | detection using adaptive color reduction, adjacent character model and hybrid verification strategy |
Scene | text | detection using graph model built upon maximally stable extremal regions |
Scene | text | detection using sequential nontext filtering |
Scene | text | detection using sparse stroke information and MLP |
Scene | text | Detection Using Superpixel-Based Stroke Feature Transform and Deep Learning Based Region Classification |
Scene | text | Detection via Connected Component Clustering and Nontext Filtering |
Scene | text | Detection via Deep Semantic Feature Fusion and Attention-based Refinement |
Scene | text | Detection via Integrated Discrimination of Component Appearance and Consensus |
Scene | text | detection via stroke width |
Scene | text | Detection with Adaptive Line Clustering |
Scene | text | detection with extremal region based cascaded filtering |
Scene | text | Detection with Recurrent Instance Segmentation |
Scene | text | detection with robust character candidate extraction method |
Scene | text | Detection with Selected Anchors |
Scene | text | detection with superpixels and hierarchical model |
Scene | text | Extraction and Translation for Handheld Devices |
Scene | text | extraction based on edges and support vector regression |
Scene | text | Extraction by Superpixel CRFs Combining Multiple Character Features |
Scene | text | Extraction in Complex Images |
Scene | text | extraction in natural scene images using hierarchical feature combining and verification |
Scene | text | Extraction Using Focus of Mobile Camera |
Scene | text | Extraction with Edge Constraint and Text Collinearity |
Scene | text | extraction with local symmetry transform |
Scene | text | Identification by Leveraging Mid-level Patches and Context Information |
Scene | text | Image Super-Resolution in the Wild |
Scene | text | Localization and Recognition with Oriented Stroke Detection |
Scene | text | Localization Using Gradient Local Correlation |
Scene | text | Recognition and Retrieval for Large Lexicons |
Scene | text | recognition by learning co-occurrence of strokes based on spatiality embedded dictionary |
Scene | text | Recognition in Mobile Applications by Character Descriptor and Structure Configuration |
Scene | text | Recognition Models Explainability Using Local Features |
Scene | text | recognition using a Hough forest implicit shape model and semi-Markov conditional random fields |
Scene | text | Recognition Using Co-occurrence of Histogram of Oriented Gradients |
Scene | text | Recognition using Higher Order Language Priors |
Scene | text | Recognition Using Part-Based Tree-Structured Character Detection |
Scene | text | recognition using residual convolutional recurrent neural network |
Scene | text | Recognition Using Similarity and a Lexicon with Sparse Belief Propagation |
Scene | text | recognition using sparse coding based features |
Scene | text | Recognition Using Structure-Guided Character Detection and Linguistic Knowledge |
Scene | text | Recognition with a Hough Forest Implicit Shape Model |
Scene | text | recognition with CNN classifier and WFST-based word labeling |
Scene | text | recognition with deeper convolutional neural networks |
Scene | text | Recognition with Permuted Autoregressive Sequence Models |
Scene | text | Recognition with Self-supervised Contrastive Predictive Coding |
Scene | text | Recognition: No Country for Old Men? |
Scene | text | rectification using glyph and character alignment properties |
Scene | text | Retrieval via Joint Text Detection and Similarity Learning |
Scene | text | Script Identification with Convolutional Recurrent Neural Networks |
Scene | text | Segmentation Based on Local Image Phase Information and MSER Method |
Scene | text | Segmentation by Paired Data Synthesis |
Scene | text | Segmentation via Inverse Rendering |
Scene | text | Segmentation with Multi-level Maximally Stable Extremal Regions |
Scene | text | Telescope: Text-Focused Scene Image Super-Resolution |
Scene | text | Visual Question Answering |
Scene | text | , Assistance for Visually Impaired |
Scene- | text | Oriented Referring Expression Comprehension |
Scene- | text | Synthesis Engine Achieved Through Learning From Decomposed Real-World Data, A |
Scene- | text | -Detection Method Robust Against Orientation and Discontiguous Components of Characters |
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and | text | |
Scientometric Full- | text | Analysis of Papers Published in Remote Sensing between 2009 and 2021 |
SCOB: Universal | text | Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap |
ScrabbleGAN: Semi-Supervised Varying Length Handwritten | text | Generation |
Screen-rendered | text | images recognition using a deep residual network based segmentation-free method |
Script and nature differentiation for Arabic and Latin | text | images |
Script-Free | text | Line Segmentation Using Interline Space Model for Printed Document Images |
Script-Independent | text | Line Segmentation in Freestyle Handwritten Documents |
Script-independent, HMM-based | text | Line Finding for OCR |
SCUT-COUCH | text | line_NU: An Unconstrained Online Handwritten Chinese Text Lines Dataset |
SCUT-HCCDoc: A new benchmark dataset of handwritten Chinese | text | in unconstrained camera-captured documents |
Search method and apparatus for locating digitally stored content, such as visual images, music and sounds, | text | , or software, in storage devices on a computer network |
Searching a High Performance Feature Extractor for | text | Recognition Network |
Searching OCR'ed | text | : An LDA Based Approach |
Searching through a Speech Memory for | text | -Independent Speaker Verification |
See Finer, See More: Implicit Modality Alignment for | text | -based Person Retrieval |
See-Through- | text | Grouping for Referring Image Segmentation |
SEED: Semantics Enhanced Encoder-Decoder Framework for Scene | text | Recognition |
SeeTek: Very Large-Scale Open-set Logo Recognition with | text | -Aware Metric Learning |
SegLink++: Detecting Dense and Arbitrary-shaped Scene | text | by Instance-aware Component Grouping |
Segmentation and Classification of Mixed | text | /Graphics/Image Documents |
Segmentation and Recognition of Continuous Handwriting Chinese | text | |
Segmentation and Recognition of Dimensioning | text | from Engineering Drawings |
Segmentation and Word Spotting Methods for Printed and Handwritten Arabic | text | s: A Comparative Study |
Segmentation Method of Single- and Multiple-Touching Characters in Offline Handwritten Japanese | text | Recognition, A |
Segmentation of Bangla unconstrained handwritten | text | |
Segmentation of On-Line Freely Written Japanese | text | Using SVM for Improving Text Recognition |
Segmentation of on-line handwritten Japanese | text | of arbitrary line direction by a neural network for improving text recognition |
Segmentation of On-Line Handwritten Japanese | text | Using SVM for Improving Text Recognition |
Segmentation of stick | text | based on sub connected area analysis |
Segmentation of | text | and graphics |
Segmentation of | text | and Graphics from Document Images |
Segmentation of | text | From Color Map Images |
Segmentation of | text | , picture and lines of a document image |
Segmentation of | text | /image documents using texture approaches |
Segmentation of Uniform Colored | text | from Color Graphics Background |
Segmentation of Very Low Resolution Screen-Rendered | text | |
Segmentation-Aware | text | -Guided Image Manipulation |
Segmentation-Free Approach to | text | Recognition with Application to Arabic Text, A |
Segmentation-free handwritten Chinese | text | recognition with LSTM-RNN |
Segmented handwritten | text | recognition with recurrent neural network classifiers |
Segmenting a page of a document into areas which are | text | and areas which are halftone |
Segmenting Messy | text | : Detecting Boundaries in Text Derived from Historical Newspaper Images |
Segmenting | text | Images with Massively Parallel Machines |
Self-attention based | text | Knowledge Mining for Text Detection |
Self-learning structure for | text | localization |
Self-Organized | text | Detection with Minimal Post-processing via Border Learning |
Self-paced Learning to Improve | text | Row Detection in Historical Documents with Missing Labels |
Self-supervised adaptation for on-line script | text | recognition |
Self-supervised Character-to-Character Distillation for | text | Recognition |
Self-supervised deep reconstruction of mixed strip-shredded | text | documents |
Self-Supervised Implicit Glyph Attention for | text | Recognition |
Self-Supervised Learning of Visual Features through Embedding Images into | text | Topic Spaces |
Self-supervised writer adaptation using perceptive concepts: application to on-line | text | recognition |
Self-Training for Domain Adaptive Scene | text | Detection |
Self-training for Handwritten | text | Line Recognition |
SEM-CS: Semantic Clipstyler for | text | -Based Image Style Transfer |
Semantic and Morphological Information Guided Chinese | text | Classification |
Semantic Correlation Mining between Images and | text | s with Global Semantics and Local Mapping |
Semantic Distance Adversarial Learning for | text | -to-Image Synthesis |
Semantic Indexing of Multimedia Content Using Visual, Audio, and | text | Cues |
Semantic Integration of Information Through Relation Mining: Application to Bio-medical | text | Processing |
Semantic keyword extraction via adaptive | text | binarization of unstructured unsourced video |
Semantic Object Accuracy for Generative | text | -to-Image Synthesis |
Semantic Oriented | text | Clustering Based on RDF |
Semantic Proximity Based System of Arabic | text | Indexation, A |
Semantic Role Aware Correlation Transformer for | text | To Video Retrieval |
Semantic role-based representations in | text | classification |
Semantic Similarity Distance: Towards better | text | -image consistency metric in text-to-image generation |
Semantic | text | Summarization of Long Videos |
Semantic-Aware Video | text | Detection |
Semantic-Compensated and Attention-Guided Network for Scene | text | Detection |
Semantic-Preserving Metric Learning for Video- | text | Retrieval |
Semantically Consistent Hierarchical | text | to Fashion Image Synthesis with an Enhanced-Attentional Generative Adversarial Network |
Semantically consistent | text | to fashion image synthesis with an enhanced attentional generative adversarial network |
Semantically Invariant | text | -to-Image Generation |
Semantics Disentangling for | text | -To-Image Generation |
Semantics-Enhanced Adversarial Nets for | text | -to-Image Synthesis |
Semi-automatic news video annotation framework for Arabic | text | |
Semi-Incremental Recognition Method for On-Line Handwritten Japanese | text | , A |
Semi-Incremental Recognition of On-Line Handwritten Japanese | text | |
Semi-supervised learning for | text | -line detection |
Semi-supervised network embedding with | text | information |
Semi-Supervised Pixel-Level Scene | text | Segmentation by Mutually Guided Network |
Semi-Supervised Scene | text | Recognition |
Semi-Supervised | text | Classification With Universum Learning |
Semi-Supervised | text | Detection With Accurate Pseudo-Labels |
Semiautomatic Ground Truth Generation for | text | Detection and Recognition in Video Images |
SemStyle: Learning to Generate Stylised Image Captions Using Unaligned | text | |
Sense discovery via co-clustering on images and | text | |
Sentence level | text | classification in the Kannada language: A classifier's perspective |
Sentiment Similarity-oriented Attention Model with Multi-task Learning for | text | -based Emotion Recognition, A |
Separate Images and Graphics from | text | |
Separating Content from Style Using Adversarial Learning for Recognizing | text | in the Wild |
Separating handwritten material from machine printed | text | using hidden Markov models |
Separating Handwritten | text | from Non-Textual Interference |
Separating Lines of | text | in Free-Form Handwritten Historical Documents |
Separating | text | and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding |
Separation of overlapping | text | from graphics |
Separation of touching and overlapping words in adjacent lines of handwritten | text | |
Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised | text | Recognition |
Seq2seq-based Model with Global Semantic Con | text | for Scene Text Recognition, A |
Sequence as a Whole: A Unified Framework for Video Action Localization With Long-Range | text | Query |
Sequence to Sequence -- Video to | text | |
Sequence-to-Sequence Contrastive Learning for | text | Recognition |
Sequence-To-Sequence Domain Adaptation Network for Robust | text | Image Recognition |
Sequential alignment attention model for scene | text | recognition |
Sequential Deformation for Accurate Scene | text | Detection |
Sequential Monte Carlo video | text | segmentation |
Sequential | text | s Driven Cohesive Motions Synthesis with Natural Transitions |
Sequential visual and semantic consistency for semi-supervised | text | recognition |
set of benchmarks for Handwritten | text | Recognition on historical documents, A |
SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene | text | Recognition |
Shape Robust | text | Detection With Progressive Scale Expansion Network |
Shape-Aware | text | -Driven Layered Video Editing |
Shape-DNA: Effective Character Restoration and Enhancement for Arabic | text | Documents |
Shape-Matching GAN++: Scale Controllable Dynamic Artistic | text | Style Transfer |
ShapeScaffolder: Structure-Aware 3D Shape Generation from | text | |
Shatter and Gather: Learning Referring Image Segmentation with | text | Supervision |
Shifted Diffusion for | text | -to-image Generation |
Shuffle and Divide: Contrastive Learning for Long | text | |
SiamCLIM: | text | -Based Pedestrian Search Via Multi-Modal Siamese Contrastive Learning |
Sigma-Lognormal Model for Handwritten | text | CAPTCHA Generation, A |
Sign Detection Based | text | Localization in Mobile Device Captured Scene Images |
Signing Avatars: Multimodal Challenges for | text | -to-sign Generation |
SimAN: Exploring Self-Supervised Representation Learning of Scene | text | via Similarity-Aware Normalization |
Similarity Search on Semantic Trajectories Using | text | Processing |
Simple and Effective Multi-word Query Spotting in Handwritten | text | Images |
Simple and Robust Correlation Filtering Method for | text | -Based Person Search, A |
Simple Framework for | text | -Supervised Semantic Segmentation, A |
Simulated Annealing Clustering of Chinese Words for Con | text | ual Text Recognition |
SINE: SINgle Image Editing with | text | -to-Image Diffusion Models |
Single Shot Scene | text | Retrieval |
Single Shot | text | Detector with Regional Attention |
Single-frame | text | super-resolution: a bayesian approach |
Single-Line | text | Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition |
SKED: Sketch-guided | text | -based 3D Editing |
skeleton based descriptor for detecting | text | in real scene images, A |
Skeleton Filter: A Self-Symmetric Filter for Skeletonization in Noisy | text | Images |
Sketch and | text | Guided Diffusion Model for Colored Point Cloud Generation |
Sketch is Worth a Thousand Words: Image Retrieval with | text | and Sketch, A |
SketchBird: Learning to Generate Bird Sketches from | text | |
Skew Angle Detection and Correction in | text | Images Using RGB Gradient |
Skew correction and line extraction in binarized printed | text | images |
Skew Detection and | text | Line-Position Determination in Digitized Documents |
Skew detection for complex document images using robust borderlines in both | text | and non-text regions |
Skew detection of | text | in a noisy digitized image |
Skewed | text | correction based on the improved Hough transform |
Sliding Line Point Regression for Shape Robust Scene | text | Detection |
SLOAN: Scale-Adaptive Orientation Attention Network for Scene | text | Recognition |
SMAN: Stacked Multimodal Attention Network for Cross-Modal Image- | text | Retrieval |
SmartBrush: | text | and Shape Guided Object Inpainting with Diffusion Model |
Smile: Sequence-to-Sequence Domain Adaptation with Minimizing Latent Entropy for | text | Image Recognition |
SNAC: Speaker-Normalized Affine Coupling Layer in Flow-Based Architecture for Zero-Shot Multi-Speaker | text | -to-Speech |
Snooper | text | : A multiresolution system for text detection in complex visual scenes |
Snooper | text | : A text detection system for automatic indexing of urban scenes |
Snoopertrack: | text | detection and tracking for outdoor videos |
So Many Heads, So Many Wits: Multimodal Graph Reasoning for | text | -Based Visual Question Answering |
Social Image- | text | Sentiment Classification With Cross-Modal Consistency and Knowledge Distillation |
Sounding Video Generator: A Unified Framework for | text | -Guided Sounding Video Generation |
sparse version of the ridge logistic regression for large-scale | text | categorization, A |
Spatial and Color Spaces Combination for Natural Scene | text | Extraction |
Spatial and Spectral Based Segmentation of | text | in Multispectral Images of Ancient Documents |
Spatially Prioritized and Persistent | text | Detection and Decoding |
Spatio-Temporal Relevance Classification from Geographic | text | s Using Deep Learning |
Special issue on camera-based | text | and document recognition |
Special issue on deep learning for video | text | analysis |
Special Issue on Noisy | text | Analytics |
Special Issue on Noisy | text | Analytics, II |
Special Issue on Noisy | text | Analytics, III |
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of | text | -to-Image Diffusion Models to Learn Any Unseen Style |
Specific Category Region Proposal Network for | text | Detection in Natural Scene |
Spectral approach to find number of clusters of short- | text | documents |
Spectral Fluctuation Method: A | text | ure-Based Method to Extract Text Regions in General Scene Images |
SPEye: A Calibration-Free Gaze-Driven | text | Entry Technique Based on Smooth Pursuit |
Spontaneous Handwriting | text | Recognition and Classification Using Finite-State Models |
Spotting Phrases in Lines of Imaged | text | |
SPTS v2: Single-Point Scene | text | Spotting |
Stable | text | line detection |
StableVideo: | text | -driven Consistency-aware Diffusion Video Editing |
Stacked Cross Attention for Image- | text | Matching |
StackGAN: | text | to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks |
StacMR: Scene- | text | Aware Cross-Modal Retrieval |
STAN: A sequential transformation attention-based network for scene | text | recognition |
STAR-Net: A SpaTial Attention Residue Network for Scene | text | Recognition |
State Estimation in a Document Image and Its Application in | text | Block Identification and Text Line Extraction |
State-of-the-Art in Action: Unconstrained | text | Detection |
Static | text | region detection in video sequences using color and orientation consistencies |
Statistical Approach for Phrase Location and Recognition within a | text | Line: An Application to Street Name Recognition, A |
Statistical modeling for the detection, localization and extraction of | text | from heterogeneous textual images using combined feature scheme |
Statistical | text | Line Analysis in Handwritten Documents |
Steerable Directional Local Profile Technique for Extraction of Handwritten Arabic | text | Lines, A |
STEFANN: Scene | text | Editor Using Font Adaptive Neural Network |
Steganalysis for | text | , Documents |
Stochastic | text | Models for Music Categorization |
Stop Word Location and Identification for Adaptive | text | Recognition |
Store classification using | text | -Exemplar-Similarity and Hypotheses-Weighted-CNN |
Story Segmentation in News Videos Using Visual and | text | Cues |
Story Visualization by Online | text | Augmentation with Context Memory |
StoryDALL-E: Adapting Pretrained | text | -to-Image Transformers for Story Continuation |
Straight-Line Approximation and 1D Representation of Off-Line Handwritten | text | |
Straightening warped | text | lines using polynomial regression |
Stratified Multi-Task Learning for Robust Spotting of Scene | text | s |
Street View | text | Recognition With Deep Learning for Urban Scene Understanding in Intelligent Transportation Systems |
Stretching deep architectures for | text | recognition |
String Matching, | text | Matching |
String-level learning of confidence transformation for Chinese handwritten | text | recognition |
STRIVE: Scene | text | Replacement In Videos |
stroke filter and its application to | text | localization, A |
Stroke Filter for | text | Localization in Video Images |
Stroke Segmentation and Recognition from Bangla Online Handwritten | text | |
Stroke Verification with Gray-level Image for Hangul Video | text | Recognition |
Stroke-Based Scene | text | Erasing Using Synthetic Data for Training |
Strokelets: A Learned Multi-Scale Mid-Level Representation for Scene | text | Recognition |
Strokelets: A Learned Multi-scale Representation for Scene | text | Recognition |
Structural feature-based event clustering for short | text | streams |
Structure-Aware Generative Adversarial Network for | text | -to-Image Generation |
Structuring low-quality videotaped lectures for cross-reference browsing by video | text | analysis |
Study on Automatic Chinese | text | Classification, A |
Style Transformer With Common Knowledge Optimization for Image- | text | Retrieval, The |
StyleCLIP: | text | -Driven Manipulation of StyleGAN Imagery |
StyleMC: Multi-Channel Based Fast | text | -Guided Image Generation and Manipulation |
StyleT2I: Toward Compositional and High-Fidelity | text | -to-Image Synthesis |
Stylized | text | -to-Fashion Image Generation |
Sub-structure Learning Based Handwritten Chinese | text | Recognition |
subtractive clustering scheme for | text | -independent online writer identification, A |
Super-resolution Enhancement of | text | Image Sequences |
Super-Resolved Binarization of | text | Based on the FAIR Algorithm |
Superresolution-based Enhancement of | text | in Digital Video |
supervised algorithm with a new differentiated-weighting scheme for identifying the author of a handwritten | text | , A |
Supervised and Traditional Term Weighting Methods for Automatic | text | Categorization |
Supervised semantic relation mining from linguistically noisy | text | documents |
support vector approach for cross-modal search of images and | text | s, A |
Support vector machine-based approach for | text | description from the video |
Support vector machine-based | text | detection in digital video |
Suppression of non- | text | components in handwritten document images |
Surprisingly Straightforward Scene | text | Removal Method with Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis, The |
survey on camera-captured scene | text | detection and extraction: towards Gurmukhi script, A |
survey on methods, datasets and implementations for scene | text | spotting, A |
survey on | text | generation using generative adversarial networks, A |
SViTT: Temporal Learning of Sparse Video- | text | Transformers |
Swap | text | : Image Based Texts Transfer in Scenes |
Swin | text | Spotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition |
SWT voting-based color reduction method for detecting | text | in natural scene images |
Symbolic Subtraction from Fixed Formatted Graphics and | text | from Filled in Forms |
Symmetric-key block cipher for image and | text | cryptography |
Symmetry-based object proposal for | text | detection |
Symmetry-based | text | line detection in natural scenes |
Symmetry-Constrained Rectification Network for Scene | text | Recognition |
Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to- | text | Translation |
Synthesizing Talking Faces from | text | and Audio: An Autoencoder and Sequence-to-Sequence Convolutional Neural Network |
Synthetic Data for | text | Localisation in Natural Images |
Synthetic-to-real Unsupervised Domain Adaptation for Scene | text | Detection in the Wild |
Synthetically Supervised Feature Learning for Scene | text | Recognition |
System and method for automatically distinguishing between graphic information and | text | information of image data |
System for Bangla Online Handwritten | text | , A |
System for Handwritten and Machine-Printed | text | Separation in Bangla Document Images, A |
system for the off-line recognition of handwritten | text | , A |
T-HOG: An effective gradient-based descriptor for single line | text | regions |
t-SS3: A | text | classifier with dynamic n-grams for early risk detection over text streams |
t-Test feature selection approach based on term frequency for | text | categorization |
T2V2T: | text | -to-Video-to-Text Fusion for Text-to-Video Retrieval |
T2VLAD: Global-Local Sequence Alignment for | text | -Video Retrieval |
Tablet identification using support vector machine based | text | recognition and error correction by enhanced n-grams algorithm |
TACo: Token-aware Cascade Contrastive Learning for Video- | text | Alignment |
TACT: | text | attention based CNN-Transformer network for polyp segmentation |
Tag2Pix: Line Art Colorization Using | text | Tag With SECat and Changing Loss |
Tagging Webcast | text | in Baseball Videos by Video Segmentation and Text Alignment |
Tailoring | text | for automatic layouting of newspaper pages |
TAP: | text | -Aware Pre-training for Text-VQA and Text-Caption |
TAPS3D: | text | -Guided 3D Textured Shape Generation from Pseudo Supervision |
Target-level Sentiment Analysis Based on Image and | text | Fusion |
Task Grouping for Multilingual | text | Recognition |
TCATD: | text | Contour Attention for Scene Text Detection |
TE141K: Artistic | text | Benchmark for Text Effect Transfer |
Teach | text | : CrossModal Generalized Distillation for Text-Video Retrieval |
Technique for Segmentation of Gurmukhi | text | , A |
Tecm-clip: | text | -based Controllable Multi-attribute Face Image Manipulation |
TediGAN: | text | -Guided Diverse Face Image Generation and Manipulation |
Tell Me What Happened: Unifying | text | -guided Video Completion via Multimodal Masked Video Generation |
Tell Your Story: | text | -Driven Face Video Synthesis with High Diversity via Adversarial Learning |
Tem-adapter: Adapting Image- | text | Pretraining for Video Question Answer |
Template Based Segmentation of Touching Components in Handwritten | text | Lines |
Temporal Multimodal Graph Transformer With Global-Local Alignment for Video- | text | Retrieval |
Tensor representation learning based image patch analysis for | text | identification and recognition |
Tensor Voting Based | text | Localization in Natural Scene Images |
Term relevance dependency model for | text | classification |
TETFN: A | text | enhanced transformer fusion network for multimodal sentiment analysis |
TexFusion: Synthesizing 3D | text | ures with Text-Guided Image Diffusion Models |
| text | alignment in early printed books combining deep learning and dynamic programming |
| text | alignment with handwritten documents |
| text | analysis using local energy |
| text | and Documents in the Deep Learning Era |
| text | and Image Guided 3D Avatar Generation and Manipulation |
| text | and Image Sharpening of Scanned Images in the JPEG Domain |
| text | and Layout Information Extraction from Document Files of Various Formats Based on the Analysis of Page Description Language |
| text | and non-text segmentation based on connected component features |
| text | and non-text separation in offline document images: a survey |
| text | and picture segmentation by the distribution analysis of wavelet coefficients |
| text | and User Generic Model for Writer Verification Using Combined Pen Pressure Information From Ink Intensity and Indented Writing on Paper |
| text | Area Detection in Digital Documents Images Using Textural Features |
| text | area localization under complex-background using wavelet decomposition |
| text | Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution, A |
| text | baseline detection, a single page trained system |
| text | Baseline Recognition Using a Recurrent Convolutional Neural Network |
| text | binarization in color documents |
| text | Block Segmentation in Comic Speech Bubbles |
| text | box proposals for handwritten word spotting from documents |
| text | Categorization Approach for Music Style Recognition, A |
| text | Categorization: A Symbolic Approach |
| text | Classification and Document Layout Analysis of Paper Fragments |
| text | classification with the support of pruned dependency patterns |
| text | Co-Detection in Multi-View Scene |
| text | Compression-Aided Transformer Encoding |
| text | data extraction from microfilm images of punched cards |
| text | degradations and OCR training |
| text | Detection and Character Recognition in Scene Images with Unsupervised Feature Learning |
| text | Detection and Localization in Complex Scene Images using Constrained AdaBoost Algorithm |
| text | Detection and Recognition in Imagery: A Survey |
| text | detection and recognition in images and video frames |
| text | detection and recognition in natural scene with edge analysis |
| text | Detection and Recognition in Real World Images |
| text | detection and recognition in urban scenes |
| text | Detection and Recognition on Traffic Panels From Street-Level Imagery Using Visual Appearance |
| text | detection and restoration in natural scene images |
| text | Detection and Translation from Natural Scenes |
| text | detection based on convolutional neural networks with spatial pyramid pooling |
| text | Detection for Video Analysis |
| text | detection from natural scene images using topographic maps and sparse representations |
| text | Detection from Natural Scene Images: Towards a System for Visually Impaired Persons |
| text | detection from scene images using sparse representation |
| text | detection in color scene images based on unsupervised clustering of multi-channel wavelet features |
| text | detection in continuous tone image segments |
| text | Detection in Digital Images Captured with Low Resolution Under Nonuniform Illumination Conditions |
| text | detection in images based on unsupervised classification of edge-based features |
| text | detection in images based on unsupervised classification of high-frequency wavelet coefficients |
| text | detection in images using sparse representation with discriminative dictionaries |
| text | detection in manga by combining connected-component-based and region-based classifications |
| text | Detection in Natural Images Using Bio-inspired Models |
| text | Detection in Natural Images Using Localized Stroke Width Transform |
| text | Detection in Natural Scene Images by Stroke Gabor Words |
| text | detection in natural scene images with user-intention |
| text | detection in natural scene with edge analysis |
| text | detection in natural scenes using Gradient Vector Flow-Guided symmetry |
| text | detection in nature scene images using two-stage nontext filtering |
| text | detection in scene images based on exhaustive segmentation |
| text | detection in stores using a repetition prior |
| text | Detection of Two Major Indian Scripts in Natural Scene Images |
| text | detection on camera acquired document images using supervised classification of connected components in wavelet domain |
| text | Detection System for Natural Scenes with Convolutional Feature Learning and Cascaded Classification, A |
| text | Detection Using Edge Gradient and Graph Spectrum |
| text | Detection, Find Text in General Scenes, Scene Text |
| text | detection, localization, and tracking in compressed video |
| text | Detection, Scene Text, Curved Text, Arbitrary Orientation |
| text | Detection, Tracking and Recognition in Video: A Comprehensive Survey |
| text | discrimination method and related apparatus |
| text | Driven Face-Video Synthesis Using GMM and Spatial Correlation |
| text | Driven Temporal Segmentation of Cricket Videos |
| text | effects transfer via distribution-aware texture synthesis |
| text | Encryption: Hybrid cryptographic method using Vigenere and Hill Ciphers. |
| text | Enhancement by PDE's Based Methods |
| text | Enhancement for Laser Copiers |
| text | Enhancement Network for Cross-Domain Scene Text Detection |
| text | enhancement with asymmetric filter for video OCR |
| text | Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model |
| text | extraction from color documents-clustering approaches in three and four dimensions |
| text | Extraction from Colored Book and Journal Covers |
| text | extraction from degraded document images |
| text | extraction from gray scale document images using edge information |
| text | extraction from gray scale historical document images using adaptive local connectivity map |
| text | Extraction from Grey Scale Page Images by Simple Edge Detectors |
| text | extraction from images captured via mobile and digital devices |
| text | extraction from name cards with complex design |
| text | extraction from scene images by character appearance and structure modeling |
| text | Extraction from Street Level Images |
| text | Extraction from Video Using Conditional Random Fields |
| text | extraction from web images based on a split-and-merge segmentation method using colour perception |
| text | extraction in complex color documents |
| text | Extraction in Digital News Video Using Morphology |
| text | Extraction in MPEG Compressed Video for Content-based Indexing |
| text | extraction in real scene images on planar planes |
| text | Extraction Using Component Analysis and Neuro-fuzzy Classification on Complex Backgrounds |
| text | Extraction Using Pyramid |
| text | Flow: A Unified Text Detection System in Natural Scene Images |
| text | From Corners: A Novel Approach to Detect Text and Caption in Videos |
| text | Growing on Leaf |
| text | Guided Person Image Synthesis |
| text | Identification for Document Image Analysis Using a Neural Network |
| text | Identification in Complex Background Using SVM |
| text | identification in noisy document images using Markov random field |
| text | Image Classifier Using Image-Wise Annotation |
| text | Image Clean-Up and Thresholding: A Comparative Study |
| text | Image Compression Using Soft Pattern Matching |
| text | Image Deblurring Using Kernel Sparsity Prior |
| text | Image Deblurring Using Text-Specific Properties |
| text | Image Deblurring via Intensity Extremums Prior |
| text | in Everything |
| text | in Scenes, Stroke Based, Contour Based |
| text | independent speaker gender recognition using lip movement |
| text | Independent Writer Identification for Bengali Script |
| text | independent writer identification of Arabic manuscripts and the effects of writers increase |
| text | independent writer recognition using redundant writing patterns with contour-based orientation and curvature features |
| text | information extraction in images and video: a survey |
| text | Input System Using Online Overlapped Handwriting Recognition for Mobile Devices |
| text | is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation |
| text | line and word segmentation of handwritten documents |
| text | line bounding system |
| text | Line Characterization by Connected Component Transformations |
| text | Line Detection for Heterogeneous Documents |
| text | Line Detection in Corrupted and Damaged Historical Manuscripts |
| text | Line Detection in Document Images: Towards a Support System for the Blind |
| text | line detection in handwritten documents |
| text | Line Detection in Unconstrained Handwritten Documents Using a Block-Based Hough Transform Approach |
| text | Line Detection Method for Mathematical Formula Recognition, A |
| text | line extraction for historical document images |
| text | line extraction from multi-skewed handwritten documents |
| text | line extraction in document images |
| text | Line Extraction in Documents |
| text | line extraction in graphical documents using background and foreground information |
| text | Line Extraction in Handwritten Document with Kalman Filter Applied on Low Resolution Image |
| text | Line Extraction Method Using Domain-Based Active Contour Model |
| text | line extraction strategy for palm leaf manuscripts |
| text | Line Extraction Using Adaptive Partial Projection for Palm Leaf Manuscripts from Thailand |
| text | Line Extraction Using DMLP Classifiers for Historical Manuscripts |
| text | Line Extraction Using Fully Convolutional Network and Energy Minimization |
| text | line segmentation and word recognition in a system for general writer independent handwriting recognition |
| text | Line Segmentation Based on Morphology and Histogram Projection |
| text | Line Segmentation for Unconstrained Handwritten Document Images Using Neighborhood Connected Component Analysis |
| text | line segmentation in Chinese handwritten text images |
| text | Line Segmentation in Handwritten Documents Using Mumford-Shah Model |
| text | Line Segmentation in Images of Handwritten Historical Documents |
| text | Line Segmentation of Historical Arabic Documents |
| text | line segmentation of historical documents: a survey |
| text | line segmentation using a fully convolutional network in handwritten document images |
| text | Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis |
| text | Localization and Extraction from Complex Color Images |
| text | Localization and Extraction from Complex Gray Images |
| text | Localization and Recognition in Complex Scenes Using Local Features |
| text | Localization Based on Fast Feature Pyramids and Multi-Resolution Maximally Stable Extremal Regions |
| text | Localization in Born-Digital Images of Advertisements |
| text | Localization in Natural Images Using Stroke Feature Transform and Text Covariance Descriptors |
| text | Localization in Natural Scene Images Based on Conditional Random Field |
| text | Localization in Real-World Images Using Efficiently Pruned Exhaustive Search |
| text | Localization in Web Images Using Probabilistic Candidate Selection Model |
| text | localization using image cues and text line information |
| text | localization, enhancement and binarization in multimedia documents |
| text | locating from natural scene images using image intensities |
| text | location in complex images |
| text | Mining in Remotely Sensed Phenology Studies: A Review on Research Development, Main Topics, and Emerging Issues |
| text | Mining the Contributors to Rail Accidents |
| text | OCR by Solving a Cryptogram |
| text | only Analysis |
| text | Page Recognition Using Grey-Level Features and Hidden Markov-Models |
| text | Parsing Using Spatial Information for Recognizing Addresses in Mail Pieces |
| text | Particles Multi-band Fusion for Robust Text Detection |
| text | Position-Aware Pixel Aggregation Network With Adaptive Gaussian Threshold: Detecting Text in the Wild |
| text | Prior Guided Scene Text Image Super-Resolution |
| text | Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding |
| text | reading algorithm for natural images, A |
| text | Recognition - Real World Data and Where to Find Them |
| text | Recognition and Retrieval System for e-Business Image Management, A |
| text | recognition from grey level images using hidden Markov models |
| text | Recognition in Images Based on Transformer with Hierarchical Attention |
| text | recognition in multimedia documents: A study of two neural-based OCRs using and avoiding character segmentation |
| text | Recognition in Real Scenarios with a Few Labeled Samples |
| text | Recognition in the Wild: A Survey |
| text | recognition of low-resolution document images |
| text | Recognition System for Japanese Documents |
| text | recognition using deep BLSTM networks |
| text | Recognition: From Pixels to Meaning |
| text | Region Conditional Generative Adversarial Network for Text Concealment in the Wild |
| text | region extraction and text segmentation on camera-captured document style images |
| text | Region Extraction from Quality Degraded Document Images |
| text | Region Extraction From Scene Images Using AGF and MSER |
| text | region extraction in a document image based on the Delaunay tessellation |
| text | retrieval from early printed books |
| text | scanner with text detection technology on image sequences |
| text | search for medieval manuscript images |
| text | segmentation and recognition in complex background based on markov random field |
| text | Segmentation by Clustering Cohesion |
| text | Segmentation for MRC Document Compression |
| text | Segmentation from Complex Background Using Sparse Representations |
| text | segmentation in color images using tensor voting |
| text | Segmentation in Colour Posters from the Spanish Civil War Era |
| text | segmentation in natural scenes using Toggle-Mapping |
| text | Segmentation in Unconstrained Hand-Drawings in Whiteboard Photos |
| text | Segmentation of Consumer Magazines in PDF Format |
| text | Segmentation Using Gabor Filters for Automatic Document Processing |
| text | segmentation using superpixel clustering |
| text | selection by structured light marking for hand-held cameras |
| text | Separation from Mixed Documents Using a Tree-Structured Classifier |
| text | Similarity Measurement Method Based on BiLSTM-SECapsNet Model |
| text | Spotting Transformers |
| text | String Detection From Natural Scenes by Structure-Based Partition and Grouping |
| text | String Extraction from Images of Color-Printed Documents |
| text | Synopsis Generation for Egocentric Videos |
| text | to Image Generation with Semantic-Spatial Aware GAN |
| text | To Image Synthesis With Erudite Generative Adversarial Networks |
| text | to Image, Layout to Image, Image Based Rendering |
| text | to photo-realistic image synthesis via chained deep recurrent generative adversarial network |
| text | to visual synthesis with appearance models |
| text | Verification in an Automated System for the Extraction of Bibliographic Data |
| text | vs. Non-Text Regions |
| text | watermarking algorithm based on word classification and inter-word space statistics, A |
| text | with Knowledge Graph Augmented Transformer for Video Captioning |
| text | zone classification using unsupervised feature learning |
| text | - and speech-based phonotactic models for spoken language identification of Basque and Spanish |
| text | -Attentional Convolutional Neural Network for Scene Text Detection |
| text | -aware balloon extraction from manga |
| text | -aware image dehazing using stroke width transform |
| text | -based Geometric Normalization for Robust Watermarking of Digital Maps |
| text | -based image retrieval using progressive multi-instance learning |
| text | -Based Localization of Moments in a Video Corpus |
| text | -based Person Search via Attribute-aided Matching |
| text | -Based Temporal Localization of Novel Events |
| text | -based visual context modulation neural model for multimodal machine translation, A |
| text | -Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models |
| text | -Driven Automatic Frame Generation Using MPEG-4 Synthetic/Natural Hybrid Coding for 2-D Head-and-Shoulder Scene |
| text | -Driven Generative Domain Adaptation with Spectral Consistency Regularization |
| text | -driven Stylization of Video Objects |
| text | -Driven Video Acceleration: A Weakly-Supervised Reinforcement Learning Method |
| text | -Edge-Box: An Object Proposal Approach for Scene Texts Localization |
| text | -Enhanced Scene Image Super-Resolution via Stroke Mask and Orthogonal Attention |
| text | -graphics separation to detect logo and stamp from color document images: A spectral approach |
| text | -Guided Eyeglasses Manipulation With Spatial Constraints |
| text | -Guided Facial Image Manipulation for Wild Images via Manipulation Direction-Based Loss |
| text | -guided Fourier Augmentation for long-tailed recognition |
| text | -Guided Generation and Refinement Model for Image Captioning, A |
| text | -Guided Human Image Manipulation via Image-Text Shared Space |
| text | -Guided Neural Network Training for Image Recognition in Natural Scenes and Medicine |
| text | -Guided Object Detector for Multi-modal Video Question Answering |
| text | -Guided Unsupervised Latent Transformation for Multi-Attribute Image Manipulation |
| text | -guided visual representation learning for medical image retrieval systems |
| text | -image separation in Devanagari documents |
| text | -image super-resolution through anchored neighborhood regression with multiple class-specific dictionaries |
| text | -Independent Online Writer Identification Using Hidden Markov Models |
| text | -independent Persian writer identification based on feature relation graph (FRG), A |
| text | -independent speaker identification using Radon and discrete cosine transforms based features from speech spectrogram |
| text | -independent speaker recognition using graph matching |
| text | -independent speaker verification with ant colony optimization feature selection and support vector machine |
| text | -independent voice conversion using deep neural network based phonetic level features |
| text | -Independent Writer Identification and Verification on Offline Arabic Handwriting |
| text | -Independent Writer Identification and Verification Using Textural and Allographic Features |
| text | -Independent Writer Identification Based on Fusion of Dynamic and Static Features |
| text | -Independent Writer Identification on Online Arabic Handwriting |
| text | -independent writer identification using convolutional neural network |
| text | -independent writer identification using SIFT descriptor and contour-directional feature |
| text | -independent writer recognition using multi-script handwritten texts |
| text | -indicated writer verification using hidden Markov models |
| text | -instance graph: Exploring the relational semantics for text-based visual question answering |
| text | -Line Detection in Camera-Captured Document Images Using the State Estimation of Connected Components |
| text | -line examination for document forgery detection |
| text | -line Extraction and Character Recognition of Document Headlines with Graphical Designs Using Complementary Similarity Measure |
| text | -Line Extraction and Character Recognition of Japanese Newspaper Headlines With Graphical Designs |
| text | -Line Extraction in Handwritten Chinese Documents Based on an Energy Minimization Framework |
| text | -Line Extraction Using a Convolution of Isotropic Gaussian Filter with a Set of Line Filters |
| text | -mining based journal splitting |
| text | -Pose Estimation in 3D Using Edge-Direction Distributions |
| text | -to-Image Generation Grounded by Fine-Grained User Attention |
| text | -to-Image Generation via Semi-Supervised Training |
| text | -to-Image Synthesis based on Object-Guided Joint-Decoding Transformer |
| text | -to-image synthesis with self-supervised bi-stage generative adversarial network |
| text | -to-image synthesis with self-supervised learning |
| text | -to-image via mask anchor points |
| text | -to-Traffic Generative Adversarial Network for Traffic Situation Generation |
| text | -to-video: a semantic search engine for internet videos |
| text | -Tracking Wearable Camera System for the Blind |
| text | -tracking wearable camera system for visually-impaired people |
| text | -Video Completion Using Structure Repair and Texture Propagation |
| text | -Visual Prompting for Efficient 2D Temporal Video Grounding |
| text | /continuous tone image decision processor |
| text | /graphic labelling of ancient printed documents |
| text | /graphic separation using a sparse representation with multi-learned dictionaries |
| text | /Graphics Segmentation in Architectural Floor Plans |
| text | /Graphics Separation Revisited |
| text | /image separation method |
| text | /Non-Text Image Classification in the Wild with Convolutional Neural Networks |
| text | /Non-text Ink Stroke Classification in Japanese Handwriting Based on Markov Random Fields |
| text | /shape classifier for mobile applications with handwriting input |
| text | 2Concept: Concept Activation Vectors Directly from Text |
| text | 2LIVE: Text-Driven Layered Image and Video Editing |
| text | 2Mesh: Text-Driven Neural Stylization for Meshes |
| text | 2Performer: Text-Driven Human Video Generation |
| text | 2Pos: Text-to-Point-Cloud Cross-Modal Localization |
| text | 2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models |
| text | 2Scene: Text-driven Indoor Scene Stylization with Part-Aware Details |
| text | 2Sketch: Learning Face Sketch from Facial Attribute Text |
| text | 2Tex: Text-driven Texture Synthesis via Diffusion Models |
| text | 2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators |
| text | 2Video: An End-to-end Learning Framework for Expressing Text With Videos |
| text | AdaIN: Paying Attention to Shortcut Learning in Text Recognizers |
| text | Boxes++: A Single-Shot Oriented Scene Text Detector |
| text | Catcher: a method to detect curved and challenging text in natural scenes |
| text | ContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade |
| text | DCT: Arbitrary-Shaped Text Detection via Discrete Cosine Transform Mask |
| text | Dragon: An End-to-End Framework for Arbitrary Shaped Text Spotting |
| text | Face: Text-to-Style Mapping Based Face Generation and Manipulation |
| text | Field: Learning a Deep Direction Field for Irregular Scene Text Detection |
| text | Finder: An Automatic System to Detect and Recognize Text in Images |
| text | ManiA: Enriching Visual Feature by Text-driven Manifold Augmentation |
| text | Mountain: Accurate scene text detection via instance segmentation |
| text | Net: Irregular Text Reading from Images with an End-to-End Trainable Network |
| text | OCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text |
| text | Place: Visual Place Recognition and Topological Localization Through Reading Scene Texts |
| text | Proposals: A text-specific selective search algorithm for word spotting in the wild |
| text | RS: Deep Bidirectional Triplet Network for Matching Text to Remote Sensing Images |
| text | s as Images in Prompt Tuning for Multi-Label Image Recognition |
| text | s as points: Scene text detection with point supervision |
| text | SLAM: Visual SLAM With Semantic Planar Text Features |
| text | Snake: A Flexible Representation for Detecting Text of Arbitrary Shapes |
| text | SRNet: Scene Text Super-Resolution Based on Contour Prior and Atrous Convolution |
| text | StyleBrush: Transfer of Text Aesthetics From a Single Example |
| text | ual Concept Expansion with Commonsense Knowledge to Improve Dual-Stream Image-Text Matching |
| text | ual Visual Semantic Dataset for Text Spotting |
| text | ure-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm |
TFRGAN: Leveraging | text | Information for Blind Face Restoration with Extreme Degradation |
Thinking Fast and Slow: Efficient | text | -to-Visual Retrieval with Transformers |
Three decision levels strategy for Arabic and Latin | text | s differentiation in printed and handwritten natures |
Three-Dimensional Lip Motion Network for | text | -Independent Speaker Recognition |
Thresholding video images for | text | detection |
Ticker: An Adaptive Single-Switch | text | Entry Method for Visually Impaired Users |
TIED: A Cycle Consistent Encoder-Decoder Model for | text | -to-Image Retrieval |
TieNet: | text | -Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays |
TIFA: Accurate and Interpretable | text | -to-Image Faithfulness Evaluation with Question Answering |
Tightness-Aware Evaluation Protocol for Scene | text | Detection |
TIPS: | text | -Induced Pose Synthesis |
TISE: Bag of Metrics for | text | -to-Image Synthesis Evaluation |
Tk- | text | : Multi-shaped Scene Text Detection via Instance Segmentation |
TKDN: Scene | text | Detection via Keypoints Detection |
TLWSR: Weakly supervised real-world scene | text | image super-resolution using text label |
TM2D: Bimodality Driven 3D Dance Generation via Music- | text | Integration |
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and | text | s |
TMR: | text | -to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis |
To Speak or to | text | : Effects of Display Type and I/O Style on Mobile Virtual Humans Nurse Training |
To | text | or not to text- drivers' interpretation of traffic situations as the basis for their decision to (not) engage in text messaging |
Tool for Ground-Truthing | text | Lines and Characters in Off-Line Handwritten Chinese Documents, A |
Top-down and bottom-up cues for scene | text | recognition |
Topic Language Model Adaption for Recognition of Homologous Offline Handwritten Chinese | text | Image |
Total- | text | : toward orientation robustness in scene text detection |
Toward Integrated Scene | text | Reading |
Toward | text | -independent Cross-lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset |
Toward Understanding WordArt: Corner-Guided Transformer for Scene | text | Recognition |
Toward Verifiable and Reproducible Human Evaluation for | text | -to-Image Generation |
Towards Accurate Scene | text | Recognition With Semantic Reasoning Networks |
Towards Accurate | text | -based Image Captioning with Content Diversity Exploration |
Towards Automated Transcription of Label | text | from Pinned Insect Collections |
Towards Cycle-Consistent Models for | text | and Image Retrieval |
Towards End-to-End | text | Spotting in Natural Scenes |
Towards End-to-End | text | Spotting with Convolutional Recurrent Neural Networks |
Towards End-to-End Unified Scene | text | Detection and Layout Analysis |
Towards Fast and Accurate Image- | text | Retrieval With Self-Supervised Fine-Grained Alignment |
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image- | text | Pre-Training |
Towards Generic | text | -Line Extraction |
Towards High-Fidelity | text | -Guided 3D Face Generation and Manipulation Using only Images |
Towards Implicit | text | -Guided 3D Shape Generation |
Towards Interactive Facial Image Inpainting by | text | or Exemplar Image |
Towards Language-Free Training for | text | -to-Image Generation |
Towards Modelling an Attention-Based | text | Localization Process |
Towards open-set | text | recognition via label-to-prototype learning |
Towards robust and efficient | text | sign reading from a mobile phone |
Towards Robust Curve | text | Detection With Conditional Spatial Expansion |
Towards Robust Tampered | text | Detection in Document Image: New Dataset and New Solution |
Towards the Unseen: Iterative | text | Recognition by Distilling from Errors |
Towards Unconstrained End-to-End | text | Spotting |
Towards Unified Scene | text | Spotting Based on Sequence Generation |
Towards Weakly-Supervised | text | Spotting using a Multi-Task Transformer |
Towards Zero-Shot Multi-Speaker Multi-Accent | text | -to-Speech Synthesis |
Trace Controlled | text | to Image Generation |
Tracking Based Multi-Orientation Scene | text | Detection: A Unified Framework With Dynamic Programming |
Traffic Video Event Retrieval via | text | Query using Vehicle Appearance and Motion Attributes |
Training on severely degraded | text | -line images |
Training-Free Location-Aware | text | -to-Image Synthesis |
Transcript Mapping for Handwritten | text | Lines Using Conditional Random Fields |
Transferable Adversarial Attacks for Deep Scene | text | Detection |
Transferring Image-CLIP to Video- | text | Retrieval via Temporal Relations |
Transferring Knowledge From | text | to Video: Zero-Shot Anticipation for Procedural Actions |
Transform invariant | text | extraction |
Transformation of arc-form- | text | to linear-form-text suitable for OCR |
Transformer models for enhancing AttnGAN based | text | to image generation |
Transformer Reasoning Network for Image- | text | Matching and Retrieval |
Transformer-based | text | Detection in the Wild |
Transparent | text | Detection and Background Recovery |
Trans | text | Net: Transducing Text for Recognizing Unseen Visual Relationships |
Tree structure for word extraction from handwritten | text | lines |
True color distributions of scene | text | and background |
TS-RNN: | text | Steganalysis Based on Recurrent Neural Networks |
TS2-Net: Token Shift and Selection Transformer for | text | -Video Retrieval |
TSA-SCC: | text | Semantic-Aware Screen Content Coding With Ultra Low Bitrate |
TSINIT: A Two-Stage Inpainting Network for Incomplete | text | |
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for | text | -to-Video Generation |
Turning a CLIP Model into a Scene | text | Detector |
TV Commercial Detection Based on Shot Change and | text | Extraction |
TV Program Classification Based on Face and | text | Processing |
TV program segmentation using | text | -visual analysis |
TWD: A New Deep E2E Model for | text | Watermark/Caption and Scene Text Detection in Video |
Twitter Stream Analysis, Tweets, | text | s, SMS, Internet |
Two approaches for | text | segmentation in web images |
Two combination stages of clustered One-Class Classifiers for writer identification from | text | fragments |
Two Stage SVM and kNN | text | Documents Classifier |
Two-Level Rectification Attention Network for Scene | text | Recognition, A |
Two-Pass Clustering Technique for Orientation-Invariant and Language-Independent | text | Localization |
Two-stage hybrid binarization around fringe map based | text | line segmentation for document images |
two-stage method for | text | line detection in historical documents, A |
Two-stage Multimodality Fusion for High-performance | text | -based Visual Question Answering |
Two-stage partial image- | text | clustering (TPIT-C) |
two-stage scheme for | text | detection in video images, A |
Two-Stage Seamless | text | Erasing on Real-World Scene Images |
Txt2Img-MHN: Remote Sensing Image Generation From | text | Using Modern Hopfield Networks |
Type-2 Fuzzy GMMs for Robust | text | -Independent Speaker Verification in Noisy Environments |
Typing in Mid Air: Assessing One- and Two-Handed | text | Input Methods of the Microsoft HoloLens 2 |
Typographical Features for Scene | text | Recognition |
Typography With Decor: Intelligent | text | Style Transfer |
UATVR: Uncertainty-Adaptive | text | -Video Retrieval |
UHaT: Urdu handwritten | text | dataset |
Unambiguous Scene | text | Segmentation With Referring Expression Comprehension |
Unambiguous | text | Localization and Retrieval for Cluttered Scenes |
Unambiguous | text | Localization, Retrieval, and Recognition for Cluttered Scenes |
Unconstrained end-to-end | text | reading with feature rectification |
Uncorrelated Geo- | text | Inhibition Method Based on Voronoi K-Order and Spatial Correlations in Web Maps |
Uncovering the Disentanglement Capability in | text | -to-Image Diffusion Models |
Uncurated Image- | text | Datasets: Shedding Light on Demographic Bias |
Understanding Handwritten | text | in a Structured Environment |
Understanding Plane Geometry Problems by Integrating Relations Extracted from | text | and Diagram |
Understanding Video Scenes through | text | : Insights from Text-based Video Question Answering |
Unified Adaptive Relevance Distinguishable Attention Network for Image- | text | Matching |
Unified Coarse-to-Fine Alignment for Video- | text | Retrieval |
Unified Contrastive Learning in Image- | text | -Label Space |
Unified Framework for Multioriented | text | Detection and Recognition, A |
Unified Framework for Tracking Based | text | Detection and Recognition from Web Videos, A |
unified method for augmented incremental recognition of online handwritten Japanese and English | text | , A |
Unified Performance Evaluation for OCR Zoning: Calculating Page Segmentation's Score, That Includes | text | Zones, Tables and Non-text Objects |
Unified Pre-training with Pseudo | text | s for Text-To-Image Person Re-identification |
Unified | text | Extraction Method for Instructional Videos, A |
Unifying Vision, | text | , and Layout for Universal Document Processing |
unique approach in | text | independent speaker recognition using MFCC feature sets and probabilistic neural network, A |
Unit Selection Using Linguistic, Prosodic and Spectral Distance for Developing | text | -to-Speech System in Hindi |
UniTAB: Unifying | text | and Box Outputs for Grounded Vision-Language Modeling |
Uniter: Universal Image- | text | Representation Learning |
Unleashing | text | -to-Image Diffusion Models for Visual Perception |
Unsupervised Alignment of News Video and | text | Using Visual Patterns and Textual Concepts |
Unsupervised Approach for Video | text | Localization, An |
Unsupervised Block Covering Analysis for | text | -Line Segmentation of Arabic Ancient Handwritten Document Images |
Unsupervised categorization of heterogeneous | text | images based on fractals |
Unsupervised clustering of | text | entities in heterogeneous grey level documents |
Unsupervised Compositional Concepts Discovery with | text | -to-Image Generative Models |
Unsupervised Cross-Modal Hashing Method Robust to Noisy Training Image- | text | Correspondences in Remote Sensing, An |
Unsupervised deep learning for | text | line segmentation |
Unsupervised Domain Adaptation via Class Aggregation for | text | Recognition |
Unsupervised Domain Adaptation with Imbalanced Character Distribution for Scene | text | Recognition |
Unsupervised Image and | text | Fusion for Travel Information Enhancement |
Unsupervised language model adaptation for handwritten Chinese | text | recognition |
Unsupervised Prompt Tuning for | text | -Driven Object Detection |
Unsupervised refinement of color and stroke features for | text | binarization |
Unsupervised Segmentation of | text | Fragments in Real Scenes |
Unsupervised Speech | text | Localization in Comic Images |
Unsupervised | text | Segmentation Using Color and Wavelet Features |
Unsupervised | text | -to-image synthesis |
Unsupervised writer adaptation applied to handwritten | text | recognition |
Urdu handwritten | text | recognition: a survey |
Usage-Oriented Performance Evaluation for | text | Localization Algorithms |
Use of a Dictionary in Conjunction with a Handwritten | text | s Recognizer |
Use of an Evolutive Base of Models in a System for Reading Printed | text | s |
Use of Captions and Other Collateral | text | in Understanding Photographs |
Use of Collateral | text | in Image Interpretation |
Use of Collateral | text | in Understanding Photos in Documents |
Use of Global Con | text | in Text Recognition, The |
Use of the Hough transform to separate merged | text | /graphics in forms |
USER: Unified Semantic Enhancement With Momentum Contrast for Image- | text | Retrieval |
Using a boosted tree classifier for | text | segmentation in hand-annotated documents |
Using a Probabilistic Syllable Model to Improve Scene | text | Recognition |
Using Adaptive Run Length Smoothing Algorithm for Accurate | text | Localization in Images |
Using an Exact Performance of Hough Transform for Image | text | Segmentation |
Using Biographical | text | s as Linked Data for Prosopographical Research and Applications |
Using double attention for | text | tattoo localisation |
Using Hidden Markov Models as a Tool for Handwritten | text | Line Segmentation |
Using histogram representation and Earth Mover's Distance as an evaluation tool for | text | detection |
Using irregular pyramid for | text | segmentation and binarization of gray scale images |
Using Kernel Density Classifier with Topic Model and Cost Sensitive Learning for Automatic | text | Categorization |
Using Large | text | To Image Models with Structured Prompts for Skin Disease Identification: A Case Study |
Using Mouse Feedback in Computer Assisted Transcription of Handwritten | text | Images |
Using Multimodal Contrastive Knowledge Distillation for Video- | text | Retrieval |
Using Multiple Frame Integration for the | text | Recognition of Video |
Using Object Information for Spotting | text | |
Using pyramid of histogram of oriented gradients on natural scene | text | recognition |
Using Readers' Highlighting on Monochromatic Documents for Automatic | text | Transcription and Summarization |
Using Scale-Space Anisotropic Smoothing for | text | Line Extraction in Historical Documents |
Using Shape and Layout Information to Find Signatures, | text | , and Graphics |
Using | text | to Teach Image Retrieval |
Using Typical Testors for Feature Selection in | text | Categorization |
Using web search engines to improve | text | recognition |
Using Webcast | text | for Semantic Event Detection in Broadcast Sports Video |
UT-GAN: A Novel Unpaired | text | ual-Attention Generative Adversarial Network for Low-Light Text Image Enhancement |
Utilization of | text | ure, contrast and color homogeneity for detecting and recognizing text from video frames |
Uyghur Language | text | Detection in Complex Background Images Using Enhanced MSERs |
Uyghur | text | Localization with Fast Component Detection |
Variance Based Image Binarization Scheme and Its Application in | text | Segmentation, A |
Variational Bayes Method for Handwritten | text | Line Segmentation, A |
Variational Distribution Learning for Unsupervised | text | -to-Image Generation |
Variational DNN embeddings for | text | -independent speaker verification |
Vector Quantized Diffusion Model for | text | -to-Image Synthesis |
VectorFusion: | text | -to-SVG by Abstracting Pixel-Based Diffusion Models |
Verisimilar Image Synthesis for Accurate Detection and Recognition of | text | s in Scenes |
Versatile Diffusion: | text | , Images and Variations All in One Diffusion Model |
Vertical bar detection for gauging | text | similarity of document images |
Vesselness for | text | detection in historical document images |
VGSG: Vision-Guided Semantic-Group Network for | text | -Based Person Search |
Video Analysis -- Captions, | text | , Video Text |
Video and | text | Matching with Conditioned Embeddings |
Video captioning with | text | -based dynamic attention and step-by-step learning |
Video Frame-wise Explanation Driven Contrastive Learning for Procedural | text | Generation |
Video Generation from | text | Employing Latent Path Construction for Temporal Modeling |
Video google: A | text | retrieval approach to object matching in videos |
Video Question Answering Using Clip-Guided Visual- | text | Attention |
Video Question Answering with Iterative Video- | text | Co-tokenization |
Video Scene | text | Frames Categorization for Text Detection and Recognition |
Video Script Identification Based on | text | Lines |
Video search in concept subspace: a | text | -like paradigm |
Video Search with CLIP and Interactive | text | Query Reformulation |
Video | text | detection and recognition: Dataset and benchmark |
Video | text | Detection System Based on Automated Training, A |
Video | text | Extraction Using the Fusion of Color Gradient and Log-Gabor Filter |
video | text | location method based on background classification, A |
Video | text | recognition using feature compensation as category-dependent feature extraction |
Video | text | recognition using sequential Monte Carlo and error voting methods |
Video | text | Tracking With a Spatio-Temporal Complementary Model |
Video, | text | , and Speech-Driven Realistic 3-D Virtual Head for Human-Machine Interface, A |
Video- | text | as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning |
Video- | text | Compliance: Activity Verification Based on Natural Language Instructions |
Video- | text | Representation Learning via Differentiable Weak Temporal Alignment |
ViLEM: Visual-Language Error Modeling for Image- | text | Retrieval |
Vision and | text | Transformer for Predicting Answerability on Visual Question Answering |
Vision-Language Matching for | text | -to-Image Synthesis via Generative Adversarial Networks |
Vision-Language Pre-Training for Boosting Scene | text | Detectors |
ViSTA: Vision and Scene | text | Aggregation for Cross-Modal Retrieval |
Visual Attention Based Approach to | text | Extraction, A |
Visual enhancement of incised | text | |
Visual Re-ranking with Natural Language Understanding for | text | Spotting |
Visual Semantic Reasoning for Image- | text | Matching |
Visual Semantics: Extracting Visual Information from | text | Accompanying Pictures |
Visual speaker authentication with random prompt | text | s by a dual-task CNN framework |
Visual | text | Correction |
Visual | text | Recognition Through Contextual Processing |
Visual Word Embedding for | text | Classification |
Visual-relation Conscious Image Generation from Structured- | text | |
Visual- | text | ual Capsule Routing for Text-Based Video Segmentation |
Visualizing Unstructured | text | Sequences Using Iterative Visual Clustering |
Visually-Enabled Active Deep Learning for (Geo) | text | and Image Classification: A Review |
Vita-CLIP: Video and | text | adaptive CLIP via Multimodal Prompting |
Viterbi algorithm as an aid in | text | recognition, The |
VoP: | text | -Video Co-Operative Prompt Tuning for Cross-Modal Retrieval |
Vox-E: | text | -guided Voxel Editing of 3D Objects |
VSR++: Improving Visual Semantic Reasoning for Fine-Grained Image- | text | Matching |
VSRNet: End-to-end video segment retrieval with | text | query |
VTC: Improving Video- | text | Retrieval with User Comments |
VX2 | text | : End-to-End Learning of Video-Based Text Generation From Multimodal Inputs |
W-A net: Leveraging Atrous and Deformable Convolutions for Efficient | text | Detection |
Wacnet: Word Segmentation Guided Characters Aggregation Net for Scene | text | Spotting With Arbitrary Shapes |
Watch Your Strokes: Improving Handwritten | text | Recognition with Deformable Convolutions |
Watermarking JBIG2 | text | Region for Image Authentication |
Watermarking | text | document images using edge direction histograms |
Wavelet feature domain adaptive noise reduction using learning algorithm for | text | -independent speaker recognition |
Wavelet feature selection based neural networks with application to the | text | independent speaker identification |
Wavelet-gradient-fusion for video | text | binarization |
Weak supervision for generating pixel-level annotations in scene | text | segmentation |
Weakly Supervised Attention Rectification for Scene | text | Recognition |
Weakly Supervised | text | -based Person Re-Identification |
Weakly Supervised Video Moment Retrieval From | text | Queries |
Weakly Supervised Video Representation Learning with Unaligned | text | for Sequential Videos |
Weakly-Supervised 3D Spatial Reasoning for | text | -Based Visual Question Answering |
Weakly-Supervised Alignment of Video with | text | |
Weakly-Supervised | text | -driven Contrastive Learning for Facial Behavior Understanding |
WEB Image Classification Based on the Fusion of Image and | text | Classifiers |
Webly Supervised Image- | text | Embedding with Noisy Tag Refinement |
Well-calibrated confidence measures for multi-label | text | classification with a large number of labels |
WeStcoin: Weakly-Supervised Con | text | ualized Text Classification with Imbalance and Noisy Labels |
We | text | : Scene Text Detection under Weak Supervision |
WETM: A word embedding-based topic model with modified collapsed Gibbs sampling for short | text | |
What Are You Talking About? | text | -to-Image Coreference |
What does scene | text | tell us? |
What If We Only Use Real Datasets for Scene | text | Recognition? Toward Scene Text Recognition With Fewer Labels |
What is a good evaluation protocol for | text | localization systems? Concerns, arguments, comparisons and solutions |
What is the Real Need for Scene | text | Removal? Exploring the Background Integrity and Erasure Exhaustivity Properties |
What Is Wrong With Scene | text | Recognition Model Comparisons? Dataset and Model Analysis |
What Machines See Is Not What They Get: Fooling Scene | text | Recognition Models With Adversarial Text Images |
When IC meets | text | : Towards a rich annotated integrated circuit text dataset |
Where you edit is what you get: | text | -guided image editing with region-based attention |
Which super-resolution algorithm is proper for Farsi | text | image sequences |
Who's Waldo? Linking People Across | text | and Images |
Whole is Greater than Sum of Parts: Recognizing Scene | text | Words |
Wikipedia-based semantic tensor space model for | text | analytics, A |
Word Extraction from On-Line Handwritten | text | Lines |
Word Image Matching as a Technique for Degraded | text | Recognition |
Word segmentation in handwritten Korean | text | lines based on gap clustering techniques |
Word segmentation of printed | text | lines based on gap clustering and special symbol detection |
Word separation of unconstrained handwritten | text | lines in PCR forms |
Word Shape Analysis in a Knowledge-Based System for Reading | text | |
Word spotting and recognition via a joint deep embedding of image and | text | |
Wordfence: | text | detection in natural images with border awareness |
Words Matter: Scene | text | for Image Classification and Retrieval |
WordSup: Exploiting Word Annotations for Character Based | text | Detection |
Write a Classifier: Predicting Visual Classifiers from Unstructured | text | |
Writer identification using | text | line based features |
Writer-aware CNN for parsimonious HMM-based offline handwritten Chinese | text | recognition |
Writing speed normalization for on-line handwritten | text | recognition |
X-Mesh: Towards Fast and Accurate | text | -driven 3D Stylization via Dynamic Textual Guidance |
X-Pool: Cross-Modal Language-Video Attention for | text | -Video Retrieval |
Zero-Shot Contrastive Loss for | text | -Guided Diffusion Image Style Transfer |
Zero-shot spatial layout conditioning for | text | -to-image diffusion models |
Zero-Shot | text | Classification with Semantically Extended Graph Convolutional Network |
Zero-Shot | text | -Guided Object Generation with Dream Fields |
Zero-Shot | text | -to-Parameter Translation for Game Character Auto-Creation |
ZeroCap: Zero-Shot Image-to- | text | Generation for Visual-Semantic Arithmetic |
Zone identification in the printed Gujarati | text | |
3083 for text