He et al., 2018 - Google Patents

Deep learning in natural language generation from images

Document ID: 15650179927471326957
Author: He X; Deng L
Publication year: 2018
Publication venue: Deep learning in natural language processing

Snippet

Natural language generation from images, also referred to as image or visual captioning, is an emerging deep learning application at the intersection of computer vision and natural language processing. Image captioning also forms the technical foundation for …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management

Similar Documents

Publication	Publication Date	Title
Yang et al.	2020	Image-text multimodal emotion classification via multi-view attentional network
Yang et al.	2018	Video captioning by adversarial LSTM
Wang et al.	2018	Application of convolutional neural network in natural language processing
Gu et al.	2018	Multimodal affective analysis using hierarchical attention strategy with word-level alignment
He et al.	2017	Deep learning for image-to-text generation: A technical overview
Pei et al.	2017	Temporal attention-gated model for robust sequence classification
CN107066464B (en)	2022-12-27	Semantic natural language vector space
CN106973244B (en)	2021-04-20	Method and system for automatically generating image captions using weak supervision data
Islam et al.	2021	Exploring video captioning techniques: A comprehensive survey on deep learning methods
Zhang et al.	2021	Cross-modal image sentiment analysis via deep correlation of textual semantic
He et al.	2018	Deep learning in natural language generation from images
Xian et al.	2019	Self-guiding multimodal LSTM—when we do not have a perfect training dataset for image captioning
Khan et al.	2022	A deep neural framework for image caption generation using gru-based attention mechanism
Zhu et al.	2019	Joint visual-textual sentiment analysis based on cross-modality attention mechanism
CN113392179A (en)	2021-09-14	Text labeling method and device, electronic equipment and storage medium
Chaudhuri	2019	Visual and text sentiment analysis through hierarchical deep learning networks
Lai et al.	2022	Multimodal sentiment analysis with asymmetric window multi-attentions
Phukan et al.	2021	An efficient technique for image captioning using deep neural network
Yang et al.	2024	Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey
Vayadande et al.	2023	Mood detection and emoji classification using tokenization and convolutional neural network
Dutta et al.	2024	EmoComicNet: A multi-task model for comic emotion recognition
Dahikar et al.	2023	Sketch captioning using LSTM and BiLSTM
Jamil et al.	2024	Deep Learning Approaches for Image Captioning: Opportunities, Challenges and Future Potential
Liu et al.	2017	Personalized Recommender System for Children's Book Recommendation with A Realtime Interactive Robot
Lei et al.	2024	Multimodal Sentiment Analysis Based on Composite Hierarchical Fusion