Sentence encoders for semantic textual similarity: A survey
Aarthi Babu (author); Ian Hartley (thesis advisor); David Casperson (committee member); University of Northern British Columbia (degree granting institution)
2018
Master of Science (MSc)
Computer Science
1 online resource (viii, 70 pages)
The last decade has witnessed many accomplishments in the field of Natural Language Processing, especially in understanding language semantics. Well-established machine learning models for generating word representations are available and have proven useful. However, the existing techniques proposed for learning sentence-level representations do not adequately capture the complexity of compositional semantics. Finding the semantic similarity between sentences is a fundamental language understanding problem. In this project, we compare various machine learning models on their ability to capture the semantics of a sentence using the Semantic Textual Similarity (STS) task. We focus on models that exhibit state-of-the-art performance in the SemEval-2017 STS shared task. We also analyse the impact of the models' internal architectures on STS task performance. Of all the models we compared, a Bi-LSTM RNN with a max-pooling layer achieves the best performance in extracting a generic semantic representation and aids in better transfer learning compared to a hierarchical CNN.
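To make the architecture the abstract singles out concrete, the following is a minimal sketch of a Bi-LSTM sentence encoder with max-pooling, written in PyTorch; the class name, dimensions, and the cosine-similarity scoring for STS are illustrative assumptions, not details taken from the thesis itself.

import torch
import torch.nn as nn

class BiLSTMMaxPoolEncoder(nn.Module):
    """Encode a sentence into a fixed-size vector: run a bidirectional
    LSTM over word embeddings, then take the element-wise max over time."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2*hidden_dim)
        sent_vec, _ = h.max(dim=1)               # max-pool across time steps
        return sent_vec                          # (batch, 2*hidden_dim)

# STS-style scoring: cosine similarity between two sentence vectors.
# Vocabulary size and token ids below are dummy values for illustration.
encoder = BiLSTMMaxPoolEncoder(vocab_size=10000)
s1 = torch.randint(0, 10000, (1, 12))  # token ids for sentence 1
s2 = torch.randint(0, 10000, (1, 9))   # token ids for sentence 2
sim = nn.functional.cosine_similarity(encoder(s1), encoder(s2))

The max-pooling step is what yields a fixed-size representation regardless of sentence length, which is what makes the encoder transferable across tasks.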
Natural language processing (Computer science); Semantics; Neural networks (Computer science)
10.24124/2018/58808
research (documents)
Natural Language Processing; language semantics; Semantic Textual Similarity (STS) task; SemEval-2017 STS; Bi-LSTM RNN; hierarchical CNN