Analysis of Searching for Similar Phrases in Sections of Judgements of the Court of Justice of the European Union Based on CountVectorizer and Word2Vec

Name
Sirle Orav-Hinno
Abstract
This master’s thesis analyzes whether CountVectorizer or Word2Vec can be used to build a smarter keyword search that returns the paragraphs of the Court’s judgments that are similar to a given phrase. The current InfoCuria and EUR-Lex search systems do not present a judgment of the Court of Justice of the European Union in a way that lets the reader start directly from the Court’s analysis, nor do they return results containing similar words. As a result, finding the information in the Court’s judgments that corresponds to a phrase is time-consuming. For the thesis, the author created three data tables of court decisions (court assessment and resolution texts together, court assessment texts only, and resolution texts only), in which the court assessment and resolution sections are split into separate rows. The author then applied CountVectorizer and Word2Vec to these datasets to obtain vectors, which were compared against tax law phrases for testing. The result was that CountVectorizer or Word2Vec could be used to create a smarter keyword search (one whose results show the relevant paragraphs of a judgment rather than the full text), but this works only for finding useful sections of the court assessment part. The InfoCuria and EUR-Lex search engines continue to work better for finding practical court resolutions.
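The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses only the Python standard library to build the kind of token-count vectors that a tool such as CountVectorizer produces, and it assumes cosine similarity as the comparison measure, which the abstract does not specify. The function names and the sample paragraphs are hypothetical and invented for illustration; they are not quotations from any judgment.

```python
import math
import re
from collections import Counter


def count_vector(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_paragraphs(phrase, paragraphs):
    """Return (score, paragraph) pairs, most similar to the phrase first."""
    query = count_vector(phrase)
    scored = [(cosine_similarity(query, count_vector(p)), p) for p in paragraphs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up example paragraphs (assumption: one paragraph per table row).
paragraphs = [
    "The right to deduct input value added tax is an integral part of the VAT scheme.",
    "The applicant was ordered to pay the costs of the proceedings.",
    "A taxable person may deduct value added tax paid on goods used for taxed transactions.",
]

ranked = rank_paragraphs("deduct value added tax", paragraphs)
for score, paragraph in ranked:
    print(f"{score:.3f}  {paragraph}")
```

A count-based comparison like this rewards only exact token overlap. An embedding-based approach such as Word2Vec would instead compare dense word vectors, which is what allows paragraphs with similar but not identical words to score well.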
Graduation Thesis language
Estonian
Graduation Thesis type
Master - Conversion Master in IT
Supervisor(s)
Dage Särg, Risto Hinno
Defence year
2021
 