CSJM v.31, n.3 (93), 2023

Multilingual Fine-Grained Named Entity Recognition

Authors: Viorica-Camelia Lupancu, Adrian Iftene

Abstract

The “MultiCoNER II Multilingual Complex Named Entity Recognition” task (https://multiconer.github.io) within the SemEval 2023 competition focuses on identifying complex named entities (NEs), such as the titles of creative works (e.g., songs, books, movies), people with different titles (e.g., politicians, scientists, artists, athletes), and different categories of products (e.g., food, drinks, clothing), in several languages. In the context of SemEval, our team, FII_Better, explored the capabilities of a base transformer model on this task, focusing on five languages (English, Spanish, Swedish, German, and Italian). We took DistilBERT (a distilled version of BERT) and BERT (Bidirectional Encoder Representations from Transformers) as two examples of basic transformer models, using DistilBERT as a baseline and BERT as the platform for an improved model. We obtained moderate results in the English track (ranked 17th out of 34), while our results in the other tracks could be further improved in the future (overall third to last).

Viorica-Camelia Lupancu
"Alexandru Ioan Cuza" University of Iasi, Romania, Faculty of Computer Science
General Berthelot, No. 16, Iasi, Romania

Adrian Iftene
ORCID: https://orcid.org/0000-0003-3564-8440
"Alexandru Ioan Cuza" University of Iasi, Romania, Faculty of Computer Science
General Berthelot, No. 16, Iasi, Romania

DOI

https://doi.org/10.56415/csjm.v31.16

Fulltext

Adobe PDF document, 0.20 MB