Publications
- Books: Authored Monographs
- Books: Editorships
- Journal articles
- Articles in conference proceedings
- Articles in collections
- Shared tasks
- Guidelines
- Talks and presentations
[ FAU Current Research Information System (CRIS) · dblp Computer Science Bibliography · Semantic Scholar · Scopus · ACL Anthology · Google Scholar ]
< ORCID identifier: 0000-0002-6759-1808 >
Authored Monographs
- Kabashi, Besim. 2015. Automatische Verarbeitung der Morphologie des Albanischen. 1st ed. Erlangen, Germany: FAU University Press. [pdf, bib]
Co-Edited Proceedings
Translation Inference Across Dictionaries (TIAD), 2021
- Gracia, Jorge, Besim Kabashi, and Ilan Kernerman. 2021. “Translation Inference Across Dictionaries 2021 Shared Task.” In Proceedings of the Workshops and Tutorials held at LDK 2021 Co-Located with the 4th Language, Data and Knowledge Conference (LDK 2021), edited by Sara Carvalho and Renato Rocha Souza. Zaragoza, Spain, September 1–4, 2021. https://tiad2021.unizar.es/. [CEUR net, bib]
Globalex Workshop on Linked Lexicography, 2020
- Kernerman, Ilan, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, and Besim Kabashi, eds. 2020. Proceedings of Globalex Workshop on Linked Lexicography –– at the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), Marseille, France, May 12, 2020. Paris, France: European Language Resources Association. [ACL net, pdf, bib]
The 15th Conference on Natural Language Processing (KONVENS), 2019
- Evert, Stefan, Andreas Blombach, Natalie Dykes, Paul Greiner, Philipp Heinrich, Besim Kabashi, and Thomas Proisl, eds. 2019. Proceedings of the 15th Conference on Natural Language Processing (KONVENS), October 9–11. Erlangen, Germany: German Society for Computational Linguistics & Language Technology; University of Erlangen-Nuremberg. [pdf, bib]
Translation Inference Across Dictionaries (TIAD), 2019
- Gracia, Jorge, Besim Kabashi, and Ilan Kernerman, eds. 2019. Translation Inference Across Dictionaries 2019 Shared Task –– at the 2nd Language, Data and Knowledge Conference (LDK 2019), Leipzig, Germany, May 20, 2019. [CEUR net/pdf, bib]
Journal articles
Proisl, Thomas, Natalie Dykes, Andreas Blombach, Philipp Heinrich, Besim Kabashi, and Stefan Evert. 2020. “Normalization and Lemmatization of German Computer-Mediated Communication.” Language Resources and Evaluation. [bib], In preparation.
Kabashi, Besim. 2018. “A Lexicon of Albanian for Natural Language Processing.” Lexicographica 34 (1): 239–48. https://doi.org/10.1515/lex-2018-340112. [bib]
Kabashi, Besim. 2017. “AlCo – një korpus tekstesh i gjuhës shqipe me njëqind milionë fjalë.” Seminari Ndërkombëtar për Gjuhën, Letërsinë dhe Kulturën Shqiptare, no. 36: 123–32. [pdf, bib]
Articles in conference proceedings
Chiarcos, Christian, Elena-Simona Apostol, Besim Kabashi, and Ciprian-Octavian Truică. 2022. “Modelling Frequency, Attestation, and Corpus-Based Information with OntoLex-FrAC.” In Proceedings of the 29th International Conference on Computational Linguistics, 4018–27. Gyeongju, Republic of Korea: International Committee on Computational Linguistics. https://aclanthology.org/2022.coling-1.353. [pdf, bib]
Chiarcos, Christian, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, and Ciprian-Octavian Truică. 2022. “Modelling Collocations in Ontolex-Frac.” In Proceedings of Globalex Workshop on Linked Lexicography Within the 13th Language Resources and Evaluation Conference, 10–18. Marseille, France: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2022/workshops/GWLL/pdf/2022.gwll-1.3.pdf. [pdf, bib]
Proisl, Thomas, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Andreas Blombach, and Stefan Evert. 2020. “EmpiriST Corpus 2.0: Adding Manual Normalization, Lemmatization and Semantic Tagging to a German Web and CMC Corpus.” In Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), edited by Nicoletta Calzolari, Sara Goggi, and Hélène Mazo, 6144–50. Marseille, France: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.754.pdf. [pdf, bib].
Blombach, Andreas, Natalie Dykes, Philipp Heinrich, Besim Kabashi, and Thomas Proisl. 2020. “A Corpus of German Reddit Exchanges (GeRedE).” In Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), edited by Nicoletta Calzolari, Sara Goggi, and Hélène Mazo, 6312–8. Marseille, France: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.774.pdf. [pdf, bib].
Kabashi, Besim, Thomas Proisl, and Michael Ruppert. 2020. “Creating high-quality bilingual word lists and terms based on same or similar topics in certain languages in Wikipedia.” In Asia Pacific Corpus Linguistics Conference (APCLC) 2020. Yonsei University, Seoul, South Korea. https://easychair.org/smart-program/APCLC2020/2020-02-13.html#talk:140430. [Abstract, bib]. [Conference cancelled due to the COVID-19 pandemic situation].
Kabashi, Besim. 2019. “Collecting Collocations for the Albanian Language.” In Proceedings of the Sixth Biennial Conference on Electronic Lexicography: Electronic Lexicography in the 21st Century (eLex 2019), Sintra, Portugal, October 1–3, 2019, edited by Zingano Kuhn Kosem I., 478–89. Brno, Czech Republic: Lexical Computing, s.r.o. https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_27.pdf. [pdf, bib]
Blombach, Andreas, Natalie Dykes, Stefan Evert, Philipp Heinrich, Besim Kabashi, and Thomas Proisl. 2019. “A New German Reddit Corpus (A Report on Work in Progress).” Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019). Erlangen, Germany. [pdf, bib]
Kabashi, Besim. 2018. “A Lexicon of Albanian for Natural Language Processing.” In Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, edited by Iztok Kosem, Jaka Čibej, and Vojko Gorjanc, 855–62. Ljubljana, Slovenia: Ljubljana University Press, Faculty of Arts. https://euralex.org/category/publications/euralex-2018/. [pdf, bib]
Kabashi, Besim, and Thomas Proisl. 2018. “Albanian Part-of-Speech Tagging: Gold Standard and Evaluation.” In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), edited by Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis, and Takenobu Tokunaga, 2593–9. Miyazaki, Japan: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2018/pdf/89.pdf. [pdf, bib]
- Kabashi, Besim, and Thomas Proisl. 2016. “A Proposal for a Part-of-Speech Tagset for the Albanian Language.” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), edited by Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, and Stelios Piperidis, 4305–10. Portorož, Slovenia: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2016/pdf/1066_Paper.pdf. [pdf, bib]
- Proisl, Thomas, and Besim Kabashi. 2010. “Using High-Quality Resources in NLP: The Valency Dictionary of English as a Resource for Left-Associative Grammars.” In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), edited by Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, and Daniel Tapias, 3878–81. Valletta, Malta: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2010/pdf/62_Paper.pdf. [pdf, bib]
- Handl, Johannes, Besim Kabashi, Thomas Proisl, and Carsten Weber. 2009. “JSLIM – Computational Morphology in the Framework of the SLIM Theory of Language.” In State of the Art in Computational Morphology. Workshop on Systems and Frameworks for Computational Morphology (SFCM 2009), edited by Cerstin Mahlow and Michael Piotrowski, 10–27. Berlin, Heidelberg, New York: Springer. https://doi.org/10.1007/978-3-642-04131-0_2. [pdf, bib]
- Kabashi, Besim. 2005. “Disa propozime për Modelimin e Informacionit në Leksikografinë Kompjuterike.” In Seminari Ndërkombëtar për Gjuhën, Letërsinë dhe Kulturën Shqiptare, XXXIV, Prishtinë, 79–184. 24. Universiteti i Prishtinës. [bib]
- Kabashi, Besim. 2004. “Analiza automatike e fjalëformave të gjuhës shqipe.” In Seminari Ndërkombëtar për Gjuhën, Letërsinë dhe Kulturën Shqiptare, XXXIII, Prishtinë, 129–35. 23. Universiteti i Prishtinës. [bib]
Articles in collections
- Kabashi, Besim. 2020. “Building diachronic corpora of the Albanian language.” In Altalbanische Schriftkultur – aus der Perspektive der historischen Lexikographie und der Philologie der Gegenwart, edited by Bardhyl Demiraj, 44:103–8. Albanische Forschungen. Wiesbaden, Germany: Harrassowitz Verlag. [pdf, bib].
- Kabashi, Besim. 2018. “Neologjizmat në shqipen e sotme – njohja dhe analiza automatike.” In Studimet albanistike në vendet ku flitet gjermanishtja / Albanische Studien in den deutschsprachigen Ländern, edited by Rexhep Ismajli, 601–7. Prishtinë, Kosovo: Akademia e Shkencave dhe e Arteve e Kosovës / Akademia e Shkencave e Shqipërisë. [bib]
- Kabashi, Besim. 2014. “Ndarja në rrokje – një pasurim i leksikografisë shqiptare.” In Ndihmesa shkencore të prof. Rami Memushajt, edited by Bardhosh Gaçe, 83–88. Vlorë, Albania: Universiteti i Vlorës. [bib]
- Kabashi, Besim. 2012. “Korpuse gjuhësore për shqipen.” In Shqipja dhe gjuhët e Ballkanit / Albanian and Balkan Languages, edited by Rexhep Ismajli, 627–34. Prishtinë, Kosovo: Akademia e Shkencave dhe e Arteve e Kosovës / Akademia e Shkencave e Shqipërisë. [pdf, bib]
- Kabashi, Besim. 2011. “Pasurimi dhe përmirësimi i standardit të gjuhës vështruar nga pikëpamja e përpunimit teknologjik të gjuhëve natyrore sot.” In Shqipja në etapën e sotme: politikat e pasurimit dhe të përmirësimit të standardit, edited by Ardian Marashi, 371–83. Botimet Albanologjike. Tiranë, Albania: Qendra e Studimeve Albanologjike. [pdf, bib]
Kabashi, Besim. 2010. “Einige Notizen über den Roman ’Rrathë’ von Martin Camaj.” In Wir sind die Deinen - Studien zur albanischen Sprache, Literatur und Kulturgeschichte, dem Gedenken an Martin Camaj (1925–1992) gewidmet, edited by Bardhyl Demiraj, 29:301–11. Albanische Forschungen. Wiesbaden, Germany: Harrassowitz Verlag. https://www.harrassowitz-verlag.de/titel_84.ahtml. [pdf, bib]
Kabashi, Besim. 2010. “Resurset e gjuhës shqipe – një diskutim rreth gjendjes së tyre të tashme.” In (Materialet nga) Seminari III Ndërkombëtar i Albanologjisë, edited by University of Tetovo, 3:97–102. Seminari Ndërkombëtar i Albanologjisë. Tetovo, North Macedonia: University of Tetovo. [pdf, bib]
- Kabashi, Besim. 2009. “Das Albanische Alphabet aus sprachtechnologischer Sicht.” In Der Kongress von Manastir. Herausforderung zwischen Tradition und Neuerung in der albanischen Schriftkultur, edited by Bardhyl Demiraj, 189–227. PHILOLOGIA - Sprachwissenschaftliche Forschungsergebnisse. Hamburg, Germany: Dr. Kovač. http://www.verlagdrkovac.de/3-8300-4705-3.htm. [pdf, bib]
- Kabashi, Besim. 2007. “Pronominal Clitics and Valency in Albanian. A Computational Linguistics Perspective and Modelling Within the LAG-Framework.” In Valency. Theoretical, Descriptive and Cognitive Issues, edited by Thomas Herbst and Katrin Götz-Votteler, 187:339–52. Trends in Linguistics. Studies and Monographs. Berlin, Germany / New York, USA: Mouton de Gruyter. https://doi.org/10.1515/9783110198775.4.339. [pdf, bib]
Shared tasks
- Gracia, Jorge, Besim Kabashi, and Ilan Kernerman. 2022. “TIAD 2022: The Fifth Translation Inference Across Dictionaries Shared Task.” In Proceedings of Globalex Workshop on Linked Lexicography Within the 13th Language Resources and Evaluation Conference, 19–25. Marseille, France: European Language Resources Association. http://www.lrec-conf.org/proceedings/lrec2022/workshops/GWLL/pdf/2022.gwll-1.4.pdf. [pdf, bib]
- Gracia, Jorge, Besim Kabashi, and Ilan Kernerman. 2021. “Results of the Translation Inference Across Dictionaries 2021 Shared Task.” In The 4th Language, Data and Knowledge Conference (LDK 2021) Workshops and Tutorials, edited by Sara Carvalho and Renato Rocha Souza, 208–20. Zaragoza, Spain, September 1, 2021. http://ceur-ws.org/Vol-3064/tiad4.pdf. [pdf, bib]
- Kernerman, Ilan, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, and Besim Kabashi. 2020. “Introduction to the Proceedings of Globalex 2020 Workshop on Linked Lexicography.” In Proceedings of Globalex Workshop on Linked Lexicography, edited by Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, and Besim Kabashi, iii–xiii. Paris, France: European Language Resources Association. https://www.aclweb.org/anthology/2020.globalex-1.0.pdf. [pdf, bib]
Proisl, Thomas, Peter Uhrig, Andreas Blombach, Natalie Dykes, Philipp Heinrich, Besim Kabashi, and Sefora Mammarella. 2019. In Proceedings of the First International Workshop on NLP Solutions for Under Resourced Languages (NSURL 2019). Trento, Italy. https://www.aclweb.org/anthology/2019.nsurl-1.11.pdf. [pdf, bib].
Gracia, Jorge, Besim Kabashi, and Ilan Kernerman. 2019. “Proceedings of TIAD-2019 Shared Task – Translation Inference Across Dictionaries.” In Translation Inference Across Dictionaries 2019 Shared Task, edited by Jorge Gracia, Besim Kabashi, and Ilan Kernerman. Co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019), Leipzig, Germany, May 20, 2019. http://ceur-ws.org/Vol-2493/xpreface.pdf. [pdf, bib]
Gracia, Jorge, Besim Kabashi, Ilan Kernerman, Marta Lanau-Coronas, and Dorielle Lonke. 2019. “Results of the Translation Inference Across Dictionaries 2019 Shared Task.” In Translation Inference Across Dictionaries 2019 Shared Task, edited by Jorge Gracia, Besim Kabashi, and Ilan Kernerman. Co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019), Leipzig, Germany, May 20, 2019. http://ceur-ws.org/Vol-2493/summary.pdf. [pdf, bib]
- Proisl, Thomas, Philipp Heinrich, Besim Kabashi, and Stefan Evert. 2018. “EmotiKLUE at IEST 2018: Topic-Informed Classification of Implicit Emotions.” In Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, edited by Alexandra Balahur, Saif M. Mohammad, Veronique Hoste, and Roman Klinger, 235–42. Brussels, Belgium: Association for Computational Linguistics. http://aclweb.org/anthology/W18-6234. [pdf, bib]
- Proisl, Thomas, Philipp Heinrich, Stefan Evert, and Besim Kabashi. 2017. “Translation Inference Across Dictionaries via a Combination of Graph-Based Methods and Co-Occurrence Statistics.” In Proceedings of the LDK 2017 Workshops: 1st Workshop on the OntoLex Model (OntoLex-2017), Shared Task on Translation Inference Across Dictionaries & Challenges for Wordnets, edited by John P. McCrae, Francis Bond, Paul Buitelaar, Philipp Cimiano, Thierry Declerck, Jorge Gracia, Ilan Kernerman, Elena Montiel-Ponsoda, Noam Ordan, and Maciej Piasecki, 94–102. Galway, Ireland: CEUR-WS.org. http://ceur-ws.org/Vol-1899/TIAD17_paper_1.pdf. [pdf, bib]
Evert, Stefan, Thomas Proisl, Paul Greiner, and Besim Kabashi. 2014. “SentiKLUE: Updating a Polarity Classifier in 48 Hours.” In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), edited by Preslav Nakov and Torsten Zesch, 551–55. Dublin, Ireland: Association for Computational Linguistics. http://www.aclweb.org/anthology/S14-2096. [pdf, bib]
Proisl, Thomas, Stefan Evert, Paul Greiner, and Besim Kabashi. 2014. “SemantiKLUE: Robust Semantic Similarity at Multiple Levels Using Maximum Weight Matching.” In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), edited by Preslav Nakov and Torsten Zesch, 532–40. Dublin, Ireland: Association for Computational Linguistics. http://www.aclweb.org/anthology/S14-2093. [pdf, bib]
Greiner, Paul, Thomas Proisl, Stefan Evert, and Besim Kabashi. 2013. “KLUE-CORE: A Regression Model of Semantic Textual Similarity.” In Proceedings of the Second Joint Conference on Lexical and Computational Semantics (*SEM 2013), edited by Mona T. Diab, Timothy Baldwin, and Marco Baroni, 181–86. Atlanta, GA, USA: Association for Computational Linguistics. http://aclweb.org/anthology/S13-1026. [pdf, bib]
Proisl, Thomas, Paul Greiner, Stefan Evert, and Besim Kabashi. 2013. “KLUE: Simple and Robust Methods for Polarity Classification.” In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), edited by Mona T. Diab, Timothy Baldwin, and Marco Baroni, 395–401. Atlanta, GA, USA: Association for Computational Linguistics. http://aclweb.org/anthology/S13-2065. [pdf, bib]
Guidelines
- Proisl, Thomas, Natalie Dykes, Philipp Heinrich, Besim Kabashi, and Stefan Evert. 2019. “Lemmatisierungsrichtlinien für das EmpiriST-Corpus.” Presentation at "Annotation of Non-Standard Corpora". Bamberg, Germany; LREC 2020, Marseille, France; GitHub. https://github.com/fau-klue/empirist-corpus/blob/master/doc/Lemmatisierungsrichtlinien.pdf. [pdf, bib]
Talks and presentations
Recent Talks and Presentations (selected)
Kabashi, Besim. 2023. The lexis of the Albanian language used in social media: An investigation. The 16th International Conference of the Asian Association for Lexicography (ASIALEX 2023), June 24, 2023. Yonsei University, Seoul, Republic of Korea.
Kabashi, Besim. 2022. Lexicographers, corpora and computers. Texas Symposium on Lexicography, March 29, 2022. The University of Texas at Austin, USA.
Kabashi, Besim. 2022. Teaching in EMJMD-EMLex: A6: Computational Lexicography. Informational Session on the European Master in Lexicography program, March 28, 2022. The University of Texas at Austin, USA.
Kabashi, Besim. 2021. “Words, words, … : Usage and Constructions”. International Congress of Albanian Studies, October 26, 2021, Academy of Sciences of the Republic of Albania, Tirana, Albania.
Kabashi, Besim. 2021. “A Corpus Network of Albanian”. Interdisziplinäre und transnationale Aspekte und Perspektiven der Albanologie: Konferenz in Gedenken an Professor Wilfried Fiedler, October 1, 2021. University of Jena, Germany.
Proisl, Thomas, Natalie Dykes, Philipp Heinrich, Besim Kabashi, and Stefan Evert. 2020. “EmpiriST corpus 2.0: Adding normalization, lemmatization and semantic tags to a German web and social media corpus.” In 42. Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft (DGfS) 2020. Universität Hamburg, Germany. [Abstract, pdf, Poster, pdf, bib].
Blombach, Andreas, Natalie Dykes, Stefan Evert, Philipp Heinrich, Besim Kabashi, and Thomas Proisl. 2019. “A New German Reddit Corpus (A Report on Work in Progress).” Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019). Erlangen, Germany. [bib]
- Proisl, Thomas, Natalie Dykes, Andreas Blombach, Philipp Heinrich, and Besim Kabashi. 2019. “NLP for German CMC Texts: Tokenization, POS Tagging, and a New Gold Standard for Lemmatization.” Presentation at Annotation of Non-Standard Corpora. Bamberg. [bib].