Our Speakers
Dr. Mona DIAB
Title:
Cross lingual solutions for low resource scenarios. The case of Arabic dialects
Abstract:
With the advent of social media, we are witnessing an exponential growth in unstructured data online. A huge amount of this data is in fact in languages other than English. Some of these languages have rich automated resources and processing tools, but the majority of the languages in the world are considered low resource despite presence online. In this talk, I will address the problem of processing low resource languages. I will present some of our solutions for language identification, information extraction, machine translation, and resource creation exploiting rich languages via cross linguistic modeling. Such techniques can also be cast for cross genre and cross domain challenges. I will use Arabic as a use case.
Dr Mona Diab
is one of the best-known people in the community working on Arabic NLP. She conducts research in Statistical Natural Language Processing. She is an Associate Professor in the Department of Computer Science, George Washington University (GW). She is also the founder and Director of the GW NLP lab CARE4Lang. Before joining GW, she was a Research Scientist (Principal Investigator) at the Center for Computational Learning Systems (CCLS), Columbia University in New York. Her research interests span several areas in computational linguistics/natural language processing: computational lexical semantics, multilingual processing, social media processing, information extraction & text analytics, machine translation, and computational socio-pragmatics.
She has a special interest in low resource language processing with a focus on Arabic dialects. She is currently the elected President of the Association for Computational Linguistics (ACL) Special Interest Group (SIG) for Semitic Language Processing (SIG-Semitic).
Dr. Albert GATT
Title:
Maltese at the Digital Crossroads: Language Technology and Resources for the Maltese Language
Abstract:
Many European languages are “under-represented” when it comes to digital resources and tools. Maltese is no exception. In this regard, a further challenge for Maltese is that it is a “small” language (that is, it is spoken by a relatively small population) and co-exists with English in a bilingual scenario.
This talk will seek to give a characterisation of Maltese in relation to its historical antecedents in Arabic, and will also discuss the impact of contact with Italian and English. Following this, I will give an overview of the present situation of Maltese when it comes to NLP tools, with particular reference to past and ongoing efforts in the areas of analysis (morphological labelling and parsing) and speech technology.
Dr Albert Gatt
is a Senior Lecturer and Director of the Institute of Linguistics and Language Technology, and a member of the RiVaL (Research in Vision and Language) group at the University of Malta. For a while he was also a Research Associate at the Tilburg center for Communication and Cognition (TiCC), Tilburg University, The Netherlands, and held a position as Research Fellow in the Natural Language Generation Group of the Department of Computing Science at the University of Aberdeen.
His research mostly focusses on the generation of language by machines (a.k.a. Natural Language Generation) and the production of language by humans. He has worked with both experimental psycholinguistic and computational methods, including machine learning and neural approaches. Below are some of the topics he explores:
- Data-to-text generation, that is, the automatic summarisation of non-linguistic information in forms that people can understand
- The vision-language interface, especially image captioning and grounded inference
- The production and generation of referring expressions, especially the question of what to include in object descriptions in visual scenes
Another long-standing interest is in digital language resources for under-resourced languages, including Maltese, and in morphology. He is particularly intrigued by languages such as Maltese, which have hybrid systems of word-formation processes, making them very challenging from both a computational and a psycholinguistic perspective.
