Advanced NLP Workshop: word-sense disambiguation with RoBERTa and text summarization with BART (Machine Learning Milan)

denocris/NLP-Workshop-MLMilan

Natural Language Processing Workshop - Machine Learning Milan

This workshop was supported by Machine Learning Milan, IAML and AINDO. It presents two NLP use cases built with the most recent algorithms and libraries available as of May 2020.

Authors: Andrea Gatto (LinkedIn) and Cristiano De Nobili (LinkedIn, Twitter)

The workshop was recorded (video link) and is organized in two sessions:

In the first session, after a brief introduction to the Transformers library by HuggingFace and to the NLP pipeline, Cristiano talks about word-sense disambiguation and embedding visualization in the age of contextual embeddings (such as BERT). He also touches on some geometrical aspects hidden in transformer-based architectures. We focus on the Italian language and take advantage of two pre-trained checkpoints: GilBERTo and UmBERTo. Here's the link to the slides for this session.
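As a rough illustration of the idea behind word-sense disambiguation with contextual embeddings, the sketch below assigns a sense to a target word by comparing its vector with a few sense-labelled example vectors using cosine similarity. It is a minimal sketch and not the workshop's code: the function names (`cosine`, `disambiguate`), the sense labels and the three-dimensional toy vectors are all hypothetical. In practice the vectors would be hidden states taken from a model such as GilBERTo or UmBERTo.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disambiguate(target_vec, labelled_examples):
    """Return the sense label of the labelled example closest to target_vec.

    labelled_examples is a list of (sense_label, vector) pairs.
    """
    best_label, _ = max(labelled_examples, key=lambda ex: cosine(target_vec, ex[1]))
    return best_label


# Made-up vectors standing in for contextual embeddings of the Italian word
# "pesca", which can mean "peach" (fruit) or "fishing" (activity).
examples = [
    ("fruit", [0.9, 0.1, 0.0]),
    ("fruit", [0.8, 0.2, 0.1]),
    ("activity", [0.1, 0.9, 0.2]),
    ("activity", [0.0, 0.8, 0.3]),
]

print(disambiguate([0.7, 0.3, 0.0], examples))
```

Because a contextual model produces a different vector for each occurrence of a word, occurrences with the same sense tend to lie close together, which is what this nearest-neighbour comparison relies on.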

During the second session, Andrea introduces automatic summarization and focuses on two different methods. In the first part, we build our own version of TextRank to extract the most important sentences from a document. In the second part, we introduce encoder-decoder Transformer architectures and generate summaries using BART, a sequence-to-sequence denoising autoencoder that belongs to the new wave of Transformer models for Natural Language Generation. These are the slides for the second session.
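The extractive approach can be sketched in a few lines of Python. This is a minimal sketch in the spirit of TextRank, not the version built in the workshop: it assumes a naive regex sentence splitter, a word-overlap similarity normalised by the logarithm of the sentence lengths, and a fixed number of PageRank-style iterations. All function names and parameters (`split_sentences`, `similarity`, `textrank`, `summarize`, `damping`, `iterations`, `k`) are illustrative.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Shared distinct words, normalised by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


text = (
    "The cat sat on the mat. The cat chased the mouse on the mat. "
    "The mouse hid from the cat. Stock prices fell sharply yesterday."
)
print(summarize(text, k=2))
```

A sentence that shares no words with the rest of the document receives only the baseline score `1 - damping`, so it is the last candidate for the summary.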

Cristiano holds a Ph.D. in Theoretical Physics (SISSA) and has been actively working in Deep Learning for four years. He is currently part of the Bixby project, Samsung's voice assistant. He is also a TEDx speaker (here is his talk about AI, humans and their future). Here are his contacts!

Andrea obtained his Ph.D. in Theoretical Astrophysics at the Max Planck Institute for Astrophysics in Munich. He has worked as a software developer and as a data scientist in different business areas. Currently, he builds Deep Learning NLP models for Bixby, Samsung's virtual assistant.

If you have any doubts or questions, feel free to contact us!
