Natural Language Processing with Transformers
This is a new master-level course that is being offered for the first time in the winter semester 2023/24. Parts of this course originate from the course Text Analytics (ITA), which was offered in the winter semester 2020/21, primarily as a master-level course, and is no longer offered. ITA has been split into two courses: the bachelor-level course Data Science for Text Analytics (IDSTA) and the follow-up master-level course described here. Students who have taken the Text Analytics (ITA) course in the winter semester 2020/21 are therefore not eligible to take this new course.
This course is programming-intensive (well, it is about NLP, Data Science, and AI)! In addition to a final group project, there will be assignments that focus on conceptual (theoretical) aspects of NLP and transformer models, but also on mapping these concepts into programs using real-world text corpora. For the projects, we will exclusively use Python and frameworks such as Huggingface, LangChain, Google Colab, and OpenAI. We also assume that students are already familiar with frameworks such as spaCy, gensim, and OpenSearch (an IR backend component for text data storage and retrieval), as well as typical project tools such as GitHub and Docker. If you are uncomfortable with Python and with programming (in a team) in general, this class is very likely not the best fit for you.
In the following we give some information about this new course as it will appear in the module handbook.
Credit Points: 6 LP / CP (2h lecture + 2h exercise session)
Language of instruction: English
Workload: 180 h; of which 60 h lectures and 120 h self-study and working on assignments/projects (optionally in groups)
Time and Location:
- Lectures: TBA
- Exercise Sessions: TBA
First Lecture: Monday, October 16, 2023
Applicable to courses of study: Master Data and Computer Science
Course Objectives: Students
- fully understand the principles and methods underlying word embedding approaches
- are familiar with traditional sequence-to-sequence machine learning methods
- can describe the key concepts and techniques underlying the attention mechanism and different transformer architectures
- understand training and fine-tuning approaches to improve the performance of different transformer architectures for different downstream NLP tasks
- know the key methods and architectural components for building QA and text summarization pipelines
- can build and deploy QA and text summarization pipelines using common software frameworks
- know key metrics in evaluating transformer architectures for different applications
- can implement diverse transformer-based NLP applications using common Python frameworks and libraries
- can deploy transformer-based NLP applications through Web interfaces
Course Content:
- Word embeddings (review of simple neural network architectures and concepts)
- Sequence-to-sequence models (Recurrent Neural Networks, LSTM, GRU)
- Attention mechanism
- Transformer components (encoder, decoder) and common transformer architectures (BERT, GPT, T5)
- Training and fine-tuning transformers, including zero- and few-shot learning
- Text summarization approaches
- Question answering and building a QA pipeline
- Transformer architectures for conversational AI
- Programming and model frameworks such as Huggingface, LangChain, OpenAI, and (cloud-based) vector databases
Suggested Prerequisites: Recommended courses: Data Science for Text Analytics (IDSTA), Foundations of Machine Learning (IML)
Recommended background: solid knowledge of basic calculus, statistics, and linear algebra; good Python programming skills; familiarity with frameworks such as Huggingface, Google Colab, and cloud-based services, in particular vector databases
Assessments: Assignments (40%) and Programming Project (60%). There will be about 4-5 assignments focusing on the material learned in class at a conceptual and formal level, and a group project in which 3-4 students develop a prototypical transformer-based application, including design and evaluation. The written project documentation as well as the code need to be submitted at the end of classes, clearly indicating which student is responsible for which part of the project. Both the assignments and the project must be at least satisfactory (4,0) in order to pass the class.
Suggested Literature: The following textbooks are useful but not required. In addition, several research papers covering the different topics discussed in class will be provided to students via the Moodle platform.
- Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Natural Language Processing with Transformers, 2022 (revised edition)
- Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft)
Instructors:
- Prof. Dr. Michael Gertz