Universität Wien

136041 SE Open Source Language Models (2024W)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 25 participants
Language: English

Lecturers

Classes

  • Tuesday 01.10. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 08.10. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 15.10. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 22.10. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 29.10. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 05.11. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 12.11. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 19.11. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 26.11. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 03.12. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 10.12. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 17.12. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 07.01. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 14.01. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3
  • Tuesday 28.01. 09:45 - 11:15 Hörsaal 2 Hauptgebäude, Tiefparterre Stiege 5 Hof 3

Information

Aims, contents and method of the course

In this seminar, participants will read, present and discuss recent papers on the capabilities and limitations of language-based AI models.

Possible topics to be covered in the seminar:

A. Foundational Work on Transformer Language Models
B. Evaluation and Analysis of LLMs
C. Open Source LLMs, Training, and Corpora
D. Legal Aspects, Copyright, and Transparency

Assessment and permitted materials

Participants will have to present one topic from the list in the seminar. The presentation should be roughly 25 minutes long (hard limits: min. 20 minutes, max. 30 minutes) and is followed by a Q&A session and discussion. Participants will also have to submit a written report (deadline and exact requirements TBD) describing the main contents of the presented paper(s) (see the reading list below) and putting them in a wider context.

Minimum requirements and assessment criteria

Your presentation will account for 45% of the grade, participation in discussions for 10%, and the written report for 45%.
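To make the weighting concrete, an illustrative calculation (the 100-point scale and the individual scores are assumptions for illustration, not taken from the course description): a presentation scored at 80/100, discussion participation at 100/100, and a report at 90/100 would combine to 0.45 × 80 + 0.10 × 100 + 0.45 × 90 = 86.5 out of 100.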

Examination topics

The presentation, participation in discussions, and the written report, as described under the assessment criteria above.

Reading list

---
A. Foundational Work on Transformer Language Models

Vaswani, Ashish, et al. "Attention is all you need." 2017

Brown, Tom B., et al. "Language models are few-shot learners." 2020

Holtzman, Ari, et al. "The curious case of neural text degeneration." 2019

Wei, Jason, et al. "Finetuned language models are zero-shot learners." 2021

Ouyang, Long, et al. "Training language models to follow instructions with human feedback." 2022

---
B. Evaluation and Analysis of LLMs

Hendrycks, Dan, et al. "Measuring massive multitask language understanding." 2020
and
Wang, Yubo, et al. "MMLU-Pro: A more robust and challenging multi-task language understanding benchmark." 2024

Zheng, Lianmin, et al. "Judging LLM-as-a-judge with MT-Bench and Chatbot Arena." 2023

Biderman, Stella, et al. "Pythia: A suite for analyzing large language models across training and scaling." 2023

Schaeffer, Rylan, Brando Miranda, and Sanmi Koyejo. "Are emergent abilities of large language models a mirage?" 2024

---
C. Open Source LLMs, Training, and Corpora

Zhang, Susan, et al. "OPT: Open pre-trained transformer language models." 2022

Le Scao, Teven, et al. "BLOOM: A 176B-parameter open-access multilingual language model." 2023

Groeneveld, Dirk, et al. "OLMo: Accelerating the science of language models." 2024

Soldaini, Luca, et al. "Dolma: An open corpus of three trillion tokens for language model pretraining research." 2024

Wang, Yizhong, et al. "How far can camels go? Exploring the state of instruction tuning on open resources." 2023
and
Ivison, Hamish, et al. "Camels in a changing climate: Enhancing LM adaptation with Tulu 2." 2023

Wang, Yizhong, et al. "Self-Instruct: Aligning language models with self-generated instructions." 2022

Peng, Baolin, et al. "Instruction tuning with GPT-4." 2023

Üstün, Ahmet, et al. "Aya model: An instruction finetuned open-access multilingual language model." 2024

Singh, Shivalika, et al. "Aya dataset: An open-access collection for multilingual instruction tuning." 2024

Rafailov, Rafael, et al. "Direct preference optimization: Your language model is secretly a reward model." 2024

Frantar, Elias, et al. "GPTQ: Accurate post-training quantization for generative pre-trained transformers." 2022

Jiang, Albert Q., et al. "Mixtral of experts." 2024

Shen, Yongliang, et al. "HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face." 2024

Li, Raymond, et al. "StarCoder: May the source be with you!" 2023

---
D. Legal Aspects, Copyright, and Transparency

Lemley, Mark A., and Bryan Casey. "Fair learning." 2020

NYT vs. OpenAI (The New York Times Company v. Microsoft Corporation et al.)
Complaint by The New York Times: https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf
Response by OpenAI (blog post): https://openai.com/index/openai-and-journalism/
Response by OpenAI (legal response): https://www.courtlistener.com/docket/68117049/52/the-new-york-times-company-v-microsoft-corporation/

Strowel, Alain. Study on copyright and new technologies: copyright data management and artificial intelligence. 2022

Jernite, Yacine, et al. "Data governance in the age of large-scale data-driven language technology." 2022

Bommasani, Rishi, et al. "The foundation model transparency index." 2023

Association in the course directory

S-DH Cluster I: Language and Literature

Last modified: Mon 23.09.2024 14:46