CLARIN and Libraries 2023: Large Language Models and Libraries

CLARIN, National Library of Norway

Official event website:

The workshop builds upon the first CLARIN and Libraries workshop, held in The Hague in May 2022.

This year's workshop will investigate further areas of collaboration between CLARIN-related initiatives and libraries, with a special emphasis on building (large) language models in libraries and in cooperation with them. The workshop will bring together, for the second time, a group of people associated with both CLARIN (or other research infrastructures) and libraries. Whereas the first CLARIN and Libraries workshop was particularly concerned with digital content delivery for researchers, the main theme of the second workshop will be large language models and library collections, for example the technical challenges of building such models and the legal implications of model training and use.

The host, the National Library of Norway (NLN), has since 2005 digitised its entire text collections, which at present amount to a corpus of 160 billion words of Norwegian. NLN has built large language models for text (BERT, GPT-2, T5) and speech (wav2vec, Whisper) on these collections. There will be keynotes from the National Libraries of Norway and Germany on the technical aspects of building such models in a library setting, as well as a keynote from the Swedish National Library on the legal aspects of building large language models.

Participation in the workshop is by invitation. If you are interested in attending, please contact your national coordinator.

The venue (National Library of Norway, Henrik Ibsens gate 110, Oslo) is located very close to the Nationaltheatret train station. Directions to the venue can be found here.


Tuesday 5 December 2023

12:00 - 13:15 Lunch (Cafeteria, National Library of Norway)
13:15 - 13:30 Welcome
13:30 - 15:00 Introduction to CLARIN and Libraries, wrap-up from last year’s workshop (15 mins)
    Tour de table: introduction and points for discussion (45 mins)
    Library collections as data (Sally Chambers)
15:00 - 15:30 Break
15:30 - 17:00 Large language models at the National Library of Norway (Javier De La Rosa)
    Large language models at the German National Library (Peter Leinen)
    Discussion: technical aspects (chair: Andreas Witt)
17:00 - 17:30 Sensitive Data in HPC: How secure can it be? Is secure data processing in shared computing environments a dream? (Martin Matthiesen)
19:00 Evening social dinner (Avalon, Munkedamsveien 31, Oslo)

Wednesday 6 December 2023

9:30 - 10:30 Lightning talks: participants who have registered for a lightning talk (see separate invitation by e-mail) can introduce their own projects and resources.
10:30 - 11:00 Break
11:00 - 12:00 Legal aspects of large language models in libraries (Jerker Rydén)
    Discussion: legal aspects (chair: Andreas Witt)
12:00 - 13:00 Lunch (Cafeteria, National Library of Norway)

National Library of Norway
Henrik Ibsens gate 110
0255 Oslo
