Building Open Source Language Models
- Track: AI and Machine Learning devroom
- Room: UB2.252A (Lameere)
- Day: Sunday
- Start: 09:45
- End: 10:00
LINAGORA, a leader in the OpenLLM France community, has made it a priority to pull back the curtain on the process of building Large Language Models (LLMs). While most LLMs in use today, even the “open” ones, disclose few if any details about their training, and especially about the data on which they were trained, we have decided to share it all. In this talk, we discuss why using an open model trained on traceable data matters for business and research alike, and we examine some of the difficulties involved in pursuing an open strategy for LLMs. We share our experience with data collection and LLM training, including the Claire family of language models.
Links to the Claire models (a brief loading sketch follows the list):
- Main model: huggingface.co/OpenLLM-France/Claire-7B-0.1 (CC BY-NC-SA 4.0)
- Main model in GGUF formats: huggingface.co/TheBloke/Claire-7B-0.1-GGUF
- Variant model under the Apache 2.0 license: huggingface.co/OpenLLM-France/Claire-7B-Apache-0.1
- Demo (simulated chat): huggingface.co/spaces/OpenLLM-France/Claire-Chat-0.1
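As a quick illustration (a minimal sketch, not material from the talk), the main model can be loaded with the Hugging Face transformers library. The model ID comes from the links above; the dtype, device placement, prompt format, and generation settings are illustrative assumptions, so check the model card for recommended usage.

```python
# Minimal sketch: loading Claire-7B-0.1 with Hugging Face transformers.
# Generation settings below are illustrative assumptions, not the
# authors' recommendations; see the model card for details.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-France/Claire-7B-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed; use float16/float32 as your hardware allows
    device_map="auto",           # requires the accelerate package
)

# Claire is trained on French dialogue transcripts, so a dialogue-style
# prompt is a natural fit (the speaker-label format here is an assumption).
prompt = "[Intervenant 1:] Bonjour, comment allez-vous ?\n[Intervenant 2:]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```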
Dataset & code (a loading sketch follows the list):
- Full dataset: huggingface.co/datasets/OpenLLM-France/Claire-Dialogue-French-0.1
- Paper: “The Claire French Dialogue Dataset”, arxiv.org/abs/2311.16840; also at https://huggingface.co/papers/2311.16840 (with links to related assets)
- Survey of source datasets: https://github.com/OpenLLM-France/Claire-datasets
- Code for training: https://github.com/OpenLLM-France/Lit-Claire
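Similarly, the full dialogue dataset can be pulled with the Hugging Face datasets library. This is a sketch under the assumption that the default configuration loads directly; the split name is also an assumption, so consult the dataset card.

```python
# Minimal sketch: loading the Claire French Dialogue Dataset.
# The dataset ID comes from the link above.
from datasets import load_dataset

ds = load_dataset("OpenLLM-France/Claire-Dialogue-French-0.1")
print(ds)  # shows available splits and row counts
# first_record = ds["train"][0]  # "train" split name is an assumption; check the dataset card
```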
Speakers
Julie Hunter