Viva la Revolución of Open Source Large Language Models: Unleashing the Dark Horse in AI Innovation

Industrial Problems Seminar

Abstract

Contrary to popular belief, and relative to what had been expected, 2023 did not witness significant advances in benchmark results for large language models (LLMs), at least among the models provided by tech giants like Microsoft, OpenAI, and Anthropic. Open-source LLMs, while not covered in "the newspapers of the day," quietly emerged as the dark horse, improving their benchmark results by up to 30%. This surge underscores a revolutionary shift in computational linguistics and AI: open-source models now challenge proprietary ones for enterprise adoption because of the data governance that enterprises require, an arena very familiar to Midwest IT and software professionals in industries such as healthcare. This presentation will explore open-source LLMs, showing common problem patterns encountered by companies attempting to adopt them and ways to address those problems. It will also touch on quantized training and on the tooling and infrastructure that support the development process. Since the talk is directed at the institute for mathematics, it will also attempt to formalize the goal of LLM quantization as a series of mathematical statements.

This presentation is very abstract. More detail, including the mathematical formulation, will be posted on the presenter's blog at patdel.com.

A draft writeup of the mathematical formulation portion of the presentation is available here: https://github.com/pwdel/openllm-framework/tree/main.
