Large Language Models from Scratch

This repository contains implementations of various large language models (LLMs) from scratch, including:

  1. Vanilla Transformer: An implementation of the Transformer architecture introduced in the paper "Attention Is All You Need"; its core building block, scaled dot-product attention, is sketched below this list.
  2. LLaMA-2: A re-implementation of Meta's LLaMA-2 model.
  3. BERT: A from-scratch implementation of Google's Bidirectional Encoder Representations from Transformers (BERT) model.
  4. Mistral: An implementation of the Mistral language model.
  5. Mixture of Experts: An architecture that routes each input to a small subset of expert sub-networks and combines their outputs, increasing model capacity without a matching increase in compute (see the routing sketch below this list).
  6. Gemma: A re-implementation of Google's Gemma model.

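The snippet below is a minimal, self-contained sketch of scaled dot-product attention in PyTorch, the operation at the heart of every model listed above. It is illustrative only and is not taken from this repository's code; the function name and tensor shapes are assumptions made for the example.

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # positions where the boolean mask is False (padding or future tokens) are excluded
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)   # attention weights sum to 1 over the key axis
    return weights @ v

q = k = v = torch.randn(1, 2, 4, 8)           # batch 1, 2 heads, 4 tokens, head_dim 8
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 2, 4, 8])

The Mixture of Experts entry can be illustrated the same way. The sketch below shows top-k routing, in which a learned gate scores the experts for each token, keeps the k best, and mixes their outputs. The class name, layer sizes, and default hyperparameters are assumptions for the example, not the repository's implementation.

import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, dim, num_experts=4, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)        # router: one score per expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                              # x: (tokens, dim)
        weights = torch.softmax(self.gate(x), dim=-1)  # (tokens, num_experts)
        top_w, top_idx = weights.topk(self.k, dim=-1)  # keep the k best experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalise the kept weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slot = (top_idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if rows.numel():
                out[rows] += top_w[rows, slot].unsqueeze(-1) * expert(x[rows])
        return out

moe = TopKMoE(dim=16)
print(moe(torch.randn(10, 16)).shape)          # torch.Size([10, 16])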
Getting Started

To get started, clone the repository (https://github.com/Vishesht27/Language-Models_from_Scratch) and browse the implementation of each model listed above.
