
🧠 Building LLMs from Scratch – A 30-Day Journey

This repository guides you through the process of building a GPT-style Large Language Model (LLM) from scratch using PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.


📘 Reference Book

Build a Large Language Model (From Scratch) by Sebastian Raschka (Manning Publications).

πŸ—“οΈ Weekly Curriculum Overview

🔹 Week 1: Foundations of Language Models

  • Set up the environment and tools.
  • Learn about tokenization, embeddings, and the idea of a "language model" (a minimal sketch follows this list).
  • Encode input/output sequences and build basic forward models.
  • Understand unidirectional processing and causal language modeling.
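To make the tokenization and embedding ideas concrete, here is a minimal character-level sketch in PyTorch. The vocabulary and the 16-dimensional embedding size are illustrative choices for this example, not the repository's actual settings.

# Character-level tokenization and an embedding lookup (illustrative sketch;
# the course notebooks may use a different vocabulary or tokenizer).
import torch
import torch.nn as nn

text = "hello world"
vocab = sorted(set(text))                       # character vocabulary
stoi = {ch: i for i, ch in enumerate(vocab)}    # string -> integer id
itos = {i: ch for ch, i in stoi.items()}        # integer id -> string

def encode(s):
    return [stoi[ch] for ch in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = torch.tensor(encode("hello"))             # token ids, shape (5,)
emb = nn.Embedding(len(vocab), 16)              # 16-dimensional embedding table
vectors = emb(ids)                              # shape (5, 16)
print(decode(ids.tolist()), vectors.shape)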

🔹 Week 2: Building the Transformer Decoder

  • Explore Transformer components: attention, multi-head attention, and positional encoding (see the attention sketch after this list).
  • Implement residual connections, normalization, and feedforward layers.
  • Build a GPT-style decoder-only transformer architecture.
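The heart of the decoder is causal self-attention: each position may only attend to earlier positions. Below is a minimal single-head sketch; the repository's model presumably splits this into multiple heads and adds output projections and dropout.

# A single-head causal self-attention layer (a sketch of the core idea only).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model, block_size):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        # lower-triangular mask: position t may only attend to positions <= t
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask)

    def forward(self, x):                       # x: (batch, seq_len, d_model)
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = q @ k.transpose(-2, -1) / math.sqrt(C)    # (B, T, T) scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v                          # weighted sum of value vectors

x = torch.randn(2, 8, 32)
print(CausalSelfAttention(32, 16)(x).shape)     # torch.Size([2, 8, 32])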

🔹 Week 3: Training and Dataset Handling

  • Load and preprocess datasets like TinyShakespeare.
  • Implement batch creation, context windows, and training routines (see the sketch after this list).
  • Use cross-entropy loss, optimizers, and learning rate schedulers.
  • Monitor perplexity and improve generalization.
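Here is a minimal sketch of how context-window batching, cross-entropy loss, and perplexity fit together in one training step. The get_batch helper and the toy one-layer "model" are hypothetical stand-ins for the repository's actual data pipeline and architecture; note that perplexity is simply exp(cross-entropy loss).

# Context-window batching and one training step (hypothetical get_batch helper).
import torch
import torch.nn.functional as F

data = torch.randint(0, 65, (10_000,))          # stand-in for tokenized TinyShakespeare
block_size, batch_size = 32, 8

def get_batch():
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])          # inputs
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])  # targets, shifted by one
    return x, y

model = torch.nn.Sequential(torch.nn.Embedding(65, 65))  # toy "model": logits from a lookup
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

x, y = get_batch()
logits = model(x)                               # (batch, seq_len, vocab)
loss = F.cross_entropy(logits.view(-1, 65), y.view(-1))
loss.backward(); opt.step(); opt.zero_grad()
print("perplexity:", loss.exp().item())         # perplexity = exp(cross-entropy)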

🔹 Week 4: Text Generation and Deployment

  • Generate text using greedy, top-k, top-p, and temperature sampling (see the sampling sketch after this list).
  • Evaluate generated samples and tune the sampling parameters.
  • Export and convert the model for Hugging Face compatibility.
  • Deploy via the Hugging Face Hub and a Gradio Space.
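Temperature and top-k sampling can be sketched in a few lines: greedy decoding is the top_k=1 special case, and top-p works similarly on sorted cumulative probabilities. This helper is illustrative, not the notebooks' exact generate() implementation.

# Sampling one next-token id from a logits vector with temperature and top-k.
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0, top_k=None):
    logits = logits / temperature               # <1 sharpens, >1 flattens the distribution
    if top_k is not None:
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[-1]] = float("-inf")  # drop everything outside the top k
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)  # draw one token id

logits = torch.randn(65)                        # fake logits over a 65-token vocabulary
print(sample_next(logits, temperature=0.8, top_k=10).item())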

πŸ› οΈ Getting Started

Prerequisites

  • Python 3.8+
  • PyTorch
  • NumPy
  • Matplotlib
  • JupyterLab or Notebooks
  • Hugging Face libraries: transformers, datasets, huggingface_hub
  • gradio for deployment

Installation

git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt

πŸ“ Project Structure

Building-LLMs-from-scratch/
β”œβ”€β”€ notebooks/            # Weekly learning notebooks
β”œβ”€β”€ models/               # Model architectures & checkpoints
β”œβ”€β”€ data/                 # Preprocessing and datasets
β”œβ”€β”€ hf_deploy/            # Hugging Face config & deployment scripts
β”œβ”€β”€ theoretical/          # Podcast & theoretical discussions
β”œβ”€β”€ utils/                # Helper scripts
β”œβ”€β”€ requirements.txt
└── README.md

🚀 Hugging Face Deployment

This project includes:

  • Scripts to convert the model for 🤗 Transformers compatibility
  • Scripts to upload it to the Hugging Face Hub
  • A Gradio app for launching an interactive demo on Hugging Face Spaces

You'll find detailed instructions inside the hf_deploy/ folder.
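As a hedged sketch of the upload and demo steps, the snippet below pushes a local folder to the Hub and serves a text-in/text-out demo with Gradio. The repo id and generate_text function are placeholders; the real scripts live in hf_deploy/.

# Pushing a model folder to the Hub and serving a Gradio demo (illustrative only).
from huggingface_hub import HfApi
import gradio as gr

api = HfApi()
api.create_repo("your-username/building-llms-from-scratch", exist_ok=True)  # hypothetical repo id
api.upload_folder(
    folder_path="models/",                      # local checkpoint / config directory
    repo_id="your-username/building-llms-from-scratch",
)

def generate_text(prompt):                      # placeholder for the model's generate loop
    return prompt + " ..."

gr.Interface(fn=generate_text, inputs="text", outputs="text").launch()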


📚 Resources


📄 License

MIT License. See the LICENSE file for details.