xXKeNdAmAmAsTeRXx/toy_LLM

Toy LLM

This project implements a simple language model based on Markov chains. For an introduction to Markov chains in Natural Language Processing (NLP), see this article: geeksforgeeks.org.

In short, the model estimates the probability that one word follows another in the training text, and then randomly samples each next word based only on the previous one.
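The idea can be illustrated with a minimal, standalone sketch. It is not the repository's `MarkovModel` implementation; the function names `fit_transitions` and `sample_next` are hypothetical and exist only for this example. It counts which word follows which, then samples a follower in proportion to those counts.

```python
import random
from collections import Counter, defaultdict


def fit_transitions(text: str) -> dict[str, Counter]:
    """Count how often each word is followed by each other word.

    Hypothetical helper for illustration, not part of this repo's API.
    """
    tokens = text.split()
    transitions: dict[str, Counter] = defaultdict(Counter)
    # Pair every token with its immediate successor.
    for prev, nxt in zip(tokens, tokens[1:]):
        transitions[prev][nxt] += 1
    return dict(transitions)


def sample_next(transitions: dict[str, Counter], prev: str, rng: random.Random) -> str:
    """Pick a follower of `prev`, weighted by how often it was observed."""
    followers = transitions[prev]
    words = list(followers)
    weights = list(followers.values())
    return rng.choices(words, weights=weights, k=1)[0]


transitions = fit_transitions("the cat sat on the mat")
rng = random.Random(0)

print(transitions["the"])                    # Counter({'cat': 1, 'mat': 1})
print(sample_next(transitions, "cat", rng))  # sat ("cat" has a single follower)
```

Because only the previous word is consulted, the sketch keeps no longer context: "the" is followed by "cat" or "mat" with equal probability, regardless of what came before it.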

This repo implements a model class with helper functions for fitting the model from a string or a .txt file.

Project setup

Install required dependencies:

pip install -r requirements.txt

Then instantiate the model with example text:

from model import MarkovModel

model: MarkovModel = MarkovModel.from_text("A B C A", seed=42)
print(model.predict_n_tokens(10))
# B C A B C A B C A B C
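To see why the output above repeats, it may help to compute the transition probabilities for the text "A B C A" by hand. The following sketch does this independently of `MarkovModel`, using only the standard library; it makes no assumption about how the repository stores its transitions internally.

```python
from collections import Counter, defaultdict

tokens = "A B C A".split()

# Count each (previous token -> next token) pair in the example text.
counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    counts[prev][nxt] += 1

# Normalize the counts into probabilities for each previous token.
probabilities = {
    prev: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
    for prev, followers in counts.items()
}

print(probabilities)
# {'A': {'B': 1.0}, 'B': {'C': 1.0}, 'C': {'A': 1.0}}
```

Every token has exactly one observed successor, so the chain cycles A, B, C, A, and so on, whatever seed is used. Richer training text gives tokens several possible successors, and the seed then affects which one is sampled.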

Examples

Take a look at the demo notebooks in this repository.

About

This project builds a toy LLM based on a Markov chain. In this approach, the choice of the next token depends only on the previous one.
