PaulPauls/llama3_interpretability_sae
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.
Language: Python
#feature_extraction #feature_steering #llama3 #llm_interpretability #open_research #pytorch #sparse_autoencoder
Stars: 285 Issues: 0 Forks: 13
https://github.com/PaulPauls/llama3_interpretability_sae
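For context, a sparse autoencoder (SAE) in this setting is trained to reconstruct a model's internal activations through an overcomplete latent layer in which only a few units fire, so each active latent tends to correspond to an interpretable feature. Below is a minimal NumPy sketch of the core computation (encode → ReLU → top-k sparsify → decode); the dimensions, weight initialization, and function names are illustrative assumptions for this note, not the repo's actual API or configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed, not the repo's config): activation width,
# overcomplete latent width, and number of latents kept active per input.
d_model, d_hidden, k = 16, 64, 8

# Encoder/decoder weights; SAE implementations often unit-normalize or tie
# the decoder, which is omitted here for brevity.
W_enc = rng.normal(0, 0.1, (d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(0, 0.1, (d_hidden, d_model))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode an activation vector, keep only the top-k latents, decode."""
    acts = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU encoder
    sparse = acts.copy()
    sparse[np.argsort(acts)[:-k]] = 0.0        # zero all but the k largest
    recon = sparse @ W_dec + b_dec             # linear decoder
    return sparse, recon

x = rng.normal(size=d_model)                   # stand-in for one LLM activation
sparse, recon = sae_forward(x)
print((sparse != 0).sum())                     # at most k active latents
print(((x - recon) ** 2).mean())               # reconstruction MSE (training loss)
```

Training minimizes the reconstruction error while the top-k constraint (or an L1 penalty, in other SAE variants) enforces sparsity; the resulting latents are then inspected, labeled, and used for feature steering.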