Refinedoc - Little text processing lib
Hello everyone!
I'm here to present my latest little project, which I developed as part of a larger project for my work.
What's more, the lib is written in pure Python and has no dependencies other than the standard lib.
What My Project Does
It's called Refinedoc, and it's a little Python lib that lets you remove headers and footers from poorly structured texts in a fairly robust and normally not very RAM-intensive way (appreciate the scientific precision of that last point), based on this paper: https://www.researchgate.net/publication/221253782_Header_and_Footer_Extraction_by_Page-Association
I developed it initially to manage content extracted from PDFs I process as part of a professional project.
When Should You Use My Project?
The idea behind this library is to enable post-extraction processing of unstructured text content, the best-known example being PDF files. The main goal is to robustly and reliably separate the text body from its headers and footers, which is very useful when you collect a lot of PDF files and want the body of each.
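To make the idea concrete, here is a minimal sketch of header/footer removal by comparing the first and last lines across pages. It is not Refinedoc's actual API and not the exact algorithm from the paper; the helper name `strip_repeated_edges`, the digit normalization, and the `min_ratio` threshold are all illustrative assumptions.

```python
import re
from collections import Counter


def _normalize(line: str) -> str:
    """Replace digit runs with '#' so 'Page 3' and 'Page 12' compare equal."""
    return re.sub(r"\d+", "#", line.strip().lower())


def strip_repeated_edges(pages, edge=1, min_ratio=0.6):
    """Drop top/bottom lines that recur on most pages (hypothetical helper).

    pages: list of pages, each a list of text lines.
    edge: how many lines to inspect at the top and bottom of each page.
    min_ratio: fraction of pages a line must appear on to count as repeated.
    """
    top, bottom = Counter(), Counter()
    for page in pages:
        # Count each normalized edge line once per page.
        top.update({_normalize(line) for line in page[:edge]})
        bottom.update({_normalize(line) for line in page[-edge:]})

    threshold = min_ratio * len(pages)
    bodies = []
    for page in pages:
        start, end = 0, len(page)
        # Skip leading lines that look like a repeated header.
        while start < min(edge, end) and top[_normalize(page[start])] >= threshold:
            start += 1
        # Skip trailing lines that look like a repeated footer.
        while end > max(start, len(page) - edge) and bottom[_normalize(page[end - 1])] >= threshold:
            end -= 1
        bodies.append(page[start:end])
    return bodies
```

Called on three pages that each start with "ACME Report" and end with "Page N", this keeps only the middle lines of each page.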
Comparison
I compared it with PyMuPDF4LLM, which is incredible but doesn't let you extract headers and footers specifically, and its license was a problem in my case.
I'd be delighted to hear your feedback on the code or lib as such!
https://github.com/CyberCRI/refinedoc
/r/Python
https://redd.it/1kn4lfx
ResearchGate
(PDF) Header and Footer Extraction by Page-Association
PDF | This paper introduces a robust algorithm to extract headers and footers from a variety of electronic documents, such as image files, Adobe PDF... | Find, read and cite all the research you need on ResearchGate
PyTorch vs. Keras/TensorFlow [D]
Hey guys,
I am aware of the intended use cases, but I am interested to learn what you use more often in your projects. PyTorch or Keras and why?
/r/Python
https://redd.it/1kn4132
[R] AlphaEvolve: A coding agent for scientific and algorithmic discovery
Paper: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf
Abstract:
In this white paper, we present AlphaEvolve, an evolutionary coding agent that substantially enhances capabilities of state-of-the-art LLMs on highly challenging tasks such as tackling open scientific problems or optimizing critical pieces of computational infrastructure. AlphaEvolve orchestrates an autonomous pipeline of LLMs, whose task is to improve an algorithm by making direct changes to the code. Using an evolutionary approach, continuously receiving feedback from one or more evaluators, AlphaEvolve iteratively improves the algorithm, potentially leading to new scientific and practical discoveries. We demonstrate the broad applicability of this approach by applying it to a number of important computational problems. When applied to optimizing critical components of large-scale computational stacks at Google, AlphaEvolve developed a more efficient scheduling algorithm for data centers, found a functionally equivalent simplification in the circuit design of hardware accelerators, and accelerated the training of the LLM underpinning AlphaEvolve itself. Furthermore, AlphaEvolve discovered novel, provably correct algorithms that surpass state-of-the-art solutions on a spectrum of problems in mathematics and computer science, significantly expanding the scope of prior automated discovery methods (Romera-Paredes et al., 2023). Notably, AlphaEvolve developed a search algorithm that found a procedure to multiply two 4 × 4 complex-valued matrices using 48 scalar multiplications; offering the first improvement, after 56 years, over Strassen’s algorithm in this setting.
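For context on the 48 figure: Strassen's scheme multiplies 2 × 2 blocks with 7 products instead of 8, so applying it recursively to 4 × 4 matrices takes 7 × 7 = 49 scalar multiplications, versus 64 for the schoolbook method. The sketch below counts them for the Strassen baseline only; it does not reproduce AlphaEvolve's 48-multiplication procedure, and the helper names are my own.

```python
def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m):
    """Split a square matrix into four equal quadrants."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def naive(a, b):
    """Schoolbook multiplication: n**3 scalar multiplications."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def strassen(a, b, counter):
    """Multiply square matrices whose size is a power of two.

    counter is a one-element list; counter[0] counts scalar multiplications.
    """
    if len(a) == 1:
        counter[0] += 1
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # The seven block products of Strassen's scheme.
    m1 = strassen(add(a11, a22), add(b11, b22), counter)
    m2 = strassen(add(a21, a22), b11, counter)
    m3 = strassen(a11, sub(b12, b22), counter)
    m4 = strassen(a22, sub(b21, b11), counter)
    m5 = strassen(add(a11, a12), b22, counter)
    m6 = strassen(sub(a21, a11), add(b11, b12), counter)
    m7 = strassen(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Stitch the four quadrants back into one matrix.
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

Passing two 4 × 4 matrices and `counter = [0]` leaves `counter[0]` at 49, which is the number the abstract's result improves on by one.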
/r/MachineLearning
https://redd.it/1kmxi4z
I built an Interactive reStructuredText Tutorial that runs entirely in your browser
Hey r/Python!
I wanted to share a project I've been working on: an Interactive reStructuredText Tutorial.
What My Project Does
It's a web-based, hands-on tutorial designed to teach reStructuredText (reST), the markup language used extensively in Python documentation (like Sphinx, docstrings, etc.). The entire tutorial, including the reST rendering, runs directly in your browser using PyScript and Pyodide.
You get a lesson description on one side and an interactive editor on the other. As you type reST in the editor, you see the rendered HTML output update instantly. It covers topics from basic syntax and inline markup to more complex features like directives, roles, tables, and figures.
There's also a separate Playground page for free-form experimentation.
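As a small taste of the kind of markup such a tutorial covers, here is a sketch of reST inside a Python docstring. The function itself is a made-up example and not taken from the tutorial; the markup shown (`*emphasis*`, double-backtick literals, `:param:` field lists as used by Sphinx, and a `::` literal block) is standard.

```python
def scale(values, factor=2.0):
    """Multiply every item in *values* by ``factor``.

    :param values: Numbers to scale.
    :type values: list[float]
    :param factor: Multiplier applied to each number.
    :returns: A new list; ``values`` is left unchanged.

    Example::

        >>> scale([1, 2], factor=3)
        [3, 6]
    """
    return [v * factor for v in values]
```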
Why I Made It
While the official reStructuredText documentation is comprehensive, I find that learning markup languages is often easier with immediate, interactive feedback. I wanted to create a tool where users could experiment with reST syntax and see the results without needing any local setup. Building it with PyScript was also a fun challenge to see how much could be done directly in the browser with Python.
Target Audience
This is for anyone who needs to learn or brush up on reStructuredText:
Python developers writing documentation or docstrings.
Users of Sphinx or
/r/Python
https://redd.it/1kn6ysa
[R] AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
> Large language models (LLMs) are remarkably versatile. They can summarize documents, generate code or even brainstorm new ideas. And now we’ve expanded these capabilities to target fundamental and highly complex problems in mathematics and modern computing.
Today, we’re announcing AlphaEvolve, an evolutionary coding agent powered by large language models for general-purpose algorithm discovery and optimization. AlphaEvolve pairs the creative problem-solving capabilities of our Gemini models with automated evaluators that verify answers, and uses an evolutionary framework to improve upon the most promising ideas.
AlphaEvolve enhanced the efficiency of Google's data centers, chip design and AI training processes — including training the large language models underlying AlphaEvolve itself. It has also helped design faster matrix multiplication algorithms and find new solutions to open mathematical problems, showing incredible promise for application across many areas.
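The propose, evaluate, keep-the-best loop described in the quote can be illustrated with a toy sketch. This is not AlphaEvolve's implementation: random numeric mutation stands in for the LLM proposing code changes, the "program" is just a list of numbers, and every name and parameter here is an assumption for illustration.

```python
import random


def evaluate(candidate, target):
    """Automated evaluator: higher is better (negative squared error)."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def evolve(target, generations=200, population_size=20, seed=0):
    """Toy evolutionary search for a vector close to target."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in target]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Keep the most promising half, as scored by the evaluator.
        population.sort(key=lambda c: evaluate(c, target), reverse=True)
        survivors = population[: population_size // 2]
        # Refill the population with mutated copies of survivors.
        children = [[gene + rng.gauss(0, 0.1) for gene in rng.choice(survivors)]
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda c: evaluate(c, target))
```

Because survivors are carried over unchanged, the best score never gets worse from one generation to the next; the evaluator is the only source of feedback.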
For all the Evolutionary Algorithm fans out there, here's a really interesting paper that DeepMind published where they show AlphaEvolve designing advanced algorithms, like improving matrix multiplication (which is a big deal in ML optimization).
Paper link: https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
Interview with team:
https://youtu.be/vC9nAosXrJw?si=rzZSorXqgbqChFJa
/r/MachineLearning
https://redd.it/1kmzpg0
Google DeepMind
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators
Introducing Pyrefly: A fast type checker and IDE experience for Python, written in Rust
Blog post: https://engineering.fb.com/2025/05/15/developer-tools/introducing-pyrefly-a-new-type-checker-and-ide-experience-for-python/
Podcast: https://engineering.fb.com/2025/05/15/developer-tools/open-sourcing-pyrefly-a-faster-python-type-checker-written-in-rust/
Source code: https://github.com/facebook/pyrefly
/r/Python
https://redd.it/1knh1uu
Engineering at Meta
Introducing Pyrefly: A new type checker and IDE experience for Python
Today we are announcing an alpha version of Pyrefly, an open source Python type checker and IDE extension crafted in Rust. Pyrefly is a static type checker that analyzes Python code to ensure type …
Python for Good - Save the Date!
Hey Pythonistas!
Do you:
✅ Get excited about writing Python code?
✅ Want to use your skills for some serious good in the world?
✅ Interested in hanging out with the coolest, kindest, most awesome people in the Python community?
✅ Want to make dozens of new close friends?
If you're nodding enthusiastically right now, block off August 28-31st for Python for Good! Registration opens June 1st, but we wanted to give you a heads-up so you can plan accordingly!
Never heard of Python for Good? Python for Good operates year round but the event is basically summer camp for nerds! And it's ALL-INCLUSIVE (yes, you read that right) - lodging, meals, everything - at a gorgeous retreat space overlooking the Pacific Ocean. By day, we code for awesome causes. By night? We unleash our inner geeks with board games, nature hikes, campfire s'mores, epic karaoke battles, and other community building activities!
This is definitely NOT a hackathon. We work on real problems from real nonprofits (who'll be right there with us!), creating or contributing to existing open source solutions that will continue to make a difference long after the event wraps up.
Sounds like fun? Or maybe something your company would love to support? Hit us up!
/r/Python
https://redd.it/1knhkex
Friday Daily Thread: r/Python Meta and Free-Talk Fridays
# Weekly Thread: Meta Discussions and Free Talk Friday 🎙️
Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!
## How it Works:
1. Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community.
2. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community.
3. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting.
## Guidelines:
All topics should be related to Python or the /r/python community.
Be respectful and follow Reddit's Code of Conduct.
## Example Topics:
1. New Python Release: What do you think about the new features in Python 3.11?
2. Community Events: Any Python meetups or webinars coming up?
3. Learning Resources: Found a great Python tutorial? Share it here!
4. Job Market: How has Python impacted your career?
5. Hot Takes: Got a controversial Python opinion? Let's hear it!
6. Community Ideas: Something you'd like to see us do? Tell us.
Let's keep the conversation going. Happy discussing! 🌟
/r/Python
https://redd.it/1knn8l8
Better Pythonic Thinking
I've been using Python for a while, but I still find myself writing it more like JS than truly "Pythonic" code. I'm trying to level up how I think in Python.
Any tips, mindsets, patterns, or cheat sheets that helped you make the leap to more Pythonic thinking?
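To illustrate the kind of gap being described, here is a small made-up comparison: the same filtering task written with an index-based loop, then with a comprehension and `enumerate()`. The data and variable names are invented for the example.

```python
users = [
    {"name": "ada", "active": True},
    {"name": "linus", "active": False},
    {"name": "grace", "active": True},
]

# Index-based loop, the way it is often written when coming from JS.
names_js_style = []
for i in range(len(users)):
    if users[i]["active"] == True:
        names_js_style.append(users[i]["name"].title())

# Comprehension: iterate over items directly and rely on truthiness.
names_pythonic = [u["name"].title() for u in users if u["active"]]

# enumerate() and tuple unpacking replace manual counters.
numbered = [f"{n}. {name}" for n, name in enumerate(names_pythonic, start=1)]
```

Both versions produce the same list; the second simply states what is wanted instead of how to index into it.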
/r/Python
https://redd.it/1knff06
Jinja2
What is a Jinja2 template?
Please explain it, or point me to a source or YouTube video.
/r/flask
https://redd.it/1kn3rhw
Python and Flask
I am using Python with Flask to create a secure login portal. Since I have a QA exam, could you tell me what theory and practical questions the QA team might ask?
/r/flask
https://redd.it/1kmhiri
What network/data analysis projects are you building in Python?
I've been working on some tools to analyze detailed API performance data — things like latency, error rates, and concurrency patterns from load tests, mostly using Python, pandas, and notebooks.
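As a rough idea of what such an analysis can look like, here is a minimal sketch that summarizes latency percentiles and error rate from load-test results. It uses only the standard library's `statistics` module rather than pandas, and the input shape (a list of `(latency_ms, status_code)` tuples) and the function name `summarize` are assumptions for the example.

```python
from statistics import mean, quantiles


def summarize(requests):
    """Summarize a load test.

    requests: list of (latency_ms, status_code) tuples; needs at least two entries.
    """
    latencies = [latency for latency, _ in requests]
    # Treat 5xx responses as errors.
    errors = sum(1 for _, status in requests if status >= 500)
    # n=100 yields 99 cut points; index 49 is p50 and index 94 is p95.
    cuts = quantiles(latencies, n=100, method="inclusive")
    return {
        "mean_ms": mean(latencies),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "error_rate": errors / len(requests),
    }
```

Percentiles tend to be more informative than the mean here, since a single slow request can dominate the average.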
Got me wondering: what kinds of network-related data projects are people building these days?
Always up for swapping ideas — or just learning what’s out there.
/r/Python
https://redd.it/1knvl0u
Hi there, I'm new here and I need a partner to learn Django with over Discord
/r/djangolearning
https://redd.it/1knv0m8
[D] Presenting a paper virtually in ACL Findings - should we?
Hi everyone.
Our paper (mine and my colleagues') has been accepted to ACL Findings. This is the first paper of mine that got accepted, so I am very excited and happy.
ACL Findings papers are not required to be presented. They give you the option to present, and if you choose to, you can do it in person or virtually.
Unfortunately, none of us are able to fly to the conference and present in person. So the question becomes: is it worth presenting virtually?
I would love to hear what people think, and about any experiences you've had presenting virtually.
Thanks.
/r/MachineLearning
https://redd.it/1knvsib
Which library would you choose Pygame or Arcade?
Which library would you choose if you were making a game similar to Mini Militia for Steam? I see that both libraries are good and have community support, but which one would you choose? If you'd suggest any other options, do comment.
/r/Python
https://redd.it/1knwiyt
[P] TTSDS2 - Multilingual TTS leaderboard
A while back, I posted about my TTS evaluation metric TTSDS, which uses an ensemble of perceptually motivated, FID-like scores to objectively evaluate synthetic speech quality. The original thread is here, where I got some great feedback:
https://www.reddit.com/r/MachineLearning/comments/1e9ec0m/p_ttsds_benchmarking_recent_tts_systems/
Since then, I've finally gotten around to updating the benchmark. The new version—TTSDS2—is now multilingual, covering 14 languages, and generally more robust across domains and systems.
⭐ Leaderboard: ttsdsbenchmark.com#leaderboard
📄 Paper: https://arxiv.org/abs/2407.12707
The main idea behind TTSDS2 is still the same: FID-style (distributional) metrics can work well for TTS, but only if we use several of them together, based on perceptually meaningful categories/factors. The goal is to correlate as closely as possible with human judgments, without having to rely on trained models, ground truth transcriptions, or tuning hyperparameters. In this new version, we get a Spearman correlation above 0.5 with human ratings in every domain and language tested, which none of the other 16 metrics we compared against could do.
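For readers unfamiliar with the statistic mentioned above, Spearman correlation is the Pearson correlation computed on ranks, so it measures whether a metric orders systems the same way human ratings do. Here is a minimal standard-library sketch; the helper names and the example numbers are made up and are not TTSDS2 data.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical metric scores and human ratings for four systems.
metric_scores = [0.2, 0.5, 0.9, 0.4]
human_ratings = [1.5, 3.0, 4.5, 3.2]
rho = spearman(metric_scores, human_ratings)  # approximately 0.8
```

A value of 1.0 would mean the metric ranks every system exactly as the listeners did, regardless of the scale of either score.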
I've also put in place a few infrastructure changes. The benchmark now reruns automatically every quarter, pulling in new systems published in the previous quarter. This avoids test set contamination. The test sets themselves are also regenerated periodically using a reproducible pipeline. All TTS
/r/MachineLearning
https://redd.it/1knwaf7
RouteSage - Documentation of FastAPI made easy
I have just built RouteSage as one of my side projects. The motivation for building this package was the tiring process of manually creating documentation for FastAPI routes. So I thought of building this, and it is my first vibe-coded project.
My idea is to make this an open source project so that it can be expanded to other frameworks and gain new features.
What My Project Does:
RouteSage is a CLI tool that uses LLMs to automatically generate human-readable documentation from FastAPI route definitions. It scans your FastAPI codebase and provides detailed, readable explanations for each route, helping teams understand API behavior faster.
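To give a feel for the scanning step, here is a minimal sketch that uses Python's `ast` module to find functions decorated like `@app.get("/path")` in a source string. This is an assumption about how such a scan could work, not RouteSage's actual implementation; `find_routes` and `SAMPLE` are hypothetical names, and the LLM call that would turn each result into prose is left out.

```python
import ast

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def find_routes(source):
    """Return (METHOD, path, function name, docstring) for route-style decorators."""
    routes = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            # Match calls such as app.get("/items") or router.post("/items").
            if (isinstance(deco, ast.Call)
                    and isinstance(deco.func, ast.Attribute)
                    and deco.func.attr in HTTP_METHODS
                    and deco.args
                    and isinstance(deco.args[0], ast.Constant)
                    and isinstance(deco.args[0].value, str)):
                routes.append((deco.func.attr.upper(), deco.args[0].value,
                               node.name, ast.get_docstring(node)))
    return routes


SAMPLE = '''
@app.get("/items/{item_id}")
async def read_item(item_id: int):
    """Fetch one item."""
    return {"item_id": item_id}

@app.post("/items")
def create_item(item: dict):
    return item
'''
```

Because the source is only parsed and never imported, FastAPI does not need to be installed for the scan itself.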
Target Audience:
RouteSage is intended for FastAPI developers who want clearer documentation for their APIs—especially useful in teams where understanding endpoints quickly is crucial. This is currently a CLI-only tool, ideal for development or internal tooling use.
Comparison:
Unlike FastAPI’s built-in OpenAPI/Swagger UI docs, which focus on the structural and request/response schema, RouteSage provides natural language explanations powered by LLMs, giving context and descriptions not present in standard auto-generated docs. This is useful for onboarding, code reviews, or improving overall API clarity.
Your suggestions and feedback are welcome.
Link to project: https://github.com/dijo-d/RouteSage
https://routesage.vercel.app
/r/Python
https://redd.it/1knw6ie