Python Daily
Daily Python News
Questions, tips and tricks, and best practices on the Python programming language
Find more reddit channels over at @r_channels
Thursday Daily Thread: Python Careers, Courses, and Furthering Education!

# Weekly Thread: Professional Use, Jobs, and Education 🏢

Welcome to this week's discussion on Python in the professional world! This is your spot to talk about job hunting, career growth, and educational resources in Python. Please note, this thread is not for recruitment.

---

## How it Works:

1. Career Talk: Discuss using Python in your job, or the job market for Python roles.
2. Education Q&A: Ask or answer questions about Python courses, certifications, and educational resources.
3. Workplace Chat: Share your experiences, challenges, or success stories about using Python professionally.

---

## Guidelines:

- This thread is not for recruitment. For job postings, please see r/PythonJobs or the recruitment thread in the sidebar.
- Keep discussions relevant to Python in the professional and educational context.

---

## Example Topics:

1. Career Paths: What kinds of roles are out there for Python developers?
2. Certifications: Are Python certifications worth it?
3. Course Recommendations: Any good advanced Python courses to recommend?
4. Workplace Tools: What Python libraries are indispensable in your professional work?
5. Interview Tips: What types of Python questions are commonly asked in interviews?

---

Let's help each other grow in our careers and education. Happy discussing! 🌟

/r/Python
https://redd.it/1kxwlna
I accidentally built a vector database using video compression

While building a RAG system, I got frustrated watching my 8GB RAM disappear into a vector database just to search my own PDFs. After burning through $150 in cloud costs, I had a weird thought: what if I encoded my documents into video frames?

The idea sounds absurd - why would you store text in video? But modern video codecs have spent decades optimizing for compression. So I tried converting text into QR codes, then encoding those as video frames, letting H.264/H.265 handle the compression magic.

The results surprised me. 10,000 PDFs compressed down to a 1.4GB video file. Search latency came in around 900ms compared to Pinecone’s 820ms, so about 10% slower. But RAM usage dropped from 8GB+ to just 200MB, and it works completely offline with no API keys or monthly bills.

The technical approach is simple: each document chunk gets encoded into QR codes which become video frames. Video compression handles redundancy between similar documents remarkably well. Search works by decoding relevant frame ranges based on a lightweight index.
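
For concreteness, here is a minimal sketch of the encode step as I understand it (my reconstruction, not the memvid code itself); it assumes the qrcode, Pillow, opencv-python, and numpy packages:

```python
# Hypothetical sketch: one QR code per chunk, one video frame per QR code.
import cv2
import numpy as np
import qrcode

def chunk_to_frame(text: str, size: int = 512) -> np.ndarray:
    """Render one document chunk as a QR code sized for a video frame."""
    img = qrcode.make(text).get_image().convert("L")   # grayscale PIL image
    arr = cv2.resize(np.asarray(img, dtype=np.uint8), (size, size),
                     interpolation=cv2.INTER_NEAREST)  # keep QR modules crisp
    return cv2.cvtColor(arr, cv2.COLOR_GRAY2BGR)       # VideoWriter expects BGR

chunks = ["first document chunk...", "second document chunk..."]
writer = cv2.VideoWriter("docs.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 1, (512, 512))
for chunk in chunks:
    writer.write(chunk_to_frame(chunk))  # frame number doubles as the chunk id
writer.release()
```

Search then seeks to the frame numbers returned by the index and decodes only those QR codes back into text.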

You get a vector database that’s just a video file you can copy anywhere.

https://github.com/Olow304/memvid

/r/Python
https://redd.it/1ky24a0
I built a template for FastAPI apps with React frontends using Nginx Unit

Hey guys, this is probably a common experience, but as I built more and more Python apps for actual users, I always found myself eventually having to move away from libraries like Streamlit or Gradio as features and complexity grew.

This meant that I eventually had to reach for React and the disastrous JS ecosystem; it also meant managing two applications (the React frontend and a FastAPI backend), which always made deployment more of a chore. However, having access to building UIs with Tailwind and Shadcn was so good, I preferred to just bite the bullet.

But as I kept working on and polishing this stack, I started to find ways to make it much more manageable. One of the biggest improvements was starting to use Nginx Unit, which is a drop-in replacement for uvicorn in Python terms, but it can also serve SPAs like React incredibly well, while also handling request routing internally.

This setup lets me collapse my two applications into a single runtime, a single container. Which makes it SO much easier to deploy my applications to GCP Cloud Run, Azure Web Apps, Fly Machines, etc.
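
For a sense of what that looks like, a Unit config in this spirit might be (a sketch, not the template's actual config; paths and names are illustrative):

```json
{
  "listeners": { "*:8000": { "pass": "routes" } },
  "routes": [
    { "match": { "uri": "/api/*" }, "action": { "pass": "applications/fastapi" } },
    { "action": { "share": "/www/spa$uri", "fallback": { "share": "/www/spa/index.html" } } }
  ],
  "applications": {
    "fastapi": { "type": "python", "path": "/app", "module": "main", "callable": "app" }
  }
}
```

Unit routes /api/* to the FastAPI app and serves the built React bundle for everything else, falling back to index.html so client-side routes still resolve.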

Anyways, I created a template repo that I could reuse to skip the boilerplate of this

/r/Python
https://redd.it/1ky1bwq
[R] Can't attend to present at ICML

Due to visa issues, no one on our team can attend to present our poster at ICML.


Does anyone have experience with not physically attending in the past? Is ICML typically flexible with this if we register and don't come to stand by the poster? Or do they check conference check-ins?



/r/MachineLearning
https://redd.it/1kxs67w
We built a Python SDK for our open source auth platform - would love feedback from Flask devs!!

Hey everyone, I’m Megan writing from Tesseral, the YC-backed open source authentication platform built specifically for B2B software (think: SAML, SCIM, RBAC, session management, etc.). We released our Python SDK and I’d love feedback from Flask devs…. 

If you’re interested in auth, or if you have experience building it in Flask, I’d love to know what’s missing, what’s confusing, or what would make this easier to use in your stack. Also, if you have general gripes about auth (it is very gripeable), I’d love to hear them.

Here’s our GitHub: https://github.com/tesseral-labs/tesseral 

And our docs: https://tesseral.com/docs/what-is-tesseral   

Appreciate the feedback!

/r/flask
https://redd.it/1kxs6ff
Architecture and code for a Python RAG API using LangChain, FastAPI, and pgvector

I’ve been experimenting with building a Retrieval-Augmented Generation (RAG) system entirely in Python, and I just completed a write-up that breaks down the architecture and implementation details.

The stack:

- Python + FastAPI
- LangChain (for orchestration)
- PostgreSQL + pgvector
- OpenAI embeddings

I cover the high-level design, vector store integration, async handling, and API deployment — all with code and diagrams.
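
As a rough illustration of how those pieces snap together (a sketch under my own naming, assuming the langchain-openai and langchain-postgres packages, not the write-up's exact code):

```python
from fastapi import FastAPI
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector

app = FastAPI()
store = PGVector(
    embeddings=OpenAIEmbeddings(),
    collection_name="docs",
    connection="postgresql+psycopg://user:pass@localhost:5432/rag",  # placeholder DSN
)

@app.get("/search")
async def search(q: str, k: int = 4):
    # Embed the query and return the k nearest chunks from pgvector.
    hits = store.similarity_search(q, k=k)
    return [{"text": d.page_content, "metadata": d.metadata} for d in hits]
```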

I'd love to hear your feedback on the architecture or tradeoffs, especially if you're also working with vector DBs or LangChain.

📄 Architecture + code walkthrough



/r/Python
https://redd.it/1ky5bgs
Looking for advice: Applying for a full-stack role with 5-year experience requirement (React/Django) — Internal referral opportunity

Hi everyone,

I’d really appreciate some advice or insight from folks who’ve been in a similar situation.

I was recently referred internally for a full-stack software engineer role that I’m very excited about. It’s a rare opportunity for me, but I’m feeling unsure because the job requires 5 years of experience in designing, developing, and testing web applications using Python, Django, React, and JavaScript.

Here’s my background:

- I graduated in 2020 with a degree in Computer Engineering.
- I worked for 2.5 years doing manual QA testing on the Google TV platform.
- For the past 5 years, I’ve been teaching Python fundamentals and data structures at a coding bootcamp.
- I only started learning React and Django a few months ago, but I’ve gone through the official tutorials on both the React and Django websites and have built a few simple full-stack apps. I feel fairly comfortable with the basics and am continuing to learn every day.

While I don't meet the "5 years of professional experience with this exact stack" requirement, I do have relevant technical exposure, strong Python fundamentals, and hands-on experience through teaching and recent personal projects.

If you've been in similar shoes — applying for a role where you didn’t meet all the

/r/django
https://redd.it/1ky3fqc
DTC - CLI tool to dump Telegram channels

🚀 What My Project Does

Extracts data from a particular Telegram channel.

Target Audience

Anyone who wants to dump a Telegram channel.


Comparison

Never thought about alternatives, because I made up this project idea this morning.

Key features:

📋 Lists all channels you're subscribed to in a nice tabular format
💾 Dumps complete message history from any channel
📸 Downloads attached photos automatically
💾 Exports everything to structured JSONL format
🖥️ Interactive CLI with clean, readable output

# 🛠️ Tech Stack

Built with some solid Python libraries:

Telethon - for Telegram API integration
Pandas - for data handling and table formatting
Tabulate - for those beautiful CLI tables

Requires Python 3.8+ and works across platforms.
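
The Telethon core of a tool like this is pleasantly small; a rough sketch (illustrative, not the project's actual code; api_id and api_hash come from my.telegram.org):

```python
import asyncio
from telethon import TelegramClient

API_ID, API_HASH = 12345, "your_api_hash"  # placeholders

async def dump(channel: str) -> None:
    # Iterates newest-to-oldest over the channel's full message history.
    async with TelegramClient("session", API_ID, API_HASH) as client:
        async for message in client.iter_messages(channel):
            print(message.id, message.date, message.text)

asyncio.run(dump("some_channel"))
```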

# 🎯 How it works

The workflow is super simple:

```bash
# List your channels
>> list
+----+----------------------------+-------------+
|    | name                       | telegram id |
+====+============================+=============+
|  0 | My Favorite Channel        |   123456789 |
+----+----------------------------+-------------+
```


/r/Python
https://redd.it/1kya1xc
I don't understand the Flask-SQLAlchemy conventions

When using the Flask-SQLAlchemy package, I don't understand the convention of

    class Base(DeclarativeBase):
        pass

    db = SQLAlchemy(model_class=Base)


Why not just pass in `db=SQLAlchemy(model_class=DeclarativeBase)`?

/r/flask
https://redd.it/1kyd5l7
Open-source AI-powered test automation library for mobile and web

Hey [r/Python](/r/Python/),

My name is Alex Rodionov and I'm a tech lead of the Selenium project. For the last 10 months, I’ve been working on **Alumnium**. I've already shared it [2 months ago](https://www.reddit.com/r/Python/comments/1jpo96u/i_built_an_opensource_aipowered_library_for_web/), but since then the project has gained a lot of new features, notably:

* mobile applications support via Appium;
* built-in caching for faster test execution;
* fully local model support with Ollama and Mistral Small 3.1.

**What My Project Does**
It's an open-source Python library that automates testing for mobile and web applications by leveraging AI, natural language commands and Appium, Playwright, or Selenium.

**Target Audience**
Test automation engineers or anyone writing tests for web applications. It’s an early-stage project, not ready for production use in complex web applications.

**Comparison**
Unlike other similar projects (Shortest, LaVague, Hercules), Alumnium can be used in existing tests without changes to test runners, reporting tools, or any other test infrastructure. This allows me to gradually migrate my test suites (mostly Selenium) and revert whenever something goes wrong (this happens a lot, to be honest). Other major differences:

* dead cheap (works on low-tier models like gpt-4o-mini, costs $20 per month for 1k+ tests)
* not an AI agent (dumb enough to fail the test rather than working around to

/r/Python
https://redd.it/1kyjmwl
Problems with Django Autocomplete Light

So I'm stuck. I'm trying to make two selection boxes: one to select the state, the other to select the city. Neither the code nor the HTML is crashing, but nothing is being loaded into the selection boxes.

Any help would be greatly appreciated!

#models.py
class City(models.Model):
    country = models.CharField(max_length=50)
    state = models.CharField(max_length=50)
    city = models.CharField(max_length=50)

    def __str__(self):
        return f"{self.name}, {self.state}"

#forms.py
class CreateUserForm(forms.ModelForm):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Ensure city field has proper empty queryset initially


/r/djangolearning
https://redd.it/1kxt7bn
[R] How to add confidence intervals to your LLM-as-a-judge

Hi all – I recently built a system that automatically determines how many LLM-as-a-judge runs you need for statistically reliable scores. Key insight: treat each LLM evaluation as a noisy sample, then use confidence intervals to decide when to stop sampling.
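
A minimal sketch of that stopping rule (my own reconstruction, not the repo's code): keep sampling judge scores until the 95% confidence interval around the mean is narrower than a target half-width.

```python
import math

def needs_more_samples(scores: list[float], half_width: float = 0.05,
                       z: float = 1.96, min_n: int = 3) -> bool:
    """True while the z-level CI around the mean score is wider than half_width."""
    n = len(scores)
    if n < min_n:
        return True
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return z * math.sqrt(var / n) > half_width

scores = []
while needs_more_samples(scores):
    scores.append(run_llm_judge())  # hypothetical: one judge call -> one score
```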

The math shows reliability is surprisingly cheap (95% → 99% confidence only costs 1.7x more), but precision is expensive (doubling scale granularity costs 4x more). I also implemented "mixed-expert sampling": rotating through multiple models (GPT-4, Claude, etc.) in the same batch for better robustness.

I also analyzed how latency, cost, and reliability scale in this approach. Typical result: you need 5-20 samples instead of guessing. It's especially useful for AI safety evals and model comparisons where reliability matters.

Blog: https://www.sunnybak.net/blog/precision-based-sampling

GitHub: https://github.com/sunnybak/precision-based-sampling/blob/main/mixed_expert.py

I’d love feedback or pointers to related work.

Thanks!

/r/MachineLearning
https://redd.it/1kyl04x
Friday Daily Thread: r/Python Meta and Free-Talk Fridays

# Weekly Thread: Meta Discussions and Free Talk Friday 🎙️

Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!

## How it Works:

1. Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community.
2. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community.
3. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting.

## Guidelines:

- All topics should be related to Python or the /r/python community.
- Be respectful and follow Reddit's Code of Conduct.

## Example Topics:

1. New Python Release: What do you think about the new features in Python 3.11?
2. Community Events: Any Python meetups or webinars coming up?
3. Learning Resources: Found a great Python tutorial? Share it here!
4. Job Market: How has Python impacted your career?
5. Hot Takes: Got a controversial Python opinion? Let's hear it!
6. Community Ideas: Something you'd like to see us do? Tell us!

Let's keep the conversation going. Happy discussing! 🌟

/r/Python
https://redd.it/1kyq5i2
bulletchess, A high performance chess library

# What My Project Does

`bulletchess` is a high-performance chess library that implements the following and more:

* A complete game model with intuitive representations for pieces, moves, and positions.
* Extensively tested legal move generation, application, and undoing.
* Parsing and writing of positions specified in [Forsyth-Edwards Notation](https://www.chessprogramming.org/Forsyth-Edwards_Notation) (FEN), and moves specified in both [Long Algebraic Notation](https://www.chessprogramming.org/Algebraic_Chess_Notation#Long_Algebraic_Notation_.28LAN.29) and [Standard Algebraic Notation](https://www.chessprogramming.org/Algebraic_Chess_Notation#Standard_Algebraic_Notation_.28SAN.29).
* Methods to determine if a position is check, checkmate, stalemate, and each specific type of draw.
* Efficient hashing of positions using [Zobrist Keys](https://en.wikipedia.org/wiki/Zobrist_hashing).
* A [Portable Game Notation](https://thechessworld.com/articles/general-information/portable-chess-game-notation-pgn-complete-guide/) (PGN) file reader
* Utility functions for writing engines.

`bulletchess` is implemented as a C extension, similar to NumPy.

# Target Audience

I made this library after being frustrated with how slow `python-chess` was at large dataset analysis for machine learning and engine building. I hope it can be useful to anyone else looking for a fast interface to do any kind of chess ML in Python.

# Comparison:

`bulletchess` has many of the same features as `python-chess`, but [is much faster](https://zedeckj.github.io/bulletchess/auto-examples/performance.html). I think the syntax of `bulletchess` is also a lot nicer to use. For example, instead of `python-chess`'s

board.piece_at(E1)

`bulletchess` uses:

board[E1]

You can install wheels with:

    pip

/r/Python
https://redd.it/1kyoyds
Mastering Modern Time Series Forecasting: The Complete Guide to Statistical, Machine Learning & Deep Learning

I’ve been working on a Python-focused guide called Mastering Modern Time Series Forecasting — aimed at bridging the gap between theory and practice for time series modeling.

It covers a wide range of methods, from traditional models like ARIMA and SARIMA to deep learning approaches like Transformers, N-BEATS, and TFT. The focus is on practical implementation, using libraries like statsmodels, scikit-learn, PyTorch, and Darts. I also dive into real-world topics like handling messy time series data, feature engineering, and model evaluation.
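
For a taste of the statsmodels end of that spectrum, here is a minimal ARIMA forecast on synthetic data (illustrative only, not an excerpt from the guide):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))       # synthetic random-walk series

fitted = ARIMA(y, order=(1, 1, 1)).fit()  # ARIMA(p=1, d=1, q=1)
print(fitted.forecast(steps=5))           # next 5 predicted values
```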

I’m publishing the guide on Gumroad and LeanPub. I’ll drop a link in the comments in case anyone’s interested.

Always open to feedback from the community — thanks!

/r/Python
https://redd.it/1kz1tkt
🎉 Introducing TurboDRF - Auto-generate CRUD APIs from your Django models

# What My Project Does:
🚀 [TurboDRF](https://github.com/alexandercollins/turbodrf) is a new DRF module that auto-generates endpoints when you add one mixin class to your Django models:
- Auto-generated CRUD API endpoints with docs 🎉
- No more writing basic URLs, views, viewsets, or serializers
- Supports filtering, text search, and granular permissions

After many years with DRF and spinning up new projects, I've really gotten tired of writing basic views, URLs, and serializers, so I built TurboDRF to do all that for you.


🔗 You can access it here on my github: [https://github.com/alexandercollins/turbodrf](https://github.com/alexandercollins/turbodrf)


Basically just **add 1 mixin to the model you want to expose as an endpoint** and then 1 method in that model which specifies the fields (could probably move this to Meta tbh) and boom 💥 your API is ready.

📜 It also generates swagger docs, integrates with django's default user permissions (and has its own static role based permission system with field level permissions too), plus you get advanced filtering, full-text search, automatic pagination, nested relationships with double underscore notation, and automatic query optimization with select_related/prefetch_related.

💻 Here's a quick example:

```
class Book(models.Model, TurboDRFMixin):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    price = models.DecimalField(max_digits=10,
```
/r/Python
https://redd.it/1kyywn0
MigrateIt, A database migration tool

# What My Project Does

[MigrateIt](https://github.com/iagocanalejas/MigrateIt) lets you manage your database changes with simple migration files written in plain SQL, and run or roll them back as you wish.

It avoids the need to learn a different syntax for configuring database changes, letting you write them in the same SQL dialect your database uses.

# Target Audience

Developers tired of having to synchronize databases across environments, or of tools that have to be configured in JSON or native ASTs instead of plain SQL.

# Comparison

Instead of:

```json
{
  "databaseChangeLog": [
    {
      "changeSet": {
        "changes": [
          {
            "createTable": {
              "columns": [
                {
                  "column": {
                    "name": "CREATED_BY",
                    "type": "VARCHAR2(255 CHAR)"
```

/r/Python
https://redd.it/1kz30mk
Functional programming concepts that actually work in Python

Been incorporating more functional programming ideas into my Python/R workflow lately - immutability, composition, higher-order functions. Makes debugging way easier when data doesn't change unexpectedly.
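
A small sketch of the kind of thing I mean (my illustration, not from the post): frozen dataclasses for immutability plus a compose helper for building pipelines.

```python
from dataclasses import dataclass, replace
from functools import reduce

@dataclass(frozen=True)
class Reading:
    sensor: str
    value: float

def compose(*fns):
    """Compose single-argument functions, applied left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

clean = compose(
    lambda r: replace(r, value=max(r.value, 0.0)),  # clamp negatives
    lambda r: replace(r, value=round(r.value, 2)),  # normalize precision
)

print(clean(Reading("temp", -3.14159)))  # Reading(sensor='temp', value=0.0)
```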

Wrote about some practical FP concepts that work well even in non-functional languages: https://borkar.substack.com/p/why-care-about-functional-programming?r=2qg9ny&utm_medium=reddit

Anyone else finding FP useful for data work?

/r/Python
https://redd.it/1kz6kx3
HELP-Struggling to Scale Django App for High Concurrency

Hi everyone,

I'm working on scaling my Django app and facing performance issues under load. I have 5-6 APIs that are hit concurrently by 300 users, making almost 1,800 requests at once. I've gone through a bunch of optimizations but I'm still seeing odd behavior.

# Tech Stack

- Django backend
- PostgreSQL (AWS RDS)
- Gunicorn with `gthread` worker class
- Nginx as reverse proxy
- Load testing with `k6` (to simulate 500 to 5,000 concurrent requests)
- Also tested with JMeter, which handles 2,000 requests without crashing

# Server Setup

Setup 1 (Current):

- 10 EC2 servers
- 9 Gunicorn `gthread` workers per server
- 30 threads per worker
- 4-core CPU per server

Setup 2 (Tested):

- 2 EC2 servers
- 21 Gunicorn `gthread` workers per server
- 30 threads per worker
- 10-core CPU per server

Note: No PgBouncer or DB connection pooling in use yet. RDS `max_connections` = 3476.
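
For reference, Setup 1 expressed as a gunicorn config file (gunicorn.conf.py) looks roughly like this; the numbers mirror the post, not a tuning recommendation:

```python
# 9 gthread workers x 30 threads = up to 270 concurrent requests per server.
bind = "0.0.0.0:8000"
worker_class = "gthread"
workers = 9
threads = 30
```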

# Load Test Scenario

- 5–6 APIs are hit concurrently by around 300 users, totaling approximately 1,800 simultaneous requests.
- Each API is I/O-bound, with 8–9 DB queries using annotate, aggregate, filter, and other Django ORM constructs, plus some CPU-bound logic.

/r/django
https://redd.it/1kzb8h0