Python Daily
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
Find more reddit channels over at @r_channels
Built a small PyPI Package for explainable preprocessing

I made a Python package that explains preprocessing with reports and plots

Note: This project started as a way for me to learn packaging and publishing on PyPI, but I thought it might also be useful for beginners who want not just preprocessing, but also clear reports and plots of what happened during preprocessing.


What my project does: It’s a simple ML preprocessing helper package called ml-explain-preprocess. Along with handling basic preprocessing tasks (missing values, encoding, scaling, and outliers), it also generates additional outputs to make the process more transparent:

Text reports

JSON reports

(Optional) visual plots of distributions and outliers
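To make the "transform plus report" idea concrete, here is a minimal stdlib-only sketch. It is not the ml-explain-preprocess API (the post does not show it); the function name `impute_with_report` and the report keys are made up for illustration.

```python
import json
from statistics import mean


def impute_with_report(rows, column):
    """Fill missing values in `column` with the mean and describe what was done.

    Hypothetical helper for illustration only, not part of ml-explain-preprocess.
    """
    present = [row[column] for row in rows if row[column] is not None]
    fill_value = mean(present)
    filled = [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]
    report = {
        "column": column,
        "strategy": "mean",
        "fill_value": fill_value,
        "missing_filled": len(rows) - len(present),
    }
    return filled, report


rows = [{"age": 20.0}, {"age": None}, {"age": 40.0}]
filled, report = impute_with_report(rows, "age")
print(json.dumps(report, indent=2))  # the JSON report that travels with the data
```

Returning the report next to the transformed rows keeps the explanation tied to the exact step that produced it.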


The idea was to make it easier for beginners not only to preprocess data but also to understand what happened during preprocessing, since I couldn’t find many libraries that provide clear reports or visualizations alongside transformations.

It’s nothing advanced and definitely not optimized for production-level pipelines, but it was a good exercise in learning how packaging works and how to publish to PyPI.

Target audience: beginners in ML who want preprocessing plus some transparency. Experts probably won’t find it very useful, but maybe it can help people starting out.

Comparison: To my knowledge, most existing libraries handle preprocessing well, but they don’t directly give reports/plots. This project tries

/r/Python
https://redd.it/1njb946
stop wrong ai answers in your django app before they show up: one tiny middleware + grandma clinic (beginner, mit)

hi folks, last time i posted about “semantic firewalls” and it was too abstract. this is the ultra simple django version that you can paste in 5 minutes.

**what this does in one line**
instead of fixing bad llm answers after users see them, we check the payload before returning the response. if there’s no evidence, we block it politely.

**before vs after**

* before: view returns a fluent answer with zero proof, users see it, you fix later
* after: view includes small evidence, middleware checks it, only stable answers go out

below is a minimal copy-paste. it works with any provider or local model because it’s just json discipline.

---

### 1) middleware: block ungrounded answers

`core/middleware.py`

```python
# core/middleware.py
import json
from typing import Callable
from django.http import HttpRequest, HttpResponse, JsonResponse

class SemanticFirewall:
    """
    minimal 'evidence-first' guard for AI responses.
    contract we expect from the view:
      { "answer": "...", "refs": [...], "coverage_ok": true }
    if refs is empty or coverage_ok is false or missing, we return 422.
    """

    def __init__(self, get_response: Callable):
        self.get_response = get_response

    def __call__(self, request: HttpRequest)
```
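The post is cut off before the body of `__call__`, so the author's actual check is not shown. Going only by the contract in the docstring, the decision itself could look roughly like this framework-free sketch. `check_grounding` and the error wording are assumptions, not the author's code; a middleware would call something like it on the decoded JSON body of the view's response.

```python
def check_grounding(payload: dict) -> tuple[int, dict]:
    """Apply the docstring's contract: block unless refs is non-empty
    and coverage_ok is present and true.

    Hypothetical helper; returns (http_status, body_to_send).
    """
    refs = payload.get("refs")
    has_refs = isinstance(refs, list) and len(refs) > 0
    covered = payload.get("coverage_ok") is True
    if not has_refs or not covered:
        # 422 is the status named in the docstring for ungrounded answers
        return 422, {"error": "answer blocked: missing evidence"}
    return 200, payload


grounded = {"answer": "42", "refs": ["doc.md#L10"], "coverage_ok": True}
no_refs = {"answer": "42", "refs": [], "coverage_ok": True}
no_flag = {"answer": "42", "refs": ["doc.md#L10"]}

print(check_grounding(grounded)[0])
print(check_grounding(no_refs)[0])
print(check_grounding(no_flag)[0])
```

Keeping the check as a plain function makes it easy to unit test without spinning up Django.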

/r/django
https://redd.it/1nj9mmu
[R] Need model/paper/code suggestion for document template extraction

I am looking to create a document template extraction pipeline for document similarity. One important thing I need to do as part of this is create a template mask. Essentially, say I have a collection of documents which all follow a similar format (imagine a form or a report). I want to

1. extract text from the document in a structured format (OCR but more like VQA type). About this, I have looked at a few VQA models. Some are too big, but I think this is a straightforward task.
2. (what I need help with) I want a model that, given a collection of documents or any one document, can generate a layout mask without the text (so, a template). I have looked at Document Analysis models, but most are centered around classifying different sections of the document into tables, paragraphs, etc. I have not come across a mask generation pipeline or model.

If anyone has encountered such a pipeline before or worked on document template extraction, I would love some help or links to papers.

/r/MachineLearning
https://redd.it/1njgjdd
BS4 vs xml.etree.ElementTree



Beautiful Soup or standard library (xml.etree.ElementTree)? I am building an ETL process for extracting notes from Evernote ENML. I hear BS4 is easier but standard library performs faster. This alone makes me want to stick with the standard library. Any reason why I should reconsider?
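For a sense of what the standard-library route looks like, here is a small sketch. The sample note is made up and deliberately simplified (no DOCTYPE, no HTML entities such as `&nbsp;`, which real ENML exports may contain and which the stdlib parser does not resolve on its own), so treat it as a starting point rather than a complete ENML reader.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified ENML-like note for illustration
SAMPLE_ENML = (
    "<en-note>"
    "<div>Buy <b>milk</b></div>"
    "<div>Call Bob</div>"
    '<en-todo checked="true"/>'
    "</en-note>"
)


def note_lines(enml: str) -> list[str]:
    """Return the text of each <div>, with inline markup flattened."""
    root = ET.fromstring(enml)
    return ["".join(div.itertext()).strip() for div in root.iter("div")]


def todo_count(enml: str) -> int:
    """Count <en-todo> elements anywhere in the note."""
    return len(ET.fromstring(enml).findall(".//en-todo"))


print(note_lines(SAMPLE_ENML))
print(todo_count(SAMPLE_ENML))
```

If the real notes turn out to be well-formed XML, this is about all the code the extract step needs; Beautiful Soup mostly earns its place when the markup is messy.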

/r/Python
https://redd.it/1njiy79
Where's a good place to find people to talk about projects?

I'm a hobbyist programmer, dabbling in coding for like 20 years now, but never anything professional minus a three month stint. I'm trying to work on a medium sized Python project but honestly, I'm looking to work with someone who's a little bit more experienced so I can properly learn and ask questions instead of being reliant on a hallucinating chat bot.

But where would be the best place to discuss projects and look for like minded folks?

/r/Python
https://redd.it/1njo1k2
Thursday Daily Thread: Python Careers, Courses, and Furthering Education!

# Weekly Thread: Professional Use, Jobs, and Education 🏢

Welcome to this week's discussion on Python in the professional world! This is your spot to talk about job hunting, career growth, and educational resources in Python. Please note, this thread is not for recruitment.

---

## How it Works:

1. Career Talk: Discuss using Python in your job, or the job market for Python roles.
2. Education Q&A: Ask or answer questions about Python courses, certifications, and educational resources.
3. Workplace Chat: Share your experiences, challenges, or success stories about using Python professionally.

---

## Guidelines:

- This thread is not for recruitment. For job postings, please see r/PythonJobs or the recruitment thread in the sidebar.
- Keep discussions relevant to Python in the professional and educational context.

---

## Example Topics:

1. Career Paths: What kinds of roles are out there for Python developers?
2. Certifications: Are Python certifications worth it?
3. Course Recommendations: Any good advanced Python courses to recommend?
4. Workplace Tools: What Python libraries are indispensable in your professional work?
5. Interview Tips: What types of Python questions are commonly asked in interviews?

---

Let's help each other grow in our careers and education. Happy discussing! 🌟

/r/Python
https://redd.it/1njtelc
Django forms with bootstrap styling, how do you do it?

I like using Bootstrap because it makes it easy to make a website responsive to different screen sizes. There are several libraries out there that provide some way to get the needed Bootstrap classes into forms while rendering.

However, every time I try one of these, I end up at a dead end. On a recent project I tried crispy forms. It seemed alright at first. The first thing that frustrated me: it turns out they put an entire layer of layouting on top, which is kinda clunky but workable. But then it is impossible to use a custom widget with a custom template. I just can't make crispy forms use the template of the custom widget.

So I was wondering if anyone has found a proper way to make forms include Bootstrap classes without a library that introduces as many new problems as it solves old ones.

/r/django
https://redd.it/1nk5kvy
Django static files not being collected/deployed on Railway (Docker + Whitenoise)

Hi,

I’m deploying a Django app on Railway using Docker and Whitenoise, and I keep hitting the same problem:
my app works, but all static files (CSS/JS/images) return 404s.

What I see in logs:

Starting Container
2025-09-18 17:00:33 +0000 1 INFO Starting gunicorn 23.0.0
2025-09-18 17:00:33 +0000 1 INFO Listening at: http://0.0.0.0:8080 (1)
2025-09-18 17:00:33 +0000 1 INFO Using worker: sync
2025-09-18 17:00:33 +0000 2 INFO Booting worker with pid: 2
/usr/local/lib/python3.12/site-packages/django/core/handlers/base.py:61: UserWarning: No directory at: /app/staticfiles/
  mw_instance = middleware(adapted_handler)
UserWarning: No directory at: /app/staticfiles/


And HTTP logs show things like:

GET /static/name/styles.css 404
GET /static/name/name.js 404
GET /static/image.png 404


My setup:

Dockerfile with `python:3.12-slim`
Whitenoise enabled:
STATIC_URL = "/static/"
STATIC_ROOT = os.path.join(BASE_DIR, "staticfiles")
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
Procfile originally had:
release: python manage.py migrate && python manage.py collectstatic --noinput
web: gunicorn project.wsgi
Tried switching to an **entrypoint.sh** script that

/r/django
https://redd.it/1nke0bb
[P] We built mmore: an open-source multi-GPU/multi-node library for large-scale document parsing

We are a student group from EPFL and we have been working on a tool called mmore, and thought it might be useful to share it here. Maybe the community will find it useful.

You can think of mmore as something in the spirit of [Docling](https://github.com/docling-project/docling), but designed from the ground up to run natively on multi-GPU and multi-node setups. As the backend OCR for PDFs (and images) we use [Surya](https://github.com/datalab-to/surya), which we’ve found to be both very accurate and fast. For those with limited GPU resources, we also provide a lightweight “fast” mode. It skips OCR (so it cannot process scanned files) but still works well for born-digital documents.

In a [paper](https://www.arxiv.org/pdf/2509.11937) we released a few months ago, we showed that mmore achieves both speed and accuracy gains over Docling (maybe this has changed by now with the latest Granite-Docling). Right now, it supports a broad range of formats: PDFs, DOCX, PPTX, XLSX, MD, EML (emails), TXT, HTML, as well as videos and audio (MP4, MOV, AVI, MKV, MP3, WAV, AAC).

The use cases are flexible. For example:

* Unlocking text and image data from previously unprocessed files, enabling larger dataset creation (similar to what Docling + HuggingFace did a few days ago

/r/MachineLearning
https://redd.it/1nkdbin
Error running app

Hello everyone, I am currently trying to implement OAuth with Google in a Flask app for a learning project. Building the normal auth modules with email and username works fine; however, as I refactor the code to work with OAuth using the Python oauthlib and requests modules, I am getting this error:

```bash
(.venv)daagi@fedora:~/Desktop/sandbox/oauth-primer$ python app.py
Usage: app.py [OPTIONS]
Try 'app.py --help' for help.

Error: While importing 'app', an ImportError was raised:

Traceback (most recent call last):
  File "/home/daagi/Desktop/sandbox/oauth-primer/.venv/lib64/python3.13/site-packages/flask/cli.py", line 245, in locate_app
    __import__(module_name)
    ~~~~~~~~~~^^^^^^^^^^^^^
  File "/home/daagi/Desktop/sandbox/oauth-primer/app.py", line 1, in <module>
    from website import create_app
ImportError: cannot import name 'create_app' from 'website' (consider renaming '/home/daagi/Desktop/sandbox/oauth-primer/website/__init__.py' if it has the same name as a library you intended to import)
```
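The message says the `website` package itself was found, but no name `create_app` existed in it at import time. Typical causes are that `website/__init__.py` does not define `create_app` (or spells it differently), or that a circular import hits the package before the function is defined. The post does not show `__init__.py`, so which one applies here is unknown. The stdlib-only sketch below reproduces the same kind of error with a throwaway package; `website_demo` and `import_factory` are made-up names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "website_demo"  # hypothetical stand-in for the post's `website` package


def import_factory(init_source: str) -> str:
    """Write init_source to a temporary package's __init__.py and try the import."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PACKAGE
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text(init_source)
        sys.path.insert(0, tmp)
        sys.modules.pop(PACKAGE, None)  # forget any earlier copy
        importlib.invalidate_caches()
        try:
            from website_demo import create_app  # same shape as app.py line 1
        except ImportError as exc:
            return str(exc)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(PACKAGE, None)
        return f"imported {create_app.__name__}"


# Empty __init__.py: the package imports, but the name is missing
print(import_factory(""))
# __init__.py that defines the factory: the import succeeds
print(import_factory("def create_app():\n    return 'app'\n"))
```

Checking what `website/__init__.py` actually exposes, for example with `python -c "import website; print(dir(website))"`, is a quick way to tell these cases apart.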

This is my file hierarchy:

```bash
.
├── app.py
├── LICENSE
├── oauth.log
├── __pycache__
│   └── app.cpython-313.pyc
├── README.md
├── requirements.txt
├── TODO.md
└── website
    ├── auth.py
    ├── database
    │   └── db.sql
    ├── db.py
    ├── __init__.py
    ├── models.py
    ├── oauth.py
    ├── __pycache__
    ├── static
    │   ├── style
    │   │   └── style.css
```


/r/flask
https://redd.it/1nk2otw
Friday Daily Thread: r/Python Meta and Free-Talk Fridays

# Weekly Thread: Meta Discussions and Free Talk Friday 🎙️

Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!

## How it Works:

1. Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community.
2. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community.
3. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting.

## Guidelines:

- All topics should be related to Python or the /r/python community.
- Be respectful and follow Reddit's Code of Conduct.

## Example Topics:

1. New Python Release: What do you think about the new features in Python 3.11?
2. Community Events: Any Python meetups or webinars coming up?
3. Learning Resources: Found a great Python tutorial? Share it here!
4. Job Market: How has Python impacted your career?
5. Hot Takes: Got a controversial Python opinion? Let's hear it!
6. Community Ideas: Something you'd like to see us do? Tell us.

Let's keep the conversation going. Happy discussing! 🌟

/r/Python
https://redd.it/1nkohvq
Today I learned that Python doesn't care about how many spaces you indent as long as it's consistent

Call me stupid for only discovering this after 6 years, but did you know that you can use as many spaces as you want to indent, as long as they're consistent within one indented block? For example, the following (awful) code block gives no error:

def say_hi(bye = False):
 print("Hi")
 if bye:
        print("Bye")
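A quick way to check this without trusting an editor's auto-formatting is to hand the source to `compile()` as a string. In the sketch below, `compiles` is just a helper name made up for the demo: the first snippet mixes indent widths between blocks and is accepted, the second dedents to a level that matches no enclosing block and is rejected.

```python
# One-space body, eight-space nested block: different widths, each block consistent
MIXED_WIDTHS = (
    "def say_hi(bye=False):\n"
    " print('Hi')\n"
    " if bye:\n"
    "         print('Bye')\n"
)

# Second statement dedents to two spaces, which matches neither 0 nor 4
INCONSISTENT = (
    "def f():\n"
    "    x = 1\n"
    "  y = 2\n"
)


def compiles(source: str) -> bool:
    """Return True if the source compiles, False on an indentation error."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except IndentationError:
        return False


print(compiles(MIXED_WIDTHS))
print(compiles(INCONSISTENT))
```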

/r/Python
https://redd.it/1nkidxq
enso: A functional programming framework for Python

Hello all, I'm here to make my first post and 'release' of my functional programming framework, enso. Right before I made this post, I made the repository public. You can find it here.

# What my project does

enso is a high-level functional framework that works on top of Python. It expands the existing Python syntax by adding a variety of features. It does so by altering the AST at runtime, expanding the functionality of a handful of built-in classes, and using a modified tokenizer which adds additional tokens for a preprocessing/translation step.

I'll go over a few of the basic features so that people can get a taste of what you can do with it.

1. Automatically curried functions!

How about the function add, which looks like

def add(x:a, y:a) -> a:
    return x + y

Unlike normal Python, where you would need to call add with 2 arguments, you can call this add with only one argument, and then call it with the other argument later, like so:

f = add(2)
f(2)
4
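For comparison, the same calling behavior can be approximated in plain Python with a decorator. This is only a sketch of the general currying idea; it says nothing about how enso does it (the post describes AST and tokenizer changes), and `curry` here is a made-up helper that handles positional arguments only.

```python
import inspect


def curry(func):
    """Return a version of func that accepts its positional arguments piecemeal."""
    arity = len(inspect.signature(func).parameters)

    def curried(*args):
        if len(args) >= arity:
            return func(*args)
        # Not enough arguments yet: remember them and wait for more
        return lambda *more: curried(*args, *more)

    return curried


@curry
def add(x, y):
    return x + y


f = add(2)
print(f(2))       # both arguments supplied across two calls
print(add(2, 2))  # supplying everything at once still works
```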

2. A map operator

Since functions are automatically curried, this

/r/Python
https://redd.it/1nksvm0
T-Strings: What will you do?

Good evening from my part of the world!

I'm excited about the new functionality we have in Python 3.14. I think the feature that has caught my attention the most is the introduction of t-strings.

I'm curious, what do you think will be a good application for t-strings? I'm planning to use them as better-formatted templates for a custom message pop-up in my homelab, taking information from different sources to format for display. Not reinventing any functionality, but certainly a cleaner and easier implementation for a message dashboard.

Please share your ideas below, I'm curious to see what you have in mind!

/r/Python
https://redd.it/1nkq8pt
FYI: PEP 2026 (CalVer) was shot down back in February - no jumping from 3.14.y to 3.25.y or 2025.x.y

PEP 2026 discussed replacing the current Semantic Versioning with Calendar Versioning, where some options were 26.x.y (where 26 was from 2026), or 3.26.y (because there's currently a yearly release, they would just shift the minor version about 10 points).

Luckily this idea was shot down, back in Feb, because I was NOT looking forward to having to mess around with versions.

---

I'm mentioning it because I recall a discussion back in January that they were going to do this, and quite a few people disliked the idea, so I'm happy to inform you that it's dead.

---

edit: It was shot down in this post

/r/Python
https://redd.it/1nl0x1p
Tools for working with Django in an API-first way

I want to start working with Django and DRF by defining the API first (API first). I write an OpenAPI definition in a YAML file, but I can't find good tools to check that my Django views comply with that contract.

/r/django
https://redd.it/1nl49u5
Best way to document my code?

Hi, I would like to cleanly document my Python+Flask code; this is my first time, so I'm looking for help.

For now I've been doing it in a javadoc style (see below), but I don't know if there are tools integrating it (VSCode integration, HTML doc generation, and other intelligent features). For instance, I'm seeing that Python's typing library allows features similar to @param and @return that are closer to the code, and that feels like a better idea than what I'm already doing.
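As an illustration of the "closer to the code" idea, here is a sketch that moves the types into annotations and keeps only the descriptions in the docstring, using reStructuredText-style `:param:` / `:return:` fields of the kind Sphinx understands. The function name, the `dict` return type, and the body are placeholders invented for the example, not the original method.

```python
import inspect
from typing import Optional, get_type_hints


def route_api_request(config_from_payload: Optional[dict] = None) -> dict:
    """Route an API request, optionally overriding the default config.

    :param config_from_payload: config dict, or None when no override is meant.
    :return: a response payload (placeholder dict in this sketch).
    """
    config = config_from_payload or {}
    return {"status": "ok", "overrides": sorted(config)}


# Tools can read both pieces without parsing free-form comments
hints = get_type_hints(route_api_request)
print(hints["return"])
print(inspect.getdoc(route_api_request).splitlines()[0])
```

Editors and type checkers pick up the annotations directly, while documentation generators can render the docstring fields.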

In short, what are the standards, and which tools make use of them?


Thanks in advance !



---



Example of what I'm doing currently and want to improve on:

def routeAPIRequest(self, configFromPayload):
        """
        @param configFromPayload a config dict, such as the output from processPayloadData()
                            can be None if no config override is meant

        @return Response    (meant to be transmitted in

/r/flask
https://redd.it/1nl4q4l