Python Daily
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
AI Impostor Game

/r/flask
https://redd.it/1pcsja2
Python-native mocking of realistic datasets by defining schemas for prototyping, testing, and demos

https://github.com/DavidTorpey/datamock

What my project does:
This is a piece of work I developed recently that I've found quite useful, so I decided to neaten it up and release it in case anyone else finds it useful.

It's useful when trying to mock structured data during development, for things like prototyping or testing. The declarative, schema-based approach feels Pythonic and intuitive (to me at least!).
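The repo has the real API; as a purely hypothetical sketch of what "declarative schema-based" generation means (none of these names come from datamock, this is stdlib-only illustration):

```python
import random
import string

# Hypothetical illustration of the declarative-schema idea -- NOT datamock's
# actual API. A schema maps field names to generator callables; mock()
# produces a list of records from it.
def rand_name(rng):
    return "".join(rng.choices(string.ascii_lowercase, k=8))

def mock(schema, n, seed=0):
    rng = random.Random(seed)  # seeded for reproducible test data
    return [{field: gen(rng) for field, gen in schema.items()} for _ in range(n)]

schema = {
    "name": rand_name,
    "age": lambda rng: rng.randint(18, 90),
}
rows = mock(schema, n=3)
print(len(rows))  # 3
```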

I may add more features if there's interest.


Target audience: a simple toy project I've decided to release.

Comparison:
Hypothesis and Faker are the closest things available in Python. However, Hypothesis is closely coupled with testing rather than generic data generation, and Faker focuses on generating individual instances, whereas datamock allows fields to be grouped so that data for more complex types can be expressed and generated more easily. Datamock, in fact, uses Faker under the hood for some of the field data generation.

/r/Python
https://redd.it/1pcrbn4
I listened to your feedback on my "Thanos" CLI. It’s now a proper Chaos Engineering tool.

Last time I posted `thanos-cli` (the tool that deletes 50% of your files), the feedback was clear: it needs to be safer and smarter to be actually useful.

People left surprisingly serious comments… so I ended up shipping **v2**.

It still “snaps,” but now it also has:

* weighted deletion (age / size / file extension)
* `.thanosignore` protection rules
* deterministic snaps with `--seed`
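Not the tool's actual code, but deterministic, size-weighted selection of half the files can be sketched with the standard library (names here are made up):

```python
import random

# Sketch only -- not thanos-cli's implementation. Picks half the items,
# biased toward larger "files", reproducibly for a given seed.
def snap(files, sizes, seed=42):
    rng = random.Random(seed)              # --seed: reproducible snaps
    k = len(files) // 2                    # perfectly balanced
    pool, weights = list(files), [sizes[f] for f in files]
    chosen = []
    for _ in range(k):
        # weighted pick without replacement: remove the chosen index each round
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
        weights.pop(pick)
    return chosen

files = ["a.log", "b.bin", "c.txt", "d.iso"]
sizes = {"a.log": 10, "b.bin": 500, "c.txt": 1, "d.iso": 2000}
print(snap(files, sizes))  # two of the four, biased toward the big ones
```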

So yeah — it accidentally turned into a mini chaos-engineering tool.


If you want to play with controlled destruction:

GitHub: [https://github.com/soldatov-ss/thanos](https://github.com/soldatov-ss/thanos)

Snap responsibly. 🫰

/r/Python
https://redd.it/1pd2cgw
Django + HTMX + jQuery. Do you know any websites / apps using this stack?

I am looking for example websites using this stack as it minimizes maintenance for most basic setups.

Let me know if you know any websites.

Of course, any ideas, criticism, and concerns about this stack are welcome, too.

/r/django
https://redd.it/1pcp15d
Blog: ReThinking Django Template #4: Server Side Component

This is #4 of my ReThinking Django Template series.

In this blog post, I will compare Django server-side component packages:

1. Django-Components
2. Django-ViewComponent
3. Cotton
4. Django-Slippers

After reading, you'll be able to pick the one that best fits your Django project.

ReThinking Django Template: Part 4, Server Side Component

/r/django
https://redd.it/1pd1dl9
My wife was manually copying YouTube comments, so I built this tool

I have built a Python Desktop application to extract YouTube comments for research and analysis.

My wife was doing this manually, and I couldn't see her going through the hassle of copying and pasting.

I posted it here in case someone is trying to extract YouTube comments.

What My Project Does

1. Batch process multiple videos in a single run
2. Basic spam filter to remove bot spam (crypto, phone numbers, "DM me", etc.)
3. Exports two clean CSV files - one with video metadata and another with comments (you can tie back the comments data to metadata using the "video_id" variable)
4. Sorts comments by like count. So you can see the high-signal comments first.
5. Stores your API key locally in a settings.json file.
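For point 3, tying the comments back to the metadata on "video_id" might look like this (column names are hypothetical; check the headers the tool actually writes):

```python
import csv
import io

# Inline stand-ins for the two exported CSV files; in practice you'd
# open() the files the tool wrote instead.
videos_csv = "video_id,title\nabc123,My Video\n"
comments_csv = "video_id,comment,likes\nabc123,Great!,42\n"

# Index the metadata by video_id, then attach the title to each comment row.
videos = {r["video_id"]: r for r in csv.DictReader(io.StringIO(videos_csv))}
joined = [
    {**r, "title": videos[r["video_id"]]["title"]}
    for r in csv.DictReader(io.StringIO(comments_csv))
]
print(joined[0]["title"])  # My Video
```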

By the way, I have used Google's Antigravity to develop this tool. I know Python fundamentals, so the development became a breeze.

Target Audience

Researchers, data analysts, or creators who need clean YouTube comment data. It's a working application anyone can use.

Comparison

Most browser extensions or online tools either have usage limits or require accounts. This application is a free, local, open-source alternative with built-in spam filtering.

Stack: Python, CustomTkinter for the GUI, YouTube Data API v3, Pandas

GitHub: https://github.com/vijaykumarpeta/yt-comments-extractor

Would love to hear your feedback or feature requests.

/r/Python
https://redd.it/1pd1g30
Pandas 3.0 release candidate tagged

After years of work, the Pandas 3.0 release candidate is tagged.

>We are pleased to announce a first release candidate for pandas 3.0.0. If all goes well, we'll release pandas 3.0.0 in a few weeks.

* Release candidate: [https://github.com/pandas-dev/pandas/releases/tag/v3.0.0rc0](https://github.com/pandas-dev/pandas/releases/tag/v3.0.0rc0)
* Full release notes: [https://pandas.pydata.org/docs/dev/whatsnew/v3.0.0.html](https://pandas.pydata.org/docs/dev/whatsnew/v3.0.0.html)
* Tracking issue: [https://github.com/pandas-dev/pandas/issues/57064](https://github.com/pandas-dev/pandas/issues/57064)

A very concise, incomplete list of changes:

#### String Data Type by Default
Previously, pandas represented text columns using NumPy's generic "object" dtype. Starting with pandas 3.0, string columns now use a dedicated "str" dtype (backed by PyArrow when available). This means:

* String columns are inferred as dtype "str" instead of "object"
* The str dtype only holds strings or missing values (stricter than object)
* Missing values are always NaN with consistent semantics
* Better performance and memory efficiency
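A quick way to see the change (a minimal sketch; what dtype you get depends on your pandas version):

```python
import pandas as pd

s = pd.Series(["spam", "eggs", None])
# Under pandas 3.0 this infers the new "str" dtype; on earlier versions
# you'll still see "object". Missing values are counted the same either way.
print(s.dtype)
print(s.isna().sum())  # 1
```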

#### Copy-on-Write Behavior
All indexing operations now consistently behave as if they return copies. This eliminates the confusing "view vs copy" distinction from earlier versions:

* Any subset of a DataFrame or Series always behaves like a copy
* The only way to modify an object is to directly modify that object itself
* "Chained assignment" no longer works (and the SettingWithCopyWarning is removed)
* Under the hood, pandas uses views for performance but copies when needed
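A small illustration of the copy behaviour (this particular case behaves the same on earlier pandas too, since boolean indexing already returns a copy there):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

subset = df[df["a"] > 1]       # any subset behaves like a copy
subset.loc[:, "b"] = 0         # modifies only the subset...
print(df["b"].tolist())        # [4, 5, 6] -- original untouched

df.loc[df["a"] > 1, "b"] = 0   # to change the original, modify it directly
print(df["b"].tolist())        # [4, 0, 0]
```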

#### Python and Dependency Updates
* Minimum Python version: 3.11
* Minimum NumPy version:

/r/Python
https://redd.it/1pdb5u2
Pyrefly now has built-in support for Pydantic

Pyrefly (Github) now includes built-in support for Pydantic, a popular Python library for data validation and parsing.

The only other type checker with special support for Pydantic is Mypy, via a plugin. Pyrefly has implemented most of the special behavior from the Mypy plugin directly in the type checker.

This means that Pyrefly users get improved static type checking and IDE integration when working with Pydantic models.

Supported features include:
- Immutable fields with ConfigDict
- Strict vs Non-Strict Field Validation
- Extra Fields in Pydantic Models
- Field constraints
- Root models
- Alias validation
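As a rough sketch of what the first and fourth items mean in practice (assuming Pydantic v2; `User` is a made-up model, not taken from the Pyrefly docs):

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class User(BaseModel):
    model_config = ConfigDict(frozen=True)  # immutable fields
    id: int
    name: str = Field(min_length=1)         # field constraint

u = User(id=1, name="Ada")
try:
    u.id = 2                                # flagged statically by Pyrefly...
except ValidationError:
    print("frozen: mutation rejected")      # ...and rejected at runtime too
```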

The integration is also documented on both the Pyrefly and Pydantic docs.

/r/Python
https://redd.it/1pdbw37
JustHTML: A pure Python HTML5 parser that just works.

Hi all! I just released a new HTML5 parser that I'm really proud of. Happy to get any feedback on how to improve it from the python community on Reddit.

I think the trickiest question is whether there is a "market" for a Python-only parser. Parsers are generally performance-sensitive, and Python just isn't the fastest language. This library parses the Wikipedia start page in 0.1s, so I think it's "fast enough", but I'm still unsure.

Anyways, I got HEAVY help from AI to write it. I directed it all carefully (which I hope shows), but GitHub Copilot wrote all the code. Still took months of work off-hours to get it working. Wrote down a short blog post about that if it's interesting to anyone: https://friendlybit.com/python/writing-justhtml-with-coding-agents/

What My Project Does

It takes a string of html, and parses it into a nested node structure. To make sure you are seeing exactly what a browser would be seeing, it follows the html5 parsing rules. These are VERY complicated, and have evolved over the years.

from justhtml import JustHTML

html = "<html><body><div id='main'><p>Hello, <b>world</b>!</p></div></body></html>"
doc = JustHTML(html)



/r/Python
https://redd.it/1pdgpmk
[D] Self-Promotion Thread

Please post your personal projects, startups, product placements, collaboration needs, blogs etc.

Please mention the payment and pricing requirements for products and services.

Please do not post link shorteners, link aggregator websites, or auto-subscribe links.

--

Any abuse of trust will lead to bans.

Encourage others who create new posts for questions to post here instead!

Thread will stay alive until next one so keep posting after the date in the title.

--

Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work without spamming the main threads.

/r/MachineLearning
https://redd.it/1pbxkt2
Thursday Daily Thread: Python Careers, Courses, and Furthering Education!

# Weekly Thread: Professional Use, Jobs, and Education 🏢

Welcome to this week's discussion on Python in the professional world! This is your spot to talk about job hunting, career growth, and educational resources in Python. Please note, this thread is not for recruitment.

---

## How it Works:

1. Career Talk: Discuss using Python in your job, or the job market for Python roles.
2. Education Q&A: Ask or answer questions about Python courses, certifications, and educational resources.
3. Workplace Chat: Share your experiences, challenges, or success stories about using Python professionally.

---

## Guidelines:

- This thread is not for recruitment. For job postings, please see r/PythonJobs or the recruitment thread in the sidebar.
- Keep discussions relevant to Python in the professional and educational context.

---

## Example Topics:

1. Career Paths: What kinds of roles are out there for Python developers?
2. Certifications: Are Python certifications worth it?
3. Course Recommendations: Any good advanced Python courses to recommend?
4. Workplace Tools: What Python libraries are indispensable in your professional work?
5. Interview Tips: What types of Python questions are commonly asked in interviews?

---

Let's help each other grow in our careers and education. Happy discussing! 🌟

/r/Python
https://redd.it/1pdkuig
ChatGPT says local storage of video files for FileField is not SEO friendly.

I have a FileField where I'm about to store video files; they're displayed in a video player on the product details page. However, ChatGPT says it's not best practice to store video files locally, and to use links or some sort of CDN instead. What's best practice when working with video files? I'm looking for the easier and cheaper option.
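The usual Django answer is to point the default file storage at an external backend rather than local disk, so `FileField` uploads land on object storage (often fronted by a CDN). A minimal sketch, assuming `django-storages` and `boto3` are installed and Django ≥ 4.2; the bucket name is a placeholder:

```python
# settings.py -- sketch only; credentials/region config omitted.
STORAGES = {
    "default": {
        # FileField uploads go to S3 instead of MEDIA_ROOT on local disk
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
        "OPTIONS": {"bucket_name": "my-video-bucket"},
    },
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```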

/r/djangolearning
https://redd.it/1pb8l93
anyID: A tiny library to generate any ID you might need

Been doing this side project in my free time. Why do we need to deal with so many libraries when we want to generate different IDs, or worse, write them from scratch? It got annoying, so I created AnyID: a lightweight Python lib that wraps the most popular ones in a single API. It can be used in prod, but for now it's under development.

Github: https://github.com/adelra/anyid

PyPI: https://pypi.org/project/anyid/

What My Project Does:

It can generate a wide range of IDs, like cuid2, snowflake, ulid, etc.

How to install it:

uv pip install anyid

How to use it:

from anyid import cuid, cuid2, ulid, snowflake, setup_snowflake_id_generator

# Generate a CUID
my_cuid = cuid()
print(f"CUID: {my_cuid}")

# Generate a CUID2
my_cuid2 = cuid2()
print(f"CUID2: {my_cuid2}")

# Generate a ULID
my_ulid = ulid()
print(f"ULID: {my_ulid}")

# For Snowflake, you need to set up the generator first
setup_snowflake_id_generator(worker_id=1, datacenter_id=1)


/r/Python
https://redd.it/1pdtpyx
Oauth2/oidc authentication vs authorization w/ Git and Google example.

Hi everyone,

Trying to grasp OAuth2 and OIDC, and wondering if someone wouldn't mind taking a look at this answer and helping me understand whether the Git example and the Google example each comprise authentication AND authorization, or only one or the other? And whichever each is, are they "OAuth2/OIDC compliant"?

https://stackoverflow.com/a/63107397 here the author describes one that Git uses and one that Google uses.

Thanks so much!

Edit: do both the Git and Google scenarios explained represent authentication and authorization? Or just one or the other?

/r/djangolearning
https://redd.it/1pa2var
After 3+ years as a software engineer… I finally built my personal website (and yes, I shamelessly copied Shudin’s design 😅)



So after writing thousands of lines of code, shipping products, fixing bugs that weren’t my fault (I swear), and pretending to understand cloud architecture diagrams…
I have finally achieved the ultimate developer milestone:

I built my personal website.
…3+ years later.
…and yes, I copied Shudin’s design layout like the absolute template-goblin I am.

Here it is if you wanna roast it, hire me, or tell me my spacing is off on mobile:
👉 https://etnik.vercel.app

Honestly, I don’t know what took longer:
• Understanding Kubernetes
• Explaining to my family what a software engineer does
• Or actually sitting down and building this website instead of saying “I’ll do it next weekend” for 150 weekends straight.

Anyway, enjoy my finally-born portfolio child.
Feedback, memes, insults (gentle ones pls), and improvements welcome. 😄

/r/flask
https://redd.it/1pdw98c
Introducing docu-crawler: A lightweight library for crawling documentation, with CLI support

Hi everyone!

I've been working on **docu-crawler**, a **Python** library that crawls documentation websites and converts them to Markdown. It's particularly useful for:

- Building offline documentation archives
- Preparing documentation data
- Migrating content between platforms
- Creating local copies of docs for analysis

**Key features:**
- Respects robots.txt and handles sitemaps automatically
- Clean HTML to Markdown conversion
- Multi-cloud storage support (local, S3, GCS, Azure, SFTP)
- Simple API and CLI interface
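Not docu-crawler's own code, but the robots.txt check it describes can be sketched with the standard library:

```python
from urllib.robotparser import RobotFileParser

# Parse an inline robots.txt; in a crawler you'd fetch it from the site first.
robots_txt = """\
User-agent: *
Disallow: /private/
"""
rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Check URLs against the rules before crawling them.
print(rp.can_fetch("*", "https://example.com/docs/page"))   # True
print(rp.can_fetch("*", "https://example.com/private/x"))   # False
```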

**Links:**
- PyPI: [https://pypi.org/project/docu-crawler/](https://pypi.org/project/docu-crawler/)
- GitHub: [https://github.com/dataiscool/docu-crawler](https://github.com/dataiscool/docu-crawler)

Hope it is useful for someone!

/r/Python
https://redd.it/1pdsvon
Built an open-source app to convert LinkedIn -> Personal portfolio generator using FastAPI backend

I was always too lazy to build and deploy my own personal website. So, I built an app to convert a LinkedIn profile (via PDF export) or GitHub profile into a personal portfolio that can be deployed to Vercel in one click.

Here are the details required for the showcase:

What My Project Does

It is a full-stack application where the backend is built with Python FastAPI.

1. Ingestion: It accepts a LinkedIn PDF export, fetches projects using a GitHub username, or uses a resume PDF.
2. Parsing: I wrote a custom parsing logic in Python that extracts the raw text and converts it into structured JSON (Experience, Education, Skills).
3. Generation: This JSON is then used to populate a Next.js template.
4. AI Chat Integration: It also injects this structured data into a system prompt, allowing visitors to "chat" with the portfolio. It is like having an AI-twin for viewers/recruiters.
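The parsing step (2) can be sketched generically; this is a toy illustration with made-up input, not the project's actual logic:

```python
import json

# Toy sketch: split raw resume text into sections keyed by known headers.
raw = """Experience
Software Engineer at Acme
Education
BSc Computer Science
Skills
Python, FastAPI"""

sections = {}
current = None
for line in raw.splitlines():
    if line.strip() in ("Experience", "Education", "Skills"):
        current = line.strip()        # start a new section at each header
        sections[current] = []
    elif current:
        sections[current].append(line.strip())

print(json.dumps(sections))  # structured JSON ready to populate a template
```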

The backend is containerized and deployed on Azure App Containers, using Firebase for the database.

Target Audience

This is meant for Developers, Students, and Job Seekers who want a professional site but don't want to spend days coding it from scratch. It is open source, so you are free to clone it, customize it, and run

/r/Python
https://redd.it/1pe1cm1
mail and sms tool with templates, multiple backends, and daily limits.

I'm thinking I'll have to add my own app on top of some existing apps.

My site is sports related. It sends out updates to interested parties at certain times. For instance, when a player is invited to join a team, when the location or time of a game they care about changes, when scores are posted, etc.

We also send occasional newsletters. They usually go to a subset of members and we have filters for selecting users.

I would like users to be able to decide how to receive these updates. SMS messages could contain the entire message or a link to it; emails would always contain the complete content. This would just be part of the user profile.

We have three backends: one email backend that is fast and reliable, a second email backend for bulk mail, and SMS. Our reliable backend has a daily limit, so I'd like the system to warn me if we ever get close to that limit (not a must-have).
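The daily-limit warning could be sketched backend-agnostically; everything here (names, the 0.9 threshold) is made up for illustration:

```python
from datetime import date

# Wraps any send function and warns when today's volume nears a daily cap.
class LimitedSender:
    def __init__(self, send_fn, daily_limit, warn_ratio=0.9):
        self.send_fn = send_fn
        self.daily_limit = daily_limit
        self.warn_ratio = warn_ratio
        self.sent_today = 0
        self.day = date.today()

    def send(self, message):
        if date.today() != self.day:          # reset the counter each day
            self.day, self.sent_today = date.today(), 0
        self.sent_today += 1
        self.send_fn(message)
        if self.sent_today >= self.daily_limit * self.warn_ratio:
            print(f"warning: {self.sent_today}/{self.daily_limit} sent today")

outbox = []
sender = LimitedSender(outbox.append, daily_limit=10)
for i in range(9):
    sender.send(f"msg {i}")                   # warns on the 9th send
```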

I would also like messages to be template-driven and to support multiple languages. They would need both HTML and TXT formatting.

Finally, I'd like

/r/django
https://redd.it/1pdrsxw