Python Daily
2.57K subscribers
1.48K photos
53 videos
2 files
38.9K links
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
Find more reddit channels over at @r_channels
Download Telegram
Query and Eval for Python Polars

I am a longtime pandas user. I hate typing when it comes to slicing and dicing the dataframe. Pandas query and eval come to the rescue.

On the other hand, pandas suffers from the performance and memory issue as many people have discussed. Fortunately, Polars comes to the rescue. I really enjoy all the performance improvements and the lazy frame just makes it possible to handle large dataset with a 32G memory PC.

However, with all the good things about Polars, I still miss the query and eval function of pandas, especially when it comes to data exploration. I just don’t like typing so many pl.col in a chained conditions or pl.when otherwise in nested conditions.

Without much luck with existing solutions, I implemented my own version of query, eval among other things. The idea is using lark to define a set of grammars so that it can parse any string expressions to polars expression.

For example,
“1 < a <= 3” is translated to (pl.col(‘a’)> 1) & (pl.col(‘a’)<=3), “a.sum().over(‘b’)” is translated to pl.col(‘a’).sum().over(‘b’), “ a in @A” where A is a list, is translated to pl.col(‘a’).isin(A), “‘2010-01-01’ <= date < ‘2019-10-01’” is translated accordingly for date time columns. For my

/r/Python
https://redd.it/1kmy3xm
Refinedoc - Little text processing lib

Hello everyone!

I'm here to present my latest little project, which I developed as part of a larger project for my work.

What's more, the lib is written in pure Python and has no dependencies other than the standard lib.

What My Project Does

It's called Refinedoc, and it's a little python lib that lets you remove headers and footers from poorly structured texts in a fairly robust and normally not very RAM-intensive way (appreciate the scientific precision of that last point), based on this paper https://www.researchgate.net/publication/221253782\_Header\_and\_Footer\_Extraction\_by\_Page-Association

I developed it initially to manage content extracted from PDFs I process as part of a professional project.



When Should You Use My Project?

The idea behind this library is to enable post-extraction processing of unstructured text content, the best-known example being pdf files. The main idea is to robustly and securely separate the text body from its headers and footers which is very useful when you collect lot of PDF files and want the body oh each.


Comparison

I compare it with pymuPDF4LLM wich is incredible but don't allow to extract specifically headers and footers and the license was a problem in my case.



I'd be delighted to hear your feedback on the code or lib as such!



https://github.com/CyberCRI/refinedoc

/r/Python
https://redd.it/1kn4lfx
PyTorch vs. Keras/Tensorflow D

Hey guys,

I am aware of the intended use cases, but I am interested to learn what you use more often in your projects. PyTorch or Keras and why?

/r/Python
https://redd.it/1kn4132
R AlphaEvolve: A coding agent for scientific and algorithmic discovery

Paper: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf

Abstract:

In this white paper, we present AlphaEvolve, an evolutionary coding agent that substantially enhances
capabilities of state-of-the-art LLMs on highly challenging tasks such as tackling open scientific problems
or optimizing critical pieces of computational infrastructure. AlphaEvolve orchestrates an autonomous
pipeline of LLMs, whose task is to improve an algorithm by making direct changes to the code. Using
an evolutionary approach, continuously receiving feedback from one or more evaluators, AlphaEvolve
iteratively improves the algorithm, potentially leading to new scientific and practical discoveries. We
demonstrate the broad applicability of this approach by applying it to a number of important computational problems. When applied to optimizing critical components of large-scale computational
stacks at Google, AlphaEvolve developed a more efficient scheduling algorithm for data centers, found
a functionally equivalent simplification in the circuit design of hardware accelerators, and accelerated the training of the LLM underpinning AlphaEvolve itself. Furthermore, AlphaEvolve discovered
novel, provably correct algorithms that surpass state-of-the-art solutions on a spectrum of problems
in mathematics and computer science, significantly expanding the scope of prior automated discovery
methods (Romera-Paredes et al., 2023). Notably, AlphaEvolve developed a search algorithm that found a
procedure to multiply two 4 × 4 complex-valued matrices using 48 scalar multiplications; offering the
first improvement, after 56 years, over Strassen’s algorithm in this setting.

/r/MachineLearning
https://redd.it/1kmxi4z
I built an Interactive reStructuredText Tutorial that runs entirely in your browser

Hey r/Python!

I wanted to share a project I've been working on: an Interactive reStructuredText Tutorial.

What My Project Does

It's a web-based, hands-on tutorial designed to teach reStructuredText (reST), the markup language used extensively in Python documentation (like Sphinx, docstrings, etc.). The entire tutorial, including the reST rendering, runs directly in your browser using PyScript and Pyodide.

You get a lesson description on one side and an interactive editor on the other. As you type reST in the editor, you see the rendered HTML output update instantly. It covers topics from basic syntax and inline markup to more complex features like directives, roles, tables, and figures.

There's also a separate Playground page for free-form experimentation.

Why I Made It

While the official reStructuredText documentation is comprehensive, I find that learning markup languages is often easier with immediate, interactive feedback. I wanted to create a tool where users could experiment with reST syntax and see the results without needing any local setup. Building it with PyScript was also a fun challenge to see how much could be done directly in the browser with Python.

Target Audience

This is for anyone who needs to learn or brush up on reStructuredText:

Python developers writing documentation or docstrings.
Users of Sphinx or

/r/Python
https://redd.it/1kn6ysa
R AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms

> Large language models (LLMs) are remarkably versatile. They can summarize documents, generate code or even brainstorm new ideas. And now we’ve expanded these capabilities to target fundamental and highly complex problems in mathematics and modern computing.
Today, we’re announcing AlphaEvolve, an evolutionary coding agent powered by large language models for general-purpose algorithm discovery and optimization. AlphaEvolve pairs the creative problem-solving capabilities of our Gemini models with automated evaluators that verify answers, and uses an evolutionary framework to improve upon the most promising ideas.
AlphaEvolve enhanced the efficiency of Google's data centers, chip design and AI training processes — including training the large language models underlying AlphaEvolve itself. It has also helped design faster matrix multiplication algorithms and find new solutions to open mathematical problems, showing incredible promise for application across many areas.


For all the Evolutionary Algorthim fans out there, here's a really interesting paper that Deepmind published where they show AlphaEvolve designing advanced algorithms like improving matrix multiplication (which is a big deal in ML optimization)

Paper link: https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/

Interview with team:
https://youtu.be/vC9nAosXrJw?si=rzZSorXqgbqChFJa

/r/MachineLearning
https://redd.it/1kmzpg0
flask wiki got a new server to run the website.

/r/flask
https://redd.it/1kn088t
Python for Good - Save the Date!

Hey Pythonistas!

Do you:

Get excited about writing Python code?
Want to use your skills for some serious good in the world?
Interested in hanging out with the coolest, kindest, most awesome people in the Python community?
Want to make dozens of new close friends?

If you're nodding enthusiastically right now, block off August 28-31st for Python for Good! Registration opens June 1st, but we wanted to give you a heads-up so you can plan accordingly!

Never heard of Python for Good? Python for Good operates year round but the event is basically summer camp for nerds! And it's ALL-INCLUSIVE (yes, you read that right) - lodging, meals, everything - at a gorgeous retreat space overlooking the Pacific Ocean. By day, we code for awesome causes. By night? We unleash our inner geeks with board games, nature hikes, campfire s'mores, epic karaoke battles, and other community building activities!

This is definitely NOT a hackathon. We work on real problems from real nonprofits (who'll be right there with us!), creating or contributing to existing open source solutions that will continue to make a difference long after the event wraps up.

Sounds like fun? Or maybe something your company would love to support? Hit us up!

/r/Python
https://redd.it/1knhkex
Friday Daily Thread: r/Python Meta and Free-Talk Fridays

# Weekly Thread: Meta Discussions and Free Talk Friday 🎙️

Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!

## How it Works:

1. Open Mic: Share your thoughts, questions, or anything you'd like related to Python or the community.
2. Community Pulse: Discuss what you feel is working well or what could be improved in the /r/python community.
3. News & Updates: Keep up-to-date with the latest in Python and share any news you find interesting.

## Guidelines:

All topics should be related to Python or the /r/python community.
Be respectful and follow Reddit's Code of Conduct.

## Example Topics:

1. New Python Release: What do you think about the new features in Python 3.11?
2. Community Events: Any Python meetups or webinars coming up?
3. Learning Resources: Found a great Python tutorial? Share it here!
4. Job Market: How has Python impacted your career?
5. Hot Takes: Got a controversial Python opinion? Let's hear it!
6. Community Ideas: Something you'd like to see us do? tell us.

Let's keep the conversation going. Happy discussing! 🌟

/r/Python
https://redd.it/1knn8l8
Better Pythonic Thinking

I've been using Python for a while, but I still find myself writing it more like JS than truly "Pythonic" code. I'm trying to level up how I think in Python.

Any tips, mindsets, patterns, or cheat sheets that helped you make the leap to more Pythonic thinking?

/r/Python
https://redd.it/1knff06
Jinja2

what is Jinja2 template

explain it or any source or youtube video.

/r/flask
https://redd.it/1kn3rhw
python and Flask

I am using Python with Flask to create a secure login portal. Since I have a QA exam, could you tell me what theory and practical questions the QA team might ask?

/r/flask
https://redd.it/1kmhiri
What network/data analysis projects are you building in Python?

I've been working on some tools to analyze detailed API performance data — things like latency, error rates, and concurrency patterns from load tests, mostly using Python, pandas, and notebooks.

Got me wondering: what kinds of network-related data projects are people building these days?

Always up for swapping ideas — or just learning what’s out there.

/r/Python
https://redd.it/1knvl0u
Hi there, I'm new here and I need a partner to learn Django with through discord



/r/djangolearning
https://redd.it/1knv0m8
D presenting a paper virtually in ACL findings - should we?

Hi everyone.

Our paper (mine and colleagues) has been accepted to ACL findings. This is the first paper of mine that got accepted, so i am very excited and happy.

ACL findings papers are not required to be presented. They give you an option to present it, and if you choose to present it you can do it in person or virtually.

Unfortunately none of us are able to do it in person and fly to the conference. So the question becomes "is it worth it to present it virtually?".

I would love to hear what people think and experiences you had when presenting virtually.

Thanks.

/r/MachineLearning
https://redd.it/1knvsib
Which library would you choose Pygame or Arcade?

which library would you guys choose if making a game similar to mini millitia for steam, i see both libraries are good and have community support also , but still which one would you choose or if any other options , do comment

/r/Python
https://redd.it/1knwiyt
P TTSDS2 - Multlingual TTS leaderboard

A while back, I posted about my TTS evaluation metric TTSDS, which uses an ensemble of perceptually motivated, FID-like scores to objectively evaluate synthetic speech quality. The original thread is here, where I got some great feedback:
https://www.reddit.com/r/MachineLearning/comments/1e9ec0m/p\_ttsds\_benchmarking\_recent\_tts\_systems/

Since then, I've finally gotten around to updating the benchmark. The new version—TTSDS2—is now multilingual, covering 14 languages, and generally more robust across domains and systems.

Leaderboard: ttsdsbenchmark.com#leaderboard
📄 Paper: https://arxiv.org/abs/2407.12707

The main idea behind TTSDS2 is still the same: FID-style (distributional) metrics can work well for TTS, but only if we use several of them together, based on perceptually meaningful categories/factors. The goal is to correlate as closely as possible with human judgments, without having to rely on trained models, ground truth transcriptions, or tuning hyperparameters. In this new version, we get a Spearman correlation above 0.5 with human ratings in every domain and language tested, which none of the other 16 metrics we compared against could do.

I've also put in place a few infrastructure changes. The benchmark now reruns automatically every quarter, pulling in new systems published in the previous quarter. This avoids test set contamination. The test sets themselves are also regenerated periodically using a reproducible pipeline. All TTS

/r/MachineLearning
https://redd.it/1knwaf7