Python Daily
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
datamule: download, parse, and construct structured datasets from SEC filings

Link: https://github.com/john-friedman/datamule-python

# What my project does

1. Download SEC filings quickly. (Bulk downloads are also available; the benchmark is ~2 min/year for every 10-K/10-Q since 2001.)
2. Parse SEC filings quickly. (Currently only 8-K, 13F-HR Information tables are implemented. 10-K/10-Q coming next week)
3. Convert SEC textual filings directly into structured datasets.
4. Watch for new filings.
5. Has a basic tool calling chatbot with artifacts. Doesn't do anything useful yet, but was fun to make.

# Target Audience

Grad students looking to save money on expensive datasets, quants with side projects, software engineers looking to build commercial projects, and WSB people trying fun new trading strategies. In the future I'd like to make the chatbot code a bit cleaner so it can be used as a tutorial project for masters students w/ finance but not programming experience.

# Comparison

Getting SEC data in bulk is surprisingly expensive. Parsed SEC data is even more expensive. Derived datasets, such as board of directors data, are also expensive (something like 35k/license).

# Contribution

Greatly appreciated. Also SEC feature requests + QoL suggestions are very useful.

Links: https://github.com/john-friedman/datamule-python

/r/Python
https://redd.it/1gc7yac
I created a Django rest framework package for MFA/2FA



I'm excited to announce the release of drf-totp, a package that brings Time-Based One-Time Password (TOTP) Multi-Factor Authentication (MFA) to the Django Rest Framework.

What My Project Does

drf-totp provides a simple and secure way to add an extra layer of authentication to your API endpoints, protecting your users' accounts from unauthorized access. With this package, you can easily integrate TOTP MFA into your Django Rest Framework project, supporting popular authenticator apps like Google Authenticator and Authy.
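
For readers unfamiliar with how TOTP codes are derived, here is a minimal standard-library sketch of the algorithm from RFC 6238 (built on HOTP, RFC 4226). It is not drf-totp's API or implementation; the function names `hotp`, `totp`, and `verify` are hypothetical, and a real deployment would also need secure secret storage and rate limiting.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, for_time=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the number of elapsed time steps."""
    if for_time is None:
        for_time = time.time()
    return hotp(secret, int(for_time) // step, digits)


def verify(secret: bytes, candidate: str, for_time=None, step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side (clock drift)."""
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // step
    for delta in range(-window, window + 1):
        if counter + delta < 0:
            continue
        # compare_digest avoids leaking information through comparison timing.
        if hmac.compare_digest(hotp(secret, counter + delta, len(candidate)), candidate):
            return True
    return False
```

Authenticator apps usually receive the secret base32-encoded inside an `otpauth://` URI; the sketch above works on the raw bytes.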

Key Features

1. Easy integration with Django Rest Framework
2. Supports popular authenticator apps like Google Authenticator and Authy

Target Audience

drf-totp is designed for developers and teams building secure API-based applications with Django Rest Framework. This package is suitable for production environments and can be used to add an extra layer of security to existing projects or new applications.

Comparison

While there are other MFA solutions available for Django, drf-totp is specifically designed for the Django Rest Framework and provides a seamless integration experience. Unlike other solutions that may require extensive configuration or customization, drf-totp is easy to set up and use, making it an ideal choice for developers who want to add TOTP MFA to their API endpoints quickly and securely.


Check out the GitHub repo for installation instructions and examples.

/r/Python
https://redd.it/1gcl0hk
Every unicode character can be a variable name in globals and locals

Hello. While reading about the walrus operator, I saw φ used as a variable. That defied my knowledge (_, a-z, A-Z, 0-9), and I thought "if φ is valid, why isn't 🍆?".

After a bit of trial and error, I came up with this.

initial = 127810
for i in range(10):
    variable = chr(initial + i)
    locals()[variable] = f"Value of {variable} is {ord(variable)}"
print(locals().get("🍆"))

Getting

Value of 🍆 is 127814

Therefore, 🍆 can be a variable in Python (in globals and locals). So can horizontal tab, backspace, the null character, and so on. Of course, they are not accessible in code the same way as φ or hello_world, but it's still a nice gimmick. I hope you find it fun and/or useful.

But now the real question. In this context, do you know if using backspace or null as a variable name in globals could break the program at run time? Thank you.
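
A small sketch of the distinction described above: `globals()` is an ordinary dict, so it accepts any string as a key, while only strings for which `str.isidentifier()` is true can be written as names in source code. The list of sample names is an illustrative assumption.

```python
# Any string works as a dict key in globals(), identifier or not.
names = ["φ", "hello_world", chr(127814), "\t", "\x00"]
for name in names:
    globals()[name] = f"stored under {name!r}"

# Only real identifiers can be referenced by name in source code.
is_identifier = {name: name.isidentifier() for name in names}

for name, ok in is_identifier.items():
    print(f"{name!r}: usable as a name in code = {ok}")
```

The non-identifier keys stay reachable only through dict-style access such as `globals()["\x00"]`.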



/r/Python
https://redd.it/1gc2gmg
Hosting my Flask application - selecting a provider?

I'm currently looking to host my Flask application that is completely finished and just needs to go online, but as it is my first project that is actually going online I'm looking for some guidance with selecting a provider.

The app is a statistics application that I built for a company. It's a fairly basic Flask application with upwards of 8 .py scripts, a .json dataset, and some web templates, images and .css files. Everything is running smoothly on the built-in development server, so I'm hoping it will continue to do so once hosted properly.

Security is a concern (if that matters when it comes to selecting the provider) as the application uses developer keys and some other credentials (that I've done all I can to secure within the app itself). I will need to install a log-in system of some sort so if any provider can make that easy that would be a major advantage.

Hoping for some pointers or just to hear some experiences with different providers - and thanks in advance :-)

T

/r/flask
https://redd.it/1gay069
[P] Shape-restricted regression with neural networks

Some time ago at work we had to enforce that our model learns an increasing function of a feature. For example, the probability of winning an auction as a function of the bid should increase. Recently, I encountered the paper https://arxiv.org/abs/2209.04476 on regression with shape-restricted functions, and wanted to make it a bit more tangible, with actual code that trains such a model.

So it resulted in a blog post: https://alexshtf.github.io/2024/10/14/Shape-Restricted-Models.html
There's also a notebook with the accompanying code: https://github.com/alexshtf/alexshtf.github.io/blob/master/assets/shape_constrained_models.ipynb

I used to work on ads quite a lot, so such models seem useful in this industry: predicting the probability of winning an ad auction given the bid. I hope it's also useful elsewhere.

So I hope you'll enjoy it! It's a bit 'mathy', but you know, it can't be otherwise.
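
To make the "increasing function of a feature" constraint concrete, here is a minimal pure-Python sketch. It is not the method from the linked paper or blog post; it is a simpler, commonly used construction (a sum of hinge functions with slopes forced non-negative through softplus), and all names and parameter values are hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); always positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def monotone_predict(x: float, bias: float, raw_slopes, knots) -> float:
    """Piecewise-linear function of x that cannot decrease.

    Each term softplus(raw) * max(x - knot, 0) is non-decreasing in x,
    so the sum is non-decreasing whatever values the raw parameters take.
    """
    y = bias
    for raw, knot in zip(raw_slopes, knots):
        y += softplus(raw) * max(x - knot, 0.0)
    return y


# Hypothetical parameters, as an optimizer might leave them after training.
raw_slopes = [-2.0, 0.5, 3.0, -0.7]
knots = [0.0, 1.0, 2.0, 3.0]
bids = [i / 10 for i in range(0, 50)]
scores = [monotone_predict(b, -1.0, raw_slopes, knots) for b in bids]
```

Because the constraint lives in the parameterization, unconstrained gradient descent on `raw_slopes` can never produce a decreasing function. Passing the score through a sigmoid to get a probability keeps it increasing.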

/r/MachineLearning
https://redd.it/1gcpl03
🎉 Introducing dj-announcement-api package 🎉

We're thrilled to announce the release of `dj-announcement-api`, a versatile Django package developed by Lazarus to simplify and optimize the management and distribution of announcements through a robust API.

# Key Features

Full, Optimizable API: Manage announcements programmatically with an API designed for high performance and scalability.
Targeted Announcements: Create detailed, categorized announcements directed at specific user audiences.
Auto-Assign Audiences: Automatically assign users to relevant audiences for seamless, targeted communication.
Scheduling Options: Schedule announcements with customizable publication and expiration dates to deliver information at the right time.

Ideal for modern Django applications with dynamic needs, dj-announcement-api brings flexibility, scalability, and ease of use for any project needing streamlined announcement management.

Check it out on PyPI: dj-announcement-api on PyPI
Source Code and Docs on GitHub: dj-announcement-api on GitHub

/r/django
https://redd.it/1gch3py
[D] Train on full dataset after cross-validation? Semantic segmentation

I am currently working on a semantic segmentation project of oat leaf disease symptoms. The dataset is quite small, 16 images. Due to time constraints, I won't be able to extend this.

I am currently training 3 models, 3 backbones, and 3 losses--using 5-fold cross validation and grid search.

Once this is done, I plan to then run cross validation on a few different levels of augmentations per image.

My question is this:

Once I have established the best model, backbone, loss, and augmentation combination, can I train on the full dataset since it is so small? If I can do this, how do I know when to stop training to prevent overfitting but still adequately learn the data?
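
One common heuristic for the stopping question, sketched here as an assumption rather than a definitive answer: record the epoch with the lowest validation loss in each cross-validation fold, then train the final full-dataset model for roughly the median of those epochs. The loss curves below are made-up placeholder numbers, not results from this project.

```python
import statistics


def best_epoch(val_losses) -> int:
    """1-based epoch at which the validation loss was lowest."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1


# Hypothetical per-fold validation loss curves from 5-fold cross-validation.
fold_val_losses = [
    [0.90, 0.70, 0.60, 0.65, 0.70, 0.75],
    [0.95, 0.80, 0.66, 0.61, 0.64, 0.70],
    [0.88, 0.72, 0.63, 0.66, 0.69, 0.74],
    [0.92, 0.78, 0.70, 0.64, 0.60, 0.62],
    [0.91, 0.75, 0.67, 0.62, 0.65, 0.68],
]

best_epochs = [best_epoch(curve) for curve in fold_val_losses]
# Epoch budget for the final run on all images, with no validation set left.
epoch_budget = statistics.median(best_epochs)
print(best_epochs, epoch_budget)
```

Since the full dataset is larger than each training fold, some people scale the budget up slightly. With only 16 images the fold-to-fold spread in `best_epochs` is worth inspecting before trusting a single number.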

I have attached an image of some results so far.

https://preview.redd.it/sx394c58l5xd1.png?width=2000&format=png&auto=webp&s=3cefbf5c84bf3fbf48936c47810c4e3039dcb410

Thanks for any help you can provide!

/r/MachineLearning
https://redd.it/1gct22r
Configuration format

I currently use JSON files for storing my configurations, and a colleague recommended YAML instead. I tried it out, and it looks decent. Big fan of the ability to write comments. I want to switch, but wanted to get opinions on the pros and cons in terms of file size, time taken to read/write, and the stability of the corresponding Python libraries.

My typical production JSONs are ~50 MB. During the research phase, they can be up to ~500 MB before pruning.
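
A quick way to get numbers for your own data is a small round-trip benchmark like the sketch below. It assumes PyYAML (`yaml.safe_dump` / `yaml.safe_load`) as the YAML library and skips that half if it is not installed; the sample config is a made-up stand-in for a real file.

```python
import json
import time


def benchmark(dumps, loads, data) -> dict:
    """Serialize and parse `data` once, recording size and wall-clock times."""
    start = time.perf_counter()
    text = dumps(data)
    write_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = loads(text)
    read_s = time.perf_counter() - start

    return {
        "bytes": len(text.encode("utf-8")),
        "write_s": write_s,
        "read_s": read_s,
        "round_trip_ok": restored == data,
    }


# Hypothetical config; replace with a real production file for meaningful numbers.
sample = {
    "experiments": [
        {"id": i, "batch_size": 32 + i, "tags": ["baseline", "v2"]}
        for i in range(1000)
    ]
}

results = {"json": benchmark(json.dumps, json.loads, sample)}

try:
    import yaml  # PyYAML, a third-party package (assumption: it may not be installed)
except ImportError:
    yaml = None

if yaml is not None:
    results["yaml"] = benchmark(yaml.safe_dump, yaml.safe_load, sample)

for name, stats in results.items():
    print(name, stats)
```

Running it on a real 50 MB file rather than the toy sample gives a more honest picture, since parse time is what tends to dominate at that size.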

/r/Python
https://redd.it/1gcq5rg
Sunday Daily Thread: What's everyone working on this week?

# Weekly Thread: What's Everyone Working On This Week? 🛠️

Hello /r/Python! It's time to share what you've been working on! Whether it's a work-in-progress, a completed masterpiece, or just a rough idea, let us know what you're up to!

## How it Works:

1. Show & Tell: Share your current projects, completed works, or future ideas.
2. Discuss: Get feedback, find collaborators, or just chat about your project.
3. Inspire: Your project might inspire someone else, just as you might get inspired here.

## Guidelines:

Feel free to include as many details as you'd like. Code snippets, screenshots, and links are all welcome.
Whether it's your job, your hobby, or your passion project, all Python-related work is welcome here.

## Example Shares:

1. Machine Learning Model: Working on a ML model to predict stock prices. Just cracked a 90% accuracy rate!
2. Web Scraping: Built a script to scrape and analyze news articles. It's helped me understand media bias better.
3. Automation: Automated my home lighting with Python and Raspberry Pi. My life has never been easier!

Let's build and grow together! Share your journey and learn from others. Happy coding! 🌟

/r/Python
https://redd.it/1gcygl6
[D] Self-Promotion Thread

Please post your personal projects, startups, product placements, collaboration needs, blogs etc.

Please mention the payment and pricing requirements for products and services.

Please do not post link shorteners, link aggregator websites, or auto-subscribe links.

--

Any abuse of trust will lead to bans.

Encourage others who create new posts for questions to post here instead!

Thread will stay alive until next one so keep posting after the date in the title.

--

Meta: This is an experiment. If the community doesn't like this, we will cancel it. This is to encourage those in the community to promote their work without spamming the main threads.

/r/MachineLearning
https://redd.it/1gd0v8r
Moving to Canada Soon! Looking to Connect with Fellow Django Devs

Hey everyone! I'm moving to Canada in about four months, and I'm excited to connect with people working with Django. I have around three years of experience building large-scale Django apps that handle millions of data entries, and I'm always looking to improve and learn from others.

If you're in Canada or familiar with the Django scene there, I'd love to hear any insights or tips, or just chat about the latest in Django development! 😊

/r/django
https://redd.it/1gcx9ax
How do I compare a form in CKEditorField and StringField?

I am using Flask, Flask-SQLAlchemy, and Flask-WTF.


Imagine I have 2 forms, one in each Flask route.

Let's start with the first route and the first form.

The 1st form has a CKEditorField. Within the route I type zzz in the form. Next I save this in the Posts table as the content column. Now let's switch to the second route and second form.


In the 2nd route I am using StringField. In that form, I type zzz.
Now I am using a custom validator in the form, and I query the Posts table with one_or_none(). If the query returns something, I test whether posts_db.content == content_form, and if so I raise ValidationError('the post is not unique'). This should work, but what if I use something like bold in the CKEditor form? How would I get the same output in the second StringField form? The only solution I can think of is passing the variable posts_db on in the route. Does anyone have any other suggestions?
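
One possible approach, sketched under the assumption that the CKEditor field stores HTML (for example `<p><strong>zzz</strong></p>`) while the StringField submits plain text: strip the markup from the stored content before comparing. The helper name `html_to_text` is hypothetical, and this uses only the standard library rather than any Flask or CKEditor API.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Drop tags, decode entities, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


stored_content = "<p><strong>zzz</strong></p>"  # what the rich-text form might have saved
submitted = "zzz"                               # what the StringField form receives
is_duplicate = html_to_text(stored_content) == submitted.strip()
print(is_duplicate)
```

Inside the custom validator, the comparison would then be between `html_to_text(posts_db.content)` and the submitted field data instead of the raw strings.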




TLDR:

I have 2 forms. The 1st form, in the 1st route, uses CKEditorField; I fill it with the text 'zzz' and then save it to the Posts db

/r/flask
https://redd.it/1gcujm2
How did you first learn about Python?

How did all of you stumble upon Python? I saw someone writing Python in RuneScape one day and became curious. Then I dipped into front-end technologies like HTML and CSS, then JavaScript and Python.

/r/Python
https://redd.it/1gcwcex
Django and iOS/android apps?

Is it possible to create one Django web app and also release iOS and Android versions of that app without having to write in the native languages? It would be great to avoid having to learn and write in 3 frameworks, and it would also help consistency and maintainability to keep the code in one place.

Of course, a Django web app can be used on mobile, but people always seem to say that users want to actually install an iOS/Android app instead. What is the best option here?

/r/django
https://redd.it/1gd77ho
Looking for someone willing to join a call with me to review my code

I'm working with Django Rest Framework and have built a REST API with MySQL (via MySQL Workbench) as the database. Most of the code is done, but I've been stuck on authentication bugs for a really long time and can't move on with my project without fixing them. I've tried everything I can think of, so this is a last resort. I don't want anyone to write code for me. I'm looking for someone willing to join a Discord call where I can share my screen, so they can review my code and maybe tell me what I've been doing wrong. It's not a large project and I'll make sure I don't take much time. It would be much appreciated. Thanks, everyone, in advance :)

/r/django
https://redd.it/1gd7jpy
[D] Last Week in Medical AI: Top LLM Research Papers/Models (October 19 - October 26)





Medical AI Paper of the Week:

Safety principles for medical summarization using generative AI by Google
This paper discusses the potential and challenges of applying large language models (LLMs) in healthcare, focusing on the promise of generative AI to support various workflows.

Medical LLM & Other Models:

BioMistral-NLU: Medical Vocab Understanding
This paper introduces BioMistral-NLU, a generalizable medical NLU model fine-tuned on the MNLU-Instruct dataset for improved performance on specialized medical tasks. BioMistral-NLU outperforms existing LLMs like ChatGPT and GPT-4 in zero-shot evaluations across six NLU tasks from BLUE and BLURB benchmarks.
Bilingual Multimodal LLM for Biomedical Tasks
This paper introduces MedRegA, a novel region-aware medical Multimodal Large Language Model (MLLM) trained on a large-scale dataset called MedRegInstruct.
Metabolic-Enhanced LLMs for Clinical Analysis
This paper introduces Metabolism Pathway-driven Prompting (MPP) to enhance anomaly detection in clinical time-series data by integrating domain knowledge of metabolic pathways into LLMs.
Dermatology Foundation Model
This paper introduces PanDerm, a multimodal dermatology foundation model trained on over

/r/MachineLearning
https://redd.it/1gd6k6j
We're thinking of rewriting our Go/Java API in Python. What do we need to think about?

Background:
We have a horrible hodgepodge of APIs in front of our data platform: a Java API that mostly calls underlying functions in a Go API (with slightly more user-friendly calls). The Go API often calls bash scripts to do the actual work. Most of what the API does is build a call to an external service, such as running spark-submit on a file the user has provided or creating a table in Hive with details the user has provided. The Java API has Swagger and is what users mostly call.
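
As a rough illustration of the "build a call for an external service" work described above, here is how the spark-submit case might look in Python without a bash layer. This is a sketch, not the existing platform's code: the function name, defaults, file path, and arguments are all hypothetical.

```python
import shlex


def build_spark_submit_command(app_path, app_args=(), master="yarn", deploy_mode="cluster"):
    """Return the spark-submit invocation as an argument list.

    Passing a list to subprocess.run (without shell=True) means user-supplied
    values are never interpreted by a shell.
    """
    if app_path.startswith("-"):
        # Refuse paths that spark-submit would parse as one of its own options.
        raise ValueError(f"suspicious application path: {app_path!r}")
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", deploy_mode,
        app_path,
        *app_args,
    ]


# Hypothetical user-provided job file and arguments.
command = build_spark_submit_command("jobs/user_job.py", ["--date", "2024-10-27"])
print(shlex.join(command))  # human-readable form for logging
# To actually launch it: subprocess.run(command, check=True)
```

Keeping command construction in a plain function like this makes it testable on its own, whichever web framework ends up wrapping it.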

One option we have is to rewrite it all in Go, getting rid of the Java and bash, and writing Swagger and everything else the Java does into the Go.
But we're predominantly a Python shop, which means whenever something needs to be done with the APIs only a few people are prepared to go near it, and it has received very little change over the years while the rest of the platform is moving on rapidly.

So a few of us are in favour of rewriting it all in something like FastAPI (or maybe BlackSheep?).

From what I understand this would basically give us swagger for free and mean there

/r/Python
https://redd.it/1gdavp9