Python Daily
Daily Python News
Questions, tips and tricks, and best practices for the Python programming language
Hey Python/pandas users. Check this out. I've used this as part of my day-to-day workflow since it was released a couple months ago and I'm hooked. It's a better way to store, retrieve, and explore my data in a way that's seamlessly integrated with my preferred analysis environment (Python).


https://data.world/nrippner/explore-the-data-world-python-sdk/file/ddw_SDK.ipynb
or, to look at the notebook without a data.world account:
https://github.com/nrippner/misc/blob/master/ddw_SDK.ipynb

/r/pystats
https://redd.it/6a7rnd
Requests per second with database query

I am trying to get a sense of what performance to expect from my setup, because I feel like I am badly underperforming.

I have a Flask container on AWS ECS with 1024 CPU units and 2 GB of memory.

The containerized app uses uWSGI and nginx.

The query I am running is a SQLAlchemy paginated query with a page size of 20. Running this same query from SequelPro takes 109 ms. This is against a remote Aurora RDS instance.

However, when load testing this endpoint (with 2 containers under an ELB), the requests per second barely reach double digits. Is this just a symptom of having a remote DB and a small hardware profile on the container?
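
As a rough sanity check on those numbers, the sketch below estimates an upper bound on throughput when each worker blocks on the query. The worker count is an assumption (the post does not say how many uWSGI processes each container runs), and 109 ms is the SequelPro timing quoted above; real requests add serialization and network overhead on top of it.

```python
def max_requests_per_second(workers, seconds_per_request):
    """Upper bound on throughput when each worker handles one blocking request at a time."""
    return workers / seconds_per_request


# Assumption: 2 containers, each running a single synchronous uWSGI worker.
QUERY_SECONDS = 0.109  # query time reported in the post
ceiling = max_requests_per_second(workers=2, seconds_per_request=QUERY_SECONDS)
print(f"at most {ceiling:.1f} requests/second")
```

Under that assumption the ceiling is already below 20 requests per second, so the number of uWSGI workers per container may be worth checking before blaming the hardware.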

/r/flask
https://redd.it/6a79u3
[D] Machine Learning - WAYR (What Are You Reading) - Week 25

This is a place to share machine learning research papers, journals, and articles that you're reading this week. If it relates to what you're researching, by all means elaborate and give us your insight, otherwise it could just be an interesting paper you've read.

Please try to provide some insight from your understanding, and please don't post things that are already in the wiki.

Preferably you should link the arxiv page (not the PDF, you can easily access the PDF from the summary page but not the other way around) or any other pertinent links.

Previous weeks:

|1-10|11-20|21-30|
|----|-----|-----|
|[Week 1](https://www.reddit.com/r/MachineLearning/comments/4qyjiq/machine_learning_wayr_what_are_you_reading_week_1/)|[Week 11](https://www.reddit.com/r/MachineLearning/comments/57xw56/discussion_machine_learning_wayr_what_are_you/)|[Week 21](https://www.reddit.com/r/MachineLearning/comments/60ildf/d_machine_learning_wayr_what_are_you_reading_week/)|
|[Week 2](https://www.reddit.com/r/MachineLearning/comments/4s2xqm/machine_learning_wayr_what_are_you_reading_week_2/)|[Week 12](https://www.reddit.com/r/MachineLearning/comments/5acb1t/d_machine_learning_wayr_what_are_you_reading_week/)|[Week 22](https://www.reddit.com/r/MachineLearning/comments/64jwde/d_machine_learning_wayr_what_are_you_reading_week/)|
|[Week 3](https://www.reddit.com/r/MachineLearning/comments/4t7mqm/machine_learning_wayr_what_are_you_reading_week_3/)|[Week 13](https://www.reddit.com/r/MachineLearning/comments/5cwfb6/d_machine_learning_wayr_what_are_you_reading_week/)|[Week 23](https://www.reddit.com/r/MachineLearning/comments/674331/d_machine_learning_wayr_what_are_you_reading_week/)|
|[Week 4](https://www.reddit.com/r/MachineLearning/comments/4ub2kw/machine_learning_wayr_what_are_you_reading_week_4/)|[Week 14](https://www.reddit.com/r/MachineLearning/comments/5fc5mh/d_machine_learning_wayr_what_are_you_reading_week/)|[Week 24](https://www.reddit.com/r/MachineLearning/comments/68hhhb/d_machine_learning_wayr_what_are_you_reading_week/)|
|[Week 5](https://www.reddit.com/r/MachineLearning/comments/4xomf7/machine_learning_wayr_what_are_you_reading_week_5/)|[Week 15](https://www.reddit.com/r/MachineLearning/comments/5hy4ur/d_machine_learning_wayr_what_are_you_reading_week/)||
|[Week 6](https://www.reddit.com/r/MachineLearning/comments/4zcyvk/machine_learning_wayr_what_are_you_reading_week_6/)|[Week 16](https://www.reddit.com/r/MachineLearning/comments/5kd6vd/d_machine_learning_wayr_what_are_you_reading_week/)||
|[Week 7](https://www.reddit.com/r/MachineLearning/comments/52t6mo/machine_learning_wayr_what_are_you_reading_week_7/)|[Week 17](https://www.reddit.com/r/MachineLearning/comments/5ob7dx/discussion_machine_learning_wayr_what_are_you/)||
|[Week 8](https://www.reddit.com/r/MachineLearning/comments/53heol/machine_learning_wayr_what_are_you_reading_week_8/)|[Week 18](https://www.reddit.com/r/MachineLearning/comments/5r14yd/discussion_machine_learning_wayr_what_are_you/)||
|[Week 9](https://www.reddit.com/r/MachineLearning/comments/54kvsu/machine_learning_wayr_what_are_you_reading_week_9/)|[Week 19](https://www.reddit.com/r/MachineLearning/comments/5tt9cz/discussion_machine_learning_wayr_what_are_you/)||
|[Week 10](https://www.reddit.com/r/MachineLearning/comments/56s2oa/discussion_machine_learning_wayr_what_are_you/)|[Week 20](https://www.reddit.com/r/MachineLearning/comments/5wh2wb/d_machine_learning_wayr_what_are_you_reading_week/)||

Most upvoted papers two weeks ago:

/u/whenmaster: https://arxiv.org/abs/1701.07875v2

/u/nicrob355982: https://arxiv.org/abs/1507.04808

Besides that, there are no rules, have fun.

Hey, seems there was a little hiccup where last week's WAYR post wasn't stickied, so I'm going to change the bot to post every other week.

/r/MachineLearning
https://redd.it/69teiz
Looking for resources for deserializing JSON into multiple models

I'm very new to both Python and Django and am having difficulty finding the right documentation to work through my current problem.

I am trying to take JSON from a url and commit it to a database. For the example below I would like to commit "name", "type", and "level" to a model called Items and all of the details to a model called ItemDetails.

[
{
"name": "Abomination Hammer",
"type": "Weapon",
"level": 0,
"rarity": "Fine",
"vendor_value": 0,
"default_skin": 5014,
"game_types": [
"Activity",
"Wvw",
"Dungeon",
"Pve"
],
"flags": [
"NoSell",
"SoulbindOnAcquire",
"SoulBindOnUse"
],
"restrictions": [],
"id": 15,
"chat_link": "[&AgEPAAAA]",
"details": {
"type": "Hammer",
"damage_type": "Physical",
"min_power": 146,
"max_power": 165,
"defense": 0,
"infusion_slots": [],
"infix_upgrade": {
"id": 112,
"attributes": []
},
"secondary_suffix_item_id": ""
}
}
]

So far the only way I've found to do it is to load the JSON from the url

import json
import urllib.request

def jsonload(url):
    # Fetch the raw bytes and decode them as UTF-8 before parsing.
    response = urllib.request.urlopen(url).read()
    jsonstring = str(response, 'utf-8')
    json_load = json.loads(jsonstring)
    return json_load

Then loop through each item (like the one shown above) and save the fields I want to use.

i = Items(
    id=data.get('id'),
    name=data.get('name'),
    type=data.get('type'),
    level=data.get('level'),
)
i.save()

From what I've been able to gather it seems like this is better suited for a deserializer, but I can't find anything about deserializing JSON from a url into multiple model classes.
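
One way to picture the split described above is a small helper that turns each decoded JSON item into two sets of keyword arguments, one per model. This is only a sketch under assumptions: the `ItemDetails` fields and the `item` foreign key in the usage comment are hypothetical, and the list and dict values inside `details` (`infusion_slots`, `infix_upgrade`) would need their own handling.

```python
ITEM_FIELDS = ("id", "name", "type", "level")


def split_item(data):
    """Split one decoded JSON item into kwargs for the two models."""
    item_kwargs = {field: data.get(field) for field in ITEM_FIELDS}
    # Copy so that later edits to the kwargs do not mutate the source data.
    detail_kwargs = dict(data.get("details") or {})
    return item_kwargs, detail_kwargs


# Trimmed-down version of the item shown in the post.
sample = {
    "name": "Abomination Hammer",
    "type": "Weapon",
    "level": 0,
    "id": 15,
    "details": {"type": "Hammer", "damage_type": "Physical",
                "min_power": 146, "max_power": 165},
}
item_kwargs, detail_kwargs = split_item(sample)

# Hypothetical usage inside Django (model and field names are assumptions):
#   item = Items.objects.create(**item_kwargs)
#   ItemDetails.objects.create(item=item, **detail_kwargs)
```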

/r/django
https://redd.it/6a7mq5
Using DjangoQL with default search bar

There's a neat library called [DjangoQL](https://github.com/ivelum/djangoql) which I want to incorporate into my project. It replaces the standard search_fields search bar in the Admin with a very powerful query system.

However, I want my user to be able to use the normal search_fields most of the time, and only use the DjangoQL bar on occasion when the query functionality is really needed.

Is there some way that I could have two search bars, one DjangoQL and one search_fields? Or even better, an "advanced search" button which toggles or pulls up the DjangoQL search bar?

Thanks so much!

/r/django
https://redd.it/6a9w07
Populate form field from ModelChoiceField with previously selected value

I'm trying to let my users edit a form they have previously created. All my form fields are being populated with their existing values, with one exception: I have a ModelChoiceField that I'm unable to populate. It always shows the empty value ("Select Client"). So my question is, how do I get this field populated when editing a form?

https://pastebin.com/3ACrHYpW

/r/django
https://redd.it/6a9ggq
Deployment best practices for secret key, environment variables, etc

I just spent days - DAYS - on my first real AWS EC2 deployment, all because I was setting environment variables on the production server and apparently that doesn't work. I finally figured it out and was able to deploy with my key and passwords hard-coded into my settings.py on the production server which, I assume, is not something I should be doing? But then Django docs say to use env vars or to import from a file, but that sounds just as insecure as hard coding it into settings.py, so maybe I'm overthinking this.

What is the standard practice in production? Where do you put the secret key and, say, the email server password? When I search for environment variables + AWS EC2, there's a whole bunch of results I don't understand involving startup scripts and whatnot, which I'm not using at this point.
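
For reference, the environment-variable approach from the Django docs usually amounts to something like the sketch below. The helper name and the variable names are assumptions chosen for illustration. Note also that variables exported in an interactive shell are often not visible to a web server started by a service manager, which may be why setting them on the server appeared not to work.

```python
import os


def get_env_setting(name, environ=os.environ):
    """Return a required setting from the environment or fail with a clear message."""
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(f"Set the {name} environment variable") from None


# In settings.py this might look like (variable names are assumptions):
#   SECRET_KEY = get_env_setting("DJANGO_SECRET_KEY")
#   EMAIL_HOST_PASSWORD = get_env_setting("EMAIL_HOST_PASSWORD")
```

Failing loudly at startup when a variable is missing tends to be easier to debug than a silent fallback to an empty or hard-coded value.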

/r/djangolearning
https://redd.it/6a7wfy
Worth the switch to PyCharm?

I've been using Emacs for a while to do things like stats in R (ESS), organizing my life (org-mode), and writing papers (auctex). So I've got quite a bit of investment in Emacs. I'm not a super-user by any means, but I can use it pretty efficiently.

Now that I'm starting to do a lot more Python development, I've heard over and over again how amazing PyCharm is. I have access to the full edition, so I'll be able to use it to its full potential. I am wondering about a few things, though, as well as the general "is it worth it?" question.

* How good is IdeaVim? I use spacemacs in Emacs, so I'm used to bopping around inside and between files with vim-style shortcuts.
* Is there an easy way to sync settings between machines? I do development on 2-3 different computers (home, work, laptop), and I'd like to keep everything synced up between the three.
* What are the killer features I should switch for?

EDIT new question:

* Is it possible to have machine-specific settings in addition to syncing general settings? For example, my laptop is super small, so I usually use a slightly smaller font size there so I can see a bit more code.
* I also want to start using Python for more of my data analysis. Does PyCharm work well for that, or should I use a different package for the stats stuff?

Thanks y'all.

/r/Python
https://redd.it/6ac5pk
I started building a Flask API service using Python 3.6 and I can't connect to the database. More info below

I have the following code for database connection:

from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()

In another part of the app I have some code for adding tables, etc.:

db.init_app(flask_app)
db.create_all()
db.session.commit()

This is the error I'm getting: `ModuleNotFoundError: No module named 'MySQLdb'`

What's going on here? I'm positive I have `MySQL` on my Mac. Is this because I'm using Python 3?

Thanks in advance!
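
The error means that the `MySQLdb` Python module, SQLAlchemy's default MySQL driver, is not installed; having the MySQL server on the machine does not provide it. One common option on Python 3 is the PyMySQL driver, selected with the `mysql+pymysql://` URI scheme. The sketch below only builds such a URI. The credentials, host, and database name are placeholders, and it assumes PyMySQL has been installed (for example with `pip install pymysql`).

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy database URI that names the driver explicitly."""
    # quote_plus escapes characters such as "@" or "/" in the password.
    return f"mysql+{driver}://{user}:{quote_plus(password)}@{host}/{database}"


# Placeholder values, not real credentials.
uri = mysql_uri("app_user", "p@ss/word", "localhost", "app_db")

# Hypothetical usage with the Flask app from the post:
#   flask_app.config["SQLALCHEMY_DATABASE_URI"] = uri
```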

/r/flask
https://redd.it/6a35tn
Project-based tables and Django

Hi everyone,
I am looking for suggestions and input on how to structure the database for a web-based application I will need to develop in the coming months.

Basically, I will need to develop an application to organize technical information (cable connections, tags, I/O boards, etc.). The main issue is that this documentation is project-based, and I'd like to avoid having everything merged together (i.e. I don't want a single table with all I/O boards for all the projects, as I'd have to filter by project in every query, and the table size would quickly grow out of control).

My idea is to create a different DB every time a new project is started, and to populate that DB with the required empty tables ("bootstrapping a project"). Then I'd like to access the various projects via URLs (e.g. /myapp/project1/ will "open" project1).
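
To make the URL idea concrete, here is a minimal sketch of mapping a path to a per-project database name. It assumes each project's database is registered under an alias equal to the project name in the URL; the function name and the `myapp` prefix default are illustrative only.

```python
def database_alias_for_path(path, app_prefix="myapp"):
    """Return the project name embedded in a URL path such as /myapp/project1/."""
    parts = [part for part in path.split("/") if part]
    if len(parts) < 2 or parts[0] != app_prefix:
        raise ValueError(f"no project in path: {path!r}")
    return parts[1]


alias = database_alias_for_path("/myapp/project1/")

# Hypothetical usage in a Django view (IOBoard is an assumed model name):
#   boards = IOBoard.objects.using(alias).all()
```

Django's `using()` selects a database by alias per query, so a scheme like this would not by itself require a separate app instance for each project.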

Probably the easiest thing would be to create a separate instance of the web app for each project, but that would make creating a new project more difficult (spinning up a new container/VM/whatever for the app...), as well as "sharing data" functions (like "copy all these I/Os from project A to B"). So I'd like to avoid this path.



Is there a best practice for this, or do you have any suggestions from your experience? I know that this is a Django subreddit, but perhaps Django is not the best candidate for something like this, and it would be better to go with something more bare-bones (e.g. Flask)?

Thanks! :)

/r/django
https://redd.it/6adio8
Automate Everything with Python - Question

Hi y'all,

I'm learning Python from Automate the Boring Stuff. My first project is to download episodes of a podcast I like. Right now, when I run the program, it downloads all the HTML from the 'url'. Where am I going wrong?

Next question: what it is supposed to do next is click on all the links that say "Download this Episode." I have that bit of code included below, but it doesn't seem to be working yet. Also, any tips for someone learning how to work with Python would be appreciated!

Below is the Python script:

https://drive.google.com/open?id=0B_w5K7I4N0d3a0JReW1QRUNPUXM
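
The link-finding step described above could be sketched with only the standard library as follows. This is not the poster's script (which is only linked, not shown); the class and function names are made up for illustration, and it assumes the link text matches "Download this Episode" exactly.

```python
from html.parser import HTMLParser


class DownloadLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose text equals link_text."""

    def __init__(self, link_text="Download this Episode"):
        super().__init__()
        self.link_text = link_text
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if "".join(self._text).strip() == self.link_text:
                self.links.append(self._href)
            self._href = None


def find_download_links(html):
    parser = DownloadLinkParser()
    parser.feed(html)
    return parser.links


# Made-up HTML standing in for the downloaded podcast page.
page = '<a href="/ep1.mp3">Download this Episode</a> <a href="/about">About</a>'
print(find_download_links(page))
```

Each returned href would then be fetched separately and written to a file in binary mode; downloading the page itself only yields the HTML.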


/r/Python
https://redd.it/6ad823
Using Prefetch with DRF ViewSets to include a @property on a model that relies on reverse foreign keys

It took me entirely too long to figure this out, so thought I would share in case others are running into the same thing. Note that I attempted to use a OneToOne, but because not every Model B can relate to a Model A, and it throws duplicate errors on Nulls... I couldn't.



Model B has a FK to Model A.
Model A has a @property to obtain a field from Model B from a single model instance. In this scenario, I know there will only be one, so I go get the first value only.



@property
def get_model_B_field(self):
    b = self.modelB_set.values('field').all()
    if b:
        # values() returns dicts, so index by key rather than attribute.
        return b[0]['field']



Of course, including this in the serializer quickly creates the N+1 problem, as Django will re-query the database for each Model A in the queryset. As I have several properties like this across a variety of models, the problem grew exponentially until I had thousands of queries running and 12+ seconds to load the api page. P.S. I highly recommend using [Django Debug Toolbar](https://django-debug-toolbar.readthedocs.io/en/stable/).
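
To illustrate the N+1 pattern without a Django project, here is a toy simulation in plain Python. `FakeDatabase` is a stand-in invented for this sketch that simply counts calls; it shows why one lookup per Model A row scales with the row count, while a single batched lookup (the idea behind `prefetch_related`) does not.

```python
class FakeDatabase:
    """Stand-in for a database that counts how many queries it receives."""

    def __init__(self, b_rows):
        self.b_rows = b_rows  # list of (a_id, field) tuples
        self.queries = 0

    def b_for_a(self, a_id):
        self.queries += 1
        return [field for row_a_id, field in self.b_rows if row_a_id == a_id]

    def b_for_many(self, a_ids):
        self.queries += 1
        wanted = set(a_ids)
        return [row for row in self.b_rows if row[0] in wanted]


def fields_one_by_one(db, a_ids):
    # One query per Model A row, as the property does.
    result = {}
    for a_id in a_ids:
        rows = db.b_for_a(a_id)
        result[a_id] = rows[0] if rows else None
    return result


def fields_batched(db, a_ids):
    # One query for all related rows, then grouping in Python.
    result = {a_id: None for a_id in a_ids}
    for a_id, field in db.b_for_many(a_ids):
        if result[a_id] is None:
            result[a_id] = field
    return result


rows = [(1, "x"), (2, "y")]
slow_db, fast_db = FakeDatabase(rows), FakeDatabase(rows)
slow = fields_one_by_one(slow_db, [1, 2, 3])
fast = fields_batched(fast_db, [1, 2, 3])
print(slow_db.queries, fast_db.queries)
```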



First attempt was to use prefetch_related as such in the ViewSet which calls the serializer:



queryset=ModelA.objects.prefetch_related('modelB_set').all()



Still N+1



I went back to the property and attempted to remove the "values" aspect, and also make use of the first() function.



@property
def get_model_B_field(self):
    b = self.modelB_set.first()
    if b:
        return b.field



Still N+1



Then I tried using the new Prefetch object on my view's queryset, as such:



queryset=ModelA.objects.prefetch_related(Prefetch('modelB_set',queryset=ModelB.objects.all()))



Still N+1




Finally, I read a bug report from another user indicating that first() wasn't respecting the prefetch, and realized that for first() to use the prefetched results, you must explicitly set an ordering on the queryset passed to Prefetch():



queryset=ModelA.objects.prefetch_related(Prefetch('modelB_set',queryset=ModelB.objects.order_by('field').all()))


Finally, I was down to 7-8 queries (from 5000+) and about 1 second for a rather large API returning data from across multiple models, many of which involve reverse FK lookups. If others have a more elegant solution, I'd love to hear it. My only other option was to attempt to use Nested Serializers, but I didn't want to nest all of ModelB inside ModelA, and I didn't like the idea of creating multiple serializers against ModelB just to handle different field combinations. Now I can have all my properties in one place (the model) and include the fields individually on the API as needed.



/r/django
https://redd.it/6aexjt