Python Daily
Daily Python News
Question, Tips and Tricks, Best Practices on Python Programming Language
Find more reddit channels over at @r_channels
Regex Hell: Is there a good way of combining multiple regex's into one?

Hi all,

I'm reimplementing the following regex I found in a paper. We're essentially trying to do named entity recognition for medical documents.

> import re
>
> """
> Reimplementation of regular expression for measurement extraction described in the
> paper found here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4586346/
> """
>
> x = "(\d+\.( )?\d+" + "|\d+( )?\.\d+" + "|\.\d+|\d+) *"
>
> by = "( )?(by|x)( )?"
>
> cm = "([\- ](mm|cm|millimeter(s)?|centimeter(s)?)(?![a-z/]))"
>
> x_cm = "(("+x+"*(to|\-)*" + cm + ")" + "|(" + x + cm + "))"
>
> xy_cm = "(("+x+cm+by+x+")" + \
> "|(" + x + by + x + cm + ")" + \
> "|(" + x + by + x + "))"
>
> xyz_cm = "(("+x+cm+by+x+cm+by+x+cm+")"+ \
> "|(" + x + by + x + by + x + cm +")" + \
> "|(" + x + by + x + by + x + "))"
>
> # Final regular expression
>
> m = "((" + xyz_cm + ")" + \
> "|(" + xy_cm + ")" + \
> "|(" + x_cm + "))"

Whenever I run re.compile(m) on the final expression, I get a runtime error. Do you all have any suggestions on how I can cleanly break this up, or where the error is? I'm trying to avoid having to use a context-free grammar.
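One likely culprit, assuming the snippet is exactly as posted: `x` already ends in a space followed by `*`, and `x_cm` puts another `*` directly after it, so the pattern contains two quantifiers in a row, which `re` rejects as a "multiple repeat". The sketch below reproduces that and shows one possible fix, wrapping `x` in a group before repeating it. Whether the grouped form matches the paper's intent is an assumption, and the helper name `compile_error` is made up for illustration.

```python
import re

# Same building blocks as in the post, written as raw strings.
x = r"(\d+\.( )?\d+|\d+( )?\.\d+|\.\d+|\d+) *"
cm = r"([\- ](mm|cm|millimeter(s)?|centimeter(s)?)(?![a-z/]))"


def compile_error(pattern):
    """Return None if the pattern compiles, else the re.error message."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)


# As posted: x ends in " *" and the next piece starts with "*".
broken_x_cm = "((" + x + r"*(to|\-)*" + cm + ")|(" + x + cm + "))"
error_message = compile_error(broken_x_cm)
print(error_message)

# Possible fix (an assumption about intent): repeat x as a group.
fixed_x_cm = "(((" + x + r")*(to|\-)*" + cm + ")|(" + x + cm + "))"
print(compile_error(fixed_x_cm))
print(bool(re.search(fixed_x_cm, "2.5 cm")))
```

Compiling each piece (`x`, `cm`, `x_cm`, ...) separately like this is also a cheap way to find which fragment breaks before gluing them into `m`.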

/r/Python
https://redd.it/6d1yg6
export as a normal Python session

Can we export the current session as if we had worked in the default Python shell? Here is what I mean. IPython session:

In [1]: li = [1, 2, 3]

In [2]: li
Out[2]: [1, 2, 3]

In [3]:

I want to paste it in a blog post, but it's too verbose (imagine that it's longer). I'd like to export it like this:

>>> li = [1,2,3]
>>> li
[1, 2, 3]
>>>

Is it possible?
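For a session that has already been copied as text, a small converter may be enough. This is only a sketch: the function name `ipython_to_pycon` is made up, and it assumes the transcript uses the default `In [n]:` / `Out[n]:` / `...:` prompts. IPython also has a `%doctest_mode` magic that switches the live prompts to `>>>`, which may be simpler if the session can be re-run.

```python
import re

IN_PROMPT = re.compile(r"In \[\d+\]: ?(.*)")
OUT_PROMPT = re.compile(r"Out\[\d+\]: ?(.*)")
CONTINUATION = re.compile(r"\s*\.\.\.: ?(.*)")


def ipython_to_pycon(transcript):
    """Rewrite IPython prompts as plain '>>>' / '...' prompts."""
    converted = []
    for line in transcript.splitlines():
        match = IN_PROMPT.match(line)
        if match:
            converted.append((">>> " + match.group(1)).rstrip())
            continue
        match = OUT_PROMPT.match(line)
        if match:
            converted.append(match.group(1))
            continue
        match = CONTINUATION.match(line)
        if match:
            converted.append(("... " + match.group(1)).rstrip())
            continue
        if line.strip():
            # Printed output or traceback lines are kept unchanged.
            converted.append(line)
    return "\n".join(converted)


session = "In [1]: li = [1, 2, 3]\n\nIn [2]: li\nOut[2]: [1, 2, 3]\n\nIn [3]:"
print(ipython_to_pycon(session))
```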

/r/IPython
https://redd.it/6d2zi2
Save a django object getting another model instance as foreign key from a form

I am getting the URL tag from formnumber, but I would like to save the value in another model named ExpenseForecast. Any ideas?


def inputsexp(request, number):
    formnumber = Projectsummary.objects.all().get(number=number)
    form = ProjExpField(data=request.POST or None, instance=formnumber)
    if form.is_valid():
        form.save()
        return HttpResponseRedirect(reverse('app:test_forecast_data', kwargs={'number': number}))
    context = {'form': form, 'formnumber': formnumber}
    return render(request, 'projexpfield.html', context)

/r/django
https://redd.it/6d4pts
What is a good ecommerce support for Django? More info below.

I need to provide a utility for the users to purchase items, select shipping address & pick up address, and delivery date & time. I also need to be able to use payment APIs such as Stripe, PayPal, etc.
Any suggestions?

/r/django
https://redd.it/6d4nwq
Does flask have any plugins to manipulate and render database instances conveniently?

I have been using Flask with SQLAlchemy and WTForms for a while now.

At first, I've been following something along these lines:
* Create SQLAlchemy model class
* Create WTForm class based on the SQLAlchemy model (it's almost always line to line mapping)
* Create a view template to render the form for viewing and for displaying the model data in Jinja.

I find this approach to be very cumbersome: If I change a field in my SQLAlchemy class, I have to update my WTForm and my view template as well.

Then I came across WTForms-Alchemy. Now,

* I just have to create the SQLAlchemy class
* WTForms-Alchemy will map that to a WTForm
* And I render that form for creating/viewing/updating in my Jinja.

This made my job a lot easier, but WTForms-Alchemy has a lot of bugs that make it very difficult for me to continue using it (check my history).

**Are there other plugins for Flask which make this whole process simpler?**



/r/flask
https://redd.it/6d378b
Is it just me or does json serializing take forever?

I'm running everything on localhost and a 2 MB JSON file takes about a minute to load.

I'm making a web map, so I need it in GeoJSON format. Are there any ways to make this go faster?
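A quick way to narrow this down is to time the `json` step on its own. The sketch below builds a made-up GeoJSON `FeatureCollection` of roughly 2 MB (the point data and the helper name `make_feature_collection` are invented for illustration) and times `json.dumps` and `json.loads`. If those numbers come out far below a minute, the time is probably going somewhere else, for example into building the data; that is a guess, not something the post confirms.

```python
import json
import time


def make_feature_collection(n_points):
    """Build a synthetic GeoJSON FeatureCollection of point features."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [i * 0.001, i * 0.002]},
            "properties": {"id": i, "name": "point-%d" % i},
        }
        for i in range(n_points)
    ]
    return {"type": "FeatureCollection", "features": features}


collection = make_feature_collection(20000)

start = time.perf_counter()
text = json.dumps(collection)
dump_seconds = time.perf_counter() - start

start = time.perf_counter()
parsed = json.loads(text)
load_seconds = time.perf_counter() - start

size_mb = len(text.encode("utf-8")) / 1e6
print("%.2f MB: dumps %.3f s, loads %.3f s" % (size_mb, dump_seconds, load_seconds))
```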

/r/django
https://redd.it/6d6ss3
The story of adding type hints to one of my projects - things I learned and issues encountered

Adding type hints to Tale
-------------------------

**TL;DR:** I wanted to share my experience with adding type hints to one of my projects. There were some benefits, mypy discovered a couple of bugs, there are a few annoyances for me, and I am not really sold on it yet.
Please correct my mistakes or help me solve some problems if possible :)


Intro
-----

'Tale - an Interactive Fiction, MUD & mudlib framework' is
a medium sized Python project. At the time of writing (just after
releasing version 3.1 of the project),
it has about 15000 SLOC (~9500 in the tale library,
~3800 in the test suite, and ~2100 in the demo stories).

You can find it on GitHub: https://github.com/irmen/Tale

Tale started as a Python 2.x-and-3.x compatible library but I decided
I want to be able to use modern Python (3.5+) features and not worry
anymore about backwards compatibility, so I cleaned up and modernized the code a while ago.

After finishing that, one of the major new Python features I still hadn't used anywhere,
was _type hinting_. Because I wanted to learn to use it, I took Tale as
a test case and added type hints to all of the code.
I wanted to learn about the syntax,
how to apply it to an existing code base,
to see what benefits it gives,
and to discover what problems it introduces and what limitations it has.

Below are the results.


Benefits
--------

The following things have proven to be beneficial to me:

- PyCharm (my IDE of choice) is giving more detailed warning messages and code completion suggestions:
it adds inferred types or type hints to it.
- The MyPy tool (http://mypy.readthedocs.io/en/latest/index.html) is able to statically find type related errors in the code
at _compile time_ rather than having to run into them during _run time_.
It correctly identified several mistakes that were in the code that I had
not discovered yet and weren't caught by the unit tests.
- Mypy is pretty smart, I was quite surprised by its capability to infer
the type of things and how it propagates through the code.
- Type hinting is optional so you can mix code with and without type hints.



Things I'm not happy with
-------------------------

- It takes a lot of effort to add correct type hints everywhere.
Before this task is complete, mypy can and will report lots of
errors that are wrong or can be misleading.

- Mypy has bugs. One of the most obvious is that it doesn't know about
several modules and classes from the standard library.

- Shutting mypy up is done via a `# type: ignore` comment.
Tale now has about 60 of these...
Some of them can and should be removed once mypy gets smarter,
but I have no way of knowing which ones they are. How am I going to maintain these?
It strongly reminds me of the 'suppress-warning' type of comments found
in other languages. This is problematic because it hides possible errors
in a way that is hard to find later.

- I really don't like these two major aspects of the type hint syntax:
1. Sometimes you have to use _strings_ rather than the type itself.
This is because the hints are parsed at the same time as all other code,
and sometimes you need "forward references" to types that are not yet defined.
This can sometimes be fixed by rearranging the definition order
of your classes, but if classes reference each other (which is common)
this doesn't help. What I find most irritating is that you have to
do this for the very class whose methods you're type hinting!
I understand the reason (the class is still being parsed, and is not
yet _defined_) but I find it very cumbersome.
2. Type hints for variables are often required to help mypy. Especially
if you're initializing names with empty collection types such as `[]` or `{}`
which is common. Also you have to do this via a _comment_ such as `# type: List[str]`.
The latter is improved in Python 3.6 (see PEP-526) but I want to
stick with 3.5 as a minimum for a while.


- Because type hints are parsed by Python itself, and because mypy
even parses the comments as well,
you'll have to import all types that you use in hints.
This causes a lot of extra imports in every module.
In my case this even led to some circular import problems that
were only fixable by changing the modules themselves. (One can argue
that this is a good thing! Because circular references often
are a code smell)
Some types are only used in a _comment_, which
causes the IDE to warn me about unused imports that it wants to
clean up (but if I do this, it breaks mypy). PyCharm is not (yet)
smart enough to see that an import is really used even if it is just a type hint comment.
To be honest, PyCharm has a point, because it is *mypy* that uses it, not *python*...
But it means I have to accept several warnings that aren't real problems.

- The code becomes harder to read. In some cases, _a lot harder_ because
some type hints can become quite elaborate (list of dicts mapping str to tuple... etc)
You can ease the pain a bit by creating your own type classes but
this clutters the module namespace.
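To make the syntax complaints above concrete, here is a small sketch in the Python 3.5 style the post describes. The `Room` class and the `Inventory` alias are made up for illustration and are not taken from Tale. It shows a string forward reference to the class being defined, `# type:` comments for empty collections, and a type alias for an elaborate hint.

```python
from typing import Dict, List, Optional, Tuple

# A type alias keeps an elaborate hint readable, at the cost of one more
# module-level name.
Inventory = Dict[str, List[Tuple[str, int]]]


class Room:
    def __init__(self, name: str) -> None:
        self.name = name
        # Empty collections need a type comment on Python 3.5; inside a
        # comment the class name can be used without quotes.
        self.exits = {}  # type: Dict[str, Room]
        self.items = {}  # type: Inventory

    # 'Room' must be a string here: the class is not defined yet while
    # its own body is being executed.
    def connect(self, direction: str, other: 'Room') -> 'Room':
        self.exits[direction] = other
        return self

    def neighbour(self, direction: str) -> Optional['Room']:
        return self.exits.get(direction)


hall = Room("hall")
kitchen = Room("kitchen")
hall.connect("north", kitchen)
print(hall.neighbour("north").name)
```

On Python 3.6 and later the two comments could become variable annotations (PEP 526), as the post notes.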



Things learned, thoughts
------------------------

After adding type hints to all of Tale's code, a lot of time
was spent fixing mistakes (in the hints I wrote) and several bugs (that mypy found).

After all of this, I learned that

- using type hints helps uncover bugs and improves IDE feedback and code understanding
- it is quite a lot of work to add it to an existing code base and "get it right" (= no mypy errors)
- it has to be done in an 'unnatural' way sometimes, because of the way Python parses stuff
- it can clutter up code that was very concise and clean before.


But the most important thing really:

**...static typing (via type hints + mypy tooling) clashes with Python's dynamic duck type nature.**
It all still feels 'off' to me, in the context of Python. It introduces new kinds of problems, that we didn't have
without them, and the syntax is not as 'natural' as I hoped.

Right now, I am not really sure if I want to continue to use type hints.
In Tale I probably will because it now has them everywhere already, but
my other projects have to wait.

As this is the first and only time so far that I have used type hints,
it is very likely that I made some mistakes or missed a better solution to
some of the issues I encountered.
Please correct me on them or give me some tips on improving my application
and understanding of Python's type hints. Thank you!


/r/Python
https://redd.it/6d5haz
Rocketry/Physics rocket apogee project

Not sure where this should be posted, kick me out if this is the wrong place. no hard feelings.

**background TL;DR**

I'm in a rocketry class that focuses on kinematics and engineering. I'm a matlab user, but my teacher said "oh just code in python" when referring to my final rocketry project bc I dont have matlab on my home computer. My project is almost done, but rip me; the computer lab with matlab on it is closed for the semester. So I am left with all the knowledge of physics in my head, the knowledge of matlab in my head, an unfinished project I can't get, no matlab to write it again, and just a language that idk to write it all out within the next 2 days. Basically I'm hoping some coder takes pity on me and helps me by coding what I need. (My teacher told me that it was cool if I asked for help for those that think what I am doing is morally wrong.)

 

**The Help I Need**

I need to have a program that calculates the apogee of a rocket.

 

 

*notes before physics*

 

- for those who don't know the definition -- apogee is when the rocket reaches the top of its flight before falling back to earth.

 

- apogee would occur either when velocity reaches 0 or when the ejection charge from the motor occurs, whichever comes first.

 

- using python 3.6.1

 

There will be 2 different phases: powered and coasting.

 

**Inputs**: thrust curve (time vs. thrust vector), mass (g), diameter (mm), drag coefficient (CD)

**Assumptions**: density (ρ) = 1225 g/m3, rocket is stable

**Outputs**: apogee, altitude vs. time plot

 

 

**Powered Phase**
calculate the force of the rocket while motor is burning:

 

𝐹𝑛𝑒𝑡 = 𝐹𝑡 − 𝐹𝐷 − 𝐹𝑔

 

𝐹𝑛𝑒𝑡 = 𝐹𝑡 − (1/2)*𝜌𝐶𝐷𝐴𝑣^2 − 𝑚𝑔

 

then subtract the burned propellant mass from the total mass linearly over time (it isn't actually linear; it is proportional to the burn rate of the propellant. We just use the linear function because the difference is negligible at this scale):

 

𝑚(𝑡) = 𝑚𝑖 − 𝑚𝑝(𝑡/𝑡𝑏)

 

mi = Initial total mass of the rocket (including the motor)

mp = propellant mass

tb = time at motor burnout

t = Current time (time at which mass is desired)

 

 

**Coasting Phase**
Most of the flight is coasting. The physics are close to the same:

 

𝐹𝑛𝑒𝑡 = −𝐹𝐷 − 𝐹𝑔

 

𝐹𝑛𝑒𝑡 = (−1/2)*𝜌𝐶𝐷𝐴𝑣^2− 𝑚𝑔

 

The program has to know when motor burnout occurs and march forward in time at a specified time increment (Δt) until reaching apogee.

 

velocity is obtained through the area under the net force vs. time curve

 

𝑣𝑓 = (((𝐹𝑓 + 𝐹𝑖)/2) (𝑡𝑓 − 𝑡𝑖) + 𝑚𝑖*𝑣𝑖)/𝑚𝑓

 

displacement, or altitude, is found in a similar way through area under the velocity vs time curve... man I love integrals.

 

𝑥 = ((𝑣𝑓+𝑣𝑖)/2) (𝑡𝑓 − 𝑡𝑖) + 𝑥𝑖
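The stepping scheme described above could be sketched roughly like this. Several things here are assumptions rather than part of the post: g = 9.81 m/s², burnout taken as the last time in the thrust curve, a predictor step to estimate the end-of-step force (the drag term needs the new velocity), and all function and parameter names, including `t_eject`. Inputs follow the post's units (grams, millimetres) and are converted to SI internally. The motor numbers in the usage lines are invented, not real motor data.

```python
import math

RHO = 1.225  # air density in kg/m^3 (the post's 1225 g/m^3)
G = 9.81     # gravitational acceleration in m/s^2 (assumed value)


def thrust_at(t, curve):
    """Linearly interpolate thrust (N) from (time, thrust) pairs.

    Assumes the times in `curve` are strictly increasing.
    """
    for (t0, f0), (t1, f1) in zip(curve, curve[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)
    return 0.0  # before ignition or after burnout


def net_force(t, v, m, curve, area, cd):
    """F_net = F_t - F_D - F_g, with drag always opposing the motion."""
    drag = 0.5 * RHO * cd * area * v * v
    return thrust_at(t, curve) - math.copysign(drag, v) - m * G


def simulate(curve, m_i_g, m_p_g, diameter_mm, cd, dt=0.01, t_eject=None):
    """Return (apogee in m, list of times, list of altitudes)."""
    m_i, m_p = m_i_g / 1000.0, m_p_g / 1000.0
    area = math.pi * (diameter_mm / 2000.0) ** 2
    t_b = curve[-1][0]  # burnout time, assumed to be the end of the curve

    def mass(t):
        # m(t) = m_i - m_p * (t / t_b), held constant after burnout
        return m_i - m_p * min(t, t_b) / t_b

    t, v, x = 0.0, 0.0, 0.0
    times, altitudes = [t], [x]
    lifted = False
    while True:
        m0, m1 = mass(t), mass(t + dt)
        f0 = net_force(t, v, m0, curve, area, cd)
        # Predictor: estimate the new velocity using the start-of-step force.
        v_guess = (f0 * dt + m0 * v) / m1
        f1 = net_force(t + dt, v_guess, m1, curve, area, cd)
        # v_f = (((F_f + F_i) / 2) * (t_f - t_i) + m_i * v_i) / m_f
        v1 = (0.5 * (f0 + f1) * dt + m0 * v) / m1
        if not lifted and v1 <= 0.0:
            v1 = 0.0  # still sitting on the pad
        else:
            lifted = True
        # x = ((v_f + v_i) / 2) * (t_f - t_i) + x_i
        x = 0.5 * (v1 + v) * dt + x
        t, v = t + dt, v1
        times.append(t)
        altitudes.append(x)
        if lifted and v <= 0.0:
            break  # velocity reached zero: apogee
        if not lifted and t > t_b:
            break  # motor burned out without lifting the rocket
        if t_eject is not None and t >= t_eject:
            break  # ejection charge fired first
    return max(altitudes), times, altitudes


# Invented example: constant 10 N for 1 s, 100 g rocket with 10 g propellant.
demo_curve = [(0.0, 10.0), (1.0, 10.0)]
demo_apogee, demo_times, demo_altitudes = simulate(demo_curve, 100.0, 10.0, 25.0, 0.5)
print("apogee: %.1f m after %.2f s" % (demo_apogee, demo_times[-1]))
```

The returned `times` and `altitudes` lists can be passed straight to a plotting library such as matplotlib for the altitude vs. time plot.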

 

 

**Final notes**
Legitimately if anyone could help me with any part of this or even just a rough code, I can struggle through finishing and or polishing it. My semester grade hangs on this assignment. ik what I'm doing with the class, I just don't know how to write it. I think that's all the physics, but if not, let me know. I also have the info on all the motors if that would help. Please help a poor high school senior on his last week of school finish his project.

 

Any advice, links, videos, or even written code would be so very highly appreciated. Thanks so much in advance.

/r/Python
https://redd.it/6d6lp2
How do I tell a user that the page is processing?

I am working on a program that lets users upload a spreadsheet. After it's uploaded I have a function that processes the data on the spreadsheet. This takes about 1 second per row, due to an API call.

How can I make this a good experience for the user? In a perfect world, I would like something like "4 of 500 records processing" to be shown on the screen, but I am not sure what the first step in that direction would be.
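One possible first step, sketched without any Flask code: run the row processing in a background thread and keep a per-job counter that a small status endpoint could return, so the page can poll it from JavaScript. All names here (`progress`, `process_rows`, `status_text`, the job id) are made up, and a single-process server is assumed; with several worker processes the counter would have to live somewhere shared, such as a database or a task queue.

```python
import threading

progress = {}  # job id -> {"done": int, "total": int}
progress_lock = threading.Lock()


def process_rows(job_id, rows, handle_row):
    """Process rows one by one, updating the shared counter after each."""
    with progress_lock:
        progress[job_id] = {"done": 0, "total": len(rows)}
    for row in rows:
        handle_row(row)  # the slow API call would happen here
        with progress_lock:
            progress[job_id]["done"] += 1


def status_text(job_id):
    """What a polling endpoint could send back to the browser."""
    with progress_lock:
        state = dict(progress[job_id])
    return "{done} of {total} records processed".format(**state)


handled = []
worker = threading.Thread(
    target=process_rows, args=("job-1", ["a", "b", "c"], handled.append)
)
worker.start()
worker.join()  # a web view would return right away instead of joining
print(status_text("job-1"))
```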

/r/flask
https://redd.it/6d21t6
Issues with third party apps after upgrade

TLDR: Should I fork major third party apps and upgrade them myself?

I'm upgrading one of my apps (from 1.8 to 1.11) for a couple of reasons:

- I want to keep it current with the latest version of Django since I have the time to work on it now (I might not have that luxury in a few weeks).

- I want to add some functionality and channels would be a perfect fit.

Problem is, I rely on a few third party apps that are not compatible with 1.11.

Some of these projects are stale, and others even have pull requests to fix compatibility issues, but no word from the repo owner on issues or PRs.

One of these apps is used in a main feature (and possible replacements wouldn't work either), so I can't just scratch it off.

So my question is: what would be the pros and cons of forking one of these dependencies, fixing the issues myself, and keeping the resulting app within my project's structure?

(again, assume those projects are stale and that I have some free time to work on it).

/r/django
https://redd.it/6d3tfq
HELP! Trying to use Python to Join Datasets

Essentially I have two data sets of city level data. I want to match both data sets on the names of cities and drop the observations that are unmatched. Anyone have experience doing something like this (i.e. matching strings to join datasets)? I would greatly appreciate any help.
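What is described here is an inner join on the city name. A standard-library sketch follows, with invented rows and made-up helper names (`normalize`, `inner_join`). It assumes the names differ only in case and spacing; real spelling variants would need fuzzy matching on top. With pandas, `DataFrame.merge(..., on="city", how="inner")` does the same join once the key column is normalized.

```python
def normalize(name):
    """Casefold and collapse whitespace so ' new  york ' matches 'New York'."""
    return " ".join(name.casefold().split())


def inner_join(left_rows, right_rows, key="city"):
    """Keep only rows whose normalized key appears in both datasets."""
    right_by_key = {normalize(row[key]): row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_key.get(normalize(row[key]))
        if match is not None:
            merged = dict(match)
            merged.update(row)  # on clashes the left dataset's values win
            joined.append(merged)
    return joined


# Invented example data, not real statistics.
left = [
    {"city": "Springfield", "population": 1},
    {"city": "  shelbyville ", "population": 2},
    {"city": "Ogdenville", "population": 3},
]
right = [
    {"city": "springfield", "rainfall": 10},
    {"city": "Shelbyville", "rainfall": 20},
    {"city": "North Haverbrook", "rainfall": 30},
]
matched = inner_join(left, right)
print(matched)
```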

/r/pystats
https://redd.it/6crsa1
Templates and Blueprints

Hello everyone, I saw in the docs for blueprints that I can specify a templates folder when creating the blueprint.

However, currently I have a templates folder in my parent module, and even when calling the render_template in the route in my sub-modules' views, the correct template is registered.

currently I have this,

/myapp
..../templates
..../auth
..../admin
..../user

Is this a big no-no or is this perfectly ok?


/r/flask
https://redd.it/6ctsgf
Need help with selecting and filtering measurement data taken on a 3d part. Examples and data provided.

For work I built an XY stage that measures parts placed on it with a laser displacement sensor. A picture is worth a thousand words, [so here](http://i.imgur.com/9jHZOVi.png). I place the part on the top of the stage where the red laser dot is and then use the stage to move the part underneath the laser. I move in increments of 1mm in a serpentine pattern and measure the height of the surface in microns (1E-6m). (I have permission to share this data.) As I'm collecting data I am adding it to a 1-D Pandas dataframe with x, y and z columns. The final data I would like to get would be a measure of the shape: radius of curvature (5-20m typically), concavity, peak to valley distance from two lines where x = 0 and y = 0.


The [raw data](https://plot.ly/~scottdillon/15) I get is here with the [corresponding surface plot](https://plot.ly/~scottdillon/15).


What I want is only the round surface near z = 0. I need to exclude the flat surface at the bottom. I'm doing this now by selecting data between the max value and (max + min) * truncation factor between 0 and 1. If the user does not set the equipment up properly, they may not leave the bottom surface within the measurement range of the laser sensor and the bottom surface would not be there. When this happens, I'm usually left with just a few data points because I've only taken 15-ish% of the top surface. After this selection I get [this surface](https://plot.ly/~scottdillon/17) and [this data](https://plot.ly/~scottdillon/16).


After I select the top surface I still have some data that was measured on the sidewall of the top surface which will throw off any attempt at leveling the data to remove any tilt. So I do another filter to get data between the average +/- std * x. I get [this surface](https://plot.ly/~scottdillon/19) with [this data](https://plot.ly/~scottdillon/18).

Now I can perform a least squares fit of a plane to this data. I subtract the resulting plane to [level the data](https://plot.ly/~scottdillon/20) and get [this surface plot](https://plot.ly/~scottdillon/21).
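For reference, the leveling step amounts to fitting z = a*x + b*y + c and subtracting it. The sketch below solves the 3x3 normal equations directly with the standard library; the function names are made up, and this is not necessarily how the post does it (in practice `numpy.linalg.lstsq` would be the usual tool). The sample points are synthetic.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) tuples."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(normal)
    if d == 0:
        raise ValueError("points are degenerate (for example, all on one line)")
    coeffs = []
    for col in range(3):
        # Cramer's rule: swap one column for the right-hand side.
        swapped = [row[:] for row in normal]
        for r in range(3):
            swapped[r][col] = rhs[r]
        coeffs.append(det3(swapped) / d)
    return tuple(coeffs)


def level(points):
    """Subtract the fitted plane so that only the shape remains."""
    a, b, c = fit_plane(points)
    return [(x, y, z - (a * x + b * y + c)) for x, y, z in points]


# Synthetic tilted plane on a 5 x 5 grid.
tilted = [(float(x), float(y), 2.0 * x - 3.0 * y + 1.0)
          for x in range(5) for y in range(5)]
print(fit_plane(tilted))
```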

Finally, I have a [leveled surface](https://plot.ly/~scottdillon/12) but it still has [some data that is due to artifacts of measurement on the edge of the surface](https://plot.ly/~scottdillon/13).

If you rotate that last plot around to the backside, there is a lot of spikiness and noise at the edge that I'd like to get rid of.

The problems with the way I'm doing this now:

* I have three different filter variables I have to manipulate to get decent data and it's still got some features that affect measurements I want to take on it. It's not easy for me to manipulate these values much less someone who doesn't know how this works.

* I am measuring two different kinds of parts. The sample data provided is relatively well behaved. [The other types of parts are not](http://i.imgur.com/R7YsN6N.png). They can have a variation in 10's of microns but usually less than 100um around the edge. There is lots of noise there. There can also be data points near the edge that are above the surface.

* It's fragile. The data taken has to be well behaved for this to work.

What I've tried:

* I've done a median 2d filter on the top surface data and that works well but I still have noise and spikiness after I've flattened this. This spikiness is the hardest to get rid of without removing much real data. A median filter also reduces the resolution of the data maybe too much. The radius of curvature went from 1.23m to 10m without and with a median filter.

Some ideas I have:

* Instead of performing analysis on the whole set just take the center 20mm in x,y and level only on that data. Then perform an avg + std dev filter which should be much better.

* Start at the center data point and spiral out filtering out data that is outside a rolling average value.

* Calculate residuals from the median and exclude data that is higher than some threshold value.
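The last idea could look roughly like this, using the median absolute deviation (MAD) as the spread estimate so that the spikes themselves do not inflate the threshold. The function name, the default threshold of 3.5 and the sample heights are assumptions for illustration. On real data this would be applied to the heights after leveling, so that tilt does not count as a residual.

```python
import statistics


def filter_by_median_residual(z_values, threshold=3.5):
    """Drop values whose distance from the median is large relative to the MAD."""
    center = statistics.median(z_values)
    residuals = [abs(z - center) for z in z_values]
    mad = statistics.median(residuals)
    if mad == 0:
        # More than half the values equal the median, so there is no
        # spread estimate to compare against: keep everything.
        return list(z_values)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [z for z, r in zip(z_values, residuals) if r / scale <= threshold]


# Invented heights in microns, with two obvious spikes.
heights = [0.1, 0.2, 0.15, 0.12, 0.18, 25.0, -30.0]
print(filter_by_median_residual(heights))
```

Compared with an average +/- standard deviation filter, this needs only one tuning value and is less sensitive to how many spikes are present.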

Does anyone have better ideas/algorithms to perform
Tutorial: Five useful data wrangling tactics shown using python & pandas (Jupyter notebook).

Techniques to solve a few data wrangling problems I've encountered in my work. I prepared this notebook last week as part of a presentation to a group of data science students. I hope it's relevant, interesting, and not too basic for some folks here.

Note: the datasets are imported from data.world (where I work) via the datadotworld python package. However, I attempted to reference the canonical data sources (eg, Worldbank) in the notebook, as well.

https://github.com/nrippner/misc/blob/master/datadotworld_wrangling_tutorial.ipynb

/r/pystats
https://redd.it/6cpnxq