Spark in me – Telegram

Spark in me

2.73K subscribers

1.28K photos

71 videos

118 files

2.91K links

Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.


A guide to setting up Python virtual environments and installing OpenCV (a version that is not the very latest is available straight from the pip repository)

- https://www.pyimagesearch.com/2017/09/25/configuring-ubuntu-for-deep-learning-with-python/

In principle, the right way to work at a job is through Docker or virtual environments.
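A minimal sketch of creating a virtual environment from Python itself. It uses the standard-library venv module, which is an assumption on my part: the linked guide may use different tooling. The environment name "cv-env" and the helper create_env are hypothetical.

```python
import pathlib
import tempfile
import venv


def create_env(env_dir):
    """Create an isolated environment in env_dir and return the path to its pyvenv.cfg."""
    # with_pip=False keeps this sketch fast and offline; use with_pip=True for real work
    venv.create(env_dir, with_pip=False)
    return pathlib.Path(env_dir) / "pyvenv.cfg"


# Demo in a throwaway directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(pathlib.Path(tmp) / "cv-env")  # "cv-env" is a hypothetical name
    print(cfg.exists())
```

In a real environment created with pip enabled, a prebuilt OpenCV wheel can then be installed with pip (the PyPI package is opencv-python).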

#data_science
#python

Configuring Ubuntu for deep learning with Python - PyImageSearch

Inside this guide you will learn how to configure your Ubuntu machine for deep learning using Python, Keras, TensorFlow, mxnet, and more.

1.08K views · Alexander, 06:28

An even better snippet for downloading files, with resume support and ordered dictionaries. It downloads the files in the order in which you add the keys to the dictionary.

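import collections

# Run this in a Jupyter/IPython cell: the "!" line is a shell escape, not plain Python.
# Keys are output file names, values are URLs; OrderedDict keeps insertion order.
file_dict = collections.OrderedDict()
file_dict['FILENAME'] = 'URL'

for file, url in file_dict.items():
    # quote the URL so the shell does not interpret characters such as & or ?
    url_q = "'" + url + "'"
    # --continue resumes a partially downloaded file
    ! wget --continue --no-check-certificate --no-proxy -O $file $url_q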
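The same loop can be sketched in plain Python for use outside a notebook. This is an illustration, not part of the original snippet: build_wget_command and download_all are hypothetical helper names, wget is assumed to be installed, and the runner argument exists only so the loop can be dry-run without touching the network.

```python
import collections
import subprocess


def build_wget_command(filename, url):
    """Return the argv list for a resumable wget download (same flags as the notebook snippet)."""
    # Passing a list to subprocess avoids shell quoting problems with & and ? in URLs
    return ["wget", "--continue", "--no-check-certificate", "--no-proxy", "-O", filename, url]


def download_all(file_dict, runner=subprocess.run):
    """Download files in dictionary order; runner is injectable for dry runs."""
    for filename, url in file_dict.items():
        runner(build_wget_command(filename, url), check=True)


if __name__ == "__main__":
    files = collections.OrderedDict()
    files["FILENAME"] = "URL"  # placeholders, as in the original snippet
    # Dry run: print the commands instead of executing them
    download_all(files, runner=lambda cmd, check: print(" ".join(cmd)))
```

Replacing the dry-run runner with the default subprocess.run performs the actual downloads.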

#data_science
#python

1.52K views · Alexander, edited 11:20

A friend also sent me this: how to run several cells at the same time in a Jupyter notebook
- https://github.com/alexanderkuk/parallel-cell

#data_science
#python

GitHub - kuk/parallel-cell

Contribute to kuk/parallel-cell development by creating an account on GitHub.

1.98K views · Alexander, 15:15

In a new Kaggle competition I found an excellent "manual" on how to work with BSON (a MongoDB database dump).

Highly recommended reading
- https://www.kaggle.com/humananalog/keras-generator-for-reading-directly-from-bson/notebook

#data_science
#python

Keras generator for reading directly from BSON

Using data from Cdiscount’s Image Classification Challenge

1.18K views · Alexander, 14:15

A couple of great threads on how to make your Python generator thread-safe, that is, how to use the workers > 1 parameter of fit_generator in Keras with minimal effort. Useful if your pipeline is heavily CPU-bound.

- https://github.com/fchollet/keras/issues/1638
- https://stackoverflow.com/questions/41194726/python-generator-thread-safety-using-keras
- http://anandology.com/blog/using-iterators-and-generators/
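The common pattern discussed in threads like these is to wrap the generator in an iterator that serializes next() calls with a lock. A minimal sketch follows; the names ThreadSafeIterator, threadsafe_generator and batch_ids are mine and are not taken from Keras.

```python
import functools
import threading


class ThreadSafeIterator:
    """Wrap an iterator so that only one thread at a time can advance it."""

    def __init__(self, iterator):
        self._iterator = iter(iterator)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Without the lock, concurrent next() calls on a generator can raise
        # "ValueError: generator already executing"
        with self._lock:
            return next(self._iterator)


def threadsafe_generator(func):
    """Decorator: make a generator function return a thread-safe iterator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return ThreadSafeIterator(func(*args, **kwargs))
    return wrapper


@threadsafe_generator
def batch_ids(n_batches):
    """Hypothetical data generator that yields batch indices."""
    for i in range(n_batches):
        yield i


if __name__ == "__main__":
    shared = batch_ids(100)
    seen = []
    workers = [threading.Thread(target=lambda: seen.extend(shared)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(len(seen))
```

The lock only protects the generator itself. The heavy per-batch work should happen after the item has been fetched, otherwise the workers simply queue up behind the lock.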

#data_science
#python

Proper way of making a data generator which can handle multiple workers · Issue #1638 · fchollet/keras

I am having difficulty in writing a data generator which can work with multiple workers. My data generator works fine with one worker, but with > 1 workers, it gives me the following error:
Unbound...

1.14K views · Alexander, edited 07:53

A great copy-paste snippet for checking file hashes.

# make sure you downloaded the files correctly
import hashlib
import os.path as path

def sha256(fname):
    hash_sha256 = hashlib.sha256()
    with open(fname, 'rb') as f:
        for chunk in iter(lambda: f.read(4096), b''):
            hash_sha256.update(chunk)
    return hash_sha256.hexdigest()

filenames = ['', '', '', '', '']

hashes = ['', '', '', '', '']

data_root = path.join('data/')  # make sure you set up this path correctly

# this may take a few minutes
for filename, hash_ in zip(filenames, hashes):
    computed_hash = sha256(path.join(data_root, filename))
    if computed_hash == hash_:
        print('{}: OK'.format(filename))
    else:
        print('{}: fail'.format(filename))
        print('expected: {}'.format(hash_))
        print('computed: {}'.format(computed_hash))

#python
#data_science

1.13K views · Alexander, edited 07:42

It turns out there is already a ready-made SqueezeNet for Keras with weights =)

Not bad
- https://github.com/wohlert/keras-squeezenet

#python
#neural_nets

wohlert/keras-squeezenet

Pretrained Squeezenet 1.1 implementation in Keras. Contribute to wohlert/keras-squeezenet development by creating an account on GitHub.

1.25K views · Alexander, 08:41

I needed to extend a PyTorch class that I liked. If it were trivial, I would simply write a function, call it and pass it an instance of the class. There is one problem, though: some utilities in the class call local utilities, and it is not quite clear how to modify those on import.

Inspired by the BSON iterator example (posted above - https://goo.gl/xvZErF), I found that extending classes is actually quite simple:
- One https://goo.gl/JZpfiV
- Two https://goo.gl/D3KkLm
- And some old, mind-bending reading for those who are interested in Python internals
-- https://www.artima.com/weblogs/viewpost.jsp?thread=237121
-- https://www.artima.com/weblogs/viewpost.jsp?thread=236278
-- http://www.artima.com/weblogs/viewpost.jsp?thread=236275
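A minimal sketch of the idea, using plain Python classes rather than a real PyTorch class. BaseLoader, MyLoader and _preprocess are hypothetical names. The point is that overriding an internal helper in a subclass changes the behaviour of every inherited method that calls it, with no need to patch the imported module.

```python
class BaseLoader:
    """Hypothetical stand-in for a library class you want to extend."""

    def _preprocess(self, x):
        # internal helper that other methods of the class rely on
        return x * 2

    def load(self, items):
        return [self._preprocess(x) for x in items]


class MyLoader(BaseLoader):
    """Subclass that changes only the internal helper."""

    def _preprocess(self, x):
        # reuse the parent's behaviour via super(), then extend it
        return super()._preprocess(x) + 1


print(BaseLoader().load([1, 2, 3]))  # [2, 4, 6]
print(MyLoader().load([1, 2, 3]))    # [3, 5, 7]
```

load() is inherited unchanged, but it dispatches to the overridden _preprocess because method lookup starts from the instance's own class.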

#python
#data_science

Keras generator for reading directly from BSON

Explore and run machine learning code with Kaggle Notebooks | Using data from Cdiscount’s Image Classification Challenge

954 views · Alexander, 07:05

Filed under weird hacks: how do you load a k-means object from Python 2 into Python 3, with a newer sklearn version on top of that?

The obvious solution does not work because the sklearn version has changed
- https://goo.gl/s8V5zf

But this works

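# saving - python2
# centroids is assumed to be the fitted model's kmeans.cluster_centers_
import numpy as np
np.savetxt('centroids.txt', centroids, delimiter=',')

# loading - python3
from sklearn.cluster import KMeans
import numpy as np

centroids = np.loadtxt('centroids.txt', delimiter=',')
# n_clusters must match the number of centroids; n_init=1 because init is explicit
kmeans = KMeans(n_clusters=centroids.shape[0], init=centroids, n_init=1)
# note: the estimator still has to be fit before predict() can be used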

Unpickling a python 2 object with python 3

I'm wondering if there is a way to load an object that was pickled in Python 2.4, with Python 3.4.

I've been running 2to3 on a large amount of company legacy code to get it up to date.

Having don...

1.15K views · Alexander, 09:55

A superb Python library for working with video
- https://github.com/Zulko/moviepy

It is built on top of imageio and essentially lets you work with video in one line (instead of plain frame iteration or using ffmpeg by hand). It is great that Python has tools like this!

#python
#video

GitHub - Zulko/moviepy: Video editing with Python

Video editing with Python. Contribute to Zulko/moviepy development by creating an account on GitHub.

1.08K views · Alexander, 04:35

At my new job I saw that people train their models on Python 2 (say what?), on TensorFlow (WTF???), and load data in a single thread (it is 2017!).

For this reason I made this slightly trolling presentation for my colleagues. Maybe you will like it too
- https://goo.gl/ne9RH4

Everything simple is very simple, the main thing is just knowing where to look)

#data_science
#deep_learning
#python

1.14K views · Alexander, edited 11:54

Just found a book on practical Python programming patterns
- http://python-3-patterns-idioms-test.readthedocs.io/en/latest/PythonForProgrammers.html

Looks good

#python

1.28K views · Alexander, 14:01

Amazing article about the most popular warning in Pandas
- https://www.dataquest.io/blog/settingwithcopywarning/

#data_science

SettingwithCopyWarning: How to Fix This Warning in Pandas – Dataquest

SettingWithCopyWarning: Everything you need to know about the most common (and most misunderstood) warning in pandas and how to fix it!

1.16K views · Alexander, 06:26

Useful Python abstractions / sugar / patterns

I already shared a book about patterns, which mostly covers high-level / more complicated patterns. But for writing ML code, a simple imperative or functional programming style is sometimes enough.

So I will be posting simple and really powerful Python tips that I am learning now.

This time I found out about map and filter, which are super useful for data preprocessing:

Map

items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))

Filter

number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)
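A small sketch of chaining the two for preprocessing: filter drops the unusable entries, then map converts the rest. The function clean_numbers and the sample data are made up for illustration.

```python
def clean_numbers(raw):
    """Keep only the strings that are digits after stripping, and convert them to int."""
    # filter: drop empty and non-numeric entries
    numeric = filter(lambda s: s.strip().isdigit(), raw)
    # map: convert the surviving strings to integers
    return list(map(lambda s: int(s.strip()), numeric))


raw = [" 12", "7 ", "", "abc", "40"]
print(clean_numbers(raw))  # [12, 7, 40]

# The same result written as a list comprehension
print([int(s.strip()) for s in raw if s.strip().isdigit()])  # [12, 7, 40]
```

In Python 3 both map and filter return lazy iterators, which is why the examples wrap them in list().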

Also found this book - http://book.pythontips.com/en/latest/map_filter.html

#python
#data_science

848 views · Alexander, 04:58

Readable list comprehensions in Python

My list and dictionary comprehensions usually look like s**t

https://gist.github.com/IaroslavR/7dcb54830242a22de1869f6fd05a8d7e
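One common way to keep comprehensions readable is to put the expression, the for clause and the if clause on separate lines. The sketch below is my own illustration with made-up data and is not copied from the linked gist.

```python
# Hypothetical records for illustration
records = [
    {"name": "cat.jpg", "label": "cat", "size": 120},
    {"name": "dog.jpg", "label": "dog", "size": 80},
    {"name": "blob.jpg", "label": "unknown", "size": 300},
]

# List comprehension: one clause per line (expression, source, condition)
large_images = [
    record["name"]
    for record in records
    if record["size"] > 100
]

# Dictionary comprehension with the same layout
label_by_name = {
    record["name"]: record["label"]
    for record in records
    if record["label"] != "unknown"
}

print(large_images)   # ['cat.jpg', 'blob.jpg']
print(label_by_name)  # {'cat.jpg': 'cat', 'dog.jpg': 'dog'}
```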

#python

Examples of readable comprehension formatting from SO

Examples of readable comprehension formatting from SO - examples.py

816 views · Alexander, 13:54

A decent explanation about decorators in Python

http://book.pythontips.com/en/latest/decorators.html
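A minimal decorator sketch to go with the link: a function that takes a function and returns a wrapped version of it. count_calls and normalize are hypothetical names chosen for this example.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times func has been called."""
    @functools.wraps(func)  # keep the original name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def normalize(x, scale=255.0):
    """Hypothetical preprocessing helper."""
    return x / scale


normalize(51.0)
normalize(102.0)
print(normalize.calls, normalize.__name__)  # 2 normalize
```

The @count_calls line is shorthand for normalize = count_calls(normalize).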

#python

1.13K views · Alexander, edited 07:45

Yet another python tricks book

https://dbader.org/
https://www.getdrip.com/deliveries/xugaymstfzmizbyposdk?__s=ejdgfo9tsdhpgcrcscs3
https://vk.com/doc7608079_466151365

#python

Python Training by Dan Bader – dbader.org

Dan Bader helps Python developers become more awesome. His tutorials, videos, and trainings have reached over half a million developers around the world.

1.09K views · Alexander, edited 05:53