To tune Spark job performance and debug finished jobs, engineers need information. You can get it from Spark events and inspect it in the History Server.
https://luminousmen.com/post/spark-history-server-and-monitoring-jobs-performance
There are many myths in the IT field:
“You can unsubscribe from spam”,
“Backups are not needed”,
“Two antiviruses are better than one”
Mute all useless Java-related logging in PySpark:
import logging

def mute_spark_logs(sc):
    """Mute Spark info logging (show only error logs)."""
    # Reach log4j through the JVM gateway of the SparkContext
    logger = sc._jvm.org.apache.log4j  # noqa
    logger.LogManager.getLogger("org").setLevel(logger.Level.ERROR)
    logger.LogManager.getLogger("akka").setLevel(logger.Level.ERROR)
    logging.info("Spark muted.")
gist
#python #spark
Gist
mute_spark.py
Useful Python data structures for working with numerical data and speeding up your computations:
numpy arrays - for N-dimensional structured arrays
scipy.spatial - for spatial queries like distances, nearest neighbors, etc.
pandas - for SQL-like grouping and aggregations
dask - parallel arrays, dataframes, and lists that extend to larger-than-memory or distributed environments
xarray - for grouping across multiple dimensions
scipy.sparse - sparse matrices for 2-dimensional structured data
sparse - for N-dimensional structured data
scipy.sparse.csgraph - for graph-like problems (e.g. finding the shortest path)
#python
Punctuation removal
You can easily remove all punctuation with this snippet:
import string
input_str = "This &is [an] example? {of} string. with.? punctuation!!!!"  # Sample string
# In Python 3, the third argument of str.maketrans lists the characters to delete
result = input_str.translate(str.maketrans("", "", string.punctuation))
print(result)
Output:
This is an example of string with punctuation
#python
There is a secret that needs to be understood in order to write good software documentation: there isn’t one thing called documentation, there are four.
They are: tutorials, how-to guides, explanation and technical reference. They represent four different purposes or functions, and require four different approaches to their creation. Understanding the implications of this will help improve most software documentation - often immensely.
Check out Daniele Procida's talk at PyCon Australia 2017.
#dev #soft_skills
YouTube
What nobody tells you about documentation
Daniele Procida
http://2017.pycon-au.org/schedule/presentation/15/
#pyconau
This talk was given at PyCon Australia 2017 which was held from 3-8 August, 2017 in Melbourne, Victoria.
PyCon Australia is the national conference for users of the Python Programming…
From the C++ Core Guidelines:
>Scream when you see a macro that isn't just used for source control (e.g., #ifdef)
It somewhat fits
source
These clothes and accessories outsmart facial recognition tech. Prepare yourself for the future.
https://www.businessinsider.com/clothes-accessories-that-outsmart-facial-recognition-tech-2019-10
#privacy
Business Insider
These clothes use outlandish designs to trick facial recognition software into thinking you're not human
Privacy-focused designers, academics, and activists have designed wearable accessories and clothes meant to thwart facial recognition tech.
If your application needs to measure elapsed time, you need a timer that will give the right answer even if the user changes the time on the system clock
https://luminousmen.com/post/how-to-not-leap-in-time-using-python
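One way to do this in Python is to read a monotonic clock instead of the wall clock. The sketch below is only an illustration: the helper name `timed` is hypothetical and not taken from the linked post.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds) measured on a monotonic clock."""
    start = time.monotonic()  # not affected by changes to the system clock
    result = func(*args, **kwargs)
    elapsed = time.monotonic() - start
    return result, elapsed


total, seconds = timed(sum, range(1_000_000))
print(f"sum={total}, took {seconds:.6f}s")
```

Unlike `time.time()`, the value of `time.monotonic()` never goes backwards, so the difference between two readings cannot be negative.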
Blog | iamluminousmen
How to not leap in time using Python
Dictionaries in CPython are everywhere: classes, global variables, and kwargs parameters are all based on them, and the interpreter creates thousands of dictionaries even if you never typed a curly bracket in your script. So it is not surprising that their implementation keeps improving and acquiring new tricks.
The internal structure of dictionaries in Python is not limited to buckets and closed hashing. If you don't know the number of elements in the dictionary you just created, how much memory is spent on each element, why the dictionary is now (CPython 3.6+) implemented with two arrays and how that relates to preserving insertion order, or you simply haven't watched Raymond Hettinger's presentation "Modern Python Dictionaries: A confluence of a dozen great ideas", then the time has come.
Recommended 👌
#python
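To make the two-array idea concrete, here is a toy sketch, not CPython's actual code. It assumes a fixed-size table, linear probing, and no deletion or resizing; the class name `CompactDict` is hypothetical. A sparse index table points into a dense entry list, and the dense list is what keeps the insertion order.

```python
class CompactDict:
    """Toy two-array dictionary: sparse index table plus dense entry list."""

    def __init__(self, size=8):
        self.indices = [None] * size  # sparse: slot -> position in entries
        self.entries = []             # dense: (hash, key, value) in insertion order

    def _find_slot(self, key):
        # Linear probing for simplicity; CPython uses a different probe sequence.
        h = hash(key)
        slot = h % len(self.indices)
        while True:
            pos = self.indices[slot]
            if pos is None or self.entries[pos][1] == key:
                return slot, h
            slot = (slot + 1) % len(self.indices)

    def __setitem__(self, key, value):
        slot, h = self._find_slot(key)
        pos = self.indices[slot]
        if pos is None:
            # Keep one slot empty so probing always terminates (no resizing here).
            if len(self.entries) >= len(self.indices) - 1:
                raise OverflowError("toy table is full")
            self.indices[slot] = len(self.entries)
            self.entries.append((h, key, value))
        else:
            self.entries[pos] = (h, key, value)  # overwrite, position unchanged

    def __getitem__(self, key):
        slot, _ = self._find_slot(key)
        pos = self.indices[slot]
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def keys(self):
        return [key for _, key, _ in self.entries]


d = CompactDict()
d["b"] = 1
d["a"] = 2
d["b"] = 3
print(d.keys(), d["b"])
```

Iterating over `entries` yields keys in the order they were first inserted, regardless of where their hashes land in `indices`.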
Singleton pattern in Python
Do you like Singletons? I don't either: they are a bit complicated.
But you know what? I have never seen a good implementation of the Singleton pattern in any code (except some famous libraries) or in any interview. We need to fix that!
Please check the following implementation:
from weakref import WeakValueDictionary

class Singleton(type):
    # Weak references let an instance be garbage-collected once nobody uses it
    _instances = WeakValueDictionary()

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            instance = super(Singleton, cls).__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]

class Config(metaclass=Singleton):
    pass

Do you know a better implementation? Send me yours and we will discuss.
#python
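A quick way to see how such a metaclass behaves is sketched below. The metaclass is repeated so the block runs on its own, and the `created` counter is a hypothetical addition for the demo. The last check relies on CPython releasing an object as soon as its last strong reference is gone.

```python
import gc
from weakref import WeakValueDictionary


class Singleton(type):
    _instances = WeakValueDictionary()

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]


class Config(metaclass=Singleton):
    created = 0  # hypothetical counter: how many times __init__ actually ran

    def __init__(self):
        type(self).created += 1


a = Config()
b = Config()
same_object = a is b           # both names point to the single instance
count_while_alive = Config.created

del a, b                       # drop every strong reference
gc.collect()
c = Config()                   # the weak entry is gone, so a new instance is built
count_after_release = Config.created
print(same_object, count_while_alive, count_after_release)
```

The weak dictionary means the singleton lives only as long as something holds it; if you need it to live for the whole process, a plain dict would be the simpler choice.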
Gist
singleton.py
A service that turns photos into comics in the style of the Russian comic book publisher Bubble has gained popularity.
It handles faces poorly: they become scary and ugly, especially if the photo is low-contrast or blurry. But photos of cats turn out interesting.
https://face.bubble.ru/en/
#stuff
Bubblecomics
App | Bubble
Jupyter notebook will have to make room for a new solution from Netflix: Polynote, a new polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more.
https://medium.com/netflix-techblog/open-sourcing-polynote-an-ide-inspired-polyglot-notebook-7f929d3f447
#stuff
Medium
Open-sourcing Polynote: an IDE-inspired polyglot notebook
Jeremy Smith, Jonathan Indig, Faisal Siddiqi
What is the definition of a good software engineer? The question is meant to be personal: it focuses on the views of whoever you ask. I share my own thoughts in this post.
https://streamlit.io/
Web prototyping / reporting framework. It looks like a great alternative to bokeh/plotly - pure python without callbacks and with advanced data caching
Those parts of the system that you can hit with a hammer (not advised) are called "hardware"; those that you can only curse at are called "software"
— Anonymous