L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

The story of how Facebook, along with the FBI, hunted a man who regularly insulted and attacked little girls. Facebook even paid for a zero-day vulnerability in Tails(!) that allowed to identify the real address of the criminal.

https://www.vice.com/en_us/article/v7gd9b/facebook-helped-fbi-hack-child-predator-buster-hernandez?utm_content=buffer15b0e&utm_medium=social&utm_source=twitter&utm_campaign=buffer

Recommended for reading 👌

#privacy

VICE

Facebook Helped the FBI Hack a Child Predator

Facebook paid a cybersecurity firm six figures to develop a zero-day in Tails to identify a man who extorted and threatened girls.

146 viewsedited 13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Congrats! Web scraping is legal! (US precedent)

https://www.reddit.com/r/datascience/comments/excxlv/congrats_web_scraping_is_legal_us_precedent/

From the datascience community on Reddit: Congrats! Web scraping is legal! (US precedent)

Explore this post and more from the datascience community

171 views13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Found this little diamond on wiki - you can quote it like:
ninety percent of everything is crap
or
everybody is wrong on the internet

https://en.m.wikipedia.org/wiki/Sturgeon%27s_law

#stuff

Wikipedia

Sturgeon's law

adage cited as "ninety percent of everything is crap"

152 viewsedited 13:09

👍 1

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Lex has released a rather large but interesting interview with Andrew Ng, a man whose name you've met many times if you've ever been interested in artificial intelligence, machine learning, or technology in general.

https://youtu.be/0jspaMLxBig

#ds #ml

YouTube

Andrew Ng: Deep Learning, Education, and Real-World AI | Lex Fridman Podcast #73

Andrew Ng is one of the most impactful educators, researchers, innovators, and leaders in artificial intelligence and technology space in general. He co-founded Coursera and Google Brain, launched deeplearning.ai, Landing.ai, and the AI fund, and was the…

178 viewsedited 13:09

👍

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Check out background removal from videos. At the end you're getting gifs and it's really quick

http://unscreen.com

#usefullinks

156 views13:09

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

This media is not supported in your browser

VIEW IN TELEGRAM

#wednesday

137 views13:09

👍

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Congrats to the Apache Spark community and all the contributors! The Apache Spark 3.0 is here. Try it out!

https://spark.apache.org/releases/spark-release-3-0-0.html

#spark

119 viewsedited 13:09

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Russia lifts its ban on the Telegram messenger app

https://www.theverge.com/2020/6/18/21295535/russia-telegram-ban-lifted-security

#privacy

The Verge

Russia lifts its ban on the Telegram messenger app

Telegram said earlier this year it has 400 million monthly active users

126 viewsedited 09:12

👍

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Have you heard of microservices? Of course you have - any housewife already knows how to deploy them on a k8s cluster. Here's some thinking about them.

https://luminousmen.com/post/thoughts-on-microservices

Blog | iamluminousmen

Thoughts on (micro)services

Have you heard of microservices? Of course you have - any housewife already knows how to deploy them on a k8s cluster. Here's some thinking about them.

126 views13:09

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Wow! Here comes Nextgen - Unreal Engine 5 with high quality textures. It creates as much geometric detail as the eye can see, and dynamic lighting that responds to changes in the scene and lighting with tools and libraries. That means that it can be done without much effort for artists, and we will see it soon enough in many games.

https://youtu.be/qC5KtatMcUw

#stuff

YouTube

Unreal Engine 5 Revealed! | Next-Gen Real-Time Demo Running on PlayStation 5

Unreal Engine 5 empowers artists to achieve unprecedented levels of detail and interactivity, and brings these capabilities within practical reach of teams of all sizes through highly productive tools and content libraries.

Join Technical Director of Graphics…

134 viewsedited 13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

SpaceX use Chromium and JavaScript for the Dragon 2 flight interface. Now I can officially say that frontend nowadays is a rocket science.

Check out the exact replica of the interface by the link below

https://iss-sim.spacex.com/

#stuff

SPACEX - ISS Docking Simulator

This simulator will familiarize you with the controls of the actual interface used by NASA Astronauts to manually pilot the SpaceX Dragon 2 vehicle to the International Space Station.

325 viewsedited 13:09

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

#wednesday

109 views13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Spark 3.0 now on Databricks

https://databricks.com/blog/2020/06/18/introducing-apache-spark-3-0-now-available-in-databricks-runtime-7-0.html

#spark #big_data

Databricks

Introducing Spark 3.0 - Now Available in Databricks Runtime 7.0

Learn more about the latest release of Apache Spark, version 3.0.0, including new features like AQE and how to begin using it through Databricks Runtime 7.0.

151 viewsedited 13:09

👍 1

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

I think it's clear now who's the winner
With that said it can be obvious that standard data scientist tech stack will change in the nearest future. As an example pandas can be somewhat easily replaced by koalas. By the way, new koalas is here and it covers 80% of pandas api - https://github.com/databricks/koalas/releases/tag/v1.0.0

#spark #big_data

110 viewsedited 09:12

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Technological degradation

Technology degrades from generation to generation, because no project starts with nothing, you always choose tools, libraries, etc. to solve a problem. Of course you can do everything from scratch using the assembler, but nobody will do it because it doesn't make sense. And when you take the X library which depends on the Y and Z libraries, it doesn't mean that you know what Y and Z do, you just know what X does and how to use it. And so while any problem in IT can be solved by adding an abstraction level, this abstraction stack is extremely hard to know for one person, and without proper communication between old developers and new developers, those connections are lost. And each generation grows at its abstraction level without understanding what's going on below.

https://youtu.be/pW-SOdj4Kkk.

#dev

YouTube

Jonathan Blow - Preventing the Collapse of Civilization (English only)

Jonathan's talk from DevGAMM 2019.
https://www.youtube.com/c/DevGAMMchannel

119 viewsedited 13:09

👍 2

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Understanding Explain Formatted in Spark 3.0

https://medium.com/analytics-vidhya/spark-3-understanding-explain-formatted-d4f33c1dee86

#spark #big_data

Medium

Spark 3 — Understanding Explain Formatted

Spark 3.0 is around the corner, and yes, this time it’s for real(I hope).

112 viewsedited 09:12

👍

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

109 views13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Data Lake and the Data Warehouse. They seemed similar, but there are differences.

https://luminousmen.com/post/data-lake-vs-data-warehouse

Blog | iamluminousmen

Data Lake vs Data Warehouse

Data Lake and the Data Warehouse. They seemed similar, but there are differences.

160 views13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

Nice advises on how to read research papers from Andrew Ng's CS230 Lectures on Deep learning

https://deeps.site/blog/2019/10/14/reading-research-papers-career-advice/

#ml #ds

deeps.site

ML Career Advice and Reading Papers

Space to uncover things that tick.

107 viewsedited 13:09

👍 3

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

108 views09:12

👍 5

L̶u̵m̶i̵n̷o̴u̶s̶m̶e̵n̵B̶l̵o̵g̵

AutoML

We should understand that ML models are not static - as soon as the data changes, so do the models and their predictions, and it is necessary to constantly monitor ML pipelines, retraining, optimization and so on. All these are "time series" problems, which should be solved by engineers and data scientists, which are not trivial from many points of view. And solutions may have huge time horizons, but the worst part is that they need to be maintained afterwards. Eww. As engineers, we love to create things, but we don't want to maintain them. To somehow automate data preprocessing, feature engineering, model selection and configuration, and the evaluation of results, the AutoML process was invented. AutoML can automate these tasks by providing a basic result, can provide high quality for certain problems and can give an understanding of where to continue research.

It sounds great, of course, but how effective is it? The answer to this question depends on how you use it. It's about understanding where people are good at and where machines are good at. People are good at connecting existing data to the real world - they understand the business area, they understand what specific data means. Machines are good at calculating statistics, storing and updating state, and doing repetitive processes. Tasks like exploratory data analysis, preprocessing of data, hyper-parameter tuning, model selection and putting models into production can be automated to some extent with an automated machine learning frameworks, but good feature engineering and draw actionable insights can be done by human data scientist that understands what he is doing. By separating these activities, we can easily benefit from AutoML now, and I think that in the future AutoML as a thing will replace most of the work of a data scientist.

Many data scientists are saying that the existence of human data scientist is still necessary after AutoML, but I doubt it. I am not talking about specific tasks to achieve maximum model accuracy or research, I am talking about real business problems. And here I think it is obvious that AutoML will win. There are not many projects in the real world that go from POC to production, and automation will help to make a quick prototypes and eventually increase ROI for the company.

What's more, I think it's noticeable that the industry is undergoing a strong evolution of ML platform solutions (e.g. Amazon Sagemaker, Microsoft Azure ML, Google Cloud ML, etc.) and as ML adoption grows, many enterprises are quickly moving to ready-to-use DS&ML platforms to accelerate time to market, reduce operating costs and improve success rates (number of ML models deployed and commissioned).

#ml

169 views13:09

👍 2

About

Blog

Apps

Platform