Before we skate into the weekend, let’s read the DSB! This volume is full of ridiculously good articles. I would recommend the one from Computer Science & Science where you can learn, in pure Python, how Bitcoin really works. Or the beautiful intro to Kafka from Graphs and Visualizations. Or the story about the creation of a data-driven company from Business and Career. And many more. Just go through it all.
And as always, enjoy your reading.
https://www.aidancooper.co.uk/scottish-people-are-more-inclined-to-skip-the-gym/ – Scottish people would rather stay at home and watch soccer than torture themselves in the gym. Or something like that can be found in this interesting analysis.
https://deepnote.com/@ashish-karhade/Apple-Music-Streaming-analysis-RZehtt6QT5q1nWDC6mwXrQ – EDA of music streaming data from Apple Music. Very nicely done!
https://towardsdatascience.com/causal-inference-example-elasticity-de4a3e2e621b – Price elasticity estimation done in scikit-learn.
Computer Science & Science
https://karpathy.github.io/2021/06/21/blockchain/ – Everybody talks about crypto or blockchain. Almost nobody really understands what is hidden under the hood. In the article you will learn to create, digitally sign, and broadcast a Bitcoin transaction in pure Python. And the code itself is unbelievably amazing. (rcmd by reader)
https://realpython.com/python-counter/ – Learn to use Python’s Counter from the collections module to count objects.
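If you have never met Counter, here is a minimal sketch of the idea. The sample sentence and the variable names are made up for illustration and are not taken from the linked article; only the Counter calls themselves come from the standard library.

```python
from collections import Counter

# Hypothetical sample data, just for illustration.
words = "the quick brown fox jumps over the lazy dog the end".split()

# Counter takes any iterable of hashable items and tallies them.
counts = Counter(words)

# Look up a count like a dict; missing keys return 0 instead of raising KeyError.
the_count = counts["the"]
cat_count = counts["cat"]

# most_common(n) returns the n most frequent items as (item, count) pairs.
top = counts.most_common(1)

# update() adds further observations to the existing tallies.
counts.update(["fox", "fox"])
fox_count = counts["fox"]

print(the_count, cat_count, top, fox_count)
```

The nice part is that you never have to initialise keys yourself, which is what usually makes the plain-dict version of this clumsy.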
Graphs and Visualizations
https://www.gentlydownthe.stream/ – This is hilarious. How Apache Kafka works, told as a beautiful presentation or visualization. Whatever this is, it’s a lovely introduction. (rcmd by reader)
Business and Career
https://erikbern.com/2021/07/07/the-data-team-a-short-story.html – Another perfect read about building a data-driven company: what the necessities are, why the modelling is most of the time the least important thing, and much more. (rcmd by reader)
– In the US, Google releases the Google Pay Balance Card by Visa with NFC tap-and-pay functionality. Google Pay also has P2P payments without the need for a bank account. And Google Bank Account is supposed to launch sometime this year.
https://sifted.eu/articles/revolut-losses-results-2020/ – Revolut doubled its losses last year and might have trouble obtaining a UK banking license.
https://www.aimyths.org/ – Slightly confusing website that will mythbust your opinions about AI. (rcmd by reader)
https://www.oracle.com/news/announcement/oracle-and-deutsche-bank-2021-06-24/ – Deutsche Bank is migrating into the Oracle Exadata Cloud. (rcmd by reader)
https://theconversation.com/languages-dont-all-have-the-same-number-of-terms-for-colors-scientists-have-a-new-theory-why-84117 – Why do languages have different numbers of terms for colors?
https://github.com/ankurchavda/SparkLearning – More than 60 points about the theory of Spark; could be useful reference material.
https://deepnote.com/@bala-priya/Guide-to-Cross-Validation-and-Hyperparameter-Search-aXKLhfeNSu6MKgtckqRMnw – Classical topic: a comprehensive intro to cross-validation and hyperparameter search.
https://github.com/microsoft/ML-For-Beginners – 12-week, 24-lesson curriculum about ML by Microsoft.
Data & Libraries
https://github.com/fabsig/GPBoost – GPBoost is a software library, written in C++, for combining tree-boosting with Gaussian process and mixed effects models. There is a Python version as well as an R version. (rcmd by reader)
– How PayPal migrated from Teradata to Google Cloud. Step by step, not all at once, and with the help of automation, of course. (rcmd by reader)
https://eval.ai/ – Evaluate your ML and AI algorithms on this “alternative Kaggle”. (rcmd by reader)
https://towardsdatascience.com/learn-you-some-kedro-be67d4fc0ce7 – What is Kedro? It’s a Python framework for creating reproducible, maintainable and modular data science code with the help of concepts from software engineering.
Video & Podcast
Papers & Books
https://abseil.io/resources/swe-book – How does software engineering at Google work, and what does it look like? Find out in this book. (rcmd by reader)
https://arxiv.org/pdf/2011.14817.pdf – Paper about TailCor, a tail correlation typical for rare events.
https://paperswithcode.com/paper/revisiting-deep-learning-models-for-tabular – Deep Learning for Tabular Data. It seems there is still no universally superior solution and it’s a draw between Gradient Boosted Decision Trees and DL models.
Behind the Fence
https://www.pythonjobshq.com/jobs/66817620-senior-software-engineer-at-truveris – Senior Software Engineer at Truveris, New York, USA.