DSB incoming! And that means it’s Friday. I would recommend the article about PCA in Analytical; the short read about code review in MLOps is also interesting.
And as always, enjoy your reading.
Analytical
https://towardsdatascience.com/pca-beyond-the-dimensionality-reduction-e352eb0bdf52 – Use PCA to check how the features vary together.
https://towardsdatascience.com/graph-neural-networks-through-the-lens-of-differential-geometry-and-algebraic-topology-3a7c3c22d5f – Graph neural networks reinterpreted via differential geometry and algebraic topology.
https://mathspp.com/blog/minimax-algorithm-and-alpha-beta-pruning – Introduction to the minimax search algorithm, and to alpha-beta pruning.
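If you want a quick taste of minimax before reading the article, here is a minimal sketch of alpha-beta pruning. It is not the code from the linked post: the function name `alphabeta` and the nested-list encoding of the game tree are assumptions made purely for illustration.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of a game tree encoded as nested lists.

    A leaf is a number (score from the maximizing player's view); an inner
    node is a list of children. This encoding is a hypothetical choice.
    """
    if not isinstance(node, list):
        return node  # leaf: its score is its value

    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer higher up already has a better option
        return best

    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer higher up already has a better option
    return best


# Root is a maximizing node, its three children are minimizing nodes.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, True))
```

Pruning only skips branches that cannot change the result, so the value returned is the same as plain minimax would give.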
Computer Science & Science
https://paulstamatiou.com/crypto-design-challenges/ – How can the design of crypto be changed or improved to gain users’ trust? How widely has crypto already been adopted?
https://www.kdnuggets.com/2021/11/5-advanced-tips-python-sequences.html – Short, useful tips on things like unpacking, list comprehensions and generator expressions.
https://easylang.online/apps/tutorial_monte_carlo_methods.html – Simulations of coin flipping, roulette, and lotto. You can find out which variant will earn you money.
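The Monte Carlo idea itself fits in a few lines. Below is a rough sketch in Python (the linked tutorial uses its own language), assuming a European wheel with 37 pockets and an even-money bet on red. The function name `average_red_bet_return` is made up for this example, and treating pockets 1 to 18 as "red" is a simplification, since only the count of winning pockets matters.

```python
import random


def average_red_bet_return(spins, seed=0):
    """Estimate the average return per 1-unit bet on red by simulation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = 0
    for _ in range(spins):
        pocket = rng.randrange(37)  # pockets 0..36 on a European wheel
        # Assumption: pockets 1..18 stand in for the 18 red pockets.
        total += 1 if 1 <= pocket <= 18 else -1
    return total / spins


print(average_red_bet_return(200_000))
```

With enough spins the estimate should settle near the theoretical value of -1/37 per unit bet, so this particular variant will not earn you money.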
Graphs and Visualizations
https://medium.com/neo4j/graph-visualization-of-panama-papers-data-in-neo4j-9c08ca17039c – Panama Papers visualized in Neo4j Bloom. Quite an impressive graph visualization. (rcmd by reader)
https://www.analyticsvidhya.com/blog/2021/11/complete-guide-to-people-counting-and-tracking-end-to-end-deep-learning-project/ – How to count people in images.
https://filwd.substack.com/p/clarity-and-aesthetics-in-visualization – Short but very interesting and often-mentioned article about attention, contrast and grouping in visualizations.
Business and Career
https://www.kaggle.com/kaggle-survey-2021 – As it does every year, Kaggle has released the results of its DS/ML survey. How large is the gender gap? Which online courses are the most popular? And of course, salaries. (rcmd by reader)
– Web 1.0 was all about static HTML pages, Web 2.0 brought user-generated content, and Web 3.0? It’s about metaverses, AI and crypto. Read this report to find out about the future of the internet, worth 400 billion USD in 2025.
https://benn.substack.com/p/the-missing-analytics-executive – The chief data officer (CDO) role is underestimated by companies, poorly defined and overburdened. Let’s split it into two new ones: vice president of data and chief analytics officer.
https://www.hani.co.kr/arti/english_edition/e_national/1016107.html – You have probably already heard about this: the South Korean government handed over 170M facial images without any consent.
https://realpython.com/interview-eric-wastl/ – AoC is coming! Before you test your coding skills in this programming version of the Advent calendar, read this interview with its founder, Eric Wastl. (rcmd by reader)
https://www.quantamagazine.org/to-be-energy-efficient-brains-predict-their-perceptions-20211115/ – The neurons in your brain seem to use predictive processing and make inferences.
https://www.analyticsvidhya.com/blog/2021/11/an-introduction-to-stemming-in-natural-language-processing/ – Nice and simple overview of stemming in NLP.
https://www.analyticsvidhya.com/blog/2021/11/3-ways-to-deal-with-settingwithcopywarning-in-pandas/ – SettingWithCopyWarning is one of the most common warnings in Pandas. What does it mean and how do you handle it?
https://e2eml.school/transformers.html – Exhaustive and comprehensive guide to transformers.
Data & Libraries
https://github.com/koaning/paftdunk – PaftDunk will help you set up a recommender benchmark for your analysis. But even if you are not interested, have a look at it. You won’t see code this elegant every day. (rcmd by reader)
https://www.kdnuggets.com/2021/11/easy-synthetic-data-python-faker.html – DSB #25 mentioned the Faker library, which generates synthetic data. Refresh your memory of how to use it with this simple article.
MLOps
https://www.kdnuggets.com/2021/11/simple-things-steal-agile-data-science-analytics.html – What is the most important thing that data science should take from agile? Peer review of code and solutions.
https://towardsdatascience.com/complete-machine-learning-pipeline-for-nlp-tasks-f39f8b395c0d – End-to-end implementation of a pipeline for named entity recognition in emails.
https://eng.lyft.com/parameter-exploration-at-lyft-b9d2a1483c82 – Parameter exploration and optimization with Bayesian methods and Gaussian process models.
Video & Podcast
https://www.imdb.com/title/tt15392100/ – Highly praised four-episode Netflix miniseries about how Google copied the idea of Google Earth. (rcmd by reader)
https://realpython.com/podcasts/rpp/81/ – What are the new features in Python 3.10? (rcmd by reader)
Papers & Books
https://github.com/eugeneyan/applied-ml – Curated list of ML papers focused on practical implementation. (rcmd by reader)
https://www.humblebundle.com/books/code-like-pro-manning-publications-books – “Code like a pro” is the name of a Humble Bundle of books about programming. And if you are a pure Pythonista, then use this offer of O’Reilly Python books. Improve your skills beyond your wildest dreams! (rcmd by reader)
https://store.metasnake.com/effective-pandas-book?coupon=BF40 – Winter is coming, so it’s all about the books. Stay home, stay warm and read this one about Pandas. (rcmd by reader)
Behind the Fence
https://jobs.volvocars.com/job/Gothenburg-Data-Scientist/727855901/ – DS at Volvo, Gothenburg, Sweden. (rcmd by reader)