
DSB #126

Hi,

DSB incoming, which means it’s Friday! I recommend the article about PCA in Analytical; the short read about code review in MLOps is also interesting.

And as always, enjoy your reading.

Analytical

https://towardsdatascience.com/pca-beyond-the-dimensionality-reduction-e352eb0bdf52 – Use PCA to check how the features vary together.

https://towardsdatascience.com/graph-neural-networks-through-the-lens-of-differential-geometry-and-algebraic-topology-3a7c3c22d5f – Graph neural networks reinterpreted via differential geometry and algebraic topology.

https://mathspp.com/blog/minimax-algorithm-and-alpha-beta-pruning – Introduction to the minimax search algorithm, and to alpha-beta pruning.
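For readers who want a feel for the last item before clicking through, here is a minimal sketch of minimax with alpha-beta pruning. It is not taken from the linked article: the nested-list tree representation, the function name `alphabeta` and the sample tree are assumptions made purely for illustration.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of a game tree given as nested lists.

    Hypothetical representation: a leaf is a number (the score from the
    maximizing player's point of view), an inner node is a list of children.
    """
    if not isinstance(node, list):
        return node  # leaf: nothing left to search

    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizer above already has a better option
        return best

    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer above already has a better option
    return best


# Three possible moves for the maximizer, each answered by the minimizer.
tree = [[3, 5], [2, 9], [0, 7]]
print(alphabeta(tree))  # 3
```

In this tree the leaves 9 and 7 are never visited: once a reply worth 2 (or 0) is found, that move is already worse than the first one, which guarantees 3.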

Computer Science & Science

https://paulstamatiou.com/crypto-design-challenges/ – How can the design of crypto products be changed or improved to gain users’ trust? And how widely is crypto already adopted?

https://www.kdnuggets.com/2021/11/5-advanced-tips-python-sequences.html – Short, useful tips on things like unpacking, list comprehensions and generator expressions.

https://easylang.online/apps/tutorial_monte_carlo_methods.html – Simulations of coin flipping, roulette, and lotto. You can find out which variant will earn you money.
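The idea behind the Monte Carlo item can be sketched in a few lines of Python (the linked tutorial uses its own language, so this is not its code). The function name `simulate_red_bets`, the seed and the choice of a European wheel with 37 pockets are assumptions for illustration only.

```python
import random


def simulate_red_bets(n_spins, seed=0):
    """Estimate the average net win per 1-unit bet on red.

    Assumes a European wheel: 37 pockets, 18 of them red. Which pockets
    are red does not matter for the estimate, only how many, so pockets
    1..18 stand in for red here.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    net = 0
    for _ in range(n_spins):
        pocket = rng.randrange(37)  # 0..36, pocket 0 is the green zero
        net += 1 if 1 <= pocket <= 18 else -1
    return net / n_spins


estimate = simulate_red_bets(200_000)
exact = (18 - 19) / 37  # win on 18 pockets, lose on the other 19
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

The simulated value should land close to the exact expectation of about -0.027 per unit bet, so this variant is not the one that earns you money.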

Graphs and Visualizations

https://medium.com/neo4j/graph-visualization-of-panama-papers-data-in-neo4j-9c08ca17039c – Panama Papers visualized in Neo4j Bloom. Quite an impressive graph visualization. (rcmd by reader)

https://www.analyticsvidhya.com/blog/2021/11/complete-guide-to-people-counting-and-tracking-end-to-end-deep-learning-project/ – How to count people in images.

https://filwd.substack.com/p/clarity-and-aesthetics-in-visualization – Short but very interesting and often mentioned article about attention, contrast and grouping in visualizations.

Business and Career

https://www.kaggle.com/kaggle-survey-2021 – As every year, Kaggle has released the results of its DS/ML survey. How large is the gender gap? Which online courses are the most popular? And, of course, salaries. (rcmd by reader)

https://www.coindesk.com/business/2021/11/25/grayscale-says-metaverse-is-a-trillion-dollar-market-opportunity/ – Web 1.0 was all about static HTML pages, Web 2.0 brought user-generated content, and Web 3.0? It’s about metaverses, AI and crypto. Read this report to find out about the future of the internet, worth 400 billion USD in 2025.

https://benn.substack.com/p/the-missing-analytics-executive – The chief data officer (CDO) role is underestimated by companies, poorly defined and overburdened. Let’s split it into two new ones: vice president of data and chief analytics officer.

Pop

https://www.hani.co.kr/arti/english_edition/e_national/1016107.html – You have probably already heard about this: the South Korean government handed over 170M facial images without any consent.

https://realpython.com/interview-eric-wastl/ – Advent of Code (AoC) is coming! Before you test your coding skills in this programming version of the Advent calendar, read this interview with its founder Eric Wastl. (rcmd by reader)

https://www.quantamagazine.org/to-be-energy-efficient-brains-predict-their-perceptions-20211115/ – Your brain neurons seem to use predictive processing and make inferences.

Education

https://www.analyticsvidhya.com/blog/2021/11/an-introduction-to-stemming-in-natural-language-processing/ – Nice and simple overview of stemming in NLP. 

https://www.analyticsvidhya.com/blog/2021/11/3-ways-to-deal-with-settingwithcopywarning-in-pandas/ – SettingWithCopyWarning is one of the most common warnings in Pandas. What does it mean, and how do you handle it?

https://e2eml.school/transformers.html – Exhaustive and comprehensive guide to transformers.

Data & Libraries

https://github.com/koaning/paftdunk – PaftDunk will help you set up a recommender benchmark for your analysis. But even if you are not interested, have a look at it. You won’t see elegant code like this every day. (rcmd by reader)

https://www.kdnuggets.com/2021/11/easy-synthetic-data-python-faker.html – The Faker library, which generates synthetic data, was mentioned in DSB #25. Refresh your memory of how to use it with this simple article.

MLOps

https://www.kdnuggets.com/2021/11/simple-things-steal-agile-data-science-analytics.html – What is the most important thing that data science should take from agile? Peer review of code/solution.

https://towardsdatascience.com/complete-machine-learning-pipeline-for-nlp-tasks-f39f8b395c0d – End-to-end implementation of a pipeline for named entity recognition in emails.

https://eng.lyft.com/parameter-exploration-at-lyft-b9d2a1483c82 – Parameter exploration and optimization with Bayesian methods and Gaussian process models.

Video & Podcast

https://www.imdb.com/title/tt15392100/ – Highly praised four-episode Netflix miniseries about how Google copied the idea of Google Earth. (rcmd by reader)

https://realpython.com/podcasts/rpp/81/ – What are the new features in Python 3.10? (rcmd by reader)

Papers & Books

https://github.com/eugeneyan/applied-ml – Curated list of ML papers focused on practical implementation. (rcmd by reader)

https://www.humblebundle.com/books/code-like-pro-manning-publications-books – “Code like a pro” is the name of a Humble Bundle of books about programming. And if you are a pure Pythonista, use this offer of O’Reilly Python books. Improve your skills beyond your wildest dreams! (rcmd by reader)

https://store.metasnake.com/effective-pandas-book?coupon=BF40 – Winter is coming, so it’s all about the books. Stay home, stay warm and read this one about Pandas. (rcmd by reader)

Behind the Fence

https://jobs.volvocars.com/job/Gothenburg-Data-Scientist/727855901/ – DS at Volvo, Gothenburg, Sweden. (rcmd by reader)

Joke

https://xkcd.com/722/
