Hi,
the summer is on fire, and so is the DSB! So let’s get reading! For Python users, the article about dictionaries in Computer Science & Science is a must-read. The article about software engineers’ salaries in Business and Career is also very important, especially for tech leaders. But there are definitely other really good ones as well.
Also, if you have any articles you would like to recommend, please use our web form here:
https://datasciencebulletin.com/recommend-article/
And as always, enjoy your reading.
Analytical
https://neo4j.com/developer-blog/using-neo4j-graph-data-science-in-python-to-improve-machine-learning-models/ – Use graph-based features to increase the accuracy of an ML model.
https://eugeneyan.com/writing/bandits/ – Use bandits from reinforcement learning in recommender systems to model uncertainty and exploration.
https://www.pinecone.io/learn/spotify-podcast-search/ – Spotify’s natural language search for podcasts is based on semantic search.
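To make the bandit idea above concrete, here is a minimal sketch of epsilon-greedy, one of the simplest bandit strategies. It is not taken from the linked article; the function name, the three click rates and the parameter values are all made up for illustration.

```python
import random


def epsilon_greedy(click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over items with hidden click rates.

    click_rates are hypothetical true probabilities that a user clicks
    each recommended item; the algorithm never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(click_rates)
    pulls = [0] * n_arms      # how often each item was recommended
    values = [0.0] * n_arms   # running mean reward per item

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: recommend a random item.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: recommend the item with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # Simulated user feedback: 1.0 for a click, 0.0 otherwise.
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0

        # Incremental update of the running mean.
        pulls[arm] += 1
        values[arm] += (reward - values[arm]) / pulls[arm]

    return pulls, values


pulls, values = epsilon_greedy([0.1, 0.5, 0.9])
```

With a small epsilon most traffic should end up on the best item, while the occasional random pick keeps the estimates for the other items from going stale.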
Computer Science & Science
https://roman.pt/posts/dont-let-dicts-spoil-your-code/ – Don’t overuse dictionaries in Python, or even better, use them only for values of the same type. For other situations, use dataclasses, Pydantic or TypedDict.
https://nolanlawson.com/2022/06/09/the-collapse-of-complex-software/ – The danger of complexity of solutions, ideas and plans. Sometimes less is better than more. I really recommend the discussion under the article.
https://gordonc.bearblog.dev/dry-most-over-rated-programming-principle/ – The DRY principle (don’t repeat yourself) is fundamental for many programmers, but use it wisely.
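As a rough illustration of the advice on dictionaries above, the sketch below converts a loosely typed dict into a dataclass at the boundary of the program. The `User` and `UserDict` types, their fields and `parse_user` are hypothetical names, not code from the linked article.

```python
from dataclasses import dataclass
from typing import Dict, TypedDict


@dataclass(frozen=True)
class User:
    """Hypothetical record type; the field names are illustrative."""
    name: str
    age: int


class UserDict(TypedDict):
    """The same shape as a TypedDict, for when a plain dict must be kept."""
    name: str
    age: int


def parse_user(raw: dict) -> User:
    # Convert at the boundary: a missing key fails here with a KeyError,
    # not somewhere deep inside the business logic.
    return User(name=str(raw["name"]), age=int(raw["age"]))


user = parse_user({"name": "Ada", "age": "36"})
typed: UserDict = {"name": user.name, "age": user.age}

# A mapping whose values all share one type is still a good fit for a dict.
word_counts: Dict[str, int] = {"graph": 3, "bandit": 1}
```

After the conversion, the rest of the code can rely on `user.name` and `user.age` existing, and a type checker can verify that.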
Graphs and Visualizations
https://levelup.gitconnected.com/how-to-do-amazing-twitter-network-analysis-in-r-2c258537dd7d – Twitter network of British politicians visualized in R. Interactive version is available here.
https://tech.marksblogg.com/pretty-maps-in-python.html – Do you need to create a really pretty map? Then try prettymaps! Convenient, isn’t it?
https://jott.live/html/mm_impl_anim – Everybody knows that vectorization is fast, but it’s not that simple, and this animation will explain it.
Business and Career
https://blog.pragmaticengineer.com/software-engineering-salaries-in-the-netherlands-and-europe/ – The salary distribution of software engineers is trimodal. The article thoroughly describes the reasons from multiple points of view, from that of a job candidate to that of a company, its competition and more. A must-read.
https://www.oreilly.com/radar/drivetrain-approach-data-products/ – Building a truly data-based product is not an easy task, so have a look at the four steps of the Drivetrain Approach. (rcmd by reader)
https://www.kitchensoap.com/2012/10/25/on-being-a-senior-engineer/ – What does it mean to be a senior engineer? What do you need to change to fit in this role?
Pop
https://theconversation.com/is-googles-lamda-conscious-a-philosophers-view-184987 – Is LaMDA by Google conscious? Whether it is or not, we will see similar questions more and more often.
https://www.scientificamerican.com/article/artificial-general-intelligence-is-not-as-imminent-as-you-might-think1/ – General AI is still far, far away. One of the reasons may be the unreliability of current AI solutions. Don’t fall for the hype.
https://www.dailymail.co.uk/news/article-11010077/Chinese-courts-allow-AI-make-rulings-charge-people-carry-punishments.html – We mentioned China’s totalitarian and dystopian use of AI in DSB #22, DSB #31, DSB #38, DSB #41, DSB #81 and DSB #124. Another chapter in this fascinating development is the smart court, where AI helps run courts. A judge must explain whenever he or she rejects the machine’s recommendation.
Education
https://guicommits.com/how-to-log-in-python-like-a-pro/ – Good logging isn’t as easy as it may seem. This article is precisely what you need to read if you are using only one log level and most of your logging code is just copied.
https://architecturenotes.co/things-you-should-know-about-databases/ – Simple but still a good intro to databases.
https://web.stanford.edu/class/cs25/ – Stanford class CS25: Transformers United.
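For the point about log levels made above, here is a small sketch using only the standard `logging` module. The logger name `dsb.example`, the messages and the helper `build_logger` are invented for illustration and do not come from the linked article.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Create a logger that writes 'LEVEL name: message' lines to stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False   # keep output out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()         # stands in for a file or stderr
log = build_logger("dsb.example", buffer)

log.debug("cache miss for key=%s", "user:42")    # dropped: below INFO
log.info("loaded %d rows", 1200)                 # normal progress
log.warning("retrying request, attempt %d", 2)   # unusual but recoverable

try:
    1 / 0
except ZeroDivisionError:
    # Logs at ERROR level and appends the traceback.
    log.exception("division failed")

output = buffer.getvalue()
```

Passing the arguments separately instead of formatting the string up front means the message is only built when the record is actually emitted.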
MLOps
https://red-engine.readthedocs.io/en/stable/index.html – Red Engine is a scheduling framework for Python. And at first sight it looks good!
https://www.analyticsvidhya.com/blog/2022/07/apache-kafka-architecture-and-use-cases-explained/ – Brief description of Apache Kafka.
https://www.featureform.com/post/feature-stores-explained-the-three-common-architectures – Feature stores and their three common architectures.
Video & Podcast
https://www.youtube.com/playlist?list=PLGJQS0h-wqLQqR5wdpL2tG68ArKjoIMpK – Sarah Polak and her series of short lectures on AI. (rcmd by reader)
https://youtu.be/lJKPiOf_o8k – Even though your model works on paper, it does not necessarily mean it is also solving the real problem. In the video Vincent D. Warmerdam will explain the rest and also show you a demo case with code.
https://youtu.be/axuGfh4UR9Q – Noam Chomsky is a legend. And in the AI world he is famous for his critique of AI. The video is basically a documentary, and at the end there is an interview with the man himself. Among other things, he criticizes GPT models, which according to him have achieved nothing.
Papers & Books
https://www.amazon.com/gp/product/B06XP3GJ7F/ref=ppx_yo_dt_b_d_asin_title_o00 – An incredible book on how to be a good tech lead, written by the amazing Camille Fournier. It completely changed my perception of my own career. I would recommend it to everyone, no matter whether you are a junior or a team leader.
https://hal.archives-ouvertes.fr/hal-03723551/document – On tabular data, tree-based models still outperform deep learning. Why is that?
https://arxiv.org/abs/2207.08822 – Is it possible to replace floating-point arithmetic with integer arithmetic? The authors put that idea to the test with a fully integer training pipeline. Really like the idea!
Behind the Fence
https://boards.greenhouse.io/redesignhealth/jobs/6083256002 – Senior Data Scientist at Redesign Health, remote work, USA.