DSB #132


Hopefully you have lived your weekend fully and you can finish (or start) the day with DSB! And what would I recommend in this volume? Go to Computer Science & Science and read the first two articles. The first one, about queues, is a must-read for every manager, and the second one is for everybody who writes more than 10 lines of code.

And as always, enjoy your reading.

Analytical – If you are into NLP, read this very long article about embeddings. It describes multiple methods applied to real data. – Who won ML competitions on Kaggle, AIcrowd and other platforms last year? Which languages and packages did they use? (Not R or TensorFlow…) – GFlowNets are the new hot topic in the data science world. Learn about them in this blog post by Yoshua Bengio. A tutorial can be found here.

Computer Science & Science – Impressive short article about the unintuitive behavior of queues. It’s not only about CPUs; it can also be applied to the utilization of teams. And it explains a lot. You really should not go anywhere near 100% utilization if you want to be effective. – It reads a string and writes out a string, and yet it’s extremely complex and difficult: an automated code formatter for Dart. An old but gold article about computer science, complexity and the use of algorithms. – Do you need to optimize your DL model? Then you need to understand what is happening behind the scenes.
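
To get a feeling for the queue effect, here is a minimal sketch. It assumes the textbook M/M/1 model (one server, random arrivals, random service times), which may not be the exact model used in the linked article; the function name `mm1_mean_wait` is made up for this illustration. In that model the mean time a job waits before service starts is the mean service time multiplied by ρ / (1 − ρ), where ρ is the utilization.

```python
def mm1_mean_wait(utilization: float, service_time: float = 1.0) -> float:
    """Mean time spent waiting in the queue (before service) in an M/M/1 queue.

    Assumption: textbook M/M/1 formula, Wq = service_time * rho / (1 - rho).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time * utilization / (1 - utilization)


# Print how the wait grows as the server (or team) gets busier.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    wait = mm1_mean_wait(rho)
    print(f"{rho:.0%} utilization: wait is about {wait:.1f}x the service time")
```

Under this assumption the wait equals one service time at 50% utilization, roughly nine service times at 90%, and roughly ninety-nine at 99%, which is why running a CPU or a team close to 100% hurts so much.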

Graphs and Visualizations – In ML papers, PyTorch seems to be murdering TensorFlow, at least according to these three graphs. The methodology is explained here. – A good dashboard should answer the business question with 2 or 3 main graphs and should reflect user feedback. Build good dashboards! – What are data salaries in the US and Europe? And how do they differ by seniority?

Business and Career – Building a data science organization is not about fancy and empty proclamations; you need to change everything. The article presents several models for integrating a data science team into the company. – If DeFi is supposed to be the future, one should understand it. This is the third article in the series about DeFi, and it covers the technology behind decentralised exchanges. – Which companies drive their industries forward thanks to data science? Look at this really interesting list. (rcmd by reader)

Pop – In DSB #112, the AI Index report by Stanford University was mentioned. A year later, the 2022 (fifth) edition has been released. The most interesting chapters for us are probably recommendation systems and NLP. – The New York Times is planning to invest more in its already amazing data-driven journalism. Unfortunately, most of the articles are behind the paywall, which is why they are mentioned so sparsely in DSB. – Netflix will recognize whether you share your account and make you pay more.

Education – Introduction to deep belief networks (DBN). – Comprehensive tutorial on class constructors in Python. – Building a hash table in Python with test-driven development. You will also learn how Python’s hash function works.
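
If you want a taste before opening the hash table tutorial, here is a minimal sketch. It is a toy version that assumes separate chaining (each bucket is a small list of key-value pairs), which may differ from the approach taken in the linked tutorial; the class name `HashTable` and its `capacity` parameter are made up for this illustration. The built-in `hash()` decides which bucket a key lands in.

```python
class HashTable:
    """Toy hash table using separate chaining (illustrative only)."""

    def __init__(self, capacity: int = 8) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # One list per bucket; colliding keys share a bucket.
        self._buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # hash() returns an integer; modulo maps it to a bucket index.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def __len__(self) -> int:
        return sum(len(bucket) for bucket in self._buckets)


# In the spirit of test-driven development: small checks written as asserts.
table = HashTable(capacity=2)  # tiny capacity forces collisions
table["apple"] = 1
table["pear"] = 2
table["apple"] = 10
assert table["apple"] == 10
assert len(table) == 2
```

The tiny capacity in the usage example is deliberate: with only two buckets, keys must collide, so the checks exercise the chaining logic and not just the happy path.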

Data & Libraries – Data architecture cannot stay the same for years. It is changing rapidly. Read about these changes and have a look at modern patterns. – In DSB #124, centralized data lakes were buried in the ground because of the data mesh architecture. The link will give you a very nice and understandable explanation from an engineering perspective. – What types of labels are there and where can you get them?

MLOps – Forget multiple data science roles and become an end-to-end data scientist, because then you can deliver value like those at Stitch Fix or Netflix. – When is a data science project done? How do you measure whether it is finished? The answer is a projection completion matrix and the 0/1/Done strategy.

Papers & Books – Wow, a GNN is able to discover orbital mechanics without knowing the actual parameters! The paper is available here. – List of 7 papers in computer vision with links to GitHub. (rcmd by reader) – Intro books about data science, probably well known to most of you, but still a good list.

Behind the Fence – Senior ML engineer at eBay, New York, USA.


