DSB #109

Hi,

I think it’s Friday again and DSB is on its way, a bit too late, I admit. Try the mini game in Graphs and Visualizations, or read the long article in Analytical on why we partly failed at modelling covid-19. But also learn more about data engineering: you’re going to need it in the near future, as the first article in Business and Career explains.

As always, enjoy your reading.

Analytical

https://www.quantamagazine.org/the-hard-lessons-of-modeling-the-coronavirus-pandemic-20210128/ – A long, like really long, read on why it’s been so difficult to model covid-19.

https://ai.googleblog.com/2020/12/privacy-considerations-in-large.html – Large generative language models like GPT-2 are vulnerable to several types of attacks, especially on the privacy of their training data: it’s possible to extract pieces of that data from the model. (rcmd by reader)

https://medium.com/anomalo-hq/unsupervised-data-monitoring-36cb2304c61e – How to monitor data quality at scale.

Computer Science & Science

https://www.theregister.com/2021/01/22/aws_elastic_fork/ – License battle between Elastic and AWS. AWS uses Elastic’s open-source software for free and Elastic wants money, so Elastic is changing its license and AWS is not happy. (rcmd by reader)

https://frostming.com/2021/01-22/introducing-pdm/ – Yeah, you should use a virtual environment for your projects (if you can), but thanks to PDM, Python Development Master, there is another option for managing packages. (rcmd by reader)

https://delitescere.medium.com/hotwire-html-over-the-wire-2c733487268c – Let me introduce Hotwire: you send HTML instead of JSON over the wire and get fast first-load pages.

Graphs and Visualizations

https://vole.wtf/coder-serial-killer-quiz/ – Who is a programming language inventor and who is a serial killer… try to tell them apart 😀 (rcmd by reader)

windowsontheory.org/2021/01/15/ml-theory-with-bad-drawings/ – Theory of ML expressed through bad drawings, but still an interesting read that summarizes many theoretical principles of ML.

https://observablehq.com/@twitter/density-plot-introduction – Review of density plots and their advantages.

Business and Career

https://www.mihaileric.com/posts/we-need-data-engineers-not-data-scientists/ – It’s true, we really need more data engineers or at least more data scientists with engineering skills. That’s the real bottleneck.

https://towardsdatascience.com/the-top-5-data-trends-for-cdos-to-watch-out-for-in-2021-e230817bcb16 – I don’t like lists of trends, most of the time they’re lazy, but this one looks reasonable: 5 data trends for CDOs in 2021.

https://www.macrumors.com/2021/01/28/facebook-preparing-antitrust-lawsuit-against-apple/ – Apple lets people disable tracking but asks about it only in third-party apps, and Facebook strongly disagrees.

Pop

marksaroufim.substack.com/p/machine-learning-the-great-stagnation – An honest and raw opinion on ML and why, at least in academic research, it is almost stagnating.

https://pitchfork.com/news/new-spotify-patent-involves-monitoring-users-speech-to-recommend-music/ – Spotify aims to use your mood, speech and background noise to recommend suitable music to you. Seems terrifying to me. (rcmd by reader)

https://koronahra.cz/game – Try out how successful you would be at handling the corona crisis in the Czech Republic. At the end you can compare yourself with other players. (rcmd by reader)

Education

https://www.kdnuggets.com/2021/01/working-lambda-layer-keras.html – How to work with Lambda layers in Keras when you, for example, want to adjust some dense layer.

https://www.kdnuggets.com/2021/01/graph-representation-learning-book-free-ebook.html – Free e-book Graph Representation Learning that teaches you about methods for embedding graph data, graph neural networks, and deep generative models of graphs. (rcmd by reader)

https://www.devopsonline.co.uk/google-cloud-to-launch-free-courses-in-cloud-ai-and-data-analytics/ – Google Cloud offers free online courses with focus on data analytics, AI, machine learning, and cloud services. (rcmd by reader)

Data & Libraries

https://github.com/ml-tooling/best-of-ml-python – Do you need a framework or library for ML in Python? You’ll find it here! A weekly updated list with anything you need. (rcmd by reader)

https://github.com/shobrook/sequitur/ – Sequitur provides autoencoders for sequential data and it’s built on PyTorch. (rcmd by reader)

Video & Podcast

https://www.youtube.com/watch?v=8yUPhRJtNJM – Did you play Pokemon Red and have you ever wondered who is the best NPC trainer in the game? The author of the video took all the trainers and let them battle each other, then measured their results with Elo ratings and sorted them into tiers based on kernel density estimation. Funny and truly interesting data science in practice. (rcmd by reader)

https://www.youtube.com/playlist?list=PLx8omXiw3n9y26FKZLV5ScyS52D_c29QN – Architecture of ML systems.

https://www.youtube.com/watch?reload=9&v=iAR8LkkMMIM&feature=youtu.be – Video about Switch Transformers by Google Brain. Some call it the next step after GPT-3.
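
If you want to play with the rating idea from the Pokemon Red video yourself, here is a minimal sketch of the standard Elo update in Python. The function names, the K-factor of 32 and the starting rating of 1500 are my own assumptions for illustration, not values taken from the video.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return the new ratings of A and B after one battle.

    score_a is 1.0 if A won, 0.0 if A lost and 0.5 for a draw.
    k (the K-factor) is an assumed value, not one from the video.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


# Two hypothetical trainers start at 1500 and trainer A wins the battle.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)
```

Run every pair of trainers through `update_elo` many times and the ratings settle into an ordering; the tiers in the video then come from looking at where the ratings cluster.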

Papers & Books

http://www.cs.toronto.edu/~tl/papers/LuBoutilier_ICML14workshop.pdf – Old but gold? Who knows. But papers about lead management seem rare so give it a chance. (rcmd by reader)

https://www.microsoft.com/en-us/research/publication/vinvl-making-visual-representations-matter-in-vision-language-models/ – Vision-language tasks are another interesting area in deep learning. Microsoft improved its model called Oscar by adding visual features, as you can read in the paper. Code for Oscar is here. (rcmd by reader)

https://synerise.com/papers/efficient-manifold-density-estimator – Embedding with manifold density estimator for recommendation systems by Synerise. (rcmd by reader)

Behind the Fence

https://jobs.apple.com/en-us/details/200191352/data-scientist-apple-pay-analytics-nyc – Data Scientist at Apple, New York City, USA.

Joke

https://devhumor.com/content/uploads/images/January2021/patience.png
