DSB #104

Hi,

A gloomy Friday (at least in Czechia) will be saved by the bulletin! In this issue I really liked the first article in Graphs and Visualizations, about epidemics and everything around them. Also, a reminder: you can send me interesting links and I will consider them. And if you like, send me your personal email address and I will add you to the hidden mailing list.

As always, enjoy your reading.

Analytical

https://www.saturncloud.io/s/random-forest-on-gpus-2000x-faster-than-apache-spark/ – Let’s train a random forest on GPUs with RAPIDS and Dask.

https://bair.berkeley.edu/blog/2020/10/13/supervised-rl/ – This article is everywhere; it is about using supervised learning in reinforcement learning (RL). (rcmd by reader)

http://d2l.ai/chapter_multilayer-perceptrons/environment.html – What is distribution shift, and how do you handle it? And here you can find an intro to probability distributions.

Computer Science & Science

https://www.quantamagazine.org/computer-scientists-break-traveling-salesperson-record-20201008/ – The notorious traveling salesperson optimization problem is mentioned in almost every text about algorithms. Now the best known approximation for it has been improved by 0.2 billionth of a trillionth of a trillionth of a percent, and it’s a major breakthrough.

https://medium.com/@d3lm/understand-tensorflow-by-mimicking-its-api-from-scratch-faa55787170d – Old but gold: an article from January 2019 that guides you step by step through building a TensorFlow-like API, all the way from scratch.

https://nedbatchelder.com/blog/202010/ordered_dict_surprises.html – Since Python 3.6 (guaranteed by the language since 3.7), dictionaries (more about them here) are ordered, yet getting the Nth element is still complicated. Still, OrderedDict is probably a better choice, because explicit is better than implicit (PEP 20). I also recommend the comments; you will learn a thing or two about big O notation. (rcmd by reader)
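
To make the Nth-element point concrete, here is a minimal sketch. The names nth_key and prices are made up for illustration and do not come from the linked article. It shows that a dict has no positional indexing, so reaching the Nth key means walking the keys, and that OrderedDict compares order while a plain dict does not.

```python
from collections import OrderedDict
from itertools import islice


def nth_key(mapping, n):
    """Return the n-th key (0-based) in insertion order.

    Hypothetical helper: dicts cannot be indexed by position,
    so this iterates over the keys and takes O(n) time.
    """
    if n < 0:
        raise IndexError("n must be non-negative")
    try:
        return next(islice(mapping, n, None))
    except StopIteration:
        raise IndexError("mapping has fewer than n + 1 keys") from None


prices = {"apple": 3, "banana": 1, "cherry": 7}

third = nth_key(prices, 2)        # walks the keys one by one
as_list_third = list(prices)[2]   # same key, but copies all keys first

# Plain dicts ignore insertion order when compared; OrderedDicts do not.
same_plain = {"a": 1, "b": 2} == {"b": 2, "a": 1}
same_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

Both lookups cost linear time in the position, which is probably why the big O discussion in the comments is worth a read.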

Graphs and Visualizations

https://filiph.github.io/covid-19/ – An amazing interactive article with many visualizations that explains how an epidemic works, all the necessary metrics, and multiple scenarios. It is available in multiple languages: Czech, English, etc. (rcmd by reader)

https://mikecroucher.github.io/reproducible_ML/ – A presentation about the principles of proper research and typical mistakes, with an emphasis on software. (rcmd by reader)

https://gitlab.com/michalskop/volby-2020/-/blob/master/README.md – Lots of graphs and visualizations for the regional elections that have just taken place, including analytics. Also go through the article that the modeling was based on. (rcmd by reader)

Business and Career

https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-challengers-in-the-next-normal-in-central-and-eastern-europe – What is the state of the economy in Central and Eastern Europe, and what about digitization and COVID-19? Here you can find a profile of the Czech Republic. (rcmd by reader)

https://about.gitlab.com/company/culture/all-remote/phases-of-remote-adaptation/ – An interesting view on remote collaboration, how it evolves, and how to handle it effectively, of course with the help of GitLab – it is written by them 🙂 (rcmd by reader)

https://www.mckinsey.com/business-functions/organization/our-insights/how-to-be-great-at-people-analytics – Learn about the key ingredients of HR analytics.

Pop

https://blogs.microsoft.com/ai/shrinking-the-data-desert/ – If you want to help people with disabilities via AI, first you need data and second you need those people to help you with development.

https://www.wired.com/story/ai-is-throwing-battery-development-into-overdrive/ – How has ML changed battery development? What can we expect?

https://medium.com/@colin.fraser/target-didnt-figure-out-a-teen-girl-was-pregnant-before-her-father-did-a6be13b973a5 – The hilarious story about an algorithm figuring out that a teen girl was pregnant was probably – big surprise – nonsense.

Education

https://towardsdatascience.com/can-a-neural-network-train-other-networks-cf371be516c6 – Just a shallow intro to knowledge distillation, interesting though.

https://joel.net/how-one-guy-ruined-hacktoberfest2020-drama – A short and funny story about how some online tutorials can get out of hand. (rcmd by reader)

https://towardsdatascience.com/implementing-recurrent-neural-network-using-numpy-c359a0a68a67 – Recurrent neural networks with NumPy.

Data & Libraries

https://github.com/HlidacStatu/UZIS-COVID19-modelovani-predikci – Data from ÚZIS, provided by Hlídač státu (mentioned last time), for modeling COVID-19 predictions. (rcmd by reader)

https://github.com/rednafi/konfik – If you are using configs for your models and apps, then this library is for you – just go and try konfik! (rcmd by reader)

https://www.vice.com/en/article/pky7km/usenet-archive-utzoo-online – Usenet posts from the 1980s have been uploaded to Usenet Archives (yes, that was the pre-internet era, the era of BBSes).

Video & Podcast

https://www.youtube.com/watch?v=_gQ202CFKzA&feature=emb_logo – DLSS (deep learning super sampling) is a technology created by Nvidia, and currently it is probably the only way to play in 8K. But not only that: you can render your games in 360p and upscale them to 1080p, as you can see in the video. You will get a lot of fps.

https://mlsys.stanford.edu/ – A seminar series by Stanford about ML systems, their limits, and how ML changes the modern programming stack.

https://realpython.com/podcasts/rpp/30/ – Podcast about new features of Python 3.9.

Papers & Books

https://arxiv.org/abs/2010.05767 – An amazing paper about RL. The authors were able to train an agent with a significantly smaller model on purely simulated experiences. New approaches, great math, clear descriptions of parameters, and also an explanation of dead ends. What more could you want? (rcmd by reader)

https://medium.com/paperswithcode/papers-with-code-partners-with-arxiv-ecc362883167 – Perfect! Papers with Code partners with arXiv and now it will be much easier to reproduce the articles you are reading. (rcmd by reader)

https://hdsr.mitpress.mit.edu/pub/d9j96ne4/release/2 – What challenges await data science research?

Behind the Fence

https://ai-jobs.net/job/3475-data-scientist/ – Data Scientist at NeuroFlow, Philadelphia, USA.

Joke

https://davidbuckley.ca/post/stacksort/ – You don’t want this sorting algorithm 😀

https://www.reddit.com/r/ProgrammerHumor/comments/jbw0hz/marriage_logic_map/ – Wait for it…