Merry Christmas, and enjoy it as much as you enjoy DSB! In this volume I would definitely recommend to everybody the article about bank transfers as a payment method from Business and Career, or the news about the death of a famous crypto trader from Pop.
And as always, enjoy your reading.
https://www.pytorchlightning.ai/ – DSB #102 mentioned a core maintainer of PyTorch Lightning; now have a look at the framework itself, because it has evolved a lot. Intro to it here. (rcmd by reader)
https://openai.com/blog/improving-factual-accuracy/ – The article shows how much work still needs to be done to achieve a language model with general understanding, let alone a more general-purpose AI. Still, the web-browsing model explained here is an interesting option.
http://fa.bianp.net/blog/2021/exponential-sgd/ – Proof that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.
Computer Science & Science
https://requestmetrics.com/web-performance/http3-is-fast – HTTP/3 is the latest version of the Hypertext Transfer Protocol, and it’s much faster than its predecessors. It might change the future of websites.
https://python.plainenglish.io/who-writes-better-code-github-copilot-or-gpt-3-9e7441650c9b – The battle between GitHub Copilot and GPT-3 in code writing. (rcmd by reader)
Graphs and Visualizations
https://journals.sagepub.com/stoken/default+domain/10.1177%2F15291006211051956-FREE/full – Good visualizations are not easy to create. Read this paper and learn how to create understandable visualizations.
https://pandastutor.com/ – This visualization tool will show you how your data are transformed. (rcmd by reader)
Business and Career
https://bam.kalzumeus.com/archive/bank-transfers-as-a-payment-method/ – While money transfers are bank-to-bank, payments are not. But this will change. Why and how? (rcmd by reader)
https://scientistemily.substack.com/p/save-data-science – Who is a data scientist? Does this title still matter? And are the expectations of companies reasonable?
https://kevlinhenney.medium.com/agility-speed-96057078fe40 – Why (software) development is not about sprinting, but rather about endurance. (rcmd by reader)
https://www.bbc.com/news/technology-59432659 – Mr Goxx is dead. It’s sad, but don’t mourn. This guy was a crypto-trading hamster who worked on his wooden PC and was quite successful! (rcmd by reader)
https://www.analyticsvidhya.com/blog/2021/12/a-review-of-2021-and-trends-in-2022-a-technical-overview-of-the-data-industry/ – One of the first articles that summarizes the year 2021 in data science and predicts the trends for 2022.
https://www.kdnuggets.com/2021/12/developments-predictions-ai-machine-learning-data-science-research.html – Multiple experts from several companies give their opinions on the main developments in data science research in 2021 and the key trends for 2022. This part talks about technologies and this part generally about industry.
https://www.kdnuggets.com/2021/12/introduction-clustering-python-pycaret.html – Clustering in Python with PyCaret.
https://github.com/Machine-Learning-Tokyo/Interactive_Tools – List of interactive and visually attractive tools to learn about ML.
https://schemaverse.com/ – A strategy game implemented in PostgreSQL that is fully controlled by SQL commands. (rcmd by reader)
Data & Libraries
https://ai.googleblog.com/2021/12/training-machine-learning-models-more.html – Dataset distillation: a large dataset is distilled into a smaller, synthetic dataset. (rcmd by reader)
https://www.assemblyai.com/blog/pytorch-vs-tensorflow-in-2022/ – Extensive comparison of PyTorch and TensorFlow.
https://simonwillison.net/2021/Dec/7/git-history/ – Impressive: git-history is a CLI tool that reads through the entire history of a file and creates a final dataset.
Video & Podcast
https://www.wsj.com/video/series/inside-tiktoks-highly-secretive-algorithm/investigation-how-tiktok-algorithm-figures-out-your-deepest-desires – TikTok probably has one of the best recommendation algorithms (and a billion users). Learn in 15 minutes how it works.
https://www.superdatascience.com/podcast/data-science-at-the-command-line – SuperDataScience podcast with Jeroen Janssens, author of the book on using the command line in data science. (rcmd by reader)
https://youtu.be/WU0gvPcc3jQ – Impressive demonstration of Unreal Engine 5 power called: The Matrix Awakens.
Papers & Books
https://datascienceatthecommandline.com/2e/ – The previously mentioned book on using the command line in data science. (rcmd by reader)
https://www.distributional-rl.org/ – Book about Distributional Reinforcement Learning. A draft is freely available and soon there should be a print version by MIT Press.
Behind the Fence
https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-Data-Engineer–Speech_JR1949422 – Senior Data Engineer at NVIDIA, USA.