DSB #108

Hi,

the week is over, so let’s jump into the weekend, but first read the DSB! What impressed me most were the pieces of amazing source code in Computer Science & Science, and the categorical variable encoding in Papers & Books.

As always, enjoy your reading.

Analytical

https://towardsdatascience.com/python-jupyter-notebooks-in-excel-5ab34fc6439 – Jupyter Notebooks running inside Excel, for real. You can call functions written in Python from Excel! (rcmd by reader)

https://eugeneyan.com/writing/real-time-recommendations/ – Forget batch recommendations, the real thing happens in real time!

https://blog.jupyter.org/jupyterlab-3-0-is-out-4f58385e25bb – JupyterLab 3.0 is here, read about new features! (rcmd by reader)

Computer Science & Science

https://medium.com/swlh/impressive-sources-codes-that-every-developer-should-see-b68028b36da5 – Impressive pieces of source code, such as Apollo 11 or Quake III Arena, and why they were so important.

https://www.youtube.com/watch?v=Fzf8QotUNCE&ab_channel=RobertMartin – Advent of Code in Clojure? No problem for Robert Martin, author of Clean Code. (rcmd by reader)

https://ai.facebook.com/blog/deep-learning-to-translate-between-programming-languages/ – Translate between programming languages with the help of deep learning. (rcmd by reader)

Graphs and Visualizations

https://lynxkite.com/ – I am into graph databases, and hence I like this graph data science platform. Unfortunately, it runs in Docker, so it’s a no-go zone. But you can always try the cloud version. (rcmd by reader)

https://blog.jetbrains.com/datalore/2020/12/17/we-downloaded-10-000-000-jupyter-notebooks-from-github-this-is-what-we-learned/ – Analysis of 10 000 000 Jupyter Notebooks from GitHub, mainly nice graphs and their interpretation. (rcmd by reader)

https://www.visualsource.net/repo/github.com/python/cpython – Visualization of the CPython repository and its evolution over time, beautiful! (rcmd by reader)

Business and Career

https://www.searchenginejournal.com/google-pay-google-plex/391328/amp/ – Welcome Google Plex: it manages your day-to-day finances, and banks should definitely take notice. (rcmd by reader)

https://marker.medium.com/the-bitcoin-dream-is-dead-8b621d2d7dbd – Some of you may be millionaires thanks to bitcoin, but it’s useless as a currency.

https://www.cnbc.com/2021/01/11/walmart-to-create-fintech-start-up-with-investment-firm-behind-robinhood.html – Everybody is into finance these days, even Walmart has a fintech start-up. And it’s an interesting symbiosis between a retail shop and retail finance.

Pop

https://blog.engora.com/2021/01/covid-and-soccer-home-team-advantage.html – Without fans, there is no home advantage in soccer. (rcmd by reader)

https://openai.com/blog/dall-e/ – Even mainstream media covered this topic. Thanks to GPT-3, AI can be creative and come up with its own designs of, for example, avocado chairs. (rcmd by reader)

https://www.idnes.cz/technet/internet/google-vypadek-post-mortem-vysvetleni.A201220_221500_sw_internet_pka – A nice article, written almost like a detective story, about how it took Google engineers 47 minutes to resolve an outage. Respect from me. (rcmd by reader)

Education

https://towardsdatascience.com/10-jupyter-lab-extensions-to-boost-your-productivity-4b3800b7ca2a – Boost your productivity in JupyterLab with these extensions. (rcmd by reader)

https://probml.github.io/pml-book/book1.html – Intro to probabilistic ML – a book from MIT Press.

https://www.analyticsvidhya.com/blog/2021/01/beginners-guide-to-standard-gui-library-in-python-tkinter/ – If you need a GUI built in Python, consider Tkinter.

Data & Libraries

https://dirty-cat.github.io/stable/ – dirty_cat does exactly what you would expect: it provides encoders that are robust to dirty categorical data! (rcmd by reader)

https://www.microsoft.com/en-us/research/blog/microsoft-deberta-surpasses-human-performance-on-the-superglue-benchmark/ – DeBERTa by Microsoft is state of the art in NLU (natural language understanding). It was built with 1.5 billion parameters (the code is available) and uses a two-vector approach. (rcmd by reader)

https://hector.dev/2020/12/29/validating-data-in-python-with-cerberus.html – How to validate data with Cerberus, for example with the help of regex.

Video & Podcast

https://www.youtube.com/watch?v=BsDeG3jQ61s – Encode your categorical variables – see the first article in Papers & Books. (rcmd by reader)

https://open.spotify.com/episode/7f8pAAA6EGjvREtxNeYJfB?si=rx57p4w2RAaYhF7ecm5k3w – Tomáš Mikolov on NLP, working at Google and FB, current developments in AI, etc. (rcmd by reader)

https://open.spotify.com/episode/1AVBn7AIlxiGdgQcRjhdlI?si=Sot6aG9ETMicUUNNRddSJw – SuperDataScience on new trends in data science in 2021. (rcmd by reader)

Papers & Books

https://arxiv.org/pdf/1907.01860.pdf – How to encode categorical variables when they have many categories, too many to be handled by one-hot encoding. (rcmd by reader)

https://www.bis.org/publ/work917.pdf – Sentiment in Chinese media and its effect on the stock market. (rcmd by reader)

https://www.bis.org/publ/bisbull36.htm – How e-commerce is doing in the pandemic. (rcmd by reader)

Behind the Fence

https://ai-jobs.net/job/4615-lead-data-engineer/ – Lead Data Engineer in Kettle, California, USA.

Joke

https://www.monkeyuser.com/2021/task-story-epic-quest/ – Epic! 😀
