It’s Friday, and this time that means bulletin! Of all the articles, I would recommend the GitHub repository with answers to data science interview questions from the Education category, or the lovely rant about Google Cloud from the Computer Science & Science category.
I would also like to welcome our new readers. A few things, and not only for them:
- you can send me an interesting link, I will consider it, and possibly include it in the next DSB
- you can send me your personal email address and I will send the DSB there, too (in BCC)
- you can ask me to add some other people to the email list, but with their permission please (ask them first)
- DSB is usually sent every two weeks, but there are and will be exceptions
- DSB does not (and cannot) serve to share any internal links or propaganda, since the recipients include many former colleagues and even strangers
Thank you all!
And as always, enjoy your reading.
https://data-apis.org/blog/announcing_the_consortium/ – Data science frameworks are in chaos: there are too many of them and they are too fragmented. The Consortium for Python Data API Standards aims to change that. Read how. (rcmd by reader)
https://www.jeremyjordan.me/testing-ml/ – How to test machine learning systems with more than unit tests.
https://www.cell.com/heliyon/fulltext/S2405-8440(20)31261-5 – A quantitative analysis of a large dataset of US congressional speeches made over a period of 138 years.
Computer Science & Science
https://firstname.lastname@example.org/dear-google-cloud-your-deprecation-policy-is-killing-you-ee7525dc05dc – A pure rant against Google Cloud and its deprecation policy. (rcmd by reader)
https://powershellstation.com/2020/08/25/git-the-5-percent-that-i-always-use/ – The necessary minimum for every Git user. (rcmd by reader)
https://martinfowler.com/articles/is-quality-worth-cost.html – An excellent reflection on the trade-off between quickly written and well-written software. (rcmd by reader)
Graphs and Visualizations
https://github.com/vinayak-mehta/present – Do you think you don’t need terminal-based presentations from the Python library “present”, which uses Markdown syntax? Think again! (rcmd by reader)
https://graphics.reuters.com/LEBANON-SECURITY/BLAST/nmopalewrva/index.html – See how powerful the Beirut explosion was in this comprehensive visualization.
https://www.scientificamerican.com/article/the-language-of-science/ – Visualizations of the words used in the pages of Scientific American – from 1845 to 2020.
Business and Career
https://www.forbes.com/sites/googlecloud/2020/08/19/when-it-comes-to-cloud-migration-stop-playing-it-safe/#6b6458731ae1 – If you go cloud, do it properly and without excuses. (rcmd by reader)
https://www.forbes.com/sites/cherylwinokurmunk/2020/08/20/insurance-getting-a-fintech-facelift/#250b254020fc – Fintech is also affecting insurance companies.
https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/rethinking-ai-talent-strategy-as-automated-machine-learning-comes-of-age – How will AutoML change the future of data scientists? And here is a related article on why you should be a more end-to-end DS.
https://old.reddit.com/r/MachineLearning/comments/ifn7ua/d_what_are_the_untold_truths_of_being_a_machine/ – A very down-to-earth Reddit thread about the untold truths of data science.
https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/ – GPT-3 is impressive, but it lacks an understanding of reality.
https://datasociety.net/wp-content/uploads/2017/08/DataAndSociety_LexiconofLies.pdf – This lexicon will help you recognize the terms and concepts used for false information.
https://github.com/alexeygrigorev/data-science-interviews – This is an amazing link! A GitHub repository with answers to many (many) technical and theoretical questions from data science interviews. (rcmd by reader)
https://towardsdatascience.com/activation-functions-in-deep-learning-from-softmax-to-sparsemax-math-proof-50c1eb293456 – Brush up on activation functions.
Data & Data Mining
https://edorado93.github.io/2018/09/17/Fun-with-array-rotations-add4a335d79a/ – Array rotation in Python.
https://protect-public.hhs.gov/datasets/state-representative-estimates-for-hospital-utilization/data – Data on state representative estimates for hospital utilization in the USA.
Video & Podcast
https://www.wpeebles.com/hessian-penalty – What is the Hessian Penalty?
https://ceskepodcasty.cz/podcasty/digitalni-banka-budoucnosti/ – The latest episode of Digitální banka budoucnosti (Digital Bank of the Future) discusses the possibilities and opportunities associated with bank identity. (rcmd by reader)
Papers & Books
https://www.pnas.org/content/early/2020/08/13/1907370117 – How to handle data shortage in reinforcement learning.
https://pubs.aeaweb.org/doi/pdfplus/10.1257/jep.33.2.51 – What were the effects of automation in the 19th century?
Behind the Fence
https://www.paycomonline.net/v4/ats/web.php/jobs/ViewJobDetails?job=54510&clientkey=D25120971391831BA4315C705AA7ABF1 – A Data Scientist position at GNY Mutual Insurance Company, New York, USA.
https://i.redd.it/kpnhcf413jj51.png – From real life…
https://github.com/JBall1/tomato-ID- – And as a bonus, a very practical model!