DSB #146

Hi,

it’s finally time for a new volume of DSB. Give it a shot. A lot has happened since the last bulletin, but my favorite item is an older (but better) one: John Carmack’s reading list of papers that should help you become a data scientist.

I’m also sorry, but I couldn’t include some of your recommendations because the links were no longer working.

And as always, enjoy your reading.

Science

https://arc.net/folder/D0472A20-9C20-4D3F-B145-D2865C0A9FEE – This is presumed to be the list shared with John Carmack by Ilya Sutskever: “So I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’ And I did. I plowed through all those things and it all started sorting out in my head.” The original list is believed to be lost to Meta’s email retention policy (delete after 2 years).

https://github.com/google-research/timesfm#readme – TimesFM (Time Series Foundation Model) is a pretrained model developed by Google Research specifically for time-series forecasting. The model is designed to focus on point forecasts, does not support probabilistic forecasts, and requires a contiguous context for its predictions.

https://hamel.dev/blog/posts/prompt/ – Great article on the problems caused by LLM libraries (such as Guardrails and LangChain) that abstract away the prompts they use. Surprisingly, you can minimize the complexity and inefficiencies if you understand how these libraries work.

https://sympathetic.ink/2024/01/24/Chapter-1-The-birth-of-Parquet.html – A very approachable article series on Parquet and the evolution of big data formats by Julien Le Dem, a software engineer who started the Parquet project.

Pop

https://www.fintechbrainfood.com/p/visas-flexible-credential-gamechanger – Visa is introducing a new payment feature called Visa Flexible Credential. It lets users switch between debit, credit, buy now, pay later (BNPL), and reward points during a payment. It might also optimize payment routes for rewards and cost. Thanks to this solution, features commonly found in online wallets should become available in your bank app.

https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/ – Snowflake has entered the LLM wars with its new model, Arctic. The aim is to achieve high cost-effectiveness using a Dense-MoE Hybrid transformer architecture. Arctic is designed to handle tasks such as SQL generation, coding, and instruction following.

https://medium.com/blablacar/unexpected-tips-for-data-managers-c44a71db6594 – “Becoming a manager is not a promotion for an IC. It’s a different job.” The distinction matters because each role requires a different skill set and approach.

https://ntietz.com/blog/getting-buyin-is-different-from-getting-agreement/ – A principal engineer explains her strategies for motivating people to get things done. Agreement is not enough; you need to foster engagement and ownership.

Joke

https://www.reddit.com/r/ProgrammerHumor/comments/1cu7f29/pleasenonotanotherbaseclasshelper/#lightbox

 
