
ChatGPT crossroad

There has been so much buzz, information, opinion, and prediction about ChatGPT since its launch (30 November 2022) that we cannot fit it into a standard DSB issue. This page (which will be updated from time to time) is our single collection of sources about ChatGPT, so the regular issues can focus on (as of now) more industry-usable techniques.

The model / Analytical insights

ChatGPT (sometimes called GPT-3.5) is a large language model built for the task of text generation (GPT is an abbreviation of Generative Pre-trained Transformer). The number is the model's version; a higher number means more parameters (and more time, energy, and resources needed for training). GPT-4 is probably in the making.

The best place to start is the official page with its high-level overview. The core innovation is a technique called "Reinforcement Learning from Human Feedback" (RLHF), which comes from InstructGPT and its well-written paper. The line of work probably started with the 2020 paper "Learning to summarize from human feedback", which explores the impact of human feedback on the quality of model output.

The last key paper behind ChatGPT is "Scaling Laws for Reward Model Overoptimization", which improves on the InstructGPT reward model used by the reinforcement learning methods involved.
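
To make the reward-model idea concrete, here is a minimal sketch (ours, not OpenAI's code) of the pairwise preference loss the InstructGPT paper describes: a model assigns a scalar reward to each response, and training pushes the human-preferred response to score higher than the rejected one. The toy `RewardModel` below is a bag-of-embeddings stand-in for a real transformer, and the data is random; only the loss is the real technique.

```python
import torch
import torch.nn as nn

# Toy reward model: scores a "response" with a single scalar.
# A real RLHF setup would use a pretrained transformer here.
class RewardModel(nn.Module):
    def __init__(self, vocab_size=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        pooled = self.embed(tokens).mean(dim=1)   # (batch, dim)
        return self.head(pooled).squeeze(-1)      # (batch,) scalar rewards

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Fake preference data: for each prompt, a human labeler ranked
# `chosen` above `rejected`.
chosen = torch.randint(0, 100, (16, 20))
rejected = torch.randint(0, 100, (16, 20))

for _ in range(100):
    r_chosen, r_rejected = model(chosen), model(rejected)
    # Pairwise (Bradley-Terry) loss from the InstructGPT paper:
    # maximize the log-sigmoid of the reward margin.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The trained reward model then serves as the optimization target for the reinforcement learning step (PPO in InstructGPT), which we leave out here.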

There are many more important papers and references; we picked the ones we consider the most important entry points.

Method description elsewhere

Reading papers can be cumbersome, so in this section we list decent-to-great articles that describe the methods used and give insights into (Chat)GPT.

https://www.surgehq.ai/blog/introduction-to-reinforcement-learning-with-human-feedback-rlhf-series-part-1 – a great introduction to what RLHF is and how it turns GPT-3 into ChatGPT. An approachable article full of examples.

https://huggingface.co/blog/rlhf – Hugging Face's overview of the methods behind ChatGPT. This one dives a level deeper than the SurgeHQ article, describing the model and methods in detail. The "Further reading" section is full of references extending our list with papers that have shaped ChatGPT and the RLHF area in general.

IOHO (in our humble opinion), implementing things and playing around with the model itself is the best teacher. ChatGPT is still waiting for an open-source implementation (some predict there will be many of them this year), but educational GPT implementations exist. The best come from Andrej Karpathy: minGPT (more for educational purposes) and nanoGPT (actually usable for real training). Go ahead and explore them; he does an incredible job of explaining ML. For a taste of what is inside, see the sketch below.
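
As a taste of the core building block those repositories implement, here is a minimal single-head causal self-attention layer, the heart of any GPT. This is our own illustrative sketch, not code from minGPT or nanoGPT; the class name and sizes are ours.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention: each token may only attend
    to itself and the tokens before it."""
    def __init__(self, dim=64, block_size=128):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        # Lower-triangular mask hides "future" positions.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):                                # x: (batch, seq_len, dim)
        B, T, C = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(C)   # (B, T, T) attention scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v                                   # weighted sum of values

x = torch.randn(2, 16, 64)
print(CausalSelfAttention()(x).shape)                    # torch.Size([2, 16, 64])
```

A full GPT stacks this layer (with multiple heads) with MLPs, layer norms, and residual connections; both of Karpathy's repos show exactly that, commented line by line.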

https://github.com/BlinkDL/ChatRWKV – the first attempt we stumbled upon that tries to mimic ChatGPT's capabilities with a different network architecture. Different architectures, efficient training, and efficient inference are going to be hot research areas this year.

Competitors

ChatGPT is far ahead in PR, but it is not the only player in generative models and LLMs. Here is a comparison of the architectures and parameter counts of Meta's PEER and Google's LaMDA and PaLM.

[Figure: comparison of competitor architectures and parameter counts]

LaMDA (Google): this one made some buzz around the net. A decent arXiv origin, the original blog post with a notable focus on responsible AI, and then the BBC article claiming LaMDA is sentient. This is also the LLM mentioned most often in the context of ChatGPT competition, but since it has not been presented publicly, we cannot know or even estimate how the two compare.

PaLM (Google): here Google threw in all the resources it could, and PaLM has roughly 3x more parameters than GPT-3 (540B vs. 175B). The original paper and the follow-up blog post. It is also the first time we noticed Google's Model Cards idea.

Galactica (Meta): Meta's public attempt to introduce a cool LLM. It went horribly wrong. The original paper; sadly, we could not find the original page about it on the Meta AI site.

PEER (Meta): the original arXiv paper; sadly, they did not share any write-up.

Opinions and News

We have yet to find an AI expert who has not shared an opinion about ChatGPT. We handpicked the interesting ones (December and January were basically a flood of articles, opinions, etc.) and put them in chronological order:

If you think we have missed an interesting article or source on ChatGPT, please write us a comment. Or use the standard recommendation form.
