Research Paper Reviews
Title:
About the science of AI.
Publish Date:
Related paper on the Winograd Schema Challenge from Levesque et al. (2012)
The paper represents a paradigm shift in the evaluation of natural language processing systems; it was proposed by researchers at the University of Toronto back in 2012/2014. It underpins the philosophical development of today's chatbots, a.k.a. LLMs: AI systems that are aware of the world around them and of the conventions and notions that drive our lives forward.
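To make the challenge format concrete, here is a minimal sketch of a Winograd-style evaluation item: a pair of sentences differing by a single word, where a system must say which candidate an ambiguous pronoun refers to. The tiny dataset and random baseline below are illustrative only, not the paper's actual schema collection or scoring code.

```python
# A minimal illustration of the Winograd-schema evaluation format: each item is
# a sentence pair differing by one "special" word, and the system must decide
# which candidate the ambiguous pronoun refers to. Illustrative data only.
import random

schemas = [
    {
        "sentence": "The trophy doesn't fit in the brown suitcase because it is too big.",
        "pronoun": "it",
        "candidates": ["the trophy", "the suitcase"],
        "answer": "the trophy",
    },
    {
        "sentence": "The trophy doesn't fit in the brown suitcase because it is too small.",
        "pronoun": "it",
        "candidates": ["the trophy", "the suitcase"],
        "answer": "the suitcase",
    },
]

def guess(schema) -> str:
    # Stand-in for a real system; a random guess sits at 50% accuracy,
    # which is why the challenge is hard to game with surface tricks.
    return random.choice(schema["candidates"])

correct = sum(guess(s) == s["answer"] for s in schemas)
print(f"accuracy: {correct}/{len(schemas)}")
```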
1. Key Question the Paper is Trying to Answer:
The paper seeks to address the inadequacies of current AI evaluation methods, particularly the Turing Test, by asking: How can we more accurately assess true machine intelligence, beyond mere mimicry of human behavior?
2. Different Approaches Taken So Far:
Turing Test: The classic method to evaluate AI by determining if it can imitate human conversation well enough to deceive a human judge. However, this method has been criticized for allowing AI to rely on tricks and superficial responses rather than demonstrating true understanding.
Loebner Competition: A practical implementation of the Turing Test, which often highlights the weaknesses of the test by showing how AI systems can use deception and superficial strategies to appear intelligent.
Recognizing Textual Entailment (RTE): Another approach focuses on understanding relationships between sentences but…
Tags:
NLP
Review:
Dennis Kuriakose
Date:
27 August 2024
Title:
A Comprehensive Overview of Large Language Models
Publish Date:
11 Apr 2024
The paper summarises significant findings on LLMs from the existing literature and provides a detailed analysis of the following:
Design aspects, including architectures, datasets, and training pipelines, together with the crucial architectural components and training strategies employed by different LLMs.
Performance differences of LLMs in zero-shot and few-shot settings (a minimal prompting sketch follows below).
The impact of fine-tuning.
Comparisons of supervised and generalized models, and of encoder vs. decoder vs. encoder-decoder architectures.
A comprehensive review of multi-modal LLMs, retrieval-augmented LLMs, and LLM-powered agents.
Efficient LLMs.
Datasets and evaluation metrics.
LLM applications.
Current challenges and future directions of research.
All of this is referenced against the relevant research papers, so the prior literature is directly accessible.
This paper is anticipated to serve as a valuable resource for researchers, offering insights into the recent advancements in LLMs and providing fundamental concepts and details to develop better LLMs.
The table below summarises all the key topics in a tree form.
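Since the review highlights performance differences between zero-shot and few-shot settings, here is a minimal sketch of how the two prompting styles differ. The example task, the prompts, and the `complete` stub are illustrative assumptions, not material from the paper.

```python
# Minimal sketch of zero-shot vs. few-shot prompting. The `complete` stub stands
# in for a call to any LLM; the example task and prompts are illustrative only.
def complete(prompt: str) -> str:
    # Placeholder for an LLM call (e.g. an API or a local model).
    return "<model completion>"

task = "Classify the sentiment of: 'The battery dies far too quickly.'"

# Zero-shot: the instruction alone, with no solved examples.
zero_shot = f"{task}\nSentiment:"

# Few-shot: a handful of solved demonstrations precede the query.
few_shot = (
    "Classify the sentiment of: 'Great screen and very fast.'\nSentiment: positive\n\n"
    "Classify the sentiment of: 'It stopped working after a week.'\nSentiment: negative\n\n"
    f"{task}\nSentiment:"
)

print(complete(zero_shot))
print(complete(few_shot))
```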
Tags:
LLM, NLP
Review:
Dennis Kuriakose
Date:
26 August 2024
Title:
Reasoning with Language Model is Planning with World Model
Publish Date:
21 Aug 2024
Insight:
Large language models (LLMs) have shown strong reasoning abilities, particularly when using techniques like Chain-of-Thought (CoT) prompting. However, they struggle with tasks that require strategic planning or predicting long-term outcomes, which are areas where humans excel due to an internal world model. This paper introduces a new framework called Reasoning via Planning (RAP) to overcome these limitations by combining LLMs with a world model and a planning algorithm to improve reasoning capabilities.
Proposed Approach:
The RAP framework integrates the LLM as both a reasoning agent and a world model. It uses Monte Carlo Tree Search (MCTS) for planning, allowing the model to simulate different reasoning paths and their outcomes. This approach balances exploration and exploitation to find the most effective reasoning path. The RAP framework is tested on various tasks, including plan generation, math reasoning, and logical inference, showing significant improvements over existing methods like CoT.
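A rough sketch of the planning loop described above: the LLM-as-agent proposes candidate reasoning steps, the LLM-as-world-model predicts the resulting states, and an MCTS loop with a UCT rule balances exploration and exploitation over reasoning paths. The `propose_steps`, `predict_next_state`, and `estimate_reward` stubs are hypothetical placeholders for LLM calls, not the paper's implementation.

```python
# Simplified sketch of Reasoning-via-Planning style search: an LLM-as-agent
# proposes reasoning steps, an LLM-as-world-model predicts outcomes, and an
# MCTS/UCT loop balances exploration and exploitation over reasoning paths.
import math, random

def propose_steps(state: str) -> list[str]:
    """LLM-as-agent (stub): propose candidate next reasoning steps."""
    return [state + ".a", state + ".b"]

def predict_next_state(state: str, step: str) -> str:
    """LLM-as-world-model (stub): predict the state reached by taking `step`."""
    return step

def estimate_reward(state: str) -> float:
    """Reward signal (stub), e.g. the LLM's confidence in the resulting state."""
    return random.random()

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def uct_score(node, c=1.4):
    """Upper confidence bound: trades off exploitation against exploration."""
    if node.visits == 0:
        return float("inf")
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def search(question: str, iterations: int = 50, max_depth: int = 4) -> str:
    root = Node(question)
    for _ in range(iterations):
        # 1. Selection: walk down the tree picking the highest-UCT child.
        node, depth = root, 0
        while node.children:
            node = max(node.children, key=uct_score)
            depth += 1
        # 2. Expansion: ask the agent for steps, the world model for outcomes.
        if depth < max_depth:
            for step in propose_steps(node.state):
                node.children.append(Node(predict_next_state(node.state, step), node))
            node = random.choice(node.children)
        # 3. Simulation: score the reached state.
        reward = estimate_reward(node.state)
        # 4. Back-propagation: update statistics along the selected path.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the state reached by the most-visited first step.
    return max(root.children, key=lambda n: n.visits).state

print(search("q"))
```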
Alternative Approaches Previously Explored:
Previous methods, such as CoT prompting, decompose complex questions into sequential steps but often induce errors, especially as the number of steps increases. Other approaches include self-consistency methods that select the best answer through majority voting and least-to-most prompting, which breaks down tasks into simpler subquestions. These methods, however,…
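For reference, the self-consistency idea mentioned above amounts to sampling several reasoning paths and taking a majority vote over their final answers; a minimal sketch follows, with a hypothetical `sample_answer` stub standing in for one sampled chain of thought.

```python
# Minimal sketch of self-consistency: sample several reasoning paths and take a
# majority vote over their final answers. `sample_answer` is a hypothetical stub.
import random
from collections import Counter

def sample_answer(question: str) -> str:
    # Placeholder: each call would sample one chain of thought and return its answer.
    return random.choice(["42", "42", "41"])  # noisy but mostly correct

def self_consistency(question: str, n: int = 15) -> str:
    votes = Counter(sample_answer(question) for _ in range(n))
    return votes.most_common(1)[0][0]  # majority-voted answer

print(self_consistency("What is 6 * 7?"))
```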
Tags:
LLM, Autonomous Agents, RL
Review:
Dennis Kuriakose
Date:
14 August 2024
Title:
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Publish Date:
21 Aug 2024
Insight: The paper introduces RT-2 (Robotics Transformer 2), a model that integrates vision-language models (VLMs) trained on internet-scale data directly into robotic control systems. This integration allows robots to benefit from the generalization and semantic reasoning capabilities of VLMs, improving their ability to perform tasks in real-world environments. The key innovation is the use of vision-language-action (VLA) models, which express robotic actions as text tokens, enabling the models to be trained using the same methodology as VLMs.
Training Approach: RT-2 is developed by co-fine-tuning large VLMs with both internet-scale vision-language tasks (e.g., visual question answering) and robotic trajectory data. The actions are tokenized into text and included in the training data alongside natural language, allowing the model to learn robotic policies directly. This approach leverages the extensive pretraining of VLMs, making the model capable of mapping robot observations to actions effectively.
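To illustrate the idea of expressing actions as text tokens, here is a minimal sketch in the spirit of the vision-language-action formulation. The action layout (translation and rotation deltas plus gripper), the value ranges, and the 256-bin discretization below are illustrative assumptions rather than RT-2's exact scheme.

```python
# Minimal sketch of action-as-text tokenization in the spirit of a VLA model.
# The action layout, value ranges, and 256-bin discretization are illustrative.
NUM_BINS = 256
LOW, HIGH = -1.0, 1.0  # assumed normalized range for each continuous dimension

def discretize(value: float) -> int:
    """Map a continuous value in [LOW, HIGH] to one of NUM_BINS integer bins."""
    value = max(LOW, min(HIGH, value))
    return int(round((value - LOW) / (HIGH - LOW) * (NUM_BINS - 1)))

def action_to_text(action: list[float], terminate: bool = False) -> str:
    """Encode a continuous action vector as a string of integer 'tokens'.

    The resulting string can be appended to the language-model training targets,
    so robot actions are learned with the same next-token objective as text.
    """
    tokens = [1 if terminate else 0] + [discretize(v) for v in action]
    return " ".join(str(t) for t in tokens)

# Example: a small end-effector motion with the gripper half closed.
print(action_to_text([0.05, -0.02, 0.10, 0.0, 0.0, 0.3, 0.5]))
# -> e.g. "0 134 125 140 128 128 166 191"
```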
Results: The RT-2 model demonstrated significant improvements in generalization, being able to handle novel objects and instructions that were not part of the robotic training data. It also exhibited emergent capabilities, such as performing multi-stage reasoning and interpreting semantic cues (e.g., placing objects based on icons or reasoning about the best object to use as a tool). Over 6,000…
Tags:
Robotics, VLM
Review:
Dennis Kuriakose
Date:
7 August 2024
Title:
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Publish Date:
3 Jul 2017
Insight:
The paper shows that a pure Transformer applied directly to sequences of image patches can match or exceed state-of-the-art convolutional networks on image classification, provided it is pre-trained on sufficiently large datasets. Because the Vision Transformer (ViT) has fewer image-specific inductive biases (such as locality and translation equivariance) than CNNs, it underperforms at small data scales but scales very well as the pre-training data grows.
Proposed Approach:
An image is split into fixed-size patches (e.g. 16x16 pixels); each patch is flattened, linearly projected into an embedding, combined with position embeddings, and prepended with a learnable classification token. The resulting token sequence is processed by a standard Transformer encoder, and a classification head reads out the class token. The model is pre-trained on large datasets such as ImageNet-21k or JFT-300M and then fine-tuned on downstream benchmarks, where it reaches or surpasses strong CNN baselines while requiring substantially fewer compute resources to pre-train.
Alternative Approaches Previously Explored:
Earlier work either combined CNN feature maps with self-attention or restricted attention to local neighbourhoods to cope with the quadratic cost of attending over all pixels, while large ResNet-based models remained the state of the art for large-scale image classification.
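To make the patch-tokenization step concrete, here is a minimal NumPy sketch of splitting an image into 16x16 patches and embedding them into the token sequence fed to the Transformer encoder. The 224x224 input size, the 768-dimensional embedding, and the random weights are illustrative stand-ins, not the trained model.

```python
# Minimal sketch of ViT-style patch embedding. The image size, embedding width,
# and random projection weights are illustrative, not trained parameters.
import numpy as np

def patchify(image: np.ndarray, patch: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0, "image must divide into patches"
    # (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C) -> (num_patches, p*p*C)
    x = image.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, patch * patch * C)

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))                 # dummy image
patches = patchify(img)                         # (196, 768): 14*14 patches of 16*16*3 values
W_embed = rng.normal(size=(768, 768)) * 0.02    # linear patch projection
cls_token = rng.normal(size=(1, 768)) * 0.02    # learnable [class] token
pos_embed = rng.normal(size=(197, 768)) * 0.02  # position embeddings

tokens = patches @ W_embed                       # embed each patch
tokens = np.concatenate([cls_token, tokens], 0)  # prepend the [class] token
tokens = tokens + pos_embed                      # add position information
print(tokens.shape)  # (197, 768): the sequence fed to the Transformer encoder
```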
Tags:
Transformer, Vision
Review:
Dennis Kuriakose
Date:
24 July 2024