Welcome to the final video of this week!
In this video, I’d like to share how language models
(LMs) are beginning to use tools and also discuss a cutting-edge topic: Agents, which involve
letting LMs decide for themselves what actions to take next. Let’s explore.
Example: Food Order Chatbot
Consider a food-order chatbot. If you say, "Send me a burger," the chatbot might respond with,
"Okay, it's on its way." However, to actually place the order and send it to you, the LM
needs to take action behind the scenes. Here’s what happens:
The LM outputs an internal response like:
"Order burger for user 9876 to be sent to this address."
It also generates the user-facing message:
"Okay, it's on its way."
An LM fine-tuned to output structured text like this can trigger a software application to
process the order. In this case, it would communicate with the restaurant’s ordering system to
deliver a burger to the specified user at their address.
Only the final line (“Okay, it’s on its way”) is displayed to the user. This is an example of tool
use by an LM, where the text output triggers actions like placing a restaurant order.
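As a rough illustration of how an application might wire this up, here is a minimal Python sketch that parses the LM's structured output and triggers a hypothetical place_order call. The two-line output format, the field names, and the place_order function are all assumptions for illustration, not part of any particular ordering system.

```python
import re

def place_order(user_id: str, item: str) -> None:
    """Hypothetical back-end call that would forward the order
    to the restaurant's ordering system."""
    print(f"Placing order: {item} for user {user_id}")

def handle_lm_output(lm_output: str) -> str:
    """Split the LM's structured output into an internal action line
    and a user-facing message (this two-line format is an assumption)."""
    internal_line, user_message = lm_output.strip().split("\n", 1)

    # Look for an order instruction such as
    # "Order burger for user 9876 to be sent to this address."
    match = re.match(r"Order (\w+) for user (\d+)", internal_line)
    if match:
        item, user_id = match.group(1), match.group(2)
        place_order(user_id, item)

    # Only the user-facing line is shown in the chat window.
    return user_message

reply = handle_lm_output(
    "Order burger for user 9876 to be sent to this address.\n"
    "Okay, it's on its way."
)
print(reply)  # -> Okay, it's on its way.
```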
Improving Reliability
Placing an incorrect order can be a costly mistake. To avoid this, a better user interface could
involve a verification dialog:
Before finalizing the order, the chatbot might display a prompt asking the user to
confirm:
“Is this order correct? Yes/No.”
This step allows the user to validate the action before the LM triggers it. For any safety-
critical or mission-critical actions, it’s essential to let the user confirm before the LM executes
potentially costly or erroneous tasks.
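As a minimal sketch of that confirmation step, reusing the hypothetical order format from above, the application could gate the back-end call on an explicit yes from the user:

```python
from typing import Callable

def confirm_and_place_order(user_id: str, item: str,
                            place_order: Callable[[str, str], None]) -> bool:
    """Ask the user to confirm before triggering the order; place_order
    stands in for the hypothetical back-end call sketched earlier."""
    answer = input(f"Order {item} for user {user_id}. Is this order correct? (yes/no) ")
    if answer.strip().lower() == "yes":
        place_order(user_id, item)
        return True
    print("Order cancelled.")
    return False
```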
Using Tools for Reasoning
LMs can also leverage tools for reasoning. For instance, if you prompt an LM with:
“How much would I have after 8 years if I deposit $100 in a bank account that pays 5%
interest?”
The LM might generate an answer like:
“You will have $147.40.”
While this response sounds plausible, the number is incorrect. LMs, even when instruction-
tuned, are not great at precise math. Instead of relying solely on the LM, a tool like a
calculator can be used to compute the correct result.
This mirrors how you or I would reach for a calculator to solve a similar problem. Rather than
having the LM output the answer directly, we can give it access to a calculator and have it
generate output like this:
"After compounding, you would have calculator: 100 × 1.05^8."
This output can be interpreted as a command to call an external calculator program, which
computes the correct answer, approximately $147.75. The calculated result can then be plugged
back into the text, providing the user with the correct figure.
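As a minimal sketch of that plumbing, the code below looks for a hypothetical "calculator:" marker in the LM's output, evaluates the expression, and splices the result back into the text. The marker syntax and the parsing are assumptions for illustration.

```python
import re

def run_calculator_tool(lm_output: str) -> str:
    """Evaluate a 'calculator: <expression>' call in the LM's output and
    splice the result back into the text (marker syntax is an assumption)."""
    match = re.search(r"calculator:\s*(.+)", lm_output)
    if not match:
        return lm_output  # no tool call found; return the text unchanged

    # Normalize the math notation into a Python expression.
    expression = match.group(1).rstrip(".").replace("×", "*").replace("^", "**")

    # eval with empty builtins is used purely for illustration;
    # a real system would use a proper expression parser.
    result = eval(expression, {"__builtins__": {}})

    # Replace the tool call with the computed value.
    return lm_output[:match.start()] + f"${result:.2f}."

print(run_calculator_tool(
    "After compounding, you would have calculator: 100 × 1.05^8."
))
# -> After compounding, you would have $147.75.
```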
By giving LMs the ability to call tools in their outputs, we can significantly extend their
reasoning or action-taking capabilities. Tool use is already an important part of many LM
applications. However, designers of these applications should carefully ensure that tools are
not triggered in ways that cause harm or irreversible damage.
Moving Beyond Tools: AI Agents
Going beyond tools, AI researchers are exploring agents, which extend LMs' capabilities
from triggering single actions to carrying out complex sequences of actions. This is an
exciting but experimental area at the cutting edge of AI research. While agents are not yet
mature enough for most critical applications, they hold tremendous potential.
For example, imagine you ask an agent, “Help me research Better Burger’s top competitors.”
The agent could use an LM as a reasoning engine to determine the steps needed to complete
the task:
1. Search for a list of top competitors.
2. Visit the websites of each competitor.
3. Write a summary based on the homepage content of each competitor.
To accomplish this, the agent might:
Trigger a web search tool with the query “Better Burger’s competitors.”
Visit the websites of the identified competitors and download their homepage content.
Use an LM to summarize the text found on these websites.
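To make those steps concrete, here is a heavily simplified Python sketch of such an agent flow. The llm, web_search, and fetch_homepage callables are stand-ins for real LM, search, and scraping APIs, and the overall wiring is an assumption for illustration, not a description of any particular agent framework.

```python
from typing import Callable

def research_competitors(company: str,
                         llm: Callable[[str], str],
                         web_search: Callable[[str], str],
                         fetch_homepage: Callable[[str], str]) -> str:
    """Simplified agent flow: the LM plans the steps, then each step
    is carried out with a tool (all callables are placeholders)."""
    # Step 1: use the LM as a reasoning engine to list competitors.
    plan = llm(f"List the top competitors of {company}, one per line.")
    competitors = [line.strip() for line in plan.splitlines() if line.strip()]

    summaries = []
    for name in competitors:
        # Step 2: search for and download each competitor's homepage.
        url = web_search(f"{name} official website")
        homepage_text = fetch_homepage(url)

        # Step 3: ask the LM to summarize the homepage content.
        summary = llm(f"Summarize this homepage in two sentences:\n{homepage_text}")
        summaries.append(f"{name}: {summary}")

    return "\n".join(summaries)
```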
While there have been impressive demonstrations of agents performing such tasks, the
technology is not yet ready for widespread use. However, as researchers improve its
capabilities, agents may become powerful tools for helping users carry out tasks in a safe and
responsible manner.
Conclusion
The future of AI could see LMs evolving into reasoning engines that not only decide on
sequences of actions but also execute them responsibly to assist users with their tasks.
Thank you, and congratulations on reaching the end of week two! With just one more week to
go in this course, we’ll next explore how generative AI is affecting companies. This includes
identifying generative AI use cases for businesses and examining its broader societal impact,
particularly on jobs. I look forward to seeing you next week!