Only a week has passed since OpenAI DevDay 2023, and a shocking revelation unfolded: the Assistants API. Unveiled in the latest OpenAI blog post, the Assistants API marks a big stride in empowering developers to craft agent-like experiences within their applications.
In their own words:
Today, we're releasing the Assistants API, our first step towards helping developers build agent-like experiences within their own applications. An assistant is a purpose-built AI that has specific instructions, leverages extra knowledge, and can call models and tools to perform tasks.
So, what does this mean for us? In essence, it means we now have the ability to build our own AI assistants using the OpenAI API. This empowers us to harness the power of the GPT-4 model to perform tasks and answer questions with extra knowledge and context that we provide through uploaded documents (without needing third-party involvement!), a Python code interpreter running in a sandboxed environment, and a feature that really caught my attention: function calling.
While functions themselves aren't novel, it's the way they have been implemented that really stands out. In the past, calling a function meant uncertainty about what the model would return, necessitating post-processing for the desired result, and even then, success wasn't guaranteed. Now, we can specify the desired function output, and the model will attempt to provide a response aligning with the supplied schema.
This newfound tool grants us immense flexibility and power, enabling our assistant to perform virtually any task: from sending emails to making calls and querying databases. The possibilities are endless.
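To make the idea concrete, here is what such a tool definition could look like. This is a hypothetical `send_email` function of my own, described with a JSON schema in the shape the Assistants API expects; it is not part of the assistant we build below.

```typescript
// Hypothetical tool definition: the model is told it may "call" send_email,
// and is expected to produce arguments that match this JSON schema.
const sendEmailTool = {
  type: "function",
  function: {
    name: "send_email",
    description: "Sends an email to the given recipient.",
    parameters: {
      type: "object",
      properties: {
        to: { type: "string", description: "Recipient email address" },
        subject: { type: "string" },
        body: { type: "string" },
      },
      required: ["to", "subject", "body"],
    },
  },
};
```

Note that the model never sends the email itself; it only produces arguments matching the schema, and our own code performs the actual work, exactly as we'll do with the quiz function below.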
While the available docs and examples are still somewhat scarce, my curiosity led me to dive in and explore the potential. On this journey, I set out to build a simple math assistant, serving as a proof of concept for implementing function calls within the new Assistants API in Node.js.
For this example, we'll initially create a simple quiz to test function calling, and then we'll be able to keep the conversation going with the assistant, asking more questions while keeping the context of the conversation.
The implementation
For this example I've used the cookbook example provided by OpenAI and the command line method from this post by Ralf Elfving, and I've tweaked it a bit to make it more interactive.
We'll just need a few basic things:
- An OpenAI API key
- A Node.js environment
We'll just need the `openai` and `dotenv` packages to get started:
npm install openai dotenv
Declare your API key as an environment variable in your `.env` file:
OPENAI_API_KEY=your-api-key
Then, we can get started with the code:
// import the required dependencies
require('dotenv').config();
const OpenAI = require('openai');
const readline = require('readline').createInterface({
  input: process.stdin,
  output: process.stdout,
});

// Create an OpenAI connection
const secretKey = process.env.OPENAI_API_KEY;
const openai = new OpenAI({
  apiKey: secretKey,
});
We'll create a method using `readline` to wait for user input:
async function askRLineQuestion(question: string) {
  return new Promise<string>((resolve, _reject) => {
    readline.question(question, (answer: string) => {
      resolve(`${answer}\n`);
    });
  });
}
Now we'll create a `main` function to run our program. We'll start by creating the assistant. This assistant will require the `code_interpreter` and `function` tools, and we'll use the `gpt-4-1106-preview` model. I've experimented with the `gpt-3.5-turbo-1106` model, but it doesn't seem to work as well as the `gpt-4-1106-preview` model.
I've abstracted the function definition into a `quizJson` object that holds the JSON schema for the quiz questions and answers, just to make it easier to read.
const quizJson = {
  name: "display_quiz",
  description:
    "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
  parameters: {
    type: "object",
    properties: {
      title: { type: "string" },
      questions: {
        type: "array",
        description:
          "An array of questions, each with a title and potentially options (if multiple choice).",
        items: {
          type: "object",
          properties: {
            question_text: { type: "string" },
            question_type: {
              type: "string",
              enum: ["MULTIPLE_CHOICE", "FREE_RESPONSE"],
            },
            choices: { type: "array", items: { type: "string" } },
          },
          required: ["question_text"],
        },
      },
    },
    required: ["title", "questions"],
  },
};
async function main() {
  try {
    const assistant = await openai.beta.assistants.create({
      name: "Math Tutor",
      instructions:
        "You are a personal math tutor. Answer questions briefly, in a sentence or less.",
      tools: [
        { type: "code_interpreter" },
        {
          type: "function",
          function: quizJson,
        },
      ],
      // works much better with the newer model
      model: "gpt-4-1106-preview",
      // model: "gpt-3.5-turbo-1106",
    });

    // Log a first greeting
    console.log(
      "\nHello there, I'm Fernando's personal Math assistant. We'll start with a small quiz.\n",
    );
Once the assistant is created, we'll create a thread. This will maintain the state of our conversation, so we don't have to provide the full context every time we ask a question (remember that the model itself is stateless).
const thread = await openai.beta.threads.create();
To allow the application to run continuously, we'll make use of a `while` loop. This loop will assess user input after each question to determine whether the user intends to continue or not. We'll also have an `isQuizAnswered` variable to keep track of the quiz state.
// main method
// create the assistant and the thread as mentioned above
let continueConversation = true;
let isQuizAnswered = false;

while (continueConversation) {
  // logic
  // once done with the question-answer flow, check if the user wants to continue
  const continueAsking = await askRLineQuestion(
    "Do you want to keep having a conversation? (yes/no) ",
  );
  continueConversation = continueAsking.toLowerCase().includes("yes");
  // If the continueConversation state is falsy show an ending message
  if (!continueConversation) {
    console.log("Alrighty then, I hope you learned something!\n");
  }
}
The logic for the question-answer process will be as follows:
while (continueConversation) {
  // first ask the question and wait for the answer
  // we'll start with a quiz and then keep the conversation going
  const userQuestion = isQuizAnswered
    ? await askRLineQuestion("Your next question to the model: \n")
    : // this will make the model build a quiz using our provided function
      "Make a quiz with 2 questions: One open ended, one multiple choice. " +
      "Then, give me feedback for the responses.";

  // Pass the user question into the existing thread
  await openai.beta.threads.messages.create(thread.id, {
    role: "user",
    content: userQuestion,
  });

  // Use runs to wait for the assistant response and then retrieve it
  // Creating a run will indicate to an assistant that it should start looking
  // at the messages in a thread and take action by calling tools or the model.
  const run = await openai.beta.threads.runs.create(thread.id, {
    assistant_id: assistant.id,
  });

  // then retrieve the actual run
  let actualRun = await openai.beta.threads.runs.retrieve(
    // use the thread created earlier
    thread.id,
    run.id,
  );
What comes next is a polling strategy to wait for the model to finish processing the response. This is a bit of a hack, but it works for now: we'll wait for the run to reach a terminal status and then retrieve the response.
The expected cycle is as follows:
- Prior to the quiz: the model will return a `queued` and then an `in_progress` status while it's processing the response.
- Once the `tool_calls` are added for later use, the model will return a `requires_action` status. Here's where we'll actually execute the function.
- Once the function is executed, we'll submit the tool outputs to the run to continue the conversation.
- Finally, the model will return a `completed` status and we'll retrieve the response.
- If the user wants to continue the conversation, we'll repeat the process, but this time we'll skip the quiz and just ask the user for a question.
while (
  actualRun.status === "queued" ||
  actualRun.status === "in_progress" ||
  actualRun.status === "requires_action"
) {
  // requires_action means that the assistant is waiting for the functions to be executed
  if (actualRun.status === "requires_action") {
    // extract the single tool call
    const toolCall =
      actualRun.required_action?.submit_tool_outputs?.tool_calls[0];
    const name = toolCall?.function.name;
    const args = JSON.parse(toolCall?.function?.arguments || "{}");
    const questions = args.questions;
    const responses = await displayQuiz(name || "cool quiz", questions);
    // toggle the flag that marks the initial quiz as answered
    isQuizAnswered = true;
    // we must submit the tool outputs to the run to continue
    await openai.beta.threads.runs.submitToolOutputs(
      thread.id,
      run.id,
      {
        tool_outputs: [
          {
            tool_call_id: toolCall?.id,
            output: JSON.stringify(responses),
          },
        ],
      },
    );
  }
  // keep polling until the run is completed
  await new Promise((resolve) => setTimeout(resolve, 2000));
  actualRun = await openai.beta.threads.runs.retrieve(thread.id, run.id);
}
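As a side note, this polling loop could be factored into a small reusable helper. Here is a sketch under my own naming (`waitForRun` and its options are not part of the article or the SDK); the `retrieve` callback is injected so the helper stays independent of the API, and `requires_action` is returned to the caller, who still handles the tool calls:

```typescript
// Statuses a run can report while we poll it
type RunStatus = "queued" | "in_progress" | "requires_action" | "completed" | "failed";

interface RunLike {
  status: RunStatus;
}

// Poll `retrieve` until the run leaves the queued/in_progress states,
// then hand the run back to the caller (who deals with requires_action)
async function waitForRun(
  retrieve: () => Promise<RunLike>,
  { intervalMs = 2000, maxAttempts = 60 } = {},
): Promise<RunLike> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const run = await retrieve();
    if (run.status !== "queued" && run.status !== "in_progress") {
      return run;
    }
    // wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Run did not complete in time");
}
```

This keeps the main loop focused on the tool-call handling instead of the waiting mechanics.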
By this point we should have gotten a response from the model. We'll retrieve and display it (we'll see how below) and then ask the user if they want to continue the conversation:
// once the run is completed, display the response (retrieval shown below)
// then ask if the user wants to continue
const continueAsking = await askRLineQuestion(
  "Do you want to keep having a conversation? (yes/no) ",
);
continueConversation = continueAsking.toLowerCase().includes("yes");
// If the continueConversation state is falsy show an ending message
if (!continueConversation) {
  console.log("Alrighty then, I hope you learned something!\n");
}
}
To retrieve the model's response, we'll list the messages in the thread and find the last assistant message for the current run:
// Get the last assistant message from the messages array
const messages = await openai.beta.threads.messages.list(thread.id);

// Find the last message for the current run
const lastMessageForRun = messages.data
  .filter(
    (message) =>
      message.run_id === run.id && message.role === "assistant",
  )
  .pop();

// If an assistant message is found, console.log() it
if (lastMessageForRun) {
  // apparently the `content` array isn't correctly typed;
  // it returns an array of objects that contain a text object
  const messageValue = lastMessageForRun.content[0] as {
    text: { value: string };
  };
  console.log(`${messageValue?.text?.value} \n`);
}
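The tool-call handler earlier invokes a `displayQuiz` helper that isn't listed in the snippets above. Here is a minimal sketch based on the OpenAI cookbook example this article follows; the `formatQuestion` helper and its exact prompt wording are my own, and it reuses the `askRLineQuestion` method defined at the start:

```typescript
// Assumed shape of the questions produced by the model, per the quizJson schema
type QuizQuestion = {
  question_text: string;
  question_type: "MULTIPLE_CHOICE" | "FREE_RESPONSE";
  choices?: string[];
};

// Defined earlier in the article; declared here only so the sketch type-checks on its own
declare function askRLineQuestion(question: string): Promise<string>;

// Build the prompt text for a single question (kept pure so it's easy to test)
function formatQuestion(question: QuizQuestion): string {
  let text = `Question: ${question.question_text}\n`;
  if (question.question_type === "MULTIPLE_CHOICE" && question.choices) {
    text += `Options: ${question.choices.join(",")}\n`;
  }
  return text;
}

// Print the quiz title, ask each question on the command line, and collect
// the user's answers so they can be sent back to the run as tool output
async function displayQuiz(title: string, questions: QuizQuestion[]) {
  console.log(`Quiz:\n${title}\n`);
  const responses: string[] = [];
  for (const question of questions) {
    responses.push(await askRLineQuestion(formatQuestion(question)));
  }
  console.log("Your responses from the quiz:\n", responses);
  return responses;
}
```

Whatever `displayQuiz` returns is what gets stringified and submitted as the tool output, which is how the model receives the student's answers.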
To keep the conversation going, we again read the user input, which allows us to toggle off the `continueConversation` flag if necessary, or repeat the process in a conversational manner.
// then ask if the user wants to continue
const continueAsking = await askRLineQuestion(
  "Do you want to keep having a conversation? (yes/no) ",
);
continueConversation = continueAsking.toLowerCase().includes("yes");
// If the continueConversation state is falsy show an ending message
if (!continueConversation) {
  console.log("Alrighty then, I hope you learned something!\n");
}
}
Don't forget to close the `readline` interface:
readline.close();
} catch (error) {
console.error(error);
}
}
Finally, we'll call the `main` function to run the program.
// call the main function after declaring it
main();
And the full `main` function will look like this:
async function main() {
  try {
    const assistant = await openai.beta.assistants.create({
      name: "Math Tutor",
      instructions:
        "You are a personal math tutor. Answer questions briefly, in a sentence or less.",
      tools: [
        { type: "code_interpreter" },
        {
          type: "function",
          function: quizJson,
        },
      ],
      // works much better with the newer model
      model: "gpt-4-1106-preview",
      // model: "gpt-3.5-turbo-1106",
    });

    // Log a first greeting
    console.log(
      "\nHello there, I'm Fernando's personal Math assistant. We'll start with a small quiz.\n",
    );

    // Create a thread
    const thread = await openai.beta.threads.create();

    // Use continueConversation as state to keep asking questions
    let continueConversation = true;
    let isQuizAnswered = false;

    while (continueConversation) {
      const userQuestion = isQuizAnswered
        ? await askRLineQuestion("Your next question to the model: \n")
        : // this will make the model build a quiz using our provided function
          "Make a quiz with 2 questions: One open ended, one multiple choice. " +
          "Then, give me feedback for the responses.";

      // Pass the user question into the existing thread
      await openai.beta.threads.messages.create(thread.id, {
        role: "user",
        content: userQuestion,
      });

      // Use runs to wait for the assistant response and then retrieve it
      const run = await openai.beta.threads.runs.create(thread.id, {
        assistant_id: assistant.id,
      });

      let actualRun = await openai.beta.threads.runs.retrieve(
        thread.id,
        run.id,
      );

      // Polling mechanism to see if actualRun is completed
      while (
        actualRun.status === "queued" ||
        actualRun.status === "in_progress" ||
        actualRun.status === "requires_action"
      ) {
        // requires_action means that the assistant is waiting for the functions to be executed
        if (actualRun.status === "requires_action") {
          // extract the single tool call
          const toolCall =
            actualRun.required_action?.submit_tool_outputs?.tool_calls[0];
          const name = toolCall?.function.name;
          const args = JSON.parse(toolCall?.function?.arguments || "{}");
          const questions = args.questions;
          const responses = await displayQuiz(name || "cool quiz", questions);
          // toggle the flag that marks the initial quiz as answered
          isQuizAnswered = true;
          // we must submit the tool outputs to the run to continue
          await openai.beta.threads.runs.submitToolOutputs(
            thread.id,
            run.id,
            {
              tool_outputs: [
                {
                  tool_call_id: toolCall?.id,
                  output: JSON.stringify(responses),
                },
              ],
            },
          );
        }
        // keep polling until the run is completed
        await new Promise((resolve) => setTimeout(resolve, 2000));
        actualRun = await openai.beta.threads.runs.retrieve(thread.id, run.id);
      }

      // Get the last assistant message from the messages array
      const messages = await openai.beta.threads.messages.list(thread.id);

      // Find the last message for the current run
      const lastMessageForRun = messages.data
        .filter(
          (message) =>
            message.run_id === run.id && message.role === "assistant",
        )
        .pop();

      // If an assistant message is found, console.log() it
      if (lastMessageForRun) {
        // apparently the `content` array isn't correctly typed;
        // it returns an array of objects that contain a text object
        const messageValue = lastMessageForRun.content[0] as {
          text: { value: string };
        };
        console.log(`${messageValue?.text?.value} \n`);
      }

      // Then ask if the user wants to ask another question and update the continueConversation state
      const continueAsking = await askRLineQuestion(
        "Do you want to keep having a conversation? (yes/no) ",
      );
      continueConversation = continueAsking.toLowerCase().includes("yes");
      // If the continueConversation state is falsy show an ending message
      if (!continueConversation) {
        console.log("Alrighty then, I hope you learned something!\n");
      }
    }

    // close the readline interface
    readline.close();
  } catch (error) {
    console.error(error);
  }
}
If you've created a new Node project using `npm init` (recommended), you can add a script to run your project as follows:
{
  "scripts": {
    "start": "ts-node yourFileName.ts"
  }
}
We can now run our program using `npm start`, and we should get the following output:
Hello there, I'm Fernando's personal Math assistant. We'll start with a small quiz.
> Quiz:
display_quiz
Question: What is the derivative of 3x^2?
f'(x) = 6x
Question: What is the integral of x dx?
Options: 0.5x^2 + C,x^2 + C,2x + C,None of these
0.5x^2 + C
Your responses from the quiz:
[ "f'(x) = 6x\n", '0.5x^2 + C\n' ]
Great work on the quiz! Your response to the first question, the derivative of ( 3x^2 ), is correct; it is ( 6x ).
For the second question, you correctly chose ( 0.5x^2 + C ) as the integral of ( x ) with respect to ( x ). Good job! Keep it up!
Do you want to keep having a conversation? (yes/no) yes
Your next question to the model:
Why did you ask derivative and integral questions? A lot of years have passed since I've done any of those.
My apologies for picking those math topics. It's common to use calculus questions in a math context, but I understand it might not be fresh in everyone's mind. If you have any other areas of interest or specific topics you'd like to review or learn, please let me know, and I can tailor the content accordingly.
Do you want to keep having a conversation? (yes/no) no
Alrighty then, I hope you learned something!
And that's it! We've created a simple math assistant that can answer questions and keep the conversation going. This is just a proof of concept, but it shows the potential of the new Assistants API. I'm sure we'll see more examples and use cases in the near future.
If you want to see the full code, you can find it in this repo