How to track token usage
This guide assumes familiarity with the following concepts:
- Chat models
This guide goes over how to track your token usage for specific calls.
Using AIMessage.response_metadata
A number of model providers return token usage information as part of the chat generation response. When available, this is included in the AIMessage.response_metadata field.
Here's an example with OpenAI:
npm install @langchain/openai
yarn add @langchain/openai
pnpm add @langchain/openai
import { ChatOpenAI } from "@langchain/openai";
const chatModel = new ChatOpenAI({
model: "gpt-4-turbo",
});
const res = await chatModel.invoke("Tell me a joke.");
console.log(res.response_metadata);
/*
{
tokenUsage: { completionTokens: 15, promptTokens: 12, totalTokens: 27 },
finish_reason: 'stop'
}
*/
API Reference: ChatOpenAI from @langchain/openai
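If you only need the counts, you can destructure them from the metadata directly. A minimal sketch, continuing from the res returned above and assuming the tokenUsage shape shown there:

// Pull the individual counts out of the OpenAI-style metadata above
const { promptTokens, completionTokens, totalTokens } =
  res.response_metadata.tokenUsage;
console.log(`${promptTokens} prompt + ${completionTokens} completion = ${totalTokens} total tokens`);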
And here's an example with Anthropic:
npm install @langchain/anthropic
yarn add @langchain/anthropic
pnpm add @langchain/anthropic
import { ChatAnthropic } from "@langchain/anthropic";
const chatModel = new ChatAnthropic({
model: "claude-3-sonnet-20240229",
});
const res = await chatModel.invoke("Tell me a joke.");
console.log(res.response_metadata);
/*
{
id: 'msg_017Mgz6HdgNbi3cwL1LNB9Dw',
model: 'claude-3-sonnet-20240229',
stop_sequence: null,
usage: { input_tokens: 12, output_tokens: 30 },
stop_reason: 'end_turn'
}
*/
API Reference: ChatAnthropic from @langchain/anthropic
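Note that the two providers nest their counts differently: OpenAI reports a tokenUsage object with promptTokens and completionTokens, while Anthropic reports a usage object with input_tokens and output_tokens. If you work with multiple providers, a small normalizing helper can smooth this over. This is a rough sketch (not a built-in LangChain API), handling only the two shapes shown above:

import { BaseMessage } from "@langchain/core/messages";

// Sketch: normalize provider-specific usage metadata into one shape.
function getTokenCounts(message: BaseMessage) {
  const meta = message.response_metadata;
  if (meta.tokenUsage) {
    // OpenAI-style metadata
    return {
      input: meta.tokenUsage.promptTokens,
      output: meta.tokenUsage.completionTokens,
    };
  }
  if (meta.usage) {
    // Anthropic-style metadata
    return { input: meta.usage.input_tokens, output: meta.usage.output_tokens };
  }
  return undefined;
}

// Reusing res from the Anthropic example above:
console.log(getTokenCounts(res));
/*
  { input: 12, output: 30 }
*/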
Using callbacks
You can also use the handleLLMEnd callback to get the full output from the LLM, including token usage for supported models.
Here's an example of how you could do that:
import { ChatOpenAI } from "@langchain/openai";
const chatModel = new ChatOpenAI({
model: "gpt-4-turbo",
callbacks: [
{
handleLLMEnd(output) {
console.log(JSON.stringify(output, null, 2));
},
},
],
});
await chatModel.invoke("Tell me a joke.");
/*
{
"generations": [
[
{
"text": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
"message": {
"lc": 1,
"type": "constructor",
"id": [
"langchain_core",
"messages",
"AIMessage"
],
"kwargs": {
"content": "Why did the scarecrow win an award?\n\nBecause he was outstanding in his field!",
"tool_calls": [],
"invalid_tool_calls": [],
"additional_kwargs": {},
"response_metadata": {
"tokenUsage": {
"completionTokens": 17,
"promptTokens": 12,
"totalTokens": 29
},
"finish_reason": "stop"
}
}
},
"generationInfo": {
"finish_reason": "stop"
}
}
]
],
"llmOutput": {
"tokenUsage": {
"completionTokens": 17,
"promptTokens": 12,
"totalTokens": 29
}
}
}
*/
API Reference: ChatOpenAI from @langchain/openai
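Callbacks also make it easy to keep a running total across multiple calls. A minimal sketch, assuming the llmOutput.tokenUsage shape shown above (other providers may report usage under a different key):

import { ChatOpenAI } from "@langchain/openai";

// Sketch: accumulate total token usage across calls to this model.
let totalTokens = 0;

const trackedModel = new ChatOpenAI({
  model: "gpt-4-turbo",
  callbacks: [
    {
      handleLLMEnd(output) {
        // llmOutput is provider-specific; fall back to 0 if it's absent.
        totalTokens += output.llmOutput?.tokenUsage?.totalTokens ?? 0;
      },
    },
  ],
});

await trackedModel.invoke("Tell me a joke.");
await trackedModel.invoke("Tell me another joke.");

console.log(`Total tokens across both calls: ${totalTokens}`);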
Next steps
You've now seen a few examples of how to track chat model token usage for supported providers.
Next, check out the other how-to guides on chat models in this section, like how to get a model to return structured output or how to add caching to your chat models.