Skip to content

Get-TokenCountEstimate

SYNOPSIS

Estimates the number of tokens in the provided text.

SYNTAX

Get-TokenCountEstimate [-Text] <String> [<CommonParameters>]

DESCRIPTION

Estimates the number of tokens in the provided text based on an average token length of 4 characters. It provides a rough estimate of token count, which can be useful for understanding potential usage costs with language models.

EXAMPLES

EXAMPLE 1

Get-TokenCountEstimate -Text 'This is a test.'

Estimates the number of tokens in the provided text.

EXAMPLE 2

Get-TokenCountEstimate -Text (Get-Content -Path 'C:\Temp\test.txt' -Raw)

Estimates the number of tokens in the text file.

PARAMETERS

-Text

The text to estimate tokens for.

Type: String
Parameter Sets: (All)
Aliases:

Required: True
Position: 1
Default value: None
Accept pipeline input: False
Accept wildcard characters: False

CommonParameters

This cmdlet supports the common parameters: -Debug, -ErrorAction, -ErrorVariable, -InformationAction, -InformationVariable, -OutBuffer, -OutVariable, -PipelineVariable, -Verbose, -WarningAction, -WarningVariable, and -ProgressAction. For more information, see about_CommonParameters (http://go.microsoft.com/fwlink/?LinkID=113216).

INPUTS

OUTPUTS

System.Int32

NOTES

Author: Jake Morrison - @jakemorrison - https://www.techthoughts.info/

This function provides an estimate of the number of tokens in a given text. Note that it is just an estimate, as each language model (LLM) has a different tokenization strategy. The tokenization strategy used in this function is based on an average token length of 4 characters.

https://www.pwshbedrock.dev/en/latest/Get-TokenCountEstimate/