Word Tokenization
AI can split sentences into words for use in text search, keyword extraction, and data retrieval.
What is Word Tokenization?
Word tokenization is the process of identifying the boundaries between words in a sentence. More generally, tokenization breaks raw text into smaller units called “tokens”, which can be words, subwords, or characters; in this model, a token is a word. Word tokenization underpins many Natural Language Processing (NLP) pipelines, such as text search and keyword extraction. It is especially important for Thai, which is written without spaces between words and therefore has no explicit word boundaries in a sentence.
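The page does not specify the algorithm behind this service. One common baseline for tokenizing text without spaces is dictionary-based greedy longest matching; the sketch below illustrates the idea (the word list and sample string are illustrative only, not taken from this product):

```python
def tokenize_longest_match(text, dictionary):
    """Greedy longest-match tokenization against a known word list."""
    tokens = []
    max_len = max(len(w) for w in dictionary)
    i = 0
    while i < len(text):
        # Try the longest dictionary word that starts at position i.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
        else:
            # Out-of-vocabulary character: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens

# Toy example on an unspaced string (illustrative only).
words = {"text", "search", "keyword", "extraction"}
print(tokenize_longest_match("textsearch", words))  # ['text', 'search']
```

Production Thai tokenizers typically combine a large dictionary with statistical or neural models to resolve ambiguous boundaries, which greedy matching alone cannot handle.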
Pricing
| Product | Credit / Request |
|---|---|
| AI Marketplace - Word Tokenization | 0.04 Credit / 1,000 Characters |
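Assuming the pricing scales linearly with character count, the cost of a request can be estimated as follows (the function name is illustrative, not part of the service's API):

```python
CREDITS_PER_1000_CHARS = 0.04  # rate from the pricing table above

def estimate_cost(num_chars):
    """Estimate credits consumed, assuming linear per-character pricing."""
    return num_chars / 1000 * CREDITS_PER_1000_CHARS

# e.g. a 5,000-character document: 5 x 0.04 = 0.2 credits
print(estimate_cost(5000))
```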