It seems Meta has been doing some research lately on replacing current tokenizers with new/different representations:
- Large Concept Models: Language Modeling in a Sentence Representation Space [GitHub] (December 11, 2024)
- Byte Latent Transformer: Patches Scale Better Than Tokens [GitHub] (December 12, 2024) — a rough sketch of the patching idea follows the list
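
For context (this is my illustration, not from the papers): the BLT paper's key move is to group raw bytes into variable-length patches based on how predictable the next byte is, instead of running a fixed tokenizer. Below is a minimal, hypothetical Python sketch of that idea — it swaps the paper's small learned byte-level transformer for a smoothed bigram model, and uses per-byte surprisal as a rough proxy for the paper's entropy criterion. All function names and thresholds are illustrative.

```python
import math
from collections import Counter, defaultdict

def train_bigram(data: bytes) -> dict:
    """Count byte bigrams -- a toy stand-in for the small byte-level LM
    the paper uses to score how predictable each next byte is."""
    counts = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    return counts

def surprisal(counts: dict, prev: int, cur: int, alpha: float = 0.01) -> float:
    """Bits of surprise for seeing byte `cur` after `prev`, with light
    add-alpha smoothing over the 256 possible byte values."""
    c = counts[prev]
    total = sum(c.values()) + 256 * alpha
    return -math.log2((c[cur] + alpha) / total)

def patch(data: bytes, threshold: float = 3.0) -> list[bytes]:
    """Start a new patch wherever the stream becomes hard to predict, so
    repetitive regions collapse into long patches while novel regions get
    split finely -- the dynamic compute allocation idea behind BLT."""
    counts = train_bigram(data)
    patches, start = [], 0
    for i in range(1, len(data)):
        if surprisal(counts, data[i - 1], data[i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

text = b"the cat sat on the mat. the cat sat on the hat. zqx!"
print([p.decode(errors="replace") for p in patch(text)])
```

The point of the sketch: predictable stretches (the repeated "the cat sat on the" phrasing) stay inside long patches, while rare transitions (like "zqx!") trigger new, short patches — so the big transformer spends its steps where the data is actually surprising. The real model replaces the bigram with a trained byte LM and adds local encoder/decoder modules around the latent transformer.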