You’re massively underestimating the power or big data. Think about a dataset of millions of screenshot sequences.
You could have said something very similar about LLMs learning semantic meaning while being training on basically random garbage text from the internet
You’re massively underestimating the power or big data. Think about a dataset of millions of screenshot sequences.
You could have said something very similar about LLMs learning semantic meaning while being training on basically random garbage text from the internet