Humans don't need trillions of tokens to reason, or to know what they know. While part of that ability comes from evolution, I think we have already matched the evolutionary part using internet data: basic language skills, basic world modelling. Current pretraining takes a lot more data than a human would need. You don't need to look through all of Getty Images to draw a picture, and neither would a self-aware/self-improving model (whatever that means).
To reach expert level in any field, just training on next-token prediction over internet data, or any data, is not the solution.
I wonder about that. We can fine-tune on calculus with far fewer tokens, but I'd be interested in some calculations of how many tokens evolution provides us (it's not about the DNA itself, but all the other things that were explored and discarded and are now out of reach), and also of the sheer amount of physics learnt by a baby crawling around and putting everything in its mouth.
Yes, as I said in the last comment: with current training techniques, one internet's worth of data is enough to give models what evolution gave us. For further training, I believe we would need different techniques to make the model self-aware about its own knowledge.
Also, I believe a person who is blind and paralyzed for life could still attain knowledge if educated well enough. (I can't find any study on this, tbh.)
Yeah, blind and paralysed from birth: I'm doubtful that hearing alone would give you the physics training. Although if it can be done, then it means the evolutionary pre-training is even more impressive.
> Humans don't need trillions of tokens to reason, or to know what they know.
It seems to me by the time we’re 5-6 we’ve likely already been exposed to trillions of tokens. Just think of how many hours of video and audio tokens have already come to your brain by that point. We also have constant input from senses like touch and proprioception that help shape our understanding of the world around us.
I think there are plenty more tokens freely available out in the world. We just haven’t figured out how to have machines capture them yet.
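For what it's worth, here is a rough back-of-envelope sketch of the "trillions of tokens by age 5-6" claim. Every number in it is an assumption, not a measurement: 12 waking hours a day is a guess, and how to count sensory input as "tokens" is not settled at all. Rather than pick a token rate, it works out what rate the claim would require.

```python
# Back-of-envelope only. All inputs are assumptions for illustration,
# not measured values: nobody agrees on how to tokenize sensory input.

SECONDS_PER_HOUR = 3600
DAYS_PER_YEAR = 365


def waking_seconds(years, waking_hours_per_day=12.0):
    """Seconds spent awake over `years`, assuming a fixed waking day."""
    return years * DAYS_PER_YEAR * waking_hours_per_day * SECONDS_PER_HOUR


def total_tokens(years, tokens_per_second, waking_hours_per_day=12.0):
    """Tokens seen if input arrives at a constant assumed rate while awake."""
    return waking_seconds(years, waking_hours_per_day) * tokens_per_second


def required_rate(target_tokens, years, waking_hours_per_day=12.0):
    """Tokens per waking second needed to reach `target_tokens` by `years`."""
    return target_tokens / waking_seconds(years, waking_hours_per_day)


if __name__ == "__main__":
    years = 5                      # age from the comment above
    target = 1e12                  # "a trillion tokens"
    rate = required_rate(target, years)
    print(f"waking seconds by age {years}: {waking_seconds(years):,.0f}")
    print(f"tokens/second needed for 1e12 tokens: {rate:,.0f}")
```

Under those assumptions, a 5-year-old has been awake for roughly 79 million seconds, so hitting a trillion tokens needs on the order of ten thousand tokens per waking second across all senses combined. Whether that is plausible depends entirely on how you decide to tokenize vision, sound, touch and proprioception, which is the part nobody has pinned down.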
> To reach expert level in any field, just training next tokens for internet data or any data is not the solution.