Connor Blake

I should have clarified: that LW post is the one on which I based my question, so here is a more fleshed-out version. Because GPTs are trained on human data, and because humans make mistakes and lack complete understanding of most situations, it seems highly implausible to me that enough information can be extracted from text/images to make valid predictions about highly complex/abstract topics, given the imprecision of language.

Yudkowsky says of GPT-4: 

It is being asked to model what you were thinking - the thoughts in your mind whose shadow is your text output - so as to assign as much probability as possible to your true next word.
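For concreteness, the objective Yudkowsky is describing — assigning as much probability as possible to the true next word — can be sketched with a toy bigram model (my own illustration, not from the post; a real GPT conditions on long contexts with a neural network, not on counts of the single previous word):

```python
from collections import Counter, defaultdict

# Toy next-word model: estimate P(next word | previous word)
# from raw co-occurrence counts in a tiny "corpus".
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(prev):
    """Empirical distribution over the word following `prev`."""
    c = counts[prev]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

# After "the", the model assigns probability 2/3 to "cat"
# and 1/3 to "mat".
print(next_word_probs("the"))
```

Training a GPT amounts to pushing a far richer version of `next_word_probs` toward the true distribution of human text, which is where the "modeling the thoughts behind the text" framing comes from.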

How do we know it will be able to extract enough information from the shadow to reconstruct the thoughts? Text carries comparatively little information for characterizing such a complex system. It reminds me of problems like the inverse scattering problem or CT reconstruction, where the underlying structure is very complex and all you get is a low-dimensional projection of it, which may or may not be invertible to recover the original structure. CT scans can find tumors, but they can't tell you which gene mutated, because they just don't have enough resolution.
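The underdetermination worry in the inverse-problem analogy can be made concrete with a toy example (my own sketch, not from the post): a lossy linear projection maps many distinct underlying states to the same observation, so the observation alone cannot pin down the state.

```python
import numpy as np

# A projection that drops one dimension: R^3 -> R^2.
# Stand-in for any lossy measurement: text as a "shadow"
# of thoughts, or a 2D scan of a 3D body.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Two different underlying "states" that differ only in the
# dimension the projection discards ...
x1 = np.array([2.0, 3.0, 5.0])
x2 = np.array([2.0, 3.0, -7.0])

# ... produce exactly the same observation, so no amount of
# cleverness recovers the lost coordinate from P @ x alone.
assert np.allclose(P @ x1, P @ x2)
print(P @ x1)  # [2. 3.]
```

The open question, of course, is whether human text is this kind of projection — one that destroys the relevant information — or one that, given enough of it, still constrains the underlying "thoughts" tightly.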

Yudkowsky gives this as an example in the article: 

"Imagine a Mind of a level where it can hear you say 'morvelkainen blaambla ringa', and maybe also read your entire social media history, and then manage to assign 20% probability that your next utterance is 'mongo'."

I understand that making that kind of prediction would be evidence of extreme intelligence, but I don't see how a path to such a conclusion can be built solely from its training data.

Going further, because the training data comes from humans (who, as mentioned, make mistakes and have an incomplete understanding of the world), it seems highly unlikely that the model could produce new concepts in something as exact as math or science if its understanding of causality is based solely on predicting something as unpredictable as human behavior, even if it's really good at that prediction. Why should we assume that a model, even a really big one, would converge to understanding the laws of physics well enough to make new discoveries from human data alone? Is the idea behind ASI that it will even come from LLMs? If so, I am very curious to hear the theory for how that will develop, which I am not grasping here.

This question is more about ASI, but here goes: if LLMs are trained on human writings, what is the current understanding of how an ASI/AGI could get smarter than humans? Would it not just asymptotically approach human intelligence levels? It seems a model can get smarter by learning more and more from the training set, but the training set itself only knows so much.