Debate (Part Two): Does Sora Truly Understand the Physical World?

Background: Although Sora has not yet been officially released to the public, discussion of its technical details and real-world impact has never ceased. Behind this discussion lie fundamental questions about artificial intelligence.

Sora’s generated results are indeed impressive: the resolution is high, and subjects stay consistent even after multiple changes of camera angle. Does this level of generation imply that Sora is a world model? Given that it can generate realistic videos, can it be said that Sora understands the physical world?

Renowned physicist Feynman once said: “What I cannot create, I do not understand.” Logically, the contrapositive of this statement is: “What I understand, I can create.” So, does the converse hold, that creation implies understanding? It does not follow from logic alone, but I believe it does.

For example, behind the appearance of water waves there must be a set of wave equations. But do people understand water through these equations? Do most people truly grasp the equations of fluid dynamics? No. But do most people understand how water physically behaves? I believe they do.

This understanding can be read in two ways. One is intuitive: we know that objects fall under gravity and that water ripples. The other is formal: do we truly understand the equations that govern these phenomena, equations that were themselves abstracted from observation?

I believe the majority do not. For instance, did Newton suddenly derive the law of gravitation after an apple fell on his head? Actually, no. It built on formulas and papers that came long before, and certainly not on watching videos alone.

However, when it comes to this intuitive understanding of the physical world, we are in the same position as Sora. So if Sora can generate, then it understands.


One important reason why Sora cannot understand the physical world is that it attempts to discover physical laws from a large amount of non-experimental data. In other words, it does not conduct experiments; it merely passively observes our world.

A basic conclusion of statistical causality is “no intervention, no causality”: if no intervention can be carried out, an algorithm cannot discover statistical causal laws. And if it cannot discover even statistical causal laws, it certainly cannot discover physical laws, because one characteristic of physical laws is that they describe the causal relationships among phenomena in the physical world.

Therefore, whether it’s Sora or ChatGPT, if it only passively collects data and then trains a large model, it can be deceived, and what it learns is only “correlation,” not “causality.” This is the first argument.
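
The “correlation, not causality” point can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions, not a model of how Sora is trained: the variables `z`, `x`, `y`, the noise level 0.3, and the function names `pearson` and `simulate` are all hypothetical. A hidden factor `z` drives both `x` and `y`, while `x` has no effect on `y` at all. A passive observer sees a strong correlation between `x` and `y`; an experimenter who sets `x` independently of `z` (an intervention) sees that correlation disappear.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def simulate(n, intervene, seed=0):
    """Return corr(x, y) in a toy world where y never depends on x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden common cause (assumed for illustration)
        if intervene:
            x = rng.gauss(0, 1)  # experimenter sets x, ignoring z
        else:
            x = z + rng.gauss(0, 0.3)  # passive observation: x follows z
        y = z + rng.gauss(0, 0.3)  # y is caused by z only
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


observed = simulate(5000, intervene=False)     # passive data
experimental = simulate(5000, intervene=True)  # data under intervention
print(f"observational correlation:  {observed:.2f}")
print(f"interventional correlation: {experimental:.2f}")
```

Under these assumptions the observational correlation comes out close to 0.9, while the interventional one is close to zero. A learner that only ever sees the passive data has no way to tell this world apart from one in which `x` really does cause `y`.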

Second, the history of how humans have discovered physical laws over the past few centuries shows that scientific discovery requires not only real data and observation of phenomena but also counterintuitive thinking and assumptions.

Aristotle’s intuitive notion that an object keeps moving only while a force acts on it, and otherwise comes to rest, matches what we ordinarily see in real life. Guided by this incorrect intuition, humans failed to discover the correct laws of physics for nearly two thousand years.

It wasn’t until Galileo and Newton recognized the counterintuitive law that objects remain in uniform motion unless acted upon by a force that today’s edifice of physics could be built. Intuition-driven reasoning is unreliable, and conclusions drawn from direct observation are not always reliable either.
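
The gap between everyday observation and the law of inertia can be made concrete with a short calculation. This is a minimal sketch using the standard textbook formula for a block sliding under kinetic friction, d = v0² / (2·μ·g); the function name `stopping_distance`, the initial speed of 2 m/s, and the friction coefficients are assumptions chosen for illustration. Every observable surface has some friction, so every observed block stops. Only by extrapolating to the unobservable limit μ → 0 does uniform motion appear.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance(v0, mu):
    """Distance (m) a block launched at v0 (m/s) slides before kinetic
    friction with coefficient mu brings it to rest."""
    return v0 ** 2 / (2 * mu * G)


# Hypothetical surfaces, from rough to very smooth.
coefficients = (0.5, 0.05, 0.005)
distances = [stopping_distance(2.0, mu) for mu in coefficients]

for mu, d in zip(coefficients, distances):
    print(f"mu = {mu:<6} -> block stops after {d:.2f} m")
```

Each tenfold reduction in friction makes the block travel ten times farther, so the stopping distance grows without bound as friction vanishes. No finite set of real-world observations contains the frictionless case itself.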

There are many similar examples. Physics relies on many idealized models, such as the blackbody: no matter how we observe the real world, we will never find a true blackbody. Yet although it can never be observed in real life, it has been extremely important for discovering the laws of physics.

If it weren’t for these idealized assumptions, the edifice of physics could not have been established. Therefore, if Sora only passively observes the world, without the human ability to make counterintuitive assumptions, it will never construct correct physical laws.

In short, Sora relies solely on intuition to fit non-experimental observational data. Without introducing counterintuitive thinking and without intervening in the world, it cannot discover true physical laws.

