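When frames are missing, the decoder tries to fill in the gaps. MPEG-4 codecs define several frame types (I, P and B). Only the I-frames hold a complete picture; the P- and B-frames, like you say, carry something like motion vectors that describe how to build the picture from neighbouring frames.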
This article explains how the different kinds of frames relate and how they're combined. I think what you're seeing, though, is just missing frames in general and the codec's attempt to reconstruct them: I have played MPEG-4 videos with large, megabyte-sized chunks missing and got similar results, and those gaps would probably have taken out any or all of the frame types.
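To make the reconstruction idea a bit more concrete, here is a toy sketch in Python. It is not a real MPEG-4 decoder: the names `encode` and `decode` are made up for illustration, pictures are just short lists of numbers, P-frames store plain per-pixel differences instead of real motion vectors, and the "concealment" (repeat the last decoded picture when a frame is missing) is an assumption about one simple strategy, not what any particular codec does.

```python
from typing import List, Optional, Tuple

# Hypothetical toy model: ("I", full picture) or ("P", per-pixel deltas).
Frame = Tuple[str, List[int]]


def encode(pictures: List[List[int]], gop: int = 4) -> List[Frame]:
    """Emit an I-frame every `gop` pictures and P-frames (deltas) in between."""
    stream: List[Frame] = []
    prev: List[int] = []
    for i, pic in enumerate(pictures):
        if i % gop == 0:
            stream.append(("I", list(pic)))
        else:
            stream.append(("P", [c - p for c, p in zip(pic, prev)]))
        prev = pic
    return stream


def decode(stream: List[Optional[Frame]], width: int) -> List[List[int]]:
    """Decode the stream; a missing frame (None) repeats the last picture."""
    out: List[List[int]] = []
    ref = [0] * width  # last decoded picture, used as the reference
    for frame in stream:
        if frame is None:
            pass  # assumed concealment: keep showing the reference picture
        elif frame[0] == "I":
            ref = list(frame[1])  # full picture, no reference needed
        else:
            ref = [r + d for r, d in zip(ref, frame[1])]  # apply deltas
        out.append(list(ref))
    return out


pictures = [[i, 2 * i] for i in range(8)]
stream = encode(pictures)

damaged: List[Optional[Frame]] = list(stream)
damaged[1] = None  # simulate losing one P-frame

decoded = decode(damaged, width=2)
for original, shown in zip(pictures, decoded):
    print(original, shown, "ok" if original == shown else "corrupted")
```

In this toy model, losing a single P-frame corrupts every picture up to the next I-frame, because each later P-frame applies its deltas to a reference that is already wrong. The next I-frame resets the picture, which roughly matches the "garbage that clears up after a moment" effect you get from damaged files.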