
What Audio Format Did the PlayStation Use for Cutscenes?

The original PlayStation relied on specific compression standards to handle audio in cinematic sequences because of the bandwidth limits of CD-ROM technology. This article explains how ADPCM encoding within CD-XA sectors let the console stream sound alongside pre-rendered video during cutscenes.

When the Sony PlayStation launched in 1994, its CD-ROM media offered far more storage than cartridges but was still constrained by data transfer rates. To accommodate full-motion video (FMV) cutscenes, developers needed an audio format compact enough to share bandwidth with the video stream. The system primarily employed Adaptive Differential Pulse Code Modulation (ADPCM) for this purpose. By storing 4-bit encoded samples instead of 16-bit PCM, this compression cut audio data to roughly a quarter of its uncompressed size, allowing smooth playback without overwhelming the console's data bus.
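The core idea of ADPCM can be shown in a few lines: each 4-bit value is a scaled residual, and the decoder reconstructs a 16-bit sample by adding it to a prediction from the previous two samples. The sketch below illustrates the general technique; the filter coefficients and block parameters are simplified illustrations, not the exact XA bitstream layout.

```python
# Illustrative two-tap predictor pairs (f0, f1); real XA-ADPCM hardware
# selects from a small fixed table like this, but these exact values are
# an assumption for demonstration.
FILTERS = [(0.0, 0.0), (0.9375, 0.0), (1.796875, -0.8125), (1.53125, -0.859375)]

def decode_block(nibbles, shift, filter_idx, prev1=0, prev2=0):
    """Expand 4-bit ADPCM samples to 16-bit PCM with a two-tap predictor."""
    f0, f1 = FILTERS[filter_idx]
    out = []
    for n in nibbles:
        # Sign-extend the 4-bit nibble, then scale by the block's shift factor.
        s = n - 16 if n >= 8 else n
        delta = s << (12 - shift)
        # Predict from the two previous samples and add the decoded residual.
        sample = delta + f0 * prev1 + f1 * prev2
        sample = max(-32768, min(32767, int(sample)))  # clamp to 16-bit range
        out.append(sample)
        prev1, prev2 = sample, prev1
    return out, prev1, prev2
```

Because only a residual is stored per sample, each block needs just its shift and filter index as side data, which is how the format reaches roughly 4:1 compression over plain 16-bit PCM.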

These audio streams were typically housed within CD-XA (eXtended Architecture) sectors. CD-XA allowed audio, video, and data to be interleaved on the same track, which was crucial for keeping sound synchronized with the visuals in real time. The PlayStation's hardware decoder was designed to process this 4-bit ADPCM audio efficiently, ensuring that dialogue and sound effects during cutscenes stayed in step with the video output. While the format produced a distinctively compressed sound compared to modern standards, it was a necessary engineering trade-off that enabled the cinematic experiences of classic titles like Final Fantasy VII and Metal Gear Solid.
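The interleaving works because every Mode 2 sector carries a subheader identifying its payload type, so a player loop can route audio sectors to the ADPCM decoder and video sectors to the image decoder as they arrive from the disc. The sketch below assumes the CD-ROM XA subheader layout (file, channel, submode, coding at bytes 16-23 of a raw 2352-byte sector) and standard submode flag bits; the handler callbacks are hypothetical.

```python
SECTOR_SIZE = 2352      # raw CD sector: sync + header + subheader + payload
SUBMODE_OFFSET = 18     # subheader byte order: file, channel, submode, coding
SUBMODE_AUDIO = 0x04    # CD-ROM XA submode flag bits
SUBMODE_VIDEO = 0x02

def route_sector(sector, on_audio, on_video):
    """Dispatch one raw sector to the audio or video path by its submode bits."""
    submode = sector[SUBMODE_OFFSET]
    if submode & SUBMODE_AUDIO:
        on_audio(sector)   # streamed to the ADPCM decoder in real time
    elif submode & SUBMODE_VIDEO:
        on_video(sector)   # queued for the video decompressor
```

In a typical FMV stream, a small fraction of sectors are audio and the rest are video, so the audio arrives at a steady rate without ever pausing the video read: the synchronization falls out of the physical sector ordering on disc.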