Why is it 44.1kHz, 48kHz, 30 Frames Per Second, & What is Timecode?

Frame Rates, Timecode and Sample Rates: A Musician's Guide

A Journey Through Audio and Video History

What is Frame Rate? 

Moving images—whether films, television, or streaming videos—are composed of a series of discrete images shown rapidly in sequence. Our brains seamlessly bridge the gaps between these images, creating the illusion of continuous motion, storytelling, and space.

The term 'frame rate' denotes how many individual pictures are shown per second during this process.

Various frame rates are used worldwide, often due to historical technical reasons.

For instance, the mains electricity frequency in the US and Japan is 60Hz, while in the UK and Europe it is 50Hz. Early video recording adopted half of these frequencies, resulting in commonly used frame rates of 30 frames per second (fps) in the US and Japan, and 25 fps in the UK and Europe.

You might already know that television broadcasts are often described as 50Hz in Europe and 60Hz in the US and Japan. This can be somewhat misleading because these broadcast standards used a method called interlacing.

What is Interlaced Video?

Back in the era of CRT (Cathode Ray Tube) televisions, images were created by an electron beam scanning the screen line by line, hitting a fluorescent surface to produce visible images.

Interlacing was a clever technical solution. Although the scanning frequency was indeed 50Hz or 60Hz, each pass of the electron beam only scanned alternate lines (first scanning lines 1, 3, 5, 7, etc., and then lines 2, 4, 6, 8, etc. on the next pass). Due to the lingering glow of the phosphorescent coating on the screen, viewers perceived a continuous and flicker-free image, effectively creating the illusion of double the actual frame rate.
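
The field-splitting idea is easy to model in a few lines of Python. This is a toy sketch treating a frame as a simple list of scanlines, not any real broadcast API:

```python
def interlace(frame):
    """Split a progressive frame (a list of scanlines) into two fields."""
    odd_field = frame[0::2]    # picture lines 1, 3, 5, ...
    even_field = frame[1::2]   # picture lines 2, 4, 6, ...
    return odd_field, even_field

def weave(odd_field, even_field):
    """Re-combine two fields back into one full frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    return frame

# A 50Hz interlaced broadcast transmits 50 fields per second,
# which weave together into 25 full frames.
```

Each field carries only half the picture, which is why a "50Hz" broadcast is really 25 full frames per second.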

What is Progressive Video?

In video technology, "progressive" refers to a method of displaying or capturing images where each frame is drawn sequentially from top to bottom, line by line. Unlike interlaced video—which splits each frame into two separate fields displayed alternately—progressive video shows the entire frame in one continuous sweep.

Progressive scanning offers several key benefits, especially when dealing with fast-moving images or precise visual details. Because each frame is displayed fully in sequence, progressive video typically provides smoother motion, clearer detail, and less visual distortion (often called "interlace artifacts").

As digital technology advanced, progressive video became more popular and accessible, especially through high-definition TV broadcasts and internet streaming, where clearer, artifact-free visuals became important. Today, progressive scanning is the standard approach for most modern video applications, from smartphones and streaming services to digital cinema and high-definition broadcasting.

This is what the 'p' stands for in 1080p.

You may see options for these video types in some higher-end video apps. For all modern video work you should nearly always use 'progressive' unless interlaced is specifically requested. It rarely will be, but it's good to know why the two standards exist.

Before we assess what frame rate you should use, let's look at the history behind the most commonly used ones:

Common Frame Rates and Their Origins

24 fps: The Cinematic Standard

The 24 frames per second standard has been the backbone of cinema since the late 1920s. When Warner Bros. released "The Jazz Singer" in 1927, helping to usher in the era of synchronized sound in motion pictures, 24 fps became the industry standard for a very practical reason: it was the slowest frame rate that could still produce acceptable audio quality with the optical sound recording technology of the time.

This frame rate offered an ideal compromise - enough frames to create smooth motion while remaining economical in terms of film stock usage. The aesthetic quality of 24 fps - that distinctive "film look" with its natural motion blur - has become so ingrained in our visual culture that even today, when digital technology dominates, filmmakers often stick with 24 fps to maintain that cinematic feel.

30 fps: Historical Standard in North America

Originally, North American television was broadcast at exactly 30 frames per second, derived directly from half the region's mains frequency of 60Hz. However, when color TV was introduced in the 1950s, slight adjustments had to be made to the broadcast frequency to avoid interference between the color and audio signals. This adjustment resulted in a slightly reduced frame rate of approximately 29.97 fps, known as "NTSC standard.”

29.97 fps (30 fps drop frame): The NTSC Compromise

Perhaps the most complex timecode standard emerged in North America and Japan with the NTSC (National Television System Committee) color television system. Originally, black and white NTSC television used a clean 30 fps, aligned with North America's 60Hz power grid.

However, when color television was introduced in the 1950s, engineers faced a significant challenge. To maintain compatibility with existing black and white TVs while adding color information, they had to slightly reduce the frame rate to 29.97 fps (precisely 30 × 1000/1001). This small adjustment made room for the color subcarrier signal without increasing bandwidth requirements.

This created a timing discrepancy - at 29.97 fps, timecode would gradually drift away from real clock time. The solution was "drop frame timecode," which skips or "drops" certain frame numbers (specifically, the first two frame numbers of every minute except every tenth minute) to compensate for the timing difference. This maintains time-of-day accuracy without changing the actual frame rate.
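
The drop-frame rule above can be captured in a short sketch. This is a hypothetical helper, not taken from any particular library, converting a running frame count into its drop-frame timecode label:

```python
def frames_to_dropframe(frame_number):
    """Convert a frame count at 29.97 fps to a drop-frame timecode label.

    Frame labels 00 and 01 are skipped at the start of every minute,
    except minutes divisible by ten.
    """
    frames_per_minute = 30 * 60 - 2                      # 1798 labels in a "dropped" minute
    frames_per_10_minutes = frames_per_minute * 10 + 2   # 17982

    d, m = divmod(frame_number, frames_per_10_minutes)
    if m > 2:
        frame_number += 2 * 9 * d + 2 * ((m - 2) // frames_per_minute)
    else:
        frame_number += 2 * 9 * d

    frames = frame_number % 30
    seconds = (frame_number // 30) % 60
    minutes = (frame_number // 1800) % 60
    hours = frame_number // 108000
    # Drop-frame timecode is conventionally written with a semicolon
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"
```

Frame 1800 (one nominal minute in) is labelled 00:01:00;02 because labels ;00 and ;01 were dropped, while ten real minutes of 29.97 fps video (17,982 frames) lands exactly on 00:10:00;00.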

25 fps: The PAL Standard

When television broadcasting began to develop in Europe and many other regions worldwide, the PAL (Phase Alternating Line) system adopted 25 frames per second. This wasn't arbitrary - it was directly related to the 50Hz electrical grid frequency used in these regions. By synchronizing the frame rate to half the AC power frequency, engineers could avoid interference patterns on early television screens.

25 fps timecode became the standard for PAL broadcast television, and this legacy continues today in much of Europe, Australia, parts of Asia, and Africa. The slight speed difference when playing 24 fps film content at 25 fps (about 4% faster) was generally considered acceptable for broadcast television.

50/60 fps: The High Frame Rate Future

With the rise of digital video, higher frame rates have become increasingly common, especially for sports broadcasting, gaming content, and some experimental cinema. These higher frame rates reduce motion blur and can create a heightened sense of reality (though some viewers find this look less "cinematic").

Timecode systems have adapted to accommodate these higher frame rates: 50 fps is common in PAL regions and 59.94 fps (60 × 1000/1001) in NTSC regions, with true 60 fps sometimes used in purely digital contexts.

23.976 fps: The Digital Cinema Adaptation

When film content needed to be adapted for NTSC television broadcast, another frame rate was born: 23.976 fps (precisely 24 × 1000/1001). This slight slowing down of the cinema standard by 0.1% allowed for cleaner conversion to NTSC's 29.97 fps through a process called 3:2 pulldown, where frames are duplicated in a specific pattern to bridge the frame rate difference.
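
Both the 1000/1001 arithmetic and the pulldown cadence are easy to verify. Here's a toy sketch (film frames as letters, interlaced fields paired into video frames) showing how four film frames become five video frames:

```python
from itertools import cycle

NTSC_FILM_RATE = 24 * 1000 / 1001   # approximately 23.976 fps

def pulldown_3_2(film_frames):
    """Spread film frames across interlaced fields in a 3:2 cadence."""
    fields = []
    for frame, n_fields in zip(film_frames, cycle([3, 2])):
        fields.extend([frame] * n_fields)
    # Pair consecutive fields into video frames
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]

# Four film frames (A-D) yield ten fields, i.e. five video frames:
# (A,A) (A,B) (B,C) (C,C) (D,D) -- two of them mix adjacent film frames.
```

The 4-to-5 ratio is exactly the 24-to-30 frame rate bridge; the two "mixed" frames are where pulldown judder comes from.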

In the digital era, 23.976 fps has become a standard in its own right, widely used in digital cinema and streaming content destined for multiple distribution channels.

The best frame rate advice for use in today's online media services: 

YouTube

  • 24 fps: Cinematic style, suitable for storytelling or film-like content.
  • 30 fps: Ideal for general content, tutorials, vlogs, and standard uploads.
  • 60 fps: Best for gaming, sports, action, or videos with lots of fast motion.

Facebook / Instagram

  • 30 fps: Optimal for smooth playback and compatibility across feeds and stories.
  • 24 fps: Acceptable for cinematic or storytelling-driven content.
  • Avoid 60 fps: Often compressed or converted down, potentially losing quality.

TikTok

  • 30 fps: Standard and most widely used; ensures smooth playback.
  • 60 fps: Accepted, especially useful for detailed movement, dances, or sports, but may be compressed.
  • Avoid 24 fps: Can look choppy on vertical quick-motion content typical of TikTok.

Best Overall Recommendation:

30 fps offers the best balance between visual smoothness, broad compatibility, and consistent playback across all three internet-based platforms.

If you ever feel your work might end up in broadcast, use 29.97 fps, as this is easily interchangeable with 30 fps on all sites.

Notes for working with external professional media companies

The frame rates above aren't just technical curiosities - they have real implications for media professionals:

  • Projects moving between different regions often require frame rate conversions
  • Archive material may need careful handling to maintain proper playback speed
  • Multi-camera productions must ensure all devices are using the same timecode standard
  • International co-productions must agree on which frame rate standard to use
  • Post-production workflows must account for the specific frame rate used throughout

If you are working with any external service or company, always confirm the required frame rate and timecode standard as soon as you start.

What Is Timecode?

In the world of professional media production, timecode serves as the invisible backbone that keeps everything in perfect synchronization.

In its most common form, SMPTE timecode (developed by the Society of Motion Picture and Television Engineers) is a digital code that can be embedded in video camera files, or modulated into an audio signal for recording on magnetic film stock or devices such as 24-track analog tape machines.

Timecode is encoded and displayed as hours:minutes:seconds:frames.

For example, 01:25:47:10 is a timecode address of 1 hour, 25 minutes, 47 seconds and frame 10 into the next second.

This precise labelling system allows every single frame of video and corresponding audio to be uniquely identified, making it possible to synchronise multiple devices and maintain perfect timing throughout video and file production and post-production processes.
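
For non-drop frame rates, the address is simple positional arithmetic. A minimal sketch (a hypothetical helper, with the frame rate passed in):

```python
def frames_to_timecode(total_frames, fps=25):
    """Convert a running frame count to an hours:minutes:seconds:frames label."""
    frames = total_frames % fps
    seconds = (total_frames // fps) % 60
    minutes = (total_frames // (fps * 60)) % 60
    hours = total_frames // (fps * 3600)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"
```

At 25 fps, the address 01:25:47:10 from the example above corresponds to frame number 128,685.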

You’ve seen it many times on the clapper boards used in film and video production, and to this day it plays a very important role in synchronising multi-camera and media shoots.

Like MIDI, it is a technology that has not needed to be extended much beyond its original specification. It simply works, and works well.

Timecode can be generated at any frame rate, but the most common types are 30, 29.97, 25 and 24. Anecdotally, the audio version of timecode became known simply as 'SMPTE', even when carrying frame rates never ratified by the SMPTE organisation.

As audio production technology improved, timecode proved a perfect solution to some emerging problems.

How Tape Synchronizers and SMPTE Worked Together

When recording studios expanded beyond a single 24-track analog machine to setups using two synchronized 24-track machines (creating 48-track systems), the biggest challenge was making sure both tape decks ran in perfect alignment. Even tiny differences in speed or mechanical variations between machines could quickly lead to timing drift, destroying rhythmic precision and musical feel.

The Importance of Timecode

SMPTE timecode, recorded as an audio-like track onto an empty track of each tape machine, acted like a consistent, linear time reference. It provided an absolute, numeric marker that described exactly where each tape was positioned during playback or recording (hours, minutes, seconds, frames).

Role of the Tape Synchronizer

The tape synchronizer was a specialized hardware device tasked with reading SMPTE timecode from both machines. It continuously compared these two signals, measuring even tiny discrepancies in their timing and playback speeds.

If the second machine drifted slightly ahead or behind, the synchronizer precisely adjusted its motor speed—often using a servo-driven mechanism—slowing it down or speeding it up subtly until it matched perfectly with the master machine’s timecode. This adjustment wasn't just momentary; it was continuous, actively monitoring and correcting speed to maintain perfect alignment.

Why Was This Vital for Natural Movement?

Analog tape machines have natural mechanical variations: fluctuations in speed and tension are inevitable due to slight inconsistencies in motors, belts, capstans, pinch rollers, and even tape elasticity. Without active correction, these tiny variations quickly compound, causing noticeable timing shifts—particularly problematic for rhythmic music, where groove and timing are critical.

The synchronizer’s continuous correction allowed the two analog tape machines to behave as a single, perfectly aligned unit. As a result, the musical "feel," groove, and rhythmic integrity were preserved, maintaining the natural movement and flow essential to musical performance.

A Very Famous Early Example of Multi-Tape-Machine Use

The innovative dual tape-machine technique employed by Bruce Swedien during the recording of Michael Jackson's Thriller album was referred to as the Acusonic Recording Process.

This method involved synchronizing multiple 24-track tape machines to achieve a virtually limitless track count, allowing for detailed and high-quality recordings without the degradation of repeated tape playback. The term "Acusonic" is a combination of "accurate" and "sonic," reflecting the precision and sonic clarity achieved through this technique.

Previous techniques required multiple bounces from track to track, losing quality a little on each pass.

Timecode and Mixing desks

In the 1980s and 90s, SMPTE timecode became a vital element for synchronizing automated mixes on high-end recording consoles such as those by SSL (Solid State Logic) and Neve.

Timecode was typically recorded onto one track of the multitrack analog tape as a continuous audio signal, which consoles could decode and use as a positional reference. As the tape machine played back, the console read the incoming SMPTE signal, allowing it to accurately recall and automate complex fader movements, and other mixing parameters the desk might provide at precise moments throughout a track.

This capability was revolutionary, enabling engineers to reliably reproduce intricate mixes with unprecedented consistency, precision, and creativity, ultimately shaping the polished, sophisticated productions that characterized popular music throughout that era.

Integrating MIDI and Modern Production

With the rise of MIDI equipment, the synchronizer became even more vital. MIDI sequencers and drum machines require precise, stable timing references. A device known as a SMPTE-to-MIDI converter translated the stable SMPTE timecode from tape into MIDI Clock and Song Position Pointer information, allowing MIDI instruments to synchronize accurately to the analog machines. Thus, synthesizers, samplers, and drum machines could seamlessly join the analog domain, vastly expanding creative possibilities in studios.

There was, though, an even more accurate system available.

MIDI Time Code

In music production, MIDI Time Code (MTC) adapts SMPTE for MIDI-enabled systems, enabling sequencers, drum machines, and DAWs to sync to tape machines or video feeds.

Unlike traditional MIDI Clock signals, which only transmit tempo and synchronisation pulses, MTC provides absolute positional data within the MIDI data stream, allowing systems to recover from sync errors dynamically.

This capability is vital when integrating DAWs with tape machines or aligning live recordings with pre-programmed sequences.

As always though, nothing is quite as simple as it first appears.

In the early transition from tape to basic DAWs, extracting audio from analog tape presented synchronization challenges. While early DAWs could lock to a start position, they lacked the capability to compensate for the inherent wow and flutter of analog tape machines.

Because DAWs used a fixed sampling frequency, longer audio files would gradually drift out of sync against the original tape master. To address this, increasingly sophisticated synchronisers were developed, capable of tracking fluctuations in timecode signals, some even analyzing individual pulses of the square-wave signal used to generate each timecode address.
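
The scale of the problem is easy to estimate. A toy calculation with illustrative numbers, not figures from any real transfer:

```python
def sync_drift(duration_s, tape_speed_error_pct, sample_rate=48000):
    """Samples of drift between a fixed-clock DAW and off-speed tape."""
    drift_seconds = duration_s * tape_speed_error_pct / 100.0
    return drift_seconds * sample_rate

# A tape running just 0.05% fast drifts 150 ms over a five-minute song:
# 300 s * 0.0005 = 0.15 s, which is 7,200 samples at 48 kHz.
```

A drift of thousands of samples is easily audible as a flam against the original tape master, which is why synchronisers had to track the timecode continuously rather than lock once at the start.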

Why are there two main sample rates - 44.1 and 48 kHz?

While frame rates govern the visual side of audiovisual media, sample rates define the resolution of digital audio. When digital audio and video systems began replacing analog technologies in the 1980s and 1990s, two standards became ubiquitous: 44.1 kHz and 48 kHz.

Although rather annoying and confusing, we are lucky there weren't more. Many early digital audio standards and formats (such as DAT) allowed multiple sample rates within their specifications.

The choice of 44.1 kHz and 48 kHz as standard sample rates was influenced significantly by practical engineering constraints, particularly:

1. Analog Anti-Aliasing Filters

In early digital audio systems, steep analog anti-aliasing filters were needed at the input (ADC) and reconstruction (DAC) stages. The filter's effectiveness and audio transparency depended heavily on the sampling frequency:

  • Higher sampling rates allow gentler filters with fewer artifacts.
  • 44.1 kHz was chosen partly because it provided enough frequency headroom (beyond the audible range of approximately 20 kHz) for realistic filter designs.
  • 48 kHz offered slightly more headroom, further easing the filter requirements. This made it ideal for professional audio, particularly where quality had to match video standards.

2. Jitter in Early DACs

Early digital-to-analog converters suffered from significant jitter issues (tiny timing errors during conversion). These timing inaccuracies could cause distortion, particularly audible at lower sampling frequencies due to tighter timing margins.

  • Selecting 44.1 kHz and later 48 kHz created a balance between minimizing jitter-related artifacts and managing cost and complexity of DAC designs.
  • Higher sampling frequencies (above these standards) were technically beneficial but economically impractical due to cost and complexity at the time.

Why Two Standards (44.1 kHz vs. 48 kHz)?

  • 44.1 kHz became standard for CDs primarily because it neatly aligned with early digital audio recorded onto modified PAL-format video recorders, and provided a practical compromise for consumer-quality audio.
  • 48 kHz was adopted in professional and broadcast audio for video due to easier compatibility with video frame rates and slightly improved fidelity, filter simplicity, and jitter resistance.

Why 48kHz?

The adoption of 48kHz wasn't arbitrary. According to the Nyquist-Shannon sampling theorem, to accurately reproduce an audio signal, you need a sample rate at least twice the highest frequency you wish to capture.

Since human hearing typically extends to about 20kHz, a 48kHz sample rate (giving a theoretical frequency response up to 24kHz) comfortably exceeds this requirement, providing headroom for processing.
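
Frequencies above the Nyquist limit don't simply disappear: they fold back into the audible band, which is exactly why anti-aliasing filters matter. A small illustration (a hypothetical helper for a pure tone):

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency a pure tone appears at after sampling (spectral folding)."""
    nyquist = sample_rate_hz / 2
    folded = tone_hz % sample_rate_hz
    return folded if folded <= nyquist else sample_rate_hz - folded

# A 30 kHz tone sampled at 48 kHz folds back to an audible 18 kHz tone;
# the anti-aliasing filter must remove such content before conversion.
```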

One of the main reasons for sampling well above the 40 kHz minimum implied by the Nyquist-Shannon theorem was the need for practical anti-aliasing filters: the extra headroom above the audible band allows gentler, cleaner filter slopes, as described earlier.

The 48kHz standard offered several advantages specific to film and video production:

  1. Compatibility with frame rates: 48kHz divides evenly by common video frame rates (24, 25, 30), facilitating clean synchronization between audio and video elements.
  2. Conversion to broadcast: When digital production began, material often needed to be converted to analog broadcast formats. The 48kHz standard made these conversions more straightforward.
  3. Quality headroom: While 44.1kHz (the CD audio standard) might be technically sufficient for final delivery, the higher 48kHz rate provided additional quality headroom during production, where audio might undergo multiple processing stages.
  4. Pitch shifting flexibility: For productions where speed adjustments might be necessary (like film transferred to video), the higher sample rate allowed for cleaner pitch shifting without artifacts.
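
Point 1 is simple to check: 48 kHz gives a whole number of audio samples per video frame at every common frame rate, while 44.1 kHz leaves a fractional count at the cinema rate of 24 fps:

```python
def whole_samples_per_frame(sample_rate, fps):
    """True if each video frame spans an exact number of audio samples."""
    return sample_rate % fps == 0

# 48 kHz: 2000, 1920 and 1600 samples per frame at 24, 25 and 30 fps.
# 44.1 kHz fails at 24 fps (1837.5 samples per frame).
```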

Why 44.1kHz was adopted

While 48kHz became the standard sample rate for video and film, consumer audio adopted 44.1kHz as the standard for compact discs, leading to a division between audio-only production (typically at 44.1kHz) and audiovisual production (typically at 48kHz).

This divergence originated from early digital audio experiments and practical considerations. The 44.1kHz sampling rate was specifically chosen due to compatibility with existing video recording equipment used to store digital audio samples in the late 1970s. At that time, digital audio was recorded onto modified video tape recorders, using the video signal structure to store audio data. In PAL video systems, audio engineers stored exactly three audio samples per line across 588 active video lines, and with 25 video frames per second, this neatly resulted in the standard of 44,100 samples per second (3 samples × 588 lines × 25 frames = 44,100).
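
The PAL arithmetic above checks out exactly:

```python
# Digital audio stored on modified PAL video recorders:
samples_per_line = 3
active_lines = 588
frames_per_second = 25

sample_rate = samples_per_line * active_lines * frames_per_second
assert sample_rate == 44100
```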

Interestingly, another key historical anecdote is associated with Sony executive Norio Ohga. As a classical music enthusiast and later president of Sony, Ohga reportedly insisted that the compact disc format should accommodate Beethoven's Ninth Symphony in its entirety. The performance Ohga preferred was approximately 74 minutes long, which became a determining factor in setting the physical dimensions and specifications of the CD, influencing the adoption of the 44.1kHz sample rate to efficiently fit this duration on the compact disc format.

Modern Implications

Today, with purely digital workflows dominating, the technical reasons behind these sample rate choices have become less critical. However, 48kHz remains the standard for video production for practical reasons:

  • Legacy compatibility with existing material
  • Established workflows that expect this standard
  • The marginal storage and processing cost of the higher sample rate is negligible with modern systems
  • Higher sample rates provide more flexibility for audio processing, time stretching, and pitch shifting

Higher sample rates like 96kHz and even 192kHz have become options in high-end production, particularly where extensive sound design and manipulation are expected. However, for most professional video and film production, 48kHz remains the sweet spot balancing quality, compatibility, and efficiency.
