MLB Statcast and the Computer Vision Behind Baseball's Automated Strike Zone (2026)

2026-06-22
Train Matricx Team

Baseball was one of the first major sports to commit fully to optical tracking, and Statcast is the reason every broadcast now shows pitch velocity, spin rate and exit velocity within moments of each pitch or batted ball. What gets less attention is how much harder baseball's tracking problem is than it looks on screen, and why MLB has spent years testing an automated strike zone system that still isn't fully deployed.

This guide covers how Statcast works, why pitch tracking and strike zone calls are uniquely difficult computer vision problems, and what the newer frontier, bat tracking, requires from training data.

Figure: Baseball computer vision visualization showing real-time 3D strike zone box tracking, pitch trajectory arc tracing, and player/catcher bounding overlays.


What is baseball computer vision?

Baseball computer vision is the use of AI and optical tracking to interpret game footage: tracking pitch velocity and movement, ball-bat contact, fielder and baserunner positioning, and increasingly bat speed and swing mechanics. It converts raw video into the structured data that powers Statcast, broadcast graphics and an emerging automated strike zone system.

MLB's Statcast system, introduced league-wide in 2015 and upgraded to a Hawk-Eye optical camera array in 2020, is the most visible example. The same company behind tennis line-calling and several football tracking systems also powers baseball's core tracking infrastructure, which tells you something about how transferable, and how specialized, this technology actually is.


Why baseball is a distinct computer vision problem

The ball is small, fast and moves unpredictably. A pitched fastball can exceed 100 mph and can break by a foot or more between release and the plate, depending on spin rate and axis. The ball must be tracked continuously through roughly 400 to 600 milliseconds of flight, a far shorter window than most ball-tracking problems in other sports.
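To make that flight window concrete, here is a rough sketch of how many frames a camera actually gets per pitch. It assumes a constant average speed over the flight (real pitches slow down from drag, so the average is below the release velocity), a release point about 54.5 ft from the plate, and a 100 fps camera. All three values are illustrative assumptions, not Statcast specifications.

```python
MPH_TO_FT_PER_S = 5280 / 3600  # one mile per hour expressed in feet per second


def flight_time_ms(avg_speed_mph: float, release_distance_ft: float = 54.5) -> float:
    """Approximate flight time, treating the speed as constant over the flight."""
    return release_distance_ft / (avg_speed_mph * MPH_TO_FT_PER_S) * 1000


def frames_in_flight(avg_speed_mph: float, fps: float,
                     release_distance_ft: float = 54.5) -> int:
    """Whole frames a camera running at `fps` captures while the ball is in the air."""
    return int(flight_time_ms(avg_speed_mph, release_distance_ft) / 1000 * fps)


# Assumed average in-flight speeds for a fastball, a slider and a curveball.
for speed in (95, 85, 75):
    print(f"{speed} mph: {flight_time_ms(speed):.0f} ms, "
          f"{frames_in_flight(speed, fps=100)} frames at 100 fps")
```

Under these assumptions a single pitch yields only a few dozen usable frames, which is why every blurred or occluded frame matters.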

The strike zone isn't fixed, it's defined per batter. Unlike a tennis court line or a football pitch marking, the strike zone is a three-dimensional volume whose top and bottom boundaries are calculated from each individual batter's height and stance. This means an automated strike zone system has two separate measurement problems stacked on top of each other: track the ball's position as it crosses the plate, and independently determine the correct zone boundaries for whichever batter is at the plate in that exact at-bat.

Bat-ball contact happens in a fraction of a frame. The moment of contact between bat and ball lasts roughly one millisecond. Capturing bat speed, attack angle and contact point requires either extremely high frame rate capture or inference techniques that reconstruct what happened in the gap between frames, since standard frame rates cannot resolve the contact event directly.
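A quick calculation shows why the contact event falls between frames. The sketch below compares the gap between frames with the roughly one millisecond contact duration quoted above, and estimates how far the bat travels in that gap. The frame rates and the 75 mph bat speed are illustrative assumptions.

```python
MPH_TO_M_PER_S = 0.44704   # one mile per hour in metres per second
CONTACT_MS = 1.0           # approximate bat-ball contact duration from the text


def frame_interval_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def travel_between_frames_cm(speed_mph: float, fps: float) -> float:
    """Distance an object moving at speed_mph covers between consecutive frames."""
    return speed_mph * MPH_TO_M_PER_S / fps * 100


for fps in (60, 300, 1000):
    gap = frame_interval_ms(fps)
    print(f"{fps} fps: {gap:.2f} ms between frames "
          f"({gap / CONTACT_MS:.1f}x the contact duration), "
          f"bat moves {travel_between_frames_cm(75, fps):.1f} cm")
```

Even at 300 fps the gap between frames is several times longer than the contact itself, so the contact point has to be reconstructed from the frames on either side.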

Defensive positioning spans a much larger area than most sports. Nine fielders are spread across a field roughly 2.5 acres in fair territory alone, far larger than a basketball court or tennis court, and players are rarely close enough to each other to create occlusion in the way they do in football or basketball. The tracking challenge here is less about occlusion and more about maintaining consistent identity and position data across a wide, sometimes poorly lit outfield.

Lighting and background conditions vary enormously. Day games, twilight games and night games under stadium lights each create different ball visibility conditions. A white ball against a blue sky, a green outfield, a brown infield and a crowd background behind home plate are four completely different detection problems that the same model has to handle reliably, often within the same game.


The four applications of baseball computer vision

| Application | What it produces | Primary buyers |
| --- | --- | --- |
| Pitch tracking | Velocity, spin rate, spin axis, break, release point, location | MLB teams, broadcast, player development |
| Automated strike zone | Ball-strike calls based on tracked location vs. batter-specific zone | MLB, minor leagues, officiating technology |
| Bat tracking | Bat speed, swing length, attack angle, time to contact | Player development, broadcast, scouting |
| Defensive and baserunner tracking | Fielder positioning, sprint speed, catch probability, shift data | Team analytics, scouting, broadcast |

Each of these draws on a different part of the tracking pipeline. Pitch tracking and the automated strike zone share most of their underlying data but solve different problems: one measures the ball, the other compares that measurement against a calculated zone. Bat tracking is the newest and least mature of the four, added to Statcast's public metrics only in 2024.


Pitch tracking: what Statcast actually measures

A single pitch generates a dense set of measurements: release velocity, spin rate, spin axis, induced vertical and horizontal break, extension (how far in front of the rubber the pitcher releases the ball), release point, and location at the front of home plate to the nearest fraction of an inch.

Figure: Pitch tracking system tracing multi-colored speed segments, spin axis parameters, and release point mound coordinates.

Spin rate and axis matter disproportionately. Two pitches thrown at identical velocity and release point can break in completely different directions depending on how the ball is spinning, which is why modern pitch tracking systems put as much emphasis on rotational measurement as on positional tracking. Detecting spin from optical footage alone requires resolving subtle seam-orientation changes across consecutive frames, a much finer-grained visual problem than tracking the ball's gross position.
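The relationship between frame rate and measurable spin can be sketched with simple arithmetic. The example below assumes an idealized case where the seam rotation between two consecutive frames can be read as a single angle. That is a simplification, and the function names are hypothetical. It also shows the aliasing limit: once the ball turns more than half a revolution between frames, the observed rotation becomes ambiguous.

```python
def rotation_per_frame_deg(spin_rpm: float, fps: float) -> float:
    """Degrees the ball rotates between two consecutive frames."""
    return spin_rpm / 60.0 / fps * 360.0


def spin_rpm_from_rotation(delta_deg: float, fps: float) -> float:
    """Spin rate implied by an observed rotation of delta_deg between frames."""
    return delta_deg / 360.0 * fps * 60.0


def is_unambiguous(spin_rpm: float, fps: float) -> bool:
    """True when the ball turns less than half a revolution per frame."""
    return rotation_per_frame_deg(spin_rpm, fps) < 180.0


# Illustrative spin rate and frame rates, not measured values.
for fps in (60, 100, 300):
    step = rotation_per_frame_deg(2400, fps)
    print(f"2400 rpm at {fps} fps: {step:.0f} degrees per frame, "
          f"unambiguous={is_unambiguous(2400, fps)}")
```

Under these assumptions, a 2,400 rpm pitch rotates 48 degrees per frame at 300 fps but 240 degrees at 60 fps, where the direction of rotation can no longer be read from two frames alone.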

Training data for pitch tracking needs frame-by-frame ball position through the full flight path, including the high-speed blur frames where the ball doesn't appear as a clean circular shape, release point frames annotated precisely relative to the pitching motion, and, for spin-focused models, sequences where seam rotation is visible and trackable across consecutive frames.


The automated strike zone: why it's harder than it looks

MLB has tested automated ball-strike technology since 2019, first in the independent Atlantic League and later in the affiliated minor leagues, and brought a challenge-based version to major league spring training in 2025. Even so, full automation of every call has moved more slowly than the equivalent officiating technology in tennis or football. The reason comes down to the zone itself.

Figure: Automated Ball-Strike (ABS) system comparing batter skeletal joint coordinates with ball entry position over the plate.

Two measurement problems, not one

A tennis line call or a football offside call compares an object's position to a fixed reference: the court line, or the defender's position at a specific frame. Baseball's strike zone has no fixed reference. The rulebook defines it from body landmarks, with the top at the midpoint between the shoulders and the top of the uniform pants, and the bottom at the hollow beneath the kneecap. An automated system has to turn that definition into numbers, either by locating those landmarks on the batter or by approximating them as fixed percentages of each batter's measured height. Either way, the boundaries have to be established accurately for every batter before the ball-strike call can even begin.

This means an automated strike zone system is really two computer vision systems working together: one that tracks the batter's body to establish the correct zone boundaries for that specific at-bat, and one that tracks the pitched ball's position as it crosses the front of the plate. An error in either system, a slightly wrong zone calculation or a slightly mistracked ball position, produces a wrong call. And unlike a tennis ball that has already landed beside a painted line, there is no physical mark or fixed boundary to check a disputed result against.
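The two-stage structure can be sketched in a few lines. This is a minimal illustration, not MLB's implementation: it assumes the height-percentage approach, uses placeholder fractions for the top and bottom of the zone, and assumes a pitch is a strike when any part of the ball touches the zone. The class and function names are hypothetical.

```python
from dataclasses import dataclass

PLATE_WIDTH_IN = 17.0   # width of home plate in inches
BALL_RADIUS_IN = 1.45   # approximate radius of a baseball in inches

# Placeholder fractions of standing height, chosen for illustration only.
ZONE_TOP_FRACTION = 0.535
ZONE_BOTTOM_FRACTION = 0.27


@dataclass(frozen=True)
class Zone:
    top_in: float
    bottom_in: float
    half_width_in: float = PLATE_WIDTH_IN / 2


def zone_for_batter(height_in: float) -> Zone:
    """Stage 1: derive the zone boundaries from the batter's measured height."""
    return Zone(top_in=height_in * ZONE_TOP_FRACTION,
                bottom_in=height_in * ZONE_BOTTOM_FRACTION)


def call_pitch(zone: Zone, x_in: float, z_in: float) -> str:
    """Stage 2: compare the tracked ball centre against the zone.

    x_in is the horizontal offset from the centre of the plate and z_in the
    height above the ground, both in inches, at the moment of plate crossing.
    """
    inside_x = abs(x_in) <= zone.half_width_in + BALL_RADIUS_IN
    inside_z = (zone.bottom_in - BALL_RADIUS_IN
                <= z_in
                <= zone.top_in + BALL_RADIUS_IN)
    return "strike" if inside_x and inside_z else "ball"


# The same pitch location produces different calls for different batters.
for height in (72.0, 66.0):
    print(height, call_pitch(zone_for_batter(height), x_in=0.0, z_in=37.5))
```

The example makes the stacked error sources visible: a one-inch mistake in either the batter's zone or the ball's tracked height is enough to flip a borderline call.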

The challenge-based model MLB has tested

Rather than fully automating every call, MLB's testing has focused heavily on a challenge system: human umpires call the game as normal, and each team gets a limited number of challenges per game to request an automated review of a specific pitch. This hybrid approach mirrors football's semi-automated offside technology, covered in detail in our breakdown of how VAR AI actually works, where the technology generates a precise measurement but a human-facing process still governs how and when it's applied.
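The bookkeeping behind a challenge model is simple to sketch. The example below is hypothetical: the allotment of two challenges and the rule that a team keeps its challenge when the call is overturned are assumptions made for illustration, not details stated above.

```python
from dataclasses import dataclass


@dataclass
class ChallengeTracker:
    """Tracks one team's remaining challenges (hypothetical rules)."""
    remaining: int = 2  # assumed per-game allotment

    def challenge(self, umpire_call: str, tracked_call: str) -> str:
        """Review one pitch and return the call that stands."""
        if self.remaining <= 0:
            raise RuntimeError("no challenges remaining")
        overturned = umpire_call != tracked_call
        if not overturned:
            # Assumption: only an unsuccessful challenge is used up.
            self.remaining -= 1
        return tracked_call


team = ChallengeTracker()
print(team.challenge("strike", "ball"), team.remaining)      # overturned
print(team.challenge("strike", "strike"), team.remaining)    # upheld
```

The tracked call always decides the reviewed pitch, but the human process around it determines which pitches are reviewed at all.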

Why precision standards are unusually strict here

Ball-strike calls are needed on every pitch the batter doesn't swing at, well over a hundred times per game across a full MLB season, which means the system has little room for the kind of occasional ambiguous call that other sports can absorb. A 1% error rate that would be a minor footnote in a season-long tracking dataset becomes thousands of disputed calls when applied at this volume.
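The scale argument is easy to check with back-of-the-envelope arithmetic. The number of regular-season games follows from 30 teams playing 162 games each. The pitches per game and the share of pitches that need a ball-strike call are rough assumptions for illustration.

```python
GAMES_PER_SEASON = 30 * 162 // 2   # 2,430 regular-season games
PITCHES_PER_GAME = 290             # assumed total for both teams
CALLED_SHARE = 0.5                 # assumed share of pitches not swung at


def expected_wrong_calls(error_rate: float,
                         games: int = GAMES_PER_SEASON,
                         pitches_per_game: int = PITCHES_PER_GAME,
                         called_share: float = CALLED_SHARE) -> float:
    """Expected number of wrong ball-strike calls over the given games."""
    return games * pitches_per_game * called_share * error_rate


season = expected_wrong_calls(0.01)
per_game = expected_wrong_calls(0.01, games=1)
print(f"1% error rate: about {season:,.0f} wrong calls per season, "
      f"{per_game:.2f} per game")
```

Halving or doubling either assumption still leaves the season total in the thousands, which is the point: a small error rate is highly visible at this volume.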


Bat tracking: the newest frontier

Bat tracking, added to Statcast's public-facing metrics in 2024, measures bat speed at the point of contact, swing length, attack angle and time from swing initiation to contact. It is a meaningfully different tracking problem from pitch tracking, because the object being tracked, the bat, is held by a moving human body rather than travelling freely through the air.

Figure: Bat tracking system analyzing swing plane arc velocity vectors, hands alignment keypoints, and swing metrics HUD values.

Why this is harder than it sounds

The bat itself is thin, fast-moving and partially obscured by the batter's hands and body throughout most of the swing. Unlike the ball, which keeps the same simple round shape from every angle, the bat's visual signature changes constantly as it rotates through the swing plane. The most analytically important moment, the millisecond of contact, is also the hardest moment to capture clearly because the bat and ball overlap completely.

Training data for bat tracking needs dense annotation of bat position and orientation across the full swing sequence, from load through contact to follow-through, captured across a wide range of batter heights, stances and swing styles, since unlike a pitched ball's flight, a baseball swing has significant legitimate individual variation that a model needs to learn to accommodate rather than treat as error.
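Dense frame-by-frame annotation matters because swing metrics are derived from differences between neighbouring frames. The sketch below estimates bat speed and attack angle from two annotated positions of the bat's sweet spot on either side of contact. It is a simplified two-dimensional illustration with hypothetical names and made-up coordinates. Production systems work in three dimensions and smooth over many frames.

```python
import math

M_PER_S_TO_MPH = 1 / 0.44704  # metres per second to miles per hour


def bat_speed_and_attack_angle(p_prev, p_next, dt_s):
    """Finite-difference estimate from two sweet-spot positions.

    p_prev and p_next are (forward_m, up_m) positions in the vertical plane
    of the swing, dt_s seconds apart. Returns (speed_mph, attack_angle_deg),
    where a positive angle means the bat is travelling upward.
    """
    dx = p_next[0] - p_prev[0]
    dz = p_next[1] - p_prev[1]
    speed_mph = math.hypot(dx, dz) / dt_s * M_PER_S_TO_MPH
    attack_angle_deg = math.degrees(math.atan2(dz, dx))
    return speed_mph, attack_angle_deg


# Two made-up annotations, one frame apart at an assumed 300 fps.
speed, angle = bat_speed_and_attack_angle((0.00, 0.00), (0.11, 0.02), 1 / 300)
print(f"bat speed {speed:.1f} mph, attack angle {angle:.1f} degrees")
```

Because both outputs depend on a difference of two labels, a centimetre of annotation error in either frame shifts the estimated speed and angle noticeably, which is why label precision around contact is so important.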


Defensive positioning and the shift data story

Statcast's fielder and baserunner tracking produces sprint speed, reaction time, route efficiency to a batted ball, and catch probability, the likelihood an average fielder would have made a given catch based on hang time and distance covered. This data became central to one of the more visible rules stories in recent MLB history: defensive shift data showed teams increasingly positioning fielders in statistically optimized, often dramatically asymmetric formations, which led directly to MLB introducing shift restrictions starting in the 2023 season.

That's a useful example of how tracking data doesn't just describe the game, it can change the rules of the game once patterns become visible at scale. Training data for defensive tracking needs persistent fielder identity across long, sparse stretches of the game, since most of a fielder's tracked time involves no ball in play at all, punctuated by the relatively rare, high-value sequences where a ball is hit into their zone and needs frame-accurate route and catch outcome annotation.
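Route annotation feeds metrics like the ones described above. As a rough sketch, one common way to express route efficiency is the straight-line distance from start to finish divided by the distance the fielder actually ran. This is an assumption for illustration, not necessarily Statcast's exact formula. The second helper shows the hang time and distance relationship behind catch difficulty. The coordinates are made up and the function names are hypothetical.

```python
import math


def path_length(points):
    """Total distance along a sequence of (x, y) positions, in feet."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Straight-line distance divided by distance actually covered."""
    travelled = path_length(points)
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def required_speed_ft_s(distance_ft, hang_time_s):
    """Average speed a fielder needs to reach the ball before it lands."""
    return distance_ft / hang_time_s


direct = [(0, 0), (30, 40), (60, 80)]   # straight line to the ball
indirect = [(0, 0), (30, 0), (30, 40)]  # a dog-leg route
print(route_efficiency(direct), round(route_efficiency(indirect), 3))
print(required_speed_ft_s(60, 4.0), "ft/s needed")
```

Both metrics depend on the tracked start and end points being tied to the right fielder, which is why persistent identity across long idle stretches matters as much as the batted-ball frames themselves.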


Who uses baseball computer vision?

MLB teams use the full range of Statcast data for scouting, player development, in-game defensive positioning decisions and pitcher evaluation. Pitch-level spin and movement data in particular has become central to how pitching development is now taught at the professional and increasingly college level.

Broadcast networks display pitch velocity, spin rate, exit velocity and launch angle within seconds of each play, now a standard expectation for baseball broadcasts rather than a novelty.

Player development and biomechanics platforms use bat tracking and pitching mechanics data, similar in structure to the swing analysis systems covered in our golf computer vision guide, to give hitters and pitchers detailed technical feedback outside of game action.

Sports betting and fantasy platforms consume real-time pitch and at-bat outcome data for in-play markets and player projection models.

Minor league and amateur baseball organizations are increasingly adopting scaled-down versions of the same tracking technology, extending the data advantage that used to be exclusive to MLB teams down through the player development pipeline.


What baseball computer vision training data requires

For pitch tracking:

  • Frame-by-frame ball position through the full pitch flight, including high-speed blur frames
  • Release point frames annotated relative to the pitcher's motion
  • Spin axis and seam orientation labels across consecutive frames where visible
  • Location labels at the front of home plate to sub-inch precision

For the automated strike zone:

  • Batter body keypoints establishing top and bottom zone reference points per at-bat
  • Ball position at the exact frame it crosses the front of the plate
  • Zone boundary calculations linked to each individual batter's recorded stance

For bat tracking:

  • Bat position and orientation annotated across the full swing sequence
  • Contact-frame labels despite bat-ball overlap at the moment of contact
  • Representation across a wide range of batter heights, stances and swing styles

For defensive and baserunner tracking:

  • Persistent fielder identity across long sequences with sparse ball-in-play events
  • Route and catch outcome annotation for batted balls
  • Baserunner position and sprint data linked to specific game situations
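Requirements like these are easier to enforce when each annotation has an explicit structure that can be validated. The sketch below shows one hypothetical record layout for pitch tracking labels, with a check that finds flight frames that were never labelled. The field names are illustrative assumptions, not a published schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BallFrame:
    frame_index: int
    x_px: float
    y_px: float
    blurred: bool = False  # marks high-speed blur frames explicitly


@dataclass
class PitchAnnotation:
    pitch_id: str
    release_frame: int
    plate_frame: int
    ball_frames: List[BallFrame] = field(default_factory=list)

    def missing_frames(self) -> List[int]:
        """Frames between release and plate crossing with no ball label."""
        labelled = {f.frame_index for f in self.ball_frames}
        return [i for i in range(self.release_frame, self.plate_frame + 1)
                if i not in labelled]


pitch = PitchAnnotation(
    pitch_id="example-001",
    release_frame=10,
    plate_frame=14,
    ball_frames=[
        BallFrame(10, 412.0, 220.5),
        BallFrame(11, 430.2, 224.1),
        BallFrame(13, 468.9, 233.0, blurred=True),
        BallFrame(14, 489.4, 238.7),
    ],
)
print(pitch.missing_frames())
```

A completeness check of this kind catches the most common gap in flight-path labels, a skipped blur frame, before the data reaches training.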

Frequently asked questions

What is MLB Statcast?

Statcast is MLB's optical and tracking data system, introduced league-wide in 2015 and upgraded to a Hawk-Eye camera array in 2020, that captures pitch velocity, spin rate, exit velocity, fielder positioning and other performance metrics from every MLB game. It is the data source behind most modern baseball broadcast statistics and team analytics platforms.

How does pitch tracking work in baseball?

Pitch tracking uses camera systems installed in MLB ballparks to follow the ball's position continuously from release to home plate, typically over 400 to 600 milliseconds of flight. The system calculates velocity, spin rate, spin axis, movement and final location by analysing the ball's position and rotational characteristics across that flight window.

What is the automated strike zone in baseball?

The automated strike zone, sometimes called the ABS (Automated Ball-Strike) system, uses tracking technology to determine whether a pitch crossed through the strike zone, which is set separately for each batter, for example as a percentage of the batter's height, rather than as a single fixed boundary. MLB has tested both fully automated and challenge-based versions of this system in the minor leagues and spring training.

Why is automating ball-strike calls harder than other sports officiating technology?

Because the strike zone has no fixed physical reference. Unlike a tennis court line or a football offside marker, the zone boundaries must be calculated separately for every batter based on their height and stance before the ball's position can even be compared against them. This makes it effectively two computer vision problems stacked together, rather than one.

What is bat tracking in baseball?

Bat tracking measures bat speed at contact, swing length, attack angle and time from swing initiation to contact, using camera tracking of the bat itself throughout the swing. Added to MLB's Statcast public metrics in 2024, it is a newer and less mature tracking application than pitch tracking, complicated by the fact that the bat is partially obscured by the batter's body and overlaps completely with the ball at the moment of contact.

How accurate is Statcast pitch tracking?

Statcast's camera-based system is generally considered highly accurate for velocity, location and basic movement metrics, which is why it has become the standard data source across MLB broadcasts and team analytics. Precision requirements increase substantially for applications like the automated strike zone, where the system needs to be accurate enough that a calculated ball-strike decision holds up to the same scrutiny as a human umpire's call, applied across hundreds of pitches per game.

Why did MLB introduce defensive shift restrictions?

Statcast's fielder positioning data made it possible to see, at scale, how dramatically teams were shifting defensive alignments to exploit individual hitters' batted-ball tendencies. As shift usage increased and was shown to suppress offensive output, MLB introduced shift restrictions starting in the 2023 season, a clear example of tracking data influencing the actual rules of the sport, not just how it's analysed.

What training data does a baseball computer vision model need?

It depends on the application. Pitch tracking models need frame-accurate ball position data through high-speed flight, including blur frames. Automated strike zone models need batter keypoint data to establish zone boundaries, combined with precise plate-crossing ball position. Bat tracking models need dense swing-sequence annotation across varied batter styles. Defensive tracking models need persistent fielder identity across long, mostly inactive sequences punctuated by high-value batted-ball events.


The takeaway

Baseball was an early adopter of optical tracking, and Statcast remains one of the most mature sports computer vision systems in operation. But the sport's hardest problem, the automated strike zone, exposes a challenge that doesn't exist in most other sports: the thing being measured against isn't fixed, so it has to be calculated fresh for every batter before the actual call can be made. Bat tracking is the next frontier, and it's still working through many of the same hard problems pitch tracking solved years ago.

If you are building baseball computer vision models and need expert-annotated training data for pitch tracking, strike zone boundaries, bat tracking or defensive positioning, see how Train Matricx works or review annotated dataset results in our case studies. We annotate a free pilot clip so you can evaluate quality before committing to any volume.
