Semi-automated offside technology has been used at the World Cup, the Champions League and major domestic leagues since 2022. It replaced the manual VAR process of an official drawing lines across a paused broadcast frame with a system that calculates offside positions automatically using computer vision and limb-tracking data, delivering decisions in seconds rather than minutes.
But "semi-automated" is doing real work in that name. The system does not make the final call; it generates the data a video assistant referee uses to confirm a decision. Understanding how the underlying computer vision actually works explains both why it has sped up officiating and why it still generates controversy.
This guide covers how semi-automated offside technology works, what makes offside detection a uniquely hard computer vision problem, and what training data underlies a system that gets used on the biggest stage in sport.
[Figure: 3D limb-tracking player models, offside line visualization, and camera triangulation overlays on a football pitch.]
What is semi-automated offside technology?
Semi-automated offside technology (SAOT) is a computer vision system that uses tracking cameras to calculate the precise position of every player's limbs relative to the ball and the last defender at the moment the ball is played, generating an automated offside determination that a human official reviews and confirms before the decision is given.
The system was developed initially by FIFA in partnership with tracking technology providers. It was used at the 2022 World Cup and in multiple Champions League seasons, and as of 2026 it is standard in the Premier League and other major leagues. It does not replace the offside rule or the assistant referee; it replaces the manual frame-by-frame video review process with an automated calculation.
Why offside is uniquely hard to automate
The offside line is defined by body parts, not the whole body. A player in the opponents' half is in an offside position if any part of their head, body or feet (hands and arms do not count) is nearer to the opponents' goal line than both the ball and the second-last opponent when the ball is played. The second-last opponent is usually described as the last defender, because the goalkeeper is normally the deepest opponent. This means the system needs to track not just where a player is, but the exact position of their shoulders, knees, feet and torso individually, because the offside-determining body part might be a knee or a shoulder rather than the foot closest to goal.
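As a minimal sketch of the body-part rule, the snippet below picks the most advanced keypoint while ignoring hands and arms. The keypoint names, the coordinate convention (x in metres along the pitch length) and the function name are assumptions made for illustration, not the skeleton used by any deployed system.

```python
# Hypothetical keypoint names for arm and hand points, which cannot put a player offside.
EXCLUDED_PARTS = {
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hand", "right_hand",
}


def most_advanced_part(keypoints, attack_direction=1):
    """Return (name, x) of the offside-relevant keypoint nearest the opponents' goal line.

    keypoints: dict mapping keypoint name -> (x, y, z) pitch coordinates in metres.
    attack_direction: +1 if the player attacks towards increasing x, -1 otherwise.
    """
    relevant = {n: p for n, p in keypoints.items() if n not in EXCLUDED_PARTS}
    name = max(relevant, key=lambda n: attack_direction * relevant[n][0])
    return name, relevant[name][0]


attacker = {
    "left_foot": (80.10, 30.0, 0.0),
    "right_knee": (80.42, 30.0, 0.5),
    "right_shoulder": (80.35, 30.0, 1.4),
    "right_hand": (80.90, 30.0, 1.2),  # furthest forward, but does not count
}
print(most_advanced_part(attacker))
```

Here the outstretched hand is the furthest point forward, but the knee is the point that would decide the call.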
The exact moment of "when the ball is played" must be identified precisely. Offside is assessed at the instant the ball leaves the passer's foot (or head, or any body part used to play it), not when the receiving player makes their run. Identifying this exact frame — sometimes within a fraction of a second of contact — is itself a detection problem, separate from the positional calculation. An error of even one or two frames in identifying the pass moment can change a marginal offside decision.
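To see why a frame or two matters, a rough calculation helps. The sketch below assumes a 50 frames-per-second tracking rate and illustrative player speeds; neither figure comes from a specific system, and the function name is hypothetical.

```python
def margin_shift_m(attacker_speed, defender_speed, frame_error, fps):
    """Change in the attacker-defender margin caused by picking the wrong frame.

    Speeds are signed velocities in m/s along the pitch length
    (positive = towards the goal being attacked).
    """
    closing_speed = attacker_speed - defender_speed
    return closing_speed * frame_error / fps


# Attacker sprinting forward at 7 m/s while the defender steps up at 3 m/s.
# Both speeds and the 50 fps rate are assumptions for illustration.
one_frame = margin_shift_m(7.0, -3.0, frame_error=1, fps=50.0)
print(f"{one_frame * 100:.0f} cm of margin per frame of timing error")
```

Under these assumptions a single frame of timing error moves the calculated margin by about 20 cm, which is larger than many of the margins that decide real calls.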
Limb position must be tracked in 3D, not just 2D image coordinates. A player's foot might appear to be in an offside position in one camera's 2D view but be onside when the position is correctly triangulated in three-dimensional pitch coordinates, because of perspective distortion from the camera angle. The system needs multiple calibrated cameras tracking limb positions and triangulating them into accurate pitch coordinates, similar in principle to how tennis line-calling triangulates ball position but applied to multiple body points across multiple players simultaneously.
Marginal calls require centimetre-level precision. Many real offside calls in professional football are decided by a matter of centimetres. The system's camera calibration, limb detection accuracy and triangulation precision all need to be accurate enough that the final calculated margin is meaningful rather than within the system's own error tolerance. This is one of the most precision-sensitive applications of computer vision in any sport.
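One way to make "within the system's own error tolerance" concrete is to combine the individual error sources and compare the result with the calculated margin. The sketch below assumes the error sources are independent (so they add in quadrature) and uses made-up error figures; the three-way label is illustrative and is not how any deployed system reports decisions.

```python
import math


def combined_error_m(*component_errors_m):
    """Combine independent error sources in quadrature (root sum of squares)."""
    return math.sqrt(sum(e * e for e in component_errors_m))


def classify_margin(margin_m, tolerance_m):
    """Label a signed margin: positive means the attacker is ahead of the line."""
    if abs(margin_m) <= tolerance_m:
        return "within tolerance"
    return "offside" if margin_m > 0 else "onside"


# Hypothetical error budget: calibration 1 cm, keypoint detection 2 cm, timing 2 cm.
tolerance = combined_error_m(0.01, 0.02, 0.02)
print(round(tolerance, 3), classify_margin(0.02, tolerance))
```

With this hypothetical budget the combined tolerance is 3 cm, so a calculated 2 cm margin would not be distinguishable from measurement noise.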
Defensive line shape changes constantly and rapidly. The reference point commonly called "the last defender" is, under the Laws of the Game, the second-last opponent: normally the deepest outfield defender, because the goalkeeper is usually closest to the goal line. Identifying that player requires ranking the defending team by distance from their own goal line at the exact moment of the pass, while defensive lines move, players jump for headers, defenders fall, and goalkeepers come off their line. The system must use the relevant defender's position at the critical frame, not an average or a slightly earlier or later frame.
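Under the Laws of the Game the reference player is the second-last opponent, which is the last outfield defender whenever the goalkeeper is deepest. A minimal sketch of that ranking at a single frame follows; the function name and the coordinate convention (x in metres, goal line at x = 105) are assumptions for illustration.

```python
def second_last_opponent_x(defender_xs, goal_line_x):
    """Return the x-coordinate of the second-last opponent.

    defender_xs: for every defending player including the goalkeeper, the
        x-coordinate of the body part nearest their own goal line.
    goal_line_x: x-coordinate of the defending team's goal line.
    """
    if len(defender_xs) < 2:
        raise ValueError("need at least two defending players")
    ranked = sorted(defender_xs, key=lambda x: abs(goal_line_x - x))
    return ranked[1]


# Usual case: goalkeeper (104.0) is deepest, so the deepest outfield player sets the line.
print(second_last_opponent_x([104.0, 88.2, 90.5, 86.0], goal_line_x=105.0))

# Goalkeeper caught upfield (70.0): the line is set by the second-deepest outfield player.
print(second_last_opponent_x([70.0, 88.2, 90.5], goal_line_x=105.0))
```

Ranking all defending players, rather than assuming the goalkeeper is always last, is what keeps the calculation correct when the goalkeeper is out of position.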
How the computer vision pipeline actually works
[Figure: VAR computer vision pipeline showing a 3D limb-tracking skeletal mesh (29 joint points), vertical offside plane intersections, and real-time camera calibration metrics.]
Multi-camera limb tracking
Semi-automated offside systems use a network of tracking cameras installed around the stadium — typically more cameras than standard broadcast coverage, positioned specifically to maintain limb visibility across the full pitch. Each camera feeds into a computer vision model that detects player positions and specific limb points (commonly more than 20 points per player, similar to a full skeletal tracking model) for every player on the pitch, continuously, in real time.
Triangulation into pitch coordinates
Detections from each camera, in 2D image coordinates, are triangulated into a single 3D position on the pitch using camera calibration data established before the match. This step converts "where does this look like it is from this camera's angle" into "where is this point on the actual playing surface" — removing the perspective distortion that would otherwise make a 2D-only system unreliable for marginal calls.
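One way to picture this step is as intersecting viewing rays. The sketch below assumes calibration has already turned each camera's 2D detection into a ray (camera position plus direction) in pitch coordinates, and finds the 3D point closest to all rays in the least-squares sense. The function names and camera positions are hypothetical, and production systems may use a different formulation.

```python
import math


def _normalise(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def triangulate_rays(rays):
    """Least-squares intersection of viewing rays.

    rays: list of (origin, direction) pairs, each a 3-tuple in pitch coordinates.
    Solves A p = b with A = sum(I - d d^T) and b = sum((I - d d^T) o).
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for origin, direction in rays:
        d = _normalise(direction)
        for i in range(3):
            for j in range(3):
                m_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += m_ij
                b[i] += m_ij * origin[j]
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("need at least two non-parallel rays")
    point = []
    for k in range(3):  # Cramer's rule: replace column k of A with b
        Ak = [row[:] for row in A]
        for i in range(3):
            Ak[i][k] = b[i]
        point.append(_det3(Ak) / det)
    return tuple(point)


# Three hypothetical cameras all observing a knee at (80.0, 30.0, 0.5).
target = (80.0, 30.0, 0.5)
cameras = [(0.0, -20.0, 15.0), (105.0, -20.0, 15.0), (52.5, 90.0, 20.0)]
rays = [(c, tuple(t - o for t, o in zip(target, c))) for c in cameras]
print(triangulate_rays(rays))
```

With noisy detections the rays no longer meet at one point, and the least-squares solution gives the position that best agrees with every camera.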
Pass-moment detection
A separate detection process identifies the exact frame the ball is played — typically using ball tracking data combined with detecting the contact moment between the passer's body part and the ball. This is the same fundamental challenge as ball contact detection in cricket or tennis: a fast, brief contact event that must be pinpointed to a specific frame.
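A simple way to illustrate contact detection is to look for the frame where the ball's velocity changes most abruptly. The sketch below works on tracked ball positions only; it is an assumption-laden simplification (real systems may also use limb proximity or sensor data), and the function name and sample trajectory are hypothetical.

```python
import math


def detect_contact_frame(ball_positions, fps):
    """Return the index of the frame where the ball's velocity changes most.

    ball_positions: list of (x, y) pitch coordinates in metres, one per frame.
    fps: tracking frame rate, used to convert per-frame steps to m/s.
    Returns None if fewer than three positions are supplied.
    """
    velocities = [
        tuple((b - a) * fps for a, b in zip(p0, p1))
        for p0, p1 in zip(ball_positions, ball_positions[1:])
    ]
    best_frame, best_change = None, 0.0
    for i in range(1, len(velocities)):
        # velocities[i - 1] is the motion arriving at frame i, velocities[i] the motion leaving it
        change = math.dist(velocities[i], velocities[i - 1])
        if change > best_change:
            best_frame, best_change = i, change
    return best_frame


# Ball rolling slowly, then struck at frame 5 (illustrative data, 50 fps assumed).
xs = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.50, 0.90, 1.30]
track = [(x, 30.0) for x in xs]
print(detect_contact_frame(track, fps=50.0))
```

In this toy trajectory the ball's speed jumps between frames 5 and 6, so frame 5 is reported as the contact frame.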
Offside line calculation and rendering
At the identified pass-moment frame, the system calculates the most advanced offside-relevant body part of the attacking player and compares it to the position of the second-last opponent, which is the last outfield defender in the usual case where the goalkeeper is the deepest opponent. If the calculation indicates a marginal call, the system renders a 3D graphic, the now-familiar broadcast visualisation with lines and player models, for officials and broadcast to review.
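The positional comparison itself is simple once the inputs are known. The sketch below follows the offside-position definition in the Laws of the Game (in the opponents' half, and nearer to the opponents' goal line than both the ball and the second-last opponent, with level counting as onside). The function name, the default halfway-line coordinate and the example values are assumptions for illustration.

```python
def is_offside_position(attacker_x, ball_x, second_last_opponent_x,
                        halfway_x=52.5, attack_direction=1):
    """Positional offside test at the pass-moment frame.

    attacker_x: x of the attacker's most advanced offside-relevant body part.
    ball_x: x of the ball when it is played.
    second_last_opponent_x: x of the second-last opponent's deepest relevant part.
    attack_direction: +1 if attacking towards increasing x, -1 otherwise.
    Strict comparisons mean a player exactly level is not in an offside position.
    """
    a = attack_direction * attacker_x
    return (a > attack_direction * halfway_x
            and a > attack_direction * ball_x
            and a > attack_direction * second_last_opponent_x)


print(is_offside_position(90.6, ball_x=80.0, second_last_opponent_x=90.5))  # ahead by 10 cm
print(is_offside_position(90.5, ball_x=80.0, second_last_opponent_x=90.5))  # exactly level
```

This covers only the positional part of the decision; whether the player is penalised also depends on involvement in play, which is a matter for the officials.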
Human confirmation
The calculated decision is sent to the video assistant referee, who confirms the automated determination before it is communicated to the on-field referee. This human-in-the-loop step is why the system is called semi-automated rather than fully automated — the technology generates the data and the recommended decision, but a person confirms it before the decision is final.
What training data underlies a system like this
Building or improving a system of this kind requires annotation at a level of precision that exceeds most other sports computer vision applications.
Limb-level keypoint annotation, not just player bounding boxes. The system needs to detect specific body points — shoulder, hip, knee, foot — accurately enough to determine which point is most advanced, which requires dense skeletal keypoint training data covering the full range of player postures, including extended positions like a player stretching for the ball, jumping for a header, or sliding into a tackle.
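To make the annotation requirement concrete, here is a sketch of what one labelled player instance might look like. The field names, posture tags and identifiers are hypothetical; the visibility flag follows the COCO keypoint convention (0 = not labelled, 1 = labelled but occluded, 2 = labelled and visible).

```python
from dataclasses import dataclass, field


@dataclass
class Keypoint:
    name: str
    x_px: float
    y_px: float
    visibility: int  # COCO-style: 0 not labelled, 1 occluded, 2 visible


@dataclass
class PlayerAnnotation:
    frame_index: int
    camera_id: str
    player_id: str
    posture_tag: str  # hypothetical tag, e.g. "sliding_tackle" or "jumping_header"
    keypoints: list = field(default_factory=list)

    def missing_keypoints(self, required):
        """Names in `required` that have no label in this annotation."""
        labelled = {k.name for k in self.keypoints if k.visibility > 0}
        return sorted(set(required) - labelled)


ann = PlayerAnnotation(
    frame_index=1200, camera_id="cam_03", player_id="home_9",
    posture_tag="sliding_tackle",
    keypoints=[
        Keypoint("right_shoulder", 812.4, 310.2, 2),
        Keypoint("right_knee", 840.0, 402.7, 1),
        Keypoint("right_foot", 0.0, 0.0, 0),
    ],
)
print(ann.missing_keypoints(["right_shoulder", "right_hip", "right_knee", "right_foot"]))
```

A completeness check like `missing_keypoints` is one simple way to catch annotations where an offside-relevant point was skipped in an awkward posture.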
Multi-camera synchronised training data. Because the system relies on triangulating positions across multiple camera feeds, training data needs to represent the same play sequence from multiple synchronised angles, with consistent limb keypoint labels across all camera views, so the model learns to produce position estimates that triangulate correctly rather than just looking plausible from a single angle.
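Cross-view label consistency can be checked by projecting a triangulated 3D point back into each camera and measuring how far it lands from that camera's label. The sketch below assumes a standard 3x4 pinhole projection matrix per camera; the matrices, camera names and pixel threshold are invented for illustration.

```python
import math


def project(P, point_3d):
    """Project a 3D point to pixel coordinates with a 3x4 projection matrix."""
    x, y, z = point_3d
    u, v, w = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
    return u / w, v / w


def reprojection_errors(projections, point_3d, labels):
    """Pixel distance between each camera's label and the reprojected 3D point."""
    return {cam: math.dist(project(projections[cam], point_3d), labels[cam])
            for cam in labels}


def inconsistent_views(projections, point_3d, labels, max_error_px):
    """Cameras whose label disagrees with the 3D point by more than the threshold."""
    errors = reprojection_errors(projections, point_3d, labels)
    return sorted(cam for cam, err in errors.items() if err > max_error_px)


# Two hypothetical cameras with the same intrinsics, offset sideways from each other.
projections = {
    "cam_a": [[1000, 0, 500, 0], [0, 1000, 500, 0], [0, 0, 1, 0]],
    "cam_b": [[1000, 0, 500, -2000], [0, 1000, 500, 0], [0, 0, 1, 0]],
}
labels = {"cam_a": (600.0, 700.0), "cam_b": (403.0, 704.0)}
print(inconsistent_views(projections, (1.0, 2.0, 10.0), labels, max_error_px=3.0))
```

Views flagged this way are candidates for re-annotation, since a label that looks plausible in one image can still be several pixels away from where the other cameras place the same joint.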
Contact-moment annotation for ball-play detection. Identifying the exact frame the ball is played requires training data with the contact moment labeled precisely — similar to bounce-point annotation in tennis or bat-contact annotation in cricket, but applied to a wider range of contact types (foot, head, chest, knee) across a wider range of body positions.
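Precisely labelled contact frames also make it possible to score a detector against a frame tolerance. The following is a minimal sketch of such a metric; the function name, the default one-frame tolerance and the sample frame numbers are assumptions.

```python
def contact_frame_accuracy(predicted, labelled, tolerance_frames=1):
    """Fraction of events whose predicted contact frame is within the tolerance.

    predicted, labelled: equal-length lists of frame indices, one pair per event.
    """
    if len(predicted) != len(labelled) or not labelled:
        raise ValueError("need equal-length, non-empty lists")
    hits = sum(abs(p - t) <= tolerance_frames for p, t in zip(predicted, labelled))
    return hits / len(labelled)


# Three hypothetical passes: two detected within one frame, one missed by four frames.
print(contact_frame_accuracy([100, 251, 398], [100, 250, 402]))
```

Reporting accuracy at several tolerances (zero, one and two frames, for example) shows how much of a detector's error would actually matter for marginal decisions.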
Edge case representation. A meaningful share of real offside situations involve edge cases: a defender falling to the ground, a goalkeeper out of position, a player's arm (which doesn't count) extended further than their shoulder (which does), or a deflection that changes who is considered to have played the ball. Training data needs deliberate over-representation of these edge cases, because they are exactly the situations where automated systems are most likely to make an error that becomes publicly visible and disputed.
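Deliberate over-representation can be as simple as weighted sampling when assembling training batches. The sketch below uses Python's standard `random.choices`; the clip structure, tag names and weights are hypothetical, and suitable weights would need to be tuned for a real dataset.

```python
import random
from collections import Counter


def sample_with_edge_case_weighting(clips, weights_by_tag, n, seed=0):
    """Draw n clips with replacement, weighting each clip by its tag.

    clips: list of dicts with a "tag" key.
    weights_by_tag: relative sampling weight per tag; unknown tags default to 1.0.
    """
    rng = random.Random(seed)
    weights = [weights_by_tag.get(clip["tag"], 1.0) for clip in clips]
    return rng.choices(clips, weights=weights, k=n)


# Illustrative dataset: edge cases are rare in raw footage.
clips = ([{"id": i, "tag": "routine"} for i in range(95)]
         + [{"id": 95 + i, "tag": "defender_falling"} for i in range(5)])
weights = {"routine": 1.0, "defender_falling": 10.0}  # hypothetical weights

batch = sample_with_edge_case_weighting(clips, weights, n=1000)
print(Counter(clip["tag"] for clip in batch))
```

With these weights the rare tag makes up roughly a third of each batch in expectation instead of a twentieth, at the cost of repeating the same few clips, which is why collecting more genuine edge-case footage is still preferable.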
Why controversy still happens with automated systems
Even with precise computer vision, offside calls generate controversy for reasons that are not purely technical.
The rule itself has interpretive elements. "Active involvement" in an offside position, and judgement calls around deliberate plays by defenders, are not purely positional questions — they require football judgement that sits outside what positional tracking alone can determine. The technology can tell you precisely where a player's shoulder was; it cannot by itself resolve every interpretive question the Laws of the Game leave to referee judgement.
Marginal calls remain marginal. A system that is accurate to within a centimetre will still produce calls that fall exactly at the margin where most disputed offside decisions occur. Improving precision narrows the band of genuinely uncertain calls; it does not eliminate it, because professional football regularly produces situations that are decided by a few centimetres.
Broadcast visualisation can create a false impression of certainty. The 3D line graphic shown on broadcast looks definitive, which can create public perception that the call is more certain than the system's actual margin of error in that specific instance. Communicating the system's confidence level, not just its binary decision, is an ongoing challenge for how this technology is presented to audiences.
Frequently asked questions
What is semi-automated offside technology? Semi-automated offside technology is a computer vision system used in professional football that automatically calculates player limb positions relative to the ball and the last defender at the moment a pass is played, generating an offside determination that a human video assistant referee reviews and confirms before the decision is given to the on-field referee.
How does VAR offside technology actually work? The system uses a network of tracking cameras to detect specific limb points on every player continuously throughout the match. When a pass is made, it identifies the exact contact frame, triangulates the relevant players' limb positions into precise pitch coordinates, and calculates whether the attacking player's most advanced offside-relevant body part was beyond the last defender at that moment. A human official confirms the calculated decision before it becomes final.
Why is offside hard to automate compared to other officiating decisions? Offside requires tracking the position of specific body parts — not just whether a player is in a general area — at an exact moment in time that must itself be precisely identified. It also requires high spatial precision because many real decisions are marginal, decided by centimetres, and requires 3D triangulation across multiple cameras to avoid perspective distortion that would otherwise make a single-camera view unreliable.
Why is it called "semi-automated" rather than fully automated? Because the system generates the positional data and a recommended decision, but a human video assistant referee reviews and confirms the call before it is communicated to the on-field referee. The technology automates the data collection and calculation; a person remains responsible for the final confirmation.
What training data does an offside detection system need? It needs dense skeletal keypoint annotation across a wide range of player postures including extended and unusual positions, multi-camera synchronised training data with consistent labels across angles to support accurate triangulation, precise ball-contact frame annotation, and deliberate representation of edge cases like players falling, goalkeepers out of position, and ambiguous deflections.
Does semi-automated offside technology eliminate controversy? No. It reduces decision time and improves positional precision, but offside calls in professional football are frequently decided by margins narrow enough that even highly accurate systems produce calls at the edge of their precision. The rule also includes interpretive elements — like judging deliberate defensive actions — that positional data alone cannot resolve.
What sports use similar automated officiating technology to football's offside system? Tennis uses comparable multi-camera triangulation technology for electronic line calling, determining ball position relative to court lines with similarly high spatial precision. Cricket uses ball-tracking technology for DRS decisions, which adds physics-based prediction of the ball's onward trajectory. Both share the underlying requirement of frame-accurate event detection combined with precise spatial calculation.
The takeaway
Semi-automated offside technology is one of the most precision-demanding applications of computer vision in sport: it requires limb-level tracking, multi-camera triangulation and frame-accurate contact detection, all under public scrutiny where every marginal call is reviewed and debated. The system's reliability depends heavily on training data that represents the full range of body positions, camera angles and edge cases that real matches produce.
If you are building officiating or tracking technology that depends on precise player and limb tracking, see how Train Matricx works or review annotated dataset results in our case studies. We annotate a free pilot clip so you can evaluate quality before committing to any volume.
Written by
Train Matricx Team


