There have been a number of machine vision interfaces developed over the past few decades. Different research groups and working committees with members from all over the world have been assembled to define and maintain these interfaces. These groups have focused on addressing the needs of companies who manufacture cameras, frame grabbers, PCs, etc., and at the same time, have worked to meet the application-specific needs of the machine vision market.
Vision standards developed over time include the IIDC standard for Firewire (released in 1996), Camera Link (released in 2000), GigE Vision (released in 2006), CoaXPress (released in 2010), and USB3 Vision (released in January 2013). In recent years, the MIPI CSI interface for embedded vision has been gaining momentum for a completely new set of future applications based on embedded systems.
This guide gives an overview of popular interfaces for machine vision and highlights some key benefits, challenges, and typical applications. It does not cover some very specialized camera interfaces such as PCI Express and HDMI. HDMI (High-Definition Multimedia Interface), developed in 2002, is an audio/video interface. It is not really a “data” interface but a real-time signal transfer interface for transferring uncompressed video data and compressed or uncompressed audio data from an HDMI-compliant device to a monitor, television, or other output device. The PCI Express interface, although used in a few camera devices, is a high-speed serial standard mainly used as a computer motherboard interface to connect various peripherals such as hard drives and graphics cards.
The gigabit Ethernet interface was introduced to the industrial machine vision world in 2006 as the GigE Vision interface. Based on the Internet Protocol (IP) standard, it provided a framework for transmitting video and related control data over gigabit Ethernet networks. In those early days, each manufacturer provided their own proprietary driver, mostly based on the USB 2.0 interface. For this reason, interoperability between devices was difficult and required a high degree of modification.
Since that time, GigE Vision has become highly standardized under the supervision of the Association for Advancing Automation (A3) and has been able to leverage the continued evolution of Ethernet networking technology. The GigE Vision standard unifies various protocols allowing better interconnectivity between hardware devices and software resulting in high speed image transfer using low cost standard cables over very long distances. This has enabled it to grow into one of the most widely used interfaces in the machine vision industry.
Most installed GigE Vision cameras are based on the original 1000BASE-T Ethernet implementation, providing 1 Gbps of total bandwidth. More and more new GigE Vision cameras are now being equipped with faster GigE interfaces including so-called NBASE-T (with 2.5 Gbps or 5.0 Gbps speeds) and 10GBASE-T (also called 10GigE) with the capability of transmitting image data at up to 10 Gbps.
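To make these link speeds more concrete, the following minimal Python sketch estimates an upper bound on the frame rate of uncompressed images for each speed. The sensor resolution, the `efficiency` factor, and the function name are illustrative assumptions that do not come from this guide; real throughput depends on packet size, protocol overhead, and the host system.

```python
# Nominal link speeds named in the text, in bits per second.
LINK_SPEEDS_BPS = {
    "1000BASE-T": 1.0e9,
    "NBASE-T 2.5G": 2.5e9,
    "NBASE-T 5G": 5.0e9,
    "10GBASE-T": 10.0e9,
}


def max_frame_rate(width, height, bits_per_pixel, link_bps, efficiency=0.9):
    """Return an upper bound on frames per second for uncompressed images.

    `efficiency` is an assumed fraction of the link left for image payload
    after protocol overhead. The default of 0.9 is a placeholder, not a
    measured value.
    """
    bits_per_frame = width * height * bits_per_pixel
    return link_bps * efficiency / bits_per_frame


if __name__ == "__main__":
    # Hypothetical 5-megapixel, 8-bit monochrome sensor.
    for name, bps in LINK_SPEEDS_BPS.items():
        fps = max_frame_rate(2448, 2048, 8, bps)
        print(f"{name:>13}: {fps:6.1f} fps")
```

Because the estimate scales linearly with link speed, a camera limited by the interface at 1 Gbps could in principle run about ten times faster on 10GBASE-T, provided the sensor and host can keep up.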
Benefits of having a 10GigE interface:
It is built on a data transmission standard for data centers and IT infrastructure. This means that its core technology is constantly being improved and solidified by key players within the Ethernet IT business such as Google, Cisco, IBM and Intel.
Due to the widespread availability of GigE Vision compliant software and hardware devices, integration across networks is greatly simplified. This helps to achieve better scalability and flexibility for machine vision applications.
It doesn't require a frame grabber. Most systems use standard network adapter cards supporting GigE/10GigE. These are inexpensive and available at many vendors, helping to lower system costs.
It can also save on cabling and maintenance costs. GigE cables are easily available at IT hardware stores. These cables, as well as other networking components, are easy to replace and are low maintenance, thus helping to reduce inventory.
It can operate over long cable lengths. Standard twisted pair copper cables (Cat 6, Cat 6e, Cat 6a and Cat 7) can all be used with 10GigE connections. Cat 6 and Cat 6e can support cable lengths up to 55m whereas Cat 6a and Cat 7 can support cable lengths up to 100m.
It is based on GenICam which helps to provide consistency for programmers, standardized pixel formats, and better interoperability with other GenICam-based interfaces.
10GigE and GigE Vision support multi-video streaming which allows the streaming of two or more parallel video streams using the same interface. This can be especially useful for multi-sensor cameras as well as multi-processor architectures.
It supports Precision Time Protocol (IEEE 1588), which is an integral part of the GigE Vision 2.0 standard. With the ever-increasing use of multi-camera systems in machine vision applications, precise synchronization of the various vision and non-vision components plays an important role in minimizing jitter and other out-of-sync effects. Precision Time Protocol (PTP) provides that synchronization.
It can be implemented in a way that supports network auto-negotiation. This can make 10GigE backwards compatible to 1000BASE-T (1 Gbps) and NBASE-T (2.5 and 5 Gbps) helping users to transition to faster GigE speeds.
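As background to the PTP benefit listed above, the sketch below shows the basic arithmetic that IEEE 1588 uses to estimate a device's clock offset from one exchange of timestamps. It assumes a symmetric network path. The class and function names are illustrative and do not belong to any camera SDK.

```python
from dataclasses import dataclass


@dataclass
class PtpExchange:
    """Four timestamps from one Sync / Delay_Req exchange (same time unit)."""
    t1: float  # master sends Sync (master clock)
    t2: float  # slave receives Sync (slave clock)
    t3: float  # slave sends Delay_Req (slave clock)
    t4: float  # master receives Delay_Req (master clock)


def offset_and_delay(x):
    """Return (slave offset from master, one-way path delay).

    Assumes the path delay is the same in both directions.
    """
    offset = ((x.t2 - x.t1) - (x.t4 - x.t3)) / 2
    delay = ((x.t2 - x.t1) + (x.t4 - x.t3)) / 2
    return offset, delay


if __name__ == "__main__":
    # Made-up values in nanoseconds: slave clock 500 ns ahead, 200 ns path delay.
    exchange = PtpExchange(t1=1000, t2=1700, t3=2000, t4=1700)
    print(offset_and_delay(exchange))  # (500.0, 200.0)
```

A device would subtract the computed offset from its own clock. Repeating the exchange regularly keeps cameras and other components aligned to the same time base.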
As with all GigE Vision interfaces, there can be latency issues related to the IP networking configuration. Low latency and jitter can be critical in high-speed and real-time applications. Network optimization focusing on host computer and resource sharing (between buses, memories, CPUs, operating system, imaging cores, and graphic libraries) can help to minimize latency issues, but cannot eliminate them completely.
It requires dealing with some network complexity and network bandwidth limitations. For example, multi-camera applications must have IP addresses assigned, as opposed to a true "plug-and-play" model. In addition, the sharing of network bandwidth in multi-camera applications may require complicated packet delay setups, and may still not have enough bandwidth to support high-speed applications.
New multi-streaming cameras may not be fully supported by third-party software tools. Though multi-streaming is a part of the GigE Vision standard, its use in cameras is still relatively new. If multi-streaming is required, it is important to select a camera vendor who can provide an SDK with sample code for building multi-stream applications so that this capability can be fully leveraged.
It may require separate cabling for power and/or triggering. Power over Ethernet (PoE) has been widely implemented in traditional 1000BASE-T (1 GigE) cameras, providing a one-cable option for power and data transfer. However, many cameras do not support triggering over Ethernet, which is typically implemented using GigE Vision Action Commands. Thus, two cables/connections may still be needed for many applications, even when PoE is supported. This becomes even more challenging for 10GigE cameras because their maximum power consumption may exceed the basic 12.95W limit of the PoE standard. While PoE+ (IEEE 802.3at) with a consumption limit of 25.5W is an option, it is more complex and expensive to implement, and many 10GigE cameras have simply opted for a two-cable approach, with the second connection used for power and I/O.
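The packet delay setups mentioned in the bandwidth-sharing challenge above can be approximated with simple arithmetic. The following sketch assumes that every camera sends packets of the same size and that the link is shared equally. The function names and the 1500-byte packet size are illustrative assumptions. Real cameras expose such a delay through device features whose units may differ, which this sketch does not model.

```python
def per_camera_share_bps(link_bps, num_cameras):
    """Equal share of the link for each camera, in bits per second."""
    if num_cameras < 1:
        raise ValueError("num_cameras must be at least 1")
    return link_bps / num_cameras


def inter_packet_delay_ns(packet_bytes, link_bps, num_cameras):
    """Idle time after each packet so one camera uses about 1/num_cameras of the link.

    A packet occupies the wire for packet_time. Waiting for
    (num_cameras - 1) * packet_time before the next packet leaves room
    for one packet from each of the other cameras.
    """
    if num_cameras < 1:
        raise ValueError("num_cameras must be at least 1")
    packet_time_ns = packet_bytes * 8 * 1e9 / link_bps
    return packet_time_ns * (num_cameras - 1)


if __name__ == "__main__":
    # Four cameras sharing one 1 Gbps link with 1500-byte packets.
    print(per_camera_share_bps(1e9, 4))         # 250000000.0
    print(inter_packet_delay_ns(1500, 1e9, 4))  # 36000.0
```

Even with such delays in place, each camera only receives its share of the link, which is why a shared 1 Gbps network may still be too slow for high-speed multi-camera applications.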
The fact that GigE is integrated across almost all state-of-the-art IT hardware devices, and the fact that special frame grabber hardware is not required, has made GigE Vision a very popular machine vision interface. This use is particularly widespread in some applications such as:
Internet of Things (IoT) and Industry 4.0: Interconnectivity of devices is central to Industry 4.0. The rise of digital-industrial technologies makes it possible to gather and analyze data across systems. A few popular application examples are additive manufacturing and production processes; intelligent robot automation systems; and quality control and monitoring.
Boxed Vision Systems: GigE Vision-based systems are not exactly plug and play because, in networked systems, the IP address may still need to be set up before the system runs smoothly. However, they do fall under the category of boxed vision systems. They may be used in automotive, semiconductor, food sorting & recycling, industrial quality control stations, medical diagnosis, traffic systems & speed enforcement, and calibration & alignment devices, among others.
Standalone Vision Systems: These are vision systems with no inter-connectivity to other systems. They are mainly used to inspect smaller parts and overcome the size constraints of smaller manufacturers. Such systems are also often used as secondary inspection units. Typical examples are automotive parts inspection and microscopy.
Intelligent Vision Systems: These systems can include an embedded system architecture for real-time, AI/decision making based on deep learning algorithms. Typical application examples are intelligent farming, food sorting, automotive parts inspection, etc.
The previous section assumed the 10GigE interface was utilizing standard twisted pair copper cables. But 10GigE is also available over a fiber-optic interface. The SFP+ interface module is derived from the SFP (small form-factor pluggable) module, which is mainly used in telecommunications and data transfer. Depending on the application requirement, a very wide variety of SFP+ modules are available, ranging from $250 – $2000 (~210€ - 1700€). SFP typically supports only 1 Gbps speed. The SFP+ specification was released in 2006 and supports a speed of 10 Gbps. The physical connectivity layer in most cases is an optical fiber.
In relation to SFP and SFP+ standards, the following ones are applicable:
For JAI's 10GigE cameras, the SFP+ models are based on the GigE Vision protocol and support bandwidths up to 10 Gbps. Hence from the table above, the 1000BASE-SX and 1000BASE-LX standards are not supported as these are SFP standards, valid only for a bandwidth of 1 Gbps.
Benefits of an optical fiber physical layer based on a GigE Vision protocol
Transceivers based on the 10GBASE-LR standard can support very long transmission distances (up to 10 km). Very long cable lengths are helpful in industrial and outdoor environments.
Being an optical fiber, the transmission noise is extremely low and stable over long distances, and there is high immunity to external noise sources including EMI and RFI that might be emitted by equipment in large factory environments.
The data protocol is GigE Vision and hence the data packet management is based on this well established standard.
Compared to fixed interfaces (e.g. standard Ethernet connectors), SFP+ systems can be equipped with any suitable type of transceiver (refer to the standards table above).
Optical fibers are low cost even for very long cable lengths.
All functions supported by the GigE Vision protocol are supported by 10GigE-based SFP+ models including multi-streaming, PTP, chunk data, and more.
SFP+ transceivers are not backwards compatible to 1 Gbps and NBASE-T speeds and hence cannot provide auto-negotiation with the network.
Depending on the choice of transceivers, patch panels, and distance extenders, the cost of the total system can be very high. However, customers who need such a network architecture are usually aware of these costs.
Fiber cables are fragile and need proper handling.
Functions such as triggering, encoder control, etc. should be managed with additional cables. The optical fiber is purely for physical transport. Additional operation-focused camera functions need to be run using additional cables connected to the camera.
Long distance outdoor: Many outdoor applications like railway track inspection, container goods inspection, etc. need very long cable distances. The imaging module is often located far away from the PC and data processing module. Stable data transfer under rough conditions is required and hence the use of optical fibers with SFP+ transceivers is advantageous.
Long distance indoor: Some industrial applications like paper manufacturing require very large machinery spread across more than 100 meters. The inspection units are often located in critical areas of paper making which have high humidity and temperature. The image processing stations are often located far away near the control rooms.
Electrostatic environments: Applications like battery inspection and inspection of other electronic devices involve highly electrostatic environments. Twisted pair copper gigabit Ethernet cannot be used because the cables are electrically conductive. This makes optical fiber cables a good option.
In addition, the Mellanox Technologies ConnectX-3 Pro Single Port SFP+ (MCX311A-XCAT), the Intel Ethernet Converged Network Adapter X710-DA2, and the Kaya Komodo 4xSFP+ Frame Grabber have been tested with the JAI SFP+ camera models.
The CXP standard for machine vision was created to achieve very high data rates over long cable lengths. The development of the CXP standard was first announced in 2008 and the first version of the standard, CXP 1.0, was released in early 2011. Subsequently, the CXP 1.1 standard was released in late 2011 and updated in 2013. It contained some improvements and additional features over the first standard. The latest standard, CXP 2.0, was launched in 2019 and includes even higher speeds and new functions. The table below explains in short, the differences between versions of the CXP standard.
In addition to the table above, there have been many improvements over time at the mechanical, electrical, and protocol levels making it easier and more reliable to implement the standard.
Benefits of the CXP interface
It offers high throughput. With its latest standard, CXP offers the highest raw data throughput when it comes to machine vision applications. A 4xCXP-12 connection offers 50 Gbps speed with a per lane speed via CXP-12 of 12.5 Gbps.
It efficiently supports multiple high-speed cameras. CXP's high bandwidth and multi-lane architecture is well-suited to applications with multiple high-speed/high resolution cameras. Up to four cameras can be connected to a 4xCXP-12 frame grabber giving each camera 12.5 Gbps of dedicated bandwidth in a simple, point-to-point configuration.
It simplifies multi-processing. The CXP 2.0 standard has introduced a multi-destination capability where the data from a single camera can be easily divided or replicated to multiple frame grabbers, which may be located on different PCs.
It supports long cable lengths (though not as long as GigE Vision). Depending on the standard and bandwidth implemented, coaxial cable lengths can range from over 200m (for CXP-1) down to about 30m (for CXP-12). The cable length can be further extended by using thicker coaxial cables or extenders based on optical fibers.
It has low electrostatic noise.
It supports both triggering and power over a single cable. Power over CXP (PoCXP) can deliver 24V and 13W per lane. The CXP standard also includes a low-speed uplink channel for triggering or device control via the frame grabber.
It has very low latency and jitter. Depending on the frame grabber, CXP triggering typically has a latency of less than 5 microseconds with jitter of a few nanoseconds. This is an important feature in many high-speed industrial machine vision applications where the latency of a GigE network could create problems.
Wide range of regulatory approvals. As the CXP standard uses standard coaxial cables at high speeds, it can also be used in highly regulated industries and applications such as medical, life sciences and defense. The transition from analog to digital is easier as the cables used in analog systems have the same architecture.
Support for the GenICam standard. This has always been a cornerstone of the CXP standard and is important when it comes to data management and compatibility between different devices streaming complex data.
CXP cables are more cost effective than Camera Link cables. Though CXP cables are typically more expensive than Ethernet cables, they are less than half the cost of the cables needed for the other frame-grabber-based interface, Camera Link. A good CXP cable can range anywhere between $50 – $90 (~42€ – 76€) whereas, a Camera Link cable can cost between $150 – $200 (~128€ - 170€).
Unlike GigE and USB interfaces, the implementation of CXP requires an appropriate plug-in card or frame grabber. This is because CXP is not a part of the standard PC architecture like GigE and USB are. Hence, transferring data using standard Ethernet infrastructure is not possible with the CXP interface.
In higher resolution cameras with high frame rates, the maximum power consumption can exceed the 13W supported over a single CXP lane. This requires a more complicated PoCXP design including more expensive frame grabbers that can supply power to multiple lanes over a bundled cable. In such cases, PoCXP may not be supported or may be less advantageous.
In most cases, the total system cost of a CXP-based vision system is higher than GigE and USB systems. This is often not a concern if an application really demands CXP performance.
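The per-lane figures quoted above (12.5 Gbps for CXP-12 and 13W for PoCXP) suggest a quick way to estimate how many lanes a camera would occupy. The sketch below is a simplified illustration only. It uses raw line rates, ignores protocol and encoding overhead, and assumes a frame grabber that can supply power on every lane it uses. The function name is hypothetical.

```python
import math

CXP12_GBPS_PER_LANE = 12.5   # per-lane speed quoted in the text
POCXP_WATTS_PER_LANE = 13.0  # per-lane PoCXP budget quoted in the text
MAX_LANES = 4                # the 4xCXP-12 configuration described in the text


def lanes_needed(data_rate_gbps, camera_watts):
    """Smallest lane count that covers both the data rate and the power draw."""
    lanes_for_bandwidth = math.ceil(data_rate_gbps / CXP12_GBPS_PER_LANE)
    lanes_for_power = math.ceil(camera_watts / POCXP_WATTS_PER_LANE)
    lanes = max(1, lanes_for_bandwidth, lanes_for_power)
    if lanes > MAX_LANES:
        raise ValueError("requirement exceeds a 4-lane CXP-12 link")
    return lanes


if __name__ == "__main__":
    print(lanes_needed(10.0, 8.0))   # 1: one lane covers both needs
    print(lanes_needed(10.0, 20.0))  # 2: power, not bandwidth, sets the count
    print(lanes_needed(40.0, 10.0))  # 4: bandwidth sets the count
```

The second case illustrates the challenge described above: a camera may fit within one lane's bandwidth and still need a second lane, or a separate supply, purely for power.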
Defense & Medical: Ease of integration into existing coaxial cable systems, long cable lengths, high-speed data transmission, and the complex data handling functions of CXP such as link sharing are useful in these types of applications. Some typical applications include situational awareness, target positioning systems, border and entry surveillance, surgical live viewing, etc.
Industrial Machine Vision: Low jitter and latency, real-time triggering at high speeds, and robust, flexible cables help CXP to perform well in harsh industrial environments. Typical applications include roll-based inspection of paper, metal and plastic foil, and also semiconductor and PCB inspection.
The USB3 Vision standard is derived from the USB 3.0 standard with some changes and adjustments to the transport layers defined specifically for machine vision applications. Within the USB3 Vision standard, USB 3.0 and USB 3.1 (Generation 1) support a data rate of 400 MB/s (3.2 Gbps) at a cable length of 5m and the USB 3.1 (Generation 2) supports a bandwidth of 900 MB/s at a cable length of 1m.
USB3 Vision has grown to be one of the most popular machine vision interfaces mainly because of its simplicity. Most PCs are equipped with USB 3 ports and early chipset compatibility issues have largely disappeared. It is most often used in single camera configurations, such as on a microscope or a standalone inspection system that utilizes an integrated processing unit or is connected to a nearby PC located well within the interface's 1-5m cabling range. But it can also support multi-camera configurations connected via a USB 3 hub in a star topology.
Benefits of the USB3 Vision interface
USB is the most popular and standardized method to connect to computers and peripheral devices.
As part of the standard PC architecture, replacement of parts such as buses and cables is easy and cost effective.
The latest USB3 Vision standard is completely plug-and-play. In a vast majority of cases, there are no compatibility issues between cameras and PC chipsets.
Due to increasing bandwidth and reliability, USB3 Vision has rapidly replaced Firewire and USB 2.0 interfaces. The USB3 Vision interface is backwards compatible to USB 2.0 allowing simple upgrades from a native USB 2 vision system.
USB3 Vision is highly compatible and functional in multi-camera setups as long as speeds and resolutions are moderate.
The basic design of the USB3 Vision interface is based on GenICam which helps to provide consistency for programmers and better interoperability with other GenICam-based interfaces.
It has lower latency and jitter than a network interface like GigE Vision, though it does not achieve the near real-time performance of Camera Link or CoaXPress.
Variable image size: USB3 Vision allows for the sending of images in variable sizes by providing the host with information about the image in advance.
Low CPU load: The use of zero copy (Direct Memory Access — DMA) keeps the necessary CPU load for image retrieval very low.
USB 3.0 offers 4.5W of power supply to connected devices. This means many basic cameras can work without an additional power supply.
Its simplicity can reduce system costs while still providing three times faster speed than standard 1000BASE-T GigE Vision.
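The speed comparison in the last point can be checked with a short unit conversion. This sketch uses the 400 MB/s and 900 MB/s figures quoted earlier for USB3 Vision and the nominal 1 Gbps line rate of 1000BASE-T. It compares nominal rates only and treats 1 MB as 10^6 bytes, which is an assumption about how the figures are defined.

```python
def mbytes_per_s_to_gbps(mb_per_s):
    """Convert megabytes per second to gigabits per second (1 MB = 10**6 bytes)."""
    return mb_per_s * 8 / 1000


USB3_GEN1_MB_PER_S = 400  # USB 3.0 / USB 3.1 Gen 1 figure quoted in the text
USB3_GEN2_MB_PER_S = 900  # USB 3.1 Gen 2 figure quoted in the text
GIGE_GBPS = 1.0           # nominal 1000BASE-T line rate

if __name__ == "__main__":
    gen1_gbps = mbytes_per_s_to_gbps(USB3_GEN1_MB_PER_S)
    print(f"USB3 Vision (Gen 1): {gen1_gbps:.1f} Gbps")
    print(f"Ratio to 1000BASE-T: {gen1_gbps / GIGE_GBPS:.1f}x")
```

On these nominal numbers the ratio comes out slightly above three, which is consistent with the statement above.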
The passive cable length of USB3 Vision is limited to 5m, which gives other interfaces, particularly GigE Vision, a significant advantage in certain types of applications.
The power supply of 4.5 W is often not enough for high performance, high resolution cameras, thus either these cameras are not available with a USB3 Vision interface or a separate power connection is required.
As USB 3.0 is first and foremost a consumer interface used with a wide range of PC peripherals, care must be taken in selecting cables, hubs, and other devices for industrial applications. Many consumer-level USB accessories cannot stand up to the requirements of machine vision applications, leading to instability, poor performance, and field failures.
Plug and play: Known for its table-top, plug and play capabilities, USB3 Vision is a popular interface in microscopy, standalone inspection systems, portable inspection devices, etc.
Multi-camera environments: Using a hub-based topology, USB3 Vision can be used effectively in many types of moderate-speed multi-camera environments, such as 3D vision systems on robot arms, autonomous vehicles, and multi-camera virtual reality "heads" providing 360-degree views.
Just like any other digital technology, the trend of miniaturization has also influenced machine vision applications and their associated systems. Traditional machine vision systems have been relatively large, consisting of an industrial PC, a camera, and all the required accessories, e.g., power/trigger/data handling cables, add-in cards or frame grabbers to handle high-speed image acquisition with special real-time synchronization functions, etc. Over the years, PCs have grown more powerful while shrinking in size. In parallel, the size of machine vision cameras has also been reduced drastically.
This miniaturization trend has enabled a whole class of embedded vision systems to grow and flourish. While traditional machine vision systems have a high-performance architecture and are designed to carry out multitasking vision related functions, embedded vision systems are designed to carry out targeted application-specific functions only. Instead of having the complete PC architecture, embedded vision systems include processor boards consisting of a system-on-chip (SoC), which can be connected to relatively small cameras by using short cabled interfaces. The whole exercise of miniaturization is directly related to low power consumption, low production costs and is inclined towards consumer-based applications. MIPI CSI, USB, proprietary parallel and serial interfaces are common when it comes to embedded vision applications.
USB 2.0 has been a common interface on many of the older SoCs, but due to its limited bandwidth it cannot support high-performance camera modules and hence may end up losing out to other interfaces. USB 3.0 is gaining popularity in embedded vision as it provides enough bandwidth for high-resolution cameras while being easy to integrate with Linux/ARM-based platforms. The most popular platforms for embedded systems, including NVIDIA Jetson Nano, Jetson TX2/TX2i, Jetson Xavier NX, Jetson AGX Xavier, and a few others, all support USB 3.0 connectivity.
MIPI CSI is a common interface in mobile applications for connecting a smartphone's camera module to an SoC. With MIPI CSI-2, the second-generation MIPI camera serial interface, a data rate of 300 MB/sec per lane can be supported. This is not far from the 350 MB/sec that USB3 Vision can support. Moreover, most SoCs can support up to six MIPI CSI-2 lanes, so the aggregate bandwidth can be considerably higher.
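The lane arithmetic behind this comparison is easy to sketch. The example below uses the 300 MB/sec per-lane figure and the six-lane limit quoted above. The function names and the 900 MB/sec target are illustrative assumptions, and real SoCs may group or limit lanes differently.

```python
import math

CSI2_MB_PER_S_PER_LANE = 300  # per-lane figure quoted in the text
CSI2_MAX_LANES = 6            # lane count most SoCs support, per the text


def csi2_aggregate_mb_per_s(lanes):
    """Aggregate CSI-2 data rate in MB/s for a given number of lanes."""
    if not 1 <= lanes <= CSI2_MAX_LANES:
        raise ValueError("lanes must be between 1 and 6")
    return lanes * CSI2_MB_PER_S_PER_LANE


def csi2_lanes_for(target_mb_per_s):
    """Smallest lane count whose aggregate rate reaches the target."""
    lanes = max(1, math.ceil(target_mb_per_s / CSI2_MB_PER_S_PER_LANE))
    if lanes > CSI2_MAX_LANES:
        raise ValueError("target exceeds six CSI-2 lanes")
    return lanes


if __name__ == "__main__":
    print(csi2_aggregate_mb_per_s(6))  # 1800
    print(csi2_lanes_for(900))         # 3
```

With all six lanes in use, the aggregate figure is several times what a single lane offers, which helps explain the interface's appeal for compact, high-resolution embedded designs.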
Although all of the NVIDIA cards mentioned above support 1 Gigabit Ethernet connectivity, GigE Vision is not a popular interface for embedded vision systems given the limited bandwidth supported. However, additional multifunctional carrier boards combined with compact embedded systems can support high speed interfaces such as 10GigE. Furthermore, the multifunctional carrier boards also help to achieve connectivity to high performance, low latency interfaces such as CoaXPress and Camera Link cameras.
Depending on the end use of the embedded vision system, each of the mentioned interfaces has some benefits and challenges. If the task is to build an extremely compact system, then MIPI CSI-2 would be the right interface. However, the initial investment in system development can be very high and it is not a plug-and-play interface like USB. The challenge for USB3 Vision is comparatively low flexibility on the cable and a large connector size. It would however work well if the task is not to build the smallest sized embedded vision system, and if plug-and-play is an important functionality. Embedded vision systems based on CXP and Camera Link are for applications requiring high resolution and high fidelity of the imaging system. Defense, aerospace, transportation, and high-end surveillance are some examples.
There are multiple interfaces available for machine vision. Choosing an interface can have a significant impact on the overall performance of a vision system with respect to functionality, cost, integration, customer acceptance, and other factors.
Because each interface has its own set of strengths and challenges, there is no one-stop universal solution for all machine vision applications. Important trade-offs must be made regarding connection methods, bandwidth, component costs, cable length, latency, CPU load, noise immunity, and more. Two interfaces may share many of the same benefits but may have a few key differences that make one more suitable than the other for a specific application.
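As a rough illustration of how such trade-offs might be screened, the sketch below shortlists interfaces by bandwidth, cable length, and whether a frame grabber is acceptable. The bandwidth and cable figures are the nominal values quoted in this guide, except the 100m reach of 1000BASE-T, which is the standard Ethernet figure. The data structure and function are hypothetical and leave out cost, latency, power, and the other factors listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    name: str
    max_gbps: float            # nominal bandwidth
    max_cable_m: float         # approximate maximum cable length
    needs_frame_grabber: bool


INTERFACES = [
    # 100 m is the standard 1000BASE-T reach (not stated in this guide).
    Interface("GigE Vision (1000BASE-T)", 1.0, 100.0, False),
    Interface("10GigE (Cat 6a / Cat 7)", 10.0, 100.0, False),
    Interface("10GigE over SFP+ fiber", 10.0, 10_000.0, False),
    Interface("CoaXPress (4xCXP-12)", 50.0, 30.0, True),
    Interface("USB3 Vision (USB 3.0)", 3.2, 5.0, False),
]


def shortlist(required_gbps, cable_m, allow_frame_grabber=True):
    """Return names of interfaces that meet the bandwidth and distance needs."""
    return [
        i.name
        for i in INTERFACES
        if i.max_gbps >= required_gbps
        and i.max_cable_m >= cable_m
        and (allow_frame_grabber or not i.needs_frame_grabber)
    ]


if __name__ == "__main__":
    # A 5 Gbps camera placed 20 m from the PC.
    print(shortlist(5.0, 20.0))
    # A 2 Gbps camera 3 m from a PC that has no frame grabber.
    print(shortlist(2.0, 3.0, allow_frame_grabber=False))
```

A screen like this only narrows the field. The remaining candidates still need to be compared on the softer criteria, which is where vendor advice and hands-on evaluation come in.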
Looking at the four most popular machine vision interfaces today, GigE Vision and USB3 Vision have much more vendor diversity when it comes to standard IT infrastructure components, whereas CoaXPress and Camera Link have more limited suppliers and are considered more specialized. This landscape could change depending on how these and other vision standards evolve to suit future device and application requirements.
This guide provides a starting point for you to determine the right interface for your application. Ultimately, however, you may want to discuss your requirements with a camera vendor who offers a broad range of interface options. They can help you analyze the trade-offs to arrive at your final choice.
Having a range of interface options can also simplify your evaluation in some cases. It may be easier to evaluate certain camera capabilities using one interface even if the intent is to use a different interface for the final system.
Whatever your vision application, take time to understand the machine vision interface options available to you in order to ensure a good result.