The AI hardware problem
Check out Brilliant via this link and receive a 20% discount! https://brilliant.org/NewMind/

The idea of expressing signals and data as a series of discrete states revolutionized the semiconductor industry in the second half of the 20th century. The resulting information age thrived on the robust and rapidly evolving field of digital electronics, and an abundance of automation and tooling made it relatively manageable to scale designs in complexity and performance as demand grew. However, the power consumed by AI and machine-learning applications cannot feasibly keep growing on existing processing architectures.

THE MAC
In a digital neural-network implementation, the weights and input data are stored in system memory and must be continuously fetched and written back across the sea of multiply-accumulate operations within the network. As a result, most of the power is dissipated moving model parameters and input data to and from the CPU's arithmetic logic unit, where the actual multiply-accumulate operation takes place. A typical multiply-accumulate operation on a general-purpose CPU spends more than two orders of magnitude more energy on this data movement than on the computation itself.
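A minimal sketch of what such a layer looks like when written out as explicit multiply-accumulate operations; the dimensions and values are illustrative assumptions, not from the video, but they show that every MAC needs a weight and an input fetched from memory before the cheap multiply-add can happen.

```python
# Sketch: a dense-layer forward pass as explicit multiply-accumulate (MAC) ops.
# Data and sizes are hypothetical, chosen only to illustrate the access pattern.

def dense_layer(inputs, weights, biases):
    """Compute outputs[j] = biases[j] + sum_i inputs[i] * weights[j][i]."""
    outputs = []
    for j, row in enumerate(weights):
        acc = biases[j]
        for i, x in enumerate(inputs):
            # Each MAC requires fetching a weight and an input from memory;
            # on a general-purpose CPU that movement dominates the energy cost.
            acc += x * row[i]
        outputs.append(acc)
    return outputs

# Example: 3 inputs feeding 2 output neurons
print(dense_layer([0.5, -1.0, 2.0],
                  [[0.1, 0.2, 0.3],
                   [0.4, 0.5, 0.6]],
                  [0.0, 0.1]))
```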

GPUs
Their ability to process 3D graphics requires a large number of arithmetic logic units coupled to fast memory interfaces. This property inherently made them much more efficient and faster for machine learning, allowing hundreds of multiply-accumulate operations to be processed simultaneously. GPUs tend to use floating-point arithmetic, representing each number with 32 bits split into a sign, an exponent, and a mantissa. This pushes GPU-centric machine-learning applications toward floating-point numbers.
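For reference, this is how the sign, exponent, and mantissa fields sit inside a 32-bit IEEE 754 value; the helper below is just an illustrative sketch using Python's standard struct module.

```python
import struct

def float32_fields(x):
    """Unpack a float into the sign, exponent, and mantissa fields of its
    IEEE 754 single-precision (32-bit) representation."""
    bits, = struct.unpack(">I", struct.pack(">f", x))
    sign     = bits >> 31             # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, biased by 127
    mantissa = bits & 0x7FFFFF        # 23 fraction (mantissa) bits
    return sign, exponent, mantissa

sign, exp, frac = float32_fields(-6.25)
print(sign, exp - 127, frac)  # sign bit, unbiased exponent, fraction bits
```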

ASICs
These dedicated AI chips deliver significantly more data movement per joule than general-purpose GPUs and CPUs. This was the result of the discovery that, in certain types of neural networks, dramatic reductions in computational precision lower network accuracy only slightly. However, it will soon become unfeasible to keep increasing the number of multiply-accumulate units integrated on a chip, or to reduce bit precision any further.
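The precision reduction mentioned above is typically done by quantizing weights to narrow integers. The sketch below shows one simple symmetric 8-bit scheme; the scaling method and values are illustrative assumptions, not any particular chip's approach.

```python
# Sketch: symmetric 8-bit weight quantization, illustrating how precision can
# be reduced with only small rounding error. Values are hypothetical.

def quantize_int8(weights):
    """Map float weights to int8 codes using a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

w = [0.42, -1.3, 0.07, 0.9]
q, s = quantize_int8(w)
print(q)                 # 8-bit integer codes
print(dequantize(q, s))  # close to the originals, with small rounding error
```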

LOW POWER AI

Outside the digital world, it is definitively known that extremely dense neural networks can function efficiently with small amounts of power.

Much of the industry believes that the digital side of current systems needs to be augmented with a more analog approach to further drive the efficiency of machine learning. In the analog domain, computation does not take place in clocked stages of moving data; instead it exploits the inherent properties of a signal and its interaction with a circuit, combining memory, logic, and computation into a single entity that can work efficiently in a massively parallel way. Some companies are beginning to consider returning to the long-outdated technology of analog computing to address the challenge. Analog computing manipulates small electrical currents through common analog circuit building blocks to do mathematics.

These signals can be mixed and compared, replicating the behavior of their digital counterparts. Although large-scale analog computing has been explored for decades for various potential applications, it has never been successfully commercialized. Currently, the most promising approach is to integrate analog computing elements that can be programmed into large arrays, similar in principle to digital memory. Once the cells in an array are configured, an analog signal, synthesized by a digital-to-analog converter, is passed through the network.

As this signal flows through the network of pre-programmed resistances, the currents add to produce a resulting analog signal, which can be converted back to a digital value by an analog-to-digital converter. However, an analog machine-learning system poses several problems. Analog systems are inherently limited in accuracy by their noise floor, although, as with low-bit-width digital systems, this matters less for certain types of networks.
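Numerically, the array behaves like a matrix-vector multiply: input voltages drive the rows, programmed conductances act as weights, and the currents summing on each column form the analog result read out by the ADC. The simulation below is a rough sketch of that idea; the voltages, conductances, and the crude ADC step are all assumed for illustration.

```python
# Sketch: a resistive crossbar as a matrix-vector multiply.
# Row voltages (from a DAC) times programmed conductances give currents that
# sum on each column; a crude ADC converts the column currents back to codes.
# All component values here are illustrative assumptions.

def crossbar_mvm(voltages, conductances, adc_step=1e-6):
    """Return ADC codes for column currents I[j] = sum_i V[i] * G[i][j]."""
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for i, v in enumerate(voltages):
        for j in range(n_cols):
            currents[j] += v * conductances[i][j]   # Ohm's law + current summing
    return [round(i_out / adc_step) for i_out in currents]  # simple ADC model

volts = [0.2, 0.5, 0.1]            # DAC-synthesized input voltages
g = [[1e-5, 2e-5],                 # programmed conductances in siemens
     [3e-5, 1e-5],
     [2e-5, 4e-5]]
print(crossbar_mvm(volts, g))      # digital codes of the resulting currents
```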

If analog circuits are used for inference, the results may not be deterministic and are more likely to be affected by heat, noise, or other external factors than those of a digital system. Another problem with analog machine learning is explainability. Unlike digital systems, analog systems provide no easy way to examine or debug the flow of information within them. Some in the industry argue that a solution could be to use low-precision, high-speed analog processors for most situations, while pushing results that require more confidence to slower, high-precision, easy-to-interrogate digital systems.

SUPPORT NEW MIND ON PATREON
https://www.patreon.com/newmind​

SOCIAL MEDIA LINKS
Instagram – https://www.instagram.com/newmindchannel​

Please take the opportunity to connect and share this video with your friends and family if you find it helpful.