Jump to navigation Jump to search “Source coding” redirects here. For the term in computer programming, see Source code. In signal processing, data compression, source coding, or bit-rate reduction wavelet analysis forex in encoding information using fewer bits than the original representation. The process of reducing the size of a data file is often referred to as data compression.

Compression is useful because it reduces resources required to store and transmit data. Lossless data compression algorithms usually exploit statistical redundancy to represent data without losing any information, so that the process is reversible. DEFLATE is a variation on LZ optimized for decompression speed and compression ratio, but compression can be slow. The best modern lossless compressors use probabilistic models, such as prediction by partial matching. Wheeler transform can also be viewed as an indirect form of statistical modelling. The class of grammar-based codes are gaining popularity because they can compress highly repetitive input extremely effectively, for instance, a biological data collection of the same or closely related species, a huge versioned document collection, internet archival, etc. The basic task of grammar-based codes is constructing a context-free grammar deriving a single string.

In a further refinement of the direct use of probabilistic modelling, statistical estimates can be coupled to an algorithm called arithmetic coding. Lossy data compression is the converse of lossless data compression. In the late 1980s, digital images became more common, and standards for compressing them emerged. In the early 1990s, lossy compression methods began to be widely used. Lossy image compression can be used in digital cameras, to increase storage capacities with minimal degradation of picture quality. However a new, alternative view can show compression algorithms implicitly map strings into implicit feature space vectors, and compression-based similarity measures compute similarity within these feature spaces. When one wishes to emphasize the connection, one may use the term differential compression to refer to data differencing.