Differential File Compression with Applications to Similarity Testing

April 7, 2004
2:50pm - 4:00pm
Halligan 111

Abstract

In many practical applications, newly received or generated information is highly similar to information already present. When a software revision is released to licensed users, the distributor can require that the user perform the upgrade on the licensed copy of the existing version. When incremental backups are performed for a computing system, differential file compression can be used to compress a file with respect to an earlier version of it or a similar file in a previous backup, or with respect to a file already processed in the current backup. Differential file compression can also serve as a powerful similarity test in browsing and filtering applications. We present recent work on generalized edit distance measures and high-speed algorithms for in-place lossless differential file compression, and discuss current research on lossy methods appropriate for image databases.
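
As a rough illustration of the idea (not the algorithms presented in the talk), the sketch below compresses a new file against a reference file using the preset-dictionary feature of Python's standard zlib module, and uses the resulting size as a crude similarity score. The function names delta_compress, delta_decompress, and relative_delta_size are hypothetical, chosen only for this example. Deflate looks back at most 32 KiB, so only the tail of a large reference helps, and this sketch is not in-place.

```python
import zlib


def delta_compress(new: bytes, reference: bytes) -> bytes:
    """Compress `new` with `reference` as a preset dictionary.

    Assumption: zlib's preset dictionary stands in for a real differencing
    algorithm. Only the last 32 KiB of `reference` can be matched against.
    """
    comp = zlib.compressobj(level=9, zdict=reference)
    return comp.compress(new) + comp.flush()


def delta_decompress(delta: bytes, reference: bytes) -> bytes:
    """Recover the new file from the delta and the same reference."""
    decomp = zlib.decompressobj(zdict=reference)
    return decomp.decompress(delta) + decomp.flush()


def relative_delta_size(new: bytes, reference: bytes) -> float:
    """Crude similarity score: delta size divided by stand-alone compressed size.

    Values near 0 suggest `new` is largely predictable from `reference`;
    values near 1 suggest the reference is of little help.
    """
    plain = len(zlib.compress(new, 9))
    delta = len(delta_compress(new, reference))
    return delta / plain
```

In a backup or upgrade setting, the receiver already holds the reference, so only the delta needs to be stored or sent. In a filtering setting, relative_delta_size could be used to rank candidate files by how well each one predicts the query file.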