Kitz Forum

Chat => Tech Chat => Topic started by: Weaver on June 20, 2018, 11:39:12 PM

Title: Stronger alternatives to Zip files
Post by: Weaver on June 20, 2018, 11:39:12 PM
Does anyone know about file compression formats that give higher compression than the common dialects of zip files?

Also I am wondering about application-specific compression, such as for an exe file for example, for processor xx ?

The other thing I am thinking about is where I have a load of files that are all backups of different versions of the same thing. A good version control system would store the whole lot as reverse deltas, with the latest version being stored in full. But without a VCS, a compression system that understands what is going on would be very powerful indeed, as it would first convert the whole lot to deltas and then compress what little is left. I wonder if there is such a thing?
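To make the reverse-delta idea concrete, here is a minimal sketch in Python. It assumes each version is a list of text lines and uses the standard library's difflib to find the unchanged runs. The function names (make_delta, apply_delta, pack_history, unpack_history) are made up for illustration and are not how any particular VCS stores its data.

```python
import difflib


def make_delta(source, target):
    """Describe how to rebuild `target` from `source` (both lists of lines)."""
    delta = []
    matcher = difflib.SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: only remember where it sits in the source.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New material has to be stored literally.
            delta.append(("add", target[j1:j2]))
        # "delete" needs no entry: those source lines are simply not copied.
    return delta


def apply_delta(source, delta):
    """Rebuild the target version from `source` and a delta from make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def pack_history(versions):
    """Keep the newest version in full plus reverse deltas, newest to oldest."""
    latest = versions[-1]
    deltas = []
    for i in range(len(versions) - 1, 0, -1):
        deltas.append(make_delta(versions[i], versions[i - 1]))
    return latest, deltas


def unpack_history(latest, deltas):
    """Walk backwards from the newest version; return versions oldest first."""
    versions = [latest]
    for delta in deltas:
        versions.append(apply_delta(versions[-1], delta))
    versions.reverse()
    return versions
```

The packed form (latest version plus the list of deltas) could then be serialised and handed to any ordinary compressor, which only has to deal with the lines that actually changed.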

And without such intelligence, if a standard type of compressor were let rip on the entire lot, seeing all the versions concatenated together into one huge archive, would it then do well anyway because long strings would have a reasonably high occurrence count in a dictionary ? Provided that the dictionary could cope with immensely long words?
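One rough way to test this is to concatenate several near-identical versions and compare a small-window compressor with a large-window one. The sketch below uses Python's standard zlib (Deflate, the usual zip method, with a 32 KiB window) and lzma (the algorithm family behind 7z and xz, with a multi-megabyte dictionary at its default preset). The test data is synthetic: random bytes with a small edit per version, chosen so that the only redundancy is between versions. The sizes and function names are assumptions for illustration.

```python
import lzma
import random
import zlib


def make_versions(base_size=200_000, count=5, seed=0):
    """Return `count` versions of a random file, each with one 100-byte edit."""
    rng = random.Random(seed)
    data = bytearray(rng.getrandbits(8) for _ in range(base_size))
    versions = []
    for _ in range(count):
        pos = rng.randrange(base_size - 100)
        data[pos:pos + 100] = bytes(rng.getrandbits(8) for _ in range(100))
        versions.append(bytes(data))
    return versions


def compare(versions):
    """Compress all versions as one concatenated blob with both compressors."""
    blob = b"".join(versions)
    return {
        "raw": len(blob),
        "zlib": len(zlib.compress(blob, 9)),  # matches limited to 32 KiB back
        "lzma": len(lzma.compress(blob)),     # dictionary spans the whole blob
    }


if __name__ == "__main__":
    print(compare(make_versions()))
```

With versions this far apart, the repeats lie outside zlib's window, so it should gain almost nothing, while lzma should end up close to the size of a single version. So the answer looks like "yes, provided the dictionary or window is larger than the distance between the repeats".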
Title: Re: Stronger alternatives to Zip files
Post by: sevenlayermuddle on June 21, 2018, 01:00:55 AM
Compression has its uses, but I generally avoid proprietary compression tools for backups.

Reason being, if I or my descendants need to recover data from a backup in, say, 10 or 20 years or more, what are the chances that the compression tool will still be supported and able to run in whatever OS is then trending, in order to decompress my archive?
Title: Re: Stronger alternatives to Zip files
Post by: kitz on June 21, 2018, 01:05:49 AM
I should know this as I did compression algorithms for my dissertation..  but I've forgotten most of it now as it seems so long ago and I've had no use for the subject since.   
The only things I remember these days are tearing my hair out trying to code my own algorithm..  how I began to hate Huffman coding & trees ..  and about the only use I've had since is how knowledge in that area aided understanding of the error protection & correction algorithms used by DSL.

TBH I didn't enjoy it. We had limited choice for dissertation subjects and to this day I'm still not sure what they expected from us, other than probably to attempt to come up with some algorithm which was better than the tried and trusted algorithms which were already out there. ::)

Anyhows back on topic, rar is more efficient than zip for file data. I have the comparison data somewhere up in the loft in a pile of folders. Quite often though with various algorithms there is a trade-off between archive time and storage, but I cba to get deep into the topic again. (Win)rar is also proprietary and I'm not sure if it will work on all O/S. My copy of it must date back to the days I did my degree, so I'm not sure how it's moved on since, other than that these days you must pay for it and they no longer offer trial or cutdown versions.

There's one more alternative - 7z. 7zip combines numerous archiving algorithms and is reported to be more efficient than RAR, but tbh I don't know that much about it and have never used it, as it is newer than the other 2 more common formats.

At the end of the day zip tends to remain most popular as it does a good job, is supported across all platforms and is free.
Title: Re: Stronger alternatives to Zip files
Post by: kitz on June 21, 2018, 01:11:59 AM
7LM posted whilst I was doing so. I think he's answered it more eloquently than me, i.e. Zip works and, of the lot of them, is the most likely to be supported in 20 years' time. It became popular for a good reason.

As I mentioned in my post..  I had a free copy of WinRAR but it's now so old I keep expecting one day it won't work on a new version of Windows unless I pay for a new licence.
Title: Re: Stronger alternatives to Zip files
Post by: johnson on June 21, 2018, 01:20:06 AM
TBH I didn't enjoy it. We had limited choice for dissertation subjects and to this day I'm still not sure what they expected from us, other than probably to attempt to come up with some algorithm which was better than the tried and trusted algorithms which were already out there. ::)

What kind of sadist would task undergraduates (sorry if wrong assumption) with competing with the body of work already done on compression?
Title: Re: Stronger alternatives to Zip files
Post by: kitz on June 21, 2018, 01:51:22 AM
I know..  I think that's why I hated it and found it boring.    :(
Title: Re: Stronger alternatives to Zip files
Post by: johnson on June 21, 2018, 02:10:44 AM
Well if it's any consolation I was given free rein and still managed to pick a god-awful topic for my dissertation.

At least after yours you can argue with authority about the merits of different compression algorithms that people actually use.   ;D
Title: Re: Stronger alternatives to Zip files
Post by: Weaver on June 21, 2018, 02:45:23 AM
I once worked in a small team of five colleagues who were doing specialised application-specific compression systems in my early years. One of the most notable was a dictionary for a Scrabble game that I worked on for the Sinclair Spectrum, written entirely in Z80 assembler. The dictionary used a highly specialised compression scheme suited to the needs of word retrieval, the innermost layer of compression was Huffman coding, and it was all written by two of my colleagues. Very enjoyable project and a happy time.

I remembered that Kitz is a data compression expert. So RAR is worth a look?

I have read up on bz2 as well, have seen that used in conjunction with tar.
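For anyone wanting to try the tar-plus-bz2 combination without leaving Python, the standard tarfile module can write and read such archives directly. This is only a sketch: the pack and unpack helpers are made-up names, everything is kept in memory, and the file contents are passed in as a dict of name to bytes.

```python
import io
import tarfile


def pack(files, compression="bz2"):
    """Build a compressed tar in memory; compression is "gz", "bz2" or "xz"."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=f"w:{compression}") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack(blob):
    """Read back a tar made by pack(); the compression type is auto-detected."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }
```

Because tar joins all the files into one stream before the compressor sees it, redundancy between files can be picked up, within whatever window or block size the chosen compressor has.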
Title: Re: Stronger alternatives to Zip files
Post by: burakkucat on June 21, 2018, 06:26:46 PM
I have read up on bz2 as well, have seen that used in conjunction with tar.

Considering Unix or Linux kernel based systems only . . .

Currently there are three compression algorithms that are routinely used in conjunction with cpio or tar --