Only just seen this. Please sir I think I may know why. :waves hand in the air:
First, a wee bit of background on compression:
Compression is primarily designed to make file sizes smaller. There are various compression algorithms, which are effectively judged by how much they can reduce the file. The most 'efficient' algorithms are not necessarily the quickest to parse: the larger the file, the longer it takes to scan through looking for duplicates. Compression is processor hungry and it takes time to parse a file. Certain 'fast' lossy schemes, like those behind CBR audio and DCT-based JPEG, won't do the job for data because they are lossy. Network data compression has to be lossless, and lossless compression can be slow to parse. Grab any decent-sized file and see how long it takes to zip/unzip it, even on a fast PC.
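To give a rough feel for where the processor time goes, here's a toy Python sketch of the kind of sliding-window duplicate search that LZ77-family compressors do. It's not LZS itself (that's Stac's own variant), and the function name, window size and match threshold are just mine for illustration - but the "hunt backwards for a duplicate at every single position" loop is the bit that eats CPU:

```python
# Toy sketch of LZ77-style matching - NOT real LZS, purely illustrative.
# Real compressors use hash tables and other tricks, but the basic job
# (find duplicates of the upcoming bytes somewhere behind us) is the same.

def compress_sketch(data: bytes, window: int = 2048):
    out = []   # mix of ("literal", byte) and ("match", offset, length)
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        # Inner loop: slide back through the window at EVERY position,
        # looking for the longest run that duplicates what comes next.
        for cand in range(max(0, pos - window), pos):
            length = 0
            while (pos + length < len(data)
                   and cand + length < pos
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= 3:          # long enough to be worth a back-reference
            out.append(("match", best_off, best_len))
            pos += best_len
        else:
            out.append(("literal", data[pos]))
            pos += 1
        # ...and that whole search repeats for every position in the stream.
    return out

print(compress_sketch(b"blah blah blah blah"))
```

The more data per second you push through, the more of those searches per second you need - which is exactly the problem once line speeds climb.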
The most common algorithm used for network data is Lempel-Ziv-Stac (LZS). My dissertation was in compression, and although I've forgotten most of the in-depth stuff now, I've certainly not forgotten how its full name can be difficult to say. Seriously, try pronouncing it several times - I'm not alone in this - and in some circles it was shortened to just "Stac".
As mentioned, Lempel-Ziv-Stac is heavy on processing for the parse: the larger the file, the longer it takes to go through the loop, and there are many, many loops of them inside the algorithm, trying to find duplicates. Also bear in mind that you can't compare a pre-processed compressed file stored on a server for download with having to do it continually on the fly for real-time data throughput. Ok, that out of the way... the reasons why it's not used.
-----
Packet size: Dial-up MTU = 576 bytes. DSL MTU = 1500 bytes. So the header is comparatively small compared to the useful data in a DSL packet, therefore not as large a saving to be made.
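Quick back-of-envelope to show the difference, assuming a typical 40-byte IPv4 + TCP header (the exact figure varies with options):

```python
# Rough illustration of why header savings matter less on DSL.
# Assumes a plain 40-byte IPv4 + TCP header; MTUs as mentioned above.
HEADER = 40
for name, mtu in (("dial-up", 576), ("DSL", 1500)):
    print(f"{name}: {HEADER}-byte header in a {mtu}-byte packet = "
          f"{HEADER / mtu * 100:.1f}% overhead")
# dial-up: ~6.9% of a full packet is header - worth squeezing
# DSL:     ~2.7% - far less to be gained
```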
Processor intensive: Already covered above - the faster the line speed, the more data there is to compress on the fly.
Computing power @ home: With dial-up, the compression sat on the local machine, as the modem was inside the PC. With DSL & a NAT network, it's the router that would have to do the processing, and your average DSL router isn't really equipped to handle this sort of heavy processing. They already juggle acting as multiple 56k modems (DMT), then stuff like FEC, interleaving and now G.INP.
Lag: The processing causes lag - you may not notice, say, 1 ms on 56k, but as bandwidth increases, that same processing delay becomes a much bigger share of the time each packet takes.
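Rough numbers with a made-up fixed 1 ms of processing per packet, just to show the proportion flipping:

```python
# Back-of-envelope: time a full packet spends on the wire at each line rate,
# versus a purely illustrative fixed 1 ms of compression work per packet.
def tx_time_ms(packet_bytes, bits_per_sec):
    return packet_bytes * 8 / bits_per_sec * 1000

for name, mtu, rate in (("56k dial-up", 576, 56_000),
                        ("8 Mbps DSL", 1500, 8_000_000)):
    t = tx_time_ms(mtu, rate)
    print(f"{name}: {t:.1f} ms per packet, so 1 ms of processing "
          f"adds ~{1 / t * 100:.0f}%")
# 56k: ~82 ms per packet  -> 1 ms extra is ~1%, invisible
# DSL: ~1.5 ms per packet -> 1 ms extra is ~67%, very visible
```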
Computing power at the ISP: In the days of 56k, ISP customers numbered in the hundreds or thousands. The ISPs didn't have that many EUs hooked up to one of their routers, but to cope with compression they needed an additional co-processor specifically to do the compression/decompression. Today's ISPs have hundreds of thousands of EUs - some millions. There is no way the ISP gateway routers could handle processing tens of GBs of data. I can't remember exactly now, but I seem to recall that even with the co-processor it was (is) limited to something like 8 Mbps before it caused any noticeable slow-downs. That's 8 Mbps shared across all their customers, so it probably worked well enough when you had 1000 x 56 kbps users.
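Crude scale comparison, with purely illustrative subscriber counts (not real ISP figures):

```python
# Aggregate line rate if every user ran flat out - illustrative numbers only.
dialup_total = 1_000 * 56_000         # 1000 users x 56 kbps
dsl_total = 100_000 * 8_000_000       # 100,000 users x 8 Mbps

print(f"dial-up aggregate: {dialup_total / 1e6:.0f} Mbps")
print(f"DSL aggregate:     {dsl_total / 1e9:.0f} Gbps")
# dial-up: 56 Mbps peak, and only a fraction of users were ever busy at once,
# so an ~8 Mbps compression co-processor could just about cope.
# DSL: 800 Gbps peak - orders of magnitude beyond any co-processor of the era.
```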
Back in the days of dial-up, eking out a few extra kbps made a big difference. Similar to the early days of DSL, when we would tweak the MTU to get a few extra kbps; once the speeds got faster, no-one seemed to bother with MTU tweaking unless it was impacting something other than speed.
PPP header compression was ditched with DSL as it just wasn't worth the effort for very little gain. It's not an ISP-specific thing - it's a worldwide industry thing.
I have failed in my attempts to google this subject - it's the time/era thing that Google doesn't seem to be capable of understanding.
For the reasons I mentioned above, try searching for 'Stac', as it was pronounced by those that actually used the technology. Unfortunately it really does seem to be one of those things whereby if you didn't already know, then there aren't any back links!