Compression schemes for mining large datasets :