Ah, it's from a symposium

An implementation exists here: https://github.com/joukewitteveen/hint

- The idea behind the minimum description length (MDL) principle is that it is possible to do induction by compression.
- Here they take a popular MDL algorithm, Krimp, and extend it to real-valued data
- “Krimp seeks frequent itemsets: attributes that co-occur unusually often in the dataset. Krimp employs a mining scheme to heuristically find itemsets that compress the data well, gauged by a decoding function based on the Minimum Description Length Principle.”
- RealKrimp “…finds interesting hyperintervals in real-valued datasets.”
- “The Minimum Description Length (MDL) principle [2,3] can be seen as the more practical cousin of Kolmogorov complexity [4]. The main insight is that patterns in a dataset can be used to compress that dataset, and that this idea can be used to infer which patterns are particularly relevant in a dataset by gauging how well they compress: the authors of [1] summarize it by the slogan Induction by Compression. Many data mining problems can be practically solved by compression.”
- “An important piece of mathematical background for the application of MDL in data mining, which is relevant for both Krimp and RealKrimp, is the Kraft Inequality, relating code lengths and probabilities.” They extend the Kraft Inequality to continuous spaces.
- <Ok skipping most – interesting but tight on time.>
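
To make the Kraft Inequality note concrete for myself, here is a minimal sketch of the standard discrete version only: a prefix code with lengths L(x) must satisfy sum over x of 2^(-L(x)) <= 1, and any distribution p gives valid lengths via L(x) = ceil(-log2 p(x)). This is not taken from the paper or from the linked implementation, and it does not cover the continuous extension; the function names and toy distributions are my own assumptions.

```python
import math

def shannon_code_lengths(probs):
    """Integer code lengths L(x) = ceil(-log2 p(x)) for each outcome x.

    Hypothetical helper for illustration; not from the paper.
    """
    return {x: math.ceil(-math.log2(p)) for x, p in probs.items() if p > 0}

def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality: sum over x of 2^(-L(x))."""
    return sum(2.0 ** -length for length in lengths.values())

# Toy distribution (made up): dyadic probabilities, so the bound is tight.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = shannon_code_lengths(probs)
print(lengths)             # likely outcomes get short codes
print(kraft_sum(lengths))  # must not exceed 1 for a prefix code to exist

# Non-dyadic probabilities: rounding lengths up leaves some slack below 1.
skewed = shannon_code_lengths({"x": 0.7, "y": 0.2, "z": 0.1})
print(skewed, kraft_sum(skewed))
```

The point for MDL is the correspondence in both directions: probable things get short codes, so "compresses well" and "has high probability under the model" are two ways of saying the same thing.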
