Several tasks in quantum-information processing involve quantum learning. For example, quantum sensing, quantum machine learning and quantum-computer calibration involve learning and estimating unknown parameters from measurements of many copies of a quantum state that depends on those parameters. This type of metrological information is quantified by the quantum Fisher information matrix, which bounds the average amount of information learnt about the parameters per measurement of the state. In this talk, I will show that the quantum Fisher information about parameters encoded in N copies of a state can be compressed into M copies of a related state, where M ≪ N. I will show that M/N can be made arbitrarily small and that the compression can occur without any loss of information. I will also demonstrate how to construct filters that perform this unbounded and lossless information compression. Our results are important not only theoretically but also practically: in several technologies, it is advantageous to compress information into as few states as possible, for example, to avoid detector saturation and/or to reduce post-processing costs. Our filters can arbitrarily reduce the quantum-state intensity arriving at experimental detectors while retaining all of the initial information. I will discuss our recent experimental demonstration of this compression. Finally, I will prove that the ability to distil quantum Fisher information is a non-classical advantage that stems from the negativity of a particular quasiprobability distribution, a quantum extension of a probability distribution.
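For background, the sense in which the quantum Fisher information matrix bounds the information learnt per measurement is the standard multiparameter quantum Cramér–Rao bound (a textbook result included here for context, not one of the talk's new claims):

```latex
% Quantum Cramér–Rao bound: for any unbiased estimator \hat{\theta}
% constructed from measurements of \nu copies of the state \rho(\theta),
\mathrm{Cov}(\hat{\theta}) \;\geq\; \frac{1}{\nu}\, F_Q(\theta)^{-1},
% where F_Q(\theta) is the quantum Fisher information matrix, defined
% through the symmetric logarithmic derivatives L_j, which satisfy
\partial_{\theta_j} \rho(\theta)
  = \tfrac{1}{2}\bigl( L_j \rho(\theta) + \rho(\theta) L_j \bigr),
\qquad
F_Q(\theta)_{jk}
  = \mathrm{Re}\,\mathrm{Tr}\!\bigl[ \rho(\theta)\, L_j L_k \bigr].
```

Lossless compression, in this language, means producing M ≪ N copies of a related state whose total quantum Fisher information matrix equals that of the original N copies.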