There is a deeply symbiotic relationship between machine learning/predictive modeling and Big Data. Machine learning theory asserts that more data is better, and empirical observations suggest that more granular data, a hallmark of Big Data, further improves performance. Predictive modeling is one of the core techniques that measurably delivers value across many industries and thereby demonstrates the value of Big Data. However, there is a surprising paradox of predictive modeling: when you need models most, even all the data is not enough, or simply not suitable. The foundation of predictive modeling is having enough training data with the respective outcomes, preferably IID. But often this data is not available: there are only so many people buying luxury cars online to inform my targeting models. I can never observe what happens BOTH when I treat you AND when I don’t – which is exactly what I would need to make causal claims and measure the impact of strategic decisions. To allocate sales resources, I would love to know a customer’s budget – but perhaps even the customer does not know it. So in this day and age of Big Data, there remains an art to machine learning in situations where the right data is scarce. This talk will present a number of cases where enough of the right data is fundamentally not obtainable and show how creative data science can still solve them.