Like most of healthcare and the life sciences, the pharmaceutical industry is undergoing a data-driven transformation. The industry-wide need to reduce the cost of developing, manufacturing and distributing drugs while bringing new products to market is not a new challenge. However, the ability to process and analyze large amounts of data using massively parallel processing (MPP) technologies means that innovation is no longer limited to traditional hypothesis-driven approaches. New technologies and approaches make it possible to incorporate all available data, structured and unstructured. At Pivotal, the goal of our data science practice is to demonstrate the capabilities of the technologies we offer. We focus on building predictive models that combine the vast and varied data available to prompt action or generate insights. In our talk we will focus on a use case in pharmaceutical manufacturing, in which we created a predictive model to help produce more consistent, high-quality products and to inform decisions to abandon lots with expected poor outcomes. In addition, we demonstrate how we used machine learning to cleanse data, to improve the efficiency of data collection by identifying measurements with low information content, and to incorporate under-utilized data sources in manufacturing. Beyond this use case, we will discuss our vision of using machine learning in all areas of the industry, from research through distribution, to drive change.