During development of machine learning model about 80% of time is used for data preparation and due to data quality issues, especially when there is a need to combine data from structured and unstructured data sources. Development of smart generic data mart can reduce go to production time for new ML models. We will share creative solutions for challenges we encountered during data transfer between DWH and Data Lake, furthermore data preprocessing, development, deployment/orchestration of ML models, using python/pyspark scripts.