{"type":"video","version":"1.0","html":"<iframe src=\"https://www.loom.com/embed/8ce9f339554c456f8f6fd7777605750a\" frameborder=\"0\" width=\"1920\" height=\"1440\" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>","height":1440,"width":1920,"provider_name":"Loom","provider_url":"https://www.loom.com","thumbnail_height":1440,"thumbnail_width":1920,"thumbnail_url":"https://cdn.loom.com/sessions/thumbnails/8ce9f339554c456f8f6fd7777605750a-f712e0e8792663ef.gif","duration":160.838,"title":"Preparing Our Dataset for Analysis","description":"In this video, I walk you through the steps to prepare our dataset for analysis, focusing on setting categorical variables and handling missing values. I highlight that approximately 20% to 30% of our categorical variables have missing values, which we will impute with a placeholder. For continuous variables, we'll fill missing values with the sentinel value -999. I also discuss why we split the data before feature engineering to avoid leakage, and I provide the specific ratios for our train, validation, and test sets. Please make sure to follow these steps as we move forward."}