Tackling machine learning problems

Understand the problem

What is the problem you’re trying to solve?
What is the goal? Is the goal a business objective?
If so, the goal likely needs to be reframed or broken down into a machine learning problem.
Is it a generative, compression, regression, or classification problem?

Data preparation

What data sources are you working with? e.g., relational, columnar, graph, or document stores.
Is the data unstructured or structured?
Do we need to collect more data?
Would acquiring additional third-party data sources benefit the project?

Feature engineering

How will we handle missing data? e.g., impute values using the mean, median, or mode, or train a model to predict the missing values.
What are the risks of imputing values?
How will we standardize or normalize variables?
How will we handle skewed variables? e.g., log scaling.
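
The feature engineering questions above can be made concrete with a small sketch. It uses only the Python standard library, and the `incomes` column and the helper names (`impute`, `standardize`, `log_scale`) are hypothetical, chosen for illustration. It also hints at one risk of imputation: filling gaps with a single value shrinks the variance of the column, and the mean is pulled toward outliers.

```python
import math
import statistics

def impute(values, strategy="median"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    pick = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }[strategy]
    fill = pick(observed)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Rescale to zero mean and unit variance (z-scores)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

def log_scale(values):
    """Compress a right-skewed, non-negative variable with log(1 + x)."""
    return [math.log1p(v) for v in values]

# Hypothetical income column: two missing entries and one large outlier.
incomes = [30.0, 32.0, None, 35.0, 31.0, None, 400.0]
observed = [v for v in incomes if v is not None]

mean_filled = impute(incomes, "mean")      # gaps become 105.6, dragged up by the outlier
median_filled = impute(incomes, "median")  # gaps become 32.0, robust to the outlier

# Risk check: mean imputation leaves the mean unchanged but shrinks the variance.
variance_before = statistics.pvariance(observed)
variance_after = statistics.pvariance(mean_filled)

z_scores = standardize(median_filled)
log_incomes = log_scale(median_filled)
```

Comparing `variance_before` with `variance_after` is one quick way to see how much an imputation strategy distorts a column before committing to it.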

Model selection and evaluation

Start with a simple model as a baseline.
What is the bias-variance trade-off?
How will we evaluate the model?
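
As a rough sketch of the baseline-first approach, the example below compares a majority-class baseline with a one-feature threshold rule on held-out data. The toy data, the helper names, and the choice of accuracy as the metric are all assumptions made for illustration; a real project would pick a metric that matches the goal.

```python
from collections import Counter
import statistics

def majority_baseline(train_labels):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda x: most_common

def threshold_model(train_x, train_y):
    """Simple candidate: split at the midpoint between the two class means."""
    mean_0 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 0)
    mean_1 = statistics.mean(x for x, y in zip(train_x, train_y) if y == 1)
    cut = (mean_0 + mean_1) / 2
    return lambda x: 1 if x > cut else 0

def accuracy(predict, features, labels):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(predict(x) == y for x, y in zip(features, labels))
    return correct / len(labels)

# Hypothetical toy data: one numeric feature, binary label.
train_x = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 0, 0, 0, 1, 1, 1]
test_x = [2.5, 5.0, 7.5, 8.5]
test_y = [0, 0, 1, 1]

baseline = majority_baseline(train_y)
candidate = threshold_model(train_x, train_y)

# Evaluate both on data that was not used for fitting.
baseline_acc = accuracy(baseline, test_x, test_y)
candidate_acc = accuracy(candidate, test_x, test_y)
```

A candidate model is only worth its added complexity if it clearly beats `baseline_acc` on held-out data.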