need to wrap more functions for the pipeline (in load data and preprocess)