It simply doesn't work when BERT is set as the pre-trained model.