Thank you for your code and awesome project, I found it really useful and informative for a current project I'm working on. I was able to train the DeepLabv3 model on a welding joint semantic segmentation problem within 8 epochs (code is here, with credit to you!). I didn't initially apply any transforms. Rather, the masks were produced in MATLAB and those were used as the ground truth. The white on the masks registered as 255 instead of 1, so I had to make sure to divide by 255. I made one modification while trying to produce a sample as well, but otherwise, I used your approach. My images were 480 x 640 too. You could probably use cv2 here as well, but I wanted to keep it the same as how I was setting up my input data.
import numpy as np
import cv2
import torch
from PIL import Image

# (480, 640, 3) -> transpose -> (3, 640, 480) -> swapaxes -> (3, 480, 640) -> reshape adds the batch axis
image2 = np.reshape(np.swapaxes(np.transpose(Image.open(IMAGES + 'tjoint_123105000329.png')), 1, 2), (1, 3, 480, 640))
mask2 = cv2.imread(MASKS + 'tjoint_123105000329_output.png')
with torch.no_grad():
    b = baseline(torch.from_numpy(image2).type(torch.cuda.FloatTensor))
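For anyone hitting the same 255-vs-1 issue, here's a minimal sketch of the normalization step (just an illustrative NumPy example with a toy array, not my exact code):

```python
import numpy as np

# Simulate a binary mask as produced in MATLAB: background = 0, white (weld joint) = 255
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Scale down so the labels are valid class indices 0 and 1
mask = (mask // 255).astype(np.int64)
print(mask)  # [[0 1] [1 0]]
```

After this, the mask can be passed to a cross-entropy-style loss, which expects integer class indices rather than raw pixel intensities.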