When running on multiple GPUs, I get the following error:

```
RuntimeError: arguments are located on different GPUs at /pytorch/torch/lib/THC/generic/THCTensorMathBlas.cu:236
```

How can I fix it?