Why is the model loaded twice? I'm curious.

Once in 8-bit: https://github.com/sumo43/loopvlm/blob/e15de9bbbcc3eb4019e56b701dc7fc8669564f89/generate.py#L302

Once in bfloat16: https://github.com/sumo43/loopvlm/blob/e15de9bbbcc3eb4019e56b701dc7fc8669564f89/generate.py#L321