Search results for software > ai

1 result(s)
Quamba2: a robust and scalable post-training quantization framework for selective state space models (SSMs)
Quamba2 highlights: supports W4A8 / W4A16 / W4AX / W8A8 for Mamba1 and Mamba2; achieves 4x memory reduction and 3x generation speedup; enables 8B model inference on Orin Nano 8G at 13 tokens/sec; outperforms W4A8KV4 Llama3-8B in both speed and quality. Background: Deploying state space models (SSMs), which excel at processing long sequences but demand...