AUA Public Events
- This event has passed.
Fast Inference for Language Models
May 19 @ 7:00 pm - 9:00 pm +04
About the Event:
The second monthly PyData Yerevan meetup is approaching.
Vladimir Orshulevich, an NLP research engineer at Unum, will walk us through the next chapter with his talk, “Fast Inference for Language Models.”
This will be a remarkable opportunity to:
- learn how to speed up inference for vanilla Hugging Face models using TensorRT, ONNX, and the NVIDIA NGC container, which is optimized for GPU acceleration,
- get acquainted with ways to make NLP models lighter and faster,
- explore how to select batch size and max_length,
- find out about distributed model inference on GPUs.
About the Speaker:
Vladimir Orshulevich is an NLP research engineer at Unum. He previously worked as a research engineer at SberDevices.