AUA Public Events

Fast Inference for Language Models

Name: Fast Inference for Language Models
Start: 2022-05-19T19:00:00+04:00
End: 2022-05-19T21:00:00+04:00
Location: 314W, PAB

About the Event:

The PyData Yerevan Second Monthly Meetup is approaching.

Vladimir Orshulevich, NLP research engineer at Unum, will walk us through the next chapter with his talk “Fast Inference for Language Models”.

This will be a remarkable opportunity to:

learn how to speed up inference for vanilla Hugging Face models using TensorRT, ONNX, and the NVIDIA NGC container, which is optimized for GPU acceleration,
get acquainted with ways to make NLP models lighter and faster,
explore batch size and max_length selection,
find out about distributed model inference across GPUs.

About the Speaker:

Vladimir Orshulevich is an NLP research engineer at Unum. He previously worked as a research engineer at SberDevices.

Language: English