A cool paper from Facebook AI (not from FAIR!) about detecting and reading text in images, at scale.
This is very useful for detecting inappropriate content on Facebook.
The system uses R-CNN/Detectron for detecting lines of text.
The OCR uses a ConvNet applied to a whole line of text at once and trained with CTC.
This concept of applying a ConvNet on a whole line of text, without prior segmentation, has roots in the early days of ConvNets, for example with this NIPS 1992 paper:
"Multi-Digit Recognition Using a Space Displacement Neural Network"
by Ofer Matan, Chris Burges, Yann LeCun and John Denker.
Link: https://papers.nips.cc/paper/557-multi-digit-recognition-using-a-space-displacement-neural-network
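To make the "whole line, no prior segmentation" idea concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the paper's code. The function name `ctc_greedy_decode`, the choice of index 0 for the blank symbol, and the toy alphabet are all assumptions made for illustration. The ConvNet is assumed to output one score vector per horizontal position of the line image.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC "blank" symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Turn per-position class scores into text without segmenting characters.

    frame_scores[t][k] is the score of class k at horizontal position t.
    Class k >= 1 maps to alphabet[k - 1]; class 0 is the blank.
    """
    # 1. Pick the best class independently at every position.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # 2. Merge consecutive repeats, then drop blanks.
    decoded: List[str] = []
    previous = BLANK
    for label in best_path:
        if label != BLANK and label != previous:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)


# Toy example: 7 positions, classes = (blank, "a", "b").
toy_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.7, 0.1],    # a
    [0.1, 0.2, 0.7],    # b
    [0.1, 0.1, 0.8],    # b (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
]
toy_text = ctc_greedy_decode(toy_scores, "ab")
```

The blank symbol is what lets the network emit a doubled letter ("aa") while still merging repeated predictions of one wide character. At training time, the CTC loss sums over all such alignments, so no per-character bounding boxes are needed.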
YouTube video with a short explanation: https://youtu.be/yl3P2tYewVg
#ocr #cv #dl #rnn #facebook #yannlecun #video