Richard Zanibbi - Formula Indexing, Search, and Entry in the MathSeer Project
April 22 @ 10:30 am - 11:30 am MDT
Richard Zanibbi, Professor of Computer Science
Rochester Institute of Technology
Register for this meeting here: https://boisestate.zoom.us/meeting/register/tJwscO6qqzojHtX5l-w7sM-_7izLQi8qlnRN
Or watch it later on our YouTube: https://www.youtube.com/channel/UCbIwk9__d0L21BVRgRP7EQQ
This talk provides an overview of the MathSeer project, a joint effort among researchers at the Rochester Institute of Technology, Penn State, and the University of Maryland. We are working to create systems where using math in search is relatively easy (our slogan: ‘Math search for the masses’). This requires advances on multiple fronts: improved extraction of formulas from documents; easy-to-use interfaces for formula entry and search, including formula input via handwriting, typing, and images; and search engines that effectively support queries combining text and math.
In this talk, we will focus on techniques used for formula detection, recognition, and indexing in PDF documents. These include extracting characters from PDF files, detecting formulas using deep network-based techniques, and parsing formulas in handwriting and images, using convolutional neural networks to score hypotheses and Edmonds’ arborescence algorithm to extract final interpretations as spanning trees.
Joint work with C. Lee Giles, Douglas Oard, and students from RIT and Penn State, supported by the National Science Foundation and the Alfred P. Sloan Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Alfred P. Sloan Foundation.
Dr. Richard Zanibbi is a Professor of Computer Science at RIT, where he directs the Document and Pattern Recognition Lab (DPRL). His research interests include document recognition, pattern recognition, information retrieval, and human-computer interaction. Dr. Zanibbi received his Ph.D. from Queen’s University (Canada) and held an NSERC Postdoctoral Fellowship at the Centre for Pattern Recognition and Machine Learning (CENPARMI) in Montréal before joining RIT. Dr. Zanibbi co-chaired the International Conference on Frontiers in Handwriting Recognition (ICFHR 2018) and SPIE Document Recognition and Retrieval (DRR 2012 & 2013). His research has been supported by the NSF, Xerox, Google, the French Government, and the Alfred P. Sloan Foundation. Currently, Dr. Zanibbi’s research efforts are focused on the MathSeer project and on the Molecular Maker Lab Institute (MMLI), one of the first NSF-funded AI centers, based at the University of Illinois, Urbana-Champaign. The DPRL’s work for MathSeer and the MMLI focuses on the detection and recognition of graphical notations, and their subsequent use for information retrieval, user interaction, and knowledge extraction.