New machine learning method introduced
June’s NAACL conference saw machine learning specialists from technology company Iprova present a paper introducing a new and effective method for the unsupervised training of machine learning algorithms to infer sentence embeddings. The NAACL (North American Chapter of the Association for Computational Linguistics) Human Language Technologies (HLT) conference took place at the Hyatt Regency New Orleans hotel, Louisiana, from June 1–6, 2018.
The research paper, entitled “Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features”, was presented by Matteo Pagliardini. Pagliardini is a senior machine learning engineer at Iprova and one of the three scientists who authored the paper and developed Sent2Vec, the new model for unsupervised training. The other authors are Prakhar Gupta and Professor Martin Jaggi of École polytechnique fédérale de Lausanne (EPFL).
Sent2Vec forms part of Iprova’s pioneering technology that provides a data-driven approach for the creation of commercially relevant inventions. Hundreds of patents have been filed based on Iprova’s inventions by some of the world’s most respected technology companies.
While there have been several successes in deep learning in recent years, the paper notes that these have relied almost exclusively on supervised training. Pagliardini cites a paper by Mikolov et al. (2013) as particularly noteworthy for the success of semantic word embeddings (vector representations in which words with similar meanings lie close together) trained without supervision. The new paper presents a way of achieving similar success for longer sequences of text rather than individual words.
“There are very useful semantic representations available for words but producing and learning semantic embeddings for longer text has always proven difficult”, explained Pagliardini. “It was especially challenging to see whether such general-purpose representations could be obtained using unsupervised learning.”
“By taking inspiration from the existing C-BOW model of the Word2Vec algorithm, we were able to develop a computationally efficient method to train sentence embeddings. Our evaluations found that our method achieves a better performance on average than most other models, with a particular proficiency in evaluating sentence similarity. At NAACL HLT, we will explore our research further and explain where future work may take our Sent2Vec model.”
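To make the idea of composing a sentence embedding from word and n-gram features more concrete, here is a minimal sketch in Python. It is not the Sent2Vec implementation: the averaging scheme, the function names (`ngrams`, `sentence_embedding`, `cosine`) and the hand-made two-dimensional `TOY_VECTORS` are all assumptions made for illustration, whereas a real model learns its vectors from large amounts of unlabelled text.

```python
import math

def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_embedding(tokens, vectors, dim, max_n=2):
    """Average the vectors of every known unigram and n-gram in the sentence.

    Assumption: plain averaging of feature vectors; unknown features are skipped.
    """
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    known = [vectors[f] for f in features if f in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical, hand-made vectors: a trained model would learn these.
TOY_VECTORS = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "sleeps": [0.0, 1.0],
    "naps": [0.1, 0.9],
    "stock": [-1.0, 0.0],
    "falls": [0.0, -1.0],
    "cat sleeps": [0.5, 0.5],  # a bigram feature with its own vector
}

a = sentence_embedding(["cat", "sleeps"], TOY_VECTORS, dim=2)
b = sentence_embedding(["dog", "naps"], TOY_VECTORS, dim=2)
c = sentence_embedding(["stock", "falls"], TOY_VECTORS, dim=2)

# The two sentences about sleeping animals should score as more similar
# to each other than to the unrelated sentence.
print(cosine(a, b) > cosine(a, c))
```

With these toy vectors, the two sentences about sleeping animals end up closer to each other than to the unrelated one, which is the kind of sentence-similarity comparison the quote refers to.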
The paper was accepted for the NAACL HLT conference following an extensive review by leading figures in the computational research community. The Sent2Vec model outlined in the paper is open source and available for use.
Those interested in finding out how to use the technology can visit the company’s website and view the research paper.