Writing - Academic Papers, 2017. 5. 14. 22:34
Neural Machine Translation Research Accelerates Dramatically

Academia continues to ramp up its research into neural machine translation (NMT). Five months into the year, the number of papers published in the open-access science archive, arXiv.org, nearly equals the research output for the entire year 2016. The spike confirms a trend Slator reported in late 2016, when we pointed out how NMT steamrolls SMT.

As of May 7, 2017, the Cornell University-run arXiv.org held a total of 137 papers with NMT in their titles or abstracts. Output rose from only seven papers in 2014 to 11 in 2015. But the breakthrough year was 2016, with research output hitting 67 contributions.
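
The 2017 figure is not stated directly, but it can be backed out of the numbers quoted above. The sketch below is only a rough check: it assumes that all 137 papers date from 2014 or later, which the article does not confirm, and the variable names are illustrative.

```python
# Yearly counts of arXiv papers mentioning NMT, as quoted in the article.
papers_by_year = {2014: 7, 2015: 11, 2016: 67}
total_as_of_may_7_2017 = 137

# Assumption: every paper in the total dates from 2014 or later,
# so whatever is left over belongs to 2017.
papers_2017 = total_as_of_may_7_2017 - sum(papers_by_year.values())

# How the partial 2017 count compares with the full year 2016,
# and how fast output grew from 2015 to 2016.
share_of_2016 = papers_2017 / papers_by_year[2016]
growth_2015_to_2016 = papers_by_year[2016] / papers_by_year[2015]

print(papers_2017)                    # 52
print(f"{share_of_2016:.0%}")         # 78%
print(f"{growth_2015_to_2016:.1f}x")  # 6.1x
```

Under that assumption, roughly four months of 2017 account for 52 papers, a little over three quarters of the 2016 total.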

NMT, an approach to machine translation based on neural networks, is seen as the next step in the field's evolution after phrase-based statistical machine translation (SMT) and the earlier rules-based approach.

While many studies and comparative evaluations have pointed to NMT’s advantages in achieving more fluent translations, the technology is still in its nascent stage and interesting developments in the research space continue to unfold.

Most Prolific

At press time, NMT papers submitted in 2017 had been authored by 173 researchers from around the world, the largest group of them (63 researchers) affiliated with universities and research institutes in the US.

The most prolific contributor is Kyunghyun Cho, Assistant Professor at the Department of Computer Science, Courant Institute of Mathematical Sciences, and the Center for Data Science, New York University. Cho logged 14 citations last year.

He has, so far, co-authored three papers this year —  “Nematus: a Toolkit for Neural Machine Translation,” “Learning to Parse and Translate Improves Neural Machine Translation,” and “Trainable Greedy Decoding for Neural Machine Translation” — in collaboration with researchers from the University of Edinburgh, Heidelberg University, and the University of Zurich in Europe; the University of Tokyo and the University of Hong Kong in Asia; and the Middle East Technical University in Turkey.

Aside from Cho, 62 other researchers with an interest in NMT have published their work on arXiv under the auspices of eight American universities: UC Berkeley, Carnegie Mellon, NYU, the MIT Computer Science and Artificial Intelligence Laboratory in Cambridge, Stanford, the Georgia Institute of Technology in Atlanta, Johns Hopkins University, and Harvard.

Sixty-one researchers from Europe have also substantially contributed to the collection, with authors from the UK (18), Germany (11), Ireland (13), and the Netherlands (7) submitting the most papers.

There were also 58 NMT academic papers from Asia, authored by researchers mostly from China, Hong Kong and Taiwan (31), Japan (22), South Korea (3), and Singapore (2).

Tech Firms in the Mix

Research teams from US tech giants such as Facebook Research, Google Brain, IBM Watson, NVIDIA (on whose GPU chips NMT runs), and translation technology pioneer SYSTRAN have also been increasingly contributing their research to arXiv. 

A paper from a team of researchers from Google Brain, for example, offers insights on building and extending NMT architectures and includes an open-source NMT framework to experiment with results.

Researchers from Harvard and SYSTRAN introduced an open-source NMT toolkit, OpenNMT, which provides a library for training and deploying neural machine translation models. They said the toolkit will be further developed “to maintain strong MT results at the research frontier” and provide a stable framework for production use.

Facebook, which announced on May 9, 2017 that it is open-sourcing its NMT model, has one other paper on arXiv. Entitled “Learning Joint Multilingual Sentence Representations with Neural Machine Translation,” it is authored by two members of its AI research team in collaboration with two other researchers, from the Informatics Institute at the University of Amsterdam and the Middle East Technical University.

In Asia, China’s Internet provider, Tencent, has two contributions this year. One is from its AI Lab in Shenzhen (“Modeling Source Syntax for Neural Machine Translation”); the other, from its Mobile Internet Group (“Deep Neural Machine Translation with Linear Associative Unit”), done in collaboration with researchers from Soochow University, Chinese Academy of Sciences, and Dublin University.

The Beijing-based Microsoft Research Asia has also started to contribute its own studies on NMT this year. Two papers (“Adversarial Neural Machine Translation” and “MAT: A Multimodal Attentive Translator for Image Captioning”) were uploaded just this month.

The company’s own researchers have collaborated with other scientists from the University of Science and Technology of China, Sun Yat-sen University, Guangdong Key Laboratory of Information Security Technology, Tsinghua University, UESTC, and Johns Hopkins University.

Surge Will Last

As early as February 2016, an informal survey conducted by Cho indicated that the NMT research boom would have legs.

In a blog post dated February 13, 2016, Cho said he conducted the informal (and, he admits, highly biased) poll mainly to gauge researchers’ opinions about contributing to arXiv. arXiv is not a peer-reviewed journal; it is an automated online distribution system for research papers (e-prints).

“In total, 203 people participated, and they were either machine learning or natural language processing researchers. Among them, 64.5% said their major area of research is machine learning, and the rest natural language processing,” Cho wrote.

That is a large pool of scholars and scientists who could feed the NMT research funnel for years, whether they choose, as Cho puts it, “to arXiv” or “not to arXiv” their work right away.

Eden Estopace

An IT journalist for the past 17 years, Eden has written for the top publications in the Philippines and Asia, covering consumer and enterprise IT. Offline, her interests are creative writing, photography, and film.

Posted by kicho
Writing - Columns/Musings, 2017. 5. 4. 17:23

In Defense of Stripping Myself of the Vote

The presidential campaign is in full swing, racing toward a finish line now only a few days away.

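This time I have decided to strip myself of my own vote, and my wife has agreed to serve as witness to my judgment and decision. Since coming of age I have voted in presidential elections many times, but not once has the result been satisfactory. Sometimes the candidate I chose did not win; but even when my candidate won, no president ever carried out the office satisfactorily. The recent impeachment was the peak of that tragedy. So I am left with nothing to blame but the hand that held the voting stamp in the polling booth and the judgment that moved that hand. That is why, this time, I have decided to sentence myself to a temporary suspension of my civil rights.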

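I have plenty of complaints about the candidates. A president is clearly not a paragon of virtue, nor an erudite scholar, nor a quick-witted battlefield commander, nor a skilled administrator, nor an outstanding merchant, nor an orator who stirs hearts with speech, nor a writer who moves people with the pen. Yet the public wants to elect someone who combines all of these abilities. People want to choose a person of better character and ability than themselves, because they believe only such a person can represent us. But in the end, don't the criteria get so jumbled that we wind up picking 'just anyone'? And don't we then end up like 'a certain country' that elected some × as president and now sits dumbfounded day after day?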

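Listen to what the candidates say and dig into their records. How many shady corners there are. Watch them break into a sweat, at a loss for answers to obvious questions. Compared with me, who has nothing to hold me back, how unfree they are. Watch their pitiable contortions as they spin evasions (遁辭) to slip out of the traps their opponents have dug. Compared with me, a carefree soul (a 'daldal-bosal') whom nobody interrogates about anything, how anxious and tense are the lives they lead.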

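Strip away the trappings of 'president' (or 'candidate') and they too are surely just one of the utterly ordinary everymen, the 'Zhang Sans and Li Sis' (張三李四). The truth lies precisely there. As long as one has the resolve (決氣) to carry out the presidency humbly and with all one's strength, that is enough to be president. The friends I meet every day, and I myself, are utterly ordinary people. Yet the 'political talk' (政談) we share when we meet is a match for any politician's. And above all, they may well stand far above the presidential candidates in morality and common sense. So I sometimes enjoy the imaginative game of draping the trappings of the presidency over one of my friends and sending that friend into the Blue House. Whichever of them were made president would surely do better, I think, than the woman who was driven out by impeachment or the 15 who have now swarmed like moths to a flame to seize that office. As someone put it, the anxiety over 'which swindler will be elected president this time' and 'how much will the people be made to suffer' leaves me with nothing but resentment for the proverb that 'presidents are made by heaven.'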

Posted by kicho