Sangwhan Moon
date
Feb 7, 2024
Summary
- Day job: OS/Systems Engineering, People management.
- Hobby: Part-time researcher in Natural Language Processing.
- Interests: Computer Vision, Information Retrieval, Distributed Systems, Machine Learning (the non-neural kind), Retro Gaming.
I am a Software Engineering Manager at Google, working on ChromeOS Performance. I have just defended my Ph.D. in natural language processing at Tokyo Tech (now called Institute of Science Tokyo).
Until 2024, I served as an elected member of W3C’s Technical Architecture Group, specializing in reviews of scary web-facing APIs.
Until August 2021, I was the Director of Engineering at Odd Concepts, a startup specializing in computer vision. Most of my work involved the less math-intensive bits: parallelization, scalability, performance, security, and ensuring the office had good coffee.
Until November 2015, I worked at Opera Software's Tokyo office on browser ports for strange platforms, such as chips that go into toasters. The team I worked on seems to have been spun off into Vewd Software, which later became part of Xperi.
After hours, I spend time designing audio equipment or software, tinkering with projects that will probably never be released, or consuming beer.
External
These are other services that have some form of information about me, in order of how frequently I use them. That said, I have been mostly inactive on social networks due to other personal commitments.
- Facebook (Note: Only for people I know in person)
Publications
2024
- Two Counterexamples to Tokenization and the Noiseless Channel (Marco Cognetta, Vilém Zouhar, Sangwhan Moon, Naoaki Okazaki) LREC-COLING 2024 [Paper] [PDF]
2023
- Revisiting Korean Corpus Studies through Technological Advances (Won Ik Cho, Sangwhan Moon, Youngsook Song) PACLIC 2023 [Paper] [PDF]
- North Korean Neural Machine Translation through South Korean Resources (Hwichan Kim, Tosho Hirasawa, Sangwhan Moon, Naoaki Okazaki, Mamoru Komachi) TALLIP Vol. 22 [Paper] [Paywalled]
- Parameter-Efficient Korean Character-Level Language Modeling (Marco Cognetta, Sangwhan Moon, Lawrence Wolf-Sonkin, Naoaki Okazaki) EACL 2023 [Paper] [PDF]
2022
- Predicting Guesses and Slips Through Question Encoding with Complexity Hints (May Kristine Jonson Carlon, Sangwhan Moon, Naoaki Okazaki, Jeffrey S Cross) TALE 2022 [Paper]
- OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation (Sangwhan Moon, Won Ik Cho, Hye Joo Han, Naoaki Okazaki, Nam Soo Kim) LREC 2022 [Paper] [PDF] [Dataset] [Model]
- Learning How to Translate North Korean through South Korean (Hwichan Kim, Sangwhan Moon, Naoaki Okazaki, Mamoru Komachi) LREC 2022 [Paper] [PDF] [Dataset] [Model]
- StyleKQC: A style-variant paraphrase corpus for Korean questions and commands (Won Ik Cho, Sangwhan Moon, Jong In Kim, Seok Min Kim, Nam Soo Kim) LREC 2022 [Paper] [PDF] [Model]
2021
- Effects and mitigation of out-of-vocabulary in universal language models (Sangwhan Moon, Naoaki Okazaki) Journal of Information Processing Vol. 29 [Paper] [PDF]
2020
- Open Korean corpora: A practical report (Won Ik Cho, Sangwhan Moon, Youngsook Song) NLPOSS 2020 [Paper] [PDF]
- PatchBERT: Just-in-time, out-of-vocabulary patching (Sangwhan Moon, Naoaki Okazaki) EMNLP 2020 [Paper] [PDF] [Model]
- Machines getting with the program: Understanding intent arguments of non-canonical directives (Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim) EMNLP 2020 [Paper] [PDF]
- Jamo pair encoding: Subcharacter representation-based extreme Korean vocabulary compression for efficient subword tokenization (Sangwhan Moon, Naoaki Okazaki) LREC 2020 [Paper] [PDF]
2018
- Fast Nearest Neighbor Search Based on Approximate k-NN Graph (Jie Yang, Wan-Lei Zhao, Cheng-Hao Deng, Hanzi Wang, Sangwhan Moon) ICIMCS 2017 [Paper] [Preview-Paywalled]
2016
- A Comparative Study on Features for Similar Image Search ICIMCS 2016 [Paper] [Paywalled]