Experience

  1. Research Promotion Bureau, University Research Infrastructure Development Division Intern

    Ministry of Education, Culture, Sports, Science and Technology
    Responsibilities include:
    • Trend Analysis of Income Statements for National Universities and Research Organizations
    • Audit of Joint Usage / Research Center, Nohken-Hosei University
  2. Algorithm Developer Intern / Part-time

    MeVitae
    MeVitae applies machine learning techniques to provide resume shortlisting services whilst removing unconscious bias in the process.

    Findings were presented at a SEPnet poster presentation event and were awarded the Best Poster Prize.

    Responsibilities include:

    • Supervision of a Junior Intern
    • Development of an RNN-based NER model using tensorflow.keras
    • Development of a distributed cloud computing project using the Microsoft Service Fabric Actor Model to allow for a horizontally scalable product pipeline

Education

  1. MEng Information Science

    Nara Institute of Science and Technology

    Research Focus:

    • Inductive Knowledge Graph Completion for Scientific Knowledge Graphs
Skills & Hobbies
Technical Skills
Python, PyTorch
FastAPI, Docker, gRPC
SQL, SPARQL, PySpark
Hobbies
Motorsport
Self-Hosting
Coding
Awards
South-East Physics Network: Best Poster Award
SEPnet ∙ June 2020
Neural-network-based Named Entity Recognition (NER) models often require large labelled datasets for training. To speed up dataset construction, a predefined set of named entities can be used to label sentences automatically by tagging every word found in the predefined set. However, NER models trained in such a manner often overfit to the dataset and do not generalise to unseen data. To improve generalisation, we propose a regularisation technique that restricts the model's input to the part-of-speech tags of the sentence. We show that this technique allows the model to correctly predict a sequence even if it is incorrectly labelled in the training dataset.
Poster
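As a rough illustration of the two ideas in the abstract above, the following minimal sketch shows dictionary-based automatic labelling and a part-of-speech-only input encoding. It is not the implementation behind the poster: the function names, the single "ENT" label, the Universal-style POS tags, and the example sentence are all assumptions made for illustration, and the POS tags are assumed to come from any off-the-shelf tagger.

```python
def auto_label(tokens, entity_set):
    """Tag a token "ENT" if it appears in the predefined entity set, else "O".

    Hypothetical helper: a single entity class and case-insensitive
    matching are simplifying assumptions.
    """
    return ["ENT" if tok.lower() in entity_set else "O" for tok in tokens]


def pos_only_features(pos_tags, vocab):
    """Encode a sentence using only its POS tags, so word forms are hidden.

    `vocab` maps each POS tag to an integer id and grows as new tags appear.
    """
    return [vocab.setdefault(tag, len(vocab)) for tag in pos_tags]


# Illustrative sentence; the POS tags are assumed to be supplied by a tagger.
tokens = ["Alice", "joined", "Acme", "in", "Nara"]
pos_tags = ["PROPN", "VERB", "PROPN", "ADP", "PROPN"]

# "Nara" is missing from the predefined set, so it is mislabelled as "O".
entity_set = {"alice", "acme"}

labels = auto_label(tokens, entity_set)
vocab = {}
features = pos_only_features(pos_tags, vocab)

print(labels)    # ['ENT', 'O', 'ENT', 'O', 'O']
print(features)  # [0, 1, 0, 2, 0]
```

In this toy case "Nara" receives the same input id as "Alice" and "Acme", so a model that only sees these ids cannot memorise which specific words were in the predefined set, which is the intuition behind the regularisation described above.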
Languages
Japanese: 100%
English: 100%