About NTCIR-10 “Medical Natural Language Processing (MedNLP)” Pilot Task

This NTCIR-10 pilot task calls on participants to extract important information (personal and medical) from clinical text written in Japanese.


Medical records are increasingly written in electronic form rather than on paper, which makes information processing techniques more important in the medical field. In this pilot task, participants extract important information from medical documents written in Japanese. Such extraction is one of the basic technologies needed to build computational systems that support a wide range of medical services.

Our goal is to promote and support the development of practical tools and systems for the medical industry that support medical decisions and treatment by physicians and medical staff. The short-term objective of this pilot task is to evaluate basic information extraction techniques in the medical field. The underlying objective is to offer a forum for pursuing that goal through a community-based approach: to bring together people interested in this problem and to facilitate communication and discussion among them, so that the issues to be solved are clarified and the elemental technologies are defined.


Participants extract information from medical reports written by physicians. We plan to hold the following three types of tasks:

De-identification task

A task to add the following tags to the given reports:

<a>: age, <t>: time, <h>: hospital, <l>: location,
<p>: person, <x>: sex

Complaint and diagnosis task

A task to add the following tags to the given reports:

<c>: complaint, <d>: diagnosis

Free task

A task that welcomes practical and/or creative ideas beyond the two tasks above.
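As an illustration of what "adding tags to a report" can look like in practice, here is a minimal Python sketch. It assumes a system has already found character spans to annotate; the function name `add_tags`, the span format `(start, end, tag)`, and the restriction to non-overlapping spans are assumptions made for this example and are not part of the task definition.

```python
from typing import List, Tuple

# Tag inventory taken from the task descriptions above.
DEID_TAGS = {"a": "age", "t": "time", "h": "hospital",
             "l": "location", "p": "person", "x": "sex"}
CLINICAL_TAGS = {"c": "complaint", "d": "diagnosis"}
KNOWN_TAGS = {**DEID_TAGS, **CLINICAL_TAGS}


def add_tags(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Wrap each (start, end, tag) character span of `text` in <tag>...</tag>.

    Hypothetical helper: spans must be non-overlapping and use known tags.
    """
    pieces = []
    cursor = 0
    for start, end, tag in sorted(spans):
        if tag not in KNOWN_TAGS:
            raise ValueError(f"unknown tag: {tag}")
        if start < cursor or start >= end or end > len(text):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        pieces.append(text[cursor:start])                  # untouched text
        pieces.append(f"<{tag}>{text[start:end]}</{tag}>")  # tagged span
        cursor = end
    pieces.append(text[cursor:])                           # trailing text
    return "".join(pieces)


report = "明らかな運動麻痺はみられず。"
start = report.index("運動麻痺")
tagged = add_tags(report, [(start, start + len("運動麻痺"), "c")])
print(tagged)
```

This sketch emits bare tags only; attributes such as a modality would need an extended span format.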


The dataset for this pilot task contains medical history summaries of imaginary patients written by physicians:

Example input data: plain text







(6 sentences; each sentence finishes with a small circle “。”.)
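Because each sentence ends with "。", a report can be split into sentences with a simple pattern. The following Python sketch shows one way to do this; the function name `split_sentences` is hypothetical, and the second sentence in the sample is invented for illustration and is not taken from the task data.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split Japanese text into sentences, keeping the closing "。"."""
    # Either a run of characters ending in "。", or a trailing
    # fragment at the end of the text that has no closing "。".
    parts = re.findall(r"[^。]+。|[^。]+$", text)
    return [p.strip() for p in parts if p.strip()]


# The second sentence is a made-up example.
sample = "明らかな運動麻痺はみられず。経過観察とした。"
for sentence in split_sentences(sample):
    print(sentence)
```

Real clinical text may contain "。" inside quotations or may omit it entirely, so a production system would likely need more careful segmentation.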

Example output data: tagged text




明らかな<c modality="negation">運動麻痺</c>はみられず。
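To show how tagged output of this kind could be read back by a program, here is a minimal Python sketch that extracts each tag, its attributes, and the enclosed text. It assumes single-letter, non-nested tags with quoted attribute values; the names `Annotation` and `extract_annotations` are hypothetical and do not come from any official tool for this task.

```python
import re
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    tag: str                    # e.g. "c" for complaint
    attributes: Dict[str, str]  # e.g. {"modality": "negation"}
    text: str                   # the annotated surface string


# Assumption: tags are single lowercase letters and are not nested.
TAG_PATTERN = re.compile(
    r'<([a-z])((?:\s+\w+="[^"]*")*)\s*>(.*?)</\1>', re.DOTALL
)
ATTR_PATTERN = re.compile(r'(\w+)="([^"]*)"')


def extract_annotations(tagged: str) -> List[Annotation]:
    """Return all annotations found in a tagged sentence or report."""
    # Typographic quotes sometimes replace straight quotes in copied text.
    normalized = tagged.replace("\u201c", '"').replace("\u201d", '"')
    return [
        Annotation(m.group(1), dict(ATTR_PATTERN.findall(m.group(2))), m.group(3))
        for m in TAG_PATTERN.finditer(normalized)
    ]


line = '明らかな<c modality="negation">運動麻痺</c>はみられず。'
for ann in extract_annotations(line):
    print(ann.tag, ann.attributes, ann.text)
```

Here the complaint "運動麻痺" (motor paralysis) carries a negation modality, meaning the report says the symptom was not observed.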




Registration open 2012-11-01
Registration deadline 2012-11-30
Send sample data (for development) 2012-12-10
Send test data (for evaluation) 2013-01-25
Submission deadline 2013-02-01
Early draft overview papers by task organizers (TOs) released to participants 2013-02-15
Draft participants abstracts* due 2013-03-01
All camera ready abstracts* due 2013-05-01
NTCIR-10 Workshop Meeting, NII 2013-06-18/21

(*Abstracts describing the methods tested in this pilot task. Non-peer reviewed.)


Please register via the registration link to participate in this pilot task.

Registered participants are automatically subscribed to the NTCIR-MedNLP-particpants mailing list.

Expected participants

Anyone involved in medical and/or informatics studies. This pilot task is open to participants of any nationality who can develop tools for processing Japanese.


Please refer to the following paper:

Mizuki MORITA, Yoshinobu KANO, Tomoko OHKUMA, Mai MIYABE, Eiji ARAMAKI:
Overview of the NTCIR-10 MedNLP task,
In Proceedings of NTCIR-10, 2013. (2013/06/18, Tokyo, Japan)

Organizing committee




Contact address