Data Description
This dataset contains recordings of ten nurses with more than three years of clinical suctioning experience and twelve nursing students from a university performing Endotracheal Suctioning (ES) on the ESTE-SIM simulation system.
There are two types of data in this dataset:
- Video: recorded from the front of the nurse, across the patient mannequin. Video data is provided only for the training set.
- Pose Skeleton (Keypoints): extracted from the videos using YOLOv7. Keypoint data is provided for both the training and testing sets.
There are a total of 9 activities in the ES procedure. All the activities are listed in the table below.
Table 1. Activities in endotracheal suctioning and their IDs
Activity class ID | Activity name |
---|---|
0 | Catheter preparation |
1 | Temporary removal of the artificial airway |
2 | Suctioning phlegm |
3 | Refitting the artificial airway |
4 | Catheter disinfection |
5 | Discarding gloves |
6 | Positioning |
7 | Auscultation |
8 | Others |
Data Structure
Subject Information
We invited ten nurses and twelve nursing students for data collection. Each participant was asked to perform the ES procedure twice. The dataset is divided into a training set (32 videos) and a submission set (12 videos). The information for the currently available subjects is given below.
Table 2. Subject information
Subject_id | Usage | Experience | Note |
---|---|---|---|
N01T1 | Training | Nurse | |
N01T2 | Training | Nurse | Same person as N01T1 |
N02T1 | Training | Nurse | |
N02T2 | Training | Nurse | Same person as N02T1 |
N03T1 | Submission | Nurse | |
N03T2 | Submission | Nurse | Same person as N03T1 |
N04T1 | Training | Nurse | |
N04T2 | Training | Nurse | Same person as N04T1 |
N05T1 | Submission | Nurse | |
N05T2 | Submission | Nurse | Same person as N05T1 |
N06T1 | Training | Nurse | |
N06T2 | Training | Nurse | Same person as N06T1 |
N07T1 | Training | Nurse | |
N07T2 | Training | Nurse | Same person as N07T1 |
N09T1 | Submission | Nurse | |
N09T2 | Submission | Nurse | Same person as N09T1 |
N11T1 | Training | Nurse | |
N11T2 | Training | Nurse | Same person as N11T1 |
N12T1 | Training | Nurse | |
N12T2 | Training | Nurse | Same person as N12T1 |
S01T1 | Training | Student | |
S01T2 | Training | Student | Same person as S01T1 |
S02T1 | Training | Student | |
S02T2 | Training | Student | Same person as S02T1 |
S03T1 | Training | Student | |
S03T2 | Training | Student | Same person as S03T1 |
S04T1 | Submission | Student | |
S04T2 | Submission | Student | Same person as S04T1 |
S05T1 | Training | Student | |
S05T2 | Training | Student | Same person as S05T1 |
S06T1 | Submission | Student | |
S06T2 | Submission | Student | Same person as S06T1 |
S07T1 | Training | Student | |
S07T2 | Training | Student | Same person as S07T1 |
S08T1 | Training | Student | |
S08T2 | Training | Student | Same person as S08T1 |
S09T1 | Training | Student | |
S09T2 | Training | Student | Same person as S09T1 |
S10T1 | Training | Student | |
S10T2 | Training | Student | Same person as S10T1 |
S11T1 | Training | Student | |
S11T2 | Training | Student | Same person as S11T1 |
S12T1 | Submission | Student | |
S12T2 | Submission | Student | Same person as S12T1 |
The dataset will be published in the following directory structure:
- ./dataset/
- ./video/
- {Training_subject_id}.MTS
- N01T2.MTS
- ...
- S11T1.MTS
- S11T2.MTS
- ./keypoints/
- {Subject_id}_keypoint.csv
- N01T2_keypoint.csv
- ...
- S12T1_keypoint.csv
- S12T2_keypoint.csv
- ./ann/
- {Training_subject_id}_ann.csv
- N01T2_ann.csv
- ...
- S11T1_ann.csv
- S11T2_ann.csv
Video
The ./video/ folder contains 32 MTS videos, one for each training subject ID in Table 2. Videos are recorded at 30 frames per second with a frame size of 1920×1080 pixels.
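If you want to verify these properties locally, the short sketch below (a minimal example, assuming OpenCV built with FFmpeg support; the file name is one of the training videos from the directory structure above) opens a video and prints its frame rate, frame size, and frame count.

```python
import cv2

# Example path; one of the training videos listed in the directory structure.
video_path = "./dataset/video/N01T2.MTS"

cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
    raise IOError(f"Could not open {video_path}")

fps = cap.get(cv2.CAP_PROP_FPS)                    # expected: 30
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))     # expected: 1920
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))   # expected: 1080
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

print(f"fps={fps:.1f}, size={width}x{height}, frames={n_frames}")
cap.release()
```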
Keypoints
Inside the ./keypoints/ folder, we have provided 44 CSV files containing the x and y coordinates and confidence scores of 17 keypoints on the subject's body for each video frame. These keypoints were extracted using YOLOv7; because one row is produced per frame, the keypoint sampling rate is also 30 per second. Occasionally, other people passing by in the background appear in the frame, so we applied post-processing to the skeleton results to keep only the skeleton of the main subject. Each file contains the following columns; a minimal loading sketch follows Table 3.
Table 3. Keypoint data description ({Subject_id}_keypoint.csv)
Column name | Description of column |
---|---|
nose_x | X coordinate value of nose |
nose_y | Y coordinate value of nose |
nose_conf | Confidence value of nose |
left_eye_x | X coordinate value of left eye |
left_eye_y | Y coordinate value of left eye |
left_eye_conf | Confidence value of left eye |
right_eye_x | X coordinate value of right eye |
right_eye_y | Y coordinate value of right eye |
right_eye_conf | Confidence value of right eye |
left_ear_x | X coordinate value of left ear |
left_ear_y | Y coordinate value of left ear |
left_ear_conf | Confidence value of left ear |
right_ear_x | X coordinate value of right ear |
right_ear_y | Y coordinate value of right ear |
right_ear_conf | Confidence value of right ear |
left_shoulder_x | X coordinate value of left shoulder |
left_shoulder_y | Y coordinate value of left shoulder |
left_shoulder_conf | Confidence value of left shoulder |
right_shoulder_x | X coordinate value of right shoulder |
right_shoulder_y | Y coordinate value of right shoulder |
right_shoulder_conf | Confidence value of right shoulder |
left_elbow_x | X coordinate value of left elbow |
left_elbow_y | Y coordinate value of left elbow |
left_elbow_conf | Confidence value of left elbow |
right_elbow_x | X coordinate value of right elbow |
right_elbow_y | Y coordinate value of right elbow |
right_elbow_conf | Confidence value of right elbow |
left_wrist_x | X coordinate value of left wrist |
left_wrist_y | Y coordinate value of left wrist |
left_wrist_conf | Confidence value of left wrist |
right_wrist_x | X coordinate value of right wrist |
right_wrist_y | Y coordinate value of right wrist |
right_wrist_conf | Confidence value of right wrist |
left_hip_x | X coordinate value of left hip |
left_hip_y | Y coordinate value of left hip |
left_hip_conf | Confidence value of left hip |
right_hip_x | X coordinate value of right hip |
right_hip_y | Y coordinate value of right hip |
right_hip_conf | Confidence value of right hip |
left_knee_x | X coordinate value of left knee |
left_knee_y | Y coordinate value of left knee |
left_knee_conf | Confidence value of left knee |
right_knee_x | X coordinate value of right knee |
right_knee_y | Y coordinate value of right knee |
right_knee_conf | Confidence value of right knee |
left_ankle_x | X coordinate value of left ankle |
left_ankle_y | Y coordinate value of left ankle |
left_ankle_conf | Confidence value of left ankle |
right_ankle_x | X coordinate value of right ankle |
right_ankle_y | Y coordinate value of right ankle |
right_ankle_conf | Confidence value of right ankle |
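For convenience, here is a minimal sketch for loading one keypoint file into a (frames, 17, 3) array. It assumes pandas and NumPy are available and that the file contains exactly the 51 columns listed in Table 3; the path used is only an example.

```python
import numpy as np
import pandas as pd

# Keypoint order as listed in Table 3 (COCO-style, 17 joints).
KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def load_keypoints(csv_path: str) -> np.ndarray:
    """Return an array of shape (n_frames, 17, 3) with (x, y, conf) per joint."""
    df = pd.read_csv(csv_path)
    cols = [f"{k}_{s}" for k in KEYPOINTS for s in ("x", "y", "conf")]
    return df[cols].to_numpy().reshape(len(df), 17, 3)

# Example path; one row per frame, so 30 rows correspond to one second.
kp = load_keypoints("./dataset/keypoints/N01T2_keypoint.csv")
print(kp.shape)
```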
Annotation (ann)
Inside the ./ann/ folder, we have provided 32 CSV files containing the annotations for each video in the Training set (see Table 2). Each file contains the following columns; a sketch for expanding these intervals into per-second labels follows Table 4.
Table 4. Annotation data description ({Training_subject_id}_ann.csv)
Column name | Description |
---|---|
start_time | Start time of the activity (in seconds) |
stop_time | Stop time of the activity (in seconds) |
annotation_str | Activity name |
annotation | Activity ID (described in Table 1) |
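The annotation intervals can be expanded into one label per second to match the submission format. The sketch below is one possible way to do this; the fallback to class 8 ("Others") for seconds not covered by any interval and the rounding at interval boundaries are our assumptions, not part of the official specification.

```python
import numpy as np
import pandas as pd

def per_second_labels(ann_csv: str, duration_s: int, default_label: int = 8) -> np.ndarray:
    """Expand (start_time, stop_time, annotation) intervals into one label per second.

    Seconds not covered by any interval fall back to `default_label`
    (assumed here to be class 8, "Others").
    """
    labels = np.full(duration_s, default_label, dtype=int)
    ann = pd.read_csv(ann_csv)
    for _, row in ann.iterrows():
        start = int(np.floor(row["start_time"]))
        stop = int(np.ceil(row["stop_time"]))
        labels[start:min(stop, duration_s)] = int(row["annotation"])
    return labels

# Example usage: the video length in seconds can be derived from the keypoint
# file, since keypoints are sampled at 30 rows per second.
# y = per_second_labels("./dataset/ann/N01T2_ann.csv", duration_s=len(kp) // 30)
```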
Data Usage
You can obtain the dataset download link by registering to participate in the challenge.
Rules
We provide video and skeleton data for the Training set. The skeleton data are extracted from the videos using YOLOv7, with our post-processing applied to identify and track the main subject in each video. Because of limitations on the camera position, the skeleton data capture only certain portions of the body. In line with the challenge's objective, participants may use only the provided skeleton data for activity recognition in the testing phase. Participants must incorporate a generative model or large language model somewhere in their approach. The video data may be used only for training, or for creative exploration with generative AI.
For the testing set, only skeleton data are published. From the provided skeleton data, participants are required to propose their pipelines, predict an activity label for each second of the submission set, and submit the results as shown in the tutorial.
The submission file contains the columns detailed below; a sketch for writing this file follows the example rows.
- subjectID: Corresponding participant’s ID (see Table 2)
- timestamp: Each second
- activityID: The activity class predicted to happen in that second (see Table 1)
subjectID | timestamp | activityID |
---|---|---|
N03T1 | 00:00 - 00:01 | 0 |
N03T1 | 00:01 - 00:02 | 0 |
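A minimal sketch for writing a submission file in this format is shown below. The exact header capitalization and timestamp format should be checked against the tutorial; here we simply follow the bullet list and the sample rows above.

```python
import csv

def write_submission(path: str, predictions: dict[str, list[int]]) -> None:
    """Write per-second predictions in the format of the example rows above.

    `predictions` maps a subject ID (e.g. "N03T1") to a list of activity IDs,
    one per second. The "MM:SS - MM:SS" timestamp format follows the sample rows.
    """
    def mmss(t: int) -> str:
        return f"{t // 60:02d}:{t % 60:02d}"

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["subjectID", "timestamp", "activityID"])
        for subject_id, labels in predictions.items():
            for sec, label in enumerate(labels):
                writer.writerow([subject_id, f"{mmss(sec)} - {mmss(sec + 1)}", label])

# Example: two seconds of predictions for subject N03T1.
write_submission("submission.csv", {"N03T1": [0, 0]})
```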
For evaluation, we will consider the F1 score together with the contents of the accompanying paper. The F1 score is averaged over all subjects (see the scoring sketch after the baseline table). Baseline results for each subject are shared in the table below.
Subject ID | Accuracy | F1 score |
---|---|---|
N03T1 | 0.50 | 0.39 |
N03T2 | 0.49 | 0.44 |
N05T1 | 0.39 | 0.28 |
N05T2 | 0.51 | 0.36 |
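As an illustration of the scoring described above, the following sketch averages a per-subject F1 score over the submission subjects. The averaging mode within each subject (macro vs. weighted) is not specified in this description, so macro averaging with scikit-learn is assumed here.

```python
from statistics import mean

from sklearn.metrics import f1_score

def challenge_score(y_true_by_subject: dict, y_pred_by_subject: dict) -> float:
    """Average a per-subject F1 score over all submission subjects.

    Both arguments map a subject ID to a list of per-second activity IDs.
    Macro averaging within each subject is an assumption, not an official rule.
    """
    per_subject = [
        f1_score(y_true_by_subject[s], y_pred_by_subject[s],
                 average="macro", zero_division=0)
        for s in y_true_by_subject
    ]
    return mean(per_subject)
```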