title: A Novel Greeting Selection System for a Culture-Adaptive Humanoid Robot
author: Tatsuki KANAGAWA <br> Yasutaka HIGA
profile: Concurrency Reliance Lab
lang: Japanese

# Abstract: Robots and cultures
* Robots, especially humanoids, are expected to perform human-like actions and adapt to our ways of communication in order to facilitate their acceptance in human society.
* Among humans, rules of communication change depending on background culture.
* Greetings are a part of communication in which cultural differences are strong.

# Abstract: Summary of this paper
* In this paper, we present the modelling of the social factors that influence greeting choice,
* and the resulting novel culture-dependent system for selecting greeting gestures and words.
* An experiment with German participants was run using the humanoid robot ARMAR-IIIb.

# Introduction: Acceptance of humanoid robots
* Acceptance of humanoid robots in human societies is a critical issue.
* One of the main factors is the relationship between the background culture of human partners and acceptance.
    * ecologies, social structures, philosophies, educational systems.

# Introduction: Culture adapted greetings
* In the work of Trovato et al., culture-dependent acceptance and discomfort relating to greeting gestures were found in a comparative study with Egyptian and Japanese participants.
* The importance of culture-specific customization of greetings was thus confirmed.
* Acceptance of robots can be improved if they are able to adapt to different kinds of greeting rules.

# Introduction: Greeting interaction with robots
* Robots are expected to interact and communicate with humans of different cultural background in a natural way.
* It is therefore important to study greeting interaction between robots and humans.
    * ARMAR-III: greeted the Chancellor of Germany with a handshake
    * ASIMO: is capable of performing a wider range of greetings
    * (a handshake, waving both hands, and bowing)

# Introduction: Objectives of this paper
* The robot should be trained with sociology data related to one country, and evolve its behaviour by engaging with people of another country in a small number of interactions.
* As the experiment is carried out in Germany, the interactions are with German participants, while preliminary training is done with Japanese data, which is culturally extremely different.

# Introduction: ARMAR-IIIb
<img src="pictures/ARMAR-IIIb.png" style='width: 350px; height: 350px; margin-left: 200px;'>

# Introduction: Target scenario
* The idea behind this study is a typical scenario in which a foreigner visiting a country for the first time greets local people in an inappropriate way as long as he is unaware of the rules that define the greeting choice.
    * (e.g., a Westerner in Japan)
* For example, he might want to shake hands or hug, and will receive a bow instead.

# Introduction: Objectives of this work
* This work is an application of a study of sociology into robotics.
* Our contribution is to synthesize the complex and sparse data related to greeting types into a model;
* create a selection and adaptation system;
* and implement the greetings in a way that can potentially be applied to any robot.

# Greeting Selection: Greetings among humans
* Greetings are the means of initiating and closing an interaction.
* We desire that robots be able to greet people in a similar way to humans.
* For this reason, understanding current research on greetings in sociological studies is necessary.
* Moreover, depending on cultural background, there can be different rules of engagement in human-human interaction.

# Greeting Selection: Solution for selection
* A unified model of greetings does not seem to exist in the literature, but a few studies have attempted a classification of greetings.
* Some more specific studies have been done on handshaking.

# Greeting Selection: Classes for greetings
* A classification of greetings was first attempted by Friedman based on intimacy and commonness.
* The following greeting types were mentioned: smile; wave; nod; kiss on mouth; kiss on cheek; hug; handshake; pat on back; rising; bow; salute; and kiss on hand.
* Greenbaum et al. also performed a gender-related investigation, while [24] contained a comparative study between Germans and Japanese.

# Greeting Selection: Factors on Classification
* 'terms': same terms with different meanings, or different terms with the same meaning.
* 'location': influences intimacy and greeting words (private or public).
* 'intimacy': is influenced by physical distance, eye contact, gender, location, and culture (social distance).
* 'time': the time of day is important for the choice of words.
* 'politeness', 'power relationship', 'culture', and more.

# Greeting Selection: Factors on Classification
* The factors to be cut from the model are greyed out.

<img src="pictures/factors.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Model of Greetings: Assumptions (1 - 5)
* The simplification was guided by the following ten assumptions.
* Only two individuals (a robot and a human participant): we do not take into consideration a higher number of individuals.
* Eye contact is taken for granted.
* Age is considered part of 'power relationship'.
* Regionality is not considered.
* Setting is not considered.

# Model of Greetings: Assumptions (6 - 10)
* Physical distance is close enough to allow interaction
* Gender is intended to be a same-sex dyad
* Affect is considered together with 'social distance'
* Time since the last interaction is partially included in 'social distance'
* Intimacy and politeness are not necessary

# Model of Greetings: Basis of classification
* Input
    * All the other factors are then considered features of a mapping problem.
    * They are categorical data, as they can assume only two or three values.
* Output
    * The outputs can also assume only a limited set of categorical values (a minimal encoding sketch follows below).
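
As a rough illustration only (the feature names and values below are assumptions derived from the factors listed above, not the paper's exact encoding), a context vector and the candidate output classes could look like this in Python:

```python
# Hypothetical encoding of one context vector; the actual categories in the
# paper may differ. Each feature is categorical with only two or three values.
context = {
    "location": "workplace",            # e.g. 'workplace' / 'public'
    "social_distance": "acquaintance",  # e.g. 'unknown' / 'acquaintance' / 'friend'
    "power_relationship": "equal",      # e.g. 'higher' / 'equal' / 'lower'
    "time": "morning",                  # e.g. 'morning' / 'afternoon' / 'evening'
}

# The outputs are likewise drawn from small categorical sets.
gesture_classes = ["bow", "nod", "handshake", "hug", "wave"]
word_classes = ["ohayou gozaimasu", "konnichi wa", "otsukaresama desu"]
```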

# Model of Greetings: Features, mapping discriminants, classes, and possible status
<img src="pictures/classes.png" style='width: 60%; margin-left: 150px;'>

# Model of Greetings: Overview of the greeting model
* The greeting model takes context data as input and produces the appropriate robot posture and speech for that input.
* The two outputs are evaluated by the participants of the experiment through written questionnaires.
* The training data obtained from these evaluations are given as feedback to the two mappings.

# Model of Greetings: Overview of the greeting model
<img src="pictures/model_overview.png" style='width: 75%; margin-left: 120px;'>

# Greeting selection system training data
* Mappings can be trained to an initial state with data taken from the sociology literature.
* Training data should be classified through some machine learning method or formula.
* We decided to use conditional probabilities, in particular the Naive Bayes formula, to map the data.
* Naive Bayes only requires a small amount of training data (a minimal sketch follows below).
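
A minimal sketch of how the Naive Bayes formula could map such a categorical context vector to greeting-type probabilities. The function name, data layout, and Laplace smoothing are our assumptions for illustration, not the paper's implementation:

```python
from collections import Counter, defaultdict

def naive_bayes_probs(training, context):
    """Estimate P(greeting | context) with the Naive Bayes formula.

    `training` is a small list of (context_dict, greeting_label) pairs,
    e.g. taken from the sociology literature; `context` is a dict of
    categorical feature values. Laplace smoothing avoids zero probabilities
    for unseen feature values.
    """
    labels = Counter(label for _, label in training)
    counts = defaultdict(Counter)          # (label, feature) -> value counts
    for ctx, label in training:
        for feat, value in ctx.items():
            counts[(label, feat)][value] += 1

    probs = {}
    for label, n_label in labels.items():
        p = n_label / len(training)        # prior P(greeting)
        for feat, value in context.items():
            seen = counts[(label, feat)]
            p *= (seen[value] + 1) / (n_label + len(seen) + 1)  # smoothed likelihood
        probs[label] = p

    total = sum(probs.values()) or 1.0
    return {label: p / total for label, p in probs.items()}
```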

# Model of Greetings: Details of training data
* While training data for gestures can be obtained from the literature, training data for words can also be obtained from text corpora.
* English: English corpora, such as the British National Corpus or the Corpus of Historical American English, are used.
* Japanese: extracted from the data sets in [24, 37, 41-43], since analysing Japanese corpora is difficult.

# Model of Greetings: Location Assumption
* The location of the experiment was Germany.
* For this reason, the only dataset needed was the Japanese one.
* As stated in the motivations at the beginning of this paper, the robot should initially behave like a foreigner.
* ARMAR-IIIb, trained with Japanese data, will have to interact with German people and adapt to their customs.

# Model of Greetings: Mappings and questionnaires
* The mapping is represented by a dataset, initially built from training data, as a table containing weights for each context vector corresponding to each greeting type.
* We now need to update these weights.

# Model of Greetings: Feedback from three questionnaires
* Whenever a new feature vector is given as an input, it is checked to see whether it is already contained in the dataset or not.
* In the former case, the weights are directly read from the dataset;
* in the latter case, they are assigned the values of the probabilities calculated through the Naive Bayes classifier.
* The output is the chosen greeting, after which the interaction is evaluated through a questionnaire (see the sketch below).
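
A sketch of that lookup-or-estimate step, reusing the hypothetical `naive_bayes_probs` helper from the earlier sketch; the table layout and names are assumptions:

```python
def select_greeting(weights_table, training, context):
    """Choose a greeting for a context vector (a dict of categorical features).

    `weights_table` maps a frozen context vector to per-greeting weights.
    Known vectors use their stored weights; unseen vectors are initialised
    from the Naive Bayes estimate.
    """
    key = frozenset(context.items())
    if key not in weights_table:                    # new feature vector
        weights_table[key] = naive_bayes_probs(training, context)
    weights = weights_table[key]
    return max(weights, key=weights.get)            # highest-weight greeting
```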

# Model of Greetings: Three questionnaires for feedback
* Answers to the questionnaires use a five-point semantic differential scale:
1. How appropriate was the greeting chosen by the robot for the current context?
2. (If the evaluation at point 1 was <= 3) which greeting type would have been appropriate instead?
3. (If the evaluation at point 1 was <= 3) which context would have been appropriate, if any, for the greeting type of point 1?

# Model of Greetings: feedback and terminate condition
* Weights of the affected features are multiplied by a positive or negative reward (inspired by reinforcement learning) which is calculated proportionally to the evaluation.
* Mappings stop evolving when the following two stopping conditions are satisfied:
* all possible values of all features have been explored,
* and the moving average of the latest 10 state transitions has decreased below a certain threshold (both steps are sketched below).
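
A rough sketch of the reward-style update and the stopping test described above; the linear reward, the `alpha` factor, and the threshold are illustrative assumptions, not the paper's constants:

```python
def update_weights(weights, greeting, evaluation, alpha=0.25):
    """Multiply the chosen greeting's weight by a reward derived from the
    five-point evaluation (3 is neutral), then renormalise."""
    reward = 1.0 + alpha * (evaluation - 3)   # >1 for positive, <1 for negative feedback
    weights[greeting] *= reward
    total = sum(weights.values())
    for g in weights:
        weights[g] /= total

def should_stop(transition_magnitudes, all_values_explored, threshold=0.05):
    """Stop once every feature value has been explored and the moving average
    of the latest 10 state transitions falls below a threshold."""
    if not all_values_explored or len(transition_magnitudes) < 10:
        return False
    recent = transition_magnitudes[-10:]
    return sum(recent) / len(recent) < threshold
```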

# Model of Greetings: Summary
* Thanks to this implementation, mappings can evolve quickly, without requiring hundreds or thousands of iterations
* but rather a number comparable to the low number of interactions humans need to understand and adapt to social rules.

# Implementation on ARMAR-IIIb
* ARMAR-III is designed for close cooperation with humans
* ARMAR-III has a humanlike appearance
* sensory capabilities similar to humans
* ARMAR-IIIb is a slightly modified version with a different shape of the head, the trunk, and the hands

# Implementation of gestures
* The implementation of the set of gestures on the robot is not strictly hardwired to the specific hardware.
* The patterns of the gestures are defined manually.
* Gestures are defined in the Master Motor Map (MMM) format and then converted to the robot.

# Master Motor Map
* The MMM is a reference 3D kinematic model
* providing a unified representation of various human motion capture systems, action recognition systems, imitation systems, and visualization modules
* This representation can be subsequently converted to other representations, such as action recognizers, 3D visualization, or implementation into different robots
* The MMM is intended to become a common standard in the robotics community

# Master Motor Map
<img src="pictures/MMM.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Master Motor Map
* The body model of the MMM can be seen in the left-hand illustration in the figure below.
* It contains some joints, such as the clavicula, which are usually not implemented in humanoid robots.
* A conversion module is necessary to perform a transformation between this kinematic model and the ARMAR-IIIb kinematic model.

# Master Motor Map
<img src="pictures/MMMModel.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>


# MMM support
* The MMM framework provides broad support for every kind of human-like robot.
* Transfer rules can be defined within the MMM.
* Using these conversion rules, motions in the MMM model can be converted into movements of the robot.
* Even when a conversion from the MMM model to a specific robot is not available,
* the motion representation parts of the MMM can be used nevertheless (a rough sketch of a conversion step follows below).
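
As a purely hypothetical illustration of such a conversion step (the joint names and the remapping table below are invented for the example and are not the MMM framework's actual API):

```python
# Joints that the robot lacks are dropped or remapped onto joints it does have.
MMM_TO_ARMAR = {
    "neck_pitch": "head_pitch",
    "shoulder_r_pitch": "arm_r_joint1",
    "elbow_r": "arm_r_joint4",
    # "clavicula_r" has no counterpart on the robot and is simply dropped
    "torso_pitch": "head_pitch",   # e.g. a bow expressed through the neck
}

def convert_posture(mmm_angles: dict) -> dict:
    """Map an MMM posture (joint name -> angle in radians) onto robot joints."""
    robot_angles = {}
    for joint, angle in mmm_angles.items():
        target = MMM_TO_ARMAR.get(joint)
        if target is None:
            continue                       # no counterpart on the robot; skip
        # If two MMM joints map to the same robot joint, sum their contributions.
        robot_angles[target] = robot_angles.get(target, 0.0) + angle
    return robot_angles
```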

# Conversion example of MMM
* After the motions were programmed on the MMM model, they were processed by the converter.
* The human model contains many joints that are not present in the robot configuration.
* For example, ARMAR does not bend its body when performing a bow;
* the bow was instead expressed using a part that is present on the robot (e.g., the neck).


# Gesture example
<img src="pictures/GestureExample.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Gesture implementation on ARMAR-III
<img src="pictures/ImplementGestureARMARⅢ.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Modular Controller Architecture, a modular software framework
* The postures could be triggered from the MCA (Modular Controller Architecture, a modular software framework) interface, where the greeting model was also implemented.
* The list of postures is shown on the left, together with the greeting-selection option.
* When that option is activated, it is possible to select the context parameters through the radio buttons on the right.

# Modular Controller Architecture, a modular software framework
<img src="pictures/MCA.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Implementation of words
* Greeting words are used in two languages: Japanese and German.
* For example, in Japan it is common to use a specific greeting in the workplace, 「otsukaresama desu」,
* where a standard greeting like 「konnichi wa」 would be inappropriate.
* In German, such a greeting type does not exist,
* but the meaning of “thank you for your effort” at work can be directly translated into German.
* The robot knows the dictionary terms, but does not understand the difference in usage of these words in different contexts.

# Table of greeting words
<img src="pictures/tableofgreetingwords.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>


# Implementation of words
* These words have been recorded through free text-to-speech software into wave files that could be played by the robot
* ARMAR does not have embedded speakers in its body,
* so two small speakers were added behind the head and connected to another computer.

# Experiment description
* Experiments were conducted in Germany, in the room shown in the figure below.
<img src="pictures/room.png" style='width: 60%; margin-left: 150px; margin-top: 50px;'>


# Experiment description (2)
* Participants were 18 German people of different ages, genders, and workplaces,
* so the robot could be trained with various combinations of context.
* It was not possible to include all combinations of feature values in the experiment.
* For example, there cannot be a profile with both [‘location’: ‘workplace’] and [‘social distance’: ‘unknown’].
* The [‘location’: ‘private’] case was left out, because it is impossible to simulate the interaction in a private context, such as one’s home.

# Experiment description (3)
* The experiment was repeated more than once with some participants.
* For example, the experiment was repeated at a different time of day,
* or the social distance was changed from ‘unknown’ to ‘acquaintance’ after the first encounter.
* In this way we could collect more data by manipulating the value of a single feature.

# Statistics of participants
* The demographics of the 18 participants were as follows
1. gender: M: 10; F: 8
2. average age: 31.33
3. age standard deviation: 13.16


# Statistics of interactions
* The number of interactions was determined by the stopping condition of the algorithm.
* The number of interactions, taking repetitions into account, was 30.
1. gender: M: 18; F: 12
2. average age: 29.43
3. age standard deviation: 12.46

# Experiment protocol (steps 1-5)
1. ARMAR-IIIb is trained with Japanese data
2. The context data of the encounter are given as inputs to the algorithm and the robot is prepared
3. The participant is briefed about the current situation and asked to interact with the robot accordingly
4. The participant enters the room
5. The robot’s greeting is triggered by an operator as the human participant approaches

# Experiment protocol (steps 6-10)
6. After the two parties have greeted each other, the robot is turned off
7. The participant evaluates the robot’s behaviour through a questionnaire
8. The mapping is updated using the subject’s feedback
9. Repeat steps 2–8 for each participant
10. Training stops once the state changes have stabilized

# Results
* The table on the next slide shows how the gestures of the mapping changed during the experiment.
* Bowing was greatly reduced, while the handshake became common.
* Hugging, which did not exist in the Japanese mapping, appeared.
* This is because participants gave feedback that a hug was appropriate.

# Results
<img src="pictures/GestureTable.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Results
* The biggest change in the words mapping is that the workplace greeting disappeared.
* A smaller change is in the use of informal greetings.

# Results
<img src="pictures/GreetingWordTable.png" style='width: 60%; margin-left: 150px; margin-top: -50px;'>

# Limitations and improvements
* The first obvious limitation is related to the manual input of context data
* The integrated use of cameras would make it possible to determine features such as gender, age, and race of the human

# Limitations and improvements
* A speech recognition system and cameras could also detect the human's own greeting.
* The robot itself could then determine whether its greeting was correct.
* This decision could rely on the distance to the partner, the timing of the greeting, head orientation, or other information about whether the response to the greeting was correct and what was expected.

# Limitations and improvements
* It is possible to extend the set of context features by drawing on a wider range of documents from the literature.

# Different kinds of embodiment
* A humanoid robot has a body similar to a human's.
* Robots, however, can differ in shape, size, and capability.
* By extending this approach to other robots, they could, depending on their physical characteristics, start discovering by themselves the interaction methods that suit humans best.

<style>
    .slide.cover H2 { font-size: 60px; }
</style>

<!-- vim: set filetype=markdown.slide: -->