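This post focuses on implementation: the goal is to build a Task-oriented Dialogue System as quickly and cheaply as possible, so the underlying theory is not explained in much detail. For background, see:

1. A User Simulator for Task-Completion Dialogues
2. https://github.com/MiuLab/TC-Bot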

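+ NLU
+ DM
+ NLG

# Natural Language Understanding

NLU mainly handles:

+ Intent Prediction
+ Slot Filling

# Dialogue Management

The dialogue policy can usually be optimized with Reinforcement Learning. However, RL requires many rounds of interaction with real users, which is expensive. To work around this, we can build a "Simulated User" and optimize the agent against it first; once the agent is good enough, it can interact with real users for further optimization.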
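The interaction between an agent and a simulated user can be sketched as a simple episode loop. This is a minimal toy illustration, not the simulator from the paper or from TC-Bot: `ToyUserSimulator`, `run_episode`, the fixed goal, and the reward values (+1.0, -1.0, -0.1) are all hypothetical names and numbers chosen for this example.

```python
class ToyUserSimulator:
    """Hypothetical stand-in for a user simulator: it answers slot requests
    from a fixed goal and ends the dialogue once every slot has been asked."""

    def __init__(self, goal, max_turns=10):
        self.goal = goal
        self.max_turns = max_turns

    def initialize_episode(self):
        self.remaining = set(self.goal)
        self.turn = 0

    def step(self, agent_slot):
        """Return (user_reply, reward, done) for the slot the agent requested."""
        self.turn += 1
        self.remaining.discard(agent_slot)
        if not self.remaining:
            return self.goal.get(agent_slot), 1.0, True    # all slots covered
        if self.turn >= self.max_turns:
            return None, -1.0, True                        # ran out of turns
        return self.goal.get(agent_slot), -0.1, False      # small per-turn cost


def run_episode(choose_slot, simulator):
    """Play one dialogue; choose_slot(history) is the agent's policy."""
    simulator.initialize_episode()
    history, total_reward, done = [], 0.0, False
    while not done:
        slot = choose_slot(history)
        reply, reward, done = simulator.step(slot)
        history.append((slot, reply))
        total_reward += reward
    return total_reward, len(history)


simulator = ToyUserSimulator({'date': 'tomorrow', 'city': 'seattle'})

def ask_in_order(history):
    # Asks for a different slot on each turn
    return ['date', 'city'][len(history) % 2]

def always_ask_date(history):
    # Never covers 'city', so the dialogue times out
    return 'date'

good_reward, good_turns = run_episode(ask_in_order, simulator)
bad_reward, bad_turns = run_episode(always_ask_date, simulator)
```

An RL agent would use the reward returned by each episode to improve `choose_slot`; here the two hand-written policies only show that a better policy finishes in fewer turns with a higher total reward, without involving any real user.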

# Natural Language Generation

The Agent class

It has two main methods:

+ initialize_episode(self)
+ state_to_action(self, state, available_actions)
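The shape of that interface can be sketched as follows. This is an illustrative skeleton rather than the TC-Bot source: the constructor argument, the `turn_count` field, and the `FirstActionAgent` subclass are assumptions made up for the example.

```python
class Agent:
    """Minimal agent interface (illustrative sketch, not the TC-Bot source)."""

    def __init__(self, feasible_actions):
        # feasible_actions: list of dialogue-act dicts the agent may choose from
        self.feasible_actions = feasible_actions
        self.turn_count = 0  # hypothetical bookkeeping field

    def initialize_episode(self):
        """Reset per-dialogue bookkeeping at the start of each episode."""
        self.turn_count = 0

    def state_to_action(self, state, available_actions=None):
        """Map the current dialogue state to an action; subclasses override this."""
        raise NotImplementedError


class FirstActionAgent(Agent):
    """Toy subclass that always returns the first feasible action."""

    def state_to_action(self, state, available_actions=None):
        self.turn_count += 1
        return {'act_slot_response': dict(self.feasible_actions[0]),
                'act_slot_value_response': None}


agent = FirstActionAgent([{'diaact': 'greeting', 'inform_slots': {}, 'request_slots': {}}])
agent.initialize_episode()
response = agent.state_to_action({'turn': 0})
```

Keeping the interface this small means a rule-based agent and a learned one can be swapped without touching the rest of the dialogue loop.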

Of these, state_to_action can be seen as the dialogue policy learning (DPL) component: it selects an action based on the system state. Take the classic DQNAgent as an example:

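import copy

def state_to_action(self, state):
    """DQN: input state, output action."""
    # Encode the dialogue state as a representation the network can consume
    self.representation = self.prepare_state_representation(state)
    # Let the DQN policy pick an action index
    self.action = self.run_policy(self.representation)
    # Look up the act for that index; deepcopy so the action table is not mutated
    act_slot_response = copy.deepcopy(self.feasible_actions[self.action])
    return {'act_slot_response': act_slot_response, 'act_slot_value_response': None}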

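As you can see, the logic is straightforward:

1. self.prepare_state_representation(state): DPL first abstracts the agent's state into a representation.
2. self.run_policy(self.representation): DPL uses the DQN to select an action.
3. act_slot_response: the selected action is then mapped to the corresponding act.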
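Steps 2 and 3 can be illustrated with a small standalone sketch. It assumes an epsilon-greedy rule over Q-values, which is a common choice for DQN agents but is not confirmed by the text above; the `feasible_actions` table, the `index_to_act` helper, and the Q-values are made up for the example.

```python
import copy
import random

# Hypothetical action table: each index maps to a dialogue-act template.
feasible_actions = [
    {'diaact': 'request', 'inform_slots': {}, 'request_slots': {'date': 'UNK'}},
    {'diaact': 'inform', 'inform_slots': {'city': 'PLACEHOLDER'}, 'request_slots': {}},
    {'diaact': 'thanks', 'inform_slots': {}, 'request_slots': {}},
]


def run_policy(q_values, epsilon=0.0, rng=random):
    """Epsilon-greedy selection over a list of Q-values; returns an action index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                      # explore
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit


def index_to_act(action_index):
    """Copy the template so later slot filling does not mutate the table."""
    return copy.deepcopy(feasible_actions[action_index])


q_values = [0.1, 0.7, 0.2]   # stand-in for the DQN's output on one state
action = run_policy(q_values, epsilon=0.0)
act = index_to_act(action)
act['inform_slots']['city'] = 'seattle'   # safe: only the copy changes
```

The deep copy matters because the returned act is usually filled in with concrete slot values afterwards; without it, those values would leak into the shared action table.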

The dialogue state is made up of the following fields:

user_action = state['user_action']
current_slots = state['current_slots']
kb_results_dict = state['kb_results_dict']
agent_last = state['agent_action']
turn = state['turn']

For User Action

Current user action representation: user_act_rep
User inform slots representation: user_inform_slots_rep
User request slots representation: user_request_slots_rep

For Current Slots

current_slots['inform_slots']: current_slots_rep

For Agent_Last

agent_last['diaact']: agent_act_rep
agent_last['inform_slots']: agent_inform_slots_rep
agent_last['request_slots']: agent_request_slots_rep

For KB Result

kb_results_dict: kb_count_rep
kb_binary: kb_binary_rep

For Dialogue Turns number

turn: turn_rep
turn: turn_onehot_rep
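Putting the pieces above together, the state representation can be sketched as a concatenation of small feature vectors. This is a simplified pure-Python illustration under several assumptions: the `act_set` and `slot_set` vocabularies, `max_turn`, the `one_hot` and `slots_bag` helpers, the scaling constants, and the exact form of the two KB features are all hypothetical, and the concatenation order is arbitrary as long as it stays fixed.

```python
# Hypothetical vocabularies; real ones are built from the task's act and slot lists.
act_set = {'request': 0, 'inform': 1, 'thanks': 2}
slot_set = {'moviename': 0, 'date': 1, 'city': 2}
max_turn = 4


def one_hot(index, size):
    """Vector of zeros with a 1.0 at index (all zeros if index is None)."""
    vec = [0.0] * size
    if index is not None:
        vec[index] = 1.0
    return vec


def slots_bag(slots):
    """Bag-of-slots vector: 1.0 for every slot name present in slots."""
    vec = [0.0] * len(slot_set)
    for name in slots:
        vec[slot_set[name]] = 1.0
    return vec


def prepare_state_representation(state):
    user_action = state['user_action']
    agent_last = state['agent_action'] or {}   # None on the first turn
    kb_results_dict = state['kb_results_dict']
    turn = state['turn']
    ordered_slots = sorted(slot_set, key=slot_set.get)

    user_act_rep = one_hot(act_set[user_action['diaact']], len(act_set))
    user_inform_slots_rep = slots_bag(user_action['inform_slots'])
    user_request_slots_rep = slots_bag(user_action['request_slots'])
    current_slots_rep = slots_bag(state['current_slots']['inform_slots'])

    agent_act_rep = one_hot(act_set.get(agent_last.get('diaact')), len(act_set))
    agent_inform_slots_rep = slots_bag(agent_last.get('inform_slots', {}))
    agent_request_slots_rep = slots_bag(agent_last.get('request_slots', {}))

    # Assumed KB features: scaled match count and a has-any-match flag per slot.
    kb_count_rep = [kb_results_dict.get(s, 0) / 100.0 for s in ordered_slots]
    kb_binary_rep = [float(kb_results_dict.get(s, 0) > 0) for s in ordered_slots]

    turn_rep = [turn / 10.0]                   # assumed scaling
    turn_onehot_rep = one_hot(turn if turn < max_turn else None, max_turn)

    return (user_act_rep + user_inform_slots_rep + user_request_slots_rep
            + current_slots_rep + agent_act_rep + agent_inform_slots_rep
            + agent_request_slots_rep + kb_count_rep + kb_binary_rep
            + turn_rep + turn_onehot_rep)


state = {
    'user_action': {'diaact': 'request',
                    'inform_slots': {'city': 'seattle'},
                    'request_slots': {'date': 'UNK'}},
    'current_slots': {'inform_slots': {'city': 'seattle'}},
    'kb_results_dict': {'city': 12},
    'agent_action': None,
    'turn': 1,
}
rep = prepare_state_representation(state)
```

The result is a flat, fixed-length vector, which is what lets the same network handle every turn regardless of how many slots have been filled so far.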