This post focuses on implementation: the goal is to put together a Task-oriented Dialogue System quickly and economically, so the underlying theory is only touched on briefly. For reference, see 1. A User Simulator for Task-Completion Dialogues and 2. https://github.com/MiuLab/TC-Bot. The system consists of three modules:
- NLU
- DM
- NLG

# Natural Language Understanding

The NLU module is mainly responsible for:
- Intent Prediction
- Slot Filling

# Dialogue Management

Typically, the dialogue policy can be optimized with Reinforcement Learning. However, RL requires many rounds of interaction with real users, which is expensive. To work around this, a Simulated User can be built to train the agent; once the agent has been optimized to a reasonable level, it can then interact with real users for further refinement.
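To make the simulated-user idea concrete, the training loop roughly looks like the sketch below. The DialogManager interface here follows the spirit of TC-Bot's run script, but the names and signatures are simplified assumptions rather than the exact code:

```python
# Hypothetical sketch of training an agent against a simulated user.
def run_episodes(dialog_manager, num_episodes=100):
    successes = 0
    for _ in range(num_episodes):
        dialog_manager.initialize_episode()        # simulator samples a user goal and opens the dialogue
        episode_over, reward = False, 0
        while not episode_over:
            # one exchange: the agent acts on its state, the simulator reacts and returns a reward
            episode_over, reward = dialog_manager.next_turn()
        if reward > 0:                             # a positive terminal reward means the task succeeded
            successes += 1
    return successes / float(num_episodes)
```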
# Natural Language Generation
# The Agent Class
The Agent class has two main methods:
+ initialize_episode(self)
+ state_to_action(self, state, available_actions)
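As a rough sketch of the base class (the attribute names and constructor arguments below are assumptions, not the exact TC-Bot interface):

```python
class Agent:
    """Minimal sketch of an agent base class (simplified; not the exact TC-Bot code)."""

    def __init__(self, act_set=None, slot_set=None, params=None):
        self.act_set = act_set        # dialogue acts the agent may produce
        self.slot_set = slot_set      # slots the agent can inform/request
        self.params = params or {}

    def initialize_episode(self):
        """Reset per-episode state such as the turn counter and last action."""
        self.current_action = None
        self.turn_count = 0

    def state_to_action(self, state, available_actions=None):
        """Map the current dialogue state to the agent's next action.
        Concrete agents (rule-based, DQN, ...) override this method."""
        raise NotImplementedError
```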
Here, state_to_action implements the dialogue policy learning (DPL) part: given the system state, it selects the corresponding action. Take the classic DQNAgent as an example:
```python
def state_to_action(self, state):
    """ DQN: Input state, output action """
    # copy is imported at module level in the original file
    self.representation = self.prepare_state_representation(state)
    self.action = self.run_policy(self.representation)
    act_slot_response = copy.deepcopy(self.feasible_actions[self.action])
    return {'act_slot_response': act_slot_response, 'act_slot_value_response': None}
```
The logic is quite clear:
1. self.prepare_state_representation(state): DPL first abstracts the agent's dialogue state into a feature representation.
2. self.run_policy(self.representation): DPL selects an action according to the DQN.
3. act_slot_response: the selected action index is mapped back to a concrete act (an entry of feasible_actions).
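Step 2 is where the DQN itself comes in. A stripped-down epsilon-greedy version of run_policy might look like the following (the dqn.predict helper and the epsilon / num_actions attributes are assumptions used for illustration; TC-Bot's actual DQNAgent additionally falls back to a rule-based policy during warm start):

```python
import random
import numpy as np

def run_policy(self, representation):
    """Epsilon-greedy action selection over feasible-action indices (illustrative sketch)."""
    if random.random() < self.epsilon:
        # explore: pick a random feasible action
        return random.randint(0, self.num_actions - 1)
    # exploit: score all actions with the Q-network and take the best one
    q_values = self.dqn.predict(np.asarray(representation))  # hypothetical helper
    return int(np.argmax(q_values))
```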
The dialogue state passed in is a dict with the following fields:

```python
user_action = state['user_action']
current_slots = state['current_slots']
kb_results_dict = state['kb_results_dict']
agent_last = state['agent_action']
turn = state['turn']
```

From these fields, prepare_state_representation builds the following feature representations:
- For User Action
- Current user action representation: user_act_rep
- User inform slots representation: user_inform_slots_rep
- User request slots representation: user_request_slots_rep
- For Current Slots
- current_slots[inform_slots]: current_slots_rep
- For Agent_Last
- agent_last['diaact']: agent_act_rep
- agent_last['inform_slots']: agent_inform_slots_rep
- agent_last['request_slots']: agent_request_slots_rep
- For KB Result
- kb_results_dict: kb_count_rep
- kb_binary: kb_binary_rep
- For Dialogue Turn Number
- turn: turn_rep
- turn: turn_onehot_rep
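These pieces are then concatenated into one feature vector that feeds the DQN. Schematically (the hstack layout below is illustrative; each *_rep is assumed to be a 1 x d numpy array built as described above):

```python
import numpy as np

# Illustrative assembly of the final state representation.
final_representation = np.hstack([
    user_act_rep,             # one-hot of the user's dialogue act
    user_inform_slots_rep,    # bag-of-slots the user informed
    user_request_slots_rep,   # bag-of-slots the user requested
    current_slots_rep,        # slots filled so far in the dialogue
    agent_act_rep,            # one-hot of the agent's last act
    agent_inform_slots_rep,   # slots the agent informed last turn
    agent_request_slots_rep,  # slots the agent requested last turn
    turn_rep,                 # scaled turn count
    turn_onehot_rep,          # one-hot of the current turn index
    kb_binary_rep,            # binary indicators over KB query results
    kb_count_rep,             # (scaled) counts of matching KB entries
])
```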