
Reproduce COTA using Ludwig #21

#1

Hi,

I'm watching an amazing talk about how Uber handles Customer Care using COTA and I'm wondering if it's possible to reproduce the architecture of this project just using the Ludwig framework. The architecture I'm referencing is in this Uber presentation: https://youtu.be/jIGHY7fz2XA?t=991.

Is there any similar example of COTA using Ludwig?

Thank you guys for this incredible framework!

#2

Hey Lucas,
Funny enough, I'm the person giving that talk :)
And actually COTA was the first model developed with Ludwig, so definitely yes, you can reproduce exactly that architecture in Ludwig.
In your model definition you would need the input text of the ticket plus all the other features about the user and the context that you can extract from your data. As output features you can have contact type, action and template as categories. In COTA we also added dependencies between the output features (check the Ludwig documentation for more details), and we made the prediction of the contact type a prediction of a path from root to leaf in the tree of contact types by using a sequence output feature.
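
The model definition described above could be sketched roughly as follows. This is only an illustration, not the actual COTA configuration: every column name is hypothetical, the direction of the dependencies is an assumption, and the exact type names vary between Ludwig versions (for example `numerical` in older releases), so check the documentation for your version. The helper that flattens a tree path into a space-separated string assumes the sequence feature is tokenized on whitespace.

```python
# Sketch of a Ludwig-style model definition as a plain Python dict.
# All feature names below are hypothetical placeholders.
model_definition = {
    "input_features": [
        {"name": "ticket_text", "type": "text"},             # the ticket message
        {"name": "user_tier", "type": "category"},           # user feature
        {"name": "trips_last_30_days", "type": "numerical"},  # context feature
    ],
    "output_features": [
        # Contact type predicted as a root-to-leaf path, hence a sequence.
        {"name": "contact_type_path", "type": "sequence"},
        # Assumed dependency direction: action depends on the contact type,
        # template depends on both.
        {"name": "action", "type": "category",
         "dependencies": ["contact_type_path"]},
        {"name": "template", "type": "category",
         "dependencies": ["contact_type_path", "action"]},
    ],
}


def undeclared_dependencies(definition):
    """Return (feature, dependency) pairs that point at no output feature."""
    declared = {f["name"] for f in definition["output_features"]}
    return [
        (f["name"], dep)
        for f in definition["output_features"]
        for dep in f.get("dependencies", [])
        if dep not in declared
    ]


def path_to_sequence(path):
    """Flatten a root-to-leaf path into one whitespace-separated string.

    Spaces inside a node name are replaced so each node stays one token.
    """
    return " ".join(node.replace(" ", "_") for node in path)


if __name__ == "__main__":
    print(undeclared_dependencies(model_definition))
    print(path_to_sequence(["Trip", "Fare Review", "Overcharged"]))
```

The dict can be dumped to YAML for the command line or handed to Ludwig's Python API; the training column for `contact_type_path` would then hold strings produced by something like `path_to_sequence`.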
Let me know if this is not clear or if you have further questions.
Cheers,
Piero

P.S. for more details on COTA:


#3

Thank you, Piero! I checked your website and there is one more funny coincidence: in my master's thesis I'm implementing in DGL a model very similar to the one you implemented at Uber Eats using Graph Convolutional Networks :sweat_smile:

The paper and the blog post are very clear and well structured! I still have just one more question. In my case I have a set of input features (text + user features + order features) and I would like to recommend a FAQ post (text with tags) based on the user's question. Any suggestions on how to adjust the architecture to this problem and train it in an unsupervised way?

Again, thank you for your help.

Best regards,