Deep Learning Generalization, Especially on Structured Discrete Data

This front page adapts content from our legacy website deeplearning4discrete.net and introduces a suite of deep learning tools we have developed to improve deep learning generalization, especially on discrete structured data types such as text, graphs, or sets. Please feel free to email me if you find any typos.

Background: Why Is Generalization in Deep Learning Interesting?

Generalization refers to how well a machine learning model adapts to new, previously unseen data. We focus on out-of-distribution (OOD) generalization, where the test data come from a distribution different from the training distribution.


Why Is Structured Discrete Data Interesting?

Deep learning builds networks of parameterized functional modules that are trained from examples using gradient-based optimization [Lecun19].

Since it is hard to estimate gradients through functions of discrete random variables, we are interested in how to make deep learning work well on discrete structured data and structured representations. Developing such techniques is an active research area, and we focus on interpretable and scalable approaches.
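As an illustration only (not a method described on this page), the sketch below shows the straight-through Gumbel-Softmax trick, one common workaround for passing gradients through a categorical sampling step; the function and variable names here are our own, chosen for the example.

```python
# Minimal sketch (illustrative, not from this page) of the straight-through
# Gumbel-Softmax trick for backpropagating through a discrete sampling step.
import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Sample a one-hot vector from categorical logits, using a softmax
    relaxation as the differentiable surrogate on the backward pass."""
    # Gumbel(0, 1) noise: argmax over (logits + noise) is a categorical sample.
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    y_soft = F.softmax((logits + gumbel) / tau, dim=-1)          # relaxed sample
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(logits).scatter_(-1, index, 1.0)   # hard one-hot
    # Forward pass uses the hard one-hot; gradients flow through y_soft.
    return y_hard - y_soft.detach() + y_soft

# Usage: a discrete choice inside an end-to-end differentiable pipeline.
logits = torch.randn(4, 10, requires_grad=True)     # e.g., scores over 10 tokens
one_hot = st_gumbel_softmax(logits, tau=0.5)
loss = (one_hot * torch.arange(10.0)).sum()
loss.backward()                                      # logits.grad is populated
```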

Relevant Papers We Have Published

  • We use a component view to categorize research topics in OOD generalization:
    • (1) sample level
    • (2) feature level
    • (3) representation/encoding level
    • (4) loss level
    • (5) task level (e.g., meta-learning, few-shot generalization)
    • Please check out each item in our sidebar.


Contacts:

Have questions or suggestions? Feel free to reach out to me on Twitter or by email.

Thanks for reading!