*********************************
There is now a CONTENT FREEZE for Mercury while we switch to a new platform. It began on Friday, March 10 at 6pm and will end on Wednesday, March 15 at noon. No new content can be created during this time, but all material in the system as of the beginning of the freeze will be migrated to the new platform, including users and groups. Functionally the new site is identical to the old one. webteam@gatech.edu
*********************************
Title: Optimization-Driven Emergence of Deep Hierarchies with Applications in Data Mining and Evolution
Payam Siyari
School of Computer Science
College of Computing
Georgia Institute of Technology
Date: Wednesday, April 25th, 2018
Time: 11:00am - 1:00pm
Location: Klaus 3100
Committee:
Dr. Constantine Dovrolis (Advisor, School of Computer Science, Georgia Institute of Technology)
Dr. Bistra Dilkina (Co-Advisor, Department of Computer Science, University of Southern California & School of Computational Science and Engineering, Georgia Institute of Technology)
Dr. Thad Starner (School of Interactive Computing, Georgia Institute of Technology)
Dr. Duen Horng (Polo) Chau (School of Computational Science and Engineering, Georgia Institute of Technology)
Dr. Matthias Gallé (Senior Scientist, Naver Labs Europe)
Abstract:
It is well known that many complex systems, in both nature and technology, exhibit hierarchical modularity: smaller modules, each of them providing a certain function, are used within larger modules that perform more complex functions. What is not well understood, however, is how this hierarchical structure (which is fundamentally a network property) emerges initially, how it relates to the desired input-output function of the entire system, and how it evolves over time.
We present an optimization framework, called Lexis, that constructs a given set of target strings through a minimum-cost hierarchy of concatenations of shorter strings. We apply the Lexis framework in diverse applications such as optimized synthesis of DNA fragments in genomic libraries, hierarchical structure discovery in protein sequences, and feature extraction in text data.
Lexis has close ties to the classic Smallest Grammar Problem (SGP). We also propose a generalized formulation of SGP that produces smaller models than the current best approximations, and also better captures the syntactic structure of natural language.
Finally, we use Lexis in a modeling framework, referred to as Evo-Lexis, that can capture the emergence and dynamics of hierarchical structure in evolving systems. By modeling evolutionary mechanisms such as mutation, recombination, and selection, we show that Evo-Lexis can produce deep hierarchies with various realistic properties, such as a small but relatively stable core despite frequent changes at the higher layers of the hierarchy. This modeling study provides a plausible explanation for the hourglass (or bow-tie) structure that is often observed in some natural and technological hierarchical systems.