Two domain decomposition methods for solving vector field problems posed in
H(curl) and discretized with Nédélec finite elements are considered. These
finite elements are conforming in H(curl).
A two-level overlapping Schwarz algorithm in two dimensions is analyzed, where the subdomains are only assumed to be uniform in the sense of Peter Jones. The coarse space is based on energy minimization and its dimension equals the number of interior subdomain edges. Local direct solvers are based on the overlapping subdomains. The bound for the condition number depends only on a few geometric parameters of the decomposition. This bound is independent of jumps in the coefficients across the interface between the subdomains for most of the different cases considered.
A bound is also obtained for the condition number of a balancing domain decomposition by constraints (BDDC) algorithm in two dimensions, with Jones subdomains. For the primal variable space, a continuity constraint for the tangential average over each interior subdomain edge is imposed. For the averaging operator, a new technique named deluxe scaling is used. The optimal bound is independent of jumps in the coefficients across the interface between the subdomains.
Furthermore, a new coarse function for problems in three dimensions is introduced, with only one degree of freedom per subdomain edge. In all cases, it is established that the algorithms are scalable. Numerical results that verify the theory are provided, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
Title: Big Data Analytics for Development: Events, Knowledge Graphs and Predictive Models
Candidate: Sunandan Chakraborty
Advisor: Lakshminaryanan Subramanian
Volatility in critical socio-economic indices can have a significant negative impact on global development. This thesis presents a suite of novel big data analytics algorithms that operate on unstructured Web data streams to automatically infer events, knowledge graphs and predictive models to understand, characterize and predict the volatility of socioeconomic indices.
This thesis makes four important research contributions. First, given a large volume of diverse unstructured news streams, we present new models for capturing events and learning spatio-temporal characteristics of events from news streams. We specifically explore two types of event models in this thesis: one centered around the concept of event triggers and a probabilistic meta-event model that explicitly delineates named entities from text streams to learn a generic class of meta-events. The second contribution focuses on learning several different types of knowledge graphs from news streams and events: a) Spatio-temporal article graphs capture intrinsic relationships between different news articles; b) Event graphs characterize relationships between events and, given a news query, provide a succinct summary of a timeline of events relating to the query; c) Event-phenomenon graphs provide a condensed representation of classes of events that relate to a given phenomenon at a given location and time; d) Causality testing on word-word graphs captures strong spatio-temporal relationships between word occurrences in news streams; e) Concept graphs capture relationships between different word concepts that occur in a given text stream.
The third contribution focuses on connecting the different knowledge graph representations and structured time series data corresponding to a socio-economic index to automatically learn event-driven predictive models for the given socio-economic index to predict future volatility. We propose several types of predictive models centered around our two event models: event triggers and probabilistic meta-events. The final contribution focuses on a broad spectrum of inference case studies for different types of socio-economic indices including food prices, stock prices, disease outbreaks and interest rates. Across all these indices, we show that event-driven predictive models provide significant improvements in prediction accuracy over state-of-the-art techniques.
Title: SMT-Based and Disjunctive Relational Abstract Domains for Static Analysis
Candidate: Junjie Chen
Advisor: Patrick Cousot
Abstract Interpretation is a theory of sound approximation of program semantics. In recent decades, it has been widely and successfully applied to the static analysis of computer programs. In this thesis, we will work on abstract domains, one of the key concepts in abstract interpretation, which aim at automatically collecting information about the set of all possible values of the program variables. We will focus, in particular, on two aspects: the combination with theorem provers and the refinement of existing abstract domains.
Satisfiability modulo theories (SMT) solvers are popular theorem provers, which have proved to be very powerful tools for checking the satisfiability of first-order logical formulas with respect to some background theories. In the first part of this thesis, we introduce two abstract domains whose elements are logical formulas involving finite conjunctions of affine equalities and finite conjunctions of linear inequalities. These two abstract domains rely on SMT solvers for the computation of transformations and other logical operations.
In the second part of this thesis, we present an abstract domain functor whose elements are binary decision trees. It is parameterized by decision nodes which are a set of boolean tests appearing in the programs and by a numerical or symbolic abstract domain whose elements are the leaves. This new binary decision tree abstract domain functor provides a flexible way of adjusting the cost/precision ratio in path-dependent static analysis.
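To make the idea of a decision tree whose leaves are abstract elements more concrete, here is a minimal sketch. It is not the functor defined in the thesis: the interval leaf domain, the `Node` class, and the `join` and `collapse` helpers are all hypothetical names chosen for illustration, and the sketch assumes both trees being joined test the same conditions in the same order.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Hypothetical leaf domain: one closed interval per variable name.
Interval = Tuple[float, float]
Leaf = Dict[str, Interval]


@dataclass
class Node:
    """Decision node; `test` names a boolean test taken from the program."""
    test: str
    if_true: "Tree"
    if_false: "Tree"


Tree = Union[Node, Leaf]


def join_leaf(a: Leaf, b: Leaf) -> Leaf:
    """Interval hull, variable by variable (both leaves share the same keys)."""
    return {v: (min(a[v][0], b[v][0]), max(a[v][1], b[v][1])) for v in a}


def join(a: Tree, b: Tree) -> Tree:
    """Join two trees built over the same tests in the same order."""
    if isinstance(a, Node) and isinstance(b, Node):
        assert a.test == b.test, "sketch assumes identical tree shapes"
        return Node(a.test, join(a.if_true, b.if_true), join(a.if_false, b.if_false))
    return join_leaf(a, b)


def collapse(t: Tree) -> Leaf:
    """Forget every test: cheaper to manipulate, but less precise."""
    if isinstance(t, Node):
        return join_leaf(collapse(t.if_true), collapse(t.if_false))
    return t


# Keeping the test "x > 0" separates the two ranges of y; collapsing merges them.
tree = Node("x > 0", {"y": (1.0, 2.0)}, {"y": (-5.0, -3.0)})
print(collapse(tree))
```

Choosing which tests to keep as decision nodes and which to collapse is one way to picture the cost/precision trade-off the abstract mentions.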
Title: Iris: Mitigating Phase Noise in Millimeter Wave OFDM Systems
Candidate: Aditya Dhananjay
Advisor: Jinyang Li
Next-generation wireless networks are widely expected to operate over millimeter-wave (mmW) frequencies of over 28GHz. These bands mitigate the acute spectrum shortage in the conventional microwave bands of less than 6GHz. The shorter wavelengths in these bands also allow for building dense antenna arrays on a single chip, thereby enabling various MIMO configurations and highly directional links that can increase the spatial reuse of spectrum.
While attempting to build a practical over-the-air (OTA) link over mmW, we realized that the traditional baseband processing techniques used in the microwave bands simply could not cope with the exacerbated frequency offsets (or phase noise) observed in the RF oscillators at these bands. While the frequency offsets are large, the real difficulty arose from the fact that they varied significantly over very short time-scales. Traditional feedback loop techniques still left significant residual offsets, which in turn led to inter-carrier-interference (ICI). The result was very high symbol error rates (SER).
This thesis presents Iris, a baseband processing block that enables clean mmW links, even in the presence of previously fatal amounts of phase noise. Over real mmW hardware, Iris reduces the SER by one to two orders of magnitude, as compared to competing techniques.
Title: Predicting Images using Convolutional Networks: Visual Scene Understanding with Pixel Maps
Candidate: David Eigen
Advisor: Rob Fergus
In the greater part of this thesis, we develop a set of convolutional networks that infer predictions at each pixel of an input image. This is a common problem that arises in many computer vision applications: for example, predicting a semantic label at each pixel describes not only the image content, but also fine-grained locations and segmentations; at the same time, finding depth or surface normals provides 3D geometric relations between points. The second part of this thesis investigates convolutional models also in the contexts of classification and unsupervised learning.
To address our main objective, we develop a versatile Multi-Scale Convolutional Network that can be applied to diverse vision problems using simple adaptations, and apply it to predict depth at each pixel, surface normals and semantic labels. Our model uses a series of convolutional network stacks applied at progressively finer scales. The first uses the entire image field of view to predict a spatially coarse set of feature maps based on global relations; subsequent scales correct and refine the output, yielding a high resolution prediction. We look exclusively at depth prediction first, then generalize our method to multiple tasks. Our system achieves state-of-the-art results on all tasks we investigate, and can match many image details without the need for superpixelation.
Leading to our multi-scale network, we also design a purely local convolutional network to remove dirt and raindrops present on a window surface, which learns to identify and inpaint compact corruptions. We also investigate a weighted nearest-neighbors labeling system applied to superpixels, in which we learn weights for each example, and use local context to find rare class instances.
In addition, we investigate the relative importance of sizing parameters using a recursive convolutional network, finding that network depth is most critical. We also develop a Convolutional LISTA Autoencoder, which learns features similar to stacked sparse coding at a fraction of the cost, combine it with a local entropy objective, and describe a convolutional adaptation of ZCA whitening.
Large-scale C software like Linux needs software engineering tools. But such codebases are software product families, with complex build systems that tailor the software with myriad features. This variability management is a challenge for tools, because they need awareness of variability to process all software product lines within the family. With over 14,000 features, processing all of Linux's product lines is infeasible by brute force, and current solutions employ incomplete heuristics. But having the complete set of compilation units with precise variability information is key to static tools such as bug finders, which could miss critical bugs, and refactoring tools, since behavior preservation requires a complete view of the software project. Kmax is a new tool for the Linux build system that extracts all compilation units with precise variability information. It processes build system files with a variability-aware make evaluator that stores variables in a conditional symbol table and hoists conditionals around complete statements, while tracking variability information as presence conditions. Kmax is evaluated empirically for correctness and completeness on the Linux kernel. Kmax is compared to previous work for correctness and running time, demonstrating that a complete solution's added complexity incurs only minor latency compared to the incomplete heuristic solutions.
Title: Unsupervised Feature Learning in Computer Vision
Candidate: Ross Goroshin
Advisor: Yann LeCun
Much of computer vision has been devoted to the question of representation through feature extraction. Ideal features transform raw pixel intensity values to a representation in which common problems such as object identification, tracking, and segmentation are easier to solve. Recently, deep feature hierarchies have proven to be immensely successful at solving many problems in computer vision. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function of the data and problem-specific label information. Recent findings suggest that despite being trained on a specific task, the learned features can be transferred across multiple visual tasks. These findings suggest that there exists a generically useful feature representation for natural visual data.
This work aims to uncover the principles that lead to these generic feature representations in the unsupervised setting, which does not require problem specific label information. We begin by reviewing relevant prior work, particularly the literature on autoencoder networks and energy based learning. We introduce a new regularizer for autoencoders that plays an analogous role to the partition function in probabilistic graphical models. Next we explore the role of specialized encoder architectures for sparse inference. The remainder of the thesis explores visual feature learning from video. We establish a connection between slow-feature learning and metric learning, and experimentally demonstrate that semantically coherent metrics can be learned from natural videos. Finally, we posit that useful features linearize natural image transformations in video. To this end, we introduce a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unlabeled natural video sequences by learning to predict future frames in the presence of uncertainty.
Title: Efficient and Trustworthy Theory Solver for Bit-vectors in Satisfiability Modulo Theories
Candidate: Liana Hadarean
Advisor: Clark Barrett
As software and hardware systems grow in complexity, automated techniques for ensuring their correctness are becoming increasingly important. Many modern formal verification tools rely on back-end satisfiability modulo theories (SMT) solvers to discharge complex verification goals. These goals are usually formalized in one or more fixed first-order logic theories, such as the theory of fixed-width bit-vectors. The theory of bit-vectors offers a natural way of encoding the precise semantics of typical machine operations on binary data. The predominant approach to deciding the bit-vector theory is via eager reduction to propositional logic. While this often works well in practice, it does not scale well as the bit-width and number of operations increase. The first part of this thesis seeks to fill this gap by exploring efficient techniques for solving bit-vector constraints that leverage the word-level structure. We propose two complementary approaches: an eager approach that takes full advantage of the solving power of off-the-shelf propositional logic solvers, and a lazy approach that combines on-the-fly algebraic reasoning with efficient propositional logic solvers. In the second part of the thesis, we propose a proof system for encoding automatically checkable refutation proofs in the theory of bit-vectors. These proofs can be automatically generated by the SMT solver, and act as a certificate for the correctness of the result.
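The eager reduction to propositional logic mentioned above can be illustrated with a small sketch of bit-vector addition expressed purely through Boolean connectives (a ripple-carry adder). This is a generic textbook construction, not the encoding used in the thesis; the helper names are hypothetical, and the circuit is evaluated on concrete Booleans here, whereas a solver would build the same structure over propositional variables.

```python
from itertools import product
from typing import List

Bits = List[bool]  # least-significant bit first


def to_bits(value: int, width: int) -> Bits:
    """Encode a non-negative integer as `width` Booleans."""
    return [bool((value >> i) & 1) for i in range(width)]


def from_bits(bits: Bits) -> int:
    """Decode a list of Booleans back into an integer."""
    return sum(1 << i for i, b in enumerate(bits) if b)


def blast_add(a: Bits, b: Bits) -> Bits:
    """Ripple-carry adder built only from propositional connectives."""
    carry = False
    out: Bits = []
    for x, y in zip(a, b):
        out.append((x != y) != carry)              # sum bit: x XOR y XOR carry
        carry = (x and y) or (carry and (x != y))  # carry-out of this position
    return out  # final carry dropped: addition modulo 2**width


def adder_is_correct(width: int) -> bool:
    """Exhaustively compare the circuit with integer addition modulo 2**width."""
    mod = 1 << width
    return all(
        from_bits(blast_add(to_bits(a, width), to_bits(b, width))) == (a + b) % mod
        for a, b in product(range(mod), repeat=2)
    )


print(adder_is_correct(4))
```

Each extra bit adds another stage of gates to the chain, which gives some intuition for why purely bit-level encodings grow with the bit-width and the number of operations.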
We describe the data sources and machine learning algorithms that go into the current version of http://www.whatcanifarm.com , a website to help prospective organic farmers determine what to grow given the climate characterized by their zip code.
Predicting the Market Value of Single-Family Residences
Time: Wednesday, December 3, 2014. 10:30 AM
Place: 726 Broadway, 7th Floor, CDS Collaborative Space (Follow the signs to the Center for Data Science)
This work develops the best linear model of residential real estate prices for 2003 through 2009 in Los Angeles County. It differs from other studies comparing models for predicting house prices by covering a larger geographic area than most, more houses than most, a longer time period than most, and the time period both before and after the real estate price boom in the United States. In addition, it open sources all of the software. We test designs for linear models to determine the best form for the model as well as the training period, features, and regularizer that produce the lowest errors. We compare the best of our linear models to random forests and point to directions for further research.
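As an illustration of choosing a regularizer by validation error, the sketch below fits a one-feature ridge regression in closed form and picks the penalty with the lowest mean absolute error on held-out data. It is not the software released with this work: the single feature, the toy numbers, and every function name are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Model = Tuple[float, float]  # (slope, intercept)


def fit_ridge_1d(xs: Sequence[float], ys: Sequence[float], lam: float) -> Model:
    """Closed-form fit of y ~ w*x + b, with an L2 penalty on the slope only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / (sxx + lam)
    return w, mean_y - w * mean_x


def mean_abs_error(model: Model, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average absolute prediction error of `model` on (xs, ys)."""
    w, b = model
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)


def select_lambda(
    train: Tuple[Sequence[float], Sequence[float]],
    valid: Tuple[Sequence[float], Sequence[float]],
    lambdas: Sequence[float],
) -> float:
    """Return the penalty whose fitted model has the lowest validation error."""
    score: Callable[[float], float] = lambda lam: mean_abs_error(
        fit_ridge_1d(train[0], train[1], lam), valid[0], valid[1]
    )
    return min(lambdas, key=score)


# Hypothetical toy data: x could stand for living area, y for price.
train_data: Tuple[List[float], List[float]] = ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
valid_data: Tuple[List[float], List[float]] = ([4.0, 5.0], [9.0, 11.0])
print(select_lambda(train_data, valid_data, [0.0, 1.0, 10.0]))
```

The same loop could in principle range over training periods or feature sets instead of penalties, which is the kind of design comparison the abstract describes.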
Title: Building Fast, CPU-Efficient Distributed Systems on Ultra-Low Latency, RDMA-Capable Networks
Candidate: Christopher Mitchell
Advisor: Jinyang Li
Modern datacenters utilize traditional Ethernet interconnects to connect hundreds or thousands of machines. Although inexpensive and ubiquitous, Ethernet imposes design constraints on datacenter-scale distributed storage systems that use traditional client-server architectures. Recent technological trends indicate that future datacenters will embrace interconnects with ultra-low latency, high bandwidth, and the ability to offload work from servers to clients. Future datacenter-scale distributed storage systems will need to be designed specifically to exploit these features. This thesis explores what these features mean for large-scale in-memory storage systems, and derives two key insights for building RDMA-aware distributed systems.
First, relaxing locality between data and computation is now practical: data can be copied from servers to clients for computation. Second, selectively relaxing data-computation locality makes it possible to optimally balance load between server and client CPUs to maintain low application latency. This thesis presents two in-memory distributed storage systems built around these two insights, Pilaf and Cell, that demonstrate effective use of ultra-low-latency, RDMA-capable interconnects. Through Pilaf and Cell, this thesis demonstrates that by combining RDMA and message passing to selectively relax locality, systems can achieve ultra-low latency and optimal load balancing with modest CPU resources.
Title: Instance Segmentation of RGBD Scenes
Candidate: Nathan Silberman
Advisor: Rob Fergus
The vast majority of literature in scene parsing can be described as semantic pixel labeling or semantic segmentation: predicting the semantic class of the object represented by each pixel in the scene. Our familiar perception of the world, however, provides a far richer representation. Firstly, rather than just being able to predict the semantic class of a location in a scene, humans are able to reason about object instances. Discriminating between a region that might represent a single object versus ten objects is a crucial and basic faculty. Secondly, rather than reasoning about objects as merely occupying the space visible from a single vantage point, we are able to quickly and easily reason about an object's true extent in 3D. Thirdly, rather than viewing a scene as a collection of objects independently existing in space, humans exhibit a representation of scenes that is highly grounded through an intuitive model of physics. Such models allow us to reason about how objects relate physically: via physical support relationships.
Instance segmentation is the task of segmenting a scene into regions which correspond to individual object instances. We argue that this task is not only closer to our own perception of the world than semantic segmentation, but also directly allows for subsequent reasoning about a scene's constituent elements. We explore various strategies for instance segmentation in indoor RGBD scenes.
Firstly, we explore tree-based instance segmentation algorithms. The utility of trees for semantic segmentation has been thoroughly demonstrated and we adapt them to instance segmentation and analyze both greedy and global approaches to inference.
Next, we investigate exemplar-based instance segmentation algorithms, in which a set of representative exemplars are chosen from a large pool of regions and pixels are assigned to exemplars. Inference can either be performed in two stages, exemplar selection followed by pixel-to-exemplar assignment, or in a single joint reasoning stage. We consider the advantages and disadvantages of each approach.
We introduce the task of support-relation prediction, in which we predict which objects are physically supporting other objects. We propose an algorithm and a new set of features for performing discriminative support prediction, demonstrate the effectiveness of our method, and compare training mechanisms.
Finally, we introduce an algorithm for inferring scene and object extent. We demonstrate how reasoning about 3D extent can be done by extending known 2D methods and highlight the strengths and limitations of this approach.
Title: Localization of Humans in Images Using Convolutional Networks
Candidate: Jonathan Tompson
Advisor: Christoph Bregler
Tracking of humans in images is a long-standing problem in computer vision research for which, despite significant research effort, an adequate solution has not yet emerged. This is largely because human body localization is complicated and difficult: potential solutions must find the location of body joints in images with invariance to shape, lighting and texture variation, and they must do so in the presence of occlusion and incomplete data. Despite these significant challenges, this work presents a framework for human body pose localization that not only offers a significant improvement over existing traditional architectures, but also has sufficient localization performance and computational efficiency for use in real-world applications.
At its core, this framework makes use of Convolutional Networks to infer the location of body joints efficiently and accurately. We describe solutions to two applications: 1) hand-tracking from a depth image source and 2) human body-tracking from an RGB image source. For both these applications we show that Convolutional Networks are able to significantly outperform the existing state of the art.
We propose a new hybrid architecture that consists of a deep Convolutional Network and a Probabilistic Graphical Model which can exploit structural domain constraints such as geometric relationships between body joint locations to improve tracking performance. We then explore the use of both color and motion features to improve tracking performance. Finally we introduce a novel architecture which includes an efficient ‘position refinement’ model that is trained to estimate the joint offset location within a small region of the image. This refinement model allows our network to improve spatial localization accuracy even with large amounts of spatial pooling.
Acronym disambiguation is the process of determining the correct expansion of an acronym in a given context. We describe a novel approach for expanding acronyms, by identifying acronym / expansion pairs in a large training corpus of text from Wikipedia and using these as a training dataset to expand acronyms based on word frequencies. On instances in which the correct acronym expansion has at least one instance in our training set (therefore making correct expansion possible), and in which the correct expansion is not the only expansion of an acronym seen in our training set (therefore making the expansion decision a non-trivial decision), we achieve an average accuracy of 88.6%. On a second set of experiments using user-submitted documents, we achieve an average accuracy of 81%.
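One way to picture frequency-based expansion is the sketch below, which counts context words for each acronym / expansion pair and scores candidate expansions with add-one-smoothed log frequencies. This is only an assumed reading of "based on word frequencies"; the scoring rule, the function names, and the tiny training set are hypothetical and are not taken from the report.

```python
import math
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, Optional, Tuple

Model = DefaultDict[str, DefaultDict[str, Counter]]


def train(examples: Iterable[Tuple[str, str, str]]) -> Model:
    """Build acronym -> expansion -> counts of words seen near that expansion."""
    model: Model = defaultdict(lambda: defaultdict(Counter))
    for acronym, expansion, context in examples:
        model[acronym][expansion].update(context.lower().split())
    return model


def expand(model: Model, acronym: str, context: str) -> Optional[str]:
    """Return the best-scoring known expansion, or None for an unseen acronym."""
    candidates = model.get(acronym)
    if not candidates:
        return None
    words = context.lower().split()
    vocab = {w for counts in candidates.values() for w in counts}

    def score(counts: Counter) -> float:
        # Add-one smoothing so unseen words do not zero out a candidate.
        total = sum(counts.values()) + len(vocab) + 1
        return sum(math.log((counts[w] + 1) / total) for w in words)

    return max(candidates, key=lambda e: score(candidates[e]))


# Hypothetical training pairs; the report extracts such pairs from Wikipedia text.
toy_model = train([
    ("ACL", "anterior cruciate ligament", "knee injury surgery ligament tear"),
    ("ACL", "access control list", "network firewall permissions file system"),
])
print(expand(toy_model, "ACL", "the firewall blocked the network request"))
```

An acronym with a single known expansion is trivially resolved by this scheme, which matches the report's distinction between trivial and non-trivial expansion decisions.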
Title: Joint Training of a Neural Network and a Structured Model for Computer Vision
Candidate: Li Wan
Advisor: Rob Fergus
Identifying objects and telling where they are in real world images is one of the most important problems in Artificial Intelligence. The problem is challenging because of occluded objects, varying object viewpoints and object deformations. These factors make the vision problem extremely difficult, and it cannot be solved efficiently without learning.
This thesis explores hybrid systems that combine a neural network as a trainable feature extractor and structured models that capture high level information such as object parts. The resulting models combine the strengths of the two approaches: a deep neural network which provides a powerful non-linear feature transformation and a high level structured model which integrates domain-specific knowledge. We develop discriminative training algorithms to jointly optimize these entire models end-to-end.
First, we propose a unified model which combines a deep neural network with a latent topic model for image classification. The hybrid model is shown to outperform models based solely on neural networks or topic models alone. Next, we investigate techniques for training a neural network system, introducing an effective way of regularizing the network called DropConnect. DropConnect allows us to train large models while avoiding over-fitting. This yields state-of-the-art results on a variety of standard benchmarks for image classification. Third, we work on object detection for the PASCAL challenge. We improve the deformable parts model and propose a new non-maximal suppression algorithm. This system was the joint winner of the 2011 challenge. Finally, we develop a new hybrid model which integrates a deep network, a deformable parts model and non-maximal suppression. Joint training of our hybrid model shows a clear advantage over training each component individually, and achieves competitive results on standard benchmarks.
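The core idea of DropConnect, randomly dropping individual weights rather than whole activations during training, can be sketched for a single linear layer as follows. This is a simplified illustration and not the thesis implementation: the function name is hypothetical, and scaling the weights by the keep probability when no random generator is supplied is an assumption made here to keep the example short.

```python
import random
from typing import List, Optional

Matrix = List[List[float]]


def dropconnect_linear(
    x: List[float],
    weights: Matrix,
    keep_prob: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Linear layer y = (M * W) x with a fresh Bernoulli(keep_prob) mask M on W.

    With rng=None no mask is sampled; each weight is scaled by keep_prob
    instead (an assumed simplification for the non-training case).
    """
    out: List[float] = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            if rng is None:
                acc += keep_prob * w * xi   # deterministic, expected-value weights
            elif rng.random() < keep_prob:
                acc += w * xi               # this individual weight is kept
            # otherwise this weight is dropped for this forward pass
        out.append(acc)
    return out


W: Matrix = [[1.0, 2.0], [3.0, 4.0]]
sample = dropconnect_linear([1.0, 1.0], W, keep_prob=0.5, rng=random.Random(0))
print(sample)  # one random masked forward pass
print(dropconnect_linear([1.0, 1.0], W, keep_prob=0.5))
```

Because the mask acts on weights, each output unit sees its own random subset of inputs on every pass, which is the source of the regularizing effect.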