Many tasks in design, verification, and testing of hardware and computer systems can be reduced to checking satisfiability of logical formulas. Certain fragments of first-order logic that model the semantics of prevalent data types and of hardware and software constructs, such as integers, bit-vectors, and arrays, are thus of particular interest. The appeal of satisfiability modulo theories (SMT) solvers is that they implement decision procedures for efficiently reasoning about formulas in these fragments. Thus, they can often be used off-the-shelf as automated back-end solvers in verification tools. In this thesis, we expand the scope of SMT solvers by developing decision procedures for new theories of interest in reasoning about hardware and software.
First, we consider the theory of finite sets with cardinality. Sets are a common high-level data structure used in programming; thus, such a theory is useful for modeling program constructs directly. More importantly, sets are a basic construct of mathematics and thus natural to use when mathematically defining the properties of a computer system. We extend a calculus for finite sets to reason about cardinality constraints. The reasoning for cardinality involves tracking how different sets overlap. For an efficient procedure in an SMT solver, we'd like to avoid considering Venn regions directly, which has been the approach in earlier work. We develop a novel technique wherein potentially overlapping regions are considered incrementally. We use a graph to track the interaction of the different regions. Additionally, our technique leverages the procedure for reasoning about the other set operations (besides cardinality) in a modular fashion.
Second, a limitation frequently encountered is that verification problems are often not fully expressible in the theories supported natively by the solvers. Many solvers allow the specification of application-specific theories as quantified axioms, but their handling is incomplete outside of narrow special cases. We show how SMT solvers can be used to obtain complete decision procedures for local theory extensions, an important class of theories that are decidable using finite instantiation of axioms. We present an algorithm that uses E-matching to generate instances incrementally during the search, significantly reducing the number of generated instances compared to eager instantiation strategies.
We need better tools for C, such as source browsers, bug finders, and automated refactorings. The problem is that large C systems such as Linux are software product lines, containing thousands of configuration variables controlling every aspect of the software from architecture features to file systems and drivers. The challenge of such configurability is how software tools can accurately analyze all configurations of the source without the exponential explosion of trying them all separately. To this end, we focus on two key subproblems, parsing and the build system. The contributions of this thesis are the following: (1) a configuration-preserving preprocessor and parser called SuperC that preserves configurations in its output syntax tree; (2) a configuration-preserving Makefile evaluator called Kmax that collects Linux's compilation units and their configurations; and (3) a framework for configuration-aware analyses of source code using these tools.
C tools need to process two languages: C itself and the preprocessor. The latter improves expressivity through file includes, macros, and static conditionals. But it operates only on tokens, making it hard to even parse both languages. SuperC is a complete, performant solution to parsing all of C. First, a configuration-preserving preprocessor resolves includes and macros yet leaves static conditionals intact, thus preserving a program's variability. To ensure completeness, we analyze all interactions between preprocessor features and identify techniques for correctly handling them. Second, a configuration-preserving parser generates a well-formed AST with static choice nodes for conditionals. It forks new subparsers when encountering static conditionals and merges them again after the conditionals. To ensure performance, we present a simple algorithm for table-driven Fork-Merge LR parsing and four novel optimizations. We demonstrate SuperC's effectiveness on the x86 Linux kernel.
Large-scale C codebases like Linux are software product families, with complex build systems that tailor the software with myriad features. Such variability management is a challenge for tools, because they need awareness of variability to process all software product lines within the family. With over 14,000 features, processing all of Linux's product lines is infeasible by brute force, and current solutions employ incomplete heuristics. But having the complete set of compilation units with precise variability information is key to static tools such as bug-finders, which could miss critical bugs, and refactoring tools, since behavior-preservation requires a complete view of the software project. Kmax is a new tool for the Linux build system that extracts all compilation units with precise variability information. It processes build system files with a variability-aware make evaluator that stores variables in a conditional symbol table and hoists conditionals around complete statements, while tracking variability information as presence conditions. Kmax is evaluated empirically for correctness and completeness on the Linux kernel. Kmax is compared to previous work for correctness and running time, demonstrating that a complete solution's added complexity incurs only minor latency compared to the incomplete heuristic solutions.
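The idea of attaching a presence condition to each compilation unit can be illustrated with a small Python sketch. It is not Kmax's algorithm: it assumes a heavily simplified, hypothetical subset of Kbuild syntax (only lines of the form `obj-y += ...` or `obj-$(CONFIG_X) += ...`), treats every configuration variable as a plain Boolean, and represents conditions as strings rather than symbolic formulas. The function name `compilation_units` is made up for illustration.

```python
import re

# Hypothetical, simplified Kbuild-style line:
#   obj-y += a.o            (always built)
#   obj-$(CONFIG_X) += b.o  (built when CONFIG_X is enabled)
LINE = re.compile(r"^obj-(y|\$\((\w+)\))\s*\+=\s*(.+)$")


def compilation_units(lines):
    """Map each object file to a presence condition written as a string."""
    units = {}
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # this sketch ignores every other kind of line
        # group(2) is the configuration variable, or None for "obj-y".
        condition = match.group(2) or "True"
        for obj in match.group(3).split():
            units.setdefault(obj, []).append(condition)
    # An object is built if any of its conditions holds (a disjunction).
    return {obj: " || ".join(conds) for obj, conds in units.items()}


makefile = [
    "obj-y += main.o",
    "obj-$(CONFIG_USB) += usb.o hub.o",
    "obj-$(CONFIG_PCI) += usb.o",
]
units = compilation_units(makefile)
print(units["usb.o"])  # CONFIG_USB || CONFIG_PCI
```

A real evaluator also has to handle variable expansion, conditionals, and tristate options, which is where a conditional symbol table becomes necessary.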
SuperC's configuration-preserving parsing of compilation units and Kmax's project-wide capabilities are in a unique position to process source code across all configurations. Bug-finding is one area where such capability is useful. Bugs may appear in untested combinations of configurations, and testing each configuration one at a time is infeasible. For example, a compilation unit that defines a global function called by other compilation units may not be linked into the final program due to configuration variable selection. Such a bug can be found with Kmax and SuperC's cross-configuration capability. Cilantro is a framework for creating variability-aware bug-checkers. Kmax is used to determine the complete set of compilation units and the combinations of features that activate them, while SuperC's parsing framework is extended with semantic actions in order to implement the checkers. A checker for linker errors across all compilation units in the Linux kernel demonstrates each part of the Cilantro framework and is evaluated on the Linux source code.
The Boolean Satisfiability Problem (SAT) is a canonical decision problem originally shown to be NP-complete in Cook’s seminal work on the theory of computational complexity. The SAT problem is one of several computational tasks identified by researchers as core problems in computer science. The existence of an efficient decision procedure for SAT would imply P = NP. Nevertheless, numerous algorithms and techniques for solving the SAT problem in practical settings have been proposed. Highly efficient solvers are now actively being used, either directly or as a core engine of a larger system, to solve real-world problems that arise from many application domains. These state-of-the-art solvers use the Davis-Putnam-Logemann-Loveland (DPLL) algorithm extended with Conflict-Driven Clause Learning (CDCL). Due to the practical importance of SAT, building a fast SAT solver can have a huge impact on current and prospective applications. The ultimate contribution of this thesis is improving the state of the art of CDCL by understanding and exploiting the empirical characteristics of how CDCL works on real-world problems. The first part of the thesis shows empirically that most of the unsatisfiable real-world problems solvable by CDCL have a refutation proof with near-constant width for a large portion of the proof. Based on this observation, the thesis provides an unconventional perspective that CDCL solvers can solve real-world problems very efficiently, and often more efficiently, just by maintaining a small set of certain classes of learned clauses. The next part of the thesis focuses on understanding the inherently different natures of satisfiable and unsatisfiable problems and their implications on the empirical workings of CDCL. We examine how the roles and effects of crucial elements of CDCL vary with the satisfiability status of a problem.
Ultimately, we propose effective techniques to exploit the new insights about the different natures of proving satisfiability and unsatisfiability to improve the state of the art of CDCL. In the last part of the thesis, we present a reference solver that incorporates all the techniques described in the thesis. The design of the presented solver emphasizes minimality in implementation while guaranteeing state-of-the-art performance. Several versions of the reference solver have demonstrated top-notch performance, earning several medals in the annual SAT competitive events. The minimal spirit of the reference solver shows that a simple CDCL framework alone can still be made competitive with state-of-the-art solvers that implement sophisticated techniques outside the CDCL framework.
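For readers unfamiliar with the DPLL algorithm that CDCL extends, the following is a minimal Python sketch of plain DPLL: simplification under a partial assignment, unit propagation, and chronological branching. It deliberately omits clause learning, restarts, and every heuristic studied in the thesis, and it is not the reference solver. The function name `dpll` and the DIMACS-style encoding of literals as nonzero integers are assumptions made for illustration.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment (dict: variable -> bool) or None.

    clauses is a list of clauses; each clause is a list of nonzero
    integers, where k stands for variable k and -k for its negation.
    """
    if assignment is None:
        assignment = {}

    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        remaining = []
        satisfied = False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)

    if not simplified:
        return assignment  # all clauses are satisfied

    # Unit propagation: a one-literal clause forces that literal.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Branch on the first unassigned variable, trying True then False.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None


formula = [[1, 2], [-1, 2], [-2, 3]]
model = dpll(formula)
```

A CDCL solver replaces the chronological backtracking in the last loop with conflict analysis, which derives a learned clause from each conflict and backjumps accordingly.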
Scalability is a key challenge in static program analyses based on solvers for Satisfiability Modulo Theories (SMT). For imperative languages like C, the approach taken for modeling memory can play a significant role in scalability. The main theme of this thesis is using partitioned memory models to divide up memory based on the alias information derived from a points-to analysis.
First, a general analysis framework based on memory partitioning is presented. It incorporates a points-to analysis as a preprocessing step to determine a conservative approximation of which areas of memory may alias or overlap and splits the memory into distinct arrays for each of these areas.
Then we propose a new cell-based field-sensitive points-to analysis, which is an extension of Steensgaard’s unification-based algorithms. A cell is a unit of access with scalar or record type. Arrays and dynamic memory allocations are viewed as collections of cells. We show how our points-to analysis yields more precise alias information for programs with complex heap data structures.
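As background, the unification-based style of points-to analysis can be sketched with a union-find structure. The Python sketch below follows the classic field-insensitive scheme, in which every equivalence class has at most one points-to target and merging two classes also merges their targets. It is not the cell-based, field-sensitive analysis proposed here, and the class and method names are hypothetical. Each resulting equivalence class corresponds to one candidate memory partition.

```python
class Steensgaard:
    """Field-insensitive, unification-based points-to analysis (sketch)."""

    def __init__(self):
        self.parent = {}  # union-find parent pointers
        self.target = {}  # class representative -> node it points to
        self.fresh = 0    # counter for placeholder locations

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def pointee(self, x):
        """Return the class x points to, creating a placeholder if needed."""
        rep = self.find(x)
        if rep not in self.target:
            self.fresh += 1
            self.target[rep] = f"$loc{self.fresh}"
        return self.find(self.target[rep])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        ta, tb = self.target.pop(a, None), self.target.pop(b, None)
        self.parent[a] = b
        if ta is not None and tb is not None:
            self.target[b] = ta
            self.union(ta, tb)  # merged classes must share one target
        elif ta is not None or tb is not None:
            self.target[b] = ta if ta is not None else tb

    def address_of(self, p, x):
        """Process the statement p = &x."""
        self.union(self.pointee(p), x)

    def copy(self, p, q):
        """Process the statement p = q."""
        self.union(self.pointee(p), self.pointee(q))

    def may_alias(self, x, y):
        """True if x and y fall into the same memory partition."""
        return self.find(x) == self.find(y)


analysis = Steensgaard()
analysis.address_of("p", "x")  # p = &x
analysis.address_of("q", "y")  # q = &y
analysis.address_of("r", "z")  # r = &z
analysis.copy("p", "q")        # p = q, so x and y are unified
```

In this example `x` and `y` end up in one partition while `z` stays in its own, so a partitioned memory model could use two separate arrays for them instead of one flat array.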
Our work is implemented in Cascade, a static analysis framework for C programs. It replaces the former flat memory model that models the memory as a single array of bytes. We show that the partitioned memory models achieve better scalability within Cascade, and the cell-based memory model, in particular, improves the performance significantly, making Cascade a state-of-the-art C analyzer.
Isogeometric analysis has been introduced as an alternative to finite element methods in order to simplify the integration of CAD software and the discretization of variational problems of continuum mechanics. In contrast with the finite element case, the basis functions of isogeometric analysis are often not nodal. As a consequence, there are fat interfaces which can easily lead to an increase in the number of interface variables after a decomposition of the parameter space into subdomains. Building on earlier work on the deluxe version of the BDDC family of domain decomposition algorithms, several adaptive algorithms are here developed for scalar elliptic problems in an effort to decrease the dimension of the global, coarse component of these preconditioners. Numerical experiments provide evidence that this work can be successful, yielding scalable and quasi-optimal adaptive BDDC algorithms for isogeometric discretizations.
Two domain decomposition methods for solving vector field problems posed in H(curl) and discretized with Nédélec finite elements are considered. These finite elements are conforming in H(curl).
A two-level overlapping Schwarz algorithm in two dimensions is analyzed, where the subdomains are only assumed to be uniform in the sense of Peter Jones. The coarse space is based on energy minimization and its dimension equals the number of interior subdomain edges. Local direct solvers are based on the overlapping subdomains. The bound for the condition number depends only on a few geometric parameters of the decomposition. This bound is independent of jumps in the coefficients across the interface between the subdomains for most of the different cases considered.
A bound is also obtained for the condition number of a balancing domain decomposition by constraints (BDDC) algorithm in two dimensions, with Jones subdomains. For the primal variable space, a continuity constraint for the tangential average over each interior subdomain edge is imposed. For the averaging operator, a new technique named deluxe scaling is used. The optimal bound is independent of jumps in the coefficients across the interface between the subdomains.
Furthermore, a new coarse function for problems in three dimensions is introduced, with only one degree of freedom per subdomain edge. In all the cases, it is established that the algorithms are scalable. Numerical results that support the theory are provided, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
Volatility in critical socio-economic indices can have a significant negative impact on global development. This thesis presents a suite of novel big data analytics algorithms that operate on unstructured Web data streams to automatically infer events, knowledge graphs and predictive models to understand, characterize and predict the volatility of socioeconomic indices.
This thesis makes four important research contributions. First, given a large volume of diverse unstructured news streams, we present new models for capturing events and learning spatio-temporal characteristics of events from news streams. We specifically explore two types of event models in this thesis: one centered around the concept of event triggers and a probabilistic meta-event model that explicitly delineates named entities from text streams to learn a generic class of meta-events. The second contribution focuses on learning several different types of knowledge graphs from news streams and events: a) spatio-temporal article graphs, which capture intrinsic relationships between different news articles; b) event graphs, which characterize relationships between events and, given a news query, provide a succinct summary of a timeline of events relating to the query; c) event-phenomenon graphs, which provide a condensed representation of classes of events that relate to a given phenomenon at a given location and time; d) causality testing on word-word graphs, which can capture strong spatio-temporal relationships between word occurrences in news streams; e) concept graphs, which capture relationships between different word concepts that occur in a given text stream.
The third contribution focuses on connecting the different knowledge graph representations and structured time series data corresponding to a socio-economic index to automatically learn event-driven predictive models for the given socio-economic index to predict future volatility. We propose several types of predictive models centered around our two event models: event triggers and probabilistic meta-events. The final contribution focuses on a broad spectrum of inference case studies for different types of socio-economic indices including food prices, stock prices, disease outbreaks and interest rates. Across all these indices, we show that event-driven predictive models provide significant improvements in prediction accuracy over state-of-the-art techniques.
Abstract Interpretation is a theory of sound approximation of program semantics. In recent decades, it has been widely and successfully applied to the static analysis of computer programs. In this thesis, we work on abstract domains, one of the key concepts in abstract interpretation, which aim at automatically collecting information about the set of all possible values of the program variables. We focus, in particular, on two aspects: the combination with theorem provers and the refinement of existing abstract domains.
Satisfiability modulo theories (SMT) solvers are popular theorem provers, which proved to be very powerful tools for checking the satisfiability of first-order logical formulas with respect to some background theories. In the first part of this thesis, we introduce two abstract domains whose elements are logical formulas involving finite conjunctions of affine equalities and finite conjunctions of linear inequalities. These two abstract domains rely on SMT solvers for the computation of transformations and other logical operations.
In the second part of this thesis, we present an abstract domain functor whose elements are binary decision trees. It is parameterized by decision nodes which are a set of boolean tests appearing in the programs and by a numerical or symbolic abstract domain whose elements are the leaves. This new binary decision tree abstract domain functor provides a flexible way of adjusting the cost/precision ratio in path-dependent static analysis.
Next-generation wireless networks are widely expected to operate over millimeter-wave (mmW) frequencies of over 28 GHz. These bands mitigate the acute spectrum shortage in the conventional microwave bands of less than 6 GHz. The shorter wavelengths in these bands also allow for building dense antenna arrays on a single chip, thereby enabling various MIMO configurations and highly directional links that can increase the spatial reuse of spectrum.
While attempting to build a practical over-the-air (OTA) link over mmW, we realized that the traditional baseband processing techniques used in the microwave bands simply could not cope with the exacerbated frequency offsets (or phase noise) observed in the RF oscillators at these bands. While the frequency offsets are large, the real difficulty arose from the fact that they varied significantly over very short time-scales. Traditional feedback loop techniques still left significant residual offsets, which in turn led to inter-carrier interference (ICI). The result was very high symbol error rates (SER).
This thesis presents Iris, a baseband processing block that enables clean mmW links, even in the presence of previously fatal amounts of phase noise. Over real mmW hardware, Iris reduces the SER by one to two orders of magnitude, as compared to competing techniques.
In the greater part of this thesis, we develop a set of convolutional networks that infer predictions at each pixel of an input image. This is a common problem that arises in many computer vision applications: for example, predicting a semantic label at each pixel describes not only the image content, but also fine-grained locations and segmentations; at the same time, finding depth or surface normals provides 3D geometric relations between points. The second part of this thesis investigates convolutional models also in the contexts of classification and unsupervised learning.
To address our main objective, we develop a versatile Multi-Scale Convolutional Network that can be applied to diverse vision problems using simple adaptations, and apply it to predict depth at each pixel, surface normals and semantic labels. Our model uses a series of convolutional network stacks applied at progressively finer scales. The first uses the entire image field of view to predict a spatially coarse set of feature maps based on global relations; subsequent scales correct and refine the output, yielding a high resolution prediction. We look exclusively at depth prediction first, then generalize our method to multiple tasks. Our system achieves state-of-the-art results on all tasks we investigate, and can match many image details without the need for superpixelation.
Leading to our multi-scale network, we also design a purely local convolutional network to remove dirt and raindrops present on a window surface, which learns to identify and inpaint compact corruptions. We also investigate a weighted nearest-neighbors labeling system applied to superpixels, in which we learn weights for each example, and use local context to find rare class instances.
In addition, we investigate the relative importance of sizing parameters using a recursive convolutional network, finding that network depth is most critical. We also develop a Convolutional LISTA Autoencoder, which learns features similar to stacked sparse coding at a fraction of the cost, combine it with a local entropy objective, and describe a convolutional adaptation of ZCA whitening.
Large-scale C software like Linux needs software engineering tools. But such codebases are software product families, with complex build systems that tailor the software with myriad features. This variability management is a challenge for tools, because they need awareness of variability to process all software product lines within the family. With over 14,000 features, processing all of Linux's product lines is infeasible by brute force, and current solutions employ incomplete heuristics. But having the complete set of compilation units with precise variability information is key to static tools such as bug-finders, which could miss critical bugs, and refactoring tools, since behavior-preservation requires a complete view of the software project. Kmax is a new tool for the Linux build system that extracts all compilation units with precise variability information. It processes build system files with a variability-aware make evaluator that stores variables in a conditional symbol table and hoists conditionals around complete statements, while tracking variability information as presence conditions. Kmax is evaluated empirically for correctness and completeness on the Linux kernel. Kmax is compared to previous work for correctness and running time, demonstrating that a complete solution's added complexity incurs only minor latency compared to the incomplete heuristic solutions.
Much of computer vision has been devoted to the question of representation through feature extraction. Ideal features transform raw pixel intensity values to a representation in which common problems such as object identification, tracking, and segmentation are easier to solve. Recently, deep feature hierarchies have proven to be immensely successful at solving many problems in computer vision. In the supervised setting, these hierarchies are trained to solve specific problems by minimizing an objective function of the data and problem-specific label information. Recent findings suggest that despite being trained on a specific task, the learned features can be transferred across multiple visual tasks. These findings suggest that there exists a generically useful feature representation for natural visual data.
This work aims to uncover the principles that lead to these generic feature representations in the unsupervised setting, which does not require problem specific label information. We begin by reviewing relevant prior work, particularly the literature on autoencoder networks and energy based learning. We introduce a new regularizer for autoencoders that plays an analogous role to the partition function in probabilistic graphical models. Next we explore the role of specialized encoder architectures for sparse inference. The remainder of the thesis explores visual feature learning from video. We establish a connection between slow-feature learning and metric learning, and experimentally demonstrate that semantically coherent metrics can be learned from natural videos. Finally, we posit that useful features linearize natural image transformations in video. To this end, we introduce a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unlabeled natural video sequences by learning to predict future frames in the presence of uncertainty.
As software and hardware systems grow in complexity, automated techniques for ensuring their correctness are becoming increasingly important. Many modern formal verification tools rely on back-end satisfiability modulo theories (SMT) solvers to discharge complex verification goals. These goals are usually formalized in one or more fixed first-order logic theories, such as the theory of fixed-width bit-vectors. The theory of bit-vectors offers a natural way of encoding the precise semantics of typical machine operations on binary data. The predominant approach to deciding the bit-vector theory is via eager reduction to propositional logic. While this often works well in practice, it does not scale well as the bit-width and number of operations increase. The first part of this thesis seeks to fill this gap by exploring efficient techniques for solving bit-vector constraints that leverage the word-level structure. We propose two complementary approaches: an eager approach that takes full advantage of the solving power of off-the-shelf propositional logic solvers, and a lazy approach that combines on-the-fly algebraic reasoning with efficient propositional logic solvers. In the second part of the thesis, we propose a proof system for encoding automatically checkable refutation proofs in the theory of bit-vectors. These proofs can be automatically generated by the SMT solver, and act as a certificate for the correctness of the result.
We describe the data sources and machine learning algorithms that go into the current version of http://www.whatcanifarm.com, a website to help prospective organic farmers determine what to grow given the climate characterized by their zip code.
This work develops the best linear model of residential real estate prices for 2003 through 2009 in Los Angeles County. It differs from other studies comparing models for predicting house prices by covering a larger geographic area than most, more houses than most, a longer time period than most, and the time period both before and after the real estate price boom in the United States.
In addition, it open sources all of the software. We test designs for linear models to determine the best form for the model as well as the training period, features, and regularizer that produce the lowest errors. We compare the best of our linear models to random forests and point to directions for further research.
Modern datacenters utilize traditional Ethernet interconnects to connect hundreds or thousands of machines. Although inexpensive and ubiquitous, Ethernet imposes design constraints on datacenter-scale distributed storage systems that use traditional client-server architectures. Recent technological trends indicate that future datacenters will embrace interconnects with ultra-low latency, high bandwidth, and the ability to offload work from servers to clients. Future datacenter-scale distributed storage systems will need to be designed specifically to exploit these features. This thesis explores what these features mean for large-scale in-memory storage systems, and derives two key insights for building RDMA-aware distributed systems.
First, relaxing locality between data and computation is now practical: data can be copied from servers to clients for computation. Second, selectively relaxing data-computation locality makes it possible to optimally balance load between server and client CPUs to maintain low application latency. This thesis presents two in-memory distributed storage systems built around these two insights, Pilaf and Cell, that demonstrate effective use of ultra-low-latency, RDMA-capable interconnects. Through Pilaf and Cell, this thesis demonstrates that by combining RDMA and message passing to selectively relax locality, systems can achieve ultra-low latency and optimal load balancing with modest CPU resources.
A BDDC domain decomposition preconditioner is defined by a coarse component, expressed in terms of primal constraints, a weighted average across the interface between the subdomains, and local components given in terms of solvers of local subdomain problems. BDDC methods for vector field problems discretized with Raviart-Thomas finite elements are introduced. The methods are based on a new type of weighted average and an adaptive selection of primal constraints developed to deal with coefficients with high contrast even inside individual subdomains. For problems with very many subdomains, a third level of the preconditioner is introduced.
Under the assumption that the subdomains are all built from elements of a coarse triangulation of the given domain, and that the material parameters are constant in each subdomain, a bound is obtained for the condition number of the preconditioned linear system which is independent of the values and the jumps of the coefficients across the interface and is polylogarithmic in the number of degrees of freedom of the individual subdomains. Numerical experiments for two- and three-dimensional problems, using the PETSc library and a large parallel computer, are also presented; they support the theory and show the effectiveness of the algorithms even for problems not covered by the theory. Included are also experiments with a variety of finite element approximations.
Compilers for statically typed functional programming languages are notorious for generating confusing type error messages. When the compiler detects a type error, it typically reports the program location where the type checking failed as the source of the error. Since other error sources are not even considered, the actual root cause is often missed. A more adequate approach is to consider all possible error sources and report the most useful one subject to some usefulness criterion. In our previous work, we showed that this approach can be formulated as an optimization problem related to satisfiability modulo theories (SMT). This formulation cleanly separates the heuristic nature of usefulness criteria from the underlying search problem. Unfortunately, algorithms that search for an optimal error source cannot directly use principal types which are crucial for dealing with the exponential-time complexity of the decision problem of polymorphic type checking. In this paper, we present a new algorithm that efficiently finds an optimal error source in a given ill-typed program. Our algorithm uses an improved SMT encoding to cope with the high complexity of polymorphic typing by iteratively expanding the typing constraints from which principal types are derived. The algorithm preserves the clean separation between the heuristics and the actual search. We have implemented our algorithm for OCaml. In our experimental evaluation, we found that the algorithm reduces the running times for optimal type error localization from minutes to seconds and scales better than previous localization algorithms.
The vast majority of literature in scene parsing can be described as semantic pixel labeling or semantic segmentation: predicting the semantic class of the object represented by each pixel in the scene. Our familiar perception of the world, however, provides a far richer representation. Firstly, rather than just being able to predict the semantic class of a location in a scene, humans are able to reason about object instances. Discriminating between a region that might represent a single object versus ten objects is a crucial and basic faculty. Secondly, rather than reasoning about objects as merely occupying the space visible from a single vantage point, we are able to quickly and easily reason about an object's true extent in 3D. Thirdly, rather than viewing a scene as a collection of objects independently existing in space, humans exhibit a representation of scenes that is highly grounded through an intuitive model of physics. Such models allow us to reason about how objects relate physically: via physical support relationships.
Instance segmentation is the task of segmenting a scene into regions which correspond to individual object instances. We argue that this task is not only closer to our own perception of the world than semantic segmentation, but also directly allows for subsequent reasoning about a scenes constituent elements. We explore various strategies for instance segmentation in indoor RGBD scenes.
Firstly, we explore tree-based instance segmentation algorithms. The utility of trees for semantic segmentation has been thoroughly demonstrated and we adapt them to instance segmentation and analyze both greedy and global approaches to inference.
Next, we investigate exemplar-based instance segmentation algorithms, in which a set of representative exemplars is chosen from a large pool of regions and pixels are assigned to exemplars. Inference can either be performed in two stages, exemplar selection followed by pixel-to-exemplar assignment, or in a single joint reasoning stage. We consider the advantages and disadvantages of each approach.
We introduce the task of support-relation prediction, in which we predict which objects are physically supporting other objects. We propose an algorithm and a new set of features for performing discriminative support prediction, demonstrate the effectiveness of our method, and compare training mechanisms.
Finally, we introduce an algorithm for inferring scene and object extent. We demonstrate how reasoning about 3D extent can be done by extending known 2D methods and highlight the strengths and limitations of this approach.
Tracking of humans in images is a long-standing problem in computer vision research for which, despite significant research effort, an adequate solution has not yet emerged. This is largely due to the fact that human body localization is complicated and difficult; potential solutions must find the location of body joints in images with invariance to shape, lighting and texture variation, and they must do so in the presence of occlusion and incomplete data. Despite these significant challenges, this work presents a framework for human body pose localization that not only offers a significant improvement over existing traditional architectures, but also has sufficient localization performance and computational efficiency for use in real-world applications.
At its core, this framework makes use of Convolutional Networks to infer the location of body joints efficiently and accurately. We describe solutions to two applications: 1) hand-tracking from a depth image source and 2) human body-tracking from an RGB image source. For both of these applications we show that Convolutional Networks are able to significantly outperform the existing state of the art.
We propose a new hybrid architecture that consists of a deep Convolutional Network and a Probabilistic Graphical Model which can exploit structural domain constraints such as geometric relationships between body joint locations to improve tracking performance. We then explore the use of both color and motion features to improve tracking performance. Finally we introduce a novel architecture which includes an efficient ‘position refinement’ model that is trained to estimate the joint offset location within a small region of the image. This refinement model allows our network to improve spatial localization accuracy even with large amounts of spatial pooling.
Acronym disambiguation is the process of determining the correct expansion of an acronym in a given context. We describe a novel approach for expanding acronyms, by identifying acronym / expansion pairs in a large training corpus of text from Wikipedia and using these as a training dataset to expand acronyms based on word frequencies. On instances in which the correct acronym expansion has at least one instance in our training set (therefore making correct expansion possible), and in which the correct expansion is not the only expansion of an acronym seen in our training set (therefore making the expansion decision a non-trivial decision), we achieve an average accuracy of 88.6%. On a second set of experiments using user-submitted documents, we achieve an average accuracy of 81%.
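The idea of choosing an expansion from word frequencies can be illustrated with a minimal sketch. This is not the model described in the abstract: the function names `train` and `expand`, the triple-based training format, and the relative-frequency overlap score are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(examples):
    """Build per-expansion context word counts.

    `examples` is an iterable of (acronym, expansion, context_text) triples;
    this input format is a hypothetical stand-in for pairs mined from a corpus.
    """
    model = defaultdict(lambda: defaultdict(Counter))
    for acronym, expansion, context in examples:
        model[acronym][expansion].update(context.lower().split())
    return model


def expand(model, acronym, context):
    """Return the expansion whose training contexts best match `context`.

    Returns None when the acronym was never seen in training.
    """
    candidates = model.get(acronym)
    if not candidates:
        return None
    query = Counter(context.lower().split())

    def score(expansion):
        counts = candidates[expansion]
        total = sum(counts.values()) or 1  # guard against empty contexts
        # Sum of relative frequencies of the query words in this expansion's contexts.
        return sum(n * counts[word] / total for word, n in query.items())

    return max(candidates, key=score)


# Toy training data, invented for this example.
EXAMPLES = [
    ("ACL", "anterior cruciate ligament", "the knee injury required surgery"),
    ("ACL", "access control list", "the file server permissions were updated"),
]
MODEL = train(EXAMPLES)
```

With this toy data, a query context that mentions a knee is scored closer to the medical expansion, while one that mentions server permissions is scored closer to the computing one.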
Identifying objects and telling where they are in real world images is one of the most important problems in Artificial Intelligence. The problem is challenging due to occluded objects, varying object viewpoints and object deformations. These factors make the vision problem extremely difficult, and it cannot be solved efficiently without learning.
This thesis explores hybrid systems that combine a neural network as a trainable feature extractor and structured models that capture high level information such as object parts. The resulting models combine the strengths of the two approaches: a deep neural network which provides a powerful non-linear feature transformation and a high level structured model which integrates domain-specific knowledge. We develop discriminative training algorithms to jointly optimize these entire models end-to-end.
First, we propose a unified model which combines a deep neural network with a latent topic model for image classification. The hybrid model is shown to outperform models based solely on neural networks or topic models. Next, we investigate techniques for training a neural network system, introducing an effective way of regularizing the network called DropConnect. DropConnect allows us to train large models while avoiding over-fitting. This yields state-of-the-art results on a variety of standard benchmarks for image classification. Third, we work on object detection for the PASCAL challenge. We improve the deformable parts model and propose a new non-maximal suppression algorithm. This system was the joint winner of the 2011 challenge. Finally, we develop a new hybrid model which integrates a deep network, a deformable parts model and non-maximal suppression. Joint training of our hybrid model shows a clear advantage over training each component individually and achieves competitive results on standard benchmarks.
Scalability is a key challenge in static program analyses based on solvers for Satisfiability Modulo Theories (SMT). For imperative languages like C, the approach taken for modeling memory can play a significant role in scalability. The main theme of this thesis is using partitioned memory models to divide up memory based on the alias information derived from a points-to analysis.
First, a general analysis framework based on memory partitioning is presented. It incorporates a points-to analysis as a preprocessing step to determine a conservative approximation of which areas of memory may alias or overlap and splits the memory into distinct arrays for each of these areas.
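The partitioning step can be sketched as follows, assuming the points-to analysis has already produced a list of may-alias pairs. The function name `partition_memory` and the pair-based input are hypothetical and are not taken from Cascade; the sketch only shows how a conservative alias relation induces disjoint memory regions, each of which could then be modeled by its own array.

```python
def partition_memory(variables, may_alias_pairs):
    """Group variables into disjoint regions using union-find.

    Two variables end up in the same region if they are connected by a
    chain of may-alias pairs, so variables in different regions are
    guaranteed (by the assumed analysis) not to overlap.
    """
    parent = {v: v for v in variables}

    def find(v):
        # Walk to the representative, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in may_alias_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    regions = {}
    for v in variables:
        regions.setdefault(find(v), set()).add(v)
    return list(regions.values())
```

For example, with variables `p`, `q`, `r`, `s` and may-alias pairs `(p, q)` and `(q, r)`, the sketch yields one region containing `p`, `q`, `r` and a separate region for `s`.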
Then we propose a new cell-based field-sensitive points-to analysis, which is an extension of Steensgaard's unification-based algorithms. A cell is a unit of access with scalar or record type. Arrays and dynamic memory allocations are viewed as a collection of cells. We show how our points-to analysis yields more precise alias information for programs with complex heap data structures.
Our work is implemented in Cascade, a static analysis framework for C programs. It replaces the former flat memory model that models the memory as a single array of bytes. We show that the partitioned memory models achieve better scalability within Cascade, and the cell-based memory model, in particular, improves the performance significantly, making Cascade a state-of-the-art C analyzer.
This thesis concerns the acquisition, modeling and manipulation of the human form.
First, we acquire body models. We introduce an efficient bootstrapped algorithm that we employed to register over 2,000 high resolution body scans of male and female adult subjects. Our algorithm outputs not only the traditional vertex correspondences, but also directly produces a high quality model which can be immediately deformed. We then employ the result to fit noisy depth maps coming from now commercially available 3D sensors such as Microsoft's Kinect and PrimeSense's Carmine.
Next, we switch focus to the topic of body manipulation. We first revisit the more traditional way of specifying bodies from a set of measurements, such as those from clothing sizing charts, showing how the statistics of the population learned during the registration can aid us in accurately defining the body shape. We then introduce a new manipulation metaphor, where we navigate through the space of body shapes and poses by directly dragging the body mesh surface.
We conclude by describing a new real-time system for image-based body manipulation called BodyJam, which lets you change your outfit with a finger snap. BodyJam is inspired by a technique invented by the surrealists a century ago: "Exquisite Corpse", a method by which a collection of images (of body parts) is collectively assembled. BodyJam does so on a video display that mirrors, in real time, the pose of a real person standing in front of the camera/display mirror, and allows the user to change clothes and other appearance attributes. Using Microsoft's Kinect, poses are matched to a video database of different torsos and legs, and "pages" showing different clothes are turned by hand gestures.
Low order finite element discretizations of the linear elasticity system suffer increasingly from locking effects and ill-conditioning when the material approaches the incompressible limit, if only the displacement variables are used. Mixed finite elements using both displacement and pressure variables provide a well-known remedy, but they yield larger and indefinite discrete systems for which the design of scalable and efficient iterative solvers is challenging. Two-level overlapping Schwarz preconditioners for the almost incompressible system of linear elasticity, discretized by mixed finite elements with discontinuous pressures, are constructed and analyzed. The preconditioned systems are accelerated either by a GMRES (generalized minimum residual) method applied to the resulting discrete saddle point problem or by a PCG (preconditioned conjugate gradient) method applied to a positive definite, although extremely ill-conditioned, reformulation of the problem obtained by eliminating all pressure variables on the element level. A novel theoretical analysis of the algorithm for the positive definite reformulation is given by extending some earlier results by Dohrmann and Widlund. The main result of the paper is a bound on the condition number of the algorithm which is cubic in the relative overlap and grows logarithmically with the number of elements across individual subdomains but is otherwise independent of the number of subdomains, their diameters and mesh sizes, and the incompressibility of the material and possible discontinuities of the material parameters across the subdomain interfaces. Numerical results in the plane confirm the theory and also indicate that an analogous result should hold for the saddle point formulation, as well as for spectral element discretizations.
A bound is obtained for the condition number of a BDDC algorithm for problems posed in H(curl) in two dimensions, where the subdomains are only assumed to be uniform in the sense of Peter Jones. For the primal variable space, a continuity constraint for the tangential average over each interior subdomain edge is imposed.
For the averaging operator, a new technique named deluxe scaling is used. Our bound is independent of jumps in the coefficients across the interface between the subdomains and depends only on a few geometric parameters of the decomposition. Numerical results that verify the result are shown, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
A bound is obtained for the condition number of a two-level overlapping Schwarz algorithm for problems posed in H(curl) in two dimensions, where the subdomains are only assumed to be uniform in the sense of Peter Jones. The coarse space is based on energy minimization and its dimension equals the number of interior subdomain edges. Local direct solvers are used on the overlapping subdomains. Our bound depends only on a few geometric parameters of the decomposition. This bound is independent of jumps in the coefficients across the interface between the subdomains for most of the different cases considered. Numerical experiments that verify the result are shown, including some with subdomains with fractal edges and others obtained by a mesh partitioner.
The impetus for this dissertation is to explain why well-functioning markets might be able to stay at or near a market equilibrium. We argue that tatonnement, a natural, simple and distributed price update dynamic in economic markets, is a plausible candidate to explain how markets might reach their equilibria.
Tatonnement is broadly defined as follows: if the demand for a good is more than the supply, increase the price of the good, and conversely, decrease the price when the demand is less than the supply. Prior works show that tatonnement converges to market equilibrium in some markets while it fails to converge in other markets. Our goal is to extend the classes of markets in which tatonnement is shown to converge. The prior positive results largely concerned markets with substitute goods. We seek market constraints which enable tatonnement to converge in markets with complementary goods, or with a mixture of substitutes and complementary goods. We also show fast convergence rates for some of these markets.
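The price update rule described above can be sketched for one especially simple market. The sketch assumes a market with unit supply of every good in which buyers devote a fixed amount of money to each good, so that demand for good j is `spending[j] / price[j]`; the multiplicative update, the step size and the synchronous rounds are illustrative choices and are not claimed to be the exact rules analyzed in the dissertation.

```python
def tatonnement(spending, prices, step=0.5, rounds=50):
    """Run synchronous tatonnement rounds and return the final prices.

    spending[j] is the total money spent on good j (assumed fixed), and
    the supply of every good is assumed to be 1.
    """
    prices = list(prices)
    for _ in range(rounds):
        for j, price in enumerate(prices):
            demand = spending[j] / price
            excess = demand - 1.0  # demand minus the unit supply
            # Raise the price when demand exceeds supply, lower it otherwise.
            prices[j] = price * (1.0 + step * excess)
    return prices
```

In this toy market the update simplifies to `price + step * (spending[j] - price)`, so each price moves a fixed fraction of the way toward its equilibrium value `spending[j]` in every round. Markets with complementary goods, which the text above is concerned with, do not reduce to such a simple form.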
We introduce an amortized analysis technique to handle asynchronous events - in our case asynchronous price updates. On the other hand, for some markets we show that tatonnement is equivalent to generalized gradient descent (GGD). The amortized analysis and our analysis of GGD may be of independent interest.
In this paper, we present and analyze a BDDC algorithm for a class of elliptic problems in the three-dimensional H(curl) space. Compared with existing results, our condition number estimate requires fewer assumptions and also involves two fewer powers of log(H/h), making it consistent with optimal estimates for other elliptic problems. Here, H/h is the maximum of H_i/h_i over all subdomains, where H_i and h_i are the diameter and the smallest element diameter for the subdomain Ω_i.
The analysis makes use of two recent developments. The first is a new approach to averaging across the subdomain interfaces, while the second is a new technical tool which allows arguments involving trace classes to be avoided. Numerical examples are presented to confirm the theory and demonstrate the importance of the new averaging approach in certain cases.
In this work, we describe an application of convolutional networks to object classification and detection in images. The task of image based object recognition is surveyed in the first chapter. Its application in internet advertisement is one of the main motivations of this work.
The architecture of the convolutional networks is described in detail in the following chapter. Stochastic gradient descent is used to train the networks.
We then describe the data collection and labelling process. The set of labelled training data largely determines what kind of recognizer is built. Four binary classifiers are trained for the object types of sailboat, car, motorbike, and dog.
A GPU-based massively parallel implementation of the convolutional networks is built. This enables us to run the convolution operations close to 40 times faster than on a traditional CPU. Details about how to implement the convolution operation on NVIDIA GPUs using CUDA are discussed.
In order to apply the object recognizer in a production environment where millions of images are processed daily, we have built a platform with cloud computing. We describe how large scale and low latency image processing can be achieved with such a system.
A core technique of modern tools for formally reasoning about computing systems is generating and dispatching queries to automated theorem provers, including Satisfiability Modulo Theories (SMT) provers. SMT provers aim at the tight integration of decision procedures for propositional satisfiability and decision procedures for fixed first-order theories ‒ known as theory solvers. This thesis presents several advancements in the design and implementation of theory solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic. These are implemented within the SMT system CVC4. We begin by formally describing the Satisfiability Modulo Theories problem and the role of theory solvers within CVC4. We discuss known techniques for building solvers for quantifier-free linear real, integer, and mixed integer and real arithmetic around the Simplex for DPLL(T) algorithm. We give several small improvements to theory solvers using this algorithm and describe the implementation and theory of this algorithm in detail. To extend the class of problems that the theory solver can robustly support, we borrow and adapt several techniques from linear programming (LP) and mixed integer programming (MIP) solvers which come from the tradition of optimization. We propose a new decision procedure for quantifier-free linear real arithmetic that replaces the Simplex for DPLL(T) algorithm with a variant of the Simplex algorithm that performs a form of optimization ‒ minimizing the sum of infeasibilities. In this thesis, we additionally describe techniques for leveraging LP and MIP solvers to improve the performance of SMT solvers without compromising correctness. Previous efforts to leverage such solvers in the context of SMT have concluded that in addition to being potentially unsound, such solvers are too heavyweight to compete in the context of SMT.
We present an empirical comparison against other state-of-the-art SMT tools to demonstrate the effectiveness of the proposed solutions.
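The objective named above, the sum of infeasibilities, can be made concrete with a small sketch. It only evaluates the quantity for a given assignment and set of variable bounds; it is not the Simplex variant itself, and the function name and the dictionary-based representation of bounds are assumptions made for illustration.

```python
def sum_of_infeasibilities(assignment, bounds):
    """Return the total amount by which variables violate their bounds.

    `assignment` maps variable names to values; `bounds` maps variable
    names to (lower, upper) pairs, where None means "no bound".
    A result of 0.0 means every variable is within its bounds.
    """
    total = 0.0
    for var, value in assignment.items():
        lower, upper = bounds.get(var, (None, None))
        if lower is not None and value < lower:
            total += lower - value   # distance below the lower bound
        elif upper is not None and value > upper:
            total += value - upper   # distance above the upper bound
    return total
```

An assignment is feasible with respect to the bounds exactly when this quantity is zero, which is why driving it down can serve as a search objective.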
We present a general theory of serializability, unifying a wide range of transactional algorithms, including some that are yet to come. To this end, we provide a compact semantics in which concurrent transactions push their effects into the shared view (or unpush to recall effects) and pull the effects of potentially uncommitted concurrent transactions into their local view (or unpull to detangle). Each operation comes with simple side-conditions given in terms of commutativity (Lipton's left-movers and right-movers).
The benefit of this model is that most of the elaborate reasoning (coinduction, simulation, subtle invariants, etc.) necessary for proving the serializability of a transactional algorithm is already proved within the semantic model. Thus, proving serializability (or opacity) amounts simply to mapping the algorithm on to our rules, and showing that it satisfies the rules' side-conditions.
We present the first method for reasoning about temporal logic properties of higher-order, infinite-data programs. By distinguishing between the finite traces and infinite traces in the specification, we obtain rules that permit us to reason about the temporal behavior of program parts via a type-and-effect system, which is then able to compose these facts together to prove the overall target property of the program. The type system alone is strong enough to derive many temporal safety properties using refinement types and temporal effects. We also show how existing techniques can be used as oracles to provide liveness information (e.g. termination) about program parts and that the type-and-effect system can combine this information with temporal safety information to derive nontrivial temporal properties. Our work has application toward verification of higher-order software, as well as modular strategies for procedural programs.
In today’s world, we store our data and perform expensive computations remotely on powerful servers (a.k.a. “the cloud”) rather than on our local devices. In this dissertation we study the question of achieving cryptographic security in the setting where multiple (mutually distrusting) clients wish to delegate the computation of a joint function on their inputs to an untrusted cloud, while keeping these inputs private. We introduce two frameworks for modeling such protocols.
We construct cloud-assisted and on-the-fly MPC protocols using fully homomorphic encryption (FHE). However, FHE requires inputs to be encrypted under the same key; we extend it to the multiparty setting in two ways:
We consider two new algorithms with practical application to the problem of designing controllers for linear dynamical systems with input and output: a new spectral value set based algorithm called hybrid expansion-contraction intended for approximating the H-infinity norm, or equivalently, the complex stability radius, of large-scale systems, and a new BFGS SQP based optimization method for nonsmooth, nonconvex constrained optimization motivated by multi-objective controller design. In comprehensive numerical experiments, we show that both algorithms in their respective domains are significantly faster and more robust compared to other available alternatives. Moreover, we present convergence guarantees for hybrid expansion-contraction, proving that it converges at least superlinearly, and observe that it converges quadratically in practice, and typically to good approximations to the H-infinity norm, for problems for which we can verify this. We also extend the hybrid expansion-contraction algorithm to the real stability radius, a measure which is known to be more difficult to compute than the complex stability radius. Finally, for the purposes of comparing multiple optimization methods, we present a new visualization tool called relative minimization profiles that allows for simultaneously assessing the relative performance of algorithms with respect to three important performance characteristics, highlighting how these measures interrelate and compare to the other competing algorithms on heterogeneous test sets. We employ relative minimization profiles to empirically validate our proposed BFGS SQP method in terms of quality of minimization, attaining feasibility, and speed of progress compared to other available methods on challenging test sets comprised of nonsmooth, nonconvex constrained optimization problems arising in controller design.
The recent cloud computing revolution has changed the distributed computing landscape, making the resources of entire datacenters available to ordinary users. This process has been greatly aided by dataflow style frameworks such as MapReduce which expose a simple model for programs, allowing for efficient, fault-tolerant execution across many machines. While the MapReduce model has proved to be effective for many applications, there is a wide class of applications which are difficult to write or inefficient in such a model. This includes many familiar and important applications such as PageRank, matrix factorization and a number of machine learning algorithms. In lieu of a good framework for building these applications, users resort to writing applications using MPI or RPC, a difficult and error-prone construction.
This thesis presents two complementary frameworks, Piccolo and Spartan, which help programmers to write in-memory distributed applications not served well by existing approaches.
Piccolo presents a new data-centric programming model for in-memory applications. Unlike data-flow models, Piccolo allows programs running on different machines to share distributed, mutable state via a key-value table interface. This design allows for both high-performance and additional flexibility. Piccolo makes novel use of commutative updates to efficiently resolve write-write conflicts. We find Piccolo provides an efficient backend for a wide-range of applications: from PageRank and matrix multiplication to web-crawling.
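Why commutative updates sidestep write-write conflicts can be shown with a small single-process sketch. The class `AccumTable` and its methods are hypothetical and are not Piccolo's API; the sketch only illustrates that when updates to a key are merged with a commutative, associative function, the final table does not depend on the order in which writers' updates arrive.

```python
import itertools
import operator


class AccumTable:
    """A key-value table that merges concurrent writes with an accumulator."""

    def __init__(self, accumulate):
        self._accumulate = accumulate
        self._data = {}

    def update(self, key, value):
        # Merge with the existing value instead of overwriting it.
        if key in self._data:
            self._data[key] = self._accumulate(self._data[key], value)
        else:
            self._data[key] = value

    def snapshot(self):
        return dict(self._data)


def final_state(updates, accumulate=operator.add):
    """Apply (key, value) updates in the given order and return the table."""
    table = AccumTable(accumulate)
    for key, value in updates:
        table.update(key, value)
    return table.snapshot()


def order_independent(updates, accumulate=operator.add):
    """Check that every arrival order of the updates gives the same table."""
    results = [final_state(order, accumulate)
               for order in itertools.permutations(updates)]
    return all(result == results[0] for result in results)
```

With addition as the accumulator, every arrival order produces the same table. With a last-writer-wins function such as `lambda old, new: new`, the result depends on arrival order, which is the kind of conflict that would otherwise need explicit resolution.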
While Piccolo provides an efficient backend for distributed computation, it can still be somewhat cumbersome to write programs using it directly. To address this, we created Spartan. Spartan provides a distributed implementation of the NumPy array language, and fully supports important array language features such as spatial indexing (slicing), fancy indexing and broadcasting. A key feature of Spartan is its use of a small number of simple, powerful high-level operators to provide most functionality. Not only do these operators dramatically simplify the design and implementation of Spartan, they also allow users to implement new functionality with ease.
We evaluate Piccolo and Spartan on a wide range of applications and find that they both perform significantly better than existing approaches.
This paper presents a cryptosystem that will allow for fair first-price sealed-bid auctions among groups of individuals to be conducted over the internet without the need for a trusted third party. A client who maintains the secrecy of his or her private key will be able to keep his/her bid secret from the server and from all other clients until this client explicitly decides to reveal his/her bid, which will be after all clients publish their obfuscated bids. Each client will be able to verify that every other client's revealed bid corresponds to that client's obfuscated bid at the end of each auction. Each client is provided with a transcript of all auction proceedings so that they may be independently audited.
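The publish-then-reveal flow described above can be illustrated with a generic hash-based commitment. This is a swapped-in textbook technique and not the cryptosystem of the paper, which relies on client private keys; the function names, the use of SHA-256 and the integer bid encoding are all assumptions made for this sketch.

```python
import hashlib
import hmac
import secrets


def commit(bid_cents):
    """Return (commitment, nonce) for a bid given as an integer number of cents.

    The commitment can be published immediately; the nonce stays secret
    until the client chooses to reveal the bid.
    """
    nonce = secrets.token_bytes(32)  # fixed length keeps the encoding unambiguous
    digest = hashlib.sha256(nonce + str(bid_cents).encode("ascii")).hexdigest()
    return digest, nonce


def verify(commitment, bid_cents, nonce):
    """Check that a revealed bid and nonce match a published commitment."""
    expected = hashlib.sha256(nonce + str(bid_cents).encode("ascii")).hexdigest()
    return hmac.compare_digest(commitment, expected)
```

In this sketch each client would publish only the commitment during bidding and then reveal the bid and nonce after all commitments are in, so any other client can run `verify` on the transcript. A real deployment would also need authentication of who published which commitment, which this sketch does not address.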
The Python programming language has become a popular platform for data analysis and scientific computing. To mitigate the poor performance of Python's standard interpreter, numerically intensive computations are typically offloaded to library functions written in languages such as Fortran or C. If, however, some algorithm does not have an existing low-level implementation, then the scientific programmer must either accept sub-standard performance (sometimes orders of magnitude slower than native code) or themselves implement the desired functionality in a less productive but more efficient language.
To alleviate this problem, this thesis presents Parakeet, a runtime compiler for an array-oriented subset of Python. Parakeet does not replace the Python interpreter, but rather selectively augments it by compiling and executing functions explicitly marked by the programmer. Parakeet uses runtime type specialization to eliminate the performance-defeating dynamism of untyped Python code. Parakeet's pervasive use of data parallel operators as a means for implementing array operations enables high-level restructuring optimization and compilation to parallel hardware such as multi-core CPUs and graphics processors. We evaluate Parakeet on a collection of numerical benchmarks and demonstrate its dramatic capacity for accelerating array-oriented Python programs.
One of the biggest challenges artificial intelligence faces is making sense of the real world through sensory signals such as audio or video. Noisy inputs, varying object viewpoints, deformations and lighting conditions turn it into a high-dimensional problem which cannot be efficiently solved without learning from data.
This thesis explores a general way of learning from high dimensional data (video, images, audio, text, financial data, etc.) called deep learning. It thrives on the increasingly large amounts of data available to learn robust and invariant internal features in a hierarchical manner directly from the raw signals.
We propose a unified pipeline for feature learning, recognition, localization and detection using Convolutional Networks (ConvNets) that can obtain state-of-the-art accuracy on a number of pattern recognition tasks, including acoustic modeling for speech recognition and object recognition in computer vision. ConvNets are particularly well suited for learning from continuous signals in terms of both accuracy and efficiency.
Additionally, a novel and general deep learning approach to detection is proposed and successfully demonstrated on the most challenging vision datasets. We then generalize it to other modalities such as speech data. This approach allows accurate localization and detection of objects in images or phones in voice signals by learning to predict boundaries from internal representations. We extend the reach of deep learning from classification to detection tasks in an integrated fashion by learning multiple tasks using a single deep model. This work is among the first to outperform human vision and establishes a new state of the art on some computer vision and speech recognition benchmarks.
Developing technology to help people teach and learn is an important topic in Human Computer Interaction (HCI).
In this thesis we present three studies on this topic. In the first study, we demonstrate new games for learning mathematics and discuss the evidence for key design decisions from user studies. In the second study, we develop a real-time video compositing system for distance education and share evidence for its potential value compared to standard techniques from two user studies. In the third study, we demonstrate our markerless hand tracking interface for real-time 3D manipulation and explain its advantages compared to other state-of-the-art methods.
Modern Cryptography is based on computational intractability assumptions, e.g., Factoring, Discrete Logarithm, Diffie-Hellman, etc. However, since an assumption might be proven incorrect, much effort has gone into constructing cryptographic primitives based on the weakest possible assumption. The most popular minimal assumption, which is implied by the existence of almost all cryptographic primitives, is the existence of One Way Functions. Coin-Flipping protocols are known to be implied by One Way Functions; however, a complete characterization of the inverse direction is not known. There was even speculation that weak notions of Coin Flipping Protocols might be strictly weaker than One Way Functions. In this thesis we show that even very weak notions of Coin Flipping protocols do imply One Way Functions. In particular we show that the existence of a coin-flipping protocol safe against any non-trivial constant bias (e.g., 0.499) implies the existence of One Way Functions. This improves upon a recent result of Haitner and Omri [FOCS '11], who proved this implication for protocols with bias 0.207. Unlike the former result, our result also holds for weak coin-flipping protocols.
Separation logic (SL) is a widely used formalism for verifying heap-manipulating programs. Existing SL solvers focus on decidable fragments for list-like structures. More complex data structures such as trees are typically unsupported in implementations, or handled by incomplete heuristics.
While complete decision procedures for reasoning about trees have been proposed, these procedures suffer from high complexity, or make global assumptions about the heap that contradict the separation logic philosophy of local reasoning. In this paper, we present a fragment of classical first-order logic for local reasoning about tree-like data structures. The logic is decidable in NP and the decision procedure allows for combinations with other decidable first-order theories for reasoning about data. Such extensions are essential for proving functional correctness properties.
We have implemented our decision procedure and, building on earlier work on translating SL proof obligations into classical logic, integrated it into an SL-based verification tool. We successfully used the tool to verify functional correctness of tree-based data structure implementations.
Our language changes very rapidly, accompanying political, social and cultural trends, as well as the evolution of science and technology. The Internet, especially social media, has accelerated this process of change. This poses a severe challenge for both human beings and natural language processing (NLP) systems, which usually only model a snapshot of language presented in the form of text corpora within a certain domain and time frame.
While much previous effort has investigated monolingual paraphrase and bilingual translation, we focus on modeling meaning-preserving transformations between variants of a single language. We use Shakespearean and Internet language as examples to investigate various aspects of this new paraphrase problem, including acquisition, generation, detection and evaluation.
A data-driven methodology is applied intensively throughout the course of this study. Several paraphrase corpora are constructed using automatic techniques, experts and crowdsourcing platforms. Paraphrase systems are trained and evaluated by using these data as a cornerstone. We show that even with a very noisy or a relatively small amount of parallel training data, it is possible to learn paraphrase models which capture linguistic phenomena. This work expands the scope of paraphrase studies to target different language variations and more potential applications, such as text normalization and domain adaptation.
With the recent proliferation of large, unlabeled data sets, a particular subclass of semi-supervised learning problems has become more prevalent. Known as positive-unlabeled learning (PU learning), this scenario provides only positive labeled examples, usually just a small fraction of the entire dataset, with the remaining examples unknown and thus potentially belonging to either the positive or negative class. Since the vast majority of traditional machine learning classifiers require both positive and negative examples in the training set, a new class of algorithms has been developed to deal with PU learning problems.
A canonical example of this scenario is topic labeling of a large corpus of documents. Once the size of a corpus reaches into the thousands, it becomes largely infeasible to have a curator read even a sizable fraction of the documents, and annotate them with topics. In addition, the entire set of topics may not be known, or may change over time, making it impossible for a curator to annotate which documents are NOT about certain topics. Thus a machine learning algorithm needs to be able to learn from a small set of positive examples, without knowledge of the negative class, and knowing that the unlabeled training examples may contain an arbitrary number of additional but as yet unknown positive examples. Another example of a PU learning scenario recently garnering attention is the protein function prediction problem (PFP problem).
While the number of organisms with fully sequenced genomes continues to grow, the progress of annotating those sequences with the biological functions that they perform lags far behind. Machine learning methods have already been successfully applied to this problem, but with many organisms having a small number of positive annotated training examples, and the lack of availability of almost any labeled negative examples, PU learning algorithms can make large gains in predictive performance.
The first part of this dissertation motivates the protein function prediction problem, explores previous work, and introduces novel methods that improve upon previously reported benchmarks for a particular type of learning algorithm, known as Gaussian Random Field Label Propagation (GRFLP). In addition, we present improvements to the computational efficiency of the GRFLP algorithm, and a modification to the traditional structure of the PFP learning problem that allows for simultaneous prediction across multiple species.
The second part of the dissertation focuses specifically on the positive-unlabeled aspects of the PFP problem. Two novel algorithms are presented, and rigorously compared to existing PU learning techniques in the context of protein function prediction. Additionally, we take a step back and examine some of the theoretical considerations of the PU scenario in general, and provide an additional novel algorithm applicable in any PU context. This algorithm is tailored for situations in which the labeled positive examples are a small fraction of the set of true positive examples, and where the labeling process may be subject to some type of bias rather than being a random selection of true positives (arguably some of the most difficult PU learning scenarios).
The third and fourth sections return to the PFP problem, examining the power of tertiary structure as a predictor of protein function, as well as presenting two case studies of function prediction performance on novel benchmarks. Lastly, we conclude with several promising avenues of future research into both PU learning in general, and the protein function prediction problem specifically.
It has long been the goal in computer vision to learn a hierarchy of features useful for object recognition. Spanning the two traditional paradigms of machine learning, unsupervised and supervised learning, we investigate the application of deep learning methods to tackle this challenging task and to learn robust representations of images.
We begin our investigation with the introduction of a novel unsupervised learning technique called deconvolutional networks. Based on convolutional sparse coding, we show this model learns interesting decompositions of images into parts without object label information. This method, which easily scales to large images, becomes increasingly invariant by learning multiple layers of feature extraction coupled with pooling layers. We introduce a novel pooling method called Gaussian pooling to enable these layers to store continuous location information while being differentiable, creating a unified objective function to optimize.
In the supervised learning domain, a well-established model for recognition of objects is the convolutional network. We introduce a new regularization method for convolutional networks called stochastic pooling which relies on sampling noise to prevent these powerful models from overfitting. Additionally, we show novel visualizations of these complex models to better understand what they learn and to provide insight on how to develop state-of-the-art architectures for large-scale classification of 1,000 different object categories.
We also investigate some other related problems in deep learning. First, we introduce a model for the task of mapping one high dimensional time series sequence onto another. Second, we address the choice of nonlinearity in neural networks, showing evidence that rectified linear units outperform other types in automatic speech recognition. Finally, we introduce a novel optimization method called ADADELTA which shows promising convergence speeds in practice while being robust to hyper-parameter selection.
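To make the ADADELTA idea concrete, the following is a minimal sketch for a single scalar parameter. It assumes the commonly described form of the rule: running averages of squared gradients and squared updates, a decay rate rho, and a small constant eps. The function name, the default values, and the toy quadratic in the usage note are illustrative assumptions, not the thesis implementation.

```python
import math


def adadelta_minimize(grad, x0, rho=0.95, eps=1e-6, steps=5000):
    """Minimize a scalar function from its gradient with ADADELTA-style updates.

    grad: callable returning the gradient at x (hypothetical interface).
    rho, eps: decay rate and smoothing constant (assumed defaults).
    """
    x = x0
    avg_sq_grad = 0.0  # running average of squared gradients
    avg_sq_step = 0.0  # running average of squared parameter updates
    for _ in range(steps):
        g = grad(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # The step is scaled by the ratio of the two RMS values,
        # so no global learning rate is supplied.
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step * step
        x += step
    return x
```

For example, calling adadelta_minimize(lambda x: 2 * (x - 3), 0.0) moves x from 0 toward the minimizer of (x - 3)^2 without any learning rate being chosen by hand.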
This thesis serves as a step toward a better understanding of how to design fair and efficient multiagent resource allocation systems by bringing the incentives of the participating agents to the center of the design process. As the quality of these systems critically depends on the ways in which the participants interact with each other and with the system, an ill-designed set of incentives can lead to severe inefficiencies. The special focus of this work is on the problems that arise when the use of monetary exchanges between the system and the participants is prohibited. This is a common restriction that substantially complicates the designer's task; we nevertheless provide a sequence of positive results in the form of mechanisms that maximize efficiency or fairness despite the possibly self-interested behavior of the participating agents.
The first part of this work is a contribution to the literature on approximate mechanism design without money. Given a set of divisible resources, our goal is to design a mechanism that allocates them among the agents. The main complication here is due to the fact that the agents' preferences over different allocations of these resources may not be known to the system. Therefore, the mechanism needs to be designed in such a way that it is in the best interest of every agent to report the truth about her preferences; since monetary rewards and penalties cannot be used in order to elicit the truth, a much more delicate regulation of the resource allocation is necessary. Our contribution mostly revolves around a new truthful mechanism that we propose, which we call the /Partial Allocation/ mechanism. We first show how to use the two-agent version of this mechanism to create a system with the best currently known worst-case efficiency guarantees for problem instances involving two agents. We then consider fairness measures and prove that the general version of this elegant mechanism yields surprisingly good approximation guarantees for the classic problem of fair division. More specifically, we use the well established solution of /Proportional Fairness/ as a benchmark and we show that for an arbitrary number of agents and resources, and for a very large class of agent preferences, our mechanism provides /every agent/ with a value close to her proportionally fair value. We complement these results by also studying the limits of truthful money-free mechanisms, and by providing other mechanisms for special classes of problem instances. Finally, we uncover interesting connections between our mechanism and the Vickrey-Clarke-Groves mechanism from the literature on mechanism design with money.
The second part of this work concerns the design of money-free resource allocation mechanisms for /decentralized/ multiagent systems. As the world has become increasingly interconnected, such systems are using more and more resources that are geographically dispersed; in order to provide scalability in these systems, the mechanisms need to be decentralized. That is, the allocation decisions for any given resource should not assume global information regarding the system's resources or participants. We approach this restriction by using /coordination mechanisms/: a collection of simple resource allocation policies, each of which controls only one of the resources and uses only local information regarding the state of the system. The system's participants, facing these policies, have the option of choosing which resources they will access. We study a variety of coordination mechanisms and we prove that the social welfare of any equilibrium of the games that these mechanisms induce is a good approximation of the optimal welfare. Once again, we complement our positive results by studying the limits of coordination mechanisms. We also provide a detailed explanation of the seemingly counter-intuitive incentives that some of these mechanisms yield. Finally, we use this understanding in order to design a combinatorial constant-factor approximation algorithm for maximizing the social welfare, thus providing evidence that a game-theoretic mindset can lead to novel optimization algorithms.
We give the proof of a tight lower bound on the probability that a binomial random variable exceeds its expected value. The inequality plays an important role in a variety of contexts, including the analysis of relative deviation bounds in learning theory and generalization bounds for unbounded loss functions.
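The quantity being bounded can be computed exactly for small cases, which may help when checking a bound numerically. The sketch below evaluates Pr[B >= E[B]] for B ~ Binomial(m, p) using exact rational arithmetic. The function name is hypothetical, and passing p as a Fraction is an assumption made to avoid floating-point issues when comparing k with m*p.

```python
from fractions import Fraction
from math import comb


def prob_at_least_mean(m, p):
    """Exact Pr[B >= m*p] for B ~ Binomial(m, p), with p given as a Fraction."""
    mean = m * p
    return sum(
        comb(m, k) * p**k * (1 - p) ** (m - k)
        for k in range(m + 1)
        if k >= mean  # keep only outcomes at or above the expected value
    )
```

For m = 10 and p = 1/2 this sums the probabilities of k = 5, ..., 10 successes.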
Productivity languages such as NumPy and Matlab make it much easier to implement data-intensive numerical algorithms than it is to implement them in efficiency languages such as C++. This is important as many programmers (1) aren't expert programmers; or (2) don't have time to tune their software for performance, as their main job focus is not programming per se. The tradeoff is typically one of execution time versus programming time: unless there are specialized library functions or precompiled primitives for your particular task, a productivity language is likely to be orders of magnitude slower than an efficiency language.
In this thesis, we present Parakeet, an array-oriented language embedded within Python, a widely-used productivity language. The Parakeet just-in-time compiler dynamically translates whole user functions to high performance multi-threaded native code. This thesis focuses in particular on our use of data parallel operators as a basis for locality enhancing program optimizations. We transform Parakeet programs written with the classic data parallel operators (Map, Reduce, and Scan; in Parakeet these are called adverbs) to process small local pieces (called tiles) of data at a time. To express this locality we introduce three new adverbs: TiledMap, TiledReduce, and TiledScan. These tiled adverbs are not exposed to the programmer but rather are automatically generated by a tiling transformation.
We use this tiling algorithm to bring two classic locality optimizations to a data parallel setting: cache tiling, and register tiling. We set register tile sizes statically at compile time, but use an online autotuning search to find good cache tile sizes at runtime. We evaluate Parakeet and these optimizations on various benchmark programs, and exhibit excellent performance even compared to typical C implementations.
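The effect of tiling a map followed by a reduce can be sketched in plain Python. This is not Parakeet's API or its generated code. The function name and signature are hypothetical, and the sketch assumes that combine is associative with init as its identity, so that per-tile partial results can be merged safely.

```python
def tiled_map_reduce(f, combine, init, xs, tile_size):
    """Map f over xs and fold with combine, processing one tile at a time."""
    acc = init
    for start in range(0, len(xs), tile_size):
        tile = xs[start:start + tile_size]  # small local piece of the data
        partial = init
        for x in tile:  # inner loop touches only the current tile
            partial = combine(partial, f(x))
        acc = combine(acc, partial)  # merge the tile's partial result
    return acc
```

With f squaring its argument, combine as addition, and init as 0, the result equals the untiled sum of squares for any positive tile size; only the order in which the data is traversed changes.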
Approximate adders are adders with conventional architectures run in an overclocked mode. In this mode, erroneous sums may be produced in exchange for savings in the energy required to execute the computation. The results presented in this report lead to a procedure for allocating the available energy budgets among the adder modules so as to minimize the expected error. For simplicity, only the uniform distribution of the inputs is considered.
Since Bennett's 1973 seminal paper, there has been a growing interest in general-purpose, reversible computations and they have been studied using both mathematical and physical models. Following Bennett, given a terminating computation of a deterministic Turing Machine, one may be interested in constructing a new Turing Machine, whose computation consists of two stages. The first stage emulates the original Turing Machine computation on its working tape, while also producing the trace of the computation on a new history tape. The second stage reverses the first stage using the trace information. Ideally, one would want the second stage to traverse whole-machine states in the reverse order from that traversed in the first stage. But this is impossible other than for trivial computations. Bennett constructs the second phase by using additional controller states, beyond those used during the first stage. In this report, a construction of the new machine is presented in which the second stage uses the same and only those controller states that the first stage used and they are traversed in the reverse order. The sole element that is not fully reversed is the position of the head on the history tape, where it is out of phase by one square compared to the first stage.
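The two-stage idea (compute while recording a trace, then undo using the trace) can be illustrated with a toy example that is far simpler than a Turing machine. In the sketch below the irreversible step is halving an integer, which destroys the low bit. The function names and the choice of operation are assumptions for illustration only, and the sketch does not model the controller states discussed in the report.

```python
def forward(x, steps):
    """Stage 1: run the irreversible computation, recording a history."""
    history = []
    for _ in range(steps):
        history.append(x & 1)  # save the bit that halving would destroy
        x >>= 1
    return x, history


def backward(x, history):
    """Stage 2: undo stage 1 by consuming the history in reverse order."""
    for bit in reversed(history):
        x = (x << 1) | bit
    return x
```

Running forward(13, 3) yields the result 1 together with the recorded bits, and backward applied to that pair restores 13.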
The creation, manipulation and display of piecewise smooth surfaces has been a fundamental topic in computer graphics since its inception. The applications range from highest-quality surfaces for manufacturing in CAD, to believable animations of virtual creatures in Special Effects, to virtual worlds rendered in real-time in computer games.
Our focus is on improving the a) mathematical representation and b) automatic construction of such surfaces from finely sampled meshes in the presence of features. Features can be areas of higher geometric detail in an otherwise smooth area of the mesh, or sharp creases that contrast the overall smooth appearance of an object.
In the first part, we build on techniques that define piecewise smooth surfaces, to improve their quality in the presence of features. We present a crease technique suitable for real-time applications that helps increase the perceived visual detail of objects that are required to be very compactly represented and efficiently evaluated.
We then introduce a new subdivision scheme that allows the use of T-junctions for better local refinement. It thus reduces the need for extraordinary vertices, which can cause surface artifacts especially on animated objects.
In the second part, we consider the problem of how to build the control meshes of piecewise smooth surfaces, in a way that the resulting surface closely approximates an existing data set (such as a 3D range scan), particularly in the presence of features. To this end, we introduce a simple modification that can be applied to a wide range of parameterization techniques to obtain an anisotropic parameterization. We show that a resulting quadrangulation can indeed better approximate the original surface. Finally, we present a quadrangulation scheme that turns a data set into a quad mesh with T-junctions, which we then use as a T-Spline control mesh to obtain a smooth surface.
In the first part of this thesis, we develop novel image priors and efficient algorithms for image denoising and deconvolution applications. Our priors and algorithms enable fast, high-quality restoration of images corrupted by noise or blur. In the second part, we develop effective preconditioners for Laplacian matrices. Such matrices arise in a number of computer graphics and computational photography problems such as image colorization, tone mapping and geodesic distance computation on 3D meshes.
The first prior we develop is a spectral prior which models correlations between different spectral bands. We introduce a prototype camera and flash system, used in conjunction with the spectral prior, to enable taking photographs at very low light levels. Our second prior is a sparsity-based measure for blind image deconvolution. This prior gives lower costs to sharp images than blurred ones, enabling the use of simple and efficient Maximum a-Posteriori algorithms.
We develop a new algorithm for the non-blind deconvolution problem. This enables extremely fast deconvolution of images blurred by a known blur kernel. Our algorithm uses Fast Fourier Transforms and Lookup Tables to achieve real-time deconvolution performance with non-convex gradient-based priors. Finally, for certain image restoration problems with no clear formation model, we demonstrate how learning a direct mapping between original/corrupted patch pairs enables effective restoration.
We develop multi-level preconditioners to solve discrete Poisson equations. Existing multilevel preconditioners have two major drawbacks: excessive bandwidth growth at coarse levels; and the inability to adapt to problems with highly varying coefficients. Our approach tackles both these problems by introducing sparsification and compensation steps at each level. We interleave the selection of fine and coarse-level variables with the removal of weak connections between potential fine-level variables (sparsification) and compensate for these changes by strengthening nearby connections. By applying these operations before each elimination step and repeating the procedure recursively on the resulting smaller systems, we obtain highly efficient schemes. The construction is linear in time and memory. Numerical experiments demonstrate that our new schemes outperform state of the art methods, both in terms of operation count and wall-clock time, over a range of 2D and 3D problems.
The Reissner-Mindlin plate models thin plates. The condition numbers of finite element approximations of these plate models increase very rapidly as the thickness of the plate goes to 0. A Balancing Domain Decomposition by Constraints (BDDC) De Luxe method is developed for these plate problems discretized by Falk-Tu finite elements. In this new algorithm, subdomain Schur complements restricted to individual edges are used to define the average operator for the BDDC De Luxe method. It is established that the condition number of this preconditioned iterative method is bounded by C(1 + log (H/h))^2 if t, the thickness of the plate, is on the order of the element size h or smaller; H is the maximum diameter of the subdomains. The constant C is independent of the thickness t as well as H and h. Numerical results, which verify the theory, and a comparison with a traditional BDDC method are also provided.
Macaroons, recently introduced by Birgisson et al., are authorization credentials that provide support for controlled sharing in decentralized systems. Macaroons are similar to cookies in that they are bearer credentials, but unlike cookies, macaroons include caveats that attenuate and contextually confine when, where, by whom, and for what purpose authorization should be granted.
In this work, we formally study the cryptographic security of macaroons. We define macaroon schemes, introduce corresponding security definitions and provide several constructions. In particular, the MAC-based and certificate-based constructions outlined by Birgisson et al., can be seen as instantiations of our definitions. We also present a new construction that is privately-verifiable (similar to the MAC-based construction) but where the verifying party does not learn the intermediate keys of the macaroon, a problem already observed by Birgisson et al.
We also formalize the notion of a protocol for "discharging" third-party caveats and present a security definition for such a protocol. The encryption-based protocol outlined by Birgisson et al. can be seen as an instantiation of our definition, and we also present a new signature-based construction.
Finally, we formally prove the security of all constructions in the given security models.
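For readers unfamiliar with the MAC-based construction mentioned above, the following is a minimal sketch of chained-HMAC macaroons with first-party caveats only. It assumes HMAC-SHA256 as the MAC. The function names, the dictionary layout, and the caveat_holds callback are hypothetical and are not taken from this work or from any macaroon library.

```python
import hashlib
import hmac


def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def mint(root_key: bytes, identifier: bytes) -> dict:
    """Issue a macaroon whose signature is a MAC of its identifier."""
    return {"id": identifier, "caveats": [], "sig": _mac(root_key, identifier)}


def attenuate(macaroon: dict, caveat: bytes) -> dict:
    """Add a caveat; the old signature becomes the key for the new one."""
    return {
        "id": macaroon["id"],
        "caveats": macaroon["caveats"] + [caveat],
        "sig": _mac(macaroon["sig"], caveat),
    }


def verify(root_key: bytes, macaroon: dict, caveat_holds) -> bool:
    """Recompute the MAC chain from the root key and check every caveat."""
    sig = _mac(root_key, macaroon["id"])
    for caveat in macaroon["caveats"]:
        if not caveat_holds(caveat):
            return False
        sig = _mac(sig, caveat)  # the verifier sees each intermediate key
    return hmac.compare_digest(sig, macaroon["sig"])
```

Note that verify recomputes every intermediate signature, which is why the verifying party in this style of construction learns the intermediate keys.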
Relation Extraction aims at detecting and categorizing semantic relations between pairs of entities in unstructured text. It benefits an enormous number of applications such as Web search and Question Answering. Traditional approaches for relation extraction either rely on learning from a large number of accurate human-labeled examples or pattern matching with hand-crafted rules. These resources are very laborious to obtain and can only be applied to a narrow set of target types of interest.
This talk focuses on learning relations with little or no human supervision. First, we examine the approach that treats relation extraction as a supervised learning problem. We develop an algorithm that is able to train a model with approximately 1/3 of the human-annotation cost and that matches the performance of models trained with high-quality annotation. Second, we investigate distant supervision, a weakly supervised algorithm that automatically generates its own labeled training data. We develop a latent Bayesian framework for this purpose. By using a model which provides a better approximation of the weak source of supervision, it outperforms the state-of-the-art methods. Finally, we investigate the possibility of building all relational tables beforehand with an unsupervised relation extraction algorithm. We develop an effective yet efficient algorithm that combines the power of various semantic resources that are automatically mined from a corpus based on distributional semantics. The algorithm is able to extract a very large set of relations from the web at high precision.
Although the phenomenon of time travel is common in popular culture, there has been little work in AI on developing a formal theory of time travel. This paper develops such a theory. The paper introduces a branching-time ontology that maintains the classical restriction of forward movement through a temporal tree structure, but permits the representation of paths in which one can perform inferences about time-travel scenarios. Central to the ontology is the notion of an agent embodiment whose beliefs are equivalent to those of an agent who has time-traveled from the future. We show how to formalize an example scenario and demonstrate what it means for such a scenario to be motivated with respect to an agent embodiment.
A BDDC preconditioner is defined by a coarse component, expressed in terms of primal constraints and a weighted average across the interface between the subdomains, and local components given in terms of Schur complements of local subdomain problems. A BDDC method for vector field problems discretized with Raviart-Thomas finite elements is introduced. Our method is based on a new type of weighted average developed to deal with more than one variable coefficient. We also provide a bound on the condition number of the preconditioned linear system that is independent of the values and jumps of the coefficients across the interface and is polylogarithmic in the number of degrees of freedom of the individual subdomains. Numerical experiments for two and three dimensional problems are also presented, which support the theory and show the effectiveness of our algorithm even for certain problems not covered by our theory.
Security and privacy are increasingly important in our interconnected world. Cybercrimes, including identity theft, phishing, and other attacks, are on the rise, and computer-assisted crimes such as theft and stalking are becoming commonplace.
Contemporary with this trend is the uptake of technology in the developing world, proceeding at a pace often outstripping that of the developed world. Penetration of mobile phones and services such as healthcare delivery, mobile money, and social networking is higher than that of even amenities like electricity. Connectivity is empowering disenfranchised people, providing information and services to the heretofore disconnected poor.
There are efforts to use technology to enhance physical security and well-being in the developing world, including citizen journalism, education, improving drug security, attendance tracking, etc.
However, there are significant challenges to security both in the digital and the physical domains that are particular to these contexts. Infrastructure is constrained; literacy, numeracy, and familiarity with basic technologies cannot be assumed; and environments are harsh on hardware. These circumstances often prevent security best practices from being transplanted directly to these regions; in many ways, the adoption of technology has overtaken the users' ability to use it safely, and their trust in it is oftentimes greater than it should be.
This dissertation describes several systems and methodologies designed to operate in the developing world, using technologies and metaphors that are familiar to users and that are robust against the operating environments.
It begins with an overview of the state of affairs, and several threat models. It continues with a description of Signet, a method to use SIM cards as trusted computing hardware to provide secure signed receipts. Next, Epothecary describes a low-infrastructure system for tracking pharmaceuticals that also significantly and asymmetrically increases costs for counterfeiters. The balance consists of a description of a low-cost Biometric Terminal currently in use by NGOs in India performing DOTS-based tuberculosis treatment, Blacknoise, an investigation into the use of low-cost cameraphones with noisy imaging sensors for image-based steganography, and finally Innoculous, a low-cost, crowdsourcing system for combating the spread of computer viruses, particularly among non-networked computers, while also collecting valuable "epidemiological" data.
In this thesis we prove intractability results for several well studied problems in combinatorial optimization.
Closest Vector Problem with Preprocessing (CVPP): We show that the preprocessing version of the well known Closest Vector Problem is hard to approximate to an almost polynomial factor unless NP is in quasi polynomial time. The approximability of CVPP is closely related to the security of lattice based cryptosystems.
Pricing Loss Leaders: We show hardness of approximation results for the problem of maximizing profit from buyers with single-minded valuations where each buyer is interested in bundles of at most k items, and the items are allowed to have negative prices ("Loss Leaders"). For k = 2, we show that assuming the Unique Games Conjecture, it is hard to approximate the profit to any constant factor. For k > 2, we show the same result assuming P != NP.
Integrality gaps: We show SemiDefinite Programming (SDP) integrality gaps for Unique Games and 2 to 1 Games. Inapproximability results for these problems imply inapproximability results for many fundamental optimization problems. For the first problem, we show "approximate" integrality gaps for super constant rounds of the powerful Lasserre hierarchy. For the second problem we show integrality gaps for the basic SDP relaxation with perfect completeness.
A large portion of computer graphics and human/computer interaction is concerned with the creation, manipulation and use of two and three dimensional objects existing in a virtual world. By creating more natural physical interfaces and virtual worlds which behave in physically plausible ways, it is possible to empower nonexpert users to create, work and play in virtual environments. This thesis is concerned with the design, creation, and optimization of user-input devices which break down the barriers between the real and the virtual as well as the development of software algorithms which allow for the creation of physically realistic virtual worlds.
Counterfeiting of goods is a worldwide problem where the losses are in billions of dollars. It is estimated that 10% of all the world trade is counterfeit. To alleviate counterfeiting, a number of techniques are used from barcodes to holograms. But these technologies are easily reproducible and hence they are ineffective against counterfeiters.
In this thesis, we introduce PaperSpeckle, a novel way to fingerprint any piece of paper based on its unique microscopic properties. Next, we extend and generalize this work to introduce TextureSpeckle, a novel way to fingerprint and characterize the uniqueness of the surface of a material based on the interaction of light with the natural randomness present in the rough structure at the microscopic level of the surface. We show the existence and uniqueness of these fingerprints by analyzing a large number of surfaces (over 20,000 microscopic surfaces and 200 million pairwise comparisons) of different materials. We also define the entropy of the fingerprints and show how each surface can be uniquely identified in a robust manner even in case of damage.
From a theoretical perspective, we consider a discrete approximation model from light scattering theory which allows us to compute the speckle pattern for a given surface. Under this computational model, we show that given a speckle pattern, it is computationally hard to reconstruct the physical surface characteristics by simulating the multiple scattering of light. Using TextureSpeckle as a security primitive, we design secure protocols to enable a variety of scenarios such as: i) supply chain security, where applications range from drug tracking to inventory management, ii) mobile based secure transfer of money (mobile money), where any paper can be changed to an on-demand currency, and iii) fingerprint ecosystem, a cloud based system, where any physical object can be identified and authenticated on-demand.
We discuss the construction of the prototype device ranging from optical lens design to usability aspects and show how our technique can be applied in the real world to alleviate counterfeiting and forgery. In addition, we introduce Pattern Matching Puzzles (PMPs), a usable security mechanism that provides a 'human computable' one-time-MAC (message authentication code) for every transaction, making each transaction information-theoretically secure against various adversarial attacks. The puzzles are easy to solve even for semi-literate users with simple pattern recognition skills.
Using Machine Learning Algorithms to analyze and predict security price patterns is an area of active interest. Most practical stock traders combine computational tools with their intuitions and knowledge to make decisions.
This document explains the algorithms and discusses various metrics of accuracy. It validates the models by applying the model to a real-life trading price stream. Though it is very hard to replace the expertise of an experienced trader, software like this may enhance the trader's performance.
In the Information Age, visual media take on powerful new forms. Photographs once printed on paper and stored in physical albums now exist as digital files. With the rise of social media, photo data has moved to the cloud for rapid dissemination. The upside can be measured in terms of increased efficiency, greater reach, or reduced printing costs. But there is a downside that is harder to quantify: the risk of private photos or videos leaking inappropriately. Human imagery is potentially sensitive, revealing private details of a person's body, lifestyle, activities, and more. Images create visceral responses and have the potential to permanently damage a person's reputation.
We employed the theory of contextual integrity to explore privacy aspects of transmitting the human form. In response to privacy threats from new sociotechnical systems, we developed practical solutions that have the potential to restore balance. The main work is a set of client-side, technical interventions that can be used to alter information flows and provide features to support visual privacy. In the first approach, we use crowdsourcing to extract specific, useful human signal from video to decouple it from bundled identity information. The second approach is an attempt to achieve similar ends with pure software. Instead of using information workers, we developed a series of filters that alter video to hide identity information while still revealing motion signal. The final approach is an attempt to control the recipients of photos by encoding them in the visual channel. The software completely protects data from third-parties who lack proper credentials and maintains data integrity by exploiting the visual coherence of uploaded images, even in the face of JPEG compression. The software offers end-to-end encryption that is compatible with existing social media applications.
As a tumor grows, it rapidly outstrips its blood supply, leaving portions of tumor that undergo hypoxia. Hypoxia is strongly correlated with poor prognosis as it renders tumors less responsive to chemotherapy and radiotherapy. During hypoxia, HIFs upregulate production of glycolysis enzymes and VEGF, thereby promoting metabolic heterogeneity and angiogenesis, and proving to be directly instrumental in tumor progression. Prolonged hypoxia leads to necrosis, which in turn activates inflammatory responses that produce cytokines that stimulate tumor growth. Hypoxic tumor cells interact with macrophages and fibroblasts, both involved with inflammatory processes tied to tumor progression. So it is of clinical and theoretical significance to understand: Under what conditions does hypoxia arise in a heterogeneous cell population? Our aim is to transform this biological origins problem into a computational inverse problem, and then attack it using approaches from computer science. First, we develop a minimal, stochastic, spatiotemporal simulation of large heterogeneous cell populations interacting in three dimensions. The simulation can manifest stable localized regions of hypoxia. Second, we employ and develop a variety of algorithms to analyze histological images of hypoxia in xenografted colorectal tumors, and extract features to construct a spatiotemporal logical characterization of hypoxia. We also consider characterizing hypoxia by a linear regression functional learning mechanism that yields a similarity score. Third, we employ a Bayesian statistical model checking algorithm that can determine, over some bounded number of simulation executions, whether hypoxia is likely to emerge under some fixed set of simulation parameters, and some fixed logical or functional description of hypoxia.
Driving the model checking process is one of three adaptive Monte Carlo sampling algorithms we developed to explore the high dimensional space of simulation initial conditions and operational parameters. Taken together, these three system components formulate a novel approach to the inverse problem above, and constitute a design for a tool that can be placed into the hands of experimentalists, for testing hypotheses based upon known parameter values or ones the tool might discover. In principle, this design can be generalized to other biological phenomena involving large heterogeneous populations of interacting cells.
In response to Supreme Court Justice Samuel Alito's opinion that society should accept a decline in personal privacy with modern technology, Hanni M. Fakhoury, staff attorney with the Electronic Frontier Foundation, argued "Technology doesn't involve an 'inevitable' tradeoff [of increased convenience] with privacy. The only inevitability must be the demand that privacy be a value built into our technology" [42]. Our position resonates with Mr. Fakhoury's. In this thesis, we present three artifacts that address the balance between usability, efficiency, and privacy as we rethink information privacy for the web.
In the first part of this thesis, we present the design, implementation and evaluation of Cryptagram, a system designed to enhance online photo privacy. Cryptagram enables users to convert photos into encrypted images, which the users upload to Online Social Networks (OSNs). Users directly manage access control to those photos via shared keys that are independent of OSNs or other third parties. OSNs apply standard image transformations (JPEG compression) to all uploaded images so Cryptagram provides image encoding and encryption protocols that are tolerant to these transformations. Cryptagram guarantees that the recipient with the right credentials can completely retrieve the original image from the transformed version of the uploaded encrypted image while the OSN cannot infer the original image. Cryptagram's browser extension integrates seamlessly with preexisting OSNs, including Facebook and Google+, and currently has over 400 active users.
In the second part of this thesis, we present the design and implementation of Lockbox, a system designed to provide end-to-end private file-sharing with the convenience of Google Drive or Dropbox. Lockbox uniquely combines two important design points: (1) a federated system for detecting and recovering from server equivocation and (2) a hybrid cryptosystem over delta encoded data to balance storage and bandwidth costs with efficiency for syncing end-user data. To facilitate appropriate use of public keys in the hybrid cryptosystem, we integrate a service that we call KeyNet, which is a web service designed to leverage existing authentication media (e.g., OAuth, verified email addresses) to improve the usability of public key cryptography.
In the third part of this thesis, we present the design of Compass, which realizes the philosophical privacy framework of contextual integrity (CI) as a full OSN design. We believe CI better captures users' privacy expectations in OSNs. In Compass, three properties hold: (a) users are associated with roles in specific contexts; (b) every piece of information posted by a user is associated with a specific context; (c) norms defined on roles and attributes of posts in a context govern how information is shared across users within that context. Given the definition of a context and its corresponding norm set, we describe the design of a compiler that converts the human-readable norm definitions to generate appropriate information flow verification logic including: (a) a compact binary decision diagram for the norm set; and (b) access control code that evaluates how a new post to a context will flow. We have implemented a prototype that shows how the philosophical framework of contextual integrity can be realized in practice to achieve strong privacy guarantees with limited additional verification overhead.
Currently, users of geo-distributed storage systems face a hard choice between having serializable transactions with high latency, or limited or no transactions with low latency. We show that it is possible to obtain both serializable transactions and low latency, under two conditions. First, transactions are known ahead of time, permitting an a priori static analysis of conflicts. Second, transactions are structured as transaction chains consisting of a sequence of hops, each hop modifying data at one server. To demonstrate this idea, we built Lynx, a geo-distributed storage system that offers transaction chains, secondary indexes, materialized join views, and geo-replication. Lynx uses static analysis to determine if each hop can execute separately while preserving serializability. If so, a client needs to wait only for the first hop to complete, which occurs quickly. To evaluate Lynx, we built three applications: an auction service, a Twitter-like microblogging site and a social networking site. These applications successfully use chains to achieve low latency operation and good throughput.
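The hop structure described above can be illustrated with a minimal sketch. It assumes the static analysis has already judged the chain safe to split into separately executed hops. `ChainRunner`, `submit`, and `drain` are hypothetical names chosen for illustration, not Lynx's API, and plain dictionaries stand in for tables held on two different servers.

```python
from collections import deque


class ChainRunner:
    """Toy executor: run the first hop synchronously, defer the rest."""

    def __init__(self):
        self.pending = deque()  # hops that still have to run, in order

    def submit(self, hops):
        first, *rest = hops
        result = first()           # the client waits only for this hop
        self.pending.extend(rest)  # later hops are deferred
        return result

    def drain(self):
        """Run all deferred hops in FIFO order (stands in for background work)."""
        while self.pending:
            self.pending.popleft()()


bids_table = {}      # hypothetical table on server A
item_bid_index = {}  # hypothetical secondary index on server B


def insert_bid():
    # Hop 1: modifies data at one server only.
    bids_table["bid-1"] = {"item": "lamp", "amount": 30}
    return "bid-1"


def update_index():
    # Hop 2: modifies data at a different server.
    item_bid_index.setdefault("lamp", []).append("bid-1")


runner = ChainRunner()
bid_id = runner.submit([insert_bid, update_index])
index_before_drain = dict(item_bid_index)  # second hop has not run yet
runner.drain()
```

When `submit` returns, only the first hop has been applied; the index is updated once the deferred hop runs. A real system would also have to ensure that deferred hops eventually execute despite failures, which this sketch does not attempt.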
In this note, we provide complexity characterizations of model checking multi-pushdown systems. Multi-pushdown systems model recursive concurrent programs in which any sequential process has a finite control. We consider three standard notions for boundedness: context boundedness, phase boundedness and stack ordering. The logical formalism is a linear-time temporal logic extending the well-known logic CaRet but dedicated to multi-pushdown systems in which abstract operators (related to calls and returns) such as those for next-time and until are parameterized by stacks. We show that the problem is EXPTIME-complete for context-bounded runs and unary encoding of the number of context switches; we also prove that the problem is 2EXPTIME-complete for phase-bounded runs and unary encoding of the number of phase switches. In both cases, the value k is given as an input (whence it is not a constant of the model-checking problem), which makes a substantial difference in the complexity. In certain cases, our results improve previous complexity results.
Telling cow from sheep is effortless for most animals, but requires much engineering for computers. In this thesis, we seek to tease out basic principles that underlie many recent advances in image recognition. First, we recast many methods into a common unsupervised feature extraction framework based on an alternation of coding steps, which encode the input by comparing it with a collection of reference patterns, and pooling steps, which compute an aggregation statistic summarizing the codes within some region of interest of the image.
Within that framework, we conduct extensive comparative evaluations of many coding or pooling operators proposed in the literature. Our results demonstrate a robust superiority of sparse coding (which decomposes an input as a linear combination of a few visual words) and max pooling (which summarizes a set of inputs by their maximum value). We also propose macrofeatures, which import into the popular spatial pyramid framework the joint encoding of nearby features commonly practiced in neural networks, and obtain significantly improved image recognition performance. Next, we analyze the statistical properties of max pooling that underlie its better performance, through a simple theoretical model of feature activation. We then present results of experiments that confirm many predictions of the model. Beyond the pooling operator itself, an important parameter is the set of pools over which the summary statistic is computed. We propose locality in feature configuration space as a natural criterion for devising better pools. Finally, we propose ways to make coding faster and more powerful through fast convolutional feedforward architectures, and examine how to incorporate supervision into feature extraction schemes. Overall, our experiments offer insights into what makes current systems work so well, and state-of-the-art results on several image recognition benchmarks.
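The alternation of coding and pooling steps can be sketched in a few lines. This is only an illustration: to stay short it uses hard vector quantization (nearest codeword, one-hot code) rather than sparse coding, and the codebook, descriptors, and function names are made up for the example.

```python
def hard_code(descriptor, codebook):
    """Coding step: one-hot code marking the nearest reference pattern."""
    dists = [
        sum((d - c) ** 2 for d, c in zip(descriptor, word))
        for word in codebook
    ]
    nearest = dists.index(min(dists))
    return [1.0 if i == nearest else 0.0 for i in range(len(codebook))]


def max_pool(codes):
    """Pooling step: per-codeword maximum over all codes in the region."""
    return [max(column) for column in zip(*codes)]


def average_pool(codes):
    """Pooling step: per-codeword mean over all codes in the region."""
    return [sum(column) / len(column) for column in zip(*codes)]


# Hypothetical 2-D codebook and local descriptors from one image region.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.0, 0.2], [0.9, 1.1], [0.2, 0.1]]

codes = [hard_code(d, codebook) for d in descriptors]
pooled_max = max_pool(codes)      # records which codewords occur at all
pooled_avg = average_pool(codes)  # records how often each codeword occurs
```

With one-hot codes, max pooling records only whether a codeword appears in the region, while average pooling records its relative frequency.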
Population genetics has seen a renewed interest since the completion of the human genome project. With the availability of rapidly growing volumes of genomic data, the scientific and medical communities have been optimistic that better understanding of human diseases as well as their treatment were imminent. Many population genomic models and association studies have been designed (or redesigned) to address these problems. For instance, the genome-wide association studies (GWAS) had raised hopes for finding disease markers, personalized medicine and rational drug design. Yet, as of today, they have not yielded results that live up to their promise and have only led to a frustrating disappointment.
Intrigued, but not deterred by these challenges, this dissertation visits the different aspects of these problems. In the first part, we will review the different models and theories of population genetics that are now challenged. We will propose our own implementation of a model to test different hypotheses. This effort will hopefully help us in understanding whether our expectations were unreasonably high or if we had ignored a crucial piece of information. When discussing association studies, we must not forget that we rely on data that are produced by the sequencing technologies available so far. We have to ensure that the quality of this data is reasonably good for GWAS. Unfortunately, as we will see in the second part, despite the existence of a diverse set of sequencing technologies, none of them can produce haplotypes with phasing, which appears to be the most important type of sequence data needed for association studies. To address this challenge, we propose a novel approach for a sequencing technology, called SMASH, that allows us to create the quality and type of haplotypic genome sequences necessary for efficient population genetics.
We present a novel approach to training discriminative tree-structured machine translation systems by learning to search. We describe three primary innovations in this work: a new parsing coordinator architecture and algorithms to synthesize the required training examples for the learning algorithm; a new semiring that provides an unbiased way to compare translations; and a new training objective that measures whether a translation inference improves the quality of a translation. We also apply the reinforcement learning concept of exploration to SMT. Finally, we empirically evaluate the effects of our innovations on the quality of translations output by our system.
The ability of a robot to track its position and its surroundings is critical in mobile robotics applications, such as autonomous transport, farming, search-and-rescue, and planetary exploration.
As a foundational building block to such tasks, localization must remain reliable and unobtrusive. For example, it must not provide an unneeded level of precision, when the cost of doing so displaces higher-level tasks from a busy CPU. Nor should it produce noisy estimates on the cheap, when there are CPU cycles to spare.
This thesis explores localization solutions that provide exactly the amount of accuracy needed for a given task. We begin with a real-world system used in the DARPA Learning Applied to Ground Robotics (LAGR) competition. Using a novel hybrid of wheel and visual odometry, we cut the cost of visual odometry from 100% of a CPU to 5%, clearing room for other critical visual processes, such as long-range terrain classification. We present our hybrid odometer in chapter 2.
Next, we describe a novel SLAM algorithm that provides a means to choose the desired balance between cost and accuracy. At its fastest setting, our algorithm converges faster than previous stochastic SLAM solvers, while maintaining significantly better accuracy. At its most accurate, it provides the same solution as exact SLAM solvers. Its main feature, however, is the ability to flexibly choose any point between these two extremes of speed and precision, as circumstances demand. As a result, we are able to guarantee real-time performance at each timestep on city-scale maps with large loops. We present this solver in chapter 3, along with results from both commonly available datasets and Google Street View data.
Taken as a whole, this thesis recognizes that precision and efficiency can be competing values, whose proper balance depends on the application and its fluctuating circumstances. It demonstrates how a localizer can and should fit its cost to the task at hand, rather than the other way around. In enabling this flexibility, we demonstrate a new direction for SLAM research, as well as provide a new convenience for end-users, who may wish to map the world without stopping it.
Satisfiability modulo theories (SMT) is the problem of deciding whether a given logical formula can be satisfied with respect to a combination of background theories. The past few decades have seen many significant developments in the field, including fast Boolean satisfiability solvers (SAT), efficient decision procedures for a growing number of expressive theories, and frameworks for modular combination of decision procedures. All these improvements, with the addition of robust SMT solver implementations, culminated with the acceptance of SMT as a standard tool in the fields of automated reasoning and computer aided verification. In this thesis we develop new decision procedures for the theory of linear integer arithmetic and the theory of non-linear real arithmetic, and develop a new general framework for combination of decision procedures. The new decision procedures integrate theory specific reasoning and the Boolean search to provide more powerful and efficient procedures, and allow a more expressive language for explaining problematic states. The new framework for combination of decision procedures overcomes the complexity limitations and restrictions on the theories imposed by the standard Nelson-Oppen approach.
Many problems in scientific computing require the accurate and fast solution to a variety of elliptic PDEs. These problems become increasingly difficult in three dimensions when forces become non-homogeneously distributed and geometries are complex.
We present an adaptive fast volume solver using a new version of the fast multipole method, incorporated with a pre-existing boundary integral formulation for the development of an adaptive embedded boundary solver.
For the fast volume solver portion of the algorithm, we present a kernel-independent, adaptive fast multipole method of arbitrary order accuracy for solving elliptic PDEs in three dimensions with radiation boundary conditions. The algorithm requires only a Green's function evaluation routine for the governing equation and a representation of the source distribution (the right-hand side) that can be evaluated at arbitrary points.
The performance of the method is accelerated in two ways. First, we construct a piecewise polynomial approximation of the right-hand side and compute far-field expansions in the FMM from the coefficients of this approximation. Second, we precompute tables of quadratures to handle the near-field interactions on adaptive octree data structures, keeping the total storage requirements in check through the exploitation of symmetries. We additionally show how we extend the free-space volume solver to solvers with periodic as well as Dirichlet boundary conditions.
For incorporation with the boundary integral solver, we develop interpolation methods to maintain the accuracy of the volume solver. These methods use the existing FMM-based octree structure to locate appropriate interpolation points, building polynomial approximations to this larger set of forces and evaluating these polynomials to the locally under-refined grid in the area of interest.
We present numerical examples for the Laplace, modified Helmholtz and Stokes equations for a variety of boundary conditions and geometries as well as studies of the interpolation procedures and stability of far-field and polynomial constructions.
Event extraction is a particularly challenging type of information extraction (IE). Most current event extraction systems rely on local information at the phrase or sentence level. However, this local context may be insufficient to resolve ambiguities in identifying particular types of events; information from a wider scope can serve to resolve some of these ambiguities.
In this thesis, we first investigate how to extract supervised and unsupervised features to improve a supervised baseline system. Then, we present two additional tasks to show the benefit of wider scope features in semi-supervised learning (self-training) and active learning (co-testing). Experiments show that using features from wider scope can not only aid a supervised local event extraction baseline system, but also help the semi-supervised or active learning approach.
Visually impaired users are in dire need of better accessibility tools. The past few years have witnessed an exponential growth in the computing capabilities and onboard sensing capabilities of mobile phones making them an ideal candidate for building next-generation applications. We believe that the mobile device can play a significant role in the future for aiding visually impaired users in day-to-day activities with simple and usable mobile accessibility tools. This thesis describes the design, implementation, evaluation and user-study based analysis of four different mobile accessibility applications.
Our first system is the design of a highly accurate and usable mobile navigational guide that uses Wi-Fi and accelerometer sensors to navigate unfamiliar environments. A visually impaired user can use the system to construct a virtual topological map across points of interest within a building based on correlating the user's walking patterns (with turn signals) with the Wi-Fi and accelerometer readings. The user can subsequently use the map to navigate previously traveled routes. Our second system, Mobile Brailler, presents several prototype methods of text entry on a modern touch screen mobile phone that are based on the Braille alphabet and thus are convenient for visually impaired users. Our third system enables visually impaired users to leverage the camera of a mobile device to accurately recognize currency bills even if the images are partially or highly distorted. The final system enables visually impaired users to determine whether a pair of clothes, in this case a tie and a shirt, can be worn together or not, based on the current social norms of color-matching.
We believe that these applications together provide a suite of important mobile accessibility tools to enhance four critical aspects of a day-to-day routine of a visually impaired user: to navigate easily, to type easily, to recognize currency bills (for payments) and to identify matching clothes.
Developers increasingly use streaming languages to write their data processing applications. While a variety of streaming languages exist, each targeting a particular application domain, they are all similar in that they represent a program as a graph of streams (i.e. sequences of data items) and operators (i.e. data transformers). They are also similar in that they must process large volumes of data with high throughput. To meet this requirement, compilers of streaming languages must provide a variety of streaming-specific optimizations, including automatic parallelization. Traditionally, when many languages share a set of optimizations, language implementors translate the source languages into a common representation called an intermediate language (IL). Because optimizations can modify the IL directly, they can be re-used by all of the source languages, reducing the overall engineering effort. However, traditional ILs and their associated optimizations target single-machine, single-process programs. In contrast, the kinds of optimizations that compilers must perform in the streaming domain are quite different, and often involve reasoning across multiple machines. Consequently, existing ILs are not suited to streaming languages.
This thesis addresses the problem of how to provide a reusable infrastructure for stream processing languages. Central to the approach is the design of an intermediate language specifically for streaming languages and optimizations. The hypothesis is that an intermediate language designed to meet the requirements of stream processing can assure implementation correctness; reduce overall implementation effort; and serve as a common substrate for critical optimizations. In evidence, this thesis provides the following contributions: (1) a catalog of common streaming optimizations that helps define the requirements of a streaming IL; (2) a calculus that enables reasoning about the correctness of source language translation and streaming optimizations; and (3) an intermediate language that preserves the semantics of the calculus, while addressing the implementation issues omitted from the calculus. This work significantly reduces the effort it takes to develop stream processing languages, and jump-starts innovation in language and optimization design.
Developers increasingly use stream processing languages to write applications that process large volumes of data with high throughput. Unfortunately, when choosing which stream processing language to use, they face a difficult choice. On the one hand, dynamically scheduled languages allow developers to write a wider range of applications, but cannot take advantage of many crucial optimizations. On the other hand, statically scheduled languages are extremely performant, but cannot express many important streaming applications.
This paper presents the design of a hybrid scheduler for stream processing languages. The compiler partitions the streaming application into coarse-grained subgraphs separated by dynamic rate boundaries. It then applies static optimizations to those subgraphs. We have implemented this scheduler as an extension to the StreamIt compiler, and evaluated its performance against three scheduling techniques used by dynamic systems: OS thread, demand, and no-op. Our scheduler not only allows the previously static version of StreamIt to run dynamic rate applications, but it outperforms the three dynamic alternatives. This demonstrates that our scheduler strikes the right balance between expressivity and performance for stream processing languages.
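The partitioning step can be pictured as grouping operators that are connected only through static-rate edges. The sketch below is one simple way to do that with union-find; it is not the StreamIt extension's actual algorithm, and the operator names and the `"static"`/`"dynamic"` edge labels are assumptions made for the example.

```python
def partition_static_subgraphs(operators, edges):
    """Group operators into subgraphs separated by dynamic-rate edges.

    edges is a list of (src, dst, rate) with rate "static" or "dynamic".
    """
    parent = {op: op for op in operators}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst, rate in edges:
        if rate == "static":  # only static edges merge two operators
            parent[find(src)] = find(dst)

    groups = {}
    for op in operators:  # keeps operators in their declared order
        groups.setdefault(find(op), []).append(op)
    return list(groups.values())


operators = ["source", "parse", "filter", "aggregate", "sink"]
edges = [
    ("source", "parse", "static"),
    ("parse", "filter", "dynamic"),     # dynamic rate boundary
    ("filter", "aggregate", "static"),
    ("aggregate", "sink", "dynamic"),   # dynamic rate boundary
]
subgraphs = partition_static_subgraphs(operators, edges)
```

Each resulting group contains only static-rate connections internally, so static optimizations could be applied to it in isolation, with dynamic scheduling used across the boundaries.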
Web applications increasingly require a storage system that is both scalable and can replicate data across many distant data centers or sites. Most existing storage solutions fall into one of two categories: Traditional databases offer strict consistency guarantees and programming ease, but are difficult to scale in a geo-replicated setting. NoSQL stores are scalable and efficient, but have weak consistency guarantees, placing the burden of ensuring consistency on programmers. In this dissertation, we describe two systems that help bridge the two extremes, providing scalable, geo-replicated storage for web applications, while also being easy to program. Walter is a key-value store that supports transactions and replicates data across distant sites. A key feature underlying Walter is a new isolation property: Parallel Snapshot Isolation (PSI). PSI allows Walter to replicate data asynchronously, while providing strong guarantees within each site. PSI does not allow write-write conflicts, alleviating the burden of writing conflict resolution logic. To prevent write-write conflicts and implement PSI, Walter uses two new and simple techniques: preferred sites and counting sets. Lynx is a distributed database backend for scaling latency-sensitive web applications. Lynx supports optimizing queries via data denormalization, distributed secondary indexes, and materialized join views. To preserve data constraints across denormalized tables and secondary indexes, Lynx relies on a novel primitive: Distributed Transaction Chain (DTC). A DTC groups a sequence of transactions to be executed on different nodes while providing two guarantees. First, all transactions in a DTC execute exactly once despite failures. Second, transactions from concurrent DTCs are interleaved consistently on common nodes. We built several web applications on top of Walter and Lynx: an auction service, a microblogging service, and a social networking website.
We have found that building web applications using Walter and Lynx is quick and easy. Our experiments show that the resulting applications are capable of providing scalable, low latency operation across multiple geo-replicated sites.
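As a rough illustration of why a counting set can sidestep write-write conflicts, the sketch below assumes that a counting set maps each element to an integer count, with add and remove implemented as an increment and a decrement. Under that assumption updates commute, so replicas that apply the same operations in different orders reach the same state. The class and method names are hypothetical and this is not Walter's implementation.

```python
class CountingSet:
    """Toy counting set: each element carries an integer count."""

    def __init__(self):
        self.counts = {}

    def add(self, element):
        self.counts[element] = self.counts.get(element, 0) + 1

    def remove(self, element):
        # The count may drop below zero if a remove is applied first.
        self.counts[element] = self.counts.get(element, 0) - 1

    def apply(self, operations):
        for name, element in operations:
            if name == "add":
                self.add(element)
            else:
                self.remove(element)

    def members(self):
        """Elements whose count is positive, in sorted order."""
        return sorted(e for e, c in self.counts.items() if c > 0)


# Updates issued concurrently at two sites (hypothetical friend list).
site_a_ops = [("add", "alice"), ("add", "bob")]
site_b_ops = [("remove", "bob"), ("add", "carol")]

replica_1 = CountingSet()
replica_1.apply(site_a_ops)
replica_1.apply(site_b_ops)

replica_2 = CountingSet()
replica_2.apply(site_b_ops)  # same updates, opposite order
replica_2.apply(site_a_ops)
```

Both replicas end with identical counts even though they saw the updates in different orders, which is the property that removes the need for conflict resolution logic in this toy setting.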
This dissertation focuses on fast system development for Information Extraction (IE). State-of-the-art systems heavily rely on extensively annotated corpora, which are slow to build for a new domain or task. Moreover, previous systems are mostly built with local evidence such as words in a short context window or features that are extracted at the sentence level. They usually generalize poorly on new domains.
This dissertation presents novel approaches for rapidly training an IE system for a new domain or task based on both local and global evidence. Specifically, we present three systems: a relation type extension system based on active learning, a relation type extension system based on semi-supervised learning, and a cross-domain bootstrapping system for domain adaptive named entity extraction.
The active learning procedure adopts features extracted at the sentence level as the local view and distributional similarities between relational phrases as the global view. It builds two classifiers based on these two views to find the most informative contention data points to request human labels so as to reduce annotation cost.
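The selection of contention points can be sketched as follows. The two classifiers are stand-ins that threshold a precomputed score per view, and ranking disagreements by the lower of the two confidences is an assumption made for this example, not necessarily the criterion used in the dissertation. All names and scores are hypothetical.

```python
def classify(score):
    """Stand-in classifier: returns (label, confidence) for a score in [0, 1]."""
    label = score >= 0.5
    confidence = score if label else 1.0 - score
    return label, confidence


def select_contention_points(unlabeled, budget):
    """Return ids of examples where the two views disagree, most informative first."""
    contentious = []
    for example in unlabeled:
        local_label, local_conf = classify(example["local"])
        global_label, global_conf = classify(example["global"])
        if local_label != global_label:
            # Assumed ranking: disagreement where both views are confident.
            contentious.append((min(local_conf, global_conf), example["id"]))
    contentious.sort(reverse=True)
    return [example_id for _, example_id in contentious[:budget]]


# Hypothetical per-view scores for four unlabeled relation candidates.
unlabeled = [
    {"id": "s1", "local": 0.9, "global": 0.8},  # views agree: positive
    {"id": "s2", "local": 0.9, "global": 0.1},  # confident disagreement
    {"id": "s3", "local": 0.4, "global": 0.7},  # weaker disagreement
    {"id": "s4", "local": 0.2, "global": 0.3},  # views agree: negative
]
to_label = select_contention_points(unlabeled, budget=2)
```

Only the examples on which the local and global views disagree are sent for human labeling, which is how the procedure aims to reduce annotation cost.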
The semi-supervised system aims to learn a large set of accurate patterns for extracting relations between names from only a few seed patterns. It estimates the confidence of a name pair both locally and globally: locally by looking at the patterns that connect the pair in isolation; globally by incorporating the evidence from the clusters of patterns that connect the pair. The use of pattern clusters can prevent semantic drift and contribute to a natural stopping criterion for semi-supervised relation pattern discovery.
For adapting a named entity recognition system to a new domain, we propose a cross-domain bootstrapping algorithm, which iteratively learns a model for the new domain with labeled data from the original domain and unlabeled data from the new domain. We first use word clusters as global evidence to generalize features that are extracted from a local context window. We then select self-learned instances as additional training examples using multiple criteria, including some based on global evidence.
Cooperative systems are ubiquitous nowadays. In a cooperative system, end users contribute resources to run the service instead of only receiving the service passively from the system. For example, users upload and comment on pictures and videos on Flickr and YouTube, and users submit and vote on news articles on Digg. As another example, users in BitTorrent contribute bandwidth and storage to help each other download content. As long as users behave as expected, these systems benefit immensely from user contribution. In fact, five of the ten most popular websites operate in this cooperative fashion (Facebook, YouTube, Blogger, Twitter, Wikipedia), and BitTorrent dominates global Internet traffic.
A robust cooperative system cannot blindly trust that its users will truthfully participate in the system. Malicious users seek to exploit the systems for profit. Selfish users consume resources but avoid contributing them. For example, adversaries have manipulated the voting system of Digg to promote their articles of dubious quality. Selfish users in public BitTorrent communities leave the system to avoid uploading files to others, resulting in drastic performance degradation for these content distribution systems. The ultimate way to disrupt the security and incentive mechanisms of cooperative systems is the Sybil attack, in which the adversary creates many Sybil identities (fake identities) and uses them to disrupt the systems' normal operation. No security or incentive mechanism works correctly if the system lacks robust identity management that can defend against Sybil attacks.
This thesis provides robust identity management schemes that are resilient to the Sybil attack and uses them to secure and incentivize user contribution in several example cooperative systems. The main theme of this work is to leverage the social network among users in designing secure and incentive-compatible cooperative systems. First, we develop a distributed admission control protocol, called Gatekeeper, that leverages the social network to admit most honest user identities and only a few Sybil identities into the system. Gatekeeper can be used as a robust identity management scheme for both centralized and decentralized cooperative systems. Second, we provide a vote aggregation system for content voting systems, called SumUp, that can prevent an adversary from casting many bogus votes for a piece of content using the Sybil attack. SumUp leverages unique properties of content voting systems to provide significantly better Sybil defense than applying a general admission control protocol such as Gatekeeper. Finally, we provide a robust reputation system, called Credo, that can be used to incentivize bandwidth contribution in peer-to-peer content distribution networks. Credo reputation can capture user contribution and is resilient to both Sybil and collusion attacks.
Background : Several recent comparative functional genomics projects have indicated that the co-regulation of many genes is conserved across species, at least in part. This suggests that comparative analysis of functional genomics data-sets could prove powerful in identifying co-regulated groups that are conserved across multiple species.
Results : We present recent work to extend our cMonkey algorithm to simultaneously bicluster heterogeneous data from multiple species to identify conserved modules of orthologous genes, which can yield evolutionary insights into the formation of regulatory modules. We also present results from the multi-species analysis of two triplets of bacteria. The first of these is a triplet of Gram-positive bacteria consisting of Bacillus subtilis, Bacillus anthracis, and Listeria monocytogenes, while the second is a triplet of Gram-negative bacteria that includes Escherichia coli, Salmonella typhimurium, and Vibrio cholerae. Finally, we present initial results from the multi-species biclustering analysis of human and mouse hematopoietic differentiation data.
Conclusion : Analysis of biclusters obtained revealed a surprising number of gene groups with conserved modularity and high biological significance as judged by several measures of cluster quality. We also highlight cases of interest from the Gram-positive triplet, including one that suggests a temporal difference in the expression of genes governing sporulation in the two Bacillus species. While analysis of the mouse and human hematopoietic differentiation is preliminary, it indicates the applicability of this analysis to eukaryotic systems, including comparison of cancer model systems. Finally, we suggest ways in which this analysis could be extended to identify divergent modules that may exist between normal and disease tissue.
In collusion-free protocols, subliminal communication is impossible and parties are thus unable to communicate any information beyond what the protocol allows. Collusion-free protocols are interesting for several reasons, but have specifically attracted attention because they can be used to reduce trust in game-theoretic mechanisms. Collusion-free protocols are impossible to achieve (in general) when all parties are connected by point-to-point channels, but exist under certain physical assumptions (Lepinski et al., STOC 2005) or in specific network topologies (Alwen et al., Crypto 2008).
In addition to proposing the definition, we explore necessary properties of the underlying communication resource. Next we provide a general feasibility result for collusion-preserving computation of arbitrary functionalities. We show that the resulting protocols enjoy an elegant (and surprisingly strong) fallback security even in the case when the underlying communication resource acts in a Byzantine manner. Finally, we investigate the implications of these results in the context of mechanism design.
Providing access to information for people in emerging regions is an important problem. Over the past decade there have been many proposed and increasingly numerous deployed systems to enable information access, but successes are few and modest at best. Internet access in emerging regions is still generally unusable or intolerably slow. Mobile phone applications are either not designed for the phones that poor people own or lack functionality, are difficult to use, or are expensive to operate. In this work we focus on enabling digital information access for people in emerging regions.
To advance the state of the art, we contribute numerous observations about how people access information in emerging regions, why the current models for web access and SMS platforms are broken, and techniques to enable applications over constrained Internet or SMS. The mechanisms presented here were designed after extensive field work in several different regions including rural, peri-urban, and urban areas in India, Kenya, Ghana, and Mexico. Multiple user studies were conducted throughout the course of system design and prototyping. We present a novel set of context-appropriate platforms and tools, some spanning several layers of the networking stack. Five complete systems were implemented and deployed in the field. First, Event Logger for Firefox (ELF) is an easily deployable Firefox extension which functions as both a web browsing analysis tool and an in-browser web optimization platform. Second, RuralCafe provides a platform for web search and browsing over extremely slow or intermittent networks. Third, Contextual Information Portals (CIP) provide cached repositories of web pages tailored to the particular context in which they are to be used. Fourth, UjU is a mobile application platform that simplifies the design of new SMS-based mobile applications. Finally, SMSFind is an SMS-based search service that runs on mobile phones without setup or subscription to a data plan.
Taken as a whole, the systems here are a comprehensive solution for addressing the problem of enabling digital information access in emerging regions.
In this thesis we present formal logical systems concerned with reasoning about algebraic data types.
The first formal system is based on the quantifier-free calculus (outermost universally quantified). This calculus is comprised of state change rules, and computations are performed by successive applications of these rules. Thereby, our calculus gives rise to an abstract decision procedure. This decision procedure determines if a given formula involving algebraic type members is valid. It is shown that this calculus is sound and complete. We also examine how this system performs practically and give experimental results. Our main contribution, as compared to previous work on this subject, is a new and more efficient decision procedure for checking satisfiability of the universal fragment within the theory of algebraic data types.
The second formal system, called Term Builder, is a deductive system based on higher order type theory, which subsumes second order and higher order logics. The main purpose of this calculus is to formulate and prove theorems about algebraic or other arbitrary user-defined types. Term Builder supports proof objects and is both an interactive theorem prover and a verifier. We describe the built-in deductive capabilities of Term Builder and show its consistency. The logic represented by our prover is intuitionistic. Naturally, it is also incomplete and undecidable, but its expressive power is much higher than that of the first formal system.
Among our achievements in building this theorem prover is an elegant and intuitive GUI for building proofs. Also, a new feature from the foundational viewpoint is that, in contrast with other approaches, we have a uniqueness-of-types property, which is not modulo beta-conversion.
In earlier work on domain decomposition methods for elliptic problems in the plane, an assumption that each subdomain is triangular, or a union of a few coarse triangles, has often been made. This is similar to what is required in geometric multigrid theory and is unrealistic if the subdomains are produced by a mesh partitioner. In an earlier paper, coauthored with Axel Klawonn, the authors introduced a coarse subspace for an overlapping Schwarz method with one degree of freedom for each subdomain vertex and one for each subdomain edge. A condition number bound proportional to $(1+\log(H/h))^2(1+H/\delta)$ was established assuming only that the subdomains are John domains; here $H/\delta$ measures the relative overlap between neighboring subdomains and $H/h$ the maximum number of elements across individual subdomains. We were also able to relate the rate of convergence to a parameter in an isoperimetric inequality for the subdomains into which the domain of the problem has been partitioned.
In this paper, the dimension of the coarse subspace is decreased by using only one degree of freedom for each subdomain vertex; if all subdomains have three edges, this leads to a reduction of the dimension of the coarse subspace by approximately a factor of four. In addition, the condition number bound is shown to be proportional to $(1+\log(H/h))(1+H/\delta)$ under a quite mild assumption on the relative length of adjacent subdomain edges.
In this study, the subdomains are assumed to be uniform in the sense of Peter Jones. As in our earlier work, the results are insensitive to arbitrarily large jumps in the coefficients of the elliptic problem across the interface between the subdomains.
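The reduction by roughly a factor of four can be checked with a rough count. The sketch below assumes that the subdomains form a triangulation-like partition of a planar domain and that boundary effects are ignored; the symbols $N_V$, $N_E$, and $N_S$ (numbers of subdomain vertices, subdomain edges, and subdomains) are notation introduced only for this sketch.

```latex
\begin{align*}
  3 N_S &\approx 2 N_E
    && \text{(three edges per subdomain, each interior edge shared by two subdomains)} \\
  N_V - N_E + N_S &\approx 1
    && \text{(Euler's formula for a planar partition)} \\
  \Rightarrow\quad N_E &\approx \tfrac{3}{2} N_S, \qquad N_V \approx \tfrac{1}{2} N_S \\
  \frac{N_V + N_E}{N_V} &\approx \frac{\tfrac{1}{2} N_S + \tfrac{3}{2} N_S}{\tfrac{1}{2} N_S} = 4
\end{align*}
```

Here $N_V + N_E$ is the dimension of the earlier coarse space (one degree of freedom per subdomain vertex and per subdomain edge) and $N_V$ is the dimension of the coarse space with vertex degrees of freedom only.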
Numerical results are presented which confirm the theory and demonstrate the usefulness of the algorithm for a variety of mesh decompositions and distributions of material properties. It is also shown that the new algorithm often converges faster than the older one in spite of the fact that the dimension of the coarse space has been decreased considerably.
Given the continuing popularity of C for building large-scale programs, such as Linux, Apache, and Bind, it is critical to provide effective tool support, including, for example, code browsing, bug finding, and automated refactoring. Common to all such tools is a need to parse C. But C programs contain not only the C language proper but also preprocessor invocations for file inclusion (#include), conditional compilation (#if, #ifdef, and so on), and macro definition/expansion (#define). Worse, the preprocessor is a textual substitution system, which is oblivious to C constructs and operates on individual tokens. At the same time, the preprocessor is indispensable for improving C's expressivity, abstracting over software/hardware dependencies, and deriving variations from the same code base. The x86 version of the Linux kernel, for example, depends on about 7,600 header files for file inclusion, 7,000 configuration variables for conditional compilation, and 520,000 macros for code expansion.
In this paper, we present a new tool for parsing all of C, including arbitrary preprocessor use. Our tool, which is called SuperC, is based on a systematic analysis of all interactions between lexing, preprocessing, and parsing to ensure completeness. It first lexes and preprocesses source code while preserving conditionals. It then parses the result using a novel variant of LR parsing, which automatically forks parsers when encountering a conditional and merges them again when reaching the same input in the same state. The result is a well-formed AST, containing static choice nodes for conditionals. While the parsing algorithm and engine are new, neither the grammar nor the LR parser table generator needs to change. We discuss the results of our problem analysis, the parsing algorithm itself, the pragmatics of building a real-world tool, and a demonstration on the x86 version of the Linux kernel.
Non-interactive zero-knowledge (NIZK) proofs have enjoyed much interest in cryptography since they were introduced more than twenty years ago by Blum et al. [BFM88]. While quite useful when designing modular cryptographic schemes, until recently NIZK could be realized efficiently only using certain heuristics. However, such heuristic schemes have been widely criticized. In this work we focus on designing schemes which avoid them. In [GS08], Groth and Sahai presented the first efficient (and currently the only) NIZK proof system in the standard model. The construction is based on bilinear maps and is limited to languages of certain satisfiable systems of equations. Given this expressibility limitation of the system of equations, we are interested in cryptographic primitives that are "compatible" with it. Equipped with such primitives and the Groth-Sahai proof system, we show how to construct cryptographic schemes efficiently in a modular fashion.
In this work, we describe properties required by any cryptographic scheme to mesh well with Groth-Sahai proofs. Towards this, we introduce the notion of "structure-preserving" cryptographic scheme. We present the first constant-size structure-preserving signature scheme for messages consisting of general bilinear group elements. This allows us (for the first time) to instantiate efficiently a modular construction of round-optimal blind signature based on the framework of Fischlin [Fis06].
Our structure-preserving homomorphic trapdoor commitment schemes yield efficient leakage-resilient signatures (in the bounded leakage model) which satisfy the standard security requirements and additionally tolerate any amount of leakage; all previous works satisfied at most two of those three properties.
We also build a structure-preserving encryption scheme which satisfies the standard CCA security requirements. While somewhat similar to the notion of verifiable encryption, it provides better properties and yields the first efficient two-party protocol for joint ciphertext computation. Note that the efficient realization of such a protocol was not previously possible even using the heuristics mentioned above.
Lastly, in this line of work, we revisit the notion of simulation extractability and define "true-simulation extractable" NIZK proofs. Although quite similar to the notion of simulation-sound extractable NIZK proofs, there is a subtle but rather important difference which makes it weaker and easier to instantiate efficiently. As it turns out, in many scenarios, this new notion is sufficient, and using it, we can construct efficient leakage-resilient signatures and CCA encryption schemes.
In this thesis we study unsupervised learning algorithms for training feature extractors and building deep learning models. We propose sparse-modeling algorithms as the foundation for unsupervised feature extraction systems. To reduce the cost of the inference process required to obtain the optimal sparse code, we model a feed-forward function that is trained to predict this optimal sparse code. Using an efficient predictor function enables the use of sparse coding in hierarchical models for object recognition. We demonstrate the performance of the developed system on several recognition tasks, including object recognition, handwritten digit classification and pedestrian detection. Robustness to noise or small variations in the input is a very desirable property for a feature extraction algorithm. In order to train locally-invariant feature extractors in an unsupervised manner, we use group sparsity criteria that promote similarity between the dictionary elements within a group. This model produces locally-invariant representations under small perturbations of the input, thus improving the robustness of the features. Many sparse modeling algorithms are trained on small image patches that are the same size as the dictionary elements. This forces the system to learn multiple shifted versions of each dictionary element. However, when used convolutionally over large images to extract features, these models produce very redundant representations. To avoid this problem, we propose convolutional sparse coding algorithms that yield a richer set of dictionary elements, reduce the redundancy of the representation and improve recognition performance.
REST is a software architectural style used for the design of highly scalable web applications. Interest in REST has grown rapidly over the past decade, spurred by the growth of open web APIs. On the other hand, there is also considerable confusion surrounding REST: many examples of supposedly RESTful APIs violate key REST constraints. We show that the constraints of REST and of RESTful HTTP can be precisely formulated within temporal logic. This leads to methods for model checking and run-time verification of RESTful behavior. We formulate several relevant verification questions and analyze their complexity.
The work presented focuses on two problems: synthesizing systems from formal specifications, and formalizing REST, a popular pattern for developing web applications.
For the synthesis problem, we distinguish between the synchronous and the asynchronous case. For the former, we solve a problem concerning a fundamental flaw in specification construction in previous work. We continue with exploring effective synthesis of asynchronous systems (programs on multi-threaded systems). Two alternative models of asynchrony are presented, and shown to be equally expressive for the purpose of synthesis.
REST is a software architectural style used for the design of highly scalable web applications. Interest in REST has grown rapidly over the past decade. However, there is also considerable confusion surrounding REST: many examples of supposedly RESTful APIs violate key REST constraints. We show that the constraints of REST and of RESTful HTTP can be precisely formulated within temporal logic. This leads to methods for model checking and run-time verification of RESTful behavior. We formulate several relevant verification questions and analyze their complexity.
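As a small illustration of what run-time verification of a RESTful constraint can look like, the sketch below monitors a finite trace for one property taken from HTTP semantics: safe methods such as GET must not change server state, which in temporal-logic style reads "always (safe method implies next state equals current state)". The trace format, the function name `first_safety_violation`, and the choice of this particular constraint are assumptions made for the example; the specific formulas developed in the work are not reproduced here.

```python
from typing import Hashable, Iterable, Optional, Tuple

# One observed step: (HTTP method, server state before, server state after).
Step = Tuple[str, Hashable, Hashable]

# Methods that HTTP defines as safe (they should not modify resource state).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def first_safety_violation(trace: Iterable[Step]) -> Optional[int]:
    """Check the invariant 'always (safe method -> state unchanged)' on a finite trace.

    Returns the index of the first violating step, or None if no step violates it.
    """
    for i, (method, before, after) in enumerate(trace):
        if method.upper() in SAFE_METHODS and before != after:
            return i
    return None


# A trace that respects the constraint and one that does not.
good_trace = [("GET", "v1", "v1"), ("PUT", "v1", "v2"), ("GET", "v2", "v2")]
bad_trace = [("GET", "v1", "v1"), ("GET", "v1", "v2")]
```

A monitor of this kind only inspects the observed trace, so it can flag violations at run time but cannot prove that an API satisfies the constraint on all executions; that is the role of model checking.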
The Reissner-Mindlin plate theory models a thin plate with thickness $t$. The condition number of finite element approximations of this model deteriorates badly as the thickness $t$ of the plate converges to 0. In this thesis, we develop an overlapping domain decomposition method for the Reissner-Mindlin plate model discretized by Falk-Tu elements with a convergence rate which does not deteriorate when $t$ converges to 0. We use modern overlapping methods which use the Schur complements to define coarse basis functions and show that the condition number of this overlapping method is bounded by $C(1 + H/\delta)^3 (1 + \log(H/h))^2$. Here $H$ is the maximum diameter of the subdomains, $\delta$ the size of overlap between subdomains, and $h$ the element size. Numerical examples are provided to confirm the theory. We also modify the overlapping method to develop a BDDC method for the Reissner-Mindlin model. We establish numerically an extension lemma to obtain a constant bound and an edge lemma to obtain a $C(1 + \log(H/h))^2$ bound. Given such bounds, the condition number of this BDDC method is shown to be bounded by $C(1 + \log(H/h))^2$.
Consider the problem of computing isotopic approximations of nonsingular curves and surfaces that are implicitly represented by equations of the form f(X, Y) = 0 and f(X, Y, Z) = 0. This fundamental problem has seen much progress along several fronts, but we will focus on domain subdivision algorithms. Two algorithms in this area are from Snyder (1992) and Plantinga and Vegter (2004). We introduce a family of new algorithms that combines the advantages of these two algorithms: like Snyder, we use the parameterizability criterion for subdivision, and like Plantinga and Vegter, we exploit nonlocal isotopy.
We first apply our approach to curves, resulting in a more efficient algorithm. We then extend our approach to surfaces. The extension is by no means routine, as the correctness arguments and case analysis are more subtle. Also, a new phenomenon arises in which local rules for constructing surfaces are no longer sufficient.
We further extend our algorithms in two important and practical directions. First, we allow subdivision cells to be non-squares or non-cubes, with arbitrary but bounded aspect ratios: in 2D, we allow boxes to be split into 2 or 4 children; and in 3D, we allow boxes to be split into 2, 4 or 8 children. Second, we allow the input region-of-interest (ROI) to have arbitrary geometry represented by a quadtree or octree, as long as the curves or surfaces have no singularities in the ROI and intersect the boundary of the ROI transversally.
Our algorithm is numerical because our primitives are based on interval arithmetic and exact BigFloat numbers. It is practical, easy to implement exactly (compared to algebraic approaches) and does not suffer from implementation gaps (compared to geometric approaches). We report some very encouraging experimental results, showing that our algorithms can be much more efficient than the algorithms of Plantinga and Vegter (2D and 3D) and Snyder (2D only).
The combination of ever increasing computational power and new mathematical models has fundamentally changed the field of computational chemistry. One example of this is the use of new algorithms for computing the charge density of a molecular system from which one can predict many physical properties of the system.
This thesis presents two new algorithms for minimizing the Kohn-Sham energy, which is used to describe a system of non-interacting electrons through a set of single-particle wavefunctions. By exploiting a known localization region of the wavefunctions, each algorithm evaluates the Kohn-Sham energy function and gradient at a set of iterates that have a special sparsity structure. We have chosen to represent the problem in real-space using finite-differences, allowing us to efficiently evaluate the energy function and gradient using sparse linear algebra. Detailed numerical experiments are provided on a set of representative molecules demonstrating the performance and robustness of these methods.
The recent advances in DNA sequencing technology and their many potential applications to Biology and Medicine have rekindled enormous interest in several classical algorithmic problems at the core of Genomics and Computational Biology: primarily, the whole-genome sequence assembly problem (WGSA). Two decades back, in the context of the Human Genome Project, the problem had received unprecedented scientific prominence: its computational complexity and intractability were thought to have been well understood; various competitive heuristics, thoroughly explored and the necessary software, properly implemented and validated. However, several recent studies, focusing on the experimental validation of de novo assemblies, have highlighted several limitations of the current assemblers.
Intrigued by these negative results, this dissertation reinvestigates the algorithmic techniques required to correctly and efficiently assemble genomes. Mired by its connection to a well-known NP-complete combinatorial optimization problem, historically, WGSA has been assumed to be amenable only to greedy and heuristic methods. By placing efficiency as their priority, these methods opted to rely on local searches, and are thus inherently approximate, ambiguous or error-prone. This dissertation presents a novel sequence assembler, SUTTA, that dispenses with the idea of limiting the solutions to just the approximated ones, and instead favors an approach that could potentially lead to an exhaustive (exponential-time) search of all possible layouts but tames the complexity through constrained search (Branch-and-Bound) and quick identification and pruning of implausible solutions.
Complementary to this problem is the task of validating the generated assemblies. Unfortunately, no commonly accepted method exists yet, and widely used metrics for comparing assembled sequences emphasize only size, poorly capturing quality and accuracy. This dissertation also addresses these concerns by developing a more comprehensive metric, the Feature-Response Curve, that, using ideas from the classical ROC (receiver-operating characteristic) curve, more faithfully captures the trade-off between contiguity and quality.
Finally, this dissertation demonstrates the advantages of a complete pipeline integrating base-calling (TotalReCaller) with assembly (SUTTA) in a Bayesian manner.
Raviart-Thomas finite elements are very useful for problems posed in H(div) since they are H(div)-conforming. We introduce two domain decomposition methods for solving vector field problems posed in H(div) discretized by Raviart-Thomas finite elements.
A two-level overlapping Schwarz method is developed. The coarse part of the preconditioner is based on energy-minimizing extensions and the local parts consist of traditional solvers on overlapping subdomains. We prove that our method is scalable and that the condition number grows linearly with the logarithm of the number of degrees of freedom in the individual subdomains and linearly with the relative overlap between the overlapping subdomains. The condition number of the method is also independent of the values and jumps of the coefficients across the interface between subdomains. We provide numerical results to support our theory.
We also consider a balancing domain decomposition by constraints (BDDC) method. The BDDC preconditioner consists of a coarse part involving primal constraints across the interface between subdomains and local parts related to the Schur complements corresponding to the local subdomain problems. We provide bounds on the condition number of the preconditioned linear system. Our numerical experiments suggest that the condition number has a polylogarithmic bound in terms of the number of degrees of freedom in the individual subdomains, for arbitrary jumps of the coefficients across the subdomain interfaces.
We study the question of achieving cryptographic security on devices that leak information about their internal secret state to an external attacker. This study is motivated by the prevalence of side-channel attacks, where the physical characteristics of a computation (e.g. timing, power-consumption, temperature, radiation, acoustics, etc.) can be measured, and may reveal useful information about the internal state of a device. Since some such leakage is inevitably present in almost any physical implementation, we believe that this problem cannot be addressed by physical countermeasures alone. Instead, it should already be taken into account when designing the mathematical specification of cryptographic primitives and included in the formal study of their security.
In this thesis, we propose a new formal framework for modeling the leakage available to an attacker. This framework, called the continual leakage model, assumes that an attacker can continually learn arbitrary information about the internal secret state of a cryptographic scheme at any point in time, subject only to the constraint that the rate of leakage is bounded. More precisely, our model assumes some abstract notion of time periods. In each such period, the attacker can choose to learn arbitrary functions of the current secret state of the scheme, as long as the number of output bits leaked is not too large. In our solutions, cryptographic schemes will continually update their internal secret state at the end of each time period. This will ensure that leakage observed in different time periods cannot be meaningfully combined to break the security of the cryptosystem. Although these updates modify the secret state of the cryptosystem, the desired functionality of the scheme is preserved, and the users can remain oblivious to these updates. We construct signatures, encryption, and secret sharing/storage schemes in this model.
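The idea of updating the secret state at the end of each time period while preserving functionality can be pictured with a toy example. The sketch below stores a key as two XOR shares and re-randomizes both shares each period, so the stored state changes but the recombined key does not. This is only an illustration of the refresh-without-changing-functionality idea: it is not one of the constructions of this thesis, it makes no claim of security under continual leakage, and the function names `split`, `refresh`, and `recombine` are hypothetical.

```python
import secrets
from typing import Tuple

Shares = Tuple[bytes, bytes]


def _xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(secret: bytes) -> Shares:
    """Split `secret` into two XOR shares; each share alone is uniformly random."""
    mask = secrets.token_bytes(len(secret))
    return mask, _xor(secret, mask)


def refresh(shares: Shares) -> Shares:
    """Re-randomize both shares at the end of a time period without changing the secret."""
    s0, s1 = shares
    r = secrets.token_bytes(len(s0))
    return _xor(s0, r), _xor(s1, r)


def recombine(shares: Shares) -> bytes:
    """Recover the secret from the two shares."""
    return _xor(shares[0], shares[1])


key = b"sixteen byte key"
state = split(key)
for _ in range(3):  # three time periods, one refresh at the end of each
    state = refresh(state)
```

From the user's point of view nothing changes across periods, since `recombine(state)` always yields the same key, while the stored representation differs from one period to the next.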
In this thesis, we focus on surface representation for particle-based fluid simulators such as Smoothed Particle Hydrodynamics (SPH). We first present a new surface reconstruction algorithm which formulates the implicit function as a sum of anisotropic smoothing kernels. The direction of anisotropy at a particle is determined by performing Weighted Principal Component Analysis (WPCA) over the neighboring particles. In addition, we perform a smoothing step that re-positions the centers of these smoothing kernels. Since these anisotropic smoothing kernels capture the local particle distributions more accurately, our method has advantages over existing methods in representing smooth surfaces, thin streams and sharp features of fluids. This method is fast, easy to implement, and the results demonstrate a significant improvement in the quality of reconstructed surfaces as compared to existing methods. Next, we introduce the idea of using an explicit triangle mesh to track the air/liquid interface in an SPH simulator.
Once an initial surface mesh is created, this mesh is carried forward in time using nearby particle velocities to advect the mesh vertices. The mesh connectivity remains mostly unchanged across time-steps; it is only modified locally for topology change events or for the improvement of triangle quality. In order to ensure that the surface mesh does not diverge from the underlying particle simulation, we periodically project the mesh surface onto an implicit surface defined by the physics simulation. The mesh surface presents several advantages over previous SPH surface tracking techniques: A new method for surface tension calculations clearly outperforms the state of the art in SPH surface tension for computer graphics. A new method for tracking detailed surface information (like colors) is less susceptible to numerical diffusion than competing techniques. Finally, a temporally-coherent surface mesh allows us to simulate high-resolution surface wave dynamics without being limited by the particle resolution of the SPH simulation.
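The WPCA step used to determine the direction of anisotropy at a particle can be sketched as follows. This is a minimal 2D version written for brevity, under stated assumptions: the weight falloff `1 - (d / radius)**3`, the function name `wpca_2d`, and the sample particle positions are all illustrative choices, and the kernel construction and smoothing step of the reconstruction algorithm are not shown.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def wpca_2d(center: Point, neighbors: List[Point], radius: float):
    """Weighted PCA of `neighbors` around `center` (2D for brevity).

    Returns (weighted_mean, (lambda_major, lambda_minor), major_axis_unit_vector).
    The weight 1 - (d / radius)**3 is an assumed, illustrative falloff.
    """
    weights = []
    for (x, y) in neighbors:
        d = math.hypot(x - center[0], y - center[1])
        weights.append(1.0 - (d / radius) ** 3 if d < radius else 0.0)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no neighbors inside the support radius")

    # Weighted mean of the neighborhood.
    mx = sum(w * x for w, (x, _) in zip(weights, neighbors)) / total
    my = sum(w * y for w, (_, y) in zip(weights, neighbors)) / total

    # Entries of the symmetric weighted covariance matrix [[a, b], [b, c]].
    a = sum(w * (x - mx) ** 2 for w, (x, _) in zip(weights, neighbors)) / total
    b = sum(w * (x - mx) * (y - my) for w, (x, y) in zip(weights, neighbors)) / total
    c = sum(w * (y - my) ** 2 for w, (_, y) in zip(weights, neighbors)) / total

    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    half_trace = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle of the major axis
    eigenvalues = (half_trace + spread, half_trace - spread)
    return (mx, my), eigenvalues, (math.cos(theta), math.sin(theta))


# Particles in a thin, nearly horizontal stream: the major axis should follow x.
stream = [(-0.4, 0.0), (-0.2, 0.01), (0.0, 0.0), (0.2, -0.01), (0.4, 0.0)]
mean, (lam_major, lam_minor), axis = wpca_2d((0.0, 0.0), stream, radius=1.0)
```

The ratio between the two eigenvalues indicates how elongated the neighborhood is; a kernel stretched along `axis` can then follow a thin stream more closely than an isotropic one.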
The Satisfiability Modulo Theories Competition (SMT-COMP) is an annual competition aimed at stimulating the advance of the state-of-the-art techniques and tools developed by the Satisfiability Modulo Theories (SMT) community. As with the first three editions, SMT-COMP 2008 was held as a satellite event of CAV 2008, held July 7-14, 2008. This report gives an overview of the rules, competition format, benchmarks, participants and results of SMT-COMP 2008.
Most cryptographic primitives require randomness (for example, to generate secret keys). Usually, one assumes that perfect randomness is available, but, conceivably, such primitives might be built under weaker, more realistic assumptions. This is known to be achievable for many authentication applications, where entropy alone is typically sufficient. In contrast, all known techniques for achieving privacy seem to fundamentally require (nearly) perfect randomness. We ask whether this is just a coincidence or whether privacy inherently requires true randomness.
We completely resolve this question for information-theoretic private-key encryption, where parties wish to encrypt a b-bit value using a shared secret key sampled from some imperfect source of randomness S. Our technique also extends to related primitives which are sufficiently binding and hiding, including computationally secure commitments and public-key encryption.
Our main result shows that if such an n-bit source S allows for a secure encryption of b bits, where b > log n, then one can deterministically extract nearly b almost perfect random bits from S. Further, the restriction that b > log n is nearly tight: there exist sources S allowing one to perfectly encrypt (log n - log log n) bits, but not to deterministically extract even a single slightly unbiased bit.
Hence, to a large extent, true randomness is inherent for encryption: either the key length must be exponential in the message length b, or one can deterministically extract nearly b almost unbiased random bits from the key. In particular, the one-time pad scheme is essentially "universal".
Gene duplication can lead to genetic redundancy or functional divergence, when duplicated genes evolve independently or partition the original function. In this dissertation, we employed machine learning approaches to study two different views of this problem: 1) Redundome, which explored the redundancy of gene pairs in the genome of Arabidopsis thaliana, and 2) ContactBind, which focused on functional divergence of transcription factors by mutating contact residues to change binding affinity.
In the Redundome project, we used machine learning techniques to classify gene family members into redundant and non-redundant gene pairs in Arabidopsis thaliana, where sufficient genetic and genomic data is available. We showed that Support Vector Machines were two-fold more precise than single attribute classifiers, and performed among the best of the machine learning algorithms. Machine learning methods predicted that about half of all genes in Arabidopsis showed the signature of predicted redundancy with at least one but typically fewer than three other family members. Interestingly, a large proportion of predicted redundant gene pairs were relatively old duplications (e.g., Ks>1), suggesting that redundancy is stable over long evolutionary periods. The genome-wide predictions were plotted with similarity trees based on ClustalW alignment scores, and can be accessed at http://redundome.bio.nyu.edu.
In the ContactBind project, we used Bayesian networks to model dependencies between contact residues in transcription factors and binding site sequences. Based on the models learned from various binding experiments, we predicted binding motifs and their locations on promoters for three families of transcription factors in three species. The predictions are publicly available at http://contactbind.bio.nyu.edu. The website also provides tools to predict binding motifs and their locations for novel protein sequences of transcription factors. Users can construct their own Bayesian networks for new families once such familial binding data is available.
The notion of identity-based encryption (IBE) was proposed as an economical alternative to public-key infrastructures. IBE is also a useful building block in various cryptographic primitives such as searchable encryption. A generalization of IBE is attribute-based encryption (ABE). A major application of ABE is fine-grained cryptographic access control of data. Research on these topics is still actively continuing.
However, the security and privacy of IBE and ABE hinge on the assumption that the authority which sets up the system is honest. Our study aims to reduce this trust assumption.
The inherent key escrow of IBE has sparked numerous debates in the cryptography/security community. A curious key generation center (KGC) can simply generate the user's private key to decrypt a ciphertext. However, can a KGC still decrypt if it does not know the intended recipient of the ciphertext? This question is answered by formalizing KGC anonymous ciphertext indistinguishability (ACI-KGC). None of the existing practical pairing-based IBE schemes without random oracles achieves this notion. In this thesis, we propose an IBE scheme with ACI-KGC, and a new system architecture with an anonymous secret key generation protocol such that the KGC can issue keys to authenticated users without knowing the list of users' identities. This also matches the practice that authentication should be done with the local registration authorities. Our proposal can be viewed as mitigating the key escrow problem in a new dimension.
For ABE, it is not realistic to trust a single authority to monitor all attributes and hence distributing control over many attribute-authorities is desirable. A multi-authority ABE scheme can be realized with a trusted central authority (CA) which issues part of the decryption key according to a user's global identifier (GID). However, this CA may have the power to decrypt every ciphertext, and the use of a consistent GID allows the attribute-authorities to collectively build a full profile with all of a user's attributes. This thesis proposes a solution without the trusted CA and without compromising users' privacy, thus making ABE more usable in practice.
Underlying both contributions are our new privacy-preserving architectures enabled by borrowing techniques from anonymous credentials.
We study policies aiming to minimize the weighted sum of completion times of jobs in the context of coordination mechanisms for selfish scheduling problems. Our goal is to design local policies that achieve a good price of anarchy in the resulting equilibria for unrelated machine scheduling. In short, we present the first constant-factor-approximate coordination mechanisms for this model.
First, we present a generalization of the ShortestFirst policy for weighted jobs, called SmithRule; we prove that it achieves an approximation ratio of 4 and we show that any set of non-preemptive ordering policies can result in equilibria with approximation ratio at least 3 even for unweighted jobs. Then, we present ProportionalSharing, a preemptive strongly local policy that beats this lower bound of 3; we show that this policy achieves an approximation ratio of 2.61 for the weighted sum of completion times and that the EqualSharing policy achieves an approximation ratio of 2.5 for the (unweighted) sum of completion times. Furthermore, we show that ProportionalSharing induces potential games (in which best-response dynamics converge to pure Nash equilibria).
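As a rough illustration of the ordering that a policy like SmithRule applies on one machine, the sketch below sorts jobs by the ratio of processing time to weight (the classical Smith's rule for a single machine) and computes the weighted sum of completion times. It is a minimal sketch under stated assumptions: the job tuples, function names, and example data are hypothetical, and the game-theoretic setting with unrelated machines and selfish jobs is not modeled.

```python
from typing import List, Tuple

# Hypothetical job representation: (name, processing_time, weight).
Job = Tuple[str, float, float]


def smith_order(jobs: List[Job]) -> List[Job]:
    """Order jobs by increasing processing_time / weight (Smith's rule)."""
    return sorted(jobs, key=lambda job: job[1] / job[2])


def weighted_completion_time(ordered_jobs: List[Job]) -> float:
    """Sum of weight * completion time when jobs run back to back."""
    clock = 0.0
    total = 0.0
    for _, processing_time, weight in ordered_jobs:
        clock += processing_time      # completion time of this job
        total += weight * clock
    return total


# Illustrative data only.
jobs = [("a", 3.0, 1.0), ("b", 1.0, 2.0), ("c", 2.0, 2.0)]

print([name for name, _, _ in smith_order(jobs)])   # ['b', 'c', 'a']
print(weighted_completion_time(smith_order(jobs)))  # 14.0
print(weighted_completion_time(jobs))               # 23.0 (arbitrary order)
```

With unit weights the ordering reduces to shortest-first, which matches the description of SmithRule as a weighted generalization of ShortestFirst.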
All of our upper bounds are for the robust price of anarchy, defined by Roughgarden [36], so they naturally extend to mixed Nash equilibria, correlated equilibria, and regret minimization dynamics. Finally, we prove that our price of anarchy bound for ProportionalSharing can be used to design a new combinatorial constant-factor approximation algorithm minimizing weighted completion time for unrelated machine scheduling.
Software plays an increasingly crucial role in nearly every facet of modern life, from communications infrastructure to the control systems in automobiles, airplanes, and power plants. To achieve the highest degree of reliability for the most critical pieces of software, it is necessary to move beyond ad hoc testing and review processes towards verification, that is, proving with formal methods that a piece of code exhibits exactly those behaviors allowed by its specification and no others.
A significant portion of the existing software infrastructure is written in low-level languages like C and C++. Features of these languages present significant verification challenges. For example, unrestricted pointer manipulation means that we cannot prove even the simplest properties of programs without first collecting precise information about potential aliasing relationships between variables.
In this thesis, I present several contributions. The first is a general framework for combining program analyses that are only conditionally sound. Using this framework, I show it is possible to design a sound verification tool that relies on a separate, previously-computed pointer analysis.
The second contribution of this thesis is Cascade, a multi-platform, multi-paradigm framework for verification. Cascade includes support for precise analysis of low-level C code, as well as for higher-level languages such as SPL.
Finally, I describe a novel technique for the verification of datatype invariants in low-level systems code. The programmer provides a high-level specification for a low-level implementation in the form of inductive datatype declarations and code assertions. The connection between the high-level semantics and the implementation code is then checked using bit-precise reasoning. An implementation of this datatype verification technique is available as a Cascade module.
We consider four problems connected by the common thread of geometry. The first three involve problems and algorithms that arise in applications that a priori do not involve geometry, yet geometry turns out to be the right language for visualizing and analyzing them. In the fourth, we generalize some well known results in geometry to the topological plane. The techniques we use come from probability and topology.
First, we consider two algorithms that work well in practice but whose success is not well understood theoretically.
Greedy routing is a routing mechanism that is commonly used in wireless sensor networks. While routing on the Internet uses standard established protocols, routing in ad-hoc networks with little structure (like sensor networks) is more difficult. Practitioners have devised algorithms that work well in practice; however, there were no known theoretical guarantees. We provide the first such result in this area by showing that greedy routing can be made to work on planar triangulations.
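To make the mechanism concrete, the following minimal sketch shows the usual form of greedy geographic routing: each node forwards the packet to the neighbor closest to the destination, and delivery fails if no neighbor is strictly closer than the current node. The node names, coordinates, and the `greedy_route` function are hypothetical, and the embedding used for planar triangulations in the work summarized above is not reproduced here.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def distance(p: Point, q: Point) -> float:
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_route(positions: Dict[str, Point],
                 neighbors: Dict[str, List[str]],
                 source: str,
                 target: str) -> Tuple[List[str], bool]:
    """Forward greedily; return (path, delivered)."""
    path = [source]
    current = source
    while current != target:
        candidates = neighbors.get(current, [])
        if not candidates:
            return path, False
        best = min(candidates,
                   key=lambda v: distance(positions[v], positions[target]))
        # Stuck in a local minimum: no neighbor is strictly closer.
        if (distance(positions[best], positions[target])
                >= distance(positions[current], positions[target])):
            return path, False
        current = best
        path.append(current)
    return path, True


# A small triangulated example where greedy forwarding succeeds.
positions = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0), "D": (1.0, 1.0)}
neighbors = {"A": ["B", "D"], "B": ["A", "C", "D"],
             "C": ["B", "D"], "D": ["A", "B", "C"]}
print(greedy_route(positions, neighbors, "A", "C"))

# A sparse example with a "void": the only neighbor of S is farther from T.
void_positions = {"S": (0.0, 0.0), "U": (-1.0, 1.0),
                  "V": (1.0, 2.0), "T": (2.0, 0.0)}
void_neighbors = {"S": ["U"], "U": ["S", "V"], "V": ["U", "T"], "T": ["V"]}
print(greedy_route(void_positions, void_neighbors, "S", "T"))
```

The second example shows why guarantees are non-trivial: a path from S to T exists, but the greedy rule never finds it.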
Linear Programming is a technique for optimizing a linear function subject to linear constraints. Simplex Algorithms are a family of algorithms that have proven quite successful in solving Linear Programs in practice. However, examples of Linear Programs on which these algorithms are very inefficient have been obtained by researchers. In order to explain this discrepancy between theory and practice, many authors have shown that Simplex Algorithms are efficient in expectation on randomized Linear Programs. We strengthen these results by proving a partial concentration bound for the Shadow Vertex Simplex Algorithm.
Next, we point out a limitation in an algorithm that is used commonly by practitioners and suggest a way of overcoming this.
Recommendation Systems are algorithms that are used to recommend goods (books, movies etc.) to users based on the similarities between their past preferences and those of other users. Low Rank Approximation is a common method used for this. We point out a common limitation of this method in the presence of ill-conditioning: the presence of multiple local minima. We also suggest a simple averaging based technique to overcome this limitation.
Finally, we consider some basic results in convexity like Radon's, Helly's and Caratheodory's theorems and generalize them to the topological plane, i.e., a plane which has the concept of a linear path which is analogous to a straight line but no notion of metric or distances.
A domain decomposition algorithm, similar to classical iterative substructuring algorithms, is presented for two-dimensional problems in the space H0(curl). It is defined in terms of a coarse space and local subspaces associated with individual edges of the subdomains into which the domain of the problem has been subdivided. The algorithm differs from others in three basic respects. First, it can be implemented in an algebraic manner that does not require access to individual subdomain matrices or a coarse discretization of the domain; this is in contrast to algorithms of the BDDC, FETI-DP, and classical two-level overlapping Schwarz families. Second, favorable condition number bounds can be established over a broader range of subdomain material properties than in previous studies. Third, we are able to develop theory for quite irregular subdomains and bounds for the condition number of our preconditioned conjugate gradient algorithm, which depend only on a few geometric parameters.
The coarse space for the algorithm is based on simple energy minimization concepts, and its dimension equals the number of subdomain edges. Numerical results are presented which confirm the theory and demonstrate the usefulness of the algorithm for a variety of mesh decompositions and distributions of material properties.
The maximum entropy (MaxEnt) framework has been studied extensively in the supervised setting. Here, the goal is to find a distribution p that maximizes an entropy function while enforcing data constraints, so that the expected values of some (pre-defined) features with respect to p match their empirical counterparts approximately. Using different entropy measures, different model spaces for p, and different approximation criteria for the data constraints yields a family of discriminative supervised learning methods (e.g., logistic regression, conditional random fields, least squares and boosting). This framework is known as the generalized maximum entropy framework.
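For concreteness, one standard instance of the problem described above can be written as follows. This is an illustrative sketch, not the thesis's exact formulation: the symbols $f_j$, $\tilde{E}$, and $\epsilon_j$ are notational assumptions, Shannon entropy stands in for the general entropy function, and the problem is written for an unconditional distribution over a finite set for brevity.

```latex
\[
\begin{aligned}
\max_{p}\quad & H(p) = -\sum_{x} p(x)\,\log p(x) \\
\text{subject to}\quad
  & \bigl|\, E_{p}[f_j] - \tilde{E}[f_j] \,\bigr| \le \epsilon_j,
    \qquad j = 1,\dots,m, \\
  & \sum_{x} p(x) = 1, \qquad p(x) \ge 0 .
\end{aligned}
\]
```

Here $E_{p}[f_j]$ is the expectation of feature $f_j$ under the model, $\tilde{E}[f_j]$ is its empirical average, and $\epsilon_j$ controls how closely the two must agree. Replacing the entropy, the model space, or the form of the constraint gives the other members of the family.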
Semi-supervised learning (SSL) has emerged in the last decade as a promising field that enables utilizing unlabeled data along with labeled data so as to increase the accuracy and robustness of inference algorithms. However, most SSL algorithms to date have had trade-offs, for instance in terms of scalability or applicability to multi-categorical data.
In this thesis, we extend the generalized MaxEnt framework to develop a family of novel SSL algorithms using two different approaches:
i. Introducing Similarity Constraints: We incorporate unlabeled data via modifications to the primal MaxEnt objective in terms of additional potential functions. A potential function stands for a closed proper convex function that can take the form of a constraint and/or a penalty representing our structural assumptions on the data geometry. Specifically, we impose similarity constraints as additional penalties based on the semi-supervised smoothness assumption; i.e., we restrict the generalized MaxEnt problem such that similar samples have similar model outputs.
ii. Augmenting Constraints on Model Features: We incorporate unlabeled data to enhance the estimates on the model and empirical expectations based on our assumptions on the data geometry.
In particular, we derive the semi-supervised formulations for three specific instances of the generalized MaxEnt on conditional distributions, namely logistic regression and kernel logistic regression for multi-class problems, and conditional random fields for structured output prediction problems. A thorough empirical evaluation on standard data sets that are widely used in the literature demonstrates the validity and competitiveness of the proposed algorithms. In addition to these benchmark data sets, we apply our approach to two real-life problems: i. vision based robot grasping, and ii. remote sensing image classification, where the scarcity of the labeled training samples is the main bottleneck in the learning process. For the particular case of grasp learning, we propose a combination of semi-supervised learning and active learning, another sub-field of machine learning that is focused on the scarcity of labeled samples, when the problem setup is suitable for incremental labeling.
The novel SSL algorithms proposed in this thesis have numerous advantages over the existing semi-supervised algorithms as they yield convex, scalable, inherently multi-class loss functions that can be kernelized naturally.
Design errors in computer systems, i.e. bugs, can cause inconvenience, loss of data and time, and in some cases catastrophic damages. One approach for improving design correctness is formal methods: techniques aiming at mathematically establishing that a piece of hardware or software satisfies certain properties. For some industrial cases in which formal methods are utilized, quantified first order formulas in satisfiability modulo theories (SMT) are useful. This dissertation presents several novel techniques for solving quantified formulas in SMT.
In general, the satisfiability of quantified formulas in SMT is undecidable. The practical approach for general quantifier reasoning in SMT is heuristics-based instantiation. This dissertation proposes a number of new heuristics that address several challenges. Experimental results show that with the new heuristics significantly more benchmarks can be solved than before.
When considering only formulas within certain fragments of first order logic, it is possible to have complete algorithms based on instantiation. We propose several new fragments, and we prove that formulas in these fragments can be solved by a complete algorithm based on instantiation. For satisfiable quantified formulas in these fragments, we show how to construct the models.
As SMT solvers grow in complexity, their correctness becomes questionable. A practical method to improve the correctness is to check the proofs from SMT solvers. We propose a proof translator that translates proofs from the SMT solver CVC3 into the trusted solver HOL Light, which actually checks the proofs. Experiments with the proof translator discovered a faulty proof rule in CVC3 and two mis-labeled quantified benchmarks in the SMT benchmark library SMT-LIB.
The classic method of Nelson and Oppen for combining decision procedures requires the theories to be stably-infinite. Unfortunately, some important theories do not fall into this category (e.g. the theory of bit-vectors). To remedy this problem, previous work introduced the notion of polite theories. Polite theories can be combined with any other theory using an extension of the Nelson-Oppen approach. In this paper we revisit the notion of polite theories, fixing a subtle flaw in the original definition. We give a new combination theorem which specifies the degree to which politeness is preserved when combining polite theories. We also give conditions under which politeness is preserved when instantiating theories by identifying two sorts. These results lead to a more general variant of the theorem for combining multiple polite theories.
In many domains we face the problem of determining the underlying causal structure from time-course observations of a system. Whether we have neural spike trains in neuroscience, gene expression levels in systems biology, or stock price movements in finance, we want to determine why these systems behave the way they do. For this purpose we must assess which of the myriad possible causes are significant while aiming to do so with a feasible computational complexity. At the same time, there has been much work in philosophy on what it means for something to be a cause, but comparatively little attention has been paid to how we can identify these causes. Algorithmic approaches from computer science have provided the first steps in this direction, but fail to capture the complex, probabilistic and temporal nature of the relationships we seek.
This dissertation presents a novel approach to the inference of general (type-level) and singular (token-level) causes. The approach combines philosophical notions of causality with algorithmic approaches built on model checking and statistical techniques for false discovery rate control. By using a probabilistic computation tree logic to describe both cause and effect, we allow for complex relationships and explicit description of the time between cause and effect as well as the probability of this relationship being observed (e.g. "a and b until c, causing d in 10-20 time units"). Using these causal formulas and their associated probabilities, we develop a novel measure for the significance of a cause for its effect, thus allowing discovery of those that are statistically interesting, determined using the concepts of multiple hypothesis testing and false discovery control. We develop algorithms for testing these properties in time-series observations and for relating the inferred general relationships to token-level events (described as sequences of observations). Finally, we illustrate these ideas with example data from both neuroscience and finance, comparing the results to those found with other inference methods. The results demonstrate that our approach achieves superior control of false discovery rates, due to its ability to appropriately represent and infer temporal information.
The Reissner-Mindlin plate theory models a thin plate with thickness t. The condition numbers of finite element approximations of this model deteriorate badly as the thickness t of the plate converges to 0. In this paper, we develop an overlapping domain decomposition method for the Reissner-Mindlin plate model discretized by the Falk-Tu elements with a convergence rate that does not deteriorate when t converges to 0. It is shown that the condition number of this overlapping method is bounded by C(1 + H/delta)^3 (1 + log(H/h))^2. Here H is the maximum diameter of the subdomains, delta the size of overlap between subdomains, and h the element size. Numerical examples are provided to confirm the theory.
We collect time series from real-world phenomena, such as gene interactions in biology or word frequencies in consecutive news articles. However, these data present us with an incomplete picture, as they result from complex dynamical processes involving unobserved state variables. Research on state-space models is motivated by simultaneously trying to infer hidden state variables from observations, as well as learning the associated dynamic and generative models.
I have developed a tractable, gradient-based method for training Dynamic Factor Graphs (DFG) with continuous latent variables. A DFG consists of (potentially nonlinear) factors modeling joint probabilities between hidden and observed variables. The DFG assigns a scalar energy to each configuration of variables, and a gradient-based inference procedure finds the minimum-energy state sequence for a given observation sequence. We approximate maximum likelihood learning by minimizing the expected energy over training sequences with respect to the factors' parameters. These alternated inference and parameter updates constitute a deterministic EM-like procedure.
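The alternation between inference and parameter updates can be illustrated on a toy model. The sketch below is an assumption-laden simplification, not the thesis's implementation: it uses a scalar latent state, a linear dynamics factor with one hypothetical parameter `a`, a squared-error observation factor, and plain gradient descent with hand-picked step sizes. All function names are illustrative.

```python
from typing import List


def energy(z: List[float], y: List[float], a: float) -> float:
    """Observation factor (y_t - z_t)^2 plus dynamics factor (z_t - a*z_{t-1})^2."""
    obs = sum((yt - zt) ** 2 for yt, zt in zip(y, z))
    dyn = sum((z[t] - a * z[t - 1]) ** 2 for t in range(1, len(z)))
    return obs + dyn


def energy_grad_z(z: List[float], y: List[float], a: float) -> List[float]:
    """Gradient of the energy with respect to every latent state z_t."""
    grad = [2.0 * (zt - yt) for zt, yt in zip(z, y)]
    for t in range(1, len(z)):
        residual = z[t] - a * z[t - 1]
        grad[t] += 2.0 * residual
        grad[t - 1] -= 2.0 * a * residual
    return grad


def infer_states(y: List[float], a: float,
                 steps: int = 200, lr: float = 0.05) -> List[float]:
    """Inference step: gradient descent on z toward a minimum-energy sequence."""
    z = [0.0] * len(y)
    for _ in range(steps):
        grad = energy_grad_z(z, y, a)
        z = [zt - lr * gt for zt, gt in zip(z, grad)]
    return z


def update_parameter(z: List[float], a: float, lr: float = 0.1) -> float:
    """Learning step: one gradient step on the dynamics parameter a."""
    grad = sum(-2.0 * z[t - 1] * (z[t] - a * z[t - 1]) for t in range(1, len(z)))
    return a - lr * grad


def train(y: List[float], a: float = 0.0, rounds: int = 100) -> float:
    """Alternate inference and parameter updates (a deterministic EM-like loop)."""
    for _ in range(rounds):
        z = infer_states(y, a)
        a = update_parameter(z, a)
    return a


# Illustrative observations generated by z_t = 0.5 * z_{t-1}.
y = [1.0, 0.5, 0.25, 0.125, 0.0625]
a_hat = train(y)
print(a_hat, energy(infer_states(y, a_hat), y, a_hat))
```

In the setting described above, the factors would be nonlinear (for example, neural networks) and the parameter update would be applied to all factor parameters; only the overall alternation carries over from this sketch.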
Using nonlinear factors such as deep, convolutional networks, DFGs were shown to reconstruct chaotic attractors, to outperform a time series prediction benchmark, and to successfully impute motion capture data where a large number of markers were missing. In joint work with the NYU Plant Systems Biology Lab, DFGs have subsequently been employed for the discovery of gene regulation networks by learning the dynamics of mRNA expression levels.
DFGs have also been extended into a deep auto-encoder architecture, and used on time-stamped text documents, with word frequencies as inputs. We focused on collections of documents that exhibit a structure over time. Working as dynamic topic models, DFGs could extract a latent trajectory from consecutive political speeches; applied to news articles, they achieved state-of-the-art text categorization and retrieval performance.
Finally, I used an embodiment of DFGs to evaluate the likelihood of discrete sequences of words in text corpora, relying on dynamics on word embeddings. Collaborating with AT&T Labs Research on a project in speech recognition, we have improved on existing continuous statistical language models by enriching them with word features and long-range topic dependencies.
BDDC algorithms are constructed and analyzed for the system of almost incompressible elasticity discretized with Gauss-Lobatto-Legendre spectral elements in three dimensions. Initially mixed spectral elements are employed to discretize the almost incompressible elasticity system, but a positive definite reformulation is obtained by eliminating all pressure degrees of freedom interior to each subdomain into which the spectral elements have been grouped. Appropriate sets of primal constraints can be associated with the subdomain vertices, edges, and faces so that the resulting BDDC methods have a fast convergence rate independent of the almost incompressibility of the material. In particular, the condition number of the BDDC preconditioned operator is shown to depend only weakly on the polynomial degree $n$, the ratio $H/h$ of subdomain and element diameters, and the inverse of the inf-sup constants of the subdomains and the underlying mixed formulation, while being scalable, i.e., independent of the number of subdomains and robust, i.e., independent of the Poisson ratio and Young's modulus of the material considered. These results also apply to the related FETI-DP algorithms defined by the same set of primal constraints. Numerical experiments carried out on parallel computing systems confirm these results.
The tools of computer science can be a tremendous help to the working biologist. Two broad areas where this is particularly true are visualization and prediction. In visualization, the size of the data involved often makes meaningful exploration of the data and discovery of salient features difficult and time-consuming. Similarly, intelligent prediction algorithms can greatly reduce the lab time required to achieve significant results, or can reduce an intractable space of potential experiments to a tractable size.
Whereas the thesis discusses both a visualization technique and a machine learning problem, the thesis presentation will focus exclusively on the machine learning problem: prediction of temperature-sensitive mutations from protein structure. Temperature-sensitive mutations are a tremendously valuable research tool, particularly for studying genes such as yeast essential genes. To date, most methods for generating temperature-sensitive mutations involve large-scale random mutations followed by an intensive screening and characterization process. While there have been successful efforts to improve this process by rational design of temperature-sensitive proteins, surprisingly little work has been done in the area of predicting those mutations that will exhibit a temperature-sensitive phenotype. We describe a system that, given the structure of a protein of interest, uses a combination of protein structure prediction and machine learning to provide a ranked "top 5" list of likely candidates for temperature-sensitive mutations.
Kernel-based algorithms have been used with great success in a variety of machine learning applications. These include algorithms such as support vector machines for classification, kernel ridge regression, ranking algorithms, clustering algorithms, and virtually all popular dimensionality reduction algorithms, since they are special instances of kernel principal component analysis.
However, the choice of the kernel, which is crucial to the success of these algorithms, has traditionally been left entirely to the user. Rather than requesting the user to commit to a specific kernel, multiple kernel algorithms require the user only to specify a family of kernels. This family of kernels can be used by a learning algorithm to form a combined kernel and derive an accurate predictor. This is a problem that has attracted a lot of attention recently, both from the theoretical point of view and from the algorithmic, optimization, and application points of view.
This thesis presents a number of novel theoretical and algorithmic results for learning with multiple kernels.
It gives the first tight margin-based generalization bounds for learning kernels with Lp regularization. In particular, our margin bounds for L1 regularization are shown to have only a logarithmic dependency on the number of kernels, which is a significant improvement over all previous analyses. Our results also include stability-based guarantees for a class of regression algorithms. In all cases, these guarantees indicate the benefits of learning with a large number of kernels.
We also present a family of new two-stage algorithms for learning kernels based on a notion of alignment and give an extensive analysis of the properties of these algorithms. We show the existence of good predictors for the notion of alignment we define and give efficient algorithms for learning a maximum alignment kernel by showing that the problem can be reduced to a simple QP.
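As a hedged illustration of what an alignment score between kernels can look like, the sketch below computes a centered alignment: the normalized Frobenius inner product of two centered kernel matrices, here a base kernel and a target kernel built from labels. This is one common definition and is an assumption on our part; the notion of alignment defined in the thesis may differ in its details, and the QP used to learn the maximum alignment kernel is not shown. All names and data are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def frobenius_inner(a: Matrix, b: Matrix) -> float:
    """Sum of elementwise products of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def center(k: Matrix) -> Matrix:
    """Center a kernel matrix: subtract row and column means, add the grand mean."""
    n = len(k)
    row_mean = [sum(row) / n for row in k]
    col_mean = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def alignment(k1: Matrix, k2: Matrix) -> float:
    """Normalized Frobenius inner product of the centered matrices."""
    c1, c2 = center(k1), center(k2)
    return frobenius_inner(c1, c2) / math.sqrt(
        frobenius_inner(c1, c1) * frobenius_inner(c2, c2))


# Illustrative labels and kernels.
labels = [1.0, 1.0, -1.0, -1.0]
target = [[yi * yj for yj in labels] for yi in labels]   # target kernel y y^T
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

print(alignment(target, target))    # a kernel is perfectly aligned with itself
print(alignment(identity, target))  # an uninformative kernel scores lower
```

In a two-stage scheme, scores of this kind would be used in the first stage to choose a combination of base kernels, and a standard kernel method would then be trained with the combined kernel in the second stage.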
Finally, we also report the results of extensive experiments with our two-stage algorithms in classification and regression tasks, which show an improvement both over the uniform combination of kernels and over other state-of-the-art learning kernel methods for L1 and L2 regularization. These might constitute the first series of results for learning with multiple kernels that demonstrate a consistent improvement over a uniform combination of kernels.
In computer graphics and user interface design, selection problems are those that require the user to select a collection consisting of a small number of items from a much larger library. This dissertation explores selection problems in two diverse domains: large personal multimedia collections, containing items such as personal photographs or songs, and camera positions for 3D objects, where each item is a different viewpoint observing an object. Multimedia collections have discrete items with strong associated metadata, while camera positions form a continuous space but are weak in metadata. In either domain, the items to be selected have rich interconnections and dependencies, making it difficult to successfully apply simple techniques (such as ranking) to aid the user. Accordingly, we develop separate approaches for the two domains.
For personal multimedia collections, we leverage the semantic metadata associated with each item (such as song title, artist name, etc.) and provide the user with a simple query language to describe their desired collection. Our system automatically suggests a collection of items that conform to the user's query. Since any query language has limited expressive power, and since users often create collections via exploration, we provide various refinement techniques that allow the user to expand, refine and explore their collection directly through examples.
For camera positioning, we do not have the advantage of having semantic metadata for each item, unlike in media collections. We instead create a proxy viewpoint goodness function which can be used to guide the solution of various selection problems involving camera viewpoints. This function is constructed from several different attributes of the viewpoint, such as how much surface area is visible, or how "curvy" the silhouette is. Since there are many possible viewpoint goodness functions, we conducted a large user study of viewpoint preference and used the results to evaluate thousands of different functions and find the best ones. While we suggest several goodness functions to the practitioner, our user study data and methodology can be used to evaluate any proposed goodness function; we hope it will be a useful tool for other researchers.
Stream processing applications such as algorithmic trading, MPEG processing, and web content analysis are ubiquitous and essential to business and entertainment. Language designers have developed numerous domain-specific languages that are both tailored to the needs of their applications, and optimized for performance on their particular target platforms. Unfortunately, the goals of generality and performance are frequently at odds, and prior work on the formal semantics of stream processing languages does not capture the details necessary for reasoning about implementations. This paper presents Brooklet, a core calculus for stream processing that allows us to reason about how to map languages to platforms and how to optimize stream programs. We translate from three representative languages, CQL, StreamIt, and Sawzall, to Brooklet, and show that the translations are correct. We formalize three popular and vital optimizations, data-parallel computation, operator fusion, and operator re-ordering, and show under which conditions they are correct. Language designers can use Brooklet to specify exactly how new features or languages behave. Language implementors can use Brooklet to show exactly under which circumstances new optimizations are correct. In ongoing work, we are developing an intermediate language for streaming that is based on Brooklet. We are implementing our intermediate language on System S, IBM's high-performance streaming middleware.
Mass spectrometry is a powerful technique in analytical chemistry that was originally designed to determine the composition of small molecules in terms of their constituent elements. In the last several decades, it has begun to be used for much more complex tasks, including the detailed analysis of the amino acid sequence that makes up an unknown protein and even the identification of multiple proteins present in a complex mixture. The latter problem is largely unsolved and the principal subject of this dissertation.
The fundamental difficulty in the analysis of mass spectrometry data is that of ill-posedness. There are multiple solutions consistent with the experimental data and the data is subject to significant amounts of noise. In this work, we have developed application-specific machine learning algorithms that (partially) overcome this ill-posedness. We make use of labeled examples of a single class of peptide fragments and of the unlabeled fragments detected by the instrument. This places the approach within the broader framework of semi-supervised learning.
Recently, there has been considerable interest in classification problems of this type, where the learning algorithm only has access to labeled examples of a single class and unlabeled data. The motivation for such problems is that in many applications, examples of one of the two classes are easy and inexpensive to obtain, whereas the acquisition of examples of a second class is difficult and labor-intensive. For example, in document classification, positive examples are documents that address a specific subject, while unlabeled documents are abundant. In movie rating, the positive data are the movies chosen by clients, while the unlabeled data are all remaining movies in a collection. In medical imaging, positive (labeled) data correspond to images of tissue affected by a disease, while the remaining available images of the same tissue comprise the unlabeled data. Protein identification using mass spectrometry is another variant of such a general problem.
In this work, we propose application-specific machine learning algorithms to address this problem. The reliable identification of proteins from mixtures using mass spectrometry would provide an important tool in both biomedical research and clinical practice.
Modern learning problems in computer vision, natural language processing, computational biology, and other areas are often based on large data sets of thousands to millions of training instances. However, several standard learning algorithms, such as kernel-based algorithms, e.g., Support Vector Machines, Kernel Ridge Regression, Kernel PCA, do not easily scale to such orders of magnitude. This thesis focuses on sampling-based matrix approximation techniques that help scale kernel-based algorithms to large-scale datasets. We address several fundamental theoretical and empirical questions including:
What approximation should be used? We discuss two common sampling-based methods, providing novel theoretical insights regarding their suitability for various applications and experimental results motivated by this theory. Our results show that one of these methods, the Nystrom method, is superior in the context of large-scale learning.
Do these approximations work in practice? We show the effectiveness of approximation techniques on a variety of problems. In the largest study to date for manifold learning, we use the Nystrom method to extract low-dimensional structure from high-dimensional data to effectively cluster face images. We also report good empirical results for kernel ridge regression and kernel logistic regression.
How should we sample columns? A key aspect of sampling-based algorithms is the distribution according to which columns are sampled. We study both fixed and adaptive sampling schemes as well as a promising ensemble technique that can be easily parallelized and generates superior approximations, both in theory and in practice.
How well do these approximations work in theory? We provide theoretical analyses of the Nystrom method to understand when this technique should be used. We present guarantees on approximation accuracy based on various matrix properties and analyze the effect of matrix approximation on actual kernel-based algorithms.
This work has important consequences for the machine learning community since it extends to large-scale applications the benefits of kernel-based algorithms. The crucial aspect of this research, involving low-rank matrix approximation, is of independent interest within the field of numerical linear algebra.
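As a rough illustration of the sampling-based approximation discussed above, the following is a minimal sketch of the standard Nystrom construction with uniform column sampling. The function name, the linear test kernel, and the pseudo-inverse cutoff are assumptions chosen for the example; they are not taken from the thesis, which also studies adaptive and ensemble sampling schemes not shown here.

```python
import numpy as np

def nystrom_approximation(K, num_columns, rng):
    """Approximate a symmetric PSD matrix K from a uniform sample of its columns.

    Hypothetical helper for illustration: returns C @ pinv(W) @ C.T, where C holds
    the sampled columns and W is the block of K at the sampled rows and columns.
    """
    n = K.shape[0]
    idx = rng.choice(n, size=num_columns, replace=False)  # uniform, without replacement
    C = K[:, idx]                # n x l matrix of sampled columns
    W = K[np.ix_(idx, idx)]      # l x l intersection block
    # The cutoff discards numerically zero singular values of W (an assumed tolerance).
    return C @ np.linalg.pinv(W, rcond=1e-10) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
K = X @ X.T                      # linear kernel, rank at most 5
K_approx = nystrom_approximation(K, num_columns=20, rng=rng)
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"relative Frobenius error: {rel_err:.2e}")
```

Because the test kernel has rank at most 5 and 20 columns are sampled, the approximation is exact up to rounding here; for full-rank kernels the error depends on the spectrum and on how the columns are sampled.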
We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts and complete objects. To build our model we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This scheme makes it possible to robustly learn multiple layers of representation and we show a model with 4 layers, trained on images from the Caltech-101 dataset. We use our model to produce image decompositions that, when used as input to standard classification schemes, give a significant performance gain over low-level edge features and yield an overall performance competitive with leading approaches.
Inherent in many interesting regression problems is a rich underlying inter-sample "Relational Structure". In these problems, the samples may be related to each other in such a way that the unknown variables associated with any sample depend not only on its individual attributes, but also on the variables associated with related samples. One such problem, whose importance is further emphasized by the present economic crisis, is understanding real estate prices. The price of a house clearly depends on its individual attributes, such as the number of bedrooms. However, the price also depends on the neighborhood in which the house lies and on the time period in which it was sold. This effect of neighborhood and time on the price is not directly measurable. It is merely reflected in the prices of other houses in the vicinity that were sold around the same time period. Uncovering these spatio-temporal dependencies can help us better understand house prices, while at the same time improving prediction accuracy.
Problems of this nature fall in the domain of "Statistical Relational Learning". However, the drawback of most models proposed so far is that they cater only to classification problems. To this end, we propose "relational factor graph" models for doing regression in relational data. A single factor graph is used to capture, first, dependencies among the individual variables of a sample, and second, dependencies among variables associated with multiple samples. The proposed models are capable of capturing hidden inter-sample dependencies via latent variables, and also permit non-linear log-likelihood functions in parameter space, thereby allowing considerably more complex architectures. Efficient inference and learning algorithms for relational factor graphs are proposed. The models are applied to predicting the prices of real estate properties and to constructing house price indices. The relational aspect of the model accounts for the hidden spatio-temporal influences on the price of every house. Experiments show that one can achieve considerably superior performance by identifying and using the underlying spatio-temporal structure associated with the problem. To the best of our knowledge, this is the first work in the direction of relational regression and is also the first work in constructing house price indices by simultaneously accounting for the spatio-temporal effects on house prices using a large-scale industry-standard data set.
Overlapping Schwarz methods are considered for mixed finite element approximations of linear elasticity, with discontinuous pressure spaces, as well as for compressible elasticity approximated by standard conforming finite elements. The coarse components of the preconditioners are based on spaces, with a number of degrees of freedom per subdomain which is uniformly bounded, and which are similar to those previously developed for scalar elliptic problems and domain decomposition methods of iterative substructuring type, i.e., methods based on non-overlapping decompositions of the domain. The local components of the new preconditioners are based on solvers on a set of overlapping subdomains.
We discuss the problem of finding the second largest eigenvalue of an operator that defines a reversible Markov chain. The second largest eigenvalue governs the rate at which the statistics of the Markov chain converge to equilibrium. Scientific applications include understanding the very slow dynamics of some models of dynamic glass. Applications in computing include estimating the rate of convergence of Markov chain Monte Carlo algorithms.
Most practical Markov chains have state spaces so large that direct or even iterative methods from linear algebra are inapplicable. The size of the state space, which is the dimension of the eigenvalue problem, grows exponentially with the system size. This makes it impossible to store a vector (for sparse methods), let alone a matrix (for dense methods). Instead, we seek a method that uses only time correlation from samples produced from the Markov chain itself.
In this thesis, we propose a novel Krylov subspace type method to estimate the second eigenvalue from the simulation data of the Markov chain using test functions which are known to have good overlap with the slowest mode. This method starts with the naive Rayleigh quotient estimate of the test function and refines it to obtain an improved estimate of the second eigenvalue. We apply the method to a few model problems and the estimate compares very favorably with the known answer. We also apply the estimator to some Markov chains occurring in practice, most notably in the study of glasses. We show experimentally that our estimator is more accurate and stable for these problems compared to the existing methods.
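To make the starting point concrete, here is a minimal sketch of the naive Rayleigh quotient estimate only, computed as the lag-1 autocovariance of a test function divided by its variance along a single trajectory. The toy chain (a Gaussian AR(1) process, which is reversible and whose second eigenvalue equals its coefficient a, with eigenfunction f(x) = x) and all function names are assumptions for illustration; the Krylov-type refinement proposed in the thesis is not reproduced.

```python
import numpy as np

def rayleigh_quotient_estimate(f_values):
    """Naive estimate: lag-1 autocovariance over variance of the centered test function."""
    f = np.asarray(f_values, dtype=float)
    f = f - f.mean()                       # project out the constant (top) eigenfunction
    return float(np.dot(f[:-1], f[1:]) / np.dot(f, f))

def simulate_ar1(a, n_steps, rng):
    """Toy reversible chain x[t] = a * x[t-1] + sqrt(1 - a^2) * noise, stationary N(0, 1)."""
    x = np.empty(n_steps)
    x[0] = rng.normal()
    noise = rng.normal(size=n_steps) * np.sqrt(1.0 - a * a)
    for t in range(1, n_steps):
        x[t] = a * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(1)
chain = simulate_ar1(a=0.9, n_steps=200_000, rng=rng)
estimate = rayleigh_quotient_estimate(chain)   # test function f(x) = x
print(f"estimated second eigenvalue: {estimate:.3f} (true value for this toy chain: 0.9)")
```

In this toy case the test function is exactly the slowest mode, so the naive estimate is already accurate; when the test function only overlaps with the slowest mode, the Rayleigh quotient underestimates the second eigenvalue, which is the situation a refinement step is meant to address.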
The creation of 3D models is a fundamental task in computer graphics. The task is required by professional artists working on movies, television, and games, and desired by casual users who wish to make their own models for use in virtual worlds or as a hobby.
In this thesis, we consider approaches to creating and editing 3D models that minimize the user's thinking in 3D. In particular, our approaches do not require the user to manipulate 3D positions in space or mentally invert complex 3D-to-2D mappings. We present interfaces and algorithms for the creation of 3D surfaces, for texturing, and for adding small-to-medium scale geometric detail.
First, we present a novel approach for texture placement and editing based on direct manipulation of textures on the surface. Compared to conventional tools for surface texturing, our system combines UV-coordinate specification and texture editing into one seamless process, reducing the need for careful initial design of parameterization and providing a natural interface for working with textures directly on 3D surfaces.
Second, we present a system for free-form surface modeling that allows a user to modify a shape by changing its rendered, shaded image using stroke-based drawing tools. A new shape, whose rendered image closely approximates user input, is computed using an efficient and stable surface optimization procedure. We demonstrate how several types of free-form surface edits which may be difficult to cast in terms of standard deformation approaches can be easily performed using our system.
Third, we present a single-view 2D interface for 3D modeling based on the idea of placing 2D primitives and annotations on an existing, pre-made sketch or image. Our interface frees users to create 2D sketches from arbitrary angles using their preferred tool---including pencil and paper---which they then "describe" using our tool to create a 3D model. Our primitives are manipulated with persistent, dynamic handles, and our annotations take the form of markings commonly used in geometry textbooks.
In this thesis we consider proximity problems on point sets. Proximity problems arise in all fields of computer science, with broad application to computational geometry, machine learning, computational biology, data mining and the like. In particular, we will consider the problems of approximate nearest neighbor search, and dynamic maintenance of a spanner for a point set.
It has been conjectured that all algorithms for these two problems suffer from the "curse of dimensionality," which means that their run times grow exponentially with the dimension of the point set. To avoid this undesirable growth, we consider point sets that have doubling dimension lambda. We first present a dynamic data structure that uses linear space and supports a (1+e)-approximate nearest neighbor search of the point set. We then extend this algorithm to allow the dynamic maintenance of a low-degree (1+e)-spanner for the point set. The query and update times of these structures are exponential in lambda (as opposed to exponential in the dimension); when lambda is small, this provides a significant speed-up over known algorithms, and when lambda is constant these run times are optimal up to a constant. Even when no assumptions are made on lambda, the query and update times of the nearest neighbor search structure match the best known run times for approximate nearest neighbor search (up to a constant multiple in lambda). Further, the stretch of the spanner is optimal, and its update times improve on those of all previously known algorithms.
The creativity support community has a long history of providing valuable tools to artists and designers. Similarly, creative digital media practice has proven a valuable pedagogical strategy for teaching core computational ideas. Neither strain of research has focused on the domain of literary art, however, instead targeting visual and aural media almost exclusively.
To address this situation, this thesis presents a software toolkit created specifically to support creativity in computational literature. Two primary hypotheses direct the bulk of the research presented: first, that it is possible to implement effective creativity support tools for literary art given current resource constraints; and second, that such tools, in addition to facilitating new forms of literary creativity, will provide unique opportunities for computer science education.
Designed both for practicing artists and for pedagogy, the research presented directly addresses impediments to participation in the field for a diverse range of users and provides an end-to-end solution for courses attempting to engage the creative faculties of computer science students, and to introduce a wider demographic--from writers, to digital artists, to media and literary theorists--to procedural literacy and computational thinking.
The tools and strategies presented have been implemented, deployed, and iteratively refined in real-world contexts over the past three years. In addition to their use in large-scale projects by contemporary artists, they have provided effective support for multiple iterations of 'Programming for Digital Art & Literature', a successful inter-disciplinary computer science course taught by the author.
Taken together, this thesis provides a novel set of tools for a new domain, and demonstrates their real-world efficacy in providing both creativity and pedagogical support for a diverse and emerging population of users.
We extend "A boundary integral method for simulating the dynamics of inextensible vesicles suspended in a viscous fluid in 2D", Veerapaneni et al., Journal of Computational Physics, 228(7), 2009, to the case of three-dimensional axisymmetric vesicles of spherical or toroidal topology immersed in viscous flows. Although the main components of the algorithm are similar in spirit to the 2D case (spectral approximation in space, semi-implicit time-stepping scheme), the main differences are that the bending and viscous forces require new analysis, the linearization for the semi-implicit schemes must be rederived, a fully implicit scheme must be used for the toroidal topology to eliminate a CFL-type restriction, and a novel numerical scheme for the evaluation of the 3D Stokes single-layer potential on an axisymmetric surface is necessary to speed up the calculations. By introducing these novel components, we obtain a time-stepping scheme that experimentally is unconditionally stable, has low cost per time step, and is third-order accurate in time. We present numerical results to analyze the cost and convergence rates of the scheme. To verify the solver, we compare it to a constrained variational approach to compute equilibrium shapes that does not involve interactions with a viscous fluid. To illustrate the applicability of the method, we consider a few vesicle-flow interaction problems: the sedimentation of a vesicle, and the interactions of one and three vesicles with a background Poiseuille flow.
Our goal is to solve nonlinear contact problems. We consider bodies in contact with each other divided into subdomains, which in turn are unions of elements. The contact surface between the bodies is unknown a priori, and we have a nonpenetration condition between the bodies, which is essentially an inequality constraint. We choose to use an active set method to solve such problems, which has both outer iterations in which the active set is updated, and inner iterations in which a (linear) minimization problem is solved on the current active face. In the first part of this dissertation, we review the basics of domain decomposition methods. In the second part, we consider how to solve the inner minimization problems. Using an approach based purely on FETI algorithms with only Lagrange multipliers as unknowns, as has been developed by the engineering community, does not lead to a scalable algorithm with respect to the number of subdomains in each body. We prove that such an algorithm has a condition number estimate which depends linearly on the number of subdomains across a body; numerical experiments suggest that this is the best possible bound. We also consider a new method based on the saddle point formulation of the FETI methods with both displacement vectors and Lagrange multipliers as unknowns. The resulting system is solved with a block-diagonal preconditioner which combines the one-level FETI and the BDDC methods. This approach allows the use of inexact solvers. We show that this new method is scalable with respect to the number of subdomains, and that its convergence rate depends only logarithmically on the number of degrees of freedom of the subdomains and bodies. In the last part of this dissertation, a model contact problem is solved by two approaches. The first one is a nonlinear algorithm which combines an active set method and the new method of Chapter 4. We also present a novel way of finding an initial active set. The second one uses the SMALBE algorithm, developed by Dostal et al. We show that the former approach has advantages over the latter.
Recently, Computational Biology has emerged as one of the most exciting areas of computer science research, not only because of its immediate impact on many biomedical applications (e.g., personalized medicine, drug and vaccine discovery, and tools for diagnostics and therapeutic interventions), but also because it raises many new and interesting combinatorial and algorithmic questions in the process. In this thesis, we focus on robust and efficient algorithms to analyze biological networks, primarily targeting protein networks, possibly the most fascinating networks in computational biology in terms of their structure, evolution and complexity, as well as because of their role in various genetic and metabolic diseases.
Classically, protein networks have been studied statically, i.e., without taking into account time-dependent metamorphic changes in network topology and functionality. In this work, we introduce new analysis techniques that view protein networks as being dynamic in nature, evolving over time, and diverse in regulatory patterns at various stages of the system development. Our analysis is capable of dealing with multiple time-scales: ranging from the slowest time-scale corresponding to evolutionary time between species, speeding up to inter-species pathway evolution time, and finally, moving to the other extreme at the cellular developmental time-scale.
We also provide a new method to overcome limitations imposed by the corrupting effects of experimental noise (e.g., high false positive and false negative rates) in Yeast Two-Hybrid (Y2H) networks, which often provide primary data for protein complexes. Our new combinatorial algorithm measures connectivity between proteins in a Y2H network not by edges but by edge-disjoint paths, which better reflects pathway evolution within a single-species network. This algorithm has been shown to be robust against increasing false positives and false negatives, as estimated using variation of information and separation measures.
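The idea of measuring connectivity by edge-disjoint paths rather than by single edges can be sketched with a textbook unit-capacity maximum-flow computation (by Menger's theorem, the maximum flow equals the number of edge-disjoint paths). The function name and the toy graph are assumptions for illustration; this is a generic augmenting-path routine, not the combinatorial algorithm developed in the thesis.

```python
from collections import defaultdict, deque

def edge_disjoint_paths(edges, source, target):
    """Count edge-disjoint paths between two nodes of an undirected graph.

    Hypothetical helper: BFS augmenting paths on a unit-capacity residual graph.
    """
    if source == target:
        raise ValueError("source and target must differ")
    capacity = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v in edges:
        capacity[(u, v)] += 1      # each undirected edge can carry one unit
        capacity[(v, u)] += 1      # in either direction
        neighbors[u].add(v)
        neighbors[v].add(u)
    count = 0
    while True:
        # Breadth-first search for a path with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return count
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        count += 1

# Toy interaction graph: A and D share a direct edge and two indirect routes.
example_edges = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("A", "D")]
strong = edge_disjoint_paths(example_edges, "A", "D")
weak = edge_disjoint_paths(example_edges + [("D", "E")], "A", "E")
print(strong, weak)
```

A pair such as A and D stays connected if any single edge is a false positive or is missing, whereas the pair A and E depends entirely on one observed interaction, which is the intuition behind path-based connectivity being more robust to noise.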
In addition, we have devised a new way to incorporate evolutionary information in order to significantly improve classification of proteins, especially those isolated in their own networks or surrounded by poorly characterized neighbors. In our method, the networks of two (or more) species are joined by edges of high sequence similarity so that protein-homologs of different species can exchange information and acquire new and improved functional associations.
Finally, we have integrated many of these techniques into one tool to create a novel analysis of the life cycle of the malaria parasite P. falciparum at the scale of reaction time, at the single-cell level, and encompassing its entire intra-erythrocytic developmental cycle (IDC). Our approach allows connecting time-course gene expression profiles of consecutive IDC stages in order to assign functions to unannotated malaria proteins and predict potential targets for vaccine and drug development.
Curvilinear features allow one to represent a variety of real world regular patterns like honeycomb tiling as well as very complicated random patterns like networks of furrows on the surface of the human skin. We have developed a set of methods and new data representations for solving key problems related to curvilinear features, which include robust detection of intricate networks of curvilinear features from digital images, GPU-based sharp rendering of fields with curvilinear features, and a parametric synthesis approach to generate systems of curvilinear features with desirable local configurations and global control.
Existing edge-detection techniques may underperform in the presence of noise, usually do not link the detected edge points into chains, often fail on complex structures, depend heavily on the initial guess, and require a significant manual phase. We have developed a technique based on active contours, or snakes, which avoids manual initial positioning of the snakes and can detect large networks of curves with complex junctions without user guidance.
The standard bilinear interpolation of piecewise continuous fields results in unwanted smoothing along the curvilinear discontinuities. Spatially varying features can be best represented as a function of the distance to the discontinuity curves and its gradient. We have developed a real-time, GPU-based method for unsigned distance function field and its gradient field interpolation which preserves discontinuity feature curves, represented by quadratic Bezier curves, with minimal restriction on their topology.
Detail features are very important visual cues that make computer-generated imagery look less artificial. Instead of using a sample-based synthesis technique, which lacks user control over features and usually produces gaps in features or breaks feature coherency, we have explored an alternative approach of generating features using random fibre processes. We have developed a Gibbs-type random process of linear fibres based on local fibre interactions. It allows generating non-stationary curvilinear networks with some degree of regularity, and provides an intuitive set of parameters which directly define the local configurations and the global pattern of fibres.
For random systems of linear fibres which approximately form two orthogonal dominant orientation fields, we have adapted a streamline placement algorithm which converts such systems into overlapping random sets of coherent smooth curves.
The applicability of machine learning methods is often limited by the amount of available labeled data, and by the ability (or inability) of the designer to produce good internal representations and good similarity measures for the input data vectors.
The aim of this thesis is to alleviate these two limitations by proposing algorithms to learn good internal representations, and invariant feature hierarchies from unlabeled data. These methods go beyond traditional supervised learning algorithms, and rely on unsupervised, and semi-supervised learning.
In particular, this work focuses on "deep learning" methods, a set of techniques and principles to train hierarchical models. Hierarchical models produce feature hierarchies that can capture complex non-linear dependencies among the observed data variables in a concise and efficient manner. After training, these models can be employed in real-time systems because they compute the representation by a very fast forward propagation of the input through a sequence of non-linear transformations.
When the paucity of labeled data does not allow the use of traditional supervised algorithms, each layer of the hierarchy can be trained in sequence starting at the bottom by using unsupervised or semi-supervised algorithms. Once each layer has been trained, the whole system can be fine-tuned in an end-to-end fashion. We propose several unsupervised algorithms that can be used as building blocks to train such feature hierarchies. We investigate algorithms that produce sparse overcomplete representations and features that are invariant to known and learned transformations. These algorithms are designed using the Energy-Based Model framework and gradient-based optimization techniques that scale well on large datasets. The principle underlying these algorithms is to learn representations that are at the same time sparse, able to reconstruct the observation, and directly predictable by some learned mapping that can be used for fast inference at test time.
Building on the general principles underlying these algorithms, we validate these models on a variety of tasks, from visual object recognition to text document classification and retrieval.
The two standard methods of obtaining a least-squares optimal estimator are (1) Bayesian estimation, in which one assumes a prior distribution on the true values and combines this with a model of the measurement process to obtain an optimal estimator, and (2) supervised regression, in which one optimizes a parametric estimator over a training set containing pairs of corrupted measurements and their associated true values. But many real-world systems do not have access to either supervised training examples or a prior model. Here, we study the problem of obtaining an optimal estimator given a measurement process with known statistics, and a set of corrupted measurements of random values drawn from an unknown prior. We develop a general form of nonparametric empirical Bayesian estimator that is written as a direct function of the measurement density, with no explicit reference to the prior. We study the observation conditions under which such "prior-free" estimators may be obtained, and we derive specific forms for a variety of different corruption processes. Each of these prior-free estimators may also be used to express the mean squared estimation error as an expectation over the measurement density, thus generalizing Stein's unbiased risk estimator (SURE) which provides such an expression for the additive Gaussian noise case. Minimizing this expression over measurement samples provides an "unsupervised regression" method of learning an optimal estimator from noisy measurements in the absence of clean training data. We show that combining a prior-free estimator with its corresponding unsupervised regression form produces a generalization of the "score matching" procedure for parametric density estimation, and we develop an incremental form of learning for estimators that are written as a linear combination of nonlinear kernel functions. 
Finally, we show through numerical simulations that the convergence of these estimators can be comparable to their supervised or Bayesian counterparts.
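For the additive Gaussian noise case mentioned above, the prior-free form and the corresponding risk expression are classical and can be written out explicitly. The following is the standard scalar statement (often attributed to Miyasawa and Tweedie for the estimator and to Stein for the risk), given here as an illustration; the symbols $x$, $y$, $\sigma$, $p_Y$ and $g$ are notation assumed for this sketch, and the generalizations to other corruption processes derived in the work are not reproduced.

```latex
% Measurement model: y = x + n, with n ~ N(0, sigma^2) independent of x,
% and p_Y the density of the noisy measurements.
\begin{align}
  \hat{x}(y) &= \mathbb{E}[x \mid y]
              = y + \sigma^{2}\,\frac{d}{dy}\log p_Y(y), \\
% Stein's unbiased risk estimate for any (weakly differentiable) estimator
% written as \hat{x}(y) = y + g(y):
  \mathbb{E}\big[(\hat{x}(y) - x)^{2}\big]
             &= \sigma^{2} + \mathbb{E}\big[\,g(y)^{2} + 2\sigma^{2}\,g'(y)\big].
\end{align}
```

Both right-hand sides involve only the measurement density and the noise variance, which is what allows the estimator to be learned from noisy samples alone.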
The modern proliferation of very large audio, video, and biological databases has created a need for the design of effective methods for indexing and searching highly variable or uncertain data. Classical search and indexing algorithms deal with clean or perfect input sequences. However, an index created from speech transcriptions is marked with errors and uncertainties stemming from the use of imperfect statistical models in the speech recognition process. Similarly, automatic transcription of music, such as assigning a sequence of notes to represent a stream of music audio, is prone to errors. How can we generalize search and indexing algorithms to deal with such uncertain inputs?
This thesis presents several novel algorithms, analyses, and general techniques and tools for effective indexing and search that not only tolerate but actually exploit this uncertainty. In particular, it develops an algorithmic foundation for music identification, or content-based music search; presents novel automata-theoretic results applicable generally to a variety of search and indexing tasks; and describes new algorithms for topic segmentation, or automatic splitting of speech streams into topic-coherent segments.
We devise a new technique for music identification in which each song is represented by a distinct sequence of music sounds, called "music phonemes." In our approach, we learn the set of music phonemes, as well as a unique sequence of music phonemes characterizing each song, from training data using an unsupervised algorithm. We also propose a novel application of factor automata to create a compact mapping of music phoneme sequences to songs. Using these techniques, we construct an efficient and robust music identification system for a large database of songs.
We further design new algorithms for compact indexing of uncertain inputs based on suffix and factor automata and give novel theoretical guarantees for their space requirements. Suffix automata and factor automata represent the set of all suffixes or substrings of a set of strings, and are used in numerous indexing and search tasks, including the music identification system just mentioned. We show that the suffix automaton or factor automaton of a set of strings U has at most 2Q-2 states, where Q is the number of nodes of a prefix-tree representing the strings in U, a significant improvement over previous work. We also describe a matching new linear-time algorithm for constructing the suffix automaton S or factor automaton F of U in time O(|S|).
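For readers unfamiliar with these structures, the following is a minimal sketch of the classical online construction of the suffix automaton of a single string, which is the special case where U contains one string. It is not the new algorithm for sets of strings described above, and the function names are assumptions for illustration. For a single string of length n the prefix tree is a chain with Q = n + 1 nodes, so the 2Q-2 bound stated above corresponds to at most 2n states.

```python
def build_suffix_automaton(text):
    """Classical online suffix automaton construction for one string.

    Returns a list of transition dictionaries, one per state; state 0 is initial.
    """
    transitions = [{}]
    suffix_link = [-1]
    length = [0]          # length of the longest string reaching each state
    last = 0
    for ch in text:
        cur = len(transitions)
        transitions.append({})
        length.append(length[last] + 1)
        suffix_link.append(-1)
        p = last
        while p != -1 and ch not in transitions[p]:
            transitions[p][ch] = cur
            p = suffix_link[p]
        if p == -1:
            suffix_link[cur] = 0
        else:
            q = transitions[p][ch]
            if length[p] + 1 == length[q]:
                suffix_link[cur] = q
            else:
                # Split state q by cloning it with a shorter maximal length.
                clone = len(transitions)
                transitions.append(dict(transitions[q]))
                length.append(length[p] + 1)
                suffix_link.append(suffix_link[q])
                while p != -1 and transitions[p].get(ch) == q:
                    transitions[p][ch] = clone
                    p = suffix_link[p]
                suffix_link[q] = clone
                suffix_link[cur] = clone
        last = cur
    return transitions

def contains_factor(transitions, pattern):
    """Treating every state as accepting, the automaton recognizes all substrings."""
    state = 0
    for ch in pattern:
        if ch not in transitions[state]:
            return False
        state = transitions[state][ch]
    return True

text = "abcbc"
automaton = build_suffix_automaton(text)
num_states = len(automaton)
prefix_tree_nodes = len(text) + 1   # Q for a single string
print(num_states, contains_factor(automaton, "bcb"), contains_factor(automaton, "cc"))
```

A substring query then costs time proportional to the pattern length, independent of the length of the indexed text, which is why these automata are attractive for indexing.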
We also define a new quality measure for topic segmentation systems and design a discriminative topic segmentation algorithm for speech inputs, thus facilitating effective indexation of spoken audio collections. The new quality measure improves on previously used criteria and is correlated with human judgment of topic-coherence. Our segmentation algorithm uses a novel general topical similarity score based on word co-occurrence statistics. This new algorithm outperforms previous methods in experiments over speech and text streams. We further demonstrate that the performance of segmentation algorithms can be improved by using a lattice of competing hypotheses over the speech stream rather than just the one-best hypothesis as input.
This paper describes a new visual representation of motion that is used to learn and classify body language - what we call "body signatures" - of people while they are talking. We applied this technique to several hours of internet videos and television broadcasts that include US politicians and leaders from Germany, France, Iran, Russia, Pakistan, and India, and public figures such as the Pope, as well as numerous talk show hosts and comedians. Depending on the complexity of the task, we show up to 80% recognition performance and clustering into broader body language categories.
A Multi-Experiment Study (MES) is a type of computational study in which the same simulation software is executed multiple times, and the results of all executions need to be aggregated to obtain useful insight. As computational simulation experiments become increasingly accepted as part of the scientific process, the use of MESs is becoming more widespread among scientists and engineers.
MESs place several challenging requirements on the computing system. First, many MESs need constant user monitoring and feedback, requiring simultaneous steering of multiple executions of the simulation code. Second, MESs can comprise many executions of long-running simulations; the sheer volume of computation can make them prohibitively long to run.
Parallel architectures offer an attractive computing platform for MESs. Low-cost, small-scale desktops employing multi-core chips allow widespread dedicated local access to parallel computation power, offering more research groups an opportunity to achieve interactive MESs. Massively parallel, high-performance computing clusters can afford a level of parallelism never seen before, and present an opportunity to address the problem of computationally intensive MESs.
However, in order to fully leverage the benefits of parallel architectures, the traditional parallel systems' view has to be augmented. Existing parallel computing systems often treat each execution of the software as a black box, and are prevented from viewing an entire computational study as a single entity that must be optimized for.
This dissertation investigates how a parallel system can view MESs as an end-to-end system and leverage the application-specific properties of MESs to address its requirements. In particular, the system can 1) adapt its scheduling decisions to the overall goal of an MES to reduce the needed computation, 2) simultaneously aggregate results from, and disseminate user actions to, multiple executions of the software to enable simultaneous steering, 3) store reusable information across executions of the simulation software to reduce individual run-time, and 4) adapt its resource allocation policies to the MES's properties to improve resource utilization.
Using a test bed system called SimX and four example MESs across different disciplines, this dissertation shows that the application-aware MES-level approach can achieve multi-fold to multiple orders-of-magnitude improvements over the traditional simulation-level approach.
Traditionally, the verification effort is applied to the abstract algorithmic descriptions of the underlying software. However, even well-understood protocols such as Peterson's protocol for mutual exclusion, whose algorithmic description takes only half a page, have published implementations that are erroneous. Furthermore, the semantics of the implementations can be altered by optimizing compilers, which are very large applications and, consequently, are bound to have bugs. Thus, it is highly desirable to ensure the correctness of the compiled code, especially in safety-critical and high-assurance software. This dissertation describes two alternative approaches that bring us closer to solving the problem.
First, we present CoVaC - a deductive framework for proving program equivalence and its application to automatic verification of transformations performed by optimizing compilers. To leverage the existing program analysis techniques, we reduce the equivalence checking problem to analysis of one system - a cross-product of the two input programs. We show how the approach can be effectively used for checking equivalence of single-threaded programs that are structurally similar. Unlike the existing frameworks, our approach accommodates absence of compiler annotations and handles most of the classical intraprocedural optimizations such as constant folding, reassociation, common subexpression elimination, code motion, dead code elimination, branch optimizations, and others. In addition, we have developed rules for translation validation of interprocedural optimizations, which can be applied when compiler annotations are available.
The second contribution is the pancam framework for verifying multi-threaded C programs. Pancam first compiles a multi-threaded C program into an optimized bytecode format. The framework relies on Spin, an existing explicit-state model checker, to orchestrate the program's state space search. However, the program transitions and states are computed by the pancam bytecode interpreter. A feature of our approach is that pancam not only checks the actual implementation, but can also check the code after compiler optimizations. Pancam addresses the state space explosion problem by allowing users to define data abstraction functions and to constrain the number of allowed context switches. We also describe a partial order reduction method that reduces context switches using dynamic knowledge computed on-the-fly, while being sound for both safety and liveness properties.
This paper presents efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with $\epsilon$-transitions. It gives an algorithm for testing the exponential ambiguity of an automaton $A$ in time $O(|A|_E^2)$, and finite or polynomial ambiguity in time $O(|A|_E^3)$. These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and are based on a general algorithm for the composition or intersection of automata. We also give an algorithm, running in time $O(|A|_E^3)$, to determine the degree of polynomial ambiguity of a polynomially ambiguous finite automaton $A$. Finally, we present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton.
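To illustrate how intersection can expose ambiguity, here is a minimal sketch that answers only the simpler question of whether an automaton is ambiguous at all. It does not test finite, polynomial, or exponential ambiguity as the paper does. It assumes an $\epsilon$-free automaton with no duplicate transitions; the function name and input encoding are hypothetical. The idea is that two distinct accepting paths with the same label exist exactly when the self-intersection has a useful state pair (p, q) with p != q.

```python
from collections import deque


def is_ambiguous(arcs, initials, finals):
    """arcs: iterable of (src, label, dst); assumes no epsilon labels, no duplicate arcs."""
    out = {}
    for src, label, dst in arcs:
        out.setdefault(src, {}).setdefault(label, []).append(dst)

    # Forward pass: state pairs of the self-intersection reachable from initial pairs.
    forward = {(i, j) for i in initials for j in initials}
    successors = {}
    queue = deque(forward)
    while queue:
        p, q = queue.popleft()
        nexts = set()
        for label, dsts_p in out.get(p, {}).items():
            for dq in out.get(q, {}).get(label, []):
                for dp in dsts_p:
                    nexts.add((dp, dq))
        successors[(p, q)] = nexts
        for pair in nexts:
            if pair not in forward:
                forward.add(pair)
                queue.append(pair)

    # Backward pass: keep only pairs from which a pair of final states is reachable.
    predecessors = {}
    for pair, nexts in successors.items():
        for nxt in nexts:
            predecessors.setdefault(nxt, set()).add(pair)
    useful = {pair for pair in forward if pair[0] in finals and pair[1] in finals}
    queue = deque(useful)
    while queue:
        pair = queue.popleft()
        for prev in predecessors.get(pair, ()):
            if prev not in useful:
                useful.add(prev)
                queue.append(prev)

    # A useful off-diagonal pair witnesses two distinct accepting paths with one label.
    return any(p != q for p, q in useful)


# "aa" is accepted along 0->0->1 and along 0->1->1, so this automaton is ambiguous.
ambiguous_arcs = [(0, "a", 0), (0, "a", 1), (1, "a", 1)]
# Each string a b* has exactly one accepting path here.
unambiguous_arcs = [(0, "a", 1), (1, "b", 1)]
```

The product has at most quadratically many pairs, which is in the spirit of the intersection-based algorithms the abstract mentions.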
Microarray technology, in its simplest form, allows one to gather abundance data for target DNA molecules, associated with genomes or gene-expressions, and relies on hybridizing the target to many short probe oligonucleotides arrayed on a surface. While for such multiplexed reactions conditions are optimized to make the most of each individual probe-target interaction, subsequent analysis of these experiments is based on the implicit assumption that a given experiment gives the same result regardless of whether it was conducted in isolation or in parallel with many others. It has been discussed in the literature that this assumption is frequently false, and its validity depends on the types of probes and their interactions with each other. We present a detailed physical model of hybridization as a means of understanding probe interactions in a multiplexed reaction. The model is formulated as a system of ordinary differential equations (ODEs) describing kinetic mass action, with conservation-of-mass equations completing the system.
We examine pair-wise probe interactions in detail and present a model of "competition" between the probes for the target, especially when the target is in short supply. These effects are shown to be predictable from the affinity constants for each of the four probe sequences involved, namely, the match and mismatch for both probes. These affinity constants are calculated from thermodynamic parameters such as the free energy of hybridization, which are in turn computed according to the nearest neighbor (NN) model for each probe and target sequence.
Simulations based on the competitive hybridization model explain the observed variability in the signal of a given probe when measured in parallel with different groupings of other probes or individually. The results of the simulations are used for experiment design and pooling strategies, based on which probes have been shown to have a strong effect on each other's signal in the in silico experiment. These results are aimed at better design of multiplexed reactions on arrays used in genotyping (e.g., HLA typing, SNP or CNV detection, etc.) and mutation analysis (e.g., cystic fibrosis, cancer, autism, etc.).
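The competition effect can be illustrated with a much reduced mass-action sketch: two probes binding one shared target, integrated with forward Euler. This is not the model from the abstract, which involves match and mismatch sequences for both probes and affinity constants derived from the nearest neighbor model. All rate constants, concentrations, and names below are hypothetical.

```python
def simulate_competition(p1_total, p2_total, target_total,
                         k_on1, k_off1, k_on2, k_off2,
                         dt=1e-3, steps=20000):
    """Return the bound-complex concentrations (c1, c2) after steps * dt time units."""
    c1 = c2 = 0.0
    for _ in range(steps):
        # Conservation of mass: free target is whatever is not bound to either probe.
        free_target = target_total - c1 - c2
        # Mass action: association minus dissociation for each probe.
        dc1 = k_on1 * (p1_total - c1) * free_target - k_off1 * c1
        dc2 = k_on2 * (p2_total - c2) * free_target - k_off2 * c2
        c1 += dt * dc1
        c2 += dt * dc2
    return c1, c2


# Probe 1 measured in isolation (no probe 2 present), target in short supply.
alone, _ = simulate_competition(1.0, 0.0, 0.5, 1.0, 0.1, 1.0, 0.05)
# Probe 1 measured in parallel with a higher-affinity competitor.
shared, competitor = simulate_competition(1.0, 1.0, 0.5, 1.0, 0.1, 1.0, 0.05)
```

Under these made-up parameters the signal of probe 1 is lower in the multiplexed run than in isolation, which is the kind of variability the simulations are meant to explain.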
Traditional methods for supervised learning involve treating the input data as a set of independent, identically distributed samples. However, in many situations, the samples are related in such a way that variables associated with one sample depend on other samples. We present a new form of relational graphical model that, in addition to capturing the dependence of the output on sample-specific features, can also capture hidden relationships among samples through a non-parametric latent manifold. Learning in the proposed graphical model involves simultaneously learning the non-parametric latent manifold along with a non-relational parametric model. Efficient inference algorithms are introduced to accomplish this task. The method is applied to the prediction of house prices. A non-relational model predicts an ``intrinsic'' price of the house which depends only on its individual characteristics, and a relational model estimates a hidden surface of ``desirability'' coefficients which links the price of a house to that of similar houses in the neighborhood.
Transactional memory is a programming abstraction intended to simplify the synchronization of conflicting concurrent memory accesses without the difficulties associated with locks. In the first part of this thesis we provide a framework and tools that allow one to formally verify that a transactional memory implementation satisfies its specification. First we show how to specify transactional memory in terms of admissible interchanges of transaction operations, and give proof rules for showing that an implementation satisfies its specification. We illustrate how to verify correctness, first using a model checker for bounded instantiations, and subsequently by using a theorem prover, thus eliminating all bounds. We provide a mechanical proof of the soundness of the verification method, as well as mechanical proofs for several implementations from the literature, including one that supports non-transactional memory accesses.
Procedural programs with unbounded recursion present a challenge to symbolic model-checkers since they ostensibly require the checker to model an unbounded call stack. In the second part of this thesis we present a method for model-checking safety and liveness properties over procedural programs. Our method proceeds by first augmenting a concrete procedural program with a well-founded ranking function, and then abstracting the augmented program by a finitary state abstraction. Using procedure summarization, the procedural abstract program is then reduced to a finite-state system, which is model checked for the property.
It is well known that the use of points-to information can substantially improve the accuracy of a static program analysis. Commonly used algorithms for computing points-to information are known to be sound only for memory-safe programs. Thus, it appears problematic to utilize points-to information to verify the memory safety property without giving up soundness. We show that a sound combination is possible, even if the points-to information is computed separately and only conditionally sound. This result is based on a refined statement of the soundness conditions of points-to analyses and a general mechanism for composing conditionally sound analyses.
Overlapping Schwarz methods are extended to mixed finite element approximations of linear elasticity which use discontinuous pressure spaces. The coarse component of the preconditioner is based on a low-dimensional space previously developed for scalar elliptic problems and a domain decomposition method of iterative substructuring type, i.e., a method based on non-overlapping decompositions of the domain, while the local components of the preconditioner are based on solvers on a set of overlapping subdomains.
A bound is established for the condition number of the algorithm which grows in proportion to the square of the logarithm of the number of degrees of freedom in individual subdomains and the third power of the relative overlap between the overlapping subdomains, and which is independent of the Poisson ratio as well as jumps in the Lam\'e parameters across the interface between the subdomains. A positive definite reformulation of the discrete problem makes the use of the standard preconditioned conjugate gradient method straightforward. Numerical results, which include a comparison with problems of compressible elasticity, illustrate the findings.
Teaching a robot to perceive and navigate in an unstructured natural world is a difficult task. Without learning, navigation systems are short-range and extremely limited. With learning, the robot can be taught to classify terrain at longer distances, but these classifiers can be fragile as well, leading to extremely conservative planning. A robust, high-level learning-based perception system for a mobile robot needs to continually learn and adapt as it explores new environments. To do this, a strong feature representation is necessary that can encode meaningful, discriminative patterns as well as invariance to irrelevant transformations. A simple realtime classifier can then be trained on those features to predict the traversability of the current terrain.
One such method for learning a feature representation is discussed in detail in this work. Dimensionality reduction by learning an invariant mapping (DrLIM) is a weakly supervised method for learning a similarity measure over a domain. Given a set of training samples and their pairwise relationships, which can be arbitrarily defined, DrLIM can be used to learn a function that is invariant to complex transformations of the inputs such as shape distortion and rotation.
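The abstract does not state the training objective. A minimal sketch of the contrastive loss commonly associated with DrLIM is given below as an assumption: similar pairs are pulled together, and dissimilar pairs are pushed apart until they are at least a margin away. The function name, the margin default, and the convention that a flag marks dissimilar pairs are illustrative choices.

```python
import math


def contrastive_loss(z1, z2, dissimilar, margin=1.0):
    """Loss for one pair of mapped points z1, z2 (sequences of floats).

    Similar pairs are penalized by squared distance; dissimilar pairs are
    penalized only while they are closer than `margin`.
    """
    distance = math.dist(z1, z2)
    if dissimilar:
        return 0.5 * max(0.0, margin - distance) ** 2
    return 0.5 * distance ** 2


similar_far = contrastive_loss((0.0, 0.0), (3.0, 4.0), dissimilar=False)
dissimilar_far = contrastive_loss((0.0, 0.0), (3.0, 4.0), dissimilar=True)
dissimilar_close = contrastive_loss((0.0, 0.0), (0.5, 0.0), dissimilar=True)
```

Because the loss depends only on pairwise relationships, those relationships can be defined arbitrarily, as the text notes.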
The main contribution of this work is a self-supervised learning process for long-range vision that is able to accurately classify complex terrain, permitting improved strategic planning. As a mobile robot moves through offroad environments, it learns traversability from a stereo obstacle detector. The learning architecture is composed of a static feature extractor, trained offline for a general yet discriminative feature representation, and an adaptive online classifier. This architecture reduces the effect of concept drift by allowing the online classifier to quickly adapt to very few training samples without overtraining. After experiments with several different learned feature extractors, we conclude that unsupervised or weakly supervised learning methods are necessary for training general feature representations for natural scenes.
The process was developed and tested on the LAGR mobile robot as part of a fully autonomous vision-based navigation system.
Two-level overlapping Schwarz preconditioners are extended to a class of large, symmetric, indefinite systems of linear algebraic equations. The focus is on an enriched coarse space with additional basis functions built from free space solutions of the underlying partial differential equation. GMRES is used to accelerate the convergence of preconditioned systems. Both additive and hybrid Schwarz methods are considered, and results of extensive numerical experiments are reported.
We consider the problem of efficiently encoding a signal by transforming it to a new representation whose components are statistically independent (also known as factorial). A widely studied family of solutions, generally known as independent components analysis (ICA), exists for the case when the signal is generated as a linear transformation of independent non-Gaussian sources. Here, we examine a complementary case, in which the signal density is non-Gaussian but elliptically symmetric. In this case, no linear transform suffices to properly decompose the signal into independent components, and thus, the ICA methodology fails. We show that a simple nonlinear transformation, which we call radial Gaussianization (RG), provides an exact solution for this case. We then examine this methodology in the context of natural image statistics, demonstrating that joint statistics of spatially proximal coefficients in a multi-scale image representation are better described as elliptical than factorial. We quantify this by showing that reduction in dependency achieved by RG is far greater than that achieved by ICA, for local spatial neighborhoods. We also show that the RG transformation may be closely approximated by divisive normalization transformations that have been used to model the nonlinear response properties of visual neurons, and that have been shown to reduce dependencies between multi-scale image coefficients.
Automatic generation of correct software from requirements has long been a ``holy grail'' for system and software development. According to this vision, instead of implementing a system and then working hard to apply testing and verification methods to prove system correctness, a system is rather built correctly by construction. This problem, referred to as synthesis, is undecidable in the general case. However, by restricting the domain to decidable subsets, it is possible to bring this vision one step closer to reality.
The focus of our study is reactive systems, or non-terminating programs that continuously receive input from an external environment and produce output responses. Reactive systems are often safety critical and include applications such as anti-lock braking systems, auto-pilots, and pacemakers. One of the challenges of reactive system design is ensuring that the software meets the requirements under the assumption of unpredictable environment input. The behavior of many of these systems can be expressed as regular languages over infinite strings, a domain in which synthesis has yielded successful results.
We present a method for synthesizing executable reactive systems from formal requirements. The object-oriented requirements language of Live Sequence Charts (LSCs) is considered. We begin by establishing a mapping between various subsets of the language and finite-state formal models. We also consider LSCs which can express time constraints over a dense-time domain. From one of these models, we show how to formulate a winning strategy that is guaranteed to satisfy the requirements, provided one exists. The strategy is realized in the form of a controller which guides the system in choosing only non-violating behaviors. We describe an implementation of this work as an extension of an existing tool called the Play-Engine.
The unprecedented growth of the Internet over the past decade and of data collection, more generally, has given rise to vast quantities of digital information, ranging from web documents, images, and genomic databases to a vast array of business customer information. Consequently, it is of growing importance to develop tools and models that enable us to better understand this data and to design data-driven algorithms that leverage this information. This thesis provides several fundamental theoretical and algorithmic results for tackling such problems with applications to speech recognition, image processing, natural language processing, computational biology and web-based algorithms.
Probabilistic automata provide an efficient and compact way to model sequence-oriented data such as speech or web documents. Measuring the similarity of such automata provides a way of comparing the objects they model, and is an essential first step in organizing this type of data. We present algorithmic and hardness results for computing various discrepancies (or dissimilarities) between probabilistic automata, including the relative entropy and the $L_p$ distance; we also give an efficient algorithm to determine if two probabilistic automata are equivalent. In addition, we study the complexity of computing the norms of probabilistic automata.
Organizing and querying large amounts of digitized data such as images and videos is a challenging task because little or no label information is available. This motivates transduction, a setting in which the learning algorithm can leverage unlabeled data during training to improve performance. We present novel error bounds for a family of transductive regression algorithms and validate their usefulness through experiments.
Widespread success of search engines and information retrieval systems has led to large-scale collection of rating information, which is being used to provide personalized rankings. We examine an alternate formulation of the ranking problem for search engines motivated by the requirement that in addition to accurately predicting pairwise ordering, ranking systems must also preserve the magnitude of the preferences or the difference between ratings. We present algorithms with sound theoretical properties, and verify their efficacy through experiments.
Finally, price discovery in a market setting can be viewed as an (ongoing) learning problem. Specifically, the problem is to find and maintain a set of prices that balance supply and demand, a core topic in economics. This appears to involve complex implicit and possibly large-scale information transfers. We show that finding equilibrium prices, even approximately, in discrete markets is NP-hard and complement the hardness result with a matching polynomial time approximation algorithm. We also give a new way of measuring the quality of an approximation to equilibrium prices that is based on a natural aggregation of the dissatisfaction of individual market participants.
Modeling of high-quality surfaces is at the core of geometric modeling. Such models are used in many computer-aided design and computer graphics applications. Irregular behavior of higher-order differential parameters of the surface (e.g. curvature variation) may lead to aesthetic or physical imperfections. In this work, we consider approaches to constructing surfaces with a high degree of smoothness.
One direction is based on a manifold-based surface definition, which ensures well-defined high-order derivatives that can be explicitly computed at any point. We extend a previously proposed manifold-based construction to surfaces with piecewise-smooth boundary and explore trade-offs in some elements of the construction. We show that growth of derivative magnitudes with order is a general property of constructions with locally supported basis functions, derive a lower bound for derivative growth, and numerically study the flexibility of the resulting surfaces at arbitrary points.
An alternative to using high-order surfaces is to define approximations to high-order quantities for meshes, with the high-order surface left implicit. These approximations do not necessarily converge point-wise, but can nevertheless be successfully used to solve surface optimization problems. Even though fourth-order problems are commonly solved to obtain high-quality surfaces, in many cases these formulations may lead to reflection-line and curvature discontinuities. We consider two approaches to further increasing control over surface properties.
The first approach is to consider data-dependent functionals leading to fourth-order problems but with explicit control over desired surface properties. Our fourth-order functionals are based on reflection line behavior. Reflection lines are commonly used for surface interrogation and high-quality reflection line patterns are well-correlated with high-quality surface appearance. We demonstrate how these can be discretized and optimized accurately and efficiently on general meshes.
A more direct approach is to consider a poly-harmonic function on a mesh, such as the fourth-order biharmonic or the sixth-order triharmonic. The biharmonic and the triharmonic equations can be thought of as linearizations of the curvature and curvature variation Euler-Lagrange equations, respectively. We present a novel discretization for both problems based on the mixed finite element framework and a regularization technique for solving the resulting, highly ill-conditioned systems of equations. We show that this method, compared to more ad-hoc discretizations, has a higher degree of mesh independence and yields surfaces of better quality.
This paper describes an efficient reduction of the learning problem of ranking to binary classification. As with a recent result of Balcan et al. (2007), the reduction guarantees an average pairwise misranking regret of at most $2r$ using a binary classifier with regret $r$. However, our reduction applies to a broader class of ranking loss functions, admits a simpler proof, and the expected running time complexity of our algorithm in terms of number of calls to a classifier or preference function is improved from $\Omega(n^2)$ to $O(n \log n)$. Furthermore, when only the top $k$ ranked elements are required ($k \ll n$), as in many applications in information extraction or search engines, the time complexity of our algorithm can be further reduced to $O(k \log k + n)$. Our reduction and algorithm are thus practical for realistic applications where the number of points to rank exceeds several thousand. Many of our results also extend beyond the bipartite case previously studied.
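The abstract does not name the ranking procedure. One natural way to obtain an expected $O(n \log n)$ number of calls to a pairwise preference function is a randomized QuickSort-style scheme, sketched below as an assumption rather than as the paper's algorithm. The function names and the toy preference function are hypothetical.

```python
import random


def rank_by_preference(items, prefers, rng=None):
    """Order items using only calls to prefers(u, v), which is True if u should precede v."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    items = list(items)
    if len(items) <= 1:
        return items
    # Pick a random pivot and compare every other item against it exactly once.
    pivot = items.pop(rng.randrange(len(items)))
    ahead, behind = [], []
    for item in items:
        (ahead if prefers(item, pivot) else behind).append(item)
    return (rank_by_preference(ahead, prefers, rng)
            + [pivot]
            + rank_by_preference(behind, prefers, rng))


# Toy preference function backed by made-up scores; a learned binary
# classifier would take its place in the setting of the abstract.
scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7}
calls = [0]


def prefers(u, v):
    calls[0] += 1
    return scores[u] > scores[v]


ranking = rank_by_preference(scores, prefers)
```

The preference function need not be transitive for the procedure to terminate, since each item is compared with a given pivot only once.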
Composition of weighted transducers is a fundamental algorithm used in many applications, including computing complex edit-distances between automata or string kernels in machine learning, and combining different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, \emph{$n$-way composition}, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The expected worst-case complexity of our algorithm for composing three transducers $T_1$, $T_2$, and $T_3$ is $O(\min(|T_1|_E|T_2|_Q|T_3|_E, |T_1|_Q|T_2|_E|T_3|_Q) + |T|)$, where $T$ is the result of that composition and $|T_i| = |T_i|_Q + |T_i|_E$ with $|T_i|_Q$ the number of states and $|T_i|_E$ the number of transitions of $T_i$, $i = 1, 2, 3$. In many cases, this significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned.
In this thesis, we present design techniques -- and systems that illustrate and validate these techniques -- for building data-intensive applications over the Internet. We enable the use of a traditional bandwidth-limited server in these applications. A large number of cooperating users contribute resources such as disk space and network bandwidth, and form the backbone of such applications. The applications we consider fall in one of two categories. The first type provides user-perceived utility in proportion to the data download rates of the participants; bulk data distribution systems are a typical example. The second type is usable only when the participants have data download rates above a certain threshold; video streaming is a prime example.
We built Shark, a distributed file system, to address the first type of applications. It is designed for large-scale, wide-area deployment, while also providing a drop-in replacement for local-area file systems. Shark introduces a novel locality-aware cooperative-caching mechanism, in which clients exploit each other's file caches to reduce load on an origin file server. Shark also enables sharing of data even when it originates from different servers. In addition, Shark clients are mutually distrustful in order to operate in the wide-area. Performance results show that Shark greatly reduces server load and reduces client-perceived latency for read-heavy workloads both in the wide and local areas.
We built RedCarpet, a near-Video-on-Demand (nVoD) system, to address the second type of applications. nVoD allows a user to watch a video starting at any point after waiting for a small setup time. RedCarpet uses a mesh-based peer-to-peer (P2P) system to provide the nVoD service. In this context, we study the problem of scheduling the dissemination of chunks that constitute a video. We show that providing nVoD is feasible with a combination of techniques that include network coding, avoiding resource starvation for different chunks, and overlay topology management algorithms. Our evaluation, using a simulator as well as a prototype, shows that systems that do not optimize in all these dimensions could deliver significantly worse nVoD performance.
The goal of shape analysis is to analyze properties of programs that perform destructive updates of linked structures (heaps). This thesis presents an approach for shape analysis based on program augmentation (instrumentation), predicate abstraction, and model checking, that allows for verification of safety and liveness properties (which, for sequential programs, usually correspond to program invariance and termination).
One of the difficulties in abstracting heap-manipulating programs is devising a decision procedure for a sufficiently expressive logic of graph properties. Since graph reachability (expressible by transitive closure) is not a first order property, the challenge is in showing that a decision procedure exists for a rich enough subset of first order logic with transitive closure.
Predicate abstraction is in general too weak to verify liveness properties. Thus an additional issue dealt with is how to perform abstraction while retaining enough information. The method presented here is domain-neutral, and applies to concurrent programs as well as sequential ones.
We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete.
The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrary to what is suggested by a number of publications, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian kernels.
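As a point of reference for the quantities discussed above, the following is a minimal sketch of the relative entropy and its symmetrized form for two finite distributions given as dictionaries. It is not the automata algorithm of the paper: the function names and the example distributions `p` and `q` are assumptions made for illustration only.

```python
import math

def relative_entropy(p, q):
    """D(p || q) = sum over x of p(x) * log(p(x) / q(x)), in nats."""
    total = 0.0
    for x, px in p.items():
        if px == 0.0:
            continue  # a term with p(x) = 0 contributes nothing
        qx = q.get(x, 0.0)
        if qx == 0.0:
            return math.inf  # p puts mass where q has none
        total += px * math.log(px / qx)
    return total

def symmetrized_relative_entropy(p, q):
    """Sum of the two directed relative entropies."""
    return relative_entropy(p, q) + relative_entropy(q, p)

# Hypothetical example distributions over a two-symbol alphabet.
p = {"a": 0.5, "b": 0.5}
q = {"a": 0.9, "b": 0.1}
```

The relative entropy itself is not symmetric, so `relative_entropy(p, q)` and `relative_entropy(q, p)` differ here; the symmetrized form removes that asymmetry, but symmetry alone does not make it a valid kernel, which is the point made above.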
Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the $L_p$-norm of a probabilistic automaton.
This paper studies the learning problem of ranking when one wishes not just to accurately predict pairwise ordering but also preserve the magnitude of the preferences or the difference between ratings, a problem motivated by its crucial importance in the design of search engines, movie recommendation, and other similar ranking systems. We describe and analyze several algorithms for this problem and give stability bounds for their generalization error, extending previously known stability results to non-bipartite ranking and magnitude of preference-preserving algorithms. We also report the results of experiments comparing these algorithms on several datasets and contrast these results with those obtained using an AUC-maximization algorithm.
In the theory of domain decomposition methods, it is often assumed that each subdomain is the union of a small set of coarse triangles or tetrahedra. In this study, extensions to the existing theory which accommodate subdomains with much less regular shape are presented; the subdomains are only required to be John domains. Attention is focused on overlapping Schwarz preconditioners for problems in two dimensions with a coarse space component of the preconditioner which allows for good results even for coefficients which vary considerably. It is shown that the condition number of the domain decomposition method is bounded by C(1 + H/δ)(1 + log(H/h))^2, where the constant C is independent of the number of subdomains and possible jumps in coefficients between subdomains. Numerical examples are provided which confirm the theory and demonstrate very good performance of the method for a variety of subregions including those obtained when a mesh partitioner is used for the domain decomposition.
In order to reach their large audiences, today's Internet publishers primarily use content distribution networks (CDNs) to deliver content. Yet the architectures of the prevalent commercial systems are tightly bound to centralized control, static deployments, and trusted infrastructure, inherently limiting their scope and scale to ensure cost recovery.
To move beyond such shortcomings, this thesis contributes a number of techniques that realize cooperative content distribution. By federating large numbers of unreliable or untrusted hosts, we can satisfy the demand for content by leveraging all available resources. We propose novel algorithms and architectures for three central mechanisms of CDNs: content discovery (where are nearby copies of the client's desired resource?), server selection (which node should a client use?), and secure content transmission (how should a client download content efficiently and securely from its multiple potential sources?).
These mechanisms have been implemented, deployed, and tested in production systems that have provided open content distribution services for more than three years. Every day, these systems answer tens of millions of client requests, serving terabytes of data to more than a million people.
This thesis presents five systems related to content distribution. First, Coral provides a distributed key-value index that enables content lookups to occur efficiently and returns references to nearby cached objects whenever possible, while still preventing any load imbalances from forming. Second, CoralCDN demonstrates how to construct a self-organizing CDN for web content out of unreliable nodes, providing robust behavior in the face of failures. Third, OASIS provides a general-purpose, flexible anycast infrastructure, with which clients can locate nearby or unloaded instances of participating distributed systems. Fourth, as a more clean-slate design that can leverage untrusted participants, Shark offers a distributed file system that supports secure block-based file discovery and distribution. Finally, our authentication code protocol enables the integrity verification of large files on-the-fly when using erasure codes for efficient data dissemination.
Taken together, this thesis provides a novel set of tools for building highly-scalable, efficient, and secure content distribution systems. By enabling the automated replication of data based on its popularity, we can make desired content available and accessible to everybody and, in effect, democratize content distribution.
The implementation of real-world type checkers requires a non-trivial engineering effort. The resulting code easily comprises thousands of lines, which increases the probability of software defects in a component critical to compiler correctness. To make type checkers easier to implement and extend, this paper presents Typical, a domain-specific language and compiler that directly and concisely captures the structure of type systems. Our language builds on the functional core for ML to represent syntax trees and types as variants and to traverse them with pattern matches. It then adds declarative constructs for common type checker concerns, such as scoping rules, namespaces, and constraints on types. It also integrates error checking and reporting with other constructs to promote comprehensive error management. We have validated our system with two real-world type checkers written in Typical, one for Typical itself and the other for C.
Grammars for many parser generators not only specify a language's syntax but also the corresponding syntax tree. Unfortunately, most parser generators pick a somewhat arbitrary combination of features from the design space for syntax trees and thus lock in specific trade-offs between expressivity, safety, and performance. This paper discusses the three major axes of the design space---specification within or outside a grammar, concrete or abstract syntax trees, and dynamically or statically typed trees---and their impact. It then presents algorithms for automatically realizing all major choices from the same, unmodified grammar with inline syntax tree declarations. In particular, this paper shows how to automatically (1) extract a separate syntax tree specification, (2) embed an abstract syntax tree within a concrete one, and (3) infer a strongly typed view on a dynamically typed tree. All techniques are implemented in the Rats! parser generator and have been applied to real-world C and Java grammars and their syntax trees.
The traditional natural language processing pipeline incorporates multiple stages of linguistic analysis. Although errors are typically compounded through the pipeline, it is possible to reduce the errors in one stage by harnessing the results of the other stages.
This thesis presents a new framework based on component interactions to approach this goal. The new framework applies all stages in a suitable order, with each stage generating multiple hypotheses and propagating them through the whole pipeline. Then the feedback from subsequent stages is used to enhance the target stage by re-ranking these hypotheses, and then produce the best analysis.
The effectiveness of this framework has been demonstrated by substantially improving the performance of Chinese and English entity extraction and Chinese-to-English entity translation. The inference knowledge includes mono-lingual interactions among information extraction stages such as name tagging, coreference resolution, relation extraction and event extraction, as well as cross-lingual interaction between information extraction and machine translation.
Such symbiosis of analysis components allows us to incorporate information from a much wider context, spanning the entire document and even going across documents, and utilize deeper semantic analysis; it will therefore be essential for the creation of a high-performance NLP pipeline.
In the theory for domain decomposition algorithms of the iterative substructuring family, each subdomain is typically assumed to be the union of a few coarse triangles or tetrahedra. This is an unrealistic assumption, in particular, if the subdomains result from the use of a mesh partitioner in which case they might not even have uniformly Lipschitz continuous boundaries.
The purpose of this study is to derive bounds for the condition number of these preconditioned conjugate gradient methods which depend only on a parameter in an isoperimetric inequality and two geometric parameters characterizing John and uniform domains. A related purpose is to explore to what extent well known technical tools previously developed for quite regular subdomains can be extended to much more irregular subdomains.
Some of these results are valid for any John domains, while an extension theorem, which is needed in this study, requires that the subdomains are uniform. The results, so far, are only complete for problems in two dimensions. Details are worked out for a FETI--DP algorithm and numerical results support the findings. Some of the numerical experiments illustrate that care must be taken when selecting the scaling of the preconditioners in the case of irregular subdomains.
While authentication within organizations is a well-understood problem, traditional solutions are often inadequate at the scale of the Internet, where the lack of a central authority, the open nature of the systems, and issues such as privacy and anonymity create new challenges. For example, users typically establish dozens of web accounts with independently administered services under a single password, which increases the likelihood of exposure of their credentials; users wish to receive email from anyone who is not a spammer, but the openness of the email infrastructure makes it hard to authenticate legitimate senders; users may have a rightful expectation of privacy when viewing widely-accessed protected resources such as premium website content, yet they are commonly required to present identifying login credentials, which permits tracking of their access patterns.
This dissertation describes enhanced authentication mechanisms to tackle the challenges of each of the above settings. Specifically, the dissertation develops: 1) a remote authentication architecture that lets users recover easily in case of password compromise; 2) a social network-based email system in which users can authenticate themselves as trusted senders without disclosing all their social contacts; and 3) a group access-control scheme where requests can be monitored while affording a degree of anonymity to the group member performing the request.
The proposed constructions combine system designs and novel cryptographic techniques to address their respective security and privacy requirements both effectively and efficiently.
Cryptographic primitives, such as hash functions and block ciphers, are integral components in several practical cryptographic schemes. In order to prove security of these schemes, a variety of security assumptions are made on the underlying hash function or block cipher, such as collision-resistance, pseudorandomness etc. In fact, such assumptions are often made without much regard for the actual constructions of these primitives. In this thesis, we address this problem and suggest new, and possibly better, design criteria for hash functions and block ciphers.
We start by analyzing the design criteria underlying hash functions. The usual design principle here involves a two-step procedure: First, come up with a heuristically-designed and ``hopefully strong'' fixed-length input construction (i.e. the compression function), then use a standard domain extension technique, usually the cascade construction, to get a construction that works for variable-length inputs. We investigate this design principle from two perspectives:
We next move on to discuss the Feistel network, which is used in the design of several popular block ciphers such as DES, Triple-DES, Blowfish etc. Currently, the celebrated result of Luby-Rackoff (and further extensions) is regarded as the theoretical basis for using this construction in block cipher design, where it was shown that a four-round Feistel network is a (strong) pseudorandom permutation (PRP) if the round functions are independent pseudorandom functions (PRFs). We study the Feistel network from two different perspectives:
We give a positive answer to the first question and a partial positive answer to the second question. In the process, we undertake a combinatorial study of the Feistel network, that might be useful in other scenarios as well. We provide several practical applications of our results for the Feistel network.
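To make the object of study concrete, the following is a minimal sketch of a four-round Feistel network. The round function here is a toy stand-in (a truncated SHA-256 of the round key and the half-block), chosen as an assumption for illustration only; it is not claimed to be a pseudorandom function in the sense required by the Luby-Rackoff result, and the function names are hypothetical.

```python
import hashlib

def round_function(key, half):
    """Toy keyed round function (an assumption): truncated SHA-256 of key || half."""
    return hashlib.sha256(key + half).digest()[: len(half)]

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def feistel_encrypt(block, round_keys):
    """Apply one Feistel round per key: (L, R) -> (R, L xor F(R))."""
    if len(block) % 2 != 0 or len(block) > 64:
        raise ValueError("block must have even length of at most 64 bytes")
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        left, right = right, xor_bytes(left, round_function(key, right))
    return left + right

def feistel_decrypt(block, round_keys):
    """Undo the rounds in reverse order; the round function is never inverted."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in reversed(round_keys):
        left, right = xor_bytes(right, round_function(key, left)), left
    return left + right

# Four independent (hypothetical) round keys, matching the four-round setting.
keys = [b"round-key-1", b"round-key-2", b"round-key-3", b"round-key-4"]
message = b"16-byte message!"
ciphertext = feistel_encrypt(message, keys)
recovered = feistel_decrypt(ciphertext, keys)
```

The sketch shows the structural property that makes the construction attractive for block cipher design: the network is a permutation regardless of the round function, because decryption only re-evaluates the round function and never needs its inverse.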
Bayesian estimators are commonly constructed using an explicit prior model. In many applications, one does not have such a model, and it is difficult to learn since one does not have access to uncorrupted measurements of the variable being estimated. In many cases however, including the case of contamination with additive Gaussian noise, the Bayesian least squares estimator can be formulated directly in terms of the distribution of noisy measurements. We demonstrate the use of this formulation in removing noise from photographic images. We use a local approximation of the noisy measurement distribution by exponentials over adaptively chosen intervals, and derive an estimator from this approximate distribution. We demonstrate through simulations that this adaptive Bayesian estimator performs as well as or better than previously published estimators based on simple prior models.
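For additive Gaussian noise of variance sigma^2, a well-known identity expresses the least squares estimate using only the density of the noisy measurement y: E[x | y] = y + sigma^2 * d/dy log p_Y(y). The abstract does not spell out its formula, so the sketch below assumes this identity is the relevant one. It checks the identity on a case with a known answer (a Gaussian prior, for which the estimator is the classical linear shrinkage); the function names and numeric values are hypothetical, and the adaptive exponential approximation of the paper is not reproduced.

```python
import math

def log_gaussian_pdf(y, variance):
    """Log density of a zero-mean Gaussian with the given variance."""
    return -0.5 * math.log(2.0 * math.pi * variance) - y * y / (2.0 * variance)

def bls_estimate(y, log_py, noise_var, step=1e-5):
    """Estimate y + noise_var * d/dy log p_Y(y), using a central difference."""
    score = (log_py(y + step) - log_py(y - step)) / (2.0 * step)
    return y + noise_var * score

# Assumed toy setting: x ~ N(0, prior_var), y = x + noise, noise ~ N(0, noise_var).
# The noisy measurement is then distributed as N(0, prior_var + noise_var).
prior_var, noise_var = 4.0, 1.0

def log_py(y):
    return log_gaussian_pdf(y, prior_var + noise_var)

y_observed = 2.5
estimate = bls_estimate(y_observed, log_py, noise_var)
# Closed-form answer for the Gaussian prior, used only as a cross-check.
linear_shrinkage = y_observed * prior_var / (prior_var + noise_var)
```

Note that `bls_estimate` never touches the prior: it only needs the log density of the noisy measurements, which is the quantity that can be approximated from corrupted data.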
In this paper we describe a new technique for the characterisation of populations of DNA strands. Such tools are vital to the study of ecological systems, at both the micro (e.g., individual humans) and macro (e.g., lakes) scales. Existing methods make extensive use of DNA sequencing and cloning, which can prove costly and time consuming. The overall objective is to address questions such as: (i) (Genome detection) Is a known genome sequence present at least in part in an environmental sample? (ii) (Sequence query) Is a specific fragment sequence present in a sample? (iii) (Similarity Discovery) How similar in terms of sequence content are two unsequenced samples?
We propose a method involving multiple filtering criteria that result in ``pools'' of DNA of high or very high purity. Because our method is similar in spirit to hashing in computer science, we call the method {\it DNA hash pooling}. To illustrate this method, we describe examples using pairs of restriction enzymes. The {\it in silico} empirical results we present reflect a sensitivity to experimental error. The method requires minimal DNA sequencing and, when sequencing is required, little or no cloning.
This thesis proposes a novel approach for exploring Information Extraction scenarios. Information Extraction, or IE, is a task aiming at finding events and relations in natural language texts that meet a user's demand. However, it is often difficult to formulate, or even define such events that satisfy both a user's need and technical feasibility. Furthermore, most existing IE systems need to be tuned for a new scenario with proper training data in advance. So a system designer usually needs to understand what a user wants to know in order to maximize the system performance, while the user has to understand how the system will perform in order to maximize his/her satisfaction.
In this thesis, we focus on maximizing the variety of scenarios that the system can handle instead of trying to improve the accuracy of a particular scenario. In traditional IE systems, a relation is defined a priori by a user and is identified by a set of patterns that are manually crafted or acquired in advance. We propose a technique called Unrestricted Relation Discovery, which defers determining what is a relation and what is not until the very end of the processing so that a relation can be defined a posteriori. This laziness gives huge flexibility to the types of relations the system can handle. Furthermore, we use the notion of recurrent relations to measure how useful each relation is. This way, we can discover new IE scenarios without fully specifying definitions or patterns, which leads to Preemptive Information Extraction, where a system can provide a user a portfolio of extractable relations and let the user choose them.
We used one year of news articles obtained from the Web as a development set. We discovered dozens of scenarios that are similar to the existing scenarios tried by many IE systems, as well as new scenarios that are relatively novel. We have evaluated the existing scenarios with the Automatic Content Extraction (ACE) event corpus and obtained reasonable performance. We believe this system will shed new light on IE research by giving various experimental IE scenarios.
We present an approach to constituent parsing, which is driven by classifiers induced to minimize a single regularized objective. It is the first discriminatively-trained constituent parser to surpass the Collins (2003) parser without using a generative model. Our primary contribution is simplifying the human effort required for feature engineering. Our model can incorporate arbitrary features of the input and parse state. Feature selection and feature construction occur automatically, as part of learning. We define a set of fine-grained atomic features, and let the learner induce informative compound features. Our learning approach includes several novel approximations and optimizations which improve the efficiency of discriminative training. We introduce greedy completion, a new agenda-driven search strategy designed to find low-cost solutions given a limit on search effort. The inference evaluation function was learned accurately enough to guide the deterministic parsers to the optimal parse reasonably quickly without pruning, and thus without search errors. Experiments demonstrate the flexibility of our approach, which has also been applied to machine translation (Wellington et al., AMTA 2006; Turian et al., NIPS 2007).
Modeling security for protocols running in the complex network environment of the Internet can be a daunting task. Ideally, a security model for the Internet should provide the following guarantee: a protocol that "securely" implements a particular task specification will retain all the same security properties as the specification itself, even when an arbitrary set of protocols runs concurrently on the same network. This guarantee must hold even when other protocols are maliciously designed to interact badly with the analyzed protocol, and even when the analyzed protocol is composed with other protocols. The popular Universal Composability (UC) security framework aims to provide this guarantee.
Unfortunately, such strong security guarantees come with a price: they are impossible to achieve without the use of some trusted setup. Typically, this trusted setup is global in nature, and takes the form of a Public Key Infrastructure (PKI) and/or a Common Reference String (CRS). However, the current approach to modeling security in the presence of such setups falls short of providing expected security guarantees. A quintessential example of this phenomenon is the deniability concern: there exist natural protocols that meet the strongest known security notions (including UC) while failing to provide the same deniability guarantees that their task specifications imply they should provide.
We introduce the Generalized Universal Composability (GUC) framework to extend the UC security notion and enable the re-establishment of its original intuitive security guarantees even for protocols that use global trusted setups. In particular, GUC enables us to guarantee that secure protocols will provide the same level of deniability as the task specification they implement. To demonstrate the usefulness of the GUC framework, we first apply it to the analysis and construction of deniable authentication protocols. Building upon such deniable authentication protocols, we then prove a general feasibility result showing how to construct protocols satisfying our security notion for a large class of two-party and multi-party tasks (assuming the availability of some reasonable trusted setup). Finally, we highlight the practical applicability of GUC by constructing efficient protocols that securely instantiate two common cryptographic tasks: commitments and zero-knowledge proofs.
Statistical machine translation (SMT) systems use empirical models to simulate the act of human translation between language pairs. This dissertation surveys the ability of currently popular syntax-aware SMT systems to model real-world multitext, and shows different types of linguistic phenomena occurring in natural language translation that these popular systems cannot capture. It then proposes a new grammar formalism, Generalized Multitext Grammar (GMTG), and a generalization of Chomsky Normal Form, that allows us to build an efficient SMT system using previously developed parsing techniques. The dissertation addresses many software engineering issues that arise when doing syntax-based SMT using large corpora and lays out an object-oriented design for a translation toolkit. Using the toolkit, we show that a tree-transduction based SMT system, which uses modern machine learning algorithms, outperforms a generative baseline.
One of the main challenges of formal verification is the ability to handle systems of realistic size, a problem that is especially acute in the context of software verification. In this dissertation, we suggest two related approaches that, while both relying on formal methods techniques, can still be applied to larger practical systems. Scalability is achieved mainly by restricting the types of properties we consider and the guarantees that are given.
Our first approach is a novel run-time monitoring framework. Unlike previous work on this topic, we expect the properties to be specified using Property Specification Language (PSL). PSL is a newly adopted IEEE P1850 standard and is an extension of Linear Temporal Logic (LTL). The new features include regular expressions and finite trace semantics, which make the new logic very attractive for run-time monitoring of both software and hardware designs. To facilitate the new logic we have extended the existing algorithm for LTL tester construction to cover the PSL specific operators. Another novelty of our approach is the ability to use partial information about the program that is being monitored while the existing tools only use the information about the observed trace and the property under consideration. This allows going beyond the focus of traditional run-time monitoring tools -- error detection in the execution trace, towards the focus of static analysis -- bug detection in programs.
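To illustrate what finite trace semantics means for a monitor, the following is a minimal sketch that evaluates LTL-style formulas over a completed finite trace. It is not the tester construction described above and it does not cover the PSL-specific operators such as regular expressions. The tuple encoding of formulas, the operator names, and the choice of a strong "next" (false at the last position) are all assumptions made for this example; PSL itself distinguishes weak and strong variants.

```python
def holds(formula, trace, i=0):
    """Evaluate a formula at position i of a finite trace.

    trace is a list of sets of atomic propositions, one set per step.
    formula is a nested tuple such as ("always", ("ap", "ok")).
    """
    op = formula[0]
    n = len(trace)
    if op == "ap":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "implies":
        return (not holds(formula[1], trace, i)) or holds(formula[2], trace, i)
    if op == "next":
        # Strong next: requires that a next position exists.
        return i + 1 < n and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, k) for k in range(i, n))
    if op == "always":
        return all(holds(formula[1], trace, k) for k in range(i, n))
    if op == "until":
        # Strong until: the right operand must hold before the trace ends.
        for k in range(i, n):
            if holds(formula[2], trace, k):
                return True
            if not holds(formula[1], trace, k):
                return False
        return False
    raise ValueError("unknown operator: %r" % (op,))

# Hypothetical property: every request is eventually acknowledged.
response = ("always", ("implies", ("ap", "req"), ("eventually", ("ap", "ack"))))
good_trace = [{"req"}, set(), {"ack"}]
bad_trace = [{"req"}, set()]
```

On `bad_trace` the property fails only because the trace has ended without an acknowledgement, which is exactly the kind of verdict that infinite-trace LTL semantics cannot express directly.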
In our second approach, we employ static analysis to compute SAT-based function summaries to detect invalid pointer accesses. To compute function summaries, we propose new techniques for improving the precision and performance in order to reduce the false error rates. In particular, we use BDDs to represent a symbolic simulation of functions, where BDDs allow an efficient representation of path-sensitive information and high level simplification. In addition, we use a lightweight range analysis technique for determining lower and upper bounds for program variables, which can further offload work from the SAT solver. Note that while in our current implementation the analysis happens at compile time, we can also use the function summaries as a basis for run-time monitoring.
Many techniques have been introduced in the last few decades to create ε-free automata representing regular expressions: Glushkov automata, the so-called follow automata, and Antimirov automata. This paper presents a simple and unified view of all these ε-free automata both in the case of unweighted and weighted regular expressions. It describes simple and general algorithms with running time complexities at least as good as that of the best previously known techniques, and provides concise proofs. The construction methods are all based on two standard automata algorithms: epsilon-removal and minimization. This contrasts with the multitude of complicated and special-purpose techniques and proofs put forward by others to construct these automata. Our analysis provides a better understanding of ε-free automata representing regular expressions: they are all the results of the application of some combinations of epsilon-removal and minimization to the classical Thompson automata. This makes it straightforward to generalize these algorithms to the weighted case, which also results in much simpler algorithms than existing ones. For weighted regular expressions over a closed semiring, we extend the notion of follow automata to the weighted case. We also present the first algorithm to compute the Antimirov automata in the weighted case.
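Since epsilon-removal is one of the two building blocks named above, the following is a minimal sketch of it for the unweighted case only. The representation (a set of `(source, label, destination)` triples with `None` as the epsilon label), the function names, and the small example automaton are assumptions made for illustration; the weighted generalization over a semiring and the minimization step are not shown.

```python
def epsilon_closure(state, eps_edges):
    """All states reachable from `state` using epsilon transitions only."""
    seen, stack = {state}, [state]
    while stack:
        current = stack.pop()
        for target in eps_edges.get(current, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def remove_epsilons(states, transitions, finals):
    """Return (transitions, finals) of an equivalent epsilon-free automaton.

    transitions is a set of (source, label, destination); label None is epsilon.
    """
    eps_edges = {}
    for src, label, dst in transitions:
        if label is None:
            eps_edges.setdefault(src, set()).add(dst)
    new_transitions, new_finals = set(), set()
    for p in states:
        for q in epsilon_closure(p, eps_edges):
            if q in finals:
                new_finals.add(p)  # p reaches a final state for free
            for src, label, dst in transitions:
                if src == q and label is not None:
                    new_transitions.add((p, label, dst))
    return new_transitions, new_finals

def accepts(transitions, finals, start, word):
    """Run an epsilon-free automaton on a word."""
    current = {start}
    for symbol in word:
        current = {dst for src, label, dst in transitions
                   if src in current and label == symbol}
    return bool(current & finals)

# Hypothetical automaton for a*b: state 1 returns to state 0 by an epsilon move.
states = {0, 1, 2}
transitions = {(0, "a", 1), (1, None, 0), (0, "b", 2)}
finals = {2}
free_transitions, free_finals = remove_epsilons(states, transitions, finals)
```

The result keeps the original state set and only rewires transitions, so unreachable states may remain; a subsequent trimming or minimization pass would remove them.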
We define the class of single-parent heap systems, which rely on a singly-linked heap in order to model destructive updates on tree structures. This encoding has the advantage of relying on a relatively simple theory of linked lists in order to support abstraction computation. To facilitate the application of this encoding, we provide a program transformation that, given a program operating on a multi-linked heap without sharing, transforms it into one over a single-parent heap. It is then possible to apply shape analysis by predicate and ranking abstraction as in [BPZ05]. The technique has been successfully applied on examples with trees of fixed arity (balancing of and insertion into a binary sort tree).
The method of ``Invisible Invariants'' has been applied successfully to protocols that assume a ``symmetric'' underlying topology, be it cliques, stars, or rings. In this paper we show how the method can be applied to proving safety properties of distributed protocols running under arbitrary topologies. Many safety properties of such protocols have reachability predicates, which, at first glance, are beyond the scope of the Invisible Invariants method. To overcome this difficulty, we present a technique, called ``coloring,'' that allows, in many instances, replacing the second order reachability predicates by first order predicates, resulting in properties that are amenable to Invisible Invariants, where ``reachable'' is replaced by ``colored.'' We demonstrate our techniques on several distributed protocols, including a variant on Luby's Maximal Independent Set protocol, the Leader Election protocol used in the IEEE 1394 (Firewire) distributed bus protocol, and various distributed spanning tree algorithms. All examples have been tested using the symbolic model checker TLV.
In many modern large-scale learning applications, the amount of unlabeled data far exceeds that of labeled data. A common instance of this problem is the 'transductive' setting where the unlabeled test points are known to the learning algorithm. This paper presents a study of regression problems in that setting. It presents 'explicit' VC-dimension error bounds for transductive regression that hold for all bounded loss functions and coincide with the tight classification bounds of Vapnik when applied to classification. It also presents a new transductive regression algorithm inspired by our bound that admits a primal and kernelized closed-form solution and deals efficiently with large amounts of unlabeled data. The algorithm exploits the position of unlabeled points to locally estimate their labels and then uses a global optimization to ensure robust predictions. Our study also includes the results of experiments with several publicly available regression data sets with up to 20,000 unlabeled examples. The comparison with other transductive regression algorithms shows that it performs well and that it can scale to large data sets.
Numerical non-robustness is a well-known phenomenon when implementing geometric algorithms. A general approach to achieve geometric robustness is Exact Geometric Computation (EGC). This dissertation explores the redesign and extension of Core Library, a C++ library which embraces the EGC approach. The contributions of this thesis are organized into three parts.
In the first part, we discuss the redesign of Core Library, especially the expression "Expr" and bigfloat "BigFloat" classes. Our new design emphasizes extensibility in a clean and modular way. The three facilities in "Expr", filter, root bound and bigfloat, are separated into independent modules. This allows new filters, root bounds and bigfloat substitutes to be plugged in. The key approximate evaluation and precision propagation algorithms have been greatly improved. A new bigfloat system based on MPFR and interval arithmetic has been incorporated. Our benchmark shows that the redesigned Core Library typically achieves a 5-10 times speedup. We also provide tools to facilitate extensions of "Expr" to incorporate new types of nodes, especially transcendental nodes.
Although the Core Library was originally designed for algebraic applications, transcendental functions are needed in many applications. In the second part, we present a complete algorithm for absolute approximation of the general hypergeometric functions. Its complexity is also given. The extension of this algorithm to ``blackbox number'' is provided. A general hypergeometric function package based on our algorithm is implemented and integrated into the Core Library based on our new design.
Brent has shown that many elementary functions, such as $\exp, \log, \sin$, etc., can be efficiently computed using the Arithmetic-Geometric Mean (AGM) based algorithm. However, he only gave an asymptotic error analysis. The constants in the Big $O(\cdot)$ notation required for implementation are unknown. We provide a non-asymptotic error analysis of the AGM algorithm and the related algorithms for logarithm and exponential functions. These algorithms have been implemented and incorporated into the Core Library.
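For readers unfamiliar with the AGM iteration, the following is a minimal double-precision sketch of the arithmetic-geometric mean and of a textbook AGM identity for the logarithm, ln x ≈ π / (2 · AGM(1, 4/s)) − m ln 2 with s = x · 2^m large. It is not the Core Library implementation and carries none of the non-asymptotic error bounds described above: the function names, the stopping tolerance, and the threshold s ≥ 2^30 are assumptions chosen for ordinary floating-point arithmetic, and ln 2 is taken from the standard library rather than computed.

```python
import math

def agm(a, b, max_iter=100):
    """Arithmetic-geometric mean of two positive floats."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * max(a, b):
            break
        # Replace the pair by its arithmetic and geometric means.
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def agm_log(x):
    """Approximate ln(x) for x > 0 using the AGM identity."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Scale x by a power of two so that s = x * 2**m is at least 2**30;
    # the identity's error shrinks roughly like 1 / s**2.
    _, exponent = math.frexp(x)
    m = max(0, 31 - exponent)
    s = math.ldexp(x, m)
    return math.pi / (2.0 * agm(1.0, 4.0 / s)) - m * math.log(2.0)

approx_log_10 = agm_log(10.0)
```

The iteration converges quadratically, which is why only a handful of steps are needed even when the two starting values differ by many orders of magnitude; an arbitrary-precision implementation would additionally need explicit constants for the error terms, which is the gap the analysis above addresses.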
With more and more content being produced, distributed, and ultimately rendered and consumed in digital form, devising effective Content Protection mechanisms and building satisfactory Digital Rights Management (DRM) systems have become top priorities for the Publishing and Entertainment Industries.
To help tackle this challenge, several cryptographic primitives and constructions have been proposed, including mechanisms to securely distribute data over a unidirectional insecure channel (Broadcast Encryption), schemes in which leakage of cryptographic keys can be traced back to the leaker (Traitor Tracing), and techniques to combine revocation and tracing capabilities (Trace-and-Revoke schemes).
In this thesis, we present several original constructions of the above primitives, which improve upon existing DRM-enabling cryptographic primitives along the following two directions:
Our results along the first line of work include the following:
As for the second direction, our contribution can be divided as follows:
Overall, the cryptographic tools developed in this thesis provide more flexibility and more security than existing solutions, and thus offer a better match for the challenges of the DRM setting.
In Bioinformatics, finding correlations between species allows us to better understand the important biological functions of those species and trace their evolution. This thesis considers sequence alignment, a method for obtaining these correlations. We improve upon sequence alignment tools designed for DNA with Plains, an algorithm that uses piecewise-linear gap functions and parameter-optimization to obtain correlations in remotely-related species pairs such as human and fugu using reasonable amounts of memory and space on an ordinary computer. We then discuss Planar, which is similar to Plains, but is designed for aligning RNA, and accounts for secondary structure. We also explore SEPA, a tool that uses p-value estimation based on exhaustive empirical data to better emphasize key results from an alignment with a measure of reliability. Using SEPA to measure the quality of an alignment, we proceed to compare Plains and Planar against similar alignment tools, emphasizing the interesting correlations caught in the process.
As the Web evolves, the number of network services deployed on the Internet has been growing at a dramatic pace. Such services usually involve a massive volume of data stored in physical or virtual back-end databases, and access the data to dynamically generate responses for client requests. These characteristics restrict use of traditional mechanisms for improving service performance and scalability: large volumes prevent replication of the service data at multiple sites required by content distribution schemes, while dynamic responses do not support the reuse required by web caching schemes.
However, many deployed data-centric network services share other properties that can help alleviate this situation: (1) service usage patterns exhibit locality of various forms, and (2) services are accessed using standard protocols and publicly known message structures. When properly exploited, these characteristics enable the design of alternative caching infrastructures, which leverage distributed network intermediaries to inspect traffic flowing between clients and services, infer locality information dynamically, and potentially improve service performance by taking actions such as partial service replication, request redirection, or admission control.
This dissertation investigates the nature of locality in service usage patterns for two well-known web services, and reports on the design, implementation, and evaluation of such a network intermediary architecture, named DataSlicer. DataSlicer incorporates four main techniques: (1) service-neutral request inspection and locality detection on distributed network intermediaries; (2) construction of oriented overlays for clustering client requests; (3) integrated load-balancing and service replication mechanisms that improve service performance and scalability by either redistributing the underlying traffic in the network or creating partial service replicas on demand at appropriate network locations; and (4) robustness mechanisms to maintain system stability in a wide-area network environment.
DataSlicer has been successfully deployed on the PlanetLab network. Extensive experiments using synthetic workloads show that our approach can: (1) create appropriate oriented overlays to cluster client requests according to multiple application metrics; (2) detect locality information across multiple dimensions and granularity levels; (3) leverage the detected locality information to perform appropriate load-balancing and service replication actions with minimal cost; and (4) ensure robust behavior in the face of dynamically changing network conditions.
In this thesis, we focus on multi-marker/-locus statistical methods for analyzing high-throughput array data used for the detection of genes implicated in complex disorders. There are two main parts: the first part concerns the localization of cancer genes from copy number variation data, with an application to lung cancer; the second part concerns the localization of disease genes using an affected-sib-pair design, with an application to inflammatory bowel disease. A third part addresses an important issue involved in the design of these disease-gene-detection studies. More details follow:
1. Detection of Oncogenes and Tumor Suppressor Genes using Multipoint Statistics from Copy Number Variation Data
ArrayCGH is a microarray-based comparative genomic hybridization technique that has been used to compare a tumor genome against a normal genome, thus providing rapid genomic assays of tumor genomes in terms of copy number variations of the chromosomal segments that have been gained or lost. When properly interpreted, these assays are likely to shed important light on genes and mechanisms involved in the initiation and progression of cancer. Specifically, chromosomal segments that are amplified or deleted in a group of cancer patients point to locations of cancer genes. We describe a statistical method to estimate the location of such genes by analyzing segmental amplifications and deletions in the genomes from cancer patients and the spatial relation of these segments to any specific genomic interval. The algorithm assigns to a genomic segment a score that parsimoniously captures the underlying biology. It computes a p-value for every putative disease gene by using results from the theory of scan statistics. We have validated our method using simulated datasets, as well as a real dataset on lung cancer.
2. Multi-locus Linkage Analysis of Affected-Sib-Pairs
The affected-sib-pair (ASP) design is a simple and popular design in the linkage analysis of complex traits. The traditional ASP methods evaluate the linkage information at a locus by considering only the marginal linkage information present at that locus. However, complex traits are influenced by multiple genes that together interact to increase the risk of disease. We describe a multi-locus linkage method that uses both the marginal information and information derived from the possible interactions among several disease loci, thereby increasing the significance of loci with modest marginal effects. Our method is based on a statistic that quantifies the linkage information contained in a set of markers. By a marker selection-reduction process, we screen a set of polymorphisms and select a few that seem linked to disease. We test our approach on simulated data and on genome-scan data for inflammatory bowel disease. We show that our method is expected to be more powerful than single-locus methods in detecting disease loci responsible for complex traits.
3. A Practical Haplotype Inference Algorithm
We consider the problem of efficient inference algorithms to determine the haplotypes and their distribution from a dataset of unrelated genotypes.
With the currently available catalogue of single-nucleotide polymorphisms (SNPs) and given their abundance throughout the genome (one in about $500$ bps) and low mutation rates, scientists hope to significantly improve their ability to discover genetic variants associated with a particular complex trait. We present a solution to a key intermediate step by devising a practical algorithm that has the ability to infer the haplotype variants for a particular individual from its own genotype SNP data in relation to population data. The algorithm we present is simple to describe and implement; it makes no assumption such as perfect phylogeny or the availability of parental genomes (as in trio-studies); it exploits locality in linkages and low diversity in haplotype blocks to achieve a linear time complexity in the number of markers; it combines many of the advantageous properties and concepts of other existing statistical algorithms for this problem; and finally, it outperforms competing algorithms in computational complexity and accuracy, as demonstrated by the studies performed on real data and synthetic data.
In this paper, a FETI-DP formulation for three-dimensional elasticity on non-matching grids over geometrically non-conforming subdomain partitions is considered. To resolve the nonconformity of the finite elements, a mortar matching condition is imposed on the subdomain interfaces (faces). A FETI-DP algorithm is then built by enforcing the mortar matching condition in dual and primal ways. In order to make the FETI-DP algorithm scalable, a set of primal constraints, which include average and momentum constraints over interfaces, are selected from the mortar matching condition. A condition number bound, $C(1+\log(H/h))^2$, is then proved for the FETI-DP formulation for the elasticity problems with discontinuous material parameters. Only some faces need to be chosen as primal faces on which the average and momentum constraints are imposed.
Since the advent of motion capture animation, attempts have been made to extract the seemingly nebulously defined attributes of 'content' and 'style' from the motion data. Enabling quick access to highly precise data, the benefits of motion capture for animation purposes are abundant. Yet manipulating the expressive attributes of the motion data in a comprehensive manner has proved elusive. This dissertation poses practical solutions that are based on insights from the dance community and learning attributes from the motion data itself. The culminating project is a system which learns the deformations of the human body and reapplies them in exaggerated form for enhanced expressivity.
The result, developed alongside efficient and usable tools for animators, is a three-pronged technique to enhance the expressive qualities of motion capture animation. The key aspect is the creation of a deformable skeleton representation of the human body using a unique machine learning approach. The deformable skeleton is modeled by replicating the actual movements of the human spine. The second step relies on exploiting the subtle aspects of motion, such as hand movement, to create an emotional effect visually. Both of these approaches involve exaggerating the movements in the same vein as the traditional 2-D animation technique of 'squash and stretch'. Finally, a novel technique for the application of style on a baseline motion capture sequence is developed.
All of these approaches are rooted in machine learning techniques. Linear discriminant analysis was initially applied to a single phrase of motion demonstrating various style characteristics in LABAN notation. A variety of methods, including nonlinear PCA and LLE, were used to learn the underlying manifold of spine movements. Nonlinear dynamic models were learned in attempts to describe motion segments versus single phrases. In addition, the dissertation focuses on the variety of obstacles in learning with motion data. These include the correct parameterization of angles, applying statistical analysis to quaternions, and appropriate distance measures between postures.
As the Internet has become increasingly ubiquitous, it has seen tremendous growth in the popularity of online services. These services range from online CVS repositories such as SourceForge and shopping sites to online financial and administrative systems. It is critical for these services to provide correct and reliable execution for clients. However, given their attractiveness as targets and ubiquitous accessibility, online servers also have a significant chance of being compromised, leading to Byzantine failures.
Designing and implementing a service to run on a machine that may be compromised is not an easy task, since infrastructure under malicious control may behave arbitrarily. Even worse, as any monitoring facility may also be subverted at the same time, there is no easy way for system behavior to be audited, or for malicious attacks to be detected.
We propose a solution to the problem that reduces the trust needed on the server side in the first place. In other words, our system is designed specifically for running on untrusted hosts. In this thesis, we realize this principle through two different approaches. First, we design and implement a new network file system -- SUNDR. In SUNDR, malicious servers cannot forge users' operations or tamper with their data without being detected. In the worst case, attackers can only conceal users' operations from each other. Still, SUNDR is able to detect this misbehavior whenever users communicate with each other directly.
The limitation of the approach above is that the system cannot guarantee ideal consistency in the presence of even a single failure. In the second approach, we use replicated state machines to tolerate some fraction of malicious server failures, which is termed Byzantine Fault Tolerance (BFT) in the literature. Classical BFT systems assume that fewer than 1/3 of the replicas are malicious in order to provide ideal consistency. In this thesis, we push the boundary from 1/3 to 2/3. With fewer than 1/3 of replicas faulty, we provide the same guarantees as classical BFT systems. Additionally, we guarantee weaker consistency, instead of arbitrary behavior, when between 1/3 and 2/3 of replicas fail.
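The fault thresholds described above can be summarized in a small sketch. The function name, the returned labels, and the treatment of the exact 1/3 and 2/3 boundaries are assumptions made for illustration; the abstract only states the open ranges.

```python
def consistency_guarantee(n, faulty):
    """Classify the guarantee for n replicas of which `faulty` are malicious.

    Hypothetical labels:
      "ideal"  - fewer than 1/3 faulty (the classical BFT regime)
      "weaker" - at least 1/3 but fewer than 2/3 faulty
      "none"   - 2/3 or more faulty (arbitrary behavior possible)
    Boundary handling is an assumption of this sketch.
    """
    if n <= 0 or not 0 <= faulty <= n:
        raise ValueError("need n > 0 and 0 <= faulty <= n")
    # Integer arithmetic avoids floating-point issues at the boundaries.
    if 3 * faulty < n:
        return "ideal"
    if 3 * faulty < 2 * n:
        return "weaker"
    return "none"


if __name__ == "__main__":
    for f in range(5):
        print(f, consistency_guarantee(4, f))
```

With four replicas, one faulty replica stays within the classical bound, two fall into the weaker-consistency range, and three or more exceed both thresholds.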
A linear time-invariant dynamical system is robustly stable if the system as well as all of its nearby systems in a neighborhood of interest are stable. An important property of robustly stable systems is that they decay asymptotically without exhibiting significant transient behavior. The first part of this thesis work focuses on measures revealing the degree of robust stability of a dynamical system. We put special emphasis on pseudospectral measures, those based on the eigenvalues of nearby matrices for a first-order system or matrix polynomials for a higher-order system. We present algorithms for the computation of pseudospectral measures for continuous and discrete systems with quadratic rate of convergence and analyze their accuracy in the presence of rounding errors. We also provide an efficient algorithm for the numerical radius of a matrix, the modulus of the outermost point in the field of values (the set of Rayleigh quotients) of the matrix. These algorithms are inspired by algorithms of Byers, Boyd-Balakrishnan and Burke-Lewis-Overton.
The second part is devoted to indicators of robust controllability. We call a system robustly controllable if it is controllable and remains controllable under perturbations of interest. We describe efficient methods for the computation of the distance to the closest uncontrollable system. Our first algorithm for the first-order distance to uncontrollability depends on a grid and is well-suited for low precision approximation. We then discuss algorithms for high precision approximation of the first-order distance to uncontrollability. These are based on the bisection method of Gu and the trisection variant of Burke-Lewis-Overton.
These algorithms require the extraction of the real eigenvalues of matrices of size $O(n^2)$, typically at a cost of $O(n^6)$, where $n$ is the dimension of the state space. We propose a new divide-and-conquer algorithm that reduces the cost to $O(n^4)$ on average in both theory and practice and $O(n^5)$ in the worst case. The new iterative approach to the extraction of real eigenvalues may also be useful in other contexts. For higher-order systems we derive a singular value characterization and exploit this characterization for the computation of the higher-order distance to uncontrollability to low precision. The algorithms in this thesis assume arbitrary complex perturbations are applicable to the input system and usually require the extraction of the imaginary eigenvalues of Hamiltonian matrices (or even matrix polynomials) or the unit eigenvalues of symplectic pencils (or palindromic matrix polynomials).
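For orientation, the first-order distance to uncontrollability discussed above admits a well-known singular value characterization due to Eising. The notation below ($\tau$, $A$, $B$, $\sigma_{\min}$) is chosen for this note and is not taken from the thesis. For the system $\dot{x} = Ax + Bu$ with $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{n \times m}$, and with complex perturbations measured in the spectral norm:

```latex
\[
  \tau(A,B)
  \;=\;
  \min\bigl\{ \|[\,\Delta A \;\; \Delta B\,]\|_2 :
      (A+\Delta A,\; B+\Delta B) \text{ is uncontrollable} \bigr\}
  \;=\;
  \min_{\lambda \in \mathbb{C}}
      \sigma_{\min}\bigl( [\, A - \lambda I \;\;\; B \,] \bigr).
\]
```

The right-hand side is a minimization over the complex plane of the smallest singular value of an $n \times (n+m)$ matrix. This is why grid-based methods are natural for low precision, while bisection and trisection schemes test candidate values of the distance.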
Systems Biology strives to hasten our understanding of the fundamental principles of life by adopting a systems-level approach for the analysis of cellular function and behavior. One popular framework for capturing the chemical kinetics of interacting biochemicals is Hybrid Automata. Our goal in this thesis is to aid Systems Biology research by improving the current understanding of hybrid automata, by developing techniques for symbolic rather than numerical analysis of the dynamics of biochemical networks modeled as hybrid automata, and by honing the theory to two classes of problems: kinetic mass action based simulation in genetic regulatory & signal transduction pathways, and pseudo-equilibrium simulation in metabolic networks.
We first provide new constructions that prove that the "open" Hierarchical Piecewise Constant Derivative (HPCD) subclass is closer to the decidability and undecidability frontiers than was previously understood. After concluding that the HPCD-like classes are unsuitable for modeling chemical reactions, our quest for semi-decidable subclasses leads us to define the "semi-algebraic" subclass. This is the most expressive hybrid automaton subclass amenable to rigorous symbolic temporal reasoning. We begin with the bounded reachability problem, and then show how the dense-time temporal logic Timed Computation Tree Logic (TCTL) can be model-checked by exploiting techniques from real algebraic geometry, primarily real quantifier elimination. We also prove the undecidability of reachability in the Blum-Shub-Smale Turing Machine formalism. We then develop efficient approximation strategies by extending bisimulation partitioning, rectangular grid-based approximation, polytopal approximation and time discretization. We then develop a uniform algebraic framework for modeling biochemical and metabolic networks, also extending flux balance analysis. We present some preliminary results using a prototypical tool Tolque. It is a symbolic algebraic dense time model-checker for semi-algebraic hybrid automata, which uses Qepcad for quantifier elimination.
The "Algorithmic Algebraic Model Checking" techniques developed in this thesis present a theoretically-grounded mathematically-sound platform for powerful symbolic temporal reasoning over biochemical networks and other semi-algebraic hybrid automata. It is our hope that by building upon this thesis, along with the development of computationally efficient parallelizable quantifier elimination algorithms and the integration of different computer algebra tools, scientific software systems will emerge that fundamentally transform the way biochemical networks (and other hybrid automata) are analyzed.
This dissertation presents a learning-based system for the detection, identification, localization, and measurement of various sub-cellular structures in microscopic images of developing embryos. The system analyzes sequences of images obtained through DIC microscopy and detects cell nuclei, cytoplasm, and cell walls automatically. The system described in this dissertation is the key initial component of a fully automated phenotype analysis system.
Our study primarily concerns the early stages of development of C. elegans nematode embryos, from fertilization to the four-cell stage. The method proposed in this dissertation consists in learning the entire processing chain {\em from end to end}, from raw pixels to ultimate object categories.
The system contains three modules: (1) a convolutional network trained to classify each pixel into five categories: cell wall, cytoplasm, nuclear membrane, nucleus, outside medium; (2) an Energy-Based Model which cleans up the output of the convolutional network by learning local consistency constraints that must be satisfied by label images; (3) a set of elastic models of the embryo at various stages of development that are matched to the label images.
When observing normal (wild type) embryos it is possible to visualize important cellular functions such as nuclear movements and fusions, cytokinesis and the setting up of crucial cell-cell contacts. These events are highly reproducible from embryo to embryo. The events will deviate from normal behaviors when the function of a specific gene is perturbed, therefore allowing the detection of correlations between gene activities and specific early embryonic events. One important goal of the system is to automatically detect whether the development is normal (and therefore, not particularly interesting), or abnormal and worth investigating. Another important goal is to automatically extract quantitative measurements such as the migration speed of the nuclei and the precise time of cell divisions.
The notion of records, which are used to organize closely related groups of data so the group can be treated as a unit, and also provide access to the data within by name, is almost universally supported in programming languages. However, in virtually all cases, the operations permitted on records in statically typed languages are extremely limited. Providing greater flexibility in dealing with records, while simultaneously retaining the benefits of static type checking is a desirable goal.
This problem has generated considerable interest, and a number of type systems dealing with records have appeared in the literature. In this work, we present the first polymorphic type system that is expressive enough to type a number of complex operations on records, including three forms of concatenation and natural join. In addition, the precise types of the records involved are inferred, to eliminate the burden of explicit type declarations. Another aspect of this problem is an efficient implementation of records and their associated operations. We also present a compilation method which accomplishes this goal.
Resolving inconsistencies in data is a problem of critical practical importance. Inconsistent data arises whenever an attribute takes on multiple, inconsistent, values. This may occur when a particular entity is stored multiple times in one database, or in multiple databases that are combined.
We investigate Attribute Value Inconsistency Resolution (AVIR), the problem of semi-automatically resolving data inconsistencies among multiple database records that describe the same person or thing.
Our survey of the area shows that existing solutions are either limited in scope or impose a significant burden on their users. Either they do not cover all types of inconsistencies and attributes, or they require users to write or choose attribute resolution functions for each potentially conflicting attribute.
Our ML-based approach applies to all types of inconsistencies and attributes, and automatically selects appropriate resolution functions based on the conflicting data. We have invented and developed a system that uses a set of binary features that detect data properties and relationships, and resolution functions that merge data. Many such features and resolution functions have been written. The system uses supervised learning with maximum likelihood estimation to determine which function(s) to apply, based on which feature(s) fire.
We have validated our system by comparing its error rate, decision rate and decision accuracy on a test data set to baseline values determined by a clairvoyant application of a standard approach where each potentially conflicting attribute is resolved by the best resolution function for the attribute.
The paper introduces the construct of \emph{temporal testers} as a compositional basis for the construction of automata corresponding to temporal formulas in the PSL logic. Temporal testers can be viewed as (non-deterministic) transducers that, at any point, output a boolean value which is 1 iff the corresponding temporal formula holds starting at the current position.
The main advantage of testers, compared to acceptors (such as Büchi automata), is that they are compositional. Namely, a tester for a compound formula can be constructed out of the testers for its sub-formulas. In this paper, we extend the application of the testers method from LTL to the logic PSL.
Besides providing the construction of testers for PSL, we indicate how the symbolic representation of the testers can be directly utilized for efficient model checking and run-time monitoring.
This thesis addresses the difficult open problem in computer graphics of autonomous human modeling and animation, specifically of emulating the rich complexity of real pedestrians in urban environments.
We pursue an artificial life approach that integrates motor, perceptual, behavioral, and cognitive components within a model of pedestrians as highly capable individuals. Our comprehensive model features innovations in these components, as well as in their combination, yielding results of unprecedented fidelity and complexity for fully autonomous multi-human simulation in large urban environments. Our pedestrian model is entirely autonomous and requires no centralized, global control whatsoever.
To animate a variety of natural interactions between numerous pedestrians and their environment, we represent the environment using hierarchical data structures, which efficiently support the perceptual queries of the autonomous pedestrians that drive their behavioral responses and sustain their ability to plan their actions on local and global scales.
The animation system that we implement using the above models enables us to run long-term simulations of pedestrians in large urban environments without manual intervention. Real-time simulation can be achieved for well over a thousand autonomous pedestrians. With each pedestrian under his/her own autonomous control, the self-animated characters imbue the virtual world with liveliness, social (dis)order, and a realistically complex dynamic.
We demonstrate the automated animation of human activity in a virtual train station, and we employ our pedestrian simulator in the context of virtual archaeology for visualizing urban social life in reconstructed archaeological sites. Our pedestrian simulator is also serving as the basis of a testbed for designing and experimenting with visual sensor networks in the field of computer vision.
Numerical computations with real algebraic numbers require algorithms for approximating and isolating real roots of polynomials. A classical choice for root approximation is Newton's method. For an analytic function on a Banach space, Smale introduced the concept of approximate zeros, i.e., points from which Newton's method for the function converges quadratically. To identify these approximate zeros he gave computationally verifiable convergence criteria called point estimates. However, in developing these results Smale assumed that Newton's method is computed exactly. For a system of $n$ homogeneous polynomials in $n+1$ variables, Malajovich developed point estimates for a different definition of approximate zero, assuming that all operations in Newton's method are computed with fixed precision. In the first half of this dissertation, we develop point estimates for these two different definitions of approximate zeros of an analytic function on a Banach space, but assume the strong bigfloat computational model of Brent, i.e., where all operations involve bigfloats with varying precision. In this model, we derive a uniform complexity bound for approximating a root of a zero-dimensional system of $n$ integer polynomials in $n$ variables. We also derive a non-asymptotic bound, in terms of the condition number of the system, on the precision required to implement the robust Newton method.
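To make the notion of a point estimate concrete, the sketch below evaluates Smale's quantity $\alpha(f,x) = \beta(f,x)\,\gamma(f,x)$ for a univariate polynomial, where $\beta = |f(x)/f'(x)|$ and $\gamma = \max_{k \ge 2} |f^{(k)}(x)/(k!\,f'(x))|^{1/(k-1)}$. It compares $\alpha$ against the commonly quoted threshold $\alpha_0 = (13 - 3\sqrt{17})/4 \approx 0.157671$. The function names are hypothetical, and the computation uses ordinary floating point with exact-arithmetic reasoning in mind. It does not model the bigfloat setting or the robust Newton method analyzed in the dissertation.

```python
ALPHA_0 = (13 - 3 * 17 ** 0.5) / 4  # approximately 0.157671


def taylor_coefficients(coeffs, x0):
    """Return [f(x0), f'(x0)/1!, f''(x0)/2!, ...].

    `coeffs` lists the polynomial coefficients from the constant term upward.
    Uses repeated synthetic division (a Taylor shift by x0).
    """
    c = list(coeffs)
    n = len(c) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            c[j] += x0 * c[j + 1]
    return c


def smale_alpha(coeffs, x0):
    """Compute alpha(f, x0) = beta * gamma for a univariate polynomial f."""
    t = taylor_coefficients(coeffs, x0)
    if len(t) < 2 or t[1] == 0:
        raise ZeroDivisionError("derivative vanishes at x0")
    beta = abs(t[0] / t[1])
    gamma = max(
        (abs(t[k] / t[1]) ** (1.0 / (k - 1)) for k in range(2, len(t))),
        default=0.0,
    )
    return beta * gamma


def passes_point_estimate(coeffs, x0):
    """True if alpha(f, x0) < ALPHA_0, which certifies an approximate zero."""
    return smale_alpha(coeffs, x0) < ALPHA_0


if __name__ == "__main__":
    f = [-2, 0, 1]  # f(x) = x^2 - 2
    print(smale_alpha(f, 1.5), passes_point_estimate(f, 1.5))
    print(smale_alpha(f, 1.0), passes_point_estimate(f, 1.0))
```

The test is sufficient but not necessary: a point that fails it may still converge under Newton's method, but the criterion no longer certifies quadratic convergence from that point.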
The second part of the dissertation analyses the worst-case complexity of two algorithms for isolating real roots of a square-free polynomial with real coefficients: the Descartes method and Akritas' continued fractions algorithm. The analysis of both algorithms is based upon amortization bounds such as the Davenport-Mahler bound. For the Descartes method, we give a unified framework that encompasses both the power basis and the Bernstein basis variant of the method; we derive an $O(n(L+\log n))$ bound on the size of the recursion tree obtained by applying the method to a square-free polynomial of degree $n$ with integer coefficients of bit-length $L$, and the bound is tight for $L=\Omega(\log n)$; based upon this result we readily obtain the best known bit-complexity bound of $\wt{O}(n^4L^2)$ for the Descartes method, where $\wt{O}$ means we ignore logarithmic factors. Similar worst-case bounds on the bit-complexity of Akritas' algorithm were not known in the literature. We provide the first such bound, $\wt{O}(n^{12}L^3)$, for a square-free integer polynomial of degree $n$ and coefficients of bit-length $L$.
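A minimal sketch of the power-basis variant of the Descartes method follows. It is restricted to the open interval $(0,1)$ and to square-free polynomials with integer coefficients. All function names are hypothetical, and the code illustrates the recursion whose tree size the bound above refers to; it is not the implementation analyzed in the dissertation. The test on an interval counts sign variations of $(x+1)^n p(1/(x+1))$: zero variations means no root, one variation means exactly one root, and otherwise the interval is bisected.

```python
from fractions import Fraction


def sign_variations(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def shift_by_one(coeffs):
    """Coefficients of p(x + 1), with coefficients listed low-to-high."""
    c = list(coeffs)
    n = len(c) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            c[j] += c[j + 1]
    return c


def descartes_test(coeffs):
    """Sign variations of (x+1)^n p(1/(x+1)); bounds the roots of p in (0, 1)."""
    return sign_variations(shift_by_one(coeffs[::-1]))


def isolate_roots(coeffs, a=Fraction(0), b=Fraction(1)):
    """Isolate the roots in (a, b) of a square-free polynomial.

    `coeffs` describes the polynomial after mapping (a, b) onto (0, 1).
    Returns open intervals (lo, hi) with one root each, or (m, m) for a
    root found exactly at a bisection point m.
    """
    v = descartes_test(coeffs)
    if v == 0:
        return []
    if v == 1:
        return [(a, b)]
    n = len(coeffs) - 1
    mid = (a + b) / 2
    # left(x) = 2^n p(x/2) covers (a, mid); right(x) = left(x + 1) covers (mid, b).
    left = [c * 2 ** (n - i) for i, c in enumerate(coeffs)]
    right = shift_by_one(left)
    roots = isolate_roots(left, a, mid)
    if right[0] == 0:  # p vanishes exactly at the midpoint
        roots.append((mid, mid))
    roots.extend(isolate_roots(right, mid, b))
    return roots


if __name__ == "__main__":
    # (4x - 1)(4x - 3) = 16x^2 - 16x + 3 has roots 1/4 and 3/4.
    print(isolate_roots([3, -16, 16]))
```

Square-freeness matters for termination: a repeated root keeps the variation count at two or more on every subinterval containing it, so the bisection would never stop.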
With the development and improvement of high throughput experimental technologies, massive amounts of biological data including genomic sequences and optical-maps have been collected for various species. Comparative techniques play a central role in investigating the adaptive significance of organismal traits and revealing evolutionary relations among organisms by comparing these biological data. This dissertation presents two efficient comparative analysis tools used in comparative genomics and comparative optical-map study, respectively.
A complete genome sequence of an organism can be viewed as its ultimate genetic map, in the sense that the heritable information is encoded within the DNA and the order of nucleotides along chromosomes is known. Comparative genomics can be applied to find functional sites by comparing genetic maps. Comparing vertebrate genomes requires efficient cross-species sequence alignment programs. The first tool introduced in this thesis is COMBAT (Clean Ordered Mer-Based Alignment Tool), a new mer-based method which can search rapidly for highly similar translated genomic sequences using the stable-marriage algorithm (SM) as an alignment filter. In experiments COMBAT is applied to comparative analysis between yeast genomes, and between the human genome and the recently published bovine genome. The homologous blocks identified by COMBAT are comparable with the alignments produced by BLASTP and BLASTZ.
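The stable-marriage step can be sketched with the classical Gale-Shapley procedure. How COMBAT scores mers and turns those scores into preference lists is not described here, so the preference dictionaries below are placeholders and the function name is hypothetical. The sketch assumes every proposer appears in the preference list of each acceptor it ranks.

```python
from collections import deque


def stable_marriage(proposer_prefs, acceptor_prefs):
    """Gale-Shapley stable matching.

    Both arguments map a name to a list of names on the other side, ordered
    from most to least preferred. Returns a dict proposer -> acceptor.
    """
    # rank[a][p] is the position of proposer p in acceptor a's list.
    rank = {
        a: {p: i for i, p in enumerate(prefs)}
        for a, prefs in acceptor_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}
    engaged_to = {}  # acceptor -> proposer
    free = deque(proposer_prefs)
    while free:
        p = free.popleft()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(a)
        if current is None:
            engaged_to[a] = p
        elif rank[a][p] < rank[a][current]:
            engaged_to[a] = p      # a trades up; the old partner is free again
            free.append(current)
        else:
            free.append(p)         # a rejects p; p will try its next choice
    return {p: a for a, p in engaged_to.items()}


if __name__ == "__main__":
    proposers = {"A": ["X", "Y"], "B": ["X", "Y"]}
    acceptors = {"X": ["B", "A"], "Y": ["A", "B"]}
    print(stable_marriage(proposers, acceptors))
```

As a filter, a stable matching keeps at most one partner per item on each side, which discards many-to-many candidate pairings while guaranteeing that no unmatched pair would both prefer each other over their assigned partners.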
When genetic maps are not available, other genomic maps, including optical-maps, can be constructed. An optical map is an ordered enumeration of the restriction sites along with the estimated lengths of the restriction fragments between consecutive restriction sites. CAPO (Comparative Analysis and Phylogeny with Optical-Maps), introduced as a second technique in this thesis, is a tool for inferring phylogeny based on pairwise optical map comparison and bipartite graph matching. CAPO combines the stable matching algorithm with either the Unweighted Pair Group Method with Arithmetic Averaging (UPGMA) or the Neighbor-Joining (NJ) method for constructing phylogenetic trees. This new algorithm is capable of constructing phylogenetic trees in logarithmic steps and performs well in practice. Using optical maps constructed in silico and in vivo, our work shows that both UPGMA-flavored trees and the NJ-flavored trees produced by CAPO share substantial overlapping tree topology and are biologically meaningful.
It is difficult to provision and manage modern component-based Internet services so that they provide stable quality-of-service (QoS) guarantees to their clients, because: (1) component middleware are complex software systems that expose several independently tuned configurable application runtime policies and server resource management mechanisms; (2) session-oriented client behavior with complex data access patterns makes it hard to predict what impact tuning these policies and mechanisms has on application behavior; (3) component-based Internet services exhibit complex structural organization with requests of different types accessing different components and data sources, which could be distributed and/or replicated for failover, performance, or business purposes.
This dissertation attempts to alleviate this situation by targeting three interconnected goals: (1) providing improved QoS guarantees to the service clients, (2) optimizing server resource utilization, and (3) providing application developers with guidelines for natural application structuring, which enable efficient use of the proposed mechanisms for improving service performance. Specifically, we explore the thesis that exposing and using detailed information about how clients use component-based Internet services enables mechanisms that achieve the range of goals listed above. To validate this thesis we show its applicability to the following four problems: (1) maximizing reward brought by Internet services, (2) optimizing utilization of server resource pools, (3) providing session data integrity guarantees, and (4) enabling service distribution in wide-area environments.
The techniques that we propose for the identified problems are applicable at both the application structuring stage and the application operation stage, and range from automatic (i.e., performed by middleware in real time) to manual (i.e., involve the programmer, or the service provider). These techniques take into account service usage information exposed at different levels, ranging from high-level structure of user sessions to low level information about data access patterns and resource utilization by requests of different types. To show the benefits of the proposed techniques, we implement various middleware mechanisms in the JBoss application server, which utilizes the J2EE component model, and comprehensively evaluate them on several publicly-available sample J2EE applications - Java Pet Store, RUBiS, and our own implementation of the TPC-W web transactional benchmark. Our experimental results show that the proposed techniques achieve optimal utilization of server resources and improve application performance by up to two times for centralized Internet services and by up to 6 times for distributed ones.
Data arriving in time order (time series) arises in disciplines ranging from music to meteorology to finance to motion capture data, to name a few. In many cases, a natural way to query the data is what we call time series matching - a user enters a time series by hand, keyboard or voice and the system finds "similar" time series.
Existing time series similarity measures, such as DTW (Dynamic Time Warping), can accommodate certain timing errors in the query and perform with high accuracy on small databases. However, they all have high computational complexity and the accuracy dramatically drops when the data set grows. More importantly, there are types of errors that cannot be captured by a single similarity measure.
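To make the cost of DTW concrete, here is a minimal sketch of the classic dynamic-programming formulation, assuming an absolute-difference local cost and no warping-window constraint (both are choices made for this example; the function name `dtw_distance` is hypothetical). The table has one cell per pair of positions, which is the source of the quadratic running time mentioned above.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference local cost.

    Runs in O(len(a) * len(b)) time and space.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the prefixes a[:i] and b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the best of: advance in a only, in b only, or in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

For instance, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated `2` is absorbed by warping, which is the kind of timing error DTW tolerates. A uniform pitch shift, by contrast, still incurs cost.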
Here we present a general time series matching framework. This framework can easily optimize, combine and test different features to execute a fast similarity search based on the application's requirement. Basically we use a multi-filter chain and boosting algorithms to compose a ranking algorithm. Each filter is a classifier which removes bad candidates by comparing certain features of the time series data. Some filters use a boosting algorithm to combine a few different weak classifiers into a strong classifier. The final filter will give a ranked list of candidates in the reference data which matches the query data.
The framework is applied to build query algorithms for a Query-by-Humming system. Experiments show that the algorithm has a more accurate similarity measure and that its response time increases much more slowly than that of the pure DTW algorithm when the number of songs in the database increases from 60 to 1400.
A large amount of new information is posted on the Web every day. Large-scale web search engines often update their index slowly and are unable to present such information in a timely manner. Here we present our solutions for finding new information on the web by tracking the changes of web documents.
First, we present the algorithms and techniques useful for solving the following problems: detecting web pages that have changed, extracting changes from different versions of a web page, and evaluating the significance of web changes. We propose a two-level change detector: MetaDetector and ContentDetector. The combined detector successfully reduces network traffic by about 67%. Our algorithm for extracting web changes consists of three steps: document tree construction, document tree encoding and tree matching. It has linear time complexity and effectively extracts the changed content from different versions of a web page. In order to evaluate web changes, we propose a unified ranking framework combining three metrics: popularity ranking, content-based ranking and evolution ranking. Our methods can identify and deliver important new information in a timely manner.
Second, we present an application using the techniques and algorithms we developed, named "Web Daily News Assistant (WebDNA): finding what's new on Your Web". It is a search tool that helps community users search new information on their community web. Currently WebDNA is deployed on the New York University web site.
Third, we model the changes of web documents using survival analysis. Modeling web changes is useful for web crawler scheduling and web caching. Currently people model changes to web pages as a Poisson Process, and use a necessarily incomplete detection history to estimate the true frequencies of changes. However, other features that can be used to predict change frequency have not previously been studied. Our analysis shows that PageRank value is a good predictor. Statistically, the change frequency is a function proportional to $\exp[0.36\cdot (\ln(PageRank)+C)]$. We further study the problem of combining the predictor and change history into a unified framework. An improved estimator of change frequency is presented, which successfully reduces the error by 27.3% when the change history is short.
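The stated proportionality can be read as a power law in PageRank. Writing $\lambda$ for the change frequency and $PR$ for the PageRank value (both symbols are introduced here only for illustration), and taking $C$ as the constant from the fitted expression above:

$$\lambda \;\propto\; \exp[0.36\cdot(\ln(PR)+C)] \;=\; e^{0.36\,C}\cdot PR^{\,0.36}.$$

Under this reading, the constant $C$ only rescales the frequency, and the ratio of change frequencies of two pages depends only on the ratio of their PageRank values: doubling PageRank multiplies the predicted change frequency by $2^{0.36}\approx 1.28$, and a tenfold increase multiplies it by $10^{0.36}\approx 2.3$.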
Events occur in every aspect of our lives.
An unexpectedly large number of events occurring within some measurement window (e.g. within some time duration or a spatial region) is called a {\em burst}, suggesting unusual behaviors or activities. Bursts come up in many natural and social processes. It is a challenging task to monitor the occurrence of bursts of unknown duration in a fast data stream environment.
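The definition above can be made concrete with a brute-force baseline that checks every window of every size of interest. This is only a sketch of the problem statement, not of the data structures developed in this work; the function name `find_bursts` and the per-size threshold dictionary are assumptions made for the example. Its cost grows with the number of window sizes times the stream length, which is what motivates more efficient structures.

```python
def find_bursts(counts, thresholds):
    """Return (start, window_size) for every window whose total meets its threshold.

    counts[i] is the number of events at time step i; thresholds maps a
    window size w to the minimum total that counts as a burst at that size.
    """
    # prefix[i] = sum of counts[:i], so each window sum is a single subtraction
    prefix = [0]
    for c in counts:
        prefix.append(prefix[-1] + c)
    bursts = []
    for w, limit in sorted(thresholds.items()):
        for start in range(len(counts) - w + 1):
            if prefix[start + w] - prefix[start] >= limit:
                bursts.append((start, w))
    return bursts


# Hypothetical stream: a spike at steps 3 and 4.
found = find_bursts([0, 1, 0, 5, 6, 0, 1], {1: 6, 2: 10})
```

With these inputs the single step at index 4 qualifies as a size-1 burst and the window starting at index 3 qualifies as a size-2 burst.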
This work describes efficient data structures and algorithms for high performance burst detection under different settings. Our view is that bursts, as an unusual phenomenon, constitute a useful preliminary primitive in a knowledge discovery hierarchy. Our intent is to build a high performance primitive detection algorithm to support high-level data mining tasks.
The work starts with an algorithmic framework including a family of data structures and a heuristic optimization algorithm to choose an efficient data structure given the inputs. The advantage of this framework is that it's adaptive to different inputs. Experiments on both synthetic data and real world data show the new framework significantly outperforms existing techniques over a variety of inputs.
Furthermore, we present a greedy dynamic detection algorithm which handles the changing data. It evolves the structure to adapt to the incoming data. It achieves better performance in both synthetic and real data streams than a static algorithm in most cases.
We have applied this framework to different real world applications in physics, stock trading and website traffic monitoring. All the case studies show our framework has superb performance.
We extend this framework to multi-dimensional data and use it in an epidemiology simulation to detect infectious disease outbreak and spread.
Data arriving in time order (a data stream) arises in fields ranging from physics to finance to medicine to music, to name a few. Often the data comes from sensors (in physics and medicine for example) whose data rates continue to improve dramatically as sensor technology improves. Furthermore, the number of sensors is increasing, so analyzing data between sensors becomes ever more critical in order to distill knowledge from the data. Fast response is desirable in many applications (e.g. to aim a telescope at an activity of interest or to perform a stock trade). In applications such as finance, recent information, e.g. correlation, is of far more interest than older information, so analysis over sliding windows is a desired operation.
These three factors -- huge data size, fast response, and windowed computation -- motivated this work. Our intent is to build a foundational library of primitives to perform online or near online statistical analysis, e.g. windowed correlation, incremental matching pursuit, burst detection, on thousands or even millions of time series. Besides the algorithms, we also propose the concept of ``uncooperative'' time series, whose power spectra are spread over all frequencies without any regularity.
Previous work showed how to do windowed correlation with Fast Fourier Transforms and Wavelet Transforms, but such techniques don't work for uncooperative time series. This thesis will show how to use sketches (random projections) in a way that combines several simple techniques -- sketches, convolution, structured random vectors, grid structures, combinatorial design, and bootstrapping -- to achieve high performance, windowed correlation over a variety of data sets. Experiments confirm the asymptotic analysis.
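The core idea behind sketch-based correlation can be illustrated with plain Gaussian random projections. The sketch below is a simplified stand-in, not the method of this thesis: it omits the structured random vectors, convolution, grid structures, and bootstrapping listed above, and all function names are hypothetical. It relies on two standard facts: for z-normalized windows $x$ and $y$ of length $n$, $\|x-y\|^2 = 2n(1-\mathrm{corr}(x,y))$, and the mean squared difference of Gaussian projections estimates $\|x-y\|^2$.

```python
import math
import random


def z_normalize(xs):
    """Shift and scale xs to zero mean and unit population standard deviation.

    Assumes the window is not constant (otherwise the deviation is zero).
    """
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / std for x in xs]


def pearson(xs, ys):
    """Exact Pearson correlation, used here only as a reference value."""
    zx, zy = z_normalize(xs), z_normalize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def make_projection(n, k, seed=0):
    """Draw k Gaussian random vectors of length n, shared by all series."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]


def sketch(xs, projection):
    """Project the z-normalized window onto each random vector."""
    z = z_normalize(xs)
    return [sum(r * v for r, v in zip(row, z)) for row in projection]


def sketch_correlation(sx, sy, n):
    """Estimate the correlation of two length-n windows from their sketches."""
    k = len(sx)
    # Mean squared sketch difference estimates the squared Euclidean distance.
    dist_sq = sum((a - b) ** 2 for a, b in zip(sx, sy)) / k
    # For z-normalized windows, ||x - y||^2 = 2n(1 - corr).
    return 1.0 - dist_sq / (2.0 * n)


# Hypothetical usage: two phase-shifted sine windows of length 64.
xs = [math.sin(0.2 * i) for i in range(64)]
ys = [math.sin(0.2 * i + 0.5) for i in range(64)]
proj = make_projection(64, 256)
estimate = sketch_correlation(sketch(xs, proj), sketch(ys, proj), 64)
```

Because each series is sketched once and pairs are compared only through their short sketches, the per-pair cost depends on the sketch length rather than the window length. The estimate is approximate, and its accuracy improves as the number of random vectors grows.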
To conduct matching pursuit (MP) over time series windows, an incremental scheme is designed to reduce the computational effort. Our empirical study demonstrates a substantial improvement in speed.
In previous work, Zhu and Shasha introduced an efficient algorithm to monitor bursts within windows of multiple sizes. We implemented it in a physical system by overcoming several practical challenges. Experimental results support the authors' linear running time analysis.
Event-driven middleware is a popular infrastructure for building large-scale asynchronous distributed systems. Content-based publish/subscribe systems are a type of event-driven middleware that provides service flexibility and specification expressiveness, creating opportunities for improving reliability and efficiency of the system.
The use of route-impacting control information, such as subscription filters and access control rules, has the potential to enable efficient routing for applications that require selective and regional distribution of events. Such applications range from financial information systems to sensor networks to service-oriented architectures. However, it has been a great challenge to design correct and efficient protocols for distributing control information and exploiting it to achieve efficient and highly available message routing.
In this dissertation, we study the problem of distributing and utilizing route-impacting control information. We present an abstract model of content-based routing and reliable delivery in redundant broker networks. Based on this model, we design a generic algorithm that propagates control information and performs content-based routing and delivers events reliably. The algorithm is efficient and light-weight in that it does not require heavy-weight consensus protocols between redundant brokers. We extend this generic algorithm to support consolidation and merging of control information. Existing protocols can be viewed as particular encodings and optimizations of the generic algorithm. We show an encoding using virtual time vectors that supports reliable delivery and deterministic dynamic access control in redundant broker networks. In our system, the semantics of reliable delivery is clearly defined even if subscription information and access control policy can dynamically change. That is, one or more subscribers of same principal will receive exactly the same sequence of messages (modulo subscription filter differences) regardless of where they are connected and the network latency and failure conditions in their parts of the network.
We have implemented these protocols in a fully-functioning content-based publish/subscribe system - Gryphon. We evaluate its efficiency, scalability and high availability.
Recent studies showed potential for using component frameworks for building flexible, adaptable applications for deployment in distributed environments. However, this approach is hindered by the complexity of deployment of component-based applications, which usually involves a great deal of configuration of both the application components and system services they depend on. In this paper we propose an infrastructure for automatic dynamic deployment of J2EE applications that specifically addresses the problems of (1) inter-component connectivity specification and its effects on component configuration and deployment; and (2) application component dependencies on application server services, their configuration and deployment. The proposed infrastructure provides simple yet expressive abstractions for potential application adaptation through dynamic deployment and undeployment of components. We implement the infrastructure as a part of the JBoss J2EE application server and test it on several sample J2EE applications.
Motivation: Current microarray data analysis techniques draw the biologist's attention to targeted sets of genes but do not otherwise present global and dynamic perspectives (e.g., invariants) inferred collectively over a dataset. Such perspectives are important in order to obtain a process-level understanding of the underlying cellular machinery, especially how cells react, respond, and recover from stresses.
Results: We present GOALIE, a novel computational approach and software system that uncovers formal temporal logic models of biological processes from time course microarray datasets. GOALIE `redescribes' data into the vocabulary of biological processes and then pieces together these redescriptions into a Kripke-structure model, where possible worlds encode transcriptional states and are connected to future possible worlds. This model then supports various query, inference, and comparative assessment tasks, besides providing descriptive process-level summaries. An application of GOALIE to characterizing the yeast (S. cerevisiae) cell cycle is described.
Availability: GOALIE runs on Windows XP platforms and is available on request from the authors.
The theory of recursive data types is a valuable modeling tool for software verification. In the past, decision procedures have been proposed for both the full theory and its universal fragment. However, previous work has been limited in various ways, including an inability to deal with multiple constructors, multi-sorted logic, and mutually recursive data types. More significantly, previous algorithms for the universal case have been based on inefficient nondeterministic guesses and have been described in fairly complex procedural terms.
We present an algorithm which addresses these issues for the universal theory. The algorithm is presented declaratively as a set of abstract rules which are terminating, sound, and complete. We also describe strategies for applying the rules and explain why our recommended strategy is more efficient than those used by previous algorithms. Finally, we discuss how the algorithm can be used within a broader framework of cooperating decision procedures.
This paper describes a new large-scale motion capture based game called Squidball. It was tested on audiences of up to 4000 players last summer at SIGGRAPH 2004. It required building the world's largest motion capture space and the largest motion capture markers (balls), and overcoming many other challenges in technology, production, game play, and social studies. Our aim was to entertain the SIGGRAPH Electronic Theater audience with a cooperative and energetic game that is played by everybody together, in controlling real-time graphics and audio, while bouncing and batting multiple large helium filled balloons across the entire theater space. We detail in this paper all the lessons learned in producing such a system and game, and argue why we believe Squidball was a great success.
In recent years, domain decomposition methods have attracted much attention due to their successful application to many elliptic and parabolic problems. Domain decomposition methods treat problems based on a domain substructuring, which is attractive for parallel computation, due to the independence among the subdomains. In principle, domain decomposition methods may be applied to the system resulting from a standard discretization of the parabolic problems or, directly, be carried out through a direct discretization of parabolic problems. In this paper, a direct domain decomposition method is introduced to discretize the parabolic problems. The stability and convergence of this algorithm are analyzed, and an $O(\tau+h)$ error bound is provided.
There is a growing awareness, both in industry and academia, of the crucial role of formally verifying the translation from high-level source-code into low-level object code that is typically performed by an optimizing compiler. Formally verifying an optimizing compiler, as one would verify any other large program, is not feasible due to its size, ongoing evolution and modification, and possibly, proprietary considerations. Translation validation is a novel approach that offers an alternative to the verification of translators in general and compilers in particular: Rather than verifying the compiler itself, one constructs a validation tool which, after every run of the compiler, formally confirms that the target code produced in the run is a correct translation of the source program. This thesis work takes an important step towards ensuring an extremely high level of confidence in compilers targeted at EPIC architectures.
In this thesis, we focus on the translation validation of structure preserving optimizations, i.e. transformations that do not modify programs' structure in a major way. This category of optimizations covers most of the global optimizations performed by compilers. This thesis has two main parts. One develops a proof rule that formally establishes the correctness of structure preserving transformations based on computational induction. The other part is the development of a tool that applies the proof rule to the automatic validation of global optimizations performed by Intel's ORC compiler for the IA-64 architecture. With minimal instrumentation from the compiler, the tool constructs ``verification conditions'' -- formal theorems that, if valid, establish the correctness of a translation. The verification conditions are then transferred to an automatic theorem prover that checks their validity. Together, the tool offers a fully automatic method to formally establish the correctness of each translation.
We present a nonlinear image representation based on multiscale local orientation measurements. Specifically, an image is first decomposed using a two-orientation steerable pyramid, a tight-frame representation in which the basis functions are directional derivatives of a radially symmetric blurring operator. The pair of subbands at each scale are thus gradients of progressively blurred copies of the original image. We then discard the magnitude information and retain only the orientation of each gradient vector. We develop a method for reconstructing the original image from this orientation information using an algorithm based on projection onto convex sets, and demonstrate its robustness to quantization.
The growing popularity of XML Web Services is resulting in a significant increase in the proportion of Internet traffic that involves requests to and responses from Web Services. Unfortunately, web service responses, because they are generated dynamically, are considered ``uncacheable'' by traditional caching infrastructures. One way of remedying this situation is by developing alternative caching infrastructures, which improve performance using on-demand service replication, data offloading, and request redirection. These infrastructures benefit from two characteristics of web service traffic --- (1) the open nature of the underlying protocols, SOAP, WSDL, UDDI, which results in service requests and responses adhering to a well-formatted, widely known structure; and (2) the observation that for a large number of currently deployed data-centric services, requests can be interpreted as structured accesses against a physical or virtual database --- but require that there be sufficient locality in service usage to offset replication and redirection costs.
This paper investigates whether such locality does in fact exist in current web service workloads. We examine access logs from two large data-centric web service sites, SkyServer and TerraServer, to characterize workload locality across several dimensions: data space, network regions, and different time epochs. Our results show that both workloads exhibit a high degree of spatial and network locality: 10\% of the client IP addresses in the SkyServer trace contribute to about 99.95\% of the requests, and 99.94\% of the requests in the TerraServer trace are directed towards regions that represent less than 10\% of the overall data space accessible through the service. Our results point to the substantial opportunity for improving Web Services scalability by on-demand service replication.
Many of the data-centric network services deployed today hold massive volumes of data at their origin websites, accessing the data to dynamically generate responses. Such dynamic responses are poorly supported by traditional caching infrastructures and result in poor performance and scalability for such services. One way of remedying this situation is to develop alternative caching infrastructures, which can dynamically detect the often large degree of service usage locality and leverage such information to on-demand replicate and redirect requests to service portions at appropriate network locations. Key to building such infrastructures is the ability to cluster and inspect client requests, at various points across a wide-area network.
This paper presents a zone-based scheme for constructing oriented overlays, which provide such an ability. Oriented overlays differ from previously proposed unstructured overlays in supporting network traffic flows from many sources towards one (or a small number of) destinations, and vice-versa. A good oriented overlay would offer sufficient clustering ability without adversely affecting path latencies. Our overlay construction scheme organizes participating nodes into different zones according to their latencies from the origin server(s), and has each node associate with one or more parents in another zone closer to the origin. Extensive experiments with a PlanetLab-based implementation of our scheme show that it produces overlays that (1) are robust to network dynamics; (2) offer good clustering ability; and (3) minimally impact end-to-end network latencies seen by clients.
Formal verification is important in designing reliable computer systems. For a critical software system, it is not enough to have a proof of correctness for the source code; there must also be an assurance that the compiler produces a correct translation of the source code into the target machine code. Verifying the correctness of modern optimizing compilers is a challenging task because of their size, their complexity, and their evolution over time.
In this thesis, we focus on the Translation Validation of loop optimizations. In order to validate the optimizations performed by the compiler, we try to prove the equivalence of the intermediate codes before and after the optimizations. Previously, there was a set of proof rules for building the equivalence relation between two programs. However, these rules cannot validate some cases with legal loop optimizations. We propose new proof rules to consider the conditions of loops and possible elimination of some loops, so that those cases can also be handled. According to these new proof rules, algorithms are designed to apply them to an automatic validation process.
Based on the above proof rules, we implement an automatic validation tool for loop optimizations which analyzes the loops, guesses what kinds of loop optimizations occur, proves the validity of a combination of loop optimizations, and synthesizes a series of intermediate codes. We integrate this new loop tool into our translation validation tool TVOC, so that TVOC handles not only optimizations which do not significantly change the structure of the code, but also loop optimizations which do change the structure greatly. With this new part, TVOC has succeeded in validating many examples with loop optimizations.
Speculative optimizations are the aggressive optimizations that are only correct under certain conditions that cannot be known at compile time. In this thesis, we present the theory and algorithms for validating speculative optimizations and generating the runtime tests necessary for speculative optimizations. We also provide several examples and the results of the algorithms for speculative optimizations.
Preconditioned conjugate gradient methods based on two-level overlapping Schwarz methods often perform quite well. Such a preconditioner combines a coarse space solver with local components which are defined in terms of subregions which form an overlapping covering of the region on which the elliptic problem is defined. Precise bounds on the rate of convergence of such iterative methods have previously been obtained in the case of conforming lower order and spectral finite elements as well as in a number of other cases. In this paper, this domain decomposition algorithm and analysis are extended to mortar finite elements. It is established that the condition number of the relevant iteration operator is independent of the number of subregions and varies with the relative overlap between neighboring subregions linearly as in the conforming cases previously considered.
Many modern wide-area distributed systems are component-based. This approach provides great flexibility in adapting applications to the changing state of the environment and user requirements, but increases the complexity of configuring the applications. Because of the scale and heterogeneity of modern wide-area environments, manual configuration is hard, inefficient, suboptimal, and error-prone. Automated application configuration is desired.
Constructing distributed applications requires choosing a set of components that will constitute the application instance and assigning network resources to component executions and data transfers. Stated this way, the application configuration problem (ACP) is similar to the planning (action selection) and scheduling (resource allocation) problems studied by the Artificial Intelligence (AI) community.
This thesis investigates the problem of solving the ACP using AI planning techniques. However, the ACP poses several challenges not usually encountered and addressed by the traditional AI solutions. The problem specification for the ACP can be much larger than the solution, with the relevant portions only identified during the search. Additionally, the interactions between planning operators are numeric rather than logical. Finally, it is desirable to be able to trade off quality of the solution versus search time.
We show that the ACP is undecidable in general. Therefore, instead of a single algorithm, we propose a set of techniques that can be used to compose an algorithm for a particular variety of the ACP that can exploit natural restrictions exhibited by that variety. These techniques address the challenges above by dynamically obtaining portions of the problem specification as necessary during the search, using envelope hierarchies based on numeric information for pruning and search guidance, and discretizing continuous variables to approximate numeric parameters without restricting the form of supported numeric functions.
We illustrate these techniques by describing their use in algorithms tailored for two specific varieties of the ACP --- snapshot configurations for dynamic component-based frameworks, and scheduling of grid workflows with replica selection and explicit resource reservations. Experimental evaluation of the performance of these two algorithms shows that the techniques successfully achieve their goals, with acceptable run-time overhead.
A BDDC (balancing domain decomposition by constraints) algorithm is developed for elliptic problems with mortar discretizations for geometrically non-conforming partitions in both two and three spatial dimensions. The coarse component of the preconditioner is defined in terms of one mortar constraint for each edge/face which is an intersection of the boundaries of a pair of subdomains. A condition number bound of the form $C \max_i \left\{ (1+\log (H_i/h_i) )^3 \right\}$ is established. In geometrically conforming cases, the bound can be improved to $C \max_i \left\{ (1+\log (H_i/h_i) )^2 \right\}$. This estimate is also valid in the geometrically nonconforming case under an additional assumption on the ratio of mesh sizes and jumps of the coefficients. This BDDC preconditioner is also shown to be closely related to the Neumann-Dirichlet preconditioner for the FETI--DP algorithms of \cite{K-04-3d,KL-02} and it is shown that the eigenvalues of the BDDC and FETI--DP methods are the same except possibly for an eigenvalue equal to 1.
In this paper, a FETI-DP formulation for the three dimensional elasticity problem on non-matching grids over a geometrically conforming subdomain partition is considered. To resolve the nonconformity of the finite elements, a mortar matching condition on the subdomain interfaces (faces) is imposed. By introducing Lagrange multipliers for the mortar matching constraints, the resulting linear system becomes similar to that of a FETI-DP method. In order to make the FETI-DP method efficient for solving this linear system, a relatively large set of primal constraints, which include average and momentum constraints over interfaces (faces) as well as vertex constraints, is introduced. A condition number bound $C(1+\log(H/h))^2$ for the FETI-DP formulation with a Neumann-Dirichlet preconditioner is then proved for the elasticity problems with discontinuous material parameters when only some faces are chosen as primal faces on which the average and momentum constraints will be imposed. An algorithm which selects a quite small number of primal faces is also discussed.
The purpose of this paper is to extend the BDDC (balancing domain decomposition by constraints) algorithm to saddle-point problems that arise when mixed finite element methods are used to approximate the system of incompressible Stokes equations. The BDDC algorithms are iterative substructuring methods, which form a class of domain decomposition methods based on the decomposition of the domain of the differential equations into nonoverlapping subdomains. They are defined in terms of a set of primal continuity constraints, which are enforced across the interface between the subdomains and which provide a coarse space component of the preconditioner. Sets of such constraints are identified for which bounds on the rate of convergence can be established that are just as strong as previously known bounds for the elliptic case. In fact, the preconditioned operator is effectively positive definite, which makes the use of a conjugate gradient method possible. A close connection is also established between the BDDC and FETI-DP algorithms for the Stokes case.
The standard BDDC (balancing domain decomposition by constraints) preconditioner is shown to be equivalent to a preconditioner built from a partially subassembled finite element model. This results in a system of linear algebraic equations which is much easier to solve in parallel than the fully assembled model; the cost is then often dominated by that of the problems on the subdomains. An important role is also played, both in theory and practice, by an average operator. In addition, exact Dirichlet solvers are used on the subdomains in order to eliminate the residual in the interior of the subdomains. The use of inexact solvers for these problems and even the replacement of the Dirichlet solvers by a trivial extension are considered. It is established that one of the resulting algorithms has the same eigenvalues as the standard BDDC algorithm, and the connection of another with the FETI-DP algorithm with a lumped preconditioner is also considered. Multigrid methods are used in the experimental work. Under certain assumptions, it can be established that the iteration count essentially remains the same as when exact solvers are used, while considerable gains in the speed of the algorithm can be realized, since the cost of the exact solvers grows superlinearly with the size of the subdomain problems while the cost of the multigrid methods grows linearly.
Two popular non-overlapping domain decomposition methods, the FETI--DP and BDDC algorithms, are reformulated using block Cholesky factorizations, an approach which can provide a useful framework for the design of domain decomposition algorithms for solving symmetric positive definite linear systems of equations. Instead of introducing Lagrange multipliers to enforce the coarse-level primal continuity constraints in these algorithms, a change of variables is used such that each primal constraint corresponds to an explicit degree of freedom. With the new formulations of these algorithms, a simplified proof is provided that the spectra of a pair of FETI--DP and BDDC algorithms, with the same set of primal constraints, are the same. Numerical experiments also confirm this result.
Title: Real-time rendering of normal maps with discontinuities (NYU-CS-TR872) Authors: Evgueni Parilov, Ilya Rosenberg and Denis Zorin Abstract:
Normal mapping uses normal perturbations stored in a texture to give objects a more geometrically complex appearance without increasing the number of geometric primitives. Standard bi- and trilinear interpolation of normal maps works well if the normal field is continuous, but may result in visible artifacts in the areas where the field is discontinuous, which is common for surfaces with creases and dents.
In this paper we describe a real-time rendering technique which preserves the discontinuity curves of the normal field at sub-pixel level, together with its GPU implementation. Our representation of the piecewise-continuous normal field is based on approximations of the distance function to the discontinuity set and its gradient. Using these approximations we can efficiently reconstruct discontinuities at arbitrary resolution and ensure that no normals are interpolated across the discontinuity. We also describe a method for updating the normal field along the discontinuities in real time, based on blending the original field with the one calculated from a user-defined surface profile.
Presently, there is no clear way to determine if the current body of biological facts is sufficient to explain phenomenology. Rigorous mathematical models with automated tools for reasoning, simulation, and computation can be of enormous help to uncover cognitive flaws, qualitative simplification or overly generalized assumptions. The approaches developed by control theorists analyzing stability of a system with feedback, physicists studying asymptotic properties of dynamical systems, computer scientists reasoning about discrete or hybrid (combining discrete events with continuous events) reactive systems---all have tried to address some aspects of the same problem in a very concrete manner. We explore here how biological processes could be studied in a similar manner, and how the appropriate tools for this purpose can be created.
In this paper, we suggest a possible confluence of the theory of hybrid automata and the techniques of algorithmic algebra to create a computational basis for systems biology. We start by discussing the basis for this choice, semi-algebraic hybrid systems, and recognize both its power and its limitations. We explore solutions to the bounded-reachability problem through symbolic computation methods, applied to the descriptions of the traces of the hybrid automaton. Because the description of the automaton is through semi-algebraic sets, the evolution of the automaton can be described even in cases where system parameters and initial conditions are unspecified. Nonetheless, semi-algebraic decision procedures provide a succinct description of algebraic constraints over the initial values and parameters for which proper behavior of the system can be expected. In addition, by keeping track of conservation principles in terms of constraint or invariant manifolds on which the system must evolve, we avoid many of the obvious pitfalls of numerical approaches.
Ongoing improvements to the performance and accessibility of less conventional input modalities such as speech and gesture recognition now provide new dimensions for interface designers to explore. Yet there is a scarcity of commercial applications which utilize these modalities either independently or multimodally. This scarcity partially results from a lack of development tools and design guidelines to facilitate the use of speech and gesture.
An integral aspect of the user interface design process is the ability to easily evaluate various design solutions through an iterative process of prototyping and testing. Through this process, guidelines emerge that aid in the design of future interfaces. Today there is no shortage of tools supporting the development of conventional interfaces. However, there are no resources that allow interface designers to easily prototype and quickly test, via remote distribution, interface designs utilizing speech and gesture.
The thesis work for this dissertation explores the development of an Extensible MultiModal Environment Toolkit (EMMET) for prototyping and remotely testing speech and gesture based multimodal interfaces to three-dimensional environments. The overarching goals for this toolkit are to allow its users to: explore speech and gesture based interface design without requiring an understanding of the details involved in the low-level implementation of speech or gesture recognition, quickly distribute their multimodal interface prototypes via the Web, and receive multimodal usage statistics collected remotely after each use of their application.
EMMET ultimately contributes to the field of multimodal user interface design by providing an environment to existing user interface developers in which speech and gesture recognition have been seamlessly integrated into their palette of user input options. Such seamless integration serves to increase the utilization within applications of speech and gesture modalities by removing any actual or perceived deterrents to the use of these modalities versus the use of conventional modalities. EMMET additionally strives to improve the quality of speech and gesture based interfaces by supporting the prototype-and-test development cycle through its Web distribution and usage statistics collection capabilities. These capabilities also allow developers to realize new design guidelines specific to the use of speech and gesture.
We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This ``push'' near the top of the list can be chosen arbitrarily large or small. We choose $\ell_p$-norms to provide a specific type of push; as $p$ becomes large, the algorithm concentrates harder near the top of the list.
We derive a generalization bound based on the $p$-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data.
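The objective described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the form used here is an assumption: an exponential pairwise loss between each positive and each negative example, summed over positives, raised to the power $p$, and then summed over negatives. The function name `p_norm_push_objective` is hypothetical.

```python
import math


def p_norm_push_objective(pos_scores, neg_scores, p):
    """Assumed form: sum over negatives of (sum over positives of exp(-(s_pos - s_neg))) ** p.

    A negative example that scores high (near the top of the list) yields a
    large inner sum, and raising it to the power p magnifies its contribution.
    """
    total = 0.0
    for s_neg in neg_scores:
        inner = sum(math.exp(-(s_pos - s_neg)) for s_pos in pos_scores)
        total += inner ** p
    return total


def top_negative_share(pos_scores, neg_scores, p):
    """Fraction of the objective contributed by the highest-scoring negative."""
    top = max(neg_scores)
    return p_norm_push_objective(pos_scores, [top], p) / p_norm_push_objective(
        pos_scores, neg_scores, p
    )
```

Under these assumptions, comparing `top_negative_share` for a small and a large value of `p` shows how the objective concentrates on negatives near the top of the list as $p$ grows.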
A burst is a large number of events occurring within a certain time window. As an unusual activity, it is a noteworthy phenomenon in many natural and social processes. Many data stream applications require the detection of bursts across a variety of window sizes. For example, stock traders may be interested in bursts having to do with institutional purchases or sales that are spread out over minutes or hours. Detecting a burst over any of $k$ window sizes, a problem we call {\em elastic burst detection}, in a stream of length $N$ naively requires $O(kN)$ time. Previous work \cite{DiscoveryBook03} showed that a simple Shifted Binary Tree structure can reduce this time substantially (in very favorable cases, close to $O(N)$) by filtering away obvious non-bursts. Unfortunately, for certain data distributions, the filter marks many windows of events as possible bursts, even though a detailed check shows them to be non-bursts.
In this paper, we present a new algorithmic framework for elastic burst detection: a family of data structures that generalizes the Shifted Binary Tree. We then present a heuristic search algorithm to find an efficient structure among the many offered by the framework, given the input. We study how different inputs affect the desired structures. Experiments on both synthetic and real world data show a factor of up to 35 times improvement compared with the Shifted Binary Tree over a wide variety of inputs, depending on the data distribution. We show an example application that identifies interesting correlations between bursts of activity in different stocks.
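For reference, the naive $O(kN)$ baseline mentioned in this abstract can be sketched as follows. This is only the brute-force check, not the Shifted Binary Tree or the generalized framework; the function name `naive_elastic_bursts`, the per-window-size threshold dictionary, and the "sum meets threshold" definition of a burst are assumptions made for illustration.

```python
def naive_elastic_bursts(stream, thresholds):
    """Return (window_size, end_index) pairs whose window sum meets its threshold.

    thresholds maps each window size w to the burst threshold for that size
    (an assumed interface). Each window size is handled with one sliding pass,
    so k window sizes over a stream of length N cost O(k * N) in total.
    """
    bursts = []
    for w, threshold in sorted(thresholds.items()):
        if w > len(stream):
            continue
        # Sum of the first full window, then slide one event at a time.
        window_sum = sum(stream[:w])
        if window_sum >= threshold:
            bursts.append((w, w - 1))
        for end in range(w, len(stream)):
            window_sum += stream[end] - stream[end - w]
            if window_sum >= threshold:
                bursts.append((w, end))
    return bursts
```

A filtering structure such as the one described above would aim to skip most of these per-window checks and run the detailed check only on windows flagged as possible bursts.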
Client interactions with modern web-accessible network services are typically organized into sessions involving multiple requests that read and write shared application data. Therefore, when executed concurrently, web sessions may invalidate each other's data. Depending on the nature of the business represented by the service, allowing the session with invalid data to progress might lead to financial penalties for the service provider, while blocking the session's progress and deferring its execution (e.g., by relaying its handling to the customer service) will most probably result in user dissatisfaction. A compromise would be to tolerate some bounded data inconsistency, which would allow most of the sessions to progress, while limiting the potential financial loss incurred by the service. In order to quantitatively reason about these tradeoffs, the service provider can benefit from models that predict metrics, such as the percentage of successfully completed sessions, for a certain degree of tolerable data inconsistency.
This paper develops such analytical models of concurrent web sessions with bounded inconsistency in shared data for three popular concurrency control algorithms. We illustrate our models using the sample buyer scenario from the TPC-W e-Commerce benchmark, and validate them by showing their close correspondence to measured results of concurrent session execution in both a simulated and a real web server environment. Our models take as input parameters of service usage, which can be obtained through profiling of incoming client requests. We augment our web application server environment with a profiling and automated decision making infrastructure which is shown to successfully choose, based on the specified performance metric, the best concurrency control algorithm in real time in response to changing service usage patterns.
In recent years, the increase in the amounts of available genomic as well as gene expression data has provided researchers with the necessary information to train and test various models of gene origin, evolution, function and regulation. In this thesis, we present novel solutions to key problems in computational biology that deal with nucleotide sequences (horizontal gene transfer detection), amino-acid sequences (protein sub-cellular localization prediction), and gene expression data (transcription factor - binding site pair discovery). Different pattern discovery techniques are utilized, such as maximal sequence motif discovery and maximal itemset discovery, and combined with support vector machines in order to achieve significant improvements over previously proposed methods.
Two inexact coarse solvers for Balancing Domain Decomposition by Constraints (BDDC) algorithms are introduced and analyzed. These solvers help remove a bottleneck for the two-level BDDC algorithms related to the cost of the coarse problem when the number of subdomains is large. At the same time, a good convergence rate is maintained.
BDDC algorithms are also developed for the linear systems arising from flow in porous media discretized with mixed and hybrid finite elements. Our methods are proven to be scalable and the condition numbers of the operators with our BDDC preconditioners grow only polylogarithmically with the size of the subdomain problems.
BDDC methods are nonoverlapping iterative substructuring domain decomposition methods for the solution of large sparse linear algebraic systems arising from discretization of elliptic boundary value problems. Their coarse problem is given by a small number of continuity constraints which are enforced across the interface. The coarse problem matrix is generated and factored by direct solvers at the beginning of the computation, and it can ultimately become a bottleneck if the number of subdomains is very large.
In this paper, two three-level BDDC methods are introduced for solving the coarse problem approximately in three dimensions. This is an extension of previous work for the two-dimensional case. Since vertex constraints alone do not suffice to obtain a polylogarithmic condition number bound, edge constraints are considered in this paper. Some new technical tools are then needed in the analysis, and this makes the three-dimensional case more complicated than the two-dimensional case.
Estimates of the condition numbers are provided for the two three-level BDDC methods and numerical experiments are also discussed.
The BDDC (balancing domain decomposition by constraints) methods have been applied successfully to solve the large sparse linear algebraic systems arising from conforming finite element discretizations of elliptic boundary value problems. In this paper, the scalar elliptic problems for flow in porous media are discretized by a hybrid finite element method which is equivalent to a nonconforming finite element method. The BDDC algorithm is extended to these problems which originate as saddle point problems.
Edge/face average constraints are enforced across the interface and the same rate of convergence is obtained as in conforming cases. The condition number of the preconditioned system is estimated and numerical experiments are discussed.
The BDDC (balancing domain decomposition by constraints) algorithms are similar to the balancing Neumann-Neumann methods, with a small number of continuity constraints enforced across the interface throughout the iterations. These constraints form a coarse, global component of the preconditioner. The BDDC methods are powerful for solving large sparse linear algebraic systems arising from discretizations of elliptic boundary value problems. In this paper, the BDDC algorithm is extended to saddle point problems generated from the mixed finite element methods used to approximate the scalar elliptic problems for flow in porous media.
Edge/face average constraints are enforced and the same rate of convergence is obtained as for simple elliptic cases. The condition number bound is estimated and numerical experiments are discussed. In addition, a comparison of the BDDC method with an edge/face-based iterative substructuring method is provided.
Verification plays an indispensable role in designing reliable computer hardware and software systems. With the fast growth in design complexity and the quick turnaround in design time, formal verification has become an increasingly important technology for establishing correctness as well as for finding difficult bugs. Since there is no ``silver-bullet'' to solve all verification problems, a spectrum of powerful techniques in formal verification have been developed to tackle different verification problems and complexity issues. Depending on the nature of the problem whose most salient components are the system implementation and the property specification, a proper methodology or a combination of different techniques is applied to solve the problem.
In this thesis, we focus on the research and development of formal methods to uniformly verify parameterized systems. A parameterized system is a class of systems obtained by instantiating the system parameters. Parameterized verification seeks a single correctness proof of a property for the entire class. Although the general parameterized verification problem is undecidable [AK86], it is possible to solve special classes by applying a repertoire of techniques and heuristics. Many methods in parameterized verification require a great deal of human interaction. This makes the application of these methods to real world problems infeasible. Thus, the main focus of this research is to develop techniques that can be automated to deliver proofs of safety and liveness properties.
Our research combines various formal techniques such as deductive methods, abstraction and model checking. One main result in this thesis is an automatic deductive method for parameterized verification. We apply small model properties of Bounded Data Systems (BDS, a special type of parameterized system) to help prove deductive inference rules for the safety properties of such systems. Another methodology we developed enables us to prove liveness properties of parameterized systems via an automatic abstraction method called counter abstraction. There are several useful by-products of our research: a set of heuristics is established for the automatic generation of program invariants, which can benefit deductive verification in general, and we also propose methodologies for the automatic abstraction of fairness conditions that are crucial for proving liveness properties.
In a mobile ad hoc network, mobile nodes communicate with each other through wireless links. Mobility causes frequent topology changes. This thesis addresses the fundamental challenges mobility presents to on-demand routing protocols and to TCP.
On-demand routing protocols use route caches to make routing decisions. Due to mobility, cached routes easily become stale. To address the cache staleness issue, prior work used adaptive timeout mechanisms. However, heuristics cannot accurately estimate timeouts because topology changes are unpredictable. I propose to proactively disseminate the broken link information to the nodes that have cached the link. I define a new cache structure called a cache table to maintain the information necessary for cache updates, and design a distributed cache update algorithm. This algorithm is the first work that proactively updates route caches in an adaptive manner. Simulation results show that proactive cache updating is more efficient than adaptive timeout mechanisms. I conclude that proactive cache updating is key to the adaptation of on-demand routing protocols to mobility.
TCP does not perform well in mobile ad hoc networks. Prior work provided link failure feedback to TCP so that it can avoid invoking congestion control mechanisms for packet losses caused by route failures. Simulation results show that my cache update algorithm significantly improves TCP throughput since it reduces the effect of mobility on TCP. TCP still suffers from frequent data and ACK losses. I propose to make routing protocols aware of lost TCP packets and help reduce TCP timeouts. I design two mechanisms that exploit cross-layer information awareness: early packet loss notification (EPLN) and best-effort ACK delivery (BEAD). EPLN notifies TCP senders about lost data. BEAD retransmits ACKs at intermediate nodes or at TCP receivers. Simulation results show that the two mechanisms significantly improve TCP throughput. I conclude that cross-layer information awareness is key to making TCP efficient in the presence of mobility.
I also study the impact of route caching strategies on the scalability of on-demand routing protocols with mobility. I show that making route caches adapt quickly and efficiently to topology changes is key to the scalability of on-demand routing protocols with mobility.
Information Extraction is the automatic extraction of facts from text, which includes detection of named entities, entity relations and events. Conventional approaches to Information Extraction try to find syntactic patterns based on deep processing of text, such as partial or full parsing. The problem these solutions have to face is that as deeper analysis is used, the accuracy of the result decreases, and one cannot recover from the induced errors. On the other hand, lower level processing is more accurate and it can also provide useful information. However, within the framework of conventional approaches, this kind of information cannot be efficiently incorporated.
This thesis describes a novel supervised approach based on kernel methods to address these issues. In this approach customized kernels are used to match syntactic structures produced from different preprocessing phases. Using properties of a kernel, individual kernels are combined into composite kernels to integrate and extend all the information. The composite kernels can be used with various classifiers, such as Nearest Neighbor or Support Vector Machines (SVM). The main classifier we propose to use is SVM due to its ability to generalize in large dimensional feature spaces. We will show that each level of syntactic information can contribute to IE tasks, and low level information can help to recover from errors in deep processing.
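The way individual kernels combine into composite kernels can be sketched as below. It relies on the standard closure properties that a nonnegative weighted sum and a product of valid kernels are again valid kernels. The word n-gram kernels here are simple stand-ins, not the customized kernels of the thesis, and all function names are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the word n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_kernel(n):
    """Return a kernel computing the dot product of word n-gram count vectors."""
    def k(a, b):
        ca, cb = ngram_counts(a, n), ngram_counts(b, n)
        return float(sum(ca[g] * cb[g] for g in ca))
    return k


def kernel_sum(k1, k2, w1=1.0, w2=1.0):
    """Composite kernel: nonnegative weighted sum of two kernels."""
    def k(a, b):
        return w1 * k1(a, b) + w2 * k2(a, b)
    return k


def kernel_product(k1, k2):
    """Composite kernel: pointwise product of two kernels."""
    def k(a, b):
        return k1(a, b) * k2(a, b)
    return k
```

A composite such as `kernel_sum(ngram_kernel(1), ngram_kernel(2))` can then be evaluated on every pair of training examples to build a Gram matrix for any classifier that accepts precomputed kernels.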
The new approach has demonstrated state-of-the-art performance on two benchmark tasks. The first task is detecting slot fillers for management succession events (MUC-6). For this task, two types of kernels were designed: a surface kernel based on word n-grams and a kernel built on sentence dependency trees. The second task is the ACE RDR evaluation, which is to recognize relations between entities in text from newswire and broadcast news transcripts. For this task, five kernels were built to represent information from sentence tokenization, syntactic parsing and dependency parsing. Experimental results for the two tasks will be shown and discussed.
We describe an efficient algorithm to construct genome wide haplotype restriction maps of an individual by aligning single molecule DNA fragments collected with Optical Mapping technology. Using this algorithm and a small amount of genomic material, we can construct the parental haplotypes for each diploid chromosome for any individual, one from the father and the other from the mother. Since such haplotype maps reveal the polymorphisms due to single nucleotide differences (SNPs) and small insertions and deletions (RFLPs), they are useful in association studies, studies involving genomic instabilities in cancer, and genetics. For instance, such haplotype restriction maps of individuals in a population can be used in association studies to locate genes responsible for genetic diseases with relatively low cost and high throughput. If the underlying problem is formulated as a combinatorial optimization problem, it can be shown to be NP-complete (a special case of the K-population problem). But by effectively exploiting the structure of the underlying error processes and using a novel analog of the Baum-Welch algorithm for HMM models, we devise a probabilistic algorithm with a time complexity that is linear in the number of markers. The algorithms were tested by constructing the first genome wide haplotype restriction map of the microbe T. pseudonana, as well as constructing a haplotype restriction map of a 120 Megabase region of Human chromosome 4. The frequency of false positives and false negatives was estimated using simulated data. The empirical results were found to be very promising.
This short paper describes a systems biology software tool that can engage in a dialogue with a biologist by responding to questions posed to it in English (or another natural language) regarding the behavior of a complex biological system, and by suggesting a set of facts about the biological system based on a time-tested generate-and-test approach. Thus, this bioinformatics system improves the quality of the interaction that a biologist can have with a system built on rigorous mathematical modeling, without needing to be aware of the underlying mathematically sophisticated concepts or notations. Given the nature of the mathematical semantics of our Simpathica/XSSYS tool, it was possible to construct a well-founded natural language interface on top of the computational kernel. We discuss our tool and illustrate its use with a few examples. The natural language subsystem is available as an integrated subsystem of the Simpathica/XSSYS tool and through a simple Web-based interface; we describe both systems in the paper. More details about the system can be found at: http://bioinformatics.nyu.edu, and its sub-pages.
A considerable number of research projects are exploring how to extend object-oriented programming languages such as Java with, for example, support for generics, multiple dispatch, or pattern matching. To keep up with these changes, language implementors need appropriate tools. In this context, easily extensible parser generators are especially important because parsing program sources is a necessary first step for any language processor, be it a compiler, syntax-highlighting editor, or API documentation generator. Unfortunately, context-free grammars and the corresponding LR or LL parsers, while well understood and widely used, are also unnecessarily hard to extend. To address this lack of appropriate tools, we introduce Rats!, a parser generator for Java that supports easily modifiable grammars and avoids the complexities associated with altering LR or LL grammars. Our work builds on recent research on packrat parsers, which are recursive descent parsers that perform backtracking but also memoize all intermediate results (hence their name), thus ensuring linear-time performance. Our work makes this parsing technique, which has been developed in the context of functional programming languages, practical for object-oriented languages. Furthermore, our parser generator supports simpler grammar specifications and more convenient error reporting, while also producing better performing parsers through aggressive optimizations. In this paper, we motivate the need for more easily extensible parsers, describe our parser generator and its optimizations in detail, and present the results of our experimental evaluation.
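The packrat technique itself (recursive descent with backtracking plus memoization of every intermediate result) can be sketched in a few lines. This is not the Rats! grammar language or its generated Java code; it is a hand-written Python illustration for an assumed toy grammar, `Expr <- Product '+' Expr / Product`, `Product <- Number '*' Product / Number`, `Number <- [0-9]+`, and the class and method names are hypothetical.

```python
class PackratParser:
    """Recursive descent evaluator that memoizes each (rule, position) result."""

    def __init__(self, text):
        self.text = text
        self.memo = {}

    def _apply(self, rule, pos):
        # Each rule runs at most once per input position, which is what
        # bounds the total work despite backtracking.
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.memo[key] = rule(pos)
        return self.memo[key]

    # Every rule returns (value, next_pos) on success or None on failure.
    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return (int(self.text[pos:end]), end) if end > pos else None

    def product(self, pos):
        left = self._apply(self.number, pos)
        if left is None:
            return None
        value, nxt = left
        if nxt < len(self.text) and self.text[nxt] == "*":
            rest = self._apply(self.product, nxt + 1)
            if rest is not None:
                return value * rest[0], rest[1]
        # Second alternative after backtracking: served from the memo table.
        return self._apply(self.number, pos)

    def expr(self, pos):
        left = self._apply(self.product, pos)
        if left is None:
            return None
        value, nxt = left
        if nxt < len(self.text) and self.text[nxt] == "+":
            rest = self._apply(self.expr, nxt + 1)
            if rest is not None:
                return value + rest[0], rest[1]
        # Second alternative after backtracking: served from the memo table.
        return self._apply(self.product, pos)

    def parse(self):
        result = self._apply(self.expr, 0)
        if result is None or result[1] != len(self.text):
            raise SyntaxError("input does not match the grammar")
        return result[0]
```

When the first alternative of a choice fails, the second alternative re-requests the same rule at the same position and finds it in the memo table instead of re-parsing, which is the property that keeps backtracking from multiplying the work.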
A key problem in contemporary distributed systems is how to satisfy user quality of service (QoS) requirements for distributed applications deployed in heterogeneous, dynamically changing environments spanning multiple administrative domains.
An attractive solution is to create an infrastructure which satisfies user QoS requirements by automatically and transparently adapting distributed applications to any environment changes with minimum user input. However, successful use of this approach requires overcoming three challenges: (1) Capturing the application behavior and its relationship with the environment as a set of compact local specifications, using both general, quantitative (e.g., CPU usage) and qualitative (e.g., security) properties. Such information should be sufficient to reason about the global behavior of the application deployment. (2) Finding the ``best'' application deployment that satisfies both application and user requirements, and the various domain policies. The search algorithm should be complete, efficient, scalable with regard to application and network sizes, and guarantee optimality (e.g., resources consumed by applications). (3) Ensuring that the found deployments are practical and efficient, i.e., that the efficiency of automatic deployments is comparable with the efficiency of hand-tuned solutions.
This dissertation describes three techniques that address these challenges in the context of component-based applications. The modularity and reusability of the latter enable automatic deployments while supporting reasoning about the global connectivity based on the local information exposed by each component. The first technique extends the basic component-based application model with information about conditions and effects of component deployments and linkages, together with interactions between components and the network. The second technique uses AI planning to build an efficient and scalable algorithm which exploits the expressivity of the application model to find an application deployment that satisfies user QoS and application requirements. The last technique ensures that application deployments are both practical and efficient, by leveraging language and run-time system support to automatically customize components, as appropriate for the desired security and data consistency guarantees. These techniques are implemented as integral parts of the Partitionable Services Framework (PSF), a Java-based framework which flexibly assembles component-based applications to suit the properties of their environment. PSF facilitates on-demand, transparent migration and replication of application components at locations closer to clients, while retaining the illusion of a monolithic application.
The benefits of PSF are evaluated by deploying representative component-based applications in an environment simulating fast and secure domains connected by slow and insecure links. Analysis of the programming and the deployment processes shows that: (1) the code modifications required by PSF are minimal, (2) PSF appropriately adapts the deployments based on the network state and user QoS requirements, (3) the run-time deployment overheads incurred by PSF are negligible compared to the application lifetime, and (4) the efficiency of PSF-deployed applications matches that of hand-crafted solutions.
Wide-area network applications are increasingly being built using component-based models, enabling integration of diverse functionality in modules distributed across the network. In such models, dynamic component selection and deployment enables an application to flexibly adapt to changing client and network characteristics, achieve load balancing, and satisfy QoS requirements. Unfortunately, the problem of finding a valid component deployment is hard because one needs to decide on the set of components while satisfying various constraints resulting from application semantic requirements, network resource limitations, and interactions between the two. In this paper, we describe a general model for the component placement problem and present an algorithm for it, which is based on AI planning algorithms. We validate the effectiveness of our algorithm by demonstrating its scalability with respect to network size and number of components in the context of deployments generated for two example applications, a security-sensitive mail service and a webcast service, in a variety of network environments.
Dual-Primal FETI methods are nonoverlapping domain decomposition methods where some of the continuity constraints across subdomain boundaries are required to hold throughout the iterations, as in primal iterative substructuring methods, while most of the constraints are enforced by Lagrange multipliers, as in one-level FETI methods. The purpose of this article is to develop strategies for selecting these constraints, which are enforced throughout the iterations, such that good convergence bounds are obtained, which are independent of even large changes in the stiffnesses of the subdomains across the interface between them. A theoretical analysis is provided and condition number bounds are established which are uniform with respect to arbitrarily large jumps in the Young's modulus of the material and otherwise only depend polylogarithmically on the number of unknowns of a single subdomain.
Bioinformatics is a challenging area for computer science, since the underlying computational formalisms span database systems, numerical methods, geometric modeling and visualization, imaging and image analysis, combinatorial algorithms, data analysis and mining, statistical approaches, and reasoning under uncertainty.
This thesis describes the Valis environment for rapid application prototyping in bioinformatics. The core components of the Valis system are the underlying database structure and the algorithmic development platform.
This thesis presents a novel set of data structures that has marked advantages when dealing with unstructured and unbounded data that are common in scientific fields and bioinformatics.
Bioinformatics problems rarely have a one-language, one-platform solution. The Valis environment allows seamless integration between scripts written in different programming languages and includes tools to rapidly prototype graphical user interfaces.
To date, the speed of computation of most whole genome analysis tools has stood in the way of developing fast interactive programs that may be used as exploratory tools. This thesis presents the basic algorithms and widgets that permit rapid prototyping of whole genomic scale real-time applications within Valis.
Many objects in computer graphics applications are represented by surfaces. This representation works very well for objects of simple topology, but can get prohibitively expensive for objects with complex small-scale geometrical details.
Volumetric textures aligned with a surface can be used to add topologically complex geometric details to an object, while retaining an underlying simple surface structure. The simple surface structure provides great controllability on the overall shape of the model, and volumetric textures handle geometric details and topological changes efficiently.
Adding a volumetric texture to a surface requires more than a conventional two-dimensional parameterization: a part of the space surrounding the surface has to be parameterized. Another problem with using volumetric textures for adding geometric detail is the difficulty of rendering implicitly represented surfaces, especially when they are changed interactively.
We introduce thick surfaces to represent objects with topologically complex geometric details. A thick surface consists of three components. First, a base mesh of simple structure is used to approximate the overall shape of the object. Second, a layer of space along the base mesh is parameterized. We define the layer of space as a shell, which covers the geometric details of the object. Third, volumetric textures of geometric details are mapped into the shell. The object is represented as the implicit surface encoded by the volumetric textures. Places without volumetric textures are filled with patches of the base mesh.
We present algorithms for constructing a shell around a surface and rendering a volumetric-textured surface. A mipmap technique for volumetric textures is explored as well. The gradient field of a generalized distance function is used to construct a non-self-intersecting shell, which has other properties desirable for volumetric texture mapping. The rendering algorithm is designed and implemented on NVIDIA GeForceFX video chips. Finally, we demonstrate a number of interactive operations that these algorithms enable.
In any domain with change, the dimension of time is inherently involved. Whether the domain should be modeled in discrete time or continuous time depends on aspects of the domain to be modeled. Many complex real-world domains involve continuous time, resources, metric quantities and concurrent actions. Planning in such domains must necessarily go beyond simple discrete models of time and change.
In this thesis, we show how the SAT-based planning framework can be extended to generate plans of concurrent asynchronous actions that may depend on or change piecewise linear metric constraints in continuous time.
In the SAT-based planning framework, a planning problem is formulated as a satisfiability problem of a set of propositional constraints (axioms) such that any model of the axioms corresponds to a valid plan. There are two parameters to a SAT-based planning system: an encoding scheme for representing plans of bounded length and a propositional SAT solver to search for a model. The LPSAT architecture is composed of a SAT solver integrated with a linear arithmetic constraint solver in order to deal with metric aspects of domains.
We present encoding schemes for temporal models of continuous time defined in PDDL+: (i) durative actions with discrete and/or continuous changes; (ii) a real-time temporal model with exogenous events and autonomous processes capturing continuous changes. The encoding represents, in a CNF formula over arithmetic constraints and propositional fluents, time-stamped parallel plans possibly with concurrent continuous and/or discrete changes. In addition, we present encoding schemes for multi-capacity resources, partitioned interval resources, and metric quantities which are represented as intervals. An interval type can be used as a parameter to an action as well as a fluent type.
Based on the LPSAT engine, the TM-LPSAT temporal metric planner has been implemented: given a PDDL+ representation of a planning problem, the compiler of TM-LPSAT translates it into a CNF formula, which is fed into the LPSAT engine to find a solution corresponding to a plan for the planning problem. We have also experimented with our temporal metric encodings using another decision procedure, MathSAT, which deals with propositional combinations of linear constraints and Boolean variables. The results show that, in terms of search time, the SAT-based approach to temporal metric planning can be comparable to other planning approaches, and there is plenty of room to push the limits of the SAT-based approach further.
I present a new focal-plane analog VLSI sensor that estimates optical flow in two visual dimensions. The chip significantly improves on previous approaches with respect to both the applied model of optical flow estimation and the actual hardware implementation. Its distributed computational architecture consists of an array of locally connected motion units that collectively solve for the unique optimal optical flow estimate. The novel gradient-based motion model assumes visual motion to be translational, smooth and biased. The model guarantees that the estimation problem is computationally well-posed regardless of the visual input. Model parameters can be globally adjusted, leading to a rich output behavior. Varying the smoothness strength, for example, can provide a continuous spectrum of motion estimates, ranging from normal to global optical flow. Unlike approaches that rely on the explicit matching of brightness edges in space or time, the applied gradient-based model assures spatiotemporal continuity on visual information. The non-linear coupling of the individual motion units improves the resulting optical flow estimate because it reduces spatial smoothing across large velocity differences. Extended measurements of a 30x30 array prototype sensor under real-world conditions demonstrate the validity of the model and the robustness and functionality of the implementation.
The task of Information Extraction (IE) is to find specific types of information in natural language text. In particular, *event extraction* identifies instances of a particular type of event or fact (a particular "scenario"), including the entities involved, and fills a database which has been pre-defined for the scenario. As the number of documents available on-line has multiplied, entity extraction has grown in importance for various applications, including tracking terrorist activities from newswire sources and building a database of job postings from the Web, to name a few.
Linguistic contexts, such as predicate-argument relationships, have been widely used as *extraction patterns* to identify the items to be extracted from the text. The cost of creating extraction patterns for each scenario has been a bottleneck limiting the portability of information extraction systems to different scenarios, although there has been some research on semi-supervised pattern discovery procedures to reduce this cost. The challenge is to develop a fully automatic method for identifying extraction patterns for a scenario specified by the user.
This dissertation presents a novel approach for the unsupervised discovery of extraction patterns for event extraction from raw text. First, we present a framework that allows the user to have a self-customizing information extraction system for his/her query: the Query-Driven Information Extraction (QDIE) framework. The input to the QDIE framework is the user's query: either a set of keywords or a narrative description of the event extraction task.
Second, we assess the improvement in extraction pattern models. By considering the shortcomings of the prior work based on predicate-argument models and their extensions, we propose a novel extraction pattern model that is based on arbitrary subtrees of dependency trees.
Third, we address the issue of portability across languages. As a case study of the QDIE framework, we implemented a pre-CODIE system, a Cross-Lingual On-Demand Information Extraction system requiring minimal human intervention, which incorporates the QDIE framework as a component for pattern discovery. In addition, we assess the role of machine translation in cross-lingual information extraction by comparing translation-based implementations.
BDDC methods are nonoverlapping iterative substructuring domain decomposition methods for the solutions of large sparse linear algebraic systems arising from discretization of elliptic boundary value problems. They are similar to the balancing Neumann-Neumann algorithm. However, in BDDC methods, a small number of continuity constraints are enforced across the interface, and these constraints form a new coarse, global component. An important advantage of using such constraints is that the Schur complements that arise in the computation will all be strictly positive definite. The coarse problem is generated and factored by a direct solver at the beginning of the computation. However, this problem can ultimately become a bottleneck, if the number of subdomains is very large. In this paper, two three-level BDDC methods are introduced for solving the coarse problem approximately in two dimensional space, while still maintaining a good convergence rate. Estimates of the condition numbers are provided for the two three-level BDDC methods and numerical experiments are also discussed.
This dissertation presents an efficient and high-order boundary integral solver for the Stokes equations in complex 3D geometries. The targeted applications of this solver are the flow problems in domains involving moving boundaries. In such problems, traditional finite element methods involving 3D unstructured mesh generation experience difficulties. Our solver uses the indirect boundary integral formulation and discretizes the equation using the Nyström method.
Although our solver is designed for the Stokes equations, we show that it can be generalized to other constant coefficient elliptic partial differential equations (PDEs) with non-oscillatory kernels.
First, we present a new geometric representation of the domain boundary. This scheme takes quadrilateral control meshes with arbitrary geometry and topology as input, and produces smooth surfaces approximating the control meshes. Our surfaces are parameterized over several overlapping charts through explicit nonsingular C^∞ parameterizations, depend linearly on the control points, have fixed-size local support for basis functions, and have good visual quality.
Second, we describe a kernel independent fast multipole method (FMM) and its parallel implementation. The main feature of our algorithm is that it is based only on kernel evaluation and does not require the multipole expansions of the underlying kernel. We have tested our method on kernels from a wide range of elliptic PDEs. Our numerical results indicate that our method is efficient and accurate. Other advantages include the simplicity of the implementation and its immediate extension to other elliptic PDE kernels. We also present an MPI based parallel implementation which scales well up to thousands of processors.
Third, we present a framework to evaluate the singular integrals in our solver. A singular integral is decomposed into a smooth far field part and a local part that contains the singularity. The smooth part of the integral is integrated using the trapezoidal rule over overlapping charts, and the singular part is integrated in the polar coordinates which removes or decreases the order of singularity. We also describe a novel algorithm to integrate the nearly singular integrals coming from the evaluation at points close to the boundary.
Note: A significantly improved and expanded description of this material is available in the book High Performance Discovery in Time Series (Springer Verlag, 2004, ISBN 0-387-00857-8).
As extremely large time series data sets grow more prevalent in a wide variety of settings, we face the significant challenge of developing efficient analysis methods. This dissertation addresses the problem of designing fast, scalable algorithms for the analysis of time series.
The first part of this dissertation describes the framework for high performance time series data mining based on important primitives. Data reduction transforms such as the Discrete Fourier Transform, the Discrete Wavelet Transform, Singular Value Decomposition and Random Projection can reduce the size of the data without substantial loss of information, and therefore provide a synopsis of the data. Indexing methods organize data so that the time series data can be retrieved efficiently. Transformations on time series, such as shifting, scaling, time shifting, time scaling and dynamic time warping, facilitate the discovery of flexible patterns from time series.
The second part of this dissertation integrates the above primitives into useful applications ranging from music to physics to finance to medicine.
StatStream
StatStream is a system based on fast algorithms for finding the most highly correlated pairs of time series from among thousands of time series streams and doing so in a moving window fashion. It can be used to find correlations in time series in finance and in scientific applications.
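The moving-window correlation described above can be illustrated with a deliberately naive sketch. The code below recomputes the Pearson correlation from scratch at every window position; it is not the fast algorithm the system is based on, and the function names `pearson` and `sliding_correlation` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def sliding_correlation(a, b, window):
    """Correlation of a and b over every window of the given length."""
    count = min(len(a), len(b)) - window + 1
    return [pearson(a[i:i + window], b[i:i + window]) for i in range(count)]


if __name__ == "__main__":
    series_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    series_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(sliding_correlation(series_a, series_b, window=3))
```

Finding the most highly correlated pairs among thousands of streams this way would require a quadratic number of such computations per window, which is the cost a fast method has to avoid.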
HumFinder
Most people hum rather poorly. Nevertheless, somehow people have some idea what we are humming when we hum. The goal of the query by humming program, HumFinder, is to make a computer do what a person can do. Using pitch translation, time dilation, and dynamic time warping, one can match an inaccurate hum to a melody remarkably accurately.
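Of the techniques named above, dynamic time warping is the easiest to show in isolation. The following is a minimal sketch of the textbook dynamic-programming formulation, assuming pitch sequences given as plain numbers and absolute difference as the local cost; it makes no claim about how HumFinder itself implements or accelerates the matching, and `dtw_distance` is a hypothetical name.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], where each
    step may advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b[j-1] over another element of a
                cost[i][j - 1],      # stretch a[i-1] over another element of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


if __name__ == "__main__":
    melody = [60, 62, 64]
    slow_hum = [60, 60, 62, 62, 64, 64]  # same notes, each held twice as long
    print(dtw_distance(melody, slow_hum))
```

A hum that holds each note twice as long as the melody aligns at zero cost, which is the sense in which warping tolerates inaccurate timing.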
OmniBurst
Burst detection is the activity of finding abnormal aggregates in data streams. Our software, OmniBurst, can detect bursts of varying durations. Our example applications are monitoring gamma rays and stock market price volatility. The software makes use of a shifted wavelet structure to create a linear time filter that can guarantee that no bursts will be missed at the same time that it guarantees (under a reasonable statistical model) that the filter eliminates nearly all false positives.
Title: A kernel-independent fast multipole algorithm (NYU-CS-TR839) Authors: George Biros, Lexing Ying, and Denis Zorin Abstract: We present a new fast multipole method for particle simulations. The main feature of our algorithm is that it is kernel independent, in the sense that no analytic expansions are used to represent the far field. Instead we use equivalent densities, which we compute by solving small Dirichlet-type boundary value problems. The translations from the sources to the induced potentials are accelerated by singular value decomposition in 2D and fast Fourier transforms in 3D. We have tested the new method on the single and double layer operators for the Laplacian, the modified Laplacian, the Stokes, the modified Stokes, the Navier, and the modified Navier operators in two and three dimensions. Our numerical results indicate that our method compares very well with the best known implementations of the analytic FMM method for both the Laplacian and modified Laplacian kernels. Its advantage is the (relative) simplicity of the implementation and its immediate extension to more general kernels.
Two major techniques have been proposed for using the structure of links in the World Wide Web to determine the relative significance of Web pages. The PageRank algorithm \cite{BP98}, which is a critical part of the Google search engine, gives a single measure of importance of each page in the Web. The HITS algorithm \cite{K98} applies to a set of pages believed relevant to a given query, and assigns two values to each page: the degree to which the page is a hub and the degree to which it is an authority. Both algorithms have a natural interpretation in terms of a random walk over the set of pages involved, and in both cases the computation involved amounts to computing an eigenvector of the transition matrix for this random walk.
This paper surveys the literature discussing these two techniques and their variants, and their connection to random walks and eigenvector computation. It also discusses the stability of these techniques under small changes in the Web link structure.
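The link between a random walk and an eigenvector computation can be made concrete with power iteration on a toy graph. This is a sketch under stated assumptions rather than the formulation of any surveyed paper: the damping factor 0.85 and the uniform jump from pages without outgoing links are common conventions adopted here for illustration, and `pagerank` and the example graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the stationary distribution of a damped random walk.

    links maps each page to the list of pages it points to; every target
    must also appear as a key. Repeatedly applying the transition step
    converges to the dominant eigenvector of the transition matrix.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Probability mass from the random jump, shared equally by all pages.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page with no outgoing links is treated as linking everywhere.
            targets = links[page] or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

In this graph page `c` is linked from both other pages and ends up with the highest score, while `b`, which nothing links to, keeps only the mass supplied by the random jump.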
The current standard correlation coefficient used in the analysis of microarray data, including gene expression arrays, was introduced in [1]. Its formulation is rather arbitrary. We give a mathematically rigorous derivation of the correlation coefficient of two gene expression vectors based on James-Stein Shrinkage estimators. We use the background assumptions described in [1], also taking into account the fact that the data can be treated as transformed into normal distributions. While [1] uses zero as an estimator for the expression vector mean μ, we start with the assumption that for each gene, μ is itself a zero-mean normal random variable (with a priori distribution N(0,τ^2)), and use Bayesian analysis to update that belief, to obtain a posteriori distribution of μ in terms of the data. The estimator for μ, obtained after shrinkage towards zero, differs from the mean of the data vectors and ultimately leads to a statistically robust estimator for correlation coefficients.
To evaluate the effectiveness of shrinkage, we conducted in silico experiments and also compared similarity metrics on a biological example using the data set from [1]. For the latter, we classified genes involved in the regulation of yeast cell-cycle functions by computing clusters based on various definitions of correlation coefficients, including the one using shrinkage, and contrasting them against clusters based on the activators known in the literature. In addition, we conducted an extensive computational analysis of the data from [1], empirically testing the performance of different values of the shrinkage factor γ and comparing them to the values of γ corresponding to the three metrics addressed here, namely, γ=0 for the Eisen metric, γ=1 for the Pearson correlation coefficient, and γ computed from the data for the Shrinkage metric.
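The role of γ as a dial between the Eisen and Pearson metrics can be sketched as follows. The code assumes that each expression vector is centered on γ times its sample mean and normalized about that same offset; this reading is an interpretation of the description above, not a formula quoted from the derivation. The estimate of γ from the data is omitted, and `offset_correlation` is a hypothetical name.

```python
from math import sqrt


def offset_correlation(x, y, gamma):
    """Correlation of x and y about the offsets gamma * mean(x) and gamma * mean(y).

    gamma = 0 measures deviations from zero (uncentered correlation);
    gamma = 1 measures deviations from the sample means (Pearson correlation).
    Raises ZeroDivisionError if either vector equals its offset everywhere.
    """
    n = len(x)
    offset_x = gamma * sum(x) / n
    offset_y = gamma * sum(y) / n
    spread_x = sqrt(sum((v - offset_x) ** 2 for v in x) / n)
    spread_y = sqrt(sum((v - offset_y) ** 2 for v in y) / n)
    total = sum((a - offset_x) * (b - offset_y) for a, b in zip(x, y))
    return total / (n * spread_x * spread_y)


if __name__ == "__main__":
    gene_f = [1.0, 2.0, 3.0]
    gene_g = [3.0, 2.0, 1.0]
    for gamma in (0.0, 0.5, 1.0):
        print(gamma, offset_correlation(gene_f, gene_g, gamma))
```

For these two vectors the value moves from positive at γ=0 to -1 at γ=1, which shows how strongly the choice of offset can affect which genes are judged similar.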
The estimated "false-positives" and "false-negatives" from this study indicate the relative merits of clustering algorithms based on different statistical correlation coefficients as well as the sensitivity of the clustering algorithm to small perturbations in the correlation coefficients. These results indicate that using the shrinkage metric improves the accuracy of the analysis.
All derivation steps are described in detail; all mathematical assertions used in the derivation are proven in the appendix.
[1] Eisen, M.B., Spellman, P.T., Brown, P.O., and Botstein, D. (1998), PNAS USA 95, 14863-14868.
Two complementary approaches have been proposed to achieve high performance inter-process coordination on highly parallel shared-memory systems. Gottlieb et al. introduced the technique of combining concurrent memory references, thereby reducing hot spot contention and enabling the bottleneck-free execution of algorithms referencing a small number of shared variables. Mellor-Crummey and Scott introduced an alternative distributed local-spin technique that minimizes hot spot contention by not polling hot spot variables and exploiting the availability of processor-local shared memory. My principal contributions are a comparison of these two approaches, and significant improvements to the former.
The NYU Ultra3 prototype is the only system built that implements memory reference combining. My research utilizes micro-benchmark simulation studies of massively parallel Ultra3 systems executing coordination algorithms. This investigation detects problems in the Ultra3 design that result in higher-than-expected memory latency for reference patterns typical of busy-wait polling. This causes centralized coordination algorithms to perform poorly. Several architectural enhancements are described that significantly reduce the latency of these access patterns, thereby improving the performance of the centralized algorithms.
I investigate existing centralized algorithms for readers-writers and barrier coordination, all of which require fetch-and-add, and discover variants that require fewer memory accesses (and hence have shorter latency). In addition, my evaluation includes novel algorithms that require only a restricted form of fetch-and-add.
Coordination latency of these algorithms executed on the enhanced combining architecture is compared to the latency of the distributed local-spin alternatives. These comparisons indicate that the distributed local-spin dissemination barrier, which generates no hot spot traffic, has latency slightly inferior to the best centralized algorithms investigated. However, for the less structured readers-writers problem, the centralized algorithms significantly outperform the distributed local-spin algorithm.
Trust management systems enable the construction of access-control infrastructures suitable for protecting sensitive resources from access by unauthorized agents. State-of-the-art systems of this kind (i) are fail-safe in that access will be denied when authorizing credentials are revoked, (ii) can mitigate the risk of insider attacks using mechanisms for threshold authorization in which several independent partially trusted agents are required to co-sponsor sensitive activities, and (iii) are capable of enforcing intra- and inter-organizational access control policies.
Despite these advantages, trust management systems are limited in their ability to express partial trust. Additionally, they are cumbersome to administer when there are a large number of related access rights with differing trust (and thereby access) levels due to the need for explicit enumeration of the exponential number of agent combinations. More importantly, these systems have no provision for fault tolerance in cases where a primary authorization is lost (perhaps due to revocation), but others are available. Such situations may result in a cascading loss of access and possible interruption of service.
In this paper, we propose extending traditional trust management systems through a framework of reliability and confidence metrics. This framework naturally captures partial trust relationships, thereby reducing administrative complexity of access control systems with multiple related trust levels and increasing system availability in the presence of authorization faults while still maintaining equivalent safety properties.
Memory system congestion due to serialization of hot spot accesses can adversely affect the performance of interprocess coordination algorithms. Hardware and software techniques have been proposed to reduce this congestion and thereby provide superior system performance. The combining networks of Gottlieb et al. automatically parallelize concurrent hot spot memory accesses, improving the performance of algorithms that poll a small number of shared variables. The widely used MCS distributed-spin algorithms take a software approach: they reduce hot spot congestion by polling only variables stored locally. Our investigations detected performance problems in existing designs for combining networks and we propose mechanisms that alleviate them. Simulation studies described herein indicate that centralized readers-writers algorithms executed on the improved combining networks have performance at least competitive with the MCS algorithms.
Despite increases in network bandwidth, accessing network services across a wide area network still remains a challenging task. The difficulty mainly comes from the heterogeneous and constantly changing network environment, which usually causes undesirable user experience for network-oblivious applications.
A promising approach to address this is to provide network awareness in communication paths. While several such path-based infrastructures have been proposed, the network awareness provided by them is rather limited. Many challenging problems remain, in particular: (1) how to automatically create effective network paths whose performance is optimized for encountered network conditions, (2) how to dynamically reconfigure such paths when network conditions change, and (3) how to manage and distribute network resources among different paths and between different network regions. Furthermore, there is poor understanding of the benefits of using the path-based approach over other alternatives.
This dissertation describes solutions for these problems, built into a programmable network infrastructure called Composable Adaptive Network Services (CANS). The CANS infrastructure provides applications with network-aware communication paths that are automatically created and dynamically modified. CANS highlights four key mechanisms: (1) a high-level integrated type-based specification of components and network resources; (2) automatic path creation strategies; (3) system support for low-overhead path reconfiguration; and (4) distributed strategies for managing and allocating network resources.
We evaluate these mechanisms using experiments with typical applications running in the CANS infrastructure, and extensive simulation on a large scale network topology to compare with other alternatives. Experimental results validate the effectiveness of our approach, verifying that (1) the path-based approach provides the best and the most robust performance under a wide range of network configurations as compared to end-point or proxy-based alternatives; (2) automatic generation of network-aware paths is feasible and provides considerable performance advantages, requiring only minimal input from applications; (3) path reconfiguration strategies ensure continuous adaptation and provide desirable adaptation behaviors by using automatically generated paths; (4) both run-time overhead and reconfiguration time of CANS paths are negligible for most applications; (5) the resource management and allocation strategies allow effective setup of shared resource pools in the network and sharing of resources among paths.
Because of heterogeneous and dynamically changing network environments, content delivery across the network requires system support for coping with different network conditions in order to provide satisfactory user experiences. Despite the existence of many adaptation frameworks, the question of which adaptation approach performs best under which network configurations remains unanswered. The performance implications of different adaptation approaches (end-point, proxy-based and path-based approaches) have not been studied yet. This paper aims to address this shortcoming by conducting a series of simulation-based experiments to compare performance among these adaptation approaches under different network configurations. In order to make a fair comparison, approach-neutral strategies are proposed for constructing communication paths and managing network resources. The experimental results show that there are well-defined network environments under which each of these approaches delivers its best performance; and among them, the path-based approach, which uses the whole communication path to do adaptation, provides the best and the most robust performance under different network configurations, and for different types of servers and clients.
Balancing Neumann-Neumann methods are extended to the equations arising from the mixed formulation of almost-incompressible linear elasticity problems discretized with discontinuous-pressure finite elements. This family of domain decomposition algorithms has previously been shown to be effective for large finite element approximations of positive definite elliptic problems. Our methods are proved to be scalable and to depend weakly on the size of the local problems. Our work is an extension of previous work by Pavarino and Widlund on BNN methods for the Stokes equations.
Our iterative substructuring methods are based on the partition of the unknowns into interior ones - including interior displacements and pressures with zero average on every subdomain - and interface ones - displacements on the geometric interface and constant-by-subdomain pressures. The restriction of the problem to the interior degrees of freedom is then a collection of decoupled local problems that are well-posed even in the incompressible limit. The interior variables are eliminated and a hybrid preconditioner of BNN type is designed for the Schur complement problem. The iterates are restricted to a benign subspace, on which the preconditioned operator is positive definite, allowing for the use of conjugate gradient methods.
A complete convergence analysis of the method is presented for the constant coefficient case. The algorithm is extended to handle discontinuous coefficients, but a full analysis is not provided. Extensions of the algorithm and of the analysis are also presented for problems combining pure-displacement and mixed finite elements in different subregions. An algorithm is also proposed for problems with continuous discrete pressure spaces.
All the algorithms discussed have been implemented in parallel codes that have been successfully tested on large sample problems on large parallel computers; results are presented and discussed. Implementation issues are also discussed, including a version of our main algorithm that does not require the solution of any auxiliary saddle-point problem since all subproblems of the preconditioner can be reduced to solving symmetric positive definite linear systems.
Since the debut of the World Wide Web, Web users have been facing the following problems:
[Extended Semantics]
When we read or study a digital document that we wish to explore further, typically, we interrupt our work to start a search. It costs time.
[Reverse Hyperlink]
When we visit a web page, we might be curious about what other hyperlinks point to the visited page. These links would most likely be of related interest. Can we get "real time" information about what other pages are pointing to this page?
[Version Control]
Many of us have been frustrated and even annoyed when the hyperlink that we follow gives us a "404 not found" or the retrieved webpage content is entirely different from the one we have bookmarked. Could we also have access to the past versions even if the hyperlink has been removed or the content has been changed?
[Composition Assistant]
Writing is not an easy task. We labor to structure a body of text, sort out ideas, find materials, and digest information. We would like an automated service that can associate the content we have produced with other contexts(on the Web) and bring these web contexts to us for reference.
In this thesis, we provide a unified framework and architecture, named enriched content, to resolve the above problems. We apply the architecture and show how the enriched content can be used in each application. We demonstrate that this method can be a new way of writing add-on functions for various document applications without having to write individual plug-in for each application or re-write each application. We also briefly discuss possible future development.
An order-dependent query is one whose result (interpreted as a multi-set) changes if the order of the input records is changed. In a stock-quotes database, for instance, retrieving all quotes concerning a given stock for a given day does not depend on order, because the collection of quotes does not depend on order. By contrast, finding the five-price moving average in a trade table gives a result that depends on the order of the table. Query languages based on the relational data model can handle order-dependent queries only through add-ons. SQL:1999, for example, permits the use of a data ordering mechanism called a "window" in limited parts of a query. As a result, order-dependent queries become difficult to write in those languages, and optimization techniques for these features, applied as pre- or post-enumerating phases, are generally crude. The goal of this paper is to show that when order is a property of the underlying data model and algebra, writing order-dependent queries in a language can be as natural as is their optimization. We introduce AQuery, an SQL-like query language and algebra that has from-the-ground-up support for order. We also present a framework for optimization of the categories of order-dependent queries it expresses. The framework is able to take advantage of the large body of query transformations on relational systems while incorporating new ones described here. We show by experiment that the resulting system is orders of magnitude faster than current SQL:1999 systems on many natural order-dependent queries.
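The distinction between order-dependent and order-independent queries can be illustrated with a minimal Python sketch. This is not AQuery syntax; the function name `moving_average` and the sample prices are hypothetical and chosen only for illustration.

```python
from collections import deque


def moving_average(prices, window=5):
    """Return the running mean of the last `window` prices, in input order."""
    recent = deque(maxlen=window)  # keeps at most `window` most recent prices
    total = 0.0
    averages = []
    for price in prices:
        if len(recent) == window:
            total -= recent[0]  # the oldest price is evicted by the append below
        recent.append(price)
        total += price
        averages.append(total / len(recent))
    return averages


# Hypothetical trade prices, listed in arrival order.
trades = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

in_order = moving_average(trades)
reversed_order = moving_average(list(reversed(trades)))

# A selection returns the same multiset whatever the input order is.
selected_a = sorted(p for p in trades if p > 12.0)
selected_b = sorted(p for p in reversed(trades) if p > 12.0)
```

Comparing `sorted(in_order)` with `sorted(reversed_order)` shows that the moving average yields a different multiset when the input is reordered, whereas `selected_a` and `selected_b` are identical.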
We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two untrusted components, a block store and a consistency server. The block store holds all file data and most metadata. Without interpreting metadata, it presents a simple interface for clients to store variable-sized data blocks and later retrieve them by cryptographic hash.
The consistency server implements a novel protocol that guarantees close-to-open consistency whenever users see each other's updates. The protocol roughly consists of users exchanging version-stamped digital signatures of block server metadata, though a number of subtleties arise in efficiently supporting concurrent clients and group-writable files. We have proven the protocol's security under basic cryptographic assumptions. Without somehow producing signed messages valid under a user's (or the superuser's) public key, an attacker cannot tamper with a user's files---even given control of the servers and network. Despite this guarantee, SUNDR performs within a reasonable factor of existing insecure network file systems.
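The idea of storing blocks and retrieving them by cryptographic hash can be sketched as follows. This is a toy in-memory illustration, not SUNDR's actual interface: the class name `BlockStore`, its methods, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store (hypothetical; not SUNDR's interface)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so the store needs no
        # knowledge of what the block means.
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # A client re-hashes what it receives; a store that returns
        # different bytes for this key is detected here.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("block does not match its hash")
        return data


store = BlockStore()
key = store.put(b"file contents")
roundtrip = store.get(key)
```

Because the key is the hash of the content, the check in `get` lets a client detect substituted data without trusting the store.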
The problem of program optimization is a non-trivial one. Compilers do a fair job, but can't always deliver the best performance. The expressibility of general-purpose languages is limited, not allowing programmers to describe expected run time behavior, for example, and some programs are thus more amenable to optimization than others, depending on what the compiler expects to see. We present a generic framework that allows addressing this problem in two ways: through specifying verifiable source annotations to guide compiler analyses, and through optimistically using some assumptions and analysis results for the subset of the program seen so far. Two novel applications are presented, one for each of the above approaches: a dynamic optimistic interprocedural type analysis algorithm, and a mechanism for specifying immutability assertions. Both applications result in measurable speedups, demonstrating the feasibility of each approach.
We present a robust algorithm for estimating non-rigid motion in video sequences. We build on recent methods for tracking video by enforcing global structure (such as rank constraints) on the tracking. These methods assume color constancy in the neighborhood of each tracked feature, an assumption that is violated by occlusions, deformations, lighting changes, and other effects. Our method identifies outliers while solving for flow. This allows us to obtain high-quality tracking from difficult sequences, even when there is no single "reference frame" in which all tracks are visible.
In many cases, censoring documents on the Internet is a fairly simple task. Almost any published document can be traced back to a specific host, and from there to an individual responsible for the material. Someone wishing to censor this material can use the courts, threats, or other means of intimidation to compel the relevant parties to delete the material or remove the host from the network. Even if these methods prove unsuccessful, various denial of service attacks can be launched against a host to make the document difficult or impossible to retrieve. Unless a host's operator has a strong interest in preserving a particular document, removing it is often the easiest course of action.
A censorship-resistant publishing system allows an individual to publish a document in such a way that it is difficult, if not impossible, for an adversary to completely remove, or convincingly alter, a published document. One useful technique for ensuring document availability is to replicate the document widely on servers located throughout the world. However, replication alone does not block censorship. Replicas need to be protected from accidental or malicious corruption. In addition, a censorship-resistant publishing system needs to address a number of other important issues, including protecting the publisher's identity while simultaneously preventing storage flooding attacks by anonymous users.
This dissertation presents the design and implementation of two very different censorship-resistant publishing systems. The first system, Publius, is a web based system that allows an individual to publish, update, delete and retrieve documents in a secure manner. Publius's main contributions include an automatic tamper checking mechanism, a method for updating or deleting anonymously published content and methods for publishing anonymously hyperlinked content. The second system, Tangler, is a peer-to-peer based system whose contributions include a unique publication mechanism and a dynamic self-policing network. The benefits of this new publication mechanism include the automatic replication of previously published content and an incentive to audit the reliability with which servers store content published by other people. In part through these incentives, the self-policing network identifies and ejects servers that exhibit faulty behavior.
Several link-based algorithms, such as PageRank [19], HITS [15] and SALSA [16], have been developed to evaluate the popularity of web pages. These algorithms can be interpreted as computing the steady-state distribution of various Markov processes over web pages. The PageRank and HITS algorithms tend to over-rank tightly interlinked collections of pages, such as well-organized message boards. We show that this effect can be alleviated using a number of modifications to the underlying Markov process. Specifically, rather than weight all outlinks from a given page equally, greater weight is given to links between pages that are, in other respects, further off in the web, and less weight is given to links between pages that are nearby. We have experimented with a number of variants of this idea, using a number of different measures of ``distance'' in the Web, and a number of different weighting schemes. We show that these revised algorithms often do avoid the over-ranking problem and give better overall rankings.
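A minimal sketch of a PageRank-style power iteration with non-uniform outlink weights is given below. It only illustrates how link weights enter the Markov process; the function name `weighted_pagerank`, the damping value, and the example weights are assumptions, and the abstract's actual distance measures and weighting schemes are not reproduced.

```python
def weighted_pagerank(weights, damping=0.85, iterations=100):
    """Steady-state estimate for a surfer who follows outlinks with
    probability proportional to their weight.

    weights[u][v] is the non-negative weight of the link u -> v.
    """
    nodes = sorted(set(weights) | {v for outs in weights.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            outs = weights.get(u, {})
            total = sum(outs.values())
            if total == 0:
                # Page with no outlinks: spread its rank uniformly.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in outs.items():
                    new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


# Hypothetical graph: page "a" links to a "nearby" page "b" and a "distant" page "c".
uniform = weighted_pagerank({"a": {"b": 1.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}})
reweighted = weighted_pagerank({"a": {"b": 0.1, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}})
```

With uniform weights, `b` and `c` receive equal rank; down-weighting the link to the nearby page `b` shifts rank toward `c`.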
Edge detection is a fundamental problem of computer vision and has been widely investigated. We propose a new framework for edge detection based on edge profiles.
Our model, based on one-dimensional qualitative edge profile fitting and edge consistency, will produce one continuous edge from an initial seed point. A "profile" is defined as a finite cross-section of a two-dimensional image along a line segment. "Edge consistency" means that all the profiles on the same edge should be consistent.
Appropriate evaluation functions are needed for different types of edge profiles, such as step edges, ramp edges, etc. An evaluation function must meet the requirement that it will produce local minima at the positions where edges of a given type occur in the profile. Instead of subjective thresholding, image noise is measured statistically and used as a systematic way of filtering false edges. We describe our method as "qualitative edge profile fitting" because it is not based on arbitrary thresholding. Once an edge point is localized, it can be extended into an edge by matching compatible profiles. Two profiles are considered compatible as long as their average difference is within the noise measurement. Another feature of our approach is its subpixel accuracy. The utilization of profiles and noise-induced threshold selection makes tasks such as joining broken edges more objective.
We develop the necessary algorithms and implement them. Different evaluation functions are constructed for different edge models and experimented on different one-dimensional profiles. The edge detector, using these evaluation functions, is then examined using different images and under different noise conditions.
On-demand routing protocols use route caches to make routing decisions. Due to frequent topology changes, cached routes easily become stale. To address the cache staleness issue in DSR (the Dynamic Source Routing protocol), prior work mainly used heuristics with ad hoc parameters to predict the lifetime of a link or a route. However, heuristics cannot accurately predict timeouts because topology changes are unpredictable. In this paper, we present a novel distributed cache update algorithm to make route caches adapt quickly to topology changes without using ad hoc parameters. We define a new cache structure called a cache table to maintain the information necessary for cache updates. When a node detects a link failure, our algorithm proactively notifies all reachable nodes that have cached the broken link in a distributed manner. We compare our algorithm with DSR with path caches and with Link-MaxLife through detailed simulations. We show that our algorithm significantly outperforms DSR with path caches and with Link-MaxLife.
This dissertation addresses the problem of dealing with large numbers of set-based patterns, such as association rules and itemsets, discovered by data mining algorithms. Since many discovered patterns may be spurious, irrelevant, or trivial, one of the main problems is how to validate them, e.g., how to separate the ``good'' rules from the ``bad.'' Many researchers have advocated the explicit involvement of a human expert in the validation process. However, scalability becomes an issue when large numbers of patterns are discovered, since the expert cannot perform the validation on a pattern-by-pattern basis in a reasonable period of time. To address this problem, this dissertation describes a new expert-driven approach to set-based pattern validation.
The proposed validation approach is based on validation sequences, i.e., we rely on the expert's ability to iteratively apply various validation operators that can validate multiple patterns at a time, thus making the expert-based validation feasible. We identified the class of scalable set predicates called cardinality predicates and demonstrated how these predicates can be effectively used in the validation process, i.e., as a basis for validation operators. We examined various properties of cardinality predicates, including their expressiveness. We also have developed and implemented the set validation language (SVL) that can be used for manual specification of cardinality predicates by a domain expert. In addition, we have proposed and developed a scalable algorithm for set and rule grouping that can be used to generate cardinality predicates automatically.
The dissertation also explores various theoretical properties of sequences of validation operators and facilitates a better understanding of the validation process. We have also addressed the problem of finding optimal validation sequences and have shown that certain formulations of this problem are NP-complete. In addition, we provided some heuristics for addressing this problem.
Finally, we have tested our rule validation approach on several real-life applications, including personalization and bioinformatics applications.
This thesis describes a web-based, responsive, zooming and panning visualization system for a full-featured geographic description of the United States. Current web-based map servers provide, from a visualization standpoint, little more than one static image per page, with hyperlinks for navigation; continuous zooming and panning requires locally stored data. Our primary contribution is a multi-threaded, scalable and responsive client-server architecture that responds to user requests as naturally and quickly as possible, regardless of network bandwidth reliability. This architecture can be generalized for use in other applications, including non-geographic ones. To this we add a scalable and flexible user interface for navigation of multi-scale geographic data, with intuitive zooming and panning, pop-up feature labels, and a user-controlled tree-hierarchy of windows. We build software tools and algorithms for translating the U.S. Census Bureau's TIGER data into a format designed for speedy database retrieval and network delivery, and for generalizing the data into multiple levels of detail. Because of anomalies in the TIGER data, this processing requires some human intervention.
The increasing demand for highly detailed geometric models poses new and important problems in computer graphics and geometric modeling. Applications for complex models range from geometric design and scientific simulations to feature movies and video games.
We focus on the fundamental problem of creating and manipulating complex surface models. We address the problem by designing an efficient and general surface representation and developing algorithms for efficient modification of surfaces represented in this form. Our surface representation extends existing subdivision-based representations with explicit representation of sharp features and boundaries, which is crucial in many computer-aided design applications.
We consider two types of surface modifications: boolean operations on solids bounded by surfaces, and surface pasting. Our technique rapidly and robustly computes an approximate result rather than aiming for the precise solution. At the same time, our approach allows one to trade speed for accuracy, and, in most cases, compute the result with any desired accuracy. The second type of editing operation we consider addresses the problem of transferring geometric features between different objects. Our technique makes it easy to combine geometric data from various sources (e.g., 3D scanning, CAGD models) into a single model.
We present a new method for the solution of the unsteady incompressible Navier-Stokes equations. Our goal is to achieve a robust and scalable methodology for two- and three-dimensional incompressible laminar flows. The Navier-Stokes operator discretization is done using boundary integrals and structured-grid finite elements. We use a two-step second-order accurate scheme to advance the equations in time. The convective term is discretized by an explicit, but unconditionally stable, semi-Lagrangian formulation; at each time step we invert a spatial constant-coefficient (modified) Stokes operator. The Dirichlet problem for the modified Stokes operator is formulated as a double-layer boundary integral equation. Domain integrals are computed via finite elements with appropriate forcing singularities to account for the irregular geometry. We use a velocity-pressure formulation which we discretize with bilinear elements (Q1-Q1), which give equal-order interpolation for the velocities and pressures. Stabilization is used to circumvent the div-stability condition for the pressure space. The integral equations are discretized by Nystrom's method. For the specific approximation choices the method is second-order accurate. We will present numerical results and discuss the performance and scalability of the method in two dimensions.
Let G=(V,E) be a graph with time-dependent edges where the cost of a path p through the graph is determined by a vector function F(p)=[f_1(p), f_2(p), ..., f_n(p)], where f_1, f_2, ..., f_n are independent objective functions. When n>1 there is no single clear notion of a ``best'' solution; instead we turn to the idea of Pareto-optimality to define the efficiency of a path. Given the set of paths P through the network, a path p' is Pareto-optimal if there is no path p in P such that f_i(p) <= f_i(p') for all objective functions f_i and f_j(p) < f_j(p') for at least one f_j.
The problem of planning itineraries on a transportation system involves computing the set of optimal paths through a time-dependent network where the cost of a path is determined by more than one, possibly non-linear and non-additive, cost function. This thesis introduces an algorithmic toolkit for finding the set of Pareto-optimal paths in time-dependent networks in the presence of multiple objective functions.
Multi-criteria path optimization problems are known to be NP-hard; however, by exploiting geometric and periodic properties of the dynamic graphs that model transit networks, we show that it is possible to compute the Pareto-optimal solution sets rapidly without using heuristics. We show that we can solve the itinerary problem in the presence of response time constraints for a large-scale graph.
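The notion of Pareto-optimality over vector-valued path costs can be sketched with a brute-force filter. The sketch assumes every objective is minimized and uses the standard dominance test; it works on fixed cost vectors and ignores time dependence, so it is not the thesis's algorithm. The function names and the sample itineraries (travel time in minutes, number of transfers, fare) are hypothetical.

```python
def dominates(a, b):
    """True if cost vector `a` is no worse than `b` in every objective
    and strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs):
    """Keep the cost vectors that no other vector dominates (quadratic time)."""
    return [c for c in costs if not any(dominates(other, c) for other in costs)]


# Hypothetical itineraries: (travel time in minutes, transfers, fare).
itineraries = [
    (30, 2, 5.0),
    (45, 0, 5.0),
    (30, 3, 6.0),  # dominated by (30, 2, 5.0)
    (60, 1, 4.0),
    (50, 1, 5.0),  # dominated by (45, 0, 5.0)
]
front = pareto_front(itineraries)
```

The surviving vectors are mutually incomparable: each is better than the others in at least one objective, which is why a set of solutions, rather than a single best path, is returned.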
Adaptation to network changes is important to provide applications with seamless service access in a shared wireless environment. Path-based mechanisms, which augment data paths with application-specific ``bridging'' components guided by minimal application input, are promising approaches for providing such support. Although shown to be successful in static network situations, their utility under dynamically changing network conditions has not been well-studied.
In this paper, we address this gap by investigating the performance of a path-based approach, CANS (Composable Adaptive Network Services), in a dynamic environment. We find that the suitability of CANS-like approaches is hampered by inaccurate component models and expensive planning and reconfiguration. We address these problems by extending CANS to support (1) generalized path creation strategies to match different application performance preferences; (2) refined component models that enable adjustment at a finer granularity and more accurately represent behavior of component compositions; and (3) local planning and reconfiguration mechanisms that improve responsiveness. We present the problems and evaluate our solutions using an image streaming application. The experimental results show that our solutions are effective.
Balancing Neumann-Neumann methods are extended to mixed formulations of the linear elasticity system with discontinuous coefficients, discretized with mixed finite or spectral elements with discontinuous pressures.
These domain decomposition methods implicitly eliminate the degrees of freedom associated with the interior of each subdomain and solve iteratively the resulting saddle point Schur complement using a hybrid preconditioner based on a coarse mixed elasticity problem and local mixed elasticity problems with natural and essential boundary conditions. A polylogarithmic bound in the local number of degrees of freedom is proven for the condition number of the preconditioned operator in the constant coefficient case.
Parallel and serial numerical experiments confirm the theoretical results, indicate that they still hold for systems with discontinuous coefficients, and show that our algorithm is scalable, parallel, and robust with respect to material heterogeneities. The results on heterogeneous general problems are also supported in part by our theory.
A two-level overlapping domain decomposition method is analyzed for a Nedelec spectral element approximation of a model problem appearing in the solution of Maxwell's equations. The overlap between subdomains can consist of entire spectral elements or rectangular subsets of spectral elements. For fixed relative overlap and overlap made from entire elements, the condition number of the method is bounded, independently of the mesh size, the number of subregions, the coefficients and the degree of the spectral elements. In the case of overlap including just parts of spectral elements, a bound linear in the degree of the elements is proven. It is assumed that the coarse and fine mesh are quasi-uniform and shape-regular and that the domain is convex. Arguments that would not require quasi-uniformity of the coarse mesh and convexity of the domain are mentioned. Our work generalizes results obtained for lower-order Nedelec elements in Toselli [Numerische Mathematik (2000) 86:733-752]. Numerical results for the two-level algorithm in two dimensions are also presented, supporting our analysis.
In this paper, a dual-primal FETI method is developed for solving incompressible Stokes equations approximated by mixed finite elements with discontinuous pressures in three dimensions. The domain of the problem is decomposed into non-overlapping subdomains, and the continuity of the velocity across the subdomain interface is enforced by introducing Lagrange multipliers. By a Schur complement procedure, the indefinite Stokes problem is reduced to a symmetric positive definite problem for the dual variables, i.e., the Lagrange multipliers. This dual problem is solved by a Krylov space method with a Dirichlet preconditioner. At each step of the iteration, both subdomain problems and a coarse problem on a coarse subdomain mesh are solved by a direct method. It is proved that the condition number of this preconditioned dual problem is independent of the number of subdomains and bounded from above by the product of the inverse of the inf-sup constant of the discrete problem and the square of the logarithm of the number of unknowns in the individual subdomain problems. Illustrative numerical results are presented by solving lid driven cavity problems. This algorithm is also extended to solving linearized non-symmetric Navier-Stokes equation.
Finite element tearing and interconnecting (FETI) type domain decomposition methods are first extended to solving incompressible Stokes equations. One-level, two-level, and dual-primal FETI algorithms are proposed. Numerical experiments show that these FETI type algorithms are scalable, i.e., the number of iterations is independent of the number of subregions into which the given domain is subdivided. A convergence analysis is then given for dual-primal FETI algorithms both in two and three dimensions.
Extension to solving linearized nonsymmetric stationary Navier-Stokes equations is also discussed. The resulting linear system is no longer symmetric and a GMRES method is used to solve the preconditioned linear system. Eigenvalue estimates show that, for small Reynolds number, the nonsymmetric preconditioned linear system is a small perturbation of that in the symmetric case. Numerical experiments also show that, for small Reynolds number, the convergence of GMRES method is similar to the convergence of solving symmetric Stokes equations with the conjugate gradient method. The convergence of GMRES method depends on the Reynolds number; the larger the Reynolds number, the slower the convergence.
Dual-primal FETI algorithms are further extended to nonlinear stationary Navier-Stokes equations, which are solved by using a Picard iteration. In each iteration step, a linearized Navier-Stokes equation is solved by using a dual-primal FETI algorithm. Numerical experiments indicate that convergence of the Picard iteration depends on the Reynolds number, but is independent of both the number of subdomains and the subdomain problem size.
Distribution and replication of network-accessible applications has been shown to be an effective approach for delivering improved Quality of Service (QoS) to end users. An orthogonal trend seen in current-day network services is the use of component-based frameworks. Even though such component-based applications are natural candidates for distributed deployment, it is unclear if the design patterns underlying component frameworks also enable efficient service distribution in wide-area environments. In this paper, we investigate application design rules and their accompanying system-level support essential to a beneficial and efficient service distribution process. Our study targets the widely used Java 2 Enterprise Edition (J2EE) component platform and two sample component-based applications: Java Pet Store and RUBiS. Our results present strong experimental evidence that component-based applications can be efficiently distributed in wide-area environments, significantly improving QoS delivered to end users as compared to a centralized solution. Although current design patterns underlying component frameworks are not always suitable, we identify a small set of design rules for orchestrating interactions and managing component state that together enable efficient distribution. Furthermore, we show how enforcement of the identified design rules and automation of pattern implementation can be supported by container frameworks.
We introduce online codes - a class of near-optimal codes for a very general loss channel which we call the free channel. Online codes are linear encoding / decoding time codes, based on sparse bipartite graphs, similar to Tornado codes, with a couple of novel properties: local encodability and rateless-ness. Local encodability is the property that each block of the encoding of a message can be computed independently from the others in constant time. This also implies that each encoding block is only dependent on a constant-sized part of the message and a few preprocessed bits. Rateless-ness is the property that each message has an encoding of practically infinite size.
We argue that rateless codes are more appropriate than fixed-rate codes for most situations where erasure codes were considered a solution. Furthermore, rateless codes meet new areas of application, where they are not replaceable by fixed-rate codes. One such area is information dispersal over peer-to-peer networks.
This paper shows how to implement a trusted network file system on an untrusted server. While cryptographic storage techniques exist that allow users to keep data secret from untrusted servers, this work concentrates on the detection of tampering attacks and stale data. Ideally, users of an untrusted storage server would immediately and unconditionally notice any misbehavior on the part of the server. This ideal is unfortunately not achievable. However, we define a notion of data integrity called fork consistency in which, if the server delays just one user from seeing even a single change by another, the two users will never again see one another's changes - a failure easily detectable with on-line communication. We give a practical protocol for a multi-user network file system called SUNDR, and prove that SUNDR offers fork consistency whether or not the server obeys the protocol.
We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multi-scale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the amplitudes of pyramid coefficients. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear (Wiener) estimate over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive Gaussian noise of known covariance that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error. In addition, we demonstrate the performance of the algorithm in removing sensor noise from high-ISO digital camera images.
We explore the role of features in solving problems in computer vision and learning. Features capture important domain-dependent knowledge and are fundamental in simplifying problems. Our goal is to consider the universal features of the problem concerned, and not just particular algorithms used in its solution. Such an approach reveals only the fundamental difficulties of any problem. For most problems we will face a host of other specialized concerns. Therefore, we consider simplified problems which capture the essence of our approach.
This thesis consists of two parts. First, we explore means of discovering features. We come up with an information-theoretic criterion to identify features which has deep connections to statistical estimation theory. We consider features to be ``nice'' representations of objects. We find that, ideally, a feature space representation of an image is the most concise representation of an image which captures all available information in it. In practice, however, we are satisfied with an approximation to it. Therefore, we explore a few such approximations and explain their connection to the information-theoretic approach. We look at the algorithms which implement these approximations and look at their generalizations in the related field of stereo vision.
Using features, whether they come from some feature-discovery algorithm or are hand-crafted, is usually an ad hoc process which depends on the actual problem and the exact representation of features. This diversity mostly arises from the multitude of ways features capture information. In the second part of this thesis, we come up with an architecture which lets us use features in a very flexible way, in the context of content-addressable memories. We apply this approach to two radically different domains, face images and English words. We also look at human performance in reconstructing words from fragments, which gives us some information about the memory subsystem in human beings.
Requests for dynamic and personalized content increasingly dominate current-day Internet traffic; however, traditional caching architectures are not well-suited to cache such content. Several recently proposed techniques, which exploit reuse at the sub-document level, promise to address this shortcoming, but require a better understanding of the workloads seen on web sites that serve such content. In this paper, we study the characteristics of a medium-sized personalized web site, NYUHOME, which is a customizable portal used by approximately 44,000 users from the New York University community. Our study leverages detailed server-side overheads, and the client-perceived request latencies. We then use these statistics to derive general implications for efficient caching and edge generation of dynamic content in the context of our ongoing CONCA project. Our study verifies both the need for and likely benefit from caching content at sub-document granularity, and points to additional opportunities for reducing client-perceived latency using prefetching, access prediction, and content transcoding.
Consider the problem of monitoring tens of thousands of time series data streams in an online fashion and making decisions based on them. In addition to single stream statistics such as average and standard deviation, we also want to find high correlations among all pairs of streams. A stock market trader might use such a tool to spot arbitrage opportunities. This paper proposes efficient methods for solving this problem based on Discrete Fourier Transforms and a three-level time interval hierarchy. Extensive experiments on synthetic data and real-world financial trading data show that our algorithm beats the direct computation approach by several orders of magnitude. It also improves on previous Fourier Transform approaches by allowing the efficient computation of time-delayed correlation over any size sliding window and any time delay. Correlation also lends itself to an efficient grid-based data structure. The result is the first algorithm that we know of to compute correlations over thousands of data streams in real time. The algorithm is incremental, has fixed response time, and can monitor the pairwise correlations of 10,000 streams on a single PC. The algorithm is embarrassingly parallelizable.
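For reference, the quantity being monitored, the correlation of two streams over a sliding window, can be maintained incrementally for a single pair with running sums. The sketch below is this direct per-pair computation only; it is not the DFT-based method of the paper, and the class name `SlidingCorrelation` and its interface are hypothetical.

```python
import math
from collections import deque


class SlidingCorrelation:
    """Pearson correlation of two streams over the last `window` samples,
    updated in constant time per sample (direct method, one pair only)."""

    def __init__(self, window):
        self.window = window
        self.pairs = deque()
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def update(self, x, y):
        """Add one sample from each stream; return the current correlation,
        or None while either stream is constant over the window."""
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()  # drop the sample leaving the window
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy
        n = len(self.pairs)
        var_x = self.sxx - self.sx * self.sx / n
        var_y = self.syy - self.sy * self.sy / n
        if var_x <= 0.0 or var_y <= 0.0:
            return None
        cov = self.sxy - self.sx * self.sy / n
        return cov / math.sqrt(var_x * var_y)


# Hypothetical streams: y is an exact linear function of x.
tracker = SlidingCorrelation(window=3)
results = [tracker.update(float(t), 2.0 * t + 1.0) for t in range(1, 6)]
```

Applying this update to every pair costs time quadratic in the number of streams per time step, which is the cost the abstract's approach is designed to avoid.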
This paper describes the underlying mathematical model and the dynamic programming technique for the validation of a (DNA) sequence against a (DNA) map. The sequence can be obtained from a variety of sources (e.g., GenBank, Sanger's Lab, or Celera P.E.) and is assumed to be written out as a string of nucleotides. The map is an ordered restriction map obtained through an optical mapping process and is augmented with statistical information, which will be used to place (or not) the sequence in the genome.
Our approach has many other applications beyond validation: e.g. map-based sequence assembly, phasing sequence contigs, detecting and closing gaps and annotation of partially sequenced genomes to find open reading frames, genes and synteny groups.
We tested our system by checking various maps against publicly available sequence data for Plasmodium falciparum.
As the number of networked computers grows, and the amount of sensitive information available on them grows as well, there is an increasing need to ensure the security of these systems. The security of computer networks is not a new issue. We have dealt with the need for security for a long time with such measures as passwords and encryption. These will always provide an important initial line of defense. However, given a clever and malicious individual, these defenses can often be circumvented. Intrusion detection is therefore needed as another way to protect computer systems. This thesis describes a novel three-stage algorithm for building classification models in the presence of non-stationary, temporal, high-dimensional data in general, and for detecting network intrusions in particular. Given a set of training data records, the algorithm begins by identifying "interesting" temporal patterns in this data using a modal logic. This approach is distinguished from other work in this area, where frequent patterns are identified. We show that when frequency is replaced by our measure of "interestingness", the problem of finding temporal patterns is NP-complete. We then offer an efficient heuristic approach that has proven effective in experiments. Having identified interesting patterns, we use them as the predictor variables in the construction of a Multivariate Adaptive Regression Splines (MARS) model. This approach is justified, after surveying other methods for solving the classification problem, by its ability to capture complex nonlinear relationships between the predictor and response variables, which is comparable to that of other heuristic approaches such as neural networks and classification trees, while offering improved computational properties such as rapid convergence and interpretability.
After considering a variety of approaches to the problems of over-fitting, which is inherent when modeling high-dimensional data, and non-stationarity, we describe our approach to addressing these issues through the use of truncated Stein shrinkage. This approach is motivated by showing the inadmissibility of the maximum likelihood estimator (MLE) for high-dimensional (dimension >= 3) data. We then discuss the application of our approach as participants in the 1999 DARPA Intrusion Detection Evaluation, where we were able to exhibit the benefits of our approach. Finally, we suggest another area of research where we believe that our work would meet with similar success, namely, the area of disease classification.
The growing popularity of network-based services and peer-to-peer networks has resulted in situations where components of a distributed application often need to execute in environments that are only partly trusted by the application's owner. Such deployment into partial or unstable trust environments exacerbates the classical problems of distributing decomposable services: authentication and access control, trust management, secure communication, code distribution and installation, and process rights management. Unfortunately, the application developer's burden of coping with these latter issues often dominates the benefits of service distribution. The DisCo infrastructure is specifically targeted to the development of systems and services deployed into coalition environments: networks of users and hosts administered by multiple authorities with changing trust relationships. The DisCo infrastructure provides application-neutral support for the classical problems of distributed services, thereby relieving the developer of the burden of independently managing these features. DisCo also includes support for continuously monitoring established connections, enabling corrective action from an application to cope with changing trust relationships. Our experience with building a secure video distribution service using the DisCo toolkit indicates that the latter permits distributed secure deployment into a partly trusted environment with minimal application developer effort, affording the advantages of natural expression and convenient deployment without compromising on efficiency.
Software development in distributed computation is complicated by the extra overhead of communication between connected, dispersed hosts in dynamically changing, multiple administrative domains. Many disparate technologies exist for trust management, authentication, secure communication channels, and service discovery, but composing all of these elements into a single system can outweigh principal development efforts. The NYU Disco Switchboard consolidates these connectivity issues into a single convenient, extensible architecture, providing an abstraction for managing secure, host-pair communication with connection monitoring facilities. Switchboard extends the secure authenticated communication channel abstraction provided by standard interfaces such as SSL/TLS with mechanisms to support trust management, key sharing, service discovery, and connection liveness and monitoring. We present an extensible architecture which is particularly useful in dynamically changing, distributed coalition environments. Applications that utilize Switchboard benefit from the availability of authentication, trust management, cryptography, and discovery, while retaining the simplicity of a common interface.
Distributed Role-Based Access Control (dRBAC) is a scalable, decentralized trust-management and access-control mechanism for systems that span multiple administrative domains. dRBAC represents controlled actions in terms of roles, which are defined within the trust domain of one entity and can be transitively delegated to other roles within a different trust domain. dRBAC utilizes PKI to identify all entities engaged in trust-sensitive operations and to validate delegation certificates. The mapping of roles to authorized name spaces obviates the need to identify additional policy roots. dRBAC distinguishes itself from previous trust management and role-based access control approaches in its support for three features: (1) third-party delegations, which improve expressiveness by allowing an entity to delegate roles outside its namespace when authorized by an explicit delegation of assignment; (2) valued attributes, which modulate transferred access rights via mechanisms that assign and manipulate numerical values associated with roles; and (3) credential subscriptions, which enable continuous monitoring of established trust relationships using a pub/sub infrastructure to track the status of revocable credentials. This paper describes the dRBAC model, its scalable implementation using a graph-based model of credential discovery and validation, and its application in a larger security context.
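The idea of transitive delegation discovered through a graph search can be sketched as follows. This is a simplified illustration under stated assumptions, not the dRBAC implementation: credentials are reduced to plain (holder, granted_role) pairs, signature validation, third-party delegation checks, and valued attributes are omitted, and the function name and role names are hypothetical.

```python
from collections import deque


def find_delegation_chain(delegations, subject, target_role):
    """Return the chain of roles linking subject to target_role, or None.

    delegations: iterable of (holder, granted_role) pairs, each meaning that
    whoever holds `holder` is also granted `granted_role`.
    """
    # Build an adjacency list: holder -> roles it is directly granted.
    graph = {}
    for holder, granted in delegations:
        graph.setdefault(holder, []).append(granted)

    # Breadth-first search, remembering how each role was reached.
    parent = {subject: None}
    queue = deque([subject])
    while queue:
        node = queue.popleft()
        if node == target_role:
            chain = []
            while node is not None:
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None
```

In this sketch, revoking a credential amounts to removing its pair from the input and searching again; a subscription mechanism would instead notify the holder of a previously found chain that one of its links has changed.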
Advances in wireless communication together with the growing number of mobile end devices hold the potential of ubiquitous access to sophisticated internet services; however, such access must cope with an inherent mismatch between the low-bandwidth, limited-resource characteristics of mobile devices and the high-bandwidth expectations of many content-rich services. One promising way of bridging this gap is by deploying application-specific components on the path between the device and service, which perform operations such as protocol conversion and content transcoding. Although several researchers have proposed infrastructures allowing such deployment, most rely on static, hand-tuned deployment strategies restricting their applicability in dynamic situations.
In this paper, we present an automatic approach for the dynamic deployment of such transcoding components, which can additionally be dynamically reconfigured as required. Our approach relies on three components: (a) a high-level integrated type-based specification of components and network resources, essential for "late binding" components to paths; (b) an automatic path creation strategy that selects and maps components so as to optimize a global metric; and (c) system support for low-overhead path reconfiguration, consisting of both restrictions on component interfaces and protocols satisfying application semantic continuity requirements. We comprehensively evaluate the effectiveness of our approach over a range of network and end-device characteristics using both a web-access scenario where client preference is for reduced access time, and a streaming scenario where client preference is for increased throughput. Our results verify that (1) automatic path creation and reconfiguration is achievable and does in fact yield substantial performance benefits; and (2) despite their flexibility, both path creation and reconfiguration can be supported with low run-time overhead.
We describe new algorithms and tools for generating paintings, illustrations, and animation on a computer. These algorithms are designed to produce visually appealing and expressive images that look hand-painted or hand-drawn. In many contexts, painting and illustration have many advantages over photorealistic computer graphics, in aspects such as aesthetics, expression, and computational requirements. We explore three general strategies for non-photorealistic rendering:
First, we describe explicit procedures for placing brush strokes. We begin with a painterly image processing algorithm inspired by painting with real physical media. This method produces images with a much greater subjective impression of looking hand-made than do earlier methods. By adjusting algorithm parameters, a variety of styles can be generated, such as styles inspired by the Impressionists and the Expressionists. This method is then extended to processing video, as demonstrated by painterly animations and an interactive installation. We then present a new style of line art illustration for smooth 3D surfaces. This style is designed to clearly convey surface shape, even for surfaces without predefined material properties or hatching directions.
Next, we describe a new relaxation-based algorithm, in which we search for the painting that minimizes some energy function. In contrast to the first approach, we ideally only need to specify what we want, not how to directly compute it. The system allows as fine user control as desired: the user may interactively change the painting style, specify variations of style over an image, and/or add specific strokes to the painting.
Finally, we describe a new framework for processing images by example, called "image analogies." Given an example of a painting or drawing (e.g. scanned from a hand-painted source), we can process new images with some approximation to the style of the painting. In contrast to the first two approaches, this allows us to design styles without requiring an explicit technical definition of the style. The image analogies framework supports many other novel image processing operations.
For problems with piecewise smooth solutions, spectral element methods hold great promise. They combine the exponential convergence of spectral methods with the geometric flexibility of finite elements. Spectral elements are well-established for scalar elliptic problems and problems of fluid dynamics, and recently the first methods for problems in H(curl) and H(div) were proposed. In this dissertation we study spectral element methods for a model problem. We first consider Maxwell's equation and derive the model problem in H(curl). Then we introduce anisotropic spectral Nédélec element discretizations with variable numerical integration for the model problem. We discuss their structure, and their convergence and approximation properties. We also obtain results on the norm of the Nédélec interpolants between Nédélec and Raviart-Thomas spaces of different degree, needed for the computation of the splitting constant for the domain decomposition preconditioner and the numerical analysis of nonlinear equations. We also prove a Friedrichs-like inequality for the model problem for the spectral case.
We present fast direct solvers for the model problem on separable domains, taking advantage of the tensor product discretization and fast diagonalization methods. We use those fast solvers as local solvers in domain decomposition methods for problems that are too large to be solved directly, or posed on non-separable domains, and use them to compute and subassemble the Schur complement system corresponding to the interface. We also apply them in the direct solution of the Schur complement system for general domains.
As an example for the domain decomposition methods that can be implemented with these tools, we introduce overlapping Schwarz methods, both one-level and two-level versions.
We extend the theory for overlapping Schwarz methods to the spectral Nédélec element case. We reduce the proof of the condition number estimate to three basic estimates, and present theoretical and numerical results on those estimates. The technique of the proof works in both the two-dimensional and three-dimensional case.
We also present numerical results for one-level and two-level methods in two dimensions.
Instruction-level parallelism (ILP) is a family of processor and compiler design techniques that speed up execution by allowing individual machine operations to execute in parallel. Explicitly Parallel Instruction Computing (EPIC) processors evolved in an attempt to achieve high levels of ILP without the hardware complexity. In EPIC processors, most of the functions needed to extract ILP are performed by the compiler. To take advantage of the higher levels of ILP these architectures offer, the compiler must use aggressive ILP techniques. This opportunity for improved performance comes at the price of increased compilation time.
Limiting the size of the compilation unit reduces compilation time, but the narrower scope of compilation may restrict the scope of optimization. As a result, the compiler may generate less efficient code. Ideally, we want shorter compilation time together with the same or better execution time as that obtained using the global approach.
In this thesis, we address the trade-off between compilation time and execution performance in region-based compilation, within the context of the key optimization of register allocation. We demonstrate that schemes designed for region-based allocation perform as well as or even better than schemes designed for global allocation while requiring less compilation time. To achieve this goal, we propose several innovative techniques, which form the core of this thesis.
We show considerable compilation time savings with comparable execution time performance by synthesizing our techniques in region-based register allocation. We also explore and quantify the relation between the performance of register allocation and the region size. Our research shows that selecting the right region size has an important impact on the performance of register allocation. We propose the concept of restructuring the regions based on register pressure and discuss how register pressure can be estimated in order to improve compilation time while maintaining execution time.
A new type of overlapping Schwarz method, the overlapping Schwarz algorithm using discontinuous iterates, is constructed from the classical overlapping Schwarz algorithm. It allows for discontinuities at each artificial interface. The new algorithm, for Poisson's equation, can be considered as an overlapping version of Lions' Robin iteration method, for which little is known concerning the convergence. Since overlap improves the performance of the classical algorithms considerably, the existence of a uniform convergence factor is the fundamental question for our new algorithm.
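For orientation, the classical alternating (multiplicative) overlapping Schwarz iteration that serves as the starting point can be sketched on a one-dimensional model problem. This shows only the classical algorithm with continuous iterates and Dirichlet data on the artificial interfaces, not the new algorithm with discontinuous iterates. The model problem (-u'' = f on (0,1) with u(0) = 0 and u(1) = 1), the finite-difference discretization, and all function names are assumptions made for illustration.

```python
def solve_dirichlet(f, h, left, right):
    """Solve -u'' = f at interior nodes with Dirichlet values left and right.

    f holds right-hand-side values at the interior nodes. The tridiagonal
    system -u[i-1] + 2*u[i] - u[i+1] = h*h*f[i] is solved by the Thomas
    algorithm.
    """
    m = len(f)
    rhs = [h * h * v for v in f]
    rhs[0] += left
    rhs[-1] += right
    diag = [2.0] * m
    for i in range(1, m):          # forward elimination
        w = -1.0 / diag[i - 1]
        diag[i] = 2.0 + w
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):  # back substitution
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u


def alternating_schwarz(n, a, b, f, sweeps):
    """Classical alternating Schwarz on n intervals with 0 < a < b < n.

    Subdomain 1 covers nodes 0..b and subdomain 2 covers nodes a..n, so the
    overlap is nodes a..b. f holds the right-hand side at all n+1 nodes.
    """
    h = 1.0 / n
    u = [0.0] * (n + 1)
    u[n] = 1.0
    for _ in range(sweeps):
        # Subdomain 1: Dirichlet data u[0] and the current value u[b].
        u[1:b] = solve_dirichlet(f[1:b], h, u[0], u[b])
        # Subdomain 2: Dirichlet data from the freshly updated u[a] and u[n].
        u[a + 1:n] = solve_dirichlet(f[a + 1:n], h, u[a], u[n])
    return u
```

With f = 0 the exact discrete solution is u[i] = i/n, and the error contracts by a fixed factor per sweep that improves as the overlap b - a grows, which is the behavior the question of a uniform convergence factor refers to.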
The first part of this thesis concerns the formulation of the new algorithm. A variational formulation of the new algorithm is derived from the classical algorithms. The discontinuity of the iterates of the new algorithm is the fundamental distinction from the classical algorithms. To analyze this important property, we use a saddle-point approach. We show that the new algorithm can be interpreted as a block Gauss-Seidel method with dual and primal variables.
The second part of the thesis deals with algebraic properties of the new algorithm. We prove that the fractional steps of the new algorithm are nonsymmetric. The algebraic systems of the primal variables can be reduced to those of the dual variables. We analyze the structure of the dual formulation algebraically and analyze its numerical behavior.
The remaining part of the thesis concerns convergence theory and numerical results for the new algorithm. We first extend the classical convergence theory, without using Lagrange multipliers, in some limited cases. A new theory using Lagrange multipliers is then introduced, and we find conditions for the existence of uniform convergence factors of the dual variables, which implies convergence of the primal variables, in the two overlapping subdomain case with any Robin boundary condition. Our condition shows a relation between the given conditions and the artificial interface condition. Numerical results for the general case with cross points are also presented. They indicate possible extensions of our results to this more general case.
In this paper, certain iterative substructuring methods with Lagrange multipliers are considered for elliptic problems in three dimensions. The algorithms belong to the family of dual--primal FETI methods which have recently been introduced and analyzed successfully for elliptic problems in the plane. The family of algorithms for three dimensions is extended and a full analysis is provided for the new algorithms. Particular attention is paid to finding algorithms with a small primal subspace since that subspace represents the only global part of the dual--primal preconditioner. It is shown that the condition numbers of several of the dual--primal FETI methods can be bounded polylogarithmically as a function of the dimension of the individual subregion problems and that the bounds are otherwise independent of the number of subdomains, the mesh size, and jumps in the coefficients. These results closely parallel those for other successful iterative substructuring methods of primal as well as dual type.
Go is a board game with simple rules but complex strategy requiring ability in almost all aspects of human reasoning. A good Go player must be able to hypothesize moves and analyze their consequences; to judge which areas are relevant to the analysis at hand; to learn from successes and failures; to generalize that knowledge to other "similar" situations; and to make inferences from knowledge about a position.
Unlike computer chess, which has seen steady progress since Shannon's [23] and Turing's [24] original papers on the subject, progress on computer Go remains in relative infancy. In computer chess, minimax search with alpha-beta pruning based on a simple evaluation function can beat a beginner handily. No such simple evaluation function is known for Go. To accurately evaluate a Go position requires knowledge of the life and death status of the points on the board. Since the player with the most live points at the end of the game wins, a small mistake in this analysis can be disastrous.
In this dissertation we describe the design, performance, and underlying logic of a knowledge-based program that solves life and death problems in the game of Go. Our algorithm applies life and death theory coupled with knowledge about which moves are reasonable for each relevant goal in that theory to restrict the search space to a tractable size. Our results show that simple depth-first search armed with a goal theory and heuristic move knowledge yields very positive results on standard life and death test problems, even without sophisticated move ordering heuristics.
In addition to a description of the program and its internals, we present a modal logic useful for describing strategic theories in games and use it to give a life and death theory and to formally state the rules of Go. We also give an axiomatization for this logic using the modal mu-calculus [15] and prove some basic theorems of the system.
Two machine instruction level compiler optimization problems are considered in this work.
The first problem is time-constrained instruction scheduling, i.e., finding optimal schedules for machine code in the presence of time constraints such as release-times and deadlines. These types of time constraints appear naturally in embedded applications, and also as a side effect of many other compiler optimization problems. While the general problem is NP-hard, we have developed a new algorithm which can optimally handle many P-time solvable sub-instances. In fact, we show that almost all previous algorithms in this related area can be seen as an instance of the priority computation scheme that we have developed. Our work extends and unifies many algorithmic results in classical deterministic scheduling theory related to release-times, deadlines and pipeline latencies.
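One of the classical polynomial-time special cases gives a feel for the problem: unit-time operations on a single pipeline with integer release times and deadlines, for which the earliest-deadline-first rule is known to find a feasible schedule whenever one exists. The sketch below implements only that textbook rule, not the priority computation scheme developed in this work; it ignores precedence constraints and pipeline latencies, and the function name and input format are hypothetical.

```python
import heapq


def edf_unit_schedule(tasks):
    """Schedule unit-time tasks by earliest deadline first.

    tasks: dict mapping a task name to (release, deadline), both
    non-negative integers. A task may start at or after its release time
    and must finish by its deadline. Returns a dict mapping each name to
    its start time, or None if some deadline cannot be met.
    """
    pending = sorted(tasks.items(), key=lambda kv: kv[1][0])  # by release
    ready = []      # min-heap of (deadline, name) for released tasks
    schedule = {}
    time = 0
    i = 0
    while i < len(pending) or ready:
        if not ready and pending[i][1][0] > time:
            time = pending[i][1][0]   # idle until the next release
        while i < len(pending) and pending[i][1][0] <= time:
            name, (_release, deadline) = pending[i]
            heapq.heappush(ready, (deadline, name))
            i += 1
        deadline, name = heapq.heappop(ready)
        if time + 1 > deadline:
            return None               # the most urgent task is already late
        schedule[name] = time
        time += 1
    return schedule
```

For example, with tasks a = (0, 3), b = (0, 1), and c = (1, 2), the rule runs b first, then c as soon as it is released, and a last.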
The second problem that we investigate in this work is scalar optimizations in machine code. We present a new framework that utilizes static single assignment form (SSA) at the level of individual machine instructions. Complementing the framework, we have also developed new SSA construction algorithms which are faster than previous algorithms, and are very simple to implement.
In this paper, a dual-primal FETI method is developed for the incompressible Stokes equations approximated by mixed finite elements with discontinuous pressures. The domain of the problem is decomposed into nonoverlapping subdomains, and the continuity of the velocity across the subdomain interface is enforced by introducing Lagrange multipliers. By a Schur complement procedure, solving the indefinite Stokes problem is reduced to solving a symmetric positive definite problem for the dual variables, i.e., the Lagrange multipliers. This dual problem is solved by a Krylov space method with a Dirichlet preconditioner. At each step of the iteration, both subdomain problems and a coarse problem on the coarse subdomain mesh are solved by a direct method. It is proved that the condition number of this preconditioned problem is independent of the number of subdomains and bounded from above by the product of the inverse of the inf-sup constant of the discrete problem and the square of the logarithm of the number of unknowns in the individual subdomain problems. Illustrative results are presented by solving a lid-driven cavity problem.
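The condition number estimate stated in words above can be written compactly. The symbols are assumptions introduced here for illustration: \kappa is the condition number of the preconditioned dual problem, \beta the inf-sup constant of the discrete problem, N the number of unknowns in an individual subdomain problem, and C a constant independent of the number of subdomains. The additive 1 inside the logarithmic factor is a common convention in such bounds and is not taken from the abstract.

```latex
\kappa \;\le\; \frac{C}{\beta}\,\bigl(1 + \log N\bigr)^{2}
```

Read this way, refining the mesh within each subdomain increases the bound only logarithmically, while adding more subdomains of the same local size leaves it unchanged.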
We have developed an on-line handwriting recognition system. Our approach integrates local bottom-up constructs with a global top-down measure into a modular recognition engine. The bottom-up process uses local point features for hypothesizing character segmentations and the top-down part performs shape matching for evaluating the segmentations. The shape comparison, called Fisher segmental matching, is based on Fisher's linear discriminant analysis. The component character recognizer of the system uses two kinds of Fisher matching based on different representations and combines the information to form the multiple experts paradigm.
Along with efficient ligature modeling, the segmentations and their character recognition scores are integrated into a recognition engine termed the Hypotheses Propagation Network (HPN), which runs a variant of the topological sort algorithm for graph search. The HPN improves on the conventional Hidden Markov Model and the Viterbi search by using the more robust mean-based scores for word-level hypotheses and keeping multiple predecessors during the search.
We have also studied and implemented a geometric context model, termed Visual Bigram Modeling, that improves the accuracy of the system by taking into account the geometric constraints under which the component characters in a word are formed in relation to their neighboring characters. The result is a shape-oriented system that is robust with respect to local and temporal features, modular in construction, and rich in opportunities for further extensions.
We propose a new framework for shape representation and salient shape selection. The framework is considered a low- to middle-level vision process and can be applied to various topics, including figure/ground separation, searching for the shape axis, junction detection, and illusory figure finding. The model construction is inspired by the Gestalt studies, which suggest that proximity, convexity, similarity, good continuation, closure, symmetry, etc., are useful for figure/ground separation and visual organization construction. First, we quantify those attributes for (completed or partial) shapes using our distributed systems. A shape is then evaluated and represented by those results. In particular, shape convexity, rather than other shape attributes such as the symmetry axis or size, which have been well studied, is emphasized in our discussion. Our problem is posed in a continuous manner. For shape convexity, unlike the conventional mathematical definition, we aim to derive a definition that describes one shape as "more convex" or "less convex" than another. In searching for the shape axis, rather than binary information telling whether a point is on or off an axis, we obtain continuous information, which lets us distinguish axes with "stronger" or "weaker" declarations. Such a representation admits an easy and natural pruning scheme. For junction detection, we do not assume any artificial threshold. Instead, the transition from low-curvature to high-curvature curves, or to curves with discontinuities, is shown by our representation. The model is based on a variational approach, given by the minimization of the data fitting error as well as the neighborhood discrepancy. Two models are proposed: the decay diffusion process and the orientation diffusion process.
Balancing Neumann-Neumann methods are introduced and studied for incompressible Stokes equations discretized with mixed finite or spectral elements with discontinuous pressures. After decomposing the original domain of the problem into nonoverlapping subdomains, the interior unknowns, which are the interior velocity component and all except the constant pressure component, of each subdomain problem are implicitly eliminated. The resulting saddle point Schur complement is solved with a Krylov space method with a balancing Neumann-Neumann preconditioner based on the solution of a coarse Stokes problem with a few degrees of freedom per subdomain and on the solution of local Stokes problems with natural and essential boundary conditions on the subdomains. This preconditioner is of hybrid form in which the coarse problem is treated multiplicatively while the local problems are treated additively. The condition number of the preconditioned operator is independent of the number of subdomains and is bounded from above by the product of the square of the logarithm of the local number of unknowns in each subdomain and the inverse of the inf-sup constants of the discrete problem and of the coarse subproblem. Numerical results show that the method is quite fast; they are also fully consistent with the theory.
Requests for dynamic and personalized content increasingly dominate current-day Internet traffic, driven both by a growth in dynamic web services and a "trickle-down" effect stemming from the effectiveness of caches and content-distribution networks at serving static content. To efficiently serve this trend, several server-side and cache-side techniques have recently been proposed. Although such techniques, which exploit different forms of reuse at the sub-document level, appear promising, a significant impediment to their widespread deployment is (1) the absence of good models describing characteristics of dynamic web content, and (2) the lack of effective synthetic content generators, which reduce the effort involved in verifying the effectiveness of a proposed solution.
This paper addresses both of these shortcomings. Its primary contribution is a set of models that capture the characteristics of dynamic content both in terms of independent parameters such as the distributions of object sizes and their freshness times, as well as derived parameters such as content reusability across time and linked documents. These models are derived from an analysis of the content from six representative news and e-commerce sites, using both size-based and level-based splitting techniques to infer document objects. A secondary contribution is a Tomcat-based dynamic content emulator, which uses these models to generate ESI-based dynamic content and serve requests for whole documents and separate objects. To validate both the models and the design of the content emulator, we compare the bandwidth requirements seen by an idealized cache simulator that is driven by both the real trace and emulated content. Our simulation results verify that the output of the content emulator effectively and efficiently models real content.
This dissertation develops programming languages and associated techniques for sound and efficient implementations of algorithms for program generation.
First, we develop a framework for practical two-level languages. In this framework, we demonstrate that two-level languages are not only a good tool for describing program-generation algorithms, but a good tool for reasoning about them and implementing them as well. We pinpoint several general properties of two-level languages that capture common proof obligations of program-generation algorithms:
In addition, to justify concrete implementations, we use a native embedding of a two-level language into a one-level language.
We present two-level languages with these properties both for a call-by-name object language and for a call-by-value object language with computational effects, and demonstrate them through two classes of non-trivial applications: one-pass transformations into continuation-passing style and type-directed partial evaluation for call-by-name and for call-by-value.
Next, to facilitate implementations, we develop several general approaches to programming with type-indexed families of values within the popular Hindley-Milner type system. Type-indexed families provide a form of type dependency, which is employed by many algorithms that generate typed programs, but is absent from mainstream languages. Our approaches are based on type encodings, so that they are type safe. We demonstrate and compare them through a host of examples, including type-directed partial evaluation and printf-style formatting.
Finally, building upon the two-level framework and type-encoding techniques, we recast a joint work with Bernd Grobauer, where we formally derived a suitable self application for type-directed partial evaluation, and achieved automatic compiler generation.
Future scalable, high throughput, and high performance applications are likely to execute on platforms constructed by clustering multiple autonomous distributed servers, with resource access governed by agreements between the owners and users of these servers. As an example, application service providers (ASPs) can pool their resources together according to pre-specified sharing agreements to provide better services to their customers. Such systems raise several new resource management challenges, chief amongst which is the enforcement of agreements to ensure that, despite the distributed nature of both requests and resources, user requests only receive a predetermined share of the aggregate resource and that the resources of a participant are not misused. Current solutions only enforce such agreements at a coarse granularity and in a centralized fashion, limiting their applicability for general workloads.
This paper presents an architecture for the distributed enforcement of resource sharing agreements. Our approach exploits a uniform application-independent representation of agreements, and combines it with efficient time-window based coordinated queuing algorithms running on multiple nodes. We have successfully implemented this general strategy in two different network layers: a layer-7 HTTP redirector and a layer-4 packet redirector, which redirect connection requests from distributed clients to a cluster of distributed servers. Our measurements of both implementations verify that our approach is general and effective: different client groups receive service commensurate with their agreements.
Although networks and coordinated processes figure prominently in the kinds of data manipulation found in everything from scientific modeling to large-scale data mining, programmers charged with setting up the requisite software systems frequently find themselves hampered by the inadequacy of available languages. The ``real'' languages such as C++ and Java tend to be low-level, requiring the specification of a great deal of often repetitive detail, whereas the higher-level ``scripting'' languages tend to lack the kinds of structuring facilities that lend themselves to the reliable construction of even modestly large systems.
The high-level language SETL meets both of these needs. Originally conceived as a language which aimed to bring programming a little closer to the idealized world of mathematics, making it extremely useful in the human-to-human communication of algorithms, SETL has proven itself over the years to be an excellent language for software prototyping, primarily because its conciseness and immediacy lend it well to rapid experimentation. These characteristics, together with its general freedom from machine-oriented restrictions, its value semantics, its comprehension-style constructors for aggregates, its skill with strings, and especially its syntactic support for mappings, also make it well suited to high-level data processing.
In order to play the role of a full-fledged modern data processing language, however, SETL had to acquire the ability to manipulate processes and communicate with them easily, and furthermore to be able to work with networks, particularly the client-server model that rules the Internet. Accordingly, I have integrated a full set of process and network management features into SETL. In my dissertation, I show how the liberal use of fullweight processes, with the high, protective walls that surround them, sustains a modular design approach which in turn provides a strong defense against the main hazards of distributed computing, namely race conditions and deadlock, while preserving the luxury and convenience of programming in a truly high-level language. To this end, I have evolved protocols and design patterns for developing multiplexing servers and clients in SETL, and in my talk, will present examples of fairly complex systems where hierarchies of processes communicate over the network. Such systems tend to be notorious for their unreliability, but in these instances, robustness seems to follow naturally from the readability of simple programs written in an ancient and friendly language.
Muller (1998) develops a language of motion and shape change in terms of topological relations and temporal order relations between regions of space-time (histories). He uses this language to state and prove the transition rules developed in (Randell, Cui, and Cohn, 1992) that constrain the changes in spatial relations possible for objects whose shape changes continuously. Unfortunately, Muller's statement of the transition rules is inadequate. This paper presents an alternative statement of these transition rules.
A natural approach to defining continuous change of shape is in terms of a metric that measures the difference between two regions. We consider four such metrics over regions: the Hausdorff distance, the dual-Hausdorff distance, the area of the symmetric difference, and the optimal-homeomorphism metric. Each of these gives a different criterion for continuous change. We establish qualitative properties of all of these; in particular, the continuity of basic functions such as union, intersection, set difference, area, distance, and the boundary function; the transition graph between RCC relations (Randell, Cui, and Cohn, 1992). We discuss the physical significance of these different criteria.
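Two of the metrics named above have standard definitions that are easy to compare on discretized regions. The following is a minimal sketch, assuming each region is approximated by a finite set of unit grid cells identified by their coordinates; the function names and the example regions are illustrative and do not come from the paper.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def symmetric_difference_area(cells_a, cells_b, cell_area=1.0):
    """Area of the symmetric difference of two regions made of grid cells."""
    return len(set(cells_a) ^ set(cells_b)) * cell_area

# Region B is region A plus one far-away cell (an assumed toy example).
region_a = {(0, 0), (1, 0)}
region_b = {(0, 0), (1, 0), (5, 0)}
print(hausdorff(region_a, region_b))                  # distance from (5, 0) to (1, 0)
print(symmetric_difference_area(region_a, region_b))  # one cell differs
```

In this toy case a single distant cell moves the Hausdorff distance by four units while changing the symmetric-difference area by only one cell, which is one way to see that the metrics give different criteria for continuous change.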
We also show that the history-based definition of continuity proposed by Muller (1998) is equivalent to continuity with respect to the Hausdorff distance. An examination of the difference between the transition rules that we have found for the Hausdorff distance and the transition theorems that Muller derives leads to the conclusion that Muller's analysis of state transitions is not adequate. We propose an alternative characterization of transitions in Muller's first-order language over histories.
Processors conforming to the IEEE Standard for Floating-Point Arithmetic have been commonplace for some years, and now several programming languages seem to support or conform to this standard, hereafter referred to as ``the IEEE Standard.'' For example, The Java Language Specification by Gosling, Joy, and Steele, which defines the Java language, frequently mentions the IEEE Standard. Indeed, Java, as do other languages, supports some of the features of the IEEE Standard, including a couple of floating-point data formats, and even requires (in section 4.2.4 ``Floating-Point Operations'' of the aforementioned book) that ``operators on floating-point numbers behave exactly as specified by IEEE 754.''
Arguing that the support current languages offer is not enough, this thesis establishes clear criteria for what it means to fully support the IEEE Standard in a programming language. Each aspect of the IEEE Standard is examined in detail from the point of view of how various arithmetic engines implement that aspect of the IEEE Standard, how different languages (and implementations thereof) support it, and what the range of options is in supporting that aspect. Practical recommendations are then offered (particularly, but not exclusively, for Ada and Java), taking, for example, programmer convenience and impact on performance into consideration. A detailed model specification following these recommendations is provided for the Ada language.
In addition, a variety of issues related to the floating-point aspects of programming languages are discussed, so as to serve as a more complete guide to language designers. One such issue is floating-point expression evaluation schemes, and, more specifically, whether bit-for-bit identical results are actually achievable on a variety of platforms that conform to the IEEE Standard, as the Java language promises. Closely tied to this issue is that of double rounding, which occurs when a (possibly intermediate) result is rounded more than once before subsequent use or before being delivered to its final destination. So this thesis discusses when double rounding makes a difference, how it can be avoided, and what the performance impact is in avoiding it.
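Double rounding can be reproduced with exact rational arithmetic. The sketch below is an illustration under stated assumptions, not material from the thesis: the helper round_to_bits and the toy precisions (a 5-bit intermediate format and a 3-bit destination) are hypothetical stand-ins for, say, an extended-precision register and a double-precision destination.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, p: int) -> Fraction:
    """Round a positive rational to p significant bits, ties to even."""
    e = 0
    # Choose the exponent e so that 2**(p-1) <= x / 2**e < 2**p.
    while x / Fraction(2) ** e >= 2 ** p:
        e += 1
    while x / Fraction(2) ** e < 2 ** (p - 1):
        e -= 1
    mantissa = round(x / Fraction(2) ** e)  # round() on a Fraction rounds half to even
    return mantissa * Fraction(2) ** e

x = Fraction(145, 128)                           # 1.0010001 in binary
direct = round_to_bits(x, 3)                     # a single rounding to 3 bits
double = round_to_bits(round_to_bits(x, 5), 3)   # 5-bit intermediate, then 3 bits
print(direct, double)
```

Here x lies just above the midpoint between the 3-bit values 1.00 and 1.01 (binary). Rounding once gives 1.01, but the 5-bit intermediate rounding lands exactly on the midpoint, and the second rounding then resolves the tie to 1.00, so the two results differ.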
The growth of the internet has been fueled by an increasing number of sophisticated network-accessible services. Unfortunately, the high bandwidth and processing requirements of such services are at odds with current trends towards increased variation in network characteristics and a large diversity in end devices. Ubiquitous access to such services requires the injection of additional functionality into the network to handle protocol conversion, data transcoding, and in general bridge disparate portions of the physical network. Several researchers have proposed infrastructures for injecting such functionality; however, many challenges remain before these infrastructures can be widely deployed.
CANS is an application-level infrastructure for injecting application-specific components into the network that focuses on three such challenges: (a) efficient and dynamic composition of individual components; (b) dynamic and distributed adaptation of injected components in response to system conditions; and (c) support for legacy applications and services. The network view supported by CANS consists of applications, stateful services, and data paths between them built up from mobile soft-state objects called drivers. Both services and data paths can be dynamically created and reconfigured: a planning and event propagation model assists in distributed adaptation, and a run-time type-based composition model dictates how new services and drivers are integrated with existing components. An interception layer that virtualizes network bindings permits legacy applications to plug into the CANS infrastructure, and a delegation model does the same for legacy services.
This paper describes the CANS architecture and implementation, and a case study involving a shrink-wrapped client application in a dynamically changing network environment where CANS was used to improve overall user experience.
An effective algorithm design language should be 1) wide-spectrum in nature, i.e. capable of expressing both abstract specifications and low-level implementations, and 2) "computationally transparent", i.e. facilitate accurate estimation of time and space requirements. The conflict between these requirements is exemplified by SETL which is wide-spectrum, but lacks computational transparency because of its reliance on hash-based data structures. The first part of this thesis develops an effective algorithm design language, and the second demonstrates its usefulness for algorithm explanation and discovery.
In the first part three successively more abstract set-theoretic languages are developed and shown to be computationally transparent. These languages can collectively express both abstract specifications and low-level implementations. We formally define a data structure selection method for these languages using a novel type system. Computational transparency is obtained for the lowest-level language through the type system, and for the higher-level languages by transformation into the next lower level. We show the effectiveness of this method by using it to improve a difficult database query optimization algorithm from expected to worst-case linear time. In addition, a simpler explanation and a shorter proof of correctness are obtained.
In the second part we show how our data structure selection method can be made an effective third component of a transformational program design methodology whose first two components are finite differencing and dominated convergence. Finite differencing replaces costly repeated computations by cheaper incremental counterparts, and dominated convergence provides a generalized iteration scheme for computing fixed-points. This methodology has led us to a simpler explanation of a complex linear-time model-checking algorithm for the alternation-free modal mu-calculus, and to the discovery of an O(N^3) time algorithm for computing intra-procedural may-alias information that improves over an existing O(N^5) time algorithm.
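The idea behind finite differencing can be seen on a small fixed-point computation. The sketch below is an assumed example (graph reachability), not one taken from the thesis: the naive version recomputes the image of the whole set on every round, while the differenced version only processes elements added since the last step.

```python
def reach_naive(edges, start):
    """Least fixed point of R = {start} | successors(R), recomputed from scratch each round."""
    reached = {start}
    while True:
        new = reached | {v for u in reached for v in edges.get(u, ())}
        if new == reached:
            return reached
        reached = new

def reach_incremental(edges, start):
    """Same fixed point, but each node's successors are examined only once (a workset)."""
    reached = {start}
    work = [start]
    while work:
        u = work.pop()
        for v in edges.get(u, ()):
            if v not in reached:
                reached.add(v)
                work.append(v)
    return reached

# A hypothetical graph used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"], "e": ["a"]}
assert reach_naive(graph, "a") == reach_incremental(graph, "a")
```

The incremental version touches every edge at most once, whereas the naive version may revisit all edges on each round. Making costs like these predictable is the role the abstract assigns to data structure selection.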
We use relaxation to produce painted imagery from images and video. An energy function is first specified; a painting is then generated by performing a search for a painting with minimal energy. The appeal of this strategy is that, ideally, we need only specify what we want, not how to directly compute it. Because the energy function is very difficult to optimize, we use a relaxation algorithm combined with various search heuristics.
This formulation allows us to specify painting style by varying the relative weights of energy terms. The basic energy function yields an economical painting that effectively conveys an image with few strokes. This economical style produces moderate temporal coherence when processing video, without losing the essential 2D quality of the painting. The system allows as fine user control as desired: the user may interactively change the painting style, specify variations of style over an image, and/or add specific strokes to the painting. Procedural stroke textures may be used to enhance visual appeal.
A network of non-dedicated workstations can provide computational resources at minimal or no additional cost. If harnessed properly, the combined computational power of these otherwise ``wasted'' resources can outperform even mainframe computers. Performing demanding computations on a network of non-dedicated workstations efficiently has previously been studied, but inadequate handling of the unpredictable behavior of the environment and possible failures resulted in limited success only.
This dissertation presents a shared memory software system for executing programs with nested parallelism and synchronization on a network of non-dedicated workstations. The programming model exhibits a very convenient and natural programming style and is especially suitable for computations whose complexity and parallelism emerge only during their execution, such as in divide and conquer problems. To both support and take advantage of the flexibility inherent in the programming model, an architecture that distributes both the shared memory management and the computation is developed. This architecture removes bottlenecks inherent in centralization, thus enhancing scalability and dependability. By adapting to available resources dynamically and coping with unpredictable machine slowdowns and failures, the system also supports dynamic load balancing and fault tolerance--both transparently to the programmer.
One of the challenges of computer vision is that the information we seek to extract from images is not even defined for most images. Because of this, we cannot hope to find a simple process that produces the information directly from a given image. Instead, we need a search, or an optimization, in the space of parameters that we are trying to estimate.
In this thesis, I introduce two new optimization methods that use graph algorithms. They are characterized by their ability to find a global optimum efficiently. Each method defines a graph that can be seen as embedded in a Euclidean space. Graph-theoretic entities such as cuts and cycles represent geometric objects that embody the information we seek.
The first method finds a hypersurface in a Euclidean space that minimizes a certain kind of energy functional. The hypersurface is approximated by a cut of an embedded graph so that the total cost of the cut corresponds to the energy. A globally optimal solution is found by using a minimum cut algorithm. In particular, it can globally solve first order Markov Random Field problems in more generality than was previously possible. I prove that the convexity of the smoothing function in the energy is essential for the applicability of the method and provide an exact criterion in terms of the MRF energy.
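The correspondence between a cut and an energy can be illustrated on the simplest case: a two-label problem on a one-dimensional signal. This is only a sketch under stated assumptions, not the construction from the thesis, which handles more general first order MRF problems. The energy form, the weights, and all function names below are hypothetical, and the maximum-flow routine is a plain Edmonds-Karp implementation chosen for brevity.

```python
from collections import deque

def source_side_of_min_cut(cap, s, t):
    """Edmonds-Karp max flow; returns the nodes left on the source side of a minimum cut."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)  # reverse residual edges
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:            # breadth-first search for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)                      # no path left: reachable nodes form the source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck

def energy(labels, noisy, lam, data_weight):
    """Data term plus a convex smoothness term lam * |x_i - x_{i+1}|."""
    data = sum(data_weight * abs(x - y) for x, y in zip(labels, noisy))
    smooth = lam * sum(abs(a - b) for a, b in zip(labels, labels[1:]))
    return data + smooth

def restore(noisy, lam, data_weight):
    """Globally minimise energy() over binary labelings via a minimum cut."""
    cap = {"s": {}, "t": {}}
    for i, y in enumerate(noisy):
        cap["s"][i] = data_weight * abs(1 - y)      # cut when pixel i takes label 1
        cap[i] = {"t": data_weight * abs(0 - y)}    # cut when pixel i takes label 0
    for i in range(len(noisy) - 1):
        cap[i][i + 1] = lam                         # cut when neighbours disagree
        cap[i + 1][i] = lam
    side = source_side_of_min_cut(cap, "s", "t")
    return [0 if i in side else 1 for i in range(len(noisy))]
```

Each source-sink cut corresponds to one labeling, and the capacity of the cut equals the energy of that labeling, so a minimum cut yields a global minimum. On small inputs this can be checked by comparing energy(restore(...), ...) with a brute-force search over all labelings.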
The second method proposed here efficiently finds an optimal cycle in a Euclidean space. It uses a minimum ratio cycle algorithm to find a cycle with minimum energy in an embedded graph. In the case of two dimensions, the energy can depend not only on the cycle itself but also on the region defined by the cycle. Because of this, the method unifies the two competing views of boundary and region segmentation.
I demonstrate the utility of the methods in applications, with the results of experiments in the areas of binocular stereo, image restoration, and image segmentation. The image segmentation, or contour extraction, experiments are carried out in various situations using different types of information, for example motion, stereo, and intensity.
Object recognition is a central problem in computer vision. Typically it is assumed to follow a sequential model in which successively more specific hypotheses are generated about the image. This is a rather simplistic model, allowing as it does no margin for error at any point. We follow a more general approach in which the various representations involved are allowed to influence one another from the outset. As a guide and ultimate goal, we study the problem of finding the region occupied by human beings in images, and the separation of the region into arms, legs and head. We approach the problem as that of defining a functional on the space of boundaries in images whose minimum specifies the region occupied by the human figure. Previous work that uses such functionals suffers from a number of difficulties. These include an uncontrollable dependence on scale, an inability to find the global minimum for boundaries in polynomial time, and the inability to include region as well as boundary information. We present a new form of functional on boundaries in a manifold that solves these problems, and is also the unique form of functional in a specific class that possesses a non-trivial, efficiently computable global minimum. We describe applications of the model to single images and to the extraction of boundaries from stereo pairs and motion sequences. In addition, the functionals used in previous work could not include information about the shape of the region sought. We develop a model for the part structures of boundaries that extends previous work to the case of real images, thus including shape information in the functional framework. We show that such part structures are hyperpaths in a hypergraph. An `optimal hyperpath' algorithm is developed that globally minimizes the functional under some conditions. We show how to use exemplars of a shape to construct a functional that includes specific information about the topology of the part structure sought. 
An algorithm is developed that globally minimizes such functionals in the case of a fixed boundary. The behaviour of the functional mimics an aspect of human shape comparison.
We present a protocol for the fault-tolerant execution of parallel programs. The protocol leaves the implementation free to make choices concerning efficiency tradeoffs. Thus, we are proposing a design pattern rather than a fully specified algorithm. The protocol is modeled with the help of Petri nets.
Based on the Petri net model, we formally prove the correctness of the design pattern. This verification serves two goals: first, it guarantees the correctness of the design pattern; second, it serves as a test case for the underlying verification technique.
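For readers unfamiliar with Petri nets, the sketch below shows the standard firing rule on a small, hypothetical net. It is not the protocol from the paper: the places (pending, idle_worker, running, done) and the transitions (start, finish, fail) are assumptions chosen only to suggest how task execution with worker failure might be modeled.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    inputs, _ = transition
    return all(marking.get(place, 0) >= n for place, n in inputs.items())

def fire(marking, transition):
    """Consume the input tokens and produce the output tokens; returns a new marking."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    inputs, outputs = transition
    new = dict(marking)
    for place, n in inputs.items():
        new[place] = new.get(place, 0) - n
    for place, n in outputs.items():
        new[place] = new.get(place, 0) + n
    return new

# Hypothetical net: each transition is a pair (input places, output places).
START = ({"pending": 1, "idle_worker": 1}, {"running": 1})
FINISH = ({"running": 1}, {"done": 1, "idle_worker": 1})
FAIL = ({"running": 1}, {"pending": 1})  # the worker is lost and the task is re-queued

marking = {"pending": 2, "idle_worker": 2, "running": 0, "done": 0}
for step in (START, START, FAIL, FINISH, START, FINISH):
    marking = fire(marking, step)
print(marking)
```

In this toy net the number of tokens in pending, running, and done together never changes, so no task is lost when a worker fails. Invariants of this kind are typical of what a Petri net model lets one state and prove.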
We consider a scalar advection-diffusion problem and a recently proposed discontinuous Galerkin approximation, which employs discontinuous finite element spaces and suitable bilinear forms containing interface terms that ensure consistency. For the corresponding sparse, non-symmetric linear system, we propose and study an additive, two-level overlapping Schwarz preconditioner, consisting of a coarse problem on a coarse triangulation and local solvers associated to suitable problems defined on a family of subdomains.
This is a generalization of the corresponding overlapping method for approximations on continuous finite element spaces. Related to the lack of continuity of our approximation spaces, some interesting new features arise in our generalization, which have no analog in the conforming case.
We prove an upper bound for the number of iterations obtained by using this preconditioner with GMRES, which is independent of the number of degrees of freedom of the original problem and the number of subdomains. The performance of the method is illustrated by several numerical experiments for different test problems, using linear finite elements in two dimensions.
We address the problem of authorization in large-scale, open, distributed systems. Authorization decisions are needed in electronic commerce, mobile-code execution, remote resource sharing, content advising, privacy protection, etc. We adopt the trust-management approach, in which “authorization” is viewed as a “proof-of-compliance” problem: Does a set of credentials prove that a request complies with a policy? We develop a logic-based language Delegation Logic (DL) to represent policies, credentials, and requests in distributed authorization. Delegation Logic extends logic programming (LP) languages with expressive delegation constructs that feature delegation depth and a wide variety of complex principals (including, but not limited to, k-out-of-n thresholds). D1LP, the monotonic version of DL, extends the LP language Datalog with delegation constructs. D2LP, the nonmonotonic version of DL, also features classical negation, negation-as-failure, and prioritized conflict handling. Our approach to defining and implementing DL is based on tractably compiling DL programs into ordinary logic programs (OLP’s). This compilation approach enables DL to be implemented modularly on top of existing technologies for OLP, e.g., Prolog. As a trust-management language, Delegation Logic provides a concept of proof-of-compliance that is founded on well-understood principles of logic programming and knowledge representation. DL also provides a logical framework for studying delegation, negation of authority, conflicts between authorities, and their interplay.
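Two of the constructs mentioned above, k-out-of-n thresholds and bounded delegation depth, can be illustrated informally. The sketch below is not Delegation Logic syntax or semantics; it is a simplified reading in which a delegation is a plain edge between principals and a single depth bound applies to the whole chain. All names (trusted_principals, threshold_satisfied, the bank example) are hypothetical.

```python
def trusted_principals(root, delegations, max_depth):
    """Principals reachable from root through at most max_depth delegation steps."""
    seen = {root}
    frontier = {root}
    for _ in range(max_depth):
        frontier = {d for p in frontier for d in delegations.get(p, ())} - seen
        seen |= frontier
    return seen

def threshold_satisfied(principals, k, supporters):
    """True when at least k distinct members of principals support the statement."""
    return len(set(principals) & set(supporters)) >= k

# Assumed example: a bank delegates to two branches, and one branch delegates to a teller.
delegations = {"bank": ["branchA", "branchB"], "branchA": ["teller"]}
print(trusted_principals("bank", delegations, 1))   # the teller is beyond depth 1
print(threshold_satisfied(["alice", "bob", "carol"], 2, ["bob", "carol", "dave"]))
```

With depth 1 the teller's statements would not count toward a proof of compliance, while with depth 2 they would. The threshold check accepts the request because two of the three named principals support it.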
Domain decomposition methods are powerful iterative methods for solving systems of algebraic equations arising from the discretization of partial differential equations by, e.g., finite elements. The computational domain is decomposed into overlapping or nonoverlapping subdomains. The problem is divided into, or assembled from, smaller subproblems corresponding to these subdomains. In this dissertation, we focus on domain decomposition methods for mortar finite elements, which are nonconforming finite element methods that allow for a geometrically nonconforming decomposition of the computational domain into subregions and for the optimal coupling of different variational approximations in different subregions.
We introduce a FETI method for mortar finite elements, and provide numerical comparisons of FETI algorithms for mortar finite elements when different preconditioners, given in the FETI literature, are considered. We also analyze the complexity of the preconditioners for the three dimensional versions of the algorithms.
We formulate a variant of the balancing method for mortar finite elements, which uses extended local regions to account for the nonmortar sides of the subregions. We prove a polylogarithmic condition number estimate for our algorithm in the geometrically nonconforming case. Our estimate is similar to those for other Neumann-Neumann and substructuring methods for mortar finite elements.
In addition, we establish several fundamental properties of mortar finite elements: the existence of the nonmortar partition of any interface, the L^2 stability of the mortar projection for arbitrary meshes on the nonmortar side, and Friedrichs and Poincaré inequalities for geometrically nonconforming mortar elements.
The Finite Element Tearing and Interconnecting (FETI) method is an iterative substructuring method using Lagrange multipliers to enforce the continuity of the finite element solution across the subdomain interface. Mortar finite elements are nonconforming finite elements that allow for a geometrically nonconforming decomposition of the computational domain into subregions and, at the same time, for the optimal coupling of different variational approximations in different subregions. We present a numerical study of FETI algorithms for elliptic self-adjoint equations discretized by mortar finite elements. Several preconditioners which have been successful for the case of conforming finite elements are considered. We compare the performance of our algorithms when applied to classical mortar elements and to a new family of biorthogonal mortar elements and discuss the differences between enforcing mortar conditions instead of continuity conditions for the case of matching nodes across the interface. Our experiments are carried out for both two and three dimensional problems, and include a study of the relative costs of applying different preconditioners for mortar elements.
Queryable Expert Systems
10:00 a.m., Tuesday, October 17, 2000
12th floor conference room, 719 Broadway
Abstract
Interactive rule-based expert systems, which work by ``interviewing'' their users, have found applications in fields ranging from aerospace to help desks. Although they have been shown to be useful, people find them difficult to query in flexible ways. This limits the reusability of the knowledge they contain. Databases and noninteractive rule systems such as logic programs, on the other hand, are queryable but they do not offer an interview capability. This thesis is the first investigation that we know of into query-processing for interactive expert systems.
In our query paradigm, the user describes a hypothetical condition and then the system reports which of its conclusions are reachable, and which are inevitable, under that condition. For instance, if the input value for bloodSugar exceeds 100 units, is the conclusion diabetes then inevitable? Reachability problems have been studied in other settings, e.g., the halting problem, but not for interactive expert systems.
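For a decision tree, the two query types can be sketched directly. The code below is an illustration under stated assumptions, not the datalog-based algorithm of the thesis: the tree, its thresholds, and its conclusions are hypothetical, each internal node tests whether a variable exceeds a threshold, and a hypothetical condition is given as a half-open interval (lo, hi] per variable.

```python
import math

# Internal node: (variable, threshold, subtree_if_not_greater, subtree_if_greater).
# Leaf: a conclusion string. This tree is a made-up example.
TREE = ("bloodSugar", 100,
        ("bloodPressure", 140, "healthy", "hypertension"),
        "diabetes")

def reachable(tree, bounds):
    """Conclusions reachable when each variable v is restricted to (lo, hi] = bounds[v]."""
    if isinstance(tree, str):
        return {tree}
    var, threshold, low, high = tree
    lo, hi = bounds.get(var, (-math.inf, math.inf))
    found = set()
    if lo < threshold:   # some admissible value satisfies var <= threshold
        found |= reachable(low, {**bounds, var: (lo, min(hi, threshold))})
    if hi > threshold:   # some admissible value satisfies var > threshold
        found |= reachable(high, {**bounds, var: (max(lo, threshold), hi)})
    return found

def inevitable(tree, bounds):
    """In a decision tree every input reaches one leaf, so a lone reachable conclusion is inevitable."""
    conclusions = reachable(tree, bounds)
    return conclusions if len(conclusions) == 1 else set()

condition = {"bloodSugar": (100, math.inf)}   # bloodSugar exceeds 100 units
print(reachable(TREE, condition), inevitable(TREE, condition))
```

Under the hypothetical condition only the diabetes leaf remains reachable, so that conclusion is inevitable; with no condition at all, three conclusions are reachable and none is inevitable.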
We first give a theoretical framework for query-processing that covers a wide class of interactive expert systems. Then we present a query algorithm for a specific language of expert systems. This language is a restriction of production systems to an acyclic form that generalizes decision trees and classical spreadsheets. The algorithm effects a reduction from the reachability and inevitability queries into datalog rules with constraints. When preconditions are conjunctive, the data complexity is tractable.
Next, we optimize for queries to production systems that contain regions which are decision trees. When general-purpose datalog methods are applied to the rules that result from our queries, the number of constraints that must be solved is O(n^2), where n is the size of the trees. We lower the complexity to O(n). Finally, we have built a query tool for a useful subset of the acyclic production systems. To our knowledge, these are the first interactive expert systems that can be queried about the reachability and inevitability of their conclusions.
In this paper, we show that iterative substructuring methods of Finite Element Tearing and Interconnecting type can be successfully employed for the solution of linear systems arising from the finite element approximation of scalar advection-diffusion problems. Using similar ideas as those of a recently developed Neumann-Neumann method, we propose a one-level algorithm and a class of two-level algorithms, obtained by suitably modifying the local problems on the subdomains. We present some numerical results for some significant test cases. Our methods appear to be optimal for flows without closed streamlines and possibly very small values of the viscosity. They also show very good performances for rotating flows and moderate Reynolds numbers. Therefore, the algorithms proposed appear to be well-suited for many convection-dominated problems of practical interest.
We propose and analyze a domain decomposition method on non-matching grids for partial differential equations with non-negative characteristic form. No weak or strong continuity of the finite element functions, their normal derivatives, or linear combinations of the two is imposed across the boundaries of the subdomains. Instead, we employ suitable bilinear forms defined on the common interfaces, typical of discontinuous Galerkin approximations. We prove an error bound which is optimal with respect to the mesh-size and suboptimal with respect to the polynomial degree. Our analysis is valid for arbitrary shape-regular meshes and arbitrary partitions into subdomains. Our method can be applied to advective, diffusive, and mixed-type equations, as well, and is well-suited for problems coupling hyperbolic and elliptic equations.
Information Extraction (IE) is an emerging NLP technology, whose function is to process unstructured, natural language text, to locate specific pieces of information, or facts, in the text, and to use these facts to fill a database. IE systems today are commonly based on pattern matching. The core IE engine uses a cascade of sets of patterns of increasing linguistic complexity. Each pattern consists of a regular expression and an associated mapping from syntactic to logical form. The pattern sets are customized for each new topic, as defined by the set of facts to be extracted.
Construction of a pattern base for a new topic is recognized as a time-consuming and expensive process--a principal roadblock to wider use of IE technology in the large. An effective pattern base must be precise and must have wide coverage. This thesis addresses the portability problem in two stages.
First, we introduce a set of tools for building patterns manually from examples. To adapt the IE system to a new subject domain quickly, the user chooses a set of example sentences from a training text, and specifies how each example maps to the extracted event--its logical form. The system then applies meta-rules to transform the example automatically into a general set of patterns. This effectively shifts the portability bottleneck from building patterns to finding good examples.
Second, we propose a novel methodology for discovering good examples automatically from a large un-annotated corpus of text. The system is initially seeded with a small set of relevant patterns provided by the user. An unsupervised learning procedure then identifies new patterns and classes of related terms on successive iterations. We present experimental results, which confirm that the discovered patterns exhibit high quality, as measured in terms of precision and recall.
Advances in computing and networking technology, and an explosion in information sources, have resulted in a growing number of distributed systems being constructed out of resources contributed by multiple sources. Use of such resources is typically governed by sharing agreements between owning principals, which limit both who can access a resource and in what quantity. Despite their increasing importance, existing resource management infrastructures offer only limited support for the expression and enforcement of sharing agreements, typically restricting themselves to identifying compatible resources. In this paper, we present a novel approach building on the concepts of tickets and currencies to express resource sharing agreements in an abstract, dynamic, and uniform fashion. We also formulate the allocation problem of enforcing these agreements as a linear-programming model, automatically factoring the transitive availability of resources via chained agreements. A case study modeling resource sharing among ISP-level web proxies shows the benefits of enforcing transitive agreements: worst-case waiting times of clients accessing these proxies improve by up to two orders of magnitude.
Conditional synchronization - a mechanism that conditionally blocks a thread based on the value of a boolean expression currently exists in several programming languages. We propose promoting conditional synchronization to first-class status allowing the synchronization object representing a suspended conditional synchronization to be passed as a value.
To demonstrate our idea we extend Concurrent ML and present several examples illustrating the expressiveness of first-class conditional synchronization (FCS). FCS has broadcast semantics making it appropriate for applications such as barriers and discrete-event simulation. The semantics also guarantee that no transient store configurations are missed. The end result facilitates abstraction and adds flexibility in writing concurrent programs. To minimize re-evaluation of synchronization conditions we propose a static analysis and translation that identifies expressions for the run-time system that could affect the value of a synchronization condition. The static analysis (which is based on an effect type system) therefore precludes excessive run-time system polling of synchronization conditions.
The advantages of using a set of networked commodity computers for parallel processing are well understood: such computers are cheap, widely available, and mostly underutilized. So why has the use of such environments for compute-intensive applications not proliferated? A major reason is that the inherent complexities of programming applications and coordinating their execution on networked computers outweigh the advantages.
In networked environments populated with multiuser commodity computers, both the computing speed and the number of available computers for executing parallel programs may change frequently and unpredictably. As a consequence, programs need to continuously adapt their execution to the changing environment. The execution of an application must therefore address such issues as dynamic changes in effective machine speeds, dynamic changes in the number of available machines, and sudden network and machine failures. It is not feasible for an application programmer to write programs that adapt to the behavior of a system whose critical aspects cannot be anticipated.
I will present a unified set of techniques to implement a virtual reliable parallel-processing platform on a set of unreliable computers with temporally varying execution speeds. These techniques are specifically designed for automatically adapting the execution of parallel programs to distributed environments. I will explain these techniques in the context of two software systems, Calypso and ResourceBroker, that have been built to validate them.
Calypso gives a programmer a simple tool to build and effectively execute parallel programs on a set of commodity computers. The notable properties of Calypso are: (1) a simple, intuitive programming model based on a virtual machine interface; (2) separation of logical and physical parallelism, allowing the source code to codify the algorithm rather than the execution environment; and (3) a runtime system that efficiently adapts the execution of the program to the dynamic nature of the runtime environment. ResourceBroker is a resource manager that demonstrates a novel technique to dynamically manage the assignment of computers to parallel programs. ResourceBroker can work with a variety of parallel systems, even transparently managing those that are not aware of its existence, such as PVM and MPI, and will distribute available resources fairly among multiple computations. As a result, a mix of parallel programs, written using diverse programming systems can effectively execute concurrently on a set of computers.
In this paper we introduce improved rules for Catmull-Clark and Loop subdivision that overcome several problems with the original schemes (lack of smoothness at extraordinary boundary vertices, folds near concave corners). In addition, our approach to rule modification allows generation of surfaces with prescribed normals, both on the boundary and in the interior, which considerably improves control of the shape of surfaces.
This paper presents visualizations of binary search trees and splay trees. The visualizations comprise sequences of figures or frames, called comic strips. Consecutive frames are viewed two at a time to facilitate user (viewer) understanding of the algorithm steps. The visualizations are implemented in Java to facilitate their wide use. This paper explores several other considerations in the design of instructional visualizations.
We present a set of very low bandwidth techniques for navigating remote environments. In a typical setup using our system, a virtual environment resides on a server machine, and one or more users explore the environment from client machines. Each client uses previous views of the environment to predict the next view, using the known camera motion and image-based rendering techniques. The server performs the same prediction, and sends only the difference between the predicted and actual view. Compressed difference images require significantly less bandwidth than the compressed images of each frame, and thus can yield much higher frame rates. To request a view, the client simply sends the coordinates of the desired view and of the previous view to the server. This avoids the overhead of maintaining connections between the server and each client.
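The prediction-plus-difference scheme described above can be sketched as follows. This is a minimal illustration, not the system's implementation: views are modeled as 1-D lists of pixel values, and `predict` is a hypothetical stand-in (a simple shift) for the image-based rendering step that warps the previous view according to the known camera motion.

```python
def predict(previous_view, shift):
    """Hypothetical predictor: shift the previous view by `shift` pixels.

    Pixels that scroll into view are unknown, so they are predicted as 0.
    """
    n = len(previous_view)
    predicted = [0] * n
    for i, pixel in enumerate(previous_view):
        j = i + shift
        if 0 <= j < n:
            predicted[j] = pixel
    return predicted


def server_encode(previous_view, actual_view, shift):
    """Server side: run the same prediction as the client, send only the residual."""
    predicted = predict(previous_view, shift)
    return [a - p for a, p in zip(actual_view, predicted)]


def client_decode(previous_view, residual, shift):
    """Client side: rebuild the actual view from its own prediction plus the residual."""
    predicted = predict(previous_view, shift)
    return [p + r for p, r in zip(predicted, residual)]


previous = [10, 20, 30, 40, 50]
actual = [7, 10, 20, 30, 41]   # camera moved by one pixel, with small changes
residual = server_encode(previous, actual, shift=1)
reconstructed = client_decode(previous, residual, shift=1)
```

Because both sides compute the same prediction, the residual is mostly zeros whenever the prediction is good, which is what makes it compress well.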
No restrictions are placed on the scene or the camera motions; the view compression technique may be used with arbitrarily complex 3D scenes or dynamically changing views from a web camera or a digital television broadcast. A lossy compression scheme is presented in which the client estimates the cumulative error in each frame, and requests a complete refresh before errors become noticeable.
This work is applicable to remote exploration of virtual worlds such as on head-mounted displays, Digital Television, or over the Internet.
This thesis describes a novel statistical named-entity (i.e. ``proper name'') recognition system known as ``MENE'' (Maximum Entropy Named Entity). Named entity (N.E.) recognition is a form of information extraction in which we seek to classify every word in a document as being a person-name, organization, location, date, time, monetary value, percentage, or ``none of the above''. The task has particular significance for Internet search engines, machine translation, the automatic indexing of documents, and as a foundation for work on more complex information extraction tasks.
Two of the most significant problems facing the constructor of a named entity system are the questions of portability and system performance. A practical N.E. system will need to be ported frequently to new bodies of text and even to new languages. The challenge is to build a system which can be ported with minimal expense (in particular minimal programming by a computational linguist) while maintaining a high degree of accuracy in the new domains or languages.
MENE attempts to address these issues through the use of maximum entropy probabilistic modeling. It utilizes a very flexible object-based architecture which allows it to make use of a broad range of knowledge sources in making its tagging decisions. In the DARPA-sponsored MUC-7 named entity evaluation, the system displayed an accuracy rate which was well-above the median, demonstrating that it can achieve the performance goal. In addition, we demonstrate that the system can be used as a post-processing tool to enhance the output of a hand-coded named entity recognizer through experiments in which MENE improved on the performance of N.E. systems from three different sites. Furthermore, when all three external recognizers are combined under MENE, we are able to achieve very strong results which, in some cases, appear to be competitive with human performance.
Finally, we demonstrate the trans-lingual portability of the system. We ported the system to two Japanese-language named entity tasks, one of which involved a new named entity category, ``artifact''. Our results on these tasks were competitive with the best systems built by native Japanese speakers despite the fact that the author speaks no Japanese.
This paper addresses the problem of recovering 3D non-rigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full head and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example, Tomasi and Kanade's factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three-step process to yield pose, configuration, and shape. We demonstrate this simple but effective algorithm on video sequences of speaking people. We were able to recover 3D non-rigid facial models with high accuracy.
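The rank claim can be made concrete by writing the model out. The notation here is assumed for illustration only: $K$ basis shapes $S_i$, per-frame weights $l_{t,i}$, a $2 \times 3$ orthographic rotation $R_t$, $P$ tracked points, and $F$ frames, with the tracked points centered so that translation drops out.

$$ S_t = \sum_{i=1}^{K} l_{t,i}\, S_i, \qquad S_i \in \mathbb{R}^{3 \times P} $$

$$ W_t = R_t S_t = \begin{bmatrix} l_{t,1} R_t & \cdots & l_{t,K} R_t \end{bmatrix} \begin{bmatrix} S_1 \\ \vdots \\ S_K \end{bmatrix} $$

Stacking the $F$ frames gives $W = Q B$ with $Q \in \mathbb{R}^{2F \times 3K}$ and $B \in \mathbb{R}^{3K \times P}$, so $\operatorname{rank}(W) \le 3K$. Setting $K = 1$ recovers the rank-3 bound of the rigid case.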
Individual components of financial option portfolios cannot be evaluated independently under nonlinear models in mathematical finance. This entails increased algorithmic complexity if the options under consideration are path-dependent. We describe algorithms that price portfolios of vanilla, barrier and American options under worst-case assumptions in an uncertain volatility setting. We present a generalized approach to worst-case volatility scenarios in which only the duration, but not the starting dates of periods of high volatility risk are known. Our implementation follows object-oriented principles and is modular and extensible. Combinatorial and numerical algorithms are separate and orthogonal to each other. We make our tools available to a wide audience by using standard Internet technologies.
Given an affine subspace of square matrices, we consider the problem of minimizing the spectral abscissa (the largest real part of an eigenvalue). We give an example whose optimal solution has Jordan form consisting of a single Jordan block, and we show, using non-Lipschitz variational analysis, that this behaviour persists under arbitrarily small perturbations to the example. Thus although matrices with nontrivial Jordan structure are rare in the space of all matrices, they appear naturally in spectral abscissa minimization.
We consider spectral functions f o lambda, where f is any permutation-invariant mapping from C^n to R, and lambda is the eigenvalue map from C^{n X n} to C^n, ordering the eigenvalues lexicographically. For example, if f is the function "maximum real part", then f o lambda is the spectral abscissa, while if f is "maximum modulus", then f o lambda is the spectral radius. Both these spectral functions are continuous, but they are neither convex nor Lipschitz. For our analysis, we use the notion of subgradient extensively analyzed in Variational Analysis, R.T. Rockafellar and R. J.-B. Wets (Springer, 1998), which is particularly well suited to the variational analysis of non-Lipschitz spectral functions. We derive a number of necessary conditions for subgradients of spectral functions. For the spectral abscissa, we give both necessary and sufficient conditions for subgradients, and precisely identify the case where subdifferential regularity holds. We conclude by introducing the notion of semistable programming: minimizing a linear function of a matrix subject to linear constraints, together with the constraint that the eigenvalues of the matrix all lie in the right half-plane or on the imaginary axis. This is a generalization of semidefinite programming for non-Hermitian matrices. Using our analysis, we derive a necessary condition for a local minimizer of a semistable program, and give a generalization of the complementarity condition familiar from semidefinite programming.
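As a small numerical illustration of the two spectral functions named above, the sketch below computes the spectral abscissa and spectral radius of a 2x2 matrix from its characteristic polynomial. The helper names are hypothetical, and the restriction to 2x2 matrices is an assumption made only to keep the example free of external libraries.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the quadratic formula."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + disc) / 2, (trace - disc) / 2]


def spectral_abscissa(eigs):
    """f = 'maximum real part' applied to the eigenvalues."""
    return max(z.real for z in eigs)


def spectral_radius(eigs):
    """f = 'maximum modulus' applied to the eigenvalues."""
    return max(abs(z) for z in eigs)


# A matrix with eigenvalues -1 and -2.
stable = eigenvalues_2x2(0, 1, -2, -3)

# Perturbing a 2x2 Jordan block [[0, 1], [0, 0]] by eps in the lower-left
# entry moves the eigenvalues to +/- sqrt(eps): the abscissa changes by
# sqrt(eps), far more than eps, which is the non-Lipschitz behaviour.
eps = 1e-8
perturbed = eigenvalues_2x2(0, 1, eps, 0)
growth_ratio = spectral_abscissa(perturbed) / eps
```

The ratio of the change in the abscissa to the size of the perturbation grows without bound as `eps` shrinks, so no Lipschitz constant exists at the Jordan block.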
The popularity of mobile and networked applications has resulted in an increasing demand for execution ``sandboxes''---environments that impose irrevocable qualitative and quantitative restrictions on resource usage. Existing approaches either verify application compliance to restrictions at start time (e.g., using certified code or language-based protection) or enforce it at run time (e.g., using kernel support, binary modification, or active interception of the application's interactions with the operating system). However, their general applicability is constrained by the fact that they are either too heavyweight and inflexible, or are limited in the kinds of sandboxing restrictions and applications they can handle.
This paper presents a secure user-level sandboxing approach for enforcing both qualitative and quantitative restrictions on resource usage of applications in distributed systems. Our approach actively monitors an application's interactions with the underlying system, proactively controlling it as desired to enforce the desired behavior. Our approach leverages a core set of user-level mechanisms that are available in most modern operating systems: fine-grained timers, monitoring infrastructure (e.g., the /proc filesystem), debugger processes, priority-based scheduling, and page-based memory protection. We describe implementations of a sandbox that imposes quantitative restrictions on CPU, memory, and network usage on two commodity operating systems: Windows NT and Linux. Our results show that application usage of resources can be restricted to within 3% of desired limits with minimal run-time overhead.
Current technology trends point towards both an increased heterogeneity in hardware platforms and an increase in the mechanisms available to applications for controlling how these platforms are utilized. These trends motivate the design of resource-aware distributed applications, which proactively monitor and control utilization of the underlying platform, ensuring a desired performance level by adapting their behavior to changing resource characteristics.
This paper describes a general framework for enabling application adaptation on distributed platforms. The framework combines programmer specification of alternate execution behaviors (configurations) with automatic support for deciding when and how to adapt, relying extensively on two components: (1) profile-based modeling of application behavior, automatically generated by measuring application performance in a virtual execution environment with controllable resource consumption, and (2) application-specific continuous monitoring of current resource characteristics. The latter detects when application configurations need to change while the former guides the selection of a new configuration.
We evaluate these framework components using an interactive image visualization application. Our results demonstrate that starting from a natural specification of alternate application behaviors and an automatically generated performance database, our framework permits the application to both configure itself in diverse distributed environments and adapt itself to run-time changes in resource characteristics so as to satisfy user preferences of output quality.
The development of a prototyping language should follow the usual software-engineering methodology: starting with an evolvable, easily modifiable, working prototype of the proposed language. Rather than committing to the development of a mammoth compiler at the outset, we can design a translator from the prototyping language to another high-level language as a viable alternative. From a software-engineering point of view, the advantages of the translator approach are its shorter development cycle and lessened maintenance burden.
In prototyping language design, there are often innovative cutting-edge features which may not be well-understood. It is inevitable that numerous experimentations and revisions will be made to the current design, and hence supporting evolvability and modifiability is critical in the translator design.
In this dissertation we present an action-semantics-based framework for high-level source-to-source language translation. Action semantics is a form of denotational semantics that is based on abstract semantic algebra rather than Scott domain and lambda-notation. More specifically, this model not only provides a formal semantics definition for the source language and sets guidelines for implementations as well as migration, but also facilitates mathematical reasoning and a correctness proof of the entire translation process. The translation is geared primarily towards readability, maintainability, and type-preserving target programs, only secondarily towards reasonable efficiency.
We have acquired a collection of techniques for the translation of certain non-trivial high-level features of prototyping languages and declarative languages into efficient procedural constructs in imperative languages like Ada95, while using the abstraction mechanism of the target languages to maximize the readability of the target programs. In particular, we translate Griffin existential types into Ada95 using its object-oriented features, based on coercion calculus. This translation is actually more general, in that one can add existential types to a language (with a modicum of extra syntax) supporting the object-oriented paradigm without augmenting its type system, through intra-language transformation. We also present a type-preserving translation of closures which allows us to drop the whole-program-transformation requirement.
Let $V$, $E$, and $D$ denote the cardinality of the vertex set, the cardinality of the edge set, and the maximum degree of a bipartite multigraph $G$. We show that a minimal edge-coloring of $G$ can be computed in $O(E\log D)$ time.
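For orientation, the sketch below colors the edges of a bipartite multigraph with $D$ colors using the classical alternating-path argument. It is deliberately not the $O(E\log D)$ algorithm of the abstract: it is a simpler, slower baseline, and the function names and the edge-list input format are assumptions made for illustration.

```python
def edge_color_bipartite(edges):
    """Color the edges of a bipartite multigraph with max-degree many colors.

    edges: list of (u, v) pairs, u on the left side and v on the right side.
    Returns a list with one color in range(D) per edge.
    """
    # Tag endpoints so left and right vertex names can never collide.
    ends = [(("L", u), ("R", v)) for u, v in edges]
    degree = {}
    for x, y in ends:
        degree[x] = degree.get(x, 0) + 1
        degree[y] = degree.get(y, 0) + 1
    max_degree = max(degree.values(), default=0)

    color = [None] * len(edges)
    used = {x: {} for x in degree}  # used[vertex][color] -> edge index

    def free_color(x):
        # A vertex always has a free color while one of its edges is uncolored.
        return next(c for c in range(max_degree) if c not in used[x])

    for e, (u, v) in enumerate(ends):
        a, b = free_color(u), free_color(v)
        if a in used[v]:
            # Collect the maximal path from v that alternates colors a, b, a, ...
            # In a bipartite graph this path cannot reach u.
            path, x, c = [], v, a
            while c in used[x]:
                f = used[x][c]
                path.append(f)
                p, q = ends[f]
                x = q if x == p else p
                c = b if c == a else a
            # Swap a and b along the path, which frees color a at v.
            for f in path:
                p, q = ends[f]
                del used[p][color[f]]
                del used[q][color[f]]
            for f in path:
                p, q = ends[f]
                color[f] = b if color[f] == a else a
                used[p][color[f]] = f
                used[q][color[f]] = f
        color[e] = a
        used[u][a] = e
        used[v][a] = e
    return color


def is_proper(edges, color):
    """Check that no two edges sharing an endpoint have the same color."""
    seen = set()
    for (u, v), c in zip(edges, color):
        if ("L", u, c) in seen or ("R", v, c) in seen:
            return False
        seen.add(("L", u, c))
        seen.add(("R", v, c))
    return True


# A multigraph with a doubled edge (0, 0); its maximum degree is 3.
example_edges = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0)]
example_colors = edge_color_bipartite(example_edges)
```

Each edge may trigger a path swap that touches many edges, so this baseline is polynomial but far from the bound stated above.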
We give a randomized algorithm for the {\em Pattern Matching with Swaps} problem which runs in $O(m \log m \log |\Sigma| )$ time on a text of length $2m-1$ and a pattern of length $m$ drawn from an alphabet set of size $|\Sigma|$. This algorithm gives the correct answer with probability at least $1-\frac{1}{m}$ and does not miss a match. The best deterministic algorithm known for this problem takes $O(m^{4/3} \mbox{polylog}(m))$ time.
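To make the problem statement concrete, here is a naive quadratic-time checker, assuming the usual definition in which the pattern matches at a position if some set of disjoint swaps of adjacent pattern characters makes it equal to the text window. It is a reference baseline only, not the randomized algorithm described above, and the function names are hypothetical.

```python
def matches_with_swaps(text, pattern, start):
    """Return True if pattern matches text at `start` after disjoint adjacent swaps."""
    m = len(pattern)
    if start < 0 or start + m > len(text):
        return False
    j = 0
    while j < m:
        if pattern[j] == text[start + j]:
            # Characters agree, so no swap is needed at this position.
            j += 1
        elif (j + 1 < m
              and pattern[j] == text[start + j + 1]
              and pattern[j + 1] == text[start + j]):
            # Swapping pattern[j] and pattern[j + 1] fixes both positions.
            j += 2
        else:
            return False
    return True


def swap_match_positions(text, pattern):
    """All start positions where the pattern matches with swaps."""
    return [i for i in range(len(text) - len(pattern) + 1)
            if matches_with_swaps(text, pattern, i)]
```

The greedy left-to-right scan is safe because a swap is only ever useful where the characters disagree: if `pattern[j]` already equals the text character, any valid swap at `j` would exchange two equal characters. For instance, `swap_match_positions("abba", "ab")` reports positions 0 (exact match) and 2 (one swap).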
This dissertation examines bounded rationality as a tool in distributed systems of intelligent agents. We have implemented, in Java, a simulator for complex adaptive systems called CAF??. We use our framework to simulate a simple network and compare the effectiveness of bounded rationality at routing and admission control to that of a more traditional, source based, greedy routing approach. We find that the boundedly rational approach is particularly effective when user behavior is synchronized, such as occurs during breaking news releases on the World Wide Web, for example. We develop the key structures of our framework by first examining, through simulation, the behavior of boundedly rational speculators in a simple economy. We find them to be instrumental in bringing the economy quickly to price equilibrium as well as in maintaining the equilibrium in the face of changing conditions. We draw several interesting conclusions as to the key similarities between economy and computational systems and also, the situations where they differ drastically.
Molecular Biology studies the composition and interactions of life's agents, namely the various molecules (e.g. DNA, proteins, lipids) sustaining the living process. Traditionally, this study has been performed in wet labs using mostly physicochemical techniques. Such techniques, although precise and detailed, are often cumbersome and time consuming. On top of that, recent advances in sequencing technology have allowed the rapid accumulation of DNA and protein data. As a result a gap has been created (and is constantly being expanded): on the one side there is a rapidly growing collection of data containing all the information upon which life is built; and on the other side we are currently unable to keep up with the study of this data, impaired by the limits of existing analysis tools. It is obvious that alternative analysis techniques are badly needed. In this work we examine how computational methods can help in drilling the information contained in collections of biological data. In particular, we investigate how sequence similarity among various macromolecules (e.g. proteins) can be exploited towards the extraction of biologically useful information.
In the fields of computational vision and image understanding, the object recognition problem can often be formulated as a problem of matching a collection of model features to features extracted from an observed scene. This dissertation is concerned with the use of feature-based match similarity measures and feature match algorithms in object detection and classification in the context of image understanding from complex signature data. Our applications are in the domains of target vehicle recognition from radar imagery, and binocular stereopsis.
In what follows, we will consider “image understanding” to encompass the set of activities necessary to identify objects in visual imagery and to establish meaningful three-dimensional relationships between the objects themselves, or between the object and the viewer. The main goal in image understanding then involves the transformation of images to symbolic representation, effectively providing a high-level description of an image in terms of objects, object attributes, and relationships between known objects. As such, image understanding subsumes the capabilities traditionally associated with image processing, object recognition and artificial vision [Crevier and Lepage 1997].
In human and/or biological vision systems, the task of object recognition is a natural and spontaneous one. Humans can recognize immediately and without effort a huge variety of objects from diverse perceptual cues and multiple sensorial inputs. The operations involved are complex and inconspicuous psychophysical and biological processes, including the use of properties such as shape, color, texture, pattern, motion, context, as well as considerations based on contextual information, prior knowledge, expectations, functionality hypothesis, and temporal continuity. These operations and their relation to machine object recognition and artificial vision are discussed in detail elsewhere [Marr 1982], [Biederman 1985], but they are not our concern in this thesis.
In this research, we consider only the simpler problem of model-based vision, where the objects to be recognized come from a library of three-dimensional models known in advance, and the problem is constrained using context and domain-specific knowledge.
The relevance of this work resides in its potential to support state-of-the-art developments in both civilian and military applications including knowledge-based image analysis, sensors exploitation, intelligence gathering, evolving databases, interactive environments, etc. A large number of applications are reviewed below in section 1.4. Experimental results are presented in Chapters 5, 6, and
Hind et al.~\cite{Hind99} use a standard data flow framework \cite{Rosen79, Tarjan81} to formulate an intra-procedural may-alias computation. The intra-procedural aliasing information is computed by applying well-known iterative techniques to the Sparse Evaluation Graph (SEG) \cite{Choi91}. The computation requires a transfer function for each node that causes a potential pointer assignment (relating the data flow information flowing into and out of the node), and a set of aliases holding at the entry node of the SEG. The intra-procedural analysis assumes that precomputed information in the form of summary functions is available for all function-call sites in the procedure being analyzed. The time complexity of the intra-procedural may-alias computation for the algorithm presented by Hind et al.~\cite{Hind99} is $O(N^6)$ in the worst case (where $N$ is the size of the SEG). In this paper we present a worst case $O(N^3)$ time algorithm to compute the same may-alias information.
This talk concerns the strategic behavior of automated agents in the framework of network game theory, with particular focus on the collective behavior that arises via learning. In particular, ideas are conveyed on both the theory and simulation of learning in network games, in terms of two sample applications. The first application is network control, presented via an abstraction known as the Santa Fe bar problem, for which it is proven that rational learning does *not* converge to Nash equilibrium, the classic game-theoretic solution concept. On the other hand, it is observed via simulations that low-rationality learning, where agents trade off between exploration and exploitation, typically converges to mixed strategy Nash equilibria in this game. The second application is the economics of shopbots - agents that automatically search the Internet for price and product information - in which learning yields behaviors ranging from price wars to tacit collusion, with sophisticated low-rationality learning algorithms converging to Nash equilibria. This work forms part of a larger research program that advocates learning and game theory as a framework in which to model the interactions of computational agents in network domains.
This thesis investigates GUIs and their shortcomings. We demonstrate that there is room for refinement of existing graphical user interfaces, including those interfaces with which we are most familiar. A foundation for our designs is first established. It consists of known human capabilities, especially concerning hand-eye coordination, short term and long term memory, and visual perception. Accumulated experience in static and animated visual design provides additional guides for our work. On the basis of this foundation we analyze existing widgets. A series of new widgets are then proposed to address observed deficiencies in existing designs for scrolling, multiple copy and paste in text environments, text insertion and selection, and window management. Lessons learned from analyzing our new designs and observations of existing widgets are generalized into principles of widget design.
We propose an interactive framework for reconstructing an arbitrary 3D scene consistent with a set of images, for use in example-based image synthesis. Previous research has used human input to specify feature matches, which are then processed off-line; however, it is very difficult to correctly match images without feedback. The central idea of this paper is to perform and display 3D reconstruction during user modification. By allowing the user to interactively manipulate the image correspondence and the resulting 3D reconstruction, we can exploit both the user's intuitive image understanding and the computer's processing power.
The work users do with an application can be divided into actual work accomplished using the application and overhead performed in order to use the application. The latter can be further partitioned based on the time at which the work is performed: before (application location and delivery), during (installation) and after (upgrade) the installation of the application. This category can be characterized as the software deployment overhead. This thesis presents a component architecture RADIUS (Rapid Application location, Delivery, Installation and Upgrade System) in which applications can be built with no software deployment overhead to the users. An application is deployed automatically by simply giving the user a document produced by the application. Furthermore, the facilities in RADIUS make the applications self-upgrading. In the end, the users perform no deployment overhead work at all.
The conventional way of using an application is to install the application first, then start using documents of the application. The object-oriented programming (OOP) paradigm suggests that this order should be reversed: the data should lead to the code. However, almost all software fails to meet this model of design at the persistence level. While modern software often uses OOP at the program level, the underlying operating systems do not support OOP at the document/file level. OOP languages use pointers to methods to indicate what operations can be performed on the objects. We extend the idea to include "pointers to applications". Each document has an attached application pointer, which is read by RADIUS when the document is opened. This application pointer is then used to locate and deliver the application module necessary for the document.
RADIUS is designed to be compatible with existing technologies and requires no extensions to either programming languages or operating systems. It is orthogonal to programming tools, is language-independent and compatible among operating systems, and consequently does not impose limitations on which environments the developers can use. We illustrate the implementations for the two most popular platforms today - C++ on Windows, and Java. RADIUS is also orthogonal to other component systems such as CORBA or COM and is easy to integrate with them.