This thesis consists of two parts. First, we explore means of discovering features. We propose an information-theoretic criterion for identifying features, one with deep connections to statistical estimation theory. We consider features to be ``nice'' representations of objects. We find that, ideally, a feature-space representation of an image is the most concise representation that captures all of the information available in it. In practice, however, we are satisfied with an approximation to this ideal. We therefore explore a few such approximations and explain their connection to the information-theoretic approach. We examine the algorithms that implement these approximations and consider their generalizations in the related field of stereo vision.
Using features, whether they come from a feature-discovery algorithm or are hand-crafted, is usually an ad hoc process that depends on the problem at hand and on the exact representation of the features. This diversity arises mostly from the multitude of ways in which features capture information. In the second part of this thesis, we propose an architecture that lets us use features in a very flexible way, in the context of content-addressable memories. We apply this approach to two radically different domains: face images and English words. We also look at human performance in reconstructing words from fragments, which gives us some information about the memory subsystem in human beings.