Fall 2012 Graduate Special Topics in Computer Science

NOTE: for descriptions of standard graduate computer science courses, see Graduate Course Descriptions.

CSCI-GA 3033-001 Statistical Natural Language Processing

In this course we will explore statistical, model-based approaches to natural language processing. There will be a focus on corpus-driven methods that make use of supervised and unsupervised machine learning algorithms. We will examine some of the core tasks in natural language processing, starting with simple word-based models for text classification and building up to rich, structured models for syntactic parsing and machine translation. In each case we will discuss recent research progress in the area and how to design efficient systems for practical user applications. In the course assignments you will construct basic systems and then improve them through a cycle of error analysis and model redesign. This course assumes a good background in basic probability and both strong ability and interest in programming in Java. The class is open to graduate as well as undergraduate students.

CSCI-GA 3033-002 Financial Software Projects

This course is an applied case study that focuses on fixed income markets. Topics covered include an overview of the markets, the inner workings of an investment bank, the market players, and where software engineers fit in. Students will be grouped into small teams to build a financial application using practical software engineering principles. Each team will build a risk management framework, starting with basic components. Prerequisites: Students are assumed to be able to code in C++. No prior experience in the financial sector is required.

CSCI-GA 3033-003 Production Quality Software

In this course, students learn to develop production-quality software. Lectures present real-world development practices that maximize software correctness and minimize development time. A special emphasis is placed on increasing proficiency in a particular programming language through weekly development projects and participation in code reviews. Assignments become more sophisticated as the semester progresses, eventually incorporating unit tests, build scripts, design patterns, and other techniques.

CSCI-GA 3033-004 Open Source Tools

This course covers a brief history and philosophy of open source software, followed by an in-depth look at open source tools intended for developers. In particular, we will present an overview of the Linux operating system, command line tools (find, grep, sed), programming tools (Git, trace), web and database tools (Apache, MySQL, App Engine), and system administration tools. We will also cover scripting languages such as shell and Python, and use them to write web applications.

CSCI-GA 3033-005 Distributed Systems

Distributed systems help programmers aggregate the resources of many networked computers to construct highly available and scalable services. This class teaches the abstractions, design, and implementation techniques that allow one to build fast, scalable, fault-tolerant distributed systems. Topics include multithreading, network programming, consistency, naming, fault tolerance, security, and several case studies of distributed systems.

CSCI-GA 3033-006 Motion Capture for Gaming & Urban Sensing

CSCI-GA 3033-007 Music Software Projects

Did you ever wonder why there are 12 notes in the western music scale? Or how the intervals between notes came to be? When were the first musical scales developed or "discovered" and how (and why) have they been modified since? Who were the key innovators of western music theory over the last few centuries?

It is not uncommon for software developers to have an affinity for music. After all, the creation of both software and music is part art and part science. Further, both music and computing are built upon fundamental mathematical principles. While one need not understand music theory to be a good player, understanding why we are constrained to a certain set of notes is enlightening for musicians and non-musicians alike.

This course is for students interested in how both music and software are constructed. Student teams will build, in phases, software that demonstrates the underlying rules of modern western music theory. The beauty of software is that it can be applied in just about any domain.

Music students are encouraged to apply even though this course is primarily a software development class. The interdisciplinary product development teams will be composed of at least one engineer and one subject domain expert who will work together on the assignments. The software the teams build will be used to demonstrate how music theory developed as well as give students an intuitive grasp of some fascinating underlying universal truths...

CSCI-GA 3033-008 Cancelled

CSCI-GA 3033-009 Speech Recognition

This course gives a computer science presentation of automatic speech recognition, the problem of accurately transcribing spoken utterances. The presentation covers the essential algorithms for creating large-scale speech recognition systems. The algorithms and techniques presented are now used in most research and industrial systems.

Many of the learning and search algorithms and techniques currently used in natural language processing, computational biology, and other areas of application of machine learning were originally designed for tackling speech recognition problems. Speech recognition continues to feed computer science with challenging problems, in particular because of the size of the learning and search problems it generates.

The objective of the course is thus not just to familiarize students with particular algorithms used in speech recognition, but to use them as a basis for exploring general text, speech, and machine learning algorithms relevant to a variety of other areas in computer science. The course will make use of several software libraries and will study recent research and publications in this area.

CSCI-GA 3033-010 Computer Games

CSCI-GA 3033-011 Cloud Computing: Concepts & Practice

This is a graduate-level course on Cloud Computing with emphasis on hands-on design and implementation. Both Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) cloud technologies and concepts will be covered. By the end of the course, students should have a fair amount of knowledge about how to use a cloud, write applications for a cloud, and build their own private cloud.

The first part of the course covers basic building blocks such as virtualization technologies, virtual appliances, automated provisioning, elasticity, and cloud monitoring. We shall learn these concepts by using and extending capabilities available in real clouds such as Amazon AWS, Google App Engine, and OpenStack.

The second part of the course will cover more advanced topics with emphasis on ultra-large-scale systems, computation models, and storage clouds for big data. Example topics are storage clouds, cloud security, Hadoop for Big Data, Network Virtualization (SDNs), and new services leveraging cloud migration. Several real-world applications will be covered to illustrate these concepts and research innovations, including Facebook Cassandra, Amazon Dynamo, Google Bigtable, Hadoop HDFS, and Yahoo ZooKeeper.

Students will benefit from a background in operating systems and in an object-oriented programming language such as Java. Students are expected to participate in class discussions, present research papers, and conduct a significant course project.

CSCI-GA 3033-012 Multicore Processors: Architecture & Programming

Tremendous advances in process technology have created a revolution both in hardware and in software. On the hardware side, we have moved from single-core processors to multicore and manycore processors. Multicore chips are now everywhere: you can find them in smartphones, PlayStations, and notebooks, all the way up to supercomputers. To benefit from these chips, software must be parallelized, which starts another revolution in software.

The purpose of this course is to introduce students to both the hardware advances and parallel programming techniques targeting multicore and manycore processors. Students will learn how to make the best use of the underlying hardware to build applications that can take advantage of the on-chip parallelism.

CSCI-GA 3033-013/MATH-GA 2011-003 Analytical Methods in Computer Science

In this course we will explore some of the most exciting developments in theoretical computer science over the last decade or two, emphasizing the common use of analytic techniques such as Fourier analysis. The main areas we will touch upon include:

* Property testing: can you test that a certain program does what it is supposed to do using only a small number of invocations of the program?

* Hardness of approximation: how does one prove that a certain problem is hard to approximate?

* Computational learning: how can a computer learn an unknown concept?

* Voting: is there a way to conduct a vote leading to a ranking of three candidates?

Underlying all these topics is the theory of Fourier analysis of Boolean functions, which will be the common thread throughout the course. We will see some of the key concepts in this theory, including the hypercontractive inequality and the "majority is stablest" theorem.

Depending on time constraints and interest, we will also see very recent topics such as the use of quantum algorithms to construct low-degree functions, or linear programs for the traveling salesman problem.

Although there are no specific prerequisites, this course is rather mathematical in nature, and so mathematical maturity is a must. In addition, familiarity with the basics of probability, the probabilistic method, algebra (especially finite fields), analysis of algorithms, and computational complexity would be helpful, but is not necessary.

CSCI-GA 3033-014 Principles of Software Security

Modern societies are increasingly dependent upon the proper functioning of their computing infrastructure. Yet, that infrastructure is riddled with flaws that at best mean systems fail, and at worst, allow a malicious attacker to take control. Broadly speaking, this course will address two questions.

1. What are common security problems and what are their underlying causes?
2. What are programming techniques, guidelines, principles, and tools that can help to detect and prevent them?

Traditionally, computer security is enforced by the operating system, which uses special hardware support to ensure security properties at application boundaries. However, the proliferation of successful attacks, such as viruses, worms, SQL injection, and cross-site scripting, shows that traditional approaches to security are insufficient. Adversaries exploit weaknesses both in the operating system itself, bypassing any protection mechanisms, and more and more frequently at the application level, where the operating system provides very limited guarantees. In this class we consider how programming language techniques can be used to fill the security gap by defending against application-level attacks.

Prerequisites: The course is open to Master's and PhD students. Students are assumed to have previously taken a course in programming languages, to be well practiced in programming in a high-level programming language, and to have basic knowledge of formal methods.
