All data science graduate students take coursework in statistics, algorithmics, and infrastructure and large scale computing, plus electives, a research colloquium, and a capstone project. All courses taken from a department affiliated with the Data Science program must be taken on the A-F grading basis if available.

To use any course requiring advisor or DGS approval, please submit a syllabus or detailed course description and a short note on the relevance of the course to data science (in many cases one sentence may suffice). Topics class listings apply only to the specific topics listed.

## Statistics

(6-8 credits: take 2 courses, at least one of which is from Tier I)

### Tier I Courses

• STAT 5101 - Theory of Statistics I / MATH 5651 - Basic Theory of Probability and Statistics (STAT 5101 and MATH 5651 are equivalent)
• STAT 5102 - Theory of Statistics II
• STAT 5302 - Applied Regression Analysis
• STAT 5511 - Time Series Analysis
• STAT 5401 - Applied Multivariate Methods
• STAT 8051 - Applied Statistical Methods 1: Computing and Generalized Linear Models
• STAT 8101 - Theory of Statistics I
• STAT 8102 - Theory of Statistics II
• PUBH 7401 - Fundamentals of Biostatistical Inference
• PUBH 7402 - Biostatistics Modeling and Methods
• PUBH 7440 - Introduction to Bayesian Analysis

### Tier II Courses

• AST/STAT 5731 - Bayesian Astrostatistics
• PUBH 8401 - Linear Models
• PUBH 8432 - Probability Models for Biostatistics
• PUBH 7405 - Biostatistics Regression
• PUBH 7406 - Advanced Regression and Design
• PUBH 7407 - Analysis of Categorical Data
• PUBH 7430 - Statistical Methods for Correlated Data
• PUBH 7460 - Advanced Statistical Computing
• PUBH 7485 - Methods for Causal Inference
• PUBH 8442 - Bayesian Decision Theory
• STAT 5052 - Statistical and Machine Learning
• STAT 5201 - Sampling Methodology in Finite Populations
• STAT 5303 - Designing Experiments
• STAT 5421 - Analysis of Categorical Data
• STAT 5601 - Nonparametric Methods
• STAT 5701 - Statistical Computing
• STAT 8112 - Mathematical Statistics II
• EE 5531 - Probability and Stochastic Processes
• EE 8581 - Detection and Estimation Theory
• Any course from a list of STAT/Biostat 5xxx/8xxx classes (but not STAT 5021) with advisor and DGS approval

## Algorithmics

(6 credits: take 2 courses, at least one of which is from Tier I)

### Tier I Courses

• CSCI 5521 - Introduction to Machine Learning (formerly Pattern Recognition)
• CSCI 5523 - Introduction to Data Mining
• CSCI 5525 - Machine Learning
• EE 8591 - Predictive Learning from Data
• PUBH 7475 - Statistical Learning and Data Mining
• PUBH 8475 - Statistical Learning and Data Mining

### Tier II Courses

• CSCI 5302 - Analysis of Numerical Algorithms
• CSCI 5304 - Computational Aspects of Matrix Theory
• CSCI 5511 - Artificial Intelligence I
• CSCI 5512 - Artificial Intelligence II
• CSCI 5609 - Visualization (renumbered from CSCI 5109)
• CSCI 8314 - Sparse Matrix Computations
• CSCI 8581 - Big Data in Astrophysics
• EE 5239 - Introduction to Nonlinear Optimization
• EE 5251 - Optimal Filtering and Estimation
• EE 5389 - Introduction to Predictive Learning
• EE 5391 - Computing With Neural Networks
• EE 5542 - Adaptive Digital Signal Processing
• EE 5551 - Multiscale and Multirate Signal Processing
• EE 5561 - Image Processing and Applications
• EE 5581 - Information Theory and Coding
• EE 5585 - Data Compression
• EE 8231 - Optimization Theory
• IE 5531 - Engineering Optimization I
• IE 8521 - Optimization
• IE 8531 - Discrete Optimization
• Any advanced class in optimization, game theory, or topic related to the listed Algorithmics courses (with advisor and DGS approval)

## Infrastructure and Large Scale Computing

(6 credits: take 2 courses, at least one of which is from Tier I)

### Tier I Courses

• CSCI 5105 - Introduction to Distributed Systems
• CSCI 5451 - Introduction to Parallel Computing: Architectures, Algorithms, and Programming
• CSCI 5707 - Principles of Database Systems
• CSCI 5708 - Architecture and Implementation of Database Management Systems
• EE 5351 - Applied Parallel Programming
• EE 8367/CSCI 8205 - Parallel Computer Organization

### Tier II Courses

• CSCI 5103 - Operating Systems
• CSCI 5211 - Data Communications and Computer Networks
• CSCI 5231 - Wireless and Sensor Networks
• CSCI 5271 - Introduction to Computer Security
• CSCI 5715 - From GPS and Virtual Globes to Spatial Computing
• CSCI 5751 - Big Data Engineering and Architecture
• CSCI 5801 - Software Engineering I
• CSCI 5802 - Software Engineering II
• CSCI 8102 - Foundations of Distributed Computing
• CSCI 8701 - Overview of Database Research
• CSCI 8715 - Spatial Databases and Applications
• CSCI 8725 - Databases for Bioinformatics
• CSCI 8735 - Advanced Database Systems
• CSCI 8801 - Advanced Software Engineering
• EE 5355 - Algorithmic Techniques for Scalable Many-core Computing
• EE 5371 - Computer Systems Performance Measurement and Evaluation
• EE 5381 - Telecommunications Networks
• EE 5501 - Digital Communication
• Any advanced class in large-scale data management or analysis, or topic related to the listed Infrastructure courses (with advisor and DGS approval)

## Special Topics Courses

• CSCI 8980 - Topic: Cloud Computing/Big Data - Infrastructure and Large Scale Computing Tier II
• CSCI 5980 - Topic: Big Data Engineering and Analytics (now CSCI 5751) - Infrastructure and Large Scale Computing Tier II
• CSCI 8980 - Topic: Advanced Topics in Distributed Systems - Infrastructure and Large Scale Computing Tier II
• CSCI 5980/8980 - Topic: Think Deep Learning - Elective
• IE 8534 - Topic: Modern Nonconvex Nondifferentiable Optimization with Applications in Statistical Learning - Elective
• IE 8534 - Topic: Stochastic Programming and Robust Optimization - Elective
• IE 8534 - Topic: Multilevel Monte Carlo for Problems in Data Science - Elective

## Electives

(9 credits; of which 3 must be 8xxx if you complete a one-semester Capstone project or 6 credits if you complete a two-semester Capstone project)

The courses listed below are suggestions and pre-approved options; the list is not exclusive. An elective is a course that does one of the following: explores more deeply the concepts or methodologies addressed in a regular track course listed above; addresses tools or methodologies needed to make those methods work; or belongs to an application area outside data science and discusses data management, data analysis, or data mining in the context of that area. There are potentially many such courses around the University. You may also use any course listed above if it is not used to satisfy a track requirement.

To use a course not specifically listed as an elective, submit a short note to your advisor explaining how the course meets one of these criteria, together with a syllabus or other course details listing the topics and the level at which they are taught. The level is often indicated indirectly by the prerequisites.

• CSCI 5106 - Programming Languages
• CSCI 5123 - Recommender Systems
• CSCI 5421 - Advanced Algorithms and Data Structures
• CSCI 5461 - Functional Genomics, Systems Biology, and Bioinformatics
• CSCI 5561 - Computer Vision
• CSCI 8271 - Security and Privacy in Computing
• CSCI 8363 - Numerical Linear Algebra in Data Exploration
• EE 5393 - Circuits, Computation and Biology
• GEOG 8920 - Urban Mobility & Accessibility
• IE 8535 - Introduction to Network Science
• MATH 5467 - Introduction to the Mathematics of Image and Data Analysis
• PUBH 7445 - Statistics for Human Genetics and Molecular Biology
• PUBH 7461 - Exploring and Visualizing Data in R
• PUBH 8445 - Statistics for Human Genetics and Molecular Biology
• PUBH 8446 - Advanced Statistical Genetics and Genomics
• PUBH 8472 - Spatial Biostatistics
• 5xxx/8xxx topics classes in CSCI, EE, STAT (but not STAT 5021), PUBH, etc. (see advisor for approval)
• Any advanced class in data-driven applications or emerging applications (with advisor and DGS approval)
• Note: No PUBH 6xxx, 5xxx, or 4xxx level courses can be used as electives in Data Science

## Colloquium

(1 credit)

• DSCI 8970 - Data Science M.S. Colloquium

## Capstone Project

(3 credits)

Every student in the data science master's program is required to complete a capstone research project. Under the supervision of a faculty member, students will go through the entire process of solving a real-world problem: from collecting and processing real-world data, to designing the best method to solve the problem, and finally, to implementing a solution. Some examples of labs with appropriate projects can be found on the graduate faculty research page, though this is only a partial list. Data-driven capstone projects may also grow from temporary internships at companies, with DGS approval.

• DSCI 8760 - Data Science M.S. Plan B Project