Category Archives: Department of Statistics

Career opportunities, Department of Statistics

Preparing for a career in "Big Data"

November 12, 2012 dunger

This comes from a Statistics PhD student of ours who is also a Data Scientist for a computing company in Research Park. Here are his thoughts on what students can do to prepare.

“I would recommend taking Java courses, since Java is relatively simple and widely used these days, and then perhaps some training in Hadoop and Hive. There are also some good books, such as Thinking in Java, Hadoop: The Definitive Guide, and Programming Hive. There may also be some open source projects using “Big Data”; we can usually learn a lot from others’ design and code.”

Department of Statistics, Graduate opportunities, Seminars

Bohrer Workshop – Nov. 15

November 12, 2012 dunger

One of the most notable activities in our department for graduate students is the Bohrer Workshop. A subset of the graduate students is selected to give a quality presentation of their current research in statistics. This is an excellent opportunity for undergraduate students to get a peek at what might lie ahead if you too choose to pursue an advanced degree. You don’t have to be there the whole day; just pick one or two talks that sound interesting. The schedule is below.

Bohrer Workshop Schedule

Department of Statistics

The power of Statistics

November 9, 2012 dunger

A post from one of your fellow Statistics students…

I wanted to bring your attention to fivethirtyeight.com, which you may have seen during the election. Nate Silver used some awesome yet simple models to predict the election. This LA Times article has some awesome stuff about math/statistics and its future use! http://www.latimes.com/news/nationworld/nation/la-fi-election-math-20121108,0,2926239.story

Here is the link to his book: http://www.amazon.com/The-Signal-Noise-Predictions-Fail-but/dp/159420411X/ref=sr_1_1?ie=UTF8&qid=1352401490&sr=8-1&keywords=nate+silver

I thought it was very relevant and interesting as this just shows the power of numbers. I think some of our grad students (even undergrad) would be interested in it so feel free to pass this on. I followed that website for the last 6 months, and it was really cool to see how his numbers and projections changed. He sure did stand his ground with his methods.

Department of Statistics, Seminars

Statistics Seminar — John Lafferty at AIIS this Friday!

November 7, 2012 dunger

This week we are hosting Prof. John Lafferty (http://www.cs.cmu.edu/~lafferty/). He’ll deliver a talk at the AIIS seminar (http://cogcomp.cs.illinois.edu/sites/aiis/). Please note that the talk venue has been moved from 3405 SC to 2405 SC as we are expecting a larger audience. Following are the details:

When:
Nov 9, Friday. 4 pm.

Where:
2405, Siebel Center

Title:
Graphical Model Estimation

Abstract:
The graphical model has proven to be a useful abstraction in statistics and machine learning. The starting point is the graph of a distribution. While often the graph is assumed given, we have been studying the problem of estimating the graph from data. In this talk we present several nonparametric and semi-parametric methods for graph estimation. One approach is a nonparametric extension of the Gaussian graphical model that allows arbitrary graphs. For the discrete Gaussian (Ising model), we use parallel neighborhood selection with L1-regularized logistic regression. Alternatively, we can restrict the family of graphs to spanning forests, enabling the use of fully nonparametric density estimation in high dimensions. When additional covariates are available, we propose a framework for graph-valued regression. The resulting methods are easy to understand and use, theoretically well supported, and effective for modeling and exploring high dimensional data. Joint work with Han Liu, Pradeep Ravikumar, Martin Wainwright, and Larry Wasserman.

Bio:
John Lafferty is the Louis Block Professor in the Departments of Statistics, Computer Science, and the College at The University of Chicago. His research area is machine learning, with a focus on computational and statistical aspects of nonparametric methods, high-dimensional data, graphical models, and applications. An associate editor of the Journal of Machine Learning Research, Dr. Lafferty served as program co-chair and general co-chair of the Neural Information Processing Systems Foundation conferences in 2009 and 2010. Dr. Lafferty received his doctoral degree in mathematics from Princeton University, where he was a member of the Program in Applied and Computational Mathematics. Prior to joining the University of Chicago in 2011, he was Professor of Computer Science, Machine Learning, and Statistics at Carnegie Mellon University, where he is currently an Adjunct Professor.

Department of Statistics, Seminars

Dept. of Statistics Weekly seminar

October 29, 2012 dunger

Lee DeVille (University of Illinois at Urbana-Champaign): Stochastic dynamics on networks. Emergence of collective behaviors

Date Nov 1, 2012

Time 4:00 pm – 4:50 pm

Location 156 Henry

Sponsor Statistics Department

Event type Seminar

Dynamical systems defined on networks have applications in many fields in science and engineering. In particular, it is important to understand when networks exhibit synchronous or other types of coherent collective behaviors. Other questions include whether such coherent behavior is stable with respect to random perturbation, or what the detailed structure of this behavior is as it evolves. We will examine several models of networked dynamical systems and present a mixture of results that range from rigorous theorems for abstract models to quantitative comparisons of models and data.

Department of Statistics, Seminars

Department of Statistics Weekly Seminar

October 15, 2012 dunger

Wei Sun (University of North Carolina): Statistical methods for RNA-seq data

Date Oct 18, 2012

Time 4:00 pm – 5:00 pm

Location 156 Henry

Sponsor Statistics Department

Event type Seminar

RNA-seq is replacing gene expression microarrays as the most commonly used technique to assess genome-wide transcription abundance. RNA-seq delivers two novel features. First, it provides information on allele-specific expression (ASE), which is not available from gene expression microarrays. Second, it generates unprecedentedly rich data to study RNA-isoform expression. I will present statistical methods for joint study of allele-specific expression and total expression of a gene, transcriptome reconstruction, isoform abundance estimation, and Differential isOform usage Testing (DOT).

Department of Statistics, Seminars

Department of Statistics weekly seminar

October 8, 2012 dunger

Yuan Ji, Ph.D. (NorthShore University HealthSystem): Bayesian Models for Next-Generation Sequencing Data on Histone Modifications

Speaker Yuan Ji, Ph.D. (NorthShore University HealthSystem )

Date Oct 11, 2012

Time 4:00 pm – 4:50 pm

Location 156 Henry

Sponsor Statistics Department

Event type Seminar

In this talk, I will describe how Bayesian models are successfully applied to the field of epigenetics, which is concerned with the regulatory mechanisms of gene expression. Epigenetics, one of the most heavily researched and challenging fields in biology, increasingly draws attention from statisticians due to breakthroughs in bioengineering and biotechnology that allow large-scale and high-throughput experiments to be routinely conducted at affordable cost. A central topic of epigenetics is to understand the chromatin state: modifications to histones and other proteins that package the DNA. A complex mechanism called the “histone code” is believed to dictate the dynamics of DNA expression. As a step towards deciphering the histone code, we develop Bayesian models based on genome-wide mapping of histone modifications. Such models are only initial attempts to decipher the complex histone code, but they highlight the need for Bayesian inference in research on gene regulation, an area that has received a relatively small amount of attention from statisticians. I will summarize our recent work and results using a comprehensive ChIP-Seq data set.

Department of Statistics, Seminars

Department of Statistics Weekly Seminar

October 1, 2012 dunger

Heike Hofmann, Ph.D. (Iowa State University)

Speaker Heike Hofmann, Ph.D. (Iowa State University)

Date Oct 4, 2012

Time 4:00 pm – 5:00 pm

Location 156 Henry

Sponsor Statistics Department

A Discussion of Graphical Inference

How do you know if something that you see in a data plot is really there?

Statistical inference for exploratory data analysis allows us to quantitatively assess the strength of a visual finding, and places statistical graphics in the context of classical inference. New work builds on the lineup protocol, which puts graphics into an inference framework by examining the data plot in relation to null plots. This talk describes various aspects of the development of graphical inference: definitions of terminology and concepts, experiments conducted to validate the lineup protocol, and how to compute p-values and power. Applications of visual inference in practice will be discussed, including how to choose the best display and scenarios where no classical test exists because critical assumptions are violated.
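To make the p-value idea concrete, here is a minimal sketch of one simple formulation, not necessarily the computation presented in the talk. It assumes each observer independently views a lineup in which the data plot is hidden among null plots, so that under the null hypothesis an observer picks the data plot with probability 1 divided by the lineup size. The function name `lineup_p_value` and the default lineup size of 20 are illustrative assumptions.

```python
from math import comb


def lineup_p_value(num_observers, num_correct, lineup_size=20):
    """Binomial tail probability for a lineup experiment.

    Assumption (not from the talk): under the null, each of the
    num_observers independent observers picks the data plot with
    probability 1 / lineup_size. The p-value is the chance that
    num_correct or more observers pick it purely by guessing.
    """
    if not 0 <= num_correct <= num_observers:
        raise ValueError("num_correct must be between 0 and num_observers")
    p = 1.0 / lineup_size
    return sum(
        comb(num_observers, k) * p**k * (1 - p) ** (num_observers - k)
        for k in range(num_correct, num_observers + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 10 observers, 4 of whom picked the data plot.
    print(lineup_p_value(10, 4))
```

A small p-value here suggests that observers identified the data plot more often than guessing would explain, which is the visual analogue of rejecting a null hypothesis.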

Department of Statistics, Seminars

Department of Statistics Weekly Seminar

September 24, 2012 dunger

Sewoong Oh (University of Illinois at Urbana-Champaign): Budget-Optimal Task-Allocation for Reliable Crowdsourcing Systems

Speaker Sewoong Oh, University of Illinois at Urbana-Champaign

Date Sep 27, 2012

Time 4:00 pm – 5:00 pm

Location 156 Henry

Sponsor Statistics Department

Event type Seminar

This talk is on my ongoing research on designing reliable and cost-efficient crowdsourcing systems. Crowdsourcing is a novel paradigm for solving large scale problems by breaking them down into small tasks that are electronically distributed to numerous on-demand human contributors. In typical crowdsourcing, these tasks are submitted to an electronic labor market and completed by any worker choosing to pick them up for a small reward. However, since typical crowdsourced tasks are tedious and the reward is small, errors are common even among those who make an effort. Thus, all taskmasters need to devise schemes to increase confidence in their answers. A common approach is to assign each task multiple times and combine the answers in some way, such as majority voting. For such systems, there is a fundamental problem of interest: how can we achieve a certain reliability in our answers at minimum cost? Under a general model, we provide an optimal algorithm based on low-rank matrix approximation and belief propagation. We prove that our approach significantly outperforms majority voting and, in fact, is asymptotically order-optimal through comparison to an oracle that knows the reliability of every worker. We also provide experimental results on synthetic and real datasets that support the optimality of our approach.
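For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of majority voting only. It is not the speaker's algorithm, which is based on low-rank matrix approximation and belief propagation. The function names and the example task identifiers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common label among the answers to one task.

    Ties go to the label that appears first in the list, because
    Counter.most_common orders equal counts by first occurrence.
    """
    if not answers:
        raise ValueError("need at least one answer")
    return Counter(answers).most_common(1)[0][0]


def aggregate(task_answers):
    """Apply majority voting to every task in a {task: [labels]} mapping."""
    return {task: majority_vote(labels) for task, labels in task_answers.items()}


if __name__ == "__main__":
    # Hypothetical data: each task was assigned to three workers.
    collected = {"task_a": [1, 1, -1], "task_b": [-1, 1, -1]}
    print(aggregate(collected))
```

Majority voting treats every worker as equally reliable, which is the limitation the abstract's approach is meant to address.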

Department of Statistics, Seminars

Department of Statistics Weekly Seminar

September 17, 2012 dunger

Song-Xi Chen (Iowa State University): High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

Speaker Song-Xi Chen, Iowa State University

Date Sep 20, 2012

Time 3:30 pm – 4:30 pm

Location 122 Illini Hall

Sponsor Statistics Department

Event type Seminar

This paper studies the maximum empirical likelihood estimation (MELE) and inference on parameters identified by generalized estimating equations with weakly dependent data when the dimensions of the estimating equations and the parameters are diverging. Our theory greatly extends a wide range of existing results to the new time series framework of growing dimensions of the parameters, the estimating equations and the observed covariates. We obtain the consistency with rates and the asymptotic normality of the MELE by properly restricting the growth rates of the dimensions of the parameters and the estimating equations, as well as the degree of dependence. We also show that, even in this high dimensional nonlinear time series setting, the empirical likelihood ratio still behaves like a Chi-square random variable asymptotically. (Note that the time and location are different from usual.)

Undergraduate Advising in Statistics


Department of Statistics, University of Illinois at Urbana-Champaign