Our research focuses on compiler-based approaches to obtaining high performance on state-of-the-art and experimental architectures, including multi-core processors, GPUs, and petascale platforms. We are developing auto-tuning compiler technology that systematically maps application code onto these diverse architectures so that it uses them efficiently. An auto-tuning compiler generates a set of alternative implementations of a computation and uses empirical measurement to select the best-performing one. Our compiler can work automatically or collaboratively with application programmers to accelerate their performance tuning and, in some cases, produce results far better than is possible with manual tuning. Our group has access to DOE Leadership Class computing facilities, the University of Utah Center for High Performance Computing systems, and an NVIDIA Tesla system with over 30,000 cores.
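The generate-and-measure idea behind auto-tuning can be illustrated with a minimal Python sketch. It is not our compiler or its interface: the function names (`make_tiled_sum`, `autotune`), the tile sizes, and the toy summation kernel are all hypothetical, chosen only to show several variants of one computation being timed and the fastest being kept.

```python
import timeit


def make_tiled_sum(tile):
    """Return one variant: sum a list in chunks of `tile` elements (hypothetical kernel)."""
    def variant(data):
        total = 0
        for start in range(0, len(data), tile):
            total += sum(data[start:start + tile])
        return total
    return variant


def autotune(variants, data, repeats=3):
    """Time every variant on `data`; return the fastest variant's name and all timings."""
    timings = {}
    for name, fn in variants.items():
        # Keep the minimum over several runs to reduce timing noise.
        timings[name] = min(timeit.repeat(lambda: fn(data), number=5, repeat=repeats))
    best = min(timings, key=timings.get)
    return best, timings


# Assumed search space: three arbitrary tile sizes.
TILE_SIZES = [16, 256, 4096]
variants = {f"tile={t}": make_tiled_sum(t) for t in TILE_SIZES}

data = list(range(100_000))
best, timings = autotune(variants, data)
print("selected variant:", best)
```

Which variant wins depends on the machine and the input, which is why the selection is made by measurement rather than by a fixed rule.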