Publications

  1. “Data-Driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs,” Khalid Ahmad, Hari Sundar, Mary Hall, ACM Transactions on Architecture and Code Optimization, 16(5), Dec. 2019.
  2. “Exploiting Reuse and Vectorization in Blocked Stencil Computations on CPUs and GPUs,” T. Zhao, S. Williams, M. Hall, H. Johansen, International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov. 2019.
  3. “SWIRL: High-Performance Many-Core CPU Code Generation for Deep Neural Networks,” Anand Venkat, Tharindu Rusira, Raj Barik, Mary Hall, Leonard Truong, International Journal of High-Performance Computing Applications, 33(6), 2019.
  4. “Sparse Computation Data Dependence Simplification for Efficient Compiler-Generated Inspectors,” M. Mohammadi, K. Cheshmi, E. Davis, M. Hall, M. Dehnavi, P. Nandy, C. Olschanowsky, A. Venkat, T. Yuki, M. Strout, Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2019.
  5. “The Sparse Polyhedral Framework: Composing Compiler-Generated Inspector-Executor Code,” M. M. Strout, M. Hall and C. Olschanowsky, Proceedings of the IEEE 106(11):1921–1934, Nov. 2018.
  6. “Autotuning in High-Performance Computing Applications,” Prasanna Balaprakash, Jack Dongarra, Todd Gamblin, Mary Hall, Jeffrey K. Hollingsworth, Boyana Norris, and Richard Vuduc, Proceedings of the IEEE 106(11):2068–2083, Nov. 2018.
  7. “Student Cluster Competition 2017, Team University of Utah: Reproducing Vectorization of the Tersoff Multi-Body Potential on the Intel Broadwell and Intel Skylake Platforms,” J. Lake, Q. Chao, H. Eyre, E. Ford, K. Parker, K. Savoie, H. Sundar, M. Hall, Parallel Computing 79, Jul. 2018.
  8. “Reproducing ParConnect for SC16,” Marek Baranowski, Braden Caywood, Hannah Eyre, Janaan Lake, Kevin Parker, Kincaid Savoie, Hari Sundar, Mary Hall, Parallel Computing 70:18–21, Dec. 2017.
  9. “Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers,” P. Basu, S. Williams, B. Van Straalen, L. Oliker, P. Colella, and M. Hall, Parallel Computing 64(C):50–64, May 2017.
  10. “Designing a Tunable Nested Data-Parallel Programming System,” S. Muralidharan, M. Garland, A. Sidelnik, M. Hall, ACM Transactions on Architecture and Code Optimization, 13(4), December 2016.
  11. “Automating Wavefront Parallelization for Sparse Matrix Codes,” A. Venkat, M. Mohammadi, J. Park, R. Barik, H. Rong, M. Strout, M. Hall, International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov. 2016, Best Paper Finalist.
  12. “Synchronization Tradeoffs in GPU Implementations of Graph Algorithms,” R. Kaleem, A. Venkat, S. Pai, M. Hall, K. Pingali, Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2016.
  13. “Architecture-Adaptive Code Variant Tuning,” S. Muralidharan, A. Roy, M. Hall, M. Garland, and P. Rai, Proceedings of the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 2016.
  14. “Generating Efficient Tensor Contractions for GPUs,” T. Nelson, A. Rivera, P. Balaprakash, M. Hall, P.D. Hovland, E. Jessup, B. Norris, Proceedings of the IEEE International Conference on Parallel Processing (ICPP), Sept. 2015.
  15. “Loop and Data Transformations for Sparse Matrix Code,” Anand Venkat, Mary Hall, Michelle Strout, Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2015.
  16. “Nitro: A Framework for Adaptive Code Variant Tuning,” S. Muralidharan, M. Shantharam, M. Hall, M. Garland, B. Catanzaro, Proceedings of the International Parallel and Distributed Processing Symposium, May 2014.
  17. “Non-affine Extensions to Polyhedral Code Generation,” A. Venkat, M. Shantharam, M. Hall, M. M. Strout, Proceedings of the International Conference on Code Generation and Optimization, Feb. 2014.
  18. “Compiler Generation and Autotuning of Communication-Avoiding Operators for Geometric Multigrid,” P. Basu, S. Williams, B. Van Straalen, A. Venkat, L. Oliker, M. Hall, High Performance Computing Conference (HiPC), December 2013.
  19. “Towards Making Autotuning Mainstream,” P. Basu, M. Hall, M. Khan, S. Maindola, S. Muralidharan, S. Ramalingam, A. Rivera, M. Shantharam, A. Venkat, International Journal of High Performance Computing Applications, 27(4), November 2013.
  20. “A script-based autotuning compiler system to generate high-performance CUDA code,” M. Khan, P. Basu, G. Rudy, M. Hall, C. Chen, and J. Chame, ACM Transactions on Architecture and Code Optimization, 9(4), January 2013.
  21. “Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters,” H. Dursun, M. Kunaseth, K. Nomura, J. Chame, R.F. Lucas, C. Chen, M. Hall, R.K. Kalia, A. Nakano, P. Vashishta, The Journal of Supercomputing, 62(2):946-966, December 2012.
  22. “Understanding ACM’s Past,” M. Hall, Communications of the ACM, 55(12), December 2012.
  23. “Improving High-Performance Sparse Libraries using Compiler-Assisted Specialization: A PETSc Case Study,” Shreyas Ramalingam, M. Hall and C. Chen, Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS), held in conjunction with the International Parallel and Distributed Processing Symposium, May 2012.
  24. “Analyzing the effect of compiler optimizations on application reliability,” M. Demertzi, M. Annavaram and M. Hall, Proceedings of the IEEE International Symposium on Workload Characterization, Nov. 2011.
  25. “Understanding the Behavior of Pthread Applications on Non-Uniform Cache Architectures,” G. S. Sachdev, K. Sudan, M. W. Hall, and R. Balasubramonian, (poster paper), Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, Oct. 2011.
  26. “Generating High Performance Libraries using CHiLL and Autotuning,” S. Ramalingam and M. Hall, (poster), International Workshop on Languages and Compilers for Parallel Computing, Sept. 2011.
  27. “Auto-tuning Full Applications: A Case Study,” A. Tiwari, C. Chen, C. Liao, J. Chame, J. Hollingsworth, M. Hall and D. Quinlan, International Journal of High Performance Computing Applications, 25(3):286-294, Aug. 2011.
  28. “Domain-Specific Optimization of Signal Recognition Targeting FPGAs,” M. Demertzi, P.C. Diniz, M.W. Hall, A.C. Gilbert and Y. Wang, ACM Transactions on Reconfigurable Technology and Systems, 4(2), May 2011.
  29. “Evaluating graph coloring on GPUs,” P. Grosset, P. Zhu, S. Liu, S. Venkatasubramanian, and M. Hall, Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming (PPoPP ’11), Feb. 2011. Runner-up for Best Student Poster.
  30. “EigenCFA: Accelerating Flow Analysis with GPUs,” T. Prabhu, S. Ramalingam, M. Might, M. Hall, ACM SIGPLAN Symposium on Principles of Programming Languages, Jan. 2011.
  31. “A Programming Language Interface to Describe Transformations and Code Generation,” G. Rudy, M. Khan, M. Hall, C. Chen and J. Chame, Lecture Notes in Computer Science, Volume 6548, Languages and Compilers for Parallel Computing, Springer-Verlag, 2011, pp. 136-150.
  32. “Languages and Compilers for Autotuning,” M.W. Hall and J. Chame, in Performance Tuning of Scientific Applications, edited by David Bailey, Robert F. Lucas and Sam Williams, Taylor and Francis, Nov. 2010.
  33. “CUDA-CHiLL: Using Compiler-Based Autotuning to Generate High-Performance GPU Libraries,” M. Khan, G. Rudy, C. Chen, M. Hall, J. Chame, (poster), SC’10, Nov. 2010.
  34. “Automatic High-Performance GPU Code Generation using CUDA-CHiLL,” (poster), Malik Khan, Jacqueline Chame, Gabe Rudy, Chun Chen, Mary Hall, Mark Hall, Nvidia GPU Technology Conference, Sept. 2010.
  35. “CUDA-CHiLL: A Programming Language Interface for GPGPU Optimizations and Code Generation,” Gabe Rudy, Master’s thesis, 2010.
  36. “Takagi Factorization on GPU using CUDA,” (poster paper), Gagandeep S. Sachdev, Vishay Vanjani and Mary W. Hall, Symposium on Application Accelerators for High Performance Computing, July 2010.
  37. “GPU Accelerated Particle System for Triangulated Surface Meshes,” (poster paper), B. Peterson, M. Datar, M. Hall and R. Whitaker, Symposium on Application Accelerators for High Performance Computing, July 2010.
  38. “Autotuning and Specialization: Speeding up Nek5000 with Compiler Technology,” Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun Chen, Paul Fischer, Paul D. Hovland, International Conference on Supercomputing, June 2010.
  39. “Parameterized specification, configuration and execution of data-intensive scientific workflows,” V.S. Kumar, T. Kurc, V. Ratnakar, J. Kim, G. Mehta, K. Vahi, Y.L. Nelson, P. Sadayappan, E. Deelman, Y. Gil, M. Hall and J. Saltz, Cluster Computing, April 2010.
  40. “Autotuning and Specialization: Speeding up Matrix Multiply for Small Matrices with Compiler Technology,” Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun Chen, Paul D. Hovland, in Software Automatic Tuning: From Concepts to State-of-the-Art Results, edited by Keita Teranishi, John Cavazos, Ken Naono and Reiji Suda, Springer-Verlag, 2010, pp. 353-370.
  41. “Loop Transformation Recipes for Code Generation and Auto-Tuning,” Mary Hall, Jacqueline Chame, Chun Chen, Jaewook Shin and Gabe Rudy, Lecture Notes in Computer Science, Volume 5898, Languages and Compilers for Parallel Computing, Springer-Verlag, 2010, pp. 50-64.
  42. “Autotuning and Specialization: Speeding up Nek5000 with Compiler Technology,” (poster), J. Shin, M. W. Hall, J. Chame, C. Chen, P. F. Fischer, P. D. Hovland, SC’09, Nov. 2009.
  43. “Autotuning and Specialization: Speeding up Matrix Multiply for Small Matrices with Compiler Technology,” Jaewook Shin, Mary W. Hall, Jacqueline Chame, Chun Chen, Paul D. Hovland, International Workshop on Automatic Performance Tuning, October 2009.
  44. “GPU Acceleration of the Generalized Interpolation Material Point Method,” W. Chiang, M. DeLisi, T. Hummel, T. Prete, K. Tew, M. Hall, P. Wallstedt, and J. Guilkey, Symposium on Application Accelerators for High Performance Computing, July 2009.
  45. “Assembling Large Mosaics of Electron Microscope Images using GPU,” (poster paper), Kannan Venkataraju, Mark Kim, Dan Gerszewski, James R. Anderson, and Mary Hall, Symposium on Application Accelerators for High Performance Computing, July 2009.
  46. “An Integrated Framework for Parameter-based Optimization of Scientific Workflows,” V. S. Kumar, P. Sadayappan, G. Mehta, K. Vahi, E. Deelman, V. Ratnakar, J. Kim, Y. Gil, M. Hall, T. Kurc, J. Saltz, Proceedings of the International Symposium on High Performance Distributed Computing, June 2009.
  47. “Model-Guided Autotuning of High-Productivity Languages for Petascale Computing,” H. Zima, M. Hall, C. Chen, J. Chame, Proceedings of the International Symposium on High Performance Distributed Computing, June 2009.
  48. “A Scalable Autotuning Framework for Compiler Optimization,” A. Tiwari, C. Chen, J. Chame, M. Hall and J. K. Hollingsworth, Proceedings of the International Parallel and Distributed Processing Symposium, May 2009.
  49. “HPC and Grid Computing for Integrative Biomedical Research,” T. Kurc, S. Hastings, V. Kumar, S. Langella, A. Sharma, T. Pan, S. Oster, D. Ervin, J. Permar, S. Narayanan, Y. Gil, E. Deelman, M. Hall, J. Saltz, International Journal of High Performance Computing Applications, 2009.
  50. “Compiler Research: The Next Fifty Years,” M. Hall, D. Padua and K. Pingali, Communications of the ACM, Feb. 2009.
  51. “Computation reuse in domain-specific optimization of signal recognition,” (poster paper), Melina Demertzi, Pedro C. Diniz, Mary W. Hall, Anna C. Gilbert, and Yi Wang, Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’09), Feb. 2009, p. 281.
  52. “Evaluating Compiler Technology for Control-Flow Optimizations for Multimedia Extension Architectures,” J. Shin, M. Hall and J. Chame, International Journal of Embedded Systems, 2009 (invited award paper from MSP-7).
  53. “PERI Auto-Tuning,” David H. Bailey, Jacqueline Chame, Chun Chen, Jack Dongarra, Mary Hall, Jeffrey K. Hollingsworth, Paul Hovland, Shirley Moore, Keith Seymour, Jaewook Shin, Ananta Tiwari, Sam Williams, Haihang You, Journal of Physics: Conference Series, Vol. 125, 2008.
  54. “Self-Configuring Applications for Heterogeneous Systems: Program Composition and Optimization Using Cognitive Techniques,” M. Hall, Y. Gil and R. Lucas, Proceedings of the IEEE, Special Issue on Cutting-Edge Computing, Vol. 96(5), May 2008.
  55. “Model-Guided Performance Tuning of Parameter Values: A Case Study with Molecular Dynamics Visualization,” Y. Nelson, B. Bansal, M. Hall, A. Nakano, and K. Lerman, Proceedings of the Workshop on High-Level Parallel Programming Models and Supportive Environments, held in conjunction with IPDPS ’08, April 2008.