Indiana University

UITS Research Technologies

Cyberinfrastructure enabling research and creative activities
Success Stories

[Figure: runtime of different mlRho versions]

mlRho Performance Analysis

We are working with members of the Lynch Lab at IU to help them efficiently run their computational jobs on IU and XSEDE machines. They are using a program called mlRho for their calculations. They have hundreds of thousands of single-core jobs that they need to complete.

We did a thorough performance analysis of mlRho using Vampir, starting with mlRho version 1.10. We sent our report to the developer, and after multiple iterations of testing and optimization by the developer, the code is now at version 1.24. This version is more than 40 times faster than the original version of the code. You can find our original analysis in the PDF below. The graph in the image shows the runtime of different versions of mlRho with the same configuration and arguments on Mason.

Running mlRho using BigJob

We are working with members of the Lynch Lab at IU to help them efficiently run their computational jobs on IU and XSEDE machines. They are using a program called mlRho for their calculations. They have millions of single-core jobs that they need to run.


We are attacking this problem from two different angles: getting more processors to run the jobs and optimizing the code so that it will run faster.
We assisted them in getting a start-up allocation on Ranger (XSEDE). We are currently doing some scaling tests with the code and subsequently will assist the researchers in applying for a larger XSEDE allocation.

We helped them with the setup and usage of a pilot-job framework called BigJob to bundle their single-core jobs into larger jobs, thereby achieving better throughput.
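The bundling idea can be illustrated with a minimal Python sketch. It does not use BigJob or its API; it only shows, with the standard library, how a fixed number of cores can work through a longer list of independent single-core commands inside one larger job. The names `run_task` and `run_bundle` and the placeholder commands are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
import sys


def run_task(cmd):
    """Run one single-core command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def run_bundle(commands, cores):
    """Run many single-core commands, at most `cores` at a time.

    Threads are sufficient here because each thread only waits on a
    child process; the child processes do the actual computation.
    """
    with ThreadPoolExecutor(max_workers=cores) as pool:
        # pool.map preserves the order of `commands` in the results.
        return list(pool.map(run_task, commands))


if __name__ == "__main__":
    # Placeholder commands (assumption): a real bundle would list the
    # actual single-core program invocations instead.
    commands = [[sys.executable, "-c", "pass"] for _ in range(8)]
    print(run_bundle(commands, cores=4))
```

With this pattern the batch scheduler sees one multi-core job rather than thousands of tiny ones, which is the throughput benefit described above.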

SciAPT works with AVL on dental images

Dental caries is an infectious, communicable disease that causes destruction of teeth via acid-forming bacteria found in dental plaque. Scientists evaluate its treatment by analyzing Microfocus Computed Tomography (μ-CT) images collected from tooth specimens over time. With each tooth undergoing a five-phase longitudinal evaluation and each phase generating one thousand high-resolution images, the overall data volume is tremendous. SciAPT and the Advanced Visualization Lab (AVL) are working together to use HPC resources to segment the images and identify the region of interest (ROI). The ParaView high-performance visualization software is also used to gain a qualitative understanding of the data.

SciAPT, ZIH and NCGAS conduct detailed performance study of Trinity

The SciAPT group, together with the National Center for Genome Analysis Support (NCGAS) and the Center for Information Services and High Performance Computing at Technische Universitaet Dresden (ZIH), is conducting a detailed performance study of Trinity. Trinity is a novel method for efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. It delivers very good results at the cost of a long runtime. To speed up the analysis, we are looking at each step of the Trinity workflow and how it can be optimized for running on a cluster of large-memory nodes (e.g. Mason at IU). Initial results show a performance increase of up to 30%, just from properly installing and configuring Trinity to take full advantage of all the features of a modern HPC system.

SciAPT assists with the upgrade of Dr. Mike Jolly's Fortran 77 code

Professor Mike Jolly and the PDE Group in Applied Mathematics and Computation are investigating how energy is transferred through length scales in a turbulent fluid flow with the support of a grant from the National Science Foundation. Their Fortran-based code solves for an approximate solution of the 2D Navier-Stokes equations. These are partial differential equations (PDEs) whose solution depends on both time and space. Their length scales are approximately inversely proportional to the indices in the 2D arrays.


This research presents three computational challenges. Patterns must be averaged over long time intervals. As precision increases, the length of time steps must be decreased. Finally, each experiment must be run under multiple force configurations. The SciAPT group is assisting with the upgrade of this code from a Fortran 77 style to more efficient, modern Fortran constructs and is exploring thread-level parallelism.

Leveraging SDSC’s Dash to Enable Genomics Research

Researchers from the Lynch Lab in the IUB Biology department focus their research “on mechanisms of evolution at the gene, genomic, and phenotypic levels, with special attention being given to the roles of mutation, random genetic drift, and recombination.” The researchers needed to assemble relatively large gene sequences; however, the assembly software requires a large amount of memory to assemble these sequences. Since the group’s memory requirements exceeded the current capabilities at IU, the SciAPT group assisted the researchers in acquiring compute time on the TeraGrid’s Dash machine. Dash is a virtual shared memory machine utilizing the ScaleMP foundation and containing an aggregate of 768GB of shared memory. Access to this system allowed the Lynch group to proceed with their assemblies.

SciAPT works with Prof. Mu-Hyun Baik in the IUB Chemistry department

Prof. Mu-Hyun Baik and his group in the IUB Chemistry department use computational quantum chemistry in their research on artificial photosynthesis, reaction pathways of the cancer drug cisplatin, and the chemistry involved in Alzheimer's disease. These studies involve extensive calculations of the structure of complex molecules, and of reaction mechanisms that are difficult or impossible to observe experimentally.


The quantum chemistry codes they use can be quite complex, and learning how to compile and run them on large parallel computers can be difficult and time consuming. Such is the case with two principal code packages the Baik Group uses, Jaguar and MOLCAS. Especially with Jaguar, the problems of compiling and getting the code to run correctly and efficiently have become major roadblocks to progress. SciAPT is partnering with the group to take responsibility for these issues. We are installing and maintaining Jaguar and MOLCAS on IU's platforms, and assisting in recompiling, testing and benchmarking as the Baik Group makes changes to the codes. We are also using Vampir to analyze the parallel performance of these codes.


This is a project where establishing a collaborative relationship is important. In addition to installing and maintaining Jaguar and MOLCAS, we are helping with production runs. By becoming familiar with the Baik Group's computational procedures and problems, we are solving some of the difficult problems of maintaining and running these complex codes on IU's parallel machines. In addition, results from our Vampir analysis may help the Jaguar and MOLCAS developers improve their codes' performance.


Close collaboration between SciAPT and the Baik Group is enabling Prof. Baik and his group to concentrate on doing chemistry. Helping them use IU's HPC resources more efficiently is reducing the time and cost for them to obtain research results.

SciAPT assists in processing thousands of skull images

A new faculty member has brought thousands of skull images from his previous institution. He has begun a project to "average" these images to create a template human skull. Given this skull template, individual images can be compared to determine the variability of human skulls. Eventually, an atlas of templates will be created, giving an average skull by region of the earth. The resulting atlas and variability maps have applications in Anthropology, Archeology, and Forensics.
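As a rough illustration of what a template and a variability map mean numerically, the sketch below computes a pixel-wise mean and a pixel-wise standard deviation over a set of images. This is not the analysis software used in the project: it assumes small grayscale images stored as nested lists that are already aligned and equally sized, whereas real skull images would first need registration. The function names are hypothetical.

```python
from statistics import mean, pstdev


def average_template(images):
    """Pixel-wise mean of equally sized, pre-aligned grayscale images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[mean(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]


def variability_map(images):
    """Pixel-wise population standard deviation across the images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[pstdev(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    # Two toy 2x2 "images" (assumption: stand-ins for real scans).
    images = [
        [[0, 2], [4, 6]],
        [[2, 4], [6, 8]],
    ]
    print(average_template(images))
    print(variability_map(images))
```

Comparing an individual image against the template then amounts to looking at how far each pixel lies from the mean relative to the variability at that position.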

SciAPT assisted in installation of the analysis programs to carry out this work and is currently developing a workflow system to automate the processing of this large number of images.

SciAPT Supports IU's 100 Gbit/sec efforts at the SC11 SCinet Research Sandbox

At SC11 (November 12-18), IU showcased a first of its kind 100 Gbit/sec production network equipped with multi-vendor OpenFlow-capable switches. SciAPT supported this entry into the SCinet Research Sandbox (SRS). "The Data Superconductor: An HPC cloud using data-intensive scientific applications, Lustre-WAN, and OpenFlow over 100Gb Ethernet," used the Lustre file system and cutting-edge network infrastructure to address challenges created by the exponential growth in volume of digital scientific research data.

For the SRS demonstration, IU ran a series of applications using the Lustre-WAN file systems as their main storage resource. SciAPT identified key applications that can benefit from such an extremely powerful network and managed the applications as they ran across this 2000-mile 100 Gbit/sec link.

Parallelizing the algorithm for 3D Ultrasound Segmentation

This project aims to port an algorithm for 3D Ultrasound Segmentation from Windows to a Linux cluster, and then parallelize the algorithm to take full advantage of multi-core processors. The long-term goal is to enable real-time image segmentation for clinical use.

Implementing 3D SPHARM surfaces registration on the Cell B.E. processor

A 3D SPHARM Surfaces Registration algorithm, which takes hours to run in MATLAB, was ported to and optimized for parallel platforms, reducing the run time to just seconds.
