Project News

VC3 at PEARC18

2018-07-25
The VC3 team will be at [PEARC18](http://pearc.org/) in Pittsburgh, PA.

## VC3: A Virtual Cluster Service for Community Computation

- Event Type: Technical Paper
- Presentation: Wednesday, July 25, 3:45pm - 4pm
- Location: Grand Ballroom 3

### Description

> A traditional HPC computing facility provides a large amount of computing power but has a fixed
> environment designed to satisfy local needs. This makes it very challenging for users to deploy
> complex applications that span multiple sites and require specific application software,
> scheduling middleware, or sharing policies. The DOE-funded VC3 project aims to address these
> challenges by making it possible for researchers to easily aggregate and share resources, install
> custom software environments, and deploy clustering frameworks across multiple HPC facilities
> through the concept of "virtual clusters". This paper presents the design, implementation, and
> initial experience with our prototype self-service VC3 platform, which automates deployment of
> cluster frameworks across diverse computing facilities.
>
> To create a virtual cluster, the VC3 platform materializes a custom head node in a secure private
> cloud, specifies a choice of scheduling middleware, then allocates resources from the remote
> facilities, where the desired software and clustering framework is installed in user space. As
> resources become available from scheduled nodes from individual clusters, the research team simply
> sees a private cluster they can access directly or share with collaborators, such as a science
> gateway community. We discuss how this service can be used by research collaborations requiring
> shared resources, specific middleware frameworks, and complex applications and workflows in the
> areas of astrophysics, bioinformatics, and high energy physics.

### Authors

- Lincoln Bryant, University of Chicago
- Jeremy Van, University of Chicago
- Benedikt Riedel, University of Chicago
- Robert Gardner, University of Chicago
- Jose Caballero Bejar, Brookhaven National Lab
- John Hover, Brookhaven National Lab
- Ben Tovar, University of Notre Dame
- Kenyi Hurtado, University of Notre Dame
- Douglas Thain, University of Notre Dame (*presenter*)
Tags: papers

Read more »

VC3 at CHEP 2018

2018-07-10
The VC3 project was well represented at the CHEP 2018 Conference in Sofia, Bulgaria. Two forthcoming publications include contributions from the VC3 project:

## Deploying and extending CMS Tier 3s using VC3 and the OSG Hosted CE service

Speaker: Kenyi Paolo Hurtado Anampa (University of Notre Dame (US))

Description:

> CMS Tier 3 centers, frequently located at universities, play an important role in the physics analysis of CMS data. Although different computing resources are often available at universities, meeting all requirements to deploy a valid Tier 3 able to run CMS workflows can be challenging in certain scenarios. For instance, providing the right operating system (OS) with access to the CERNVM File System (CVMFS) on the worker nodes, or having a Compute Element (CE) on the submit host, is not always allowed or possible due to, e.g., lack of root access to the nodes, TCP port network policies, or the maintenance burden of a CE. The Notre Dame group operates a CMS Tier 3 with ~1K cores. In addition to this, researchers have access to an opportunistic pool of more than 25K cores that are used via Lobster for CMS jobs, but that cannot be used with other standard CMS submission tools on the grid such as CRAB, as these resources are not part of the Tier 3 due to their opportunistic nature. This work describes the use of VC3, a service for automating the deployment of virtual cluster infrastructures, to provide the environment (user-space CVMFS access and a customized OS via Singularity containers) needed for CMS workflows to work, as well as its integration with the OSG Hosted CE service to add these resources to CMS as part of our existing Tier 3 in a seamless way.

Primary author:

- Kenyi Paolo Hurtado Anampa (University of Notre Dame (US))

Co-authors:

- Benjamin Tovar (University of Notre Dame)
- Douglas Thain (University of Notre Dame)
- Kevin Patrick Lannon (University of Notre Dame (US))

Materials:

- [website](https://indico.cern.ch/event/587955/contributions/2937282/)

## Interactive, scalable, reproducible data analysis with containers, Jupyter, and Parsl

Speaker: Ms Anna Elizabeth Woodard (Computation Institute, University of Chicago)

Description:

> In the traditional HEP analysis paradigm, code, documentation, and results are separate entities that require significant effort to keep synchronized, which hinders reproducibility. Jupyter notebooks allow these elements to be combined into a single, repeatable narrative. HEP analyses, however, commonly rely on complex software stacks and the use of distributed computing resources, requirements that have been barriers to notebook adoption. In this presentation we describe how Jupyter can be combined with Parsl (Parallel Scripting Library) and containers to enable intuitive and interactive high performance computing in Python.
>
> Parsl is a pure Python library for orchestrating the concurrent execution of multiple tasks. Parsl is remarkable for its simplicity. Its primary construct is an "app" decorator, which the programmer uses to indicate that certain functions (either pure Python or wrappers around shell programs) are to be treated as "apps." App function calls then result in the creation of a new "task" that runs concurrently with the main program and other tasks, subject to dataflow constraints defined by the availability of app function input data. Data dependencies can be in-memory objects or external files. App decorators can further specify which computation resources to use and the required software environment to run the decorated function.
>
> Parsl abstracts hardware details, allowing a single script to be executed efficiently on one or more laptops, clusters, clouds, and/or supercomputers. To manage complex execution environments on various resources, and also to improve reproducibility, Parsl can use containers (lightweight, virtualized constructs for packaging software with its environment) to wrap tasks.
>
> In this presentation we 1) show how a real-world complete HEP analysis workflow can be developed with Parsl and 2) demonstrate efficient and reproducible execution of such workflows on heterogeneous resources, including leadership-class computing facilities, using containers to wrap analysis code, Parsl to orchestrate the execution of these containers, and Jupyter as the interface for writing and executing the Parsl script.

Primary authors:

- Mr Yadu Babuji (Computation Institute, University of Chicago and Argonne National Laboratory)
- Dr Kyle Chard (Computation Institute, University of Chicago and Argonne National Laboratory)
- Dr Ian Foster (Computation Institute, University of Chicago and Argonne National Laboratory)
- Dr Daniel S. Katz (National Center for Supercomputing Applications, University of Illinois Urbana-Champaign)
- Dr Michael Wilde (Computation Institute, University of Chicago and Argonne National Laboratory)
- Ms Anna Elizabeth Woodard (Computation Institute, University of Chicago)
- Dr Justin M. Wozniak (Computation Institute, University of Chicago and Argonne National Laboratory)

Materials:

- [website](https://indico.cern.ch/event/587955/contributions/2937563/)
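To make the app-decorator model described in the second abstract concrete, here is a minimal Parsl sketch (not taken from the paper; the `histogram` function and the local thread-pool configuration are illustrative assumptions):

    # Minimal illustrative sketch of Parsl's app-decorator model (not from the paper).
    import parsl
    from parsl import python_app
    from parsl.config import Config
    from parsl.executors.threads import ThreadPoolExecutor

    # Load a simple local configuration. On a cluster or supercomputer the executor
    # configuration would instead point Parsl at the site's scheduler, while the
    # script itself stays the same.
    parsl.load(Config(executors=[ThreadPoolExecutor(max_threads=4)]))

    # The decorator marks this function as an "app": each call becomes a task that
    # runs concurrently with the main program and other tasks.
    @python_app
    def histogram(values):
        from collections import Counter
        return Counter(values)

    # App calls return futures immediately; execution is ordered only by the
    # availability of each task's inputs (the dataflow constraints).
    futures = [histogram(chunk) for chunk in (["a", "b"], ["b", "c"])]
    print([f.result() for f in futures])

Swapping the executor configuration, rather than rewriting the script, is what allows the same analysis to run on laptops, clusters, or leadership-class facilities, as the abstract emphasizes.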
Tags: papers

Read more »

May 23, 2018 Limited Beta Release

2018-05-23
The VC3 project is inviting users and HPC resource providers to help evaluate its limited beta release.

## Beta Users

During this release there might be system bugs, instabilities, disruptions, and periods when VC3 is unavailable.

* The VC3 user limited beta invitation signup is [here](http://bit.ly/vc3-signup).

## Beta Resources

If you would like to add an HPC resource to the VC3 platform for testing, please complete this [form](http://bit.ly/vc3-new-resource). You do not necessarily need to have an institutional affiliation with the HPC resource organization, but you should have authorization to run jobs there.
Tags: announcements

Read more »

VC3 OSG All-Hands Meeting

2018-05-23
We presented the VC3 project at the OSG All-Hands Meeting at the University of Utah. [slides](images/2018-03-20-osg-ahm/2018-03-20-osg-ahm.pdf)
Tags: presentations

Read more »

VC3 at HTCondor Week 2018

2018-05-23
We presented the VC3 project at HTCondor Week 2018. During this presentation we announced the [limited beta release](2018-05-23-Limited). [slides](images/2018-05-23-condor/2018-05-23-condor-week.pdf)
Tags: presentations

Read more »

VC3 at IC2E18

2018-04-18
The VC3 team will be at [IC2E18](http://conferences.computer.org/IC2E/2018) in Orlando, FL.

## Automatic Dependency Management for Scientific Applications on Clusters

- Event Type: Technical Paper
- Presentation: Wednesday, April 18, 15:00hrs
- Location: Hyatt Regency, Orlando

[link to the paper](http://ccl.cse.nd.edu/research/papers/tovar-vc3builder-ic2e2018.pdf)

[link to the slides](images/2018-04-18-ic2e/vc3-builder-ice2e.pdf)

### Description

Software installation remains a challenge in scientific computing. End users require custom software stacks that are not provided through commodity channels. The resulting effort needed to install software delays research in the first place, and creates friction for moving applications to new resources and new users. Ideally, end users should be able to manage their own software stacks without requiring administrator privileges. To that end, we describe *vc3-builder*, a tool for deploying software environments automatically on clusters. Its primary application comes in cloud and opportunistic computing, where deployment must be performed in batch as a side effect of job execution. *vc3-builder* uses workflow technologies as a means of exploiting cluster resources for building in a portable way. We demonstrate the use of *vc3-builder* on three applications with complex dependencies: MAKER, Octave, and CVMFS, building and running on three different cluster facilities in sequential, parallel, and distributed modes.

### Authors

- Ben Tovar, University of Notre Dame (*presenter*)
- Nicholas Hazekamp, University of Notre Dame
- Nathaniel Kremer-Herman, University of Notre Dame
- Douglas Thain, University of Notre Dame
Tags: papers

Read more »

Sept 2017 Developer Meeting

2017-09-07
With the Fall "capability demonstrator" at the end of the month, the VC3 team had a face-to-face developer meeting in Chicago on September 5 and 6 2017. The main goals where to stand up virtual clusters, create virtual clusters from the website, and finalize details for the "capability demonstrator." We were able to create a virtual cluster and submit sample workloads to UChicago's Midway campus cluster and MWT2's CoreOS Cluster and request the creation of a virtual cluster from the website. We also finalized where and which payloads we would submit during the "capability demonstrator," mainly UChicago's Midway campus cluster, MWT2's CoreOS Cluster, NERSC's Cori, and Syracuse's OrangeGrid. <img src="https://raw.githubusercontent.com/vc3-project/vc3-flatpages/master/images/2017-09-07-Sept-2017-Developer-meeting/grafana_plot_workloads.png" align="center" width="100%" >
Tags: meetings

Read more »

July 2017 Developer Meeting

2017-08-10
With the first release of VC3 and the Fall "capability demonstrator" on the horizon, the VC3 team had a face-to-face developer meeting in Chicago from July 17 to July 19, 2017. The main goals were to finalize the software architecture of the VC3 static infrastructure, commission the VC3 static infrastructure on our OpenStack private cloud, integrate the website into the static infrastructure, manually create a virtual cluster (VC), and sketch out a road map to the first release.

For the VC3 static infrastructure, we finalized and implemented the software architecture and commissioned the static infrastructure, including a VC3 master, VC3 information service, VC3 factory, and static HTCondor central manager, on our OpenStack private cloud. We used the static infrastructure to submit jobs to MWT2 resources and the UChicago campus cluster, which gives us the foundation for creating a VC. We nearly completed the connection between the VC3 website and the static infrastructure. As for the manual creation of a VC, we had all the individual components running, but ran out of time to have information flow from one end of the VC3 chain (the website or CLI interface) through to submitting pilot jobs.

Overall, we considered this meeting a success. We accomplished nearly all the goals we had set out and saw jobs going to the sites we will be using for our Fall "capability demonstrator".
Tags: meetings

Read more »

Demonstrating the VC3 client

2017-08-10
### The VC3 Command Line Interface for Administrators

This example shows how the vc3-client interacts with the VC3 information service to create new users, projects, and resources, and then uses these to spin up an HTCondor virtual cluster. These commands are meant to be executed by VC3 administrators.

#### Create new users, projects, and resource profiles

    $ vc3-client user-create --firstname Lincoln --lastname Bryant --email lincolnb@uchicago.edu --institution UChicago lincolnb
    $ vc3-client project-create --owner rwg --members rwg,jhover ATLAS
    $ vc3-client project-adduser ATLAS jcaballero
    $ vc3-client resource-create --owner lincolnb --accesstype batch --accessmethod ssh --accessflavor htcondor --accesshost condor.grid.uchicago.edu --accessport 22 uchicago-grid
    $ vc3-client resource-create --owner lincolnb --accesstype batch --accessmethod ssh --accessflavor slurm --accesshost cori.nersc.gov --accessport 22 nersc-cori

<script type="text/javascript" src="https://asciinema.org/a/a6fajD1XmHW1dxEvHvxqwwjb2.js" id="asciicast-40j5dnd6m67yog3y4qa4tw957" async data-size="small" data-theme="monokai"></script>

#### Watch the service logs when creating new users, projects, and resource profiles

<script type="text/javascript" src="https://asciinema.org/a/rw3gao3WobjZKe6ZyLZwezrAg.js" id="asciicast-40j5dnd6m67yog3y4qa4tw957" async data-size="small" data-theme="monokai"></script>
Tags: demos

Read more »

VC3 Seminar given at Brookhaven

2017-04-17
A seminar describing the VC3 architecture was given to the SDSS/RACF computing group at Brookhaven National Laboratory by John Hover, the lead architect of the VC3 core. Slides are available [here](https://docs.google.com/presentation/d/1lVVLAQN6G1nZXBSCZ4AZWkYeCpCT4Bx5Cta0cqAx5Fk/edit?usp=sharing).
Tags: meetings

Read more »

Play Toy Demos of Primordial VC3 Components

2016-09-08
Our goal for the fall is to create a "capability demonstrator". As we're just getting started, we'll build from existing software and applications and mold them to higher level abstractions. The examples below demonstrate prototype features of a VC3 pilot and factory service.

### Deploy a global file system on the fly

    $ vc3-pilot --require cvmfs

<script type="text/javascript" src="https://asciinema.org/a/40j5dnd6m67yog3y4qa4tw957.js" id="asciicast-40j5dnd6m67yog3y4qa4tw957" async data-size="small" data-theme="monokai"></script>

### Deploy the MAKER application on the fly

    $ vc3-pilot --require maker-ecoli-example-01

<script type="text/javascript" src="https://asciinema.org/a/4qzmcrpmrzssxen6s1knkgw86.js" id="asciicast-4qzmcrpmrzssxen6s1knkgw86" async data-size="medium" data-theme="monokai"></script>

### Send VC3 pilots to a remote cluster via ssh "glide in"

<script type="text/javascript" src="https://asciinema.org/a/7a9ku2k4z3ujtnr1v4cjo6mq3.js" id="asciicast-7a9ku2k4z3ujtnr1v4cjo6mq3" async data-size="medium" data-theme="monokai"></script>

### Start VC3 pilots on a resource requiring 2-factor authentication

<script type="text/javascript" src="https://asciinema.org/a/84798.js" id="asciicast-84798" async data-size="medium" data-theme="monokai"></script>
Tags: demos

Read more »

Next Generation Networks for Science (NGNS) Principal Investigators' (PI) Meeting

2016-09-08
Today and tomorrow we have the Next Generation Networks for Science (NGNS) Principal Investigators' (PI) Meeting at Argonne National Laboratory. More information can be found [here](https://www.orau.gov/ngns2016/default.htm). Doug Thain will be giving the VC3 presentation tomorrow.
Tags: meetings

Read more »

VC3 Kickoff Meeting

2016-06-07
Our official "kickoff" meeting to get the project formally started. The main theme will be to revisit our vision and proposal, to identify and carve out the “VC3 space” given changes in the community since the proposal, and to prepare for the [NGNS PI workshop](https://www.orau.gov/ngns2016/default.htm). Expect the meeting will be small and informal with lots of white-boarding. Success here would be to come out of the meeting with a year 1 program of work and a Fall "capability demonstrator".
Tags: meetings

Read more »

APF in Journal of Physics

2012-12-15
## AutoPyFactory: A Scalable Flexible Pilot Factory Implementation

- Event Type: Technical Paper
- Published: Journal of Physics: Conference Series 396(3)

### Description

> The ATLAS experiment at the CERN LHC is one of the largest users of grid computing infrastructure, which is a central
> part of the experiment's computing operations. Considerable efforts have been made to use grid technology in the most
> efficient and effective way, including the use of a pilot job based workload management framework. In this model the
> experiment submits ‘pilot’ jobs to sites without payload. When these jobs begin to run they contact a central service
> to pick up a real payload to execute. The first generation of pilot factories were usually specific to a single Virtual
> Organization (VO), and were bound to the particular architecture of that VO's distributed processing. A second
> generation provides factories which are more flexible, not tied to any particular VO, and provide new and improved
> features such as monitoring, logging, profiling, etc. In this paper we describe this key part of the ATLAS pilot
> architecture, a second generation pilot factory, AutoPyFactory. AutoPyFactory has a modular design and is highly
> configurable. It is able to send different types of pilots to sites and exploit different submission mechanisms and
> queue characteristics. It is tightly integrated with the PanDA job submission framework, coupling pilot flow to the
> amount of work the site has to run. It gathers information from many sources in order to correctly configure itself
> for a site, and its decision logic can easily be updated. Integrated into AutoPyFactory is a flexible system for
> delivering both generic and specific job wrappers which can perform many useful actions before starting to run
> end-user scientific applications, e.g., validation of the middleware, node profiling and diagnostics, and monitoring.
> AutoPyFactory also has a robust monitoring system that has been invaluable in establishing a reliable pilot factory
> service for ATLAS.

### Authors

- Jose Caballero Bejar, Brookhaven National Lab
- John Hover, Brookhaven National Lab
- Peter Love, University of Manchester
- Graeme Stewart, University of Glasgow
Tags: papers

Read more »