Scale Production
Execute production data science work: managing massive data sets and running complicated workflows.
Arvados provides a new generation of open source software infrastructure designed to address the most important platform challenges that IT organizations face managing and processing massive data sets. Use content addressable storage to track files and containerization to run reproducible and versioned computational workflows.
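To make the idea of content addressable storage concrete, here is a minimal sketch in Python. It is a toy in-memory store, not Arvados code: the `ContentStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration. It shows the core property: a file's address is derived from its content, so identical data is stored only once and can be verified on read.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blocks = {}  # maps hex digest -> stored bytes

    def put(self, data: bytes) -> str:
        """Store data under the hash of its content and return that hash."""
        key = hashlib.sha256(data).hexdigest()
        # Identical content always hashes to the same key, so a repeated
        # write adds nothing new: this is deduplication on write.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        """Fetch data by address and check that it still matches that address."""
        data = self._blocks[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored block does not match its address")
        return data

    def block_count(self) -> int:
        """Number of distinct blocks actually held."""
        return len(self._blocks)


store = ContentStore()
first = store.put(b"ACGTACGT")
second = store.put(b"ACGTACGT")  # same bytes, so same address
other = store.put(b"TTGACCAA")   # different bytes, so different address
```

Because the address doubles as a checksum, a workflow that records the addresses of its inputs can later confirm it is reading exactly the same data, which is what makes content addressing useful for tracking files.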
Arvados runs the same way on every major public cloud provider, inside your own account.
Arvados runs on private clouds and your own computing clusters.
For fun, try it on your PC. (For production, get a cloud or a cluster.)
Arvados pushes fault tolerance into the software layer, so clusters can be built with low-cost commodity hardware and storage costs stay low.
Hyperconverged infrastructure
A hyperconverged, scale-out architecture simplifies deployment and radically lowers costs for installation, setup, and expansion.
Storage efficiency
The data manager automatically deduplicates on write and provides visibility into the origin and usage of datasets, so you know what to keep or delete.
We are actively building connectors that will let you integrate Arvados clusters with existing storage and compute clusters.
Simplify operations
Your data scientists will be more productive using Docker and Arvados to streamline the process of deploying pipelines.
Great support
Support from Curoverse helps bioinformaticians quickly get past problems that slow down usage of existing systems.
Node optimization
The Arvados job manager monitors available nodes and intelligently deploys jobs to nodes to maximize system utilization.
Track the history and performance of every pipeline run on your cluster and trace detailed usage data back to each individual user.
Data origin and usage
Easily trace the origin and usage of every dataset stored on your system to see who is using what and why.
Virtual private servers
Get users off your head nodes and monitor resource usage with virtual private servers.
Use Arvados to transition to a distributed and elastic computing architecture that takes advantage of object storage, virtualization, and containerization.
Hybrid cloud
Run on-premises, in the cloud, or both. Arvados works across clouds, making your data scientists' work cloud-ready and easily portable.
Supported 100% open source
Using Arvados gives you a 100% open source solution built for production data science that has strong commercial support.
Turnkey deployment
You can easily set up Arvados on your hardware, use pre-configured hyperconverged servers, or deploy in your cloud account.
Curoverse Operation Service (COS) provides an unparalleled array of data center and system services that help you consistently realize the full potential of your Arvados cluster. See the complete overview.
Ensure your complete production software stack is tested, certified, and maintained across every component in the cluster.
Access the most experienced, motivated, and knowledgeable Arvados support engineers for sys admins and end users.
Leverage turnkey 24/7/365 cluster administration with predictive maintenance and data-driven performance optimization.
We can help you achieve your goals, including implementation, system integration, pipeline porting, and federating clusters between data centers.
Curoverse is based on the Arvados free and open source software platform. If you want a deep dive into the technology, read the overview, check out the , join the , or download .