Canonical, the publisher of Ubuntu, today announced the general availability of Data Science Stack (DSS), an out-of-the-box solution for data science that enables ML environments on your AI workstation. It is fully open source, free to use and native to Ubuntu. It is also accessible on other Linux distributions, on Windows using Windows Subsystem for Linux (WSL), and on macOS with Multipass. DSS is a command line interface-based tool that bundles Jupyter Notebooks, MLflow and frameworks like PyTorch and TensorFlow on top of an orchestration layer. Canonical provides security maintenance for all of the packages included in the solution, ensuring timely vulnerability patching and protection of both the software and created artefacts.
AI adoption is widespread, but so are the challenges in successfully implementing it. Consider the following statistics from Deloitte:
In light of this context, business leaders are feeling the pressure to quickly get AI capability up to speed and showcase return-on-investment from AI projects. Shortening the time required to set up ML environments is crucial to accelerating project delivery and the initial exploration phase of AI within organisations. That’s why we created the Data Science Stack (DSS).
Data Science Stack (DSS) can be set up with just three commands, enabling quick initial exploration on AI workstations. Practitioners will only need to set up their container orchestration layer, install the DSS CLI and initialise the Data Science Stack in order to access the environment. This can be done in 10-30 minutes, depending on the practitioner’s experience level.
Canonical’s Silicon Alliance ecosystem manager, Chris Schnabel, elaborated, “This takes away the burden of managing any of the package dependencies or setting up the compute resources, thanks to the simple commands that AI practitioners can run. By default, DSS includes access to Jupyter Notebook for model development, MLflow for experiment tracking and model registry and ML frameworks such as PyTorch and TensorFlow. However, users can customise Data Science Stack and add new libraries depending on their use case.”
DSS helps practitioners get familiar with the tools they can use in large-scale ML environments. It also provides migration paths, helping them grow their AI initiatives as projects mature.
We created Data Science Stack to work on any hardware type, in order to optimise the user experience and enable users to get the best performance on their hardware of choice. DSS uses optimised ML frameworks from different vendors, such as PyTorch and TensorFlow, to give users a choice of the most popular distributions and the highest performance levels possible. Intel drives its hardware optimisations upstream to these community projects. However, to get earlier access to performance enhancements and capabilities such as Intel GPU support before they arrive upstream, AI practitioners can also use IPEX and ITEX, Intel’s extensions for PyTorch and TensorFlow respectively. IPEX and ITEX add hardware-specific optimisations that take advantage of Advanced Vector Extensions (AVX), Vector Neural Network Instructions (VNNI) and Advanced Matrix Extensions (AMX). By integrating these extensions, in addition to GPU acceleration, DSS accelerates operations prevalent in AI use cases, reducing model training time and speeding up the experimentation phase of ML projects.
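To see which of the instruction set families mentioned above a given workstation reports, a practitioner could inspect the CPU flags exposed by the Linux kernel. The following is a minimal sketch and not part of DSS: it assumes an x86 Linux machine where /proc/cpuinfo contains a "flags" line, and the function name detect_features and the grouping of flags into families are illustrative choices.

```python
from pathlib import Path

# Flag names as they appear on the "flags" line of /proc/cpuinfo on x86 Linux.
# The grouping into families is an illustrative assumption, not a DSS feature.
FEATURE_FLAGS = {
    "AVX": {"avx", "avx2", "avx512f"},
    "VNNI": {"avx_vnni", "avx512_vnni"},
    "AMX": {"amx_tile", "amx_bf16", "amx_int8"},
}


def detect_features(cpuinfo_text):
    """Return {family: sorted list of matching flags found in cpuinfo_text}."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return {name: sorted(wanted & flags) for name, wanted in FEATURE_FLAGS.items()}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Fall back to empty text on systems without /proc/cpuinfo.
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    for name, found in detect_features(text).items():
        print(f"{name}: {', '.join(found) if found else 'not reported'}")
```

A family showing "not reported" only means the kernel did not list those flags. Whether a framework build actually uses a reported extension depends on the framework and its configuration.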
“Canonical’s Data Science Stack provides an essential foundation for AI practitioners to quickly advance their machine learning and data science capabilities,” said Arun Gupta, Vice President and General Manager for Open Ecosystem at Intel. “By aligning with upstream PyTorch and TensorFlow, we ensure that developers are working with the most progressive tools available. Our collaboration through the OPEA project amplifies this impact, streamlining AI development and making innovation more accessible for everyone.”
AI workstations are a strategic product for many computer manufacturers. Solutions like Data Science Stack enable them to offer a seamless experience on any device, so they can vary the GPU they ship without affecting the user experience.
McKinsey reports that 51% of organisations using AI consider cybersecurity to be the highest risk they need to mitigate, followed by regulatory compliance at 36%. This affects all layers and scales of ML development, from AI workstations to data centres or edge devices.
When data scientists set up their ML environments, they use containers and open source tools from different sources, without necessarily considering the security risks. Within an enterprise, security can also quickly turn into a burden for sysadmins, who need to deploy and maintain a variety of tools – which are often different due to lack of standards. DSS provides a consistent architecture for ML environments that can be rolled out at scale on many machines.
Ubuntu is the most adopted Linux distribution (source: StackOverflow report), with a high number of AI/ML practitioners using it for their projects. As business leaders allocate budget for ML projects and professionals start doing initial exploration, they will start deploying solutions on workstations as well.
Organisations can purchase security maintenance and support through Ubuntu Pro. Enterprises benefit from support for their ML environments, so that they can resolve issues in a timely manner, in line with Canonical’s SLAs.
To learn more about Data Science Stack, look at our webinar. We’ll provide a rundown of the features you can benefit from and demonstrate the three-command setup.