Why unified programming is the future of application development

Moving an application to a new processor type or chip vendor has traditionally meant creating an entirely new code base. That extra cost and delay is never welcome, especially today, when organizations face intense pressure to quickly maximize the value of their data and commercial offerings, and when new workloads, and the processor options built to handle them, are proliferating.


As the world’s data-centric workloads become larger and more diverse, so do the architectures best suited to process them. A recent McKinsey report charts the diversification of preferred compute architectures through 2025. For both data center and edge computing, significant growth is expected in the use of ASICs (application-specific integrated circuits) and FPGAs (field-programmable gate arrays), adding to a current lineup dominated by CPUs and GPUs.


Organizations and individuals looking to extract the greatest value from their data, while minimizing development time and cost, can benefit from a heterogeneous compute strategy supported by a unified programming model.


oneAPI for many platforms


A unified programming model offers enterprises and OEMs a way to cost-effectively take advantage of the growing diversity of processor platforms, says Bill Savage, Intel vice president and general manager of Compute Performance & Developer Products. “Unified development lets companies share their source code investment across vendors and architectures: scalar, vector, matrix, and spatial (SVMS), while taking advantage of new accelerators.”


The idea is familiar: Developers program once, tune their code as needed for each target architecture, and deploy that tuned code to the respective processor. So, for example, a company could pilot an AI project on a general-purpose CPU (central processing unit), then move as needed to a more specialized processor to handle the increasing demands of, say, streaming edge analytics. What’s new here is applying this kind of single-source code model widely in the traditionally closed, vendor-proprietary world of specialized processor architectures.


Revolutionary, realistic goals


That’s precisely the core of a major new Intel initiative called oneAPI. Announced at Intel Architecture Day in December 2018, the project aims to revolutionize application development through a unified, open development model that simplifies programming across processors. The goal, Savage explains, is threefold: increase application portability across diverse computing architectures, raise developer productivity, and, most importantly, deliver peak processor performance to high-growth, data-centric applications in the data center, at the edge, and in the cloud.


While a unified development model will be a game-changer, Savage is careful to set realistic expectations. “Developers will still need to tune code for the hardware they target.”


“We want to avoid the perception that we can make FPGA programming as easy as CPU and GPU programming. Yes, we’ve made it dramatically more productive than traditional FPGA programming, by providing a unified cross architecture programming language and environment which is bolstered with Intel’s tools, but that’s still significantly more work than a CPU or GPU program.”


Ecosystem collaboration drives open innovation


There is a need in the industry for a development model that gives programmers the freedom to develop for diverse workloads and across architectures with no environment lock-in.


Based on industry standards and open specifications, the oneAPI project aims to increase ecosystem collaboration and adoption while breaking through the limitations of proprietary models. Data Parallel C++ (DPC++), the direct programming language component of oneAPI, is an open, cross-industry alternative to single-architecture languages.


An open design that works across architectures, vendors, and targets should not only give developers the flexibility and choice to write their applications for new hardware but also encourage broad developer and ecosystem partner engagement and innovation. oneAPI can also serve as the foundation that specialized hardware manufacturers use for their software platforms.


No processor left behind


oneAPI combines two types of programming, direct programming and API-based programming, into an efficient unified development model that can deliver full, native code performance across a range of hardware.


DPC++ is a direct programming language based on the familiar C++ programming model and is key to oneAPI. It offers an open, cross-industry alternative to single-architecture, proprietary languages, explains Savage.


Teams at Intel undertook an exhaustive evaluation of existing alternatives, examining the cost, benefits, and performance of using OpenCL, C++, Fortran, and Nvidia CUDA. “We wanted to build upon C++, and SYCL from the Khronos Group had some really good constructs that we thought provided a very good starting point. We’ve extended and improved it to achieve the goals that we want to achieve. Most of the DPC++ extensions will eventually be synced upstream into SYCL.”


DPC++, he says, “will be easy to learn for people who know C++. It’ll be quite natural and constructs for selecting a kernel to be offloaded for parallelism will look very familiar to people that have worked in CUDA and other places.”
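
To ground that, here is a minimal sketch of what single-source kernel offload looks like in DPC++/SYCL. It is illustrative only: the header name and accessor syntax follow SYCL 2020-style conventions and vary across DPC++ releases, and the vector-add kernel is a stand-in workload.

```cpp
// Minimal DPC++/SYCL sketch of offloading a kernel from single-source C++.
// Assumes a SYCL 2020-style compiler such as the oneAPI DPC++ compiler.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  constexpr size_t N = 1024;
  std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

  sycl::queue q;  // default selector: picks a CPU, GPU, or other device
  {
    // Buffers make host data visible to the device for the kernel's lifetime.
    sycl::buffer<float> A{a.data(), sycl::range<1>{N}};
    sycl::buffer<float> B{b.data(), sycl::range<1>{N}};
    sycl::buffer<float> C{c.data(), sycl::range<1>{N}};

    q.submit([&](sycl::handler& h) {
      sycl::accessor in1{A, h, sycl::read_only};
      sycl::accessor in2{B, h, sycl::read_only};
      sycl::accessor out{C, h, sycl::write_only};
      // parallel_for marks the kernel offloaded to the selected device.
      h.parallel_for(sycl::range<1>{N},
                     [=](sycl::id<1> i) { out[i] = in1[i] + in2[i]; });
    });
  }  // buffer destruction copies results back to the host vectors

  std::cout << "c[0] = " << c[0] << "\n";  // expect 3
  return 0;
}
```

Retargeting the same source to a different processor is a matter of changing the queue's device selection, not rewriting the kernel.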


The second key is API-based programming. oneAPI’s libraries span workload domains that benefit from acceleration, including linear algebra, video processing, data analytics, and deep learning, among others. They offer an easy on-ramp to accelerating applications through highly optimized functions that are custom-coded for each target architecture and require no developer tuning when an application migrates between supported architectures.
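
As a rough illustration of the API-based model, the sketch below hands a matrix multiply to a oneMKL-style BLAS routine on the same kind of queue used for DPC++ kernels. The header, namespace, and call shown (oneapi/mkl.hpp, oneapi::mkl::blas::row_major::gemm) are assumptions tied to a particular oneMKL release, and the matrix sizes are arbitrary.

```cpp
// Hedged sketch of API-based programming: the library supplies the
// device-tuned implementation, so the call site stays the same across targets.
#include <sycl/sycl.hpp>
#include <oneapi/mkl.hpp>  // oneMKL DPC++ interface (assumed available)
#include <cstdint>

int main() {
  const std::int64_t m = 256, n = 256, k = 256;
  sycl::queue q;  // the library dispatches code tuned for this queue's device

  // Unified shared memory keeps the matrices visible to host and device.
  float* A = sycl::malloc_shared<float>(m * k, q);
  float* B = sycl::malloc_shared<float>(k * n, q);
  float* C = sycl::malloc_shared<float>(m * n, q);
  for (std::int64_t i = 0; i < m * k; ++i) A[i] = 1.0f;
  for (std::int64_t i = 0; i < k * n; ++i) B[i] = 1.0f;
  for (std::int64_t i = 0; i < m * n; ++i) C[i] = 0.0f;

  // C = alpha * A * B + beta * C, computed by architecture-specific library code.
  oneapi::mkl::blas::row_major::gemm(
      q, oneapi::mkl::transpose::nontrans, oneapi::mkl::transpose::nontrans,
      m, n, k, 1.0f, A, k, B, n, 0.0f, C, n)
      .wait();

  sycl::free(A, q);
  sycl::free(B, q);
  sycl::free(C, q);
  return 0;
}
```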


Analysis and debug tools complete the offering. Here, Intel says it will build on its class-leading tools to deliver enhanced versions that support DPC++ and the range of SVMS architectures.




Industry-wide benefits  


The ability to efficiently program across a variety of chip architectures benefits the entire industry.


For industry and commercial developers, moving to unified development eliminates the need to work with separate code bases, multiple programming languages, and different tools and workflows. ISVs gain the ability to quickly create products and services that can be used across a variety of platforms. All developers benefit from the flexibility and productivity afforded by the unified programming model.


“IT shops, cloud service providers, and other data center managers and developers are embracing a broad set of architectures. Right now, they’ve got to adopt a vendor, or a vendor-specific library interface, and completely rewrite their software, depending on the accelerator they’re targeting,” Savage explains. “Today in the deep learning space, if you need to target a GPU versus a CPU, you’re reworking your software and doing something with a customized API, or potentially, a vendor-specific language. oneAPI holds the promise of changing that in the future.”


Analyst feedback: positive


Unified development gives Intel a powerful new competitive edge. “Intel benefits by being able to expand its footprint and can provide customers more targeted products, solutions, and services that offer an alternative to vendor-proprietary approaches,” says Savage.


oneAPI is a key piece of a larger, long-term Intel reinvention targeting an addressable market estimated at more than $300 billion. Initial industry reaction has been favorable.


“This is a powerful strategy and a big differentiator,” says analyst firm Tractica. “The ability for AI developers to target not just GPUs but many other types of computer architectures opens up the market.” Patrick Moorhead of analyst firm Moor Insights & Strategy has called oneAPI the “magic API” given its potential. “If Intel can pull it off, it will be very valuable to developers, researchers, and businesses alike, and I believe it will be a competitive advantage,” Moorhead says.


Savage believes that Intel, backed by decades of experience developing software products and tools for the more than 12 million developers in the company’s ecosystem, is uniquely positioned to deliver on oneAPI.


Proof today


The write once/use everywhere ideal, familiar from Java and some object-oriented schemes, has often delivered sub-optimal results. So where’s the proof that cross-processor development with uncompromised performance, as envisioned by oneAPI, is realistic? Developers can see oneAPI’s concepts successfully at work today in the Intel Distribution of OpenVINO™ Toolkit. OpenVINO supports heterogeneous execution across computer vision accelerators (CPU, GPU, Intel® Movidius™ Neural Compute Stick, and FPGA) using a common API, and includes optimized calls for OpenCV and OpenVX.


“OpenVINO is actually fulfilling the vision of one API in one specific domain,” says Savage. “It provides a deep learning inference engine that can accept trained models from a variety of sources: TensorFlow, PyTorch, and others, and then allows you to target the different architectures without significantly changing the developer’s code. It then gives you the choice to select the most appropriate hardware.”
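
A hedged sketch of that common API follows, based on the OpenVINO Inference Engine C++ interface of roughly that era; class and method names change between OpenVINO releases, and the model file names here are placeholders.

```cpp
// Sketch of OpenVINO's common inference API: retargeting hardware is a
// one-string change rather than a rewrite. Names follow the Inference Engine
// C++ API of that period and may differ in other OpenVINO releases.
#include <inference_engine.hpp>

int main() {
  InferenceEngine::Core core;

  // A trained model (from TensorFlow, PyTorch, etc.) converted to OpenVINO's
  // intermediate representation by the Model Optimizer.
  auto network = core.ReadNetwork("model.xml", "model.bin");

  // "CPU", "GPU", "MYRIAD" (Neural Compute Stick), or "HETERO:FPGA,CPU"
  // select the target; the surrounding code stays the same.
  auto executable = core.LoadNetwork(network, "CPU");

  auto request = executable.CreateInferRequest();
  request.Infer();  // input/output blob handling omitted for brevity
  return 0;
}
```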


Beta launch Q4


Intel teams are continuing to develop the concepts proven in OpenVINO, creating a set of cross-architecture toolkits and learning libraries in advance of the oneAPI beta release in the fourth quarter of this year.


The oneAPI beta toolkits will include support for current Intel architectures such as CPUs, integrated GPUs (iGPUs), and FPGAs, with support for upcoming GPU and other accelerator chips being added after their release. “So, developers at a data center can, for example, start their development on integrated graphics, transition to the first low-end discrete card, and then be prepared for the high-end data center part.”
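
In DPC++/SYCL terms, that transition can be as small as leaving device selection to the runtime, as in the hedged sketch below; the selector name follows SYCL 2020-style conventions and may be spelled differently in earlier DPC++ previews.

```cpp
// Sketch: the same source asks for "a GPU" and runs on whichever one is
// present, integrated graphics today or a discrete card later.
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
  sycl::queue q{sycl::gpu_selector_v};  // runtime binds to any available GPU
  std::cout << "Running on: "
            << q.get_device().get_info<sycl::info::device::name>() << "\n";
  return 0;
}
```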


“Customers want an open alternative to proprietary solutions,” Savage sums up. “With oneAPI, Intel is embracing all four classes of compute architecture: scalar (CPU), vector (GPU), matrix (AI accelerators), and spatial (FPGA), to best meet the computing needs of the data center and the industry, now and into the future.”

