We provide compiler and runtime development and heterogeneous parallel computing optimization services.

Our focus is on using and extending open source frameworks to encourage collaboration and to avoid repeated "re-engineering of the wheel".


Parmance is a company of specialists in processor architectures, instruction-set simulators, compilers and heterogeneous parallel computing. We provide our services to projects that benefit from this specialized expertise.

We believe open source is often the best option, not only due to its low starting costs, but also for the enhanced collaboration possibilities, and especially for the "shared source" aspect: vendor lock-in on a key piece of intellectual property, such as the processor's software stack, is something our clients generally want to avoid.


Oct 15 2022: Parmance joined Intel!

In the past couple of months, the whole Parmance team has transitioned to Intel! At Intel we will keep working towards a more open-standards-based heterogeneous computing world: our initial focus is to make the CHIP-SPV tool an efficient path from HIP/CUDA to SPIR-V and to devices driven by OpenCL or Level Zero. We thank our partners for the fruitful collaboration in the past and hope to continue working with you in the future!

May 10 2018: Proof-of-Concept of C++17 Parallel STL Offloading for GCC/libstdc++ Released

Parmance and General Processor Technologies have been collaborating on C++17 Parallel STL offloading support based on HSA (Heterogeneous System Architecture) and GCC (GNU Compiler Collection). A working proof-of-concept has now been released and is available on GitHub. This post on the IEEE Computing Now blog provides a high-level overview of the project.



Our engineers have worked with popular open source compiler frameworks such as LLVM and GCC for years.

If your company could use an efficient pair of hands or two to port or optimize a compiler for your architecture, we can help. We can work as a part of your compiler team to boost the pace of development.


Heterogeneous computing requires a robust runtime library implementation to support APIs such as OpenCL or HSA, so that programming is easy and efficient and portability issues are minimal. Implementing these APIs efficiently and robustly on new platforms benefits from the expertise we can provide.


Having trouble squeezing enough performance out of your application of interest, even though your processor claims to have all those GOPS available? Why not let us take a look?

If you send us your algorithm or reference code, we'll chart the options for improving its performance on your selected platform, and will optimize it for you if you wish.


Compiler development is tricky and risky. First, the developers hired for compiler teams need strong knowledge of both the hardware and the software, a combination that is relatively uncommon these days. Second, the software being developed, the compiler, is a multilayered beast in which each layer has its own complexities and interdependencies.

What makes matters worse, and brings additional risk, is debugging, which is especially demanding. A compiler crash is actually the positive type of failure: in such a case it is often quite easy to spot the root of the problem. However, when the compiler produces code that makes a large program emit a single wrong value in megabytes of output, that is not so nice.

Our engineers have worked with popular open source compiler frameworks such as LLVM and GCC for years and have grown a rather thick skin during their gory miscompilation debugging sessions. If your company could use an efficient pair of hands or two to port or optimize a compiler for your architecture, we can help. We can work as a part of your compiler team to boost the pace of development.


Heterogeneous computing promises to deliver high performance with low power consumption. Nowadays, using GPUs as high-performance general-purpose compute platforms alongside general-purpose CPUs is already almost mainstream. Almost. Except that programming them efficiently is still considered too hard. Performance portability of heterogeneous programs, in particular, is a joke that is not funny. And how "heterogeneous" is a platform, really, when only a CPU and a GPU are made easily available to the programmer?

We believe there is quite a long road ahead before heterogeneous platforms gain the mind share and the ease of programming that homogeneous platforms such as SMPs enjoy. Nevertheless, the power-performance promise of heterogeneous computing is clearly there, and is recognized especially in the embedded community. We think open standards such as OpenCL and HSA are important steps in the evolution towards solving the programming and portability issues plaguing heterogeneous computing. We also believe there is plenty of room, and plenty of use cases, for a much wider variety of co-processors and accelerators than only those originally designed for graphics. In our opinion it's just a matter of programmer friendliness: as soon as the processor's resources are easily usable through a familiar programming API, developers will find more and more uses for them in accelerating their applications.

If your company sells a processor architecture which would benefit from common standardized heterogeneous system APIs such as OpenCL or HSA, we can help. We can implement an efficient software stack for you by utilizing open source components as a basis, without the infamous vendor lock-in.


If you need your application to run faster or with lower power consumption, we can most likely do something about it. If we cannot, we will congratulate you.

There are several options in our toolbox to deliver the performance or power efficiency you need on the platform of your choice. We can handle SMP CPUs with SIMD instructions, DSPs/ISPs with their special instructions and VLIW architectures, or GPUs with a massive amount of parallel resources. We are fluent in OpenCL and HSA, which we can use to accelerate your application while providing the desired degree of portability (both source code portability and performance portability). Or we can simply hand-optimize your algorithm loops in assembly, if portability is not a concern.

We can also help you choose a compute platform that fulfills your performance goals without forgetting other goals such as cost or form factor. If off-the-shelf components are not enough for your demanding case, we can even customize a co-processor architecture for you from scratch. Yes, there's an open source tool for that too.




c/o Kampusareena

Korkeakoulunkatu 7

33720 Tampere, Finland

E-mail: info@parmance.com