Parallel programming is an important technique for boosting computer performance. It centers on building multi-threaded or multi-process programs that use many computing resources at the same time. Getting started is not always smooth, however, partly because most parallel programming tools target lower-level languages such as C, C++ and Fortran. This article discusses parallel computing tools for C/C++.

There are many parallel computing tools to choose from. The best known are OpenMP and Open MPI, alongside other MPI implementations such as MPICH and MVAPICH. Feel free to pick whichever tool suits you and enjoy learning more.


OpenMP is an API (application programming interface) for multi-threaded programming. It provides a portable, scalable model for developers of shared-memory parallel applications. The API has three main components that allow OpenMP to be added as an extension to C/C++:

  1. Compiler Directives
  2. Runtime Library Routines
  3. Environment Variables

Compiler Directives

The compiler directives are #pragma extensions to the C/C++ programming language. They include #pragma omp parallel, #pragma omp section, #pragma omp for, #pragma omp single, #pragma omp task, #pragma omp atomic, #pragma omp master, #pragma omp barrier, #pragma omp critical, #pragma omp flush, #pragma omp ordered, #pragma omp threadprivate and #pragma omp taskwait. OpenMP directives exploit shared-memory parallelism by defining various types of parallel regions. Parallel regions can include both iterative and non-iterative segments of program code.
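
As a minimal sketch of how such a directive is used, the function below combines #pragma omp parallel and #pragma omp for into a single parallel for directive and adds a reduction clause. The name dot_product is a hypothetical helper chosen for this example, not part of OpenMP. The sketch assumes a compiler with OpenMP enabled (for example the -fopenmp flag in GCC); without it the pragma is ignored and the loop simply runs serially.

```c
/* Hypothetical helper for illustration: dot product of two vectors of length n. */
double dot_product(const double *a, const double *b, int n)
{
    double sum = 0.0;

    /* The loop iterations are divided among the threads of the team.
       reduction(+:sum) gives every thread its own copy of sum and
       adds the copies together when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }

    return sum;
}
```

Calling dot_product on the vectors {1, 2, 3} and {4, 5, 6} gives 32, regardless of how many threads share the loop.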

Runtime Library Routines

OpenMP provides run-time library routines to help you manage your program in parallel mode. Many of these run-time library routines have corresponding environment variables that can be set as defaults. The run-time library routines let you dynamically change these factors to assist in controlling your program. In all cases, a call to a run-time library routine overrides any corresponding environment variable. Examples include:

void omp_set_num_threads(int nthreads)

Sets the number of threads to use for subsequent parallel regions created by the calling thread.

void omp_set_dynamic(int dynamic_threads)

Enables or disables dynamic adjustment of the number of threads used to execute a parallel region.

void omp_set_nested(int nested)

Enables or disables nested parallelism. Since OpenMP 5.0 this routine is deprecated in favor of omp_set_max_active_levels.

These are just examples; there are many other runtime routines for monitoring and influencing threads and the parallel environment.
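
The sketch below shows how a few of these routines might be combined. The helpers request_threads and actual_team_size are hypothetical names invented for this example; omp_set_dynamic, omp_set_num_threads, omp_get_max_threads and omp_get_num_threads are real OpenMP routines. The serial stand-ins in the #else branch are an assumption of this sketch, added only so the file still builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) so the file still
   builds when OpenMP is not enabled in the compiler. */
static inline void omp_set_dynamic(int flag) { (void)flag; }
static inline void omp_set_num_threads(int n) { (void)n; }
static inline int omp_get_max_threads(void) { return 1; }
static inline int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: ask for a team size and report the upper bound
   the runtime will use for the next parallel region. */
int request_threads(int requested)
{
    omp_set_dynamic(0);              /* disable dynamic adjustment of the team size */
    omp_set_num_threads(requested);  /* takes precedence over OMP_NUM_THREADS */
    return omp_get_max_threads();
}

/* Hypothetical helper: open a parallel region and return its actual team size. */
int actual_team_size(void)
{
    int size = 1;

    #pragma omp parallel
    {
        /* One thread records the team size; the others skip the block. */
        #pragma omp single
        size = omp_get_num_threads();
    }

    return size;
}
```

The runtime may still provide fewer threads than requested, so the value returned by actual_team_size should be treated as the real team size rather than the number passed to request_threads.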

Environment Variables

Another advantage of using OpenMP is that runtime options affecting parallel execution can be set through OMP environment variables, without recompiling the program. These environment variables use syntax of the form variable=option_and_args. Examples include:

  • OMP_DYNAMIC
  • OMP_MAX_ACTIVE_LEVELS
  • OMP_NUM_THREADS
  • OMP_THREAD_LIMIT
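
To see which of these settings a program was started with, a small helper along the following lines can be used. This is only a sketch: omp_env_or and print_omp_env are hypothetical names made up for this example, and the code just reads the process environment with the standard getenv function. It does not query the OpenMP runtime itself.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable, or fallback when it is
   unset or empty. omp_env_or is a hypothetical helper for this sketch. */
const char *omp_env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* Print the OpenMP-related variables present in the environment. */
void print_omp_env(void)
{
    static const char *const names[] = {
        "OMP_NUM_THREADS", "OMP_DYNAMIC",
        "OMP_MAX_ACTIVE_LEVELS", "OMP_THREAD_LIMIT"
    };
    size_t count = sizeof names / sizeof names[0];

    for (size_t i = 0; i < count; i++) {
        printf("%s=%s\n", names[i], omp_env_or(names[i], "(not set)"));
    }
}
```

Starting a program from a POSIX shell as OMP_NUM_THREADS=4 ./program (where program is a placeholder name) would make print_omp_env report OMP_NUM_THREADS=4.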


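Open MPI is an open-source implementation of the MPI (Message Passing Interface) standard, a library interface for passing messages between processes that usually do not share memory and may run on different machines. It is easily confused with OpenMP, which is a set of language extensions for shared-memory, multi-threaded parallelism (commonly loops over arrays). The goals of Open MPI include: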

  1. Create a free, open-source, peer-reviewed, production-quality complete MPI implementation.
  2. Provide extremely high, competitive performance (latency, bandwidth).
  3. Directly involve the HPC community with external development and feedback.
  4. Provide a stable platform for 3rd party research and commercial development.
  5. Help prevent the “forking problem” common to other MPI projects.
  6. Support a wide variety of HPC platforms and environments.
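
In a message-passing program every process works on its own part of the data and exchanges results explicitly. The sketch below shows one common way to decide which part belongs to which process. The struct block_range and the function block_for_rank are hypothetical names invented for this example and contain no MPI calls; in a real MPI program the rank and size arguments would typically come from MPI_Comm_rank and MPI_Comm_size.

```c
/* Half-open range [begin, end) of items owned by one process. */
struct block_range {
    int begin;
    int end;
};

/* Hypothetical helper: split n items as evenly as possible among
   size processes and return the slice that belongs to rank.
   Assumes size > 0 and 0 <= rank < size. */
struct block_range block_for_rank(int rank, int size, int n)
{
    int base = n / size;   /* items every process receives */
    int extra = n % size;  /* the first `extra` processes get one more */
    struct block_range r;

    r.begin = rank * base + (rank < extra ? rank : extra);
    r.end = r.begin + base + (rank < extra ? 1 : 0);
    return r;
}
```

With 10 items and 3 processes, ranks 0, 1 and 2 own the ranges [0, 4), [4, 7) and [7, 10). Each process would then work on its own range and exchange results with calls such as MPI_Send and MPI_Recv.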

Advantages of Open MPI over FT-MPI, LA-MPI, LAM/MPI and PACX-MPI, as stated by the Open MPI developers, are:

  • Open MPI represents the next generation of each of these implementations.
  • Open MPI effectively contains the union of features from each of the previous MPI projects. If you find a feature in one of the prior projects that is not in Open MPI, chances are that it will be soon.
  • The vast majority of our future research and development work will be in Open MPI.
  • All the same developers from your favorite project are working on Open MPI.


MPICH is a high-performance and widely portable implementation of the Message Passing Interface (MPI) standard (MPI-1, MPI-2, and MPI-3). The goals of MPICH are:

  1. To provide an MPI implementation that efficiently supports different computation and communication platforms including commodity clusters.
  2. To enable cutting-edge research in MPI through an easy-to-extend modular framework for other derived implementations.


MVAPICH is a BSD-licensed implementation of the MPI standard developed by Ohio State University. It is an MPI-3.1 implementation based on the MPICH ADI3 layer.

Features and platforms

OFA-IB-CH3: This interface supports all InfiniBand-compliant devices based on the OpenFabrics libibverbs layer with the CH3 channel (OSU enhanced) of the MPICH2 stack. This interface has the most features and is the most widely used.

OFA-iWARP-CH3: This interface supports all iWARP-compliant devices supported by OpenFabrics.

Shared-Memory-CH3: This interface provides native shared-memory support on multi-core platforms where communication is required only within a node, such as SMP-only systems and laptops.

What is parallel computing?

Parallel computing is the simultaneous execution of multiple computations. A computer works on many tasks at the same time and combines the results at the end. It takes advantage of machines that have more than one processor (or core): the processors run in parallel, and using each one effectively speeds up the computation.
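
The idea of running tasks at the same time and combining the results at the end can be sketched with OpenMP sections. In the example below, value_range is a hypothetical function written for this illustration: one section searches for the minimum of an array while the other searches for the maximum, and the two results are combined after both have finished. It assumes OpenMP is enabled in the compiler; otherwise the pragmas are ignored and the two blocks run one after the other.

```c
/* Hypothetical helper: difference between the largest and smallest
   element of a, computed as two independent tasks. Assumes n >= 1. */
int value_range(const int *a, int n)
{
    int lo = a[0];
    int hi = a[0];

    /* Each section can be executed by a different thread. The two
       sections write to different variables, so they do not interfere. */
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            for (int i = 1; i < n; i++)
                if (a[i] < lo) lo = a[i];
        }

        #pragma omp section
        {
            for (int i = 1; i < n; i++)
                if (a[i] > hi) hi = a[i];
        }
    } /* implicit barrier: both tasks are complete here */

    /* Combine the partial results. */
    return hi - lo;
}
```

For the array {3, 9, -2, 7} the minimum is -2 and the maximum is 9, so value_range returns 11.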

What is the best parallel programming language?

It depends on the purpose of the program, but C and C++ are generally among the most widely used languages for parallel computing. OpenMP and Open MPI, two of the most popular parallel programming tools, can both be used from C and C++.

Parallel computing with an example in C/C++

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int nthreads, tid;

    /* Fork a team of threads giving them their own copies of variables */
    #pragma omp parallel private(nthreads, tid)
    {
        /* Obtain thread number */
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0)
        {
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    } /* All threads join master thread and disband */

    return 0;
}