
CEN310 Parallel Programming

Week-1

Course Introduction and Development Environment Setup



Outline (1/3)

  1. Course Overview
    • Course Description
    • Learning Outcomes
    • Assessment Methods
    • Course Topics

  2. Development Environment Setup
    • Required Hardware
    • Required Software
    • Installation Steps

Outline (2/3)

  1. Introduction to Parallel Programming
    • What is Parallel Programming?
    • Why Parallel Programming?
    • Basic Concepts

  2. First Parallel Program
    • Hello World Example
    • Compilation Steps
    • Running and Testing

Outline (3/3)

  1. Understanding Hardware
    • CPU Architecture
    • Memory Patterns

  2. Performance and Practice
    • Parallel Patterns
    • Performance Measurement
    • Homework
    • Resources

1. Course Overview

Course Description

This course introduces fundamental concepts and practices of parallel programming, focusing on:

  • Designing and implementing efficient parallel algorithms
  • Using modern programming frameworks
  • Understanding parallel architectures
  • Analyzing and optimizing parallel programs


Learning Outcomes (1/2)

After completing this course, you will be able to:

  1. Design and implement parallel algorithms using OpenMP and MPI
  2. Analyze and optimize parallel program performance
  3. Develop solutions using various programming models

Learning Outcomes (2/2)

  4. Apply parallel computing concepts to real-world problems
  5. Evaluate and select appropriate parallel computing approaches based on:
    • Problem requirements
    • Hardware constraints
    • Performance goals

Assessment Methods

Assessment              Weight   Due Date
Quiz-1                  40%      Week 7
Midterm Project Report  60%      Week 8
Quiz-2                  30%      Week 13
Final Project Report    70%      Week 14

Course Topics (1/2)

  1. Parallel computing concepts
    • Basic principles
    • Architecture overview
    • Programming models

  2. Algorithm design and analysis
    • Design patterns
    • Performance metrics
    • Optimization strategies

Course Topics (2/2)

  3. Programming frameworks
    • OpenMP
    • MPI
    • GPU Computing

  4. Advanced topics
    • Performance optimization
    • Real-world applications
    • Best practices

Why Parallel Programming? (1/2)

Historical Evolution

  • Moore's Law limitations
  • Multi-core revolution
  • Cloud computing era
  • Big data requirements

Industry Applications

  • Scientific simulations
  • Financial modeling
  • AI/Machine Learning
  • Video processing

Why Parallel Programming? (2/2)

Performance Benefits

  • Reduced execution time
  • Better resource utilization
  • Improved responsiveness
  • Higher throughput

Challenges

  • Synchronization overhead
  • Load balancing
  • Debugging complexity
  • Race conditions

Parallel Computing Models (1/2)

Shared Memory

CPU     CPU     CPU     CPU
  │       │       │       │
  └───────┴───────┴───────┘
    Shared Memory
  • All processors access same memory
  • Easy to program
  • Limited scalability
  • Example: OpenMP
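
A minimal sketch of the shared-memory model: every thread reads and writes the same array directly (the array size and the thread-id fill are chosen only for illustration).

#include <omp.h>
#include <vector>
#include <cstdio>

int main() {
    std::vector<int> data(16, 0);          // one array, visible to all threads

    #pragma omp parallel for
    for (int i = 0; i < (int)data.size(); i++) {
        data[i] = omp_get_thread_num();    // record which thread wrote each element
    }

    for (int i = 0; i < (int)data.size(); i++)
        printf("data[%d] written by thread %d\n", i, data[i]);
    return 0;
}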

Parallel Computing Models (2/2)

Distributed Memory

CPU──Memory   CPU──Memory
    │             │
    └─────Network─┘
    │             │
CPU──Memory   CPU──Memory
  • Each processor has private memory
  • Better scalability
  • More complex programming
  • Example: MPI
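
For contrast, a minimal MPI sketch of the distributed-memory model (MPI itself is covered later in the course; this assumes an MPI installation providing mpic++/mpirun): each process owns its memory and communicates explicitly.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's id
    MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}

// Build and run, e.g.: mpic++ hello_mpi.cpp -o hello_mpi && mpirun -np 4 ./hello_mpi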

Memory Architecture Deep Dive (1/3)

Cache Hierarchy

// Example showing cache effects
// (uses the Timer helper defined in the Performance Measurement section)
#include <cstdio>

void demonstrateCacheEffects() {
    const int SIZE = 1024 * 1024;
    int* arr = new int[SIZE];

    // Sequential access (cache-friendly)
    Timer t1;
    for(int i = 0; i < SIZE; i++) {
        arr[i] = i;
    }
    double sequential_time = t1.elapsed();

    // Strided access (cache-unfriendly)
    Timer t2;
    for(int i = 0; i < SIZE; i++) {
        arr[(i * 16) % SIZE] = i;
    }
    double strided_time = t2.elapsed();

    printf("Strided-to-sequential time ratio: %f\n",
           strided_time/sequential_time);

    delete[] arr;
}

Memory Architecture Deep Dive (2/3)

False Sharing Example

#include <omp.h>

// Bad example with false sharing
void falseSharing() {
    int data[4];
    #pragma omp parallel for
    for(int i = 0; i < 4; i++) {
        for(int j = 0; j < 1000000; j++) {
            data[i]++; // Adjacent elements share cache line
        }
    }
}

// Better version avoiding false sharing
void avoidFalseSharing() {
    struct PaddedInt {
        int value;
        char padding[60]; // Separate cache lines
    };
    PaddedInt data[4];

    #pragma omp parallel for
    for(int i = 0; i < 4; i++) {
        for(int j = 0; j < 1000000; j++) {
            data[i].value++;
        }
    }
}

Memory Architecture Deep Dive (3/3)

NUMA Awareness

// NUMA-aware allocation (first-touch: pages end up near the thread
// that first writes them)
#include <vector>

void numaAwareAllocation() {
    #pragma omp parallel
    {
        // Each thread allocates and initializes its own memory
        std::vector<double> local_data(1000000);

        // Process the thread-local data (no work-sharing: the vector is private)
        for(std::size_t i = 0; i < local_data.size(); i++) {
            local_data[i] = heavyComputation(i);
        }
    }
}

OpenMP Fundamentals (1/4)

Basic Parallel Regions

#include <iostream>
#include <omp.h>

void basicParallelRegion() {
    #pragma omp parallel
    {
        // This code runs in parallel
        int thread_id = omp_get_thread_num();

        #pragma omp critical
        std::cout << "Thread " << thread_id << " starting\n";

        // Do some work
        heavyComputation();

        #pragma omp critical
        std::cout << "Thread " << thread_id << " finished\n";
    }
}

OpenMP Fundamentals (2/4)

Work Sharing Constructs

void workSharing() {
    const int SIZE = 1000000;
    std::vector<double> data(SIZE);

    // Parallel for loop
    #pragma omp parallel for schedule(dynamic, 1000)
    for(int i = 0; i < SIZE; i++) {
        data[i] = heavyComputation(i);
    }

    // Parallel sections
    #pragma omp parallel sections
    {
        #pragma omp section
        { task1(); }

        #pragma omp section
        { task2(); }
    }
}

OpenMP Fundamentals (3/4)

Data Sharing

void dataSharing() {
    int shared_var = 0;
    int private_var = 0;

    #pragma omp parallel private(private_var) \
                         shared(shared_var)
    {
        private_var = omp_get_thread_num(); // Each thread has its copy

        #pragma omp critical
        shared_var += private_var; // Updates shared variable
    }
}

OpenMP Fundamentals (4/4)

Synchronization

void synchronization() {
    int counter = 0;

    #pragma omp parallel
    {
        // Barrier synchronization
        #pragma omp barrier

        // Critical section
        #pragma omp critical
        {
            // Exclusive access
        }

        // Atomic operation
        #pragma omp atomic
        counter++;
    }
}

Practical Workshop (1/3)

Matrix Multiplication

void matrixMultiply(const std::vector<std::vector<double>>& A,
                   const std::vector<std::vector<double>>& B,
                   std::vector<std::vector<double>>& C) {
    int N = A.size();

    #pragma omp parallel for collapse(2)
    for(int i = 0; i < N; i++) {
        for(int j = 0; j < N; j++) {
            double sum = 0.0;
            for(int k = 0; k < N; k++) {
                sum += A[i][k] * B[k][j];
            }
            C[i][j] = sum;
        }
    }
}

Practical Workshop (2/3)

Performance Comparison

void comparePerformance() {
    const int N = 1000;
    auto A = generateRandomMatrix(N);
    auto B = generateRandomMatrix(N);
    auto C1 = createEmptyMatrix(N);
    auto C2 = createEmptyMatrix(N);

    // Sequential version
    Timer t1;
    matrixMultiplySequential(A, B, C1);
    double sequential_time = t1.elapsed();

    // Parallel version
    Timer t2;
    matrixMultiply(A, B, C2);
    double parallel_time = t2.elapsed();

    printf("Speedup: %f\n", sequential_time/parallel_time);
}

Practical Workshop (3/3)

Exercise Tasks

  1. Implement matrix multiplication
  2. Measure performance with different matrix sizes
  3. Try different scheduling strategies (see the timing sketch below)
  4. Plot performance results
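
A minimal timing sketch for tasks 2-3, comparing schedules with omp_get_wtime (the sin() workload and chunk sizes are placeholders):

#include <omp.h>
#include <cmath>
#include <cstdio>

int main() {
    const int SIZE = 1000000;
    double sum = 0.0;

    double t0 = omp_get_wtime();
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (int i = 0; i < SIZE; i++) sum += std::sin(i);
    printf("static : %f s\n", omp_get_wtime() - t0);

    t0 = omp_get_wtime();
    #pragma omp parallel for schedule(dynamic, 1000) reduction(+:sum)
    for (int i = 0; i < SIZE; i++) sum += std::sin(i);
    printf("dynamic: %f s\n", omp_get_wtime() - t0);

    t0 = omp_get_wtime();
    #pragma omp parallel for schedule(guided) reduction(+:sum)
    for (int i = 0; i < SIZE; i++) sum += std::sin(i);
    printf("guided : %f s\n", omp_get_wtime() - t0);

    return 0;
}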

2. Development Environment

Required Hardware

  • Multi-core processor
  • 16GB RAM (recommended)
  • 100GB free disk space
  • Windows 10/11 (version 2004+)

Required Software

  1. Visual Studio Community 2022
  2. Windows Subsystem for Linux (WSL2)
  3. Ubuntu distribution
  4. Git for Windows

Step-by-Step Installation Guide

Windows Installation (30 minutes)

  1. Visual Studio Code Installation
    • Go to Visual Studio Code
    • Click "Download for Windows"
    • Run the installer (VSCodeUserSetup-x64-*.exe)
    • ✅ Check "Add to PATH" during installation
    • ✅ Check "Add 'Open with Code' action"

  2. MinGW Compiler Installation

    # Step 1: Download MSYS2
    # Visit https://www.msys2.org/ and download installer
    
    # Step 2: Run MSYS2 installer
    # Use default installation path: C:\msys64
    
    # Step 3: Open MSYS2 terminal and run:
    pacman -Syu  # Update package database
    # Close terminal when asked
    
    # Step 4: Reopen MSYS2 and install required packages:
    pacman -S mingw-w64-x86_64-gcc
    pacman -S mingw-w64-x86_64-gdb
    
  3. Add to PATH
    • Open Windows Search
    • Type "Environment Variables"
    • Click "Edit the system environment variables"
    • Click "Environment Variables"
    • Under "System Variables", find "Path"
    • Click "Edit" → "New"
    • Add C:\msys64\mingw64\bin
    • Click "OK" on all windows

  4. Verify Installation

    # Open new Command Prompt and type:
    gcc --version
    g++ --version
    gdb --version
    

VS Code Configuration (15 minutes)

  1. Install Required Extensions
    • Open VS Code
    • Press Ctrl+Shift+X
    • Install these extensions:
      • C/C++ Extension Pack
      • Code Runner
      • GitLens
      • Live Share

  2. Create Workspace

    # Open Command Prompt
    mkdir parallel_programming
    cd parallel_programming
    code .
    
  3. Configure Build Tasks
    • Press Ctrl+Shift+P
    • Type "Tasks: Configure Default Build Task"
    • Select "Create tasks.json from template"
    • Select "Others"
    • Replace content with:
    {
        "version": "2.0.0",
        "tasks": [
            {
                "label": "build",
                "type": "shell",
                "command": "g++",
                "args": [
                    "-g",
                    "-fopenmp",
                    "${file}",
                    "-o",
                    "${fileDirname}/${fileBasenameNoExtension}"
                ],
                "group": {
                    "kind": "build",
                    "isDefault": true
                }
            }
        ]
    }
    

First OpenMP Program (15 minutes)

  1. Create Test File
    • In VS Code, create new file: test.cpp
    • Add this code:

    #include <iostream>
    #include <omp.h>
    
    int main() {
        // Get total available threads
        int max_threads = omp_get_max_threads();
        printf("System has %d processors available\n", max_threads);
    
        // Set number of threads
        omp_set_num_threads(4);
    
        // Parallel region
        #pragma omp parallel
        {
            int id = omp_get_thread_num();
            printf("Hello from thread %d\n", id);
    
            // Only master thread prints total
            if (id == 0) {
                printf("Total %d threads running\n", 
                       omp_get_num_threads());
            }
        }
        return 0;
    }
    
  2. Compile and Run
    • Press Ctrl+Shift+B to build
    • Open terminal (Ctrl+`)
    • Run program:

    ./test

  3. Experiment

    # Try different thread counts (Windows Command Prompt syntax).
    # Note: remove or comment out omp_set_num_threads(4) in test.cpp first,
    # otherwise it overrides OMP_NUM_THREADS.
    set OMP_NUM_THREADS=2
    test.exe
    
    set OMP_NUM_THREADS=8
    test.exe
    

Common Issues and Solutions

  1. Compiler Not Found
    • Verify PATH setting
    • Restart VS Code
    • Restart Command Prompt

  2. OpenMP Not Recognized
    • Ensure -fopenmp flag in tasks.json
    • Check compiler version supports OpenMP

  3. Program Crashes
    • Check array bounds
    • Verify thread synchronization
    • Use proper reduction clauses

Practice Exercises

  1. Basic Parallel For

    // Create array_sum.cpp
    #include <omp.h>
    #include <cstdio>
    #include <vector>
    
    int main() {
        const int SIZE = 1000000;
        std::vector<int> data(SIZE);
        long long sum = 0; // long long: the total (~5e11) would overflow a 32-bit long
    
        // Initialize array
        for(int i = 0; i < SIZE; i++) {
            data[i] = i;
        }
    
        // Parallel sum
        #pragma omp parallel for reduction(+:sum)
        for(int i = 0; i < SIZE; i++) {
            sum += data[i];
        }
    
        printf("Sum: %lld\n", sum);
        return 0;
    }
    
  2. Thread Private Data

    // Create thread_private.cpp
    #include <omp.h>
    #include <cstdio>
    
    int main() {
        int thread_sum = 0;
    
        #pragma omp parallel private(thread_sum)
        {
            // Each thread works on its own (uninitialized) copy of thread_sum
            thread_sum = omp_get_thread_num();
            printf("Thread %d: sum = %d\n", 
                   omp_get_thread_num(), thread_sum);
        }
    
        // The original thread_sum is not modified by the private copies
        printf("Final sum: %d\n", thread_sum);
        return 0;
    }
    

3. Introduction to Parallel Programming

What is Parallel Programming? (1/2)

Parallel programming is the technique of writing programs that:

  • Execute multiple tasks simultaneously
  • Utilize multiple computational resources
  • Improve performance through parallelization


What is Parallel Programming? (2/2)

Key Concepts:

  • Task decomposition
  • Data distribution
  • Load balancing
  • Synchronization


4. First Parallel Program

Hello World Example

#include <iostream>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        int thread_id = omp_get_thread_num();
        int total_threads = omp_get_num_threads();

        printf("Hello from thread %d of %d!\n", 
               thread_id, total_threads);
    }
    return 0;
}

Compilation Steps

Visual Studio:

# Create new project
mkdir parallel_hello
cd parallel_hello

# Compile with OpenMP
cl /openmp hello.cpp
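
MinGW/g++ (matching the tasks.json used earlier):

# Compile with OpenMP using g++
g++ -fopenmp hello.cpp -o hello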

Running and Testing

Windows:

hello.exe

Linux/WSL:

./hello

Expected Output (thread order may vary):

Hello from thread 0 of 4!
Hello from thread 2 of 4!
Hello from thread 3 of 4!
Hello from thread 1 of 4!

5. Understanding Hardware

CPU Architecture

CPU
├── Core 0
│   ├── L1 Cache
│   └── L2 Cache
├── Core 1
│   ├── L1 Cache
│   └── L2 Cache
└── Shared L3 Cache
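
A small sketch to check how many logical processors the machine reports (omp_get_num_procs and std::thread::hardware_concurrency are standard calls; output varies by machine):

#include <omp.h>
#include <thread>
#include <cstdio>

int main() {
    printf("OpenMP logical processors : %d\n", omp_get_num_procs());
    printf("C++ hardware_concurrency  : %u\n", std::thread::hardware_concurrency());
    printf("Default OpenMP max threads: %d\n", omp_get_max_threads());
    return 0;
}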

Memory Access Patterns

#include <chrono>
#include <cstdio>
#include <vector>

void measureMemoryAccess() {
    const int SIZE = 1000000;
    std::vector<int> data(SIZE);

    // Sequential access
    auto start = std::chrono::high_resolution_clock::now();
    for(int i = 0; i < SIZE; i++) {
        data[i] = i;
    }
    auto end = std::chrono::high_resolution_clock::now();
    double sequential = std::chrono::duration<double>(end - start).count();

    // Strided access (cache-unfriendly)
    start = std::chrono::high_resolution_clock::now();
    for(int i = 0; i < SIZE; i++) {
        data[(i * 16) % SIZE] = i;
    }
    end = std::chrono::high_resolution_clock::now();
    double strided = std::chrono::duration<double>(end - start).count();

    printf("Sequential: %f s, Strided: %f s\n", sequential, strided);
}

6. Parallel Patterns

Data Parallelism Example

#include <omp.h>
#include <vector>

void vectorAdd(const std::vector<int>& a, 
               const std::vector<int>& b, 
               std::vector<int>& result) {
    #pragma omp parallel for
    for(int i = 0; i < a.size(); i++) {
        result[i] = a[i] + b[i];
    }
}

Task Parallelism Example

#pragma omp parallel sections
{
    #pragma omp section
    {
        // Task 1: Matrix multiplication
    }

    #pragma omp section
    {
        // Task 2: File processing
    }
}

7. Performance Measurement

Using the Timer Class

#include <chrono>

class Timer {
    std::chrono::high_resolution_clock::time_point start;
public:
    Timer() : start(std::chrono::high_resolution_clock::now()) {}

    double elapsed() {
        auto end = std::chrono::high_resolution_clock::now();
        return std::chrono::duration<double>(end - start).count();
    }
};

Measuring Parallel Performance

#include <cmath>
#include <iostream>
#include <vector>

void measureParallelPerformance() {
    const int SIZE = 100000000;
    std::vector<double> data(SIZE);

    Timer t;
    #pragma omp parallel for
    for(int i = 0; i < SIZE; i++) {
        data[i] = std::sin(i) * std::cos(i);
    }
    std::cout << "Time: " << t.elapsed() << "s\n";
}

8. Homework

Assignment 1: Environment Setup

  1. Screenshots of installations
  2. Version information
  3. Example program results
  4. Issue resolution documentation

Assignment 2: Performance Analysis

  1. Process & thread ID printing (a starter sketch follows this list)
  2. Execution time measurements
  3. Performance graphs
  4. Analysis report
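
A starter sketch for Assignment 2 (the process-ID call is platform-dependent, so it is guarded; the workload size is a placeholder):

#include <omp.h>
#include <cstdio>
#ifdef _WIN32
    #include <process.h>   // _getpid
    #define GETPID() _getpid()
#else
    #include <unistd.h>    // getpid
    #define GETPID() getpid()
#endif

int main() {
    const int SIZE = 10000000;
    double sum = 0.0;

    double t0 = omp_get_wtime();
    #pragma omp parallel reduction(+:sum)
    {
        // Each thread reports the process ID and its own thread ID
        printf("Process %d, thread %d of %d\n",
               (int)GETPID(), omp_get_thread_num(), omp_get_num_threads());

        #pragma omp for
        for (int i = 0; i < SIZE; i++) sum += i * 0.5;
    }
    printf("Elapsed: %f s (sum = %f)\n", omp_get_wtime() - t0, sum);
    return 0;
}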

9. Resources

Documentation

  • OpenMP API Specification
  • Visual Studio Parallel Programming
  • WSL Documentation

Books and Tutorials

  • "Introduction to Parallel Programming"
  • "Using OpenMP"
  • Online courses

Next Week Preview

We will cover:

  • Advanced parallel patterns
  • Performance analysis
  • OpenMP features
  • Practical exercises


10. Advanced OpenMP Features

Nested Parallelism (1/2)

#include <omp.h>

void nestedParallelExample() {
    omp_set_nested(1); // Enable nested parallelism
                       // (deprecated since OpenMP 5.0; omp_set_max_active_levels(2) is the modern form)

    #pragma omp parallel num_threads(2)
    {
        int outer_id = omp_get_thread_num();

        #pragma omp parallel num_threads(2)
        {
            int inner_id = omp_get_thread_num();
            printf("Outer thread %d, Inner thread %d\n", 
                   outer_id, inner_id);
        }
    }
}

Nested Parallelism (2/2)

Expected Output (order may vary):

Outer thread 0, Inner thread 0
Outer thread 0, Inner thread 1
Outer thread 1, Inner thread 0
Outer thread 1, Inner thread 1

Benefits:

  • Hierarchical parallelism
  • Better resource utilization
  • Complex parallel patterns


Task-Based Parallelism (1/3)

void taskBasedExample() {
    #pragma omp parallel
    {
        #pragma omp single
        {
            #pragma omp task
            heavyTask1();

            #pragma omp task
            heavyTask2();

            #pragma omp taskwait
            printf("All tasks completed\n");
        }
    }
}

Task-Based Parallelism (2/3)

Fibonacci Example

int parallel_fib(int n) {
    if (n < 30) return fib_sequential(n);

    int x, y;
    #pragma omp task shared(x)
    x = parallel_fib(n - 1);

    #pragma omp task shared(y)
    y = parallel_fib(n - 2);

    #pragma omp taskwait
    return x + y;
}
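
As in the task example above, the recursion must be started from inside a parallel region by a single thread so the generated tasks are shared by the whole team; a minimal launcher sketch:

int fib_result;
#pragma omp parallel
{
    #pragma omp single
    fib_result = parallel_fib(40);  // tasks created here run on all threads of the team
}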

Task-Based Parallelism (3/3)

Task Priority

void priorityTasks() {
    #pragma omp parallel
    {
        #pragma omp single
        {
            #pragma omp task priority(0)
            lowPriorityTask();

            #pragma omp task priority(100)
            highPriorityTask();
        }
    }
}
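
Note that the priority clause is only a hint: it takes effect only when the runtime's maximum task priority is raised, typically via the environment, for example:

set OMP_MAX_TASK_PRIORITY=100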

11. Performance Optimization Techniques

Loop Optimization (1/3)

Loop Scheduling

void demonstrateScheduling() {
    const int SIZE = 1000000;

    // Static scheduling
    #pragma omp parallel for schedule(static)
    for(int i = 0; i < SIZE; i++)
        work_static(i);

    // Dynamic scheduling
    #pragma omp parallel for schedule(dynamic, 1000)
    for(int i = 0; i < SIZE; i++)
        work_dynamic(i);

    // Guided scheduling
    #pragma omp parallel for schedule(guided)
    for(int i = 0; i < SIZE; i++)
        work_guided(i);
}

Loop Optimization (2/3)

Loop Collapse

void matrixOperations() {
    const int N = 1000;
    // Heap allocation: a 1000x1000 double array (~8 MB) would overflow the stack
    std::vector<std::vector<double>> matrix(N, std::vector<double>(N));

    // Without collapse
    #pragma omp parallel for
    for(int i = 0; i < N; i++)
        for(int j = 0; j < N; j++)
            matrix[i][j] = compute(i, j);

    // With collapse
    #pragma omp parallel for collapse(2)
    for(int i = 0; i < N; i++)
        for(int j = 0; j < N; j++)
            matrix[i][j] = compute(i, j);
}

Loop Optimization (3/3)

SIMD Directives

void simdExample() {
    const int N = 1000000;
    // Heap storage: three million-element float arrays would overflow the stack
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N);

    #pragma omp parallel for simd
    for(int i = 0; i < N; i++) {
        c[i] = a[i] * b[i];
    }
}

12. Common Parallel Programming Patterns

Pipeline Pattern (1/2)

struct Data {
    // ... data members
};

// Conceptual sketch: std::queue is not thread-safe, so a real
// implementation needs a synchronized queue and a termination signal
void pipelinePattern() {
    std::queue<Data> queue1, queue2;

    #pragma omp parallel sections
    {
        #pragma omp section // Stage 1: read
        {
            while(hasInput()) {
                Data d = readInput();
                queue1.push(d);
            }
        }

        #pragma omp section // Stage 2: process
        {
            while(true) {
                Data d = queue1.front(); queue1.pop();
                process(d);
                queue2.push(d);
            }
        }

        #pragma omp section // Stage 3: write
        {
            while(true) {
                Data d = queue2.front(); queue2.pop();
                writeOutput(d);
            }
        }
    }
}

Pipeline Pattern (2/2)

Benefits:

  • Improved throughput
  • Better resource utilization
  • Natural for streaming data

Challenges:

  • Load balancing
  • Queue management
  • Termination conditions (see the thread-safe queue sketch below)
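
One way to tackle the queue-management and termination challenges is a small synchronized queue; a minimal sketch (C++17, using a mutex and condition variable; close() tells downstream stages that no more items will arrive):

#include <queue>
#include <mutex>
#include <condition_variable>
#include <optional>

template <typename T>
class SafeQueue {
    std::queue<T> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
public:
    void push(T value) {
        { std::lock_guard<std::mutex> lock(m); q.push(std::move(value)); }
        cv.notify_one();
    }
    void close() {                 // producer calls this when finished
        { std::lock_guard<std::mutex> lock(m); done = true; }
        cv.notify_all();
    }
    std::optional<T> pop() {       // empty optional => closed and drained
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&]{ return !q.empty() || done; });
        if (q.empty()) return std::nullopt;
        T value = std::move(q.front());
        q.pop();
        return value;
    }
};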


13. Debugging Parallel Programs

Common Issues (1/2)

  1. Race Conditions
    // Bad code
    int counter = 0;
    #pragma omp parallel for
    for(int i = 0; i < N; i++)
        counter++; // Race condition!
    
    // Fixed code
    int counter = 0;
    #pragma omp parallel for reduction(+:counter)
    for(int i = 0; i < N; i++)
        counter++;
    

Common Issues (2/2)

  1. Deadlocks
    // Potential deadlock
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            #pragma omp critical(A)
            {
                #pragma omp critical(B)
                { /* ... */ }
            }
        }
    
        #pragma omp section
        {
            #pragma omp critical(B)
            {
                #pragma omp critical(A)
                { /* ... */ }
            }
        }
    }
    

14. Real-World Applications

Image Processing Example

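Note: this sketch blurs the image in place, so later pixels read neighbors that were already blurred; a faithful implementation would write the result to a separate output buffer.
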
void parallelImageProcessing(unsigned char* image, 
                           int width, int height) {
    #pragma omp parallel for collapse(2)
    for(int y = 0; y < height; y++) {
        for(int x = 0; x < width; x++) {
            int idx = (y * width + x) * 3;

            // Apply Gaussian blur
            float sum_r = 0, sum_g = 0, sum_b = 0;
            float weight_sum = 0;

            for(int ky = -2; ky <= 2; ky++) {
                for(int kx = -2; kx <= 2; kx++) {
                    int ny = y + ky;
                    int nx = x + kx;

                    if(ny >= 0 && ny < height && 
                       nx >= 0 && nx < width) {
                        float weight = gaussian(kx, ky);
                        int nidx = (ny * width + nx) * 3;

                        sum_r += image[nidx + 0] * weight;
                        sum_g += image[nidx + 1] * weight;
                        sum_b += image[nidx + 2] * weight;
                        weight_sum += weight;
                    }
                }
            }

            // Store result
            image[idx + 0] = sum_r / weight_sum;
            image[idx + 1] = sum_g / weight_sum;
            image[idx + 2] = sum_b / weight_sum;
        }
    }
}

Monte Carlo Simulation

double parallelMonteCarlo(int iterations) {
    long inside_circle = 0;

    #pragma omp parallel reduction(+:inside_circle)
    {
        unsigned int seed = omp_get_thread_num();

        #pragma omp for
        for(int i = 0; i < iterations; i++) {
            double x = (double)rand_r(&seed) / RAND_MAX;
            double y = (double)rand_r(&seed) / RAND_MAX;

            if(x*x + y*y <= 1.0)
                inside_circle++;
        }
    }

    return 4.0 * inside_circle / iterations;
}
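
rand_r is a POSIX call and is not available with MSVC; a portable variant can give each thread its own generator from <random> (a sketch, with an arbitrary seed scheme):

#include <omp.h>
#include <random>

double parallelMonteCarloPortable(int iterations) {
    long long inside_circle = 0;

    #pragma omp parallel reduction(+:inside_circle)
    {
        // One independent generator per thread, seeded with the thread id
        std::mt19937 gen(12345u + omp_get_thread_num());
        std::uniform_real_distribution<double> dist(0.0, 1.0);

        #pragma omp for
        for (int i = 0; i < iterations; i++) {
            double x = dist(gen);
            double y = dist(gen);
            if (x * x + y * y <= 1.0)
                inside_circle++;
        }
    }

    return 4.0 * inside_circle / iterations;
}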

15. Advanced Workshop

Project: Parallel Sort Implementation

  1. Sequential Quicksort
  2. Parallel Quicksort
  3. Performance Comparison
  4. Visualization Tools

Workshop Tasks (1/3)

// Sequential Quicksort
void quicksort(int* arr, int left, int right) {
    if(left < right) {
        int pivot = partition(arr, left, right);
        quicksort(arr, left, pivot - 1);
        quicksort(arr, pivot + 1, right);
    }
}

Workshop Tasks (2/3)

// Parallel Quicksort
// (THRESHOLD is a cutoff size, e.g. 10000 elements, below which the
//  sequential version is used to avoid task overhead)
void parallel_quicksort(int* arr, int left, int right) {
    if(left < right) {
        if(right - left < THRESHOLD) {
            quicksort(arr, left, right);
            return;
        }

        int pivot = partition(arr, left, right);

        #pragma omp task
        parallel_quicksort(arr, left, pivot - 1);

        #pragma omp task
        parallel_quicksort(arr, pivot + 1, right);

        #pragma omp taskwait
    }
}

Workshop Tasks (3/3)

Performance Analysis Tools:

void analyzePerformance() {
    const int SIZES[] = {1000, 10000, 100000, 1000000};
    const int THREADS[] = {1, 2, 4, 8, 16};

    for(int size : SIZES) {
        for(int threads : THREADS) {
            omp_set_num_threads(threads);

            // Run and measure
            auto arr = generateRandomArray(size);
            Timer t;

            #pragma omp parallel
            {
                #pragma omp single
                parallel_quicksort(arr.data(), 0, size-1);
            }

            double time = t.elapsed();
            printf("Size: %d, Threads: %d, Time: %f\n",
                   size, threads, time);
        }
    }
}

Cross-Platform Development Environment (1/5)

Project Template

Download or clone the template project:

git clone https://github.com/ucoruh/cpp-openmp-template
# or create manually:
mkdir parallel-programming
cd parallel-programming

Create this structure:

parallel-programming/
├── CMakeLists.txt
├── src/
│   ├── main.cpp
│   └── include/
│       └── config.h
├── build/
│   ├── windows/
│   └── linux/
└── scripts/
    ├── build-windows.bat
    └── build-linux.sh

Cross-Platform Development Environment (2/5)

CMakeLists.txt

cmake_minimum_required(VERSION 3.15)
project(parallel-programming)

# C++17 standard
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

# Find OpenMP
find_package(OpenMP)
if(OpenMP_CXX_FOUND)
    message(STATUS "OpenMP found")
else()
    message(FATAL_ERROR "OpenMP not found")
endif()

# Add executable
add_executable(${PROJECT_NAME} 
    src/main.cpp
)

# Include directories
target_include_directories(${PROJECT_NAME}
    PRIVATE
        ${CMAKE_CURRENT_SOURCE_DIR}/src/include
)

# Link OpenMP
target_link_libraries(${PROJECT_NAME}
    PRIVATE
        OpenMP::OpenMP_CXX
)

Cross-Platform Development Environment (3/5)

Build Scripts

build-windows.bat:

@echo off
setlocal

:: Create build directory
mkdir build\windows 2>nul
cd build\windows

:: CMake configuration
cmake -G "Visual Studio 17 2022" -A x64 ..\..

:: Debug build
cmake --build . --config Debug

:: Release build
cmake --build . --config Release

cd ..\..

echo Build completed!
pause

build-linux.sh:

#!/bin/bash

# Create build directory
mkdir -p build/linux
cd build/linux

# CMake configuration
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../..

# Build
ninja

cd ../..

echo "Build completed!"

Cross-Platform Development Environment (4/5)

Platform-Independent Code

config.h:

#pragma once

// Platform check
#if defined(_WIN32)
    #define PLATFORM_WINDOWS
#elif defined(__linux__)
    #define PLATFORM_LINUX
#else
    #error "Unsupported platform"
#endif

// OpenMP check
#ifdef _OPENMP
    #define HAVE_OPENMP
#endif

main.cpp:

#include <iostream>
#include <vector>
#include <omp.h>
#include "config.h"

int main() {
    // OpenMP version check
    #ifdef _OPENMP
        std::cout << "OpenMP Version: " 
                  << _OPENMP << std::endl;
    #else
        std::cout << "OpenMP not supported" << std::endl;
        return 1;
    #endif

    // Set thread count
    omp_set_num_threads(4);

    // Parallel region
    #pragma omp parallel
    {
        int thread_id = omp_get_thread_num();
        int total_threads = omp_get_num_threads();

        #pragma omp critical
        {
            std::cout << "Thread " << thread_id 
                      << " of " << total_threads 
                      << std::endl;
        }
    }

    return 0;
}

Cross-Platform Development Environment (5/5)

Common Issues and Solutions

  1. CMake OpenMP Issues
    • Windows: Reinstall Visual Studio
    • Linux: sudo apt install libomp-dev

  2. WSL Connection Issues

    wsl --shutdown
    wsl --update

  3. Build Errors
    • Delete build directory
    • Delete CMakeCache.txt
    • Rebuild project

  4. VS2022 WSL Target Missing
    • Run VS2022 as administrator
    • Install Linux Development workload
    • Restart WSL

Additional Resources

For questions and help:

  • GitHub Issues
  • Email
  • Office hours

[ End-Of-Week-1 ]