Archive for the ‘scripts’ Category

A small OpenMP practice

Monday, October 21st, 2013

I had to parallelize a piece of code. The requirements were, single node with multiple cores, RAM might be limited based on data sets. OpenMP seems a good fit for my purpose, here is my solution:

	#pragma omp parallel num_threads(MAX_THREADS_NUMBER)
		int total_threads = omp_get_num_threads();
		int ID = omp_get_thread_num();
		int block = (int) N / (double) total_threads;
		int start = ID * block;
		int end = (ID + 1) * block; 
		if (ID == (total_threads - 1)) {end = N;}

		VpTree* tree = new VpTree(X);

		for(int n = start; n< end; n++) {
		        search(n, tree);
		delete treeC;

At beginning, the start and end for the loop segment are manually defined. Well, this is not necessary in some cases since one could use

	#pragma omp parallel for
	for(int n = 0; n< N; n++) {
	        search(n, tree);

But the problem was that, in “search()” there are a bit messy/complicated/un-neat reading of the pointer tree. So I always ended up “Segmentation fault”. That’s why I was duplicating the pointer “tree” for each thread. The max number of threads allowed is pre-defined in order to comply with RAM issues. There might be more elegant solution, but I’m easy to satisfy.

Last but not least, I benefited a lot from the presentation of Tim Mattson and Larry Meadows from Intel: A “Hands-on” Introduction to OpenMP.

Interview series II: sum of two elements

Wednesday, June 29th, 2011

I was asked such an interview question.

Given an unsorted array of positive integers, and also a target value K, find two elements that sum up to K.

So I suggested, sort the array first, and then start from the last (maximum) element that is smaller than K, x_max and then use binary search to find K-x_max. Well the complexity for sorting is O(n log n) and for finding is another O(n log n). Any improvement if the array is sorted? I stuck there, 5 seconds later, he tipped me with something like how about using two index pointers. WOW, great hint. Now the complexity for this step is O(n). Below is one solution.

#include <iostream>

using namespace std;

void quickSort(int *arr, int left, int right) {
	if(left>=right) {return;}
	int Start=left;
	int End = right;
	int pivot=(left+right)/2;
	int pivotV=*(arr+pivot);
	int tmp;

	while(left<right) {
		if(*(arr+left) <= pivotV) {
		if(*(arr+right)>=pivotV) {
		if(*(arr+left)>pivotV && *(arr+right)<pivotV) {
	quickSort(arr, Start, left-1);
	quickSort(arr, right+1, End);

void ksum(int *arr, int len, int k) {
	int left, right, ktmp;
	while(left<right) {
		if(ktmp==k) {
			cout << *(arr+left) << " + " << *(arr+right) << " = " << k << endl;
		} else if(ktmp>k) {right--;} else {left++;}
	cout << "Couldn't find the two elements!\n";

int main() {
	int a[10]={10,3,4,9,8,7,2,1,5,6};
	quickSort(a, 0, 9);
	ksum(a, 10, 12);
	return 0;