Archive for June, 2011

A new machine learning challenge on Kaggle

Wednesday, June 29th, 2011

This time it is Wikipedia’s participation challenge, which is to predict how many edits an editor will make in 5 months. It is very intriguing…

Someday I should set up a challenge myself: predict how often I update my blog! Given that bloggers tend to be older than Facebook and Twitter users, and older people tend to do things more slowly, a hint for an excellent model is that it should incorporate that too!

Interview series II: sum of two elements

Wednesday, June 29th, 2011

I was asked the following interview question.

Given an unsorted array of positive integers, and also a target value K, find two elements that sum up to K.

So I suggested sorting the array first, then starting from x_max, the last (largest) element that is smaller than K, and using binary search to find K-x_max. The complexity of sorting is O(n log n), and repeating the binary search for each candidate element costs another O(n log n). Any improvement once the array is sorted? I got stuck there; 5 seconds later, he tipped me off with something like “how about using two index pointers?” WOW, great hint. Now the complexity of this step is O(n). Below is one solution.

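#include <iostream>

using namespace std;

// Sort arr[left..right] in place, partitioning around the middle value.
void quickSort(int *arr, int left, int right) {
	if(left>=right) {return;}
	int Start=left;
	int End = right;
	int pivot=(left+right)/2;
	int pivotV=*(arr+pivot);
	int tmp;

	while(left<=right) {
		// Skip elements that are already on the correct side of the pivot value.
		while(*(arr+left) < pivotV) {left++;}
		while(*(arr+right) > pivotV) {right--;}
		if(left<=right) {
			// Swap the misplaced pair and move both indices inward.
			tmp=*(arr+left);
			*(arr+left)=*(arr+right);
			*(arr+right)=tmp;
			left++;
			right--;
		}
	}
	quickSort(arr, Start, right);
	quickSort(arr, left, End);
}

// arr must be sorted in ascending order; prints the first pair that sums to k.
void ksum(int *arr, int len, int k) {
	int left=0, right=len-1, ktmp;
	while(left<right) {
		ktmp=*(arr+left)+*(arr+right);
		if(ktmp==k) {
			cout << *(arr+left) << " + " << *(arr+right) << " = " << k << endl;
			return;
		} else if(ktmp>k) {right--;} else {left++;}
	}
	cout << "Couldn't find the two elements!\n";
}

int main() {
	int a[10]={10,3,4,9,8,7,2,1,5,6};
	quickSort(a, 0, 9);
	ksum(a, 10, 12);
	return 0;
}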
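
To compare the two ideas from the interview side by side, here is a rough sketch that uses std::sort instead of a hand-written quicksort. The function names findPairBinarySearch and findPairTwoPointers are made up for this illustration; the first follows my original suggestion (binary search for K-x for each x), the second follows the interviewer's two-pointer hint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: sort, then for each x binary-search for k - x.
// Sorting is O(n log n); the n binary searches add another O(n log n).
bool findPairBinarySearch(std::vector<int> v, int k, std::pair<int, int>& out) {
    std::sort(v.begin(), v.end());
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        // Search only to the right of i so one element is never paired with itself.
        if (std::binary_search(v.begin() + i + 1, v.end(), k - v[i])) {
            out = std::make_pair(v[i], k - v[i]);
            return true;
        }
    }
    return false;
}

// Hypothetical helper: sort, then walk two indices toward each other.
// After sorting, the scan itself is O(n).
bool findPairTwoPointers(std::vector<int> v, int k, std::pair<int, int>& out) {
    std::sort(v.begin(), v.end());
    if (v.size() < 2) return false;
    std::size_t left = 0, right = v.size() - 1;
    while (left < right) {
        int sum = v[left] + v[right];
        if (sum == k) {
            out = std::make_pair(v[left], v[right]);
            return true;
        }
        if (sum > k) --right; else ++left;  // too large: shrink; too small: grow
    }
    return false;
}
```

Both functions take the vector by value, so the caller's array is left unsorted. Either one can be called with the array from main above and K = 12; on success, out holds two elements whose sum is K.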

install twitteR on Ubuntu

Thursday, June 23rd, 2011

Well, I have heard that Google is yesterday’s M$, Facebook is today’s Google, and Twitter is tomorrow’s Facebook. It is still a surprise to see that there is already an R package for Twitter. It is “twitteR”. The following is what worked for me to install it on Ubuntu.

twitteR can be downloaded here.
As stated in the documentation, it requires some libraries to be installed first.

1). Install libraries for RCurl:
From the Ubuntu menu, go to
System -> Administration -> Synaptic Package Manager
A quick search for “curl” lists all programs with curl in their name or description. Select “curl” alone, or together with the additional packages related to libcurl development.

2). Install RCurl
Download it, go to the directory where it is saved, and run:
R CMD INSTALL RCurl_1.6-6.tar.gz

3). Install twitteR
Start the R console, then install the following:


4). Run twitteR:


Job interview series I: Steps to answer a technical question

Tuesday, June 21st, 2011

I am officially on the job market and have found it to be a great market, with lots of opportunities as well as competition. :) This series is nothing but notes related to job interviews. I have two reference books, “Cracking the Coding Interview” and “Programming Interviews Exposed”.

1. Clarify the question with the interviewer. Knowing what is being asked is the foundation. This step can also be tricky: sometimes interviewers don’t know exactly what they are asking, so it can become a step of “Let’s define the question together.”
2. Design an algorithm. Think about how to solve the problem and come up with a proposed solution.
3. Present the idea/algorithm/solution, either in pseudo-code or with a whiteboard demonstration. If the proposal is wrong, there is no basis for going further, so discuss it with the interviewer.
4. Code it.
5. Test and fix bugs.
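
As a small illustration of step 5, here is a sketch of how one might test a solution to the sum-of-two-elements question from elsewhere in this archive: compare the fast version against a slow but obviously correct brute-force version on a handful of small inputs, including edge cases. The names hasPairSum, hasPairSumBrute and checksPass are made up for this example, and the test cases are my own picks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function under test: sort, then scan with two indices.
bool hasPairSum(std::vector<int> v, int k) {
    std::sort(v.begin(), v.end());
    if (v.size() < 2) return false;
    std::size_t left = 0, right = v.size() - 1;
    while (left < right) {
        int sum = v[left] + v[right];
        if (sum == k) return true;
        if (sum > k) --right; else ++left;
    }
    return false;
}

// Brute-force oracle: tries every pair of distinct positions, O(n^2).
bool hasPairSumBrute(const std::vector<int>& v, int k) {
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            if (v[i] + v[j] == k) return true;
    return false;
}

// Returns true when both versions agree on every case and every target 0..25.
bool checksPass() {
    const std::vector<std::vector<int>> cases = {
        {},                              // empty array
        {6},                             // single element must not pair with itself
        {6, 6},                          // duplicates may form a pair
        {1, 2, 3},                       // tiny array
        {10, 3, 4, 9, 8, 7, 2, 1, 5, 6}  // unsorted input
    };
    for (const auto& c : cases)
        for (int k = 0; k <= 25; ++k)
            if (hasPairSum(c, k) != hasPairSumBrute(c, k)) return false;
    return true;
}
```

Calling assert(checksPass()) from main is enough to run the comparison. The edge cases (empty array, a single element, duplicates) are the ones most likely to expose an off-by-one bug in the index handling.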

An interesting post about politics and software architecture

Sunday, June 12th, 2011

It is more than just fun to read.