{C++ etc.}


Monday, July 14, 2008

Scaling - Technology vs. Services

I just read a blog post by Dare Obasanjo which focused on the importance (or lack of it) of technology choice when dealing with rapid scaling. Here's an excerpt from it:

If someone tells you "technology X doesn't scale" without qualifying that statement, it often means the person either doesn't know what he is talking about or is trying to sell you something. Technologies don't scale, services do. Thinking you can just sprinkle a technology on your service and make it scale is the kind of thinking that led Blaine Cook (former architect at Twitter) to publish a presentation on Scaling Twitter which claimed their scaling problems were solved with their adoption of memcached. That was in 2007. In 2008, let's just say the Fail Whale begs to differ.

If a service doesn't scale it is more likely due to bad design than to technology choice. Remember that.

You can find the full entry here.

I have to disagree. Choosing the right technology can have a significant impact on handling ever-increasing data loads. Scaling one's service (as Dare has said) is simple if a proper design has been adopted at an early stage. But one has to take the cost factor into consideration. Hardware and bandwidth levy a recurring toll on the budget, and in most instances it's not feasible to scale by expanding your service horizontally.
Choosing the correct technology for a project is also a critical design decision, and it helps in handling a bigger load on one's current infrastructure. Protocol Buffers are better than XML when you consider bandwidth and processing overhead. Bigtable-style stores are supposed to be faster than relational databases in most common cases. Our company develops distributed systems which deal with massive amounts of data, and we optimize our processes by adopting proper data structures, lightweight messaging protocols, minimal logging etc. to achieve the best performance numbers before we even think about scaling our services.
In conclusion, I think it's important to adopt lighter, faster technologies when designing a system, as well as making it scalable at the service level.
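
To make the bandwidth point a little more concrete, here is a rough sketch comparing the size of one small integer field in a varint-based binary encoding (the style Protocol Buffers uses for integers) with the same value wrapped in an XML element. The function names and the "id" tag are made up for illustration, and the one-byte key assumes a field number below 16. It's a back-of-the-envelope comparison, not a benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode an unsigned value as a base-128 varint, least significant group first.
// Each output byte carries 7 payload bits; the high bit means "more bytes follow".
std::vector<std::uint8_t> EncodeVarint(std::uint64_t value)
{
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Bytes for one integer field: a single key byte (assumes field number < 16)
// followed by the varint-encoded value.
std::size_t BinaryFieldSize(std::uint64_t value)
{
    return 1 + EncodeVarint(value).size();
}

// Bytes for the same value as a hypothetical XML element such as <id>150</id>.
std::size_t XmlFieldSize(const std::string& tag, std::uint64_t value)
{
    const std::size_t openTag = tag.size() + 2;   // '<' tag '>'
    const std::size_t closeTag = tag.size() + 3;  // '<' '/' tag '>'
    return openTag + std::to_string(value).size() + closeTag;
}
```

For the value 150, BinaryFieldSize gives 3 bytes while XmlFieldSize("id", 150) gives 12, and the XML side still has to be parsed from text on arrival.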

Wednesday, July 09, 2008

Inheriting from a Template Class


A class can be derived from a template class simply as follows:

template <class T>
class A
{
protected:
    int i_A;
};

template <class T>
class B : public A<T>
{

};

But when accessing member variables of the base class, a compiler error is given if it's done in the following manner:

template <class T>
class B : public A<T>
{
    void Init(int i)
    {
        i_A = i; // error: i_A is not looked up in the dependent base A<T>
    }
};

The solution is to use "this->i_A" instead of "i_A", as required since the GCC 3.4 release.

template <class T>
class B : public A<T>
{
    void Init(int i)
    {
        this->i_A = i;
    }
};

The reason for this is as follows (taken from GCC 3.4.0 release notes):


In a template definition, unqualified names will no longer find members of a dependent base (as specified by [temp.dep]/3 in the C++ standard). For example,

template <typename T> struct B {
    int m;
    int n;
    int f ();
    int g ();
};
int n;
int g ();
template <typename T> struct C : B<T> {
    void h ()
    {
        m = 0; // error
        f (); // error
        n = 0; // ::n is modified
        g (); // ::g is called
    }
};

You must make the names dependent, e.g. by prefixing them with this->. Here is the corrected definition of C::h,

template <typename T> void C<T>::h ()
{
    this->m = 0;
    this->f ();
    this->n = 0;
    this->g ();
}

As an alternative solution (unfortunately not backwards compatible with GCC 3.3), you may use using declarations instead of this->:

template <typename T> struct C : B<T> {
    using B<T>::m;
    using B<T>::f;
    using B<T>::n;
    using B<T>::g;
    void h ()
    {
        m = 0;
        f ();
        n = 0;
        g ();
    }
};

Full release notes...
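
Putting the options from the release notes side by side, a minimal sketch might look like the following. Base, Derived and the member names are invented for illustration (they are not from the release notes), and the default member initializers assume a C++11 compiler.

```cpp
// Illustrative base template with members the derived class wants to reach.
template <class T>
class Base
{
protected:
    T value_ = T();
    int count_ = 0;
};

template <class T>
class Derived : public Base<T>
{
    // Using-declaration: count_ can now be written unqualified below.
    using Base<T>::count_;

public:
    void Init(const T& v)
    {
        this->value_ = v;  // this-> makes the name dependent
        ++count_;          // found through the using-declaration above
    }

    // Qualifying with the base class also makes the name dependent.
    T Get() const { return Base<T>::value_; }

    int InitCalls() const { return count_; }
};
```

All three spellings defer the lookup until the template is instantiated, which is when the compiler finally knows what Base<T> contains.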

Monday, July 07, 2008

Google Protocol Buffers Released


Google has released its protobuf API for data serialization/retrieval under an open source license. So far it appears to be pretty good and would help to avoid a lot of headaches over data persistence and managing countless numbers of object types. Performance values are yet to be determined...

protobuf home

Sunday, July 06, 2008

Portable Social Networks, The Building Blocks Of A Social Web - by Ben Ward


A great article about the future of social networking, where different sites would be able to interact with the help of protocols and standards which already exist today, and do so on mobile platforms as well.

Read full article at Digital Web Magazine

IE Antivirus : A pain in the %^$#


IE Antivirus is one of the most annoying pieces of malware I've ever come across in my life. It pops up warnings and opens pages in my browser whenever I double-click on a folder. I guess it's a kind of Explorer hijacker, but it's anybody's guess. After two antivirus programs and countless scans which failed to do anything about this little devil, I was getting really mad... Today I was finally able to get rid of it using a program called Malwarebytes' Anti-Malware. One scan and a restart was enough to do the trick... Removal procedure

Deriving from a C++/C# disposable class (via void Nish(char* szBlog))





There's a nice article by Nish on the mysterious world of using managed and unmanaged objects bound by derivation. He's given a clear and concise explanation on how to derive a C++/CLI class from a C# class which implements IDisposable and avoid the imminent garbage collection pitfalls.
Read more here...

Thursday, June 19, 2008

Bit by void* (from jaredpar's WebLog)

Recently I got bit by void* again because of another C++ quirk I didn't think through. I had a class which wrapped a void* which could be one of many different structs. The structs were POD and didn't have any shared functionality hence I didn't bother creating an inheritance hierarchy. Unfortunately I defined the structs like so

class C1 {
    struct S1 {
        int field1;
        float field2;
    };
    struct S2 {
        char field1;
    };
    ~C1() {
        delete m_pData; // frees the memory but runs no destructor
    }
    void* m_pData; // Can be S1,S2,etc ...
};

Unfortunately this appeared to work fine for quite some time. Then after a couple of days of bug fixes I ended up with a memory leak which I quickly tracked down to a leaked COM object. Although C1 was at fault I didn't suspect any changes to this class because after all it was working fine for some time and all I did was add a new field to one of the structs. If the structs were being successfully free'd before a new field shouldn't change anything.

The field I added was of type CComPtr which exposed a greater problem in my code. Even though I properly delete the pointer in C1::~C1() I wasn't running the destructor on the pointed at data and instead I was just freeing the memory. Until I added a field which had a non-trivial destructor this wasn't a problem (still a bug though).

Why did this happen? By deleting a void* and expecting a destructor to run what I'm really doing is asking C++ to behave polymorphically. C++ as a rule won't behave this way unless it is specifically asked to with inheritance and virtual. In the case of void*, it just won't. The fix is to actually implement an inheritance hierarchy which supports polymorphism.

It's just another rule that I need to remember when coding C++.

Deleting void* is dangerous, period.

Unfortunately C++ has too many of these rules and not enough enforcement.

Via jaredpar's WebLog
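
The fix described in the quoted post (an inheritance hierarchy with a virtual destructor) might look something like the sketch below. Payload and Resource are names I've made up for illustration; Resource is just a stand-in for a member with a non-trivial destructor such as the CComPtr in the post, and std::unique_ptr assumes a C++11 compiler.

```cpp
#include <memory>
#include <utility>

// Common base with a virtual destructor, so destroying through a Payload*
// runs the destructor of the actual derived type.
struct Payload
{
    virtual ~Payload() = default;
};

// Stand-in for a member with a non-trivial destructor.
// It counts how many times it has been released.
struct Resource
{
    explicit Resource(int* releases) : releases_(releases) {}
    ~Resource() { ++*releases_; }
    int* releases_;
};

struct S1 : Payload
{
    int field1 = 0;
    float field2 = 0.0f;
};

struct S2 : Payload
{
    explicit S2(int* releases) : resource(releases) {}
    char field1 = 0;
    Resource resource;  // destroyed correctly when S2 is destroyed
};

class C1
{
public:
    explicit C1(std::unique_ptr<Payload> data) : m_pData(std::move(data)) {}

private:
    std::unique_ptr<Payload> m_pData;  // typed owner, no manual delete on a void*
};
```

When a C1 holding an S2 goes out of scope, the S2 destructor runs and the Resource member is released, which is exactly what the void* version skipped.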

Thursday, April 17, 2008

atoi vs. Custom String to Int Conversion

I had two code reviews of my process and they turned up a plethora of bugs and unoptimized code. One of these was a function which is called very frequently and which contained three atoi calls. atoi is expensive, I know, but I didn't have an idea as to HOW expensive it was. Further, I thought there was no other way to achieve its results, until our team lead Mr. Prabath Fernando pointed out that it was possible to get the numeric value of an ASCII digit character by subtracting 48 (the ASCII code of '0') from it. This can be better understood by looking at an ASCII table.

Anyway, I wrote a macro to do the same thing as atoi and did a performance test.


Code:

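#include <cstdlib>
#include <ctime>
#include <iostream>

#define QSaToi(iLen, zString, iOut) {int j = 1; iOut = 0; \
for (int i = iLen - 1; i >= 0; --i) \
{ iOut += ((zString[i] - 48) * j); \
j = j*10;}}

int main(int argc, char** argv)
{
    // NOTE: the loop count did not survive in this listing; this value is a placeholder.
    const int iIterations = 1000000;
    const char* zTest = "1234567";
    int iOut = 0;

    clock_t ct_start = clock();
    for (int i = 0; i < iIterations; ++i)
    {
        QSaToi(7, zTest, iOut);
    }
    std::cout << "Result from custom method = " << iOut << std::endl;
    std::cout << "Time for custom method = " << ((double)clock() - ct_start)/(CLOCKS_PER_SEC) << std::endl;

    ct_start = clock();
    for (int i = 0; i < iIterations; ++i)
    {
        iOut = atoi(zTest);
    }
    std::cout << "Result from atoi = " << iOut << std::endl;
    std::cout << "Time for atoi = " << ((double)clock() - ct_start)/(CLOCKS_PER_SEC) << std::endl;
    std::cin >> iOut; // keep the console window open
}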

Results:

Result from custom method = 1234567
Time for custom method = 0.063 us
Result from atoi = 1234567
Time for atoi = 0.422 us

Conclusion: the custom atoi macro is ~7 times faster than the standard atoi.

NOTE: this was tested only with the Microsoft Visual C++ 2008 compiler, so results might differ on Unix machines.
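
The same subtract-48 trick can also be written as an inline function rather than a macro. The version below is only a sketch (DigitsToInt is a name I've made up, and I haven't benchmarked it against the macro above); it walks the string left to right and rejects non-digit characters, but like the macro it ignores signs and overflow.

```cpp
#include <cstddef>

// Convert the first len characters of s to an int by subtracting '0' (ASCII 48)
// from each digit. Returns false and leaves out untouched if any character is
// not a decimal digit. No sign handling and no overflow checks.
inline bool DigitsToInt(const char* s, std::size_t len, int& out)
{
    int value = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const char c = s[i];
        if (c < '0' || c > '9') {
            return false;
        }
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}
```

Calling DigitsToInt("1234567", 7, iOut) would be the function equivalent of QSaToi(7, zTest, iOut), with the bonus that the arguments are evaluated only once.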

Sunday, March 30, 2008

How Bytes Add Up

I've been working on a process which handles and summarizes trade data from multiple markets. During development I didn't spend much time on memory calculations because the structures used were lightweight and the bulk of the data was persisted to flat files.
With a code review approaching fast, I guessed it was time to do some mathematics. The result...

Worst case scenario

Market Count = 20
Instrument Count = 1000 (per market)
Time = 8 hrs = 3600 * 8 s = 28,800 s
Entry Size = 16 B (one entry per instrument per second)

Total Memory = 28800 * 20 * 1000 * 16 B = 9,216,000,000 B = 9.2 GB


Yep... Even though this is the worst case, the figure was unacceptable. So I had to find a way to prove that, in reality, the memory consumption of my process would be less.

The secret lay in the number of trades which occur on a given day. For the 10 or so markets that we connect to, this would be around 10 million. I took this down to 5 million to be on the safe side and it changed the result drastically.

Trade Count = 5,000,000

Memory = (16 * 5,000,000) B = 80,000,000 B = 80 MB

For the worst case to actually be reached, the trade count would have to be:

Trade Count = 28800 * 20 * 1000 = 576,000,000 = 576 million

So I can safely say that my process would operate between 100 MB and 1 GB as long as the trade count remains below 50 million (50 million * 16 B = 800 MB)... Since it's unlikely that we will be receiving this many messages any time soon, the current architecture could be used for a long time to come.
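
The arithmetic above is easy to re-run for other trade counts with a couple of helper functions. This is just a sketch: the function names are mine, and the 16 bytes per entry is the figure used in the calculations above.

```cpp
#include <cstdint>

// Bytes held in memory per entry, as used in the estimates above.
const std::uint64_t kBytesPerEntry = 16;

// Worst case: one entry per instrument, per market, for every second of the day.
std::uint64_t WorstCaseBytes(std::uint64_t markets,
                             std::uint64_t instruments,
                             std::uint64_t seconds)
{
    return seconds * markets * instruments * kBytesPerEntry;
}

// Realistic case: memory grows with the number of trades actually received.
std::uint64_t BytesForTrades(std::uint64_t tradeCount)
{
    return tradeCount * kBytesPerEntry;
}
```

WorstCaseBytes(20, 1000, 28800) reproduces the 9,216,000,000 B figure, BytesForTrades(5000000) gives the 80 MB estimate, and BytesForTrades(50000000) gives 800 MB.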

If this fails we'll have to go for plan B, which is to move our indexes into files, and cope with having to do 2 file reads for each data segment retrieval...



Thursday, March 27, 2008

Importing a resource from one MFC project to another

God only knows how I hate MFC. When I first came across it I was already a sort of a veteran at C# and .NET and had quite a bit of experience developing front ends using these technologies. My first impression of MFC was "Maaaaaan!!! this feels AAAAAnnnnnnncient". Yep... But since it was the platform being used at our company I had to learn to live with it...

So, it happens that I'm developing a front end for a product (I'm mainly a back end developer :)) and it has to be customized for different projects. Recently, I made an enhancement in one project and now I have to do almost the exact same thing for another one. Importing the controls from the original project seemed like the obvious solution.


It was a surprise to see how easy it was to do this in VC6. The steps are:

* In VC6, pick "Open" from the File menu.
* At the very bottom of the open file dialog is a combo box labeled "Open as", which says "Auto" by default.
* Change that so it says "Resources" instead.
* Locate the resource file, select it, and hit Open.
* Drag and drop the required control from the main window to your resource view.

These steps were adequate for me to get the control integrated with all the control IDs properly assigned. But sometimes the IDs might get reset to some meaningless numeric values. In that case, those would have to be renamed manually by going to the "Properties" of each control.

(Thanks to Katy Mulvey who provided this excellent guide down at the devX forum. Link)