Sunday, July 17, 2011

Be careful with judgements and decisions if you don't know the full picture!

People often make judgements with only partial information. This is often a necessity, and this ability keeps the human race going, but sometimes it can make you look very dumb and in other circumstances it can have serious detrimental effects.

For example, you may have obtained insufficient information about the capabilities of a new product, or about your own needs, when buying a new car or house, or a new ERP system. If you are going to spend the money, you had better make sure you get the right thing. Strangely enough, people tend to spend proportionally less time analysing their needs and the specifics of a product for major purchases than they spend considering the purchase of a $5 product in the supermarket.

In many other cases the consequences are less obvious and affect the social situation more.

I usually buy my lunch at the same place and pick from a fixed variety of options. But I am served by different people. If someone new starts serving and happens to serve me the same thing two or three times in a row, they tend to think that I always order the same thing. They are really surprised when I order something different the next time. I just smile back.

The other day, when we had some issues with one of our systems and had problems identifying the cause, one of our colleagues listened to what was going on and suddenly said in a loud voice: "Well I am not going to say anything, but if someone would have bothered to look at this report you would get a clear indication of what happened!". In other words, we were dumb and he was smart.

Unfortunately for him, had he looked closer, he would have found that the information presented in the report was misleading. You wonder who the dumbo was. But he had a fair point that it is good practice to look at the report, even though in this case it did not help much.

Years ago, I was sitting next to a help desk that serviced external customers. On one of the calls, the customer was asked to put the floppy disk with the software in the computer. It did not help, and the conclusion was that the disk was corrupted. A new disk was sent over, to no avail. So an engineer went over and found that the first thing the customer did was punch two holes in the disk to file it in a folder. Pretty dumb, huh? We had a good laugh. But if nobody has ever explained to you how a disk works and how computers work in general, you might think that the disk is just a piece of cardboard on which the information was written with invisible ink. In those days many people did not have the slightest notion that a disk could contain information, or the slightest idea what software was.

Many of the help desk calls we receive relate to users not knowing how to use the system, or to business processes being insufficiently defined. Regardless of the cause, the user can't proceed with his work. This is frustrating and it becomes a technology problem. In one case that I dealt with, I had not informed myself sufficiently and advised the customer that this was an unfortunate limitation of the system. The customer was far from happy, as you can understand. After consulting an expert in my team, I found out that the cause was data related: another party had failed to enter the relevant data due to a lack of knowledge of the system and of how the data is used throughout it. I informed the customer of this, but the customer did not want to hear anything of it and did not care who had caused it, as long as I resolved it. Which we did, of course. But who had not properly informed himself or herself, and who was judging or deciding too quickly?

I think this applies to all three people involved, and I think all three of us had to sigh deeply. As it should be, we kept it all professional, and this is the required mechanism for dealing with frustrating situations. But it is easy to see how such a simple thing could have escalated, especially if you don't have the option of direct communication and immediate assistance from experts.

It is easy to say that you need to know all the facts. However, how do you know that you don't have all the facts? Unfortunately we can't live without a certain amount of uncertainty and this also creates many of the good things in life.

Friday, July 1, 2011

The health problems of computer systems are sometimes not much different from those of humans

Treating the health problems of computer systems is sometimes not much different from treating those of humans.

You have probably experienced a health problem yourself (or, if not, you will know someone who has) that didn't want to go away and for which the doctor couldn't find a cause. It could have been a serious problem or it could have been one of those little annoying things that seem to come and go.

Computer systems sometimes have the same problems.

Recently we upgraded our Oracle databases, and as a consequence one of the systems that had run for a long time without issues regularly stopped working. We knew that the problem was database related, but what was it? Analysing the log files did not help much. We also found hanging locks in the database, but how did they get there? The database itself did not reveal many of its secrets. We found one hint in some blogs relating to foreign key indexes and created a few of the missing indexes. We asked Oracle whether the specific behaviour could have been caused by the missing indexes, but we got no clear answer.

We suspected that it was something like the missing foreign key indexes. We had probably done something that was tolerated by the older releases of the database but not by the new one.
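To give an idea of what "missing foreign key indexes" means in practice, here is a minimal sketch in Python. It is not the check we actually ran. It assumes you have already extracted the foreign keys and indexes from the database into simple lists, and all table, constraint and index names below are made up for illustration. The assumed rule of thumb is that a foreign key counts as indexed when its columns are the leading columns of some index on the same table, in any order.

```python
from typing import List, NamedTuple, Tuple


class ForeignKey(NamedTuple):
    table: str                 # child table that holds the foreign key
    name: str                  # constraint name
    columns: Tuple[str, ...]   # foreign key columns


class Index(NamedTuple):
    table: str
    name: str
    columns: Tuple[str, ...]   # index columns, in index order


def is_covered(fk: ForeignKey, index: Index) -> bool:
    """True if the FK columns are the leading columns of the index (any order)."""
    n = len(fk.columns)
    return (
        index.table == fk.table
        and len(index.columns) >= n
        and set(index.columns[:n]) == set(fk.columns)
    )


def unindexed_foreign_keys(fks: List[ForeignKey], indexes: List[Index]) -> List[ForeignKey]:
    """Return the foreign keys that no index on the same table covers."""
    return [fk for fk in fks if not any(is_covered(fk, ix) for ix in indexes)]


# Hypothetical sample data, not taken from our real system.
fks = [
    ForeignKey("ORDER_LINES", "FK_LINE_ORDER", ("ORDER_ID",)),
    ForeignKey("ORDER_LINES", "FK_LINE_PRODUCT", ("PRODUCT_ID",)),
]
indexes = [
    Index("ORDER_LINES", "IX_LINE_ORDER", ("ORDER_ID", "LINE_NO")),
]

for fk in unindexed_foreign_keys(fks, indexes):
    print(f"{fk.table}.{fk.name} has no supporting index on {fk.columns}")
```

In this made-up example only FK_LINE_PRODUCT is reported, because ORDER_ID is the leading column of an existing index and PRODUCT_ID is not indexed at all.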

Going back to the old release is not really a good option, but it could become necessary if we didn't find a solution.

The difficulty with this type of problem is that it can depend on user behaviour. At a given moment, people need to do a certain series of tasks, and if a few people do certain things in parallel, the problem can occur. But once it has happened, it can be weeks or months before it happens again. This is not much different from finding out whether a certain health symptom is caused by a food allergy or not.

But at least then you deal with only one person. In our situation it is difficult to go back to all the users and ask them exactly what they did and at exactly what moment in time.

Our problem has not occurred for a little while now, and we just hope that it was caused by the missing foreign key indexes. Otherwise we can expect it to come back and bite us.