Tuesday, March 02, 2004

I had an interview with Dr. John Blondin on 3/01/04. Dr. Blondin conducts theoretical astrophysics research and has used a large number of diverse HPC systems. I explained the problem of expressing reliability preferences as a query on a set of service data.

Dr. Blondin first pointed out some shortcomings of my example service data set. Queue time is usually a function of how many processors you request, and the AvgCueTime would be more useful if some context information were given. A long queue time for someone else’s small job is not a good predictor of the queue time for your short job. He mentioned that his simulation applications are bottlenecked by network topology, and this is not well represented in my example service data set. Dr. Blondin pointed out that my service data set assumes that users behave in similar ways, and perhaps this is not a fair assumption. He believes that a service data set will become less useful as the machines sending data to the registry become less homogeneous.

In contrast to Dr. Edwards, Dr. Blondin stated that the past frequency of exceptions is a good predictor of a service's overall reliability and chance of future success. However, he also said that the length of a service’s history is not important. He would be willing to give a new service the “benefit of the doubt,” and he pointed out that the newest resources are often the fastest.

Dr. Blondin also recognized that any request for a service must balance different “goods.” He suggested that perhaps the user should have a control to assign weights to speed, reliability, bandwidth, and CPU intensity. He said he would envision some sliding bars whose settings would correspond to the construction of a utility function.
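
To make the sliding-bar idea concrete, here is a minimal sketch of one way slider settings could become a utility function. It is only an illustration: it assumes a linear weighted sum, it assumes each service already has a score between 0 and 1 for each "good" (higher is better), and the names utility, best_service, and the score keys are hypothetical rather than part of my service data set.

```python
from typing import Dict, Optional

# The four "goods" Dr. Blondin named. The key names are hypothetical.
GOODS = ("speed", "reliability", "bandwidth", "cpu")


def utility(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-good scores (assumed to be in [0, 1])."""
    total = sum(weights[g] for g in GOODS)
    if total <= 0:
        raise ValueError("at least one slider must be above zero")
    # Dividing by the total makes the result independent of slider scale.
    return sum(weights[g] * scores[g] for g in GOODS) / total


def best_service(services: Dict[str, Dict[str, float]],
                 weights: Dict[str, float]) -> Optional[str]:
    """Return the name of the service with the highest utility, if any."""
    if not services:
        return None
    return max(services, key=lambda name: utility(services[name], weights))


if __name__ == "__main__":
    # Slider positions: this user cares mostly about speed.
    sliders = {"speed": 3, "reliability": 1, "bandwidth": 0, "cpu": 0}
    registry = {
        "fast_but_flaky": {"speed": 0.9, "reliability": 0.3,
                           "bandwidth": 0.5, "cpu": 0.5},
        "slow_but_solid": {"speed": 0.4, "reliability": 0.9,
                           "bandwidth": 0.5, "cpu": 0.5},
    }
    print(best_service(registry, sliders))
```

A weighted sum is only the simplest possible choice here; it lets a very high score on one good fully compensate for a poor score on another, which may not match what a user actually wants.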

Dr. Blondin did suggest some possible service data queries. He believed the following query would be useful for most of his applications. He would choose the service with the best AvgRunTime. If there were several AvgRunTimes within 10% of each other, he would then pick the service with the fewest OutstandingRequests. He said he would also eliminate any service with an exception rate greater than or equal to 50%. He said, “a failure rate of 50% would scare the bejeezus out of me.”
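
The query he described can be sketched roughly as follows. This is one reading of his description, not a definitive implementation: it assumes that the "best" AvgRunTime is the lowest one, that "within 10% of each other" means within 10% of the best run time, and that the exception rate is stored as a fraction between 0 and 1. The ServiceRecord class and the exception_rate field name are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceRecord:
    name: str
    avg_run_time: float        # AvgRunTime; lower is assumed to be better
    outstanding_requests: int  # OutstandingRequests
    exception_rate: float      # hypothetical field: fraction of failed requests


def choose_service(services: List[ServiceRecord],
                   max_exception_rate: float = 0.5,
                   tolerance: float = 0.10) -> Optional[ServiceRecord]:
    # Step 1: eliminate any service failing 50% of the time or more.
    candidates = [s for s in services if s.exception_rate < max_exception_rate]
    if not candidates:
        return None

    # Step 2: find the best (lowest) average run time.
    best_time = min(s.avg_run_time for s in candidates)

    # Step 3: treat run times within 10% of the best as a tie.
    near_best = [s for s in candidates
                 if s.avg_run_time <= best_time * (1 + tolerance)]

    # Step 4: break the tie on fewest outstanding requests
    # (then on run time, so the result is deterministic).
    return min(near_best,
               key=lambda s: (s.outstanding_requests, s.avg_run_time))


if __name__ == "__main__":
    registry = [
        ServiceRecord("A", 100.0, 5, 0.10),
        ServiceRecord("B", 105.0, 2, 0.20),
        ServiceRecord("C", 90.0, 0, 0.50),   # eliminated: rate >= 50%
        ServiceRecord("D", 150.0, 0, 0.00),  # too slow to count as a tie
    ]
    chosen = choose_service(registry)
    print(chosen.name if chosen else "no acceptable service")
```

In this sample registry, C is dropped for its exception rate, A and B tie on run time, and B is chosen because it has fewer outstanding requests.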

