Distributed System (distributed + system)

Selected Abstracts


Out-of-Core and Dynamic Programming for Data Distribution on a Volume Visualization Cluster

COMPUTER GRAPHICS FORUM, Issue 1 2009
S. Frank
I.3.2 [Computer Graphics]: Distributed/network graphics; C.2.4 [Distributed Systems]: Distributed applications

Abstract Ray-directed volume-rendering algorithms are well suited for parallel implementation in a distributed cluster environment. For distributed ray casting, the scene must be partitioned between nodes for good load balancing, and a strict view-dependent priority order is required for image composition. In this paper, we define the load-balanced network distribution (LBND) problem and map it to the NP-complete precedence-constrained job-shop scheduling problem. We introduce a kd-tree solution and a dynamic programming solution. To process a massive data set, either a parallel or an out-of-core approach is required. Parallel preprocessing is performed by render nodes on data that are allocated using a static data structure. Volumetric data sets often contain a large portion of voxels that will never be rendered, i.e. empty space, and parallel preprocessing fails to take advantage of this. Our slab-projection slice, introduced in this paper, tracks empty space across consecutive slices of data to reduce the amount of data distributed and rendered. It is used to facilitate out-of-core bricking and kd-tree partitioning. Load balancing using each of our approaches is compared with traditional methods using several segmented regions of the Visible Korean data set. [source]
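
As a rough illustration of the kd-tree partitioning idea, the sketch below (Python, with illustrative names; it is not the paper's LBND or dynamic programming solution) recursively splits a voxel volume along its longest axis while trying to balance the number of non-empty voxels handed to each render node.

    # illustrative sketch only, not the paper's algorithm
    import numpy as np

    def kd_partition(volume, n_nodes):
        """Split a voxel block into n_nodes sub-blocks, always subdividing the
        block that currently holds the most non-empty voxels along its longest
        axis. Returns one (slice, slice, slice) tuple per render node."""
        blocks = [tuple(slice(0, s) for s in volume.shape)]
        while len(blocks) < n_nodes:
            blocks.sort(key=lambda b: int((volume[b] != 0).sum()), reverse=True)
            b = blocks.pop(0)                       # heaviest block
            axis = max(range(3), key=lambda a: b[a].stop - b[a].start)
            mid = (b[axis].start + b[axis].stop) // 2
            left, right = list(b), list(b)
            left[axis] = slice(b[axis].start, mid)
            right[axis] = slice(mid, b[axis].stop)
            blocks.extend([tuple(left), tuple(right)])
        return blocks

    # usage: 4 render nodes, synthetic volume with plenty of empty space
    vol = np.random.rand(64, 64, 64)
    vol[vol < 0.7] = 0.0
    for block in kd_partition(vol, 4):
        print(block)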


Towards an integrated GIS-based coastal forecast workflow

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 14 2008
Gabrielle Allen
Abstract The SURA Coastal Ocean Observing and Prediction (SCOOP) program is using geographical information system (GIS) technologies to visualize and integrate distributed data sources from across the United States and Canada. Hydrodynamic models are run at different sites on a developing multi-institutional computational Grid. Some of these predictive simulations of storm surge and wind waves are triggered by tropical and subtropical cyclones in the Atlantic and the Gulf of Mexico. Model predictions and observational data need to be merged and visualized in a geospatial context for a variety of analyses and applications. A data archive at LSU aggregates the model outputs from multiple sources, and a data-driven workflow triggers remotely performed conversion of a subset of model predictions to georeferenced data sets, which are then delivered to a Web Map Service located at Texas A&M University. Other nodes in the distributed system aggregate the observational data. This paper describes the use of GIS within the SCOOP program for the 2005 hurricane season, along with details of the data-driven distributed dataflow and workflow that result in geospatial products. We also discuss future plans for the complementary use of GIS and Grid technologies in the SCOOP program, through which we hope to provide a wider range of tools that enhance the capabilities of earth science research and hazard planning. Copyright © 2008 John Wiley & Sons, Ltd. [source]


An efficient concurrent implementation of a neural network algorithm

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 12 2006
R. Andonie
Abstract The focus of this study is how we can efficiently implement the neural network backpropagation algorithm on a network of computers (NOC) for concurrent execution. We assume a distributed system with heterogeneous computers and that the neural network is replicated on each computer. We propose an architecture model with efficient pattern allocation that takes into account the speed of processors and overlaps the communication with computation. The training pattern set is distributed among the heterogeneous processors with the mapping being fixed during the learning process. We provide a heuristic pattern allocation algorithm minimizing the execution time of backpropagation learning. The computations are overlapped with communications. Under the condition that each processor has to perform a task directly proportional to its speed, this allocation algorithm has polynomial-time complexity. We have implemented our model on a dedicated network of heterogeneous computers using Sejnowski's NetTalk benchmark for testing. Copyright © 2005 John Wiley & Sons, Ltd. [source]
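
The speed-proportional pattern allocation can be pictured with a small sketch, assuming the relative processor speeds are known in advance; it illustrates the idea only, not the authors' heuristic.

    def allocate_patterns(n_patterns, speeds):
        """Split a training set among heterogeneous processors so that each
        share is proportional to the processor's relative speed.
        Illustrative sketch, not the paper's allocation algorithm."""
        total = sum(speeds)
        shares = [int(n_patterns * s / total) for s in speeds]
        leftover = n_patterns - sum(shares)         # patterns lost to rounding
        for i in sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)[:leftover]:
            shares[i] += 1
        return shares

    print(allocate_patterns(20000, [3.0, 2.0, 1.0, 1.0]))   # -> [8572, 5714, 2857, 2857]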


Distributed loop-scheduling schemes for heterogeneous computer systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 7 2006
Anthony T. Chronopoulos
Abstract Distributed computing systems are a viable and less expensive alternative to parallel computers. However, a serious difficulty in concurrent programming of a distributed system is how to deal with scheduling and load balancing of such a system, which may consist of heterogeneous computers. Some distributed scheduling schemes suitable for parallel loops with independent iterations on heterogeneous computer clusters have been designed in the past. In this work we study self-scheduling schemes for parallel loops with independent iterations which have been applied to multiprocessor systems in the past. We extend one important scheme of this type to a distributed version suitable for heterogeneous distributed systems. We implement our new scheme on a network of computers and make performance comparisons with other existing schemes. Copyright © 2005 John Wiley & Sons, Ltd. [source]
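
A flavour of the weighted self-scheduling family discussed: each request takes a chunk that shrinks as the loop drains and scales with the requesting processor's relative speed. The decay factor and weighting below are illustrative assumptions, not the authors' extended scheme.

    def next_chunk(remaining, speed, total_speed, n_procs, factor=2.0):
        """Guided-self-scheduling-style chunk: a fraction of the remaining
        iterations, weighted by the requesting processor's relative speed.
        Illustrative sketch only."""
        base = max(1, int(remaining / (factor * n_procs)))
        return max(1, int(base * speed * n_procs / total_speed))

    remaining, speeds = 1000, [4.0, 2.0, 1.0, 1.0]
    total = sum(speeds)
    while remaining > 0:
        for p, s in enumerate(speeds):              # round-robin stands in for "first idle processor"
            if remaining == 0:
                break
            chunk = min(remaining, next_chunk(remaining, s, total, len(speeds)))
            remaining -= chunk
            print(f"processor {p} takes {chunk:4d} iterations, {remaining:4d} left")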


A static mapping heuristics to map parallel applications to heterogeneous computing systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 13 2005
Ranieri Baraglia
Abstract In order to minimize the execution time of a parallel application running on a heterogeneous distributed computing system, an appropriate mapping scheme is needed to allocate the application tasks to the processors. The general problem of mapping tasks to machines is a well-known NP-hard problem and several heuristics have been proposed to approximate its optimal solution. In this paper we propose a static graph-based mapping algorithm, called Heterogeneous Multi-phase Mapping (HMM), which permits suboptimal mapping of a parallel application onto a heterogeneous distributed computing system by using a local search technique together with a tabu search meta-heuristic. HMM allocates parallel tasks by exploiting the information embedded in the parallelism forms used to implement an application, and by considering an affinity parameter that identifies which machine in the heterogeneous computing system is most suitable to execute a task. We compare HMM with some leading techniques and with an exhaustive mapping algorithm. We also give an example of the mapping of two real applications using HMM. Experimental results show that HMM performs well, demonstrating the applicability of our approach. Copyright © 2005 John Wiley & Sons, Ltd. [source]
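
A compact sketch of local search with a tabu list for task-to-machine mapping, using estimated makespan on heterogeneous machines as the cost. The neighbourhood (single-task reassignments) and the cost model are simplified stand-ins, not HMM itself.

    # simplified stand-in for a tabu-search mapping heuristic, not HMM
    import random

    def makespan(mapping, cost):
        """cost[t][m] = execution time of task t on machine m."""
        loads = {}
        for t, m in enumerate(mapping):
            loads[m] = loads.get(m, 0.0) + cost[t][m]
        return max(loads.values())

    def tabu_map(cost, n_machines, iters=200, tenure=7):
        n_tasks = len(cost)
        mapping = [random.randrange(n_machines) for _ in range(n_tasks)]
        best, best_val = mapping[:], makespan(mapping, cost)
        tabu = {}                                   # (task, machine) -> iteration when the move is allowed again
        for it in range(iters):
            moves = [(t, m) for t in range(n_tasks) for m in range(n_machines)
                     if m != mapping[t] and tabu.get((t, m), -1) <= it]
            t, m = min(moves, key=lambda tm: makespan(
                mapping[:tm[0]] + [tm[1]] + mapping[tm[0] + 1:], cost))
            tabu[(t, mapping[t])] = it + tenure     # forbid moving the task straight back
            mapping[t] = m
            val = makespan(mapping, cost)
            if val < best_val:
                best, best_val = mapping[:], val
        return best, best_val

    random.seed(1)
    cost = [[random.uniform(1, 10) for _ in range(3)] for _ in range(12)]
    print(tabu_map(cost, 3))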


Full waveform seismic inversion using a distributed system of computers

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 11 2005
Indrajit G. Roy
Abstract The aim of seismic waveform inversion is to estimate the elastic properties of the Earth's subsurface layers from recordings of seismic waveform data. This is usually accomplished by using constrained optimization often based on very simplistic assumptions. Full waveform inversion uses a more accurate wave propagation model but is extremely difficult to use for routine analysis and interpretation. This is because computational difficulties arise due to: (1) strong nonlinearity of the inverse problem; (2) extreme ill-posedness; and (3) large dimensions of data and model spaces. We show that some of these difficulties can be overcome by using: (1) an improved forward problem solver and an efficient technique to generate the sensitivity matrix; (2) an iteration-adaptive regularized truncated Gauss–Newton technique; (3) an efficient technique for matrix–matrix and matrix–vector multiplication; and (4) a parallel programming implementation with a distributed system of processors. We use a message-passing interface in the parallel programming environment. We present inversion results for synthetic and field data, and a performance analysis of our parallel implementation. Copyright © 2005 John Wiley & Sons, Ltd. [source]
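
For reference, a damped Gauss–Newton iteration of the general form involved can be written as follows (Tikhonov-style damping shown; the paper's iteration-adaptive regularization and truncation differ in detail):

    m_{k+1} = m_k + \left( J_k^{\mathrm T} J_k + \lambda_k I \right)^{-1} J_k^{\mathrm T} \left( d - F(m_k) \right)

where F is the forward modelling operator, J_k its Jacobian (the sensitivity matrix) at the current model m_k, d the observed waveform data, and \lambda_k the regularization weight.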


End-to-end response time with fixed priority scheduling: trajectory approach versus holistic approach

INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 1 2005
Steven Martin
Abstract In this paper, we are interested in providing deterministic end-to-end guarantees to real-time flows in a distributed system. We focus on the end-to-end response time, a quality of service (QoS) parameter of the utmost importance for such flows. We assume that each node uses Fixed Priority scheduling. We determine a bound on the end-to-end response time of any real-time flow with a worst-case analysis using the trajectory approach. We establish new results that we compare with those provided by the classical holistic approach for flows visiting the same sequence of nodes. These results show that the trajectory approach is less pessimistic than the holistic one. Moreover, the bound provided by our worst-case analysis is reached in various configurations, as shown in the examples presented. Copyright © 2004 John Wiley & Sons, Ltd. [source]
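
For context, the per-node worst-case response time under fixed-priority scheduling is usually obtained from the classical recurrence sketched below; holistic analysis iterates it node by node along a flow's path (jitter propagation between nodes is omitted here, and the trajectory approach of the paper proceeds differently).

    def node_response_time(C, T, J, i, max_iter=1000):
        """Worst-case response time of flow i on one node under fixed priorities.
        C, T, J: execution times, periods and jitters of the flows crossing this
        node, indexed so that flows 0..i-1 have higher priority than flow i.
        Textbook recurrence, not the paper's trajectory analysis."""
        R = C[i]
        for _ in range(max_iter):
            interference = sum(-(-(R + J[j]) // T[j]) * C[j] for j in range(i))  # ceiling division
            R_new = C[i] + interference
            if R_new == R:
                return R
            R = R_new
        raise RuntimeError("recurrence did not converge (node overloaded?)")

    # a crude end-to-end bound sums the per-node response times along the path
    print(node_response_time(C=[1, 2, 3], T=[5, 10, 20], J=[0, 0, 0], i=2))   # -> 7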


Active network architecture and management

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 10 2007
Roy Ladner
Access and retrieval of meteorological and oceanographic data from heterogeneous sources in a distributed system presents many issues. A number of features of the TEDServices system illustrate active network management for such data. There is a self-aware or intelligent aspect with respect to the mechanisms for shutdown, data ordering, and propagation of data orders. Intelligent cache management and a collaborative application-sharing process are other features of the active network management. Additionally, a very important capability is the implementation of resumable object streams, which allows either the client or server side of a request to lose its network connection and regain it, with the request continuing where it left off. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 1123–1138, 2007. [source]
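
The resumable-stream idea can be pictured with a toy sketch: the receiver tracks a byte offset and, after a lost connection, simply asks the sender to continue from that offset. The function and its read_chunk callback are hypothetical illustrations of the concept, not the TEDServices interface.

    import os

    def fetch_resumable(read_chunk, out_path, chunk_size=64 * 1024):
        """read_chunk(offset, size) -> bytes (empty when the stream is exhausted).
        Appends to out_path, so an interrupted transfer resumes where it left off.
        Hypothetical illustration, not the TEDServices API."""
        offset = os.path.getsize(out_path) if os.path.exists(out_path) else 0
        with open(out_path, "ab") as f:
            while True:
                try:
                    data = read_chunk(offset, chunk_size)
                except ConnectionError:
                    continue                        # reconnect and retry from the same offset
                if not data:
                    return offset
                f.write(data)
                offset += len(data)

    # usage with an in-memory "server"
    blob = b"x" * 200_000
    print(fetch_resumable(lambda off, n: blob[off:off + n], "copy.bin"))   # -> 200000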


Preparation of nano-sized UV-absorbing titanium-oxo-clusters via a photo-curing ceramer process

POLYMERS FOR ADVANCED TECHNOLOGIES, Issue 2-3 2005
Mark D. Soucek
Abstract A titanium sol-gel precursor, titanium(IV) isopropoxide (TIP), was mixed with an epoxidized linseed oil (ELO). Using a cationic super-acid photoinitiator, triarylsulfonium hexafluoroantimonate, both the organic phase (ELO) and the inorganic phase (TIP) were concomitantly cured. The exposure to moisture was strictly controlled before and during the UV-curing process. The UV-visible spectra, SAXS (small-angle X-ray scattering), DMA (dynamic mechanical analysis), and contact angle were investigated as a function of sol-gel precursor. The UV-visible spectra revealed that the inorganic/organic hybrid materials were more effective at blocking UV light than nanoparticulate titanium dioxide while maintaining complete transparency in the visible region. The contact angle data indicated that the inorganic phase was preferentially concentrated at the film-surface interface. The SAXS data were indicative of a 2–5 nm titanium-oxo-cluster size, and the DMA data suggest a well-distributed system. Copyright © 2005 John Wiley & Sons, Ltd. [source]


A study of time-between-events control chart for the monitoring of regularly maintained systems

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 7 2009
Michael B. C. Khoo
Abstract Owing to usage, environment and aging, the condition of a system deteriorates over time. Regular maintenance is often conducted to restore its condition and to prevent failures from occurring. In this kind of situation, the process is considered to be stable, and thus statistical process control charts can be used to monitor the process. The monitoring can help in making a decision on whether further maintenance is worthwhile or whether the system has deteriorated to a state where regular maintenance is no longer effective. When modeling a deteriorating system, lifetime distributions with increasing failure rate are more appropriate. However, for a regularly maintained system, the failure time distribution can be approximated by the exponential distribution with an average failure rate that depends on the maintenance interval. In this paper, we adopt a modification of a time-between-events control chart, i.e. the exponential chart, for monitoring the failure process of a maintained Weibull-distributed system. We study the effect that changes in the scale parameter of the Weibull distribution, with the shape parameter held constant, have on the sensitivity of the exponential chart. This paper illustrates an approach to integrating maintenance decisions with statistical process monitoring methods. Copyright © 2008 John Wiley & Sons, Ltd. [source]
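
As background on the chart being modified, a standard exponential time-between-events chart with in-control mean theta0 and overall false-alarm rate alpha has the two-sided probability limits computed below (a textbook construction, not the paper's modified chart).

    import math

    def exponential_chart_limits(theta0, alpha=0.0027):
        """Probability limits for a standard exponential t-chart:
        P(T < LCL) = P(T > UCL) = alpha / 2 when the in-control mean time
        between events is theta0. Textbook construction, not the paper's chart."""
        lcl = -theta0 * math.log(1 - alpha / 2)
        ucl = -theta0 * math.log(alpha / 2)
        return lcl, ucl

    print(exponential_chart_limits(theta0=500.0))   # e.g. a mean of 500 hours between failures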


Fighting fire with fire: using randomized gossip to combat stochastic scalability limits

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 3 2002
Indranil Gupta
Abstract The mechanisms used to improve the reliability of distributed systems often limit performance and scalability. Focusing on one widely-used definition of reliability, we explore the origins of this phenomenon and conclude that it reflects a tradeoff arising deep within the typical protocol stack. Specifically, we suggest that protocol designs often disregard the high cost of infrequent events. When a distributed system is scaled, both the frequency and the overall cost of such events often grow with the size of the system. This triggers an O() phenomenon, which becomes visible above some threshold sizes. Our findings suggest that it would be more effective to construct large-scale reliable systems where, unlike traditional protocol stacks, lower layers use randomized mechanisms, with probabilistic guarantees, to overcome low-probability events. Reliability and other end-to-end properties are introduced closer to the application. We employ a back-of-the-envelope analysis to quantify this phenomenon for a class of strongly reliable multicast problems. We construct a non-traditional stack, as described above, that implements virtually synchronous multicast. Experimental results reveal that virtual synchrony over a non-traditional, probabilistic stack helps break through the scalability barrier faced by traditional implementations of the protocol. Copyright © 2002 John Wiley & Sons, Ltd. [source]
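
The randomized gossip mechanism alluded to can be pictured with a bare-bones push-gossip simulation: in each round every informed node forwards the message to a few random peers, so dissemination typically completes in a logarithmic number of rounds. The single-process simulation below is only an illustration, not the authors' protocol stack.

    import random

    def gossip_rounds(n_nodes, fanout=3, seed=0):
        """Simulate push gossip and return the number of rounds until every
        node has received the message (node 0 starts with it).
        Illustrative simulation, not the paper's protocol."""
        random.seed(seed)
        informed = {0}
        rounds = 0
        while len(informed) < n_nodes:
            targets = set()
            for _ in informed:
                targets.update(random.sample(range(n_nodes), fanout))
            informed |= targets
            rounds += 1
        return rounds

    print(gossip_rounds(10_000))   # typically on the order of log(n) rounds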


Model error and sequential data assimilation: A deterministic formulation

THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 634 2008
A. Carrassi
Abstract Data assimilation schemes are confronted with the presence of model errors arising from the imperfect description of atmospheric dynamics. These errors are usually modelled on the basis of simple assumptions such as bias, white noise, and first-order Markov process. In the present work, a formulation of the sequential extended Kalman filter is proposed, based on recent findings on the universal deterministic behaviour of model errors, in marked contrast with previous approaches. This new scheme is applied in the context of a spatially distributed system proposed by Lorenz. First, it is found that, for short times, the estimation error is accurately approximated by an evolution law in which the variance of the model error (assumed to be a deterministic process) evolves according to a quadratic law, in agreement with the theory. Moreover, the correlation with the initial condition error appears to play a secondary role in the short-time dynamics of the estimation error covariance. Second, the deterministic description of the model error evolution, incorporated into the classical extended Kalman filter equations, reveals that substantial improvements of the filter accuracy can be gained compared with the classical white-noise assumption. The universal short-time quadratic law for the evolution of the model error covariance matrix seems very promising for modelling estimation error dynamics in sequential data assimilation. Copyright © 2008 Royal Meteorological Society [source]
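
The short-time quadratic law referred to above can be stated compactly: if the model error is treated as a deterministic drift \delta\mu rather than as white noise, the error it induces grows linearly in time and its covariance contribution quadratically (notation adapted for this summary):

    \delta x^{m}(\tau) \approx \delta\mu \, \tau, \qquad P^{m}(\tau) \approx \langle \delta\mu \, \delta\mu^{\mathrm T} \rangle \, \tau^{2}.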


A test framework for CORBA component model-based software systems

BELL LABS TECHNICAL JOURNAL, Issue 3 2003
Harold J. Batteram
In this paper we present a framework for testing software systems that is based on the Common Object Request Broker Architecture (CORBA) component model (CCM) standard. An important aspect of CCM-based systems is that they must be verifiable and testable at the abstract level of their design, regardless of the language chosen to implement the components. Component-based systems allow the development and testing of components to be divided among development groups working in parallel. However, dependencies between separately developed components may cause delays in testing. The test framework we present allows for the automatic generation, based on their external specification, of reactor components that testers can use as substitutes for components that their own components depend on but that have not yet been developed. The generated test components can respond to an invocation interactively or automatically by means of a test script. The framework can also visualize interactions between components as they flow through a distributed system, and can compare runtime interactions with design specifications. The approach to testing that we describe was first explored in the distributed software component (DSC) framework developed as part of the FRIENDS project, and has been used successfully in the WINMAN European research project, which deals with network management applications. The test framework has now been extended and adapted for the CCM architecture. It is currently implemented as part of the COACH research project, which is sponsored by the European Commission. © 2003 Lucent Technologies Inc. [source]


A comparative study of awareness methods for peer-to-peer distributed virtual environments

COMPUTER ANIMATION AND VIRTUAL WORLDS (PREV: JNL OF VISUALISATION & COMPUTER ANIMATION), Issue 5 2008
S. Rueda
Abstract The increasing popularity of multi-player online games is leading to the widespread use of large-scale Distributed Virtual Environments (DVEs). In these systems, peer-to-peer (P2P) architectures have been proposed as an efficient and scalable solution for supporting massively multi-player applications. However, the main challenge for P2P architectures consists of providing each avatar with updated information about which other avatars are its neighbors. This problem is known as the awareness problem. In this paper, we propose a comparative study of the performance provided by those awareness methods that are supposed to fully solve the awareness problem. This study is performed using well-known performance metrics in distributed systems. Moreover, while the evaluations shown in the literature are performed by executing P2P simulations on a single (sequential) computer, this paper evaluates the performance of the considered methods on actual distributed systems. The evaluation results show that only a single method actually provides full awareness to avatars. This method also provides the best performance results. Copyright © 2008 John Wiley & Sons, Ltd. [source]


Adaptive structured parallelism for distributed heterogeneous architectures: a methodological approach with pipelines and farms

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 15 2010
Horacio González-Vélez
Abstract Algorithmic skeletons abstract commonly used patterns of parallel computation, communication, and interaction. Based on the algorithmic skeleton concept, structured parallelism provides a high-level parallel programming technique that allows the conceptual description of parallel programs while fostering platform independence and algorithm abstraction. This work presents a methodology to improve skeletal parallel programming in heterogeneous distributed systems by introducing adaptivity through resource awareness. As we hypothesise that a skeletal program should be able to adapt to the dynamic resource conditions over time using its structural forecasting information, we have developed adaptive structured parallelism (ASPARA). ASPARA is a generic methodology to incorporate structural information into a parallel program at compilation time, which will help it to adapt at execution time. ASPARA comprises four phases: programming, compilation, calibration, and execution. We illustrate the feasibility of this approach and its associated performance improvements using independent case studies based on two algorithmic skeletons, the task farm and the pipeline, evaluated in a non-dedicated heterogeneous multi-cluster system. Copyright © 2010 John Wiley & Sons, Ltd. [source]
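
Of the two skeletons named, the task farm is the simpler: a farmer hands independent tasks to a pool of workers and collects the results. A minimal single-machine sketch follows (plain multiprocessing; ASPARA's calibration and run-time adaptation are not modelled).

    from multiprocessing import Pool

    def worker(task):
        # stand-in for the task function supplied by the application
        return task * task

    def task_farm(tasks, n_workers=4):
        """Farm the independent tasks out to a worker pool and gather results.
        Minimal sketch, not the ASPARA methodology."""
        with Pool(n_workers) as pool:
            return pool.map(worker, tasks)

    if __name__ == "__main__":
        print(task_farm(range(10)))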


A formalized approach for designing a P2P-based dynamic load balancing scheme

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 10 2010
Hengheng Xie
Abstract Quality of service (QoS) is attracting more and more attention in many areas, including entertainment, emergency services, transaction services, and so on. Therefore, the study of QoS-aware systems is becoming an important research topic in the area of distributed systems. In terms of load balancing, most of the existing QoS-related load balancing algorithms focus on routing mechanisms and traffic engineering. However, research on QoS-aware task scheduling and service migration is very limited. In this paper, we propose a task scheduling algorithm using dynamic QoS properties, and we develop a Genetic Algorithm-based service migration scheme aiming to optimize the performance of our proposed QoS-aware distributed service-based system. To verify the efficiency of our scheme, we implement a prototype of our algorithm using the P2P-based JXTA technique, and carry out emulation and simulation tests to analyze the proposed solution. We compare our service-migration-based algorithm with non-migration and non-load-balancing approaches, and find that our solution is much better than the other two in terms of QoS success rate. Furthermore, to provide more solid support for our research, we use DEVS to validate our system design. Copyright © 2010 John Wiley & Sons, Ltd. [source]


High-level distribution for the rapid production of robust telecoms software: comparing C++ and ERLANG

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 8 2008
J. H. Nyström
Abstract Currently most distributed telecoms software is engineered using low- and mid-level distributed technologies, but there is a drive to use high-level distribution. This paper reports the first systematic comparison of a high-level distributed programming language in the context of substantial commercial products. Our research strategy is to reengineer some C++/CORBA telecoms applications in ERLANG, a high-level distributed language, and make comparative measurements. Investigating the potential advantages of the high-level ERLANG technology shows that two significant benefits are realized. Firstly, robust configurable systems are easily developed using the high-level constructs for fault tolerance and distribution. The ERLANG code exhibits resilience: sustaining throughput at extreme loads and automatically recovering when load drops; availability: remaining available despite repeated and multiple failures; and dynamic reconfigurability: throughput scaling near-linearly when resources are added or removed. Secondly, ERLANG delivers significant productivity and maintainability benefits: the ERLANG components are less than one-third of the size of their C++ counterparts. The productivity gains are attributed to specific language features; for example, high-level communication saves 22% and automatic memory management saves 11%, compared with the C++ implementation. Investigating the feasibility of the high-level ERLANG technology demonstrates that it fulfils several essential requirements. The requisite distributed functionality is readily specified, even though control of low-level distributed coordination aspects is abrogated to the ERLANG implementation. At the expense of additional memory residency, excellent time performance is achieved, e.g. three times faster than the C++ implementation, due to ERLANG's lightweight processes. ERLANG interoperates at low cost with conventional technologies, allowing incremental reengineering of large distributed systems. The technology is available on the required hardware/operating system platforms and is well supported. Copyright © 2007 John Wiley & Sons, Ltd. [source]


Performance and effectiveness trade-off for checkpointing in fault-tolerant distributed systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 1 2007
Panagiotis Katsaros
Abstract Checkpointing has a crucial impact on systems' performance and fault-tolerance effectiveness: excessive checkpointing results in performance degradation, while deficient checkpointing incurs expensive recovery. In distributed systems with independent checkpoint activities there is no easy way to determine checkpoint frequencies optimizing response-time and fault-tolerance costs at the same time. The purpose of this paper is to investigate the potentialities of a statistical decision-making procedure. We adopt a simulation-based approach for obtaining performance metrics that are afterwards used for determining a trade-off between checkpoint interval reductions and efficiency in performance. Statistical methodology including experimental design, regression analysis and optimization provides us with the framework for comparing configurations, which use possibly different fault-tolerance mechanisms (replication-based or message-logging-based). Systematic research also allows us to take into account additional design factors, such as load balancing. The method is described in terms of a standardized object replication model (OMG FT-CORBA), but it could also be applied in other (e.g. process-based) computational models. Copyright © 2006 John Wiley & Sons, Ltd. [source]
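
Although it is not the statistical procedure used in the paper, Young's classical first-order approximation is a common baseline for the same trade-off between checkpoint overhead and expected re-execution after a failure:

    \tau_{\mathrm{opt}} \approx \sqrt{2 \, \delta \, M},

where \delta is the time needed to write one checkpoint and M the mean time between failures; intervals much shorter than \tau_{\mathrm{opt}} waste time checkpointing, while much longer ones waste time recovering.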


On coordination and its significance to distributed and multi-agent systems

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 4 2006
Sascha Ossowski
Abstract Coordination is one of those words: it appears in most science and social fields, in politics and warfare, and it is even the subject of sports talk. While the usage of the word may convey different ideas to different people, the definition of coordination in all fields is quite similar: it relates to the control, planning, and execution of activities that are performed by distributed (perhaps independent) actors. Computer scientists involved in the field of distributed systems and agents focus on the distribution aspect of this concept. They see coordination as a field separate from all the others, a field that complements standard fields such as the ones mentioned above. This paper focuses on explaining the term coordination in relation to distributed and multi-agent systems. Several approaches to coordination are described and put in perspective. The paper finishes with a look at what we are calling emergent coordination and its potential for efficiently handling coordination in open environments. Copyright © 2005 John Wiley & Sons, Ltd. [source]


Study of a highly accurate and fast protein–ligand docking method based on molecular dynamics

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 14 2005
M. Taufer
Abstract Few methods use molecular dynamics simulations in concert with atomically detailed force fields to perform protein–ligand docking calculations, because they are considered too time-demanding despite their accuracy. In this paper we present a docking algorithm based on molecular dynamics which has a highly flexible computational granularity. We compare the accuracy and the time required with well-known, commonly used docking methods such as AutoDock, DOCK, FlexX, ICM, and GOLD. We show that our algorithm is accurate, fast and, because of its flexibility, applicable even to loosely coupled distributed systems such as desktop Grids for docking. Copyright © 2005 John Wiley & Sons, Ltd. [source]


On the fundamental communication abstraction supplied by P2P overlay networks

EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, Issue 1 2006
Curt Cramer
The disruptive advent of peer-to-peer (P2P) file sharing in 2000 attracted significant interest. P2P networks have matured from their initial form, unstructured overlays, to structured overlays like distributed hash tables (DHTs), which are considered state-of-the-art. There are huge efforts to improve their performance. Various P2P applications like distributed storage and application-layer multicast were proposed. However, little effort has been spent on understanding the communication abstraction that P2P overlays supply. Only when it is understood will the reach of P2P ideas significantly broaden. Furthermore, this clarification reveals novel approaches and highlights future directions. In this paper, we reconsider well-known P2P overlays, linking them to insights from distributed systems research. We conclude that the main communication abstraction is that of a virtual address space, or application-specific naming. On this basis, P2P systems build a functional layer implementing, for example, lookup, indirection and distributed processing. Our insights led us to identify interesting and unexplored points in the design space. Copyright © 2004 AEI. [source]
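
The virtual address space abstraction becomes concrete in a consistent-hashing ring of the kind underlying DHTs such as Chord: keys and node identifiers share one hash space, and a key is owned by its clockwise successor. A toy single-process sketch:

    # toy consistent-hashing ring, illustrative of the DHT abstraction only
    import hashlib
    from bisect import bisect_right

    def h(name, bits=32):
        """Hash a string into the ring's identifier space."""
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

    class Ring:
        def __init__(self, nodes):
            self.ids = sorted(h(n) for n in nodes)
            self.by_id = {h(n): n for n in nodes}

        def lookup(self, key):
            """Owner of a key = first node identifier clockwise from the key's hash."""
            i = bisect_right(self.ids, h(key)) % len(self.ids)
            return self.by_id[self.ids[i]]

    ring = Ring([f"node{i}" for i in range(8)])
    print(ring.lookup("some-application-key"))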


Automated application component placement in data centers using mathematical programming

INTERNATIONAL JOURNAL OF NETWORK MANAGEMENT, Issue 6 2008
Xiaoyun Zhu
In this article we address the application component placement (ACP) problem for a data center. The problem is defined as follows: for a given topology of a network consisting of switches, servers and storage devices with varying capabilities, and for a given specification of a component-based distributed application, decide which physical server should be assigned to each application component, such that the application's processing, communication and storage requirements are satisfied without creating bottlenecks in the infrastructure, and that scarce resources are used most efficiently. We explain how the ACP problem differs from traditional task assignment in distributed systems, or existing grid scheduling problems. We describe our approach of formalizing this problem using a mathematical optimization framework and further formulating it as a mixed integer program (MIP). We then present our ACP solver using GAMS and CPLEX to automate the decision-making process. The solver was numerically tested on a number of examples, ranging from a 125-server real data center to a set of hypothetical data centers with increasing size. In all cases the ACP solver found an optimal solution within a reasonably short time. In a numerical simulation comparing our solver to a random selection algorithm, our solver resulted in much more efficient use of scarce network resources and allowed more applications to be placed in the same infrastructure. Copyright © 2008 John Wiley & Sons, Ltd. [source]
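
The core of such a mixed integer program can be written with binary placement variables, as in the generic assignment-with-capacities formulation below (the article's actual model also captures network topology, storage and communication constraints):

    \min \sum_{c,s} w_{cs} \, x_{cs}
    \quad \text{subject to} \quad
    \sum_{s} x_{cs} = 1 \;\; \forall c, \qquad
    \sum_{c} r_{c} \, x_{cs} \le R_{s} \;\; \forall s, \qquad
    x_{cs} \in \{0,1\},

where x_{cs} = 1 places component c on server s, r_c is the component's resource demand, R_s the server's capacity, and w_{cs} a placement cost.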


Case study: a maintenance practice used with real-time telecommunications software

JOURNAL OF SOFTWARE MAINTENANCE AND EVOLUTION: RESEARCH AND PRACTICE, Issue 2 2001
Miroslav Popović
Abstract In this paper we present a case study of the software maintenance practice that has been successfully applied to real-time distributed systems which are installed and fully operational in Moscow, St. Petersburg, and other cities across Russia. We concentrate on the software maintenance process, including customer request servicing, in-field error logging, the role of the information system, software deployment, and software quality policy, and especially the software quality prediction process. In this case study, the prediction process is shown to be an integral and one of the most important parts of the software maintenance process. We include an overview of the software quality prediction procedure and an example from actual practice. The quality of a new software update is predicted on the basis of the current update's quantity metrics data and quality data, and the new update's quantity metrics data. For management, this forecast aids software maintenance efficiency and cost reduction. For practitioners, the most useful result presented is the process for determining the value of the break point. We end this case study with five lessons learned. Copyright © 2001 John Wiley & Sons, Ltd. [source]

