Reliability engineering and best practices in enterprise software

Introduction

Since the middle of the last century, reliability engineering has been present in organizations and their development. Over time it has managed to consolidate itself as an important tool in the design, planning, analysis and control of companies.

This document compiles information regarding the areas covered by reliability engineering, addresses some of the most widely used methods, reports on how it is evaluated and how it relates to the different departments of organizations.

Origin

The concept of Reliability Engineering arose during the Second World War, when achieving high reliability in war material was a fundamental goal. The concept has been refined rapidly in recent years and has become an important area of research that incorporates a wide variety of mathematical and statistical concepts.

Definition

To speak of Engineering is to refer to a discipline that provides methods, techniques and tools for solving real problems that arise in everyday life, whether in a company, at school or, in general, in any type of organization (Salvador, 2003).

Meanwhile the term reliability is generally used to express a certain degree of assurance that a device or system operates successfully in a specific environment during a certain period.

Therefore, Reliability Engineering is the set of methods, techniques and tools used to determine the degree of assurance with which a device, product or system will work under optimal conditions during a certain period of time. It can also be described as a staff unit, fully integrated with the rest of the company's functions, that feeds on data and information, mainly from the systems associated with maintenance, production and engineering, and transforms them into knowledge of practical and concrete use. From the point of view of organizational structure, it could be incorporated into traditional product- and process-oriented engineering; however, it is necessary and convenient for reliability engineering to be a unit devoted exclusively to this work, without diverting its attention to other, more traditional activities.

Trust vs Reliability

Trust and reliability are not the same. The term trust refers to the favorable opinion that a person or group is capable of acting correctly in a certain situation. Trust is the assurance or firm hope that someone has in another individual or in something.

It is also about self-assurance and the courage or vigor to act. For example: "This man does not inspire confidence in me, I think I'm not going to accept the deal", "Juan gave her his trust and she betrayed him", "I have the necessary confidence to defeat the rival". Confidence also refers to familiarity in dealing with someone: "You don't have to comb your hair every time I come to your house, we know each other well enough", "How dare you talk to me like that? I never gave you such familiarity".

For social psychology and sociology, trust is a hypothesis that is made about the future behavior of others. It is a belief that estimates that a person will be able to act in a certain way when faced with a certain situation: "I am going to tell my father everything, I am confident that he will understand me and help me." In this sense, trust can be strengthened or weakened according to the actions of the other person.

In the example above, if the father helps his son, trust will be strengthened; otherwise, trust will be betrayed and, in the future, the child will most likely not act in the same way. Trust supposes a suspension, at least temporarily, of uncertainty regarding the actions of others. When someone trusts the other, they believe that they can predict their actions and behaviors. Trust, therefore, simplifies social relationships.

The modern quantitative conception of reliability had its origins in military and space technology. However, the increase in the complexity of the systems, the competitiveness in the market, and the growing competition for budget and resources have caused the expansion of the discipline to many other areas. When reliability is defined quantitatively, it can be specified, analyzed, and becomes a parameter of the design of a system that competes against other parameters such as cost and performance.

Recent research shows that most of the education an engineer receives is relevant mainly to design; however, 80% of engineers must work "in the care and operation of facilities" that have already been designed and built.

"Taking care of something already designed and built" comes in the form of operation, maintenance and related activities. These functions require knowledge and skills different from those required to design equipment or processes, ranging from problem solving and the generation of maintenance and operation plans to the optimization of plant shutdowns, including personnel management, cultural change and the management of uncertainty (Duran, 2003).

Reliability applied to engineering has proven its effectiveness over the years in anticipating the operating failures found in companies and organizations. To verify this, field tests applying statistics have had to be developed. In this way production problems can be prevented through reliability tools, which make it possible to obtain a product or a piece of machinery, among other things, that is durable and of good quality.

Goals:

Almost all engineers, when they arrive at their first plant job, receive a series of tips, such as: learn from the technician, keep quiet, observe, look for ways to show that you are learning, but remember that you do not know anything, etc. Others think they will receive very extensive training before they are given responsibilities. In reality, many of our engineers come to the industry with very weak training in the operation of a real plant. Short internships, which in many cases are optional, are practically the only contact with "real life". It is very difficult to quantify the result of this; however, we could imagine a better world in which the education of our engineers had a stronger component in the care and exploitation of assets (Duran, 2003).

Most of the actual training of engineers in the care and exploitation of assets is empirical and is transmitted from generation to generation. This brings both desired and unwanted components, for example the perpetuation of paradigms that are no longer wanted today.

From the design stage onward, the organization needs to deliver equipment or systems that offer the features desired by the customer and that are also reliable, easy to maintain, and safe and economical to operate during their useful life.

Reliability Engineering focuses on achieving the following objectives. (LLC)

Apply engineering knowledge to prevent or reduce the frequency of failures; Identify and correct the causes of catastrophic or repetitive failures; Define methods to mitigate failures if their causes have not been identified and corrected; Apply techniques to estimate reliability in new designs and analyze reliability data.

Benefits:

The main benefits of Reliability Engineering are summarized as follows: Meet customer expectations on the functionality and useful life of the equipment; Reduce the foreseeable risks inherent to the operation of the equipment and the health hazards; Improve the Reliability and Availability of the systems (reduce failure rates and reduce downtime); Achieve production objectives; Improve the marketing of products and warranties.
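The link between failure rates, downtime and availability mentioned above can be illustrated with a minimal sketch. It assumes the common steady-state relationship, availability = MTBF / (MTBF + MTTR), where MTBF (mean time between failures) and MTTR (mean time to repair) are quantities not named in this document; the figures are hypothetical.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, assuming A = MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical equipment: one failure every 400 h, 8 h to repair.
baseline = steady_state_availability(mtbf_hours=400.0, mttr_hours=8.0)

# Two ways to improve it: fail half as often, or repair twice as fast.
fewer_failures = steady_state_availability(mtbf_hours=800.0, mttr_hours=8.0)
faster_repairs = steady_state_availability(mtbf_hours=400.0, mttr_hours=4.0)

print(f"baseline:       {baseline:.4f}")
print(f"fewer failures: {fewer_failures:.4f}")
print(f"faster repairs: {faster_repairs:.4f}")
```

Under this simple model, halving the repair time and halving the failure frequency give the same availability, which is why both failure rates and downtime appear in the list of benefits.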

Application of reliability engineering

The application of reliability to product and process engineering has shown excellent results as a means of anticipating operational failures. The development of field tests, accompanied by analysis of failures and their corresponding probabilities of occurrence, offers an excellent alternative to develop robust products and processes capable of manufacturing them.

In this context, a product is understood as any manufactured good that fulfills a specific function for a user or customer; thus this product can be a machine, a piece of equipment or any general consumer good.

Many production problems can be prevented through reliability techniques, which make it possible to obtain a product that meets the customer's expectations for durability and quality within the technological and operational limitations of manufacturing and the available working capital.

The strong competition in national and international markets forces companies to develop strategies based on four fundamental factors: price, quality, reliability and delivery time. These strategies have gained a lot of interest, since success will go to those who manage to arrive first, with a quality that satisfies the client and a price that is reasonable and affordable for the market niche they intend to capture. Additionally, these products are expected to perform without failure for a sufficient time (useful life) to meet customer expectations.

Design

If you seek to maximize the value of the money invested (optimize costs) during the life cycle of the project, the application of reliability concepts, goals and procedures should not be limited to the engineering stage; they should be applied throughout the entire life cycle of the project associated with the installation. This is what is known as Reliability From Design (CDD). The application of reliability has a greater impact on the results if it starts at the earliest stage of a project, during the design phase, which is why it is necessary to generate a document that specifies the reliability actions to follow during the project design stage. The proposed methodology is a guide for the direction and management of maintenance projects from the design stage. It gives the reliability actions and guidelines that must be considered during the project design phase, specifically the Definition and Development phase (Visualization, Conceptualization and Definition).

The methodology can be used by the personnel who participate in the design phases of projects. Its purpose is to ensure and standardize, in an orderly manner, the application of reliability concepts, procedures and methodologies during the design phase and to integrate them, or "tie them", with the activities and documents generated during the development of engineering projects. Engineering projects are understood here as projects for the operation of new facilities, extensions and "revamps" within all the operational areas of the company.

Reliability from Design (CDD) Considerations and Concepts

It has recently been recognized that one of the most important approaches to increasing the value of a facility is by improving the availability or utilization of the facility.

The traditional approach commonly used to increase value has been to increase the volume of sales, increase the manufacturing capacity of the asset, reduce costs, enter new markets, or a combination of these factors. An increase in availability can be achieved by improving Operating Procedures, Maintenance Techniques, Human Reliability and the Intrinsic Reliability of the Installation.

As a result of the recognition of this new approach, the concept of Asset Utilization (UA) has emerged, which takes into account sales and availability. The primary objective of a facility is to maximize UA, that is, to maximize the value of the money invested throughout the life cycle of the project. Benchmarking against other companies has shown that lost UA opportunity is due to issues that are evenly distributed among Operations, Maintenance, and Design.

To improve the availability of a facility, it is necessary to apply reliability concepts, goals and procedures throughout the life of the project. This is what is known as Reliability From Design (CDD).

The key to obtaining a cost-effective installation and a reliable product or installation is to apply reliability concepts from the earliest stage of the project, that is, in the design stage (particularly in Definition and Development). It is at this stage that the application of reliability has the greatest opportunity to affect the results, since the project is still flexible enough to be modified or redesigned without a high impact on costs. If reliability improvements are applied after the design has been "frozen", any changes or modifications will have a substantial impact on costs.

The application of reliability in the design phase of a project requires the experience and multidisciplinary skills of different specialists. To maximize value, a combination of management, finance, engineering, construction, and other practices applied to assets is required in pursuit of an economic life-cycle cost. This concept is directly related to Reliability From Design (CDD) and the maintainability of assets (facilities).

One aspect to consider throughout the life cycle of a project is to achieve an adequate balance between productivity and safety at an optimal cost. This has a direct effect on reliability, and therefore should be considered as part of the reliability aspects to be applied in the project life cycle.

This balance is achieved through risk management, by defining strategies for each of the following aspects, some of which are closely related:

Design (robust design vs. low-cost design). Maintenance and operation strategy. Management of abnormal events. Decommissioning of the asset. Management of personnel and corporate culture. Responsibility for safety. Management of scarce resources. Attitude toward regulatory bodies (government entities).

Defining strategies can cause conflicts between productivity and safety, for example when uninterrupted production requires actions that affect safety in the short or long term. The most prudent strategies are supported by a robust design, frequent preventive maintenance, and early responses to signs of deterioration.

At the other extreme, the strategies are driven by an aggressive production plan, which results in less robust installations or design (often cheaper), minimal inspection and maintenance while waiting to obtain maximum production with a minimum of operational interruptions.

The strategies to apply in each of the aforementioned aspects depend on various factors, including company policy, available budget, market projection, etc. The risk management aspects that need to be taken into consideration during the definition and development stage are the first two mentioned above: design, and maintenance and operation strategy.

Software

Organizations that develop software-based products require effective practices to improve product quality, and they allocate large amounts of resources to that end. Software Reliability Engineering is a quantitative practice that can be implemented in organizations of any size under different development models.

A part of these resources is used for the adoption of best practices. However, the difficulty in adopting these practices lies not only in the cost and time required to institutionalize them, but in how to measure their impact on the quality of the software, as well as demonstrate the return on said investment.

This article introduces you to Software Reliability Engineering (ICS). ICS is a low-cost practice, independent of the development model and of the technological platform, which allows the quantitative characterization and control of product quality.

Software quality, failures and reliability

Quality is an attribute perceived by users or customers of any product or service. In the case of software-based products, the perception of quality is a function of the failures that the client perceives during its operation.

In its most general conception, reliability is an attribute that measures the degree to which a product operates without failure under established conditions for a specified period of time. It is a quantitative attribute that has been widely analyzed, studied, and used in other industries to characterize the quality of products or services.

A failure is the manifestation perceived by the customer that something is not working properly and impacts their perception of quality. A defect is the problem in the software product that causes a failure.

What is Software Reliability Engineering

ICS is a practice that allows you to plan and guide the software testing process in a quantitative way. ICS is not new: it originated in the 1970s with the work of J. D. Musa, A. Iannino, and K. Okumoto.

Its effectiveness has made many companies incorporate this practice in their projects, such as AT&T, Alcatel, HP, IBM, Lockheed-Martin, Microsoft, Motorola, among others. The impact of this practice has been seen in the approval of an AIAA standard (in 1993) as well as its corresponding versions in the IEEE standards. It is worth mentioning that more than 60 articles have been documented reporting the results of the application of ICS in different projects.

Two elements characterize ICS: the relative expected use of the system's functionalities, and the customer-defined quality requirements, including reliability, release date, and project life cycle cost.

The first element focuses on quantitatively characterizing the expected use of the system by defining the so-called operating profile of the system. This quantitative characterization allows optimizing the use of resources in the functions that have a greater impact and greater expected use within the system.

The operating profile of a system is the quantitative characterization of the expected use of the main functionalities of the system. Probabilities are generally used to quantify such expected use.
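As an illustration of how an operating profile can drive the use of resources, the following minimal sketch distributes a fixed budget of test cases in proportion to the expected use of each operation. The operation names, probabilities and budget are hypothetical, and proportional allocation is only one possible policy, not a method prescribed by the source.

```python
def allocate_test_cases(profile: dict[str, float], total_cases: int) -> dict[str, int]:
    """Split a test budget across operations in proportion to expected use."""
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("operation probabilities must sum to 1")

    # Ideal (fractional) share of the budget for each operation.
    raw = {op: p * total_cases for op, p in profile.items()}

    # Round down first, then hand out the leftover cases to the operations
    # with the largest fractional remainders (largest-remainder rounding).
    allocation = {op: int(share) for op, share in raw.items()}
    leftover = total_cases - sum(allocation.values())
    by_remainder = sorted(raw, key=lambda op: raw[op] - allocation[op], reverse=True)
    for op in by_remainder[:leftover]:
        allocation[op] += 1
    return allocation


# Hypothetical operating profile of an order-management system.
profile = {
    "process_order": 0.60,
    "query_status": 0.30,
    "update_account": 0.08,
    "generate_report": 0.02,
}

plan = allocate_test_cases(profile, total_cases=250)
for operation, cases in plan.items():
    print(f"{operation:16s} {cases:4d} test cases")
```

In practice a team may also guarantee a minimum number of cases for rarely used but critical operations; that refinement is omitted here.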

The second element refers to customer focus by establishing quantitative objectives associated with product quality (represented based on product failures). The satisfaction of these objectives allows establishing a balance between the costs of the product, as well as the satisfaction of the client's needs.
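A quantitative objective of this kind is often expressed as a failure intensity, that is, failures per unit of execution time. The sketch below compares an observed failure intensity with a target; the target value and the test figures are hypothetical, and the raw ratio used here is a simplification, since real projects usually estimate failure intensity with a reliability growth model.

```python
def failure_intensity(failures: int, execution_hours: float) -> float:
    """Observed failures per hour of execution (naive point estimate)."""
    if execution_hours <= 0:
        raise ValueError("execution_hours must be positive")
    return failures / execution_hours


def meets_objective(failures: int, execution_hours: float, objective_per_hour: float) -> bool:
    """True when the observed failure intensity is at or below the objective."""
    return failure_intensity(failures, execution_hours) <= objective_per_hour


# Hypothetical objective agreed with the customer: at most 1 failure per 100 hours.
OBJECTIVE = 0.01

early_build = meets_objective(failures=9, execution_hours=300.0, objective_per_hour=OBJECTIVE)
late_build = meets_objective(failures=2, execution_hours=400.0, objective_per_hour=OBJECTIVE)

print("early build ready for release:", early_build)
print("late build ready for release: ", late_build)
```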

Why use ICS?

ICS is independent of technology and development platform. It does not require any changes in architecture, design, or code, but it can suggest changes that would be useful. Also, ICS is highly customer-oriented and highly correlated with levels 4 and 5 of the Capability Maturity Model Integration (CMMI) of the Software Engineering Institute.

Its high customer orientation is due to the nature of the information required in the ICS process, which implies having frequent and close contact with customers. This interaction improves customer satisfaction and reduces risks in a similar way to what is proposed in agile development methods.

The high correlation with CMMI maturity levels 4 and 5 is due to the fact that this practice satisfies several objectives related to measurement for the optimization of the development process, which makes ICS a good option for achieving them. Compared with its advantages, the cost of applying ICS is low, according to the experience of John D. Musa (Musa, 2004).

The Software Reliability Engineering Process

The ICS process can be seen as a set of activities additional and complementary to those already carried out within any development process. Six activities define the ICS framework; the first two are described below:

Define the Product. It can be seen as a complement to the Requirements Analysis and Architectural Design. This activity defines who the clients, users, suppliers and other related systems are.

Develop the Operating Profile. The complete set of operations (i.e., tasks or main logical functionalities of the system) is defined, each with its corresponding probability of occurrence or expected use. In this stage, the administration of resources becomes quantitative, based on the importance of each operation of the system.

Processes

The reliability of a system (product or process) can be estimated by means of a study that is carried out in four phases:

Definition of objectives and requirements for the reliability of the product or process. This phase is carried out by a multidisciplinary team that brings together the voice of the customer, captured by marketing, and the voice of the process, captured by engineering, and that considers the technological and engineering limitations of materials and machines. A quality function deployment study is an excellent tool for this type of analysis.

Disaggregation of the product or process into components and estimation of the reliability of each component. The product or process is divided into its components and these, in turn, into their parts, in order to determine the reliability value of each one at the micro level. In this phase, block diagrams and "gozinto" diagrams can be used to carry out an orderly disaggregation in which no essential component of the product or process is lost.

Prediction of product reliability based on the reliability of its components. The reliabilities of all the components are combined into the reliability value of the product or process as a whole. Macro-level reliability estimation is complicated and can lead to errors. This estimation uses probability theory to determine the reliability of the product or process.

Analysis of the product or process in order to determine strengths and weaknesses and take advantage of new opportunities for improvement. Once the reliability of the product or process has been determined during its design, product failures are studied during manufacturing and throughout its useful life, as these are excellent agents for detecting weaknesses whose correction improves product performance.
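The prediction phase can be sketched with the two basic rules of a reliability block diagram. The sketch assumes that component failures are independent, and the component values are hypothetical: blocks in series must all work, while redundant blocks in parallel fail only if every one of them fails.

```python
from math import prod


def series(reliabilities: list[float]) -> float:
    """All blocks must work: R = R1 * R2 * ... * Rn."""
    return prod(reliabilities)


def parallel(reliabilities: list[float]) -> float:
    """At least one block must work: R = 1 - (1 - R1) * ... * (1 - Rn)."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical system: controller -> two redundant pumps -> valve.
controller = 0.99
pumps = parallel([0.90, 0.90])  # redundancy lifts 0.90 to about 0.99
valve = 0.95

system = series([controller, pumps, valve])
print(f"redundant pump stage: {pumps:.4f}")
print(f"whole system:         {system:.4f}")
```

The series rule shows why macro-level estimates degrade quickly: every additional component in series multiplies the system reliability by a factor below one.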

Control

Control charts show how a characteristic behaves over time. If all the points are within the control limits and do not follow a specific pattern, the process is said to be in control. Control limits depend on the behavior of the data.
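A minimal sketch of how control limits can be derived from the data follows. It assumes the common convention of placing the limits three standard deviations from the mean of a baseline sample, uses the sample standard deviation as a simplification, and checks only whether points fall outside the limits, not whether they follow a pattern. The measurements are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline: list[float]) -> tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) as mean -/+ 3 sigma."""
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline data
    return center - 3 * sigma, center, center + 3 * sigma


def out_of_control(points: list[float], limits: tuple[float, float, float]) -> list[int]:
    """Indexes of the points that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, x in enumerate(points) if x < lower or x > upper]


# Hypothetical baseline measurements taken while the process was stable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
limits = control_limits(baseline)

# New measurements to be judged against those limits.
flagged = out_of_control([10.0, 10.3, 9.7, 11.2], limits)

print("limits:", tuple(round(v, 3) for v in limits))
print("out-of-control points at indexes:", flagged)
```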

Mathematical models

Perform quantitative risk analysis in order to quantify the risk of an installation failure based on the identification of failure modes and the calculation of their probabilities. In failure modes it is important to include human failures.

Identify maintenance strategies (maintainability). An adequate maintenance policy must be generated, seeking to optimize costs. In this case, the maintenance costs required to achieve a certain level of reliability (and therefore safety and long-term production) are balanced with the costs of failure. This consideration leads to increased facility availability and is achieved by considering accessibility, rapid fault detection and isolation, on-line maintenance, ease of removal, replacement and repair with minimal adjustments.

These recommendations and tasks will avoid that at the end of the detailed engineering, the final design is totally or partially subject to revision for reasons of maintainability, which can lead to redesign before the construction phase. This redesign can be costly in labor and time.
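The two tasks above, quantifying risk from failure modes and balancing maintenance cost against the cost of failure, can be sketched together. The sketch assumes that risk is the sum of probability times consequence over the failure modes, and that each maintenance strategy scales every failure probability by a single factor; both are simplifying assumptions, and all names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    annual_probability: float  # chance of occurring in a year
    consequence_cost: float    # cost if it occurs


@dataclass(frozen=True)
class Strategy:
    name: str
    annual_maintenance_cost: float
    probability_factor: float  # assumed multiplier applied to every failure probability


def expected_annual_risk(modes: list[FailureMode]) -> float:
    """Risk as the sum of probability * consequence over all failure modes."""
    return sum(m.annual_probability * m.consequence_cost for m in modes)


def total_annual_cost(strategy: Strategy, modes: list[FailureMode]) -> float:
    """Maintenance cost plus the expected cost of the failures that remain."""
    return strategy.annual_maintenance_cost + strategy.probability_factor * expected_annual_risk(modes)


modes = [
    FailureMode("seal leak", 0.10, 20_000.0),
    FailureMode("bearing seizure", 0.02, 150_000.0),
    FailureMode("operator error", 0.05, 40_000.0),  # human failures are included
]

strategies = [
    Strategy("run to failure", 0.0, 1.0),
    Strategy("preventive", 2_000.0, 0.4),
    Strategy("intensive", 6_000.0, 0.2),
]

baseline_risk = expected_annual_risk(modes)
best = min(strategies, key=lambda s: total_annual_cost(s, modes))

print(f"baseline expected risk: {baseline_risk:,.0f}")
for s in strategies:
    print(f"{s.name:15s} total annual cost: {total_annual_cost(s, modes):,.0f}")
print("lowest total cost:", best.name)
```

With these figures the intermediate strategy wins: doing no maintenance leaves the full failure cost, while the most intensive plan costs more than the failures it prevents.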

Probability distributions:

Binomial; Poisson; Normal; Weibull; Exponential; Triangular.
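As an example of how these distributions are used, the sketch below evaluates the reliability function of the two-parameter Weibull distribution, R(t) = exp(-(t / scale)^shape), and of the exponential distribution, R(t) = exp(-rate * t). The parameter values are hypothetical. With shape equal to 1 the Weibull model reduces to the exponential model with rate 1 / scale.

```python
from math import exp


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Probability of surviving beyond time t: exp(-(t / scale) ** shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be non-negative; shape and scale must be positive")
    return exp(-((t / scale) ** shape))


def exponential_reliability(t: float, failure_rate: float) -> float:
    """Probability of surviving beyond time t with a constant failure rate."""
    if t < 0 or failure_rate <= 0:
        raise ValueError("t must be non-negative and failure_rate positive")
    return exp(-failure_rate * t)


# Hypothetical component with wear-out behaviour (shape > 1), scale of 1000 h.
r_at_scale = weibull_reliability(1000.0, shape=2.0, scale=1000.0)

# With shape = 1 the Weibull model matches the exponential model.
r_weibull = weibull_reliability(500.0, shape=1.0, scale=1000.0)
r_exponential = exponential_reliability(500.0, failure_rate=1.0 / 1000.0)

print(f"R(1000 h), shape 2: {r_at_scale:.4f}")
print(f"R(500 h), shape 1:  {r_weibull:.4f} (exponential: {r_exponential:.4f})")
```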

Certification

Standards are currently necessary for all organized activity. For this reason, organizations around the world create them and follow them rigorously in order to achieve their objectives. At present the ISO 9000 and ISO 14000 standards are widely required, because they support the quality of a product through the implementation of exhaustive controls, ensuring that all the processes involved in its manufacture operate within the expected characteristics.

Every company must take these standards into account, as they are the starting point of the quality strategy and of the subsequent certification of the company. The quality of a product is not born from efficient controls; it is born from production and support processes that operate properly. The ISO standards are based on this principle, which is why they are applied to the company and not to its products.

The company that implements the standards assures its customers that the quality of the product they purchase will be maintained over time. In this way the market will differentiate between companies that have been certified and those that have not. Certification will eventually become commonplace and non-certified companies will be discriminated against. This already occurs in developed countries, where the supply departments of large corporations demand the standard from all their suppliers.

The ISO 14000 standard is not a single standard but a family of standards on environmental management applied to the company. Their objective is to standardize ways of producing and providing services that protect the environment, increasing product quality and consequently competitiveness in the face of demand for products whose components and manufacturing processes respect the environment.

These are also part of the ISO (International Organization for Standardization) series, from which the well-known ISO 9000 and ISO 9001 come, the latter referring to quality management within the company.

Today there are several certification plans, which fall into two basic currents:

Certification Through Exams.

These plans "certify" people only through exams or written tests. They can therefore certify only that people know the theory; they cannot certify that people have the skills to make things happen. They are not training programs as such but a way of "demonstrating" that people have a certain level of knowledge.

Learning By Doing Certification.

This type of program is gaining a lot of support. It seeks to solve the weaknesses of the exam-based current while keeping its advantages. It is essentially an academic program combined with a certification program that also requires implementations in the plant.

Maintenance

Skills training needs

More than 80% of graduate engineers today face the world of asset operation and maintenance with an arsenal prepared for the design of facilities, when in reality they will work most of their lives with assets that have already been designed. Meanwhile, another group of engineers will dedicate themselves to designing assets that they will never operate or maintain.

A great weakness of current university training is that it is generally very dedicated to the technical field and very weak in economics, whereas today's Asset Management requires decisions to be made from a technical-economic point of view.

Some changes are slowly taking place. Today we hear about issues such as reliability and maintainability from the design, and reliability departments and management positions are being created. However, the skills required to perform these functions are far from present in the people who hold the positions.

There is great confusion regarding the functions that must be performed both in maintenance and in reliability. Some examples and their consequences follow:

Who plans and who schedules maintenance? Here we see the planner caught up in the day-to-day, locating spare parts and tools and coordinating with crews, and so abandoning the planning function. We also see the schedulers doing planning work, looking "over the shoulder" of the planner. Consequence: planning crisis.

Who schedules plant shutdowns? Here the maintenance planner (sometimes doing the work of a scheduler) struggles with the day-to-day while also "planning the shutdown they will have in 18 months." Consequence: "there is always something missing at the shutdown".

What are the functions of Reliability Engineering? Here we see great confusion: in some companies the reliability engineers are dedicated to tracking indicators; in others they want to solve all the problems; in others they want to run complicated programs to "predict" failures (which happen every day, by the way); others evaluate why they cannot use the current software and try to justify other packages; others do not know what to do (the position is new and has not been described). Consequence: it is difficult to demonstrate the real and potential benefit of Operational Reliability Engineering.

Who should do Reliability Engineering? Here we see the eternal discussion of who should do this work: a specific position, or everyone (in which case nobody does anything in the end). Consequence: because the respective responsibilities are not clear, "the ball falls into no man's land" and little can be achieved.

How should maintenance support the design? Here we see companies writing engineering specifications as general as "the concepts of Reliability and Maintainability should be considered from the design". Consequence: without knowing clearly how this should be done, it is unlikely that the desired product will be obtained from the supplier.

What maintenance or reliability improvement techniques should I use? Here we notice great confusion, since there is a tendency to "fall in love" with a particular technique and treat it as a panacea, without first evaluating where and when to use it. The counterpart is that, in the face of so much confusion, nothing is chosen at all. Consequence: the desired results are not achieved.

How to establish an improvement plan, and where to start implementing it? This point is closely aligned with the previous one. Its major consequence is that the resources required to implement an improvement plan are sized inadequately: the plan is over- or under-dimensioned, and projects are abandoned or run late.

All of this is coupled with the field of staff training, which raises questions such as: What are the responsibilities of maintenance and reliability personnel? How to draw up a training plan that is sustainable? What technical competencies should be included? What business competencies should be included? How to draw up a matrix of positions, responsibilities, skills and competencies?

Total productive maintenance

The TPM aims to create a corporate system that maximizes the efficiency of the entire production system, establishing a system that prevents losses in all operations of the company. This includes "zero accidents, zero defects and zero failures" throughout the life cycle of the production system. It is applied in all sectors, including production, development and administrative departments. It is supported by the participation of all members of the company, from senior management to operational levels. Obtaining zero losses is achieved through the work of small teams.

The TPM makes it possible to differentiate an organization in relation to its competition due to the impact on cost reduction, improved response times, reliability of supplies, the knowledge that people have, and the quality of the final products and services. TPM looks for:

Maximize equipment effectiveness; Develop a productive maintenance system for the life of the equipment; Involve all departments that plan, design, use, or maintain equipment in the implementation of TPM; Actively involve all employees, from senior management to floor workers; Promote TPM through motivation, with autonomous small-group activities; Zero accidents; Zero defects; Zero breakdowns.

Objectives of the TPM

Strategic objectives

The TPM process helps build competitive capabilities from the company's operations, thanks to its contribution to improving the effectiveness of production systems, flexibility and response capacity, reducing operating costs, and maintaining industrial "knowledge".

Operational objectives

In daily operations, the purpose of TPM is for equipment to operate without breakdowns or failures, to eliminate all kinds of losses, to improve equipment reliability and to make full use of the installed industrial capacity.

Organizational objectives

TPM seeks to strengthen teamwork, raise worker morale and create a space where each person can contribute their best, all with the purpose of making the workplace creative, safe, productive and a genuinely pleasant place to work.

TPM Features:

Maintenance actions in all stages of the equipment life cycle
Wide participation of all the people in the organization
It is regarded as a global company strategy rather than a system for maintaining equipment
Oriented toward improving the Global Effectiveness of operations rather than merely keeping equipment running
Significant involvement of operation and production personnel in the care and conservation of equipment and physical resources
Maintenance processes based on deep use of the knowledge that personnel have about the processes

TPM benefits

Organizational

Improvement of the quality of the work environment
Better control of operations
Increased employee morale
Creation of a culture of responsibility, discipline and respect for rules
Lifelong learning
Creation of an environment where participation, collaboration and creativity are a reality
Proper sizing of staffing levels
Effective communication networks

Safety

Improve environmental conditions
Build a culture of preventing events that harm health
Increase the ability to identify potential problems and find corrective actions
Understand the reason behind certain regulations, not just how to comply with them
Prevent and eliminate potential causes of accidents
Radically eliminate sources of contamination and pollution

Productivity

Eliminate losses that affect plant productivity
Improve equipment reliability and availability
Reduce maintenance costs
Improve final product quality
Lower financial cost for changes
Improve company technology
Increase responsiveness to market movements
Create competitive capabilities from the factory

Pillars of the TPM

The pillars or fundamental processes of the TPM serve as support for the construction of an orderly production system. They are implemented following a disciplined, powerful and effective methodology.

The pillars considered necessary for the development of the TPM in an organization are those indicated below:

Pillar 1: Focused Improvements (Kaizen)

Focused improvements are activities carried out with the participation of the different areas involved in the production process in order to maximize the Global Effectiveness of the Equipment, process and plant. This is done through organized work in multidisciplinary teams that use a specific methodology and concentrate on eliminating the waste that occurs in industrial plants.

The aim is to develop a continuous improvement process similar to the one found in Total Quality Control, applying maintenance procedures and techniques. If an organization already has similar improvement activities, it can simply incorporate into its Kaizen or improvement process the new tools developed in the TPM environment. It should not modify the improvement process it currently applies.
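One common way to put a number on the equipment effectiveness that focused improvements try to maximize is Overall Equipment Effectiveness (OEE), the product of availability, performance and quality rates. The text does not give a formula, so the following is only a minimal sketch under that assumption. The function name, parameter names and shift figures are illustrative.

```python
def overall_equipment_effectiveness(planned_time, downtime, ideal_cycle_time,
                                    total_units, good_units):
    """Return (availability, performance, quality, oee) as fractions.

    All times share one unit (for example minutes); ideal_cycle_time is the
    ideal time needed to produce one unit.
    """
    run_time = planned_time - downtime
    if planned_time <= 0 or run_time <= 0 or total_units <= 0:
        raise ValueError("planned_time, run time and total_units must be positive")
    availability = run_time / planned_time                    # share of planned time actually running
    performance = (ideal_cycle_time * total_units) / run_time  # actual speed versus ideal speed
    quality = good_units / total_units                        # share of output without defects
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift: 480 planned minutes, 60 minutes stopped,
# ideal cycle of 1 minute per unit, 378 units made, 360 of them good.
availability, performance, quality, oee = overall_equipment_effectiveness(
    480, 60, 1.0, 378, 360)
print(f"availability={availability:.3f} performance={performance:.3f} "
      f"quality={quality:.3f} OEE={oee:.3f}")
```

Splitting the result into three factors shows a multidisciplinary team where the loss sits (stoppages, slow running or defects) before it chooses an improvement project.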

Pillar 2: Autonomous Maintenance (Jishu Hozen)

Autonomous maintenance is a set of activities carried out daily by all workers on the equipment they operate, including inspection, lubrication, cleaning, minor interventions, changing tools and parts, studying possible improvements, analyzing and solving equipment problems, and any other actions that keep the equipment in the best operating condition.

These activities must be carried out following previously prepared standards with the collaboration of the operators themselves. Operators must be trained and have the knowledge necessary to master the equipment they operate.

The fundamental objectives of autonomous maintenance are:

Use the equipment as an instrument for learning and acquiring knowledge
Develop new skills for analyzing problems and create a new way of thinking about the work
Avoid deterioration of the equipment through correct operation and permanent verification against the standards
Improve the operation of the equipment with the creative input of the operator
Build and maintain the conditions necessary for the equipment to run without breakdowns and at full performance
Improve safety at work
Achieve a full sense of ownership and responsibility in the worker
Improve morale at work

Pillar 3: Progressive or Planned Maintenance (Keikaku Hozen)

Progressive maintenance is one of the most important pillars in the search for profit in an industrial organization. Its purpose is to move the industrial plant gradually towards the goal of "zero breakdowns".

The planned maintenance that is practiced in many companies has the following limitations, among others:

There is no historical information with which to establish the most appropriate time for preventive maintenance actions. Intervals are set according to experience, manufacturer recommendations and other criteria with little technical foundation, without the support of data on past behavior.

The stoppage of a piece of equipment is used to "do everything necessary on the machine" since it is available. Will a similar intervention time be necessary for all the elements and systems of a piece of equipment? Will this be economical?

Preventive maintenance plans are applied to equipment with high accumulated deterioration. This deterioration affects the dispersion of the statistical distribution of failures, making it impossible to identify the regular failure behavior on which a preventive maintenance plan should be based.

Equipment and systems are treated alike when preventive routines are defined, regardless of their criticality, risk, effect on quality, difficulty of obtaining replacements, and so on.

Maintenance departments rarely have specialized standards for carrying out their technical work. The usual practice is to print the work order with some assignments that do not detail the type of action to be carried out.

Planned maintenance work does not include Kaizen actions to improve work methods. Actions to improve technical capability and the reliability of maintenance work are not included, nor is it common to see plans developed to eliminate the need for maintenance actions. This too should be considered a preventive maintenance activity.

Pillar 4: Education and Training

This pillar considers all the actions that must be carried out for the development of skills to achieve high levels of performance of people in their work. It can be developed in steps like all TPM pillars and employs techniques used in autonomous maintenance, focused improvements, and quality tools.

Pillar 5: Early Maintenance

This pillar seeks to improve the technology of production equipment. It is essential for companies that compete in sectors of accelerated innovation, Mass Customization or versatile manufacturing, since in these production systems the continuous updating of the equipment, the capacity for flexibility and failure-free operation are extremely critical factors. This pillar acts during the planning and construction of the production equipment.

This pillar draws on methods for managing information about the operation of current equipment, economic project management, and quality and maintenance engineering techniques. It is developed through teams formed for specific projects, with participation from research, development and design, process technology, production, maintenance, planning, quality management and commercial areas.

Pillar 6: Quality Maintenance (Hinshitsu Hozen)

This pillar is intended to establish equipment conditions at which "zero defects" is feasible. Quality maintenance actions verify and measure the "zero defects" conditions regularly, so that the equipment operates in a state where quality defects are not generated.

Quality Maintenance is not…

Applying quality control techniques to maintenance tasks
Applying an ISO system to the maintenance function
Using statistical quality control techniques in maintenance
Applying continuous improvement actions to the maintenance function

Quality Maintenance is…

Carrying out maintenance actions aimed at caring for the equipment so that it does not generate quality defects
Preventing quality defects by certifying that the machinery meets the conditions for "zero defects" and that these are within technical standards
Observing variations in the characteristics of the equipment to prevent defects and acting in anticipation of potential abnormalities
Carrying out equipment engineering studies to identify the equipment elements that have a high incidence on the quality characteristics of the final product, then controlling and intervening on those elements

Principles of Quality Maintenance

The principles on which Quality Maintenance is based are:

Classify defects and identify the circumstances in which they occur, their frequency and their effects
Carry out a physical analysis to identify the equipment factors that generate quality defects
Establish standard values for the characteristics of equipment factors and assess the results through a measurement process
Establish a periodic inspection system for critical characteristics
Prepare maintenance matrices and periodically assess standards

Pillar 7: Maintenance in Administrative Areas

The purpose of this pillar is to reduce the losses that can occur in manual work in the offices. About 80% of the cost of a product is determined in the product design and production system development stages. Productive maintenance in administrative areas helps to avoid losses of information, coordination and accuracy of information, among others. It employs focused improvement techniques, the 5S strategy, autonomous maintenance actions, education and training, and job standardization. It is developed in the administrative areas through individual or team actions.

Pillar 8: Safety, Health and Environment Management

Its purpose is to create a comprehensive safety management system. It employs methodologies developed for the pillars of focused improvement and autonomous maintenance. It contributes significantly to preventing risks that could affect the integrity of people or harm the environment.

Pillar 9: Specials (Monotsukuri)

The purpose of this pillar is to improve the flexibility of the plant, implement postponement technology, level flow, apply Just in Time and other technologies to improve manufacturing processes.

Preventive maintenance models

Preventive maintenance can be applied considering various strategies. The choice of each of them will depend on the economic benefit achieved from its application.

For the modeling and selection of a preventive maintenance policy (convenient from the economic point of view) the following considerations should be taken into account:

The failure rate of the component in question must be increasing
The total cost of the emergency intervention must be higher than the total cost of the preventive intervention
Only two states are possible for the components under analysis: functioning or non-functioning
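The first two conditions can be illustrated with the classic age-replacement cost model, which this document does not describe and which is used here only as an assumed example. A component is replaced preventively at age T or on failure, whichever comes first, and the long-run cost per unit time is C(T) = [Cp R(T) + Cf (1 - R(T))] / (integral of R(t) from 0 to T). The sketch below assumes a Weibull lifetime, whose failure rate increases when the shape parameter beta is greater than 1. All names and figures are hypothetical.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability that a Weibull(beta, eta) component survives beyond time t."""
    return math.exp(-((t / eta) ** beta))


def cost_rate(replace_age, beta, eta, cost_preventive, cost_failure, steps=2000):
    """Long-run cost per unit time of replacing at replace_age or on failure."""
    dt = replace_age / steps
    area = 0.0        # trapezoidal estimate of the expected cycle length
    previous = 1.0    # R(0) = 1
    for i in range(1, steps + 1):
        current = weibull_reliability(i * dt, beta, eta)
        area += 0.5 * (previous + current) * dt
        previous = current
    survival = previous  # R(replace_age)
    expected_cost = cost_preventive * survival + cost_failure * (1.0 - survival)
    return expected_cost / area


def best_replacement_age(candidate_ages, beta, eta, cost_preventive, cost_failure):
    """Return the candidate age with the lowest cost rate."""
    return min(candidate_ages,
               key=lambda age: cost_rate(age, beta, eta, cost_preventive, cost_failure))


ages = range(100, 3001, 100)  # candidate replacement ages, in hours (hypothetical)
print(best_replacement_age(ages, beta=2.5, eta=1000, cost_preventive=100, cost_failure=1000))
print(best_replacement_age(ages, beta=1.0, eta=1000, cost_preventive=100, cost_failure=1000))
```

With an increasing failure rate and an emergency cost well above the preventive cost, the minimum falls at an intermediate age. With a constant failure rate (beta = 1) the cost rate keeps falling as the replacement age grows, so the search returns the largest candidate, which amounts to not replacing preventively at all.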

Maintenance models

A. Corrective Model

This model is the most basic. In addition to the visual inspections and lubrication mentioned above, it includes the repair of breakdowns as they arise. It is applicable, as we shall see, to equipment with the lowest level of criticality, whose breakdowns do not pose any economic or technical problem. For this type of equipment it is not profitable to dedicate greater resources or effort.

B. Conditional Model

It includes the activities of the previous model, plus a series of tests or trials whose results determine the subsequent action. If the tests reveal an anomaly, we schedule an intervention; if, on the contrary, everything is correct, we do not act on the equipment. This maintenance model is valid for equipment that is little used, or for equipment that, despite being important in the production system, has a low probability of failure.

C. Systematic Model

This model includes a set of tasks that we perform regardless of the condition of the equipment. We also carry out some measurements and tests to decide whether to undertake other, larger tasks, and finally we resolve the faults that arise. It is widely applied to equipment of average availability that has a certain importance in the production system and whose breakdowns cause some problems.

It is important to note that equipment subject to a systematic maintenance model does not need to have all its tasks on a fixed periodicity. Rather, equipment under this model can have systematic tasks that are carried out regardless of how long it has been running or of the condition of the elements being worked on. This is the main difference from the two previous models, in which some fault symptom must be present before a task is performed.

An example of equipment subject to this maintenance model is a batch reactor, in which the materials to be reacted are introduced in one go, the reaction takes place, and the reaction product is then extracted before a new load is introduced. Regardless of whether the reactor is duplicated, it must be reliable while in operation, which justifies carrying out a series of tasks whether or not it has shown any failure symptoms.

D. High Availability Maintenance Model

It is the most demanding and exhaustive model of all. It is applied to equipment that under no circumstances may suffer a breakdown or malfunction. This equipment is also required to have very high levels of availability, above 90%. The reason for such a high level of availability is generally the high production cost of a breakdown.

With such a high demand, there is no time for maintenance that requires equipment shutdown (corrective, systematic preventive). Maintaining this equipment requires predictive maintenance techniques, which reveal the status of the equipment while it is running, together with scheduled shutdowns for a complete general overhaul, generally at intervals of a year or longer. In this overhaul, all parts subject to wear or likely to fail during the year (parts with a life of less than two years) are generally replaced. These overhauls are prepared well in advance, and they do not have to be exactly the same year after year.

The objective for this equipment is ZERO FAULTS, so the model does not plan for corrective maintenance, and in general there is no time to correct properly the incidents that do occur. In many cases it is convenient to carry out quick provisional repairs that keep the equipment running until the next overhaul. The annual overhaul must therefore include the resolution of all the provisional repairs made during the year.

Some examples of this maintenance model can be the following:

Turbines for the production of electrical energy
High temperature furnaces, in which an intervention involves cooling and reheating the furnace, with the consequent energy expenditure and associated production losses
Rotating equipment that works continuously
Non-duplicated reactor tanks or reaction tanks, which are the basis of production and must be kept in operation for as many hours as possible
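The 90% availability figure quoted for this model can be checked with the usual steady-state relation, availability = MTBF / (MTBF + MTTR). The document does not state this formula, so the sketch below is an assumption added for illustration, and the function names and figures are hypothetical.

```python
def steady_state_availability(mtbf_hours, mttr_hours):
    """Fraction of time the equipment is available, from mean time between
    failures (MTBF) and mean time to repair (MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def meets_high_availability(mtbf_hours, mttr_hours, threshold=0.90):
    """True when availability is strictly above the threshold (90% by default)."""
    return steady_state_availability(mtbf_hours, mttr_hours) > threshold


# Hypothetical equipment: fails on average every 950 h and takes 50 h to restore.
print(steady_state_availability(950, 50))
print(meets_high_availability(950, 50))
```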

Reliability culture

The culture of reliability can be described in three words:

Focus
Pro-action
Priority

These are essential components of reliability. The questions are "focus on what?" and "pro-act for what?" Priority gives focus and pro-action their direction and support. All three components are extremely important if reliable operations are to produce truly remarkable results.

Can we agree, intellectually speaking, that facilities that focus on the most important issues, and that pro-act to prevent surprises and deviations in effective operation, will be more likely to achieve superior results? The author would like to analyze these three components of reliability from a human aspect because when these do not exist, and the performance is unsatisfactory, clearly the problem is a human issue.

Priority

There is priority when senior management clearly delineates institutional direction and assigns responsibilities. Management must also take into account another important factor: the support mechanisms that facilitate the work of line management. In this way, senior management clearly demonstrates to everyone involved that it sets the direction that production managers follow. In other words, it shows that it "practices what it preaches".

To carry out a necessary cultural change effectively, senior management must focus its efforts by establishing a vision. The wording of the vision becomes extremely important if it is to influence the necessary changes in behavior. It is one thing to say that "we want to achieve a 10% increase in the market in 5 years" but it is better to say that "we will be number one or two in the market with our products in 5 years, or we will no longer be in that business". This was, of course, what General Electric's Jack Welch did.

To establish a priority, senior management must engage in an open discussion about the paradigm shifts necessary to achieve meaningful results. The result will be to agree on what thinking should be changed. Knowing this, senior management can provide the necessary support.

Management will have to expect that some people in the organization will disagree with an expected change in behavior. In reality, if no disagreement, complaint, or "noise" is perceived, no change is taking place.

In short, when a cultural change is needed to perform better, senior management must be part of the process. It needs to examine what thinking and behavior must change, including its own, to start the process. It must set the vision, goals and values it wants the organization to achieve and make the necessary policy changes. It must also provide visible support, seek out agents of change and remove obstacles.

Focus

The focus is the direction of human capacity and energy towards the few important issues and opportunities that result in significant benefits. Now this seems to be so logical that we have to wonder why it is not generally done.

Most industrial facilities have, in-house, the ability to solve most of their problems and yet they continue to suffer from recurring failures. What else, after all, is done daily other than taking care of chronic problems?

Two prevailing beliefs are instrumental in limiting our ability to focus:

Resisting assignments limits our career, even when those assignments obstruct more important work.
To belong, it is important not to object to assigned work, even if it is less important than the work already under way.

Can it be argued that these beliefs are not representative of the thinking of the organization's staff, probably at all levels? This represents a dilemma for most people. Do I work on the many trivial things or challenge my work assignments? The first decision promotes mediocrity, the second can be perceived as insubordination. In reality, challenges to job assignments can polarize relationships between bosses and subordinates.

The answer is to challenge, but to do so in a way that is not perceived as insubordination. A variety of techniques can be used to set priorities. They reduce the challenge to a technique on a piece of paper, allowing the supervisor to see the logic of the challenge. The supervisor can, in turn, modify the priority using their own logic, and can use the document to present the site's view to their own boss if necessary.

Some of the techniques for establishing a reliability focus:

Management Introspection

This approach to focus requires the management group to examine the health of the organization: first establishing a vision of the future together with the values that are to represent the organization, and then holding a day-long introspection on the health of the organization for which they are responsible. Finally, a plan focused on moving the organization forward is developed. If the conclusion is that the organization is unhealthy, as seen in many plant organizations, the result of this session will be a plan with two goals: one is to restore health and the other is to move forward.

Modified Failure Mode and Effects Analysis

Instead of concentrating staff only on the failures that senior management perceives as worrying, or on the most dramatic failure of the day, we need to focus our trained resources on the failures that matter most to meeting and exceeding our financial goals. To achieve this, a very effective technique developed in the aerospace industry has been simplified so that it is easy to apply in the continuous process industry. The result is a method that captures vital information held by field personnel that is generally not found in our data systems.

Consequently, a modified version of the Failure Mode and Effects analysis uses field resources to develop the information that identifies which failures account for 80% of the losses in the facilities. The technique, although somewhat subjective, is very powerful and capable of identifying the few major failures that should undergo Root Cause Failure Analysis.
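Once field personnel have estimated the loss attributable to each failure, picking out the failures that account for 80% of the losses is a simple Pareto calculation. The sketch below assumes that each failure already has an estimated annual loss. The failure names and figures are invented for illustration and do not come from any real plant.

```python
def vital_few(losses_by_failure, share=0.80):
    """Return the failures, largest loss first, that together reach `share`
    of the total loss."""
    total = sum(losses_by_failure.values())
    if total <= 0:
        raise ValueError("total loss must be positive")
    ranked = sorted(losses_by_failure.items(), key=lambda item: item[1], reverse=True)
    selected = []
    cumulative = 0.0
    for name, loss in ranked:
        if cumulative >= share * total:
            break  # the target share is already covered
        selected.append(name)
        cumulative += loss
    return selected


# Hypothetical annual losses (in thousands) estimated by field personnel.
losses = {
    "pump seal leak": 500,
    "bearing seizure": 300,
    "valve sticking": 100,
    "gasket blowout": 60,
    "sensor drift": 40,
}
print(vital_few(losses))
```

The failures returned are the candidates for Root Cause Failure Analysis. The rest stay on the list but do not compete for the same trained resources.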

Decisions in pairs

By tradition, as employees of a company, orders to do work come from our bosses. It is also traditional that objections to such orders are generally not tolerated. Since traditions are our paradigms, they have the effect of promoting mediocrity. They also pose a dilemma for employees… "Do I challenge job assignments or continue to work on trivial cases?" Decisions in pairs is a technique that provides a vehicle for challenging job assignments in an impersonal way. It also allows a list of jobs that require attention to be prioritized by comparing each job to each of the other jobs to be performed and then sorting the list according to how often a particular job is selected.
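The comparison described above can be sketched as follows: every job is compared once with every other job, the job chosen in each pair earns a point, and the list is sorted by points. The job names and the recorded choices are hypothetical, and the helper names are assumptions made for this example.

```python
from itertools import combinations


def rank_by_paired_decisions(jobs, prefer):
    """Rank jobs by how often each one is chosen in pairwise comparisons.

    `prefer(a, b)` must return whichever of the two jobs is judged more important.
    """
    wins = {job: 0 for job in jobs}
    for a, b in combinations(jobs, 2):  # each pair is compared exactly once
        wins[prefer(a, b)] += 1
    ranking = sorted(jobs, key=lambda job: wins[job], reverse=True)
    return ranking, wins


jobs = ["calibrate gauges", "replace pump seal", "repaint handrails", "fix steam leak"]

# Hypothetical decisions recorded by the team, one per pair of jobs.
choices = {
    frozenset(("calibrate gauges", "replace pump seal")): "replace pump seal",
    frozenset(("calibrate gauges", "repaint handrails")): "calibrate gauges",
    frozenset(("calibrate gauges", "fix steam leak")): "fix steam leak",
    frozenset(("replace pump seal", "repaint handrails")): "replace pump seal",
    frozenset(("replace pump seal", "fix steam leak")): "fix steam leak",
    frozenset(("repaint handrails", "fix steam leak")): "fix steam leak",
}

ranking, wins = rank_by_paired_decisions(jobs, lambda a, b: choices[frozenset((a, b))])
print(ranking)
print(wins)
```

Because each decision is only ever between two jobs, the resulting list can be shown to a supervisor as the outcome of a procedure rather than as a personal objection.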

Priority Matrix

The Priority Matrix is a two-dimensional technique. This means that instead of comparing the importance of a job with the importance of other jobs, we can rank them based on the impact of a job as well as the ease of performing that job.
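A minimal sketch of such a matrix follows, assuming that each job is scored from 1 to 5 for impact and for ease. The scoring scale, the threshold and the quadrant labels are assumptions made for this example and are not taken from the text.

```python
def priority_quadrant(impact, ease, threshold=3):
    """Place a job in one of four quadrants of the impact/ease matrix."""
    high_impact = impact >= threshold
    easy = ease >= threshold
    if high_impact and easy:
        return "do first"
    if high_impact:
        return "plan as a project"
    if easy:
        return "do when convenient"
    return "defer"


def rank_jobs(scores):
    """Order jobs by impact, then by ease, both highest first.

    `scores` maps a job name to an (impact, ease) pair.
    """
    return sorted(scores, key=lambda job: scores[job], reverse=True)


# Hypothetical jobs scored as (impact, ease) on a 1-5 scale.
scores = {
    "fix steam leak": (5, 4),
    "replace pump seal": (5, 2),
    "repaint handrails": (1, 5),
    "rewire spare panel": (2, 1),
}
for job in rank_jobs(scores):
    impact, ease = scores[job]
    print(f"{job}: {priority_quadrant(impact, ease)}")
```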

When we allow subordinates to question priorities, we are in effect allowing boundaries to be challenged and opening our plants to real progress.

“… Leaders are using standards that allow them to get six years of average life for their pumps instead of two years, which is generally considered acceptable”

Proaction

Pro-action is any activity of improvement, vision and/or execution that prevents human, equipment and process failures or that mitigates the consequences of a failure.

Conclusions

Reliability engineering (RE) is a set of techniques, methods and knowledge that help us determine the degree of effective response that a system will have during a given period of time.

This new vision of reliability has led to the creation of a new culture of reliability in organizations, which consider systems, whatever they may be, as a whole, and try to compile the minimum requirements for the system to obtain optimal results.

RE relies on mathematical methods to make forecasts, calculate trends, and estimate costs and degrees of reliability. It is also evaluated, supervised and supported by international organizations.

Sound knowledge management, together with knowledge of the organization's environment, is crucial for mapping the company's scope and for applying the reliability engineering methods best suited to keeping the organization functioning properly.

Bibliography

Amendola, L. (n.d.). Reliability From Design. Valencia, Spain.
Broome, D. W. (n.d.). Manual of the Certified Engineer in Reliability.
Calidad, S.A. (n.d.). Reliability Engineering Course.
Durán, M. J. (2003). The Woodhouse Partnership Limited.
IG Group SAS (n.d.). Obtained from http://www.iggroupla.com/capacitaciones/certificacion.html#quees
IMR Consulting LLC (n.d.). Asset integrity management & reliability. Obtained from http://imrconsulting.net/?page_id=39&lang=es
Musa, J. D. (2004). Software Reliability Engineering: More Reliable Software Faster and Cheaper (2nd ed.). AuthorHouse.
O'Connor, P. D. (n.d.). Practical Reliability Engineering. John Wiley & Sons Ltd.

Thesis proposal

Administrative restructuring of the equipment maintenance department of the Technological Institute of Orizaba through the application of reliability engineering.

Objective

Administratively restructure the equipment maintenance department to ensure service and quality standards under the regulations that govern the Technological Institute of Orizaba.