This application claims priority to U.S. Provisional Application No. 61/224,372, which was filed Jul. 9, 2009 and which is fully incorporated herein by reference.

BACKGROUND
The present disclosure relates to methods and systems for managing failover events in computer servers.
2. Description of Related Art
Failover is a backup operational mode in which the functions of a system component (such as a processor, server, network, or database, for example) are assumed by secondary system components when the primary component becomes unavailable through either failure or scheduled down time. Failover protocols and components are often implemented for mission-critical systems that must be constantly available or for busy commercial systems in which down time causes substantial lost income or expenses. Failover procedures transfer tasks to a standby system component so that failover is not discernible, or is only minimally discernible, to the end user. Failover events are triggered when a failure of a primary component is detected, causing the failover protocol to be implemented to shift tasks to the designated back-up component. Automatic failover procedures maintain normal system functions despite unplanned equipment failures.
Modern server systems, such as the storage area network (SAN), enable any-to-any connectivity between servers and data storage systems but require redundant sets of all the components involved. Multiple connection paths, each with redundant components, are used to help ensure that the connection is still viable even if one (or more) paths fail. Multiple redundancies increase the chances that a failover event will not be disruptive, but increase system capital and operating costs. In systems with fewer redundancies, or where it is deemed undesirable to maintain more than one active component at any given instant of time, rapid failure detection is key to triggering a seamless failover response. Current failover systems rely on monitoring the “heartbeat” of a primary component by the backup component; that is, the components communicate by exchanging packets called “heartbeats” at regular periods, for example approximately 2 Hz (twice per second). If the primary component fails, the secondary node will fail to receive the expected heartbeat packet, which immediately triggers a transfer of the primary role to the backup component. This prevents file corruption caused by multiple competing primary components. Notwithstanding the advantages of present failover systems, it is desirable to provide an improved failover system and procedures.

SUMMARY
The present technology provides for automatic failover in systems having at least one primary (active) and at least one secondary (stand-by) server, and at least one (typically two or more) routing appliances routing traffic to the primary and secondary servers. The system may be implemented over a public network, for example, a network using TCP/IP protocol. The primary and secondary servers are configured with identical applications, and maintain duplicate databases of all data required for operation that are synchronized continuously or periodically at all times both servers are active. Each routing appliance maintains IP addresses for the primary and secondary servers. These IP addresses are configurable at any desired time, for example via the system's enterprise messaging system (EMS), which enables asynchronous (queued) messaging between system applications. One of the IP addresses is indicated as the primary server and the other IP address is indicated for the secondary server. The routing appliance maintains a secure tunnel to both IP addresses, but routes all incoming traffic to the IP address of the currently designated primary server.
Each routing appliance actively monitors the operational state of the currently designated primary server and the secondary server, using keep-alive signals. A keep-alive signal, also called a heartbeat, may be transmitted to each routing appliance every 200 ms or less from the primary and secondary servers, respectively. In addition, the secure tunnels may be maintained using a periodic challenge/response protocol as available, for example, in OpenVPN™ using TLS authentication.
The secondary server also actively monitors the operational state of the primary server. The primary server generates and transmits the keep-alive signal to the secondary server in parallel with signals sent to the routing appliances. Therefore, when a failure of the primary server occurs, the routing appliances and the secondary server detect the failure virtually simultaneously.
In response to detecting failure of the primary server, each routing appliance switches all traffic to the secondary server, which assumes the role of the primary server. This is quickly accomplished through use of the secure tunnel maintained from each routing appliance to the secondary server. In parallel, the routing appliance makes repeated attempts to establish a secure tunnel to the failed primary server. When the failed server comes back online, the routing appliance will establish a secure tunnel to the recovered server, but does not route traffic through it. The failed and recovered server instead becomes the new stand-by server for the former secondary server, to which traffic is now being routed through the stand-by secure tunnels by the routing appliances.
In concert with the switching by the routing appliances, when the secondary server detects that its primary peer has failed, the secondary server takes over as the active server. For example, a secure proxy application on the secondary server communicates with a network (Web) application also on the secondary server. Also, the secondary (now active) server continuously monitors for the failed server to recover, and upon detecting recovery to a ready state, restores database synchronization procedures to restore data mirroring between the active and stand-by servers.
Once the transfer to the stand-by server is executed, all system components and clients serviced by the system should be communicating with the former stand-by server now assuming the active server role, via the routing appliances. Configuration of the routing appliances and the use of the secure tunnels between the servers and routing appliances are important features for ensuring optimum operability of the failover system. Use of the secure tunnels may allow system architecture to extend beyond the subnet (sub-network) level, that is, beyond the component computers and devices that have a common, designated IP address routing prefix. Routing appliances beyond the subnet are able to recognize the origin of server status keep-alive signals received via the secure tunnels. After configuration, status messages received via the secure tunnels can be readily recognized by any routing appliance, regardless of network location. When a failure occurs, routing appliances that are distant (in network topology terms) from the primary and secondary servers can quickly reroute traffic to the stand-by server. Also, operational overhead is reduced because packet exchange is not required, and the quantity of routing appliances in use is consequently more easily scaled up.
A more complete understanding of the failover procedures and system will be afforded to those skilled in the art, as well as a realization of additional advantages and objects thereof, by a consideration of the following detailed description. Reference will be made to the appended sheets of drawings which will first be described briefly.

BRIEF DESCRIPTION OF THE DRAWINGS
Throughout the several figures and in the specification that follows, like element numerals are used to indicate like elements appearing in one or more of the figures.

DETAILED DESCRIPTION
The present technology provides for implementing failover in a redundant computer system, using failure monitoring and coordinated switching in response to detection of failure.
EMS component 112 may be used to exchange information with other server components and with at least one routing appliance 114, or at least two routing appliances 114, 116. Any non-zero number of routing devices may be used. If multiple routers are used, each may be configured as described herein for the at least one router 114. Each routing appliance maintains IP addresses for the first and second servers 102, 104 in internal memory. These stored addresses may be configurable via EMS 112. Each routing device 114 may further include a failure detection algorithm that operates to detect a failure of the active server, in response to which the router switches traffic to the formerly stand-by server, e.g., server 104. The former stand-by server becomes the new active server until it fails, while after the former active server 102 comes back on line, it becomes the new stand-by server.
System 100 may further include one or more switches 118 connected to both servers 102, 104 and to a plurality of clients 120, 122 each running a redundant failover application. Clients 120, 122 manage failover events via switch 118. Routing devices 114, 116 may communicate with the active server 102 via a VPN secure tunnel implemented over a public network. The routing devices also maintain a secure tunnel to the stand-by server 104 ready for use, but do not route traffic through this tunnel until a failover event occurs. Routing devices 114, 116 may be located outside of the subnet that contains servers 102 and 104. In the alternative, one or more of the routing devices may be located inside the subnet. The active server generates a periodic keep-alive signal at a defined frequency, for example ‘N’ times per second, where ‘N’ is a number between 1 and 100, for example, 5. This keep-alive signal may be transmitted to the routing appliances 114, 116. The stand-by server may similarly provide a keep-alive signal to these appliances. Servers 102, 104 may exchange a keep-alive signal or equivalent status message at a lower frequency, for example, once per second. The active server transmits the keep-alive signal via a secure communication channel to each routing device 114, 116 and to the standby server 104. The receiving device recognizes the keep-alive signal and thereby confirms that no failure condition has occurred. If the receiving device fails to receive a status message within a defined time period of the last status message received, then the device may treat this as indicating a failure condition on the server and initiate failover procedures.
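By way of non-limiting illustration, the timeout test applied by a receiving device may be sketched as follows. The class name `HeartbeatMonitor`, its fields, and the allowance of three missed signals are assumptions introduced for this example only; the disclosure specifies only that a failure may be inferred when no status message arrives within a defined time period of the last one.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks keep-alive arrivals from one server (hypothetical helper)."""

    timeout_s: float          # the "defined time period"; its value is an assumption
    last_seen_s: float = 0.0  # arrival time of the last keep-alive, in seconds

    def record(self, now_s: float) -> None:
        """Note that a keep-alive signal arrived at time now_s."""
        self.last_seen_s = now_s

    def has_failed(self, now_s: float) -> bool:
        """Return True when no keep-alive arrived within the defined period."""
        return (now_s - self.last_seen_s) > self.timeout_s


# Example: N = 5 signals per second (200 ms interval). Tolerating three
# missed signals before declaring failure is an assumed policy.
interval_s = 1.0 / 5
monitor = HeartbeatMonitor(timeout_s=3 * interval_s)
monitor.record(now_s=10.0)
print(monitor.has_failed(now_s=10.4))  # checked 400 ms after the last signal
print(monitor.has_failed(now_s=10.7))  # checked 700 ms after the last signal
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the sketch deterministic; a deployed monitor would more likely read a monotonic clock.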
In alternative embodiments, keep-alive signals from the primary or secondary server may include a hardware fingerprint for the originating device. As used herein, a hardware fingerprint is data characterized by being reliably reproducible by an application operating on a particular client machine, while being virtually irreproducible by any other means, and virtually impossible to guess using any systematic or brute force algorithm. In short, each fingerprint is a complex and unique data pattern generated from a stable system configuration of a particular client machine. Methods for generating a machine fingerprint are discussed in the detailed description below. Hence, the fingerprinted status message confirms operational status of the originating server. The server may generate and transmit these status messages according to a designated frequency (for example, 12 times per hour, or less frequently) to each routing appliance. Each message may consist of or comprise a brief header, time stamp, and machine fingerprint. The machine fingerprint positively identifies the source of the status message. No packet exchange is required.
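For illustration only, a status message made up of a brief header, a time stamp, and a machine fingerprint might be packed and checked as in the following sketch. The four-byte header value, the 32-byte fingerprint length, the field order, and both function names are assumptions made for this example and are not specified by the disclosure.

```python
import hmac
import struct
import time
from typing import Optional

HEADER = b"KA01"       # hypothetical 4-byte header value
FINGERPRINT_LEN = 32   # assumes a 32-byte fingerprint, e.g. a SHA-256 digest
# Network byte order: header, time stamp (8-byte float), fingerprint.
_FORMAT = "!4sd%ds" % FINGERPRINT_LEN


def build_status_message(fingerprint: bytes,
                         timestamp: Optional[float] = None) -> bytes:
    """Pack header, time stamp, and fingerprint into one fixed-size message."""
    if len(fingerprint) != FINGERPRINT_LEN:
        raise ValueError("unexpected fingerprint length")
    ts = time.time() if timestamp is None else timestamp
    return struct.pack(_FORMAT, HEADER, ts, fingerprint)


def is_valid_status_message(message: bytes,
                            expected_fingerprint: bytes) -> bool:
    """Check size, header, and fingerprint against the stored copy."""
    if len(message) != struct.calcsize(_FORMAT):
        return False
    header, _timestamp, fingerprint = struct.unpack(_FORMAT, message)
    # compare_digest avoids leaking the match position through timing.
    return header == HEADER and hmac.compare_digest(fingerprint,
                                                    expected_fingerprint)


stored = bytes(range(32))  # placeholder fingerprint held by the routing appliance
message = build_status_message(stored)
print(is_valid_status_message(message, stored))
```

Because the receiver only validates what it receives and sends no reply, the sketch is consistent with the statement that no packet exchange is required.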
At least one application running on each of servers 102, 104 or otherwise accessing each server's hardware and file system may generate a machine fingerprint. This may be done whenever a system server is configured or booted up, and a copy of the fingerprint saved in a temporary storage location for use in generating status messages as described above for alternative embodiments. The application may generate the fingerprint data using a process that operates on data indicative of the server's configuration and hardware. The fingerprint data may be generated using user-configurable machine parameters, non-user-configurable machine parameters, or both as input to a process that generates a fingerprint data file as binary data.
Each machine parameter indicates a state or identifier for a hardware component, software component, or data component of the server. To obtain stable fingerprint data, relatively stable or static machine parameters should be selected. The machine parameters may be selected such that the resulting fingerprint data has a very high probability (e.g., greater than 99.999%) of being unique to the server. In addition, the machine parameters may be selected such that the fingerprint data includes at least a stable unique portion up to and including the entire identifier that has a very high probability of remaining unchanged during normal operation of the server. The resulting fingerprint data should be highly specific, unique, reproducible and stable as a result of properly selecting the machine parameters.
For example, a server may comprise a motherboard on which reside a CPU and one or more auxiliary processors. The CPU may comprise a cache memory in communication with a random access memory (RAM). A video processor may communicate with these components via a Northbridge hub and provide video data through video RAM to a display device. Other components may communicate with the CPU via a Southbridge hub, such as, for example, a BIOS read-only memory or flash memory device, one or more bus bridges, a network interface device, and a serial port. Each of these and other components may be characterized by some data or parameter settings that may be collected using the CPU and used to characterize the server device.
To generate the fingerprint, a fingerprint application operating on the server may perform a system scan to determine a present configuration of the computing device. The application may then select the machine parameters to be used as input for generating the unique fingerprint data. Selection of parameters may vary depending on the system configuration. Once the parameters are selected, the application may generate the identifier.
Illustrative examples of various machine parameters that may be accessible to an application or applications running on or interacting with a processor of the client machine include: machine model; machine serial number; machine copyright; machine ROM version; machine bus speed; machine details; machine manufacturer; machine ROM release date; machine ROM size; machine UUID; and machine service tag. For further example, these machine parameters may include: CPU ID; CPU model; CPU details; CPU actual speed; CPU family; CPU manufacturer; CPU voltage; and CPU external clock; memory model; memory slots; memory total; and memory details; video card or component model; video card or component details; display model; display details; audio model; and audio details; network model; network address; Bluetooth address; BlackBox model; BlackBox serial; BlackBox details; BlackBox damage map; BlackBox volume name; NetStore details; and NetStore volume name; optical drive model; optical drive serial; optical details; keyboard model; keyboard details; mouse model; mouse details; printer details; and scanner details; baseboard manufacturer; baseboard product name; baseboard version; baseboard serial number; and baseboard asset tag; chassis manufacturer; chassis type; chassis version; and chassis serial number; IDE controller; SATA controller; RAID controller; and SCSI controller; port connector designator; port connector type; port connector port type; and system slot type; cache level; cache size; cache max size; cache SRAM type; and cache error correction type; fan; PCMCIA; modem; portable battery; tape drive; USB controller; and USB hub; device model; device model IMEI; device model IMSI; and device model LCD; wireless 802.11; webcam; game controller; silicon serial; and PCI controller; machine model, processor model, processor details, processor speed, memory model, memory total, network model of each Ethernet interface, network MAC address of each Ethernet interface, BlackBox Model, BlackBox Serial (e.g., using Dallas Silicon Serial DS-2401 chipset or the like), OS install date, nonce value, and nonce time of day. The foregoing examples are merely illustrative, and any suitable machine parameters may be used.
Because many servers are mass-produced, using hardware parameters limited to the server box may not always provide the desired level of assurance that a fingerprint is unique. Use of user-configurable parameters may ameliorate this risk considerably, but at the cost of less stability. In addition, sampling of physical, non-user-configurable properties for use as parameter input may also lessen the risk of generating duplicate fingerprint data. Physical device parameters available for sampling may include, for example, unique manufacturer characteristics, carbon and silicon degradation, and small device failures.
Measuring carbon and silicon degradation may be accomplished, for example, by measuring a processor chip's performance in processing complex mathematical computations, or its speed in response to intensive time variable computations. These measurements depend in part on the speed with which electricity travels through the semiconductor material from which the processor is fabricated. Using variable offsets to compensate for factors such as heat and additional stresses placed on a chip during the sampling process may allow measurements at different times to reproduce the expected values within a designated degree of precision. Over the lifetime of the processor, however, such measurements may change due to gradual degradation of the semiconductor material. Recalibration or rewriting the fingerprint data may be used to compensate for such changes.
In addition to the chip benchmarking and degradation measurements, the process for generating fingerprint data may include measuring physical, non-user-configurable characteristics of disk drives and solid state memory devices. For example, each data storage device may have damaged or unusable data sectors that are specific to each physical unit. A damaged or unusable sector generally remains so, and therefore a map of damaged sectors at a particular point in time may be used to identify a specific hardware device later in time. Data of this nature may also be included in a fingerprint file.
The application may read parameters from operating system data files or other data stored on the server, or actively obtain the parameters by querying one or more hardware components in communication with a processor on which the application is operating. A processor provided with at least one application operating in this fashion to gather the machine parameters may comprise a means for collecting and generating fingerprint data.
In addition, the process of generating a working machine fingerprint may include at least one irreversible transformation, such as, for example, a cryptographic hash function, such that the input machine parameters cannot be derived from the resulting fingerprint data. Each fingerprint data, to a very high degree of certainty, cannot be generated except by the suitably configured application operating or otherwise having had access to the same computing device for which the fingerprint data was first generated. Conversely, each fingerprint, again to a very high degree of certainty, can be successfully reproduced by the suitably configured application operating or otherwise having access to the same computing device on which the identifier was first generated.
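As a minimal sketch of such an irreversible transformation, selected machine parameters may be reduced to fixed-length fingerprint data with a cryptographic hash. SHA-256, the length-prefixed encoding, the function name `generate_fingerprint`, and the sample parameter values below are all assumptions chosen for this example; the disclosure does not prescribe a particular hash function or encoding.

```python
import hashlib
from typing import Mapping


def generate_fingerprint(parameters: Mapping[str, str]) -> bytes:
    """Hash the selected machine parameters into 32 bytes of fingerprint data."""
    digest = hashlib.sha256()
    # Sorting by name makes the result independent of collection order.
    for name in sorted(parameters):
        for field in (name, parameters[name]):
            data = field.encode("utf-8")
            # A length prefix keeps adjacent fields from running together.
            digest.update(len(data).to_bytes(4, "big"))
            digest.update(data)
    return digest.digest()


# Hypothetical parameter values, for illustration only.
sample = {
    "machine UUID": "00000000-0000-0000-0000-000000000001",
    "CPU model": "example-cpu",
    "network MAC address": "00:00:5e:00:53:01",
}
print(generate_fingerprint(sample).hex())
```

The same parameters always yield the same fingerprint data, which provides the reproducibility described above, while a change to any single parameter yields a different result. How hard the inputs are to guess depends on the parameters selected and is not guaranteed by the hash alone.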
Failure monitoring and detection may be handled by the routing appliances and standby servers in systems 200 and 300 as in system 100. Active servers generate and transmit a keep-alive signal at a defined frequency. In the multiple server embodiments, the slave active servers run the secure proxy application only, which communicates with the web application operating on the primary active server. In the event the primary server goes down, the active slave (whether primary slave 204/304 or secondary slave 208/308, if active) is configured to communicate with the web application residing on the secondary server 206.
In all systems 100, 200 and 300, communication from system computers 201, 301 should not be disrupted in the event that any active server goes down, whether slave or not.
Initially, first and second servers are configured 402 as duplicate peer servers, including at least an application for servicing network requests and optionally, a database, enterprise messaging system, and proxy server applications. The servers are configured to replicate the database of the active peer. Only one of the servers is allowed to be active at any one time; the other operates in a standby mode. Each peer is able to switch from standby to active mode in response to detecting that its peer has failed or gone down. Each peer's default mode after resetting is stand-by mode.
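The mode rules for the peer servers may be sketched, under stated assumptions, as a small state holder. The names `Mode`, `PeerServer`, `designate_active`, and `on_peer_failure_detected` are hypothetical and introduced only for this example; database replication and the detection mechanism itself are omitted.

```python
from enum import Enum


class Mode(Enum):
    STANDBY = "standby"
    ACTIVE = "active"


class PeerServer:
    """One of two duplicate peers (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mode = Mode.STANDBY  # default mode after any reset

    def reset(self) -> None:
        """A reset or reboot always returns the peer to stand-by mode."""
        self.mode = Mode.STANDBY

    def designate_active(self) -> None:
        """Initial designation of the active peer at configuration time."""
        self.mode = Mode.ACTIVE

    def on_peer_failure_detected(self) -> None:
        """Take over as the active server when the other peer has failed."""
        self.mode = Mode.ACTIVE


first, second = PeerServer("first"), PeerServer("second")
first.designate_active()           # one peer starts in active mode
second.on_peer_failure_detected()  # the stand-by peer detects the failure
first.reset()                      # the failed peer recovers in stand-by mode
print(first.mode, second.mode)
```

Because a recovered peer always restarts in stand-by mode, it cannot compete with the peer that took over, which is consistent with allowing only one active server at a time.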
Each peer may optionally generate a machine fingerprint 404 at initialization, such as after any reset or rebooting event. These fingerprints may be generated in any suitable manner as described herein. Each peer stores its own fingerprint in a local memory. Initially, one of the peers operates in active mode and the other operates in standby mode. The active server may transmit the machine fingerprint to the routing devices at a defined frequency. The standby server may also transmit its fingerprint.
Routing appliances are configured 406 by storing the IP addresses of the active and stand-by servers and receiving, at system initialization, an indication of which server is designated the active server. The routers may be configured with instructions for receiving keep-alive signals, detecting server failure using these signals, and initiating failover procedures in response to detecting server failure. In some embodiments, the routers may also receive a copy of each server fingerprint and may be configured with instructions for receiving the fingerprint-signed status messages from the servers. At 408, the router devices establish secure communication channels to each server, using a secure VPN tunnel. Each router is configured to route all traffic to the currently designated active server, although secure channels are maintained to both.
Each routing device and the standby server monitor current status of the active server by receiving status messages and detecting when the status message stream is interrupted. If the router detects an interruption in the message stream 412, the router switches 414 the server status such that the stand-by server becomes the new active server. The router routes all traffic 416 to the new active server, establishes a secure communication channel 408 with the other server as soon as it comes back on line, and resumes status monitoring 410.
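The monitoring, switching, and routing steps described above may be sketched as follows. The class `RoutingAppliance`, its method names, the 0.6 second timeout, and the example addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration; establishing the secure tunnels is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass
class RoutingAppliance:
    """Keeps both server addresses and routes to the active one (hypothetical)."""

    active_ip: str
    standby_ip: str
    timeout_s: float            # allowed gap in the status message stream
    last_status_s: float = 0.0  # arrival time of the last status message

    def on_status_message(self, source_ip: str, now_s: float) -> None:
        """Record a status message received from the active server."""
        if source_ip == self.active_ip:
            self.last_status_s = now_s

    def check(self, now_s: float) -> bool:
        """Swap server roles if the stream is interrupted; True on failover."""
        if now_s - self.last_status_s <= self.timeout_s:
            return False
        self.active_ip, self.standby_ip = self.standby_ip, self.active_ip
        # Assumption: restart the monitoring window for the new active server
        # so that the appliance does not immediately switch back.
        self.last_status_s = now_s
        return True

    def next_hop(self) -> str:
        """All traffic is routed to the currently designated active server."""
        return self.active_ip


router = RoutingAppliance(active_ip="192.0.2.10", standby_ip="192.0.2.20",
                          timeout_s=0.6)
router.on_status_message("192.0.2.10", now_s=1.0)
router.check(now_s=1.5)   # stream still current, no change
router.check(now_s=2.0)   # stream interrupted, roles are swapped
print(router.next_hop())
```

After the swap, the failed server's address is retained as the stand-by address, so the appliance knows where to re-establish a secure channel once that server comes back on line.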
Based on the foregoing, a failover procedure according to an exemplary embodiment of the invention may be expressed as a series of salient process steps, as follows: a step for routing traffic from a routing device to a first server, a step for storing in the routing device data representing a fingerprint of the first server, a step for receiving periodically at the routing device a status message from the first server, a step for detecting at the routing device an invalid status message from the first server by absence of the fingerprint in a status message from the first server within a predetermined time period after last receiving a valid status message at the routing device, and a step for routing the traffic from the routing device to a second server in response to detecting the invalid status message from the first server. The steps may be performed in a sequence other than the sequence expressed without departing from the invention.
Having thus described a preferred embodiment of a method and system for failover management in a server system, it should be apparent to those skilled in the art that certain advantages of the within system have been achieved. It should also be appreciated that various modifications, adaptations, and alternative embodiments thereof may be made without departing from the scope and spirit of the present technology. The following claims define the scope of what is claimed.
As used in this application, the terms “component,” “module,” “system,” and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component can be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components can communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems by way of the signal).
It is understood that the specific order or hierarchy of steps in the processes disclosed herein is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged while remaining within the scope of the present disclosure. The accompanying method claims present elements of the various steps in sample order, and are not meant to be limited to the specific order or hierarchy presented.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, methods and algorithms described in connection with the examples disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, methods and algorithms have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.