A recent focus of our technical staff has been the creation of a highly fault-tolerant GPWA Approved Portal seal serving service. We designed the service so it is fully redundant at every level and heals itself in real time when any component fails.
We’ve now accomplished the goals we set for ourselves, so I think it is appropriate to share with everyone the inside details of what we have done. A diagram giving an overview of the service components follows, along with a narrative description of how they fit together. I think our technical staff did a great job, and I hope you agree.
GPWA Fault-Tolerant Seal Service Components
All requests for GPWA seal services are made using the host name certify.gpwa.org. All traffic we receive for this host name is directed to an IP address that is normally serviced by our “Master Loadbalancer.” The Master Loadbalancer passes the requests it receives to either Seal Server 1 or Seal Server 2, as shown in the diagram, balancing the workload between these two systems. If either Seal Server becomes inoperative for any reason, the Master Loadbalancer removes it from the mix within a few seconds. Later, when the Seal Server resumes normal operation, it is automatically added back into the mix of servers actively processing seal requests under the management of the Master Loadbalancer.
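To make the add-and-remove behavior concrete, here is a minimal sketch in Python. It is an illustration only, not our production configuration: the class names, the `mark` and `pick` methods, and the round-robin selection policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One seal server as seen by the load balancer (hypothetical model)."""
    name: str
    healthy: bool = True


class BalancerPool:
    """Hands out healthy backends in round-robin order (assumed policy)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = 0

    def mark(self, name, healthy):
        # Called with the result of a health check: an unhealthy server
        # leaves the mix, a recovered one rejoins it.
        for backend in self.backends:
            if backend.name == name:
                backend.healthy = healthy

    def pick(self):
        # Only servers currently marked healthy are eligible.
        active = [b for b in self.backends if b.healthy]
        if not active:
            raise RuntimeError("no healthy seal servers")
        backend = active[self._next % len(active)]
        self._next += 1
        return backend.name
```

With two healthy servers, successive calls to `pick()` alternate between them. After `mark("seal1", False)` every request goes to the remaining server, and `mark("seal1", True)` puts the first one back into rotation.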
When a Seal Server receives a request, it queries the database server containing seal information (the “Seal Database”). The referrer information and URL of the seal request are used to determine the site on which the seal is to be served. If the site is authorized, the size of the seal returned is based on the configuration information stored for that site in our Seal Database. If the site is not authorized to display the requested seal, a single-pixel image is returned instead. If the Seal Database is temporarily unavailable, locally cached information saved on the Seal Server is used to process the request. The responses from the Seal Servers are sent directly back to the Internet, bypassing the Master Loadbalancer on the return trip.
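The request handling described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual service code: it keys the lookup on the referrer host only, and the names `SealServer`, `DatabaseUnavailable`, `lookup`, and the returned file names are hypothetical. It also assumes that a site with no cached entry gets the single-pixel image while the database is down.

```python
from urllib.parse import urlparse

PIXEL = "1x1.gif"  # placeholder name for the single-pixel image


class DatabaseUnavailable(Exception):
    """Raised by the lookup function when the Seal Database cannot be reached."""


class SealServer:
    def __init__(self, lookup):
        # lookup: callable taking a host name and returning the configured
        # seal size (for example "120x60"), or None if the site is not authorized.
        self.lookup = lookup
        self.cache = {}  # local copy of the most recent answers

    def handle(self, referrer):
        host = urlparse(referrer).hostname or ""
        try:
            size = self.lookup(host)
            self.cache[host] = size  # refresh the local cache on success
        except DatabaseUnavailable:
            # Fall back to whatever was cached locally for this site.
            size = self.cache.get(host)
        if size is None:
            return PIXEL
        return f"seal-{size}.png"
```

The point of the design is that a database outage degrades to slightly stale answers for known sites instead of failed requests.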
There is also a Slave Loadbalancer that monitors the Master Loadbalancer using Heartbeat Traffic. If the Master Loadbalancer becomes inoperative, the Slave Loadbalancer takes over its function within a fraction of a second. Later, when the Master Loadbalancer becomes operative again, control is automatically returned to it.
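The heartbeat logic can be pictured with a small sketch like the one below. It is a toy model under assumptions, not the mechanism our loadbalancers actually run: the `StandbyMonitor` class, its methods, and the timeout value are hypothetical, and time is passed in explicitly to keep the example easy to follow.

```python
class StandbyMonitor:
    """Decides which loadbalancer should own the service address."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds of silence tolerated from the master
        self.last_heartbeat = None  # time of the most recent heartbeat

    def heartbeat(self, now):
        # Record a heartbeat received from the master at time `now`.
        self.last_heartbeat = now

    def active_node(self, now):
        # The master stays in charge while its heartbeats are recent enough;
        # otherwise the slave takes over until heartbeats resume.
        if (self.last_heartbeat is not None
                and now - self.last_heartbeat <= self.timeout):
            return "master"
        return "slave"
```

For example, with `timeout=0.5`, a heartbeat at time 0.0 keeps the master active at 0.2, the slave is active at 1.0, and a fresh heartbeat at 1.1 hands control back to the master.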
The Loadbalancers, Seal Servers, and Seal Database all exist within a carefully designed, fully redundant network infrastructure with no single point of failure. Upstream Internet connectivity to four major Internet backbone providers is provided via a pair of redundant Edge Routers, and all internal servers are connected to the Edge Routers via redundant Core Routers. Every server has redundant power supplies connected to independent uninterruptible power supplies, which are backed up by emergency power generation equipment.
So there you have it in a nutshell. The new, fault-tolerant, GPWA Approved Portal seal serving service!
Michael