This page describes the ePages Web Interface introduced in 6.17.0.

Request Processing

Web Browser (Client)

The client resolves the DNS name of the web server, then sends a request to the IP address of the web server via TCP using the HTTP protocol (usually on port 80). It then waits for the response.

Web Server

After traveling through the Internet and passing the firewall and an optional load balancer, the request arrives at the web server. Based on the URL, the web server decides whether the requested resource is a static file or dynamic content. If the URL matches the pattern /epages/*.(sf|mobile|admin), the request is passed on to the Web Adapter. For short URLs the process is slightly more complicated.
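The static/dynamic decision can be illustrated with a small regular expression. This is only a sketch: it assumes that the "*" in the pattern stands for a single path segment (the shop alias) and that only the path without the query string is tested. The name is_dynamic is hypothetical and not part of ePages.

```python
import re

# Assumption: "*" matches exactly one path segment; the real web server
# rule may be broader or be expressed in the server's own rewrite syntax.
DYNAMIC_URL = re.compile(r"^/epages/[^/]+\.(sf|mobile|admin)(/.*)?$")


def is_dynamic(path: str) -> bool:
    """Return True if the request path would be handed to the Web Adapter."""
    return DYNAMIC_URL.match(path) is not None


if __name__ == "__main__":
    for path in ("/epages/DemoShop.sf", "/epages/DemoShop.admin/x", "/WebRoot/logo.png"):
        print(path, is_dynamic(path))
```

Everything that does not match would be served as a static file in this sketch; short URLs are not covered.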

Web Adapter

  1. asks the ASPoolDBCacheServer for the responsible Request Router(s) via UDP

  2. asks the Request Router for a free Application Server via UDP

  3. forwards the HTTP request to the Application Server via TCP

  4. forwards the HTTP response to the client
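The four steps can be sketched as a single function. The network calls are passed in as plain callables, so the example runs without any UDP or TCP traffic. All names (handle_request, lookup_routers, ask_router, forward) are hypothetical. Trying the fallback request routers in order when the primary one returns no application server is an assumption, not documented behavior.

```python
from typing import Callable, Optional, Sequence, Tuple

Address = Tuple[str, int]  # (IP address, port)


def handle_request(
    host: str,
    path: str,
    lookup_routers: Callable[[str, str], Sequence[Address]],
    ask_router: Callable[[Address], Optional[Address]],
    forward: Callable[[Address, str], str],
) -> str:
    """Route one dynamic request and return the HTTP response body."""
    # Step 1: ask the ASPoolDBCacheServer for the responsible request routers.
    routers = lookup_routers(host, path)
    # Step 2: ask each router in turn for a free application server
    # (assumption: the primary router comes first, fallbacks follow).
    for router in routers:
        app_server = ask_router(router)
        if app_server is not None:
            # Steps 3 and 4: forward the request and hand back the response.
            return forward(app_server, path)
    raise RuntimeError("no application server available")
```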

ASPoolDB Cache Server

Given the domain name and URL prefix of the request, this server returns the following information:

  • request router IP address and port number

  • fallback request router IP addresses and ports

  • shop GUID

  • database name

  • limits of concurrent requests per shop or per pool

To do so, it performs the following steps:

  1. Find the shop by domain name or URL prefix

  2. Get the database by shop

  3. Get the application server pool by database

  4. Get the request routers by pool

  5. Get the limits per shop or per pool

This is a single-threaded server and it sees each and every dynamic request. It’s the scalability bottleneck of the whole system. Therefore it must be very fast.

It is very fast for two reasons: it is very simple, and it caches the responses to all requests. There is even a cache for empty results, which is useful for deleted shops.
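The effect of caching empty results can be shown with a minimal in-memory sketch. The class LookupCache and the resolve callback are hypothetical; the callback stands in for the five lookup steps against ASPoolDB. Cache expiry and invalidation are left out because the text does not describe them.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (domain name, URL prefix)


class LookupCache:
    """Caches lookup results, including 'not found' (None) results."""

    def __init__(self, resolve: Callable[[str, str], Optional[dict]]):
        self._resolve = resolve
        self._cache: Dict[Key, Optional[dict]] = {}
        self.backend_calls = 0  # counts how often the slow path was taken

    def lookup(self, domain: str, url_prefix: str) -> Optional[dict]:
        key = (domain, url_prefix)
        # Test for key presence, not for a truthy value, so that a cached
        # None (e.g. a deleted shop) does not trigger another backend query.
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._resolve(domain, url_prefix)
        return self._cache[key]
```

Repeated requests for a deleted shop then cost one dictionary lookup each instead of one database query each.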

Request Router

  1. Select an application server pool

  2. Test the limits; abort if a limit is exceeded

  3. Find idle application servers

  4. Select the best application server to get a high cache hit rate, based on:

       • application server priority

       • client IP address

       • shop GUID

       • maximum idle time

The request router knows the current status of all running application servers (AS) that are assigned to one of the pools it’s responsible for. Therefore it knows which application servers are busy and which ones are currently idle. It also remembers the most recent shop GUID and client IP address for each application server.
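The selection step can be sketched as a ranking over the idle application servers. The text names the four criteria but not how they are weighted. This sketch assumes they are applied in the listed order, that a higher priority value is better, and that a longer idle time is preferred as the final tie-breaker. AppServer and select_app_server are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AppServer:
    name: str
    priority: int                  # assumption: higher value is preferred
    busy: bool
    last_client_ip: Optional[str]  # most recent client served
    last_shop_guid: Optional[str]  # most recent shop served
    idle_seconds: float


def select_app_server(
    servers: Iterable[AppServer], client_ip: str, shop_guid: str
) -> Optional[AppServer]:
    """Pick the idle server most likely to hold useful cache data."""
    idle = [s for s in servers if not s.busy]
    if not idle:
        return None
    # Tuples compare element by element, so earlier criteria dominate.
    return max(
        idle,
        key=lambda s: (
            s.priority,
            s.last_client_ip == client_ip,
            s.last_shop_guid == shop_guid,
            s.idle_seconds,
        ),
    )
```

A server that last served the same client is preferred here over one that last served only the same shop, because a returning client usually requests the same shop again.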

Application Server

  1. Connects to the database

  2. Tests permissions

  3. Executes all ChangeActions

  4. Executes the ViewAction (returns the response from the page cache or processes templates to build the response content)

Why Is It So Complicated?

It’s all about caching. See the next section.

About Data Caching

The application server caches almost all database queries to provide good response times. This drastically reduces the load on the database server and improves the average response time. These advantages come at a cost: the application server requires a lot of RAM, and the data is not always 100% up to date.

How Is the Data Cache Kept Up To Date?

When an application server starts up, it connects to the Message Center port of its primary request router. While the application server is idle, it queries the Message Center regularly (every 30 seconds) for cache update events, i.e. database changes made by other application servers. If there are any updates, it selectively deletes the affected locally cached data. When the application server receives a request, it queries the Message Center again for cache update events and selectively deletes the affected locally cached data. The DAL layer keeps track of all database changes made by the application server. At the end of the request, before sending the response, the application server converts the collected database changes into cache update events and sends them to the Message Center.
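The update cycle described above can be modeled in memory. This is a sketch under several assumptions: the Message Center is reduced to an append-only event list, a cache update event is just the key of the changed record, and each write publishes immediately instead of being collected until the end of the request. The classes MessageCenter and AppServer and their methods are hypothetical and do not reflect the real protocol.

```python
from typing import Dict, List, Tuple


class MessageCenter:
    """Append-only log of cache update events: (sender name, changed key)."""

    def __init__(self) -> None:
        self._events: List[Tuple[str, str]] = []

    def publish(self, sender: str, keys: List[str]) -> None:
        self._events.extend((sender, key) for key in keys)

    def fetch_since(self, cursor: int) -> Tuple[List[Tuple[str, str]], int]:
        """Return all events after cursor and the new cursor position."""
        return self._events[cursor:], len(self._events)


class AppServer:
    def __init__(self, name: str, center: MessageCenter, db: Dict[str, int]):
        self.name, self.center, self.db = name, center, db
        self.cache: Dict[str, int] = {}
        self._cursor = 0

    def sync(self) -> None:
        """Drop locally cached entries that other servers have changed."""
        events, self._cursor = self.center.fetch_since(self._cursor)
        for sender, key in events:
            if sender != self.name:
                self.cache.pop(key, None)

    def read(self, key: str) -> int:
        self.sync()  # done at the start of every request
        if key not in self.cache:
            self.cache[key] = self.db[key]
        return self.cache[key]

    def write(self, key: str, value: int) -> None:
        self.sync()
        self.db[key] = value
        self.cache[key] = value
        self.center.publish(self.name, [key])  # tell the other servers
```

Between a write on one server and the next sync on another, the second server still holds the old value. This is the "not 100% up to date" trade-off mentioned earlier.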

Why Is a Good Cache Hit Rate Important?

A request without cache would take about 6 seconds. This time is distributed very roughly as follows:

  • 2 seconds for global data (classes, languages, hooks, system settings)

  • 2 seconds for shop data (domain, open/closed, user groups, style settings, navigation bars, special offers, basket settings, tax settings)

  • 1 second for page-specific data (name, description, sub categories, products, variations, cross selling)

  • 1 second for template processing (or 0.1 seconds for cached pages)

Conclusion: Re-using an application server that was previously used for the same shop saves 4-5 seconds (66-83%).
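The percentages follow directly from the rough time budget. The short calculation below makes the arithmetic explicit. It assumes that the lower bound covers the global and shop data (4 seconds) and that the upper bound additionally saves the second for page-specific data. The text does not say which second the upper bound refers to.

```python
# Rough time budget of an uncached request in seconds (from the list above).
UNCACHED_SECONDS = {
    "global data": 2.0,
    "shop data": 2.0,
    "page-specific data": 1.0,
    "template processing": 1.0,
}


def saving_share(saved_seconds: float) -> float:
    """Return the saved time as a percentage of the uncached total."""
    total = sum(UNCACHED_SECONDS.values())
    return 100.0 * saved_seconds / total


low_seconds = UNCACHED_SECONDS["global data"] + UNCACHED_SECONDS["shop data"]
# Assumption: the fifth second is the page-specific data.
high_seconds = low_seconds + UNCACHED_SECONDS["page-specific data"]

print(f"{saving_share(low_seconds):.1f}% to {saving_share(high_seconds):.1f}%")
```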

Some statistics from a live system:

  • Number of requests: 45,071,745

  • Hits for client IP address: 22,610,737 (50.2%)

  • Hits for shop GUID: 5,372,225 (11.9%)
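The percentages can be reproduced from the raw counts. The helper name hit_rate is hypothetical. The sketch does not add the two hit rates together, because the text does not say whether the two hit counts overlap.

```python
REQUESTS = 45_071_745
CLIENT_IP_HITS = 22_610_737
SHOP_GUID_HITS = 5_372_225


def hit_rate(hits: int, requests: int) -> float:
    """Return the hit rate in percent, rounded to one decimal place."""
    return round(100.0 * hits / requests, 1)


print(hit_rate(CLIENT_IP_HITS, REQUESTS))  # 50.2
print(hit_rate(SHOP_GUID_HITS, REQUESTS))  # 11.9
```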

Why Do We Cache Data Locally For Each Application Server?

Advantages:

  • better response time

  • no communication required for data that’s already in the cache (scalability)

  • no context switches during processing

Disadvantages:

  • lower cache hit rate

  • higher database load

  • high memory footprint

  • communication required for cache updates

Application Server Pools

Application server processes are grouped into pools, and each pool is assigned to a subset of the databases. For example, application servers 1-10 are assigned to database A, while application servers 11-20 are assigned to database B. One application server pool can also serve multiple databases, but each database belongs to exactly one pool.

Why Do We Need Them?

Since one AS is only responsible for a subset of databases, the potential amount of cacheable data is divided by the number of pools. This improves the cache hit rate and limits the amount of cache data per application server. It also limits the amount of communication required to exchange cache updates, because these updates are only communicated between application servers that belong to the same pool.

Without pools, each application server would be responsible for all shops. The more shops are created, the more data would have to be cached by every application server. Adding more application servers would result in higher average response times, because it would become less and less likely to find an application server that has the right cache data. The application servers would also become more expensive, because they’d need much more RAM.

Solution: Application server pools allow the number of application servers to scale linearly with the number of shops.

ASPoolDB

This central database contains all shops of the installation and the following information about each shop:

  • database name

  • domain name and URL prefix

  • optional limitations of concurrent requests

It also knows the assignment of databases to application server pools.

Sizing Recommendation

  • 1 database = 5000 shops

  • 1 database = 1 application server pool

  • 1 application server pool = 100 application servers

  • max 10 concurrent requests per site

  • max 5 concurrent requests per client IP

  • 500 MBytes RAM per application server

  • 1 virtual processor per 2 application servers
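The ratios above can be turned into a small sizing calculator. This is a sketch that reads each ratio as a per-unit target (for example, 100 application servers per pool). The names Sizing and size_installation are hypothetical, and the concurrency limits are not modeled.

```python
import math
from dataclasses import dataclass

SHOPS_PER_DATABASE = 5000
APP_SERVERS_PER_POOL = 100
RAM_MB_PER_APP_SERVER = 500
APP_SERVERS_PER_VIRTUAL_PROCESSOR = 2


@dataclass
class Sizing:
    databases: int
    pools: int
    app_servers: int
    ram_mb: int
    virtual_processors: int


def size_installation(shops: int) -> Sizing:
    """Apply the recommended ratios to a given number of shops."""
    databases = math.ceil(shops / SHOPS_PER_DATABASE)
    pools = databases  # 1 database = 1 application server pool
    app_servers = pools * APP_SERVERS_PER_POOL
    return Sizing(
        databases=databases,
        pools=pools,
        app_servers=app_servers,
        ram_mb=app_servers * RAM_MB_PER_APP_SERVER,
        virtual_processors=math.ceil(app_servers / APP_SERVERS_PER_VIRTUAL_PROCESSOR),
    )
```

For 5000 shops, size_installation(5000) yields 1 database, 1 pool, 100 application servers, 50,000 MB of RAM and 50 virtual processors across all application server machines.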

Example For 5000 Shops

  • 1 database server machine with quad-core processor, 16 GByte RAM

  • 3 application server machines with 2 quad-core processors and hyper-threading activated, 16 GByte RAM each

  • 2 web server machines with quad-core processors and hyper-threading activated, 8 GByte RAM each; install a request router and an ASPoolDB cache server on both machines

New WebInterface Configuration

  • New service: ASPoolDBCacheServer

  • ASPoolDB uses MySQL instead of SQLite -> solves problems with NFS

  • Pool configuration parameters are moved from ASPoolDB to ServerConfig.xml

Changed: WebInterface.conf

  • no longer used by Request Router

  • WebInterface.conf is still used for Web Adapter, Application Server and ShortURL-Filter

  • WebInterface.conf - Server sections are moved to ServerConfig.xml

  • See ePages_6.17+_WebInterface.conf

New: ServerConfig.xml

Configuration File Usage

Changes to WebInterface.conf and ServerConfig.xml are never re-loaded automatically. After relevant changes, the affected components have to be re-started. This is not a big problem, because all components can be configured redundantly.

WebAdapter

  • Uses WebInterface.conf (parameters: TIMEOUT, RETRIES, KEEPALIVE, MAXCONTENTLENGTH)

  • Uses ServerConfig.xml (list of ASPoolDBCacheServer instances, IP and Port)

RequestRouter

  • Uses ServerConfig.xml - list of RequestRouter instances for the current host

ASPoolDBCacheServer

  • Uses ServerConfig.xml - ASPoolDBCacheServer port on the current host

ePages Service

  • Uses WebInterface.conf (parameter: OPTIONS)

  • Uses ServerConfig.xml - list of application server instances on the current host

ShortURLFilter

  • Uses WebInterface.conf (section URLRewrite)

Application Server

  • Uses WebInterface.conf (parameters: COOKIS, AUTORECOMPILE, MONITOR_TIMEOUT, MAXCONTENTLENGTH, process priorities)

  • Uses ServerConfig.xml (pool name, application server priority, memory limit)